Search This Blog

2011-06-17

Repair Genesys Datamart database

Database Type: MS SQL Server 2005
Genesys Database Name: datamart
Database Size: 18 GB

One 8 KB page block was corrupted after a disk full incident. The database failed to write the last block properly when the disk is full, which eventually causing database unable to bring online after brought offline

Following illustration show how to recover it using DBCC command. It is best to restore from database backup with transactional log. However, under disk full situation, I am not sure whether the transactional log is intact.

This method is also called as EMERGENCY mode recovery with potential data loss.

This error complains about MS SQL Server datamart01.mdf file has a logical corruption, and put the database in SUSPECT mode. Under SUSPECT mode, database is open but unaccessible by Genesys DataSourcer nor ETL Runtime, including native MS SQL Server client login

Following are the steps to take to bring it up. Please make a backup of mdf and ldf files first before trying so that there is a rollback point, or retry with other attempts or strategy.

use datamart
alter database datamart set single_user

Wait for 45 min (depending on storage performance) and gets following error


Error Message #1 Event Viewer Event ID 824
SQL Server detected a logical consistency-based I/O error: incorrect pageid (expected 1:257768; actual 1:14395423). It occurred during a read of page (1:257768) in database ID 9 at offset 0x0000007ddd0000 in file 'D:\Database\MSSQL\datamart\data\DATAMART01.mdf'.  Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.


Error Message #2 Event Viewer Event ID 3314
During undoing of a logged operation in database 'DATAMART', an error occurred at log record ID (171116:180545:210). Typically, the specific failure is logged previously as an error in the Windows Event Log service. Restore the database or file from a backup, or repair the database.


Error Message #3 Event Viewer Event ID 3314
An error occurred during recovery, preventing the database 'DATAMART' (database ID 9) from restarting. Diagnose the recovery errors and fix them, or restore from a known good backup. If errors are not corrected or expected, contact Technical Support.


Check the event_type from system table about the type of error in the affected page


SELECT * FROM msdb..suspect_pages WHERE (event_type = 1 OR event_type = 2 OR event_type = 3);

database_id file_id page_id event_type error_count last_update_date
9 1 257768 1 7 2011-06-16 17:40:10.137

See Microsoft MSDN for more info about event_type

Database ID 9, with FILE_ID 1, on PAGE_ID 257768 encountering EVENT_TYPE 1, and Event Viewer Event ID 824 means other error that is NOT bad checksum or torn page. Microsoft does not explain precisely what is the exact error, as this is a general error other than those number documented (2-7). Nevertheless, this is certainly related to operating system, or hardware, which EVENT_TYPE 1 is designed for.

Set the database in single user (already done so above) plus emergency mode, and start recovery

alter database datamart set emergency
DBCC CHECKDB (datamart, REPAIR_ALLOW_DATA_LOSS) WITH tablock

Note: WITH TABLOCK is optional as database is in single user mode. If under other situation where database is in MULTI_USERS mode, then TABLOCK is prefer to minimize tempdb usage, and will be faster

Msg 924, Level 14, State 1, Line 1
Database 'DATAMART' is already open and can only have one user at a time.
DBCC results for 'DATAMART'.
CHECKDB found 0 allocation errors and 0 consistency errors in database 'DATAMART'.

Events found in Event Viewer - Application

1 transactions rolled back in database 'DATAMART' (9). This is an informational message only. No user action is required.



Recovery is writing a checkpoint in database 'DATAMART' (9). This is an informational message only. No user action is required.

Message above indicated there is no data loss. I believe it is a torn page to populate checksum due to disk full

Depending of disk performance and database size, this example database taken 1 hr to repair.

SELECT * FROM msdb..suspect_pages WHERE (event_type = 1 OR event_type = 2 OR event_type = 3);

database_id file_id page_id event_type error_count last_update_date
9 1 257768 1 8 2011-06-17 09:05:58.320

Now the ERROR_COUNT increased by 1, and LAST_UPDATE_DATE is the time of completed the repair.

Open database in multi user mode (normal)

ALTER DATABASE datamart SET multi_user
Command(s) completed successfully.

Event shows in Event Viewer - Application
Setting database option MULTI_USER to ON for database datamart

Take a full backup of the database immediately

Bounce the MS SQL Server database to ensure it comes up properly.

Start Genesys DataSourcer, and ETL Runtime. They should execute properly


Please use following PayPal donate if my post helped

No comments: