2006/08/05

VSS & ntbackup errors

We run our backups with ntbackup, launched from Scheduled Tasks. We take a System State backup every day (that is enough for us; other scenarios may need more frequent backups). As part of our daily checks, we look at the last run status of the scheduled task on every server to see whether it was successful (exit code 0).

Some weeks ago, to verify that the backups were actually restorable, we looked at the backup file itself (.bkf) and, to our surprise, it was only 2 KB in size. Fortunately we did not need to restore the server, which was running fine, but things are not so fine if the backups are not being made.

Why had we not noticed before? Because the Scheduled Task's last status said 0 (OK). We went to check the Application Event Log and found:
Type: Error
Source: NTBackup
Category: None
Event ID: 8019
Date: 7/17/2006
Time: 10:16:03 PM
User: N/A
Computer: SERVERNAME
Description: End Operation: Warnings or errors were encountered.
Consult the backup report for more details.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
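Since the Scheduled Task status was 0 even though the backup had failed, a check on the .bkf file itself would probably have caught the problem earlier. The following is only a minimal sketch in Python, not something we had in place: the function name and the 1 MB threshold are assumptions chosen for illustration (our failed file was about 2 KB), and a sensible threshold depends on the size of your own System State backups.

```python
import os
import tempfile

# Hypothetical threshold: the failed file in this post was about 2 KB,
# so anything under 1 MB is treated as suspicious here.
MIN_BACKUP_BYTES = 1024 * 1024


def backup_looks_valid(path, min_bytes=MIN_BACKUP_BYTES):
    """Return True if the backup file exists and is at least min_bytes long."""
    try:
        return os.path.getsize(path) >= min_bytes
    except OSError:
        # A missing or unreadable file counts as a failed backup.
        return False


if __name__ == "__main__":
    # Demo with a throwaway 2 KB file standing in for a failed .bkf.
    with tempfile.NamedTemporaryFile(suffix=".bkf", delete=False) as tmp:
        tmp.write(b"\0" * 2048)
    print(backup_looks_valid(tmp.name))  # False
    os.remove(tmp.name)
```

A size check like this says nothing about whether the file can be restored; it only flags the obviously empty backups that an exit code of 0 can hide.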
So we went to check the ntbackup log file, located in %USERPROFILE%\Local Settings\Application Data\Microsoft\Windows NT\NTBackup\data, and found:
Backup Status
Operation: Backup
Active backup destination: File
Media name: "System State SERVERNAME.bkf created 2006-07-17 at 22:15"

Volume shadow copy creation: Attempt 1.
"MSDEWriter" has reported an error 0x800423f4. This is part of System State.
The backup cannot continue.

Error returned while creating the volume shadow copy: 800423f4
Aborting Backup.

----------------------

The operation did not successfully complete.

----------------------
The same report on a Spanish-language system:
Estado de la copia de seguridad
Operación: copia de seguridad
Destino de la copia de seguridad activo: Archivo
Nombre del medio: "System State MYSERVER.bkf creado 17/07/2006 a las 22:15"

Creación de instantánea de volumen: Intento 1.
"MSDEWriter" informó acerca de un error 0x800423f4. Esto forma parte del estado del sistema.
La copia de seguridad no puede continuar.

Error devuelto al crear la instantánea de volumen: 800423f4
Anular la copia de seguridad.

----------------------

Operación cancelada.

----------------------
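The reports above are where the failure actually shows up, so a daily check could scan them instead of trusting the task status. This is a rough sketch in Python under some assumptions: the function name is made up, the failure phrases are copied from the two reports above (other ntbackup versions or languages may word them differently), and the error-code pattern is a simple guess that matches codes like 0x800423f4. Reading and decoding the log file is left to the caller.

```python
import re

# Phrases taken from the English and Spanish reports shown above.
FAILURE_MARKERS = (
    "The operation did not successfully complete.",
    "Aborting Backup.",
    "Operación cancelada.",
    "Anular la copia de seguridad.",
)

# Matches error codes such as 0x800423f4 or a bare 800423f4.
ERROR_CODE = re.compile(r"\b(?:0x)?(8[0-9a-fA-F]{7})\b")


def find_backup_errors(log_text):
    """Return (markers_found, error_codes) for one ntbackup report."""
    markers = [m for m in FAILURE_MARKERS if m in log_text]
    codes = sorted({c.lower() for c in ERROR_CODE.findall(log_text)})
    return markers, codes


if __name__ == "__main__":
    sample = (
        '"MSDEWriter" has reported an error 0x800423f4. '
        "This is part of System State.\n"
        "Error returned while creating the volume shadow copy: 800423f4\n"
        "Aborting Backup.\n"
        "The operation did not successfully complete.\n"
    )
    markers, codes = find_backup_errors(sample)
    print(markers)
    print(codes)  # ['800423f4']
```

An empty result for both lists is not proof of a good backup, only the absence of the failure text seen in this incident.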
After some research we found that we were seeing the symptoms described in KB828481 - Error 800423f4 appears in the backup log file when you back up a volume by using the Volume Shadow Copy service in Windows Server 2003, with some particularities:

Our server runs Windows Server 2003 R2 Standard x64 as a domain controller, with SQL Server 2005 Standard + SP1 installed. Some of our SQL databases use the Full recovery model and must stay that way (switching them to Simple is not an option).

However, when we compare the file versions and timestamps on our server with those listed in KB828481, ours are newer:
ntbackup.exe  5.2.3790.1830 (srv03_sp1_rtm.050324-1447)   30/Nov/2005 14:00 
ws03res.dll   5.2.3790.1830 (srv03_sp1_rtm.050324-1447)   30/Nov/2005 14:00 
Wws03res.dll  Not found on our server
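Comparing dotted file versions by eye is easy to get wrong, and comparing them as plain strings is wrong too. A small sketch in Python of a numeric comparison follows; the function names are made up, and the reference version in the demo is a made-up value used only to show the pitfall, not the version listed in the KB article.

```python
def parse_version(text):
    """Turn '5.2.3790.1830' into the tuple (5, 2, 3790, 1830)."""
    return tuple(int(part) for part in text.split("."))


def is_newer(installed, reference):
    """True if the installed version is strictly newer than the reference."""
    return parse_version(installed) > parse_version(reference)


if __name__ == "__main__":
    installed = "5.2.3790.1830"   # ntbackup.exe on our server
    reference = "5.2.3790.99"     # hypothetical value, NOT taken from the KB
    print(is_newer(installed, reference))  # True
    print(installed > reference)           # False: string comparison misleads
```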
Since the files on our server are newer than those listed in KB828481, we understand that the hotfix described there does not apply to our case, even though we are seeing the behaviour it describes.

After a little more research we found KB913648 - A new Volume Shadow Copy Service update is now available that fixes various Volume Shadow Copy Service problems in Windows Server 2003. This document is newer (July 26, 2006), but a look at the problems fixed by this hotfix suggests that it is not our case either (they involve setups far more complex than our simple server). Besides, the server had been making its backups for months (confirmed) before we noticed that it no longer was.

For the record, when we ran vssadmin list writers in a command prompt, the command seemed to wait indefinitely and, after 30 minutes, we pressed CTRL+C:
C:\WINDOWS\system32>vssadmin list writers
vssadmin 1.1 - Volume Shadow Copy Service administrative command-line tool
(C) Copyright 2001 Microsoft Corp.

Waiting for responses.
These may be delayed if a shadow copy is being prepared.

^C
C:\WINDOWS\system32>
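A command that hangs like this can be turned into a usable health signal by running it with a time limit. The sketch below, in Python, is only an illustration: the function name and the timeout value are assumptions, and the demo uses a sleeping Python process as a stand-in so that it runs anywhere. On the server the call would be something like run_with_timeout(["vssadmin", "list", "writers"], 120).

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Run command; return its stdout, or None if it did not finish in time."""
    try:
        result = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return None
    return result.stdout


if __name__ == "__main__":
    # Stand-in for a hung command: sleeps longer than the allowed time.
    hung = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timeout(hung, 1))  # None
```

A None result here would correspond to the situation above, where vssadmin never returned.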
The affected server does not have a high workload, either as a file server (a small company with 25 employees) or in terms of SQL Server transactions (roughly 1000 transactions per hour).

Solution: the steps that finally solved the problem were:
cd /d %windir%\system32
net stop vss
regsvr32 /s ole32.dll
regsvr32 /s vss_ps.dll
vssvc /Register
regsvr32 /s /i swprv.dll
regsvr32 /s /i eventcls.dll
regsvr32 /s es.dll
regsvr32 /s stdprov.dll
regsvr32 /s vssui.dll
regsvr32 /s msxml.dll
regsvr32 /s msxml3.dll
regsvr32 /s msxml4.dll
Then we rebooted the server.
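Once vssadmin list writers responds again, its output can be checked for writers that are not healthy. The following Python sketch is an assumption-heavy illustration: the function names are made up, the sample text is not output from our server but a shortened example of the usual English layout (a "Writer name:" line followed by indented "State:" and "Last error:" lines), and a localized system will print different labels.

```python
import re

WRITER_NAME = re.compile(r"^Writer name:\s*'(.+)'$")
FIELD = re.compile(r"^(State|Last error):\s*(.+)$")


def parse_writers(output):
    """Return a list of dicts with 'name', 'state' and 'last_error' keys."""
    writers = []
    for line in output.splitlines():
        line = line.strip()
        name = WRITER_NAME.match(line)
        if name:
            writers.append(
                {"name": name.group(1), "state": None, "last_error": None}
            )
            continue
        field = FIELD.match(line)
        if field and writers:
            key = "state" if field.group(1) == "State" else "last_error"
            writers[-1][key] = field.group(2)
    return writers


def unhealthy_writers(writers):
    """Names of writers that are not stable or that report an error."""
    return [
        w["name"]
        for w in writers
        if "Stable" not in (w["state"] or "") or w["last_error"] != "No error"
    ]


# Illustrative sample only; assumed layout, not captured from our server.
SAMPLE = """\
Writer name: 'MSDEWriter'
   State: [1] Stable
   Last error: No error

Writer name: 'Example Writer'
   State: [7] Failed
   Last error: Non-retryable error
"""

if __name__ == "__main__":
    print(unhealthy_writers(parse_writers(SAMPLE)))  # ['Example Writer']
```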
