A recent client had migrated off their old terminal server and onto a virtualized 2008 R2 RDS server. Actually a farm of them, but for this case it did not matter. Their previous setup had all been contained on one 2003 server, which also ran their AD, print server, and whatever else could be crammed in along with the kitchen sink. The new setup had proper separation and centralized storage, all on 2008 servers. For all of their data and programs they would now reach into a file share on the SAN. This was working great, except that one program kept crashing unless its data files were local to the server. The event log showed the following two events, one immediately after the other:
Event ID 1000
Faulting application name: OMNIS7.exe, version: 188.8.131.52, time stamp: 0x3bb82293
Faulting module name: OMNIS7.exe, version: 184.108.40.206, time stamp: 0x3bb82293
Exception code: 0xc0000006
Event ID 1005
Windows cannot access the file for one of the following reasons: there is a problem with the network connection, the disk that the file is stored on, or the storage drivers installed on this computer; or the disk is missing. Windows closed the program Omnis 7 core executable because of this error.
Program: Omnis 7 core executable
The error value is listed in the Additional Data section.
1. Open the file again. This situation might be a temporary problem that corrects itself when the program runs again.
2. If the file still cannot be accessed and
- It is on the network, your network administrator should verify that there is not a problem with the network and that the server can be contacted.
- It is on a removable disk, for example, a floppy disk or CD-ROM, verify that the disk is fully inserted into the computer.
3. Check and repair the file system by running CHKDSK. To run CHKDSK, click Start, click Run, type CMD, and then click OK. At the command prompt, type CHKDSK /F, and then press ENTER.
4. If the problem persists, restore the file from a backup copy.
5. Determine whether other files on the same disk can be opened. If not, the disk might be damaged. If it is a hard disk, contact your administrator or computer hardware vendor for further assistance.
Error value: C00000C4
Disk type: 0
It was very swiftly recognized that the situation was not a temporary problem, though unfortunately it was a sporadic one. The pattern noted was that the program crashed most often in the morning, when everyone was signing on, and in the afternoon, when everyone was closing out. Now 0xC00000C4 is STATUS_UNEXPECTED_NETWORK_ERROR, but that doesn’t provide much to go on. Performance logs showed that there shouldn’t be a network problem bandwidth-wise either. The first thing I tried was disabling RSS (Receive Side Scaling) and offloading, but that did not help matters. Doing more research, I was led to believe that the problem was being caused by oplocks.
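To make the two codes from the events a little less opaque, here is a minimal Python sketch that splits an NTSTATUS value into its severity, facility, and code fields and looks up a name. The lookup table holds only the two values seen in this post (0xC0000006 is STATUS_IN_PAGE_ERROR, which fits a file being paged in over a network that dropped out); the function name decode_ntstatus is my own and not part of any Windows API.

```python
# Names for the two NTSTATUS values that appear in the events above.
KNOWN_STATUS = {
    0xC0000006: "STATUS_IN_PAGE_ERROR",
    0xC00000C4: "STATUS_UNEXPECTED_NETWORK_ERROR",
}

# The top two bits of an NTSTATUS value encode its severity.
SEVERITY = {0: "success", 1: "informational", 2: "warning", 3: "error"}


def decode_ntstatus(text):
    """Split an NTSTATUS hex string into severity, facility, code and name."""
    value = int(text, 16)  # accepts both "C00000C4" and "0xc0000006"
    return {
        "severity": SEVERITY[(value >> 30) & 0x3],
        "facility": (value >> 16) & 0x0FFF,
        "code": value & 0xFFFF,
        "name": KNOWN_STATUS.get(value, "unknown"),
    }


for raw in ("0xc0000006", "C00000C4"):
    print(raw, decode_ntstatus(raw))
```

Both values decode as severity "error" with facility 0, so they are plain NT status codes rather than something specific to the application.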
Oplocks, short for opportunistic locks, are a mechanism in the SMB protocol designed to let multiple processes lock a file while still allowing client-side caching. The purpose is to improve performance for clients on the network. For more reading on this consult this document and this document. So all the crashing basically came down to cache integrity, since the database used by the client was a flat file instead of a transactional database.
To disable oplocks on the server, go to this key: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters
Set EnableOplocks to 0. If the value is not there, create it as a REG_DWORD. Reboot for the change to take effect.
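If you would rather not click through regedit on every server in a farm, the same change can be expressed as a .reg file. Below is a minimal Python sketch that only builds the text of such a file; it does not touch the registry itself. The helper name reg_file and the file name in the usage note are my own inventions, and you should review the output before importing it anywhere.

```python
# Key named in the text above.
KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters"


def reg_file(key, dword_values):
    """Build the text of a .reg file that sets REG_DWORD values under one key."""
    lines = ["Windows Registry Editor Version 5.00", "", "[" + key + "]"]
    for name, value in dword_values.items():
        # REG_DWORD values are written as eight hex digits.
        lines.append('"{}"=dword:{:08x}'.format(name, value))
    return "\r\n".join(lines) + "\r\n"


text = reg_file(KEY, {"EnableOplocks": 0})
print(text)
```

To save it, something like open("disable-oplocks.reg", "w", encoding="utf-16", newline="").write(text) should do (the file name is hypothetical). The reboot mentioned above is still required after importing.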
Unfortunately Server 2008 introduces a new problem. Server 2008 will communicate via SMB2 with any client running Vista or newer, and SMB2 does not allow oplocks to be disabled. The workaround is that if SMB2 is disabled on either the client or the server, communication will fall back to the original SMB. The easiest fix, then, is to disable SMB2 on the server.
In the same LanmanServer\Parameters key, create a REG_DWORD named SMB2 and set it to 0. Reboot for the change to take effect.
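The fallback rule is what makes this workaround possible, so here it is as a toy Python model. This is only a sketch of the behaviour described in this post (SMB2 is used only when both ends have it enabled, and the oplocks setting only matters on the older protocol); the function names are made up, and it is not how the real dialect negotiation is implemented.

```python
def negotiated_dialect(client_smb2_enabled, server_smb2_enabled):
    """Toy model: SMB2 is used only when both ends have it enabled."""
    if client_smb2_enabled and server_smb2_enabled:
        return "SMB2"
    return "SMB1"


def oplocks_can_be_disabled(dialect):
    # Per the text, disabling oplocks only works on the original SMB.
    return dialect == "SMB1"


for client in (True, False):
    for server in (True, False):
        dialect = negotiated_dialect(client, server)
        print(
            "client SMB2={!s:5} server SMB2={!s:5} -> {} (oplocks can be disabled: {})".format(
                client, server, dialect, oplocks_can_be_disabled(dialect)
            )
        )
```

Three of the four combinations end up on the older protocol, which is why turning SMB2 off on just one side is enough.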
You may notice that the server takes substantially longer to start up after making this change. The delays were severe enough that I decided to test an alternative method for disabling SMB2. Since communication falls back to SMB if either the client or the server does not support SMB2, SMB2 can be disabled on the client side instead. Disabling it on the client is a bit different, since you’re actually disabling a service.
You may want to back up this key for easy restoration. Then edit DependOnService and remove MRxSmb20.
You may want to back up this key as well. In this key set Start to 4 (disabled). Reboot your client and SMB2 will now be disabled.
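The client-side edit can be sketched in Python as well. The snippet below assumes the two keys involved are the LanmanWorkstation and mrxsmb20 service entries, and that the workstation service's dependency list looks like the typical Windows 7 / 2008 R2 default; both are my assumptions, so check what DependOnService actually contains on your machine. It only builds the new dependency list and the equivalent sc.exe command lines as strings; nothing is executed.

```python
# Assumed default DependOnService list; verify against your own client.
DEFAULT_DEPENDENCIES = ["Bowser", "MRxSmb10", "MRxSmb20", "NSI"]


def without_smb2(dependencies):
    """Return the dependency list with the SMB2 mini-redirector removed."""
    return [d for d in dependencies if d.lower() != "mrxsmb20"]


def sc_commands(dependencies):
    """Build sc.exe command lines equivalent to the two registry edits."""
    depend = "/".join(without_smb2(dependencies))
    return [
        # Same effect as removing MRxSmb20 from DependOnService.
        "sc.exe config lanmanworkstation depend= " + depend,
        # Same effect as setting Start to 4.
        "sc.exe config mrxsmb20 start= disabled",
    ]


for command in sc_commands(DEFAULT_DEPENDENCIES):
    print(command)
```

Note the space after "depend=" and "start="; sc.exe is picky about that. A reboot is still needed afterwards.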
Ever since implementing these changes, the client’s applications have been running solid as a rock. For further reading on oplocks from Microsoft, check here.