Category Archives: Troubleshooting

Mail Queuing for Mail Enabled Public Folders?

Feeling pretty good about yourself, you come into the office and sit down to get some work done. After all, someone has to retrieve that Amulet of Yendor, so it might as well be you, right? Unfortunately, it doesn’t look like today will be your day. Warnings are piling up that your mail queue is getting rather large, and some users have been asking where the daily messages in their public folders are. Taking a peek at the queue, you see a large and growing number of emails in your Unreachable Domain queue. But your public folder database looks like it is mounted OK. Not cool.

What broke?

This is a fairly common scenario I’ve run into after migrating off of Exchange 2003. Your public folders migrated over successfully and mail had been flowing for a while but as soon as you took down the 2003 server the mail starts queuing up for your mail enabled public folders. Or maybe you went in and started doing some manual cleanup with ADSI Edit. Sometimes even just the uninstall of Exchange 2003 has some unexpected side effects. You remembered to do a backup of your AD prior to that major change, right? There’s a good chance that your public folder hierarchy is missing.

Great, so can we fix this?

The good news is that there is a road to recovery. First things first: is your public folder hierarchy actually missing? Pop open the good old Exchange Management Shell and let’s check on a few things.

Import-Module ActiveDirectory
$SearchPath = "CN=Folder Hierarchies,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups," + (Get-OrganizationConfig).DistinguishedName
Get-ADObject -SearchBase $SearchPath -SearchScope OneLevel -Filter {CN -eq "Public Folders"}

Hopefully you will get a result such as below.

DistinguishedName             Name                          ObjectClass                   ObjectGUID
-----------------             ----                          -----------                   ----------
CN=Public Folders,CN=Folde... Public Folders                msExchPFTree                  f6a3cbd4-10e5-452d-9abe-44...

If you get a “directory object not found” error then your public folder hierarchy is missing and we’ll have to recreate it. That’s step one on our way to saving the day. Let’s step back one level further and check whether our Folder Hierarchies container is there.

$SearchPath = "CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups," + (Get-OrganizationConfig).DistinguishedName
Get-ADObject -SearchBase $SearchPath -SearchScope OneLevel -Filter {CN -eq "Folder Hierarchies"}

If you get no results then that is missing as well. If you do then at least our container is there and we just need to create the hierarchy. Here’s the bit of PowerShell code that will fix up the missing public folders hierarchy.

<#
.SYNOPSIS
Recreates your public folders hierarchy
.DESCRIPTION
Checks your AD for whether the Folder Hierarchies container exists and the 
Public Folders hierarchy. If one does not exist then it is created.
#>
Import-Module ActiveDirectory

# Build path to the container
$SearchPath = "CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups," + (Get-OrganizationConfig).DistinguishedName
$PFContainer = Get-ADObject -SearchBase $SearchPath -SearchScope OneLevel -Filter {CN -eq "Folder Hierarchies"}

# If it does not exist then create the container
if(!$PFContainer)
{
    New-ADObject -Name "Folder Hierarchies" -Type msExchPublicfolderTreeContainer -Path $SearchPath
    Write-Host "Folder Hierarchies container created."
}
else
{
    Write-Host "Folder Hierarchies container exists already." -ForeGroundColor Yellow
}

# Build path for the public folder tree
$SearchPath = "CN=Folder Hierarchies,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups," + (Get-OrganizationConfig).DistinguishedName
$PFHierarchy = Get-ADObject -SearchBase $SearchPath -SearchScope OneLevel -Filter {CN -eq "Public Folders"}

# If it does not exist then create it
if(!$PFHierarchy)
{
    New-ADObject -Name "Public Folders" -Type msExchPFTree -Path $SearchPath -OtherAttributes @{msExchPFTreeType="1"}
    Write-Host "Public Folders hierarchy created."
}
else
{
    Write-Host "Public Folders hierarchy already exists." -ForeGroundColor Yellow
}

# Set to our PF hierarchy DN
$PFHierarchy = "CN=Public Folders," + $SearchPath

# DN for our databases
$SearchPath = "CN=Databases,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups," + (Get-OrganizationConfig).DistinguishedName
$PFDatabases = Get-ADObject -SearchBase $SearchPath -SearchScope OneLevel -Filter {objectClass -eq "msExchPublicMDB"}

# Grab all of the public folder databases and loop through them
if($PFDatabases)
{
    foreach($PFDatabase in $PFDatabases)
    {
        $PFDatabase.msExchOwningPFTree = $PFHierarchy
        Set-ADObject -Instance $PFDatabase
        Write-Host "Fixed database $($PFDatabase.Name)"
    }
}
# Or if no public folder databases exist you have further problems ...
else
{
    Write-Host "No Public Folder Databases found." -ForeGroundColor Yellow
}

But you’ll find that your work is not quite done yet. Your public folders are missing their homeMDB. Or this could have been your problem all along without any need to recreate the public folder hierarchy. You can verify this as the problem with this quick search:

$PFPath = "CN=Public Folders,CN=Folder Hierarchies,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups," + (Get-OrganizationConfig).DistinguishedName
$SearchBase = "CN=Microsoft Exchange System Objects," + (Get-ADRootDSE).rootDomainNamingContext 
Get-ADObject -SearchBase $SearchBase -SearchScope OneLevel -Filter { (homeMDB -notlike "*") -and (ObjectClass -eq "publicFolder")}

If you don’t see anything then you know that your mail enabled public folders are fine. But most likely you’ll get a few results. To quickly fix those up run through this script.

<#
.SYNOPSIS
Fix any missing homeMDB attributes on public folders.
.DESCRIPTION
The script runs an LDAP search for all mail enabled public folder objects and
sets the homeMDB attribute to the LDAP path to your public folder hierarchy.
.NOTES
The script needs to be run in all domains requiring the fix.
#>
Import-Module ActiveDirectory
# Build the DN to the public folders hierarchy
$PFPath = "CN=Public Folders,CN=Folder Hierarchies,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups," + (Get-OrganizationConfig).DistinguishedName
# Build the DN to the public folder objects
$SearchBase = "CN=Microsoft Exchange System Objects," + (Get-ADRootDSE).rootDomainNamingContext
# Search for all PFs with a blank homeMDB
$TargetPFs = Get-ADObject -SearchBase $SearchBase -SearchScope OneLevel -Filter { (homeMDB -notlike "*") -and (ObjectClass -eq "publicFolder")}

# Fix all of the public folders
if($TargetPFs)
{
  foreach($TargetPF in $TargetPFs)
  {
    Write-Host "Fixing $($TargetPF.Name)"
    $TargetPF.homeMDB = $PFPath
    Set-ADObject -Instance $TargetPF
  }
}
# Good news (maybe), no public folders that require fixing
else
{
  Write-Host "No public folders missing homemdb."
}

Fantastic, are we done yet?

Nearly! Just give the mail queues a kick and you should see the emails quickly flushing out into your public folder databases. Now you can get back to your day of NetHack knowing that all is well with the world once again.
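One way to give the queues that kick from the Exchange Management Shell is sketched below. It assumes the stuck mail is sitting in the Unreachable queue of a single transport server. The function name and its $Server parameter are made up for illustration, while Get-Queue and Retry-Queue are the standard transport cmdlets.

```powershell
# Hypothetical helper; run it from the Exchange Management Shell on (or against)
# the Hub Transport server that holds the stuck mail.
function Invoke-UnreachableQueueRetry {
    param([string]$Server = $env:COMPUTERNAME)

    # Show what is queued before touching anything
    Get-Queue -Server $Server |
        Format-Table Identity, Status, MessageCount, NextHopDomain -AutoSize |
        Out-Host

    # Resubmit the Unreachable queue so the recipients are resolved again
    Retry-Queue -Identity "$Server\Unreachable" -Resubmit $true -Confirm:$false

    # Nudge any queues that are still sitting in Retry
    Get-Queue -Server $Server |
        Where-Object { $_.Status -eq 'Retry' } |
        Retry-Queue -Confirm:$false
}

# Usage (from the EMS), with a placeholder server name:
# Invoke-UnreachableQueueRetry -Server 'MYSERVER'
```

Defining the function changes nothing on its own; the queues are only touched when it is called.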

452 4.3.1 Insufficient System Resources – Continued Telnet Training

This is a problem that crops up fairly often if you have a lot of disparate Exchange servers out there without a solid monitoring solution in place. Very common for MSPs. Oh, and actually have somebody paying attention to those monitoring alerts. Nobody likes paying attention to monitoring alerts. There are reams of rules dedicated to keeping them out of sight in Outlook clients around the world. But that makes for an entirely separate topic/rant. The symptoms of this problem are that you’ll be getting reports from the end users that they don’t seem to be receiving any email, or at least any external email. But oddly enough sending out email is working just fine.

This is the point where a quick telnet test will focus you in on what is going on really fast. Continuing with what you learned from the post on Essential Exchange Troubleshooting – Send Email via Telnet you will want to telnet into the server from outside the organization. You may immediately get a response of:

452 4.3.1 Insufficient System Resources

But more likely you’ll receive a typical SMTP banner such as

220 myserver.contoso.com Microsoft ESMTP MAIL Service ready at Mon, 27 May 2013 08:19:44 -0700

If so then I recommend that you continue through with sending in an email via telnet. The next likely place that you’ll encounter this error is when you issue the RCPT TO: command to which you receive a response of

452 4.3.1 Insufficient System Resources

The fix for this is fairly simple. Check your Exchange server for low free disk space on the partition where your queues reside, which will most likely be the partition with your Exchange installation. I find that most often what has eaten all of your space, in cases of single server Exchange 2007/2010 installations, is the IIS log files. When setting up your Exchange server it is a good idea to make sure that you have an archiving/recycling policy in place for your IIS logs to keep them from swallowing the entire partition over time. BES installations have the same problem, with log files swallowing the drive.
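A quick way to eyeball free space on every drive is sketched below. It uses the .NET DriveInfo class rather than anything Exchange specific, and the function name is made up for illustration. What counts as "too low" depends on your transport settings, so no threshold is assumed here.

```powershell
# Hypothetical helper: report free space for every ready drive via .NET DriveInfo
function Get-DriveFreeSpace {
    [System.IO.DriveInfo]::GetDrives() |
        Where-Object { $_.IsReady -and $_.TotalSize -gt 0 } |
        ForEach-Object {
            New-Object PSObject -Property @{
                Name        = $_.Name
                FreeGB      = [math]::Round($_.AvailableFreeSpace / 1GB, 1)
                PercentFree = [math]::Round(100 * $_.AvailableFreeSpace / $_.TotalSize, 1)
            }
        }
}

# Tightest drives first
Get-DriveFreeSpace | Sort-Object PercentFree | Format-Table Name, FreeGB, PercentFree -AutoSize
```

Run it on the Exchange server itself and look at the drive that hosts the queue database.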

The key phrase that you’ll want to keep in mind with this is “back pressure.” In a later post I’ll delve into this term.

More to the topic at hand, here’s an extra PowerShell fix for you to keep those IIS log files under control. It can also be easily customized for BES logs or other logging-happy programs. Or even just for keeping your temp files cleaned up regularly. You’ll want to set it to run as a scheduled task on a daily, weekly or monthly basis depending upon your organization’s policies.

# CleanIISLogs.ps1
# Find and remove files older than $days
# Set $LogPath to where the IIS logs you want to recycle are kept
#

$days = 31
# The path must be quoted, otherwise PowerShell tries to run it as a command
$LogPath = "C:\inetpub\logs\LogFiles\W3SVC1"
# Find the target date
$startdate = (Get-Date).AddDays(-$days)

# Clean the directory of log files (not folders) older than the target date
Get-ChildItem -Path $LogPath -Recurse | where {-not $_.PSIsContainer -and $_.LastWriteTime -lt $startdate} | Remove-Item -Confirm:$false
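Before scheduling a delete job like this, it can help to rehearse the age filter somewhere harmless. The sketch below builds a scratch folder with one fresh and one backdated file (both names are made up), applies the same LastWriteTime test, and uses -WhatIf so that nothing is actually removed by the filter.

```powershell
# Rehearse the age filter on a scratch folder; nothing here touches real IIS logs
$days    = 31
$scratch = Join-Path ([System.IO.Path]::GetTempPath()) ("logtest_" + [guid]::NewGuid().ToString('N'))
New-Item -ItemType Directory -Path $scratch | Out-Null

# One fresh file and one backdated to 40 days ago (hypothetical names)
Set-Content -Path (Join-Path $scratch 'new.log') -Value 'fresh'
Set-Content -Path (Join-Path $scratch 'old.log') -Value 'stale'
(Get-Item (Join-Path $scratch 'old.log')).LastWriteTime = (Get-Date).AddDays(-40)

# Same test as the cleanup script: files only, older than the cutoff
$cutoff = (Get-Date).AddDays(-$days)
$stale  = @(Get-ChildItem -Path $scratch -Recurse |
    Where-Object { -not $_.PSIsContainer -and $_.LastWriteTime -lt $cutoff })

$stale | ForEach-Object { $_.Name }   # lists old.log only
$stale | Remove-Item -WhatIf          # reports what would be deleted, deletes nothing

# Tidy up the scratch folder
Remove-Item -Path $scratch -Recurse -Force
```

Once the filter behaves as expected, the real script can be scheduled. Something like schtasks /Create /TN "CleanIISLogs" /SC DAILY /ST 02:00 /TR "powershell.exe -NoProfile -File C:\Scripts\CleanIISLogs.ps1" would do it, where the task name and script path are placeholders.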

Is this post helpful to you or is there something you would like me to go into greater detail on? Please let me know, thanks.

Essential Exchange Troubleshooting – Send Email via Telnet

One of the best tools available for troubleshooting mail flow issues is right at your fingertips. It is quick, simple, and only requires a little training to use effectively. I am always surprised at how very few Exchange administrators seem to use it. You can see some of this in action in my previous NDR troubleshooting post. So let’s delve into some of the basics of how to use telnet for troubleshooting your mail flow issues.

First off, it is a great way to see if your SMTP service is even available. If you cannot connect to it via telnet then you immediately know that you need to check on the health of your services and if that is ok then you most likely have a firewall or other networking issue. So to execute this basic check pop open a command prompt and run

telnet myserver.contoso.com 25

Substitute your server address, and if you are troubleshooting an alternate port, change the 25 to whatever port that is. The majority of the time it will be port 25 though. If you receive a successful connection you should be greeted by your mail server’s banner, probably something along the lines of the one below

220 myserver.contoso.com Microsoft ESMTP MAIL Service ready at Mon, 27 May 2013 08:19:44 -0700

This is also a good time to check whether you are seeing the correct server address in your banner. If you are seeing the internal address for myserver.contoso.local you will want to update this.

At this point you need to respond with a HELO or EHLO command to start the SMTP conversation. What is the difference between them? HELO is for SMTP while EHLO is for ESMTP. In the context of sending an e-mail via telnet it won’t matter which you use, but it may be useful to use EHLO to see what verbs are being offered, especially if you suspect that there may be a problem with ESMTP.

EHLO mail.alternatecontoso.com

You should receive a response similar to what is below

250-myserver.contoso.com Hello [4.3.2.1]
250-SIZE
250-PIPELINING
250-DSN
250-ENHANCEDSTATUSCODES
250-STARTTLS
250-AUTH NTLM LOGIN
250-8BITMIME
250-BINARYMIME
250 CHUNKING

If you have seen all of the above then so far so good. Your routing is good (assuming you aren’t being routed to the wrong SMTP server, but if you are and you don’t know it then you have bigger problems), your firewall configuration is correct and your hub transport is listening. Also note that the verbs sent above show this service supports TLS and authentication, as you can see in the STARTTLS and AUTH NTLM LOGIN verbs.
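If you paste an EHLO response into a script, picking out the advertised verbs is simple text parsing. The sketch below is only an illustration with a made-up function name. It relies on the standard multi-line reply format, where every line but the last uses "250-" and the last uses "250 ", and it skips the first line because that is the greeting.

```powershell
# Hypothetical helper: turn a pasted EHLO response into keyword/parameter pairs
function Get-EhloExtension {
    param([string[]]$ResponseLines)

    # The first line is the greeting; the rest are "250-KEYWORD [params]",
    # with "250 KEYWORD [params]" marking the final line
    $ResponseLines | Select-Object -Skip 1 | ForEach-Object {
        if ($_ -match '^250[- ](\S+)\s*(.*)$') {
            New-Object PSObject -Property @{
                Keyword    = $Matches[1].ToUpper()
                Parameters = $Matches[2]
            }
        }
    }
}

$sample = @(
    '250-myserver.contoso.com Hello [4.3.2.1]',
    '250-SIZE',
    '250-PIPELINING',
    '250-DSN',
    '250-ENHANCEDSTATUSCODES',
    '250-STARTTLS',
    '250-AUTH NTLM LOGIN',
    '250-8BITMIME',
    '250-BINARYMIME',
    '250 CHUNKING'
)

$ext = @(Get-EhloExtension -ResponseLines $sample)
$keywords = $ext | ForEach-Object { $_.Keyword }
$keywords -contains 'STARTTLS'                                    # True
($ext | Where-Object { $_.Keyword -eq 'AUTH' }).Parameters        # NTLM LOGIN
```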

Now we want to start sending an email to someone on this server. Most likely your postmaster account since you are just testing your mail flow.

MAIL FROM: someone@alternatecontoso.com

You should receive a Sender OK response. If not then you’ll know that you need to look into sender permissions.

250 2.1.0 Sender OK

We need to specify who we are sending to

RCPT TO: postmaster@contoso.com

Here you should receive a Recipient OK response. This is the part where the conversation is most likely to break down and you will get an error code that you can start working with.

250 2.1.5 Recipient OK

So far so good, now we can send the actual email. Start off with the DATA command

DATA

And the server will be ready to receive your input. You can get as fancy or as simple as you like here but once you are done with the message use a . to end the mail input

354 Start mail input; end with <CRLF>.<CRLF>
.
250 2.6.0 <53ea1be2-3d1a-4856-8bdf-3c576c14cfc0@mail.contoso.com> [InternalId=47975] Queued mail for delivery

Assuming everything is still going well you should see either a queued mail for delivery or a spam rejection, depending upon how strict your spam filter is. You also may get an error message that would be worth researching as well. If everything is still going well you can close out the conversation.

QUIT
221 2.0.0 Service closing transmission channel

This is all for sending an email similar to how an anonymous server on the Internet would send an e-mail. There are a few variations that you will want to be aware of as well though. The first is for testing relaying. When you get to the part where you input the recipient you will input a remote server recipient.

RCPT TO: foreignemail@externalserver.com

If you are NOT wanting the server to relay the expected response would be

550 5.7.1 Unable to relay

This is a good thing as you do not want open relays sitting around. But on the other hand, if this is an internal connector that is supposed to be relaying, then you could have a permissions problem on your hands.
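The replies seen so far all follow the same pattern, and the first digit tells you most of what you need: 2 means success, 3 means the server is waiting for more input, 4 is a temporary failure worth retrying and 5 is a permanent rejection. A small parser along these lines (the function name is made up for illustration) can sort pasted replies into those buckets.

```powershell
# Hypothetical helper: split an SMTP reply line into its parts and classify it
function ConvertFrom-SmtpReply {
    param([string]$Line)

    # 3-digit code, optional enhanced status code (x.y.z), then free text
    if ($Line -match '^(\d{3})[ -](?:(\d\.\d{1,3}\.\d{1,3})\s+)?(.*)$') {
        $code     = $Matches[1]
        $enhanced = $Matches[2]
        $text     = $Matches[3]
    }
    else {
        throw "Not an SMTP reply line: $Line"
    }

    # The first digit of the code decides the class
    $class = switch ($code.Substring(0, 1)) {
        '2'     { 'Success' }
        '3'     { 'Intermediate' }
        '4'     { 'TransientFailure' }
        '5'     { 'PermanentFailure' }
        default { 'Unknown' }
    }

    New-Object PSObject -Property @{
        Code           = [int]$code
        EnhancedStatus = $enhanced
        Class          = $class
        Text           = $text
    }
}

(ConvertFrom-SmtpReply '250 2.1.5 Recipient OK').Class                    # Success
(ConvertFrom-SmtpReply '452 4.3.1 Insufficient System Resources').Class   # TransientFailure
(ConvertFrom-SmtpReply '550 5.7.1 Unable to relay').Class                 # PermanentFailure
```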

The other method that you would normally use telnet testing for is authentication. This is a bit more complex. After your HELO/EHLO command you will issue

AUTH LOGIN

To which if basic authentication is supported you will receive a response of

334 VXNlcm5hbWU6

This gibberish is actually a base64 encoded response that says “Username:”. The expected response to this is a base64 encoded username. An online utility I recommend is at this site, which is pretty simple for encoding/decoding base64 messages. So translate the username you are attempting to use into base64 and respond with that. I responded with “logon” encoded.

bG9nb24=

You should receive a response of

334 UGFzc3dvcmQ6

Which translates to “Password:”. So now you need to respond with the account’s password encoded in base64. My response was “My simple password”

TXkgc2ltcGxlIHBhc3N3b3Jk

You should now receive an Authentication successful message

235 Authentication succeeded

And you can continue with the rest of your steps of sending an email.
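If you would rather not paste credentials into a website, the same encoding and decoding can be done locally with .NET. The two helper functions below are made up for illustration. Keep in mind that base64 is an encoding and not encryption, so anyone who can see the session can read the credentials unless it is protected by TLS.

```powershell
# Hypothetical helpers for the AUTH LOGIN prompts and responses
function ConvertTo-Base64 {
    param([string]$Text)
    [Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes($Text))
}

function ConvertFrom-Base64 {
    param([string]$Encoded)
    [Text.Encoding]::UTF8.GetString([Convert]::FromBase64String($Encoded))
}

ConvertFrom-Base64 'VXNlcm5hbWU6'        # Username:
ConvertTo-Base64   'logon'               # bG9nb24=
ConvertFrom-Base64 'UGFzc3dvcmQ6'        # Password:
ConvertTo-Base64   'My simple password'  # TXkgc2ltcGxlIHBhc3N3b3Jk
```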

Was this post helpful? Do you have any topics you would be interested in seeing me cover in a later blog post? Just leave your suggestion in the comments below or shoot me an email.

OWA Login – Your Account has been Disabled

While this may not be a common issue, or at least I certainly hope it is not a common issue for you, it can be a bit vexing to figure out what is going on. You have a user with a recently restored account that is attempting to login to OWA and they are receiving an error similar to the following:

Your account has been disabled.

Copy error details to clipboard

Show details

Request

Url: https://mail.contoso.com:443/owa/

User host address: 1.2.3.4

User: Jane Doe

EX Address: /o=first organization/ou=exchange administrative group (fydibohf23spdlt)/cn=recipients/cn=jane doe96d

SMTP Address: jdoe@contoso.com

OWA version: 14.2.318.2

The steps leading up to this error are most likely as follows.

  1. A user’s account was deleted and their mailbox removed recently. Possibly by accident or possibly by company politics.
  2. The user’s account is recreated as opposed to restored (which means a new SID and all the fun that goes along with that) and their mailbox is reattached to the account.
  3. The user now attempts to login with their “new” account into their old mailbox.
  4. Angry calls to your help desk now ensue.

Most likely your first thought was to do an iisreset but in this case you would be wrong. Here is how you clear this issue up swiftly and easily. Open up the EMS and run:

Clean-MailboxDatabase -Identity <Database Name>

This kicks off a scan of AD that updates the status of disconnected mailboxes in the targeted database. Alternatively you could also just tell the user to wait until Exchange runs its maintenance cycle on the database but that answer definitely won’t win you any friends. Now why does this need to be done? As you’ve probably suspected it is due to cached AD information of the disconnected mailboxes. For more info take a look at KB2682047.

SharePoint 2013 mystery error ID4220: The SAML Assertion is either not signed …

While implementing a fresh SharePoint 2013 claims based authentication site using ADFS 2.0 I ran across this error.

ID4220: The SAML Assertion is either not signed or the signature’s KeyIdentifier cannot be resolved to a SecurityToken. Ensure that the appropriate issuer tokens are present on the token resolver. To handle advanced token resolution requirements, extend Saml11TokenSerializer and override ReadToken.

A Bing/Google search turned up precious little information on this error, and what there was mostly pertained to custom providers, which were not being implemented on the site as it was using the out of the box provider. Going through and validating rules and URLs turned up precious little as well. It did sound a lot like a certificate error though, and carefully looking into the certificates used showed that I had exported and imported the wrong certificate on the STS. I had grabbed the token decrypting certificate instead of the token signing certificate. This is easily corrected. Export the certificate to a DER encoded file and then use the following commands to update your STS with the correct certificate.

$certPath = "C:\certs\tokensigner.cer"
$cert = New-Object System.Security.Cryptography.X509Certificates.X509Certificate2("$certPath")
New-SPTrustedRootAuthority -Name "Token Signing Certificate" -Certificate $cert
$sts = Get-SPTrustedIdentityTokenIssuer
$sts | Set-SPTrustedIdentityTokenIssuer -ImportTrustCertificate $cert

Getting a Windows RRAS VPN server working on XenServer

A quick note on this. Today I was troubleshooting a newly set up Windows RRAS PPTP VPN server that was not working. Or rather it was kind of working. You could connect and authenticate, but when it came time to pass traffic you could only ping the RRAS server itself. Which is a bit troublesome if you want to access anything else on the network such as your file server, your domain controller, your Exchange server and so forth.

Capturing traffic via Wireshark did show that traffic from the VPN client would pass beyond the RRAS server and a reply would be sent. It just never made it back to the client from the RRAS server. Some quick queries to Google turned up little beyond the more familiar problems of incorrectly configured multihomed RRAS servers, which proved not to be the case here. It turned out that TCP offloading was rearing its ugly head again. After switching that off in the properties for the NIC in question, traffic immediately started passing back and forth properly. This made for happy clients. So the moral of the story is probably that you should always suspect offloading no matter how fixed it is claimed to be. Or perhaps to use Intel NICs instead of Broadcom, but that remains something that I will have to test out later if I get the opportunity.

Problems Installing Exchange 2010 Service Pack 2 on SBS 2011

Now these problems that occur are very likely originating from an already rather screwed up installation of SBS 2011. I was not involved in the original setup of this particular server but I do know that there had been a large number of problems originally encountered. In this instance the task was to get Exchange 2010 SP2 installed. There are several hoops that you may have to jump through to get this installed, here I will recount what I was required to do.

Firstly, you need to make sure that you have closed any instance of the SBS Console. Otherwise you’ll get a failure in the prerequisites. Initially you’ll also need to stop the Windows SBS Manager service, though if you can get the install to progress to the point of working on the installed roles rather than the organization, that will no longer be a requirement. Once you’re past those prerequisites, in theory your installation should go smoothly. But if that is not the case, then read on.

The next problem you may encounter is an error in the Hub Transport Role. From the event logs you’ll find this error:

 Event ID 1002 MSExchangeSetup
 Exchange Server component Hub Transport Role failed.
 Error: Error:
 The following error was generated when "$error.Clear();
 if (get-service MSExchangeServiceHost* | where {$_.name -eq "MSExchangeServiceHost"})
 {
 restart-service MSExchangeServiceHost
 }
 " was run: "Service 'Microsoft Exchange Service Host (MSExchangeServiceHost)' cannot be started due to the following error: Cannot start service MSExchangeServiceHost on computer '.'.".
Service 'Microsoft Exchange Service Host (MSExchangeServiceHost)' cannot be started due to the following error: Cannot start service MSExchangeServiceHost on computer '.'.
Cannot start service MSExchangeServiceHost on computer '.'.
The service cannot be started, either because it is disabled or because it has no enabled devices associated with it

Checking your services you’ll also find all of the Exchange services disabled. Service packs and update rollups usually disable the services to prevent them from starting up unexpectedly while the update is being installed, but in this case for some reason SP2 is jinxing itself by not allowing itself to start a couple of services it needs in order to continue. The easiest way to get around this, though not necessarily the safest, is to make sure that at this point all the Exchange services are set to Manual or Automatic. When you see setup get down to the point of setting up the Hub Transport Role, watch your services and wait for them all to be set to disabled. Once they are, pop open a PowerShell prompt and run:

Get-Service | where {$_.DisplayName -match "Microsoft Exchange"} | Set-Service -StartupType Manual

Now setup will be able to continue with starting the services that it requires. This may lead to your next problem: it will fail on generating a new self-signed certificate for the Exchange Transport service. You’ll find this error in the event logs:

Event ID 1002 MSExchangeSetup
 Exchange Server component Hub Transport Role failed.
 Error: Error:
 The following error was generated when "$error.Clear();
 Write-ExchangeSetupLog -Info "Creating SBS certificate";
$thumbprint = [Microsoft.Win32.Registry]::GetValue("HKEY_LOCAL_MACHINE\Software\Microsoft\SmallBusinessServer\Networking", "LeafCertThumbPrint", $null);
if (![System.String]::IsNullOrEmpty($thumbprint))
 {
 Write-ExchangeSetupLog -Info "Enabling certificate with thumbprint: $thumbprint for SMTP service";
 Enable-ExchangeCertificate -Thumbprint $thumbprint -Services SMTP;
Write-ExchangeSetupLog -Info "Removing default Exchange Certificate";
 Get-ExchangeCertificate | where {$_.FriendlyName.ToString() -eq "Microsoft Exchange"} | Remove-ExchangeCertificate;
Write-ExchangeSetupLog -Info "Checking if default Exchange Certificate is removed";
 $certs = Get-ExchangeCertificate | where {$_.FriendlyName.ToString() -eq "Microsoft Exchange"};
 if ($certs)
 {
 Write-ExchangeSetupLog -Error "Failed to remove existing exchange certificate"
 }
 }
 else
 {
 Write-ExchangeSetupLog -Warning "Cannot find the SBS certificate";
 }
 " was run: "The internal transport certificate cannot be removed because that would cause the Microsoft Exchange Transport service to stop. To replace the internal transport certificate, create a new certificate. The new certificate will automatically become the internal transport certificate. You can then remove the existing certificate.".
The internal transport certificate cannot be removed because that would cause the Microsoft Exchange Transport service to stop. To replace the internal transport certificate, create a new certificate. The new certificate will automatically become the internal transport certificate. You can then remove the existing certificate.
Error:
 The following error was generated when "$error.Clear();
 Write-ExchangeSetupLog -Info "Creating SBS certificate";
$thumbprint = [Microsoft.Win32.Registry]::GetValue("HKEY_LOCAL_MACHINE\Software\Microsoft\SmallBusinessServer\Networking", "LeafCertThumbPrint", $null);
if (![System.String]::IsNullOrEmpty($thumbprint))
 {
 Write-ExchangeSetupLog -Info "Enabling certificate with thumbprint: $thumbprint for SMTP service";
 Enable-ExchangeCertificate -Thumbprint $thumbprint -Services SMTP;
Write-ExchangeSetupLog -Info "Removing default Exchange Certificate";
 Get-ExchangeCertificate | where {$_.FriendlyName.ToString() -eq "Microsoft Exchange"} | Remove-ExchangeCertificate;
Write-ExchangeSetupLog -Info "Checking if default Exchange Certificate is removed";
 $certs = Get-ExchangeCertificate | where {$_.FriendlyName.ToString() -eq "Microsoft Exchange"};
 if ($certs)
 {
 Write-ExchangeSetupLog -Error "Failed to remove existing exchange certificate"
 }
 }
 else
 {
 Write-ExchangeSetupLog -Warning "Cannot find the SBS certificate";
 }
 " was run: "Failed to remove existing exchange certificate".
Failed to remove existing exchange certificate

This is a very verbose yet also very helpful error. You’ll most likely encounter this if you are not using the default self-signed certificates but have installed a third party certificate. Though I didn’t check in this case, reviewing the commands being run suggests it may be choking on a third party certificate that has a friendly name of Microsoft Exchange. To fix this one, first make sure you have a copy of your third party certificate available; if you don’t, export a copy, as you’ll need it later. Then run through the SBS Set up your Internet address wizard. This will generate another self-signed certificate and replace the third party certificate you have in place. It will also remove the third party certificates from your certificate store, which is why you need that copy. Once you have done this, re-run setup and you’ll be able to finish your installation of SP2. Don’t forget to put the third party certificate back in place, and it would also be a good idea to run ExBPA to make sure you are still in compliance. You’ll also want to make sure that all of your Exchange services are set back to their appropriate startup values, as you may be left with all the services set to disabled.

USB Drive Disappears from Removable Storage on XenServer after a Reboot

Quick fix for an annoying problem I ran across where the removable storage no longer shows the attached USB drives after a reboot under XenServer 5.6. Pop open a console window on your XenServer host:

modprobe -r usb_storage    # this unloads the usb_storage kernel driver

modprobe usb_storage       # this reloads the usb_storage kernel driver

That should get you your drives back, and if you don’t see them then just do a rescan.

xe sr-list | grep -i removable -B 1    # use this to find the UUID of your removable storage SR

xe sr-scan uuid=<uuid of removable storage>    # your USB drives should be showing up now, ready to be attached to your VM

The Case of the Mysterious Crashing Application

A recent client had migrated off their terminal server and onto a virtualized 2008 R2 RDS server. Actually a farm of them, but for this case it did not matter. Their previous setup had all been contained on one 2003 server, which also ran their AD, print server, and whatever else was crammed into the kitchen sink. The new setup had some proper separation and centralized storage, all on 2008 servers. For all of their data and programs they would now reach into a file share on the SAN. This was working great, except that one program kept crashing unless its data files were local to the server. The event IDs were as follows, with one immediately following the other:

Event ID 1000

Application Error

Description:

Faulting application name: OMNIS7.exe, version: 8.0.0.0, time stamp: 0x3bb82293

Faulting module name: OMNIS7.exe, version: 8.0.0.0, time stamp: 0x3bb82293

Exception code: 0xc0000006

Event ID 1005

Application Error

Description:

Windows cannot access the file  for one of the following reasons: there is a problem with the network connection, the disk that the file is stored on, or the storage drivers installed on this computer; or the disk is missing. Windows closed the program Omnis 7 core executable because of this error.

Program: Omnis 7 core executable

File:

The error value is listed in the Additional Data section.

User Action

1. Open the file again. This situation might be a temporary problem that corrects itself when the program runs again.

2. If the file still cannot be accessed and

– It is on the network, your network administrator should verify that there is not a problem with the network and that the server can be contacted.

– It is on a removable disk, for example, a floppy disk or CD-ROM, verify that the disk is fully inserted into the computer.

3. Check and repair the file system by running CHKDSK. To run CHKDSK, click Start, click Run, type CMD, and then click OK. At the command prompt, type CHKDSK /F, and then press ENTER.

4. If the problem persists, restore the file from a backup copy.

5. Determine whether other files on the same disk can be opened. If not, the disk might be damaged. If it is a hard disk, contact your administrator or computer hardware vendor for further assistance.

Additional Data

Error value: C00000C4

Disk type: 0

It was very swiftly recognize that the situation was not a temporary problem though unfortunately a sporadic one. Patterns noted were that crashed the most often in the morning when everyone would sign on and the afternoon when everyone was closing out. Now 0xC00000C4 is STATUS_UNEXPECTED_NETWORK_ERROR but that doesn’t provide much to go on. Grabbing some performance logs also showed that there shouldn’t be a network performance problem either bandwidth-wise. The first thing that was tried was disabling rss and offloading but that did not help matters. Doing more research I was lead to believe that the problem was being caused by oplocks.

Oplocks, short for Opportunistic Locking, is a mechanism in the SMB protocol that was designed to allow multiple processes to lock a file while providing client side caching. The purpose of this is to improve performance for the clients on the network. For more reading on this consult this document and this document. So all the crashing basically came down to cache integrity, since the database used by the client was a flat file instead of a transactional database.

To disable oplocks on the server you go into this key: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters

And set EnableOplocks to 0. If it is not there, create it as a REG_DWORD. Reboot for the change to take effect.
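If you would rather script the change than click through regedit, here is a minimal sketch in PowerShell. It assumes an elevated prompt on the file server and only touches the EnableOplocks value named above; treat it as a starting point rather than a tested deployment script.

```powershell
# Run from an elevated PowerShell prompt on the file server.
$params = 'HKLM:\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters'

# -Force creates EnableOplocks if it is missing and overwrites it if it exists.
New-ItemProperty -Path $params -Name EnableOplocks -PropertyType DWord -Value 0 -Force | Out-Null

# Read the value back to confirm it before rebooting.
(Get-ItemProperty -Path $params -Name EnableOplocks).EnableOplocks
```

The reboot is still required afterwards.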

Unfortunately Server 2008 introduces a new problem. Server 2008 will communicate via SMB2 with any client running Vista or newer, and SMB2 does not allow oplocks to be disabled. The workaround is that if SMB2 is disabled on either the client or the server, communication will fall back to the original SMB. The easiest way to fix this, then, is to disable SMB2 on the server.

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\LanmanServer\Parameters

Create a REG_DWORD named SMB2 and set it to 0. Reboot for the change to take effect.
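The same kind of sketch works for the SMB2 value. Again this assumes an elevated PowerShell prompt on the server and does nothing beyond creating the value described above.

```powershell
# Run from an elevated PowerShell prompt on the file server.
$params = 'HKLM:\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters'

# Create (or overwrite) the SMB2 DWORD and set it to 0 to turn SMB2 off server-side.
New-ItemProperty -Path $params -Name SMB2 -PropertyType DWord -Value 0 -Force | Out-Null

# Read the value back to confirm it before rebooting.
(Get-ItemProperty -Path $params -Name SMB2).SMB2
```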

You may notice that the server takes substantially longer to start up after making this change. The delays were severe enough that I decided to test an alternative method for disabling SMB2. Since communication will fall back to SMB if either the client or the server does not support SMB2, SMB2 can be disabled on the client side instead. Disabling it on the client side is a bit different, since you’re actually disabling a service.

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\LanmanWorkstation

You may want to back up this key for easy restoration. Then edit DependOnService and remove MRxSmb20.

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\mrxsmb20

You may want to back up this key as well. In this key set Start to 4. Reboot your client and SMB2 will now be disabled.
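For more than a handful of clients, the two edits above can be scripted. This is a rough sketch, assuming an elevated PowerShell prompt on the client; the backup folder C:\Temp is just a placeholder I picked for illustration, so point it at any folder that exists. It removes only the MRxSmb20 entry and leaves whatever else is in DependOnService alone.

```powershell
# Run from an elevated PowerShell prompt on the client.
$services    = 'HKLM:\SYSTEM\CurrentControlSet\Services'
$workstation = "$services\LanmanWorkstation"
$smb2Driver  = "$services\mrxsmb20"
$backupDir   = 'C:\Temp'   # assumption: any existing folder you can write to

# Back up both keys first so they can be re-imported later if needed.
reg.exe export 'HKLM\SYSTEM\CurrentControlSet\Services\LanmanWorkstation' "$backupDir\LanmanWorkstation.reg" /y
reg.exe export 'HKLM\SYSTEM\CurrentControlSet\Services\mrxsmb20' "$backupDir\mrxsmb20.reg" /y

# Drop MRxSmb20 from the dependency list, leaving the other entries untouched.
$deps    = (Get-ItemProperty -Path $workstation -Name DependOnService).DependOnService
$newDeps = [string[]]@($deps | Where-Object { $_ -ne 'MRxSmb20' })
New-ItemProperty -Path $workstation -Name DependOnService -PropertyType MultiString -Value $newDeps -Force | Out-Null

# Start = 4 marks the SMB2 driver service as disabled.
New-ItemProperty -Path $smb2Driver -Name Start -PropertyType DWord -Value 4 -Force | Out-Null
```

To roll back, import the two exported .reg files and reboot.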

Ever since implementing these changes the client’s applications have been running solid as a rock. For further reading on oplocks from Microsoft, check here.

Addressing P2V 0x7b Issues

The other night I was P2Ving several systems and on one I ran into the issue of it blue screening on boot. It is unfortunate but not too uncommon, as you usually need to enable IDE drivers on the system prior to the P2V. Microsoft’s article here works for all versions of XP and Server 2003, though I found I needed to expand the mentioned drivers directly from the CD for the SBS system I was working on. That unfortunately did not resolve my 0x7b blue screen the other night. This article turned out to be the key to what I needed. Now the part that neither of these mentions is how to fix the problem if you can’t even boot that VM, so as to avoid having to do another P2V of the system. With Server 2008 it is possible to avoid that, and it can save you a lot of time, especially if the systems are large.

Server 2008 contains a great feature of being able to mount VHDs, which is what we’ll be doing. For the first method you’ll want to mount the VHD to a drive letter and then expand the drivers to the \windows\system32\drivers folder in the VHD. Pull up regedit and select the HKLM key. Go to File->Load Hive, open the system registry hive from \windows\system32\config\ on the VHD, and give it an easily identifiable name. You’ll find the registry loaded in HKLM under the name that you gave. Now, loading the registry this way you won’t find a CurrentControlSet under the SYSTEM key, because CurrentControlSet is just a pointer to ControlSetxxx. To find out which ControlSet number the system is set to boot with, look in SYSTEM\Select. The Current dword contains the number that it is using, which in most cases will be 1, so go into that particular ControlSet, i.e. for 1 it will be ControlSet001. In there you can manually implement the keys from the first article or the second article. In the case of the problem I ran into, I had to set the Group value of wdf01000 to WdfLoadGroup as it was part of the base group. If you want to learn more about service orders take a look at this article and this article.

Once done with those changes unload the hive and close out of regedit. Dismount the VHD and your virtual machine should be good to go.
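The registry half of that procedure can also be done from the command line. The sketch below is only an illustration and makes a few assumptions that are mine: the VHD is already mounted as V:, the hive is loaded under the made-up name OfflineSystem, and the only change needed is the wdf01000 Group value from my case. Adjust all three to suit.

```powershell
# Run from an elevated PowerShell prompt on the host with the VHD mounted.
$hiveFile = 'V:\Windows\System32\config\SYSTEM'   # assumption: VHD mounted as V:
$hiveName = 'OfflineSystem'                       # hypothetical name for the loaded hive

# Load the offline SYSTEM hive under HKLM.
reg.exe load "HKLM\$hiveName" $hiveFile

# SYSTEM\Select\Current holds the number of the control set the system boots with.
$current    = (Get-ItemProperty -Path "HKLM:\$hiveName\Select").Current
$controlSet = 'ControlSet{0:D3}' -f [int]$current   # 1 becomes ControlSet001

# Set the Group value of the wdf01000 service inside that control set.
reg.exe add "HKLM\$hiveName\$controlSet\Services\Wdf01000" /v Group /t REG_SZ /d WdfLoadGroup /f

# Release any registry handles PowerShell may still hold, then unload the hive.
[gc]::Collect()
[gc]::WaitForPendingFinalizers()
reg.exe unload "HKLM\$hiveName"
```

If the unload reports access denied, something still has the hive open; close any regedit window pointed at it and run the unload again before dismounting the VHD.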
