Debug: WER and WinDbg
Windows Error Reporting (WER) provides very useful information for debugging production issues, especially unhandled exceptions.
The CLR integrates with WER seamlessly and gives you what you need to track down the issue. WER writes an event log record that contains the path to the .wer file.
A typical .wer file contains a section of Problem Signatures:
Sig[0].Name=Problem Signature 01
Sig[0].Value=WerTest.exe
Sig[1].Name=Problem Signature 02
Sig[1].Value=1.0.0.0
Sig[2].Name=Problem Signature 03
Sig[2].Value=5b68a7e8
Sig[3].Name=Problem Signature 04
Sig[3].Value=WerTest
Sig[4].Name=Problem Signature 05
Sig[4].Value=1.0.0.0
Sig[5].Name=Problem Signature 06
Sig[5].Value=5b68a7e8
Sig[6].Name=Problem Signature 07
Sig[6].Value=1
Sig[7].Name=Problem Signature 08
Sig[7].Value=5
Sig[8].Name=Problem Signature 09
Sig[8].Value=System.NotImplementedException
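When there are many .wer files to go through, it can help to pull the signatures out programmatically. Here is a minimal sketch in Python. The function name parse_signatures and the SAMPLE constant are made up for illustration, and the code assumes every entry sits on its own Sig[n].Name= or Sig[n].Value= line, as in the listing above.

```python
import re

# Matches lines such as "Sig[6].Name=Problem Signature 07" or "Sig[6].Value=1".
SIG_LINE = re.compile(r"^Sig\[(\d+)\]\.(Name|Value)=(.*)$")


def parse_signatures(wer_text):
    """Return {index: {"Name": ..., "Value": ...}} for every Sig[n] entry."""
    sigs = {}
    for line in wer_text.splitlines():
        match = SIG_LINE.match(line.strip())
        if match:
            index, field, text = match.groups()
            sigs.setdefault(int(index), {})[field] = text
    return sigs


# A few entries copied from the listing above.
SAMPLE = """\
Sig[6].Name=Problem Signature 07
Sig[6].Value=1
Sig[7].Name=Problem Signature 08
Sig[7].Value=5
Sig[8].Name=Problem Signature 09
Sig[8].Value=System.NotImplementedException
"""

if __name__ == "__main__":
    for index, entry in sorted(parse_signatures(SAMPLE).items()):
        print(index, entry["Name"], "=", entry["Value"])
```

To use it on a real file, read the file into a string first. The .wer files I have seen are UTF-16 encoded, so open(path, encoding="utf-16") is a reasonable first guess, but check the encoding on your system.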
Problem Signature 07 and Problem Signature 08 are the key pieces of information for locating the faulting code.
In this example the faulting method is Method #1 (Sig[6].Value is 1, which is the method token 06000001 with its high byte stripped off) and the faulting instruction is at offset IL_0005 (Sig[7].Value is 5):
Method #1 (06000001) [ENTRYPOINT]
-------------------------------------------------------
MethodName: Main (06000001)
IL_0005: /* 6F | (0A)000010 */ callvirt instance void [WerTestLib]WerTestLib.WerExceptionClass::FailedMethod()
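The mapping from the two signature values to the disassembly can be written down as a small sketch. It assumes both values are hexadecimal strings and that the method row number belongs to the MethodDef metadata table (high byte 06), which matches the token 06000001 shown above. The function names method_token and il_label are hypothetical.

```python
METHODDEF_TABLE = 0x06  # high byte of a MethodDef metadata token


def method_token(signature_07):
    """Rebuild the full method token from Problem Signature 07 (Sig[6].Value)."""
    row = int(signature_07, 16)  # assumption: the value is hexadecimal
    return (METHODDEF_TABLE << 24) | row


def il_label(signature_08):
    """Turn Problem Signature 08 (Sig[7].Value) into an IL_xxxx label."""
    offset = int(signature_08, 16)  # assumption: the value is hexadecimal
    return "IL_%04x" % offset


if __name__ == "__main__":
    # Values from the .wer file above.
    print("%08x" % method_token("1"))
    print(il_label("5"))
```

With the token and the label in hand, we can search the disassembly of the assembly named in Problem Signature 04 for the method and the instruction.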
But sometimes more complex code can be really difficult to debug. Let's take the following .wer file as an example:
Sig[0].Name=Application Name
Sig[0].Value=w3wp.exe
Sig[1].Name=Application Version
Sig[1].Value=8.5.9600.16384
Sig[2].Name=Application Timestamp
Sig[2].Value=5215df96
Sig[3].Name=Fault Module Name
Sig[3].Value=clr.dll
Sig[4].Name=Fault Module Version
Sig[4].Value=4.7.3062.0
Sig[5].Name=Fault Module Timestamp
Sig[5].Value=5ab9567c
Sig[6].Name=Exception Code
Sig[6].Value=c00000fd
Sig[7].Name=Exception Offset
Sig[7].Value=00000000000b1c2b
This example is much more interesting because the unhandled exception occurred inside the CLR itself. It was raised in the IIS worker process (w3wp.exe). Here Sig[6].Value contains the exception code c00000fd, the Windows status code for a stack overflow, so we are actually dealing with a StackOverflowException. The .wer file gives us nothing else to debug with. To get more data we need a dump. In a typical production environment dump creation is not configured by default, so it has to be enabled first. A full dump file is required to get the complete picture in this case.
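The post does not say how dump creation was enabled. One commonly used mechanism is the WER LocalDumps registry key, where DumpType 2 requests a full dump. The sketch below only generates the text of a .reg file and does not touch the registry. The folder C:\CrashDumps, the dump count of 5 and the function names are assumptions chosen for illustration.

```python
LOCAL_DUMPS_KEY = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows"
    r"\Windows Error Reporting\LocalDumps"
)


def expand_sz(text):
    """Encode a string in the hex(2) form .reg files use for REG_EXPAND_SZ."""
    raw = (text + "\0").encode("utf-16-le")  # null-terminated UTF-16LE
    return "hex(2):" + ",".join("%02x" % byte for byte in raw)


def local_dumps_reg(process_name, dump_folder, dump_count=5):
    """Build .reg text that asks WER for full dumps of one executable."""
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        "[%s\\%s]" % (LOCAL_DUMPS_KEY, process_name),
        '"DumpFolder"=' + expand_sz(dump_folder),
        '"DumpCount"=dword:%08x' % dump_count,
        '"DumpType"=dword:00000002',  # 2 = full dump
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    # Hypothetical folder; pick one the worker process identity can write to.
    print(local_dumps_reg("w3wp.exe", r"C:\CrashDumps"))
```

The output can be saved to a .reg file and imported with administrative rights. Review it before applying it to a production server, since full dumps of w3wp.exe can be large.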
Once we have a dump file we can open and analyze it with WinDbg. WinDbg is a powerful tool, and describing its functionality would take a separate blog post. Here we'll focus only on finding the bug.
After opening the dump, the WinDbg command window confirms the StackOverflowException.
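The confirmation comes down to the same code we saw in the .wer file: c00000fd is the NTSTATUS value for a stack overflow. A small lookup table makes it easy to read exception codes without leaving the .wer file. The table below is a sketch that lists only a handful of well-known codes, and the function name describe is hypothetical.

```python
# A few well-known exception codes; far from a complete list.
KNOWN_CODES = {
    0x80000003: "STATUS_BREAKPOINT",
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000094: "STATUS_INTEGER_DIVIDE_BY_ZERO",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
    0xE0434352: "CLR managed exception",
}


def describe(code_text):
    """Map a hex exception code from a .wer file to a readable name."""
    code = int(code_text, 16)
    return KNOWN_CODES.get(code, "unknown code 0x%08X" % code)


if __name__ == "__main__":
    print(describe("c00000fd"))  # the value of Sig[6] in the w3wp.exe report
```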
The ~# command makes WinDbg show the thread that caused the exception (~#s also switches to that thread).
In this example the thread that caused the exception is thread 182. To inspect managed code we need the SOS Debugging Extension. In our case, for CLR version 4.0+, the .loadby sos clr command loads it. Once it is loaded, the !eestack command displays a stack trace for every thread in the process. Searching the output for thread 182 gives us the stack trace that shows what caused the exception.
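The same commands can be run unattended with cdb.exe, the console debugger that ships alongside WinDbg. It accepts -z to open a dump file and -c to run a command string. The sketch below only assembles and prints the command line. The dump path is hypothetical, cdb.exe is assumed to be on the PATH, and the function names are made up for illustration.

```python
import subprocess

# The commands used in this post, plus "q" to quit when they finish.
DEBUGGER_COMMANDS = [
    ".loadby sos clr",  # load the SOS extension next to the loaded clr.dll
    "~#",               # show the thread that caused the exception
    "!eestack",         # stack traces for every thread in the process
    "q",
]


def build_cdb_args(dump_path, cdb_exe="cdb.exe"):
    """Return the argument list for a non-interactive cdb session."""
    return [cdb_exe, "-z", dump_path, "-c", "; ".join(DEBUGGER_COMMANDS)]


def analyze(dump_path):
    """Run cdb on the dump and return everything it printed."""
    result = subprocess.run(
        build_cdb_args(dump_path), capture_output=True, text=True
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical path; printing the arguments does not require cdb.
    print(build_cdb_args(r"C:\CrashDumps\w3wp.dmp"))
```

Calling analyze returns the debugger output as a string, which can then be searched for the faulting thread number in the same way as in the WinDbg window.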