Debug: WER and WinDbg

Windows Error Reporting (WER) provides very useful information for debugging production issues, especially unhandled exceptions.

The CLR integrates with WER seamlessly and gives you what you need to track down the issue. WER writes an event log record that contains the path to the .wer file.



A typical .wer file for a managed crash contains a Problem Signatures section:

Sig[0].Name=Problem Signature 01
Sig[0].Value=WerTest.exe
Sig[1].Name=Problem Signature 02
Sig[1].Value=1.0.0.0
Sig[2].Name=Problem Signature 03
Sig[2].Value=5b68a7e8
Sig[3].Name=Problem Signature 04
Sig[3].Value=WerTest
Sig[4].Name=Problem Signature 05
Sig[4].Value=1.0.0.0
Sig[5].Name=Problem Signature 06
Sig[5].Value=5b68a7e8
Sig[6].Name=Problem Signature 07
Sig[6].Value=1
Sig[7].Name=Problem Signature 08
Sig[7].Value=5
Sig[8].Name=Problem Signature 09
Sig[8].Value=System.NotImplementedException

Here Problem Signature 07 and Problem Signature 08 are the most important pieces of information for finding the faulting code.

In this example the faulting method is Method #1 (Sig[6].Value is 1, which is the method's metadata token 06000001 with the high byte stripped off) and the faulting instruction is at offset IL_0005 (5 from Sig[7].Value):

 Method #1 (06000001) [ENTRYPOINT]
 -------------------------------------------------------
  MethodName: Main (06000001)
IL_0005:  /* 6F   | (0A)000010       */ callvirt   instance void [WerTestLib]WerTestLib.WerExceptionClass::FailedMethod()
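The lookup above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and it assumes that both signature values are hexadecimal and that the report has already been read into a string (.wer files are typically UTF-16 encoded).

```python
import re

# Matches lines such as "Sig[6].Value=1" and captures the index and the value.
SIG_VALUE = re.compile(r"^Sig\[(\d+)\]\.Value=(.*)$")


def parse_signatures(wer_text):
    """Return {index: value} for every Sig[n].Value line in the report text."""
    signatures = {}
    for line in wer_text.splitlines():
        match = SIG_VALUE.match(line.strip())
        if match:
            signatures[int(match.group(1))] = match.group(2)
    return signatures


def locate_fault(signatures):
    """Turn Problem Signature 07/08 into a MethodDef token and an IL label."""
    method_rid = int(signatures[6], 16)  # Problem Signature 07 (assumed hex)
    il_offset = int(signatures[7], 16)   # Problem Signature 08 (assumed hex)
    token = 0x06000000 | method_rid      # restore the stripped high byte (06)
    return f"{token:08X}", f"IL_{il_offset:04x}"


sample = """Sig[6].Name=Problem Signature 07
Sig[6].Value=1
Sig[7].Name=Problem Signature 08
Sig[7].Value=5
"""
print(locate_fault(parse_signatures(sample)))  # ('06000001', 'IL_0005')
```

The resulting token and label are the two strings to search for in the IL listing of the faulting assembly.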

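But debugging more complex code can be much harder. Let's take the following .wer file, produced on a French-language Windows installation, as an example. The signature names correspond to the English Application Name, Application Version, Application Timestamp, Fault Module Name, Fault Module Version, Fault Module Timestamp, Exception Code, and Exception Offset.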

Sig[0].Name=Nom de l’application
Sig[0].Value=w3wp.exe
Sig[1].Name=Version de l’application
Sig[1].Value=8.5.9600.16384
Sig[2].Name=Horodatage de l’application
Sig[2].Value=5215df96
Sig[3].Name=Nom du module par défaut
Sig[3].Value=clr.dll
Sig[4].Name=Version du module par défaut
Sig[4].Value=4.7.3062.0
Sig[5].Name=Horodateur du module par défaut
Sig[5].Value=5ab9567c
Sig[6].Name=Code de l’exception
Sig[6].Value=c00000fd
Sig[7].Name=Décalage de l’exception
Sig[7].Value=00000000000b1c2b

This example is much more interesting because the unhandled exception occurred inside the CLR itself (clr.dll), in the IIS worker process (w3wp.exe). Sig[6].Value contains the exception code: c00000fd is the Windows status code for a stack overflow, which surfaces in .NET as a StackOverflowException. The report gives us nothing else to debug with, so we need more data: a dump of the process. Dump creation is usually not enabled by default in a production environment, so it has to be activated first. A full dump is required to get the complete picture in this case.
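One way to activate dump creation, which may or may not match what was done here, is WER's LocalDumps registry key. The sketch below only builds the corresponding reg add commands; the helper name, the C:\CrashDumps folder, and the dump count are assumptions chosen for illustration.

```python
# Registry key that WER reads for user-mode crash dump settings.
LOCAL_DUMPS_KEY = (
    r"HKLM\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps"
)


def local_dumps_commands(process_name, dump_folder, dump_count=5):
    """Build 'reg add' commands enabling full dumps for a single executable."""
    # A subkey named after the executable limits the settings to that process.
    key = f"{LOCAL_DUMPS_KEY}\\{process_name}"
    values = [
        ("DumpFolder", "REG_EXPAND_SZ", f'"{dump_folder}"'),
        ("DumpCount", "REG_DWORD", str(dump_count)),
        ("DumpType", "REG_DWORD", "2"),  # 2 = full dump, 1 = mini dump
    ]
    return [
        f'reg add "{key}" /v {name} /t {kind} /d {data} /f'
        for name, kind, data in values
    ]


# Hypothetical folder; print the commands so they can be reviewed first.
for command in local_dumps_commands("w3wp.exe", r"C:\CrashDumps"):
    print(command)
```

The printed commands would have to be run from an elevated prompt on the server, and the account running the worker process needs write access to the dump folder.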

Once we have a dump file, we can open and analyze it with WinDbg. WinDbg is a powerful tool, and describing its functionality would take a separate blog post. Here we'll focus only on finding the bug.

After opening the dump, the WinDbg command window confirms the StackOverflowException.



The ~# command makes WinDbg show the thread that caused the exception; ~#s additionally switches to it.


In this example, thread 182 caused the exception. To inspect managed code, however, we need the SOS Debugging Extension. For CLR 4.0 and later, the .loadby sos clr command loads it. Once it is loaded, the !eestack command displays a stack trace for every thread in the process. Searching the output for thread 182 gives the stack trace that shows what caused the exception.
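The same commands can also be run without the GUI through cdb.exe, the console debugger that ships alongside WinDbg. The following is a minimal sketch under that assumption: the helper name and the dump path are hypothetical, and the code only assembles the command line rather than launching the debugger.

```python
def build_cdb_command(dump_path, cdb_exe="cdb.exe"):
    """Assemble a cdb command line that replays the steps described above."""
    commands = [
        "~#",               # show the thread that caused the exception
        ".loadby sos clr",  # load SOS from the directory of the loaded clr.dll
        "!eestack",         # stack trace for every thread in the process
        "q",                # quit once the output has been produced
    ]
    # -z opens a dump file; -c runs the given debugger commands at startup.
    return [cdb_exe, "-z", dump_path, "-c", "; ".join(commands)]


# Hypothetical dump location, used only to show the resulting arguments.
print(build_cdb_command(r"C:\CrashDumps\w3wp.dmp"))
```

Passing the returned list to subprocess.run with capture_output=True would make it possible to save the output and search it for the faulting thread.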



