
IL DASM Tutorial

낫기법필 2010. 11. 16. 17:10



01. ===============================================

Ildasm.exe Tutorial

.NET Framework 1.1

This tutorial offers an introduction to the MSIL Disassembler (Ildasm.exe) that is included with the .NET Framework SDK. Ildasm.exe parses any .NET Framework .exe or .dll assembly and shows the information in human-readable format. It shows more than just the Microsoft intermediate language (MSIL) code: it also displays namespaces and types, including their interfaces. You can use Ildasm.exe to examine native .NET Framework assemblies, such as Mscorlib.dll, as well as .NET Framework assemblies provided by others or created by you. Most .NET Framework developers will find Ildasm.exe indispensable.

For this tutorial, use the Visual C# version of the WordCount sample that is included with the SDK. You can also use the Visual Basic version, but the MSIL generated will be different for the two languages and the screen images will also not be identical. WordCount is located in the <FrameworkSDK>\Samples\Applications\WordCount\ directory. To build and run the sample, follow the instructions outlined in the Readme.htm file. This tutorial uses Ildasm.exe to examine the WordCount.exe assembly.

To get started, build the WordCount sample, and load it into Ildasm.exe using the following command line:

ildasm WordCount.exe

This causes the Ildasm.exe window to appear, as shown in the following figure.

The tree in the Ildasm.exe window shows the assembly manifest information contained inside WordCount.exe and the four global class types: App, ArgParser, WordCountArgParser, and WordCounter.

By double-clicking any of the types in the tree, you can see more information about the type. In the following figure, the WordCounter class type has been expanded.

In the previous figure, you can see all the WordCounter members. The following table explains what each graphic symbol means.

Meaning (the symbol images from the original table are not reproduced here)
More info
Namespace
Class
Interface
Value Class
Enum
Method
Static method
Field
Static field
Event
Property
Manifest or a class info item

Double-clicking the .class public auto ansi beforefieldinit entry shows the following information:

In the previous figure, you can easily see that the WordCounter type is derived from the System.Object type.

The WordCounter type contains another type, called WordOccurrence. You can expand the WordOccurrence type to see its members, as shown in the following figure.

Looking at the tree, you can see that WordOccurrence implements the System.IComparable interface, specifically the CompareTo method. However, the rest of this discussion ignores the WordOccurrence type and concentrates on the WordCounter type instead.

You can see that the WordCounter type contains five private fields: totalBytes, totalChars, totalLines, totalWords, and wordCounter. The first four of these fields are instances of the int64 type, while the wordCounter field is a reference to a System.Collections.SortedList type.

Following the fields, you can see the methods. The first method, .ctor, is a constructor. This particular type has just one constructor, but other types can have several constructors — each with a different signature. The WordCounter constructor has a return type of void (as all constructors do) and accepts no parameters. If you double-click the constructor method, a new window appears that displays the MSIL code contained within the method, as shown in the following figure.

MSIL code is actually quite easy to read and understand. (For all the details, see the CIL Instruction Set Specification, which is located in the Partition III CIL.doc file in the <FrameworkSDK>\Tool Developers Guide\Docs\ folder.) Toward the top, you can see that this constructor requires 50 bytes of MSIL code. From this number, you really have no idea how much native code will be emitted by the JIT compiler — since the size depends on the host CPU and on the compiler being used to generate the code.

The common language runtime is stack based. So, in order to perform any operation, MSIL code first pushes the operands onto a virtual stack, and then executes the operator. The operator grabs the operands off the stack, performs the required operation, and places the result back on the stack. At any one time, this method has no more than eight operands pushed onto the virtual stack. You can identify this number by looking at the .maxstack attribute that appears just before the MSIL code.
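To make the stack model concrete, here is a minimal Python sketch of it. It is not MSIL and not part of the SDK: the opcode names `push`, `add`, and `mul` and the helper `run` are made up for illustration. It evaluates operands on a list used as a stack and records the deepest the stack ever gets, which is the quantity that .maxstack puts an upper bound on.

```python
def run(program):
    """Execute (opcode, operand) pairs on a simple operand stack.

    Returns the final stack and the maximum depth the stack reached.
    """
    stack = []
    max_depth = 0
    for opcode, operand in program:
        if opcode == "push":
            stack.append(operand)
        elif opcode in ("add", "mul"):
            # A binary operator pops two operands and pushes one result.
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if opcode == "add" else left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        max_depth = max(max_depth, len(stack))
    return stack, max_depth


# (2 + 3) * 4, written operands-first as a stack machine expects.
program = [
    ("push", 2),
    ("push", 3),
    ("add", None),
    ("push", 4),
    ("mul", None),
]
result, depth = run(program)
print(result, depth)
```

In this toy program the stack never holds more than two operands, so a bound of 2 would be enough; the constructor in the figure declares a bound of 8.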

Now examine the first few MSIL instructions, which are reproduced on the following four lines:

IL_0000: ldarg.0 ; Load the object's 'this' pointer on the stack
IL_0001: ldc.i4.0 ; Load the constant 4-byte value of 0 on the stack
IL_0002: conv.i8 ; Convert the 4-byte 0 to an 8-byte 0
IL_0003: stfld int64 WordCounter::totalLines

The instruction at IL_0000 loads the first parameter that was passed to the method onto the virtual stack. Every instance method is always passed the address of the object's memory. This argument is called Argument Zero and is never explicitly shown in the method's signature. So, even though the .ctor method looks like it receives zero arguments, it actually receives one argument. The instruction at IL_0000, then, loads the pointer to this object onto the virtual stack.

The instruction at IL_0001 loads a constant 4-byte value of zero onto the virtual stack.

The instruction at IL_0002 takes the value from the top of the stack (the 4-byte zero), and converts it to an 8-byte zero — thus placing the 8-byte zero on the top of the stack.

At this point, the stack contains two operands: the 8-byte zero and the pointer to this object. The instruction at IL_0003 uses both of these operands to store the value from the top of the stack (the 8-byte zero) into the totalLines field of the object identified on the stack.
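The four instructions can be traced step by step with a short Python sketch. This is only a model under stated assumptions: `WordCounterModel` is a hypothetical stand-in for the real object, and `ctypes` integer types are used purely to make the 4-byte and 8-byte widths visible.

```python
import ctypes


class WordCounterModel:
    """Hypothetical stand-in for a WordCounter instance; only one field is modeled."""

    def __init__(self):
        self.totalLines = None


def run_field_init(this):
    """Replay IL_0000 through IL_0003 on a Python list used as the virtual stack."""
    stack = []
    stack.append(this)                          # IL_0000: ldarg.0 pushes 'this'
    stack.append(ctypes.c_int32(0))             # IL_0001: ldc.i4.0 pushes a 4-byte 0
    narrow = stack.pop()                        # IL_0002: conv.i8 pops the 4-byte value
    stack.append(ctypes.c_int64(narrow.value))  #          and pushes it as an 8-byte value
    value = stack.pop()                         # IL_0003: stfld pops the value,
    target = stack.pop()                        #          then the object reference,
    target.totalLines = value                   #          and stores into the field
    return stack


obj = WordCounterModel()
leftover = run_field_init(obj)
print(obj.totalLines.value, ctypes.sizeof(obj.totalLines), leftover)
```

After the store, both operands have been consumed, so the stack is empty again and ready for the next field.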

The same MSIL instruction sequence is repeated for the totalChars, totalBytes, and totalWords fields.

Initialization of the wordCounter field begins with instruction IL_0020, as shown here:

IL_0020: ldarg.0
IL_0021: newobj instance void [mscorlib]System.Collections.SortedList::.ctor()
IL_0026: stfld class [mscorlib]System.Collections.SortedList WordCounter::wordCounter

The instruction at IL_0020 pushes the this pointer for the WordCounter onto the virtual stack. This operand is not used by the newobj instruction but will be used by the stfld instruction at IL_0026.

The instruction at IL_0021 tells the runtime to create a new System.Collections.SortedList object and to call its constructor with no arguments. When newobj returns, the address of the SortedList object is on the stack. At this point, the stfld instruction at IL_0026 stores the pointer to the SortedList object in the WordCounter object's wordCounter field.

After all the WordCounter object's fields have been initialized, the instruction at IL_002b pushes the this pointer onto the virtual stack, and IL_002c calls the constructor in the base type (System.Object).

Of course, the last instruction at IL_0031 is the return instruction that causes the WordCounter constructor to return to the code that created it. Constructors have to return void, so nothing is placed on the stack before the constructor returns.
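The offsets quoted above (IL_0020, IL_0026, IL_002b, IL_002c, IL_0031) and the 50-byte code size can be cross-checked by adding up instruction sizes. The Python sketch below assumes the usual CIL encoding, in which each of these opcodes is one byte and stfld, newobj, and call carry an additional 4-byte metadata token. The `SIZES` table and the `layout` helper are illustrative names, not part of any tool.

```python
# Encoded size in bytes: a one-byte opcode, plus a 4-byte metadata token
# for instructions that name a field or a method (assumed CIL encoding).
SIZES = {
    "ldarg.0": 1,
    "ldc.i4.0": 1,
    "conv.i8": 1,
    "stfld": 5,
    "newobj": 5,
    "call": 5,
    "ret": 1,
}


def layout(instructions):
    """Return [(offset, name), ...] and the total code size in bytes."""
    offset = 0
    placed = []
    for name in instructions:
        placed.append((offset, name))
        offset += SIZES[name]
    return placed, offset


field_init = ["ldarg.0", "ldc.i4.0", "conv.i8", "stfld"]
ctor = (
    field_init * 4                      # the four int64 fields
    + ["ldarg.0", "newobj", "stfld"]    # the wordCounter field
    + ["ldarg.0", "call"]               # the base-type constructor
    + ["ret"]
)

placed, size = layout(ctor)
for offset, name in placed:
    print(f"IL_{offset:04x}: {name}")
print("code size:", size)
```

Each int64 field initialization takes 8 bytes, so four of them end at offset 0x20, and the final ret at IL_0031 brings the total to 0x32, or 50 bytes.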

Now look at another example. Double-click the GetWordsByOccurrenceEnumerator method to see its MSIL code, which is shown in the following figure.

You see that the code for this method is 69 bytes in size and that the method requires four slots on the virtual stack. In addition, this method has three local variables: one is of the System.Collections.SortedList type and the other two are of the System.Collections.IDictionaryEnumerator type. Note that the variable names mentioned in the source code are not emitted to the MSIL code unless the assembly is compiled with the /debug option. If /debug is not used, the variable names V_0, V_1, and V_2 are used instead of sl, de, and CS$00000003$00000000 respectively.

When this method begins execution, the first thing it does is execute the newobj instruction, which creates a new System.Collections.SortedList and calls this object's default constructor. When newobj returns, the address of the created object is on the virtual stack. The stloc.0 instruction (at IL_0005) stores this value in local variable 0, which is sl (V_0 without /debug) and is of the System.Collections.SortedList type.

At instructions IL_0006 and IL_0007, the WordCounter object's this pointer (in Argument Zero passed to the method) is loaded onto the stack, and the GetWordsAlphabeticallyEnumerator method is called. When the call instruction returns, the address of the enumerator is on the stack. The stloc.1 instruction (at IL_000c) saves this address in local variable 1, which is de (V_1 without /debug) and is of the System.Collections.IDictionaryEnumerator type.

The br.s instruction at IL_000d causes an unconditional branch to the IL test condition of the while statement. This IL test condition begins at instruction IL_0032. At IL_0032, the address of de (or V_1) (the IDictionaryEnumerator) is pushed onto the stack and, at IL_0033, its MoveNext method is called. If MoveNext returns true, an entry exists to be enumerated, and the brtrue.s instruction jumps to the instruction at IL_000f.

At instructions IL_000f and IL_0010, the addresses of the objects in sl (or V_0) and de (or V_1) are pushed onto the stack. Then, the IDictionaryEnumerator object's get_Value property method is called to get the number of occurrences of the current entry. This number is a 32-bit value that is stored in a System.Int32. The code casts this Int32 object to an int value type. Casting a reference type to a value type requires the unbox instruction at IL_0016. When unbox returns, the address of the unboxed value is on the stack. The ldind.i4 instruction (at IL_001b) reads the 4-byte value located at the address currently on the stack and pushes that value onto the stack. In other words, the unboxed 4-byte integer is placed on the stack.
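The difference between unbox (which yields an address) and ldind.i4 (which reads the value at an address) can be sketched in Python with a byte array standing in for memory. Everything here is a simplification for illustration: `memory`, `box_int32`, `unbox`, and `ldind_i4` are made-up names, addresses are just offsets into the array, and a real boxed object also has an object header that this toy omits.

```python
import struct

memory = bytearray(16)  # toy "heap"; an address is an offset into this array


def box_int32(address, value):
    """Store a 32-bit integer at the address, standing in for a boxed Int32."""
    struct.pack_into("<i", memory, address, value)
    return address  # the "object reference"


def unbox(object_reference):
    """Like unbox: return the address of the value in the box, not the value."""
    return object_reference


def ldind_i4(address):
    """Like ldind.i4: read the 4-byte signed integer stored at the address."""
    return struct.unpack_from("<i", memory, address)[0]


stack = []
stack.append(box_int32(4, 7))        # get_Value left a reference to a boxed count of 7
stack.append(unbox(stack.pop()))     # IL_0016: reference becomes address of the value
stack.append(ldind_i4(stack.pop()))  # IL_001b: address becomes the 4-byte integer
print(stack)
```

Only after the second step does the stack hold the integer itself rather than a location where the integer lives.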

At instruction IL_001c, the value of de (or V_1), which is the address of the IDictionaryEnumerator, is pushed onto the stack, and its get_Key property method is called. When get_Key returns, the address of the System.Object is on the stack. The code knows that the dictionary contains strings, so the compiler casts this Object to a String using the castclass instruction at IL_0022.

The next few instructions (from IL_0027 through IL_002d) create a new WordOccurrence object, and pass the object's address to the Add method of the SortedList.

At instruction IL_0032, the test condition of the while statement is evaluated again. If MoveNext returns true, the loop executes another iteration. However, if MoveNext returns false, the execution falls through the loop and ends up at instruction IL_003a. The instructions from IL_003a through IL_0040 call the SortedList object's GetEnumerator method. The value returned is a System.Collections.IDictionaryEnumerator, which is left on the stack to become the return value of GetWordsByOccurrenceEnumerator.
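The branch structure described above (jump to the test first, then jump back into the body while the test succeeds) can be traced with a small Python sketch. It models only the control flow: the `run_loop` helper and the sample entries are hypothetical, and the loop body is reduced to a single trace line instead of real unbox, castclass, and Add calls.

```python
def run_loop(entries):
    """Trace the branches of the while loop for a list of (key, count) pairs."""
    trace = []
    position = -1        # the enumerator starts before the first entry
    pc = "IL_000d"
    while True:
        if pc == "IL_000d":
            # br.s: skip the body and go straight to the loop test.
            trace.append("br.s -> IL_0032")
            pc = "IL_0032"
        elif pc == "IL_0032":
            # MoveNext advances the enumerator and reports whether an entry exists.
            position += 1
            if position < len(entries):
                trace.append("brtrue.s -> IL_000f")
                pc = "IL_000f"
            else:
                trace.append("fall through -> IL_003a")
                pc = "IL_003a"
        elif pc == "IL_000f":
            # Loop body; when it finishes, execution runs on into the test.
            key, count = entries[position]
            trace.append(f"body: add {key!r} with count {count}")
            pc = "IL_0032"
        elif pc == "IL_003a":
            trace.append("return enumerator")
            return trace


trace = run_loop([("the", 3), ("cat", 1)])
for step in trace:
    print(step)
```

Because the test sits at the bottom and is reached first through br.s, an empty collection never enters the body, which is the behavior expected of a while statement.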

[Source] http://msdn.microsoft.com/en-us/library/aa309387(VS.71).aspx

01. End ============================================

02. ===============================================

Reference site for practical use
http://bravochoi.tistory.com/79?srchid=BR1http%3A%2F%2Fbravochoi.tistory.com%2F79

I built a very simple piece of source code
and loaded it into IL DASM.

using System;

namespace insideCSharp
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("OTZ");
        }
    }
}


A really simple piece of source code.

Loaded into IL DASM, it looks like this.

Double-clicking MANIFEST on the second line of that view
shows the metadata that describes the assembly.

// Metadata version: v2.0.50727
.assembly extern mscorlib
{
  .publickeytoken = (B7 7A 5C 56 19 34 E0 89 )                         // .z\V.4..
  .ver 2:0:0:0
}
.assembly insideCSharp
{
  .custom instance void [mscorlib]System.Reflection.AssemblyTitleAttribute::.ctor(string) = ( 01 00 0C 69 6E 73 69 64 65 43 53 68 61 72 70 00   // ...insideCSharp.
                                                                                              00 )
  .custom instance void [mscorlib]System.Reflection.AssemblyDescriptionAttribute::.ctor(string) = ( 01 00 00 00 00 )
  .custom instance void [mscorlib]System.Reflection.AssemblyConfigurationAttribute::.ctor(string) = ( 01 00 00 00 00 )
  .custom instance void [mscorlib]System.Reflection.AssemblyCompanyAttribute::.ctor(string) = ( 01 00 0C 42 6C 61 63 6B 45 64 69 74 69 6F 6E 00   // ...BlackEdition.
                                                                                                00 )
  .custom instance void [mscorlib]System.Reflection.AssemblyProductAttribute::.ctor(string) = ( 01 00 0C 69 6E 73 69 64 65 43 53 68 61 72 70 00   // ...insideCSharp.
                                                                                                00 )
  .custom instance void [mscorlib]System.Reflection.AssemblyCopyrightAttribute::.ctor(string) = ( 01 00 1E 43 6F 70 79 72 69 67 68 74 20 C2 A9 20   // ...Copyright ..
                                                                                                  42 6C 61 63 6B 45 64 69 74 69 6F 6E 20 32 30 30   // BlackEdition 200
                                                                                                  39 00 00 )                                        // 9..
  .custom instance void [mscorlib]System.Reflection.AssemblyTrademarkAttribute::.ctor(string) = ( 01 00 00 00 00 )
  .custom instance void [mscorlib]System.Runtime.InteropServices.ComVisibleAttribute::.ctor(bool) = ( 01 00 00 00 00 )
  .custom instance void [mscorlib]System.Runtime.InteropServices.GuidAttribute::.ctor(string) = ( 01 00 24 64 63 37 31 35 34 62 61 2D 30 30 65 34   // ..$dc7154ba-00e4
                                                                                                  2D 34 33 62 31 2D 61 30 30 61 2D 34 36 62 32 61   // -43b1-a00a-46b2a
                                                                                                  35 62 37 64 32 33 36 00 00 )                      // 5b7d236..
  .custom instance void [mscorlib]System.Reflection.AssemblyFileVersionAttribute::.ctor(string) = ( 01 00 07 31 2E 30 2E 30 2E 30 00 00 )             // ...1.0.0.0..

  // --- The following custom attribute is added automatically, do not uncomment -------
  //  .custom instance void [mscorlib]System.Diagnostics.DebuggableAttribute::.ctor(valuetype [mscorlib]System.Diagnostics.DebuggableAttribute/DebuggingModes) = ( 01 00 07 01 00 00 00 00 )

  .custom instance void [mscorlib]System.Runtime.CompilerServices.CompilationRelaxationsAttribute::.ctor(int32) = ( 01 00 08 00 00 00 00 00 )
  .custom instance void [mscorlib]System.Runtime.CompilerServices.RuntimeCompatibilityAttribute::.ctor() = ( 01 00 01 00 54 02 16 57 72 61 70 4E 6F 6E 45 78   // ....T..WrapNonEx
                                                                                                             63 65 70 74 69 6F 6E 54 68 72 6F 77 73 01 )       // ceptionThrows.
  .hash algorithm 0x00008004
  .ver 1:0:0:0
}
.module insideCSharp.exe
// MVID: {8A57E23A-CCB3-4592-9075-A79E5D701B7F}
.imagebase 0x00400000
.file alignment 0x00000200
.stackreserve 0x00100000
.subsystem 0x0003       // WINDOWS_CUI
.corflags 0x00000001    //  ILONLY
// Image base: 0x03F60000

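That is quite long.
When I first saw this (and still now, to be honest), my mind went blank.
Pasting in lines that stretch so far horizontally makes it even harder to tell what is what.

The first block is the .assembly record used to reference an external assembly.

One of the biggest advantages of assemblies is said to be that they can be versioned, although that has not really sunk in for me yet.
The insideCSharp assembly I compiled uses version 2.0.0.0 of the mscorlib assembly.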

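The second block contains information about this assembly itself.
According to the book, the version is 0.0.0.0 unless a version number is explicitly specified during the build process,
so why is mine 1.0.0.0?

The book is the 2003 edition, so this may have changed, or there may be something I am not aware of.
The block highlighted in yellow above lists several attributes.
I will come back to what those attributes mean once my study has progressed further.
(Incidentally, the book showed only the Debug attribute, while mine shows quite a lot of attributes.
 It is rather awkward when the output differs from the book.)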
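The byte strings after each .custom line are easier to read once decoded. The Python sketch below assumes the usual custom-attribute blob layout for a constructor that takes one string: a 01 00 prolog, a one-byte length, the UTF-8 text, and a two-byte count of named arguments. The helper name `decode_string_attribute` is made up for this note, and longer strings with multi-byte length prefixes are deliberately not handled.

```python
def decode_string_attribute(hex_bytes):
    """Decode a custom-attribute blob whose constructor takes a single string.

    Assumes the string is shorter than 128 bytes, so the length prefix fits
    in one byte (true for every string blob in the listing above).
    """
    blob = bytes.fromhex(hex_bytes)
    if blob[0:2] != b"\x01\x00":
        raise ValueError("missing 01 00 prolog")
    length = blob[2]
    if length >= 0x80:
        raise ValueError("multi-byte length prefixes are not handled in this sketch")
    text = blob[3:3 + length].decode("utf-8")
    named_args = int.from_bytes(blob[3 + length:3 + length + 2], "little")
    return text, named_args


# Bytes copied from the manifest listing above.
title = decode_string_attribute(
    "01 00 0C 69 6E 73 69 64 65 43 53 68 61 72 70 00 00")
company = decode_string_attribute(
    "01 00 0C 42 6C 61 63 6B 45 64 69 74 69 6F 6E 00 00")
copyright_text = decode_string_attribute(
    "01 00 1E 43 6F 70 79 72 69 67 68 74 20 C2 A9 20"
    "42 6C 61 63 6B 45 64 69 74 69 6F 6E 20 32 30 30"
    "39 00 00")
file_version = decode_string_attribute("01 00 07 31 2E 30 2E 30 2E 30 00 00")
description = decode_string_attribute("01 00 00 00 00")

print(title, company, copyright_text, file_version, description)
```

The C2 A9 pair in the copyright blob is the UTF-8 encoding of the © sign, which is why the ASCII comment column in the listing shows it as two dots.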

Next, let's look at the class constructor. (Double-click .ctor : void() on the sixth line to see it.)

.method public hidebysig specialname rtspecialname
        instance void  .ctor() cil managed
{
  // Code size       7 (0x7)
  .maxstack  8
  IL_0000:  ldarg.0
  IL_0001:  call       instance void [mscorlib]System.Object::.ctor()
  IL_0006:  ret
} // end of method Program::.ctor

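The first two lines simply say that the method is public and that it carries the specialname and rtspecialname attributes.
These attributes apparently indicate that this particular method has special meaning to the CLI (Common Language Infrastructure).
The cil managed attribute means that this is a managed method.

The method first uses the ldarg opcode to load an argument onto the stack.
The '0' after ldarg refers to the first argument.
The second opcode, call, is used to call the System.Object constructor.
The final ret opcode simply returns control to the calling method.
(The source code is easy, but this is hard.)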

Now let's see what the Main method looks like.

.method private hidebysig static void  Main(string[] args) cil managed
{
  .entrypoint
  // Code size       13 (0xd)
  .maxstack  8
  IL_0000:  nop
  IL_0001:  ldstr      "OTZ"
  IL_0006:  call       void [mscorlib]System.Console::WriteLine(string)
  IL_000b:  nop
  IL_000c:  ret
} // end of method Program::Main
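Having seen one listing already,
this one is a little easier to take in.

But once again it differs from the book!
What are the nop instructions at IL_0000 and IL_000b for?

The rest is similar to the listing above.
ldstr pushes the string OTZ onto the stack,
and call invokes System.Console::WriteLine.

Even though a string literal is passed, the value is actually an instance of the System.String class (it is a string, after all).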
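As a rough check on what nop contributes, here is a small Python sketch that interprets just the opcodes in the Main listing. It is a toy model, not how the runtime works: `run_main` is a made-up helper, and call is hard-wired to behave like WriteLine(string) by moving the top of the stack into an output list. As far as I know, the C# compiler typically emits nop in non-optimized (debug) builds so that a debugger has somewhere to place breakpoints, which would explain why the book's listing lacks them, but treat that as an assumption.

```python
def run_main(instructions):
    """Interpret a tiny subset of IL; return what was printed and the final stack."""
    stack = []
    output = []
    for opcode, operand in instructions:
        if opcode == "nop":
            continue                    # no effect on the stack or the output
        elif opcode == "ldstr":
            stack.append(operand)       # push a reference to the string
        elif opcode == "call":
            output.append(stack.pop())  # WriteLine(string) consumes its argument
        elif opcode == "ret":
            break
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return output, stack


main = [
    ("nop", None),
    ("ldstr", "OTZ"),
    ("call", "System.Console::WriteLine"),
    ("nop", None),
    ("ret", None),
]
without_nops = [ins for ins in main if ins[0] != "nop"]

print(run_main(main))
print(run_main(without_nops) == run_main(main))
```

With or without the two nop instructions, the model prints the same string and leaves the stack empty.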

02. End =============================================

