DisC - Decompiler for TurboC


DisC is a decompiler for the TurboC compiler. It can interpret a DOS executable file generated by the TurboC compiler and give you a C-language program which functions similarly. Please note that this decompiler is specific to the TurboC compiler, since most of the logic it uses to interpret the machine code is tailored to the code TurboC generates. Trying it out on executables generated by other compilers will NOT give the expected results.

Features

Features not implemented (and which I wanted to implement...)

History

DisC started as a pastime project during my undergraduate studies, and it was one of the first projects I tried to code in C++. I was fairly experienced with TurboC by then, and since I also knew a good deal of assembly language programming, I figured out a way to decompile executables generated by TurboC and wrote this software in about a month.

The vision I had in the beginning was to write an intelligent program which would analyze the flow of control of the input program, thereby understanding how it works, and generate equivalent C code for this flow of control. I thought this would be a very good way to spend the summer holidays, and started on it. In the end I wrote software which is not as intelligent as I wanted it to be and, to tell the truth, does not actually analyze the flow of control, but it still _IS_ somewhat versatile in the kinds of situations it can handle gracefully. Of course I learnt a lot in this project, not to mention that I learnt C++!!

Under the Hood

So what is DisC doing? From a user's perspective, you tell the software which executable to decompile (only .EXE files which run on DOS can be given!) and it does the rest.

  1. First of all, DisC tries to figure out where the "main" function is located. If that can't be found automatically, you are on your own to find the entry point of the program. But don't worry, DisC almost always finds the location of "main".
  2. Then you are shown a menu (I forgot to tell you that it is entirely text-based... what else can you expect from a simple DOS program?) where you can choose to just look at the assembly code of the program, to decompile one particular function, or to do a fully automatic decompilation.
  3. If you choose "Full decompilation", there is nothing more for you to do. DisC tries on its own to understand the program and generates equivalent C code, which is saved in the output file "_DISC.C" for you to review.
  4. If you choose "Decompile", you are asked for the address of the function which you want to decompile. To know which function to decompile, you must first have had a look at the assembly listing of the program (which is also shown by DisC!). You can choose to go on recursively decompiling all functions called by this function, or to stop with this function. Once again, the output is saved in the file "_DISC.C".
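
The recursive option in step 4 amounts to a depth-first walk over the functions reachable from the chosen address. The following is only a minimal sketch of that idea, not DisC's actual code: the `Func` table, the addresses and the `decompile_recursive` helper are all made up for illustration, and the `done` flag is an assumption about how a function called from several places could be handled only once.

```c
#include <stddef.h>

#define MAX_CALLS 4

/* Hypothetical record for one function found in the executable. */
typedef struct {
    unsigned addr;              /* entry address of the function */
    unsigned calls[MAX_CALLS];  /* addresses it calls, 0-terminated */
    int done;                   /* set once the function has been handled */
} Func;

/* Look up a function by entry address; returns NULL if unknown. */
static Func *find_func(Func *table, size_t n, unsigned addr)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (table[i].addr == addr)
            return &table[i];
    return NULL;
}

/* Visit `addr` and, if `recurse` is nonzero, everything it calls.
   Returns the number of functions newly visited. */
static int decompile_recursive(Func *table, size_t n, unsigned addr, int recurse)
{
    Func *f = find_func(table, n, addr);
    int count, i;

    if (f == NULL || f->done)
        return 0;               /* unknown address or already handled */
    f->done = 1;                /* mark first so call cycles terminate */
    count = 1;                  /* a real tool would emit C code here */

    for (i = 0; recurse && i < MAX_CALLS && f->calls[i] != 0; i++)
        count += decompile_recursive(table, n, f->calls[i], recurse);
    return count;
}
```

Calling `decompile_recursive(table, n, addr, 0)` corresponds to stopping with the one function, and passing a nonzero last argument corresponds to following its calls.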

What DisC actually does is...

Structured code

When compiling, all high-level constructs like "if", "for", "while", "do...while" etc. are translated to branches before the compiler generates code. So when you decompile the executable, the raw output will not contain any loops or "if" constructs, only simple "goto"s. This is not what we want, is it? DisC has a built-in code reorganizer (I prefer to call it C-Beautifier!) which recognizes "if" blocks and "for", "while" and "do...while" loops and reorganizes them more pleasantly. The final output is very close to the original code.
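
To make the idea concrete, here is a small hand-written illustration (not actual DisC output, and the function names are invented). The first function is the kind of goto-only code that a direct translation of the branches would give, and the second is the loop a reorganizer would ideally rebuild from it. Both compute the same sum.

```c
/* Branch-level form: a conditional jump out of the loop and an
   unconditional jump back to the test. */
int sum_goto(int n)
{
    int i = 0, total = 0;
loop_test:
    if (!(i < n)) goto loop_end;   /* exit when the condition fails */
    total += i;
    i++;
    goto loop_test;                /* backward jump closes the loop */
loop_end:
    return total;
}

/* Structured form: the backward jump to a test is written as "while". */
int sum_while(int n)
{
    int i = 0, total = 0;
    while (i < n) {
        total += i;
        i++;
    }
    return total;
}
```

The pattern to spot is the backward jump to a label that guards a conditional exit; once that is found, the two gotos and both labels can be replaced by a single "while".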

Sample programs which were decompiled using DisC

Sample program 1: original C code, followed by the code output of DisC
#include <stdio.h>
main()
{
  int i,j=4;
  i=3;
  printf("%d",(i==j)?i:-2);
}
 
main()
{
  j = 0x0004;
  i = 0x0003;
  _printf(0x0194,(i != j) ? 0xFFFE : i);
}
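
Two details of this output are worth checking by hand. TurboC's "int" is 16 bits wide, so -2 shows up as its two's-complement bit pattern 0xFFFE, and the condition has come back inverted with the two arms of the "?:" swapped, which leaves the result unchanged. The sketch below verifies both using `<stdint.h>` (a modern header used here for illustration, not something TurboC provides); the function names are made up.

```c
#include <stdint.h>

/* The 16-bit pattern that -2 has in two's complement. */
uint16_t pattern_of_minus_two(void)
{
    return (uint16_t)(-2);   /* conversion to unsigned wraps modulo 65536 */
}

/* The expression as written in the original source. */
int original_expr(int i, int j)
{
    return (i == j) ? i : -2;
}

/* The expression as DisC printed it: inverted test, swapped arms.
   -2 is written directly instead of 0xFFFE because int is usually
   wider than 16 bits on current compilers. */
int decompiled_expr(int i, int j)
{
    return (i != j) ? -2 : i;
}
```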
     
Sample program 2: original C code, followed by the code output of DisC
#include <stdio.h>
main()
{
  int a,b,c;
  char d,e;
  int f;
  a=1; b=2; c=3;
  d=4; e=5; f=6;
}
 
   main()
   {
     l_int_1 = 0x0001;      
     l_int_2 = 0x0002;
     l_int_3 = 0x0003;
     l_char_1 = 0x0004;
     l_char_2 = 0x0005;
     l_int_4 = 0x0006;
   }
     
Sample program 3: original C code, followed by the code output of DisC
#include <stdio.h>
main()
{
  int i=4, j;
  printf("abc %p,%d",&i,(i=4,8));
  for(j=0; j<10; j++)
    i=j*i+i;
}
 
main()
{
  l_int_1 = 0x0004;
  l_int_1 = 0x0004;
  _printf(0x0194,&l_int_1,0x0008);
  j = 0x0000;
  while(j <  0x000A) 
  {
    l_int_1 = j * l_int_1 + l_int_1;
    j++;
  }
}
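
The output differs from the source in two visible ways, and both are harmless. The comma expression "(i=4,8)" has been split into its side effect (the second "l_int_1 = 0x0004") and its value (the argument 0x0008), and the "for" loop has come back as a "while" loop with the increment moved to the end of the body. The sketch below runs both shapes side by side. It uses invented function names, leaves out the printf call, and uses "long" so that the arithmetic does not overflow (on TurboC's 16-bit "int" it would).

```c
/* Shape of the original source: comma expression plus a for loop. */
long original_form(long *arg)
{
    long i = 4, j;
    *arg = (i = 4, 8);          /* assigns i, then yields 8 */
    for (j = 0; j < 10; j++)
        i = j * i + i;
    return i;
}

/* Shape of the DisC output: side effect hoisted, for turned into while. */
long decompiled_form(long *arg)
{
    long i, j;
    i = 4;
    i = 4;                      /* side effect of the comma expression */
    *arg = 8;                   /* value of the comma expression */
    j = 0;
    while (j < 10) {
        i = j * i + i;
        j++;
    }
    return i;
}
```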
     
Sample program 4: original C code, followed by the code output of DisC
#include <stdio.h>
#include <process.h>

main()
{
  int i=4,j,k;
  char c;

  exit(0);
  scanf("%d",&j);
  if (i<j)
  {
    printf("i<j");
    i=j;
    c=4;
  }
  else if (i>j)
  {
    printf("i>j");
    j=4;
    k=2;
    switch((i<=j)?j:k)
    {
      case 0 : if (i<j)
		{
		  printf("i<j");
		  i=j;
		}
		else if (i>j && c==1)
		{
		  printf("i>j");
		  j=4;
		}
		break;
      case 22 : printf("22");
                break;
      case 19 : printf("a");
      case -4 : i=1;
                 break;
      default : 
                for(i=0;i<10;i++)
                  i=i+i*i;
                break;
    }
  }
  else
  {
    printf("i==j");
    k=1;
  }
}
 
main()
{
  i = 0x0004;
  _exit(0x0000);
  _scanf(0x0194,&l_int_1);
  if(i <  l_int_1) 
  {
    _printf(0x0197);
    i = l_int_1;
    l_char_1 = 0x0004;
  }
  else
  {
    if(i <= l_int_1) 
    {
    }
    else
    {
      _printf(0x019B);
      l_int_1 = 0x0004;
      k = 0x0002;
      switch((i >  l_int_1) ? k : l_int_1)
      {
        case 0 :
          if(i <  l_int_1) 
          {
            _printf(0x019F);
            i = l_int_1;
          }
          else
          {
            if(i >  l_int_1) 
            {
              if(l_char_1 == 0x0001) 
              {
                _printf(0x01A3);
                l_int_1 = 0x0004;
              }
            }
          }
          break;
        case 22 :
          _printf(0x01A7);
          break;
        case 19 :
          _printf(0x01AA);
        case -4 :
          i = 0x0001;
          break;
        default :
          i = 0x0000;
          while(i <  0x000A) 
          {
            i = i * i + i;
            i++;
          }
      }
      goto Label17;
    }
    _printf(0x01AC);
    k = 0x0001;
  }
  Label17:
}
     

Things to know (if you are planning to try out DisC on your computer)

Downloads

DisC was originally written in Borland C++ 3.0 (running on DOS), but I no longer have that compiler. I also find that not many people are using it these days, so I have "ported" DisC to Win32! I have compiled and successfully run DisC using Microsoft Visual C++ 6.0, though there shouldn't be any problems with other Win32 compilers. There is one small quirk: you must have a front-end program called "PrepDisC" (it is included with the downloads listed below!), compiled as a DOS executable using a DOS C compiler like TurboC. DisC uses this program to convert the input program which you want to decompile into its own internal format, and then does the actual decompilation.

If you are interested in knowing more about DisC, you are welcome to download...

The TurboC v2.01 compiler from Borland (it is available for free now!). You need this to compile the sample programs and the front-end "PrepDisC". Download from http://edn.embarcadero.com/article/20841

The source code for DisC "ported" to Win32 (as a Microsoft Visual C++ 6.0 project)

The original source code for DisC compiled using Borland C++ on DOS (doesn't give very explanatory messages!).

Please note that this code was written when I was trying to learn C++, so it is not very well commented. But don't hesitate to download it and have a look - I have added quite a few comments at the most important places, and those should help you. If you already have a good knowledge of decompilation and 8086 assembly language, it will be a breeze for you!

Please take a look at the README.TXT files in the source distribution for instructions. Make sure you also download the TurboC compiler listed above, because you need it to test DisC - after all, this is a decompiler for TurboC only! Also make sure you have compiled the file "prepdisc.c" using TurboC on DOS and stored the resulting "prepdisc.exe" in the same directory as the DisC executable.



Copyright (c) Satish Kumar. S 2001-2003. Last Modified - 22 Oct 2001
Suggestions/Broken links/queries? Write to satish@debugmode.com