 #8610  by lorddoskias
 Sat Sep 17, 2011 8:18 pm
I'm currently playing with inline patching because I want to understand how it actually works. What I'm trying to do is patch NtQuerySystemInformation so that a string is printed every time the function is called. Here is what I've got so far:
Code: Select all
NTSTATUS inlineHookInstall() {
	NTSTATUS status;
	UNICODE_STRING FuncName = {0};
	DWORD dwOldCR0 = 0;
	BYTE *funcPointer = NULL; 

	//get the address we want to start overwriting
	funcPointer = getRealFuncAddress((BYTE *)ZwQuerySystemInformation, KeServiceDescriptorTable.KiServiceTable);
		
	if(!funcPointer) {
		DbgPrint("Error getting the real address of NtQuerySystemInformation\n");
		return STATUS_UNSUCCESSFUL;
	}
		
	DbgPrint("Address of NTQUERYSYSINFO IS %p\n", funcPointer);
	
	dwOldCR0=__readcr0();
	__writecr0(dwOldCR0&~(1<<16));
	writeBytesToMem(funcPointer);
	__writecr0(dwOldCR0);

	return STATUS_SUCCESS;
}
And the writebytestomem:
Code: Select all
void writeBytesToMem(PVOID Addr) {
	BYTE jumpBuf[] = "\xE9\xDE\xAD\xBE\xEF";
	PVOID placeHold; 
	int bytesToOverwrite = 5; //we will try with a basic JMP ADDR instructions
	DWORD blockSize = 0;
	DWORD instSize;

	//count how many whole instructions (in bytes) we have to overwrite without trashing the system
	//uses the zombie disassembler engine
	while(blockSize < bytesToOverwrite) {
		GetInstLenght((PDWORD)((PBYTE)Addr + blockSize), &instSize);

		blockSize += instSize;
	}

	//allocate place for the counted blocksize
	placeHold = ExAllocatePoolWithTag(NonPagedPool, blockSize + bytesToOverwrite, 'Nik2');

	if(placeHold == NULL) {
		DbgPrint("Error allocating temporary space for placeholder\n");
		return;
	}

	//copy the original bytes
	RtlCopyMemory(placeHold, Addr, blockSize);

	//fill the original place with NOP, just in case
	RtlFillMemory(Addr, blockSize, 0x90);

	//now instead of 0XDEADBEEF we have the address of our routine.
	FixJMPAddress(jumpBuf, (BYTE *)Prolog_NtQuerySys);
	
	DbgPrint("Jump fixed address: %p\n", Prolog_NtQuerySys);
	
	//now we actually overwrite the memory
	RtlCopyMemory(Addr, jumpBuf, bytesToOverwrite);
	 
	DbgPrint("Writing successful\n");
}
And the address fixup function:
Code: Select all
void FixJMPAddress(BYTE *jump, BYTE *newRoutine) {

	DWORD address;
	DWORD *dwPtr;

	address = (DWORD)newRoutine;
	dwPtr = (DWORD *)&(jump[1]);
	*dwPtr = address;

}

So apparently the patching is successful in that I can see the NtQuerySystemInformation is changed:
Code: Select all
kd> u 0x82847416
nt!NtQuerySystemInformation:
82847416 e960da4793      jmp     15cc4e7b
8284741b 8b5508          mov     edx,dword ptr [ebp+8]
8284741e 83fa53          cmp     edx,53h
82847421 7f21            jg      nt!NtQuerySystemInformation+0x2e (82847444)
82847423 7440            je      nt!NtQuerySystemInformation+0x4f (82847465)
82847425 83fa08          cmp     edx,8
82847428 743b            je      nt!NtQuerySystemInformation+0x4f (82847465)
8284742a 83fa17          cmp     edx,17h
And here is what my debugging statements say:
Address of NTQUERYSYSINFO IS 82847416
Jump fixed address: 9347DA60

Clearly, because x86 is little-endian, the address is actually stored reversed. So what do I need to change in the FixJMPAddress function in order to get the proper jump address?
 #8612  by EP_X0FF
 Sat Sep 17, 2011 11:51 pm
ULONG JmpAddress = (ULONG)newRoutine - (ULONG)originalRoutine - 5;
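The formula above follows from how E9 is decoded: its 32-bit operand is a displacement that the CPU adds to the address of the instruction after the 5-byte jmp, not an absolute address. The sketch below is a user-mode illustration only, with addresses modelled as plain 32-bit values. The helper names jmp_rel32, jmp_target and encode_jmp_rel32 are made up for this example and are not part of the driver in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: displacement for a 5-byte E9 jmp located at `from`
 * that should land on `to`. Unsigned arithmetic wraps modulo 2^32, which
 * matches how a 32-bit CPU computes the target. */
static uint32_t jmp_rel32(uint32_t from, uint32_t to)
{
    return to - from - 5u;
}

/* Hypothetical helper: where an E9 jmp at `from` with operand `rel` lands. */
static uint32_t jmp_target(uint32_t from, uint32_t rel)
{
    return from + 5u + rel;
}

/* Hypothetical helper: writes E9 followed by rel32 into out[0..4].
 * The operand is stored byte by byte, least significant first, so the
 * result does not depend on the host's endianness. */
static void encode_jmp_rel32(uint8_t out[5], uint32_t from, uint32_t to)
{
    uint32_t rel = jmp_rel32(from, to);

    assert(out != NULL);
    out[0] = 0xE9;
    out[1] = (uint8_t)(rel & 0xFFu);
    out[2] = (uint8_t)((rel >> 8) & 0xFFu);
    out[3] = (uint8_t)((rel >> 16) & 0xFFu);
    out[4] = (uint8_t)((rel >> 24) & 0xFFu);
}
```

With the addresses from the first post, jmp_target(0x82847416, 0x9347DA60) gives 0x15CC4E7B, which is the bogus target the debugger displayed when the absolute address was used as the operand. encode_jmp_rel32 for the same pair produces E9 45 66 C3 10. So the bytes were not "reversed" by endianness; they were simply being interpreted as a relative displacement.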
 #8616  by lorddoskias
 Sun Sep 18, 2011 11:40 am
Thanks for the reply - it worked. Now I have another problem - after my detour is executed I get a BSOD: ATTEMPTED_EXECUTE_OF_NOEXECUTE_MEMORY

I'm guessing this is because my detour doesn't correctly return to the appropriate location in the original NtQuerySystemInformation.

This is my detour code, and it is pretty simple: I call a function which prints something, then execute the instructions which have been copied from the original function, and then the ret is supposed to return to just after the instruction that jumped to it. But I don't know how to push the correct return address. Is it as simple as push orig + 5?

Another question: after I have detoured a function, how can I access its parameters to do some sort of filtering etc.? Do I have to use ebp-4, -8, etc. or is there some other way to reference the parameters?
Code: Select all
void displayMsg() {
	DbgPrint("We have executed our detours.\n");
}

__declspec(naked) Prolog_NtQuerySys() {
	
	__asm {
		call displayMsg;
	}

	__asm {
               //relocated code
		mov edi, edi;
		push ebp;
		mov ebp, esp;
               //I have to first push the return address and then issue the ret
		ret;
	}
}
 #8617  by r2nwcnydc
 Sun Sep 18, 2011 1:00 pm
In most cases the Nt functions have 5 NOPs before the function. So you can overwrite the NOPs with your jmp instruction, then overwrite the mov edi, edi with a jmp -7 (EB F9), which jumps back to your jmp to your function. This way you don't have to manually set anything up for the function: the mov edi, edi is basically a two-byte NOP, so you are not overwriting anything important. Then you can use a regular function like the following:
Code: Select all
typedef NTSTATUS (NTAPI *_NtQuerySystemInformation)( SYSTEM_INFORMATION_CLASS SystemInformationClass, PVOID SystemInformation, ULONG SystemInformationLength, PULONG ReturnLength );

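// ORIGINAL_FUNCTION_ADDRESS is a placeholder; +2 skips the 2-byte short jmp written over mov edi, edi
_NtQuerySystemInformation origCall = (_NtQuerySystemInformation)((PUCHAR)ORIGINAL_FUNCTION_ADDRESS + 2);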

NTSTATUS NTAPI MyNtQuerySystemInformation( SYSTEM_INFORMATION_CLASS SystemInformationClass, PVOID SystemInformation, ULONG SystemInformationLength, PULONG ReturnLength )
{
    NTSTATUS rc;

    // do whatever you want

    // call the original function
    rc = origCall( SystemInformationClass, SystemInformation, SystemInformationLength, ReturnLength );

    // do whatever you want

    return rc;
}
Edit:
Otherwise for your code, you'd push the address of the original function + 5 before your ret. To access the variables, you could call a regular function:
Code: Select all
void displayMsg() {
   DbgPrint("We have executed our detours.\n");
}

NTSTATUS NTAPI MyNtQuerySystemInformation( SYSTEM_INFORMATION_CLASS SystemInformationClass, PVOID SystemInformation, ULONG SystemInformationLength, PULONG ReturnLength )
{
   displayMsg( );

   return STATUS_SUCCESS;
}

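__declspec(naked) void Prolog_NtQuerySys()
{
   __asm {
      // relocated code (the bytes overwritten by the 5-byte jmp)
      mov edi, edi
      push ebp
      mov ebp, esp

      // re-push the four arguments, last one first (offsets are hexadecimal)
      push dword ptr [ebp+14h]
      push dword ptr [ebp+10h]
      push dword ptr [ebp+0Ch]
      push dword ptr [ebp+8]
      call MyNtQuerySystemInformation

      // push the address to resume at, then ret to "jump" there
      push ORIGINAL_FUNCTION_ADDRESS + 5
      ret
   }
}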
 #8618  by lorddoskias
 Sun Sep 18, 2011 2:22 pm
Thanks for the reply, it actually answered a question from a previous thread I had opened. But I think the F9 should actually be F8 because we want to jump 7 bytes backwards and not 6, because mov edi, edi is 2 bytes and jmp offsets are counted from the next instruction, so it is something like:
Code: Select all
nop           - 1 byte
nop           - 1 byte
nop           - 1 byte
nop           - 1 byte
nop           - 1 byte
mov edi, edi  - 2 bytes
next-instruction            <--- EIP points here while the previous instruction is executing

Furthermore, you push the arguments starting from the highest address, which holds the last argument, and work your way down to the lowest addresses, which hold arguments 3, 2, 1. That is because stdcall pushes arguments from right to left, no? I'm just checking my knowledge.
 #8620  by r2nwcnydc
 Sun Sep 18, 2011 3:16 pm
lorddoskias wrote: But I think the F9 should actually be F8 because we want to jump 7 bytes backwards and not 6, because mov edi, edi is 2 bytes
Yes, it is 7 bytes backwards. But F9 is -7: FF(-1), FE(-2), FD(-3), FC(-4), FB(-5), FA(-6), F9(-7).
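The same arithmetic can be written down as a tiny helper. This is only a sketch with a made-up name (jmp_rel8) and addresses modelled as 32-bit values: the operand of a 2-byte EB jmp is the distance from the end of that jmp to the target, stored as a signed byte.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: operand byte for a 2-byte EB jmp located at `from`
 * that should land on `to`. The displacement is counted from the byte
 * after the jmp (from + 2) and must fit in a signed 8-bit value.
 * The uint32_t -> int32_t conversion assumes two's complement, which
 * holds on every mainstream compiler. */
static uint8_t jmp_rel8(uint32_t from, uint32_t to)
{
    int32_t rel = (int32_t)(to - from - 2u);

    assert(rel >= -128 && rel <= 127);
    return (uint8_t)rel;   /* -7 becomes 0xF9 */
}
```

For a short jmp placed over mov edi, edi at func and aimed at func - 5, jmp_rel8(func, func - 5) yields 0xF9. An operand of 0xF8 would be -8 and land at func - 6, one byte before the long jmp.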
lorddoskias wrote: Furthermore, you push the arguments starting from the highest address, which holds the last argument, and work your way down to the lowest addresses, which hold arguments 3, 2, 1. That is because stdcall pushes arguments from right to left, no? I'm just checking my knowledge.
Yes, that is correct.
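To make the layout concrete, here is a toy user-mode model of a 32-bit stack. It is a sketch, not kernel code, and every name in it (ToyStack, toy_push, toy_enter, toy_read) is invented for the illustration. It pushes four arguments right to left, then a return address, then the saved ebp, and shows that the arguments end up at [ebp+8], [ebp+0Ch], [ebp+10h] and [ebp+14h].

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Toy model of a 32-bit stack: `sp` indexes 4-byte slots and grows down. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    unsigned sp;               /* index of the current top-of-stack slot */
} ToyStack;

static void toy_init(ToyStack *s)
{
    s->sp = STACK_WORDS;       /* empty stack: sp is one past the last slot */
}

static void toy_push(ToyStack *s, uint32_t value)
{
    assert(s->sp > 0);
    s->mem[--s->sp] = value;
}

/* Simulates a stdcall call with four arguments: the caller pushes the
 * arguments right to left, `call` pushes the return address, and the
 * callee runs `push ebp; mov ebp, esp`. Returns ebp as a slot index. */
static unsigned toy_enter(ToyStack *s, const uint32_t args[4],
                          uint32_t ret_addr, uint32_t saved_ebp)
{
    int i;

    for (i = 3; i >= 0; --i)
        toy_push(s, args[i]);
    toy_push(s, ret_addr);
    toy_push(s, saved_ebp);
    return s->sp;
}

/* Reads the dword at [ebp + byte_off]; each slot is 4 bytes wide. */
static uint32_t toy_read(const ToyStack *s, unsigned ebp, unsigned byte_off)
{
    assert(byte_off % 4 == 0);
    assert(ebp + byte_off / 4 < STACK_WORDS);
    return s->mem[ebp + byte_off / 4];
}
```

After toy_enter, toy_read(&s, ebp, 0x08) returns the first argument and toy_read(&s, ebp, 0x14) the fourth, while offsets 0 and 4 hold the saved ebp and the return address. Note that these offsets are positive and only valid once the `push ebp; mov ebp, esp` pair has run.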
 #8730  by lorddoskias
 Sat Sep 24, 2011 7:19 pm
Now I've had a chance to play with this but I can't get it to work, here is what I have:

This function works and I jump to my prolog
Code: Select all
void writeBytesToMemSafe(PVOID Addr) {
	BYTE shortJMP[] = "\xEB\xF9";
	BYTE longJMP[] = "\xE9\xDE\xAD\xBE\xEF";
	
	//copy the short jump instead of mov edi, edi
	RtlCopyMemory((PBYTE) Addr, shortJMP, 2); 

	//now instead of 0XDEADBEEF we have the address of our routine.
	FixJMPAddress(longJMP, (BYTE *)Prolog_NtQuerySys, (BYTE *) Addr);

	DbgPrint("Jump fixed address: %p\n", Prolog_NtQuerySys);

	//now we actually overwrite the memory
	RtlCopyMemory((PBYTE)Addr - 5 , longJMP, 5);

	DbgPrint("Writing successful\n");
}
Here is the prolog:
Code: Select all
__declspec(naked) Prolog_NtQuerySys() {
	
	__asm {
		 push [ebp+0x14]
		 push [ebp+0x10]
		 push [ebp+0xC]
		 push [ebp+0x8]
		 call myZwQuerySystemInformation;

		 push myNtQuerySystemInformation;
         ret;
	}
}
myZwQuerySysInformation:
Code: Select all
NTSTATUS myZwQuerySystemInformation(SYSTEM_INFORMATION_CLASS SystemInformationClass, PVOID SystemInformation, ULONG SystemInformationLength, PULONG ReturnLength) 
{
	NTSTATUS ntStatus;
	PSYSTEM_PROCESS_INFORMATION currentProcInfo;
	PSYSTEM_PROCESS_INFORMATION previousProcInfo;
	BYTE *tempMath;


	ntStatus = myNtQuerySystemInformation(SystemInformationClass, SystemInformation, SystemInformationLength, ReturnLength);
	if(!NT_SUCCESS(ntStatus))
		return ntStatus;
//filter process list based on process name prefix
And myNtQuerySystemInformation:
Code: Select all
myNtQuerySystemInformation = (QUERY_SYS_INFO)getRealFuncAddress((BYTE *)ZwQuerySystemInformation, KeServiceDescriptorTable.KiServiceTable);
	myNtQuerySystemInformation = (QUERY_SYS_INFO) ((PBYTE)myNtQuerySystemInformation + 2);
But unfortunately, after the prolog calls myZwQuerySystemInformation it goes into some sort of loop, and the code always ends up at a place in memory where there is a slide of INT3 instructions. Any help will be much appreciated.

EDIT:

Also, in your previous post you say that I can use ntOrigAddress = origAddress+2 in my own function and then I won't have to take care of the stack state manually. In that case, what do I have to change in NtQuerySystemInformation? Do I patch it so that it jumps to my wrapper function?
 #8733  by r2nwcnydc
 Sun Sep 25, 2011 10:35 am
First, you should change the writeBytesToMemSafe function to overwrite the NOP slide before you write the jmp -7. This is needed because a write of 5 bytes is not atomic, meaning another thread could be scheduled between when the first 4 bytes are written and when the 5th byte is written.
Code: Select all
void writeBytesToMemSafe(PVOID Addr) {
   BYTE shortJMP[] = "\xEB\xF9";
   BYTE longJMP[] = "\xE9\xDE\xAD\xBE\xEF";

   //now instead of 0XDEADBEEF we have the address of our routine.
   FixJMPAddress(longJMP, (BYTE *)Prolog_NtQuerySys, (BYTE *) Addr);

   DbgPrint("Jump fixed address: %p\n", Prolog_NtQuerySys);

   //now we actually overwrite the memory
   RtlCopyMemory((PBYTE)Addr - 5 , longJMP, 5);

   //copy the short jump instead of mov edi, edi
   RtlCopyMemory((PBYTE) Addr, shortJMP, 2);

   DbgPrint("Writing successful\n");
}
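The ordering can be tried out safely in user mode on a copy of the bytes. The sketch below is an illustration under stated assumptions, not the driver's code: install_hotpatch is a made-up helper, addresses are modelled as 32-bit values, the padding is assumed to be five 0x90 bytes, and the function is assumed to start with mov edi, edi (8B FF). It models only the order of the writes and makes no attempt at atomicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper. `image` is a writable copy of the code, `func_off`
 * the offset of the function's first byte inside it, `func_va` the address
 * that byte has at run time and `hook_va` the address of the hook.
 * Returns 0 on success, -1 if the expected bytes are not present. */
static int install_hotpatch(uint8_t *image, size_t func_off,
                            uint32_t func_va, uint32_t hook_va)
{
    uint8_t *func;
    uint8_t *pad;
    uint32_t rel;
    size_t i;

    assert(image != NULL);
    if (func_off < 5)
        return -1;
    func = image + func_off;
    pad = func - 5;

    /* Refuse to patch unless we see 5 NOPs followed by mov edi, edi. */
    for (i = 0; i < 5; ++i)
        if (pad[i] != 0x90)
            return -1;
    if (func[0] != 0x8B || func[1] != 0xFF)
        return -1;

    /* The E9 sits at func_va - 5, so the instruction after it is func_va:
     * rel32 = hook_va - (func_va - 5) - 5 = hook_va - func_va. */
    rel = hook_va - func_va;

    /* Step 1: long jmp into the padding. Nothing executes these bytes yet. */
    pad[0] = 0xE9;
    pad[1] = (uint8_t)(rel & 0xFFu);
    pad[2] = (uint8_t)((rel >> 8) & 0xFFu);
    pad[3] = (uint8_t)((rel >> 16) & 0xFFu);
    pad[4] = (uint8_t)((rel >> 24) & 0xFFu);

    /* Step 2: only now redirect the entry point with the short jmp -7. */
    func[0] = 0xEB;
    func[1] = 0xF9;
    return 0;
}
```

One detail worth checking against your own FixJMPAddress: because the long jmp lives at Addr - 5 rather than at Addr, its displacement has to be computed from Addr - 5. If the three-argument version (not shown in this thread) computes it from Addr, the jump would land 5 bytes before the intended routine.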
Second, what are you doing with the prolog? Just use something like this, rather than having two functions:
Code: Select all
typedef NTSTATUS (NTAPI *LPFN_ZwQuerySystemInformation)( SYSTEM_INFORMATION_CLASS SystemInformationClass, PVOID SystemInformation, ULONG SystemInformationLength, PULONG ReturnLength );

// if this is not an exported function, then you should set this in your hooking code
LPFN_ZwQuerySystemInformation g_OrigAddress = (LPFN_ZwQuerySystemInformation)((PCHAR)ZwQuerySystemInformation + 2);

NTSTATUS NTAPI myZwQuerySystemInformation( SYSTEM_INFORMATION_CLASS SystemInformationClass, PVOID SystemInformation, ULONG SystemInformationLength, PULONG ReturnLength )
{
   NTSTATUS rc;

   // do whatever you want here

   // call the original function, skipping the 2-byte short jmp at its start
   rc = g_OrigAddress( SystemInformationClass, SystemInformation, SystemInformationLength, ReturnLength );

   // do whatever you want here

   return rc;
}
Also, why are you hooking the Zw version of this function?
 #8734  by lorddoskias
 Sun Sep 25, 2011 10:54 am
I guess my naming isn't very clear. From the Zw version I get the address of the real function (the Nt version) and patch it. The function which is going to be executed I call myZwQuerySystemInformation, but in it I call the (patched) Nt version, which performs the actual work.

EDIT:

But if I use just one function, won't I find the stack in a trashed state unless I use __declspec(naked)?