A forum for reverse engineering, OS internals and malware analysis 

Forum for discussion about kernel-mode development.
 #10770  by Kiuhnm
 Tue Jan 03, 2012 6:19 pm
Hi,
I'm reading a book about rootkits and I came across the following lines of code:
Code:
InterlockedCompareExchange(&nCPUsLocked, nOtherCPUs, nOtherCPUs);
while (nCPUsLocked != nOtherCPUs)
{
  __asm
  {
    nop;
  }
  InterlockedCompareExchange(&nCPUsLocked, nOtherCPUs, nOtherCPUs);
}
I have a few questions:
1) Is that "nop" really necessary or even meaningful?
2) Shouldn't we read the old value returned by InterlockedCompareExchange instead?

Thank you.
 #10772  by Vrtule
 Tue Jan 03, 2012 6:38 pm
1) Is that "nop" really necessary or even meaningful?
About five years ago, I tried removing the NOP instruction from a similar while loop, and the compiler removed the whole loop (probably some kind of optimization). My while loop looked like this:
Code:
while (InterlockedCompareExchange(&nCPUsLocked, nOtherCPUs, nOtherCPUs) < nOtherCPUs)
  __asm nop;
 #10778  by Brock
 Wed Jan 04, 2012 6:00 am
Yes, some compiler optimizations look for NOPs and remove every instance (xchg eax, eax, etc.). Maybe the author thought that NOP would serialize his code or consume less CPU time, who knows. At least he/she used atomic operations :lol:
 #10783  by Vrtule
 Wed Jan 04, 2012 12:32 pm
I saw the code (posted in my previous reply in this topic) somewhere in Hoglund's book Rootkits: Subverting the Windows Kernel. I think that when you insert the NOP instruction explicitly into your code, the compiler will not remove it, and hence it should not remove the whole while loop either.
 #10785  by newgre
 Wed Jan 04, 2012 1:16 pm
Note that the out parameter of InterlockedCompareExchange needs to be volatile. The compiler won't optimize away accesses to volatile variables, so if used correctly, I don't think the compiler is even allowed to optimize away the loop.
 #10795  by Kiuhnm
 Wed Jan 04, 2012 5:19 pm
newgre wrote:Note that the out parameter of InterlockedCompareExchange needs to be volatile. The compiler won't optimize away accesses to volatile variables, so if used correctly, I don't think the compiler is even allowed to optimize away the loop.
There's something odd here. Why should the compiler remove a loop that could very well be an infinite loop?
Moreover, a compiler cannot assume that a function will always return the same value. Maybe the compiler had a bug.
I agree that if you write
Code:
while (nCPUsLocked < nOtherCPUs) __asm nop;
then nCPUsLocked should be volatile, but if you call InterlockedCompareExchange() and read its returned value, there should be no such need. But I could be wrong.
 #10799  by Vrtule
 Wed Jan 04, 2012 5:49 pm
When a variable is not volatile, the compiler might keep its value only in a processor register for a while. This means that the contents of the variable's memory location might not be updated as often as the source code dictates.

However, I think that x86 compilers usually treat every variable as volatile (the architecture does not have many general-purpose registers). The situation might be quite different in the case of RISC processors like MIPS.
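To make the register-caching point concrete, here is a minimal sketch in portable C11 rather than the MSVC/Windows code from the book. The function name spin_until_equal and the max_spins bound are hypothetical, added only so the example terminates on its own. The idea is that an atomic load (like a volatile read) has to be performed on every pass, so the compiler cannot hoist the read out of the loop and spin on a stale register copy.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper: spin until *counter equals target, giving up
 * after max_spins passes so the sketch cannot hang.
 * atomic_load() must re-read the variable on every pass; the compiler
 * is not allowed to cache the value in a register across iterations. */
static bool spin_until_equal(atomic_long *counter, long target,
                             unsigned long max_spins)
{
    for (unsigned long i = 0; i < max_spins; ++i) {
        if (atomic_load(counter) == target)
            return true;   /* another thread (or the caller) got us there */
    }
    return false;          /* bound reached without seeing the target */
}
```

With a plain, non-volatile long in place of atomic_long, an optimizer would be free to read the value once before the loop, which is the behaviour described above.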
 #10800  by EP_X0FF
 Wed Jan 04, 2012 5:58 pm
Am I the only one who finds this code strange for a reason other than the "nop" (which may be there to defeat some compiler's optimization, or as a cheap wait cycle, etc.)?

What is this?
Code:
while(nCPUsLocked != nOtherCPUs)
followed by an interlocked exchange, which is useless.

The author must have wanted this:
Code:
while (InterlockedCompareExchange(&nCPUsLocked,
                                  nOtherCPUs,
                                  nOtherCPUs) != nOtherCPUs) {
    __asm {
        nop;
    }
}
 #10802  by Kiuhnm
 Wed Jan 04, 2012 6:21 pm
Vrtule wrote:When a variable is not volatile, the compiler might keep its value only in a processor register for a while. This means that the contents of the variable's memory location might not be updated as often as the source code dictates.

However, I think that x86 compilers usually treat every variable as volatile (the architecture does not have many general-purpose registers). The situation might be quite different in the case of RISC processors like MIPS.
I forgot that InterlockedCompareExchange() is inlined by the compiler.
 #10824  by Kiuhnm
 Thu Jan 05, 2012 12:11 pm
EP_X0FF wrote:Am I the only one who finds this code strange for a reason other than the "nop" (which may be there to defeat some compiler's optimization, or as a cheap wait cycle, etc.)?

What is this?
Code:
while(nCPUsLocked != nOtherCPUs)
followed by an interlocked exchange, which is useless.

The author must have wanted this:
Code:
while (InterlockedCompareExchange(&nCPUsLocked,
                                  nOtherCPUs,
                                  nOtherCPUs) != nOtherCPUs) {
    __asm {
        nop;
    }
}
That's my second question ;)
I think the author uses that InterlockedXXX call as a memory barrier, and maybe his code works because the cache is updated, but I'm not sure. I will use the latter form in my code (it's even more compact).
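For reference, the compact form can be sketched in portable C11 as follows. This is an illustration under assumptions, not the book's code: compare_exchange_old is a hypothetical stand-in that mimics how InterlockedCompareExchange returns the previous value, and wait_for_cpus with its max_spins bound is likewise made up so the example terminates. Because the exchange value and the comparand are the same, the operation never changes the counter; it acts as an atomic read, and with C11's default sequentially consistent ordering it also acts as a barrier.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for InterlockedCompareExchange: stores
 * `exchange` into *dest only if *dest == comparand, and returns the
 * value *dest held before the call in either case. */
static long compare_exchange_old(atomic_long *dest, long exchange,
                                 long comparand)
{
    long expected = comparand;
    /* On failure, C11 writes the current value of *dest into expected;
     * on success, expected already equals the old value. */
    atomic_compare_exchange_strong(dest, &expected, exchange);
    return expected;
}

/* Compact wait loop: test the value returned by the atomic operation
 * instead of reading the variable separately. Bounded by max_spins
 * purely so this sketch cannot spin forever. */
static bool wait_for_cpus(atomic_long *locked, long others,
                          unsigned long max_spins)
{
    while (compare_exchange_old(locked, others, others) != others) {
        if (max_spins-- == 0)
            return false;  /* gave up before all CPUs checked in */
    }
    return true;
}
```

Since the loop condition consumes the return value of an atomic operation, the compiler cannot drop the loop, and no separate plain read of the counter is needed.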