The lynx project aims to develop techniques and tools to automate the analysis and simplification of malware code.
Malware code is usually heavily obfuscated using a variety of techniques that make it difficult to figure out the internal logic of the code. For example, malware programs are very often self-modifying: the code is initially stored in a compressed or encrypted form and is "unpacked" into its original executable form at runtime; in some cases, a program may go through dozens or hundreds of layers of such runtime unpacking. In other cases, the malware logic may be embedded in the byte-code program of a custom-generated interpreter, so examining the program code reveals only the structure of the interpreter, not that of the actual malware. The code may also be strewn with useless instructions that make it difficult to understand what the program is doing.
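To make the "layers of runtime unpacking" idea concrete, here is a harmless toy sketch of my own; it is not taken from the lynx source, which I haven't read. It wraps a byte string in several zlib + base64 layers and then peels them off one at a time. The `PACKED:` marker and the `pack` / `unpack_all` names are made up for illustration. Real packers typically decrypt into memory and jump to the result, so nothing this tidy would appear in practice.

```python
import base64
import zlib
from typing import Tuple

# Hypothetical marker identifying a packed layer (an assumption for this toy).
MAGIC = b"PACKED:"


def pack(payload: bytes, layers: int) -> bytes:
    """Wrap payload in `layers` nested compress-and-encode layers."""
    for _ in range(layers):
        payload = MAGIC + base64.b64encode(zlib.compress(payload))
    return payload


def unpack_all(blob: bytes) -> Tuple[bytes, int]:
    """Peel layers until the marker is gone; return (payload, layer count)."""
    count = 0
    while blob.startswith(MAGIC):
        blob = zlib.decompress(base64.b64decode(blob[len(MAGIC):]))
        count += 1
    return blob, count


if __name__ == "__main__":
    original = b"print('hello')"
    packed = pack(original, 5)
    recovered, n = unpack_all(packed)
    print(n, recovered == original)
```

The point is only that looking at `packed` tells you nothing about the inner program until every layer has been undone, which is the step the project wants to automate. The sketch assumes the innermost payload does not itself start with the marker.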
Existing malware analysis tools provide little support for automatically removing such obfuscations, so deobfuscation requires a great deal of time-consuming manual intervention. The goal of this project is to develop tools and techniques to automatically analyze malware code and make such code easier to understand.
I've had some correspondence with Saumya and Babak, and I was happy to receive a link to some source code. I haven't had a chance to check out the code yet, but judging by the papers they've published, it's a pretty interesting project.
https://www2.cs.arizona.edu/projects/ly ... ct/Source/