 #27397  by foosaa
 Fri Dec 11, 2015 10:30 am
Dear All,

I need some help developing the following. If you could offer guidance or pointers to where I can find more information, that would be great. I have been part of this great community for a long time. I have a requirement to develop a proof-of-concept (POC) program along the following lines to check feasibility.

The aim is to capture information from a defined application window. Although the current POC targets only Windows applications, the same may later be required on Linux and Mac (solutions like xev and F-Script exist on those platforms). Please do not bother looking into other operating systems for now.

On Windows, I need to capture information from thick clients and thin clients (browsers). If an application is configured to be hooked / monitored (I do not need to modify or otherwise control the monitored application), then I need to capture information from a set of defined controls (for example, a text box).

I believe I can do thick-client monitoring with an API hooking approach (using Microsoft Detours or a similar API hooking library) and log to a file when the window gets focus. I have started coding the necessary flow (though I have not progressed much): defining the hook target configuration, identifying window handles, etc. What really puzzles me is how to get the information from thin clients, i.e. browsers.
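To make the configuration part concrete, here is a minimal sketch of how I picture the hook target configuration, written in Python only for brevity. All names (`HookTarget`, `find_target`, `TARGETS`, the field names, the sample process names and control ID) are hypothetical and just illustrate the idea. It only decides whether a focused window is one of the configured targets; it does no hooking itself.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass(frozen=True)
class HookTarget:
    """One monitored application (hypothetical configuration shape)."""
    process_name: str          # executable name, e.g. "notepad.exe"
    title_pattern: str = "*"   # glob matched against the window title
    control_ids: tuple = ()    # IDs of the controls whose text is of interest


def find_target(targets, process_name: str, window_title: str) -> Optional[HookTarget]:
    """Return the first configured target matching the focused window, else None."""
    for target in targets:
        same_process = target.process_name.lower() == process_name.lower()
        # Compare case-insensitively by lowercasing both title and pattern.
        title_ok = fnmatchcase(window_title.lower(), target.title_pattern.lower())
        if same_process and title_ok:
            return target
    return None


# Example configuration (made-up values).
TARGETS = [
    HookTarget("notepad.exe", "* - Notepad", control_ids=(15,)),
    HookTarget("billing.exe"),
]
```

Any window whose process and title do not match an entry would simply be skipped, which keeps the rest of the flow limited to the configured applications.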

That's the reason for this request.

I need to capture HTML control content from multiple browsers (IE, Chrome, Firefox, Opera) seamlessly. Are there any libraries / approaches that could be used for that? I researched cross-browser testing tools like Selenium, but they require a script that specifies which website to visit and which page to load, whereas the user might be browsing anything at will. Even so, I have to configure the program to capture information from a web page only if its URL matches a defined list, and I might have to record the text from a defined control on the loaded page.
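The URL matching part is the easy bit. A minimal sketch of what I mean, assuming the defined list is a set of host and path glob patterns (the `ALLOWED` entries and the `should_capture` name are made up for illustration, and Python is used only for brevity):

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Hypothetical allow-list: (host glob, path glob) pairs.
ALLOWED = [
    ("intranet.example.com", "/orders/*"),
    ("*.example.org", "*"),
]


def should_capture(url: str, allowed=ALLOWED) -> bool:
    """Return True only if the URL's host and path match an allow-list entry."""
    parts = urlsplit(url)
    # Ignore anything that is not a normal web page (about:, file:, ftp:, ...).
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    host = parts.hostname.lower()   # hostnames are case-insensitive
    path = parts.path or "/"        # treat an empty path as the root
    return any(
        fnmatchcase(host, host_glob) and fnmatchcase(path, path_glob)
        for host_glob, path_glob in allowed
    )
```

With this list, `should_capture("https://intranet.example.com/orders/42")` is true while `should_capture("https://intranet.example.com/login")` is false. The hard part remains getting the URL and the control content out of each browser in the first place.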

From my (limited) research, there is no method of hooking the browser content to get the DOM for parsing and extracting the values. To an extent it is possible in IE with COM hooking, but a universal solution that works across different browsers such as Chrome, Firefox and IE is the problem.

I am exploring the accessibility interface, but it looks like it has issues across operating systems as well and is not fully supported by Chrome (the control IDs change from one browser version to the next). Someone suggested using AutoIT, but it does not offer all the control I need, and combining it with the other pieces of code is a problem again (i.e., I cannot develop all the required features using AutoIT alone).

I have thought about the following approaches (for which there are references, so they might be possible), but I feel they might be a long shot.
1. Sniffing the wire (again, encryption will cause issues, though I could use something like FiddlerCore) to get the data, then mapping it back to the target application using handle and open / connected port information.
2. Capturing keyboard input, mapping it back to the process ID / handle information, and recording the relevant data. Here again, I could skip any window that is not configured to be monitored and filter it out. Getting the respective controls might be possible using the window hooking approach.

Could anyone point me to some information or an approach for the above?
I am really sorry for creating such a long post and consuming your valuable time, but I trust the minds here more than anywhere else.
I thank each and every one of you who has read the post.
If any help could be provided, I would be really grateful.