How to identify Malware Behavior


First, many times we may be looking for malware for different reasons. One reason would be that we suspect that a system may be infected with malware, while another may be that by looking for malware, we’re attempting to nail down an intrusion or compromise (like following the trail of artifacts back to the original compromise vector), with the malware being a byproduct of the intrusion.

Another thing to keep in mind is that malware, in general, has four main characteristics:


1. An initial infection vector – how it got on the system in the first place; this can be through browser download (even on a secondary or tertiary level), email attachment, etc. Conficker, for example, can infect a system when the user opens Explorer on drive (USB thumb drive, mapped share, etc.) that has been infected. In some SQL injection compromises, I’ve seen malware placed on a system by the intruder sending tftp commands or creating and launching an FTP script, all via SQL injection. I’ve also seen the bad guy load the malware into a database table in 512 byte chunks, and then have the database reassemble the file in the file system so they could launch it.

2. Artifacts – what actions does the malware take upon infection and what footprints does it leave? Many time, we can determine these ourselves through dynamic malware analysis, but often its sufficient (and quicker) to use what’s available from AV sites. Sometimes these “footprints” can be unique to a malware family (Conficker, for example). Also, these artifacts do not have to be restricted to a host; are there any network-based artifacts that you can use when analyzing logs?

3. Propogation Mechanism – How does the malware get about? Is it a worm that exploits a known (or unknown) vulnerability? Or is it like Conficker, infecting files at the root of drives and adding autorun.inf files? Understanding the propogation mechanism can help you fight the tide, as it were, or develop a mechanism to block or detect further infections.

4. Persistence Mechanism – As Jesse Kornblum points out in his “Rootkit Paradox” paper, malware likes to remain persistent, and the simple fact is that there are a finite number of ways to do that on a Windows system. The persistence mechanism can relate back to Artifacts; however, this would be an artifact specifically intended to allow the malware to survive reboots.

These characteristics act as a framework to help us visualize, understand, and categorize malware. Over the years, I have used these four characteristics to track down malware and help others do the same. In one instance in particular, after a customer had battled with a persistent (albeit fairly harmless) worm for over a month, I was told that they would delete certain files, reboot the system, and the files would be back. It occurred to me that they hadn’t adequately tracked down the persistence mechanism, and once we found it, they were able to clean their systems!

Okay, so how can we go about tracking down malware, detecting its presence? I’m going to start with the idea that we have an acquired image, and we need to determine if there’s malware on the system. I’m going to list several mechanisms for doing so, and these are not listed in order of priority. It will be incumbent upon you, the reader, to determine which steps work best for you, and in which order…that said, away we go!

Targeted Artifact Analysis


A lot of times, we may not know exactly what we’re looking for, but if we know the persistence mechanism or other artifacts of malware, we can do a quick, surgical scan that malware. Tools such as RegRipper can make this a fast and extremely easy process (remember, for live systems, you can use RegRipper in combination with F-Response!). Take Conficker…while there are changes in artifacts based on the variant, the set of unique artifacts is pretty limited. As the variants have changed so as to obviate both AV scans and hash comparisons (at this point, everyone should be aware that hash comparisons for malware are marginally less effective than AV scanning with a single engine), artifacts have remained fairly static (Registry modifications) with some new ones (Scheduled Task) being added. The addition of unique artifacts helps narrow down the false positives.

Log Analysis


There are a number of logs on Windows systems that may provide some insight into malware detection. For example, maybe the installed AV product detected and quaratined a tertiary download…depending on the product, this may appear in the AV product logs as well as the Event Log. Or perhaps the AV scanner’s real-time protection mechanism was disabled and the user ran a scan at a later time that detected the malware. Either way, check for an installed AV or anti-spyware product, and check the logs. Also, examine the Event Logs. And don’t forget mrt.log!

Scans


Another way to go about detecting the presence of malware on systems is to scan for it using AV products. Yes, there are commercial AV products available, but as many have seen over the past couple of months, particularly with Conficker and Virut, sometimes using just one commercial AV product isn’t enough. The key to running scans is to know what the scan is looking for so that you can better interpret the results.

For example, look at tools such as sigcheck and missidentify; both are extremely useful, but each tool looks for certain things. Another scanning tool that can be extremely useful is Yara, and anyone looking at using Yara should consider using the Yara-Scout Sniper release from the illustrious Don Weber! Yara can use packer rules (from the public PeID signatures) to detect packed files, and Don has added fuzzy hashing to Scout Sniper.

As a side note, while fuzzy hashing is obviously predicated on having a sample of the malware to hash, it is still a much preferable technique over “normal” hashing using MD5 or SHA-1 hashes. In one instance, I had two examinations about 8 months apart where I found files of the same name on both. Traditional (MD5) hashes didn’t match, but using ssdeep, I was able to determine that the files were 99% similar.

So, other than scanning for not-normal files (with “normal” being somewhat amorphous), there are other ways to scan for possible malware infections. With the amount of malware that subverts Windows File Protection (WFP) in some manner, tools like wfpcheck can be used to determine if something on the system modified any of the “protected” files.

But again, keep in mind that scanning in general is a broad-brush approach and scans don’t find everything. The idea is to have some idea of what you’re looking for, and then selecting the proper tool (or tools) to build a comprehensive process. As part of that process, you’ll need to document what you did, what you looked for, and what tools you used…because without that documentation, how to describe what you did in a repeatable manner, and how do you go about improving your process in the future?

Advertisements

Rebuilding Executable Images From Memory


One of the questions I initially had when I started analyzing memory images with Volatility was how do I extract the malicious executable from memory that the RE guys will be able to analyze. I kept running into the problem of not getting all the API functions with the executable and the RE guys kept telling me they could not do a complete analysis since they did not have all the functions.

So I picked up the book “Malware Analyst’s Cookbook and DVD” because there are 4 chapters dedicated to Volatility and one of those chapters (Chapter 16, Recipe 16-7) specifically deals with extracting the executable along with the API functions. So the following is a summarized excerpt of that recipe for those of you who do not have the above referenced book. This discussion will assume the reader has installed the latest version of Volatility from http://volatility.googlecode.com and installed all the available plugins and dependencies required. There are excellent walk-thru’s on the volatility website for that.

In general, to rebuild an executable from memory, you need to parse the PE section headers to learn the addresses and sizes of the PE sections. Then, you can carve out the appropriate amount of data from memory and re-combine the sections into a file on disk according to their original positions. There are references that provide a much deeper understanding of this process and are listed as follows:

http://computer.forensikblog.de/en/2006/04/reconstructing_a_binary.html (Andreas Schuster)

http://windowsir.blogspot.com/2006/07/automatic-binary-reassembly-from-ram.html (Harlan Carvey)

http://jessekornblum.com/presentation/dodcc07.html (Jesse Kornblum)

The methods described in the above publications rely on information in the PE header and do not attempt to reconstruct the Import Address Table (IAT). Malware samples that erase the entire PE header, relocate the IAT, or that use run-time dynamic linking cause significant problems. You still will be able to dump the binary using the base address and size information from the PE header (if it exists) or the base address and size information from the VAD; however, you will not be able to tell which API functions the malware calls. The following excerpt will explain methods to work around these anti-analysis techniques based on scanning the process address space for API calls, without relying on data in the IAT.

The first step is to use pslist or psscan to generate list of processes. Once you know the PID or _EPROCESS offset for the process that you want to dump, then you can pass it to procexedump or simply the leave off the -p parameter to dump all the processes. The syntax would be as follows:

python volatility.py pslist -f laqma.vmem (whatever the name of your memory image is) -p PID number

I would suggest at this point you download your favorite PE viewer and if you do not have one then I would suggest downloading CFF Explorer. Once you have dumped your process out then open it up using CFF Explorer and see if the IAT contains the right information. If it is a legitimate file it probably will but if not it will probably be missing the imported function names. At this point, you cold load the dumped file in IDA Pro and try your best to determine its capabilities without IAT information. Or you could scan the file with multiple anti-virus engines to see if they detect anything in the unpacked process image. However, what you typically want to do is perform a more thorough reverse-engineering tasks, which requires information about the imported functions.

Scanning For Imported Functions with IMPSCAN

The reason you should be concerned with an incomplete IAT is that it will hinder your ability to perform a thorough code analysis. If you try to examine the instructions in the dumped file using IDA Pro, then you will see placeholders instead of API calls. Since IDA does not have access to the entire process’s memory, it cannot determine what APIs exist at those addresses in order to label them. The impscan plug-in for Volatility aims to solve the problem of incomplete import tables. It is very unlikely that the dumped program will match the original or even execute on another machine. That is fine because all you really need to complete a thorough analysis of the malware’s capabilities is to be able to see which API functions it is calling in the disassembly. Therefore, impscan does not attempt to produce a patched version of the dumped file as Import REConstructor does for live systems. Instead, it simply provides labels that you can import into IDA Pro. Below is the syntax for impscan to scan a process for imported functions:

python volatility.py impscan -p 920 -f laqma.vmem –dump-dir=outdir

impscan works by determining the base address and size of all DLLs in a process. Using pefile, it parses the Export Address Table (EAT) of the DLLs to determine the offsets and names of exported functions. Then, using pydasm, it scans the process executable (or any memory range in the process address space as specified with the -a and -s flags) looking for call or jmp instructions. If the destination of one of the call or jmp instructions leads to an API, them impscan records the address of the instruction and the corresponding API function name. Impscan produces MakeName statements, which you can transfer into IDA Pro. These statements contain the missing information that IDA needs to link the placeholders presented earlier with the name of the API function stored at that address.

To apply the labels, click File → IDC Command, paste in the MakeName statements, and click OK. Once you have clicked OK, you will immediately see changes applied throughout the program. You can get even more information out of IDA by choosing to re-analyze the program. Now that IDA can tell which API functions the program is calling, IDA can label arguments accordingly. To do this, Click Options → General → Analysis → Reanalyze Program. Once this is complete then you can save your work through IDA Pro and then give that file to your RE guys for a detailed analysis. This will make their life a little easier and they will be much more receptive in handling executables from memory.

This process will create two files (bin file and a bin.idc) so if you are not an RE guy then you can take these two files along with the executable and give those to your RE guys who will know how to import them into IDA.

The impscan is not part of the volatility install but is a python script found on the DVD of the above referenced book.  I have uploaded this script to the library for everyone’s use.  The file is called “malware.py” and contains numerous scripts in one and the impscan script is part of that.