2. Techniques Used by Malware

The most important covert methods are:

  • Streams

  • Hooking native APIs/SSDT

  • Hooking IRP

These topics are fast-moving targets, with new methods and techniques invented every few months.

2.1. Streams

Streams are a feature of the NTFS file system; they are not available on FAT file systems.

Microsoft calls them Alternate Data Streams (ADS).

The original data stream is the file data itself (it is the data stream with no name); all other streams have a name. Alternate data streams can be used to store file metadata or any other data.

To explain the concept, let us give you a demo. The demo has been tested on Windows XP SP3 (but should also work on other Windows NT based operating systems).

Example:

  1. Type the following command in the command prompt:

echo This data is hidden in the stream. Can you read it? >> sample.txt:hstream

Now you can check the file named sample.txt.

You will be surprised to see that the file size is reported as 0 bytes, because the reported size only covers the unnamed stream. You can retrieve your data by using the following command:

more < sample.txt:hstream

  2. Now let us explain how to use the stream programmatically.

In the CreateFile API in Windows, just append :stream_name to the file name, where stream_name is the name of the stream.

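#include <windows.h>
#include <stdio.h>

int main(void)
{
  /* Text to store in the alternate data stream. */
  const char data[] = "This data is hidden in the stream. Can you Read IT ???";
  DWORD dwRet = 0;   /* receives the number of bytes written */

  /* "file:stream" opens the named stream; CreateFileA takes an ANSI path. */
  HANDLE hStream = CreateFileA("sample.txt:mystream",
    GENERIC_WRITE,
    FILE_SHARE_WRITE,
    NULL,
    OPEN_ALWAYS,
    0,
    NULL);
  if (hStream == INVALID_HANDLE_VALUE) {
    printf("Cannot open sample.txt:mystream\n");
    return 1;
  }

  /* sizeof(data) - 1 writes the whole string without the trailing NUL. */
  if (!WriteFile(hStream, data, sizeof(data) - 1, &dwRet, NULL))
    printf("Cannot write to sample.txt:mystream\n");

  CloseHandle(hStream);
  return 0;
}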

2.2. Hooking Native API/SSDT

Hooking means that we want our malicious function to be called instead of the actual function.

SSDT stands for System Service Descriptor Table. The Native API is the API that resides in ntdll.dll and is used to communicate with kernel mode.

This communication happens through the SSDT.

For each entry in the SSDT, there is a corresponding function in kernel mode which completes the task specified by the API; this relationship can be pictured as:

User mode Native API  <===>  SSDT Table  <===>  Kernel Mode

The SSDT resides in the kernel and is exported as KeServiceDescriptorTable. The following services are available for reading/writing files:

  • NtOpenFile

  • NtCreateFile

  • NtReadFile

  • NtWriteFile

  • NtQueryDirectoryFile (this is used to query the content of a directory)

Microsoft adds new services with almost every OS release.

Example: Let us consider the case of a directory query.

For that, we have to hook NtQueryDirectoryFile.

Steps:

  1. Hook the SSDT entry corresponding to NtQueryDirectoryFile

  2. Now, whenever the above function is called, your function will be called instead

  3. Right after your function gets called, call the original function and get its result (the directory listing)

  4. If the call was successful, modify the result (remove the file/sub-directory you want to hide)

  5. Now pass the result back to the caller

  6. The file or sub-directory is now hidden from the caller

This is a very basic method. Nowadays almost all anti-virus products and rootkit detectors scan the SSDT for modifications (they compare it with a known-good copy of the table), and thus the hook can be detected.
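The steps above, and the detection idea, can be sketched in plain user-mode C. This is only a toy model, not kernel code: service_table, listing_t, real_query_directory and the file names are hypothetical stand-ins invented for illustration, and the real SSDT and NtQueryDirectoryFile have very different layouts and signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 8

/* Hypothetical stand-in for the directory listing returned by a service. */
typedef struct {
    const char *names[MAX_ENTRIES];
    size_t count;
} listing_t;

/* Signature shared by every "service" in this toy table. */
typedef int (*service_fn)(listing_t *out);

/* The genuine service: fills in a fixed listing and returns 0 for success. */
int real_query_directory(listing_t *out)
{
    static const char *files[] = { "notes.txt", "hidden.bin", "report.doc" };
    out->count = sizeof(files) / sizeof(files[0]);
    for (size_t i = 0; i < out->count; i++)
        out->names[i] = files[i];
    return 0;
}

/* Toy dispatch table: entry 0 plays the role of the directory-query entry. */
service_fn service_table[1] = { real_query_directory };
service_fn saved_original;

/* Replacement function: steps 3 to 5 of the list above. */
int hooked_query_directory(listing_t *out)
{
    int status = saved_original(out);          /* step 3: call the original */
    if (status != 0)
        return status;
    size_t kept = 0;
    for (size_t i = 0; i < out->count; i++)    /* step 4: drop one entry */
        if (strcmp(out->names[i], "hidden.bin") != 0)
            out->names[kept++] = out->names[i];
    out->count = kept;
    return status;                             /* step 5: back to the caller */
}

/* Step 1: save the old pointer, then overwrite the table entry. */
void install_hook(void)
{
    /* Hooking twice would make the hook call itself forever. */
    assert(service_table[0] != hooked_query_directory);
    saved_original = service_table[0];
    service_table[0] = hooked_query_directory;
}

/* Detector side: compare the live entry with the known-good address. */
int table_is_modified(void)
{
    return service_table[0] != real_query_directory;
}
```

Calling service_table[0] before and after install_hook() returns three and two names respectively, while table_is_modified() flips from 0 to 1, which is exactly the comparison a detector relies on.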

2.3. Hooking IRP

The Windows kernel-mode architecture introduced the concept of IRPs (I/O Request Packets) to transmit pieces of data from one component (driver) to another.

The concept of IRPs is well explained in the Windows Driver Development Kit (it is available for free).

Almost everything in the Windows kernel uses IRPs: for example, network interfaces (TCP/UDP, etc.), file systems, keyboard and mouse, and almost all existing drivers.

There are basically 2 ways to play with IRPs:

  • Become a filter driver: register with the OS as a filter driver or an attached device

  • Hook the function pointers: the array of function pointers that is documented in the WinDDK (the MajorFunction array of the driver object) can be modified

    Code snippet showing function pointer hooking:

    old_power_irp = DriverObject->MajorFunction[IRP_MJ_POWER];
    DriverObject->MajorFunction[IRP_MJ_POWER] = my_new_irp;

    As you can see, function pointer hooking is one of the easiest methods to hook functions.

The basic IRP design is such that, after an IRP has been created, it is passed down to all the devices registered at lower levels.

The design has a pre-processing mode and a post-processing mode.

Pre-processing is done when an IRP arrives, and post-processing is done when the IRP has been processed by all the levels below the current level.
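The ordering of pre-processing and post-processing can be illustrated with a small user-mode simulation. It is a sketch under stated assumptions: toy_irp, log_event and pass_down are hypothetical names, and a real driver stack forwards requests with kernel routines and completion callbacks rather than plain recursion.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LOG_SIZE 128

/* Hypothetical stand-in for an I/O request packet: it only carries a log. */
typedef struct {
    char log[LOG_SIZE];
} toy_irp;

/* Append an event to the log, but only if it still fits in the buffer. */
void log_event(toy_irp *irp, const char *event)
{
    if (strlen(irp->log) + strlen(event) + 1 < LOG_SIZE)
        strcat(irp->log, event);
}

/* Hand the request to layer `level`. Layers are numbered 0 (top) to
   depth - 1 (bottom); reaching `depth` means the request is completed. */
void pass_down(toy_irp *irp, int level, int depth)
{
    char tag[32];

    assert(level >= 0 && level <= depth);
    if (level == depth) {
        log_event(irp, "done ");
        return;
    }
    snprintf(tag, sizeof tag, "pre%d ", level);
    log_event(irp, tag);                 /* pre-processing: request arrives */
    pass_down(irp, level + 1, depth);    /* forward to the lower layers */
    snprintf(tag, sizeof tag, "post%d ", level);
    log_event(irp, tag);                 /* post-processing: lower layers finished */
}
```

With three layers, an empty toy_irp passed to pass_down(&irp, 0, 3) logs the pre-processing steps from top to bottom and the post-processing steps in the reverse order.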

Each device object belongs to a driver object, which has its own function table. Hooking the function pointers of such objects is called DKOM (Direct Kernel Object Manipulation).

All file systems, network layers and devices like keyboard, mouse, etc. have such objects.

For example:

  • \device\tcp

  • \device\ip

  • \Device\KeyboardClass0

  • \FileSystem\ntfs

Filter drivers are commonly used by anti-virus products to get control whenever a new file is written.

2.4. Hiding a Process

Hiding a process requires a more difficult approach: a combination of different techniques.

For example, the first thing you have to do is hook the NtOpenProcess native API (probably using SSDT hooks).

The other thing to do is to hide the process from the EPROCESS list.

This list is maintained by the OS for all active processes.

Each entry in the list is an EPROCESS structure, which has the following layout:

kd> dt _EPROCESS
    +0x000 Pcb                : _KPROCESS
    +0x06C ProcessLock        : _EX_PUSH_LOCK
    +0x070 CreateTime         : _LARGE_INTEGER
    +0x078 ExitTime           : _LARGE_INTEGER
    +0x080 RundownProtect     : _EX_RUNDOWN_REF
    +0x084 UniqueProcessId    : Ptr32 Void
    +0x088 ActiveProcessLinks : _LIST_ENTRY
    +0x090 QuotaUsage         : [3] Uint4B
    ...
    +0x0C4 ObjectTable        : Ptr32 _HANDLE_TABLE
    +0x0C8 Token              : _EX_FAST_REF
    ...
    +0x174 ImageFileName      : [16] Uchar

Note: UniqueProcessId, ActiveProcessLinks, Token, and ImageFileName are normally the most used fields.

As you can see in the above structure, ActiveProcessLinks is a circular doubly linked list entry, with Flink and Blink as pointers to the neighboring structures.

The easiest thing to do is to unlink the structure belonging to our process from the list (refer to Module 3 for more information on this list).

If a driver is loaded, you will also have to unlink it from PsLoadedModuleList.
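The unlink operation itself is ordinary doubly linked list manipulation. The following user-mode sketch shows it on a toy list; toy_process and the helper functions are hypothetical, and only the two-pointer shape of the list entry mirrors the kernel's LIST_ENTRY.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the kernel's LIST_ENTRY: forward and backward links. */
typedef struct list_entry {
    struct list_entry *flink;
    struct list_entry *blink;
} list_entry;

/* Hypothetical, heavily reduced stand-in for a process record. */
typedef struct {
    int pid;
    list_entry links;
} toy_process;

/* An empty circular list is a head that points to itself. */
void list_init(list_entry *head)
{
    head->flink = head->blink = head;
}

void list_insert_tail(list_entry *head, list_entry *e)
{
    e->flink = head;
    e->blink = head->blink;
    head->blink->flink = e;
    head->blink = e;
}

/* Make both neighbours skip `e`. The entry is then pointed at itself,
   so unlinking it a second time does not corrupt the list. */
void list_unlink(list_entry *e)
{
    assert(e->flink != NULL && e->blink != NULL);
    e->blink->flink = e->flink;
    e->flink->blink = e->blink;
    e->flink = e->blink = e;
}

/* Walk the list the way an enumerating tool would. */
int list_contains_pid(const list_entry *head, int pid)
{
    for (const list_entry *p = head->flink; p != head; p = p->flink) {
        const toy_process *proc = (const toy_process *)
            ((const char *)p - offsetof(toy_process, links));
        if (proc->pid == pid)
            return 1;
    }
    return 0;
}
```

After list_unlink(), the record still exists in memory and keeps its data, but a walk over the list no longer reaches it. That is why tools which only enumerate the list miss the entry.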

2.5. API Hooking

API hooking is essentially the act of intercepting an API function call and modifying its functionality somehow, either by redirecting it to a function of our choice, stopping the function from being called, or logging the request. The possibilities are endless.

There can be different types of hooking such as:

  • IAT Hooking: the IAT (Import Address Table) is used to resolve runtime dependencies.

    For example, when you use the MessageBoxA API in Windows, the linker automatically adds an import from user32.dll, and the loader fills in the function's address in the IAT.

    IAT hooking involves modifying the IAT of the executable and replacing the function's address with that of our copy.

  • EAT Hooking: the EAT (Export Address Table) is maintained in DLLs (dynamic link libraries).

    These files just contain support functions for other executable files.

    The difference between IAT and EAT hooking is:

    • Since EATs exist only in DLL files (under normal settings), EAT hooking is mostly used only on DLLs, while IAT hooking can be done on both EXEs and DLLs.

  • Inline Hooking: inline hooking is the most difficult to do due to the way it works.

    In this form of hooking, we modify the first few bytes of the target function's code and replace them with our code, which tells the IP (instruction pointer) to execute code somewhere else in memory.

    Whenever the function gets executed, we get control of execution; after doing our job, we have to call the original function, so we have to make up for the modified bytes.

    This is normally done by executing the instructions which were replaced and then resuming execution in the unmodified part of the original function code.

2.6. Anti-Debugging Methods

There are several methods which malware uses to increase the time a security analyst needs to analyze the code.

If such techniques are not already known to the security analyst, the time required increases drastically.

For example, we will document a trick that lets us detect the presence of a running debugger. It is called the INT 2D trick and works on Windows.

This trick is coded in assembly language.

push debugger_not_detected
push fs:[0]                 ; set SEH
mov  fs:[0],esp
int  2dh                    ; if a debugger is attached it will run normally, else an exception will be raised
nop                         ; the above instruction causes this instruction to be skipped if a debugger is attached
pop  fs:[0]                 ; clear SEH
add  esp,4
...
debugger_detected:
...
debugger_not_detected:

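What we do is:

  1. Set an exception handler

  2. Cause an exception with INT 2dh

  3. If no debugger is attached, the exception reaches our handler and execution continues at debugger_not_detected

  4. If a debugger is attached and does not pass the exception to us, we get to debugger_detected; an exception occurred for sure (we caused it), so something must have swallowed it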

2.7. Anti-Virtual Machine

Normal users run programs on a real system, but security analysts analyzing malware usually do not run it on a real system.

They usually run the code in a virtualized OS.

The tools include VMware, Virtual PC, Xen, Bochs and QEMU, to name a few.

This software lets you install a virtual OS side by side with your own OS (without disturbing it), and the virtual OS runs just like any normal program.

These tools are mainly used by security analysts, so malware authors have found a few quirks in them which can be used to detect whether the OS is virtualized or not.

One of the tricks is given in the code below:

.text:10007126 ; |||||| SUBROUTINE |||||
.text:10007126
.text:10007126
.text:10007126 Anti_Emulation_SIDT_Based_Check proc near ; CODE XREF: DllMain(x,x,x)+16↑p
.text:10007126      call  Get_IDT_base
.text:1000712B      and   eax, 0FF000000h
.text:10007130      xor   ecx, ecx
.text:10007132      cmp   eax, 80000000h ; Real Windows machines always have 0x80 as the MSB
.text:10007137      setnz cl
.text:1000713A      mov   eax, ecx       ; If EAX != 0 we are emulating Windows
.text:1000713C      retn
.text:1000713C Anti_Emulation_SIDT_Based_Check endp
.text:1000713C

The technique relies on the SIDT instruction, which returns the address of the IDT (Interrupt Descriptor Table).

On a real Windows machine, the most significant byte of that address is low (0x80 in the listing above, and in any case no higher than 0xd0), while for a virtualized OS (VMware/Virtual PC) it is higher than that.

This abnormal behavior reveals whether the malware is running on a real or a virtualized system.
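The comparison done by the listing can be restated in portable C, which makes it easy to try on sample values. This sketch does not execute SIDT itself: it assumes the 6-byte result of a 32-bit SIDT (a 2-byte limit followed by a 4-byte base) has already been captured, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Extract the 32-bit IDT base from the 6 bytes stored by a 32-bit SIDT:
   bytes 0-1 hold the limit, bytes 2-5 hold the base (little-endian). */
uint32_t idt_base_from_sidt(const unsigned char idtr[6])
{
    assert(idtr != NULL);
    return (uint32_t)idtr[2]
         | ((uint32_t)idtr[3] << 8)
         | ((uint32_t)idtr[4] << 16)
         | ((uint32_t)idtr[5] << 24);
}

/* Same test as the listing: keep only the most significant byte and
   compare it with 0x80. Returns 1 if the base does not look like a
   real Windows machine, 0 otherwise. */
int idt_base_looks_virtualized(uint32_t idt_base)
{
    return (idt_base & 0xFF000000u) != 0x80000000u;
}
```

The sample bases used to exercise the functions, such as 0x8003F400 and 0xFFC18000, are illustrative values only and are not taken from the original text.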

2.8. Obfuscation

Code obfuscation techniques transform a program in order to make it more difficult to analyze while preserving its functionality.

Code obfuscation is used by both malware and legitimate software to protect itself.

The difference is that malware uses it either to prevent detection or to make reverse engineering more difficult.

Basically, code obfuscation/data obfuscation makes programs more difficult to reverse engineer.

One major drawback of existing obfuscation techniques is the lack of a theoretical basis for their efficiency (several implementations which looked impressive had very basic weak points, leading to their total downfall).

Malware may obfuscate itself every time it infects a new machine, making it harder for a detector to recognize it.

Existing malware detectors (anti-virus engines) that are based on signature matching rely on purely syntactic information and can be fooled by such techniques.

2.9. Packers

Packers are software which compress executables. They were initially designed to decrease the size of executable files.

However, malware authors recognized very quickly that packing also decreases the number of recognizable patterns in the file, and so lowers the chances of detection by anti-virus software.

Anti-virus products basically work by matching patterns (signatures).

So, packing effectively increases the chances of malware going undetected.

Some virus writers have gone to the length of creating their own packers (such as Yoda packer), while others use readily available packers such as UPX.

Some packers also use anti-debugging tricks.

Packer facts:

  • Packers allow applications to be compressed/encrypted

  • You cannot see the code of the application using a disassembler; you need to unpack it first

  • Packers compress applications and add a small loader to the file

  • The loader will decompress the binary in memory, resolve imports, and call the Original Entry Point (OEP)
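To make the compress-then-unpack cycle concrete, here is a minimal sketch that uses run-length encoding as a stand-in for a packer's compression. This is an assumption made for brevity: real packers such as UPX use far stronger algorithms, and a real loader also resolves imports and jumps to the OEP. The names rle_pack and rle_unpack are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* "Packer" half: run-length encode src into (count, byte) pairs.
   Returns the number of bytes written, or 0 if dst is too small
   (an empty input also yields 0). */
size_t rle_pack(const unsigned char *src, size_t n,
                unsigned char *dst, size_t cap)
{
    size_t out = 0;
    for (size_t i = 0; i < n; ) {
        unsigned char b = src[i];
        size_t run = 1;
        while (i + run < n && src[i + run] == b && run < 255)
            run++;
        if (out + 2 > cap)
            return 0;
        dst[out++] = (unsigned char)run;
        dst[out++] = b;
        i += run;
    }
    return out;
}

/* "Loader" half: expand the (count, byte) pairs back into memory.
   Returns the number of bytes restored, or 0 if dst is too small. */
size_t rle_unpack(const unsigned char *src, size_t n,
                  unsigned char *dst, size_t cap)
{
    size_t out = 0;
    assert(n % 2 == 0);   /* packed data always comes in pairs */
    for (size_t i = 0; i + 1 < n; i += 2) {
        size_t run = src[i];
        if (out + run > cap)
            return 0;
        for (size_t k = 0; k < run; k++)
            dst[out++] = src[i + 1];
    }
    return out;
}
```

The packed form no longer contains the original byte sequence, which is why a pattern taken from the unpacked code does not match the file on disk; the original bytes only reappear after the unpacking step has run.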

2.10. Polymorphism

Polymorphic code aims at performing a given action (or algorithm) through code that mutates every time the action has to be taken.

The mutation makes such code very difficult to detect.

There have been only a few polymorphic viruses, and most anti-virus products still do not detect them 100% of the time.

All polymorphic viruses have a constant body (once decoded) and a variable decryptor. So a virus using a different XOR key to encrypt each of its variants also falls into the polymorphic category.
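The XOR case mentioned above can be shown on harmless data. In this sketch the text "payload", the keys 0x21 and 0x5A, and the helper names are arbitrary choices for illustration. The point is only that the same bytes encoded under two keys look different on disk, yet decode to the same content.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* XOR every byte with the key. Applying the same key twice restores
   the original bytes, so one routine both encodes and decodes. */
void xor_buffer(unsigned char *buf, size_t n, unsigned char key)
{
    assert(buf != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        buf[i] ^= key;
}

/* Byte-wise comparison, standing in for a naive signature match. */
int buffers_equal(const unsigned char *a, const unsigned char *b, size_t n)
{
    return memcmp(a, b, n) == 0;
}
```

Encoding two copies of the same buffer with different keys gives two variants for which buffers_equal() returns 0, so a signature taken from one encoded variant does not match the other, while decoding either one gives back the identical bytes.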

2.11. Metamorphism

Metamorphism can be described as polymorphism in which the mutation is applied to the decryptor/header as well as to the body.

There are numerous ways to implement metamorphism/polymorphism (both are similar, with some minor differences).

Some of them are documented below:

  • Garbage Insertion

  • Register Exchange

  • Permutation of Code Blocks

  • Insertion of Jump Instructions

  • Instruction Substitution

  • Code Integration with Host

2.11.1. Garbage Insertion

Garbage Insertion: garbage data or instructions are inserted into the code; for example, NOP instructions (0x90) are inserted.

2.11.2. Register Exchange

Register Exchange: the registers are exchanged in all the instructions.

For example, see the 2 snippets of code given below which are using register exchange.

First Snippet

Code Bytes      Disassembly
5A              pop edx
BF04000000      mov edi, 0004h
8BF5            mov esi, ebp
B80C000000      mov eax, 000Ch
81C288000000    add edx, 0088h
8B1A            mov ebx, [edx]

Second Snippet

Code Bytes      Disassembly
58              pop eax
BB04000000      mov ebx, 0004h
8BD5            mov edx, ebp
BF0C000000      mov edi, 000Ch
81C088000000    add eax, 0088h
8B30            mov esi, [eax]

2.11.3. Permutation of Code Blocks

Permutation of Code Blocks: in this type of mutation, code blocks are randomly shuffled and then fixed up, so that the execution logic is still the same.

This technique is very powerful.

2.11.4. Insertion of Jump Instructions

Insertion of Jump Instructions: some malware mutates by inserting jumps after instructions (the instructions are also relocated), so that the code flow does not change.

2.11.5. Instruction Substitution

Instruction Substitution: in this type of mutation, one instruction (or set of instructions) is replaced by 1 or more different instructions which are functionally equivalent to the replaced set.

The table below shows a few example rules which can be used for substitution.

reg stands for a register. imm stands for an immediate number such as 0x800 (can be any numeric value). imm8 stands for a 1-byte (8-bit) immediate and imm32 for a 4-byte (32-bit) immediate. ACUM stands for the accumulator (AL/AX/EAX) and SIZE for the operand size in bytes.

Original Code            Transformed Code

ADD   reg, imm           SUB   reg, -(imm)
MOV   reg, reg/imm       PUSH  reg/imm
                         POP   reg
SUB   reg, reg           XOR   reg, reg
TEST  reg, reg           OR    reg, reg
LODSx                    MOV   ACUM, [esi]
                         ADD   esi, SIZE
STOSx                    MOV   [edi], ACUM
                         ADD   edi, SIZE
CMP   reg, imm8          CMP   reg, imm32
DEC   reg                SUB   reg, 1
INC   reg                ADD   reg, 1
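Three of these rules can be checked on 32-bit values in C. This is a sketch of the arithmetic only, with hypothetical function names: it models the register result (or, for TEST/OR, the value the flags are computed from) and ignores side effects on the CPU flags, which can differ between the two forms (INC and DEC, for example, leave the carry flag untouched while ADD and SUB update it).

```c
#include <assert.h>
#include <stdint.h>

/* ADD reg, imm */
uint32_t add_original(uint32_t reg, uint32_t imm) { return reg + imm; }

/* SUB reg, -(imm): unsigned arithmetic wraps modulo 2^32, like the CPU. */
uint32_t add_substituted(uint32_t reg, uint32_t imm) { return reg - (0u - imm); }

/* SUB reg, reg */
uint32_t clear_with_sub(uint32_t reg) { return reg - reg; }

/* XOR reg, reg */
uint32_t clear_with_xor(uint32_t reg) { return reg ^ reg; }

/* TEST reg, reg sets the flags from (reg AND reg). */
uint32_t test_value(uint32_t reg) { return reg & reg; }

/* OR reg, reg sets the flags from (reg OR reg) and leaves reg unchanged. */
uint32_t or_value(uint32_t reg) { return reg | reg; }

/* Assert that each pair agrees for the given inputs. */
void check_rules(uint32_t reg, uint32_t imm)
{
    assert(add_original(reg, imm) == add_substituted(reg, imm));
    assert(clear_with_sub(reg) == clear_with_xor(reg));
    assert(test_value(reg) == or_value(reg));
}
```

Calling check_rules() with values such as (0x12345678, 0x800) or (0xFFFFFFFF, 1) passes every assertion, including the wrap-around case.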

2.11.6. Code Integration with Host

In this type of mutation, the malware modifies the target executable (which is being infected) by spraying its code in regions of the EXE.

The Zmist virus used this technique very effectively.

To hide further changes, such as a change in file size, the malware might compress the original code (or may even damage the file completely).
