When reading a technical analysis of malware, you will often come across the term “hooking.” The term can be confusing and even ambiguous, so in this article we will explain exactly what hooking is, the different types of hooking, and how it is used by malware.
What is Hooking?
Let’s start with a semi-formal definition from Wikipedia:
In computer programming, the term hooking covers a range of techniques used to alter or augment the behavior of an operating system, of applications, or of other software components by intercepting function calls or messages or events passed between software components.
As you may know, programs are essentially broken into a series of “functions,” which are just semantically-grouped areas of code that help programmers reason about and write organized programs. You probably also know that files such as DLLs on PCs and dylibs on Macs are not “directly executable,” meaning you cannot just double-click the file to start a program. This is because these files are simply collections of functions which are “exported” and meant to be called by other executable files. This is why a program often ships as a single .exe with several associated DLL files. The .exe file initiates the actual program and then relies upon the code in the DLL files in order to operate. Hence the telltale “missing DLL file” errors that every Windows user has encountered at some point. For the remainder of this article, when we use the term “DLL” we mean “code library” in general; we use DLLs in particular because we are discussing Windows-specific malware, which is PC Matic’s forte.
It is important to note a couple of things about DLL files:
- While it is possible to create a program which relies on no DLLs, this is rarely seen because in order for a program to effectively do anything on a Windows computer, it must at least use operating system DLLs such as kernel32.dll, user32.dll and ntdll.dll, which are part of the Win32 and Native APIs. In essence, these files, along with others, provide the control over the computer and its hardware that a programmer needs in order for an application to work properly. Even if a programmer never places a direct call to a DLL function in the program, the standard library routines the program is built against eventually will.
- Although most malware appears in the form of .exe files, it is also common and effective for a malware author to write a malicious DLL instead. The benefit to doing this is that a legitimate program can be tricked into calling malicious functions from the malicious DLL rather than legitimate functions that it needs to call in order to operate. One method for doing this is hooking.
Import Address Table (IAT) Hooking
When a user double-clicks or otherwise launches an executable file, the operating system is responsible for loading it into memory. The term loading, for the most part, means reading the bytes of the program on disk, interpreting them, figuring out which parts go where (code, data, read-only data, and other sections) and how the file should be placed in RAM, and then finally taking the bytes that are on disk in the file and loading them into the computer’s RAM and starting at least 1 thread of execution. A “running” program is simply a group of instructions and data which are loaded into memory and which has at least 1 thread executing instructions.
During the above process the Windows Portable Executable Image Loader is called in order to “resolve” the dependencies of the program. Resolving the dependencies largely means figuring out which DLLs are needed and which functions from those DLLs are used by the program. When the program is built, the compiler and linker record every DLL and function the code needs in what is loosely called the “Import Address Table,” which is actually a collection of data structures inside the file, located through its header, called IMAGE_IMPORT_DESCRIPTORs and IMAGE_THUNK_DATA. Simply put, each IMAGE_IMPORT_DESCRIPTOR identifies one needed DLL by name and refers to an array of IMAGE_THUNK_DATA entries. These entries tell the image loader which functions are needed from that DLL. See them here:
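typedef struct _IMAGE_IMPORT_DESCRIPTOR {
    DWORD OriginalFirstThunk; // RVA of the unmodified thunk array (names/ordinals)
    DWORD TimeDateStamp;      // 0 unless the imports are bound
    DWORD ForwarderChain;     // index of the first forwarder, -1 if none
    DWORD Name;               // RVA of the DLL's name string
    DWORD FirstThunk;         // RVA of the thunk array the loader fills with addresses
} IMAGE_IMPORT_DESCRIPTOR, *PIMAGE_IMPORT_DESCRIPTOR;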
FirstThunk holds the relative address (RVA) of the beginning of an array of IMAGE_THUNK_DATA structures, one for each imported function. Name holds the RVA of a string containing the name of the DLL file from which the functions are imported.
typedef struct _IMAGE_THUNK_DATA {
    union {
        PDWORD Function;                     // address of the function once resolved
        PIMAGE_IMPORT_BY_NAME AddressOfData; // before resolution: the import's hint and name
    } u1;
} IMAGE_THUNK_DATA, *PIMAGE_THUNK_DATA;
The function can be imported by either an ordinal (ID number) or its name. You will also notice that the structures are of type struct _IMAGE_THUNK_DATA and struct _IMAGE_IMPORT_DESCRIPTOR and they are typedefed to the shortened name that we use in this article. The convention of typedefing a struct in C is common so that the word struct does not need to be stated over and over in the program but instead, the shorter IMAGE_THUNK_DATA, for example.
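To make the ordinal-versus-name distinction concrete, here is a minimal sketch of how a 32-bit thunk value can be decoded. It assumes the 32-bit layout, where the high bit marks an import by ordinal. The helper names (thunk_is_ordinal, thunk_ordinal, thunk_name_rva) are made up for this illustration and are not part of the Windows headers, which provide macros for the same purpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* High bit set = import by ordinal (same value as IMAGE_ORDINAL_FLAG32). */
#define ORDINAL_FLAG32 0x80000000u

/* True when the thunk encodes an ordinal rather than the RVA of a name entry. */
static bool thunk_is_ordinal(uint32_t thunk) {
    return (thunk & ORDINAL_FLAG32) != 0;
}

/* For ordinal imports, the ordinal is stored in the low 16 bits. */
static uint16_t thunk_ordinal(uint32_t thunk) {
    assert(thunk_is_ordinal(thunk)); /* only meaningful for ordinal imports */
    return (uint16_t)(thunk & 0xFFFFu);
}

/* For name imports, the low 31 bits are the RVA of an IMAGE_IMPORT_BY_NAME entry. */
static uint32_t thunk_name_rva(uint32_t thunk) {
    return thunk & 0x7FFFFFFFu;
}
```

For example, a thunk value of 0x80000010 describes an import by ordinal 16, while 0x00002F40 would be the RVA of a hint/name entry.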
Import Address Table hooking means modifying these import structures so that an imported function resolves to attacker-controlled code. This can be done by editing the import data of a legitimate program on disk, swapping a legitimate DLL or function for a malicious one, or, more commonly, by overwriting the resolved function pointers in the table once the program has been loaded into memory. This is called “Import Address Table Hooking” because the malicious code intercepts and modifies the victim program’s functionality without the program knowing. In fact, many malware authors write the malicious code so that it calls the actual legitimate code once it finishes performing its evil deeds, so that the program continues to run smoothly and no one ever knows.
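The idea can be sketched without touching a real PE file. The following is a portable simulation, not working malware or real loader code: import_table stands in for the resolved import table, and every name in it (import_entry, real_add_one, hook_import, victim_call) is hypothetical. It only shows the pattern of swapping a pointer and forwarding to the original.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*import_fn)(int);

/* One resolved import: a name plus the pointer the "loader" filled in. */
struct import_entry {
    const char *name;
    import_fn   target;
};

/* Stands in for a function exported by a legitimate DLL. */
static int real_add_one(int x) { return x + 1; }

static struct import_entry import_table[] = {
    { "AddOne", real_add_one },
};

static import_fn saved_original; /* kept so the hook can forward the call */
static int       hook_hits;      /* number of calls the hook intercepted */

/* The hook does its extra work, then calls the original so nothing looks wrong. */
static int hooked_add_one(int x) {
    hook_hits++;
    return saved_original(x);
}

/* Overwrite the slot for `name`; returns the previous pointer, or NULL if absent. */
static import_fn hook_import(const char *name, import_fn replacement) {
    assert(replacement != NULL);
    for (size_t i = 0; i < sizeof import_table / sizeof import_table[0]; i++) {
        if (strcmp(import_table[i].name, name) == 0) {
            import_fn old = import_table[i].target;
            import_table[i].target = replacement;
            return old;
        }
    }
    return NULL;
}

/* The "victim" always calls through the table, never the function directly. */
static int victim_call(int x) {
    return import_table[0].target(x);
}
```

After `saved_original = hook_import("AddOne", hooked_add_one);`, a call to victim_call still returns the same result as before, but it now passes through hooked_add_one first.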
Inline Hooking
Inline hooking achieves almost exactly the same result as Import Address Table hooking. The difference is that, instead of misguiding the victim program into loading malicious functions, the attacker writes the malicious functionality directly into the code of a legitimate DLL. In practice this usually means overwriting the first few instructions of the target function with a jump to the attacker’s code. The legitimate DLL will already be called by the program since the function is already inside of the Import Address Table. Because of this, the only thing an attacker performing an inline hook needs to do is locate the DLL which will be called by the victim program and then replace or redirect a function inside of it. The function will be resolved by the operating system as usual and the malicious code will be executed when the function is called. Again, it is common for the malicious code to wrap the legitimate code in order to remain stealthy.
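As an illustration of the mechanics, the sketch below builds the five-byte x86 `jmp rel32` patch that an inline hook commonly places at the start of a function. It is a simulation under stated assumptions: addresses are plain 32-bit numbers, the “function” is an ordinary byte buffer, nothing is written into executable memory, and the names make_jmp_patch and install_inline_hook are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define JMP_REL32_SIZE 5 /* opcode 0xE9 followed by a 32-bit displacement */

/* Build a `jmp rel32` that, when located at address `from`, lands on `to`. */
static void make_jmp_patch(uint8_t patch[JMP_REL32_SIZE], uint32_t from, uint32_t to) {
    assert(patch != NULL);
    /* The displacement is measured from the end of the jump instruction. */
    uint32_t rel = to - (from + JMP_REL32_SIZE);
    patch[0] = 0xE9;
    patch[1] = (uint8_t)(rel & 0xFF); /* little-endian byte order */
    patch[2] = (uint8_t)((rel >> 8) & 0xFF);
    patch[3] = (uint8_t)((rel >> 16) & 0xFF);
    patch[4] = (uint8_t)((rel >> 24) & 0xFF);
}

/*
 * Simulated hook installation on a byte buffer that stands in for the
 * function's code. The overwritten bytes are saved first so that the
 * original behavior could still be reached or restored later.
 */
static void install_inline_hook(uint8_t *code, uint8_t saved[JMP_REL32_SIZE],
                                uint32_t code_addr, uint32_t hook_addr) {
    memcpy(saved, code, JMP_REL32_SIZE);
    make_jmp_patch(code, code_addr, hook_addr);
}
```

Saving the overwritten bytes is what lets a hook “surround” the legitimate code: the hook can execute the saved instructions and then jump back into the original function just past the patch.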
System Service Dispatch Table Hooking – A rootkit/kernel attack
The operating system kernel is the functional core of the OS. When a regular “user” program such as Word, Excel, or Firefox crashes, you get an error message. When the kernel crashes, the entire machine halts and produces the classic “Blue Screen of Death.” Essentially, the kernel is a very sensitive area with limited access by regular programs and complete power over the entire machine, including the hardware. It should go without saying that malware which has worked its way into the kernel is some of the worst kind of malware and can go undetected for very long periods of time, as is typical of advanced persistent threats (APTs). Malware which has infected the kernel, sometimes called “rootkit malware,” can do things like hide itself from programs that try to detect it, damage the machine, or silently use the webcam, GPS hardware, and microphone without the victim knowing. This type of malware is especially dangerous when it infects the kernel of a system which controls important infrastructure. An example of such a situation was the Stuxnet malware, which was able to take over industrial centrifuges and destroy them.
Before getting to SSDT hooking, I want to briefly mention drivers. You’ve probably heard of a driver before. A driver is simply a programmer’s way to place code into the kernel. Drivers are the only software which is allowed to be inside of the operating system kernel memory space other than the main kernel code itself and they are needed in order for the kernel to interact with and control hardware devices. This is why when a user purchases a new device and plugs it in, drivers must be installed. However, a driver is simply code in the kernel and no physical hardware need be associated with it: it is possible to write drivers that perform other tasks and have a greater control over the system than standard programs do.
What all of this means is:
- It is very beneficial for a malware author to get control over the OS kernel
- There are limited ways to do this since the kernel is primarily exposed to user-mode programs through drivers, which need to be written and installed in a unique way. One benefit of this for defenders is that far fewer programmers know how to write drivers than know how to write user-mode programs. Kernel malware is more specialized, and often it is not even needed in order to do sufficient damage to a system.
- Several ways in which malware can infect the kernel include: Installing a malicious driver, exploiting a good driver, and System Service Dispatch Table hooking
What is the System Service Dispatch Table?
As we discussed in foregoing sections, the system’s memory layout is split between two main modes, user mode and kernel mode, and there is a “wall” of sorts between the two. User-mode programs are not allowed to call functions in the kernel directly. The obvious question is: “How does a user-mode application end up controlling the computer if it has no direct access to the kernel, which is responsible for controlling the hardware?” This is where the System Service Dispatch Table comes into play.
In reality, a user-mode program actually can call kernel functionality, but only through a narrow, controlled path, and it is rarely ever done directly. The processor registers, which are much like small slots of memory inside of the actual processor chip rather than in RAM, are used to bridge the gap from user to kernel mode. The SSDT is essentially a look-up table, much like a phone book, which contains the addresses of kernel functions. Each function is associated with a unique identifying number, often called a dispatch ID or system service number. In order for a user-mode program to call a kernel function, a number from this table is placed into the processor’s accumulator register (eax) and then a special instruction is executed. On older versions of Windows this was the software interrupt int 0x2e (int 0x80 is the Linux counterpart), but on modern chips it is syscall or sysenter. When the processor executes this instruction, it switches to kernel mode and hands control to the kernel’s system call handler, which reads the number from the register, looks it up in the SSDT, and pulls out the corresponding function. Finally, the arguments that were to be passed to the function must be taken from the user program’s stack and transferred to the kernel function so that it can properly execute.
The whole idea of the above process is that the CPU instruction, paired with the SSDT, acts as a gate that severely limits which parts of the kernel are exposed. In fact, these entry points are so restricted and subject to change that calling them directly from a user-mode program is rarely done. Instead, developers who write user-mode applications utilize the Win32 API, the Native API, or even the .NET runtime, which will at some point make these SSDT calls. In this way, if the SSDT needs to change, user-mode programs will not break, and the handling of kernel calls is best left to the developers of the operating system anyway.
Let’s look at a brief example of how this all works. There is a function in the Windows Native API called NtCreateSection. This function creates a new Section Object, which is an area of shared memory for which “views” can be used in order for multiple programs/processes to share the same data in memory without stepping on each other. This function is callable from both user-space and kernel-mode. If the function is called from user-space, NtCreateSection from ntdll.dll is used because ntdll.dll (the Native API) is the final API in user-mode which interacts with the kernel as explained above. If a section object needs to be created from code that is already inside kernel-mode, a driver for example, then ZwCreateSection is used.
What this means is that NtCreateSection must use the sysenter/syscall instruction as explained above since it is bridging the gap from user-space to kernel-mode. When we open up ntdll.dll in a disassembler, we see that the function is only a short stub: it loads its service number into eax and then hands off to the sysenter/syscall instruction.
The sysenter instruction is actually a processor instruction. What this means is that it is not part of the Windows operating system but instead part of the processor instruction set, much like other assembly instructions such as mov, add, push, pop, etc. This is important to note because the processor needs to assist applications in jumping from user to kernel memory space, and implementing it like this allows for better control over the two spaces. When the processor handles a syscall or sysenter instruction, it knows to do several things. Before we take a look at exactly what it does, let’s go over what a regular function call usually does:
- First, (pointers to) function arguments are either placed on the stack or moved into CPU registers.
- The call instruction pushes the return address, that is, the address of the instruction immediately after the call, onto the stack. This is done so that when the return instruction is executed inside the function, the CPU knows where to return to so that the program doesn’t crash.
- A jump occurs and takes the processor’s instruction pointer to the first instruction of the function which was called.
- The called function’s instructions are executed in sequential order.
This must all still occur when we are calling a kernel “system service” function via the sysenter or syscall instructions. In order to facilitate this, the following happens when these instructions are executed:
- By convention, a pointer to the user-mode stack is expected to be in the edx register.
- The CPU checks what is called a Model Specific Register (MSR) for a special function pointer which is in this register at all times. Currently in x64 systems, this function is called KiSystemCall64 and it is in kernel space.
- Like any other call, the address to return to inside of user-mode code is also stored.
- KiSystemCall64 (or KiFastCallEntry on 32-bit systems) takes the number currently in the eax register and looks it up inside of the SSDT, which is stored in a data structure called KiServiceTable inside kernel memory.
- The proper kernel function is located, the arguments that are meant to be passed to it are taken from the user-mode stack pointed to by edx, and finally the instruction pointer is moved to execute the kernel-mode function.
The details of this process can and do change from OS to OS, processor to processor, and version to version, which is also why many of these details are left out of explanations. However, the general idea is that a lookup table and several CPU registers are used to call and transfer control to kernel-mode functions. This can happen because the thing which separates kernel-mode from user-space is a memory barrier. The CPU registers, however, are exempt from this memory barrier and can act as the mediator between user and kernel space. This mechanism can be abused by malware using SSDT hooking, malicious drivers, and/or driver or direct kernel exploits.
An SSDT hook is simply when malware replaces one of the function entries inside of the SSDT with a pointer to a malicious function by editing the SSDT. In this way, when a kernel function is supposed to be called, it will actually trigger malicious code execution inside of the kernel memory space. In fact, malware could hook a very common kernel function, such as NtCreateSection or ZwReadFile, which is called several times per second, in order to continuously execute some malicious behavior and piggy-back on the system calls. SSDT hooking is no longer a big threat in modern versions of Windows because there are protections, such as Kernel Patch Protection (PatchGuard) on 64-bit Windows, which prevent a driver from editing the SSDT or modifying any kernel code for that matter. However, the underlying concepts are very similar in many of these attacks and will be seen again in the future.
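The lookup-then-replace idea can be sketched as a toy model in portable C. This is not kernel code: the service numbers, the struct regs “register file,” and the names dispatch, hook_service, and hooked_read_file are all assumptions made for illustration, and edx is simplified to carry a single argument rather than a pointer to the user-mode stack.

```c
#include <assert.h>
#include <stddef.h>

typedef long (*service_fn)(long arg);

/* Made-up service numbers; real ones differ between Windows versions. */
enum { SVC_CREATE_SECTION = 0, SVC_READ_FILE = 1, SVC_COUNT = 2 };

static long svc_create_section(long arg) { return arg + 100; }
static long svc_read_file(long arg)      { return arg + 200; }

/* Toy service table: index = service number, value = handler address. */
static service_fn service_table[SVC_COUNT] = { svc_create_section, svc_read_file };

/* Toy register file holding only the two registers discussed in the text. */
struct regs {
    unsigned long eax; /* service number */
    long          edx; /* simplified: the argument itself */
};

/* Stand-in for the kernel entry point: validate eax, then dispatch via the table. */
static long dispatch(const struct regs *r) {
    assert(r != NULL);
    if (r->eax >= SVC_COUNT) {
        return -1; /* unknown service number */
    }
    return service_table[r->eax](r->edx);
}

static service_fn original_read_file; /* saved so the hook can forward */
static int        intercepted;        /* calls seen by the hook */

/* The hook piggy-backs on the service, then forwards to the real handler. */
static long hooked_read_file(long arg) {
    intercepted++;
    return original_read_file(arg);
}

/* Replace one table entry and hand back the previous pointer through `saved`. */
static void hook_service(unsigned long number, service_fn replacement, service_fn *saved) {
    assert(number < SVC_COUNT && replacement != NULL && saved != NULL);
    *saved = service_table[number];
    service_table[number] = replacement;
}
```

Once `hook_service(SVC_READ_FILE, hooked_read_file, &original_read_file);` has run, every dispatch with eax set to SVC_READ_FILE passes through the hook, while callers still receive the result they expect.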
You will notice that the hook implementation itself is much like the aforementioned IAT hooking. This is why we’ve presented these three types of hooks together.
Hook Injection
Hook injection is still an “interception” of functionality, but other than that, it operates quite differently from the above. An important realization to make is that it is not just malware which needs to hook functions on the system. Hooking is a perfectly normal and common thing to do in many legitimate programs. In fact, many antivirus solutions use hooks as part of system monitoring and programs can use hooks to monitor hardware as well.
For these reasons, Windows actually provides several methods of hooking behavior. One such method is the Win32 SetWindowsHookEx function.
Windows is an event-driven operating system. What this means is that events are occurring all the time in the background, such as when a user presses a keyboard key, clicks a mouse button, or when packets are coming in from the Internet and need to be processed, etc…
Windows uses a system called “messages” to communicate back and forth with a user-mode application. If the OS needs to signal the program about anything, it can send a message to the program, and it is up to the programmer of the application to specify how this message is handled. It is important to note here that we are speaking of Win32 API programming. .NET languages such as C# and Visual Basic abstract most of this away from the programmer so that he or she no longer has to directly program such “message handlers.” It is possible that you have written many .NET applications on Windows and have never programmed a message handler in the Win32 sense.
SetWindowsHookEx allows a programmer to listen for specific messages destined for a given thread, or for every thread on the desktop. Simply put, a programmer can use this function to intercept keyboard presses, mouse clicks, system messages, and other messages that are sent to or from a program. Malware such as keyloggers can use this function to quietly record user input in the background without the user knowing. While this is still a “hook,” it is a different type; here we are listening for events/messages passing between an application and the OS, whereas before we were modifying code inside of files and triggering the execution of malicious code. Hook injection is more of a tool for espionage. The different types of events that can be listened for, and more information on using this function, can be found on MSDN. You will see that there is a parameter called idHook which can be supplied with a different macro/code depending on which type of hook the programmer desires to use.
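The following portable sketch models the hook-chain idea only; it does not use the Win32 API. install_hook and call_next_hook are hypothetical stand-ins playing the roles that SetWindowsHookEx and CallNextHookEx play on Windows, and the buffers app_input and spy_log exist purely to show that the hook sees a key press before the application does.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HOOKS 4
#define LOG_SIZE  32

/* A hook receives the key and its own position in the chain. */
typedef void (*key_hook)(char key, int position);

static key_hook hooks[MAX_HOOKS]; /* hooks[hook_count - 1] was installed last */
static int      hook_count;

static char app_input[LOG_SIZE]; /* what the application received */
static int  app_len;
static char spy_log[LOG_SIZE];   /* what the keylogger recorded */
static int  spy_len;

/* The application's own handler for key presses. */
static void app_handle_key(char key) {
    if (app_len < LOG_SIZE - 1) app_input[app_len++] = key;
}

/* Pass the event to the next older hook, or to the application at the end. */
static void call_next_hook(char key, int position) {
    if (position > 0) hooks[position - 1](key, position - 1);
    else app_handle_key(key);
}

/* A keylogger-style hook: record the key, then pass it on unchanged. */
static void keylogger_hook(char key, int position) {
    if (spy_len < LOG_SIZE - 1) spy_log[spy_len++] = key;
    call_next_hook(key, position);
}

/* Register a hook; returns 0 on success or -1 when the chain is full. */
static int install_hook(key_hook hook) {
    assert(hook != NULL);
    if (hook_count == MAX_HOOKS) return -1;
    hooks[hook_count++] = hook;
    return 0;
}

/* The "OS" delivering a key press: the most recently installed hook runs first. */
static void deliver_key(char key) {
    call_next_hook(key, hook_count);
}
```

Because the hook forwards every event, the application behaves exactly as before, which is what makes this style of interception hard for a user to notice.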
Conclusion
The important lesson to learn from these attacks when analyzing malware is to look for the behaviors. Even if a particular trick is obsolete, old, or rarely used anymore, it remains very common for exploit and malware authors to try to trick the operating system into doing harmful things that it was not designed to do. It is also common for malware to load itself into a “safe” or “trusted” application in order to bypass protections, since the operating system, and in some cases antivirus, will let its guard down for particular programs. Malware analysis and research becomes more rewarding when the analyst learns to think in this fashion in order to look for and spot new malicious behaviors in malware.
Although there are different types of hooking which perform different tasks, the general definition of “hooking means intercepting” applies in most scenarios. In some cases, hooking is used to simply gather information such as monitor keystrokes. In others, it is used to alter program behavior and can be used to hide from antivirus and operating system checks.
Further Reading
To read more about some of the topics described in this article, check out the following links:
- The NT Insider
- The Quest for the SSDTs by Jose A. Pascoa
- Switching Between User and Kernel Space by Microsoft employee CrDev
- x86 Instruction Set Reference – sysenter