The Malware Research Team has recently come across JavaScript malware of the kind we described in Fileless Malware Explained last month. In this post, we highlight one such sample and, specifically, the process used to deobfuscate part of its malicious code.
What is Obfuscation?
Code obfuscation is a group of techniques used to make program code more difficult for a human to understand while keeping it executable by the machine. It is an anti-reverse-engineering trick: the only reason someone would want to obfuscate code is to make reverse engineering or analysis more difficult. Since it must still be executed by the machine, obfuscated code is by definition never impossible to follow, but it can be extremely difficult, or simply more time-consuming, to reverse engineer. It is a bit like trying to read poorly written “spaghetti code,” or untangling 100 wires that are all knotted together.
Obfuscated code and encrypted code are two different things. For example, WannaCry/WanaCrypt0r used some encrypted code, in which all of the bytes that make up the code are scrambled by an encryption algorithm and the code is then decrypted in memory prior to running. Below you will first see what encrypted code looks like, followed by two examples of obfuscated code, before we dive in:
Encrypted (not obfuscated) Code:
Obfuscated Code:
As we can see, unlike encrypted code, obfuscated code still has all of its programming constructs intact. Its identifiers are simply named in a way that makes the code difficult to read. Additionally, extra, illogical steps may be added to the code to make it more difficult to follow. For example, if I wanted to tell you how to get the number 15, I could say:
To get the answer, add 5 and 10 together.
OR, I could say:
To get the answer, take 657,982, divide it by two, multiply it by 3, subtract 2,000, add 75, multiply by 1, then subtract 985,033.
Both of these will get you 15, except one is obfuscated and meant to slow you down. This is the exact same technique that code obfuscation uses. Examine the first screenshot above, and you will see repeated calls to g() and a(). In fact, a() is inside of g().
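The arithmetic analogy above translates directly into code. The following is only an illustrative sketch; the variable names are made up, and the starting value is picked so that the long chain evaluates to exactly 15.

```javascript
// Direct version: the intent is obvious at a glance.
const direct = 5 + 10;

// Obfuscated version: same result, buried under pointless steps.
// The starting value 657982 is chosen so the chain lands on exactly 15.
let roundabout = 657982;
roundabout = roundabout / 2;      // 328991
roundabout = roundabout * 3;      // 986973
roundabout = roundabout - 2000;   // 984973
roundabout = roundabout + 75;     // 985048
roundabout = roundabout * 1;      // no-op, pure noise
roundabout = roundabout - 985033; // 15

console.log(direct, roundabout, direct === roundabout); // 15 15 true
```

Real obfuscators apply the same idea to strings and control flow rather than simple sums, but the goal is identical: more steps for the reader, same result for the machine.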
First of all, g() and a() are not very good function names. Secondly, the frequency of these function calls and context in which they appear should immediately alert an analyst to go and look at the definitions of these functions. Upon doing so, we find:
g() is just an obfuscated function that prints input text to the console
a() is our decoding function. It walks the input string two characters at a time, treating each pair as a hexadecimal number and converting it to a character with:
String.fromCharCode(parseInt(b.substr(c, 2), 16))
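To make that expression concrete, here is a minimal sketch of what a decoder built around it might look like. This is a reconstruction, not the malware's actual a(): the function name decodeHex, the loop shape (index advancing by 2), and the sample input are all assumptions made for illustration.

```javascript
// Hypothetical reconstruction of a hex-pair decoder like a().
// Assumption: the loop index advances by 2 so each pair is read once.
function decodeHex(b) {
  let out = "";
  for (let c = 0; c < b.length; c += 2) {
    // Take two characters, parse them as a base-16 number,
    // and turn that number into the character with that code.
    out += String.fromCharCode(parseInt(b.substr(c, 2), 16));
  }
  return out;
}

// "68656c6c6f" is a made-up sample: 68 65 6c 6c 6f -> h e l l o
console.log(decodeHex("68656c6c6f")); // hello
```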
Frankly, you don’t even need to understand that function in detail to deobfuscate this code; you just need to recognize that it takes an encoded input string and returns the decoded text. To further examine the code in operation, a debugger can be used. This particular code uses some Windows-specific features like ActiveX and WScript. For this reason, I chose to use the Internet Explorer F12 Developer Tools. Ordinarily, this would not be a good choice, but Firefox and Chrome do not support ActiveX and other proprietary Microsoft features the way IE does. Also, I wanted something with as few security protections as possible, and IE fits the bill. Here is a screenshot of the IE F12 tools in action:
The code actually raised an error on the initial run because it was unable to load a WScript object, but luckily I was able to click the “Step Over” button and the code continued to run, exposing the functionality of the decoder. I placed a debugger; statement inside the decoding function and stepped through the code one line at a time, watching the decoding process happen one character at a time. I also had to allow ActiveX objects when I first started this process so that IE would let the code run. See a character being decoded below:
At this point, we can try taking one of those large, confusing numbers in our code and running it through this decoding function. However, this manual method can be time-consuming, so it is better to try some automated tools first. I tried JSNice, JSDetox, and JSUnpack, and all of them failed to deobfuscate this code. Rather than keep wasting time trying deobfuscators, I decided to just do it manually. To do this, all I did was take the a() function from the code, open Chrome, press F12, open the Sources panel, select Snippets, and paste the function into a new snippet. I then arranged the code so that I could manually control the function input and have console.log() print both the input and the output to the Chrome console. This allowed me to paste each encoded string in and get the decoded string like so:
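A snippet along these lines might look like the sketch below. The body of a() is a reconstruction based on the decoding expression quoted earlier, and the encoded strings are harmless made-up samples, not values from the malware.

```javascript
// Decoder lifted from the sample (reconstructed here as an assumption).
function a(b) {
  let out = "";
  for (let c = 0; c < b.length; c += 2) {
    out += String.fromCharCode(parseInt(b.substr(c, 2), 16));
  }
  return out;
}

// Paste encoded strings here; these are harmless made-up samples.
const encodedStrings = ["575363726970742e5368656c6c", "474554"];

// Pair each input with its decoded output, then print both.
const results = encodedStrings.map((input) => ({ input, output: a(input) }));
for (const { input, output } of results) {
  console.log(input + " -> " + output);
}
// 575363726970742e5368656c6c -> WScript.Shell
// 474554 -> GET
```

Only the decoding function is pasted into the snippet, never the rest of the sample, so the strings can be recovered without running the code that acts on them.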
Eventually, unpacking all of this code led to code which looked like this:
During my manual deobfuscation, I also removed redundant functions and renamed a few objects from their cryptic one-letter names to meaningful names. I left one string obfuscated, which will give you a sense of where we are in the code when you compare it with the original version:
As you can see, the deobfuscated code is far easier to understand and can help solve the mystery of what the code actually does. In this case, the code navigated to a malicious URL and attempted to download more malware. This is a common choice among malware authors because, as we covered in the previous post, scripting attacks can help evade other antivirus systems. PC Matic will still protect your computer from this type of attack.
Deobfuscation tips and suggested workflow
Here is a recommended workflow for deobfuscating JavaScript malware code:
- First determine if the code is obfuscated by examining it
- Next, try automated deobfuscation tools like JSNice, JSUnpack, etc. You may also find JS Beautifier helpful because the code is usually compressed or “minified” into one line. I used JS Beautifier to make the code look like code before I even touched it.
- After formatting the code, read through it and locate functions which are repeatedly called throughout the code
- Examine the definitions of those functions in an attempt to reverse engineer their functionality and/or locate a decoding or encoding function
- If it’s an encoding function, you may be able to reverse it. If it’s a decoding function, you can extract and use it in a Chrome Snippet as we did here
- Decode each symbol and then go back to the original code, replacing each one and using comments and your own function and variable names to make sense of the code
- Keep your eyes peeled for URLs and IP addresses. This malware had a URL to a hacker’s web server.
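Parts of this workflow can be scripted once the decoder is understood. The sketch below is one possible approach, not the method used in this analysis: it treats the sample purely as text (nothing is evaluated), counts function calls to spot a likely decoder, replaces each a("...") call with its decoded string, and then scans for URLs and IPv4 addresses. The source fragment, the helper names, and the regular expressions are all assumptions made for illustration.

```javascript
// Hypothetical beautified fragment, held as text only and never executed.
// The encoded literals are made-up samples, not taken from the malware.
const source = [
  'var s = new ActiveXObject(a("575363726970742e5368656c6c"));',
  'var m = a("474554");',
  'var u = a("687474703a2f2f6578616d706c652e696e76616c69642f78");',
].join("\n");

// Reconstructed hex-pair decoder (assumed shape of a()).
function decodeHex(b) {
  let out = "";
  for (let c = 0; c < b.length; c += 2) {
    out += String.fromCharCode(parseInt(b.substr(c, 2), 16));
  }
  return out;
}

// Count how often each name is followed by "(".
// Rough heuristic: keywords such as "if (" would be counted as well.
function countCalls(code) {
  const counts = {};
  for (const match of code.matchAll(/\b([A-Za-z_$][\w$]*)\s*\(/g)) {
    counts[match[1]] = (counts[match[1]] || 0) + 1;
  }
  return counts;
}

// Replace every a("<hex>") call with the decoded string literal.
function inlineDecoded(code) {
  return code.replace(/\ba\("([0-9a-fA-F]+)"\)/g, (_, hex) =>
    JSON.stringify(decodeHex(hex))
  );
}

// Pull out anything that looks like a URL or an IPv4 address.
function findIndicators(code) {
  const urls = code.match(/https?:\/\/[^\s"'<>)]+/g) || [];
  const ips = code.match(/\b(?:\d{1,3}\.){3}\d{1,3}\b/g) || [];
  return { urls, ips };
}

const callCounts = countCalls(source);
const readable = inlineDecoded(source);
const indicators = findIndicators(readable);

console.log(callCounts);  // a is called 3 times, ActiveXObject once
console.log(readable);    // e.g. var s = new ActiveXObject("WScript.Shell");
console.log(indicators);  // urls: ["http://example.invalid/x"], ips: []
```

Simple pattern matching like this only works when the decoder is called with plain string literals; samples that build their arguments dynamically would still need the manual approach described above.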
Thanks for reading and as always, feel free to drop any questions or comments in the comment section below.