IMBT from InvokeRE

Introduction

If you are interested in learning to reverse engineer modern Windows applications with modern tooling, or if you simply want to learn more about how computers work, then Introduction to Malware Binary Triage (IMBT) can be your guide through a vast sea of information.

I typically encourage people to have some fundamentals down before taking specialized training, but with a free tier and each of the paid tiers providing lifetime access (as of the date of this post), everyone can benefit.

IMBT walks you into a hands-on environment, with all the tools and information you need to take apart real-world binaries and malware samples. There are also quizzes available throughout the course, which are excellent for knowing if more time should be spent on a particular subject. By the end, you will have an excellent foundation, ready to open up anything thrown at you.

I know I might sound like I’m overselling it, but I am just really impressed with not only this training, but the community around it.

Environment Setup

Virtual Machines

Starting off, the training walks through setting up Virtual Machines (VMs) for performing analysis. I opted to deviate here and set up my own VMs on QEMU with Windows 11, though the official instructions work just fine.

I set up one “static analysis” VM and another “dynamic analysis” VM, which was identical to the static analysis VM, except it did not have a virtual Network Interface Controller (NIC) connected.

At this point, the training content will slightly differ between the Binary Ninja edition and the IDA edition. I purchased the Premium edition, which comes with both the Binary Ninja and IDA editions, as well as some extra content. For the sake of this post, I will focus on IDA and mention differences for Binary Ninja where necessary.

Safety

Since the training interacts with actual, real-world malware, it’s important to keep safety in mind. The VM should be reverted to a clean snapshot after analysis, and caution should be used when moving malware around. The training discusses some safety tips and warnings as well. If tools need to be installed later in the training, they should be installed on a fresh boot of a snapshot, and a new snapshot should then be made.

Primer

PE File Format

Straight away, tools such as Detect It Easy are opened up to dissect the metadata of a binary. A variety of information can be seen, which helps identify next steps for analysis: what imports and exports the binary has, what the entropy of each section is, what technology was used to compile it, and so on.
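To make the entropy figure less abstract, here is a minimal Python sketch of Shannon entropy over a buffer of bytes. It is not taken from the training or from Detect It Easy; the function name is my own, and real tools compute this per section rather than over an arbitrary buffer. Values near 8 bits per byte usually suggest compressed or encrypted data, while plain code and text score lower.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # how often each byte value occurs
    # Sum p * log2(1/p) over every byte value that is present.
    return sum((n / total) * math.log2(total / n) for n in counts.values())
```

A buffer made of a single repeated byte scores 0.0, and a buffer containing every byte value exactly once scores 8.0. Feeding this the raw bytes of each section is one way to spot a section that deserves a closer look.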

This is part of what I would consider the “recon” stage of approaching a target. Identifying meta information helps the analyst understand how the file should be approached.
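As a rough illustration of where some of that meta information lives, the sketch below reads a few fields from the start of a PE file using only the Python standard library. The function name and the returned dictionary layout are assumptions of mine, and a dedicated tool or library does far more validation than this.

```python
import struct

# Two common values of the COFF Machine field.
MACHINE_NAMES = {0x014C: "x86", 0x8664: "x64"}


def parse_pe_header(data: bytes) -> dict:
    """Extract a few COFF header fields from the raw bytes of a PE file."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The COFF header follows the signature: Machine, NumberOfSections,
    # TimeDateStamp are its first three fields.
    machine, num_sections, timestamp = struct.unpack_from(
        "<HHI", data, e_lfanew + 4
    )
    return {
        "machine": MACHINE_NAMES.get(machine, hex(machine)),
        "sections": num_sections,
        "timestamp": timestamp,
    }
```

Knowing the architecture and section count up front already answers basic questions, such as which disassembler mode or debugger build to reach for.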

x86

Assembly is at the core of this training, and is frequently utilized throughout. This module covers many of the most common instructions.

Although this is an excellent primer in assembly, doing your own research is frequently required. If something isn’t understood, don’t be afraid to do some googling or even hop into the Discord chat and ask a question.

The Intel 64 and IA-32 Architectures Software Developer’s Manuals (SDMs) are a great reference for understanding what individual instructions do. I recommend taking some time to explore how they can be used, although it is not required.

Program Analysis

This portion of the training includes analyzing actions performed by malware, such as process hollowing, droppers that set up persistence, and other common behaviors. MITRE ATT&CK (https://attack.mitre.org/) is an excellent reference for learning about these.

Static

Tools such as IDA, Binary Ninja, and dnSpyEx are employed for static analysis in this part of the training. Several tasks throughout the next chapters explain how the core functions of these tools are used.

A pattern begins to emerge, which the training does a great job of teaching: a natural progression that starts with analyzing the binary’s metadata before proceeding with static or dynamic analysis.

Marking up the disassembler’s database is explained as a critical part of analyzing a binary. The training goes over using Enums, analyzing functions and their parameters, renaming variables, and other great methods of getting insight on the functionality of real-world malware.
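To show why enums are worth the effort, here is a small Python sketch that turns a raw flags value into readable names, which is roughly what applying an enum to a constant in a disassembler achieves. The values are the documented allocation-type constants for the Windows VirtualAlloc API; the class and helper names are my own and do not come from the training or any tool.

```python
from enum import IntFlag


class AllocationType(IntFlag):
    """A few flAllocationType flags accepted by VirtualAlloc."""
    MEM_COMMIT = 0x1000
    MEM_RESERVE = 0x2000
    MEM_RESET = 0x80000
    MEM_TOP_DOWN = 0x100000


def flag_names(value: int) -> list[str]:
    """Return the names of every known flag set in value, in definition order."""
    return [flag.name for flag in AllocationType if value & flag]
```

Calling flag_names(0x3000) returns ['MEM_COMMIT', 'MEM_RESERVE'], which reads far better in a function call than the bare constant.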

Again, it is important to use resources. Constants, parameters and other helpful information should be located using Google or official Microsoft documentation when available. I’ve also found https://grep.app/ to be a very helpful resource.

Dynamic

Dynamic analysis is shown to be particularly useful when a binary is large or obscure. However, the binary may use methods of detecting dynamic analysis, which would first need to be circumvented.

Tools used for dynamic analysis in the training include the dnSpyEx debugger, as well as x64dbg. With debugging, instead of sifting through mountains of decompilation and disassembly, you can break on a function call and inspect the process state to find what you’re looking for. You can even edit the state to take a desired code path, such as one which avoids debugger detection. Network monitoring is also included here, which can help gather more information about the binary by observing what it does on the network while running.

During this portion of the training, I found the x64dbg documentation to be helpful. I cannot stress enough that resources are important, not only for this training, but for real-world roles where these or similar tools may be used.

Packers

Generally speaking, when a program conceals its intent in any way, it’s said to be “packed”. This term is used loosely in the community. That said, obfuscation tends to refer to strings and data, whereas packing is usually oriented towards hidden code, or entire secondary binaries concealed within the primary binary.

The training covers detecting obfuscated strings in the binary and recovering the original values with Python scripts or dynamic analysis.
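As an example of what such a script can look like, the sketch below handles the simplest case I can think of: a string obfuscated with a single-byte XOR key. This is an assumed scheme chosen for illustration, not necessarily what the training’s samples use, and the function names are hypothetical.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)


def brute_force_xor(data: bytes, marker: bytes) -> list[tuple[int, bytes]]:
    """Try all 256 keys and keep every result that contains marker."""
    hits = []
    for key in range(256):
        candidate = xor_bytes(data, key)
        if marker in candidate:
            hits.append((key, candidate))
    return hits
```

Because XOR is its own inverse, the same function both encodes and decodes. Searching for a marker that is likely to appear in the plaintext, such as b"http", narrows 256 candidates down to a handful that can be checked by eye.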

Packed binaries typically use a stub that unpacks the contained executable code into a usable state and then transfers execution to it. Both dynamic and static analysis approaches are covered, explaining how to identify and analyze packed binaries.

YARA

YARA is noted as a common rule format shared across the malware analysis community. It is useful for identifying certain functionality within a binary, or for determining whether a malware binary belongs to a particular family.

The training requires YARA rules to be written to match against real malware. The strongest YARA rules are those that match against portions of the binary that are difficult for the malware author to change, and rules should be written with that in mind.
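To give a feel for the idea without requiring any extra tooling, here is a Python stand-in that mimics one feature of the rule format: a hex pattern where ?? matches any single byte. This is not the rule engine itself and supports none of its other features; the function name and the example pattern are my own. Wildcarding the bytes that change between builds, while anchoring on the instruction bytes that do not, is one way to target the hard-to-change parts of a binary.

```python
import re


def hex_pattern_to_regex(pattern: str) -> re.Pattern:
    """Compile a space-separated hex pattern with ?? wildcards to a bytes regex."""
    parts = []
    for token in pattern.split():
        if token == "??":
            parts.append(b".")  # wildcard: any single byte
        else:
            parts.append(re.escape(bytes([int(token, 16)])))
    # DOTALL so the wildcard also matches the newline byte 0x0A.
    return re.compile(b"".join(parts), re.DOTALL)
```

For instance, the pattern "6A 40 68 ?? ?? 00 00" matches a push of the 8-bit immediate 0x40 followed by a push of a 32-bit immediate whose low two bytes may vary, and calling search() on a buffer returns the first match or None.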

Automation

Throughout the training and in the community, various automation tools are showcased or suggested. Some of the most frequently mentioned include Assemblyline, ANY.RUN, Joe Sandbox, and UnpacMe.

Other tools mentioned that are very helpful for analysis and continued learning include VirusTotal, MalwareBazaar, and MalShare.

Conclusion

A comprehensive exam must be passed in order to complete the training and receive the certificate. After finishing, I felt I had found the structure I needed to confidently continue building my reverse engineering skill set.

I’d like to mention once again that the community around InvokeRE is absolutely incredible. The Discord server is a fantastic resource, where you’re sure to find helpful people. Even developers of great tools such as Binary Ninja can be found participating in the conversations there.

As for me, the training took around a month to complete. When I chose to pursue IMBT, I already had some experience with C, AMD64, and Windows internals. Again, I really don’t think this is a requirement. I believe that with any subject in the world of computers, the key is to take your time. Make sure you understand the topic at hand to the best of your ability before continuing, and don’t get frustrated if you have to come back to it.

Thank you for reading.

Good luck out there, and never stop learning!