What It Does
IM-LLVM-Pass is a compiler pass that runs during the LLVM compilation pipeline and renames all internal (non-exported) functions and global variables to random-looking strings — before the binary is produced.
Source code → Clang → LLVM IR → [IM-LLVM-Pass] → mangled IR → Object code → Binary
The result: a binary where internal symbols like checkLicenseKey or decrypt_payload become _Zf3a9b1c — the program behaves identically, but reverse engineers can’t use symbol names as a starting point.
Why IR Level
The pass operates on LLVM IR (Intermediate Representation), not on source code or the final binary.
This matters because:
- Source-level obfuscation requires understanding the language’s AST and semantics — fragile, language-specific
- Binary-level patching after compilation can break relocations and debug info
- IR-level is language-agnostic (any language that compiles to LLVM works), operates before final code generation, and can safely rename symbols while preserving cross-references
How It Works
The pass is a Module Pass: it sees a whole LLVM module at once (one translation unit, or the entire program when modules are merged under LTO), not just one function.
For each function in the module:
→ Skip if external linkage (exported — renaming would break ABI)
→ Skip if named "main" (entry point must remain findable)
→ Generate a new name: seed PRNG with a hash of the symbol name → produce random-looking string
→ Rename the function everywhere it's referenced
Same for global variables.
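The selection rules above can be sketched with a toy model in plain C++. This is not the pass's actual code and it does not use the LLVM API: `Symbol`, `Linkage`, `shouldRename`, and `renameSymbols` are hypothetical names made up for illustration. In a real module pass the same checks would presumably run over the functions and globals of an `llvm::Module`, using calls such as `hasExternalLinkage()` and `setName()`.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for an LLVM global symbol; not the LLVM API.
enum class Linkage { External, Internal };

struct Symbol {
    std::string name;
    Linkage linkage;
};

// Decide whether a symbol may be renamed, following the rules listed above.
bool shouldRename(const Symbol &sym) {
    if (sym.linkage == Linkage::External) return false; // exported: part of the ABI
    if (sym.name == "main") return false;               // entry point must stay findable
    return true;
}

// Visit every symbol in the toy "module" and rename the eligible ones.
// makeName is any deterministic name generator; returns how many were renamed.
std::size_t renameSymbols(
    std::vector<Symbol> &module,
    const std::function<std::string(const std::string &)> &makeName) {
    assert(makeName && "a name generator is required");
    std::size_t renamed = 0;
    for (Symbol &sym : module) {
        if (!shouldRename(sym)) continue;
        sym.name = makeName(sym.name);
        ++renamed;
    }
    return renamed;
}
```

Calling `renameSymbols(module, [](const std::string &s) { return "_Z" + s; })` on a vector of symbols leaves every external symbol and anything named `main` alone, and rewrites the rest. Keeping the name generator as a parameter keeps the skip logic separate from the naming scheme.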
Name generation
The PRNG used is std::mt19937 (Mersenne Twister), seeded with a hash of the original symbol name. This makes the renaming deterministic — same source, same pass, same output every time. Useful for reproducible builds and debugging the pass itself.
The generated names follow a pattern that looks like compiler-generated symbols, making them blend in rather than stand out as obviously obfuscated.
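A minimal sketch of this kind of deterministic name generation follows. The hash function, the `_Z` prefix plus eight hex digits, and the names `fnv1a` and `mangledName` are assumptions for illustration; the source only says that `std::mt19937` is seeded with a hash of the original name. FNV-1a is used here because its result is fixed by definition, whereas `std::hash` may differ between standard libraries. The sketch also reads the engine's raw output instead of going through a distribution, since the `std::mt19937` sequence is specified by the standard while distribution results are implementation-defined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>

// FNV-1a, 32-bit: a simple hash with a fixed definition, so the seed is the
// same on every platform (std::hash makes no such promise).
std::uint32_t fnv1a(const std::string &s) {
    std::uint32_t h = 2166136261u;       // FNV offset basis
    for (unsigned char c : s) {
        h ^= c;
        h *= 16777619u;                  // FNV prime
    }
    return h;
}

// Hypothetical generator: "_Z" followed by `digits` lowercase hex digits.
std::string mangledName(const std::string &original, std::size_t digits = 8) {
    assert(digits > 0 && "need at least one hex digit");
    static const char kHex[] = "0123456789abcdef";
    std::mt19937 rng(fnv1a(original));   // seed depends only on the name
    std::string out = "_Z";
    for (std::size_t i = 0; i < digits; ++i) {
        out += kHex[rng() & 0xF];        // low 4 bits of the raw engine output
    }
    return out;
}
```

Because the seed is derived only from the input string, `mangledName("checkLicenseKey")` returns the same string on every run. Two different symbols could in principle map to the same generated name, so a real pass would need some way of handling collisions.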
Build & Use
Prerequisites
- LLVM 14+ development headers
- CMake 3.13+
- Clang (to compile targets)
What changes
Before the pass — nm target | grep T:
000000000000 T main
000000000000 T checkLicenseKey
000000000000 T decrypt_payload
000000000000 T computeChecksum
After the pass:
000000000000 T main
000000000000 T _Zf3a9b1c
000000000000 T _Z8d2e4f1a
000000000000 T _Z1b7c9d3e
main is preserved. Every other function name has been replaced.
IR and Assembly Diff
The pass produces visible changes at the IR level. Comparing test.ll (before the pass) with mangled-test.ll (after the pass) shows the renamed symbols.
The call site in main is updated automatically — the pass handles all references.
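One way to see why call sites follow a rename is a toy model in plain C++. In LLVM's in-memory IR, an instruction refers to the function object it calls rather than to a copy of its name, so changing the name once is visible at every use. The `Func`, `CallSite`, and `printCall` types below are hypothetical stand-ins, not LLVM classes, and the printed text only loosely imitates textual IR.

```cpp
#include <cassert>
#include <string>

// Toy IR: a function object owns its name.
struct Func {
    std::string name;
};

// A call site points at the function object, not at a copy of the name.
struct CallSite {
    const Func *callee;
};

// Render a call the way textual IR would, reading the callee's current name.
std::string printCall(const CallSite &call) {
    assert(call.callee && "call site must have a callee");
    return "call @" + call.callee->name + "()";
}
```

With `Func f{"checkLicenseKey"}` and two call sites built from `&f`, assigning `f.name = "_Zf3a9b1c"` changes what `printCall` returns for both of them. No per-call-site bookkeeping is involved, which is the property the pass relies on.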
📷 [IR diff — see /example/IR-diff.png in the repo]
📷 [Assembly diff — see /example/assembly-diff.png in the repo]
Limitations
This is a learning project — not production obfuscation. Known limitations:
- main is always preserved: necessary for the binary to function, but it’s a known entry point
- External symbols untouched: anything with external linkage keeps its name (by design, since renaming would break the ABI)
- Debug symbols: if you compile with -g, DWARF debug info may still contain original names
- No control flow obfuscation: symbol renaming alone doesn’t change the control flow graph, which is what most serious reverse engineers analyze
- Standalone tool: not integrated into a build system or CI pipeline; has to be explicitly loaded
For real obfuscation needs, tools like OLLVM or commercial solutions add control flow flattening, bogus control flow, and instruction substitution on top of renaming.
What I Learned
Building this pass required understanding LLVM’s pass infrastructure at a level that generic tutorials don’t cover:
- The difference between Function Passes and Module Passes, and why symbol renaming requires module scope
- How LLVM tracks symbol references: renaming a function requires updating every call and reference in the IR, not just the definition
- Why external linkage symbols can’t be renamed: they’re part of the ABI contract with the linker
- How std::mt19937 seeding works and why deterministic obfuscation matters for reproducibility
Resources
- LLVM Writing a Pass documentation
- LLVM Language Reference (IR)
- LLVM Programmer’s Manual
- OLLVM — Obfuscator-LLVM — more complete obfuscation framework