Android Fuzzing — Practical Lessons from Negative Results

Research project — January to May 2023. The goal was to study generative and mutative fuzzing techniques, evaluate open-source Android fuzzers, set up a real environment, and attempt to reproduce a known multimedia vulnerability. We didn’t fully get there — and documenting why is the point.

Objective

Two parts:

Study: survey fuzzing techniques and evaluate existing tools, namely AFL, Droid-FF, and whatever else existed for Android at the time
Practice: build a working environment and try to reproduce a known vulnerability in the Android media stack (Stagefright and its successor frameworks)

Target: media file parsing. The Android media framework has historically been a rich attack surface — malformed MP4, H.264, or AAC input fed to the media decoder stack has produced real CVEs.

Background — Why Android Fuzzing Is Hard

Fuzzing a desktop binary is relatively simple: compile with AddressSanitizer, point a fuzzer at the binary, watch for crashes. Android adds friction at every level:

Desktop fuzzing:
  Fuzzer → Target binary → Crash → Done

Android fuzzing:
  Fuzzer → ADB bridge → Android VM/device
         → Sandboxed app → IPC → System process (mediaserver)
         → HAL → (device-specific behavior)
         → Crash somewhere → (maybe) detected → reported back over ADB

Each arrow is a failure point. ADB alone adds enough per-execution latency to destroy fuzzing throughput, and coverage feedback from a remote, sandboxed process is hard to obtain.

Tools Evaluated

AFL — American Fuzzy Lop

AFL is the reference coverage-guided fuzzer. It instruments a target binary at compile time to track which code paths are executed. Inputs that reach new paths are kept and mutated further — this is what makes it “smart” compared to pure random mutation.

Corpus → Mutation → Execute → Coverage bitmap → New path found? → Keep input
                                              ↓
                                         Mutate again
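The loop in the diagram can be sketched in a few lines. This is a toy model, not AFL itself: the `target` function, its branch names, and the single-byte mutator are all hypothetical stand-ins, and real AFL tracks edge coverage in a shared-memory bitmap rather than a Python set.

```python
import random

def target(data: bytes) -> set:
    """Toy target: returns the set of 'branches' this input reached."""
    covered = {"entry"}
    if len(data) > 0 and data[0] == ord("M"):
        covered.add("magic_1")
        if len(data) > 1 and data[1] == ord("P"):
            covered.add("magic_2")
            if len(data) > 2 and data[2] == ord("4"):
                covered.add("magic_3")
    return covered

def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one randomly chosen byte with a random value."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)

def fuzz(seeds, iterations, rng):
    """Keep any mutated input that reaches coverage not seen before."""
    corpus = list(seeds)
    seen = set()
    for seed in corpus:
        seen |= target(seed)
    for _ in range(iterations):
        child = mutate(rng.choice(corpus), rng)
        coverage = target(child)
        if not coverage <= seen:      # new path found?
            seen |= coverage
            corpus.append(child)      # keep input, mutate it again later
    return corpus, seen

if __name__ == "__main__":
    corpus, seen = fuzz([b"AAAA"], 200_000, random.Random(0))
    print(sorted(seen), len(corpus))
```

Keeping the intermediate inputs is what lets the loop discover the three-byte prefix one byte at a time instead of having to guess all three bytes in a single mutation.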

AFL works well when:

You control the build (so you can compile with AFL's instrumenting compiler wrappers, optionally adding -fsanitize=address)
The target reads from stdin or a file
Execution is fast (thousands of runs per second)

AFL’s problem for Android:

System binaries can’t easily be recompiled with AFL instrumentation
QEMU mode (coverage feedback for binary-only targets, without compile-time instrumentation) works but is roughly 2-5x slower than an instrumented build
ADB adds round-trip latency — you can’t get anywhere near the run/second rates AFL needs to be effective

Droid-FF

Droid-FF is an Android-specific fuzzing framework designed to work over ADB. It handles the communication layer so you can focus on the fuzzing logic itself.

Evaluated against AFL for:

Setup complexity
Throughput (runs/second)
Coverage feedback quality
Integration with existing corpora

In practice: Droid-FF simplified the ADB communication but still couldn’t overcome the fundamental latency problem. Both tools ended up bottlenecked by the bridge.

Environment Setup

Two-VM architecture

┌─────────────────────────────────┐
│  Host (Linux)                   │
│                                 │
│  ┌───────────────┐              │
│  │ Fuzzer VM     │              │
│  │ (Ubuntu)      │──── ADB/TCP ─┼──┐
│  │ AFL / Droid-FF│              │  │
│  └───────────────┘              │  │
│                                 │  │
│  ┌───────────────┐              │  │
│  │ Android VM    │◄─────────────┘  │
│  │ (QEMU/AVD)    │                 │
│  │ Target process│                 │
│  └───────────────┘                 │
└─────────────────────────────────┘

ADB ran over TCP between the two VMs, with no USB involved. USB ADB can have slightly better throughput, but the difference turned out to be negligible given the other bottlenecks.

Target

The plan was to target stagefright / mediaserver, the Android component responsible for parsing and decoding media files, by sending it malformed MP4 files to trigger parsing bugs.

# Push a test file to the device
adb push malformed.mp4 /sdcard/test.mp4

# Trigger media parsing via intent
adb shell am start -a android.intent.action.VIEW \
  -d file:///sdcard/test.mp4 -t video/mp4
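For use inside a fuzzing loop, the two commands above can be wrapped in a small driver. The following is a minimal sketch, not the script used in the project: the function names, the `serial` parameter, and the 30-second timeout are assumptions made for illustration. Only the `adb` arguments are taken from the commands above, plus `adb -s` for selecting a device.

```python
import subprocess

def build_commands(local_path, remote_path="/sdcard/test.mp4", serial=None):
    """Return the adb invocations for one test case as argument lists."""
    adb = ["adb"] + (["-s", serial] if serial else [])
    push = adb + ["push", local_path, remote_path]
    view = adb + [
        "shell", "am", "start",
        "-a", "android.intent.action.VIEW",
        "-d", "file://" + remote_path,
        "-t", "video/mp4",
    ]
    return [push, view]

def run_case(local_path, timeout=30.0, **kwargs):
    """Push one file and trigger parsing; raises if adb fails or hangs."""
    for cmd in build_commands(local_path, **kwargs):
        subprocess.run(cmd, check=True, timeout=timeout, capture_output=True)

if __name__ == "__main__":
    for cmd in build_commands("malformed.mp4"):
        print(" ".join(cmd))
```

Building the argument lists separately from running them keeps the command construction testable without a device attached.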

What Didn’t Work and Why

1. Throughput was too low

AFL needs thousands of executions per second to be effective. With the ADB bridge, we were getting maybe 5–20 per second. That’s not enough to cover the search space meaningfully.

Solutions tried:

ADB over TCP instead of USB (negligible difference)
Reducing test case size
Parallelizing across multiple Android VMs

Even with parallelization, throughput stayed far below what would make coverage-guided fuzzing practical.
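A rough calculation shows why parallelization does not close the gap. The campaign size of 10 million executions, the 8-VM count, and the 2,000 executions per second figure for an in-process fuzzer are hypothetical values chosen for illustration. Only the 20 executions per second upper bound comes from the measurements above. Linear scaling across VMs is a best-case assumption.

```python
def hours_needed(executions, execs_per_second, workers=1):
    """Wall-clock hours to run `executions` test cases.

    Assumes throughput scales linearly with the number of workers,
    which is a best case for VMs sharing one host.
    """
    return executions / (execs_per_second * workers) / 3600.0

BUDGET = 10_000_000  # hypothetical campaign size, not a project figure

if __name__ == "__main__":
    scenarios = [
        ("ADB, 1 VM at 20 exec/s", 20, 1),
        ("ADB, 8 VMs at 20 exec/s each", 20, 8),
        ("in-process at 2000 exec/s", 2000, 1),
    ]
    for label, rate, workers in scenarios:
        print(f"{label}: {hours_needed(BUDGET, rate, workers):.1f} h")
```

Under these assumptions the single-VM setup needs about 139 hours, eight VMs about 17 hours, and the in-process fuzzer about 1.4 hours for the same number of executions.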

2. Crash detection was unreliable

When mediaserver crashes on Android, the process is restarted by the system automatically. Detecting whether a crash happened — and which input caused it — required parsing logcat output, which added more latency and complexity.

adb logcat -s "AndroidRuntime:E" "DEBUG:*"

This worked but was fragile. Crashes from unrelated system processes appeared in the same log stream.
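One way to cut the noise is to keep only native crash reports whose process name matches the target. The sketch below assumes the usual shape of a native crash header in logcat, with a `>>> process <<<` line followed by a `signal N (NAME)` line. The sample lines are illustrative, not captured from the project, and the function name is hypothetical.

```python
import re

PROCESS_RE = re.compile(r">>> (\S+) <<<")
SIGNAL_RE = re.compile(r"signal (\d+) \((\w+)\)")

def native_crashes(lines, wanted="mediaserver"):
    """Return (process, signal_name) for crashes of the wanted process only."""
    found = []
    process = None
    for line in lines:
        match = PROCESS_RE.search(line)
        if match:
            process = match.group(1)  # header line of a new crash report
            continue
        match = SIGNAL_RE.search(line)
        if match and process is not None:
            if process.rsplit("/", 1)[-1] == wanted:
                found.append((process, match.group(2)))
            process = None  # report consumed; wait for the next header
    return found

# Illustrative lines only; real output varies by Android version.
SAMPLE = [
    "F DEBUG   : pid: 400, tid: 400, name: surfaceflinger  >>> /system/bin/surfaceflinger <<<",
    "F DEBUG   : signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------",
    "F DEBUG   : pid: 612, tid: 640, name: mediaserver  >>> /system/bin/mediaserver <<<",
    "F DEBUG   : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x10",
]

if __name__ == "__main__":
    print(native_crashes(SAMPLE))
```

On newer Android releases the media stack is split across several processes, so `wanted` would need to be adjusted to whichever process hosts the parser being tested.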

3. The emulator behaved differently from real hardware

Some behaviors specific to the Qualcomm or MediaTek HAL implementations couldn’t be reproduced in the QEMU-based Android emulator. The vulnerability we were trying to reproduce had hardware-specific components.

4. Corpus quality

A good fuzzing corpus requires valid seed files that exercise the code paths you care about. We started with generic MP4 samples, but without knowing which specific codec paths were vulnerable, seed selection was essentially guesswork.

What Actually Worked

Even without reproducing the target vulnerability, the project produced useful results:

Infrastructure: A working two-VM fuzzing setup over ADB/TCP that could reliably push files and trigger media parsing intents. This is reusable for future fuzzing targets.

Crash triage: A logcat parsing script that filters crash output and correlates it with the input file that triggered it.
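The project's script is not reproduced here. As an illustration of one way such a correlation can work, the sketch below assumes the fuzzer writes a marker line into the device log before each test case (for example with the `log` shell command and a hypothetical `FUZZ` tag) and attributes each crash header to the most recent marker. All names and sample lines are assumptions.

```python
import re

MARKER_RE = re.compile(r"FUZZ\s*: input=(\S+)")
CRASH_RE = re.compile(r">>> (\S+) <<<")

def marker_command(input_name, serial=None):
    """adb invocation that writes a marker for the next test case."""
    adb = ["adb"] + (["-s", serial] if serial else [])
    return adb + ["shell", "log", "-t", "FUZZ", "input=" + input_name]

def correlate(lines):
    """Pair each crash header with the last input marker seen before it."""
    current_input = None
    hits = []
    for line in lines:
        marker = MARKER_RE.search(line)
        if marker:
            current_input = marker.group(1)
            continue
        crash = CRASH_RE.search(line)
        if crash:
            hits.append((current_input, crash.group(1)))
    return hits

# Illustrative log excerpt, not captured output.
SAMPLE = [
    "I FUZZ    : input=case_0001.mp4",
    "I FUZZ    : input=case_0002.mp4",
    "F DEBUG   : pid: 612, tid: 640, name: mediaserver  >>> /system/bin/mediaserver <<<",
    "I FUZZ    : input=case_0003.mp4",
]

if __name__ == "__main__":
    print(correlate(SAMPLE))
```

This is a heuristic: a crash that is reported late can land after the next marker, so a suspicious input should be replayed on its own before it is trusted.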

Corpus: A set of mutated MP4 files, some of which caused non-crashing anomalies (ANRs, parser errors) worth investigating further.

Understanding: A clear picture of where Android fuzzing bottlenecks are and what would be needed to overcome them — mainly, getting instrumentation inside mediaserver to get real coverage feedback.

What It Would Take to Do This Right

In-process fuzzing: compile a standalone harness that calls the media parser directly, without the ADB overhead. This is the approach taken by libFuzzer-based harnesses built inside the Android source tree.
Better corpus and mutation: use format-aware mutation (knowing the MP4 box structure) rather than pure byte-level mutation. Structure-aware approaches, such as custom mutators for libFuzzer or AFL++, or a custom grammar-based fuzzer, would help here.
Real hardware with ASan: build Android from source with AddressSanitizer enabled to get memory error detection on the device itself.
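To make the format-aware idea concrete, here is a minimal sketch of a box-level MP4 mutator. It relies only on the basic box layout (a 4-byte big-endian size that includes the 8-byte header, followed by a 4-byte type). It handles top-level boxes with 32-bit sizes only, and the list of "interesting" sizes is an assumption for illustration, not a mutation strategy that was tested in this project.

```python
import random
import struct

def make_box(box_type, payload=b""):
    """Build one box: 32-bit size (header included), 4-byte type, payload."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

def parse_boxes(data):
    """Return (offset, size, type) for each well-formed top-level box."""
    boxes = []
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        if size < 8 or offset + size > len(data):
            break  # 64-bit, to-end-of-file, and corrupt sizes are out of scope
        boxes.append((offset, size, box_type))
        offset += size
    return boxes

def mutate_box_size(data, rng):
    """Corrupt the declared size of one box, leaving every other byte intact."""
    boxes = parse_boxes(data)
    if not boxes:
        return data
    offset, size, _ = rng.choice(boxes)
    interesting = [0, 1, 7, size - 1, size + 1, 0x7FFFFFFF, 0xFFFFFFFF]
    out = bytearray(data)
    struct.pack_into(">I", out, offset, rng.choice(interesting))
    return bytes(out)

if __name__ == "__main__":
    seed = make_box(b"ftyp", b"isom\x00\x00\x00\x00") + make_box(b"free")
    print(parse_boxes(seed))
    print(mutate_box_size(seed, random.Random(1)).hex())
```

Unlike a random byte flip, every output still has valid box types at the expected offsets, so the parser gets far enough to act on the corrupted length field. A fuller mutator would also recurse into container boxes such as `moov` and `trak`.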

What I Learned

The most honest takeaway from this project: negative results are results. The infrastructure limitations we hit are real — they’re not unique to this lab, they’re why Android fuzzing is still an active research area.

More specifically:

Coverage-guided fuzzing requires high throughput — anything that adds latency (ADB, VMs, IPC) kills effectiveness
Crash detection on a live OS is harder than on a single binary — you need to filter noise
Emulator fidelity matters for hardware-specific vulnerabilities
Format-aware mutation is generally more effective than pure byte flipping for structured file formats, because more of the mutated inputs survive the parser's initial validity checks