This tutorial teaches you how to use JAFAR's heap dump analysis capabilities for exploring and diagnosing Java heap dumps (HPROF files). You'll learn the HdumpPath query language and common analysis patterns.
- Prerequisites
- Creating a Heap Dump
- Opening a Heap Dump
- Basic Queries
- Object Analysis
- Class Analysis
- GC Root Analysis
- Memory Leak Detection
- What-If Simulation — `objects/Foo | whatif()`
- Object Age Estimation
- Cache Profiling — `objects/LinkedHashMap | cacheStats()`
- Heap Health Report — `report`
- Common Analysis Patterns
- Tips and Best Practices
## Prerequisites

- JAFAR shell installed (see main README for installation)
- A Java heap dump file (`.hprof` format)
- Basic understanding of Java memory concepts
## Creating a Heap Dump

If you don't have a heap dump, create one from a running Java process:

```
# Find Java process ID
jps -l

# Create heap dump
jcmd <pid> GC.heap_dump /path/to/dump.hprof

# Or with jmap
jmap -dump:format=b,file=/path/to/dump.hprof <pid>
```

Or programmatically, from within the application:

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

HotSpotDiagnosticMXBean bean = ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
bean.dumpHeap("/path/to/dump.hprof", true);
```

Add a JVM flag to automatically dump on OOM:

```
java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/path/to/dumps/ MyApp
```

## Opening a Heap Dump

Start the shell and open your heap dump:
```
# Start with a heap dump
jafar-shell /path/to/dump.hprof

# Or start empty and open later
jafar-shell
jafar> open /path/to/dump.hprof
```

When a heap dump is opened, you'll see basic statistics:

```
Opened heap dump: dump.hprof
Size: 158 MB
Objects: 1,923,456
Classes: 16,831
GC Roots: 4,200
```
## Basic Queries

```
# Total objects
hdump> objects | count

# String objects
hdump> objects/java.lang.String | count

# Objects larger than 1KB
hdump> objects[shallow > 1KB] | count
```

```
# First 10 objects
hdump> objects | head(10)

# First 10 strings
hdump> objects/java.lang.String | head(10)
```

```
# Shallow size statistics for all objects
hdump> objects | stats(shallow)
```

| count   | sum        | min | max     | avg  |
|---------|------------|-----|---------|------|
| 1923456 | 78,345,280 | 16  | 1048576 | 40.7 |

```
# Top 10 by shallow size
hdump> objects | top(10, shallow)

# Large objects (>1MB)
hdump> objects[shallow > 1MB] | select(class, shallow)
```

## Object Analysis

```
# Group objects by class
hdump> objects | groupBy(class, agg=count) | top(10, count)
```

| class                  | count   |
|------------------------|---------|
| java.util.HashMap$Node | 249,734 |
| java.lang.String       | 238,750 |
| java.lang.Object[]     | 156,234 |
| char[]                 | 145,632 |

```
# Total memory by class (sum shallow sizes)
hdump> objects | groupBy(class, agg=sum) | top(10, sum)
```

| class            | sum        |
|------------------|------------|
| byte[]           | 45,234,567 |
| char[]           | 12,345,678 |
| java.lang.String | 9,072,500  |

Strings often dominate heap usage:
```
# String statistics
hdump> objects/java.lang.String | stats(shallow)

# Large strings
hdump> objects/java.lang.String[shallow > 1KB] | top(10, shallow)

# String count by size range
hdump> objects/java.lang.String[shallow > 100] | count
```

Arrays can consume significant memory:
```
# Large arrays
hdump> objects[arrayLength > 10000] | top(10, shallow)

# Byte arrays (common for buffers)
hdump> objects/byte[] | stats(shallow)

# Object arrays
hdump> objects/java.lang.Object[] | top(10, shallow)
```

## Class Analysis

```
# All classes
hdump> classes | count

# Classes with most instances
hdump> classes | top(10, instanceCount)
```

| name                   | instanceCount |
|------------------------|---------------|
| java.util.HashMap$Node | 249,734       |
| java.lang.String       | 238,750       |
| java.lang.Object[]     | 156,234       |

```
# Search by name pattern
hdump> classes/java.util.* | select(name, instanceCount)

# Classes with many instances
hdump> classes[instanceCount > 1000] | sortBy(instanceCount desc)
```

```
# Find implementations of Map
hdump> objects/instanceof/java.util.Map | groupBy(class) | top(10, count)
```

| class                                  | count  |
|----------------------------------------|--------|
| java.util.HashMap                      | 12,345 |
| java.util.LinkedHashMap                | 5,678  |
| java.util.concurrent.ConcurrentHashMap | 1,234  |

```
# All collections
hdump> objects/instanceof/java.util.Collection | groupBy(class) | top(10, count)

# List implementations
hdump> objects/instanceof/java.util.List | groupBy(class) | top(5, count)

# Set implementations
hdump> objects/instanceof/java.util.Set | groupBy(class) | top(5, count)
```

## GC Root Analysis

GC roots are entry points that keep objects alive.
```
# GC roots by type
hdump> gcroots | groupBy(type, agg=count) | sortBy(count desc)
```

| type         | count |
|--------------|-------|
| JAVA_FRAME   | 2,345 |
| THREAD_OBJ   | 1,234 |
| JNI_GLOBAL   | 456   |
| STICKY_CLASS | 165   |

```
# Thread-related roots
hdump> gcroots/THREAD_OBJ | head(10)

# Java frame roots (stack variables)
hdump> gcroots/JAVA_FRAME | head(10)

# JNI global references (potential leaks)
hdump> gcroots/JNI_GLOBAL | head(20)
hdump> gcroots/JNI_GLOBAL | count
```

## Memory Leak Detection

The shell includes 6 built-in leak detectors that can be run with `checkLeaks`:
```
# Run all detectors (interactive wizard)
hdump> checkLeaks()

# Run a specific detector
hdump> checkLeaks(detector="threadlocal-leak")
```

Available detectors:
| Detector | Description |
|---|---|
| `threadlocal-leak` | Find ThreadLocal instances with large retained sizes attached to threads |
| `classloader-leak` | Detect ClassLoader instances that should be GC'd but are retained |
| `duplicate-strings` | Find identical string values with high instance counts |
| `growing-collections` | Detect collections (HashMap, ArrayList, etc.) with large retained sizes |
| `listener-leak` | Find event listeners that may not have been deregistered |
| `finalizer-queue` | Detect objects stuck in the finalizer queue |
Detector parameters:
```
# Set minimum size threshold (default varies by detector)
hdump> checkLeaks(detector="growing-collections", minSize=10MB)

# Set count threshold
hdump> checkLeaks(detector="duplicate-strings", threshold=100)
```

Retained size shows how much memory would be freed if an object were garbage collected:
```
# Top objects by retained size
hdump> objects | top(10, retained)

# Classes by total retained memory
hdump> objects | groupBy(class) | top(10, retained)

# Large retained objects of specific type
hdump> objects/java.util.HashMap[retained > 10MB] | top(10, retained)
```

Find why an object is kept alive:
```
# Find retention path for a specific object
hdump> objects[id = 0x12345678] | pathToRoot

# Find path for largest HashMap
hdump> objects/java.util.HashMap | top(1, retained) | pathToRoot
```

See what objects dominate memory (keep other objects alive):
```
# Dominator tree for an object
hdump> objects[id = 0x12345678] | dominators

# Dominators grouped by class
hdump> objects[id = 0x12345678] | dominators(groupBy="class")

# Full dominator tree view
hdump> objects[id = 0x12345678] | dominators("tree")
```

Find over-allocated collections wasting memory:
```
# Analyze HashMap waste — find over-allocated instances
hdump> objects/java.util.HashMap | waste() | sortBy(wastedBytes desc) | top(20)

# Find collections with large capacity but few entries
hdump> objects/java.util.HashMap | waste() | filter(loadFactor < 0.1) | top(20)

# Aggregate waste by collection class (analyze each supported type separately)
hdump> objects/java.util.HashMap | waste() | groupBy(class, agg=sum, value=wastedBytes)
hdump> objects/java.util.ArrayList | waste() | groupBy(class, agg=sum, value=wastedBytes)
```

The `clusters` root uses graph-based analysis to detect unknown leak patterns by identifying densely-connected subgraphs with high retained size but few GC root anchors.
```
# List all detected clusters ranked by suspiciousness
hdump> clusters | sortBy(score desc) | head(10)

# Filter for significant clusters
hdump> clusters | filter(retainedSize > 10MB)

# Drill into a specific cluster to see its member objects
hdump> clusters[id = 3] | objects | sortBy(retained desc) | head(10)

# Find clusters anchored by thread objects
hdump> clusters | filter(anchorType = "THREAD_OBJ") | sortBy(retainedSize desc)
```

The leak score (`retainedSize / rootPathCount`) indicates how suspicious a cluster is: fewer GC root paths holding more memory is more suspicious.
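The scoring rule above can be sketched in a few lines. This is an illustrative helper, not JAFAR's actual implementation, and the cluster figures are made up:

```python
def leak_score(retained_size: int, root_path_count: int) -> float:
    """score = retainedSize / rootPathCount: more memory held through
    fewer GC root paths is more suspicious."""
    return retained_size / max(root_path_count, 1)  # guard against zero paths

# Hypothetical clusters from a dump
clusters = [
    {"id": 1, "retained": 50_000_000, "rootPaths": 40},  # widely shared, less suspicious
    {"id": 2, "retained": 20_000_000, "rootPaths": 1},   # single anchor holds 20 MB
    {"id": 3, "retained": 5_000_000, "rootPaths": 2},
]

ranked = sorted(clusters,
                key=lambda c: leak_score(c["retained"], c["rootPaths"]),
                reverse=True)
print([c["id"] for c in ranked])  # [2, 3, 1]: cluster 2 ranks first
```

Note how cluster 1 retains the most memory overall but ranks last: its 40 root paths suggest legitimate shared state rather than a leak.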
```
# Classes with most instances
hdump> classes | top(20, instanceCount)

# Focus on application classes
hdump> classes/com.myapp.* | top(10, instanceCount)
```

```
# Large HashMaps by retained size
hdump> objects/java.util.HashMap | top(10, retained)

# Large ArrayLists
hdump> objects/java.util.ArrayList[arrayLength > 1000] | head(10)
```

```
# Use the built-in detector
hdump> checkLeaks(detector="duplicate-strings", threshold=50)

# Or manual analysis
hdump> objects/java.lang.String | stats(shallow)
```

```
# Find cache-like structures
hdump> objects/instanceof/java.util.Map | groupBy(class, agg=count)

# Large Maps that might be caches
hdump> classes/*Cache* | select(name, instanceCount)
```

## What-If Simulation

The `whatif()` pipeline operator answers prioritisation questions like "if I fix this leak, how much memory would be freed?" Pipe any object or GC root query into `whatif()` to get a simulation result.

Output columns: `action`, `targetCount`, `freedBytes`, `freedObjects`, `freedPct`, `remainingRetained`.
```
# How much memory would be freed by fixing LeakyCache?
hdump> objects/com.example.LeakyCache | whatif()

# Focus on the key metrics
hdump> objects/com.example.LeakyCache | whatif() | select(freedBytes, freedPct)

# Simulate freeing thread-owned objects
hdump> gcroots/THREAD_OBJ | whatif()

# Compare impact of multiple candidates
hdump> objects/com.example.LeakyCache | whatif()
hdump> objects/com.example.SessionStore | whatif()

# Filter before simulating — only large instances
hdump> objects/com.example.Foo[retained > 1MB] | whatif()

# Target a specific object by ID
hdump> objects[id = 12345] | whatif()
```

Note: `freedBytes` is the sum of retained sizes of the input objects. When a full dominator tree has been computed, `freedObjects` reflects the total dominated subtree; otherwise it equals `targetCount`.
## Object Age Estimation

The `ages` root and `estimateAge()` operator estimate how long objects have been alive using structural signals from a single heap snapshot — no second dump required.

Scoring signals (0–100):

- GC root type: `STICKY_CLASS` (system classes) → +30, `JNI_GLOBAL` → +25, `THREAD_OBJ` → +5
- Inbound reference count: `min(25, refCount × 2)`
- BFS depth from GC root: `max(0, 10 − depth)` (shallower = older)

Age buckets: ephemeral (0–25), medium (26–50), tenured (51–75), permanent (76–100)
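The scoring signals and buckets above combine as in the following sketch. This is an illustrative reading of the heuristic, not JAFAR's source; the actual implementation may weight or cap signals differently:

```python
# Bonus per GC root type, as listed in the scoring signals
ROOT_BONUS = {"STICKY_CLASS": 30, "JNI_GLOBAL": 25, "THREAD_OBJ": 5}

def estimate_age(root_type, ref_count, depth):
    """Combine the three structural signals into a 0-100 age score."""
    if root_type is None:        # unreachable from any GC root:
        return 20                # defaults into the ephemeral bucket
    score = ROOT_BONUS.get(root_type, 0)
    score += min(25, ref_count * 2)   # inbound reference count signal
    score += max(0, 10 - depth)       # shallower BFS depth = older
    return score

def bucket(score):
    if score <= 25:
        return "ephemeral"
    if score <= 50:
        return "medium"
    if score <= 75:
        return "tenured"
    return "permanent"

# A system-class-rooted object with many inbound refs, close to its root:
s = estimate_age("STICKY_CLASS", ref_count=20, depth=2)
print(s, bucket(s))  # 63 tenured
```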
```
# Distribution of all objects by age bucket
hdump> ages | groupBy(ageBucket, agg=count)

# Oldest objects by retained size
hdump> ages | filter(ageBucket = "tenured") | top(10, retained)

# Class-filtered age view
hdump> ages/com.example.SessionStore | select(id, estimatedAge, ageBucket)

# As a pipeline operator on any object stream
hdump> objects/java.util.HashMap | estimateAge() | sortBy(estimatedAge desc) | top(20)

# Suspicious: WeakReferences that should be short-lived but are old
hdump> objects/java.lang.ref.WeakReference | estimateAge() | filter(ageBucket = "tenured") | top(20)
```

Note: Age scores are heuristic estimates based on structural signals, not GC generation metadata. Objects not reachable from any GC root default to score 20 (ephemeral). The first `ages` or `estimateAge()` call triggers a two-pass scan; subsequent calls use the cached result.
## Cache Profiling

The `cacheStats()` operator enriches Map-based object rows with five additional columns:

| Field | Type | Description |
|---|---|---|
| `entryCount` | int | Number of entries currently in the map |
| `maxSize` | int | Internal table capacity; -1 if unknown |
| `fillRatio` | double | entryCount / capacity; 0.0 if unknown |
| `costPerEntry` | long | retainedSize / entryCount; 0 if empty |
| `isLruMode` | boolean | true if LinkedHashMap with accessOrder=true |
```
# Analyze all LinkedHashMap instances
hdump> objects/java.util.LinkedHashMap | cacheStats()

# Find LRU caches (LinkedHashMap with accessOrder=true)
hdump> objects/java.util.LinkedHashMap | cacheStats() | filter(isLruMode = true)

# Top 10 most expensive caches by retained bytes per entry
hdump> objects/instanceof/java.util.Map | cacheStats() | top(10, costPerEntry)

# Find under-utilized maps (< 10% fill)
hdump> objects/java.util.HashMap | cacheStats() | filter(fillRatio < 0.1) | sortBy(entryCount desc)
```

`cacheStats()` recognizes HashMap, LinkedHashMap, and WeakHashMap directly. It also attempts to analyze any object that exposes size and table fields, covering most Map subclasses. Rows for unrecognized types receive null values in all cache columns.
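The derived columns follow directly from a map's raw fields. This sketch shows how the ratios in the table above could be computed, with the documented fallbacks for unknown capacity and empty maps; the field names mirror the table, everything else is illustrative:

```python
def cache_stats(entry_count, capacity, retained):
    """Derive cacheStats()-style columns from raw map fields."""
    return {
        "entryCount": entry_count,
        "maxSize": capacity if capacity > 0 else -1,                   # -1 if unknown
        "fillRatio": entry_count / capacity if capacity > 0 else 0.0,  # 0.0 if unknown
        "costPerEntry": retained // entry_count if entry_count > 0 else 0,  # 0 if empty
    }

# A 256-slot HashMap holding only 12 entries that retains 48 KB:
row = cache_stats(entry_count=12, capacity=256, retained=48_000)
print(row["fillRatio"], row["costPerEntry"])  # 0.046875 4000
```

A fill ratio under 0.1 like this one is exactly what the `filter(fillRatio < 0.1)` query above flags as an under-utilized map.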
## Heap Health Report

The `report` command runs a set of non-destructive analyses and produces a severity-ranked narrative. It is the quickest way to get an overview of a heap dump.
```
# Plain-text report
hdump> report

# Markdown format
hdump> report --format=markdown

# Restrict to specific areas
hdump> report --focus=duplicates,waste
```

Example output:

```
=== Heap Health Report ===
Heap: myapp.hprof (158 MB, 1,923,456 objects, 16,831 classes)

--- WARNING ---
[W1] Duplicate subgraphs: 48 groups, ~3.2 MB reclaimable
     Most wasteful: com.example.Config (12 copies)
     Action: Run: duplicates | sortBy(wastedBytes desc)
     Query: duplicates | sortBy(wastedBytes desc)

--- INFO ---
[I1] Heap summary: 1,923,456 objects, 16,831 classes, 158 MB total shallow
[I2] Top class by instance count: java.util.HashMap$Node (249,734 instances, 1.9 MB shallow)
[I3] GC roots: 3 STICKY_CLASS, 1 THREAD_OBJ
```
- CRITICAL: retained size > 100 MB
- WARNING: retained size > 10 MB, or duplicate waste > 1 MB
- INFO: all other findings
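The severity thresholds above amount to a simple classification. A minimal sketch (illustrative helper, not JAFAR's code):

```python
MB = 1024 * 1024

def severity(retained_bytes, duplicate_waste=0):
    """Classify a finding per the documented thresholds."""
    if retained_bytes > 100 * MB:
        return "CRITICAL"
    if retained_bytes > 10 * MB or duplicate_waste > 1 * MB:
        return "WARNING"
    return "INFO"

print(severity(200 * MB))                       # CRITICAL
print(severity(0, duplicate_waste=2 * MB))      # WARNING (matches [W1] above)
print(severity(1024))                           # INFO
```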
By default, `report` runs all contributors, triggering expensive computations as needed. Progress is shown in the console for long-running steps.
| Contributor | Notes |
|---|---|
| Heap overview | O(1) |
| Class histogram | O(classes) |
| GC root summary | O(roots) |
| Approximate retained sizes | computed if not already done |
| Leak detectors (6 built-in) | uses retained sizes |
| Collection waste estimate | uses retained sizes |
| Duplicate subgraphs (depth 3) | computed if not already done |
Use `--focus=` to restrict which contributors run:

| Focus value | Contributors |
|---|---|
| `leaks` | retained sizes + all leak detectors |
| `waste` | retained sizes + collection waste |
| `duplicates` | duplicate subgraph fingerprinting |
## Common Analysis Patterns

```
# Group by package prefix
hdump> classes/com.myapp.* | top(10, instanceCount)
hdump> classes/org.springframework.* | top(10, instanceCount)
```

Create a comprehensive overview:
```
# Object count
hdump> objects | count

# Total shallow size
hdump> objects | sum(shallow)

# Class count
hdump> classes | count

# Top consumers
hdump> objects | groupBy(class, agg=sum) | top(10, sum)
```

Open two heap dumps and use `join` to diff them:
```
hdump> open before.hprof
hdump> open after.hprof

# Diff class histograms — session 1 is before.hprof (the baseline)
hdump> classes | join(session=1) | sortBy(instanceCountDelta desc) | head(20)

# Find classes with growing instance counts
hdump> classes | join(session=1) | filter(instanceCountDelta > 0) | top(10, instanceCountDelta)

# New classes not present in the baseline
hdump> classes | join(session=1) | filter(baseline.exists = false)

# GC root type changes
hdump> gcroots | groupBy(type, agg=count) | join(session=1) | sortBy(countDelta desc)
```

The `join` operator adds `baseline.*` and `*Delta` columns for every numeric field. The join key is auto-inferred (`name` for classes, `className` for objects, `type` for GC roots) and can be overridden with `by=field`.
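Conceptually, the diff works as in this sketch: for each row keyed by class name, attach the baseline value plus a delta column. The column names follow the text above; the data and helper are hypothetical:

```python
def join_histograms(current, baseline):
    """Diff a current class histogram against a baseline, join-style."""
    rows = []
    for name, count in current.items():
        base = baseline.get(name)
        rows.append({
            "name": name,
            "instanceCount": count,
            "baseline.instanceCount": base,
            # New classes have no baseline, so the whole count is growth
            "instanceCountDelta": count - base if base is not None else count,
            "baseline.exists": base is not None,
        })
    return rows

before = {"com.example.Session": 100, "java.lang.String": 5000}
after = {"com.example.Session": 900, "java.lang.String": 5100, "com.example.Leak": 300}

rows = {r["name"]: r for r in join_histograms(after, before)}
print(rows["com.example.Session"]["instanceCountDelta"])  # 800
print(rows["com.example.Leak"]["baseline.exists"])        # False
```

Sorting these rows by `instanceCountDelta` descending reproduces the "growing instance counts" query shown above.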
When you have both a JFR recording and a heap dump from the same application, you can correlate allocation activity (from JFR) with heap state (from the dump). Use the `root=` parameter to specify the JFR event type:
```
# Open both sources
hdump> open recording.jfr
hdump> open dump.hprof

# Enrich class histogram with allocation data from JFR
hdump> classes | join(session="recording.jfr", root="jdk.ObjectAllocationSample", by=class)

# Find high-churn classes: many allocations but few survivors in the heap
hdump> classes | join(session=1, root="jdk.ObjectAllocationSample", by=class) | filter(allocCount > 1000) | sortBy(survivalRatio asc) | head(20)

# Top classes by total allocation weight
hdump> classes | join(session=1, root="jdk.ObjectAllocationSample", by=class) | sortBy(allocWeight desc) | top(10)
```

The JFR correlation adds enrichment columns: `allocCount`, `allocWeight`, `allocRate`, `topAllocSite`, and `survivalRatio` (instanceCount / allocCount). Classes with no matching JFR events get null values for all enrichment columns.
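The `survivalRatio` arithmetic can be sketched as follows; the helper and numbers are illustrative, but the formula matches the definition above (instanceCount / allocCount, null without JFR data):

```python
def survival_ratio(instance_count, alloc_count):
    """Live instances in the dump divided by JFR-sampled allocations.
    A low ratio means heavy churn: many allocations, few survivors."""
    if not alloc_count:       # no matching JFR events -> null column
        return None
    return instance_count / alloc_count

print(survival_ratio(500, 100_000))  # 0.005 -> high-churn class
print(survival_ratio(500, None))     # None -> no JFR data for this class
```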
```
# Export to JSON for external tools
hdump> objects | groupBy(class, agg=count) | top(100, count) --format json > classes.json

# CSV output
hdump> classes | top(20, instanceCount) --format csv > top-classes.csv

# Limit output rows
hdump> objects | top(100, shallow) --limit 10
```

## Tips and Best Practices

Always begin with high-level statistics:
```
objects | count
objects | stats(shallow)
classes | count
gcroots | groupBy(type)
```

Find all implementations:
```
objects/instanceof/java.io.Closeable | groupBy(class)
```

Filter to your packages:
```
classes/com.myapp.* | top(10, instanceCount)
```

Take dumps at different times and use `join` to diff them:
```
open before.hprof
open after.hprof
classes | join(session=1) | filter(instanceCountDelta > 0) | top(20, instanceCountDelta)
```

Good scenarios for comparison:
- Before and after a suspected leak operation
- Under normal load vs. high load
- Fresh start vs. after extended run
Common leak indicators:
- Steadily growing instance counts
- Large collections (Maps, Lists)
- Many JNI_GLOBAL references
- Duplicate strings
Make queries readable:
```
# Instead of
objects[shallow > 1048576]

# Use
objects[shallow > 1MB]
```

Filter early to reduce data:
```
# Good: filter first
objects/java.lang.String[shallow > 1KB] | groupBy(class) | top(10)

# Less efficient: groupBy all, then filter
objects/java.lang.String | groupBy(class) | filter(count > 100)
```

Use built-in detectors for quick wins:
```
# Interactive wizard runs all detectors
checkLeaks()

# Or target specific leak types
checkLeaks(detector="threadlocal-leak")
checkLeaks(detector="classloader-leak")
```

Shallow size shows an object's own memory; retained size shows its total impact:
```
# Find objects with highest memory impact
objects | top(10, retained)

# Then trace why they're retained
objects[id = 0x12345] | pathToRoot
```

- Read the HdumpPath Reference for complete query syntax
- Explore the JFR Shell Tutorial for JFR analysis
- Check the Scripting Guide for automating analysis