How to Tune Go GC with GOGC and GOMEMLIMIT
How Go's garbage collector works, what GOGC and GOMEMLIMIT actually control, how the Pacer schedules GC work, and how to tune these for your service.
The two settings that determine when Go’s GC runs, how hard it works, and whether your service survives a traffic spike inside a memory-limited container.
Most Go services ship with GOGC at its default and GOMEMLIMIT disabled entirely. That works fine at small scale. At larger heap sizes — 500MB, 1GB, 2GB live — the default behavior becomes unpredictable: the heap trigger grows proportionally with the live heap, OOM kills happen in Kubernetes, and GC assist pressure creates latency spikes nobody can explain.
This post covers everything you need to tune Go GC correctly: how the tricolor mark-and-sweep algorithm works, what the Pacer does and why it matters, and the two knobs you actually have — GOGC and GOMEMLIMIT. By the end, you’ll be able to look at GC metrics and know exactly what to change.
If you haven’t set up heap profiling yet, the complete pprof guide covers how to collect and read allocation profiles — which is the right first step before touching any GC settings.
How Go’s GC Works: The Short Version
Go uses a concurrent, tricolor mark-and-sweep garbage collector. The tricolor part refers to how it tracks objects:
- White — not yet visited. Candidates for collection.
- Gray — discovered but not fully scanned. On the work queue.
- Black — fully scanned. Reachable and safe.
The collector starts by marking all roots (globals, goroutine stacks) gray, then drains the gray queue, turning each object black after scanning its pointers. Anything still white at the end is unreachable — and gets swept.
The reason this can run concurrently with your application is the write barrier: a small hook that fires on every pointer write during the mark phase, ensuring that any new pointer relationships created while the GC is running get captured. You pay a small CPU tax on every heap pointer write during marking. This is unavoidable.
What you can control is when the GC runs, how aggressively it runs, and how much memory it’s allowed to use.
The Pacer: Why GC Timing Is Harder Than It Looks
At its core, the GC uses a heap-growth trigger: run a collection when the heap doubles relative to the live heap that survived the last cycle. That's what GOGC=100 means today — trigger GC when the heap has grown 100% above the size of the last surviving heap.
The problem with a simple trigger is overshoot. If your application is allocating very quickly, the GC might not finish a cycle before the heap has already blown past its target. Go solves this with the Pacer.
The Pacer’s job is to schedule GC assist work — forcing goroutines that are allocating heavily to help the GC mark — so that the mark phase finishes right as the heap hits the trigger limit. It’s a feedback control system, not a fixed timer.
In practice, this means:
- The GC estimates how much marking work needs to be done based on the previous cycle’s scan rate.
- It tracks the allocation rate during the current cycle.
- When a goroutine allocates, it checks if the GC is behind schedule. If it is, that goroutine is taxed some CPU cycles to do marking work (a “GC assist”).
GC assists are why you sometimes see latency spikes on allocation-heavy paths even though the GC runs “concurrently.” Your goroutine is doing GC work.
The Pacer was substantially rewritten in Go 1.18 to fix systematic under- and over-shoot problems in the original design. If you’re still running Go 1.17 or earlier in production, upgrading alone will improve GC behavior.
GOGC: The Heap Growth Target
GOGC controls the ratio between the live heap and the heap trigger.
```
heap_trigger = live_heap_after_last_gc * (1 + GOGC/100)
```
The default is GOGC=100, meaning the heap can double before the next GC fires.
You can set it via environment variable or at runtime:
```go
import "runtime/debug"

// Set GOGC to 200 — allow heap to triple before GC fires
debug.SetGCPercent(200)

// Disable GC entirely (dangerous — only for batch jobs)
debug.SetGCPercent(-1)
```
What happens when you raise GOGC?
Higher GOGC = longer time between collections = less GC CPU overhead = more memory usage.
If your service is CPU-bound and has plenty of headroom on memory, raising GOGC to 200 or 400 is a legitimate optimization. You’re trading memory for throughput.
```
// Benchmark: GOGC=100 vs GOGC=400 on a throughput-heavy service
// GOGC=100: 340ms avg latency, 512MB heap
// GOGC=400: 290ms avg latency, 1.1GB heap
// ~15% latency improvement for 2x memory cost
```
What happens when you lower GOGC?
Lower GOGC = more frequent collections = more GC overhead = lower peak memory.
Setting GOGC=50 means the GC fires when the heap grows 50% above the last live heap. This is useful when you’re memory-constrained and can afford more CPU burn.
The problem with GOGC alone
GOGC only controls the ratio of growth — it says nothing about absolute memory limits. On a service that starts with a 100MB live heap, GOGC=100 triggers at 200MB. On a service with a 2GB live heap after a traffic spike, GOGC=100 triggers at 4GB. This is how Go services OOM kill in Kubernetes: the live heap grows, the trigger grows with it, and the container quietly marches past its memory limit.
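To make that concrete, here's a small sketch of the trigger arithmetic. The 1GiB container limit is a hypothetical number chosen for illustration:

```go
package main

import "fmt"

// trigger returns the heap size at which the next GC fires,
// per heap_trigger = live_heap * (1 + GOGC/100).
func trigger(liveHeap, gogc int) int {
	return liveHeap * (100 + gogc) / 100
}

func main() {
	const containerLimit = 1 << 30 // hypothetical 1GiB container limit
	for _, live := range []int{100 << 20, 400 << 20, 600 << 20} {
		t := trigger(live, 100)
		fmt.Printf("live=%4dMiB  trigger=%4dMiB  exceeds limit: %v\n",
			live>>20, t>>20, t > containerLimit)
	}
}
```

A live heap of 600MiB is comfortably under the 1GiB limit, but its GOGC=100 trigger is 1200MiB, so the runtime will happily grow past the container limit before the next collection fires.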
A note on the ballast object pattern (now deprecated)
Before GOMEMLIMIT existed, a common workaround was the ballast object: allocating a large byte slice at startup and holding a reference to it, which inflated the apparent live heap and raised the GC trigger without actually using that memory for anything.
```go
// Old pattern — pre-Go 1.19, don't use this anymore
func main() {
	ballast := make([]byte, 1<<30) // 1GB "ballast"
	_ = ballast                    // keep it alive
	// Now live_heap starts at ~1GB, so GOGC=100 triggers at ~2GB
}
```
This worked, but it was a hack: it consumed virtual memory, made heap profiles confusing, and didn’t handle the case where the live heap grew past the ballast. GOMEMLIMIT renders it obsolete. If you see this pattern in older codebases, replace it with GOMEMLIMIT.
GOMEMLIMIT: The Hard Ceiling
GOMEMLIMIT was introduced in Go 1.19 and is arguably the most important GC addition in years. It sets an absolute upper bound on the Go runtime’s memory footprint.
```go
import "runtime/debug"

// Set a 512MB hard limit
debug.SetMemoryLimit(512 * 1024 * 1024)
```
Or via environment:
```shell
GOMEMLIMIT=512MiB ./myservice
```
When the runtime approaches this limit, it ignores GOGC and runs GC as aggressively as necessary to stay under the ceiling. If it cannot reclaim enough memory — because everything is live — it will enter a thrashing state, spending most of its time in GC. This is intentional. The alternative is an OOM kill, which loses all in-flight requests.
The right way to set GOMEMLIMIT in Kubernetes
Set GOMEMLIMIT to roughly 90% of your container memory limit. This gives the runtime room to operate without hitting the kernel OOM killer.
```yaml
resources:
  limits:
    memory: "512Mi"
env:
  - name: GOMEMLIMIT
    value: "460MiB" # ~90% of 512Mi
```
Before GOMEMLIMIT, the only way to prevent OOM kills in Go was to set GOGC very conservatively (low values = frequent GC = lower peak memory), which burned CPU constantly even when memory wasn’t the constraint. GOMEMLIMIT lets you run with a high GOGC for performance most of the time, while the runtime automatically gets more aggressive when memory is actually tight.
GOMEMLIMIT does not count everything
Important caveat: GOMEMLIMIT tracks memory managed by the Go runtime — heap, stacks, GC metadata. It does not track:
- Memory mapped by CGo
- Memory from `mmap` calls outside the runtime
- Shared libraries
- The OS overhead of the process itself
Add a buffer for these. In pure Go services with no CGo, the 90% rule is generally safe.
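If you'd rather not hardcode the value in your manifest, you can derive it at process startup from the cgroup limit. This is a minimal sketch assuming cgroup v2 (the standard `/sys/fs/cgroup/memory.max` path) and the 90% rule of thumb from above; libraries like automemlimit do this more robustly, handling cgroup v1 and nested hierarchies:

```go
package main

import (
	"fmt"
	"os"
	"runtime/debug"
	"strconv"
	"strings"
)

// goMemLimitFor converts a raw cgroup v2 memory.max value into a
// GOMEMLIMIT at 90% of the container limit.
// Returns -1 when there is no usable limit.
func goMemLimitFor(raw string) int64 {
	s := strings.TrimSpace(raw)
	if s == "max" { // "max" means no limit is configured
		return -1
	}
	n, err := strconv.ParseInt(s, 10, 64)
	if err != nil || n <= 0 {
		return -1
	}
	return n * 9 / 10
}

func main() {
	data, err := os.ReadFile("/sys/fs/cgroup/memory.max")
	if err != nil {
		return // not in a cgroup v2 container; keep runtime defaults
	}
	if limit := goMemLimitFor(string(data)); limit > 0 {
		debug.SetMemoryLimit(limit)
		fmt.Printf("GOMEMLIMIT set to %d bytes\n", limit)
	}
}
```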
Diagnosing GC Behavior in Practice
GODEBUG=gctrace=1
The fastest way to see what the GC is doing:
```shell
GODEBUG=gctrace=1 ./myservice 2>&1 | grep "^gc"
```
Sample output:
```
gc 14 @12.311s 2%: 0.10+2.1+0.14 ms clock, 0.84+0.62/1.9/0.28+1.1 ms cpu, 142->148->72 MB, 144 MB goal, 0 MB stacks, 0 MB globals, 8 P
```
Breaking this down:
- `@12.311s` — time since program start
- `2%` — percentage of wall-clock time spent in GC (this is the number you care about most)
- `0.10+2.1+0.14 ms` — stop-the-world sweep termination + concurrent mark + stop-the-world mark termination
- `142->148->72 MB` — heap size at start of GC → heap size at end of GC → live heap after sweep
- `144 MB goal` — what the Pacer was targeting
A well-tuned service should spend under 5% of wall time in GC. If you’re seeing 15–25%, that’s a signal to raise GOGC or investigate allocation rate.
runtime/metrics
For production monitoring, use runtime/metrics (Go 1.16+) instead of parsing gctrace strings:
```go
import (
	"fmt"
	"runtime/metrics"
)

samples := []metrics.Sample{
	{Name: "/gc/cycles/total:gc-cycles"},
	{Name: "/gc/heap/live:bytes"},
	{Name: "/memory/classes/heap/objects:bytes"},
	{Name: "/sched/latencies:seconds"},
}
metrics.Read(samples)

for _, s := range samples {
	switch s.Value.Kind() {
	case metrics.KindUint64:
		fmt.Printf("%s: %d\n", s.Name, s.Value.Uint64())
	case metrics.KindFloat64Histogram:
		// Export to Prometheus as a histogram
	}
}
```
Key metrics to track in your dashboards:
| Metric | What it tells you |
|---|---|
| `/gc/cycles/total:gc-cycles` | GC frequency — rising fast = allocation problem |
| `/gc/heap/live:bytes` | Actual live data — your real memory baseline |
| `/memory/classes/heap/objects:bytes` | Current heap in use — compare to live to see fragmentation |
| `/gc/pauses:seconds` (histogram) | STW latency — should be sub-millisecond |
| `/sched/latencies:seconds` (histogram) | Goroutine scheduling latency — includes GC assists |
Watching for GC assist pressure
GC assists are invisible in most dashboards but they’re a common source of unexplained latency spikes. If /sched/latencies:seconds shows long tail latencies but your handler code looks fast, GC assists are a strong suspect.
Remediation options:
- Raise `GOGC` to reduce GC frequency
- Reduce allocation rate at the source (`sync.Pool`, preallocated slices — see how to use sync.Pool correctly)
- If you have predictable low-traffic windows, call `runtime.GC()` manually to run a collection before the traffic hits
Practical Tuning Decision Tree
```
Is GC CPU overhead > 5% of wall time?
├── YES: Are you memory-constrained?
│   ├── YES: Reduce allocation rate (sync.Pool, escape analysis)
│   └── NO: Raise GOGC (try 200, then 400)
└── NO: Are you seeing OOM kills?
    ├── YES: Set GOMEMLIMIT to 90% of container limit
    └── NO: You're probably fine — add metrics and revisit under load
```
The GOGC + GOMEMLIMIT Combination
These two settings work best together. The recommended baseline for most Go services running in Kubernetes:
```shell
GOGC=100                            # Default — adjust up if CPU overhead is high
GOMEMLIMIT=<90% of container limit>
```
This gives you:
- Normal GC behavior under typical load
- Automatic GC aggression when memory tightens
- Protection against OOM kills from heap trigger overshoot
If your service is CPU-sensitive and memory-generous:
```shell
GOGC=300
GOMEMLIMIT=<90% of container limit>
```
If your service is memory-sensitive (dense packing, many replicas):
```shell
GOGC=50
GOMEMLIMIT=<80% of container limit> # tighter ceiling
```
What the GC Can’t Fix
GOGC and GOMEMLIMIT are pressure valves. They don’t fix the underlying problem if you’re allocating too much. Before tuning GC settings, profile your allocation rate:
```shell
go tool pprof -alloc_objects http://localhost:6060/debug/pprof/heap
```
If your top allocation sites are avoidable — temporary string conversions, unmarshalled structs that could be pooled, slices that grow repeatedly — fix those first. A well-tuned GC on a high-allocation service is still slower than a low-allocation service on default settings.
For a full treatment of finding and eliminating allocation hotspots, see how to find heap escapes with gcflags -m.
Summary
| Setting | Effect | Default | When to change |
|---|---|---|---|
| `GOGC=N` | GC triggers when heap grows N% above live size | 100 | Raise for CPU-sensitive services with memory headroom |
| `GOMEMLIMIT=N` | Hard cap on Go runtime memory | Off | Set to ~90% of container limit in Kubernetes |
| `debug.SetGCPercent(-1)` | Disable GC | — | Batch jobs with known lifetimes only |
The Pacer handles the scheduling math so you don’t have to. Your job is to give it the right constraints: a growth target that matches your CPU/memory tradeoff (GOGC) and a ceiling that keeps you safe from OOM kills (GOMEMLIMIT).
Frequently Asked Questions
What does GOGC=100 actually mean?
It means the GC triggers when the heap has grown 100% above the live heap size from the end of the last collection — i.e., the heap doubles. If 72MB survived the last GC cycle, the next GC fires when the heap reaches 144MB. At GOGC=200, it fires at 216MB. The formula is `heap_trigger = live_heap * (1 + GOGC/100)`.
Should I set GOMEMLIMIT on every Go service?
Yes, for any service running in a container with a memory limit. Without it, the heap trigger can grow proportionally with the live heap until it exceeds the container limit, causing an OOM kill. Setting GOMEMLIMIT to 90% of the container limit costs nothing in the normal case and prevents the worst-case scenario.
What’s the difference between GOMEMLIMIT and GOGC? Can I use just one?
They control different things. GOGC controls how often the GC runs relative to heap growth (a ratio). GOMEMLIMIT sets an absolute ceiling. Using only GOGC leaves you exposed to OOM kills when the live heap grows unexpectedly. Using only GOMEMLIMIT without a reasonable GOGC can cause excessive GC frequency at low heap sizes. Use both.
What are GC assists and why do they cause latency spikes?
When the Pacer determines the GC is behind schedule — allocations are outpacing the mark phase — it taxes goroutines that are allocating: they must do some GC marking work before their allocation is served. This is called a GC assist. It causes latency spikes on allocation-heavy code paths because your request handler is suddenly doing GC work mid-request. Raising GOGC reduces assist pressure; reducing allocation rate eliminates it.
Is there a way to see if my service is spending too much time in GC without running gctrace?
Yes. Export /gc/cycles/total:gc-cycles and /sched/latencies:seconds from runtime/metrics to Prometheus or your metrics system. A rising GC cycle count with normal throughput indicates allocation pressure. Long-tail scheduling latency without high CPU suggests GC assists. These are more production-friendly than GODEBUG=gctrace=1, which prints to stderr.
When should I call runtime.GC() manually?
Only in two scenarios: (1) batch jobs or CLI tools where you want to force a collection at a known clean point before processing a large dataset, and (2) services with predictable traffic patterns where you want to run a collection during a quiet window before a known traffic spike. For steady-state services, let the Pacer schedule collections — manual GC calls interfere with the Pacer’s feedback model and can cause unpredictable behavior.
My GOGC is already at 400 and GC overhead is still high. What now?
At that point, GC tuning isn’t the answer. Profile your allocation sites: go tool pprof -alloc_objects http://localhost:6060/debug/pprof/heap. If the top allocators are in your application code, reduce allocations there — sync.Pool, pre-sized slices, avoiding interface boxing in hot paths. If the top allocators are in third-party libraries (JSON, gRPC, etc.), benchmark alternatives. See how to find allocation hotspots in Go for the full walkthrough.
Related Reading
- How to Profile a Go Application: The Complete pprof Guide — collecting and reading CPU, heap, goroutine, block, and mutex profiles
- How to Find Allocation Hotspots with Heap Profiling — the allocs profile and what to do with it
- How to Use sync.Pool Correctly in a High-Throughput Go Service — the right pattern for eliminating allocation pressure
- Go Memory Management: How to Optimize Allocations and GC Pressure — the full memory optimization hub
Need help applying this to your stack? Book a free 30-minute infrastructure review and we’ll pull your GC metrics and tell you exactly where you stand.