Pre-Allocating Slices and Maps in Go: When It Matters

How Go's append and map growth work, when pre-allocation actually helps, benchmark results across slice sizes, and practical rules for hot-path code.

Cristian Curteanu August 27, 2024 9 min read


Pre-allocating slices and maps in Go is one of those optimizations where the cost is one line of code and the benefit scales with your throughput. At ten elements, it doesn’t matter. At ten million elements in a hot path, it’s the difference between 1 allocation and 38. This post covers how to know when you’re in the range where it matters, backed by benchmarks — and the one counterintuitive result that changes how I write filtering functions.

If you’re seeing allocation pressure in your heap profiles and aren’t sure where it’s coming from, the complete pprof guide covers how to collect and read allocs profiles. Start there to confirm pre-allocation is actually your problem before optimizing.


What Actually Happens When a Slice Grows

Slices in Go are backed by arrays. When you call append and the slice doesn’t have enough capacity, Go must:

  1. Allocate a new, larger array (roughly 2x for small slices, ~1.25x for larger ones)
  2. Copy all existing elements to the new array
  3. Add your new element
  4. Let the garbage collector clean up the old array

Here’s what that looks like visually:

Appending to a slice at capacity:

Before: [A][B][C][D]  cap=4, len=4
                 ↓ append(E)
Step 1: [_][_][_][_][_][_][_][_]  allocate new array (cap=8)
Step 2: [A][B][C][D][_][_][_][_]  copy 4 elements
Step 3: [A][B][C][D][E][_][_][_]  finally add new element
        [A][B][C][D] → garbage collection

For a single append, this overhead is negligible. But in a loop adding thousands of elements? You’re paying this cost over and over — and each copy is larger than the last.
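You can watch this growth happen by tracking cap across appends. A small sketch (exact capacities vary by Go version and element size):

```go
package main

import "fmt"

// capGrowth appends n elements to an empty slice and records each
// distinct capacity the runtime allocates along the way.
func capGrowth(n int) []int {
	var caps []int
	var s []int
	for i := 0; i < n; i++ {
		s = append(s, i)
		if len(caps) == 0 || cap(s) != caps[len(caps)-1] {
			caps = append(caps, cap(s))
		}
	}
	return caps
}

func main() {
	// Each printed capacity is a fresh backing array: allocate, copy, discard.
	fmt.Println(capGrowth(2000))
}
```

Every value in the output corresponds to one allocate-and-copy cycle that pre-allocation would have avoided.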


The Benchmark

The core comparison — no pre-allocation, pre-allocated with exact capacity, and direct index assignment:

// No pre-allocation — slice grows dynamically
func SliceNoPrealloc(n int) []int {
    var s []int // len=0, cap=0
    for i := 0; i < n; i++ {
        s = append(s, i)
    }
    return s
}

// Pre-allocated with exact capacity
func SlicePrealloc(n int) []int {
    s := make([]int, 0, n) // len=0, cap=n
    for i := 0; i < n; i++ {
        s = append(s, i)
    }
    return s
}

// Direct index assignment (fastest)
func SliceDirectAssign(n int) []int {
    s := make([]int, n) // len=n, cap=n
    for i := 0; i < n; i++ {
        s[i] = i
    }
    return s
}

Results from go test -bench=. -benchmem:

Slice Results (integers)

Elements    Method        Time      Memory    Allocations
100         No prealloc   1,077 ns  2,040 B   8
100         Prealloc      420 ns    896 B     1
10,000      No prealloc   154 μs    357 KB    19
10,000      Prealloc      29 μs     82 KB     1
1,000,000   No prealloc   15.5 ms   41.7 MB   38
1,000,000   Prealloc      1.6 ms    8 MB      1

At one million elements, pre-allocation is 10x faster and uses 5x less memory. The non-pre-allocated version triggers 38 separate allocations, each one copying increasingly large amounts of data.
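You can sanity-check these allocation counts yourself without a full benchmark using testing.AllocsPerRun. A quick sketch (exact no-prealloc counts vary slightly by Go version):

```go
package main

import (
	"fmt"
	"testing"
)

// allocsFor reports the average number of heap allocations build performs.
func allocsFor(build func()) float64 {
	return testing.AllocsPerRun(10, build)
}

func main() {
	const n = 10_000

	noPrealloc := allocsFor(func() {
		var s []int
		for i := 0; i < n; i++ {
			s = append(s, i)
		}
		_ = s
	})

	prealloc := allocsFor(func() {
		s := make([]int, 0, n)
		for i := 0; i < n; i++ {
			s = append(s, i)
		}
		_ = s
	})

	fmt.Printf("no prealloc: %.0f allocs, prealloc: %.0f allocs\n", noPrealloc, prealloc)
}
```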

Map Results

Maps behave similarly. They rehash and redistribute buckets when the load factor exceeds a threshold:

// No size hint
m := make(map[int]int)

// Size hint provided
m := make(map[int]int, expectedSize)

Elements   Method        Time      Memory   Allocations
100        No prealloc   7,065 ns  5,364 B  16
100        Prealloc      3,777 ns  2,908 B  6
10,000     No prealloc   769 μs    687 KB   276
10,000     Prealloc      392 μs    322 KB   11
100,000    No prealloc   7.7 ms    5.8 MB   4,016
100,000    Prealloc      4.4 ms    2.8 MB   1,681

The difference is less dramatic than slices — maps are more complex internally — but still a consistent 2x improvement in speed and memory.
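The same AllocsPerRun spot-check works for maps. A sketch comparing inserts with and without a size hint:

```go
package main

import (
	"fmt"
	"testing"
)

// mapAllocs measures average heap allocations when inserting n keys,
// with the given size hint passed to make (0 means no hint).
func mapAllocs(n, hint int) float64 {
	return testing.AllocsPerRun(10, func() {
		m := make(map[int]int, hint)
		for i := 0; i < n; i++ {
			m[i] = i
		}
		_ = m
	})
}

func main() {
	const n = 10_000
	fmt.Printf("no hint: %.0f allocs, with hint: %.0f allocs\n",
		mapAllocs(n, 0), mapAllocs(n, n))
}
```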


The Real-World Pattern: Filtering and Transforming Slices

The most common scenario where this matters is transforming one slice into another with some filtering:

// This pattern appears everywhere
func GetActiveUsers(users []User) []User {
    var active []User
    for _, u := range users {
        if u.IsActive {
            active = append(active, u)
        }
    }
    return active
}

You have three options:

// Option 1: No pre-allocation (baseline)
func TransformNoPrealloc(input []int) []int {
    var result []int
    for _, v := range input {
        if v%2 == 0 {
            result = append(result, v*2)
        }
    }
    return result
}

// Option 2: Two-pass with exact count
func TransformPreallocExact(input []int) []int {
    count := 0
    for _, v := range input {
        if v%2 == 0 {
            count++
        }
    }
    result := make([]int, 0, count)
    for _, v := range input {
        if v%2 == 0 {
            result = append(result, v*2)
        }
    }
    return result
}

// Option 3: Estimate based on heuristics
func TransformPreallocEstimate(input []int) []int {
    // Assume ~50% will match
    result := make([]int, 0, len(input)/2)
    for _, v := range input {
        if v%2 == 0 {
            result = append(result, v*2)
        }
    }
    return result
}

Method            Time     Memory   Allocations
No prealloc       912 μs   1.96 MB  25
Two-pass exact    334 μs   401 KB   1
Estimate (len/2)  194 μs   401 KB   1

The estimate method wins. Iterating the slice twice to get an exact count costs more than the occasional over-allocation. When in doubt, over-allocate slightly rather than under-allocate or iterate twice. The GC can reclaim unused capacity; it can’t un-pay the cost of a second pass.
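If you use this pattern often, the estimate approach generalizes into a small helper. A sketch (filterWithHint is an illustrative name, not a standard function):

```go
package main

import "fmt"

// filterWithHint copies the elements matching keep into a new slice,
// pre-allocated with a caller-supplied capacity estimate.
func filterWithHint[T any](in []T, hint int, keep func(T) bool) []T {
	out := make([]T, 0, hint)
	for _, v := range in {
		if keep(v) {
			out = append(out, v)
		}
	}
	return out
}

func main() {
	nums := []int{1, 2, 3, 4, 5, 6}
	// Expect roughly half to match, so hint len/2.
	evens := filterWithHint(nums, len(nums)/2, func(v int) bool { return v%2 == 0 })
	fmt.Println(evens) // [2 4 6]
}
```

An under-estimate is safe here: append falls back to normal growth, so you only lose some of the benefit, never correctness.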


How to Detect Pre-allocation Issues

1. Benchmark with memory stats

go test -bench=. -benchmem

Look for high allocs/op values. If your function is doing 10+ allocations for what should be a single slice build, you have a problem.
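As a concrete example, here is a minimal benchmark you could drop into a _test.go file. The buildNoPrealloc helper mirrors the SliceNoPrealloc function from earlier; the main function uses testing.Benchmark only so this sketch runs standalone with go run:

```go
package main

import (
	"fmt"
	"testing"
)

// buildNoPrealloc mirrors the SliceNoPrealloc function from the benchmarks above.
func buildNoPrealloc(n int) []int {
	var s []int
	for i := 0; i < n; i++ {
		s = append(s, i)
	}
	return s
}

func BenchmarkNoPrealloc(b *testing.B) {
	b.ReportAllocs() // equivalent to passing -benchmem for this benchmark
	for i := 0; i < b.N; i++ {
		_ = buildNoPrealloc(10_000)
	}
}

func main() {
	r := testing.Benchmark(BenchmarkNoPrealloc)
	// Double-digit allocs/op for a single slice build is the red flag.
	fmt.Printf("%d allocs/op, %d B/op\n", r.AllocsPerOp(), r.AllocedBytesPerOp())
}
```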

2. Escape analysis

go build -gcflags="-m" yourfile.go

This shows where allocations happen:

./main.go:19:11: make([]int, 0, n) escapes to heap
./main.go:27:11: make(map[int]int) escapes to heap

For a deeper walkthrough of reading escape analysis output, see how to use gcflags -m to find heap escapes.

3. Static analysis with prealloc

go install github.com/alexkohler/prealloc@latest
prealloc ./...

This linter specifically catches the var s []T followed by append in a loop pattern — the exact scenario these benchmarks measure.

4. Runtime profiling

For production systems, use pprof to find allocation hotspots:

import _ "net/http/pprof"

// Access via:
// go tool pprof http://localhost:6060/debug/pprof/allocs

The allocs profile (/debug/pprof/allocs) shows where allocations occurred over the profile window, ranked by object count or byte count. If a slice-building function appears near the top, it’s worth pre-allocating.


Practical Rules

Rule 1: Known exact size → pre-allocate exactly

result := make([]T, 0, len(input))
// or if filling sequentially:
result := make([]T, len(input))

Rule 2: Upper bound known → pre-allocate to upper bound

// Filtering from input — output can't exceed input length
result := make([]T, 0, len(input))

Rule 3: Estimate available → use the estimate, avoid two passes

// If you know ~30% typically match
result := make([]T, 0, len(input)/3)

Rule 4: Building from external source → estimate reasonably

// Reading lines from a file? Guess based on typical size
lines := make([]string, 0, 1000)

Rule 5: Maps with predictable keys → always hint

// Indexing users by ID
userMap := make(map[int64]*User, len(users))

When Pre-allocation Doesn’t Matter

Skip it when:

  • The slice will have fewer than ~10 elements
  • The code isn’t in a hot path (not called per-request, per-message, or in a tight loop)
  • You genuinely have no idea about the size (rare in practice once you think about it)
  • Readability suffers significantly — a make with a confusing hint is worse than a clean var

The goal is fast, maintainable code. Pre-allocating cold-path slices that build once at startup is unnecessary cognitive overhead.


The GC Angle

Every grow-and-copy cycle creates short-lived garbage: the old backing array is allocated, used briefly, then discarded. In a service handling thousands of requests per second, each one building several slices without pre-allocation, this accumulates into measurable GC pressure.

The relationship is direct: fewer allocations → lower allocation rate → less GC work → lower GC CPU overhead → fewer GC assists on your goroutines. For a full picture of how allocation rate connects to GC pauses and GOGC tuning, see how to tune Go GC with GOGC and GOMEMLIMIT.


Frequently Asked Questions

Does pre-allocating a slice change how append behaves?

No. append still works correctly regardless of whether you pre-allocated. The difference is that if len < cap, append adds the element at s[len] and increments len — no allocation, no copy. You get the same semantics with dramatically less overhead. The slice’s len still starts at 0 when you use make([]T, 0, n); you’re only setting the capacity.
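A quick illustration of the len/cap distinction:

```go
package main

import "fmt"

func main() {
	s := make([]int, 0, 8) // capacity reserved, length zero
	fmt.Println(len(s), cap(s)) // 0 8

	s = append(s, 42) // writes to s[0]; no allocation, no copy
	fmt.Println(len(s), cap(s)) // 1 8
}
```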

What happens if I over-allocate and don’t fill the slice?

The unused capacity sits allocated but unused until the slice is GC’d. For most cases this is fine: the memory is minimal and short-lived. Note that the full slice expression s = s[:len(s):len(s)] only caps future append growth; it does not shrink the existing backing array. If you’re building many long-lived slices with large over-allocations, copy the elements into a right-sized slice after filling so the oversized array can be collected.

Is the size hint for maps exact or approximate?

It’s a hint, not a guarantee. make(map[K]V, n) tells the runtime to pre-allocate enough buckets to hold approximately n elements without rehashing. The actual threshold before rehashing depends on the load factor (approximately 6.5 elements per bucket). Providing a hint avoids the intermediate rehashes during population, but map memory management remains internal to the runtime.

Does pre-allocation help with strings.Builder and bytes.Buffer?

Yes, and the same logic applies. strings.Builder has a Grow(n int) method; bytes.Buffer can be initialized with bytes.NewBuffer(make([]byte, 0, n)). In JSON-heavy services where you’re building response strings or encoding buffers in hot paths, pre-sizing these avoids repeated doublings.
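A sketch of pre-sizing a strings.Builder; the nine-bytes-per-ID estimate is an illustrative guess, not a rule:

```go
package main

import (
	"fmt"
	"strings"
)

// joinIDs builds a comma-separated list, growing the builder once
// up front instead of letting it reallocate repeatedly as it fills.
func joinIDs(ids []int) string {
	var b strings.Builder
	b.Grow(len(ids) * 9) // rough estimate: up to 8 digits plus a comma each
	for i, id := range ids {
		if i > 0 {
			b.WriteByte(',')
		}
		fmt.Fprintf(&b, "%d", id)
	}
	return b.String()
}

func main() {
	fmt.Println(joinIDs([]int{7, 42, 1999})) // 7,42,1999
}
```

As with slices, a slight over-estimate to Grow is harmless; an under-estimate just means the builder falls back to its normal doubling.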

The prealloc linter flags my code but it’s in a cold path. Should I fix it?

Not necessarily. The linter identifies potential issues but can’t know whether the code is hot or cold. Use the //nolint:prealloc directive for intentional cold-path cases and add a comment explaining why. Linter output is a starting point for investigation, not a mandatory fix list.

How do I find which slice-building functions are actually hot in production?

Use go tool pprof -alloc_objects http://localhost:6060/debug/pprof/allocs on a live service. The allocs profile shows allocation sites weighted by object count over the capture window. Functions appearing near the top with high object counts are your real hot paths — optimize those first, ignore the rest.



Want to verify these numbers yourself? Run the benchmarks with go test -bench=. -benchmem -benchtime=2s and see the results on your own hardware.