I’ve recently learned about a surprising Golang behavior. You can significantly speed up the iteration over an array by doing:
sum := int64(0)
for _, v := range &array {
sum += v
}
instead of:
sum := int64(0)
for _, v := range array {
sum += v
}
The following code is benchmarking two ways of iterating over an array.
var global int64
func Benchmark(b *testing.B) {
var array [1000000]int64
for i := 0; i < len(array); i++ {
array[i] = rand.Int63n(10)
}
b.Run("pointer", func(b *testing.B) {
for i := 0; i < b.N; i++ {
sum := int64(0)
for _, v := range &array { // we're passing &array instead of array here
sum += v
}
global = sum // prevent the compiler from optimizing out the loop
}
})
b.Run("array", func(b *testing.B) {
for i := 0; i < b.N; i++ {
sum := int64(0)
for _, v := range array {
sum += v
}
global = sum // prevent the compiler from optimizing out the loop
}
})
}
It turns out that depending on the size of the array
the first version of the loop which uses a pointer to the array
can be up to almost 3 times faster. Below you can see a table summarising benchmark results for different sizes of the array
.
As you can see, the bigger the size of the array
, the more apparent it becomes that passing the array
pointer to for
loop is much faster in Golang.
It turns out that for
loop in Golang creates a copy of the collection you give it. This means that for _, v := range array
creates a copy of the array
and iterates over this copy instead of the array
itself. By passing &array
we make sure that the loop copies only a much smaller pointer.
This can be especially surprising for people coming from languages like C++ or Java where range loops were using iterators to traverse existing collections.
There are other consequences of this behavior than performance. Let’s try to solve a classical dynamic programming problem of unbounded knapsack:
“Given a set of items, each with a weight and a value, determine which items to include in the collection so that the total weight is less than or equal to a given limit and the total value is as large as possible.”
One possible solution could be the following code that for each weight W
computes the maximum value that can be composed of items that total weight equals W
.
type item struct {
value int
weight int
}
func knapsack(items []item, limit int) int {
var dp [10000]int
for idx, v := range dp {
for _, item := range items {
if idx+item.weight <= limit {
if dp[idx+item.weight] < v+item.value {
dp[idx+item.weight] = v + item.value
}
}
}
}
result := 0
for idx := 0; idx <= limit; idx++ {
if result < dp[idx] {
result = dp[idx]
}
}
return result
}
The solution seems correct at first glance but when we test it with the following code, it prints 6 while the expected result should be 30. The solution is to take item[0]
5 times.
func TestKnapsack(t *testing.T) {
items := []item{
{value: 6, weight: 2},
{value: 3, weight: 2},
{value: 5, weight: 6},
{value: 4, weight: 5},
{value: 6, weight: 4},
}
limit := 10
t.Log(knapsack(items, limit))
}
The wrong value is a result of the for
loop populating v
from a copy of dp
obtained before the execution of the loop. At that point dp
is filled with 0
so v
is equal to 0
for every iteration of the loop. We can easily fix the problem by passing &dp
to the for
loop and then the result of the test becomes 30
. Below you can see the correct code.
type item struct {
value int
weight int
}
func knapsack(items []item, limit int) int {
var dp [10000]int
for idx, v := range &dp { // Passing pointer here
for _, item := range items {
if idx+item.weight <= limit {
if dp[idx+item.weight] < v+item.value {
dp[idx+item.weight] = v + item.value
}
}
}
}
result := 0
for idx := 0; idx <= limit; idx++ {
if result < dp[idx] {
result = dp[idx]
}
}
return result
}
Conclusions
It might be surprising but a for
loop in Golang creates a copy of the collection it receives and performs the iteration on that copy. This fact has some implications for loops iterating over arrays.
- When iterating over an array it’s faster to pass a pointer to that array to the
for
loop. This prevents the loop from copying the whole array which is beneficial for both CPU and memory usage. - If we want to modify the array while we’re iterating over it, we need to either give a
for
loop a pointer to that array or to use the array index operator directly.