Int vs. interface: Generic Containers in Go

Apr 21 2019

Go is a language expertly designed to troll programming language weenies. A particular point of contention has been Go's lack of generics and its consequences for implementing container types. The standard technique for implementing generic containers is to use interface{} as the value type. Users of the container cast values from interface{} back to their concrete types. Here's an illustrative comparison of map and sync.Map:

// regular built-in map
myMap := make(map[string]int)
myMap["foo"] = 97
iv, _ := myMap["foo"]

// sync.Map
var mySyncMap sync.Map
mySyncMap.Store("foo", 97)
sv, _ := mySyncMap.Load("foo")
siv := sv.(int)

I recently wrote a Go implementation of an indexable skip list. Rather than using interface{} as the value type for the skip list, I decided to try a different tack. The idea is to use int as the value type and store indices into a slice. For example, to create an indexable skip list holding strings, you append each string to a slice and then store the index of the string in the skip list:

type MyStruct struct {
    x int
    y int
    z string
}
a := make([]MyStruct, 1000)
var sl iskiplist.ISkipList

// append an element to the skip list
newElement := MyStruct{1, 2, "foo"}
a = append(a, newElement)
sl.PushBack(len(a)-1)

// retrieve the element at index 100
elem := a[sl.At(100)]
fmt.Printf("Element 100: %v, %v, %v\n", elem.x, elem.y, elem.z)

The biggest disadvantage of this method is that it makes removing elements from the container more complicated. Removing an index still leaves the referenced value present in the underlying slice. Storage can be fully reclaimed only by periodically compacting the slice and making corresponding updates to the indices stored in the container.

Aside from that, it's all gravy:

Go is probably going to get generics soon. Generics have many applications besides containers, but I'm quite pleased with the int workaround for now.

Performance analysis

I did an unscientific benchmark to compare the performance of the index method to the interface{} method. As expected, the index method performed a little better.

I was surprised to find that casting an interface{} was just as fast as accessing an element of a slice. It was the cost of converting values to interface{} in the first place that was responsible for the performance difference.1

This result made me curious to compare the implementations of a slice access and a cast. The following listings (Intel syntax) were derived by disassembling an executable on OS X. You can also take a look at the code on godbolt – if you can parse the Go compiler's idiosyncratic assembly syntax!

Disassembly of a slice access

func sliceAccess(sl []int) int {
    return sl[10]
}
00:  sub  rsp, 8
04:  mov  qword ptr [rsp], rbp
08:  lea  rbp, qword ptr [rsp]
0c:  mov  rax, qword ptr [rsp + 0x18]
11:  cmp  rax, 0xa
15:  jbe  0x2e ; -------------------------
17:  mov  rax, qword ptr [rsp + 0x10] ;  |
1c:  mov  rax, qword ptr [rax + 0x50] ;  |
20:  mov  qword ptr [rsp + 0x28], rax ;  |
25:  mov  rbp, qword ptr [rsp] ;         |
29:  add  rsp, 8 ;                       |
2d:  ret ;                               |
2e:  call 0xfffffffffff92050 ; ----------- (call to error handler)

A Go slice is effectively a struct with three fields:

In the code above, the length field of the slice is moved into rax and compared against 10 (= 0xA). If the comparison goes the wrong way, the code jumps to a call to an error handler.

Disassembly of a cast

func cast(i interface{}) int {
    return i.(int)
}
00:  mov  rcx, qword ptr gs:[0x30]
09:  cmp  rsp, qword ptr [rcx + 0x10]
0d:  jbe  0x61
0f:  sub  rsp, 0x20
13:  mov  qword ptr [rsp + 0x18], rbp
18:  lea  rbp, qword ptr [rsp + 0x18]
     ;;; load type tag of interface{} val into rax
     ;;; (The interface{} value lives on the stack
     ;;; at rsp+0x28, so the address of its first
     ;;; field is rsp+0x28.)
1d:  mov  rax, qword ptr [rsp + 0x28]
     ;;; Load type tag of 'int' into rcx. This is
     ;;; just the address of something in a read only
     ;;; section. The lea instruction is being used to
     ;;; specify this address in a position-independent
     ;;; manner (relative to the instruction pointer).
22:  lea  rcx, qword ptr [rip + 0x10507]
     ;;; compare the two type tags
29:  cmp  rax, rcx
     ;;; jump to an error handler call if they're not equal
2c:  jne  0x45 ; -----------------------------------------
2e:  mov  rax, qword ptr [rsp + 0x30] ;                  |
33:  mov  rax, qword ptr [rax] ;                         |
36:  mov  qword ptr [rsp + 0x38], rax ;                  |
3b:  mov  rbp, qword ptr [rsp + 0x18] ;                  |
40:  add  rsp, 0x20 ;                                    |
44:  ret ;                                               |
     ;;; set up arguments to error handler               |
45:  mov  qword ptr [rsp], rax ; -------------------------
49:  mov  qword ptr [rsp + 8], rcx
4e:  lea  rax, qword ptr [rip + 0x1755b]
55:  mov  qword ptr [rsp + 0x10], rax
     ;;; call the error handler for the case where
     ;;; the interface{} doesn't hold an int.
5a:  call 0xfffffffffff73b30
5f:  ud2
61:  call 0xfffffffffffba9d0

An interface{} is represented as a pair of values. The first determines the type; the second is a pointer to the relevant data. To cast an interface{} to a type T, a comparison is made between the type tag of the interface and the tag associated with T. If the two are not equal, an error is raised. If they are equal, then it is just a matter of dereferencing a *T pointer.

In effect, then, a slice access and a cast involve the same sequence of operations:

This explains why they have a similar cost.

Prior art

The technique of storing indices instead of pointers is often used in Rust. Rust has generics up the wazoo, but storing pointers in graph-like data structures gets complicated due to the language's strictly-enforced ownership and lifetime rules. Nicholas D. Matsakis gives an interesting breakdown of the advantages and disadvantages of this method. Many of the points made are relevant to the present case.

Conclusions

Storing indices in a container can often work better than storing interface{} values. Performance, memory footprint and type safety are all improved.

Casting interface{} values back to concrete types is very fast.

Boxing a value as an interface{} is slower than appending it to a slice.

Note

1. It is significantly cheaper to append a value to a slice than it is to box the value as an interface{}. This is not surprising, since a separate allocation is required for each boxing, whereas extra capacity is allocated for slices in chunks.