Due to being busy with writing company projects recently, I have neglected my study of golang, so I am starting a series of articles called The Path to Mastering Go, to document my understanding and notes during the learning process.
About Slices#
Slice, translated into Chinese as 切片, is an abstract data type based on arrays provided in Go language. In Go, most scenarios that require arrays can be replaced by slices. In Go, slices perfectly replace arrays and provide a more flexible and efficient data sequence access interface.
What Exactly is a Slice?#
Before understanding the slice in Go, we need to first understand arrays.
What is a Slice?#
In Go, an array is a fixed-length continuous sequence that holds elements of the same type. Therefore, a Go array has two properties: element type and array length. Arrays that have the same type for both properties are considered equivalent. That is:
var a [8]int
var b [8]int
Here, the array length of a is 8, and the element type is int, which is the same as b, so we can say that a and b are equivalent.
Go has value semantics, which means that an array variable represents the entire array, not a pointer to the first element of the array (which is the approach in C). In Go, passing arrays uses value copying, which leads to significant performance loss if we need to pass arrays with large element types or many elements.
In C, such scenarios can be handled using array pointer types, but in Go, we generally use slices.
The Relationship Between Slices and Arrays#
The relationship between slices and arrays is similar to that of file descriptors to files. In Go, arrays generally retreat to the "background" as the "underlying" data storage, while slices come to the "foreground," providing an external access interface to the underlying storage.
Thus, slices can be referred to as "descriptors" of arrays. The reason slices can be passed freely between different functions is due to the characteristic of "descriptors"; regardless of how large the underlying data storage is, the size of the descriptor is always fixed.
Ways to Create Slices and Their Representation in Runtime#
The representation of slices in Go runtime is:
//$GOROOT/src/runtime/slice.go
type slice struct {
array unsafe.Pointer
len int
cap int
}
As we can see, a slice has three elements:
array: a pointer to an element of the underlying array, which is also the first element of the slice.len: the length of the slice, which is the number of elements in the current slice.cap: the maximum capacity of the slice,cap >= len.
At runtime, each slice is an instance of the runtime.slice structure, and we can create a slice using the following statement:
s := make([]byte, 5)
We can see that when we create a slice, the compiler automatically creates an underlying array. When we do not specify cap, it defaults to cap = len.
We can also create a slice by operating on an existing array, which we call slicing an array:
u := [10]byte{11, 12, 13, 14, 15, 16, 17, 18, 19, 20}
s := u[3:7]
As we can see, the slice s opens a window of operation on the array u, with s's first element being u[3], and through s, we can see and operate on 4 array elements. The cap of the slice depends on the length of the underlying array; from u[3] to the end, there are 7 slots for storing elements, so s's cap is 7.
Of course, we can create multiple slices to operate on an array, but since the multiple created slices are all descriptors of the same array, any modification to any slice will reflect in the other slices.
We can also create new slices by operating on slices, a process called reslicing. Similarly, the new slice created based on the old slice shares the same underlying array, so modifications to the new slice will also reflect in the old slice.
Thus, when slices are passed as parameters, what is passed is runtime.slice, so regardless of how large the underlying array is, the performance overhead brought by slices is very small and constant, even negligible. This is one reason why slices are usually used in functions instead of arrays. Another reason is that slices can provide more powerful functionalities than pointer arrays, such as index access, boundary overflow checks, and dynamic resizing.
Advanced Features of Slices#
Dynamic Resizing#
Slices satisfy the concept of zero-value usability because slices have an advanced feature: dynamic resizing. Even a zero-value slice can perform element assignment operations through the predefined append function.
var k []int
k = append(k, 1)
k is initialized to a zero value, so k is not bound to a corresponding underlying array. However, after the append operation, k is clearly bound to its own underlying array.
Next, I will demonstrate the len and cap after each append through a series of practical operations:
var s []int
s = append(s, 1)
fmt.Println(len(s), cap(s)) // 1 1
s = append(s, 2)
fmt.Println(len(s), cap(s)) // 2 2
s = append(s, 3)
fmt.Println(len(s), cap(s)) // 3 4
s = append(s, 4)
fmt.Println(len(s), cap(s)) // 4 4
s = append(s, 5)
fmt.Println(len(s), cap(s)) // 5 8
It can be seen that the value of len grows linearly, but cap grows irregularly. This diagram vividly illustrates how the append operation allows slices to dynamically resize:
- We can see that initially,
sis a zero value, and at this point,shas not yet bound to an underlying array. - After the
append, an element 11 is added to slices, which allocates an underlying arrayu1with a length of 1, and then pointss's internalarraytou1, so at this point,len = 1, cap = 1. - Then, again
appendis called to insert 12 into slices, but at this point, the slice clearly has no space to store 12. Therefore, it needs to resize again. A new underlying arrayu2is created with a length of 2 (u1 array length* 2), and all elements fromu1are copied tou2, thenarrayis pointed tou2, andlen = 2, cap = 2is set. - By
append, another element 13 is inserted intos, but the slice space is insufficient again, requiring another resize. Thus, a new underlying arrayu3is created with a length of 4 (u2 array length* 2), and all elements fromu2are copied tou3, thenarrayis pointed tou3, andlen = 3, cap = 4is set. - By
append, an element 14 is inserted intos, and at this point, the slicecap = 4, and space is sufficient. Therefore, 14 is placed in the next element position, which is the end ofu3, ands's internallenis incremented by 1, becoming 4. - Finally, again by
append, 15 is inserted intos, and at this point, the slicelen = 4, cap = 4, and space is insufficient, requiring another resize. Thus, a new underlying arrayu4is created with a length of 8 (u3 array length* 2), and all elements fromu3are copied tou4, thenarrayis pointed tou4, andlen = 5, cap = 8is set.
It can be observed that when the underlying array space of the slice is insufficient, append will automatically allocate the slice to a new underlying array as needed. The new array length will be expanded according to a certain algorithm (refer to the growslice function in $GOROOT/src/runtime/slice.go). After the new array is established, all data from the old array is copied to the new array, and then array points to the new array, making the new array the underlying array of the slice, while the old array will be reclaimed by gc.
However, if we create a slice using slicing an array, once the slice's cap touches the upper boundary, performing an append operation on the slice will unbind the slice from the original array.
u := [6]int{1, 2, 3, 4, 5, 6}
s := u[2:4]
fmt.Printf("u = %v, s = %v len(s) = %d, cap(s) = %d\n", u, s, len(s), cap(s))
s = append(s, 7)
fmt.Printf("append 7: u = %v, s = %v len(s) = %d, cap(s) = %d\n", u, s, len(s), cap(s))
s = append(s, 8)
fmt.Printf("append 8: u = %v, s = %v len(s) = %d, cap(s) = %d\n", u, s, len(s), cap(s))
s = append(s, 9)
fmt.Printf("append 9: u = %v, s = %v len(s) = %d, cap(s) = %d\n", u, s, len(s), cap(s))
// output:
// u = [1 2 3 4 5 6], s = [3 4] len(s) = 2, cap(s) = 4
// append 7: u = [1 2 3 4 7 6], s = [3 4 7] len(s) = 3, cap(s) = 4
// append 8: u = [1 2 3 4 7 8], s = [3 4 7 8] len(s) = 4, cap(s) = 4
// append 9: u = [1 2 3 4 7 8], s = [3 4 7 8 9] len(s) = 5, cap(s) = 8
// append 10: u = [1 2 3 4 7 8], s = [3 4 7 8 9 10] len(s) = 6, cap(s) = 8
From the output, it can be seen that after adding element 9, cap(s) equals 8, which triggers the dynamic resizing of the slice, and at this point, the slice s is already unbound from the array u, meaning s is no longer a descriptor of u.
Preferably Use the Cap Parameter to Create Slices#
The append operation is a powerful tool that allows slices to support the concept of zero-value usability and makes slices easier to use.
However, from the principle, we can see that each time a resize occurs, all elements from the old underlying array need to be copied to the new underlying array. The resources consumed in this step are still considerable, especially when there are many elements. So how can we avoid the cost of excessive memory allocation and copying elements?
An effective solution is to consciously estimate the required capacity of the slice while writing code and pass the cap parameter to the built-in function make when creating the slice.
s := make([]T, len, cap)
Below is a performance test for two scenarios:
const sliceSize = 10000
func BenchmarkSliceInitWithoutCap(b *testing.B) {
for n := 0; n < b.N; n++ {
sl := make([]int, 0)
for i := 0; i < sliceSize; i++ {
sl = append(sl, i)
}
}
}
func BenchmarkSliceInitWithCap(b *testing.B) {
for n := 0; n < b.N; n++ {
sl := make([]int, 0, sliceSize)
for i := 0; i < sliceSize; i++ {
sl = append(sl, i)
}
}
}
The results are:
goos: windows
goarch: amd64
pkg: prometheus_for_go
cpu: AMD Ryzen 7 5800H with Radeon Graphics
BenchmarkSliceInitWithoutCap
BenchmarkSliceInitWithoutCap-16 34340 34443 ns/op
BenchmarkSliceInitWithCap
BenchmarkSliceInitWithCap-16 106251 11122 ns/op
PASS
It can be seen that using the cap parameter for slices during append has an average performance of 11122 ns/op, which is about three times that of not using the cap parameter.
Therefore, it is recommended to include the cap parameter when creating slices.