Due to being busy with writing company projects recently, I have neglected my study of golang
, so I have started the series of articles titled The Path to Mastering Go
, to document my understanding and notes during the learning process.
About Slices#
Slice, translated into Chinese as 切片
, is an abstract data type provided by Go language based on arrays. In Go, most scenarios that require arrays can be replaced by slices. In Go, slices perfectly replace arrays and provide a more flexible and efficient data sequence access interface.
What Exactly is a Slice?#
Before understanding the slice in Go, we need to first understand arrays.
What is a Slice?#
In Go, an array is a fixed-length continuous sequence that holds elements of the same type. Therefore, Go arrays have two properties: element type
and array length
. Arrays with these two types being the same are equivalent. That is:
var a [8]int
var b [8]int
Here, the array length of a
is 8, and the element type is int, which is the same as b
, so we can say that a
and b
are equivalent.
Go has value semantics, which means that an array variable represents the entire array, not a pointer to the first element of the array (as is done in C). In Go, passing an array uses value copying, which leads to significant performance loss when passing arrays with large element types or many elements.
In C, such scenarios can be handled using array pointer types, but in Go, slices are generally used.
The Relationship Between Slices and Arrays#
The relationship between slices and arrays is similar to that of file descriptors to files. In Go, arrays generally retreat to the "background" as the "underlying" data storage, while slices move to the "foreground," providing an external access interface to the underlying storage.
Thus, slices can be referred to as "descriptors" of arrays. The reason slices can be passed between different functions is due to the characteristics of "descriptors"; regardless of how large the underlying data storage is, the size of the descriptor is always fixed.
Ways to Create Slices and Their Representation in Runtime#
The representation of slices in Go runtime
:
//$GOROOT/src/runtime/slice.go
type slice struct {
array unsafe.Pointer
len int
cap int
}
As we can see, a slice has three elements:
array
: a pointer to an element of the underlying array, which is also the first element of the slice.len
: the length of the slice, which is the current number of elements in the slice.cap
: the maximum capacity of the slice,cap >= len
.
At runtime, each slice is an instance of the runtime.slice
struct, and we can create a slice using the following statement:
s := make([]byte, 5)
We can see that when we create a slice, the compiler automatically creates an underlying array. When we do not specify cap
, it defaults to cap = len
.
We can also create a slice by operating on an existing array, which we call slicing an array
:
u := [10]byte{11, 12, 13, 14, 15, 16, 17, 18, 19, 20}
s := u[3:7]
We can see that the slice s
opens an operational window on the array u
, with s
's first element being u[3]
, and through s
, we can see and operate on 4 array elements. The cap
of the slice depends on the length of the underlying array; from u[3]
to the end, there are 7 storage slots for elements, so s
's cap
is 7.
Of course, we can create multiple slices that operate on the same array, but since the multiple slices are all descriptors of the same array, any modification to any slice will reflect in the other slices.
We can also create new slices by operating on slices, a process known as reslicing
. Similarly, the new slice created from the old slice shares the same underlying array, so modifications to the new slice will also reflect in the old slice.
Thus, when slices are passed as parameters, what is passed is runtime.slice
, so regardless of how large the underlying array is, the performance loss brought by slices is very small and constant, even negligible. This is one reason why slices are usually used instead of arrays in functions. Another reason is that slices can provide more powerful functionality than pointer arrays, such as index access, boundary overflow checks, and dynamic resizing.
Advanced Features of Slices#
Dynamic Resizing#
Slices satisfy the concept of zero-value usability because slices have an advanced feature: dynamic resizing
. Even a zero-value slice can perform element assignment operations through the predefined function append
.
var k []int
k = append(k, 1)
k
is initialized to a zero value, so k
is not bound to a corresponding underlying array. However, after the append
operation, k
is clearly bound to its own underlying array.
Next, I will demonstrate the len
and cap
after each append
through a series of practical operations:
var s []int
s = append(s, 1)
fmt.Println(len(s), cap(s)) // 1 1
s = append(s, 2)
fmt.Println(len(s), cap(s)) // 2 2
s = append(s, 3)
fmt.Println(len(s), cap(s)) // 3 4
s = append(s, 4)
fmt.Println(len(s), cap(s)) // 4 4
s = append(s, 5)
fmt.Println(len(s), cap(s)) // 5 8
It can be seen that the value of len
grows linearly, but cap
grows irregularly. This diagram visually shows how the append
operation allows slices to dynamically resize:
- We can see that initially
s
is a zero value, and at this point,s
has not yet bound to an underlying array. - After
append
, an element 11 is added to slices
, which allocates an underlying arrayu1
with a length of 1, and then pointss
's internalarray
tou1
, so at this point,len = 1, cap = 1
. - Then, when
append
is called again to insert 12 into slices
, the slice clearly has no space to store 12. Therefore, it needs to resize again. A new underlying arrayu2
is created with a length of 2 (u1 array length
* 2), and all elements fromu1
are copied tou2
, finally pointingarray
tou2
, and settinglen = 2, cap = 2
. - When
append
is called again to insert an element 13 intos
, the slice space is insufficient again, requiring another resize. Thus, a new underlying arrayu3
is created with a length of 4 (u2 array length
* 2), and all elements fromu2
are copied tou3
, thenarray
is pointed tou3
, andlen = 3, cap = 4
is set. - When
append
is called again to insert element 14 intos
, the slicecap = 4
, and there is enough space. Therefore, 14 is placed in the next element position, which is the end ofu3
, ands
's internallen
is incremented by 1, becoming 4. - Finally, when
append
is called again to insert 15 intos
, the slicelen = 4, cap = 4
, and space is insufficient again, requiring another resize. Thus, a new underlying arrayu4
is created with a length of 8 (u3 array length
* 2), and all elements fromu3
are copied tou4
, thenarray
is pointed tou4
, andlen = 5, cap = 8
is set.
It can be observed that when the underlying array space of the slice is insufficient, append
will automatically allocate the slice to a new underlying array as needed. The new array length will be expanded according to a certain algorithm (see the growslice
function in $GOROOT/src/runtime/slice.go
). After the new array is established, all data from the old array is copied to the new array, and then array
points to the new array, making the new array the underlying array of the slice, while the old array will be reclaimed by gc
.
However, if we create a slice using slicing an array
, once the slice's cap
touches the upper boundary and we perform an append
operation on the slice, then the slice will be unbound from the original array.
u := [6]int{1, 2, 3, 4, 5, 6}
s := u[2:4]
fmt.Printf("u = %v, s = %v len(s) = %d, cap(s) = %d\n", u, s, len(s), cap(s))
s = append(s, 7)
fmt.Printf("append 7: u = %v, s = %v len(s) = %d, cap(s) = %d\n", u, s, len(s), cap(s))
s = append(s, 8)
fmt.Printf("append 8: u = %v, s = %v len(s) = %d, cap(s) = %d\n", u, s, len(s), cap(s))
s = append(s, 9)
fmt.Printf("append 9: u = %v, s = %v len(s) = %d, cap(s) = %d\n", u, s, len(s), cap(s))
// output:
// u = [1 2 3 4 5 6], s = [3 4] len(s) = 2, cap(s) = 4
// append 7: u = [1 2 3 4 7 6], s = [3 4 7] len(s) = 3, cap(s) = 4
// append 8: u = [1 2 3 4 7 8], s = [3 4 7 8] len(s) = 4, cap(s) = 4
// append 9: u = [1 2 3 4 7 8], s = [3 4 7 8 9] len(s) = 5, cap(s) = 8
// append 10: u = [1 2 3 4 7 8], s = [3 4 7 8 9 10] len(s) = 6, cap(s) = 8
From the output results, it can be seen that after adding element 9, cap(s)
equals 8, which triggers the dynamic resizing of the slice, and at this point, the slice s
is no longer bound to the array u
, meaning s
is no longer a descriptor of u
.
Try to Use the Cap Parameter When Creating Slices#
The append
operation is a powerful tool that allows slices to support the concept of zero-value usability and makes slices easier to use.
However, from the principle, we can see that each time resizing occurs, all elements from the old underlying array need to be copied to the new underlying array. The resources consumed in this step are still considerable, especially when there are many elements. So how can we avoid the costs of excessive memory allocation and copying elements?
An effective solution is to consciously estimate the required capacity of the slice when writing code and pass the cap
parameter to the built-in function make
when creating the slice.
s := make([]T, len, cap)
Below is a performance test for two situations:
const sliceSize = 10000
func BenchmarkSliceInitWithoutCap(b *testing.B) {
for n := 0; n < b.N; n++ {
sl := make([]int, 0)
for i := 0; i < sliceSize; i++ {
sl = append(sl, i)
}
}
}
func BenchmarkSliceInitWithCap(b *testing.B) {
for n := 0; n < b.N; n++ {
sl := make([]int, 0, sliceSize)
for i := 0; i < sliceSize; i++ {
sl = append(sl, i)
}
}
}
The results are:
goos: windows
goarch: amd64
pkg: prometheus_for_go
cpu: AMD Ryzen 7 5800H with Radeon Graphics
BenchmarkSliceInitWithoutCap
BenchmarkSliceInitWithoutCap-16 34340 34443 ns/op
BenchmarkSliceInitWithCap
BenchmarkSliceInitWithCap-16 106251 11122 ns/op
PASS
It can be seen that using the cap
parameter for slices has an average performance of 11122 ns/op
during append
, which is about three times that of not using the cap
parameter.
Therefore, it is recommended to include the cap
parameter when creating slices.