Strings
Table of Contents
Basics of string
tutorial
Text strings are conventionally interpreted as UTF-8-encoded sequences of Unicode code points (runes).
- In Go, a string is in effect a read-only slice of bytes.
- A string holds arbitrary bytes.
- So, indexing a string yields its bytes, not its characters.
- Source code in Go is defined to be UTF-8 text; no other representation is allowed.
- So when we write a string literal "hi", it is encoded as UTF-8 text and stored as bytes. It is tempting to conclude that Go strings are always UTF-8, but they are not: only string literals (without byte-level escapes such as \xNN) are guaranteed to be UTF-8.
- In Go, Unicode code points are called runes. The language defines the word rune as an alias for the type int32, so programs can be clear when an integer value represents a code point.
- Strings are immutable. As such, a string s and a substring like s[7:] may safely share the same data, so the substring operation is also cheap.
- A raw string literal is written `...`, using backquotes instead of double quotes. No escape sequences are processed inside it.
- Go's range loop, when applied to a string, performs UTF-8 decoding implicitly: each iteration yields the byte offset of a rune and the rune itself.
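The points above can be checked with a small program. This is only a sketch: the helper names byteAndRuneCounts and runeOffsets are made up for illustration, and the sample string "héllo" was chosen because 'é' takes two bytes in UTF-8.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCounts (illustrative helper) returns the number of bytes
// and the number of runes in s.
func byteAndRuneCounts(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

// runeOffsets (illustrative helper) returns the byte offset at which
// each rune of s starts, as reported by a range loop.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	s := "héllo" // 'é' is U+00E9, encoded as the two bytes 0xC3 0xA9

	nBytes, nRunes := byteAndRuneCounts(s)
	fmt.Println(nBytes, nRunes) // 6 5

	// Indexing yields a single byte, not a character.
	fmt.Printf("%#x\n", s[1]) // 0xc3

	// range decodes UTF-8: i is a byte offset, r is a rune (int32).
	for i, r := range s {
		fmt.Printf("%d:%c ", i, r) // 0:h 1:é 3:l 4:l 5:o
	}
	fmt.Println()

	// A string may hold arbitrary bytes, including invalid UTF-8.
	fmt.Println(utf8.ValidString("\xff")) // false

	// Strings are immutable: s[0] = 'H' would not compile.
	// Slicing does not copy; sub shares the bytes of s.
	sub := s[3:]
	fmt.Println(sub) // llo

	// In a raw string literal the backslash is an ordinary character.
	raw := `a\nb`
	fmt.Println(len(raw)) // 4
}
```

Note that the offsets printed by the range loop skip from 1 to 3, because the rune at offset 1 occupies two bytes.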
Conversions to and from a string type discussion
- int -> string
- []byte -> string
- []rune -> string
- string -> []byte
- string -> []rune
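A minimal sketch of the five conversions listed above. The helper names codePointString and decimalString are hypothetical and exist only to contrast the two things people usually mean by "int -> string": the language conversion treats the integer as a code point, while strconv.Itoa produces its decimal text.

```go
package main

import (
	"fmt"
	"strconv"
)

// codePointString (hypothetical helper) converts an integer to the
// UTF-8 string for that code point. Going through rune makes the
// intent explicit; go vet flags a direct string(n) on an int.
func codePointString(n int) string {
	return string(rune(n))
}

// decimalString (hypothetical helper) returns the decimal text of n.
func decimalString(n int) string {
	return strconv.Itoa(n)
}

func main() {
	// int -> string
	fmt.Println(codePointString(65)) // A
	fmt.Println(decimalString(65))   // 65

	// []byte -> string: the bytes are copied, so later changes
	// to the slice do not affect the string.
	b := []byte{'h', 'i'}
	s := string(b)
	b[0] = 'H'
	fmt.Println(s) // hi

	// []rune -> string: each rune is encoded as UTF-8.
	fmt.Println(string([]rune{0x4e16, 0x754c})) // 世界

	// string -> []byte: the raw UTF-8 bytes.
	fmt.Println([]byte("é")) // [195 169]

	// string -> []rune: the decoded code points.
	fmt.Println([]rune("é")) // [233]
}
```

Conversions between string and []byte or []rune produce a copy, which is what keeps strings immutable even when the slice is modified afterwards.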