Strings
Basics of string tutorial
Text strings are conventionally interpreted as UTF-8-encoded sequences of Unicode code points (runes).
- In Go, a string is in effect a read-only slice of bytes.
- A string holds arbitrary bytes.
- So, indexing a string yields its bytes, not its characters.
- Source code in Go is defined to be UTF-8 text; no other representation is allowed.
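- So, when we write the string literal "hi", it is encoded as UTF-8 text and stored as bytes. It is tempting to conclude that Go strings are always UTF-8, but they are not: only string literals are guaranteed to be UTF-8, and only when they contain no byte-level escapes such as \xff.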
- In Go, the Unicode code points are called runes.
- The Go language defines the word rune as an alias for the type int32, so programs can be clear when an integer value represents a code point.
- Strings are immutable. As such, a string s and a substring like s[7:] may safely share the same data, so the substring operation is also cheap.
- A raw string literal is written `...`, using backquotes instead of double quotes. No escape sequences are processed inside it.
- Go's range loop, when applied to a string, performs UTF-8 decoding implicitly and yields runes rather than bytes.
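The points above can be checked with a short program. This is a minimal sketch, not a definitive reference: the helper names `byteAt`, `runeCount`, and `isValidUTF8` and the sample string "héllo" are chosen here purely for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAt returns the raw byte at index i. Indexing never decodes UTF-8.
// (Hypothetical helper, introduced only for this example.)
func byteAt(s string, i int) byte { return s[i] }

// runeCount counts code points by letting range decode the UTF-8 bytes.
// (Hypothetical helper, introduced only for this example.)
func runeCount(s string) int {
	n := 0
	for range s {
		n++
	}
	return n
}

// isValidUTF8 reports whether s holds only valid UTF-8 sequences.
func isValidUTF8(s string) bool { return utf8.ValidString(s) }

func main() {
	s := "héllo" // é occupies two bytes in UTF-8: 0xC3 0xA9

	fmt.Println(len(s))              // 6 (bytes)
	fmt.Println(runeCount(s))        // 5 (runes)
	fmt.Printf("%x\n", byteAt(s, 1)) // c3 (first byte of é, not a character)

	// range yields the byte offset and the decoded rune at that offset.
	for i, r := range s {
		fmt.Printf("%d:%c ", i, r)
	}
	fmt.Println() // 0:h 1:é 3:l 4:l 5:o

	// Slicing works on byte offsets and does not modify s.
	fmt.Println(s[3:]) // llo

	// A literal with byte-level escapes can hold bytes that are not UTF-8.
	fmt.Println(isValidUTF8("\xff\xfe")) // false

	// In a raw string literal the backslash is an ordinary character.
	raw := `a\nb`
	fmt.Println(len(raw)) // 4
}
```

Note that the offsets printed by the range loop skip from 1 to 3, because the rune é spans bytes 1 and 2.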
Conversions to and from a string type discussion
int -> string
[]byte -> string
[]rune -> string
string -> []byte
string -> []rune
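Each conversion listed above can be sketched in a few lines. The helper names `intToDecimal` and `codePointToString` are hypothetical and exist only for this example. For int -> string, the sketch assumes the usual intent is decimal text via `strconv.Itoa`; a direct conversion instead treats the integer as a code point.

```go
package main

import (
	"fmt"
	"strconv"
)

// intToDecimal formats n as decimal text, e.g. 65 -> "65".
// (Hypothetical helper, introduced only for this example.)
func intToDecimal(n int) string { return strconv.Itoa(n) }

// codePointToString treats n as a Unicode code point, e.g. 65 -> "A".
// (Hypothetical helper, introduced only for this example.)
func codePointToString(n int) string { return string(rune(n)) }

func main() {
	// int -> string
	fmt.Println(intToDecimal(65))      // 65
	fmt.Println(codePointToString(65)) // A

	// []byte -> string
	b := []byte{'h', 'i'}
	fmt.Println(string(b)) // hi

	// []rune -> string: each rune is encoded as UTF-8.
	r := []rune{'世', '界'}
	fmt.Println(string(r)) // 世界

	// string -> []byte: one element per byte.
	bs := []byte("héllo")
	fmt.Println(len(bs)) // 6

	// string -> []rune: one element per code point.
	rs := []rune("héllo")
	fmt.Println(len(rs)) // 5
}
```

Because strings are immutable, conversions between a string and a slice conceptually copy the data, so changing the resulting slice does not affect the original string.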