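Effective Go Notes

Posted on 2019-02-18, tagged go

This post collects notes from Effective Go, mainly covering Go syntax, techniques, and style. To stay as close as possible to the original meaning, the points are recorded in English.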

Formatting

The gofmt program (also available as go fmt, which operates at the package level rather than source file level) reads a Go program and emits the source in a standard style of indentation and vertical alignment, retaining and if necessary reformatting comments.

Commentary

Every package should have a package comment, a block comment preceding the package clause. For multi-file packages, the package comment only needs to be present in one file, and any one will do. The package comment should introduce the package and provide information relevant to the package as a whole, for example
/*
Package regexp implements a simple library for regular expressions.

The syntax of the regular expressions accepted is:
.......
*/
package regexp
Doc comments work best as complete sentences, which allow a wide variety of automated presentations. The first sentence should be a one-sentence summary that starts with the name being declared.
// Compile parses a regular expression and returns, if successful,
// a Regexp that can be used to match against text.
func Compile(str string) (*Regexp, error) {

Grouping variables can indicate relationships between items, such as the fact that a set of variables is protected by a mutex.

var (
    countLock   sync.Mutex
    inputCount  uint32
    outputCount uint32
    errorCount  uint32
)

Names

- By convention, packages are given lower case, single-word names; there should be no need for underscores or mixedCaps.
- Another convention is that the package name is the base name of its source directory; the package in "src/encoding/base64" is imported as "encoding/base64" but has name base64, not encoding_base64 and not encodingBase64.
- By convention, one-method interfaces are named by the method name plus an -er suffix or similar modification to construct an agent noun: Reader, Writer, Formatter, CloseNotifier.
- The convention in Go is to use MixedCaps or mixedCaps rather than underscores to write multiword names.

Control structures

if and switch accept an optional initialization statement like that of for.

if err := file.Chmod(0664); err != nil {
    log.Print(err)
    return err
}

In a := declaration, a variable v may appear even if it has already been declared, provided that the earlier declaration is in the same scope and at least one other variable in the declaration is being declared anew. Otherwise the compiler reports the error "no new variables on left side of :=".
// legal for err
f, err := os.Open(name)
d, err := f.Stat()

// not legal for a
a := 1
a := 2

// not legal for a and b
a, b := 1, 1
a, b := 2, 2

For strings, range breaks out individual Unicode code points by parsing the UTF-8. Erroneous encodings consume one byte and produce the replacement rune U+FFFD. (A rune is Go terminology for a single Unicode code point, loosely comparable to a character type in other languages.)

for pos, char := range "日本\x80語" { // \x80 is an illegal UTF-8 encoding
    fmt.Printf("character %#U starts at byte position %d\n", char, pos)
}
/* output
character U+65E5 '日' starts at byte position 0
character U+672C '本' starts at byte position 3
character U+FFFD '�' starts at byte position 6
character U+8A9E '語' starts at byte position 7
*/

If the switch has no expression, it switches on true. It's therefore possible, and idiomatic, to write an if-else-if-else chain as a switch.

func unhex(c byte) byte {
    switch {
    case '0' <= c && c <= '9':
        return c - '0'
    case 'a' <= c && c <= 'f':
        return c - 'a' + 10
    case 'A' <= c && c <= 'F':
        return c - 'A' + 10
    }
    return 0
}

Functions

Deferred functions are executed in LIFO order (imagine a stack), so the following code will cause 4 3 2 1 0 to be printed when the function returns.
for i := 0; i < 5; i++ {
    defer fmt.Printf("%d ", i)
}

The arguments to a deferred function are evaluated when the defer statement executes, not when the deferred call executes.

func trace(s string) string {
    fmt.Println("entering:", s)
    return s
}

func un(s string) {
    fmt.Println("leaving:", s)
}

func a() {
    defer un(trace("a"))
    fmt.Println("in a")
}

func b() {
    defer un(trace("b"))
    fmt.Println("in b")
    a()
}

func main() {
    b()
}

/*output
entering: b
in b
entering: a
in a
leaving: a
leaving: b
*/

Data

new vs. make

Go has two allocation primitives, the built-in functions new and make.
new(T) allocates zeroed storage for a new item of type T and returns its address, that is, a pointer of type *T to a newly allocated zero value of type T. It does not otherwise initialize the memory; it only zeros it. It helps to design your data structures so that the zero value of each type can be used without further initialization.
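
As a small sketch of a usable zero value: the SyncedBuffer type below pairs a sync.Mutex with a bytes.Buffer, both of which are ready to use when zeroed. The Write method is a hypothetical helper added for illustration only.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// SyncedBuffer needs no constructor: the zero values of both fields are usable.
type SyncedBuffer struct {
	lock   sync.Mutex
	buffer bytes.Buffer
}

// Write appends s under the lock and returns the total length so far.
// (Hypothetical helper, added only for this sketch.)
func (b *SyncedBuffer) Write(s string) int {
	b.lock.Lock()
	defer b.lock.Unlock()
	b.buffer.WriteString(s)
	return b.buffer.Len()
}

func main() {
	p := new(SyncedBuffer) // type *SyncedBuffer, zeroed, usable immediately
	var v SyncedBuffer     // type SyncedBuffer, also usable immediately
	fmt.Println(p.Write("hello"), v.Write("hi")) // 5 2
}
```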

Sometimes the zero value isn't good enough and an initializing constructor is necessary, as in this example derived from package os

func NewFile(fd int, name string) *File {
    if fd < 0 {
        return nil
    }
    f := new(File)
    f.fd = fd
    f.name = name
    f.dirinfo = nil
    f.nepipe = 0
    return f
}

We can simplify it using a composite literal, which is an expression that creates a new instance each time it is evaluated (File{fd, name, nil, 0} in the following code). If a composite literal contains no fields at all, it creates a zero value for the type, so the expressions new(File) and &File{} are equivalent.
func NewFile(fd int, name string) *File {
    if fd < 0 {
        return nil
    }
    f := File{fd, name, nil, 0}
    return &f
}
Note that unlike in C, it's perfectly OK to return the address of a local variable; the storage associated with the variable survives after the function returns.
make(T, args) serves a purpose different from new(T). It creates slices, maps, and channels only, and it returns an initialized (not zeroed) value of type T (not *T).
The reason for the distinction is that these three types (slices, maps, and channels) represent, under the covers, references to data structures that must be initialized before use.
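
The difference can be seen on a slice. This is a minimal sketch; describe is a hypothetical helper name, not something from the original text.

```go
package main

import "fmt"

// describe reports what new and make produce for a slice type.
func describe() (newIsNil bool, madeLen, madeCap int) {
	p := new([]int)           // *[]int pointing at a nil slice; rarely useful
	v := make([]int, 10, 100) // initialized slice: length 10, capacity 100
	return *p == nil, len(v), cap(v)
}

func main() {
	fmt.Println(describe()) // true 10 100
}
```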

array vs. slice

There are major differences between the ways arrays work in Go and C. In Go,
- Arrays are values. Assigning one array to another copies all the elements.
- In particular, if you pass an array to a function, it will receive a copy of the array, not a pointer to it.
- The size of an array is part of its type. The types [10]int and [20]int are distinct
The value property can be useful but also expensive; if you want C-like behavior and efficiency, you can pass a pointer to the array
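
A short sketch of the value semantics described above. The function names zeroFirstCopy and zeroFirstPtr are made up for illustration.

```go
package main

import "fmt"

// zeroFirstCopy receives a copy of the array; the caller's array is untouched.
func zeroFirstCopy(a [3]int) { a[0] = 0 }

// zeroFirstPtr receives a pointer, so the caller's array is modified.
func zeroFirstPtr(a *[3]int) { a[0] = 0 }

func main() {
	x := [3]int{1, 2, 3}
	y := x // assignment copies all three elements
	y[1] = 20
	zeroFirstCopy(x)
	fmt.Println(x, y) // [1 2 3] [1 20 3]
	zeroFirstPtr(&x)
	fmt.Println(x) // [0 2 3]
}
```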

A slice does not store any data; it just describes a section of an underlying array. If you assign one slice to another, both refer to the same array.

names := [4]string{
	"John",
	"Paul",
	"George",
	"Ringo",
}

a := names[0:2]
b := names[1:3]

b[0] = "XXX"
fmt.Println(a, b)
fmt.Println(names)
// output
// [John XXX] [XXX George]
// [John XXX George Ringo]

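If a function takes a slice argument, changes it makes to the elements of the slice are visible to the caller, but appended elements are not: the slice header (pointer, length, capacity) is passed by value, so the caller's length never changes. To append inside a function, return the resulting slice or pass a pointer to the slice.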
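
The following sketch contrasts the cases. All four function names are hypothetical, chosen only to label each behavior.

```go
package main

import "fmt"

// setFirst changes an element of the shared underlying array.
func setFirst(s []int) { s[0] = 99 }

// appendLocal appends to its own copy of the slice header;
// the caller's length is unchanged.
func appendLocal(s []int) { s = append(s, 4) }

// appendReturn returns the grown slice so the caller can reassign it.
func appendReturn(s []int) []int { return append(s, 4) }

// appendPtr updates the caller's slice header through a pointer.
func appendPtr(s *[]int) { *s = append(*s, 5) }

func main() {
	x := []int{1, 2, 3}
	setFirst(x)
	appendLocal(x)
	fmt.Println(x) // [99 2 3]
	x = appendReturn(x)
	appendPtr(&x)
	fmt.Println(x) // [99 2 3 4 5]
}
```

Returning the slice is the more common style; it mirrors how the built-in append itself is used.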

Slices are variable-length, so in a two-dimensional slice each inner slice can have a different length.

type LinesOfText [][]byte // A slice of byte slices.

text := LinesOfText{
	[]byte("Now is the time"),
	[]byte("for all good gophers"),
	[]byte("to bring some fun to the party."),
}

map

For a Go map of type map[KeyType]ValueType, KeyType may be any comparable type, such as integers, floating point and complex numbers, strings, pointers, and interfaces (as long as the dynamic type supports equality). Slices cannot be used as map keys, because equality is not defined on them. ValueType may be any type at all, including another map.
hits := make(map[string]map[string]int)
n := hits["/doc/"]["au"]
Like slices, maps hold references to an underlying data structure. If you pass a map to a function that changes the contents of the map, the changes will be visible in the caller.
An attempt to fetch a map value with a key that is not present in the map will return the zero value for the type of the entries in the map. The zero value is:
- 0 for numeric types,
- false for the boolean type
- "" (the empty string) for strings.

To check whether a key is present in a map, use the "comma ok" form:

if val, ok := dict["foo"]; ok {
    //do something here
}
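
A brief sketch of why the second result matters: a stored zero and a missing key both yield 0, and only ok tells them apart. The lookup helper is hypothetical.

```go
package main

import "fmt"

// lookup returns the value for key and whether the key was present.
func lookup(m map[string]int, key string) (int, bool) {
	v, ok := m[key]
	return v, ok
}

func main() {
	hits := map[string]int{"/doc/": 0}
	fmt.Println(lookup(hits, "/doc/"))    // 0 true: present, value happens to be zero
	fmt.Println(lookup(hits, "/missing")) // 0 false: absent, zero value returned

	nested := make(map[string]map[string]int)
	fmt.Println(nested["/doc/"]["au"]) // 0: reading through a missing inner map is safe
}
```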

Methods

Methods can be defined for any named type (except a pointer or an interface); the receiver does not have to be a struct.

type ByteSlice []byte

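func (slice ByteSlice) Append(data []byte) []byte {
    // Effective Go reuses the body of its earlier hand-written Append function here;
    // as a stand-in, the built-in append produces the same result.
    return append(slice, data...)
}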

The rule about pointers vs. values for receivers is that value methods can be invoked on pointers and values, but pointer methods can only be invoked on pointers.
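
One refinement worth noting: when a value is addressable, the language inserts the address operator automatically, so calling a pointer method on a variable still works. The sketch below uses a hypothetical Counter type and Incer interface to show both the shortcut and the interface case where the rule applies strictly.

```go
package main

import "fmt"

type Counter struct{ n int }

// Value is a value method: callable on Counter and *Counter.
func (c Counter) Value() int { return c.n }

// Inc is a pointer method: it needs an address so it can modify the receiver.
func (c *Counter) Inc() { c.n++ }

// Incer is satisfied only by *Counter, because Inc has a pointer receiver.
type Incer interface{ Inc() }

func main() {
	var c Counter
	c.Inc() // allowed: c is addressable, so this is treated as (&c).Inc()
	p := &c
	p.Inc()
	fmt.Println(c.Value(), p.Value()) // 2 2

	var i Incer = &c // assigning a plain Counter value here would not compile
	i.Inc()
	fmt.Println(c.Value()) // 3
}
```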

Interfaces and other types

Interfaces

An interface is defined as a set of method signatures, and a type implements an interface by implementing its methods. A type can implement multiple interfaces

type Sequence []int

// Methods required by sort.Interface.
func (s Sequence) Len() int {
    return len(s)
}
func (s Sequence) Less(i, j int) bool {
    return s[i] < s[j]
}
func (s Sequence) Swap(i, j int) {
    s[i], s[j] = s[j], s[i]
}
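
Because Sequence has these three methods, it can be handed to sort.Sort directly. A minimal usage sketch follows; the Sorted helper is an assumption added for illustration, not part of the original.

```go
package main

import (
	"fmt"
	"sort"
)

type Sequence []int

func (s Sequence) Len() int           { return len(s) }
func (s Sequence) Less(i, j int) bool { return s[i] < s[j] }
func (s Sequence) Swap(i, j int)      { s[i], s[j] = s[j], s[i] }

// Sorted returns a sorted copy, leaving the receiver untouched.
func (s Sequence) Sorted() Sequence {
	c := make(Sequence, len(s))
	copy(c, s)
	sort.Sort(c) // accepted because Sequence implements sort.Interface
	return c
}

func main() {
	s := Sequence{3, 1, 2}
	fmt.Println(s.Sorted(), s) // [1 2 3] [3 1 2]
}
```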

You can define your own interface and a value of interface type can hold any value that implements those methods.

type Abser interface {
	Abs() float64
}

type MyFloat float64

func (f MyFloat) Abs() float64 {
	if f < 0 {
		return float64(-f)
	}
	return float64(f)
}

func main() {
	var a Abser
	f := MyFloat(-math.Sqrt2)
	a = f  // a MyFloat implements Abser
	fmt.Println(a.Abs())
}

Concurrency

Concurrent programming in many environments is made difficult by the subtleties required to implement correct access to shared variables.
Go encourages a different approach in which shared values are passed around on channels and, in fact, never actively shared by separate threads of execution. Only one goroutine has access to the value at any given time.
This approach can be taken too far: reference counts, for instance, may be best done by putting a mutex around an integer variable. But as a high-level approach, using channels to control access makes it easier to write clear, correct programs.

Goroutines

A goroutine has a simple model: it is a function executing concurrently with other goroutines in the same address space.
Prefix a function or method call with the go keyword to run the call in a new goroutine. When the call completes, the goroutine exits silently; the caller does not wait for it.

Channels

Like maps, channels are allocated with make, and the resulting value acts as a reference to an underlying data structure.

There are lots of nice idioms using channels. For example, suppose we launch a sort in the background and want to do something else while waiting for it to finish. A channel lets the launching goroutine wait for the sort to complete:

c := make(chan int)  // Allocate a channel.
// Start the sort in a goroutine; when it completes, signal on the channel.
go func() {
    list.Sort()
    c <- 1  // Send a signal; value does not matter.
}()
doSomethingForAWhile()
<-c   // Wait for sort to finish; discard sent value.

The above code works because receivers always block until there is data to receive. As for the sender, if the channel is unbuffered, the sender blocks until the receiver has received the value. If the channel has a buffer, the sender blocks only until the value has been copied to the buffer; if the buffer is full, this means waiting until some receiver has retrieved a value.
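
To make these blocking rules visible without deadlocking, the sketch below probes each case with a non-blocking send (select with a default branch). trySend is a hypothetical helper; "false" means a plain send would have blocked at that moment.

```go
package main

import "fmt"

// trySend attempts a non-blocking send and reports whether it succeeded.
func trySend(c chan int, v int) bool {
	select {
	case c <- v:
		return true
	default:
		return false
	}
}

func main() {
	unbuffered := make(chan int)
	buffered := make(chan int, 1)

	fmt.Println(trySend(unbuffered, 1)) // false: no receiver is waiting
	fmt.Println(trySend(buffered, 1))   // true: the value is copied into the buffer
	fmt.Println(trySend(buffered, 2))   // false: the buffer is full
	<-buffered                          // a receiver retrieves one value
	fmt.Println(trySend(buffered, 3))   // true: there is room again
}
```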

A buffered channel can be used like a semaphore, for instance to limit throughput. In the following example, incoming requests are passed to handle, which sends a value into the channel, processes the request, and then receives a value from the channel to ready the “semaphore” for the next consumer. The capacity of the channel buffer limits the number of simultaneous calls to process.

var sem = make(chan int, MaxOutstanding)

func handle(r *Request) {
    sem <- 1    // Wait for active queue to drain.
    process(r)  // May take a long time.
    <-sem       // Done; enable next request to run.
}

func Serve(queue chan *Request) {
    for {
        req := <-queue
        go handle(req)  // Don't wait for handle to finish.
    }
}

The above design has a problem: Serve creates a new goroutine for every incoming request, even though only MaxOutstanding of them can run at any moment. As a result, the program can consume unlimited resources if the requests come in too fast. We can address that deficiency by changing Serve to gate the creation of the goroutines.
func Serve(queue chan *Request) {
    for req := range queue {
        sem <- 1
        go func() {
            process(req) // Buggy; see explanation below.
            <-sem
        }()
    }
}
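The bug in the above code is that, in Go versions before 1.22, the loop variable of a for loop is reused for each iteration, so the req variable is shared across all goroutines. (Since Go 1.22, modules that declare go 1.22 or later get a fresh variable per iteration, which removes this particular bug.) For older code we need to make sure that req is unique for each goroutine. Here's one way to do that, passing the value of req as an argument to the closure in the goroutine: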
func Serve(queue chan *Request) {
    for req := range queue {
        sem <- 1
        go func(req *Request) {
            process(req)
            <-sem
        }(req)
    }
}

Another solution is to create a new variable with the same name, as in the following code. req := req may seem odd, but it is legal and idiomatic in Go: you get a fresh version of the variable with the same name, deliberately shadowing the loop variable so that each goroutine has its own.

func Serve(queue chan *Request) {
    for req := range queue {
        req := req // Create new instance of req for the goroutine.
        sem <- 1
        go func() {
            process(req)
            <-sem
        }()
    }
}

Another approach that manages resources well is to start a fixed number of handle goroutines all reading from the request channel. The number of goroutines limits the number of simultaneous calls to process.

func handle(queue chan *Request) {
    for r := range queue {
        process(r)
    }
}

func Serve(clientRequests chan *Request, quit chan bool) {
    // Start handlers
    for i := 0; i < MaxOutstanding; i++ {
        go handle(clientRequests)
    }
    <-quit  // Wait to be told to exit.
}
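
For experimenting with the fixed-pool design, here is a self-contained sketch. It swaps the quit channel for close plus a sync.WaitGroup so that the program terminates on its own; Request's field, the squaring process function, and serveAll are all stand-ins invented for this example.

```go
package main

import (
	"fmt"
	"sync"
)

const MaxOutstanding = 3

// Request is a stand-in type for this sketch.
type Request struct{ n int }

// process is a placeholder for real work; here it squares the input.
func process(r *Request) int { return r.n * r.n }

// serveAll starts MaxOutstanding workers, feeds them every request,
// and returns the sum of the results once all workers have finished.
func serveAll(reqs []*Request) int {
	queue := make(chan *Request)
	results := make(chan int, len(reqs)) // buffered so workers never block on send
	var wg sync.WaitGroup
	for i := 0; i < MaxOutstanding; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for r := range queue {
				results <- process(r)
			}
		}()
	}
	for _, r := range reqs {
		queue <- r
	}
	close(queue) // ends each worker's range loop
	wg.Wait()
	close(results)
	sum := 0
	for v := range results {
		sum += v
	}
	return sum
}

func main() {
	reqs := []*Request{{n: 1}, {n: 2}, {n: 3}, {n: 4}}
	fmt.Println(serveAll(reqs)) // 30
}
```

At most MaxOutstanding calls to process run at once, no matter how many requests are queued.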

Channels of channels

In the example in the previous section, handle was an idealized handler for a request but we didn't define the type it was handling. If that type includes a channel on which to reply, each client can provide its own path for the answer.
type Request struct {
    args        []int
    f           func([]int) int
    resultChan  chan int
}

The client provides a function and its arguments, as well as a channel inside the request object on which to receive the answer.

func sum(a []int) (s int) {
    for _, v := range a {
        s += v
    }
    return
}

request := &Request{[]int{3, 4, 5}, sum, make(chan int)}
// Send request
clientRequests <- request

On the server side, the handler function is the only thing that changes.

func handle(queue chan *Request) {
    for req := range queue {
        req.resultChan <- req.f(req.args)
    }
}
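
Putting the pieces of this section together gives a runnable sketch. The ask helper is hypothetical; it wraps the send and the wait for the reply that the client would otherwise write inline.

```go
package main

import "fmt"

type Request struct {
	args       []int
	f          func([]int) int
	resultChan chan int
}

func sum(a []int) (s int) {
	for _, v := range a {
		s += v
	}
	return
}

// handle answers each request on the channel carried by the request itself.
func handle(queue chan *Request) {
	for req := range queue {
		req.resultChan <- req.f(req.args)
	}
}

// ask sends one request and waits for its answer on the request's own channel.
func ask(queue chan *Request, args []int) int {
	req := &Request{args, sum, make(chan int)}
	queue <- req
	return <-req.resultChan
}

func main() {
	clientRequests := make(chan *Request)
	go handle(clientRequests)
	fmt.Println(ask(clientRequests, []int{3, 4, 5})) // 12
	close(clientRequests)
}
```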

Errors

Library routines must often return some sort of error indication to the caller. Go's multivalue return makes it easy to return a detailed error description alongside the normal return value.

The type error is a simple built-in interface. A library writer is free to implement this interface with a richer model under the covers, making it possible not only to see the error but also to provide some context.

// built-in error interface
type error interface {
    Error() string
}

// custom error
// PathError records an error and the operation and
// file path that caused it.
type PathError struct {
    Op string    // "open", "unlink", etc.
    Path string  // The associated file.
    Err error    // Returned by the system call.
}

func (e *PathError) Error() string {
    return e.Op + " " + e.Path + ": " + e.Err.Error()
}

// PathError's Error will generate a string like this:
// open /etc/passwx: no such file or directory
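
A caller can recover that context from the returned error. The sketch below uses errors.As (available since Go 1.13) rather than the plain type assertion shown in Effective Go; failedPath is a hypothetical helper, and the path is assumed not to exist on the machine running it.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// failedPath tries to open name and, if the failure is an *os.PathError,
// returns the operation and path recorded in it.
func failedPath(name string) (op, path string, ok bool) {
	f, err := os.Open(name)
	if err == nil {
		f.Close()
		return "", "", false
	}
	var pe *os.PathError
	if errors.As(err, &pe) {
		return pe.Op, pe.Path, true
	}
	return "", "", false
}

func main() {
	// Assumes this path does not exist.
	fmt.Println(failedPath("/no/such/dir/file.txt")) // open /no/such/dir/file.txt true
}
```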

The usual way to report an error to a caller is to return an error as an extra return value. But sometimes the error is unrecoverable and the program simply cannot continue; for that case there is the built-in function panic.
panic takes a single argument of arbitrary type (often a string) to be printed as the program dies. It is also a way to indicate that something impossible has happened. For example, it is reasonable to use panic when initialization fails:
var user = os.Getenv("USER")

func init() {
    if user == "" {
        panic("no value for $USER")
    }
}
When panic is called, including implicitly for run-time errors such as indexing a slice out of bounds or failing a type assertion, it immediately stops execution of the current function and begins unwinding the stack of the goroutine, running any deferred functions along the way. If that unwinding reaches the top of the goroutine's stack, the program dies. But it is possible to use the built-in function recover to regain control of the goroutine and resume normal execution.
A call to recover stops the unwinding and returns the argument passed to panic. Because the only code that runs while unwinding is inside deferred functions, recover is only useful inside deferred functions.

One application of recover is to shut down a failing goroutine inside a server without killing the other executing goroutines. In this example, if do(work) panics, the result will be logged and the goroutine will exit cleanly without disturbing the others

func server(workChan <-chan *Work) {
    for work := range workChan {
        go safelyDo(work)
    }
}

func safelyDo(work *Work) {
    defer func() {
        if err := recover(); err != nil {
            log.Println("work failed:", err)
        }
    }()
    do(work)
}
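
A runnable variation on the same idea: instead of logging, this sketch turns the recovered panic into an ordinary error through a named result. The signature of safelyDo is changed here for illustration (it takes a func() rather than a *Work).

```go
package main

import "fmt"

// safelyDo runs do and converts a panic into an ordinary error value.
func safelyDo(do func()) (err error) {
	defer func() {
		// recover returns nil when there is no panic in progress.
		if r := recover(); r != nil {
			err = fmt.Errorf("work failed: %v", r)
		}
	}()
	do()
	return nil
}

func main() {
	fmt.Println(safelyDo(func() { panic("boom") })) // work failed: boom
	fmt.Println(safelyDo(func() {}))                // <nil>

	var s []int
	// Run-time panics such as an out-of-range index are recovered too.
	fmt.Println(safelyDo(func() { _ = s[3] }) != nil) // true
}
```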

Some Syntax

Integer power (x and count are ints; math.Pow works on float64):

int(math.Pow(float64(x), float64(count)))

Atoi (string to int) and Itoa (int to string).

i, err := strconv.Atoi("-42")
s := strconv.Itoa(-42)

Concatenate strings s1 and s2

buffer := bytes.NewBufferString(s1)
buffer.WriteString(s2)
buffer.String()

append function

// append multiple elements
x := []int{1, 2, 3}
x = append(x, 4, 5, 6)

// append a slice
x = []int{1, 2, 3} // plain assignment: x is already declared above
y := []int{4, 5, 6}
x = append(x, y...)