The Go Memory Model and Memory Barriers

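A few days ago a friend who was reading go101's Memory Order Guarantees asked me this question: aren't the two sentences in the red box contradictory?

Let's look at the corresponding rules in the official Go Memory Model: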

  • A send on a channel happens before the corresponding receive from that channel completes.
  • The closing of a channel happens before a receive that returns a zero value because the channel is closed.
  • A receive from an unbuffered channel happens before the send on that channel completes.
  • The kth receive on a channel with capacity C happens before the k+Cth send from that channel completes.

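Leaving the channel-close case aside, we can combine the official rules with the three cases described by go101 and split them into the following two timing diagrams: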

  1. unbuffered:
  2. buffered
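The two cases can also be sketched in code. This is a minimal illustration written for this article, not code taken from go101 or the official document; the function names unbufferedReceiveFirst and bufferedSemaphore are hypothetical. The first function relies on the rule that a receive from an unbuffered channel happens before the send completes. The second relies on the rule that the kth receive happens before the (k+C)th send completes, which is what lets a buffered channel act as a counting semaphore.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// unbufferedReceiveFirst: the goroutine writes msg and then receives.
// The receive happens before the send in the caller completes, so the
// caller is guaranteed to observe the write once the send returns.
func unbufferedReceiveFirst() string {
	var msg string
	ch := make(chan struct{}) // unbuffered

	go func() {
		msg = "hello, world" // (1) write
		<-ch                 // (2) receive
	}()

	ch <- struct{}{} // (3) completes only after (2)
	return msg       // (4) observes the write in (1)
}

// bufferedSemaphore starts `workers` goroutines that each take a slot in a
// channel of the given capacity. It returns the highest number of
// goroutines seen holding a slot at the same time.
func bufferedSemaphore(workers, capacity int) int32 {
	sem := make(chan struct{}, capacity)
	var wg sync.WaitGroup
	var active, peak int32

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sem <- struct{}{} // acquire: blocks while `capacity` slots are taken
			n := atomic.AddInt32(&active, 1)
			for { // record the peak with a CAS loop
				p := atomic.LoadInt32(&peak)
				if n <= p || atomic.CompareAndSwapInt32(&peak, p, n) {
					break
				}
			}
			atomic.AddInt32(&active, -1)
			<-sem // release: lets a blocked send complete
		}()
	}
	wg.Wait()
	return atomic.LoadInt32(&peak)
}

func main() {
	fmt.Println(unbufferedReceiveFirst())
	fmt.Println(bufferedSemaphore(16, 3) <= 3)
}
```

In the second function the peak can never exceed the capacity, because a send can only complete after an earlier holder has received from the channel.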


Many compilers (at compile time) and CPU processors (at run time) often make some optimizations by adjusting the instruction orders, so that the instruction execution orders may differ from the orders presented in code. Instruction ordering is also often called memory ordering.

Let's first look at the "unprofessional" code that go101 mentions:

package main

import "log"
import "runtime"

var a string
var done bool

func setup() {
    a = "hello, world"
    done = true
    if done {
        log.Println(len(a)) // always 12 once printed
    }
}

func main() {
    go setup()

    for !done {
        runtime.Gosched() // yield so that setup() gets a chance to run
    }
    log.Println(a) // expected to print: hello, world
}


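  1. setup() may be compiled by the compiler, or executed by the CPU, as:

    func setup() {
        done = true // done is assigned first
        a = "hello, world"
        if done {
            log.Println(len(a)) // always 12
        }
    }

  2. Even if the instructions are not reordered, on a CPU that uses the MESI cache-coherence protocol the writes can still become visible out of order, because the store buffers and invalidate queues layered on top of MESI let another core observe done = true before it observes the new value of a. (See here for details.)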
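For comparison, here is one way to make the program above well-behaved. It is a sketch written for this article, not go101's own fix, and the function name publishViaClose is hypothetical. It replaces the plain bool flag with a channel and relies on the rule quoted earlier: the closing of a channel happens before a receive that returns because the channel is closed.

```go
package main

import "log"

// publishViaClose writes a in one goroutine and reads it in another.
// close(done) happens before <-done returns, so the read of a is
// guaranteed to observe "hello, world" regardless of any reordering
// by the compiler or the CPU.
func publishViaClose() string {
	var a string
	done := make(chan struct{})

	go func() {
		a = "hello, world"
		close(done) // publishes the write to a
	}()

	<-done // blocks instead of spinning on a shared bool
	return a
}

func main() {
	log.Println(publishViaClose())
}
```

Besides being correct, this version no longer busy-waits: the main goroutine is parked until the channel is closed.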




/*
 * Force strict CPU ordering.
 * And yes, this is required on UP too when we're talking
 * to devices.
 */
#ifdef CONFIG_X86_32
/*
 * Some non-Intel clones support out of order store. wmb() ceases to be a
 * nop for these.
 */
#define mb() alternative("lock; addl $0,0(%%esp)", "mfence", X86_FEATURE_XMM2)
#define rmb() alternative("lock; addl $0,0(%%esp)", "lfence", X86_FEATURE_XMM2)
#define wmb() alternative("lock; addl $0,0(%%esp)", "sfence", X86_FEATURE_XMM)
#else
#define mb()     asm volatile("mfence":::"memory")
#define rmb()    asm volatile("lfence":::"memory")
#define wmb()    asm volatile("sfence" ::: "memory")
#endif


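1. Instruction reordering

Put simply, a barrier does what its name says: it prevents the instructions before and after it from being reordered across it. The table below lists what each kind of barrier does. For details, see Linux Kernel Development, 3rd Edition, Chapter 10, "Ordering and Barriers".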

Memory and Compiler Barrier Methods

Barrier                  Description
rmb()                    Prevents loads from being reordered across the barrier
read_barrier_depends()   Prevents data-dependent loads from being reordered across the barrier
wmb()                    Prevents stores from being reordered across the barrier
mb()                     Prevents stores and loads from being reordered across the barrier

2. Out-of-order effects under MESI


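Here is the most important conclusion of this article: memory barriers solve two problems in concurrent programs: (1) they prevent instruction reordering, and (2) they guarantee ordering under MESI.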
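Go has no mb() or wmb() that user code can call, but the sync/atomic package gives the same two guarantees at a higher level. The sketch below is an illustration written for this article (the function name publishWithAtomic is hypothetical). It rewrites the done flag from the earlier example as an atomic variable: under the Go memory model, an atomic load that observes an atomic store is ordered after it, so the plain write to a that precedes the store is visible after the load.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// publishWithAtomic uses an atomic flag instead of a plain bool.
// The atomic store cannot be reordered before the write to a, and the
// atomic load that sees 1 cannot be reordered after the read of a.
func publishWithAtomic() string {
	var a string
	var done int32

	go func() {
		a = "hello, world"          // plain write
		atomic.StoreInt32(&done, 1) // publish
	}()

	for atomic.LoadInt32(&done) == 0 {
		runtime.Gosched() // yield while waiting
	}
	return a // ordered after the load that observed done == 1
}

func main() {
	fmt.Println(publishWithAtomic())
}
```

Which machine instructions the atomic operations compile to depends on the architecture; the point here is only that the ordering is part of the language-level contract.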

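What does this have to do with Go?

As we can see below, the lock() function used by Go channels ultimately comes down to the LOCK prefix in Go's Plan 9-style assembly, which on x86 is also emitted as a LOCK prefix. LOCK implicitly acts as a barrier: before a LOCK-prefixed instruction executes, all outstanding loads and stores are completed.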

// runtime/chan.go

func chansend(c *hchan, ep unsafe.Pointer, block bool, callerpc uintptr) bool {
    if c == nil {
        if !block {
            return false
        }
        gopark(nil, nil, waitReasonChanSendNilChan, traceEvGoStop, 2)
        throw("unreachable")
    }

    // ...

    lock(&c.lock) // <---------------------------- here

    if c.closed != 0 {
        unlock(&c.lock)
        panic(plainError("send on closed channel"))
    }

    if sg := c.recvq.dequeue(); sg != nil {
        // Found a waiting receiver. We pass the value we want to send
        // directly to the receiver, bypassing the channel buffer (if any).
        send(c, sg, ep, func() { unlock(&c.lock) }, 3)
        return true
    }

    // ...
}

// runtime/internal/atomic

TEXT runtime∕internal∕atomic·Cas64(SB), NOSPLIT, $0-25
    MOVQ    ptr+0(FP), BX
    MOVQ    old+8(FP), AX
    MOVQ    new+16(FP), CX
    LOCK // prefix: makes the following CMPXCHGQ atomic and acts as a full memory barrier
    CMPXCHGQ    CX, 0(BX)
    SETEQ    ret+24(FP)
    RET

TEXT runtime∕internal∕atomic·Casuintptr(SB), NOSPLIT, $0-25
    JMP    runtime∕internal∕atomic·Cas64(SB)
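To see how a compare-and-swap like the one above can be turned into a lock, here is a toy spinlock built on sync/atomic. This is an assumption-laden teaching sketch, not the runtime's actual lock() implementation, which is more elaborate and parks waiting threads instead of spinning; the names spinLock and countWithSpinLock are hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// spinLock is a toy lock: state is 0 when free and 1 when held.
type spinLock struct{ state int32 }

// Lock spins until the CAS from 0 to 1 succeeds.
func (l *spinLock) Lock() {
	for !atomic.CompareAndSwapInt32(&l.state, 0, 1) {
		runtime.Gosched() // let other goroutines make progress
	}
}

// Unlock releases the lock with an atomic store, which also publishes
// every write made inside the critical section.
func (l *spinLock) Unlock() { atomic.StoreInt32(&l.state, 0) }

// countWithSpinLock increments a plain int from several goroutines,
// protected only by the spinLock, and returns the final value.
func countWithSpinLock(goroutines, perGoroutine int) int {
	var l spinLock
	var wg sync.WaitGroup
	counter := 0

	for i := 0; i < goroutines; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perGoroutine; j++ {
				l.Lock()
				counter++ // plain, non-atomic increment
				l.Unlock()
			}
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	fmt.Println(countWithSpinLock(8, 1000))
}
```

No increment is lost even though counter is an ordinary variable, because the atomic operations in Lock and Unlock provide both mutual exclusion and the ordering between successive critical sections. In real code, sync.Mutex is the right tool.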

Looking back at Go's design

The design philosophy of Go is to use as fewer features as possible to support as more use cases as possible, at the same time to ensure a good enough overall code execution efficiency. So Go built-in and standard packages don't provide direct ways to use the CPU fence instructions. In fact, CPU fence instructions are used in implementing all kinds of synchronization techniques supported in Go. So, we should use these synchronization techniques to ensure expected code execution orders.

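Unlike C++, Go does not expose very low-level synchronization primitives such as LoadLoad or StoreLoad barriers, or direct calls to assembly instructions. Instead, it wraps them in higher-level synchronization mechanisms such as channels and atomic operations to lighten the programmer's load. Even so, understanding the concurrency model from first principles helps us write robust code. Writing this reminds me of a line from the official Go memory model document: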

If you must read the rest of this document to understand the behavior of your program, you are being too clever.

Don't be clever.


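References

Memory Order Guarantees in Go

A Brief Look at Memory Reordering (浅谈Memory Reordering)

Xie Baoyou: Understanding Linux RCU in Depth, Starting from the Hardware: Memory Barriers (谢宝友:深入理解 Linux RCU 从硬件说起之内存屏障)

10 Diagrams That Open the Door to CPU Cache Coherence (10 张图打开 CPU 缓存一致性的大门)

Go and CPU Caches: Principles and Applications (Go 和 CPU 高速缓存:原理和应用)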
