Go Reverse Engineering, Part 3: Goroutines

2022-03-03


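These Go reversing posts are starting to read like an advanced Go development blog. Apparently reverse engineering does need some development background after all.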


Introduction

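There are plenty of explanations of goroutines (coroutines) online. In short, a goroutine is a lightweight thread: thousands of them can run concurrently, and each one is cheap because it starts with a small stack of only a few KB (2 KB in recent Go releases, rather than a fixed 4 KB) that grows on demand. Goroutines are arguably the most important feature of the Go language, and probably the main thing that makes Go stand out.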

Example 1

Putting the keyword go in front of a function or method call is all it takes to start a goroutine.

package main

import (  
    "fmt"
)

func hello() {  
    fmt.Println("Hello world goroutine")
}
func main() {  
    go hello()
    fmt.Println("main function")
}

We compile and run it:

go build -gcflags="-N -l" 1.go

Output:

.\1.exe
main function

I ran it 10 times with the same result: the print inside hello() never showed up.

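Rules

Knowing a few rules about goroutines helps explain this.

When a new goroutine is started, the go statement returns immediately. Unlike a normal function call, Go does not wait for the goroutine to finish. Any return values of the goroutine are discarded, and execution continues at the next line right away.
The main goroutine must stay alive for the other goroutines to run. If the main goroutine terminates, the program exits and the other goroutines never get to run.

Fix

So the likely cause is that the main goroutine finished too quickly, and the hello goroutine was killed before it had a chance to print.

Let's give it more time by calling time.Sleep to pause main for 1 second: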

package main

import (  
    "fmt"
    "time"
)

func hello() {  
    fmt.Println("Hello world goroutine")
}
func main() {  
    go hello()
    time.Sleep(1 * time.Second)
    fmt.Println("main function")
}

Running it again gives:

.\2.exe
Hello world goroutine
main function

When you run it, Hello world goroutine is printed first, then there is a pause of about 1 second before main function is printed.
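Sleeping for a fixed time only works when the goroutine happens to finish within that window. As an alternative that is not part of the original program, the sketch below waits with sync.WaitGroup from the standard library. runHello is a hypothetical helper name introduced only for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// runHello (hypothetical helper) starts one goroutine, waits for it to
// finish, and returns the messages in the order they were produced.
func runHello() []string {
	var wg sync.WaitGroup
	var lines []string

	wg.Add(1) // one goroutine to wait for
	go func() {
		defer wg.Done() // signal completion when the goroutine returns
		lines = append(lines, "Hello world goroutine")
	}()

	wg.Wait() // blocks until Done has been called
	lines = append(lines, "main function")
	return lines
}

func main() {
	for _, l := range runHello() {
		fmt.Println(l)
	}
}
```

With this approach main waits exactly as long as the goroutine needs, instead of guessing a duration.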

Reverse-engineering analysis

Let's look at it in IDA to deepen our understanding. (This time IDA's pseudocode is actually reliable.)

void __cdecl main_main()
{
  _QWORD v0[5]; // [rsp+38h] [rbp-30h] BYREF
  void *retaddr; // [rsp+68h] [rbp+0h] BYREF

  while ( (unsigned __int64)&retaddr <= *(_QWORD *)(*(_QWORD *)NtCurrentTeb()->NtTib.ArbitraryUserPointer + 16LL) )
    runtime_morestack_noctxt();
  runtime_newproc(0, off_4D5E28);
  v0[0] = &unk_4B3180;
  v0[1] = &off_4EEFF8;
  v0[2] = v0;
  v0[3] = 1LL;
  v0[4] = 1LL;
  fmt_Println(v0, 1LL, 1LL);
}

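At first glance there is not much here, but looking closely at the calls shows a call to newproc.

`newproc`

This function was covered in part 1 of this series: given a function pointer fn, it creates a new goroutine in the Go program to run fn.

Here off_4D5E28 points to the main.hello function.

`strip`

After stripping the binary, let's load it into IDA again and see whether the call can still be recognized.

My plan:

If IDA 7.6 recognizes runtime.newproc, the call can be confirmed directly. (Not fully reliable, since the strings may be obfuscated.)
If it is not recognized, identify newproc manually from the characteristics of the function itself.
Alternatively, import the function table with BinDiff.

To be honest, as far as recovering the function table goes, 7.6 is really reliable. The decompilation is identical to the unstripped one and just as clear.

Example 2

Starting multiple goroutines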

package main

import (  
    "fmt"
    "time"
)

func numbers() {  
    for i := 1; i <= 5; i++ {
        time.Sleep(250 * time.Millisecond)
        fmt.Printf("%d ", i)
    }
}

func alphabets() {  
    for i := 'a'; i <= 'e'; i++ {
        time.Sleep(400 * time.Millisecond)
        fmt.Printf("%c ", i)
    }
}

func main() {  
    go numbers()
    go alphabets()
    time.Sleep(3000 * time.Millisecond)
    fmt.Println("main terminated")
}

Output:

.\3.exe
1 a 2 3 b 4 c 5 d e main terminated

The program uses time.Sleep to control the print interval of each goroutine.

Reversing

void __cdecl main_main()
{
  _QWORD v0[2]; // [rsp+40h] [rbp-18h] BYREF
  void *retaddr; // [rsp+58h] [rbp+0h] BYREF

  while ( (unsigned __int64)&retaddr <= *(_QWORD *)(*(_QWORD *)NtCurrentTeb()->NtTib.ArbitraryUserPointer + 16LL) )
    runtime_morestack_noctxt();
  runtime_newproc(0, (char)&off_4D7E28);
  runtime_newproc(0, (char)off_4D7E20);
  time_Sleep(3000000000LL);
  v0[0] = &unk_4B5180;
  v0[1] = &off_4F1348;
  fmt_Fprintln((__int64)&off_4F27F8, qword_56A648, (__int64)v0, 1LL, 1LL);
}

Nothing new here.

__int64 __usercall main_numbers@<rax>()
{
  __int64 v0; // rdi
  __int64 v1; // rsi
  __int64 result; // rax
  __int64 v3; // [rsp+8h] [rbp-68h]
  __int64 v4; // [rsp+50h] [rbp-20h]
  void *retaddr; // [rsp+70h] [rbp+0h] BYREF

  while ( (unsigned __int64)&retaddr <= *(_QWORD *)(*(_QWORD *)NtCurrentTeb()->NtTib.ArbitraryUserPointer + 16LL) )
    runtime_morestack_noctxt();
  for ( result = 1LL; result <= 5; result = v4 + 1 )
  {
    v4 = result;
    v3 = time_Sleep(250000000LL);
    runtime_convT64(v4, v3);
    fmt_Fprintf(v0, v1, (const char *)&off_4F27F8);
  }
  return result;
}

__int64 __usercall main_alphabets@<rax>()
{
  __int64 v0; // rdi
  __int64 v1; // rsi
  __int64 result; // rax
  __int64 v3; // [rsp+8h] [rbp-68h]
  int v4; // [rsp+54h] [rbp-1Ch]
  void *retaddr; // [rsp+70h] [rbp+0h] BYREF

  while ( (unsigned __int64)&retaddr <= *(_QWORD *)(*(_QWORD *)NtCurrentTeb()->NtTib.ArbitraryUserPointer + 16LL) )
    runtime_morestack_noctxt();
  for ( result = 97LL; (int)result <= 101; result = (unsigned int)(v4 + 1) )
  {
    v4 = result;
    v3 = time_Sleep(400000000LL);
    runtime_convT32(v4, v3);
    fmt_Fprintf(v0, v1, (const char *)&off_4F27F8);
  }
  return result;
}

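Shared memory (locks)

Everything above steers goroutines with time.Sleep(), which is a crude trick. Now let's take a brief look at the techniques used in real projects: shared memory and channels.

Shared memory is the familiar idea from multithreaded programming. Go's concurrency differs a lot from classic multithreading, but some things are still very similar.

The implementation lives in Go's sync package.

import "sync"

Background on multithreading:

多线程基础 - 廖雪峰的官方网站 (liaoxuefeng.com)

That is a Java tutorial, but the concepts carry over.

Go语言sync包的应用详解 - 知乎 (zhihu.com)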

package main

import (
	"fmt"
	"runtime"
	"sync"
)

var counter int = 0

func Count(lock *sync.Mutex) {
	lock.Lock() // acquire the lock
	counter++
	fmt.Println("counter =", counter)
	lock.Unlock() // release the lock
}

func main() {
	lock := &sync.Mutex{} // mutex

	// Start 10 Count goroutines.
	// Each Count() increments counter by 1.
	for i := 0; i < 10; i++ {
		go Count(lock)
	}
	for {
		lock.Lock()  // acquire the lock
		c := counter // read counter
		fmt.Println("2counter =", counter)
		lock.Unlock() // release the lock

		runtime.Gosched() // yield the time slice
		// Gives up the CPU so that other goroutines get a chance to run; the current goroutine resumes at some later point.

		if c >= 10 {
			break
		}
	}
}

Output (the result is not deterministic):

.\4.exe
counter = 1
counter = 2
2counter = 2
counter = 3
counter = 4
counter = 5
counter = 6
counter = 7
counter = 8
counter = 9
counter = 10
2counter = 10

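2counter is printed only twice, meaning the second for loop ran fewer than 2 full iterations before the 10 goroutines had finished. runtime.Gosched() is the key operation here: without voluntarily yielding the CPU, the 10 Count goroutines have a hard time getting scheduled, and the second for loop spins uselessly for a long time.

With Gosched commented out: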

go run .\4.go
counter = 1
2counter = 1
2counter = 1
2counter = 1
2counter = 1
2counter = 1
2counter = 1
2counter = 1
2counter = 1
2counter = 1
2counter = 1
2counter = 1
2counter = 1
2counter = 1
2counter = 1
2counter = 1
2counter = 1
2counter = 1
counter = 2
counter = 3
counter = 4
counter = 5
counter = 6
counter = 7
counter = 8
counter = 9
counter = 10
2counter = 10

2counter prints a long run of useless 1s, and it takes a while before the Count goroutines get a chance to run.

Reversing

`Count`

void __golang main_Count(volatile signed __int32 *a1)
{
  void *retaddr; // [rsp+68h] [rbp+0h] BYREF

  while ( (unsigned __int64)&retaddr <= *(_QWORD *)(*(_QWORD *)NtCurrentTeb()->NtTib.ArbitraryUserPointer + 16LL) )
    runtime_morestack_noctxt();
  if ( _InterlockedCompareExchange(a1, 1, 0) )
    sync___ptr_Mutex__lockSlow(a1);
  runtime_convT64(++counter);
  fmt_Fprintln(&off_4F04A8, qword_567648);
  if ( _InterlockedDecrement(a1) )
    sync___ptr_Mutex__unlockSlow(a1);
}

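The function opens with _InterlockedCompareExchange guarding sync___ptr_Mutex__lockSlow() and ends with _InterlockedDecrement guarding sync___ptr_Mutex__unlockSlow(); these correspond to lock.Lock() and lock.Unlock() respectively.

runtime_convT64 is used when building an interface value: it takes a 64-bit value and returns a raw pointer to a copy of it, which becomes the data word of the interface.

Note also that the lock *sync.Mutex parameter is typed as a pointer to a 32-bit int. That is because the first field of sync.Mutex is state int32, and the atomic operations work on that field.

The _InterlockedCompareExchange intrinsic is the instruction lock cmpxchg [rcx], edx, where lock is an instruction prefix.

The _InterlockedDecrement intrinsic is lock xadd [rcx], eax.

Since the functions from the sync package are still visible by name, I don't plan to dig into the meaning of this assembly for now.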
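To make the compare-exchange / decrement pattern easier to recognize, here is a minimal sketch of the same fast path written with sync/atomic. toyMutex and countTo are hypothetical names; this is a simplified spin lock for illustration, not the real sync.Mutex, whose slow paths (lockSlow, unlockSlow) park and wake goroutines instead of spinning.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// toyMutex is a hypothetical, simplified stand-in for sync.Mutex.
type toyMutex struct {
	state int32 // 0 = unlocked, 1 = locked
}

// Lock tries the fast path (CAS 0 -> 1, the lock cmpxchg seen in IDA).
// On failure it yields and retries; the real mutex calls lockSlow here.
func (m *toyMutex) Lock() {
	for !atomic.CompareAndSwapInt32(&m.state, 0, 1) {
		runtime.Gosched()
	}
}

// Unlock atomically subtracts 1 (the lock xadd seen in IDA).
// The real mutex calls unlockSlow when the new value is non-zero.
func (m *toyMutex) Unlock() {
	atomic.AddInt32(&m.state, -1)
}

// countTo starts n goroutines that each increment a shared counter
// under the toy mutex, and returns the final value.
func countTo(n int) int {
	var m toyMutex
	var wg sync.WaitGroup
	counter := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			m.Lock()
			counter++
			m.Unlock()
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	fmt.Println(countTo(10))
}
```

In a stripped binary, a lock cmpxchg with operands 0 and 1 followed later by a lock xadd of -1 on the same address is a reasonable hint that a mutex is involved, even when the slow-path names are gone.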

`main`

void __cdecl main_main()
{
  volatile signed __int32 *v0; // rax
  __int64 v1; // rcx
  volatile signed __int32 *v2; // [rsp+8h] [rbp-78h]
  __int64 tmp; // [rsp+40h] [rbp-40h]
  __int64 v4; // [rsp+48h] [rbp-38h]
  volatile signed __int32 *v5; // [rsp+50h] [rbp-30h]
  void *retaddr; // [rsp+80h] [rbp+0h] BYREF

  while ( (unsigned __int64)&retaddr <= *(_QWORD *)(*(_QWORD *)NtCurrentTeb()->NtTib.ArbitraryUserPointer + 16LL) )
    runtime_morestack_noctxt();
  runtime_newobject((_type *)&stru_4BC700);
  v0 = v2;
  v5 = v2;
  v1 = 0LL;
  while ( v1 < 10 )
  {
    tmp = v1;
    runtime_newproc(8, &off_4D5E18);
    v1 = tmp + 1;
    v0 = v5;
  }
  while ( 1 )
  {
    if ( _InterlockedCompareExchange(v0, 1, 0) )
      sync___ptr_Mutex__lockSlow(v0);
    v4 = counter;
    runtime_convT64(counter);
    fmt_Fprintln(&off_4F04A8, qword_567648);
    if ( _InterlockedDecrement(v5) )
      sync___ptr_Mutex__unlockSlow(v5);
    runtime_mcall(off_4D6040);
    if ( v4 >= 10 )
      break;
    v0 = v5;
  }
}

`runtime.newobject`

runtime_newobject((_type *)&stru_4BC700);

First, look at the newobject function in runtime/malloc.go:

// implementation of new builtin
// compiler (both frontend and SSA backend) knows the signature
// of this function
func newobject(typ *_type) unsafe.Pointer {
	return mallocgc(typ.size, typ, true)
}

Roughly: it takes a pointer to a _type structure, allocates an object of that type on the heap, and returns a pointer to it.

IDA shows the same thing:

__int64 __usercall runtime_newobject@<rax>(__int64 a1)
{
  void *retaddr; // [rsp+28h] [rbp+0h] BYREF

  while ( (unsigned __int64)&retaddr <= *(_QWORD *)(*(_QWORD *)NtCurrentTeb()->NtTib.ArbitraryUserPointer + 16LL) )
    runtime_morestack_noctxt();
  return runtime_mallocgc(*(_QWORD *)a1, a1, 1);
}

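This brings up a structure that appeared in the previous post but was not analyzed closely there, _type (it showed up inside the interface structures):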

`_type`

type nameOff int32
type typeOff int32

// tflag is documented in reflect/type.go.
//
// tflag values must be kept in sync with copies in:
//	cmd/compile/internal/gc/reflect.go
//	cmd/link/internal/ld/decodesym.go
//	reflect/type.go
//      internal/reflectlite/type.go
type tflag uint8

type _type struct {
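	size       uintptr // size of the type in bytes
	ptrdata    uintptr // size of memory prefix holding all pointers
	hash       uint32
	tflag      tflag   // extra type information flags
	align      uint8   // alignment of a variable of this type
	fieldAlign uint8   // alignment of a struct field of this type
	kind       uint8   // basic type enumeration, same as reflect.Kind; determines how the type is interpreted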
	// function for comparing objects of this type
	// (ptr to object A, ptr to object B) -> ==?
	equal func(unsafe.Pointer, unsafe.Pointer) bool
	// gcdata stores the GC type data for the garbage collector.
	// If the KindGCProg bit is set in kind, gcdata is a GC program.
	// Otherwise it is a ptrmask bitmap. See mbitmap.go for details.
	gcdata    *byte   // GC-related data
	str       nameOff // offset of the type name string within a section of the compiled binary
	ptrToThis typeOff // offset of the pointer-to-this-type metadata within a section of the compiled binary; filled in by the linker
}

With this definition, the structure can be recreated in IDA's Structures view.

00000000 _type           struc ; (sizeof=0x30, mappedto_73)
00000000                                         ; XREF: .rdata:stru_4B1260/r
00000000                                         ; .rdata:stru_4B1320/r ...
00000000 size            dq ?
00000008 ptrdata         dq ?
00000010 hash            dd ?
00000014 tflag           db ?
00000015 align           db ?
00000016 fieldAlign      db ?
00000017 kind            db ?
00000018 equal           dq ?                    ; offset
00000020 gcdata          dq ?                    ; offset
00000028 str             dd ?
0000002C ptrToThis       dd ?
00000030 _type           ends

Applying it, we get:

_type <
    8, 
    0, 
    48061933h, 
    0Fh, 
    4, 
    4, 
    19h, 
    offset off_4D60A8, 
    offset unk_4EEC57, 
    2900h, 
    12C20h
>

Here tflag is 0xF, i.e. 0b1111.

const (
	// tflagUncommon means that there is a pointer, *uncommonType,
	// just beyond the outer type structure.
	//
	// For example, if t.Kind() == Struct and t.tflag&tflagUncommon != 0,
	// then t has uncommonType data and it can be accessed as:
	//
	//	type tUncommon struct {
	//		structType
	//		u uncommonType
	//	}
	//	u := &(*tUncommon)(unsafe.Pointer(t)).u
    
    // tflagUncommon: an uncommonType (package path and method table) is stored right after the type structure.
	tflagUncommon tflag = 1 << 0

	// tflagExtraStar means the name in the str field has an
	// extraneous '*' prefix. This is because for most types T in
	// a program, the type *T also exists and reusing the str data
	// saves binary size.
    // The name in the str field carries an extra leading '*'.
    // T and *T share one name string, which saves space in the binary.
	tflagExtraStar tflag = 1 << 1

	// tflagNamed means the type has a name.
    // The type has a name.
	tflagNamed tflag = 1 << 2

	// tflagRegularMemory means that equal and hash functions can treat
	// this type as a single region of t.size bytes.
	tflagRegularMemory tflag = 1 << 3
)

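All four bits are set. So the type has uncommonType data following it, its str name carries an extra '*' prefix, it is a named type, and it can be compared as plain memory. Together with kind 0x19 (25, Struct in reflect.Kind), this describes a named struct type.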
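The tflag byte can be decoded mechanically. A minimal sketch using the four bit values quoted above; decodeTflag is a hypothetical helper written for this post, not a runtime function.

```go
package main

import (
	"fmt"
	"strings"
)

// Bit values copied from the tflag constants quoted above.
var tflagBits = []struct {
	mask uint8
	name string
}{
	{1 << 0, "tflagUncommon"},
	{1 << 1, "tflagExtraStar"},
	{1 << 2, "tflagNamed"},
	{1 << 3, "tflagRegularMemory"},
}

// decodeTflag (hypothetical helper) lists the names of all bits set in f.
func decodeTflag(f uint8) string {
	var set []string
	for _, b := range tflagBits {
		if f&b.mask != 0 {
			set = append(set, b.name)
		}
	}
	return strings.Join(set, "|")
}

func main() {
	fmt.Println(decodeTflag(0x0F)) // the value in the _type analyzed here
	fmt.Println(decodeTflag(0x0A)) // the value in the [6]int type later in this post
}
```

The second value shows the contrast: an unnamed array type has neither tflagNamed nor tflagUncommon set.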

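Since we have the source, we know this is lock := &sync.Mutex{}. To confirm it, look in IDA's Strings view.

.rdata:00000000004AA91E aSyncEntry db 0Bh,'*sync.entry'

Following the description of str nameOff above, first find the start of the .rdata segment in IDA's Segments view: address 0x4A8000.

Adding the structure's offset 0x2900 gives 0x4AA900.

That is exactly where the name string structure is located.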

Creating the `goroutine`s

while ( v1 < 10 )
{
  tmp = v1;
  runtime_newproc(8, &off_4D5E18);
  v1 = tmp + 1;
  v0 = v5;
}

This corresponds to:

for i := 0; i < 10; i++ {
		go Count(lock)
}

Note that IDA may initially mistake newproc for a single-parameter function; fix the prototype by adding a _QWORD parameter.

`mcall`

while ( 1 )
{
  if ( _InterlockedCompareExchange(v0, 1, 0) )
    sync___ptr_Mutex__lockSlow(v0);
  v4 = counter;
  runtime_convT64(counter);
  fmt_Fprintln(&off_4F04A8, qword_567648);
  if ( _InterlockedDecrement(v5) )
    sync___ptr_Mutex__unlockSlow(v5);
  runtime_mcall(off_4D6040);
  if ( v4 >= 10 )
    break;
  v0 = v5;
}

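The argument passed to mcall comes from runtime.Gosched, which has been inlined here. In the runtime source, Gosched calls mcall(gosched_m), so off_4D6040 should be the gosched_m function value.

mcall is implemented in assembly in runtime/asm_amd64.s. Its main job is to switch to the m->g0 stack and then run fn. Knowing the prototype is enough.

func mcall(fn func(*g))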

Channels

https://www.cnblogs.com/liang1101/p/7285955.html

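The message-passing model treats every concurrent unit as a self-contained, independent entity with its own variables, which are not shared between units. Each unit has only one kind of input and output: messages.

A channel is the language-level mechanism Go provides for communication between goroutines; channels let us pass messages among multiple goroutines. A channel is an in-process communication mechanism, so passing an object through a channel behaves much like passing an argument to a function; for example, pointers can be passed too. Channels are typed: a channel carries values of exactly one type, which must be specified when the channel is declared.

Declaration

var chanName chan ElementType

For example, to declare a channel that carries int:

var ch chan int

Use make() to create a channel:

ch := make(chan int)

The most common channel operations are sending and receiving:

// Send value into the channel. On an unbuffered channel this blocks until another goroutine receives from the channel.
ch <- value

// Receive from the channel. If nothing has been sent yet, this blocks until a value is sent.
value := <-ch

By default, sends and receives block until the other side is ready.

A buffered channel can also be created: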

c := make(chan int, 1024)

// Read from the buffered channel
for i := range c {
	...
}

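This creates an int channel with a buffer of 1024 elements. Even without a receiver, the sender can keep writing into the channel and will not block until the buffer is full.

A channel that is no longer needed can be closed:

close(ch)

Close the channel on the producer side. Closing it on the consumer side easily leads to a panic, because sending on a closed channel panics.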
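The pieces above (buffered channel, close on the producer side, range on the consumer side) fit together as in the following minimal sketch. squares is a hypothetical function name, and the buffer is sized to n so that the sends never block even though nothing is receiving yet.

```go
package main

import "fmt"

// squares (hypothetical example) sends n values into a buffered channel,
// closes it on the producer side, and then drains it with range.
func squares(n int) []int {
	c := make(chan int, n) // buffer holds all n values, so sends do not block

	for i := 0; i < n; i++ {
		c <- i * i
	}
	close(c) // the producer closes the channel once it is done sending

	var got []int
	for v := range c { // range stops once the channel is closed and empty
		got = append(got, v)
	}
	return got
}

func main() {
	fmt.Println(squares(4))
}
```

Values already in the buffer can still be received after close; only further sends would panic.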

Here is an official example:

package main

import "fmt"

func sum(s []int, c chan int) {
	sum := 0
	for _, v := range s {
		sum += v
	}
	c <- sum // send sum to c
}
func main() {
	s := []int{7, 2, 8, -9, 4, 0}
	c := make(chan int)
	go sum(s[:len(s)/2], c)
	go sum(s[len(s)/2:], c)
	x, y := <-c, <-c // receive from c
	fmt.Println(x, y, x+y)
}

Output:

go run .\6.go
-5 17 12

Reversing

void __cdecl main_main()
{
  _QWORD *s; // [rsp+8h] [rbp-B0h]
  __int64 x_y; // [rsp+8h] [rbp-B0h]
  __int64 y2; // [rsp+40h] [rbp-78h]
  __int64 x2; // [rsp+48h] [rbp-70h]
  __int64 y; // [rsp+50h] [rbp-68h] BYREF
  __int64 x; // [rsp+58h] [rbp-60h] BYREF
  __int64 chan; // [rsp+60h] [rbp-58h] MAPDST
  __int64 y_i; // [rsp+68h] [rbp-50h]
  __int64 x_i; // [rsp+70h] [rbp-48h]
  _QWORD *s_ptr; // [rsp+78h] [rbp-40h]
  _QWORD v11[6]; // [rsp+80h] [rbp-38h] BYREF

  while ( (unsigned __int64)v11 <= *(_QWORD *)(*(_QWORD *)NtCurrentTeb()->NtTib.ArbitraryUserPointer + 16LL) )
    runtime_morestack_noctxt();

  s = (_QWORD *)runtime_newobject((__int64)&qword_4B49C0);
  s_ptr = s;
  *s = 7LL;
  s[1] = 2LL;
  s[2] = 8LL;
  s[3] = -9LL;
  s[4] = 4LL;
  s[5] = 0LL;                                   // 
                                                // s := []int{7, 2, 8, -9, 4, 0}

  runtime_makechan((__int64)&qword_4B25E0, 0LL);// c := make(chan int)
                                                // IDA does not seem to get the return value right here;
                                                // it should be
                                                // chan = runtime_makechan((__int64)&qword_4B25E0, 0LL);

  runtime_newproc(32, &off_4D5E88, s_ptr, 3LL, 6LL, chan);
  runtime_newproc(32, &off_4D5E88, s_ptr + 3, 3LL, 3LL, chan);

  x = 0LL;
  runtime_chanrecv1(chan, (__int64)&x);
  y = 0LL;
  runtime_chanrecv1(chan, (__int64)&y);
  x2 = x;
  y2 = y;

  x_i = runtime_convT64(x);
  y_i = runtime_convT64(y2);
  x_y = runtime_convT64(x2 + y2);

  v11[0] = &qword_4B2AA0;                       // _type information for int
  v11[1] = x_i;                                 // each pair is an eface structure, i.e. an empty interface value
  v11[2] = &qword_4B2AA0;
  v11[3] = y_i;
  v11[4] = &qword_4B2AA0;
  v11[5] = x_y;
  x = fmt_Fprintln(&off_4F0488, qword_567628, v11, 3LL, 3LL);
}

`runtime.newobject`

reflect/type.go defines the meaning of the kind member of the _type structure:

type Kind uint

const (
	Invalid Kind = iota	// auto-increments starting from 0
	Bool
	Int
	Int8
	Int16
	Int32
	Int64
	Uint
	Uint8
	Uint16
	Uint32
	Uint64
	Uintptr
	Float32
	Float64
	Complex64
	Complex128
	Array
	Chan
	Func
	Interface
	Map
	Ptr
	Slice
	String
	Struct
	UnsafePointer
)

.rdata:00000000004B49C0 qword_4B49C0    dq 48                   ; DATA XREF: main_main+36↑o
.rdata:00000000004B49C8                 dq 0
.rdata:00000000004B49D0                 dd 0B7036A26h
.rdata:00000000004B49D4                 db  0Ah
.rdata:00000000004B49D5                 db    8
.rdata:00000000004B49D6                 db    8
.rdata:00000000004B49D7                 db  11h
.rdata:00000000004B49D8                 dq offset off_4AB170
.rdata:00000000004B49E0                 dq offset unk_4EEC77
.rdata:00000000004B49E8                 db  5Eh ; ^
.rdata:00000000004B49E9                 db  11h
.rdata:00000000004B49EA                 db    0
.rdata:00000000004B49EB                 db    0
.rdata:00000000004B49EC                 db    0
.rdata:00000000004B49ED                 db    0
.rdata:00000000004B49EE                 db    0
.rdata:00000000004B49EF                 db    0

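0x11 (17 decimal) corresponds to Array, so this is an array type. Its size is 48 bytes, 6*8, enough for 6 ints.

Following the str field leads to the matching string: *[6]int

That directly confirms it.

`runtime.makechan`

In runtime/chan.go:

func makechan(t *chantype, size int) *hchan

In runtime/type.go: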

type chantype struct {
	typ  _type
	elem *_type
	dir  uintptr
}

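It embeds a _type describing the channel type itself, followed by a pointer to the _type of the elements carried by the channel.

Overall there is nothing special, except that the kind value of typ _type is 50, which exceeds the range of the Kind enumeration. The reason is that the upper bits of kind are flags: the runtime masks kind with kindMask (0x1F) to get the actual kind, and 50 & 0x1F = 18, which is Chan. The remaining bit, 0x20, is kindDirectIface.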
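The kind byte packs flag bits above the enumeration value, which is why 50 looks out of range at first. A minimal sketch of the decoding, assuming the kindMask and kindDirectIface values from the runtime's typekind.go of this Go era; decodeKind is a hypothetical helper.

```go
package main

import "fmt"

const (
	kindDirectIface = 1 << 5       // value is stored directly in the interface data word
	kindMask        = (1 << 5) - 1 // low 5 bits hold the reflect.Kind value
)

// Names in the same order as the Kind enumeration listed above.
var kindNames = []string{
	"Invalid", "Bool", "Int", "Int8", "Int16", "Int32", "Int64",
	"Uint", "Uint8", "Uint16", "Uint32", "Uint64", "Uintptr",
	"Float32", "Float64", "Complex64", "Complex128",
	"Array", "Chan", "Func", "Interface", "Map", "Ptr",
	"Slice", "String", "Struct", "UnsafePointer",
}

// decodeKind (hypothetical helper) splits a raw kind byte into its
// Kind name and the direct-interface flag.
func decodeKind(raw uint8) (name string, directIface bool) {
	k := raw & kindMask
	name = "Unknown"
	if int(k) < len(kindNames) {
		name = kindNames[k]
	}
	return name, raw&kindDirectIface != 0
}

func main() {
	for _, raw := range []uint8{50, 0x19, 0x11} {
		name, direct := decodeKind(raw)
		fmt.Printf("kind=%#x -> %s (directIface=%v)\n", raw, name, direct)
	}
}
```

The three inputs are the kind bytes seen in this post: the chan type, the sync.Mutex struct type, and the [6]int array type.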

`runtime.newproc`

func newproc(siz int32, fn *funcval) {
	argp := add(unsafe.Pointer(&fn), sys.PtrSize)
	gp := getg()
	pc := getcallerpc()
	systemstack(func() {
		newg := newproc1(fn, argp, siz, gp, pc)

		_p_ := getg().m.p.ptr()
		runqput(_p_, newg, true)

		if mainStarted {
			wakep()
		}
	})
}

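In the earlier analysis every fn took no arguments, so newproc always had just 2 parameters. When fn does take arguments, the compiled call passes additional arguments that follow fn's parameter list.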

func sum(s []int, c chan int) {
	sum := 0
	for _, v := range s {
		sum += v
	}
	c <- sum // send sum to c
}

The first argument is a slice. The layout of a slice was covered before:

slice · 深入解析Go (gitbooks.io)

struct    Slice
{    // must not move anything
    byte*    array;        // actual data
    uintgo    len;        // number of elements
    uintgo    cap;        // allocated number of elements
};

It takes 3 QWORDs: the first is a pointer to the underlying array data, the second is the length, and the third is the capacity of the slice.
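The len and cap words explain the constants in the two newproc calls of the decompiled main, (s_ptr, 3, 6) and (s_ptr + 3, 3, 3). A minimal sketch that reproduces them with the same slice expressions as the source program; halves is a hypothetical helper name.

```go
package main

import "fmt"

// halves (hypothetical helper) slices the example data the same way
// main does and reports len and cap of both halves.
func halves() (len1, cap1, len2, cap2 int) {
	s := []int{7, 2, 8, -9, 4, 0}
	a := s[:len(s)/2] // starts at element 0: len 3, cap still 6
	b := s[len(s)/2:] // starts at element 3: len 3, cap 3
	return len(a), cap(a), len(b), cap(b)
}

func main() {
	l1, c1, l2, c2 := halves()
	fmt.Println(l1, c1, l2, c2)
}
```

Both halves share the same backing array; only the data pointer, len and cap differ, which is exactly what shows up as three separate QWORD arguments.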

void __golang main_sum(QWORD *Arr, __int64 len, __int64 cap, __int64 chan)
{
  __int64 i; // rdx
  __int64 Sum; // rbx
  QWORD tmp; // rsi
  __int64 v7; // [rsp+10h] [rbp-10h] BYREF
  void *retaddr; // [rsp+20h] [rbp+0h] BYREF

  while ( (unsigned __int64)&retaddr <= *(_QWORD *)(*(_QWORD *)NtCurrentTeb()->NtTib.ArbitraryUserPointer + 16LL) )
    runtime_morestack_noctxt();

  i = 0LL;
  Sum = 0LL;
  while ( len > i )
  {
    tmp = Arr[i++];
    Sum += tmp;
  }
  v7 = Sum;
  runtime_chansend1(chan, (__int64)&v7);
}

sum has 4 parameters: the first three make up the slice structure, and the last one is the chan.

cap is not used inside this function.

Finally it calls

`runtime.chansend1`

// entry point for c <- x from compiled code
//go:nosplit
func chansend1(c *hchan, elem unsafe.Pointer) {
	chansend(c, elem, true, getcallerpc())
}

The pseudocode is basically identical to the source (IDA 7.6 is great).

`runtime.chanrecv1`

// entry points for <- c from compiled code
//go:nosplit
func chanrecv1(c *hchan, elem unsafe.Pointer) {
	chanrecv(c, elem, true)
}

`fmt.Fprintln` (a closer look at empty interfaces and variadic arguments)

// fmt/print.go
func Fprintln(w io.Writer, a ...interface{}) (n int, err error)

The first parameter is an io.Writer interface:

// io/io.go
type Writer interface {
	Write(p []byte) (n int, err error)
}

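So it is an iface structure, which occupies 2 QWORDs.

x = fmt_Fprintln(&off_4F0488, qword_567628, v11, 3LL, 3LL);

Here &off_4F0488 and qword_567628 are the two words of that interface: the itab pointer (for a non-empty interface the first word points to an itab rather than directly to a _type) and data unsafe.Pointer.

Cross-references to qword_567628 show that it is roughly the result of

os_newFile(qword_5AAE00, (__int64)"/dev/stdout", 11LL, (__int64)"file", 4LL);

which is the output stream.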

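The second parameter in the Go source is the very common ...interface{} form: both variadic and an empty interface.

The Vararg Calls section of the following article describes what variadic calls look like:

The Go low-level calling convention on x86-64 (updated) - What’s new in 2020 and in Go 1.15 · dr knz @ work (dr-knz.net)

In short, the caller builds a slice object on the stack whose data pointer refers to the positional arguments, which are also stored on the stack; that slice is then passed to the callee as an ordinary fixed-position argument.

If the variadic element type is an empty interface, the conversion of each concrete value to an interface value happens together with the variadic packing.

An empty interface in the function prototype means that every argument passed in is automatically converted to its interface representation first.

The empty interface analyzed in part 2, by contrast, was an interface object that was created without being initialized with any value.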
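At the source level the same behavior can be observed directly: inside the callee a variadic parameter is just a slice, and each element is an interface value that remembers its dynamic type. A minimal sketch; describe is a hypothetical function used only for illustration.

```go
package main

import "fmt"

// describe (hypothetical example) receives its arguments as a
// []interface{} and reports how many there are and their dynamic types.
func describe(args ...interface{}) (n int, types []string) {
	for _, a := range args {
		types = append(types, fmt.Sprintf("%T", a))
	}
	return len(args), types
}

func main() {
	x, y := 1, "two"

	// The compiler packs the three values into a temporary []interface{}.
	n, types := describe(x, y, 3.0)
	fmt.Println(n, types)

	// An existing slice can be passed as-is with the ... suffix.
	packed := []interface{}{x, y}
	n, types = describe(packed...)
	fmt.Println(n, types)
}
```

The first call is the case seen in the disassembly: the caller builds both the interface values and the slice header before the call.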

Example 1: `f(...int)`

package main

func f(...int) {}

var x, y, z, w int

func caller() {
	f(x, y, z, w)
}

func main() {
	caller()
}

.text:0000000000462740                 mov     rcx, gs:28h
.text:0000000000462749                 mov     rcx, [rcx+0]
.text:0000000000462750                 cmp     rsp, [rcx+10h]
.text:0000000000462754                 jbe     loc_462800

.text:000000000046275A                 sub     rsp, 60h
.text:000000000046275E                 mov     [rsp+58h], rbp
.text:0000000000462763                 lea     rbp, [rsp+58h]

								   ; initialize the slice
.text:0000000000462768                 xorps   xmm0, xmm0       ; zero xmm0
.text:000000000046276B                 movups  xmmword ptr [rsp+18h], xmm0 
.text:0000000000462770                 movups  xmmword ptr [rsp+28h], xmm0 ; zero the 4 QWORDs at 0x18-0x38

                                      ; [rsp+0x38] = rsp+0x18
                                      ; 0x18 is presumably the start of the slice data
                                      ; 0x38 acts as the pointer to it
.text:0000000000462775                 lea     rax, [rsp+18h]
.text:000000000046277A                 mov     [rsp+38h], rax

.text:000000000046277F                 test    [rax], al

								   ; [0x18]=main_x (global variable, never assigned)
.text:0000000000462781                 mov     rcx, cs:main_x
.text:0000000000462788                 mov     [rsp+18h], rcx

								   ; [0x20]=main_y
.text:000000000046278D                 test    [rax], al
.text:000000000046278F                 mov     rcx, cs:main_y
.text:0000000000462796                 mov     [rsp+20h], rcx

								   ; [0x28]=main_z
.text:000000000046279B                 test    [rax], al
.text:000000000046279D                 mov     rcx, cs:main_z
.text:00000000004627A4                 mov     [rsp+28h], rcx

								   ; [0x30]=main_w
.text:00000000004627A9                 test    [rax], al
.text:00000000004627AB                 mov     rax, cs:main_w
.text:00000000004627B2                 mov     [rsp+30h], rax

								   ; rax=[rsp+0x38]=rsp+0x18
.text:00000000004627B7                 mov     rax, [rsp+38h]
.text:00000000004627BC                 test    [rax], al
.text:00000000004627BE                 xchg    ax, ax
.text:00000000004627C0                 jmp     short $+2

								   ; [0x40]=rax=rsp+0x18= start of slice
.text:00000000004627C2                 mov     [rsp+40h], rax
								   ; [0x48]=4
.text:00000000004627C7                 mov     qword ptr [rsp+48h], 4
								   ; [0x50]=4
.text:00000000004627D0                 mov     qword ptr [rsp+50h], 4


								   ; [0]=start of slice
.text:00000000004627D9                 mov     [rsp], rax
								   ; [0x8]=4
.text:00000000004627DD                 mov     qword ptr [rsp+8], 4
								   ; [0x10]=4
.text:00000000004627E6                 mov     qword ptr [rsp+10h], 4
.text:00000000004627EF                 call    main_f
.text:00000000004627F4                 mov     rbp, [rsp+58h]
.text:00000000004627F9                 add     rsp, 60h
.text:00000000004627FD                 retn

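To summarize:

[0x18-0x30]=x,y,z,w

[0x38]=0x18

; [0x40-0x50] is a slice object
[0x40]=0x18  ; uintptr
[0x48]=4     ; len
[0x50]=4     ; cap

; [0x0-0x10] is the outgoing argument area, also a slice object
; [0x0]=rax=0x18
; [0x8]=4
; [0x10]=4

As for the extra slice object at rsp+0x40 to rsp+0x50, I think it is redundant code caused by disabling optimizations. This seems to show up often when building with -gcflags="-N -l"; ignore it for now.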

After fixing up the structure definitions, IDA gives:

void main_caller()
{
  _QWORD list[4]; // [rsp+18h] [rbp-48h] BYREF
  _QWORD *v1; // [rsp+38h] [rbp-28h]
  slice obj; // [rsp+40h] [rbp-20h]
  void *retaddr; // [rsp+60h] [rbp+0h] BYREF

  while ( (unsigned __int64)&retaddr <= *(_QWORD *)(*(_QWORD *)NtCurrentTeb()->NtTib.ArbitraryUserPointer + 16LL) )
    runtime_morestack_noctxt();

  v1 = list;
  list[0] = main_x;
  list[1] = main_y;
  list[2] = main_z;
  list[3] = main_w;
  obj.data = list;
  obj.len = 4LL;
  obj.cap = 4LL;
  main_f(list, 4LL, 4LL);
}

Very clear.

Example 2: `f(...interface{})`

package main

func f(...interface{}) {}

var x, y, z, w int

func caller() {
	f(x, y, z, w)
}

func main() {
	caller()
}

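It is essentially the same, with an additional conversion to the interface type.

The IDA structures and decompilation after roughly recovering the interface and slice structures: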

00000000 interface       struc ; (sizeof=0x10, mappedto_13)
00000000                                         ; XREF: main_caller+3E/w
00000000                                         ; main.caller/r
00000000 Tab             dq ?                    ; XREF: main_caller+39/w
00000000                                         ; main_caller+43/w ... ; offset
00000008 Data            dq ?                    ; XREF: main_caller+10/o ; offset
00000010 interface       ends
00000010
00000000 ; ---------------------------------------------------------------------------
00000000
00000000 slice           struc ; (sizeof=0x18, mappedto_14)
00000000                                         ; XREF: main.caller/r
00000000                                         ; main.caller/r
00000000 data            dq ?                    ; XREF: main_caller:loc_462891/w
00000000                                         ; main_caller+168/w ; offset
00000008 len             dq ?                    ; XREF: main_caller+156/w
00000008                                         ; main_caller+16C/w
00000010 cap             dq ?                    ; XREF: main_caller+15F/w
00000010                                         ; main_caller+175/w
00000018 slice           ends

Note that slice.data is best typed as interface*, which is easier to read.

void main_caller()
{
  interface *inter___; // rax
  interface *inter____; // rax
  interface *inter__; // rax
  slice s; // [rsp+0h] [rbp-A0h]
  __int64 w; // [rsp+18h] [rbp-88h] BYREF
  __int64 z; // [rsp+20h] [rbp-80h] BYREF
  __int64 y; // [rsp+28h] [rbp-78h] BYREF
  __int64 x; // [rsp+30h] [rbp-70h] BYREF
  interface *inter_; // [rsp+38h] [rbp-68h]
  slice s2; // [rsp+40h] [rbp-60h]
  interface inter[4]; // [rsp+58h] [rbp-48h] BYREF

  while ( (unsigned __int64)&inter[2].Data <= *(_QWORD *)(*(_QWORD *)NtCurrentTeb()->NtTib.ArbitraryUserPointer + 16LL) )
    runtime_morestack_noctxt();

  inter[0].Data = 0LL;
  memset(&inter[1], 0, 48);
  inter_ = inter;
  x = main_x;
  inter[0].Tab = &int_struct;
  if ( runtime_writeBarrier )
    runtime_gcWriteBarrier();
  else
    inter[0].Data = (QWORD *)&x;

  y = main_y;
  inter___ = inter_;
  inter_[1].Tab = &int_struct;
  if ( runtime_writeBarrier )
    runtime_gcWriteBarrier();
  else
    inter___[1].Data = (QWORD *)&y;

  z = main_z;
  inter____ = inter_;
  inter_[2].Tab = &int_struct;
  if ( runtime_writeBarrier )
    runtime_gcWriteBarrier();
  else
    inter____[2].Data = (QWORD *)&z;

  w = main_w;
  inter__ = inter_;
  inter_[3].Tab = &int_struct;
  if ( runtime_writeBarrier )
    runtime_gcWriteBarrier();
  else
    inter__[3].Data = (QWORD *)&w;

  s2.data = inter_;
  s2.len = 4LL;
  s2.cap = 4LL;

  s.data = inter_;
  s.len = 4LL;
  s.cap = 4LL;

  main_f(s);
}

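Leaving register arguments aside, the first thing on the stack is slice s

slice s; // [rsp+0h] [rbp-A0h]

As before, this is clearly the argument passed to the callee. At the end of the function: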

s.data = inter_;
s.len = 4LL;
s.cap = 4LL;

main_f(s);

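s.data points to the interface[4] array, whose entries are the elements of the slice.

Next come 4 QWORDs holding copies of the global variables w, z, y, x. That is not enough by itself: they have to be converted to interface values, which are then supplied to the slice s described above.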

interface inter[4]; // [rsp+58h] [rbp-48h] BYREF

x = main_x;
inter[0].Tab = &int_struct;
if ( runtime_writeBarrier )
    runtime_gcWriteBarrier();
else
    inter[0].Data = (QWORD *)&x;

// y, z and w are handled the same way

In between there is once again a redundant, unused slice s2

slice s2; // [rsp+40h] [rbp-60h]

s2.data = inter_;
s2.len = 4LL;
s2.cap = 4LL;

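So, back to the original channel program

x = fmt_Fprintln(&off_4F0488, qword_567628, v11, 3LL, 3LL);

The last 3 arguments are clearly a slice object whose elements have all been converted to interface values. It can be recovered with the same approach as above.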

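Summary

The go statement is essentially syntactic sugar for a call to newproc(sz, fn), which creates the new goroutine.
mcall is used to call functions that switch goroutines.
runtime.newobject(*_type) indicates that an object of some type is created here.
sync___ptr_Mutex__lockSlow and sync___ptr_Mutex__unlockSlow mark Lock() and Unlock() on a sync.Mutex.
Once the sync package is involved, reversing gets considerably harder.
Channels are easier to reverse than sync.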

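Miscellaneous

`strip`ping Go programs

Go binaries apparently cannot be stripped directly with the strip program from the GNU toolchain: the resulting binary does lose its debug information, but it also no longer seems to run correctly.

go build -ldflags "-s -w" xxx.go

Stripping has to be done through linker flags instead.

-s disable symbol table
-w disable DWARF generation

`go-strip`

GitHub - boy-hack/go-strip: 清除Go编译时自带的信息

如何消除Go的编译特征.md (qq.com)

The author open-sourced it, but not completely: in the end you still have to pay to join his 知识星球 group to get the source. (I'm broke.)

His blog post is still a useful reference, though.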

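Concurrency model

GO语言基础进阶教程：Go语言的并发模型 - 知乎 (zhihu.com)

According to that article, Go uses a two-level threading model.

It also gives a fairly clear explanation of Go's M, P, G scheduling model.

References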

GO语言基础进阶教程：Go语言的协程——Goroutine - 知乎 (zhihu.com)

Golang 之协程详解 - 星火燎原智勇 - 博客园 (cnblogs.com)

多线程基础 - 廖雪峰的官方网站 (liaoxuefeng.com)

GO语言基础进阶教程：Go语言的并发模型 - 知乎 (zhihu.com)

由浅入深剖析 go channel - 简书 (jianshu.com)

Author: Taardis
Permalink: https://taardisaa.github.io/2022/03/03/Go Reverse 3/
License: Unless otherwise stated, all articles on this blog are licensed under the Apache License 2.0. Please credit the source when republishing!
