前言
大名鼎鼎的 Bomb Lab
Bomb Lab 对应 CSAPP 内讲汇编与 C 逆向(或者说讲汇编器行为)的第三章,这个 Lab 旨在提高我们的阅读汇编和一些 C 逆向能力
题目只给了我们一个不完整的 C 源文件、一个 64 位 ELF 文件和一个 README
题意大概是:“二进制炸弹” bomb 里有 6 个 phase,每个 phase 都会要求我们输入一段密码,密码错误则“爆炸”,需要我们通过动态调试和阅读反汇编代码来推测正确的输入
Bomb Lab
做 Lab 就不依赖 IDA 了,老老实实手动做吧
使用objdump
命令可以将 ELF 文件反汇编,这样反汇编出来的是 AT&T 语法(比较反直觉,但是契合书本),也可以加参数改成 Intel 语法
建议改后缀为.asm
并在 vscode 下一个 ASM 插件加高亮方便阅读
objdump -b bomb > bomb.asm
动态调试那当然选择 gdb+pwndbg 啦
phase_1
main 里在输入完后将输入传入 rdi,然后 call phase_1
400e32: e8 67 06 00 00 call 40149e <read_line>
400e37: 48 89 c7 mov %rax,%rdi
400e3a: e8 a1 00 00 00 call 400ee0 <phase_1>
phase_1 如下
0000000000400ee0 <phase_1>:
400ee0: 48 83 ec 08 sub $0x8,%rsp
400ee4: be 00 24 40 00 mov $0x402400,%esi
400ee9: e8 4a 04 00 00 call 401338 <strings_not_equal>
400eee: 85 c0 test %eax,%eax
400ef0: 74 05 je 400ef7 <phase_1+0x17>
400ef2: e8 43 05 00 00 call 40143a <explode_bomb>
400ef7: 48 83 c4 08 add $0x8,%rsp
400efb: c3 ret
可以看到 phase_1 调用了 strings_not_equal,想都不用想就知道是比较字符串用的,若两字符串相等则返回 0,否则返回 1
然后判定存放返回值的寄存器 eax 是否为 0,如果不为 0 则不发生跳转触发“爆炸”
观察到 strings_not_equal 有两个参数,一个是 rdi 里的我们的输入,另一个在 rsi 里,那么 rsi 里的自然是密码了
gdb 里在 phase_1 下个断点,然后直接看 0x402400 里的内容就找到了
pwndbg> x /s 0x402400
0x402400: "Border relations with Canada have never been better."
phase_2
同样在 main 完成输入后 call phase_2,不再赘述
phase_2 如下
0000000000400efc <phase_2>:
400efc: 55 push %rbp
400efd: 53 push %rbx
400efe: 48 83 ec 28 sub $0x28,%rsp
400f02: 48 89 e6 mov %rsp,%rsi
400f05: e8 52 05 00 00 call 40145c <read_six_numbers>
400f0a: 83 3c 24 01 cmpl $0x1,(%rsp)
400f0e: 74 20 je 400f30 <phase_2+0x34>
400f10: e8 25 05 00 00 call 40143a <explode_bomb>
400f15: eb 19 jmp 400f30 <phase_2+0x34>
400f17: 8b 43 fc mov -0x4(%rbx),%eax
400f1a: 01 c0 add %eax,%eax
400f1c: 39 03 cmp %eax,(%rbx)
400f1e: 74 05 je 400f25 <phase_2+0x29>
400f20: e8 15 05 00 00 call 40143a <explode_bomb>
400f25: 48 83 c3 04 add $0x4,%rbx
400f29: 48 39 eb cmp %rbp,%rbx
400f2c: 75 e9 jne 400f17 <phase_2+0x1b>
400f2e: eb 0c jmp 400f3c <phase_2+0x40>
400f30: 48 8d 5c 24 04 lea 0x4(%rsp),%rbx
400f35: 48 8d 6c 24 18 lea 0x18(%rsp),%rbp
400f3a: eb db jmp 400f17 <phase_2+0x1b>
400f3c: 48 83 c4 28 add $0x28,%rsp
400f40: 5b pop %rbx
400f41: 5d pop %rbp
400f42: c3 ret
read_six_numbers 如下
000000000040145c <read_six_numbers>:
40145c: 48 83 ec 18 sub $0x18,%rsp
401460: 48 89 f2 mov %rsi,%rdx
401463: 48 8d 4e 04 lea 0x4(%rsi),%rcx
401467: 48 8d 46 14 lea 0x14(%rsi),%rax
40146b: 48 89 44 24 08 mov %rax,0x8(%rsp)
401470: 48 8d 46 10 lea 0x10(%rsi),%rax
401474: 48 89 04 24 mov %rax,(%rsp)
401478: 4c 8d 4e 0c lea 0xc(%rsi),%r9
40147c: 4c 8d 46 08 lea 0x8(%rsi),%r8
401480: be c3 25 40 00 mov $0x4025c3,%esi
401485: b8 00 00 00 00 mov $0x0,%eax
40148a: e8 61 f7 ff ff call 400bf0 <__isoc99_sscanf@plt>
40148f: 83 f8 05 cmp $0x5,%eax
401492: 7f 05 jg 401499 <read_six_numbers+0x3d>
401494: e8 a1 ff ff ff call 40143a <explode_bomb>
401499: 48 83 c4 18 add $0x18,%rsp
40149d: c3 ret
虽然不是很重要,但还是先来分析下这个 read_six_numbers
这个函数首先利用存放了 rsp 的 rsi 按每 4 字节的间隔设置了存放 sscanf 的 8 个参数的寄存器或栈地址(输入 + 格式化字符串 + 6 个地址,地址指向栈内 24 个字节处),并将 0x4025c3 处的内容存在 esi 里,输入完成后就比较输入的数量,少于 6 个则触发“炸弹”
0x4025c3 里存放的、也就是 sscanf 的格式化字符串是 6 个 %d,所以我们需要输入 6 个整数
pwndbg> x /s 0x4025c3
0x4025c3: "%d %d %d %d %d %d"
然后回来分析下 phase_2
在完成后,首先检查第一个输入是否为 1,然后经过一连串眼花缭乱的跳转,每次跳转都会检查后一个输入是否为前一个输入的两倍,否则触发“炸弹”
所以可以推测出我们应该输入的是:1 2 4 8 16 32
phase_3
0000000000400f43 <phase_3>:
400f43: 48 83 ec 18 sub $0x18,%rsp
400f47: 48 8d 4c 24 0c lea 0xc(%rsp),%rcx
400f4c: 48 8d 54 24 08 lea 0x8(%rsp),%rdx
400f51: be cf 25 40 00 mov $0x4025cf,%esi
400f56: b8 00 00 00 00 mov $0x0,%eax
400f5b: e8 90 fc ff ff call 400bf0 <__isoc99_sscanf@plt>
400f60: 83 f8 01 cmp $0x1,%eax
400f63: 7f 05 jg 400f6a <phase_3+0x27>
400f65: e8 d0 04 00 00 call 40143a <explode_bomb>
400f6a: 83 7c 24 08 07 cmpl $0x7,0x8(%rsp)
400f6f: 77 3c ja 400fad <phase_3+0x6a>
400f71: 8b 44 24 08 mov 0x8(%rsp),%eax
400f75: ff 24 c5 70 24 40 00 jmp *0x402470(,%rax,8)
400f7c: b8 cf 00 00 00 mov $0xcf,%eax
400f81: eb 3b jmp 400fbe <phase_3+0x7b>
400f83: b8 c3 02 00 00 mov $0x2c3,%eax
400f88: eb 34 jmp 400fbe <phase_3+0x7b>
400f8a: b8 00 01 00 00 mov $0x100,%eax
400f8f: eb 2d jmp 400fbe <phase_3+0x7b>
400f91: b8 85 01 00 00 mov $0x185,%eax
400f96: eb 26 jmp 400fbe <phase_3+0x7b>
400f98: b8 ce 00 00 00 mov $0xce,%eax
400f9d: eb 1f jmp 400fbe <phase_3+0x7b>
400f9f: b8 aa 02 00 00 mov $0x2aa,%eax
400fa4: eb 18 jmp 400fbe <phase_3+0x7b>
400fa6: b8 47 01 00 00 mov $0x147,%eax
400fab: eb 11 jmp 400fbe <phase_3+0x7b>
400fad: e8 88 04 00 00 call 40143a <explode_bomb>
400fb2: b8 00 00 00 00 mov $0x0,%eax
400fb7: eb 05 jmp 400fbe <phase_3+0x7b>
400fb9: b8 37 01 00 00 mov $0x137,%eax
400fbe: 3b 44 24 0c cmp 0xc(%rsp),%eax
400fc2: 74 05 je 400fc9 <phase_3+0x86>
400fc4: e8 71 04 00 00 call 40143a <explode_bomb>
400fc9: 48 83 c4 18 add $0x18,%rsp
400fcd: c3 ret
400f43~400f65:首先设置输入保存地址为栈上第 8 和第 12 字节,0x4025cf 处内容为 %d %d,即需要输入两个整数,完成输入后检查输入个数(scanf 返回值为输入个数)是否大于等于 1,是则跳转到 400f6a
400f6a~400f6f:比较第一个输入是否大于 7,大于则跳转触发“炸弹”
400f71~400fb7:跳转到 *(0x402470) + input1 * 8 处,0x402470 处的内容为 0x400f7c 也就是 0x402470 的下一语句的地址,随后就是一连串的比较与跳转,不难看出这就是 switch 语句的汇编形式,这里每个分支的逻辑相似,都是将 eax 赋值为某个值然后跳转至 0x400fbe
400fbe~400fcd:比较第二个输入与 eax 是否相等,相等则成功返回,否则触发“炸弹”
所以我们第一个输入可以设置为 0~7,然后在 switch 语句里找相应的值作为第二个输入即可
这里我选择 0 207
phase_4
phase_4 和 func4 如下
0000000000400fce <func4>:
400fce: 48 83 ec 08 sub $0x8,%rsp
400fd2: 89 d0 mov %edx,%eax
400fd4: 29 f0 sub %esi,%eax
400fd6: 89 c1 mov %eax,%ecx
400fd8: c1 e9 1f shr $0x1f,%ecx
400fdb: 01 c8 add %ecx,%eax
400fdd: d1 f8 sar %eax
400fdf: 8d 0c 30 lea (%rax,%rsi,1),%ecx
400fe2: 39 f9 cmp %edi,%ecx
400fe4: 7e 0c jle 400ff2 <func4+0x24>
400fe6: 8d 51 ff lea -0x1(%rcx),%edx
400fe9: e8 e0 ff ff ff call 400fce <func4>
400fee: 01 c0 add %eax,%eax
400ff0: eb 15 jmp 401007 <func4+0x39>
400ff2: b8 00 00 00 00 mov $0x0,%eax
400ff7: 39 f9 cmp %edi,%ecx
400ff9: 7d 0c jge 401007 <func4+0x39>
400ffb: 8d 71 01 lea 0x1(%rcx),%esi
400ffe: e8 cb ff ff ff call 400fce <func4>
401003: 8d 44 00 01 lea 0x1(%rax,%rax,1),%eax
401007: 48 83 c4 08 add $0x8,%rsp
40100b: c3 ret
000000000040100c <phase_4>:
40100c: 48 83 ec 18 sub $0x18,%rsp
401010: 48 8d 4c 24 0c lea 0xc(%rsp),%rcx
401015: 48 8d 54 24 08 lea 0x8(%rsp),%rdx
40101a: be cf 25 40 00 mov $0x4025cf,%esi
40101f: b8 00 00 00 00 mov $0x0,%eax
401024: e8 c7 fb ff ff call 400bf0 <__isoc99_sscanf@plt>
401029: 83 f8 02 cmp $0x2,%eax
40102c: 75 07 jne 401035 <phase_4+0x29>
40102e: 83 7c 24 08 0e cmpl $0xe,0x8(%rsp)
401033: 76 05 jbe 40103a <phase_4+0x2e>
401035: e8 00 04 00 00 call 40143a <explode_bomb>
40103a: ba 0e 00 00 00 mov $0xe,%edx
40103f: be 00 00 00 00 mov $0x0,%esi
401044: 8b 7c 24 08 mov 0x8(%rsp),%edi
401048: e8 81 ff ff ff call 400fce <func4>
40104d: 85 c0 test %eax,%eax
40104f: 75 07 jne 401058 <phase_4+0x4c>
401051: 83 7c 24 0c 00 cmpl $0x0,0xc(%rsp)
401056: 74 05 je 40105d <phase_4+0x51>
401058: e8 dd 03 00 00 call 40143a <explode_bomb>
40105d: 48 83 c4 18 add $0x18,%rsp
401061: c3 ret
<phase_4> 执行流程:
40100c~40102c:同样是输入两个整数,然后检查输入个数是否等于 2,否则跳转触发“炸弹”
40102e~401048:判断第一个输入是否小于等于 14,是则跳转至 0x40103a,然后 func (input_1, 0, 14)
40104d~401061:检查 func4 的返回值是否为 0,是则不触发“炸弹”,然后检查第二个输入是否等于 0,是则成功返回
<func4> 通过观察可以发现是个递归函数,这里不太好描述执行流程,直接贴个 C 代码更清晰些(有点晕,不保证全部逻辑都正确)
int func4(int edi, int esi, int edx) {
int eax, ecx;
eax = (edx - esi) / 2;
ecx = eax + esi;
if (ecx <= edi) {
eax = 0;
if (ecx >= edi) {
return eax;
}
else {
eax = func4(edi, ecx + 1, edx);
}
}
else {
edx--;
eax = func4(edi, esi, edx);
}
eax = eax * 2;
return eax;
}
所以要使得返回值为 0,就要使得 eax 为 0,进一步反推得到两种方案:
- 令 edi=ecx=7,使函数直接返回被置 0 的 eax
- 令 edi=0,使函数进入 edx 递减的递归,最终达成 edi=ecx=0,从而返回被置 0 的 eax
所以我们的输入可以是 7 0 或 0 0
phase_5
phase_5 如下
0000000000401062 <phase_5>:
401062: 53 push %rbx
401063: 48 83 ec 20 sub $0x20,%rsp
401067: 48 89 fb mov %rdi,%rbx
40106a: 64 48 8b 04 25 28 00 mov %fs:0x28,%rax
401071: 00 00
401073: 48 89 44 24 18 mov %rax,0x18(%rsp)
401078: 31 c0 xor %eax,%eax
40107a: e8 9c 02 00 00 call 40131b <string_length>
40107f: 83 f8 06 cmp $0x6,%eax
401082: 74 4e je 4010d2 <phase_5+0x70>
401084: e8 b1 03 00 00 call 40143a <explode_bomb>
401089: eb 47 jmp 4010d2 <phase_5+0x70>
40108b: 0f b6 0c 03 movzbl (%rbx,%rax,1),%ecx
40108f: 88 0c 24 mov %cl,(%rsp)
401092: 48 8b 14 24 mov (%rsp),%rdx
401096: 83 e2 0f and $0xf,%edx
401099: 0f b6 92 b0 24 40 00 movzbl 0x4024b0(%rdx),%edx
4010a0: 88 54 04 10 mov %dl,0x10(%rsp,%rax,1)
4010a4: 48 83 c0 01 add $0x1,%rax
4010a8: 48 83 f8 06 cmp $0x6,%rax
4010ac: 75 dd jne 40108b <phase_5+0x29>
4010ae: c6 44 24 16 00 movb $0x0,0x16(%rsp)
4010b3: be 5e 24 40 00 mov $0x40245e,%esi
4010b8: 48 8d 7c 24 10 lea 0x10(%rsp),%rdi
4010bd: e8 76 02 00 00 call 401338 <strings_not_equal>
4010c2: 85 c0 test %eax,%eax
4010c4: 74 13 je 4010d9 <phase_5+0x77>
4010c6: e8 6f 03 00 00 call 40143a <explode_bomb>
4010cb: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
4010d0: eb 07 jmp 4010d9 <phase_5+0x77>
4010d2: b8 00 00 00 00 mov $0x0,%eax
4010d7: eb b2 jmp 40108b <phase_5+0x29>
4010d9: 48 8b 44 24 18 mov 0x18(%rsp),%rax
4010de: 64 48 33 04 25 28 00 xor %fs:0x28,%rax
4010e5: 00 00
4010e7: 74 05 je 4010ee <phase_5+0x8c>
4010e9: e8 42 fa ff ff call 400b30 <__stack_chk_fail@plt>
4010ee: 48 83 c4 20 add $0x20,%rsp
4010f2: 5b pop %rbx
4010f3: c3 ret
比之前的长了一截,更有挑战性了
401062~401084:检测输出长度是否为6
401089~4010ae:循环6次,每次从输入中取一个字符的低 8 位,然后以 0x4024b0 为基址、以这个低 4 位为偏移取出字符存放在栈空间内组成新字符串,最后补上末尾截断
4010b3~4010c4:比较字符串
4010c4~4010f3:收尾工作,还为了防止溢出检查了 canary(原来检查 canary 是调用 __stack_chk_fail,学习了)
pwndbg> x /s 0x40245e
0x40245e: "flyers"
pwndbg> x /s 0x4024b0
0x4024b0 <array.3449>: "maduiersnfotvbylSo you think you can stop the bomb with ctrl-c, do you?"
由于输入被截取了低位,所以偏移的范围是 0~15,对应着字符串“maduiersnfotvbyl”
然后在这些字符里面找低位组成“flyers”(9FE567)即可
pwndbg> x /16t 0x4024b0
0x4024b0 <array.3449>: 01101101 01100001 01100100 01110101 01101001 01100101 01110010 01110011
0x4024b8 <array.3449+8>: 01101110 01100110 01101111 01110100 01110110 01100010 01111001 01101100
最终得出:ionefg(0x49,0x4f,0x4E,0x45,0x46,0x47)
phase_6
00000000004010f4 <phase_6>:
4010f4: 41 56 push %r14
4010f6: 41 55 push %r13
4010f8: 41 54 push %r12
4010fa: 55 push %rbp
4010fb: 53 push %rbx
4010fc: 48 83 ec 50 sub $0x50,%rsp
401100: 49 89 e5 mov %rsp,%r13
401103: 48 89 e6 mov %rsp,%rsi
401106: e8 51 03 00 00 call 40145c <read_six_numbers>
40110b: 49 89 e6 mov %rsp,%r14
40110e: 41 bc 00 00 00 00 mov $0x0,%r12d
401114: 4c 89 ed mov %r13,%rbp
401117: 41 8b 45 00 mov 0x0(%r13),%eax
40111b: 83 e8 01 sub $0x1,%eax
40111e: 83 f8 05 cmp $0x5,%eax
401121: 76 05 jbe 401128 <phase_6+0x34>
401123: e8 12 03 00 00 call 40143a <explode_bomb>
401128: 41 83 c4 01 add $0x1,%r12d
40112c: 41 83 fc 06 cmp $0x6,%r12d
401130: 74 21 je 401153 <phase_6+0x5f>
401132: 44 89 e3 mov %r12d,%ebx
401135: 48 63 c3 movslq %ebx,%rax
401138: 8b 04 84 mov (%rsp,%rax,4),%eax
40113b: 39 45 00 cmp %eax,0x0(%rbp)
40113e: 75 05 jne 401145 <phase_6+0x51>
401140: e8 f5 02 00 00 call 40143a <explode_bomb>
401145: 83 c3 01 add $0x1,%ebx
401148: 83 fb 05 cmp $0x5,%ebx
40114b: 7e e8 jle 401135 <phase_6+0x41>
40114d: 49 83 c5 04 add $0x4,%r13
401151: eb c1 jmp 401114 <phase_6+0x20>
401153: 48 8d 74 24 18 lea 0x18(%rsp),%rsi
401158: 4c 89 f0 mov %r14,%rax
40115b: b9 07 00 00 00 mov $0x7,%ecx
401160: 89 ca mov %ecx,%edx
401162: 2b 10 sub (%rax),%edx
401164: 89 10 mov %edx,(%rax)
401166: 48 83 c0 04 add $0x4,%rax
40116a: 48 39 f0 cmp %rsi,%rax
40116d: 75 f1 jne 401160 <phase_6+0x6c>
40116f: be 00 00 00 00 mov $0x0,%esi
401174: eb 21 jmp 401197 <phase_6+0xa3>
401176: 48 8b 52 08 mov 0x8(%rdx),%rdx
40117a: 83 c0 01 add $0x1,%eax
40117d: 39 c8 cmp %ecx,%eax
40117f: 75 f5 jne 401176 <phase_6+0x82>
401181: eb 05 jmp 401188 <phase_6+0x94>
401183: ba d0 32 60 00 mov $0x6032d0,%edx
401188: 48 89 54 74 20 mov %rdx,0x20(%rsp,%rsi,2)
40118d: 48 83 c6 04 add $0x4,%rsi
401191: 48 83 fe 18 cmp $0x18,%rsi
401195: 74 14 je 4011ab <phase_6+0xb7>
401197: 8b 0c 34 mov (%rsp,%rsi,1),%ecx
40119a: 83 f9 01 cmp $0x1,%ecx
40119d: 7e e4 jle 401183 <phase_6+0x8f>
40119f: b8 01 00 00 00 mov $0x1,%eax
4011a4: ba d0 32 60 00 mov $0x6032d0,%edx
4011a9: eb cb jmp 401176 <phase_6+0x82>
4011ab: 48 8b 5c 24 20 mov 0x20(%rsp),%rbx
4011b0: 48 8d 44 24 28 lea 0x28(%rsp),%rax
4011b5: 48 8d 74 24 50 lea 0x50(%rsp),%rsi
4011ba: 48 89 d9 mov %rbx,%rcx
4011bd: 48 8b 10 mov (%rax),%rdx
4011c0: 48 89 51 08 mov %rdx,0x8(%rcx)
4011c4: 48 83 c0 08 add $0x8,%rax
4011c8: 48 39 f0 cmp %rsi,%rax
4011cb: 74 05 je 4011d2 <phase_6+0xde>
4011cd: 48 89 d1 mov %rdx,%rcx
4011d0: eb eb jmp 4011bd <phase_6+0xc9>
4011d2: 48 c7 42 08 00 00 00 movq $0x0,0x8(%rdx)
4011d9: 00
4011da: bd 05 00 00 00 mov $0x5,%ebp
4011df: 48 8b 43 08 mov 0x8(%rbx),%rax
4011e3: 8b 00 mov (%rax),%eax
4011e5: 39 03 cmp %eax,(%rbx)
4011e7: 7d 05 jge 4011ee <phase_6+0xfa>
4011e9: e8 4c 02 00 00 call 40143a <explode_bomb>
4011ee: 48 8b 5b 08 mov 0x8(%rbx),%rbx
4011f2: 83 ed 01 sub $0x1,%ebp
4011f5: 75 e8 jne 4011df <phase_6+0xeb>
4011f7: 48 83 c4 50 add $0x50,%rsp
4011fb: 5b pop %rbx
4011fc: 5d pop %rbp
4011fd: 41 5c pop %r12
4011ff: 41 5d pop %r13
401201: 41 5e pop %r14
401203: c3 ret
比 phase_5 还长,哈人()
还是一截一截放出来吧
首先是熟悉的输入 6 个数、存在栈空间中,这里不再赘述
4010f4: 41 56 push %r14
4010f6: 41 55 push %r13
4010f8: 41 54 push %r12
4010fa: 55 push %rbp
4010fb: 53 push %rbx
4010fc: 48 83 ec 50 sub $0x50,%rsp
401100: 49 89 e5 mov %rsp,%r13
401103: 48 89 e6 mov %rsp,%rsi
401106: e8 51 03 00 00 call 40145c <read_six_numbers>
然后是一个复杂的嵌套循环
根据跳出和跳回的关系层层拆开就明了不少了
40110b: 49 89 e6 mov %rsp,%r14
40110e: 41 bc 00 00 00 00 mov $0x0,%r12d
401114: 4c 89 ed mov %r13,%rbp
401117: 41 8b 45 00 mov 0x0(%r13),%eax
40111b: 83 e8 01 sub $0x1,%eax
40111e: 83 f8 05 cmp $0x5,%eax
401121: 76 05 jbe 401128 <phase_6+0x34>
401123: e8 12 03 00 00 call 40143a <explode_bomb>
401128: 41 83 c4 01 add $0x1,%r12d
40112c: 41 83 fc 06 cmp $0x6,%r12d
401130: 74 21 je 401153 <phase_6+0x5f>
401132: 44 89 e3 mov %r12d,%ebx
401135: 48 63 c3 movslq %ebx,%rax
401138: 8b 04 84 mov (%rsp,%rax,4),%eax
40113b: 39 45 00 cmp %eax,0x0(%rbp)
40113e: 75 05 jne 401145 <phase_6+0x51>
401140: e8 f5 02 00 00 call 40143a <explode_bomb>
401145: 83 c3 01 add $0x1,%ebx
401148: 83 fb 05 cmp $0x5,%ebx
40114b: 7e e8 jle 401135 <phase_6+0x41>
40114d: 49 83 c5 04 add $0x4,%r13
401151: eb c1 jmp 401114 <phase_6+0x20>
这段分析起来比较繁琐,还是写伪 C 代码比较直观
总的来说就是检查输入的每个数必须大于 0 小于等于 6,且互不相等
int input[6] = {/* something we input */};
for (r12 = 0;;) {
if (input[r12] > 6)
bomb();
r12++;
if (r12 == 6)
return;
do {
if (input[r12] == input[12 - 1])
bomb();
ebx++;
} while (ebx <= 5);
}
紧接着来到下一段,这里的 r14 就是上一段开头保存的栈指针 rsp
401153: 48 8d 74 24 18 lea 0x18(%rsp),%rsi
401158: 4c 89 f0 mov %r14,%rax
40115b: b9 07 00 00 00 mov $0x7,%ecx
401160: 89 ca mov %ecx,%edx
401162: 2b 10 sub (%rax),%edx
401164: 89 10 mov %edx,(%rax)
401166: 48 83 c0 04 add $0x4,%rax
40116a: 48 39 f0 cmp %rsi,%rax
40116d: 75 f1 jne 401160 <phase_6+0x6c>
不难看出这段就是用 7 依次减去栈内的值并替换之
再到下一段
40116f: be 00 00 00 00 mov $0x0,%esi
401174: eb 21 jmp 401197 <phase_6+0xa3>
401176: 48 8b 52 08 mov 0x8(%rdx),%rdx
40117a: 83 c0 01 add $0x1,%eax
40117d: 39 c8 cmp %ecx,%eax
40117f: 75 f5 jne 401176 <phase_6+0x82>
401181: eb 05 jmp 401188 <phase_6+0x94>
401183: ba d0 32 60 00 mov $0x6032d0,%edx
401188: 48 89 54 74 20 mov %rdx,0x20(%rsp,%rsi,2)
40118d: 48 83 c6 04 add $0x4,%rsi
401191: 48 83 fe 18 cmp $0x18,%rsi
401195: 74 14 je 4011ab <phase_6+0xb7>
401197: 8b 0c 34 mov (%rsp,%rsi,1),%ecx
40119a: 83 f9 01 cmp $0x1,%ecx
40119d: 7e e4 jle 401183 <phase_6+0x8f>
40119f: b8 01 00 00 00 mov $0x1,%eax
4011a4: ba d0 32 60 00 mov $0x6032d0,%edx
4011a9: eb cb jmp 401176 <phase_6+0x82>
乍一看完全看不懂是什么东西,无奈先看看 0x6032d0 附近的是什么
pwndbg> x/24wx 0x6032d0
0x6032d0 <node1>: 0x0000014c 0x00000001 0x006032e0 0x00000000
0x6032e0 <node2>: 0x000000a8 0x00000002 0x006032f0 0x00000000
0x6032f0 <node3>: 0x0000039c 0x00000003 0x00603300 0x00000000
0x603300 <node4>: 0x000002b3 0x00000004 0x00603310 0x00000000
0x603310 <node5>: 0x000001dd 0x00000005 0x00603320 0x00000000
0x603320 <node6>: 0x000001bb 0x00000006 0x00000000 0x00000000
node……首尾单向对应……竟然是个单向链表,每个 node 存放了一个值、一个标号和一个 next 指针
回头再来分析这段逻辑
40116f~401174:跳转完成一次初始化并进入循环,此时 ecx 为第一个输入, eax=1,edx 指向第一个 node
401176~40117f:循环直至 eax=ecx,每次循环 rdx 都累加来获取指向下一个 node 的地址
401188~401195:将此 node 的地址存入 rsp+0x20 及往后的栈空间,若未遍历完 6 个则重复整个过程,否则跳出
所以整段的含义其实是按照我们输入的数字顺序(7-n)来排列 node
再继续到下一段
4011ab: 48 8b 5c 24 20 mov 0x20(%rsp),%rbx
4011b0: 48 8d 44 24 28 lea 0x28(%rsp),%rax
4011b5: 48 8d 74 24 50 lea 0x50(%rsp),%rsi
4011ba: 48 89 d9 mov %rbx,%rcx
4011bd: 48 8b 10 mov (%rax),%rdx
4011c0: 48 89 51 08 mov %rdx,0x8(%rcx)
4011c4: 48 83 c0 08 add $0x8,%rax
4011c8: 48 39 f0 cmp %rsi,%rax
4011cb: 74 05 je 4011d2 <phase_6+0xde>
4011cd: 48 89 d1 mov %rdx,%rcx
4011d0: eb eb jmp 4011bd <phase_6+0xc9>
4011d2: 48 c7 42 08 00 00 00 movq $0x0,0x8(%rdx)
4011d9: 00
这段的含义是更新旧的 node 的 next 指针,使之指向我们上一段构造出的地址数组,这样就完成了链表的重排了
更简洁些描述就是:输入自定义顺序 --> 按此顺序在数组内存放对应 node 的地址 --> 按此数组更改 node 的 next 指针 --> 完成重排
来到最后一段
4011da: bd 05 00 00 00 mov $0x5,%ebp
4011df: 48 8b 43 08 mov 0x8(%rbx),%rax
4011e3: 8b 00 mov (%rax),%eax
4011e5: 39 03 cmp %eax,(%rbx)
4011e7: 7d 05 jge 4011ee <phase_6+0xfa>
4011e9: e8 4c 02 00 00 call 40143a <explode_bomb>
4011ee: 48 8b 5b 08 mov 0x8(%rbx),%rbx
4011f2: 83 ed 01 sub $0x1,%ebp
4011f5: 75 e8 jne 4011df <phase_6+0xeb>
4011f7: 48 83 c4 50 add $0x50,%rsp
4011fb: 5b pop %rbx
4011fc: 5d pop %rbp
4011fd: 41 5c pop %r12
4011ff: 41 5d pop %r13
401201: 41 5e pop %r14
401203: c3 ret
这段其实就是检查新链表各个值是否按照降序排列
所以我们只需将各个 node 储存的值按降序排列,然后标号 7-n 就是我们的输入了
最终得出:4 3 2 1 6 5(3 4 5 6 1 2 --> 4 3 2 1 6 5)
secret_phase
你以为结束了吗?其实有彩蛋!
Wow, they got it! But isn't something... missing? Perhaps
something they overlooked? Mua ha ha ha ha!
不仅注释里有提示,其实翻到 phase_6 后面也能发现有个 secret_phase
还记得 main 里调用的 phase_defused 吗,出题人在那里藏了一个 secret_phase
来研究下怎么触发这个 secret_phase
00000000004015c4 <phase_defused>:
4015c4: 48 83 ec 78 sub $0x78,%rsp
4015c8: 64 48 8b 04 25 28 00 mov %fs:0x28,%rax
4015cf: 00 00
4015d1: 48 89 44 24 68 mov %rax,0x68(%rsp)
4015d6: 31 c0 xor %eax,%eax
4015d8: 83 3d 81 21 20 00 06 cmpl $0x6,0x202181(%rip) # 603760 <num_input_strings>
4015df: 75 5e jne 40163f <phase_defused+0x7b>
4015e1: 4c 8d 44 24 10 lea 0x10(%rsp),%r8
4015e6: 48 8d 4c 24 0c lea 0xc(%rsp),%rcx
4015eb: 48 8d 54 24 08 lea 0x8(%rsp),%rdx
4015f0: be 19 26 40 00 mov $0x402619,%esi
4015f5: bf 70 38 60 00 mov $0x603870,%edi
4015fa: e8 f1 f5 ff ff call 400bf0 <__isoc99_sscanf@plt>
4015ff: 83 f8 03 cmp $0x3,%eax
401602: 75 31 jne 401635 <phase_defused+0x71>
401604: be 22 26 40 00 mov $0x402622,%esi
401609: 48 8d 7c 24 10 lea 0x10(%rsp),%rdi
40160e: e8 25 fd ff ff call 401338 <strings_not_equal>
401613: 85 c0 test %eax,%eax
401615: 75 1e jne 401635 <phase_defused+0x71>
401617: bf f8 24 40 00 mov $0x4024f8,%edi
40161c: e8 ef f4 ff ff call 400b10 <puts@plt>
401621: bf 20 25 40 00 mov $0x402520,%edi
401626: e8 e5 f4 ff ff call 400b10 <puts@plt>
40162b: b8 00 00 00 00 mov $0x0,%eax
401630: e8 0d fc ff ff call 401242 <secret_phase>
401635: bf 58 25 40 00 mov $0x402558,%edi
40163a: e8 d1 f4 ff ff call 400b10 <puts@plt>
40163f: 48 8b 44 24 68 mov 0x68(%rsp),%rax
401644: 64 48 33 04 25 28 00 xor %fs:0x28,%rax
40164b: 00 00
40164d: 74 05 je 401654 <phase_defused+0x90>
40164f: e8 dc f4 ff ff call 400b30 <__stack_chk_fail@plt>
401654: 48 83 c4 78 add $0x78,%rsp
401658: c3 ret
可以发现里面藏了一个 sscanf,若我们输入三个东西,并对最后一个输入与 0x402622 的内容相匹配则触发 secret_phase
但其输入串和格式化字符串和匹配串都未知,打个断点看看
pwndbg> x/s 0x603870
0x603870 <input_strings+240>: "0 0"
pwndbg> x/s 0x402619
0x402619: "%d %d %s"
pwndbg> x/s 0x402622
0x402622: "DrEvil"
0 0 可不就是我们在 phase_4 的输入嘛,跑回去看了看,phase_4 限定了只能输入两个,但是试验输入三个不会出错,也许 sscanf 的返回值只认与格式化字符串对应的输入吧
所以我们在 phase_4 输入 0 0 DrEvil 就能触发 secret_phase 了
secret_phase 如下
0000000000401242 <secret_phase>:
401242: 53 push %rbx
401243: e8 56 02 00 00 call 40149e <read_line>
401248: ba 0a 00 00 00 mov $0xa,%edx
40124d: be 00 00 00 00 mov $0x0,%esi
401252: 48 89 c7 mov %rax,%rdi
401255: e8 76 f9 ff ff call 400bd0 <strtol@plt>
40125a: 48 89 c3 mov %rax,%rbx
40125d: 8d 40 ff lea -0x1(%rax),%eax
401260: 3d e8 03 00 00 cmp $0x3e8,%eax
401265: 76 05 jbe 40126c <secret_phase+0x2a>
401267: e8 ce 01 00 00 call 40143a <explode_bomb>
40126c: 89 de mov %ebx,%esi
40126e: bf f0 30 60 00 mov $0x6030f0,%edi
401273: e8 8c ff ff ff call 401204 <fun7>
401278: 83 f8 02 cmp $0x2,%eax
40127b: 74 05 je 401282 <secret_phase+0x40>
40127d: e8 b8 01 00 00 call 40143a <explode_bomb>
401282: bf 38 24 40 00 mov $0x402438,%edi
401287: e8 84 f8 ff ff call 400b10 <puts@plt>
40128c: e8 33 03 00 00 call 4015c4 <phase_defused>
401291: 5b pop %rbx
401292: c3 ret
secret_phase 先读取输入,然后调用strtol,strtol 的作用为将字符串转换成长整型,然后检查字符串转长整数是否小于等于 1001(好优雅的 Magic Number),然后设置好参数(0x6030f0 和输入转换的长整数),随后调用 fun7,最后检查返回值是否等于 2
fun7 如下,又是让人头疼的递归
0000000000401204 <fun7>:
401204: 48 83 ec 08 sub $0x8,%rsp
401208: 48 85 ff test %rdi,%rdi
40120b: 74 2b je 401238 <fun7+0x34>
40120d: 8b 17 mov (%rdi),%edx
40120f: 39 f2 cmp %esi,%edx
401211: 7e 0d jle 401220 <fun7+0x1c>
401213: 48 8b 7f 08 mov 0x8(%rdi),%rdi
401217: e8 e8 ff ff ff call 401204 <fun7>
40121c: 01 c0 add %eax,%eax
40121e: eb 1d jmp 40123d <fun7+0x39>
401220: b8 00 00 00 00 mov $0x0,%eax
401225: 39 f2 cmp %esi,%edx
401227: 74 14 je 40123d <fun7+0x39>
401229: 48 8b 7f 10 mov 0x10(%rdi),%rdi
40122d: e8 d2 ff ff ff call 401204 <fun7>
401232: 8d 44 00 01 lea 0x1(%rax,%rax,1),%eax
401236: eb 05 jmp 40123d <fun7+0x39>
401238: b8 ff ff ff ff mov $0xffffffff,%eax
40123d: 48 83 c4 08 add $0x8,%rsp
401241: c3 ret
尝试写了写伪 C 代码(rdi=0x6030f0,esi=input)
int eax;
int fun7(int *rdi, int esi) {
if (rdi == 0)
return -1;
if (esi < *rdi) {
rdi = rdi + 1;
fun7(rdi, esi);
eax = eax * 2;
return eax;
}
else {
eax = 0;
if (esi == *rdi)
return eax;
rdi = rdi + 2;
fun7(esi, rdx);
eax = 1 + eax * 2;
return eax;
}
}
但是这样写应该很快就能发现:这样迭代下去指针 rdi 的指向是有问题的
所以我去看了看 rdi 的初始值 0x6030f0 周围都有什么
pwndbg> x /120wx 0x6030f0
0x6030f0 <n1>: 0x00000024 0x00000000 0x00603110 0x00000000
0x603100 <n1+16>: 0x00603130 0x00000000 0x00000000 0x00000000
0x603110 <n21>: 0x00000008 0x00000000 0x00603190 0x00000000
0x603120 <n21+16>: 0x00603150 0x00000000 0x00000000 0x00000000
0x603130 <n22>: 0x00000032 0x00000000 0x00603170 0x00000000
0x603140 <n22+16>: 0x006031b0 0x00000000 0x00000000 0x00000000
0x603150 <n32>: 0x00000016 0x00000000 0x00603270 0x00000000
0x603160 <n32+16>: 0x00603230 0x00000000 0x00000000 0x00000000
0x603170 <n33>: 0x0000002d 0x00000000 0x006031d0 0x00000000
0x603180 <n33+16>: 0x00603290 0x00000000 0x00000000 0x00000000
0x603190 <n31>: 0x00000006 0x00000000 0x006031f0 0x00000000
0x6031a0 <n31+16>: 0x00603250 0x00000000 0x00000000 0x00000000
0x6031b0 <n34>: 0x0000006b 0x00000000 0x00603210 0x00000000
0x6031c0 <n34+16>: 0x006032b0 0x00000000 0x00000000 0x00000000
0x6031d0 <n45>: 0x00000028 0x00000000 0x00000000 0x00000000
0x6031e0 <n45+16>: 0x00000000 0x00000000 0x00000000 0x00000000
0x6031f0 <n41>: 0x00000001 0x00000000 0x00000000 0x00000000
0x603200 <n41+16>: 0x00000000 0x00000000 0x00000000 0x00000000
0x603210 <n47>: 0x00000063 0x00000000 0x00000000 0x00000000
0x603220 <n47+16>: 0x00000000 0x00000000 0x00000000 0x00000000
0x603230 <n44>: 0x00000023 0x00000000 0x00000000 0x00000000
0x603240 <n44+16>: 0x00000000 0x00000000 0x00000000 0x00000000
0x603250 <n42>: 0x00000007 0x00000000 0x00000000 0x00000000
0x603260 <n42+16>: 0x00000000 0x00000000 0x00000000 0x00000000
0x603270 <n43>: 0x00000014 0x00000000 0x00000000 0x00000000
0x603280 <n43+16>: 0x00000000 0x00000000 0x00000000 0x00000000
0x603290 <n46>: 0x0000002f 0x00000000 0x00000000 0x00000000
0x6032a0 <n46+16>: 0x00000000 0x00000000 0x00000000 0x00000000
0x6032b0 <n48>: 0x000003e9 0x00000000 0x00000000 0x00000000
0x6032c0 <n48+16>: 0x00000000 0x00000000 0x00000000 0x00000000
一眼下去看不懂是什么东西,但是发现每组数值后都有两个地址,且都指向某个值,纸笔整理一下发现原来是个二叉搜索树
所以伪 C 代码应该改成这样(懒得重写,直接抄现成的了)
int fun7(node* current, int input)
{
if (current == NULL)
return -1;
int key = current->key;
if (input > key)
return 2 * fun7(current->right, input) + 1;
else if (input == key)
return 0;
else
return 2 * fun7(current->left, input);
}
目标是最终返回 2,分析一下
首先是 0x24,此时我们想返回 2,然而左子树 2*eax、右子树返回 1+2*eax,所以选左子树
然后来到 0x8,此时我们想返回 2÷2=1,选右子树
接着来到 0x16,此时我们想返回 (1-1)÷2=0,所以我们需要在这里使得 input=esi=*rdi,于是得出我们其中一个答案:0x16=22
或者更进一步,我们来到左子树 0x14,使其返回 0 再继续返回也同样满足需求,于是得出另一个答案:20
所以 secret_phase 的答案是:22 或 20
后记
CMU 的 Labs 设计者很懂得如何把控出循序渐进的难度,更懂得如何将一个 Lab 设计得更富有挑战性和成就感,迈入门槛后的每一个 phase 都有不同的惊喜(以及惊吓),尤其是phase_5 和 phase_6 让我抓狂不已,最终花费一天又多才总算完成
做完这个 Lab 最大的感受是叹服,叹服 Lab 设计者高超且优雅的汇编编程艺术,更叹服极早期用纯汇编编写软件的开拓者们;然后是感激,在这里给 Dennis Ritchie 和 Kenneth Thompson 磕个响头(怎么做个 Lab 跟去了精神传销一样)
筋疲力竭了,得歇两天迎接新年()
Comments NOTHING