I wrote this program to do the same thing as the hi.c program, but without any C library calls. Then, as suggested, I used gcc's -S option on hi.c and dissected the generated hi.s.
$ cat hiasm.asm
section .text
global _start
_start:
mov dl, 5
mov esi, msg
xor di,di
xor al,al
inc di
inc al
syscall
xor rdi,rdi
mov al,60
syscall
msg: db "Hello"
$ nasm -f elf64 hiasm.asm && ld -m elf_x86_64 hiasm.o -o hiasm && ./hiasm
Hello
$ echo $?
0
So this works fine.
Again, here is the simple hi.c:
$ cat hi.c
#include <stdio.h>
int main(void)
{
puts("Hello");
return 0;
}
$ gcc -S hi.c && cat hi.s
.file "hi.c"
.section .rodata
.LC0:
.string "Hello"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
leaq .LC0(%rip), %rdi
call puts@PLT
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Debian 6.3.0-18) 6.3.0 20170516"
.section .note.GNU-stack,"",@progbits
$ gcc hi.s -o hi && ./hi
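Hello
The labels .LFB0 and .LFE0 do not appear to be referenced anywhere in the .s file.
It still works as expected after deleting the two of them.
Quoting the "as" assembler documentation: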
https://sourceware.org/binutils/docs/as/index.html
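Local symbols are defined and used within the assembler, but they are
normally not saved in object files. Thus, they are not visible when
debugging. You may use the `-L' option (see Include Local Symbols) to
retain the local symbols in the object files.
So for a plain executable with no bells and whistles, they can be chopped.
So I got rid of those, easy.
Next, the function is called main, which is not much use here, so I call it _start.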
For ELF targets, the .size directive is used like this:
.size name , expression
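This directive sets the size associated with a symbol name. The size
in bytes is computed from expression, which can make use of label
arithmetic. This directive is typically used to set the size of
function symbols.
I don't need the function symbol size, so I removed the .size at the bottom that references main.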
$ cat hi.s
.file "hi.c" ##tells 'as' that we are about to start a new logical file
.section .rodata ##assembles the following code into section '.rodata'
.LC0: ##.LC0, .LFB0, .LFE0 are just local labels; symbols that
## are guaranteed to be unique over the source code
## that allow the compiler to use names/simple notation
## to reference sections of code
##But here, only .LC0 is actually referenced in the code
.string "Hello" ##
.text
.globl _start
_start:
.cfi_startproc ##used at the beginning of each function that should have an
##entry in .eh_frame. It initializes some internal data
##structures. Don't forget to close by .cfi_endproc
pushq %rbp ##push base pointer onto stack
.cfi_def_cfa_offset 16 ##modifies a rule for computing CFA. Register remains the
##same, but offset is new. Note that it is the absolute
##offset that will be added to a defined register to
##compute CFA address
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
leaq .LC0(%rip), %rdi
call puts@PLT
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc ##close of .cfi_startproc
.ident "GCC: (Debian 6.3.0-18) 6.3.0 20170516"
.section .note.GNU-stack,"",@progbits
Try it:
$ gcc -o hi hi.s
/tmp/ccLxG1jh.o: In function `_start':
hi.c:(.text+0x0): multiple definition of `_start'
/usr/lib/gcc/x86_64-linux-gnu/6/../../../x86_64-linux-gnu/Scrt1.o:(.text+0x0): first defined here
/usr/lib/gcc/x86_64-linux-gnu/6/../../../x86_64-linux-gnu/Scrt1.o: In function `_start':
(.text+0x20): undefined reference to `main'
collect2: error: ld returned 1 exit status
$ ldd hi
linux-vdso.so.1 (0x00007fffb6569000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe7456e7000)
/lib64/ld-linux-x86-64.so.2 (0x000055edc8bc8000)
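It definitely uses libc, and gcc links in the C startup file (Scrt1.o), which already defines _start. That explains the multiple definition.
So I'll try to get rid of the standard library with gcc's -nostdlib option.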
$ gcc -nostdlib -o hi hi.s
/tmp/ccV5QYaT.o: In function `_start':
hi.c:(.text+0xc): undefined reference to `puts'
collect2: error: ld returned 1 exit status
Right, puts still needs the C library, so get rid of puts and use the write system call instead.
.file "hi.c" ##tells 'as' that we are about to start a new logical file
.section .rodata ##assembles the following code into section '.rodata'
.LC0: ##.LC0, .LFB0, .LFE0 are just local labels; symbols that
## are guaranteed to be unique over the source code
## that allow the compiler to use names/simple notation
## to reference sections of code
##But here, only .LC0 is actually referenced in the code
.string "Hello" ##
.text
.globl _start
_start:
.cfi_startproc ##used at the beginning of each function that should have an
##entry in .eh_frame. It initializes some internal data
##structures. Don't forget to close by .cfi_endproc
pushq %rbp ##push base pointer onto stack
.cfi_def_cfa_offset 16 ##modifies a rule for computing CFA. Register remains the
##same, but offset is new. Note that it is the absolute
##offset that will be added to a defined register to
##compute CFA address
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
leaq .LC0(%rip), %rsi ##this reg value and others were changed for write call
movq $1, %rax
movq $1, %rdi
movq $5, %rdx
syscall
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc ##close of .cfi_startproc
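$ gcc -nostdlib -o hi hi.s && ./hi
HelloSegmentation fault
Promising: the write worked. The crash comes from the final ret, since _start has no caller to return to.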
.file "hi.c" ##tells 'as' that we are about to start a new logical file
.section .rodata ##assembles the following code into section '.rodata'
.LC0: ##.LC0, .LFB0, .LFE0 are just local labels; symbols that
## are guaranteed to be unique over the source code
## that allow the compiler to use names/simple notation
## to reference sections of code
##But here, only .LC0 is actually referenced in the code
.string "Hello"
.text
.globl _start
_start:
.cfi_startproc ##used at the beginning of each function that should have an
##entry in .eh_frame. It initializes some internal data
##structures. Don't forget to close by .cfi_endproc
##deleted the base pointer push and pops from stack, don't need stack
.cfi_def_cfa_offset 16 ##modifies a rule for computing CFA. Register remains the
##same, but offset is new. Note that it is the absolute
##offset that will be added to a defined register to
##compute CFA address
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
leaq .LC0(%rip), %rsi
movq $1, %rax
movq $1, %rdi
movq $5, %rdx
syscall
xor %rdi,%rdi
mov $60, %rax
.cfi_def_cfa 7, 8
syscall
.cfi_endproc ##close of .cfi_startproc
$ gcc -g -nostdlib -o hi hi.s && ./hi
Hello
Got it!
Now trying to figure out what the CFA is:
http://dwarfstd.org/doc/DWARF4.pdf
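Section 6.4
An area of memory that is allocated on a stack is called a "call frame."
The call frame is identified by an address on the stack. We refer to
this address as the Canonical Frame Address or CFA. Typically,
the CFA is defined to be the value of the stack pointer at the call
site in the previous frame (which may be different from its value on
entry to the current frame).
So .cfi_def_cfa_offset, .cfi_offset, and .cfi_def_cfa_register all describe how to compute the CFA and where saved registers live as the stack changes. They only emit unwind information, not instructions.
But this program doesn't use the stack at all, so they might as well be deleted.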
$ cat hi.s
.file "hi.c" ##tells 'as' that we are about to start a new logical file
.section .rodata ##assembles the following code into section '.rodata'
.LC0: ##.LC0, .LFB0, .LFE0 are just local labels; symbols that
## are guaranteed to be unique over the source code
## that allow the compiler to use names/simple notation
## to reference sections of code
##But here, only .LC0 is actually referenced in the code
.string "Hello"
.text
.globl _start
_start:
.cfi_startproc ##used at the beginning of each function that should have an
##entry in .eh_frame. It initializes some internal data
##structures. Don't forget to close by .cfi_endproc
leaq .LC0(%rip), %rsi
movq $1, %rax
movq $1, %rdi
movq $5, %rdx
syscall
xor %rdi,%rdi
mov $60, %rax
syscall
.cfi_endproc ##close of .cfi_startproc
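.cfi_startproc:
is used at the beginning of each function that should have an entry in
.eh_frame
What is .eh_frame?
"When using languages that support exceptions, such as C++, additional information must be provided to the runtime environment that describes the call frames that must be unwound during the processing of an exception. This information is contained in the special sections .eh_frame and .eh_framehdr."
No exception handling is needed here and I'm not using C++, so the .cfi_startproc/.cfi_endproc pair can go too.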
$ cat hi.s
.section .rodata
.LC0:
.string "Hello"
.text
.globl _start
_start:
leaq .LC0(%rip), %rsi
movq $1, %rax
movq $1, %rdi
movq $5, %rdx
syscall
xor %rdi,%rdi
mov $60, %rax
syscall