How to divide floating-point numbers in x86 assembly? Answers

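【Question title】: How to divide floating-point number in x86 assembly?
【Posted】: 2012-01-10 14:23:42
【Question】:

When I try to write Heron's algorithm to compute the square root of the value in the ECX register, it doesn't work. The problem seems to be the floating-point division, because the result is an integer.

My algorithm: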

 sqrtecx:

MOV EDX, 10 ; loop count
MOV EAX, 5 ; x_0 in heron algorythm
MOV DWORD[EBP-100], ECX  ; save INPUT (ecx is input)    
MOV DWORD[EBP-104], EDX  ; save loop count
jmp     loop
MOV     ECX, EAX ; move  OUTPUT to ECX

loop:

MOV DWORD[EBP-104], EDX ; save loop count
xor edx, edx

MOV ECX, EAX
MOV     EAX, DWORD[EBP-100]
DIV ECX
ADD EAX, ECX
XOR EDX, EDX
mov ecx, 2
DIV ecx

MOV EDX, DWORD[EBP-104] ; load loop count
DEC EDX
JNZ loop

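【Question comments】:

By the way, there are square-root instructions for both FPU code and SSE code, so you don't even need this..
@harold, is there a square-root instruction in NASM assembly? It's not in my opcode table. Can you tell me which one it is?
FSQRT (D9 FA) for FPU code, SQRTSS (F3 0F 51 /r) for SSE, and SQRTSD (F2 0F 51 /r) for SSE2 (there are also versions that take 4 packed floats or 2 packed doubles). Here is a more complete reference: siyobik.info/main/reference
If I try 'mov ecx, 144 fcmov ecx fsqrt', it doesn't work either.
No, you can't load an integer onto the floating-point stack like that. BlackBear's answer has the solution.

Tags: assembly x86 floating-point sse x87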

【Answer 1】:

You need to use the floating-point instruction set to achieve your goal. Some instructions you may find useful are:

fild <int>  - loads an integer from memory into st0 (not an immediate)
faddp       - adds st0 to st1, then pops the register stack (i.e. the result ends up in st0)
fdivp       - divides st1 by st0, then pops the register stack (again, the result ends up in st0)
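To make the push/pop behaviour of these three instructions easier to picture, here is a toy model of the x87 register stack in plain C. It is only a sketch: the `X87Stack` type and the helper names are made up for illustration, and the model ignores everything a real FPU does beyond "push, combine st1 with st0, pop".

```c
/* Toy model of the x87 register stack. All names here are hypothetical. */
typedef struct {
    double regs[8];   /* the x87 has eight stack registers */
    int depth;        /* how many of them are currently in use */
} X87Stack;

/* st(i): i-th register counted from the top of the stack */
static double *st(X87Stack *fpu, int i) {
    return &fpu->regs[fpu->depth - 1 - i];
}

/* fild: convert an integer in memory to floating point and push it */
static void fild(X87Stack *fpu, const int *mem) {
    fpu->regs[fpu->depth++] = (double)*mem;
}

/* faddp: st1 = st1 + st0, then pop, so the sum becomes the new st0 */
static void faddp(X87Stack *fpu) {
    *st(fpu, 1) += *st(fpu, 0);
    fpu->depth--;
}

/* fdivp: st1 = st1 / st0, then pop, so the quotient becomes the new st0 */
static void fdivp(X87Stack *fpu) {
    *st(fpu, 1) /= *st(fpu, 0);
    fpu->depth--;
}

/* Computes 5 / (5 + 5) the way an x87 instruction sequence would. */
double five_over_ten(void) {
    X87Stack fpu = { {0}, 0 };
    int five = 5;
    fild(&fpu, &five);   /* st0 = 5                   */
    fild(&fpu, &five);   /* st0 = 5, st1 = 5          */
    fild(&fpu, &five);   /* st0 = 5, st1 = 5, st2 = 5 */
    faddp(&fpu);         /* st0 = 10, st1 = 5         */
    fdivp(&fpu);         /* st0 = 5 / 10              */
    return *st(&fpu, 0);
}
```

Note that after the two popping instructions only one value is left on the modelled stack, which is the state a caller would normally expect for a floating-point return value.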

Here is a short example snippet (VS2010 inline assembly):

#include <stdio.h>

int main(void)
{
    float res;

    __asm {
        push    5;                    // fild needs a memory operand, the trick is
        fild    dword ptr [esp];      // to use the stack as temporary storage
        fild    dword ptr [esp];      // now st0 and st1 both contain (float) 5
        add     esp, 4;               // release the temporary slot
        fadd    st(0), st(0);         // st0 = st0 + st0 = 10
        fdivp   st(1), st(0);         // st1 = st1 / st0 = 5 / 10, then pop: st0 = 0.5
        sub     esp, 4;               // again, make some room on the stack
        fstp    dword ptr [esp];      // store st0 into [esp] as a 32-bit float and pop it
        pop     eax;                  // get the bits of 0.5 off the stack; esp is balanced again
        mov     res, eax;             // move them into res (main's local var)
    }

    printf("res is %f\n", res);       // write the result (0.5)
    return 0;
}

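Edit:
As harold pointed out, there is also an instruction that computes the square root directly: fsqrt. Both its operand and its result are st0.

Edit #2:
I wasn't sure whether you can actually load an immediate value into st0, because my reference doesn't say so explicitly. So I wrote a small snippet to check, and this is the result: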

    float res = 5.0 * 3 - 1;
000313BE D9 05 A8 57 03 00    fld         dword ptr [__real@41600000 (357A8h)]  
000313C4 D9 5D F8             fstp        dword ptr [res]

These are the bytes at 357A8h (the disassembler shows them as if they were instructions):

__real@41600000:
000357A8 00 00                add         byte ptr [eax],al  
000357AA 60                   pushad  
000357AB 41                   inc         ecx

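So I have to conclude that, unfortunately, you have to keep numbers somewhere in main memory when loading and storing them. Of course, using the stack as I suggested above is not mandatory. You could just as well define some variables in the data segment or somewhere else.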
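As a quick cross-check of that dump: the four bytes 00 00 60 41, read as a little-endian dword, are 0x41600000, which is the single-precision encoding of 14.0 (that is, 5.0 * 3 - 1). A small C sketch of the decoding, assuming `float` is a 32-bit IEEE-754 type (the function name is just for illustration):

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float, the same thing
   "fld dword ptr [mem]" does with the bytes it finds in memory. */
float float_from_bits(uint32_t bits) {
    float value;
    memcpy(&value, &bits, sizeof value);  /* copy the bits, no numeric conversion */
    return value;
}
```

Calling `float_from_bits(0x41600000u)` gives 14.0f, so the compiler simply placed the precomputed constant in memory and loaded it from there.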

Edit #3:
Don't worry, assembly is a powerful beast ;) About your code:

mov     ecx, 169    ; the number with i wanna to root
sub     esp, 100    ; i move esp for free space
push    ecx         ; i save value of ecx
add     esp,4       ; push was move my ebp,then i must come back 
fld                 ; i load from esp, then i should load ecx 
fsqrt               ; i sqrt it
fst                 ; i save it on ebp+100 
add     esp,100     ; back esp to ebp

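You are missing the operands for fld and fst. Looking at your comments, I guess you wanted fld [esp] and fst [esp], but I don't understand why you talk about ebp. ebp is supposed to hold the beginning of the stack frame (where there is a lot of stuff we shouldn't mess with), while esp holds its end. We basically want to work at the end of the stack frame, because whatever lies beyond it is garbage nobody cares about.
After computing and saving the square root you should also do add esp, 4 at the end. That's because push ecx also does a sub esp, 4 behind the scenes to make room for the value you push, and you still need that room when saving the result. For the same reason you can drop the sub esp, 100 and add esp, 100, since push has already made the room for you.
One last "warning": integers and floating-point values are represented in very different ways, so when you know you have to work with both types, be careful about which instructions you choose. The code you proposed uses fld and fst, both of which operate on floating-point values, so the result you get won't be the one you expect. An example? 00 00 00 A9 is the byte representation of 169, but read as a floating-point number it is +2.3681944047089408e-0043 (for the nitpickers, that's actually a long double).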
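The difference between fld and fild on the same memory can be reproduced in C. This is only a sketch, assuming a 32-bit IEEE-754 `float`. The two function names are made up, and each one imitates one of the two load instructions:

```c
#include <stdint.h>
#include <string.h>

/* Roughly what "fld dword ptr [mem]" sees: the integer's bits taken as a float. */
float load_bits_as_float(int32_t n) {
    float value;
    memcpy(&value, &n, sizeof value);  /* same 32 bits, different interpretation */
    return value;
}

/* Roughly what "fild dword ptr [mem]" does: a real integer-to-float conversion. */
float load_integer(int32_t n) {
    return (float)n;
}
```

With n = 169, `load_integer` returns 169.0f, while `load_bits_as_float` returns the tiny denormal 169 * 2^-149, which is about 2.368e-43. Taking the square root of that obviously won't give 13.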
So, the final code is:

mov     ecx, 169;           // the number whose square root we want
push    ecx;                // save it on the stack
fild    dword ptr [esp];    // load it into st0 (converted from integer)
fsqrt;                      // find the square root
fistp   dword ptr [esp];    // save it back on the stack (as an integer) and pop st0
// or fstp dword ptr [esp] to save it as a float
pop     ecx;                // get it back in ecx

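【Comments】:

Now I understand. There are separate registers for integers and for floats. But if I have an integer in ECX and want the square root of ECX in st0 without using the stack, what should I do? Is it possible without the stack? I tried this: 'mov ecx, 144 mov st0,ecx fsqrt' but it doesn't work :(
Thanks for your next answer. I'm a beginner, so my level of understanding is low. After reading your answer I wrote this code: mov ecx, 169 ; the number with i wanna to root sub esp, 100 ; i move esp for free space push ecx ; i save value of ecx add esp,4 ; push was move my ebp,then i must come back fld ; i load from esp, then i should load ecx fsqrt ; i sqrt it fst ; i save it on ebp+100 add esp,100; back esp to ebp In my opinion it should work the way I described in the comments (after the ';'), but it doesn't..
There's a small problem with the code in the third edit: the result is left on the FPU stack.
@harold: you're right, I'll fix it. By the way, is that dangerous?
A bit.. The OP is a beginner, so he may forget to empty the stack, and then later functions may "suddenly" fail. Most compiler-generated code doesn't empty the stack before using it, because the typical ABI specifies that after a function returns the stack must be empty or contain a single item as the return value.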

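【Answer 2】:

DIV is for integer division - you need FDIV for floating point (or more likely FIDIV in this particular case, since it looks like you are starting from integer values).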
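To see what the integer DIV does to Heron's iteration x = (n / x + x) / 2, here is a sketch in C that runs the same loop once with integer division and once with floating-point division. The function names and parameters are made up for illustration. The starting value 5 and the 10 iterations mirror the question's code.

```c
/* Heron's iteration with integer division, like DIV in the question's loop:
   every quotient is truncated, so the result can only ever be an integer. */
unsigned heron_int(unsigned n, unsigned x0, int iterations) {
    unsigned x = x0;
    for (int i = 0; i < iterations && x != 0; i++) {  /* stop before dividing by 0 */
        x = (n / x + x) / 2;
    }
    return x;
}

/* The same iteration with floating-point division, like FDIV would give. */
double heron_double(double n, double x0, int iterations) {
    double x = x0;
    for (int i = 0; i < iterations; i++) {
        x = (n / x + x) / 2.0;
    }
    return x;
}
```

For a perfect square such as 169 both versions end up at 13, which can hide the problem. For n = 2 the integer version gets stuck at 1, while the floating-point version converges to roughly 1.41421.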

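【Comments】:

When I change DIV to FIDIV or FDIV, it doesn't work. Maybe I'm doing something wrong. Are you sure it works in NASM? FFTCount32.S:188: error: invalid combination of opcode and operands FFTCount32.S:192: error: invalid combination of opcode and operands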

【Answer 3】:

I'm not entirely sure what you actually want to do, so for now I'll assume you want the floating-point square root of an integer.

mov dword ptr[esp],ecx   ; can't load a GPR onto the FPU stack, so go through mem
fild dword ptr[esp]      ; read it back (as integer, converted to float)
fsqrt                    ; take the square root

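The first dword ptr may be optional, depending on your assembler.

After this code the result is on top of the FPU stack, in ST(0). I don't know what you want to do with it afterwards.. if you want to round it to an int and put it back in ecx, I'd suggest this: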

fistp dword ptr[esp]     ; again it can't go directly, it has to go through mem
mov ecx,dword ptr[esp]

I'll throw in the SSE2 way for good measure:

cvtsi2sd xmm0,ecx  ; convert int to double
sqrtsd xmm0,xmm0   ; take the square root
cvtsd2si ecx,xmm0  ; round back to int (cvttsd2si for truncate instead of round)

That's a bit easier.
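For anyone who wants to try the SSE2 route from C, the three instructions map onto compiler intrinsics from `<emmintrin.h>`. The sketch below assumes an x86 target with SSE2 enabled and the default rounding mode (round to nearest). The two function names are made up for illustration.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Integer square root, rounded to the nearest integer. */
int isqrt_round(int n) {
    __m128d x = _mm_cvtsi32_sd(_mm_setzero_pd(), n);  /* cvtsi2sd: int -> double   */
    x = _mm_sqrt_sd(x, x);                            /* sqrtsd: square root       */
    return _mm_cvtsd_si32(x);                         /* cvtsd2si: round to int    */
}

/* Same computation, but truncating toward zero instead of rounding. */
int isqrt_trunc(int n) {
    __m128d x = _mm_cvtsi32_sd(_mm_setzero_pd(), n);  /* cvtsi2sd                  */
    x = _mm_sqrt_sd(x, x);                            /* sqrtsd                    */
    return _mm_cvttsd_si32(x);                        /* cvttsd2si: truncate       */
}
```

The difference shows up for inputs that are not perfect squares: the square root of 8 is about 2.83, so `isqrt_round(8)` gives 3 while `isqrt_trunc(8)` gives 2.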

【Comments】: