A better way to find a substring in a string in assembly code (MASM)? Answers

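【Question title】: A better form to find a Substring in a string in assembly code (MASM)?
【Posted】: 2021-10-04 17:46:53
【Question description】:

I wrote this code using knowledge I gathered from various sites. I think there is a more optimized way to do it, without pushing and popping registers on the stack, but I don't know how. Here is my code: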

comparing proc
    MOV CX, SIZEOF vec2 ;The size of the substring
    DEC CX
    MOV DX, SIZEOF vec1 ; The size of the String
    LEA SI, vec1        ; The String
    LEA DI, vec2        ; The substring

FIND_FIRST:        
    MOV AL, [SI];   storing the ascii value of current character of mainstring 
    MOV AH, [DI];   storing the ascii value of current character of substring
    CMP AL,AH;      comparing both character
    JE FITTING;      if we find it we try to find the whole substring
    JNE NEXT

NEXT:
    INC SI; We go to the next char
    DEC DX; And the size of the string decreased
    JE N_FINDED
    JMP FIND_FIRST
FITTING:
    CLD
    PUSH CX ; I push this register because in the instruction REPE CMPSB
    PUSH SI ; They change.
    PUSH DI
    REPE CMPSB
    JNE N_FITTING
    JE FINDED
N_FITTING:
    POP DI
    POP SI
    POP CX
    JMP NEXT ; if the complete substring doesn't fit we go to the next char
FINDED:
    POP DI
    POP SI
    POP CX
    MOV AL, 0001H;  substring found
    JMP RETURN


N_FINDED:    
    MOV AL, 0000H;  substring not found

RETURN:
    ret 
comparing endp

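【Question discussion】:

For what it's worth, the past participle of "find" is "found".
In terms of micro-optimization, you could hoist the cld and the push/pop out of the loop, and save/restore only once for the whole function. Also, scanning for a matching first byte before starting a slow rep cmpsb makes a lot of sense. (In fact, on modern CPUs, rep cmps and rep scas are not optimized with fast-strings microcode, only rep stos and rep movs are, so it is usually best to avoid them entirely; see this example of older GCC versions inlining rep cmpsb at -O1, which is slow for large strlen, especially compared to SSE2 SIMD.)
In terms of algorithmic optimization, for a long substring ("needle") there are algorithms that do better than treating every byte of the "haystack" as a possible starting point. A famous one is Boyer-Moore: en.wikipedia.org/wiki/…. Or, without really changing the brute-force strategy, use cmp [si], ax or something similar to check 2 matching bytes at once, doing overlapping unaligned word compares. Are you optimizing for an actual 8086, or for a modern x86 in 16-bit mode? Or for something in between, like a 386 or a Pentium?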
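To make the "check 2 matching bytes at once" suggestion concrete, here is a minimal sketch in C rather than MASM, so that it can be compiled and tested on a current machine. The function name `find_word_prefilter` and its signature are hypothetical and do not come from the question or the comments. It keeps the brute-force strategy and only adds a 2-byte prefilter before the full compare, roughly what `cmp [si], ax` would do in the assembly version.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the original post): returns the offset of
   the first match of needle in hay, or -1 if there is none.
   An empty needle is treated as "not found", as in the answer's code. */
static long find_word_prefilter(const char *hay, size_t hay_len,
                                const char *needle, size_t needle_len)
{
    assert(hay != NULL && needle != NULL);

    if (needle_len == 0 || needle_len > hay_len)
        return -1;

    if (needle_len == 1) {
        /* A single byte cannot use the 2-byte prefilter. */
        const char *p = memchr(hay, needle[0], hay_len);
        return p ? (long)(p - hay) : -1;
    }

    uint16_t want;
    memcpy(&want, needle, 2);               /* first two needle bytes */

    size_t tries = hay_len - needle_len + 1; /* number of start positions */
    for (size_t i = 0; i < tries; i++) {
        uint16_t got;
        memcpy(&got, hay + i, 2);           /* overlapping, unaligned-safe load */
        if (got == want &&
            memcmp(hay + i + 2, needle + 2, needle_len - 2) == 0)
            return (long)i;
    }
    return -1;
}
```

Whether this pays off on a given CPU is an open question; the comment above only suggests it as something to try.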

Tags: string assembly masm strstr

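【Solution 1】:

If the substring happens to have more than one character (which is very likely), your code will start comparing bytes beyond the end of the string being searched.
With the string referenced by DI and its length in DX, and the substring referenced by SI and its length in CX, you first need to make sure that neither string is empty, and then you need to limit the number of possible finds. The next 4 lines of code do exactly that: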

    jcxz NotFound   ; Substring is empty
    sub  dx, cx
    jb   NotFound   ; Substring is longer than String
                    ; Happens also when String is empty
    inc  dx

Take the string "overflow" (DX=8) and the substring "basic" (CX=5) as an example:

sub  dx, cx    ; 8 - 5 = 3
inc  dx        ; 3 + 1 = 4  Number of possible finds is 4

overflow

basic       possible find number 1
 basic      possible find number 2
  basic     possible find number 3
   basic    possible find number 4
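
The same bound can be written down as a small C sketch, which may be easier to check by hand than the flag-based assembly. The helper name `candidate_positions` is hypothetical; each branch is annotated with the instruction from the 4-line snippet that it is meant to mirror.

```c
#include <stddef.h>

/* Hypothetical helper mirroring the jcxz / sub / jb / inc sequence.
   Returns how many start positions need to be tried (0 = NotFound). */
static size_t candidate_positions(size_t str_len, size_t sub_len)
{
    if (sub_len == 0)
        return 0;                   /* jcxz NotFound: substring is empty */
    if (sub_len > str_len)
        return 0;                   /* sub dx, cx / jb NotFound          */
    return str_len - sub_len + 1;   /* sub dx, cx / inc dx               */
}
```

For "overflow" and "basic" this gives 8 - 5 + 1 = 4, matching the four alignments drawn above.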

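You can write your procedure without having to preserve those registers on the stack (or anywhere else) all the time. Just introduce another register so that you don't have to clobber the CX, SI, and DI registers: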

    jcxz NotFound
    sub  dx, cx
    jb   NotFound
    inc  dx

    mov  al, [si]       ; Permanent load of first char of the Substring
FirstCharLoop:
    cmp  [di], al
    je   FirstCharMatch
NextFirstChar:
    inc  di
    dec  dx             ; More tries?
    jnz  FirstCharLoop  ; Yes
NotFound:
    xor  ax, ax
    ret

FirstCharMatch:
    mov  bx, cx
    dec  bx
    jz   Found          ; Substring had only 1 character
OtherCharsLoop:
    mov  ah, [si+bx]
    cmp  [di+bx], ah
    jne  NextFirstChar
    dec  bx
    jnz  OtherCharsLoop
Found:
    mov  ax, 1
    ret

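Note that this code no longer compares the first character a second time, as the repe cmpsb in your program did.
AX holds the result, and the only registers that get clobbered (and that you might want to preserve) are BX, DX, and DI.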
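
For readers who want to test the control flow outside a 16-bit environment, here is a rough C model of the routine above. It is a sketch, not a translation the answer provides: the function name `contains` and its parameters are hypothetical, with `str`/`str_len` standing in for DI/DX, `sub`/`sub_len` for SI/CX, and the index `i` for BX.

```c
#include <stddef.h>

/* Hypothetical C model of the answer's routine.
   Returns 1 if sub occurs in str, else 0 (the value left in AX). */
static int contains(const char *str, size_t str_len,
                    const char *sub, size_t sub_len)
{
    if (sub_len == 0 || sub_len > str_len)
        return 0;                            /* jcxz / jb NotFound */

    size_t tries = str_len - sub_len + 1;    /* sub dx, cx / inc dx */
    char first = sub[0];                     /* mov al, [si]        */

    for (; tries != 0; tries--, str++) {     /* inc di / dec dx     */
        if (*str != first)
            continue;                        /* first char differs  */

        /* Compare the remaining characters from the last one down to
           index 1; index 0 is already known to match. */
        size_t i = sub_len - 1;              /* mov bx, cx / dec bx */
        while (i != 0 && str[i] == sub[i])
            i--;
        if (i == 0)
            return 1;                        /* Found               */
    }
    return 0;                                /* NotFound            */
}
```

As in the assembly, the inner loop walks backward from the last character and never re-checks the first one.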

【Discussion】: