© Jo Van Hoey 2019
J. Van Hoey, Beginning x64 Assembly Programming, https://doi.org/10.1007/978-1-4842-5076-1_32

32. Compare Strings

Jo Van Hoey
Hamme, Belgium

In the previous chapter, we used strings with implicit lengths, which means that these strings are terminated by a null byte. In this chapter, we will compare strings with implicit lengths and strings with explicit lengths.

Implicit Length

Instead of matching characters, we will look for characters that differ. Listing 32-1 shows the example code we will discuss.
; sse_string2_imp.asm
; compare strings implicit length
extern printf
section .data
       string1    db    "the quick brown fox jumps over the lazy"
                  db    " river",10,0
       string2    db    "the quick brown fox jumps over the lazy"
                  db    " river",10,0
       string3    db    "the quick brown fox jumps over the lazy"
                  db    " dog",10,0
       fmt1   db "Strings 1 and 2 are equal.",10,0
       fmt11  db "Strings 1 and 2 differ at position %i.",10,0
       fmt2   db "Strings 2 and 3 are equal.",10,0
       fmt22  db "Strings 2 and 3 differ at position %i.",10,0
section .bss
section .text
       global main
main:
push  rbp
mov   rbp,rsp
;first print the strings
      mov   rdi, string1
      xor   rax,rax
      call  printf
      mov   rdi, string2
      xor   rax,rax
      call  printf
      mov   rdi, string3
      xor   rax,rax
      call  printf
; compare string 1 and 2
      mov   rdi, string1
      mov   rsi, string2
      call  pstrcmp
      mov   rdi,fmt1
      cmp   rax,0
      je    eql1          ;the strings are equal
      mov   rdi,fmt11     ;the strings are unequal
 eql1:
      mov   rsi, rax
      xor   rax,rax
      call  printf
 ; compare string 2 and 3
      mov   rdi, string2
      mov   rsi, string3
      call  pstrcmp
      mov   rdi,fmt2
      cmp   rax,0
      je    eql2          ;the strings are equal
      mov   rdi,fmt22     ;the strings are unequal
 eql2:
      mov   rsi, rax
      xor   rax,rax
      call  printf
; exit
leave
ret
;string compare----------------------------------------------
pstrcmp:
push  rbp
mov   rbp,rsp
      push   rbx                 ; rbx is callee-saved, so preserve it
      xor    rax, rax            ; return value, 0 means equal
      xor    rbx, rbx            ; offset of the current 16-byte block
.loop: movdqu    xmm1, [rdi + rbx]
      pcmpistri  xmm1, [rsi + rbx], 0x18 ; equal each | neg polarity
      jc         .differ
      jz         .equal
      add        rbx, 16
      jmp        .loop
.differ:
      mov rax,rbx
      add rax,rcx    ;the position of the differing character
      inc rax        ;because the index starts at 0
.equal:
      pop    rbx     ; restore the caller's rbx
leave
ret
Listing 32-1

sse_string2_imp.asm

As usual, we first print the strings; we then call a function, pstrcmp, to compare the strings. The essential information is in the function pstrcmp. The control byte is 0x18, or 00011000 in binary. Reading from right to left: bits 1:0 (00) select packed unsigned bytes, bits 3:2 (10) select equal each, bits 5:4 (01) select negative polarity, and bit 6 (0) makes ecx contain the index of the first (least significant) set bit in IntRes2. The instruction pcmpistri makes use of the flags; you can find the following in the Intel manuals:
  • CFlag: Reset if IntRes2 is equal to zero; set otherwise.

  • ZFlag: Set if any byte/word of xmm2/mem128 is null; reset otherwise.

  • SFlag: Set if any byte/word of xmm1 is null; reset otherwise.

  • OFlag: IntRes2[0].

  • AFlag: Reset.

  • PFlag: Reset.
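
The bit fields of the control byte described before the flag list can be checked with a few shifts and masks. The following is a minimal sketch in C, not part of the original listing; the names ctrl_fields and decode_ctrl are invented for this illustration, and the field meanings in the comments follow the encoding given in the Intel manuals.

```c
/* Fields of the imm8 control byte of pcmpistri/pcmpestri.
   The struct and function names are hypothetical helpers. */
struct ctrl_fields {
    unsigned format;    /* bits 1:0  0 = unsigned bytes, 1 = unsigned words,
                                     2 = signed bytes,   3 = signed words    */
    unsigned aggregate; /* bits 3:2  0 = equal any,  1 = ranges,
                                     2 = equal each, 3 = equal ordered       */
    unsigned polarity;  /* bits 5:4  0 = positive,   1 = negative,
                                     2 = masked positive, 3 = masked negative */
    unsigned msb_index; /* bit 6     0 = least significant index in ecx,
                                     1 = most significant index in ecx       */
};

/* Split a control byte into its four fields. */
static struct ctrl_fields decode_ctrl(unsigned char imm8)
{
    struct ctrl_fields f;
    f.format    = imm8 & 0x3u;
    f.aggregate = (imm8 >> 2) & 0x3u;
    f.polarity  = (imm8 >> 4) & 0x3u;
    f.msb_index = (imm8 >> 6) & 0x1u;
    return f;
}
```

Calling decode_ctrl(0x18) yields format 0 (unsigned bytes), aggregate 2 (equal each), polarity 1 (negative), and msb_index 0 (least significant index), which matches the reading of 00011000 from right to left.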

In the example, pcmpistri writes a 1 into the corresponding bit of IntRes1 for every pair of bytes that match and a 0 for every pair that differ. IntRes2 is then formed by applying negative polarity to IntRes1, that is, by inverting its bits. IntRes2 therefore contains a 1 at the index of each differing byte, so it is nonzero whenever the blocks differ, and CF is set to 1. The loop is then left, and pstrcmp returns the position of the first differing character in rax, computed as the block offset in rbx plus the index in rcx plus 1. If CF is not set but pcmpistri detects the terminating null (ZF is set), the function returns with 0 in rax.
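
To make the roles of IntRes1, IntRes2, CF, and ZF concrete, here is a minimal scalar sketch in C that models what one pcmpistri step with control byte 0x18 computes and then repeats the loop of pstrcmp on top of it. It is an illustration under stated assumptions, not the SSE code itself: the names step_result, cmp_block_implicit, and model_pstrcmp are invented for this sketch, and unlike the instruction, the model stops reading at the terminating null instead of loading a full 16 bytes.

```c
#include <stddef.h>

/* Result of one modeled pcmpistri step with control byte 0x18. */
struct step_result {
    int cf;     /* 1 if IntRes2 is nonzero, i.e. a difference was found */
    int zf;     /* 1 if the second operand contains a null byte         */
    int index;  /* lowest set bit of IntRes2, 16 if there is none       */
};

/* Model of "pcmpistri xmm1, xmm2/m128, 0x18": a plays xmm1, b the memory operand. */
static struct step_result cmp_block_implicit(const unsigned char *a,
                                             const unsigned char *b)
{
    struct step_result r = {0, 0, 16};
    unsigned intres2 = 0;
    int a_valid = 1, b_valid = 1;

    for (int i = 0; i < 16; i++) {
        /* A byte is valid only if no null byte occurs at or before it. */
        if (a_valid && a[i] == 0) a_valid = 0;
        if (b_valid && b[i] == 0) b_valid = 0;

        int intres1_bit;
        if (a_valid && b_valid)
            intres1_bit = (a[i] == b[i]);         /* equal each on valid bytes */
        else
            intres1_bit = (!a_valid && !b_valid); /* both invalid: 1, one invalid: 0 */

        if (!intres1_bit)                         /* negative polarity inverts */
            intres2 |= 1u << i;
    }
    r.cf = (intres2 != 0);
    r.zf = !b_valid;
    for (int i = 0; i < 16; i++)
        if (intres2 & (1u << i)) { r.index = i; break; }
    return r;
}

/* Returns 0 if the strings are equal, otherwise the position (counting
   from 1) of the first differing character, like pstrcmp in the listing. */
static size_t model_pstrcmp(const char *s1, const char *s2)
{
    size_t offset = 0;                            /* plays the role of rbx */
    for (;;) {
        struct step_result r = cmp_block_implicit(
            (const unsigned char *)s1 + offset,
            (const unsigned char *)s2 + offset);
        if (r.cf) return offset + (size_t)r.index + 1;  /* jc .differ */
        if (r.zf) return 0;                             /* jz .equal  */
        offset += 16;
    }
}
```

With the strings of Listing 32-1, model_pstrcmp returns 0 for string1 and string2 and 41 for string2 and string3, because the first differing characters (r and d) sit at position 41 when counting from 1.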

Figure 32-1 shows the output.
Figure 32-1

sse_string2_imp.asm output

Explicit Length

Most of the time we use strings with implicit lengths, but Listing 32-2 shows an example of strings with explicit lengths.
; sse_string3_exp.asm
; compare strings explicit length
extern printf
section .data
      string1      db      "the quick brown fox jumps over the "
                   db      "lazy river"
      string1Len equ $ - string1
      string2      db      "the quick brown fox jumps over the "
                   db      "lazy river"
      string2Len equ $ - string2
      dummy  db "confuse the world"
      string3      db      "the quick brown fox jumps over the "
                   db      "lazy dog"
      string3Len equ $ - string3
      fmt1  db "Strings 1 and 2 are equal.",10,0
      fmt11 db "Strings 1 and 2 differ at position %i.",10,0
      fmt2  db "Strings 2 and 3 are equal.",10,0
      fmt22 db "Strings 2 and 3 differ at position %i.",10,0
section .bss
        buffer resb 64
section .text
      global main
main:
push  rbp
mov   rbp,rsp
; compare string 1 and 2
    mov      rdi, string1
    mov      rsi, string2
    mov      rdx, string1Len
    mov      rcx, string2Len
    call     pstrcmp
    push     rax    ;push result on stack for later use
    sub      rsp,8  ;keep the stack 16-byte aligned for printf
; print the string1 and 2 and the result
;-------------------------------------------------------------
; first build the string with newline and terminating 0
; string1
    mov      rsi,string1
    mov      rdi,buffer
    mov      rcx,string1Len
    rep      movsb
    mov      byte[rdi],10 ; add NL to buffer
    inc      rdi          ; add terminating 0 to buffer
    mov      byte[rdi],0
;print
    mov      rdi, buffer
    xor      rax,rax
    call     printf
; string2
    mov      rsi,string2
    mov      rdi,buffer
    mov      rcx,string2Len
    rep      movsb
    mov      byte[rdi],10 ; add NL to buffer
    inc      rdi          ; add terminating 0 to buffer
    mov      byte[rdi],0
;print
    mov      rdi, buffer
    xor      rax,rax
    call     printf
;-------------------------------------------------------------
; now print the result of the comparison
    add      rsp,8   ;drop the alignment padding
    pop      rax     ;recall the return value
    mov      rdi,fmt1
    cmp      rax,0
    je       eql1
    mov      rdi,fmt11
 eql1:
    mov      rsi, rax
    xor      rax,rax
    call     printf
;-------------------------------------------------------------
;-------------------------------------------------------------
; compare string 2 and 3
    mov      rdi, string2
    mov      rsi, string3
    mov      rdx, string2Len
    mov      rcx, string3Len
    call     pstrcmp
    push     rax    ;push result on stack for later use
    sub      rsp,8  ;keep the stack 16-byte aligned for printf
; print the string3 and the result
;-------------------------------------------------------------
; first build the string with newline and terminating 0
; string3
    mov      rsi,string3
    mov      rdi,buffer
    mov      rcx,string3Len
    rep      movsb
    mov      byte[rdi],10 ; add NL to buffer
    inc      rdi          ; add terminating 0 to buffer
    mov      byte[rdi],0
;print
    mov      rdi, buffer
    xor      rax,rax
    call     printf
;-------------------------------------------------------------
; now print the result of the comparison
    add      rsp,8                ; drop the alignment padding
    pop      rax                  ; recall the return value
    mov      rdi,fmt2
    cmp      rax,0
    je       eql2
    mov      rdi,fmt22
eql2:
    mov      rsi, rax
    xor      rax,rax
    call     printf
; exit
leave
ret
;-------------------------------------------------------------
pstrcmp:
push   rbp
mov    rbp,rsp
       push    rbx             ;rbx is callee-saved, so preserve it
       xor     rbx, rbx        ;rbx is the offset of the current block
       mov     rax,rdx         ;rax contains length of 1st string
       mov     rdx,rcx         ;rdx contains length of 2nd string
       xor     rcx,rcx         ;rcx as index
.loop:
       movdqu      xmm1, [rdi + rbx]
       pcmpestri xmm1, [rsi + rbx], 0x18 ; equal each|neg. polarity
       jc     .differ
       jz     .equal
       add    rbx, 16
       sub    rax,16
       sub    rdx,16
       jmp    .loop
.differ:
       mov    rax,rbx
       add    rax,rcx     ; rcx contains the differing position
       inc    rax         ; because the counter starts at 0
       jmp    .exit
.equal:
       xor    rax,rax
.exit:
       pop    rbx         ; restore the caller's rbx
leave
ret
Listing 32-2

sse_string3_exp.asm

As you can see, using explicit lengths can complicate things. Then why use them? Many communication protocols use explicit lengths, or your application may require 0s in your data. One way or another, we have to provide the length of the strings. In our case, we computed the lengths from the memory locations in section .data. However, printf expects zero-terminated strings. So, after we demonstrate how to compare strings with explicit lengths, we rebuild the strings in a buffer, add a newline and a terminating null in the buffer, and hand over the buffer to printf.
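
The buffer-building step can be sketched in C as follows. This is a minimal illustration rather than a translation of the listing: the function name make_printable is invented here, memcpy stands in for rep movsb, and the size check is an addition that the assembly code does not perform (it relies on the 64-byte buffer being large enough).

```c
#include <stddef.h>
#include <string.h>

/* Copy len bytes of an explicit-length string into buf, then append a
   newline and a terminating null so that printf can print it.
   Returns 0 on success, -1 if buf cannot hold len + 2 bytes. */
static int make_printable(char *buf, size_t bufsize,
                          const char *src, size_t len)
{
    if (bufsize < 2 || len > bufsize - 2)
        return -1;              /* assumption: reject instead of overflowing */
    memcpy(buf, src, len);      /* the job done by rep movsb in the listing  */
    buf[len] = '\n';            /* add NL to buffer                          */
    buf[len + 1] = '\0';        /* add terminating 0 to buffer               */
    return 0;
}
```

A caller would pass the string and its stored length, for example make_printable(buffer, sizeof buffer, string1, string1Len), and then hand buffer to printf.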

Now take a look at pstrcmp, the compare function. The length of the first string goes into rax, and the length of the second string goes into rdx. Then we start a loop: we load a 16-byte block of the first string into xmm1 and execute pcmpestri, with control byte 0x18 as before. Next, let's look at the flags; you can find the following in the Intel manuals:
  • CFlag: Reset if IntRes2 is equal to zero; set otherwise.

  • ZFlag: Set if absolute value of EDX is less than 16 (8); reset otherwise.

  • SFlag: Set if absolute value of EAX is less than 16 (8); reset otherwise.

  • OFlag: IntRes2[0].

  • AFlag: Reset.

  • PFlag: Reset.

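Note that pcmpestri and pcmpistri use ZF and SF differently. Instead of signaling a terminating null, ZF and SF now signal that the length in edx or eax is less than 16. On every pass through the loop we subtract 16 from rax and rdx. Once the remaining length of the second string drops below 16, ZF is set and the loop ends, unless a difference has already set CF. If only the first string runs out, the bytes beyond its end count as differing, so CF ends the loop in that case.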
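
The way the two length registers drive the loop can be modeled in scalar C. The sketch below is an illustration under stated assumptions, not the SSE code: the names estep_result, saturate16, cmp_block_explicit, and model_pstrcmp_len are invented here, and the model only reads bytes that lie inside both strings, whereas the instruction always loads 16 bytes.

```c
#include <stddef.h>

/* Flags and index produced by one modeled pcmpestri step (control 0x18). */
struct estep_result {
    int cf;     /* 1 if IntRes2 is nonzero: the blocks differ            */
    int zf;     /* 1 if the absolute value of the edx length is below 16 */
    int sf;     /* 1 if the absolute value of the eax length is below 16 */
    int index;  /* lowest set bit of IntRes2, 16 if there is none        */
};

/* The instruction uses the absolute value of each length, capped at 16. */
static int saturate16(int len)
{
    if (len < 0) len = -len;
    return len > 16 ? 16 : len;
}

/* a and eax describe the first operand, b and edx the second. */
static struct estep_result cmp_block_explicit(const unsigned char *a, int eax,
                                              const unsigned char *b, int edx)
{
    struct estep_result r = {0, 0, 0, 16};
    int va = saturate16(eax), vb = saturate16(edx);
    unsigned intres2 = 0;

    for (int i = 0; i < 16; i++) {
        int a_valid = i < va, b_valid = i < vb;
        /* both valid: compare; both invalid: 1; exactly one invalid: 0 */
        int intres1_bit = (a_valid && b_valid) ? (a[i] == b[i])
                                               : (!a_valid && !b_valid);
        if (!intres1_bit) intres2 |= 1u << i;     /* negative polarity */
    }
    r.cf = (intres2 != 0);
    r.zf = vb < 16;
    r.sf = va < 16;
    for (int i = 0; i < 16; i++)
        if (intres2 & (1u << i)) { r.index = i; break; }
    return r;
}

/* Returns 0 if equal, otherwise the position (counting from 1) of the
   first difference, following the loop of pstrcmp in the listing. */
static size_t model_pstrcmp_len(const char *s1, int len1,
                                const char *s2, int len2)
{
    size_t offset = 0;                            /* plays the role of rbx */
    for (;;) {
        struct estep_result r = cmp_block_explicit(
            (const unsigned char *)s1 + offset, len1,
            (const unsigned char *)s2 + offset, len2);
        if (r.cf) return offset + (size_t)r.index + 1;  /* jc .differ */
        if (r.zf) return 0;                             /* jz .equal  */
        offset += 16; len1 -= 16; len2 -= 16;
    }
}
```

Because the lengths are explicit, data containing zero bytes compares correctly: model_pstrcmp_len("a\0b", 3, "a\0c", 3) returns 3, where a null-terminated comparison would stop after the first byte.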

Figure 32-2 shows the output.
Figure 32-2

sse_string3_exp.asm output

Summary

In this chapter, you learned about the following:
  • Implicit and explicit string lengths

  • Negative polarity

  • Using flags
