© Jo Van Hoey 2019
J. Van HoeyBeginning x64 Assembly Programminghttps://doi.org/10.1007/978-1-4842-5076-1_31

31. Search for a Character

Jo Van Hoey1 
(1)
Hamme, Belgium
 

In this chapter, we will start using the control byte to help us find a specific character in a string.

Determining the Length of a String

In the first example, we will determine the length of a string by looking for a terminating 0.

Listing 31-1 shows the code.
; sse_string_length.asm
extern printf
section .data
;template            0123456789abcdef0123456789abcdef0123456789abcd  e
;template            1234567890123456789012345678901234567890123456  7
      string1 db    "The quick brown fox jumps over the lazy river.",0
      fmt1 db       "This is our string: %s ",10,0
      fmt2 db       "Our string is %d characters long.",10,0
section .bss
section .text
      global main
main:
push  rbp
mov   rbp,rsp
      mov   rdi, fmt1
      mov   rsi, string1
      xor   rax,rax
      call  printf
      mov   rdi, string1
      call  pstrlen
      mov   rdi, fmt2
      mov   rsi, rax
      xor   rax,rax
      call  printf
leave
ret
;function to compute string length-------------------------
pstrlen:
push  rbp
mov   rbp,rsp
      mov   rax,    -16         ; avoid changing later
      pxor  xmm0,  xmm0         ; 0 (end of string)
.not_found:
      add          rax, 16      ; avoid changing ZF later
                                ; after pcmpistri
      pcmpistri    xmm0, [rdi + rax], 00001000b     ;'equal each'
      jnz          .not_found   ; 0 found?
      add          rax, rcx     ; rcx contains the index of the 0
      inc          rax          ; correct for index 0 at start
leave
ret
Listing 31-1

sse_string_length.asm

At the beginning of the program, we added two templates in comments to make the character counting easier for us. One template uses decimal numbering, starting at 1, and the other template uses hexadecimal numbering, starting at index 0.
;template     1234567890123456789012345678901234567890123456  7
;template     0123456789abcdef0123456789abcdef0123456789abcd  e
string1 db   "The quick brown fox jumps over the lazy river.",0

First, as usual, we print the strings. Then we call the custom-built search function pstrlen. Our function pstrlen scans for the first occurrence of a zero byte. The instruction pcmpistri analyzes blocks of 16 bytes at a time; we use rax as a block counter. If pcmpistri detects a zero byte in the current block, ZF will be set and used to decide whether to jump. We have to avoid that incrementing rax will impact the ZF flag just before the jump is evaluated, so we have to increment the ZF flag before pcmpistri. That is why we start with -16 in rax; now we can increase rax before using pcmpistri. Note the pxor instruction ; it is the logical or instruction for xmm registers. SIMD has its own logical instructions!

The immediate control byte contains 00001000, which means the following:
  • 00 Packed unsigned bytes

  • 10 Equal each

  • 00 Positive Polarity

  • 0 Least significant index

  • 0 Reserved

You might expect that we use “equal any” to find any 0. But instead, we are using “equal each”! Why is that?

You have to know that pcmpistri initializes rcx to contain the value 16, which is the number of bytes in a block. If a matching byte is found, pcmpistri will copy the index of the matching byte in rcx. If there is no match found, rcx will contain 16.

Look in the Intel manuals, specifically, in Volume 2B. Section 4.1.6, “Valid/Invalid Override of Comparisons,” explains what happens when a block has “invalid” bytes, or bytes past the end of a string.

We can use this table to interpret our situation:

xmm0

Memory

Equal any

Equal each

Invalid

Invalid

Force false

Force true

Invalid

Valid

Force false

Force false

We have xmm0 invalid because we initialized it to contain 0 bytes. When we have a 16-byte block containing a 0 byte, in the case of “equal any,” pcmpistri detects that one of the 16 bytes contains 0. At that moment, we have xmm0 invalid and memory invalid. However, pcmpistri is designed to “force false” in the case of “equal any.” So, pcmpistri thinks there is no match and returns 16 in rcx, so the calculated string length will not be correct.

But when we use “equal each,” xmm0 is invalid like before, and as soon as pcmpistri reads the terminating 0 byte in the block, it is designed to “force true.” The index of the 0 byte is recorded in ecx. And that value in ecx can be used to correctly calculate the end of the string.

One caveat: the program reads in blocks of 16 bytes. That is okay as long as the place where the data is found is within a memory space allocated to the program. If it tries reading beyond the allowed memory border, the program will crash. You can avoid this by keeping track of where you are in the memory page (in most cases, pages are chunks of 4K bytes), and if you come close to the page border, start reading byte per byte. That way you will never accidentally try to cross over from an allowed memory page to a memory page of another process. We did not implement this feature to complicate the explanation and the example program. But be warned that such a situation can happen.

Figure 31-1 shows the output. As you can see, the string length includes the terminating null.
../images/483996_1_En_31_Chapter/483996_1_En_31_Fig1_HTML.jpg
Figure 31-1

sse_string_length.asm output

Searching in Strings

Now that we know how to determine the length of a string, let’s do some searching in strings (see Listing 31-2).
; sse_string_search.asm
extern printf
section .data
;template     123456789012345678901234567890123456789012345  6
;template     0123456789abcdef0123456789abcdef0123456789abc  d
string1 db   "the quick brown fox jumps over the lazy river",0
string2      db    "e",0
fmt1         db    "This is our string: %s ",10,0
fmt2         db    "The first '%s' is at position %d.",10,0
fmt3         db    "The last '%s' is at position %d.",10,0
fmt4         db    "The character '%s' didn't show up!.",10,0
section .bss
section .text
      global main
main:
push  rbp
mov   rbp,rsp
      mov   rdi, fmt1
      mov   rsi, string1
      xor   rax,rax
      call  printf
; find the first occurrence  
      mov   rdi, string1
      mov   rsi, string2
      call  pstrscan_f
      cmp   rax,0
      je    no_show
      mov   rdi, fmt2
      mov   rsi, string2
      mov   rdx, rax
      xor   rax,rax
      call  printf
; find the last occurrence  
      mov   rdi, string1
      mov   rsi, string2
      call  pstrscan_l
      mov   rdi, fmt3
      mov   rsi, string2
      mov   rdx, rax
      xor   rax,rax
      call  printf
      jmp   exit
no_show:
      mov   rdi, fmt4
      mov   rsi, string2
      xor   rax, rax
      call  printf
exit:
leave
ret
;------ find the first occurrence ----------------------
pstrscan_f:
push  rbp
mov   rbp,rsp
      xor    rax, rax
      pxor   xmm0, xmm0
      pinsrb xmm0, [rsi],0
.block_loop:
      pcmpistri  xmm0, [rdi + rax], 00000000b
      jc    .found
      jz    .none
      add   rax, 16
      jmp   .block_loop
.found:
      add   rax, rcx            ; rcx contains the position of the char
      inc   rax                 ; start counting from 1 instead of 0
leave
ret
.none:
      xor   rax,rax             ; nothing found, return 0
leave
ret
;------ find the last occurrence ----------------------
pstrscan_l:
push  rbp
mov   rbp,rsp
push  rbx                 ; callee saved
push  r12                 ; callee saved
      xor    rax, rax
      pxor   xmm0, xmm0
      pinsrb xmm0, [rsi],0
      xor    r12,r12
.block_loop:
      pcmpistri  xmm0, [rdi + rax], 01000000b
      setz   bl
      jc     .found
      jz     .done
      add    rax, 16
      jmp    .block_loop
.found:
      mov    r12, rax
      add    r12, rcx    ; rcx contains the position of the char
      inc    r12
      cmp    bl,1
      je     .done
      add    rax,16
      jmp    .block_loop
pop r12                  ; callee saved
pop rbx                  ; callee saved
leave
ret
.done:
      mov    rax,r12
pop r12                  ; callee saved
pop rbx                  ; callee saved
leave
ret
Listing 31-2

sse_string_search.asm

At the beginning of the program, we added two templates in comments to make the character counting easier for us.

Here, string1 contains the string, and string2 contains the search argument. We will be searching for the first and last occurrences of the search argument. First, we print the strings; then we call the custom-built functions. We have separate functions for finding the first occurrence of the character and the last occurrence. The function pstrscan_f scans for the first occurrence of the search argument. The instruction pcmpistri treats blocks of 16 bytes at a time; we use rax as a block counter. We clear xmm0 with the pxor instruction. With pinsrb , we put the search argument in the low byte of xmm0 (byte 0). We use “equal any” to find the occurrences, and as soon as an occurrence is found, rcx indicates the index of the matching byte in the current 16-byte block. If no occurrence is found in the current block, the value 16 is put into rcx. With jc, we check if CF=1. If so, we find a match; rcx is added to rax, which contains the number of bytes already screened in previous blocks, and then rax is returned, corrected for the counting to start at 1 instead of 0.

If CF=0, we check with jz to see if we have reached the last block. pcmpistri sets ZF=1 when a null byte is detected, and rax is cleared, because no match was found. And the function returns with 0.

Of course, we did not do any error checking; if the string is not null terminated, you may get erroneous results. Try to delete the 0 at the end of the string and watch the result.

The function pstrscan_l scans for the last match of the search argument. This is more complicated than just looking for the first match and exiting. We have to read all 16-byte blocks and keep track of the last occurrence in a block. So even when we find an occurrence, we have to continue the loop until we find a terminating zero. To keep an eye on the terminating zero, we set register bl to 1 as soon as we detect the zero. The register r12 is used to record the index of the most recent match. See Figure 31-2.
../images/483996_1_En_31_Chapter/483996_1_En_31_Fig2_HTML.jpg
Figure 31-2

sse_string_search.asm output

Summary

In this chapter, you learned about the following:
  • Using pcmpistri to scan for characters and string length

  • Interpreting the outcome of pcmpistri with different control bytes

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.147.72.74