
In assembly programming, I understand how the sub-registers of EAX overlap:

EAX : 22 66 77 55
 AX :       77 55
 AH :       77
 AL :          55

But I don't really understand how it works when reading data from an array with pointer offsets:

 .data 
 arrayW WORD 1233h,2245h, 1176h

 ptr2 PWORD arrayW

 .code 
    mov esi, ptr2
    mov ax, [esi]
    mov ah, [esi + 1]
    mov ax, [esi + 2]
    mov eax, [esi + 2]

After `mov ax, [esi]`, the register shows EAX = 12331233. I thought EAX would be 00001233?

Also, after `mov ax, [esi + 2]`, the register shows EAX = 12334455. I don't understand how the register became 12334455.

Can someone please explain to me what all the values of the registers will be after the execution?

Peter Cordes
student001
  • Related: [Assembly language - Why are characters stored in register as little endian?](https://stackoverflow.com/q/48645360) – Peter Cordes Jun 09 '21 at 09:33

1 Answer


See Assembly programming memory Allocating EAX vs Ax, AH, AL for how AX, AH, and AL overlap, as subsets of EAX.

Writing a partial register (AX, AH, or AL) doesn't modify the rest of EAX. (In 64-bit mode, writing EAX does zero the upper half of RAX.)

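So `mov ax, [esi]` leaves the top 2 bytes of EAX unmodified and replaces the low 2 bytes with AX=0x1233. That means AH=0x12 and AL=0x33, and, starting from the EAX=0x22667755 in your diagram, EAX=0x22 66 12 33. If your debugger showed EAX=12331233 instead, the upper half must already have held 0x1233 before this instruction ran.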
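The masking that a partial-register write implies can be modelled in C. This is only a sketch: `write_ax`, `write_ah`, and `write_al` are hypothetical helper names invented for illustration, and a plain `uint32_t` stands in for EAX.

```c
#include <stdint.h>

/* Hypothetical helpers: each one models a write to part of a 32-bit EAX. */

/* Like `mov ax, value`: replace bits 0..15, keep bits 16..31. */
uint32_t write_ax(uint32_t eax, uint16_t value) {
    return (eax & 0xFFFF0000u) | value;
}

/* Like `mov ah, value`: replace bits 8..15 only. */
uint32_t write_ah(uint32_t eax, uint8_t value) {
    return (eax & 0xFFFF00FFu) | ((uint32_t)value << 8);
}

/* Like `mov al, value`: replace bits 0..7 only. */
uint32_t write_al(uint32_t eax, uint8_t value) {
    return (eax & 0xFFFFFF00u) | value;
}
```

With EAX starting at 0x22667755, `write_ax(0x22667755u, 0x1233)` gives 0x22661233: the top two bytes survive and only the low word changes.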


x86 stores multi-byte values in memory in little-endian order (least-significant byte at the lowest address), so `mov ah, [esi + 1]` loads 0x12 (the upper byte of 0x1233, the first word of the array).

This is assembly language. Everything is just bytes. It doesn't matter that they got there in that order because you used `WORD 1233h,2245h` instead of `BYTE 33h,12h, 45h, 22h`. The CPU doesn't know or care how the data was declared; it just loads 1 byte from `[esi+1]` and puts it in AH (which already had that value from `mov ax, [esi]`).
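To answer the "what will all the values be" part, here is a C sketch that replays the four loads against the array's bytes. The names `arrayW_bytes`, `load16`, `load32`, and `trace` are made up for this illustration, and the starting EAX is whatever the caller passes in (the question's diagram uses 0x22667755).

```c
#include <stdint.h>

/* The six bytes that `arrayW WORD 1233h, 2245h, 1176h` places in memory. */
static const uint8_t arrayW_bytes[6] = {0x33, 0x12, 0x45, 0x22, 0x76, 0x11};

/* Little-endian loads: the byte at the lowest address is least significant. */
static uint16_t load16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t load32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Replays the four instructions; out[i] is EAX after instruction i. */
void trace(uint32_t eax, uint32_t out[4]) {
    const uint8_t *esi = arrayW_bytes;               /* mov esi, ptr2       */

    eax = (eax & 0xFFFF0000u) | load16(esi);         /* mov ax, [esi]       */
    out[0] = eax;
    eax = (eax & 0xFFFF00FFu) | ((uint32_t)esi[1] << 8); /* mov ah, [esi + 1] */
    out[1] = eax;
    eax = (eax & 0xFFFF0000u) | load16(esi + 2);     /* mov ax, [esi + 2]   */
    out[2] = eax;
    eax = load32(esi + 2);                           /* mov eax, [esi + 2]  */
    out[3] = eax;
}
```

Starting from 0x22667755, this yields 0x22661233, 0x22661233 (AH was already 0x12), 0x22662245, and finally 0x11762245, because the 4-byte load at offset 2 picks up the second and third words together.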

Bytes in registers don't have an endianness, since they don't have addresses. A left shift by 1 always multiplies by 2, and a right shift by 1 always divides by 2 (an arithmetic right shift rounds towards -Infinity, unlike signed integer division, which truncates towards zero).
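The rounding difference is easy to check in C. This is a sketch with hypothetical helper names; `sar1` computes the floor explicitly rather than using `>>` on a negative value, because C leaves that shift implementation-defined.

```c
#include <stdint.h>

/* Like `shl reg, 1`: multiply by 2 (modulo 2^32). */
uint32_t shl1(uint32_t x) { return x << 1; }

/* Like `shr reg, 1`: unsigned divide by 2. */
uint32_t shr1(uint32_t x) { return x >> 1; }

/* What `sar reg, 1` computes: signed divide by 2, rounding towards -Infinity. */
int32_t sar1(int32_t x) {
    if (x >= 0)
        return x / 2;
    /* floor(x / 2) == -ceil(-x / 2); widen to avoid overflow on INT32_MIN. */
    return (int32_t)(-((-(int64_t)x + 1) / 2));
}

/* What `idiv` by 2 computes: C's `/` also truncates towards zero. */
int32_t div2_trunc(int32_t x) { return x / 2; }
```

For -7, `sar1` gives -4 while `div2_trunc` gives -3; for non-negative inputs the two agree.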


See also the tag wiki for more FAQs, and links to docs and guides.

You can always put this code into a program and run it in a debugger to watch register values change as you single-step through it. If you're not sure what happens, do that.


BTW, `mov esi, ptr2` is silly. `mov esi, offset arrayW` would do the same thing without needing to load a pointer from data memory.

Peter Cordes
  • 'mov ah, [esi+1]' after the execution does the register's value change? Because it is WORD so [esi+1] doesn't point to anything right? [esi+2] points to the second element? – student001 Sep 26 '16 at 01:55
  • @student001: this is assembly language. Everything is just bytes. It doesn't matter that they got there in that order because you used `WORD 1233h,2245h` instead of `BYTE 33h,12h, 45h, 22h`. The CPU doesn't know or care about any "meaning" for the instruction, it just loads 1 byte from `[esi+1]`. – Peter Cordes Sep 26 '16 at 02:02
  • I'm sorry I'm new to this. But I just want to make sure I understand. So if I use `DWORD 12344h` = `BYTE 44h, 23h, 01h` . Is that right? And one more, `BYTE 1,2,3,4` = `WORD 4321h` ? – student001 Sep 26 '16 at 02:10
  • @student001: yes, but you left out the 4th byte (`00h`). a DWORD is always 4 bytes, and doesn't depend on the size of the constant. (It stands for DoubleWord). https://en.wikipedia.org/wiki/Endianness has some pictures. – Peter Cordes Sep 26 '16 at 02:13