I am learning 64-bit NASM. I assemble the .nasm file, which contains ONLY 64-bit registers, with the following command:

nasm -f elf64 HelloWorld.nasm -o HelloWorld.o

and link it with:

ld HelloWorld.o -o HelloWorld

The program runs correctly, and the file command even reports it as a 64-bit ELF. But when I disassemble the executable with objdump or gdb, registers that I wrote as 64-bit registers in the source show up as 32-bit registers (for example, rax in the source shows up as eax in the disassembly).

Why is this?

This happens on more than one computer, and it is a new problem: it did not behave this way before.

HelloWorld.nasm:

global _start

section .text

_start:
        mov rax, 1
        mov rdi, 1
        mov rsi, hello_world
        mov rdx, length
        syscall

        mov rax, 60
        mov rdi, 11
        syscall

section .data

        hello_world: db 'Hello World',0xa
        length: equ $-hello_world

Disassembled HelloWorld:

...
00000000004000b0 <_start>:
  4000b0:       b8 01 00 00 00          mov    eax,0x1
  4000b5:       bf 01 00 00 00          mov    edi,0x1
  4000ba:       48 be d8 00 60 00 00    movabs rsi,0x6000d8
  4000c1:       00 00 00
  4000c4:       ba 0c 00 00 00          mov    edx,0xc
  4000c9:       0f 05                   syscall
  4000cb:       b8 3c 00 00 00          mov    eax,0x3c
  4000d0:       bf 0b 00 00 00          mov    edi,0xb
  4000d5:       0f 05                   syscall
...
themustang
  • It's not **all** registers, only those where you mov values that fit into 32 bits, see http://stackoverflow.com/questions/11177137/why-do-most-x64-instructions-zero-the-upper-part-of-a-32-bit-register – Andreas Fester Oct 22 '14 at 10:27
  • Basically a duplicate of the more recent [Why NASM on Linux changes registers in x86\_64 assembly](https://stackoverflow.com/q/48596247) – Peter Cordes Apr 25 '23 at 03:43

1 Answer

Why does

...
mov rax, 1
mov rdi, 1
mov rsi, hello_world
...

get disassembled as

...
4000b0:       b8 01 00 00 00          mov    eax,0x1
4000b5:       bf 01 00 00 00          mov    edi,0x1
4000ba:       48 be d8 00 60 00 00    movabs rsi,0x6000d8
4000c1:       00 00 00
...

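Because the literal 0x1 fits into 32 bits, and writing the lower 32 bits of a 64-bit register through the corresponding E-register (eax for rax, edi for rdi, and so on) sets the upper 32 bits to 0. A mov to eax therefore leaves rax with exactly the same value as the 64-bit mov would, so the assembler can optimize the mov to the shorter 32-bit operation (5 bytes here instead of the 10 bytes of the movabs form).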

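Note that the address loaded into rsi is a symbol whose final value the assembler does not know (it is fixed at link time), so it might not fit into 32 bits. Hence the rsi load keeps its full 64-bit form.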
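The zero-extension rule described above can be illustrated with a small Python model of register writes. This is only a sketch: the function names (write_r64, write_r32, write_r16, same_result_as_64bit) are made up for this example and are not part of NASM or any library. It assumes the usual x86-64 behavior that a 32-bit write clears the upper half of the register, while a 16-bit write leaves the upper bits untouched.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def write_r64(old, imm):
    """Model of `mov r64, imm`: the whole 64-bit register is replaced."""
    return imm & MASK64


def write_r32(old, imm):
    """Model of `mov r32, imm32`: low 32 bits written, upper 32 bits zeroed."""
    return imm & MASK32


def write_r16(old, imm):
    """Model of `mov r16, imm16`: only the low 16 bits change (no zeroing)."""
    return (old & ~0xFFFF & MASK64) | (imm & 0xFFFF)


def same_result_as_64bit(imm):
    """True if the 32-bit write leaves the same value as the 64-bit write."""
    return write_r64(0, imm) == write_r32(0, imm)


old = 0xDEADBEEFCAFEBABE            # arbitrary stale register contents
print(hex(write_r64(old, 1)))       # 0x1
print(hex(write_r32(old, 1)))       # 0x1
print(hex(write_r16(old, 1)))       # 0xdeadbeefcafe0001
print(same_result_as_64bit(0xFFFFFFFF))   # True
print(same_result_as_64bit(0x100000000))  # False
print(same_result_as_64bit(-1))           # False
```

The last line shows why the shortcut only applies to values in the range 0 to 0xffffffff: a negative constant such as -1 needs all 64 bits set, which a zero-extending 32-bit write cannot produce.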

If you add the following instructions, you can see the effect very clearly:

mov rbx, 0x0ffffffff      ; still fits into 32 bits
mov rbx, 0x100000000      ; does not fit into 32 bits anymore

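This gets disassembled as follows (shown in objdump's default AT&T syntax, so the operands appear in source, destination order):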

 a: bb ff ff ff ff          mov    $0xffffffff,%ebx
 f: 48 bb 00 00 00 00 01    movabs $0x100000000,%rbx
16: 00 00 00 
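
The byte sequences above can be reproduced with a short Python sketch of the size decision. This is a simplified model, not NASM's actual algorithm: the helper name shortest_mov and the REG table are assumptions made for this example, only the eight legacy registers are covered, and the sign-extended 32-bit form (which an optimizing assembler can use for negative constants) is deliberately left out.

```python
import struct

# Hardware register numbers for rax..rdi (r8-r15 would need extra REX bits).
REG = {"rax": 0, "rcx": 1, "rdx": 2, "rbx": 3,
       "rsp": 4, "rbp": 5, "rsi": 6, "rdi": 7}


def shortest_mov(reg, value):
    """Encode `mov reg, value` for a 64-bit register and a constant in 0..2**64-1."""
    if not 0 <= value < (1 << 64):
        raise ValueError("this sketch only handles unsigned 64-bit constants")
    n = REG[reg]
    if value <= 0xFFFFFFFF:
        # mov r32, imm32: opcode B8+r, then a 4-byte little-endian immediate.
        # The CPU zero-extends the result into the full 64-bit register.
        return bytes([0xB8 + n]) + struct.pack("<I", value)
    # mov r64, imm64 (shown as movabs): REX.W prefix 48, opcode B8+r,
    # then an 8-byte little-endian immediate.
    return bytes([0x48, 0xB8 + n]) + struct.pack("<Q", value)


print(shortest_mov("rax", 1).hex(" "))            # b8 01 00 00 00
print(shortest_mov("rbx", 0xFFFFFFFF).hex(" "))   # bb ff ff ff ff
print(shortest_mov("rbx", 0x100000000).hex(" "))  # 48 bb 00 00 00 00 01 00 00 00
```

The printed bytes match the objdump listings shown earlier: five bytes when the constant fits into 32 bits, ten bytes otherwise.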

You can disable nasm optimization with -O0, in which case the instructions keep their long format:

nasm -O0 -f elf64 HelloWorld.asm 

Result:

14: 48 bb ff ff ff ff 00    movabs $0xffffffff,%rbx
1b: 00 00 00 
1e: 48 bb 00 00 00 00 01    movabs $0x100000000,%rbx
25: 00 00 00 
Andreas Fester
  • You can use `objdump -drwC -Mintel` to get disassembly that's closer to NASM syntax, like the OP did. See also [Why NASM on Linux changes registers in x86\_64 assembly](https://stackoverflow.com/q/48596247). The other option (instead of `nasm -O0`) is `mov rax, strict dword 60` to use a sign-extended 32-bit immediate with 64-bit operand-size, or `mov rax, strict qword 1` to force a 64-bit immediate. `nasm -O0` makes very inefficient asm all over the place, so you wouldn't want to actually use it, but you might sometimes want a longer encoding for alignment reasons. – Peter Cordes Apr 25 '23 at 03:43
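
To make the three encodings mentioned in this comment concrete, here is a hedged Python sketch that builds each form for `mov rax, 60` and compares their lengths. The function names are invented for this illustration; the opcode bytes follow the standard x86-64 encodings (B8+r for the 32-bit form, REX.W C7 /0 for the sign-extended 32-bit immediate, REX.W B8+r for the 64-bit immediate), and only register-direct operands for rax..rdi are handled.

```python
import struct


def mov_r32_imm32(n, value):
    """mov r32, imm32 (e.g. `mov eax, 60`): B8+r imm32, zero-extended into r64."""
    return bytes([0xB8 + n]) + struct.pack("<I", value)


def mov_r64_simm32(n, value):
    """mov r64, imm32 sign-extended (what `strict dword` asks for): 48 C7 /0 imm32."""
    # ModRM byte: mod=11 (register direct), reg=000 (/0), rm=n.
    return bytes([0x48, 0xC7, 0xC0 + n]) + struct.pack("<i", value)


def mov_r64_imm64(n, value):
    """mov r64, imm64 (what `strict qword` asks for): 48 B8+r imm64."""
    return bytes([0x48, 0xB8 + n]) + struct.pack("<Q", value)


RAX = 0  # hardware register number of rax/eax
for encode in (mov_r32_imm32, mov_r64_simm32, mov_r64_imm64):
    code = encode(RAX, 60)
    print(len(code), code.hex(" "))
# 5 b8 3c 00 00 00
# 7 48 c7 c0 3c 00 00 00
# 10 48 b8 3c 00 00 00 00 00 00 00
```

All three leave 60 in rax; they differ only in code size, which is why the shortest form is the default and the longer ones are mainly of interest for padding or alignment.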