
Reverse in a nutshell

In this preparatory chapter, we cover the basics of reverse-engineering an executable file. If you are already familiar with binary file formats, you can skip directly to the next chapter.

We are going to use the tiny ELF executable fibonacci as a running example, paving the way to the next chapter.

What is inside?

An executable file contains the machine code instructions and the data that the processor will operate on, but also some metadata that tells the operating system how to map this content into the process memory.

note

The content and encoding of this metadata depend on the operating system. The common formats are:

  • the Executable and Linkable Format (ELF) for Linux;
  • the Portable Executable (PE) for Windows;
  • Mach-O for macOS.
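
As a rough illustration of how these formats can be told apart, the sketch below looks only at the first bytes of a file. The function name guess_format is hypothetical, and the check is deliberately shallow: an "MZ" prefix only indicates a DOS header (a full check would also look for the PE signature it points to), and Mach-O universal ("fat") binaries are left out.

```python
# Magic numbers of thin Mach-O files, in both byte orders (32-bit and 64-bit).
MACHO_MAGICS = {
    bytes.fromhex("feedface"),
    bytes.fromhex("feedfacf"),
    bytes.fromhex("cefaedfe"),
    bytes.fromhex("cffaedfe"),
}


def guess_format(head: bytes) -> str:
    """Guess the executable format from the first bytes of a file."""
    if head[:4] == b"\x7fELF":
        return "ELF"
    if head[:2] == b"MZ":
        # Assumption: a DOS "MZ" stub is taken as a hint for PE.
        return "PE"
    if head[:4] in MACHO_MAGICS:
        return "Mach-O"
    return "unknown"


# First 8 bytes of the fibonacci executable used in this chapter.
print(guess_format(bytes.fromhex("7f454c4602010100")))
```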

Hexdump

The following gives you an overview of the file content as read by BINSEC. The next sections describe this information in detail.

00000000  7f45 4c46 0201 0100 0000 0000 0000 0000
00000010  0200 3e00 0100 0000 e800 4000 0000 0000
00000020  4000 0000 0000 0000 1002 0000 0000 0000
00000030  0000 0000 4000 3800 0100 4000 0600 0500
00000040  0100 0000 0500 0000 7800 0000 0000 0000
00000050  7800 4000 0000 0000 7800 4000 0000 0000
00000060  c400 0000 0000 0000 c400 0000 0000 0000
00000070  0400 0000 0000 0000 31c0 4885 ff74 0fba
00000080  0100 0000 4892 4801 d048 ffcf 75f6 c390
00000090  31ff 0fb6 0685 c074 1c83 e830 0f88 8200
000000a0  0000 3c09 777e 48ff c648 8d3c bf48 01ff
000000b0  4801 c7eb ddc3 6690 6668 000a 4889 e666
000000c0  5abb 0a00 0000 48ff ce31 d248 f7f3 80c2
000000d0  3088 1648 85c0 75ee bf01 0000 0048 89e2
000000e0  4829 f289 f80f 05c3 5f66 83ff 0275 1f48
000000f0  89e7 488b 7708 e895 ffff ffe8 78ff ffff
00000100  e8b3 ffff ff31 ffb8 3c00 0000 0f05 bf01
00000110  0000 0048 8d34 2529 0140 00ba 1300 0000
00000120  89f8 0f05 40b7 ffeb de55 7361 6765 3a20
00000130  6669 626f 6e61 6363 6920 4e0a 0000 0000
00000140  0000 0000 0000 0000 0000 0000 0000 0000
00000150  0000 0000 0000 0000 0100 0000 0200 0100
00000160  9000 4000 0000 0000 2600 0000 0000 0000
00000170  0e00 0000 0200 0100 b800 4000 0000 0000
00000180  3000 0000 0000 0000 1a00 0000 1000 0100
00000190  e800 4000 0000 0000 0000 0000 0000 0000
000001a0  2100 0000 1200 0100 7800 4000 0000 0000
000001b0  1700 0000 0000 0000 0070 6172 7365 5f75
000001c0  696e 7436 3400 6563 686f 5f75 696e 7436
000001d0  3400 5f73 7461 7274 0066 6962 6f6e 6163
000001e0  6369 0000 2e73 796d 7461 6200 2e73 7472
000001f0  7461 6200 2e73 6873 7472 7461 6200 2e74
00000200  6578 7400 2e72 6f64 6174 6100 0000 0000
00000210  0000 0000 0000 0000 0000 0000 0000 0000
00000220  0000 0000 0000 0000 0000 0000 0000 0000
00000230  0000 0000 0000 0000 0000 0000 0000 0000
00000240  0000 0000 0000 0000 0000 0000 0000 0000
00000250  1b00 0000 0100 0000 0600 0000 0000 0000
00000260  7800 4000 0000 0000 7800 0000 0000 0000
00000270  b100 0000 0000 0000 0000 0000 0000 0000
00000280  0400 0000 0000 0000 0000 0000 0000 0000
00000290  2100 0000 0100 0000 0200 0000 0000 0000
000002a0  2901 4000 0000 0000 2901 0000 0000 0000
000002b0  1300 0000 0000 0000 0000 0000 0000 0000
000002c0  0100 0000 0000 0000 0000 0000 0000 0000
000002d0  0100 0000 0200 0000 0000 0000 0000 0000
000002e0  0000 0000 0000 0000 4001 0000 0000 0000
000002f0  7800 0000 0000 0000 0400 0000 0300 0000
00000300  0800 0000 0000 0000 1800 0000 0000 0000
00000310  0900 0000 0300 0000 0000 0000 0000 0000
00000320  0000 0000 0000 0000 b801 0000 0000 0000
00000330  2b00 0000 0000 0000 0000 0000 0000 0000
00000340  0100 0000 0000 0000 0000 0000 0000 0000
00000350  1100 0000 0300 0000 0000 0000 0000 0000
00000360  0000 0000 0000 0000 e301 0000 0000 0000
00000370  2900 0000 0000 0000 0000 0000 0000 0000
00000380  0100 0000 0000 0000 0000 0000 0000 0000
ELF Header:
Class:               ELF64                        
Data:                2's complement, little endian
Type:                EXEC                         
Machine:             x86                        
Entry point address: 0x4000e8                     

Section Headers:
[Nr] Name      Type     Addr             Off    Size   ES Flg Lk Inf Al
[ 0]           NULL     0000000000000000 000000 000000 00      0   0  0
[ 1] .text     PROGBITS 0000000000400078 000078 0000b1 00  AX  0   0  4
[ 2] .rodata   PROGBITS 0000000000400129 000129 000013 00   A  0   0  1
[ 3] .symtab   SYMTAB   0000000000000000 000140 000078 18      4   3  8
[ 4] .strtab   STRTAB   0000000000000000 0001b8 00002b 00      0   0  1
[ 5] .shstrtab STRTAB   0000000000000000 0001e3 000029 00      0   0  1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), G (group), T (TLS), O (extra OS processing required)

Symbol table '.symtab' contains 5 entries:
Num:            Value Size Type   Bind   Section Name        
  0: 0000000000000000    0 NOTYPE LOCAL  UND                 
  1: 0000000000400090   38 FUNC   LOCAL  .text   parse_uint64
  2: 00000000004000b8   48 FUNC   LOCAL  .text   echo_uint64
  3: 00000000004000e8    0 NOTYPE GLOBAL .text   _start    
  4: 0000000000400078   23 FUNC   GLOBAL .text   fibonacci

Headers

The headers contain information about the type of the file and the structure of its content.

BINSEC can output basic information about the file using the -describe flag.

tip

You may get more / prettier information using dedicated tools like readelf or readpe.

ELF header

The ELF header identifies the type of the binary file.

ELF Header:
Class:               ELF64
Data:                2's complement, little endian
Type:                EXEC
Machine:             x86
Entry point address: 0x4000e8

Here, fibonacci is an executable for the x86-64 architecture. The entry point (0x4000e8) is the address of the first instruction executed when the program starts.
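
To make the link between these fields and the raw bytes concrete, here is a minimal sketch that decodes the first 64 bytes of the hexdump above with Python's struct module. The field layout follows the standard 64-bit little-endian ELF header; the function name parse_elf64_header and the dictionary keys are illustrative choices, not BINSEC's API.

```python
import struct

# First 64 bytes of fibonacci, copied from the hexdump.
HEADER = bytes.fromhex(
    "7f454c46020101000000000000000000"
    "02003e0001000000e800400000000000"
    "40000000000000001002000000000000"
    "00000000400038000100400006000500"
)

# e_ident, e_type, e_machine, e_version, e_entry, e_phoff, e_shoff,
# e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")


def parse_elf64_header(data: bytes) -> dict:
    """Decode a little-endian ELF64 header into a small dictionary."""
    fields = ELF64_HEADER.unpack_from(data)
    ident = fields[0]
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    return {
        "class": {1: "ELF32", 2: "ELF64"}[ident[4]],
        "little_endian": ident[5] == 1,
        "type": fields[1],       # 2 is ET_EXEC
        "machine": fields[2],    # 62 (0x3e) is EM_X86_64
        "entry": fields[4],
        "shoff": fields[6],      # file offset of the section header table
        "shnum": fields[12],     # number of section headers
    }


header = parse_elf64_header(HEADER)
print(hex(header["entry"]))  # 0x4000e8
```

The shoff and shnum values (0x210 and 6) locate the section header table described in the next part.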

info

The file tool determines the file type by reading this header.

Section header

Sections structure the content of the file and provide the mapping between the file offsets and the process virtual addresses.

Section Headers:
[Nr] Name      Type     Addr             Off    Size   ES Flg Lk Inf Al
[ 0]           NULL     0000000000000000 000000 000000 00      0   0  0
[ 1] .text     PROGBITS 0000000000400078 000078 0000b1 00  AX  0   0  4
[ 2] .rodata   PROGBITS 0000000000400129 000129 000013 00   A  0   0  1
[ 3] .symtab   SYMTAB   0000000000000000 000140 000078 18      4   3  8
[ 4] .strtab   STRTAB   0000000000000000 0001b8 00002b 00      0   0  1
[ 5] .shstrtab STRTAB   0000000000000000 0001e3 000029 00      0   0  1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), G (group), T (TLS), O (extra OS processing required)

Here, the program loads two sections in memory (A):

  • .text gets the executable permission (X); it contains the machine instructions;
  • .rodata contains read-only data.
note

The other sections are not loaded in memory. Their only purpose is to provide or complement metadata such as the symbols (.symtab, .strtab) or the section names (.shstrtab). They are optional (see strip), but their absence makes the binary harder for a human to understand.
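
The mapping between virtual addresses and file offsets can be sketched with the values of the table above. The Section tuple and the vaddr_to_offset helper are hypothetical names introduced for this illustration; only sections carrying the A flag are considered, since the others have no address in memory.

```python
from typing import NamedTuple, Optional


class Section(NamedTuple):
    name: str
    addr: int    # virtual address once loaded (0 if not loaded)
    offset: int  # position of the content in the file
    size: int
    flags: str


# Values copied from the section header table of fibonacci.
SECTIONS = [
    Section(".text", 0x400078, 0x78, 0xB1, "AX"),
    Section(".rodata", 0x400129, 0x129, 0x13, "A"),
    Section(".symtab", 0, 0x140, 0x78, ""),
    Section(".strtab", 0, 0x1B8, 0x2B, ""),
    Section(".shstrtab", 0, 0x1E3, 0x29, ""),
]


def vaddr_to_offset(vaddr: int) -> Optional[int]:
    """Translate a virtual address into a file offset, or None if unmapped."""
    for section in SECTIONS:
        loaded = "A" in section.flags
        if loaded and section.addr <= vaddr < section.addr + section.size:
            return section.offset + (vaddr - section.addr)
    return None


# The entry point 0x4000e8 lies in .text, at file offset 0xe8.
print(hex(vaddr_to_offset(0x4000E8)))
```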

Symbol table

Symbols label some elements of the program. They are usually produced when compiling from a higher-level language, to identify the program functions or global variables.

Symbol table '.symtab' contains 5 entries:
Num:            Value Size Type   Bind   Section Name
  0: 0000000000000000    0 NOTYPE LOCAL  UND
  1: 0000000000400090   38 FUNC   LOCAL  .text   parse_uint64
  2: 00000000004000b8   48 FUNC   LOCAL  .text   echo_uint64
  3: 00000000004000e8    0 NOTYPE GLOBAL .text   _start
  4: 0000000000400078   23 FUNC   GLOBAL .text   fibonacci

Here, fibonacci identifies the entry point of a function that we will study in the next chapter.
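
The table above can be rebuilt from the raw bytes of .symtab (file offset 0x140) and .strtab (file offset 0x1b8). The sketch below assumes the standard 24-byte ELF64 symbol entry layout; the helper names read_name and parse_symbols are illustrative only.

```python
import struct

# Content of .symtab, copied from the hexdump (5 entries of 24 bytes).
SYMTAB = bytes.fromhex(
    "00" * 24
    + "0100000002000100" "9000400000000000" "2600000000000000"
    + "0e00000002000100" "b800400000000000" "3000000000000000"
    + "1a00000010000100" "e800400000000000" "0000000000000000"
    + "2100000012000100" "7800400000000000" "1700000000000000"
)
# Content of .strtab: NUL-terminated names referenced by offset.
STRTAB = b"\0parse_uint64\0echo_uint64\0_start\0fibonacci\0"

# st_name, st_info, st_other, st_shndx, st_value, st_size
ELF64_SYM = struct.Struct("<IBBHQQ")
TYPES = {0: "NOTYPE", 1: "OBJECT", 2: "FUNC"}
BINDS = {0: "LOCAL", 1: "GLOBAL", 2: "WEAK"}


def read_name(strtab: bytes, offset: int) -> str:
    """Read the NUL-terminated string starting at the given offset."""
    end = strtab.index(b"\0", offset)
    return strtab[offset:end].decode("ascii")


def parse_symbols(symtab: bytes, strtab: bytes) -> list:
    """Return (name, value, size, type, bind) for every symbol entry."""
    symbols = []
    for name, info, _other, _shndx, value, size in ELF64_SYM.iter_unpack(symtab):
        # The low 4 bits of st_info hold the type, the high 4 bits the binding.
        symbols.append(
            (read_name(strtab, name), value, size, TYPES[info & 0xF], BINDS[info >> 4])
        )
    return symbols


symbols = parse_symbols(SYMTAB, STRTAB)
for name, value, size, kind, bind in symbols:
    print(f"{value:016x} {size:4d} {kind:6} {bind:6} {name}")
```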

Disassembly

The .text section holds the raw machine-code encoding of the instructions. Disassembly aims to recover the (more) human-readable assembly mnemonics from these bytes.

BINSEC can output the code disassembly using the -disasm flag.
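
To give an intuition of what a disassembler does, here is a toy sketch that decodes the 23 bytes of the fibonacci function (address 0x400078, taken from the hexdump). It is not how BINSEC or any real disassembler works: it only knows the handful of encodings that appear in this function, through a hand-written lookup table, whereas a real decoder handles prefixes, ModRM bytes and operands in general. All names are hypothetical.

```python
import struct

# Bytes of the fibonacci function, split by instruction for readability.
CODE = bytes.fromhex("31c0 4885ff 740f ba01000000 4892 4801d0 48ffcf 75f6 c3")
BASE = 0x400078

# Encodings without any immediate operand (Intel syntax).
FIXED = {
    bytes.fromhex("31c0"): "xor eax, eax",
    bytes.fromhex("4885ff"): "test rdi, rdi",
    bytes.fromhex("4892"): "xchg rax, rdx",
    bytes.fromhex("4801d0"): "add rax, rdx",
    bytes.fromhex("48ffcf"): "dec rdi",
    bytes.fromhex("c3"): "ret",
}
# Conditional jumps followed by a signed 8-bit displacement.
SHORT_JUMPS = {0x74: "je", 0x75: "jne"}


def disassemble(code: bytes, base: int) -> list:
    """Return a list of (address, text) pairs for the known encodings."""
    listing = []
    pos = 0
    while pos < len(code):
        addr = base + pos
        opcode = code[pos]
        if opcode in SHORT_JUMPS:
            (rel,) = struct.unpack_from("<b", code, pos + 1)
            length = 2
            # The displacement is relative to the end of the instruction.
            text = f"{SHORT_JUMPS[opcode]} {addr + length + rel:#x}"
        elif opcode == 0xBA:  # mov edx, imm32
            (imm,) = struct.unpack_from("<I", code, pos + 1)
            length = 5
            text = f"mov edx, {imm:#x}"
        else:
            for encoding, text in FIXED.items():
                if code.startswith(encoding, pos):
                    length = len(encoding)
                    break
            else:
                raise ValueError(f"unknown encoding at {addr:#x}")
        listing.append((addr, text))
        pos += length
    return listing


listing = disassemble(CODE, BASE)
for addr, text in listing:
    print(f"{addr:#x}: {text}")
```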

tip

You will get better output from objdump or from a more powerful GUI disassembler such as Cutter, Ghidra or IDA.