What is Little Endian?
Endianness is the way the bytes of a multi-byte value are ordered in computer memory.
Then what is Little Endian? It is the byte order in which the least significant byte occupies the lowest memory address.
It will be easier if I show you the sample below:
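For example, I have the data abcdefgh, or 61,62,63,64,65,66,67,68 in hexadecimal (a=61 in the ASCII table). The bytes themselves sit in memory in that order, but when the processor reads them as 4-byte numbers, the byte at the lowest address becomes the least significant one, so each group appears reversed:
d,c,b,a h,g,f,e or 64,63,62,61 68,67,66,65
The easy way to remember it: read each group of 4 bytes from right to left.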
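To make the 4-byte grouping concrete, here is a minimal sketch in Python (Python is not used elsewhere in this article; it is chosen here only for illustration). It reads each half of the sample string as one little-endian integer with the standard int.from_bytes method. The variable names are my own.

```python
# The sample data from the article: the ASCII bytes of "abcdefgh".
data = b"abcdefgh"

# The bytes in address order, as hexadecimal: 61 62 63 64 65 66 67 68.
bytes_in_memory = data.hex(" ")  # the separator argument needs Python 3.8+

# Read each group of 4 bytes as one little-endian integer: the byte at the
# lowest address (0x61, 'a') becomes the least significant byte.
first_word = int.from_bytes(data[:4], byteorder="little")
second_word = int.from_bytes(data[4:], byteorder="little")

print(bytes_in_memory)                    # 61 62 63 64 65 66 67 68
print(hex(first_word), hex(second_word))  # 0x64636261 0x68676665
```

The bytes never move in memory; only their weight inside the number depends on the byte order.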
There are two types of byte order:
- Little Endian
- Big Endian
Big Endian is the opposite of Little Endian: the most significant byte occupies the lowest memory address, so a number reads from left to right in memory, the same way it is written.
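As a small illustration of the two layouts, the Python sketch below stores the same 32-bit number in both byte orders using the standard int.to_bytes method. The value 0x12345678 is a hypothetical example of my own, not data from this article.

```python
value = 0x12345678  # hypothetical example value, chosen so every byte differs

# Little endian: least significant byte (0x78) at the lowest address.
little = value.to_bytes(4, byteorder="little")

# Big endian: most significant byte (0x12) at the lowest address.
big = value.to_bytes(4, byteorder="big")

print(little.hex(" "))  # 78 56 34 12
print(big.hex(" "))     # 12 34 56 78
```

One layout is simply the byte-wise reverse of the other; the bits inside each byte are not reversed.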
Little Endian is mainly used in the microprocessor world.
Outside the microprocessor world, Big Endian is the common format in data networking: protocols such as TCP, UDP, IPv4 and IPv6 transmit their multi-byte fields in Big Endian order, which is also called network byte order.
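The difference matters when a Little Endian machine writes a number into a network packet. The sketch below, in Python with the standard struct module, packs a 16-bit port number in both orders. The port 8080 is a hypothetical value picked for illustration, and the variable names are my own.

```python
import struct

port = 8080  # hypothetical port number (0x1F90), chosen only for illustration

# "!" selects network byte order (big endian); "H" is an unsigned 16-bit integer.
wire_bytes = struct.pack("!H", port)

# "<" selects little endian, the layout an x86 CPU uses in its own memory.
x86_bytes = struct.pack("<H", port)

# Decoding must use the same byte order that was used for encoding.
decoded = struct.unpack("!H", wire_bytes)[0]

print(wire_bytes.hex(" "))  # 1f 90
print(x86_bytes.hex(" "))   # 90 1f
print(decoded)              # 8080
```

If the receiver decoded wire_bytes as little endian instead, it would get 0x901F (36895) rather than 8080, which is why protocols fix one byte order for everyone.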
History of the word ‘Endian’.
In 1980, computer scientist Danny Cohen introduced the terms Big Endian and Little Endian to digital electronics. The terms actually come from the novel Gulliver’s Travels, written by Jonathan Swift.
Which types of processor use Little Endian?
Motorola uses Big Endian, while Intel and AMD use Little Endian.
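If you are not sure which byte order your own machine uses, you can ask it. The Python sketch below shows two ways: reading the standard sys.byteorder attribute, and probing with a native-order pack of the number 1. The function names are hypothetical helpers of my own.

```python
import struct
import sys


def host_byte_order() -> str:
    """Return 'little' or 'big' as reported by the Python runtime."""
    return sys.byteorder


def detect_by_probe() -> str:
    """Detect the byte order by packing the number 1 in native order."""
    # "=" means native byte order; "H" is an unsigned 16-bit integer.
    probe = struct.pack("=H", 1)
    # On a little-endian machine the low byte (0x01) comes first.
    return "little" if probe[0] == 1 else "big"


print(host_byte_order())
print(detect_by_probe())
```

On an Intel or AMD machine both lines should print little.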
Which type is better?
As of writing this article, I still can’t find an article that clearly shows which one is better in terms of speed, ease of coding, or benchmarks.
Why do Intel and AMD use Little Endian, and why does Motorola use Big Endian, in their processors?
I believe it comes from the history of each processor design: to maintain backward compatibility, each vendor has kept its Endian system in its products until now. ARM is a special case: its processors can run in either byte order, and in practice most ARM systems run in Little Endian mode.
For Intel, Little Endian goes back to the Intel 8008, an early microprocessor whose architecture was co-created by Victor Poor, an American engineer and computer pioneer. Intel has continued to use the Little Endian system until now.
For your reference, you can read the article below:
https://archive.computerhistory.org/resources/text/Oral_History/Intel_8008/Intel_8008_1.oral_history.2006.102657982.pdf
After you download the PDF, search for “for example, storing numbers least significant byte first”.
Using GDB to see how it works in the system.
In this tutorial, I use a 64-bit AMD processor, with “nasm” to assemble and “ld” to link my assembly code.
Below is the code that I use for the sample.
section .text
global _start

_start:

    mov rax,sample

    ;Exit
    mov eax,1
    mov ebx,0
    int 0x80

section .data
sample db 'abcdefgh'
I assemble it with nasm and link it with ld.
$ nasm -f elf64 -g endian.asm -o endian.o
$ ld endian.o -o endian
$
$ gdb ./endian
GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./endian...done.
(gdb) list 1,15
1       section .text
2       global _start
3
4       _start:
5
6       mov rax,sample
7
8       ;Exit
9       mov eax,1
10      mov ebx,0
11      int 0x80
12
13      section .data
14      sample db 'abcdefgh'
(gdb) break _start
Breakpoint 1 at 0x4000b0
(gdb) run
Starting program: /home/darklinux/endian

Breakpoint 1, 0x00000000004000b0 in _start ()
(gdb)
(gdb) set disassembly-flavor intel
(gdb) disassemble _start
Dump of assembler code for function _start:
=> 0x00000000004000b0 <+0>:     movabs rax,0x6000c8
   0x00000000004000ba <+10>:    mov    eax,0x1
   0x00000000004000bf <+15>:    mov    ebx,0x0
   0x00000000004000c4 <+20>:    int    0x80
End of assembler dump.
(gdb) info reg rax
rax            0x0      0
(gdb) si
0x00000000004000ba in _start ()
(gdb) disassemble _start
Dump of assembler code for function _start:
   0x00000000004000b0 <+0>:     movabs rax,0x6000c8
=> 0x00000000004000ba <+10>:    mov    eax,0x1
   0x00000000004000bf <+15>:    mov    ebx,0x0
   0x00000000004000c4 <+20>:    int    0x80
End of assembler dump.
(gdb) info reg rax
rax            0x6000c8 6291656
(gdb) x/s 0x6000c8
0x6000c8:       "abcdefgh\001"
(gdb) x/8x 0x6000c8
0x6000c8:       0x61    0x62    0x63    0x64    0x65    0x66    0x67    0x68
(gdb) x/2wx 0x6000c8
0x6000c8:       0x64636261      0x68676665
(gdb)
As you can see, starting at address 0x6000c8 (the location of the string ‘abcdefgh’), x/8x shows the individual bytes in their original order, while x/2wx reads them as two 4-byte words:
abcd efgh
become
0x64636261 (dcba) 0x68676665 (hgfe)
In the ASCII table, in hexadecimal, a=61, b=62, c=63, d=64, e=65, f=66, g=67 and h=68.
So this confirms that Intel and Intel-compatible (AMD) processors use Little Endian.
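The GDB session shows that the same 8 bytes look different depending on the unit size you read them in. As a rough cross-check without a debugger, the Python sketch below decodes the sample string at several widths with the standard struct module. It imitates what the x command displays; it is not how GDB is implemented, and the variable names are my own.

```python
import struct

data = b"abcdefgh"  # the same 8 bytes as the 'sample' label in the article

# Byte-sized units (like x/8x above): one value per byte, in address order.
as_bytes = [hex(b) for b in data]

# 2-byte units, little endian ("<" = little endian, "H" = unsigned 16-bit).
as_halfwords = [hex(v) for v in struct.unpack("<4H", data)]

# 4-byte units (like x/2wx above); "I" = unsigned 32-bit.
as_words = [hex(v) for v in struct.unpack("<2I", data)]

# One 8-byte unit; "Q" = unsigned 64-bit.
as_giant = hex(struct.unpack("<Q", data)[0])

# For contrast: the same two words decoded as big endian (">").
as_words_big = [hex(v) for v in struct.unpack(">2I", data)]

print(as_bytes)      # ['0x61', '0x62', '0x63', '0x64', '0x65', '0x66', '0x67', '0x68']
print(as_halfwords)  # ['0x6261', '0x6463', '0x6665', '0x6867']
print(as_words)      # ['0x64636261', '0x68676665']
print(as_giant)      # 0x6867666564636261
print(as_words_big)  # ['0x61626364', '0x65666768']
```

The reversal only happens inside each unit: the wider the unit, the longer the run of bytes that appears reversed.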