ARM Architecture: Summary Notes

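I have recently decided to write a long blog series on the ARM architecture. It doubles as study notes and as a summary of what I have picked up over two years of work: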




Topic 1: Endianness

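Endianness (big-endian, little-endian) determines the order in which the bytes of a value are stored in memory.

Big-endian mode: the high-order byte is stored at the low address and the low-order byte at the high address.

Little-endian mode: the high-order byte is stored at the high address and the low-order byte at the low address.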

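Mnemonic: look at where the low-order byte is stored. If it sits at the low address, the mode is little-endian; if it sits at the high address, the mode is big-endian.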


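To really understand endianness, keep in mind that memory is numbered in units of bytes. Little-endian mode can be read as "start storing from the little end (the low-order part) of the data". Because storage locations are numbered from low to high, the low-order byte lands at the low address and the high-order byte at the high address. ARM and x86 are generally little-endian (LSB), while PowerPC and MIPS are generally big-endian (MSB).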

$ file zip

zip: ELF 32-bit LSB executable, ARM, version 1 (SYSV), for GNU/Linux 2.6.34, dynamically linked (uses shared libs), for GNU/Linux 2.6.34, stripped

$ file /bin/ls

/bin/ls: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.6.9, dynamically linked (uses shared libs), for GNU/Linux 2.6.9, stripped

$ file ls

ls: ELF 32-bit MSB executable, PowerPC or cisco 4500, version 1 (SYSV), for GNU/Linux 2.6.34, dynamically linked (uses shared libs), for GNU/Linux 2.6.34, stripped

$ file ls

ls: ELF 32-bit MSB executable, MIPS, MIPS32 rel2 version 1 (SYSV), for GNU/Linux 2.6.34, dynamically linked (uses shared libs), for GNU/Linux 2.6.34, stripped

Here is an excerpt from a U-Boot linker script (.lds):

OUTPUT_FORMAT("elf32-littlearm", "elf32-littlearm", "elf32-littlearm")
SECTIONS
{
	. = 0x00000000;

	. = ALIGN(4);
	.text :
	{
		cpu/arm1136/start.o	(.text)
	}

	. = ALIGN(4);
	.rodata : { *(.rodata) }

	. = ALIGN(4);
	.data : { *(.data) }

	. = ALIGN(4);
	.got : { *(.got) }

	. = .;
	__u_boot_cmd_start = .;
	.u_boot_cmd : { *(.u_boot_cmd) }
	__u_boot_cmd_end = .;

	. = ALIGN(4);
	__bss_start = .;
	.bss : { *(.bss) }
	_end = .;
}

The ELF format of the final linked image is elf32-littlearm, which means little-endian. The Linux file and readelf commands can also report the endianness of an executable.




Little-endian mode

Big-endian mode

/*
 * Name        : endian_check.c
 * Author      : qiang
 * Version     :
 * Copyright   : Your copyright notice
 * Description : Hello World in C, Ansi-style
 */

#include <stdio.h>
#include <stdlib.h>

int main(void) {
	int x = 1;

	/* Read only the byte stored at the lowest address of x. */
	if (*(char *)&x == 1) {
		printf("Little-Endian. \n");
	} else {
		printf("Big-Endian. \n");
	}

	puts("!!!Hello World!!!"); /* prints !!!Hello World!!! */
	return EXIT_SUCCESS;
}



!!!Hello World!!!

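This code can detect the machine's endianness because of the pointer type: the cast to char* makes the compiler load a single byte from the lowest address of x. Look at the assembly produced by objdump (the ldrb instruction):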

000082f0 <main>:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
82f0:	e92d4800 	push	{fp, lr}
82f4:	e28db004 	add	fp, sp, #4
82f8:	e24dd008 	sub	sp, sp, #8
int x = 1;
82fc:	e3a03001 	mov	r3, #1
8300:	e50b3008 	str	r3, [fp, #-8]

if(*(char*) &x ==1) {
8304:	e24b3008 	sub	r3, fp, #8
8308:	e5d33000 	ldrb	r3, [r3]
830c:	e3530001 	cmp	r3, #1
8310:	1a000004 	bne	8328 <main+0x38>
printf("Little-Endian. \n");
8314:	e59f303c 	ldr	r3, [pc, #60]	; 8358 <main+0x68>
8318:	e08f3003 	add	r3, pc, r3
831c:	e1a00003 	mov	r0, r3
8320:	ebffffcb 	bl	8254 <puts@plt>
8324:	ea000003 	b	8338 <main+0x48>
else {
printf("Big-Endian. \n");
8328:	e59f302c 	ldr	r3, [pc, #44]	; 835c <main+0x6c>
832c:	e08f3003 	add	r3, pc, r3
8330:	e1a00003 	mov	r0, r3
8334:	ebffffc6 	bl	8254 <puts@plt>

The explanation of little-endian and big-endian taken from the AAPCS document:





Topic 2: AAPCS & ARM Core Registers

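AAPCS (Procedure Call Standard for the ARM Architecture): the specification of the binary interface for procedure calls in applications on the ARM architecture.

The best way to learn AAPCS is to download the spec from the official ARM website. The PDF is named Procedure Call Standard for the ARM Architecture.pdf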


The ARM architecture defines a core instruction set plus a number of additional instructions implemented by co-processors.

The core instruction set can access the core registers and co-processors can provide additional registers which are available for specific operations.

There are 16, 32-bit core (integer) registers visible to the ARM and Thumb instruction sets.

These are labeled r0-r15 or R0-R15. Register names may appear in assembly language in either upper case or lower case.

For the roles AAPCS assigns to the 16 core registers, see the screenshot below:

R13, the stack pointer, deserves a closer look, because pushing to and popping from the stack carry so much weight in function calls:

Stack Pointer Register
    - R13 holds the stack pointer (address) of the current processor mode

    - Each processor mode has its own banked SP (Stack Pointer); User and System modes share one

§ARM state (32-bit instructions)
    - You can usually see the assembly code below at the entry of a function

       STMDB R13!,{R0-R3,R14} // stores the general-purpose registers and the link register
                              // (LR, R14), which is needed to return, on the stack

    - You can usually see the assembly code below at the end of a function

       LDMIA R13!,{R0-R3,PC} // restores the registers and loads the saved LR value into the PC (Program Counter)

§Thumb state (16-bit instructions)
    - You can usually see the assembly code below at the entry of a function

       PUSH {R0-R3,R14} // R13 is implied as the stack pointer;
                        // stores the general-purpose registers and LR

    - You can usually see the assembly code below at the end of a function

       POP {R0-R3,PC} // restores the registers and loads the saved LR value into the PC






Topic 3: ARM Processor Modes

•User mode
    - Normal program execution mode. It cannot change the processor mode.

•FIQ mode
    - The operating mode for handling fast interrupt requests

•IRQ mode
    - The operating mode for handling normal interrupt requests

•Supervisor mode
    - ARM switches to SVC mode when a reset or a software interrupt (SWI) occurs

•Abort mode
    - ARM switches to Abort mode if an error occurs while reading from or writing to memory

•Undefined mode
    - ARM switches to Undefined mode if the processor tries to execute an unrecognized instruction

•System mode
    - This mode serves the same purpose as User mode, except that it is a privileged mode (it can disable/enable interrupts and change the processor mode)

•Secure Monitor mode
    - This mode is a secure mode for TrustZone Secure Monitor code. ARM1176 supports it.


Topic 4: ARM Exception


