
Code layout in memory(zt)

2006-01-20 11:53
Since you mentioned C, I'll briefly explain the sections of a compiled & linked C program.

When a compiler compiles each file of a program, it compiles it into object code. Some compilers emit machine code directly, while others emit assembly language, which must then be passed to an assembler to convert it into machine code. In either case, the object code for each file contains the coded instructions for all the C statements as "text", memory reserved for uninitialized (zero-filled) global & static variables as "bss", & memory for initialized global & static variables as "data". Global constants are often placed in a separate read-only section, though simpler toolchains fold them into "text" or "data". [BTW, I just found out for myself that "bss" stands for "Block Started by Symbol" http://en.wikipedia.org/wiki/Block_Started_by_Symbol ]
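As a rough illustration of the three sections, the sketch below annotates a few C declarations with the section a typical toolchain would put them in. The names (`call_count`, `last_error`, `retry_limit`, `record_call`) are made up for this example, and the exact placement can differ between compilers and targets.

```c
/* Hypothetical names; the section placement noted here is typical, not guaranteed. */

int call_count;         /* no initializer: "bss", zero-filled before main() runs */
static int last_error;  /* file-local and uninitialized: also "bss" */
int retry_limit = 3;    /* explicit non-zero initializer: "data" */

/* The machine instructions generated for this function go into "text". */
int record_call(void)
{
    call_count += 1;
    if (call_count > retry_limit) {
        last_error = 1;
    }
    return last_error;
}
```

Only "data" needs its initial values stored in the program image; "bss" just records how many bytes must be zeroed at startup.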

After compilation, each source file has its own object file, whose code & data are laid out relative only to itself. In other words, the coded "text" instructions for each object file start at address 0x0000 & increase from there. Depending on the object format, the "bss" & "data" sections either start from their own 0x0000 addresses too or, as in the example further down, follow directly after "text".

It's the linker's job to combine all of these separate object files into a program & assign absolute memory addresses to the "text", "bss", & "data" sections from each file. The linker does this by starting each section at an address defined by the linker script & then, file by file, appending each object file's section to the matching linker section.

In the end, the linker will have combined all the relative sections found in each object file into a single, cohesive program. It depends on the OS whether the linked program will contain actual, fixed, absolute memory addresses (like many embedded systems) or addresses relative to the start of the linked program (like most PC programs).

With many linkers, you can specify starting addresses for each section OR request that each section start immediately following the previous section. The former gives you control over where each section starts, so you don't have to look at the linker output file to figure out where in hex each section is. However, you have to make sure that you allocate enough memory for each section so that the linker doesn't overlap them. The latter is easier since the linker will automatically just link each section one after the other, but if you want to inspect the actual machine code in memory, the sections' starting points will change as more or less code is added.
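The trade-off between the two placement styles comes down to a little address arithmetic, sketched below. The function names `section_fits` and `next_section_start` are hypothetical helpers for this illustration, not part of any linker, and the addresses in the usage note are made-up values.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Fixed placement: true if a section of `size` bytes starting at `start`
 * ends at or before `next_start`, i.e. the two sections do not overlap.
 */
bool section_fits(uint32_t start, uint32_t size, uint32_t next_start)
{
    return next_start >= start && size <= next_start - start;
}

/*
 * Back-to-back placement: the next section simply begins where this one ends,
 * so its address moves whenever `size` changes.
 */
uint32_t next_section_start(uint32_t start, uint32_t size)
{
    return start + size;
}
```

For instance, with a section fixed at 0x2000 and the next one fixed at 0x6000, there is room for 0x4000 bytes: `section_fits(0x2000, 0x3DB8, 0x6000)` is true, while a size of 0x4001 would overlap. With back-to-back placement, `next_section_start(0x2000, 0x3DB8)` gives 0x5DB8.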

As an example, let's have three files A.c, B.c, & C.c to be built for an embedded system. The compiler compiles each into an object file: A.o, B.o, C.o (or A.obj, whatever). Each of these object files has its own sections relative only to itself:

- A.o :
-- A.c's "text" section starting @ 0x0000 size 0x1468
-- A.c's "bss"_ section starting @ 0x1468 size 0x0214
-- A.c's "data" section starting @ 0x167C size 0x0108

- B.o :
-- B.c's "text" section starting @ 0x0000 size 0x0834
-- B.c's "bss"_ section starting @ 0x0834 size 0x00C4
-- B.c's "data" section starting @ 0x08F8 size 0x0044

- C.o :
-- C.c's "text" section starting @ 0x0000 size 0x211C
-- C.c's "bss"_ section starting @ 0x211C size 0x023C
-- C.c's "data" section starting @ 0x2358 size 0x0190
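The starting offsets in the lists above follow directly from the section sizes, since each object file lays out "text", "bss", & "data" one after another from 0x0000. A minimal sketch of that arithmetic, with hypothetical function names:

```c
#include <stdint.h>

/* "bss" follows "text", and "text" starts at relative address 0x0000. */
uint32_t obj_bss_start(uint32_t text_size)
{
    return text_size;
}

/* "data" follows "bss". */
uint32_t obj_data_start(uint32_t text_size, uint32_t bss_size)
{
    return text_size + bss_size;
}
```

For A.o, `obj_bss_start(0x1468)` is 0x1468 and `obj_data_start(0x1468, 0x0214)` is 0x167C, matching the list.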

As mentioned earlier, note that each of the object files starts @ 0x0000 & knows nothing about the others.

Now it's the linker's job to combine them into a single program. Let's assume the linker script says to start the "text" section @ 0x2000, "bss" @ 0x6000, & "data" @ 0x7000. Also just for fun, let's say the files are specified in the order C.o, B.o, & A.o. The linker will output the final program as :

- ABC.exe (or whatever executable format)
-- C.c's "text" section starting @ 0x2000
-- B.c's "text" section starting @ 0x411C (0x2000 + 0x211C) <- size of C.c's "text" section
-- A.c's "text" section starting @ 0x4950 (0x411C + 0x0834)
-------- "text" section stops __ @ 0x5DB8 (0x4950 + 0x1468)

-- C.c's "bss" section starting @ 0x6000
-- B.c's "bss" section starting @ 0x623C (0x6000 + 0x023C)
-- A.c's "bss" section starting @ 0x6300 (0x623C + 0x00C4)
-------- "bss" section stops __ @ 0x6514 (0x6300 + 0x0214)

-- C.c's "data" section starting @ 0x7000
-- B.c's "data" section starting @ 0x7190 (0x7000 + 0x0190)
-- A.c's "data" section starting @ 0x71D4 (0x7190 + 0x0044)
-------- "data" section stops __ @ 0x72DC (0x71D4 + 0x0108)
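The addresses above can be reproduced with a short loop that places each file's contribution right after the previous one. This is only a sketch of the arithmetic, not how any particular linker is implemented; `place_section` and its parameters are names invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Place `count` contributions to one output section back to back.
 *   base      : start address of the output section (from the linker script)
 *   sizes[i]  : size of the i-th object file's contribution, in link order
 *   starts[i] : receives the absolute start address of that contribution
 * Returns the address just past the end of the output section.
 */
uint32_t place_section(uint32_t base, const uint32_t *sizes, size_t count,
                       uint32_t *starts)
{
    uint32_t addr = base;
    for (size_t i = 0; i < count; i++) {
        starts[i] = addr;   /* this file's piece begins where the last one ended */
        addr += sizes[i];
    }
    return addr;
}
```

Calling it with base 0x2000 and the "text" sizes in link order (0x211C for C.o, 0x0834 for B.o, 0x1468 for A.o) fills `starts` with 0x2000, 0x411C, & 0x4950 and returns 0x5DB8, as in the listing.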

To run this final program, the CPU &/or OS must be configured to start running code from the program's entry point, assumed here to be the start of the "text" section @ 0x2000. The linker will have resolved all relative addresses in each object file to absolute addresses that fall within one of the valid "text", "bss", or "data" sections. [Note that I've ignored any "stack" or "heap" memory sections for simplicity, but these sections are specified by the program, linker, &/or OS depending on the OS & linker capabilities.]
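Resolving a relative address amounts to keeping its offset within the section & swapping the section's old start for its new one. The sketch below shows only that arithmetic; real linkers work from relocation records in the object file, and `resolve_address` is a hypothetical name.

```c
#include <stdint.h>

/*
 * Map an address that was relative to one object file onto the final program.
 *   object_addr          : address as seen inside the object file
 *   object_section_start : where the containing section starts in the object file
 *   linked_section_start : where the linker placed that file's section
 */
uint32_t resolve_address(uint32_t object_addr, uint32_t object_section_start,
                         uint32_t linked_section_start)
{
    uint32_t offset = object_addr - object_section_start;
    return linked_section_start + offset;
}
```

Taking the example above, a variable at 0x1680 in A.o sits 4 bytes into A.c's "data" section (which starts @ 0x167C in the object file), so it ends up at 0x71D4 + 4 = 0x71D8. The variable's position is made up for illustration.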