This is a very interesting topic we are about to dig into: how the compiler organizes our program. Maybe you have been wondering all these years how the compiler handles global variables, program code, and so on. Basically, the compiler takes your program and divides it into sections: one for the code, another for global variables with initial values, another for uninitialized variables, and even custom sections the programmer can declare. This way the linker can later place all of this data in our microcontroller's memory.
OK, let's go back to the most basic program ever created, main.c:
int main( void )
{
    return 0;
}
Compile only, without linking:
$ arm-none-eabi-gcc -c main.c -o main.o -mcpu=cortex-m0plus
And what next? Let's take a look at the information stored in the object file. For that we need a different tool called objdump. Ohhh, what an interesting piece of information we get from the object file! The output lists five rows (Idx 0 to 4), each described by several columns. For the moment let's focus on the Size column, and for simplicity let's ignore rows 3 and 4. We are going to call the rows sections, and notice that the only one with a size different from zero is .text (0xc, that is, 12 bytes; the sizes are in hexadecimal). Hmmm.
$ arm-none-eabi-objdump -h main.o
main.o: file format elf32-littlearm
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 0000000c 00000000 00000000 00000034 2**1
CONTENTS, ALLOC, LOAD, READONLY, CODE
1 .data 00000000 00000000 00000000 00000040 2**0
CONTENTS, ALLOC, LOAD, DATA
2 .bss 00000000 00000000 00000000 00000040 2**0
ALLOC
3 .comment 0000001f 00000000 00000000 00000040 2**0
CONTENTS, READONLY
4 .ARM.attributes 0000002c 00000000 00000000 0000005f 2**0
CONTENTS, READONLY
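If you want to read the Size column without doing hexadecimal in your head, a few lines of C can do it. This is only a sketch: parse_section_row is a hypothetical helper (it is not part of binutils), and it assumes each section row looks exactly like the ones in the listing above.

```c
#include <stdio.h>

/* Hypothetical helper: extracts the section name and its size in bytes
 * from one row of `objdump -h` output.
 * Returns 1 on success, 0 if the row is not a section row. */
int parse_section_row(const char *row, char name[32], unsigned long *size)
{
    int idx;

    /* Row layout: Idx Name Size VMA LMA File-off Algn.
     * The Size column is printed in hexadecimal, hence %lx. */
    return sscanf(row, "%d %31s %lx", &idx, name, size) == 3;
}
```

Feeding it the .text row from the listing gives the name .text and a size of 12 bytes. Flag rows such as "CONTENTS, ALLOC, ..." are rejected because they do not start with an index number.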
Add a global variable just to see what happens next.
unsigned long var1;    /* global variable without an initial value */
int main( void )
{
    return 0;
}
Compile, display the section headers again, and take a look at the .bss section: it is now 4 bytes, exactly the size of the variable we declared (.text remains the same). Quite interesting…
$ arm-none-eabi-gcc -c main.c -o main.o -mcpu=cortex-m0plus
$ arm-none-eabi-objdump -h main.o
main.o: file format elf32-littlearm
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 0000000c 00000000 00000000 00000034 2**1
CONTENTS, ALLOC, LOAD, READONLY, CODE
1 .data 00000000 00000000 00000000 00000040 2**0
CONTENTS, ALLOC, LOAD, DATA
2 .bss 00000004 00000000 00000000 00000040 2**2
ALLOC
3 .comment 0000001f 00000000 00000000 00000040 2**0
CONTENTS, READONLY
4 .ARM.attributes 0000002c 00000000 00000000 0000005f 2**0
CONTENTS, READONLY
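Why does .bss not need any initial values stored in the file? C guarantees that a global without an initializer starts at zero, so the section only has to reserve space (notice it carries the ALLOC flag but not CONTENTS). The sketch below uses hypothetical names (zeroed, buffer, bss_sum). The comment about an explicit = 0 also ending up in .bss describes GCC's usual default behavior, not something the C standard promises.

```c
unsigned long var1;               /* no initializer: starts at zero */
unsigned long zeroed = 0;         /* explicit zero: GCC normally places this in .bss too */
static unsigned char buffer[64];  /* 64 bytes reserved, none stored in the object file */

/* Adds up everything that lives in .bss; the result is zero as long as
 * nothing has written to these variables. */
unsigned long bss_sum(void)
{
    unsigned long sum = var1 + zeroed;

    for (unsigned i = 0; i < sizeof buffer; i++) {
        sum += buffer[i];
    }
    return sum;
}
```

Compiled with the same command, this should make .bss grow while .data stays empty. On a microcontroller it is normally the startup code that clears this region before main runs.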
Declare another global variable, this time with an initial value, and assign var2 to var1.
unsigned long var1;         /* no initial value */
unsigned long var2 = 34;    /* with an initial value */
int main( void )
{
    var1 = var2;
    return 0;
}
Compile and display again. Notice that the .data section is now 4 bytes and that .text also increased its size, because we added code: the instructions that copy var2 into var1, plus the addresses of both variables, which are stored next to the code (that is also why .text now shows the RELOC flag). Time for a small analysis. We can deduce that the compiler places uninitialized global variables in a section called .bss, variables with an initial value in .data, and all the instructions in .text. You can add more variables of different types and more code, and watch how the section sizes increase.
$ arm-none-eabi-gcc -c main.c -o main.o -mcpu=cortex-m0plus
$ arm-none-eabi-objdump -h main.o
main.o: file format elf32-littlearm
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 0000001c 00000000 00000000 00000034 2**2
CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
1 .data 00000004 00000000 00000000 00000050 2**2
CONTENTS, ALLOC, LOAD, DATA
2 .bss 00000004 00000000 00000000 00000054 2**2
ALLOC
3 .comment 0000001f 00000000 00000000 00000054 2**0
CONTENTS, READONLY
4 .ARM.attributes 0000002c 00000000 00000000 00000073 2**0
CONTENTS, READONLY
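To try the experiment suggested above, here is a sketch with a few variables of different sizes. All the names are made up for illustration, and the fixed-width types come from <stdint.h>. Keep in mind that the section size reported by objdump can be a little larger than the plain sum because of alignment padding.

```c
#include <stdint.h>

uint8_t  flag = 1;                 /* 1 byte with an initial value   */
uint16_t threshold = 500;          /* 2 bytes with an initial value  */
uint32_t table[4] = {1, 2, 3, 4};  /* 16 bytes with initial values   */
uint32_t samples[8];               /* 32 bytes without initial value */

/* Bytes we expect the initialized variables to need (before padding). */
unsigned initialized_bytes(void)
{
    return (unsigned)(sizeof flag + sizeof threshold + sizeof table);
}

/* Bytes we expect the uninitialized variable to need. */
unsigned zeroed_bytes(void)
{
    return (unsigned)sizeof samples;
}
```

Compile it with the -c option as before and compare the .data and .bss sizes from objdump with the values these two functions return.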
And what about local variables? OK, let's add a local variable; according to what we have seen so far, the .data section should increase its size, shouldn't it?
unsigned long var1;
unsigned long var2 = 34;
int main( void )
{
    unsigned long var3 = 40;    /* local variable */
    var1 = var2;
    return 0;
}
Down below we can see that the .data section does not increase its size, but .text does. There is an explanation: local variables are created at run time, not at compile time, in a special region of memory called the stack (we are not going into detail about the stack yet). Allocating memory there and storing the initial value requires a few instructions, including ones that move something called the stack pointer, and that is why .text grows.
$ arm-none-eabi-gcc -c main.c -o main.o -mcpu=cortex-m0plus
$ arm-none-eabi-objdump -h main.o
main.o: file format elf32-littlearm
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000024 00000000 00000000 00000034 2**2
CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
1 .data 00000004 00000000 00000000 00000058 2**2
CONTENTS, ALLOC, LOAD, DATA
2 .bss 00000004 00000000 00000000 0000005c 2**2
ALLOC
3 .comment 0000001f 00000000 00000000 0000005c 2**0
CONTENTS, READONLY
4 .ARM.attributes 0000002c 00000000 00000000 0000007b 2**0
CONTENTS, READONLY
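One caveat: the commands above compile without optimization. With optimization enabled, GCC will typically remove an unused local such as var3 altogether, and .text will not grow. Here is a minimal sketch, assuming you want the stack slot to stay visible in that case: use_local is a hypothetical function, and volatile is used only to force the compiler to really store and reload the value.

```c
/* The value 40 is not stored in .data. It is written to the stack by
 * instructions in .text every time the function runs. */
unsigned long use_local(void)
{
    /* volatile: the store and the load below must actually happen,
     * so the compiler has to give var3 a real stack slot. */
    volatile unsigned long var3 = 40;

    return var3 + 2;
}
```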
So what happens if we declare our local variable as static?
unsigned long var1;
unsigned long var2 = 34;
int main( void )
{
    static unsigned long var3 = 40;    /* local but static */
    var1 = var2;
    return 0;
}
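Hey, the .data section increased its size! That means local but static variables are not stored on the stack but in the same memory region as global variables. Also look at the .text section: it is back to 0x1c bytes, because no instructions are needed to create var3 at run time.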
$ arm-none-eabi-gcc -c main.c -o main.o -mcpu=cortex-m0plus
$ arm-none-eabi-objdump -h main.o
main.o: file format elf32-littlearm
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 0000001c 00000000 00000000 00000034 2**2
CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
1 .data 00000008 00000000 00000000 00000050 2**2
CONTENTS, ALLOC, LOAD, DATA
2 .bss 00000004 00000000 00000000 00000058 2**2
ALLOC
3 .comment 0000001f 00000000 00000000 00000058 2**0
CONTENTS, READONLY
4 .ARM.attributes 0000002c 00000000 00000000 00000077 2**0
CONTENTS, READONLY
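The practical consequence is that a static local keeps its value between calls, while a normal local starts over every time. A small sketch with hypothetical function names:

```c
/* counter lives with the global data: it is initialized once,
 * before main runs, and keeps its value between calls. */
unsigned long next_id(void)
{
    static unsigned long counter = 40;

    return ++counter;
}

/* counter lives on the stack: it is set to 40 again on every call. */
unsigned long fresh_value(void)
{
    unsigned long counter = 40;

    return ++counter;
}
```

Calling next_id twice returns 41 and then 42, whereas fresh_value returns 41 every time.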
And what about variables declared as constants? Those are important too.
unsigned long var1;
unsigned long var2 = 34;
const unsigned long var4 = 100;    /* global constant */
int main( void )
{
    static unsigned long var3 = 40;
    var1 = var2;
    return 0;
}
A wild new section appears: .rodata. That's right, if we add the const qualifier to a global variable it is assigned to a new, read-only section. A little homework for you: declare a local, non-static const variable and see what happens. No, it is not going to be assigned to .rodata.
$ arm-none-eabi-gcc -c main.c -o main.o -mcpu=cortex-m0plus
$ arm-none-eabi-objdump -h main.o
main.o: file format elf32-littlearm
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 0000001c 00000000 00000000 00000034 2**2
CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
1 .data 00000008 00000000 00000000 00000050 2**2
CONTENTS, ALLOC, LOAD, DATA
2 .bss 00000004 00000000 00000000 00000058 2**2
ALLOC
3 .rodata 00000004 00000000 00000000 00000058 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
4 .comment 0000001f 00000000 00000000 0000005c 2**0
CONTENTS, READONLY
5 .ARM.attributes 0000002c 00000000 00000000 0000007b 2**0
CONTENTS, READONLY
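A common use of .rodata on microcontrollers is lookup tables. The sketch below is only an illustration with hypothetical names (bit_count, count_bits), assuming the table is never modified. Linker scripts for microcontrollers typically place .rodata in flash, so a table like this would not consume RAM, but that depends on the linker script you use.

```c
/* Number of bits set in each 4-bit value, 0x0 to 0xF.
 * const at file scope: 16 bytes that we expect to see in .rodata. */
static const unsigned char bit_count[16] = {
    0, 1, 1, 2, 1, 2, 2, 3,
    1, 2, 2, 3, 2, 3, 3, 4
};

/* Counts the bits set in a byte by looking up each half separately. */
unsigned count_bits(unsigned char value)
{
    return bit_count[value & 0x0F] + bit_count[value >> 4];
}
```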
These are not the only sections we can use to place our variables and code. We can also create new ones with any name we want, but keep in mind that a compiler-specific extension is needed. This extension is __attribute__((section("name"))) and it is only recognized by GCC-compatible compilers.
unsigned long var1;
unsigned long var2 = 34;
const unsigned long var4 = 100;
unsigned long var5 __attribute__((section("custom")));    /* placed in our own section */
int main( void )
{
    static unsigned long var3 = 40;
    var1 = var2;
    return 0;
}
We can observe our new section “custom” down below. Notice its size is 4 bytes, which means our variable var5 has been placed there.
$ arm-none-eabi-gcc -c main.c -o main.o -mcpu=cortex-m0plus
$ arm-none-eabi-objdump -h main.o
main.o: file format elf32-littlearm
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 0000001c 00000000 00000000 00000034 2**2
CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
1 .data 00000004 00000000 00000000 00000050 2**2
CONTENTS, ALLOC, LOAD, DATA
2 .bss 00000004 00000000 00000000 00000054 2**2
ALLOC
3 .rodata 00000008 00000000 00000000 00000054 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
4 custom 00000004 00000000 00000000 0000005c 2**2
CONTENTS, ALLOC, LOAD, DATA
5 .comment 0000001f 00000000 00000000 00000060 2**0
CONTENTS, READONLY
6 .ARM.attributes 0000002c 00000000 00000000 0000007f 2**0
CONTENTS, READONLY
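The attribute can be wrapped in a macro so it is easier to reuse. This is a sketch, not a recommendation: IN_CUSTOM, var6 and custom_sum are hypothetical names. Notice in the listing above that custom carries the CONTENTS and LOAD flags even though var5 has no initializer, so unlike .bss it occupies space in the file. On a real microcontroller, whether this section is actually initialized in memory depends on the linker script and the startup code.

```c
/* Hypothetical convenience macro around the GCC section attribute. */
#define IN_CUSTOM __attribute__((section("custom")))

unsigned long var5 IN_CUSTOM;        /* 4 bytes in "custom", no initializer   */
unsigned long var6 IN_CUSTOM = 7;    /* 4 more bytes in the same section      */

/* Reads both variables so the compiler has a reason to keep them. */
unsigned long custom_sum(void)
{
    return var5 + var6;
}
```

Compiling this with the -c option and running objdump -h should show a custom section holding both variables.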
And what now? Easy: we learned how the compiler organizes our program's variables and instructions into sections, but notice that nothing has happened with their addresses yet (the VMA and LMA columns are all zeros). That's a job for the linker.
By the way, do you want to know what the words below the numbers of each section mean?
- CONTENTS – Section has contents stored in the file. Clear for .bss, which only reserves space.
- ALLOC – Section will have space allocated in memory when the program is loaded. Set for all sections except those that are not needed at run time, such as debug information or .comment.
- LOAD – Section will be loaded from the file into memory. Set for pre-initialized code and data, clear for .bss sections.
- RELOC – Section needs to be relocated before loading.
- READONLY – Section cannot be modified by the running program.
- CODE – Section contains executable code only.
- DATA – Section contains data only (no executable code).
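To see how the flags combine, here is a simplified model that rebuilds the flag line from three inputs: the ELF section type, the ELF section flags, and whether the section has relocations. The numeric constants come from the ELF specification, but describe_section and its rules are only an approximation that reproduces the listings in this post. It is not objdump's actual implementation.

```c
#include <stdio.h>
#include <string.h>

/* Section types and flags, values taken from the ELF specification. */
#define SHT_PROGBITS   1u    /* section contents are stored in the file  */
#define SHT_NOBITS     8u    /* section occupies no space in the file    */
#define SHF_WRITE      0x1u  /* writable at run time                     */
#define SHF_ALLOC      0x2u  /* occupies memory at run time              */
#define SHF_EXECINSTR  0x4u  /* contains executable instructions         */

/* Simplified model (an assumption, not objdump's code): writes an
 * objdump-style flag line such as "CONTENTS, ALLOC, LOAD, DATA" to out. */
void describe_section(unsigned type, unsigned flags, int has_relocs,
                      char *out, size_t out_size)
{
    int has_contents = (type != SHT_NOBITS);
    int alloc = (flags & SHF_ALLOC) != 0;
    int load = alloc && has_contents;
    int code = (flags & SHF_EXECINSTR) != 0;

    snprintf(out, out_size, "%s%s%s%s%s%s%s",
             has_contents ? "CONTENTS, " : "",
             alloc ? "ALLOC, " : "",
             load ? "LOAD, " : "",
             has_relocs ? "RELOC, " : "",
             (flags & SHF_WRITE) ? "" : "READONLY, ",
             code ? "CODE, " : "",
             (!code && load) ? "DATA, " : "");

    /* Drop the trailing ", " separator. */
    size_t len = strlen(out);
    if (len >= 2) {
        out[len - 2] = '\0';
    }
}
```

With type PROGBITS and the ALLOC and EXECINSTR flags it produces the .text line from the first listing; with type NOBITS and the ALLOC and WRITE flags it produces just ALLOC, as shown for .bss.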