All general-purpose processors, whether Intel, ARM or AVR devices, work by reading instructions from memory and executing them. They are generally based on one of two architectures: Von Neumann or Harvard. The vast majority of larger systems such as PCs and mobile devices use Von Neumann, but the AVR processors in Arduino boards such as the Uno and Mega use a Harvard architecture.
In a Von Neumann processor there is one memory area: all program code, data, EEPROM and IO registers share a single address space, with everything mapped to unique address ranges. This makes programming simpler in many ways because there are no concerns about which type of memory is being accessed.
In a Harvard processor each memory area has its own separate address space. This means that program code (FLASH or PROGMEM) is read with one set of instructions, while SRAM uses a different set. Later in this article we cover how to move constant items into FLASH/PROGMEM.
Above we can see a model of the memory within the Arduino. On the left is Program Memory storage, where the code itself is stored, along with anything we define as PROGMEM. It’s usually the largest memory block. Reading data from this memory requires special instructions on AVR processors.
Next, in the middle, is SRAM, the most contended space in AVR / Arduino sketches. On an ATmega328 (Uno board) there is 2K of this RAM, shared between all of those uses. Global variables are the ones defined outside of any function in the sketch; the virtual methods area is generated by the compiler. In the next section we’ll take a look at stack and heap.
Lastly, there’s EEPROM memory, which is arranged as another memory area and also needs special instructions to read and write. We’ll not go into too much detail here.
Using the new operator on any embedded processor with limited memory needs very careful thought. Every new operation creates an object on the heap. It is fine to use sparingly during start up to allocate dynamic structures, but using it during program runtime dramatically increases the chances of program failure. This is because the heap is very small and can easily become fragmented. See the diagram below.
Every local variable we use is stored on the stack. Every time we call into a function, its parameters are added to the stack and any of its local variables are added too (the stack grows). Conversely, when the function finishes, its local variables and parameters are removed from the stack (the stack shrinks). If the stack grows too large it may run into the heap, causing unexpected, unexplained failures. When programming in C, functions can usually be nested reasonably deeply, even on a smaller device.
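One way to follow the advice about avoiding new at runtime is to reserve a fixed amount of storage up front and refuse further items once it is full. The sketch below is only an illustration under that assumption: the Reading struct and the ReadingStore class are hypothetical names invented here, not part of any Arduino library.

```cpp
#include <stddef.h>
#include <stdint.h>

// Hypothetical record type, used only for this illustration.
struct Reading {
    uint16_t value;
    uint8_t  channel;
};

// Fixed-capacity store: the array is part of the object itself, so when
// the object is a global or static the memory is reserved at build time.
// The heap is never used, so it cannot fragment.
template <size_t CAPACITY>
class ReadingStore {
public:
    // Returns false instead of allocating when the store is full.
    bool add(uint16_t value, uint8_t channel) {
        if (count_ >= CAPACITY) return false;
        items_[count_].value = value;
        items_[count_].channel = channel;
        ++count_;
        return true;
    }

    size_t size() const { return count_; }

    // The caller is expected to pass an index below size().
    const Reading& at(size_t index) const { return items_[index]; }

    void clear() { count_ = 0; }

private:
    Reading items_[CAPACITY];
    size_t  count_ = 0;
};
```

Declared as a global, for example `ReadingStore<16> readings;`, the storage is counted in the SRAM usage reported at compile time, so running out of space shows up as a failed add rather than as a stack and heap collision.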
Unlike software written for server, desktop or mobile, embedded devices have much less memory, meaning types need to be chosen carefully. Always consider the largest value you need to store and choose the smallest type that will hold it. Note that the sizes below are correct for AVR, but on 32 bit boards some sizes may differ; rely on these as a minimum size.
AVR Type | Size | Description |
---|---|---|
byte | 8 bit | unsigned single byte: 0..255 |
bool | 8 bit | true (non zero) or false (zero) |
char | 8 bit | single byte - don’t rely on sign |
int | 16 bit | two bytes signed: -32768..32767 |
short | 16 bit | two bytes signed: -32768..32767 |
long | 32 bit | four bytes signed |
For char, int, short and long you can prepend unsigned to create an unsigned variable (byte is already unsigned). For example:
unsigned int myInt;
There are also some types where the size is fixed on all platforms:
AVR Type | Size | Description |
---|---|---|
uint8_t | 8 bit | one byte unsigned: 0..255 |
uint16_t | 16 bit | two bytes unsigned: 0..65535 |
uint32_t | 32 bit | four bytes unsigned: 0..4294967295 |
int8_t | 8 bit | one byte signed: -128 to 127 |
int16_t | 16 bit | two bytes signed: -32768..32767 |
int32_t | 32 bit | four bytes signed |
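As a rough illustration of choosing types by the range they must hold, the sketch below checks the fixed sizes at compile time and uses a wider intermediate only where a calculation needs it. The scaleReading function is a hypothetical helper written for this example.

```cpp
#include <stdint.h>

// The fixed-width types have the same size on every board; these checks
// fail the build if that assumption is ever wrong.
static_assert(sizeof(uint8_t)  == 1, "uint8_t should be one byte");
static_assert(sizeof(uint16_t) == 2, "uint16_t should be two bytes");
static_assert(sizeof(uint32_t) == 4, "uint32_t should be four bytes");

// Hypothetical helper: returns the given percentage of fullScale.
// Inputs and result fit in small types, but fullScale * percent can
// exceed 16 bits, so the multiplication is done in a 32 bit temporary.
uint16_t scaleReading(uint16_t fullScale, uint8_t percent) {
    if (percent > 100) percent = 100;  // clamp out-of-range input
    uint32_t product = static_cast<uint32_t>(fullScale) * percent;
    return static_cast<uint16_t>(product / 100);
}
```

On AVR, where int is 16 bits, leaving out the cast to uint32_t would let the multiplication overflow for larger inputs; the wider type is only needed for that one temporary, not for the stored values.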
I’ve often seen confusion in the Arduino domain around text manipulation, and even discussions about using the String type on 8 bit boards. Firstly, from what I’ve read String doesn’t work well on 8 bit boards; use character arrays instead. In C, text is held in a character array terminated by a zero byte. Let’s take a look at what this means in practice:
char myText[10] = "Hello";
In memory this will look as follows (where \0 represents 0x00 and not ‘0’). The initializer sets the unused elements after the text to zero as well; if the array is later overwritten with a shorter string, whatever follows the terminating 0 does not matter.
0 1 2 3 4 5 6 7 8 9
|H|e|l|l|o|\0|\0|\0|\0|\0|
If we use the sizeof operator on this array, we would get back 10. It returns the size of the array in bytes, not the length of the string.
int arraySize = sizeof(myText); // arraySize would be 10
If we use strlen as below we would get back 5. It counts the characters before the zero terminator.
int stringLen = strlen(myText); // stringLen would be 5.
We can pass myText to any usual function that requires text, such as print; the function will print the characters up to the zero terminator.
MyStream.println(myText);
Caution: Be very careful not to exceed the size of a character array. There is no protection from overwriting memory beyond the end of your array, which can cause the board to crash in unexpected ways.
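One way to stay inside the array is to route every copy through a helper that is told the destination size. The following is a minimal sketch of that idea; copyText is a hypothetical function written for this example, not a standard library call.

```cpp
#include <stddef.h>
#include <string.h>

// Copies src into dest without ever writing more than destSize bytes,
// and always leaves dest zero terminated. Returns true if the whole
// string fitted, false if it had to be truncated.
bool copyText(char* dest, size_t destSize, const char* src) {
    if (destSize == 0) return false;  // nowhere to put the terminator

    size_t srcLen = strlen(src);
    size_t toCopy = (srcLen < destSize) ? srcLen : destSize - 1;

    memcpy(dest, src, toCopy);
    dest[toCopy] = '\0';
    return srcLen < destSize;
}
```

Calling it as `copyText(myText, sizeof(myText), someInput)` keeps the size next to the array it belongs to. This only works where myText is a real array in scope, because sizeof on a pointer gives the size of the pointer instead.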
When creating enumerations for Arduino / AVR, it’s unlikely you’ll need more than 256 values in most cases. Declare them with an underlying type of byte as follows, which ensures they only take up 1 byte of SRAM. If you don’t do this, the default size for an enum is that of int.
enum MyEnumValueType : byte {
ENUM_OPTION1,
ENUM_OPTION2,
ENUM_OPTION3
};
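The size saving can be checked at compile time. In the sketch below uint8_t stands in for Arduino’s byte so that the code also builds off-board, and the enum, struct and function names are hypothetical ones chosen for this example.

```cpp
#include <stdint.h>

// Enum with a fixed one byte underlying type.
enum CompactMode : uint8_t { MODE_OFF, MODE_IDLE, MODE_RUN };

// Fails the build if the enum ever grows beyond one byte.
static_assert(sizeof(CompactMode) == 1, "CompactMode should be one byte");

// A struct holding the enum and one other byte stays at two bytes.
struct Settings {
    CompactMode mode;
    uint8_t     brightness;
};
static_assert(sizeof(Settings) == 2, "Settings should be two bytes");

// Steps to the next mode, wrapping from the last value back to the first.
inline CompactMode nextMode(CompactMode current) {
    return (current == MODE_RUN) ? MODE_OFF
                                 : static_cast<CompactMode>(current + 1);
}
```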
If you wanted to store 8 boolean types, that would require 8 bytes. However, if instead you bit packed them into a single byte, you’d only need one byte. That is a large saving. Here’s an example:
// this enum goes from 0..7
enum MyBoolFlags : byte { MYFLAG1 = 0, MYFLAG2, MYFLAG3, MYFLAG4, MYFLAG5, MYFLAG6, MYFLAG7, MYFLAG8 };
byte flags = 0;
// then to read one of the bits
bool theValue = bitRead(flags, MYFLAG1);
// then to write one of the bits
bitWrite(flags, MYFLAG1, newValue);
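For readers who want to see what the bit packing amounts to, the same read and write can be sketched with plain shifts and masks. The helpers below are hypothetical stand-ins written for this example; they are intended to have the same effect as bitRead and bitWrite on a single byte, but they are not the Arduino macros themselves.

```cpp
#include <stdint.h>

// Hypothetical flag positions for this illustration.
enum FlagBit : uint8_t { FLAG_READY = 0, FLAG_ERROR, FLAG_BUSY };

// Returns true if the given bit position is set in flags.
inline bool readFlag(uint8_t flags, uint8_t bit) {
    return ((flags >> bit) & 0x01) != 0;
}

// Returns a copy of flags with the given bit set or cleared.
inline uint8_t writeFlag(uint8_t flags, uint8_t bit, bool value) {
    const uint8_t mask = static_cast<uint8_t>(1u << bit);
    return value ? static_cast<uint8_t>(flags | mask)
                 : static_cast<uint8_t>(flags & static_cast<uint8_t>(~mask));
}
```

Unlike bitWrite, which modifies its first argument in place, writeFlag returns the new value, so a call looks like `flags = writeFlag(flags, FLAG_ERROR, true);`.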
When we create any constant variable in an Arduino sketch or library, it will default to being stored in SRAM. This is obviously quite inconvenient given the small size of the SRAM. However, storing items in program memory does not come for free, as we need to use special functions to read the values.
We’ll take a look at those in a moment, but first let’s look at how to create a constant char array in FLASH:
const char someConstantText[] PROGMEM = "My text to store in program memory";
and then to access the memory later a byte at a time:
pgm_read_byte_near(address)
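To show the byte-at-a-time read in context, here is a minimal sketch that copies a PROGMEM string into a RAM buffer. The names bannerText and copyFromFlash are hypothetical. The #else branch is an assumption added so the example also builds with a desktop compiler, where there is no separate program memory and the read is just a normal pointer dereference.

```cpp
#include <stddef.h>
#include <stdint.h>

#if defined(__AVR__)
  #include <avr/pgmspace.h>
#else
  // Assumption: off-board stand-ins, only used when the real ones are absent.
  #ifndef PROGMEM
    #define PROGMEM
  #endif
  #ifndef pgm_read_byte_near
    #define pgm_read_byte_near(addr) (*(const uint8_t*)(addr))
  #endif
#endif

// Text kept in program memory rather than SRAM.
const char bannerText[] PROGMEM = "Ready";

// Copies a zero terminated string from program memory into dest, one
// byte at a time, never writing more than destSize bytes. Returns the
// number of characters copied, not counting the terminator.
size_t copyFromFlash(char* dest, size_t destSize, const char* srcInFlash) {
    if (destSize == 0) return 0;
    size_t i = 0;
    while (i < destSize - 1) {
        char c = static_cast<char>(pgm_read_byte_near(srcInFlash + i));
        if (c == '\0') break;
        dest[i++] = c;
    }
    dest[i] = '\0';
    return i;
}
```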
There are even special versions of some of the character array functions, such as:
strcpy_P(destInRam, srcInFlash); // copy from PROGMEM / FLASH into RAM
strlen_P(srcInFlash); // get length of string in PROGMEM / FLASH
Arrays of other types and even structures can also be stored in PROGMEM using the same technique. There are also pgm_read_word_near and pgm_read_dword_near, which do the same for arrays of 16 bit word and 32 bit double-word values.
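As a small sketch of the word-sized read, the table below holds 16 bit values in program memory and is read back one entry at a time. The toneTable values and the toneAt function are hypothetical sample data and names. As with any off-board build, the #else branch is an assumed stand-in for compilers that have no avr/pgmspace.h.

```cpp
#include <stddef.h>
#include <stdint.h>

#if defined(__AVR__)
  #include <avr/pgmspace.h>
#else
  // Assumption: off-board stand-ins, only used when the real ones are absent.
  #ifndef PROGMEM
    #define PROGMEM
  #endif
  #ifndef pgm_read_word_near
    #define pgm_read_word_near(addr) (*(const uint16_t*)(addr))
  #endif
#endif

// Lookup table of 16 bit values stored in program memory.
const uint16_t toneTable[] PROGMEM = { 262, 294, 330, 349 };
const size_t TONE_COUNT = sizeof(toneTable) / sizeof(toneTable[0]);

// Returns the table entry at index, or 0 if the index is out of range.
uint16_t toneAt(size_t index) {
    if (index >= TONE_COUNT) return 0;
    return pgm_read_word_near(&toneTable[index]);
}
```

The address passed to the read function is taken with `&toneTable[index]`; indexing the array directly would read from SRAM at that address on AVR rather than from program memory.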
There is also a text helper, the F() macro, that shortcuts the above for appropriate cases, putting the data in FLASH. Many of the inbuilt functions such as print on Stream can take this as a parameter directly.
F("Some text to store in program memory / Flash")