Memory alignment

electroRF · Jan 11, 2014

Hi,

I am posting this thread here because I suspect that hardware is what stands behind memory alignment.

I read in several places that most CPUs place objects / variables at particular offsets in memory.

For 32-bit CPUs, all objects / variables will be located at addresses that are divisible by 4 bytes, i.e. 0x0000, 0x0004, ... 0x0020, ....

Additionally, if the user defines a struct which is 18 bytes large, the CPU will pad it with 2 extra bytes to make it 20 bytes large (without the user noticing the difference).

What is the reason behind it?

I understand that if a 32-bit CPU can only access memory addresses at 4-byte offsets, then for reading data from address 0x0007 it'd need to read data from 0x0004 and perform a << 3 shift operation, which will slow its performance.

But why can't it start reading from address 0x0007?

Additionally, is it correct to say that 8-bit CPUs don't have to deal with memory alignment, as they can access any address in memory? (as their word size is 1 byte).

Thank you very much.

Ian Rogers · Jan 11, 2014

The CPU has to load a register at a time.... as simple as that.. It can't load a partial register... It may be able to load half a register...

The working register of a 64 bit computer is 8 bytes long therefore the CPU is designed to load 8 bytes in one swoop as it were.. The modern PC has a memory management module as well which will be 64 bits long as well, so in a nutshell it will be hardware that determines alignment...

electroRF · Jan 11, 2014

Hi Ian,
Thank you.

The CPU has to load a register at a time.... as simple as that.. It can't load a partial register... It may be able to load half a register...

Let's take a 32-bit CPU.

Now you'd like to read a character from memory, i.e. you'll read only 1 byte, not 4 bytes. So the CPU will read an entire 4-byte register, but will deliver only 1/4 of it to the user?

What I didn't understand is why that char (1-byte) variable has to reside in memory at an address that is divisible by 4.

geko · Jan 11, 2014

It depends on the processor but typically memory is still byte addressable.

The processor reads external memory in 32-bit (4 byte) words so for example each word is addressed like.

Address:
0x0000: [A0000][A0001][A0002][A0003]
0x0004: [A0004][A0005][A0006][A0007]
0x0008: [A0008][A0009][A000a][A000b]

A byte can only ever be in one 32-bit word so it doesn't have to be aligned.

If you have a 2-byte Integer then that would need to align on a 2-byte boundary. It could be in [A0][A1] or [A2][A3] but not [A3][A4] for example.

Exactly how it is handled depends on the processor, architecture and the compiler.
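
A minimal sketch of the rule described above, assuming a bus that always fetches aligned 4-byte words. The function names are made up for illustration and do not model any specific processor:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bytes fetched per memory access in this model (a 32-bit data bus). */
#define WORD_BYTES 4u

/* True if an object of `size` bytes starting at `addr` is naturally aligned,
   i.e. its address is a multiple of its own size (size must be 1, 2 or 4). */
bool is_naturally_aligned(uint32_t addr, uint32_t size)
{
    assert(size == 1 || size == 2 || size == 4);
    return addr % size == 0;
}

/* True if the object spans two 32-bit words, so two memory accesses
   would be needed to fetch all of it. */
bool crosses_word(uint32_t addr, uint32_t size)
{
    assert(size >= 1 && size <= WORD_BYTES);
    uint32_t first_word = addr / WORD_BYTES;
    uint32_t last_word  = (addr + size - 1) / WORD_BYTES;
    return first_word != last_word;
}
```

With this model, a single byte at 0x0006 never crosses a word, a 2-byte integer at 0x0002 sits in [A2][A3], and a 2-byte integer at 0x0003 would straddle [A3][A4]. A 2-byte integer at 0x0001 stays inside one word but is still not naturally aligned, which is why many architectures and compilers use the stricter "multiple of its own size" rule.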

electroRF · Jan 11, 2014

Hi,
Thanks geko

The processor reads external memory in 32-bit (4 byte) words so for example each word is addressed like.

That's what I'm trying to understand: why does a 32-bit CPU read addresses at 4-byte offsets?
i.e. why can't it start reading from 0x0000 and from 0x0001 as well?

geko · Jan 11, 2014

Because it has a 32-bit wide data bus so access to memory is done 32 bits at a time, or 4-byte chunks if you like to think of it in bytes.

If you look at the datasheet for an Intel 486DX, for example, you'll see it only has external address lines A2-A31, no A0 or A1. It has Byte Enable signals to indicate the active bytes in the 32-bit word, but it addresses memory in aligned 4-byte chunks.

https://en.wikipedia.org/wiki/File:80486DX2_arch.svg
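
As a rough illustration of the "A2-A31 plus byte enables" idea, here is a simplified model in C. It is only a sketch: the struct and function names are hypothetical, the enable mask is active-high for readability (the real 486 pins BE0# to BE3# are active-low), and lane 0 is assumed to be the least significant byte.

```c
#include <assert.h>
#include <stdint.h>

/* One simplified 32-bit bus cycle: which aligned word is addressed and
   which of its four byte lanes are active (bit n set = byte n enabled). */
struct bus_cycle {
    uint32_t word_addr;    /* address with the low two bits cleared */
    uint8_t  byte_enables; /* active-high mask in this model */
};

/* Build the bus cycle for an access of `size` bytes at `addr`.
   The access must fit inside one aligned 4-byte word. */
struct bus_cycle make_cycle(uint32_t addr, uint32_t size)
{
    uint32_t lane = addr & 3u;            /* byte position within the word */
    assert(size >= 1 && lane + size <= 4);

    struct bus_cycle c;
    c.word_addr    = addr & ~3u;          /* what A2..A31 would carry */
    c.byte_enables = (uint8_t)(((1u << size) - 1u) << lane);
    return c;
}

/* Pick one byte out of a fetched 32-bit word: shift by 8 bits per lane.
   Lane 0 is taken to be the least significant byte (little-endian). */
uint8_t extract_byte(uint32_t word, uint32_t addr)
{
    return (uint8_t)(word >> (8u * (addr & 3u)));
}
```

In this model, reading the byte at 0x0007 addresses the word at 0x0004 with only lane 3 enabled, and the byte is then moved down by 3 lanes (24 bits). The address 0x0007 is perfectly usable; it is just not something the external address pins carry directly.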

3v0 · Jan 11, 2014

By requiring specific offsets the machine can perform operations without adjusting the position of the data to that required by the hardware.

electroRF · Jan 11, 2014

Thank you geko and 3V

Then is it a 'hardware' limit that makes a 32-bit CPU move from one memory address to the next in 4-byte steps?

i.e. the hardware prevents the CPU from moving from address 0x0000 to address 0x0001? (but only from 0x0000 to 0x0004)

geko said:
A byte can only ever be in one 32-bit word so it doesn't have to be aligned.

3V0 said:
By requiring specific offsets the machine can perform operations without adjusting the position of the data to that required by the hardware.

According to what 3V0 said, reading a byte which starts, for example, at address 0x0006 will require a 32-bit CPU to adjust its position; therefore 1-byte variables also have to be memory aligned. Is that correct?

3v0 · Jan 11, 2014

I did not say that. Needed alignment is simply determined by the internals of the machine.

Independent of word size some computers are byte addressable and some are not. This choice changes how much memory a computer can address. (8 bit machines are byte addressable by definition)

A 64-bit, byte-addressable machine will only accommodate 1/8 the memory of a similar non-byte-addressable machine, because each word uses 8 addresses. We take memory decoding for granted, but in the 70s and 80s the size of memory was limited by how fast an address could be decoded by the memory logic.

On byte-addressable machines there is generally no need to align byte data. On word-addressable machines it gets more complicated.

electroRF · Jan 11, 2014

Hi 3V!
Thanks a lot!

I understood what you wrote, beside one aspect:

3v0 said:
A 64-bit, byte-addressable machine will only accommodate 1/8 the memory of a similar non-byte-addressable machine, because each word uses 8 addresses.

Could you please explain it perhaps differently?

It's an important aspect which I did not manage to understand.

3v0 · Jan 11, 2014

Suppose we have an imaginary machine that only has the ability to address 16 addresses.
If we make it byte addressable, each byte has its own address, so we have exactly 16 bytes of memory.

Now suppose we use each address to address 4 bytes (or 32 bits) of memory. Now our machine has 64 bytes of memory. The cost of the additional memory is increased complexity in utilizing each of the 4 bytes.
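
The arithmetic above can be written down as a small C helper. This is only a sketch; the function name and its parameters are invented for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Total memory reachable with `address_bits` address lines when every
   distinct address selects `bytes_per_address` bytes. */
uint64_t reachable_bytes(unsigned address_bits, unsigned bytes_per_address)
{
    assert(address_bits <= 32);   /* keeps the product inside 64 bits */
    return ((uint64_t)1 << address_bits) * bytes_per_address;
}
```

Four address bits give the 16 addresses of the imaginary machine: `reachable_bytes(4, 1)` is 16 bytes when byte addressable, and `reachable_bytes(4, 4)` is 64 bytes when each address selects a 4-byte word. The same ratio explains the factor of 8 mentioned earlier for a machine with 8-byte words.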

electroRF · Jan 11, 2014

Awesome explanation!

Thank you very much 3V!

electroRF · Jan 11, 2014

Friends,
I have additional question please.

While studying Malloc implementation, I saw the below structure.

Couldn't it be that the structure size would be larger than long, e.g. a struct which contains int, int*, int* would take 12 bytes while long is 8 bytes,
and then the header would NOT be aligned with the largest data type?

C:

/*
 * To simplify memory alignment:
 *   - Make all memory blocks a multiple of the header size
 *   - Ensure header is aligned with largest data type (e.g., long)
 */

/* align to long boundary */
typedef long Align;

union header {              /* block header */
    struct {
        union header *ptr;
        unsigned size;
    } s;
    Align x;                /* Force alignment */
};
typedef union header Header;
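
One way to check this concern is to ask the compiler directly. The sketch below repeats the union from the post and adds compile-time checks. It relies on two properties of C: a union is aligned at least as strictly as its most strictly aligned member, and the size of any type is a multiple of its alignment. So the struct part may well be larger than long, but the compiler pads the union up to a multiple of the alignment. The helper `units_for` is a hypothetical name for the rounding step used in K&R-style allocators.

```c
#include <assert.h>
#include <stddef.h>

typedef long Align;              /* align to long boundary */

union header {                   /* block header, as in the post above */
    struct {
        union header *ptr;
        unsigned size;
    } s;
    Align x;                     /* force alignment */
};
typedef union header Header;

/* The union is at least as strictly aligned as long ... */
_Static_assert(_Alignof(Header) >= _Alignof(long), "Header aligned for long");
/* ... and its size is a whole multiple of that alignment, so headers laid
   end to end in an aligned array all start on a long boundary. */
_Static_assert(sizeof(Header) % _Alignof(long) == 0, "size is a multiple");

/* Number of Header-sized units needed for `nbytes` of payload, plus one
   unit for the header itself. */
size_t units_for(size_t nbytes)
{
    return (nbytes + sizeof(Header) - 1) / sizeof(Header) + 1;
}
```

Because every block is a whole number of Header units, the payload that follows a header starts at a multiple of `sizeof(Header)` from an aligned base, and therefore stays on a long boundary as well.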

nsaspook · Jan 11, 2014

Hardware memory alignment also affects software design, so there are ways to handle it.

https://gcc.gnu.org/onlinedocs/gcc-3.2/gcc/Variable-Attributes.html
https://www.eventhelix.com/RealtimeMantra/ByteAlignmentAndOrdering.htm
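
As a small illustration of how alignment shows up in software, the sketch below uses only standard C (`offsetof` and `sizeof`, not the GCC attributes from the first link) to measure the padding a compiler inserts. The struct and function names are hypothetical, and the exact numbers depend on the compiler and target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: a 1-byte tag, a 4-byte value, then a 1-byte flag. */
struct sample {
    uint8_t  tag;
    uint32_t value;
    uint8_t  flag;
};

/* Bytes of padding the compiler placed between `tag` and `value`. */
size_t padding_before_value(void)
{
    return offsetof(struct sample, value)
         - (offsetof(struct sample, tag) + sizeof(uint8_t));
}

/* Bytes of padding after `flag`, added so arrays of the struct stay aligned. */
size_t tail_padding(void)
{
    return sizeof(struct sample)
         - (offsetof(struct sample, flag) + sizeof(uint8_t));
}
```

On a typical target where `uint32_t` is 4-byte aligned, both functions would be expected to return 3, making the struct 12 bytes even though its members only add up to 6. It is the compiler that inserts this padding at build time; the CPU does not rearrange anything at run time.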

Ratchit · Jan 11, 2014

electroRF,

I don't know which CPU you are referring to, but all the Intel architecture x86/x386 and above etc. are able to address a byte directly. There are different instruction codes for bytes, words, and double words. It is more efficient to place the data on a word or double word boundary for word or double word references, but the CPU is able to read/write at any address.

Ratch
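
For code that must read or write a multi-byte value at an arbitrary address, a common portable approach in C is to copy the bytes with `memcpy` instead of dereferencing a misaligned pointer. The sketch below assumes nothing about the CPU; the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from any address, aligned or not. memcpy lets the
   compiler choose a safe access sequence for the target CPU. */
uint32_t load_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Store a 32-bit value to any address, aligned or not. */
void store_u32(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}

/* Round-trip `v` through a chosen (possibly odd) offset in a byte buffer. */
uint32_t roundtrip_at_offset(uint32_t v, size_t offset)
{
    unsigned char buf[16] = {0};
    assert(offset + sizeof v <= sizeof buf);
    store_u32(buf + offset, v);
    return load_u32(buf + offset);
}
```

On a CPU that tolerates misaligned accesses, a compiler may turn each `memcpy` into a single load or store. On a CPU that faults on them, it can fall back to byte-by-byte accesses, so the same source works in both cases.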

KeepItSimpleStupid · Jan 11, 2014

It depends on the architecture. One such architecture had 64K 16-bit-addressable bytes and 16K 16-bit word-addressable 16-bit words.

Address 0001, for instance, is a byte address,
whereas 0002 could be a byte or word address.

Instructions are aligned on a word boundary.
Any data structures were aligned on a word boundary.

The instruction set did have a MOV and a MOVB (move word and move byte) instruction, but a word could not sit between boundaries.

It's just too hard keeping track of "pointers" to the data structures. It would really require expanding the instructions to include a byte pointer and a word pointer. Not good.

Ratchit · Jan 12, 2014

KISS,

The Intel processors can operate from any address, but the storage and retrieval of data is more efficient if the data is aligned. Since the Intel instructions are of variable length, successive instructions cannot be guaranteed to be on an even address, nor do they have to be.

Data groups are referenced by "structures" in assembler. Once they are set up, any piece of data is easily referenced.

Ratch

NorthGuy · Jan 12, 2014

Ratchit said:
The Intel processors can operate from any address, but the storage and retrieval of data is more efficient if the data is aligned.

Not all processors are that way. PIC, for example, generates a fault if you try to access a misaligned word.

Ratchit · Jan 12, 2014

NorthGuy,

You are correct. KISS did not specify a particular CPU. I did.

Ratch

KeepItSimpleStupid · Jan 12, 2014

FWIW: I was talking about a PDP-11, a minicomputer that existed in the mid 70's. Memory is a bit hazy.

But it's all architecture.

One goofy CPU I used did not have a gosub statement, but ANY register could become the program counter or stack pointer.
To branch, you passed the program counter to another register. Talk about goofy.
