Memory Allocation & Release in O(1)

electroRF · Dec 31, 2013

Hi,

I'm trying to understand how can one manage the following memory in O(1).

You have a block size of 1000 Bytes.

The user can ask to allocate for him memories of sizes 10, 20 or 50 bytes.

How can you manage the user requests for Allocate / Release in O(1)?

Thank you.

misterT · Dec 31, 2013

Allocated memory blocks are kept in a linked list. You have two lists; Allocated memory and free-memory.
So, allocating and freeing memory is nothing more than adding and removing nodes from a linked list.
Of course the algorithm is not O(1) anymore if your memory becomes fragmented and you need to find a large enough free memory block. I don't think that can be done in O(1).

Read the last chapter in this page: https://www.nongnu.org/avr-libc/user-manual/malloc.html

EDIT: Actually only the free memory blocks are kept in a linked list so that they can be re-used. The user of the memory is responsible to keep record (pointers) of the memory blocks he has reserved.

electroRF · Dec 31, 2013

Hi T,
Thank you very much.

Regarding O(1) solution.

If the user could ask only for fixed-size blocks, i.e. 10 bytes each time,

I could easily handle the memory allocation /deallocation in O(1).

That is by having each free memory pointing on the next free memory.

But having the user asking for difference sizes of blocks (10 / 20 / 50 bytes), I don't manage to handle it in O(1).

But I'm not sure there isn't such an O(1) solution - it might be just a matter of space-time tradeoff.

misterT · Dec 31, 2013

electroRF said:
But having the user asking for difference sizes of blocks (10 / 20 / 50 bytes), I don't manage to handle it in O(1).

But I'm not sure there isn't such an O(1) solution - it might be just a matter of space-time tradeoff.

If the user has only three options for the memory size (10 / 20 / 50 bytes), then you could easily have a free-list for each size.

And this is not a matter of space-time tradeoff, since the data structure (linked list) uses the free memory blocks themselves as nodes. Each free memory block contains its size and a pointer to the next free memory block. Keeping several linked lists would not waste memory practically at all. Only one pointer per list is needed.

electroRF · Dec 31, 2013

Hi T,
Thank you again!

Say that we have 3 free lists - one for each size (10 / 20 / 50).

Say that you used all the 1000-byte block for sub-blocks of 10byte each (i.e. 100 blocks).

Now you start releasing 10-byte sub-blocks.

How do you update the 50-byte list (or 20-byte list) list when 5 (or 2) contiguous 10-byte blocks were released? (the update should be in O(1))

That's the issue I don't manage to resolve.

misterT · Dec 31, 2013

Hmm.. yeah. I didn't think that through.

I found this paper which describes a fast algorithm, but it is really too advanced for your "simple" problem. **broken link removed**

I'll keep thinking about your problem and post if I figure out some tricks that can be done.

misterT · Dec 31, 2013

Google for "slab allocator".

electroRF · Dec 31, 2013

misterT said:
Allocated memory blocks are kept in a linked list. You have two lists; Allocated memory and free-memory.
So, allocating and freeing memory is nothing more than adding and removing nodes from a linked list.
Of course the algorithm is not O(1) anymore if your memory becomes fragmented and you need to find a large enough free memory block. I don't think that can be done in O(1).

Read the last chapter in this page: https://www.nongnu.org/avr-libc/user-manual/malloc.html

EDIT: Actually only the free memory blocks are kept in a linked list so that they can be re-used. The user of the memory is responsible to keep record (pointers) of the memory blocks he has reserved.

Hi again T,

I'm trying to understand the algorithm in the link you attached - how Malloc and Free work.

It says there:
" Since both, a chain pointer and the size of the chunk need to be recorded in each chunk, the minimum chunk size on the freelist is four bytes. "

I didn't manage to understand, how is the freelist built?
Say in the beginning when all memory is free.

misterT · Dec 31, 2013

electroRF said:
" Since both, a chain pointer and the size of the chunk need to be recorded in each chunk, the minimum chunk size on the freelist is four bytes. "
I didn't manage to understand, how is the freelist built?
Say in the beginning when all memory is free.

I'll simplify this as much as I can and do my best explaining it.

You have 1000 bytes of heap memory. The bytes are numbered from 1 to 1000 (byte addresses).

You need 2 variables:
heap_bottom = 1 // This marks the start of unused heap memory.. when this reaches 1000 then all the memory is used. Possible free memory is in the freelist.
freelist_pointer = NULL // this is a pointer to first free memoryblock.. initially NULL

When you want to allocate 10 bytes of memory you record that size in the byte just before the actual memory pointer that is returned. So after calling malloc(10) the variables contain:
heap_bottom == 12
freelist_pointer == NULL
Malloc returned pointer to memory location 2 and location 1 contains the value 10.

After calling malloc(20)
heap_bottom == 33
freelist_pointer == NULL
Malloc returned pointer to memory location 13 and location 12 contains the value 20

After calling malloc(10)
heap_bottom == 44
freelist_pointer == NULL
Malloc returned pointer to memory location 34 and location 33 contains the value 10

Now.. lets free memory, call to free(2) should free the memory at that position. All we need to do is to write that pointer value to the freelist.
freelist_pointer == 2
We also write NULL to the memory location 2, because this memory block is the last node of our list.

Free(34):
Now, we need to link this memory bloc to the freelist. This is simple linked list addition.
Write 34 to the last nodes pointer and set this node as last node.
memory location 2 == 34
memory location 34 == NULL

Now, our variable "freelist_pointer" pointer points to memory location 2, which in turn points to memory location 34, which points to NULL.
When you need to know the size of the free memory block, you just read the byte before the memory block i.e. locations 1 and 33 in this case.

Well.. that is quite a mess. One good picture would explain it much better. These pictures are not exactly related to my text. But very close.

misterT · Dec 31, 2013

The important thing to know is that every allocated memory block has a header that contains all kinds of information you need. The size of the memory block, pointers to next free memory block etc.

NorthGuy · Dec 31, 2013

Once I walked into an electric supply store and they had a sign:

"We can do it well, fast, and cheap. But you can only peek two of these"

So, with any algorithm. You cannot make it optimal in every respect. If you want it fast, it probably won't be optimal in other areas.

Memory allocation algorithm should be:

- fast
- compact (will not waste memory)
- robust (minimize defragmentation)
- safe (work fine if you pass a wrong pointer to free())

You cannot achieve all of that in one memory manager. You need to make some compromise in areas which are less important to you.

You want it to be O(1), which is fast.

The champion of the fast would be to allocate a 1000-byte block and return it regadess of the size requested. This also would be extremely robust and could be made safe with ease. But this will be terrible because it wouldn't be compact at all.

So, we probably need something better. I would suggest the algorithm below. It'll be just a touch slower than the fastest algorithm, and it'll run in O(1). But, it will not be as robust and as compact as it could be, and it won't be safe.

You know how to do fixed-size allocator, right?

1. Create a 1000-byte allocator.

2. For 50-byte blocks, create a 50-byte allocator. When it runs out of memory, allocate a new 1000-byte block and create 18 new 5o-byte blocks in it (I assumed 4-byte pointers, so 50-byte block requires 54 bytes of memory), mark them all as free and add them to the free list of your 50-byte allocator.

3. For 20-byte blocks, create a 20-byte allocator. When it runs out of memory, allocate a new 1000-byte block and create 45 new 20-byte blocks. Or, you can allocate a new 50-byte block and create 2 20-byte blocks in it. You test your algorithm and you decide what's better.

4. For 10-byte blocks, you can allocate 50-byte blocks and nest 3 10-byte blocks inside, or you can allocate 1000-byte block and create 71 10-byte blocks in it. But also you can allocate a block from 20-byte allocator and return it as a 10-byte block. Again, only testing can tell you what's better.

This will be fast, relatively compact, moderately robust, and unsafe.

misterT · Dec 31, 2013

NorthGuy said:
You want it to be O(1), which is fast.

Sorry for nitpicking, but O(1) doesn't automatically mean that the algorithm is fast. It only means that it executes in constant time, which is very important in some real-time applications where timing need to be very predictable and precise.
But, usually O(1) algorithms are also fast.

NorthGuy · Dec 31, 2013

misterT said:
Sorry for nitpicking, but O(1) doesn't automatically mean that the algorithm is fast. It only means that it executes in constant time, which is very important in some real-time applications where timing need to be very predictable and precise.
But, usually O(1) algorithms are also fast.

Good point.

O(1) is not necessarily constant, but rather independent on the amount of objects being managed, so it can be scaled without any loss in speed.

electroRF · Jan 1, 2014

Hi T,

This is an awesome post and so great an explanation.

I read it over and over to better understand, and I think I got it

I have a question on your example.

Say that you now free (13).

You'll get a contiguous part of the memory which is free - from address 1 to 43 (inclusive).

How do you efficiently make them one contiguous free block of size 43, instead of 3 contiguous blocks of sizes 10, 20, 10?

BTW - you can't really allocate sizes that are not 4-byte-aligned, right?

because it'd decrease CPU performance.

--EDIT--

Additional question please

in this algorithm, when you the user wants to allocate additional memory, which is available in the free-list, do you remove that node from the list, or do you mark it as taken?

misterT said:
I'll simplify this as much as I can and do my best explaining it.

You have 1000 bytes of heap memory. The bytes are numbered from 1 to 1000 (byte addresses).

You need 2 variables:
heap_bottom = 1 // This marks the start of unused heap memory.. when this reaches 1000 then all the memory is used. Possible free memory is in the freelist.
freelist_pointer = NULL // this is a pointer to first free memoryblock.. initially NULL

When you want to allocate 10 bytes of memory you record that size in the byte just before the actual memory pointer that is returned. So after calling malloc(10) the variables contain:
heap_bottom == 12
freelist_pointer == NULL
Malloc returned pointer to memory location 2 and location 1 contains the value 10.

After calling malloc(20)
heap_bottom == 33
freelist_pointer == NULL
Malloc returned pointer to memory location 13 and location 12 contains the value 20

After calling malloc(10)
heap_bottom == 44
freelist_pointer == NULL
Malloc returned pointer to memory location 34 and location 33 contains the value 10

Now.. lets free memory, call to free(2) should free the memory at that position. All we need to do is to write that pointer value to the freelist.
freelist_pointer == 2
We also write NULL to the memory location 2, because this memory block is the last node of our list.

Free(34):
Now, we need to link this memory bloc to the freelist. This is simple linked list addition.
Write 34 to the last nodes pointer and set this node as last node.
memory location 2 == 34
memory location 34 == NULL

Now, our variable "freelist_pointer" pointer points to memory location 2, which in turn points to memory location 34, which points to NULL.
When you need to know the size of the free memory block, you just read the byte before the memory block i.e. locations 1 and 33 in this case.

Well.. that is quite a mess. One good picture would explain it much better. These pictures are not exactly related to my text. But very close.

electroRF · Jan 1, 2014

NorthGuy

Thank you very much!

I'm still digesting the offered solution you wrote

NorthGuy said:
Once I walked into an electric supply store and they had a sign:

"We can do it well, fast, and cheap. But you can only peek two of these"

So, with any algorithm. You cannot make it optimal in every respect. If you want it fast, it probably won't be optimal in other areas.

Memory allocation algorithm should be:

- fast
- compact (will not waste memory)
- robust (minimize defragmentation)
- safe (work fine if you pass a wrong pointer to free())

You cannot achieve all of that in one memory manager. You need to make some compromise in areas which are less important to you.

You want it to be O(1), which is fast.

The champion of the fast would be to allocate a 1000-byte block and return it regadess of the size requested. This also would be extremely robust and could be made safe with ease. But this will be terrible because it wouldn't be compact at all.

So, we probably need something better. I would suggest the algorithm below. It'll be just a touch slower than the fastest algorithm, and it'll run in O(1). But, it will not be as robust and as compact as it could be, and it won't be safe.

You know how to do fixed-size allocator, right?

1. Create a 1000-byte allocator.

2. For 50-byte blocks, create a 50-byte allocator. When it runs out of memory, allocate a new 1000-byte block and create 18 new 5o-byte blocks in it (I assumed 4-byte pointers, so 50-byte block requires 54 bytes of memory), mark them all as free and add them to the free list of your 50-byte allocator.

3. For 20-byte blocks, create a 20-byte allocator. When it runs out of memory, allocate a new 1000-byte block and create 45 new 20-byte blocks. Or, you can allocate a new 50-byte block and create 2 20-byte blocks in it. You test your algorithm and you decide what's better.

4. For 10-byte blocks, you can allocate 50-byte blocks and nest 3 10-byte blocks inside, or you can allocate 1000-byte block and create 71 10-byte blocks in it. But also you can allocate a block from 20-byte allocator and return it as a 10-byte block. Again, only testing can tell you what's better.

This will be fast, relatively compact, moderately robust, and unsafe.

misterT · Jan 1, 2014

electroRF said:
Say that you now free (13).
You'll get a contiguous part of the memory which is free - from address 1 to 43 (inclusive).
How do you efficiently make them one contiguous free block of size 43, instead of 3 contiguous blocks of sizes 10, 20, 10?

Yes, you will get contiguous free blocks and it is a good idea to combine them into single larger block. I don't know what would be the most efficient, but keeping the freelist in order (by address) is a good start. Then you only need to check the previous and next block in the list. If their space collides with the new node, then combine the blocks.

electroRF said:
in this algorithm, when you the user wants to allocate additional memory, which is available in the free-list, do you remove that node from the list, or do you mark it as taken?

I think usually the node is removed. If the free block is (much) larger than the requested memory, then the node can be split into two smaller memory blocks.

Anyway.. there are always many ways to do things.

electroRF · Jan 1, 2014

Thank you T, very very much.

Man, isn't it a VERY tough question to be asked during an interview?

I mean, how can you know how efficient and how fast your algorithm is expected to be?

misterT · Jan 1, 2014

electroRF said:
Man, isn't it a VERY tough question to be asked during an interview?

Well, if this was a job interview question, then maybe they were not expecting a correct answer to the question itself. Maybe they wanted to see how you approach the problem etc. Maybe the question was difficult intentionally. They want to see how you react to that.

NorthGuy · Jan 1, 2014

electroRF said:
Man, isn't it a VERY tough question to be asked during an interview?

Not really. They don't want you to give the exact solution, they just want to see how you're going to approach the task. They try to find a question that you haven't heard before because they want to see how you think.

NorthGuy · Jan 1, 2014

misterT said:
Well, if this was a job interview question, then maybe they were not expecting a correct answer to the question itself. Maybe they wanted to see how you approach the problem etc. Maybe the question was difficult intentionally. They want to see how you react to that.

MisterT! You're faster than me

But hey, almost word for word

Welcome to our site!

Electro Tech is an online community (with over 170,000 members) who enjoy talking about and building electronic circuits, projects and gadgets. To participate you need to register. Registration is free. Click here to register now.

Memory Allocation & Release in O(1)

Member

Well-Known Member

Member

Well-Known Member

Member

Well-Known Member

Well-Known Member

Member

Well-Known Member

Well-Known Member

Well-Known Member

Well-Known Member

Well-Known Member

Member

Member

Well-Known Member

Member

Well-Known Member

Well-Known Member

Well-Known Member

Similar threads

New Articles From Microcontroller Tips