Welcome to our site!

Electro Tech is an online community (with over 170,000 members) who enjoy talking about and building electronic circuits, projects and gadgets. To participate you need to register. Registration is free. Click here to register now.

What would make a faster execution?

Status
Not open for further replies.

electroRF

Member
Hi,

Say you need to write a function on a microcontroller that reads values from an array, one value per call.

This function is called many times.

Which implementation would be faster:

1. Hard-code the array's values in memory? (That would be ROM, right?)
i.e. I assume that means declaring a global array that contains all the values.
--- Does that mean it would be stored in the data segment of RAM?

2. Compute all the values in the array with a loop (e.g. a for loop) right when main starts.
--- I assume that in that case the array would be global but not initialized.

I'd appreciate your comments.

Thank you.
 
Try different methods in a simulator (PICs have such simulators bundled in) to see which is fastest.

You can use the stopwatch and breakpoints. Run until the function call, reset the stopwatch, and then continue to a breakpoint placed after your function call.

The stopwatch will give you the time and/or cycles the function has taken for the array read.

In C18 you define arrays in program memory as
Code:
const rom unsigned char sinetable[256] = { 128,130,131, ...... };

I cannot remember how many cycles such a read takes.

In regard to 2, for RAM-based arrays, yes (I have a suspicion you may be able to write to flash at run time). If you really need extra speed in for loops dealing with arrays, you can try loop unrolling to reduce the loop overhead (i.e. process 8 elements per loop iteration, or more).
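To make the loop-unrolling idea concrete, here is a minimal sketch. The function names are made up for illustration, and it unrolls by 4 rather than 8 to keep it short. Whether it actually helps depends on the compiler and the part, so measure it.

```c
#include <stddef.h>
#include <stdint.h>

/* Plain loop: one compare-and-branch per element. */
uint16_t sum_simple(const uint8_t *data, size_t n)
{
    uint16_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += data[i];
    return sum;
}

/* Unrolled by 4: one compare-and-branch per four elements.
 * The second loop handles the leftover 0..3 elements. */
uint16_t sum_unrolled4(const uint8_t *data, size_t n)
{
    uint16_t sum = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        sum += data[i];
        sum += data[i + 1];
        sum += data[i + 2];
        sum += data[i + 3];
    }
    for (; i < n; i++)
        sum += data[i];
    return sum;
}
```

Both functions return the same result; the unrolled one trades code size for fewer loop-counter updates and branches.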

Hope that helps.
 
Hi Rich,
Thanks a lot!
Try different methods in a simulator (PICs have such simulators bundled in) to see which is fastest.

You can use the stopwatch and breakpoints. Run until the function call, reset the stopwatch, and then continue to a breakpoint placed after your function call.

The stopwatch will give you the time and/or cycles the function has taken for the array read.

Did you mean to compare the function's execution time when the array's values are hard-coded against when they are pre-computed right at the beginning of main?

I'm actually trying to understand the theory behind it.

I'm trying to understand what the difference is between the implementations in a, b and c, and what the difference is between reading from the RAM static area and reading from ROM.

a. When an array is hard-coded - (global scope) int sinetable[256] = {0, 12, 14, ...}; - would it be written to the RAM static area?

b. When an array is hard-coded with const - (global scope) int const sinetable[256] = {0, 12, 14, ...}; - would it be written to ROM?

c. When an array is pre-computed inside main but defined globally (outside main - int sinetable[256]; ), would it still be written to RAM, in the static area?
 
I meant compare arrays declared in RAM and flash, verify yourself if there is a speed difference :)

a. RAM.
b. Hmm, I do not think so. The syntax I posted above is for getting arrays into flash (from the C18 manual). I do not do much embedded programming anymore. Mostly a VHDL man now ;)
c. Generally RAM. I think you can write to flash, however, I am not sure, and I think you need to write functions or use libraries to do so. Your loop would take some time to run on start up, but generally that would not be an issue. You could use a global, or use pointers. Remember, the name of an array decays to a pointer to its first element when you pass it to a function.

So you could go something like this for using pointers (Note this is VERY hastily written, whilst slurping my coffee on my break!)

Code:
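// Prototype, so main() can call the function before its definition.
int functiontest ( char * inputarr );

void main ( void )
{
    char testarray[] = "testing123";
    int testint;

    // The array name is passed as a pointer to its first element.
    testint = functiontest(testarray);

}

// a silly function that counts the number of bytes
// until it finds a null terminator.
int functiontest ( char * inputarr )
{
    int i = 0;
    while(inputarr[i] != '\0')
        i++;

    return i;
}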

Globals are a lot more commonly used in the embedded world than in the PC world.
 
I am not too sure how PIC compilers do it, but here are my 2 cents on how avr-gcc does it.
Cases a and b will both be the same: the startup code loads them from program memory into RAM. AVRs have special macros that let you read static data directly from program memory instead.

Case c is very similar to a and b, only the memory will be zeroed before main starts instead of being loaded with data from program memory.
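A small sketch of cases a to c next to a table kept in program memory, assuming avr-gcc with avr-libc on a classic AVR part. PROGMEM and pgm_read_byte() come from avr-libc's <avr/pgmspace.h>; the table names and the host fallback macros are made up for illustration so the same file also builds on a PC.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/pgmspace.h>            /* real avr-libc header */
#else
/* Host fallback (an assumption for illustration, not part of avr-libc). */
#define PROGMEM
#define pgm_read_byte(addr) (*(const uint8_t *)(addr))
#endif

/* a. Initialized global: values stored in flash, copied to RAM before main(). */
uint8_t table_a[4] = { 10, 20, 30, 40 };

/* b. const global: on classic AVR parts avr-gcc handles it like case a. */
const uint8_t table_b[4] = { 10, 20, 30, 40 };

/* c. Uninitialized global: zeroed before main(), filled at run time. */
uint8_t table_c[4];

/* Kept in program memory only; must be read through pgm_read_byte(). */
const uint8_t table_flash[4] PROGMEM = { 10, 20, 30, 40 };

/* Fill case c at startup, as a loop in main() would. */
void fill_table_c(void)
{
    for (uint8_t i = 0; i < 4; i++)
        table_c[i] = (uint8_t)(10 * (i + 1));
}

/* Read one entry straight from program memory. */
uint8_t read_flash(uint8_t i)
{
    return pgm_read_byte(&table_flash[i]);
}
```

Only table_flash avoids using RAM; the price is the explicit pgm_read_byte() call on every access.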
 
RAM is much faster than ROM.
When I need a sine look-up table, I usually calculate the values in the table "online" when the main starts. Usually a char table with 256 elements and a full sine wave. I think that is the most efficient way with 8-bit micros. Of course you need to have plenty of memory to do that.
You could define the array as const to give the compiler more options to optimize your code. I'm not sure if that makes any real difference. Depends on how you use the array.
 
It is certainly faster to fetch from RAM than from ROM. Remember shadow BIOS? BIOS is moved from ROM to RAM to speed up processing.

About pre-computing. If it's something long and difficult, it's better to pre-compute on PC and create an array. If it's something easy, such as assigning consecutive numbers, it's better to build the array at startup in main().
 
About pre-computing. If it's something long and difficult, it's better to pre-compute on PC and create an array.
That is true. My sine-table initialization looks like this:
C:
    for (int i=0; i<256; i++) {
        sineTable[i] = (int8_t)(127.0 * sin(6.283*((float)i)/256.0));
    }
That is time consuming and requires floating point and math routines included in the program. Maybe for a "final product" I would pre-compute the table. But at "production phase" it is nice to have the flexibility to play with things easily.
 
But at "production phase" it is nice to have the flexibility to play with things easily.

I agree.

If I do pre-calculations, I usually write a small program that outputs the definition of the array with all necessary syntax into an include file. The location of the include file is hard-coded, so when I run the program it creates the file where needed, which immediately affects all my future builds. If I need to change something, I make changes to the generator program and run it again.
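A minimal sketch of the output side of such a generator, run on the PC. The function name, the int8_t element type and the eight-per-line layout are assumptions for illustration, not the poster's actual program; the values themselves would be computed beforehand.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Write `values` as a C array definition to `out`, eight entries per line.
 * Returns 0 on success, -1 if any write failed. */
int write_table(FILE *out, const char *name, const int8_t *values, size_t n)
{
    if (fprintf(out, "const int8_t %s[%u] = {\n", name, (unsigned)n) < 0)
        return -1;
    for (size_t i = 0; i < n; i++) {
        /* Break the line after every eighth entry and after the last one. */
        const char *sep = (i % 8 == 7 || i + 1 == n) ? "\n" : " ";
        if (fprintf(out, "%d,%s", values[i], sep) < 0)
            return -1;
    }
    return fprintf(out, "};\n") < 0 ? -1 : 0;
}
```

The generator's main() would open the include file at its hard-coded path with fopen(), call write_table() and close the file; the firmware then just #includes the result.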
 
Hi,
Thank you very much dear friend!

RAM is faster than ROM; however, as I understand it:

1. global array with hard-coded values (with or without const) -> the array is stored in ROM and copied from ROM to RAM before the program starts

2. global array with values computed at the beginning of main -> the array exists only in RAM (?)

Therefore, is it correct to assume that what decides which is more time-efficient is
the time to load the array's values from ROM to RAM vs. the time to pre-compute the values?
 
I agree.

If I do pre-calculations, I usually write a small program that outputs the definition of the array with all necessary syntax into an include file. The location of the include file is hard-coded, so when I run the program it creates the file where needed, which immediately affects all my future builds. If I need to change something, I make changes to the generator program and run it again.

I tend to do this, with an Octave script in the project directory.
 
Therefore, is it correct to assume that what decides which is more time-efficient is
the time to load the array's values from ROM to RAM vs. the time to pre-compute the values?

Yes, and in real-life applications loading the values from ROM to RAM is much faster. It must be: if calculating the values were faster than reading them from ROM, there would not be much point in using a look-up table :)
But, the initialization needs to be done only once. Why would it matter how fast the initialization is (if we are talking about fractions of seconds)?
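For the per-call read that the table exists to speed up, here is a minimal sketch of a function that returns one value per call. The 8-entry table and the name next_value() are hypothetical; a real sine table would have 256 entries.

```c
#include <stdint.h>

/* Hypothetical 8-entry table, for illustration only. */
static const uint8_t table[8] = { 128, 218, 255, 218, 128, 38, 1, 38 };

/* Return the next table entry on every call, wrapping at the end.
 * The size is a power of two, so the wrap is a cheap bit mask. */
uint8_t next_value(void)
{
    static uint8_t index = 0;
    uint8_t value = table[index];
    index = (uint8_t)((index + 1) & 7u);
    return value;
}
```

Each call costs one indexed read and one increment, no matter how the table was filled in the first place.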
 
But, the initialization needs to be done only once. Why would it matter how fast the initialization is (if we are talking about fractions of seconds)?

That's a good point.

There's no reason to try to write more efficient code when the less efficient code works fine, doesn't cause undue delays and doesn't use limiting resources. It is usually much faster and easier to write any code than to try to make it very efficient. 95% of the code you write doesn't have to be efficient at all, but needs to be reliable. In such cases it's best to use the most straightforward approach. Then you can spend 95% of your time writing the 5% of the code that does have to be efficient :)
 
I usually add a few hundred milliseconds of delay before initializations and before the code enters the infinite loop. This gives everything time to get "stable" and gives all other devices time to start up properly.
I once wasted a whole day debugging a graphic LCD. The problem was that the GLCD did not have enough time to initialize itself on power-up. Half a second of delay was needed before trying to write any settings to the GLCD.
Populating a sine table takes no time compared to a 100ms delay.
 
Thank you very much guys :)

Does the datasheet say how many instruction cycles it takes to read from Program Flash / Data SRAM?

I'm using this 8051 uC (it has 64KB/128KB Program Flash and 8KB SRAM), and on page 30 the datasheet says:

- 256B Data - single instruction cycle

- XDATA - 4-5 instruction cycles

- SFR - single instruction cycle

but it doesn't mention how many cycles it takes to read from Program Flash or to read from/write to SRAM.
 