Welcome to our site!

Electro Tech is an online community (with over 170,000 members) who enjoy talking about and building electronic circuits, projects and gadgets. To participate you need to register. Registration is free. Click here to register now.

12 simultaneous & unique frequencies (PWM)

Status
Not open for further replies.
It is unlikely that you can get anywhere close to maximum efficiency with a C program.

I'm not familiar with AVR (some processors have weird limitations, you never know). A clever assembler program on a PIC could run at about 2.2 CPU cycles per channel per interrupt. That is about 30 cycles per interrupt (counting interrupt overhead). A PIC executes instructions 2 (or 4) times slower than its clock (I don't know about AVR), so with a 20 MHz clock and assuming a factor of 1/2, that is 100 ns per cycle. AVR should be similar, but you need to look at the AVR datasheet to figure out the real speed. If 100 ns is correct, you should be able to get (100 ns x 30) 3 us per interrupt. Leaving half of the bandwidth to USB (should be enough), you get 6 us per interrupt, or 167 kHz. You need 379 ticks to get 439.8 Hz, and 380 ticks to get 438.6 Hz. I cannot tell if that's a good enough resolution for you.

At any rate, 40MIPS PIC @ $6 would be enough.

To make it all work together, you need to make the timer interrupt high priority and run USB in the main thread with low priority interrupts. This way they will live together without interference.

Keep in mind that you get a square wave, which will not sound anywhere near as good as a sine wave.
 
Hmm.. I smell unnecessary PIC vs AVR and C vs ASM here.
This problem has nothing to do with the language you program with. I write my example codes in (pseudo) C because it is readable and it is not tied to any specific chip. Of course you can squeeze some cycles out if you optimize the code with some inline asm.

Clever assembly program on pic does exactly the same as clever assembly program on avr. Only that avr cpu runs exactly the same speed as the clock.. no dividing by 4. With 20 MHz you get almost 20 MIPS (some instructions take 2 cycles).

6 dollars for 12 pwm channels is a lot in my opinion.
 
This problem has nothing to do with the language you program with. I write my example codes in (pseudo) C because it is readable and it is not tied to any specific chip. Of course you can squeeze some cycles out if you optimize the code with some inline asm.

Some cycles? In this particular case, you need to run it as fast as you possibly can because speed here directly translates into the accuracy of frequency reproduction. Speed is really the only requirement here. It needs to run several million times per second, which leaves you just a few cycles to work with.

With assembler, we're talking about close to 2 cycles per channel per interrupt here. One more cycle is a 50% decrease in performance. The 8-10 cycles that you can (maybe) get from C are several times worse, will require slowing down, will start interfering with USB operations, and will dramatically decrease the accuracy.

Clever assembly program on pic does exactly the same as clever assembly program on avr.

I assumed so, but I don't really know. Here you need direct memory addressing. If you need to load something into a register before doing the operation, then store it back, it'll take 2 more cycles. I'm not familiar with AVR, so I cannot tell. Maybe it has good instructions for this particular situation, maybe it doesn't. I cannot speak of things that I don't know.

Only that avr cpu runs exactly the same speed as the clock.. no dividing by 4. With 20 MHz you get almost 20 MIPS (some instructions take 2 cycles).

That is good. In this application you need it as fast as possible.

6 dollars for 12 pwm channels is a lot in my opinion.

Looked at DigiKey. ATMega640 is over $10.

From this viewpoint, 6 small controllers with synchronized clocks running 2 channels each with their hardware PWM modules would be cheaper than one big one.
 
Some cycles? In this particular case, you need to run it as fast as you possibly can because speed here directly translates into the accuracy of frequency reproduction. Speed is really the only requirement here. It needs to run several million times per second, which leaves you just a few cycles to work with.
Yes, I should not have mentioned optimization or inline asm. My point is that I write code examples in C because they are easy to follow and the general idea comes through easily. There would not be any point in saying "do something like this" and then posting assembly code. I don't think the OP is stupid; he understands that speed and efficiency are important in this application. But choosing the right strategy is key.. if you have the wrong general plan, then assembly or any other trick will not solve your problems.

Looked at DigiKey. ATMega640 is over $10.
Yes, and that is also a lot if all you get is 12 square wave frequency outputs (with crappy quality). Maybe some small controllers that cost around 0.5 dollars and could do at least two channels in hardware would be a cost effective solution. The happy situation is that 12 is divisible by 2, 3 and 4 :)
 
I wrote a very naive software PWM (it is also stupid in the way it handles output, but that error does not affect the point of this post) just to get some idea of what we are really dealing with:

C:
#include <avr/io.h>
#include <stdint.h>

#define sbi(b,n) (b |= (1<<n))          /* Set bit number n in byte b    */
#define cbi(b,n) (b &= (~(1<<n)))       /* Clear bit number n in byte b  */

volatile uint8_t counter[12];

/* Just some dummy values */
const uint8_t top[12] = {112, 119, 126, 134, 141, 150, 159, 168, 178, 189, 200, 212};

int main(void)
{
    while(1)
    {
        /* The compiler will unroll this when compiled with -O3 */
        for(uint8_t i = 0; i<12; i++) {

            /* Increment each counter */
            counter[i]++;

            /* Check if counter is over the top */
            if (counter[i] > top[i]) {counter[i] = 0; }

            /* Compare for 50% square wave */
            (counter[i] > (top[i]>>1)) ? (sbi(PORTB, i)) : (cbi(PORTB, i));
        }
    }
}

The simulator tells me that one loop iteration (one channel update) takes ~18 clock cycles.
With a 20 MHz clock this means that you effectively get a ~90 kHz counter clock for each channel (20 MHz / (18*12)).

I calculated that if you make the update frequency 44 kHz, you can hit the frequencies below while using about 50% of your processor time. I don't know if they are accurate enough for you.
(Double the counter frequency and you halve the error. But with an 8-bit counter and this naive solution, the max frequency you can use is ~52750 Hz.) Frequencies in Hz:
207.5
220.0
232.8
247.2
261.9
276.7
293.3
312.1
328.4
349.2
369.7
392.9

This is the disassembly for one loop iteration (one channel update):
Code:
            counter[i]++;
000000A3  LDS R24,0x0201        Load direct from data space
000000A5  SUBI R24,0xFF        Subtract immediate
000000A6  STS 0x0201,R24        Store direct to data space
            if (counter[i] > top[i]) {counter[i] = 0; }
000000A8  LDS R24,0x0201        Load direct from data space
000000AA  CPI R24,0x78        Compare with immediate
000000AB  BRCS PC+0x03        Branch if carry set
000000AC  STS 0x0201,R1        Store direct to data space
            (counter[i] > (top[i]>>1)) ? (sbi(PORTB, i)) : (cbi(PORTB, i));
000000AE  LDS R24,0x0201        Load direct from data space
000000B0  CPI R24,0x3C        Compare with immediate
000000B1  BRCS PC+0x02        Branch if carry set
000000B2  RJMP PC+0x0071        Relative jump
000000B3  CBI 0x05,1        Clear bit in I/O register

           This is the code where the relative jump points
00000123  SBI 0x05,1        Set bit in I/O register
00000124  RJMP PC-0x0070        Relative jump

Sorry that my first two code examples were wrong.. I was drinking after a long day at work :) The above code is a very "naive" way to do it and it can be optimized from there.
(Actually, I'd like to see how a good assembler programmer would optimize that code, generated from a naive approach by a C compiler.
Keeping all counters in registers is the thing to do of course, and you can do that in C code. No need to mess with asm and sacrifice readability, flexibility and portability.)

If you are going to optimize your code with inline assembly, then you need to know how GCC uses the registers (assuming you use the gcc compiler). This section describes how registers are allocated and
used by the compiler.

Register Use

r0: This can be used as a temporary register. If you assigned a value to this
register and are calling code generated by the compiler, you’ll need to save r0,
since the compiler may use it. Interrupt routines generated with the compiler save
and restore this register.

r1: The compiler assumes that this register contains zero. If you use this register
in your assembly code, be sure to clear it before returning to compiler generated
code (use "clr r1"). Interrupt routines generated with the compiler save and
restore this register, too.

r2–r17, r28, r29: These registers are used by the compiler for storage. If your
assembly code is called by compiler generated code, you need to save and restore
any of these registers that you use. (r29:r28 is the Y index register and is used
for pointing to the function’s stack frame, if necessary.)

r18–r27, r30, r31: These registers are up for grabs. If you use any of these
registers, you need to save its contents if you call any compiler generated code.

Function call conventions

Fixed Argument Lists: Function arguments are allocated left to right. They are
assigned from r25 to r8, respectively. All arguments take up an even number of
registers (so that the compiler can take advantage of the movw instruction on
enhanced cores.) If more parameters are passed than will fit in the registers, the
rest are passed on the stack. This should be avoided since the code takes a
performance hit when using variables residing on the stack.

Variable Argument Lists: Parameters passed to functions that have a variable
argument list (printf, scanf, etc.) are all passed on the stack. char parameters
are extended to ints.

Return Values: 8-bit values are returned in r24. 16-bit values are returned in
r25:r24. 32-bit values are returned in r25:r24:r23:r22. 64-bit values are returned
in r25:r24:r23:r22:r21:r20:r19:r18.


This is C code that updates all 12 channels in 123 cycles. At 20 MHz that gives you an update frequency of about 162 kHz.
I would like to see if a good asm coder can top that in 15 minutes (the code took me 5 minutes to write and simulate).
C:
#include <avr/io.h>
#include <stdint.h>

#define sbi(b,n) (b |= (1<<n))          /* Set bit number n in byte b    */
#define cbi(b,n) (b &= (~(1<<n)))       /* Clear bit number n in byte b  */

/* Keep the counters in registers */
register uint8_t counter1 asm("r2");
register uint8_t counter2 asm("r3");
register uint8_t counter3 asm("r4");
register uint8_t counter4 asm("r5");
register uint8_t counter5 asm("r6");
register uint8_t counter6 asm("r7");
register uint8_t counter7 asm("r8");
register uint8_t counter8 asm("r9");
register uint8_t counter9 asm("r10");
register uint8_t counter10 asm("r11");
register uint8_t counter11 asm("r12");
register uint8_t counter12 asm("r13");

const uint8_t top[12] = {112, 119, 126, 134, 141, 150, 159, 168, 178, 189, 200, 212};

int main(void)
{
    while(1)
    {
        /* Increment each counter */
        counter1++;
        counter2++;
        counter3++;
        counter4++;
        counter5++;
        counter6++;
        counter7++;
        counter8++;
        counter9++;
        counter10++;
        counter11++;
        counter12++;

        /* Check if counter is over the top */
        if (counter1 > top[0]) {counter1 = 0; }
        if (counter2 > top[1]) {counter2 = 0; }
        if (counter3 > top[2]) {counter3 = 0; }
        if (counter4 > top[3]) {counter4 = 0; }
        if (counter5 > top[4]) {counter5 = 0; }
        if (counter6 > top[5]) {counter6 = 0; }
        if (counter7 > top[6]) {counter7 = 0; }
        if (counter8 > top[7]) {counter8 = 0; }
        if (counter9 > top[8]) {counter9 = 0; }
        if (counter10 > top[9]) {counter10 = 0; }
        if (counter11 > top[10]) {counter11 = 0; }
        if (counter12 > top[11]) {counter12 = 0; }

        /* Compare for 50% square wave.
           The pin mapping is an assumption: channels 1-8 on PORTB0..7 and
           channels 9-12 on PORTC0..3, so that each channel has its own pin. */
        (counter1 > (top[0]>>1)) ? (sbi(PORTB, 0)) : (cbi(PORTB, 0));
        (counter2 > (top[1]>>1)) ? (sbi(PORTB, 1)) : (cbi(PORTB, 1));
        (counter3 > (top[2]>>1)) ? (sbi(PORTB, 2)) : (cbi(PORTB, 2));
        (counter4 > (top[3]>>1)) ? (sbi(PORTB, 3)) : (cbi(PORTB, 3));
        (counter5 > (top[4]>>1)) ? (sbi(PORTB, 4)) : (cbi(PORTB, 4));
        (counter6 > (top[5]>>1)) ? (sbi(PORTB, 5)) : (cbi(PORTB, 5));
        (counter7 > (top[6]>>1)) ? (sbi(PORTB, 6)) : (cbi(PORTB, 6));
        (counter8 > (top[7]>>1)) ? (sbi(PORTB, 7)) : (cbi(PORTB, 7));
        (counter9 > (top[8]>>1)) ? (sbi(PORTC, 0)) : (cbi(PORTC, 0));
        (counter10 > (top[9]>>1)) ? (sbi(PORTC, 1)) : (cbi(PORTC, 1));
        (counter11 > (top[10]>>1)) ? (sbi(PORTC, 2)) : (cbi(PORTC, 2));
        (counter12 > (top[11]>>1)) ? (sbi(PORTC, 3)) : (cbi(PORTC, 3));

    }
}

Disassembly:
Code:
        counter1++;
0000008A  INC R2        Increment
        counter2++;
0000008B  INC R3        Increment
        counter3++;
0000008C  INC R4        Increment
        counter4++;
0000008D  INC R5        Increment
        counter5++;
0000008E  INC R6        Increment
        counter6++;
0000008F  INC R7        Increment
        counter7++;
00000090  INC R8        Increment
        counter8++;
00000091  INC R9        Increment
        counter9++;
00000092  INC R10        Increment
        counter10++;
00000093  INC R11        Increment
        counter11++;
00000094  INC R12        Increment
        counter12++;
00000095  INC R13        Increment
        if (counter1 > top[0]) {counter1 = 0; }
00000096  LDI R24,0x70        Load immediate
00000097  CP R24,R2        Compare
00000098  BRCC PC+0x02        Branch if carry cleared
00000099  MOV R2,R1        Copy register
0000009A  LDI R24,0x77        Load immediate
0000009B  CP R24,R3        Compare
0000009C  BRCC PC+0x02        Branch if carry cleared
0000009D  MOV R3,R1        Copy register
0000009E  LDI R24,0x7E        Load immediate
0000009F  CP R24,R4        Compare
000000A0  BRCC PC+0x02        Branch if carry cleared
000000A1  MOV R4,R1        Copy register
000000A2  LDI R24,0x86        Load immediate
000000A3  CP R24,R5        Compare
000000A4  BRCC PC+0x02        Branch if carry cleared
000000A5  MOV R5,R1        Copy register
000000A6  LDI R24,0x8D        Load immediate
000000A7  CP R24,R6        Compare
000000A8  BRCC PC+0x02        Branch if carry cleared
000000A9  MOV R6,R1        Copy register
000000AA  LDI R24,0x96        Load immediate
000000AB  CP R24,R7        Compare
000000AC  BRCC PC+0x02        Branch if carry cleared
000000AD  MOV R7,R1        Copy register
000000AE  LDI R24,0x9F        Load immediate
000000AF  CP R24,R8        Compare
000000B0  BRCC PC+0x02        Branch if carry cleared
000000B1  MOV R8,R1        Copy register
000000B2  LDI R24,0xA8        Load immediate
000000B3  CP R24,R9        Compare
000000B4  BRCC PC+0x02        Branch if carry cleared
000000B5  MOV R9,R1        Copy register
000000B6  LDI R24,0xB2        Load immediate
000000B7  CP R24,R10        Compare
000000B8  BRCC PC+0x02        Branch if carry cleared
000000B9  MOV R10,R1        Copy register
000000BA  LDI R24,0xBD        Load immediate
000000BB  CP R24,R11        Compare
000000BC  BRCS PC+0x04        Branch if carry set
000000BD  LDI R24,0xC8        Load immediate
000000BE  CP R24,R12        Compare
000000BF  BRCC PC+0x02        Branch if carry cleared
000000C0  MOV R12,R1        Copy register
000000C1  LDI R24,0xD4        Load immediate
000000C2  CP R24,R13        Compare
000000C3  BRCC PC+0x02        Branch if carry cleared
000000C4  MOV R13,R1        Copy register
000000C5  LDI R24,0x38        Load immediate
000000C6  CP R24,R2        Compare
000000C7  BRCC PC+0x2F        Branch if carry cleared
000000C8  SBI 0x05,0        Set bit in I/O register
000000C9  LDI R24,0x3B        Load immediate
000000CA  CP R24,R3        Compare
000000CB  BRCC PC+0x2F        Branch if carry cleared
000000CC  SBI 0x05,0        Set bit in I/O register
000000CD  LDI R24,0x3F        Load immediate
000000CE  CP R24,R4        Compare
000000CF  BRCC PC+0x2F        Branch if carry cleared
000000D0  SBI 0x05,0        Set bit in I/O register
000000D1  LDI R24,0x43        Load immediate
000000D2  CP R24,R5        Compare
000000D3  BRCC PC+0x2F        Branch if carry cleared
000000D4  SBI 0x05,0        Set bit in I/O register
        (counter5 > (top[4]>>1)) ? (sbi(PORTB, 0)) : (cbi(PORTB, 0));
000000D5  LDI R24,0x46        Load immediate
000000D6  CP R24,R6        Compare
000000D7  BRCC PC+0x2F        Branch if carry cleared
000000D8  SBI 0x05,0        Set bit in I/O register
        (counter6 > (top[5]>>1)) ? (sbi(PORTB, 0)) : (cbi(PORTB, 0));
000000D9  LDI R24,0x4B        Load immediate
000000DA  CP R24,R7        Compare
000000DB  BRCC PC+0x2F        Branch if carry cleared
000000DC  SBI 0x05,0        Set bit in I/O register
        (counter7 > (top[6]>>1)) ? (sbi(PORTB, 0)) : (cbi(PORTB, 0));
000000DD  LDI R24,0x4F        Load immediate
000000DE  CP R24,R8        Compare
000000DF  BRCC PC+0x2F        Branch if carry cleared
000000E0  SBI 0x05,0        Set bit in I/O register
        (counter8 > (top[7]>>1)) ? (sbi(PORTB, 0)) : (cbi(PORTB, 0));
000000E1  LDI R24,0x54        Load immediate
000000E2  CP R24,R9        Compare
000000E3  BRCC PC+0x2F        Branch if carry cleared
000000E4  SBI 0x05,0        Set bit in I/O register
        (counter9 > (top[8]>>1)) ? (sbi(PORTB, 0)) : (cbi(PORTB, 0));
000000E5  LDI R24,0x59        Load immediate
000000E6  CP R24,R10        Compare
000000E7  BRCC PC+0x2F        Branch if carry cleared
000000E8  SBI 0x05,0        Set bit in I/O register
        (counter10 > (top[9]>>1)) ? (sbi(PORTB, 0)) : (cbi(PORTB, 0));
000000E9  LDI R24,0x5E        Load immediate
000000EA  CP R24,R11        Compare
000000EB  BRCC PC+0x2F        Branch if carry cleared
000000EC  SBI 0x05,0        Set bit in I/O register
        (counter11 > (top[10]>>1)) ? (sbi(PORTB, 0)) : (cbi(PORTB, 0));
000000ED  LDI R24,0x64        Load immediate
000000EE  CP R24,R12        Compare
000000EF  BRCC PC+0x2F        Branch if carry cleared
000000F0  SBI 0x05,0        Set bit in I/O register
        (counter12 > (top[11]>>1)) ? (sbi(PORTB, 0)) : (cbi(PORTB, 0));
000000F1  LDI R24,0x6A        Load immediate
000000F2  CP R24,R13        Compare
000000F3  BRCC PC+0x2F        Branch if carry cleared
000000F4  SBI 0x05,0        Set bit in I/O register
000000F5  RJMP PC-0x006C        Relative jump
        (counter1 > (top[0]>>1)) ? (sbi(PORTB, 0)) : (cbi(PORTB, 0));
000000F6  CBI 0x05,0        Clear bit in I/O register
        (counter2 > (top[1]>>1)) ? (sbi(PORTB, 0)) : (cbi(PORTB, 0));
000000F7  LDI R24,0x3B        Load immediate
000000F8  CP R24,R3        Compare
000000F9  BRCS PC-0x2D        Branch if carry set
000000FA  CBI 0x05,0        Clear bit in I/O register
        (counter3 > (top[2]>>1)) ? (sbi(PORTB, 0)) : (cbi(PORTB, 0));
000000FB  LDI R24,0x3F        Load immediate
000000FC  CP R24,R4        Compare
000000FD  BRCS PC-0x2D        Branch if carry set
000000FE  CBI 0x05,0        Clear bit in I/O register
        (counter4 > (top[3]>>1)) ? (sbi(PORTB, 0)) : (cbi(PORTB, 0));
000000FF  LDI R24,0x43        Load immediate
00000100  CP R24,R5        Compare
00000101  BRCS PC-0x2D        Branch if carry set
00000102  CBI 0x05,0        Clear bit in I/O register
        (counter5 > (top[4]>>1)) ? (sbi(PORTB, 0)) : (cbi(PORTB, 0));
00000103  LDI R24,0x46        Load immediate
00000104  CP R24,R6        Compare
00000105  BRCS PC-0x2D        Branch if carry set
00000106  CBI 0x05,0        Clear bit in I/O register
        (counter6 > (top[5]>>1)) ? (sbi(PORTB, 0)) : (cbi(PORTB, 0));
00000107  LDI R24,0x4B        Load immediate
00000108  CP R24,R7        Compare
00000109  BRCS PC-0x2D        Branch if carry set
0000010A  CBI 0x05,0        Clear bit in I/O register
        (counter7 > (top[6]>>1)) ? (sbi(PORTB, 0)) : (cbi(PORTB, 0));
0000010B  LDI R24,0x4F        Load immediate
0000010C  CP R24,R8        Compare
0000010D  BRCS PC-0x2D        Branch if carry set
0000010E  CBI 0x05,0        Clear bit in I/O register
        (counter8 > (top[7]>>1)) ? (sbi(PORTB, 0)) : (cbi(PORTB, 0));
0000010F  LDI R24,0x54        Load immediate
00000110  CP R24,R9        Compare
00000111  BRCS PC-0x2D        Branch if carry set
00000112  CBI 0x05,0        Clear bit in I/O register
        (counter9 > (top[8]>>1)) ? (sbi(PORTB, 0)) : (cbi(PORTB, 0));
00000113  LDI R24,0x59        Load immediate
00000114  CP R24,R10        Compare
00000115  BRCS PC-0x2D        Branch if carry set
00000116  CBI 0x05,0        Clear bit in I/O register
        (counter10 > (top[9]>>1)) ? (sbi(PORTB, 0)) : (cbi(PORTB, 0));
00000117  LDI R24,0x5E        Load immediate
00000118  CP R24,R11        Compare
00000119  BRCS PC-0x2D        Branch if carry set
0000011A  CBI 0x05,0        Clear bit in I/O register
        (counter11 > (top[10]>>1)) ? (sbi(PORTB, 0)) : (cbi(PORTB, 0));
0000011B  LDI R24,0x64        Load immediate
0000011C  CP R24,R12        Compare
0000011D  BRCS PC-0x2D        Branch if carry set
0000011E  CBI 0x05,0        Clear bit in I/O register
        (counter12 > (top[11]>>1)) ? (sbi(PORTB, 0)) : (cbi(PORTB, 0));
0000011F  LDI R24,0x6A        Load immediate
00000120  CP R24,R13        Compare
00000121  BRCS PC-0x2D        Branch if carry set
00000122  CBI 0x05,0        Clear bit in I/O register
00000123  RJMP PC-0x009A        Relative jump

Well, I hope this at least helps you to make the right decision.
 
You cheated a bit. Our counters are longer than 8-bit. With 16-bit counters, C will do much worse.

You probably cannot use registers because USB code will use them. You will have to save/reload them at every interrupt, which makes things longer.

About optimization. You need to optimize hard what runs often. The core thing that you need to do on every pass is

Code:
process_channel_1_first_byte:
LDS R24, channel_1_counter
DEC R24
STS channel_1_counter, R24
BREQ process_channel_1_second_byte

These are 4 instructions that will run every interrupt for every channel. The code for different channels is concatenated:

Code:
process_channel_1_first_byte:
LDS R24, channel_1_counter
DEC R24
STS channel_1_counter, R24
BREQ process_channel_1_second_byte
 
process_channel_2_first_byte:
LDS R24, channel_2_counter
DEC R24
STS channel_2_counter, R24
BREQ process_channel_2_second_byte

Note a few things:

- Only one register is used. So, when we enter the interrupt we don't need to save more.

- Jumps are rare, only when the first byte of the counter goes to zero, which is, on average, about 1 pass out of 128. Most of the time execution goes straight through.

- It would be 2 cycles if not for the LDS/STS instructions - PICs can decrement directly in memory, so it would be 2 cycles on a PIC.

The result - 4 cycles per channel per interrupt.

So, what do we do if BREQ jumps? This part doesn't need any heavy optimization, it doesn't run very often. Only about once per 128 cycles, so I won't try hard here:

Code:
process_channel_1_second_byte:
LDS R24, channel_1_counter+1
SUBI R24, 1 ; SUBI rather than DEC, because DEC does not update the carry flag
STS channel_1_counter+1, R24
BRCC process_channel_2_first_byte ; no borrow, so the counter has not expired yet
; our counter has expired
; toggle the LED here. This is my first exercise with AVR.
; I assume this can be done in 6 instructions
; I skip this part
LDI R24, lsb_byte_of_the_channel_1_tick_count
STS channel_1_counter, R24
LDI R24, msb_byte_of_the_channel_1_tick_count
STS channel_1_counter+1, R24
JMP process_channel_2_first_byte

Tick count should be set to half a cycle.

Assume this runs in 20 cycles, adding 20/128 = 0.15 cycles per channel/per interrupt.

So, we get 4.15 cycles per channel/per interrupt. As you can see, the 16-bit Assembler solution runs 18/4.15 = over 4 times better than your 8-bit C solution.

Note: I've never worked with AVR. There could be some tricks that would allow an experienced AVR programmer to speed it up more.
 
With 16-bit counters, C will do much worse.
Why? Tell me why?

And.. did you come up with code that beats my example? I posted complete code that handles 12 channels, but all you did was ramble about something that nobody cares about. Post full code or shut up about C vs ASM.
 
Why? Tell me why?

Because it's an 8-bit processor. 16-bit operations must be created through a combination of 8-bit operations.

Compile your example with char. Clock the speed. Then change char to int, compile again and clock the speed. You will see that it got slower.
 
Assume this runs in 20 cycles, adding 20/128 = 0.15 cycles per channel/per interrupt.
How can any processor update anything in less than one cycle? I did simulations etc. to show some actual data.. could you even try to match that if you are going to put down the work I did for the OP..
 
Because it's an 8-bit processor. 16-bit operations must be created through a combination of 8-bit operations.

Compile your example with char. Clock the speed. Then change char to int, compile again and clock the speed. You will see that it got slower.

You said that "C will do much worse".. I do not understand why you think C language is worse with 16bit variables.. it is exactly the same with assembly. Char is for characters.. if you need 16 bit integers then use the proper size variable. I think the main problem you have is that you do not know the C programming language.
 
Compile your example with char. Clock the speed. Then change char to int, compile again and clock the speed. You will see that it got slower.
Do you think I'm stupid? I did not declare the variables in my example code to be uint8_t just for fun.. and I did clock the speed.. 128 cycles to update 12 channels. Your post was very confusing.. how fast your solution updates all 12 channels?
 
You said that "C will do much worse".. I do not understand why you think C language is worse with 16bit variables.. it is exactly the same with assembly. Char is for characters.. if you need 16 bit integers then use the proper size variable. I think the main problem you have is that you do not know the C programming language.

Look at the program I posted. It operates on 16-bit variables, but it's nearly the same speed as it would be with 8-bit variables.

C will be considerably slower with 16-bit variables compared to 8-bit. You can compile and measure, if you're interested.

So it's not exactly the same.
 
C will be considerably slower with 16-bit variables compared to 8-bit.
That is just stupid.. do you think that assembly program does not get slower if you use 16bit variables instead of 8bit variables?
 
If you think assembly is so superior, then post the code.. I mean full code.. like I did.
 
Do you think I'm stupid? I did not declare the variables in my example code to be uint8_t just for fun.. and I did clock the speed.. 128 cycles to update 12 channels.

The problem with this is that counters that we want to use won't fit into uint8_t.

Your post was very confusing.. how fast your solution updates all 12 channels?

What exactly is confusing? 4.15 x 12 = 50 cycles (2.5 us average execution time)

Your 128 cycles number is not actually feasible. During the main loop execution, a USB program will use registers, so when you enter the interrupt, you will have to (a) store all the registers you use, (b) load your variables into registers. When you leave, you need to (a) store your variables back, (b) restore the registers to what they were at entry. With 12 registers and 4 operations this is 48 more cycles, so the correct count is 176, or somewhere around 300 if you go from 8-bit variables to 16-bit ones.
 
The problem with this is that counters that we want to use won't fit into uint8_t.
8bit counter will fit in 8 bit variable. 8 bit counter is enough if you are clever.

Your 128 cycles number is not actually feasible. During the main loop execution, a USB program will use registers, so when you enter the interrupt, you will have to (a) store all the registers you use, (b) load your variables into registers. When you leave, you need to (a) store your variables back, (b) restore the registers to what they were at entry. With 12 registers and 4 operations this is 48 more cycles, so the correct count is 176, or somewhere around 300 if you go from 8-bit variables to 16-bit ones.
yes.. my example was supposed to show a naive approach to the problem and how much processing time it takes etc.. kind of "worst case scenario".. that gives you an idea of the problem you are dealing with. That is something assembler programmers are not familiar with... apparently.
 
I did post the code in post #26.
Ok.. so how many cycles your code takes to update 12 channels? What is confusing for me is that you say that your code takes ~50 cycles, but then you say that my 128 cycles is not feasible..
 
Your 128 cycles number is not actually feasible. During the main loop execution, a USB program will use registers, so when you enter the interrupt, you will have to (a) store all the registers you use, (b) load your variables into registers. When you leave, you need to (a) store your variables back, (b) restore the registers to what they were at entry. With 12 registers and 4 operations this is 48 more cycles, so the correct count is 176, or somewhere around 300 if you go from 8-bit variables to 16-bit ones.

Believe me or not, but I know the compiler and the hardware and I know exactly what every line of C code does and how it translates to asm. And I can tell you that asm programmers vs C programmers.. the C compiler wins. The compiler is the result of many years of development and written by experienced asm programmers etc.
 