Software PWM for 128 led matrix + [video]

dougy83 · Mar 2, 2011

Mike said:
No, that's not it at all. There's a pretty good description of the anomaly in post #3 in this PicBasicPro forum thread.

Oh, I see. It wouldn't be so evident at a higher sample rate though? Or, as you said, with sufficient row-off time.

tubos · Mar 2, 2011

Thanks for the code, Doug. Now I understand what you meant before.
I'll have to upgrade my PIC though, since the pins that have
hardware SPI are occupied (it's only an 18-pin part).
I think there's a lot of speed to gain by optimizing the C fragments inside
the two loops of my ISR, so I'm working on this first.

tubos · Mar 4, 2011

Thanks to the tips here and from some helpful posters on the Microchip
forums, I was able to optimize my ISR code for speed.
I now have 8 brightness levels per colour, taken from a gamma correction table that maps them onto PWM counts from 0 to 28.
That gives 64 colour combinations per LED (8 x 8), which is enough for my purposes for now.
The frame rate is 70 Hz and the ISR load is now 46%.
Most of the optimizing was done by using pointers to access the arrays by row.

My own simplified gamma correction is as follows:

Step   0   1   2   3   4   5   6   7
PWM    0   1   4   6  10  15  21  28

What that means is that
step 2 corresponds to a duty cycle of 4/28
step 5 corresponds to a duty cycle of 15/28
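
The table can be held as a small lookup array. A minimal sketch in C, assuming the eight PWM counts listed above and a 28-count PWM period; `gamma_pwm`, `PWM_PERIOD`, `step_to_pwm` and `step_to_duty_percent` are illustrative names and not taken from the actual project source:

```c
#include <stdint.h>

/* PWM on-counts copied from the table above (steps 0..7). */
#define PWM_PERIOD 28u
static const uint8_t gamma_pwm[8] = { 0, 1, 4, 6, 10, 15, 21, 28 };

/* Illustrative helper: on-count for a brightness step (step is masked to 0..7). */
static uint8_t step_to_pwm(uint8_t step)
{
    return gamma_pwm[step & 7u];
}

/* Illustrative helper: duty cycle in percent, rounded down. */
static uint8_t step_to_duty_percent(uint8_t step)
{
    return (uint8_t)((step_to_pwm(step) * 100u) / PWM_PERIOD);
}
```

With these values step 5 gives 15 counts out of 28, roughly a 53% duty cycle, and step 7 is fully on.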

It looks great. I will add an updated video later.

thx tubos

nickelflippr · Mar 4, 2011

Really nice!

I will be referring back to this post and techniques employed in the future (for an 8x8RGB display). It's pretty cool when you adjust from a log to linear response, it's been brought up before on the forum here.

Mike - K8LH · Mar 4, 2011

tubos,

Congratulations. I'd like to see what you ended up with (source and generated assembler). You should also be sure to add a post over at Forum.Microchip to thank those chaps for their help.

Cheerful regards, Mike

Mike - K8LH · Mar 5, 2011

dougy83 said:

Using the hw SPI makes the send run in parallel (i.e. in the background) so you can continue to do useful stuff with the CPU.

Also taking the calc/precalc out of loops will increase performance. Something like the following may work.

Code:

void interrupt isr()
{
    // setup the BAM-specific interrupt spacing
    interval >>= 1;
    if(interval == 0)
    {
        interval |= 128;
        row = (row + 1) & 0x07;     // 0..7
        rowMask <<= 1;
        if(rowMask == 0)
            rowMask++;
    }

    TMRxINTERVAL = interval;    // TMRx has an appropriate prescaler

    // send the first 8 bits
    ROWSEL_PINS = 0;
    CS_PIN = 0;     // whatever the latch of the 595 is to shift the data
    TXREG = redPrecalc;

    redPrecalc = 0;     // clear before building the next byte
    char *ptr = &red[row];
    if(*ptr++ & interval)
        redPrecalc |= 1;
    if(*ptr++ & interval)
        redPrecalc |= 2;
    if(*ptr++ & interval)
        redPrecalc |= 4;
    if(*ptr++ & interval)
        redPrecalc |= 8;
    if(*ptr++ & interval)
        redPrecalc |= 16;
    if(*ptr++ & interval)
        redPrecalc |= 32;
    if(*ptr++ & interval)
        redPrecalc |= 64;
    if(*ptr++ & interval)
        redPrecalc |= 128;

    // possibly wait for tx to finish here if the precalc above took < 32 cycles
    TXREG = grnPrecalc;

    grnPrecalc = 0;     // clear before building the next byte
    ptr = &grn[row];
    if(*ptr++ & interval)
        grnPrecalc |= 1;
    if(*ptr++ & interval)
        grnPrecalc |= 2;
    if(*ptr++ & interval)
        grnPrecalc |= 4;
    if(*ptr++ & interval)
        grnPrecalc |= 8;
    if(*ptr++ & interval)
        grnPrecalc |= 16;
    if(*ptr++ & interval)
        grnPrecalc |= 32;
    if(*ptr++ & interval)
        grnPrecalc |= 64;
    if(*ptr++ & interval)
        grnPrecalc |= 128;

    // possibly wait for tx to finish here if the above precalc took < 32 cycles
    CS_PIN = 1;        // latch the data or something
    ROWSEL_PINS = rowMask;    // enable the row
}

Doug,

Your example using SPI is looking better and better as I study the bottleneck problem more and more. If you can really burst 8 bits through SPI in 32 cycles (or faster?) then I suspect a fully developed driver based on this method might be about as fast as you can get. By comparison, I posted a bit banged 8-bit (256 step) driver over on the Microchip forum that uses 80 cycles (5-usecs) out of each 128-cycle (8-usec) interrupt.

Cheerful regards, Mike

dougy83 · Mar 5, 2011

Mike said:
Doug,

Your example using SPI is looking better and better as I study the bottleneck problem more and more. If you can really burst 8 bits through SPI in 32 cycles (or faster?) then I suspect a fully developed driver based on this method might be about as fast as you can get. By comparison, I posted a bit banged 8-bit (256 step) driver over on the Microchip forum that uses 80 cycles (5-usecs) out of each 128-cycle (8-usec) interrupt.

Cheerful regards, Mike

Well, I'm glad you don't hate the idea.

I have a feeling that Fosc/16 is the fastest SPI rate for 18F uC, so under 32 ins. cy. isn't possible, but getting the precalc under 32 cy. probably isn't possible either (each comparison will take 4 cycles: movfw postinc0/andfw interval/btfss status,z/bsf precalc,bit). So you're right, it should be in the same ballpark as the parallel-load method - quite surprising!
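
For reference, the per-bit precalc being costed here can be written as a portable C routine. This is only a sketch of the bit-angle packing step, assuming one byte of brightness per LED and eight LEDs per row; `bam_pack` and `bam_next_interval` are illustrative names, and on the PIC the loop would be unrolled as in the quoted ISR:

```c
#include <stdint.h>

/* Pack one bit plane of an 8-pixel row: bit i of the result is set when
   pixel i has the bit selected by `mask` set (mask = 128, 64, ..., 1). */
static uint8_t bam_pack(const uint8_t px[8], uint8_t mask)
{
    uint8_t out = 0;
    uint8_t i;
    for (i = 0; i < 8; i++) {
        if (px[i] & mask)
            out |= (uint8_t)(1u << i);
    }
    return out;
}

/* Next BAM interval: 128 -> 64 -> ... -> 1 -> 128 (wraps at the end of a row). */
static uint8_t bam_next_interval(uint8_t interval)
{
    interval >>= 1;
    return interval ? interval : 128u;
}
```

Each of the eight tests costs a load, an AND, a skip and a bit set, which is where the estimate of about 4 cycles per comparison comes from.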

tubos · Mar 5, 2011

One thing I noticed: it takes a lot of instructions
to point my pointer at a different row.
See below:

Code:

pG = &Dgrn[Irow];    // point to Array at Irow

compiles to this:

Code:

L__interrupt66:
;bi88pwm.c,196 ::                 pG = &Dgrn[Irow];    // point to Array at Irow
        MOVLW       3
        MOVWF       R2 
        MOVF        R3, 0 
        MOVWF       R0 
        MOVLW       0
        MOVWF       R1 
        MOVF        R2, 0 
L__interrupt67:
        BZ          L__interrupt68
        RLCF        R0, 1 
        BCF         R0, 0 
        RLCF        R1, 1 
        ADDLW       255
        GOTO        L__interrupt67
L__interrupt68:
        MOVLW       _Dgrn+0
        ADDWF       R0, 0 
        MOVWF       _pG+0 
        MOVLW       hi_addr(_Dgrn+0)
        ADDWFC      R1, 0 
        MOVWF       _pG+1

Not a real problem now, as I set the pointer outside the loops.

dougy83 · Mar 5, 2011

tubos said:
One thing I noticed: it takes a lot of instructions
to point my pointer at a different row.
See below:

Code:

pG = &Dgrn[Irow]; // point to Array at Irow

I remember seeing something similar in PICC when I wanted to multiply by some power of 2 - instead of using a multiply instruction it "optimised" it to use a shift left loop. I don't know if it's possible to get around it (even specifying pG = &Dgrn + iRow*8 will likely give the same rubbish). If you're concerned (sounds like you're not), you can use inline assembler to force the use of the multiplier.
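
Another way to sidestep the shift loop, without inline assembler, is to never recompute the row address at all and just step the pointer by one row each time. A rough sketch, assuming the green data is stored as 8 rows of 8 bytes in one flat array; the flat layout and the `next_row` helper are assumptions for illustration, while `Dgrn`, `pG` and `Irow` reuse the names from the post:

```c
#include <stdint.h>

#define ROWS 8
#define COLS 8

/* Assumed layout: one byte per LED, 8 bytes per row, rows stored back to back. */
static uint8_t Dgrn[ROWS * COLS];

/* Row pointer and row index kept between interrupts. */
static uint8_t *pG = Dgrn;
static uint8_t Irow = 0;

/* Advance to the next row with an add and a wrap test instead of
   recomputing &Dgrn[Irow * 8], which needs a multiply or shift loop. */
static void next_row(void)
{
    if (++Irow == ROWS) {
        Irow = 0;
        pG = Dgrn;
    } else {
        pG += COLS;
    }
}
```

This is the same idea as the `rowaddr += 8` step in the assembler-based ISR posted in this thread.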

Mike - K8LH · Mar 5, 2011

dougy83 said:
... but getting the precalc under 32 cy. probably isn't possible either (each comparison will take 4 cycles: movfw postinc0/andfw interval/btfss status,z/bsf precalc,bit). ...

It looks more like 20 cycles per precalc to me (below) so you would probably need to test for TX buffer ready before sending the second 8 bits and test for TX complete before strobing shift register data onto the outputs.

Code:

  void interrupt()                 // 8-usec (128 cycle) interrupts
  { pir1.TMR2IF = 0;               // clear interrupt flag
    txreg = Precalc;               //
    Precalc = 0;                   //
    fsr2 = rowaddr;                // fsr2 = &red[row][0] (0x100..0x13F)
    asm                            //
    { movf    _interval,W          // interval, 0..255
      cpfsgt  _postinc2            // if(interval >= red[row][0]) {
      bsf     _Precalc,0           //   Precalc |= 1; }
      cpfsgt  _postinc2            // if(interval >= red[row][1]) {
      bsf     _Precalc,1           //   Precalc |= 2; }
      cpfsgt  _postinc2            // if(interval >= red[row][2]) {
      bsf     _Precalc,2           //   Precalc |= 4; }
      cpfsgt  _postinc2            // if(interval >= red[row][3]) {
      bsf     _Precalc,3           //   Precalc |= 8; }
      cpfsgt  _postinc2            // if(interval >= red[row][4]) {
      bsf     _Precalc,4           //   Precalc |= 16; }
      cpfsgt  _postinc2            // if(interval >= red[row][5]) {
      bsf     _Precalc,5           //   Precalc |= 32; }
      cpfsgt  _postinc2            // if(interval >= red[row][6]) {
      bsf     _Precalc,6           //   Precalc |= 64; }
      cpfsgt  _indf2               // if(interval >= red[row][7]) {
      bsf     _Precalc,7           //   Precalc |= 128; }
    }

    while(!pir1.TXIF);             // is this right?
    txreg = Precalc;               //
    Precalc = 0;                   //
    fsr2 |= 64;                    // fsr2 = &grn[row][7] (0x140..0x17F)
    asm                            //
      movf    _interval,W          // interval, 0..255
      cpfsgt  _postdec2            // if(interval >= grn[row][7]) {
      bsf     _Precalc,7           //   Precalc |= 128; }
      cpfsgt  _postdec2            // if(interval >= grn[row][6]) {
      bsf     _Precalc,6           //   Precalc |= 64; }
      cpfsgt  _postdec2            // if(interval >= grn[row][5]) {
      bsf     _Precalc,5           //   Precalc |= 32; }
      cpfsgt  _postdec2            // if(interval >= grn[row][4]) {
      bsf     _Precalc,4           //   Precalc |= 16; }
      cpfsgt  _postdec2            // if(interval >= grn[row][3]) {
      bsf     _Precalc,3           //   Precalc |= 8; }
      cpfsgt  _postdec2            // if(interval >= grn[row][2]) {
      bsf     _Precalc,2           //   Precalc |= 4; }
      cpfsgt  _postdec2            // if(interval >= grn[row][1]) {
      bsf     _Precalc,1           //   Precalc |= 2; }
      cpfsgt  _indf2               // if(interval >= grn[row][0]) {
      bsf     _Precalc,0           //   Precalc |= 1; }
    }

    if(++interval == 0)            // if end-of-period
    { asm {                        //
      rlncf   _rowsel,F            // advance row select bit mask
      }                            //
      latc = ~rowsel;              // select new row (active low)
      rowaddr += 8;                // prep for next row
      rowaddr &= 0b10111111;       // 0x100..0x13F, inclusive
    }
    while(!pir1.TXIF);             // is this right?
    stb = 1; stb = 0;              // latch data onto outputs
  }
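
The cpfsgt/bsf pairs above amount to the following C, shown here only as a readable reference for the comparison logic. It is a sketch under the assumption of eight one-byte brightness values per row; `pwm_pack` is an illustrative name and does not appear in the driver:

```c
#include <stdint.h>

/* Compare-based PWM packing: bit i of the result is set when the running
   counter `interval` has reached the stored value for pixel i. */
static uint8_t pwm_pack(const uint8_t px[8], uint8_t interval)
{
    uint8_t out = 0;
    uint8_t i;
    for (i = 0; i < 8; i++) {
        if (interval >= px[i])    /* cpfsgt skips the bsf when px[i] > interval */
            out |= (uint8_t)(1u << i);
    }
    return out;
}
```

Whether a set bit turns an LED on or off depends on the polarity of the column drivers, which is not stated in the thread.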

dougy83 · Mar 5, 2011

Mike said:
It looks more like 20 cycles per precalc to me (below) so you would probably need to check SPI buffer ready...

I'm not actually familiar with pic18 assembler, so maybe I should keep my mouth shut before making assumptions about the language. Well, as you've said, it's possible to do in 20 cy/byte. So that you don't need to waste time waiting for the SPI to finish, just delay the latching of the SPI data until the next ISR, i.e.

Code:

void interrupt()                 // 8-usec (128 cycle) interrupts
{
    rowSel = nothing;
    stb = 1; stb = 0;              // latch data onto outputs from last isr
    rowSel = interval;

    // setup next interrupt period here...

    TXREG = precalcRed;
    
    precalcRed = { your fancy precalc stuff }
    tempPrecalcGrn = { your fancy precalc stuff }

    // > 40 cycles have now passed, so we know the SPI has sent the first byte
    TXREG = precalcGrn;
    precalcGrn = tempPrecalcGrn;

    // update the interval variable here...
}
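
The effect of deferring the latch can be seen in a small host-side model. This is a sketch only, assuming a 595-style register with a separate output latch and reducing the transfer to a single byte per interrupt; `sr595_t`, `sr_send`, `sr_strobe` and `isr_step` are hypothetical names used for illustration:

```c
#include <stdint.h>

/* Hypothetical model of a shift register with an output latch. */
typedef struct {
    uint8_t shift;   /* data clocked in over SPI, not yet visible */
    uint8_t latch;   /* data currently driving the LEDs */
} sr595_t;

/* Strobe: copy the shifted-in data onto the outputs. */
static void sr_strobe(sr595_t *sr) { sr->latch = sr->shift; }

/* Send: the SPI transfer runs in the background and ends up in the shift stage. */
static void sr_send(sr595_t *sr, uint8_t b) { sr->shift = b; }

/* One interrupt: latch what the previous interrupt sent, then start the
   next transfer. Returns the byte now visible on the outputs. */
static uint8_t isr_step(sr595_t *sr, uint8_t next_byte)
{
    sr_strobe(sr);
    sr_send(sr, next_byte);
    return sr->latch;
}
```

The outputs always lag the transmitted data by one interrupt, so the ISR never has to wait for the SPI to finish.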

Mike - K8LH · Mar 5, 2011

tubos said:
One thing I noticed: it takes a lot of instructions
to point my pointer at a different row.

Is there a lot of overhead with pointers like there is with arrays?

Mike - K8LH · Mar 5, 2011

dougy83 said:
... So that you don't need to waste time waiting for the SPI to finish, just delay the latching of the SPI data until the next ISR ...

That's genius Doug! That could potentially reduce interrupt overhead from 70-80 cycles to perhaps 50-60 cycles. Man, if you weren't "assembler challenged" (grin), there'd be no stopping you (lol)...

Cheerful regards, Mike

tubos · Mar 5, 2011

Mike said:
Is there a lot of overhead with pointers like there is with arrays?

Well, it's a bit less, but in my case it pays off because I do it only once per row.

That hw SPI stuff won't work when I use the SPI library from mikroC, I suppose?
