# Can MPLABX simulator count processor cycles?

#### BobW

##### Active Member
In a PIC16F688 assembly language program, I'm trying to debug a time delay subroutine. It's just some nested loops that use up processor cycles. I need a one second delay—actually 999997 cycles which leaves 3 cycles for the main program to stop Timer_0. When I run it, it seems to use a couple more processor cycles than I've calculated. I was hoping the MPLABX simulator would have a facility for counting cycles, but if it does, then I haven't figured out how and where.

I've written my own PIC simulator in BASIC and have just added a cycle counter feature to it in order to test this routine. It shows two more cycles than I'd calculated, but that could be a bug in my BASIC simulator. I'd rather not compound my problems (and bugs) with the BASIC simulator. I'd prefer to do this in MPLABX if possible.

Anyone know if there's a cycle counter, and if so, how to access it?

#### Pommie

##### Well-Known Member
See if the menu item Window->Debugging->Stopwatch is available. I've not figured out when it's available: on some projects it is, on others not!

Mike.
Edit: I just worked out it's only available in the simulator - lucky you.

##### Active Member
Are you sure, Mike? The stopwatch seems OK whenever I use it. It runs between breakpoints on my computer and shows the cycle count.

#### BobW

##### Active Member
Thanks for pointing out the Stopwatch tool. Unfortunately, when I click on stopwatch properties, everything is greyed out and it says "Not supported for this device/tool".

Anyway, it appears that the cycle counter in my Basic simulator is accurate after all. After adjusting the code to give the required 999997 cycles in the Basic simulator, I loaded into the PIC and ran it. The timing appears to be correct (or very close). I'll need to do more testing to make sure.

#### BobW

##### Active Member
Update:
Even though the stopwatch properties window can't be configured, the stopwatch is working. It shows the accumulated cycle count whenever a breakpoint is encountered. That should be all I need.

##### Active Member
I always test delays; it's easy to get them too long or too short, even with the built-in __delay_ms(1).

#### BobW

##### Active Member
Well, something very strange is happening. The delay routine which I'd calculated on a spreadsheet to have a delay of 999997 cycles was shown to have 736 more than that in the simulator. Fair enough. It's easy to make a mistake with a hand calculation. So, I tweaked the loop count in the simulator and padded in some nop's to get the required 999997 cycles. Then programmed it into the PIC and found that the time delay was too short. Suspiciously short by around 730 cycles or so. So I went back to my original delay routine, and it was almost dead on.

#### Mike - K8LH

##### Well-Known Member
Are you doing this delay subsystem in assembly language or C, Bob?

#### BobW

##### Active Member
This is in assembly language. The project is a frequency counter with 1 Hz resolution, which is why I need a precise 1 second delay. I rewrote the delay routine last night to make it a bit simpler, but there's still a 738 cycle discrepancy that I can't explain.
I have to be away for a couple of hours, but when I return, I'll post more details (plus code), and maybe someone can point out what I've missed.

#### Mike - K8LH

##### Well-Known Member
That's great, Bob. I was thinking about re-doing my 50-MHz counter with an OLED display on an 8-pin PIC.

I would be happy to share a cycle accurate fixed delay subsystem I wrote some time ago. It supports almost any clock and allows for subtracting cycles (loop overhead, etc.).

You need to set the 'clock' variable (in the DelayCy macro) and set the 'dloop' variable to accommodate your longest delay (number of cycles divided by 65536 plus 1).

Does the following code make sense?

Code:
        cblock  0x70
delayhi                         ; DelayCy() subsystem variable
        endc
;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
;  K8LH DelayCy() subsystem macro generates four instructions     ~
;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
clock   equ     4               ; 4, 8, 12, 16, 20 (MHz), etc.
usecs   equ     clock/4         ; cycles/microsecond multiplier
msecs   equ     usecs*1000      ; cycles/millisecond multiplier
dloop   equ     16              ; loop size, minimum 5 cycles
;
;  -- loop --  -- delay range --  -- memory overhead ----------
;  5-cyc loop, 11..327690 cycles,  9 words (+4 each macro call)
;  6-cyc loop, 11..393226 cycles, 10 words (+4 each macro call)
;  7-cyc loop, 11..458762 cycles, 11 words (+4 each macro call)
;  8-cyc loop, 11..524298 cycles, 12 words (+4 each macro call)
;
DelayCy macro   cycles          ; range, see above
        if (cycles<11)|(cycles>(dloop*65536+10))
        error " DelayCy range error "
        else
        movlw   high((cycles-11)/dloop)+1
        movwf   delayhi
        movlw   low ((cycles-11)/dloop)
;       rcall   uLoop-(((cycles-11)%dloop)*2)    ; (18F version)
        call    uLoop-((cycles-11)%dloop)        ; (16F version)
        endif
        endm

;******************************************************************
;  reset vector                                                   *
;******************************************************************
        org     0x000

v_reset
        DelayCy(1000*msecs-3)   ; delay 1 second minus 3 cycles
        nop                     ; insert break point here

;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
;  K8LH DelayCy() subsystem 16-bit uLoop subroutine               ~
;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

a = dloop-1
        while a > 0
        nop                     ; (cycles-11)%dloop entry points  |??
a -= 1
        endw
uLoop   addlw   -1              ; subtract 'dloop' loop time      |??
        skpc                    ; borrow? no, skip, else          |??
        decfsz  delayhi,F       ; done?  yes, skip, else          |??
;       bra  uLoop-dloop*2+10   ; do another loop (18F version)   |
        goto uLoop-dloop+5      ; do another loop (16F version)   |??
        return                  ;                                 |??
;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        end
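
One way to sanity-check the macro arithmetic without a simulator is to model the cycle cost of each instruction in Python. The sketch below is only a model of the 16F code path above, not part of the original post: the function names are made up for illustration, and it assumes the usual mid-range PIC timing (1 cycle per instruction, 2 cycles for `call`, `goto`, `return` and for a taken skip).

```python
def delaycy_operands(cycles, dloop=16):
    """Mirror the operand arithmetic of the DelayCy macro (hypothetical helper)."""
    if cycles < 11 or cycles > dloop * 65536 + 10:
        raise ValueError("DelayCy range error")
    loops = (cycles - 11) // dloop
    delayhi = (loops >> 8) + 1          # high((cycles-11)/dloop)+1
    w = loops & 0xFF                    # low((cycles-11)/dloop)
    entry_nops = (cycles - 11) % dloop  # call target is uLoop minus this many nops
    return delayhi, w, entry_nops


def simulate_delaycy(cycles, dloop=16):
    """Count cycles from the first movlw of the macro through the return."""
    delayhi, w, entry_nops = delaycy_operands(cycles, dloop)
    total = 3 + 2                 # movlw, movwf, movlw, then call (2 cycles)
    total += entry_nops           # nops executed before reaching uLoop
    while True:
        total += 1                # addlw -1
        borrow = (w == 0)         # carry is clear only when W was zero
        w = (w - 1) & 0xFF
        if not borrow:
            total += 2            # skpc skips the decfsz
        else:
            total += 1            # skpc falls through
            delayhi = (delayhi - 1) & 0xFF
            if delayhi == 0:
                total += 2        # decfsz skips the goto
                total += 2        # return
                return total
            total += 1            # decfsz, no skip
        total += 2                # goto uLoop-dloop+5
        total += dloop - 5        # nops between the goto target and uLoop


print(delaycy_operands(999997))   # operands for DelayCy(1000*msecs-3)
print(simulate_delaycy(999997))   # modelled cycle count
```

Under those assumptions every pass through the loop costs exactly `dloop` cycles whichever branch is taken, so the modelled total equals the requested cycle count across the macro's range.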

#### BobW

##### Active Member
Hi Mike. Thanks for the code. Unfortunately, the assembler is throwing an error on this line:
Code:
       goto uLoop-dloop+5      ; do another loop (16F version)   |??
Error[151] : Operand contains unresolvable labels or is too complex

I've encountered this error in the past too, when the expression involves subtraction of an address reference. I've never understood why it happens though. Anyway, I just changed the line to
goto uLoop-11

The MPLABX simulator verifies the cycle count to be 999997 cycles. So, I no longer doubt the stopwatch's accuracy.

Unfortunately, I wouldn't be able to use your code without significant modifications, because the delay routine needs to include code to check the TMR0 overflow bit approximately every 1 ms. With a 1 second gate time, the counter will overflow several hundred times when measuring an input frequency of 50 MHz. So the overflows must be detected and counted.
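
The "several hundred" overflows can be checked with a little arithmetic. The sketch below assumes a 1:256 prescaler ahead of the 8-bit TMR0; that setting is not stated in this post, but it is consistent with the roughly 1.3 ms overflow time quoted in the code comments.

```python
F_MAX_HZ = 50_000_000   # highest input frequency mentioned in the thread
PRESCALER = 256         # assumption: 1:256 prescaler ahead of TMR0
TMR0_COUNTS = 256       # TMR0 is an 8-bit counter
GATE_S = 1              # 1 second gate time

counts_per_overflow = PRESCALER * TMR0_COUNTS             # input pulses per T0IF
overflow_period_ms = counts_per_overflow / F_MAX_HZ * 1000
overflows_per_gate = (F_MAX_HZ * GATE_S) // counts_per_overflow

print(f"{overflow_period_ms:.3f} ms between overflows")
print(overflows_per_gate, "overflows during the gate")
```

With these numbers the flag sets about every 1.31 ms, so polling it once per millisecond catches every overflow, and 762 overflows accumulate over the 1 second gate.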

I've now tested my revised delay routine in both the MPLABX simulator and my Basic simulator, and they both agree with my hand calculation of 999997 cycles, so I'm going to conclude that this is the true cycle count. There's still the puzzling apparent discrepancy of 738 cycles in the actual program operation, but for now I'm going to assume that this is due to a bug elsewhere in the program, and I will focus on that.

FYI, this is my current version of the 999997 cycle delay with the Tmr0 overflow counter code.
Code:
; 1 second delay constants for 4 MHz clock
Dly1S_CountH equ d'4'
Dly1S_CountL equ d'36'
Tweak_Count equ d'43'
; 1 ms delay constants for 4 MHz clock
Delay1msLong equ d'1'
Delay1msShort equ d'72'

Delay1S  ; 2 cycles for subroutine call from main program
bank0  ; 1 cycle
clrf acc2  ; 1 cycle
clrf acc3  ; 1 cycle
movlw Dly1S_CountH  ; 1 cycle
movwf r3  ; 1 cycle
movlw Dly1S_CountL  ; 1 cycle
movwf r2  ; 1 cycle

; Main Loop execution count = 256*(Dly1S_CountH - 1) + (256 - Dly1S_CountL)
; Dly1S_CountH must not be zero (0 actually counts as 256)
; therefore with values 4 & 36, loop executes 256*(4 - 1)+ (256 - 36) = 988 times
; Total cycles = 11 + 1012*LoopCount + 3*TweakCount + (0,1 or 2 nop padding)

Delay1S_Lp
call Delay1mS  ; 997 cycles - this 1ms delay routine is thoroughly debugged
; At Fmax (50 MHz) it takes approx. 1.3ms (1300 cycles) to overflow the timer
; So the overflow bit must be checked after every execution of Delay1mS
btfsc intcon,T0IF  ; 1 cycle
goto Dly1sOflw2  ; 2 cycles if branch taken, 1 cycle if not

;No overflow branch - pad 3 cycles here so both branches match
nop  ; 1 cycle
goto $+1  ; 2 cycle nop
goto Dly1sCont  ; 2 cycles

;Timer overflow handler branch
Dly1sOflw2  ;2 cycles for goto to get here
bcf intcon,T0IF  ; 1 cycle, clear TMR0 o-flow bit
incf acc2,f  ; 1 cycle
btfsc status,z  ; 1 cycle
incf acc3,f  ; 1 cycle

Dly1sCont  ;Convergence point for above branches
;Decrement 16 bit loop counter and test for zero
decf r3,f  ; 1 cycle
incfsz r2,f  ;1 cycle
incf r3,f  ; 1 cycle
movf r2,w  ; 1 cycle
iorwf r3,w  ; 1 cycle
btfss status,z  ; 1 cycle
goto Delay1S_Lp  ; 2 cycles
; -1 cycle (Adjust for skipped goto on final iteration)

;Main delay loop is complete. Finish with short adjustment loop
movlw Tweak_Count  ; 1 cycle
movwf r2  ; 1 cycle
Dly1S_twk  ;Final tweak loop
decfsz r2,f  ; 1 cycle
goto Dly1S_twk  ; 2 cycles
; -1 cycle (Adjust for skipped goto on final iteration)
nop  ; 1 cycle
return  ; 2 cycles

Delay1mS  ; 00002 cycles for subroutine call
movlw Delay1msLong  ; 00001 cycle - load long delay count
call delay  ; 00774 cycles
movlw Delay1msShort  ; 00001 cycle - load short delay count
movwf dlyctrL  ; 00001 cycle
decfsz dlyctrL,f  ; 3cycles/iteration
; = 00216 total cycles (00744 for 4MHz)
goto $-1  ; -0001 cycle for loop exit
nop  ; 00001 cycle short tweak
return  ; 00002 cycles, total 997 cycles
; Variable Delay subroutine: delay cycles = 770*w+4
delay
; 00002 cyc for entry call
movwf dlyctrH ; 00001 cyc, w contains delay value
subloop
decfsz dlyctrL,f ; 00768 cyc for loop * w
goto $-1     ; -0001 cyc for loop exit * w
decfsz dlyctrH,f ; 00001 * w
goto subloop ; 00002 * w
; -0001 for loop exit
return     ; 00002
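
The hand calculation can be re-done mechanically. The Python sketch below adds up the per-instruction cycle comments from the listing above; the function names are illustrative only. It assumes the `bank0` macro is a single 1-cycle instruction (as the comment says), that `dlyctrL` is zero on entry to `delay`, and that the tweak loop costs 3 cycles per pass (`decfsz` plus `goto`).

```python
def delay_cycles(w):
    """'delay' subroutine, including its call and return."""
    inner = 256 * 3 - 1               # decfsz/goto $-1, last decfsz skips
    outer = w * (inner + 1 + 2) - 1   # decfsz dlyctrH + goto subloop, last skips
    return 2 + 1 + outer + 2          # call, movwf, loops, return


def delay_1ms_cycles(long_count=1, short_count=72):
    total = 2                          # call Delay1mS
    total += 1                         # movlw Delay1msLong
    total += delay_cycles(long_count)  # call delay ... return
    total += 2                         # movlw Delay1msShort, movwf dlyctrL
    total += 3 * short_count - 1       # decfsz/goto $-1 loop
    total += 1                         # nop
    total += 2                         # return
    return total


def delay_1s_cycles(count_h=4, count_l=36, tweak=43, pad_nops=1):
    loops = 256 * (count_h - 1) + (256 - count_l)
    per_loop = delay_1ms_cycles() + 7 + 8   # 1 ms call, T0IF branch, counter test
    total = 2 + 7                      # call, bank0, 2 x clrf, 2 x movlw/movwf
    total += per_loop * loops - 1      # final btfss skips the goto
    total += 2                         # movlw Tweak_Count, movwf r2
    total += 3 * tweak - 1             # tweak loop, final decfsz skips the goto
    total += pad_nops                  # nop padding
    total += 2                         # return
    return loops, per_loop, total


print(delay_cycles(1), delay_1ms_cycles(), delay_1s_cycles())
```

On those assumptions the model gives 988 passes of 1012 cycles each and a total of 999997 cycles for the posted constants.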

#### Mike - K8LH

##### Well-Known Member
> Error[151] : Operand contains unresolvable labels or is too complex

I wonder if you're using 'relocatable' code rather than 'absolute' code?

> Unfortunately, I wouldn't be able to use your code without significant modifications, because the delay routine needs to include code to check the TMR0 overflow bit approximately every 1 ms. With a 1 second gate time, the counter will overflow several hundred times when measuring an input frequency of 50 MHz. So the overflows must be detected and counted.

I use isochronous code in a 1-mS loop in order to account for TMR0 overflows in my frequency counter, too. I gate the counter by toggling the T0CKI pin data direction between 'input' and 'output' and the TMR0 prescaler is set to 256. Here's an excerpt from a counter that uses a 200-mS gate time (5-Hz resolution) and a 24-bit counter:
Code:
;
;  count frequency for precisely 200 msecs (5 Hz resolution)
;
NewCount
        clrf    TMR0            ; clear TMR0 and prescaler        |B0
        clrf    countl          ; clear 24 bit counter registers  |B0
        clrf    counth          ;                                 |B0
        clrf    countu          ;                                 |B0
        movlw   high(200)+1     ;                                 |B0
        movwf   msctrh          ;                                 |B0
        movlw   low(200)        ;                                 |B0
        movwf   msctrl          ; gate timer = 200 msecs          |B0
        movlw   TRISA           ;                                 |B0
        movwf   FSR             ; setup indirect access to TRISA  |B0
        bsf     INDF,4          ; TRISA.4 (T0CKI) = 1, gate "on"  |B0
GateLoop
        setz                    ; set Z = 1                       |B0
        btfsc   INTCON,TMR0IF   ; TMR0 overflow? no, skip, else   |B0
        incf    countu,F        ; bump CountU, Z = 0              |B0
        skpz                    ; TMR0 overflow? no, skip, else   |B0
        bcf     INTCON,TMR0IF   ; clear TMR0 interrupt flag       |B0
        DelayCy(1*msecs-10)     ; delay 1 msec minus 10 cycles    |B0
        decf    msctrl,F        ;                                 |B0
        skpnz                   ;                                 |B0
        decfsz  msctrh,F        ;                                 |B0
        goto    GateLoop        ; loop again                      |B0
        bcf     INDF,4          ; TRISA.4 = 0, gate "off"         |B0
        btfsc   INTCON,TMR0IF   ; TMR0 overflow? no, skip, else   |B0
        incf    countu,F        ; bump CountU                     |B0
        bcf     INTCON,TMR0IF   ; clear TMR0 interrupt flag       |B0
;
;  toggle T0SE to flush the prescaler & retrieve the count
;
        movf    TMR0,W          ;                                 |B0
        movwf   counth          ; save TMR0 value                 |B0
Flush   bsf     STATUS,RP0      ; bank 1                          |B1
        bcf     OPTION_REG,T0SE ; clock on rising edge            |B1
        bsf     OPTION_REG,T0SE ; clock on falling edge           |B1
        bcf     STATUS,RP0      ; bank 0                          |B0
        decf    countl,F        ; decrement counter LSB           |B0
        movf    counth,W        ;                                 |B0
        xorwf   TMR0,W          ; prescaler overflow into TMR0?   |B0
        bz      Flush           ; no, clock it again              |B0
;
Good luck on your project. Cheerful regards, Mike, K8LH
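
The prescaler cannot be read directly, which is why the Flush loop above clocks it until it rolls over into TMR0. The Python sketch below models that idea. It assumes each `bcf`/`bsf` pair on T0SE advances the prescaler by exactly one count; the function names are illustrative and not from the post.

```python
def flush_prescaler(prescaler):
    """Model the Flush loop: return (countl, pulses) for a hidden prescaler value."""
    countl = 0                  # cleared at NewCount
    tmr0_saved = 0              # stands in for the TMR0 value saved in counth
    tmr0 = tmr0_saved
    pulses = 0
    while True:
        prescaler = (prescaler + 1) & 0xFF   # one injected clock edge
        if prescaler == 0:
            tmr0 = (tmr0 + 1) & 0xFF         # prescaler rolled over into TMR0
        pulses += 1
        countl = (countl - 1) & 0xFF         # decf countl,F
        if tmr0 != tmr0_saved:               # xorwf TMR0,W / bz Flush
            return countl, pulses


def frequency_hz(countu, counth, countl, gate_ms=200):
    """Combine the three bytes and scale by the gate time."""
    count = (countu << 16) | (counth << 8) | countl
    return count * 1000 // gate_ms


print(flush_prescaler(0x90))             # countl recovers the hidden value
print(frequency_hz(0x0C, 0x35, 0x00))    # 800000 counts in 200 ms
```

Because `countl` starts at zero and is decremented once per injected pulse, it ends up equal to the prescaler value that was hidden when the gate closed.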

#### rjenkinsgb

##### Well-Known Member
Why not use another counter - or any spare peripheral that can produce interrupts at a suitable rate - to give an absolute time reference?

That avoids all the software timing and allows you to do other things in the background if you wish. Once you get the initial setup working, it's a much simpler and more versatile approach.

eg. If there is no suitable spare timer and you are not using a UART, you could set that up for 300 baud one stop bit and keep the TX buffer loaded to get a 30Hz clock interrupt.
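
The 30 Hz figure follows from the frame length. A quick check, assuming 8 data bits and no parity (the post only specifies the baud rate and one stop bit):

```python
BAUD = 300
START_BITS = 1
DATA_BITS = 8      # assumption: 8 data bits, no parity
STOP_BITS = 1

bits_per_frame = START_BITS + DATA_BITS + STOP_BITS
tx_interrupts_per_second = BAUD / bits_per_frame   # one interrupt per frame sent
ticks_for_one_second = round(tx_interrupts_per_second)

print(bits_per_frame, tx_interrupts_per_second, ticks_for_one_second)
```

Counting 30 transmit interrupts would then mark one second, provided the TX buffer is never allowed to run empty.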

Other than things such as LCD initialisation routines, I've never used software delays or wait-for-something loops beyond microseconds in any design over decades; I just have a master clock interrupt at a suitable rate and all timing & most i/o polling etc. is controlled by that.

#### BobW

##### Active Member
I agree that using a delay loop for a 1 second time delay does seem clunky, but at this point, I'd like to find out where the bug is, rather than abandon this method and use a second counter. The changes would be massive. Counters and interrupts introduce other complications, as well.

I guess it all comes down to going with what I'm familiar with. I've built dozens of PIC based frequency counters using a delay loop, usually 1ms, and have never had a problem. In fact, the current project is a dual mode counter which alternates between a 1ms gate time (1 kHz resolution), and 1 second gate time (1 Hz resolution). The 1 ms frequency count is dead on, while the 1 second frequency count is always off by a consistent percentage. If I add 736 cycles to the 1 second gate, then the 1 Hz resolution counter is also dead on. The weird thing is that I can think of all kinds of reasons why the delay routine might run longer than expected, but I'm absolutely baffled as to why it would be shorter, especially by 736 cycles. This is less than the number of cycles that would accumulate in one iteration of the main loop, but more than could accumulate in the short loop. It's also the same count discrepancy that occurred with my earlier delay routine which was coded completely differently. This number doesn't seem to relate to anything that I can think of. I had thought that I may have accidentally specified a decimal value as hex or vice versa, but I've been through the code many times checking for that.

#### rjenkinsgb

##### Well-Known Member
By any chance are you counting the same instruction cycles twice in places?

eg.
goto Dly1sOflw2 ; 2 cycles if branch taken, 1 cycle if not
...
Dly1sOflw2 ;2 cycles for goto to get here

#### BobW

##### Active Member
I've gone over the hand calculation many times, checking for those kinds of errors. Also, both the MPLABX simulator and my Basic simulator give the same 999997 cycle count as my hand calculation.

One other thing that I checked, was to make sure that the PIC system clock is within spec. I'm using an external 4 MHz crystal oscillator. The crystal would have to be defective to be that far off, but testing with an HP frequency counter, the clock is within spec, and I trimmed it to zero beat with the BFO on a well aligned digital shortwave radio. As a further test, I fed the crystal clock signal into the PIC frequency counter input so that frequency errors would cancel out. In 1 ms gate mode, the count is dead on 4000 kHz. In 1 s gate mode the count is 3997072 Hz. Thinking there could be a bug in the binary to decimal conversion, I also tried skipping the conversion and displayed the count in hex. It's 3CFD90 which is indeed hex for 3997072.
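
Because the counter is measuring its own clock here, the displayed count converts directly into a gate length in instruction cycles. The sketch below does that conversion, assuming the usual PIC16 relationship of one instruction cycle per four oscillator periods; it says nothing about where the missing cycles go.

```python
F_OSC_HZ = 4_000_000          # crystal frequency, also the signal being counted
COUNT_1S = 3_997_072          # reading in 1 s gate mode
OSC_PER_CYCLE = 4             # assumption: instruction clock is Fosc/4

assert int("3CFD90", 16) == COUNT_1S      # the hex display agrees with the decimal

missing_counts = F_OSC_HZ - COUNT_1S              # input pulses not counted
gate_cycles = COUNT_1S // OSC_PER_CYCLE           # gate length in instruction cycles
short_cycles = F_OSC_HZ // OSC_PER_CYCLE - gate_cycles

print(missing_counts, gate_cycles, short_cycles)
```

The gate works out to 999268 instruction cycles, about 732 short of one second, which is in the same range as the 736 to 738 cycle discrepancy reported earlier in the thread.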

So, now I'm combing through program code examining the parts that are different between the 1 ms counter, and the 1 s counter. There's not much to look at because they share most of the same subroutines.

#### BobW

##### Active Member
Further update:
Digging through the Microchip site, there is indeed a verified bug in the MPLABX simulator where it counts cycles incorrectly. Apparently, the problem is due to the MPLABX simulator using the published info in the various processor datasheets. It turns out that the datasheets are incorrect in specifying the number of cycles for certain instructions that alter the program counter. Since my own Basic simulator also uses the info from the datasheet, this explains why it agrees with the MPLABX cycle count, and my hand calculation.

Apparently, in MPLAB 8 they based the cycle counter on actual hardware design info instead of the published datasheets, and so the MPLAB 8 simulator gives a correct count. I don't have a copy of MPLAB 8 to test this with, but I'm going to assume that this is the cause of my problem. I'll add 736 cycles to the delay routine, and call it fixed.

#### Mike - K8LH

##### Well-Known Member
Your code does not initialize the dlyctrL variable to zero prior to the first call to the delay subroutine which could affect your very first count. The dlyctrL variable will always have a zero value after the delay subroutine uses it.