Let's start with your setup. Which chip are you using, and what crystal have you used?
For instance, on a standard 8051 a 12 MHz crystal will give you a 1 µs machine cycle (12 oscillator clocks per machine cycle; most instructions take 1 or 2 machine cycles).
So look at this delay
Code:
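delay:
        mov  R1, #2         ; outer loop count
        mov  R2, #0         ; inner count: starting at 0 makes djnz run 256 times
dly:    djnz R2, dly        ; inner loop, 2 cycles per iteration
        djnz R1, dly        ; outer loop, R2 is back at 0 when it re-enters
        ret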
mov     = 1 cycle  x 2 instructions        = 2 µs (two mov statements)
djnz R2 = 2 cycles x 256 counts x 2 passes = 1024 µs (R2 counts down through all 256 values, twice)
djnz R1 = 2 cycles x 2 passes              = 4 µs (R1 counts down twice)
ret     = 2 cycles                         = 2 µs
Total   = 1032 µs, or about 1.03 ms
So for a 40 ms delay you're looking to load R1 with 78 (ish): each pass of the outer loop costs 512 + 2 = 514 µs, and 40000 / 514 is about 77.8.
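
As a rough sketch of what that looks like, here is the same routine with R1 loaded for roughly 40 ms. It assumes a classic 12-clock 8051 at 12 MHz (1 µs per machine cycle), and the label names delay40ms and d40 are made up for this example, so rename them to suit your code.

```asm
; Sketch only: assumes a classic 12-clock 8051 core and a 12 MHz crystal
; (1 us per machine cycle). Label names are made up for this example.
; Uses R1 and R2 of the current register bank.
delay40ms:
        mov  R1, #78        ; 1 cycle, outer loop count
        mov  R2, #0         ; 1 cycle, 0 makes the inner djnz run 256 times
d40:    djnz R2, d40        ; 2 cycles x 256 = 512 cycles per outer pass
        djnz R1, d40        ; 2 cycles per outer pass, R2 is already 0 here
        ret                 ; 2 cycles
; Total: 2 + 78 x 514 + 2 = 40096 cycles, about 40.1 ms (plus 2 for the call)
```

With #77 instead it comes out at 39582 cycles (about 39.6 ms), so pick whichever side of 40 ms suits you. On a different crystal, or a faster 8051 derivative with fewer clocks per machine cycle, the numbers change and you will need to redo the sum.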