To make it easy, mask the value to get the lowest 4 bits, then first check whether the result is zero by moving the value in and out of a file and testing the zero bit in the Status register (03,2).
If it is zero, call the table and output the value to create a "0". If not, decrement the file and test for zero again, incrementing the jump value for the table on each pass. When you have the value, send it to the display and call a short delay. Then blank the display.
Shift the 12-bit number 4 places right and mask the lowest 4 bits to get the 2nd number.
Shift 8 places to get the 3rd number. You have created the scan routine as well as the conversion routine with a few instructions.
You can be very smart and use one routine (3 times) to produce all the digits.
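The shift-and-mask idea can be modelled in C before committing it to PIC instructions. This is only a sketch under assumptions: the names `SEGMENTS`, `digit_of` and `segments_for` are hypothetical, and the table assumes a 7-segment display where bit 0 drives segment a through bit 6 for segment g, with a set bit lighting the segment. Your wiring may need different values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical segment table for 0..F (assumed wiring: bit 0 = segment a,
   bit 6 = segment g, a set bit lights the segment). */
static const uint8_t SEGMENTS[16] = {
    0x3F, 0x06, 0x5B, 0x4F, 0x66, 0x6D, 0x7D, 0x07,
    0x7F, 0x6F, 0x77, 0x7C, 0x39, 0x5E, 0x79, 0x71
};

/* Return one 4-bit digit of a 12-bit value.
   position 0 = lowest 4 bits, 1 = shifted 4 places, 2 = shifted 8 places. */
uint8_t digit_of(uint16_t value, unsigned position)
{
    assert(position < 3);
    return (uint8_t)((value >> (4 * position)) & 0x0F);
}

/* The single routine, used three times: digit in, table value out. */
uint8_t segments_for(uint16_t value, unsigned position)
{
    return SEGMENTS[digit_of(value, position)];
}
```

Calling `segments_for` with positions 0, 1 and 2 in turn, with a short delay and a blank between each, gives the scan. On the PIC the table lookup would be the usual call into a jump table rather than an array index.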
Your delay sub-routines are far too complex. They don't have to be accurate to 1 microsecond.
Remember this: a file is left holding 00h when a decfsz loop finishes, so the next decfsz wraps it to 0FFh and the loop runs the full 256 passes. You can use this in a subsequent delay routine without having to load the file with a value.
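To see why no reload is needed, here is a small C model of a two-instruction decfsz loop (decfsz followed by a goto back to it). The function name `decfsz_loop` is hypothetical, and the model only counts passes; it does not represent instruction timing. In this model the file ends every loop at 00h, and the first decrement of the next loop wraps it to 0FFh.

```c
#include <stdint.h>

/* Model of:  loop  decfsz file,1
                    goto   loop
   Returns the number of passes made; the file is left at 00h on exit. */
unsigned decfsz_loop(uint8_t *file)
{
    unsigned passes = 0;
    do {
        *file = (uint8_t)(*file - 1);  /* 8-bit decrement: 00h wraps to 0FFh */
        passes++;
    } while (*file != 0);              /* decfsz skips the goto when result is 0 */
    return passes;
}
```

Starting from 10 the loop makes 10 passes. Run it again without loading anything and it makes 256 passes, which is plenty for a display delay that does not need to be exact.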