Software delay
The fastest you can probably make that μC run is 60 MHz in X2 mode (6 clocks per machine cycle), which gives you 10 MIPS, or 100 ns per machine cycle. At that speed you have a budget of 10 machine cycles to make a 1 μs delay. This assumes my interpretation of section 7.1 - X2 Feature of the datasheet is correct. If not, the part only manages 5 MIPS, and your software delay has to fit in 5 machine cycles. That leaves very little tolerance for mistakes, but it can be done. Be aware that being off by a single cycle creates as much as a 20% deviation in your frequency/period. If it were me, I would use "inline asm" and be careful to account for the extra delay caused by the call into the routine and such.
Timer hardware/peripherals
The AT89C51ED2 has three timers. Timer 2 is the only one I can find detailed information on in the datasheet. You can clock it from the system clock (CLK PERIPH) to create a 100 ns~200 ns time base, or have it produce your 1 μs period directly with the prescaler, then interrupt every time it overflows. The problem is that this interrupt will fire every 5~10 machine cycles, which makes the device practically useless for anything other than this time base. It's about as useful as the software method, except you can run a handful of instructions between ticks.
External clock
If you can get a 1 MHz crystal, you can use it to create a 1 μs time base directly: 1 MHz is by definition one cycle per microsecond. This is hardly different from the above; to use the time base you still have to service it every 5~10 machine cycles, because the CPU is still clocked at the same speed.
Bottom line
Unless you only really need to make this precise delay happen once in a while, you're most likely going to need a faster microcontroller, probably something in the >100 MHz range.