Many years ago I tried to benchmark C# and the .NET Framework. I found out how to get precise system timing; I think it used some low-level media clock from a Windows media DLL. Anyway, I wrote a program that calculated the sine function thousands of times in a row. I thought I had a good setup there.
The result was just about zero, because the code did not use the result of the sine function anywhere, so the compiler simply optimized the whole loop away.
When you benchmark, you need to know what you are benchmarking: the compiler or the processor (and other hardware). There are many traps, and it is not easy.
Your first idea is OK, but it is not enough. C has the keyword "volatile", which stops the compiler from optimizing away accesses to a variable, but even then you need to know what you are doing.
EDIT: I would like to know how long your test signal is "High" in the original test described in post #1.
EDIT2:
Does this mean that the signal goes high or low? (referring to your first post):
Low (PORTB.7) //Make PortB.7 High