Which bears out what I said about the output being intelligible despite significant distortion. As for the level, note that the input is a stereo track and the output is mono; plus, I didn't try too hard to normalise the output waveform.

I do not hear the severe distortion caused by the opamps half-wave rectifying the sounds and I do not hear the music parts missing when the output transistor has no bias.
Look at the output waveform in Audacity and you will see distinct clipping of the bottom edge.
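
If you would rather put a number on it than eyeball it, here is a rough Python sketch of the same check. It assumes the output has been exported as a 16-bit mono PCM WAV; the function names and the 2% tolerance are my own choices for illustration, nothing standard. It reports what fraction of samples sit right at the bottom and top extremes of the waveform. A much larger fraction at the bottom than at the top would be consistent with the bottom edge being clipped.

```python
import array
import sys
import wave


def read_mono_16bit(path):
    """Read a 16-bit mono PCM WAV file into a list of integer samples.

    Hypothetical helper: the 16-bit mono format is an assumption of this sketch.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("this sketch assumes 16-bit mono PCM")
        samples = array.array("h")
        samples.frombytes(wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return list(samples)


def flat_fraction(samples, tolerance=0.02):
    """Return (bottom, top): the fraction of samples lying within
    `tolerance` of the peak-to-peak span from the lowest and the
    highest sample value respectively.

    The tolerance value is an arbitrary choice for illustration.
    """
    lo, hi = min(samples), max(samples)
    span = hi - lo
    if span == 0:
        return 0.0, 0.0
    margin = tolerance * span
    bottom = sum(1 for s in samples if s <= lo + margin) / len(samples)
    top = sum(1 for s in samples if s >= hi - margin) / len(samples)
    return bottom, top
```

Usage would be along the lines of `bottom, top = flat_fraction(read_mono_16bit("output.wav"))`, where `output.wav` stands in for whatever file you exported. Music spends very little time at its exact peaks, so an unclipped recording should give two small and roughly similar numbers.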