Without the circuit depicted, you could still get the phone to fire a relay from its speaker output when it answers and, say, make the car move, but that is all it could do.
Since the key tones are what actually steer the car (or whatever), they have to be decoded, then that info needs to be decoded further, and so on.
So, no. Not with just some logic gates, or whatever.
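To give a feel for what "decoding the key tones" involves, here is a rough software sketch. It is not the circuit in question, just an illustration using the Goertzel algorithm, a common way to detect DTMF tones. The 8000 Hz sample rate, the function names, and the key-to-action table (`ACTIONS`) are all assumptions made up for the example, and it skips things a real decoder needs, such as silence detection and noise thresholds.

```python
import math

# Standard DTMF keypad: each key is one row tone plus one column tone (Hz).
ROW_FREQS = (697, 770, 852, 941)
COL_FREQS = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")

# Hypothetical second decoding stage: which key means which car action.
ACTIONS = {"2": "forward", "8": "reverse", "4": "left", "6": "right", "5": "stop"}


def goertzel_power(samples, freq, sample_rate):
    """Return the signal power near one frequency (Goertzel algorithm)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev = 0.0
    s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2


def decode_key(samples, sample_rate=8000):
    """First stage: pick the strongest row tone and column tone, return the key."""
    row_powers = [goertzel_power(samples, f, sample_rate) for f in ROW_FREQS]
    col_powers = [goertzel_power(samples, f, sample_rate) for f in COL_FREQS]
    row = row_powers.index(max(row_powers))
    col = col_powers.index(max(col_powers))
    return KEYPAD[row][col]


def make_tone(key, sample_rate=8000, duration=0.05):
    """Generate a clean test tone for a key (stands in for the phone's audio)."""
    for row, keys in enumerate(KEYPAD):
        if key in keys:
            col = keys.index(key)
            break
    else:
        raise ValueError(f"not a DTMF key: {key!r}")
    count = int(sample_rate * duration)
    return [
        math.sin(2.0 * math.pi * ROW_FREQS[row] * n / sample_rate)
        + math.sin(2.0 * math.pi * COL_FREQS[col] * n / sample_rate)
        for n in range(count)
    ]


if __name__ == "__main__":
    key = decode_key(make_tone("2"))
    # Second stage: turn the decoded key into an action for the car.
    print(key, ACTIONS.get(key, "ignored"))
```

Even in this stripped-down form there are two decoding stages (tones to key, then key to action), plus arithmetic on every audio sample, which is why a handful of logic gates will not do the job.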