The simplest approach is to measure the zero crossings of the speech signal, which gives a rough estimate of its frequency. Alternatively, measure both the zero crossings (frequency) and the envelope (amplitude), using an envelope detector feeding an ADC.
Either way, once you have the parameters of the waveform, it will take a lot of software to normalise the data and analyse it.
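To give a feel for the software side, here is a rough Python sketch of the two measurements and a basic normalisation step. It assumes the speech has already been digitised into a list of signed samples, so the envelope follower here is only a software stand-in for the analogue envelope detector (rectify, then peak-hold with an exponential decay). The function names, the 8 kHz sample rate and the 10 ms release time are all my own illustrative choices, not anything fixed by the method.

```python
import math


def zero_crossing_rate(samples, sample_rate):
    """Return sign changes per second; half of this approximates the frequency in Hz."""
    if len(samples) < 2:
        return 0.0
    crossings = 0
    prev = samples[0]
    for s in samples[1:]:
        # A crossing is any change of sign between consecutive samples.
        if (prev < 0) != (s < 0):
            crossings += 1
        prev = s
    duration = len(samples) / sample_rate
    return crossings / duration


def envelope(samples, sample_rate, release_ms=10.0):
    """Crude envelope follower: full-wave rectify, then peak-hold with exponential decay.

    release_ms is an assumed time constant, loosely playing the role of the
    RC time constant in a diode envelope detector.
    """
    decay = math.exp(-1.0 / (sample_rate * release_ms / 1000.0))
    level = 0.0
    out = []
    for s in samples:
        rectified = abs(s)
        # Jump up to a new peak immediately, otherwise let the level decay.
        level = rectified if rectified > level else level * decay
        out.append(level)
    return out


def normalise(values):
    """Scale values so that the largest one becomes 1.0."""
    peak = max(values) if values else 0.0
    if peak <= 0:
        return list(values)
    return [v / peak for v in values]


if __name__ == "__main__":
    # Test tone standing in for speech: 100 Hz, amplitude 0.5, one second at 8 kHz.
    rate = 8000
    tone = [0.5 * math.sin(2 * math.pi * 100 * n / rate + 0.1) for n in range(rate)]
    print("estimated frequency (Hz):", zero_crossing_rate(tone, rate) / 2)
    print("peak envelope:", max(envelope(tone, rate)))
```

On real speech you would run these over short frames (a few tens of milliseconds each) rather than the whole recording, and that framing, normalising and comparing is where most of the software effort goes.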