Sound localization approach.

Hi All:
I am back at my sound localization project. Without getting into FFT, I want to sample 4 electret mics spaced about 10 cm apart in a square, simultaneously, to assess the direction of sound propagation across them.

Some complexity ahead!

The approach: four synced 20 MHz 12F675 PICs sample the four mics, with a 16F1826 as 'master'.
1) Each 12F675 (20 MHz) samples op-amp-amplified audio at 10 kHz, 8 bits. It detects a peak inflection as two consecutive declining amplitude samples. There can be several of these during any 8 ms period at higher audio frequencies; the largest is reported to the main processor.
2) Every 8 ms (80 samples) each PIC bit-bangs a 16-bit timestamp and an 8-bit peak amplitude. Each sampling PIC delays its transmission by 0, 2, 4 or 6 ms to stagger the transmissions, so the 24 bits of data must transmit in under 2 ms.
The timestamp marks when the peak amplitude of the waveform occurs within a fixed 48 ms period, after which the timestamp is reset. A zero amplitude means no peak was detected in the last 8 ms sampled, i.e. the waveform is still rising or falling, implying a frequency below 125 Hz.
Assessing a single 'peak' every 8 ms reduces the amount of data processing; a sketch of the per-mic detector follows.
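A minimal sketch of that detector in C, assuming an 8-bit ADC read at 10 kHz and a free-running 16-bit tick counter (all names are illustrative, not from the original post):

    #include <stdint.h>

    static uint8_t  prev1, prev2;   /* the two most recent samples        */
    static uint8_t  best_peak;      /* largest peak in this 8 ms period   */
    static uint16_t best_time;      /* tick count when that peak occurred */

    /* Called at 10 kHz with each new ADC sample and the current tick. */
    void on_sample(uint8_t adc, uint16_t now)
    {
        /* Two consecutive declining samples mark a peak at prev2.
           Timestamped here for simplicity (two samples late). */
        if (prev2 > prev1 && prev1 > adc && prev2 > best_peak) {
            best_peak = prev2;
            best_time = now;
        }
        prev2 = prev1;
        prev1 = adc;
    }

    /* Every 80 samples (8 ms) best_peak/best_time are reported to the
       master and reset; best_peak == 0 means no peak was seen. */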

The 16F1826 (20 MHz) main processor then looks at the stored samples over the most recent rolling 48 ms window (six 8 ms periods) to identify the time differences between the peaks and determine which quadrant the sound source is in. 48 ms should cover frequencies down to just over 20 Hz. Prescaling the 16-bit Timer1 instruction clock by 4 permits an overflow at about 52 ms, which gives the highest-resolution timestamp (0.8 µs/tick) that still spans the 48 ms period.
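As a cross-check on those figures: a 20 MHz Fosc gives a 5 MHz instruction clock, so a 1:4 prescale yields 4 / 5 MHz = 0.8 µs per tick, and the 16-bit Timer1 overflows after 65536 x 0.8 µs ≈ 52.4 ms.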

The main processor's RAM data table requirement is 3 bytes x 4 mics x 6 periods = 72 bytes. This fits within an 80-byte bank.
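For what it's worth, the table could be laid out like this in C (a sketch; on an 8-bit PIC compiler such as XC8 the struct packs to 3 bytes, though other compilers may pad it):

    #include <stdint.h>

    typedef struct {
        uint16_t timestamp;   /* 0.8 us ticks within the 48 ms window  */
        uint8_t  peak;        /* 0 => no peak seen in that 8 ms period */
    } PeakRecord;

    /* 6 periods x 4 mics x 3 bytes = 72 bytes, within one 80-byte bank */
    static PeakRecord window[6][4];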

Does this sound reasonable?
thx Ancel
 
Have you done the math?
The mics are 10 cm apart. What delay do you expect to get over 10 cm? I think you want to sample much faster than that delay.
 
At 10 cm there is about a 0.3 ms propagation delay. 10 kHz sampling is 3x as fast. How much faster do you think is required?

EDIT:
@ 20 MHz the 12F675 should do an 8-bit sample in about 20 µs.
That's about 50 kHz.
I could try to use the TAD wait states to handle other activities during the sample to achieve the full 50 kHz rate.
Might be tight though.
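Working the numbers (assuming roughly 343 m/s speed of sound at room temperature): t = d/v = 0.10 m / 343 m/s ≈ 292 µs maximum inter-mic delay, while a 10 kHz sample rate gives one sample every 100 µs, i.e. about three samples inside that delay. A 20 µs conversion corresponds to 1 / 20 µs = 50 kHz.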
 
If I had a right and a left mic:
+300 µs delay indicates right, +90°
-300 µs delay indicates left, -90°
0 delay indicates in front or behind, 0° or 180°
I think you will get measurements like this:
Delay: +300, +200, +100, 0, -100, -200, -300 µs
Angle: +90, +60, +30, 0, -30, -60, -90 degrees

I think this is how it works?
With 4 mics you will get more information.
Question: will you get better information from a triangle (3 mics)?
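For reference, the exact far-field relation is theta = asin(v * dt / d) rather than the linear mapping above, which is only a rough approximation away from 0°. A small C sketch (the constants v = 343 m/s and d = 10 cm are assumptions, not from the thread):

    #include <math.h>
    #include <stdio.h>

    #define SOUND_V 343.0          /* m/s, assumed room temperature */
    #define MIC_D   0.10           /* m, assumed mic spacing        */
    #define PI      3.14159265358979

    /* dt_us: arrival delay in microseconds (right minus left). */
    static double angle_deg(double dt_us)
    {
        double s = SOUND_V * dt_us * 1e-6 / MIC_D;  /* sin(theta)  */
        if (s >  1.0) s =  1.0;                     /* clamp noise */
        if (s < -1.0) s = -1.0;
        return asin(s) * 180.0 / PI;
    }

    int main(void)
    {
        double dts[] = {292, 200, 100, 0, -100, -200, -292};
        for (int i = 0; i < 7; i++)        /* e.g. +200 us -> ~+43 deg */
            printf("%+.0f us -> %+.1f deg\n", dts[i], angle_deg(dts[i]));
        return 0;
    }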
 
The spec I have calls for the 4 mics... I am considering cardioid-pattern units to refine the solution using the amplitude as well.

Edit:
I might be able to narrow it down to within a 45-degree arc with 4 mics.
 
Couldn't you then use just 1 processor to sample all 4 mics, allowing for sampling delays in the software?
 
With a view to reducing the parallel-processing requirements, I have come up with an analog trigger so a single PIC can do all the ADC sampling and processing.

Can you guys see if this makes sense:
1) Given about a 12 cm maximum gap between the mic pickups, a sharp roll-off past 2500 Hz limits the number of peaks being sampled to one at a time from any one source. The amplitude of the peak can be used to differentiate sound sources.
2) Using an RC-delay op-amp circuit on each audio pickup signal, I can trigger the PIC's interrupt-on-change pins (each goes high) to sample the same point on the propagating wavefront. Note the SPICE pic.
3) The PIC can then know which mic has just been hit by the wavefront, plus the signal amplitude (see the sketch after this list). At 32 MHz in assembly (16F1826) it should be able to discern such events with sub-30 µs resolution. That's a 1 cm propagation resolution at 345 m/s.
4) Angular resolution depends on the distance of the sound source from the pickups versus this 1 cm maximum resolution between pickups.
5) The PIC assesses the most important sound by amplitude filtering, then determines the propagation direction from the timing of the triggered ADC samples.
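As a sketch of step 3, the capture could look roughly like this in XC8-style C. I'm assuming the mic triggers land on RB0..RB3 and that the IOCBF/TMR1 register names match the 16F182x datasheet; treat all of that as unverified:

    #include <xc.h>
    #include <stdint.h>

    volatile uint16_t hit_time[4];  /* Timer1 value at each mic's edge */
    volatile uint8_t  hit_mask;     /* bit n set once mic n has fired  */

    void __interrupt() isr(void)
    {
        if (IOCIF) {
            uint8_t flags = IOCBF & 0x0F;   /* mics assumed on RB0..RB3 */
            /* Reading Timer1 while it runs can skew between bytes; a
               real build would pause it or read twice and compare.    */
            uint16_t t = ((uint16_t)TMR1H << 8) | TMR1L;
            for (uint8_t n = 0; n < 4; n++)
                if ((flags & (1u << n)) && !(hit_mask & (1u << n))) {
                    hit_time[n] = t;        /* keep first edge only     */
                    hit_mask   |= (uint8_t)(1u << n);
                }
            IOCBF &= (uint8_t)~flags;       /* clear the handled flags  */
        }
    }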

I'd like to develop a way for the PIC to bias the op-amp amplification to self-calibrate for ambient noise, rather than continuously capturing and rejecting spurious ADC samples.
analog-Pk-Trigger.png
 
Looks like a difficult task. If you get a 1 kHz sine wave, it has a wavelength of about a foot. If your microphones are closer than this, you're OK. But a 10 kHz sine wave has a wavelength of slightly more than an inch. If your microphones are a foot apart, a number of waves will fit between them, so you will have a hard time deciding which of these waves must be synced. It'll get even worse if you have several sounds from different directions. A small PIC is not powerful enough for that.

How many sound sources do you expect? Do you have any frequency limitations? Is this 2- or 3-dimensional?
 
I'll have to roll off anything above 2500 Hz due to the 12 cm spacing. It's a two-dimensional mic arrangement. I want to control the amplitude to limit the processed sounds to the loudest one if possible.
I am looking at using the DAC to output a DC reverse-bias voltage on an inline post-op-amp rectifier to 'clip' some of the lower-volume 'noise', then AC-couple the output to the ADC.

Alternatively, I could use op-amps as a sample-and-hold for the overall peak waveform, then voltage-divide that to perhaps 90% to reverse-bias the rectifier and block much of the spurious noise without doing any processing.
 
If you have two microphones and detect a phase shift, this gives you a hyperbolic curve representing the positions where the sound source could be located. Except for trivial cases where the phase shift is zero or equals the full mic-to-mic travel time, it doesn't give you much information about the direction. You can only tell if the sound is to the left of the center line or to the right.

To detect the direction more precisely, you would need a series of microphones located in a ring pattern, with each pair of microphones detecting whether the sound is to the left or to the right of its centerline. Say, with 4 microphones on a square, you can only resolve direction to within 45 degrees.

Instead of peaks, it's easier to detect zero crossings using comparators. Then you can easily construct a composite signal which turns to 1 when the left microphone crosses from - to + and to 0 when the right microphone does. The duty cycle of this signal equals the phase shift. You get these signals from each pair of adjacent microphones and feed them to an MCU which has enough CCP modules to process all the signals and measure their duty cycles. You only need to figure out if the duty cycle is less than 50% (meaning the left microphone is leading: the sound is left of the centerline) or more than 50% (meaning it's lagging: right of the centerline). It doesn't need much processing power.
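A minimal sketch of that test in C, assuming the composite signal has already been built in hardware and three consecutive edge times have been captured (illustrative names only):

    #include <stdint.h>

    /* rise/fall/next_rise: captured timer values for one cycle of the
       composite signal (set on the left mic's crossing, cleared on the
       right's). Returns 1 if duty < 50%, i.e. the left mic leads.     */
    uint8_t left_leads(uint16_t rise, uint16_t fall, uint16_t next_rise)
    {
        uint16_t high   = fall - rise;        /* time spent high;      */
        uint16_t period = next_rise - rise;   /* wraps handled by      */
        return (uint32_t)high * 2u < period;  /* unsigned subtraction  */
    }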

Of course, if you need to detect different simultaneous sounds, things get way more complicated.

I think it would be easier to detect direction by using a pair of directional microphones, as in stereo recording.
 
Zero crossing means a dual supply. It's battery-operated at the moment and battery life is important too. I can conceivably look at using two batteries if zero crossing is more effective.
The reason for the peak triggering is to obtain the signal amplitude, which can also be used in assessing directionality for higher-frequency sounds: I will be separating the mics with a cylindrical sponge (10 cm diameter) which attenuates higher-frequency amplitudes, permitting differentiation between the mics.

Nearby loud, high-frequency sounds will be something I have to handle at some point.
I am looking at the CDA approach for lower-frequency phase shift here:
http://www.isn.ucsd.edu/pubs/casi04_loc.pdf

I did some more work on the analog pre-processing to permit a smoothed PWM signal to reverse-bias an inline diode and 'clip' low-level signals under PIC control. Hopefully this can filter out ambient spurious noise; I think it is critical to isolate a single audio source for this approach. The method would be for the PIC to dynamically raise the clip voltage until all triggering stops, then step it down until a trigger occurs, perhaps after each signal-direction assessment (see the sketch below). Based on the amplitude of the current sample, the PIC can directly set a PWM DC% to clip it. That might be around a 5 ms period overall per direction assessment. If that works, then perhaps every 30 ms or so the system can report the statistical mode of the loudest sound direction.
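That calibration could amount to a loop like the following sketch. set_clip_pwm(), triggers_seen() and delay_ms() are hypothetical helpers, not PIC library calls:

    #include <stdint.h>

    extern void    set_clip_pwm(uint8_t duty);  /* hypothetical: clip bias DC%  */
    extern uint8_t triggers_seen(uint8_t ms);   /* hypothetical: any IOC edges? */
    extern void    delay_ms(uint16_t ms);       /* hypothetical                 */

    void calibrate_clip(void)
    {
        uint8_t duty = 255;           /* max reverse bias: nothing triggers */
        set_clip_pwm(duty);
        delay_ms(5);

        /* Step the clip level down until ambient noise just triggers,
           then back off one step so only louder sounds get through.   */
        while (duty > 0 && !triggers_seen(5))
            set_clip_pwm(--duty);
        if (duty < 255)
            set_clip_pwm(duty + 1);
    }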

Here is a pic of the sim of the preprocessor so far. It does a 5x gain with a 2500 Hz corner low-pass and implements the PWM clipping with a sample-and-hold cap for peak sampling (less a diode voltage drop). The ADC input will do the job of discharging the S&H cap by switching to a digital low after every sample.



Analog_S&H_ADC_trigger.png
 
You do not need a dual supply to detect zero crossings. Zero crossings are much more reliable than peaks and do not require any sound attenuation.

The guys who wrote the paper cheated a bit. They used white noise. Real sounds have periodic components and strong autocorrelations. Measuring time between zero crossings is much easier and more reliable, and it will give you the TD right away.

They also placed the source 60 feet away. If you allow the sound to get closer to the microphones, the phase shift will not be strictly tied to the angle. It certainly depends on the range of distances that you need.

It would be interesting to try to tell apart various sounds coming from different directions. Perhaps a set of (adjustable) band-pass filters could be used to separate them.
 
I guess I could offset the AC voltage to create a virtual ground for a zero crossing. But I still don't see how a zero-crossing approach can help isolate the loudest sound.
 
If you imagine two sine waves of drastically different amplitudes and mix them together, you'll see that only the wave with the higher amplitude will produce zero crossings. This is also true for peaks.

As the amplitudes of different sounds get closer, they will become difficult to tell apart and you probably won't get a good result. To some extent you can battle this by measuring several TDs and then throwing out outliers. But it's the same whether you sample peaks or zero crossings.

There are two advantages of zero crossings:

- they happen faster, so you can time them with better precision
- they will be there even if your input signal is way beyond the op-amp's range

To filter out the background noise, you just use the comparator hysteresis.
 
Is the sound you are looking for "random", as in anything, or is it made by you?
In the audio world it is believed the human ear cannot determine directionality at low frequencies; that is why subwoofer systems use only one speaker.
If humans cannot tell directionality at low frequencies, your system will not either.

If your noise source is a single pulse, this would be a simple project.
Maybe: if you add a band-pass filter to each mic so that you only receive audio from 1 kHz to 2 kHz, the processing might be better. Something to think about.
 
With the zero-crossing approach, should I create a virtual ground for the signal? Perhaps 1.8 V, and then run the signal through a comparator (with some hysteresis) referenced to the virtual ground? Then whenever it changes state, I detect it with the interrupt-on-change pins and assess what to do?
I accept that the dominant signal will drive the zero crossing and thus will be detected.
 
You can do virtual ground.

Or, you can simply pull up the signal using a 1:1 voltage divider between the signal and the positive rail. When the signal is at 0 V, the middle of the voltage divider will be at Vcc/2. To detect a zero crossing you compare this to Vcc/2.
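To put numbers on that (assuming equal resistors and a high-impedance comparator input): the divider midpoint sits at Vmid = (Vsig + Vcc)/2, so with Vcc = 5 V a signal swinging ±0.1 V moves the midpoint between 2.45 V and 2.55 V, crossing the Vcc/2 = 2.5 V threshold exactly when the signal crosses 0 V.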

Most PICs have CCP modules. They capture the timer value when the pin transitions from low to high. Since it's done in hardware, it's very precise. The only bad thing is that you need 4 of them; I think not many PICs have 4 CCPs.
 
Well, I do have a 16F1936 on hand... it seems to have enough CCPx modules. Timer1 (16-bit) must run off the instruction clock at Fosc/4, or 32/4 = 8 MHz, which should give enough accuracy. The 16-bit Timer1 will overflow after 8.2 ms at that speed.
That is long enough to observe a few waves propagating past the mics.


I'm still not clear on how to establish the direction of the wave. Can you elaborate on analyzing the zero-crossing duty cycle to determine the direction? A 45° quadrant is fine.
 

You will get a series of time measurements from microphone 1: m1[1], m1[2], m1[3]... and from microphone 2: m2[1], m2[2], m2[3]... You calculate (m2[n] - m1[n])/(m1[n+1] - m1[n]). If it is less than 0.5, microphone 1 is leading; if it is more than 0.5, microphone 2 is leading.

Say, on microphone one you get 10, 30, 50, 70 and on microphone two you get 15, 35, 55, 75. The result would be (15-10)/(30-10) = 25%, which means that microphone 1 is leading, and the sound is left of the centerline between the microphones.

There is a little bit more to it, because there might be some noise and you will have to re-synchronize the sequences all the time. E.g. you can get 10, 12, 13, 30, 50, 70 instead of 10, 30, 50, 70: two extra zero crossings, 12 and 13, not recorded on the other microphone. You can drop those numbers and re-synchronize at 30.

You can also calculate the TD mentioned in the paper as TD = m2[n] - m1[n]. If your sound source is far enough away, you can use the formula from the paper to calculate the angle. It won't work if the sound source is close.
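Putting the worked example into code (a sketch; the timestamps would really come from the CCP captures):

    #include <stdio.h>
    #include <stdint.h>

    /* Ratio (m2[n]-m1[n]) / (m1[n+1]-m1[n]); below 0.5 means mic 1 leads. */
    static double lead_ratio(const uint16_t *m1, const uint16_t *m2, int n)
    {
        return (double)(m2[n] - m1[n]) / (double)(m1[n + 1] - m1[n]);
    }

    int main(void)
    {
        uint16_t m1[] = {10, 30, 50, 70};    /* the post's example data */
        uint16_t m2[] = {15, 35, 55, 75};
        double r = lead_ratio(m1, m2, 0);    /* (15-10)/(30-10) = 0.25  */
        printf("ratio = %.2f -> mic %d leads\n", r, r < 0.5 ? 1 : 2);
        return 0;
    }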
 