Mirror, mirror, on the wall, is there a fairer route in the land?

Some years ago I was often in a traffic jam approaching a T-junction near my house. I had a choice of taking the direct path, or taking a detour on to a circuitous route before reaching the potential traffic jam; naturally, I couldn't see the traffic jam before having to make that decision. This was before I had a smartphone with Google Maps, and before the traffic data in Google Maps was detailed enough to include that local backup.

This led me to ask: could I build a system, installed on my property, that could determine whether or not there was a traffic jam on the direct route? Such a system could publish this data to the web, and I could check it before leaving work, saving myself anywhere from a couple of minutes to 15. Not earth-shaking, but it would be nice.

If there were a direct line of sight from a window in my house to the road, I could just install a camera and upload photos every few minutes, or on demand. Of course, there isn't.

However, standing in my yard I can hear the road noise of fast-moving vehicles and the rumble of idling trucks in the distance, though I'm not usually sure what direction that rumble comes from. So, could I build a system using microphones placed outside the house (mounted on the wall, or maybe in the yard) that could compute the directions from which the multiple sounds are arriving or, even better, the locations of the sound sources? And having such an ability, could I then classify different types of sounds, such as idling cars in a traffic jam, a single idling truck at a loading dock, and vehicles moving quickly, preferably with enough spatial resolution to distinguish the slow-moving cars approaching the T-junction from those quickly moving away?

I decided to try working on this idea early last year, thinking of it as a passive acoustic sonar, though I've since learned this is often referred to as acoustic localization or multilateration.

My friend Bent loaned me a couple of microphones, which I used as the left and right channels of a 3-microphone array, the center microphone being the average of the 2 internal microphones of a TASCAM 4-channel recorder. Bent and I recorded ourselves talking as we walked back and forth in a line parallel to the line of the 3 microphones, perhaps 10 feet away. I then wrote some Python code to read the WAV files and perform cross-correlation of the channel pairs using numpy, but was dismayed at the results. The stereo effect was very evident when I listened to the left and right channels, but the results from my program showed nothing intelligible at all. I worked on it for a bit, but dropped the effort in favor of skiing.
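For anyone who wants to try the same thing, here is a minimal sketch of estimating the delay between two channels with `numpy.correlate`. It is not the code from the experiment above; the function name `estimate_lag` is made up for illustration, and it assumes the two channels are already loaded as 1-D arrays sampled at the same rate.

```python
import numpy as np

def estimate_lag(left, right):
    """Return the lag (in samples) at which `right` best matches `left`.

    A positive result means the sound reached the right channel later
    than the left one. (Hypothetical helper, for illustration only.)
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    # Remove any DC offset so it does not dominate the correlation.
    left = left - left.mean()
    right = right - right.mean()
    # corr[i] = sum_n right[n + k] * left[n], for lag k = lags[i].
    corr = np.correlate(right, left, mode="full")
    lags = np.arange(-(len(left) - 1), len(right))
    return int(lags[np.argmax(corr)])
```

Correlating the whole recording at once gives a single lag, which is only meaningful if the source does not move; for people walking back and forth, the lag has to be estimated over short windows instead.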

After the end of ski season this year, Bent and I reviewed the code, and discovered that I just wasn't using numpy correctly. Sigh. After fixing that, we got this image:


The horizontal axis is time (12 seconds in total, with 10ms windows), and the vertical axis is the offset (in samples) between the sliding windows of samples from the left and right microphones, with the value being the cross-correlation between the normalized windows. I used matplotlib to render the results as an image with its default color map, where blue colors indicate low correlation and red indicates high correlation. The bright line across the image is where there is zero offset between the windows of samples (i.e. a high correlation value there implies there was a sound source on the line that passes through the mid-point of the line between the two microphones and is perpendicular to it).
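The kind of computation behind such an image can be sketched roughly as follows. This is an illustration under assumptions, not the actual program: the name `correlation_map` and its parameters (`window`, `max_offset`, `hop`) are hypothetical, and normalizing each window by subtracting its mean and dividing by its norm is one reasonable reading of "normalized windows".

```python
import numpy as np

def correlation_map(left, right, window, max_offset, hop=None):
    """Normalized cross-correlation per time window and sample offset.

    Returns (offsets, cmap) where cmap[i, j] is the correlation between
    window j of `left` and the same-length window of `right` shifted by
    offsets[i] samples. (Hypothetical helper, for illustration only.)
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    n = min(len(left), len(right))
    hop = hop or window  # default: non-overlapping windows
    offsets = np.arange(-max_offset, max_offset + 1)
    # Keep every shifted window inside the recording.
    starts = np.arange(max_offset, n - window - max_offset + 1, hop)
    cmap = np.zeros((len(offsets), len(starts)))
    for j, s in enumerate(starts):
        a = left[s:s + window]
        a = a - a.mean()
        norm_a = np.linalg.norm(a)
        for i, k in enumerate(offsets):
            b = right[s + k:s + k + window]
            b = b - b.mean()
            denom = norm_a * np.linalg.norm(b)
            if denom > 0:  # silent windows stay at zero correlation
                cmap[i, j] = np.dot(a, b) / denom
    return offsets, cmap
```

The result can be drawn with matplotlib's `imshow(cmap, aspect="auto", origin="lower")`, which puts time on the horizontal axis and offset on the vertical axis. The window length in samples depends on the recording's sample rate, which is not stated here.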

More to come.
