I have an application that distributes audio among multiple, distant nodes (emitters/speakers) over an ethernet. I control the relative timing (phasing) of the signals from each emitter -- either in synchronism or leading/lagging by specific dynamic amounts (which can be on the order of many ms).
To date, I've been measuring this skew in the time domain by emitting a plosive burst whose start time can readily be identified (on a 'scope) relative to the time it appears from another emitter. This is the direct analog of a "clap board" in a film production.
Is there a cleverer way I can do this -- likely in the frequency domain -- by deploying a signal of particular "characteristics" that would let me "miss" the start time of the emission and still yield the same results? (it's just annoyingly inconvenient to have to set up each experiment with an introductory delay just so I can be prepared for the output(s). A "steady state" sort of approach would be nicer -- set up a stimulus, then probe each emitter to check results for that emitter!)
[I.e., you need to be able to differentiate between THIS point on the time series on THIS device and THE SAME point on another device. You can't risk that point being on a subsequent cycle of a periodic waveform!]
Send something aperiodic and wideband - a long chirp or noise or something - and cross-correlate. I assume the emitters are identical, which simplifies the situation.
In theory, that could be done continuously at inaudible levels.
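A minimal sketch of the cross-correlation idea, in plain Python so it runs anywhere (in practice `numpy.correlate` or an FFT-based correlation would be the usual tool). The sample rate, the 37-sample skew, the added measurement noise and the capture window are all made-up values for illustration. The capture window starts at an arbitrary point in the middle of the stream, so nothing depends on catching the start of the emission.

```python
import random

def estimate_lag(ref, probe, max_lag):
    """Lag in samples at which probe best lines up with ref.

    A positive result means probe is a delayed copy of ref.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Correlate ref[n] against probe[n + lag] where both exist.
        lo = max(0, -lag)
        hi = min(len(ref), len(probe) - lag)
        score = sum(ref[n] * probe[n + lag] for n in range(lo, hi))
        if score > best_score:
            best, best_score = lag, score
    return best

random.seed(1)
FS = 48_000       # assumed sample rate, Hz
TRUE_DELAY = 37   # assumed skew between emitters, in samples

# Aperiodic wideband stimulus: white noise, same content to both emitters.
stream = [random.gauss(0.0, 1.0) for _ in range(6000)]
emitter_a = stream
emitter_b = [0.0] * TRUE_DELAY + stream[:-TRUE_DELAY]
# Independent noise on B stands in for amplifier/pickup differences.
emitter_b = [x + 0.5 * random.gauss(0.0, 1.0) for x in emitter_b]

# Probe both outputs over an arbitrary window, well after the start.
START, LENGTH = 1500, 2000
win_a = emitter_a[START:START + LENGTH]
win_b = emitter_b[START:START + LENGTH]

lag = estimate_lag(win_a, win_b, max_lag=100)
skew_ms = lag / FS * 1e3
print(f"estimated skew: {lag} samples = {skew_ms:.3f} ms")
```

Because the noise never repeats inside the search range, there is only one correlation peak, which sidesteps the "wrong cycle of a periodic waveform" problem.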
--
John Larkin Highland Technology, Inc
Science teaches us to doubt.
Claude Bernard
Look into various automated room equalization schemes used in higher end hi-fi speaker systems. For example, from 2013: I'm sure there are patents available for dissection.
My Harman Kardon AVR 254 amp has such a feature. I place a plug-in microphone near where I expect to be sitting when listening to the hi-fi, punch the "setup" function, and the amp adjusts the delays and frequency response to each of the 7 speakers (channels). The noises emitted during the setup process sound like a mixture of clicks (time of flight delay?) and hiss (broadband noise). Actually, I'm not sure if it can deal with variations in speaker distance and delays from the listener. The diagram in the manual showing recommended placement for the 7 speakers is a perfect circle, which suggests that it can't handle random speaker placement. Still, I think it might be worth investigating.
--
Jeff Liebermann jeffl@cruzio.com
150 Felker St #D http://www.LearnByDestroying.com
Santa Cruz CA 95060 http://802.11junk.com
Skype: JeffLiebermann AE6KS 831-336-2558
If you are using ethernet, transit time across the network can depend on network load, retries, etc. and is not guaranteed. For that sort of work, a private subnet or a different protocol may be more appropriate.
Of course! Every device on the switch is of my own design -- I know exactly what it sends/receives and when. The switch participates in the clock synchronization algorithm (similar to PTP).
So, I distribute the "content" slightly ahead of time. Then, tell each node when to emit it, relative to that synchronized "clock". The issue then distills to ensuring I can keep ahead of the "consumers" (cuz I can't deliver a complete "audio snippet")
OK. That's a different goal than what I'm pursuing but I can possibly borrow some ideas/techniques from it.
In my case, the individual "nodes" are located over reasonably large distances (50 ft) and not "line-of-hearing" (concocted term as a play on "line-of-sight").
I also want to be able to numerically demonstrate the individual delays (hence the DSO time-domain approach... no "magic" involved in convincing yourself that a SPECIFIC delay has been achieved)
(later, I'll need to do the same with video; and the audio wrt the video)
Actually, they aren't. And they're driven class D, which adds some "switching variation" to the timing.
Hmmm... I'm not sure that would buy me anything. I'd need to have measurement kit in place "all the time" that could "hear" (or otherwise observe) the signals emitted from other nodes...?
Originally, when designing/demonstrating the clock synchronization algorithm, I placed a dozen "nodes" on a bench, each tethered to the switch via a "random" amount of CAT5e. I modified the code to generate an "impulse" on a digital I/O -- on each node -- that represented its idea of when the clock was occurring.
I cabled these to 12 channels of a logic analyzer and arbitrarily triggered off of one to observe the others in relationship to it.
As measuring EACH delay -- and the jitter it exhibited -- would be tedious (the analyzer won't automate that process), I tweaked the code to generate a pulse of width X, instead of just an impulse. I adjusted for the computed instantaneous CPU clock frequency (cuz there's no guarantee that they are identical).
Setting X to a large value, I reconfigured the analyzer to trigger on "all inputs hi". This will only occur if all of the pulses overlap, in time. So, a successful trigger means the clocks are synchronized to within X of each other. (You can let this run "forever" to see if they ever exceed this value)
Then, I reduced X until I couldn't reliably get a trigger.
The advantage of that approach was that there's no hand-waving going on; folks can see that the pulses each appear to be "about X width". And, can understand that a trigger means "at least ALL partially overlapping".
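The "all inputs hi" trigger can be modelled in a few lines. This is only a sketch of the logic, not the analyzer's behaviour: it assumes ideal pulse edges and fixed (jitter-free) clock errors, and the offsets below are invented numbers in nanoseconds. Pulses of width X share a common instant exactly when the latest start falls before the earliest end, i.e. when the spread of the clock errors is less than X.

```python
def all_overlap(offsets_ns, width_ns):
    """True if pulses of width_ns starting at each offset are all high at once."""
    latest_start = max(offsets_ns)
    earliest_end = min(offsets_ns) + width_ns
    return latest_start < earliest_end

def smallest_triggering_width(offsets_ns, start_ns=10_000, step_ns=10):
    """Shrink the pulse width from start_ns until the next step would lose the trigger."""
    if not all_overlap(offsets_ns, start_ns):
        return None  # even the widest pulse never triggers
    width = start_ns
    while width - step_ns > 0 and all_overlap(offsets_ns, width - step_ns):
        width -= step_ns
    return width

# Hypothetical clock errors for five nodes, relative to an arbitrary reference.
offsets_ns = [0, 120, -80, 310, 45]

spread_ns = max(offsets_ns) - min(offsets_ns)
x_min = smallest_triggering_width(offsets_ns)
print(f"spread = {spread_ns} ns, smallest triggering X = {x_min} ns")
```

With real jitter the offsets change on every pulse, so the smallest X that triggers reliably bounds the worst-case spread seen over the observation period.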
But, I have since deployed the nodes. They are now 50+ feet apart over a few thousand square feet. "Wire distance" between them (if I have to connect something to each and fish it around walls, through doors, etc.) is considerably longer.
Rather than use a similar DIGITAL "impulse" technique (running a low level signal through gobs of wire), I decided to use the amplifier as a buffer -- gain and lower impedance drive. But, a narrow pulse won't make it through the amplifier with the same clean edges that a digital I/O can emit.
So, synthesize an audio signal that has a nice recognizable shape and let it travel down the wire to the DSO (add identical lengths of wire to each node to eliminate transit delay differences)
[DSO can only do a few channels at a time so this means setting up an experiment for nodes 1 & 2, then 1 & 3, then 1 & 4, etc. It takes a lot longer to demonstrate this -- and more patience on the part of the observers.]
Depending on the size of the loudspeakers and SPL (sound pressure level) generated, 50 ft seems like it would work with any commodity electret microphone. This is what HK EZset mic looks like: If 50 ft is too far, or a speaker is hiding behind some obstruction, just add some more cable to the microphone and move it around as needed. Since sound travels at about 1 ft/millisec through the air, and about 0.7 ft/nanosecond through the audio cable, you can ignore the mic cable delay.
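To put numbers on "you can ignore the mic cable delay", here is the arithmetic using the two round figures quoted above (1 ft/ms in air, 0.7 ft/ns in cable). The 50 ft length is the distance mentioned in the thread; the function names are just for illustration.

```python
AIR_FT_PER_MS = 1.0     # round figure from the post; closer to 1.1 ft/ms in practice
CABLE_FT_PER_NS = 0.7   # round figure from the post

def air_delay_ms(feet):
    """Acoustic time of flight over the given distance, in milliseconds."""
    return feet / AIR_FT_PER_MS

def cable_delay_ns(feet):
    """Electrical propagation delay over the given cable length, in nanoseconds."""
    return feet / CABLE_FT_PER_NS

length_ft = 50
air_ms = air_delay_ms(length_ft)
cable_ns = cable_delay_ns(length_ft)
ratio = (air_ms * 1e-3) / (cable_ns * 1e-9)
print(f"{length_ft} ft of air:   {air_ms:.1f} ms")
print(f"{length_ft} ft of cable: {cable_ns:.1f} ns")
print(f"air is slower by a factor of about {ratio:,.0f}")
```

So an extra 50 ft of microphone cable costs tens of nanoseconds against tens of milliseconds of acoustic path.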
What you're doing kinda sounds like Art-Net, which uses DMX to control the lighting. I suppose it could be adapted to control the delays at each loudspeaker: Other than some light reading and random Googling, I know next to nothing about Art-Net technology. As before, it might be worth looking into what it does and what's available.
It would be "challenging" to try to handle all of the emitters in this way. I'd have to concentrate on groups that are "within earshot" of each other. Then, move onto the next (overlapping!) group, etc.
I suppose I could drive a fixed signal from each and tweak the delays until the signal was maximized at the microphone's location. Then, physically measure its location, etc. And, work backwards from that to get to the values that I seek.
I already have the ability to shift the emitted signals in time. What I need is a way to show (graphically, numerically, acoustically) the temporal differences between these emitted signals.
The demo I'm building uses the user's present location to determine which emitters should be powered up, which powered down, and how to adjust the delay and SPL of each driver to "accompany" the user as he walks around, relative to their locations.
For example, if you're listening to emitter A producing some particular content -- but walking towards emitter B -- I want to bring B on-line at a level and delay that "walks alongside you" as you leave A's influence and approach B's.
I can't just make B "really loud" (to compensate for the increased distance relative to A) because there will also be a transport delay (sound through air) for the wavefront to reach the user. Depending on the source material, any delay exceeding the echo threshold will want to be perceived as a second sound.
Instead, I want to adjust the level AND delay so that you perceive the sound as coming from A more rapidly than if you just approached it as it grew louder.
[Think about what it sounds like when you walk past one speaker and towards another; you feel like you are PASSING one sound source and APPROACHING another. I want the sound source to be perceived as ACCOMPANYING you.]
By reducing the delay at B, I can ensure its wavefront reaches you before that of A (Haas' effect). And, let B's output dominate your approach as B is attenuated by your passing AND my deliberate attempts to minimize its perceptual influence. As you approach B, its SPL is reduced so it's comparable to what A was like when you were in its proximity.
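A rough sketch of the delay bookkeeping this implies, assuming straight-line (free-field) paths and a speed of sound of 343 m/s. The distances and the 5 ms lead are invented for illustration; the thread does not give actual values. Since content is distributed ahead of time, both emission delays are shifted together so that neither goes negative.

```python
SPEED_OF_SOUND_M_S = 343.0  # dry air near 20 C; an assumption

def flight_ms(distance_m):
    """Acoustic time of flight in milliseconds."""
    return distance_m / SPEED_OF_SOUND_M_S * 1e3

def emit_delays_ms(dist_a_m, dist_b_m, lead_ms):
    """Emission delays (A, B) so that B's wavefront arrives lead_ms before A's."""
    delay_a = 0.0
    delay_b = flight_ms(dist_a_m) - flight_ms(dist_b_m) - lead_ms
    # B may need to fire "earlier than zero"; shift both to keep delays >= 0.
    shift = max(0.0, -delay_b)
    return delay_a + shift, delay_b + shift

# Hypothetical listener position: 3 m from A, 12 m from B, B to lead by 5 ms.
dist_a, dist_b, lead = 3.0, 12.0, 5.0
delay_a, delay_b = emit_delays_ms(dist_a, dist_b, lead)

arrive_a = delay_a + flight_ms(dist_a)
arrive_b = delay_b + flight_ms(dist_b)
print(f"emit A after {delay_a:.2f} ms, B after {delay_b:.2f} ms")
print(f"B arrives {arrive_a - arrive_b:.2f} ms ahead of A")
```

Level adjustment is left out here; it would be a separate gain per emitter layered on top of these delays.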
[of course, there is some doppler involved but I think, at walking speed, you'd not hear the pitch changes -- even though I'm manipulating those pitches by dynamically twiddling with delays]
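For a feel of the size of that pitch shift: a time-varying delay d(t) turns x(t) into x(t - d(t)), which scales instantaneous frequency by 1 - d'(t). If the delay is trimmed to track a listener walking at speed v, d'(t) is about -v/c. The 1.4 m/s walking speed and 343 m/s speed of sound below are assumed typical values, not figures from the thread.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed
WALKING_SPEED_M_S = 1.4     # assumed typical walking pace

def pitch_ratio(delay_rate):
    """Frequency scaling caused by a delay changing at delay_rate seconds per second."""
    return 1.0 - delay_rate

def to_cents(ratio):
    """Express a frequency ratio in cents (100 cents = one semitone)."""
    return 1200.0 * math.log2(ratio)

# Delay shrinking as fast as the acoustic path shortens at walking speed.
delay_rate = -WALKING_SPEED_M_S / SPEED_OF_SOUND_M_S
ratio = pitch_ratio(delay_rate)
shift_cents = to_cents(ratio)
print(f"pitch ratio {ratio:.5f}, about {shift_cents:.1f} cents")
```

That is a small fraction of a semitone, as long as the delay is ramped smoothly rather than stepped.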
In certain areas, I can surround the user (demonstrate-ee?) and interactively twiddle with the delays to draw attention to individual emitters without altering SPLs. But, that's a less impressive demo.
How accurately must the delay be measured? As a practical matter, it will be, say, one eighth of the wavelength at about 3 kHz, the peak of ear sensitivity. In any event, it cannot be better than a wavelength at the highest frequency that will pass through the speakers, the mic usually being far wider band.
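Working that rule of thumb through, as a sketch: the 343 m/s speed of sound and the 48 kHz sample rate are assumptions added here to convert the one-eighth-wavelength figure into time, distance and samples.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed
SAMPLE_RATE_HZ = 48_000     # assumed audio sample rate

def eighth_wave(freq_hz):
    """Return (time in seconds, distance in metres) for 1/8 wavelength at freq_hz."""
    time_s = 1.0 / freq_hz / 8.0
    return time_s, time_s * SPEED_OF_SOUND_M_S

time_s, dist_m = eighth_wave(3000.0)
samples = time_s * SAMPLE_RATE_HZ
print(f"1/8 wavelength at 3 kHz: {time_s * 1e6:.1f} us, "
      f"{dist_m * 1e3:.1f} mm, {samples:.1f} samples at 48 kHz")
```

So the target is on the order of a few tens of microseconds, or a couple of samples at common audio rates.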
My first thought was to use a multi-sine test waveform, which works well for such things as determining the electrical length of a long transmission line, but ...
Practical rooms have a lot of reverb and multi-path, so I'd hazard that only wideband signals with unambiguous correlation peaks are going to work. One can limit the time spent searching for peaks in delay space by a coarse-fine strategy, starting with band-limited signals.
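One way the coarse-fine search might look, as a sketch only. The coarse pass correlates block-averaged copies of the signals (a crude stand-in for proper band-limiting and decimation) over the full lag range; the fine pass then searches at full rate only near the coarse answer. The signal length, the 203-sample delay, the noise level and the decimation factor are made-up test values, and no reverb or multipath is modelled.

```python
import random

def xcorr_at(ref, probe, lag):
    """Correlation of ref[n] with probe[n + lag] over the samples where both exist."""
    lo = max(0, -lag)
    hi = min(len(ref), len(probe) - lag)
    return sum(ref[n] * probe[n + lag] for n in range(lo, hi))

def best_lag(ref, probe, lags):
    """The candidate lag with the largest correlation."""
    return max(lags, key=lambda lag: xcorr_at(ref, probe, lag))

def decimate(x, factor):
    """Average each block of `factor` samples (crude low-pass plus downsample)."""
    return [sum(x[i:i + factor]) / factor
            for i in range(0, len(x) - factor + 1, factor)]

def coarse_fine_lag(ref, probe, max_lag, factor=8):
    """Two-stage lag search: wide and cheap first, then narrow at full rate."""
    coarse_range = range(-max_lag // factor, max_lag // factor + 1)
    coarse = best_lag(decimate(ref, factor), decimate(probe, factor), coarse_range)
    centre = coarse * factor
    return best_lag(ref, probe, range(centre - factor, centre + factor + 1))

random.seed(7)
TRUE_LAG = 203  # hypothetical delay in samples
ref = [random.gauss(0.0, 1.0) for _ in range(4000)]
probe = [0.0] * TRUE_LAG + ref[:-TRUE_LAG]
probe = [x + 0.5 * random.gauss(0.0, 1.0) for x in probe]

found = coarse_fine_lag(ref, probe, max_lag=512)
print(f"estimated lag: {found} samples")
```

The fine pass here evaluates 17 lags instead of 1025, which is where the savings come from when the possible delay range is large.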