Kyma Forum
  Kyma Sound Exchange
  Radio Surveillance with Kyma?

David McClain
Member
posted 01 July 2001 02:06

spac_filtering.kym

 
It's been a while, so here is another contribution...

This one is a shortwave radio simulation with a speech enhancement technique dubbed SPAC (Speech Processing by AutoCorrelation [sic]). The newest rigs have something like this built in, but I don't have one (yet!).

I don't like the acronym SPAC but that's what it is called in all the Japanese rigs (e.g., Icom, Kenwood, Yaesu).

Anyway, after seeing the adverts I decided to give my hand a try with the Kyma, and as you can see it does pretty well...

The Noise and Signal sliders control the dB attenuation on the input noise and signal. By default I have Kurt's Cephalophage running somewhere below -3 dB signal-to-noise-ratio.
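For anyone who wants to play with the level settings outside Kyma, here is a minimal sketch of what the two sliders do, assuming they are plain dB-to-amplitude gains applied before the mix. The names db_to_gain and mix are made up for illustration and are not part of the Sound. Note that the slider difference only equals the SNR in dB if the two sources start out at equal power.

```python
def db_to_gain(db):
    """Convert a level in dB to a linear amplitude gain."""
    return 10.0 ** (db / 20.0)

def mix(signal, noise, signal_db=-3.0, noise_db=0.0):
    """Attenuate signal and noise by their slider settings, then sum them.

    Assumption: both inputs are equal-length lists of samples.
    """
    gs = db_to_gain(signal_db)
    gn = db_to_gain(noise_db)
    return [gs * s + gn * n for s, n in zip(signal, noise)]
```

With the signal at -6 dB and the noise at 0 dB, the signal amplitude is roughly half that of the noise.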

The mixed signal is then sent through a shortwave radio bandpass filter with a passband running from around 200 Hz to 1.7 kHz. This keeps just enough passband to let the voice energy through and not much more.

Click on the SPAC button to hear what it sounds like with and without SPAC processing.

The SPAC filter works by computing an exponentially averaged autocorrelation of the input spectrum and using it as a filter against the incoming spectrum. The default setting has an e-folding length of around 10 FFT half-lengths. Long FFTs work better than short ones here.
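The general idea can be sketched roughly as follows. This is not the Kyma Sound itself, just one way to read the description: the per-bin power |X|^2 stands in for the autocorrelation (the power spectrum is the Fourier transform of the autocorrelation), it is smoothed with a one-pole exponential average, and the smoothed value is used as a real-valued gain on the incoming bin. The class name SpacFilter and the parameter efold_frames are hypothetical, and the e-folding length is counted in frames here rather than FFT half-lengths.

```python
import math

class SpacFilter:
    """Sketch of an exponentially averaged spectral filter (assumed form)."""

    def __init__(self, n_bins, efold_frames=10.0, pre_gain=1.0, post_gain=1.0):
        # One-pole smoothing coefficient giving the requested e-folding length.
        self.alpha = 1.0 - math.exp(-1.0 / efold_frames)
        self.avg = [0.0] * n_bins      # running mean power per bin
        self.pre_gain = pre_gain       # scales the averaged filter values
        self.post_gain = post_gain     # scales the filtered spectrum

    def process(self, spectrum):
        """Update the running average with one frame of complex bins,
        then return that frame multiplied by the averaged filter."""
        out = []
        for k, x in enumerate(spectrum):
            power = x.real * x.real + x.imag * x.imag
            self.avg[k] += self.alpha * (power - self.avg[k])
            out.append(self.post_gain * self.pre_gain * self.avg[k] * x)
        return out
```

Bins that stay strong from frame to frame (voiced speech) build up a large average and are passed; bins whose energy comes and goes build up less and are held down.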

The PreGain setting amplifies the mean autocorrelation SPAC filter values, while the PostGain setting amplifies the product of the SPAC filter and the incoming spectrum.

The SPAC technique works well for low SNR situations, but as you lower the input noise you will begin to notice its shortcomings.

Essentially, it boosts the voice spectrum too much in the bass region, and depresses the highs - which makes for poor listening under good SNR conditions. Intelligibility suffers greatly.

But when the SNR is low and even negative, like the default settings here, then SPAC enormously improves the speech clarity. That's because the highs are getting through anyway because of the strong noise contribution to the autocorrelation. And you will find that the noise is seriously reduced as the speech is brought out.

Just for kicks, try lowering the signal level to -6 dB and leave the noise at 0 dB. The "signal" all but disappears in the noise without SPAC, but when SPAC is enabled you can very clearly hear much of what Kurt has to say.

As another experiment, try making the noise source "pink" instead of white. This might even be a better simulation of a SW RX. Now you can lower the "signal" as far as -12 dB SNR and still pick up Kurt with SPAC processing.

So now maybe, equipped with my trusty Kyma/Capy, I will be able to continue picking up the BBC here in Arizona after they discontinue the North American Service and broadcast only to Central and South America.

Happy sleuthing!

- DM
k de N7AIG

[This message has been edited by David McClain (edited 01 July 2001).]

David McClain
Member
posted 01 July 2001 03:33

spac_filtering.kym

 
hmmm... Seeing how the system responds with pink noise background, and comparing the bassy response obtained against that with a white noise background, it puts me in mind of an improvement...

Attached is the same Sound as before but with an additional sound using "pre-emphasis".

The idea here is to extract better performance from the high end and less bassy sounding speech. In this case the pre-emphasis is had by injecting a white noise derivative into the FFT along with the signal. (Stochastic resonance, anyone?)

The derivative white noise source is switched off when you A/B compare the unprocessed signal against the SPAC processing.

The derivative white noise has an upward slope with frequency, and its average autocorrelation serves to open up the filters at the high end. The result is somewhat less bassy sounding, and we can still follow Kurt all the way down to SNR -12 to -15 dB.

I'd say that's pretty impressive!

- k de N7AIG

David McClain
Member
posted 01 July 2001 04:07

spac_filtering.kym

 
... and here is a possibly better way to do input preemphasis (in addition to the other Sounds already presented).

This one uses a "sort-of" derivative of the input signal plus noise background. Actually, it is a very simple FIR filter with two adjustable parameters. The !PreEmphasis dB controls the amount of filter activity which tends to flatten the noise spectrum and enhance the highs in the speech. The !Delay parameter addresses the pre-emphasis time constant (in the old days they talked about 80 usec preemphasis and such... well here it is in digital form).

The FIR filter is given by the difference equation

y[n] = x[n] - g*x[n-d]

where d is the delay, and g is the preemphasis attenuation. When g = 1 and d = 1 this becomes the Kyma pseudo-differentiator.

For other values of delay d it becomes a kind of inverse-comb filter. And when g is less than 1 (0 dB) its notches are not real deep. So this filter tends to deemphasize the bass and emphasize the highs.

Tuned just right it can counteract the very bassy response provided by strong 1/F noise.
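Here is a small sketch of that filter for experimenting offline. The function names are made up, and samples before the start of the input are assumed to be zero. If the !PreEmphasis control is in dB, g would presumably be 10^(dB/20), but that mapping is a guess.

```python
import cmath
import math

def preemphasis(x, g=1.0, d=1):
    """y[n] = x[n] - g*x[n-d], treating x[n] as zero for n < 0."""
    return [x[n] - (g * x[n - d] if n >= d else 0.0) for n in range(len(x))]

def magnitude_response(freq, g, d):
    """|H| of H(z) = 1 - g*z^-d at a normalized frequency in cycles/sample."""
    return abs(1.0 - g * cmath.exp(-2j * math.pi * freq * d))
```

With g = 0.5 and d = 1 the gain is 0.5 at DC and 1.5 at Nyquist, which is the bass cut and treble lift described above. With larger d the response ripples between 1 - g and 1 + g, with d notches spaced evenly around the unit circle.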

So now we can both SPAC process the low SNR speech records, and we can control the spectral bandpass shape so that we don't lose the high frequencies needed for intelligibility.

- SK DE N7AIG

[This message has been edited by David McClain (edited 01 July 2001).]

David McClain
Member
posted 01 July 2001 07:38

spac_filtering.kym

 
Sorry... I couldn't resist... Attached includes an additional sound that implements a soft squelch control.

This illustrates a really interesting use of Kyma. I wanted to remove most of the background noise when there was no speech signal. One way to do this would be to suppress the product filtering whenever the incoming spectrum closely matches that due to noise alone. Then increase the gain of the product filtering as the incoming spectrum departs from a noise-like signal.

But what is a noise-like spectrum? It can vary depending on the nature of the noise. For pure white noise it would be a flat spectrum. For 1/F noise it would have a 1/F slope, and if you throw preemphasis in then all bets are off...

So one way to compare the spectrum for similarity to noise-like spectra would be to compare the sum of the low frequencies against the sum of the high frequencies. When these match nominal noise-like behavior then we want minimum gain.

We can get the sum of the front half of the spectral mean autocorrelation by using an averaging lowpass filter with NFFT/4 period. And we can get the back-half sum by using a delay of NFFT/4 followed by another NFFT/4 period averaging lowpass filter. So far so good...

(in actual fact, we should probably compare closer to DC than this since our passband is so limited, but this works pretty well so bear with me...)
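In block form, the two averages amount to something like the sketch below. It assumes the averaged spectrum for one frame is available as a plain list of NFFT/2 non-negative values, which is a simplification of the streaming delay-plus-averager arrangement in the Sound. The function name is hypothetical.

```python
def half_band_means(avg_spectrum):
    """Return (front, back): the mean of the lower half of the bins
    and the mean of the upper half of the bins."""
    half = len(avg_spectrum) // 2
    front = sum(avg_spectrum[:half]) / half
    back = sum(avg_spectrum[half:2 * half]) / half
    return front, back
```

A flat (white-noise-like) spectrum gives two nearly equal numbers, while a spectrum that falls off with frequency gives a front value larger than the back value.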

Now how do we compare these two sums, or averages? Their actual values will depend on the strength or weakness of the incoming signals. We can't do a divide (and we wouldn't want to either, for reasons of speed and instability when dividing by zero...).

We already know that these autocorrelation values are non-negative: they are always greater than or equal to zero. If we feed the sum from the front of the spectrum into the left channel, and the sum of the back of the spectrum into the right channel, then we can use the arctangent function to compare them.

The neat thing about an arctangent is that it produces an angle that indicates how different its two arguments are, independent of their actual amplitudes. When they are both equal (and positive as here) then the angle is pi/4. When they are unequal, but positive, then the angle becomes either more toward pi/2 or zero.
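The scale independence is easy to check with a two-argument arctangent. In this sketch the front value is taken as the x argument and the back value as the y argument, which matches the later description of the angle swinging toward zero when the back half is weak. The function name is made up.

```python
import math

def balance_angle(front, back):
    """Angle in [0, pi/2] for non-negative inputs.

    pi/4 means equal energy in both halves; smaller angles mean the
    front (low-frequency) half dominates, larger angles mean the back
    (high-frequency) half dominates. No division is needed, and
    atan2(0, 0) simply returns 0.
    """
    return math.atan2(back, front)
```

Doubling or halving both inputs leaves the angle unchanged, so the comparison does not care how loud the incoming signal is.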

So, for example, if we have a white noise background, then a noise-like input should have nearly equal amounts of energy fore and aft in its autocorrelation mean. And the arctangent should hover around pi/4 when the input is mostly noise.

When speech comes along the autocorrelation mean will begin to depart, probably having more bass than treble and so the autocorrelation spectrum will tip with a negative slope. The back half should have considerably less energy than the front half. And so the arctangent will swing toward zero, as we have it set up here.

We can use the departure of the arctangent value from its noise like mean value of pi/4 as the scaling value to use for additional gain. When the arctangent hovers near pi/4 the gain is near zero. But when it swings toward either zero or pi/2, we can use the absolute value of this deviation as an indication of additional gain.

So that's what we do here... After taking the arctangent I subtract an offset, and then boost it a little. I also hang onto the value with a sample-hold for the duration of the FFT so that we have a stable, well defined, gainer for this particular FFT period.

But remember we don't really know the nature of the noise background. So in this case the noise-like arctangent value is unknown and probably not pi/4.

So instead of presuming an offset of pi/4 I made it into an adjustable parameter on the VCS. When you use this new sound, tune the offset up and down until the background noise almost vanishes. That offset is very close to the noise-like mean value of the arctangent, whatever its value is.
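Put together, the squelch gain for one FFT frame might look like the sketch below. The names, the default boost, and the clip at 1.0 are assumptions for illustration, not the values used in the Sound. The gain is computed once per frame and then applied to every bin of that frame, which plays the role of the sample-hold.

```python
import math

def squelch_gain(front, back, offset=math.pi / 4, boost=2.0):
    """Gain that grows as the front/back balance departs from the
    noise-like angle 'offset' (a manually tuned parameter)."""
    angle = math.atan2(back, front)
    # Assumption: clip at 1.0 so the gain saturates instead of growing.
    return min(1.0, boost * abs(angle - offset))

def apply_squelch(filtered_spectrum, gain):
    """Scale one frame of SPAC-filtered bins by the held gain."""
    return [gain * x for x in filtered_spectrum]
```

When the input looks like the background noise the gain sits near zero, and it opens up as the spectrum tilts either way away from the tuned offset.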

Now boost the postgain till you hear some noise floor, and by golly the speech really stands out now! Even at less than -15 dB SNR!!!!

This idea came to me because I was thinking about Fuzzy Logic control of a device like this. Fuzzy logic says when the autocorrelation spectrum is nearly noise-like use a little gain, and when it is not noise-like use a lot of gain.

Well that's pretty much what we did here, but we included a trapezoidal control boundary instead of an abrupt on/off gain control. The less like noise the input signal is the more gain it gets. Our input was a simple two parameter state space consisting of the average of the low frequencies, and the average of the high frequencies.

This little system with the manual offset adjustment can handle any kind of noise background from white to 1/F and anything else you might care to throw at it.

Pretty neat, huh?

- SK DE N7AIG

[This message has been edited by David McClain (edited 01 July 2001).]

sm
Member
posted 02 July 2001 04:29
very neat indeed!!!!

you are talking about fuzzy logic. can you recommend a book about fuzzy logic for dummies or a web-forum?
seems that this area is quite interesting for audio recognition.

->m

David McClain
Member
posted 02 July 2001 14:09
Well... I learned most of what I know about a decade ago when it was the coming rage. I learned one afternoon from the horse's mouth, Lotfi Zadeh, of UC Berkeley.

It's really a very simple premise, and it shouldn't take much to learn how to do simple systems like I just described above.

I have a book that someone loaned me a while back. I haven't even opened it yet... But for what it's worth...

"Fuzzy Logic with Engineering Applications" by Timothy J. Ross, McGraw-Hill, 1995

I don't know if this will be a simple enough presentation. There ought to be a bunch of stuff on the Web for intro purposes.

I'm glad this tickled your curiosity though!

- DM

All times are CT (US)
