Kyma Forum
  Kyma Sound Exchange
  SRS Encoding in Kyma

Author Topic:   SRS Encoding in Kyma
David McClain
Member
posted 24 March 2003 18:24

s3encoding.kym

 
So I have this little MP3 player on my computer and one of the screens for adjusting it has something called SRS Effects. I searched around trying to find out what it is, and stumbled on patent number 4,748,669, issued May 31, 1988 to Arnold Klayman and assigned to Hughes Aircraft Company. Apparently Hughes decided to dump the patents for some money since SRS has little to do with building rockets and airplanes....

But this patent, though worded in trying legalese, did have some interesting things to teach... So here is a first cut at implementing the SRS system in Kyma. This is a big Sound, in more ways than one! I have added the equalization deemed necessary for headphones and side-mounted speakers, as well as adding my half-wave rectifier bass booster.

The input stereo signal is converted to middle (L+R) and side (L-R) signals, and each of these is processed separately. Both go through a bank of bandpass filters, each about one octave wide, and then into compressors. The M and S channels have identical filter banks, but their compressors are operated slightly differently. The S channel compressors use the output of each S bandpass as their sidechain. The M channel compressors use the same sidechains as the S channel compressors. (Compression is derived from S for both channels.)
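The M/S split described above can be sketched in a few lines of Python. This is only an illustration, not code taken from the Kyma Sound; the function names are made up, and the factor of 1/2 on decoding is an assumed normalization so that an unprocessed round trip returns the original L and R.

```python
def ms_encode(left, right):
    """Split stereo into mid (L+R) and side (L-R) sample lists."""
    mid = [l + r for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Invert ms_encode. The 1/2 scaling is an assumed normalization."""
    left = [(m + s) / 2 for m, s in zip(mid, side)]
    right = [(m - s) / 2 for m, s in zip(mid, side)]
    return left, right
```

Any per-band processing of M and S happens between these two calls.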

Additionally, in order to control excessive reverberation the 250, 500, 1K, and 2K bands have sidechains that mix in some of the M channel energy appearing in the 200-2000 Hz bandpass. (This is an approximation of the reverb control unit described in the patent).

All of these bandpass filters are single pole Butterworth filters in this implementation, giving a broad rolloff of -6 dB/octave.
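For readers without Kyma, a bank of gentle one-octave bandpass filters can be approximated as below. This is a sketch under assumptions: it uses the widely published "Audio EQ Cookbook" constant-peak-gain bandpass biquad (which has 6 dB/octave skirts) rather than Kyma's own Butterworth module, and the list of band centers is hypothetical, since the post only names the 250, 500, 1K, and 2K bands.

```python
import math

# Hypothetical octave-spaced centers; the post does not list the full set.
OCTAVE_CENTERS = [125, 250, 500, 1000, 2000, 4000, 8000]


def bandpass_coeffs(center_hz, sample_rate, octaves=1.0):
    """Cookbook bandpass biquad, unity gain at the center frequency."""
    w0 = 2 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) * math.sinh(
        math.log(2) / 2 * octaves * w0 / math.sin(w0))
    a0 = 1 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (1.0, -2 * math.cos(w0) / a0, (1 - alpha) / a0)
    return b, a


def biquad(samples, b, a):
    """Run a direct-form I biquad over a list of samples."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def split_bands(samples, centers_hz, sample_rate):
    """Return {center_hz: filtered samples} for each band."""
    return {fc: biquad(samples, *bandpass_coeffs(fc, sample_rate))
            for fc in centers_hz}
```

The same `split_bands` call would be applied to both the M and the S signal, since the two channels use identical filter banks.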

[The M compressors operate with twice the compression of the S compressors, and have thresholds half as deep. The threshold of the compressors (nominally for the S channel) was chosen to yield approximately 12 dB compression max. Hence the M channel gets compressed with twice the gain of the S channel.

The wording for how the M and S channels are "dynamically equalized" is a bit obscure in the patent document (about 80 pages long -- lots of repetition). But if you sit and think about what he is trying to accomplish, it appears to require compressors on both channels, with the M bands compressing twice as hard. Correct me if I'm wrong...]

The other difference between M and S processing is that the M channel (representing mostly frontal energy) is equalized in the compressor post gains to give a broad passband tailing off at each end of the spectrum, while the S channel has the midbands attenuated and the low and high frequencies lifted. The claim is that reverberation in a natural setting is mostly in those high and low frequencies and comes to us from the sides.

The output L and R channels are combinations of these separately modified M and S channels along with the original L and R signals.

This much describes the stereo enhancement circuitry. Next, the newly formed L and R channels are again split into new M and S channels, and the M channel has some EQ applied that represents the difference between the side and front HRTF (Head Related Transfer Function), approximated by means of three 1/3-octave notch filters at 500, 1000, and 8000 Hz. These use my IIR filter micro-sounds that many of you have seen before. (I think Kyma already provides a peaking filter that could be used here -- they are essentially identical).

This EQ in the M channel depresses sounds coming from the front in frequency bands that approximate the way our heads and earlobes shield our ears from frontal sounds. This is done because, with headphones and side-mounted speakers, the sounds are being delivered from the side. So we need to synthesize what they ought to sound like when coming from the front.
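A rough way to see what such an HRTF correction does is to compute the response of three 1/3-octave peaking filters set to cut. The sketch below uses the standard cookbook peaking-EQ biquad, not the IIR micro-sounds from the attached file. The notch frequencies come from the post; the -6 dB depths are placeholders, since the post does not state the actual values.

```python
import cmath
import math

# (center Hz, gain dB). Depths are hypothetical placeholders.
HRTF_NOTCHES = [(500.0, -6.0), (1000.0, -6.0), (8000.0, -6.0)]


def peaking_coeffs(center_hz, gain_db, sample_rate, octaves=1 / 3):
    """Cookbook peaking-EQ biquad; negative gain_db gives a notch."""
    amp = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) * math.sinh(
        math.log(2) / 2 * octaves * w0 / math.sin(w0))
    a0 = 1 + alpha / amp
    b = ((1 + alpha * amp) / a0, -2 * math.cos(w0) / a0,
         (1 - alpha * amp) / a0)
    a = (1.0, -2 * math.cos(w0) / a0, (1 - alpha / amp) / a0)
    return b, a


def response_db(b, a, freq_hz, sample_rate):
    """Magnitude response of one biquad at freq_hz, in dB."""
    zinv = cmath.exp(-1j * 2 * math.pi * freq_hz / sample_rate)
    h = (b[0] + b[1] * zinv + b[2] * zinv * zinv) / \
        (a[0] + a[1] * zinv + a[2] * zinv * zinv)
    return 20 * math.log10(abs(h))


def hrtf_response_db(freq_hz, sample_rate, notches=HRTF_NOTCHES):
    """Combined response of the cascaded notches, in dB."""
    return sum(
        response_db(*peaking_coeffs(fc, g, sample_rate), freq_hz, sample_rate)
        for fc, g in notches)
```

Flipping the sign of each gain turns the notches into boosts of the same shape.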

Finally, these M and S channels are again recombined into fresh L and R channels. The final stage of this monster Sound is my half-wave rectifier bass booster which provides a mono enhancement using psychoacoustic energy instead of boosting the low frequencies to humongous amounts that would burn out your speakers and saturate your recording channels. This mono addition is added to the outgoing stereo signal.
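The post does not spell out how the half-wave rectifier bass booster is built, so the following is only one plausible reading: isolate the low end of a mono sum, half-wave rectify it to generate harmonics of the bass, strip the resulting DC offset, and add the result equally to both channels. The cutoff frequencies, the `amount` parameter, and all function names are assumptions.

```python
import math


def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """Simple one-pole smoothing lowpass."""
    coeff = math.exp(-2 * math.pi * cutoff_hz / sample_rate)
    y = 0.0
    out = []
    for x in samples:
        y = (1 - coeff) * x + coeff * y
        out.append(y)
    return out


def bass_harmonics(mono, sample_rate, cutoff_hz=120.0, dc_hz=20.0):
    """Half-wave rectify the low band, then subtract its slow average."""
    low = one_pole_lowpass(mono, cutoff_hz, sample_rate)
    rectified = [max(0.0, x) for x in low]
    dc = one_pole_lowpass(rectified, dc_hz, sample_rate)
    return [r - d for r, d in zip(rectified, dc)]


def add_bass_boost(left, right, sample_rate, amount=0.5):
    """Add the mono harmonic signal to both stereo channels."""
    mono = [(l + r) / 2 for l, r in zip(left, right)]
    boost = bass_harmonics(mono, sample_rate)
    out_l = [l + amount * b for l, b in zip(left, boost)]
    out_r = [r + amount * b for r, b in zip(right, boost)]
    return out_l, out_r
```

The rectifier adds overtones of the bass fundamental, which is the psychoacoustic trick: the ear infers the low note from its harmonics without the speakers having to reproduce large low-frequency excursions.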

There are two parameters, K1 and K2, that control the amount of M and S enhancement, respectively. I have them both set to 0 dB and even that gives an enormous sound. But they are presented on the VCS for you to adjust as you like.
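One way to picture K1 and K2 is as linear gains on the processed M and S signals before they are summed back with the original L and R. The exact weights used in the Kyma Sound are not given in the post, so the mix below (processed S added to L and subtracted from R) is an assumption, and the function names are illustrative.

```python
def db_to_gain(db):
    """Convert decibels to a linear amplitude factor."""
    return 10 ** (db / 20)


def enhance_mix(left, right, mid_proc, side_proc, k1_db=0.0, k2_db=0.0):
    """Combine original L/R with processed M and S, scaled by K1 and K2."""
    k1 = db_to_gain(k1_db)
    k2 = db_to_gain(k2_db)
    out_l = [l + k1 * m + k2 * s
             for l, m, s in zip(left, mid_proc, side_proc)]
    out_r = [r + k1 * m - k2 * s
             for r, m, s in zip(right, mid_proc, side_proc)]
    return out_l, out_r
```

At 0 dB both factors are 1.0, so the processed channels are added at full level, which is consistent with the "enormous sound" at the default setting.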

The VCS also shows the M and S equalization curves in the stereo enhancement section. The upper left area of the VCS controls the bass boost.

This is a first cut, and will likely be found inferior in some respects to the "real SRS". That's okay, I just wanted to see what would happen if I put this up on Kyma, reading from the patent description.

You can read the patent for yourself at www.uspto.gov. Just search for patent number 4,748,669. The language is a bit arcane and antiquated by today's standards. But if you read it carefully and translate to our modern Kyma DSP systems you should find pretty much what I have constructed here for all of us to play with.... The patent itself is astonishingly broad in its claims. But so be it...

Enjoy!

- DM

[... yes, I realize it is a real mess... it is huge, hugely interconnected, and not well documented yet. Bear with me, I wanted to get it into other people's hands quickly! Takes about 6 DSP's to run it...]

[This message has been edited by David McClain (edited 24 March 2003).]


David McClain
Member
posted 25 March 2003 01:36

s3encoding.kym

 
... okay, call me daft! After the initial thrill dissipated I began to realize something is terribly incorrect with the first approach. It sounds very dreamy, but it isn't correct.

The attached Sound file now contains that original sound plus something a bit more along the right path... The correct one is labeled SRS Encoding.

I had to reread the patent description about 5 times before the arcane, contorted description began to make some sense. Compression is used in the S channel, and this channel's filterbank serves as the sidechain both for its own compression and for the expansion performed in the M channel.

The rules for SRS encoding are:

1. In frequency bands where S is weak -- boost it = compression
2. In frequency bands where S is strong -- boost M

There is another subtlety related to auto-detection of synthetic reverb. But the more important point is that rule (2) does not correspond to anything like compression. It is inverse compression. And that doesn't mean expansion either...

So I had to invent a Multiplying wavetable amplifier on the M bands, driven by an amplitude follower on corresponding S bands. I haven't put a meter on any of these M band sidechains yet, and the amplitude followers are set for 10 ms time constant and a gain of 20. That is the gain provided by default in Kyma. Why... I don't know precisely. Probably related to some arithmetic involved in computing the running means. So some of these gains on the M band sidechains may need adjusting.
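For reference, an amplitude follower with a 10 ms time constant and a gain of 20 can be approximated by smoothing the rectified signal with a one-pole filter. This is not Kyma's actual algorithm, which the post does not describe; the clamp to 1.0 is an added assumption so the result can serve as a lookup index into a gain table.

```python
import math


def amplitude_follower(samples, sample_rate, time_constant_s=0.010,
                       gain=20.0):
    """One-pole envelope of |x|, scaled by gain and clamped to [0, 1]."""
    coeff = math.exp(-1.0 / (time_constant_s * sample_rate))
    env = 0.0
    out = []
    for x in samples:
        env = coeff * env + (1 - coeff) * abs(x)
        out.append(min(1.0, gain * env))
    return out
```

With a gain of 20, any band whose smoothed level exceeds 0.05 already drives the follower to full scale, which is why the sidechain gains may need adjusting per band.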

The sound is much more crisp; it does still retain a surround feeling to it, and voices in front don't appear to have as much sibilance -- due to the HRTF equalization for headphones and side-mounted speakers.

If you want to adapt this Sound for front placement of speakers, take that HRTF equalization and put it into the second S channel instead of the second M channel, and make all the gains positive dB instead of negative dB. That will correct the side driven sounds to be more sideways sounding.

The Multiplying wavetables Sounds use the Kyma compression curves found in the waveform directory. These look like linear gain curves near the origin (small amplitude) and become progressively flattened -- like a magnetic B-H curve. As a result, I think I can detect some fuzz on strong sounds. We might want to change this to a linear ramp. I just chose the first waveform that looked suitable.

The SRS spec calls for a maximum M band boost of 6 dB, and a maximum S band boost of 12 dB. I haven't yet tuned anything to approach these specifications. I did find that I needed to increase the S-channel compression thresholds to around -12 dB in order to prevent apparent clipping on low level sounds. At a setting of -36 dB I was apparently magnifying a whole lot of very high frequency energy that I can't hear. More tuning ahead...

... I kind of miss that dreamy sound from the first attempt, but it wasn't correct.

- DM

[This message has been edited by David McClain (edited 25 March 2003).]


David McClain
Member
posted 25 March 2003 10:28

s3encoding.kym


gaincurve6db.aif


gaincurve12dbrev.aif

 
... Last night's corrections hosed up the interconnects, as I discovered this morning. So that is corrected now in the SRS Sound block.

But more importantly, the compressors in the S channel have been removed and replaced with Multiplying Waveshapers, as with the M channel.

Secondly, both of these Waveshapers needed gain curves, not simple transfer functions like a linear ramp. The S channel needs max gain when the sidechain has its minimum value, and vice versa. It should provide a dynamic compression range of 12 dB. Done...

The M channel waveshapers need a gain curve that is the reverse of the S channels, so that when an S band has maximum amplitude, the corresponding M band gets maximum boost, up to a maximum of 6 dB. When the S band has no amplitude the M band should get unity gain. Done...
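The two gain curves can be written as simple functions of the sidechain level (0 = silent, 1 = full scale). The endpoints follow the description above: up to 12 dB of boost on S when the sidechain is at its minimum, and up to 6 dB of boost on M when it is at its maximum. The shape in between is not stated, so interpolating linearly in dB is an assumption, as are the function names; the attached .aif curves may differ.

```python
def _clamp01(level):
    """Limit a sidechain level to the range [0, 1]."""
    return min(1.0, max(0.0, level))


def s_band_gain(level, max_boost_db=12.0):
    """S band: maximum boost at zero sidechain level, unity at full level."""
    return 10 ** (max_boost_db * (1.0 - _clamp01(level)) / 20)


def m_band_gain(level, max_boost_db=6.0):
    """M band: unity at zero sidechain level, maximum boost at full level."""
    return 10 ** (max_boost_db * _clamp01(level) / 20)


def apply_band_gains(s_band, m_band, sidechain):
    """Scale matching S and M band samples by their gain curves."""
    s_out = [x * s_band_gain(c) for x, c in zip(s_band, sidechain)]
    m_out = [x * m_band_gain(c) for x, c in zip(m_band, sidechain)]
    return s_out, m_out
```

Here `sidechain` would be the amplitude-follower output for the corresponding S band, so one control signal drives both curves in opposite directions.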

Now it sounds correct...

- DM


David McClain
Member
posted 26 March 2003 10:24

s3encoding_final.kym

 
Here is a final draft of the SRS encoding Sound. There are two versions here... one which pretty faithfully follows the patent disclosure, including the open loop correction for artificial reverb suppression, and another which incorporates the essential elements of SRS encoding done as cheaply as possible. You be the judge which is better...

I spent a lot of time bringing versions up to the full patent spec, only to find that some very poor ideas were present in that patent disclosure. Most notable is the feedback gain-controlled amplifier on the S-channel processing. This sets up a situation where the open loop controller is fighting with the feedback controller, making the gain on the S-channel processing very jittery at times.

But in truth, the essential elements of SRS are:

1. Separate incoming stereo into M and S channels
2. Apply multiband compression to the S channel
3. Send it through a broad band-reject filter to remove a fair amount of the midband frequencies
4. Apply a simple HRTF filter to the M band to make it sound like it is coming from the front
5. Add bass boost to the M channel
6. Recombine M and S processing to form new L and R channels.
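The six steps above amount to a short signal chain. The skeleton below only shows the wiring; each processing stage is passed in as a function and defaults to a pass-through, because the actual compressor, band-reject, HRTF, and bass-boost settings live in the Kyma Sound and are not reproduced here. All names are illustrative, and the 1/2 scaling on recombination is an assumed normalization.

```python
def identity(samples):
    """Pass-through stage used as a placeholder."""
    return list(samples)


def cheap_srs(left, right, compress_side=identity, reject_midband=identity,
              hrtf_eq=identity, bass_boost=identity):
    """Wire up the six-step 'cheapo' chain with pluggable stages."""
    # 1. Separate incoming stereo into M and S.
    mid = [l + r for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    # 2-3. Multiband compression on S, then a broad midband reject.
    side = reject_midband(compress_side(side))
    # 4-5. HRTF filter on M, then bass boost on M.
    mid = bass_boost(hrtf_eq(mid))
    # 6. Recombine into new L and R.
    out_l = [(m + s) / 2 for m, s in zip(mid, side)]
    out_r = [(m - s) / 2 for m, s in zip(mid, side)]
    return out_l, out_r
```

With every stage left as `identity`, the function returns its input unchanged, which makes it easy to audition one stage at a time.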

That's all the cheapo version does, and it does a very respectable job. I haven't yet run into the need for the reverb suppression circuitry. That doesn't mean it isn't necessary; just that I haven't found a situation in which it is.

The cheapo version also uses a simpler multiband compressor more along the lines of a Behringer DSP-24 6-channel unit.

What surprises me as a result of this exercise is how sloppily one can process audio without the ear really discerning much difference. I'm reminded somewhat of image processing where an FIR filter for image smoothing might only be a 3x3 pixel block. Despite how keen our senses are, one can do a lot of minimalist processing to the scene elements without offending the senses too much.

This whole thing has been simplified enough that the essential elements could be put into a tiny little Nord Micro Modular synthesizer to act as an effects processor.

With all the years that have passed since the original patent filing, I wouldn't be too surprised to find that SRS Labs has likewise made gross simplifications -- especially for personal computer multimedia systems. I would, however, expect that their HRTF processing has progressed well beyond that employed here.

I find it somewhat disturbing that people can invent such a simple device, and then patent it along with broad claims to own all such devices that equalize M and S channels for stereo image control. But I realize that the Hughes attorneys needed to keep busy with something, so why not grab the universe and lay claim to it?

Ahh, well... at least they patented their thinking so that we could learn from it. And a patent does not prevent our making personal use of such devices.

- DM


All times are CT (US)

