Topic: NonHarmonic Resynthesis
David McClain (Member)
Boy, Kyma is deep! I'm just getting into spectral resynthesis using the LiveSpectralAnalysis and OscillatorBank. It appears that, for the non-harmonic case at least, the spectral analysis is performed with a power-of-2 short-term FFT. By that I mean an FFT whose time-location and center-frequency estimates are adjusted by weighting with power or amplitude. Is that so? Can you provide any hints on how the Response setting (Better Freq, Better Time, etc.) affects this process?

It appears that the number of filters provided is affected only by the LowestAnalyzedFreq parameter. So Response must be what allows less or more frequency departure from the FFT cell center.

Using one of the new I/O Characteristics fed by a ramp, we could modify the amplitude across the spectrum in the right channel of the LiveSpectralAnalysis. That would give us a 512-channel equalizer... Doing something crazy with an I/O Characteristic in the frequency channel (left output of the LiveSpectralAnalysis) could be used to scramble sounds beyond recognition.

Cheers,

- DM
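The 512-channel equalizer idea can be sketched offline. Below is a minimal Python/NumPy sketch, not Kyma code: it assumes a 1024-point Hann-windowed FFT with 75% overlap, scales each bin by a gain curve (for example one derived from a ramp), and inverse-transforms. The function name and parameter choices are illustrative only.

```python
# Minimal offline sketch of a per-bin spectral equalizer -- an assumption
# about the idea above, not how Kyma's LSA/OscillatorBank actually work.
import numpy as np

def spectral_eq(signal, gains, frame=1024, hop=256):
    """Apply per-bin gains (len == frame//2 + 1) via overlap-add STFT."""
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame)
    out = np.zeros(len(signal))
    cola = 1.5  # approximate Hann^2 overlap-add sum at 75% overlap
    for start in range(0, len(signal) - frame, hop):
        spectrum = np.fft.rfft(window * signal[start:start + frame])
        out[start:start + frame] += window * np.fft.irfft(spectrum * gains)
    return out / cola

# e.g. a ramp-derived gain curve, -6 dB to +6 dB across the spectrum:
# gains = 10.0 ** (np.linspace(-6.0, 6.0, 1024 // 2 + 1) / 20.0)
```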
David McClain (Member)
Okay... here is my guess at what's going on in LiveSpectralAnalysis (LSA) with the non-harmonic settings.

I did a test where I fed a mono source to two parallel LSA blocks running at opposite extremes of Response (BestFreq and BestTime), passed each through an OscillatorBank, and then fed one to the left output and the other to the right. I noticed subtle delay effects between the two ears, on the order of 20 ms or so.

I did another test with a WaveShaper acting on the amplitude channel (left) coming out of an LSA before feeding the result to an OscillatorBank. This applied an interesting equalization to the sound. I found that the number of separate amplitude/frequency samples coming out matches your documentation -- 512 analysis pairs at the 1F setting for LowestAnalyzedFrequency, dropping by powers of 2 for higher settings.

So my bet is that the LSA is not quite as complicated as I had originally thought:

1. LSA is an FFT with block-overlap processing.

Hence the LSA is an FFT whose real and imaginary components are reduced to amplitude measurements. There is something going on in the (right) frequency channel output to indicate subtle variations in frequency. Sending this frequency info into a self-subtractor with delay shows noise-like variations from one frame period to the next, but overall the frequency output looks like a linear ramp. You must be doing some kind of weighted frequency estimation based on the intra-frame time location of the maximum amplitude in each FFT cell. This is a kind of "poor man's" phase information.

I find it very interesting to listen to the difference between a high-quality (= large number of partials) resynthesis and the original signal. There is something lacking in the resynthesized output, but it is very good overall. If I send the signal through a straight forward/inverse FFT block, the sound is somewhat closer to the original. So the only difference I can see between resynthesis with non-harmonic spectral analysis and straight FFT processing is the different treatment of signal phase. Apparently phase information does matter, at least to a second-order approximation of sound. A first-order approximation with even crude estimates of phase can still sound very good.

I am always inspired by the intelligence and cleverness that has gone into making Kyma. I learn so much from your system. Thanks!!

- DM

[ How to get the subtle frequency variations in each oscillator? Perhaps you are making use of phase information relative to an expected phase-ramp value within each block of processing. Differences between the measured phase and the predicted smooth phase could be interpreted as a second derivative of phase, and this would be applied, in an amount governed by the duration of the FFT block, to build a delta-F value to add to the outgoing frequency ramp? ]

[This message has been edited by David McClain (edited 12 January 2002).]
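The speculation above resembles the standard phase-vocoder frequency estimate. Whether Kyma's LSA actually works this way is not documented here; the following Python sketch just shows the textbook technique: compare each bin's measured phase advance between two successive frames against the advance predicted by the bin-center frequency, and fold the wrapped deviation into a refined per-bin frequency.

```python
# Textbook phase-vocoder frequency refinement -- an assumption about the
# kind of estimate discussed above, not a statement of Kyma internals.
import numpy as np

def refined_frequencies(frame_a, frame_b, hop, sr):
    """Per-bin instantaneous frequency (Hz) from two successive frames,
    taken `hop` samples apart from a signal sampled at `sr` Hz."""
    n = len(frame_a)
    window = np.hanning(n)
    bins = np.arange(n // 2 + 1)
    phase_a = np.angle(np.fft.rfft(window * frame_a))
    phase_b = np.angle(np.fft.rfft(window * frame_b))
    expected = 2 * np.pi * bins * hop / n        # bin-center phase advance
    deviation = phase_b - phase_a - expected
    deviation = np.mod(deviation + np.pi, 2 * np.pi) - np.pi  # wrap to [-pi, pi)
    return (bins + deviation * n / (2 * np.pi * hop)) * sr / n
```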
David McClain (Member)
Why all the interest in these details? Well, I wanted to compare the techniques used for frequency shifting and temporal rate changing by Kyma and by a new product from Ableton called "Live" [ www.ableton.com ].

Live is a program that lets you import WAV and AIFF samples recorded at arbitrary sample rates, stack them, and play them back at arbitrary tempos. You can apply static pitch shifts to each sample, but not dynamic ones. Hence you can tune your samples to each other and have them play back in synchrony. The sonic reproduction provided by Live is quite good -- merely sample-rate-conversion and pitch-shifting interpolation artifacts (virtually unnoticeable). There are annoying sample-chopping artifacts introduced under some circumstances: in order to synchronize different samples with each other at arbitrary tempos, Live chops the sounds into blocks as fine as a 1/16 note or as coarse as 1 bar, with numerous divisions in between, and simply slides these chopped, interpolated blocks to and fro in time to get them to synchronize to the beat. On a long sustained pad these artifacts are most noticeable, but on drum riffs they are pretty well hidden. Overall the sound is impressively good, and you can mask the chopping artifacts by layering sounds. (Layering can hide so much...)

Kyma, on the other hand, offers playback-rate adjustment, sample-rate conversion, and frequency shifting, all in real time, and all of them dynamic. I wanted to understand how Kyma could avoid these chopping artifacts during altered playback rates, and also how it could simultaneously offer dynamic pitch shifting. I find the sonic reproduction of Kyma resynthesis lacking subtle qualities compared to the original sound, not surprisingly. But again, layering will doubtless hide these subtleties. I am quite impressed that Kyma never produces the chopped-sample artifacts that happen in Live, because it never needs to chop the sample into blocks: playback-rate adjustment is essentially continuous instead of gridded. Sample-rate conversion is automatic in Kyma too.

Overall, both the Live and the Kyma techniques produce good sound. Both have their shortcomings. But like a good layer of makeup, layering of sounds can make either one sound fabulous.

- DM

[This message has been edited by David McClain (edited 12 January 2002).]
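As an illustration of why gridded block-shifting produces chopping artifacts on sustained material, here is a deliberately naive Python sketch: slice the sample on a fixed grid and re-space the slices at a new tempo, with a short crossfade. This is an outside guess based on the description above, not Ableton's actual algorithm; note that with ratio > 1 the gaps between re-spaced slices are exactly the kind of discontinuity described.

```python
# Naive block-slicing time stretch -- a caricature of the gridded approach
# described above, not Live's real algorithm.
import numpy as np

def slice_stretch(signal, slice_len, ratio, fade=64):
    """Stretch by `ratio` (>1 slower) by re-spacing fixed-length slices."""
    signal = np.asarray(signal, dtype=float)
    out = np.zeros(int(len(signal) * ratio) + slice_len)
    ramp = np.linspace(0.0, 1.0, fade)
    for i, start in enumerate(range(0, len(signal) - slice_len, slice_len)):
        block = signal[start:start + slice_len].copy()
        block[:fade] *= ramp              # short crossfade in...
        block[-fade:] *= ramp[::-1]       # ...and out, to soften the chop
        out_pos = int(i * slice_len * ratio)
        out[out_pos:out_pos + slice_len] += block
    return out
```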
SSC (Administrator)
Thanks for the analysis :-)

You should be able to do something *similar* to Live with Kyma using the Polyphonic Pitch Shifter (time) or with the TimeFrequencyScaler from the prototypes.
David McClain (Member)
More info and some test results -- see for yourself!

I did a simple remix this weekend in Live, and then I decided to take part of that song and produce it on the Kyma with and without resynthesis of the lead grunge guitar. Personally, I like the sound of the resynthesis -- more clarity, crisper sound, and mostly a lot less noise. In fact, that is a remarkable advantage of resynthesis: it removes most of the noise in the signal. This is probably even more true for harmonic analysis than for non-harmonic analysis. Given the layering that takes place in the mix, you never hear the artifacts of resynthesis. And furthermore, the pitch shifting is superb and seamless. I don't think Live performs quite as well here, given the layering of sounds that removes the advantage of pitch shifting the raw sonic material instead of a reconstruction as in Kyma. But judge for yourself...

Anyway, I took a short section of the song in the middle where I had to repitch the guitar every beat, and did this with Kyma non-harmonic analysis followed by a 256-oscillator bank on each of the left and right channels. That's what you hear in the MP3 at "Kyma Resynth Mix". All of the equalization and effects processing was performed (mostly) live on the Kyma, but I did cache some results for playback -- especially the stereo resynthesis of the guitar.

The example titled "Kyma Raw Mix" contains only the 8 lead-in bars of the section, because without resynthesis the only choice for repitching on Kyma (as far as I know...) is playback-rate adjustment. But when moving up as much as 6 semitones, it just doesn't sound correct because it quickly falls out of sync. So the raw mix merely depicts the original source material running through precisely the same effects as were used for the resynth mix.

Finally, the "Live Mix (full song)" contains the result of the Live processing. I tried to mimic its effects processing with Kyma. But I think Kyma actually sounds better. See for yourself.

These are relatively short MP3s: the raw mix is about 260 KB, the resynth mix is about 600 KB, and the full song is about 2.6 MB.

All of the Kyma processing was performed in independent stereo left/right channels. Processing consists of:

* a +3 dB boost at 2 kHz (200 Hz BW) on the guitars,
* a +6 dB boost at 6 kHz (600 Hz BW) on the drums, followed by a compressor (ratio = 3, thresh = -18 dB, gain = 6 dB, attack = 1 ms, release = 10 ms),
* both guitars and drums then fed over a send bus to a stereo phaser with a high-pass front end set at 500 Hz; both of these tracks play in parallel with their effects processing mixed at 50%,
* the phaser fed directly to a sine-modulated high-pass filter sweeping from about 500 Hz to 1250 Hz cutoff,
* the modulated high-pass filter fed to a stereo ping-pong delay set at 3/16 beat on both channels.

I personally think that Kyma comes out the winner again. What do you think?

- DM

[ Interestingly, now that I go back and listen to the Live-produced song, it has a definite dark character to all the sound. Why? I didn't push anything through low-pass filters anywhere. In fact, if anything, I added a high-pass, just as described in the processing steps on Kyma. But Kyma produces a brighter-sounding mix overall. Both the Kyma and Live mixes were sample-rate converted to 22.05 kHz before encoding as MP3s, so that processing was identical... hmmmm....

It can't have anything to do with the quality of the A/D and D/A in either the computer or the Kyma, because the Live mix was recorded directly from the program's generated output without ever going to a soundcard, and in the case of Kyma, I did all the work in Kyma and then recorded in the computer across an S/PDIF connection. So what causes the Kyma to sound so much brighter? ]

[This message has been edited by David McClain (edited 15 January 2002).]
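For readers who want to reproduce the drum-bus dynamics from the list above, here is a textbook feed-forward compressor sketch in Python with those settings (ratio 3:1, threshold -18 dB, +6 dB makeup, 1 ms attack, 10 ms release). It is a minimal envelope-follower design, not a description of Kyma's Compressor internals.

```python
# Minimal feed-forward compressor -- a textbook sketch using the settings
# listed in the post above, not Kyma's actual Compressor implementation.
import numpy as np

def compress(x, sr, thresh_db=-18.0, ratio=3.0, makeup_db=6.0,
             attack_ms=1.0, release_ms=10.0):
    x = np.asarray(x, dtype=float)
    atk = np.exp(-1.0 / (sr * attack_ms / 1000.0))
    rel = np.exp(-1.0 / (sr * release_ms / 1000.0))
    env, out = 0.0, np.empty_like(x)
    for i, sample in enumerate(x):
        level = abs(sample)
        coeff = atk if level > env else rel
        env = coeff * env + (1.0 - coeff) * level   # one-pole envelope follower
        level_db = 20.0 * np.log10(max(env, 1e-9))
        over = max(level_db - thresh_db, 0.0)       # dB above threshold
        gain_db = makeup_db - over * (1.0 - 1.0 / ratio)
        out[i] = sample * 10.0 ** (gain_db / 20.0)
    return out
```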
David McClain (Member)
I mis-stated the recording technique used with Kyma... Instead of recording from the S/PDIF output of the Capy, I simply recorded the Timeline directly to disk using the Action menu option.

So I did another test to see why Kyma sounds so bright. I recorded a sample of white noise with CoolEdit (about 10 seconds long). Then I recorded the Live rendition and the Kyma rendition of this noise. The original spectrum is quite flat on average. The Live rendition is similarly quite flat with frequency. But... the Kyma version shows substantial roll-off at the higher frequencies. So if anything, Kyma should be sounding darker, not the other way around.

Next, I re-examined my effects track level, dropped it to 70% (-3 dB), and re-recorded the test-piece Timeline. Now it sounds much closer to the Live version. So it appears that the Kyma HPF is working too well compared to the Live version, and that seems to account for the difference.

- DM
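The white-noise flatness check can be reproduced along these lines: estimate the average spectrum of each recording and compare high-frequency roll-off. A sketch using SciPy's Welch estimator; the file names are hypothetical placeholders for the three recordings.

```python
# Sketch of the average-spectrum flatness comparison described above.
import numpy as np
from scipy.io import wavfile
from scipy.signal import welch

def average_spectrum_db(path):
    """Welch-averaged power spectral density of a WAV file, in dB."""
    sr, data = wavfile.read(path)
    if data.ndim > 1:
        data = data.mean(axis=1)          # fold stereo to mono
    freqs, psd = welch(data.astype(float), fs=sr, nperseg=4096)
    return freqs, 10.0 * np.log10(psd + 1e-20)

# e.g. compare "noise_original.wav", "noise_live.wav", "noise_kyma.wav"
# (hypothetical file names) and look for roll-off above a few kHz.
```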
Eric Payrot (Member)
Hi David,

Funny you tried that. I bought Live recently and think it's great software, especially from the UI usability standpoint, which is the best I've ever seen in any audio software. I am somewhat disappointed by the Live sound, which can be dull in comparison with Kyma and is not very "hot" from a level standpoint. I also miss the ability to disable timestretch on non-BPM samples.

I've noticed for a long time that samples played through Kyma have much more presence, clarity, and punch than when played through my soundcard (S/PDIF out into an 03D mixer). The stereo definition sounds more detailed too. I thought this was coming from the Kyma DAC and output analog circuitry, although that is not very logical, as I monitor both my soundcard and Kyma through the 03D.

I wish I could send the Live stereo output to Kyma for its DAC and for some processing inserts too. Unfortunately I'm on PC, and there is still no ASIO driver for Kyma... I wonder if the ASIO driver could also be used to send one of the Live "send" tracks to Kyma and return it to one of the Live tracks after some processing in Kyma.

Thinking about it, a killer way to integrate Live with Kyma would be some kind of "pluggo" shell, as with Max, sending audio data back and forth between the two apps and enabling control and automation of Kyma VCS parameters from the Live plugin interface (with x,y control surface...)... just dreaming :-)

Eric

I'll give a listen to your MP3s after I finish downloading them.
David McClain (Member)
Hi Eric,

I have to agree about the GUI in Live. It is one of the most impressive I have seen in a long time. While everyone else is working overtime to provide photo-realistic GUIs, the Ableton team has redefined what it means to be a computer interface. I find it very attractive and appealing to use.

I have noticed that Live notches down its internal bus levels by 6 dB, but the recorded results are as hot as can be when played by any other player program.

As for piping live audio into Kyma for effects processing, why bother going into the computer first? Why not just feed the live audio directly into the Capy, and from there record over S/PDIF into your mixer program? Or am I missing something there? I do this all the time here -- I use a computer with an E-mu APS soundcard. The Capy feeds its S/PDIF at 48 kHz into that soundcard, and the second S/PDIF of the E-mu feeds my Akai DSP-16 mixer/recorder. But the Capy has plenty of analog audio lines too, for live feeds.

And the Kyma Timeline recordings on the Web definitely had too much high end. I'll re-upload those samples for a less hissy sound...

- DM
Bill Meadows (Member)
This thread has really begun to digress from a "support" topic, but...

I use non-harmonic analysis/resynthesis all the time for pitch shifting. It is the best-sounding polyphonic pitch shifting that Kyma can do, but it is very expensive in terms of DSP resources.

I also have an Eventide DSP-4000, and I must say its pitch shifting is very good -- very natural sounding -- and superior to Kyma on most (but not all) material. They must have some amazing algorithms, because they can do eight simultaneous pitch shifts, and I'm sure they don't have as much DSP power as my Capy 320/8DSP.

I have posted a Kyma version of a DSP-4000 patch called "Pitch+Time Manifold" in the Sound Exchange area -- check it out. It has two pitch shifts derived from analysis/resynthesis objects, plus various delays and feedback paths. It is a fun sound mangler.
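Analysis/resynthesis pitch shifting of the kind mentioned here can be caricatured in a few lines of Python: scale the analyzed partial frequencies and drive a bank of sinusoidal oscillators. This is a toy sketch (amplitudes held constant within each frame, and the per-frame analysis data is assumed given in a hypothetical format), not the DSP-4000's or Kyma's actual algorithm.

```python
# Toy oscillator-bank resynthesis with a frequency scale factor -- a sketch
# of the analysis/resynthesis pitch-shift idea, with invented data layout.
import numpy as np

def resynth_shift(amps, freqs, semitones, sr, hop):
    """amps, freqs: (n_frames, n_partials) analysis arrays; returns audio."""
    ratio = 2.0 ** (semitones / 12.0)
    phases = np.zeros(freqs.shape[1])
    t = np.arange(hop) / sr
    out = []
    for amp_frame, freq_frame in zip(amps, freqs):
        shifted = ratio * freq_frame
        # sum one sinusoid per partial over this frame, phase-continuous
        block = (amp_frame[:, None]
                 * np.sin(phases[:, None] + 2 * np.pi * shifted[:, None] * t)).sum(0)
        phases += 2 * np.pi * shifted * hop / sr
        out.append(block)
    return np.concatenate(out)
```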