Kyma Forum
  Tips & Techniques
  Re: The Problem with Spectral Modifications

David McClain
Member
posted 14 May 2005 17:44
While working on something else the other day, I suddenly realized why people have been complaining about the quality of sounds generated by making spectral modifications to streams of sounds using the Kyma FFT blocks... People have complained that they could hear the pre-echo caused by FIR-like filtering, and I have doubted that for some time because this pre-echo is so short as to be imperceptible in most cases.

But! What happens when you make spectral modifications to the spectrum of sounds in between an FFT and an inverse-FFT block is that you are almost, but not quite, performing the same action as an FIR filter. The difference makes a world of difference in sound quality and does in fact lead to the artifacts that people complain about. These aren't the pre-echo per se, but something called temporal aliasing.

Most of you know what spectral aliasing is, and what it sounds like. Its cause is the failure to sample the incoming data at more than twice the highest frequency present in the signal. Temporal aliasing is very similar, mathematically speaking, but it operates in the conjugate domain, and it sounds very different from spectral aliasing. In temporal aliasing, some amount of signal from the earlier part of a block of samples gets mixed with samples later in the block, so there is confusion about when an attack occurs. That attack gets smeared out over the full duration of the FFT block.

As long as spectral filtering or modifications are performed inside the FFT and then the modified signal is reconstructed by the inverse-FFT, this temporal aliasing will be a problem. Windowing the data only helps moderate its influence but cannot remove the artifacts.

I have been guilty in the past of saying that modifying the signal in between an FFT and inverse-FFT block is the same as performing FIR filtering. In fact it is not. When you make modifications to the signal spectrum in between the FFT blocks, you are indeed applying the effects of an FIR filter, but rather than convolving the signal with this effective FIR filter, you are really performing a circular convolution of the signal with that FIR filter. Circular convolution mixes early and late portions of the signal within the block, and hence smears signal events over the entire duration of the FFT block.
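For the curious, here is a small NumPy sketch (not Kyma code; the pulse position and filter taps are invented purely for illustration) showing how circular convolution wraps filter output around to *before* an attack:

```python
import numpy as np

N = 16
x = np.zeros(N)
x[1] = 1.0                        # an "attack" early in the block
h = np.zeros(N)
h[[0, 13, 14, 15]] = 0.25         # an FIR filter with a long tail

# Circular convolution: multiply the two spectra, then inverse-FFT.
circ = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)))

# Linear convolution: what a true FIR filter would do (length 2N - 1,
# so nothing wraps around).
lin = np.convolve(x, h)

# The tail of the linear result (sample 16) wraps around to sample 0 in
# the circular result -- energy appears *before* the attack at sample 1.
print(circ[:2])   # wrapped tap at sample 0, then the attack's own output
print(lin[:2])    # silence at sample 0 before the attack
```

The nonzero value at sample 0 of the circular result is exactly the temporal aliasing described above: filter output that belongs after the end of the block folded back to its start.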

You can do spectral modification using FFT's but to do so properly requires that you not reconstruct the signal from the modified spectrum. Instead, the modifications must consist solely of gain and phase changes at each frequency. Then the output of the following inverse-FFT block will actually be an FIR filter with as many taps as that block size.

[Actually, the FIR taps will be presented in a temporally rotated manner and you have to delay the first half of the output so that it can be joined to the end of the second half. This can be done with creative use of delay lines.]
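A NumPy sketch of that temporal rotation (the brick-wall gain curve is an arbitrary example; the spectrum is kept conjugate-symmetric so the taps come out real):

```python
import numpy as np

N = 32
gains = np.ones(N)
gains[8:25] = 0.0                 # arbitrary real (zero-phase) gain curve,
                                  # symmetric so the inverse FFT is real

h_rotated = np.real(np.fft.ifft(gains))   # taps wrapped: biggest tap at 0,
                                          # "negative time" taps at the end
h_centered = np.fft.fftshift(h_rotated)   # swap the two halves: a centered FIR

print(np.argmax(np.abs(h_rotated)))       # peak tap sits at sample 0
print(np.argmax(np.abs(h_centered)))      # peak tap now at N/2, mid-block
```

Swapping the halves (what `np.fft.fftshift` does here) is the same operation as the delay-line trick described in the bracketed note.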

Then to actually modify the incoming signal with that newly constructed FIR filter, you have to perform normal FIR filtering, which is a simple linear convolution operation. Only in this way can you cleanly construct a spectrally modified signal without any bothersome temporal aliasing artifacts.

In this case the preferred manner of treating the data and the inverse-FFT reconstruction is to apply windowing to both the input and output signals. Windowing the input signal prevents the spurious appearance of excess high-frequency power, and windowing the inverse-FFT output tapers the time-domain FIR filter so that it acts mostly in the middle of its block length and produces relatively little output for delays near the beginning and end of the tap sequence.

------------------
There is a common misapplication of spectral modification found in the sciences that attempts to perform Hilbert transforms on data streams by taking the FFT of the data, rotating the real and imaginary components of the Fourier spectrum by +45 and -45 degrees respectively, and then attempting to reconstruct the Hilbert-transformed data by means of an inverse-FFT on this phase-modified spectrum.

But as I have just described above, this will produce temporal aliasing and so despite a nice quadrature phase spectrum, the amplitude spectrum will be garbled and produce sound streams that contain unnecessary artifacts.

The proper way to do this kind of Hilbert transform is simply with a direct application of FIR filtering. The filter tap weights are easy to construct. Every even tap is zero, and every odd tap has weight equal to the tap number's reciprocal. E.g., [0, -1/7, 0, -1/5, 0, -1/3, 0, -1, 0, 1, 0, 1/3, 0, 1/5, 0, 1/7, 0] for a 17-tap Hilbert transformer.
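A sketch of that construction in NumPy. Note that the ideal Hilbert transformer is usually written with a 2/pi gain factor on each tap; that factor is omitted here to match the tap list in the text:

```python
import numpy as np

def hilbert_taps(ntaps):
    """Antisymmetric Hilbert-transformer taps as described in the post
    (the usual 2/pi gain factor is left out to match the list above)."""
    assert ntaps % 2 == 1, "use an odd number of taps"
    n = np.arange(ntaps) - ntaps // 2     # ..., -2, -1, 0, 1, 2, ...
    taps = np.zeros(ntaps)
    odd = (n % 2) != 0
    taps[odd] = 1.0 / n[odd]              # even-offset taps stay zero
    return taps

taps = hilbert_taps(17)
print(taps[7:10])                 # the -1, 0, 1 around the center tap

# Antisymmetry is what guarantees a 90-degree phase shift at all frequencies:
assert np.allclose(taps, -taps[::-1])
```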

--------------------------------
FFT's are incredibly useful devices, but they must be used with knowledgeable caution. It is too easy to get lulled into the notion that these actually produce Fourier Transforms. They do not, and it is at the boundaries of time and spectrum that the errors are most manifest.

I maintain that the listener cannot hear the small-amplitude pre-echo resulting from a short (e.g., 256-1024 block size) FFT running at a high sample rate. Even for a comparatively large FFT block size like 1024 elements, the pre-echo at a 48 KHz sample rate would only be about 10 ms long, and generally of exceedingly low amplitude. To hear that would be claiming to hear the low-level 100 Hz sidebands beneath a loud signal at the spectral center. Given the workings of loudness masking, that strains credulity. However, the effects of temporal aliasing will be quite apparent, since sharp attacks will appear smeared over the entire duration of the block, in this case as much as 20 ms. That would doubtless sound objectionable...

- David McClain


pete
Member
posted 15 May 2005 15:40
Hi David

I'm not too sure about this bit:

[Actually, the FIR taps will be presented in a temporally rotated manner and you have to delay the first half of the output so that it can be joined to the end of the second half. This can be done with creative use of delay lines.]

I think here you are trying to derive the negative frequencies, which could also be considered as the frequencies between half and full sample rate. But a simple delay wouldn't do it; you would have to reverse it in time and add it to the end. The odd-numbered input (time-domain) samples would be happy only if you added this time-reversed signal in antiphase (times -1), but the even input samples would only be happy if this signal was added in phase.

Sometimes looking at the problem upside down and back to front can make things clearer. As you know, the FFT and inverse FFT do almost the same thing. If you feed the first FFT an input consisting of a single pulse one sample in from the beginning of the window, you will get a cosine wave out of the real output and a sine wave out of the imaginary output. Don't forget that the pulse is in the time domain and the sine and cosine waves are describing the spectrum. If there is a single pulse in the third position, we again get a cosine out of the real and a sine out of the imaginary, but at double the frequency.

In fact, no matter what input signal you put into the FFT, you can only ever get a mix of cosine waves out of the real output and a mix of sines out of the imaginary output. You will never have any sine components coming out of the real output or any cosine components coming out of the imaginary output, regardless of the input signal, even if the input is noise.

Now suppose we had another FFT alongside the first, fed with a second input signal, that being your Hilbert transformer. It too would be limited to having real and imaginary outputs with only cosine and sine mixes respectively. Now if we did complex multiplies on these two signal pairs, it would act just like a single-sideband ring modulator, and every frequency in one would shift all the frequencies in the other upward (the sum of the frequencies only). But even this output still follows the magic pattern (only cosines in the real, only sines in the imaginary). If you maintain this pattern and send that to the inverse FFT, you don't get any artifacts in the output, even if you don't use any windows or overlapping frames on the input.
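That cosine/sine symmetry of a real block's spectrum can be checked in a few lines (a NumPy sketch, not Kyma; note NumPy's sign convention puts a minus on the sine):

```python
import numpy as np

N = 32
x = np.zeros(N)
x[1] = 1.0                        # single pulse, one sample into the frame
X = np.fft.fft(x)

k = np.arange(N)
print(np.allclose(X.real, np.cos(2 * np.pi * k / N)))     # a cosine wave
print(np.allclose(X.imag, -np.sin(2 * np.pi * k / N)))    # a sine wave

# A pulse in the third position gives the same waves at double the frequency:
y = np.zeros(N); y[2] = 1.0
Y = np.fft.fft(y)
print(np.allclose(Y.real, np.cos(2 * np.pi * 2 * k / N)))
```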

Having said that, this is only good for filtering, but a lot of what we try to do in Kyma is pitch shifting, formant shifting, morphing, etc., and this introduces a whole new set of artifacts and problems. Here we try to break the signal down into its component parts, and we have to forget the phase relationship between them altogether. (Here I'm talking about phase relationships in the time domain.) We know we have a good analysis of the sound if we can play the component sine waves (following the pitch and level) and the signal still sounds the same regardless of what phases the sine waves were started up in.

This opens up a whole new set of problems, and human perception has to be added into the formula.

The time smearing you've mentioned is only one of the sources of artifacts, but most of what we do has many more, some of which sound similar to time smearing.

One cure could be to have smaller windows at higher frequencies and wider ones at low frequencies. This means that the higher frequencies will have less frequency resolution and the low frequencies will have more time smear. This fits in well with the human perception model, but the data produced is not so regular and is almost impossible to transform into something sensible. You have to forget the harmonic spacing. Also, this in itself produces artifacts when the source signal's component frequencies fall on the borders.

Pete


David McClain
Member
posted 18 May 2005 12:11
Hi Pete,

My quick mouth was incorrect in only one category above... Namely, that I said you couldn't correctly perform Hilbert transforms using FFT's. In fact you can, and I misstated the case.

Any time you perform nothing more than a simple multiplicative scaling operation to every real and imaginary component of the FFT, you can get artifact-free results. The reason for that is that the conjugate operation is one of merely convolving with a complex-valued Dirac delta function -- in the computer domain, this means multiplying every sound sample by that complex-valued number. No artifacts are produced when you do this in the Fourier domain.

But you mention things like Vocoding and pitch-shifting. All of these are "filtering" operations of a sort, although the filtering occurs in a manner we aren't accustomed to thinking of as filtering...

For example, pitch shifting, ideally means simply shifting the spectrum of the sound. If it is a linear shift it can be done by multiplying the so-called "analytic spectrum", a one-sided spectrum without negative frequency components, by a complex exponential. You get this analytic signal by applying the Hilbert transform to the signal to produce its quadrature component, then treat this quadrature component and the original in-phase component as two parts of the single analytic signal.
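A NumPy sketch of this linear shift (not Kyma code; the 440 Hz test tone and 100 Hz shift are arbitrary, and the analytic signal is built here by zeroing the negative frequencies, a standard construction):

```python
import numpy as np

fs = 48000
t = np.arange(fs) / fs                  # one second of time
x = np.sin(2 * np.pi * 440.0 * t)       # 440 Hz test tone

# Analytic signal: zero the negative frequencies, double the positive ones.
X = np.fft.fft(x)
X[fs // 2 + 1:] = 0.0
X[1:fs // 2] *= 2.0
analytic = np.fft.ifft(X)               # x + j * Hilbert(x)

# Multiply by a complex exponential and take the real part: 440 -> 540 Hz.
shifted = np.real(analytic * np.exp(2j * np.pi * 100.0 * t))

# For a one-second buffer the rfft bins are 1 Hz wide:
peak = int(np.argmax(np.abs(np.fft.rfft(shifted))))
print(peak)   # the spectral peak has moved to 540
```

Because the one-sided spectrum has no negative-frequency image, the shift is clean; shifting the two-sided spectrum this way would fold the image into the audible band.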

That much holds only for linear frequency shifting. Performing a more useful harmonic shift entails multiplying the frequency of each harmonic component by some scale factor. You can't readily do this in the Fourier domain without introducing some rather serious artifacts. However, if you could work in the domain of the Fourier transform of the logarithm of the signal Fourier transform, this multiplication then becomes a simple additive operation again. However, going to the log-Fourier^2 domain also entails unwrapping the phase of the signal -- a difficult operation to do in the presence of noise.

All I was really trying to do in the above discussion was to express my understanding and sympathy for those who have been complaining about the quality of sounds emanating from FFT modifications. Any modifications, apart from simple scaling of real and imaginary components, will produce nasty artifacts. In the case of operations that correspond to conventional filtering, the way around these artifacts is to use the inverse-FFT of the modification values, not the modified spectrum itself, to produce a windowed FIR filter. Doing that much removes all evidence of temporal aliasing.

[...actually, there really is a way of doing this entirely with FFT's without introducing temporal aliasing, and it involves quarter-frame or tighter overlap, using half-frame windowing. In that way, the circular convolution becomes equivalent to linear convolution, but only if you first convert your proposed modifications to a half-frame windowed FIR and then move back to the Fourier domain with this modified FIR.]
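One standard realization of this idea is overlap-add: keep the FIR shorter than the FFT frame and zero-pad each input hop, so the circular convolution has enough empty room that nothing wraps around. A NumPy sketch (frame and filter sizes are arbitrary; this is not the Kyma block layout):

```python
import numpy as np

def fft_filter_overlap_add(x, h, nfft=64):
    """FFT block filtering that exactly equals linear convolution."""
    hop = nfft - len(h) + 1             # input samples consumed per frame
    H = np.fft.fft(h, nfft)
    y = np.zeros(len(x) + len(h) - 1)
    for start in range(0, len(x), hop):
        frame = x[start:start + hop]    # short frame, zero-padded to nfft
        Y = np.fft.fft(frame, nfft) * H # circular conv, but the padding
        seg = np.real(np.fft.ifft(Y))   # leaves room so nothing wraps
        y[start:start + nfft] += seg[:min(nfft, len(y) - start)]
    return y

rng = np.random.default_rng(0)
x = rng.standard_normal(1000)
h = rng.standard_normal(33)
print(np.allclose(fft_filter_overlap_add(x, h), np.convolve(x, h)))  # True
```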

Doing other more complex kinds of modifications to the spectrum will leave you with sonic artifacts no matter what you do. The best you can hope for is high sample rates, very short FFT blocks (short time spans), and good windowing practices. In this way, the temporal aliasing is held to very short timespans and the windowing produces fewer high frequency artifacts and can also minimize the amplitude of temporal aliasing.

Unfortunately, we are held hostage by nature when dealing with both frequency and time. The Heisenberg Uncertainty Principle is not something uniquely discovered by Heisenberg in the field of quantum mechanics. It has actually been around since Fourier's discovery that you could decompose a time series into a collection of waveforms. You can't arbitrarily refine your resolution in the time domain and simultaneously achieve arbitrarily fine frequency control. There is a fundamental limit to the product of these two refinements that cannot be breached. Doing well in one domain necessarily means doing poorly in the other. Add to that the artifacts imposed by digital sampling.

- DM

[This message has been edited by David McClain (edited 18 May 2005).]



David McClain
Member
posted 18 May 2005 12:14
... the temporal rotation of the inverse FFT to produce an FIR filter is a consequence of the math. You can either interchange the two halves of the resulting output in the time domain...

... or else you can preempt this effect by multiplying alternate real and imaginary FFT components by -1. That has the effect of shifting [er, rotating] the entire output from the subsequent inverse FFT by one-half period.
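A NumPy sketch verifying the alternating-sign trick (arbitrary test spectrum):

```python
import numpy as np

N = 64
rng = np.random.default_rng(1)
X = np.fft.fft(rng.standard_normal(N))  # spectrum of an arbitrary real block

signs = (-1.0) ** np.arange(N)          # +1, -1, +1, -1, ...
a = np.fft.ifft(X * signs)              # sign-flip in the frequency domain
b = np.fft.fftshift(np.fft.ifft(X))     # explicit half-frame rotation

print(np.allclose(a, b))   # True: the two are the same operation
```

This works because multiplying bin k by (-1)^k is the same as multiplying by exp(j*pi*k), which the inverse FFT turns into a half-frame rotation of the output.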

- DM

[This message has been edited by David McClain (edited 18 May 2005).]


David McClain
Member
posted 18 May 2005 12:22
As for your inverse-frequency windowing widths -- this is commonly done, and it corresponds to the creation of Wavelet decompositions, e.g., Haar Wavelets. Unfortunately, while this corresponds more closely to our perception, any modifications performed on the resulting Wavelet decomposition tend to sound even poorer to our ears than Fourier filtering.

The reasons why this should be so have puzzled me for some time now... Possible culprits are the use of Quadrature Mirror Filters, which leave a huge amount of spectral crossover bleed between the high- and low-frequency subbands, and perhaps the failure of Wavelet processing to properly account for aliasing and sub-aliasing artifacts.

Once again, no matter how you do it, you are constrained by the uncertainty relation that exists between the conjugate domains of time and frequency.

- DM


David McClain
Member
posted 18 May 2005 12:43
As an interesting aside, the Kyma versions of the FFT and inverse FFT are actually Hartley transforms, not Fourier transforms.

The reason for this is that Kyma works on realizable signals in the time domain. Kyma only presents the positive frequency components to us for modification.

Hence any signal is assumed to have an even real part and odd imaginary part for its frequency spectrum. No matter what you do to these real and imaginary frequency spectrum components, the inverse FFT will force the interpretation of even-real, and odd-imaginary, components in order to produce a real time-domain signal.

Oddly enough, while the manual claims that the real part is in the left channel and the imaginary part is in the right channel [am I being dyslexic again?], my own simple experiments have shown that the left channel corresponds to a collection of sine waves, and the right channel corresponds to a collection of cosine waves -- opposite the usual interpretation of real and imaginary spectral components. But so what?!

- DM


tuscland
Member
posted 18 May 2005 17:05
Hi David,

While I was reading your posts, one of the things that came to mind was wavelet decomposition, but you talked about it before I could show how clever I was :-)

I was surprised that you stated that wavelet processing would sound "bad". I have actually never heard anything processed using wavelets, but I thought that this kind of analysis, where some kind of multi-resolution is used (lower frequencies are analysed with greater resolution than higher frequencies), should sound more natural and attenuate the 'aliasing' effect discussed above.

I have another question:
Another factor that is a source of temporal aliasing when using FFT transforms for spectral analysis is the time correlation between successive FFT analyses. (You may already have talked about this, but I would like to discuss it in order to make my mind clear.) The FFT is done on finite blocks of audio which are transformed back afterwards. You talked about windowing techniques, showing that some are more successful than others. Is there any kind of time/frequency analysis that, instead of processing discrete blocks, would process the signal as a continuous stream?

While the window needs to be carefully chosen, the size of the FFT blocks is also very important. What are the practical choices when choosing both the window and the FFT size? I am a Kyma beginner, and I have walked through the examples, but I have not seen specific applications of windowing techniques. Things seem to be simply FFT'ed, processed, then fed into an OscillatorBank or equivalent...


Best Regards,
Camille


SSC
Administrator
posted 18 May 2005 23:44
quote:
Originally posted by tuscland:
I was surprised that you stated that wavelet processing would sound "bad". I have actually never heard anything processed using wavelets, but I thought that this kind of analysis, where some kind of multi-resolution is used (lower frequencies are analysed with greater resolution than higher frequencies), should sound more natural and attenuate the 'aliasing' effect discussed above.

The difficulty comes from slicing the spectrum into chunks that do not overlap. Any overlap between bands (whether equal size as in the FFT, or varying sizes as in the Wavelet transforms) produces aliasing (audible if you do any sort of processing in the spectral domain). (If you do not do any spectral modifications, then the aliasing components will cancel each other out when you resynthesize the spectrum.)

The FFT can work better than a Wavelet transform for harmonic sounds since the spectrum is sliced into equal sized chunks. If these chunks are small enough, you can guarantee that there is only one harmonic in any given chunk.

Since the Wavelet transform slices the spectrum into unequal chunks, it is very difficult to guarantee that there is only one harmonic in a chunk. This means that any processing done to a given chunk in the Wavelet domain will change the amplitude and phase of a sum of two or more harmonics, not each harmonic independently.

quote:
Another factor that is a source of temporal aliasing when using FFT transforms for spectral analysis is the time correlation between successive FFT analyses. (You may already have talked about this, but I would like to discuss it in order to make my mind clear.) The FFT is done on finite blocks of audio which are transformed back afterwards. You talked about windowing techniques, showing that some are more successful than others. Is there any kind of time/frequency analysis that, instead of processing discrete blocks, would process the signal as a continuous stream?

While the window needs to be carefully chosen, the size of the FFT blocks is also very important. What are the practical choices when choosing both the window and the FFT size? I am a Kyma beginner, and I have walked through the examples, but I have not seen specific applications of windowing techniques. Things seem to be simply FFT'ed, processed, then fed into an OscillatorBank or equivalent...


The simplest way to get a continuous stream is to perform an FFT on overlapping chunks of the signal. For instance, for a 1024 long FFT, you could have 1024 FFTs, offset from each other in time by one sample.

---

The choice of window affects how good the "filters" are that slice the spectrum into equal size chunks.

The rectangular window has a relatively sharp cutoff but a lot of ringing (sidelobes) in the frequency domain. This means that one harmonic's energy can leak into the measurements of more than one chunk.

Other windows, such as the Hamming window, have much smaller sidelobes but a slower cutoff, so that one harmonic's energy is concentrated in two adjacent chunks (and less in chunks farther away).
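The sidelobe comparison can be made concrete with a NumPy sketch (the window length and zero padding are arbitrary; the padded FFT just gives a finely sampled picture of each window's shape):

```python
import numpy as np

N = 64                            # window length
pad = 4096                        # heavy zero padding

def spectrum_db(w):
    W = np.abs(np.fft.rfft(w, pad))
    return 20 * np.log10(W / W.max() + 1e-12)

def highest_sidelobe_db(w):
    db = spectrum_db(w)
    i = 1
    while db[i + 1] < db[i]:      # walk down the mainlobe to its first null
        i += 1
    return db[i:].max()           # tallest peak past the mainlobe

print(highest_sidelobe_db(np.ones(N)))      # rectangular: about -13 dB
print(highest_sidelobe_db(np.hamming(N)))   # Hamming: roughly -43 dB
```

The trade is visible in the numbers: the Hamming window buys roughly 30 dB less leakage into distant chunks at the price of a wider mainlobe (energy spread over two adjacent chunks).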

---

The length of the FFT affects the number of slices made to the spectrum. A 1024-long FFT will divide DC to the sample rate into 1024 equal-sized chunks (roughly 43 Hz wide for a 44.1 kHz sample rate). For best results, you want to have no more than one important spectral component in one of these chunks, so this would mean a fundamental frequency of about 43 Hz or higher.

However, the longer the FFT, the longer the span of time the spectral measurements are made over. A 1024-long FFT gives a single amplitude and phase measurement for each chunk about every 23 ms at 44.1 kHz (less with overlapping frames). This means that some temporal detail can get smeared. To minimize this effect, you should try to use the shortest FFT length that satisfies the previous paragraph.
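A quick sketch of the tradeoff at a 44.1 kHz sample rate: doubling the FFT length halves the width of each frequency chunk but doubles the time span each measurement is averaged over.

```python
fs = 44100
for nfft in (256, 512, 1024, 2048):
    bin_hz = fs / nfft                 # width of each spectral chunk
    frame_ms = 1000.0 * nfft / fs      # duration of one analysis frame
    print(f"{nfft:5d}: {bin_hz:6.1f} Hz per bin, {frame_ms:5.1f} ms per frame")
```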


tuscland
Member
posted 19 May 2005 17:20

StreamOfFFTs.kym

 
Hi,

I come back with a practical demonstration of your reply.
Anyone, please feel free to tell me if I am wrong here.

In the attached kym file, there are 2 sounds:
- The first sound, named StreamOfFFTs, is an FFT->windowing->iFFT process, multiplexed with a script 16 times (I don't have enough DSP power to try more). The FFTs and iFFTs have a size of 16. Each of the 16 instances is started with an offset of 1 sample. The result is a high-pass filtered sound at 44100/[count of instances] (in this case 44100/16 ~= 2756 Hz).
- The second sound has the same layout but with only one instance of the FFT->windowing->iFFT process, which also has a size of 16.

You can compare the sound quality of the two sounds. Where the first sounds natural, the second one shows the temporal-aliasing effect, which decreases as the FFT size gets higher.


Camille


robertjarvis
Member
posted 20 May 2005 09:53
quote:
Originally posted by tuscland:
...the first sounds natural...

It only sounds 'natural' to me when I have GenericSource set to 'Disk'; when set to 'RAM' it sounds awful (sorry, unnatural).

Why do you think this is?

Robert


tuscland
Member
posted 20 May 2005 11:07
quote:
Originally posted by robertjarvis:
It only sounds 'natural' to me when I have GenericSource set to 'Disk'; when set to 'RAM' it sounds awful (sorry, unnatural).

Why do you think this is?

Robert


Hi Robert,

I have the same problem.
I only tested my sounds with a Live input from inputs 1 & 2, with iTunes playing in the background. Using RAM is like downsampling the signal, very strange.

I am trying to understand why ...

Camille


rafe
Member
posted 20 May 2005 18:32
One reason for the 'poor' quality of the sound is the number of detectors being used. I tried your sound and increased the number of detectors by several steps, and each time I did this the aliasing frequency dropped by an octave and the sound quality improved.

This is possibly because this prototype is an FFT, which is an algorithm designed to 'trim' the computations for realizing the Fourier theorem, and there is increased 'error' in approximating the frequency content of the sample (I found this consistent whether I used live, RAM, or disk samples). This error may be a result of the frequency of the source not being the same as the 'known' frequency that is used as the comparison frequency, hence some samples may not sound as bad as others depending on their natural frequency (I also managed to improve the quality of the sound by putting !frequency in that field of the sample/generic source and changing it within a small range, less than an octave).

I say this may be part of the reason because I am not positive this is the method implemented in this prototype, so I am speculating based on what I have read about implementing the FFT and Fourier theorem from outside sources. Perhaps some of the more experienced Kyma users could give a clearer description.

I often struggle to get as close as possible to the original source with a spectral representation of a sample, and these strategies have helped me. Clearly the FFT prototype is not nearly as accurate as the spectral analysis tool. If you could devise a strategy using that tool you would likely get much better results.


rafe
Member
posted 20 May 2005 22:54
I realized that I am not entirely clear on what you're attempting to do. What I thought were alias frequencies were actually a result of the freq field of the 'window' module.


tuscland
Member
posted 21 May 2005 06:51
Actually, it is probably not clear how I wanted to do it either!
What I wanted to do is show how windowing affects sound quality, and how to avoid temporal aliasing by implementing what was suggested by Kurt (or Carla?) in the reply to my questions.

If you compare both sounds you hear that there is a clear difference in sound quality.
The second sound has only one FFT circuit, which means that the input sound is processed in chunks the size of the FFT, one at a time.

If you look at the script, there are as many FFT circuits as there are samples in those FFTs, except that each FFT circuit is started with an offset of one sample. This results in a continuous stream of FFTs, not just one FFT at a time being windowed independently.

For a better example, one should put 512 in the length variable in the Script module, thus allowing more bandwidth (the cutoff frequency being at 44100/512 ~= 86 Hz). My Capy can't run more than 16 FFT circuits (!).

I found that using an AudioInput was more DSP-hungry than using a GenericSource. If someone has an explanation, I am curious to know why.

I fear the difference between a sound being cached in RAM and played from disk is a problem. I don't see a reason why a sound should be computed differently depending on whether it is consumed from RAM or from disk.

On the other hand, I wonder if my implementation of the algorithm is correct. I saw that the FFT should be used together with delay lines, but I think this is because in the examples the signals are monophonic.


Camille


pete
Member
posted 22 May 2005 15:07
Hi Camille

"On the other hand, I wonder if my implementation of the algorithm is correct. I saw that the FFT should be used together with delay lines, but I think this is because in the examples the signals are monophonic."

I see you have a window between the two FFTs which is windowing the frequency domain, i.e. filtering out the high and low frequencies in the band covered by the FFT. This band is only approx. 3 kHz up to half the sample rate, as the FFTs are only 16 samples wide. Most of the lower-frequency signal gets through via the DC component and is not much affected by the FFT.

I don't know if this is what you meant to do.

The way the FFT module is supposed to work is to feed two overlapping windows into the first FFT's input. As you can't put a pair of overlapping windows into a single input, it makes use of the right-hand stereo input for the second one. As the two halves of the FFT start the frame at the same time, we have to delay one half by half a frame on the way in and delay the other half on the way out (after the last FFT) to line it up again.

If you were to do the test with just one FFT pair (overlapping every half frame as the FFT module intended), the output sound would still be good and wouldn't show any artifacts. It's only when you do something in the middle to try to transform it in some way that we get the artifacts. Only once you've got artifacts can you try to get rid of them. You can then add more overlapping FFTs to try to reduce the time smear and, in turn, see if the artifacts reduce.

I think these examples are showing more the artifacts of having no window or a square window, and that by having 16 staggered square windows the sharp edge is not so noticeable.

Also, I'm not entirely sure, but I think you will get different results with live input and sample input, as the script is making multiple copies of the input with staggered start-up times. Samples will be staggered, but obviously live inputs won't be.

I'm not in a state where I can fully try these out, so I've probably made some errors here, but I hope it gives you something to think about.

Pete


tuscland
Member
posted 23 May 2005 13:05
Hi Peter,


quote:
Originally posted by pete:
I see you have a window between the two FFTs which is windowing the frequency domain, i.e. filtering out the high and low frequencies in the band covered by the FFT. This band is only approx. 3 kHz up to half the sample rate, as the FFTs are only 16 samples wide. Most of the lower-frequency signal gets through via the DC component and is not much affected by the FFT.

I don't know if this is what you ment to do.


Actually no, but hopefully, with your clarifications, I now understand more clearly!

quote:
The way the FFT module is supposed to work is to feed two overlapping windows into the first FFT's input. As you can't put a pair of overlapping windows into a single input, it makes use of the right-hand stereo input for the second one. As the two halves of the FFT start the frame at the same time, we have to delay one half by half a frame on the way in and delay the other half on the way out (after the last FFT) to line it up again.

That is what I missed there.
First, I thought that the FFT prototype was stereo.
I also didn't understand that the window (shown in the "FFT 512 samp windowed" example), which is an Oscillator, actually has a two-track output into the FFT, or is duplicated. This seems to be the case for Oscillators, which are duplicated when a stereo signal is pasted into the Envelope field. There must exist a more general rule, but I don't know it. There is an implicit representation with one straight line, which confused me, unlike stereo (or multi-input/output) signals, which have multiple lines connecting the prototypes.

quote:
If you were to do the test with just one FFT pair (overlapping every half frame as the FFT module intended), the output sound would still be good and wouldn't show any artifacts. It's only when you do something in the middle to try to transform it in some way that we get the artifacts. Only once you've got artifacts can you try to get rid of them. You can then add more overlapping FFTs to try to reduce the time smear and, in turn, see if the artifacts reduce.

I have constructed a simple windowed FFT/iFFT pair, and in the middle I have put a XenOscillator in order to build some kind of EQ. (I think this is how the EQ worked in early versions of WinAmp, because it is easy to manipulate the FFT of the signal in an MP3 file.)

For some equalisation curves I can hear the aliasing, and yes, the choice of a particular window affects the results. I found that for low-frequency signals the Hann window gave the best results.

quote:
I think these examples are showing more the artifacts of having no window or a square window, and that by having 16 staggered square windows the sharp edge is not so noticeable.

Also, I'm not entirely sure, but I think you will get different results with live input and sample input, as the script is making multiple copies of the input with staggered start-up times. Samples will be staggered, but obviously live inputs won't be.


Yes, and no.
You get different results when you set the input to RAM versus [Disk or Live]; the RAM setting shows strange artifacts, and I don't understand why. If it is because there are multiple copies of the input, then how do we know when the input is duplicated?

I would also like to know how to build a sound with a script that does not copy the whole input, but only the specific branch where the script is attached.

quote:
I'm not in a state where I can fully try these out, so I've probably made some errors here, but I hope it gives you something to think about.

Thank you very much for your input, it really helped me to learn a bit more about Kyma.


Camille

[This message has been edited by tuscland (edited 23 May 2005).]


rafe
Member
posted 27 May 2005 18:19
Is there a reason why these artifacts would be more apparent with a sound incorporating spectral modifications when used in the timeline vs. from the editor?

After studying this thread I decided I would try spectral modifications from AIFF sample files. I have always used the spectral analysis tool in the past. I used the spectral deranger prototype with minor changes. All worked as expected, but when I dragged it into the timeline, the gurgling buzz that is familiar (from making spc files with the analysis tool) was front and center in the sound while played in the timeline. I have checked for hot parameter variations, but this does not seem to account for it.

Thanks to all - this thread has been very educational.


All times are CT (US)


This forum is provided solely for the support and edification of the customers of Symbolic Sound Corporation.

