Kyma Forum
  Tips & Techniques
  Voice Processing and/or Synthesis

Author Topic:   Voice Processing and/or Synthesis
gustl
Member
posted 03 April 2014 06:03
I wonder if anyone has done something like this in Kyma:
- Modal synthesis / manipulation of voice quality / glottal pulses: changing a breathy voice into a harsh one etc.
- Modelling/manipulation of the supra-laryngeal vocal tract (mouth, nose)
- Modelling/manipulation of breathiness and/or the lungs
- Something like this: http://www.antarestech.com/products/detail.php?product=THROAT_Evo_14
- Making normal speech sound whispered or screamed etc.

Any help is highly appreciated

Thanks,
Gustl

MathisNitschke
Member
posted 05 April 2014 06:27
I did: http://mathis-nitschke.com/wp/auf-herz-und-nieren/

And in this old article I write a little about it (German):
https://www.dropbox.com/s/ar3iit9f863tcoo/Kyma_Artikel.pdf

Grüße,
- Mathis

MathisNitschke
Member
posted 05 April 2014 06:33
Just watched the little Antares Demo video, I'm quite underwhelmed I have to say.

SSC
Administrator
posted 05 April 2014 17:07
Scary! (the film excerpt, that is)

pete
Member
posted 05 April 2014 19:12
Hi Gustl

The THROAT demo sounds pretty much like a technique I demonstrated at KISS 2013, when I made Kurt's voice become different people. I did it by first boosting the high harmonics and attenuating the lower ones so that the formants became more even and the fundamental was not overpowering everything (which I called flattening). This is probably what they mean by neutralizing. I then split the spectrum into 3 sections, in the same way as you have been pulling the spectrum apart, and shifted the formant of each section separately, in different directions or by different amounts. This is similar to what you have been doing with the selective smoother, except using a formant shifter instead. I then un-flattened the spectrum to make the lower harmonics and fundamental loud again.

I was doing it with speech, but they are doing it with singing which in some ways is easier and gives you a smoother pitch movement which you can use as a dynamic controller to further adjust the formant control.

You can make a formant shifter in the wire by putting a spectrum scaler module into only the left leg and a matching one-frame delay into the right. Because the right leg does not go through the module, no frequency scaling happens; adjusting the frequency control shifts only the formant.

It looks like they are using 5 groups instead of 3 and are crossfading between them. Although they are probably bending the formant (squashing and stretching) over the spectrum.

This couldn't be done with the frequency scaler module, for the same reason that we couldn't use the delay-with-feedback module for variable independent partial smoothing (the formant control doesn't work at sample rate, just as the feedback control didn't). Instead you would have to build your own formant shifter out of component parts. This was what I did at KISS 2011, although I couldn't demonstrate the "De-Formant" sound (deform the formant, get it?) as I ran out of time. Some of us discussed this in detail afterwards.

Oddly I found that hard cuts actually gave more distinctive and natural results as the formants tended to retain their shape (band width) even though their edges were being cut off.

To get the breathiness, just add a tiny amount of noise to the right leg of the wire only, but first put the noise through a variable multi smoother (a delay with feedback that matches the number of partials, just like the normal spectral smoother). You can also control its level with one of your sample rate trapezium generators and a product module. This means you can inject different amounts of noise into different areas of the spectrum. This should give you a much better breath or growl sound than what we hear in the THROAT demo.

I don't know exactly how they are doing what they are doing (I suspect it's more on a cycle-by-cycle basis, which you can do if you're only processing singing), but it sounds pretty much like what the Kyma folk have been doing for some time.

Once again I hope this makes sense.

Pete

gustl
Member
posted 08 April 2014 02:36
Thanks for the article, Mathis, very interesting. Good voice design there, scary it is. Me too, I'm underwhelmed. I think we can do way better with Kyma!

Thank you so much, Pete! Let me know when I'm becoming a pain in the neck with all these questions. Flattening the spectrum seems like a good idea. I don't have the KISS13 sounds but I think I can work it out. I suppose you're multiplying the amplitudes with some kind of ramp (e.g. 1+1/256 to 2) to flatten and then multiplying with the inverse of the same ramp after processing, but I have to check in Kyma.

Seems like I can't find the "DeFormant" sound in the KISS11 files... Can you put it here or explain how it works?

What do you mean by hard cuts? Cutting the wire w/ pulse trains vs. trapezoids?

Thanks for the tips with the noise!

Again thanks for your time and effort

pete
Member
posted 10 April 2014 14:24
Hi Gustl

You can ask questions all you like. I'll answer if and when I can.

The DeFormant module was made before the Paca came out and used Pete's DSP modules, which only work with the Capy, so it was removed from the 2011 sounds.

The flattening was done by multiplying the AMPS with 1/256, then 2/256, then 3/256, up to 256/256 for the highest partial. This means that if you analyzed a sawtooth, after this process all the partials would be at equal level. Then to un-flatten you multiply with 1, then 1/2, then 1/3, then 1/4, up to 1/256. You then put the whole lot through a gain module set to 256 so that the Osc bank is fed with a reasonable level and nothing gets clipped on the way. I used spectrum from array to generate the multiplying curves.
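
The arithmetic of the flatten / un-flatten / gain chain can be checked outside Kyma. The following is only a sketch in plain Python, not a Kyma Sound: the function names are made up for illustration, and the amplitudes are assumed to arrive as a list of 256 values with the fundamental first.

```python
NUM_PARTIALS = 256  # assumed analysis size, matching the 256 partials in the post

def flatten(amps):
    """Scale partial k (1-based) by k/256, so a 1/k rolloff becomes level."""
    return [a * k / NUM_PARTIALS for k, a in enumerate(amps, start=1)]

def unflatten(amps):
    """Scale partial k (1-based) by 1/k, putting the original tilt back."""
    return [a / k for k, a in enumerate(amps, start=1)]

def restore_gain(amps, gain=NUM_PARTIALS):
    """Make up for the overall 1/256 left by flatten followed by unflatten."""
    return [a * gain for a in amps]

# Ideal sawtooth: the amplitude of partial k is 1/k.
saw = [1.0 / k for k in range(1, NUM_PARTIALS + 1)]
flat = flatten(saw)                         # every partial ends up at 1/256
round_trip = restore_gain(unflatten(flat))  # back to the sawtooth amplitudes
```

Since (k/256) * (1/k) = 1/256 for every partial, the single gain of 256 is what brings the level back. Any per-band processing would sit between `flatten` and `unflatten`.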

Re hard cuts and trapezoids.

There are three methods, the same three we talked about with the spectrum smoother.

FIRST. Hard cuts. We had a number of copies of the same process, each with different preset values, and used pulse trains to select which process was sent to the osc bank for each block of partials. An important note here is that all the process inputs are fed with the whole spectrum (an identical signal); only the outputs were selectively routed to the Osc bank.

SECOND. Trapezoidal cross fade. The same as hard cuts, with a number of processes at different settings, but we cross-faded between them so that partials on the border of a cut got differing amounts of processing from the two neighbouring processes.

THIRD. Sample rate controlled with trapezoid. In this case there was only one process, but its values were being changed at sample rate so that each partial could have its own effect. It made sense to use a trapezoid to control the value so that partials next to each other would have similar values.

The first and the second didn't need sample rate control of their parameters, as these stayed the same during a whole frame, but the third did need sample rate control as its value varied from sample to sample during the frame.
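
The three routings can be compared on a toy spectrum. This is a hedged sketch in plain Python rather than a Kyma patch: the helper names (`hard_masks`, `trapezoid_masks`, `route`), the band edges and the stand-in "process" (a plain gain with a preset value) are all assumptions chosen to keep the example short.

```python
def hard_masks(edges, n):
    """One 0/1 mask per band (the pulse trains); edges are band start indices."""
    bounds = [0] + list(edges) + [n]
    return [[1.0 if lo <= k < hi else 0.0 for k in range(n)]
            for lo, hi in zip(bounds, bounds[1:])]

def trapezoid_masks(edges, n, fade):
    """Same bands, but each edge is a linear crossfade `fade` partials wide."""
    def rise(k, edge):  # 0 below the fade region, 1 above it
        return min(1.0, max(0.0, (k - edge) / fade + 0.5))
    bounds = [None] + list(edges) + [None]
    return [[(1.0 if lo is None else rise(k, lo)) *
             (1.0 if hi is None else 1.0 - rise(k, hi)) for k in range(n)]
            for lo, hi in zip(bounds, bounds[1:])]

def route(processed, masks):
    """Mix the per-process outputs into one spectrum, partial by partial."""
    return [sum(m[k] * p[k] for m, p in zip(masks, processed))
            for k in range(len(masks[0]))]

n = 16
amps = [1.0] * n
presets = [1.0, 0.5, 0.25]  # one preset value per process
# FIRST and SECOND: every process is fed the whole spectrum.
processed = [[a * g for a in amps] for g in presets]
hard_out = route(processed, hard_masks([5, 11], n))
fade_out = route(processed, trapezoid_masks([5, 11], n, fade=4))
# THIRD: a single process whose parameter changes from partial to partial.
gains = route([[g] * n for g in presets], trapezoid_masks([5, 11], n, fade=4))
per_partial_out = [a * g for a, g in zip(amps, gains)]
```

With a plain gain as the process, the SECOND and THIRD methods give the same result, because crossfading outputs and crossfading the parameter are equivalent for a linear process. For something like smoothing or formant shifting they would differ, which is where the choice of method matters.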

If you want to experiment with formant shifting then put a spectrum frequency scaler module in the left leg of the wire and give it a live hot parameter for frequency and that will adjust the formant.

If you want to see a hand made formant shifter that can be controlled at sample rate, then look in the 2011 sounds for the spectrum stable morph. There are two of them in there and at the end of the talk I tried to explain how they worked. It will probably need more explanation, but have a look at it first and see if you can work out what's going on, then ask me what isn't making sense and I'll try to explain it in a better way.


Hope this makes sense.

Pete

gustl
Member
posted 10 April 2014 16:22
Thanks Pete
I already managed to do the flattening/unflattening. I did it almost exactly like you said. To compensate the gain I did it in 2 stages (1 after flattening, 1 after unflattening) to avoid clipping. Everything works fine.
The last few days I worked on a hand-made formant shifter using a memory writer & waveshaper combination controlled by XenOscs, to be able to shift, stretch and compress the amplitudes. As you made clear, the stretching/compressing is only possible with the THIRD option.
I'd like to add a hybrid FOURTH: sample rate controlled trapezoid combined with FIRST or SECOND. The way I want to do it is to stretch/compress the amplitudes (a number of copies of the same process) and choose the part of the spectrum for each process. The XenOscs still give me a bit of a headache but I think I can work it out next week. In the end I want to achieve a highly flexible spectral tool to shift, stretch, compress, scale, offset, smooth... amps & pitches in individual spectral bands.
Let's see what happens

pete
Member
posted 10 April 2014 17:59
Wow Gustl

That'll be one hell of a module: very difficult to control, but with great power once you learn how to use it.

I'm impressed that you've managed to make the hand-made formant shifter. Don't forget it can add frame delays that need to be compensated for in the other leg. Also, did you make the memory writer two frames in length (alternating the read and write frames) so that you are always processing the last completed frame and not trying to read values that haven't been written yet?
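
The two-frame memory is essentially a double buffer. As a rough sketch in plain Python (the class name and methods are hypothetical, and nothing here models how Kyma's memory writer actually behaves), alternating the banks guarantees that every read comes from one completed frame:

```python
class DoubleFrameBuffer:
    """Two banks of one frame each: one is written while the other is read."""

    def __init__(self, frame_len):
        self.frame_len = frame_len
        self.banks = [[0.0] * frame_len, [0.0] * frame_len]
        self.write_bank = 0  # bank currently being filled
        self.pos = 0         # next write position within the frame

    def write(self, value):
        """Store the next sample of the incoming frame; swap banks when full."""
        self.banks[self.write_bank][self.pos] = value
        self.pos += 1
        if self.pos == self.frame_len:
            self.pos = 0
            self.write_bank ^= 1

    def read(self, index):
        """Read any partial, in any order, from the last completed frame."""
        return self.banks[self.write_bank ^ 1][index]

buf = DoubleFrameBuffer(4)
for v in [1.0, 2.0, 3.0, 4.0]:  # first frame, written completely
    buf.write(v)
buf.write(10.0)                 # second frame has only just started
buf.write(20.0)
snapshot = [buf.read(i) for i in range(4)]  # still entirely the first frame
```

The cost is one extra frame of delay, which is the delay that has to be matched in the other leg.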

I find it is worth making your controls such that they start from having no effect at a known value 0 or 1 so that you can always get back to the original sound and play with one feature at a time before you combine them.

One thing I didn't add was breath. In the demo above they seemed to add some electronic-sounding noisy thing that didn't really sound breath-like. The thing is that breath goes through a formant shape similar to the one the voice goes through. To make a better breath you could mix a tiny amount of noise into the frequency leg and/or the amp leg. Make it more pronounced at the higher partials and lower its level at the lower partials. You could even use the flattened version and bypass the un-flattener for the noisy part only (mixed with the un-flattened normal voice). Breath tends to follow the formant shape of the flattened version, as there is no glottal pulse to prejudice the lower partials.
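
A minimal sketch of the noise weighting, in plain Python and only as an illustration: `add_breath`, its `depth` and `tilt` parameters and the frequency values are all hypothetical, and the smoothing of the noise mentioned earlier in the thread is left out.

```python
import random

def add_breath(freqs, depth, tilt=1.0, seed=0):
    """Add noise to the frequency leg, weighted towards the higher partials."""
    rng = random.Random(seed)  # seeded so the example is repeatable
    n = len(freqs)
    out = []
    for k, f in enumerate(freqs):
        weight = ((k + 1) / n) ** tilt  # 1/n at the lowest partial, 1 at the top
        out.append(f + depth * weight * rng.uniform(-1.0, 1.0))
    return out

freqs = [100.0 * (k + 1) for k in range(8)]  # toy harmonic series
dry = add_breath(freqs, depth=0.0)           # depth 0 leaves the voice untouched
breathy = add_breath(freqs, depth=2.0)       # a small deviation, largest at the top
```

Keeping `depth` at 0 as the "no effect" setting follows the advice above about controls that can always get back to the original sound.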

I sure wish you luck with this one.

Pete

gustl
Member
posted 11 April 2014 05:16
Yes, a real beast

Thanks, it took me some time. After sitting for hours staring at my computer saying to myself: I need something to read out the amplitudes frame by frame, read out, read out, frame by frame, read, read, read... finally the waveshaper came to my mind.
I just write 1 frame and read that out with a waveshaper controlled by a fullramp. No problem with the fullramp. But I see where you're heading: if I try to read partial 255 first, it will read out partial 255 of the previous frame. Good point, I have to work that out too.

I certainly do so. It should always be possible to go back to the original. Also, I try to keep all hotValues at 0,1 or -1,1 for easier encapsulating. I think encapsulating would be nice when the whole thing is finished, because the whole sound will get very big and confusing.

The thing about breath is really helpful; using the flattener and bypassing the unflattener certainly makes sense. Or maybe a separate syntheticSpectrumFromArray with some hotValues to shape the noise. Also, injecting different amounts of noise into different areas of the spectrum is a very good idea.

Thank you so much for your help! I'll certainly treat you to a beer (or 2) at KISS14

Best Wishes,
Gustl

SSC
Administrator
posted 11 April 2014 16:00
quote:
The flattening was done by multiplying the AMPS with 1/256, then 2/256, then 3/256, up to 256/256 for the highest partial. This means that if you analyzed a sawtooth, after this process all the partials would be at equal level. Then to un-flatten you multiply with 1, then 1/2, then 1/3, then 1/4, up to 1/256. You then put the whole lot through a gain module set to 256 so that the Osc bank is fed with a reasonable level and nothing gets clipped on the way. I used spectrum from array to generate the multiplying curves.

As a possible alternative implementation, maybe you could use the SpectrumModifier and set the AmpScale field to:
TrackNumber / 256

Then the unflattener would be another SpectrumModifier with the AmpScale field set to:
1 / TrackNumber

For the other processing you're doing, you may still need to split the spectrum, but if you just want to do the flattening, this could be an alternative that doesn't require encapsulation.

gustl
Member
posted 15 April 2014 13:59
Thanks SSC, sometimes I should think "inside the box" again

I've got everything working now: flattening/unflattening, formant shifting/compressing/stretching, alternating read/write frames. So far so good. When experimenting with the ramp for the waveshaper I noticed the following, and please correct me if I'm wrong:
There are three ways to treat the ramp: shifting, stretching/compressing, and bending. The ramp basically determines the mapping of the partials. So if I shift the ramp 1 sample to the left, partial 1 (the fundamental) gets the value of partial 2, partial 2 the value of partial 3, and so on. (BTW, in all cases I make sure not to get wrapped-around partials (partial 256 getting the value of partial 1) or duplicated partials (partial 255 getting the value of partial 256, and partial 256 too) by cutting these parts out of the spectral signal.) This kind of shifting I'd call amp shifting (or freq shifting for the right leg).

This is not the same thing as using the spectrumFrequencyScale which is about formant and pitch shifting. From my experiments I derived that stretching/compressing (adjusting the steepness) of the ramp is the same as using the spectrumFrequencyScale module. Which makes sense as you get the formants shifted down when you compress the ramp because for example partials 1-156 get the (interpolated) values of partials 1-256.
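
The difference between shifting and stretching the ramp shows up clearly on a toy spectrum with two peaks. The sketch below is plain Python, not the memory writer and waveshaper patch itself: `read_with_ramp` is a hypothetical stand-in for the waveshaper lookup, positions are 0-based indices into the frame, and out-of-range reads are simply set to zero (one way of cutting those parts out).

```python
def read_with_ramp(amps, ramp):
    """Output partial k takes the (interpolated) value found at ramp[k] in amps."""
    last = len(amps) - 1
    out = []
    for pos in ramp:
        if pos < 0 or pos > last:  # cut out reads that fall outside the frame
            out.append(0.0)
            continue
        i = int(pos)
        frac = pos - i
        nxt = amps[min(i + 1, last)]
        out.append(amps[i] * (1.0 - frac) + nxt * frac)
    return out

n = 8
amps = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0]  # toy "formants" at 2 and 6
shifted = read_with_ramp(amps, [k + 1.0 for k in range(n)])    # ramp moved by one
stretched = read_with_ramp(amps, [k * 2.0 for k in range(n)])  # ramp twice as steep
# shifted   -> peaks at 1 and 5: both move by one partial, spacing unchanged
# stretched -> peaks at 1 and 3: positions are halved, so the spacing shrinks too
```

Shifting moves every peak by the same number of partials, while changing the steepness scales the peak positions, which is why only the second behaves like a formant shift.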

Now for the bending: is there an easy way to obtain a bent (convex, concave) ramp from the XenOsc? Or from the ramp generated by the XenOsc? I was thinking about working with arrays in the XenOsc, but before I start I want to know if there is an easier way...

Thanks!

pete
Member
posted 15 April 2014 17:49
Hi Gustl

You can think of it this way. If you use the ramp as the base, then as you have said, formant shifting is changing the gradient, i.e. if you multiply it by one there is no shift, if you multiply it by 1.5 the formant shifts down, and if you multiply it by 0.75 the formant shifts up. The problem is that the multiplier (the product) cannot multiply by more than one, so if we put a gain module of x4 on the output and multiply by a DC constant of 0.25 we get our starting ramp.

Now you can add to or subtract from that 0.25 with your frame-synced trapeziums to form the formant stretch, compress or deformations within regions. Or you can add stepped waveforms so that chunks of formant are shifted.

But I think the ramp -> product -> gain with 0.25 constant feeding a mixer with your control signal which then feeds the product is probably the way to go.
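
The numbers in that chain can be worked through in a few lines. This is a sketch of the arithmetic only, in plain Python, under the assumption that the base ramp runs from 0 towards 1 over the frame; `mapping_ramp`, `headroom` and the control shapes are illustrative names and values, not Kyma parameters.

```python
def mapping_ramp(base_ramp, control, headroom=4.0):
    """ramp -> product -> gain: multiply by (0.25 + control), then by 4."""
    centre = 1.0 / headroom  # 0.25 for a x4 gain: the "no effect" constant
    # The product stage cannot exceed 1, so centre + control must stay <= 1,
    # which limits the overall gradient to at most `headroom`.
    return [headroom * (r * (centre + c)) for r, c in zip(base_ramp, control)]

n = 8
base = [k / n for k in range(n)]              # starting ramp, 0 up to 7/8
no_effect = mapping_ramp(base, [0.0] * n)     # control at 0: ramp unchanged
steeper = mapping_ramp(base, [0.125] * n)     # 4 * (0.25 + 0.125) = gradient 1.5
# A control that is 0 in the lower half and rises in the upper half bends the
# ramp, so only the upper part of the spectrum is deformed.
bend = [0.0, 0.0, 0.0, 0.0, 0.03125, 0.0625, 0.09375, 0.125]
bent = mapping_ramp(base, bend)
```

A constant control changes the gradient everywhere (a plain formant shift), while a control that varies across the frame gives the bending or regional deformation.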

Hope this makes sense.

Pete

gustl
Member
posted 16 April 2014 03:31
Thanks, Pete! Genius!
Bending the formants sounds even better (or more natural) than shifting. Now I "just" have to build the whole thing working in different bands of the spectrum

gustl
Member
posted 16 April 2014 05:38
Ok, more than 3 bands are pushing my Paca to the limit. That may be because of all the XenOscs I'm using for splitting the signal and generating the ramps for the waveshapers. Using pulse trains and ramp functions might leave me with some extra power, but they aren't as flexible as XenOscs. On the other hand, only 3 bands are quite difficult to control. Anyway, it sounds really good now, although it is far from finished.
BTW bending the pitches is also fun!

MathisNitschke
Member
posted 19 April 2014 06:54
I'd love to hear something!

All times are CT (US)

This forum is provided solely for the support and edification of the customers of Symbolic Sound Corporation.

