### La Voz Cantante: a 512 Channel Vocoder

Posted:

**Sun Dec 11, 2016 4:01 pm**

Hey gang,

I thought I'd post this FFT Channel Vocoder that I made. Here is an explanation of the basic principle:

A vocoder has two inputs: one is the carrier (usually a sound source with rich spectral content, like a saw) and the other is the modulator (usually a human voice; it can be sung or just spoken, it does not matter much). Well, mine has these two plus a MIDI input as an option to use an internal synth as the carrier. At the core is an FFT engine which is quite heavily optimized, so it is somewhat difficult to see the structure. Here is what it does:

Every 512 samples:

1. Calculate FIR coefficients for the vowel filter:

a) apply a windowed FFT to the modulator

b) compute modulus of 1a)

c) inverse FFT the result

d) shift by 1/4 FFT size to minimize boundary effects

2. Apply FIR filter to carrier (by linear convolution):

a) apply FFT to the carrier:

- use rectangular window for 1st half frame

- zero pad 2nd half frame to avoid time aliasing

b) apply windowed FFT to the FIR coefficients from 1d):

- use Kaiser window for 1st half frame

- zero pad 2nd half frame

c) complex multiply 2b)*2a) in the Fourier domain

d) inverse FFT the result

e) overlap-add with last frame in the time domain
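The steps above can be sketched in NumPy roughly as follows. This is only an illustration, not the plugin's actual (heavily optimized) code. The FFT size of 1024, the Hann analysis window, the Kaiser `beta` of 8, and the names `vowel_fir` and `vocode` are all assumptions made for the example; the description above only fixes the 512-sample frame advance.

```python
import numpy as np

N = 1024       # FFT size (assumption: twice the 512-sample frame advance)
HOP = N // 2   # 512 new samples per frame

def vowel_fir(mod_frame, beta=8.0):
    """Steps 1a-1d plus the Kaiser window of 2b: FIR taps from the modulator."""
    win = np.hanning(N)                         # 1a: analysis window (assumed Hann)
    mag = np.abs(np.fft.fft(mod_frame * win))   # 1b: modulus, phase discarded
    h = np.real(np.fft.ifft(mag))               # 1c: zero-phase impulse response
    h = np.roll(h, N // 4)                      # 1d: move the peak to N/4
    return h[:HOP] * np.kaiser(HOP, beta)       # 2b: Kaiser window on 1st half frame

def vocode(carrier, modulator, beta=8.0):
    """Filter the carrier with the time-varying vowel filter (step 2)."""
    n = min(len(carrier), len(modulator)) // HOP
    # Prepend half a frame of silence so every modulator frame holds N samples.
    mod = np.concatenate([np.zeros(HOP), modulator[:n * HOP]])
    out = np.zeros((n + 1) * HOP)
    for k in range(n):
        h = vowel_fir(mod[k * HOP:k * HOP + N], beta)
        c = carrier[k * HOP:(k + 1) * HOP]      # 2a: rectangular window, 1st half
        C = np.fft.rfft(c, N)                   # 2a: zero pad 2nd half frame
        H = np.fft.rfft(h, N)                   # 2b: zero pad 2nd half frame
        y = np.fft.irfft(C * H, N)              # 2c, 2d: multiply and invert
        out[k * HOP:k * HOP + N] += y           # 2e: overlap-add
    return out
```

Because both the carrier block and the FIR taps occupy only the first 512 of 1024 points, their convolution (at most 1023 samples) fits inside one FFT frame, which is why the zero padding avoids time aliasing.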

Finally apply a spectral untilt operation (differentiator) to compensate for an assumed 1/f carrier spectrum falloff (e.g. saw or rectangle).
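As a rough sketch of the untilt step, a first-order difference is the simplest differentiator. Whether the plugin uses exactly this form is an assumption; the function name `untilt` and the `prev` parameter are made up for the example. Its magnitude response is 2·|sin(ω/2)|, which rises at about 6 dB per octave at low frequencies and so roughly cancels a 1/f falloff.

```python
import numpy as np

def untilt(x, prev=0.0):
    """First-order difference y[n] = x[n] - x[n-1].

    `prev` is the last sample of the previous block (hypothetical parameter),
    so that block-wise processing stays continuous across block boundaries.
    """
    return np.diff(x, prepend=prev)
```

When processing block by block, pass the final input sample of one block as `prev` for the next.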

Some basic effects are included to make the plugin self-contained.

There is more information, an mp3 demo and a PDF user manual over at my Web site:

http://vicanek.de/audioprocessing/lavozcantante.htm
