Re: Stream FFT and iFFT

Posted: Wed Jun 19, 2013 7:08 pm
by trogluddite
Cheers Tronic - that looks very useful. :D

Re: Stream FFT and iFFT

Posted: Wed Jun 19, 2013 7:16 pm
by MyCo
I had a go at the code. After I descrambled it :mrgreen:

I just replaced the inner loop with real SSE code. It basically calculates the real and the imaginary parts simultaneously. This cut the CPU usage down to 1/3!
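To illustrate the idea of handling the real and imaginary parts in parallel, here is a minimal sketch in Python that emulates a two-lane register as a list `[re, im]`. It is not the SSE code from the schematic; the helper names (`lanes_mul`, `complex_mul_lanes`, `butterfly`) are made up for this example, and the lane layout is only an assumption about how such an inner loop might be arranged.

```python
# Emulates a 2-lane SIMD register as a Python list [re, im].
# All names here are hypothetical; this is not the schematic's SSE code.

def lanes_mul(a, b):
    """Lane-wise multiply, like a packed multiply instruction."""
    return [a[0] * b[0], a[1] * b[1]]

def lanes_add(a, b):
    """Lane-wise add."""
    return [a[0] + b[0], a[1] + b[1]]

def lanes_sub(a, b):
    """Lane-wise subtract."""
    return [a[0] - b[0], a[1] - b[1]]

def complex_mul_lanes(x, w):
    """Compute (xr + j*xi) * (wr + j*wi); both result parts come out together."""
    xi_xr = [x[1], x[0]]      # shuffle: swap the two lanes of x
    wr_wr = [w[0], w[0]]      # broadcast the real part of the twiddle
    nwi_wi = [-w[1], w[1]]    # imaginary part of the twiddle, sign-flipped in lane 0
    # lane 0: xr*wr - xi*wi   lane 1: xi*wr + xr*wi
    return lanes_add(lanes_mul(x, wr_wr), lanes_mul(xi_xr, nwi_wi))

def butterfly(a, b, w):
    """Radix-2 butterfly: returns (a + b*w, a - b*w) as [re, im] pairs."""
    t = complex_mul_lanes(b, w)
    return lanes_add(a, t), lanes_sub(a, t)
```

Each lane-wise helper stands in for one packed instruction, so the whole complex multiply takes two multiplies and one add on full registers instead of four scalar multiplies and two scalar adds.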

I guess this can be improved a lot more. In particular, the bit-reordering part can be reduced. Since it is a fixed-size FFT, you could save the reordering pattern into an array in stage 0, and when a sample comes in, put it in the right spot using this array.
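The array idea could look roughly like the following Python sketch. It assumes a power-of-two FFT size; `make_bitrev_table` and `scatter` are hypothetical names for illustration, not anything from the schematic.

```python
def make_bitrev_table(n):
    """Return a list where table[i] is i with its log2(n) bits reversed."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    bits = n.bit_length() - 1
    table = [0] * n
    for i in range(1, n):
        # Reuse the entry for i >> 1: shift it down one bit, then move
        # the lowest bit of i up to the top bit position.
        table[i] = (table[i >> 1] >> 1) | ((i & 1) << (bits - 1))
    return table

def scatter(samples, table):
    """Write each incoming sample to its bit-reversed slot."""
    buffer = [0.0] * len(table)
    for k, sample in enumerate(samples):
        buffer[table[k]] = sample
    return buffer
```

The table is built once (the "stage 0" part), so the per-sample cost is a single indexed write.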

Re: Stream FFT and iFFT

Posted: Wed Jun 19, 2013 11:15 pm
by MyCo
I've done a little bit more cleaning. The main FFT processing part is now ready for variable FFT sizes.

Re: Stream FFT and iFFT

Posted: Thu Jun 20, 2013 3:56 am
by MyCo
Done!

This is now a flexible version with no hard-coded FFT size anymore... although I haven't tested it yet. For absolute flexibility, we have to move some code from stage0 into stage1; I haven't done this yet.
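For reference, a size-independent radix-2 FFT can be sketched in a few lines of Python. This is a generic textbook-style iterative version, not a transcription of the schematic; `fft_radix2` and `reverse_bits` are names chosen for this example, and it assumes the input length is a power of two.

```python
import cmath

def reverse_bits(i, bits):
    """Reverse the lowest `bits` bits of i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def fft_radix2(x):
    """Iterative decimation-in-time FFT; the size is taken from len(x)."""
    n = len(x)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    bits = n.bit_length() - 1
    # Bit-reversed copy of the input.
    out = [0j] * n
    for i in range(n):
        out[reverse_bits(i, bits)] = complex(x[i])
    # log2(n) butterfly stages; nothing below depends on a fixed size.
    half = 1
    while half < n:
        for start in range(0, n, 2 * half):
            for k in range(half):
                w = cmath.exp(-1j * cmath.pi * k / half)  # twiddle factor
                a = out[start + k]
                b = out[start + k + half] * w
                out[start + k] = a + b
                out[start + k + half] = a - b
        half *= 2
    return out
```

The number of stages and the twiddle spacing both fall out of `len(x)`, which is the property a variable-size version needs.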

I've also changed the bit reversal method. Instead of rearranging the whole buffer once, it now places each sample as it comes in. This doesn't make a huge difference with large audio buffer sizes, but when the buffer is smaller than the FFT size, this should give a performance boost.
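The two approaches give the same reordered frame; they only differ in when the work happens. A rough Python comparison, with hypothetical names (`reorder_one_shot`, `StreamingReorder`) and assuming a power-of-two frame size:

```python
def reverse_bits(i, bits):
    """Reverse the lowest `bits` bits of i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def reorder_one_shot(buf):
    """Reorder a complete frame in place with one pass of swaps."""
    bits = len(buf).bit_length() - 1
    for i in range(len(buf)):
        j = reverse_bits(i, bits)
        if i < j:  # swap each pair only once
            buf[i], buf[j] = buf[j], buf[i]
    return buf

class StreamingReorder:
    """Writes every incoming sample straight to its bit-reversed slot."""

    def __init__(self, n):
        self.bits = n.bit_length() - 1
        self.buf = [0.0] * n
        self.pos = 0

    def push(self, sample):
        """Store one sample; returns True when a full frame is ready."""
        self.buf[reverse_bits(self.pos, self.bits)] = sample
        self.pos = (self.pos + 1) % len(self.buf)
        return self.pos == 0
```

With the streaming version the reordering cost is spread evenly over all samples, so no single audio buffer has to pay for the whole pass.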

I've removed the sine table creation stuff. Instead, it uses the wavetable that FS already has. This saves a little bit of memory... the performance isn't better, though.
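Reading twiddle factors from one shared sine table can be sketched like this in Python. The layout of the FS wavetable is not described in this thread, so the sketch simply assumes a table holding exactly one sine period, with a length that is a multiple of 4 and of the FFT size; `make_sine_table` and `twiddle` are hypothetical names.

```python
import math

def make_sine_table(size):
    """Stand-in for an existing wavetable: one full sine period (assumed layout)."""
    return [math.sin(2 * math.pi * i / size) for i in range(size)]

def twiddle(table, k, n):
    """Return (re, im) of exp(-2j*pi*k/n) using only lookups into the sine table."""
    size = len(table)
    if size % 4 or size % n:
        raise ValueError("table length must be a multiple of 4 and of n")
    idx = (k * (size // n)) % size
    sin_val = table[idx]
    # cos(x) = sin(x + pi/2), i.e. a quarter-period offset into the same table
    cos_val = table[(idx + size // 4) % size]
    return cos_val, -sin_val
```

Because the cosine is read from the same table at a quarter-period offset, no separate cosine table is needed, which is where the memory saving comes from.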

Re: Stream FFT and iFFT

Posted: Thu Jun 20, 2013 8:49 am
by trogluddite
Wow, MyCo, you really hit the turbo fuel-injection there - even my little Atom netbook now runs it at sensible CPU loads!
Fantastic work, many thanks! :D

Re: Stream FFT and iFFT

Posted: Thu Jun 20, 2013 3:40 pm
by MyCo
Would be cool to benchmark this code against pure C/C++ code. It's surely not the fastest code around, but I think it beats pure C/C++ versions.

Re: Stream FFT and iFFT

Posted: Thu Jun 20, 2013 7:04 pm
by tester
v3 is around 2x slower than v2, is this correct? (That's what it looks like on both of my C2Ds.)

Re: Stream FFT and iFFT

Posted: Thu Jun 20, 2013 7:16 pm
by trogluddite
About 50% more for v2 on my systems - that tallies quite well with the results I got when I traded "one shot" for "continuous" bit-reversal, though that could just be a coincidence. A lot will depend on CPU cache size/performance, so results will likely vary quite a bit between systems. Still WAY faster than my original port, though!

Re: Stream FFT and iFFT

Posted: Thu Jun 20, 2013 10:18 pm
by digitalwhitebyte
Great job, guys.
A stable version of MEMin and MEMout in ASM code is coming.
I finally found a way to do it without using any hidden primitives.
I'll post it as soon as it's ready. I'm still working on it and testing it, to make it easy to understand and use.

Re: Stream FFT and iFFT

Posted: Fri Jun 21, 2013 1:13 am
by MyCo
trogluddite wrote:About 50% more for v2 on my systems - that tallies quite well with the results I got when I traded "one shot" for "continuous" bit-reversal, though that could just be a coincidence.


In v3 the output is connected to DirectSound; in v2 it is not. Even without disconnecting it, v3 is faster here. I'm on an AMD, though.