Stream FFT and iFFT

digitalwhitebyte
Posts: 106
Joined: Sat Jul 31, 2010 10:20 am

Re: Stream FFT and iFFT

Post by digitalwhitebyte »

I measured without the graph and the three de-serializers, with no output connected to an external driver.
ASIO driver, sample rate 96000
i7 920

Values from the internal CPU meter:
v1 ~8.9%
v2 ~8.3%
v3 ~11.7%

Values from the Resource Monitor tool in Windows 7 (Average Cycle):
v1 ~2.23
v2 ~2.45
v3 ~2.73
MyCo
Posts: 718
Joined: Tue Jul 13, 2010 12:33 pm
Location: Germany

Re: Stream FFT and iFFT

Post by MyCo »

That's really weird. Maybe instruction latency simply has more impact on Intel CPUs than on AMD ones. If so, it could be fixed by rearranging the code a little, without changing its function. I can't test this myself; maybe someone else can try it.

I've attached my latest version. It is now fully flexible: the FFT size can be changed while the whole thing is running. That required some minor changes to the FFT code (just rearrangement), but I had to change some of the other modules to make this work, too. I completely rebuilt the serializer so that it uses a double buffer (one is written to while the other is read from). Because of this, and because there is a different signal source for testing (3 Oscs), you can't compare the CPU usage with the previous versions.
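
For readers who have not opened the schematic, the double-buffer idea can be sketched roughly as follows. This is only an illustration in Python: the class and method names are hypothetical and are not taken from the .fsm file, and the real module runs as per-sample stream code.

```python
# Hypothetical sketch of a double-buffered ("ping-pong") serializer.
# Names and structure are illustrative only, not taken from the schematic.

class DoubleBuffer:
    def __init__(self, size):
        self.size = size
        self.buffers = [[0.0] * size, [0.0] * size]
        self.write_index = 0  # which of the two buffers is being filled
        self.pos = 0          # shared read/write position inside the buffers

    def process(self, sample):
        """Store one input sample and return one (delayed) output sample."""
        read_index = 1 - self.write_index
        out = self.buffers[read_index][self.pos]
        self.buffers[self.write_index][self.pos] = sample
        self.pos += 1
        if self.pos == self.size:
            # Block complete: the two buffers swap roles.
            # A block operation such as an FFT would be applied here.
            self.pos = 0
            self.write_index = read_index
        return out


buf = DoubleBuffer(4)
print([buf.process(float(n)) for n in range(1, 9)])
# prints [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 4.0]
```

The output lags the input by exactly one block, which is the price paid for never reading from a buffer that is still being written.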

On my machine I can go up to 32768 points without crashing or lagging, although there is a huge delay (~1.5 seconds @ 44kHz). The 32768-point maximum in the dropdown is also the maximum size of the buffers in the code, so don't go beyond that.
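
As a rough check on that delay figure, the arithmetic can be sketched like this. The factor of two blocks is an assumption (one block of delay while collecting input, one while playing the result back out); the post does not break the delay down, but the assumption happens to match the reported ~1.5 seconds.

```python
def buffer_delay_seconds(fft_size, sample_rate, buffered_blocks=2):
    """Delay caused by block buffering, in seconds.

    buffered_blocks=2 is an assumption, not something stated in the thread:
    one block to collect the input plus one block to play the output.
    """
    return buffered_blocks * fft_size / sample_rate


print(round(buffer_delay_seconds(32768, 44100), 2))  # prints 1.49
```

Under the same assumption, halving the FFT size halves the delay.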
Attachments
Stream FFT v5 (trogluddite, MyCo).fsm
(73.22 KiB) Downloaded 1646 times
tester
Posts: 1786
Joined: Wed Jan 18, 2012 10:52 pm
Location: Poland, internet

Re: Stream FFT and iFFT

Post by tester »

Great work.
Need to take a break? I have something right for you.
Feel free to donate. Thank you for your contribution.
MyCo
Posts: 718
Joined: Tue Jul 13, 2010 12:33 pm
Location: Germany

Re: Stream FFT and iFFT

Post by MyCo »

I found something weird. After changing almost all of the code in the project, I had to change the Integer counter (so that it outputs 2 signals). After that change I noticed a huge performance boost, and I can't explain why. My code is even longer than trog's, but on my machine it uses only 1/250 as much CPU.

I've attached a comparison; maybe someone can find the reason for the huge difference.
Attachments
Counter comparison.fsm
(25.46 KiB) Downloaded 1663 times
tester
Posts: 1786
Joined: Wed Jan 18, 2012 10:52 pm
Location: Poland, internet

Re: Stream FFT and iFFT

Post by tester »

B to A ~ 196 to 16 here (C2D).

I'm not an ASM geek, but here is my guess. Either Trog's example calculates something extra (directly or in the background), or some queue or value is waiting (silently or directly) until other operations are done. Or it uses some CPU element that, in this particular design, causes the slowdown.

So I would split it into conceptual blocks and test the blocks individually; that would tell me whether one of the blocks is responsible, or a combination of them.

Analyzer is fine (I switched outputs connected to it and reset the analyzer - note for those who didn't).
trogluddite
Posts: 1730
Joined: Fri Oct 22, 2010 12:46 am
Location: Yorkshire, UK

Re: Stream FFT and iFFT

Post by trogluddite »

Ha ha - because my code is WRONG, yet still WORKS!!

The culprit seems to be...

cvtps2dq xmm1,xmm1;
addps xmm1,current;

...so after converting xmm1 to integer, I'm doing a float add - D'oh :oops: :lol:

As both numbers are <23 bits in length (as integers), they will have exponent = 0 when treated as floats, and so are denormals, hence the huge CPU load when added as floats. The opcode should, of course, have been "paddd", what a dumb-ass!
But, since both numbers have exactly the same exponent of zero, as will the answer, the integer output still comes out correctly - as there was no 'bug' apparent in this tiny "utility" code, I never looked over the code again, and didn't see the stupid mistake!!
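
The "wrong opcode, right answer" effect can be reproduced outside FlowStone. The following is a minimal Python sketch, not the original assembly: the helper names are made up for illustration, it only checks small values, and Python adds in double precision before rounding back to single, which gives the same bits as a single-precision add for values this small. It shows the bit patterns only; the speed penalty depends on the CPU and is not demonstrated here.

```python
import struct


def int_bits_as_float(n):
    """Reinterpret the 32-bit pattern of integer n as an IEEE 754 single."""
    return struct.unpack("<f", struct.pack("<I", n))[0]


def float_bits_as_int(x):
    """Reinterpret the bit pattern of a single-precision float as an integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def float_add_of_int_bits(a, b):
    """Mimic one lane of a float add applied to two integer bit patterns."""
    return float_bits_as_int(int_bits_as_float(a) + int_bits_as_float(b))


SMALLEST_NORMAL = 2.0 ** -126  # anything nonzero below this is a denormal

a, b = 1000, 24
print(0.0 < int_bits_as_float(a) < SMALLEST_NORMAL)  # prints True (a denormal)
print(float_add_of_int_bits(a, b) == a + b)          # prints True
```

Both operands are tiny multiples of the same step size (2^-149), so the float sum is exact and its bit pattern equals the integer sum.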

Lesson of the story - there is no such thing as a "trivial" code routine!!
All schematics/modules I post are free for all to use - but a credit is always polite!
Don't stagnate, mutate to create!
RJHollins
Posts: 1573
Joined: Thu Mar 08, 2012 7:58 pm

Re: Stream FFT and iFFT

Post by RJHollins »

hehe,

None of your work here has been trivial :lol:

Thanks to TROG and the other esteemed GURUs!

This is a wonderful learning experience for me ! 8-)
TheAudiophileDutchman
Posts: 46
Joined: Tue Jul 13, 2010 1:36 pm
Location: Apeldoorn, The Netherlands

Re: Stream FFT and iFFT

Post by TheAudiophileDutchman »

MyCo wrote: I've attached my latest version. It is now fully flexible. The FFT size can be changed while the whole thing is running.

WOW, MyCo and Trog this is really amazing stuff you guys have going on here! :!:

(just one minor niggle: amplitude plots in this latest version appear to be asymmetrical, while previous versions were okay)
T A D - since 2005
MyCo
Posts: 718
Joined: Tue Jul 13, 2010 12:33 pm
Location: Germany

Re: Stream FFT and iFFT

Post by MyCo »

trogluddite wrote: The culprit seems to be...

cvtps2dq xmm1,xmm1;
addps xmm1,current;


I hadn't seen that... It explains the performance difference. The int value interpreted as a float is just a denormal, which is why the calculation can still output the right value.
MyCo
Posts: 718
Joined: Tue Jul 13, 2010 12:33 pm
Location: Germany

Re: Stream FFT and iFFT

Post by MyCo »

TheAudiophileDutchman wrote: just one minor niggle: amplitude plots in this latest version appear to be asymmetrical, while previous versions were okay


Hm... interesting. Maybe it is just the graph display, which doesn't interpolate between the plot points.