Stream FFT and iFFT
Re: Stream FFT and iFFT
I measured without the graph and the three de-serializers, with no output connected to the external driver.
asio driver samplerate 96000
i7 920
value from internal cpu meter
v1 ~8.9%
v2 ~8.3%
v3 ~11.7%
value from Resource Monitor tools in W7
Average Cycle
v1 ~2.23
v2 ~2.45
v3 ~2.73
-
digitalwhitebyte - Posts: 106
- Joined: Sat Jul 31, 2010 10:20 am
Re: Stream FFT and iFFT
That's really weird. Maybe it's just instruction latency that has more impact on Intels than on AMDs. That means it could be fixed by rearranging the code a little bit, without changing its function. I can't test this; maybe someone else can try it.
I've attached my latest version. It is now fully flexible. The FFT size can be changed while the whole thing is running. That required some minor changes in the code of the FFT (just rearrangement). But I had to change some of the other modules to make this work, too. I completely rebuilt the serializer, so that it uses a double buffer (one is written to, the other is read from). You can't compare the CPU usage with the previous versions because of this, and there is another signal source (3 Oscs) for testing.
On my machine I can go up to 32768 Points without crashing or lagging, although there is a huge delay (~1.5 seconds @ 44kHz). The 32768 Points maximum in the dropdown is also the maximum of the buffers in the code, so don't go beyond that.
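For readers who have not met the double-buffer idea before, here is a minimal Python sketch of it. This is not the code from the attached schematic (which is FlowStone assembler); the class name `DoubleBuffer` and the helper `block_latency_seconds` are made up for illustration, and "44kHz" is assumed to mean 44100 Hz.

```python
class DoubleBuffer:
    """Two equal-sized blocks: one is being filled while the other is read."""

    def __init__(self, size):
        self.size = size
        self.buffers = [[0.0] * size, [0.0] * size]
        self.write_index = 0  # which of the two buffers is written to
        self.pos = 0          # shared read/write position inside a block

    def process(self, sample):
        """Store one input sample and return one sample from the finished block."""
        out = self.buffers[1 - self.write_index][self.pos]
        self.buffers[self.write_index][self.pos] = sample
        self.pos += 1
        if self.pos == self.size:
            # Block is full: swap roles. A real design would run the FFT
            # on the freshly filled block at this point.
            self.pos = 0
            self.write_index = 1 - self.write_index
        return out


def block_latency_seconds(size, sample_rate):
    """Time needed to fill one block of `size` samples."""
    return size / sample_rate


if __name__ == "__main__":
    db = DoubleBuffer(4)
    print([db.process(float(i)) for i in range(1, 9)])
    print(block_latency_seconds(32768, 44100))
```

One block of 32768 samples at 44100 Hz takes about 0.74 seconds to fill, so the reported delay of roughly 1.5 seconds is about two block lengths. That is only an observation about the arithmetic, not a statement about where exactly the delay arises in the schematic.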
- Attachments
-
- Stream FFT v5 (trogluddite, MyCo).fsm
- (73.22 KiB) Downloaded 1448 times
-
MyCo - Posts: 718
- Joined: Tue Jul 13, 2010 12:33 pm
- Location: Germany
Re: Stream FFT and iFFT
Great work.
Need to take a break? I have something right for you.
Feel free to donate. Thank you for your contribution.
- tester
- Posts: 1786
- Joined: Wed Jan 18, 2012 10:52 pm
- Location: Poland, internet
Re: Stream FFT and iFFT
I found something weird. After changing almost all of the code in the project, I also had to change the Integer counter (so that it outputs 2 signals). After that change I noticed a huge performance boost, and I can't explain why. My code is even longer than the one from trog, but on my machine it uses only 1/250 as much CPU.
I've attached a comparison; maybe someone can find the reason for the huge difference.
- Attachments
-
- Counter comparison.fsm
- (25.46 KiB) Downloaded 1456 times
-
MyCo - Posts: 718
- Joined: Tue Jul 13, 2010 12:33 pm
- Location: Germany
Re: Stream FFT and iFFT
B to A ~ 196 to 16 here (C2D).
I'm not an ASM geek, but here is my guess. Either Trog's example calculates something extra (directly or in the background), or there is some queue or value that (silently or directly) waits until other operations are done. Or it uses some CPU element that causes such a slowdown in that particular design.
So I would split it into conceptual blocks and test each block on its own; that would tell me whether one of the blocks is responsible, or a combination of them.
Analyzer is fine (I switched outputs connected to it and reset the analyzer - note for those who didn't).
- tester
- Posts: 1786
- Joined: Wed Jan 18, 2012 10:52 pm
- Location: Poland, internet
Re: Stream FFT and iFFT
Ha ha - because my code is WRONG, yet still WORKS!!
The culprit seems to be...
Code:
cvtps2dq xmm1,xmm1;
addps xmm1,current;
...so after converting xmm1 to integer, I'm doing a float add - D'oh
As both numbers are <23 bits in length (as integers), they will have exponent = 0 when treated as floats, and so are denormals, hence the huge CPU load when added as floats. The opcode should, of course, have been "paddd", what a dumb-ass!
But, since both numbers have exactly the same exponent of zero, as will the answer, the integer output still comes out correctly - as there was no 'bug' apparent in this tiny "utility" code, I never looked over the code again, and didn't see the stupid mistake!!
Lesson of the story - there is no such thing as a "trivial" code routine!!
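The "wrong but still works" effect can be reproduced outside of assembler. The sketch below is only an illustration in Python, not the schematic's code: it reinterprets small integers as 32-bit floats, adds them as floats, and reads the result back as integer bits. The helper names are made up for this example. Python adds in double precision and rounds to 32 bits when packing, which is exact for these values; the sketch shows why the result is correct, but it does not measure the CPU penalty of denormals.

```python
import struct


def int_bits_as_float(n):
    """Reinterpret the 32-bit pattern of a non-negative integer as a float32."""
    return struct.unpack("<f", struct.pack("<i", n))[0]


def float_as_int_bits(x):
    """Round x to float32 and return its 32-bit pattern as an integer."""
    return struct.unpack("<i", struct.pack("<f", x))[0]


def is_denormal(x):
    """True if x is non-zero but below the smallest normal float32 (2**-126)."""
    return x != 0.0 and abs(x) < 2.0 ** -126


def float_add_on_int_bits(a, b):
    """Mimic a float add applied by mistake to two integer bit patterns."""
    return float_as_int_bits(int_bits_as_float(a) + int_bits_as_float(b))


if __name__ == "__main__":
    a, b = 1000, 24
    # Both operands have an all-zero exponent field, so they are denormals.
    print(is_denormal(int_bits_as_float(a)), is_denormal(int_bits_as_float(b)))
    # The float add of the bit patterns matches the plain integer add.
    print(float_add_on_int_bits(a, b), a + b)
```

With an exponent field of zero, the mantissa bits are weighted exactly like the bits of an integer, so adding two such floats adds the mantissas the same way an integer add would. The values stay correct as long as the operands and their sum fit in the 23 mantissa bits.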
All schematics/modules I post are free for all to use - but a credit is always polite!
Don't stagnate, mutate to create!
-
trogluddite - Posts: 1730
- Joined: Fri Oct 22, 2010 12:46 am
- Location: Yorkshire, UK
Re: Stream FFT and iFFT
hehe,
None of your work here has been trivial.
Thanks to TROG and the other esteemed GURUs!
This is a wonderful learning experience for me!
- RJHollins
- Posts: 1571
- Joined: Thu Mar 08, 2012 7:58 pm
Re: Stream FFT and iFFT
MyCo wrote: I've attached my latest version. It is now fully flexible. The FFT size can be changed while the whole thing is running.
WOW, MyCo and Trog this is really amazing stuff you guys have going on here!
(just one minor niggle: amplitude plots in this latest version appear to be asymmetrical, while previous versions were okay)
T A D - since 2005
-
TheAudiophileDutchman - Posts: 46
- Joined: Tue Jul 13, 2010 1:36 pm
- Location: Apeldoorn, The Netherlands
Re: Stream FFT and iFFT
trogluddite wrote: The culprit seems to be...
Code:
cvtps2dq xmm1,xmm1;
addps xmm1,current;
Haven't seen that... That explains the performance difference. The int value interpreted as float is just a denormal. That's why the calculation can still output the right value.
-
MyCo - Posts: 718
- Joined: Tue Jul 13, 2010 12:33 pm
- Location: Germany
Re: Stream FFT and iFFT
TheAudiophileDutchman wrote: just one minor niggle: amplitude plots in this latest version appear to be asymmetrical, while previous versions were okay
Hm... interesting. Maybe it is just the graph display, which doesn't interpolate the plot points.
-
MyCo - Posts: 718
- Joined: Tue Jul 13, 2010 12:33 pm
- Location: Germany