Fast Stream Array Access

Post any examples or modules that you want to share here

Re: Fast Stream Array Access

Postby KG_is_back » Sun Oct 19, 2014 10:27 pm

Exo wrote:
KG_is_back wrote:I was going to ask you guys is there any opcodes you really want/need? If you can give clear examples of benefits of certain opcodes I could get on to Malc to add them (I'm usually quite good at getting him to add little things if I give him a clear example and make it simple for him).

Maybe topic for another thread?


No. 1 choice: subtraction for integers, either "sub reg, reg/var32" and/or "psubd xmm0, xmm1/var". Also a fix for the nasty andnps coloring bug. A logical NOT would be appreciated too, even in the Code component.
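To make the request concrete, here is a rough Python model of what a packed 32-bit subtract would compute per lane, and how the same result can be built from NOT and ADD using the two's-complement identity a - b = a + ~b + 1. The function names (not32, sub32, psubd) are just illustrative labels for this sketch, and it says nothing about which of these building blocks the FS assembler actually accepts.

```python
MASK32 = 0xFFFFFFFF  # keep every result inside one 32-bit doubleword lane

def not32(x):
    """Bitwise NOT of a 32-bit value."""
    return ~x & MASK32

def sub32(a, b):
    """a - b modulo 2**32, built only from NOT and ADD (two's complement)."""
    return (a + not32(b) + 1) & MASK32

def psubd(xmm0, xmm1):
    """Lane-wise model of a packed subtract on four 32-bit lanes."""
    return [sub32(a, b) for a, b in zip(xmm0, xmm1)]

print(psubd([10, 0, 5, 7], [3, 1, 5, 2]))  # [7, 4294967295, 0, 5]
```

The second lane shows the wrap-around: 0 - 1 gives 0xFFFFFFFF, which is -1 when read as a signed doubleword.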
KG_is_back
 
Posts: 1196
Joined: Tue Oct 22, 2013 5:43 pm
Location: Slovakia

Re: Fast Stream Array Access

Postby MyCo » Mon Oct 20, 2014 4:29 am

Wow, Martin is on a roll :P

I hadn't noticed that FS supports the "movd r/m32, xmm" instruction, good to know. Unfortunately it doesn't support "movd xmm, r/m32", which would give another performance boost.

BTW: Don't trust the cycle counter method, it's pretty inaccurate. On my system, for example, the cycle counter outputs the same value for the "Simple Delay" and the "Simple Delay (Stock)", although I know there should be a huge difference. When I need a meaningful comparison, I make hundreds of synchronized copies of a module and put them in parallel into a selector (as mono/packed mono stream). Then I switch between optimized/normal while looking at the CPU usage, either in FS or in the Windows Resource Monitor.
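The same idea can be sketched outside FS. The Python below is only an analogy, not a FlowStone measurement: it times many copies of two delay-line variants instead of one isolated call. The names delay_stock, delay_ring and mass_time are made up for this sketch, and neither variant is meant to mirror the actual stock or optimized modules.

```python
import time

def delay_stock(samples, length):
    """Reference delay line: a list used as a FIFO (length must be >= 1)."""
    buf = [0.0] * length
    out = []
    for x in samples:
        out.append(buf.pop(0))  # oldest sample leaves at the front
        buf.append(x)           # newest sample enters at the back
    return out

def delay_ring(samples, length):
    """Alternative delay line: ring buffer with a single read/write index."""
    buf = [0.0] * length
    idx = 0
    out = []
    for x in samples:
        out.append(buf[idx])    # read the oldest sample
        buf[idx] = x            # overwrite it with the newest one
        idx = (idx + 1) % length
    return out

def mass_time(func, copies, samples, length):
    """Run `copies` instances back to back and return the total wall time."""
    start = time.perf_counter()
    for _ in range(copies):
        func(samples, length)
    return time.perf_counter() - start

if __name__ == "__main__":
    samples = [float(i) for i in range(2000)]
    for copies in (1, 10, 100):
        t_stock = mass_time(delay_stock, copies, samples, 64)
        t_ring = mass_time(delay_ring, copies, samples, 64)
        print(copies, "copies:", round(t_stock, 4), "s vs", round(t_ring, 4), "s")
```

Running it at several copy counts is the point: if the ratio between the two variants changes with the number of copies, a single isolated measurement is not telling the whole story.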
MyCo
 
Posts: 718
Joined: Tue Jul 13, 2010 12:33 pm
Location: Germany

Re: Fast Stream Array Access

Postby MyCo » Mon Oct 20, 2014 5:07 am

Here is a test bench schematic that I use for optimizations. I've set it up with the delays.
Attachments
Delay Testbench (MyCo).fsm
(140.93 KiB) Downloaded 1000 times
MyCo
 
Posts: 718
Joined: Tue Jul 13, 2010 12:33 pm
Location: Germany

Re: Fast Stream Array Access

Postby Tronic » Mon Oct 20, 2014 8:42 am

Exo wrote:I was going to ask you guys is there any opcodes you really want/need?


call [ reg ]
so we can call a function with address pointer from dll, directly in the Assembler, and use the dll as plugin.
Or any other way to call function from DLL in Code or Assembler.
Tronic
 
Posts: 539
Joined: Wed Dec 21, 2011 12:59 pm

Re: Fast Stream Array Access

Postby KG_is_back » Mon Oct 20, 2014 6:42 pm

MyCo wrote:Here is a test bench schematic that I use for optimizations. I've set it up with the delays.


Very interesting! The stock delays show about 20% and the "optimized" ones show 30-40% on my machine.
KG_is_back
 
Posts: 1196
Joined: Tue Oct 22, 2013 5:43 pm
Location: Slovakia

Re: Fast Stream Array Access

Postby martinvicanek » Mon Oct 20, 2014 10:01 pm

MyCo wrote:When I need a meaningful comparison, I make hundreds of synchronized copies of a module and put them in parallel into a selector (as mono/packed mono stream). Then I switch between optimized/normal while looking at the CPU usage, either in FS or in the Windows Resource Monitor.

Hm, very confusing. The mass test does not show a big difference between stock and "optimized" - if any, then the other way round. :? When you go to 10 instead of 100 copies then the proportions change towards the analyzer result. For me this shows that performance is a complex beast, it depends very much on context. Measuring the performance of one isolated unit seems to have little meaning. But then again, is the mass setup with 100 delays in parallel more representative of a real scenario?

I have implemented "fast" lookup table modules but now I hesitate to post them ...
martinvicanek
 
Posts: 1328
Joined: Sat Jun 22, 2013 8:28 pm

Re: Fast Stream Array Access

Postby tester » Mon Oct 20, 2014 10:21 pm

When I play with oscillators, I usually have a few hundred of them on board. So yes, it can be a real scenario, and it has practical uses. On the other hand, even if your oscillators only perform better within smaller designs, those designs can be heavy in other parts, so these few percent can be helpful too. I think I may be able to do a quick test of a multi-osc setup, to see what the real-life difference between the stock and the custom-made part is.

In fact, this is why I asked you about the possibility of making "multisine" oscillators. I'm not sure if there is any way to make a single "shape" oscillator that takes a list of random sine frequencies as input (at ca. 0.01 Hz accuracy each).
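To show what I mean by "multisine", here is a minimal Python sketch, not an FS module: one function takes a list of frequencies and sums unit-amplitude sines, keeping one phase accumulator per frequency. The name multisine and its parameters are invented for this example.

```python
import math

def multisine(freqs_hz, sample_rate, num_samples):
    """Sum of unit-amplitude sines, one phase accumulator per frequency."""
    phases = [0.0] * len(freqs_hz)
    # phase increment per sample, measured in cycles
    steps = [f / sample_rate for f in freqs_hz]
    out = []
    for _ in range(num_samples):
        out.append(sum(math.sin(2.0 * math.pi * p) for p in phases))
        # advance every phase and wrap it to [0, 1) so it stays small
        phases = [(p + s) % 1.0 for p, s in zip(phases, steps)]
    return out

# Example: three partials at arbitrary frequencies, 44.1 kHz, 8 samples
print(multisine([100.0, 100.01, 250.37], 44100.0, 8))
```

Python uses double precision here. With 32-bit stream floats, a 0.01 Hz difference is a very small phase step per sample, so accumulator precision might become the limiting factor.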
tester
 
Posts: 1786
Joined: Wed Jan 18, 2012 10:52 pm
Location: Poland, internet

Re: Fast Stream Array Access

Postby KG_is_back » Mon Oct 20, 2014 10:54 pm

Actually, a more relevant test just occurred to me: we can put the module into a poly section and create a module that initiates a given number of voices. Because a poly section can work in parallel independently and runs only when a voice is on, we can avoid selectors.
KG_is_back
 
Posts: 1196
Joined: Tue Oct 22, 2013 5:43 pm
Location: Slovakia

Opcode Wishlist

Postby martinvicanek » Tue Oct 21, 2014 8:35 am

KG_is_back wrote:
Exo wrote:I was going to ask you guys is there any opcodes you really want/need?

No. 1 choice: subtraction for integers, either "sub reg, reg/var32" and/or "psubd xmm0, xmm1/var". Also a fix for the nasty andnps coloring bug. A logical NOT would be appreciated too, even in the Code component.

+1, and the following:

PSRLD xmm1, xmm2/m128
Shift doublewords in xmm1 right by amount specified in xmm2/m128 while shifting in 0s.
(Would be handy for some IEEE 754 trickery in log and exp approximations)

PMULUDQ xmm1, xmm2/m128
Multiply packed unsigned doubleword integers in xmm1 by packed unsigned doubleword integers in xmm2/m128, and store the quadword results in xmm1.
(Useful for a linear congruential random number generator)

PADDD xmm1, xmm2
Add packed doubleword integers from xmm2/m128 and xmm1.
(Current implementation only supports PADDD xmm1, m128)
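For anyone wondering what these would be used for, here is a small scalar Python model of the two use cases mentioned above. It is a sketch under assumptions, not FS code: the helper names are invented, the log2 approximation is deliberately crude, and the LCG constants (1664525 and 1013904223) are just one commonly quoted 32-bit parameter set picked for illustration.

```python
import struct

MASK32 = 0xFFFFFFFF

def float_bits(x):
    """Reinterpret a 32-bit float as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def exponent_by_shift(x):
    """Unbiased exponent of a positive normal float: shift right by 23, subtract the bias."""
    return (float_bits(x) >> 23) - 127

def rough_log2(x):
    """Crude log2 for positive normal floats: exponent plus a linear mantissa term."""
    bits = float_bits(x)
    mantissa = (bits & 0x7FFFFF) / float(1 << 23)  # fractional part in [0, 1)
    return (bits >> 23) - 127 + mantissa

def lcg_next(state, a=1664525, c=1013904223):
    """One LCG step: widening 32x32 multiply, add, keep the low 32 bits."""
    product = (state & MASK32) * a  # full-width product, as a 32x32 -> 64 multiply gives
    return (product + c) & MASK32

print(exponent_by_shift(8.0))  # 3
print(rough_log2(1.5))         # 0.5 (exact log2 is about 0.585)
print(lcg_next(0))             # 1013904223
```

The shift corresponds to what PSRLD would do per lane, the widening multiply to PMULUDQ, and the final add to PADDD.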

Exo wrote:Maybe topic for another thread?
Yes, please :)
martinvicanek
 
Posts: 1328
Joined: Sat Jun 22, 2013 8:28 pm

Re: Fast Stream Array Access

Postby martinvicanek » Tue Oct 21, 2014 8:42 am

martinvicanek wrote:Hm, very confusing. The mass test does not show a big difference between stock and "optimized" - if any, then the other way round. :?

Apparently this paradox has confused others before:
http://synthmaker.co.uk/forum/viewtopic ... =30#p77149
martinvicanek
 
Posts: 1328
Joined: Sat Jun 22, 2013 8:28 pm
