do the Shufps
Hi all,
Can anyone tell me the shufps code to change the sse channel order from 0123 to 3210? i.e. to reverse it - what would be the 'n' in :
shufps xmm0,xmm0,n; ? (Maybe it needs more than one step ..)
I've read here that there was a handy shufps helper on the forum some years back, but I haven't been able to find it. A ref to that would be most useful!
Thanks
H
- HughBanton
- Posts: 265
- Joined: Sat Apr 12, 2008 3:10 pm
- Location: Evesham, Worcestershire
Re: do the Shufps
HughBanton wrote:Hi all,
Can anyone tell me the shufps code to change the sse channel order from 0123 to 3210? i.e. to reverse it - what would be the 'n' in :
shufps xmm0,xmm0,n; ? (Maybe it needs more than one step ..)
I've read here that there was a handy shufps helper on the forum some years back, but I haven't been able to find it. A ref to that would be most useful!
Thanks
H
According to Intel x86 Assembly/SSE, this would be it:
Code:
shufps $0x1b, %xmm0, %xmm0 # reverse order of the 4 floats
The control byte is an 8-bit immediate that tells what goes where. It is written first in the AT&T syntax used in this example; in Intel syntax (as used by NASM, and in your line above) it is always the last operand.
The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register. The select operand is an 8-bit immediate: bits 0 and 1 select the value to be moved from the destination operand the low doubleword of the result, bits 2 and 3 select the value to be moved from the destination operand the second doubleword of the result, bits 4 and 5 select the value to be moved from the source operand the third doubleword of the result, and bits 6 and 7 select the value to be moved from the source operand the high doubleword of the result.
0x1b is hex; in decimal it is 27 and in binary 00011011, which breaks down into the 2-bit fields 00 01 10 11, i.e. the immediates 0, 1, 2, 3. I think it's MSB order.
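To make the bit layout concrete, here is a minimal Python sketch that models the single-register case, `shufps xmm0,xmm0,n`. The helper names `make_imm8` and `shufps_same` are made up for illustration, and the model assumes the rule quoted above: the 2-bit field at bits 2k and 2k+1 picks which input element lands in result element k.

```python
def make_imm8(sel0, sel1, sel2, sel3):
    """Pack four 2-bit selectors; sel0 controls result element 0 (bits 0-1)."""
    for s in (sel0, sel1, sel2, sel3):
        if not 0 <= s <= 3:
            raise ValueError("each selector must be 0..3")
    return sel0 | (sel1 << 2) | (sel2 << 4) | (sel3 << 6)


def shufps_same(x, imm8):
    """Model shufps with the same register as both operands.

    x is a list of 4 values, index 0 = lowest element (hypothetical model).
    """
    return [x[(imm8 >> (2 * k)) & 3] for k in range(4)]


# Reverse: result element 0 takes input 3, element 1 takes input 2, and so on.
n = make_imm8(3, 2, 1, 0)
print(n, hex(n), format(n, "08b"))      # 27 0x1b 00011011
print(shufps_same([0, 1, 2, 3], n))     # [3, 2, 1, 0]
```

Written from the top bits down, the same byte reads 00 01 10 11, which is where the "0, 1, 2, 3" above comes from.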
Hope it helps!
"There lies the dog buried" (German saying translated literally)
- tulamide
- Posts: 2714
- Joined: Sat Jun 21, 2014 2:48 pm
- Location: Germany
Re: do the Shufps
Hah - 27 .. that's it! Thanks Tula.
I had searched hi & lo, but couldn't find the logic written down anywhere. I'll make a note of all that.
I've been occasionally looking at Rotary Speaker stuff of late (about time ..?) and realised that swapping the mono-4 channels around like this would instantly simplify the spider's web inside the auto-panner that I've introduced. I'm trying to make the delay reflections move individually in stereo as they 'rotate', which seems to be a crucial Leslie element.
Anyway, more on all this when I eventually get something worth demonstrating.
Thanks again.
H
- HughBanton
- Posts: 265
- Joined: Sat Apr 12, 2008 3:10 pm
- Location: Evesham, Worcestershire
Re: do the Shufps
It's the first time I had to deal with it, which shows that it's actually pretty easy. The select operand has 8 bits, and each pair of bits says which input element is copied into the equivalent element of the result. You just need to learn 4 states:
0 = copy from least significant element
1 = copy from second element
2 = copy from third element
3 = copy from most significant element
above numbers in 2-bit binary: 0 = 00, 1 = 01, 2 = 10, 3 = 11
These are the same for all four 2-bit fields in the IMM8. But, and this is the catch, there's a specified order when using two registers!
However, if you only work with one register, you can directly translate it:
ABCD to DABC
IMM8 2, 1, 0, 3 = mask 10 01 00 11 = binary 10010011 = decimal 147 = hex 0x93
The above example would be called a rotation. If you are only interested in specific uses of shufps on one register, specifically broadcast, swap and rotate, this page will help you a lot: it doesn't explain much, but gives straight usage code for specific tasks.
http://www.songho.ca/misc/sse/sse.html
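For the single-register uses mentioned above (broadcast, swap and rotate), the masks can be worked out with a small Python model. This is only a sketch: `shufps_same` is a hypothetical helper, elements are listed lowest first, and the mask values are derived from the bit layout described earlier in the thread rather than copied from the linked page.

```python
def shufps_same(x, imm8):
    """Model shufps with one register; field k of imm8 selects the input for result k."""
    return [x[(imm8 >> (2 * k)) & 3] for k in range(4)]


abcd = ["A", "B", "C", "D"]   # index 0 = lowest element

masks = {
    "broadcast element 0": 0x00,   # 00 00 00 00
    "broadcast element 3": 0xFF,   # 11 11 11 11
    "swap within pairs":   0xB1,   # 10 11 00 01
    "swap the two halves": 0x4E,   # 01 00 11 10
    "rotate one way":      0x93,   # 10 01 00 11
    "rotate the other way": 0x39,  # 00 11 10 01
    "reverse":             0x1B,   # 00 01 10 11
}

for name, imm8 in masks.items():
    result = "".join(shufps_same(abcd, imm8))
    print(f"{name:22s} 0x{imm8:02X}  ABCD -> {result}")
```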
EDIT: I told you it is in MSB order, but my example was in LSB order! Sorry! 0x93 would do ABCD to BCDA!
EDIT2: According to the tool Martin posted, my original explanation is absolutely correct. So please ignore Edit 1!
Last edited by tulamide on Fri May 14, 2021 8:56 pm, edited 1 time in total.
"There lies the dog buried" (German saying translated literally)
- tulamide
- Posts: 2714
- Joined: Sat Jun 21, 2014 2:48 pm
- Location: Germany
Re: do the Shufps
Wonderful tool by STW and infuzion!
- Attachments
- shufps ASM operand mask helper 1.5.2.fsm (20.36 KiB) Downloaded 967 times
- martinvicanek
- Posts: 1328
- Joined: Sat Jun 22, 2013 8:28 pm
Re: do the Shufps
martinvicanek wrote:Wonderful tool by STW and infuzion!
Interesting. The tool lays out the mask exactly as I did in my example: 0x93 does a right shift. But Intel explains it exactly the opposite way. According to their documentation, it should do a left shift.
What's going on here?
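One possible reading of the mismatch, offered as a sketch rather than a definitive answer: the direction of the rotation depends on whether the four elements are written lowest-first (channel order 0123, as in Hugh's question) or highest-first (the way the bits of the mask itself are written). The Python model below applies 0x93 once and prints the result both ways; `shufps_same` and `show` are hypothetical helpers.

```python
def shufps_same(x, imm8):
    """x[0] is the lowest element; field k of imm8 selects the input for result k."""
    return [x[(imm8 >> (2 * k)) & 3] for k in range(4)]


def show(x, low_first=True):
    """Join four element labels, lowest element first or highest element first."""
    return "".join(x if low_first else reversed(x))


# Lowest-first listing: element 0 is written on the left.
lanes = list("ABCD")
print(show(lanes), "->", show(shufps_same(lanes, 0x93)))
# ABCD -> DABC

# Highest-first listing: element 3 is written on the left,
# so "ABCD" now means element 3 = A ... element 0 = D.
lanes = list(reversed("ABCD"))
print(show(lanes, False), "->", show(shufps_same(lanes, 0x93), False))
# ABCD -> BCDA
```

In this model the instruction does the same thing both times; only the direction in which the elements are written down changes, which would account for one description looking like a right shift and the other like a left shift.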
"There lies the dog buried" (German saying translated literally)
- tulamide
- Posts: 2714
- Joined: Sat Jun 21, 2014 2:48 pm
- Location: Germany
Re: do the Shufps
Am I ignored, or does nobody know?
"There lies the dog buried" (German saying translated literally)
- tulamide
- Posts: 2714
- Joined: Sat Jun 21, 2014 2:48 pm
- Location: Germany
Re: do the Shufps
tulamide wrote:Am I ignored, or does nobody know?
Definitely ignored.
We need an "I read your post but I know nothing" button!
- Spogg
- Posts: 3358
- Joined: Thu Nov 20, 2014 4:24 pm
- Location: Birmingham, England
Re: do the Shufps
Sorry, Tula, not ignoring your post, just don't know the answer to your question.
tulamide wrote:The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register. The select operand is an 8-bit immediate: bits 0 and 1 select the value to be moved from the destination operand the low doubleword of the result, bits 2 and 3 select the value to be moved from the destination operand the second doubleword of the result, bits 4 and 5 select the value to be moved from the source operand the third doubleword of the result, and bits 6 and 7 select the value to be moved from the source operand the high doubleword of the result.
If this is Intel's explanation then I don't understand it. I have read it several times but even the grammar seems odd to me. All I can say is that the shufps helper tool, which I have been using excessively for years, works flawlessly.
- martinvicanek
- Posts: 1328
- Joined: Sat Jun 22, 2013 8:28 pm
Re: do the Shufps
martinvicanek wrote:Sorry, Tula, not ignoring your post, just don't know the answer to your question.
tulamide wrote:The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register. The select operand is an 8-bit immediate: bits 0 and 1 select the value to be moved from the destination operand the low doubleword of the result, bits 2 and 3 select the value to be moved from the destination operand the second doubleword of the result, bits 4 and 5 select the value to be moved from the source operand the third doubleword of the result, and bits 6 and 7 select the value to be moved from the source operand the high doubleword of the result.
If this is Intel's explanation then I don't understand it. I have read it several times but even the grammar seems odd to me. All I can say is that the shufps helper tool, which I have been using excessively for years, works flawlessly.
Thanks! Yes, as I said earlier, the tool and my explanation both do the correct thing. That's why I was confused that it's explained in the opposite order.
But nobody ever complained about the description, so I assume its flaw has long been accepted and people are aware of it? Or it is a matter of little and big endian, which depends on the CPU. Maybe I was reading the description for big-endian, instead of little-endian as used by Intel CPUs? Well, I think we can leave it at that.
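For the two-register case, the quoted description can be modelled directly. The sketch below is a Python model under stated assumptions, not FlowStone code: `shufps` is a hypothetical function, lists are indexed with element 0 lowest, and the operands follow Intel order (destination first). Byte order in memory does not enter into this model at all, since it only deals with element positions inside the registers.

```python
def shufps(dest, src, imm8):
    """Model `shufps dest, src, imm8` (Intel operand order).

    Result elements 0 and 1 are picked from dest,
    result elements 2 and 3 are picked from src.
    """
    sel = [(imm8 >> (2 * k)) & 3 for k in range(4)]
    return [dest[sel[0]], dest[sel[1]], src[sel[2]], src[sel[3]]]


a = ["a0", "a1", "a2", "a3"]
b = ["b0", "b1", "b2", "b3"]

print(shufps(a, b, 0x1B))   # ['a3', 'a2', 'b1', 'b0']
print(shufps(a, b, 0xE4))   # ['a0', 'a1', 'b2', 'b3']
print(shufps(a, a, 0x1B))   # ['a3', 'a2', 'a1', 'a0']
```

The last line shows why the single-register form is the easy one: with the same register in both positions, the split between destination and source disappears and the mask acts as a plain permutation.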
"There lies the dog buried" (German saying translated literally)
- tulamide
- Posts: 2714
- Joined: Sat Jun 21, 2014 2:48 pm
- Location: Germany