SSE3

Revision as of 17:28, 28 April 2025 by 208.209.227.4 (talk) (Transmeta TM8800 had SSE3 added by Code Morphing Software 6.1.1; it was not in the original shipping 6.0.4 from 2004. Most vendors did not ship the update)

SSE3, Streaming SIMD Extensions 3, also known by its Intel code name Prescott New Instructions (PNI),<ref name=":1">{{#invoke:citation/CS1|citation |CitationClass=web }}</ref> is the third iteration of the SSE instruction set for the IA-32 (x86) architecture. Intel introduced SSE3 in early 2004 with the Prescott revision of their Pentium 4 CPU.<ref name=":1" /> In April 2005, AMD introduced a subset of SSE3 in revision E (Venice and San Diego) of their Athlon 64 CPUs.<ref>{{#invoke:citation/CS1|citation |CitationClass=web }}</ref> The earlier SIMD instruction sets on the x86 platform, from oldest to newest, are MMX, 3DNow! (developed by AMD, no longer supported on newer CPUs), SSE, and SSE2.

SSE3 contains 13 new instructions over SSE2.<ref>{{#invoke:citation/CS1|citation |CitationClass=web }}</ref>

Changes

The most notable change is the capability to work horizontally in a register, as opposed to the more or less strictly vertical operation of all previous SSE instructions. More specifically, SSE3 adds instructions that add and subtract the multiple values stored within a single register.<ref name=":2">{{#invoke:citation/CS1|citation |CitationClass=web }}</ref> These instructions can be used to speed up the implementation of a number of DSP and 3D operations. There is also a new instruction to convert floating-point values to integers without having to change the global rounding mode, thus avoiding costly pipeline stalls. Finally, the extension adds LDDQU, an alternative misaligned integer vector load that performs better on NetBurst-based platforms for loads that cross cache-line boundaries.<ref>{{#invoke:citation/CS1|citation |CitationClass=web }}</ref>
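The horizontal addition described above can be modelled in plain C. The following is a minimal sketch, not the hardware implementation: the `vec4f` type and the `hadd4` and `sum4` helpers are hypothetical names introduced here for illustration, with `hadd4` following the lane layout of HADDPS (real code would typically use the `_mm_hadd_ps` intrinsic from `<pmmintrin.h>` instead).

```c
/* Hypothetical 4-lane type standing in for a 128-bit XMM register
   holding four single-precision values. */
typedef struct { float lane[4]; } vec4f;

/* Scalar model of a horizontal add in the HADDPS lane layout:
   adjacent pairs of a fill the low half of the result,
   adjacent pairs of b fill the high half. */
static vec4f hadd4(vec4f a, vec4f b) {
    vec4f r;
    r.lane[0] = a.lane[0] + a.lane[1];
    r.lane[1] = a.lane[2] + a.lane[3];
    r.lane[2] = b.lane[0] + b.lane[1];
    r.lane[3] = b.lane[2] + b.lane[3];
    return r;
}

/* Sum all four lanes of v using two horizontal adds. */
static float sum4(vec4f v) {
    vec4f t = hadd4(v, v);  /* { v0+v1, v2+v3, v0+v1, v2+v3 } */
    t = hadd4(t, t);        /* every lane now holds v0+v1+v2+v3 */
    return t.lane[0];
}
```

With purely vertical instructions, the same reduction needs explicit shuffles to bring lanes of one register into alignment before they can be added.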

CPUs with SSE3

  • AMD:
    • Opteron (since Stepping E4<ref>{{#invoke:citation/CS1|citation |CitationClass=web }}</ref>)

New instructions

Common instructions

Arithmetic

ADDSUBPD
Add-Subtract-Packed-Double<ref name=":0">{{#invoke:citation/CS1|citation |CitationClass=web }}</ref>

  • Input: { A0, A1 }, { B0, B1 }
  • Output: { A0 − B0, A1 + B1 }
ADDSUBPS
Add-Subtract-Packed-Single<ref name=":0" />
  • Input: { A0, A1, A2, A3 }, { B0, B1, B2, B3 }
  • Output: { A0 − B0, A1 + B1, A2 − B2, A3 + B3 }
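The alternating pattern of ADDSUBPS can be sketched as a scalar model in C. This is an illustration under stated assumptions rather than the actual instruction: `vec4f` and `addsub4` are hypothetical names, and real code would normally call the `_mm_addsub_ps` intrinsic (or `_mm_addsub_pd` for the two-lane double-precision form) from `<pmmintrin.h>`.

```c
/* Hypothetical 4-lane type standing in for a 128-bit XMM register
   holding four single-precision values. */
typedef struct { float lane[4]; } vec4f;

/* Scalar model of ADDSUBPS: even-indexed lanes are subtracted,
   odd-indexed lanes are added. */
static vec4f addsub4(vec4f a, vec4f b) {
    vec4f r;
    for (int i = 0; i < 4; i++) {
        r.lane[i] = (i % 2 == 0) ? a.lane[i] - b.lane[i]
                                 : a.lane[i] + b.lane[i];
    }
    return r;
}
```

For example, inputs { 10, 20, 30, 40 } and { 1, 2, 3, 4 } give { 9, 22, 27, 44 }.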

AOS (Array Of Structures)

HADDPD
Horizontal-Add-Packed-Double<ref name=":0" />
  • Input: { A0, A1 }, { B0, B1 }
  • Output: { A0 + A1, B0 + B1 }
HADDPS
Horizontal-Add-Packed-Single<ref name=":0" />
  • Input: { A0, A1, A2, A3 }, { B0, B1, B2, B3 }
  • Output: { A0 + A1, A2 + A3, B0 + B1, B2 + B3 }
HSUBPD
Horizontal-Subtract-Packed-Double<ref name=":0" />
  • Input: { A0, A1 }, { B0, B1 }
  • Output: { A0 − A1, B0 − B1 }
HSUBPS
Horizontal-Subtract-Packed-Single<ref name=":0" />
  • Input: { A0, A1, A2, A3 }, { B0, B1, B2, B3 }
  • Output: { A0 − A1, A2 − A3, B0 − B1, B2 − B3 }
LDDQU
As stated above, this is an alternative misaligned integer vector load.<ref name=":0" /> It can be helpful for video compression tasks.
MOVDDUP, MOVSHDUP, MOVSLDUP<ref name=":2" />
These instructions duplicate elements within a register. They are useful for complex-number arithmetic and for waveform calculations such as audio processing.
FISTTP
Like the older x87 FISTP instruction, but ignores the floating-point control register's rounding mode settings and uses the "chop" (truncate) mode instead.<ref name=":2" /> This allows the expensive loading and re-loading of the control register to be omitted in languages such as C, where the standard requires float-to-int conversion to truncate.
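To illustrate why the duplication instructions suit complex-number arithmetic, the sketch below multiplies two pairs of complex numbers stored as interleaved { real, imaginary } lanes. It is a scalar model under stated assumptions, not a definitive implementation: `vec4f`, `dup_even`, `dup_odd`, `addsub4`, `mul4`, `swap_pairs`, and `complex_mul2` are hypothetical names. `dup_even` and `dup_odd` model MOVSLDUP and MOVSHDUP, `addsub4` models ADDSUBPS, and `swap_pairs` stands in for an ordinary shuffle, which is not an SSE3 instruction.

```c
/* Hypothetical 4-lane type: two complex numbers stored as
   { re0, im0, re1, im1 }. */
typedef struct { float lane[4]; } vec4f;

/* Model of MOVSLDUP: duplicate the even-indexed lanes. */
static vec4f dup_even(vec4f a) {
    vec4f r = {{ a.lane[0], a.lane[0], a.lane[2], a.lane[2] }};
    return r;
}

/* Model of MOVSHDUP: duplicate the odd-indexed lanes. */
static vec4f dup_odd(vec4f a) {
    vec4f r = {{ a.lane[1], a.lane[1], a.lane[3], a.lane[3] }};
    return r;
}

/* Model of ADDSUBPS: subtract in even lanes, add in odd lanes. */
static vec4f addsub4(vec4f a, vec4f b) {
    vec4f r = {{ a.lane[0] - b.lane[0], a.lane[1] + b.lane[1],
                 a.lane[2] - b.lane[2], a.lane[3] + b.lane[3] }};
    return r;
}

/* Ordinary lane-wise (vertical) multiply. */
static vec4f mul4(vec4f a, vec4f b) {
    vec4f r;
    for (int i = 0; i < 4; i++) r.lane[i] = a.lane[i] * b.lane[i];
    return r;
}

/* Swap real and imaginary parts within each pair
   (a plain shuffle, not part of SSE3). */
static vec4f swap_pairs(vec4f a) {
    vec4f r = {{ a.lane[1], a.lane[0], a.lane[3], a.lane[2] }};
    return r;
}

/* Multiply two packed complex pairs:
   (ar + ai*i)(br + bi*i) = (ar*br - ai*bi) + (ar*bi + ai*br)*i */
static vec4f complex_mul2(vec4f a, vec4f b) {
    vec4f t1 = mul4(dup_even(a), b);             /* { ar*br, ar*bi, ... } */
    vec4f t2 = mul4(dup_odd(a), swap_pairs(b));  /* { ai*bi, ai*br, ... } */
    return addsub4(t1, t2);
}
```

For example, multiplying { 1, 2, 3, 4 } by { 5, 6, 7, 8 }, that is (1 + 2i)(5 + 6i) and (3 + 4i)(7 + 8i), yields { −7, 16, −11, 52 }.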

Other instructions

MONITOR, MWAIT
The MONITOR instruction is used to specify a memory address for monitoring, while the MWAIT instruction puts the processor into a low-power state and waits for a write event to the monitored address.<ref name=":2" />

References

Template:Reflist
