Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
Out-of-order execution
(section)
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
== Dispatch and issue decoupling allows out-of-order issue == One of the differences created by the new paradigm is the creation of queues that allow the dispatch step to be decoupled from the issue step and the graduation stage to be decoupled from the execute stage. An early name for the paradigm was ''decoupled architecture''. In the earlier ''in-order'' processors, these stages operated in a fairly [[Lockstep (computing)|lock-step]], pipelined fashion. The [[Instruction cycle|fetch and decode stages]] is separated from the execute stage in a [[Pipeline (computing)|pipelined]] processor by using a [[Data buffer|buffer]]. The buffer's purpose is to partition the [[Memory access pattern|memory access]] and execute functions in a computer program and achieve high performance by exploiting the fine-grain [[parallel computing|parallelism]] between the two.<ref>{{cite journal |author-last=Smith |author-first1=J. E. |title=Decoupled access/execute computer architectures |journal= ACM Transactions on Computer Systems|date=1984 |volume=2 |issue=4 |pages=289β308 |citeseerx=10.1.1.127.4475 |doi=10.1145/357401.357403|s2cid=13903321 }}</ref> In doing so, it effectively hides all [[memory latency]] from the processor's perspective. A larger buffer can, in theory, increase throughput. However, if the processor has a [[branch misprediction]] then the entire buffer may need to be flushed, wasting a lot of [[clock cycle]]s and reducing the effectiveness. Furthermore, larger buffers create more heat and use more [[Die (integrated circuit)|die]] space. For this reason processor designers today favor a [[multi-threaded]] design approach. Decoupled architectures are generally thought of as not useful for general-purpose computing as they do not handle control-intensive code well.<ref>{{cite journal |author-last1=Kurian |author-first1=L. |author-last2=Hulina |author-first2=P. T. |author-last3=Coraor |author-first3=L. D. |title=Memory latency effects in decoupled architectures |journal=[[IEEE Transactions on Computers]] |volume=43 |issue=10 |date=1994 |pages=1129β1139 |doi=10.1109/12.324539 |s2cid=6913858 |url=https://pdfs.semanticscholar.org/6aa3/18cce633e3c2d86d970d6d50104d818d9407.pdf |archive-url=https://web.archive.org/web/20180612141055/https://pdfs.semanticscholar.org/6aa3/18cce633e3c2d86d970d6d50104d818d9407.pdf |url-status=dead |archive-date=2018-06-12 }}</ref> Control intensive code include such things as nested branches that occur frequently in [[operating system kernel]]s. Decoupled architectures play an important role in scheduling in [[very long instruction word]] (VLIW) architectures.<ref>{{cite journal |author-first1=M. N. |author-last1=Dorojevets |author-first2=V. |author-last2=Oklobdzija |title=Multithreaded decoupled architecture |journal=International Journal of High Speed Computing |volume=7 |issue=3 |pages=465β480 |date=1995 |doi=10.1142/S0129053395000257 |url=https://www.researchgate.net/publication/220171480}}</ref>
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)