Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
Cray-1
(section)
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
===Vector machines=== In the STAR, new instructions essentially wrote the loops for the user. The user told the machine where in memory the list of numbers was stored, then fed in a single instruction <code>a(1..1000000) = addv b(1..1000000), c(1..1000000)</code>. At first glance it appears the savings are limited; in this case the machine fetches and decodes only a single instruction instead of 1,000,000, thereby saving 1,000,000 fetches and decodes, perhaps one-fourth of the overall time. The real savings are not so obvious. Internally, the [[central processing unit|CPU]] of the computer is built up from a number of separate parts dedicated to a single task, for instance, adding a number, or fetching from memory. Normally, as the instruction flows through the machine, only one part is active at any given time. This means that each sequential step of the entire process must complete before a result can be saved. The addition of an [[instruction pipeline]] changes this. In such machines the CPU will "look ahead" and begin fetching succeeding instructions while the current instruction is still being processed. In this [[assembly line]] fashion any one instruction still requires as long to complete, but as soon as it finishes executing, the next instruction is right behind it, with most of the steps required for its execution already completed. [[Vector processor]]s use this technique with one additional trick. Because the data layout is in a known format β a set of numbers arranged sequentially in memory β the pipelines can be tuned to improve the performance of fetches. On the receipt of a vector instruction, special hardware sets up the memory access for the arrays and stuffs the data into the processor as fast as possible. CDC's approach in the STAR used what is today known as a ''memory-memory architecture''. This referred to the way the machine gathered data. It set up its pipeline to read from and write to memory directly. This allowed the STAR to use vectors of length not limited by the length of registers, making it highly flexible. Unfortunately, the pipeline had to be very long in order to allow it to have enough instructions in flight to make up for the slow memory. That meant the machine incurred a high cost when switching from processing vectors to performing operations on non-vector operands. Additionally, the low scalar performance of the machine meant that after the switch had taken place and the machine was running scalar instructions, the performance was quite poor{{citation needed|date=September 2012}}. The result was rather disappointing real-world performance, something that could, perhaps, have been forecast by [[Amdahl's law]].
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)