Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
Cray-3/SSS
(section)
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
==Design== The SSS project started after a [[Supercomputing Research Center]] (SRC) engineer, Ken Iobst, noticed a novel way to implement a parallel computer. Previous massively SIMD designs, like the [[Connection Machine]]s, consisted of a large number of individual processing elements consisting of a simple processor and some local memory. Results that needed to be passed from element to element were passed along networking links at relatively slow speeds. This was a serious bottleneck in most parallel designs, which limited their use to certain roles where these interdependencies could be reduced. Iobst's idea was to use the super-fast scatter/gather hardware from the Cray-3 to move the data around instead of using a separate network. This would offer at least an order of magnitudes better performance than systems based on "commodity" hardware. Better yet, the machine would still include a complete Cray-3 CPU, allowing the machine as a whole to use either SIMD or vector instructions depending on the particulars of the problem. Now all that remained was the selection of a processor. Since the Cray-3 already had a [[vector processor]] for heavy computing, the SIMD processors themselves could be considerably simpler, handling only the most basic instructions. This is where the SSS concept was truly unique; since the problem with most SIMD machines was moving data around, Iobst suggested that the processors be built into the [[Static random access memory|SRAM]] chips themselves. Memory is normally organized within the RAM chips in a row/column format, with a controller on the chip reading requested data from the chip in parallel across the rows, then assembling the results into 32- or [[64-bit]] words for processing by the [[Central processing unit|CPU]]. In the SSS concept, the chips would also be equipped with a series of single-bit computers operating on a particular column of all the rows are at once—this meant that the processors could access data at very high speeds, about 100x as fast as normal. Add to this the speed of the "network" implemented by the scatter/gather hardware, and the system could be scaled to sizes considerably greater than existing SIMD systems. Each processor could accept two commands every 200 nanoseconds, for an effective cycle rate of 100 ns (10 MHz). A fully equipped system with 1,024,000 processors would have an aggregate processing capability of 32 TFlops.<ref>Ken Iobst et al, [http://portal.acm.org/citation.cfm?id=620191 "Processing in Memory: The Terasys Massively Parallel PIM Array"], ''Computer'', Volume 28 Issue 4 (April 1995)</ref>
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)