Cell (processor)
==Overview==
{{more citations needed|section|date=May 2025}}
{{copy edit|section|date=May 2025}}

The '''Cell Broadband Engine''', or ''Cell'' as it is more commonly known, is a microprocessor intended as a hybrid of conventional desktop processors (such as the [[Athlon 64]] and [[Core 2]] families) and more specialized high-performance processors, such as the [[NVIDIA]] and [[ATI (brand)|ATI]] graphics processors ([[Graphics processing unit|GPU]]s). The longer name indicates its intended use, namely as a component in current and future [[online distribution]] systems; as such, it may be utilized in high-definition displays and recording equipment, as well as [[high-definition television|HDTV]] systems. Additionally, the processor may be suited to [[digital imaging]] systems (medical, scientific, ''etc.'') and [[physical simulation]] (''e.g.'', scientific and [[structural engineering]] modeling). As used in the PlayStation 3, it has 250 million transistors.<ref>{{Cite web |date=July 13, 2006 |title=A Glimpse Inside The Cell Processor |url=https://www.gamedeveloper.com/programming/a-glimpse-inside-the-cell-processor |access-date=June 19, 2019 |website=[[Gamasutra]]}}</ref>

In a simple analysis, the Cell processor can be split into four components: external input and output structures, the main processor called the ''Power Processing Element'' (PPE) (a two-way [[Simultaneous multithreading|simultaneous-multithreaded]] [[PowerPC 2.02]] core),<ref>{{Cite book |last=Koranne |first=Sandeep |url=https://link.springer.com/chapter/10.1007/978-1-4419-0308-2_2 |title=Practical Computing on the Cell Broadband Engine |date=July 15, 2009 |publisher=[[Springer Science+Business Media]] |isbn=978-1-4419-0307-5 |page=17 |chapter=Chapter 2 - The Power Processing Element (PPE) |doi=10.1007/978-1-4419-0308-2_2 |chapter-url=https://link.springer.com/chapter/10.1007/978-1-4419-0308-2_2}}</ref> eight fully functional co-processors called the ''Synergistic Processing Elements'', or SPEs, and
a specialized high-bandwidth [[circular data bus]] connecting the PPE, input/output elements and the SPEs, called the ''Element Interconnect Bus'' or EIB.

To achieve the high performance needed for mathematically intensive tasks, such as decoding/encoding [[MPEG]] streams, generating or transforming three-dimensional data, or undertaking [[Fourier analysis]] of data, the Cell processor marries the SPEs and the PPE via the EIB to give access, via fully [[Direct memory access#Cache coherency|cache coherent]] [[Direct memory access|DMA (direct memory access)]], to both main memory and to other external data storage. To make the best use of the EIB, and to overlap computation and data transfer, each of the nine processing elements (PPE and SPEs) is equipped with a [[Direct memory access#DMA engine|DMA engine]]. Since an SPE's load/store instructions can only access its own local [[scratchpad memory]], each SPE depends entirely on DMA to transfer data to and from the main memory and other SPEs' local memories. A DMA operation can transfer either a single block of up to 16 KB, or a list of 2 to 2048 such blocks. One of the major design decisions in the architecture of Cell is the use of DMA as a central means of intra-chip data transfer, with a view to enabling maximal asynchrony and concurrency in data processing inside a chip.<ref name="geschwindpaper">{{Cite conference |last=Gschwind |first=Michael |year=2006 |title=Chip multiprocessing and the cell broadband engine |url=http://portal.acm.org/citation.cfm?id=1128023 |publisher=ACM |pages=1–8 |doi=10.1145/1128022.1128023 |isbn=1595933026 |access-date=June 29, 2008 |book-title=Proceedings of the 3rd conference on Computing frontiers - CF '06 |s2cid=14226551}}</ref>

The PPE, which is capable of running a conventional operating system, has control over the SPEs and can start, stop, interrupt, and schedule processes running on the SPEs. To this end, the PPE has additional instructions relating to the control of the SPEs.
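The DMA size limits quoted above (a single block of up to 16 KB, or a list of 2 to 2048 such blocks) can be illustrated with a small planning routine. The following is only a sketch of the arithmetic, not Cell SDK code: the names <code>split_into_blocks</code> and <code>plan_dma</code> are hypothetical, and the sketch ignores the alignment and minimum-size rules that real hardware would impose.

```python
from typing import List, Tuple

# Limits taken from the text above.
MAX_BLOCK_BYTES = 16 * 1024   # largest single DMA block
MAX_LIST_BLOCKS = 2048        # largest number of blocks in one DMA list

Block = Tuple[int, int]       # (byte offset, size in bytes)


def split_into_blocks(total_bytes: int) -> List[Block]:
    """Cut a transfer of total_bytes into consecutive blocks of at most 16 KB."""
    if total_bytes < 0:
        raise ValueError("total_bytes must be non-negative")
    blocks: List[Block] = []
    offset = 0
    while offset < total_bytes:
        size = min(MAX_BLOCK_BYTES, total_bytes - offset)
        blocks.append((offset, size))
        offset += size
    return blocks


def plan_dma(total_bytes: int) -> List[List[Block]]:
    """Group the blocks into DMA operations.

    Each returned operation holds 1 block (a single-block transfer)
    or 2 to 2048 blocks (a list transfer).
    """
    blocks = split_into_blocks(total_bytes)
    return [
        blocks[i:i + MAX_LIST_BLOCKS]
        for i in range(0, len(blocks), MAX_LIST_BLOCKS)
    ]


if __name__ == "__main__":
    # A 40 KB buffer fits in one list transfer of three blocks.
    for op in plan_dma(40 * 1024):
        print(len(op), "block(s):", op)
```

Under these limits a single list operation can describe at most 2048 × 16 KB = 32 MB, so larger transfers need more than one operation.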
Unlike the SPEs, the PPE can read and write the main memory and the local memories of SPEs through the standard load/store instructions. The SPEs are not fully autonomous and require the PPE to prime them before they can do any useful work. As most of the "horsepower" of the system comes from the synergistic processing elements, the use of [[Direct memory access|DMA]] as a method of data transfer and the limited local [[memory footprint]] of each SPE pose a major challenge to software developers who wish to make the most of this horsepower, demanding careful hand-tuning of programs to extract maximal performance from this CPU.

The PPE and bus architecture includes various modes of operation giving different levels of [[memory protection]], allowing areas of memory to be protected from access by specific processes running on the SPEs or the PPE.

Both the PPE and SPE are [[RISC]] architectures with a fixed-width 32-bit instruction format. The PPE contains a 64-bit [[general-purpose register]] set (GPR), a 64-bit floating-point register set (FPR), and a 128-bit [[Altivec]] register set. The SPE contains 128-bit registers only. These can be used for scalar data types ranging from 8 bits to 64 bits in size, or for [[SIMD]] computations on various integer and floating-point formats. System memory addresses for both the PPE and SPE are expressed as 64-bit values. Local store addresses internal to the SPU (Synergistic Processor Unit) processor are expressed as a 32-bit word. In documentation relating to Cell, a word is always taken to mean 32 bits, a doubleword means 64 bits, and a quadword means 128 bits. <!-- Far from perfect but trending toward accuracy. Could not find either the virtual address range limit or the physical address range limit.
Note that a "system address" on the SPU is an address passed to the SPU DMA controller; the LS has only 2^14 addressable locations (256K/16B) ~~~~ -->

===PowerXCell 8i===
In 2008, IBM announced a revised variant of the Cell called the '''PowerXCell 8i''',<ref name="cbe-programming-handbok">{{Cite book |url=http://www.iman1.jo/iman1/images/IMAN1-User-Site-Files/Programming/CellBE_PXCell_Handbook_v1.11_12May08_pub.pdf |title=Cell Broadband Engine Programming Handbook Including the PowerXCell 8i Processor |date=May 12, 2008 |publisher=[[IBM]] |series=Version 1.11 |access-date=March 10, 2018 |archive-url=https://web.archive.org/web/20180311081221/http://www.iman1.jo/iman1/images/IMAN1-User-Site-Files/Programming/CellBE_PXCell_Handbook_v1.11_12May08_pub.pdf |archive-date=March 11, 2018 |url-status=dead}}</ref> which is available in QS22 [[BladeCenter|Blade Servers]] from IBM. The PowerXCell is manufactured on a [[65 nm]] process and adds support for up to 32 GB of slotted DDR2 memory. It also dramatically improves [[double-precision floating-point]] performance on the SPEs, from a peak of about 12.8 [[GFLOPS]] to 102.4 GFLOPS total for eight SPEs. Coincidentally, this is the same peak performance as the [[NEC SX-9]] vector processor released around the same time.
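The 102.4 GFLOPS figure can be reproduced with a short calculation. The sketch below is illustrative only and relies on two assumptions that are not stated in the text above: a 3.2 GHz clock, and four double-precision operations per cycle per SPE (a two-wide fused multiply-add). The function name <code>peak_gflops</code> is hypothetical.

```python
import math


def peak_gflops(num_spes: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak = units x clock (GHz) x operations per cycle."""
    return num_spes * clock_ghz * flops_per_cycle


# Assumed values (not from the article text): 3.2 GHz clock and
# 4 double-precision operations per cycle per SPE.
ASSUMED_CLOCK_GHZ = 3.2
ASSUMED_DP_FLOPS_PER_CYCLE = 4

powerxcell_peak = peak_gflops(8, ASSUMED_CLOCK_GHZ, ASSUMED_DP_FLOPS_PER_CYCLE)

# Ratio of the new peak to the earlier peak of about 12.8 GFLOPS quoted above.
improvement = powerxcell_peak / 12.8

if __name__ == "__main__":
    print(f"peak: {powerxcell_peak:.1f} GFLOPS, improvement: {improvement:.1f}x")
```

Under these assumptions the eight SPEs together reach the quoted total, roughly eight times the earlier double-precision peak.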
The [[Roadrunner (supercomputer)|IBM Roadrunner]] supercomputer, the world's fastest during 2008–2009, consisted of 12,240 PowerXCell 8i processors, along with 6,562 [[Opteron|AMD Opteron]] processors.<ref name="beyond3dpowerxcell">{{Cite web |date=May 2008 |title=IBM announces PowerXCell 8i, QS22 blade server |url=http://www.beyond3d.com/content/news/640 |url-status=dead |archive-url=https://web.archive.org/web/20080616190441/http://www.beyond3d.com/content/news/640 |archive-date=June 16, 2008 |access-date=June 10, 2008 |publisher=Beyond3D |df=mdy-all}}</ref> PowerXCell 8i-powered supercomputers also occupied all of the top six "greenest" positions in the Green500 list, having the highest MFLOPS/watt ratios of any supercomputers in the world.<ref name="The Green 500 list, Nov 2009 ">{{Cite web |title=The Green500 List - November 2009 |url=http://www.green500.org/lists/2009/11/top/list.php |url-status=dead |archive-url=http://archive.wikiwix.com/cache/20110223210120/http://www.green500.org/lists/2009/11/top/list.php |archive-date=February 23, 2011 |df=mdy-all}}</ref>

Besides the QS22 and supercomputers, the PowerXCell processor is also available as an accelerator on a PCI Express card and is used as the core processor in the [[QPACE]] project.

Since the PowerXCell 8i removed the Rambus memory interface and added significantly larger DDR2 interfaces and enhanced SPEs, the chip layout had to be reworked, which resulted in both a larger die and larger packaging.<ref>{{Cite web |title=Packaging the Cell Broadband Engine Microprocessor for Supercomputer Applications |url=http://ecadigitallibrary.com/pdf/58thECTC/s31p4p67.pdf |url-status=dead |archive-url=https://web.archive.org/web/20140104204617/http://ecadigitallibrary.com/pdf/58thECTC/s31p4p67.pdf |archive-date=January 4, 2014 |access-date=January 4, 2014}}</ref>