Barrel processor
{{short description|CPU that switches between threads of execution on every cycle}}
A '''barrel processor''' is a [[Central processing unit|CPU]] that switches between [[Thread (computer science)|threads]] of execution on every [[Instruction cycle|cycle]]. This [[CPU design]] technique is also known as "interleaved" or "fine-grained" [[temporal multithreading]]. Unlike [[simultaneous multithreading]] in modern [[superscalar]] architectures, it generally does not allow execution of multiple instructions in one cycle.

As in [[preemptive multitasking]], each thread of execution is assigned its own [[program counter]] and other [[hardware register]]s (each thread's [[architectural state]]). A barrel processor can guarantee that each thread will execute one instruction every ''n'' cycles, where ''n'' is the number of threads. A [[preemptive multitasking]] machine, by contrast, typically runs one thread of execution for tens of millions of cycles while all other threads wait their turn.

A technique called [[C-slowing]] can automatically generate a corresponding barrel processor design from a single-tasking processor design. An ''n''-way barrel processor generated this way acts much like ''n'' separate [[multiprocessing]] copies of the original single-tasking processor, each one running at roughly 1/''n'' the original speed.{{Citation needed|reason=Couldn't find info on this. May be original research?|date=October 2019}}

==History==
One of the earliest examples of a barrel processor was the I/O processing system in the [[CDC 6000 series]] supercomputers.
These executed one [[Instruction (computer science)|instruction]] (or a portion of an instruction) from each of 10 different virtual processors (called peripheral processors or PPs) before returning to the first processor.<ref name="cyber">[http://www.textfiles.com/bitsavers/pdf/cdc/cyber/60456100A_cyberInstr_Mar79.pdf CDC Cyber 170 Computer Systems; Models 720, 730, 750, and 760; Model 176 (Level B); CPU Instruction Set; PPU Instruction Set] {{Webarchive|url=https://web.archive.org/web/20160303190612/http://www.textfiles.com/bitsavers/pdf/cdc/cyber/60456100A_cyberInstr_Mar79.pdf |date=2016-03-03 }} -- See page 2-44 for an illustration of the rotating "barrel".</ref> As the [[CDC 6000 series]] article describes them, "The peripheral processors are collectively implemented as a barrel processor. Each executes routines independently of the others. They are a loose predecessor of bus mastering or [[direct memory access]]."

One motivation for barrel processors was to reduce hardware costs. In the case of the CDC 6x00 PPUs, the digital logic of the processor was much faster than the core memory, so rather than having ten separate processors, the design had ten separate core memory units for the PPUs, all sharing a single set of processor logic.

Another example is the [[Honeywell 800]], which had 8 groups of registers, allowing up to 8 concurrent programs. After each instruction, the processor would (in most cases) switch to the next active program in sequence.<ref>{{cite book |title=Honeywell 800 Programmers' Reference Manual |date=1960 |url=http://bitsavers.org/pdf/honeywell/h800/H800_programmersRefMan.pdf|page=17}}</ref>

Barrel processors have also been used as large-scale central processors.
The [[Tera Computer Company|Tera]] [[Cray MTA|MTA]] (1988) was a large-scale barrel processor design with 128 threads per core.<ref name="tera_mta">{{cite web |url=http://cseweb.ucsd.edu/~carter/Papers/tera2.html |title=Archived copy |access-date=2012-08-11 |url-status=dead |archive-url=https://web.archive.org/web/20120222015429/http://cseweb.ucsd.edu/~carter/Papers/tera2.html |archive-date=2012-02-22 }}</ref><ref name="cray_mta">{{cite web |url=http://www.cray.com/About/History.aspx |title=Cray History |access-date=2014-08-19 |url-status=dead |archive-url=https://web.archive.org/web/20140712100729/http://www.cray.com/About/History.aspx |archive-date=2014-07-12 }}</ref> The MTA architecture has seen continued development in successive products, such as the [[Cray Urika-GD]], originally introduced in 2012 (as the YarcData uRiKA) and targeted at data-mining applications.<ref name="urika">{{cite press release |author=<!--Staff writer(s); no by-line.--> |title=Cray's YarcData division launches new big data graph appliance |url=http://investors.cray.com/phoenix.zhtml?c=98390&p=irol-newsArticle&ID=1667153 |location=Seattle, WA and Santa Clara, CA |publisher=Cray Inc. |date=February 29, 2012 |access-date=2017-08-24 |archive-date=2017-03-18 |archive-url=https://web.archive.org/web/20170318115647/http://investors.cray.com/phoenix.zhtml?c=98390&p=irol-newsArticle&ID=1667153 |url-status=dead }}</ref>

Barrel processors are also found in embedded systems, where they are particularly useful for their deterministic [[real-time computing|real-time]] thread performance. An early example is the "Dual CPU" version of the [[4-bit computing|four-bit]] [[COP400]], introduced by [[National Semiconductor]] in 1981. This single-chip [[microcontroller]] contains two ostensibly independent CPUs that share instructions, memory, and most I/O devices. In reality, the dual CPUs are a single two-thread barrel processor.
It works by duplicating the sections of the processor that store the [[architectural state]], but not the main execution resources such as the [[Arithmetic logic unit|ALU]], buses, and memory. Separate architectural states are established with duplicated A (accumulators), B (pointer registers), C (carry flags), N (stack pointers), and PC (program counters).<ref name=":1">{{cite web |title=COPS Microcontrollers Data Book |url=https://usermanual.wiki/Document/1982NationalCOPSMicrocontrollers.1220890920/help |publisher=National Semiconductor |access-date=19 January 2022}}</ref>

Another example is the [[XMOS]] [[XCore XS1]] (2007), a four-stage barrel processor with eight threads per core. (Newer processors from [[XMOS]] have the same type of architecture.) The XS1 is found in Ethernet, USB, audio, and control devices, and in other applications where I/O performance is critical. When the XS1 is programmed in the 'XC' language, software-controlled [[direct memory access]] may be implemented.

Barrel processors have also been used in specialized devices such as the eight-thread [[Ubicom]] IP3023 network I/O processor (2004). Some 8-bit [[microcontroller]]s by [[Padauk Technology]] feature barrel processors with up to 8 threads per core.

==Comparison with single-threaded processors==

===Advantages===
A single-tasking processor sits idle, doing nothing useful, whenever a [[cache miss]] or [[pipeline stall]] occurs. Advantages of barrel processors over single-tasking processors include:
* The ability to do useful work on the other threads while the stalled thread is waiting.
* Designing an ''n''-way barrel processor with an ''n''-deep [[Instruction pipeline|pipeline]] is much simpler than designing a single-tasking processor. Each thread has at most one instruction in the pipeline at a time, so a barrel processor never has a [[pipeline stall]] and does not need [[Hazard (computer architecture)#Register forwarding|feed-forward]] circuits.
* For [[real-time computing|real-time]] applications, a barrel processor can guarantee that a "real-time" thread executes with precise timing, no matter what happens to the other threads, even if some other thread [[Deadlock (computer science)|locks up]] in an [[infinite loop]] or is [[interrupt storm|continuously interrupted]] by [[hardware interrupt]]s.

===Disadvantages===
Barrel processors also have a few disadvantages:
* The state of each thread must be kept on-chip, typically in registers, to avoid costly off-chip context switches. This requires a large number of registers compared to typical processors.
* Either all threads must share the same [[CPU cache|cache]], which slows overall system performance, or there must be one unit of cache for each execution thread, which can significantly increase the [[transistor count]] and thus the cost of such a CPU. However, in [[Real-time computing#Criteria for real-time computing|hard real-time]] [[embedded system]]s where barrel processors are often found, memory access costs are typically calculated assuming worst-case cache behavior, so this is a minor concern.{{cn|reason="Minor concern" seems to be too all-encompassing unless this is genuinely the case for all barrel processors|date=January 2020}} Some barrel processors such as the [[XMOS]] XS1 do not have a cache at all.
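
The timing guarantee described under Advantages (one instruction per thread every ''n'' cycles, regardless of what the other threads do) can be illustrated with a small simulation. This is a minimal sketch under simplifying assumptions, not a model of any real machine: the names <code>ThreadState</code>, <code>step</code>, and <code>run_barrel</code> and the two-opcode instruction set are hypothetical, and strict rotation is assumed (some designs, such as the Honeywell 800, instead switch to the next active program).

```python
from dataclasses import dataclass, field


@dataclass
class ThreadState:
    """Hypothetical per-thread architectural state: program counter and accumulator."""
    program: list                      # list of (opcode, operand) tuples
    pc: int = 0
    acc: int = 0
    issue_cycles: list = field(default_factory=list)  # cycles on which this thread issued


def step(thread):
    """Execute exactly one instruction of one thread (toy instruction set, an assumption)."""
    op, arg = thread.program[thread.pc]
    if op == "add":
        thread.acc += arg
        thread.pc += 1
    elif op == "jmp":
        thread.pc = arg
    else:
        raise ValueError(f"unknown opcode: {op}")


def run_barrel(threads, cycles):
    """Issue one instruction per cycle, rotating strictly through the threads."""
    n = len(threads)
    for cycle in range(cycles):
        thread = threads[cycle % n]    # the slot depends only on the cycle number
        thread.issue_cycles.append(cycle)
        step(thread)


threads = [
    ThreadState(program=[("add", 1), ("jmp", 0)]),  # counts up by 1 per loop
    ThreadState(program=[("jmp", 0)]),              # "locked up" in an infinite loop
    ThreadState(program=[("add", 2), ("jmp", 0)]),  # counts up by 2 per loop
]
run_barrel(threads, cycles=12)

for i, t in enumerate(threads):
    print(i, t.issue_cycles, t.acc)
```

In this sketch the thread stuck in an infinite loop consumes only its own slots, so the other two threads still issue an instruction exactly every three cycles.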

==See also==
* [[Super-threading]]
* [[Computer multitasking]]
* [[Simultaneous multithreading]] (SMT)
* [[Hyper-threading]]
* [[Vector processor]]
* [[Cray XMT]]

==References==
<references/>

==External links==
* [https://web.archive.org/web/20070905134756/http://www.embedded.com/story/OEG20030509S0043/ Soft peripherals] Embedded.com article examines Ubicom's IP3023 processor
* [http://www.cs.clemson.edu/~mark/g60.ps An Evaluation of the Design of the Gamma 60]
* [https://web.archive.org/web/20130821022746/http://www.feb-patrimoine.com/projet/gamma60/gamma_60.htm Histoire et architecture du Gamma 60] (French and English)

{{DEFAULTSORT:Barrel Processor}}
[[Category:Central processing unit]]
[[Category:Instruction processing]]
[[Category:Threads (computing)]]