Editing Scalable Coherent Interface (section)

==The standard==
The standard defined two interface levels: 
*The physical level that deals with electrical signals, connectors, mechanical and thermal conditions 
*The logical level that describes the address space, data transfer protocols, cache coherence mechanisms, synchronization primitives, control and status registers, and initialization and error recovery facilities. 

This structure allowed new developments in physical interface technology to be easily adapted without any redesign on the logical level.

Scalability for large systems is achieved through a distributed [[directory-based cache coherence]] model. (The other popular models for cache coherency are based on system-wide eavesdropping (snooping) of memory transactions – a scheme which is not very scalable.) In SCI each node contains a directory with a pointer to the next node in a linked list that shares a particular cache line.

SCI defines a 64-bit flat address space (16 exabytes) where 16 bits are used for identifying a node (65,536 nodes) and 48 bits for address within the node (256 terabytes). A node can contain many processors and/or memory. The SCI standard defines a [[packet switched network]].

===Topologies===
SCI can be used to build systems with different types of switching topologies from centralized to fully distributed switching:

*With a central switch, each node is connected to the switch with a ringlet (in this case a two-node ring). 
*In distributed switching systems, each node can be connected to a ring of arbitrary length and either all or some of the nodes can be connected to two or more rings. 

The most common way to describe these multi-dimensional topologies is k-ary n-cubes (or tori). The SCI standard specification mentions several such topologies as examples.

The 2-D [[torus]] is a combination of rings in two dimensions. Switching between the two dimensions requires a small switching capability in the node. This can be expanded to three or more dimensions. The concept of folding rings can also be applied to the Torus topologies to avoid any long connection segments.

===Transactions===
SCI sends information in packets. Each packet consists of an unbroken sequence of 16-bit symbols. The symbol is accompanied by a flag bit. A transition of the flag bit from 0 to 1 indicates the start of a packet. A transition from 1 to 0 occurs 1 (for echoes) or 4 symbols before the packet end. A packet contains a header with address command and status information, payload (from 0 through optional lengths of data) and a CRC check symbol. The first symbol in the packet header contains the destination node address. If the address is not within the domain handled by the receiving node, the packet is passed to the output through the bypass FIFO. In the other case, the packet is fed to a receive queue and may be transferred to a ring in another dimension. All packets are marked when they pass the scrubber (a node is established as scrubber when the ring is initialized). Packets without a valid destination address will be removed when passing the scrubber for the second time to avoid filling the ring with packets that would otherwise circulate indefinitely.

===Cache coherence===
[[Cache coherence]] ensures data consistency in multiprocessor systems. The simplest form applied in earlier systems was based on clearing the cache contents between [[context switch]]es and disabling the cache for data that were shared between two or more processors. These methods were feasible when the performance difference between the cache and memory were less than one order of magnitude. Modern processors with caches that are more than two orders of magnitude faster than main memory would not perform anywhere near optimal without more sophisticated methods for data consistency. Bus based systems use eavesdropping ([[Bus sniffing|snooping]]) methods since buses are inherently broadcast. Modern systems with point-to point links use broadcast methods with snoop filter options to improve performance. Since broadcast and eavesdropping are inherently non-scalable, these are not used in SCI.

Instead, SCI uses a distributed directory-based cache coherence protocol with a [[linked list]] of nodes containing processors that share a particular cache line. Each node holds a directory for the main memory of the node with a tag for each line of memory (same line length as the cache line). The memory tag holds a pointer to the head of the linked list and a state code for the line (three states – home, fresh, gone). Associated with each node is also a cache for holding remote data with a directory containing forward and backward pointers to nodes in the linked list sharing the cache line. The tag for the cache has seven states (invalid, only fresh, head fresh, only dirty, head dirty, mid valid, tail valid).

The distributed directory is scalable. The overhead for the directory based cache coherence is a constant percentage of the node’s memory and cache. This percentage is in the order of 4% for the memory and 7% for the cache.