== Operation ==
[[File:Diagrama MESI.GIF|thumb|Image 1.1 State diagram for the MESI protocol. Red: bus-initiated transactions. Black: processor-initiated transactions.<ref>{{Cite book|title=Parallel Computer Architecture|last=Culler|first=David|publisher=Morgan Kaufmann Publishers|year=1997|pages=Figure 5-15 State transition diagram for the Illinois MESI protocol. Pg 286}}</ref>]]
The MESI protocol is defined by a [[finite-state machine]] that transitions from one state to another based on two stimuli. The first stimulus is a processor-specific read or write request: for example, processor P1 holds block X in its cache, and the processor requests to read or write that block. The second stimulus arrives over the bus connecting the processors. In particular, these "bus side requests" come from other processors that do not hold the cache block, or do not hold its updated data, in their own caches. The bus requests are monitored with the help of [[Bus snooping|snoopers]],<ref>{{Cite web|url=http://hps.ece.utexas.edu/people/suleman/class_projects/pca_report.pdf|title=An evaluation of Snoopy Based Cache Coherence protocols|last=Bigelow, Narasiman, Suleman|publisher=ECE Department, University of Texas at Austin}}</ref> which monitor all bus transactions.

The different types of processor requests and bus side requests are as follows.

Processor requests to the cache include the following operations:
# PrRd: The processor requests to '''read''' a cache block.
# PrWr: The processor requests to '''write''' a cache block.

Bus side requests are the following:
# BusRd: Snooped request indicating a '''read''' request to a cache block by another processor.
# BusRdX: Snooped request indicating a '''write''' request to a cache block by another processor that '''does not already have the block'''.
# BusUpgr: Snooped request indicating a write request to a cache block by another processor that '''already has that cache block resident in its own cache'''.
# Flush: Snooped request indicating that an entire cache block is being written back to main memory by another processor.
# FlushOpt: Snooped request indicating that an entire cache block is posted on the bus in order to supply it to another processor (cache-to-cache transfer). (''Such cache-to-cache transfers can reduce the read-miss [[CAS latency|latency]] if bringing the block from main memory takes longer than a cache-to-cache transfer, which is generally the case in bus-based systems.'')

'''Snooping operation''': In a snooping system, all caches on the bus monitor all transactions on that bus. Every cache keeps the sharing status of every block of physical memory it has stored, and the state of a block is changed according to the state diagram of the protocol used (refer to the MESI state diagram above). The bus has snoopers on both sides:
# A snooper on the processor/cache side.
# On the memory side, the snooping function is performed by the memory controller.

'''Explanation''': Each cache block has its own four-state [[finite-state machine]] (refer to image 1.1).
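
The following minimal C sketch (hypothetical; all identifiers are illustrative and not taken from the cited sources) shows one way a cache controller might encode these states and react to the processor-side stimuli. The full transition rules are given in Tables 1.1 and 1.2 below.

<syntaxhighlight lang="c">
#include <stdbool.h>

/* The four per-line states and the two kinds of stimuli described above.
 * All names are illustrative only. */
typedef enum { INVALID, SHARED, EXCLUSIVE, MODIFIED } mesi_state_t;
typedef enum { PR_RD, PR_WR } processor_request_t;                          /* processor-side stimuli */
typedef enum { BUS_RD, BUS_RDX, BUS_UPGR, FLUSH, FLUSH_OPT } bus_request_t; /* bus-side stimuli */

typedef struct {
    mesi_state_t state;
    /* tag, data, etc. omitted */
} cache_line_t;

/* Processor-side transition, summarising Table 1.1 below. */
void on_processor_request(cache_line_t *line, processor_request_t req,
                          bool another_cache_has_copy)
{
    switch (line->state) {
    case INVALID:
        if (req == PR_RD)   /* issue BusRd; Shared if another cache holds the block, else Exclusive */
            line->state = another_cache_has_copy ? SHARED : EXCLUSIVE;
        else                /* PrWr: issue BusRdX, fetch the block, then write */
            line->state = MODIFIED;
        break;
    case SHARED:
        if (req == PR_WR)   /* issue BusUpgr so the other copies are invalidated */
            line->state = MODIFIED;
        break;              /* PrRd is a hit: no bus transaction */
    case EXCLUSIVE:
        if (req == PR_WR)   /* silent upgrade: no bus transaction required */
            line->state = MODIFIED;
        break;
    case MODIFIED:
        break;              /* read or write hit: state unchanged */
    }
}
</syntaxhighlight>
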
The state transitions and the responses in a particular state with respect to different inputs are shown in Table 1.1 and Table 1.2.

{| class="wikitable"
|+Table 1.1 State transitions and responses to various processor operations
!Initial State
!Operation
!Response
|-
|Invalid (I)
|PrRd
|
* Issue BusRd to the bus.
* Other caches see the BusRd, check whether they have a valid copy, and inform the sending cache.
* State transition to (S) '''Shared''' if other caches have a valid copy.
* State transition to (E) '''Exclusive''' if none do (must ensure all others have reported).
* If other caches have a copy, one of them supplies the value; otherwise, fetch from main memory.
|-
|
|PrWr
|
* Issue BusRdX signal on the bus.
* State transition to (M) '''Modified''' in the requestor cache.
* If other caches have a copy, one of them supplies the value; otherwise, fetch from main memory.
* If other caches have a copy, they see the BusRdX signal and invalidate their copies.
* The write into the cache block modifies the value.
|-
|Exclusive (E)
|PrRd
|
* No bus transaction generated.
* State remains the same.
* Read to the block is a cache hit.
|-
|
|PrWr
|
* No bus transaction generated.
* State transition from Exclusive to (M) '''Modified'''.
* Write to the block is a cache hit.
|-
|Shared (S)
|PrRd
|
* No bus transaction generated.
* State remains the same.
* Read to the block is a cache hit.
|-
|
|PrWr
|
* Issue BusUpgr signal on the bus.
* State transition to (M) '''Modified'''.
* Other caches see the BusUpgr and mark their copies of the block as (I) Invalid.
|-
|Modified (M)
|PrRd
|
* No bus transaction generated.
* State remains the same.
* Read to the block is a cache hit.
|-
|
|PrWr
|
* No bus transaction generated.
* State remains the same.
* Write to the block is a cache hit.
|}

{| class="wikitable"
|+Table 1.2 State transitions and responses to various bus operations
!Initial State
!Operation
!Response
|-
|Invalid (I)
|BusRd
|
* No state change. Signal ignored.
|-
|
|BusRdX/BusUpgr
|
* No state change. Signal ignored.
|-
|Exclusive (E)
|BusRd
|
* Transition to '''Shared''' (since it implies a read taking place in another cache).
* Put FlushOpt on the bus together with the contents of the block.
|-
|
|BusRdX
|
* Transition to '''Invalid'''.
* Put FlushOpt on the bus together with the data from the now-invalidated block.
|-
|Shared (S)
|BusRd
|
* No state change (another cache performed a read on this block, so it remains Shared).
* May put FlushOpt on the bus together with the contents of the block (design choice: which cache in the Shared state does this).
|-
|
|BusRdX/BusUpgr
|
* Transition to '''Invalid''' (the cache that sent the BusRdX/BusUpgr becomes Modified).
* May put FlushOpt on the bus together with the contents of the block (design choice: which cache in the Shared state does this).
|-
|Modified (M)
|BusRd
|
* Transition to (S) '''Shared'''.
* Put FlushOpt on the bus with the data. Received by the sender of the BusRd and by the memory controller, which writes it to main memory.
|-
|
|BusRdX
|
* Transition to (I) '''Invalid'''.
* Put FlushOpt on the bus with the data. Received by the sender of the BusRdX and by the memory controller, which writes it to main memory.
|}

A write may only be performed freely if the cache line is in the Modified or Exclusive state. If it is in the Shared state, all other cached copies must be invalidated first. This is typically done by a broadcast operation known as ''Request For Ownership (RFO)''.
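
Continuing the hypothetical sketch above, the bus-side (snooped) transitions of Table 1.2 might be expressed as follows; <code>flush_opt()</code> is an illustrative stub standing for supplying the block's contents on the bus.

<syntaxhighlight lang="c">
/* Bus-side (snooper) transition for the same hypothetical cache-line model,
 * summarising Table 1.2 above. */
static void flush_opt(cache_line_t *line) { (void)line; /* put the block's contents on the bus */ }

void on_bus_request(cache_line_t *line, bus_request_t req)
{
    switch (line->state) {
    case INVALID:
        break;                                   /* no copy held: ignore the signal */
    case EXCLUSIVE:
        if (req == BUS_RD) {                     /* another processor is reading */
            flush_opt(line);
            line->state = SHARED;
        } else if (req == BUS_RDX) {             /* another processor wants to write */
            flush_opt(line);
            line->state = INVALID;
        }
        break;
    case SHARED:
        if (req == BUS_RDX || req == BUS_UPGR)   /* another cache takes ownership */
            line->state = INVALID;
        break;                                   /* BusRd: may supply the block, stays Shared */
    case MODIFIED:
        flush_opt(line);                         /* dirty data goes on the bus; memory is updated */
        line->state = (req == BUS_RD) ? SHARED : INVALID;
        break;
    }
}
</syntaxhighlight>
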
A cache that holds a line in the Modified state must ''[[Bus snooping|snoop]]'' (intercept) all attempted reads (from all the other caches in the system) of the corresponding main memory location and insert the data that it holds. This can be done by forcing the read to ''back off'' (i.e. retry later), then writing the data to main memory and changing the cache line to the Shared state. It can also be done by sending the data from the Modified cache directly to the cache performing the read. Note that snooping is only required for read misses: the protocol ensures that a line cannot be Modified if any other cache can perform a read hit on it.

A cache that holds a line in the Shared state must listen for invalidate or request-for-ownership broadcasts from other caches, and discard the line (by moving it into the Invalid state) on a match.

The Modified and Exclusive states are always precise: i.e. they match the true cache-line ownership situation in the system. The Shared state may be imprecise: if another cache discards a Shared line, this cache may become the sole owner of that cache line, but it will not be promoted to the Exclusive state. Other caches do not broadcast notices when they discard cache lines, and this cache could not use such notifications without maintaining a count of the number of shared copies.

In that sense the Exclusive state is an opportunistic optimization: if the CPU wants to modify a cache line in state S, a bus transaction is necessary to invalidate all other cached copies. State E enables modifying a cache line with no bus transaction.

'''<big>Illustration of MESI protocol operations</big>'''

For example, assume the following stream of read/write references. All references are to the same location, and the digit refers to the processor issuing the reference. The stream is: R1, W1, R3, W3, R1, R3, R2. Initially it is assumed that all the caches are empty.

{| class="wikitable"
|+Table 1.3 An example of how MESI works. All operations are to the same cache block (example: "R3" means a read of the block by processor 3).
!Step
!Local Request
!P1
!P2
!P3
!Generated Bus Request
!Data Supplier
|-
!0
|Initially || - || - || - || - || -
|-
!1
|R1 || E || - || - || '''BusRd''' || '''Mem'''
|-
!2
|W1 || M || - || - || - || -
|-
!3
|R3 || S || - || S || '''BusRd''' || '''P1's cache'''
|-
!4
|W3 || I || - || M || '''BusUpgr''' || -
|-
!5
|R1 || S || - || S || BusRd || P3's cache
|-
!6
|R3 || S || - || S || - || -
|-
!7
|R2 || S || S || S || BusRd || P1/P3's cache
|}

'''Note:''' ''The term snooping referred to below is a protocol for maintaining cache coherency in symmetric multiprocessing environments. All the caches on the bus monitor (snoop) the bus to determine whether they have a copy of the block of data that is requested on the bus.''

* Step 1: Since the cache is initially empty, main memory provides P1 with the block, and the block enters the Exclusive state.
* Step 2: Since the block is already present in the cache and in the Exclusive state, the processor modifies it directly without any bus transaction. The block is now in the Modified state.
* Step 3: In this step, a BusRd is posted on the bus and the snooper on P1 senses this. It then flushes the data and changes its state to Shared. The block on P3 also takes the Shared state, as it has received data from another cache. The data is also written back to main memory.
* Step 4: Here a BusUpgr is posted on the bus; the snooper on P1 senses this and invalidates its block, as it is going to be modified by another cache. P3 then changes its block's state to Modified.
* Step 5: Since the current state on P1 is Invalid, it posts a BusRd on the bus. The snooper at P3 senses this and flushes the data out. The blocks on both P1 and P3 now move to the Shared state. Notice that this is also the point at which main memory is updated with the previously modified data.
* Step 6: There is a hit in the cache and it is in the Shared state, so no bus request is made.
* Step 7: There is a cache miss on P2, so a BusRd is posted. The snoopers on P1 and P3 sense this and both attempt a flush; whichever gets access to the bus first performs the operation.

=== Read For Ownership ===
A ''Read For Ownership (RFO)'' is an operation in [[cache coherency]] protocols that combines a read and an invalidate broadcast. It is issued by a processor trying to write into a cache line that is in the Shared (S) or Invalid (I) state of the MESI protocol. The operation causes all other caches to set the state of such a line to I. A read-for-ownership transaction is a read operation with intent to write to that memory address; the operation is therefore exclusive. It brings the data to the cache and invalidates all other processor caches that hold this memory line. This is termed "BusRdX" in the tables above.

=== Memory Barriers ===
MESI in its naive, straightforward implementation exhibits two particular performance issues. First, when writing to an invalid cache line, there is a long delay while the line is fetched from other CPUs. Second, moving cache lines to the Invalid state is time-consuming. To mitigate these delays, CPUs implement store buffers and invalidate queues.<ref>{{Cite book|title=The Cache Memory Book|last=Handy|first=Jim|publisher=Morgan Kaufmann|year=1998|isbn=9780123229809}}</ref>

==== Store Buffer ====
A store buffer is used when writing to an invalid cache line. As the write will proceed anyway, the CPU issues a read-invalid message (hence the cache line in question and all other CPUs' copies of that memory address are invalidated) and then pushes the write into the store buffer, to be executed when the cache line finally arrives in the cache.

A direct consequence of the store buffer's existence is that when a CPU commits a write, that write is not immediately written to the cache. Therefore, whenever a CPU needs to read a cache line, it first scans its own store buffer for that line, as there is a possibility that the same line was written by the same CPU before but has not yet been written to the cache (the preceding write is still waiting in the store buffer). Note that while a CPU can read its own previous writes in its store buffer, other CPUs ''cannot see those writes'' until they are flushed to the cache: a CPU cannot scan the store buffers of other CPUs.

==== Invalidate Queues ====
With regard to invalidation messages, CPUs implement invalidate queues, whereby incoming invalidate requests are acknowledged immediately but not acted upon at once. Instead, invalidation messages simply enter an invalidation queue and are processed as soon as possible (but not necessarily instantly). Consequently, a CPU can be oblivious to the fact that a cache line in its cache is actually invalid, as the invalidation queue contains invalidations that have been received but have not yet been applied. Note that, unlike the store buffer, the CPU cannot scan the invalidation queue, as the CPU and the invalidation queue are physically located on opposite sides of the cache.

As a result, memory barriers are required.
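
A rough, hypothetical sketch of these two per-CPU structures (all names, fields and sizes are illustrative only) shows why a read can return data that other CPUs cannot yet see, or be served from a line whose invalidation has been acknowledged but not yet applied:

<syntaxhighlight lang="c">
#include <stdint.h>
#include <stdbool.h>

#define SB_SLOTS 8

struct sb_entry { uintptr_t addr; uint64_t value; bool valid; };

struct cpu_private {
    struct sb_entry sb[SB_SLOTS];   /* store buffer: pending writes, invisible to other CPUs */
    uintptr_t inv_queue[SB_SLOTS];  /* invalidate queue: acknowledged but not yet applied    */
    int       inv_count;
};

/* Read path: the CPU forwards from its own store buffer before consulting the
 * cache, but it never consults its invalidate queue, so the cache may still
 * serve a line that another CPU has already asked it to invalidate. */
bool read_from_store_buffer(const struct cpu_private *cpu, uintptr_t addr, uint64_t *out)
{
    for (int i = SB_SLOTS - 1; i >= 0; i--) {      /* newest matching entry wins */
        if (cpu->sb[i].valid && cpu->sb[i].addr == addr) {
            *out = cpu->sb[i].value;               /* store-to-load forwarding */
            return true;
        }
    }
    return false;                                  /* fall through to the cache lookup */
}
</syntaxhighlight>
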
A ''store barrier'' flushes the store buffer, ensuring that all writes have been applied to that CPU's cache. A ''read barrier'' flushes the invalidation queue, thus ensuring that all writes by other CPUs become visible to the flushing CPU. Furthermore, memory management units do not scan the store buffer, causing similar problems. This effect is visible even in single-threaded processors.<ref>{{Cite book | doi = 10.1007/978-3-319-12154-3_8| chapter = Store Buffer Reduction with MMUs| title = Verified Software: Theories, Tools and Experiments| volume = 8471| pages = 117| series = Lecture Notes in Computer Science| year = 2014| last1 = Chen | first1 = G. | last2 = Cohen | first2 = E. | last3 = Kovalev | first3 = M. | isbn = 978-3-319-12153-6}}</ref>
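
As an illustration (a hypothetical sketch, not drawn from the cited sources), this pairing corresponds to the release and acquire fences of C11 in the classic message-passing pattern. C11 fences are stronger than the minimal store and read barriers described above, but the pairing is the same:

<syntaxhighlight lang="c">
#include <stdatomic.h>
#include <assert.h>

int payload;                     /* ordinary data written before the flag is raised */
atomic_int flag;

void producer(void)              /* runs on one CPU */
{
    payload = 42;
    atomic_thread_fence(memory_order_release);     /* "store barrier": earlier writes drain first */
    atomic_store_explicit(&flag, 1, memory_order_relaxed);
}

void consumer(void)              /* runs on another CPU */
{
    while (atomic_load_explicit(&flag, memory_order_relaxed) == 0)
        ;                                          /* spin until the flag becomes visible */
    atomic_thread_fence(memory_order_acquire);     /* "read barrier": apply pending invalidations */
    assert(payload == 42);                         /* holds only because both fences are present  */
}
</syntaxhighlight>

Without the release fence, the store to <code>payload</code> may still be sitting in the producer's store buffer when <code>flag</code> becomes visible; without the acquire fence, the consumer may read a stale <code>payload</code> from a cache line whose invalidation is still queued.
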