==Operation==
==={{Anchor|cache lines|CACHE-LINES}}Cache entries===
Data is transferred between memory and cache in blocks of fixed size, called ''cache lines'' or ''cache blocks''. When a cache line is copied from memory into the cache, a cache entry is created. The cache entry includes the copied data together with a ''tag'' that identifies the memory location the data came from.

When the processor needs to read or write a location in memory, it first checks for a corresponding entry in the cache. The cache looks for the requested memory location in every cache line that might contain that address. If the location is found in the cache, a '''cache hit''' has occurred; if it is not found, a '''cache miss''' has occurred. On a cache hit, the processor immediately reads or writes the data in the cache line. On a cache miss, the cache allocates a new entry and copies the data from main memory, and the request is then fulfilled from the contents of the cache.
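
The hit and miss handling described above can be sketched as a small simulation. This is only an illustrative model: the direct-mapped organization (each block maps to exactly one entry), the 64-byte line size, the four-entry capacity, and the names <code>DirectMappedCache</code> and <code>read</code> are assumptions made for the example and do not describe any particular processor.

```python
LINE_SIZE = 64   # bytes per cache line (assumed value)
NUM_LINES = 4    # number of cache entries (assumed, kept tiny for illustration)


class DirectMappedCache:
    """Toy read-only cache: each memory block maps to exactly one entry."""

    def __init__(self, memory):
        self.memory = memory                 # bytes-like backing store
        self.entries = [None] * NUM_LINES    # each entry is (tag, data) or None
        self.hits = 0
        self.misses = 0

    def read(self, address):
        block = address // LINE_SIZE         # which cache line the address is in
        index = block % NUM_LINES            # the one entry that may hold it
        tag = block // NUM_LINES             # identifies the block within that entry
        offset = address % LINE_SIZE         # byte position inside the line

        entry = self.entries[index]
        if entry is not None and entry[0] == tag:
            self.hits += 1                   # cache hit: serve from the entry
        else:
            self.misses += 1                 # cache miss: copy the whole line in
            start = block * LINE_SIZE
            entry = (tag, bytes(self.memory[start:start + LINE_SIZE]))
            self.entries[index] = entry      # replaces whatever was there before
        return entry[1][offset]


memory = bytes(range(256)) * 4               # 1024 bytes of sample data
cache = DirectMappedCache(memory)
cache.read(0)      # miss: line 0 is loaded
cache.read(1)      # hit: same line as address 0
cache.read(256)    # miss: maps to the same entry and replaces line 0
cache.read(0)      # miss again, because line 0 was replaced
```

In this model, reading two addresses in the same line costs only one memory fetch, while two blocks that share an entry repeatedly displace each other.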

===Policies===

===={{Anchor|EVICTION}}Replacement policies====
{{Main article|Cache replacement policies}}

To make room for the new entry on a cache miss, the cache may have to evict one of the existing entries. The heuristic it uses to choose the entry to evict is called the replacement policy. The fundamental problem with any replacement policy is that it must predict which existing cache entry is least likely to be used in the future. Predicting the future is difficult, so there is no perfect method to choose among the variety of replacement policies available. One popular replacement policy, [[least-recently used]] (LRU), replaces the least recently accessed entry.
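
A minimal sketch of LRU replacement follows, assuming a fully associative cache that tracks the exact access order. The class name <code>LRUCache</code>, the <code>access</code> method, and the <code>load</code> callback standing in for a main-memory fetch are hypothetical names introduced for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fully associative cache with exact least-recently-used eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()   # tag -> data, least recently used first

    def access(self, tag, load):
        """Return (data, evicted_tag); evicted_tag is None if nothing was evicted."""
        if tag in self.entries:
            self.entries.move_to_end(tag)        # hit: mark as most recently used
            return self.entries[tag], None
        evicted = None
        if len(self.entries) >= self.capacity:
            # Miss with a full cache: evict the least recently used entry.
            evicted, _ = self.entries.popitem(last=False)
        self.entries[tag] = load(tag)            # fetch from "memory" (hypothetical)
        return self.entries[tag], evicted


lru = LRUCache(capacity=2)
lru.access("A", str.lower)              # miss, cache holds A
lru.access("B", str.lower)              # miss, cache holds A, B
lru.access("A", str.lower)              # hit, A becomes most recently used
_, victim = lru.access("C", str.lower)  # miss, B is evicted rather than A
```

Because "A" was touched after "B", the policy predicts that "B" is less likely to be needed again and evicts it first.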

Marking some memory ranges as non-cacheable can improve performance, by avoiding caching of memory regions that are rarely re-accessed. This avoids the overhead of loading something into the cache without having any reuse. Cache entries may also be disabled or locked depending on the context.

====Write policies====
{{Main article|Cache (computing)#WRITEPOLICIES|l1=Cache (computing) § Writing policies}}

If data are written to the cache, at some point they must also be written to main memory; the timing of this write is known as the write policy. In a [[write-through]] cache, every write to the cache causes a write to main memory. Alternatively, in a [[write-back]] or copy-back cache, writes are not immediately mirrored to main memory. Instead, locations that have been written over are marked as [[Dirty bit|dirty]] and are written back to main memory only when they are evicted from the cache. For this reason, a read miss in a write-back cache may sometimes require two memory accesses to service: one to write the dirty location to main memory, and another to read the new location from memory. Likewise, a write to a main memory location that is not yet mapped in a write-back cache may evict an already dirty location, thereby freeing that cache space for the new memory location.
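
The difference between the two policies can be illustrated by counting main-memory accesses in a toy model. The sketch below assumes a direct-mapped cache that allocates a line on a write miss (write-allocate); the class <code>WritePolicyCache</code> and its counters are hypothetical and exist only for this example.

```python
class WritePolicyCache:
    """Toy cache that counts main-memory traffic under either write policy."""

    def __init__(self, num_lines, write_back):
        self.num_lines = num_lines
        self.write_back = write_back   # True: write-back, False: write-through
        self.lines = {}                # index -> [tag, dirty]
        self.memory_reads = 0
        self.memory_writes = 0

    def _lookup(self, block):
        index = block % self.num_lines
        tag = block // self.num_lines
        line = self.lines.get(index)
        if line is None or line[0] != tag:
            # Miss: a dirty victim must be written back before it is replaced.
            if line is not None and line[1]:
                self.memory_writes += 1
            self.memory_reads += 1     # fetch the requested line
            line = [tag, False]
            self.lines[index] = line
        return line

    def read(self, block):
        self._lookup(block)

    def write(self, block):
        line = self._lookup(block)     # write-allocate (an assumption of this sketch)
        if self.write_back:
            line[1] = True             # defer the memory write, mark line dirty
        else:
            self.memory_writes += 1    # write-through: update memory immediately


wb = WritePolicyCache(num_lines=1, write_back=True)
wt = WritePolicyCache(num_lines=1, write_back=False)
for cache in (wb, wt):
    for _ in range(3):
        cache.write(0)                 # three writes to the same line
    cache.read(1)                      # read miss that replaces that line
```

In this run the write-through cache performs three memory writes, one per store, while the write-back cache performs a single memory write, issued only when the dirty line is evicted by the read miss. That read miss therefore costs the write-back cache two memory accesses.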

There are intermediate policies as well. The cache may be write-through, but the writes may be held in a store data queue temporarily, usually so multiple stores can be processed together (which can reduce bus turnarounds and improve bus utilization).

Cached data from the main memory may be changed by other entities (e.g., peripherals using [[direct memory access]] (DMA) or another core in a [[multi-core processor]]), in which case the copy in the cache may become out-of-date or stale. Alternatively, when a CPU in a [[multiprocessor]] system updates data in the cache, copies of data in caches associated with other CPUs become stale. Communication protocols between the cache managers that keep the data consistent are known as [[cache coherence]] protocols.

===Cache performance===
[[Cache performance measurement and metric|Cache performance measurement]] has become important as the gap between processor speed and memory speed has widened. The cache was introduced to reduce this gap, so knowing how well it bridges the difference between processor and memory speed matters, especially in high-performance systems. The cache hit rate and the cache miss rate play an important role in determining this performance. Reducing the miss rate is one of the necessary steps in improving cache performance; decreasing the cache access time also improves it.
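
One common way to combine these quantities is the average memory access time: the hit time plus the miss rate multiplied by the miss penalty. The sketch below computes it; the function names and the cycle counts in the usage lines are hypothetical values chosen for illustration, not measurements of any real system.

```python
def miss_rate(hits, misses):
    """Fraction of accesses that missed the cache."""
    total = hits + misses
    if total == 0:
        raise ValueError("no accesses recorded")
    return misses / total


def average_access_time(hit_time, miss_rate_value, miss_penalty):
    """Average memory access time = hit time + miss rate * miss penalty."""
    return hit_time + miss_rate_value * miss_penalty


# Hypothetical figures: 1-cycle hit time, 100-cycle miss penalty.
slow = average_access_time(1, miss_rate(hits=3, misses=1), 100)   # 25% miss rate
fast = average_access_time(1, miss_rate(hits=7, misses=1), 100)   # 12.5% miss rate
```

With these assumed figures, halving the miss rate from 25% to 12.5% lowers the average access time from 26 cycles to 13.5 cycles, which shows why the miss rate tends to dominate when the miss penalty is large.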

====CPU stalls====
The time taken to fetch one cache line from memory (read [[Latency (engineering)|latency]] due to a cache miss) matters because the CPU will run out of work while waiting for the cache line. When a CPU reaches this state, it is called a stall. As CPUs become faster compared to main memory, stalls due to cache misses displace more potential computation; modern CPUs can execute hundreds of instructions in the time taken to fetch a single cache line from main memory.

Various techniques have been employed to keep the CPU busy during this time, including [[out-of-order execution]] in which the CPU attempts to execute independent instructions after the instruction that is waiting for the cache miss data. Another technology, used by many processors, is [[simultaneous multithreading]] (SMT), which allows an alternate thread to use the CPU core while the first thread waits for required CPU resources to become available.