=={{Anchor|CACHE-HIT|CACHE-MISS}}Operation==
Hardware implements cache as a [[block (data storage)|block]] of memory for temporary storage of data likely to be used again. [[Central processing unit]]s (CPUs), [[solid-state drive]]s (SSDs) and hard disk drives (HDDs) frequently include hardware-based caches, while [[web browser]]s and [[web server]]s commonly rely on software caching.

A cache is made up of a pool of entries. Each entry has associated ''data'', which is a copy of the same data in some ''backing store''. Each entry also has a ''tag'', which specifies the identity of the data in the backing store of which the entry is a copy.

When the cache client (a CPU, web browser, or [[operating system]]) needs to access data presumed to exist in the backing store, it first checks the cache. If an entry can be found with a tag matching that of the desired data, the data in the entry is used instead. This situation is known as a '''cache hit'''. For example, a web browser might check its local cache on disk to see if it has a local copy of the contents of a web page at a particular [[URL]]. In this example, the URL is the tag and the content of the web page is the data. The percentage of accesses that result in cache hits is known as the '''hit rate''' or '''hit ratio''' of the cache.

The alternative situation, when the cache is checked and found not to contain any entry with the desired tag, is known as a '''cache miss'''. This requires a more expensive access of data from the backing store. Once the requested data is retrieved, it is typically copied into the cache, ready for the next access.

During a cache miss, some other previously existing cache entry is typically removed in order to make room for the newly retrieved data. The [[Heuristic (computer science)|heuristic]] used to select the entry to replace is known as the [[Cache replacement policies|replacement policy]]. One popular replacement policy, least recently used (LRU), replaces the entry that was accessed least recently. More sophisticated caching algorithms also take into account the frequency of use of entries.

==={{Anchor|Dirty|WRITEPOLICIES|WRITE-BACK|WRITE-BEHIND|WRITE-THROUGH|WRITE-AROUND}}Write policies===
[[File:Write-through with no-write-allocation.svg|thumb|380px|A write-through cache without write allocation]]
[[File:Write-back with write-allocation.svg|thumb|500px|A write-back cache with write allocation]]
Cache writes must eventually be propagated to the backing store. The timing for this is governed by the ''write policy''. The two primary write policies are:<ref>{{Cite web|url=https://www.linuxjournal.com/article/7105|title=Understanding Caching|last=Bottomley|first=James|date=2004-01-01|website=Linux Journal|access-date=2019-10-01}}</ref>
* ''Write-through'': Writes are performed synchronously to both the cache and the backing store.
* ''Write-back'': Initially, writing is done only to the cache. The write to the backing store is postponed until the modified content is about to be replaced by another cache block.

A write-back cache is more complex to implement, since it needs to track which of its locations have been written over and mark them as ''dirty'' for later writing to the backing store. The data in these locations are written back to the backing store only when they are evicted from the cache, a process referred to as a ''lazy write''.
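For illustration only, the following sketch combines the ideas above: lookup by tag, LRU eviction on a miss, and dirty-entry tracking with a lazy write on eviction. The class name, the dictionary-based backing store, and the fixed capacity are assumptions made for the example, not features of any particular hardware or software cache.

<syntaxhighlight lang="python">
from collections import OrderedDict

class WriteBackLRUCache:
    """Illustrative write-back cache with LRU replacement (a sketch, not a real implementation)."""

    def __init__(self, backing_store, capacity=4):
        self.backing_store = backing_store   # assumed dict-like backing store
        self.capacity = capacity
        self.entries = OrderedDict()         # tag -> (data, dirty flag)

    def read(self, tag):
        if tag in self.entries:              # cache hit
            self.entries.move_to_end(tag)    # mark as most recently used
            return self.entries[tag][0]
        data = self.backing_store[tag]       # cache miss: expensive access to the backing store
        self._insert(tag, data, dirty=False)
        return data

    def write(self, tag, data):
        # Write-back with write allocate: update only the cache and mark the entry dirty.
        self._insert(tag, data, dirty=True)

    def _insert(self, tag, data, dirty):
        if tag in self.entries:
            dirty = dirty or self.entries[tag][1]
        elif len(self.entries) >= self.capacity:
            victim, (old_data, was_dirty) = self.entries.popitem(last=False)  # evict the LRU entry
            if was_dirty:
                self.backing_store[victim] = old_data   # lazy write on eviction
        self.entries[tag] = (data, dirty)
        self.entries.move_to_end(tag)

store = {"a": 1, "b": 2}
cache = WriteBackLRUCache(store, capacity=2)
cache.read("a")        # cache miss: fetched from the backing store
cache.read("a")        # cache hit: served from the cache
cache.write("a", 10)   # write-back: store["a"] remains 1 until "a" is evicted
</syntaxhighlight>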
Because of this lazy write, a read miss in a write-back cache may require two accesses to the backing store: one to write the dirty data back, and one to retrieve the requested data. Other policies may also trigger data write-back; for example, the client may make many changes to data in the cache, and then explicitly notify the cache to write back the data.

Since a write operation returns no data to the requester, a decision needs to be made for write misses: whether or not to load the data into the cache. This is determined by the ''write-miss policy'':
* ''Write allocate'' (also called ''fetch on write''): Data at the missed-write location is loaded into the cache, followed by a write-hit operation. In this approach, write misses are similar to read misses.
* ''No-write allocate'' (also called ''write-no-allocate'' or ''write around''): Data at the missed-write location is not loaded into the cache, and is written directly to the backing store. In this approach, data is loaded into the cache on read misses only.

While either write policy can be combined with either write-miss policy, they are typically paired as follows:<ref name="HennessyPatterson2011">{{cite book|last1=Hennessy|first1=John L.|url=https://books.google.com/books?id=v3-1hVwHnHwC&pg=SL2-PA12|title=Computer Architecture: A Quantitative Approach|last2=Patterson|first2=David A.|publisher=Elsevier|year=2011|isbn=978-0-12-383872-8|page=B–12|language=en}}</ref><ref>{{cite book|title=Computer Architecture: A Quantitative Approach|last1=Patterson|first1=David A.|last2=Hennessy|first2=John L.|isbn=1-55860-069-8|date=1990|page=413|publisher=Morgan Kaufmann Publishers}}</ref>
* A write-back cache typically employs write allocate, anticipating that subsequent writes or reads to the same location will benefit from having the data already in the cache.
* A write-through cache uses no-write allocate. Here, subsequent writes have no advantage, since they still need to be written directly to the backing store.

Entities other than the cache may change the data in the backing store, in which case the copy in the cache may become out-of-date or ''stale''. Alternatively, when the client updates the data in the cache, copies of that data in other caches will become stale. Communication protocols between the cache managers that keep the data consistent are associated with [[cache coherence]].

===Prefetch===
{{main|Cache prefetching}}
{{see|Memory paging#Page replacement techniques}}
On a cache read miss, caches with a ''[[demand paging]] policy'' read the minimum amount from the backing store. A typical demand-paging virtual memory implementation reads one page of virtual memory (often 4 KB) from disk into the disk cache in RAM. A typical CPU reads a single L2 cache line of 128 bytes from DRAM into the L2 cache, and a single L1 cache line of 64 bytes from the L2 cache into the L1 cache.

Caches with a [[prefetch input queue]] or a more general ''anticipatory paging policy'' go further: they not only read the data requested, but guess that the next chunk or two of data will soon be required, and so prefetch that data into the cache ahead of time. Anticipatory paging is especially helpful when the backing store has a long latency to read the first chunk and much shorter times to sequentially read the next few chunks, such as [[disk storage]] and DRAM.
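As a rough illustration of anticipatory paging, the sketch below fetches a requested block on a miss and then reads ahead a fixed number of subsequent blocks. The dictionary-based cache, the list-based backing store, and the <code>read_ahead</code> parameter are assumptions made for this example, not the interface of any real paging system.

<syntaxhighlight lang="python">
def read_with_prefetch(cache, backing_store, block_number, read_ahead=2):
    """Demand-fetch the requested block, then speculatively prefetch the next few."""
    if block_number not in cache:                        # read miss
        cache[block_number] = backing_store[block_number]
    # Anticipatory policy: sequential read-ahead of the following blocks.
    for next_block in range(block_number + 1, block_number + 1 + read_ahead):
        if next_block < len(backing_store) and next_block not in cache:
            cache[next_block] = backing_store[next_block]
    return cache[block_number]

blocks = [f"block {i}" for i in range(8)]   # stand-in backing store
cache = {}
read_with_prefetch(cache, blocks, 3)        # loads blocks 3, 4 and 5 into the cache
</syntaxhighlight>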
A few operating systems go further, with a [[loader (computing)|loader]] that always pre-loads the entire executable into RAM. A few caches go even further, not only pre-loading an entire file, but also starting to load other related files that may soon be requested, such as the [[page cache]] associated with a [[prefetcher]] or the [[web cache]] associated with [[link prefetching]].
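In the same spirit, link prefetching in a web cache can be sketched as follows. The <code>fetch</code> and <code>extract_links</code> helpers are hypothetical placeholders for an HTTP client and an HTML link extractor; they are assumptions of this sketch, not part of any real browser or proxy API.

<syntaxhighlight lang="python">
def warm_web_cache(cache, fetch, extract_links, url):
    """Fetch a page on demand, then speculatively prefetch the resources it links to."""
    if url not in cache:
        cache[url] = fetch(url)                      # demand fetch of the requested page
    for linked_url in extract_links(cache[url]):
        if linked_url not in cache:
            cache[linked_url] = fetch(linked_url)    # link prefetching: load before it is requested
    return cache[url]
</syntaxhighlight>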