Memory hierarchy
==Examples==
[[File:Hwloc.png|thumb|right|300px|Memory hierarchy of an AMD Bulldozer server]]
The number of levels in the memory hierarchy and the performance at each level have increased over time. The types of memory and storage components have also changed over time.<ref>{{cite web|url=http://www.computerhistory.org/timeline/memory-storage/|title=Memory & Storage – Timeline of Computer History – Computer History Museum|website=www.computerhistory.org}}</ref> For example, the memory hierarchy of an Intel Haswell Mobile<ref>{{cite web|last=Crothers |first=Brooke |url=http://news.cnet.com/8301-13579_3-57609045-37/dissecting-intels-top-graphics-in-apples-15-inch-macbook-pro/ |title=Dissecting Intel's top graphics in Apple's 15-inch MacBook Pro – CNET |publisher=News.cnet.com |access-date=2014-07-31}}</ref> processor circa 2013 is:
* [[Processor register]]s{{dash}}the fastest possible access (usually 1 CPU cycle). A few thousand bytes in size.
* [[CPU cache|Cache]]
** Level 0 (L0), [[micro-operation]]s cache{{dash}}6,144 bytes (6 KiB{{cn|reason=No source provided for IEC units, sources only use metric units like KB, MB, GB, etc|date=May 2021}}{{Original research inline|certain=y|date=May 2021}})<ref>{{cite web|url=http://www.anandtech.com/show/6355/intels-haswell-architecture/6 |title=Intel's Haswell Architecture Analyzed: Building a New PC and a New Intel |publisher=AnandTech |access-date=2014-07-31}}</ref> in size
** Level 1 (L1) [[Opcode|instruction]] cache{{dash}}128 KiB{{cn|reason=No source provided for IEC units, sources only use metric units like KB, MB, GB, etc|date=May 2021}}{{Original research inline|certain=y|date=May 2021}} in size
** Level 1 (L1) data cache{{dash}}128 KiB{{cn|reason=No source provided for IEC units, sources only use metric units like KB, MB, GB, etc|date=May 2021}}{{Original research inline|certain=y|date=May 2021}} in size. Best access speed is around 700 [[Gigabyte|GB]]/s.<ref name=sisd_qa_f_mem_hsw>{{cite web|url=http://www.sisoftware.co.uk/?d=qa&f=mem_hsw |title=SiSoftware Zone |publisher=Sisoftware.co.uk |access-date=2014-07-31|archive-url=https://web.archive.org/web/20140913231938/http://www.sisoftware.co.uk/?d=qa&f=mem_hsw|archive-date=2014-09-13}}</ref>
** Level 2 (L2) instruction and data (shared){{dash}}1 [[MiB]]{{cn|reason=No source provided for IEC units, sources only use metric units like KB, MB, GB, etc|date=May 2021}}{{Original research inline|certain=y|date=May 2021}} in size. Best access speed is around 200 GB/s.<ref name=sisd_qa_f_mem_hsw />
** Level 3 (L3) shared cache{{dash}}6 MiB{{cn|reason=No source provided for IEC units, sources only use metric units like KB, MB, GB, etc|date=May 2021}}{{Original research inline|certain=y|date=May 2021}} in size. Best access speed is around 100 GB/s.<ref name=sisd_qa_f_mem_hsw />
** Level 4 (L4) shared cache{{dash}}128 MiB{{cn|reason=No source provided for IEC units, sources only use metric units like KB, MB, GB, etc|date=May 2021}}{{Original research inline|certain=y|date=May 2021}} in size. Best access speed is around 40 GB/s.<ref name=sisd_qa_f_mem_hsw />
* [[Computer memory|Main memory]] ([[primary storage]]){{dash}}[[GiB]]{{cn|reason=No source provided for IEC units, sources only use metric units like KB, MB, GB, etc|date=May 2021}}{{Original research inline|certain=y|date=May 2021}} in size. Best access speed is around 10 GB/s.<ref name=sisd_qa_f_mem_hsw /> In the case of a [[Non-Uniform Memory Access|NUMA]] machine, access times may not be uniform.
* [[Mass storage]] ([[secondary storage]]){{dash}}[[terabyte]]s in size. {{As of|2017}}, the best access speed from a consumer [[Solid-state drive|solid-state drive]] is about 2000 MB/s.<ref>{{cite web|url=http://www.storagereview.com/samsung_960_pro_m2_nvme_ssd_review|title=Samsung 960 Pro M.2 NVMe SSD Review|date=20 October 2016 |publisher=storagereview.com|access-date=2017-04-13}}</ref>
* [[Nearline storage]] ([[tertiary storage]]){{dash}}up to [[exabytes]] in size. {{As of|2013}}, best access speed is about 160 MB/s.<ref>{{cite web |url=http://www.lto.org/technology/generations.html |title=Ultrium – LTO Technology – Ultrium GenerationsLTO |publisher=Lto.org |access-date=2014-07-31 |url-status=dead |archive-url=https://web.archive.org/web/20110727052050/http://www.lto.org/technology/generations.html |archive-date=2011-07-27 }}</ref>
* [[Offline storage]]

The lower levels of the hierarchy{{dash}}from mass storage downwards{{dash}}are also known as [[tiered storage]]. The formal distinction between online, nearline, and offline storage is:<ref name="pearson2010">{{cite web|last=Pearson|first=Tony|year=2010|title=Correct use of the term Nearline.|url=https://www.ibm.com/developerworks/community/blogs/InsideSystemStorage/entry/the_correct_use_of_the_term_nearline2|url-status=dead|archive-url=https://web.archive.org/web/20181127020712/https://www.ibm.com/developerworks/community/blogs/InsideSystemStorage/entry/the_correct_use_of_the_term_nearline2?lang=en|archive-date=2018-11-27|access-date=2015-08-16|work=IBM Developerworks, Inside System Storage}}</ref>
* Online storage is immediately available for I/O.
* Nearline storage is not immediately available, but can be made online quickly without human intervention.
* Offline storage is not immediately available, and requires some human intervention to bring online.

For example, always-on spinning disks are online, while spinning disks that spin down, such as massive arrays of idle disks ([[Non-RAID drive architectures#MAID|MAID]]), are nearline.
Removable media such as tape cartridges that can be automatically loaded, as in a [[tape library]], are nearline, while cartridges that must be manually loaded are offline.

Most modern [[Central processing unit|CPUs]] are so fast that, for most program workloads, the [[wikt:bottleneck|bottleneck]] is the [[locality of reference]] of memory accesses and the efficiency of the [[CPU cache|caching]] and memory transfer between different levels of the hierarchy.{{Citation needed|date=September 2009}} As a result, the CPU spends much of its time idling, waiting for memory I/O to complete. This is sometimes called the ''space cost'', as a larger memory object is more likely to overflow a small and fast level and require use of a larger, slower level. The resulting load on memory use is known as ''pressure'' (respectively ''register pressure'', ''cache pressure'', and (main) ''memory pressure''). The terms for data that is missing from a higher level and must be fetched from a lower level are, respectively: [[register spilling]] (due to [[register pressure]]: register to cache), [[cache miss]] (cache to main memory), and (hard) [[page fault]] (''real'' main memory to ''virtual'' memory, i.e. mass storage, commonly referred to as ''disk'' regardless of the actual mass storage technology used).

Modern [[programming language]]s mainly assume two levels of memory, main (''working'') memory and mass storage, though in [[assembly language]] and [[inline assembler]]s in languages such as [[C (programming language)|C]], registers can be directly accessed. Taking optimal advantage of the memory hierarchy requires the cooperation of programmers, hardware, and compilers (as well as underlying support from the operating system):
*''Programmers'' are responsible for moving data between disk and memory through file I/O.
*''Hardware'' is responsible for moving data between memory and caches.
*''[[Optimizing compiler]]s'' are responsible for generating code that, when executed, will cause the hardware to use caches and registers efficiently.

Many programmers assume one level of memory. This works until the application hits a performance wall, at which point the memory hierarchy is assessed during [[code refactoring]].
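
The effect of [[locality of reference]] on the [[cache miss]] rate can be illustrated with a toy model. The following Python sketch simulates a small direct-mapped cache and counts misses for two traversal orders of the same matrix. The cache geometry (64 lines of 64 bytes), the 8-byte element size, the 256 × 256 matrix and all function names are assumptions chosen for illustration; they do not describe the Haswell processor above or any other real hardware.

```python
"""Toy direct-mapped cache model: count misses for two traversal orders."""


def count_misses(addresses, num_lines=64, line_size=64):
    """Return the number of cache misses for a sequence of byte addresses.

    Models a direct-mapped cache with `num_lines` lines of `line_size` bytes
    (hypothetical sizes, chosen only to keep the example small).
    """
    tags = [None] * num_lines        # memory block currently held by each line
    misses = 0
    for addr in addresses:
        block = addr // line_size    # memory block containing this address
        index = block % num_lines    # the single cache line this block maps to
        if tags[index] != block:     # block is not cached: count a miss
            tags[index] = block      # and load it, evicting the old block
            misses += 1
    return misses


def row_by_row(rows, cols, elem_size=8):
    """Byte addresses touched when a row-major matrix is walked row by row."""
    return [(r * cols + c) * elem_size
            for r in range(rows) for c in range(cols)]


def column_by_column(rows, cols, elem_size=8):
    """Byte addresses touched when the same matrix is walked column by column."""
    return [(r * cols + c) * elem_size
            for c in range(cols) for r in range(rows)]


ROWS, COLS = 256, 256
sequential = count_misses(row_by_row(ROWS, COLS))
strided = count_misses(column_by_column(ROWS, COLS))

print(f"accesses:                {ROWS * COLS}")
print(f"row-by-row misses:       {sequential}")
print(f"column-by-column misses: {strided}")
```

Under these assumptions the row-by-row walk misses once per 64-byte line (8,192 misses in 65,536 accesses), while the column-by-column walk misses on every access, because rows two apart map to the same cache line and evict each other. Real caches are usually set-associative and use hardware prefetching, so measured ratios will differ.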