== Models ==

Unlike episodic memory, the contents of semantic memory are not tied to any particular instance of experience. Instead, what is stored in semantic memory is the "gist" of experience, an abstract structure that applies to a wide variety of experiential objects and delineates categorical and functional relationships between such objects. Numerous sub-theories of semantic memory have developed since Tulving first proposed the distinction between semantic and episodic memory; one example is the view that semantic memory is organized hierarchically, with each piece of learned information associated with a particular level of related knowledge. According to this theory, the brain can associate specific information with other, disparate ideas without retaining unique memories of when that knowledge was first acquired.<ref>{{Cite journal |last=Rubin |first=David C. |date=Apr 2022 |title=A conceptual space for episodic and semantic memory |journal=Memory & Cognition |language=en |volume=50 |issue=3 |pages=464–477 |doi=10.3758/s13421-021-01148-3 |pmid=33650021 |s2cid=232089657 |issn=0090-502X|doi-access=free }}</ref> This theory of hierarchies has also been applied to episodic memory, as in the work of William Brewer on the concept of autobiographical memory.<ref>{{Cite book |url=https://www.worldcat.org/oclc/12724186 |title=Autobiographical memory |date=1986 |others=David C. Rubin |isbn=0-521-30322-2 |location=Cambridge [Cambridgeshire] |oclc=12724186}}</ref>

=== Network models ===

[[Neural network|Networks]] of various sorts play an integral part in many theories of semantic memory. Generally speaking, a network is composed of a set of nodes connected by links. The nodes may represent concepts, words, perceptual features, or nothing at all.
The links may be weighted such that some are stronger than others or, equivalently, have a length such that some links take longer to traverse than others. All these features of networks have been employed in models of semantic memory.

====Teachable language comprehender====

One of the first examples of a network model of semantic memory is the teachable language comprehender (TLC).<ref>{{cite journal | last1 = Collins | first1 = A. M. | last2 = Quillian | first2 = M. R. | year = 1969 | title = Retrieval time from semantic memory | journal = Journal of Verbal Learning and Verbal Behavior | volume = 8 | issue = 2| pages = 240–247 | doi=10.1016/s0022-5371(69)80069-1| s2cid = 60922154 }}</ref> In this model, each node is a word representing a concept (like ''bird''). Within each node is stored a set of properties (like "can fly" or "has wings") as well as links to other nodes (like ''chicken''). A node is directly linked to those nodes of which it is either a subclass or superclass (i.e., ''bird'' would be connected to both ''chicken'' and ''animal''). Properties are stored at the highest category level to which they apply; for example, "is yellow" would be stored with ''canary'', "has wings" would be stored with ''bird'' (one level up), and "can move" would be stored with ''animal'' (another level up). Nodes may also store negations of the properties of their superordinate nodes (e.g., "NOT-can fly" would be stored with ''penguin''). Processing in TLC is a form of [[spreading activation]].<ref>Collins, A. M. & Quillian, M. R. (1972). How to make a language user. In E. Tulving & W. Donaldson (Eds.), ''Organization of memory'' (pp. 309–351). New York: Academic Press.</ref> When a node becomes active, that activation spreads to other nodes via the links between them. Thus, the time to answer the question "Is a chicken a bird?"
is a function of how far the activation between the nodes for ''chicken'' and ''bird'' must spread, or the number of links between those nodes. The original version of TLC did not put weights on the links between nodes. This version performed comparably to humans in many tasks, but failed to predict that people would respond faster to questions regarding more typical category instances than to those involving less typical instances.<ref>{{cite journal | last1 = Rips | first1 = L. J. | last2 = Shoben | first2 = E. J. | last3 = Smith | first3 = E. E. | year = 1973 | title = Semantic distance and the verification of semantic relations | journal = Journal of Verbal Learning and Verbal Behavior | volume = 12 | issue = 1| pages = 1–20 | doi = 10.1016/s0022-5371(73)80056-8 }}</ref> [[Allan M. Collins|Allan Collins]] and Quillian later updated TLC to include weighted connections to account for this effect,<ref>{{cite journal | last1 = Collins | first1 = A. M. | last2 = Loftus | first2 = E. F. | year = 1975 | title = A spreading-activation theory of semantic processing | journal = Psychological Review | volume = 82 | issue = 6| pages = 407–428 | doi=10.1037/0033-295x.82.6.407| s2cid = 14217893 }}</ref> which allowed it to explain both the familiarity effect and the typicality effect. The updated model's biggest advantage is that it clearly explains [[priming (psychology)|priming]]: information from memory is more likely to be retrieved if related information (the "prime") has been presented a short time before. There are still a number of memory phenomena for which TLC has no account, including why people are able to respond quickly to obviously false questions (like "Is a chicken a meteor?") when the relevant nodes are very far apart in the network.<ref>{{cite journal | last1 = Glass | first1 = A. L. | last2 = Holyoak | first2 = K. J. | last3 = Kiger | first3 = J. I.
| year = 1979 | title = Role of antonymy relations in semantic judgments | journal = Journal of Experimental Psychology: Human Learning & Memory | volume = 5 | issue = 6| pages = 598–606 | doi=10.1037/0278-7393.5.6.598}}</ref>

====Semantic networks====

TLC is an instance of a more general class of models known as [[semantic networks]]. In a semantic network, each node is to be interpreted as representing a specific concept, word, or feature; each node is a symbol. Semantic networks generally do not employ distributed representations for concepts, as may be found in a [[neural network]]. The defining feature of a semantic network is that its links are almost always directed (that is, they point in only one direction, from a base to a target) and come in many different types, each one standing for a particular relationship that can hold between any two nodes.<ref>Arbib, M. A. (Ed.). (2002). Semantic networks. In ''The Handbook of Brain Theory and Neural Networks (2nd ed.)'', Cambridge, MA: MIT Press.</ref> Semantic networks see the most use in models of [[Discourse analysis|discourse]] and [[logic]]al [[comprehension (logic)|comprehension]], as well as in [[artificial intelligence]].<ref>
* {{cite book |last1=Barr |first1=Avron |last2=Feigenbaum |first2=Edward A. |title=The Handbook of artificial intelligence, volume 1 |date=1981 |publisher=HeurisTech Press; William Kaufmann |location=Stanford, CA; Los Altos, CA |isbn=978-0-86576-004-2 |url=https://archive.org/details/handbookofartific01barr/}}
* {{cite book |last1=Barr |first1=Avron |last2=Feigenbaum |first2=Edward A. |title=The Handbook of artificial intelligence, volume 2 |date=1982 |publisher=HeurisTech Press; William Kaufmann |location=Stanford, CA; Los Altos, CA |isbn=978-0-86576-006-6 |url=https://archive.org/details/handbookofartific02barr/}}
* {{cite book |last1=Cohen |first1=Paul R. |last2=Feigenbaum |first2=Edward A.
|title=The Handbook of artificial intelligence, volume 3 |date=1982 |publisher=HeurisTech Press; William Kaufmann |location=Stanford, CA; Los Altos, CA |isbn=978-0-86576-007-3 |url=https://archive.org/details/handbookofartific03cohe}}
* {{cite book |last1=Barr |first1=Avron |last2=Cohen |first2=Paul R. |last3=Feigenbaum |first3=Edward A. (Edward Albert) |title=Handbook of artificial intelligence, volume 4 |date=1989 |publisher=Addison Wesley |location=Reading, MA |isbn=978-0-201-51731-6 |url=https://archive.org/details/handbookofartific04barr}}</ref> In these models, the nodes correspond to words or word stems and the links represent syntactic relations between them.<ref>{{cite journal | last1 = Cravo | first1 = M. R. | last2 = Martins | first2 = J. P. | year = 1993 | title = SNePSwD: A newcomer to the SNePS family | journal = Journal of Experimental & Theoretical Artificial Intelligence | volume = 5 | issue = 2–3| pages = 135–148 | doi=10.1080/09528139308953764}}</ref>

===Feature models===

Feature models view semantic categories as being composed of relatively unstructured sets of features. The [[semantic feature-comparison model]] describes memory as being composed of feature lists for different concepts.<ref name="Smith, E. E. 1974">{{cite journal | last1 = Smith | first1 = E. E. | last2 = Shoben | first2 = E. J. | last3 = Rips | first3 = L. J. | year = 1974 | title = Structure and process in semantic memory: A featural model for semantic decisions | journal = Psychological Review | volume = 81 | issue = 3| pages = 214–241 | doi=10.1037/h0036351}}</ref> According to this view, the relations between categories are not directly retrieved but are instead computed indirectly. For example, subjects might verify a sentence by comparing the feature sets that represent its subject and predicate concepts. Such computational feature-comparison models include the ones proposed by Meyer (1970),<ref>{{cite journal | last1 = Meyer | first1 = D. E.
| year = 1970 | title = On the representation and retrieval of stored semantic information | journal = Cognitive Psychology | volume = 1 | issue = 3| pages = 242–299 | doi=10.1016/0010-0285(70)90017-4}}</ref> Rips (1975),<ref>{{cite journal | last1 = Rips | first1 = L. J. | year = 1975 | title = Inductive judgments about natural categories | journal = Journal of Verbal Learning & Verbal Behavior | volume = 14 | issue = 6| pages = 665–681 | doi=10.1016/s0022-5371(75)80055-7}}</ref> and Smith ''et al.'' (1974).<ref name="Smith, E. E. 1974"/> Early work in perceptual and conceptual categorization assumed that categories had critical features and that category membership could be determined by logical rules for the combination of features. More recent theories have accepted that categories may have an ill-defined or "fuzzy" structure<ref>{{cite journal | last1 = McCloskey | first1 = M. E. | last2 = Glucksberg | first2 = S. | year = 1978 | title = Natural categories: Well defined or fuzzy sets? | journal = Memory & Cognition | volume = 6 | issue = 4| pages = 462–472 | doi=10.3758/bf03197480| doi-access = free }}</ref> and have proposed probabilistic or global similarity models for the verification of category membership.<ref>{{cite journal | last1 = McCloskey | first1 = M. | last2 = Glucksberg | first2 = S. | year = 1979 | title = Decision processes in verifying category membership statements: Implications for models of semantic memory | journal = Cognitive Psychology | volume = 11 | issue = 1| pages = 1–37 | doi=10.1016/0010-0285(79)90002-1| s2cid = 54313506 }}</ref>

===Associative models===

The set of [[Association (psychology)|associations]] among a collection of items in memory is equivalent to the links between nodes in a network, where each node corresponds to a unique item in memory. Indeed, neural networks and semantic networks may be characterized as associative models of cognition.
However, associations are often more clearly represented as an ''N''×''N'' matrix, where ''N'' is the number of items in memory; each cell of the matrix corresponds to the strength of the association between the row item and the column item. Learning of associations is generally believed to be a [[Hebbian]] process: whenever two items in memory are simultaneously active, the association between them grows stronger, making either item more likely to activate the other. See below for specific operationalizations of associative models.

====Search of associative memory====

A standard model of memory that employs association in this manner is the search of associative memory (SAM) model.<ref>{{Cite journal | last1=Raaijmakers | first1=J. G. W. | last2=Shiffrin | first2=R. M. |year=1981 |title=Search of associative memory | journal=Psychological Review |volume=88 |issue=2 |pages=93–134 }}</ref> Though SAM was originally designed to model episodic memory, its mechanisms are sufficient to support some semantic memory representations.<ref>{{Cite journal|last1=Kimball|first1=Daniel R.|last2=Smith|first2=Troy A.|last3=Kahana|first3=Michael J.|date=2007|title=The fSAM model of false recall.|journal=Psychological Review|language=en|volume=114|issue=4|pages=954–993|doi=10.1037/0033-295x.114.4.954|issn=1939-1471|pmc=2839460|pmid=17907869}}</ref> The model contains a short-term store (STS) and a long-term store (LTS), where the STS is a briefly activated subset of the information in the LTS. The STS has limited capacity and affects the retrieval process by limiting the amount of information that can be sampled and the time the sampled subset remains active. The retrieval process in the LTS is cue-dependent and probabilistic: a cue initiates the retrieval process, and the information selected from memory is sampled at random.
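The association matrix and cue-driven sampling described above can be sketched in Python. This is a minimal toy illustration: the item names, the unit Hebbian increment, and the proportional sampling rule are assumptions made for the example, not details taken from SAM itself.

```python
import random

import numpy as np

items = ["bread", "butter", "doctor"]  # hypothetical memory items
n = len(items)
assoc = np.zeros((n, n))  # assoc[i, j]: association strength between items i and j

def co_activate(i, j, rate=1.0):
    """Hebbian update: simultaneous activity strengthens the association."""
    assoc[i, j] += rate
    assoc[j, i] += rate  # keep the toy matrix symmetric

# "bread" and "butter" co-occur twice; "bread" and "doctor" once
co_activate(0, 1)
co_activate(0, 1)
co_activate(0, 2)

def sample_given_cue(cue, rng=random.Random(0)):
    """Cue-dependent probabilistic retrieval: an item's chance of being
    sampled is proportional to its association with the cue."""
    return rng.choices(range(n), weights=assoc[cue], k=1)[0]

# Cued with "bread", "butter" (strength 2) is sampled twice as often
# as "doctor" (strength 1), and "bread" itself (strength 0) never is.
retrieved = items[sample_given_cue(0)]
```

In a sketch like this, items that co-occur more often accumulate stronger associations, which in turn bias which item a cue retrieves.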
The probability of an item being sampled depends on the strength of association between the cue and that item, with more strongly associated items more likely to be sampled. The buffer size is defined as ''r'' and is not a fixed number; as items are rehearsed in the buffer, their associative strengths grow linearly as a function of the total time spent inside the buffer.<ref>{{cite book|last=Raaijmakers|first=J.G.|author2=Shiffrin, R.M. |title=SAM: A theory of probabilistic search of associative memory|series=Psychology of Learning and Motivation|year=1980|volume=14|pages=207–262|doi=10.1016/s0079-7421(08)60162-0|isbn=9780125433143}}</ref> In SAM, when any two items simultaneously occupy a working memory buffer, the strength of their association is incremented; items that co-occur more often are more strongly associated. Items in SAM are also associated with a specific context, where the strength of that association is determined by how long each item is present in that context. In SAM, memories consist of a set of associations between items in memory and between items and contexts. The presence of a set of items and/or a context evokes some subset of the items in memory. The degree to which items evoke one another—either by virtue of their shared context or their co-occurrence—is an indication of the items' [[semantic relatedness]]. In an updated version of SAM, pre-existing semantic associations are accounted for using a semantic [[Matrix (mathematics)|matrix]]. During an experiment, semantic associations remain fixed, reflecting the assumption that they are not significantly altered by the episodic experience of a single experiment. The two measures of semantic relatedness used in this model are latent semantic analysis (LSA) and word association spaces (WAS).<ref>{{cite journal|last=Sirotin|first=Y.B.|author2=Kimball, D.R. |author3=Kahana, M.J. |title=Going beyond a single list: Modeling the effects of prior experience on episodic free recall|journal=Psychonomic Bulletin & Review|year=2005|volume=12|issue=5|pages=787–805|doi=10.3758/bf03196773|pmid=16523998|doi-access=free}}</ref> The LSA method holds that similarity between words is reflected in their co-occurrence in local contexts.<ref>{{cite journal|last=Landauer|first=T.K.|author2=Dumais, S.T. |title=A solution to Plato's problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge|journal=Psychological Review|volume=104|issue=2 |pages=211–240|doi=10.1037/0033-295x.104.2.211|year=1997 |citeseerx=10.1.1.184.4759 |s2cid=1144461 }}</ref> WAS was developed by analyzing a database of free association norms, and is a space in which "words that have similar associative structures are placed in similar regions of space".<ref>{{cite book|pages=237–249|doi=10.1037/10895-018|chapter=Word Association Spaces for Predicting Semantic Similarity Effects in Episodic Memory|title=Experimental Cognitive Psychology and Its Applications|year=2005|last1=Steyvers|first1=Mark|last2=Shiffrin|first2=Richard M.|last3=Nelson|first3=Douglas L.|isbn=978-1-59147-183-7|chapter-url=http://psiexp.ss.uci.edu/research/papers/SteyversShiffrinNelsonFormatted.pdf|url=http://www.apa.org/pubs/books/4318016.aspx|editor-first1= Alice F. |editor-last1=Healy|citeseerx=10.1.1.66.5334|archive-url=https://web.archive.org/web/20100609210806/http://psiexp.ss.uci.edu/research/papers/SteyversShiffrinNelsonFormatted.pdf|archive-date=2010-06-09}}</ref>

====ACT-R: a production system model====

The adaptive control of thought (ACT)<ref>Anderson, J. R. (1983). ''The Architecture of Cognition''. Cambridge, MA: Harvard University Press.</ref> (and later [[ACT-R]] (Adaptive Control of Thought-Rational)<ref>Anderson, J. R. (1993). ''Rules of the mind''.
Hillsdale, NJ: Erlbaum.</ref>) theory of cognition represents [[declarative memory]] (of which semantic memory is a part) as "chunks", which consist of a label, a set of defined relationships to other chunks (e.g., "this is a _" or "this has a _"), and any number of chunk-specific properties. Chunks can be mapped as a semantic network, in that each node is a chunk with its unique properties, and each link is the chunk's relationship to another chunk. In ACT, a chunk's activation decreases as a function of the time from when the chunk was created, and increases with the number of times the chunk has been retrieved from memory. Chunks can also receive activation from [[Gaussian noise]] and from their similarity to other chunks. For example, if ''chicken'' is used as a retrieval cue, ''canary'' will receive activation by virtue of its similarity to the cue. When retrieving items from memory, ACT looks at the most active chunk in memory; if it is above threshold, it is retrieved; otherwise an "error of omission" has occurred and the item has been forgotten. There is also retrieval latency, which varies inversely with the amount by which the activation of the retrieved chunk exceeds the retrieval threshold. This latency is used to measure the response time of the ACT model and compare it to human performance.<ref>{{cite journal | last1 = Anderson | first1 = J. R. | last2 = Bothell | first2 = D. | last3 = Lebiere | first3 = C. | last4 = Matessa | first4 = M. | year = 1998 | title = An integrated theory of list memory | journal = Journal of Memory and Language | volume = 38 | issue = 4| pages = 341–380 | doi=10.1006/jmla.1997.2553| citeseerx = 10.1.1.132.7920 | s2cid = 14462252 }}</ref>

===Statistical models===

Some models characterize the acquisition of semantic information as a form of [[statistical inference]] from a set of discrete experiences, distributed across a number of [[Context (language use)|contexts]].
Though these models differ in specifics, they generally employ an (Item × Context) [[matrix (mathematics)|matrix]] where each cell represents the number of times an item in memory has occurred in a given context. Semantic information is gleaned by performing a statistical analysis of this matrix. Many of these models bear similarity to the algorithms used in [[search engines]], though it is not yet clear whether they really use the same computational mechanisms.<ref>{{cite journal | last1 = Griffiths | first1 = T. L. | last2 = Steyvers | first2 = M. | last3 = Firl | first3 = A. | year = 2007 | title = Google and the mind: Predicting fluency with PageRank | journal = Psychological Science | volume = 18 | issue = 12| pages = 1069–1076 | doi=10.1111/j.1467-9280.2007.02027.x| pmid = 18031414 | s2cid = 12063124 }}</ref><ref>Anderson, J. R. (1990). ''The adaptive character of thought''. Hillsdale, NJ: Lawrence Erlbaum Associates.</ref>

====Latent semantic analysis====

One of the more popular models is [[latent semantic analysis]] (LSA).<ref>{{cite journal | last1 = Landauer | first1 = T. K. | last2 = Dumais | first2 = S. T. | year = 1997 | title = A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction, and representation of knowledge | journal = Psychological Review | volume = 104 | issue = 2| pages = 211–240 | doi=10.1037/0033-295x.104.2.211| citeseerx = 10.1.1.184.4759 | s2cid = 1144461 }}</ref> In LSA, a T × D [[matrix (mathematics)|matrix]] is constructed from a [[text corpus]], where T is the number of terms in the corpus and D is the number of documents (here "context" is interpreted as "document" and only words—or word phrases—are considered as items in memory).
Each cell in the matrix is then transformed according to the equation: <math>\mathbf{M}_{t,d}'=\frac{\ln{(1 + \mathbf{M}_{t,d})}}{-\sum_{i=0}^D P(i|t) \ln{P(i|t)}}</math> where <math>P(i|t)</math> is the probability that context <math>i</math> is active, given that item <math>t</math> has occurred (this is obtained simply by dividing the raw frequency <math>\mathbf{M}_{t,d}</math> by the total of the item vector, <math>\sum_{i=0}^D \mathbf{M}_{t,i}</math>).

====Hyperspace Analogue to Language (HAL)====

The Hyperspace Analogue to Language (HAL) model<ref>Lund, K., Burgess, C. & Atchley, R. A. (1995). Semantic and associative priming in a high-dimensional semantic space. ''Cognitive Science Proceedings (LEA)'', 660–665.</ref><ref>{{cite journal | last1 = Lund | first1 = K. | last2 = Burgess | first2 = C. | year = 1996 | title = Producing high-dimensional semantic spaces from lexical co-occurrence | journal = Behavior Research Methods, Instruments, and Computers | volume = 28 | issue = 2| pages = 203–208 | doi=10.3758/bf03204766| doi-access = free }}</ref> considers context only as the words that immediately surround a given word. HAL computes an ''N''×''N'' matrix, where ''N'' is the number of words in its lexicon, using a 10-word reading frame that moves incrementally through a corpus of text. As in SAM, any time two words are simultaneously in the frame, the association between them is increased; that is, the corresponding cell in the ''N''×''N'' matrix is incremented. The greater the distance between the two words, the smaller the amount by which the association is incremented (specifically, <math>\Delta=11-d</math>, where <math>d</math> is the distance between the two words in the frame).
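HAL's windowed counting scheme can be sketched as follows. This is a toy illustration: the nine-word corpus is invented, and each within-frame pair is counted once (word paired with each preceding word), a simplification of HAL's separate bookkeeping for preceding and following co-occurrences.

```python
import numpy as np

# Toy corpus; HAL was built from much larger bodies of text
corpus = "the quick brown fox jumped over the lazy dog".split()
lexicon = sorted(set(corpus))
index = {w: i for i, w in enumerate(lexicon)}
N = len(lexicon)
hal = np.zeros((N, N))  # hal[w, p]: association of word w with prior word p

WINDOW = 10  # HAL's 10-word reading frame

for pos, word in enumerate(corpus):
    # Pair the current word with each word in the preceding frame;
    # closer words receive larger increments (Δ = 11 − d)
    for d in range(1, min(WINDOW, pos) + 1):
        prior = corpus[pos - d]
        hal[index[word], index[prior]] += (WINDOW + 1) - d

print(hal[index["quick"], index["the"]])  # adjacent pair, d = 1, so Δ = 10
```

Distances between the resulting word vectors (rows, or rows concatenated with columns) can then serve as a measure of semantic similarity between words.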