Vector clock
{{Short description|Algorithm for partial ordering of events and detecting causality in distributed systems}}
{{distinguish|Version vector}}
A '''vector clock''' is a [[data structure]] used for determining the [[partial ordering]] of events in a [[distributed system]] and detecting [[causality]] violations. Just as in [[Lamport timestamp]]s, inter-process messages contain the state of the sending process's [[logical clock]]. A vector clock of a system of ''N'' processes is an [[array data structure|array]]/vector of ''N'' logical clocks, one clock per process; each process keeps a local copy of this global clock array, holding the largest clock values it knows of. With <math>VC_i</math> denoting the vector clock maintained by process <math>i</math>, the clock updates proceed as follows:<ref>{{Cite web|title=Distributed Systems 3rd edition (2017)|url=https://www.distributed-systems.net/index.php/books/ds3/|access-date=2021-03-21|website=DISTRIBUTED-SYSTEMS.NET|language=en-US}}</ref>
[[Image:Vector Clock.svg|thumb|upright=1.9|Example of a system of vector clocks. Events in the blue region are the causes leading to event B4, whereas those in the red region are the effects of event B4.]]
* Initially all clocks are zero.
* Each time a process experiences an internal event, it increments its own [[logical clock]] in the vector by one. For instance, upon an event at process <math>i</math>, it updates <math>VC_{i}[i] \leftarrow VC_{i}[i] + 1</math>.
* Each time a process sends a message, it increments its own logical clock in the vector by one (as in the bullet above, but not twice for the same event), then pairs the message with a copy of its own vector, and finally sends the pair.
* Each time a process receives a message-vector clock pair, it increments its own logical clock in the vector by one and updates each element in its vector by taking the maximum of the value in its own vector clock and the value in the vector in the received pair (for every element).
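The update rules above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the class name <code>VectorClockProcess</code>, its method names, and the helper <code>happened_before</code> (which applies the component-wise comparison given in the "Partial ordering property" section) are assumptions made for this example, and messages are passed directly between objects instead of over a network.

```python
from typing import List, Tuple


class VectorClockProcess:
    """One process holding its own vector clock (illustrative names)."""

    def __init__(self, pid: int, n: int) -> None:
        self.pid = pid
        self.vc = [0] * n  # initially all clocks are zero

    def internal_event(self) -> List[int]:
        # Internal event: increment only this process's own entry.
        self.vc[self.pid] += 1
        return list(self.vc)

    def send(self, payload: str) -> Tuple[str, List[int]]:
        # Sending counts as an event; attach a copy of the vector.
        self.vc[self.pid] += 1
        return payload, list(self.vc)

    def receive(self, message: Tuple[str, List[int]]) -> List[int]:
        # Increment the own entry, then take the element-wise maximum.
        _, sender_vc = message
        self.vc[self.pid] += 1
        self.vc = [max(a, b) for a, b in zip(self.vc, sender_vc)]
        return list(self.vc)


def happened_before(a: List[int], b: List[int]) -> bool:
    # a < b: every component <= and at least one strictly smaller.
    pairs = list(zip(a, b))
    return all(x <= y for x, y in pairs) and any(x < y for x, y in pairs)


p0, p1 = VectorClockProcess(0, 2), VectorClockProcess(1, 2)
e_local = p1.internal_event()  # timestamp of a local event at p1
msg = p0.send("hello")         # message paired with p0's vector
e_recv = p1.receive(msg)       # timestamp of the receive event at p1
```

In this run the send at <code>p0</code> is ordered before the receive at <code>p1</code>, while the earlier local event at <code>p1</code> and the send at <code>p0</code> are incomparable, that is, concurrent.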
For example, if process <math>P_i</math> receives a message <math>(m, VC_{j})</math> from <math>P_j</math>, it first increments its own logical clock in the vector by one, <math>VC_{i}[i]\leftarrow VC_{i}[i]+1</math>, and then updates its entire vector by setting <math>VC_{i}[k]\leftarrow \max(VC_{i}[k], VC_{j}[k]), \forall k</math>.

==History==
Lamport originated the idea of logical [[Lamport clock]]s in 1978.<ref name="Lamport 1978">{{Cite journal | last1 = Lamport | first1 = L. |authorlink1=Leslie Lamport| title = Time, clocks, and the ordering of events in a distributed system | doi = 10.1145/359545.359563 | journal = [[Communications of the ACM]]| volume = 21 | issue = 7 | pages = 558–565| year = 1978 | s2cid = 215822405 | url=http://research.microsoft.com/users/lamport/pubs/time-clocks.pdf}}</ref> However, the logical clocks in that paper were scalars, not vectors. The generalization to vector time was developed several times, apparently independently, by different authors in the early 1980s.<ref name=Schwarz>{{cite journal |last1=Schwarz |first1=Reinhard |last2=Mattern |first2=Friedemann |title=Detecting causal relationships in distributed computations: In search of the holy grail |journal=Distributed Computing |date=March 1994 |volume=7 |issue=3 |pages=149–174 |doi=10.1007/BF02277859|s2cid=3065996 |url=https://nbn-resolving.org/urn:nbn:de:hbz:386-kluedo-4006 }}</ref> At least 6 papers contain the concept.<ref>{{cite web |last1=Kuper |first1=Lindsey |title=Who invented vector clocks? |url=https://decomposition.al/blog/2023/04/08/who-invented-vector-clocks/ |website=decomposition ∘ al |language=en |date=8 April 2023}} The papers are (in chronological order):
* {{cite book |last1=Fischer |first1=Michael J. 
|last2=Michael |first2=Alan |title=Proceedings of the 1st ACM SIGACT-SIGMOD symposium on Principles of database systems - PODS '82 |chapter=Sacrificing serializability to attain high availability of data in an unreliable network |date=1982 |pages=70 |doi=10.1145/588111.588124|isbn=0897910702 |s2cid=8774876 }}
* {{cite journal |last1=Parker |first1=D.S. |last2=Popek |first2=G.J. |last3=Rudisin |first3=G. |last4=Stoughton |first4=A. |last5=Walker |first5=B.J. |last6=Walton |first6=E. |last7=Chow |first7=J.M. |last8=Edwards |first8=D. |last9=Kiser |first9=S. |last10=Kline |first10=C. |title=Detection of Mutual Inconsistency in Distributed Systems |journal=IEEE Transactions on Software Engineering |date=May 1983 |volume=SE-9 |issue=3 |pages=240–247 |doi=10.1109/TSE.1983.236733|s2cid=2483222 }}
* {{cite book |last1=Wuu |first1=Gene T.J. |last2=Bernstein |first2=Arthur J. |title=Proceedings of the third annual ACM symposium on Principles of distributed computing - PODC '84 |chapter=Efficient solutions to the replicated log and dictionary problems |date=1984 |pages=233–242 |doi=10.1145/800222.806750|isbn=0897911431 |s2cid=2384672 }}
* {{cite journal |last1=Strom |first1=Rob |last2=Yemini |first2=Shaula |title=Optimistic recovery in distributed systems |journal=ACM Transactions on Computer Systems |date=August 1985 |volume=3 |issue=3 |pages=204–226 |doi=10.1145/3959.3962|s2cid=1941122 |doi-access=free }}
* {{cite tech report |last1=Schmuck |first1=Frank B. 
|title=Software clocks and the order of events in a distributed system |date=November 1985 |type=unpublished }}
* {{cite book |last1=Liskov |first1=Barbara |last2=Ladin |first2=Rivka |title=Proceedings of the fifth annual ACM symposium on Principles of distributed computing - PODC '86 |chapter=Highly available distributed services and fault-tolerant distributed garbage collection |date=1986 |pages=29–39 |doi=10.1145/10590.10593|isbn=0897911989 |s2cid=16148617 }}
* {{cite journal |last1=Raynal |first1=Michel |title=A distributed algorithm to prevent mutual drift between n logical clocks |journal=Information Processing Letters |date=February 1987 |volume=24 |issue=3 |pages=199–202 |doi=10.1016/0020-0190(87)90186-4}}
</ref> The papers canonically cited in reference to vector clocks are Colin Fidge’s and [[Friedemann Mattern]]’s 1988 works,<ref>{{cite conference|first=Colin J.|last=Fidge|date=February 1988|title=Timestamps in message-passing systems that preserve the partial ordering|book-title=Proceedings of the 11th Australian Computer Science Conference (ACSC'88)|volume=10|issue=1 | editor = K. Raymond|pages=56–66 | url = http://zoo.cs.yale.edu/classes/cs426/2012/lab/bib/fidge88timestamps.pdf | access-date = 2009-02-13}}</ref><ref>{{cite conference|title=Virtual Time and Global States of Distributed systems|book-title=Proc. Workshop on Parallel and Distributed Algorithms|first=Friedemann|last=Mattern | editor-last=Cosnard | editor-first=M. | place=Chateau de Bonas, France |date=October 1988 |publisher=Elsevier | pages=215–226}}</ref> as they (independently) established the name "vector clock" and the mathematical properties of vector clocks.<ref name=Schwarz/>

==Partial ordering property==
Vector clocks allow for the partial causal ordering of events. Define the following:
* <math>VC(x)</math> denotes the vector clock of event <math>x</math>, and <math>VC(x)_z</math> denotes the component of that clock for process <math>z</math>.
* <math>VC(x) < VC(y) \iff \forall z [VC(x)_z \le VC(y)_z] \land \exists z' [ VC(x)_{z'} < VC(y)_{z'} ]</math>
** In English: <math>VC(x)</math> is less than <math>VC(y)</math> if and only if <math>VC(x)_z</math> is less than or equal to <math>VC(y)_z</math> for all process indices <math>z</math>, and at least one of those inequalities is strict (that is, <math>VC(x)_{z'} < VC(y)_{z'}</math> for some index <math>z'</math>).
* <math>x \to y\;</math> denotes that event <math>x</math> happened before event <math>y</math>. It is related to the vector clock order as follows: if <math>x \to y\;</math>, then <math>VC(x) < VC(y)</math>.

Properties:
* [[antisymmetric relation|Antisymmetry]]: if <math>VC(a) < VC(b)</math>, then ¬<math>(VC(b) < VC(a))</math>
* [[transitive relation|Transitivity]]: if <math>VC(a) < VC(b)</math> and <math>VC(b) < VC(c)</math>, then <math>VC(a) < VC(c)</math>; or, if <math>a \to b\;</math> and <math>b \to c\;</math>, then <math>a \to c\;</math>

== Relation with other orders ==
* Let <math>RT(x)</math> be the real time when event <math>x</math> occurs. If <math>VC(a) < VC(b)</math>, then <math>RT(a) < RT(b)</math>
* Let <math>C(x)</math> be the [[Lamport timestamps|Lamport timestamp]] of event <math>x</math>. If <math>VC(a) < VC(b)</math>, then <math>C(a) < C(b)</math>

== Limitations under Byzantine Failures ==
Vector clocks can reliably detect causality in distributed systems subject to crash failures. However, when processes behave arbitrarily or maliciously, as in the Byzantine failure model, causality detection becomes fundamentally impossible,<ref>{{cite conference | last1 = Misra | first1 = Anshuman | last2 = Kshemkalyani | first2 = Ajay D. | title = Detecting Causality in the Presence of Byzantine Processes: There is No Holy Grail | book-title = 2022 IEEE 21st International Symposium on Network Computing and Applications (NCA) | year = 2022 | pages = 73–80 | doi = 10.1109/NCA57778.2022.10013644 | publisher = IEEE }}</ref> rendering vector clocks ineffective in such environments.
This impossibility result holds for all variants of vector clocks, as it stems from core limitations inherent to the problem of causality detection under Byzantine faults.

==Other mechanisms==
{{Incomplete list|date=June 2023}}
* In 1999, Torres-Rojas and Ahamad developed '''Plausible Clocks''',<ref>{{Citation |author1=Francisco Torres-Rojas |author2=Mustaque Ahamad |title=Plausible clocks: constant size logical clocks for distributed systems |journal=Distributed Computing |volume=12 |issue=4 |year=1999 |pages=179–195 |doi=10.1007/s004460050065 |s2cid=2936350 |url=https://www.cc.gatech.edu/fac/Mustaque.Ahamad/pubs/plausible.ps|url-access=subscription }}</ref> a mechanism that takes less space than vector clocks but that, in some cases, will totally order events that are causally concurrent.
* In 2005, Agarwal and Garg created '''Chain Clocks''',<ref>{{cite book |last1=Agarwal |first1=Anurag |last2=Garg |first2=Vijay K. |title=Proceedings of the twenty-fourth annual ACM symposium on Principles of distributed computing |chapter=Efficient dependency tracking for relevant events in shared-memory systems |date=17 July 2005 |pages=19–28 |doi=10.1145/1073814.1073818 |chapter-url=http://users.ece.utexas.edu/~garg/dist/agarwal-garg-DC.pdf |access-date=21 April 2021 |publisher=Association for Computing Machinery|isbn=1-58113-994-2 |s2cid=11779779 }}</ref> a system that tracks dependencies using vectors smaller than the number of processes and that adapts automatically to systems with a dynamic number of processes.
* In 2008, Almeida ''et al.'' introduced '''Interval Tree Clocks'''.<ref>{{Citation | last1=Almeida | first1=Paulo | last2=Baquero | first2=Carlos | last3=Fonte | first3=Victor | contribution=Interval Tree Clocks: A Logical Clock for Dynamic Systems | title=Principles of Distributed Systems | volume=5401 | publisher=Springer-Verlag, Lecture Notes in Computer Science | year=2008 | doi=10.1007/978-3-540-92221-6 | url=http://gsd.di.uminho.pt/members/cbm/ps/itc2008.pdf | pages=259–274 | series=Lecture Notes in Computer Science | bibcode=2008LNCS.5401.....B | editor1-last=Baker | editor1-first=Theodore P. | editor2-last=Bui | editor2-first=Alain | editor3-last=Tixeuil | editor3-first=Sébastien | isbn=978-3-540-92220-9 }}</ref><ref>{{Citation | last1=Almeida | first1=Paulo | last2=Baquero | first2=Carlos | last3=Fonte | first3=Victor | title=Interval Tree Clocks: A Logical Clock for Dynamic Systems | volume=5401 | pages=259 | contribution=Interval Tree Clocks: A Logical Clock for Dynamic Systems | year=2008 | doi=10.1007/978-3-540-92221-6_18 | url=https://www.researchgate.net/publication/235246938 | series=Lecture Notes in Computer Science | isbn=978-3-540-92220-9 | hdl=1822/37748 | hdl-access=free }}</ref><ref>{{Citation | last1=Zhang | first1=Yi | title=Background Preliminaries: Interval Tree Clock Results | contribution=Background Preliminaries: Interval Tree Clock Results | year=2014 | url=https://cs.uwaterloo.ca/~mkarsten/cs755-F14/presentations/ITC.pdf }}</ref> This mechanism generalizes vector clocks and allows operation in dynamic environments where the identities and number of processes in the computation are not known in advance.
* In 2019, Lum Ramabaja proposed '''Bloom Clocks''', a probabilistic data structure based on [[Bloom filters]].<ref>{{cite journal |last1=Pozzetti |first1=Tommaso |last2=Kshemkalyani |first2=Ajay D. 
|title=Resettable Encoded Vector Clock for Causality Analysis With an Application to Dynamic Race Detection |journal=IEEE Transactions on Parallel and Distributed Systems |date=1 April 2021 |volume=32 |issue=4 |pages=772–785 |doi=10.1109/TPDS.2020.3032293|s2cid=220362525 |doi-access=free }}</ref><ref>{{Citation |author1=Lum Ramabaja |title=The Bloom Clock |year=2019 |arxiv=1905.13064 |bibcode=2019arXiv190513064R }}</ref><ref>{{cite book |last1=Kulkarni |first1=Sandeep S |last2=Appleton |first2=Gabe |last3=Nguyen |first3=Duong |title=Proceedings of the 23rd International Conference on Distributed Computing and Networking |chapter=Achieving Causality with Physical Clocks |date=4 January 2022 |pages=97–106 |doi=10.1145/3491003.3491009|arxiv=2104.15099 |isbn=9781450395601 |s2cid=233476293 }}</ref> Compared to a vector clock, the space used per node is fixed and does not depend on the number of nodes in a system. Comparing two clocks either produces a true negative (the clocks are not comparable), or else a suggestion that one clock precedes the other, with the possibility of a false positive where the two clocks are unrelated. The false positive rate decreases as more storage is allowed. 
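The comparison behaviour described for Bloom Clocks can be illustrated with a rough sketch. It assumes that the clock is a fixed-size array of counters (a counting Bloom filter), that each event increments a few hashed slots, and that a received clock is merged by element-wise maximum. The class name <code>BloomClock</code>, the helper <code>possibly_before</code>, the array size, and the SHA-256-based slot selection are illustrative choices for this example and are not taken from the cited papers.

```python
import hashlib
from typing import List


class BloomClock:
    """Fixed-size counter array; size does not grow with node count."""

    def __init__(self, size: int = 16, hashes: int = 3) -> None:
        self.size = size
        self.hashes = hashes
        self.counters = [0] * size

    def _slots(self, event_id: str) -> List[int]:
        # Derive `hashes` slot indices from the event identifier.
        slots = []
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{event_id}".encode()).digest()
            slots.append(int.from_bytes(digest[:8], "big") % self.size)
        return slots

    def tick(self, event_id: str) -> None:
        # Record an event by incrementing each of its hashed slots.
        for slot in self._slots(event_id):
            self.counters[slot] += 1

    def merge(self, other: List[int]) -> None:
        # Absorb a received clock by element-wise maximum.
        self.counters = [max(a, b) for a, b in zip(self.counters, other)]


def possibly_before(a: List[int], b: List[int]) -> bool:
    # False is a definite answer: a does not precede b.
    # True is only a suggestion and may be a false positive.
    return a != b and all(x <= y for x, y in zip(a, b))


node_a, node_b = BloomClock(), BloomClock()
node_a.tick("a:1")
sent = list(node_a.counters)  # clock attached to a message from node_a
node_b.merge(sent)
node_b.tick("b:1")            # receive event at node_b
```

Here <code>possibly_before(sent, node_b.counters)</code> holds, as expected for a send followed by its receive. With a larger <code>size</code>, two unrelated clocks are less likely to look ordered by accident, which corresponds to the lower false positive rate mentioned above.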
==See also==
*[[Lamport timestamps]]
*[[Matrix clock]]s
*[[Version vector]]

==References==
{{Reflist|30em}}

== External links ==
* [http://queue.acm.org/detail.cfm?id=2917756 Why Logical Clocks are Easy (Compares Causal Histories, Vector Clocks and Version Vectors)]
* [http://basho.com/why-vector-clocks-are-easy/ Explanation of Vector clocks]
* [https://github.com/cliffmoon/dynomite/blob/master/elibs/vector_clock.erl Timestamp-based vector clock implementation in Erlang]
* [https://github.com/jeremytregunna/JVectorClock Vector clock implementation in Objective-C]
* [https://github.com/basho/riak_core/blob/master/src/vclock.erl Vector clock implementation in Erlang]
* [http://basho.com/why-vector-clocks-are-hard/ Why Vector Clocks are Hard]
* [http://www.datastax.com/dev/blog/why-cassandra-doesnt-need-vector-clocks Why Cassandra doesn’t need vector clocks]

{{DEFAULTSORT:Vector Clock}}
[[Category:Logical clock algorithms]]