==Basic results==
We write <math>K(x, y)</math> for <math>K((x,y))</math>, where <math>(x, y)</math> denotes some fixed way to code the tuple of strings ''x'' and ''y''.

=== Inequalities ===
We omit additive factors of <math>O(1)</math>. This section is based on.<ref name=":0" />

'''Theorem.''' <math>K(x) \leq C(x) + 2\log_2 C(x) </math>

'''Proof.''' Take any program for the universal Turing machine used to define plain complexity, and convert it to a prefix-free program by first coding the length of the program in binary, then converting the length to prefix-free coding. For example, if the program has length 9, we can convert it as follows:<math display="block">9 \mapsto 1001 \mapsto 11-00-00-11-\color{red}{01} </math>where we double each digit, then add a termination code. The prefix-free universal Turing machine can then read in any program for the other machine as follows:<math display="block">[\text{code for simulating the other machine}][\text{coded length of the program}][\text{the program}]</math>The first part programs the machine to simulate the other machine and is a constant overhead <math>O(1)</math>. The second part has length <math>\leq 2 \log_2 C(x) + 3</math>. The third part has length <math>C(x)</math>.

'''Theorem.''' There exists <math>c</math> such that <math>\forall x, C(x) \leq |x| + c</math>. More succinctly, <math>C(x) \leq |x|</math>. Similarly, <math>K(x) \leq |x| + 2\log_2 |x|</math>, and <math>K(x | |x|) \leq |x| </math>.{{clarify|reason=This appears to be the first use of conditional complexity notation. Its definition should occur before here. Moreover, notation could better be changed to avoid confusion with string length (maybe, string length is better denoted by '#x').|date=January 2024}}

'''Proof.''' For the plain complexity, just write a program that copies the input to the output. For the prefix-free complexity, we need to first describe the length of the string before writing out the string itself.

'''Theorem. 
(extra information bounds, subadditivity)'''
* <math>K(x | y) \leq K(x) \leq K(x, y) \leq \max( K(x|y) + K(y), K(y|x) + K(x)) \leq K(x) + K(y) </math>
* <math>K(xy) \leq K(x, y)</math>

Note that, in general, there is no way to compare <math>K(xy)</math> with <math>K(x|y)</math>, <math>K(x)</math>, <math>K(y|x)</math>, or <math>K(y)</math>: there are strings such that the whole string <math>xy</math> is easy to describe, but its substrings are very hard to describe.

'''Theorem. (symmetry of information)''' <math>K(x, y) = K(x | y, K(y)) + K(y) = K(y, x)</math>.

'''Proof.''' One direction is simple. For the other direction, <math>K(x, y) \geq K(x | y, K(y)) + K(y)</math>, we need a counting argument (page 38 <ref>{{Cite book |last=Hutter |first=Marcus |title=Universal artificial intelligence: sequential decisions based on algorithmic probability |date=2005 |publisher=Springer |isbn=978-3-540-26877-2 |series=Texts in theoretical computer science |location=Berlin New York}}</ref>).

'''Theorem. (information non-increase)''' For any computable function <math>f</math>, we have <math>K(f(x)) \leq K(x) + K(f)</math>.

'''Proof.''' Program the Turing machine to read two subsequent programs, one describing the function and one describing the string. Then run both programs on the work tape to produce <math>f(x)</math>, and write it out.

===Uncomputability of Kolmogorov complexity===

==== A naive attempt at a program to compute ''K'' ====
At first glance it might seem trivial to write a program which can compute ''K''(''s'') for any ''s'', such as the following:

 '''function''' KolmogorovComplexity('''string''' s)
   '''for''' i = 1 '''to''' infinity:
     '''for each''' string p '''of''' length exactly i
       '''if''' isValidProgram(p) '''and''' evaluate(p) == s
         '''return''' i

This program iterates through all possible programs (by iterating through all possible strings and only considering those which are valid programs), starting with the shortest.
Each program is executed to find the result produced by that program, comparing it to the input ''s''. If the result matches, the length of the program is returned.

However, this will not work because some of the programs ''p'' tested will not terminate, e.g. if they contain infinite loops. There is no way to avoid all of these programs by testing them in some way before executing them, due to the non-computability of the [[halting problem]]. What is more, no program at all can compute the function ''K'', be it ever so sophisticated. This is proven in the following.

==== Formal proof of uncomputability of ''K'' ====
'''Theorem''': There exist strings of arbitrarily large Kolmogorov complexity. Formally: for each natural number ''n'', there is a string ''s'' with ''K''(''s'') ≥ ''n''.<ref group="note">However, an ''s'' with ''K''(''s'') = ''n'' need not exist for every ''n''. For example, if ''n'' is not a multiple of 7, no [[ASCII]] program can have a length of exactly ''n'' bits.</ref>

'''Proof:''' Otherwise all of the infinitely many possible finite strings could be generated by the finitely many<ref group="note">There are 1 + 2 + 2<sup>2</sup> + 2<sup>3</sup> + ... + 2<sup>''n''</sup> = 2<sup>''n''+1</sup> − 1 different program texts of length up to ''n'' bits; cf. [[geometric series]]. If program lengths are to be multiples of 7 bits, even fewer program texts exist.</ref> programs with a complexity below ''n'' bits.

'''Theorem''': ''K'' is not a [[computable function]]. In other words, there is no program which takes any string ''s'' as input and produces the integer ''K''(''s'') as output.

The following [[proof by contradiction|'''proof''' by contradiction]] uses a simple [[Pascal (programming language)|Pascal]]-like language to denote programs; for sake of proof simplicity, assume its description (i.e. an [[interpreter (computing)|interpreter]]) to have a length of {{val|1400000}} bits.
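The geometric-series count in the footnote above (the number of binary program texts of length up to ''n'' bits) can be checked directly; this is an illustrative sketch, not part of the proof, and the function name is chosen for illustration only:

```python
def program_texts_up_to(n: int) -> int:
    # Count all binary strings of length 0 through n bits:
    # 1 + 2 + 2^2 + ... + 2^n, a finite geometric series.
    return sum(2 ** i for i in range(n + 1))

# The closed form from the footnote: 2^(n+1) - 1.
for n in range(12):
    assert program_texts_up_to(n) == 2 ** (n + 1) - 1
```

So only finitely many program texts fit below any length bound, while there are infinitely many finite strings, which is the heart of the proof.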
Assume for contradiction there is a program

 '''function''' KolmogorovComplexity('''string''' s)

which takes as input a string ''s'' and returns ''K''(''s''). All programs are of finite length, so, for sake of proof simplicity, assume it to be {{val|7000000000}} bits. Now, consider the following program of length {{val|1288}} bits:

 '''function''' GenerateComplexString()
   '''for''' i = 1 '''to''' infinity:
     '''for each''' string s '''of''' length exactly i
       '''if''' KolmogorovComplexity(s) ≥ 8000000000
         '''return''' s

Using <code>KolmogorovComplexity</code> as a subroutine, the program tries every string, starting with the shortest, until it returns a string with Kolmogorov complexity at least {{val|8000000000}} bits,<ref group="note">By the previous theorem, such a string exists, hence the <code>for</code> loop will eventually terminate.</ref> i.e. a string that cannot be produced by any program shorter than {{val|8000000000}} bits. However, the overall length of the above program that produced ''s'' is only {{val|7001401288}} bits,<ref group=note>including the language interpreter and the subroutine code for <code>KolmogorovComplexity</code></ref> which is a contradiction. (If the code of <code>KolmogorovComplexity</code> is shorter, the contradiction remains. 
If it is longer, the constant used in <code>GenerateComplexString</code> can always be changed appropriately.)<ref group=note>If <code>KolmogorovComplexity</code> has length ''n'' bits, the constant ''m'' used in <code>GenerateComplexString</code> needs to be adapted to satisfy {{nowrap|''n'' + {{val|1400000}} + {{val|1218}} + 7·log<sub>10</sub>(''m'') < ''m''}}, which is always possible since ''m'' grows faster than log<sub>10</sub>(''m'').</ref>

The above proof uses a contradiction similar to that of the [[Berry paradox]]: "<sub>{{color|#8080ff|1}}</sub>The <sub>{{color|#8080ff|2}}</sub>smallest <sub>{{color|#8080ff|3}}</sub>positive <sub>{{color|#8080ff|4}}</sub>integer <sub>{{color|#8080ff|5}}</sub>that <sub>{{color|#8080ff|6}}</sub>cannot <sub>{{color|#8080ff|7}}</sub>be <sub>{{color|#8080ff|8}}</sub>defined <sub>{{color|#8080ff|9}}</sub>in <sub>{{color|#8080ff|10}}</sub>fewer <sub>{{color|#8080ff|11}}</sub>than <sub>{{color|#8080ff|12}}</sub>twenty <sub>{{color|#8080ff|13}}</sub>English <sub>{{color|#8080ff|14}}</sub>words".

It is also possible to show the non-computability of ''K'' by reduction from the non-computability of the halting problem ''H'', since ''K'' and ''H'' are [[Turing degree#Turing equivalence|Turing-equivalent]].<ref>Stated without proof in: {{cite web |url=http://www.daimi.au.dk/~bromille/DC05/Kolmogorov.pdf |title=Course notes for Data Compression - Kolmogorov complexity |archive-url=https://web.archive.org/web/20090909132048/http://www.daimi.au.dk/~bromille/DC05/Kolmogorov.pdf |archive-date=2009-09-09 |year=2005 |author=P. B. Miltersen |page=7 |url-status=dead}}</ref>

There is a corollary, humorously called the "[[full employment theorem]]" in the programming language community, stating that there is no perfect size-optimizing compiler.

===Chain rule for Kolmogorov complexity===
{{Main|Chain rule for Kolmogorov complexity}}
The chain rule<ref>{{cite journal | first = A. | last = Zvonkin |author2=L. 
Levin | title = The complexity of finite objects and the development of the concepts of information and randomness by means of the theory of algorithms. | journal = Russian Mathematical Surveys | volume = 25 | number = 6 | pages = 83–124 | year = 1970 | doi = 10.1070/RM1970v025n06ABEH001269 | bibcode = 1970RuMaS..25...83Z | s2cid = 250850390 | url = http://alexander.shen.free.fr/library/Zvonkin_Levin_70.pdf}}</ref> for Kolmogorov complexity states that there exists a constant ''c'' such that for all ''X'' and ''Y'':

:''K''(''X'',''Y'') = ''K''(''X'') + ''K''(''Y''|''X'') + ''c''·max(1, log(''K''(''X'',''Y''))).

It states that the shortest program that reproduces ''X'' and ''Y'' is [[Big-O notation|no more]] than a logarithmic term larger than a program to reproduce ''X'' and a program to reproduce ''Y'' given ''X''. Using this statement, one can define [[Mutual information#Absolute mutual information|an analogue of mutual information for Kolmogorov complexity]].
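The self-delimiting length code used in the first proof of this section (write the length in binary, double each bit, then append the terminator 01) can be sketched as follows; the helper name <code>encode_length</code> is hypothetical, chosen only for illustration:

```python
def encode_length(n: int) -> str:
    # Write n in binary, double each bit so the body uses only the
    # pairs 00 and 11, then append 01 as a termination code.
    bits = bin(n)[2:]
    return "".join(b + b for b in bits) + "01"

# The example from the proof: 9 -> 1001 -> 11 00 00 11 followed by 01.
print(encode_length(9))  # -> 1100001101
```

A decoder reads two bits at a time and stops at the first 01 pair, so no codeword is a prefix of another, which is exactly the property the prefix-free universal machine needs to know where the coded length ends and the program itself begins.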