==Overview==
An algorithm is considered efficient if its resource consumption, also known as computational cost, is at or below some acceptable level. Roughly speaking, 'acceptable' means: it will run in a reasonable amount of time or space on an available computer, typically as a [[function (mathematics)|function]] of the size of the input. Since the 1950s, computers have seen dramatic increases in both the available computational power and the available amount of memory, so current acceptable levels would have been unacceptable even 10 years ago. In fact, thanks to the [[Moore's law|approximate doubling of computer power every 2 years]], tasks that are acceptably efficient on modern [[smartphone]]s and [[embedded system]]s may have been unacceptably inefficient for industrial [[server (computing)|server]]s 10 years ago.

Computer manufacturers frequently bring out new models, often with higher [[computer performance|performance]]. Software costs can be quite high, so in some cases the simplest and cheapest way of getting higher performance might be to just buy a faster computer, provided it is [[Backward compatibility|compatible]] with an existing computer.

There are many ways in which the resources used by an algorithm can be measured: the two most common measures are speed and memory usage; other measures could include transmission speed, temporary disk usage, long-term disk usage, power consumption, [[total cost of ownership]], [[response time (technology)|response time]] to external stimuli, etc. Many of these measures depend on the size of the input to the algorithm, i.e. the amount of data to be processed. They might also depend on the way in which the data is arranged; for example, some [[sorting algorithm]]s perform poorly on data which is already sorted, or which is sorted in reverse order.

In practice, there are other factors which can affect the efficiency of an algorithm, such as requirements for accuracy and/or reliability. As detailed below, the way in which an algorithm is implemented can also have a significant effect on actual efficiency, though many aspects of this relate to [[optimization (computer science)|optimization]] issues.

===Theoretical analysis===
In the theoretical [[analysis of algorithms]], the normal practice is to estimate their complexity in the asymptotic sense. The most commonly used notation to describe resource consumption or "complexity" is [[Donald Knuth]]'s [[Big O notation]], representing the complexity of an algorithm as a function of the size of the input <math display="inline">n</math>. Big O notation is an [[Asymptote|asymptotic]] measure of function complexity, where <math display="inline">f(n) = O\bigl(g(n)\bigr)</math> roughly means the time requirement for an algorithm is proportional to <math>g(n)</math>, omitting [[lower-order terms]] that contribute less than <math>g(n)</math> to the growth of the function as <math display="inline">n</math> [[limit (mathematics)|grows arbitrarily large]]. This estimate may be misleading when <math display="inline">n</math> is small, but is generally sufficiently accurate when <math display="inline">n</math> is large, as the notation is asymptotic. For example, bubble sort may be faster than [[merge sort]] when only a few items are to be sorted; however, either implementation is likely to meet performance requirements for a small list.
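The contrast can be made concrete with a short timing experiment (an illustrative [[Python (programming language)|Python]] sketch; the function names and test sizes are arbitrary choices, not drawn from a cited source) that runs a naive bubble sort and a merge sort on the same random data at growing sizes:

<syntaxhighlight lang="python">
import random
import time

def bubble_sort(items):
    """Naive O(n^2) bubble sort: repeatedly swap adjacent out-of-order pairs."""
    a = list(items)
    n = len(a)
    for i in range(n):
        for j in range(n - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return a

def merge_sort(items):
    """O(n log n) merge sort: sort each half, then merge the sorted halves."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left, right = merge_sort(items[:mid]), merge_sort(items[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

for n in (100, 1_000, 4_000):
    data = [random.random() for _ in range(n)]
    t0 = time.perf_counter()
    bubble_sort(data)
    t1 = time.perf_counter()
    merge_sort(data)
    t2 = time.perf_counter()
    print(f"n={n:>5}  bubble: {t1 - t0:.4f}s  merge: {t2 - t1:.4f}s")
</syntaxhighlight>

Quadrupling <math display="inline">n</math> from 1,000 to 4,000 multiplies the bubble sort time by roughly sixteen, but the merge sort time by only a little more than four, matching the <math>O(n^2)</math> and <math>O(n\log n)</math> estimates; at <math display="inline">n=100</math> both finish essentially instantly.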
Typically, programmers are interested in algorithms that [[Scalability|scale]] efficiently to large input sizes, and merge sort is preferred over bubble sort for lists of the lengths encountered in most data-intensive programs. Some examples of Big O notation applied to algorithms' asymptotic time complexity include:

{| class="wikitable"
|-
! Notation !! Name !! Examples
|-
| <math>O(1)</math> || [[Constant time|constant]] || Finding the median from a sorted list of measurements; using a constant-size [[lookup table]]; using a suitable [[hash function]] for looking up an item.
|-
| <math>O(\log n)</math> || [[Logarithmic time|logarithmic]] || Finding an item in a sorted array with a [[Binary search algorithm|binary search]] or a balanced search [[Tree data structure|tree]], as well as all operations in a [[binomial heap]].
|-
| <math>O(n)</math> || [[linear time|linear]] || Finding an item in an unsorted list, an unsorted array, or a malformed tree (worst case); adding two ''n''-bit integers by [[Ripple carry adder|ripple carry]].
|-
| <math>O(n\log n)</math> || [[Linearithmic time|linearithmic]], loglinear, or quasilinear || Performing a [[fast Fourier transform]]; [[heapsort]], [[quicksort]] ([[Best, worst and average case|best and average case]]), or [[merge sort]].
|-
| <math>O(n^2)</math> || [[quadratic time|quadratic]] || [[Multiplication|Multiplying]] two ''n''-digit numbers by [[Long multiplication|a simple algorithm]]; [[bubble sort]] (worst case or naive implementation), [[Shell sort]], quicksort ([[Best, worst and average case|worst case]]), [[selection sort]] or [[insertion sort]].
|-
| <math>O(c^n),\;c>1</math> || [[exponential time|exponential]] || Finding the optimal (non-[[Travelling salesman problem#Heuristic and approximation algorithms|approximate]]) solution to the [[travelling salesman problem]] using [[dynamic programming]]; [[Boolean satisfiability problem|determining if two logical statements are equivalent]] using [[brute-force search]].
|}
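For example, the logarithmic row above can be illustrated with a short [[Python (programming language)|Python]] implementation of binary search (an illustrative sketch, not drawn from a cited source): each comparison halves the remaining range, so a sorted array of a million items is searched in about twenty steps.

<syntaxhighlight lang="python">
from bisect import bisect_left

def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if it is absent.

    Each iteration halves the remaining range, so at most about
    log2(len(sorted_items)) iterations are needed: O(log n) time.
    """
    lo, hi = 0, len(sorted_items)
    while lo < hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    if lo < len(sorted_items) and sorted_items[lo] == target:
        return lo
    return -1

data = list(range(0, 2_000_000, 2))   # a million even numbers, already sorted
print(binary_search(data, 123456))    # index 61728, found in ~20 steps
print(binary_search(data, 123457))    # -1: odd numbers are absent
print(bisect_left(data, 123456))      # the standard-library equivalent
</syntaxhighlight>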
===Measuring performance===
For new versions of software, or to provide comparisons with competitive systems, [[benchmark (computing)|benchmark]]s are sometimes used, which assist with gauging an algorithm's relative performance. If a new [[sorting algorithm|sort algorithm]] is produced, for example, it can be compared with its predecessors to ensure that it is at least as efficient as before with known data, taking into consideration any functional improvements. Benchmarks can be used by customers when comparing various products from alternative suppliers to estimate which product will best suit their specific requirements in terms of functionality and performance. For example, in the [[mainframe computer|mainframe]] world certain proprietary [[Mainframe sort merge|sort]] products from independent software companies such as [[Syncsort]] compete with products from the major suppliers such as [[IBM]] for speed.

Some benchmarks provide opportunities for producing an analysis comparing the relative speed of various compiled and interpreted languages, for example,<ref name="fourmilab.ch" /><ref>{{cite web |url=http://www.roylongbottom.org.uk/whetstone.htm#anchorPC2 |title=Whetstone Benchmark History |publisher=Roylongbottom.org.uk |access-date=14 December 2011}}</ref> and [[The Computer Language Benchmarks Game]] compares the performance of implementations of typical programming problems in several programming languages. Even creating "[[do it yourself]]" benchmarks can demonstrate the relative performance of different programming languages, using a variety of user-specified criteria, as a "Nine Language Performance Round-up" by Christopher W. Cowell-Shah demonstrates by example.<ref>{{Cite web |url=http://www.osnews.com/story/5602 |title=Nine Language Performance Round-up: Benchmarking Math & File I/O |author=OSNews Staff |website=osnews.com |access-date=2018-09-18}}</ref>

===Implementation concerns===
Implementation issues can also have an effect on efficiency, such as the choice of programming language, or the way in which the algorithm is actually coded,<ref name="KriegelSchubert2016">{{cite journal |last1=Kriegel |first1=Hans-Peter |author-link=Hans-Peter Kriegel |last2=Schubert |first2=Erich |last3=Zimek |first3=Arthur |author-link3=Arthur Zimek |title=The (black) art of runtime evaluation: Are we comparing algorithms or implementations? |journal=Knowledge and Information Systems |volume=52 |issue=2 |year=2016 |pages=341–378 |issn=0219-1377 |doi=10.1007/s10115-016-1004-2 |s2cid=40772241}}</ref> or the choice of a [[compiler]] for a particular language, or the [[compiler optimization|compilation options]] used, or even the [[operating system]] being used. In many cases a language implemented by an [[interpreter (computing)|interpreter]] may be much slower than a language implemented by a compiler.<ref name="fourmilab.ch">{{cite web |url=http://www.fourmilab.ch/fourmilog/archives/2005-08/000567.html |title=Floating Point Benchmark: Comparing Languages (Fourmilog: None Dare Call It Reason) |publisher=Fourmilab.ch |date=4 August 2005 |access-date=14 December 2011}}</ref> See the articles on [[just-in-time compilation]] and [[interpreted language]]s.

There are other factors which may affect time or space issues, but which may be outside of a programmer's control; these include [[data alignment]], [[granularity#Data granularity|data granularity]], [[locality of reference|cache locality]], [[cache coherence|cache coherency]], [[garbage collection (computer science)|garbage collection]], [[instruction-level parallelism]], [[Multithreading (disambiguation)|multi-threading]]<!--Intentional link to DAB page--> (at either a hardware or software level), [[Simultaneous multithreading|simultaneous multitasking]], and [[subroutine]] calls.<ref name="steele1997">Guy Lewis Steele, Jr. "Debunking the 'Expensive Procedure Call' Myth, or, Procedure Call Implementations Considered Harmful, or, Lambda: The Ultimate GOTO". MIT AI Lab. AI Lab Memo AIM-443. October 1977. [http://dspace.mit.edu/handle/1721.1/5753]</ref>

Some processors have capabilities for [[vector processor|vector processing]], which allow a [[SIMD|single instruction to operate on multiple operands]]; it may or may not be easy for a programmer or compiler to use these capabilities. Algorithms designed for sequential processing may need to be completely redesigned to make use of [[parallel computing|parallel processing]], or they could be easily reconfigured. As [[parallel computing|parallel]] and [[distributed computing]] have grown in importance since the late 2010s, more investments are being made into efficient [[high-level programming language|high-level]] [[Application programming interface|API]]s for parallel and distributed computing systems such as [[CUDA]], [[TensorFlow]], [[Apache Hadoop|Hadoop]], [[OpenMP]] and [[Message Passing Interface|MPI]].
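The redesign mentioned above often amounts to splitting the work into independent pieces. The following sketch (illustrative only; it uses Python's standard <code>concurrent.futures</code> module rather than any of the frameworks named above, and the chunk sizes are arbitrary) runs the same CPU-bound computation first sequentially and then across several worker processes:

<syntaxhighlight lang="python">
from concurrent.futures import ProcessPoolExecutor
import math

def count_primes(bounds):
    """Count primes in [lo, hi) by trial division (deliberately CPU-bound)."""
    lo, hi = bounds
    count = 0
    for n in range(max(lo, 2), hi):
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            count += 1
    return count

if __name__ == "__main__":
    limit, workers = 200_000, 4
    step = limit // workers
    chunks = [(i * step, (i + 1) * step) for i in range(workers)]

    # Sequential: each chunk is processed one after another on a single core.
    sequential = sum(count_primes(chunk) for chunk in chunks)

    # Parallel: the same independent chunks are spread across worker processes.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        parallel = sum(pool.map(count_primes, chunks))

    assert sequential == parallel   # same answer, different execution strategy
    print(sequential)               # 17984 primes below 200,000
</syntaxhighlight>

On a multi-core machine the parallel version typically finishes in a fraction of the sequential time, although process start-up and result-passing overhead keep the speed-up below the worker count.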
Another problem which can arise in programming is that processors compatible with the same [[instruction set architecture|instruction set]] (such as [[x86-64]] or [[ARM architecture|ARM]]) may implement an instruction in different ways, so that instructions which are relatively fast on some models may be relatively slow on others. This often presents challenges to [[optimizing compiler]]s, which must have extensive knowledge of the specific [[Central processing unit|CPU]] and other hardware available on the compilation target to best optimize a program for performance. In the extreme case, a compiler may be forced to [[software emulation|emulate]] instructions not supported on a compilation target platform, forcing it to [[code generation (compiler)|generate code]] or [[linking (computing)|link]] an external [[library (computing)|library]] call to produce a result that is otherwise incomputable on that platform, even if it is natively supported and more efficient in hardware on other platforms. This is often the case in [[embedded system]]s with respect to [[floating-point arithmetic]], where small and [[low-power computing|low-power]] [[microcontroller]]s often lack hardware support for floating-point arithmetic and thus require computationally expensive software routines to perform floating-point calculations.
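A toy illustration of the kind of software workaround involved is fixed-point arithmetic, a common alternative to full floating-point emulation on FPU-less microcontrollers (the Q16.16 format and helper names below are illustrative assumptions; a real soft-float library emulates [[IEEE 754]] in far more detail). Fractional values are stored as scaled integers, so only integer instructions are required:

<syntaxhighlight lang="python">
# Q16.16 fixed point: a real number x is stored as the integer round(x * 2**16),
# so the low 16 bits hold the fraction and only integer instructions are needed.
SCALE_BITS = 16
SCALE = 1 << SCALE_BITS

def to_fixed(x: float) -> int:
    return int(round(x * SCALE))

def from_fixed(f: int) -> float:
    return f / SCALE

def fixed_mul(a: int, b: int) -> int:
    # The raw product carries 32 fractional bits; shift back down to 16.
    return (a * b) >> SCALE_BITS

def fixed_div(a: int, b: int) -> int:
    # Pre-shift the dividend so the quotient keeps 16 fractional bits.
    return (a << SCALE_BITS) // b

a, b = to_fixed(3.25), to_fixed(0.5)
print(from_fixed(fixed_mul(a, b)))   # 1.625, computed with integer ops only
print(from_fixed(fixed_div(a, b)))   # 6.5
</syntaxhighlight>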