{{Short description|Method in operations research and economics}}
'''Data envelopment analysis''' ('''DEA''') is a [[non-parametric statistics|nonparametric]] method in [[operations research]] and [[economics]] for the estimation of [[production-possibility frontier|production frontiers]].<ref name=":0">Charnes et al (1978)</ref> DEA has been applied in a wide range of fields including international banking, economic sustainability, police department operations, and logistical applications.<ref name="Charnes95">Charnes et al (1995)</ref><ref name=":15">Emrouznejad et al (2016)</ref><ref name=":16">Thanassoulis (1995)</ref> Additionally, DEA has been used to assess the performance of natural language processing models, and it has found other applications within machine learning.<ref>Koronakos and Sotiropoulos (2020)</ref><ref name=":18">Zhou et al (2022)</ref><ref name=":17">Guerrero et al (2022)</ref>

==Description==
DEA is used to [[empirically]] measure [[productive efficiency]] of decision-making units (DMUs). Although DEA has a strong link to [[Production (economics)|production theory]] in economics, the method is also used for [[benchmarking]] in [[operations management]], whereby a set of measures is selected to benchmark the performance of manufacturing and service operations.<ref>Mahmoudi et al (2021)</ref> In benchmarking, the efficient DMUs, as defined by DEA, may not necessarily form a “production frontier”, but rather lead to a “best-practice frontier.”<ref name=":0"/><ref>Sickles et al (2019)</ref>{{rp|243–285}}

In contrast to parametric methods that require the ''[[ex-ante]]'' specification of a production- or cost-function, non-parametric approaches compare feasible input and output combinations based on the available [[data]] only.<ref name=cooper2007>Cooper et al (2007)</ref> DEA, one of the most commonly used non-parametric methods, owes its name to the way the dataset's efficient DMUs envelop the data: the empirically observed, most efficient DMUs constitute the production frontier against which all DMUs are compared. DEA's popularity stems from its relative lack of assumptions, its ability to benchmark multi-dimensional inputs and outputs, and its computational ease: although its task is to calculate [[efficiency ratio]]s, it can be expressed as a [[linear program]].<ref name=ease>Cooper et al (2011)</ref>

==History==
Building on the ideas of Farrell,<ref name=farrell>Farrell (1957)</ref> the 1978 work "Measuring the efficiency of decision-making units" by [[Abraham Charnes|Charnes]], [[William W. Cooper|Cooper]] & [[Edwardo Rhodes|Rhodes]]<ref name=":0" /> applied linear programming to estimate, for the first time, an [[empirical]], production-technology frontier. In [[Germany]], the procedure had earlier been used to estimate the [[marginal productivity]] of [[R&D]] and other factors of production. Since then, there have been a large number of books and journal articles written on DEA or about applying DEA to various sets of problems.

Starting with the CCR model, named after Charnes, Cooper, and Rhodes,<ref name=":0"/> many extensions to DEA have been proposed in the literature. They range from adapting implicit model assumptions such as input and output orientation, distinguishing technical and allocative efficiency,<ref name=allo>Fried et al (2008)</ref> adding limited disposability<ref>Cooper et al (2000)</ref> of inputs/outputs or varying returns-to-scale<ref>Banker et al (1984)</ref> to techniques that utilize DEA results and extend them for more sophisticated analyses, such as stochastic DEA<ref name=":1">Olesen (2016)</ref> or cross-efficiency analysis.<ref name=":2" />

==Techniques==
In a one-input, one-output scenario, [[efficiency]] is merely the ratio of output to input, and comparing several entities/DMUs on that basis is trivial. However, as more inputs or outputs are added, the efficiency computation becomes more complex. Charnes, Cooper, and Rhodes (1978)<ref name=":0" /> in their basic DEA model (the CCR) define the objective function to find the efficiency <math>(\theta_j)</math> of <math>DMU_j</math> as:

:<math>\max \quad \theta_j = \frac{\sum\limits_{m=1}^{M}y_m^j u_m^j}{\sum\limits_{n=1}^{N}x_n^j v_n^j},</math>

where the <math>M</math> known outputs of <math>DMU_j</math>, <math>y_1^j,...,y_M^j</math>, are multiplied by their respective weights <math>u_1^j,...,u_M^j</math> and divided by the <math>N</math> inputs <math>x_1^j,...,x_N^j</math> multiplied by their respective weights <math>v_1^j,...,v_N^j</math>.

The efficiency score <math>\theta_j</math> is maximized under the constraints that, when those weights are applied to each <math>DMU_k \quad k=1,...,K</math>, no efficiency score exceeds one:

:<math>\frac{\sum\limits_{m=1}^{M}y_m^k u_m^j}{\sum\limits_{n=1}^{N}x_n^k v_n^j} \leq 1 \qquad k = 1,...,K,</math>

and all inputs, outputs and weights have to be non-negative. To allow for linear optimization, one typically constrains either the weighted sum of outputs or the weighted sum of inputs to equal a fixed value (typically 1; see the example below).
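
As a sketch of that normalization, fixing the weighted sum of inputs of <math>DMU_j</math> at 1 turns the ratio model above into a linear program in the same symbols. This is the usual multiplier form of the CCR model, written out here for illustration only:

:<math>\max \quad \theta_j = \sum\limits_{m=1}^{M} y_m^j u_m^j</math>

subject to

:<math>\sum\limits_{n=1}^{N} x_n^j v_n^j = 1,</math>
:<math>\sum\limits_{m=1}^{M} y_m^k u_m^j - \sum\limits_{n=1}^{N} x_n^k v_n^j \leq 0 \qquad k = 1,...,K,</math>
:<math>u_m^j, v_n^j \geq 0.</math>

Each ratio constraint has been multiplied through by its denominator, which is valid as long as that denominator is positive.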

Because this [[optimization]] problem's dimensionality is equal to the total number of its inputs and outputs, it is crucial to select the smallest set of inputs/outputs that collectively and accurately capture the process one attempts to characterize. And because the production frontier envelopment is done empirically, several guidelines exist on the minimum number of DMUs required for good discriminatory power of the analysis, given homogeneity of the sample. This minimum number of DMUs varies between twice the sum of inputs and outputs (<math>2 (M + N)</math>) and twice the product of inputs and outputs (<math>2 M N</math>).

Some advantages of the DEA approach are:
* no need to explicitly specify a mathematical form for the production function
* capable of handling multiple inputs and outputs
* capable of being used with any input-output measurement, although ordinal variables remain tricky
* the sources of inefficiency can be analysed and quantified for every evaluated unit
* using the dual of the optimization problem identifies which DMU is evaluating itself against which other DMUs
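
The dual mentioned in the last item can be sketched as follows for the input-oriented CCR model. The intensity variables <math>\lambda_k</math> are notation introduced here for illustration; the other symbols follow the model above:

:<math>\min \quad \theta_j</math>

subject to

:<math>\sum\limits_{k=1}^{K} \lambda_k x_n^k \leq \theta_j x_n^j \qquad n = 1,...,N,</math>
:<math>\sum\limits_{k=1}^{K} \lambda_k y_m^k \geq y_m^j \qquad m = 1,...,M,</math>
:<math>\lambda_k \geq 0 \qquad k = 1,...,K.</math>

The DMUs with <math>\lambda_k > 0</math> in an optimal solution are the peers against which <math>DMU_j</math> is evaluated. By linear programming duality, the optimal <math>\theta_j</math> equals the optimum of the weight-based formulation.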

Some of the disadvantages of DEA are:
* results are sensitive to the selection of inputs and outputs
* high-efficiency values can be obtained by being truly efficient or having a niche combination of inputs/outputs
* the number of efficient firms on the frontier increases with the number of input and output variables
* a DMU's efficiency scores may be obtained by using non-unique combinations of weights on the input and/or output factors

==Example==
Assume that we have the following data:

* Unit 1 produces 100 items per day, and the inputs per item are 10 dollars for materials and 2 labour-hours
* Unit 2 produces 80 items per day, and the inputs are 8 dollars for materials and 4 labour-hours
* Unit 3 produces 120 items per day, and the inputs are 12 dollars for materials and 1.5 labour-hours

To calculate the efficiency of unit 1, we define the objective function (OF) as

* Max efficiency: <math>(100u_1)/(10v_1+2v_2)</math>

which is subject to (ST) the constraint that the efficiency of every unit, including unit 1 itself, cannot be larger than 1:

*Efficiency of unit 1: <math>(100u_1)/(10v_1+2v_2)\leq 1</math>     
*Efficiency of unit 2: <math>(80u_1)/(8v_1+4v_2)\leq 1</math>        
*Efficiency of unit 3: <math>(120u_1)/(12v_1+1.5v_2)\leq 1</math>

and non-negativity:

*<math>u,v \geq 0</math>

A fraction with decision variables in the numerator and denominator is nonlinear. Since we are using a linear programming technique, we need to linearize the formulation, such that the denominator of the objective function is constant (in this case 1), then maximize the numerator.

The new formulation would be:

* OF
** Max efficiency: <math>100u_1</math>
*ST
** Efficiency of unit 1: <math>100u_1-(10v_1+2v_2)\leq 0</math>
** Efficiency of unit 2: <math display="inline">80u_1-(8v_1+4v_2)\leq 0</math>
** Efficiency of unit 3: <math>120u_1-(12v_1+1.5v_2)\leq 0</math>
** Denominator of nonlinear OF: <math>10v_1+2v_2=1</math>
** Non-negativity: <math>u,v \geq 0</math>
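
The linear formulation above is small enough to solve by brute force. The following is a minimal sketch, not a production solver: it assumes the three-unit data exactly as used in the formulation, relies on the fact that a bounded, feasible linear program attains its optimum at a vertex, and uses exact fractions. The helper names (<code>det3</code>, <code>solve3</code>, <code>ccr_efficiency</code>) are hypothetical and chosen for this illustration.

```python
from fractions import Fraction
from itertools import combinations

# Data as used in the formulation above: one output (items per day) and
# two inputs (materials in dollars, labour-hours) for units 1, 2 and 3.
outputs = [100, 80, 120]
inputs = [(10, 2), (8, 4), (12, Fraction(3, 2))]


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(rows, rhs):
    """Solve a 3x3 linear system by Cramer's rule; return None if singular."""
    d = det3(rows)
    if d == 0:
        return None
    solution = []
    for col in range(3):
        replaced = [list(r) for r in rows]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(Fraction(det3(replaced)) / d)
    return solution


def ccr_efficiency(j):
    """Efficiency of unit j (0-based) from the linearised model."""
    # Variables are z = (u1, v1, v2); every inequality reads row . z <= 0.
    ineq = [[outputs[k], -inputs[k][0], -inputs[k][1]]
            for k in range(len(outputs))]
    ineq += [[-1, 0, 0], [0, -1, 0], [0, 0, -1]]  # non-negativity
    norm = [0, inputs[j][0], inputs[j][1]]  # weighted inputs of unit j equal 1
    best = None
    # Candidate vertices: the normalisation plus two inequalities held tight.
    for r1, r2 in combinations(ineq, 2):
        z = solve3([norm, r1, r2], [1, 0, 0])
        if z is None:
            continue
        feasible = all(sum(a * b for a, b in zip(row, z)) <= 0 for row in ineq)
        if feasible:
            value = outputs[j] * z[0]
            if best is None or value > best:
                best = value
    return best


scores = [ccr_efficiency(j) for j in range(len(outputs))]
print(scores)
```

With these figures all three units obtain an efficiency of 1. Each unit has the same output-to-materials ratio (100/10 = 80/8 = 120/12 = 10), so any unit can reach the frontier by putting zero weight on labour-hours, even though unit 3 uses the least labour per item. This is an instance of the non-unique weights listed among the disadvantages above.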

==Extensions==
A desire to improve upon DEA by reducing its disadvantages or strengthening its advantages has motivated many developments in the recent literature. Currently, the most commonly used DEA-based method for obtaining unique efficiency rankings is "cross-efficiency." Originally developed by Sexton et al. in 1986,<ref name=":2">Sexton (1986)</ref> it has found widespread application since Doyle and Green's 1994 publication.<ref name=Doyle>Doyle (1994)</ref> Cross-efficiency is based on the original DEA results but adds a secondary objective in which each DMU peer-appraises all other DMUs with its own factor weights. The average of these peer-appraisal scores is then used to calculate a DMU's cross-efficiency score. This approach avoids DEA's disadvantages of having multiple efficient DMUs and potentially non-unique weights.<ref name=nonu>Dyson (2001)</ref> Another approach to remedy some of DEA's drawbacks is Stochastic DEA,<ref name=":1" /> which synthesizes DEA and [[Stochastic Frontier Analysis]] (SFA).<ref name=olesen>Olesen et al (2016)</ref>
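
The peer-appraisal step can be sketched in a few lines. The example below is illustrative only: it reuses the three-unit data from the Example section and assumes one optimal weight vector per unit, picked by hand rather than by a secondary objective. It averages only the scores a unit receives from the other units; some formulations also include the self-appraisal. The names <code>weights</code>, <code>appraise</code> and <code>cross_scores</code> are hypothetical.

```python
from fractions import Fraction as F

# Data from the Example section: one output and two inputs per unit.
outputs = [100, 80, 120]
inputs = [(10, 2), (8, 4), (12, F(3, 2))]

# Assumed weights (u1, v1, v2): one optimal CCR solution per unit, chosen by
# hand for illustration. Units 1 and 2 weight materials only; unit 3 weights
# labour-hours only. Other optimal choices exist.
weights = [
    (F(1, 100), F(1, 10), F(0)),
    (F(1, 80), F(1, 8), F(0)),
    (F(1, 120), F(0), F(2, 3)),
]


def appraise(k, j):
    """Efficiency of unit j when rated with the weights chosen by unit k."""
    u1, v1, v2 = weights[k]
    return (u1 * outputs[j]) / (v1 * inputs[j][0] + v2 * inputs[j][1])


n = len(outputs)
# cross_matrix[k][j] is the score that unit k gives to unit j.
cross_matrix = [[appraise(k, j) for j in range(n)] for k in range(n)]

# Cross-efficiency of unit j: mean of the scores given by the other units.
cross_scores = [
    sum(cross_matrix[k][j] for k in range(n) if k != j) / (n - 1)
    for j in range(n)
]
print(cross_scores)
```

Under these assumed weights the cross-efficiency scores are 13/16, 5/8 and 1 for units 1 to 3, so the units are now ranked even though each rates itself at 1. Had unit 3 instead placed all its input weight on materials, every score would again equal 1, which shows why the choice among alternative optimal weights matters.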

==Footnotes==
{{Reflist}}

==References==
* {{cite journal |last1=Charnes |first1=Abraham |authorlink1=Abraham Charnes |last2=Cooper|first2=William Wager|authorlink2=William W. Cooper |last3=Rhodes|first3=E.|date=1978 |title=Measuring the Efficiency of Decision Making Units |journal=[[European Journal of Operational Research]]|volume=2|issue=6 |pages=429–444| url=https://personal.utdallas.edu/~ryoung/phdseminar/CCR1978.pdf|access-date=27 January 2022 |doi=10.1016/0377-2217(78)90138-8}}
* {{cite book| first1=Abraham | last1=Charnes | first2=William | last2=Cooper| first3=Arie | last3=Lewin| first4=Lawrence| last4=Seiford| publisher=Springer Science & Business Media|year=1995|title=Data Envelopment Analysis: Theory, Methodology, and Applications | isbn=9780792394808}}
* {{Cite journal |last1=Mahmoudi |first1=Amin |last2=Abbasi |first2=Mehdi |last3=Deng |first3=Xiaopeng |date=2021 |title=Evaluating the Performance of the Suppliers Using Hybrid DEA-OPA Model: A Sustainable Development Perspective |url=http://dx.doi.org/10.1007/s10726-021-09770-x |journal=Group Decision and Negotiation |volume=31 |issue=2 |pages=335–362 |doi=10.1007/s10726-021-09770-x |s2cid=254498857 |issn=0926-2644}}
* {{cite journal|last1=Banker|first1=R. D.|last2=Charnes|first2=A.|last3=Cooper|first3=William Wager|date=September 1984 | url=https://sites.temple.edu/banker/files/2021/01/banker1984.pdf|title = Some Models for Estimating Technical and Scale Inefficiencies in Data Envelopment Analysis | journal = [[Management Science (journal)|Management Science]] | volume = 30 | issue = 9| pages = 1078–1092 | doi=10.1287/mnsc.30.9.1078|s2cid=51901687 |access-date=27 January 2022}}
* {{cite journal|author=Brockhoff K.|author-link=Klaus Brockhoff|year=1970|title=On the Quantification of the Marginal Productivity of Industrial Research by Estimating a Production Function for a Single Firm|journal=[[German Economic Review]]|volume=8|pages=202–229}}
* {{Cite journal
| last1=Cook |first1=Wade D. 
| last2=Hababou |first2=Moez
| last3=Tuenter |first3=Hans J. H.
| title=Multicomponent Efficiency Measurement and Shared Inputs in Data Envelopment Analysis: An Application to Sales and Service Performance in Bank Branches
| journal=Journal of Productivity Analysis
| volume=14 | issue=3 |pages=209–224
| date=November 2000
| doi=10.1023/A:1026598803764
| jstor = 41781515}}

* {{Cite journal
| last1=Cook |first1=Wade D.
| last2=Tone|first2=Kaoru
| last3=Zhu|first3=Joe
| title=Data envelopment analysis: Prior to choosing a model
| journal=Omega
| volume=44 |issue=C |pages=1–4
| date= April 2014
|doi=10.1016/j.omega.2013.09.004}}

* {{Cite journal|last1=Cooper |first1=William Wager |author-link1=William Wager Cooper |last2=Seiford|first2=Lawrence|last3=Zhu|first3=Joe|date=2000|title=A unified additive model approach for evaluating inefficiency and congestion with associated measures in DEA|journal=[[Socio-Economic Planning Sciences]]|volume=34|issue=1|pages=1–25|doi=10.1016/S0038-0121(99)00010-5}}
* {{Cite book|last1=Cooper|first1=William Wager |title=Data Envelopment Analysis: A Comprehensive Text with Models, Applications, References and DEA-Solver Software|last2=Seiford|first2=Lawrence M.|last3=Tone|first3=Kaoru|date=2007|publisher=[[Springer Publishing]]|edition=2|language=en}}
* {{Cite book|date=2011|publisher=Springer Publishing|isbn=978-1441961501|editor-last=Cooper|editor-first=William Wager |edition=2|title=Handbook on Data Envelopment Analysis|series=International Series in Operations Research & Management Science|volume=164|language=en|editor-last2=Seiford|editor-first2=Lawrence M.|editor-last3=Zhu|editor-first3=Joe}}
* {{Cite journal|last1=Dyson|first1=R. G.|last2=Allen|first2=R.|last3=Camanho|first3=A. S.|last4=Podinovski|first4=V. V.|last5=Sarrico|first5=C. S.|last6=Shale|first6=E. A.|date=2001-07-16|title=Pitfalls and protocols in DEA|journal=European Journal of Operational Research|series=Data Envelopment Analysis|volume=132|issue=2|pages=245–259|doi=10.1016/S0377-2217(00)00149-1}}
* {{Cite journal|last1=Doyle|first1=John|last2=Green|first2=Rodney|date=1994|title=Efficiency and Cross-efficiency in DEA: Derivations, Meanings and Uses|journal=[[Journal of the Operational Research Society]]|language=en|volume=45|issue=5|pages=567–578|doi=10.1057/jors.1994.84|s2cid=122161456|issn=0160-5682}}
* {{cite journal|first1=Ali|last1=Emrouznejad|first2=Rajiv|last2=Banker|first3=Subhash|last3=Ray|first4=Lei|last4= Chen|year=2016| title=Recent Applications of Data Envelopment Analysis|publisher=Proceedings of the 14th International Conference on Data Envelopment Analysis}}
* {{Cite journal|last=Farrell|first=Michael James | author-link = Michael James Farrell  |date=1957|title=The Measurement of Productive Efficiency|journal=Journal of the Royal Statistical Society|volume=120|issue=3|pages=253–290 |doi=10.2307/2343100|jstor=2343100 }}
* {{Cite book|last1=Fried|first1=Harold O.|title=The Measurement of Productive Efficiency and Productivity Growth|last2=Lovell|first2=C. A. Knox|last3=Schmidt|first3=Shelton S.|date=2008|publisher=[[Oxford University Press]]|isbn=978-0-19-804050-7|language=en}}
* {{cite journal|first1=Nadia|last1=Guerrero|first2=Juan|last2=Aparicio|first3=Daniel|last3=Valero-Carreras|title=Combining data envelopment analysis and machine learning|doi=10.3390/math10060909|year=2022|journal=Mathematics|volume=10|issue=6|page=909 |doi-access=free }}
* Lovell, C.A.L., & P. Schmidt (1988) "A Comparison of Alternative Approaches to the Measurement of Productive Efficiency", in Dogramaci, A., & R. Färe (eds.) ''Applications of Modern Production Theory: Efficiency and Productivity'', Kluwer: Boston.
*{{Cite journal|last1=Olesen|first1=Ole B.|last2=Petersen|first2=Niels Christian|date=2016|title=Stochastic Data Envelopment Analysis—A review|journal=[[European Journal of Operational Research]]|language=en|volume=251|issue=1|pages=2–21|doi=10.1016/j.ejor.2015.07.058|issn=0377-2217}}
* {{Cite book|last1=Ramanathan|first1=R.|title=An Introduction to Data Envelopment Analysis: A tool for Performance Measurement|date=2003|publisher=[[SAGE Publishing]]|location=N.Delhi|language=en}}
* {{Cite journal|last=Sexton|first=Thomas R.|date=1986|title=Data envelopment analysis: Critique and extension|journal=New Directions for Program Evaluation|volume=1986 |issue=32|pages=73–105|doi=10.1002/ev.1441}}
* {{Cite book |last1=Sickles|first1=Robin|last2=Zelenyuk|first2=Valentin  |url=https://assets.cambridge.org/97811070/36161/frontmatter/9781107036161_frontmatter.pdf |authorlink=Robin Sickles |title=Measurement of Productivity and Efficiency - Theory and Practice|date=2019|publisher=[[Cambridge University Press]]|isbn=978-1-107-68765-3|language=en|accessdate=27 January 2022}}
* {{cite journal | first1=Emmanuel|last1=Thanassoulis| year = 1995 | title = Assessing police forces in England and Wales using data envelopment analysis | journal = [[European Journal of Operational Research]] | volume = 87 | issue = 3| pages = 641–657 | doi=10.1016/0377-2217(95)00236-7}}
* {{cite journal | first1=Zachary|last1=Zhou|first2=Alisha|last2=Zachariah|first3=Devin|last3=Conathan|first4=Jeffery|last4=Kline|title=Assessing Resource-Performance Trade-off of Natural Language Models using Data Envelopment Analysis|journal=Proceedings of the 3rd Workshop on Evaluation and Comparison of NLP Systems|year=2022|publisher=Association for Computational Linguistics|pages=11–20|arxiv=2211.01486 }}
* {{cite book | first1=Gregory|last1=Koronakos|first2=Dionysios|last2=Sotiropoulos|title=2020 11th International Conference on Information, Intelligence, Systems and Applications (IISA) |chapter=A Neural Network approach for Non-parametric Performance Assessment |year=2020|chapter-url=https://ieeexplore.ieee.org/document/9284346|publisher=IEEE|pages=1–8|doi=10.1109/IISA50023.2020.9284346 |isbn=978-1-6654-2228-4 |s2cid=228097834 }}

==Further reading==
* {{cite journal | first=Shinn|last= Sun | year = 2002 | title = Measuring the relative efficiency of police precincts using data envelopment analysis| journal = Socio-Economic Planning Sciences | volume = 36 | issue = 1| pages = 51–71 | doi=10.1016/s0038-0121(01)00010-6|url=https://www.researchgate.net/publication/4934554|accessdate=27 January 2022}}
* {{cite journal | last= Tofallis|first=Chris | year = 2001|url=https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1353122|accessdate=27 January 2022 | title = Combining two approaches to efficiency assessment | ssrn = 1353122 | journal = [[Journal of the Operational Research Society]] | volume = 52 | issue = 11| pages = 1225–1231 | doi=10.1057/palgrave.jors.2601231| hdl = 2299/917 | s2cid = 15258094 | hdl-access = free }}

==External links==
*[https://www.dataenvelopment.com Data Envelopment Analysis] official website
*[https://www.springer.com/journal/11123 ''Journal of Productivity Analysis''] official website

{{Authority control}}

[[Category:Linear programming]]
[[Category:Production economics]]
[[Category:Mathematical optimization in business]]