
{{Short description|Generalization of means}}
In [[mathematics]] and [[statistics]], the '''quasi-arithmetic mean''' or '''generalised ''f''-mean''' or '''Kolmogorov-Nagumo-de Finetti mean'''<ref>{{cite journal |last1=Nielsen |first1=Frank |last2=Nock |first2=Richard |title=Generalizing skew Jensen divergences and Bregman divergences with comparative convexity |journal=IEEE Signal Processing Letters |date=June 2017 |volume=24 |issue=8 |page=2 |doi=10.1109/LSP.2017.2712195 |arxiv=1702.04877 |bibcode=2017ISPL...24.1123N |s2cid=31899023 }}</ref> is one generalisation of the more familiar [[mean]]s such as the [[arithmetic mean]] and the [[geometric mean]], using a function <math>f</math>. It is also called '''Kolmogorov mean''' after Soviet mathematician [[Andrey Kolmogorov]]. It is a broader generalization than the regular [[generalized mean]].

==Definition==

If ''f'' is a function which maps an interval <math>I</math> of the real line to the [[real number]]s, and is both [[continuous function|continuous]] and [[injective function|injective]], the '''''f''-mean of <math>n</math> numbers'''
<math>x_1, \dots, x_n \in I</math>
is defined as <math>M_f(x_1, \dots, x_n) = f^{-1}\left( \frac{f(x_1)+ \cdots + f(x_n)}n \right)</math>, which can also be written 
:<math> M_f(\vec x)= f^{-1}\left(\frac{1}{n} \sum_{k=1}^{n}f(x_k) \right)</math>

We require ''f'' to be injective in order for the [[inverse function]] <math>f^{-1}</math> to exist. Since <math>f</math> is continuous and defined over an interval, its image <math>f(I)</math> is itself an interval, so the average <math>\frac{f(x_1)+ \cdots + f(x_n)}n</math> of values in <math>f(I)</math> lies within the domain of <math>f^{-1}</math>.

Since ''f'' is injective and continuous, it follows that ''f'' is a strictly [[monotonic function]], and therefore that the ''f''-mean is neither larger than the largest number of the tuple <math>x</math> nor smaller than the smallest number in <math>x</math>.
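The definition translates directly into code. The following is a minimal sketch, not a reference implementation: the function name <code>quasi_arithmetic_mean</code> and its parameters are illustrative choices, and the caller is assumed to pass a continuous injective ''f'' together with its inverse.

```python
import math
from typing import Callable, Sequence


def quasi_arithmetic_mean(
    xs: Sequence[float],
    f: Callable[[float], float],
    f_inv: Callable[[float], float],
) -> float:
    """Return f_inv applied to the arithmetic mean of f(x) over xs.

    The caller must supply a continuous injective f and its inverse f_inv;
    this sketch does not verify either assumption.
    """
    if not xs:
        raise ValueError("at least one value is required")
    return f_inv(sum(f(x) for x in xs) / len(xs))


values = [1.0, 4.0, 16.0]
# With f = log the f-mean is the geometric mean, here the cube root of 64.
m = quasi_arithmetic_mean(values, math.log, math.exp)
print(m)  # close to 4.0, up to floating-point rounding
```

As stated above, the result lies between the smallest and the largest of the input values.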

== Examples ==

* If <math>I = \mathbb{R}</math>, the [[real line]], and <math>f(x) = x</math> (or indeed any affine function <math>x\mapsto a\cdot x + b</math> with <math>a \neq 0</math>), then the ''f''-mean corresponds to the [[arithmetic mean]].
* If <math>I = \mathbb{R}^+</math>, the [[positive real numbers]], and <math>f(x) = \log(x)</math>, then the ''f''-mean corresponds to the [[geometric mean]]. By the scale-invariance property of the ''f''-mean, the result does not depend on the base of the [[logarithm]], as long as the base is positive and not 1.
* If <math>I = \mathbb{R}^+</math> and <math>f(x) = \frac{1}{x}</math>, then the ''f''-mean corresponds to the [[harmonic mean]].
* If <math>I = \mathbb{R}^+</math> and <math>f(x) = x^p</math> with <math>p \neq 0</math>, then the ''f''-mean corresponds to the [[power mean]] with exponent <math>p</math>.
* If <math>I = \mathbb{R}</math> and <math>f(x) = \exp(x)</math>, then the ''f''-mean is the mean in the [[log semiring]], which is a constant-shifted version of the [[LogSumExp]] (LSE) function (the logarithmic sum): <math>M_f(x_1, \dots, x_n) = \mathrm{LSE}(x_1, \dots, x_n)-\log(n)</math>. The <math>-\log(n)</math> term corresponds to dividing by {{mvar|n}}, since division becomes subtraction under the logarithm. The LogSumExp function is a [[smooth maximum]]: a smooth approximation to the maximum function.
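The examples above can be checked numerically. This sketch uses an illustrative helper <code>f_mean</code> (a name chosen here, not taken from any library) and compares its output with the corresponding functions in Python's standard <code>statistics</code> module; the sample values and the exponent <code>p = 3</code> are arbitrary.

```python
import math
import statistics


def f_mean(xs, f, f_inv):
    """Quasi-arithmetic mean: f_inv of the average of f over xs."""
    return f_inv(sum(f(x) for x in xs) / len(xs))


xs = [1.0, 2.0, 4.0]
p = 3  # arbitrary nonzero exponent for the power mean

arithmetic = f_mean(xs, lambda x: x, lambda y: y)
geometric = f_mean(xs, math.log, math.exp)
harmonic = f_mean(xs, lambda x: 1.0 / x, lambda y: 1.0 / y)
power = f_mean(xs, lambda x: x ** p, lambda y: y ** (1.0 / p))

# f = exp gives LogSumExp shifted by log(n).
log_semiring_mean = f_mean(xs, math.exp, math.log)
lse = math.log(sum(math.exp(x) for x in xs))

print(math.isclose(arithmetic, statistics.mean(xs)))
print(math.isclose(geometric, statistics.geometric_mean(xs)))
print(math.isclose(harmonic, statistics.harmonic_mean(xs)))
print(math.isclose(log_semiring_mean, lse - math.log(len(xs))))
```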

== Properties ==
The following properties hold for <math>M_f</math> for any single function <math>f</math>:

'''Symmetry''': The value of <math>M_f</math> is unchanged if its arguments are permuted.

'''Idempotency''': For all ''x'', <math>M_f(x,\dots,x) = x</math>.

'''Monotonicity''': <math>M_f</math> is monotonic in each of its arguments (since  <math>f</math> is [[Monotonic function|monotonic]]).

'''Continuity''':  <math>M_f</math> is continuous in each of its arguments  (since  <math>f</math> is continuous).

'''Replacement''': Subsets of elements can be averaged a priori without altering the mean, provided that the multiplicity of elements is maintained. With <math>m=M_f(x_1,\dots,x_k)</math>, the following holds:

:<math>M_f(x_1,\dots,x_k,x_{k+1},\dots,x_n) = M_f(\underbrace{m,\dots,m}_{k \text{ times}},x_{k+1},\dots,x_n)</math>

[[Partition of a set|'''Partitioning''']]: The computation of the mean can be split into computations of equal-sized sub-blocks:

:<math>
M_f(x_1,\dots,x_{n\cdot k}) =
  M_f(M_f(x_1,\dots,x_{k}),
      M_f(x_{k+1},\dots,x_{2\cdot k}),
      \dots,
      M_f(x_{(n-1)\cdot k + 1},\dots,x_{n\cdot k}))
</math>
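The replacement and partitioning properties can be illustrated numerically. The sketch below assumes <math>f = \log</math> (the geometric mean) and a block size of 2; the helper <code>f_mean</code> and the sample values are illustrative only.

```python
import math


def f_mean(xs, f=math.log, f_inv=math.exp):
    """Quasi-arithmetic mean; defaults to f = log, i.e. the geometric mean."""
    return f_inv(sum(f(x) for x in xs) / len(xs))


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
k = 2
full = f_mean(xs)

# Replacement: substitute the first k values by k copies of their own mean.
m = f_mean(xs[:k])
replaced = f_mean([m] * k + xs[k:])

# Partitioning: average equal-sized blocks first, then average the block means.
blocks = [xs[i:i + k] for i in range(0, len(xs), k)]
partitioned = f_mean([f_mean(block) for block in blocks])

print(math.isclose(full, replaced), math.isclose(full, partitioned))
```

Both comparisons hold only up to floating-point rounding, which is why <code>math.isclose</code> is used instead of exact equality.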

'''Self-distributivity''': For any quasi-arithmetic mean <math>M</math> of two variables: <math>M(x,M(y,z))=M(M(x,y),M(x,z))</math>.

'''Mediality''': For any quasi-arithmetic mean <math>M</math> of two variables:<math>M(M(x,y),M(z,w))=M(M(x,z),M(y,w))</math>.

'''Balancing''': For any quasi-arithmetic mean <math>M</math> of two variables:<math>M\big(M(x, M(x, y)), M(y, M(x, y))\big)=M(x, y)</math>.
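The three two-variable identities above (self-distributivity, mediality and balancing) can be spot-checked for a particular mean. This sketch assumes <math>f(x) = 1/x</math>, so that <code>M</code> is the harmonic mean of two positive numbers; the test points are arbitrary, and a numerical check at a few points is an illustration rather than a proof.

```python
import math


def M(x, y, f=lambda t: 1.0 / t, f_inv=lambda s: 1.0 / s):
    """Two-variable quasi-arithmetic mean; defaults to the harmonic mean."""
    return f_inv((f(x) + f(y)) / 2.0)


x, y, z, w = 1.0, 2.0, 5.0, 9.0

# Self-distributivity: M(x, M(y, z)) = M(M(x, y), M(x, z))
self_distributive = math.isclose(M(x, M(y, z)), M(M(x, y), M(x, z)))

# Mediality: M(M(x, y), M(z, w)) = M(M(x, z), M(y, w))
medial = math.isclose(M(M(x, y), M(z, w)), M(M(x, z), M(y, w)))

# Balancing: M(M(x, M(x, y)), M(y, M(x, y))) = M(x, y)
balanced = math.isclose(M(M(x, M(x, y)), M(y, M(x, y))), M(x, y))

print(self_distributive, medial, balanced)
```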

'''[[Central limit theorem]]''' : Under regularity conditions, for a sufficiently large sample, <math>\sqrt{n}\{M_f(X_1, \dots, X_n) - f^{-1}(E_f(X_1, \dots, X_n))\}</math> is approximately normal.<ref>{{cite journal|last=de Carvalho|first=Miguel|title=Mean, what do you Mean?|journal=[[The American Statistician]]|year=2016|volume=70|issue=3|pages=764‒776|doi=10.1080/00031305.2016.1148632|url=https://zenodo.org/record/895400|hdl=20.500.11820/fd7a8991-69a4-4fe5-876f-abcd2957a88c|s2cid=219595024 |hdl-access=free}}</ref>
A similar result is available for Bajraktarević means and deviation means, which are generalizations of quasi-arithmetic means.<ref>{{Cite journal |last1=Barczy |first1=Mátyás |last2=Burai |first2=Pál |date=2022-04-01 |title=Limit theorems for Bajraktarević and Cauchy quotient means of independent identically distributed random variables |url=https://link.springer.com/article/10.1007/s00010-021-00813-x |journal=Aequationes Mathematicae |language=en |volume=96 |issue=2 |pages=279–305 |doi=10.1007/s00010-021-00813-x |issn=1420-8903}}</ref><ref>{{Cite journal |last1=Barczy |first1=Mátyás |last2=Páles |first2=Zsolt |date=2023-09-01 |title=Limit Theorems for Deviation Means of Independent and Identically Distributed Random Variables |url=https://link.springer.com/article/10.1007/s10959-022-01225-6 |journal=Journal of Theoretical Probability |language=en |volume=36 |issue=3 |pages=1626–1666 |doi=10.1007/s10959-022-01225-6 |issn=1572-9230|arxiv=2112.05183 }}</ref>

'''Scale-invariance''': The quasi-arithmetic mean is invariant with respect to offsets and scaling of <math>f</math>: <math>\forall a\ \forall b\ne0 \big((\forall t\ g(t)=a+b\cdot f(t)) \Rightarrow \forall x\ M_f (x) = M_g (x)\big)</math>.
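Scale-invariance can be seen concretely by building <math>g = a + b\cdot f</math> together with its inverse <math>g^{-1}(s) = f^{-1}((s-a)/b)</math>. The sketch below assumes <math>f = \log</math> and arbitrary constants <code>a</code> and <code>b</code>; it also checks the base-2 logarithm, which differs from the natural logarithm only by a constant factor. All names are illustrative.

```python
import math


def f_mean(xs, f, f_inv):
    """Quasi-arithmetic mean: f_inv of the average of f over xs."""
    return f_inv(sum(f(x) for x in xs) / len(xs))


a, b = 3.0, -2.0  # arbitrary offset and nonzero scale
f, f_inv = math.log, math.exp


def g(t):
    return a + b * f(t)


def g_inv(s):
    # Inverse of g: undo the offset and scale, then apply f_inv.
    return f_inv((s - a) / b)


xs = [0.5, 2.0, 7.0]
m_f = f_mean(xs, f, f_inv)
m_g = f_mean(xs, g, g_inv)
m_log2 = f_mean(xs, math.log2, lambda s: 2.0 ** s)

print(math.isclose(m_f, m_g), math.isclose(m_f, m_log2))
```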

== Characterization ==
There are several different sets of properties that characterize the quasi-arithmetic mean (i.e., each function that satisfies these properties is an ''f''-mean for some function ''f'').

* '''Mediality''' is essentially sufficient to characterize quasi-arithmetic means.<ref name=":0">{{Cite book|title=Functional equations in several variables. With applications to mathematics, information theory and to the natural and social sciences. Encyclopedia of Mathematics and its Applications, 31.|author=Aczél, J.|author2=Dhombres, J. G.|publisher=Cambridge Univ. Press|year=1989|location=Cambridge}}</ref>{{Rp|chapter 17}}
* '''Self-distributivity''' is essentially sufficient to characterize quasi-arithmetic means.<ref name=":0" />{{Rp|chapter 17}}
* '''Replacement''': Kolmogorov proved that the five properties of symmetry, fixed-point (idempotency), monotonicity, continuity, and replacement fully characterize the quasi-arithmetic means.<ref>{{Cite web|url=https://math.stackexchange.com/a/3261514/29780|title=Characterization of the quasi-arithmetic mean|last=Grudkin|first=Anton|date=2019|website=Math stackexchange}}</ref>
* '''Continuity''' is superfluous in the characterization of two-variable quasi-arithmetic means; see [10] for details.
* '''Balancing''': An interesting problem is whether this condition (together with symmetry, fixed-point, monotonicity and continuity properties) implies that the mean is quasi-arithmetic. [[Georg Aumann]] showed in the 1930s that the answer is no in general,<ref>{{cite journal|last=Aumann|first=Georg|year=1937|title=Vollkommene Funktionalmittel und gewisse Kegelschnitteigenschaften|journal=[[Journal für die reine und angewandte Mathematik]]|volume=1937|issue=176|pages=49–55|doi=10.1515/crll.1937.176.49|s2cid=115392661}}</ref> but that if one additionally assumes <math>M</math> to be an [[analytic function]] then the answer is positive.<ref>{{cite journal|last=Aumann|first=Georg|year=1934|title=Grundlegung der Theorie der analytischen Mittelwerte|journal=Sitzungsberichte der Bayerischen Akademie der Wissenschaften|pages=45–81}}</ref>

== Homogeneity ==

[[Mean]]s are usually [[Homogeneous function|homogeneous]], but for most functions <math>f</math>, the ''f''-mean is not.
Indeed, the only homogeneous quasi-arithmetic means are the [[power mean]]s (including the [[geometric mean]]); see Hardy&ndash;Littlewood&ndash;Pólya, page 68.

The homogeneity property can be achieved by normalizing the input values by some (homogeneous) mean <math>C</math>:
:<math>M_{f,C} x = C x \cdot f^{-1}\left( \frac{f\left(\frac{x_1}{C x}\right) + \cdots + f\left(\frac{x_n}{C x}\right)}{n} \right)</math>
However, this modification may violate [[Monotonic function|monotonicity]] and the partitioning property of the mean.
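The effect of this normalization can be illustrated numerically. The sketch below assumes <math>f = \exp</math> and takes <math>C</math> to be the arithmetic mean; both choices, like the helper names, are illustrative. The plain ''f''-mean fails the homogeneity test, while the normalized version passes it.

```python
import math


def f_mean(xs, f, f_inv):
    """Quasi-arithmetic mean: f_inv of the average of f over xs."""
    return f_inv(sum(f(x) for x in xs) / len(xs))


def normalized_f_mean(xs, f, f_inv, c):
    """Divide the inputs by the homogeneous mean c(xs), then rescale the result."""
    cx = c(xs)  # assumed nonzero
    return cx * f_mean([x / cx for x in xs], f, f_inv)


def arithmetic(xs):
    return sum(xs) / len(xs)


xs = [1.0, 2.0, 3.0]
t = 5.0
scaled = [t * x for x in xs]

# The plain exp-mean is not homogeneous: scaling the inputs by t
# does not scale the result by t.
plain_is_homogeneous = math.isclose(
    f_mean(scaled, math.exp, math.log),
    t * f_mean(xs, math.exp, math.log),
)

# The normalized version is homogeneous by construction.
normalized_is_homogeneous = math.isclose(
    normalized_f_mean(scaled, math.exp, math.log, arithmetic),
    t * normalized_f_mean(xs, math.exp, math.log, arithmetic),
)

print(plain_is_homogeneous, normalized_is_homogeneous)
```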

== Generalizations ==

Consider a Legendre-type strictly convex function <math>F</math>. Then the [[gradient]] map <math>\nabla F</math> is globally invertible and the weighted multivariate quasi-arithmetic mean<ref>{{cite arXiv|last=Nielsen|first=Frank|year=2023|title=Beyond scalar quasi-arithmetic means: Quasi-arithmetic averages and quasi-arithmetic mixtures in information geometry|eprint= 2301.10980| class = cs.IT}}</ref> is defined by
<math>
M_{\nabla F}(\theta_1,\ldots,\theta_n;w) = {\nabla F}^{-1}\left(\sum_{i=1}^n w_i \nabla F(\theta_i)\right)
</math>, where <math>w</math> is a normalized weight vector (<math>w_i=\frac{1}{n}</math> by default for a balanced average). From the convex duality, we get a dual quasi-arithmetic mean <math>M_{\nabla F^*}</math> associated to the quasi-arithmetic mean <math>M_{\nabla F}</math>.
For example, take <math>F(X)=-\log\det(X)</math> for <math>X</math> a symmetric positive-definite matrix.
The pair of matrix quasi-arithmetic means yields the matrix harmonic mean:
<math>M_{\nabla F}(\theta_1,\theta_2)=2(\theta_1^{-1}+\theta_2^{-1})^{-1}.
</math>
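
The harmonic-mean formula can be verified by a short computation, sketched here under the assumption that the gradient is taken with respect to the trace inner product on symmetric matrices, so that the standard identity <math>\nabla \log\det(X) = X^{-1}</math> applies. In that case
:<math>\nabla F(X) = -X^{-1}, \qquad (\nabla F)^{-1}(Y) = -Y^{-1},</math>
and with the balanced weights <math>w_1 = w_2 = \tfrac{1}{2}</math>,
:<math>M_{\nabla F}(\theta_1,\theta_2) = -\left(-\tfrac{1}{2}\theta_1^{-1} - \tfrac{1}{2}\theta_2^{-1}\right)^{-1} = 2\left(\theta_1^{-1}+\theta_2^{-1}\right)^{-1}.</math>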

== See also ==
* [[Generalized mean]]
* [[Jensen's inequality]]

== References ==
* Andrey Kolmogorov (1930) "On the Notion of Mean", in "Mathematics and Mechanics" (Kluwer 1991) — pp.&nbsp;144&ndash;146.
* Andrey Kolmogorov (1930) Sur la notion de la moyenne. Atti Accad. Naz. Lincei 12, pp.&nbsp;388&ndash;391.
* John Bibby (1974) "Axiomatisations of the average and a further generalisation of monotonic sequences," Glasgow Mathematical Journal, vol. 15, pp.&nbsp;63–65.
* Hardy, G. H.; Littlewood, J. E.; Pólya, G. (1952) Inequalities. 2nd ed. Cambridge Univ. Press, Cambridge.
* B. De Finetti, [http://www.brunodefinetti.it/Opere/concettodiMedia.pdf "Sul concetto di media"], vol. 3, pp.&nbsp;369&ndash;396, 1931, Istituto Italiano degli Attuari.

{{DEFAULTSORT:Quasi-Arithmetic Mean}}
[[Category:Means]]
<references />
* [10] Burai, P.; Kiss, G.; Szokol, P. (2021) [https://rdcu.be/dVZA1 "Characterization of quasi-arithmetic means without regularity condition"], Acta Math. Hungar. 165, no. 2, pp.&nbsp;474–485. MR4355191.
* [11] Burai, Pál; Kiss, Gergely; Szokol, Patricia (2023) "A dichotomy result for strictly increasing bisymmetric maps", J. Math. Anal. Appl. 526, no. 2, Paper No. 127269, 9 pp. MR4574540.