{{Short description|Mathematical transformation}}
{{about|an involution transform commonly used in classical mechanics and thermodynamics|the integral transform using Legendre polynomials as kernels|Legendre transform (integral transform)}}
[[Image:Legendre transformation.png|thumb|256px|right|The function <math>f(x)</math> is defined on the interval <math display="inline">[a,b]</math>. For a given <math>p</math>, the difference <math>px - f(x)</math> takes the maximum at <math>x'</math>. Thus, the Legendre transformation of <math>f(x)</math> is <math>f^*(p) =p x'-f(x')</math>.]]
In [[mathematics]], the '''Legendre transformation''' (or '''Legendre transform'''), first introduced by [[Adrien-Marie Legendre]] in 1787 when studying the minimal surface problem,<ref name=":0">{{Cite book |last=Legendre |first=Adrien-Marie |url=https://www.biodiversitylibrary.org/page/28011033 |title=Mémoire sur l'intégration de quelques équations aux différences partielles. In Histoire de l'Académie royale des sciences, avec les mémoires de mathématique et de physique |publisher=Imprimerie royale |year=1789 |volume= 1787|location=Paris |pages=309–351 |language=French}}</ref> is an [[involution (mathematics)|involutive]] [[List of transforms|transformation]] on [[real number|real]]-valued functions that are [[Convex function|convex]] on a real variable. Specifically, if a real-valued multivariable function is convex on one of its independent real variables, then the Legendre transform with respect to this variable is applicable to the function. 

In physical problems, the Legendre transform is used to convert functions of one quantity (such as position, pressure, or temperature) into functions of the [[Conjugate variables (thermodynamics)|conjugate quantity]] (momentum, volume, and entropy, respectively). In this way, it is commonly used in [[classical mechanics]] to derive the [[Hamiltonian mechanics|Hamiltonian]] formalism out of the [[Lagrangian mechanics|Lagrangian]] formalism (or vice versa) and in [[thermodynamics]] to derive the [[Thermodynamic potential|thermodynamic potentials]], as well as in the solution of [[Differential equation|differential equations]] of several variables. 

For sufficiently smooth functions on the real line, the Legendre transform <math>f^*</math> of a function <math>f</math> can be specified, up to an additive constant, by the condition that the functions' first derivatives are inverse functions of each other. This can be expressed in [[Notation for differentiation#Euler.27s notation|Euler's derivative notation]] as
<math display="block">Df(\cdot) = \left( D f^* \right)^{-1}(\cdot)~,</math> where <math>D</math> is an operator of differentiation, <math>\cdot</math> represents an argument or input to the associated function, <math>(\phi)^{-1}(\cdot)</math> is an inverse function such that <math>(\phi) ^{-1}(\phi(x))=x</math>, or equivalently, as <math>f'(f^{*\prime}(x^*)) = x^*</math> and <math>f^{*\prime}(f'(x)) = x</math> in [[Notation for differentiation#Lagrange's notation|Lagrange's notation]].

The generalization of the Legendre transformation to affine spaces and non-convex functions is known as the [[convex conjugate]] (also called the Legendre–Fenchel transformation), which can be used to construct a function's [[Convex_hull#Functions|convex hull]].

==Definition==

===Definition in one-dimensional real space===
Let <math>I \sub \R</math> be an [[Interval (mathematics)|interval]], and <math>f:I \to \R</math> a [[convex function]]; then the ''Legendre transform'' ''of'' <math>f</math> is the function <math>f^*:I^* \to \R</math> defined by
<math display="block">f^*(x^*) = \sup_{x\in I}(x^*x-f(x)),\ \ \ \ I^*= \left \{x^*\in \R:\sup_{x\in I}(x^*x-f(x)) <\infty \right \}</math>
where <math display="inline">\sup</math> denotes the [[Infimum and supremum|supremum]] over <math>I</math>. When the supremum is attained, it is the maximum of <math display="inline">x^*x - f(x)</math> over <math display="inline">x \in I</math> for the given <math display="inline">x^*</math>. The domain <math>I^*</math> consists of those <math display="inline">x^*</math> for which <math>x^*x-f(x)</math> is bounded above on <math display="inline">I</math> (for example, when <math>f</math> is linear on <math>I = \R</math>, only a single value of <math display="inline">x^*</math> qualifies).

The function <math>f^*</math> is called the [[convex conjugate]] function of <math>f</math>. For historical reasons (rooted in analytic mechanics), the conjugate variable is often denoted <math>p</math>, instead of <math>x^*</math>. If the convex function <math>f</math> is defined on the whole line and is everywhere [[Differentiable function|differentiable]], then
<math display="block">f^*(p)=\sup_{x\in I}(px - f(x)) = \left( p x - f(x) \right)|_{x = (f')^{-1}(p)} </math>
can be interpreted as the negative of the [[y-intercept|<math>y</math>-intercept]] of the [[tangent line]] to the [[Graph of a function|graph]] of <math>f</math> that has slope <math>p</math>.
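
The supremum in the definition can be approximated numerically. The following Python sketch is only an illustration under stated assumptions: the helper name <code>legendre_on_grid</code>, the bounded search interval and the grid size are all hypothetical choices, and the brute-force search is not an efficient or general algorithm. It compares the grid estimate for <math>f(x) = x^2</math> with the closed form <math>p^2/4</math>.

```python
# Brute-force sketch of f*(p) = sup_x (p*x - f(x)) on a finite grid.
# `legendre_on_grid` and its parameters are illustrative names, not a library API.

def legendre_on_grid(f, p, lo, hi, n=20_001):
    """Approximate the supremum of p*x - f(x) over n equally spaced x in [lo, hi]."""
    step = (hi - lo) / (n - 1)
    return max(p * (lo + i * step) - f(lo + i * step) for i in range(n))


def f(x):
    return x * x  # convex; the exact transform is p**2 / 4


for p in (-2.0, 0.0, 1.0, 3.0):
    approx = legendre_on_grid(f, p, -10.0, 10.0)
    print(p, approx, p * p / 4)  # grid estimate next to the closed form
```

The estimate is reliable only when the maximizing <math>x</math> lies inside the chosen interval; otherwise the grid search returns a value at the boundary.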

===Definition in n-dimensional real space===
The generalization to convex functions <math>f:X \to \R</math> on a [[convex set]] <math>X \sub \R^n</math> is straightforward: <math>f^*:X^* \to \R</math> has domain
<math display="block">X^*= \left \{x^* \in \R^n:\sup_{x\in X}(\langle x^*,x\rangle-f(x))<\infty \right \}</math>
and is defined by
<math display="block">f^*(x^*) = \sup_{x\in X}(\langle x^*,x\rangle-f(x)),\quad x^*\in X^* ~,</math>
where <math>\langle x^*,x \rangle</math> denotes the [[dot product]] of <math>x^*</math> and <math>x</math>.

The Legendre transformation is an application of the [[Duality (projective geometry)|duality]] relationship between points and lines. The functional relationship specified by <math>f</math> can be represented equally well as a set of <math>(x,y)</math> points, or as a set of tangent lines specified by their slope and intercept values.

===Understanding the Legendre transform in terms of derivatives===

For a differentiable convex function <math>f</math> on the real line with the first derivative <math>f'</math> and its inverse <math>(f')^{-1}</math>, the Legendre transform of <math>f</math>, <math> f^*</math>, can be specified, up to an additive constant, by the condition that the functions' first derivatives are inverse functions of each other, i.e., <math>f' = ((f^*)')^{-1}</math> and <math>(f^*)' = (f')^{-1}</math>.

To see this, first note that if <math> f</math> is a differentiable convex function on the real line and <math> \overline{x} </math> is a [[critical point (mathematics)|critical point]] of the function <math> x \mapsto p \cdot x -f(x) </math>, then the supremum is achieved at <math display="inline"> \overline{x}</math> (by convexity; see the figure at the top of the article). Therefore, the Legendre transform of <math> f</math> is <math> f^*(p)= p \cdot \overline{x} - f(\overline{x})</math>.

Then, suppose that the first derivative <math>f'</math> is invertible and let the inverse be <math> g = (f')^{-1} </math>. Then for each <math display="inline"> p</math>, the point <math> g(p)</math> is the unique critical point <math display="inline"> \overline{x}</math> of the function <math> x \mapsto px -f(x) </math> (i.e., <math> \overline{x} = g(p)</math>) because <math> f'(g(p))=p </math> and the function's first derivative with respect to <math>x</math> at <math> g(p)</math> is <math> p-f'(g(p))=0 </math>. Hence we have <math> f^*(p) = p \cdot g(p) - f(g(p))</math> for each <math display="inline"> p</math>. By differentiating with respect to <math display="inline"> p</math>, we find 
<math display="block">(f^*)'(p) = g(p)+ p \cdot g'(p) - f'(g(p)) \cdot g'(p).</math>
Since <math> f'(g(p))=p</math>, this simplifies to <math>(f^*)'(p) = g(p) = (f')^{-1}(p)</math>. In other words, ''<math>(f^*)'</math> and <math>f'</math> are inverses of each other''.

In general, if <math> h' = (f')^{-1} </math> is the inverse of <math> f',</math> then <math> h' = (f^*)' </math>, so integration gives <math> f^* = h +c</math> with a constant <math> c. </math>

In practical terms, given <math>f(x),</math> the parametric plot of <math>xf'(x)-f(x)</math> versus <math>f'(x)</math> amounts to the graph of <math>f^*(p)</math> versus <math>p.</math>
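
The parametric description can be tried directly. The sketch below is a minimal illustration, assuming the derivative is supplied by hand; the function name <code>parametric_transform</code> and the sample points are hypothetical. For <math>f(x) = e^x</math> it produces pairs <math>(p, f^*(p))</math> and compares them with the closed form <math>p(\ln p - 1)</math>.

```python
import math

# Parametric sketch: each sample x yields the point (p, f*(p)) with
# p = f'(x) and f*(p) = x*f'(x) - f(x). Names here are illustrative only.

def parametric_transform(f, df, xs):
    """Return a list of (p, f_star) pairs traced out by the parameter x."""
    return [(df(x), x * df(x) - f(x)) for x in xs]


xs = [i / 10 for i in range(-20, 21)]  # sample points in [-2, 2]
points = parametric_transform(math.exp, math.exp, xs)  # f = f' = exp

# Largest deviation from the closed form p*(ln p - 1) over the samples.
max_err = max(abs(fs - p * (math.log(p) - 1)) for p, fs in points)
print(max_err)
```

No maximization is carried out here: the approach relies on <math>f</math> being differentiable and convex, so that every sampled slope corresponds to a maximizer.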

In some cases (e.g. thermodynamic potentials, below), a non-standard requirement is used, amounting to an alternative definition of {{math|''f'' *}} with a ''minus sign'',
<math display="block">f(x) - f^*(p) = xp.</math>

=== Formal definition in physics context ===
In analytical mechanics and thermodynamics, Legendre transformation is usually defined as follows: suppose <math>f</math> is a function of <math>x</math>; then we have 

::<math>\mathrm{d} f = \frac{\mathrm{d} f}{\mathrm{d} x} \mathrm{d} x.</math>

Performing the Legendre transformation on this function means that we take <math>p = \frac{\mathrm{d} f}{\mathrm{d} x}</math> as the independent variable, so that the above expression can be written as

::<math>\mathrm{d} f = p \mathrm{d} x,</math>

and according to Leibniz's rule <math>\mathrm{d} (uv) = u\mathrm{d} v + v\mathrm{d} u,</math> we then have 

::<math>\mathrm{d} \left(x p - f \right) = x \mathrm{d} p + p \mathrm{d} x - \mathrm{d} f = x\mathrm{d} p,</math>

and taking <math>f^* = xp-f,</math> we have <math>\mathrm d f^* = x \mathrm{d} p,</math> which means 

::<math>\frac{\mathrm{d} f^*}{\mathrm{d} p} = x.</math>

When <math>f</math> is a function of <math>n</math> variables <math>x_1, x_2, \cdots, x_n</math>, then we can perform the Legendre transformation on each one or several variables: we have 

::<math>\mathrm{d} f = p_1\mathrm{d} x_1 + p_2 \mathrm{d} x_2 + \cdots + p_n \mathrm{d} x_n,</math>

where <math>p_i = \frac{\partial f}{\partial x_i}.</math> Then if we want to perform the Legendre transformation on, e.g. <math>x_1</math>, then we take <math>p_1</math> together with <math>x_2, \cdots, x_n</math> as independent variables, and with Leibniz's rule we have 

::<math>\mathrm{d} (f - x_1 p_1) = -x_1 \mathrm{d} p_1 + p_2 \mathrm{d} x_2 + \cdots + p_n \mathrm{d} x_n.</math>

So for the function <math>\varphi(p_1, x_2, \cdots, x_n) = f(x_1, x_2, \cdots, x_n) - x_1 p_1,</math> we have

::<math>\frac{\partial \varphi}{\partial p_1} = -x_1,\quad \frac{\partial \varphi}{\partial x_2} = p_2,\quad \cdots, 
\quad \frac{\partial \varphi}{\partial x_n} = p_n.</math>

We can also do this transformation for variables <math>x_2, \cdots, x_n</math>. If we do it to all the variables, then we have 

::<math>\mathrm{d} \varphi = -x_1 \mathrm d p_1 - x_2 \mathrm{d} p_2 - \cdots - x_n \mathrm{d} p_n </math> where <math>\varphi = f-x_1 p_1 - x_2 p_2 - \cdots - x_n p_n. </math>

In analytical mechanics, people perform this transformation on variables <math>\dot q_1, \dot q_2, \cdots, \dot q_n </math> of the Lagrangian <math>L(q_1, \cdots, q_n, \dot{q}_1, \cdots, \dot{q}_n) </math> to get the Hamiltonian:

<math>H(q_1, \cdots, q_n, p_1, \cdots, p_n) = \sum_{i=1}^n p_i \dot{q}_i -
L(q_1, \cdots, q_n, \dot{q}_1, \cdots, \dot{q}_n). </math>

In thermodynamics, people perform this transformation on variables according to the type of thermodynamic system they want; for example, starting from the cardinal function of state, the internal energy <math>U(S,V)</math>, we have

::<math>\mathrm{d}U = T \mathrm{d} S - p \mathrm{d} V, </math>

so we can perform the Legendre transformation on either or both of <math>S, V </math> to yield

::<math>\mathrm{d} H = \mathrm{d} (U + pV) \ \ \ \ \ \ \ \ \ \ = \ \ \ \ T\mathrm{d} S + V \mathrm{d} p</math> 

::<math>\mathrm{d} F = \mathrm{d} (U - TS) \ \ \ \ \ \ \ \ \ \ = -S\mathrm{d} T - p \mathrm{d} V</math>

::<math>\mathrm{d} G = \mathrm{d} (U - TS + pV) = -S\mathrm{d} T + V \mathrm{d} p,</math>

and each of these three expressions has a physical meaning.

This definition of the Legendre transformation is the one originally introduced by Legendre in his work in 1787,<ref name=":0" /> and is still applied by physicists nowadays. Indeed, this definition can be made mathematically rigorous if we treat all the variables and functions defined above, for example <math>f,x_1,\cdots,x_n,p_1,\cdots,p_n, </math> as differentiable functions defined on an open set of <math>\R^n </math> or on a differentiable manifold, and <math>\mathrm{d} f, \mathrm{d} x_i, \mathrm{d} p_i </math> as their differentials (which are treated as cotangent vector fields in the context of differentiable manifolds). This definition is equivalent to the modern mathematicians' definition as long as <math>f </math> is differentiable and convex for the variables <math>x_1, x_2, \cdots, x_n. </math>

==Properties==
*The Legendre transform of a convex function whose second derivative is positive everywhere is again a convex function whose second derivative is positive everywhere.{{pb}}''Proof.'' Let <math>f(x)</math> be a twice-differentiable function with positive second derivative everywhere and with a bijective (invertible) derivative.{{pb}} For a fixed <math>p</math>, let <math>\bar{x}</math> maximize the function <math>px - f(x)</math> over <math>x</math>. Then the Legendre transformation of <math>f</math> is <math>f^*(p) = p\bar{x} - f(\bar{x})</math>, and<math display="block">f'(\bar{x}) = p</math>by the maximizing condition <math>\frac{d}{dx}(px - f(x)) = p - f'(x)= 0 </math>. Note that <math>\bar{x}</math> depends on <math>p </math>. (This is illustrated in the figure at the top of the article.){{pb}} Thus <math>\bar{x} = g(p)</math> where <math>g \equiv (f')^{-1}</math> is the inverse of the derivative <math>f'</math> (so <math>f'(g(p))= p</math>).{{pb}} Note that <math>g</math> is also differentiable, with derivative given by the [[Inverse functions and differentiation|inverse function rule]],<math display="block">\frac{dg(p)}{dp} = \frac{1}{f''(g(p))} ~.</math>Thus, the Legendre transformation <math>f^*(p) = pg(p) - f(g(p))</math> is a composition of differentiable functions, hence it is differentiable.{{pb}} Applying the [[product rule]] and the [[chain rule]] with the equality <math>\bar{x} = g(p)</math> yields<math display="block">\frac{d(f^{*})}{dp} = g(p) + \left(p - f'(g(p))\right)\cdot \frac{dg(p)}{dp} = g(p), </math>giving <math display="block">\frac{d^2(f^{*})}{dp^2} = \frac{dg(p)}{dp} = \frac{1}{f''(g(p))} > 0,</math>so <math>f^*</math> is convex and its second derivative is positive everywhere.
* The Legendre transformation is an [[Involution (mathematics)|involution]], i.e., <math>f^{**} = f ~</math>.{{pb}} ''Proof.'' Using the identities <math>f'(\bar{x}) = p</math>, <math>\bar{x} = g(p)</math>, <math>f^*(p) = p\bar{x} - f(\bar{x})</math> and <math>(f^*)'(p) = g(p)</math> obtained above, <math display="block">\begin{align}
f^{**}(y) &{} = \left(y\cdot \bar{p} - f^{*}(\bar{p})\right)|_{(f^{*})'(\bar{p}) = y} \\[5pt]
&{} = g(\bar{p})\cdot \bar{p} - f^{*}(\bar{p}) \\[5pt]
&{} = g(\bar{p})\cdot \bar{p} - (\bar{p} g(\bar{p})-f(g(\bar{p})))\\[5pt]
&{} = f(g(\bar{p})) \\[5pt]
&{} = f(y)~.
\end{align}</math>Note that this derivation does not require the second derivative of the original function <math>f</math> to be positive everywhere.
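
The involution property can also be checked numerically. The following Python sketch is a rough illustration rather than a proof: the test function <math>x^2 + x^4</math>, the grids and the helper names are hypothetical choices, and the two suprema are replaced by maxima over finite samples, so the agreement is only approximate.

```python
# Numerical sketch of f** = f using two brute-force conjugations on grids.
# All names and grid sizes are illustrative assumptions.

def linspace(lo, hi, n):
    """Return n equally spaced points from lo to hi inclusive."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def conjugate_on_grid(xs, fs, ps):
    """For each p in ps, return the maximum over samples of p*x - f(x)."""
    return [max(p * x - fx for x, fx in zip(xs, fs)) for p in ps]


def f(x):
    return x * x + x ** 4  # smooth and strictly convex


xs = linspace(-2.0, 2.0, 801)    # samples of the original variable
ps = linspace(-36.0, 36.0, 721)  # covers f'(x) = 2x + 4x^3 on [-2, 2]

f_star = conjugate_on_grid(xs, [f(x) for x in xs], ps)  # f* sampled on ps
test_xs = [-1.0, -0.3, 0.0, 0.5, 1.0]
f_star_star = conjugate_on_grid(ps, f_star, test_xs)    # f** at a few points

max_gap = max(abs(a - f(x)) for a, x in zip(f_star_star, test_xs))
print(max_gap)  # small, limited by the grid resolution
```

The range of <code>ps</code> is chosen to contain every slope of <math>f</math> on the sampled interval; with a narrower range the second conjugation would underestimate <math>f</math> near the ends.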

== Identities ==
As shown [[#Properties|above]], for a convex function <math>f(x)</math>, with <math>\bar{x}</math> the value of <math>x</math> that maximizes <math>px - f(x)</math> at each <math>p</math>, so that the Legendre transform is <math>f^*(p) = p\bar{x} - f(\bar{x})</math>, and with <math>g \equiv (f')^{-1}</math>, the following identities hold.

* <math>f'(\bar{x}) = p</math>,
* <math>\bar{x} = g(p)</math>,
* <math>(f^*)'(p) = g(p)</math>.

==Examples==

===Example 1===
[[Image:LegendreExample.svg|right|thumb|200px|<math> f(x) = e^x</math> over the domain <math>I=\mathbb{R}</math> is plotted in red and its Legendre transform <math>
f^*(x^*) = x^*(\ln(x^*) - 1)
</math> over the domain <math>I^* = (0, \infty)</math> in dashed blue. Note that the Legendre transform appears convex.]]
Consider the [[exponential function]] <math> f(x) = e^x,</math> which has the domain <math>I=\mathbb{R}</math>. From the definition, the Legendre transform is 
<math display="block">
f^*(x^*) = \sup_{x\in \mathbb{R}}(x^*x-e^x),\quad x^*\in I^*</math>
where <math>I^*</math> remains to be determined. To evaluate the [[Infimum and supremum|supremum]], compute the derivative of <math>x^*x-e^x</math> with respect to <math>x</math> and set equal to zero:
<math display="block">
\frac{d}{dx} (x^*x-e^x) = x^*-e^x = 0.
</math>
The [[Derivative_test#Second-derivative_test_(single_variable)|second derivative]] <math>-e^x</math> is negative everywhere, so the maximal value is achieved at <math>x = \ln(x^*)</math>. Thus, the Legendre transform is
<math display="block">
f^*(x^*) = x^*\ln(x^*)-e^{\ln(x^*)} = x^*(\ln(x^*) - 1)
</math>
and has domain <math>I^* = (0, \infty).</math> This illustrates that the [[domain of a function|domain]]s of a function and its Legendre transform can be different. 

To find the Legendre transformation of the Legendre transformation of <math> f</math>, write
<math display="block">
f^{**}(x) = \sup_{x^*\in I^*}(xx^*-x^*(\ln(x^*) - 1)),\quad x\in I,
</math>
where the variable <math>x</math> is intentionally used as the argument of <math>f^{**}</math> to exhibit the [[Involution (mathematics)|involution]] property <math>f^{**} = f</math>. Setting the derivative with respect to <math>x^*</math> equal to zero gives
<math display="block">
\begin{aligned}
0 
&= \frac{d}{dx^*}\big( xx^*-x^*(\ln(x^*) - 1) \big)
= x - \ln(x^*),
\end{aligned}
</math>
so the maximum occurs at <math>x^* = e^x</math>, because the second derivative <math>\frac{d^2}{{dx^*}^2}\big( xx^*-x^*(\ln(x^*) - 1) \big) = - \frac{1}{x^*}</math> is negative over the domain <math>I^* = (0, \infty)</math> of <math>f^*</math>. As a result, <math>f^{**}</math> is found as 
<math display="block">
\begin{aligned}
f^{**}(x)
&= xe^x - e^x(\ln(e^x) - 1) 
= e^x,
\end{aligned}
</math>
thereby confirming that <math>f = f^{**},</math> as expected.

===Example 2===
Let {{math|1=''f''(''x'') = ''cx''<sup>2</sup>}} defined on {{math|'''R'''}}, where {{math|''c'' > 0}} is a fixed constant.

For {{math|''x''*}} fixed, the function of {{mvar|x}}, {{math|1=''x''*''x'' − ''f''(''x'') = ''x''*''x'' − ''cx''<sup>2</sup>}} has the first derivative {{math|''x''* − 2''cx''}} and second derivative {{math|−2''c''}}; there is one stationary point at {{math|1=''x'' = ''x''*/(2''c'')}}, which is always a maximum.

Thus, {{math|1=''I''* = '''R'''}} and
<math display="block">f^*(x^*)=\frac{ {x^*}^2}{4c} ~.</math>

The first derivatives of {{math|''f''}}, 2{{math|''cx''}}, and of {{math|''f'' *}}, {{math|''x''*/(2''c'')}}, are inverse functions to each other. Clearly, furthermore,
<math display="block">f^{**}(x)=\frac{1}{4 \cdot \frac{1}{4c}}x^2=cx^2~,</math>
namely {{math|1=''f'' ** = ''f''}}.

===Example 3===
Let {{math|1=''f''(''x'') = ''x''<sup>2</sup>}} for {{math|''x'' ∈ ''I''}}, where {{math|1=''I'' = [2, 3]}}.

For {{math|''x''*}} fixed, {{math|''x''*''x'' − ''f''(''x'')}} is continuous on the [[compact space|compact]] interval {{mvar|I}}, hence it always attains a finite maximum on it; it follows that the domain of the Legendre transform of <math>f</math> is {{math|1=''I''* = '''R'''}}.

The stationary point at {{math|1=''x'' = ''x''*/2}} (found by setting the first derivative of {{math|''x''*''x'' − ''f''(''x'')}} with respect to <math>x</math> equal to zero) lies in the domain {{math|[2, 3]}} if and only if {{math|4 ≤ ''x''* ≤ 6}}. Otherwise the maximum is attained at an endpoint, because the second derivative of {{math|''x''*''x'' − ''f''(''x'')}} with respect to <math>x</math> is <math>-2</math>, which is negative: for <math>x^* < 4</math> the maximum over <math>x \in [2,3]</math> is attained at <math>x = 2</math>, while for <math>x^* > 6</math> it is attained at <math>x = 3</math>. Thus, it follows that
<math display="block">f^*(x^*)=\begin{cases}
2x^*-4, & x^*<4\\
\frac{{x^*}^2}{4}, & 4\leq x^*\leq 6,\\
3x^*-9, & x^*>6.
\end{cases}</math>
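
The piecewise formula can be cross-checked against a direct search over the interval. This Python sketch is an informal check, with hypothetical function names and an arbitrary grid size; it compares the three-branch expression above with a brute-force maximum of <math>x^*x - x^2</math> over <math>[2, 3]</math>.

```python
# Check of the piecewise transform of f(x) = x^2 restricted to [2, 3].
# Function names and the grid size are illustrative assumptions.

def f_star_exact(p):
    """Piecewise closed form: boundary branches outside 4 <= p <= 6."""
    if p < 4:
        return 2 * p - 4      # maximum attained at x = 2
    if p <= 6:
        return p * p / 4      # interior stationary point x = p / 2
    return 3 * p - 9          # maximum attained at x = 3


def f_star_grid(p, n=10_001):
    """Brute-force maximum of p*x - x^2 over n equally spaced x in [2, 3]."""
    step = 1.0 / (n - 1)
    return max(p * (2 + i * step) - (2 + i * step) ** 2 for i in range(n))


for p in (-1.0, 3.0, 4.0, 5.0, 6.0, 8.0):
    print(p, f_star_exact(p), f_star_grid(p))
```

The sample slopes include both breakpoints, where the branches join continuously.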

===Example 4===
The function {{math|1=''f''(''x'') = ''cx''}} is convex on the whole real line (strict convexity is not required for the Legendre transformation to be well defined). Clearly {{math|1=''x''*''x'' − ''f''(''x'') = (''x''* − ''c'')''x''}} is never [[Bounded function|bounded from above]] as a function of {{mvar|x}}, unless {{math|1=''x''* − ''c'' = 0}}. Hence {{math|''f''*}} is defined on {{math|1=''I''* = {''c''}<nowiki/>}} and {{math|1=''f''*(''c'') = 0}}. ([[#Definition|The definition of the Legendre transform]] requires the existence of the [[Infimum and supremum|supremum]], which in turn requires an upper bound.)

One may check involutivity: of course, {{math|''x''*''x'' − ''f''*(''x''*)}} is always bounded as a function of {{math|''x''*∈{''c''}<nowiki/>}}, hence {{math|1=''I''** = '''R'''}}. Then, for all {{mvar|x}} one has
<math display="block">\sup_{x^*\in\{c\}}(xx^*-f^*(x^*))=xc,</math>
and hence {{math|1=''f'' **(''x'') = ''cx'' = ''f''(''x'')}}.

=== Example 5 ===
As an example of a convex continuous function that is not everywhere differentiable, consider <math>f(x)= |x|</math>. This gives<math display="block">f^*(x^*) = \sup_{ x }(xx^*-|x|)=\max\left(\sup_{x\ge 0} x(x^*-1), 
 \,\sup_{x\le  0} x(x^*+1)  \right),</math>and thus <math>f^*(x^*)=0</math> on its domain <math>I^*=[-1,1]</math>.

===Example 6: several variables===
Let
<math display="block">f(x)=\langle x,Ax\rangle+c</math>
be defined on {{math|1=''X'' = '''R'''<sup>''n''</sup>}}, where {{mvar|A}} is a real, symmetric, positive definite matrix.

Then {{mvar|f}} is convex, and
<math display="block">\langle p,x\rangle-f(x)=\langle p,x \rangle-\langle x,Ax\rangle-c,</math>
has gradient {{math|''p'' − 2''Ax''}} and [[Hessian matrix|Hessian]] {{math|−2''A''}}, which is negative definite; hence the stationary point {{math|1=''x'' = ''A''<sup>−1</sup>''p''/2}} is a maximum.

We have {{math|1=''X''* = '''R'''<sup>''n''</sup>}}, and
<math display="block">f^*(p)=\frac{1}{4}\langle p,A^{-1}p\rangle-c.</math>
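
As a concrete instance (a hypothetical choice of data, used only for illustration), take {{math|1=''n'' = 2}}, <math>A = \operatorname{diag}(1, 2)</math> and {{math|1=''c'' = 3}}, so that <math>f(x) = x_1^2 + 2x_2^2 + 3</math>. The stationary point is <math>x = A^{-1}p/2 = (p_1/2,\, p_2/4)</math>, and substituting it into <math>\langle p,x\rangle - f(x)</math> gives
<math display="block">f^*(p) = \left(\frac{p_1^2}{2} + \frac{p_2^2}{4}\right) - \left(\frac{p_1^2}{4} + \frac{p_2^2}{8} + 3\right) = \frac{p_1^2}{4} + \frac{p_2^2}{8} - 3,</math>
which agrees with <math>\tfrac{1}{4}\langle p, A^{-1}p\rangle - c</math> for this choice of <math>A</math> and <math>c</math>.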

==Behavior of differentials under Legendre transforms==
The Legendre transform is linked to [[integration by parts]], {{math|1=''p dx'' = ''d''(''px'') − ''x dp''}}.

Let {{math|''f''(''x'',''y'')}} be a function of two independent variables {{mvar|x}} and {{mvar|y}}, with the differential
<math display="block">df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy = p\,dx + v\,dy.</math>

Assume that the function {{mvar|f}} is convex in {{mvar|x}} for all {{mvar|y}}, so that one may perform the Legendre transform on {{mvar|f}} in {{mvar|x}}, with {{mvar|p}} the variable conjugate to {{mvar|x}} (the two are related by <math>\frac{\partial f}{\partial x} |_{\bar{x}} = p</math>, where <math>\bar{x}</math> is the value of {{mvar|x}} that maximizes <math>px - f(x,y)</math> for given {{mvar|p}} and {{mvar|y}}). Since the new independent variable of the transform is {{mvar|p}}, the differentials {{math|''dx''}} and {{math|''dy''}} in {{mvar|df}} are replaced by {{math|''dp''}} and {{math|''dy''}} in the differential of the transform, i.e., we build another function whose differential is expressed in terms of the new basis {{math|''dp''}} and {{math|''dy''}}.

We thus consider the function {{math|1=''g''(''p'', ''y'') = ''f'' − ''px''}} so that
<math display="block">dg = df - p\,dx - x\,dp = -x\,dp + v\,dy</math>
<math display="block">x = -\frac{\partial g}{\partial p}</math>
<math display="block">v = \frac{\partial g}{\partial y}.</math>

The function {{math|−''g''(''p'', ''y'')}} is the Legendre transform of {{math|''f''(''x'', ''y'')}}, where only the independent variable {{mvar|x}} has been supplanted by {{mvar|p}}. This is widely used in [[thermodynamics]], as illustrated below.
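
For illustration, consider the hypothetical choice <math>f(x,y) = \tfrac{1}{2}x^2 + xy</math>, which is convex in {{mvar|x}} for every {{mvar|y}}. Here <math>p = \partial f/\partial x = x + y</math> and <math>v = \partial f/\partial y = x</math>, so <math>x = p - y</math> and
<math display="block">g(p,y) = f - px = \tfrac{1}{2}x^2 + xy - (x+y)x = -\tfrac{1}{2}x^2 = -\tfrac{1}{2}(p-y)^2.</math>
One can then check the two relations directly:
<math display="block">-\frac{\partial g}{\partial p} = p - y = x, \qquad \frac{\partial g}{\partial y} = p - y = x = v,</math>
and <math>-g(p,y) = \tfrac{1}{2}(p-y)^2</math> is indeed <math>\sup_x \left(px - f(x,y)\right)</math>.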

==Applications==

===Analytical mechanics===
A Legendre transform is used in [[classical mechanics]] to derive the [[Hamiltonian mechanics|Hamiltonian formulation]] from the [[Lagrangian mechanics|Lagrangian formulation]], and conversely. A typical Lagrangian has the form

<math display="block">L(v,q)=\tfrac{1}2\langle v,Mv\rangle-V(q),</math>
where <math>(v,q)</math> are coordinates on {{math|'''R'''<sup>''n''</sup> × '''R'''<sup>''n''</sup>}}, {{mvar|M}} is a positive definite real matrix, and
<math display="block">\langle x,y\rangle = \sum_j x_j y_j.</math>

For every {{mvar|q}} fixed, <math>L(v, q)</math> is a convex function of <math>v</math>, while <math>V(q)</math> plays the role of a constant.

Hence the Legendre transform of <math>L(v, q)</math> as a function of <math>v</math> is the Hamiltonian function,
<math display="block">H(p,q)=\tfrac {1}{2} \langle p,M^{-1}p\rangle+V(q).</math>
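
The intermediate steps can be sketched as follows, under the additional assumption (usual for a mass matrix, but not stated above) that {{mvar|M}} is symmetric. The conjugate momentum is <math>p = \partial L/\partial v = Mv</math>, so <math>v = M^{-1}p</math>, and
<math display="block">H(p,q) = \langle p, v\rangle - L(v,q) = \langle p, M^{-1}p\rangle - \tfrac{1}{2}\langle M^{-1}p, p\rangle + V(q) = \tfrac{1}{2}\langle p, M^{-1}p\rangle + V(q).</math>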

In a more general setting, <math>(v, q)</math> are local coordinates on the [[tangent bundle]] <math>T\mathcal M</math> of a manifold <math>\mathcal M</math>. For each {{mvar|q}}, <math>L(v, q)</math> is a convex function on the tangent space {{math|''V<sub>q</sub>''}}. The Legendre transform gives the Hamiltonian <math>H(p, q)</math> as a function of the coordinates {{math|(''p'', ''q'')}} of the [[cotangent bundle]] <math>T^*\mathcal M</math>; the inner product used to define the Legendre transform is inherited from the pertinent canonical [[symplectic vector space|symplectic structure]]. In this abstract setting, the Legendre transformation corresponds to the [[tautological one-form]].{{Explain|date=April 2023}}

===Thermodynamics===
The strategy behind the use of Legendre transforms in thermodynamics is to shift from a function that depends on a variable to a new (conjugate) function that depends on a new variable, the conjugate of the original one. The new variable is the partial derivative of the original function with respect to the original variable. The new function is the difference between the original function and the product of the old and new variables. Typically, this transformation is useful because it shifts the dependence of, e.g., the energy from an [[Intensive and extensive properties|extensive variable]] to its conjugate intensive variable, which can often be controlled more easily in a physical experiment.

For example, the [[internal energy]] {{mvar|U}} is an explicit function of the ''[[extensive quantity|extensive variables]]'' [[entropy]] {{mvar|S}}, [[volume]] ''{{mvar|V}}'', and [[chemical composition]] {{mvar|N<sub>i</sub>}} (e.g., <math> i = 1, 2, 3, \ldots</math>)
<math display="block"> U = U \left (S,V,\{N_i\} \right ),</math>
which has a total differential
<math display="block"> dU = T\,dS - P\,dV + \sum \mu_i \,dN _i</math>

where <math> T = \left. \frac{\partial U}{\partial S} \right \vert _{V, \{N_i\}}, \quad P = -\left. \frac{\partial U}{\partial V} \right \vert _{S, \{N_i\}}, \quad \mu_i = \left. \frac{\partial U}{\partial N_i} \right \vert _{S,V, \{N_j\}_{j \ne i}}</math>.

(The subscripts are not required by the definition of partial derivatives; they are kept here to make explicit which variables are held fixed.) Stipulating some common reference state, the [[enthalpy]] {{mvar|H}} may be obtained from the (non-standard) Legendre transform of the internal energy {{mvar|U}} with respect to volume {{mvar|V}}, as follows.

To get the (standard) Legendre transform <math display="inline">U^*</math> of the internal energy {{mvar|U}} with respect to volume {{mvar|V}}, the function <math display="inline">u\left( p,S,V,\{N_i\} \right)=pV-U</math> is defined first, and then maximized with respect to {{mvar|V}}. To do this, the condition <math display="inline">\frac{\partial u}{\partial V} = p - \frac{\partial U}{\partial V} = 0 \to p = \frac{\partial U}{\partial V}</math> needs to be satisfied, so <math display="inline">U^* = \frac{\partial U}{\partial V}V - U</math> is obtained. This approach is justified when {{mvar|U}} is a convex function of {{mvar|V}} at fixed {{mvar|S}} and {{mvar|N<sub>i</sub>}}, which is the usual condition for thermodynamic stability. The non-standard Legendre transform here is obtained by negating the standard version, so <math display="inline">-U^* = H = U - \frac{\partial U}{\partial V}V = U + PV</math>.

{{mvar|H}} is definitely a [[state function]] as it is obtained by adding {{mvar|PV}} ({{mvar|P}} and {{mvar|V}} as [[State variable|state variables]]) to a state function <math display="inline"> U = U \left (S,V,\{N_i\} \right )</math>, so its differential is an [[exact differential]]. Because of <math display="inline"> dH = T\,dS + V\,dP + \sum \mu_i \,dN _i</math> and the fact that it must be an exact differential, <math> H = H(S,P,\{N_i\})</math>.

The enthalpy is suitable for description of processes in which the pressure is controlled from the surroundings.

It is likewise possible to shift the dependence of the energy from the extensive variable of entropy, {{mvar|S}}, to the (often more convenient) intensive variable {{mvar|T}}, resulting in the [[Helmholtz energy|Helmholtz]] and [[Gibbs energy|Gibbs]] [[thermodynamic free energy|free energies]]. The Helmholtz free energy {{mvar|A}}, and Gibbs energy {{mvar|G}}, are obtained by performing Legendre transforms of the internal energy and enthalpy, respectively,
<math display="block"> A = U - TS ~,</math><math display="block"> G = H - TS = U + PV - TS ~.</math>

The Helmholtz free energy is often the most useful thermodynamic potential when temperature and volume are controlled from the surroundings, while the Gibbs energy is often the most useful when temperature and pressure are controlled from the surroundings.
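
The corresponding differentials follow from the product rule applied to the expression for <math>dU</math> given above. The short derivation below is a sketch that uses only the relations already stated in this section:
<math display="block">dA = dU - T\,dS - S\,dT = -S\,dT - P\,dV + \sum \mu_i \,dN_i,</math>
<math display="block">dG = dA + P\,dV + V\,dP = -S\,dT + V\,dP + \sum \mu_i \,dN_i,</math>
so that <math>A = A(T,V,\{N_i\})</math> and <math>G = G(T,P,\{N_i\})</math>, in line with the variables that are controlled in each case.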

===Variable capacitor===
As another example from [[physics]], consider a parallel conductive plate [[capacitor]], in which the plates can move relative to one another. Such a capacitor would allow transfer of the electric energy which is stored in the capacitor into external mechanical work, done by the [[force]] acting on the plates. One may think of the electric charge as analogous to the "charge" of a [[gas]] in a [[cylinder (engine)|cylinder]], with the resulting mechanical [[force]] exerted on a [[piston]].

Compute the force on the plates as a function of {{math|'''x'''}}, the distance which separates them. To find the force, compute the potential energy, and then apply the definition of force as the gradient of the potential energy function.

The [[Electric potential energy|electrostatic potential energy]] stored in a capacitor of [[capacitance]] {{math|''C''('''x''')}} with a positive [[electric charge]] {{math|+''Q''}} on one conductive plate and a negative charge {{math|-''Q''}} on the other is (using the definition of the capacitance, <math display="inline">C = \frac{Q}{V}</math>),

<math display="block"> U (Q, \mathbf{x}) = \frac{1}{2} QV(Q,\mathbf{x}) = \frac{1}{2} \frac{Q^2}{C(\mathbf{x})},~</math>

where the dependence on the area of the plates, the dielectric constant of the insulation material between the plates, and the separation {{math|'''x'''}} are abstracted away as the [[capacitance]] {{math|''C''('''x''')}}. (For a parallel plate capacitor, this is proportional to the area of the plates and inversely proportional to the separation.)

The force {{math|'''F'''}} between the plates due to the electric field created by the charge separation is then
<math display="block"> \mathbf{F}(\mathbf{x}) = -\frac{dU}{d\mathbf{x}} ~. </math>

If the capacitor is not connected to any electric circuit, then the ''[[electric charge|electric charges]]'' on the plates remain constant and the voltage varies when the plates move with respect to each other, and the force is the negative [[gradient]] of the [[electrostatics|electrostatic]] potential energy as
<math display="block"> \mathbf{F}(\mathbf{x}) = \frac{1}{2} \frac{dC(\mathbf{x})}{d\mathbf{x}} \frac{Q^2}{{C(\mathbf{x})}^2}
= \frac{1}{2} \frac{dC(\mathbf{x})}{d\mathbf{x}}V(\mathbf{x})^2 </math>

where <math display="inline"> V(Q,\mathbf{x}) = V(\mathbf{x})  </math> as the charge is fixed in this configuration.

Suppose instead that the ''[[volt]]age'' {{math|''V''}} between the plates is held constant as the plates move, by connecting the capacitor to a [[battery (electricity)|battery]], which is a reservoir for electric charge at a constant potential difference. Then the ''charge'' <math display="inline"> Q </math> ''is the variable'' rather than the voltage; <math display="inline"> Q </math> and <math display="inline"> V </math> are Legendre conjugates of each other. To find the force, first compute the non-standard Legendre transform <math display="inline">U^*</math> of <math display="inline">U</math> with respect to <math display="inline"> Q </math> (non-standard in that its overall sign is opposite to the usual convention; again using <math display="inline">C = \frac{Q}{V}</math>),

<math display="block">U^* = U - \left.\frac{\partial U}{\partial Q} \right|_\mathbf{x} \cdot Q =U - \frac{1}{2C(\mathbf{x})} \left. \frac{\partial Q^2}{\partial Q} \right|_\mathbf{x} \cdot Q = U - QV = \frac{1}{2} QV - QV = -\frac{1}{2} QV= - \frac{1}{2} V^2 C(\mathbf{x}).</math>

This transformation is possible because, at fixed {{math|'''x'''}}, <math display="inline"> U = \frac{Q^2}{2C(\mathbf{x})} </math> is a quadratic function of <math display="inline"> Q </math> with a positive coefficient, and so is convex in <math display="inline"> Q </math>. The force is now the negative gradient of this Legendre transform, which gives the same force as the one obtained from the original function <math display="inline"> U </math>,
<math display="block"> \mathbf{F}(\mathbf{x}) = -\frac{dU^*}{d\mathbf{x}} = \frac{1}{2} \frac{dC(\mathbf{x})}{d\mathbf{x}}V^2 .</math>

The two conjugate energies <math display="inline"> U </math> and <math display="inline"> U^* </math> happen to have opposite signs only because of the [[linear]]ity of the [[capacitance]]; the difference is that in the second case {{math|''Q''}} is no longer a constant. They reflect the two different pathways of storing energy in the capacitor, both of which result in the same "pull" between the capacitor's plates.
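The agreement of the two force expressions can be checked numerically. The following is a minimal sketch, assuming an ideal parallel-plate capacitor with <math display="inline">C(x) = \varepsilon A / x</math>; the constant <code>EPS_A</code>, the sample state and all function names are illustrative assumptions and not part of the derivation above. It differentiates <math display="inline">U</math> at fixed charge and <math display="inline">U^*</math> at fixed voltage by central differences.

```python
# Illustrative sketch: force on capacitor plates from U (fixed Q) and U* (fixed V).
# Assumption: ideal parallel-plate capacitance C(x) = EPS_A / x, arbitrary units.

EPS_A = 2.0  # permittivity times plate area (hypothetical value)


def capacitance(x):
    return EPS_A / x


def energy_U(Q, x):
    """U(Q, x) = Q^2 / (2 C(x)): the stored energy at given charge."""
    return Q * Q / (2.0 * capacitance(x))


def energy_U_star(V, x):
    """U*(V, x) = -V^2 C(x) / 2: the transform of U with respect to Q."""
    return -0.5 * V * V * capacitance(x)


def derivative(g, x, h=1e-6):
    """Central finite-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)


def force_fixed_charge(Q, x):
    """F = -dU/dx with the charge Q held constant."""
    return -derivative(lambda s: energy_U(Q, s), x)


def force_fixed_voltage(V, x):
    """F = -dU*/dx with the voltage V held constant."""
    return -derivative(lambda s: energy_U_star(V, s), x)


Q, x = 3.0, 0.5
V = Q / capacitance(x)  # voltage of the same state, from C = Q / V
print(force_fixed_charge(Q, x), force_fixed_voltage(V, x))
```

Under these assumptions both values agree with <math display="inline">\tfrac{1}{2} C'(x) V^2</math>, which is negative here because the capacitance decreases as the separation grows, so the plates attract.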

===Probability theory===
In [[large deviations theory]], the ''rate function'' is defined as the Legendre transformation of the logarithm of the [[moment generating function]] of a random variable. An important application of the rate function is in the calculation of tail probabilities of sums of [[Independent and identically distributed random variables|i.i.d. random variables]], in particular in [[Cramér's theorem (large deviations)|Cramér's theorem]].

If <math>X_n</math> are i.i.d. random variables, let <math>S_n=X_1+\cdots+X_n</math> be the associated [[random walk]] and <math>M(\xi)</math> the moment generating function of <math>X_1</math>.  For <math>\xi\in\mathbb R</math>, <math>E[e^{\xi S_n}] = M(\xi)^n</math>.  Hence, by [[Markov's inequality]], one has for <math>\xi\ge 0</math> and <math>a\in\mathbb R</math>
<math display="block">P(S_n/n > a) \le e^{-n\xi a}M(\xi)^n=\exp[-n(\xi a - \Lambda(\xi))]</math>
where <math>\Lambda(\xi)=\log M(\xi)</math>.  Since the left-hand side is independent of <math>\xi</math>, we may take the infimum of the right-hand side, which leads one to consider the supremum of <math>\xi a - \Lambda(\xi)</math>, i.e., the Legendre transform of <math>\Lambda</math>, evaluated at <math>x=a</math>.
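As an illustration, the bound can be evaluated for a fair coin, <math display="inline">X_i \in \{0, 1\}</math> with probability <math display="inline">1/2</math> each, for which <math display="inline">\Lambda(\xi) = \log\tfrac{1 + e^\xi}{2}</math>. The sketch below is a hedged example: the choice of distribution, the search interval <code>xi_max</code> and the function names are assumptions made for illustration. It approximates the supremum by a ternary search (valid because <math display="inline">\xi a - \Lambda(\xi)</math> is concave in <math display="inline">\xi</math>) and compares the resulting bound with the exact binomial tail.

```python
import math


def log_mgf(xi):
    """Lambda(xi) = log M(xi) for a fair coin X in {0, 1}."""
    return math.log((1.0 + math.exp(xi)) / 2.0)


def rate_function(a, xi_max=30.0, iters=200):
    """Approximate sup over 0 <= xi <= xi_max of xi*a - Lambda(xi)."""
    def h(xi):
        return xi * a - log_mgf(xi)  # concave in xi

    lo, hi = 0.0, xi_max
    for _ in range(iters):  # ternary search for the maximum of h
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if h(m1) < h(m2):
            lo = m1
        else:
            hi = m2
    return h(0.5 * (lo + hi))


def exact_tail(n, a):
    """P(S_n / n > a) for n fair coin flips, summed exactly."""
    return sum(math.comb(n, k) for k in range(n + 1) if k > a * n) / 2.0 ** n


n, a = 100, 0.7
print(exact_tail(n, a), math.exp(-n * rate_function(a)))  # tail, then its bound
```

For <math display="inline">1/2 < a < 1</math> the numerical value can be compared with the closed form <math display="inline">a \log a + (1-a)\log(1-a) + \log 2</math> of this particular rate function.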

===Microeconomics===
The Legendre transformation arises naturally in [[microeconomics]] in the process of finding the ''[[supply (economics)|supply]]'' {{math|''S''(''P'')}} of some product given a fixed market price {{math|''P''}}, knowing the [[cost curve|cost function]] {{math|''C''(''Q'')}}, i.e. the cost for the producer to make, mine, or otherwise obtain {{math|''Q''}} units of the given product.

A simple theory explains the shape of the supply curve based solely on the cost function. Let us suppose the market price for one unit of our product is {{math|''P''}}. For a company selling this good, the best strategy is to adjust the production {{math|''Q''}} so that its profit is maximized. We can maximize the profit
<math display="block">\text{profit} = \text{revenue} - \text{costs} = PQ - C(Q)</math>
by differentiating with respect to {{math|''Q''}} and solving
<math display="block">P - C'(Q_\text{opt}) = 0.</math>

{{math|''Q''<sub>opt</sub>}} represents the optimal quantity {{math|''Q''}} of goods that the producer is willing to supply, which is indeed the supply itself:
<math display="block">S(P) = Q_\text{opt}(P) = (C')^{-1}(P).</math>

If we consider the maximal profit as a function of price, <math>\text{profit}_\text{max}(P)</math>, we see that it is the Legendre transform of the cost function <math>C(Q)</math>.
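A small numerical sketch of this argument follows, assuming a quadratic cost function <math display="inline">C(Q) = \tfrac{1}{2} K Q^2</math>; the constant <code>K</code>, the search bound <code>q_max</code> and the function names are hypothetical. For this cost the closed forms are <math display="inline">S(P) = P/K</math> and <math display="inline">\text{profit}_\text{max}(P) = P^2/(2K)</math>, which the script compares with a direct maximization of the profit.

```python
# Illustrative sketch: supply and maximal profit from a cost function.
# Assumption: quadratic cost C(Q) = K * Q**2 / 2 with a made-up constant K.

K = 4.0  # hypothetical cost coefficient


def cost(Q):
    return 0.5 * K * Q * Q


def supply(P, q_max=1000.0, iters=200):
    """Quantity in [0, q_max] maximizing the profit P*Q - C(Q) (ternary search)."""
    lo, hi = 0.0, q_max
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if P * m1 - cost(m1) < P * m2 - cost(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


def max_profit(P):
    """Legendre transform of the cost function, evaluated at the price P."""
    Q = supply(P)
    return P * Q - cost(Q)


P = 10.0
print(supply(P), P / K)                   # numerical vs. closed-form supply
print(max_profit(P), P * P / (2.0 * K))   # numerical vs. closed-form profit
```

The ternary search relies on the profit being concave in {{math|''Q''}}, which holds whenever the cost function is convex.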

==Geometric interpretation==
For a [[strictly convex function]], the Legendre transformation can be interpreted as a mapping between the [[Graph of a function|graph]] of the function and the family of [[tangent]]s of the graph. (For a function of one variable, the tangents are well-defined at all but at most [[Countable set|countably many]] points, since a convex function is [[Derivative|differentiable]] at all but at most countably many points.)

The equation of a line with [[slope]] <math>p</math> and [[Y-intercept|<math>y</math>-intercept]] <math>b</math> is given by <math>y = p x + b</math>. For this line to be tangent to the graph of a function <math>f</math> at the point <math>\left(x_0, f(x_0)\right)</math> requires
<math display="block">f(x_0) = p x_0 + b</math>
and
<math display="block">p = f'(x_0).</math>

Being the derivative of a strictly convex function, the function <math>f'</math> is strictly monotone and thus [[Injective function|injective]]. The second equation can be solved for <math display="inline">x_0 = f^{\prime-1}(p),</math> allowing elimination of <math>x_0</math> from the first, and solving for the <math>y</math>-intercept <math>b</math> of the tangent as a function of its slope <math>p,</math> <math display="inline">b = f(x_0) - p x_0 = f\left(f^{\prime-1}(p)\right) - p \cdot f^{\prime-1}(p) = -f^\star(p)</math> where <math>f^{\star}</math> denotes the Legendre transform of <math>f.</math>

The [[Indexed family|family]] of tangent lines of the graph of <math>f</math> parameterized by the slope <math>p</math> is therefore given by
<math display="inline">y = p x - f^{\star}(p),</math> or, written implicitly, by the solutions of the equation
<math display="block">F(x,y,p) = y + f^{\star}(p) - p x = 0~.</math>

The graph of the original function can be reconstructed from this family of lines as the [[Envelope (mathematics)|envelope]] of this family by demanding
<math display="block">\frac{\partial F(x,y,p)}{\partial p} = f^{\star\prime}(p) - x = 0.</math>

Eliminating <math>p</math> from these two equations gives
<math display="block">y = x \cdot f^{\star\prime-1}(x) - f^{\star}\left(f^{\star\prime-1}(x)\right).</math>

Identifying <math>y</math> with <math>f(x)</math> and recognizing the right side of the preceding equation as the Legendre transform of <math>f^{\star}</math> yields <math display="inline">f(x) = f^{\star\star}(x) ~.</math>
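The envelope construction can be illustrated numerically. The sketch below assumes the example <math display="inline">f(x) = e^x</math>, whose transform is <math display="inline">f^\star(p) = p\ln p - p</math> for <math display="inline">p > 0</math>; the finite grid of slopes and the function names are illustrative assumptions. Every tangent line lies on or below the graph of <math display="inline">f</math>, and the pointwise maximum over the sampled slopes approximates <math display="inline">f</math> from below.

```python
import math


def f(x):
    return math.exp(x)


def f_star(p):
    """Legendre transform of exp, valid for p > 0."""
    return p * math.log(p) - p


def tangent_line(p):
    """Tangent to the graph of f with slope p: y = p*x - f_star(p)."""
    return lambda x: p * x - f_star(p)


def envelope(x, slopes):
    """Upper envelope of the sampled tangent lines, evaluated at x."""
    return max(tangent_line(p)(x) for p in slopes)


slopes = [0.01 * k for k in range(1, 2001)]  # slopes 0.01, 0.02, ..., 20.00

for x in (-1.0, 0.0, 1.0, 2.0):
    print(x, f(x), envelope(x, slopes))  # the envelope stays at or just below f(x)
```

Refining the grid of slopes tightens the approximation; in the limit of all slopes the envelope is exactly <math display="inline">f^{\star\star} = f</math>.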

==Legendre transformation in more than one dimension==
For a differentiable real-valued function {{mvar|f}} on an [[open set|open]] convex subset {{mvar|U}} of {{math|'''R'''<sup>''n''</sup>}}, the Legendre conjugate of the pair {{math|(''U'', ''f'')}} is defined to be the pair {{math|(''V'', ''g'')}}, where {{mvar|V}} is the image of {{mvar|U}} under the [[gradient]] mapping {{math|''Df''}}, and {{mvar|g}} is the function on {{mvar|V}} given by the formula
<math display="block">g(y) = \left\langle y, x \right\rangle - f(x), \qquad x = \left(Df\right)^{-1}(y)</math>
where
<math display="block">\left\langle u,v\right\rangle = \sum_{k=1}^n u_k \cdot v_k</math>

is the [[scalar product]] on {{math|'''R'''<sup>''n''</sup>}}. The multidimensional transform can be interpreted as an encoding of the [[convex hull]] of the function's [[epigraph (mathematics)|epigraph]] in terms of its [[supporting hyperplane]]s.<ref>{{Cite web |url=http://maze5.net/?page_id=733 |title=Legendre Transform {{pipe}} Nick Alger // Maps, art, etc |access-date=2011-01-26 |archive-url=https://web.archive.org/web/20150312152731/http://maze5.net/?page_id=733 |archive-date=2015-03-12 |url-status=dead }}</ref> This can be seen as a consequence of the following two observations. On the one hand, the hyperplane tangent to the epigraph of <math>f</math> at some point <math>(\mathbf x, f(\mathbf x))\in U\times \mathbb{R}</math> has normal vector <math>(\nabla f(\mathbf x),-1)\in\mathbb{R}^{n+1}</math>. On the other hand, any closed convex set <math>C\subseteq\mathbb{R}^m</math> can be characterized via the set of its [[Supporting hyperplane|supporting hyperplanes]] by the equations <math>\mathbf x\cdot\mathbf n = h_C(\mathbf n)</math>, where <math>h_C(\mathbf n)</math> is the [[support function]] of <math>C</math>. But the definition of the Legendre transform via the maximization matches precisely that of the support function, that is, <math>f^*(\mathbf p)=h_{\operatorname{epi}(f)}(\mathbf p,-1) </math>. We thus conclude that the Legendre transform characterizes the epigraph in the sense that the tangent plane to the epigraph at any point <math>(\mathbf x,f(\mathbf x))</math>, with <math>\mathbf p = \nabla f(\mathbf x)</math>, is given explicitly by<math display="block">\{\mathbf z\in\mathbb{R}^{n+1}: \,\, \mathbf z\cdot (\mathbf p,-1)= f^*(\mathbf p)\}. </math>

Alternatively, if {{mvar|X}} is a [[vector space]] and {{math|''Y''}} is its [[dual space|dual vector space]], then for each point {{mvar|x}} of {{math|''X''}} and {{math|''y''}} of {{math|''Y''}}, there is a natural identification of the [[cotangent space]]s {{math|T*''X<sub>x</sub>''}} with {{math|''Y''}} and {{math|T*''Y<sub>y</sub>''}} with {{math|''X''}}. If {{mvar|f}} is a real differentiable function over {{math|''X''}}, then its [[exterior derivative]], {{math|''df''}}, is a section of the [[cotangent bundle]] {{math|T*''X''}} and as such, we can construct a map from {{math|''X''}} to {{math|''Y''}}. Similarly, if {{mvar|g}} is a real differentiable function over {{math|''Y''}}, then {{math|''dg''}} defines a map from {{math|''Y''}} to {{math|''X''}}. If both maps happen to be inverses of each other, we say we have a Legendre transform. The notion of the [[tautological one-form]] is commonly used in this setting.

When the function is not differentiable, the Legendre transform can still be extended, and is known as the [[Legendre-Fenchel transformation]]. In this more general setting, a few properties are lost: for example, the Legendre transform is no longer its own inverse (unless there are extra assumptions, like [[convex function|convexity]]).
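As a concrete instance of the coordinate definition <math display="inline">g(y) = \langle y, x\rangle - f(x)</math> with <math display="inline">x = (Df)^{-1}(y)</math> given at the start of this section, consider the quadratic form <math display="inline">f(x) = \tfrac{1}{2} x^\mathsf{T} A x</math> with <math display="inline">A</math> symmetric positive definite, for which <math display="inline">Df(x) = Ax</math> and <math display="inline">g(y) = \tfrac{1}{2} y^\mathsf{T} A^{-1} y</math>. The sketch below checks this in two dimensions; the matrix <code>A</code> and the helper names are illustrative assumptions.

```python
# Illustrative sketch: Legendre conjugate of f(x) = x^T A x / 2 in two dimensions.
# Assumption: A is a made-up symmetric positive definite 2x2 matrix.

A = [[2.0, 1.0],
     [1.0, 3.0]]


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def mat_vec(M, v):
    return [dot(M[0], v), dot(M[1], v)]


def inverse_2x2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]


def f(x):
    return 0.5 * dot(x, mat_vec(A, x))


def grad_f(x):
    """Gradient mapping Df(x) = A x."""
    return mat_vec(A, x)


def legendre_conjugate(y):
    """g(y) = <y, x> - f(x) with x = (Df)^{-1}(y) = A^{-1} y."""
    x = mat_vec(inverse_2x2(A), y)
    return dot(y, x) - f(x)


y = [1.0, -2.0]
closed_form = 0.5 * dot(y, mat_vec(inverse_2x2(A), y))
print(legendre_conjugate(y), closed_form)
```

Here the gradient mapping is a linear bijection of {{math|'''R'''<sup>2</sup>}}, so {{mvar|V}} is the whole plane.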

==Legendre transformation on manifolds==

Let <math display="inline">M</math> be a [[smooth manifold]], let <math>E</math> and <math display="inline">\pi : E\to M</math> be a [[vector bundle]] on <math>M</math> and its associated [[bundle projection]], respectively. Let <math display="inline">L : E\to \R</math> be a smooth function. We think of <math display="inline">L</math> as a [[Lagrangian mechanics|Lagrangian]] by analogy with the classical case where <math display="inline">M = \R</math>, <math display="inline">E = TM = \Reals \times \Reals </math> and <math display="inline">L(x,v) = \frac 1 2 m v^2 - V(x)</math> for some positive number <math display="inline">m\in \Reals</math> and function <math display="inline">V : M \to \Reals</math>.

As usual, the [[dual bundle|dual]] of <math display="inline">E</math> is denoted by <math display="inline">E^*</math>. The fiber of <math display="inline">\pi</math> over <math display="inline">x\in M</math> is denoted <math display="inline">E_x</math>, and the restriction of <math display="inline">L</math> to <math display="inline">E_x</math> is denoted by <math display="inline">L|_{E_x} : E_x\to \R</math>. The ''Legendre transformation'' of <math display="inline">L</math> is the smooth morphism <math display="block">\mathbf F L : E \to E^*</math> defined by <math display="inline">\mathbf FL(v) = d(L|_{E_x})_v \in E_x^*</math>, where <math display="inline">x = \pi(v)</math>. Here we use the fact that since <math display="inline">E_x</math> is a vector space, <math display="inline">T_v(E_x)</math> can be identified with <math display="inline">E_x</math>.
In other words, <math display="inline">\mathbf FL(v)\in E_x^*</math> is the covector that sends <math display="inline">w\in E_x</math> to the directional derivative <math display="inline">\left.\frac d {dt}\right|_{t=0} L(v + tw)\in \R</math>.

To describe the Legendre transformation locally, let <math display="inline">U\subseteq M</math> be a coordinate chart over which <math display="inline">E</math> is trivial. Picking a trivialization of <math display="inline">E</math> over <math display="inline">U</math>, we obtain charts <math display="inline">E_U \cong U \times \R^r</math> and <math display="inline">E_U^* \cong U \times \R^r</math>. In terms of these charts, we have <math display="inline">\mathbf FL(x; v_1, \dotsc, v_r) = (x; p_1,\dotsc, p_r)</math>, where <math display="block">p_i = \frac {\partial L}{\partial v_i}(x; v_1, \dotsc, v_r)</math> for all <math display="inline">i = 1, \dots, r</math>. If, as in the classical case, the restriction of <math display="inline">L : E\to \mathbb R</math> to each fiber <math display="inline">E_x</math> is strictly convex and bounded below by a positive definite quadratic form minus a constant, then the Legendre transform <math display="inline">\mathbf FL : E\to E^*</math> is a diffeomorphism.<ref name="CdS2008">Ana Cannas da Silva. ''Lectures on Symplectic Geometry'', Corrected 2nd printing. Springer-Verlag, 2008. pp. 147-148. {{ISBN|978-3-540-42195-5}}.</ref> Suppose that <math display="inline">\mathbf FL</math> is a diffeomorphism and let <math display="inline">H : E^* \to \R</math> be the "[[Hamiltonian mechanics|Hamiltonian]]" function defined by <math display="block">H(p) = p \cdot v - L(v),</math> where <math display="inline">v = (\mathbf FL)^{-1}(p)</math>. Using the natural isomorphism <math display="inline">E\cong E^{**}</math>, we may view the Legendre transformation of <math display="inline">H</math> as a map <math display="inline">\mathbf FH : E^* \to E</math>. Then we have<ref name="CdS2008"/> <math display="block">(\mathbf FL)^{-1} = \mathbf FH.</math>
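For the classical Lagrangian <math display="inline">L(x,v) = \tfrac 1 2 m v^2 - V(x)</math> on <math display="inline">E = \R\times\R</math>, the local formula gives <math display="inline">\mathbf FL(x; v) = (x; mv)</math> and <math display="inline">H(x; p) = \tfrac{p^2}{2m} + V(x)</math>. The following sketch checks numerically that <math display="inline">\mathbf FH</math> inverts <math display="inline">\mathbf FL</math> in this case; the mass, the potential and the function names are illustrative assumptions, and the fiber derivatives are approximated by central differences.

```python
import math

M_MASS = 2.0  # hypothetical positive mass


def potential(x):
    return math.cos(x)  # hypothetical potential V(x)


def lagrangian(x, v):
    return 0.5 * M_MASS * v * v - potential(x)


def fiber_derivative(func, x, u, h=1e-6):
    """Central-difference derivative of func(x, .) along the fiber over x."""
    return (func(x, u + h) - func(x, u - h)) / (2.0 * h)


def FL(x, v):
    """Legendre transformation E -> E*: (x, v) -> (x, dL/dv)."""
    return (x, fiber_derivative(lagrangian, x, v))


def hamiltonian(x, p):
    """H(p) = p*v - L(v) with v = (FL)^{-1}(p), in closed form for this L."""
    v = p / M_MASS
    return p * v - lagrangian(x, v)


def FH(x, p):
    """Legendre transformation E* -> E: (x, p) -> (x, dH/dp)."""
    return (x, fiber_derivative(hamiltonian, x, p))


x, v = 0.3, 1.7
print(FL(x, v))        # approximately (x, M_MASS * v)
print(FH(*FL(x, v)))   # approximately the starting point (x, v)
```

The closed-form inverse <code>v = p / M_MASS</code> is specific to this Lagrangian; for a general fiberwise strictly convex <math display="inline">L</math> the inverse of <math display="inline">\mathbf FL</math> would have to be computed numerically.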

==Further properties==

===Scaling properties===
The Legendre transformation has the following scaling properties: For {{math|''a'' > 0}},

<math display="block">f(x) = a \cdot g(x) \Rightarrow f^\star(p) = a \cdot g^\star\left(\frac{p}{a}\right) </math>
<math display="block">f(x) = g(a \cdot x) \Rightarrow f^\star(p) = g^\star\left(\frac{p}{a}\right).</math>

It follows that if a function is [[homogeneous function|homogeneous of degree {{mvar|r}}]] then its image under the Legendre transformation is a homogeneous function of degree {{mvar|s}}, where {{math|1=1/''r'' + 1/''s'' = 1}}. (For example, {{math|1=''f''(''x'') = ''x<sup>r</sup>''/''r''}}, with {{math|''r'' > 1}}, implies {{math|1=''f''*(''p'') = ''p<sup>s</sup>''/''s''}}.) Thus, the only monomial whose degree is invariant under the Legendre transform is the quadratic.
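These rules can be checked numerically. The sketch below assumes the test function <math display="inline">g(x) = |x|^r/r</math> with <math display="inline">r = 3</math>, so that <math display="inline">s = 3/2</math>; the search interval, the sample values of <math display="inline">a</math> and <math display="inline">p</math>, and the helper <code>legendre</code> are illustrative choices. The transform is approximated by a ternary search, which applies because <math display="inline">px - g(x)</math> is concave in <math display="inline">x</math>.

```python
# Illustrative sketch: numerical check of the scaling rules for g(x) = |x|**R / R.

R = 3.0              # degree of homogeneity of g (hypothetical choice)
S = R / (R - 1.0)    # conjugate degree, from 1/R + 1/S = 1


def g(x):
    return abs(x) ** R / R


def legendre(func, p, lo=-50.0, hi=50.0, iters=200):
    """Approximate sup over [lo, hi] of p*x - func(x) for a convex func."""
    for _ in range(iters):  # ternary search on the concave map x -> p*x - func(x)
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if p * m1 - func(m1) < p * m2 - func(m2):
            lo = m1
        else:
            hi = m2
    x = 0.5 * (lo + hi)
    return p * x - func(x)


def g_star(q):
    return legendre(g, q)


a, p = 3.0, 2.0
print(g_star(p), abs(p) ** S / S)                            # power-law conjugate
print(legendre(lambda x: a * g(x), p), a * g_star(p / a))    # f(x) = a * g(x)
print(legendre(lambda x: g(a * x), p), g_star(p / a))        # f(x) = g(a * x)
```

Each printed pair should agree up to the accuracy of the search.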

===Behavior under translation===
<math display="block"> f(x) = g(x) + b \Rightarrow f^\star(p) = g^\star(p) - b</math>
<math display="block"> f(x) = g(x + y) \Rightarrow f^\star(p) = g^\star(p) - p \cdot y </math>

===Behavior under inversion===
<math display="block"> f(x) = g^{-1}(x) \Rightarrow f^\star(p) = - p \cdot g^\star\left(\frac{1}{p} \right) </math>

===Behavior under linear transformations===
Let {{math|''A'' : '''R'''<sup>''n''</sup> → '''R'''<sup>''m''</sup>}} be a [[linear transformation]]. For any convex function {{mvar|f}} on {{math|'''R'''<sup>''n''</sup>}}, one has
<math display="block"> (A f)^\star = f^\star A^\star </math>
where {{math|''A''*}} is the [[adjoint operator]] of {{mvar|A}} defined by
<math display="block"> \left \langle Ax, y^\star \right \rangle = \left \langle x, A^\star y^\star \right \rangle, </math>
and {{math|''Af''}} is the ''push-forward'' of {{mvar|f}} along {{mvar|A}}
<math display="block"> (A f)(y) = \inf\{ f(x) : x \in X , A x = y \}. </math>

A closed convex function {{mvar|f}} is symmetric with respect to a given set {{mvar|G}} of [[orthogonal matrix|orthogonal linear transformations]],
<math display="block">f(A x) = f(x), \; \forall x, \; \forall A \in G </math>
[[if and only if]] {{math|''f''*}} is symmetric with respect to {{mvar|G}}.

===Infimal convolution===
The '''infimal convolution''' of two functions {{mvar|f}} and {{mvar|g}} is defined as

<math display="block"> \left(f \star_\inf g\right)(x) = \inf \left \{ f(x-y) + g(y) \, | \, y \in \mathbf{R}^n \right \}. </math>

Let {{math|''f''<sub>1</sub>, ..., ''f<sub>m</sub>''}} be proper convex functions on {{math|'''R'''<sup>''n''</sup>}}. Then

<math display="block"> \left( f_1 \star_\inf \cdots \star_\inf f_m \right)^\star = f_1^\star + \cdots + f_m^\star. </math>
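A numerical illustration for <math display="inline">m = 2</math> and <math display="inline">n = 1</math> is sketched below, assuming the quadratics <math display="inline">f_1(x) = \tfrac{1}{2}Ax^2</math> and <math display="inline">f_2(x) = \tfrac{1}{2}Bx^2</math>; the constants, the search interval and the function names are illustrative assumptions. Both the infimum and the suprema are approximated by ternary search, which is justified here because all the functions involved are convex.

```python
# Illustrative sketch: conjugate of an infimal convolution of two quadratics.

A, B = 1.0, 3.0  # hypothetical curvatures of f1 and f2


def f1(x):
    return 0.5 * A * x * x


def f2(x):
    return 0.5 * B * x * x


def maximize(h, lo=-20.0, hi=20.0, iters=100):
    """Maximum value of a concave function h on [lo, hi] (ternary search)."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if h(m1) < h(m2):
            lo = m1
        else:
            hi = m2
    return h(0.5 * (lo + hi))


def inf_convolution(x):
    """(f1 inf-conv f2)(x) = inf over y of f1(x - y) + f2(y)."""
    return -maximize(lambda y: -(f1(x - y) + f2(y)))


def conjugate(func, p):
    """Legendre transform of a convex func at slope p."""
    return maximize(lambda x: p * x - func(x))


p = 2.0
print(conjugate(inf_convolution, p), conjugate(f1, p) + conjugate(f2, p))
```

For these quadratics the infimal convolution is <math display="inline">\tfrac{1}{2}\tfrac{AB}{A+B}x^2</math>, so both printed numbers approximate <math display="inline">\tfrac{p^2}{2A} + \tfrac{p^2}{2B}</math>.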

===Fenchel's inequality===
For any function {{mvar|f}} and its convex conjugate {{math|''f'' *}}, ''Fenchel's inequality'' (also known as the ''Fenchel–Young inequality'') holds for every {{math|''x'' ∈ ''X''}} and {{math|''p'' ∈ ''X''*}}, i.e., for ''independent'' {{math|''x'', ''p''}} pairs:
<math display="block">\left\langle p,x \right\rangle \le f(x) + f^\star(p).</math>
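As a simple check, take <math display="inline">f(x) = x^4/4</math>, whose conjugate is <math display="inline">f^\star(p) = \tfrac{3}{4}|p|^{4/3}</math>. The sketch below evaluates the gap <math display="inline">f(x) + f^\star(p) - px</math> on a grid of independent <math display="inline">(x, p)</math> pairs and at a conjugate pair <math display="inline">p = f'(x) = x^3</math>, where the inequality becomes an equality; the test function, the grid and the name <code>fenchel_gap</code> are illustrative assumptions.

```python
# Illustrative sketch: Fenchel's inequality for f(x) = x**4 / 4.


def f(x):
    return x ** 4 / 4.0


def f_star(p):
    """Convex conjugate of f: (3/4) * |p|**(4/3)."""
    return 0.75 * abs(p) ** (4.0 / 3.0)


def fenchel_gap(x, p):
    """f(x) + f*(p) - p*x, which Fenchel's inequality says is non-negative."""
    return f(x) + f_star(p) - p * x


grid = [0.25 * k for k in range(-12, 13)]  # -3.0, -2.75, ..., 3.0
smallest = min(fenchel_gap(x, p) for x in grid for p in grid)
print(smallest)                      # non-negative up to rounding
print(fenchel_gap(1.5, 1.5 ** 3))    # equality case p = f'(x): gap is about 0
```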

==See also==
* [[Dual curve]]
* [[Projective duality]]
* [[Young's inequality for products]]
* [[Convex conjugate]]
* [[Moreau's theorem]]
* [[Integration by parts]]
* [[Fenchel's duality theorem]]

==References==
{{reflist}}
* {{cite book | last1=Courant |first1=Richard |author-link1=Richard Courant |last2=Hilbert |first2=David |author-link2=David Hilbert | title=Methods of Mathematical Physics |volume=2 |year=2008 | publisher=John Wiley & Sons |isbn=978-0471504399}}
* {{cite book | last=Arnol'd | first=Vladimir Igorevich | author-link=Vladimir Igorevich Arnol'd | title=Mathematical Methods of Classical Mechanics | edition=2nd | publisher=Springer | year=1989 | isbn=0-387-96890-3 | url-access=registration | url=https://archive.org/details/mathematicalmeth0000arno }}
* Fenchel, W. (1949). "On conjugate convex functions", ''Can. J. Math'' '''1''': 73–77.
* {{cite book | last=Rockafellar |first=R. Tyrrell | author-link=R. Tyrrell Rockafellar |title=Convex Analysis |publisher=Princeton University Press |year=1996 |orig-year=1970 |isbn=0-691-01586-4}}
* {{Cite journal| last1 = Zia | first1 = R. K. P.| last2 = Redish | first2 = E. F.| last3 = McKay | first3 = S. R.| doi = 10.1119/1.3119512 | title = Making sense of the Legendre transform| journal = American Journal of Physics| volume = 77| issue = 7 | pages = 614| year = 2009| arxiv = 0806.1147| bibcode= 2009AmJPh..77..614Z| s2cid = 37549350}}

==Further reading==
*{{cite web
|url = https://www.lix.polytechnique.fr/~nielsen/Note-LegendreTransformation.pdf
|title = Legendre transformation and information geometry
|access-date = 2016-01-24
|last = Nielsen
|first = Frank
|date = 2010-09-01
}}
*{{cite web
|url = http://appliedmaths.sun.ac.za/~htouchette/archive/notes/lfth2.pdf
|title = Legendre-Fenchel transforms in a nutshell
|access-date = 2016-01-24
|last = Touchette
|first = Hugo
|date = 2005-07-27
}}
*{{cite web
|url = http://www.physics.sun.ac.za/~htouchette/archive/pnotes/convex1.pdf
|archive-url = https://web.archive.org/web/20160201051903/http://www.physics.sun.ac.za/~htouchette/archive/pnotes/convex1.pdf
|url-status = dead
|archive-date = 2016-02-01
|title = Elements of convex analysis
|access-date = 2016-01-24
|last = Touchette
|first = Hugo
|date = 2006-11-21
}}

==External links==
{{Commons category|Legendre transformation}}
*[https://web.archive.org/web/20150312152731/http://maze5.net/?page_id=733 Legendre transform with figures] at maze5.net
*[http://www.onmyphd.com/?p=legendre.fenchel.transform Legendre and Legendre-Fenchel transforms in a step-by-step explanation] at onmyphd.com

[[Category:Transforms]]
[[Category:Duality theories]]
[[Category:Concepts in physics]]
[[Category:Convex analysis]]
[[Category:Mathematical physics]]