Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
Convex function
(section)
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
==Strongly convex functions== The concept of strong convexity extends and parametrizes the notion of strict convexity. Intuitively, a strongly-convex function is a function that grows as fast as a quadratic function.<ref>{{Cite web |title=Strong convexity Β· Xingyu Zhou's blog |url=https://xingyuzhou.org/blog/notes/strong-convexity |access-date=2023-09-27 |website=xingyuzhou.org}}</ref> A strongly convex function is also strictly convex, but not vice versa. If a one-dimensional function <math>f</math> is twice continuously differentiable and the domain is the real line, then we can characterize it as follows: *<math>f</math> convex if and only if <math>f''(x) \ge 0</math> for all <math>x.</math> *<math>f</math> strictly convex if <math>f''(x) > 0</math> for all <math>x</math> (note: this is sufficient, but not necessary). *<math>f</math> strongly convex if and only if <math>f''(x) \ge m > 0</math> for all <math>x.</math> For example, let <math>f</math> be strictly convex, and suppose there is a sequence of points <math>(x_n)</math> such that <math>f''(x_n) = \tfrac{1}{n}</math>. Even though <math>f''(x_n) > 0</math>, the function is not strongly convex because <math>f''(x)</math> will become arbitrarily small. More generally, a differentiable function <math>f</math> is called strongly convex with parameter <math>m > 0</math> if the following inequality holds for all points <math>x, y</math> in its domain:<ref name="bertsekas">{{cite book|page=[https://archive.org/details/convexanalysisop00bert_476/page/n87 72]|title=Convex Analysis and Optimization|url=https://archive.org/details/convexanalysisop00bert_476|url-access=limited|author=Dimitri Bertsekas| others= Contributors: Angelia Nedic and Asuman E. Ozdaglar|publisher=Athena Scientific|year=2003|isbn=9781886529458}}</ref><math display="block">(\nabla f(x) - \nabla f(y) )^T (x-y) \ge m \|x-y\|_2^2 </math> or, more generally, <math display=block>\langle \nabla f(x) - \nabla f(y), x-y \rangle \ge m \|x-y\|^2 </math> where <math>\langle \cdot, \cdot\rangle</math> is any [[inner product]], and <math>\|\cdot\|</math> is the corresponding [[Norm (mathematics)|norm]]. Some authors, such as <ref name="ciarlet">{{cite book| title=Introduction to numerical linear algebra and optimisation|author=Philippe G. Ciarlet|publisher=Cambridge University Press |year=1989 |isbn=9780521339841}}</ref> refer to functions satisfying this inequality as [[Elliptic operator|elliptic]] functions. An equivalent condition is the following:<ref name="nesterov">{{cite book|pages=[https://archive.org/details/introductorylect00nest/page/n79 63]β64|title=Introductory Lectures on Convex Optimization: A Basic Course|url=https://archive.org/details/introductorylect00nest|url-access=limited|author=Yurii Nesterov|publisher=Kluwer Academic Publishers|year=2004|isbn=9781402075537}}</ref> <math display=block>f(y) \ge f(x) + \nabla f(x)^T (y-x) + \frac{m}{2} \|y-x\|_2^2 </math> It is not necessary for a function to be differentiable in order to be strongly convex. A third definition<ref name="nesterov"/> for a strongly convex function, with parameter <math>m,</math> is that, for all <math>x, y</math> in the domain and <math>t \in [0,1],</math> <math display=block>f(tx+(1-t)y) \le t f(x)+(1-t)f(y) - \frac{1}{2} m t(1-t) \|x-y\|_2^2</math> Notice that this definition approaches the definition for strict convexity as <math>m \to 0,</math> and is identical to the definition of a convex function when <math>m = 0.</math> Despite this, functions exist that are strictly convex but are not strongly convex for any <math>m > 0</math> (see example below). If the function <math>f</math> is twice continuously differentiable, then it is strongly convex with parameter <math>m</math> if and only if <math>\nabla^2 f(x) \succeq mI</math> for all <math>x</math> in the domain, where <math>I</math> is the identity and <math>\nabla^2f</math> is the [[Hessian matrix]], and the inequality <math>\succeq</math> means that <math>\nabla^2 f(x) - mI</math> is [[Positive-definite matrix|positive semi-definite]]. This is equivalent to requiring that the minimum [[eigenvalue]] of <math>\nabla^2 f(x)</math> be at least <math>m</math> for all <math>x.</math> If the domain is just the real line, then <math>\nabla^2 f(x)</math> is just the second derivative <math>f''(x),</math> so the condition becomes <math>f''(x) \ge m</math>. If <math>m = 0</math> then this means the Hessian is positive semidefinite (or if the domain is the real line, it means that <math>f''(x) \ge 0</math>), which implies the function is convex, and perhaps strictly convex, but not strongly convex. Assuming still that the function is twice continuously differentiable, one can show that the lower bound of <math>\nabla^2 f(x)</math> implies that it is strongly convex. Using [[Taylor's theorem|Taylor's Theorem]] there exists <math display=block>z \in \{ t x + (1-t) y : t \in [0,1] \}</math> such that <math display=block>f(y) = f(x) + \nabla f(x)^T (y-x) + \frac{1}{2} (y-x)^T \nabla^2f(z) (y-x)</math> Then <math display=block>(y-x)^T \nabla^2f(z) (y-x) \ge m (y-x)^T(y-x) </math> by the assumption about the eigenvalues, and hence we recover the second strong convexity equation above. A function <math>f</math> is strongly convex with parameter ''m'' if and only if the function <math display=block>x\mapsto f(x) -\frac m 2 \|x\|^2</math> is convex. A twice continuously differentiable function <math>f</math> on a compact domain <math>X</math> that satisfies <math>f''(x)>0</math> for all <math>x\in X</math> is strongly convex. The proof of this statement follows from the [[extreme value theorem]], which states that a continuous function on a compact set has a maximum and minimum. Strongly convex functions are in general easier to work with than convex or strictly convex functions, since they are a smaller class. Like strictly convex functions, strongly convex functions have unique minima on compact sets. === Properties of strongly-convex functions === If ''f'' is a strongly-convex function with parameter ''m'', then:<ref name=":0">{{Cite web |last=Nemirovsky and Ben-Tal |date=2023 |title=Optimization III: Convex Optimization |url=http://www2.isye.gatech.edu/~nemirovs/OPTIIILN2023Spring.pdf}}</ref>{{Rp|location=Prop.6.1.4}} * For every real number ''r'', the [[level set]] {''x'' | ''f''(''x'') β€ ''r''} is [[Compact space|compact]]. * The function ''f'' has a unique [[global minimum]] on ''R<sup>n</sup>''.
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)