=== Statistical inference ===
Consider the following stochastic programming problem
<div class="center" style="width: auto; margin-left: auto; margin-right: auto;"><math> \min\limits_{x\in X}\{ g(x) = f(x)+E[Q(x,\xi)] \} </math></div>
Here <math>X</math> is a nonempty closed subset of <math>\mathbb{R}^n</math>, <math>\xi</math> is a random vector whose probability distribution <math>P</math> is supported on a set <math>\Xi \subset \mathbb{R}^d</math>, and <math>Q: X \times \Xi \rightarrow \mathbb{R}</math>. In the framework of two-stage stochastic programming, <math>Q(x,\xi)</math> is given by the optimal value of the corresponding second-stage problem.

Assume that <math>g(x)</math> is well defined and ''finite valued'' for all <math>x\in X</math>. This implies that for every <math>x\in X</math> the value <math>Q(x,\xi)</math> is finite almost surely.

Suppose that we have a sample <math>\xi^1,\dots,\xi^N</math> of <math>N</math> realizations of the random vector <math>\xi</math>. This random sample can be viewed as historical data of <math>N</math> observations of <math>\xi</math>, or it can be generated by Monte Carlo sampling techniques. Then we can formulate a corresponding ''sample average approximation''
<div class="center" style="width: auto; margin-left: auto; margin-right: auto;"><math> \min\limits_{x\in X}\{ \hat{g}_N(x) = f(x)+\frac{1}{N} \sum_{j=1}^N Q(x,\xi^j) \} </math></div>
By the [[law of large numbers]] we have that, under some regularity conditions, <math>\frac{1}{N} \sum_{j=1}^N Q(x,\xi^j)</math> converges pointwise with probability 1 to <math>E[Q(x,\xi)]</math> as <math>N \rightarrow \infty</math>. Moreover, under mild additional conditions the convergence is uniform. We also have <math>E[\hat{g}_N(x)]=g(x)</math>, i.e., <math>\hat{g}_N(x)</math> is an ''unbiased'' estimator of <math>g(x)</math>. Therefore, it is natural to expect that the optimal value and optimal solutions of the SAA problem converge to their counterparts of the true problem as <math>N \rightarrow \infty</math>.
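As a concrete illustration, the following is a minimal sketch that builds and solves the SAA of a simple newsvendor-style two-stage problem, for which <math>Q(x,\xi)</math> is available in closed form. The cost parameters <code>c</code>, <code>b</code>, <code>h</code>, the demand distribution, and the grid-search solver are hypothetical choices made only for this example and are not part of the general theory.

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical newsvendor data: order x units at unit cost c; after demand
# xi is revealed, pay back-order cost b per unit short and holding cost h
# per unit left over, so Q(x, xi) = b*max(xi - x, 0) + h*max(x - xi, 0).
c, b, h = 1.0, 4.0, 0.5
rng = np.random.default_rng(0)

def Q(x, xi):
    """Second-stage cost for first-stage decision x and realization(s) xi."""
    return b * np.maximum(xi - x, 0.0) + h * np.maximum(x - xi, 0.0)

def saa_objective(x, sample):
    """SAA objective g_N(x) = f(x) + (1/N) * sum_j Q(x, xi_j), with f(x) = c*x."""
    return c * x + Q(x, sample).mean()

# Monte Carlo sample of N demand realizations (hypothetical distribution).
N = 10_000
sample = rng.exponential(scale=50.0, size=N)

# Solve the SAA problem by searching a grid over the feasible set X = [0, 300].
grid = np.linspace(0.0, 300.0, 3001)
values = np.array([saa_objective(x, sample) for x in grid])
x_hat = grid[values.argmin()]
print(f"SAA optimal value {values.min():.3f} at x = {x_hat:.2f}")
</syntaxhighlight>

For this particular example the true minimizer is available in closed form (the <math>(b-c)/(b+h)</math>-quantile of the demand distribution, here <math>50\ln 3 \approx 54.93</math>), so the quality of the SAA solution can be checked directly.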
====Consistency of SAA estimators====
Suppose the feasible set <math>X</math> of the SAA problem is fixed, i.e., it is independent of the sample. Let <math>\vartheta^*</math> and <math>S^*</math> be the optimal value and the set of optimal solutions, respectively, of the true problem and let <math>\hat{\vartheta}_N</math> and <math>\hat{S}_N</math> be the optimal value and the set of optimal solutions, respectively, of the SAA problem.
# Let <math>g: X \rightarrow \mathbb{R}</math> be a function and <math>\hat{g}_N: X \rightarrow \mathbb{R}</math> a sequence of (deterministic) real-valued functions. Then the following two properties are equivalent:
#* for any <math>\overline{x}\in X</math> and any sequence <math>\{x_N\}\subset X</math> converging to <math>\overline{x}</math> it follows that <math>\hat{g}_N(x_N)</math> converges to <math>g(\overline{x})</math>
#* the function <math>g(\cdot)</math> is continuous on <math>X</math> and <math>\hat{g}_N(\cdot)</math> converges to <math>g(\cdot)</math> uniformly on any compact subset of <math>X</math>
# If the objective of the SAA problem <math>\hat{g}_N(x)</math> converges to the true problem's objective <math>g(x)</math> with probability 1, as <math>N \rightarrow \infty</math>, uniformly on the feasible set <math>X</math>, then <math>\hat{\vartheta}_N</math> converges to <math>\vartheta^*</math> with probability 1 as <math>N \rightarrow \infty</math>.
# Suppose that there exists a compact set <math>C \subset \mathbb{R}^n</math> such that
#* the set <math>S^*</math> of optimal solutions of the true problem is nonempty and is contained in <math>C</math>
#* the function <math>g(x)</math> is finite valued and continuous on <math>C</math>
#* the sequence of functions <math>\hat{g}_N(x)</math> converges to <math>g(x)</math> with probability 1, as <math>N \rightarrow \infty</math>, uniformly in <math>x\in C</math>
#* for <math>N</math> large enough the set <math>\hat{S}_N</math> is nonempty and <math>\hat{S}_N \subset C</math> with probability 1
:: then <math>\hat{\vartheta}_N \rightarrow \vartheta^*</math> and <math>\mathbb{D}(S^*,\hat{S}_N)\rightarrow 0 </math> with probability 1 as <math>N\rightarrow \infty </math>. Note that <math>\mathbb{D}(A,B) </math> denotes the ''deviation of set <math>A</math> from set <math>B</math>'', defined as
<div class="center" style="width: auto; margin-left: auto; margin-right: auto;"><math> \mathbb{D}(A,B) := \sup_{x\in A} \{ \inf_{x' \in B} \|x-x'\| \} </math></div>
In some situations the feasible set <math>X</math> of the SAA problem is itself estimated; the corresponding SAA problem then takes the form
<div class="center" style="width: auto; margin-left: auto; margin-right: auto;"><math> \min_{x\in X_N} \hat{g}_N(x) </math></div>
where <math>X_N</math> is a subset of <math>\mathbb{R}^n</math> that depends on the sample and is therefore random. Nevertheless, consistency results for SAA estimators can still be derived under some additional assumptions:
# Suppose that there exists a compact set <math>C \subset \mathbb{R}^n</math> such that
#* the set <math>S^*</math> of optimal solutions of the true problem is nonempty and is contained in <math>C</math>
#* the function <math>g(x)</math> is finite valued and continuous on <math>C</math>
#* the sequence of functions <math>\hat{g}_N(x)</math> converges to <math>g(x)</math> with probability 1, as <math>N \rightarrow \infty</math>, uniformly in <math>x\in C</math>
#* for <math>N</math> large enough the set <math>\hat{S}_N</math> is nonempty and <math>\hat{S}_N \subset C</math> with probability 1
#* if <math> x_N \in X_N</math> and <math> x_N </math> converges with probability 1 to a point <math> x</math>, then <math> x \in X</math>
#* for some point <math> x \in S^*</math> there exists a sequence <math> x_N \in X_N</math> such that <math> x_N \rightarrow x</math> with probability 1.
:: then <math>\hat{\vartheta}_N \rightarrow \vartheta^*</math> and <math>\mathbb{D}(S^*,\hat{S}_N)\rightarrow 0 </math> with probability 1 as <math>N\rightarrow \infty </math>.
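Consistency can also be illustrated numerically. The following sketch reuses the hypothetical newsvendor definitions (<code>rng</code>, <code>grid</code>, <code>saa_objective</code>) from the example above and shows <math>\hat{\vartheta}_N</math> stabilizing as the sample size grows; the particular sample sizes are arbitrary choices.

<syntaxhighlight lang="python">
# Empirical illustration of consistency: the SAA optimal value and optimal
# solution settle down as N grows (each N uses a fresh Monte Carlo sample).
for N in (10, 100, 1_000, 10_000, 100_000):
    sample = rng.exponential(scale=50.0, size=N)
    values = np.array([saa_objective(x, sample) for x in grid])
    print(f"N = {N:>6}:  vartheta_N = {values.min():.3f},  "
          f"x_N = {grid[values.argmin()]:.2f}")
</syntaxhighlight>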
==== Asymptotics of the SAA optimal value ====
Suppose the sample <math>\xi^1,\dots,\xi^N</math> is i.i.d. and fix a point <math>x \in X</math>. Then the sample average estimator <math>\hat{g}_N(x)</math> of <math>g(x)</math> is unbiased and has variance <math>\frac{1}{N}\sigma^2(x)</math>, where <math>\sigma^2(x):=Var[Q(x,\xi)]</math> is assumed to be finite. Moreover, by the [[central limit theorem]] we have that
<div class="center" style="width: auto; margin-left: auto; margin-right: auto;"><math> \sqrt{N} [\hat{g}_N(x)- g(x)] \xrightarrow{\mathcal{D}} Y_x </math></div>
where <math>\xrightarrow{\mathcal{D}}</math> denotes convergence in ''distribution'' and <math>Y_x</math> has a normal distribution with mean <math>0</math> and variance <math>\sigma^2(x)</math>, written as <math>\mathcal{N}(0,\sigma^2(x))</math>.

In other words, <math>\hat{g}_N(x)</math> has an ''asymptotically normal'' distribution, i.e., for large <math>N</math>, <math>\hat{g}_N(x)</math> has approximately normal distribution with mean <math>g(x)</math> and variance <math>\frac{1}{N}\sigma^2(x)</math>. This leads to the following (approximate) <math>100(1-\alpha)</math>% confidence interval for <math>g(x)</math>:
<div class="center" style="width: auto; margin-left: auto; margin-right: auto;"> <math> \left[ \hat{g}_N(x)-z_{\alpha/2} \frac{\hat{\sigma}(x)}{\sqrt{N}}, \hat{g}_N(x)+z_{\alpha/2} \frac{\hat{\sigma}(x)}{\sqrt{N}}\right] </math></div>
where <math>z_{\alpha/2}:=\Phi^{-1}(1-\alpha/2)</math> (here <math>\Phi(\cdot)</math> denotes the cdf of the standard normal distribution) and
<div class="center" style="width: auto; margin-left: auto; margin-right: auto;"><math> \hat{\sigma}^2(x) := \frac{1}{N-1}\sum_{j=1}^{N} \left[ Q(x,\xi^j)-\frac{1}{N} \sum_{i=1}^N Q(x,\xi^i) \right]^2 </math></div>
is the sample variance estimate of <math>\sigma^2(x)</math>. That is, the error of estimation of <math>g(x)</math> is (stochastically) of order <math>O(1/\sqrt{N})</math>.
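A minimal sketch of this confidence interval computation, again in terms of the hypothetical newsvendor setup (reusing <code>c</code>, <code>Q</code>, and <code>rng</code> from the first example); since <math>f(x)=cx</math> is deterministic, all of the variance comes from the second-stage cost <math>Q(x,\xi)</math>:

<syntaxhighlight lang="python">
from scipy.stats import norm

# Approximate 100(1 - alpha)% confidence interval for g(x) at a fixed x.
def saa_confidence_interval(x, sample, alpha=0.05):
    costs = Q(x, sample)                    # Q(x, xi_j) for each scenario
    g_hat = c * x + costs.mean()            # point estimate g_N(x)
    sigma_hat = costs.std(ddof=1)           # sample std dev of Q(x, xi)
    z = norm.ppf(1.0 - alpha / 2.0)         # standard normal quantile z_{alpha/2}
    half_width = z * sigma_hat / np.sqrt(len(sample))
    return g_hat - half_width, g_hat + half_width

sample = rng.exponential(scale=50.0, size=10_000)
lo, hi = saa_confidence_interval(55.0, sample)
print(f"95% CI for g(55): [{lo:.3f}, {hi:.3f}]")
</syntaxhighlight>

Halving the width of such an interval requires roughly four times as many scenarios, reflecting the <math>O(1/\sqrt{N})</math> rate above.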