
[Notes] Convex Optimization 2015.11.25

2015-12-19 01:15
$$\lVert y \rVert_* = \sup \{ x^T y : \lVert x \rVert \le 1 \} \implies x^T y \le \lVert x \rVert \cdot \lVert y \rVert_*$$

(because $\frac{x^T}{\lVert x \rVert} y \le \lVert y \rVert_*$ for $x \neq 0$)

Want an inequality of the type $x^T y \le f(x) + \text{“}f^*(y)\text{”}$ for “general” $f$ (Fenchel’s inequality).

Definition: For $f : \mathbb{R}^n \to \mathbb{R}$, the conjugate $f^*$ of $f$ is defined by

$$f^*(y) = \sup_x \left( x^T y - f(x) \right)$$

with $\operatorname{dom} f^* =$ the set of $y$ for which the $\sup$ is $< \infty$.

Example:

$f(x) = a^T x + b \quad (x \in \mathbb{R}^n)$

$$f^*(y) = \sup_x \left( x^T y - a^T x - b \right) =
\begin{cases}
\infty & \text{if } y \neq a \\
-b & \text{if } y = a
\end{cases}$$

$f(x) = -\log x \quad (x > 0)$

$(xy + \log x)' = y + \frac{1}{x} = 0 \implies x = -\frac{1}{y}$

$$f^*(y) = \sup_{x > 0} \left( x^T y + \log x \right) =
\begin{cases}
\infty & \text{if } y \ge 0 \\
-\log(-y) - 1 & \text{if } y < 0
\end{cases}$$

$f(x) = e^x \quad (x \in \mathbb{R})$

$(xy - e^x)' = y - e^x = 0 \implies x = \log y$

$$f^*(y) = \sup_x \left( x^T y - e^x \right) =
\begin{cases}
\infty & \text{if } y < 0 \\
y \log y - y & \text{if } y \ge 0
\end{cases}$$

$f(x) = x \log x \quad (x \ge 0)$

$(xy - x \log x)' = y - \log x - 1 = 0 \implies x = e^{y - 1}$

$$f^*(y) = \sup_{x \ge 0} \left( x^T y - x \log x \right) = y e^{y - 1} - (y - 1) e^{y - 1} = e^{y - 1}$$
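The three scalar conjugates above can be sanity-checked numerically. The following is a minimal sketch, not part of the original notes: it replaces the supremum by a maximum over a finite grid (the grid ranges, the step size, the test values of $y$, and the helper names `conjugate_on_grid` and `xlogx` are all assumptions made for illustration) and compares the result with the closed forms.

```python
import math

def conjugate_on_grid(f, y, xs):
    """Approximate f*(y) = sup_x (x*y - f(x)) by a maximum over the grid xs."""
    return max(x * y - f(x) for x in xs)

def xlogx(x):
    # Convention used for the domain x >= 0: 0 * log 0 = 0
    return 0.0 if x == 0 else x * math.log(x)

# Hypothetical grids with step 0.001 (chosen only for this sketch)
pos_grid = [i / 1000.0 for i in range(1, 20001)]         # (0, 20]
nonneg_grid = [0.0] + pos_grid                            # [0, 20]
real_grid = [i / 1000.0 for i in range(-20000, 20001)]    # [-20, 20]

# name -> (f, grid for x, closed-form conjugate from the notes, test values of y)
cases = {
    "-log x": (lambda x: -math.log(x), pos_grid,
               lambda y: -math.log(-y) - 1.0, [-2.0, -1.0, -0.5]),
    "e^x": (math.exp, real_grid,
            lambda y: y * math.log(y) - y, [0.5, 1.0, 3.0]),
    "x log x": (xlogx, nonneg_grid,
                lambda y: math.exp(y - 1.0), [0.0, 1.0, 2.0]),
}

# Largest absolute difference between grid maximum and closed form, per example
errors = {}
for name, (f, xs, closed_form, ys) in cases.items():
    errors[name] = max(abs(conjugate_on_grid(f, y, xs) - closed_form(y)) for y in ys)
    print(name, errors[name])
```

A grid maximum can only underestimate the true supremum, so a small error here is evidence for the closed forms rather than a proof, and the check says nothing about the values of $y$ where $f^*(y) = \infty$.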

$f(x) = \frac{1}{2} x^T Q x$ with $Q \in S_{++}^n$

$$f^*(y) = \sup_x \left( x^T y - \frac{1}{2} x^T Q x \right) = y^T Q^{-1} y - \frac{1}{2} y^T Q^{-1} y = \frac{1}{2} y^T Q^{-1} y$$

(Recall: for $A \succ 0$, $\inf_x \left( x^T A x + x^T b \right)$ is attained at $x = -\frac{1}{2} A^{-1} b$.)

So the supremum is attained at $x = Q^{-1} y$

$\implies x^T y \le \frac{1}{2} x^T Q x + \frac{1}{2} y^T Q^{-1} y$, for all $Q \succ 0$
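As a quick numerical illustration of this inequality, here is a minimal sketch in plain Python for a $2 \times 2$ case. The matrix `Q`, the sampling range, and all helper names are assumptions chosen for the example. It checks that the gap $\frac{1}{2} x^T Q x + \frac{1}{2} y^T Q^{-1} y - x^T y$ is nonnegative on random pairs and vanishes at $x = Q^{-1} y$.

```python
import random

# Hypothetical symmetric positive definite matrix (det = 1.75 > 0, Q[0][0] > 0)
Q = [[2.0, 0.5], [0.5, 1.0]]

def inv2(M):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

Qinv = inv2(Q)

def f(x):
    return 0.5 * dot(x, matvec(Q, x))         # f(x) = 1/2 x^T Q x

def f_conj(y):
    return 0.5 * dot(y, matvec(Qinv, y))      # f*(y) = 1/2 y^T Q^{-1} y

random.seed(0)
gaps = []
for _ in range(1000):
    x = [random.uniform(-5, 5), random.uniform(-5, 5)]
    y = [random.uniform(-5, 5), random.uniform(-5, 5)]
    gaps.append(f(x) + f_conj(y) - dot(x, y))  # should be >= 0
min_gap = min(gaps)

# Equality case: x = Q^{-1} y
y = [1.0, -2.0]
x_star = matvec(Qinv, y)
tight_gap = f(x_star) + f_conj(y) - dot(x_star, y)
print(min_gap, tight_gap)
```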

$f(x) = \log \left( \sum_{i = 1}^n e^{x_i} \right)$

$$f^*(y) = \sup_x \left( x^T y - \log \left( \sum_{i = 1}^n e^{x_i} \right) \right)$$

Setting the partial derivative with respect to each $x_i$ to zero:

$$\frac{\partial}{\partial x_i} \left( x^T y - \log \left( \sum_{j = 1}^n e^{x_j} \right) \right) = y_i - \frac{e^{x_i}}{\sum_{j = 1}^n e^{x_j}} = 0$$

$\implies y_i = \frac{e^{x_i}}{\sum_{j = 1}^n e^{x_j}}$, so $y \succeq 0$ and $1^T y = 1$

Assume for simplicity that $y \succ 0$.

Put $x_i = \log (y_i)$; then $\sum_i e^{x_i} = 1^T y = 1$ and the optimality conditions hold.

Then $f^*(y) = \sum_{i = 1}^n y_i \log (y_i) - \log (1^T y) = \sum_{i = 1}^n y_i \log (y_i)$
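The log-sum-exp result can be checked numerically for one probability vector. This is a minimal sketch under assumptions: the vector `y`, the random search range, and the helper names `lse` and `objective` are illustrative choices, not part of the notes. It evaluates $x^T y - f(x)$ at $x_i = \log y_i$ and confirms that randomly sampled $x$ never exceed the negative entropy $\sum_i y_i \log y_i$.

```python
import math
import random

def lse(x):
    """log(sum_i exp(x_i)), computed stably by shifting by the maximum."""
    m = max(x)
    return m + math.log(sum(math.exp(xi - m) for xi in x))

def objective(x, y):
    """The quantity x^T y - f(x) whose supremum over x is f*(y)."""
    return sum(xi * yi for xi, yi in zip(x, y)) - lse(x)

# Hypothetical y with y > 0 and 1^T y = 1
y = [0.2, 0.3, 0.5]
neg_entropy = sum(yi * math.log(yi) for yi in y)

# Candidate maximizer from the notes: x_i = log(y_i)
x_star = [math.log(yi) for yi in y]
value_at_x_star = objective(x_star, y)

# Random search should never beat the claimed supremum
random.seed(1)
best_random = max(
    objective([random.uniform(-5, 5) for _ in y], y) for _ in range(5000)
)
print(value_at_x_star, neg_entropy, best_random)
```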

$f(x) = \lVert x \rVert$

$$f^*(y) = \sup_x \left( x^T y - \lVert x \rVert \right) =
\begin{cases}
0 & \text{if } \lVert y \rVert_* \le 1 \\
\infty & \text{if } \lVert y \rVert_* > 1
\end{cases}$$

$x^T y - \lVert x \rVert \le \lVert x \rVert \cdot \lVert y \rVert_* - \lVert x \rVert = \lVert x \rVert (\lVert y \rVert_* - 1) \le 0$ if $\lVert y \rVert_* - 1 \le 0$, with equality at $x = 0$. If $\lVert y \rVert_* > 1$, pick $x$ with $\lVert x \rVert \le 1$ and $x^T y > 1$; then $(tx)^T y - \lVert tx \rVert \to \infty$ as $t \to \infty$.

$f(x) = \frac{1}{2} \lVert x \rVert^2$

$$f^*(y) = \sup_x \left( x^T y - \frac{1}{2} \lVert x \rVert^2 \right) = \frac{1}{2} \lVert y \rVert_*^2$$

$x^T y - \frac{1}{2} \lVert x \rVert^2 \le \lVert x \rVert \cdot \lVert y \rVert_* - \frac{1}{2} \lVert x \rVert^2 \le \frac{1}{2} \lVert y \rVert_*^2$ (the middle expression is maximized over $\lVert x \rVert$ at $\lVert x \rVert = \lVert y \rVert_*$)

$\implies x^T y \le \frac{1}{2} \lVert x \rVert^2 + \frac{1}{2} \lVert y \rVert_*^2$
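The two norm examples can be illustrated with a concrete dual pair. The sketch below assumes the $\ell_1$ norm as $\lVert \cdot \rVert$, whose dual norm is the $\ell_\infty$ norm. This pair, the dimension, the sample ranges, and the helper names are choices made for the example, not something the notes fix. It checks $x^T y \le \lVert x \rVert \lVert y \rVert_*$ and $x^T y \le \frac{1}{2} \lVert x \rVert^2 + \frac{1}{2} \lVert y \rVert_*^2$ on random vectors, then evaluates $x^T y - \lVert x \rVert$ along a ray to show the two cases of the conjugate of the norm.

```python
import random

def norm1(x):
    return sum(abs(v) for v in x)

def norm_inf(y):
    return max(abs(v) for v in y)

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

random.seed(2)
n = 4
violations = 0
for _ in range(2000):
    x = [random.uniform(-3, 3) for _ in range(n)]
    y = [random.uniform(-3, 3) for _ in range(n)]
    dual_norm_ok = dot(x, y) <= norm1(x) * norm_inf(y) + 1e-12
    fenchel_ok = dot(x, y) <= 0.5 * norm1(x) ** 2 + 0.5 * norm_inf(y) ** 2 + 1e-12
    if not (dual_norm_ok and fenchel_ok):
        violations += 1

def ray_value(y, t):
    """x^T y - ||x||_1 for x = t * sign(y_k) * e_k, with k the largest |y_k|.

    Along this ray the value equals t * (||y||_inf - 1).
    """
    k = max(range(len(y)), key=lambda i: abs(y[i]))
    x = [0.0] * len(y)
    x[k] = t if y[k] >= 0 else -t
    return dot(x, y) - norm1(x)

inside = [0.5, -0.9, 0.1, 0.0]    # dual norm 0.9 <= 1: values stay <= 0
outside = [0.5, -1.5, 0.1, 0.0]   # dual norm 1.5 > 1: values grow without bound
print(violations, ray_value(inside, 10.0), ray_value(outside, 10.0), ray_value(outside, 1000.0))
```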

Proof of general hyperplane separation:

Let $C \subseteq \mathbb{R}^n$ be a convex set and let $H \subseteq \mathbb{R}^n$ be the affine subspace of smallest dimension containing $C$. We write $C_{\varepsilon} = \{ x : B_{\varepsilon}(x) \cap H \subseteq C \}$

Then $C_{\varepsilon} \subseteq \operatorname{relint}(C) = \bigcup_{\varepsilon > 0} C_{\varepsilon}$. (relint: relative interior)

($C \subseteq \overline{\operatorname{relint}(C)}$, i.e., $C$ is a subset of the closure of $\operatorname{relint}(C)$.)

Let $C, D$ be disjoint convex sets. Then for every $\varepsilon > 0$ the sets $A_{\varepsilon} = \overline{C_{\varepsilon}} \cap B_{\frac{1}{\varepsilon}}(0)$ and $\overline{D}$ are closed disjoint convex sets with $\overline{C_{\varepsilon}} \cap B_{\frac{1}{\varepsilon}}(0)$ bounded, and $\operatorname{dist}(A_{\varepsilon}, \overline{D}) \ge \varepsilon > 0$.

So $\exists\, a_{\varepsilon} \in \mathbb{R}^n$, $a_{\varepsilon} \neq 0$, $b_{\varepsilon} \in \mathbb{R}$ s.t. $(a_{\varepsilon}, b_{\varepsilon})$ define a separating hyperplane for $A_{\varepsilon}, \overline{D}$:

$a_{\varepsilon}^T x \le b_{\varepsilon} \; \forall x \in A_{\varepsilon}$, $\quad a_{\varepsilon}^T x \ge b_{\varepsilon} \; \forall x \in \overline{D}$

WLOG $\lVert a_{\varepsilon} \rVert = 1$

The sequence $(a_{\frac{1}{n}})_{n = 1}^{\infty}$ is a sequence of unit vectors and so has a convergent subsequence, say WLOG convergent to $a_0 \in \mathbb{R}^n$.

We can assume the sequence $b_{\frac{1}{n}}$ is bounded (or else one of the sets $C, D$ is empty)

and so, passing to a further subsequence, it also converges to some value $b_0 \in \mathbb{R}$.

Want to show that $(a_0, b_0)$ is a separating hyperplane for $C, D$, i.e., that

$$a_0^T x \le b_0 \; \forall x \in C, \quad a_0^T x \ge b_0 \; \forall x \in D$$

(The argument above assumes $C$ is not a single point. If $C$ is a point but $D$ is not, swap the roles of $C$ and $D$.

If both $C$ and $D$ are single points, the claim is obviously true.)
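To make the separating inequalities concrete, here is a minimal sketch for two disjoint closed disks in $\mathbb{R}^2$. It does not reproduce the limiting argument of the proof. Instead it swaps in the simpler closest-point construction available for this special case: the normal $a$ points from one center to the other and $b$ lies midway between the two disks along $a$. The disks, the sampling scheme, and all names are assumptions for illustration.

```python
import math
import random

# Hypothetical disjoint closed disks: C = disk(c1, r1), D = disk(c2, r2)
c1, r1 = (0.0, 0.0), 1.0
c2, r2 = (4.0, 3.0), 2.0

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

# Unit normal pointing from C toward D (the centers are 5 apart, r1 + r2 = 3 < 5)
d = math.hypot(c2[0] - c1[0], c2[1] - c1[1])
a = ((c2[0] - c1[0]) / d, (c2[1] - c1[1]) / d)

# a^T x ranges over [a.c1 - r1, a.c1 + r1] on C and [a.c2 - r2, a.c2 + r2] on D;
# put the offset b midway between the top of the first and the bottom of the second.
b = 0.5 * ((dot(a, c1) + r1) + (dot(a, c2) - r2))

def sample_disk(c, r):
    """Uniform random point in the closed disk with center c and radius r."""
    angle = random.uniform(0.0, 2.0 * math.pi)
    radius = r * math.sqrt(random.random())
    return (c[0] + radius * math.cos(angle), c[1] + radius * math.sin(angle))

random.seed(4)
max_on_C = max(dot(a, sample_disk(c1, r1)) for _ in range(5000))
min_on_D = min(dot(a, sample_disk(c2, r2)) for _ in range(5000))
print(b, max_on_C, min_on_D)   # expect max_on_C <= b <= min_on_D
```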

Log-convexity and log-concavity

- Definition: $f : \mathbb{R}^n \to \mathbb{R}_{> 0}$ is log-convex (log-concave) if $\log (f)$ is convex (concave).

- Convexity:

$$\log (f(\theta x + (1 - \theta) y)) \le \theta \log (f(x)) + (1 - \theta) \log (f(y)) = \log \left( f(x)^{\theta} f(y)^{1 - \theta} \right)$$

$$\iff f(\theta x + (1 - \theta) y) \le f(x)^{\theta} f(y)^{1 - \theta}$$

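- Remark 2: log-convex $\implies$ convex, since $f(x) = e^{\log f(x)}$ is the composition of the convex, nondecreasing function $\exp$ with the convex function $\log f$. QED

For positive $f$: concave $\implies$ log-concave, since $\log$ is concave and nondecreasing.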
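The multiplicative inequality and the remark can be spot-checked numerically. The sketch below uses three example functions that are assumptions for illustration, not taken from the notes: $e^{x^2}$ (log-convex), $e^{-x^2}$ (log-concave), and $\sqrt{x}$ on $x > 0$ (concave and positive, hence log-concave). The helper name `geo_gap` is also hypothetical.

```python
import math
import random

def geo_gap(f, x, y, theta):
    """f(x)^theta * f(y)^(1-theta) - f(theta*x + (1-theta)*y).

    Nonnegative for log-convex f, nonpositive for log-concave f.
    """
    return f(x) ** theta * f(y) ** (1 - theta) - f(theta * x + (1 - theta) * y)

def f_log_convex(x):
    return math.exp(x * x)     # log f = x^2 is convex

def f_log_concave(x):
    return math.exp(-x * x)    # log f = -x^2 is concave

random.seed(3)
min_gap_log_convex = float("inf")
max_gap_log_concave = float("-inf")
max_gap_sqrt = float("-inf")
min_convexity_gap = float("inf")
for _ in range(5000):
    x, y = random.uniform(-2, 2), random.uniform(-2, 2)
    theta = random.random()
    min_gap_log_convex = min(min_gap_log_convex, geo_gap(f_log_convex, x, y, theta))
    max_gap_log_concave = max(max_gap_log_concave, geo_gap(f_log_concave, x, y, theta))
    # Remark: log-convex implies convex (ordinary arithmetic-mean inequality)
    z = theta * x + (1 - theta) * y
    arith = theta * f_log_convex(x) + (1 - theta) * f_log_convex(y)
    min_convexity_gap = min(min_convexity_gap, arith - f_log_convex(z))
    # Concave positive function: sqrt on (0.1, 5)
    u, v = random.uniform(0.1, 5), random.uniform(0.1, 5)
    max_gap_sqrt = max(max_gap_sqrt, geo_gap(math.sqrt, u, v, theta))

print(min_gap_log_convex, max_gap_log_concave, max_gap_sqrt, min_convexity_gap)
```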