[Notes] Convex Optimization 2015.11.25
2015-12-19 01:15
$\lVert y \rVert_* = \sup \{ x^T y : \lVert x \rVert \le 1 \} \implies x^T y \le \lVert x \rVert \cdot \lVert y \rVert_*$
(because $\frac{x^T}{\lVert x \rVert} y \le \lVert y \rVert_*$ for $x \neq 0$)
Want an inequality of the type $x^T y \le f(x) +$ "$f^*(y)$" for "general" $f$ (Fenchel's inequality).
Definition: For $f : \mathbb{R}^n \to \mathbb{R}$, the conjugate $f^*$ of $f$ is defined by $f^*(y) = \sup_x \left( x^T y - f(x) \right)$,
with $\operatorname{dom} f^* =$ the set of $y$ for which the $\sup$ is $< \infty$.
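The definition can be explored numerically in one dimension by replacing the supremum with a maximum over a finite grid. This is only a rough sketch: the helper name `conjugate_on_grid`, the grid bounds, and the test function $f(x) = \frac{1}{2} x^2$ are assumptions chosen for illustration, and a grid maximum can only lower-bound the true supremum.

```python
def conjugate_on_grid(f, y, xs):
    """Approximate f*(y) = sup_x (x*y - f(x)) by a max over the grid xs."""
    return max(x * y - f(x) for x in xs)

# Hypothetical grid on [-5, 5] with step 0.001.
xs = [i / 1000.0 for i in range(-5000, 5001)]

def f(x):
    return 0.5 * x * x              # conjugate is 0.5 * y**2 (quadratic case, Q = 1)

y = 1.5
approx = conjugate_on_grid(f, y, xs)
exact = 0.5 * y ** 2

# Fenchel's inequality x*y <= f(x) + f*(y), checked at every grid point.
fenchel_ok = all(x * y <= f(x) + approx + 1e-12 for x in xs)
```

Because the maximizer $x = 1.5$ lies on the grid, `approx` and `exact` should agree here; for a maximizer off the grid, the approximation would fall slightly below the true value.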
Examples:
$f(x) = a^T x + b$ $(x \in \mathbb{R}^n)$
$$f^*(y) = \sup_x \left( x^T y - a^T x - b \right) =
\begin{cases}
\infty & \text{if } y \neq a \\
-b & \text{if } y = a
\end{cases}$$
$f(x) = -\log x$ $(x > 0)$
$(xy + \log x)' = y + \frac{1}{x} = 0 \implies x = -\frac{1}{y}$
$$f^*(y) = \sup_{x > 0} \left( xy + \log x \right) =
\begin{cases}
\infty & \text{if } y \ge 0 \\
-\log(-y) - 1 & \text{if } y < 0
\end{cases}$$
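A quick numerical sanity check of this example, as a sketch under assumed values ($y = -2$ for the finite case and $y = 0.5$ for the unbounded case; the function name `objective` is made up for illustration):

```python
import math

def objective(x, y):
    """The quantity x*y + log(x) maximised in the conjugate of -log(x)."""
    return x * y + math.log(x)

y = -2.0
x_star = -1.0 / y                      # stationary point from the derivation
closed_form = -math.log(-y) - 1.0      # claimed value of f*(y) for y < 0

# The stationary point should attain the closed form ...
attained = objective(x_star, y)
# ... and no grid point x in (0, 20] should beat it.
grid_max = max(objective(i / 1000.0, y) for i in range(1, 20001))

# For y >= 0 the objective keeps growing as x increases.
unbounded_samples = [objective(10.0 ** k, 0.5) for k in range(1, 6)]
```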
$f(x) = e^x$ $(x \in \mathbb{R})$
$(xy - e^x)' = y - e^x = 0 \implies x = \log y$
$$f^*(y) = \sup_x \left( xy - e^x \right) =
\begin{cases}
\infty & \text{if } y < 0 \\
y \log y - y & \text{if } y \ge 0
\end{cases}$$
(with the convention $0 \log 0 = 0$, so that $f^*(0) = 0$)
$f(x) = x \log x$ $(x \ge 0)$
$(xy - x \log x)' = y - \log x - 1 = 0 \implies x = e^{y - 1}$
$f^*(y) = \sup_{x \ge 0} \left( xy - x \log x \right) = y e^{y - 1} - (y - 1) e^{y - 1} = e^{y - 1}$
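The closed forms for $e^x$ and $x \log x$ can be compared against the value at the stationary point and against a coarse grid search. This is a sketch only: the helper names, the choice $y = 2$, and the grid ranges are assumptions, and the grid search gives a lower bound on each supremum.

```python
import math

def conj_exp(y):
    """Closed form from the notes for f(x) = exp(x), used here for y > 0."""
    return y * math.log(y) - y

def conj_xlogx(y):
    """Closed form from the notes for f(x) = x*log(x)."""
    return math.exp(y - 1.0)

y = 2.0

# f(x) = exp(x): stationary point x = log(y).
x1 = math.log(y)
val_exp = x1 * y - math.exp(x1)

# f(x) = x*log(x): stationary point x = exp(y - 1).
x2 = math.exp(y - 1.0)
val_xlogx = x2 * y - x2 * math.log(x2)

# Coarse grid searches with step 0.01.
grid_exp = max(x * y - math.exp(x)
               for x in (i / 100.0 for i in range(-1000, 1001)))
grid_xlogx = max(x * y - x * math.log(x)
                 for x in (i / 100.0 for i in range(1, 2001)))
```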
$f(x) = \frac{1}{2} x^T Q x$ with $Q \in S_{++}^n$
$f^*(y) = \sup_x \left( x^T y - \frac{1}{2} x^T Q x \right) = y^T Q^{-1} y - \frac{1}{2} y^T Q^{-1} y = \frac{1}{2} y^T Q^{-1} y$
(recall: $\inf_x \left( x^T A x + x^T b \right)$ is attained at $x = -\frac{1}{2} A^{-1} b$)
So the maximizer is $x = Q^{-1} y$
$\implies x^T y \le \frac{1}{2} x^T Q x + \frac{1}{2} y^T Q^{-1} y$ for all $Q \succ 0$
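The resulting inequality is easy to test on random vectors. The sketch below assumes a hypothetical $2 \times 2$ positive definite matrix `Q` and uses hand-written helpers (`dot`, `matvec`) rather than a linear algebra library; it also checks that the gap vanishes at $x = Q^{-1} y$.

```python
import random

# Hypothetical 2x2 symmetric positive definite matrix and its inverse.
Q = [[2.0, 0.5], [0.5, 1.0]]
det = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]
Q_inv = [[Q[1][1] / det, -Q[0][1] / det],
         [-Q[1][0] / det, Q[0][0] / det]]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def matvec(M, v):
    return [dot(M[0], v), dot(M[1], v)]

def f(x):
    return 0.5 * dot(x, matvec(Q, x))

def f_conj(y):
    return 0.5 * dot(y, matvec(Q_inv, y))

random.seed(0)
gaps = []
for _ in range(1000):
    x = [random.uniform(-3, 3), random.uniform(-3, 3)]
    y = [random.uniform(-3, 3), random.uniform(-3, 3)]
    gaps.append(f(x) + f_conj(y) - dot(x, y))   # Fenchel gap, should be >= 0

# Equality case: x = Q^{-1} y.
y0 = [1.0, -2.0]
x0 = matvec(Q_inv, y0)
equality_gap = f(x0) + f_conj(y0) - dot(x0, y0)
```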
$f(x) = \log \left( \sum_{i=1}^n e^{x_i} \right)$
$f^*(y) = \sup_x \left( x^T y - \log \left( \sum_{i=1}^n e^{x_i} \right) \right)$
$\frac{\partial}{\partial x_i} \left( x^T y - \log \left( \sum_{j=1}^n e^{x_j} \right) \right) = y_i - \frac{e^{x_i}}{\sum_{j=1}^n e^{x_j}} = 0$
$\implies y_i = \frac{e^{x_i}}{\sum_{j=1}^n e^{x_j}}$, so $y \succeq 0$, $\mathbf{1}^T y = 1$
Assume for simplicity $y \succ 0$.
Put $x_i = \log(y_i)$; then $\sum_{i=1}^n e^{x_i} = \mathbf{1}^T y = 1$ and the optimality conditions hold.
Then $f^*(y) = \sum_{i=1}^n y_i \log(y_i) - \log(\mathbf{1}^T y) = \sum_{i=1}^n y_i \log(y_i)$
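As a sketch of this computation, the objective $x^T y - \log \sum_i e^{x_i}$ can be evaluated at $x_i = \log y_i$ and at random points. The probability vector `y` and the sampling range are assumptions chosen for illustration.

```python
import math
import random

def log_sum_exp(x):
    m = max(x)                       # shift by the max for numerical stability
    return m + math.log(sum(math.exp(xi - m) for xi in x))

def neg_entropy(y):
    return sum(yi * math.log(yi) for yi in y)

y = [0.2, 0.3, 0.5]                  # hypothetical y with y > 0 and sum 1
x_star = [math.log(yi) for yi in y]  # candidate maximizer from the notes

def objective(x):
    return sum(xi * yi for xi, yi in zip(x, y)) - log_sum_exp(x)

at_optimum = objective(x_star)

# Random points should never exceed the value at the candidate maximizer.
random.seed(1)
others = [objective([random.uniform(-5, 5) for _ in y]) for _ in range(2000)]
```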
$f(x) = \lVert x \rVert$
$$f^*(y) = \sup_x \left( x^T y - \lVert x \rVert \right) =
\begin{cases}
0 & \text{if } \lVert y \rVert_* \le 1 \\
\infty & \text{if } \lVert y \rVert_* > 1
\end{cases}$$
$x^T y - \lVert x \rVert \le \lVert x \rVert \cdot \lVert y \rVert_* - \lVert x \rVert = \lVert x \rVert (\lVert y \rVert_* - 1) \le 0$ if $\lVert y \rVert_* - 1 \le 0$
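Both cases can be illustrated with a concrete norm pair. The sketch below assumes the $\ell_1$ norm, whose dual norm is the $\ell_\infty$ norm; the vectors `y_in` and `y_out` are made-up examples.

```python
import random

def norm1(x):
    return sum(abs(xi) for xi in x)

def norm_inf(y):
    return max(abs(yi) for yi in y)      # dual norm of the l1 norm

def objective(x, y):
    return sum(xi * yi for xi, yi in zip(x, y)) - norm1(x)

random.seed(2)

# Case 1: dual norm of y is at most 1, so the objective never exceeds 0
# (and the value 0 is attained at x = 0).
y_in = [0.9, -0.4, 1.0]
vals_in = [objective([random.uniform(-10, 10) for _ in y_in], y_in)
           for _ in range(2000)]

# Case 2: dual norm above 1; scaling x along the largest coordinate of y
# drives the objective to infinity.
y_out = [1.5, 0.0, 0.2]
vals_out = [objective([t, 0.0, 0.0], y_out) for t in (1.0, 10.0, 100.0, 1000.0)]
```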
$f(x) = \frac{1}{2} \lVert x \rVert^2$
$f^*(y) = \sup_x \left( x^T y - \frac{1}{2} \lVert x \rVert^2 \right) = \frac{1}{2} \lVert y \rVert_*^2$
$x^T y - \frac{1}{2} \lVert x \rVert^2 \le \lVert x \rVert \cdot \lVert y \rVert_* - \frac{1}{2} \lVert x \rVert^2 \le \frac{1}{2} \lVert y \rVert_*^2$ (the last bound is tight when $\lVert x \rVert = \lVert y \rVert_*$)
$\implies x^T y \le \frac{1}{2} \lVert x \rVert^2 + \frac{1}{2} \lVert y \rVert_*^2$
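A small check of this inequality, again assuming the $\ell_1$ / $\ell_\infty$ pair as the norm and its dual. The vectors `x0`, `y0` are a hand-picked example where the bound is tight.

```python
import random

def norm1(x):
    return sum(abs(xi) for xi in x)

def norm_inf(y):
    return max(abs(yi) for yi in y)

def gap(x, y):
    """0.5*||x||_1^2 + 0.5*||y||_inf^2 - x^T y, which should be nonnegative."""
    inner = sum(xi * yi for xi, yi in zip(x, y))
    return 0.5 * norm1(x) ** 2 + 0.5 * norm_inf(y) ** 2 - inner

random.seed(3)
gaps = [gap([random.uniform(-4, 4) for _ in range(3)],
            [random.uniform(-4, 4) for _ in range(3)])
        for _ in range(2000)]

# Tight case: all of x sits on the largest coordinate of y, matching its
# sign, with ||x||_1 = ||y||_inf.
y0 = [0.5, -2.0, 1.0]
x0 = [0.0, -2.0, 0.0]
tight_gap = gap(x0, y0)
```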
Proof of general hyperplane separation:
Let $C \subseteq \mathbb{R}^n$ be a convex set and let $H \subseteq \mathbb{R}^n$ be the affine subspace of smallest dimension containing $C$. We write $C_{\varepsilon} = \{ x : B_{\varepsilon}(x) \cap H \subseteq C \}$;
then $C_{\varepsilon} \subseteq \operatorname{relint}(C) = \bigcup_{\varepsilon > 0} C_{\varepsilon}$ (relint: relative interior).
($C \subseteq \overline{\operatorname{relint}(C)}$, i.e., $C$ is a subset of the closure of $\operatorname{relint}(C)$.)
Let $C, D$ be disjoint convex sets. Then for every $\varepsilon > 0$ the sets $A_{\varepsilon} = \overline{C_{\varepsilon}} \cap B_{1/\varepsilon}(0)$ and $\overline{D}$ are closed, disjoint, convex sets with $A_{\varepsilon}$ bounded, and $\operatorname{dist}(A_{\varepsilon}, \overline{D}) \ge \varepsilon > 0$.
So $\exists\, a_{\varepsilon} \in \mathbb{R}^n$, $a_{\varepsilon} \neq 0$, $b_{\varepsilon} \in \mathbb{R}$ s.t. $(a_{\varepsilon}, b_{\varepsilon})$ defines a separating hyperplane for $A_{\varepsilon}, \overline{D}$:
$a_{\varepsilon}^T x \le b_{\varepsilon} \; \forall x \in A_{\varepsilon}$, $a_{\varepsilon}^T x \ge b_{\varepsilon} \; \forall x \in \overline{D}$.
WLOG $\lVert a_{\varepsilon} \rVert = 1$.
The sequence $(a_{1/n})_{n=1}^{\infty}$ is a sequence of unit vectors and so has a convergent subsequence, say WLOG convergent to $a_0 \in \mathbb{R}^n$.
We can assume the sequence $b_{1/n}$ is bounded (or else one of the sets $C, D$ is empty)
and so, after passing to a further subsequence, also convergent to some value $b_0 \in \mathbb{R}$.
Want to show that $(a_0, b_0)$ is a separating hyperplane for $C, D$, i.e., that
$a_0^T x \le b_0 \; \forall x \in C$, $a_0^T x \ge b_0 \; \forall x \in D$.
(Assume $C$ is not a point; then the proof goes as above. Otherwise assume $D$ is not a point and switch $C, D$.
If $C, D$ are both points, the claim is obviously true.)
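To make the conclusion concrete, here is a sketch for a very simple special case, two disjoint discs in the plane, where a separating hyperplane $(a, b)$ can be written down directly. It does not follow the limiting argument of the proof; the disc centers, radii, and the helper `sample_disc` are all assumptions for illustration.

```python
import math
import random

# Hypothetical data: two disjoint closed discs in the plane.
c_center, c_radius = (0.0, 0.0), 1.0
d_center, d_radius = (3.0, 1.0), 1.5

# Unit normal pointing from the center of C towards the center of D.
dx, dy = d_center[0] - c_center[0], d_center[1] - c_center[1]
dist = math.hypot(dx, dy)
a = (dx / dist, dy / dist)

# Offset halfway between the extreme values of a^T x on the two discs.
sup_c = a[0] * c_center[0] + a[1] * c_center[1] + c_radius   # max of a^T x over C
inf_d = a[0] * d_center[0] + a[1] * d_center[1] - d_radius   # min of a^T x over D
b = 0.5 * (sup_c + inf_d)

def sample_disc(center, radius):
    """Draw a random point from the disc with the given center and radius."""
    r = radius * math.sqrt(random.random())
    t = 2.0 * math.pi * random.random()
    return (center[0] + r * math.cos(t), center[1] + r * math.sin(t))

random.seed(4)
c_side = [a[0] * p[0] + a[1] * p[1]
          for p in (sample_disc(c_center, c_radius) for _ in range(1000))]
d_side = [a[0] * p[0] + a[1] * p[1]
          for p in (sample_disc(d_center, d_radius) for _ in range(1000))]
```

Every sampled point of the first disc should satisfy $a^T x \le b$ and every sampled point of the second $a^T x \ge b$, matching the two inequalities above.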
Log-convexity and log-concavity
- Definition: $f : \mathbb{R}^n \to \mathbb{R}_{> 0}$ is log-convex (log-concave) if $\log(f)$ is convex (concave).
- Convexity:
$\log(f(\theta x + (1 - \theta) y)) \le \theta \log(f(x)) + (1 - \theta) \log(f(y)) = \log \left( f(x)^{\theta} f(y)^{1 - \theta} \right)$
$\iff f(\theta x + (1 - \theta) y) \le f(x)^{\theta} f(y)^{1 - \theta}$
- Remark 2: log-convex $\implies$ convex, since $f(x) = e^{\log f(x)}$ is the composition of the convex, increasing function $\exp$ with the convex function $\log f$ (QED);
concave $\implies$ log-concave
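The multiplicative form of the inequality can be checked on sample functions. In this sketch the functions $f(x) = e^{x^2}$ (log-convex) and $g(x) = e^{-x^2}$ (log-concave) are assumed examples, and a small relative tolerance absorbs floating-point error.

```python
import math
import random

def f(x):
    return math.exp(x * x)        # log f(x) = x^2 is convex, so f is log-convex

def g(x):
    return math.exp(-x * x)       # log g(x) = -x^2 is concave, so g is log-concave

random.seed(5)
tol = 1e-9
log_convex_ok = True
log_concave_ok = True
convex_ok = True
for _ in range(5000):
    x, y = random.uniform(-2, 2), random.uniform(-2, 2)
    t = random.random()
    z = t * x + (1 - t) * y
    # Multiplicative form of log-convexity.
    log_convex_ok &= f(z) <= f(x) ** t * f(y) ** (1 - t) * (1 + tol)
    # Reversed inequality for log-concavity.
    log_concave_ok &= g(z) >= g(x) ** t * g(y) ** (1 - t) * (1 - tol)
    # Log-convex implies convex (Remark 2).
    convex_ok &= f(z) <= (t * f(x) + (1 - t) * f(y)) * (1 + tol)
```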