
Maximum likelihood estimation, maximum a posteriori estimation, and related probability background

2016-09-23 14:26

1. What is a likelihood function

The likelihood of a set of parameter values, θ, given outcomes x, is equal to the probability of those observed outcomes given those parameter values, that is,
$$\mathcal{L}(\theta \mid x) = P(x \mid \theta).$$
The likelihood function is defined differently for discrete and continuous probability distributions.

Discrete probability distribution

Let X be a random variable with a discrete probability distribution p depending on a parameter θ. Then the function
$$\mathcal{L}(\theta \mid x) = p_{\theta}(x) = P_{\theta}(X = x),$$
considered as a function of θ, is called the likelihood function (of θ, given the outcome x of the random variable X). Sometimes the probability of the value x of X for the parameter value θ is written as $P(X = x \mid \theta)$; it is often written as $P(X = x; \theta)$ to emphasize that this differs from $\mathcal{L}(\theta \mid x)$, which is not a conditional probability, because θ is a parameter and not a random variable.

Continuous probability distribution

Let X be a random variable following an absolutely continuous probability distribution with density function f depending on a parameter θ. Then the function
$$\mathcal{L}(\theta \mid x) = f_{\theta}(x),$$
considered as a function of θ, is called the likelihood function (of θ, given the outcome x of X). Sometimes the density function for the value x of X for the parameter value θ is written as $f(x \mid \theta)$; this should not be confused with $\mathcal{L}(\theta \mid x)$, which should not be considered a conditional probability density.

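In short: when the parameter θ of a probability model has not yet been determined and we are given a set of samples X that has already been observed (the outcome is fixed), the likelihood L(θ|X) of the parameter θ is defined as:

the probability (or, for a continuous variable, the probability density) of the sample X when the parameter equals θ.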
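To make the definition concrete, here is a minimal sketch that evaluates a likelihood for fixed data while θ varies. The coin-toss model, the data (7 heads in 10 tosses) and the function name `binomial_likelihood` are assumptions chosen for illustration; they are not from the original text.

```python
from math import comb

def binomial_likelihood(theta, heads, flips):
    """L(theta | x): probability of `heads` heads in `flips` tosses when P(head) = theta."""
    return comb(flips, heads) * theta**heads * (1 - theta) ** (flips - heads)

# Hypothetical observed data. The data stay fixed; only theta changes.
heads, flips = 7, 10
for theta in (0.3, 0.5, 0.7, 0.9):
    value = binomial_likelihood(theta, heads, flips)
    print(f"theta={theta:.1f}  L(theta|x)={value:.4f}")
```

Among the candidate values tried here, θ = 0.7 gives the largest likelihood. Note that the values of L(θ|x) over different θ are not required to sum or integrate to 1, which is one reason the likelihood is not a probability distribution over θ.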

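2. Steps of maximum likelihood estimation

2.1 Discrete variables

Suppose we have a set of n samples: X1, X2, X3, X4, ..., Xn.

Our probability model has k parameters θ1, ..., θk, written collectively as θall.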

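(1) Write down the expression

For discrete random variables we usually assume that the samples are mutually independent, so the likelihood function is

L(θall|X1, X2, ..., Xn) = P(X1|θall) * P(X2|θall) * ... * P(Xn|θall)

L (the likelihood value) = the product of the probabilities of the individual samples under the parameters θ1, ..., θk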
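The product form can be sketched in a few lines. The Poisson model, the sample values and the helper names `poisson_pmf` and `likelihood` are assumptions made for this illustration only.

```python
from math import exp, factorial, prod

def poisson_pmf(x, lam):
    """P(X = x | lam) for a Poisson distribution with rate lam."""
    return exp(-lam) * lam**x / factorial(x)

def likelihood(lam, samples):
    """L(lam | X1..Xn): product of the per-sample probabilities, assuming independence."""
    return prod(poisson_pmf(x, lam) for x in samples)

samples = [2, 3, 1, 4, 2]  # hypothetical counts
for lam in (1.0, 2.4, 4.0):
    print(f"lam={lam:.1f}  L={likelihood(lam, samples):.6f}")
```

With many samples this product becomes extremely small and can underflow in floating point, which is a practical reason to work with the sum of logarithms instead.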

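(2) Find the maximum

This is a function of the k variables θ1, ..., θk. Because this set of samples has already occurred, the larger the probability value, the better.

We want the maximum of this function, so the task becomes an optimization problem.

How best to solve the discrete case deserves further study. Note that "discrete" describes the samples, not the parameters: when θall ranges over a continuum and the likelihood is differentiable in it, setting derivatives to zero works just as in the continuous case.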
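As a worked instance of differentiation in a discrete model, assume (for illustration only) that each Xi is a Bernoulli variable with P(Xi = 1 | θ) = θ, and write s for the number of samples equal to 1. The derivation below is a sketch under that assumption.

```latex
\begin{aligned}
\mathcal{L}(\theta \mid X_1,\dots,X_n) &= \prod_{i=1}^{n} \theta^{X_i}(1-\theta)^{1-X_i}
  = \theta^{s}(1-\theta)^{n-s}, \qquad s = \sum_{i=1}^{n} X_i, \\
\ln \mathcal{L}(\theta) &= s\ln\theta + (n-s)\ln(1-\theta), \\
\frac{d}{d\theta}\ln \mathcal{L}(\theta) &= \frac{s}{\theta} - \frac{n-s}{1-\theta} = 0
  \;\Longrightarrow\; \hat{\theta} = \frac{s}{n}, \\
\frac{d^{2}}{d\theta^{2}}\ln \mathcal{L}(\theta) &= -\frac{s}{\theta^{2}} - \frac{n-s}{(1-\theta)^{2}} < 0
  \quad (0 < s < n).
\end{aligned}
```

The negative second derivative confirms that the stationary point is a maximum, so in this model the estimate is simply the observed fraction of ones.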

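2.2 Continuous variables
(1) The continuous case works the same way; simply replace the probability P with the probability density function f.
L(θall|X1, X2, ..., Xn) = f(X1|θall) * f(X2|θall) * ... * f(Xn|θall)
(2) Take the logarithm of both sides. Because the logarithm is monotonically increasing, the point at which the maximum is attained is unchanged.
ln L(θall|X1, X2, ..., Xn) = ln f(X1|θall) + ln f(X2|θall) + ... + ln f(Xn|θall)
(3) Take the partial derivatives of ln L with respect to θ1, θ2, ..., θk and set each of them to 0. This gives k equations, whose solutions are the stationary points of the function; confirm that the solution you keep is a maximum.
(4) If the equations cannot be solved, or the derivatives do not exist, other methods have to be considered.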
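The steps above can be sketched for a normal distribution, where the two equations from step (3) have a closed-form solution. The choice of a normal model, the simulated data and the function names are assumptions for illustration, not part of the original text.

```python
import math
import random

def gaussian_log_likelihood(mu, sigma, samples):
    """ln L(mu, sigma | X1..Xn): sum of ln f(Xi | mu, sigma) for a normal density f."""
    n = len(samples)
    squared_error = sum((x - mu) ** 2 for x in samples)
    return (-n * math.log(sigma)
            - 0.5 * n * math.log(2 * math.pi)
            - squared_error / (2 * sigma**2))

def gaussian_mle(samples):
    """Solution of d(ln L)/d(mu) = 0 and d(ln L)/d(sigma) = 0."""
    n = len(samples)
    mu_hat = sum(samples) / n
    sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in samples) / n)
    return mu_hat, sigma_hat

random.seed(0)
samples = [random.gauss(5.0, 2.0) for _ in range(1000)]  # hypothetical data
mu_hat, sigma_hat = gaussian_mle(samples)
best = gaussian_log_likelihood(mu_hat, sigma_hat, samples)

# Nudging either parameter away from the estimate should not increase ln L.
nudges = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]
is_local_max = all(
    gaussian_log_likelihood(mu_hat + d_mu, sigma_hat + d_sigma, samples) <= best
    for d_mu, d_sigma in nudges
)
print(f"mu_hat={mu_hat:.3f}  sigma_hat={sigma_hat:.3f}  local max: {is_local_max}")
```

Note that the maximum likelihood estimate of sigma divides by n, not n - 1. For models where step (4) applies and no closed form exists, a numerical optimizer applied to the negative log-likelihood is one common alternative.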

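3. Bayes' formula
Bayes' theorem was developed by the English mathematician Thomas Bayes (1702-1761). It describes the relationship between two conditional probabilities, such as P(A|B) and P(B|A). By the multiplication rule we immediately obtain:
P(A∩B) = P(A)*P(B|A) = P(B)*P(A|B).
The last two terms of this equation can be rearranged (provided P(A) > 0) into:
P(B|A) = P(A|B)*P(B) / P(A).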
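Both identities can be checked on a small example. The joint distribution below is made up for illustration; exact fractions are used so the equalities hold without rounding.

```python
from fractions import Fraction

# Hypothetical joint distribution over two events, keyed by (A occurs, B occurs).
joint = {
    (True, True): Fraction(10, 100),
    (True, False): Fraction(20, 100),
    (False, True): Fraction(30, 100),
    (False, False): Fraction(40, 100),
}

p_a = sum(p for (a, _), p in joint.items() if a)  # P(A)
p_b = sum(p for (_, b), p in joint.items() if b)  # P(B)
p_a_and_b = joint[(True, True)]                   # P(A∩B)

p_a_given_b = p_a_and_b / p_b                     # P(A|B)
p_b_given_a = p_a_and_b / p_a                     # P(B|A)

# Multiplication rule: P(A∩B) = P(A)*P(B|A) = P(B)*P(A|B)
assert p_a * p_b_given_a == p_b * p_a_given_b == p_a_and_b
# Bayes' formula: P(B|A) = P(A|B)*P(B) / P(A)
assert p_b_given_a == p_a_given_b * p_b / p_a
print(f"P(A|B) = {p_a_given_b}, P(B|A) = {p_b_given_a}")
```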

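Bayes' formula simply describes how two conditional probabilities relate to each other; there is nothing mysterious about it.
In posterior estimation the parameters are treated as random variables, so the probability of the parameters given the samples and the probability of the samples given the parameters are exactly such a pair of conditional probabilities.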

4. Posterior probability
The posterior probability is the probability of the parameters $\theta$ given the evidence $X$: $p(\theta \mid X)$.

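Note that the posterior probability treats the parameter as a random variable: it asks for the probability that the parameter is θ, given that the samples have occurred.
This differs from the likelihood of the parameter. The likelihood of θ is still the probability that the sample X occurs when the parameter is θ.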
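The difference can be illustrated by applying Bayes' formula to the parameter, p(θ|X) ∝ P(X|θ) p(θ), on a grid of candidate θ values. The coin-toss data, both priors and all function names here are assumptions made for this sketch.

```python
def grid_posterior(heads, flips, prior, grid):
    """p(theta | X) on a grid: likelihood times prior, normalised to sum to 1."""
    unnormalised = [
        theta**heads * (1 - theta) ** (flips - heads) * prior(theta)
        for theta in grid
    ]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]

def flat_prior(theta):
    """Hypothetical prior: every theta equally plausible."""
    return 1.0

def peaked_prior(theta):
    """Hypothetical prior concentrated around theta = 0.5 (unnormalised)."""
    return theta**4 * (1 - theta) ** 4

grid = [i / 100 for i in range(1, 100)]  # candidate theta values 0.01 .. 0.99
heads, flips = 7, 10                     # hypothetical data

modes = {}
for name, prior in (("flat", flat_prior), ("peaked", peaked_prior)):
    posterior = grid_posterior(heads, flips, prior, grid)
    best_index = max(range(len(grid)), key=posterior.__getitem__)
    modes[name] = grid[best_index]
    print(f"{name} prior: posterior is largest at theta = {modes[name]:.2f}")
```

Under the flat prior the posterior peaks where the likelihood does; under the peaked prior the peak is pulled toward 0.5. Unlike the likelihood, the posterior is a distribution over θ, so its grid values sum to 1.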
