
Coursera Machine Learning Week 8 Quiz: Principal Component Analysis

1 point

1.

Consider the following 2D dataset:

(figure omitted: scatter plot of the dataset)

Which of the following figures correspond to possible values that PCA may return for $u^{(1)}$ (the first eigenvector / first principal component)? Check all that apply (you may have to check more than one figure).

Answer: AB

(answer-choice figures omitted)
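Why both A and B can be correct: an eigenvector is only defined up to sign, so PCA may return $u^{(1)}$ or $-u^{(1)}$ for the same dataset. A minimal NumPy sketch illustrating this on a made-up dataset (the data below is synthetic, not the quiz's figure):

```python
import numpy as np

# Synthetic 2D dataset spread along the 45-degree direction (illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1)) @ np.array([[1.0, 1.0]])
X += 0.1 * rng.normal(size=(100, 2))

X = X - X.mean(axis=0)           # mean normalization (zero-mean features)
Sigma = (X.T @ X) / X.shape[0]   # 2x2 covariance matrix
U, S, _ = np.linalg.svd(Sigma)   # columns of U: principal components

u1 = U[:, 0]
# Both u1 and -u1 are valid: an eigenvector's sign is arbitrary, which is
# why an arrow and its mirrored arrow can both be correct answers.
print(u1, -u1)
```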
1 point

2.

Which of the following is a reasonable way to select the number of principal components $k$? (Recall that $n$ is the dimensionality of the input data and $m$ is the number of input examples.)

Answer: D

Choose the value of $k$ that minimizes the approximation error $\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)}-x^{(i)}_{\text{approx}}\right\|^{2}$.

Choose $k$ to be 99% of $n$ (i.e., $k = 0.99n$, rounded to the nearest integer).

Choose $k$ to be the smallest value so that at least 1% of the variance is retained.

Choose $k$ to be the smallest value so that at least 99% of the variance is retained.
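In practice, the 99%-variance criterion (option D) is computed from the singular values of the covariance matrix: pick the smallest $k$ with $\sum_{i=1}^{k} S_{ii} \,/\, \sum_{i=1}^{n} S_{ii} \ge 0.99$. A minimal sketch, assuming NumPy; `choose_k` is an illustrative name, not from the course:

```python
import numpy as np

def choose_k(X, variance_to_retain=0.99):
    """Return the smallest k retaining the target fraction of variance."""
    X = X - X.mean(axis=0)               # mean-normalize first
    Sigma = (X.T @ X) / X.shape[0]       # n x n covariance matrix
    _, S, _ = np.linalg.svd(Sigma)       # singular values, descending
    retained = np.cumsum(S) / np.sum(S)  # variance retained for k = 1..n
    return int(np.searchsorted(retained, variance_to_retain) + 1)
```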

1 point

3.

Suppose someone tells you that they ran PCA in such a way that "95% of the variance was retained." What is an equivalent statement to this?

Answer: C

$\dfrac{\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)}-x^{(i)}_{\text{approx}}\right\|^{2}}{\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)}\right\|^{2}}\ge 0.05$

$\dfrac{\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)}-x^{(i)}_{\text{approx}}\right\|^{2}}{\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)}\right\|^{2}}\ge 0.95$

$\dfrac{\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)}-x^{(i)}_{\text{approx}}\right\|^{2}}{\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)}\right\|^{2}}\le 0.05$

$\dfrac{\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)}-x^{(i)}_{\text{approx}}\right\|^{2}}{\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)}\right\|^{2}}\le 0.95$
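Why C: in the course's formulation, the fraction of variance retained equals one minus the ratio of the average squared projection error to the total variation in the data, so retaining at least 95% of the variance is the same as keeping that ratio at or below 0.05:

$$1-\frac{\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)}-x^{(i)}_{\text{approx}}\right\|^{2}}{\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)}\right\|^{2}}\ge 0.95
\quad\Longleftrightarrow\quad
\frac{\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)}-x^{(i)}_{\text{approx}}\right\|^{2}}{\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)}\right\|^{2}}\le 0.05.$$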

1 point

4.

Which of the following statements are true? Check all that apply.

Answer: BD

Given only $z^{(i)}$ and $U_{\text{reduce}}$, there is no way to reconstruct any reasonable approximation to $x^{(i)}$.

Given input data $x \in \mathbb{R}^{n}$, it makes sense to run PCA only with values of $k$ that satisfy $k \le n$. (In particular, running it with $k = n$ is possible but not helpful, and $k > n$ does not make sense.)

PCA is susceptible to local optima; trying multiple random initializations may help.

Even if all the input features are on very similar scales, we should still perform mean normalization (so that each feature has zero mean) before running PCA.
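On the first (false) option: a reasonable approximation is recoverable as $x^{(i)}_{\text{approx}} = U_{\text{reduce}}\, z^{(i)}$. A minimal sketch, assuming the input is already mean-normalized; `project_and_reconstruct` is an illustrative helper name:

```python
import numpy as np

def project_and_reconstruct(X, k):
    """Compress mean-normalized X to k dimensions, then map back."""
    Sigma = (X.T @ X) / X.shape[0]
    U, _, _ = np.linalg.svd(Sigma)
    U_reduce = U[:, :k]            # n x k
    Z = X @ U_reduce               # z(i): the compressed representation
    X_approx = Z @ U_reduce.T      # x_approx(i) = U_reduce z(i)
    return Z, X_approx
```

Note also that PCA is computed exactly by an SVD, with no iterative optimization, which is why the local-optima option is false.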

1 point

5.

Which of the following are recommended applications of PCA? Select all that apply.

Answer: CD

Preventing overfitting: Reduce the number of features (in a supervised learning problem), so that there are fewer parameters to learn.

To get more features to feed into a learning algorithm.

Data visualization: Reduce data to 2D (or 3D) so that it can be plotted.

Data compression: Reduce the dimension of your data, so that it takes up less memory / disk space.
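Both recommended uses follow the same reduce-then-use pattern. A short sketch, assuming scikit-learn is available (its `PCA` also accepts a float `n_components` as a variance-to-retain target); the data here is made up for illustration:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

# Hypothetical high-dimensional data (illustrative only)
X = np.random.default_rng(0).normal(size=(200, 50))

# Data visualization: reduce to 2D so it can be plotted
Z2 = PCA(n_components=2).fit_transform(X)
plt.scatter(Z2[:, 0], Z2[:, 1])
plt.show()

# Data compression: a float n_components asks for that fraction of variance
pca = PCA(n_components=0.99, svd_solver="full")
Z = pca.fit_transform(X)           # smaller representation of X
```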