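Zhejiang University, Spring-Summer Semester, Academic Year 2016-2017
Final Examination Paper for the course "Artificial Intelligence". Course No.: 21191890, Offering college: College of Computer Science and Technology
Exam paper: Paper A, Paper B (please tick √ the selected option)
Exam format: closed-book, open-book (please tick √ the selected option); materials permitted in the exam room: ___________
Exam date: June 25, 2017; exam duration: 120 minutes
Take the exam with integrity, stay calm, and commit no violations of exam discipline.
Name: _____________ Student ID: _____________ Department: _____________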
1. Fill in the blanks (30 points)
1) Two common queue structures are used, namely the first-in-first-out (FIFO) queue and the last-in-first-out (LIFO) queue. Breadth-first search uses a __________ queue, and depth-first search uses a __________ queue.
2) In alpha-beta pruning search, the algorithm maintains two values, alpha and beta, which represent the maximum score that the maximizing player is assured of and the minimum score that the minimizing player is assured of, respectively. At the beginning of alpha-beta search, alpha is set to __________ and beta is set to __________, i.e. both players start with their worst possible score.
3) A and B are two random variables. P(A) and P(B) are their probabilities. It is known that P(A) + P(B) = 0.5 and P(A|B) / P(B|A) = 1/4. Then P(A) is _______.
4) In your 10-day vacation in Alaska, you kept the following log on the weather and whether you saw a bear that day:
(rain, bear)      1 day
(¬rain, bear)     2 days
(rain, ¬bear)     6 days
(¬rain, ¬bear)    1 day
a) Compute the marginal probability P(bear) = _______
b) Compute the conditional probability P(¬bear|rain) = _______
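
For readers who want to check their answers to question 4), both probabilities can be obtained by counting days in the log. The following is a minimal sketch; the dictionary `log` and the helper `prob` are illustrative names introduced here and are not part of the exam.

```python
from fractions import Fraction

# Days observed for each (rain, bear) combination, copied from the log above.
log = {
    (True, True): 1,    # (rain, bear)
    (False, True): 2,   # (¬rain, bear)
    (True, False): 6,   # (rain, ¬bear)
    (False, False): 1,  # (¬rain, ¬bear)
}

def prob(event, given=lambda rain, bear: True):
    """Return P(event | given) by counting days.

    Both arguments are predicates on (rain, bear); the default condition
    accepts every day, which yields an unconditional probability.
    """
    total = sum(n for (r, b), n in log.items() if given(r, b))
    matching = sum(n for (r, b), n in log.items() if given(r, b) and event(r, b))
    return Fraction(matching, total)

# Marginal: sum over both weather outcomes.
p_bear = prob(lambda rain, bear: bear)
# Conditional: restrict the count to rainy days only.
p_no_bear_given_rain = prob(lambda rain, bear: not bear,
                            given=lambda rain, bear: rain)
print(p_bear, p_no_bear_given_rain)
```

Exact fractions are used rather than floats so that the results can be compared directly with a hand calculation.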
5) In the figure, the circles show a plot of a training data set of 10 data points, the dashed line shows the function f(x) used to generate the data, and the solid curve shows the higher-order polynomial g(x) fitted to the given 10 data points. The fitted curve passes exactly through each data point, so the RMS error E_RMS = 0, yet g(x) gives a very poor representation of the function f(x). This behavior is known as ___________________ (please select over-fitting or under-fitting to fill in this blank).
6) For a given likelihood function p(x_n|θ), if we obtain a data set of observations X = {x_1, x_2, x_3} and these data points are independent and identically distributed (i.i.d.), then p(X|θ) = p(x_1, x_2, x_3|θ) = ____________.
7) For a multivariate Gaussian distribution N(x|μ, Σ) over a D-dimensional input space x, we have __________ independent parameters for μ and Σ. If Σ is a diagonal matrix and Σ = σ²I, the total number of parameters reduces to __________.
8) Linear basis function models involve linear combinations of fixed nonlinear functions of the input variables. Given basis functions ϕ(x) = (ϕ_0(x), ϕ_1(x), ϕ_2(x))^T, where ϕ_0(x) = 1, and model parameters w = (w_0, w_1, w_2)^T, the linear basis function model is y(x, w) = ____________.
9) In general, a deep convolutional neural network consists of convolutional layers, pooling layers, fully-connected layers and a classifier layer; the softmax is usually employed at the __________ layer.
10) Reinforcement learning mainly consists of a policy, a value function and a model. A __________ maps a state to an action, and a value function is a prediction of future reward. In the Q-value function, a discount factor γ is usually used; the range of γ is __________.
2. Multiple Choice (36 points, only one of the options is correct.)
1) Consider three 2D points a = (0, 0), b = (0, 1), c = (1, 0). Run k-means with two clusters. Let the initial cluster centers be (-1, 0), (0, 2). What clusters will k-means learn after one iteration? _____
(A) {a}, {b, c}
(B) {a, b}, {c}
(C) {a, c}, {b}
(D) none of the above
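
One k-means iteration for the setup in question 1) can be traced in code: assign each point to its nearest center, then move each center to the mean of its assigned points. This is a minimal sketch under the assumption of Euclidean distance; the function name `kmeans_step` is illustrative and not part of the exam.

```python
import math

# Points and initial centers as given in the question.
points = {"a": (0.0, 0.0), "b": (0.0, 1.0), "c": (1.0, 0.0)}
centers = [(-1.0, 0.0), (0.0, 2.0)]

def kmeans_step(points, centers):
    """Run one k-means iteration and return (clusters, new_centers)."""
    # Assignment step: each point joins the cluster of its nearest center.
    clusters = [[] for _ in centers]
    for name, p in points.items():
        nearest = min(range(len(centers)),
                      key=lambda k: math.dist(p, centers[k]))
        clusters[nearest].append(name)

    # Update step: each center moves to the mean of its members.
    new_centers = []
    for k, members in enumerate(clusters):
        if not members:
            # Assumption: an empty cluster keeps its previous center.
            new_centers.append(centers[k])
            continue
        xs = [points[m][0] for m in members]
        ys = [points[m][1] for m in members]
        new_centers.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return clusters, new_centers

clusters, new_centers = kmeans_step(points, centers)
print(clusters, new_centers)
```

Running the script prints the cluster memberships and updated centers, which can be compared against options (A) to (D).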
2) The sigmoid function in a neural network is defined as g(x) = e^x / (1 + e^x). There is another commonly used activation function called the hyperbolic tangent function, which is defined as tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x)). How are these two functions related? ____________
(A) tanh(x) = g(x) − 1
(B) tanh(x) = 2g(x) − 1
(C) tanh(x) = g(2x) − 1
(D) tanh(x) = 2g(2x) − 1
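
The four candidate identities in question 2) can be tested numerically by comparing each one with `math.tanh` at a handful of sample points. This is a sketch only: agreement at sample points suggests, but does not prove, an identity, and the sample values in `xs` are arbitrary choices made here.

```python
import math

def g(x):
    """Sigmoid as defined in the question: e^x / (1 + e^x)."""
    return math.exp(x) / (1.0 + math.exp(x))

# Each option rewritten as a function of x.
candidates = {
    "A": lambda x: g(x) - 1,
    "B": lambda x: 2 * g(x) - 1,
    "C": lambda x: g(2 * x) - 1,
    "D": lambda x: 2 * g(2 * x) - 1,
}

# Arbitrary sample points; several are needed because some options
# coincide with tanh at isolated values such as x = 0.
xs = [-2.0, -0.5, 0.0, 0.3, 1.7]

matches = [
    label for label, f in candidates.items()
    if all(math.isclose(f(x), math.tanh(x), abs_tol=1e-12) for x in xs)
]
print(matches)
```

A candidate that survives the numeric check can then be confirmed algebraically by substituting the definition of g into the right-hand side.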
3) Which nodes will be pruned along with their branches by alpha-beta pruning? _____
(A) I        (B) H, I        (C) G, H, I        (D) C, H, I
4) Consider a 3-puzzle where, as in the usual 8-puzzle game, a tile can only move to an adjacent empty space. Given the initial state, which of the following states cannot be reached? _____
(A)        (B)        (C)        (D)
5) Given two Gaussian distributions N(x|−1,1) and N(x|1,1), which of the following formulas is correct? ______________
(A) N(0|−1,1)>N(0|1,1)
(B) N(−1|−1,1)>N(−1|1,1)
(C) N(0|−1,1)<N(0|1,1)
(D) N(−1|−1,1)<N(−1|1,1)
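
The comparisons in question 5) can be checked by evaluating the two densities directly. The sketch below assumes that the second argument of N(x|μ, σ²) is the variance; since it equals 1 here, reading it as the standard deviation would give the same values. The helper name `normal_pdf` is illustrative.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# Evaluate both densities at the two points that appear in the options.
for x in (0.0, -1.0):
    left = normal_pdf(x, -1.0, 1.0)   # N(x | -1, 1)
    right = normal_pdf(x, 1.0, 1.0)   # N(x |  1, 1)
    print(f"x = {x}: N(x|-1,1) = {left:.4f}, N(x|1,1) = {right:.4f}")
```

Because both distributions share the same variance, the density at a point depends only on its distance from the mean, which is the property the options are probing.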
6) Fisher's criterion is defined to be _________
(A) the separation of the projected class means.
(B) the separation of the projected class variances.
(C) the ratio of the between-class variance to the within-class variance.
(D) the ratio of the within-class variance to the between-class variance.
7) Suppose we have a data set {x_1, …, x_N} drawn from a mixture of two 2D Gaussians, which can be written as p(x) = 0.5N(x|μ_1, Σ_1) + 0.5N(x|μ_2, Σ_2). If Σ_1 = Σ_2 = σ²I in this model, which of the following figures is consistent with the distribution of data points p(x)? _______________
(A)      (B)        (C)        (D)
8) Consider a polynomial curve fitting problem. If the fitted curve oscillates wildly through each point and generalizes poorly, failing to make accurate predictions for new data, we say this behavior is over-fitting. Which of the following methods cannot be used to control over-fitting? ________________
(A) Use fewer training data
(B) Add a validation set, use cross-validation
(C) Add a regularization term to the error function
(D) Use a Bayesian approach with a suitable prior
9) AlexNet (one of the popular multi-layer convolutional neural networks for image classification) is trained in a _________ setting, K-means clustering is employed in a _________ setting, Boosting for classification is implemented in a _________ setting, and a linear regression model for classification is realized in a _________ setting.
(A) unsupervised, supervised, supervised, unsupervised
(B) supervised, supervised, supervised, supervised
(C) supervised, supervised, unsupervised, unsupervised
(D) supervised, unsupervised, supervised, supervised
