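2.10 (iii) From (2.57), $\mathrm{Var}(\hat{\beta}_1) = \sigma^2 / \sum_{i=1}^n (x_i - \bar{x})^2$, while the variance of the through-the-origin estimator is $\mathrm{Var}(\tilde{\beta}_1) = \sigma^2 / \sum_{i=1}^n x_i^2$. From the hint, $\sum_{i=1}^n x_i^2 \ge \sum_{i=1}^n (x_i - \bar{x})^2$, and so $\mathrm{Var}(\tilde{\beta}_1) \le \mathrm{Var}(\hat{\beta}_1)$. A more direct way to see this is to write $\sum_{i=1}^n (x_i - \bar{x})^2 = \sum_{i=1}^n x_i^2 - n\bar{x}^2$, which is less than $\sum_{i=1}^n x_i^2$ unless $\bar{x} = 0$.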
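(iv) For a given sample size, the bias in $\tilde{\beta}_1$ increases as $\bar{x}$ increases (holding $\sum_{i=1}^n x_i^2$ fixed). But as $\bar{x}$ increases, the variance of $\hat{\beta}_1$ also increases relative to $\mathrm{Var}(\tilde{\beta}_1)$. The bias in $\tilde{\beta}_1$ is also small when $\beta_0$ is small. Therefore, whether we prefer $\tilde{\beta}_1$ or $\hat{\beta}_1$ on a mean squared error basis depends on the sizes of $\beta_0$, $\bar{x}$, and $n$ (in addition to the size of $\sum_{i=1}^n x_i^2$).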
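The trade-off in parts (iii) and (iv) can be checked numerically. The sketch below assumes the standard fixed-regressor formulas: $\sigma^2/\sum (x_i - \bar{x})^2$ for the variance of the OLS slope, $\sigma^2/\sum x_i^2$ for the variance of the through-the-origin slope, and $\beta_0 n \bar{x} / \sum x_i^2$ for the bias of the latter. The function name and all numerical values are illustrative and do not come from the text.

```python
def compare_slope_estimators(x, sigma2, beta0):
    """Return (var_ols, var_origin, mse_ols, mse_origin) for fixed regressors x.

    Assumes y = beta0 + beta1 * x + u with Var(u) = sigma2 (hypothetical values).
    """
    n = len(x)
    xbar = sum(x) / n
    sst_x = sum((xi - xbar) ** 2 for xi in x)  # centered sum of squares
    ss_x = sum(xi ** 2 for xi in x)            # uncentered sum of squares

    var_ols = sigma2 / sst_x                   # OLS slope with an intercept
    var_origin = sigma2 / ss_x                 # slope from regression through the origin
    bias_origin = beta0 * n * xbar / ss_x      # zero when beta0 == 0 or xbar == 0

    mse_ols = var_ols                          # the OLS slope is unbiased
    mse_origin = var_origin + bias_origin ** 2
    return var_ols, var_origin, mse_ols, mse_origin


x = [1.0, 2.0, 3.0, 4.0, 5.0]
for beta0 in (0.0, 0.1, 2.0):
    print(beta0, compare_slope_estimators(x, sigma2=1.0, beta0=beta0))
```

With these made-up inputs, the through-the-origin slope always has the smaller variance, but its mean squared error exceeds that of OLS once $\beta_0$ is large enough for the squared bias to dominate.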
3.7 We can use Table 3.2. By definition, $\beta_2 > 0$, and by assumption, Corr(x1,x2) < 0. Therefore, there is a negative bias in $\tilde{\beta}_1$: $E(\tilde{\beta}_1) < \beta_1$. This means that, on average across different random samples, the simple regression estimator underestimates the effect of the training program. It is even possible that $E(\tilde{\beta}_1)$ is negative even though $\beta_1 > 0$.
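A small simulation can make the direction of the bias concrete. This is a minimal sketch under assumed values: the parameters `beta1`, `beta2`, and `delta1` (the population slope of x2 on x1), the sample sizes, and the helper names are all hypothetical and are not taken from the problem. It uses the standard result that the simple regression slope has expectation close to $\beta_1 + \beta_2 \delta_1$ when x2 is omitted.

```python
import random


def simple_slope(x, y):
    """OLS slope from regressing y on x (with an intercept)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx


def average_short_slope(beta1, beta2, delta1, n=200, reps=500, seed=0):
    """Average simple-regression slope on x1 across samples when x2 is omitted."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        x1 = [rng.gauss(0, 1) for _ in range(n)]
        # x2 is negatively related to x1 when delta1 < 0
        x2 = [delta1 * a + rng.gauss(0, 1) for a in x1]
        y = [1.0 + beta1 * a + beta2 * b + rng.gauss(0, 1)
             for a, b in zip(x1, x2)]
        slopes.append(simple_slope(x1, y))
    return sum(slopes) / reps


# Hypothetical values: beta1 > 0, beta2 > 0, Corr(x1, x2) < 0
avg = average_short_slope(beta1=0.5, beta2=2.0, delta1=-0.5)
print(avg)  # should land near beta1 + beta2 * delta1 = -0.5, well below beta1
```

With these assumed values the average simple-regression slope is negative even though the true effect is positive, which is the possibility noted at the end of the answer.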
3.8 Only (ii), omitting an important variable, can cause bias, and this is true only when the omitted variable is correlated with the included explanatory variables. The homoskedasticity assumption, MLR.5, played no role in showing that the OLS estimators are unbiased. (Homoskedasticity was used to obtain the usual variance formulas for the $\hat{\beta}_j$.) Further, the degree of collinearity between the explanatory variables in the sample, even if it is reflected in a correlation as high as .95, does not affect the Gauss-Markov assumptions. Only if there is a perfect linear relationship among two or more explanatory variables is MLR.3 violated.
3.9 (i) Because $x_1$ is highly correlated with $x_2$ and $x_3$, and these latter variables have large partial effects on y, the simple and multiple regression coefficients on $x_1$ can differ by large amounts. We have not done this case explicitly, but given equation (3.46) and the discussion with a single omitted variable, the intuition is pretty straightforward.
(ii) Here we would expect $\tilde{\beta}_1$ and $\hat{\beta}_1$ to be similar (subject, of course, to what we mean by “almost uncorrelated”). The amount of correlation between $x_2$ and $x_3$ does not directly affect the multiple regression estimate on $x_1$ if $x_1$ is essentially uncorrelated with $x_2$ and $x_3$.
(iii) In this case we are (unnecessarily) introducing multicollinearity into the regression: $x_2$ and $x_3$ have small partial effects on y and yet $x_2$ and $x_3$ are highly correlated with $x_1$. Adding $x_2$ and $x_3$ likely increases the standard error of the coefficient on $x_1$ substantially, so se($\hat{\beta}_1$) is likely to be much larger than se($\tilde{\beta}_1$).
(iv) In this case, adding $x_2$ and $x_3$ will decrease the residual variance without causing much collinearity (because $x_1$ is almost uncorrelated with $x_2$ and $x_3$), so we should see se($\hat{\beta}_1$) smaller than se($\tilde{\beta}_1$). The amount of correlation between $x_2$ and $x_3$ does not directly affect se($\hat{\beta}_1$).
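The reasoning in parts (iii) and (iv) follows from the usual expression for the standard error of a slope coefficient, $\hat{\sigma} / \sqrt{SST_1 (1 - R_1^2)}$, where $R_1^2$ is the R-squared from regressing $x_1$ on the other regressors. The sketch below plugs in made-up numbers for each case; none of the values come from the text, and `se_slope` is a hypothetical helper name.

```python
import math


def se_slope(sigma_hat, sst_1, r2_1):
    """Standard error of the coefficient on x1: sigma_hat / sqrt(SST_1 * (1 - R_1^2))."""
    return sigma_hat / math.sqrt(sst_1 * (1.0 - r2_1))


sst_1 = 100.0  # hypothetical total sample variation in x1

# Simple regression: no other regressors, so R_1^2 = 0.
se_simple = se_slope(sigma_hat=1.0, sst_1=sst_1, r2_1=0.0)

# Part (iii): x2 and x3 barely lower sigma_hat but are highly correlated with x1.
se_case_iii = se_slope(sigma_hat=0.98, sst_1=sst_1, r2_1=0.90)

# Part (iv): x2 and x3 lower sigma_hat a lot and are nearly uncorrelated with x1.
se_case_iv = se_slope(sigma_hat=0.60, sst_1=sst_1, r2_1=0.02)

print(se_simple, se_case_iii, se_case_iv)
```

Under these assumed inputs the standard error roughly triples in case (iii) and falls by about 40 percent in case (iv), matching the qualitative conclusions above.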
3.11 (i) $\beta_1 < 0$ because more pollution can be expected to lower housing values; note that $\beta_1$ is the elasticity of price with respect to nox. $\beta_2$ is probably positive because rooms roughly measures the size of a house. (However, it does not allow us to distinguish homes where each room is large from homes where each room is small.)
(ii) If we assume that rooms increases with quality of the home, then log(nox) and rooms are negatively correlated when poorer neighborhoods have more pollution, something that is often true. We can use Table 3.2 to determine the direction of the bias. If $\beta_2 > 0$ and Corr(x1,x2) < 0, the simple regression estimator $\tilde{\beta}_1$ has a downward bias. But because $\beta_1 < 0$, this means that the simple regression, on average, overstates the importance of pollution. [$E(\tilde{\beta}_1)$ is more negative than $\beta_1$.]
(iii) This is what we expect from the typical sample based on our analysis in part (ii). The simple regression estimate, −1.043, is more negative (larger in magnitude) than the multiple regression estimate, −0.718. As those estimates are only for one sample, we can never know which is closer to $\beta_1$. But if this is a “typical” sample, $\beta_1$ is closer to −0.718.
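The gap between the two estimates can be tied back to the omitted variable algebra. In any sample, the simple and multiple regression slopes satisfy $\tilde{\beta}_1 = \hat{\beta}_1 + \hat{\beta}_2 \tilde{\delta}_1$, where $\tilde{\delta}_1$ is the slope from regressing rooms on log(nox). The sketch below uses only the two estimates quoted in part (iii); the separate values of $\hat{\beta}_2$ and $\tilde{\delta}_1$ are not given here, so only their product is recovered.

```python
beta_tilde_1 = -1.043  # simple regression estimate on log(nox), from part (iii)
beta_hat_1 = -0.718    # multiple regression estimate on log(nox), from part (iii)

# In-sample identity: beta_tilde_1 = beta_hat_1 + beta_hat_2 * delta_tilde_1,
# so the difference equals the product beta_hat_2 * delta_tilde_1.
implied_product = beta_tilde_1 - beta_hat_1

print(round(implied_product, 3))  # -0.325
```

A negative product is what part (ii) predicts: a positive coefficient on rooms combined with a negative sample relationship between rooms and log(nox).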
6.4 (i) The answer is not entirely obvious, but one must properly interpret the coefficient on alcohol in either case. If we include attend, then we are measuring the effect of alcohol consumption on college GPA, holding attendance fixed. Because attendance is likely to be an important mechanism through which drinking affects performance, we probably do not want to hold it fixed in the analysis. If we do include attend, then we interpret the estimate of the coefficient on alcohol as measuring those effects on colGPA that are not due to attending class. (For example, we could be measuring the effects that drinking alcohol has on study time.) To get a total effect of alcohol consumption, we would leave attend out.
(ii) We would want to include SAT and hsGPA as controls, as these measure student abilities and motivation. Drinking behavior in college could be correlated with one’s performance in high school and on standardized tests. Other factors, such as family background, would also be good controls.
6.6 The second equation is clearly preferred, as its adjusted R-squared is notably larger than that in the other two equations. The second equation contains the same number of estimated parameters as the first, and one fewer than the third. The second equation is also easier to interpret than the third.
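The comparison in 6.6 rests on the adjusted R-squared, which penalizes additional parameters: $\bar{R}^2 = 1 - (1 - R^2)(n - 1)/(n - k - 1)$, with $k$ the number of slope parameters. The sketch below shows how a model with one more regressor can have a higher R-squared but a lower adjusted R-squared. The sample size and R-squared values are invented for illustration and are not those of the three equations in the problem; the comparison is only meaningful when the models share the same dependent variable.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Hypothetical fits on the same sample of n = 100 observations.
smaller_model = adjusted_r2(r2=0.300, n=100, k=2)
larger_model = adjusted_r2(r2=0.305, n=100, k=3)

print(smaller_model, larger_model)
```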
7.3 (i) The t statistic on $hsize^2$ is over four in absolute value, so there is very strong evidence that it belongs in the equation. We obtain the optimal size by finding the turnaround point; this is the value of hsize that maximizes $\widehat{sat}$ (other things fixed): $19.3/(2 \cdot 2.19) \approx 4.41$. Because hsize is measured in hundreds, the optimal size of graduating class is about 441.
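The turnaround calculation can be reproduced in a few lines. For a quadratic $b_1 x + b_2 x^2$, the extremum is at $x^* = -b_1/(2 b_2)$. The sketch below uses the magnitudes 19.3 and 2.19 from the answer; the negative sign on the quadratic coefficient is an assumption inferred from the text describing the turnaround as a maximum, and `turnaround_point` is a hypothetical helper name.

```python
def turnaround_point(b_linear, b_quadratic):
    """Value of x where b_linear * x + b_quadratic * x**2 reaches its extremum."""
    return -b_linear / (2.0 * b_quadratic)


# Coefficients on hsize and hsize^2; the negative sign on the quadratic term
# is assumed here, since the text describes the turnaround as a maximum.
hsize_star = turnaround_point(19.3, -2.19)

print(round(hsize_star, 2))     # 4.41 (hundreds of students)
print(round(hsize_star * 100))  # 441 students
```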