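Assignment 2:
Open "bankloan.sav". The data are customer default records collected by a bank. The dependent variable to analyze is default; the other variables are factors that may influence whether a customer defaults.
1. Use logistic regression, discriminant analysis, and a classification tree to analyze the data and determine which variables affect customer default.
2. Compare the classification accuracy of these methods.
Logistic regression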
Classification Table (a, b)
                                        Predicted
                                        Previously defaulted    Percentage
Observed                                No        Yes           Correct
Step 0  Previously defaulted  No        516       0             100.0
                              Yes       183       0             .0
        Overall Percentage                                      73.8
a. Constant is included in the model.
b. The cut value is .500
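As the table shows, "No" is the most frequent category of Previously defaulted, so the constant-only model assigns every case to "No". The resulting classification accuracy is 73.8% (516 of 699 cases).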
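The baseline figure can be checked directly from the Step 0 counts. The short Python sketch below assumes only the two observed counts from the table (516 "No", 183 "Yes"); the variable names are illustrative and not part of the SPSS output.

```python
# Observed counts from the Step 0 classification table.
observed = {"No": 516, "Yes": 183}

# A constant-only model predicts the most frequent category for every case.
majority_class = max(observed, key=observed.get)
baseline_accuracy = 100 * observed[majority_class] / sum(observed.values())

print(majority_class, round(baseline_accuracy, 1))
```

Any fitted model should be judged against this majority-class baseline rather than against 0%.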
Hosmer and Lemeshow Test
Step    Chi-square    df    Sig.
1       8.467         8     .389
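As the table shows, Chi-square = 8.467 and Sig. = 0.389 > 0.05. At the 5% significance level the null hypothesis that the model fits the data cannot be rejected, so the model fits the data acceptably.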
Model Summary
Step    -2 Log likelihood    Cox & Snell R Square    Nagelkerke R Square
1       551.427(a)           .303                    .443
a. Estimation terminated at iteration number 6 because parameter estimates changed by less than .001.
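As the table above shows, the -2 Log likelihood is 551.427, and neither Cox & Snell R Square (.303) nor Nagelkerke R Square (.443) is very large, which indicates that the model fit is only moderate.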
Contingency Table for Hosmer and Lemeshow Test
              Previously defaulted = No    Previously defaulted = Yes
              Observed    Expected         Observed    Expected          Total
Step 1  1     70          69.745           0           .255              70
        2     69          68.805           1           1.195             70
        3     64          66.852           6           3.148             70
        4     64          63.931           6           6.069             70
        5     65          59.868           5           10.132            70
        6     51          54.527           19          15.473            70
        7     49          48.175           21          21.825            70
        8     39          40.568           31          29.432            70
        9     34          30.362           36          39.638            70
        10    11          13.166           58          55.834            69
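The Hosmer and Lemeshow statistic can be reproduced from this contingency table as a Pearson chi-square over the observed and expected counts in both outcome columns, with df equal to the number of groups minus 2. The sketch below is only a check on the reported values; it uses the rounded expected counts from the table, so the result agrees with SPSS only to about two decimals. The helper `chi2_sf_even_df` is a hypothetical name, and its closed form is valid only when df is even.

```python
import math

# (observed No, expected No, observed Yes, expected Yes) for the ten groups.
groups = [
    (70, 69.745, 0, 0.255),
    (69, 68.805, 1, 1.195),
    (64, 66.852, 6, 3.148),
    (64, 63.931, 6, 6.069),
    (65, 59.868, 5, 10.132),
    (51, 54.527, 19, 15.473),
    (49, 48.175, 21, 21.825),
    (39, 40.568, 31, 29.432),
    (34, 30.362, 36, 39.638),
    (11, 13.166, 58, 55.834),
]

# Pearson chi-square summed over both outcome columns of every group.
chi_square = sum(
    (obs_no - exp_no) ** 2 / exp_no + (obs_yes - exp_yes) ** 2 / exp_yes
    for obs_no, exp_no, obs_yes, exp_yes in groups
)
df = len(groups) - 2


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form holds only for even df."""
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))


p_value = chi2_sf_even_df(chi_square, df)
print(round(chi_square, 2), df, round(p_value, 2))
```

A p-value above 0.05 here means the observed and expected counts do not differ significantly, which is the desired outcome for a goodness-of-fit test.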
Classification Table (a)
                                        Predicted
                                        Previously defaulted    Percentage
Observed                                No        Yes           Correct
Step 1  Previously defaulted  No        471       45            91.3
                              Yes       90        93            50.8
        Overall Percentage                                      80.7
a. The cut value is .500
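Comparing this table with the initial one, the classification has clearly improved: relative to the Step 0 classification table, the overall case classification accuracy rises to 80.7%.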
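The percentages in the Step 1 classification table follow from its four cell counts. The following sketch, with illustrative variable names, recomputes overall accuracy together with the per-class rates (specificity for "No", sensitivity for "Yes").

```python
# Step 1 classification table: rows are observed, columns are predicted.
true_no, false_yes = 471, 45   # observed No:  predicted No / predicted Yes
false_no, true_yes = 90, 93    # observed Yes: predicted No / predicted Yes

total = true_no + false_yes + false_no + true_yes
accuracy = 100 * (true_no + true_yes) / total
specificity = 100 * true_no / (true_no + false_yes)   # non-defaulters classified correctly
sensitivity = 100 * true_yes / (false_no + true_yes)  # defaulters classified correctly

print(round(accuracy, 1), round(specificity, 1), round(sensitivity, 1))
```

The overall figure hides an imbalance: non-defaulters are classified well, while only about half of the actual defaulters are identified at the .500 cut value.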
Variables in the Equation
                       B         S.E.    Wald      df    Sig.    Exp(B)
Step 1(a)  age         .034      .017    3.887     1     .049    1.035
           ed          .090      .123    .532      1     .466    1.094
           employ      -.258     .033    60.385    1     .000    .773
           address     -.105     .023    20.251    1     .000    .901
           income      -.009     .008    1.159     1     .282    .991
           debtinc     .067      .031    4.881     1     .027    1.070
           creddebt    .625      .113    30.724    1     .000    1.869
           othdebt     .062      .077    .642      1     .423    1.064
           Constant    -1.551    .619    6.274     1     .012    .212
a. Variable(s) entered on step 1: age, ed, employ, address, income, debtinc, creddebt, othdebt.
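As the table shows, ed, income, and othdebt have Sig. > 0.10, so their effects in the regression model are not significant. age (Sig. = .049) and debtinc (Sig. = .027) are significant at the 5% level, and employ, address, and creddebt are highly significant (Sig. = .000).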
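The Exp(B) column is the exponential of B and can be read as the multiplicative change in the odds of default for a one-unit increase in the variable; the Wald statistic is approximately (B / S.E.) squared. The sketch below recomputes both from the rounded table values, so small discrepancies from the SPSS figures are expected, especially for Wald.

```python
import math

# (B, S.E., reported Exp(B)) copied from the Variables in the Equation table.
coefficients = {
    "age": (0.034, 0.017, 1.035),
    "ed": (0.090, 0.123, 1.094),
    "employ": (-0.258, 0.033, 0.773),
    "address": (-0.105, 0.023, 0.901),
    "income": (-0.009, 0.008, 0.991),
    "debtinc": (0.067, 0.031, 1.070),
    "creddebt": (0.625, 0.113, 1.869),
    "othdebt": (0.062, 0.077, 1.064),
    "Constant": (-1.551, 0.619, 0.212),
}

odds_ratios = {}
for name, (b, se, reported_exp_b) in coefficients.items():
    odds_ratios[name] = math.exp(b)   # odds ratio per one-unit increase
    wald = (b / se) ** 2              # approximate, because B and S.E. are rounded
    print(f"{name:9s} exp(B)={odds_ratios[name]:.3f} Wald~{wald:.2f}")
```

For example, each additional year with the current employer multiplies the odds of default by about 0.77, while each additional thousand of credit card debt multiplies them by about 1.87, holding the other variables fixed.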
Forward stepwise method:
Variables not in the Equation
                                 Score      df    Sig.
Step 0  Variables   age          13.210     1     .000
                    ed           9.099      1     .003
                    employ       55.843     1     .000
                    address      18.768     1     .000
                    income       3.526      1     .060
                    debtinc      106.506    1     .000
                    creddebt     42.116     1     .000
                    othdebt      14.871     1     .000
        Overall Statistics       201.719    8     .000
Variables not in the Equation
                                 Score      df    Sig.
Step 1  Variables   age          16.400     1     .000
                    ed           10.307     1     .001
                    employ       60.633     1     .000
                    address      23.220     1     .000
                    income       3.218      1     .073
                    creddebt     2.302      1     .129
                    othdebt      6.679      1     .010
        Overall Statistics       113.903    7     .000
Step 2  Variables   age          .006       1     .940
                    ed           3.792      1     .052
                    address      8.306      1     .004
                    income       21.294     1     .000
                    creddebt     64.810     1     .000
                    othdebt      4.429      1     .035
        Overall Statistics       83.978     6     .000
Step 3  Variables   age          .637       1     .425
                    ed           .016       1     .898
                    address      17.701     1     .000
                    income       .798       1     .372
                    othdebt      .009       1     .925
        Overall Statistics       22.632     5     .000
Step 4  Variables   age          3.591      1     .058
                    ed           .380       1     .538
                    income       .015       1     .902
                    othdebt      .302       1     .582
        Overall Statistics       5.179      4     .269
Model if Term Removed
                                            Change in -2
Variable            Model Log Likelihood    Log Likelihood    df    Sig. of the Change
Step 1  debtinc     -401.879                103.208           1     .000
Step 2  employ      -350.275                69.972            1     .000
        debtinc     -369.531                108.484           1     .000
Step 3  employ      -349.117                123.054           1     .000
        debtinc     -299.505                23.831            1     .000
        creddebt    -315.289                55.398            1     .000
Step 4  employ      -333.305                110.179           1     .000
        address     -287.590                18.748            1     .000
        debtinc     -289.870                23.308            1     .000
        creddebt    -310.976                65.521            1     .000
Classification Table (a)
                                        Predicted
                                        Previously defaulted    Percentage
Observed                                No        Yes           Correct
Step 1  Previously defaulted  No        488       28            94.6
                              Yes       135       48            26.2
        Overall Percentage                                      76.7
Step 2  Previously defaulted  No        479       37            92.8
                              Yes       109       74            40.4
        Overall Percentage                                      79.1
Step 3  Previously defaulted  No        476       40            92.2
                              Yes       99        84            45.9
        Overall Percentage                                      80.1
Step 4  Previously defaulted  No        477       39            92.4
                              Yes       91        92            50.3
        Overall Percentage                                      81.4
a. The cut value is .500
Model Summary
Step    -2 Log likelihood    Cox & Snell R Square    Nagelkerke R Square
1       700.550(a)           .137                    .201
2       630.578(b)           .219                    .321
3       575.180(b)           .279                    .408
4       556.432(c)           .298                    .436
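The Model Summary and the Model if Term Removed tables are linked: the drop in -2 Log likelihood from one forward step to the next should equal the "Change in -2 Log Likelihood" reported for the variable entered at that step. The sketch below checks this with the values from the two tables; the entry order (employ, creddebt, address after debtinc) is taken from the forward procedure's output, and the dictionary names are illustrative.

```python
# -2 Log likelihood by step, from the Model Summary table.
neg2ll = {1: 700.550, 2: 630.578, 3: 575.180, 4: 556.432}

# Variable entered at each step after the first (debtinc enters at step 1).
entered = {2: "employ", 3: "creddebt", 4: "address"}

# Improvement in -2LL attributable to each newly entered variable.
drops = {entered[step]: round(neg2ll[step - 1] - neg2ll[step], 3) for step in entered}

# "Change in -2 Log Likelihood" for the same variable at its entry step.
reported = {"employ": 69.972, "creddebt": 55.398, "address": 18.748}

for name in drops:
    print(name, drops[name], reported[name])
```

Each of these changes is a likelihood-ratio chi-square with 1 df, which is why all three are reported with Sig. = .000.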
Variables in the Equation
                       B         S.E.    Wald       df    Sig.    Exp(B)
Step 1(a)  debtinc     .132      .014    85.544     1     .000    1.141
           Constant    -2.530    .195    168.488    1     .000    .080
Step 2(b)  employ      -.141     .019    53.493     1     .000    .869
           debtinc     .145      .016    87.340     1     .000    1.156
           Constant    -1.695    .219    59.886     1     .000    .184
Step 3(c)  employ      -.244     .027    79.984     1     .000    .784
           debtinc     .088      .018    23.375     1     .000    1.092
           creddebt    .502      .081    38.608     1     .000    1.653
           Constant    -1.228    .231    28.222     1     .000    .293
Step 4(d)  employ      -.242     .028    74.540     1     .000    .785
           address     -.081     .020    17.043     1     .000    .922
           debtinc     .088      .019    22.686     1     .000    1.092
           creddebt    .572      .087    43.043     1     .000    1.773
           Constant    -.794     .252    9.954      1     .002    .452
a. Variable(s) entered on step 1: debtinc.
b. Variable(s) entered on step 2: employ.
c. Variable(s) entered on step 3: creddebt.
d. Variable(s) entered on step 4: address.
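The Step 4 coefficients define the final forward model, so a predicted default probability can be sketched by applying the logistic function to the linear predictor. The code below is a minimal illustration under two assumptions: the inputs use the same units as the SPSS variables (years, ratio x100, thousands), and the example customer is hypothetical, not a row from bankloan.sav. The function name `default_probability` is also illustrative.

```python
import math

# Step 4 coefficients from the Variables in the Equation table.
STEP4_COEFFICIENTS = {
    "employ": -0.242,    # years with current employer
    "address": -0.081,   # years at current address
    "debtinc": 0.088,    # debt to income ratio (x100)
    "creddebt": 0.572,   # credit card debt in thousands
}
CONSTANT = -0.794
CUT_VALUE = 0.5          # same cut value as the classification tables


def default_probability(customer):
    """Logistic transform of the Step 4 linear predictor."""
    logit = CONSTANT + sum(
        coef * customer[name] for name, coef in STEP4_COEFFICIENTS.items()
    )
    return 1 / (1 + math.exp(-logit))


# Hypothetical customer, for illustration only.
example = {"employ": 5, "address": 3, "debtinc": 10.0, "creddebt": 1.5}
probability = default_probability(example)
predicted_default = probability >= CUT_VALUE
print(round(probability, 3), predicted_default)
```

Lowering the cut value below .500 would flag more customers as likely defaulters, trading specificity for sensitivity.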
Discriminant analysis
Stepwise discriminant analysis is used here; the results are shown in the tables below.
Variables in the Analysis
Step                                     Tolerance    F to Remove    Wilks' Lambda
1     Debt to income ratio (x100)        1.000        125.293
2     Debt to income ratio (x100)        .992         130.842        .920
      Years with current employer        .992         65.708         .848
3     Debt to income ratio (x100)        .766         36.043         .766
      Years with current employer        .716         111.035        .844
      Credit card debt in thousands      .573         44.384         .775
4     Debt to income ratio (x100)        .766         35.137         .753
      Years with current employer        .691         89.788         .809
      Credit card debt in thousands      .564         48.856         .767
      Years at current address           .898         10.895         .728
Variables Not in the Analysis
Step                                     Tolerance    Min. Tolerance    F to Enter    Wilks' Lambda
0     Age in years                       1.000        1.000             13.426        .981
      Level of education                 1.000        1.000             9.192         .987
      Years with current employer        1.000        1.000             60.518        .920
      Years at current address           1.000        1.000             19.231        .973
      Household income in thousands      1.000        1.000             3.534         .995
      Debt to income ratio (x100)        1.000        1.000             125.293       .848
      Credit card debt in thousands      1.000        1.000             44.688        .940
      Other debt in thousands            1.000        1.000             15.151        .979
1     Age in years                       .994         .994              17.403        .827
      Level of education                 .999         .999              10.146        .835
      Years with current employer        .992         .992              65.708        .775
      Years at current address           .993         .993              23.970        .819
      Household income in thousands      1.000        1.000             3.029         .844
      Credit card debt in thousands      .793         .793              2.723         .844
      Other debt in thousands            .664         .664              8.594         .837
2     Age in years                       .725         .723              .003          .775
      Level of education                 .983         .977              4.404         .770
      Years at current address           .912         .911              6.601         .767
      Household income in thousands      .604         .599              17.057        .756
      Credit card debt in thousands      .573         .573              44.384        .728
      Other debt in thousands            .486         .486              1.980         .772
