# MATH 4300b

This is the home page for Math 4300b, Reading Course in Topics in Applied Statistics in the Summer 2006

• Learn R R:_Getting_started
• Textbook: Faraway, J. (2006) "Extending the Linear Model with R"
• Assignments: work on even numbered problems

## Chapter 1

### Question 2

#### Details

Twelve core samples from petroleum reservoirs were sampled by 4 cross-sections. Each core sample was measured for permeability, and each cross-section has total area of pores, total perimeter of pores, and shape.

#### Summary

```>library(faraway)
>data(rock)
area    peri     shape perm
1 4990 2791.90 0.0903296  6.3
2 7002 3892.60 0.1486220  6.3
3 7558 3930.66 0.1833120  6.3
4 7352 3869.32 0.1170630  6.3
5 7943 3948.54 0.1224170 17.1
6 7979 4010.15 0.1670450 17.1
> summary(rock)
area            peri            shape              perm
Min.   : 1016   Min.   : 308.6   Min.   :0.09033   Min.   :   6.30
1st Qu.: 5305   1st Qu.:1414.9   1st Qu.:0.16226   1st Qu.:  76.45
Median : 7487   Median :2536.2   Median :0.19886   Median : 130.50
Mean   : 7188   Mean   :2682.2   Mean   :0.21811   Mean   : 415.45
3rd Qu.: 8870   3rd Qu.:3989.5   3rd Qu.:0.26267   3rd Qu.: 777.50
Max.   :12212   Max.   :4864.2   Max.   :0.46413   Max.   :1300.00
```

#### First Linear Model

```> lmod<- lm(perm ~ area+peri, rock)
> coef(lmod)
(Intercept)        area        peri
696.6882869   0.1063866  -0.3899456
> predict(lmod)
1            2            3            4            5            6            7            8            9
138.8686099  -76.2945490  -31.9849061  -29.9812931    2.0017229  -18.1929039   -5.0111249 -124.1997605  153.7962011
10           11           12           13           14           15           16           17           18
171.9176359  -54.0828274   59.7500760  255.7815298  268.2809255  138.9946948  225.6235901   65.7721221  -27.3112085
19           20           21           22           23           24           25           26           27
63.5677960   -0.3207833  193.5248265  262.6483980  164.0541645  200.5694321  602.2328598  604.8960561  738.4223204
28           29           30           31           32           33           34           35           36
636.5651050  667.2877993  734.0509709  743.5772447  966.7705521  776.5216595  876.2550277  956.3754548  966.7705521
37           38           39           40           41           42           43           44           45
528.9054859  667.1242240  607.7701571  615.5818660  866.5775262  684.4235346  846.2286680  742.8790519  608.5428761
46           47           48
639.2200528  715.5902715 1151.2583149
> residuals(lmod)
1           2           3           4           5           6           7           8           9          10
-132.568610   82.594549   38.284906   36.281293   15.098277   35.292904   22.111125  141.299761  -34.796201  -52.917636
11          12          13          14          15          16          17          18          19          20
173.082827   59.249924 -173.381530 -185.880926  -56.594695 -143.223590   -7.172122   85.911209   -4.967796   58.920783
21          22          23          24          25          26          27          28          29          30
-51.524827 -120.648398  -22.054165  -58.569432  137.767140  135.103944    1.577680  103.434895  222.712201  155.949029
31          32          33          34          35          36          37          38          39          40
146.422755  -76.770552  173.478341   73.744972   -6.375455  -16.770552 -428.905486 -567.124224 -507.770157 -515.581866
41          42          43          44          45          46          47          48
433.422474  615.576465  453.771332  557.120948  -28.542876  -59.220053 -135.590272 -571.258315
> deviance(lmod)
[1] 2853383
> cor(predict(lmod),rock\$perm)^2
[1] 0.6832807
> summary(lmod)

Call:
lm(formula = perm ~ area + peri, data = rock)

Residuals:
Min       1Q   Median       3Q      Max
-571.258  -63.608   -1.695  111.352  615.576

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 696.68829  107.00334   6.511 5.43e-08 ***
area          0.10639    0.02406   4.421 6.14e-05 ***
peri         -0.38995    0.04511  -8.645 4.05e-11 ***
---

Residual standard error: 251.8 on 45 degrees of freedom
Multiple R-Squared: 0.6833,     Adjusted R-squared: 0.6692
F-statistic: 48.54 on 2 and 45 DF,  p-value: 5.823e-12
```

#### Second Linear Model

```> lmod2 <- lm(perm ~ area+peri+shape, rock)
> summary(lmod2)

Call:
lm(formula = perm ~ area + peri + shape, data = rock)

Residuals:
Min      1Q  Median      3Q     Max
-750.26  -59.57   10.66  100.25  620.91

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 485.61797  158.40826   3.066 0.003705 **
area          0.09133    0.02499   3.654 0.000684 ***
peri         -0.34402    0.05111  -6.731 2.84e-08 ***
shape       899.06926  506.95098   1.773 0.083070 .
---

Residual standard error: 246 on 44 degrees of freedom
Multiple R-Squared: 0.7044,     Adjusted R-squared: 0.6843
F-statistic: 34.95 on 3 and 44 DF,  p-value: 1.033e-11
```

#### F-Test

```> anova(lmod,lmod2)
Analysis of Variance Table

Model 1: perm ~ area + peri
Model 2: perm ~ area + peri + shape
Res.Df     RSS Df Sum of Sq      F  Pr(>F)
1     45 2853383
2     44 2663023  1    190360 3.1452 0.08307 .
---
```

## Chapter 2

### Question 4

```> library(faraway)

Attaching package: 'faraway'

The following object(s) are masked from package:datasets :

faithful

> data(aflatoxin)
> aflatoxin
dose total tumor
1    0    18     0
2    1    22     2
3    5    22     1
4   15    21     4
5   50    25    20
6  100    28    28
> xtabs ((tumor/total)~dose, aflatoxin)
dose
0          1          5         15         50        100
0.00000000 0.09090909 0.04545455 0.19047619 0.80000000 1.00000000
> nontumor <- aflatoxin\$total-aflatoxin\$tumor
> nontumor
[1] 18 20 21 17  5  0
> modl <- glm(cbind(tumor,nontumor) ~dose, family=binomial, aflatoxin)
> summary(modl)

Call:
glm(formula = cbind(tumor, nontumor) ~ dose, family = binomial,
data = aflatoxin)

Deviance Residuals:
1        2        3        4        5        6
-1.2995   0.7959  -0.4814   0.4174  -0.1629   0.3774

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -3.03604    0.48226  -6.295 3.07e-10 ***
dose         0.09009    0.01456   6.189 6.04e-10 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 116.5243  on 5  degrees of freedom
Residual deviance:   2.8970  on 4  degrees of freedom
AIC: 17.685

Number of Fisher Scoring iterations: 5

> modp <- glm(cbind(tumor,nontumor) ~dose, family=binomial(link=probit), data=aflatoxin)
> modc <- glm(cbind(tumor,nontumor) ~dose, family=binomial(link=cloglog), data=aflatoxin)
> fitted(modl)
1          2          3          4          5          6
0.04582416 0.04992838 0.07007143 0.15647348 0.81281443 0.99745948

> predict (modl)
1         2         3         4         5         6
-3.036036 -2.945948 -2.585593 -1.684705  1.468402  5.972841

> cbind(fitted(modl),fitted(modp),fitted(modc))

[,1]       [,2]       [,3]
1 0.04582416 0.04126148 0.05294011
2 0.04992838 0.04606205 0.05656383
3 0.07007143 0.06989618 0.07361027
4 0.15647348 0.16922517 0.14022564
5 0.81281443 0.80522509 0.80574950
6 0.99745948 0.99972699 1.00000000
```

### Question 6

```> library(faraway)

Attaching package: 'faraway'

The following object(s) are masked from package:datasets :

faithful

> data(turtle)
> turtle
temp male female
1  27.2    1      9
2  27.2    0      8
3  27.2    1      8
4  27.7    7      3
5  27.7    4      2
6  27.7    6      2
7  28.3   13      0
8  28.3    6      3
9  28.3    7      1
10 28.4    7      3
11 28.4    5      3
12 28.4    7      2
13 29.9   10      1
14 29.9    8      0
15 29.9    9      0
> bmod <- glm(cbind(male,female)~temp, family=binomial, turtle)
> bmod

Call:  glm(formula = cbind(male, female) ~ temp, family = binomial,      data = turtle)

Coefficients:
(Intercept)         temp
-61.318        2.211

Degrees of Freedom: 14 Total (i.e. Null);  13 Residual
Null Deviance:      74.51
Residual Deviance: 24.94        AIC: 53.84

> halfnorm(residuals(bmod))

> (sigma2 <- sum(residuals(bmod,type="pearson")^2)/12)
[1] 2.186861
> summary (bmod,dispersion=sigma2)

Call:
glm(formula = cbind(male, female) ~ temp, family = binomial,
data = turtle)

Deviance Residuals:
Min       1Q   Median       3Q      Max
-2.0721  -1.0291  -0.2714   0.8087   2.5550

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -61.3183    17.7787  -3.449 0.000563 ***
temp          2.2110     0.6371   3.470 0.000520 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 2.186861)

Null deviance: 74.508  on 14  degrees of freedom
Residual deviance: 24.942  on 13  degrees of freedom
AIC: 53.836

Number of Fisher Scoring iterations: 5
```

## Chapter 3

### Question 2

```> library(faraway)

Attaching package: 'faraway'

The following object(s) are masked _by_ .GlobalEnv :

aflatoxin babyfood salmonella turtle

The following object(s) are masked from package:datasets :

faithful

> data(salmonella)
> salmonella
colonies dose
1        15    0
2        21    0
3        29    0
4        16   10
5        18   10
6        21   10
7        16   33
8        26   33
9        33   33
10       27  100
11       41  100
12       60  100
13       33  333
14       38  333
15       41  333
16       20 1000
17       27 1000
18       42 1000
> modl <- lm(colonies~dose, salmonella)
> plot(predict(modl),residuals(modl),xlab="Fitted",ylab="Residuals")
> modt <- lm(sqrt(colonies)~dose, salmonella)
> plot(predict(modt),residuals(modt),xlab="Fitted",ylab="Residuals")
> modp <- glm(colonies~dose, family=poisson,salmonella)

```

```> summary(modp)
Call:
glm(formula = colonies ~ dose, family = poisson, data = salmonella)

Deviance Residuals:
Min       1Q   Median       3Q      Max
-2.6482  -1.8225  -0.2993   1.2917   5.1861

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 3.3219950  0.0540292  61.485   <2e-16 ***
dose        0.0001901  0.0001172   1.622    0.105
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

Null deviance: 78.358  on 17  degrees of freedom
Residual deviance: 75.806  on 16  degrees of freedom
AIC: 172.34

Number of Fisher Scoring iterations: 4

> plot(log(fitted(modp)),log((salmonella\$colonies-fitted(modp))^2),xlab=expression(hat(mu)),ylab=expression((y-hat(mu))^2))
> abline(0,1)
```

### Question 4

```> library(faraway)

Attaching package: 'faraway'

The following object(s) are masked _by_ .GlobalEnv :

aflatoxin babyfood salmonella turtle

The following object(s) are masked from package:datasets :

faithful

> data(africa)
miltcoup oligarchy pollib parties pctvote popn size numelec numregim
Angola          0         0      2      38      NA  9.7 1247       0        1
Benin           5         7      1      34   45.68  4.6  113       8        3
Botswana        0         0     NA       7   20.30  1.2  582       5        1
Burkina         6        13      2      62   17.50  8.8  274       5        3
Burundi         2        13      2      10   34.39  5.3   28       3        3
Cameroon        0         0      2      34   30.30 11.6  475      14        3
> help(africa)
> modp <- glm (miltcoup~.,family=poisson,africa)
> summary(modp)

Call:
glm(formula = miltcoup ~ ., family = poisson, data = africa)

Deviance Residuals:
Min       1Q   Median       3Q      Max
-1.3443  -0.9542  -0.2587   0.3905   1.6953

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.5102693  0.9053301  -0.564  0.57301
oligarchy    0.0730814  0.0345958   2.112  0.03465 *
pollib      -0.7129779  0.2725635  -2.616  0.00890 **
parties      0.0307739  0.0111873   2.751  0.00595 **
pctvote      0.0138722  0.0097526   1.422  0.15491
popn         0.0093429  0.0065950   1.417  0.15658
size        -0.0001900  0.0002485  -0.765  0.44447
numelec     -0.0160783  0.0654842  -0.246  0.80605
numregim     0.1917349  0.2292890   0.836  0.40303
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

Null deviance: 65.945  on 35  degrees of freedom
Residual deviance: 28.668  on 27  degrees of freedom
AIC: 111.48

Number of Fisher Scoring iterations: 6

> pchisq(deviance(modp),df.residual(modp),lower=FALSE)
[1] 0.3771631
> pchisq(65.945,36,lower=FALSE)
[1] 0.001695765
> halfnorm(residuals(modp))
> drop1 (modp,test="F")
Single term deletions

Model:
miltcoup ~ oligarchy + pollib + parties + pctvote + popn + size +
numelec + numregim
Df Deviance     AIC F value   Pr(F)
<none>         28.668 111.477
oligarchy  1   33.032 113.840  4.1093 0.05263 .
pollib     1   35.581 116.390  6.5102 0.01670 *
parties    1   35.501 116.310  6.4350 0.01728 *
pctvote    1   30.668 111.477  1.8832 0.18126
popn       1   30.628 111.437  1.8454 0.18557
size       1   29.281 110.090  0.5772 0.45399
numelec    1   28.728 109.537  0.0566 0.81375
numregim   1   29.363 110.172  0.6541 0.42573
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Warning message:
F test assumes quasipoisson family in: drop1.glm(modp, test = "F")
```

## Chapter 4

### Question 2

```> library(faraway)

Attaching package: 'faraway'

The following object(s) are masked _by_ .GlobalEnv :

aflatoxin babyfood salmonella turtle

The following object(s) are masked from package:datasets :

faithful

> data(melanoma)
> melanoma
count         tumor      site
5      2       freckle     trunk
6     54   superficial     trunk
7     33       nodular     trunk
8     17 indeterminate     trunk
9     10       freckle extremity
10   115   superficial extremity
11    73       nodular extremity
12    28 indeterminate extremity
> (ct <- xtabs(count~tumor+site,melanoma))
site
freckle              10   22     2
indeterminate        28   11    17
nodular              73   19    33
superficial         115   16    54
> summary(ct)
Call: xtabs(formula = count ~ tumor + site, data = melanoma)
Number of cases in table: 400
Number of factors: 2
Test for independence of all factors:
Chisq = 65.81, df = 6, p-value = 2.943e-12
> dotchart(ct)
> mosaicplot(ct,color=TRUE,main=NULL,las=1)
```

### Question 4

```PART A
> (ct <- xtabs(y~defend+penalty,death))
penalty
defend  no yes
b 149  17
w 141  19
> prop.table(ct,1)
penalty
defend        no       yes
b 0.8975904 0.1024096
w 0.8812500 0.1187500
> (cta <- xtabs(y~defend+penalty,death, subset=(victim=="w")))
penalty
defend  no yes
b  52  11
w 132  19
> prop.table(cta,1)
penalty
defend        no       yes
b 0.8253968 0.1746032
w 0.8741722 0.1258278
> (ctb <- xtabs(y~defend+penalty,death, subset=(victim=="b")))
penalty
defend no yes
b 97   6
w  9   0
> prop.table(ctb,1)
penalty
defend         no        yes
b 0.94174757 0.05825243
w 1.00000000 0.00000000
> summary(ct)
Call: xtabs(formula = y ~ defend + penalty, data = death)
Number of cases in table: 326
Number of factors: 2
Test for independence of all factors:
Chisq = 0.22145, df = 1, p-value = 0.638

> fisher.test(cta)

Fisher's Exact Test for Count Data

data:  cta
p-value = 0.3893
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
0.2849356 1.7014713
sample estimates:
odds ratio
0.6817309

> fisher.test(ctb)

Fisher's Exact Test for Count Data

data:  ctb
p-value = 1
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
0.00000 10.72162
sample estimates:
odds ratio
0
PART B:
> ct3 <-xtabs(y~ defend+penalty+victim,death)
> apply(ct3,3,function(x) (x[1,1]*x[2,2])/(x[1,2]*x[2,1]))
b         w
0.0000000 0.6804408
> modsat <- glm(y~defend*penalty*victim,death,family=poisson)
> drop1 (modsat,test="Chi")
Single term deletions

Model:
y ~ defend * penalty * victim
Df  Deviance    AIC    LRT Pr(Chi)
<none>                   4.122e-10 51.682
defend:penalty:victim  1     0.701 50.382  0.701  0.4025
> modu <- glm(y~(defend+penalty+victim)^2,death,family=poisson)
> drop1 (modu,test="Chi")
Single term deletions

Model:
y ~ (defend + penalty + victim)^2
Df Deviance     AIC     LRT   Pr(Chi)
<none>               0.701  50.382
defend:penalty  1    1.882  49.563   1.181  0.277121
defend:victim   1  131.458 179.140 130.757 < 2.2e-16 ***
penalty:victim  1    7.910  55.592   7.209  0.007252 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

PARTC:
> ybin <- matrix(death\$y,ncol=2)
> modbin <- glm(ybin~victim*penalty,death[1:4,],family=binomial)
> drop1(modbin,test="Chi")
Single term deletions

Model:
ybin ~ victim * penalty
Df  Deviance     AIC     LRT Pr(Chi)
<none>            2.288e-10 21.2230
victim:penalty  1    0.7007 19.9237  0.7007  0.4025
> modbinr <- glm(ybin~victim+penalty,death[1:4,],family=binomial)
> drop1(modbinr,test="Chi")
Single term deletions

Model:
ybin ~ victim + penalty
Df Deviance     AIC     LRT Pr(Chi)
<none>        0.701  19.924
victim   1  131.458 148.681 130.757  <2e-16 ***
penalty  1    1.882  19.105   1.181  0.2771
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Relationship between binomial model and uniform model
> ctf <- xtabs(fitted(modu) ~defend+penalty+victim, death)
> apply(ctf,3,function(x) (x[1,1]*x[2,2])/(x[1,2]*x[2,1]))
b         w
0.6438933 0.6438933
> deviance(modu)
[1] 0.7007402
> deviance(modbinr)
[1] 0.7007402
> exp(-coef(modbinr)[2])
victimw
0.6438933
```

### Question 6

```> library(faraway)
> data(suicide)
> suicide
y cause age sex
1  398  drug   y   m
2  121   gas   y   m
3  455  hang   y   m
4  155   gun   y   m
5   55  jump   y   m
6  124 other   y   m
7  399  drug   m   m
8   82   gas   m   m
9  797  hang   m   m
10 168   gun   m   m
11  51  jump   m   m
12  82 other   m   m
13  93  drug   o   m
14   6   gas   o   m
15 316  hang   o   m
16  33   gun   o   m
17  26  jump   o   m
18  14 other   o   m
19 259  drug   y   f
20  15   gas   y   f
21  95  hang   y   f
22  14   gun   y   f
23  40  jump   y   f
24  38 other   y   f
25 450  drug   m   f
26  13   gas   m   f
27 450  hang   m   f
28  26   gun   m   f
29  71  jump   m   f
30  60 other   m   f
31 154  drug   o   f
32   5   gas   o   f
33 185  hang   o   f
34   7   gun   o   f
35  38  jump   o   f
36  10 other   o   f
> modu <- glm(y~(age+cause+sex)^2,suicide,family=poisson)
> drop1(modu,test="Chi")
Single term deletions

Model:
y ~ (age + cause + sex)^2
Df Deviance    AIC    LRT   Pr(Chi)
<none>          14.90 285.00
age:cause 10   293.18 543.27 278.28 < 2.2e-16 ***
age:sex    2   154.70 420.80 139.80 < 2.2e-16 ***
cause:sex  5   389.00 649.09 374.10 < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> modsat <- glm(y~age*cause*sex,suicide,family=poisson)
> drop1(modsat,test="Chi")
Single term deletions

Model:
y ~ age * cause * sex
Df  Deviance     AIC     LRT Pr(Chi)
<none>           1.237e-13 290.095
age:cause:sex 10    14.901 284.995  14.901  0.1357
```

### Question 8

```> library(faraway)
> data(HairEyeColor)
> HairEyeColor
, , Sex = Male

Eye
Hair    Brown Blue Hazel Green
Black    32   11    10     3
Brown    38   50    25    15
Red      10   10     7     7
Blond     3   30     5     8

, , Sex = Female

Eye
Hair    Brown Blue Hazel Green
Black    36    9     5     2
Brown    81   34    29    14
Red      16    7     7     7
Blond     4   64     5     8
> y <- c(32,11,10,3,38,50,25,15,10,10,7,7,3,30,5,8)
> eyec <- gl(4,1,4,labels=c("Brown","Blue","Hazel","Green"))
> hairc <- gl(4,4,labels=c("Black","Brown","Red","Blond"))
> haireyec <- data.frame(y,eyec,hairc)
> haireyec
y  eyec hairc
1  32 Brown Black
2  11  Blue Black
3  10 Hazel Black
4   3 Green Black
5  38 Brown Brown
6  50  Blue Brown
7  25 Hazel Brown
8  15 Green Brown
9  10 Brown   Red
10 10  Blue   Red
11  7 Hazel   Red
12  7 Green   Red
13  3 Brown Blond
14 30  Blue Blond
15  5 Hazel Blond
16  8 Green Blond
> z <- c(36,9,5,2,81,34,29,14,16,7,7,7,4,64,5,8)
> haireyec2 <- data.frame(z,eyec,hairc)
> haireyec2
z  eyec hairc
1  36 Brown Black
2   9  Blue Black
3   5 Hazel Black
4   2 Green Black
5  81 Brown Brown
6  34  Blue Brown
7  29 Hazel Brown
8  14 Green Brown
9  16 Brown   Red
10  7  Blue   Red
11  7 Hazel   Red
12  7 Green   Red
13  4 Brown Blond
14 64  Blue Blond
15  5 Hazel Blond
16  8 Green Blond
> modl <- glm(y~haireyec\$eyec+haireyec\$hairc,poisson)
> summary(modl)

Call:
glm(formula = y ~ haireyec\$eyec + haireyec\$hairc, family = poisson)

Deviance Residuals:
Min       1Q   Median       3Q      Max
-3.6724  -0.9527  -0.1018   0.5635   3.0744

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept)           2.8682     0.1616  17.748  < 2e-16 ***
haireyec\$eyecBlue     0.1963     0.1482   1.325  0.18522
haireyec\$eyecHazel   -0.5687     0.1826  -3.115  0.00184 **
haireyec\$eyecGreen   -0.9223     0.2058  -4.482 7.40e-06 ***
haireyec\$haircBrown   0.8267     0.1602   5.160 2.47e-07 ***
haireyec\$haircRed    -0.4990     0.2174  -2.295  0.02173 *
haireyec\$haircBlond  -0.1967     0.1990  -0.989  0.32288
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

Null deviance: 163.492  on 15  degrees of freedom
Residual deviance:  44.315  on  9  degrees of freedom
AIC: 127.46

Number of Fisher Scoring iterations: 5

> drop1(modl,test="Chi")
Single term deletions

Model:
y ~ haireyec\$eyec + haireyec\$hairc
Df Deviance     AIC     LRT   Pr(Chi)
<none>              44.315 127.462
haireyec\$eyec   3   90.643 167.790  46.328 4.831e-10 ***
haireyec\$hairc  3  117.164 194.311  72.849 1.047e-15 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> modl2 <- glm(z~haireyec2\$eyec+haireyec2\$hairc,poisson)
> summary(modl2)

Call:
glm(formula = z ~ haireyec2\$eyec + haireyec2\$hairc, family = poisson)

Deviance Residuals:
Min       1Q   Median       3Q      Max
-6.5256  -1.8738  -0.0602   1.4517   5.7814

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept)            3.0782     0.1532  20.088  < 2e-16 ***
haireyec2\$eyecBlue    -0.1838     0.1268  -1.450   0.1471
haireyec2\$eyecHazel   -1.0913     0.1704  -6.404 1.51e-10 ***
haireyec2\$eyecGreen   -1.4860     0.1989  -7.471 7.93e-14 ***
haireyec2\$haircBrown   1.1114     0.1599   6.951 3.62e-12 ***
haireyec2\$haircRed    -0.3403     0.2151  -1.582   0.1136
haireyec2\$haircBlond   0.4432     0.1777   2.494   0.0126 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

Null deviance: 319.17  on 15  degrees of freedom
Residual deviance: 117.89  on  9  degrees of freedom
AIC: 201.50

Number of Fisher Scoring iterations: 5

> drop1(modl2,test="Chi")
Single term deletions

Model:
z ~ haireyec2\$eyec + haireyec2\$hairc
Df Deviance    AIC    LRT   Pr(Chi)
<none>               117.89 201.50
haireyec2\$eyec   3   220.16 297.76 102.26 < 2.2e-16 ***
haireyec2\$hairc  3   216.90 294.51  99.01 < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
```

## Chapter 5

### Question 2

```> library(faraway)

Attaching package: 'faraway'

The following object(s) are masked _by_ .GlobalEnv :

aflatoxin babyfood salmonella turtle

The following object(s) are masked from package:datasets :

faithful

> data(happy)
> happy
happy money sex love work
1     10    36   0    3    4
2      8    47   1    3    1
3      8    53   0    3    5
4      8    35   1    3    3
5      4    88   1    1    2
6      9   175   1    3    4
7      8   175   1    3    4
8      6    45   0    2    3
9      5    35   1    2    2
10     4    55   1    1    4
11     7    40   0    2    3
12     8    45   1    3    4
13     8    45   1    3    3
14     8    45   1    3    4
15     8    62   0    3    4
16     4    44   1    2    3
17     8    85   1    3    4
18     8    81   0    2    4
19     6   112   1    3    2
20     5    40   1    2    4
21     7    40   1    3    3
22     9    44   1    3    4
23     4    35   1    2    2
24     5    56   1    2    3
25     9   115   1    3    4
26     7    44   1    2    4
27     7    50   1    3    3
28     7    45   0    2    4
29     8    41   0    3    5
30     8    50   0    3    5
31     3    85   1    1    1
32     7    90   1    2    2
33     7    85   1    2    4
34     8    75   1    3    4
35     5    70   0    2    3
36     2     0   0    2    2
37     5    31   0    2    4
38     7    60   1    3    4
39     8    65   1    3    3
> library(nnet)
> mmod <- multinom(happy~money+love+sex+work,happy)
# weights:  54 (40 variable)
initial  value 85.691759
iter  10 value 68.212870
iter  20 value 38.631288
iter  30 value 28.889527
iter  40 value 27.437462
iter  50 value 26.714973
iter  60 value 26.708293
iter  70 value 26.703682
final  value 26.703644
converged
> mmodi <- step(mmod)
Start:  AIC= 133.41
happy ~ money + love + sex + work

trying - money
# weights:  45 (32 variable)
initial  value 85.691759
iter  10 value 54.761294
iter  20 value 35.659486
iter  30 value 32.182758
iter  40 value 32.048055
iter  50 value 32.046211
final  value 32.046201
converged
trying - love
# weights:  45 (32 variable)
initial  value 85.691759
iter  10 value 72.252617
iter  20 value 52.501144
iter  30 value 48.215674
iter  40 value 47.713894
iter  50 value 47.513363
iter  60 value 47.451456
final  value 47.449857
converged
trying - sex
# weights:  45 (32 variable)
initial  value 85.691759
iter  10 value 68.317736
iter  20 value 44.624172
iter  30 value 35.061199
iter  40 value 33.728123
iter  50 value 33.347594
iter  60 value 32.556498
iter  70 value 32.504218
iter  80 value 32.500711
iter  90 value 32.500380
iter 100 value 32.500305
final  value 32.500305
stopped after 100 iterations
trying - work
# weights:  45 (32 variable)
initial  value 85.691759
iter  10 value 67.265742
iter  20 value 39.099419
iter  30 value 33.357361
iter  40 value 32.907973
iter  50 value 32.331249
iter  60 value 32.137494
iter  70 value 32.135654
final  value 32.135653
converged
Df      AIC
- money 32 128.0924
- work  32 128.2713
- sex   32 129.0006
<none>  40 133.4073
- love  32 158.8997
# weights:  45 (32 variable)
initial  value 85.691759
iter  10 value 54.761294
iter  20 value 35.659486
iter  30 value 32.182758
iter  40 value 32.048055
iter  50 value 32.046211
final  value 32.046201
converged

Step:  AIC= 128.09
happy ~ love + sex + work

trying - love
# weights:  36 (24 variable)
initial  value 85.691759
iter  10 value 63.720236
iter  20 value 56.392984
iter  30 value 53.258146
iter  40 value 52.966959
iter  50 value 52.953641
iter  50 value 52.953641
final  value 52.953641
converged
trying - sex
# weights:  36 (24 variable)
initial  value 85.691759
iter  10 value 62.292328
iter  20 value 44.946954
iter  30 value 42.644163
iter  40 value 42.341546
iter  50 value 42.334514
iter  50 value 42.334514
final  value 42.334514
converged
trying - work
# weights:  36 (24 variable)
initial  value 85.691759
iter  10 value 54.196269
iter  20 value 42.259689
iter  30 value 41.914108
iter  40 value 41.912629
final  value 41.912626
converged
Df      AIC
<none> 32 128.0924
- work 24 131.8253
- sex  24 132.6690
- love 24 153.9073
> summary(mmodi)
Call:
multinom(formula = happy ~ love + sex + work, data = happy)

Coefficients:
(Intercept)        love         sex       work
3    338.98835 -253.994745  312.388992 -69.489730
4    153.55986 -192.146302  272.587810  33.719591
5    177.40939 -132.837421  128.392023  34.497620
6    -99.42055    7.088609   61.722456  33.388798
7     26.27725  -58.048294  128.552862  34.937105
8   -254.88030   81.920387   -9.583918  35.083241
9   -427.13017   23.257952  101.054127  94.377886
10  -372.68526  173.908506 -177.388421  -4.456565

Std. Errors:
(Intercept)      love          sex      work
3          NaN       NaN          NaN       NaN
4   0.43871634 0.8774327 4.387163e-01 0.9704548
5   0.50533690 1.0106738 1.135253e+00 0.6832557
6   1.40325890 0.9159532 2.124148e+00 0.9712213
7   1.00942684 0.5660942 1.115373e+00 0.5051437
8   0.99274531 0.7657262 1.409613e+00 0.6676865
9   0.03373516 0.1012055 3.373516e-02 0.1349406
10  0.05908112 0.1772434 8.706144e-22 0.2363245

Residual Deviance: 64.0924
AIC: 128.0924

Correlation of Coefficients:
3:(Intercept) 3:love     3:sex      3:work     4:(Intercept)
3:love                NaN
3:sex                 NaN           NaN
3:work                NaN           NaN        NaN
4:(Intercept)         NaN           NaN        NaN        NaN
4:love                NaN           NaN        NaN        NaN  1.0000000
4:sex                 NaN           NaN        NaN        NaN  1.0000000
4:work                NaN           NaN        NaN        NaN -0.9402609
5:(Intercept)         NaN           NaN        NaN        NaN  0.1406546
5:love                NaN           NaN        NaN        NaN  0.1406546
5:sex                 NaN           NaN        NaN        NaN -0.1793527
5:work                NaN           NaN        NaN        NaN -0.0392331
6:(Intercept)         NaN           NaN        NaN        NaN -0.1495422
6:love                NaN           NaN        NaN        NaN -0.4729412
6:sex                 NaN           NaN        NaN        NaN -0.0063551
6:work                NaN           NaN        NaN        NaN  0.4123940
7:(Intercept)         NaN           NaN        NaN        NaN -0.0941749
7:love                NaN           NaN        NaN        NaN -0.3122561
7:sex                 NaN           NaN        NaN        NaN -0.1988096
7:work                NaN           NaN        NaN        NaN  0.3877128
8:(Intercept)         NaN           NaN        NaN        NaN -0.2041552
8:love                NaN           NaN        NaN        NaN -0.5262937
8:sex                 NaN           NaN        NaN        NaN  0.0007502
8:work                NaN           NaN        NaN        NaN  0.5268267
9:(Intercept)         NaN           NaN        NaN        NaN -0.0271885
9:love                NaN           NaN        NaN        NaN -0.0271885
9:sex                 NaN           NaN        NaN        NaN -0.0271885
9:work                NaN           NaN        NaN        NaN -0.0271885
10:(Intercept)        NaN           NaN        NaN        NaN -0.0218996
10:love               NaN           NaN        NaN        NaN -0.0218996
10:sex                NaN           NaN        NaN        NaN  0.5749721
10:work               NaN           NaN        NaN        NaN -0.0218996
4:love     4:sex      4:work     5:(Intercept) 5:love
3:love
3:sex
3:work
4:(Intercept)
4:love
4:sex           1.0000000
4:work         -0.9402609 -0.9402609
5:(Intercept)   0.1406546  0.1406546 -0.1243675
5:love          0.1406546  0.1406546 -0.1243675  1.0000000
5:sex          -0.1793527 -0.1793527  0.2753187 -0.5402119    -0.5402119
5:work         -0.0392331 -0.0392331  0.0160235 -0.9478378    -0.9478378
6:(Intercept)  -0.1495422 -0.1495422  0.2725884 -0.3786512    -0.3786512
6:love         -0.4729412 -0.4729412  0.5276619 -0.5761459    -0.5761459
6:sex          -0.0063551 -0.0063551 -0.1326228  0.2518509     0.2518509
6:work          0.4123940  0.4123940 -0.5041348  0.5219202     0.5219202
7:(Intercept)  -0.0941749 -0.0941749 -0.0502827  0.2717231     0.2717231
7:love         -0.3122561 -0.3122561  0.2244406 -0.3098767    -0.3098767
7:sex          -0.1988096 -0.1988096  0.2945376 -0.1545839    -0.1545839
7:work          0.3877128  0.3877128 -0.2761089  0.0785171     0.0785171
8:(Intercept)  -0.2041552 -0.2041552  0.1506176 -0.2953422    -0.2953422
8:love         -0.5262937 -0.5262937  0.4676838 -0.4970389    -0.4970389
8:sex           0.0007502  0.0007502  0.0381989  0.1375091     0.1375091
8:work          0.5268267  0.5268267 -0.4921499  0.4326597     0.4326597
9:(Intercept)  -0.0271885 -0.0271885 -0.0208337 -0.1427235    -0.1427235
9:love         -0.0271885 -0.0271885 -0.0208337 -0.1427235    -0.1427235
9:sex          -0.0271885 -0.0271885 -0.0208337 -0.1427235    -0.1427235
9:work         -0.0271885 -0.0271885 -0.0208337 -0.1427235    -0.1427235
10:(Intercept) -0.0218996 -0.0218996 -0.0883920 -0.2025818    -0.2025818
10:love        -0.0218996 -0.0218996 -0.0883920 -0.2025818    -0.2025818
10:sex          0.5749721  0.5749721 -0.6469436  0.5967365     0.5967365
10:work        -0.0218996 -0.0218996 -0.0883920 -0.2025818    -0.2025818
5:sex      5:work     6:(Intercept) 6:love     6:sex
3:love
3:sex
3:work
4:(Intercept)
4:love
4:sex
4:work
5:(Intercept)
5:love
5:sex
5:work          0.3604973
6:(Intercept)   0.5992415  0.3330443
6:love          0.5185856  0.5310782  0.8175995
6:sex          -0.5681249 -0.2110267 -0.9686870    -0.6490386
6:work         -0.5620114 -0.4735635 -0.8484082    -0.9538571  0.7096417
7:(Intercept)  -0.6669242 -0.2000188 -0.6906598    -0.4801911  0.7054670
7:love          0.0662298  0.2474952 -0.1713999    -0.0594167  0.2008400
7:sex           0.2934861  0.1361615  0.6120182     0.4971981 -0.5942286
7:work          0.2589008 -0.0470194  0.3877549     0.2094353 -0.4220083
8:(Intercept)   0.1774826  0.2222817 -0.4534443    -0.1692305  0.5261367
8:love          0.2186918  0.4382747 -0.1841473     0.1342479  0.3011924
8:sex          -0.1267352 -0.0711531  0.5289074     0.3064848 -0.5666565
8:work         -0.1941596 -0.3825590  0.1976275    -0.1056329 -0.3066640
9:(Intercept)   0.0448968  0.1374251  0.4377514     0.3210100 -0.4399537
9:love          0.0448968  0.1374251  0.4377514     0.3210100 -0.4399537
9:sex           0.0448968  0.1374251  0.4377514     0.3210100 -0.4399537
9:work          0.0448968  0.1374251  0.4377514     0.3210100 -0.4399537
10:(Intercept)  0.1063692  0.0920999 -0.2327023    -0.1147201  0.2579881
10:love         0.1063692  0.0920999 -0.2327023    -0.1147201  0.2579881
10:sex         -0.6231676 -0.5184768 -0.8495405    -0.9286233  0.7220188
10:work         0.1063692  0.0920999 -0.2327023    -0.1147201  0.2579881
6:work     7:(Intercept) 7:love     7:sex      7:work
3:love
3:sex
3:work
4:(Intercept)
4:love
4:sex
4:work
5:(Intercept)
5:love
5:sex
5:work
6:(Intercept)
6:love
6:sex
6:work
7:(Intercept)   0.5596771
7:love          0.1095642  0.5691351
7:sex          -0.5476029 -0.8053144    -0.4671998
7:work         -0.2779616 -0.7081589    -0.8860878  0.4160458
8:(Intercept)   0.1916586 -0.1293000    -0.0412432  0.1127593 -0.0355601
8:love         -0.0647897 -0.0661922     0.0964546  0.1531844 -0.1286341
8:sex          -0.3031374  0.1415375     0.1085941 -0.0723718 -0.0224633
8:work          0.0290738  0.0466636    -0.0717501 -0.1304778  0.1197054
9:(Intercept)  -0.3615045 -0.0403146     0.0953439  0.0863584  0.0003785
9:love         -0.3615045 -0.0403146     0.0953439  0.0863584  0.0003785
9:sex          -0.3615045 -0.0403146     0.0953439  0.0863584  0.0003785
9:work         -0.3615045 -0.0403146     0.0953439  0.0863584  0.0003785
10:(Intercept)  0.0480883 -0.1104543    -0.0451940  0.0773362 -0.0638154
10:love         0.0480883 -0.1104543    -0.0451940  0.0773362 -0.0638154
10:sex          0.9141239  0.4960771    -0.0921599 -0.5365694 -0.0864402
10:work         0.0480883 -0.1104543    -0.0451940  0.0773362 -0.0638154
8:(Intercept) 8:love     8:sex      8:work     9:(Intercept)
3:love
3:sex
3:work
4:(Intercept)
4:love
4:sex
4:work
5:(Intercept)
5:love
5:sex
5:work
6:(Intercept)
6:love
6:sex
6:work
7:(Intercept)
7:love
7:sex
7:work
8:(Intercept)
8:love          0.8061907
8:sex          -0.9493575    -0.5806201
8:work         -0.7754689    -0.9783039  0.5495971
9:(Intercept)  -0.5055819    -0.2834510  0.5430064  0.3410348
9:love         -0.5055819    -0.2834510  0.5430064  0.3410348  1.0000000
9:sex          -0.5055819    -0.2834510  0.5430064  0.3410348  1.0000000
9:work         -0.5055819    -0.2834510  0.5430064  0.3410348  1.0000000
10:(Intercept)  0.5068052     0.2794165 -0.5201564 -0.2683437 -0.3614255
10:love         0.5068052     0.2794165 -0.5201564 -0.2683437 -0.3614255
10:sex          0.1426099    -0.2518397 -0.3323966  0.2305795 -0.3393056
10:work         0.5068052     0.2794165 -0.5201564 -0.2683437 -0.3614255
9:love     9:sex      9:work     10:(Intercept) 10:love
3:love
3:sex
3:work
4:(Intercept)
4:love
4:sex
4:work
5:(Intercept)
5:love
5:sex
5:work
6:(Intercept)
6:love
6:sex
6:work
7:(Intercept)
7:love
7:sex
7:work
8:(Intercept)
8:love
8:sex
8:work
9:(Intercept)
9:love
9:sex           1.0000000
9:work          1.0000000  1.0000000
10:(Intercept) -0.3614255 -0.3614255 -0.3614255
10:love        -0.3614255 -0.3614255 -0.3614255  1.0000000
10:sex         -0.3393056 -0.3393056 -0.3393056  0.1259370      0.1259370
10:work        -0.3614255 -0.3614255 -0.3614255  1.0000000      1.0000000
10:sex
3:love
3:sex
3:work
4:(Intercept)
4:love
4:sex
4:work
5:(Intercept)
5:love
5:sex
5:work
6:(Intercept)
6:love
6:sex
6:work
7:(Intercept)
7:love
7:sex
7:work
8:(Intercept)
8:love
8:sex
8:work
9:(Intercept)
9:love
9:sex
9:work
10:(Intercept)
10:love
10:sex
10:work         0.1259370
Warning message:
NaNs produced in: sqrt(diag(vc))
> predict(mmodi,data.frame(money=30,sex=0,work=1,love=1),type="probs")
2             3             4             5             6
4.576279e-35  2.476061e-28  3.522616e-37  1.000000e+00  1.152999e-60
7             8             9            10
1.085211e-33  6.045559e-95 1.773500e-169 2.496925e-123
```

### Question 4

```> library(faraway)

Attaching package: 'faraway'

The following object(s) are masked _by_ .GlobalEnv :

aflatoxin babyfood salmonella turtle

The following object(s) are masked from package:datasets :

faithful

> data(pneumo)
> pneumo
Freq status year
1    98 normal  5.8
2    51 normal 15.0
3    34 normal 21.5
4    35 normal 27.5
5    32 normal 33.5
6    23 normal 39.5
7    12 normal 46.0
8     4 normal 51.5
9     0   mild  5.8
10    2   mild 15.0
11    6   mild 21.5
12    5   mild 27.5
13   10   mild 33.5
14    7   mild 39.5
15    6   mild 46.0
16    2   mild 51.5
17    0 severe  5.8
18    1 severe 15.0
19    3 severe 21.5
20    8 severe 27.5
21    9 severe 33.5
22    8 severe 39.5
23   10 severe 46.0
24    5 severe 51.5
> help(pneumo)
> library(nnet)
> mmod <- multinom(status ~ year,pneumo)
# weights:  9 (4 variable)
initial  value 26.366695
final  value 26.366695
converged
> mmodi <- step(mmod)
Start:  AIC= 60.73
status ~ year

trying - year
# weights:  6 (2 variable)
initial  value 26.366695
final  value 26.366695
converged
Df      AIC
- year  2 56.73339
<none>  4 60.73339
# weights:  6 (2 variable)
initial  value 26.366695
final  value 26.366695
converged

Step:  AIC= 56.73
status ~ 1

> predict(mmod,data.frame(year=25),type="probs")
mild    normal    severe
0.3333333 0.3333333 0.3333333
> predict(mmodi,data.frame(year=25),type="probs")
mild    normal    severe
0.3333333 0.3333333 0.3333333
```

## Chapter 6

### Question 2

```> library(faraway)
> data(orings)
> logitmod <- glm(cbind(damage,6-damage) ~ temp, family=binomial,orings)
> summary(logitmod)

Call:
glm(formula = cbind(damage, 6 - damage) ~ temp, family = binomial,
data = orings)

Deviance Residuals:
Min       1Q   Median       3Q      Max
-0.9529  -0.7345  -0.4393  -0.2079   1.9565

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 11.66299    3.29626   3.538 0.000403 ***
temp        -0.21623    0.05318  -4.066 4.78e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 38.898  on 22  degrees of freedom
Residual deviance: 16.912  on 21  degrees of freedom
AIC: 33.675

Number of Fisher Scoring iterations: 6
```

## Chapter 7

### Question 2

```> library(faraway)
> data(rats)
> summary(rats)
time        poison   treat
Min.   :0.1800   I  :16   A:12
1st Qu.:0.3000   II :16   B:12
Median :0.4000   III:16   C:12
Mean   :0.4794            D:12
3rd Qu.:0.6225
Max.   :1.2400
> llmdl<- lm(log(time)~.^2,rats)
> rlmdl<- step(llmdl)
Start:  AIC= -129.85
log(time) ~ (poison + treat)^2

Df Sum of Sq      RSS      AIC
- poison:treat  6     0.396    2.342 -132.964
<none>                         1.947 -129.848

Step:  AIC= -132.96
log(time) ~ poison + treat

Df Sum of Sq      RSS      AIC
<none>                   2.342 -132.964
- treat   3     3.557    5.899  -94.625
- poison  2     5.237    7.580  -80.595
> rgmdl <- step(gmdl)
Start:  AIC= -74.11
time ~ (poison + treat)^2

Df Deviance     AIC
- poison:treat  6    2.404 -76.840
<none>               1.920 -74.113

Step:  AIC= -75.26
time ~ poison + treat

Df Deviance     AIC
<none>         2.404 -75.262
- treat   3    6.105 -20.864
- poison  2    7.501   3.923
> summary(rgmdl)

Call:
glm(formula = time ~ poison + treat, family = Gamma(link = log),
data = rats)

Deviance Residuals:
Min        1Q    Median        3Q       Max
-0.42216  -0.16456  -0.02846   0.10147   0.58524

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.87536    0.08752 -10.002 1.12e-12 ***
poisonII    -0.16096    0.08752  -1.839   0.0730 .
poisonIII   -0.78792    0.08752  -9.003 2.34e-11 ***
treatB       0.72218    0.10106   7.146 8.99e-09 ***
treatC       0.19855    0.10106   1.965   0.0561 .
treatD       0.52281    0.10106   5.173 6.05e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for Gamma family taken to be 0.06127708)

Null deviance: 11.5710  on 47  degrees of freedom
Residual deviance:  2.4036  on 42  degrees of freedom
AIC: -75.262

Number of Fisher Scoring iterations: 5
```

### Question 4

```> data(happy)
> happy\$phappy <- (happy\$happy/10)
> summary(happy\$phappy)
Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
0.2000  0.5000  0.7000  0.6744  0.8000  1.0000
> modl <- glm(phappy~log(money)+sex+love+work,family=quasibinomial,happy)
Error: NA/NaN/Inf in foreign function call (arg 1)???
```

### Question 6

```> library(faraway)

Attaching package: 'faraway'

The following object(s) are masked _by_ .GlobalEnv :

aflatoxin babyfood salmonella turtle

The following object(s) are masked from package:datasets :

faithful

> data(fruitfly)
> help(fruitfly)
> fruitfly
thorax longevity activity
1     0.68        37      low
2     0.68        49      low
3     0.72        46      low
4     0.72        63      low
5     0.76        39      low
6     0.76        46      low
7     0.76        56      low
8     0.76        63      low
9     0.76        65      low
10    0.80        56      low
11    0.80        65      low
12    0.80        70      low
13    0.84        63      low
14    0.84        65      low
15    0.84        70      low
16    0.84        77      low
17    0.84        81      low
18    0.84        86      low
19    0.88        70      low
20    0.88        70      low
21    0.92        77      low
22    0.92        77      low
23    0.92        81      low
24    0.94        77      low
25    0.64        40     many
26    0.70        37     many
27    0.72        44     many
28    0.72        47     many
29    0.72        47     many
30    0.76        47     many
31    0.78        68     many
32    0.80        47     many
33    0.84        54     many
34    0.84        61     many
35    0.84        71     many
36    0.84        75     many
37    0.84        89     many
38    0.88        58     many
39    0.88        59     many
40    0.88        62     many
41    0.88        79     many
42    0.88        96     many
43    0.92        58     many
44    0.92        62     many
45    0.92        70     many
46    0.92        72     many
47    0.92        75     many
48    0.92        96     many
49    0.94        75     many
50    0.64        46 isolated
51    0.68        42 isolated
52    0.72        65 isolated
53    0.76        46 isolated
54    0.76        58 isolated
55    0.80        42 isolated
56    0.80        48 isolated
57    0.80        58 isolated
58    0.82        50 isolated
59    0.82        80 isolated
60    0.84        63 isolated
61    0.84        65 isolated
62    0.84        70 isolated
63    0.84        70 isolated
64    0.84        72 isolated
65    0.84        97 isolated
66    0.88        46 isolated
67    0.88        56 isolated
68    0.88        70 isolated
69    0.88        70 isolated
70    0.88        72 isolated
71    0.88        76 isolated
72    0.88        90 isolated
73    0.92        76 isolated
74    0.92        92 isolated
75    0.68        21      one
76    0.68        40      one
77    0.72        44      one
78    0.76        54      one
79    0.78        36      one
80    0.80        40      one
81    0.80        56      one
82    0.80        60      one
83    0.84        48      one
84    0.84        53      one
85    0.84        60      one
86    0.84        60      one
87    0.84        65      one
88    0.84        68      one
89    0.88        60      one
90    0.88        81      one
91    0.88        81      one
92    0.90        48      one
93    0.90        48      one
94    0.90        56      one
95    0.90        68      one
96    0.90        75      one
97    0.90        81      one
98    0.92        48      one
99    0.92        68      one
100   0.64        16     high
101   0.64        19     high
102   0.68        19     high
103   0.72        32     high
104   0.72        33     high
105   0.74        33     high
106   0.76        30     high
107   0.76        42     high
108   0.76        42     high
109   0.78        33     high
110   0.80        26     high
111   0.80        30     high
112   0.82        40     high
113   0.82        54     high
114   0.84        34     high
115   0.84        34     high
116   0.84        47     high
117   0.84        47     high
118   0.88        42     high
119   0.88        47     high
120   0.88        54     high
121   0.88        54     high
122   0.88        56     high
123   0.88        60     high
124   0.92        44     high
> llmdl <- lm(log(longevity)~.^2,fruitfly)
> rlmdl <- step(llmdl)
Start:  AIC= -400.87
log(longevity) ~ (thorax + activity)^2

Df Sum of Sq     RSS     AIC
- thorax:activity  4      0.24    4.40 -401.96
<none>                            4.16 -400.87

Step:  AIC= -401.96
log(longevity) ~ thorax + activity

Df Sum of Sq     RSS     AIC
<none>                     4.40 -401.96
- activity  4      4.15    8.56 -327.54
- thorax    1      5.08    9.48 -308.86
> summary(rlmdl)

Call:
lm(formula = log(longevity) ~ thorax + activity, data = fruitfly)

Residuals:
Min        1Q    Median        3Q       Max
-0.526412 -0.136294 -0.008234  0.139176  0.392732

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept)       1.84421    0.19882   9.276 1.04e-15 ***
thorax            2.72146    0.23329  11.666  < 2e-16 ***
activityisolated  0.05174    0.05468   0.946   0.3459
activityone      -0.12387    0.05463  -2.268   0.0252 *
activitylow       0.08791    0.05546   1.585   0.1156
activityhigh     -0.41925    0.05527  -7.586 8.35e-12 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1931 on 118 degrees of freedom
Multiple R-Squared: 0.7025,     Adjusted R-squared: 0.6899
F-statistic: 55.72 on 5 and 118 DF,  p-value: < 2.2e-16

> rgmdl <- step(gmdl)
Start:  AIC= 943.83
longevity ~ (thorax + activity)^2

Df Deviance    AIC
- thorax:activity  4     4.32 942.02
<none>                   4.10 943.83

Step:  AIC= 942.29
longevity ~ thorax + activity

Df Deviance     AIC
<none>            4.32  942.29
- activity  4     8.13 1041.76
- thorax    1     8.87 1068.52
> summary(rgmdl)

Call:
glm(formula = longevity ~ thorax + activity, family = Gamma(link = log),
data = fruitfly)

Deviance Residuals:
Min        1Q    Median        3Q       Max
-0.50718  -0.15216  -0.02833   0.12434   0.39938

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept)       1.88722    0.19405   9.726  < 2e-16 ***
thorax            2.68778    0.22769  11.804  < 2e-16 ***
activityisolated  0.05527    0.05337   1.036   0.3024
activityone      -0.11646    0.05332  -2.184   0.0309 *
activitylow       0.08250    0.05413   1.524   0.1302
activityhigh     -0.41466    0.05394  -7.687 4.93e-12 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for Gamma family taken to be 0.03552979)

Null deviance: 13.2803  on 123  degrees of freedom
Residual deviance:  4.3151  on 118  degrees of freedom
AIC: 942.29

Number of Fisher Scoring iterations: 4
```

## Chapter 8

### Question 8

```> library(faraway)
> library(lme4)
> data(jsp)
> jspr <- jsp[jsp\$year==2,]
> plot(jitter(english)~jitter(raven),data=jspr,xlab="Raven score",ylab="English score")
> boxplot (english~social,data=jspr,xlab="Social Class",ylab="English Score")
> glin <- lm(english~raven*gender*social,jspr)
> anova(glin)
Analysis of Variance Table

Response: english
Df Sum Sq Mean Sq  F value    Pr(>F)
raven                 1 108089  108089 336.8389 < 2.2e-16 ***
gender                1  11612   11612  36.1871 2.587e-09 ***
social                8  26772    3346  10.4287 4.648e-14 ***
raven:gender          1    110     110   0.3422   0.55870
raven:social          8   5795     724   2.2575   0.02166 *
gender:social         8   3631     454   1.4146   0.18611
raven:gender:social   8   3574     447   1.3921   0.19579
Residuals           917 294258     321
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> glin <- lm(english ~raven*social,jspr)
> anova(glin)
Analysis of Variance Table

Response: english
Df Sum Sq Mean Sq  F value    Pr(>F)
raven          1 108089  108089 323.1950 < 2.2e-16 ***
social         8  27735    3467  10.3661 5.597e-14 ***
raven:social   8   5317     665   1.9875   0.04508 *
Residuals    935 312700     334
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> glin <- lm(english~raven+social,jspr)
> summary(glin)

Call:
lm(formula = english ~ raven + social, data = jspr)

Residuals:
Min       1Q   Median       3Q      Max
-49.1804 -13.0315  -0.6695  13.6494  56.0671

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept)   7.9749     4.4815   1.780 0.075477 .
raven         1.6566     0.1062  15.606  < 2e-16 ***
social2       4.1631     3.6823   1.131 0.258513
social3      -2.7269     3.8984  -0.699 0.484418
social4      -9.8314     3.4558  -2.845 0.004539 **
social5      -7.3409     3.8774  -1.893 0.058631 .
social6     -14.6911     4.1112  -3.573 0.000370 ***
social7     -14.9217     4.2081  -3.546 0.000410 ***
social8      -8.2554     5.7871  -1.427 0.154047
social9      -9.2991     3.6755  -2.530 0.011568 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 18.36 on 943 degrees of freedom
Multiple R-Squared: 0.2993,     Adjusted R-squared: 0.2926
F-statistic: 44.75 on 9 and 943 DF,  p-value: < 2.2e-16

> table(jspr\$school)

1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
26 11 14 24 26 18 11 27 21  0 11 23 22 13  7 16  6 18 14 13 28 14 18 21 14 20
27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 44 45 46 47 48 49 50
22 15 13 27 35 23 44 27 16 28 17 12 14 10 10 41  5 11 15 33 63 22 14
> mmod <- lmer(english~raven*social*gender+(1|school)+(1|school:class),data=jspr)
> anova(mmod)
Analysis of Variance Table
Df Sum Sq Mean Sq
raven                1  85815   85815
social               8  17711    2214
gender               1   7923    7923
raven:social         8   5756     719
raven:gender         1    171     171
social:gender        8   2226     278
raven:social:gender  8   3542     443
> jspr\$craven <- jspr\$raven-mean(jspr\$raven)
> mmod <- lmer(english~craven+social+(1|school)+(1|school:class),jspr)
> summary(mmod)
Linear mixed-effects model fit by REML
Formula: english ~ craven + social + (1 | school) + (1 | school:class)
Data: jspr
AIC      BIC    logLik MLdeviance REMLdeviance
8172.728 8231.043 -4074.364       8177     8148.728
Random effects:
Groups       Name        Variance Std.Dev.
school:class (Intercept)  20.658   4.5452
school       (Intercept)  29.205   5.4042
Residual                 289.010  17.0003
number of obs: 953, groups: school:class, 90; school, 48

Fixed effects:
Estimate Std. Error t value
(Intercept)  48.22346    3.36819 14.3173
craven        1.61083    0.10452 15.4117
social2       4.01650    3.56129  1.1278
social3      -2.27485    3.79105 -0.6001
social4      -8.36889    3.38239 -2.4743
social5      -6.62519    3.76736 -1.7586
social6     -11.69028    4.01003 -2.9153
social7     -12.10605    4.13184 -2.9299
social8      -6.62797    5.53847 -1.1967
social9      -7.37251    3.58627 -2.0558

Correlation of Fixed Effects:
(Intr) craven socil2 socil3 socil4 socil5 socil6 socil7 socil8
craven  -0.079
social2 -0.855  0.001
social3 -0.818  0.060  0.763
social4 -0.922  0.088  0.857  0.821
social5 -0.824  0.091  0.765  0.727  0.825
social6 -0.784  0.090  0.721  0.696  0.788  0.698
social7 -0.765  0.093  0.700  0.679  0.766  0.686  0.654
social8 -0.561  0.055  0.523  0.502  0.561  0.501  0.473  0.466
social9 -0.870  0.090  0.803  0.773  0.872  0.776  0.743  0.723  0.525
```

## Chapter 9

### Question 2

```> data(hprice)
narsp   ypc   perypc regtest rcdum ajwtr msa time
1 4.223910 13585 6.465517      20     0     0   1    1
2 4.271095 14296 5.233714      20     0     0   1    2
3 4.326778 15413 7.813374      20     0     0   1    3
4 4.363099 16490 6.987608      20     0     0   1    4
5 4.391977 17634 6.937538      20     0     0   1    5
6 4.454347 18210 3.266417      20     0     0   1    6
> library(lattice)

Attaching package: 'lattice'

The following object(s) are masked from package:faraway :

melanoma

> lmod <- lm(narsp~time,subset=(msa==1),hprice)
> coef(lmod)
(Intercept)        time
4.17260942  0.04809316
> library(lme4)
> mmod <- lmer(narsp~time*ajwtr+ypc+rcdum+regtest+(time|msa),hprice)
> summary(mmod)
Linear mixed-effects model fit by REML
Formula: narsp ~ time * ajwtr + ypc + rcdum + regtest + (time | msa)
Data: hprice
AIC       BIC   logLik MLdeviance REMLdeviance
-645.1503 -607.3428 332.5751  -724.7333    -665.1503
Random effects:
Groups   Name        Variance   Std.Dev. Corr
msa      (Intercept) 0.03388540 0.184080
time        0.00035724 0.018901 -0.472
Residual             0.00314028 0.056038
number of obs: 324, groups: msa, 36

Fixed effects:
Estimate  Std. Error t value
(Intercept)  2.8050e+00  2.0197e-01 13.8884
time        -4.1798e-04  6.9705e-03 -0.0600
ajwtr1       4.8422e-02  6.6548e-02  0.7276
ypc          5.2526e-05  7.3981e-06  7.0999
rcdum1       1.9492e-01  9.2166e-02  2.1149
regtest      3.1684e-02  9.1003e-03  3.4816
time:ajwtr1  6.0380e-04  6.8581e-03  0.0880

Correlation of Fixed Effects:
(Intr) time   ajwtr1 ypc    rcdum1 regtst
time         0.266
ajwtr1      -0.036  0.245
ypc         -0.427 -0.774 -0.064
rcdum1       0.410  0.156 -0.207 -0.202
regtest     -0.829  0.085 -0.058 -0.110 -0.363
time:ajwtr1  0.095 -0.355 -0.472 -0.069  0.014  0.008

```

### Question 4

```> data(attenu)
> attenu
event mag station  dist accel
1       1 7.0     117  12.0 0.359
2       2 7.4    1083 148.0 0.014
3       2 7.4    1095  42.0 0.196
4       2 7.4     283  85.0 0.135
5       2 7.4     135 107.0 0.062
6       2 7.4     475 109.0 0.054
7       2 7.4     113 156.0 0.014
8       2 7.4    1008 224.0 0.018
9       2 7.4    1028 293.0 0.010
10      2 7.4    2001 359.0 0.004
11      2 7.4     117 370.0 0.004
12      3 5.3    1117   8.0 0.127
13      4 6.1    1438  16.1 0.411
14      4 6.1    1083  63.6 0.018
15      4 6.1    1013   6.6 0.509
16      4 6.1    1014   9.3 0.467
17      4 6.1    1015  13.0 0.279
18      4 6.1    1016  17.3 0.072
19      4 6.1    1095 105.0 0.012
20      4 6.1    1011 112.0 0.006
21      4 6.1    1028 123.0 0.003
22      5 6.6     270 105.0 0.018
23      5 6.6     280 122.0 0.048
24      5 6.6     116 141.0 0.011
25      5 6.6     266 200.0 0.007
26      5 6.6     117  45.0 0.142
27      5 6.6     113 130.0 0.031
28      5 6.6     112 147.0 0.006
29      5 6.6     130 187.0 0.010
30      5 6.6     475 197.0 0.010
31      5 6.6     269 203.0 0.006
32      5 6.6     135 211.0 0.013
33      6 5.6    1093  62.0 0.005
34      7 5.7    1093  62.0 0.003
35      8 5.3     111  19.0 0.086
36      8 5.3     116  21.0 0.179
37      8 5.3     290  13.0 0.205
38      8 5.3     112  22.0 0.073
39      8 5.3     113  29.0 0.045
40      9 6.6     128  17.0 0.374
41      9 6.6     126  19.6 0.200
42      9 6.6     127  20.2 0.147
43      9 6.6     141  21.1 0.188
44      9 6.6     266  21.9 0.204
45      9 6.6     110  24.2 0.335
46      9 6.6    1027  66.0 0.057
47      9 6.6     111  87.0 0.021
48      9 6.6     125  23.4 0.152
49      9 6.6     135  24.6 0.217
50      9 6.6     475  25.7 0.114
51      9 6.6     262  28.6 0.150
52      9 6.6     269  37.4 0.148
53      9 6.6    1052  46.7 0.112
54      9 6.6     411  56.9 0.043
55      9 6.6     290  60.7 0.057
56      9 6.6     130  61.4 0.030
57      9 6.6     272  62.0 0.027
58      9 6.6    1096  64.0 0.028
59      9 6.6    1102  82.0 0.034
60      9 6.6     112  88.0 0.030
61      9 6.6     113  91.0 0.039
62     10 5.3    1028  31.0 0.030
63     11 7.7    2714  45.0 0.110
64     11 7.7    2708 145.0 0.010
65     11 7.7    2715 300.0 0.010
66     12 6.2    3501   5.0 0.390
67     13 5.6     655  50.0 0.031
68     13 5.6     272  16.0 0.130
69     14 5.2    1032  17.0 0.011
70     14 5.2    1377   8.0 0.120
71     14 5.2    1028  10.0 0.170
72     14 5.2    1250  10.0 0.140
73     15 6.0    1051   8.0 0.110
74     15 6.0    1293  32.0 0.040
75     15 6.0    1291  30.0 0.070
76     15 6.0    1292  31.0 0.080
77     16 5.1     283   2.9 0.210
78     16 5.1     885   3.2 0.390
79     16 5.1    <NA>   7.6 0.280
80     17 7.6    2734  25.4 0.160
81     17 7.6    <NA>  32.9 0.064
82     17 7.6    2728  92.2 0.090
83     18 5.8    1413   1.2 0.420
84     18 5.8    1445   1.6 0.230
85     18 5.8    1408   9.1 0.130
86     18 5.8    1411   3.7 0.260
87     18 5.8    1410   5.3 0.270
88     18 5.8    1409   7.4 0.260
89     18 5.8    1377  17.9 0.110
90     18 5.8    1492  19.2 0.120
91     18 5.8    1251  23.4 0.038
92     18 5.8    1422  30.0 0.044
93     18 5.8    1376  38.9 0.046
94     19 6.5    <NA>  23.5 0.170
95     19 6.5     286  26.0 0.210
96     19 6.5    <NA>   0.5 0.320
97     19 6.5    5028   0.6 0.520
98     19 6.5     942   1.3 0.720
99     19 6.5    <NA>   1.4 0.320
100    19 6.5    5054   2.6 0.810
101    19 6.5     958   3.8 0.640
102    19 6.5     952   4.0 0.560
103    19 6.5    5165   5.1 0.510
104    19 6.5     117   6.2 0.400
105    19 6.5     955   6.8 0.610
106    19 6.5    5055   7.5 0.260
107    19 6.5    <NA>   7.6 0.240
108    19 6.5    <NA>   8.4 0.460
109    19 6.5    5060   8.5 0.220
110    19 6.5     412   8.5 0.230
111    19 6.5    5053  10.6 0.280
112    19 6.5    5058  12.6 0.380
113    19 6.5    5057  12.7 0.270
114    19 6.5    <NA>  12.9 0.310
115    19 6.5    5051  14.0 0.200
116    19 6.5    <NA>  15.0 0.110
117    19 6.5    5115  16.0 0.430
118    19 6.5    <NA>  17.7 0.270
119    19 6.5     931  18.0 0.150
120    19 6.5    5056  22.0 0.150
121    19 6.5    5059  22.0 0.150
122    19 6.5    5061  23.0 0.130
123    19 6.5    <NA>  23.2 0.190
124    19 6.5    5062  29.0 0.130
125    19 6.5    5052  32.0 0.066
126    19 6.5    <NA>  32.7 0.350
127    19 6.5     724  36.0 0.100
128    19 6.5    <NA>  43.5 0.160
129    19 6.5    5066  49.0 0.140
130    19 6.5    5050  60.0 0.049
131    19 6.5    2316  64.0 0.034
132    20 5.0    5055   7.5 0.264
133    20 5.0     942   8.8 0.263
134    20 5.0    5028   8.9 0.230
135    20 5.0    5165   9.4 0.147
136    20 5.0     952   9.7 0.286
137    20 5.0     958   9.7 0.157
138    20 5.0     955  10.5 0.237
139    20 5.0     117  10.5 0.133
140    20 5.0     412  12.0 0.055
141    20 5.0    5053  12.2 0.097
142    20 5.0    5054  12.8 0.129
143    20 5.0    5058  14.6 0.192
144    20 5.0    5057  14.9 0.147
145    20 5.0    5115  17.6 0.154
146    20 5.0    5056  23.9 0.060
147    20 5.0    5060  25.0 0.057
148    21 5.8    1030  10.8 0.120
149    21 5.8    1418  15.7 0.154
150    21 5.8    1383  16.7 0.052
151    21 5.8    1308  20.8 0.045
152    21 5.8    1298  28.5 0.086
153    21 5.8    1299  33.1 0.056
154    21 5.8    1219  40.3 0.065
155    22 5.5    <NA>   4.0 0.259
156    22 5.5    <NA>  10.1 0.267
157    22 5.5    1030  11.1 0.071
158    22 5.5    1418  17.7 0.275
159    22 5.5    1383  22.5 0.058
160    22 5.5    <NA>  26.5 0.026
161    22 5.5    1299  29.0 0.039
162    22 5.5    1308  30.9 0.112
163    22 5.5    1219  37.8 0.065
164    22 5.5    1456  48.3 0.026
165    23 5.3    5045   5.8 0.123
166    23 5.3    5044  12.0 0.133
167    23 5.3    5160  12.1 0.073
168    23 5.3    5043  20.5 0.097
169    23 5.3    5047  20.5 0.096
170    23 5.3    c168  25.3 0.230
171    23 5.3    5068  35.9 0.082
172    23 5.3    c118  36.1 0.110
173    23 5.3    5042  36.3 0.110
174    23 5.3    5067  38.5 0.094
175    23 5.3    5049  41.4 0.040
176    23 5.3    c204  43.6 0.050
177    23 5.3    5070  44.4 0.022
178    23 5.3    c266  46.1 0.070
179    23 5.3    c203  47.1 0.080
180    23 5.3    5069  47.7 0.033
181    23 5.3    5073  49.2 0.017
182    23 5.3    5072  53.1 0.022
> library(lme4)

Attaching package: 'lattice'

The following object(s) are masked from package:Matrix :

qqmath

The following object(s) are masked from package:faraway :

melanoma

> help(attenu)
> mmod <- lmer(log(accel)~log(dist)+(1|mag),attenu)
> summary(mmod)
Linear mixed-effects model fit by REML
Formula: log(accel) ~ log(dist) + (1 | mag)
Data: attenu
AIC    BIC   logLik MLdeviance REMLdeviance
402.498 412.11 -198.249   389.7837      396.498
Random effects:
Groups   Name        Variance Std.Dev.
mag      (Intercept) 0.10422  0.32282
Residual             0.45926  0.67768
number of obs: 182, groups: mag, 17

Fixed effects:
Estimate Std. Error  t value
(Intercept)  0.12915    0.19924   0.6482
log(dist)   -0.84433    0.05295 -15.9458

Correlation of Fixed Effects:
(Intr)
log(dist) -0.849
```

## Chapter 10

### Question 2

```>data(potuse)
sex year.76 year.77 year.78 year.79 year.80 count total
1   1       1       1       1       1       1    48     5
2   1       1       1       1       1       2     8     6
3   1       1       1       1       1       3     4     7
4   1       1       1       1       2       1     2     6
5   1       1       1       1       2       2     4     7
6   1       1       1       1       2       3     1     8
> potuse\$y76 <- ifelse(potuse\$year.76==1,1,0)
sex year.76 year.77 year.78 year.79 year.80 count total y76
1   1       1       1       1       1       1    48     5   1
2   1       1       1       1       1       2     8     6   1
3   1       1       1       1       1       3     4     7   1
4   1       1       1       1       2       1     2     6   1
5   1       1       1       1       2       2     4     7   1
6   1       1       1       1       2       3     1     8   1
> potuse\$y77 <- ifelse(potuse\$year.77==1,1,0)
> potuse\$y78 <- ifelse(potuse\$year.78==1,1,0)
> potuse\$y79 <- ifelse(potuse\$year.79==1,1,0)
> potuse\$y80 <- ifelse(potuse\$year.80==1,1,0)
sex year.76 year.77 year.78 year.79 year.80 count total y76 y77 y78 y79 y80
1   1       1       1       1       1       1    48     5   1   1   1   1   1
2   1       1       1       1       1       2     8     6   1   1   1   1   0
3   1       1       1       1       1       3     4     7   1   1   1   1   0
4   1       1       1       1       2       1     2     6   1   1   1   0   1
5   1       1       1       1       2       2     4     7   1   1   1   0   0
6   1       1       1       1       2       3     1     8   1   1   1   0   0
> potuse\$overall <- (potuse\$y76+potuse\$y77+potuse\$y78+potuse\$y79+potuse\$y80)/5
> potuse\$overall
[1] 1.0 0.8 0.8 0.8 0.6 0.6 0.8 0.6 0.6 0.8 0.6 0.6 0.6 0.4 0.4 0.6 0.4 0.4
[19] 0.8 0.6 0.6 0.6 0.4 0.4 0.6 0.4 0.4 0.8 0.6 0.6 0.6 0.4 0.4 0.6 0.4 0.4
[37] 0.6 0.4 0.4 0.4 0.2 0.2 0.4 0.2 0.2 0.6 0.4 0.4 0.4 0.2 0.2 0.4 0.2 0.2
[55] 0.8 0.6 0.6 0.6 0.4 0.4 0.6 0.4 0.4 0.6 0.4 0.4 0.4 0.2 0.2 0.4 0.2 0.2
[73] 0.6 0.4 0.4 0.4 0.2 0.2 0.4 0.2 0.2 0.8 0.6 0.6 0.6 0.4 0.4 0.6 0.4 0.4
[91] 0.6 0.4 0.4 0.4 0.2 0.2 0.4 0.2 0.2 0.6 0.4 0.4 0.4 0.2 0.2 0.4 0.2 0.2
[109] 0.6 0.4 0.4 0.4 0.2 0.2 0.4 0.2 0.2 0.4 0.2 0.2 0.2 0.0 0.0 0.2 0.0 0.0
[127] 0.4 0.2 0.2 0.2 0.0 0.0 0.2 0.0 0.0 0.6 0.4 0.4 0.4 0.2 0.2 0.4 0.2 0.2
[145] 0.4 0.2 0.2 0.2 0.0 0.0 0.2 0.0 0.0 0.4 0.2 0.2 0.2 0.0 0.0 0.2 0.0 0.0
[163] 0.8 0.6 0.6 0.6 0.4 0.4 0.6 0.4 0.4 0.6 0.4 0.4 0.4 0.2 0.2 0.4 0.2 0.2
[181] 0.6 0.4 0.4 0.4 0.2 0.2 0.4 0.2 0.2 0.6 0.4 0.4 0.4 0.2 0.2 0.4 0.2 0.2
[199] 0.4 0.2 0.2 0.2 0.0 0.0 0.2 0.0 0.0 0.4 0.2 0.2 0.2 0.0 0.0 0.2 0.0 0.0
[217] 0.6 0.4 0.4 0.4 0.2 0.2 0.4 0.2 0.2 0.4 0.2 0.2 0.2 0.0 0.0 0.2 0.0 0.0
[235] 0.4 0.2 0.2 0.2 0.0 0.0 0.2 0.0 0.0 1.0 0.8 0.8 0.8 0.6 0.6 0.8 0.6 0.6
[253] 0.8 0.6 0.6 0.6 0.4 0.4 0.6 0.4 0.4 0.8 0.6 0.6 0.6 0.4 0.4 0.6 0.4 0.4
[271] 0.8 0.6 0.6 0.6 0.4 0.4 0.6 0.4 0.4 0.6 0.4 0.4 0.4 0.2 0.2 0.4 0.2 0.2
[289] 0.6 0.4 0.4 0.4 0.2 0.2 0.4 0.2 0.2 0.8 0.6 0.6 0.6 0.4 0.4 0.6 0.4 0.4
[307] 0.6 0.4 0.4 0.4 0.2 0.2 0.4 0.2 0.2 0.6 0.4 0.4 0.4 0.2 0.2 0.4 0.2 0.2
[325] 0.8 0.6 0.6 0.6 0.4 0.4 0.6 0.4 0.4 0.6 0.4 0.4 0.4 0.2 0.2 0.4 0.2 0.2
[343] 0.6 0.4 0.4 0.4 0.2 0.2 0.4 0.2 0.2 0.6 0.4 0.4 0.4 0.2 0.2 0.4 0.2 0.2
[361] 0.4 0.2 0.2 0.2 0.0 0.0 0.2 0.0 0.0 0.4 0.2 0.2 0.2 0.0 0.0 0.2 0.0 0.0
[379] 0.6 0.4 0.4 0.4 0.2 0.2 0.4 0.2 0.2 0.4 0.2 0.2 0.2 0.0 0.0 0.2 0.0 0.0
[397] 0.4 0.2 0.2 0.2 0.0 0.0 0.2 0.0 0.0 0.8 0.6 0.6 0.6 0.4 0.4 0.6 0.4 0.4
[415] 0.6 0.4 0.4 0.4 0.2 0.2 0.4 0.2 0.2 0.6 0.4 0.4 0.4 0.2 0.2 0.4 0.2 0.2
[433] 0.6 0.4 0.4 0.4 0.2 0.2 0.4 0.2 0.2 0.4 0.2 0.2 0.2 0.0 0.0 0.2 0.0 0.0
[451] 0.4 0.2 0.2 0.2 0.0 0.0 0.2 0.0 0.0 0.6 0.4 0.4 0.4 0.2 0.2 0.4 0.2 0.2
[469] 0.4 0.2 0.2 0.2 0.0 0.0 0.2 0.0 0.0 0.4 0.2 0.2 0.2 0.0 0.0 0.2 0.0 0.0
Here 0 indicates that the people used pot for all 5 years, 0.2= 4 years, 0.4=3 years
0.6=2 years, 0.8=1 year and 1.0 indicates that the people did not use pot at all

need id for data? how to convert count to id for generalized estimating equations (gee)?
```

## Chapter 11

### Question 2

```> data(uswages)
wage educ exper race smsa ne mw so we pt
6085  771.60   18    18    0    1  1  0  0  0  0
23701 617.28   15    20    0    1  0  0  0  1  0
16208 957.83   16     9    0    1  0  0  1  0  0
2720  617.28   12    24    0    1  1  0  0  0  0
9723  902.18   14    12    0    1  0  1  0  0  0
22239 299.15   12    33    0    1  0  0  0  1  0
> help(uswages)
This data frame contains the following columns:

'wage' Real weekly wages in dollars (deflated by personal
consumption expenditures - 1992 base year)

'educ' Years of education

'exper' Years of experience

'race' 1 if Black, 0 if White (other races not in sample)

'smsa' 1 if living in Standard Metropolitan Statistical Area, 0 if
not

'ne' 1 if living in the North East

'mw' 1 if living in the Midwest

'we' 1 if living in the West

'so' 1 if living in the South

'pt' 1 if working part time, 0 if not
> plot(wage~educ,uswages,main="Wages in the U.S.")
> plot(wage~educ,uswages,main="Wages in the U.S.")
> lines(ksmooth(uswages\$educ,uswages\$wage,"normal",0.5))
> lines(ksmooth(uswages\$educ,uswages\$wage,"normal",2))
> plot(wage~educ,uswages,main="Bandwith=0.5")
> lines(ksmooth(uswages\$educ,uswages\$wage,"normal",0.5))
> plot(wage~educ,uswages,main="Bandwith=2")
> lines(ksmooth(uswages\$educ,uswages\$wage,"normal",2))
> lines(ksmooth(uswages\$educ,uswages\$wage,"normal",1.5))
> lines(ksmooth(uswages\$educ,uswages\$wage,"normal",1))
> lines(ksmooth(uswages\$educ,uswages\$wage,"normal",0.5))
we see that a bandwith of 2 gives us a fit we can work with as it is the least smooth
fit that does not show any implausible fluctuations.

> hm <- hcv(uswages\$educ,uswages\$wage,display="lines")

hcv: boundary of search area reached.
hstart:  0.06958409
hend  :  1.391682

h        cv
[1,] 0.06958409 402321049
[2,] 0.10675086 402321049
[3,] 0.16376942 402321049
[4,] 0.25124316 402293059
[5,] 0.38543902 400136532
[6,] 0.59131256 395095896
[7,] 0.90714880 393778752
[8,] 1.39168184 392410227

Next we try a smoothing Spline:
> plot(wage~educ,uswages,main="Wages in the U.S.")
> lines(smooth.spline(uswages\$educ,uswages\$wage))
We see from the graph that it is a reasonable fit

Next we try a Lowess Smoothing:
> plot(wage~educ,uswages,main="Wages in the U.S.")
> f <- loess(wage~educ,uswages)
> i <- order(uswages\$educ)
> lines (f\$x[i],f\$fitted[i])
Again, we see a resonable default fit
```

### Question 4

```> library(faraway)

Attaching package: 'faraway'

The following object(s) are masked _by_ .GlobalEnv :

aflatoxin babyfood salmonella turtle

The following object(s) are masked from package:datasets :

faithful

> data(divusa)
> plot(divorce~year,divusa,main="Divorces in US",pch=".")
> library(sm)
Library `sm', version 2.1;  Copyright (C) 1997, 2000, 2005 A.W.Bowman & A.Azzalini
type help(sm) for summary information
> hm <- hcv(divusa\$year,divusa\$divorce,display="lines")

hcv: boundary of search area reached.
hstart:  0.994012
hend  :  19.88024

h        cv
[1,]  0.994012  27.82906
[2,]  1.524941  38.89271
[3,]  2.339454  57.27233
[4,]  3.589020  86.72126
[5,]  5.506015 144.23398
[6,]  8.446928 255.30853
[7,] 12.958663 392.76460
[8,] 19.880241 491.59201

Smoothing Splines:

>  plot(divorce~year,divusa,main="Divorces in US",pch=".")
> lines(divusa\$year,divusa\$divorce)
> lines(smooth.spline(divusa\$year,divusa\$divorce),lty=2)
As we can see from the Plot the fit is reasonable

Lowess Splines :
> f <- loess(divorce~year,divusa)
> i <- order(divusa\$year)
>  plot(divorce~year,divusa,main="Divorces in US",pch=".")
> lines(f\$x[i],f\$fitted[i],lty=2)
> lines(divusa\$year,divusa\$divorce,lty=1)

With Span =1:
> f <- loess(divorce~year,divusa,span=1)
> lines(f\$x[i],f\$fitted[i],lty=5)

Here the default choice is too large. The choice where span=1 minimizes the squared error
between the estimated and true function.

As in the book, the loess smoother did a better job of smoothing than the smoothing splines
as there are no outliers in the data as explained in the question.

```

## Chapter 12

### Question 2

```> library(faraway)

Attaching package: 'faraway'

The following object(s) are masked _by_ .GlobalEnv :

aflatoxin babyfood salmonella turtle

The following object(s) are masked from package:datasets :

faithful

> data(trees)
Girth Height Volume
1   8.3     70   10.3
2   8.6     65   10.3
3   8.8     63   10.2
4  10.5     72   16.4
5  10.7     81   18.8
6  10.8     83   19.7

> help(trees)

Girth, Height and Volume for Black Cherry Trees

Description:

This data set provides measurements of the girth, height and
volume of timber in 31 felled black cherry trees.  Note that girth
is the diameter of the tree (in inches) measured at 4 ft 6 in
above the ground.

Usage:

trees

Format:

A data frame with 31 observations on 3 variables.

'[,1]'  'Girth'   numeric  Tree diameter in inches
'[,2]'  'Height'  numeric  Height in ft
'[,3]'  'Volume'  numeric  Volume of timber in cubic ft

Simple Linear Model:
> olm <- lm(Volume~Girth+Height,trees)
> summary(olm)

Call:
lm(formula = Volume ~ Girth + Height, data = trees)

Residuals:
Min      1Q  Median      3Q     Max
-6.4065 -2.6493 -0.2876  2.2003  8.4847

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -57.9877     8.6382  -6.713 2.75e-07 ***
Girth         4.7082     0.2643  17.816  < 2e-16 ***
Height        0.3393     0.1302   2.607   0.0145 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.882 on 28 degrees of freedom
Multiple R-Squared: 0.948,      Adjusted R-squared: 0.9442
F-statistic:   255 on 2 and 28 DF,  p-value: < 2.2e-16

> library(gam)
> amgam <- gam(Volume~lo(Girth)+lo(Height),data=trees)
> summary(amgam)

Call: gam(formula = Volume ~ lo(Girth) + lo(Height), data = trees)
Deviance Residuals:
Min      1Q  Median      3Q     Max
-4.6699 -1.7095  0.3245  1.7893  4.2444

(Dispersion Parameter for gaussian family taken to be 8.3502)

Null Deviance: 8106.084 on 30 degrees of freedom
Residual Deviance: 181.6329 on 21.7519 degrees of freedom
AIC: 163.2785

Number of Local Scoring Iterations: 2

DF for Terms and F-values for Nonparametric Effects

Df Npar Df Npar F    Pr(F)
(Intercept) 1.0
lo(Girth)   1.0     3.4 7.4153 0.000972 ***
lo(Height)  1.0     2.9 0.2245 0.869554
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Compared to the linear model the R-squared is:

> 1-181.6329/8106.084
[1] 0.977593

So the fit is a better, but uses more degrees of freedom

Using ACE:
> x <- trees[,c("Girth","Height")]
> library(acepack)
> acefit <- ace(x,trees\$Volume)
> summary (lm(acefit\$ty~acefit\$tx))

Call:
lm(formula = acefit\$ty ~ acefit\$tx)

Residuals:
Min        1Q    Median        3Q       Max
-0.290320 -0.098722  0.003021  0.105572  0.219676

Coefficients:
Estimate Std. Error  t value Pr(>|t|)
(Intercept)     -9.377e-17  2.605e-02 -3.6e-15        1
acefit\$txGirth   1.002e+00  3.363e-02   29.786  < 2e-16 ***
acefit\$txHeight  1.036e+00  1.889e-01    5.483 7.42e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.145 on 28 degrees of freedom
Multiple R-Squared: 0.981,      Adjusted R-squared: 0.9796
F-statistic:   723 on 2 and 28 DF,  p-value: < 2.2e-16

Both transformed predictors are srongly significant, and the R-squared is
higher than that of the additive model

Graphs of Ace Transformations:
> plot(trees\$Volume,acefit\$ty,xlab="Volume",ylab=expression(theta(Volume)))
> plot(x[,1],acefit\$tx[,1],xlab="Girth",ylab="f(Girth)")
> plot(x[,2],acefit\$tx[,2],xlab="Height",ylab="f(Height)")

Using AVAS:
> avasfit <- avas(x,trees\$Volume)
> plot(trees\$Volume,avasfit\$ty,xlab="Volume",ylab=expression(theta(Volume)))
> plot(x[,1],avasfit\$tx[,1],xlab="Girth",ylab="f(Girth)")
> plot(x[,2],avasfit\$tx[,2],xlab="Height",ylab="f(Height)")

> plot(trees\$Volume[i],avasfit\$ty[i],type="l",xlab="Volume",ylab=expression(theta(Volume)))
> gs <- lm(avasfit\$ty[i]~sqrt(trees\$Volume[i]))
> lines (trees\$Volume[i],gs\$fit,lty=2)
> gl <- lm(avasfit\$ty[i]~log(trees\$Volume[i]))
> lines (trees\$Volume[i],gl\$fit,lty=5)

> lmod <- lm(avasfit\$ty~avasfit\$tx)
> summary(lmod)

Call:
lm(formula = avasfit\$ty ~ avasfit\$tx)

Residuals:
Min       1Q   Median       3Q      Max
-0.24141 -0.07164  0.01239  0.08441  0.19841

Coefficients:
Estimate Std. Error  t value Pr(>|t|)
(Intercept)      3.369e-07  2.123e-02 1.59e-05        1
avasfit\$txGirth  1.001e+00  2.974e-02   33.652  < 2e-16 ***
avasfit\$txHeight 1.028e+00  1.563e-01    6.577 3.93e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1182 on 28 degrees of freedom
Multiple R-Squared: 0.9874,     Adjusted R-squared: 0.9865
F-statistic:  1095 on 2 and 28 DF,  p-value: < 2.2e-16

> plot(predict(lmod),residuals(lmod),xlab="Fitted",ylab="Residuals")

Both models are very well-fitting

```

### Question 4

```> data(dvisits)
sex  age  agesq income levyplus freepoor freerepa illness actdays hscore
1   1 0.19 0.0361   0.55        1        0        0       1       4      1
2   1 0.19 0.0361   0.45        1        0        0       1       2      1
3   0 0.19 0.0361   0.90        0        0        0       3       0      0
4   0 0.19 0.0361   0.15        0        0        0       1       0      0
5   0 0.19 0.0361   0.45        0        0        0       2       5      1
6   1 0.19 0.0361   0.35        0        0        0       5       1      9
chcond1 chcond2 doctorco nondocco hospadmi hospdays medicine prescrib
1       0       0        1        0        0        0        1        1
2       0       0        1        0        0        0        2        1
3       0       0        1        0        1        4        2        1
4       0       0        1        0        0        0        0        0
5       1       0        1        0        0        0        3        1
6       1       0        1        0        0        0        1        1
nonpresc
1        0
2        1
3        1
4        0
5        2
6        0
> help(dvisits)
Doctor visits in Australia

Description:

The data come from the Australian Health Survey of 1977-78 and
consist of 5190 single adults where young and old have been
oversampled.

Usage:

data(dvisits)

Format:

A data frame with 5190 observations on the following 19 variables.

'sex' 1 if female, 0 if male

'age' Age in years divided by 100 (measured as mid-point of 10 age
groups from 15-19 years to 65-69 with 70 or more coded
treated as 72)

'agesq' age squared

'income' Annual income in Australian dollars divided by 1000
(measured as mid-point of coded ranges Nil, less than 200,
200-1000, 1001-, 2001-, 3001-, 4001-, 5001-, 6001-, 7001-,
8001-10000, 10001-12000, 12001-14000, with 14001- treated as
15000

'levyplus' 1 if covered by private health insurance fund for
private patient in public hospital (with doctor of choice), 0
otherwise

'freepoor' 1 if covered by government because low income, recent
immigrant, unemployed, 0 otherwise

'freerepa' 1 if covered free by government because of old-age or
disability pension, or because invalid veteran or family of
deceased  veteran, 0 otherwise

'illness' Number of illnesses in past 2 weeks with 5 or more coded
as 5

'actdays' Number of days of reduced activity in past two weeks due
to  illness or injury

'hscore' General health questionnaire score using Goldberg's
method. High score indicates bad health

'chcond1' 1 if chronic condition(s) but not limited in activity, 0
otherwise

'chcond2' 1 if chronic condition(s) and limited in activity, 0
otherwise

'doctorco' Number of consultations with a doctor or specialist  in
the past 2 weeks

'nondocco' Number of consultations with non-doctor health
professionals (chemist, optician, physiotherapist, social
worker,  district community nurse, chiropodist or
chiropractor) in the past 2 weeks

hospital,  nursing or convalescent home in the past 12 months
(up to 5 or more admissions which is coded as 5)

'hospdays' Number of nights in a hospital, etc. during most recent
admission: taken, where appropriate, as the mid-point of the
intervals 1, 2, 3, 4, 5, 6, 7, 8-14, 15-30, 31-60, 61-79 with
80 or more admissions coded as 80. If no admission in past 12
months then equals zero

'medicine' Total number of prescribed and nonprescribed
medications used in past 2 days

'prescrib' Total number of prescribed medications used in past 2
days

'nonpresc' Total number of nonprescribed medications used in past
2 days
> amgam <- gam(doctorco~sex+lo(age)+lo(agesq)+lo(income)+levyplus+freepoor+freerepa+lo(illness)+lo(actdays)+lo(hscore)+chcond1+chcond2,data=dvisits)
Warning messages:
1: pseudoinverse used at 2
3: reciprocal condition number  0
4: There are other near singularities as well. 1
5: lo.wam convergence not obtained in  30  iterations in: lo.wam(x, z, wz, fit\$smooth, which, fit\$smooth.frame, bf.maxit,
6: pseudoinverse used at 2
8: reciprocal condition number  0
9: There are other near singularities as well. 1
10: lo.wam convergence not obtained in  30  iterations in: lo.wam(x, z, wz, fit\$smooth, which, fit\$smooth.frame, bf.maxit,
> summary(amgam)

Call: gam(formula = doctorco ~ sex + lo(age) + lo(agesq) + lo(income) +
levyplus + freepoor + freerepa + lo(illness) + lo(actdays) +
lo(hscore) + chcond1 + chcond2, data = dvisits)
Deviance Residuals:
Min       1Q   Median       3Q      Max
-2.44501 -0.25614 -0.13470 -0.02021  7.05098

(Dispersion Parameter for gaussian family taken to be 0.4979)

Null Deviance: 3305.484 on 5189 degrees of freedom
Residual Deviance: 2568.656 on 5159.138 degrees of freedom
AIC: 11141.92

Number of Local Scoring Iterations: 2

DF for Terms and F-values for Nonparametric Effects

Df Npar Df Npar F   Pr(F)
(Intercept) 1.0
sex         1.0
lo(age)     1.0     2.5  0.240 0.82979
lo(agesq)   1.0     2.8  0.379 0.75308
lo(income)  1.0     2.7  1.248 0.29047
levyplus    1.0
freepoor    1.0
freerepa    1.0
lo(illness) 1.0     4.0  2.040 0.08603 .
lo(actdays) 1.0     3.1 38.532 < 2e-16 ***
lo(hscore)  1.0     2.8  1.429 0.23378
chcond1     1.0
chcond2     1.0
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> 1-2568.656/3305.484
[1] 0.2229108
This is a really bad Fit

Diagnostics:
> plot(predict(amgam),residuals(amgam),xlab="Predicted",ylab="Residuals")

We see that the residuals do not look normal, there is also non-constant variance
```

### Question 6

```> library(lattice)
> data(ethanol)
NOx  C     E
1 3.741 12 0.907
2 2.295 12 0.761
3 1.498 12 1.108
4 2.881 12 1.016
5 0.760 12 1.189
6 3.120  9 1.001
> help(ethanol)
Description:

Ethanol fuel was burned in a single-cylinder engine.  For various
settings of the engine compression and equivalence ratio, the
emissions of nitrogen oxides were recorded.

Usage:

ethanol

Format:

A data frame with 88 observations on the following 3 variables.

NOx Concentration of nitrogen oxides (NO and NO2) in micrograms/J.

C Compression ratio of the engine.

E Equivalence ratio-a measure of the richness of the air and
ethanol fuel mixture.
```