Ejemplo: Problema de anteras embriogénicas de la planta de especie Datura innoxia

(Dobson (1990))

La siguiente tabla muestra el número de anteras embriogénicas \(y_{jk}\) de planta de especie Datura innoxia obtenidas cuando un número de anteras \(n_{jk}\) fueron preparadas bajo distintas condiciones. Hay un factor cualitativo: y una covariable continua con 3 valores de fuerza centrífuga. Interesa comparar el efecto del tratamiento de almacenaje después de ajustar, si es necesario, por la fuerza centrifuga.

Lectura de datos y características de los mismos:

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
dat <-  anthers <- read.csv("../RM_tema_8/anthers.dat", sep="")
dim(anthers)
## [1] 6 4
str(anthers)
## 'data.frame':    6 obs. of  4 variables:
##  $ f: Factor w/ 2 levels "ctrl","trt": 1 1 1 2 2 2
##  $ x: int  40 150 350 40 150 350
##  $ n: int  102 99 108 76 81 90
##  $ y: int  55 52 57 55 50 50
head(anthers)
##      f   x   n  y
## 1 ctrl  40 102 55
## 2 ctrl 150  99 52
## 3 ctrl 350 108 57
## 4  trt  40  76 55
## 5  trt 150  81 50
## 6  trt 350  90 50
summary(anthers)
##     f           x               n                y        
##  ctrl:3   Min.   : 40.0   Min.   : 76.00   Min.   :50.00  
##  trt :3   1st Qu.: 67.5   1st Qu.: 83.25   1st Qu.:50.50  
##           Median :150.0   Median : 94.50   Median :53.50  
##           Mean   :180.0   Mean   : 92.67   Mean   :53.17  
##           3rd Qu.:300.0   3rd Qu.:101.25   3rd Qu.:55.00  
##           Max.   :350.0   Max.   :108.00   Max.   :57.00

Modelo 1 con interacción tratamiento*fuerza

dat <- dat %>% mutate(logx=log(x),prop=y/n)
print(dat)
##      f   x   n  y     logx      prop
## 1 ctrl  40 102 55 3.688879 0.5392157
## 2 ctrl 150  99 52 5.010635 0.5252525
## 3 ctrl 350 108 57 5.857933 0.5277778
## 4  trt  40  76 55 3.688879 0.7236842
## 5  trt 150  81 50 5.010635 0.6172840
## 6  trt 350  90 50 5.857933 0.5555556
mod1 <- glm(prop~logx*f,family = binomial("logit"),data=dat,weights = n)
print(summary(mod1))
## 
## Call:
## glm(formula = prop ~ logx * f, family = binomial("logit"), data = dat, 
##     weights = n)
## 
## Deviance Residuals: 
##        1         2         3         4         5         6  
##  0.03611  -0.09370   0.05466   0.04305  -0.09855   0.05560  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  0.23389    0.62839   0.372   0.7097
## logx        -0.02274    0.12685  -0.179   0.8577
## ftrt         1.97711    0.99802   1.981   0.0476
## logx:ftrt   -0.31862    0.19888  -1.602   0.1091
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 10.451974  on 5  degrees of freedom
## Residual deviance:  0.027728  on 2  degrees of freedom
## AIC: 37.596
## 
## Number of Fisher Scoring iterations: 3
print(anova(mod1))
## Analysis of Deviance Table
## 
## Model: binomial, link: logit
## 
## Response: prop
## 
## Terms added sequentially (first to last)
## 
## 
##        Df Deviance Resid. Df Resid. Dev
## NULL                       5    10.4520
## logx    1   2.3604         4     8.0916
## f       1   5.4727         3     2.6188
## logx:f  1   2.5911         2     0.0277

Modelo 2 con factor tratamiento y covariable fuerza

mod2 <- glm(prop~f+logx,family=binomial("logit"),data=dat,weights=n)
print(summary(mod2))
## 
## Call:
## glm(formula = prop ~ f + logx, family = binomial("logit"), data = dat, 
##     weights = n)
## 
## Deviance Residuals: 
##        1         2         3         4         5         6  
## -0.74964  -0.00509   0.72746   0.99006  -0.13512  -0.72744  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  0.87673    0.48701   1.800   0.0718
## ftrt         0.40684    0.17462   2.330   0.0198
## logx        -0.15459    0.09702  -1.593   0.1111
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 10.4520  on 5  degrees of freedom
## Residual deviance:  2.6188  on 3  degrees of freedom
## AIC: 38.187
## 
## Number of Fisher Scoring iterations: 3
print(anova(mod2))
## Analysis of Deviance Table
## 
## Model: binomial, link: logit
## 
## Response: prop
## 
## Terms added sequentially (first to last)
## 
## 
##      Df Deviance Resid. Df Resid. Dev
## NULL                     5    10.4520
## f     1   5.2790         4     5.1730
## logx  1   2.5541         3     2.6188

Modelo 3 con covariable fuerza

mod3 <- glm(prop~logx,family=binomial("logit"),data=dat,weights=n)
print(summary(mod3))
## 
## Call:
## glm(formula = prop ~ logx, family = binomial("logit"), data = dat, 
##     weights = n)
## 
## Deviance Residuals: 
##       1        2        3        4        5        6  
## -1.5947  -0.8896  -0.2283   1.9610   0.8700   0.3204  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   1.0213     0.4813   2.122   0.0338
## logx         -0.1478     0.0965  -1.532   0.1255
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 10.4520  on 5  degrees of freedom
## Residual deviance:  8.0916  on 4  degrees of freedom
## AIC: 41.66
## 
## Number of Fisher Scoring iterations: 3
print(anova(mod3))
## Analysis of Deviance Table
## 
## Model: binomial, link: logit
## 
## Response: prop
## 
## Terms added sequentially (first to last)
## 
## 
##      Df Deviance Resid. Df Resid. Dev
## NULL                     5    10.4520
## logx  1   2.3604         4     8.0916

Resultados

library(xtable)
tab <- dat %>% mutate(fc=x,x=logx,y=prop,yhat1=fitted(mod1),yhat2=fitted(mod2),
                      yhat3=fitted(mod3)) %>% select(fc,logx,y,yhat1,yhat2,yhat3)
tab <- round(tab,3)
print(tab)
##    fc  logx     y yhat1 yhat2 yhat3
## 1  40 3.689 0.539 0.537 0.576 0.617
## 2 150 5.011 0.525 0.530 0.526 0.570
## 3 350 5.858 0.528 0.525 0.493 0.539
## 4  40 3.689 0.724 0.721 0.671 0.617
## 5 150 5.011 0.617 0.623 0.625 0.570
## 6 350 5.858 0.556 0.553 0.593 0.539
cat("Devianzas")
## Devianzas
Dev <- data.frame(modelo1=deviance(mod1),modelo2=deviance(mod2),modelo3=deviance(mod3))
print(Dev)
##     modelo1  modelo2  modelo3
## 1 0.0277278 2.618837 8.091578

La tabla muestra las proporciones ajustadas por los 3 modelos. Note lo bien que el modelo 1 ajusta los datos. Sucede por que se estima ¡ 4 parámetros con base a 6 observaciones!