Example: Pure Error and Lack of Fit

(Simulated)

Reading the data and inspecting its characteristics:

EpFa.data <- read.csv("./EpFa.dat", sep="")
dim(EpFa.data)
## [1] 39  6
str(EpFa.data)
## 'data.frame':    39 obs. of  6 variables:
##  $ i    : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ x    : int  1 1 1 2 2 2 2 2 3 3 ...
##  $ y    : num  -2.25 7.46 5.9 0.34 15.19 ...
##  $ ymed : num  3.71 3.71 3.71 7.38 7.38 7.38 7.38 7.38 6.49 6.49 ...
##  $ yhat1: num  -10.96 -10.96 -10.96 1.05 1.05 ...
##  $ yhat2: num  3.84 3.84 3.84 6.34 6.34 ...
head(EpFa.data)
##   i x     y ymed  yhat1 yhat2
## 1 1 1 -2.25 3.71 -10.96  3.84
## 2 2 1  7.46 3.71 -10.96  3.84
## 3 3 1  5.90 3.71 -10.96  3.84
## 4 4 2  0.34 7.38   1.05  6.34
## 5 5 2 15.19 7.38   1.05  6.34
## 6 6 2  3.98 7.38   1.05  6.34
summary(EpFa.data)
##        i              x                y               ymed       
##  Min.   : 1.0   Min.   : 1.000   Min.   : -2.25   Min.   :  3.71  
##  1st Qu.:10.5   1st Qu.: 3.000   1st Qu.: 10.54   1st Qu.:  7.38  
##  Median :20.0   Median : 5.000   Median : 33.11   Median : 31.84  
##  Mean   :20.0   Mean   : 5.487   Mean   : 42.94   Mean   : 42.94  
##  3rd Qu.:29.5   3rd Qu.: 8.000   3rd Qu.: 67.65   3rd Qu.: 71.17  
##  Max.   :39.0   Max.   :10.000   Max.   :114.44   Max.   :112.79  
##      yhat1            yhat2       
##  Min.   :-10.96   Min.   :  3.84  
##  1st Qu.: 13.06   1st Qu.: 11.24  
##  Median : 37.09   Median : 28.19  
##  Mean   : 42.94   Mean   : 42.94  
##  3rd Qu.: 73.13   3rd Qu.: 71.52  
##  Max.   : 97.15   Max.   :112.36
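
The printed values suggest that `ymed` holds the mean of `y` at each level of `x`, and that `yhat1` and `yhat2` hold fitted values from two candidate models. This is an inference from the output, not something the source states. Under that assumption, the pure-error sum of squares is the sum of squared deviations of each `y` from its group mean. A minimal sketch, using a hypothetical helper `pure_error` and toy data rather than the EpFa data:

```r
# Hypothetical helper (not from the source): pure-error sum of squares and
# degrees of freedom for a response y replicated at the levels of x.
pure_error <- function(x, y) {
  group_mean <- ave(y, x)               # mean of y within each level of x
  ss <- sum((y - group_mean)^2)         # pure-error sum of squares
  df <- length(y) - length(unique(x))   # n minus number of distinct x levels
  list(ss = ss, df = df, group_mean = group_mean)
}

# Toy data (not the EpFa data): three levels of x with replicates.
toy <- data.frame(x = c(1, 1, 2, 2, 2, 3),
                  y = c(1, 3, 4, 6, 8, 10))
pe <- pure_error(toy$x, toy$y)
pe$ss   # 2 + 8 + 0 = 10
pe$df   # 6 - 3 = 3
```

Applied to the data read above, the call would be `pure_error(EpFa.data$x, EpFa.data$y)`; if the assumption about `ymed` holds, `group_mean` should match that column up to rounding.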

Linear model fit

## 
## Call:
## lm(formula = y ~ x)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -21.500  -7.882  -1.406   8.416  18.425 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  -22.977      3.652  -6.292 2.53e-07
## x             12.013      0.594  20.224  < 2e-16
## 
## Residual standard error: 10.28 on 37 degrees of freedom
## Multiple R-squared:  0.917,  Adjusted R-squared:  0.9148 
## F-statistic:   409 on 1 and 37 DF,  p-value: < 2.2e-16
## Analysis of Variance Table
## 
## Response: y
##           Df Sum Sq Mean Sq F value    Pr(>F)
## x          1  43256   43256  409.03 < 2.2e-16
## Residuals 37   3913     106
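
The code that produced this output is not shown. It is consistent with a call of the form `lm(y ~ x, data = EpFa.data)` followed by `summary()` and `anova()`, although the exact call is an assumption. Because `x` has replicated values, the residual sum of squares can be split into pure error and lack of fit by comparing the candidate model with a saturated model that fits one mean per level of `x`. A minimal sketch, with a hypothetical helper `lack_of_fit` and simulated stand-in data (not the EpFa data):

```r
# Hypothetical helper (not from the source): lack-of-fit F test.
# Assumes the data frame has columns named x and y, with replicated x values.
lack_of_fit <- function(formula, data) {
  reduced <- lm(formula, data = data)         # candidate model
  full    <- lm(y ~ factor(x), data = data)   # one mean per level of x
  anova(reduced, full)                        # lack of fit tested against pure error
}

# Simulated stand-in data with a curved mean, four replicates per x level.
set.seed(1)
sim <- data.frame(x = rep(1:10, each = 4))
sim$y <- 4 + 1.2 * sim$x^2 + rnorm(nrow(sim), sd = 6)

lof1 <- lack_of_fit(y ~ x, sim)
lof1
```

In the resulting table, the second row gives the lack-of-fit degrees of freedom, sum of squares, and F statistic; the residual line of the saturated model is the pure error. With the data above, the corresponding call would be `lack_of_fit(y ~ x, EpFa.data)`.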

Quadratic model fit

## 
## Call:
## lm(formula = y ~ x + I(x^2))
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -12.6446  -2.4755   0.7354   3.2211  13.5258 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   3.7174     3.7326   0.996    0.326
## x            -1.0761     1.5490  -0.695    0.492
## I(x^2)        1.1940     0.1378   8.665 2.47e-10
## 
## Residual standard error: 5.935 on 36 degrees of freedom
## Multiple R-squared:  0.9731, Adjusted R-squared:  0.9716 
## F-statistic: 651.5 on 2 and 36 DF,  p-value: < 2.2e-16
## Analysis of Variance Table
## 
## Response: y
##           Df Sum Sq Mean Sq  F value    Pr(>F)
## x          1  43256   43256 1228.010 < 2.2e-16
## I(x^2)     1   2645    2645   75.084 2.467e-10
## Residuals 36   1268      35
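
Two relationships are visible in the printed output. The sum of squares for `I(x^2)`, 2645, equals the drop in residual sum of squares from the linear fit (3913 - 1268), and its F value, 75.084, is the square of the t value for `I(x^2)` (8.665^2 ≈ 75.08). The sketch below reproduces both the nested comparison and that identity on simulated stand-in data (not the EpFa data); the object names are hypothetical.

```r
# Simulated stand-in data (not the EpFa data), four replicates per x level.
set.seed(2)
sim <- data.frame(x = rep(1:10, each = 4))
sim$y <- 4 + 1.2 * sim$x^2 + rnorm(nrow(sim), sd = 6)

fit_lin  <- lm(y ~ x, data = sim)            # straight line
fit_quad <- lm(y ~ x + I(x^2), data = sim)   # adds the quadratic term
fit_sat  <- lm(y ~ factor(x), data = sim)    # one mean per x level (pure error)

cmp_quad <- anova(fit_lin, fit_quad)   # extra sum of squares for the quadratic term
lof_quad <- anova(fit_quad, fit_sat)   # lack of fit of the quadratic model
cmp_quad
lof_quad

# The F statistic for the single added term equals the square of its t statistic.
f_quad <- cmp_quad$F[2]
t_quad <- coef(summary(fit_quad))["I(x^2)", "t value"]
all.equal(f_quad, t_quad^2)
```

With the data above, the same comparisons would use `EpFa.data` in place of `sim`. A non-significant F in `lof_quad` would indicate no evidence of lack of fit for the quadratic model.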

Residual analysis