Module #8: Correlation Analysis and ggplot2

library(ggplot2)

data("mtcars")

# Regression of mpg (response) on disp (predictor)
reg <- lm(mpg ~ disp, data = mtcars)

summary(reg)

Call:
lm(formula = mpg ~ disp, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.8922 -2.2022 -0.9631  1.6272  7.2305 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 29.599855   1.229720  24.070  < 2e-16 ***
disp        -0.041215   0.004712  -8.747 9.38e-10 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.251 on 30 degrees of freedom
Multiple R-squared:  0.7183, Adjusted R-squared:  0.709 
F-statistic: 76.51 on 1 and 30 DF,  p-value: 9.38e-10

# Scatterplot with a fitted line; disp goes on the x-axis so the plot matches lm(mpg ~ disp)
ggplot(mtcars, aes(x = disp, y = mpg)) + geom_point() + stat_smooth(method = "lm", col = "hotpink")




For this visualization, I used the popular mtcars dataset and ran a regression of mpg on disp. The scatterplot shows that the two variables have a negative relationship: as displacement increases, fuel economy drops. The summary says the same thing, with a slope of about -0.041 mpg per cubic inch of displacement and an R-squared of 0.7183. The graph makes this conclusion much easier to reach for readers who may not be comfortable with the numbers in the summary. I believe this is the best way to visualize the result because it is easy to understand, and no key is needed to see how the two variables relate.
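
Since this module is about correlation, it may help to put a number on the negative relationship. The snippet below is a minimal sketch, not part of my original analysis: it computes the Pearson correlation between mpg and disp and checks it against the R-squared from the regression. The variable names `r` and `r_squared` are my own for illustration.

```r
# Correlation between mpg and disp, using the built-in mtcars data
data("mtcars")

# Pearson correlation coefficient (the default method for cor())
r <- cor(mtcars$mpg, mtcars$disp)

# Refit the same one-predictor model used in the regression above
reg <- lm(mpg ~ disp, data = mtcars)

# For a linear model with a single predictor, r^2 equals Multiple R-squared
r_squared <- summary(reg)$r.squared

print(r)          # negative, matching the downward slope
print(r^2)
print(r_squared)

# cor.test() adds a significance test and a confidence interval for r
cor.test(mtcars$mpg, mtcars$disp)
```

The sign of `r` should agree with the sign of the disp coefficient, and squaring it should reproduce the Multiple R-squared reported by `summary(reg)`.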

