Showing posts from February, 2023

Module # 8 Correlation Analysis and ggplot2

library(ggplot2)
data("mtcars")

# regression analysis of mpg to disp
reg <- lm(data = mtcars, mpg ~ disp)
summary(reg)

Call:
lm(formula = mpg ~ disp, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max
-4.8922 -2.2022 -0.9631  1.6272  7.2305

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 29.599855   1.229720  24.070  < 2e-16 ***
disp        -0.041215   0.004712  -8.747 9.38e-10 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.251 on 30 degrees of freedom
Multiple R-squared:  0.7183, Adjusted R-squared:  0.709
F-statistic: 76.51 on 1 and 30 DF,  p-value: 9.38e-10

ggplot(mtcars, aes(x=mpg, y=disp)) + geom_point() + stat_smooth(method = "lm", col = "hotpink")

For t...
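Since this module is about correlation, one quick sanity check on the regression output is to compute the Pearson correlation between mpg and disp directly. For a simple linear regression with one predictor, the squared correlation should equal the Multiple R-squared reported by summary(). The following is a minimal sketch in base R; the variable names r and r2 are only illustrative.

```r
data("mtcars")

# Pearson correlation between miles per gallon and displacement
r <- cor(mtcars$mpg, mtcars$disp)

# Refit the same model as above and pull out its R-squared
reg <- lm(mpg ~ disp, data = mtcars)
r2 <- summary(reg)$r.squared

print(r)      # negative: mpg falls as disp rises
print(r^2)    # squared correlation
print(r2)     # Multiple R-squared from the model
```

The sign of r carries the direction of the relationship, which R-squared alone does not show.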

Module # 7 Visual Distribution Analysis

For this assignment, I decided to use the popular mtcars dataset. I chose ggplot2 to create this graph because it makes my plot look much more appealing to the reader. For my visual analysis, I created a scatter plot comparing displacement to miles per gallon. From the graph, we can see that the higher the car's miles per gallon, the lower the displacement. I like the use of a scatter plot in this situation because it shows the linear downward trend of the data.

library(ggplot2)

# scatterplot of displacement and miles per gallon
# (axis labels match the aesthetics: x is mpg, y is disp)
ggplot(mtcars, aes(x = mpg, y = disp)) +
  geom_point() +
  xlab("Miles per gallon") +
  ylab("Displacement")
# shows the higher the mpg the lower the displacement

Module # 6 Visual Differences & Deviation Analysis via R

For this assignment, I decided to take a dataset from kaggle.com of the most common game types among novice chess players. The dataset contains the ratings of both sides, game length, time control, and the Portable Game Notation (PGN) of the entire game. When analyzing the raw data, it can be challenging to understand which type of game is most common. With a simple bar chart, hundreds of rows of data are turned into a simple graphic that anyone can understand.

My basic visualization does fit into Few and Yau’s discussion. My bar graph shows the frequency of each type of chess game, and from it people can conclude that the most popular type of chess match for beginners is rapid. For beginners, this is the best time control because it gives them time to understand why they are making the moves that they are making. Blitz and bullet, by contrast, are much more fast-paced, which makes players more likely to make mistakes.

dataset: https://www.kaggle.com/datasets/tia...
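To illustrate how a frequency bar chart like the one described could be built, here is a rough sketch. The data frame below is invented toy data, not the Kaggle dataset, and the column name time_class is a hypothetical stand-in for whatever the real file calls the game type.

```r
library(ggplot2)

# Toy data, invented for illustration only; not the Kaggle dataset.
# The column name `time_class` is a hypothetical stand-in.
games <- data.frame(
  time_class = c("rapid", "blitz", "rapid", "bullet", "rapid", "blitz")
)

# Count how many games fall into each type
counts <- table(games$time_class)
print(counts)

# geom_bar() counts the rows in each category by default
p <- ggplot(games, aes(x = time_class)) +
  geom_bar() +
  xlab("Game type") +
  ylab("Number of games")
print(p)
```

Letting geom_bar() do the counting keeps the raw rows as the input, so the chart and the table above come from the same data.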

Module # 5 assignment

When creating this visualization, one mistake was including the column containing the row numbers. It's essential to remove this column before creating the graph because it does not need to be plotted. This graph displays how the average position changes over time. From the visualization, we can see that as time increases, the average position increases alongside it. It also seems as if the average position begins to flatten as time increases, which may mean that the average position reaches a peak value before it stagnates. This suggests that if the graph were to continue with time increasing, the average position would stay the same or possibly start decreasing.
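One way to avoid plotting the row-number column is to drop it before building the graph. The sketch below uses invented toy values, and the column names X, time, and avg_position are assumptions, since the original data is not shown here.

```r
# Toy data frame, invented for illustration; `X` mimics the row-number
# column that read.csv() yields when a file was written with row names.
positions <- data.frame(
  X = 1:5,
  time = c(1, 2, 3, 4, 5),
  avg_position = c(2.0, 3.5, 4.4, 4.9, 5.0)
)

# Drop the row-number column by name so it is not plotted
positions_clean <- positions[, setdiff(names(positions), "X")]
print(names(positions_clean))
```

Dropping by name rather than by position keeps the code working even if the column order in the file changes.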