When you’re learning how to code, don’t take notes. I know some people who handwrite their code, but that’s because they already know how to code. Try your best to listen instead. To learn how to code, you need to watch coding being done and then practice coding for yourself.
This is regular text.
Here’s a code block.
# CODE BLOCK
"This is a string, but do not try and write your code in quotes like this!"
## [1] "This is a string, but do not try and write your code in quotes like this!"
hi = 4
hi <- 4 # MORE "R-LIKE"
hi<-4
hi=4
a <- 1
b <- 2
# TRUE OR FALSE?
a < b
## [1] TRUE
a > b
## [1] FALSE
a <= b
## [1] TRUE
a >= b
## [1] FALSE
a == b
## [1] FALSE
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
You can run a chunk using the green arrow, the Run dropdown above, or Control/Command+Enter.
CS_data <- read.csv("data/Cesarean.csv")
You need to check whether an object is in your environment or not before you call upon it. It should be instinctual for you to load in your data (i.e. tell your computer what you want to act upon) before you start telling R to use it.
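One way to do that check, as a minimal sketch: base R's exists() tells you whether a name is defined, and ls() lists the names currently in your environment. The object name toy_data is hypothetical and only there for illustration.

```r
# toy_data is a made-up name, used only for illustration
before <- exists("toy_data")      # FALSE in a fresh session, before it is created

toy_data <- data.frame(x = 1:3)   # create the object
after <- exists("toy_data")       # TRUE once the object exists

# ls() returns the names of the objects in your environment
"toy_data" %in% ls()
```

If the check comes back FALSE, run the line that creates or loads the object first, then try again.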
A dataframe in R is what you would expect a “table” to be: each row is an observation and each column is a variable. (Be careful, though: don’t call them tables!)
class(CS_data)
## [1] "data.frame"
#View(CS_data)
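To see what a dataframe looks like from the inside, here is a small sketch that builds one by hand. The names toy_df, country, and births, and all of the values, are made up for illustration; they are not part of the class data.

```r
# A tiny made-up dataframe: each row is one observation, each column is one variable
toy_df <- data.frame(
  country = c("A", "B", "C"),
  births  = c(10, 20, 30)
)

class(toy_df)   # what kind of object is this?
dim(toy_df)     # number of rows, then number of columns
str(toy_df)     # column names and types at a glance
toy_df$births   # pull out a single column with $
```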
To easily manage dataframes in R, we can use a library called dplyr. A library in R is like an app on your phone. Your phone has the capability of being extremely useful in many cases. A phone app unlocks its ability to manage your weight loss or track your menstrual period, making your life easier. Likewise, a library in R was written by a developer for the community to use and to make computing in R easier.
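To stretch the phone analogy a little: installing a library is like downloading the app (once per computer), and calling library() is like opening it (once per R session). The sketch below keeps both of those lines commented out and only checks whether dplyr is installed; the variable name dplyr_installed is just for illustration.

```r
# install.packages("dplyr")   # download: run once per computer (commented out on purpose)
# library(dplyr)              # open: run once per R session

# requireNamespace() returns TRUE if the package is installed, without attaching it
dplyr_installed <- requireNamespace("dplyr", quietly = TRUE)
dplyr_installed
```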
# EXAMPLE FROM LAB
head(CS_data %>% mutate(CS_rate_100 = CS_rate * 100))
##           Country_Name CountryCode Births_Per_1000         Income_Group
## 1              Albania         ALB              46  Upper middle income
## 2              Andorra         AND               1 High income: nonOECD
## 3 United Arab Emirates         ARE              63 High income: nonOECD
## 4            Argentina         ARG             689 High income: nonOECD
## 5              Armenia         ARM              47  Lower middle income
## 6            Australia         AUS             267    High income: OECD
##                       Region  GDP_2006 CS_rate CS_rate_100
## 1      Europe & Central Asia  3051.768   0.256        25.6
## 2      Europe & Central Asia 42417.229   0.237        23.7
## 3 Middle East & North Africa 42950.101   0.100        10.0
## 4  Latin America & Caribbean  6649.414   0.352        35.2
## 5      Europe & Central Asia  2126.619   0.141        14.1
## 6        East Asia & Pacific 36100.559   0.303        30.3
CS_data <- CS_data %>% mutate(CS_rate_100 = CS_rate * 100)
# ANOTHER HELPFUL(?) EXAMPLE
CS_data_new <- CS_data %>% mutate(CS_logical_check = CS_rate < median(CS_rate))
CS_data <- CS_data %>% mutate(CS_logical_check = CS_rate < median(CS_rate))
# ANOTHA ONE
CS_data <- CS_data %>% rename(CS_rate_below_median = CS_logical_check)
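If the three steps above went by quickly, here they are again on a dataframe small enough to check by eye. This is only a sketch: toy, id, rate, and the values are made up, and it assumes dplyr is installed.

```r
library(dplyr)

# Made-up data, only for illustration
toy <- data.frame(id = 1:4, rate = c(0.10, 0.20, 0.30, 0.40))

toy <- toy %>%
  mutate(rate_100 = rate * 100) %>%         # new column computed from an existing one
  mutate(check = rate < median(rate)) %>%   # TRUE/FALSE for each row
  rename(below_median = check)              # rename() takes new_name = old_name

toy
```

Nothing changes in toy until the result is assigned back with <-, which is why the lab example prints with head() first and only then overwrites CS_data.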
We can totally plot stuff in base R. Check this out.
# BASE R PLOT
plot(CS_data$Births_Per_1000, CS_data$CS_rate)
# TRYING TO MAKE IT PRETTY
plot(CS_data$Births_Per_1000, CS_data$CS_rate, xlab="Births Per 1000", ylab="Cesarean Section Rate", main="Cesarean Rates vs. Births Per 1000", pch=19, col="blue")
You can certainly visualize data like this; it’s up to you. We’re opting for ggplot2 in this class, though. We’re going to start with …
?ggplot
We’re going to make the same scatterplot above… Recall that you’re plotting two different numeric variables.
this_plot = ggplot(data=CS_data, aes(x=Births_Per_1000, y=CS_rate))
this_plot
All you need …
this_plot + geom_point()
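The two steps above, sketched on made-up data: ggplot() on its own only sets up the data and the axes, and nothing is drawn until a geom layer is added with +. The names toy, base_plot, and with_points are hypothetical, and the sketch assumes ggplot2 is installed.

```r
library(ggplot2)

# Made-up data, only for illustration
toy <- data.frame(x = c(1, 2, 3), y = c(2, 4, 3))

# Step 1: data and aesthetics only; printing this shows empty axes
base_plot <- ggplot(data = toy, aes(x = x, y = y))

# Step 2: add a layer of points on top of the same base
with_points <- base_plot + geom_point()
with_points
```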
But let’s add some fun…
this_plot = ggplot(CS_data, aes(x=Births_Per_1000, y=CS_rate)) +
geom_point(col=alpha("pink2", 0.6)) +
ggtitle("Here's my title") +
xlab("X-Lab") +
ylab("Y-lab")
this_plot
Let’s start basic.
ggplot(CS_data, aes(x=Income_Group)) + geom_bar()
But what do you think is the reason we visualize data?
ggplot(CS_data, aes(x=Income_Group)) + geom_bar(aes(fill=Region)) + xlab("Income Group") + ggtitle("Income Groups per Country") + theme(axis.text.x = element_text(angle = 20, hjust = 1))
ggplot(CS_data, aes(x=Region)) + geom_bar(aes(fill=Income_Group)) + xlab("Region") + ggtitle("Income Groups per Country") + theme(axis.text.x = element_text(angle = 20, hjust = 1))
You don’t want to confuse your audience. You want to educate. You want to tell the best story you can. Use color. Label your axes. Think wisely about how to organize your data. It all makes a difference.
You are learning dplyr and ggplot2! One is for manipulating data. The other is for data viz.