Chapter 3 Data transformation
In the market cap data, the Market Cap variable is stored as a character string, so we need to convert it to numeric type.
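library(stringr)  # provides str_replace_all()
Market_cap = crypto_market_cap$`Market Cap`
# Drop the first character (the non-numeric prefix); nchar() gives each string's length
Market_cap = substr(Market_cap,2,nchar(Market_cap))
# Remove the thousands separators
Market_cap = str_replace_all(Market_cap,',','')
# Print up to 10 significant digits so large values are not abbreviated
options(digits=10)
crypto_market_cap$`Market Cap` = as.numeric(Market_cap)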
head(crypto_market_cap,n=4)
## # A tibble: 4 × 3
##   Symbol `Market Cap`  year
##   <chr>         <dbl> <int>
## 1 BTC    15492555878.  2016
## 2 ETH      696993350.  2016
## 3 XRP      234334890.  2016
## 4 LTC      212503031.  2016
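The same conversion can be sketched in base R with a single regular expression. This is only an illustration: the sample strings in raw_cap are hypothetical, and we assume the raw values look like "$15,492,555,878" (a dollar sign followed by comma-separated digits), which the original data may or may not match exactly.

```r
# Hypothetical sample values, assumed to be formatted like the raw Market Cap column
raw_cap <- c("$15,492,555,878", "$696,993,350", "$234,334,890")

# Inside a character class, "$" and "," are literal characters,
# so this removes every dollar sign and thousands separator
clean_cap <- as.numeric(gsub("[$,]", "", raw_cap))

print(clean_cap, digits = 12)
```

Unlike the substr() approach, this version does not depend on the prefix being exactly one character long. If the readr package is available, readr::parse_number() is another option for this kind of cleanup.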
We mainly want the three cryptocurrencies with the largest market cap in each year, so we build a tidy version of the data that keeps those three and aggregates all remaining cryptocurrencies into a single Others category.
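# Sum the 17 smallest coins of each year into one 'Others' row
# (this is everything outside the top three only if a year has 20 coins)
crypto_market_cap_1 = crypto_market_cap %>%
  arrange(`Market Cap`) %>%
  group_by(year) %>%
  slice(1:17) %>%
  group_by(year) %>%
  summarise(`Market Cap` = sum(`Market Cap`)) %>%
  mutate(Symbol='Others') %>%
  dplyr::select(Symbol,`Market Cap`,year)
# Keep the three largest coins of each year
crypto_market_cap_2 = crypto_market_cap %>%
  arrange(desc(`Market Cap`)) %>%
  group_by(year) %>%
  slice(1:3)
# Stack both parts and sort by year, largest market cap first
crypto_market_cap_tidy = rbind(crypto_market_cap_2,crypto_market_cap_1) %>%
  arrange(year,desc(`Market Cap`))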
head(crypto_market_cap_tidy)
## # A tibble: 6 × 3
## # Groups:   year [2]
##   Symbol  `Market Cap`  year
##   <chr>          <dbl> <int>
## 1 BTC      15492555878.  2016
## 2 Others     942262021.  2016
## 3 ETH        696993350.  2016
## 4 XRP        234334890.  2016
## 5 BTC     237466518547.  2017
## 6 Others  146690653305.  2017
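The "top coins plus Others" step can also be written without hard-coding how many rows fall into Others. The sketch below is an alternative to the code above, not the method used in this chapter: it ranks coins within each year and relabels everything outside the top ranks. The toy data frame and the top_n_coins parameter are made up for illustration.

```r
library(dplyr)

# Hypothetical toy data: 5 coins in each of 2 years
toy <- tibble(
  Symbol = rep(c("A", "B", "C", "D", "E"), times = 2),
  `Market Cap` = c(50, 40, 30, 20, 5, 500, 100, 400, 200, 350),
  year = rep(c(2016L, 2017L), each = 5)
)

top_n_coins <- 3  # assumed parameter: how many coins to keep by name

toy_tidy <- toy %>%
  group_by(year) %>%
  # Rank within each year; rank 1 is the largest market cap
  mutate(cap_rank = min_rank(desc(`Market Cap`))) %>%
  ungroup() %>%
  # Keep the symbol of the top coins and relabel the rest as 'Others'
  mutate(Symbol = if_else(cap_rank <= top_n_coins, Symbol, "Others")) %>%
  # Summing collapses all 'Others' rows of a year into one row
  group_by(year, Symbol) %>%
  summarise(`Market Cap` = sum(`Market Cap`), .groups = "drop") %>%
  arrange(year, desc(`Market Cap`))

toy_tidy
```

Because the grouping is based on rank rather than on a fixed slice such as 1:17, this version still gives the intended result when a year has more or fewer coins than expected. It also returns an ungrouped tibble, which avoids carrying the year grouping into later steps.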