Chapter 3 Data transformation

In the market cap data, the Market Cap variable is stored as character strings with a leading symbol and comma separators, so we strip those characters and convert the column to numeric type.

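Market_cap = crypto_market_cap$`Market Cap`
# Drop the first character of each string; nchar() gives each string's own
# length, whereas length() would return the number of rows in the column
Market_cap = substr(Market_cap,2,nchar(Market_cap))
# Remove the comma thousands separators
Market_cap = str_replace_all(Market_cap,',','')
options(digits=10)
crypto_market_cap$`Market Cap` = as.numeric(Market_cap)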
head(crypto_market_cap,n=4)
## # A tibble: 4 × 3
##   Symbol `Market Cap`  year
##   <chr>         <dbl> <int>
## 1 BTC    15492555878.  2016
## 2 ETH      696993350.  2016
## 3 XRP      234334890.  2016
## 4 LTC      212503031.  2016
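
The same cleaning step can be sketched in base R with a single regular expression. This is only an illustration: the raw strings below are hypothetical, and the "$1,234" format is an assumption inferred from the code above, which drops the first character and the commas.

```r
# Hypothetical raw values, assumed to look like "$1,234,567"
raw = c("$15,492,555,878", "$696,993,350", "$234,334,890")

# Remove every "$" and "," in one pass, then convert to numeric
cap = as.numeric(gsub("[$,]", "", raw))

# as.numeric() returns NA (with a warning) for anything it cannot parse,
# so an explicit check catches unexpected formats early
stopifnot(!anyNA(cap))
```

Unlike dropping the first character by position, the pattern only removes the characters it names, so a value without a leading symbol would not lose its first digit. If readr is available, readr::parse_number() is another option for strings of this kind.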

We mainly want the three cryptocurrencies with the largest market cap in each year, so we build a tidy version of the data that keeps those three and aggregates the remaining cryptocurrencies into a single Others row per year.

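# "Others": sum of the 17 smallest market caps in each year.
# Together with the top three below, this covers a whole year only if
# each year has exactly 20 rows.
crypto_market_cap_1 = crypto_market_cap %>%
  arrange(`Market Cap`) %>%
  group_by(year) %>%
  slice(1:17) %>%
  summarise(`Market Cap` = sum(`Market Cap`)) %>%
  mutate(Symbol='Others') %>%
  dplyr::select(Symbol,`Market Cap`,year)
# The three largest market caps in each year
crypto_market_cap_2 = crypto_market_cap %>%
  arrange(desc(`Market Cap`)) %>%
  group_by(year) %>%
  slice(1:3)
# Stack both parts and sort within year by market cap
crypto_market_cap_tidy = rbind(crypto_market_cap_2,crypto_market_cap_1) %>%
  arrange(year,desc(`Market Cap`))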
head(crypto_market_cap_tidy)
## # A tibble: 6 × 3
## # Groups:   year [2]
##   Symbol  `Market Cap`  year
##   <chr>          <dbl> <int>
## 1 BTC     15492555878.  2016
## 2 Others    942262021.  2016
## 3 ETH       696993350.  2016
## 4 XRP       234334890.  2016
## 5 BTC    237466518547.  2017
## 6 Others 146690653305.  2017
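
The "keep the top few, lump the rest" step can also be written without hard-coding how many rows fall into Others. The following is a minimal sketch on toy data, not the chapter's method: the values, the name `toy`, and the parameter `n_keep` are all hypothetical.

```r
library(dplyr)

# Toy data with made-up values (five symbols per year)
toy = tibble(
  Symbol = rep(c("A", "B", "C", "D", "E"), times = 2),
  `Market Cap` = c(50, 40, 30, 15, 10, 5, 25, 15, 35, 45),
  year = rep(c(2016L, 2017L), each = 5)
)

n_keep = 3  # hypothetical parameter: symbols to keep per year

toy_tidy = toy %>%
  group_by(year) %>%
  # Rank within each year (1 = largest); relabel everything below n_keep
  mutate(Symbol = if_else(row_number(desc(`Market Cap`)) <= n_keep,
                          Symbol, "Others")) %>%
  # Kept symbols form one-row groups; "Others" rows are summed
  group_by(year, Symbol) %>%
  summarise(`Market Cap` = sum(`Market Cap`), .groups = "drop") %>%
  arrange(year, desc(`Market Cap`))
```

Because the ranking is computed inside each year, this version gives the same kind of result whether a year has 20 rows or some other number, and it returns an ungrouped tibble.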