Chapter 4 Missing values

We use the adjusted closing price to compare the behavior of traditional stocks and cryptocurrencies.

Traditional stocks trade only on exchange trading days, whereas cryptocurrencies trade every day. As a result, OHLCV data for stocks are missing on non-trading dates, and we fill these gaps with the most recent previous observation.

We examine the problem using the daily adjusted closing prices of Bitcoin and the NASDAQ 100 from 2016 to 2021. Here is a sample of our original data.

##            BTC.USD.Adjusted NDX.Adjusted
## 2015-12-31       430.566986  4593.270020
## 2016-01-01       434.334015           NA
## 2016-01-02       433.437988           NA
## 2016-01-03       430.010986           NA
## 2016-01-04       433.091003  4497.859863

Here we plot the missing values for the year 2016.

Only NDX.Adjusted contains missing values. Approximately 30% of its rows are missing, which is consistent with the proportion of non-trading days in a year (1 - 252/365 ≈ 0.31).
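The proportion check above can be sketched in a few lines of R. This is only an illustration: `sample_df` is a hypothetical toy data frame rebuilt from the five sample rows shown earlier, not the full data set, and the figure of 252 trading days per year is the same approximation used in the text.

```r
# Toy data copied from the five sample rows shown earlier (not the full data set)
sample_df <- data.frame(
  BTC.USD.Adjusted = c(430.566986, 434.334015, 433.437988, 430.010986, 433.091003),
  NDX.Adjusted     = c(4593.270020, NA, NA, NA, 4497.859863),
  row.names        = c("2015-12-31", "2016-01-01", "2016-01-02", "2016-01-03", "2016-01-04")
)

# Share of missing rows in each column
missing_share <- colMeans(is.na(sample_df))
print(missing_share)

# Expected share of non-trading days, assuming about 252 trading days per year
expected_share <- 1 - 252 / 365
print(round(expected_share, 3))
```

On these five rows the NDX share is far above 30% because the sample happens to span a holiday followed by a weekend. Applied to a full year of data, the same `colMeans(is.na(...))` call should land close to the expected share.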

We fill the missing values using last observation carried forward (LOCF): each missing value is replaced by the most recent non-missing value before it.

# LOCF fill; note that data.table::setnafill() also modifies df_2 in place
df_no_missing <- setnafill(df_2, type = "locf")
head(df_no_missing)
##            BTC.USD.Adjusted NDX.Adjusted
## 2015-12-31       430.566986  4593.270020
## 2016-01-01       434.334015  4593.270020
## 2016-01-02       433.437988  4593.270020
## 2016-01-03       430.010986  4593.270020
## 2016-01-04       433.091003  4497.859863
## 2016-01-05       431.959991  4484.180176
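To make the mechanics of LOCF concrete, here is a minimal base R sketch of the idea. It is not how `setnafill` is implemented; `locf_fill` is a hypothetical helper written for illustration, and `ndx` simply reuses the NDX values from the sample rows.

```r
# Hypothetical helper: carry the last non-missing value forward in a numeric vector
locf_fill <- function(x) {
  seen <- !is.na(x)
  # cumsum(seen) gives, for each position, how many non-missing values have
  # appeared so far, i.e. the index of the most recent one among x[seen].
  # A count of 0 means nothing has been seen yet, so NA is prepended for it.
  c(NA, x[seen])[cumsum(seen) + 1]
}

# NDX values from the sample rows (2015-12-31 to 2016-01-04)
ndx <- c(4593.270020, NA, NA, NA, 4497.859863)
filled <- locf_fill(ndx)
print(filled)
```

The three missing days take the 2015-12-31 value, matching the filled output above. A leading missing value has nothing to carry forward, so it stays `NA` in this sketch.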