Chapter 4 Missing values
We use the adjusted closing price to compare the behavior of traditional stocks and cryptocurrencies.
Traditional stocks trade only on exchange trading days, so their OHLCV data are missing on non-trading dates, whereas cryptocurrencies trade every day. We fill these gaps with the most recent available values.
We examine the issue using the daily adjusted closing prices of Bitcoin and the NASDAQ 100 from 2016 to 2021. A sample of the original data is shown below.
## BTC.USD.Adjusted NDX.Adjusted
## 2015-12-31 430.566986 4593.270020
## 2016-01-01 434.334015 NA
## 2016-01-02 433.437988 NA
## 2016-01-03 430.010986 NA
## 2016-01-04 433.091003 4497.859863
Here we plot the missing values for the year 2016.
Only NDX.Adjusted contains missing values, and the proportion of rows with a missing value is approximately 30%, which is consistent with the proportion of non-trading days in a year (1 - 252/365 ≈ 0.31).
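The proportion check above can be sketched in a few lines of R. This is only an illustration: `sample_df` is a hypothetical data frame rebuilt from the five sample rows shown earlier, not the full data set, so its missing share (3 of 5 rows) is much higher than the yearly figure. The 252 trading days per year is the same rough assumption used in the text.

```r
# Hypothetical data frame rebuilt from the five sample rows shown above;
# NA marks a date on which NDX did not trade.
sample_df <- data.frame(
  BTC.USD.Adjusted = c(430.566986, 434.334015, 433.437988, 430.010986, 433.091003),
  NDX.Adjusted     = c(4593.270020, NA, NA, NA, 4497.859863),
  row.names        = c("2015-12-31", "2016-01-01", "2016-01-02",
                       "2016-01-03", "2016-01-04")
)

# Share of missing values per column: the mean of a logical column
# is the proportion of TRUE entries.
missing_share <- colMeans(is.na(sample_df))
print(missing_share)

# Expected share of non-trading days, assuming about 252 trading days
# in a 365-day year.
expected_share <- 1 - 252 / 365
print(round(expected_share, 3))
```

Running the same `colMeans(is.na(...))` call on the rows for 2016 should give a value for the NDX column close to `expected_share`.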
We fill the missing values by carrying the last non-missing observation forward (LOCF).
library(data.table)  # provides setnafill()
df_no_missing = setnafill(df_2, type = 'locf')  # carry the last non-missing value forward
head(df_no_missing)
## BTC.USD.Adjusted NDX.Adjusted
## 2015-12-31 430.566986 4593.270020
## 2016-01-01 434.334015 4593.270020
## 2016-01-02 433.437988 4593.270020
## 2016-01-03 430.010986 4593.270020
## 2016-01-04 433.091003 4497.859863
## 2016-01-05 431.959991 4484.180176
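To make the fill rule concrete, here is a minimal base-R sketch of last observation carried forward, applied to the NDX values from the sample rows. The helper `locf()` is a hypothetical name introduced for illustration only. It is not how `setnafill()` is implemented, just a small function that follows the same rule on a single numeric vector.

```r
# Hypothetical helper: last observation carried forward for one vector.
locf <- function(x) {
  observed <- x[!is.na(x)]     # the non-missing values, in order
  idx <- cumsum(!is.na(x))     # position of the latest non-missing value so far
  idx[idx == 0] <- NA          # nothing to carry forward before the first value
  observed[idx]
}

# NDX values from the sample rows (2015-12-31 to 2016-01-04).
ndx <- c(4593.270020, NA, NA, NA, 4497.859863)
ndx_filled <- locf(ndx)
print(ndx_filled)
```

The three missing values from 2016-01-01 to 2016-01-03 take the 2015-12-31 price, which matches the filled table above. In this sketch, a missing value at the very start of the series stays missing, because there is no earlier observation to carry forward.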