Abstract
After the boom and bust of cryptocurrency prices in recent years, Bitcoin has been increasingly regarded as an investment asset. Because of its highly volatile nature, there is a need for good predictions on which to base investment decisions. Although existing studies have leveraged machine learning for more accurate Bitcoin price prediction, few have focused on the feasibility of applying different modeling techniques to samples with different data structures and dimensional features. To predict Bitcoin price at different frequencies using machine learning techniques, we first classify Bitcoin price by daily price and high-frequency price. A set of high-dimensional features, including property and network, trading and market, attention and gold spot price, is used for Bitcoin daily price prediction, while the basic trading features acquired from a cryptocurrency exchange are used for 5-minute interval price prediction. Statistical methods, including Logistic Regression and Linear Discriminant Analysis, achieve an accuracy of 66% for Bitcoin daily price prediction with high-dimensional features, outperforming more complicated machine learning algorithms. Compared with benchmark results for daily price prediction, we achieve better performance, with the highest accuracies of the statistical methods and machine learning algorithms being 66% and 65.3%, respectively. For Bitcoin 5-minute interval price prediction, machine learning models, including Random Forest, XGBoost, Quadratic Discriminant Analysis, Support Vector Machine and Long Short-Term Memory, are superior to statistical methods, with accuracy reaching 67.2%. Our investigation of Bitcoin price prediction can be considered a pilot study of the importance of the sample dimension in machine learning techniques.
1. Introduction
Bitcoin, invented in 2008 to solve the inherent weakness of the trust-based model of transactions and initially defined as a purely peer-to-peer electronic cash system [1], has become an asset or commodity-like product traded in more than 16,000 markets around the world. Although proponents hold that one of Bitcoin’s important applications is to replace fiat currency, the true nature of Bitcoin remains a vexing problem. Investors do not treat Bitcoin as a currency according to the criteria used by economists; instead, they regard Bitcoin as a speculative investment similar to the Internet stocks of the last century [2]. Before Bitcoin disrupted existing payment and monetary systems, its several years of trading and increasing popularity attracted attention from across society, including from policymakers, and Bitcoin’s market capitalization peaked in 2017 at 300 billion US dollars, almost equal to that of Amazon in 2016.
6. Conclusion and discussion
In this study, we investigated machine learning techniques for Bitcoin price prediction, selected according to the granularity and feature dimension of the sample. While most previous works simply leverage machine learning algorithms in Bitcoin price prediction, we show that the sample’s granularity and feature dimensions should be considered. The Bitcoin aggregated daily price, acquired from CoinMarketCap, facilitates the inclusion of high-dimensional features, including property and network, trading and market, attention and gold spot price. For the Bitcoin 5-minute interval trading price, we use the basic trading features acquired from the Binance exchange. Based on Occam’s razor and the paradigms applied in practical prediction problems using machine learning algorithms, we adopted statistical methods for Bitcoin daily price prediction and machine learning models for Bitcoin 5-minute interval price prediction. The results show that the statistical methods perform better for low-frequency data with high-dimensional features, while the machine learning models outperform statistical methods for high-frequency data. Most of our results also outperform the benchmark results of other machine learning algorithms. We envision that our approach to sample dimension engineering using machine learning models for prediction can be applied to other areas that have similar characteristics to Bitcoin.
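As an illustration of the kind of evaluation summarized above, the following minimal sketch frames price prediction as a binary classification of the next price move and scores a logistic regression by out-of-sample accuracy. It is not the implementation used in this study. It assumes that the target is the sign of the next return (suggested by, but not stated with, the reported accuracies), and the synthetic price series, the three lagged-return features, the 80/20 chronological split and all function names are hypothetical choices made for illustration. Only the Python standard library is used; in practice a library implementation of the classifiers would normally be preferred.

```python
import math
import random
import statistics


def make_synthetic_prices(n, seed=0):
    """Synthetic price path with mild momentum (NOT real Bitcoin data)."""
    rng = random.Random(seed)
    prices, prev_ret = [100.0], 0.0
    for _ in range(n - 1):
        ret = 0.3 * prev_ret + rng.gauss(0.0, 0.01)
        prices.append(prices[-1] * math.exp(ret))
        prev_ret = ret
    return prices


def build_dataset(prices, n_lags=3):
    """Features: last n_lags log returns. Label: 1 if the next return is > 0."""
    rets = [math.log(prices[i + 1] / prices[i]) for i in range(len(prices) - 1)]
    X, y = [], []
    for t in range(n_lags, len(rets)):
        X.append(rets[t - n_lags:t])
        y.append(1 if rets[t] > 0 else 0)
    return X, y


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=200):
    """Logistic regression fitted by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def predict(w, b, X):
    """Predict 1 (up) when the linear score is positive, else 0 (down)."""
    return [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0 for xi in X]


def accuracy(y_true, y_pred):
    return sum(int(a == p) for a, p in zip(y_true, y_pred)) / len(y_true)


prices = make_synthetic_prices(2000)
X, y = build_dataset(prices)

# Chronological split: the test set lies strictly after the training set.
split = int(0.8 * len(X))
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Standardize with training statistics only, to avoid look-ahead leakage.
cols = list(zip(*X_train))
mu = [statistics.mean(c) for c in cols]
sd = [statistics.pstdev(c) for c in cols]


def standardize(rows):
    return [[(v - m) / s for v, m, s in zip(row, mu, sd)] for row in rows]


w, b = fit_logistic(standardize(X_train), y_train)
acc = accuracy(y_test, predict(w, b, standardize(X_test)))

# Majority-class baseline: accuracy of always predicting the more common label.
up_share = statistics.mean(y_test)
baseline = max(up_share, 1.0 - up_share)

print(f"test accuracy: {acc:.3f}  majority baseline: {baseline:.3f}")
```

The chronological split and the training-only standardization are the design choices that matter here: shuffling time-ordered samples or scaling with full-sample statistics would leak future information into the model and overstate accuracy. The same scaffold could be reused to compare a more flexible classifier against the logistic regression on data of a different frequency.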