Ethereum Price Prediction Using Machine Learning Methods
Robert Kozub
University of Texas at Dallas
Knowledge Mining
Spring 2022
Introduction
Ethereum is a decentralized virtual machine, available around the world and powered by blockchain technology; its native cryptocurrency is ether (ETH). Proposed in 2013 by Vitalik Buterin, the cryptocurrency has skyrocketed in price from $1.25 to around $2,500 today (May 2022).
The all-time high for ETH was approximately $4,800 in November 2021.
Blockchain
Methods
The following methods were used to create two different algorithms with the goal of accurately predicting Ethereum prices using one year of historic closing price data.
01 Linear Regression
Seeks to understand the relationship between numerical input and output variables.
02 Prophet
The Prophet package, available in R, was developed by Facebook as a machine learning model for time series data. The model is open source and available for anyone to use, and it is designed to forecast data accurately and automatically.
Data Exploration
Figure: Ethereum Closing Prices (4/30/2021-4/30/2022)
Figure: Ethereum Open vs. Closing Prices
Figure: Frequency of ETH Volume
Figure: ETH Closing Price vs. Volume
Figure: Correlation Between All ETH Variables
Linear Regression
Figure: Linear Regression (via analyticsvidhya.com)
To begin, a random seed is set using the function set.seed(). The seed initializes R's pseudorandom number generator, so the same sequence of pseudorandom numbers is produced every time the script is run. The point of this function is reproducibility: anyone rerunning the analysis draws the same random sample.
Random Seed
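As a minimal sketch of the behavior described above, the snippet below sets the same seed twice and checks that the two random draws match. The seed value 123 and the variable names are arbitrary choices for illustration, not values taken from the original analysis.

```r
# The seed value 123 is an arbitrary, illustrative choice.
set.seed(123)
first_draw <- sample(1:100, 5)

# Resetting the same seed replays the same pseudorandom sequence.
set.seed(123)
second_draw <- sample(1:100, 5)

# TRUE: both draws are identical, so the sampling is reproducible.
identical(first_draw, second_draw)
```

Without the second call to set.seed(), the second draw would continue the sequence and would generally differ from the first.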
Next, the data are randomly split into a training dataset and a testing dataset, each stored in its own data frame, using the random seed. This ensures there are two different sets of dates: one for fitting the regression model and one for making and checking predictions.
Training and Testing Datasets
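One way such a split could look is sketched below. The data frame `eth`, its simulated prices, the column names, and the 80/20 split ratio are all assumptions made for illustration; the original slides do not state them.

```r
set.seed(123)  # arbitrary seed so the split is reproducible

# Synthetic stand-in for one year of daily ETH data (not real prices).
n <- 366
eth <- data.frame(
  Date = seq(as.Date("2021-04-30"), by = "day", length.out = n),
  Open = 2500 + cumsum(rnorm(n, sd = 80))
)
eth$Close <- eth$Open + rnorm(n, sd = 100)

# Randomly pick 80% of the rows for training (the ratio is an assumption).
train_idx <- sample(seq_len(nrow(eth)), size = floor(0.8 * nrow(eth)))
train <- eth[train_idx, ]
test  <- eth[-train_idx, ]

# Every row lands in exactly one of the two data frames.
c(train = nrow(train), test = nrow(test))
```

Because the rows are indexed without replacement, no date appears in both data frames.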
This is where the training dataset gets put to work: the regression model is fitted to it with the lm() function, which estimates the linear trend in the data. For this regression model, the opening and closing prices were regressed against each other, and the fitted model is then ready to be passed to the prediction step.
Train Model
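A hedged sketch of the fitting step follows. It assumes the closing price is the response and the opening price is the predictor (`Close ~ Open`); the slides do not state the direction, and the data here are synthetic.

```r
set.seed(123)  # arbitrary seed

# Synthetic stand-in data (not real ETH prices).
n <- 366
eth <- data.frame(Open = 2500 + cumsum(rnorm(n, sd = 80)))
eth$Close <- eth$Open + rnorm(n, sd = 100)

# Assumed 80/20 random split.
train_idx <- sample(seq_len(n), size = floor(0.8 * n))
train <- eth[train_idx, ]

# Fit a simple linear regression on the training rows only.
# The formula Close ~ Open is an assumption about the model direction.
fit <- lm(Close ~ Open, data = train)

# Coefficients: intercept and slope for Open.
coef(fit)
summary(fit)$r.squared
```

Calling summary(fit) in full also prints standard errors and p-values for each coefficient.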
Plot Residuals
Plotted Residuals
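Residuals can be plotted against fitted values to check the fit, roughly as sketched below. The data, seed, and model formula are illustrative assumptions rather than the original code.

```r
set.seed(123)  # arbitrary seed

# Synthetic stand-in data (not real ETH prices).
n <- 366
eth <- data.frame(Open = 2500 + cumsum(rnorm(n, sd = 80)))
eth$Close <- eth$Open + rnorm(n, sd = 100)

fit <- lm(Close ~ Open, data = eth)  # assumed model direction

# Residual = observed closing price minus fitted closing price.
res <- resid(fit)

# Residuals vs. fitted values; a patternless cloud around zero
# suggests the linear form is reasonable.
plot(fitted(fit), res,
     xlab = "Fitted closing price", ylab = "Residual",
     main = "Residuals vs. fitted values")
abline(h = 0, lty = 2)
```

A funnel shape or a curve in this plot would hint at non-constant variance or a non-linear relationship.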
Prediction
The fitted regression model and the testing dataset are then passed to the predict() function in R. This function returns the values the model predicts for the input data, here the rows of the testing dataset.
Prediction
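The prediction step might be sketched as follows. As before, the synthetic data, the 80/20 split, and the `Close ~ Open` formula are assumptions for illustration only.

```r
set.seed(123)  # arbitrary seed

# Synthetic stand-in data (not real ETH prices).
n <- 366
eth <- data.frame(Open = 2500 + cumsum(rnorm(n, sd = 80)))
eth$Close <- eth$Open + rnorm(n, sd = 100)

# Assumed 80/20 random split.
train_idx <- sample(seq_len(n), size = floor(0.8 * n))
train <- eth[train_idx, ]
test  <- eth[-train_idx, ]

fit <- lm(Close ~ Open, data = train)

# predict() applies the fitted coefficients to the unseen test rows.
pred <- predict(fit, newdata = test)

# Compare the first few predictions with the actual closing prices.
head(data.frame(actual = test$Close, predicted = pred))
```

Passing `newdata` is what makes this an out-of-sample prediction; without it, predict() returns the fitted values for the training rows.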
Results
Mean Squared Error (MSE): 18205.93
Root Mean Squared Error (RMSE): 134.9294
MSE and RMSE
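Both metrics follow directly from the prediction errors. The sketch below uses three made-up price pairs (not the original data) to show the arithmetic, then checks that the reported RMSE is the square root of the reported MSE.

```r
# Hypothetical actual and predicted prices, chosen for easy arithmetic.
actual    <- c(2500, 2600, 2400)
predicted <- c(2450, 2650, 2500)

# MSE: the mean of the squared errors (50^2, 50^2, 100^2).
mse <- mean((actual - predicted)^2)   # 5000

# RMSE: the square root of MSE, in the same units as the price.
rmse <- sqrt(mse)

# The values reported in the results are consistent with each other.
reported_rmse <- sqrt(18205.93)
round(reported_rmse, 4)
```

Because RMSE is in dollars, the reported value means the predictions were off by roughly $135 on a typical test day.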
Price Prediction Utilizing Prophet
Setting Up Data
To begin with Prophet, the appropriate packages must be loaded and the data must be read into R.
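Prophet expects a data frame with a date column named `ds` and a value column named `y`. The sketch below builds one from synthetic data; in practice the prices would come from a file (for example via read.csv() with a file name of your own), and the `eth` data frame and its column names here are assumptions.

```r
set.seed(123)  # arbitrary seed

# Synthetic stand-in for one year of daily closing prices (not real data).
n <- 366
eth <- data.frame(
  Date  = seq(as.Date("2021-04-30"), by = "day", length.out = n),
  Close = 2500 + cumsum(rnorm(n, sd = 80))
)

# Rename the columns to the names Prophet requires: ds (date) and y (value).
prophet_df <- data.frame(ds = eth$Date, y = eth$Close)

str(prophet_df)
```

The package itself is loaded with library(prophet) once it has been installed.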
Plugging Data into the Prophet Function
The data are then passed to the prophet() function for prediction. Prophet models the trend as either a piecewise linear or a logistic growth curve. This is similar to how a regression model works, but the algorithm in Prophet automatically detects changes in the trend and works even with messy data, such as series with missing values or outliers.
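A minimal sketch of fitting and forecasting with Prophet is shown below, assuming the prophet package is installed (the code skips itself otherwise). The synthetic data and the 30-day forecast horizon are assumptions, not settings from the original analysis.

```r
set.seed(123)  # arbitrary seed

# Synthetic stand-in data in Prophet's required ds / y layout.
n <- 366
prophet_df <- data.frame(
  ds = seq(as.Date("2021-04-30"), by = "day", length.out = n),
  y  = 2500 + cumsum(rnorm(n, sd = 80))
)

if (requireNamespace("prophet", quietly = TRUE)) {
  library(prophet)

  # Fit the model; trend changepoints are detected automatically.
  m <- prophet(prophet_df)

  # Extend the dates 30 days past the data (the horizon is an assumption).
  future <- make_future_dataframe(m, periods = 30)
  forecast <- predict(m, future)

  # yhat is the point forecast; the other two columns bound its interval.
  print(tail(forecast[, c("ds", "yhat", "yhat_lower", "yhat_upper")]))
  print(plot(m, forecast))
} else {
  message("Package 'prophet' is not installed; skipping this sketch.")
}
```

The returned forecast covers both the historical dates and the 30 added future dates, so the fit can be compared against the observed prices on the same plot.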
Prophet Prediction Results
Conclusions
Key Findings
Linear Regression
Can prove to be accurate
Longer process
Clear process
Prophet
Easy to build
Can use messier datasets
Unclear process
Robert Kozub
rxk200039@utdallas.edu
GitHub: robertkozub.github.io
THANK
YOU