Application of machine learning algorithms to stock price movement

Patrycja Pultowicz

doi:10.25365/thesis.63334

Title (eng)

Application of machine learning algorithms to stock price movement

Parallel title (deu)

Anwendung von Machine Learning Algorithmen für Aktienkursprognosen

Author

Patrycja Pultowicz

Advisor

Erhard Reschenhofer

Assessor

Erhard Reschenhofer

Abstract (deu)

Diese Arbeit vergleicht die Prognosegenauigkeit von Machine Learning Algorithmen auf Preisbewegungen von Aktien aus S&P 500 im Zeitraum 2017 bis 2019. Im Rahmen eines binären Klassifizierungsproblems setzen wir vier Machine Learning Modelle zur Prognostizierung von Klassen ein, die auf den Tagesreturns basieren: ein Standard Deep Neural Network, ein Long Short-Term Memory (LSTM) Neural Network als Sonderfall von rekurrenten neuronalen Netzen, sowie Random Forest und Logistische Regression. Wir betrachten zusätzlich eine naïve Prognosestrategie als Benchmark. Als Input unserer Modelle verwenden wir zeitverzögerte Aktienreturns. Als Zielvariable im Modell ziehen wir zwei verschiedene Varianten für die Klassifizierung heran: in der einen Variante ist die Zielvariable als Klasse 1 definiert, wenn der Aktienreturn positiv ist und 0 sonst (auch bekannt als direction-of-change). In der anderen Variante ist die Zielvariable als Klasse 1 definiert, wenn der Aktienreturn über dem Querschnittsmedian aller Aktien liegt und 0 sonst. Mehrere Simulationsparameter werden betrachtet: wir verwenden zum einen Inputvariablen mit und ohne Standardisierung, zum anderen eine unterschiedliche Anzahl an zeitversetzten Aktienreturns als Inputvariablen. Unsere Ergebnisse zeigen, dass in allen Simulationen weder die Machine Learning Algorithmen, noch die naïve Methode besser sind als ein Münzwurf in Hinblick auf die Prognosegenauigkeit. Wir schließen daraus, dass Machine Learning Modelle auf Basis von historischen Aktienkursen zur Prognostizierung von Preisbewegungen unzureichend sind und ein hohes Overfitting-Risiko bergen.

Abstract (eng)

This thesis examines how well machine learning models predict the price movements of S&P 500 stocks from 2017 to 2019, framed as a binary classification problem. We apply four machine learning algorithms to predict target classes derived from daily returns: a standard deep feedforward neural network, a long short-term memory (LSTM) network as a special case of a recurrent neural network, a random forest, and logistic regression. We additionally consider a naïve prediction strategy for comparison. The inputs to the machine learning algorithms are lagged daily stock returns. We consider two variants of the model target. In the first, the target is class 1 if the stock return is positive and 0 otherwise (also known as direction-of-change). In the second, the target is class 1 if the stock return is above the cross-sectional median of all stocks and 0 otherwise. We run several simulation settings: with and without standardization of the input features, and with an increasing number of stock return lags. Our experimental results show that in none of the settings do the machine learning algorithms or the naïve strategy outperform a coin flip, that is, purely random classification, in terms of prediction accuracy. We conclude that historical price data alone is insufficient for predicting price movements with machine learning algorithms, and that such models carry a high risk of overfitting.
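The input and target construction described in the abstract can be illustrated with a minimal sketch. It is not the thesis code: the function names, the toy tickers and prices, the use of simple (rather than log) returns, and the strict "greater than" comparison at the median are all assumptions made for illustration.

```python
from statistics import median


def daily_returns(prices):
    """Simple daily returns r_t = p_t / p_{t-1} - 1 (assumed; log returns would also work)."""
    return [p1 / p0 - 1.0 for p0, p1 in zip(prices, prices[1:])]


def lagged_features(returns, n_lags):
    """For each day t >= n_lags, collect the n_lags previous returns, most recent first."""
    return [
        [returns[t - k] for k in range(1, n_lags + 1)]
        for t in range(n_lags, len(returns))
    ]


def direction_target(returns, n_lags):
    """Direction-of-change target: 1 if the return on day t is positive, else 0."""
    return [1 if returns[t] > 0 else 0 for t in range(n_lags, len(returns))]


def median_target(returns_by_stock, n_lags):
    """Cross-sectional target: 1 if a stock's return on day t exceeds the
    median return of all stocks on that day, else 0."""
    n_days = len(next(iter(returns_by_stock.values())))
    targets = {ticker: [] for ticker in returns_by_stock}
    for t in range(n_lags, n_days):
        cross_median = median(r[t] for r in returns_by_stock.values())
        for ticker, r in returns_by_stock.items():
            targets[ticker].append(1 if r[t] > cross_median else 0)
    return targets


# Hypothetical toy data: three made-up tickers with four daily closing prices each.
prices = {
    "AAA": [100.0, 102.0, 101.0, 103.0],
    "BBB": [50.0, 50.5, 51.0, 50.0],
    "CCC": [20.0, 19.0, 19.5, 20.5],
}
returns = {ticker: daily_returns(p) for ticker, p in prices.items()}

features_aaa = lagged_features(returns["AAA"], n_lags=1)
direction = {ticker: direction_target(r, n_lags=1) for ticker, r in returns.items()}
above_median = median_target(returns, n_lags=1)
```

With an odd number of stocks, the median stock itself is assigned class 0 under the strict comparison, so the two classes are only approximately balanced on each day.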

Keywords (eng)

Machine Learning, Forecasting, Stock price movement, Neural Networks

Keywords (deu)

Machine Learning, Prognose, Aktienkurse, Neuronale Netzwerke

Subject (deu)

Naturwissenschaften in Beziehung zu anderen Fachgebieten

Type (deu)

Magisterarbeit

Persistent identifier

https://phaidra.univie.ac.at/o:1390384

DOI

10.25365/thesis.63334

URN

urn:nbn:at:at-ubw:1-21748.08172.391464-0

URI

https://utheses.univie.ac.at/detail/56171

Extent (deu)

iii, 33 Seiten : Diagramme

Number of pages

37

Study plan

Magisterstudium Statistik [UA] [066] [951]

Association (deu)

Fakultät für Wirtschaftswissenschaften

Persistent identifier

https://phaidra.univie.ac.at/o:1390385

Citable links

Persistent identifier
https://phaidra.univie.ac.at/o:1390384
Handle
https://hdl.handle.net/11353/10.1390384
DOI
https://doi.org/10.25365/thesis.63334
URN
https://nbn-resolving.org/nbn:at:at-ubw:1-21748.08172.391464-0
Other links

URI
https://utheses.univie.ac.at/detail/56171
Managed by

u:theses
Details

Uploader

Universitätsbibliothek Wien / u:theses

Object type

Container

Created

04.12.2021 09:46:54 UTC