• Kromite on the record

Daydream or Nightmare: Application of Machine Learning Algorithms in the Stock Market?

Updated: May 28, 2019

By Xixi Chen

It is in our DNA to explore uncertain areas and figure them out, and we eventually strive to develop a system or theory that predicts and explains them scientifically. For speculators, predicting stock prices and capturing arbitrage profits has always been the ultimate goal. With the growing popularity of machine learning in recent years, professionals have turned to these algorithms in the hope of predicting future stock prices more accurately. In this article we explore whether a Recurrent Neural Network (RNN) can serve as the prediction mechanism in assembling an optimized stock portfolio that minimizes risk, with the intent of starting a conversation about this application.

Introduction to the RNN-LSTM model

A recurrent neural network (RNN) is an artificial neural network whose connections feed back on themselves, so the network carries information from earlier steps of a sequence forward to later ones. This lets an RNN process long sequential data and tackle tasks whose context is spread over time. The Long Short-Term Memory (LSTM) variant used here is designed to retain that context over longer sequences. RNNs have shown tremendous promise in handwriting recognition, speech recognition and machine translation. The popularity of AI and machine learning in financial markets is driven by two factors. First, cloud computing and more advanced computer chips give computer scientists more raw power to run their algorithms. Second, the growing volume of available data satisfies machine learning algorithms’ need for big data.

Let’s dive a bit deeper with an example. Here we have a fictitious financial portfolio of 63 stocks drawn from the small-, mid- and large-cap S&P indices, with an original investment of $100,000. We want to minimize the standard deviation of the daily account balance return rate over a 3-month testing period while taking into account trading fees of $5 per transaction for each stock. Using 2016-2017 daily stock prices as our training data set and January through March 2018 daily stock prices as our testing data set, we will test an RNN-LSTM algorithm whose input is P_t, the price on day t, and whose output is P_{t+1}, the price on the following day.
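To make the input and output of the model concrete, here is a minimal sketch of how a daily price series could be turned into (price today, price tomorrow) pairs and split chronologically into training and testing sets. The function names and the sample prices are hypothetical and chosen only for illustration; the network itself is not shown, and this is not necessarily how the data was prepared for the case study.

```python
from typing import List, Tuple


def make_pairs(prices: List[float]) -> List[Tuple[float, float]]:
    """Pair each day's price with the next day's price: (P_t, P_{t+1})."""
    return list(zip(prices[:-1], prices[1:]))


def chronological_split(
    prices: List[float], n_test: int
) -> Tuple[List[float], List[float]]:
    """Hold out the last n_test prices for testing.

    A time series is split by date rather than shuffled, so the model
    is never trained on prices that come after the ones it is tested on.
    """
    if not 0 < n_test < len(prices):
        raise ValueError("n_test must be between 1 and len(prices) - 1")
    return prices[:-n_test], prices[-n_test:]


# Hypothetical closing prices for one stock (illustrative values only).
prices = [10.0, 10.5, 10.2, 10.8, 11.0, 10.9]

train, test = chronological_split(prices, n_test=2)
train_pairs = make_pairs(train)  # inputs and targets for training
test_pairs = make_pairs(test)    # inputs and targets for evaluation
```

In the setup described above, the training series would cover 2016-2017 and the testing series January through March 2018, with one such series per stock.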

The S&P indices offer a good stock pool from which to build our customized portfolio. To keep the stock pool general, we select 21 stocks from each of the small-, mid- and large-cap S&P indices and use the 2016-2017 daily prices of all 63 stocks.

Portfolio Optimization

We have two methods by which we will optimize this portfolio. In Method 1, we use one predicted price and 62 past stock prices. In Method 2, we use all 63 past stock prices. In both, the following parameters apply:

I found it surprising that using all past data for portfolio optimization (Method 2) resulted in a portfolio with a higher return and a lower standard deviation. This result does not imply that the neural network is useless for this application. It only indicates that, with the current limited training data and parameter settings, the algorithm does not perform as well as Method 2.
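The comparison rests on two numbers per portfolio: its return and the standard deviation of its daily account balance return rate. The sketch below shows one way these could be computed from a series of daily balances, with the $5 fee deducted per transaction. The function names are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the article does not state which form was used.

```python
from statistics import stdev
from typing import List, Tuple


def apply_trading_fees(
    balance: float, n_trades: int, fee_per_trade: float = 5.0
) -> float:
    """Deduct a flat fee for every transaction made in a rebalance."""
    return balance - n_trades * fee_per_trade


def daily_return_rates(balances: List[float]) -> List[float]:
    """Day-over-day return rate of the account balance."""
    return [(b1 - b0) / b0 for b0, b1 in zip(balances[:-1], balances[1:])]


def risk_and_return(balances: List[float]) -> Tuple[float, float]:
    """Return (total return over the period, std. dev. of daily returns).

    Assumption: sample standard deviation (divides by n - 1).
    Requires at least three balances so that two daily returns exist.
    """
    rates = daily_return_rates(balances)
    total_return = balances[-1] / balances[0] - 1.0
    return total_return, stdev(rates)


# Hypothetical example: buying all 63 stocks costs 63 transactions.
start = apply_trading_fees(100_000.0, n_trades=63)

# Illustrative daily balances, not results from the case study.
balances = [start, 100_200.0, 99_900.0, 100_400.0]
total_return, risk = risk_and_return(balances)
```

Under this measure, a portfolio is preferred when its `risk` value is lower, which is the quantity the case study sets out to minimize.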

(Appendix Note: To see the tickers for the stocks considered and their initial and re-balanced portfolio weights under both methods, click on the link at the bottom of this post to download the table in Excel.)


The case study above shows that there is room for improvement, perhaps by using larger training data sets to boost prediction accuracy or by running a sensitivity analysis to find the optimal parameters for the neural network. It is still possible, however, that a portfolio built from past stock price data can’t be beaten by an RNN model. Putting too much weight on future prices predicted by machine learning algorithms such as neural networks is also something to approach with caution. The financial market already depends heavily on machines, high-frequency trading being one example. The underlying reason is that trading volume has grown beyond what humans can execute and process.

Even though the future of machine learning algorithms in the financial market holds promise, the most sophisticated techniques employed today don’t bring traders huge profits in the near term. Machine learning in finance could be a double-edged sword. On the one hand, it could reveal potential relationships we never imagined before. On the other hand, it could suffer from “overfitting” if we push too hard for precision on the data we already have. The precision of these advanced techniques could suddenly go off track if random events occur or circumstances change in ways we did not consider. We therefore encourage a more comprehensive investigation of machine learning algorithms in finance.