Explainable Machine Learning Exploiting News and Domain-specific Lexicon for Stock Market Forecasting
In this manuscript, we propose a machine learning approach to predict the magnitude of future stock price variations for individual companies in the S&P 500 index. Sets of lexicons are generated from globally published news articles to identify the words with the greatest impact on the market within a specific time interval and business sector. Features are then engineered from the generated lexicons and fed to a Decision Tree classifier, whose target label is the company’s stock price variation on the next day. A performance evaluation carried out with a walk-forward strategy against a set of solid baselines showed that our approach clearly outperforms the competitors. Moreover, the approach is explainable: because the classifier is a white-box model, we analysed its internals and provide a set of explanations of the obtained results.
CARTA Salvatore;
CONSOLI Sergio;
PIRAS Luca;
PODDA Alessandro Sebastian;
REFORGIATO RECUPERO Diego;
2021-04-27
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
JRC122424
2169-3536 (online)
https://ieeexplore.ieee.org/document/9355141
https://publications.jrc.ec.europa.eu/repository/handle/JRC122424
10.1109/ACCESS.2021.3059960 (online)