
The study draws on three main datasets: CRSP daily returns, news headlines, and RavenPack. The sample runs from October 2021 to December 2022, so that ChatGPT is evaluated on news published after its training data ends. The goal is to measure the relationship between ChatGPT’s sentiment scores and stock market returns. Relevant news headlines are collected and matched against RavenPack’s data, and filtering removes duplicates to preserve data integrity.
ASSET CLASS: stocks | REGION: United States | FREQUENCY: Daily | MARKET: equities | KEYWORD: ChatGPT, Forecast
I. STRATEGY IN A NUTSHELL
The study uses CRSP daily returns, news headlines, and RavenPack data from October 2021 to December 2022 to test whether ChatGPT’s sentiment scores predict stock returns. Each headline is passed to ChatGPT in a customized prompt that asks whether the news is good, bad, or irrelevant for the firm’s stock price, and the answer is mapped to a numerical “ChatGPT score” (YES = 1, UNKNOWN = 0, NO = -1). Scores are averaged per company per day and matched to next-day returns, and linear regressions measure their predictive power. The results show that ChatGPT’s sentiment scores significantly forecast daily stock movements, outperforming traditional sentiment analysis methods and earlier models such as BERT, GPT-1, and GPT-2, with the effect strongest for small-cap stocks.
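The scoring and matching steps described here can be sketched in a few lines of pandas. This is a minimal illustration, not the authors' code: the column names (`ticker`, `date`, `answer`, `ret`) and the function names are hypothetical, each headline date is assumed to be a trading day, and the regression is a plain pooled OLS, which may be simpler than the specification used in the paper.

```python
import numpy as np
import pandas as pd

# Mapping stated in the text: YES = 1, UNKNOWN = 0, NO = -1.
SCORE_MAP = {"YES": 1, "UNKNOWN": 0, "NO": -1}


def daily_scores(headlines: pd.DataFrame) -> pd.DataFrame:
    """Average the mapped ChatGPT answers per company per day.

    `headlines` is assumed to have columns: ticker, date, answer.
    """
    scored = headlines.assign(score=headlines["answer"].map(SCORE_MAP))
    return scored.groupby(["ticker", "date"], as_index=False)["score"].mean()


def match_next_day(scores: pd.DataFrame, returns: pd.DataFrame) -> pd.DataFrame:
    """Attach each company's next-trading-day return to its daily score.

    `returns` is assumed to have columns: ticker, date, ret.
    """
    ordered = returns.sort_values(["ticker", "date"]).copy()
    # Shift within each ticker so that row t holds the return of day t+1.
    ordered["next_ret"] = ordered.groupby("ticker")["ret"].shift(-1)
    merged = scores.merge(
        ordered[["ticker", "date", "next_ret"]], on=["ticker", "date"], how="inner"
    )
    return merged.dropna(subset=["next_ret"])


def ols_slope(panel: pd.DataFrame) -> tuple[float, float]:
    """Pooled OLS of next-day return on score; returns (intercept, slope)."""
    x = panel["score"].to_numpy(dtype=float)
    y = panel["next_ret"].to_numpy(dtype=float)
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(coef[0]), float(coef[1])
```

A positive slope from `ols_slope` would correspond to the positive relationship between scores and subsequent returns that the study reports. Handling of news released outside trading hours is left out of this sketch.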
II. ECONOMIC RATIONALE
The strategy leverages ChatGPT’s advanced natural language understanding to extract meaningful signals from financial news, addressing a largely unexplored application of large language models in financial economics. Empirical evidence confirms that ChatGPT’s sentiment analysis provides superior predictive insights compared to conventional sentiment methods, highlighting its potential value in investment decision-making.
III. SOURCE PAPER
Can ChatGPT Forecast Stock Price Movements? Return Predictability and Large Language Models
Alejandro Lopez-Lira, University of Florida – Department of Finance, Insurance and Real Estate; Yuehua Tang, University of Florida – Department of Finance
<Abstract>
We examine the potential of ChatGPT, and other large language models, in predicting stock market returns using sentiment analysis of news headlines. We use ChatGPT to indicate whether a given headline is good, bad, or irrelevant news for firms’ stock prices. We then compute a numerical score and document a positive correlation between these “ChatGPT scores” and subsequent daily stock market returns. Further, ChatGPT outperforms traditional sentiment analysis methods. We find that more basic models such as GPT-1, GPT-2, and BERT cannot accurately forecast returns, indicating return predictability is an emerging capacity of complex models. ChatGPT-4’s implied Sharpe ratios are larger than ChatGPT-3’s; however, the latter model has larger total returns. Our results suggest that incorporating advanced language models into the investment decision-making process can yield more accurate predictions and enhance the performance of quantitative trading strategies. Predictability is concentrated on smaller stocks and more prominent on firms with bad news, consistent with limits-to-arbitrage arguments rather than market inefficiencies.


IV. BACKTEST PERFORMANCE
| Metric | Value |
| --- | --- |
| Annualised Return | 110% |
| Volatility | 28.94% |
| Beta | N/A |
| Sharpe Ratio | 3.8 |
| Sortino Ratio | N/A |
| Maximum Drawdown | N/A |
| Win Rate | N/A |
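As a quick consistency check on the figures above, the reported Sharpe ratio is close to the annualised return divided by the volatility. The sketch below assumes a zero risk-free rate; the source does not state how the ratio was computed, so this is only a plausibility check, and `sharpe_ratio` is a hypothetical helper.

```python
def sharpe_ratio(annual_return: float, annual_vol: float, risk_free: float = 0.0) -> float:
    """Excess annual return per unit of annual volatility."""
    return (annual_return - risk_free) / annual_vol


# Inputs taken from the table: 110% annualised return, 28.94% volatility.
implied = sharpe_ratio(1.10, 0.2894)
print(round(implied, 2))
```

With these inputs the implied ratio rounds to 3.8, matching the table.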