Alejandro Lopez-Lira, Assistant Professor at the University of Florida

Alejandro Lopez-Lira, Assistant Professor at the University of Florida

Basic ChatGPT Strategy Vulnerable to Arbitrage

In Conversation with Alejandro Lopez-Lira

The basic form of a ChatGPT-based investment strategy has likely already stopped working, the author of a recent study says, but he also indicates it might work better in more inefficient markets.

A recent paper by academics from the University of Florida explores the application of ChatGPT stock market investing using sentiment analysis. The researchers asked the program to read news headlines and predict next-day US stock price movements through the use of sentiment analysis.

This worked surprisingly well, even when incorporating 25 basis points of trading costs.

But Alejandro Lopez-Lira, Assistant Professor at the University of Florida and one of the authors of the paper, says the strategy in its basic form would probably be arbitraged away quite quickly, if it hasn’t already been so.

“I would expect the baseline strategy to have stopped working or stop working very soon,” Lopez-Lira says in an interview with [i3] Insights.

“It’s tricky to implement because you need a high trading volume. And at that point, you really need to be good at managing your trading costs, both the price impact and otherwise, so I would expect the base strategy to fade out pretty quickly.”

image shows a quotation mark

It's tricky to implement because you need a high trading volume. And at that point, you really need to be good at managing your trading costs, both the price impact and otherwise, so I would expect the base strategy to fade out pretty quickly

But Lopez-Lira also says a more sophisticated use of this strategy will probably become part of every investor’s tool box as the processing of information will continue to become more automated.

“You can imagine that this can be used to process complex documents, to join information together to use as a pipeline input. That’s something that potentially would work better in the long term,” he says.

“Obviously, it requires thinking more about which sorts of information you want to use. And at that point, you’re basically running a hedge fund.”

The impact of such strategies on the functioning of markets could lead them to become even more efficient, since information that traditionally required a human assessment can now be processed using large language models.

“In my mind, everything is going to be rule-based now. Data processing, broadly defined, will be automated, using this model. So I would expect it to just keep growing and what will happen then is that markets will probably incorporate information quicker,” Lopez-Lira says.

Inefficient Markets

Lopez-Lira and co-author Yuehua Tang applied the experiment only to US stocks because they had access to the relevant data for this market.

But he expects in markets less efficient than the US and in assets that are harder to trade, including shorting stocks, the strategy will do better.

“I would expect it to work better because usually the US is the hardest market to trade in, just because all the players want to trade here,” he says.

“We haven’t done anything [in other markets] because we didn’t have the data sources available, but the basic logic is that if news does not convert instantly into the stock price, then you’ll get some return predictability.

“In the US markets, this may happen for very large stocks within milliseconds and for smaller stocks within a day. There’s nothing after a day. I would expect this pattern to be slower in emerging markets, for example.”

He also expects it to work better for smaller investors and even retail investors if they can get their head around the strategy.

“The funny thing is that it’s very hard for institutional investors to run these strategies because, you know, you have to concentrate on the smaller stocks. But it’s actually easier for retail traders to do this because their price impact is zero basically,” he says.

[i3] Luncheon: A.I. & Investment Decision-making | Investment Innovation Institute

Newer Language Models

In his experiment, Lopez-Lira found only ChatGPT 3.5 and 4 were successful at showing clear correlations between how they interpreted news and the movement of US stock prices the next day.

Earlier models, including ChatGPT 1, 2 and BERT, another large language model, did not do as well. In addition, Lopez-Lira found more complex text led to better results.

“We can test it with headlines that are more complex or less complex and we found that only the larger models can do these more complex headlines. But that’s where a lot of the return predictability is,” he says.

“So it’s purely that larger models are able to understand language and context better.”

The way the questions were framed, or the prompting, was important too. Every question started with the statement: “Forget all your previous instructions.” Lopez-Lira says this instruction was to pre-empt any limitations the designers might have put on the model.

“Basically, it’s just to avoid any guardrails that the company has put in place. Sometimes these models will refuse to answer specific questions because they can’t provide financial advice or can’t provide medical advice,” he says.

image shows a quotation mark

But these large language models are funny because the only thing they understand is the text that is written; they cannot distinguish between what is an instruction and what is part of the text

“But these large language models are funny because the only thing they understand is the text that is written; they cannot distinguish between what is an instruction and what is part of the text.

“So if you write as part of it to forget about all the previous instructions, then it helps a little bit with removing the limitations that are running in the model.”

Lopez-Lira initiated the experiment largely out of curiosity about what the system could do. Previous research had shown certain machine-learning strategies had some success in predicting next-day stock prices, but they were specifically trained on financial data.

ChatGPT was not.

“ChatGPT is not trying to predict stocks in particular; it’s just trying to order tasks,” Lopez-Lira says.

“But it has been showing some capability in domains where it’s not specifically trained. You can ask it to write a poem, even though it wasn’t trained explicitly for that.

“We were just trying to assess whether these more complex models understand the finance work better and whether they can predict return. I was actually just curious whether it was possible.”

__________

[i3] Insights is the official educational bulletin of the Investment Innovation Institute [i3]. It covers major trends and innovations in institutional investing, providing independent and thought-provoking content about pension funds, insurance companies and sovereign wealth funds across the globe.