Shilpa Blog

Large language models (LLMs) have shown remarkable capabilities in many NLP tasks. Although LLMs are advancing in the finance industry, research on them is still limited. A research paper published in February 2024 surveys LLMs in the finance industry and how they have been enhanced for it.

From General LLMs to Finance

LLMs started with the transformer architecture, published in 2017, which led to two different types of models: discriminative models like BERT, used for classifying text, and generative models like GPT, designed to produce fluent text.

The GPT series showed the power of scaling. GPT-3 introduced in-context learning, and ChatGPT combined GPT-3 with code models and reinforcement learning from human feedback (RLHF), bringing conversational AI to the mainstream. GPT-4 pushed things further with multimodal inputs like text, images, and audio.

Open-source releases like BERT, BLOOM, and LLaMA also allowed everyone to build their own models. This gave the finance domain a way to adapt them.

FinPLMs (Financial Pretrained Language Models) -> FinBERT-19, FinBERT-20, FinBERT-21, FLANG
FinLLMs (Financial Large Language Models) -> BloombergGPT, FinMA, InvestLM, FinGPT

The paper compares five main approaches:

Continual Pretraining: Start with a general model such as BERT, then keep training on finance corpora
Domain-specific Training from Scratch: Train entirely on finance text, which is costly
Mixed-domain Pretraining: Train on general and finance data together, as in FinBERT-21
Mixed-domain LLM with Prompt Engineering: Keep the weights frozen and steer the model with prompts, as in BloombergGPT
Instruction Fine-tuning: Convert financial tasks into an instruction format, such as question plus answer, and fine-tune on it, as in FinMA, InvestLM, and FinGPT

This evolution shows a trend of moving from training on static data to instruction-based, interactive models for finance.
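To make the instruction fine-tuning idea more concrete, here is a minimal Python sketch of turning a labeled sentiment example into an instruction-style record. The field names (`instruction`, `input`, `output`), the prompt wording, and the sample headline are my own assumptions for illustration; they are not the actual format used by FinMA, InvestLM, or FinGPT.

```python
from dataclasses import dataclass


@dataclass
class InstructionExample:
    """One instruction-tuning record (hypothetical schema, not from the paper)."""
    instruction: str  # what the model is asked to do
    input: str        # the financial text to act on
    output: str       # the expected answer


def to_instruction(headline: str, label: str) -> InstructionExample:
    """Wrap a labeled headline as an instruction-style record."""
    return InstructionExample(
        instruction=(
            "Classify the sentiment of this financial headline "
            "as positive, negative, or neutral."
        ),
        input=headline,
        output=label,
    )


def render_prompt(example: InstructionExample) -> str:
    """Render the text a model would see, stopping where the answer should begin."""
    return (
        f"Instruction: {example.instruction}\n"
        f"Input: {example.input}\n"
        "Answer:"
    )


# Made-up headline, for illustration only.
record = to_instruction("Acme Corp shares slide after profit warning", "negative")
prompt = render_prompt(record)
print(prompt)
print(record.output)
```

During fine-tuning, the rendered prompt would be the model input and `record.output` the target text. The same prompt, sent to a frozen model, is closer to the prompt engineering approach.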

Did they perform well?

The survey tests FinLLMs and general LLMs on six benchmarks:

Sentiment Analysis: Classifying financial news as positive, negative, or neutral; FLANG and GPT-4 did well
Text Classification: Categorizing financial headlines or Fed statements; FinMA, at 30B parameters, hits 98% F1
Named Entity Recognition: Extracting company names, tickers, and so on; GPT-4 leads and FinLLMs lag
Question Answering: Tested from simple financial Q&A up to hybrid numerical reasoning; GPT-4 is near human level and BloombergGPT lags
Stock Movement Prediction: Predicting stock price direction using text and prices; GPT-4 was better than FinLLMs but below SOTA specialized models
Text Summarization: Condensing earnings call transcripts; task-specific models were better than LLMs, but GPT-4 was still better than FinLLMs

So overall, FinPLMs are good at simple classification tasks, but for complex reasoning GPT-4 gave better results than FinLLMs.
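Several of these results are reported as F1 scores, so here is a small sketch of how an F1 score can be computed for a three-class sentiment task. I am assuming macro-averaging (the plain mean of per-class F1); the benchmarks may use a different averaging scheme, and the labels below are made up.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Mean of per-class F1, where F1 = 2*TP / (2*TP + FP + FN)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted this class, but it was wrong
            fn[truth] += 1  # missed the true class
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Made-up gold labels and predictions, for illustration only.
y_true = ["positive", "negative", "neutral", "positive"]
y_pred = ["positive", "negative", "positive", "positive"]
score = macro_f1(y_true, y_pred)
print(round(score, 3))
```

Here the model never predicts "neutral", so that class scores 0 and pulls the average down even though three of four predictions are correct. That is one reason F1 is preferred over plain accuracy on unbalanced financial datasets.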

The research paper also says these benchmarks are basic and shallow, and it lists eight advanced tasks for future LLMs to handle:

Relation Extraction: For example, Company A was acquired by Company B
Event Detection: Tracking corporate events like mergers
Causality Detection: Understanding cause and effect in financial text
Numerical Reasoning: Doing math with numbers that appear in text
Structure Recognition: Understanding tables in reports
Multimodal Finance: Combining text with audio from earnings calls, or even video
Machine Translation: Finance-aware translation across languages
Market Forecasting: Going beyond stock moves to predict volatility, risk, and trends

These are needed for real-world financial analysis and go beyond sentiment classification.
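As a toy illustration of what the numerical reasoning task asks for, the sketch below pulls dollar amounts out of a sentence and computes the percentage change between them. This is a hand-written rule that produces the kind of answer a model would be expected to reach, not how an LLM works internally. The regular expression, the helper names, and the sample sentence are all my own assumptions.

```python
import re

# Multipliers for the unit words this toy parser understands.
UNITS = {"million": 1e6, "billion": 1e9}

# Matches amounts written like "$4.0 billion" or "$250 million".
AMOUNT = re.compile(r"\$(\d+(?:\.\d+)?)\s*(million|billion)")


def extract_amounts(text):
    """Return every dollar amount in the text as a plain number."""
    return [float(number) * UNITS[unit] for number, unit in AMOUNT.findall(text)]


def percent_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100


# Made-up sentence, for illustration only.
sentence = "Revenue rose from $4.0 billion to $5.0 billion."
old, new = extract_amounts(sentence)
growth = percent_change(old, new)
print(f"Revenue growth: {growth:.1f}%")
```

Real filings are far messier (mixed currencies, units stated in table headers, restated figures), which is why the task is considered hard.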

FinLLMs may have to address the tasks above. Looking ahead, the scope includes automation that streamlines report writing, compliance checks, and financial analysis; making complex financial insights available to retail investors; providing powerful tools that support the decisions of traders, analysts, and regulators; and offering natural language interfaces to query finance databases and dashboards.

Challenges that LLMs should overcome

Hallucinations must be overcome, as they can be very costly in the finance industry
Private data must be handled safely, as it is sensitive information
As markets evolve and change, the model should be able to adapt to new trends
There is a risk of bias, with models amplifying sentiment or misinformation
Training and running a FinLLM from scratch can be expensive and needs lots of resources
Current evaluation metrics like F1 and accuracy don’t capture financial risk; domain-specific metrics like the Sharpe ratio, or expert reviews, are needed
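For readers unfamiliar with the Sharpe ratio mentioned in the last point, it is the mean excess return divided by the standard deviation of excess returns, so it rewards returns earned with less volatility. Below is a minimal sketch, assuming per-period returns, a constant risk-free rate, and no annualization. The function name and the sample returns are made up for illustration.

```python
import statistics


def sharpe_ratio(returns, risk_free_rate=0.0):
    """Mean excess return divided by the sample standard deviation of excess returns."""
    excess = [r - risk_free_rate for r in returns]
    spread = statistics.stdev(excess)  # needs at least two data points
    if spread == 0:
        raise ValueError("returns have zero volatility; Sharpe ratio is undefined")
    return statistics.mean(excess) / spread


# Made-up per-period returns of a strategy driven by model predictions.
strategy_returns = [0.01, 0.03, -0.01, 0.02, 0.00]
ratio = sharpe_ratio(strategy_returns)
print(round(ratio, 4))
```

Two models with the same F1 on stock movement prediction could produce very different Sharpe ratios once their predictions are traded, which is the gap this challenge points at.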

We need LLMs that not only talk about finance: a FinLLM should reliably assist in financial decision-making.

AI in Finance Industry

Shilpa Thota