Eagle Alpha’s 2023 Alternative Data Report

Dec 15, 2023


Below is an excerpt from our annual report. Our full report includes more granular industry trends, data sourcing, leading case studies, and technical product updates. Request full report here.


As we close 2023 in the alternative data industry, we reflect on a polarized year of macro uncertainty coupled with technological innovation. It is a year characterized by an interplay of robust supply dynamics and nuanced demand patterns where the investment community has had to contend with rising interest rates and a year of anticipated recession risk.

Three years ago, we had to contend with a global lockdown; this year, it was the threat of recession, political instability on the global stage and AI disruptors. Despite this, the supply dynamic of the market is as compelling as ever, and the level of innovation on the demand side could herald further progress along the adoption curve.

Data Sourcing Perspectives – A Look at 2023

Data Sourcing at Eagle Alpha had another busy year.

We grew total datasets by over 17% and by year end we will have added over 250 new datasets to the platform, all with a detailed profile and attached DDQ. This figure includes new data vendors and new datasets added by existing vendors.

Figure 1: Dataset Growth by Category (Source: Eagle Alpha)

LLMs and ChatGPT

From a data sourcing perspective, there have been a few dominant themes this year. No prizes for guessing what dominated conversations: LLMs and NLP. In last year’s annual report, we highlighted the fact that even before ChatGPT was released there was a fair amount of interest in access to data sources across news, transcripts and filings. This continued in 2023, but attention also shifted quite a bit as the year progressed.

Early this year we began to talk to most of the established NLP vendors about the threat of open LLMs and chat functionality. Most of the vendors recognized that things had changed and were quickly jumping on open models to enhance their workflow and data products.

Others seemed slower but did show evidence of pivoting as the year progressed. On the buy-side, there was a lot of interest in what these NLP vendors were doing, and around spring a notable shift began to happen. Funds were looking to quickly get to grips with the topic. Our initial take was that the focus had become less about advanced NLP, where you use your own data inputs to build sentiment and signals, and more about an efficiency play.

“How can I use ChatGPT and other models to gain an efficiency advantage?” Broadly, this means summarization, prompting and Q&A that can make any analyst or PM more efficient. This then morphed into “How can I bring my own internal data to a model?” but, critically, without risk of data leakage.

In response to this market-driven demand, many of the vendors pivoted again and opened up their domain knowledge and products as an NLP-as-a-service offering: on the one hand, ChatGPT-like functionality and, on the other, LLM services in a secure cloud or on-prem environment. This is a very dynamic space right now. Things are changing rapidly, and it is going to take some time to play out.


Another theme that played out this year is more market acceptance of macro nowcasting and forecasting. Our belief here is that this is driven by the inflation trade. Equity and bond markets are following the lead of the Fed and other central banks on rising interest rates to combat inflation. One could gather several alternative datasets and build an inflation model, but this would be a costly endeavor. Given the different ways inflation is measured globally, this would be even more costly.

Why not use a company that provides inflation and other macro forecasts/nowcasts instead? Up to the end of last year, this idea largely met with resistance from buy-side shops, but we have seen a change in attitude this year. Data vendors like ClearMacro and Macrobond Financial saw a high meeting count at our October conference in London, for example.

While this shift seems to be well underway, we still saw interest in some data categories in the inflation bucket. This was more on the higher-frequency side of things, in automotive, travel and leisure. One surprise this year was how little interest there was in house price and rental data. As this is such a large part of the CPI basket in the US, we had expected some demand for this data.

There is a lengthy lag to the impact on inflation, but one would have thought it would be relevant to model out. But, I guess, with the Fed ignoring it – why bother?

Data Category Themes

Last year we saw disruption across the consumer transaction data category. We did not see anything as significant this year, but we did spend some time early in the year talking to clients about data quality issues at various geolocation vendors.

Dirty data and “replayed” data were a big problem for many vendors. Data brokers were replaying, or reselling, old data as new data, and this had to be scrubbed from panels. Many of the vendors ended up with smaller but cleaner panels, which ultimately should provide more accurate insights.
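As a rough illustration of what scrubbing replayed data can involve, here is a minimal Python sketch. It rests on a simplifying assumption that is ours, not any vendor's method: a replayed record repeats the original observation exactly and differs only in the date it was delivered. The field names (`device_id`, `lat`, `lon`, `observed_at`, `delivered_on`) and the function `scrub_replayed` are hypothetical, and real panels would likely need fuzzier matching.

```python
from datetime import date


def scrub_replayed(records):
    """Keep the earliest delivery of each observation and drop re-deliveries.

    Each record is a dict with hypothetical keys: device_id, lat, lon,
    observed_at (when the ping occurred) and delivered_on (when the broker
    supplied it). Assumes a replay repeats the observation fields exactly.
    """
    earliest = {}
    for rec in records:
        # The key ignores the delivery date, so the same ping resold later
        # collapses onto the same entry.
        key = (rec["device_id"], rec["lat"], rec["lon"], rec["observed_at"])
        kept = earliest.get(key)
        if kept is None or rec["delivered_on"] < kept["delivered_on"]:
            earliest[key] = rec
    return list(earliest.values())


records = [
    {"device_id": "a", "lat": 1.0, "lon": 2.0,
     "observed_at": "2023-01-05T10:00", "delivered_on": date(2023, 1, 6)},
    # Same ping, delivered again two months later as if it were new.
    {"device_id": "a", "lat": 1.0, "lon": 2.0,
     "observed_at": "2023-01-05T10:00", "delivered_on": date(2023, 3, 1)},
    {"device_id": "b", "lat": 3.0, "lon": 4.0,
     "observed_at": "2023-03-01T09:00", "delivered_on": date(2023, 3, 1)},
]
clean = scrub_replayed(records)
```

In this toy panel, the second record is dropped as a replay of the first, which is the sense in which a scrubbed panel ends up smaller but cleaner.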

This panel issue aside, demand for geolocation data in general remains lackluster given personally identifiable information (PII) concerns and legal cases following the US Supreme Court ruling that overturned Roe v. Wade. In addition, investors still want to see datasets mapped to tickers, and many geolocation vendors have not done this, which also hinders demand. We have seen interest in fundamental and private equity use cases for geolocation data, with some vendors willing to do bespoke project work.

We did see some renewed but sporadic interest in commodities, energy and healthcare over the year, but we would not classify it as significant. On the topic of ESG, the focus was on the environmental side, with climate transition and net zero as the predominant requests for data.

Looking Ahead – Corporate Data Access

In Q4, we are adding corporate dataset profiles for the first time to our platform. Through our partnerships, we have identified over 200 corporates with a desire to monetize their data. A subset of these will be profiled subject to achieving thresholds that are common to the investment vertical.
