Quantifying Alternative Data Alpha Opportunities

February 26, 2018 02:29 PM

When we think of “quant” hedge funds, we often picture a room full of PhDs and applied-math nerds crunching vast amounts of data, creating algorithms and delivering trade signals that trigger buy and sell orders in incredibly short timeframes. In theory, mining traditional price, fundamental and trading-flow data is no longer enough to produce consistent returns. Times are changing, fast.

Quant hedge funds are the R&D laboratory of the financial industry. Hedge fund research is where sophisticated new techniques are hatched and experiments are run. Some of these new strategies produce outsized returns; some are failures. The hope is that you find the pot of gold at the end of the rainbow. This is not R&D to change the world or cure a disease, but it could buy you a new house in the Hamptons.

The key word there is "laboratory." Literally millions of signals can be generated from datasets before finding one that provides an edge. Hedge funds don’t use 98% of the quant research they produce. Investors should appreciate the number of failed experiments and the time it takes to produce alpha.

One reason hedge fund returns have been lackluster is that new ideas and algorithms that started in the hedge fund world have migrated into other financial products, such as exchange-traded funds (ETFs). The best examples are the popular smart beta and alternative-strategy ETFs with lower fee structures. Until recently, these strategies were available only to researchers at hedge funds; it’s safe to say they will no longer produce the same returns, and the market will be forced to adapt. Bottom line: too much money chasing the same strategy reduces profit margins.

Elon Musk said the best piece of advice is to “constantly think about how you could be doing things better and questioning yourself.” He also said, “Some people don’t like change, but you need to embrace change if the alternative is disaster.”

Hedge funds are constantly looking for an edge, and once they find one, competitors arbitrage it away, so there is a constant need to find the next big thing. What is the next big thing: Alternative data? Machine learning? Deep learning? Hedge funds are always trying to stay one step ahead, which raises the question: What’s next? One answer may be to follow the leaders.

We reached out to Andrej Rusakov, partner of Data Capital Management (DCM), about their quant hedge fund and the new generation of quant research. 

DCM is a new-generation algorithmic hedge fund disrupting the investment management industry by leveraging cognitive computing and big data technologies. It specializes in strategies that draw on both traditional and novel sources of information (news, sentiment, crowdsourced financial estimates, etc.), as well as machine learning methods, to generate investment returns.

The culture at DCM is collaborative and curious and values scientific data and facts more than experience and intuition. The team consists of Ph.D.s and Harvard MBAs, with degrees in computer science, engineering, physics, mathematics and business management, and experience in quantitative research and risk management at global investment banks, fundamental analysis at leading private equity firms and cutting-edge technologies in big data companies.

A high-powered group of investors at the general partnership level includes Howard Morgan, co-founder and former president of Renaissance Technologies; Raymond J. McGuire, global head of corporate & investment banking at Citigroup; Daniel Neidich, founder and CEO of Dune Real Estate Partners; and Greg Gurevich, founding partner of Maritime Capital.

Here is what Rusakov had to say. 

Modern Trader: How does DCM use alternative data?

Andrej Rusakov: We have spent two years developing a robust, proprietary, cloud-based big data platform delivering “data on demand” to internal end users (strategy development and risk management teams). We see our technology as a key differentiating factor that will become more and more of a competitive advantage going forward.

MT: What’s the process when deciding to use an alternative dataset from a vendor?

Rusakov: We check for data veracity, the completeness of its history and assess its predictive power. If we like what we find, we buy it.
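One common screen for the "predictive power" step is the information coefficient: the rank correlation between a candidate signal and subsequent returns. The sketch below is a general illustration of that test, not DCM's disclosed methodology, and assumes no tied signal values.

```python
# Illustrative check of a vendor signal's predictive power via the
# information coefficient (Spearman rank correlation between the signal
# and forward returns). Hypothetical data; assumes no tied values.

def rank(xs):
    """Map each value to its rank (0 = smallest); assumes no ties."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    for pos, i in enumerate(order):
        ranks[i] = float(pos)
    return ranks

def spearman_ic(signal, fwd_returns):
    """Spearman correlation between signal and forward returns.

    With tie-free ranks, both rank vectors are permutations of 0..n-1,
    so they share the same mean and variance, and the correlation
    reduces to cov / var.
    """
    rs, rr = rank(signal), rank(fwd_returns)
    n = len(rs)
    mean = (n - 1) / 2.0
    cov = sum((a - mean) * (b - mean) for a, b in zip(rs, rr))
    var = sum((a - mean) ** 2 for a in rs)
    return cov / var

# Toy example: the signal orders names exactly as next-period returns do.
signal = [0.2, -0.1, 0.5, 0.0, 0.3]
fwd    = [0.01, -0.02, 0.04, 0.00, 0.02]
print(spearman_ic(signal, fwd))  # 1.0 (perfectly monotone relationship)
```

In practice the coefficient would be computed across many dates and names, and a small but stable positive value is usually more interesting than a one-off high reading.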

MT: What are the biggest challenges using alternative data?

Rusakov: Alternative data is very messy. Cleaning it and making it usable is a big challenge. Data integration is also a challenge. [This includes] data linkage, dealing with bi-temporality (data based on varying timeframes) and data querying. 
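Bi-temporality is easiest to see in code. The sketch below is a minimal, generic illustration (an assumption about how such data is typically modeled, not DCM's system): each record carries both an event date (what period the value describes) and a knowledge date (when it became known), so a backtest can query "what did we know as of date X" without peeking at later revisions.

```python
# Minimal bi-temporal "as-of" query. Hypothetical records: an estimate
# for a January event, later revised. A backtest run "as of" Feb. 10
# must see the original value, not the Feb. 15 revision.
from datetime import date

records = [
    # (event_date, known_date, value)
    (date(2018, 1, 31), date(2018, 2, 1), 1.10),
    (date(2018, 1, 31), date(2018, 2, 15), 1.25),  # later revision
]

def as_of(records, event_date, knowledge_date):
    """Latest value for event_date using only what was known by knowledge_date."""
    known = [(k, v) for e, k, v in records
             if e == event_date and k <= knowledge_date]
    return max(known)[1] if known else None  # most recently learned value

print(as_of(records, date(2018, 1, 31), date(2018, 2, 10)))  # 1.1 (pre-revision)
print(as_of(records, date(2018, 1, 31), date(2018, 2, 20)))  # 1.25
```

Mixing up the two timelines — joining on event date alone — is a classic source of look-ahead bias in backtests built on revised vendor data.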

MT: Do you look for exclusive or shared use of alternative datasets?

Rusakov: Shared. Exclusive will become illegal as regulators will deem it to be insider trading.  Building a business on exclusive data usage is shortsighted in my view. 

MT: What datasets are most popular?

Rusakov: There is a lot of hype around satellite images and sentiment. The popularity of the dataset does not mean it is the most predictive one.

MT: What criteria must be present in a dataset to identify a sustainable and exploitable opportunity?

Rusakov: Cleanness, repeatability.

MT: What is the most unusual dataset someone has tried to pitch that you didn’t use?

Rusakov: It’s confidential, as we spend a lot of resources figuring out what to use and what not to.

MT: What are the economics that make it worthwhile to use alternative data? 

Rusakov: If the assets under management expected to be deployed in the strategy powered by the alternative data, multiplied by the additional return generated using that dataset, exceed the cost of the dataset, the deal makes economic sense.
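Rusakov's break-even rule reduces to a one-line inequality. The figures below are purely illustrative, not DCM's actual numbers:

```python
# Break-even test for licensing an alternative dataset:
# (AUM deployed in the strategy) x (additional return from the data)
# must exceed the dataset's cost. Illustrative numbers only.

def dataset_makes_sense(aum: float, added_return: float, dataset_cost: float) -> bool:
    """True if the expected extra P&L from the dataset exceeds its cost."""
    expected_extra_pnl = aum * added_return
    return expected_extra_pnl > dataset_cost

# $200M deployed, 10 bps of extra annual return, $150k/yr data cost:
# extra P&L of $200k exceeds the $150k cost, so the deal makes sense.
print(dataset_makes_sense(200_000_000, 0.001, 150_000))  # True
```

The same arithmetic explains why alternative data favors larger funds: at $20M of AUM, the identical 10 bps edge earns only $20k and the same dataset no longer pays for itself.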

MT: What are the demands and trends in future alternative data products? Is there a dataset that you’d really like that doesn’t exist today? What would it be?

Rusakov: That is proprietary information.

MT: Do you own the full process and dataset from A to Z or use third parties? Do you have technology architecture and footprints to consume or do you build?

Rusakov: We have spent two years developing a robust, proprietary, cloud-based big data platform delivering “data on demand” to end users (strategy development and risk management teams). We call our system DADS (Diversified Alpha Discovery System). It can consume and deliver on demand, at very high speed, heterogeneous multidimensional data arriving at varying velocities and frequencies.

MT: How do you consume data? What is your preferred method of receiving data?

Rusakov: Largely raw and somewhat processed. Very rarely signal-only. API is preferred. 

MT: Any regulatory/compliance risks with using alternative data? 

Rusakov: Yes. Data cannot be proprietary, as in our view regulators will make it illegal to trade on.

About the Author

Chris Randle is a proprietary global futures and commodity trader. He focuses daily on short-term relative value and momentum trading. He also has investment projects in fintech, crypto, alternative data and AI. Twitter: @crandl