High Alpha’s Approach to Data Science

by Mark Clerkin

High Alpha and its portfolio companies made huge investments in data science last year and are showing no signs of slowing down in 2018. One of the main ways we did this was by forming a data science team within High Alpha. This centralized approach allows us to embed data science in our companies from their inception, without the initial expense of onboarding a complete team.

Once they are big enough, High Alpha portfolio companies will house their own data scientists, but until they reach that point, we help get them off the ground. Here are a few reasons why we take this approach to data science.

Team Advantage

As previously discussed, data science is a team sport. While an individual data scientist can make an initial impact, true success comes when you combine skillsets, backgrounds, and levels of experience.

As in all teams, data scientists have different strengths and weaknesses. While a researcher might know the appropriate technique to solve a problem, she might not know the best way to capture, store, and process data at scale. On the other hand, an engineer might not know the pitfalls of various algorithms and how to overcome those challenges. Having a team balances those concerns and creates an environment where innovation can really thrive.

Economies of Scale

Another advantage of the team approach is that we are able to address multiple problems simultaneously. The standard data science lifecycle is as follows:

  1. Business Understanding
  2. Data Acquisition
  3. Data Preparation
  4. Modeling
  5. Evaluation
  6. Deployment

The problem with this lifecycle is that it is mostly linear: you generally progress from one step to the next. An individual would have trouble managing, and delivering on, all of these steps, especially when working on multiple projects at once.

A major benefit of the team approach is that we can allocate resources to different stages of multiple projects. This allows us to rotate our talent to where it is needed most at any given time.

Innovation Templates

One of our main goals is to package much of the work we do and share it across the portfolio. How many times do you need to write database connectors, transformation functions, pipelines, and so on? Once we have solved a problem, we can clone the solution 10x and everyone wins.
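To make the idea of a reusable template concrete, here is a minimal sketch of a transformation pipeline that could be written once and cloned across projects. It is an illustration only, not High Alpha's actual tooling: the names `make_pipeline`, `strip_whitespace`, `lowercase_email`, and the `email` field are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]
Step = Callable[[Record], Record]


def make_pipeline(*steps: Step) -> Callable[[Iterable[Record]], List[Record]]:
    """Compose record-level steps into a single reusable function."""
    def run(records: Iterable[Record]) -> List[Record]:
        output = []
        for record in records:
            # Apply each step in order; every step returns a new record.
            for step in steps:
                record = step(record)
            output.append(record)
        return output
    return run


def strip_whitespace(record: Record) -> Record:
    """Trim leading and trailing whitespace from every string value."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}


def lowercase_email(record: Record) -> Record:
    """Lowercase the (hypothetical) 'email' field when it is a string."""
    updated = dict(record)
    if isinstance(updated.get("email"), str):
        updated["email"] = updated["email"].lower()
    return updated


# One project's pipeline; another project can reuse the same steps
# in a different combination without rewriting them.
clean_contacts = make_pipeline(strip_whitespace, lowercase_email)
```

Each step is a plain function, so a company with different data can swap in its own steps while keeping the shared ones.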

Finally, working on multiple projects also helps us cross-pollinate ideas while avoiding knowledge and skillset fragmentation. By doing so, we are able to keep innovating and avoid getting stuck doing the same thing over and over. This approach is a big advantage that will create innovative products and generate alpha for our investors.