Introduction

Welcome to the second post of my series about AI Talent. In the previous post, I introduced the series and discussed how to source the right talent, but finding a pool of great candidates is just the first step towards maximising the potential of AI in your organisation.

Assessing Talent: Measuring company-candidate fit

Once you have access to a pool of potential candidates, you need to be able to understand, measure and evaluate the fit between each of them and your organisation. It is critical to understand that this is a bidirectional relationship, where the goal is to find someone who would be great in your organisation and (simultaneously) for whom your organisation is great. This relates to many factors, including their interest in the role, the culture of the company and the individual, their ambition for career progression and their expertise and skills. 

“Talent is cheaper than table salt. What separates the talented individual from the successful one is a lot of hard work.”
Stephen King

Unfortunately, measuring match quality is not easy because many variables are involved, ranging from the details and clarity of the role to the factors that Data Scientists value the most. I will share my thoughts on some of these signals in the next sections.

Role Definition: What is a Data Scientist? 

Before discussing how to assess Data Scientists, it is worth discussing what a Data Scientist is, given the fuzzy and chaotic world of professional titles in the AI space. This is a common problem and several people have proposed different divisions, such as the type A/B Data Scientists. My perspective is closer to the categories shown in the MMC AI Playbook: I see five high-level families of roles and responsibilities that people working in AI or Data Science tend to focus on:

Answering Questions given company data

The main goal is to investigate data (e.g., user interaction) and obtain actionable insights from it. The main requirement is to understand the connection between the business goals and the questions that the data can answer. This role requires analytical and storytelling skills. For us, this person would be a Data Analyst. Other common titles in the market are Product Data Analyst and Product Analyst. Some companies also call this role Data Scientist.

Creating Proofs of Concept (POCs) or AI-driven products

The main responsibility is to create end-to-end POCs or prototypes where AI adds value. This role requires a user- and problem-focused mindset and the ability to code full end-to-end systems. This person also has to understand the feasibility and potential of Data Science solutions and be able to work under high uncertainty. We define this role as Data Scientist, while other people also refer to it as Product Data Scientist, ML Engineer or Data Science Engineer.

Improving AI models and publishing papers

This role focuses on pushing the edge of Machine Learning. It tends to be more about answering research questions and improving solutions to specific problems up to the limit of what current technology allows. We know these people as Researchers, and this is the main title in the market too, potentially preceded by their area of specialisation (e.g., NLP Researcher or Deep Learning Researcher). You can also find variations that include the word scientist, such as Research Scientist.

Robust deployment at scale

To benefit from Machine Learning, the solutions using it have to be “live” in some way or another, and this process has to be done in a robust, scalable and efficient manner. The people who ensure that the models are scalable, efficient and constantly monitored are usually referred to as Machine Learning Engineers or Data Engineers.

Strategic direction and management

This group is slightly broader but the principle is that anyone with one of the words Head, VP or Chief followed by ML, AI, Data, Data Scientist or Scientist will most likely be focused on strategic direction on the applicability of AI and/or on the structure and growth of the Research teams.

As mentioned before, these are high-level categories and each one of them could be further divided into more specialised sub-categories (e.g., Deep Learning Researcher). In reality, most Data Scientists (especially in start-ups) will be doing a combination of these roles. However, most people tend to plan their personal development towards one, or maybe two, of these career paths. 

Important factors: What drives Data Scientists to join a company? 

To measure the fit between the role and the company we need to understand the different types of Data Scientists, but it is also important to know the factors that drive them. In addition to the classic factors such as office, overall package, management or company vision, there are some characteristics that are especially relevant for these roles. The chart below shows the “very important” factors for Data Scientists when choosing a place of work, according to a survey by Kaggle that was summarised in the MMC State of AI report.

Figure 1. Ratio of Data Scientists listing factors as “very important” for choosing an employer

I would like to focus on three specific factors that I believe are especially important for Data Scientists: learning, impact and publication opportunities.

Learning: This includes continuous day-to-day personal development and working with peers you respect and want to learn from. It also includes the culture of learning in the company. For instance, at Signal AI, we have a budget for conferences, regular reading groups and research guilds where all researchers learn together and stay up to date with the latest advances in the field.

Impact: One of the major challenges in academia is the perceived lack of impact of your work. A relatively common worry for PhD students is whether anyone will actually bother reading their thesis. In the industrial world, especially in scale-up companies like Signal AI, the impact of research is clear. This is not only a great factor for revenue (driven by better products) and hiring, but it has also been instrumental for our collaborations with academia, such as the MSc students from universities like UCL who did their theses with Signal AI.

Publication opportunities: For researchers, if you want to keep the possibility of going back to academia, or at least retain your research reputation, you need a reasonable publication record, both in quality and quantity. We are very proud of our collaborations with universities that led to several publications. This was a deciding factor for some of our researchers to join Signal AI and for multiple universities to collaborate with us. 

Main skills: Look for more than “just” technical skills

Soft skills such as communication and contextualisation are as important as technical knowledge for a Data Scientist to be impactful within an organisation. This includes the ability to communicate and collaborate effectively within a team, within a company and outside the organisation, which requires adapting the communication style to completely different audiences with different goals and contexts. A related factor is business and product awareness and interest, which is key to understanding the impact and value of lines of research. Therefore, you should not focus on technical know-how and expertise alone. This is especially true for startups and scaleups, where everyone needs to be much more adaptable.

I have been asked several times whether early-stage companies should hire “generalist” or “specialist” Data Scientists. Full-stack Data Scientists (with the development skills to build prototypes end-to-end) are more impactful in the early stages, when you are trying to validate product-market fit. At that point, scope and priorities will change radically and the main need is for someone who is adaptable. As the company (and the offering) matures, specialists will allow the optimisation of specific parts of the system.

Innovation: Look for different perspectives

One of the challenges related to assessing people is that we all have subconscious biases and, in many cases, we tend to hire people similar to ourselves. Talent is much more diverse and distributed than we tend to believe, and how you measure fit will be a major factor in the impact and innovation of your teams. Diversity has multiple dimensions, including (but not limited to) research areas of expertise and background. For instance, a team of three researchers from the field of Information Retrieval will probably think about problems in a similar way, while a natural language processing researcher, a statistician and a mathematician will have different perspectives and might find more innovative solutions. There is a catch, though: the less overlap between the backgrounds, the more difficult the communication will be, due to missing context and different vocabularies. Finding the right balance is key to having a healthy and productive team.

Takeaway

Assessing the quality of Data Scientists effectively is a key part of the hiring process that has to be understood as a bidirectional match between the candidate and the company. It starts by defining your role well and understanding what Data Scientists value the most. I recommend being very aware of unconscious bias and hiring people from backgrounds close enough that communication is fluid, but far enough apart that novel and innovative ideas are introduced in the group.

My most important lesson from this section is to always hire for more than just technical expertise: in many cases, especially in quickly growing companies, the influence and impact of a Data Scientist is much more correlated with their communication and contextualisation skills than with their knowledge of machine learning models.
