Welcome to the second post of my series about AI Talent. In the previous post, I introduced the series and discussed how to source the right talent, but finding a pool of great candidates is just the first step towards maximising the potential of AI in your organisation.
Assessing Talent: Measuring company-candidate fit
Once you have access to a large pool of talented individuals, you need to be able to understand, measure and evaluate the fit between each of the candidates and your organisation. It is critical to understand that this is a bidirectional relationship, where the goal is to find someone who would be great in your organisation and (simultaneously) for whom your organisation is great. This relates to many factors, including their interest in the role, the culture of the company and the individual, their career progression and their expertise and skills.
“Talent is cheaper than salt. What separates the talented individual from the successful one is a lot of hard work”,
Unfortunately, measuring the match quality is not easy as many variables are involved, ranging from the details and clarity of the role, to the factors that Data Scientists value the most. I will share some of my thoughts on some of these signals in the next sections.
Role Definition: What is a Data Scientist?
Before discussing how to measure Data Scientists, it is worth discussing what a Data Scientist is, given the fuzzy and chaotic world of professional titles in the AI space. This is a common problem and several people have proposed different divisions, such as the type A/B Data Scientists. My perspective is closer to the categories shown in the MMC AI Playbook: I see five high-level families of roles and responsibilities that people working in AI or Data Science tend to focus on:
Answering Questions given company data
The main goal is to investigate and obtain actionable insights given data (e.g., user interaction). The main requirement is to understand the connection between the business goals and the questions and answers that the data can provide. This role requires analytical and storytelling skills. For us, this person would be a Data Analyst. Other common titles in the market are Product Data Analyst and Product Analyst. Some companies also refer to this role as Data Scientist.
Creating Proofs of Concept (POCs) or AI-driven products
The main responsibility is to create end-to-end POCs or prototypes where AI adds value. This role requires a user- and problem-focused mindset and the ability to code full end-to-end systems. This person also has to understand the feasibility and potential of Data Science solutions and be able to work under high uncertainty. We define this role as Data Scientist, while other people also refer to it as Product Data Scientist, ML Engineer or Data Science Engineer.
Improving AI models and publishing papers
This role focuses on pushing the edge of Machine Learning and tends to be more about answering research questions and improving solutions to specific problems to the limit of what current technology allows. We know these people as Researchers and this is the main title in the market too, potentially prefixed with their area of specialisation (e.g., NLP Researcher or Deep Learning Researcher). You can also find variations with the word scientist in it, such as Research Scientist.
Robust deployment at scale
To benefit from Machine Learning, the solutions using it have to be “live” in some way or another, and this process has to be done in a robust, scalable and efficient manner. The people who ensure that the models are scalable, efficient and constantly monitored are usually referred to as Machine Learning Engineers or Data Engineers.
Strategic direction and management
This group is slightly broader but the principle is that anyone with one of the words Head, VP or Chief followed by ML, AI, Data, Data Scientist or Scientist will most likely be focused on strategic direction on the applicability of AI and/or on the structure and growth of the Research teams.
As mentioned before, these are high-level categories and each one of them could be further divided into more specialised sub-categories (e.g., Deep Learning Researcher). In reality, most Data Scientists (especially in start-ups) will be doing a combination of these roles. However, most people tend to plan their personal development towards one, or maybe two, of these career paths.
Important factors: What drives Data Scientists to join a company?
Understanding the types of data scientists is needed in order to measure the fit with the role and the company but it is also important to know the factors that drive them. In addition to the classic factors such as Office, Overall Package, Management or Company vision, there are some characteristics that are especially relevant for these roles. Figure 1 shows the “very important” factors for Data Scientists to choose a place of work, according to a survey by Kaggle that was summarised in the MMC State of AI report.
I would like to focus on three specific factors that I believe are especially important for Data Scientists: Learning, Impact and Publication Opportunities.
Learning: This includes continuous learning in the day-to-day work, but also working with peers you respect and want to learn from. It also includes the facilities and culture of learning in the company. For instance, at Signal AI, we have a budget for conferences, regular reading groups and research guilds where all researchers learn together and keep up to date with the latest advances in the field.
Impact: One of the major challenges in academia is the perceived lack of impact of your work. A relatively common worry for PhD students is whether anyone will actually bother reading their thesis. In the industrial world, especially in scaleup companies like Signal AI, the impact of research is clear. This is not only a great factor for revenue (driven by better products) and hiring, but it has also been pivotal for our collaborations with academia, such as the many MSc students from UCL who did their thesis at Signal AI.
Publication Opportunities: For researchers, if you want to keep the possibility of going back to academia, or at least to keep your research reputation, you need a reasonable publication record, both in quality and quantity. We are very proud of our collaborations with universities and our publication record. This was a deciding factor for some of our researchers to join Signal AI and for multiple universities to collaborate with us.
Main Skills: Look for more than “just” technical skills
Soft skills such as communication and contextualisation are as important as technical knowledge for a data scientist to be impactful within an organisation. Therefore, you should not focus on technical know-how and expertise alone. This is especially true for startups and scaleups, where everyone needs to be much more adaptable. One of the most important skills is the ability to communicate and collaborate effectively within a team, within a company and outside the organisation. This requires being able to adapt your communication style to completely different audiences with different goals and contexts. A related factor is business and product awareness and interest, which is the conduit to understanding the impact and value of lines of research.
I have been asked several times whether early-stage companies should hire “generalist” or “specialist” data scientists. Full-stack data scientists (with the development skills to build prototypes end to end) are more impactful in the early stages, as you are trying to validate product-market fit. At that point, scope and priorities will change wildly and the main necessity is to have someone who is adaptable. As the company (and the offering) matures, specialists will allow the optimisation of specific parts of the system.
Innovation: Look for different perspectives
One of the challenges related to assessing people is that we all have subconscious biases and, in many cases, we tend to hire people similar to ourselves. Talent is much more diverse and distributed than we tend to believe, and how you measure fit will be a major factor in the impact and innovation of your teams. This includes multiple dimensions, including (but not limited to) research areas of expertise and background. For instance, a team of three researchers from the Information Retrieval field will probably think about problems in a similar way, while three researchers with different backgrounds will have different perspectives and might find a more innovative solution. There is a catch, though: the less overlap between the backgrounds, the more difficult the communication will be due to missing context and different vocabularies. Finding the right balance is key to having a healthy and productive team.
Assessing the quality of Data Scientists effectively is a key part of the hiring process that has to be understood as a bidirectional match between the candidate and the company. It starts by defining your role well and understanding what Data Scientists value the most. I would recommend being very aware of unconscious bias and hiring people from backgrounds close enough that communication is fluid, but far enough apart that novel and innovative ideas are introduced in the group. However, my most important lesson from this section is to always hire for more than just technical expertise. In many cases, especially in quickly growing companies, the influence and impact of a data scientist is much more correlated with their communication and contextualisation skills than with their technical expertise.