
What skills should a data scientist have?
21 March 2017

Contrary to what many people assume, the data scientist is not necessarily a professional with a “numerical” background. While the disciplines most commonly associated with data science, such as mathematics, statistics, physics or other sciences, do provide a useful foundation, data scientists come from a wide variety of academic backgrounds. Some come from fields such as telecommunications, engineering or computer science, but others come from seemingly remote areas such as communication, economics, finance, or biomedicine.
Why? The data scientist’s most important job is to analyze data: they play with it, work with it, question it, yearn for it. The data scientist should be a curious, creative, innovative, and even defiant person, capable of questioning the status quo. That’s why their training is less decisive than their attitude.
Technical skills
There is no doubt that the data scientist’s work revolves around the combination of technology, creativity, and data. While there is likely a set of prerequisites that are fundamental to a data scientist’s overall performance, the data scientist’s profile will branch out into other specialties over time.
In short, the data scientist should be fully at ease with the following four disciplines:
- Statistics / Mathematics: the data scientist should be able to analyze databases, build models, make statistical forecasts, and distinguish what is representative from what is not. Therefore, they should have a strong mathematical background that allows them to master supervised predictive techniques (data mining, machine learning) and unsupervised segmentation models. Prior to this modeling, they should be able to work with the mathematical techniques of data pre-processing and, once the model is built, with those of evaluation. In short, they should have the skill set needed to construct and evaluate a predictive model, and be able to apply statistical logic in a programming language.
- Technology: in order to transform data into knowledge, the data scientist must understand the technological needs of the business and know how to implement them. Algorithm design is key to data transformation and demands fluency in multiple programming languages, as well as thorough knowledge of database management. It’s very important to be proficient in automation, since many processes are run repeatedly on a computer while the data scientist is working on refining or calibrating the model.
- Business analytics: the data scientist should speak the corporate language and understand the business’s objectives, the industry in which the company operates, and the processes that drive the company’s growth and profit. This is the only way in which they will be properly equipped to discern which problems can be solved through data processing. And they will only be able to translate data analysis into valuable recommendations for the company through this deep understanding of the company’s inner workings. Without adequate knowledge of the business environment, mere technical knowledge can lead to a rejection of the “techie,” a difficulty in understanding them, or even awkward situations in which the data scientist has only offered obvious answers.
- Communication: the data scientist will at some point have to present the results of their work, which are based on analysis rather than on experience, to other professionals. These are often managers with decision-making powers and extensive business acumen but no technical training, and the presentation must not fall short on meticulousness or accuracy. That’s why they should be able to communicate with ease and create a dialogue at the level of their audience. It’s paramount that the result of an analytical process can be understood by any manager within the company, whether they are an engineer or a social media specialist.
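As a rough illustration of the pre-process, build and evaluate cycle described under Statistics / Mathematics, here is a minimal Python sketch using only the standard library. The toy figures (advertising spend against sales), the function names and the choice of a simple linear model are all assumptions made for this example, not something prescribed by the study.

```python
from statistics import mean, pstdev

def standardize(values, mu, sigma):
    """Pre-processing: rescale values using a given mean and standard deviation."""
    return [(v - mu) / sigma for v in values]

def fit_line(xs, ys):
    """Model building: ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def mean_absolute_error(actual, predicted):
    """Evaluation: average size of the prediction errors."""
    return mean([abs(a - p) for a, p in zip(actual, predicted)])

# Hypothetical toy data: advertising spend (x) and sales (y).
train_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
train_y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
test_x = [7.0, 8.0]
test_y = [14.1, 15.9]

# Scaling parameters come from the training data only, so that
# nothing about the held-out data leaks into the model.
mu, sigma = mean(train_x), pstdev(train_x)
slope, intercept = fit_line(standardize(train_x, mu, sigma), train_y)
predictions = [slope * x + intercept for x in standardize(test_x, mu, sigma)]

# Compare against a naive baseline that always predicts the training mean.
model_mae = mean_absolute_error(test_y, predictions)
baseline_mae = mean_absolute_error(test_y, [mean(train_y)] * len(test_y))
print(f"model MAE: {model_mae:.3f}, baseline MAE: {baseline_mae:.3f}")
```

Comparing the model with a naive baseline is one simple way to check whether it has learned anything worth reporting. In practice a data scientist would more likely reach for an established library than hand-written functions like these.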
Skills above and beyond the technical
The data scientist doesn’t only live off technical know-how. Ideally, the above capabilities are complemented by a series of personal characteristics, thereby forming a skill set in which specialization merges with human qualities.
- Creativity: this is necessary to bring a different perspective to data analysis, through the ability to find new ways to collect, interpret, and analyze data. The technology itself is not the differentiator, since the same programs are available to any organization. That’s why know-how is of vital importance: the tools may be the same for everyone, but the minds handling them are not. Technological uniformity melts away when intelligence is added, turning the results of a software solution, perhaps the same one the competition uses, into unique ones.
- Intuition: the ability to choose between one way or another of reaching a solution is extremely important. Experts underline the importance of bringing an artistic component to a technical working process that usually follows a fixed sequence (data processing, curation, modeling, etc.) but requires an intuitive spark to discern which steps deserve critical analysis.
- Flexibility: trial-and-error mechanisms allow us to evaluate and choose one option or another for the work already underway, complementing – or even rectifying – decisions made before starting the project. There is no single mathematical model; models are grouped into toolboxes that include different techniques. Therefore, agility is required to opt for a particular analytical tool or technique, depending on the structure of the data, the information available, etc. That could be a weak point for professionals trained in theory but with little practical experience.
- Curiosity: understood as the ability to ask questions, understand what is being asked, and consider which path is the right one to take. Curiosity is essential for keeping abreast of new techniques and constantly refreshing knowledge. Ultimately, it is what allows meaningful inferences to be drawn from the data.
- Empathy: although their work is the result of hours and hours spent in front of a computer, the data scientist is not a lone wolf. The human factor must be present in their daily lives, in the sense that their work depends on collaboration with other departments, and it is impossible to pull it off without cooperation. Because they are used to moving between projects and areas, their challenge lies in creating a free-flowing dialogue with other parts of the organization. Besides that, they may sometimes have to present undesirable results to customers or superiors, which further reinforces the importance of the personal touch.
- Pragmatism: finally, there’s no point in all this theoretical analysis if it doesn’t come with a practical impact. Technical skills are of little use if the data scientist isn’t able to integrate into a team or convert all their analytical potential into results that benefit the company or other working groups. Therefore, they must be able to turn data analysis into insights or actions with a direct impact on the business.
__________________________________________________________________________________
This article is part of the study “Data Scientists: Who are they? What do they do? How do they work?”, available on Rebel Thinking.

