“Data science” is a relatively new field that combines statistics, machine learning and programming to solve real-world problems using data. Whilst many companies understand why they need data scientists, many don’t know what skills they should be looking for in potential hires. A lot of data science job descriptions carry a long list of skills, which suggests that companies are looking for a “unicorn” with every possible combination of the skills required. In reality, such unicorns rarely exist and it’s often more practical to build a data science team from individuals with strengths in different areas.
Broadly speaking, data careers can be divided into data science and data engineering. Data science is more to do with analysing, visualising and deriving meaningful insights from data. Data engineering, on the other hand, is more concerned with building data pipelines that can handle large datasets. Data engineering is more closely related to software engineering, whilst data science tends to suit candidates from a quantitative or scientific background (including maths, physics, quantitative biology/neuroscience, computer science, engineering, etc). In reality, this distinction is often not clear in job descriptions. Also, due to a shortage of talent, overlap between data science and data engineering roles is quite common. See the four types of data scientists for more.
How do you get your first job as a data scientist?
Getting that first job can be a very important first step on your journey to becoming a competent data scientist. It can be very challenging, especially if you’re coming straight from university with little “real world” experience. A PhD alone will not give you automatic entry into a data science job; you will have to be able to demonstrate what you can contribute to a data science team. It’s worth mentioning that there are various paths to becoming a data scientist and you don’t necessarily need a PhD to get a job as one, but you do have to demonstrate a breadth of skills.
- PhD route: A common route into a data science role is a PhD in a quantitative/scientific field such as physics, mathematics, engineering, neuroscience, biology, bioinformatics, computer science, etc. There are bootcamps that specifically recruit PhD graduates, who get to work on a real-world project in collaboration with other graduates. Some popular bootcamps include the ASI fellowship, S2DS and Insight fellowship programmes (note that the first two are based in the UK whilst the third is in the USA). For a more comprehensive list of data science bootcamps, see the following link.
- Masters/undergraduate route: Data scientists can be hired into entry-level data science roles straight after an undergraduate or master’s degree. It’s important to demonstrate your competencies through projects and, if possible, by doing an internship with a data-driven company.
- Portfolio/work experience route: This may be suitable for individuals who are already in industry and wish to move into data science roles. Increasingly, many software engineers find themselves in this situation.
So the question is: “How do you land that first job?”
- Know the basics of data science theory very well. This includes mathematics/statistics (linear algebra, calculus, numerical optimization, regression, algorithms, etc), programming (at the very minimum, Python and R), machine learning techniques, visualisation and some familiarity with big data tools. If you are interested in data engineering, it’s crucial to get familiar with big data tools such as Scala, Apache Spark and Hadoop. As a general rule, try to be comfortable with at least one of the data languages (R or Python); this means at least 10,000 hours of coding in that particular language.
- Demonstrate your interest by undertaking a data science project in your spare time. Find a question that you can address using online data and showcase your work on a GitHub account. If you’re already studying for a master’s or a PhD, try to demonstrate your data science interest through your projects.
- If possible, get some industry experience that is relevant to data science and to the industry of your choice. Domain knowledge is crucial to being an effective data scientist. This can be acquired through an internship, Kaggle competition, hackathon or previous work experience.
- Network within the data science community by attending meetups and conferences and by arranging meetings with data-driven companies of interest.
- Keep up to date with the field by reading new articles and publications and following new algorithms.