Best Data Science Books
I often get asked which resources I recommend for people who want to start their Data Science journey. This section lists the books I think you should read at least once in your life as a Data Scientist.
Do you need to read these books to become a Data Scientist? The answer is: no. There are plenty of tutorials and free materials online that are as good as these books. However, if you can afford to buy them and read them as supplementary material, they can be a very good learning resource. Unlike online tutorials, these books teach concepts in an organized, structured manner. This means that instead of spending time searching the internet for good tutorials, you can spend that time learning.
The books I recommend here cover the main topics that you will need to master as a Data Scientist: programming (Python), data analysis, and Machine Learning (including deep learning). I know there are plenty of books on each topic, but these are the ones that I have used in my own learning journey and can truly recommend.
Note: The following books are paid resources that I've attached affiliate links to. This means that if you click on a link and purchase a book, the price stays the same for you but part of what you pay goes to me. These funds help me run this site.
As a Data Scientist, you should first of all be a good programmer, or at least be working towards proficiency in at least one language. I recommend learning Python because it is widely used in Data Science and has a relatively gentle learning curve.
This book is like a Python bible. It has around 1600 pages and covers all the basic and more advanced Python concepts.
It is a good book for someone starting with Python, as it has in-depth explanations of the language and of programming concepts, and the content is presented in a simple, understandable manner.
It is also a very good refresher for someone who has been working with Python for a while but wants to get better at it and improve their understanding of the language and of common concepts, especially Object-Oriented Programming.
This book covers almost everything that concerns data analysis, data cleaning, and data preprocessing with pandas. And what do Data Scientists do most of the time?
Fortunately or unfortunately, we spend most of our time preparing data to fit Machine Learning algorithms. This book covers it all, with just enough Python for a data analyst or junior Data Scientist to get familiar with programming and with the libraries popular for data analysis.
Additionally, this book was written by Wes McKinney, the author of the pandas package. And who better to learn data analysis from than the author of one of the most popular Python data analysis libraries ever created?
If you were to buy only one book about Machine Learning, this would be my choice.
It suits a beginner Data Scientist who wants an overview of Machine Learning algorithms and how to implement them on real-life examples using scikit-learn.
It is also a good refresher for someone who is already familiar with Machine Learning concepts and wants a book for quick reference and review.
Additionally, it has a fantastic second section that focuses on deep learning with Keras and TensorFlow.
Other topics in Data Science
Being a Data Scientist involves more than Python programming, data analysis, and Machine Learning. There are other topics that you should master in this profession. The first areas that come to my mind are Maths and Statistics.
I am not recommending any books on those topics, as I have been relying on my high school and university knowledge and supplementing it with online tutorials and resources. If I read any good books on those topics, I will update this list.