DATA SCIENCE ROADMAP 2026
Data Science Roadmap for anyone interested in how to break into the field!
This repository is intended to provide a free Self-Learning Roadmap to learn the field of Data Science. I provide some of the best free resources.
Our Previous Roadmap
If you don't know what Data Science is, how the project life cycle works (from Business Understanding to Deployment), which programming language to choose, what the job descriptions look like, which soft and hard skills the field requires, where Data Science is applied, or what the most common mistakes are, then
This Video is for you (Highly Recommended)
Data Science vs Data Analytics vs Data Engineering - What's the Difference?
These terms are often wrongly used interchangeably, but there are distinct differences:
| Data Science | Data Analytics | Data Engineering |
|---|---|---|
| A multidisciplinary field that focuses on looking at raw and structured data sets and providing potentially actionable insights. Data Science is about making sure we are asking the right questions rather than finding exact answers. Data Scientists require skill sets centered on Computer Science, Mathematics, and Statistics. They analyze data with techniques such as machine learning, trend analysis, linear regression, and predictive modeling. The tools Data Scientists use to apply these techniques include Python and R. | Focuses on looking at existing data sets and creating solutions to capture, process, and organize data in order to draw actionable insights. This field looks for general process, business, and engineering improvements we can make based on questions we don't know the answers to. Data Analytics requires skill sets centered on Statistics, Mathematics, and a high-level understanding of Computer Science. It involves data cleaning, data visualization, and simple modeling. Common Data Analytics tools include Microsoft Power BI, Tableau, and SQL. | Focuses on creating the correct infrastructure and tools required to support the business. Data Engineers look for the optimal ways to store and extract data, which involves writing scripts and building data warehouses. Data Engineering requires skill sets centered on Software Engineering, Computer Science, and high-level Data Science. The tools Data Engineers use are mainly Python, Java, Scala, Hadoop, and Spark. |
Prepare your workspace
Tip 1 : Pick one and stick to it.
Anaconda: A toolkit that covers everything you need to write and run code, from a PowerShell prompt to Jupyter Notebook and PyCharm, and even RStudio (if you want to try R).
Atom: A customizable text editor that can be set up for Python development. Note that GitHub archived Atom in December 2022, so it no longer receives updates.
Google Colab: It's like a Jupyter Notebook but in the cloud. You don't need to install anything locally. The most important libraries are already installed, such as NumPy, Pandas, Matplotlib, and scikit-learn.
PyCharm: PyCharm is another excellent IDE that enables you to integrate with libraries such as NumPy and Matplotlib, allowing you to work with array viewers and interactive plots.
Thonny: Thonny is an IDE for teaching and learning programming. It comes with a debugger, supports code completion, and highlights syntax errors.
Tip 2 : Focus on at least one course.
Tip 3 : Don't chase certifications.
Tip 4 : Don't rush into ML without a good background in programming and math.
This track is divided into 3 phases :
1. Beginner: get a basic understanding of data analysis, tools, and techniques.
2. Intermediate: dive deeper into more complex topics in ML, Math, and data engineering.
3. Advanced: learn more advanced Math, DL, and Deployment.
For DataCamp courses, the GitHub Student Developer Pack gives 3 free months. Search online for how to get it.
If you have already used it, do not hesitate to contact us for an account with free access.
Legend
- Video Content
- Online Article Content / Book
Roadmap Explanation > YouTube Video
Beginner
Algorithms Book: Every piece of code could be called an algorithm, but this book covers the more interesting bits.
Specializations (data structures-algorithms)
1. Descriptive Statistics
Introduction to Statistics - DataCamp
Intro to Descriptive Statistics - YouTube old Udacity Course
Statistics Fundamentals - StatQuest - YouTube
Introduction to Statistics - YouTube
Online statistics education
Arabic Courses 1 - 2
Intro to Inferential Statistics++
Practical Statistics for Data Scientists
2. Probability
Khan Academy
Probability Bootcamp by Dr. Steve - Oct 2024 - YouTube
Arabic Course
Probability and Statistics for AI and DS - Arabic (Dr. Hatem Elattar)
Introduction to Probability
3. Programming Languages
R - good tool for visualization and statistical analysis.
Introduction to R (DataCamp)
Data Science Specialization - Coursera
An Introduction to R
R for Data Science
Python
Introduction to Python Programming
OOP
Arabic - Hassouna | Elzero
Python Full Course - FreeCodeCamp on YouTube
Intro to Python for CS and Data Science
more in OOP
4. Pandas
Corey Schafer-YouTube
Kaggle
Docs
Data School-YouTube
Arabic Course
PandasAI 1 - 2: Enhances the capabilities of Pandas by integrating generative AI functionality into it.
5. NumPy
Kaggle
NumPy Tutorial by Keith Galli - YouTube
Arabic Course - Elzero
Tutorial
Docs
6. SciPy
Tutorial
Docs
7. Data Cleaning: One of the MOST important skills you need to become a good data scientist. Mastering it takes practice on many different datasets.
Read this
Course 1
Notebook1
Notebook2
Notebook3
Kaggle Data cleaning
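To give a feel for what practicing data cleaning involves, here is a minimal sketch using only the Python standard library. The column names (`name`, `city`, `age`), the sample rows, and the list of missing-value markers are assumptions made up for illustration; the resources above teach the same ideas with Pandas on real datasets.

```python
from typing import Optional

# Assumed markers for "missing" values; real datasets will differ.
MISSING_MARKERS = {"", "na", "n/a", "null", "none", "-"}


def clean_value(value: Optional[str]) -> Optional[str]:
    """Trim whitespace and map common missing-value markers to None."""
    if value is None:
        return None
    text = value.strip()
    return None if text.lower() in MISSING_MARKERS else text


def clean_rows(rows):
    """Return cleaned, de-duplicated rows from a list of dicts of strings."""
    seen = set()
    cleaned = []
    for row in rows:
        # Normalise header names and cell values.
        record = {key.strip().lower(): clean_value(val) for key, val in row.items()}
        # Normalise the (hypothetical) "city" column to title case.
        if record.get("city") is not None:
            record["city"] = record["city"].title()
        # Convert the (hypothetical) "age" column to int, or None if invalid.
        age = record.get("age")
        try:
            record["age"] = int(age) if age is not None else None
        except ValueError:
            record["age"] = None
        # Skip rows that are exact duplicates after cleaning.
        fingerprint = tuple(sorted(record.items(), key=lambda kv: kv[0]))
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(record)
    return cleaned


raw = [
    {"Name": " Alice ", "City": "cairo", "Age": "30"},
    {"Name": "Alice", "City": "Cairo ", "Age": "30"},
    {"Name": "Bob", "City": "N/A", "Age": "unknown"},
]
result = clean_rows(raw)
print(result)
```

The first two rows only differ in whitespace and casing, so they collapse into one record once cleaned, which is a typical reason to clean before de-duplicating.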
8. Data Visualization
Introduction to Data Visualization with Matplotlib or
Corey Schafer - Playlist on Youtube or
sentdex - Playlist on YouTube
Kaggle to Data Visualization with Seaborn
Playlist-Youtube
Course1: Intro to Data Visualization with Seaborn
Course2: Intermediate Data Visualization with Seaborn
Course3: Understanding and Visualizing with Python
9. EDA
Note: EDA is already covered in the probability course mentioned above.
DataCamp-EDA in Python
IBM-EDA for Machine Learning
10. Dashboards
Power BI
Power BI - YouTube (Alex)
Power BI training
Arabic - YouTube (Zanoon)
Arabic - YouTube
Guy in a Cube - YouTube
Tableau
Data With Baraa - YouTube
Tutorial
Tableau Training
Course - DataCamp
Simplilearn - YouTube
11. SQL and DB
SQL for Data Analysis (Udacity-notesll or simplilearn)
Intro to SQL or IBM (SQL for Data Science)
Intro to Relational Databases in SQL
Arabic Course (Theoretical - Practical) Eldesouki
Arabic - ITI by Eng.Ramy Advanced - (Labs Answers + Notes + Full Materials)
Arabic - SQL for Data Analysis by Ahmed Sami
Data With Baraa - YouTube - [Materials]
365 Data Science - SQL
CMU Intro to DB - Fall 2022 - <Schedule> - Book
SQL for Data Analysis
Practice InterviewMaster & HackerRank & LeetCode & DataLemur
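As a taste of the kind of query these SQL resources build up to, here is a small sketch that runs an aggregate query against an in-memory SQLite database from Python. The `orders` table, its columns, and the sample values are hypothetical and exist only for this example.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("Mona", 120.0), ("Ali", 80.0), ("Mona", 30.0), ("Ali", 20.0), ("Sara", 50.0)],
)

# Total spend per customer, keeping only customers who spent at least 100.
rows = conn.execute(
    """
    SELECT customer, COUNT(*) AS n_orders, SUM(amount) AS total
    FROM orders
    GROUP BY customer
    HAVING SUM(amount) >= 100
    ORDER BY total DESC
    """
).fetchall()
conn.close()

for customer, n_orders, total in rows:
    print(customer, n_orders, total)
```

`GROUP BY` with `HAVING` filters on the aggregated value, whereas `WHERE` would filter individual rows before aggregation.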
12. DWH (Data Warehouse): A system used for reporting and a core component of business intelligence.
Mostly used by Data Engineers.
The Data Warehouse Toolkit
Data Warehousing Tutorial Videos
Garage Education (Ar)
Implementing Data Warehouse in Arabic (Ar)
More in Arabic? (Ar)
Data Warehouse - University of Colorado
[SSIS] SQL Server Integration Services (Ar)
Project - Building Sales Data Mart Using SSIS (Ar)
Project - Building DWH Step by Step
Project - Create DWH Fact and Dimensions (Ar)
Implement SCD in SSIS Continue the playlist
CDC in SSIS tutorial
13. Python Regular Expressions
Tutorial
Regular Expressions by Corey - YouTube
Arabic Course - Elzero starting from the 95th video.
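For a quick preview of what the regular expression resources above cover, here is a short sketch using Python's built-in `re` module. The sample text is made up, and the email pattern is a deliberately simplified assumption that is not a full validator.

```python
import re

# Hypothetical text used only for illustration.
text = "Contact: sara@example.com, omar@example.org on 2024-10-05 or 2025-01-17."

# Simplified email pattern: good enough for a demo, not RFC-compliant.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
# ISO-style dates with named groups for year, month, and day.
DATE_RE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

# findall returns every non-overlapping match as a string.
emails = EMAIL_RE.findall(text)

# finditer yields match objects, so named groups can be read individually.
dates = [
    (int(m["year"]), int(m["month"]), int(m["day"]))
    for m in DATE_RE.finditer(text)
]

# sub replaces every match, here to mask the email addresses.
masked = EMAIL_RE.sub("<email>", text)

print(emails)
print(dates)
print(masked)
```

Compiling the patterns once with `re.compile` keeps them reusable and makes the intent of each pattern easier to name.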
14. Time Series Analysis
Track - DataCamp
Course - Coursera
Book
fbprophet
Arabic Source Video1 & Video2
At the end of the Beginner phase, apply what you've learned in a project.
Intermediate
1. Math for ML: consists of Linear Algebra, Calculus and PCA.
Mathematics for Machine Learning and Data Science - Andrew Ng
Specialization
Mathematics for Machine Learning - Most of the needed basics
Linear Algebra
Khan Academy - Linear Algebra
Mathematics for Machine Learning: Linear Algebra
3Blue1Brown - Essence of Linear Algebra
Calculus
Multivariate Calculus - Coursera
Essence of calculus - Youtube
PCA
PCA - Coursera
2. Machine Learning
Coursera - Old Course by Andrew Ng (Octave/Matlab)
Coursera - Andrew Ng's new ML Specialization (Python)
Machine Learning - StatQuest - YouTube
Machine Learning Stanford Full Course on YouTube by Andrew Ng
CS480/680 Intro to Machine Learning - Spring 2019 - University of Waterloo
SYDE 522 - Machine Intelligence (Winter 2018, University of Waterloo)
Machine Learning for Engineers 2022 / (YouTube)
Introduction to Machine Learning Course - Udacity
Hesham Asem - Arabic content
IBM ML with Python
Machine Learning From Scratch - YouTube (Python Engineer)
Hands On ML (1st & 2nd & 3rd) Editions | Code:
ML Algorithms in Practice
ML scientist
Project
3. Web Scraping/APIs
course
intro2
Tutorial
Book for both topics
APIs
Tutorial
Article
Tutorial
4. Stats.
Think Stats - Book
Think Bayes - Book
5. Advanced SQL
Joining Data in SQL - DataCamp
Intermediate SQL - DataCamp
More advanced SQL
7. Feature Engineering
Tutorial
Article
Book
8. Interpreting Shapley-based explanations of ML models
SHAP
Kaggle ML explainability
After finishing this level, apply what you've learned in 2 or 3 good-sized projects.
Please read this book: Introduction to Statistical Learning with Applications in R. Seriously, read it.
Advanced
1. Deep Learning
Deep Learning Fundamentals
Introduction to Deep Learning - MIT
Specialization
Dive into Deep Learning (En) | (Ar) version Part1 & Part2
Deep Learning UC Berkeley
github of Dive into DL
Stanford Lecture - Convolutional Neural Networks for Visual Recognition
University of Waterloo - ML / DL
Deep Learning for coders with fastai & PyTorch
2. Tensorflow
Specialization
YouTube
fast.ai's Deep Learning Courses
TensorFlow is generally considered stronger in visualization tooling and in deploying trained models. Go for PyTorch if you want flexibility, easier debugging, and shorter training times.
3. PyTorch
PyTorch (UC Berkeley - YouTube) - Lec3 (The 5 parts)
PyTorch - Dr. Data Science - YouTube
PyTorch Tutorial - Aladdin - YouTube
PyTorch Course (2022) - YouTube
Deep Learning with PyTorch
Machine Learning with PyTorch and Scikit-Learn - 2022
4. Advanced Data Science
Advanced Data Science with IBM Specialization Includes Apache Spark
Advanced ML Topics | Lecs (YouTube)
Stanford CS330: Deep Multi-Task and Meta Learning I Autumn 2022 - Materials
18.409 Algorithmic Aspects of Machine Learning Spring 2015 - MIT
ML based Computer Vision | Lecs (YouTube)
CS 198-126: Modern Computer Vision Fall 2022 (UC Berkeley)
NOC:Deep Learning For Visual Computing - IIT Kharagpur
Deep Learning for Computer Vision - Michigan
5. NLP
Specialization - Coursera
Arabic - Ahmed El Sallab
Stanford CS224N Lectures - Winter 2021- YouTube
Stanford XCS224U Lectures - Spring 2021- YouTube
Introduction to Natural Language Processing in Python
LLMs: What is a Large Language Model?
Generative AI for Everyone (Andrew Ng) - Coursera [NEW]
Generative AI with LLMs
Stanford CS236: Deep Generative Models I 2023 - YouTube
Stanford CS25 - Transformers United 2023 - YouTube
Recent Advances on Foundation Models - Winter 2024 - University of Waterloo
Understanding LLMs Foundations and Safety UC Berkeley - Spring 2024 - YouTube
LLM Foundations
How do ChatGPT and Transformers work? 1 - 2 - 3 (overview and the math behind them)
Prompt Engineering | (Ar): If you want to get the most out of LLMs
LLMOps: A lecture going through the entire LLM pipeline
6. Inferential Statistics
Specialization, 2nd & 3rd courses
course
7. Bayesian Statistics
1 - From Concept to Data Analysis
2 - Techniques and Models
3 - Mixture Models
8. Model Deployment
Flask tutorial
TensorFlow: Data and Deployment Specialization
Deploy Models with TensorFlow Serving and Flask
How to Deploy a Machine Learning Model to Google Cloud - Daniel Bourke
If you're interested in more deployment methods, search for (FastAPI - Heroku - chitra).
9. MLOps: A combination of Model Deployment, Model Serving, Model Monitoring, and Model Maintenance.
MLOps-zoomcamp
MLOps-guide
Practical MLOps
10. Probabilistic Graphical Models
Specialization - Coursera
Spring 2016, University of Utah - YouTube
Read these books; they will be beneficial to you.
Bayesian Reasoning and Machine Learning
The Elements of Statistical Learning
Pattern Recognition and Machine Learning - Bishop (Advanced)
Recommended by Eng.Mohamed Hammad.
PROJECTS
Deena Gergis - End to end Project
Machine Learning Projects - Youtube
Top 10 Data Science Projects for Beginners
12 Data Science Projects for Beginners and Experts
Data Science Projects & Ideas
Top 310+ Machine Learning Projects for 2023
10 End-to-End Guided Data Science Projects
Real-World ML Tutorial w/ Scikit Learn
Python Codes in Data Science
End To End ML Project With Dockers,Github Actions And Deployment
12 free Data Science projects to practice Python and Pandas (resolve interactive online)
Common Tools
| English | Arabic | Book |
|---|---|---|
| Git - Udacity | Scribble Without Worry (Arabic Git course) | Pro Git |
| w3schools | almadrasa | |
| Elzero |
More Books Check This!
12 Free Important Books
Mathematics for Machine Learning
An Introduction to Statistical Learning
Understanding ML: From Theory to Algorithms
Probabilistic Machine Learning: An Introduction
Storytelling with Data: An important data visualization guide.
Collection of the best Cheat sheets
- Pandas
- Machine Learning Cheat Sheets (Recommended Guide): Review the topics in this sheet and see what you are missing.
Competitions will make you even more proficient in Data Science.
Kaggle is one of the most popular platforms for data science competitions, and it hosts many competitions that you can join according to your knowledge level.
You can also check these platforms for data science competitions:
- Driven Data
- Codalab
- Iron Viz
- Topcoder
- CrowdANALYTIX Community
- Bitgrit
Interview Preparation: Your Roadmap to Success
Data Science Interview Questions:
- (7) 30 days of interview preparation
Practical Interview Questions from Actual Companies: Data Analysis & Data Engineering by @Prepare.sh.
Data Science Podcasts:
The Best Way to Stay Up-to-Date on the Latest Data Science Trends and Developments
| Podcasts | About | Produced by |
|---|---|---|
| Data Science at Home | A podcast that provides practical advice and tutorials on data science topics. | Francesco Gadaleta |
| Data Stories | An interview-driven podcast that tells the stories of data scientists and how they're using their skills to make a difference in the world. | Enrico Bertini and Moritz Stefaner |
| O'Reilly Data Show | A podcast that covers a wide range of data science topics, from machine learning to artificial intelligence to big data. | Ben Lorica, the Chief Data Scientist at O'Reilly |
| Learning Machines 101 | Mathematics, statistics, and algorithms that power the machine learning systems that we rely on every day. | Richard Golden |
| Data Engineering Podcast | Tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation. | Tobias Macey |
| Data Science Mixer | A great resource for anyone who wants to learn more about data science and the latest trends in the field. It is also a great way to get inspired by the work of other data scientists and machine learning engineers. | Alteryx, a data science and analytics software company |
| Chai Time Data Science Show | Interviews top data scientists, practitioners, and researchers from around the world. | Sanyam Bhutani |
| Becoming a Data Scientist | Podcast that interviews data scientists about their journey to becoming a data scientist. | Renee Teate |
| AI Today Podcast | Explores the latest trends and developments in artificial intelligence. | Ron Schmelzer and Kathleen Walch |
| Gradient Dissent | A weekly podcast that explores the latest research in machine learning and artificial intelligence. | Lukas Biewald, co-founder of Weights & Biases |
| Data Skeptic | A podcast that challenges the conventional wisdom in data science and asks tough questions about the ethics and implications of data-driven decision making. | Kyle Polich, a data scientist and machine learning engineer |
| Linear Digressions | A podcast that covers a wide range of data science topics, from the technical to the theoretical. | Katie Malone and Ben Jaffe |
| The Data Engineering Show | For data engineering and BI practitioners to go beyond theory, and learn from the biggest influencers in tech about their practical day to day data challenges. | Eldad Farkash and Benjamin Wagner, who are both data engineering experts with experience at companies like Firebolt and Sisense |
| DataTalks.Club | A weekly online community of data enthusiasts and practitioners that learn from each other and share their knowledge and experiences through meetups, workshops, and a podcast. | A rotating cast of data experts |
| Datacast | Top data scientists and practitioners in the data and AI infrastructure space. | James Le |
| How to Get an Analytics Job Podcast | A great resource for anyone who is interested in a career in analytics. The guests share their insights and advice on how to get started in analytics and how to succeed in an analytics career. | John David Ariansen, an analytics agency owner and career coach |
| The Analytics Power Hour | Five awesome people, an occasional guest, and drinks all around tackling the hottest data and analytics topics of the day. | Tim Wilson, Michael Helbling, Josh Crowhurst, and Val Kroll. They are all analytics experts from different companies |
Arabic Podcasts??
I see you, bored on your commute.
Arabic Data Podcast | Spotify by Eng. Kareem Abdelsalam
To Data and Beyond by Eng. Youssef Hosni
Garage Education by Eng. Mostafa Alaa
Data Science in Arabic
Data Analysis Recommendations.
Books (The Data Analysis Workshop & Head First Data Analysis)
Google Data Analytics Professional Certificate
IBM Data Analyst Professional Certificate
Google Advanced Data Analytics Professional Certificate [NEW]
Alex The Analyst - YouTube
FWD - (The 3 Levels)
Arabic - ITI - BI Dev Track
Note: Good knowledge of, and projects in, just Excel, SQL, and Power BI / Tableau can bring you great opportunities.
- Excel More Resources: (Arabic 1 - Arabic 2 - Books and cheat sheets for revising)
Data Engineering Recommendations.
Books (Fundamentals of Data Engineering & Designing Data-Intensive Applications)
Arabic Podcast, Starting a Career in Data Engineering.
For Arabic speakers, I recommend 2 YouTube channels: (Garage Education & Big Data in Arabic)
Roadmap 1 - (Recommended)
Roadmap 2
Roadmap 3
IBM Data Engineering Professional Certificate
Note: Good knowledge of, and projects in, SQL, Python, Apache Spark/Hadoop, Data Modeling, and Data Warehouse (Arabic, starting from the 7th video) can bring you great opportunities. Start with them, then go for the other tools, concepts, and cloud platforms.
CV / Resumes
- Common mistakes by Yehia Arafa Mostafa
- CV Tips by Omar Yasser
- This Is What A GOOD Resume Should Look Like by careercup
- After you have made a beta version of your resume, check these reviews from Mostafa Nageeb
- After Graduation by Yasser Alaa
- How to make Data Science Resume
- Data Science Resume Guide
- Resume/CV building for Data Jobs (Arabic)
Video 1
Video 2
Data & AI Companies in Egypt - AI/ML Driven Companies In Egypt