mjbahmani/10stepstobecomeadatascientist
Ready to learn or review your knowledge! You will learn 10 skills as a data scientist: Python, Machine Learning, Deep Learning, Data Cleaning, EDA, Python packages (such as Numpy, Pandas, Seaborn, Matplotlib, Plotly, TensorFlow, Theano…), Linear Algebra, Big Data, and Analysis Tools. You will also solve some real problems, such as predicting house prices.
repo name  mjbahmani/10stepstobecomeadatascientist 
repo link  https://github.com/mjbahmani/10stepstobecomeadatascientist 
homepage  
language  Jupyter Notebook 
size (curr.)  47104 kB 
stars (curr.)  1124 
created  20181010 
license  Apache License 2.0 
10 Steps to Become a Data Scientist
CLEAR DATA. MADE MODEL.
last update: 19/07/2019
 Python
 Python Packages
 Mathematics and Linear Algebra
 Programming & Analysis Tools
 Big Data
 Data visualization
 Data Cleaning
 How to solve Problem?
 Machine Learning
 Deep Learning
Introduction
If you read and follow job ads for machine learning experts or data scientists, you will find that there are some skills you must have to get the job. In this repository, I want to review 10 skills that are essential for getting the job.
In fact, this repository is a reference to 10 other notebooks, with which you can learn all of the skills that you need.
1- Python
Python is a modern, robust, high-level programming language. It is very easy to pick up even if you are completely new to programming.
Python is used for:

web development (server-side)

software development

mathematics

system scripting

You can read about and learn the following topics in this notebook:

Basics

Functions

Types and Sequences

More on Strings

Reading and Writing CSV files

Dates and Times

Objects and map()

Lambda and List Comprehensions

OOP
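As a rough illustration of a few topics from the list above (functions, list comprehensions, lambda with map(), dates, and CSV reading), here is a minimal sketch that uses only the standard library. The sample data and the helper name `parse_rows` are invented for this example and do not come from the notebook.

```python
import csv
import io
from datetime import date

# Hypothetical sample data; a real notebook would read this from a file.
CSV_TEXT = "name,joined,score\nada,2018-10-10,91\nlinus,2019-07-19,78\n"


def parse_rows(text):
    """Read CSV text into a list of dicts with typed values."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {
            "name": row["name"],
            "joined": date.fromisoformat(row["joined"]),
            "score": int(row["score"]),
        }
        for row in reader
    ]


rows = parse_rows(CSV_TEXT)

# List comprehension: names of people scoring above 80.
high_scorers = [r["name"] for r in rows if r["score"] > 80]

# map() with a lambda: capitalize every name.
capitalized = list(map(lambda r: r["name"].capitalize(), rows))

# Date arithmetic: number of days between the two join dates.
days_apart = (rows[1]["joined"] - rows[0]["joined"]).days

print(high_scorers, capitalized, days_apart)
```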
To read this section, please fork this kernel:
numpy-pandas-matplotlib-seaborn-scikit-learn
2- Python Packages

Numpy

Pandas

Matplotlib

Seaborn
In this step, we have comprehensive tutorials for five Python packages. After that, you can start reading my other kernels about machine learning and deep learning.
2-1. Numpy

Creating Arrays

Combining Arrays

Operations

Math Functions

Indexing / Slicing

Copying Data

Iterating Over Arrays

The Series Data Structure

Querying a Series
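To give a feel for the Numpy topics listed above (creating, combining, operating on, slicing, copying, and iterating over arrays), here is a small sketch. The array values are arbitrary and chosen only for illustration; the kernel itself may use different examples.

```python
import numpy as np

# Creating arrays
a = np.arange(6).reshape(2, 3)   # a 2x3 array holding 0..5
b = np.ones((2, 3), dtype=int)   # a 2x3 array of ones

# Combining arrays: stack b underneath a
stacked = np.vstack([a, b])

# Element-wise operations and math functions
total = a + b
root = np.sqrt(a)

# Indexing / slicing: first column of a (a view, not a copy)
first_col = a[:, 0]

# Copying data: changing the copy leaves the original untouched
c = a.copy()
c[0, 0] = 99

# Iterating over arrays row by row
row_sums = [int(row.sum()) for row in a]
```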
2-2. Pandas

The DataFrame Data Structure

Dataframe Indexing and Loading

Missing values

Merging Dataframes

Making Code Pandorable

Group by

Scales

Pivot Tables

Date Functionality

Distributions in Pandas

Hypothesis Testing
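Several of the Pandas topics above (missing values, merging DataFrames, group by, and pivot tables) can be sketched in a few lines. The toy tables `sales` and `regions` are invented for this example, and filling missing values with 0 is only one of several reasonable choices.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, invented for this example.
sales = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Rome", "Rome"],
    "year": [2018, 2019, 2018, 2019],
    "units": [10.0, np.nan, 7.0, 9.0],
})
regions = pd.DataFrame({"city": ["Oslo", "Rome"], "region": ["North", "South"]})

# Missing values: fill the gap with 0 (an assumption; dropping or
# interpolating may suit other data better).
sales["units"] = sales["units"].fillna(0)

# Merging DataFrames on a shared key.
merged = sales.merge(regions, on="city", how="left")

# Group by: total units per region.
per_region = merged.groupby("region")["units"].sum()

# Pivot table: cities as rows, years as columns.
pivot = merged.pivot_table(
    index="city", columns="year", values="units", aggfunc="sum"
)
```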

Matplotlib

Scatterplots

Line Plots

Bar Charts

Histograms

Box Plots

Heatmaps

Animations

Interactivity

DataFrame.plot
2-3. Seaborn

Seaborn Vs Matplotlib

Useful Python Data Visualization Libraries
2-4. SKlearn

Introduction

Algorithms

Framework

Applications

Data

Supervised Learning: Classification

Separate training and testing sets

linear, binary classifier

Prediction

Back to the original three-class problem

Evaluating the classifier

Using the four flower attributes

Unsupervised Learning: Clustering

Supervised Learning: Regression
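The classification steps listed above (separating training and testing sets, fitting a linear classifier, predicting, and evaluating) might look roughly like the following sketch. It assumes the iris data set, which the "four flower attributes" and "three-class problem" items suggest, and it uses `LogisticRegression` as the linear classifier; that choice is an assumption made here, not necessarily the model used in the kernel.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load the iris data: 150 flowers, four attributes, three classes.
X, y = load_iris(return_X_y=True)

# Separate training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit a linear classifier (logistic regression is an assumption here).
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

# Prediction and evaluation on the held-out set.
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"test accuracy: {accuracy:.2f}")
```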
To read this section, please fork this kernel:
numpy-pandas-matplotlib-seaborn-scikit-learn
3- Mathematics and Linear Algebra
To read this section, please fork this kernel:
4- Programming & Analysis Tools
To read this section, please fork and upvote this kernel:
5- Big Data
To read this section, please fork this kernel:
A Comprehensive Deep Learning Workflow with Python
6- Data Visualization
To read this section, please fork this kernel:
7- Data Cleaning
To read this section, please fork this kernel:
8- How to solve Problem?
The purpose of this section is to solve a few real problems, so we have tried to solve some problems such as Quora, Elo, and house price prediction. To read this section, please fork this kernel:
A Comprehensive Deep Learning Workflow with Python
9- Machine Learning
To read this section, please fork this kernel:
A Comprehensive ML Workflow with Python
Do You Need Help?
I hope you have enjoyed reading my Python notebooks.
If you have any problems or questions about running the notebooks, please open an issue here on GitHub.
Most of my notebooks need a data set as input.
To use the correct data, please download the data set from the Kaggle site and put it in your notebook folder.
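Once the file is in the notebook folder, it can be loaded with a relative path. The sketch below assumes the data set is a CSV file; the file name `train.csv` and the helper `load_dataset` are hypothetical, so substitute the name of the file you actually downloaded.

```python
from pathlib import Path

import pandas as pd

# Hypothetical file name; use whatever CSV you downloaded from Kaggle.
DATA_FILE = Path("train.csv")


def load_dataset(path=DATA_FILE):
    """Load a CSV from the notebook folder, with a clear error if it is missing."""
    if not path.exists():
        raise FileNotFoundError(
            f"{path} not found; download the data set from Kaggle and place it "
            "next to the notebook."
        )
    return pd.read_csv(path)
```

Calling `load_dataset()` with no arguments looks for the file next to the notebook; pass a different `Path` if your data lives elsewhere.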
MJ Bahmani
Citation
If you use my code in your research, please cite this project.
@misc{10stepstobecomeadatascientist,
author = {MJ Bahmani},
title = {10stepstobecomeadatascientist},
howpublished = {\url{https://github.com/mjbahmani/10stepstobecomeadatascientist/}},
year = {2018}
}
Have Fun!