khanhnamle1994/crackingthedatascienceinterview
A Collection of Cheatsheets, Books, Questions, and Portfolio For DS/ML Interview Prep
repo name  khanhnamle1994/crackingthedatascienceinterview 
repo link  https://github.com/khanhnamle1994/crackingthedatascienceinterview 
homepage  https://medium.com/crackingthedatascienceinterview 
language  Jupyter Notebook 
size (curr.)  246863 kB 
stars (curr.)  465 
created  2018-08-09
Here are the sections:
 Data Science Cheatsheets
 Data Science EBooks
 Data Science Question Bank
 Data Science Case Studies
 Data Science Portfolio
 Data Journalism Portfolio
 Downloadable Cheatsheets
Data Science Cheatsheets
This section contains cheatsheets on basic data science concepts that are likely to come up in interviews:
 SQL
 Statistics and Probability
 Mathematics
 Machine Learning Concepts
 Deep Learning Concepts
 Supervised Learning
 Unsupervised Learning
 Computer Vision
 Natural Language Processing
 Stanford Materials
Data Science EBooks
This section contains books that I have read about data science and machine learning:
 Intro To Machine Learning with Python
 Machine Learning In Action
 Python Data Science Handbook
 Doing Data Science: Straight Talk From The Front Line
 Machine Learning For Finance
 Practical Statistics for Data Science
 A/B Testing
Data Science Question Bank
This section contains sample questions that were asked in actual data science interviews:
 Data Interview Qs
 Data Science Prep
 Interview Query
 Analytics Vidhya
 Springboard
 Elite Data Science
 Workera
 150 Essential Data Science Questions and Answers
Data Science Case Studies
This section contains case study questions that concern designing machine learning systems to solve practical problems.
Data Science Portfolio
This section contains a portfolio of data science projects that I completed for academic, self-learning, and hobby purposes.
For a more visually pleasant browsing experience, check out jameskle.com/dataportfolio

Recommendation Systems

Transfer Rec: My ongoing research work at the intersection of deep learning and recommendation systems.

Movie Recommendation: Designed 4 different models that recommend items on the MovieLens dataset.
Tools: PyTorch, TensorBoard, Keras, Pandas, NumPy, SciPy, Matplotlib, Seaborn, Scikit-Learn, Surprise, Wordcloud


Machine Learning

Trip Optimizer: Used XGBoost and evolutionary algorithms to optimize the travel time for taxi vehicles in New York City.

Instacart Market Basket Analysis: Tackled the Instacart Market Basket Analysis challenge to predict which products will be in a user’s next order.
Tools: Pandas, NumPy, Matplotlib, XGBoost, Geopy, Scikit-Learn


Computer Vision

Fashion Recommendation: Built a ResNet-based model that classifies and recommends fashion images in the DeepFashion database based on semantic similarity.

Fashion Classification: Developed 4 different Convolutional Neural Networks that classify images in the Fashion MNIST dataset.

Dog Breed Classification: Designed a Convolutional Neural Network that identifies dog breeds.

Road Segmentation: Implemented a Fully-Convolutional Network for the semantic segmentation task on the KITTI Road dataset.
Tools: TensorFlow, Keras, Pandas, NumPy, Matplotlib, Scikit-Learn, TensorBoard


Natural Language Processing
 Classifying Tweets with Weights & Biases: Developed 3 different neural network models that classify tweets from a crowdsourced dataset on Figure Eight.

Data Analysis and Visualization

World Cup 2018 Team Analysis: Analysis and visualization of the FIFA 18 dataset to predict the best possible international squad lineups for 10 teams at the 2018 World Cup in Russia.

Spotify Artists Analysis: Analysis and visualization of musical styles from 50 different artists with a wide range of genres on Spotify.
Tools: Pandas, NumPy, Matplotlib, Rspotify, httr, dplyr, tidyr, radarchart, ggplot2

Data Journalism Portfolio
This section contains a portfolio of data journalism articles that I wrote for freelance clients and for self-learning purposes.
For a more visually pleasant browsing experience, check out jameskle.com/datajournalism

Statistics

Machine Learning

Deep Learning

The 8 Neural Network Architectures ML Researchers Need to Learn

The 5 Deep Learning Frameworks Every Serious Machine Learner Should Be Familiar With

The 5 Computer Vision Techniques That Will Change How You See The World

Convolutional Neural Networks: The Biologically-Inspired Model

Recurrent Neural Networks: The Powerhouse of Language Modeling

The 7 NLP Techniques That Will Change How You Communicate in the Future

The 3 Deep Learning Frameworks For End-to-End Speech Recognition That Power Your Devices

The 5 Algorithms for Efficient Deep Learning Inference on Small Devices

The 4 Research Techniques to Train Deep Neural Network Models More Efficiently

The 2 Hardware Architectures for Efficient Training and Inference of Deep Nets
Downloadable Cheatsheets
These PDF cheatsheets come from BecomingHuman.AI.