khangich/machinelearninginterview
Machine Learning Interviews from FAAG, Snapchat, LinkedIn.
repo name  khangich/machinelearninginterview 
repo link  https://github.com/khangich/machinelearninginterview 
size (curr.)  4984 kB 
stars (curr.)  1305 
created  2020-08-11 
Minimum Viable Study Plan for Machine Learning Interviews
The Machine Learning System Design course is now available; become a sponsor to get started. We will launch the course on educative.io in January 2021.
Section  

1. Youtube Recommendation  
2. The main components in MLSD  
3. LinkedIn Feed Ranking  
4. Ad Click Prediction  
5. Estimate Delivery time  
6. Airbnb Search ranking 
Getting Started
How to  Resources 

Prepare for interview  Common questions about the machine learning interview process. 
Study guide  The study guide contains the minimum set of focus areas to ace your interview. 
Design ML system  ML system design includes actual ML system design use cases. 
ML usecases  ML use cases from top companies 
Test your ML knowledge  Machine learning quizzes are based on actual interview questions from dozens of big companies. 
Practice coding  Leetcode questions by categories for MLE 
Advanced topics  Read advanced topics 
Mock interview  Contact helppreparemle@gmail.com 
Study guide
LeetCode (not all companies ask Leetcode questions)

NOTE: a lot of companies do NOT ask Leetcode questions. There are many paths to becoming an MLE, and you can create your own path if you feel that leetcoding is a waste of time.

I use LC time tracking to keep track of how many times I solve a question and how long I spend each time. Once I have finished a nontrivial medium LC question 3 times, I have absolutely no issues solving it in an actual interview (sometimes within 8-10 minutes). It makes a big difference.
Leetcode questions by categories
SQL
 Know SQL join: self join, inner, left, right etc.
 Use hackerrank to practice SQL.
 Revise/Learn SQL Window Functions: window functions
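
To make the join and window function items concrete, here is a small sketch using Python's built-in sqlite3 module. The `employees` table and its rows are made up for illustration, and the window function part assumes the bundled SQLite is version 3.25 or newer.

```python
import sqlite3

# Hypothetical in-memory table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept TEXT,
                            salary INTEGER, manager_id INTEGER);
    INSERT INTO employees VALUES
        (1, 'Ann',  'eng',   200, NULL),
        (2, 'Bob',  'eng',   150, 1),
        (3, 'Cara', 'eng',   150, 1),
        (4, 'Dan',  'sales', 120, 1),
        (5, 'Eve',  'sales',  90, 4);
""")

# Self join: pair each employee with their manager.
# LEFT JOIN keeps Ann, who has no manager; an INNER JOIN would drop her row.
self_join = conn.execute("""
    SELECT e.name, m.name
    FROM employees AS e
    LEFT JOIN employees AS m ON e.manager_id = m.id
    ORDER BY e.id
""").fetchall()

# Window function: rank salaries within each department without collapsing rows.
# RANK() gives tied salaries the same rank.
ranked = conn.execute("""
    SELECT name, dept, salary,
           RANK() OVER (PARTITION BY dept ORDER BY salary DESC) AS salary_rank
    FROM employees
    ORDER BY dept, salary_rank, name
""").fetchall()

for row in ranked:
    print(row)
```

A GROUP BY would return one row per department, while the window function keeps every employee row and adds the rank next to it. That difference is a common interview follow-up.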
Programming
 Java garbage collection
 Python pass-by-object-reference
 Python GIL, Fluent Python, chapter 17
 Python multithread
 Python concurrency, Fluent Python, chapter 18
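
A quick way to check your understanding of pass-by-object-reference is the sketch below. The function names are made up for illustration. The point is that mutating an argument is visible to the caller, while rebinding the parameter name is not.

```python
def append_item(items):
    # Mutates the object the caller passed in, so the caller sees the change.
    items.append(4)


def rebind(items):
    # Rebinds only the local name; the caller's list is untouched.
    items = [0]
    return items


numbers = [1, 2, 3]

append_item(numbers)
after_mutation = list(numbers)

rebind(numbers)
after_rebind = list(numbers)

print(after_mutation)  # [1, 2, 3, 4]
print(after_rebind)    # [1, 2, 3, 4]
```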
Statistics and probability
 The only cheatsheet that you'll ever need
 Learn Bayesian and practice problems in Bayesian
 Let A and B be events on the same sample space, with P (A) = 0.6 and P (B) = 0.7. Can these two events be disjoint?
 Given that Alice has 2 kids, at least one of which is a girl, what is the probability that both kids are girls? (credit swierdo)
 A group of 60 students is randomly split into 3 classes of equal size. All partitions are equally likely. Jack and Jill are two students belonging to that group. What is the probability that Jack and Jill will end up in the same class?
 Given an unfair coin with the probability of heads not equal to 0.5, what algorithm could you use to create a list of random 1s and 0s?
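
If you want to check your answers to the four questions above, here is one possible sketch. It assumes boys and girls are equally likely and independent, that coin flips are independent with a fixed bias, and that the coin question wants fair bits. The coin part uses von Neumann's trick; the 0.8 bias and the helper names are made up for illustration.

```python
import random
from fractions import Fraction
from itertools import product

# Q1: disjoint events would need P(A) + P(B) = P(A or B) <= 1.
p_a, p_b = Fraction(6, 10), Fraction(7, 10)
can_be_disjoint = p_a + p_b <= 1  # 0.6 + 0.7 = 1.3 > 1, so they cannot be

# Q2: enumerate the four equally likely (older, younger) outcomes
# and condition on "at least one girl".
families = list(product("BG", repeat=2))
at_least_one_girl = [f for f in families if "G" in f]
both_girls = [f for f in at_least_one_girl if f == ("G", "G")]
p_both_girls = Fraction(len(both_girls), len(at_least_one_girl))

# Q3: fix Jack's class. Of the 59 remaining seats, 19 are in his class.
p_same_class = Fraction(19, 59)


# Q4: von Neumann's trick. Flip twice; HT and TH both have probability
# p * (1 - p), so keeping the first flip of an unequal pair gives a fair bit.
def fair_bits(flip, n):
    bits = []
    while len(bits) < n:
        a, b = flip(), flip()
        if a != b:
            bits.append(a)
    return bits


rng = random.Random(0)


def biased_flip():
    # Hypothetical unfair coin: returns 1 with probability 0.8.
    return 1 if rng.random() < 0.8 else 0


sample = fair_bits(biased_flip, 10_000)

print(can_be_disjoint, p_both_girls, p_same_class)
print(sum(sample) / len(sample))  # should be close to 0.5
```

The trick discards equal pairs, so the more biased the coin is, the more flips it needs per output bit.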
Big data
 Spark architecture and Spark lessons learned (outdated since Spark 3.0 release)
 Spark OOM
 Cassandra best practice and here
ML fundamentals
 Collinearity and read more
 Features scaling
 Random forest vs GBDT
 SMOTE synthetic minority oversampling technique
 Compare discriminative vs generative model and extra read
 Logistic regression. Try to implement logistic regression from scratch (sample code from martinpella). Bonus points for a vectorized version in numpy completed within 20 minutes. Follow up with a MapReduce version.
 Quantile regression
 L1/L2 intuition
 Decision tree and Random Forest fundamentals
 Explain boosting
 Least Square as Maximum Likelihood Estimator
 Maximum Likelihood Estimator introduction
 K-means. Try to implement K-means from scratch (sample code from flothesof.github.io). Bonus points for a vectorized version in numpy completed within 20 minutes. Follow up with the worst-case time complexity and an improved initialization.
 Fundamentals about PCA
 I didn’t use flashcards, but I’m sure they help to a certain extent.
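
For the two "from scratch" exercises in the list above, here is a minimal vectorized numpy sketch to compare your own attempt against. It is not the linked sample code. The learning rate, iteration counts, random-point initialization, and toy data are all assumptions chosen for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, y, lr=0.1, n_iters=2000):
    """Batch gradient descent on mean log loss. X is (n, d), y is (n,) in {0, 1}."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iters):
        grad = Xb.T @ (sigmoid(Xb @ w) - y) / len(y)  # gradient of mean log loss
        w -= lr * grad
    return w


def predict_logistic(X, w):
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    return (sigmoid(Xb @ w) >= 0.5).astype(int)


def kmeans(X, k, n_iters=100, seed=0):
    """Lloyd's algorithm with k random data points as initial centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # (n, k) matrix of squared distances, computed with broadcasting.
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Keep the old centroid if a cluster ends up empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels


# Toy data: two well-separated Gaussian blobs around (-3, -3) and (3, 3).
data_rng = np.random.default_rng(0)
X = np.vstack([data_rng.normal(-3, 1, (100, 2)), data_rng.normal(3, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

w = fit_logistic(X, y)
accuracy = (predict_logistic(X, w) == y).mean()
centroids, labels = kmeans(X, k=2)

print(accuracy)
print(centroids)
```

For the follow-ups: the logistic regression gradient is a sum over examples, so each shard can compute a partial sum that is then added up, which is the core of a MapReduce version. Each K-means iteration here costs O(nkd) time, the number of iterations can be superpolynomial in the worst case, and k-means++ is a common improvement over random initialization.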
AB testing
DL fundamentals
 The deep learning book. Read Part II.
 Machine Learning Yearning. Read from section 5 to section 27.
 Neural network and backpropagation
 Activation functions
 Loss and optimization
 Convolutional Neural Network notes
 Recurrent Neural Networks
ML system design
ML classic paper
 Technical debt in ML
 Rules of ML
 An Opinionated Guide to ML Research. There is valuable advice in the Personal development section at the bottom.
ML productions
Food delivery
 Uber eats trip optimization
 Uber food discovery
 Personalized store feed
 Doordash dispatch optimization
Fraud detection (TBD)
Adtech
 Ad click prediction trend
 Ad Clicks CTR
 Delayed feedbacks
 Entity embedding
 Star space, embedding all the things
 Twitter timeline ranking
Recommendations:
 Instagram explore
 TikTok recommendation
 Deep Neural Networks for YouTube Recommendations
 Wide & Deep Learning for Recommender Systems
Testimonials
 V, Amazon L5 DS
I really found the quizzes very helpful for testing my ML understanding. Also, the resources shared helped me a lot for revising concepts for my interview preparation. This course will definitely help engineers crack Machine Learning Engineering and Data Science interviews.
 K, Facebook MLE
I really like what you’ve built, it’ll help a lot of engineers.
 D, NVIDIA DS
I have been using your github repo to prep for my interviews and got an offer with NVIDIA with their data science team. Thanks again for your help!
 A, Booking
Woow this is very useful summaries, so nice.
 H, Microsoft
That’s incredible!
 V, Intel
The repo is extremely cohesive! Thanks again.
Intro

This repo is based on REAL interview questions from big companies, and the study materials come from legitimate experts, e.g., Andrew Ng and Yoshua Bengio.

I have 6 YOE in Machine Learning and have interviewed with more than a dozen big companies. This is the minimum viable study plan that covers all actual interview questions from Facebook, Amazon, Apple, Google, MS, SnapChat, Linkedin etc.

If you’re interested in learning more about the paid ML system design course, click here. This course will provide 6-7 practical use cases with proven solutions. After this course, you will be able to solve new problems with a systematic approach.
Acknowledgements and contributing

Thanks for the early feedback and contributions from Vivian, aragorn87 and others. You can create an Issue or Pull Request on this repo. You can also help by upvoting on ProductHunt.

If you find this helpful, you can Sponsor this project. It’s cool if you don’t.

Thanks to this community, we have donated about $200 to HopeForPaws. If you want to support, you can contribute too on their website.