ML Interview Preparation Guide (Draft)

ML Interview Guide
Author

Aayush Agrawal

Published

August 24, 2024

A collection of resources for preparing for MLE interviews at Meta or other big tech companies.

Not long ago, I transitioned from a Senior ML Scientist role at Microsoft to a Machine Learning Engineer position at Meta, and the journey was anything but quick. The preparation process was extensive, especially since it was my first experience with LeetCode-style coding interviews and ML system design interviews. While there are many resources available for preparation, I’ll be sharing the ones that helped me navigate and succeed in this challenging process.

I will try to cover the following -

  1. Interview Process Overview: Break down the entire interview process, from initial screenings to the final onsite interviews, and share tips on what to expect at each stage.

  2. Coding Interviews: Dive into the LeetCode-style coding interviews, how I prepared for them, and the strategies that worked best for me.

  3. ML System Design Interviews: Explore the ML system design interviews, offering insights into the key concepts you need to know, how to approach open-ended design problems, and the resources that helped me build a strong foundation.

  4. Behavioral Interviews: Finally, I’ll talk about the behavioral interviews, and how I prepared to effectively communicate my past work and problem-solving approach.

1 Interview Process Overview

The interview process typically starts with a screening round, followed by a more extensive onsite round.

For the phone screen, the format varies depending on the level you’re interviewing for:

  • E4/E5:
    • 5 mins: Introduction
    • 35 mins: Two LeetCode coding problems (Easy/Medium)
    • 5 mins: Questions for interviewer
  • E6:
    • 2 mins: Introduction
    • 25 mins: One or two LeetCode coding problems
    • 15 mins: Behavioral interview
    • 3 mins: Questions for interviewer

Once you clear the phone screen, you will be invited to the onsite interviews, which include coding, ML system design, and behavioral rounds. Again, the composition varies depending on the level you’re interviewing for:

  • E4/E5:
    • 2 coding rounds
    • 1 ML system design
    • 1 Behavioral
  • E6:
    • 2 coding rounds
    • 2 ML system design
    • 1 Behavioral

The level of preparation required for both phone interviews and onsite rounds is similar.

Strong performance in the coding rounds is the bare minimum needed to pass the loop; leveling is decided by how well you do in the system design and behavioral rounds.

Next, we’ll dive deeper into each type of interview and explore how to best prepare for them.

2 Coding Interviews

Preparing for this was the most time-consuming part for me, especially since I don’t come from a CS background. Although I worked as an ML Scientist at Microsoft, I had limited exposure to these types of problems in the real world. However, mastering these concepts is essential for entry in high-tech software/ML roles, so it’s important to invest time in thorough preparation.

Coding round interviews unsurprisingly focus heavily on coding. A typical interview structure looks like the following -

  • 5 mins: Introduction
  • 35 mins: Two LeetCode coding problems
  • 5 mins: Questions for interviewer

You are mostly expected to code in a plain-text editor with code execution disabled. To get a more realistic idea of what a coding interview environment looks like, watch this mock interview by interviewing.io -

Python interview with an interviewing.io engineer: Print k largest elements

2.1 A structured approach to solving coding problems in interviews

When tackling a coding problem, following this structured approach can be very helpful:

  1. Ask Clarifying Questions (~3 mins): When the problem is presented, read it aloud to ensure you fully understand the requirements before jumping to a solution. Ask follow-up questions to clarify any ambiguities. This might involve discussing test cases, considering edge cases, and understanding the expected input range or type. For example, think about how the solution should handle null inputs or extreme values. Ideally, you align with your interviewer by writing out a few test cases and their expected outputs.

  2. Plan Your Approach (~5 mins): Outline your solution strategy and explain it to your interviewer while typing it out in the shared text window. Break the problem into smaller parts if possible, decide on the most appropriate algorithm or data structure, and discuss any trade-offs you are making. Write down the expected time and space complexity of the solution you are proposing. Once your interviewer agrees with your approach, ask for permission to code it out.

  3. Write the Code (~5 mins): Implement your solution, keeping your code clean and well-organized. As you code, ensure that you handle edge cases. Make sure to name your functions, classes and variables appropriately so anybody reading your code can follow.

  4. Pseudo-run Your Solution (~2 mins): Manually run your code against various test cases, including both typical and edge cases, while explaining it to your interviewer to ensure it behaves as expected. This helps you find potential bugs and gives you an opportunity to correct them before your interviewer points them out.

  5. Close (~2 mins): Explain the time and space complexity of the solution and answer any follow-up questions your interviewer might have.

By following these steps, you can effectively navigate coding problems and demonstrate a clear, methodical problem-solving approach.
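
To make the steps concrete, here is a minimal sketch of how they might look on the problem from the mock interview mentioned earlier (finding the k largest elements of a list). The function name `k_largest`, the edge-case behavior, and the choice of a min-heap are illustrative assumptions, not a reference solution from the video or from any interviewer.

```python
import heapq


def k_largest(nums: list[int], k: int) -> list[int]:
    """Return the k largest values in nums, largest first.

    Step 1 (clarify) - behavior assumed to be agreed with the interviewer:
      * k <= 0 or an empty list returns []
      * k larger than len(nums) returns everything, sorted descending
      * duplicates count separately: ([5, 5, 1], 2) -> [5, 5]

    Step 2 (plan) - keep a min-heap holding the k largest values seen so far.
      Time: O(n log k), space: O(k).
    """
    if k <= 0 or not nums:
        return []

    heap: list[int] = []  # min-heap; heap[0] is the smallest of the kept values
    for value in nums:
        if len(heap) < k:
            heapq.heappush(heap, value)
        elif value > heap[0]:
            # The new value beats the weakest kept value, so swap it in.
            heapq.heapreplace(heap, value)

    return sorted(heap, reverse=True)


# Step 4 (dry run) - the test cases written down during step 1.
print(k_largest([3, 1, 5, 12, 2, 11], 3))  # [12, 11, 5]
print(k_largest([5, 5, 1], 2))             # [5, 5]
print(k_largest([], 2))                    # []
print(k_largest([4, 7], 5))                # [7, 4]
```

Writing the assumptions and complexity as comments before the code mirrors steps 1 and 2, and the test cases at the bottom are the ones you would walk through by hand in step 4, since execution is usually disabled.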

Tip
  1. Keep your introduction brief (~30 seconds) to have more time for solving the problem. For example - “Hey, I am [Your Name], and I currently work as [Title] at [Employer Name]. I have been working here for the past [N] years. I’m now seeking new opportunities, which brings me here today.”
  2. If you’re running out of time, it’s acceptable to manually walk through one or two test cases with your interviewer. You can then suggest moving on to the next question to ensure you cover everything within the allotted time.
  3. It’s okay to ask your interviewer for help if you are stuck on a problem.

As you can see from the above, coding rounds are really fast-paced and you need to be well prepared to get through them. That brings us to preparation.

2.2 How to prepare for coding interviews

Here is a simple guide on how to prepare -

  1. Purchase a LeetCode subscription - This website is the only paid resource you need to prepare for coding interviews.

  2. Getting started with LeetCode Learn - If, like me, you don’t have a CS degree, then going through the LeetCode Learn cards is a good starting point. Here is the structure I followed -

    • Array
    • Linked List
    • Stack & Queue
    • Array & Strings
    • Binary Tree
    • Binary Search
    • Binary Search Tree
    • Heap
    • Graph
    • Sorting
    • Dynamic Programming
  3. Following the Neetcode.io Roadmap - This roadmap contains 75 LeetCode questions that will familiarize you with common coding patterns useful in coding interviews.

  4. Solving company-tagged questions - On leetcode.com you can filter by the company you are interviewing with and see its top tagged questions. I would recommend solving the top 100 tagged questions, sorted by frequency, that were asked in the last six months.

  5. Pay attention to design questions - While I recommend focusing on the top 100 questions tagged by the company of interest, interviews often place special emphasis on design-related questions. In these cases, you may be asked to design a class to solve a specific use case. I would go beyond the top 100 questions to find every relevant design question asked in the past year. Here are some top design questions tagged for Meta on LeetCode at the time of writing -

  6. Practice - Once you are done with the above, you can practice timed assessments on leetcode.com for your specific employer, or generic ones if it is not listed. If you want more realistic practice, you can buy mock interviews on interviewing.io, where an engineer from a top tech company will conduct your mock interview and provide feedback on your performance.

Tip
  1. The more you practice in conditions similar to the actual interview—such as using a text editor and working within timed constraints—the better you will perform
  2. When practicing, attempt to solve the problem on your own for 20-30 minutes before consulting the solution.
  3. If you find a problem challenging to understand, search for [LeetCode Problem #XYZ] on YouTube; you’ll likely find a video with a clearer explanation.
  4. Keeping an Excel sheet to track the problems you’ve solved during practice, along with notes such as ‘needs revision’, time/space complexity and a brief summary of the solution, can be very helpful for review later. You can use this Coding Tracking Sheet Tab as a template.
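
Step 5 above mentions design questions, where you are asked to build a class for a specific use case. As an illustration only, here is a minimal sketch of one well-known question of this kind, an LRU cache. It is an example of the pattern, not necessarily one of the Meta-tagged questions referred to above, and the class and method names are assumptions.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        # Keys are kept in usage order: least recently used first, most recent last.
        self._items = OrderedDict()

    def get(self, key: int) -> int:
        """Return the value for key, or -1 if it is absent. O(1)."""
        if key not in self._items:
            return -1
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: int, value: int) -> None:
        """Insert or update key, evicting the oldest entry if over capacity. O(1)."""
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # drop the least recently used key


cache = LRUCache(2)
cache.put(1, 10)
cache.put(2, 20)
print(cache.get(1))  # 10 (key 1 is now the most recently used)
cache.put(3, 30)     # over capacity, so key 2 is evicted
print(cache.get(2))  # -1
print(cache.get(3))  # 30
```

An interviewer may want the underlying hash map plus doubly linked list written out by hand instead of relying on `OrderedDict`, so it is worth asking about that during the clarifying-questions step.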

3 Machine Learning System Design

The Machine Learning System Design interview typically lasts 45 minutes and assesses your ability to solve an abstract ML problem from start to finish. Depending on the level you’re aiming for, you may encounter 1-2 rounds of these interviews. Excalidraw is typically used as the platform, but in this round you are expected to write and talk through the problem rather than draw block diagrams.

3.1 A structured approach to solving machine learning system design problems in interviews

In an ML System Design interview, it’s crucial for the interviewee to take the lead in the discussion and ensure all aspects of the ML design are covered. The interviews are fast-paced, and following a structured method can be highly beneficial:

  1. Clarifying Requirements: In every ML system design interview, you’re typically presented with an abstract problem. For example, you might be asked to “Design a ‘People You May Follow’ recommendation system for Threads or Twitter.” It’s essential to clarify the scope of the problem, ensuring it can be managed within the 45-minute timeframe. Asking the right clarifying questions not only helps you gain clarity but also demonstrates your product awareness. For instance, in the case of a “People You May Follow” recommendation system, you might ask:

    • “Can I assume the purpose of this feature is to help users find influencers or people aligned with their interests?”
    • “On Threads/Twitter, following is unidirectional—one user can follow another without reciprocation. Is that correct?”
    • “What is the estimated total number of users on the platform?”
    • “What is the current count of daily active users (DAUs)?”
    • “What’s the average number of people each user follows?”
    • “Since this is a mature platform, can we assume that the user-follow graph is relatively stable and doesn’t change drastically over short periods?”

    Once you’ve clarified the requirements, it’s important to summarize them back to the interviewer, for example: “Okay, we are designing a ‘People You May Follow’ recommendation system for the Threads/Twitter platform. We have around XYZ million daily active users, and we’re assuming that the followership graph remains relatively stable over time.” This helps ensure alignment before diving into the design process.

  2. Frame the problem as an ML task: Once you’ve clarified the requirements, the next step is to map the problem to a known ML task. This could be something like binary classification, learning to rank, edge prediction, or even visual object detection. Often, a problem can be framed using multiple objectives, and it’s important to explain what each objective aims to achieve and the pros and cons of each approach. For example, in the case of a recommendation system, you could frame it as a learning-to-rank problem or as edge prediction in a user graph. The key is to pick the objective that best aligns with the problem at hand.

  3. Data sources for training labels: Next, brainstorm potential data sources for defining training labels. There could be various options, each with its own trade-offs. For instance, if you’re working on a video recommendation system, you might need to decide between implicit and explicit feedback. Explicit feedback (such as likes, shares, or subscriptions) tends to be high quality but sparse, as not all users provide these signals. On the other hand, implicit feedback (like watch time or time spent) is lower quality but available for every user interaction. It’s essential to highlight these trade-offs and choose the best data source for your model’s needs.

  4. Data prep and Feature Engineering -

  5. Loss function-

  6. Evaluation- Online / Offline

  7. Serving-
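
The trade-off described in step 3 between explicit and implicit feedback can be sketched in code. The snippet below is a minimal illustration for the video recommendation example: the `Interaction` fields, the 50% watch-fraction threshold, and the sample weights are all hypothetical values chosen for illustration, not taken from any real system.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    """One (user, video) impression from a hypothetical interaction log."""
    user_id: int
    video_id: int
    watch_seconds: float
    video_seconds: float
    liked: bool = False
    shared: bool = False


def make_label(event: Interaction, min_watch_fraction: float = 0.5) -> tuple[int, float]:
    """Return (label, sample_weight) for a binary 'user engaged' objective.

    Explicit signals (like/share) are sparse but trustworthy, so they get full weight.
    Implicit signals (watch fraction) exist for every impression but are noisier,
    so they get a lower weight. The threshold and weights are assumptions.
    """
    if event.liked or event.shared:
        return 1, 1.0

    if event.video_seconds > 0:
        watch_fraction = event.watch_seconds / event.video_seconds
    else:
        watch_fraction = 0.0

    if watch_fraction >= min_watch_fraction:
        return 1, 0.5
    return 0, 0.5


log = [
    Interaction(1, 101, watch_seconds=5, video_seconds=60, liked=True),
    Interaction(1, 102, watch_seconds=45, video_seconds=60),
    Interaction(2, 101, watch_seconds=3, video_seconds=60),
]
for event in log:
    print(event.user_id, event.video_id, make_label(event))
# 1 101 (1, 1.0)
# 1 102 (1, 0.5)
# 2 101 (0, 0.5)
```

In an interview, the point is less the exact numbers and more stating out loud why each signal is trusted to a different degree and how that choice feeds into the loss function.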

3.2 How to prepare for ML system design interviews

For ML System Design, I recommend ByteByteGo’s Machine Learning System Design Interview as the only resource to read apart from company blogs.

4 Behavioral Interviews

5 Conclusion

Preparing for the coding interviews was the most time-consuming part for me, especially since I don’t come from a CS background. Even though I was working as an ML scientist at Microsoft, I had not encountered these types of problems in the real world. Nonetheless, mastering these concepts is essential for high-tech roles, so it’s worth investing the time to prepare thoroughly.

I hope you enjoyed reading it. If there is any feedback on the code or just the blog post, feel free to comment below or reach out on LinkedIn.