276°
Posted 20 hours ago

Machine Learning System Design Interview

£16.15 (was £32.30), Clearance
Shared by
ZTS2023
Joined in 2023

About this deal

Alexey: But where does system design actually come into the picture here? So far, we talked about selecting the right metric, which was the important thing, as you said. You said it was log loss for this specific case. Or even before log loss, I think it was expected calibration error. (24:28)

Within each data source, you can iterate on the types of features available. It’s good to call out some specific example features, but it would take too long to be exhaustive about these. E.g. for a Facebook user you have features like:

Here’s where you discuss the actual modelling techniques you can use for your various components. As with the rest of the design, we’re moving from higher levels of abstraction to more specific ones. We’ve already described the data inputs to our components; now we want to break down higher-level components into lower-level ML model types:

Valerii: Do we need to introduce some weights? Okay, good. What data will we use? Is it the amount of the transaction? Is it just the history of the user? How fast will we update them? Now let's say we have a model. How can we assume that the model is better than the previous one? Of course, we have some offline metrics. We have expected calibration error, weighted expected calibration error, precision. No, we don't have precision, forget about that. It's a bad metric because it's class-balance sensitive. We have specificity. We have recall. What now? (16:43)

For recommendation systems, nearest neighbours can be very useful, especially if you’ve embedded your candidates into a lower-dimensional space where distance represents similarity. For candidate generation, you often want to select the k closest items in a catalog. How can you do that without evaluating every single item? You should understand LSH (locality-sensitive hashing) and have general knowledge of open source solutions like Spotify’s Annoy and Facebook’s Faiss. This Google Cloud article is helpful.
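The candidate-generation idea above (finding close items without scoring the whole catalog) can be illustrated with a minimal sketch of random-hyperplane LSH for cosine similarity. This is a toy, pure-Python version under stated assumptions, not how Annoy or Faiss work internally; the function names, the number of hyperplanes, and the single-hash-table design are all hypothetical choices made for illustration.

```python
import math
import random


def random_hyperplanes(num_planes, dim, seed=0):
    """Draw random hyperplane normals; each one contributes one hash bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)]


def lsh_signature(vector, planes):
    """Hash a vector to a bit tuple: which side of each hyperplane it falls on."""
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vector)) >= 0 else 0
        for plane in planes
    )


def build_index(items, planes):
    """Group item ids into buckets keyed by their LSH signature."""
    buckets = {}
    for item_id, vector in items.items():
        buckets.setdefault(lsh_signature(vector, planes), []).append(item_id)
    return buckets


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def query(vector, items, buckets, planes, k):
    """Score only the items sharing the query's bucket, then keep the top k."""
    candidates = buckets.get(lsh_signature(vector, planes), [])
    ranked = sorted(candidates, key=lambda i: cosine(vector, items[i]), reverse=True)
    return ranked[:k]


# Hypothetical 3-dimensional item embeddings.
items = {"a": [1.0, 0.0, 0.0], "b": [-1.0, 0.0, 0.0], "c": [2.0, 0.0, 0.0]}
planes = random_hyperplanes(num_planes=4, dim=3)
buckets = build_index(items, planes)
print(query([1.0, 0.0, 0.0], items, buckets, planes, k=2))
```

The trade-off to mention in an interview: more hyperplanes give smaller buckets (faster queries) but a higher chance that a true neighbour lands in a different bucket, which is why practical systems usually combine several hash tables or trees.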

Valerii: What else? Should we take a look into other metrics? Probably, yes. But we know that fraud is very class-balance skewed. We know that class imbalance is extremely high there. We also know that it might change. So that means that if we would like to take a look into the metrics, these metrics have to be class-balance insensitive, probably. Because otherwise, yes, class balance changed, metrics change, but the model’s the same. Okay, so what are the most favorite metrics? Is it precision and recall? Recall is class-balance insensitive, while precision is class-balance sensitive. So, forget about precision. Can we replace precision with something? Why not specificity? Also not that. Okay, something else? Maybe. We know that there are some thresholds of expected fraud level, which we can just go with and then we can. (16:43)

Two of the crucial signals you need to provide at this interview are the ability to think of useful data to feed into your models and your knowledge about transforming raw signals into usable numeric features for your models. Here’s a hint: this is probably something you can think about ahead of time for your interview. For the company you’re interviewing at, think about the useful data sources and features you could use. At the same time, many models have thousands of inputs, so you can’t spend the whole interview cycling through this. You can split this up into a couple of layers of abstraction.

Data Sources

Alexey: Okay. So let's go to the questions. We have quite a few of them. The first question we have is, “What are the typical components of a machine learning system? And what percentage of it are machine learning algorithms?” (47:52)

It’s not always a good idea to throw the kitchen sink at your model. Discuss some techniques for feature importance ranking and selection. Bear in mind this is fairly high level and abstract since you don’t have the data in front of you.
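To make the earlier point about class-balance sensitivity concrete, here is a small arithmetic sketch. It assumes a fixed model whose recall (true positive rate) and false positive rate do not change, and only varies the fraud rate; the specific rates and the population size are made-up illustrative numbers, and the helper names are hypothetical.

```python
def confusion_counts(n_total, fraud_rate, recall, false_positive_rate):
    """Expected confusion-matrix counts for a fixed model at a given fraud rate."""
    positives = n_total * fraud_rate
    negatives = n_total - positives
    tp = recall * positives                 # frauds caught
    fn = positives - tp                     # frauds missed
    fp = false_positive_rate * negatives    # legitimate transactions flagged
    tn = negatives - fp                     # legitimate transactions passed
    return tp, fp, tn, fn


def precision_and_recall(tp, fp, tn, fn):
    """Precision depends on fp (driven by the number of negatives); recall does not."""
    return tp / (tp + fp), tp / (tp + fn)


# Same model (recall 0.90, false positive rate 0.01), three different fraud rates.
for fraud_rate in (0.10, 0.01, 0.001):
    counts = confusion_counts(1_000_000, fraud_rate, recall=0.90, false_positive_rate=0.01)
    precision, recall = precision_and_recall(*counts)
    print(f"fraud rate {fraud_rate:.3f}: precision={precision:.3f} recall={recall:.3f}")
```

Recall stays the same across all three rows because it is computed only over the fraudulent transactions, while precision drops as fraud becomes rarer because the false positives come from an ever larger pool of legitimate transactions.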
You can also discuss regularization when you start to talk about models.

Candidate Sources

Machine Learning Design Patterns: Solutions to Common Challenges in Data Preparation, Model Building, and MLOps

Go over questions for requirements gathering. It’s easy to forget key questions when you’re nervous! You’re worried that there might be biases in your ML systems and you want to make your systems responsible!

“Success” can be measured in numerous ways in machine learning system design. A successful machine learning system must gauge its performance by testing different scenarios. This can make a model’s design more innovative.
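Two of the offline metrics named in the discussion above, log loss and expected calibration error, can be sketched as follows. This is one common binned formulation of calibration error for a binary classifier, written as a plain-Python illustration; the equal-width bins, the default of 10 bins, and the function names are assumptions, not something specified in the conversation.

```python
import math


def log_loss(probs, labels, eps=1e-15):
    """Mean negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)


def expected_calibration_error(probs, labels, n_bins=10):
    """Weighted average gap between mean predicted probability and observed positive rate per bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes into the last bin
        bins[index].append((p, y))

    n_total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(p for p, _ in members) / len(members)
        positive_rate = sum(y for _, y in members) / len(members)
        ece += (len(members) / n_total) * abs(mean_prob - positive_rate)
    return ece


# Hypothetical predictions: the first set is confident and right, the second is confident and wrong.
print(expected_calibration_error([0.0, 0.0, 1.0, 1.0], [0, 0, 1, 1]))
print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [0, 0, 0, 0]))
print(log_loss([0.5, 0.5], [0, 1]))
```

A model can rank transactions well and still be poorly calibrated, which is why the two numbers are worth reporting separately.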

about the technology

Valerii: Yes. To approximate, “Can you move directly to your goal? Or can you approximate moving to your goal?” Also, the thing is that if a metric becomes your goal, over time it usually ceases to be a good metric. (43:17)

Alexey: [laughs] But I think for many people, it will be useful because for each pattern there, they talk about when exactly you need to apply it and how to apply it. They also talk about what kind of tools there are. And since this is a book from Google, there is a lot of focus on Google Cloud, but they also talk about open source solutions like Kubeflow, for example. (53:37)

Recommending what video to watch next: a multitask ranking system (2019) - YouTube’s multitask learner for ranking

For our recommender example, the ranking component can be built with an ML model. We can rank the candidates by their predicted outcome for the user. For example, based on our initial discussions, perhaps we’re trying to increase engagement by showing posts that increase user interactions. There are lots of ways to do this:

The tutorial approach has been tremendously successful in getting models off the ground. However, the

Overview of ML interview concepts and techniques

Valerii: Let's do a mental exercise. Let's imagine that you have a computer vision deep learning model. Very sophisticated, 175 layers. And it is a classification model. And on top of this model, you have what? You have a linear classifier. What does it mean? It means that, actually, this model classifies with a linear model. And all that is done before that is just representation learning: transforming the original features into features which can be fed to the linear model very successfully. See? Features. Just with this mental exercise, you can see that. So that's why you can take embeddings, put them in whatever model you would like to, and you have a proper output. (49:57)

Applying ML systems to real-world problems

Valerii: Exactly. Yes, like that. You could also make the same example of the fraud system. In this case, the system design question would be “Can you build a system which will handle 3 billion transactions per day and these transactions are coming from this?” So, you see? (24:04)

April 29th: I launched the mlengineer.io blog so you can get the latest machine learning interview experience.

Asda Great Deal

Free UK shipping. 15 day free returns.