Quiz: Building and evaluating ML models

1. You work at a car manufacturing company that is ready to deploy a machine learning model. However, you want to evaluate the model first and decide to evaluate your model with a small set of data. You cannot measure how accurate the model is on all the original training data because it could memorize all answers and perform badly after deployment. What is a reasonable percentage of the data to reserve when you are evaluating the accuracy of a machine learning model?

Answer: 10–25%
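As a rough illustration of reserving part of the data for evaluation, here is a minimal sketch of a holdout split in plain Python. The function name `holdout_split`, the 20% fraction, and the toy data are assumptions made for this example, not part of the quiz.

```python
import random


def holdout_split(examples, test_fraction=0.2, seed=0):
    """Shuffle the examples and reserve a fraction of them for evaluation.

    test_fraction=0.2 is an illustrative value inside the 10-25% range.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(examples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    # The first n_test shuffled items are held out; the rest are for training.
    return shuffled[n_test:], shuffled[:n_test]


# Toy data: 100 example ids standing in for real records.
train, test = holdout_split(range(100), test_fraction=0.2)
print(len(train), len(test))
```

The model is trained only on `train`, and its accuracy is then measured on `test`, which it has never seen.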


2. You work at a medical research facility that analyzes patient data for local hospitals. You want to use machine learning for specialized image recognition in order to identify bacterial infections in patients’ x-ray images. What is the preferred method of obtaining a labeled dataset for this custom image recognition use case?

Answer: Use AutoML Vision to classify the x-ray images.


3. You work in the customer retention team at a bank and have noticed an increase in customers leaving your service. To solve this problem, you use machine learning with an objective to improve customer retention at your bank by personalizing services and loans. What is the preferred optimization of your objective to improve customer experience and retention at your bank?

Answer: Provide offers based on customer spending behavior.


4. You are working on the data team at a global banking company. You are gathering a wide variety of labeled data from different departments and locations for future machine learning experiments. Before you can introduce the data to train the machine learning model, what do you need to do?

Answer: Prepare the data and store it in a single location.


5. You are a doctor at a small medical clinic studying the symptoms and effects of common health conditions. You want to use machine learning to predict which of your patients might have an increased probability of heart disease. However, you have a limited dataset due to having fewer patients than a full-sized hospital. What would be the preferred solution to identify patients with an increased probability of heart disease using machine learning?

Answer: Use existing data from a large nearby hospital as proxy data.


6. Machine learning projects consist of many different phases. However, a lot of useful information cannot be described in the phases alone, such as guidance on machine learning best practices. What is considered an example of good practice in machine learning?

Answer: Testing your machine learning projects with end users


7. You lead the marketing team for a startup accommodation booking website. You want to provide users with personalized accommodation recommendations, but lack sufficient historical labeled data of customer bookings to use as an exclusive data source. Instead, you and your team have only been using user clicks and accommodation viewings as a proxy for your entire dataset. What is the issue of only using user clicks and accommodation viewings as your dataset that might lead to few converted bookings?

Answer: Customers might browse accommodations while having no intention of booking them


8. You work at a mobile phone manufacturer and are preparing to launch the newest version of your high-end phone. You want to analyze the battery efficiency of your new phone against previous models. You have a backlog of historical data on previous models and their results, but these datasets exist in silos separate from the data for your new phone. How can you acquire a labeled dataset in this scenario when datasets exist in separate silos?

Answer: Use a data warehouse to join the datasets into one source.
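To make the idea of combining siloed datasets concrete, here is a small sketch that uses an in-memory SQLite database as a stand-in for a data warehouse. The table names, columns, and values are all hypothetical and chosen only for illustration; a real warehouse would hold far more data and use its own SQL dialect.

```python
import sqlite3

# In-memory SQLite database standing in for a data warehouse (assumption).
conn = sqlite3.connect(":memory:")

# Two hypothetical silos: historical models and the new phone.
conn.executescript("""
CREATE TABLE legacy_battery (model TEXT, capacity_mah INTEGER, hours REAL);
CREATE TABLE new_battery (model TEXT, capacity_mah INTEGER, hours REAL);
INSERT INTO legacy_battery VALUES ('Model A', 3000, 18.0), ('Model B', 3500, 21.5);
INSERT INTO new_battery VALUES ('Model C', 4000, 26.0);
""")

# Combine both silos into one result set, tagging each row with its origin.
# Here the measured battery life in hours plays the role of the label.
rows = conn.execute("""
SELECT model, capacity_mah, hours, 'legacy' AS source FROM legacy_battery
UNION ALL
SELECT model, capacity_mah, hours, 'new' AS source FROM new_battery
ORDER BY model
""").fetchall()

for row in rows:
    print(row)
```

Once the rows live in one place with a shared schema, they can be used together as a single labeled dataset.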


9. Input data in a machine learning model is often made of three parts, which are: the features of an example, the resulting label, and the label type. How can the ‘features’ be defined in a machine learning context?

Answer: The characteristics that give meaning to a piece of data


10. You are the communications manager at a marketing company. Recently, you noticed an increase in spam marketing emails disguised as popular brand emails that you want to filter out of your inbox. You want to use machine learning to predict which emails are spam and should be filtered. What are some possible features in this machine learning use case to detect deceptive spam emails?

Answer: Word sequence that is closer to those found in spam emails versus safe ones
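One simple way to turn "word sequences closer to spam than to safe emails" into numeric features is to measure how many of an email's word pairs (bigrams) also appear in known spam or known safe emails. The sketch below assumes this bigram-overlap approach; the helper names and the sample emails are invented for the example, and real spam filters use much richer features.

```python
def bigrams(text):
    """Return the set of adjacent word pairs in the text, lowercased."""
    words = text.lower().split()
    return set(zip(words, words[1:]))


def overlap_score(email, reference_emails):
    """Fraction of the email's bigrams that also occur in the reference emails."""
    email_bigrams = bigrams(email)
    if not email_bigrams:
        return 0.0
    reference = set()
    for ref in reference_emails:
        reference |= bigrams(ref)
    return len(email_bigrams & reference) / len(email_bigrams)


# Hypothetical labeled examples, made up for this sketch.
spam_examples = ["claim your free prize now", "you won a free prize"]
safe_examples = ["meeting moved to friday", "please review the attached report"]

email = "claim your free prize today"
features = {
    "spam_bigram_overlap": overlap_score(email, spam_examples),
    "safe_bigram_overlap": overlap_score(email, safe_examples),
}
print(features)
```

A high spam overlap combined with a low safe overlap suggests the email's wording resembles spam, and a classifier could take both values as input features.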

