Artificial Intelligence is growing rapidly and is being used by companies across the globe. According to a report from the World Economic Forum, this growth could create 58 million net new jobs by 2022.
With this net positive job growth, companies are expected to hire more AI specialists, and Uber is doing exactly that. Uber intends to use AI to predict market demand, find optimal routes for cab services, keep its routing up to date with changing traffic conditions, and much more.
Uber India, in a statement put out in 2018, announced that it would double its engineering headcount from 500 to over 1,000 technologists in its Bangalore and Hyderabad offices. These roles will mainly involve ML and AI experts. So if you would love to work at Uber but have no clue what to expect from the interview process, we’ve got you covered.
Here are 11 AI interview questions that will help you prepare for your dream interview:
What is cross-validation?
Cross-validation is a technique for assessing how the results of a statistical analysis generalize to an independent data set. It is also known as rotation estimation.
Cross-validation is mainly used when the goal is prediction and you need to estimate how accurately a predictive model will perform in practice. The prime reason for using it is that there is often not enough data to partition into separate training and test sets, as conventional validation does, without a significant loss of modeling or testing capability. Cross-validation instead splits the data into several folds, trains on all but one fold, tests on the held-out fold, and averages the results across folds.
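Here is a minimal sketch of k-fold cross-validation using only the Python standard library. The helper names (`k_fold_splits`, `cross_validate`) and the toy "predict the training mean" model are assumptions made for illustration; in an interview you would more likely reach for a library routine.

```python
import random
from statistics import mean

def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

def cross_validate(xs, ys, fit, predict, k=5):
    """Return the mean squared error averaged over k held-out folds."""
    fold_errors = []
    for train, test in k_fold_splits(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        errors = [(predict(model, xs[i]) - ys[i]) ** 2 for i in test]
        fold_errors.append(mean(errors))
    return mean(fold_errors)

# Toy model (hypothetical): always predict the mean of the training targets.
def fit_mean(train_x, train_y):
    return mean(train_y)

def predict_mean(model, x):
    return model

xs = list(range(20))
ys = [2.0 * x + 1.0 for x in xs]
score = cross_validate(xs, ys, fit_mean, predict_mean, k=5)
print(f"average held-out MSE: {score:.2f}")
```

Every sample is used for testing exactly once and for training k - 1 times, which is what makes the estimate useful when data is scarce.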
How do you use A/B Testing?
A/B Testing is a marketing experiment wherein you split your audience to test a number of campaign variations in order to determine which one performs better.
It helps marketers observe how one piece of content performs alongside another. The process of A/B Testing is explained below:
Before A/B Testing:
a. Pick one variable to test
b. Identify your goal
c. Create a ‘Control’ and a ‘Challenger’
d. Split your sample groups equally and randomly
e. Determine your sample size, if applicable
f. Decide how significant your results should be
g. Make sure you’re running only one test at a time on any campaign
During A/B Testing:
h. Use an A/B testing tool
i. Test both variations simultaneously
j. Give the A/B test enough time to produce significant data
k. Ask for feedback from real users
After A/B Testing:
l. Focus on your goal metric
m. Measure the significance of your results using an A/B testing calculator
n. Take action based on your results
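The "measure the significance" step above is often done with a two-proportion z-test. The sketch below is one common way to do it, not the only one; the function name and the conversion counts are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both variations convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical results: control (A) vs challenger (B), 2000 users each.
z, p_value = two_proportion_z_test(conv_a=200, n_a=2000, conv_b=260, n_b=2000)
significant = p_value < 0.05  # significance level chosen before the test
print(f"z = {z:.2f}, p = {p_value:.4f}, significant: {significant}")
```

The 0.05 threshold corresponds to step f: it should be decided before the test runs, not after looking at the data.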
Describe Binary Classification.
Statistical binary classification is a machine learning task. It is a type of supervised learning in which the categories are predefined and a model learns from labeled examples to assign new observations to those categories. When only two categories are present, the problem is known as statistical binary classification.
Some of the methods commonly used are:
a. Random forests
b. Decision Trees
c. Bayesian Networks
d. Support vector machines
e. Neural networks
f. Logistic regression
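As an illustration of the last method in the list, here is a minimal sketch of logistic regression on a single feature, trained with plain gradient descent. The data set is a toy one (a feature already centred on zero) and the function names are assumptions for this example.

```python
from math import exp

def sigmoid(t):
    return 1.0 / (1.0 + exp(-t))

def train_logistic(xs, ys, lr=0.1, epochs=500):
    """Fit weight w and bias b by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(w, b, x, threshold=0.5):
    """Assign class 1 if the predicted probability reaches the threshold."""
    return 1 if sigmoid(w * x + b) >= threshold else 0

# Toy labeled data: negative feature values are class 0, positive are class 1.
xs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = train_logistic(xs, ys)
predictions = [predict(w, b, x) for x in xs]
print(predictions)
```

The model outputs a probability, and the threshold turns that probability into one of the two predefined categories.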
How does Uber’s surge pricing algorithm work?
Surge pricing is the practice of charging customers a higher price when the demand for rides is higher than the supply of cars. The pricing algorithm detects situations of high demand and low supply, and depending on the shortage, it hikes the price.
Let’s assume that the agents are riders and the items are the rides being assigned to them. The ideal scenario is when the number of agents and the number of items are equal, but this is rarely the case. If there are x agents and x items, they can easily be mapped to one another, but usually the two numbers differ. Consider an example with x agents and only x - y items. In that case, the surge pricing algorithm increases prices to motivate more Uber drivers to get on the road, bringing the number of items back in line with the number of agents.
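To make the idea concrete, here is a deliberately simplified sketch. It is not Uber's actual algorithm, which is not public in this form; the function name, the ratio-based rule, and the cap of 3.0 are all assumptions chosen only to show how a price multiplier could grow with the shortage.

```python
def surge_multiplier(riders, drivers, cap=3.0):
    """Hypothetical rule: scale the fare by the demand-to-supply ratio, capped."""
    if riders <= 0:
        return 1.0  # no demand, no surge
    if drivers <= 0:
        return cap  # demand with no supply: apply the maximum multiplier
    ratio = riders / drivers
    # Never go below the base fare, never above the cap.
    return round(min(max(1.0, ratio), cap), 2)

print(surge_multiplier(riders=10, drivers=10))  # balanced market
print(surge_multiplier(riders=15, drivers=10))  # x agents, x - y items
```

In a real system the multiplier would also feed back into supply and demand, since higher prices attract drivers and discourage some riders.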
What is the meaning of P-Value?
The p-value is the level of marginal significance within a statistical hypothesis test. It represents the probability of observing a result at least as extreme as the one actually observed, assuming the null hypothesis is true. The p-value is used as an alternative to rejection points because it gives the smallest level of significance at which the null hypothesis would be rejected. A smaller p-value signifies stronger evidence in favour of the alternative hypothesis.
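A small worked example can help here. The sketch below computes an exact two-sided p-value for a coin-flip experiment, where the null hypothesis is that the coin is fair. The function names and the 60-heads-in-100-flips scenario are assumptions for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def two_sided_p_value(successes, trials, p_null=0.5):
    """Sum the probabilities of all outcomes no more likely than the observed one."""
    observed = binom_pmf(successes, trials, p_null)
    return sum(
        binom_pmf(k, trials, p_null)
        for k in range(trials + 1)
        # Small tolerance so ties are not lost to floating-point rounding.
        if binom_pmf(k, trials, p_null) <= observed * (1 + 1e-9)
    )

p_value = two_sided_p_value(successes=60, trials=100)
print(f"p-value for 60 heads in 100 flips: {p_value:.4f}")
```

The result lands just above 0.05, so at a 5% significance level the null hypothesis of a fair coin would not be rejected.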
How does caching work and how is it helpful to Uber?
A cache is a high-speed data storage layer which stores a subset of data so that future requests for that data can be served faster when compared to accessing the data’s primary storage location. It allows you to efficiently reuse previously retrieved or computed data. The data in a cache is generally stored in fast access hardware such as RAM and its primary purpose is to increase data retrieval performance.
Uber Lite also uses caching to provide offline search. It smartly caches locations when the user is on Wi-Fi and the phone is fully charged, and uses them as and when required.
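The basic mechanism can be sketched in a few lines with Python's built-in `functools.lru_cache`. The `PRIMARY_STORE` dictionary, its illustrative coordinates, and the `lookup_location` function are hypothetical stand-ins for a slow primary data source; this is not how Uber Lite is implemented.

```python
from functools import lru_cache

# Hypothetical stand-in for the slow primary data store (made-up coordinates).
PRIMARY_STORE = {"airport": (12.95, 77.67), "station": (12.98, 77.57)}
store_reads = []  # records every time the primary store is actually hit

@lru_cache(maxsize=128)
def lookup_location(name):
    """Fetch a location; repeated calls with the same name are served from cache."""
    store_reads.append(name)
    return PRIMARY_STORE[name]

first = lookup_location("airport")   # cache miss: reads the primary store
second = lookup_location("airport")  # cache hit: primary store is not touched
stats = lookup_location.cache_info()
print(stats)
```

The `maxsize` bound matters because a cache holds only a subset of the data: once it is full, the least recently used entry is evicted.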
What are Time Series Forecasting Methods?
Time series forecasting methods mainly use historical values to produce forecasts. In ML, models learn patterns such as trend and seasonality from past observations and extrapolate them forward. Many statistical techniques are available for time series forecasting; a few of them are listed below:
a. Simple moving average
b. Exponential Smoothing
c. Autoregressive Integrated Moving Average (ARIMA)
d. Neural Network
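The first two methods in the list are simple enough to sketch by hand. The function names and the ride counts below are made up for illustration.

```python
def moving_average_forecast(series, window):
    """Forecast the next value as the mean of the last `window` observations."""
    return sum(series[-window:]) / window

def exponential_smoothing_forecast(series, alpha):
    """Forecast the next value with simple exponential smoothing.

    alpha near 1 weights recent observations heavily; alpha near 0 smooths more.
    """
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level

# Hypothetical hourly ride counts.
rides = [120, 130, 125, 140, 150, 145]
sma = moving_average_forecast(rides, window=3)
ses = exponential_smoothing_forecast(rides, alpha=0.5)
print(sma, ses)
```

Both methods use only historical values; the difference is that exponential smoothing lets older observations fade out gradually instead of dropping them at a hard window boundary.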
What are anomaly detection methods?
Detecting abnormal events or changes in datasets early lets us process data faster and more efficiently. This is where anomaly detection, a technique that often relies on AI, comes into the picture. It can surface problems that would be hard for a human to detect.
Both supervised and unsupervised machine learning techniques are used for anomaly detection.
Supervised neural networks, support vector machines, Bayesian networks, and decision trees are some of the supervised ML techniques used. When labeled examples of anomalies are available, these often perform better than unsupervised methods because they can use prior knowledge and data. Popular unsupervised approaches include hypothesis testing and principal component analysis (PCA).
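As a very simple unsupervised illustration, the sketch below flags values whose z-score exceeds a threshold. This is a basic statistical rule, swapped in here because it fits in a few lines; it is not one of the specific algorithms named above. The function name, the threshold of 2.5, and the trip durations are assumptions.

```python
from statistics import mean, pstdev

def zscore_anomalies(values, threshold=2.5):
    """Return (index, value) pairs lying more than `threshold` std devs from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical: nothing stands out
    return [(i, v) for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]

# Hypothetical trip durations in minutes; the last one looks suspicious.
trip_minutes = [10, 11, 9, 10, 12, 10, 11, 9, 10, 50]
anomalies = zscore_anomalies(trip_minutes)
print(anomalies)
```

No labels are needed, which is the appeal of unsupervised methods, but a single extreme value also inflates the standard deviation, so robust alternatives are often preferred on small samples.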
What are the assumptions of linear regression?
There are four main assumptions of linear regression:
a. Independence of residuals
b. Normal distribution of residuals
c. Linearity of the relationship between the predictors and the response
d. Equal variance of residuals (homoscedasticity)
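Checking these assumptions starts with the residuals of a fitted model. The sketch below fits a one-variable least-squares line by hand and computes the Durbin-Watson statistic, a common check for independence of residuals (values near 2 suggest no first-order autocorrelation). The data and function names are made up for illustration.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def residuals(xs, ys, slope, intercept):
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]

def durbin_watson(res):
    """Ranges from 0 to 4; near 2 suggests uncorrelated successive residuals."""
    num = sum((res[i] - res[i - 1]) ** 2 for i in range(1, len(res)))
    return num / sum(r**2 for r in res)

# Toy data: a linear trend y = 2x + 1 plus small hand-picked noise.
noise = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3]
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2 * x + 1 + e for x, e in zip(xs, noise)]

slope, intercept = fit_line(xs, ys)
res = residuals(xs, ys, slope, intercept)
dw = durbin_watson(res)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, Durbin-Watson={dw:.2f}")
```

Normality and equal variance are usually checked on the same residuals, for example with a Q-Q plot and a residuals-versus-fitted plot.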
What is a Bayesian network?
A Bayesian network is a type of probabilistic graphical model which uses Bayesian inference for probability computations. When an event occurs and we want to predict the likelihood that any one of several known causes was a contributing factor, a Bayesian network is used. For example, a Bayesian network could represent the probabilistic relationships between diseases and symptoms. Given the symptoms, the network can be used to compute the probabilities of various diseases.
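The disease-and-symptom example can be sketched as a tiny three-node network: two possible causes (flu, cold) and one symptom (fever). All probabilities below are invented for illustration, and inference is done by brute-force enumeration, which only works for very small networks.

```python
from itertools import product

# Made-up prior probabilities of each disease.
P_FLU = 0.1
P_COLD = 0.2
# Made-up conditional probability of fever given (flu, cold).
P_FEVER = {
    (True, True): 0.95,
    (True, False): 0.9,
    (False, True): 0.3,
    (False, False): 0.01,
}

def joint(flu, cold, fever):
    """Joint probability from the network factorization P(flu) P(cold) P(fever | flu, cold)."""
    p = (P_FLU if flu else 1 - P_FLU) * (P_COLD if cold else 1 - P_COLD)
    p_fever = P_FEVER[(flu, cold)]
    return p * (p_fever if fever else 1 - p_fever)

def posterior_given_fever(disease):
    """P(disease is present | fever observed); disease is 'flu' or 'cold'."""
    evidence = 0.0
    match = 0.0
    for flu, cold in product([True, False], repeat=2):
        p = joint(flu, cold, True)
        evidence += p
        if flu if disease == "flu" else cold:
            match += p
    return match / evidence

p_flu = posterior_given_fever("flu")
p_cold = posterior_given_fever("cold")
print(f"P(flu | fever) = {p_flu:.3f}, P(cold | fever) = {p_cold:.3f}")
```

Observing the symptom raises the probability of both diseases well above their priors, with flu becoming the more likely explanation because it produces fever more reliably in this toy model.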
What is an ROC Curve?
A Receiver Operating Characteristic curve, or ROC curve, is a graph that shows the performance of a classification model at all classification thresholds. It plots two parameters against each other: True Positive Rate (TPR) on the vertical axis and False Positive Rate (FPR) on the horizontal axis.
AUC stands for Area Under the ROC Curve. It measures the entire two-dimensional area underneath the ROC curve, from (0,0) to (1,1).
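Both quantities are straightforward to compute by hand for a small example. The sketch below sweeps every distinct score as a threshold, records the (FPR, TPR) pairs, and integrates with the trapezoidal rule. The labels, scores, and function names are made up for illustration.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points, one per threshold, starting at (0, 0)."""
    thresholds = sorted(set(scores), reverse=True)
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for t in thresholds:
        # Everything scored at or above the threshold is predicted positive.
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2 for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical true labels and model scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(labels, scores)
area = auc(points)
print(f"AUC = {area:.3f}")
```

An AUC of 1.0 means every positive is scored above every negative, while 0.5 is what random scoring would achieve on average.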
If you want to learn more and refine your knowledge of these topics, you can enroll in our Artificial Intelligence Course, with training from industry professionals, case studies and hands-on projects.
We’ll wrap up the questions here. All the best! You can get in touch with us for any further details, and we will get back to you with the most industry-relevant answers.
If you wish to read how our alumnus, Alvino Aji, went from being a fresh college graduate to landing a job at Uber, follow his journey here.