{"id":6057,"date":"2023-01-10T12:06:03","date_gmt":"2023-01-10T06:36:03","guid":{"rendered":"https:\/\/www.mygreatlearning.com\/blog\/linear-regression-in-machine-learning\/"},"modified":"2024-09-02T15:21:57","modified_gmt":"2024-09-02T09:51:57","slug":"linear-regression-in-machine-learning","status":"publish","type":"post","link":"https:\/\/www.mygreatlearning.com\/blog\/linear-regression-in-machine-learning\/","title":{"rendered":"A Guide to Linear Regression in Machine Learning"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\" id=\"what-is-linear-regression\"><strong>What is Linear Regression?<\/strong><\/h2>\n\n\n\n<p>Linear Regression is the basic form of regression analysis. It assumes that there is a linear relationship between the dependent variable and the predictor(s). In regression, we try to calculate the best fit line, which describes the relationship between the predictors and predictive\/dependent variables.<\/p>\n\n\n\n<p>There are four assumptions associated with a linear regression model:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Linearity<\/strong>: The relationship between independent variables and the mean of the dependent variable is linear.&nbsp;<\/li>\n\n\n\n<li><strong>Homoscedasticity<\/strong>: The variance of residuals should be equal.<\/li>\n\n\n\n<li><strong>Independence<\/strong>: Observations are independent of each other.<\/li>\n\n\n\n<li><strong>Normality<\/strong>: The dependent variable is normally distributed for any fixed value of an independent variable.<\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"isnt-linear-regression-from-statistics\"><b>Isn\u2019t Linear Regression from Statistics?<\/b><\/h2>\n\n\n\n<p><span style=\"font-weight: 400\">Before we dive into the details of linear regression, you may be asking yourself why we are looking at this algorithm.<\/span><\/p>\n\n\n\n<p><span style=\"font-weight: 400\">Isn\u2019t it a technique from statistics? 
<a aria-label=\" (opens in a new tab)\" href=\"https:\/\/www.mygreatlearning.com\/blog\/what-is-machine-learning\/\" target=\"_blank\" rel=\"noreferrer noopener\">Machine learning<\/a>, more specifically the field of <a href=\"https:\/\/www.mygreatlearning.com\/blog\/what-is-predictive-modeling\/\">predictive modeling<\/a>,<\/span> is primarily concerned with minimizing the error of a model or making the most accurate predictions possible<span style=\"font-weight: 400\">, sometimes at the expense of explainability. In applied machine learning, we borrow and reuse algorithms from many different fields, including statistics, and use them towards these ends.<\/span><\/p>\n\n\n\n<p><span style=\"font-weight: 400\">Linear regression was developed in the field of statistics and is studied as a model for understanding the relationship between input and output numerical variables. However, it has been borrowed by machine learning, so it is both a statistical algorithm and a machine learning algorithm.<\/span><\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"linear-regression-model-representation\"><b>Linear Regression Model Representation<\/b><\/h2>\n\n\n\n<p><span style=\"font-weight: 400\">Linear regression is an attractive model because the representation is so simple.<\/span><br><span style=\"font-weight: 400\">The representation is a linear equation that combines a specific set of input values (x), the solution to which is the predicted output for that set of input values (y). As such, both the input values (x) and the output value (y) are numeric.<\/span><\/p>\n\n\n\n<p><span style=\"font-weight: 400\">The linear equation assigns one scale factor to each input value or column, called a coefficient and represented by the Greek letter beta (\u03b2). 
One additional coefficient is <\/span>added, giving the line an additional degree of freedom (e.g.,<span style=\"font-weight: 400\"> moving up and down on a two-dimensional plot) and is often called the intercept or the bias coefficient.<\/span><\/p>\n\n\n\n<p><span style=\"font-weight: 400\">For example, in a simple regression problem (a single x and a single y), the form of the model would be:<\/span><br><span style=\"font-weight: 400\">y = \u03b2<sub>0<\/sub> + \u03b2<sub>1<\/sub>x<\/span><\/p>\n\n\n\n<p><span style=\"font-weight: 400\">In higher dimensions, the line is called a plane or a hyperplane<\/span> when we have more than one input (x). The representation, therefore, is in the form of the equation and the specific values used for the coefficients (e.g.,<span style=\"font-weight: 400\"> \u03b2<sub>0<\/sub> and \u03b2<sub>1<\/sub> in the above example).<\/span><\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"performance-of-regression\"><strong>Performance of Regression<\/strong><\/h2>\n\n\n\n<p>The regression model's performance can be evaluated using various metrics like MAE, MAPE, RMSE, R-squared, etc.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"mean-absolute-error-mae\">Mean Absolute Error (MAE)<\/h3>\n\n\n\n<p>By using MAE, we calculate the average absolute difference between the actual values and the predicted values.&nbsp;<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"mean-absolute-percentage-error-mape\">Mean Absolute Percentage Error (MAPE)&nbsp;<\/h3>\n\n\n\n<p>MAPE is defined as the average absolute deviation of the predicted value from the actual value, expressed relative to the actual value. 
It is the average of the ratio of the absolute difference between the actual and predicted values to the actual values.&nbsp;<\/p>\n\n\n\n<p>MAPE = (100 \/ n) \u00d7 \u03a3 |y<sub>i<\/sub> \u2212 \u0177<sub>i<\/sub>| \/ |y<sub>i<\/sub>|<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"root-mean-square-error-rmse\">Root Mean Square Error (RMSE)<\/h3>\n\n\n\n<p>RMSE is the square root of the mean of the squared differences between the actual and the predicted values.<\/p>\n\n\n\n<p>RMSE = \u221a[(1 \/ n) \u00d7 \u03a3 (y<sub>i<\/sub> \u2212 \u0177<sub>i<\/sub>)<sup>2<\/sup>]<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"r-squared-values\">R-squared values<\/h3>\n\n\n\n<p>The R-square value is the proportion of the variation in the dependent variable that is explained by the independent variables in the model.&nbsp;<\/p>\n\n\n\n<p>R<sup>2<\/sup> = 1 \u2212 RSS \/ TSS<\/p>\n\n\n\n<p><strong>RSS = Residual sum of squares<\/strong>: It is the sum of the squared differences between the actual and the predicted output. A small RSS indicates a tight fit of the model to the data. It is defined as follows:&nbsp;<\/p>\n\n\n\n<p>RSS = \u03a3 (y<sub>i<\/sub> \u2212 \u0177<sub>i<\/sub>)<sup>2<\/sup><\/p>\n\n\n\n<p><strong>TSS = Total sum of squares<\/strong>: It is the sum of the squared deviations of the data points from the response variable's mean.&nbsp;<\/p>\n\n\n\n<p>TSS = \u03a3 (y<sub>i<\/sub> \u2212 y\u0304)<sup>2<\/sup><\/p>\n\n\n\n<p>For a least-squares model with an intercept, the R<sup>2<\/sup> value on the training data ranges from 0 to 1. The higher the R-square value, the better the model fits. The value of R<sup>2<\/sup> never decreases when we add more variables to the model, irrespective of whether the variable contributes to the model or not. This is the disadvantage of using R<sup>2<\/sup>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"adjusted-r-squared-values\">Adjusted R-squared values<\/h3>\n\n\n\n<p>The Adjusted R<sup>2<\/sup> value addresses this disadvantage of R<sup>2<\/sup> by adding a penalty for each additional variable, so it improves only if the added variable contributes enough to the model.<\/p>\n\n\n\n<p>Adjusted R<sup>2<\/sup> = 1 \u2212 [(1 \u2212 R<sup>2<\/sup>) \u00d7 (n \u2212 1) \/ (n \u2212 k \u2212 1)]<\/p>\n\n\n\n<p>where R<sup>2<\/sup> is the R-square value, n = the total number of observations, and k = the total number of variables (predictors) used in the model. If we increase the number of variables, the denominator (n \u2212 k \u2212 1) becomes smaller, and the overall ratio becomes larger. 
Subtracting from 1 will reduce the overall Adjusted R<sup>2<\/sup>. So to increase the Adjusted R<sup>2<\/sup>, the contribution of additive features to the model should be significantly high.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"simple-linear-regression-example\"><strong>Simple Linear Regression Example<\/strong><\/h3>\n\n\n\n<p>For the given equation for the Linear Regression,<\/p>\n\n\n\n<p><\/p>\n\n\n\n<p>If there is only 1 predictor available, then it is known as Simple Linear Regression.&nbsp;<\/p>\n\n\n\n<p><\/p>\n\n\n\n<p>While executing the prediction, there is an error term that is associated with the equation.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<p>The SLR model aims to find the estimated values of \u03b2<sub>1 <\/sub>&amp; \u03b2<sub>0<\/sub> by keeping the error term (\u03b5) minimum.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"multiple-linear-regression-example\"><strong>Multiple Linear Regression Example<\/strong><\/h3>\n\n\n\n<p><strong><em>Contributed by: Rakesh Lakalla <br>LinkedIn profile: <a rel=\"noreferrer noopener\" aria-label=\"https:\/\/www.linkedin.com\/in\/lakkalarakesh\/  (opens in a new tab)\" href=\"https:\/\/www.linkedin.com\/in\/lakkalarakesh\/\" target=\"_blank\">https:\/\/www.linkedin.com\/in\/lakkalarakesh\/ <\/a><\/em><\/strong><\/p>\n\n\n\n<p>For the given equation of Linear Regression, <\/p>\n\n\n<figure class=\"wp-block-image size-large is-resized zoomable\" data-full=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/03\/linreg-8-2.png\"><img decoding=\"async\" width=\"364\" height=\"37\" src=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/03\/linreg-8-2.png\" alt=\"\" class=\"wp-image-13073\" style=\"width:519px;height:53px\" srcset=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/03\/linreg-8-2.png 364w, https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/03\/linreg-8-2-300x30.png 300w, 
https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/03\/linreg-8-2-356x37.png 356w\" sizes=\"(max-width: 364px) 100vw, 364px\" \/><\/figure>\n\n\n\n<p>if there is more than 1 predictor available, then it is known as Multiple Linear Regression.&nbsp;<\/p>\n\n\n\n<p>The equation for MLR will be:<\/p>\n\n\n<figure class=\"wp-block-image size-large is-resized zoomable\" data-full=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/03\/linreg-8-3.png\"><img decoding=\"async\" width=\"436\" height=\"44\" src=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/03\/linreg-8-3.png\" alt=\"\" class=\"wp-image-13074\" style=\"width:556px;height:56px\" srcset=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/03\/linreg-8-3.png 436w, https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/03\/linreg-8-3-300x30.png 300w\" sizes=\"(max-width: 436px) 100vw, 436px\" \/><\/figure>\n\n\n\n<p>\u03b2<sub>1<\/sub> = coefficient for X<sub>1<\/sub> variable<\/p>\n\n\n\n<p>\u03b2<sub>2<\/sub> = coefficient for X<sub>2<\/sub> variable<\/p>\n\n\n\n<p>\u03b2<sub>3<\/sub> = coefficient for X<sub>3<\/sub> variable and so on\u2026<\/p>\n\n\n\n<p>\u03b2<sub>0<\/sub> is the intercept (constant term). 
While making the prediction, there is an error term that is associated with the equation.<\/p>\n\n\n<figure class=\"wp-block-image size-large is-resized zoomable\" data-full=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/03\/linreg-9-3.png\"><img decoding=\"async\" width=\"497\" height=\"49\" src=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/03\/linreg-9-3.png\" alt=\"\" class=\"wp-image-13075\" style=\"width:656px;height:66px\" srcset=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/03\/linreg-9-3.png 497w, https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/03\/linreg-9-3-300x30.png 300w\" sizes=\"(max-width: 497px) 100vw, 497px\" \/><\/figure>\n\n\n\n<p>The goal of the MLR model is to find the estimated values of \u03b2<sub>0<\/sub>, \u03b2<sub>1<\/sub>, \u03b2<sub>2<\/sub>, \u03b2<sub>3<\/sub>, \u2026 while keeping the error term (\u03b5<sub>i<\/sub>) to a minimum.<\/p>\n\n\n\n<p>Broadly speaking, supervised machine learning algorithms are classified into two types:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Regression: Used to predict a continuous variable<\/li>\n\n\n\n<li>Classification: Used to predict a discrete variable&nbsp;<\/li>\n<\/ol>\n\n\n\n<p>In this post, we will discuss one of the regression techniques, \u201cMultiple Linear Regression,\u201d and its implementation using Python.<\/p>\n\n\n\n<p>Linear regression is one of the statistical methods used in predictive analytics to predict the target variable (dependent variable). When we have one independent variable, we call it Simple Linear Regression. 
If the number of independent variables is more than one, we call it Multiple Linear Regression.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"assumptions-for-multiple-linear-regression\"><strong>Assumptions for Multiple Linear Regression<\/strong><\/h4>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Linearity: <\/strong>There should be a linear relationship between dependent and independent variables, as shown in the example graph below.<\/li>\n<\/ol>\n\n\n\n<p>2. <strong>Multicollinearity: <\/strong>There should not be a high correlation between two or more independent variables. Multicollinearity can be checked using a correlation matrix, Tolerance, and the Variance Inflation Factor (VIF).<\/p>\n\n\n\n<p>3. <strong>Homoscedasticity: <\/strong>If the variance of the errors is constant across the independent variables, it is called homoscedasticity. The residuals should be homoscedastic. A plot of standardized residuals versus predicted values is used to check homoscedasticity, as shown in the figure below. The Breusch-Pagan and White tests are well-known tests for homoscedasticity.<\/p>\n\n\n\n<p>4. <strong>Multivariate Normality: <\/strong>Residuals should be normally distributed. Q-Q plots of the residuals are commonly used to check normality.<\/p>\n\n\n\n<p>5. <strong>Categorical Data: <\/strong>Any categorical data present should be converted into dummy variables.<\/p>\n\n\n\n<p>6. 
<strong>Minimum records: <\/strong>There should be at least 20 records of independent variables.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"a-mathematical-formulation-of-multiple-linear-regression\"><strong>A mathematical formulation of Multiple Linear Regression<\/strong><\/h4>\n\n\n\n<p>In Linear Regression, we try to find a linear relationship between independent and dependent variables by fitting a linear equation to the data.<\/p>\n\n\n\n<p>The equation of a straight line is:<\/p>\n\n\n\n<p><strong>Y = mx + c<\/strong><\/p>\n\n\n\n<p>where m is the slope and c is the intercept.<\/p>\n\n\n\n<p>In Linear Regression, we are actually trying to estimate the best m and c values for dependent variable Y and independent variable x. Conceptually, among all candidate lines we take the one that gives the least possible error. We use the corresponding m and c values to predict the y value.<\/p>\n\n\n\n<p>The same concept can be used in Multiple Linear Regression, where we have multiple independent variables, x<sub>1<\/sub>, x<sub>2<\/sub>, x<sub>3<\/sub>, \u2026, x<sub>n<\/sub>.<\/p>\n\n\n\n<p>Now the equation changes to:&nbsp;<\/p>\n\n\n\n<p><strong>Y = M<sub>1<\/sub>X<sub>1<\/sub> + M<sub>2<\/sub>X<sub>2<\/sub> + M<sub>3<\/sub>X<sub>3<\/sub> + \u2026 + M<sub>n<\/sub>X<sub>n<\/sub> + C<\/strong><\/p>\n\n\n\n<p>The above equation describes not a line but a plane in multiple dimensions (a hyperplane).<\/p>\n\n\n\n<p><strong>Model Evaluation:<\/strong><\/p>\n\n\n\n<p>A model can be evaluated by using the methods below:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Mean absolute error: <\/strong>It is the mean of the absolute values of the errors, formulated as MAE = (1 \/ n) \u00d7 \u03a3 |y<sub>i<\/sub> \u2212 \u0177<sub>i<\/sub>|.<\/li>\n<\/ol>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Mean squared error: <\/strong>It is the mean of the squared errors, MSE = (1 \/ n) \u00d7 \u03a3 (y<sub>i<\/sub> \u2212 \u0177<sub>i<\/sub>)<sup>2<\/sup>.<\/li>\n<\/ol>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Root mean squared error: <\/strong>It is just the square root of MSE.<\/li>\n<\/ol>\n\n\n\n<p><strong>Applications<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>The effect of the independent variable on the dependent variable can be calculated.<\/li>\n\n\n\n<li>Used to predict 
trends.<\/li>\n\n\n\n<li>Used to find how much change can be expected in a dependent variable with change in an independent variable.<\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"polynomial-regression\"><strong>Polynomial Regression<\/strong><\/h2>\n\n\n\n<p>Polynomial regression models a non-linear relationship between the variables, although the model remains linear in its coefficients. In polynomial regression, the dependent variable is modeled as an nth-degree polynomial of the independent variable.&nbsp;<\/p>\n\n\n\n<p>Equation of polynomial regression:&nbsp;<\/p>\n\n\n<figure class=\"wp-block-image size-large is-resized zoomable\" data-full=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/03\/linreg-9-4.png\"><img decoding=\"async\" width=\"552\" height=\"55\" src=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/03\/linreg-9-4.png\" alt=\"\" class=\"wp-image-13076\" style=\"width:740px;height:75px\" srcset=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/03\/linreg-9-4.png 552w, https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/03\/linreg-9-4-300x30.png 300w\" sizes=\"(max-width: 552px) 100vw, 552px\" \/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"underfitting-and-overfitting\"><strong>Underfitting and Overfitting<\/strong><\/h2>\n\n\n\n<p>When we fit a model, we try to find the best-fit line, which describes the impact of a change in the independent variable on the dependent variable while keeping the error term to a minimum. While fitting the model, there can be 2 events that will lead to the bad performance of the model. 
These events are<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><a href=\"https:\/\/www.mygreatlearning.com\/blog\/overfitting-and-underfitting-in-machine-learning\/\" target=\"_blank\" rel=\"noreferrer noopener\" aria-label=\"Underfitting&nbsp;\u2028Overfitting (opens in a new tab)\">Underfitting&nbsp;<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/www.mygreatlearning.com\/blog\/overfitting-and-underfitting-in-machine-learning\/\" target=\"_blank\" rel=\"noreferrer noopener\" aria-label=\"Underfitting&nbsp;\u2028Overfitting (opens in a new tab)\">Overfitting<\/a><\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"underfitting\"><strong>Underfitting&nbsp;<\/strong><\/h3>\n\n\n\n<p>Underfitting is the condition where the model cannot fit the data well enough. The model is unable to capture the relationship, trend, or pattern in the training data, which leads to low accuracy on both the training data and unseen data. Underfitting can often be reduced by using a more flexible model, adding informative features, or tuning the parameters of the model.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"overfitting\"><strong>Overfitting<\/strong><\/h3>\n\n\n\n<p>Overfitting is the opposite case of underfitting, i.e., when the model predicts very well on training data but is not able to predict well on test data or validation data. The main reason for overfitting could be that the model is memorizing the training data and is unable to generalize to a test\/unseen dataset. Overfitting can be reduced by performing feature selection or by using regularisation techniques.&nbsp;<\/p>\n\n\n\n<p><\/p>\n\n\n\n<p>The above graphs depict the 3 cases of the model performance.&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"implementing-linear-regression-in-python\"><strong>Implementing Linear Regression in Python<\/strong><\/h2>\n\n\n\n<p><strong><em>Contributed by:  Ms. 
Manorama Yadav <br>LinkedIn:  <a href=\"https:\/\/www.linkedin.com\/in\/manorama-3110\/\" target=\"_blank\" rel=\"noreferrer noopener\" aria-label=\" (opens in a new tab)\">https:\/\/www.linkedin.com\/in\/manorama-3110\/<\/a> <\/em><\/strong><\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"dataset-introduction\">Dataset Introduction<\/h3>\n\n\n\n<p>The data concerns city-cycle fuel consumption in miles per gallon (mpg), which is the value to be predicted. There are a total of 392 rows, 5 independent variables, and 1 dependent variable. Four of the predictors are continuous variables, and cylinders is a multi-valued discrete variable.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"attribute-information\"><strong>&nbsp;Attribute Information:<\/strong><\/h4>\n\n\n\n<ol class=\"wp-block-list\">\n<li>mpg: continuous (<strong>Dependent Variable<\/strong>)<\/li>\n\n\n\n<li>cylinders: multi-valued discrete<\/li>\n\n\n\n<li>displacement: continuous<\/li>\n\n\n\n<li>horsepower: continuous<\/li>\n\n\n\n<li>weight: continuous<\/li>\n\n\n\n<li>acceleration: continuous<\/li>\n<\/ol>\n\n\n\n<p><strong>The objective of the problem statement is to predict the miles per gallon using the Linear Regression model.<\/strong><\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"python-packages-for-linear-regression\"><strong>Python Packages for Linear Regression<\/strong><\/h3>\n\n\n\n<p><a aria-label=\"Import the necessary Python package (opens in a new tab)\" href=\"https:\/\/www.mygreatlearning.com\/blog\/python-tutorial-for-beginners-a-complete-guide\/\" target=\"_blank\" rel=\"noreferrer noopener\">Import the necessary Python packages<\/a> to perform various steps like data reading, plotting the data, and performing linear regression. 
Import the following packages:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/4\/d\/slIMpGq2ppmUTn-IM0bUMIg\/image?w=602&amp;h=174&amp;rev=3&amp;ac=1&amp;parent=1TL-O7MLrz5xFwDNx0YPi5DXaeyn4IIdh\" alt=\"\"\/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"read-the-data\">Read the data<\/h3>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/4\/d\/sNyRbmFTMMcdz9_mnMYoB5Q\/image?w=659&amp;h=103&amp;rev=2&amp;ac=1&amp;parent=1TL-O7MLrz5xFwDNx0YPi5DXaeyn4IIdh\" alt=\"\"\/><\/figure>\n\n\n\n<p>Download the data and save it in the data directory of the project folder. <\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"simple-linear-regression-with-scikit-learn\"><strong>Simple Linear Regression With scikit-learn<\/strong><\/h3>\n\n\n\n<p>Simple Linear regression has only 1 predictor variable and 1 dependent variable. From the above dataset, let\u2019s consider the effect of horsepower on the \u2018mpg\u2019 of the vehicle.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/4\/d\/sh-8lD3UHH0LC6Cs1vtlUkA\/image?w=602&amp;h=95&amp;rev=2&amp;ac=1&amp;parent=1TL-O7MLrz5xFwDNx0YPi5DXaeyn4IIdh\" alt=\"\"\/><\/figure>\n\n\n\n<p>Let\u2019s take a look at what the data looks like:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/4\/d\/spvfMi6qjGrdxbmTVQqBzDQ\/image?w=602&amp;h=197&amp;rev=2&amp;ac=1&amp;parent=1TL-O7MLrz5xFwDNx0YPi5DXaeyn4IIdh\" alt=\"\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>From the above graph, we can infer a negative linear relationship between horsepower and miles per gallon (mpg). 
As horsepower increases, mpg decreases.<\/p>\n\n\n\n<p>Now, let\u2019s perform the Simple linear regression.&nbsp;<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/4\/d\/sv7F1q7rqkqDh8ecduDThuw\/image?w=635&amp;h=221&amp;rev=2&amp;ac=1&amp;parent=1TL-O7MLrz5xFwDNx0YPi5DXaeyn4IIdh\" alt=\"\"\/><\/figure>\n\n\n\n<p>From the output of the above SLR model, the equation of the best fit line of the model is&nbsp;<\/p>\n\n\n\n<p class=\"has-text-align-center\"><strong>mpg = 39.94 + (-0.16)*(horsepower)<\/strong><\/p>\n\n\n\n<p>By comparing the above equation to the SLR model equation Y<sub>i<\/sub> = \u03b2<sub>0<\/sub> + \u03b2<sub>1<\/sub>X<sub>i<\/sub>, we get <span style=\"background-color:#f7cc62\" class=\"td_text_highlight_marker\">\u03b2<sub>0<\/sub> = 39.94, \u03b2<sub>1<\/sub> = -0.16<\/span><\/p>\n\n\n\n<p>Now, check the model fit by looking at its R<sup>2<\/sup> and RMSE values.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/4\/d\/szcudMbNCXnnjfAfEissJ4w\/image?w=660&amp;h=163&amp;rev=2&amp;ac=1&amp;parent=1TL-O7MLrz5xFwDNx0YPi5DXaeyn4IIdh\" alt=\"\"\/><\/figure>\n\n\n\n<p>R<sup>2<\/sup> and RMSE (root mean square error) values are <span style=\"background-color:#f7cc62\" class=\"td_text_highlight_marker\">0.6059 <\/span>and <span style=\"background-color:#f7cc62\" class=\"td_text_highlight_marker\">4.89,<\/span> respectively. It means that about 61% of the variance in mpg is explained by horsepower. For a simple linear regression model, this result is acceptable but not great, since other variables such as cylinders, acceleration, etc. may also have an effect. 
The RMSE is also fairly low: predictions are typically off by about 4.9 mpg.&nbsp;<\/p>\n\n\n\n<p>Let\u2019s check how the line fits the data.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/4\/d\/syH0Th6MQ5tgLJdhNTWztyg\/image?w=614&amp;h=240&amp;rev=2&amp;ac=1&amp;parent=1TL-O7MLrz5xFwDNx0YPi5DXaeyn4IIdh\" alt=\"\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>From the graph, we can infer that the best fit line is able to explain the effect of horsepower on mpg.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"multiple-linear-regression-with-scikit-learn\"><strong>Multiple Linear Regression With scikit-learn<\/strong><\/h3>\n\n\n\n<p>Since the data is already loaded in the system, we will start performing multiple linear regression.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/4\/d\/s8l2m1pqR32r1g9fkPDKe5w\/image?w=625&amp;h=57&amp;rev=2&amp;ac=1&amp;parent=1TL-O7MLrz5xFwDNx0YPi5DXaeyn4IIdh\" alt=\"\"\/><\/figure>\n\n\n\n<p>The actual data has 5 independent variables and 1 dependent variable (mpg).<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/4\/d\/s9432HXFvflt46mNUrGhEuA\/image?w=669&amp;h=322&amp;rev=2&amp;ac=1&amp;parent=1TL-O7MLrz5xFwDNx0YPi5DXaeyn4IIdh\" alt=\"\"\/><\/figure>\n\n\n\n<p>The best fit line for Multiple Linear Regression is&nbsp;<\/p>\n\n\n\n<p><strong>mpg = 46.26 + (-0.4)*(cylinders) + (-8.313e-05)*(displacement) + (-0.045)*(horsepower) + (-0.01)*(weight) + (-0.03)*(acceleration)<\/strong><\/p>\n\n\n\n<p>By comparing the best fit line equation with<\/p>\n\n\n<figure class=\"wp-block-image size-large is-resized zoomable\" data-full=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/03\/linreg-10-2.png\"><img decoding=\"async\" width=\"358\" height=\"36\" src=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/03\/linreg-10-2.png\" alt=\"\" class=\"wp-image-13078\" 
style=\"width:550px;height:56px\" srcset=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/03\/linreg-10-2.png 358w, https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/03\/linreg-10-2-300x30.png 300w, https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/03\/linreg-10-2-356x36.png 356w\" sizes=\"(max-width: 358px) 100vw, 358px\" \/><\/figure>\n\n\n\n<p> <span style=\"background-color:#f7cc62\" class=\"td_text_highlight_marker\">\u03b2<sub>0<\/sub> (Intercept) = 46.26,  \u03b2<sub>1<\/sub> = -0.4,  \u03b2<sub>2<\/sub> = -8.313e-05,  \u03b2<sub>3<\/sub> = -0.045,  \u03b2<sub>4<\/sub> = -0.01,  \u03b2<sub>5<\/sub> = -0.03<\/span><\/p>\n\n\n\n<p>Now, let\u2019s check the R<sup>2<\/sup> and RMSE values.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/4\/d\/sbjnafMIHsvyrK2cKdY_uOg\/image?w=661&amp;h=167&amp;rev=3&amp;ac=1&amp;parent=1TL-O7MLrz5xFwDNx0YPi5DXaeyn4IIdh\" alt=\"\"\/><\/figure>\n\n\n\n<p>R<sup>2<\/sup> and RMSE (root mean square error) values are <span style=\"background-color:#f7cc62\" class=\"td_text_highlight_marker\">0.707<\/span> and <span style=\"background-color:#f7cc62\" class=\"td_text_highlight_marker\">4.21,<\/span> respectively. It means that ~71% of the variance in mpg is explained by all the predictors. This depicts a good model. Compared with Simple Linear Regression, R<sup>2<\/sup> is higher (0.707 versus 0.6059) and RMSE is lower (4.21 versus 4.89), which suggests that adding more variables improved the model performance here. 
In general, the higher the R<sup>2<\/sup> value and the lower the RMSE, the better the model.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"multiple-linear-regression-implementation-using-python\"><strong>Multiple Linear Regression- Implementation using Python<\/strong><\/h4>\n\n\n\n<p>Let us take a small data set and try building a model using Python.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import pandas as pd\nimport numpy as np\nimport seaborn as sns\nimport matplotlib.pyplot as plt\n%matplotlib inline\nfrom sklearn.model_selection import train_test_split \nfrom sklearn.linear_model import LinearRegression\nfrom sklearn import metrics\n<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code>\ndata=pd.read_csv(\"Consumer.csv\")\ndata.head()\n<\/code><\/pre>\n\n\n\n<p>Calling data.head() displays the top 5 rows of the data. We are trying to predict the Amount Charged (dependent variable) based on the other two independent variables, Income and Household Size. We first check the assumptions on our data set.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Check for Linearity<\/strong><\/li>\n<\/ol>\n\n\n\n<pre class=\"wp-block-code\"><code>plt.figure(figsize=(14,5))\nplt.subplot(1,2,1)\nplt.scatter(data&#91;'AmountCharged'], data&#91;'Income'])\nplt.xlabel('AmountCharged')\nplt.ylabel('Income')\nplt.subplot(1,2,2)\nplt.scatter(data&#91;'AmountCharged'], data&#91;'HouseholdSize'])\nplt.xlabel('AmountCharged')\nplt.ylabel('HouseholdSize')\nplt.show()\n<\/code><\/pre>\n\n\n\n<p>From the above graphs, we can see that there is a roughly linear relationship between the Amount Charged and each of Income and Household Size. <\/p>\n\n\n\n<p>2. 
<strong>Check for Multicollinearity<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>sns.scatterplot(x=data&#91;'Income'], y=data&#91;'HouseholdSize'])\n<\/code><\/pre>\n\n\n\n<p>From the above graph, there appears to be no strong collinearity between Income and HouseholdSize.<\/p>\n\n\n\n<p>We split our data into train and test sets in a ratio of 80:20, respectively, using the function <strong>train_test_split<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>X = data&#91;&#91;'Income', 'HouseholdSize']]\ny = data&#91;'AmountCharged']\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=9)\n<\/code><\/pre>\n\n\n\n<p>3. <strong>Check for Homoscedasticity<\/strong><\/p>\n\n\n\n<p>First, we need to fit the model on the training data, predict on the test data, and calculate the residuals:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>prediction = LinearRegression().fit(X_train, y_train).predict(X_test)  # fit on train, predict on test\nresi = y_test - prediction  # residuals = actual - predicted\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"polynomial-regression-with-scikit-learn\"><strong>Polynomial Regression With scikit-learn<\/strong><\/h3>\n\n\n\n<p>For Polynomial regression, we will use the same data that we used for Simple Linear Regression.&nbsp;<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/4\/d\/sPARqoFTpQu19xMKGIleRLQ\/image?w=602&amp;h=95&amp;rev=2&amp;ac=1&amp;parent=1TL-O7MLrz5xFwDNx0YPi5DXaeyn4IIdh\" alt=\"\"\/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/4\/d\/ssZ5jZJYLM17JdmveLc69Qw\/image?w=619&amp;h=200&amp;rev=2&amp;ac=1&amp;parent=1TL-O7MLrz5xFwDNx0YPi5DXaeyn4IIdh\" alt=\"\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>The graph shows that the relationship between horsepower and miles per gallon is not perfectly linear. 
It\u2019s a little bit curved.&nbsp;<\/p>\n\n\n\n<p>Graph for the Best fit line for Simple Linear Regression as per below:<\/p>\n\n\n\n<p><\/p>\n\n\n\n<p>From the plot, we can infer that the best fit line is able to explain the effect of the independent variable, however, this does not apply to most of the data points.&nbsp;<\/p>\n\n\n\n<p>Let\u2019s try polynomial regression on the above dataset. Let\u2019s fit degree = 2&nbsp;<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/4\/d\/sraNfwTOJwA9SWe5UOk4Wag\/image?w=654&amp;h=226&amp;rev=2&amp;ac=1&amp;parent=1TL-O7MLrz5xFwDNx0YPi5DXaeyn4IIdh\" alt=\"\"\/><\/figure>\n\n\n\n<p>Now, visualize the Polynomial Regression results<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/4\/d\/swEx0X1A-1YKe68ZFUmSxFw\/image?w=655&amp;h=262&amp;rev=2&amp;ac=1&amp;parent=1TL-O7MLrz5xFwDNx0YPi5DXaeyn4IIdh\" alt=\"\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>From the graph, the best fit line looks better than the Simple Linear Regression.&nbsp;<\/p>\n\n\n\n<p>Let\u2019s find out the model performance by calculating mean absolute Error, Mean squared error, and Root mean square.<\/p>\n\n\n\n<p><strong>Simple Linear Regression Model Performance:<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/4\/d\/s27828kdSTc5DWmlElOPFMA\/image?w=681&amp;h=295&amp;rev=2&amp;ac=1&amp;parent=1TL-O7MLrz5xFwDNx0YPi5DXaeyn4IIdh\" alt=\"\"\/><\/figure>\n\n\n\n<p><strong>Polynomial Regression (degree = 2) Model Performance:<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/4\/d\/s9-4HhGzOwldnTupcool9wg\/image?w=681&amp;h=165&amp;rev=1&amp;ac=1&amp;parent=1TL-O7MLrz5xFwDNx0YPi5DXaeyn4IIdh\" alt=\"\"\/><\/figure>\n\n\n\n<p>From the above results, we can see that Error-values are less in 
Polynomial regression, but there is not much improvement. We can increase the polynomial degree and compare the model performance.&nbsp;<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"advanced-linear-regression-with-statsmodels\"><strong>Advanced Linear Regression with statsmodels<\/strong><\/h3>\n\n\n\n<p>There are several ways to perform regression in Python, including:&nbsp;<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>scikit-learn&nbsp;<\/li>\n\n\n\n<li>statsmodels&nbsp;<\/li>\n<\/ol>\n\n\n\n<p>In the MLR in Python section above, we performed MLR using the scikit-learn library. Now, let\u2019s perform MLR using the statsmodels library.<\/p>\n\n\n\n<p>Import the required libraries:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/4\/d\/stHczuVwLG46rm_EekFTnhQ\/image?w=625&amp;h=52&amp;rev=1&amp;ac=1&amp;parent=1TL-O7MLrz5xFwDNx0YPi5DXaeyn4IIdh\" alt=\"\"\/><\/figure>\n\n\n\n<p>Now, perform Multiple Linear Regression using statsmodels<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/4\/d\/sph7Tl4Bgt6toQTS9EExQ-Q\/image?w=670&amp;h=507&amp;rev=2&amp;ac=1&amp;parent=1TL-O7MLrz5xFwDNx0YPi5DXaeyn4IIdh\" alt=\"\"\/><\/figure>\n\n\n\n<p>From the above results, R<sup>2<\/sup> and Adjusted R<sup>2<\/sup> are <span style=\"background-color:#f7cc62\" class=\"td_text_highlight_marker\">0.708<\/span> and <span style=\"background-color:#f7cc62\" class=\"td_text_highlight_marker\">0.704,<\/span> respectively. Together, the independent variables explain almost 71% of the variation in the dependent variable. The value of R<sup>2<\/sup> is the same as the result from the scikit-learn library.&nbsp;<\/p>\n\n\n\n<p>Looking at the p-values, the intercept, horsepower, and weight are significant, since their p-values are less than 0.05 (the significance level). 
We can try refitting the MLR after removing the variables that do not contribute to the model, and then select the best model.<\/p>\n\n\n\n<p>Now, let\u2019s check the model performance by calculating the RMSE value:<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/4\/d\/sXs0iePhIBnwU52ywSkyUxg\/image?w=645&amp;h=259&amp;rev=2&amp;ac=1&amp;parent=1TL-O7MLrz5xFwDNx0YPi5DXaeyn4IIdh\" alt=\"\"\/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"linear-regression-in-r\"><strong>Linear Regression in R<\/strong><\/h2>\n\n\n\n<p><strong><em>Contributed by: Mr. Abhay Poddar <\/em><\/strong><\/p>\n\n\n\n<p>To see an example of Linear Regression in R, we will use cars, which is a built-in dataset in R. Typing cars in the R Console displays the dataset (R is case-sensitive, so the name must be typed in lowercase). We can observe that the dataset has 50 observations and 2 variables, namely speed and dist (the stopping distance). The objective here is to predict the stopping distance of a car when the speed of the car is known. We also want to express the linear relationship between them as an equation. Before getting into modeling, it is always advisable to do an Exploratory Data Analysis, which helps us to understand the data and the variables.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"exploratory-data-analysis\"><strong>Exploratory Data Analysis<\/strong><\/h3>\n\n\n\n<p>Our aim is to build a Linear Regression Model that can help predict distance. 
The following are the basic visualizations that will help us understand more about the data and the variables:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Scatter Plot \u2013 To help establish whether there exists a linear relationship between distance and speed.<\/li>\n\n\n\n<li>Box Plot \u2013 To check whether there are any outliers in the dataset.<\/li>\n\n\n\n<li>Density Plot \u2013 To check the distribution of the variables; ideally, it should be normally distributed.<\/li>\n<\/ol>\n\n\n\n<p>Below are the steps to make these graphs in R.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"scatter-plots-to-visualize-relationship\"><strong>Scatter Plots to visualize Relationship<\/strong><\/h4>\n\n\n\n<p>A Scatter Diagram plots pairs of numerical data with one variable on each axis, and helps establish the relationship between the independent and dependent variables.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"steps-in-r\"><span style=\"text-decoration: underline\">Steps in R<\/span><\/h4>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/0\/d\/sG4hP7CaqVdsHN0yK_rR2mQ\/image?w=673&amp;h=230&amp;rev=1&amp;ac=1&amp;parent=1xa2wHUjVCefGWNRSJebCiXEj7Sf26croQdGgW28dJ3I\" alt=\"\"\/><\/figure>\n\n\n\n<p>If we carefully observe the scatter plot, we can see that the variables are correlated, as the points fall along a line\/curve. The higher the correlation, the nearer the points will be to the line\/curve.&nbsp;<\/p>\n\n\n\n<p><\/p>\n\n\n\n<p>As observed above, the Scatter Plot shows a linear and positive relationship between Distance and Speed. 
Thus, it fulfills one of the assumptions of Linear Regression, i.e., that there should be a linear relationship between the dependent and independent variables (the relationship need not be positive).<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"check-for-outliers-using-boxplots\"><strong>Check for Outliers using Boxplots<\/strong><\/h4>\n\n\n\n<p>A boxplot, also called a box-and-whisker plot, is used in statistics to represent the five-number summary. It helps check whether the distribution is skewed or whether there are any outliers in the dataset.<\/p>\n\n\n\n<p>Wikipedia defines an \u2018outlier\u2019 as an observation point that is distant from other observations in the dataset.<\/p>\n\n\n\n<p>Now, let's plot the Boxplot to check for outliers.<br><\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/0\/d\/s34xGkTVwcRHJ3_RZ5EujWQ\/image?w=673&amp;h=85&amp;rev=1&amp;ac=1&amp;parent=1xa2wHUjVCefGWNRSJebCiXEj7Sf26croQdGgW28dJ3I\" alt=\"\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>After observing the Boxplots for both Speed and Distance, we can say that there are no outliers in Speed, and there seems to be a single outlier in Distance. Since only one point is affected, we proceed without treating outliers.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"checking-distribution-of-data-using-density-plots\"><strong>Checking distribution of Data using Density Plots<\/strong><\/h4>\n\n\n\n<p>One of the key assumptions of Linear Regression is normality (strictly, of the residuals rather than of the raw variables). As a first check, we can inspect the distribution of each variable with the help of Density Plots. 
A Density Plot helps us visualize the distribution of a numeric variable over its range of values.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<p>After looking at the Density Plots, we can conclude that the data set is more or less normally distributed.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"linear-regression-modelling\"><strong>Linear Regression Modelling<\/strong><\/h3>\n\n\n\n<p>Now, let's build the Linear Regression Model. But before that, there is one check we need to perform, which is \u2018Correlation Computation\u2019. The Correlation Coefficient helps us check how strong the relationship between the dependent and independent variables is. The value of the Correlation Coefficient ranges from -1 to 1.<\/p>\n\n\n\n<p>A Correlation of 1 indicates a perfect positive relationship. It means if one variable's value increases, the other variable's value also increases.<\/p>\n\n\n\n<p>A Correlation of -1 indicates a perfect negative relationship. It means if the value of variable x increases, the value of variable y decreases.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/0\/d\/sa6q25RbxY6RYSHmELUhkfw\/image?w=673&amp;h=85&amp;rev=1&amp;ac=1&amp;parent=1xa2wHUjVCefGWNRSJebCiXEj7Sf26croQdGgW28dJ3I\" alt=\"\"\/><\/figure>\n\n\n\n<p>A Correlation of 0 indicates there is no linear relationship between the variables.<br><\/p>\n\n\n\n<p>The output of the above R Code is 0.8068949. 
It shows that the correlation between speed and distance is about 0.8, which is close to 1, indicating a strong positive correlation.<\/p>\n\n\n\n<p>The linear regression model in R is built with the help of the lm() function.<\/p>\n\n\n\n<p>The function takes two main arguments:<\/p>\n\n\n\n<p>Data \u2013 the variable containing the dataset.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/0\/d\/sFZpOI9rPjLOcUDR2oBaBfQ\/image?w=673&amp;h=169&amp;rev=1&amp;ac=1&amp;parent=1xa2wHUjVCefGWNRSJebCiXEj7Sf26croQdGgW28dJ3I\" alt=\"\"\/><\/figure>\n\n\n\n<p>Formula \u2013 an object of class formula, with the response on the left of ~ and the predictors on the right.<br><\/p>\n\n\n\n<p>The results show us the intercept and beta coefficient of the variable speed.<\/p>\n\n\n\n<p>From the output above,<\/p>\n\n\n\n<p>we can write the regression equation as distance = -17.579 + 3.932 (speed).<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"model-diagnostics\"><strong>Model Diagnostics<\/strong><\/h4>\n\n\n\n<p>Building the model and using it for prediction is only half the job. Before using the model, we need to ensure that the model is statistically significant. 
This means checking that:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>There is a statistically significant relationship between the dependent and independent variables.<\/li>\n\n\n\n<li>The model that we built fits the data well.<\/li>\n<\/ol>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/0\/d\/sLB5py0Vo52fQwosF2GnPCg\/image?w=673&amp;h=49&amp;rev=1&amp;ac=1&amp;parent=1xa2wHUjVCefGWNRSJebCiXEj7Sf26croQdGgW28dJ3I\" alt=\"\"\/><\/figure>\n\n\n\n<p>We do this by obtaining a statistical summary of the model using the summary() function in R.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<p>The summary output shows the following:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Call \u2013 The function call used to compute the regression model.<\/li>\n\n\n\n<li>Residuals \u2013 Distribution of residuals, which generally has a mean of 0. Thus, the median should not be far from 0, and the minimum and maximum should be roughly equal in absolute value.<\/li>\n\n\n\n<li>Coefficients \u2013 Shows the regression beta coefficients and their statistical significance.<\/li>\n\n\n\n<li>Residual standard error (RSE), R \u2013 Square, and F \u2013 Statistic \u2013 These are the metrics to check how well the model fits our data.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"detecting-t-statistics-and-p-value\"><strong>Detecting t-statistics and P-Value<\/strong><\/h3>\n\n\n\n<p>The t-statistic and its associated p-value are very important metrics when checking model fit.<\/p>\n\n\n\n<p>The t-statistic tests whether there is a statistically significant relationship between the independent and dependent variables, that is, whether the beta coefficient of the independent variable is significantly different from 0. So, the larger the absolute t-value, the better.<\/p>\n\n\n\n<p>Whenever there is a p-value, there is always a null as well as an alternate hypothesis associated with it. 
The p-value helps us test the null hypothesis, i.e., that the coefficients are equal to 0. A low p-value means we can reject the null hypothesis.<\/p>\n\n\n\n<p>The statistical hypotheses are as follows:<\/p>\n\n\n\n<p>Null Hypothesis (H0) \u2013 Coefficients are equal to zero.<\/p>\n\n\n\n<p>Alternate Hypothesis (H1) \u2013 Coefficients are not equal to zero.<\/p>\n\n\n\n<p>As discussed earlier, when the p-value &lt; 0.05, we can reject the null hypothesis at the 5% significance level.<\/p>\n\n\n\n<p>In our case, since the p-value is less than 0.05, we can reject the null hypothesis and conclude that the model is statistically significant. This means there is a significant association between the independent and dependent variables.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"r-squared-and-adjusted-r-squared\"><strong>R \u2013 Squared and Adjusted R - Squared<\/strong><\/h3>\n\n\n\n<p>R \u2013 Squared (R<sup>2<\/sup>) is a basic metric which tells us how much variance has been explained by the model. It ranges from 0 to 1. In Linear Regression, if we keep adding new variables, the value of R \u2013 Square will never decrease, irrespective of whether the variable is significant. This is where Adjusted R \u2013 Square comes to help. Adjusted R \u2013 Square penalizes the number of predictors in the model, so it increases only when a new variable improves the fit more than would be expected by chance. 
So, while performing Linear Regression, it is always preferable to look at Adjusted R \u2013 Square rather than just R \u2013 Square.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>An Adjusted R \u2013 Square value close to 1 indicates that the regression model has explained a large proportion of variability.<\/li>\n\n\n\n<li>A number close to 0 indicates that the regression model explained little of the variability.<\/li>\n<\/ol>\n\n\n\n<p>In our output, the Adjusted R \u2013 Square value is 0.6438, which is closer to 1 than to 0, indicating that our model explains a substantial share of the variability.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"aic-and-bic\"><strong>AIC and BIC<\/strong><\/h3>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/0\/d\/srQe0xr6Fd7LFPkIBzBbxbQ\/image?w=673&amp;h=157&amp;rev=1&amp;ac=1&amp;parent=1xa2wHUjVCefGWNRSJebCiXEj7Sf26croQdGgW28dJ3I\" alt=\"\"\/><\/figure>\n\n\n\n<p>AIC and BIC are widely used metrics for model selection. AIC stands for Akaike Information Criterion, and BIC stands for Bayesian Information Criterion. These help us compare the goodness of fit of candidate models while penalizing model complexity. 
When comparing models, the model with the lowest AIC and BIC is preferred.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"which-regression-model-is-the-best-fit-for-the-data\"><strong>Which Regression Model is the best fit for the data?<\/strong><\/h3>\n\n\n\n<p>There are a number of metrics that help us decide the best fit model for our data, but the most widely used are given below:<\/p>\n\n\n\n<figure class=\"wp-block-table is-style-stripes\"><table><tbody><tr><td><strong>Statistic<\/strong><\/td><td><strong>Criterion<\/strong><\/td><\/tr><tr><td>R - Squared<\/td><td>Higher the better<\/td><\/tr><tr><td>Adjusted R - Squared<\/td><td>Higher the better<\/td><\/tr><tr><td>t-statistic<\/td><td>The higher the absolute t-value, the lower the p-value<\/td><\/tr><tr><td>F-statistic<\/td><td>Higher the better<\/td><\/tr><tr><td>AIC<\/td><td>Lower the better<\/td><\/tr><tr><td>BIC<\/td><td>Lower the better<\/td><\/tr><tr><td>Mean Squared Error (MSE)<\/td><td>Lower the better<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"predicting-linear-models\"><strong>Predicting Linear Models<\/strong><\/h3>\n\n\n\n<p>Now we know how to build a Linear Regression Model in R using the full dataset. But this approach does not tell us how well the model will perform and fit new data.<\/p>\n\n\n\n<p>To address this, a common practice in the industry is to split the data into Train and Test datasets in the ratio of 80:20 (Train 80% and Test 20%). 
With the help of this method, we can now predict the values for the test dataset and compare them with the actual values.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"splitting-the-data\"><strong>Splitting the Data<\/strong><\/h3>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/0\/d\/sOdXW6MwMrMIinmM65YjLeA\/image?w=673&amp;h=97&amp;rev=1&amp;ac=1&amp;parent=1xa2wHUjVCefGWNRSJebCiXEj7Sf26croQdGgW28dJ3I\" alt=\"\"\/><\/figure>\n\n\n\n<p>We do this with the help of the sample() function in R.&nbsp;<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/0\/d\/sbK-22xfyHSjQFSigOcXFOQ\/image?w=673&amp;h=109&amp;rev=1&amp;ac=1&amp;parent=1xa2wHUjVCefGWNRSJebCiXEj7Sf26croQdGgW28dJ3I\" alt=\"\"\/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/0\/d\/s_xGkGAaJCBvuGzsJtnrK7w\/image?w=673&amp;h=85&amp;rev=1&amp;ac=1&amp;parent=1xa2wHUjVCefGWNRSJebCiXEj7Sf26croQdGgW28dJ3I\" alt=\"\"\/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"building-the-model-on-train-data-and-predict-on-test-data\"><strong>Building the model on Train Data and Predict on Test Data<\/strong><\/h3>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/0\/d\/s9WUyw-WByyOvsiT_0SaGog\/image?w=673&amp;h=433&amp;rev=1&amp;ac=1&amp;parent=1xa2wHUjVCefGWNRSJebCiXEj7Sf26croQdGgW28dJ3I\" alt=\"\"\/><\/figure>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"model-diagnostics\"><strong>Model Diagnostics<\/strong><\/h4>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/0\/d\/sCsTs10yxwfxfNRlfiI0pCw\/image?w=673&amp;h=157&amp;rev=1&amp;ac=1&amp;parent=1xa2wHUjVCefGWNRSJebCiXEj7Sf26croQdGgW28dJ3I\" alt=\"\"\/><\/figure>\n\n\n\n<p>If we look at the p-value, since it is less than 0.05, we can 
conclude that the model is significant. Also, the Adjusted R \u2013 Squared value is close to the one obtained on the full dataset, which supports the validity of the model.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"k-fold-cross-validation\"><strong>K \u2013 Fold Cross-Validation<\/strong><\/h3>\n\n\n\n<p>Now, we have seen that the model performs well on the test dataset as well. But this does not guarantee that the model will be a good fit in the future. The reason is that a few data points in the dataset might not be representative of the whole population. Thus, we need to check the model performance as thoroughly as possible. One way to do this is to check whether the model performs well on several different train and test splits of the data. This can be done with the help of K \u2013 Fold Cross-validation.&nbsp;<\/p>\n\n\n\n<p>The procedure of K \u2013 Fold Cross-validation is given below:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>The random shuffling of the dataset.<\/li>\n\n\n\n<li>Splitting of data into k folds\/sections\/groups.<\/li>\n\n\n\n<li>For each fold\/section\/group:<\/li>\n<\/ol>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Make the fold\/section\/group the test data.<\/li>\n\n\n\n<li>Take the remaining data as the train data.<\/li>\n\n\n\n<li>Fit the model on the train data and evaluate it on the test data.<\/li>\n\n\n\n<li>Keep the evaluation score and discard the model.<\/li>\n<\/ol>\n\n\n\n<p><\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/0\/d\/svnnosuvelLG3RaP5k3mnuw\/image?w=673&amp;h=313&amp;rev=1&amp;ac=1&amp;parent=1xa2wHUjVCefGWNRSJebCiXEj7Sf26croQdGgW28dJ3I\" alt=\"\"\/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/docs.google.com\/drawings\/u\/0\/d\/sfvS5oo5kY5RFuYnX2p9guw\/image?w=673&amp;h=373&amp;rev=1&amp;ac=1&amp;parent=1xa2wHUjVCefGWNRSJebCiXEj7Sf26croQdGgW28dJ3I\" alt=\"\"\/><\/figure>\n\n\n\n<p>After 
performing the K \u2013 Fold Cross-validation, we can observe that the R \u2013 Square value is close to that of the original data and that the MAE is 12%, which helps us conclude that the model is a good fit.<br><\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"advantages-of-using-linear-regression\"><strong>Advantages of Using Linear Regression<\/strong><\/h2>\n\n\n\n<ol class=\"wp-block-list\">\n<li>The Linear Regression method is very easy to use. If the relationship between the variables (independent and dependent) is known, we can easily implement the regression method accordingly (Linear Regression for linear relationship).<\/li>\n\n\n\n<li>Linear Regression provides the significance level of each attribute contributing to the prediction of the dependent variable. With this information, we can identify the variables that contribute most to the model.&nbsp;<\/li>\n\n\n\n<li>After performing linear regression, we get the best fit line, which we can use for prediction according to the business requirement.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"limitations-of-linear-regression\"><strong>Limitations of Linear Regression<\/strong><\/h3>\n\n\n\n<p>The main limitation of linear regression is that it performs poorly when the relationship is nonlinear. Linear regression is also sensitive to the presence of outliers in the dataset. 
High correlation among the independent variables (multicollinearity) can also degrade the linear regression model, for example by making the coefficient estimates unstable.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"linear-regression-examples\"><strong>Linear Regression Examples<\/strong><\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Linear Regression can be used for product sales prediction to optimize inventory management.<\/li>\n\n\n\n<li>It can be used in the Insurance domain, for example, to predict the insurance premium based on various features.<\/li>\n\n\n\n<li>Modeling daily website click counts with linear regression could help in optimizing website efficiency.<\/li>\n\n\n\n<li>Feature selection is one of the applications of Linear Regression. <\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"linear-regression-learning-the-model\"><b>Linear Regression - Learning the Model<\/b><\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><b>Simple Linear Regression<\/b><\/li>\n<\/ul>\n\n\n\n<p><span style=\"font-weight: 400\">With simple linear regression, when we have a single input, we can use statistics to estimate the coefficients.<\/span><br><span style=\"font-weight: 400\">This requires that you calculate statistical properties from the data, such as mean, standard deviation, correlation, and covariance. All of the data must be available to traverse and calculate statistics.<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><b>Ordinary Least Squares<\/b><\/li>\n<\/ul>\n\n\n\n<p><span style=\"font-weight: 400\">When we have more than one input, we can use Ordinary Least Squares to estimate the values of the coefficients.<\/span><br><span style=\"font-weight: 400\">The Ordinary Least Squares procedure seeks to minimize the sum of the squared residuals. This means that given a regression line through the data, we calculate the vertical distance from each data point to the regression line, square it, and sum all of the squared errors together. 
This is the quantity that ordinary least squares seeks to minimize.<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><b>Gradient Descent<\/b><\/li>\n<\/ul>\n\n\n\n<p><span style=\"font-weight: 400\">When there are one or more inputs, the coefficient values can also be optimized by iteratively minimizing the error of the model on the training data. This operation is called Gradient Descent and works by starting with random values for each coefficient. The sum of the squared errors is calculated for each pair of input and output values. A learning rate is used as a scale factor, and the coefficients are updated in the direction of minimizing the error. The process is repeated until a minimum sum of squared errors is achieved or no further improvement is possible.<\/span><br><span style=\"font-weight: 400\">When using this method, you must select a learning rate (alpha) parameter that determines the size of the improvement step to take on each iteration of the procedure.<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><b>Regularization<\/b><\/li>\n<\/ul>\n\n\n\n<p><span style=\"font-weight: 400\">There are extensions to the training of the linear model called regularization methods. 
These seek both to minimize the sum of the squared errors of the model on the training data (using ordinary least squares) and to reduce the complexity of the model (such as the number of coefficients or the overall size of the coefficients in the model).<\/span><br><span style=\"font-weight: 400\">Two popular examples of regularization procedures for linear regression are:<\/span><br><b>- Lasso Regression<\/b><span style=\"font-weight: 400\">: where Ordinary Least Squares is modified to also minimize the sum of the absolute values of the coefficients (called L1 regularization).<\/span><br><b>- Ridge Regression<\/b><span style=\"font-weight: 400\">: where Ordinary Least Squares is modified to also minimize the sum of the squared coefficients (called L2 regularization).<\/span><\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"preparing-data-for-linear-regression\"><b>Preparing Data for Linear Regression<\/b><\/h2>\n\n\n\n<p><span style=\"font-weight: 400\">Linear regression has been studied at great length, and there is a lot of literature on how your data must be structured to best use the model. 
In practice, you can treat these rules as rules of thumb when using Ordinary Least Squares Regression, the most common implementation of linear regression.<\/span><\/p>\n\n\n\n<p><span style=\"font-weight: 400\">Try different preparations of your data using these heuristics and see what works best for your problem.<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><b>Linear Assumption<\/b><\/li>\n\n\n\n<li><span style=\"font-weight: bold\">Noise Removal<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: bold\">Remove Collinearity<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: bold\">Gaussian Distributions<\/span><\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"summary\"><b>Summary<\/b><\/h3>\n\n\n\n<p><span style=\"font-weight: 400\">In this post, you discovered the linear regression algorithm for machine learning.<\/span><br><span style=\"font-weight: 400\">You covered a lot of ground, including:<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><b>The common names used when describing linear regression models.<\/b><\/li>\n\n\n\n<li><span style=\"font-weight: bold\">The representation used by the model.<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: bold\">Learning algorithms used to estimate the coefficients in the model.<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: bold\">Rules of thumb to consider when preparing data for use with linear regression.<\/span>&nbsp;<\/li>\n<\/ul>\n\n\n\n<p><span style=\"font-weight: 400\">Try out linear regression and get comfortable with it. 
If you are planning a career in Machine Learning, here are some&nbsp;<\/span><a aria-label=\" (opens in a new tab)\" href=\"https:\/\/www.mygreatlearning.com\/blog\/machine-learning-resume\/\" target=\"_blank\" rel=\"noreferrer noopener\">Must-Haves On Your Resume&nbsp;<\/a>and the <a aria-label=\"most common interview questions (opens in a new tab)\" href=\"https:\/\/www.mygreatlearning.com\/blog\/machine-learning-interview-questions\/\" target=\"_blank\" rel=\"noreferrer noopener\">most common interview questions<\/a> to prepare for.<\/p>\n\n\n\n<div style=\"background-color: #efefef;border: 1px solid #000;padding: 8px\"><p><b>Find a Machine Learning Course in Top Indian Cities<\/b><\/p> \n    <a href=\"https:\/\/www.mygreatlearning.com\/pg-program-machine-learning-course-in-chennai\" title=\" Machine Learning Course in Chennai\">Chennai<\/a> | \n    <a href=\"https:\/\/www.mygreatlearning.com\/pg-program-machine-learning-course-in-bangalore\" title=\" Machine Learning Course in Bangalore\">Bangalore<\/a> | \n    <a href=\"https:\/\/www.mygreatlearning.com\/pg-program-machine-learning-course-in-hyderabad\" title=\" Machine Learning Course in Hyderabad\">Hyderabad<\/a> | \n    <a href=\"https:\/\/www.mygreatlearning.com\/pg-program-machine-learning-course-in-pune\" title=\" Machine Learning Course in Pune\">Pune<\/a> | \n    <a href=\"https:\/\/www.mygreatlearning.com\/pg-program-machine-learning-course-in-mumbai\" title=\" Machine Learning Course in Mumbai\">Mumbai<\/a> | \n    <a href=\"https:\/\/www.mygreatlearning.com\/pg-program-machine-learning-course-in-delhi-ncr\" title=\" Machine Learning Course in Delhi NCR\">Delhi NCR<\/a><\/div>\n\n\n\n<p><\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"our-machine-learning-courses\">Our Machine Learning Courses<\/h2>\n\n\n\n<p>Explore our Machine Learning and AI courses, designed for comprehensive learning and skill development.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table 
class=\"has-fixed-layout\"><thead><tr><th><strong>Program Name<\/strong><\/th><th><strong>Duration<\/strong><\/th><\/tr><tr><th><a href=\"https:\/\/professionalonline2.mit.edu\/no-code-artificial-intelligence-machine-learning-program\">MIT No code AI and Machine Learning Course<\/a><\/th><th>12 Weeks<\/th><\/tr><tr><th><a href=\"https:\/\/idss-gl.mit.edu\/mit-idss-data-science-machine-learning-online-program\">MIT Data Science and Machine Learning Course<\/a><\/th><th>12 Weeks<\/th><\/tr><tr><th><a href=\"https:\/\/www.mygreatlearning.com\/mit-data-science-and-machine-learning-program\">Data Science and Machine Learning Course<\/a><\/th><th>12 Weeks<\/th><\/tr><\/thead><\/table><\/figure>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>What is Linear Regression? Linear Regression is the basic form of regression analysis. It assumes that there is a linear relationship between the dependent variable and the predictor(s). In regression, we try to calculate the best fit line, which describes the relationship between the predictors and predictive\/dependent variables. 
There are four assumptions associated with a [&hellip;]<\/p>\n","protected":false},"author":41,"featured_media":6059,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_uag_custom_page_level_css":"","site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","ast-disable-related-posts":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"set","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"categories":[2],"tags":[],"content_type":[],"class_list":["post-6057","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-artificial-intelligence"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v27.3 (Yoast SEO v27.3) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>A Guide to Linear Regression in Machine Learning<\/title>\n<meta name=\"description\" content=\"Linear Regression Machine Learning: Let&#039;s know the when and why do we use, Definition, Advantages &amp; Disadvantages, Examples and Models Etc.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" 
href=\"https:\/\/www.mygreatlearning.com\/blog\/linear-regression-in-machine-learning\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"A Guide to Linear Regression in Machine Learning\" \/>\n<meta property=\"og:description\" content=\"Linear Regression Machine Learning: Let&#039;s know the when and why do we use, Definition, Advantages &amp; Disadvantages, Examples and Models Etc.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.mygreatlearning.com\/blog\/linear-regression-in-machine-learning\/\" \/>\n<meta property=\"og:site_name\" content=\"Great Learning Blog: Free Resources what Matters to shape your Career!\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/GreatLearningOfficial\/\" \/>\n<meta property=\"article:published_time\" content=\"2023-01-10T06:36:03+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-09-02T09:51:57+00:00\" \/>\n<meta property=\"og:image\" content=\"http:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/08\/Linear-Regression-for-Beginners-1.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1000\" \/>\n\t<meta property=\"og:image:height\" content=\"667\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Great Learning Editorial Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@https:\/\/twitter.com\/Great_Learning\" \/>\n<meta name=\"twitter:site\" content=\"@Great_Learning\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Great Learning Editorial Team\" \/>\n<script type=\"application\/ld+json\" 
class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/linear-regression-in-machine-learning\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/linear-regression-in-machine-learning\\\/\"},\"author\":{\"name\":\"Great Learning Editorial Team\",\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/#\\\/schema\\\/person\\\/6f993d1be4c584a335951e836f2656ad\"},\"headline\":\"A Guide to Linear Regression in Machine Learning\",\"datePublished\":\"2023-01-10T06:36:03+00:00\",\"dateModified\":\"2024-09-02T09:51:57+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/linear-regression-in-machine-learning\\\/\"},\"wordCount\":5106,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/linear-regression-in-machine-learning\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/wp-content\\\/uploads\\\/2019\\\/08\\\/Linear-Regression-for-Beginners-1.jpg\",\"articleSection\":[\"AI and Machine Learning\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/linear-regression-in-machine-learning\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/linear-regression-in-machine-learning\\\/\",\"url\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/linear-regression-in-machine-learning\\\/\",\"name\":\"A Guide to Linear Regression in Machine 
Learning\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/linear-regression-in-machine-learning\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/linear-regression-in-machine-learning\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/wp-content\\\/uploads\\\/2019\\\/08\\\/Linear-Regression-for-Beginners-1.jpg\",\"datePublished\":\"2023-01-10T06:36:03+00:00\",\"dateModified\":\"2024-09-02T09:51:57+00:00\",\"description\":\"Linear Regression Machine Learning: Let's know the when and why do we use, Definition, Advantages & Disadvantages, Examples and Models Etc.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/linear-regression-in-machine-learning\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/linear-regression-in-machine-learning\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/linear-regression-in-machine-learning\\\/#primaryimage\",\"url\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/wp-content\\\/uploads\\\/2019\\\/08\\\/Linear-Regression-for-Beginners-1.jpg\",\"contentUrl\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/wp-content\\\/uploads\\\/2019\\\/08\\\/Linear-Regression-for-Beginners-1.jpg\",\"width\":1000,\"height\":667,\"caption\":\"Linear Regression for Beginners\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/linear-regression-in-machine-learning\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Blog\",\"item\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"AI and Machine 
Learning\",\"item\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/artificial-intelligence\\\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"A Guide to Linear Regression in Machine Learning\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/#website\",\"url\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/\",\"name\":\"Great Learning Blog\",\"description\":\"Learn, Upskill &amp; Career Development Guide and Resources\",\"publisher\":{\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/#organization\"},\"alternateName\":\"Great Learning\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/#organization\",\"name\":\"Great Learning\",\"url\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/wp-content\\\/uploads\\\/2022\\\/06\\\/GL-Logo.jpg\",\"contentUrl\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/wp-content\\\/uploads\\\/2022\\\/06\\\/GL-Logo.jpg\",\"width\":900,\"height\":900,\"caption\":\"Great 
Learning\"},\"image\":{\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/www.facebook.com\\\/GreatLearningOfficial\\\/\",\"https:\\\/\\\/x.com\\\/Great_Learning\",\"https:\\\/\\\/www.instagram.com\\\/greatlearningofficial\\\/\",\"https:\\\/\\\/www.linkedin.com\\\/school\\\/great-learning\\\/\",\"https:\\\/\\\/in.pinterest.com\\\/greatlearning12\\\/\",\"https:\\\/\\\/www.youtube.com\\\/user\\\/beaconelearning\\\/\"],\"description\":\"Great Learning is a leading global ed-tech company for professional training and higher education. It offers comprehensive, industry-relevant, hands-on learning programs across various business, technology, and interdisciplinary domains driving the digital economy. These programs are developed and offered in collaboration with the world's foremost academic institutions.\",\"email\":\"info@mygreatlearning.com\",\"legalName\":\"Great Learning Education Services Pvt. Ltd\",\"foundingDate\":\"2013-11-29\",\"numberOfEmployees\":{\"@type\":\"QuantitativeValue\",\"minValue\":\"1001\",\"maxValue\":\"5000\"}},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/#\\\/schema\\\/person\\\/6f993d1be4c584a335951e836f2656ad\",\"name\":\"Great Learning Editorial Team\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/wp-content\\\/uploads\\\/2022\\\/02\\\/unnamed.webp\",\"url\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/wp-content\\\/uploads\\\/2022\\\/02\\\/unnamed.webp\",\"contentUrl\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/wp-content\\\/uploads\\\/2022\\\/02\\\/unnamed.webp\",\"caption\":\"Great Learning Editorial Team\"},\"description\":\"The Great Learning Editorial Staff includes a dynamic team of subject matter experts, instructors, and education professionals who combine their deep industry knowledge with innovative teaching methods. 
Their mission is to provide learners with the skills and insights needed to excel in their careers, whether through upskilling, reskilling, or transitioning into new fields.\",\"sameAs\":[\"https:\\\/\\\/www.mygreatlearning.com\\\/\",\"https:\\\/\\\/in.linkedin.com\\\/school\\\/great-learning\\\/\",\"https:\\\/\\\/x.com\\\/https:\\\/\\\/twitter.com\\\/Great_Learning\",\"https:\\\/\\\/www.youtube.com\\\/channel\\\/UCObs0kLIrDjX2LLSybqNaEA\"],\"award\":[\"Best EdTech Company of the Year 2024\",\"Education Economictimes Outstanding Education\\\/Edtech Solution Provider of the Year 2024\",\"Leading E-learning Platform 2024\"],\"url\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/author\\\/greatlearning\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"A Guide to Linear Regression in Machine Learning","description":"Linear Regression Machine Learning: Let's know the when and why do we use, Definition, Advantages & Disadvantages, Examples and Models Etc.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.mygreatlearning.com\/blog\/linear-regression-in-machine-learning\/","og_locale":"en_US","og_type":"article","og_title":"A Guide to Linear Regression in Machine Learning","og_description":"Linear Regression Machine Learning: Let's know the when and why do we use, Definition, Advantages & Disadvantages, Examples and Models Etc.","og_url":"https:\/\/www.mygreatlearning.com\/blog\/linear-regression-in-machine-learning\/","og_site_name":"Great Learning Blog: Free Resources what Matters to shape your 
Career!","article_publisher":"https:\/\/www.facebook.com\/GreatLearningOfficial\/","article_published_time":"2023-01-10T06:36:03+00:00","article_modified_time":"2024-09-02T09:51:57+00:00","og_image":[{"width":1000,"height":667,"url":"http:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/08\/Linear-Regression-for-Beginners-1.jpg","type":"image\/jpeg"}],"author":"Great Learning Editorial Team","twitter_card":"summary_large_image","twitter_creator":"@https:\/\/twitter.com\/Great_Learning","twitter_site":"@Great_Learning","twitter_misc":{"Written by":"Great Learning Editorial Team"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.mygreatlearning.com\/blog\/linear-regression-in-machine-learning\/#article","isPartOf":{"@id":"https:\/\/www.mygreatlearning.com\/blog\/linear-regression-in-machine-learning\/"},"author":{"name":"Great Learning Editorial Team","@id":"https:\/\/www.mygreatlearning.com\/blog\/#\/schema\/person\/6f993d1be4c584a335951e836f2656ad"},"headline":"A Guide to Linear Regression in Machine Learning","datePublished":"2023-01-10T06:36:03+00:00","dateModified":"2024-09-02T09:51:57+00:00","mainEntityOfPage":{"@id":"https:\/\/www.mygreatlearning.com\/blog\/linear-regression-in-machine-learning\/"},"wordCount":5106,"commentCount":0,"publisher":{"@id":"https:\/\/www.mygreatlearning.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.mygreatlearning.com\/blog\/linear-regression-in-machine-learning\/#primaryimage"},"thumbnailUrl":"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/08\/Linear-Regression-for-Beginners-1.jpg","articleSection":["AI and Machine 
Learning"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.mygreatlearning.com\/blog\/linear-regression-in-machine-learning\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.mygreatlearning.com\/blog\/linear-regression-in-machine-learning\/","url":"https:\/\/www.mygreatlearning.com\/blog\/linear-regression-in-machine-learning\/","name":"A Guide to Linear Regression in Machine Learning","isPartOf":{"@id":"https:\/\/www.mygreatlearning.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.mygreatlearning.com\/blog\/linear-regression-in-machine-learning\/#primaryimage"},"image":{"@id":"https:\/\/www.mygreatlearning.com\/blog\/linear-regression-in-machine-learning\/#primaryimage"},"thumbnailUrl":"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/08\/Linear-Regression-for-Beginners-1.jpg","datePublished":"2023-01-10T06:36:03+00:00","dateModified":"2024-09-02T09:51:57+00:00","description":"Linear Regression Machine Learning: Let's know the when and why do we use, Definition, Advantages & Disadvantages, Examples and Models Etc.","breadcrumb":{"@id":"https:\/\/www.mygreatlearning.com\/blog\/linear-regression-in-machine-learning\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.mygreatlearning.com\/blog\/linear-regression-in-machine-learning\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.mygreatlearning.com\/blog\/linear-regression-in-machine-learning\/#primaryimage","url":"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/08\/Linear-Regression-for-Beginners-1.jpg","contentUrl":"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/08\/Linear-Regression-for-Beginners-1.jpg","width":1000,"height":667,"caption":"Linear Regression for 
Beginners"},{"@type":"BreadcrumbList","@id":"https:\/\/www.mygreatlearning.com\/blog\/linear-regression-in-machine-learning\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Blog","item":"https:\/\/www.mygreatlearning.com\/blog\/"},{"@type":"ListItem","position":2,"name":"AI and Machine Learning","item":"https:\/\/www.mygreatlearning.com\/blog\/artificial-intelligence\/"},{"@type":"ListItem","position":3,"name":"A Guide to Linear Regression in Machine Learning"}]},{"@type":"WebSite","@id":"https:\/\/www.mygreatlearning.com\/blog\/#website","url":"https:\/\/www.mygreatlearning.com\/blog\/","name":"Great Learning Blog","description":"Learn, Upskill &amp; Career Development Guide and Resources","publisher":{"@id":"https:\/\/www.mygreatlearning.com\/blog\/#organization"},"alternateName":"Great Learning","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.mygreatlearning.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.mygreatlearning.com\/blog\/#organization","name":"Great Learning","url":"https:\/\/www.mygreatlearning.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.mygreatlearning.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2022\/06\/GL-Logo.jpg","contentUrl":"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2022\/06\/GL-Logo.jpg","width":900,"height":900,"caption":"Great 
Learning"},"image":{"@id":"https:\/\/www.mygreatlearning.com\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/GreatLearningOfficial\/","https:\/\/x.com\/Great_Learning","https:\/\/www.instagram.com\/greatlearningofficial\/","https:\/\/www.linkedin.com\/school\/great-learning\/","https:\/\/in.pinterest.com\/greatlearning12\/","https:\/\/www.youtube.com\/user\/beaconelearning\/"],"description":"Great Learning is a leading global ed-tech company for professional training and higher education. It offers comprehensive, industry-relevant, hands-on learning programs across various business, technology, and interdisciplinary domains driving the digital economy. These programs are developed and offered in collaboration with the world's foremost academic institutions.","email":"info@mygreatlearning.com","legalName":"Great Learning Education Services Pvt. Ltd","foundingDate":"2013-11-29","numberOfEmployees":{"@type":"QuantitativeValue","minValue":"1001","maxValue":"5000"}},{"@type":"Person","@id":"https:\/\/www.mygreatlearning.com\/blog\/#\/schema\/person\/6f993d1be4c584a335951e836f2656ad","name":"Great Learning Editorial Team","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2022\/02\/unnamed.webp","url":"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2022\/02\/unnamed.webp","contentUrl":"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2022\/02\/unnamed.webp","caption":"Great Learning Editorial Team"},"description":"The Great Learning Editorial Staff includes a dynamic team of subject matter experts, instructors, and education professionals who combine their deep industry knowledge with innovative teaching methods. 
Their mission is to provide learners with the skills and insights needed to excel in their careers, whether through upskilling, reskilling, or transitioning into new fields.","sameAs":["https:\/\/www.mygreatlearning.com\/","https:\/\/in.linkedin.com\/school\/great-learning\/","https:\/\/x.com\/https:\/\/twitter.com\/Great_Learning","https:\/\/www.youtube.com\/channel\/UCObs0kLIrDjX2LLSybqNaEA"],"award":["Best EdTech Company of the Year 2024","Education Economictimes Outstanding Education\/Edtech Solution Provider of the Year 2024","Leading E-learning Platform 2024"],"url":"https:\/\/www.mygreatlearning.com\/blog\/author\/greatlearning\/"}]}},"uagb_featured_image_src":{"full":["https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/08\/Linear-Regression-for-Beginners-1.jpg",1000,667,false],"thumbnail":["https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/08\/Linear-Regression-for-Beginners-1-150x150.jpg",150,150,true],"medium":["https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/08\/Linear-Regression-for-Beginners-1-300x200.jpg",300,200,true],"medium_large":["https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/08\/Linear-Regression-for-Beginners-1-768x512.jpg",768,512,true],"large":["https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/08\/Linear-Regression-for-Beginners-1.jpg",1000,667,false],"1536x1536":["https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/08\/Linear-Regression-for-Beginners-1.jpg",1000,667,false],"2048x2048":["https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/08\/Linear-Regression-for-Beginners-1.jpg",1000,667,false],"web-stories-poster-portrait":["https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/08\/Linear-Regression-for-Beginners-1.jpg",640,427,false],"web-stories-publisher-logo":["https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/08\/Linear-Regression-for-Beginners-1.jpg",96,64,false],"web-stories-t
humbnail":["https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/08\/Linear-Regression-for-Beginners-1.jpg",150,100,false]},"uagb_author_info":{"display_name":"Great Learning Editorial Team","author_link":"https:\/\/www.mygreatlearning.com\/blog\/author\/greatlearning\/"},"uagb_comment_info":0,"uagb_excerpt":"What is Linear Regression? Linear Regression is the basic form of regression analysis. It assumes that there is a linear relationship between the dependent variable and the predictor(s). In regression, we try to calculate the best fit line, which describes the relationship between the predictors and predictive\/dependent variables. There are four assumptions associated with a&hellip;","_links":{"self":[{"href":"https:\/\/www.mygreatlearning.com\/blog\/wp-json\/wp\/v2\/posts\/6057","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.mygreatlearning.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.mygreatlearning.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.mygreatlearning.com\/blog\/wp-json\/wp\/v2\/users\/41"}],"replies":[{"embeddable":true,"href":"https:\/\/www.mygreatlearning.com\/blog\/wp-json\/wp\/v2\/comments?post=6057"}],"version-history":[{"count":69,"href":"https:\/\/www.mygreatlearning.com\/blog\/wp-json\/wp\/v2\/posts\/6057\/revisions"}],"predecessor-version":[{"id":106648,"href":"https:\/\/www.mygreatlearning.com\/blog\/wp-json\/wp\/v2\/posts\/6057\/revisions\/106648"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.mygreatlearning.com\/blog\/wp-json\/wp\/v2\/media\/6059"}],"wp:attachment":[{"href":"https:\/\/www.mygreatlearning.com\/blog\/wp-json\/wp\/v2\/media?parent=6057"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.mygreatlearning.com\/blog\/wp-json\/wp\/v2\/categories?post=6057"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.mygreatlearning.com\/blog\/wp-json\/wp\/v2\/tags?post=6057"
},{"taxonomy":"content_type","embeddable":true,"href":"https:\/\/www.mygreatlearning.com\/blog\/wp-json\/wp\/v2\/content_type?post=6057"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}