{"id":32485,"date":"2021-04-26T18:07:37","date_gmt":"2021-04-26T12:37:37","guid":{"rendered":"https:\/\/www.mygreatlearning.com\/blog\/introduction-to-stochastic-gradient-descent\/"},"modified":"2024-09-02T15:28:27","modified_gmt":"2024-09-02T09:58:27","slug":"introduction-to-stochastic-gradient-descent","status":"publish","type":"post","link":"https:\/\/www.mygreatlearning.com\/blog\/introduction-to-stochastic-gradient-descent\/","title":{"rendered":"Introduction to Stochastic Gradient Descent"},"content":{"rendered":"\n<p>Stochastic: \u201cProcess involving a randomly determined sequence of observations, each of which is considered as a sample of one element from a probability distribution.\u201d&nbsp; Or, in simple terms, \u201cRandom selection.\u201d<\/p>\n\n\n\n<p><em><strong>Contributed by: <a href=\"https:\/\/www.linkedin.com\/in\/sarveshwaran-rajagopal-5107b718\/\" target=\"_blank\" rel=\"noreferrer noopener\">Sarveshwaran<\/a><\/strong><\/em><\/p>\n\n\n\n<p><strong>Points discussed:<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Optimization<\/li>\n\n\n\n<li>Optimization Function<\/li>\n\n\n\n<li>Loss Function<\/li>\n\n\n\n<li>Derivative<\/li>\n\n\n\n<li>Optimization Methods<\/li>\n\n\n\n<li>Gradient Descent<\/li>\n\n\n\n<li>Stochastic Gradient Descent<\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"what-is-the-need-of-optimization\"><strong>What is the need for Optimization?<\/strong><\/h2>\n\n\n\n<p>Any learning algorithm has the objective of reducing the error, and this reduction in error is achieved by optimization techniques.<\/p>\n\n\n\n<p>Error: Cross-Entropy Loss in Logistic Regression, Sum of Squared Errors in Linear Regression<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"what-is-optimization-function\"><strong>What is an Optimization function?<\/strong><\/h2>\n\n\n\n<p>Optimization is a mathematical technique of either minimizing or maximizing some function <strong><em>f(x)<\/em><\/strong> by altering x. 
In many real-world scenarios, we will be minimizing f(x). How, then, do we maximize f(x)? Simply by minimizing -f(x). Any function <strong><em>f(x)<\/em><\/strong> that we want to either minimize or maximize is called the <strong>objective function or criterion.<\/strong><\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"what-is-a-loss-function\"><strong>What is a Loss function?<\/strong><\/h2>\n\n\n\n<p>Whenever we minimize our objective function 
<strong><em>f(x)<\/em><\/strong>, we call it a loss function. The loss function goes by various other names, such as the <strong><em>cost function or error function.&nbsp;<\/em><\/strong><\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"derivative\"><strong>Derivative:<\/strong><\/h4>\n\n\n\n<p>Let\u2019s take an example of a function y = f(x), where both x and y are real numbers. The derivative of this function is denoted as f\u2019(x) or as dy\/dx. The derivative f\u2019(x) gives the slope of f(x) at the point x. In other words, it tells us how a small change in the input corresponds to a change in the output. The derivative is useful for minimizing the loss because it tells us how to change x to reduce the error, i.e., to make a small improvement in y.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"optimization-methods\"><strong>Optimization Methods:<\/strong><\/h2>\n\n\n<figure class=\"wp-block-image size-large zoomable\" data-full=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-16.png\"><img decoding=\"async\" width=\"696\" height=\"106\" src=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-16.png\" alt=\"\" class=\"wp-image-32486\" srcset=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-16.png 696w, https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-16-300x46.png 300w, https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-16-150x23.png 150w\" sizes=\"(max-width: 696px) 100vw, 696px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"1-un-constrained-closed-form-solution\"><strong>1. Unconstrained - Closed-Form Solution:<\/strong><\/h3>\n\n\n\n<p>Steepest descent converges when every element of the gradient is zero (or at least very close to zero). 
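<\/p>

<p><em>As a minimal sketch of the closed-form idea, assuming a least-squares cost (the data here is an illustrative toy example): setting the gradient of the sum of squared errors to zero yields the normal equation, which we can solve directly instead of iterating.<\/em><\/p>

```python
import numpy as np

# Toy least-squares problem: fit y = w0 + w1*x to three points
# that lie exactly on the line y = 1 + x.
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])   # first column is the bias term
y = np.array([2.0, 3.0, 4.0])

# Setting the gradient of ||Xw - y||^2 to zero gives X^T X w = X^T y,
# so we jump straight to the critical point with one linear solve.
w = np.linalg.solve(X.T @ X, X.T @ y)
print(w)  # close to [1.0, 1.0]
```

<p>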
In some cases, we may be able to avoid running an iterative algorithm and jump directly to the critical point by solving the equation \u2207xf(x) = 0 for x.<\/p>\n\n\n<figure class=\"wp-block-image size-large zoomable\" data-full=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-19.png\"><img decoding=\"async\" width=\"605\" height=\"169\" src=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-19.png\" alt=\"\" class=\"wp-image-32489\" srcset=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-19.png 605w, https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-19-300x84.png 300w, https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-19-150x42.png 150w\" sizes=\"(max-width: 605px) 100vw, 605px\" \/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"why-is-the-iterative-method-more-robust-than-the-closed-form\"><strong>Why is the Iterative method more robust than the closed form?<\/strong><\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>The closed form works well for simpler cases; if the cost function has many variables, minimizing it in closed form becomes complicated.<\/li>\n\n\n\n<li>Iterative methods help us reach a local minimum; in most cases, reaching the global minimum remains out of reach.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"gradient-descent-first-order-iterative-method\"><strong>Gradient Descent (First Order Iterative Method):<\/strong><\/h3>\n\n\n\n<p>Gradient Descent is an iterative method: you start at some point, compute the gradient (or slope) there, and take a step of descent based on it. The technique of moving x in small steps with the opposite sign of the derivative is called Gradient Descent. In other words, the positive gradient points uphill, and the negative gradient points downhill. 
We can decrease the value of <strong><em>f<\/em><\/strong> by moving in the direction of the negative gradient. This is known as the method of <strong>steepest descent<\/strong> or <strong>gradient descent<\/strong>.<\/p>\n\n\n\n<p>The new point is proposed by:<\/p>\n\n\n<figure class=\"wp-block-image size-large zoomable\" data-full=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-20.png\"><img decoding=\"async\" width=\"230\" height=\"50\" src=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-20.png\" alt=\"\" class=\"wp-image-32490\" srcset=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-20.png 230w, https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-20-150x33.png 150w\" sizes=\"(max-width: 230px) 100vw, 230px\" \/><\/figure>\n\n\n\n<p>where \u03f5 is the learning rate, a positive scalar determining the size of the step, usually chosen as a small constant. Popular methods for arriving at an optimum learning rate (\u03f5) are <strong>grid search or line search.<\/strong><\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"choice-of-learning-rate\"><strong>Choice of Learning Rate:<\/strong><\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Large learning rate \ud83e\udc6a chances of missing the global minimum, as the learning curve will show violent oscillations, with the cost function increasing significantly.<\/li>\n\n\n\n<li>Small learning rate \ud83e\udc6a convergence is slow, and if the learning rate is too low, learning may get stuck at a high cost value.<\/li>\n<\/ul>\n\n\n\n<p>Gradient descent is limited to optimization in continuous spaces, but the general concept of repeatedly making the best small move (in either direction) toward a better configuration can be generalized to discrete spaces.<\/p>\n\n\n\n<p>Gradient Descent can be pictured as a ball rolling down into a valley, where the lowest point is the target of the descent; the learning rate (\u03f5) can be thought of as the size of the steps the ball takes to reach the lowest point of the valley.<\/p>\n\n\n\n<p>For example, let\u2019s consider the below function as the cost function:<\/p>\n\n\n<figure class=\"wp-block-image size-large zoomable\" data-full=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-21.png\"><img decoding=\"async\" width=\"749\" height=\"269\" src=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-21.png\" alt=\"\" class=\"wp-image-32492\" srcset=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-21.png 749w, https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-21-300x108.png 300w, https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-21-696x250.png 696w, https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-21-150x54.png 150w\" sizes=\"(max-width: 749px) 100vw, 749px\" \/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"step-by-step-approach\"><strong>Step by step approach:<\/strong><\/h2>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Start with an initial assumed parameter \ud83e\udc6a assumed value = x; \u03f5 = learning rate<\/li>\n\n\n\n<li>For the value x, calculate the output of the differentiated function, which we denote as f\u2019(x)<\/li>\n\n\n\n<li>Update the parameter \ud83e\udc6a x \u2013 (\u03f5*f'(x))<\/li>\n\n\n\n<li>Continue this process until the algorithm reaches an optimum point (\u03b1)<\/li>\n\n\n\n<li>The error reduces at each step because the cost function is convex in nature.<\/li>\n<\/ol>\n\n\n\n<p>A question arises when the derivative f\u2019(x) = 0: in that situation, the derivative provides no information about which direction to move. 
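<\/p>

<p><em>The step-by-step approach above can be sketched in a few lines of plain Python; the cost f(x) = x\u00b2 (so f\u2019(x) = 2x), the starting point, and the learning rate are illustrative choices:<\/em><\/p>

```python
# Gradient descent on the convex cost f(x) = x**2, whose derivative
# is f_prime(x) = 2*x and whose minimum is at x = 0.
def f_prime(x):
    return 2 * x

x = 5.0        # step 1: initial assumed parameter value
epsilon = 0.1  # learning rate
for _ in range(100):              # step 4: repeat until optimum
    x = x - epsilon * f_prime(x)  # step 3: x <- x - eps * f'(x)

print(x)  # very close to 0, the minimum
```

<p>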
Points where f\u2019(x) = 0 are known as <strong>critical points<\/strong> or <strong>stationary points.<\/strong><\/p>\n\n\n<figure class=\"wp-block-image size-large zoomable\" data-full=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-22.png\"><img decoding=\"async\" width=\"558\" height=\"164\" src=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-22.png\" alt=\"\" class=\"wp-image-32493\" srcset=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-22.png 558w, https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-22-300x88.png 300w, https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-22-150x44.png 150w\" sizes=\"(max-width: 558px) 100vw, 558px\" \/><\/figure>\n\n\n\n<p><strong>A few other important terminologies to know before we move to SGD:<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Local Minimum<\/li>\n\n\n\n<li>Local Maximum<\/li>\n\n\n\n<li>Global Minimum<\/li>\n\n\n\n<li>Saddle Points<\/li>\n\n\n\n<li>Jacobian Matrix<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"local-minimum\"><strong>Local Minimum:<\/strong><\/h3>\n\n\n\n<p>A local minimum is a point where f(x) is lower than at all neighboring points, so it is no longer possible to decrease f(x) by taking baby steps. This is also called a <strong>local minimum (or) relative minimum<\/strong>. <\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"local-maximum\"><strong>Local Maximum:<\/strong><\/h3>\n\n\n\n<p>A local maximum is a point where f(x) is higher than at all neighboring points, so it is not possible to increase f(x) by taking baby steps. 
This is also called a <strong>local maximum (or) relative maximum.<\/strong><\/p>\n\n\n<figure class=\"wp-block-image size-large zoomable\" data-full=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-23.png\"><img decoding=\"async\" width=\"371\" height=\"175\" src=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-23.png\" alt=\"\" class=\"wp-image-32494\" srcset=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-23.png 371w, https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-23-300x142.png 300w, https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/image-23-150x71.png 150w\" sizes=\"(max-width: 371px) 100vw, 371px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"saddle-points\"><strong>Saddle Points:<\/strong><\/h3>\n\n\n\n<p>Some critical points or stationary points are neither maxima nor minima; they are called saddle points.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"global-minimum\"><strong>Global Minimum:<\/strong><\/h2>\n\n\n\n<p>The global minimum is the smallest value of the function f(x); it is also called the <strong>absolute minimum<\/strong>. 
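<\/p>

<p><em>A small numeric illustration of the distinction, using a hypothetical non-convex cost with two basins: gradient descent started from different points settles in different minima, and only one of them is the global minimum.<\/em><\/p>

```python
# Hypothetical non-convex cost with two basins, roughly at x = -1 and x = +1.
def f(x):
    return (x**2 - 1)**2 + 0.2 * x

def f_prime(x):
    return 4 * x * (x**2 - 1) + 0.2

def descend(x, epsilon=0.01, steps=2000):
    # plain gradient descent: repeatedly step against the derivative
    for _ in range(steps):
        x = x - epsilon * f_prime(x)
    return x

left = descend(-2.0)   # settles in the basin near x = -1
right = descend(2.0)   # settles in the basin near x = +1
print(f(left), f(right))  # the left basin is lower: only it is the global minimum
```

<p>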
There is only one global minimum, whereas there can be more than one local minimum.&nbsp;<\/p>\n\n\n\n<p>When the input to the function is multidimensional, most optimization methods fail to find the global minimum, as such functions have multiple local minima, local maxima, and saddle points.&nbsp;<\/p>\n\n\n\n<p>This is one of the greatest challenges in optimization; we often settle for a value of f that is very low, but not necessarily a global minimum in any formal sense.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"what-is-the-jacobian-matrix\"><strong>What is the Jacobian Matrix?<\/strong><\/h2>\n\n\n\n<p>When we have multiple inputs, we must use <strong>partial derivatives<\/strong> of each variable xi. The partial derivative \u2202f(x)\/\u2202xi measures how <strong><em>f<\/em><\/strong> changes as only the variable xi increases at point x. The gradient of <strong><em>f<\/em><\/strong> is the vector containing all the partial derivatives, denoted by \u2207xf(x). In multiple dimensions, critical points are points where every element of the gradient is equal to zero.&nbsp;<\/p>\n\n\n\n<p>Sometimes we need to find all derivatives of a function whose input and output are both vectors. The matrix containing all such partial derivatives is known as the <strong>Jacobian Matrix.<\/strong> A full discussion of Jacobian and Hessian matrices is beyond the scope of the current discussion.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"challenges-in-gradient-descent\"><strong>Challenges in Gradient Descent:<\/strong><\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>For good generalization we need a large training set, which comes with a <strong>huge computational cost<\/strong>.\u00a0<\/li>\n\n\n\n<li>That is, as the training set grows to billions of examples, the time taken for a <strong>single gradient step<\/strong> becomes impractically long.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"stochastic-gradient-descent\"><strong>Stochastic Gradient Descent:<\/strong><\/h2>\n\n\n\n<p>Stochastic Gradient Descent is an extension of Gradient Descent.<\/p>\n\n\n\n<p>Any machine learning \/ deep learning algorithm works on the same kind of objective function f(x), aiming to reduce the error and generalize well when new data comes in.&nbsp;<\/p>\n\n\n\n<p>To overcome the challenges in Gradient Descent, we take a small set of samples: specifically, on each step of the algorithm, we can sample 
a <strong>minibatch<\/strong> drawn uniformly from the training set. The minibatch size is typically chosen to be a relatively small number of examples; it could be from <strong>one to a few hundred.<\/strong>&nbsp;<\/p>\n\n\n\n<p>Using the examples from the minibatch, the SGD algorithm then follows the estimated (expected) gradient downhill.<\/p>\n\n\n\n<p>Gradient Descent has often been regarded as slow or unreliable, and it was not feasible for non-convex optimization problems. With Stochastic Gradient Descent, machine learning algorithms now train very well, reaching a low-cost point (often a local minimum) in a reasonable amount of time.&nbsp;<\/p>\n\n\n\n<p>A crucial parameter for SGD is the learning rate; it is necessary to decrease the learning rate over time, so we denote the learning rate at iteration k as \u03f5k.<\/p>\n\n\n\n<p>This brings us to the end of the blog on Stochastic Gradient Descent. We hope that you were able to gain valuable insights about Stochastic Gradient Descent. If you wish to learn more such concepts, enroll with <a href=\"https:\/\/www.mygreatlearning.com\/academy\" target=\"_blank\" rel=\"noreferrer noopener\">Great Learning Academy's Free Online Courses<\/a> and learn now! 
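<\/p>

<p><em>As a closing illustration, here is a minimal sketch of the minibatch SGD update described above, on a toy one-parameter least-squares fit; the dataset, minibatch size, and decay schedule for \u03f5k are illustrative choices, not prescribed by the method:<\/em><\/p>

```python
import random

random.seed(0)

# Toy data lying exactly on y = 2*x, so the true parameter is w = 2.
data = [(x, 2.0 * x) for x in range(1, 11)]

w = 0.0
batch_size = 5
for k in range(1, 501):
    batch = random.sample(data, batch_size)  # minibatch drawn uniformly
    # gradient of the average squared error (w*x - y)**2 over the batch
    grad = sum(2 * (w * x - y) * x for x, y in batch) / batch_size
    epsilon_k = 0.01 / (1 + 0.01 * k)        # learning rate decays with k
    w = w - epsilon_k * grad

print(w)  # close to 2.0
```

<p>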
<\/p>\n\n\n<figure class=\"wp-block-image size-large zoomable\" data-full=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/09\/Machine-Learning-Foundations-2.png\"><a href=\"https:\/\/www.mygreatlearning.com\/academy\/learn-for-free\/courses\/basics-of-machine-learning-1\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" width=\"1000\" height=\"242\" src=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/09\/Machine-Learning-Foundations-2.png\" alt=\"\" class=\"wp-image-20062\" srcset=\"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/09\/Machine-Learning-Foundations-2.png 1000w, https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/09\/Machine-Learning-Foundations-2-300x73.png 300w, https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/09\/Machine-Learning-Foundations-2-768x186.png 768w, https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2020\/09\/Machine-Learning-Foundations-2-696x168.png 696w\" sizes=\"(max-width: 1000px) 100vw, 1000px\" \/><\/a><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>Stochastic: \u201cProcess involving a randomly determined sequence of observations, each of which is considered as a sample of one element from a probability distribution.\u201d&nbsp; Or, in simple terms, \u201cRandom selection.\u201d Contributed by: Sarveshwaran Points discussed: What is the need of Optimization? 
Any algorithm has an objective of reducing the error, reduction in error is achieved [&hellip;]<\/p>\n","protected":false},"author":41,"featured_media":32502,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_uag_custom_page_level_css":"","site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","ast-disable-related-posts":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"set","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"categories":[2],"tags":[],"content_type":[],"class_list":["post-32485","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-artificial-intelligence"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v27.3 (Yoast SEO v27.3) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>Introduction to Stochastic Gradient Descent<\/title>\n<meta name=\"description\" content=\"Stochastic Gradient Descent is the extension of Gradient Descent. 
Any Machine Learning\/ Deep Learning function works on the same objective function f(x).\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.mygreatlearning.com\/blog\/introduction-to-stochastic-gradient-descent\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Introduction to Stochastic Gradient Descent\" \/>\n<meta property=\"og:description\" content=\"Stochastic Gradient Descent is the extension of Gradient Descent. Any Machine Learning\/ Deep Learning function works on the same objective function f(x).\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.mygreatlearning.com\/blog\/introduction-to-stochastic-gradient-descent\/\" \/>\n<meta property=\"og:site_name\" content=\"Great Learning Blog: Free Resources what Matters to shape your Career!\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/GreatLearningOfficial\/\" \/>\n<meta property=\"article:published_time\" content=\"2021-04-26T12:37:37+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-09-02T09:58:27+00:00\" \/>\n<meta property=\"og:image\" content=\"http:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2021\/04\/iStock-1146014337.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1411\" \/>\n\t<meta property=\"og:image:height\" content=\"744\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Great Learning Editorial Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@https:\/\/twitter.com\/Great_Learning\" \/>\n<meta name=\"twitter:site\" content=\"@Great_Learning\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Great Learning Editorial Team\" 
\/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"8 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/introduction-to-stochastic-gradient-descent\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/introduction-to-stochastic-gradient-descent\\\/\"},\"author\":{\"name\":\"Great Learning Editorial Team\",\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/#\\\/schema\\\/person\\\/6f993d1be4c584a335951e836f2656ad\"},\"headline\":\"Introduction to Stochastic Gradient Descent\",\"datePublished\":\"2021-04-26T12:37:37+00:00\",\"dateModified\":\"2024-09-02T09:58:27+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/introduction-to-stochastic-gradient-descent\\\/\"},\"wordCount\":1426,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/introduction-to-stochastic-gradient-descent\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/wp-content\\\/uploads\\\/2021\\\/04\\\/iStock-1146014337.jpg\",\"articleSection\":[\"AI and Machine Learning\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/introduction-to-stochastic-gradient-descent\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/introduction-to-stochastic-gradient-descent\\\/\",\"url\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/introduction-to-stochastic-gradient-descent\\\/\",\"name\":\"Introduction to Stochastic Gradient 