{"id":8839,"date":"2019-10-29T23:57:57","date_gmt":"2019-10-29T18:27:57","guid":{"rendered":"https:\/\/www.mygreatlearning.com\/blog\/reinforcement-machine-learning\/"},"modified":"2024-09-12T16:01:34","modified_gmt":"2024-09-12T10:31:34","slug":"reinforcement-machine-learning","status":"publish","type":"post","link":"https:\/\/www.mygreatlearning.com\/blog\/reinforcement-machine-learning\/","title":{"rendered":"Reinforcement Learning"},"content":{"rendered":"\n<p><span style=\"font-weight: 400\">You might have seen robots doing mundane tasks like cleaning a room or serving beer to people. However, these actions are usually remote-controlled by a human. These robots are physically capable of doing things by following a set of instructions given to them, but they lack the basic intelligence to decide and do things by themselves. Embedding intelligence is a software challenge, and reinforcement learning, a subfield of machine learning, provides a promising direction towards developing intelligent robots.<\/span><br><span style=\"font-weight: 400\">Reinforcement learning is concerned with how an agent uses feedback to evaluate its actions and plan future actions in a given environment so as to maximize its cumulative reward. In reinforcement learning, the agent is empowered to decide how to perform a task, which makes it different from other <a href=\"https:\/\/www.mygreatlearning.com\/blog\/what-is-machine-learning\/\" target=\"_blank\" rel=\"noopener noreferrer\">machine learning<\/a> approaches such as supervised learning, where a model learns from labelled examples of the correct answer. The machine acts on its own, not according to a set of pre-written commands. Thus, reinforcement learning denotes those algorithms that learn from feedback on their actions and decide how to accomplish a complex task.<\/span><br><span style=\"font-weight: 400\">These algorithms are rewarded when they make the right decision and are punished when they make the wrong decision. 
Under favourable conditions, they can achieve superhuman performance. Here is a comprehensive tutorial on reinforcement learning along with a case study.<\/span><\/p>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe title=\"Reinforcement Learning Tutorial | Reinforcement Learning in Artificial Intelligence | Full Course\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/f8bnkro3yXY?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>\n<\/div><\/figure>\n\n\n\n<p><b>Importance of Reinforcement Learning<\/b><\/p>\n\n\n\n<p><span style=\"font-weight: 400\">We need technological assistance to simplify life, improve productivity and make better business decisions. To achieve this goal, we need intelligent machines. While it is easy to write programs for simple tasks, we need a way to build machines that carry out complex tasks. One way to achieve this is to create machines that are capable of learning things by themselves. 
Reinforcement learning does this.<\/span><\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"reinforcement-learning-basics\"><b>Reinforcement Learning Basics<\/b><\/h3>\n\n\n\n<p><span style=\"font-weight: 400\">The basic elements of reinforcement learning include:<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><span style=\"font-weight: 400\">An input: an initial state from which the model starts acting<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400\">Outputs \u2013 there could be many possible solutions to a given problem, which means there could be many outputs<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400\">Training is based on the input, and the user (typically through a reward function) decides whether to reward or punish the model depending on the output. The model settles on the solution that earns the maximum reward.<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400\">The model takes the rewards and punishments into account and continues to learn from them.<\/span><\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"reinforcement-learning-types\"><b>Reinforcement Learning: Types<\/b><\/h3>\n\n\n\n<p><span style=\"font-weight: 400\">Reinforcement is of two different types: positive and negative.<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><b>Positive Reinforcement<\/b><\/li>\n<\/ul>\n\n\n\n<p><span style=\"font-weight: 400\">Reinforcement is considered positive when an event that follows a behaviour increases the frequency and strength of that behaviour.<\/span><br><span style=\"font-weight: 400\">Positive reinforcement has the following advantages:<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><span style=\"font-weight: 400\">It can maximize performance<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400\">It sustains the change for a long time<\/span><\/li>\n<\/ul>\n\n\n\n<p><span style=\"font-weight: 400\">Positive reinforcement has a disadvantage as well \u2013 if the 
reinforcement is excessive, it could cause an overload and weaken the result.<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><b>Negative Reinforcement<\/b><\/li>\n<\/ul>\n\n\n\n<p><span style=\"font-weight: 400\">Reinforcement is considered negative when a behaviour is strengthened because it stops or avoids a negative condition.<\/span><\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"deep-reinforcement-learning\"><b>Deep Reinforcement Learning<\/b><\/h2>\n\n\n\n<p><span style=\"font-weight: 400\">Deep reinforcement learning combines reinforcement learning with deep neural networks: the network approximates the policy or the value function, which lets the agent cope with large or high-dimensional state spaces such as raw images. The underlying process is still reinforcement learning, a dynamic process of learning through continuous feedback about its actions and adjusting future actions accordingly to acquire the maximum reward.<\/span><br><b>Fields of Application<\/b><span style=\"font-weight: 400\">&nbsp;<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><span style=\"font-weight: 400\">Gaming<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400\">Robotics<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400\">E-commerce<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400\">Self-driving cars<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400\">Industrial automation<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400\">Stock price forecasting<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400\">News<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400\">Design training systems<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400\">Web search engines like Google<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400\">Photo tagging applications<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400\">Spam detector applications<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400\">Weather forecasting applications<\/span><\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"definitions-in-reinforcement-learning\"><b>Definitions in Reinforcement 
Learning<\/b><\/h2>\n\n\n\n<p><span style=\"font-weight: 400\">There are several concepts and definitions in reinforcement learning. The major ones are listed below:<\/span><br><span style=\"font-weight: 400\"><strong>Agent<\/strong>: The agent is the entity that takes actions. For instance, Super Mario is an agent as it navigates a video game.<\/span><br><span style=\"font-weight: 400\"><strong>Action<\/strong> (A): A is the set of all possible moves the agent can make. At each step, the agent chooses one action from this set.<\/span><br><span style=\"font-weight: 400\"><strong>Discount factor<\/strong>: A number between 0 and 1 that multiplies future rewards so that they count for less than immediate rewards. The smaller the discount factor, the more the agent favours short-term gratification over delayed rewards.<\/span><br><span style=\"font-weight: 400\"><strong>Environment<\/strong>: Just as the word implies, the \u2018environment\u2019 is the surroundings through which the agent moves. The environment takes the agent\u2019s current state and action as input and returns the reward and the next state as output.<\/span><br><span style=\"font-weight: 400\"><strong>State<\/strong>: This refers to the current situation in which the agent finds itself, such as a specific place and moment. A state relates the agent to other relevant things such as obstacles, rewards, enemies and tools.<\/span><br><span style=\"font-weight: 400\"><strong>Reward<\/strong>: This denotes the feedback given for an action taken by the agent. The feedback is an evaluation of the agent\u2019s action and indicates whether it was a success or a failure.<\/span><br><span style=\"font-weight: 400\"><strong>Policy<\/strong>: This denotes the agent\u2019s strategy for deciding the next action. A policy maps the current state to an action. 
It aims to choose those actions that bring in the highest reward.<\/span><br><span style=\"font-weight: 400\"><strong>Value<\/strong>: Denotes the expected long-term return, with discounting, from the current state, in contrast to the short-term reward.<\/span><br><span style=\"font-weight: 400\"><strong>Q-value or action-value<\/strong>: It is very similar to the concept of value, except that it considers the current action as well. The Q-value maps a state-action pair to the expected long-term return.<\/span><br><span style=\"font-weight: 400\"><strong>Trajectory<\/strong>: This denotes a sequence of states and the actions that influence them.<\/span><br><span style=\"font-weight: 400\">In the reinforcement learning feedback loop, an agent takes an action based on the state it observes in the environment. The environment evaluates the agent\u2019s action and generates feedback, the reward, which indicates whether that action was a success or a failure. The goal could be different in different scenarios.<\/span><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><span style=\"font-weight: 400\">The goal in a video game may be to finish the game with maximum points. Hence, each additional point gained in the game will affect the subsequent action of the agent.<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400\">The goal in the real world may be to travel between two points, say, from A to B. 
Every small unit the robot moves closer to point B could be counted as a point.<\/span><\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"pros-and-cons-of-reinforcement-machine-learning\"><b>Pros and Cons of Reinforcement Machine Learning<\/b><\/h2>\n\n\n\n<p><b>Pros<\/b><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><span style=\"font-weight: 400\">It helps to solve very complex problems that conventional techniques fail to solve.<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400\">It can achieve long-term results that are very difficult to accomplish otherwise.<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400\">The model resembles the human learning pattern and can therefore reach very high levels of performance.<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400\">The model can learn from its errors and correct them, so the same error is less likely to be repeated.\u00a0<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400\">It learns from experience, so a labelled dataset is not needed to guide its actions.\u00a0<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400\">It provides scope for an intelligent examination of the situation-action relation and can find the behaviour within a given context that leads to maximum performance.<\/span><\/li>\n<\/ul>\n\n\n\n<p><b>Cons<\/b><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><span style=\"font-weight: 400\">Too much reinforcement may cause an overload, which could weaken the results.<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400\">Reinforcement learning is better suited to complex problems; for simple ones it is usually more than is needed.<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400\">It requires a large amount of experience (interaction data) and involves a lot of computation.\u00a0<\/span><\/li>\n\n\n\n<li><span style=\"font-weight: 400\">Maintenance cost is high.<\/span><\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"challenges-faced-by-reinforcement-learning\"><b style=\"font-size: 
24px\">Challenges Faced by Reinforcement Learning<\/b><\/h2>\n\n\n\n<p><span style=\"font-weight: 400\">As mentioned earlier, reinforcement learning uses feedback to choose the best possible actions. This makes it suitable for many complex problems, and it has found application in many domains. But it faces many challenges as well. The main one is creating the simulation environment, which depends heavily on the chosen task. In games such as chess or Go, where the model has to reach superhuman performance, the environment is simple. However, it is more complex for a real-life application such as an autonomous car, where you need a highly realistic simulator. This is crucial because the car will eventually drive on the street. The model must be capable of figuring out how and when to apply the brake or how to avoid a collision. A mistake may not matter in a virtual world, but it becomes a hard problem once the model meets the real world. Things get tricky when you transfer the model from the safe training environment into the real world.<\/span><br><span style=\"font-weight: 400\">Another challenge lies in tweaking and scaling the neural network that controls the agent. It is complex because the only way to communicate with the network is through rewards and penalties. A major risk here is catastrophic forgetting: old knowledge may be erased as the network acquires new knowledge.<\/span><br><span style=\"font-weight: 400\">Another challenge is that the agent sometimes settles on a local optimum: it performs the task in a way that earns reward but is not the optimal or intended behaviour. 
For example, a simulated jumper agent may learn to hop like a kangaroo instead of doing what we expect of it, such as walking.<\/span><br><span style=\"font-weight: 400\">Last but not least, the agent may optimize the reward without performing the intended task. An OpenAI video illustrates this: the agent learned to collect rewards without completing the race.<\/span><br><span style=\"font-weight: 400\">Reinforcement learning has huge potential to change the world. The biggest advantage of this cutting-edge technology is that it is capable of learning by itself through trial and error, just like human beings. It makes mistakes, corrects them, and learns from them to avoid making the same mistake in the future. It can be combined with other machine learning techniques for better performance. No wonder it is used in many real-world applications, such as robotics and gaming, to mention a few. It offers a way to bring creativity and innovation to how a task is performed. 
Reinforcement learning surely has the potential to become a revolutionary technology in the future development of artificial intelligence.<\/span><\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"our-machine-learning-courses\">Our Machine Learning Courses<\/h2>\n\n\n\n<p>Explore our Machine Learning and AI courses, designed for comprehensive learning and skill development.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th><strong>Program Name<\/strong><\/th><th><strong>Duration<\/strong><\/th><\/tr><tr><th><a href=\"https:\/\/professionalonline2.mit.edu\/no-code-artificial-intelligence-machine-learning-program\">MIT No code AI and Machine Learning Course<\/a><\/th><th>12 Weeks<\/th><\/tr><tr><th><a href=\"https:\/\/idss-gl.mit.edu\/mit-idss-data-science-machine-learning-online-program\">MIT Data Science and Machine Learning Course<\/a><\/th><th>12 Weeks<\/th><\/tr><tr><th><a href=\"https:\/\/www.mygreatlearning.com\/mit-data-science-and-machine-learning-program\">Data Science and Machine Learning Course<\/a><\/th><th>12 Weeks<\/th><\/tr><\/thead><\/table><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>You might have seen robots doing mundane tasks like cleaning a room or serving beer to people. However, these actions are usually remote-controlled by a human. These robots are physically capable of doing things following a set of instructions given to them, but they lack the basic intelligence to decide and do things by themselves. 
Embedding [&hellip;]<\/p>\n","protected":false},"author":41,"featured_media":8091,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_uag_custom_page_level_css":"","site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","ast-disable-related-posts":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"set","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"categories":[2],"tags":[],"content_type":[],"class_list":["post-8839","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-artificial-intelligence"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v27.3 (Yoast SEO v27.3) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>Reinforcement Machine Learning-An Introduction to the Basics<\/title>\n<meta name=\"description\" content=\"Reinforcement machine learning is concerned with how an agent uses feedback to evaluate its actions and plan about future actions to maximize the results.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link 
rel=\"canonical\" href=\"https:\/\/www.mygreatlearning.com\/blog\/reinforcement-machine-learning\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Reinforcement Learning\" \/>\n<meta property=\"og:description\" content=\"Reinforcement machine learning is concerned with how an agent uses feedback to evaluate its actions and plan about future actions to maximize the results.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.mygreatlearning.com\/blog\/reinforcement-machine-learning\/\" \/>\n<meta property=\"og:site_name\" content=\"Great Learning Blog: Free Resources what Matters to shape your Career!\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/GreatLearningOfficial\/\" \/>\n<meta property=\"article:published_time\" content=\"2019-10-29T18:27:57+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-09-12T10:31:34+00:00\" \/>\n<meta property=\"og:image\" content=\"http:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/04\/Artificial-Intelligence-Roundup-1-4.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1322\" \/>\n\t<meta property=\"og:image:height\" content=\"793\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Great Learning Editorial Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@https:\/\/twitter.com\/Great_Learning\" \/>\n<meta name=\"twitter:site\" content=\"@Great_Learning\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Great Learning Editorial Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"8 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/reinforcement-machine-learning\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/reinforcement-machine-learning\\\/\"},\"author\":{\"name\":\"Great Learning Editorial Team\",\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/#\\\/schema\\\/person\\\/6f993d1be4c584a335951e836f2656ad\"},\"headline\":\"Reinforcement Learning\",\"datePublished\":\"2019-10-29T18:27:57+00:00\",\"dateModified\":\"2024-09-12T10:31:34+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/reinforcement-machine-learning\\\/\"},\"wordCount\":1702,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/reinforcement-machine-learning\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/wp-content\\\/uploads\\\/2019\\\/04\\\/Artificial-Intelligence-Roundup-1-4.jpg\",\"articleSection\":[\"AI and Machine Learning\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/reinforcement-machine-learning\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/reinforcement-machine-learning\\\/\",\"url\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/reinforcement-machine-learning\\\/\",\"name\":\"Reinforcement Machine Learning-An Introduction to the 
Basics\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/reinforcement-machine-learning\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/reinforcement-machine-learning\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/wp-content\\\/uploads\\\/2019\\\/04\\\/Artificial-Intelligence-Roundup-1-4.jpg\",\"datePublished\":\"2019-10-29T18:27:57+00:00\",\"dateModified\":\"2024-09-12T10:31:34+00:00\",\"description\":\"Reinforcement machine learning is concerned with how an agent uses feedback to evaluate its actions and plan about future actions to maximize the results.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/reinforcement-machine-learning\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/reinforcement-machine-learning\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/reinforcement-machine-learning\\\/#primaryimage\",\"url\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/wp-content\\\/uploads\\\/2019\\\/04\\\/Artificial-Intelligence-Roundup-1-4.jpg\",\"contentUrl\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/wp-content\\\/uploads\\\/2019\\\/04\\\/Artificial-Intelligence-Roundup-1-4.jpg\",\"width\":1322,\"height\":793,\"caption\":\"Reinforcement Machine Learning\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/reinforcement-machine-learning\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Blog\",\"item\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"AI and Machine 
Learning\",\"item\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/artificial-intelligence\\\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Reinforcement Learning\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/#website\",\"url\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/\",\"name\":\"Great Learning Blog\",\"description\":\"Learn, Upskill &amp; Career Development Guide and Resources\",\"publisher\":{\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/#organization\"},\"alternateName\":\"Great Learning\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/#organization\",\"name\":\"Great Learning\",\"url\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/wp-content\\\/uploads\\\/2022\\\/06\\\/GL-Logo.jpg\",\"contentUrl\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/wp-content\\\/uploads\\\/2022\\\/06\\\/GL-Logo.jpg\",\"width\":900,\"height\":900,\"caption\":\"Great 
Learning\"},\"image\":{\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/www.facebook.com\\\/GreatLearningOfficial\\\/\",\"https:\\\/\\\/x.com\\\/Great_Learning\",\"https:\\\/\\\/www.instagram.com\\\/greatlearningofficial\\\/\",\"https:\\\/\\\/www.linkedin.com\\\/school\\\/great-learning\\\/\",\"https:\\\/\\\/in.pinterest.com\\\/greatlearning12\\\/\",\"https:\\\/\\\/www.youtube.com\\\/user\\\/beaconelearning\\\/\"],\"description\":\"Great Learning is a leading global ed-tech company for professional training and higher education. It offers comprehensive, industry-relevant, hands-on learning programs across various business, technology, and interdisciplinary domains driving the digital economy. These programs are developed and offered in collaboration with the world's foremost academic institutions.\",\"email\":\"info@mygreatlearning.com\",\"legalName\":\"Great Learning Education Services Pvt. Ltd\",\"foundingDate\":\"2013-11-29\",\"numberOfEmployees\":{\"@type\":\"QuantitativeValue\",\"minValue\":\"1001\",\"maxValue\":\"5000\"}},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/#\\\/schema\\\/person\\\/6f993d1be4c584a335951e836f2656ad\",\"name\":\"Great Learning Editorial Team\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/wp-content\\\/uploads\\\/2022\\\/02\\\/unnamed.webp\",\"url\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/wp-content\\\/uploads\\\/2022\\\/02\\\/unnamed.webp\",\"contentUrl\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/wp-content\\\/uploads\\\/2022\\\/02\\\/unnamed.webp\",\"caption\":\"Great Learning Editorial Team\"},\"description\":\"The Great Learning Editorial Staff includes a dynamic team of subject matter experts, instructors, and education professionals who combine their deep industry knowledge with innovative teaching methods. 
Their mission is to provide learners with the skills and insights needed to excel in their careers, whether through upskilling, reskilling, or transitioning into new fields.\",\"sameAs\":[\"https:\\\/\\\/www.mygreatlearning.com\\\/\",\"https:\\\/\\\/in.linkedin.com\\\/school\\\/great-learning\\\/\",\"https:\\\/\\\/x.com\\\/https:\\\/\\\/twitter.com\\\/Great_Learning\",\"https:\\\/\\\/www.youtube.com\\\/channel\\\/UCObs0kLIrDjX2LLSybqNaEA\"],\"award\":[\"Best EdTech Company of the Year 2024\",\"Education Economictimes Outstanding Education\\\/Edtech Solution Provider of the Year 2024\",\"Leading E-learning Platform 2024\"],\"url\":\"https:\\\/\\\/www.mygreatlearning.com\\\/blog\\\/author\\\/greatlearning\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Reinforcement Machine Learning-An Introduction to the Basics","description":"Reinforcement machine learning is concerned with how an agent uses feedback to evaluate its actions and plan about future actions to maximize the results.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.mygreatlearning.com\/blog\/reinforcement-machine-learning\/","og_locale":"en_US","og_type":"article","og_title":"Reinforcement Learning","og_description":"Reinforcement machine learning is concerned with how an agent uses feedback to evaluate its actions and plan about future actions to maximize the results.","og_url":"https:\/\/www.mygreatlearning.com\/blog\/reinforcement-machine-learning\/","og_site_name":"Great Learning Blog: Free Resources what Matters to shape your 
Career!","article_publisher":"https:\/\/www.facebook.com\/GreatLearningOfficial\/","article_published_time":"2019-10-29T18:27:57+00:00","article_modified_time":"2024-09-12T10:31:34+00:00","og_image":[{"width":1322,"height":793,"url":"http:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/04\/Artificial-Intelligence-Roundup-1-4.jpg","type":"image\/jpeg"}],"author":"Great Learning Editorial Team","twitter_card":"summary_large_image","twitter_creator":"@https:\/\/twitter.com\/Great_Learning","twitter_site":"@Great_Learning","twitter_misc":{"Written by":"Great Learning Editorial Team","Est. reading time":"8 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.mygreatlearning.com\/blog\/reinforcement-machine-learning\/#article","isPartOf":{"@id":"https:\/\/www.mygreatlearning.com\/blog\/reinforcement-machine-learning\/"},"author":{"name":"Great Learning Editorial Team","@id":"https:\/\/www.mygreatlearning.com\/blog\/#\/schema\/person\/6f993d1be4c584a335951e836f2656ad"},"headline":"Reinforcement Learning","datePublished":"2019-10-29T18:27:57+00:00","dateModified":"2024-09-12T10:31:34+00:00","mainEntityOfPage":{"@id":"https:\/\/www.mygreatlearning.com\/blog\/reinforcement-machine-learning\/"},"wordCount":1702,"commentCount":0,"publisher":{"@id":"https:\/\/www.mygreatlearning.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.mygreatlearning.com\/blog\/reinforcement-machine-learning\/#primaryimage"},"thumbnailUrl":"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/04\/Artificial-Intelligence-Roundup-1-4.jpg","articleSection":["AI and Machine 
Learning"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.mygreatlearning.com\/blog\/reinforcement-machine-learning\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.mygreatlearning.com\/blog\/reinforcement-machine-learning\/","url":"https:\/\/www.mygreatlearning.com\/blog\/reinforcement-machine-learning\/","name":"Reinforcement Machine Learning-An Introduction to the Basics","isPartOf":{"@id":"https:\/\/www.mygreatlearning.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.mygreatlearning.com\/blog\/reinforcement-machine-learning\/#primaryimage"},"image":{"@id":"https:\/\/www.mygreatlearning.com\/blog\/reinforcement-machine-learning\/#primaryimage"},"thumbnailUrl":"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/04\/Artificial-Intelligence-Roundup-1-4.jpg","datePublished":"2019-10-29T18:27:57+00:00","dateModified":"2024-09-12T10:31:34+00:00","description":"Reinforcement machine learning is concerned with how an agent uses feedback to evaluate its actions and plan about future actions to maximize the results.","breadcrumb":{"@id":"https:\/\/www.mygreatlearning.com\/blog\/reinforcement-machine-learning\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.mygreatlearning.com\/blog\/reinforcement-machine-learning\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.mygreatlearning.com\/blog\/reinforcement-machine-learning\/#primaryimage","url":"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/04\/Artificial-Intelligence-Roundup-1-4.jpg","contentUrl":"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/04\/Artificial-Intelligence-Roundup-1-4.jpg","width":1322,"height":793,"caption":"Reinforcement Machine 
Learning"},{"@type":"BreadcrumbList","@id":"https:\/\/www.mygreatlearning.com\/blog\/reinforcement-machine-learning\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Blog","item":"https:\/\/www.mygreatlearning.com\/blog\/"},{"@type":"ListItem","position":2,"name":"AI and Machine Learning","item":"https:\/\/www.mygreatlearning.com\/blog\/artificial-intelligence\/"},{"@type":"ListItem","position":3,"name":"Reinforcement Learning"}]},{"@type":"WebSite","@id":"https:\/\/www.mygreatlearning.com\/blog\/#website","url":"https:\/\/www.mygreatlearning.com\/blog\/","name":"Great Learning Blog","description":"Learn, Upskill &amp; Career Development Guide and Resources","publisher":{"@id":"https:\/\/www.mygreatlearning.com\/blog\/#organization"},"alternateName":"Great Learning","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.mygreatlearning.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.mygreatlearning.com\/blog\/#organization","name":"Great Learning","url":"https:\/\/www.mygreatlearning.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.mygreatlearning.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2022\/06\/GL-Logo.jpg","contentUrl":"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2022\/06\/GL-Logo.jpg","width":900,"height":900,"caption":"Great 
Learning"},"image":{"@id":"https:\/\/www.mygreatlearning.com\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/GreatLearningOfficial\/","https:\/\/x.com\/Great_Learning","https:\/\/www.instagram.com\/greatlearningofficial\/","https:\/\/www.linkedin.com\/school\/great-learning\/","https:\/\/in.pinterest.com\/greatlearning12\/","https:\/\/www.youtube.com\/user\/beaconelearning\/"],"description":"Great Learning is a leading global ed-tech company for professional training and higher education. It offers comprehensive, industry-relevant, hands-on learning programs across various business, technology, and interdisciplinary domains driving the digital economy. These programs are developed and offered in collaboration with the world's foremost academic institutions.","email":"info@mygreatlearning.com","legalName":"Great Learning Education Services Pvt. Ltd","foundingDate":"2013-11-29","numberOfEmployees":{"@type":"QuantitativeValue","minValue":"1001","maxValue":"5000"}},{"@type":"Person","@id":"https:\/\/www.mygreatlearning.com\/blog\/#\/schema\/person\/6f993d1be4c584a335951e836f2656ad","name":"Great Learning Editorial Team","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2022\/02\/unnamed.webp","url":"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2022\/02\/unnamed.webp","contentUrl":"https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2022\/02\/unnamed.webp","caption":"Great Learning Editorial Team"},"description":"The Great Learning Editorial Staff includes a dynamic team of subject matter experts, instructors, and education professionals who combine their deep industry knowledge with innovative teaching methods. 
Their mission is to provide learners with the skills and insights needed to excel in their careers, whether through upskilling, reskilling, or transitioning into new fields.","sameAs":["https:\/\/www.mygreatlearning.com\/","https:\/\/in.linkedin.com\/school\/great-learning\/","https:\/\/x.com\/https:\/\/twitter.com\/Great_Learning","https:\/\/www.youtube.com\/channel\/UCObs0kLIrDjX2LLSybqNaEA"],"award":["Best EdTech Company of the Year 2024","Education Economictimes Outstanding Education\/Edtech Solution Provider of the Year 2024","Leading E-learning Platform 2024"],"url":"https:\/\/www.mygreatlearning.com\/blog\/author\/greatlearning\/"}]}},"uagb_featured_image_src":{"full":["https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/04\/Artificial-Intelligence-Roundup-1-4.jpg",1322,793,false],"thumbnail":["https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/04\/Artificial-Intelligence-Roundup-1-4-150x150.jpg",150,150,true],"medium":["https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/04\/Artificial-Intelligence-Roundup-1-4-300x180.jpg",300,180,true],"medium_large":["https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/04\/Artificial-Intelligence-Roundup-1-4-768x461.jpg",768,461,true],"large":["https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/04\/Artificial-Intelligence-Roundup-1-4-1024x614.jpg",1024,614,true],"1536x1536":["https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/04\/Artificial-Intelligence-Roundup-1-4.jpg",1322,793,false],"2048x2048":["https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/04\/Artificial-Intelligence-Roundup-1-4.jpg",1322,793,false],"web-stories-poster-portrait":["https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/04\/Artificial-Intelligence-Roundup-1-4.jpg",640,384,false],"web-stories-publisher-logo":["https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/04\/Artificial-Intelligence-Roundup-1-4.jpg",9
6,58,false],"web-stories-thumbnail":["https:\/\/www.mygreatlearning.com\/blog\/wp-content\/uploads\/2019\/04\/Artificial-Intelligence-Roundup-1-4.jpg",150,90,false]},"uagb_author_info":{"display_name":"Great Learning Editorial Team","author_link":"https:\/\/www.mygreatlearning.com\/blog\/author\/greatlearning\/"},"uagb_comment_info":0,"uagb_excerpt":"You might have seen robots doing mundane tasks like cleaning room or serving beer to people. However, these actions are usually remote-controlled by a human. These robots are physically capable of doing things following a set of instructions given to them, but they lack the basic intelligence to decide and do things by themselves. Embedding&hellip;","_links":{"self":[{"href":"https:\/\/www.mygreatlearning.com\/blog\/wp-json\/wp\/v2\/posts\/8839","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.mygreatlearning.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.mygreatlearning.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.mygreatlearning.com\/blog\/wp-json\/wp\/v2\/users\/41"}],"replies":[{"embeddable":true,"href":"https:\/\/www.mygreatlearning.com\/blog\/wp-json\/wp\/v2\/comments?post=8839"}],"version-history":[{"count":7,"href":"https:\/\/www.mygreatlearning.com\/blog\/wp-json\/wp\/v2\/posts\/8839\/revisions"}],"predecessor-version":[{"id":106976,"href":"https:\/\/www.mygreatlearning.com\/blog\/wp-json\/wp\/v2\/posts\/8839\/revisions\/106976"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.mygreatlearning.com\/blog\/wp-json\/wp\/v2\/media\/8091"}],"wp:attachment":[{"href":"https:\/\/www.mygreatlearning.com\/blog\/wp-json\/wp\/v2\/media?parent=8839"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.mygreatlearning.com\/blog\/wp-json\/wp\/v2\/categories?post=8839"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.mygreatlearning.com\/blog\/wp-json\/wp\/v2\/tags?post=8839"},{"taxono
my":"content_type","embeddable":true,"href":"https:\/\/www.mygreatlearning.com\/blog\/wp-json\/wp\/v2\/content_type?post=8839"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}