Terms used in Reinforcement Learning



This content originally appeared on DEV Community and was authored by Anurag Verma

Most AI, ML, and data science enthusiasts know the definition of Reinforcement Learning: it is a feedback-based machine learning technique in which an agent learns to behave in an environment by performing actions and observing their outcomes. For each good action the agent receives positive feedback, and for each bad action it receives negative feedback, also called a penalty. However, many are less familiar with the specific terms used in this definition. Let me explain them with an example.

Let's consider the example of a robot that is learning to navigate a maze. In this scenario:

🕵️ Agent: The robot is the agent, the decision-maker that interacts with the environment. The agent perceives the environment and takes actions to achieve its goal.

🧀🐁 Environment: The maze is the environment, which is the context in which the agent operates. The environment can provide feedback to the agent in the form of rewards or punishments.

🎬 Actions: The robot can take different actions such as moving forward, turning left, or turning right. These actions are the choices available to the agent.

🙂 Feedback: The environment provides feedback to the agent based on its actions. The feedback can be positive, negative, or neutral.

🏆 Reward: The agent receives a reward when it takes an action that leads it closer to its goal. For example, if the robot moves towards the exit of the maze, it may receive a positive reward.

🚫 Punishment: The agent receives a punishment when it takes an action that leads it further away from its goal. A punishment is expressed as a negative reward. For example, if the robot hits a wall, it may receive a negative reward.

📜 Policy: The policy is the strategy used by the agent to select actions based on its current state. The goal of the agent is to learn an optimal policy that maximizes the long-term reward. For example, the robot may learn to follow the left wall of the maze to reach the exit.

📍 State: The state is a representation of the environment at a particular time, such as the robot's current location in the maze and any other details relevant to its next decision.
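To see how these terms fit together, here is a minimal sketch of the maze example in Python. Everything specific in it is an assumption made for illustration and is not part of the definitions above: the 3x3 maze layout, the reward values (+10 for the exit, -1 for hitting a wall, -0.1 for any other move), the four compass moves used in place of "forward / turn left / turn right", and the choice of tabular Q-learning as the learning method. The names `MAZE`, `step`, `policy`, `train`, and `run_greedy` are hypothetical helpers.

```python
import random

# Hypothetical 3x3 maze (an assumption for illustration):
# "S" = start, "E" = exit, "#" = wall, "." = free cell.
MAZE = ["S.#",
        "..#",
        "#.E"]
START = (0, 0)

# Actions: the choices available to the agent (simplified to compass moves).
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action):
    """Environment: apply an action, return (next_state, reward, done)."""
    row, col = state
    d_row, d_col = ACTIONS[action]
    new_row, new_col = row + d_row, col + d_col
    inside = 0 <= new_row < len(MAZE) and 0 <= new_col < len(MAZE[0])
    if not inside or MAZE[new_row][new_col] == "#":
        return state, -1.0, False              # punishment: bumped into a wall
    if MAZE[new_row][new_col] == "E":
        return (new_row, new_col), 10.0, True  # reward: reached the exit
    return (new_row, new_col), -0.1, False     # small cost for any other move


def policy(q, state):
    """Policy: pick the action with the highest learned value in this state."""
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Agent: learn action values by trial and error (tabular Q-learning)."""
    rng = random.Random(seed)
    q = {}  # maps (state, action) -> estimated long-term reward
    for _ in range(episodes):
        state = START
        for _ in range(50):  # cap on steps per episode
            # Mostly follow the current policy, sometimes explore at random.
            if rng.random() < epsilon:
                action = rng.choice(list(ACTIONS))
            else:
                action = policy(q, state)
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(
                q.get((next_state, a), 0.0) for a in ACTIONS)
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = next_state
            if done:
                break
    return q


def run_greedy(q, max_steps=20):
    """Follow the learned policy from the start and record visited states."""
    state, path = START, [START]
    for _ in range(max_steps):
        state, _, done = step(state, policy(q, state))
        path.append(state)
        if done:
            break
    return path


if __name__ == "__main__":
    print(run_greedy(train()))
```

Here the state is the robot's (row, column) position, `step` plays the role of the environment and returns the feedback, and `policy` is the strategy the agent ends up with after training. Running the script prints the sequence of states the trained robot visits on its way from the start to the exit.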

#datascience #machinelearning #ai #ml #reinforcementlearning




Anurag Verma | Sciencx (2023-03-25T13:43:21+00:00) Terms used in Reinforcement Leaning. Retrieved from https://www.scien.cx/2023/03/25/terms-used-in-reinforcement-leaning/
