Agent-Environment interface

Reinforcement learning can be seen as a special case of the interaction problem for achieving a goal. The entity that must reach the goal is called an agent. The entity with which the agent must interact is called the environment, which corresponds to everything that is external to the agent.

So far, we have focused on the term agent, but what does it represent? An agent is a software entity that performs services on behalf of another program, usually automatically and invisibly. Such programs are also called smart agents.

The following are the most important features of an agent:

  • It can choose an action on the environment from either a continuous or a discrete set
  • The action depends on the situation, which is summarized in the system state
  • The agent continuously monitors the environment (input) and continuously updates its status
  • The choice of action is not trivial and requires a certain degree of intelligence
  • The agent has a smart memory
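
These features can be summarized in a minimal agent sketch. This is only an illustration, assuming hypothetical class and method names (`Agent`, `choose_action`, `remember`), not the API of any specific library:

```python
import random
from abc import ABC, abstractmethod

class Agent(ABC):
    """Minimal agent interface: it observes a state, chooses an action,
    and keeps a memory of past interactions (names are illustrative)."""

    def __init__(self):
        self.memory = []  # the agent's memory of past experience

    @abstractmethod
    def choose_action(self, state):
        """Select an action for the given state (from a discrete or continuous set)."""

    def remember(self, state, action, reward):
        # store the interaction so that later choices can exploit experience
        self.memory.append((state, action, reward))

class RandomAgent(Agent):
    """Trivial concrete agent: picks uniformly from a discrete action set."""

    def __init__(self, actions):
        super().__init__()
        self.actions = actions

    def choose_action(self, state):
        return random.choice(self.actions)
```

A `RandomAgent` already satisfies the interface; the "certain degree of intelligence" enters when `choose_action` starts depending on the state and on the stored experience.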

The agent has a goal-directed behavior but acts in an uncertain environment that is not known a priori, or is only partially known. An agent learns by interacting with the environment. Planning can be developed while learning about the environment through the measurements made by the agent itself. This strategy is close to trial-and-error learning.

Trial and error is a fundamental method of problem solving. It is characterized by repeated, varied attempts that are continued until success, or until the agent stops trying.
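
A minimal sketch of this scheme, assuming a hypothetical `attempt`/`succeeded` pair supplied by the caller:

```python
import random

def trial_and_error(attempt, succeeded, max_tries=100, seed=0):
    """Repeat varied attempts until one succeeds, or until the agent
    stops trying after `max_tries` (all names here are illustrative)."""
    rng = random.Random(seed)
    for trial in range(1, max_tries + 1):
        candidate = attempt(rng)
        if succeeded(candidate):
            return candidate, trial  # success after `trial` attempts
    return None, max_tries           # the agent stops trying

# Example: search for a secret number in 1..10 by varied random guesses.
secret = 7
result, tries = trial_and_error(
    attempt=lambda rng: rng.randint(1, 10),
    succeeded=lambda guess: guess == secret,
)
```

The attempts are varied (each guess is drawn anew), and the loop embodies both exit conditions from the definition: success, or giving up after a fixed budget of tries.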

The Agent-Environment interaction is continuous: the agent chooses an action to be taken, and in response, the environment changes state, presenting a new situation to be faced.

In the particular case of reinforcement learning, the environment provides the agent with a reward. It is essential that the source of the reward be the environment, not the agent itself: an internal, self-generated reinforcement mechanism would compromise learning.

The value of the reward is proportional to the influence that the action has on reaching the objective: it is positive (or high) for a correct action, and negative (or low) for an incorrect one.
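
The interaction loop can be sketched with a toy environment (a hypothetical one-dimensional corridor; the class name, actions, and reward values are illustrative). Note that the reward is computed inside the environment's `step` method, never by the agent:

```python
class CorridorEnvironment:
    """Toy environment: the agent starts at position 0 and must reach `goal`.
    The environment, not the agent, computes the reward."""

    def __init__(self, goal=3):
        self.goal = goal
        self.state = 0

    def step(self, action):
        # the action is +1 (forward) or -1 (backward); the state changes in response
        self.state += action
        done = self.state == self.goal
        # high reward for reaching the goal, a small penalty for any other move
        reward = 1.0 if done else -0.1
        return self.state, reward, done

env = CorridorEnvironment(goal=3)
total_reward, done = 0.0, False
while not done:
    state, reward, done = env.step(+1)  # a trivial policy: always move forward
    total_reward += reward
```

With this trivial policy the episode ends after three steps with a cumulative reward of about 0.8: two penalized intermediate moves plus the final goal reward.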

The following are some examples from real life in which there is an interaction between the agent and environment to solve the problem:

  • A chess player, for each move, has information on the configurations of pieces that the move can create and on the possible countermoves of the opponent
  • A newborn giraffe learns to get up and run at 50 km/h in a few hours
  • A truly autonomous robot learns to move in a room to get out of it
  • The parameters of a refinery (oil pressure, flow, and so on) are set in real time so as to obtain the maximum yield or maximum quality

All the examples we have analyzed have the following characteristics in common:

  • Interaction with the environment
  • Objective of the agent
  • Uncertainty or partial knowledge of the environment

From the analysis of these examples, it is possible to make the following observations:

  • The agent learns from its own experience.
  • When the actions change the status (the situation), the possibilities of choices in the future change (delayed reward).
  • The effect of an action cannot be completely predicted.
  • The agent has a global assessment of its behavior.
  • It must exploit this information to improve its choices. Choices improve with experience.
  • Problems can have a finite or infinite time horizon.
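
These observations can be made concrete with a simple value update in the style of temporal-difference learning. This is only a sketch under assumed names and constants (`update_value`, `alpha`, `gamma`), not a full algorithm: each state's estimated value is nudged toward the immediate reward plus the discounted value of the successor state, which is how a delayed reward propagates backwards as experience accumulates.

```python
def update_value(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Nudge the estimated value of `state` toward the observed reward plus
    the discounted estimate of `next_state` (alpha is the learning rate)."""
    old = values.get(state, 0.0)
    target = reward + gamma * values.get(next_state, 0.0)
    values[state] = old + alpha * (target - old)

values = {}
# Repeating the same two-step experience improves both estimates:
# "s0" earns nothing immediately, but it leads to "s1", which earns 1.0.
for _ in range(100):
    update_value(values, "s0", reward=0.0, next_state="s1")
    update_value(values, "s1", reward=1.0, next_state="end")
```

After enough repetitions the estimate for "s1" approaches 1.0 and the estimate for "s0" approaches 0.9: the delayed reward has propagated one step back, purely from experience.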