Exporting my trained actor, critic NN agent from MATLAB Reinforcement Environment to TensorFlow

Question

Mahsa Raeisinezhad on 5 Jun 2023

https://www.mathworks.com/matlabcentral/answers/1978624-exporting-my-trained-actor-critic-nn-agent-from-matlab-reinforcement-environment-to-tensorflow

Answered: Sanjana on 28 Aug 2023

I am trying to export my trained actor and critic neural networks from the MATLAB reinforcement learning environment to TensorFlow:

env = Nuc_Maint_Env_Proposal_220211_NPIC_MATLAB2022A;

initOpts = rlAgentInitializationOptions();

Obtain observation and action specifications.

obsInfo = getObservationInfo(env);

actInfo = getActionInfo(env);

Create a PPO agent from the environment observation and action specifications. This agent uses default deep neural networks for its actor and critic.

agent = rlPPOAgent(obsInfo,actInfo);

% agent = rlACAgent(actor,critic,agentOpts);

To modify the deep neural networks within a reinforcement learning agent, you must first extract the actor and critic function approximators.

actor = getActor(agent);

critic = getCritic(agent);

Extract the deep neural networks from both the actor and critic function approximators.

actorNet = getModel(actor);

criticNet = getModel(critic);

exportNetworkToTensorFlow(actorNet,"actorNet")

exportNetworkToTensorFlow(criticNet,"criticNet")

The problem is that when I import the models in Python using TensorFlow and step through the environment, my actor consistently outputs the same index for the maximum probability. The probability values vary, but the index of the maximum stays the same, which leads to the same decision every time. This only happens in Python and not in MATLAB. Is there anything wrong with the way I am exporting my trained neural network?

Below is the Python code for getting the action_log:

# Python function to get the state_log and action_log
import numpy as np
import tensorflow as tf

def eval():
    action_log = []
    state_log = []
    env = Nuc_Maint_Env_Proposal_220211_NPIC_MATLAB2022A()
    observation = env.reset()
    observation = tf.ragged.constant(observation)
    observation = tf.reshape(observation, (1, -1))
    done = False
    reward = 0
    num_episodes = 720
    for episode in range(num_episodes):
        state = env.reset()
        action_logits = model_actorNet(observation)
        # Lookup table from the actor's output index to an action pair
        actionelements = np.array([[0, 0], [1, 0], [2, 0], [0, 1], [1, 1], [2, 1]])
        action_log_prob = tf.argmax(action_logits, axis=-1)
        action_index = action_log_prob.numpy().item()
        action = actionelements[action_index]
        observation, reward, done, _ = env.step(action)
        reward += reward
        action_log.append(action)
        state_log.append(observation)
        if done:
            break
    return np.array(state_log), np.array(action_log)

Any help would be great.


Answer 1

Sanjana on 28 Aug 2023

https://www.mathworks.com/matlabcentral/answers/1978624-exporting-my-trained-actor-critic-nn-agent-from-matlab-reinforcement-environment-to-tensorflow#answer_1294807

Hi Mahsa,

I understand that you are having trouble using the “actor” and “critic” models exported from MATLAB in Python with TensorFlow.

According to the documentation, the code you provided for exporting the trained “actor” and “critic” models is correct.

The “actor” consistently outputs the same index because of the “tf.argmax” call. That function is mostly used in classification tasks, and here it makes the “actor” always choose the action with the highest probability.

In reinforcement learning, you can use the “tf.random.categorical” function instead. It samples from a categorical distribution, so the “actor” can explore different actions, even ones that are not the most probable.
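
To illustrate the difference between the two selection rules, here is a minimal sketch. It uses NumPy so that it runs without TensorFlow installed, and the probability vector is made up for the example rather than taken from your actor. Note that, as far as I know, “tf.random.categorical” expects logits (unnormalized log-probabilities) rather than probabilities, so if the exported actor ends in a softmax layer you would pass something like tf.math.log(probs) to it.

```python
import numpy as np

def greedy_action(probs):
    """Deterministic choice: always the index of the largest probability."""
    return int(np.argmax(probs))

def sampled_action(probs, rng):
    """Stochastic choice: draw an index according to the probabilities."""
    probs = np.asarray(probs, dtype=np.float64)
    probs = probs / probs.sum()  # guard against small rounding drift
    return int(rng.choice(len(probs), p=probs))

# Hypothetical actor output over six discrete actions (illustrative values only)
probs = [0.30, 0.25, 0.20, 0.10, 0.10, 0.05]
rng = np.random.default_rng(0)

greedy = [greedy_action(probs) for _ in range(1000)]
sampled = [sampled_action(probs, rng) for _ in range(1000)]

print("distinct greedy actions:", len(set(greedy)))
print("distinct sampled actions:", len(set(sampled)))
```

The greedy rule returns index 0 on every call because the input does not change, while the sampled rule returns a spread of indices weighted by the probabilities. Which rule matches the MATLAB behaviour depends on how the agent is configured there, so it is worth comparing both.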

Please refer to the following link for further information:

https://www.mathworks.com/help/deeplearning/ref/exportnetworktotensorflow.html
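
Two details in the posted loop may also be worth checking, although I cannot confirm they are the cause: env.reset() is called on every iteration, and the observation returned by env.step() is passed to the actor on the next iteration without being reshaped to (1, -1). The sketch below shows one way such a loop could be arranged, resetting once and re-batching the observation at every step. StubEnv and stub_actor are hypothetical stand-ins for your environment and exported model, and I am assuming the Gym-style reset/step interface that your code uses.

```python
import numpy as np

class StubEnv:
    """Hypothetical stand-in for the real environment (4-element observation)."""
    def __init__(self):
        self.t = 0
        self.resets = 0

    def reset(self):
        self.t = 0
        self.resets += 1
        return np.zeros(4, dtype=np.float32)

    def step(self, action):
        self.t += 1
        obs = np.full(4, self.t, dtype=np.float32)
        done = self.t >= 5
        return obs, 1.0, done, {}

def stub_actor(batch):
    """Hypothetical actor: takes shape (1, obs_dim), returns (1, 6) scores."""
    assert batch.ndim == 2 and batch.shape[0] == 1
    scores = np.zeros((1, 6), dtype=np.float32)
    scores[0, int(batch[0, 0]) % 6] = 1.0
    return scores

# Same index-to-action lookup table as in the question
ACTIONS = np.array([[0, 0], [1, 0], [2, 0], [0, 1], [1, 1], [2, 1]])

def rollout(env, actor, max_steps=720):
    state_log, action_log = [], []
    total_reward = 0.0
    observation = env.reset()  # reset once, before the loop
    for _ in range(max_steps):
        # Re-batch the latest observation on every step
        batch = np.asarray(observation, dtype=np.float32).reshape(1, -1)
        action_index = int(np.argmax(actor(batch), axis=-1)[0])
        action = ACTIONS[action_index]
        observation, reward, done, _ = env.step(action)
        total_reward += reward  # accumulate in a separate variable
        state_log.append(observation)
        action_log.append(action)
        if done:
            break
    return np.array(state_log), np.array(action_log), total_reward

env = StubEnv()
states, actions, total = rollout(env, stub_actor)
print(states.shape, actions.shape, total)
```

With the real model, the call to stub_actor would be replaced by the loaded TensorFlow model and the NumPy batch converted to a tensor first. The point of the sketch is only that the actor sees a fresh, correctly shaped observation at each step.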

Hope this helps!

Regards,

Sanjana
