17808181. UNDERSTANDING REINFORCEMENT LEARNING POLICIES BY IDENTIFYING STRATEGIC STATES simplified abstract (INTERNATIONAL BUSINESS MACHINES CORPORATION)

Organization Name

INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor(s)

Ronny Luss of Yorktown Heights NY (US)

Amit Dhurandhar of Yorktown Heights NY (US)

MIAO Liu of Ossining NY (US)

UNDERSTANDING REINFORCEMENT LEARNING POLICIES BY IDENTIFYING STRATEGIC STATES - A simplified explanation of the abstract

This abstract first appeared for US patent application 17808181, titled 'UNDERSTANDING REINFORCEMENT LEARNING POLICIES BY IDENTIFYING STRATEGIC STATES'.

Simplified Explanation

The patent application describes a method for generating explanations for a deep reinforcement learning policy.

  • The method computes a maximum likelihood path matrix, which holds the shortest path between each pair of states in a set of states associated with a model trained with the policy.
  • Explanations are then generated from the meta-states identified for each state and the corresponding selected strategic states, using the computed maximum likelihood path matrix.
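
The abstract does not say how the maximum likelihood path matrix is computed. One common way to read "maximum likelihood path as shortest path" is to give each transition a cost of -log(probability), so that the cheapest path is also the most probable one. The following is a minimal sketch under that assumption, using the Floyd-Warshall all-pairs algorithm. The function name `max_likelihood_path_matrix` and the transition matrix `P` are hypothetical and are not taken from the patent application.

```python
import math

def max_likelihood_path_matrix(P):
    """Return M where M[i][j] is the probability of the most likely path i -> j.

    Assumption (not from the patent): P[i][j] is an estimated probability of
    moving from state i to state j in one step under the trained policy.
    """
    n = len(P)
    # Turn probabilities into additive costs: cost = -log(p).
    # Impossible transitions get infinite cost; staying put costs nothing.
    cost = [
        [
            0.0 if i == j
            else (-math.log(P[i][j]) if P[i][j] > 0 else math.inf)
            for j in range(n)
        ]
        for i in range(n)
    ]
    # Floyd-Warshall: allow each state k in turn as an intermediate stop.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = cost[i][k] + cost[k][j]
                if via_k < cost[i][j]:
                    cost[i][j] = via_k
    # Map costs back to path probabilities (an infinite cost becomes 0.0).
    return [[math.exp(-c) for c in row] for row in cost]

# Toy 3-state example with made-up transition probabilities.
P = [
    [0.1, 0.9, 0.0],
    [0.0, 0.2, 0.8],
    [1.0, 0.0, 0.0],
]
M = max_likelihood_path_matrix(P)
print(round(M[0][2], 6))  # 0.72: state 0 reaches state 2 only via state 1
```

In this toy example there is no direct transition from state 0 to state 2, so the most likely path goes through state 1 with probability 0.9 × 0.8. How the application goes on to identify meta-states and select strategic states from such a matrix is not described in the abstract, so it is not sketched here.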

Potential Applications

This technology has potential applications in various fields, including:

  • Artificial intelligence
  • Machine learning
  • Reinforcement learning
  • Robotics
  • Autonomous systems

Problems Solved

The technology addresses the following problems:

  • Lack of interpretability in deep reinforcement learning policies
  • Difficulty in understanding the decision-making process of AI systems
  • Limited ability to explain the reasoning behind AI-driven actions

Benefits

The technology offers several benefits, such as:

  • Improved transparency and interpretability of deep reinforcement learning policies
  • Enhanced understanding of AI system decision-making
  • Ability to provide explanations for AI-driven actions
  • Facilitation of trust and accountability in AI systems


Original Abstract Submitted

One or more computer processors compute a maximum likelihood path matrix comprising a respective shortest path between each state in a set of states associated with a model trained with a deep reinforcement learning policy. The one or more computer processors generate explanations for the deep reinforcement learning policy based on one or more identified meta-states for each state in the set of states and corresponding selected strategic states utilizing the computed maximum likelihood path matrix.