Journal Article
Neurosymbolic Reinforcement Learning and Planning: A Survey
An IEEE Transactions on Artificial Intelligence survey on neurosymbolic reinforcement learning and planning, covering neural-symbolic integration, RL components, interpretable policies, reward shaping, and safe decision-making.
Abstract
The area of neurosymbolic artificial intelligence (Neurosymbolic AI) is rapidly developing and has become a popular research topic, encompassing subfields such as neurosymbolic deep learning and neurosymbolic reinforcement learning (Neurosymbolic RL). Compared with traditional learning methods, Neurosymbolic AI offers significant advantages by simplifying complexity and providing transparency and explainability. Reinforcement learning (RL), a long-standing artificial intelligence (AI) concept that mimics human behavior through reward and punishment, is a fundamental component of Neurosymbolic RL, a recent integration of the two fields that has yielded promising results. This article contributes to the emerging field of Neurosymbolic RL by conducting a literature survey. Our evaluation focuses on the three components that constitute Neurosymbolic RL: neural, symbolic, and RL. We categorize works into three taxonomies based on the role the neural and symbolic parts play in RL: learning for reasoning, reasoning for learning, and learning–reasoning. These categories are further divided into subcategories based on their applications. Furthermore, we analyze the RL components of each surveyed work, including the state space, action space, policy module, and RL algorithm. Finally, we identify research opportunities and challenges in various applications within this dynamic field.
Plain-Language Summary
This paper studies how reinforcement learning systems can become more explainable and structured by combining data-driven learning with symbolic reasoning and planning.
Why This Paper Matters
Reinforcement learning has achieved strong results in games, robotics, and decision-making, but it often struggles with sample efficiency, safety, interpretability, reward design, and verification. This survey explains how neurosymbolic RL combines neural learning with symbolic reasoning to create agents that learn, reason, plan, and produce more understandable policies for real-world decision-making.
Research Summary
This paper surveys neurosymbolic reinforcement learning and planning, an area that combines neural learning, symbolic reasoning, and sequential decision-making. The motivation is that reinforcement learning can learn policies from interaction but often lacks interpretability, structure, and reliable reasoning.
The survey categorizes existing methods based on how neural and symbolic components interact. Some systems use learning to support reasoning, others use reasoning to improve learning, and some integrate both directions into combined learning–reasoning architectures.
The paper also examines reinforcement learning components such as state spaces, action spaces, policy modules, and algorithms. This makes the survey useful for researchers trying to understand where symbolic knowledge can improve planning, transparency, and generalization in RL systems.
Neurosymbolic RL Integration Framework
Neural Component
Learns patterns from data, handles perception, reduces symbolic search spaces, and supports policy learning in complex environments.
Symbolic Component
Provides rules, logic, knowledge graphs, constraints, programmatic policies, and interpretable reasoning for RL agents.
Reinforcement Learning Component
Defines state spaces, action spaces, policies, rewards, and algorithms that allow agents to learn through interaction.
Planning and Explanation
Combines learned policies with symbolic planning to improve transparency, verifiability, reward shaping, and safe exploration.
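To make this division of labor concrete, below is a minimal sketch of how the four components might fit together in a toy grid world. Everything here is illustrative rather than drawn from the paper: the perceive function stands in for a trained perception network, the RULES table plays the symbolic role (doubling as a simple shield over legal actions), and tabular Q-learning over the symbolic abstraction supplies the RL loop.

```python
# Minimal, illustrative sketch (not from the paper) of the four roles above.
import random
from collections import defaultdict

# Neural role (stand-in): map a raw observation to readable symbolic facts.
# A real system would use a trained perception network here.
def perceive(raw_obs):
    x, y = raw_obs
    facts = {f"at_{x}_{y}"}
    if x + y >= 6:
        facts.add("near_goal")
    return frozenset(facts)

# Symbolic role: declarative rules restrict which actions are legal,
# acting as a simple shield during exploration.
RULES = {"at_0_0": {"right", "up"}}  # e.g., walls block left/down at origin

def legal_actions(facts, actions=("left", "right", "up", "down")):
    allowed = set(actions)
    for fact in facts:
        if fact in RULES:
            allowed &= RULES[fact]
    return sorted(allowed)

# RL role: tabular Q-learning over the symbolic state abstraction.
Q = defaultdict(float)

def act(facts, eps=0.1):
    options = legal_actions(facts)
    if random.random() < eps:
        return random.choice(options)
    return max(options, key=lambda a: Q[(facts, a)])

def update(facts, action, reward, next_facts, alpha=0.5, gamma=0.95):
    best_next = max(Q[(next_facts, a)] for a in legal_actions(next_facts))
    Q[(facts, action)] += alpha * (reward + gamma * best_next - Q[(facts, action)])
```

Because states are sets of readable facts and constraints are declarative, both the shield's decisions and the learned Q-values can be inspected directly, which is the transparency benefit this framework targets.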
Key Contributions
- Surveys neurosymbolic reinforcement learning and planning research.
- Introduces taxonomies based on how neural and symbolic components interact in RL.
- Analyzes reinforcement learning components across state spaces, actions, policies, and algorithms.
- Identifies research opportunities and challenges across application areas.
Modeling Approaches Reviewed
Learning for Reasoning RL
Uses neural models to transform unstructured data, reduce symbolic search spaces, or distill learned policies into symbolic systems.
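A representative instance of this direction is distilling a trained neural policy into a decision tree, in the spirit of VIPER-style distillation. The sketch below is illustrative rather than the paper's method: neural_policy is a stand-in for a trained network, the feature names are assumptions, and scikit-learn provides the tree learner.

```python
# Hedged sketch: distill a (stand-in) neural policy into a readable tree.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

def neural_policy(state):
    # Placeholder for a trained network; a fixed rule keeps the demo runnable.
    return int(state[0] + state[1] > 0)

# Query the neural teacher to build a (state, action) dataset.
states = np.random.uniform(-1, 1, size=(5000, 2))
actions = np.array([neural_policy(s) for s in states])

# Fit a shallow tree: a symbolic, human-readable surrogate of the policy.
tree = DecisionTreeClassifier(max_depth=3).fit(states, actions)
print(export_text(tree, feature_names=["x_position", "x_velocity"]))
```

The printed tree is a symbolic surrogate that can be audited or formally checked, at some cost in fidelity to the original network.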
Reasoning for Learning RL
Uses symbolic knowledge to guide neural policy learning, improve reward shaping, and generate more interpretable programmatic policies.
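A common concrete form of symbolic reward shaping is potential-based shaping, where the potential measures progress through a symbolic plan; shaping terms of the form γ·φ(s′) − φ(s) are known to preserve the optimal policy (Ng, Harada, and Russell, 1999). The subgoal names in this sketch are illustrative assumptions.

```python
# Hedged sketch: potential-based reward shaping from a symbolic plan.
PLAN = ["picked_key", "opened_door", "reached_goal"]  # assumed subgoals

def potential(facts):
    # Plan progress: index of the first subgoal not yet satisfied.
    for i, subgoal in enumerate(PLAN):
        if subgoal not in facts:
            return float(i)
    return float(len(PLAN))

def shaped_reward(env_reward, facts, next_facts, gamma=0.99):
    # F(s, s') = gamma * phi(s') - phi(s) keeps the optimal policy unchanged.
    return env_reward + gamma * potential(next_facts) - potential(facts)
```

The shaped signal rewards each completed subgoal instead of only the final goal, which is where the sample-efficiency gain comes from.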
Learning–Reasoning RL
Creates bidirectional interaction between neural and symbolic modules so each component can refine the other during RL.
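The sketch below is a deliberately tiny, illustrative toy (not from the paper) of such a feedback loop: a thresholded detector stands in for the neural side, a rule-based planner for the symbolic side, and plan failures push corrections back into the detector.

```python
# Hedged toy of bidirectional refinement; all names are illustrative.
def detector(obs, threshold):
    # "Neural" side: a thresholded feature detector standing in for a network.
    return {"obstacle_ahead"} if obs > threshold else set()

def planner(facts):
    # Symbolic side: rules map detected facts to an abstract action.
    return "go_around" if "obstacle_ahead" in facts else "go_straight"

threshold = 0.9
for obs, crashed in [(0.6, True), (0.7, True), (0.4, False)]:
    action = planner(detector(obs, threshold))
    if crashed and action == "go_straight":
        # Symbolic-level failure feeds back into the learned detector,
        # which in turn changes what the planner sees on later episodes.
        threshold = min(threshold, obs) - 0.05
print(threshold)  # detector refined by reasoning-level feedback
```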
Safe and Verified RL
Applies symbolic specifications and verification-oriented reasoning to improve safety in exploration and deployment.
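The standard realization of this idea is a shield that intercepts actions violating a symbolic safety specification before they reach the environment, as in shielding approaches such as Alshiekh et al. (2018). The specification format below is an illustrative assumption.

```python
# Hedged sketch of action shielding: a symbolic specification vetoes
# unsafe actions before execution. Predicate and action names are assumed.
SPEC = {
    # Forbidden (state_predicate, action) pairs derived from a formal spec.
    ("near_cliff", "forward"),
    ("low_battery", "takeoff"),
}

def shield(state_predicates, proposed_action, safe_fallback="noop"):
    """Return the agent's action if the spec allows it, else a safe fallback."""
    for predicate in state_predicates:
        if (predicate, proposed_action) in SPEC:
            return safe_fallback
    return proposed_action

assert shield({"near_cliff"}, "forward") == "noop"
assert shield({"near_cliff"}, "back") == "back"
```

Because the shield sits between policy and environment, the learner can explore freely while every executed action provably satisfies the specification.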
Programmatic Policy Learning
Learns policies that can be represented as programs or symbolic structures, supporting interpretability and generalization.
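At its simplest, a programmatic policy is a short expression over named state features, the kind of artifact that programmatic RL methods such as PIRL/NDPS (Verma et al., 2018) search for. The coefficients below are illustrative, not learned.

```python
# Hedged sketch: a policy expressed as a short, inspectable program over
# named state features (CartPole-style control). Coefficients are assumed.
def programmatic_policy(pole_angle, pole_velocity):
    # Human-readable control rule: lean right -> push right, and damp motion.
    score = 3.0 * pole_angle + 1.0 * pole_velocity
    return "push_right" if score > 0 else "push_left"

# Unlike a neural policy, every branch can be read, verified, and edited.
print(programmatic_policy(pole_angle=0.05, pole_velocity=-0.02))
```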
Publication Details
- Type: Journal Article
- Venue: IEEE Transactions on Artificial Intelligence
- Year: 2024
- Published: May 1, 2024
- Volume: 5
- Issue: 5
- Pages: 1939–1953
Authors
Kamal Acharya, Waleed Raza, Carlos Dourado, Alvaro Velasquez, and Houbing Herbert Song
Research Topics
Neurosymbolic AI, reinforcement learning, planning, neural-symbolic integration, interpretable policies, reward shaping, and safe decision-making
Links and Access
DOI: 10.1109/TAI.2023.3311428
Citation
@article{acharya2024neurosymbolic,
author={Acharya, Kamal and Raza, Waleed and Dourado, Carlos and Velasquez, Alvaro and Song, Houbing Herbert},
title={Neurosymbolic Reinforcement Learning and Planning: A Survey},
journal={IEEE Transactions on Artificial Intelligence},
year={2024},
volume={5},
number={5},
pages={1939--1953},
doi={10.1109/TAI.2023.3311428}
}