Improving Fleet Management Using Transition-Informed Reinforcement Learning (TIRL) | Nanyang Technological University | Innovation and Entrepreneurship

Synopsis

This technology is designed for scenarios where one leader incentivises a large population of followers to maximise their utility. Potential applications include fleet management (for example, e-hailing driver re-positioning), online ad auctions and elections.

Opportunity

Solving large-scale multi-agent problems remains a formidable challenge in urban planning. This technology efficiently solves large-scale Stackelberg mean-field games (SMFGs), which model real-world applications such as fleet management and online ad auctions. Traditional multi-agent reinforcement learning (RL) methods are typically limited to a small number of agents. In contrast, this technology scales to hundreds or even thousands of agents.

Technology

Many real-world scenarios, such as fleet management and online ad auctions, can be modelled as SMFGs, where a leader incentivises numerous homogeneous, self-interested followers to maximise their utility. Traditional model-free RL methods are often data-inefficient, because the collected data is discarded after each update of the agent policies.

To make efficient use of past experience, Transition-Informed Reinforcement Learning (TIRL) introduces a model that captures how the follower population evolves and uses it when updating agent policies. The model is a neural network that takes an agent's current state and action as input and outputs a probability distribution over the state space. With this learned model, the new state distribution of the followers can be obtained quickly in a non-atomic way, that is, for the population as a whole rather than agent by agent, which significantly improves scalability. The technology also employs regularisation techniques to stabilise the learning process and improve performance.
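
The idea of pushing a population distribution through a learned transition model can be illustrated with a minimal sketch. It is not the TIRL implementation: the state and action counts, the two-layer network, the randomly initialised weights (standing in for a trained model) and the names transition_model and next_distribution are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N_STATES, N_ACTIONS, HIDDEN = 4, 2, 8  # hypothetical sizes

# Random weights stand in for a trained transition model (assumption).
W1 = rng.normal(size=(N_STATES + N_ACTIONS, HIDDEN))
W2 = rng.normal(size=(HIDDEN, N_STATES))


def one_hot(index, size):
    """Encode an integer index as a one-hot vector."""
    v = np.zeros(size)
    v[index] = 1.0
    return v


def transition_model(state, action):
    """Map one (state, action) pair to a probability vector over next states."""
    x = np.concatenate([one_hot(state, N_STATES), one_hot(action, N_ACTIONS)])
    logits = np.tanh(x @ W1) @ W2
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()


def next_distribution(mu, policy):
    """Propagate the follower state distribution by one step.

    mu:     shape (N_STATES,), fraction of followers in each state.
    policy: shape (N_STATES, N_ACTIONS), each row sums to 1.
    """
    mu_next = np.zeros(N_STATES)
    for s in range(N_STATES):
        for a in range(N_ACTIONS):
            # Mass in state s choosing action a, spread over next states.
            mu_next += mu[s] * policy[s, a] * transition_model(s, a)
    return mu_next


mu0 = np.full(N_STATES, 1.0 / N_STATES)                    # uniform population
policy = np.full((N_STATES, N_ACTIONS), 1.0 / N_ACTIONS)   # uniform policy
mu1 = next_distribution(mu0, policy)
```

In this sketch the update costs N_STATES × N_ACTIONS model evaluations regardless of how many followers the distribution represents, which is one way to read the scalability claim above.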

Figure 1: Overview of Transition-Informed Reinforcement Learning (TIRL) framework.

Figure 2: Technical details of TIRL.

Applications & Advantages

Main applications include fleet management, online ad auctions, airline price analysis, elections and stock trading.

Advantages:

Efficiently solves large-scale multi-agent problems in urban planning and beyond
Scales to hundreds or thousands of agents
Leverages a neural network model to predict followers' state distributions
Uses regularisation techniques for stable learning
Provides a scalable, data-efficient solution for optimising complex systems involving numerous interacting agents

Transition-Informed Reinforcement Learning for Large-Scale Stackelberg Mean-Field Games

Inventor

Prof AN Bo
