A Survey of Decision-making Transformer Methods for Intelligent Gaming

Luo Junren; Zhang Wanpeng; Su Jiongming; Wang Yao; Chen Jing

doi:10.3969/j.issn.2096-0204.2023.01.0009

On Decision-making Transformer Methods for Intelligent Gaming

Abstract

Intelligent gaming is a challenging problem in cognitive decision-making intelligence and a key enabler for joint operations planning and intelligent mission planning. This survey first organizes intelligent gaming models from three perspectives: cooperative team games, competitive zero-sum games, and mixed general-sum games. From a cognitive standpoint, it defines four classes of cognitive models for intelligent gaming: operational games (complete or bounded rationality), uncertain games (experience and knowledge), emergent exploratory games (intuition and inspiration), and population interactive games (co-evolution). It then presents solution schemes for intelligent gaming from three perspectives: trustworthy problem solutions, strategy training platforms, and problem-solving paradigms. Building on the Transformer architecture, it reviews decision-making Transformer methods in two major categories and six subcategories: architecture enhancement (representation learning, network combination, model extension) and sequence modeling (offline pre-training, online adaptation, model extension). This body of work offers an initial reference for building decision-making pre-trained models under the "offline pre-training + online adaptation" paradigm for multi-agent, multi-task, multimodal, and sim-to-real transfer scenarios, and a feasible reference for research on decision-making foundation models for intelligent gaming.
