2026, Issue 02 (No. 189), pp. 81-92
Research and Practice of a Railway Freight Marketing Analysis Agent Based on Large Language Models
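Foundation: Systematic Major Project of the Science and Technology Research and Development Program of China State Railway Group Co., Ltd. (P2025X004); Science and Technology Research and Development Program Project of China Railway Guangzhou Group Co., Ltd. (2025Q184-K)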
DOI: 10.20162/j.cnki.issn.1004-9746.2026.02.08
Release date: 2026-04-25
Publication date: 2026-04-25
Abstract:

Railway freight marketing analysis is needed ever more frequently, yet the existing manual workflow is time-consuming and labor-intensive, leaves massive data resources insufficiently mined, and limits work efficiency. To address these problems, this paper develops an intelligent agent system for railway freight marketing analysis that provides an end-to-end intelligent response from natural language queries to structured data results. First, building on the "Railway Freight Marketing Big Data Analysis System" of China Railway Guangzhou Group and integrating artificial intelligence technology, the paper systematically organizes the typical questions, professional terminology, and analytical logic of the freight marketing domain and constructs a dedicated knowledge base for railway freight marketing. The knowledge base has two parts. The first is a semantic knowledge base covering the entities, relationships, and rules of multiple freight marketing business scenarios, which supports intent recognition and context understanding. The second is a knowledge base of historical question-answer pairs, indexed with a hybrid retrieval strategy that builds both a vector index and a keyword index. By fusing semantic similarity matching with exact keyword retrieval, it retains high-precision recall of key entities in professional terminology while generalizing to non-standard queries such as colloquial or vague expressions.
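The fusion of semantic similarity matching with exact keyword retrieval can be illustrated with a small sketch. The abstract does not say how the two rankings are combined, so the reciprocal rank fusion used here is an assumption, as are the character-bigram vectors (a stand-in for a real embedding model), the sample questions, the keyword list, and every function name.

```python
import math
from collections import Counter


def char_bigrams(text):
    """Character bigrams stand in for a learned embedding in this sketch."""
    text = text.lower().replace(" ", "")
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_score(query, doc, keywords):
    """Count domain keywords that occur verbatim in both query and document."""
    return sum(1 for kw in keywords if kw in query and kw in doc)


def rank(scores):
    """Return document indices ordered from best to worst score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def hybrid_search(query, docs, keywords, k=60):
    """Fuse the semantic and keyword rankings with reciprocal rank fusion (assumed)."""
    query_vec = char_bigrams(query)
    semantic_ranking = rank([cosine(query_vec, char_bigrams(d)) for d in docs])
    keyword_ranking = rank([keyword_score(query, d, keywords) for d in docs])
    fused = [0.0] * len(docs)
    for ranking in (semantic_ranking, keyword_ranking):
        for position, doc_index in enumerate(ranking):
            fused[doc_index] += 1.0 / (k + position + 1)
    return rank(fused)


# Hypothetical historical questions and domain keywords, for illustration only.
docs = [
    "monthly freight volume of coal by station",
    "container block train departures per week",
    "customer revenue ranking for steel shipments",
]
keywords = ["coal", "block train", "steel"]
query = "how many block train departures last week"
ranking = hybrid_search(query, docs, keywords)
best_match = docs[ranking[0]]
```

Rank-based fusion is chosen here because it needs no score normalization between the two indexes; a weighted sum of normalized scores would be an equally plausible reading of the abstract.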
Second, to meet the demand for structured indicator analysis, multi-source heterogeneous data, including freight tickets, waybills, and customer profiles, are integrated and modeled through data classification, cleaning, and annotation. The result is a structured knowledge view covering eight core business areas, including freight scheduling, train service transportation, freight category analysis, and customer management, which forms a high-quality, queryable data foundation. On this basis, the Tongyi Qianwen large language model is introduced and an intelligent question-answering system based on the Agentic RAG architecture is developed. Through question routing, task decomposition, tool invocation, answer generation, and result export, the system dynamically interprets user needs, retrieves data efficiently, and generates accurate content in complex freight marketing indicator query scenarios. A self-reflection mechanism validates the generated SQL statements and query results to ensure the accuracy of the output data. The system was evaluated on 741 real business test samples, with SQL execution accuracy and valid efficiency score as the core metrics. SQL execution accuracy reaches 95.1% and the valid efficiency score is 0.894. Compared with the baseline system, execution accuracy increases by 12.1%, the valid efficiency score rises by 14.6%, and average time consumption falls by 12.7%.
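The self-reflection step can be pictured as a generate, validate, retry loop. The sketch below is only one possible shape for such a loop, not the paper's implementation: the `waybill` table, the validation rules, and the `fake_generator` function (a stand-in for the large language model call) are all hypothetical, and SQLite replaces whatever database the real system queries.

```python
import sqlite3


def validate_sql(conn, sql):
    """Return (ok, payload): rows on success, a feedback message on failure."""
    if not sql.lstrip().lower().startswith("select"):
        return False, "only read-only SELECT statements are allowed"
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error as exc:
        return False, f"execution error: {exc}"
    # Illustrative result check; real business rules would be domain-specific.
    if not rows:
        return False, "query returned no rows"
    return True, rows


def answer_with_reflection(conn, question, generate_sql, max_rounds=3):
    """Ask the generator for SQL, feeding validation errors back until it passes."""
    feedback = None
    for _ in range(max_rounds):
        sql = generate_sql(question, feedback)
        ok, payload = validate_sql(conn, sql)
        if ok:
            return sql, payload
        feedback = payload
    raise RuntimeError(f"no valid SQL after {max_rounds} rounds: {feedback}")


def fake_generator(question, feedback):
    """Stand-in for the LLM: the first draft uses a wrong column, the retry fixes it."""
    column = "tons" if feedback else "weight"
    return f"SELECT cargo, SUM({column}) FROM waybill GROUP BY cargo ORDER BY cargo"


# Hypothetical in-memory data for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE waybill (station TEXT, cargo TEXT, tons REAL)")
conn.executemany(
    "INSERT INTO waybill VALUES (?, ?, ?)",
    [("A", "coal", 100.0), ("B", "coal", 50.0), ("A", "steel", 30.0)],
)
sql, rows = answer_with_reflection(conn, "total tons per cargo type", fake_generator)
```

In this run the first draft fails with a "no such column" error, the error text is passed back as feedback, and the second draft passes validation.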
The system integrates the semantic understanding capabilities of large language models, the structured knowledge system of the railway freight marketing domain, an executable tool invocation mechanism, and an automatic verification process based on business logic. It thereby offers a feasible technical path toward a generative freight marketing analysis agent that is practical, scalable, secure, and controllable.
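The two evaluation metrics named in the abstract, SQL execution accuracy and valid efficiency score, can be sketched as follows. The abstract does not give their formulas, so this assumes the definitions commonly used in Text-to-SQL benchmarks: a prediction counts as correct when it returns the same rows as the reference query, and each correct prediction contributes the square root of the reference-to-predicted run-time ratio to the efficiency score. The table, queries, and timings are hypothetical.

```python
import math
import sqlite3
from collections import Counter


def same_result(conn, predicted_sql, gold_sql):
    """True when the predicted query runs and returns the same multiset of rows."""
    try:
        predicted = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False
    gold = conn.execute(gold_sql).fetchall()
    return Counter(predicted) == Counter(gold)


def execution_accuracy(conn, pairs):
    """Share of (predicted, gold) SQL pairs whose results match."""
    return sum(same_result(conn, p, g) for p, g in pairs) / len(pairs)


def valid_efficiency_score(correct, predicted_times, gold_times):
    """Assumed benchmark-style VES: correct samples add sqrt(gold_time / predicted_time)."""
    total = sum(
        math.sqrt(gold / predicted) if ok else 0.0
        for ok, predicted, gold in zip(correct, predicted_times, gold_times)
    )
    return total / len(correct)


# Hypothetical data: one matching pair and one mismatching pair.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE waybill (station TEXT, cargo TEXT, tons REAL)")
conn.executemany(
    "INSERT INTO waybill VALUES (?, ?, ?)",
    [("A", "coal", 100.0), ("B", "coal", 50.0), ("A", "steel", 30.0)],
)
pairs = [
    ("SELECT cargo FROM waybill WHERE tons > 40",
     "SELECT cargo FROM waybill WHERE tons >= 50"),
    ("SELECT COUNT(*) FROM waybill",
     "SELECT COUNT(DISTINCT cargo) FROM waybill"),
]
accuracy = execution_accuracy(conn, pairs)
correct = [same_result(conn, p, g) for p, g in pairs]
# Timings are supplied by hand so the example stays deterministic.
ves = valid_efficiency_score(correct, [0.2, 0.1], [0.2, 0.1])
```

Under this definition a prediction that is correct but slower than the reference scores below 1, which is why the efficiency score can fall below the execution accuracy.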

Basic information:

CLC number: F532.6; TP18

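Citation:

CHEN Bing (陈兵), WANG Keda (王克达), ZHANG Mingwei (张明威), et al. Research and Practice of a Railway Freight Marketing Analysis Agent Based on Large Language Models[J]. Railway Economics Research (铁道经济研究), 2026, No.189(02): 81-92. DOI: 10.20162/j.cnki.issn.1004-9746.2026.02.08.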
