Abstract
To better evaluate China's basic transport capacity and transport service level, a public evaluation index calculation model based on text analysis and sentiment analysis algorithms is proposed. The model targets public comments on passenger and freight transport, major events, and major safety accidents. Transport-related content published on five types of data sources (shared transport platforms, online ticketing platforms, travel service platforms, e-commerce platforms, and social media platforms) is collected by web crawlers and other means, together with public evaluations, opinions, and complaints about the direct or third-party services these platforms provide. The data are preprocessed, and the model is trained using domain, content, and sentiment analyzers. Weights are assigned to the different domains and contents, and the sentiment value of each text is combined with the weights of its domain and content to obtain the evaluation index. Taking question-and-answer data from the Mafengwo (马蜂窝) platform as an example, passenger transport enterprises are evaluated with the big-data-based evaluation index calculation model, and the effectiveness of the domain, content, and sentiment classifiers proposed in the comprehensive evaluation index framework is verified for each classifier separately. The case analysis shows that (1) compared with traditional index calculation models, the proposed model mines information more deeply, can be trained simply and efficiently, and achieves high classification accuracy; and (2) as incremental data are added later, it can overcome the lack of information caused by an insufficient initial sample, so it can supplement traditional evaluation indicators and is feasible in practical application.
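The weighting step described in the abstract, in which the sentiment value of each text is combined with the weights of its domain and content, can be sketched roughly as follows. This is only an illustrative sketch and not the authors' formula: the `Comment` class, the weight tables, the category names, the sentiment range of [-1, 1], and the rescaling to a 0-100 index are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    """One classified public comment (hypothetical structure)."""
    domain: str       # output of an assumed domain classifier
    content: str      # output of an assumed content classifier
    sentiment: float  # assumed sentiment score in [-1, 1]


# Illustrative weights only; the abstract does not give the real values.
DOMAIN_WEIGHTS = {"passenger": 0.6, "freight": 0.4}
CONTENT_WEIGHTS = {"punctuality": 0.5, "price": 0.3, "safety": 0.2}


def evaluation_index(comments):
    """Weighted mean of sentiment scores, rescaled from [-1, 1] to [0, 100]."""
    weighted_sum = 0.0
    total_weight = 0.0
    for c in comments:
        # A comment's weight is the product of its domain and content weights;
        # unknown categories get weight 0 and are effectively ignored.
        w = DOMAIN_WEIGHTS.get(c.domain, 0.0) * CONTENT_WEIGHTS.get(c.content, 0.0)
        weighted_sum += w * c.sentiment
        total_weight += w
    if total_weight == 0.0:
        return 50.0  # no usable comments: report the neutral midpoint
    return 50.0 * (weighted_sum / total_weight + 1.0)


sample = [
    Comment("passenger", "punctuality", 1.0),
    Comment("passenger", "price", -1.0),
]
print(evaluation_index(sample))
```

Normalizing by the total weight keeps the index comparable between enterprises with different numbers of comments, which is one plausible design choice among several.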
Authors
李弢
刘勇凤
成倩倩
李绪茂
LI Tao; LIU Yong-feng; CHENG Qian-qian; LI Xu-mao (Transport Planning and Research Institute, Ministry of Transport, Beijing 100028, China; Laboratory for Traffic & Transport Planning Digitalization, Beijing 100028, China)
Source
《公路交通科技》
CAS
CSCD
Peking University Core Journals (北大核心)
2022, Issue 9, pp. 177-184 (8 pages)
Journal of Highway and Transportation Research and Development
Funding
Transport Strategic Planning Policy Project (2016-6-5).
Keywords
transport engineering
transport service evaluation index
text analysis algorithm
sentiment analysis algorithm
transport services level
transport comprehensive service capacity
big data