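Abstract: To address the problems that traditional single-factor models cannot make full use of information related to a time series and that these models predict time series with poor accuracy and reliability, a time series prediction model based on multimodal information fusion, Skip-Fusion, is proposed to fuse the text data and numerical data in multimodal data. First, the BERT (Bidirectional Encoder Representations from Transformers) pre-trained model and one-hot encoding are used to encode different categories of text data. Next, a pre-trained model based on a global attention mechanism is used to obtain a single vector representation that fuses the multiple text features. The resulting single vector representation is then aligned with the numerical data in chronological order. Finally, a temporal convolutional network (TCN) model fuses the text and numerical features, and skip connections perform a further fusion of the shallow and deep features of the multimodal data. In experiments on a stock price series dataset, the Skip-Fusion model achieves a root mean square error (RMSE) of 0.492 and a daily return (R) of 0.930, both better than the results of existing single-modal models and multimodal fusion models, and it attains a goodness of fit of 0.955 as measured by the coefficient of determination (R-squared). The experimental results show that the Skip-Fusion model can effectively fuse multimodal information and has high prediction accuracy and reliability.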
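The TCN and skip-connection step described in this abstract can be illustrated with a minimal sketch. This is not the authors' Skip-Fusion implementation: the function names, the fixed kernel weights, the two-layer depth, and the elementwise sum used to combine the aligned text and numerical series are all assumptions made for illustration. A real TCN would learn its weights and treat the modalities as separate input channels.

```python
def causal_dilated_conv(x, weights, dilation):
    """1-D causal convolution: out[t] uses only x[t], x[t - d], x[t - 2d], ...

    Positions before the start of the series count as zeros (left padding),
    so no output ever depends on a future input.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def relu(values):
    return [max(0.0, v) for v in values]


def skip_fusion_sketch(numeric, text_signal):
    """Hypothetical two-layer sketch of shallow/deep fusion with a skip connection."""
    if len(numeric) != len(text_signal):
        raise ValueError("series must be aligned to the same time steps")
    # Shallow fusion (assumption): combine the time-aligned modalities by summing.
    shallow = [n + s for n, s in zip(numeric, text_signal)]
    # Deep features: stacked causal convolutions with growing dilation.
    hidden = relu(causal_dilated_conv(shallow, [0.5, 0.5], dilation=1))
    deep = relu(causal_dilated_conv(hidden, [0.5, 0.5], dilation=2))
    # Skip connection: add the shallow features back onto the deep features.
    return [s + d for s, d in zip(shallow, deep)]


prices = [1.0, 2.0, 3.0, 4.0]   # toy numerical series
news = [0.0, 0.0, 1.0, 0.0]     # toy scalar text feature per time step
fused = skip_fusion_sketch(prices, news)
```

Because every convolution is causal, changing the last input value leaves all earlier outputs unchanged, which is the property that makes such a stack usable for forecasting.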
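Abstract: Acute renal failure is a clinical disease with a high incidence. Identifying potential patients as early as possible helps doctors intervene medically and reduces morbidity and mortality. In recent years, predicting patients' potential health risks from electronic health records has attracted growing attention. Most models handle the sparsity and irregularity of physiological indicator data by aggregating the data or imputing missing values, which ignores the patient health status implied by the missing information. In addition, existing acute renal failure prediction models consider neither the data characteristics of each modality nor the correlations between modalities. To solve these problems, a multimodal acute renal failure prediction model is proposed. The model takes into account physiological indicator data, disease data, and demographic data. A new LSTM (long short-term memory) network based on masks and time differences is designed to learn the time intervals and missing information of each physiological indicator and to capture changes in both indicator values and measurement frequency. A multi-head self-attention mechanism is introduced to promote mutual learning among the modality representations. Experiments on acute renal failure prediction and mortality risk prediction on real datasets demonstrate the effectiveness and rationality of the proposed model.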
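The mask and time-difference inputs mentioned in this abstract can be sketched for a single physiological indicator as follows. The abstract does not give the exact definitions, so this sketch assumes a common convention from the irregular time series literature (as in GRU-D-style models): the mask is 1 when a value was measured and 0 otherwise, and the time difference is the time elapsed since the indicator was last measured. The function name and the toy data are hypothetical.

```python
def mask_and_time_delta(timestamps, values):
    """Return (masks, deltas) for one indicator.

    masks[t]  = 1 if values[t] was observed, 0 if it is missing (None).
    deltas[t] = time elapsed since the most recent observation before step t
                (0.0 at the first step). While a value stays missing, the
                gap keeps accumulating across steps.
    """
    masks, deltas = [], []
    for t, value in enumerate(values):
        masks.append(0 if value is None else 1)
        if t == 0:
            deltas.append(0.0)
            continue
        gap = float(timestamps[t] - timestamps[t - 1])
        if masks[t - 1] == 1:
            deltas.append(gap)                   # previous step was observed
        else:
            deltas.append(gap + deltas[t - 1])   # still waiting since last observation
    return masks, deltas


# Toy example: hours since admission and one lab value with two missing readings.
hours = [0, 1, 3, 6, 7]
lab_value = [1.2, None, None, 0.9, 1.0]
masks, deltas = mask_and_time_delta(hours, lab_value)
# masks  -> [1, 0, 0, 1, 1]
# deltas -> [0.0, 1.0, 3.0, 6.0, 1.0]
```

Feeding both sequences to a recurrent network alongside the values lets it distinguish a value that was just measured from one that has been missing for a long time, which is the kind of information the abstract says aggregation and imputation discard.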
Funding: Supported by the National Natural Science Foundation of China under Grant No 60972106, the China Postdoctoral Science Foundation under Grant No 2014M561053, the Humanity and Social Science Foundation of Ministry of Education of China under Grant No 15YJA630108, and the Hebei Province Natural Science Foundation under Grant No E2016202341.
Abstract: The contribution of this work is twofold: (1) a multimodality prediction method of chaotic time series with the Gaussian process mixture (GPM) model is proposed, which employs a divide and conquer strategy. It automatically divides the chaotic time series into multiple modalities with different extrinsic patterns and intrinsic characteristics, and thus can more precisely fit the chaotic time series. (2) An effective sparse hard-cut expectation maximization (SHC-EM) learning algorithm for the GPM model is proposed to improve the prediction performance. SHC-EM replaces a large learning sample set with fewer pseudo inputs, accelerating model learning based on these pseudo inputs. Experiments on Lorenz and Chua time series demonstrate that the proposed method yields not only accurate multimodality prediction, but also the prediction confidence interval. SHC-EM outperforms the traditional variational learning in terms of both prediction accuracy and speed. In addition, SHC-EM is more robust and insusceptible to noise than variational learning.
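The "hard-cut" idea in this abstract can be illustrated on a toy problem. The sketch below is not the authors' SHC-EM algorithm: it assumes simple 1-D Gaussian components in place of Gaussian process components, and it omits the sparse pseudo-input step entirely. It shows only the hard-cut pattern, in which the E-step assigns each sample to its single most probable component and the M-step refits each component on the samples assigned to it. All names and data are hypothetical.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def hard_cut_em(xs, init_means, n_iter=20, min_var=1e-3):
    """Hard-assignment EM for a toy 1-D Gaussian mixture."""
    k = len(init_means)
    means = list(init_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    labels = [0] * len(xs)
    for _ in range(n_iter):
        # Hard-cut E-step: pick the component with the highest posterior score.
        new_labels = [
            max(range(k),
                key=lambda j: math.log(weights[j]) + log_gauss(x, means[j], variances[j]))
            for x in xs
        ]
        # M-step: refit each component using only its assigned samples.
        for j in range(k):
            members = [x for x, z in zip(xs, new_labels) if z == j]
            if not members:
                continue  # keep the old parameters for an empty component
            means[j] = sum(members) / len(members)
            spread = sum((x - means[j]) ** 2 for x in members) / len(members)
            variances[j] = max(min_var, spread)
            weights[j] = len(members) / len(xs)
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, means


samples = [0.0, 0.2, -0.1, 0.1, 5.0, 5.2, 4.9, 5.1]
labels, means = hard_cut_em(samples, init_means=[0.0, 5.0])
```

On this toy data the two clusters are well separated, so the assignments settle after the first pass. In the setting the abstract describes, each component would instead be a Gaussian process fitted to its share of the series.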