Malware homology identification is important in attacking event tracing, emergency response scheme generation, and event trend prediction. Current malware homology identification methods still rely on manual analysis,...Malware homology identification is important in attacking event tracing, emergency response scheme generation, and event trend prediction. Current malware homology identification methods still rely on manual analysis, which is inefficient and cannot respond quickly to the outbreak of attack events. In response to these problems, we propose a new malware homology identification method from a gene perspective. A malware gene is represented by the subgraph, which can describe the homology of malware families. We extract the key subgraph from the function dependency graph as the malware gene by selecting the key application programming interface(API) and using the community partition algorithm. Then, we encode the gene and design a frequent subgraph mining algorithm to find the common genes between malware families. Finally, we use the family genes to guide the identification of malware based on homology. We evaluate our method with a public dataset, and the experiment results show that the accuracy of malware classification reaches 97% with high efficiency.展开更多
针对大多数跨度模型将文本分割成跨度序列时,产生大量非实体跨度,导致了数据不平衡和计算复杂度高等问题,提出了基于跨度和边界探测的实体关系联合抽取模型(joint extraction model for entity relationships based on span and boundar...针对大多数跨度模型将文本分割成跨度序列时,产生大量非实体跨度,导致了数据不平衡和计算复杂度高等问题,提出了基于跨度和边界探测的实体关系联合抽取模型(joint extraction model for entity relationships based on span and boundary detection,SBDM)。SBDM首先使用训练Transformer的双向编码器表征量(bidirectional encoder representations from Transformer,BERT)模型将文本转化为词向量,并融合了通过图卷积获取的句法依赖信息以形成文本的特征表示;接着通过局部信息和句子上下文信息去探测实体边界并进行标记,以减少非实体跨度;然后将实体边界标记形成的跨度序列进行实体识别;最后将局部上下文信息融合到1个跨度实体对中并使用sigmoid函数进行关系分类。实验表明,SBDM在SciERC(multi-task identification of entities,relations,and coreference for scientific knowledge graph construction)数据集、CoNLL04(the 2004 conference on natural language learning)数据集上的关系分类指标S F1分别达到52.86%、74.47%,取得了较好效果。SBDM用于关系分类任务中,能促进跨度分类方法在关系抽取上的研究。展开更多
Due to the structural dependencies among concurrent events in the knowledge graph and the substantial amount of sequential correlation information carried by temporally adjacent events,we propose an Independent Recurr...Due to the structural dependencies among concurrent events in the knowledge graph and the substantial amount of sequential correlation information carried by temporally adjacent events,we propose an Independent Recurrent Temporal Graph Convolution Networks(IndRT-GCNets)framework to efficiently and accurately capture event attribute information.The framework models the knowledge graph sequences to learn the evolutionary represen-tations of entities and relations within each period.Firstly,by utilizing the temporal graph convolution module in the evolutionary representation unit,the framework captures the structural dependency relationships within the knowledge graph in each period.Meanwhile,to achieve better event representation and establish effective correlations,an independent recurrent neural network is employed to implement auto-regressive modeling.Furthermore,static attributes of entities in the entity-relation events are constrained andmerged using a static graph constraint to obtain optimal entity representations.Finally,the evolution of entity and relation representations is utilized to predict events in the next subsequent step.On multiple real-world datasets such as Freebase13(FB13),Freebase 15k(FB15K),WordNet11(WN11),WordNet18(WN18),FB15K-237,WN18RR,YAGO3-10,and Nell-995,the results of multiple evaluation indicators show that our proposed IndRT-GCNets framework outperforms most existing models on knowledge reasoning tasks,which validates the effectiveness and robustness.展开更多
基金the National Natural Science Foundation of China (Nos. 61472447 and 61802435).
文摘Malware homology identification is important in attacking event tracing, emergency response scheme generation, and event trend prediction. Current malware homology identification methods still rely on manual analysis, which is inefficient and cannot respond quickly to the outbreak of attack events. In response to these problems, we propose a new malware homology identification method from a gene perspective. A malware gene is represented by the subgraph, which can describe the homology of malware families. We extract the key subgraph from the function dependency graph as the malware gene by selecting the key application programming interface(API) and using the community partition algorithm. Then, we encode the gene and design a frequent subgraph mining algorithm to find the common genes between malware families. Finally, we use the family genes to guide the identification of malware based on homology. We evaluate our method with a public dataset, and the experiment results show that the accuracy of malware classification reaches 97% with high efficiency.
文摘针对大多数跨度模型将文本分割成跨度序列时,产生大量非实体跨度,导致了数据不平衡和计算复杂度高等问题,提出了基于跨度和边界探测的实体关系联合抽取模型(joint extraction model for entity relationships based on span and boundary detection,SBDM)。SBDM首先使用训练Transformer的双向编码器表征量(bidirectional encoder representations from Transformer,BERT)模型将文本转化为词向量,并融合了通过图卷积获取的句法依赖信息以形成文本的特征表示;接着通过局部信息和句子上下文信息去探测实体边界并进行标记,以减少非实体跨度;然后将实体边界标记形成的跨度序列进行实体识别;最后将局部上下文信息融合到1个跨度实体对中并使用sigmoid函数进行关系分类。实验表明,SBDM在SciERC(multi-task identification of entities,relations,and coreference for scientific knowledge graph construction)数据集、CoNLL04(the 2004 conference on natural language learning)数据集上的关系分类指标S F1分别达到52.86%、74.47%,取得了较好效果。SBDM用于关系分类任务中,能促进跨度分类方法在关系抽取上的研究。
基金the National Natural Science Founda-tion of China(62062062)hosted by Gulila Altenbek.
文摘Due to the structural dependencies among concurrent events in the knowledge graph and the substantial amount of sequential correlation information carried by temporally adjacent events,we propose an Independent Recurrent Temporal Graph Convolution Networks(IndRT-GCNets)framework to efficiently and accurately capture event attribute information.The framework models the knowledge graph sequences to learn the evolutionary represen-tations of entities and relations within each period.Firstly,by utilizing the temporal graph convolution module in the evolutionary representation unit,the framework captures the structural dependency relationships within the knowledge graph in each period.Meanwhile,to achieve better event representation and establish effective correlations,an independent recurrent neural network is employed to implement auto-regressive modeling.Furthermore,static attributes of entities in the entity-relation events are constrained andmerged using a static graph constraint to obtain optimal entity representations.Finally,the evolution of entity and relation representations is utilized to predict events in the next subsequent step.On multiple real-world datasets such as Freebase13(FB13),Freebase 15k(FB15K),WordNet11(WN11),WordNet18(WN18),FB15K-237,WN18RR,YAGO3-10,and Nell-995,the results of multiple evaluation indicators show that our proposed IndRT-GCNets framework outperforms most existing models on knowledge reasoning tasks,which validates the effectiveness and robustness.