1,444 articles found
Redundancy Design of HVDC Transmission Control and Protection Systems Cited by: 49
1
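Authors: 张望, 黄利军, 郝俊芳, 张爱玲 《电力系统保护与控制》 EI CSCD PKU Core 2009, No. 13, pp. 88-91 (4 pages)
Redundancy is a key concept in HVDC transmission control and protection systems. Combined with self-diagnosis, redundantly configured DC control and protection equipment can prevent loss of function caused by internal equipment faults, greatly improving the operational reliability and availability of the DC transmission system. Based on the overall structure of HVDC control and protection systems, the paper comprehensively analyzes and summarizes the redundancy schemes used in DC control and protection systems in actual projects; the concepts and principles can serve as a reference for the system design of DC control and protection.
Keywords: HVDC transmission; control and protection; redundancy; duplication (dual redundancy); triplication (triple redundancy); switchover logic; two-out-of-three selection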
Widespread Whole Genome Duplications Contribute to Genome Complexity and Species Diversity in Angiosperms Cited by: 43
2
Authors: Ren Ren, Haifeng Wang, Chunce Guo, Ning Zhang, Liping Zeng, Yamao Chen, Hong Ma, Ji Qi 《Molecular Plant》 SCIE CAS CSCD 2018, No. 3, pp. 414-428 (15 pages)
Gene duplications provide evolutionary potential for generating novel functions, while polyploidization or whole genome duplication (WGD) doubles the chromosomes initially and results in hundreds to thousands of retained duplicates. WGDs are strongly supported by evidence commonly found in many species-rich lineages of eukaryotes, and thus are considered a major driving force in species diversification. We performed comparative genomic and phylogenomic analyses of 59 public genomes/transcriptomes and 46 newly sequenced transcriptomes covering major lineages of angiosperms to detect large-scale gene duplication events by surveying tens of thousands of gene family trees. These analyses confirmed most of the previously reported WGDs and provided strong evidence for novel ones in many lineages. The detected WGDs supported a model of exponential gene loss during evolution with an estimated half-life of approximately 21.6 million years, and were correlated with both the emergence of lineages with high degrees of diversification and periods of global climate change. The new datasets and analyses detected many novel WGDs widely spread during angiosperm evolution, uncovered preferential retention of gene functions in essential cellular metabolism, and provided clues to the roles of WGD in promoting angiosperm radiation and enhancing adaptation to environmental changes.
Keywords: whole genome duplication; duplicate gene; polyploidization; angiosperm; phylogenomics
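The exponential gene-loss model mentioned in this abstract can be illustrated with a small calculation. The sketch below assumes the simplest form of such a model, in which the fraction of duplicates retained after t million years is 2^(-t/T) with the reported half-life T of about 21.6 million years. The function name and the single-rate form are illustrative assumptions, not taken from the paper.

```python
HALF_LIFE_MY = 21.6  # estimated half-life quoted in the abstract, in million years


def retained_fraction(t_my: float, half_life_my: float = HALF_LIFE_MY) -> float:
    """Fraction of WGD-derived duplicates still retained after t_my million years,
    assuming plain exponential decay (an illustrative assumption)."""
    return 0.5 ** (t_my / half_life_my)


if __name__ == "__main__":
    # Print the retained fraction at a few sample ages.
    for t in (0.0, 21.6, 43.2, 100.0):
        print(f"{t:6.1f} My -> {retained_fraction(t):.3f}")
```

Under this assumed form, one half-life leaves half of the duplicates and two half-lives leave a quarter.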
Electrophoretic Analysis of Isozymes and Discussion about Species Differentiation in Three Species of Genus Gymnocypris Cited by: 15
3
Authors: 陈毅峰, 何德奎, 陈宜瑜 《Zoological Research》 CAS CSCD PKU Core 2001, No. 1, pp. 9-19 (11 pages)
Using electrophoresis, three isozymes (lactate dehydrogenase, malate dehydrogenase, and esterase) of three species of the genus Gymnocypris from North Tibet were described and analyzed. The results showed that all three isozymes presented interspecific differences and distinct differentiation among individuals in the same population, and that there was no electrophoretic difference between males and females. Analysis of relationships among the three naked carps indicated a high degree of similarity between G. selincuoensis and G. cuoensis, and a low degree between G. selincuoensis and G. namensis. Furthermore, the three isozymes showed expression of null alleles, and the duplicate genes LDH-A2, LDH-B2, s-MDH-A2, and m-MDH-B2 were also expressed in some individuals. Compared with other tetraploid fishes, the three naked carps retained more functional duplicate genes and null alleles. This suggests that fishes of the genus Gymnocypris are at an earlier stage of evolution after polyploidization than fishes of Catostomidae, which is directly related to the later origin of schizothoracine fishes as well as their severe environment.
Keywords: naked carps (Gymnocypris); North Tibet; isozyme electrophoresis; duplicate gene; null allele; species differentiation
Cleaning and Replica Restoration of the Turquoise Dragon-Shaped Object from the Erlitou Site Cited by: 17
4
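Author: 李存信 《中原文物》 PKU Core 2006, No. 4, pp. 92-96, F0002, F0003 (7 pages)
During field excavation, the Erlitou archaeological team unearthed a turquoise-inlaid ornament that attracted worldwide attention. After careful and systematic cleaning, the ornament's original state was revealed fairly completely and its general structure was clarified. On this basis, the ornament was replicated and restored with substitute materials and modern techniques, following its overall form, the relationships between its parts, and the rendering of its various decorative patterns.
Keywords: Erlitou site; turquoise; inlaid ornament; replica restoration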
Design and Implementation of a Distributed Database Management System Cited by: 9
5
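Author: 陈业斌 《安徽工业大学学报(自然科学版)》 CAS 2005, No. 3, pp. 290-292 (3 pages)
The paper analyzes enterprise requirements for distributed systems and proposes a solution under the B/S (browser/server) model. A true three-tier architecture is implemented with ASP, improving the flexibility, interactivity, and reliability of the system; the work has practical significance for the design of related systems.
Keywords: B/S architecture; distributed; node; replica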
Identifying the Sources of Contaminants in Yellow-Spotted Cigarettes Using FAIMS Cited by: 7
6
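Authors: 秦诗棋, 周沅桢, 刘泽, 李萍, 张波, 林婷 《中国烟草学报》 EI CAS CSCD PKU Core 2018, No. 4, pp. 7-15 (9 pages)
To identify the sources of contaminants on flavor-spotted and casing-spotted cigarettes among yellow-spotted cigarettes, high-field asymmetric ion mobility spectrometry (FAIMS) was used. Exploiting the principle that the mobility of different ions varies nonlinearly under strong electric fields (>15000 V/cm), the ion clusters of suspected flavor contaminants on yellow-spotted cigarettes were separated from one another to obtain characteristic three-dimensional scan spectra of the analytes. Image-similarity software (Visual Similarity Duplicate Image Finder) was then used to compare the 3D scan spectra of contaminants on flavor-spotted cigarettes with those of control samples and thereby identify the source of the contaminants. Casing-spotted cigarettes can be identified by the same method after sample preparation. The results show that the FAIMS spectra of contaminants on manually sorted flavor-spotted cigarettes were more than 95% similar to those of artificially made flavor spots, and that the FAIMS spectra of manually sorted wet tobacco clumps were about 95% similar to those of the various artificial contaminants. FAIMS can therefore be used to analyze and identify the sources of surface contaminants on flavor-spotted and casing-spotted cigarettes, and manual visual sorting and classification of yellow-spotted cigarettes is largely accurate.
Keywords: high-field asymmetric ion mobility spectrometry (FAIMS); yellow-spotted cigarette contaminants; sugar casing; flavoring; image-similarity software (Visual Similarity Duplicate Image Finder)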
Multi-Factor Duplicate Question Detection in Stack Overflow Cited by: 5
7
Authors: 张芸, David Lo, 夏鑫, 孙建伶 《Journal of Computer Science & Technology》 SCIE EI CSCD 2015, No. 5, pp. 981-997 (17 pages)
Stack Overflow is a popular on-line question and answer site for software developers to share their experience and expertise. Among the numerous questions posted in Stack Overflow, two or more of them may express the same point and thus are duplicates of one another. Duplicate questions make Stack Overflow site maintenance harder, waste resources that could have been used to answer other questions, and cause developers to unnecessarily wait for answers that are already available. To reduce the problem of duplicate questions, Stack Overflow allows questions to be manually marked as duplicates of others. Since thousands of questions are submitted to Stack Overflow every day, manually identifying duplicate questions is a difficult task. Thus, there is a need for an automated approach that can help in detecting these duplicate questions. To address this need, in this paper we propose an automated approach named DupPredictor that takes a new question as input and detects potential duplicates of this question by considering multiple factors. DupPredictor extracts the title and description of a question and also the tags that are attached to the question. These pieces of information (title, description, and a few tags) are mandatory information that a user needs to input when posting a question. DupPredictor then computes the latent topics of each question by using a topic model. Next, for each pair of questions, it computes four similarity scores by comparing their titles, descriptions, latent topics, and tags. These four similarity scores are finally combined to produce a new similarity score that comprehensively considers the multiple factors.
To examine the benefit of DupPredictor, we perform an experiment on a Stack Overflow dataset which contains a total of more than two million questions. The result shows that DupPredictor can achieve a recall-rate@20 score of 63.8%. We compare our approach with the standard search engine of Stack Overflow, and DupPredictor improves its recall-rat…
Keywords: software information site; duplicate question; Stack Overflow; DupPredictor
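The final step described in this abstract, combining four per-factor similarity scores into one score and ranking candidates by it, can be sketched as follows. The abstract does not say how the scores are combined, so the weighted sum, the equal weights, and all names below are hypothetical placeholders rather than the authors' method.

```python
from typing import Dict, List, Tuple

# Hypothetical equal weights; the abstract does not state how the factors are weighted.
WEIGHTS: Dict[str, float] = {
    "title": 0.25,
    "description": 0.25,
    "topic": 0.25,
    "tag": 0.25,
}


def combined_score(scores: Dict[str, float],
                   weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of the four per-factor similarity scores."""
    return sum(weights[name] * scores[name] for name in weights)


def top_k(candidates: Dict[str, Dict[str, float]], k: int) -> List[Tuple[str, float]]:
    """Rank candidate question ids by combined score, highest first, and keep k."""
    ranked = sorted(
        ((qid, combined_score(s)) for qid, s in candidates.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    # Made-up similarity scores for two candidate questions.
    demo = {
        "q1": {"title": 0.9, "description": 0.8, "topic": 0.7, "tag": 1.0},
        "q2": {"title": 0.1, "description": 0.2, "topic": 0.1, "tag": 0.0},
    }
    print(top_k(demo, 1))
```

A recall-rate@20 evaluation would then ask, for each known duplicate, whether its original question appears in `top_k(candidates, 20)`.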
A Discussion on Convenient and Fast Copying of Medical Records Cited by: 7
8
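Authors: 徐建国, 王晓华, 胡琼, 邹剑 《中国病案》 2009, No. 5, pp. 7-8 (2 pages)
Based on the actual circumstances of medical record copying, this paper proposes the principles that should be upheld, and the specific methods to be applied at discretion, in order to complete medical record copying conveniently and quickly.
Keywords: medical records; copying; principles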
The Method and Practice of Constructing 3D Geological Model from Coalfield Exploration 2D Maps
9
Authors: Hui Su, Qingyuan Li, Duohu Hao, Ke Xiong, Wei Hu, Xinyong Wei, Xuan Zhang 《International Journal of Geosciences》 2023, No. 7, pp. 635-654 (20 pages)
3D geological modeling is an inevitable choice for coal exploration as coal mining shifts toward green, fine, transparent, and intelligent mining. In traditional coalfield exploration geological reports, coal seams and their surrounding rocks are represented spatially as 2D maps. These 2D maps are excellent data sources for constructing 3D geological models of coalfield exploration areas. How to construct 3D models from these 2D maps has long been studied in the coal exploration industry, and no breakthrough has been achieved so far. This paper discusses the principle, method, and software design idea of constructing a 3D geological model of an exploration area from 2D maps made with AutoCAD/MapGIS. First, the paper analyzes the modes of 3D geological surface representation in 3D geological modeling software. It points out that although the contour method has unique advantages in coalfield exploration, TIN (Triangular Irregular Network) is still the standard configuration of 3D modeling software for coalfields. Then, the paper discusses how 2D line features obtain elevation and how 2D curves are upgraded to 3D curves. Next, a semi-automatic partition method is introduced to build the boundary ring of a surface patch: the user clicks and selects line features to build the outer boundary ring of the surface patch. Then, automatic processing of fault lines inside the outer boundary ring is discussed, including construction of the fault ring, determining whether a fault ring is a normal or a reverse fault ring, and an algorithm for handling normal fault rings.
An algorithm for handling reverse fault rings is discussed in detail, and the method of expanding a reverse fault ring and dividing the duplicate area in a reverse fault into two portions is introduced. The paper also discusses the extraction of ridge and valley lines and the construction of fault planes, strata, and coal bodies. The above ideas and methods have been initially implement…
Keywords: coalfield exploration; 3D geological modeling; semi-automatic partition; partition triangulation; reverse fault; duplicate area triangulation
BIGpre: A Quality Assessment Package for Next-Generation Sequencing Data Cited by: 4
10
Authors: Tongwu Zhang, Yingfeng Luo, Kan Liu, Linlin Pan, Bing Zhang, Jun Yu, Songnlan Hu 《Genomics, Proteomics & Bioinformatics》 SCIE CAS CSCD 2011, No. 6, pp. 238-244 (7 pages)
The emergence of next-generation sequencing (NGS) technologies has significantly improved sequencing throughput and reduced costs. However, the short read length, duplicate reads, and massive volume of data make data processing much more difficult and complicated than with first-generation sequencing technology. Although some software packages have been developed to assess data quality, those packages either are not easily available to users or require bioinformatics skills and computer resources. Moreover, almost none of the quality assessment software currently available takes sequencing errors into account when assessing duplicates in NGS data. Here, we present a new user-friendly quality assessment software package called BIGpre, which works for both Illumina and 454 platforms. BIGpre contains all the functions of other quality assessment software, such as the correlation between forward and reverse reads, read GC-content distribution, and base Ns quality. More importantly, BIGpre incorporates associated programs to detect and remove duplicate reads after taking sequencing errors into account, and to trim low-quality reads from raw data as well. BIGpre is primarily written in Perl and integrates graphical capability from the statistics package R. This package produces both tabular and graphical summaries of data quality for sequencing datasets from Illumina and 454 platforms. Processing hundreds of millions of reads within minutes, this package provides immediate diagnostic information for users to manipulate sequencing data for downstream analyses. BIGpre is freely available at http://bigpre.sourceforge.net/.
Keywords: next-generation sequencing; quality assessment; duplicate reads; sequencing error
Evidence-based literature review: De-duplication a cornerstone for quality
11
Authors: Barbara Hammer, Elettra Virgili, Federico Bilotta 《World Journal of Methodology》 2023, No. 5, pp. 390-398 (9 pages)
Evidence-based literature reviews play a vital role in contemporary research, facilitating the synthesis of knowledge from multiple sources to inform decision-making and scientific advancements. Within this framework, de-duplication emerges as a part of the process for ensuring the integrity and reliability of evidence extraction. This opinion review delves into the evolution of de-duplication, highlights its importance in evidence synthesis, explores various de-duplication methods, discusses evolving technologies, and proposes best practices. By addressing ethical considerations, this paper emphasizes the significance of de-duplication as a cornerstone for quality in evidence-based literature reviews.
Keywords: duplicate publications as topic; databases, bibliographic; artificial intelligence; systematic reviews as topic; review literature as topic; de-duplication; duplicate references; reference management software
Replication Server Technology in Distributed Database Systems Cited by: 3
12
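Authors: 张锐, 张翔 《武汉理工大学学报(信息与管理工程版)》 CAS 2003, No. 5, pp. 24-26 (3 pages)
In large-scale distributed database systems, database replicators play a critical role in system reliability and efficiency. The replication server is a revolutionary approach to the inherent problems of maintaining distributed data and managing distributed transactions. The paper presents the basic idea and processing method of the replication server, analyzes its characteristics and working process, and briefly introduces applications of replication server technology in distributed database systems.
Keywords: distributed database system; replication technology; replication server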
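On the System of Authentication Rules for Electronic Evidence: Against the Background of the Revision of the Civil Procedure Law Cited by: 6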
13
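Author: 刘显鹏 《大连理工大学学报(社会科学版)》 CSSCI 2013, No. 2, pp. 87-91 (5 pages)
Electronic evidence comprises all evidentiary materials that exist in electronic form, are presented by means of electronic devices, and can serve as a basis for a court's findings of fact. In terms of setting, a court may authenticate electronic evidence either in court or out of court. The general rules of authentication mainly concern judging the admissibility and reliability of electronic evidence: the former asks whether the evidence meets specific formal requirements, and the latter asks whether its content is true and reliable. The special rules mainly concern accurately understanding and determining the validity of copies of electronic evidence.
Keywords: electronic evidence; authentication; copies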
Two Cases of the "Popularization of Art": Interpreting Works by Andy Warhol and Joseph Beuys Cited by: 5
14
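Author: 吕彤 《天津大学学报(社会科学版)》 CSSCI 2008, No. 2, pp. 164-168 (5 pages)
Andy Warhol (1928-1987) and Joseph Beuys (1921-1986) were both standard-bearers of American and European art in the second half of the 20th century and focal points of art criticism, yet commentary on them is often blind and arbitrary. Starting from social, public, and artistic factors, the paper analyzes their art in depth and, through two of their works, interprets the two artists objectively, revealing the close connections among the constituent elements, methods, and meaning of contemporary artworks.
Keywords: Andy Warhol; Joseph Beuys; Pop Art; popularization of art; elements; reproduction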
Factors Affecting the Quality of Medical Record Copying Services Cited by: 6
15
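Authors: 张淑贞, 史素丽, 杜伟, 王永超 《中国病案》 2009, No. 2, pp. 7-8 (2 pages)
As an important public-facing service window, the medical records department provides patients with medical record copying, and the quality of this service directly affects doctor-patient relations and the hospital's image. Good copying service depends not only on the overall competence of the department's copying staff, but also on the attention of hospital leadership, the hospital's overall level of management, and standardized management of every stage of medical record handling across the hospital.
Keywords: medical record management; copying; service quality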
Research on the Storage Integrity of Tenants' Multi-Replica Data under the Software-as-a-Service Model Cited by: 6
16
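Authors: 李琳, 钱进, 张永新, 丁艳辉, 孔兰菊 《南京大学学报(自然科学版)》 CAS CSCD PKU Core 2016, No. 2, pp. 324-334 (11 pages)
Under the multi-tenant shared storage model of cloud Software as a Service (SaaS), a malicious service provider may forge, delete, or tamper with the data replicas that tenants have customized for storage. Taking into account the characteristics of multi-tenant shared storage and tenants' privacy and isolation requirements, the paper proposes a tenant-oriented duplication integrity checking scheme (TDIC). TDIC reduces the cost of generating verification objects by periodically and randomly sampling tenants' replica tuples. To accommodate dynamic updates of tenant data, a tenant duplication authentication structure (TDAS) is built; TDAS isolates the replica verification information of different tenants on each data node, ensuring that the verification of tenant replicas is isolated. By combining homomorphic tags of tenant tuples with TDAS, TDIC can delegate sampling checks of tenant replicas to a trusted third party without revealing the content of tenant data. Analysis shows that when a tenant's logical view contains ten thousand data tuples and the tuple corruption rate is 1%, the number of random samples needed to discover that data has been corrupted is at most about 5% of the total number of tuples, which effectively reduces system resource consumption compared with verifying everything.
Keywords: Software as a Service (SaaS); multi-tenancy; data replicas; integrity protection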
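The sampling figure quoted in this abstract (10,000 tuples, 1% corrupted, about 5% sampled) can be checked with a standard hypergeometric argument. The sketch below assumes uniform random sampling without replacement and is not the paper's own analysis; the abstract does not state the confidence level behind "about 5%", and the function name is illustrative.

```python
from math import comb


def detection_probability(total: int, corrupted: int, sampled: int) -> float:
    """Probability that a uniform random sample (without replacement) of
    `sampled` tuples contains at least one of the `corrupted` tuples
    among `total` tuples."""
    if sampled > total - corrupted:
        return 1.0  # the sample is too large to avoid every corrupted tuple
    # Chance that every sampled tuple is intact (hypergeometric "zero hits").
    miss = comb(total - corrupted, sampled) / comb(total, sampled)
    return 1.0 - miss


if __name__ == "__main__":
    # 10,000 tuples, 1% corrupted, 5% sampled, as in the abstract's example.
    print(f"{detection_probability(10_000, 100, 500):.4f}")
```

Under these assumptions, sampling 500 of 10,000 tuples detects a 1% corruption with probability above 99%, which is consistent with the abstract's claim that a small sample suffices.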
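Analysis of Generic Drugs That Have Passed Consistency Evaluation, from the Perspective of the National Essential Medicines List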
17
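Authors: 王金石, 赵紫楠, 李可欣, 张天齐, 朱柏霖, 纪立伟, 金鹏飞, 胡欣, 罗小峰, 赵飞 《临床药物治疗杂志》 2024, No. 5, pp. 39-43 (5 pages)
Objective: To examine the status of essential medicines with generics that have passed consistency evaluation in China, and the current state of duplicate generic development. Methods: Product information on generic drugs that passed consistency evaluation, as published by the National Medical Products Administration between August 1, 2017 and September 1, 2023, was collected and organized; the duplication of these generics and their distribution and changes within the 2018 edition of the National Essential Medicines List were analyzed statistically. Results: A total of 7431 generic products from different manufacturers were collected, covering 2818 specifications and 1170 varieties across 26 pharmacological categories and 41 dosage forms. Of 438 essential medicines, 135 (30.8%) had passed generics for all specifications, 104 (23.7%) for some specifications, and 199 (45.4%) for none. The essential medicines comprised 1128 specifications in total, of which 614 (54.4%) had no passed generics and 514 (45.6%) had passed generics. For 326 (63.4%) of these specifications the number of duplicate passed generic products was 1 to 3, for 79 (15.4%) it was 4 to 6, and for 109 (21.2%) it was 7 or more. Conclusion: Generic development for essential medicines in China remains uneven, and excessive duplication of passed generics is serious for some essential medicines.
Keywords: National Essential Medicines List; generic drugs; consistency evaluation; duplicate generic development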
A Brief Analysis of Inpatient Medical Record Copying, 2006-2010 Cited by: 6
18
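Authors: 刘欢, 郭志超, 吴新竹 《中国病案》 2011, No. 7, pp. 13-14 (2 pages)
Objective: To analyze inpatient medical record copying at the hospital in order to strengthen medical record quality management. Methods: Statistics were compiled on 17,200 medical record copying requests over the past 5 years. Results: The volume of inpatient record copying increased year by year; copies were used mainly for medical insurance reimbursement and commercial insurance (61.94%), treatment at other institutions (18.38%), and medical certificates and other purposes (19.67%). Conclusion: Demand for record copying services keeps growing, which indicates that standardized workflows should be strictly followed, records archived promptly, the quality of record writing improved, and work experience carefully reviewed. Strengthening the legal and service awareness of records staff, building a good image, improving communication with patients, and making things convenient for patients are the keys to doing record copying well.
Keywords: medical records; copying; utilization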
Improved Approximate Detection of Duplicates for Data Streams Over Sliding Windows Cited by: 3
19
Authors: 沈鸿, 张育 《Journal of Computer Science & Technology》 SCIE EI CSCD 2008, No. 6, pp. 973-987 (15 pages)
Detecting duplicates in data streams is an important problem that has a wide range of applications. In general, precisely detecting duplicates in an unbounded data stream is not feasible in most streaming scenarios, and, on the other hand, the elements in data streams are always time sensitive. This makes it particularly significant to approximately detect duplicates among newly arrived elements of a data stream within a fixed time frame. In this paper, we present a novel data structure, the Decaying Bloom Filter (DBF), as an extension of the Counting Bloom Filter, that effectively removes stale elements as new elements continuously arrive over sliding windows. On the basis of the DBF we present an efficient algorithm to approximately detect duplicates over sliding windows. Our algorithm may produce false positive errors, but not false negative errors as in many previous results. We analyze the time complexity and detection accuracy, and give a tight upper bound on the false positive rate. For a given space of G bits and sliding window size W, our algorithm has an amortized time complexity of O(√G/W). Both analytical and experimental results on synthetic data demonstrate that our algorithm is superior in both execution time and detection accuracy to the previous results.
Keywords: data stream; duplicate detection; Bloom filter; approximate query; sliding window
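The idea of a filter whose entries decay as new elements arrive can be sketched in a few lines. The code below is a deliberately naive illustration, not the paper's optimized algorithm: the class name, method names, the SHA-256 based hashing, and the per-arrival pass over every cell are all assumptions made for readability, and the naive aging step does not achieve the amortized complexity stated in the abstract.

```python
import hashlib
from typing import List


class DecayingBloomFilter:
    """Naive decaying-counter filter over a count-based sliding window."""

    def __init__(self, num_cells: int, num_hashes: int, window: int) -> None:
        self.cells: List[int] = [0] * num_cells
        self.num_hashes = num_hashes
        self.window = window

    def _positions(self, item: str) -> List[int]:
        # Derive one cell index per hash function from salted SHA-256 digests.
        return [
            int.from_bytes(
                hashlib.sha256(f"{i}:{item}".encode()).digest()[:8], "big"
            ) % len(self.cells)
            for i in range(self.num_hashes)
        ]

    def observe(self, item: str) -> bool:
        """Process one arrival; return True if the item looks like a duplicate
        of something seen among the previous `window - 1` arrivals."""
        # Age every live counter by one arrival (O(num_cells), kept naive).
        self.cells = [c - 1 if c > 0 else 0 for c in self.cells]
        positions = self._positions(item)
        duplicate = all(self.cells[p] > 0 for p in positions)
        for p in positions:
            self.cells[p] = self.window  # (re)start this item's lifetime
        return duplicate


if __name__ == "__main__":
    dbf = DecayingBloomFilter(num_cells=1024, num_hashes=3, window=3)
    for x in ["a", "b", "a"]:
        print(x, dbf.observe(x))
```

Because a cell is only ever reset upward to the window size, an item that reappears inside the window is always flagged (no false negatives), while hash collisions with other live items can cause false positives, matching the error behavior the abstract describes.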
Duplicate identification model for deep web Cited by: 4
20
Authors: 刘丽楠, 寇月, 孙高尚, 申德荣, 于戈 《Journal of Southeast University(English Edition)》 EI CAS 2008, No. 3, pp. 315-317 (3 pages)
A duplicate identification model is presented to deal with semi-structured or unstructured data extracted from multiple data sources in the deep web. First, the extracted data is converted into entity records in the data preprocessing module; then, the heterogeneous records processing module calculates the similarity degree of the entity records to obtain the duplicate records, based on the weights calculated in the homogeneous records processing module. Unlike traditional methods, the proposed approach is implemented without schema matching in advance. Multiple estimators with selective algorithms are adopted to reach a better matching efficiency. The experimental results show that the duplicate identification model is feasible and efficient.
Keywords: duplicate records; deep web; data cleaning; semi-structured data