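Query translation of out-of-vocabulary (OOV) terms is one of the key factors affecting the performance of cross-language information retrieval (CLIR). Based on the data structure and language features of Wikipedia, the proposed approach divides translation environments into target-present and target-absent environments. To address the difficulty of translation mining in the target-absent environment, it uses frequency change information and adjacency information to extract candidate units, and builds a hybrid translation mining strategy that combines a frequency-distance model, surface pattern matching templates, and a summary score model. The experiments take search-engine-based OOV mining as the baseline and evaluate with TOP1. The results show that the Wikipedia-based hybrid translation mining method reaches a translation accuracy of 0.6822, a 6.98% improvement over the baseline.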
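The abstract names a frequency-distance model but does not define it. The sketch below is only one plausible reading, given as an illustration and not as the authors' formula: each occurrence of a candidate contributes 1/(1+d), where d is the token distance to the nearest occurrence of the source term, so candidates that appear often and close to the source term score highest. The function name, the scoring formula, and the sample tokens are all assumptions.

```python
def frequency_distance_scores(tokens, source_term, candidates):
    """Score candidates by frequency weighted by closeness to the source term.

    Hypothetical scoring: every occurrence of a candidate adds 1 / (1 + d),
    where d is the token distance to the nearest source-term occurrence.
    """
    src_positions = [i for i, t in enumerate(tokens) if t == source_term]
    scores = {cand: 0.0 for cand in candidates}
    if not src_positions:
        return scores  # source term absent: nothing to measure distance against
    for i, tok in enumerate(tokens):
        if tok in scores:
            d = min(abs(i - p) for p in src_positions)
            scores[tok] += 1.0 / (1 + d)
    return scores


# Invented sample snippet; real input would be tokenized page text.
tokens = ["伦敦", "(", "London", ")", "is", "the", "capital", "of",
          "England", ".", "London", "lies", "on", "the", "Thames"]
scores = frequency_distance_scores(tokens, "伦敦", ["London", "England", "Thames"])
top1 = max(scores, key=scores.get)  # TOP1 candidate, as in the evaluation setting
```

Taking the single best-scoring candidate mirrors the TOP1 evaluation mentioned in the abstract; the surface templates and summary score model would contribute further evidence that this sketch does not cover.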
Personalized medicine is the development of “tailored” therapies that reflect traditional medical approaches while incorporating the patient’s unique genetic profile and the environmental basis of the disease. These individualized strategies encompass disease prevention and diagnosis as well as treatment. Today’s healthcare workforce faces massive amounts of patient- and disease-related data. When mined effectively, these data will help produce more efficient and effective diagnoses and treatment, leading to better prognoses for patients at both the individual and population level. Designing preventive and therapeutic interventions for those patients who will benefit most, while minimizing side effects and controlling healthcare costs, requires bringing diverse data sources together in an analytic paradigm. The development and application of personalized medicine is largely facilitated, perhaps even driven, by the analysis of “big data”. For example, clinical data warehouses are a significant resource for clinicians practicing personalized medicine. Clinicians can query these “big data” repositories with specific questions and use the results to understand challenges in patient care and treatment. Health informaticians are critical partners in data analytics, contributing technological infrastructures and predictive data mining strategies to access data from multiple sources, and assisting clinicians in interpreting data and developing personalized, targeted therapy recommendations.
In this paper, we look at the concept of personalized medicine, offering perspectives on four important, influential topics: 1) the availability of “big data” and the role of biomedical informatics in personalized medicine, 2) the need for interdisciplinary teams in the development and evaluation of personalized therapeutic approaches, and 3) the impact of electronic medical record systems
Query translation mining is a key technique in cross-language information retrieval and machine translation knowledge acquisition. For better performance, queries are classified into transliterated and non-transliterated words by a transliterated word identification model and are then channeled to different mining processes. This paper is a pilot study on query classification for better translation mining performance, based on supervised classification and linguistic heuristics. Person name identification achieves a precision of over 97%, and transliterated word translation mining shows satisfactory performance.
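To address the data sparseness and knowledge acquisition problems that hamper word sense and translation disambiguation, a Web-based disambiguation method using n-gram statistical language models is proposed. Starting from the hypothesis that the sense of a word corresponds to its n-gram language model, the method first uses HowNet to map the English translations of a Chinese ambiguous word to HowNet DEFs and to obtain the set of words under each DEF. It then queries the Web through a search engine, uses the results to compute the occurrence probability of the n-grams of the words in each DEF, and makes the disambiguation decision. On the test set of the Multilingual Chinese English Lexical Sample Task in SemEval-2007, the method achieves a Pmar of 55.9%, which is 12.8% higher than the best unsupervised system that participated in the task.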
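The decision step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the two hit-count tables stand in for counts that a web search engine would return, the sense labels and word sets are invented placeholders for HowNet DEFs, bigrams are used as the n-grams, and the add-one style smoothing is an assumption.

```python
def sense_score(words, context, bigram_hits, unigram_hits):
    """Estimate P(context | sense) from pooled hit counts of the sense's words."""
    joint = sum(bigram_hits.get((w, context), 0) for w in words)
    base = sum(unigram_hits.get(w, 0) for w in words)
    return (joint + 1) / (base + 2)  # assumed smoothing to avoid zero division


def disambiguate(sense_to_words, context, bigram_hits, unigram_hits):
    """Return the sense whose word set best predicts the context word."""
    return max(
        sense_to_words,
        key=lambda s: sense_score(sense_to_words[s], context,
                                  bigram_hits, unigram_hits),
    )


# Hypothetical DEF labels and word sets (placeholders, not real HowNet entries).
sense_to_words = {
    "DEF_finance": ["bank", "lender"],
    "DEF_land": ["shore", "riverside"],
}
# Invented hit counts standing in for search-engine results.
unigram_hits = {"bank": 10000, "lender": 2000, "shore": 8000, "riverside": 3000}
bigram_hits = {("bank", "loan"): 900, ("lender", "loan"): 300,
               ("shore", "loan"): 2, ("riverside", "loan"): 1}

chosen = disambiguate(sense_to_words, "loan", bigram_hits, unigram_hits)
```

Pooling counts over all words of a DEF is what lets sparse evidence for one translation be backed up by its sense neighbours, which is the data-sparseness motivation given in the abstract.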
Translators often run into trouble in Mining English translation practice when guided by traditional translation principles. A suitable theory is needed to serve as a guiding principle for translation practice. This paper studies the application of functional translation theories to Mining English translation, especially text typology and skopos theory. It presents preliminary research on Mining English translation techniques, drawing on specific examples from translation practice, and summarizes the techniques suited to Mining English translation.