Abstract
Big data has brought language research a large volume of new kinds of natural language material, but this web-based, unstructured electronic data presents both opportunities and challenges for corpus development. By outlining the basic steps of corpus development, reviewing existing development software, and introducing new technical tools, this paper identifies three current trends in corpus development: in terms of tools, a shift from stand-alone software to web-based applications; in terms of purpose, the integration of corpus development with corpus-based analysis functions; and in terms of application, a move toward big data applications.
Source
Foreign Languages and Translation (《外语与翻译》)
2018, No. 4, pp. 39-44, 99 (7 pages in total)
Funding
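This paper is an interim research outcome of the following projects:
Guangdong Province Joint Project for the Introduction of High-Level Talent, "Theory and Model Construction of Cognitive Translation Studies" (Project No. GWTP-YJ-2015-07)
Center for Translation Studies, Guangdong University of Foreign Studies, 2015 Key Project, "Research on the Disciplinary Theoretical Foundations and System Construction of Cognitive Translation Studies" (Project No. CTS201503)
Center for Translation Studies, Guangdong University of Foreign Studies, 2017 Key Tendered Project, "Research on the Translation Process from the Perspective of Cognitive Translation Studies" (Project No. CTS201702A)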