The rapid development of high-throughput sequencing techniques has led biology into the big-data era. Data analyses using various bioinformatics tools rely on programming and command-line environments, which are challenging and time-consuming for most wet-lab biologists. Here, we present TBtools (a Toolkit for Biologists integrating various biological data-handling tools), a stand-alone software with a user-friendly interface. The toolkit incorporates over 130 functions, which are designed to meet the increasing demand for big-data analyses, ranging from bulk sequence processing to interactive data visualization. A wide variety of graphs can be prepared in TBtools using a new plotting engine ("JIGplot") developed to maximize their interactive ability; this engine allows quick point-and-click modification of almost every graphic feature. TBtools is platform-independent software that can be run under all operating systems with Java Runtime Environment 1.6 or newer. It is freely available to non-commercial users at https://github.com/CJ-Chen/TBtools/releases.
With the popularization of the Internet, permeation of sensor networks, emergence of big data, increase in size of the information community, and interlinking and fusion of data and information throughout human society, physical space, and cyberspace, the information environment related to the current development of artificial intelligence (AI) has profoundly changed. AI faces important adjustments, and scientific foundations are confronted with new breakthroughs, as AI enters a new stage: AI 2.0. This paper briefly reviews the 60-year developmental history of AI, analyzes the external environment promoting the formation of AI 2.0 along with changes in goals, and describes both the beginning of the technology and the core idea behind AI 2.0 development. Furthermore, based on combined social demands and the information environment that exists in relation to Chinese development, suggestions on the development of AI 2.0 are given.
Since the official release of the stand-alone bioinformatics toolkit TBtools in 2020, its superior functionality in data analysis has been demonstrated by its widespread adoption by many thousands of users and references in more than 5000 academic articles. Now, TBtools is a commonly used tool in biological laboratories. Over the past 3 years, thanks to invaluable feedback and suggestions from numerous users, we have optimized and expanded the functionality of the toolkit, leading to the development of an upgraded version, TBtools-II. In this upgrade, we have incorporated over 100 new features, such as those for comparative genomics analysis, phylogenetic analysis, and data visualization. Meanwhile, to better meet the increasing needs of personalized data analysis, we have launched the plugin mode, which enables users to develop their own plugins and manage their selection, installation, and removal according to individual needs. To date, the plugin store has amassed over 50 plugins, with more than half of them independently developed and contributed by TBtools users. These plugins offer a range of data analysis options, including co-expression network analysis, single-cell data analysis, and bulked segregant analysis sequencing data analysis. Overall, TBtools is now transforming from a stand-alone software into a comprehensive bioinformatics platform of a vibrant and cooperative community in which users are also developers and contributors. By promoting the theme "one for all, all for one", we believe that TBtools-II will greatly benefit more biological researchers in this big-data era.
Big data is a revolutionary innovation that has allowed the development of many new methods in scientific research. This new way of thinking has encouraged the pursuit of new discoveries. Big data occupies the strategic high ground in the era of knowledge economies and also constitutes a new national and global strategic resource. "Big Earth data", derived from, but not limited to, Earth observation, has macro-level capabilities that enable rapid and accurate monitoring of the Earth, and is becoming a new frontier contributing to the advancement of Earth science and significant scientific discoveries. Within the context of the development of big data, this paper analyzes the characteristics of scientific big data and recognizes its great potential for development, particularly with regard to the role that big Earth data can play in promoting the development of Earth science. On this basis, the paper outlines the Big Earth Data Science Engineering Project (CASEarth) of the Chinese Academy of Sciences Strategic Priority Research Program. Big data is at the forefront of the integration of geoscience, information science, and space science and technology, and it is expected that big Earth data will provide new prospects for the development of Earth science.
The advance of nanophotonics has provided a variety of avenues for light–matter interaction at the nanometer scale through the enriched mechanisms for physical and chemical reactions induced by nanometer-confined optical probes in nanocomposite materials. These emerging nanophotonic devices and materials have enabled researchers to develop disruptive methods of tremendously increasing the storage capacity of current optical memory. In this paper, we present a review of recent advancements in nanophotonics-enabled optical storage techniques. In particular, we offer our perspective on using them as optical storage arrays for next-generation exabyte data centers.
With the rapid development of sequencing technologies towards higher throughput and lower cost, sequence data are generated at an unprecedentedly explosive rate. To provide an efficient and easy-to-use platform for managing huge sequence data, here we present the Genome Sequence Archive (GSA; http://bigd.big.ac.cn/gsa or http://gsa.big.ac.cn), a data repository for archiving raw sequence data. In compliance with the data standards and structures of the International Nucleotide Sequence Database Collaboration (INSDC), GSA adopts four data objects (BioProject, BioSample, Experiment, and Run) for data organization, accepts raw sequence reads produced by a variety of sequencing platforms, stores both sequence reads and metadata submitted from all over the world, and makes all these data publicly available to worldwide scientific communities. In the era of big data, GSA is not only an important complement to existing INSDC members, alleviating the increasing burdens of handling the sequence data deluge, but also takes on the significant responsibility of global big data archiving and provides free, unrestricted access to all publicly available data in support of research activities throughout the world.
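The four INSDC-style data objects form a simple containment hierarchy: a BioProject groups BioSamples, an Experiment ties a sample to a sequencing setup, and a Run holds the actual read files. A minimal sketch of that organization in Python dataclasses follows; the field names and accession formats here are illustrative assumptions, not GSA's actual submission schema:

```python
from dataclasses import dataclass, field
from typing import List

# Illustrative containment hierarchy for the four INSDC-style data
# objects; field names and accessions are assumptions, not the real schema.

@dataclass
class Run:
    accession: str
    files: List[str]                 # raw read files (e.g. FASTQ)

@dataclass
class Experiment:
    accession: str
    platform: str                    # sequencing platform used
    runs: List[Run] = field(default_factory=list)

@dataclass
class BioSample:
    accession: str
    organism: str
    experiments: List[Experiment] = field(default_factory=list)

@dataclass
class BioProject:
    accession: str
    title: str
    samples: List[BioSample] = field(default_factory=list)

# Assemble one project with a single sample, experiment, and run.
project = BioProject("PRJX0001", "Example resequencing project")
sample = BioSample("SAMX0001", "Oryza sativa")
exp = Experiment("EXPX0001", "Illumina NovaSeq")
exp.runs.append(Run("RUNX0001", ["reads_R1.fastq.gz", "reads_R2.fastq.gz"]))
sample.experiments.append(exp)
project.samples.append(sample)
```

The point of the nesting is that metadata entered once at the project or sample level does not need to be repeated for every run.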
With the rapid development of mechanical equipment, the mechanical health monitoring field has entered the era of big data. However, manual feature extraction has the disadvantages of low efficiency and poor accuracy when handling big data. In this study, the research object was the asynchronous motor in a drivetrain diagnostics simulator system. The vibration signals of different fault motors were collected. The raw signal was pretreated using the short-time Fourier transform (STFT) to obtain the corresponding time-frequency map. Then, features of the time-frequency map were adaptively extracted using a convolutional neural network (CNN). The effects of the pretreatment method and of the network hyperparameters on diagnostic accuracy were investigated experimentally. The experimental results showed that the influence of the preprocessing method is small, and that the batch size is the main factor affecting accuracy and training efficiency. Feature visualization showed that, in the case of big data, the extracted CNN features can represent complex mapping relationships between signal and health status, and can also overcome the requirement for prior knowledge and engineering experience in feature extraction that characterizes traditional diagnosis methods. This paper proposes a new method, based on STFT and CNN, which can complete motor fault diagnosis tasks more intelligently and accurately.
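To illustrate the preprocessing step: an STFT slides a window along the vibration signal and takes a discrete Fourier transform of each frame, producing the time-frequency map that is fed to the CNN. The pure-Python sketch below uses a naive DFT with no window function, and its frame length and hop size are arbitrary assumptions; it does not reproduce the paper's actual STFT settings or the CNN itself:

```python
import cmath
import math

def stft(signal, frame_len=64, hop=32):
    """Naive short-time Fourier transform: returns one magnitude
    spectrum per frame (together forming the time-frequency map)."""
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        spectrum = []
        for k in range(frame_len // 2):      # keep non-negative frequencies
            s = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n, x in enumerate(frame))
            spectrum.append(abs(s))
        frames.append(spectrum)
    return frames

# Synthetic "vibration" signal: a tone with 5 cycles per 64-sample frame.
sig = [math.sin(2 * math.pi * 5 * n / 64) for n in range(256)]
tf_map = stft(sig)
peak_bin = max(range(len(tf_map[0])), key=lambda k: tf_map[0][k])
print(peak_bin)  # → 5: the tone's energy lands in frequency bin 5
```

In practice the map would be computed with an FFT-based routine and a proper window; the sketch only shows why the result is a 2-D array (time frames by frequency bins) suited to convolutional feature extraction.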
Big data is a strategic highland in the era of knowledge-driven economies, and it is also a new type of strategic resource for all nations. Big data collected from space for Earth observation, so-called Big Earth Data, is creating new opportunities for the Earth sciences and revolutionizing the innovation of methodologies and thought patterns. It has the potential to advance in-depth development of the Earth sciences and bring more exciting scientific discoveries. The Academic Divisions of the Chinese Academy of Sciences Forum on Frontiers of Science and Technology for Big Earth Data from Space was held in Beijing in June 2015. The forum analyzed the development of Earth observation technology and big data, explored the concepts and scientific connotations of Big Earth Data from space, discussed the correlation between Big Earth Data and Digital Earth, and dissected the potential of Big Earth Data from space to promote scientific discovery in the Earth sciences, especially concerning global changes.
At present, it is projected that about 4 zettabytes (4 × 10^21 bytes) of digital data are being generated per year by everything from underground physics experiments to retail transactions to security cameras to global positioning systems. In the U.S., major research programs are being funded to deal with big data in all five sectors (i.e., services, manufacturing, construction, agriculture, and mining) of the economy. Big Data is a term applied to data sets whose size is beyond the ability of available tools to undertake their acquisition, access, analytics, and/or application in a reasonable amount of time. Whereas Tien (2003) forewarned about the data rich, information poor (DRIP) problems that have been pervasive since the advent of large-scale data collections or warehouses, the DRIP conundrum has been somewhat mitigated by the Big Data approach, which has unleashed information in a manner that can support informed - yet not necessarily defensible or valid - decisions or choices. Thus, by somewhat overcoming data quality issues with data quantity, data access restrictions with on-demand cloud computing, causative analysis with correlative data analytics, and model-driven with evidence-driven applications, appropriate actions can be undertaken with the obtained information. New acquisition, access, analytics, and application technologies are being developed to further Big Data as it is employed to help resolve the 14 grand challenges (identified by the National Academy of Engineering in 2008), underpin the 10 breakthrough technologies (compiled by the Massachusetts Institute of Technology in 2013), and support the Third Industrial Revolution of mass customization.
Over the past decade, traditional Chinese medicine (TCM) has widely embraced systems biology and its various data integration approaches to promote its modernization. Thus, integrative pharmacology-based traditional Chinese medicine (TCMIP) was proposed as a paradigm shift in TCM. This review focuses on the presentation of this novel concept and the main research contents, methodologies, and applications of TCMIP. First, TCMIP is an interdisciplinary science that can establish qualitative and quantitative pharmacokinetic-pharmacodynamic (PK-PD) correlations through the integration of knowledge from multiple disciplines and techniques and from different PK-PD processes in vivo. Then, the main research contents of TCMIP are introduced as follows: chemical and ADME/PK profiles of TCM formulas; confirming the three forms of active substances and the three action modes; establishing the qualitative PK-PD correlation; and building the quantitative PK-PD correlations, etc. After that, we summarize the existing data resources, computational models, and experimental methods of TCMIP and highlight the urgent need to establish mathematical modeling and experimental methods. Finally, we further discuss the applications of TCMIP for the improvement of TCM quality control, clarification of the molecular mechanisms underlying the actions of TCMs, and discovery of potential new drugs, especially TCM-related combination drug discovery.
The outbreak of the 2019 novel coronavirus disease (COVID-19) has caused more than 100,000 infections and thousands of deaths. Currently, the numbers of infections and deaths are still increasing rapidly. COVID-19 seriously threatens human health, production, life, social functioning, and international relations. In the fight against COVID-19, Geographic Information Systems (GIS) and big data technologies have played an important role in many aspects, including the rapid aggregation of multi-source big data, rapid visualization of epidemic information, spatial tracking of confirmed cases, prediction of regional transmission, spatial segmentation of the epidemic risk and prevention level, balancing and management of the supply and demand of material resources, and social-emotional guidance and panic elimination, which provided solid spatial information support for decision-making, measure formulation, and effectiveness assessment of COVID-19 prevention and control. GIS has developed and matured relatively quickly and has a complete technological route for data preparation, platform construction, model construction, and map production. However, for the struggle against a widespread epidemic, the main challenge is finding strategies to adjust traditional technical methods and improve the speed and accuracy of information provision for social management. At the data level, in the era of big data, data no longer come mainly from the government but are gathered from more diverse enterprises. As a result, the use of GIS faces difficulties in data acquisition and the integration of heterogeneous data, which requires governments, businesses, and academic institutions to jointly promote the formulation of relevant policies. At the technical level, spatial analysis methods for big data are in the ascendancy. Currently, and for a long time in the future, the development of GIS should be strengthened to form a data-driven system for rapid knowledge acquisition, which signifies that GIS should be used to reinforce the social operation parameterization of models and …
Materials development has historically been driven by human needs and desires, and this is likely to continue in the foreseeable future. The global population is expected to reach ten billion by 2050, which will promote increasingly large demands for clean and high-efficiency energy, personalized consumer products, secure food supplies, and professional healthcare. New functional materials that are made and tailored for targeted properties or behaviors will be the key to tackling this challenge. Traditionally, advanced materials are found empirically or through experimental trial-and-error approaches. As big data generated by modern experimental and computational techniques becomes more readily available, data-driven or machine learning (ML) methods have opened new paradigms for the discovery and rational design of materials. In this review article, we provide a brief introduction to various ML methods and related software or tools. Main ideas and basic procedures for employing ML approaches in materials research are highlighted. We then summarize recent important applications of ML for the large-scale screening and optimal design of polymer and porous materials, catalytic materials, and energetic materials. Finally, concluding remarks and an outlook are provided.
Big Personal Data is growing explosively. Consequently, an increasing number of internet users are drowning in a sea of data. Big Personal Data has enormous commercial value; it is a new kind of data asset. An urgent problem has thus arisen in the data market: how to price Big Personal Data fairly and reasonably. This paper proposes a pricing model for Big Personal Data based on tuple granularity, with the help of a comparative analysis of existing data pricing models and strategies. The model implements positive rating and reverse pricing for Big Personal Data by investigating the data attributes that affect data value, and by analyzing how the value of data tuples varies with information entropy, weight value, data reference index, cost, and other factors. The model can be adjusted dynamically according to these parameters. With increases in data scale, reductions in its cost, and improvements in its quality, Big Personal Data users can thereby obtain greater benefits.
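The core intuition, that a tuple's value should scale with how informative its attributes are, can be illustrated with Shannon entropy. The sketch below is a toy interpretation under assumed parameters (the per-attribute weights, the base price, and the linear combination are all illustrative choices); it is not the paper's actual pricing model:

```python
import math
from collections import Counter

def attribute_entropy(values):
    """Shannon entropy (in bits) of one attribute's value distribution."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def tuple_price(table, weights, base_price=1.0):
    """Toy tuple-granularity price: base price scaled by a weighted sum
    of per-attribute entropies (weights and base price are assumptions)."""
    columns = {attr: [row[attr] for row in table] for attr in weights}
    score = sum(w * attribute_entropy(columns[attr])
                for attr, w in weights.items())
    return base_price * score

records = [
    {"age": 34, "city": "Beijing"},
    {"age": 34, "city": "Shanghai"},
    {"age": 51, "city": "Beijing"},
    {"age": 28, "city": "Shenzhen"},
]
price = tuple_price(records, weights={"age": 0.6, "city": 0.4})
print(round(price, 3))  # → 1.5
```

A uniform column (every tuple sharing one value) contributes zero entropy and hence nothing to the price, while more varied attributes raise it, which matches the idea that more informative data is worth more.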
Advances in biological and medical technologies have been providing us with explosive volumes of biological and physiological data, such as medical images, electroencephalography, and genomic and protein sequences. Learning from these data facilitates the understanding of human health and disease. Developed from artificial neural networks, deep learning-based algorithms show great promise in extracting features and learning patterns from complex data. The aim of this paper is to provide an overview of deep learning techniques and some of the state-of-the-art applications in the biomedical field. We first introduce the development of artificial neural networks and deep learning. We then describe the two main components of deep learning, i.e., deep learning architectures and model optimization. Subsequently, some examples are demonstrated for deep learning …
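Of the two components named above, model optimization amounts to iteratively adjusting weights along the negative gradient of a loss function. A minimal single-neuron illustration in pure Python follows; the toy data, learning rate, and iteration count are assumptions chosen only to make the mechanism visible:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy logistic neuron trained by gradient descent on two labeled points.
# Data, learning rate, and iteration count are illustrative assumptions.
data = [(0.0, 0), (1.0, 1)]   # (input, target) pairs
w, b, lr = 0.0, 0.0, 1.0

for step in range(200):
    dw = db = 0.0
    for x, y in data:
        p = sigmoid(w * x + b)
        dw += (p - y) * x      # gradient of cross-entropy loss w.r.t. w
        db += (p - y)          # gradient of cross-entropy loss w.r.t. b
    w -= lr * dw               # step against the gradient
    b -= lr * db

# After training, the neuron separates the two points.
print(sigmoid(w * 0.0 + b) < 0.5, sigmoid(w * 1.0 + b) > 0.5)  # True True
```

Deep learning stacks many such units into layered architectures and optimizes all their weights the same way, with gradients propagated backwards through the layers.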
Big data, with its vast volume and complexity, is increasingly studied, developed, and used across all professions and trades. Remote sensing, as one of the sources of big data, generates earth-observation data and analysis results daily from the platforms of satellites, manned/unmanned aircraft, and ground-based structures. Agricultural remote sensing is one of the backbone technologies for precision agriculture, which considers within-field variability for site-specific management instead of the uniform management of traditional agriculture. The key to agricultural remote sensing is, with global positioning data and geographic information, to produce spatially varied data for subsequent precision agricultural operations. Agricultural remote sensing data, like general remote sensing data, have all the characteristics of big data. The acquisition, processing, storage, analysis, and visualization of agricultural remote sensing big data are critical to the success of precision agriculture. This paper overviews available remote sensing data resources, recent development of technologies for remote sensing big data management, and remote sensing data processing and management for precision agriculture. A five-layer-fifteen-level (FLFL) satellite remote sensing data management structure is described and adapted to create a more appropriate four-layer-twelve-level (FLTL) remote sensing data management structure for the management and application of agricultural remote sensing big data for precision agriculture, where the sensors are typically on high-resolution satellites, manned aircraft, unmanned aerial vehicles, and ground-based structures. The FLTL structure is the management and application framework of agricultural remote sensing big data for precision agriculture and local farm studies, and it anticipates the future coordination of remote sensing big data management and applications at local, regional, and farm scales.
The roles of subduction of the Pacific plate and the big mantle wedge (BMW) in the evolution of the east Asian continental margin have attracted much attention in past years. This paper reviews recent progress regarding the composition and chemical heterogeneity of the BMW beneath eastern Asia and the geochemistry of Cenozoic basalts in the region, with an attempt to put forward a general model accounting for the generation of intraplate magma in a BMW system. Some key points of this review are summarized as follows. (1) Cenozoic basalts from eastern China are interpreted as a mixture of high-Si melts and low-Si melts. Wherever they are from, northeast, north, or south China, Cenozoic basalts share a common low-Si basalt endmember, which is characterized by high alkali, Fe_2O_3^T, and TiO_2 contents, HIMU-like trace element composition, and relatively low ^(206)Pb/^(204)Pb compared to classic HIMU basalts. Their Nd-Hf isotopic compositions resemble that of the Pacific Mantle domain, and their source is composed of carbonated eclogites and peridotites. The high-Si basalt endmember is characterized by low alkali, Fe_2O_3^T, and TiO_2 contents, Indian Mantle-type Pb-Nd-Hf isotopic compositions, and a predominantly garnet pyroxenitic source. High-Si basalts show isotopic provinciality: those from North China and South China display EM1-type and EM2-type components, respectively, while basalts from Northeast China contain both EM1- and EM2-type components. (2) The source of Cenozoic basalts from eastern China contains abundant recycled materials, including oceanic crust and lithospheric mantle components as well as carbonate sediments and water. According to their spatial distribution and deep seismic tomography, it is inferred that the recycled components are mostly from stagnant slabs in the mantle transition zone, whereas the EM1 and EM2 components are from the shallow mantle. (3) Comparison of the solidi of garnet pyroxenite, carbonated eclogite, and peridotite with the regional geotherm constrains the initial melting depth of high …
The digital transformation of our society, coupled with the increasing exploitation of natural resources, makes sustainability challenges more complex and dynamic than ever before. These changes are unlikely to stop or even decelerate in the near future. There is an urgent need for a new scientific approach and an advanced form of evidence-based decision-making for the benefit of society, the economy, and the environment. To understand the impacts and interrelationships between humans as a society and natural Earth system processes, we propose a new engineering discipline, Big Earth Data science. This science is called on to provide the methodologies and tools to generate knowledge from diverse, numerous, and complex data sources, necessary to ensure a sustainable human society and essential for the preservation of planet Earth. Big Earth Data science aims to utilize data from Earth observation and social sensing and to develop theories for understanding the mechanisms of how such a social-physical system operates and evolves. The manuscript introduces the universe of discourse characterizing this new science, its foundational paradigms and methodologies, and a possible technological framework to be implemented by applying an ecosystem approach. CASEarth and GEOSS are presented as examples of international implementation attempts. The conclusions discuss important challenges and collaboration opportunities.
Funding: This work was funded by the National Key Research and Developmental Program of China (2018YFD1000104). This work is also supported by awards to R.X., Y.H., and H.C. from the National Key Research and Developmental Program of China (2017YFD0101702, 2018YFD1000500, 2019YFD1000500), the National Science Foundation of China (#31872063), the Special Support Program of Guangdong Province (2019TX05N193), the Key-Area Research and Development Program of Guangdong Province (2018B020202011), and the Guangzhou Science and Technology Key Project (201804020063). Support to M.H.F. comes from the NSF Faculty Early Career Development Program (IOS-1942437).
Funding: Supported by the Key Area Research and Development Program of Guangdong Province (2022B0202070003 and 2021B0707010004), the National Science Foundation of China (#32072547 and #32102320), the National Key Research and Development Program (2021YFF1000101 and 2019YFD1000500), the Special Support Program of Guangdong Province (2019TX05N193), the Scientific Research Foundation of the Hunan Provincial Education Department (20A261), and the open competition program of top ten critical priorities of Agricultural Science and Technology Innovation for the 14th Five-Year Plan of Guangdong Province (2022SDZG05). C.C. is supported by the Guangzhou Municipal Science and Technology Plan Project (2023A04J0113). J.F. is supported by the Hainan Provincial Natural Science Foundation of China (323QN279).
Funding: This work is supported by the Strategic Priority Research Program of the Chinese Academy of Sciences, project titles CASEarth (XDA19000000) and Digital Belt and Road (XDA19030000).
Funding: The authors thank the Australian Research Council for its support through the Laureate Fellowship project (FL100100099).
Abstract: Advances in nanophotonics have provided a variety of avenues for light–matter interaction at the nanometer scale, through enriched mechanisms for physical and chemical reactions induced by nanometer-confined optical probes in nanocomposite materials. These emerging nanophotonic devices and materials have enabled researchers to develop disruptive methods for tremendously increasing the storage capacity of current optical memory. In this paper, we review recent advancements in nanophotonics-enabled optical storage techniques. In particular, we offer our perspective on using them as optical storage arrays for next-generation exabyte data centers.
Funding: Supported by grants from the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant Nos. XDB13040500 and XDA08020102), the National High-tech R&D Program (863 Program, Grant Nos. 2014AA021503 and 2015AA020108), the National Key Research Program of China (Grant Nos. 2016YFC0901603, 2016YFB0201702, 2016YFC0901903, and 2016YFC0901701), the International Partnership Program of the Chinese Academy of Sciences (Grant No. 153F11KYSB20160008), the Key Program of the Chinese Academy of Sciences (Grant No. KJZD-EW-L14), the Key Technology Talent Program of the Chinese Academy of Sciences (awarded to WZ), and the 100 Talent Program of the Chinese Academy of Sciences (awarded to ZZ).
Abstract: With the rapid development of sequencing technologies towards higher throughput and lower cost, sequence data are generated at an unprecedentedly explosive rate. To provide an efficient and easy-to-use platform for managing huge volumes of sequence data, here we present the Genome Sequence Archive (GSA; http://bigd.big.ac.cn/gsa or http://gsa.big.ac.cn), a data repository for archiving raw sequence data. In compliance with the data standards and structures of the International Nucleotide Sequence Database Collaboration (INSDC), GSA adopts four data objects (BioProject, BioSample, Experiment, and Run) for data organization, accepts raw sequence reads produced by a variety of sequencing platforms, stores both sequence reads and metadata submitted from all over the world, and makes all these data publicly available to worldwide scientific communities. In the era of big data, GSA is not only an important complement to existing INSDC members, alleviating the increasing burden of handling the sequence data deluge, but it also takes significant responsibility for global big-data archiving and provides free, unrestricted access to all publicly available data in support of research activities throughout the world.
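The four INSDC-aligned data objects named above form a natural containment hierarchy (a BioProject groups samples and experiments; an experiment groups sequencing runs). A minimal sketch of how a submission might be modeled follows; the accession strings and field names are illustrative assumptions, not GSA's actual schema:

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical model of GSA's four data objects; only the four object
# names come from the abstract, all other details are assumptions.

@dataclass
class Run:
    accession: str                      # one sequencing run's raw-read files
    file_names: List[str] = field(default_factory=list)

@dataclass
class Experiment:
    accession: str
    platform: str                       # sequencing platform that produced the reads
    runs: List[Run] = field(default_factory=list)

@dataclass
class BioSample:
    accession: str
    organism: str

@dataclass
class BioProject:
    accession: str
    title: str
    samples: List[BioSample] = field(default_factory=list)
    experiments: List[Experiment] = field(default_factory=list)

project = BioProject(
    "PRJ-DEMO-0001", "Demo resequencing project",
    samples=[BioSample("SAM-DEMO-0001", "Oryza sativa")],
    experiments=[Experiment("EXP-DEMO-0001", "Illumina",
                            runs=[Run("RUN-DEMO-0001", ["reads_1.fq.gz"])])])
print(project.experiments[0].runs[0].accession)  # RUN-DEMO-0001
```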
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 51405241, 51505234, and 51575283).
Abstract: With the rapid development of mechanical equipment, the field of mechanical health monitoring has entered the era of big data. However, manual feature extraction suffers from low efficiency and poor accuracy when handling big data. In this study, the research object was the asynchronous motor in a drivetrain diagnostics simulator system. The vibration signals of motors with different faults were collected. The raw signal was pretreated using the short-time Fourier transform (STFT) to obtain the corresponding time-frequency map. The features of the time-frequency map were then adaptively extracted using a convolutional neural network (CNN). The effects of the pretreatment method, and of the network hyperparameters, on diagnostic accuracy were investigated experimentally. The experimental results showed that the influence of the preprocessing method is small, and that the batch size is the main factor affecting accuracy and training efficiency. Feature visualization showed that, in the case of big data, the extracted CNN features can represent complex mapping relationships between signal and health status, and can overcome the requirement for prior knowledge and engineering experience in feature extraction that traditional diagnosis methods impose. This paper proposes a new method, based on STFT and CNN, which can complete motor fault diagnosis tasks more intelligently and accurately.
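The preprocessing step described above (raw vibration signal to STFT to time-frequency map) can be sketched as follows. The sampling rate, fault frequency, and window length are synthetic assumptions for illustration, not the paper's experimental setup, and the CNN stage is omitted:

```python
import numpy as np
from scipy import signal

# Synthetic vibration signal: a rotation component plus a higher-frequency
# "fault band" (both frequencies are assumed, purely illustrative).
fs = 12_000                                   # sampling rate (Hz)
t = np.arange(0, 1.0, 1 / fs)
healthy = np.sin(2 * np.pi * 50 * t)          # rotation frequency
faulty = healthy + 0.5 * np.sin(2 * np.pi * 1_200 * t)

# STFT turns the 1-D signal into a 2-D time-frequency map, the input
# image for the CNN feature extractor in the paper's pipeline.
f, frames, Zxx = signal.stft(faulty, fs=fs, nperseg=256)
tf_map = np.abs(Zxx)                          # magnitude spectrogram

print(tf_map.shape)                           # (129 frequency bins, n frames)
# The fault component appears as a bright horizontal band near 1200 Hz:
peak_hz = f[5 + int(np.argmax(tf_map[5:].mean(axis=1)))]
print(peak_hz)
```

In the full method, maps like `tf_map` for many labeled motors would be stacked into a training set and fed to a 2-D CNN classifier.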
Funding: Supported by the Academic Divisions of the Chinese Academy of Sciences Forum on Frontiers of Science and Technology for Big Earth Data from Space.
Abstract: Big data is a strategic highland in the era of knowledge-driven economies, and it is also a new type of strategic resource for all nations. Big data collected from space for Earth observation, so-called Big Earth Data, is creating new opportunities for the Earth sciences and revolutionizing the innovation of methodologies and thought patterns. It has the potential to advance the in-depth development of the Earth sciences and bring more exciting scientific discoveries. The Academic Divisions of the Chinese Academy of Sciences Forum on Frontiers of Science and Technology for Big Earth Data from Space was held in Beijing in June 2015. The forum analyzed the development of Earth observation technology and big data, explored the concepts and scientific connotations of Big Earth Data from space, discussed the correlation between Big Earth Data and Digital Earth, and dissected the potential of Big Earth Data from space to promote scientific discovery in the Earth sciences, especially concerning global change.
Abstract: At present, it is projected that about 4 zettabytes (10^21 bytes) of digital data are being generated per year by everything from underground physics experiments to retail transactions, security cameras, and global positioning systems. In the U.S., major research programs are being funded to deal with big data in all five sectors of the economy (services, manufacturing, construction, agriculture, and mining). Big Data is a term applied to data sets whose size is beyond the ability of available tools to undertake their acquisition, access, analytics, and/or application in a reasonable amount of time. Whereas Tien (2003) forewarned of the data rich, information poor (DRIP) problems that have been pervasive since the advent of large-scale data collections or warehouses, the DRIP conundrum has been somewhat mitigated by the Big Data approach, which has unleashed information in a manner that can support informed (yet not necessarily defensible or valid) decisions or choices. Thus, by somewhat overcoming data quality issues with data quantity, data access restrictions with on-demand cloud computing, causative analysis with correlative data analytics, and model-driven with evidence-driven applications, appropriate actions can be undertaken with the obtained information. New acquisition, access, analytics, and application technologies are being developed to further Big Data as it is employed to help resolve the 14 grand challenges (identified by the National Academy of Engineering in 2008), underpin the 10 breakthrough technologies (compiled by the Massachusetts Institute of Technology in 2013), and support the Third Industrial Revolution of mass customization.
基金supported by grants from the National Natural Science Foundation of China (Grant Nos. 81830111 and 81774201)National Key Research and Development Program of China (2017YFC1702104 and 2017YFC1702303)+2 种基金the Youth Innovation Team of Shaanxi Universities and Shaanxi Provincial Science and Technology Department Project (No. 2016SF-378, China)the Fundamental Research Funds for the Central public Welfare Research Institutes (ZXKT17058, China)the National Science and Technology Major Project of China (2019ZX09201005-001-003)。
Abstract: Over the past decade, traditional Chinese medicine (TCM) has widely embraced systems biology and its various data integration approaches to promote its modernization. Thus, integrative pharmacology-based traditional Chinese medicine (TCMIP) was proposed as a paradigm shift in TCM. This review focuses on the presentation of this novel concept and on the main research contents, methodologies, and applications of TCMIP. First, TCMIP is an interdisciplinary science that can establish qualitative and quantitative pharmacokinetic-pharmacodynamic (PK-PD) correlations through the integration of knowledge from multiple disciplines and techniques and from different PK-PD processes in vivo. The main research contents of TCMIP are then introduced: the chemical and ADME/PK profiles of TCM formulas; confirming the three forms of active substances and the three action modes; establishing qualitative PK-PD correlations; and building quantitative PK-PD correlations. After that, we summarize the existing data resources, computational models, and experimental methods of TCMIP and highlight the urgent need to establish mathematical modeling and experimental methods. Finally, we discuss the applications of TCMIP for improving TCM quality control, clarifying the molecular mechanisms underlying the actions of TCMs, and discovering potential new drugs, especially TCM-related combination drug discovery.
基金funded by the National Natural Science Foundation of China(41421001,42041001 and 41525004).
Abstract: The outbreak of the 2019 novel coronavirus disease (COVID-19) has left more than 100,000 people infected and thousands dead, and the numbers of infections and deaths are still increasing rapidly. COVID-19 seriously threatens human health, production, daily life, social functioning, and international relations. In the fight against COVID-19, Geographic Information Systems (GIS) and big data technologies have played an important role in many respects, including the rapid aggregation of multi-source big data, rapid visualization of epidemic information, spatial tracking of confirmed cases, prediction of regional transmission, spatial segmentation of epidemic risk and prevention levels, balancing and management of the supply and demand of material resources, and social-emotional guidance and panic elimination. These capabilities provided solid spatial information support for decision-making, the formulation of measures, and effectiveness assessment in COVID-19 prevention and control. GIS has developed and matured relatively quickly, with a complete technological route for data preparation, platform construction, model construction, and map production. However, in the struggle against a widespread epidemic, the main challenge is finding strategies to adjust traditional technical methods and improve the speed and accuracy of the information provided for social management. At the data level, in the era of big data, data no longer come mainly from the government but are gathered from more diverse enterprises. As a result, the use of GIS faces difficulties in data acquisition and the integration of heterogeneous data, which requires governments, businesses, and academic institutions to jointly promote the formulation of relevant policies. At the technical level, spatial analysis methods for big data are in the ascendancy. Currently, and for a long time to come, the development of GIS should be strengthened to form a data-driven system for rapid knowledge acquisition, which signifies that GIS should be used to reinforce the social operation parameterization of models and…
Abstract: Materials development has historically been driven by human needs and desires, and this is likely to continue in the foreseeable future. The global population is expected to reach ten billion by 2050, which will create increasingly large demands for clean and high-efficiency energy, personalized consumer products, secure food supplies, and professional healthcare. New functional materials that are made and tailored for targeted properties or behaviors will be key to tackling this challenge. Traditionally, advanced materials are found empirically or through experimental trial-and-error approaches. As big data generated by modern experimental and computational techniques becomes more readily available, data-driven or machine learning (ML) methods have opened new paradigms for the discovery and rational design of materials. In this review article, we provide a brief introduction to various ML methods and related software or tools. The main ideas and basic procedures for employing ML approaches in materials research are highlighted. We then summarize recent important applications of ML for the large-scale screening and optimal design of polymer and porous materials, catalytic materials, and energetic materials. Finally, concluding remarks and an outlook are provided.
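The screening workflow the abstract alludes to (fit a model on known materials, then rank untested candidates by predicted property) can be sketched with a deliberately simple linear surrogate; the three descriptors and the target property here are synthetic, and real studies would use richer models and features:

```python
import numpy as np

# Sketch of ML-driven materials screening on synthetic data.
rng = np.random.default_rng(0)

# Training set: 50 "known" materials, 3 numeric descriptors each
# (hypothetical stand-ins for, e.g., composition or structure features).
X_known = rng.uniform(0, 1, size=(50, 3))
true_w = np.array([2.0, -1.0, 0.5])           # hidden structure-property law
y_known = X_known @ true_w + rng.normal(0, 0.05, size=50)

# Fit a linear surrogate by least squares (a stand-in for the fancier
# regressors a real screening study would use).
A = np.c_[X_known, np.ones(len(X_known))]     # append intercept column
w, *_ = np.linalg.lstsq(A, y_known, rcond=None)

# Screening: predict the property for 1000 untested candidates and
# shortlist the top 5 for (virtual) experiments.
X_cand = rng.uniform(0, 1, size=(1000, 3))
scores = np.c_[X_cand, np.ones(len(X_cand))] @ w
top5 = np.argsort(scores)[::-1][:5]
print(top5, scores[top5])
```

The point of the sketch is the loop structure (train, predict, rank, test), which carries over unchanged when the linear model is swapped for a neural network or gradient-boosted trees.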
基金supported in part by the National Natural Science Foundation of China (Nos. 61332001, 61272104, and 61472050)the Science and Technology Planning Project of Sichuan Province (Nos. 2014JY0257, 2015GZ0103, and 2014-HM01-00326SF)
Abstract: Big Personal Data is growing explosively; consequently, an increasing number of internet users are drowning in a sea of data. Big Personal Data has enormous commercial value; it is a new kind of data asset. An urgent problem has thus arisen in the data market: how to price Big Personal Data fairly and reasonably. This paper proposes a pricing model for Big Personal Data based on tuple granularity, informed by a comparative analysis of existing data pricing models and strategies. The model implements positive rating and reverse pricing for Big Personal Data by investigating the data attributes that affect data value, and by analyzing how the value of data tuples varies with information entropy, weight value, data reference index, cost, and other factors. The model can be adjusted dynamically according to these parameters. With increases in data scale, reductions in its cost, and improvements in its quality, Big Personal Data users can thereby obtain greater benefits.
基金supported by the Center for Precision Medicine, Sun Yat-sen University and the National High-tech R&D Program (863 Program Grant No. 2015AA020110) of China awarded to YZ
Abstract: Advances in biological and medical technologies have been providing us with explosive volumes of biological and physiological data, such as medical images, electroencephalography, and genomic and protein sequences. Learning from these data facilitates the understanding of human health and disease. Developed from artificial neural networks, deep learning-based algorithms show great promise in extracting features and learning patterns from complex data. The aim of this paper is to provide an overview of deep learning techniques and some state-of-the-art applications in the biomedical field. We first introduce the development of artificial neural networks and deep learning. We then describe the two main components of deep learning, i.e., deep learning architectures and model optimization. Subsequently, some examples are demonstrated for deep learning…
基金financially supported by the funding appropriated from USDA-ARS National Program 305 Crop Productionthe 948 Program of Ministry of Agriculture of China (2016-X38)
Abstract: Big data, with its vast volume and complexity, is of increasing concern and is being developed and used across all professions and trades. Remote sensing, as one source of big data, generates earth-observation data and analysis results daily from platforms on satellites, manned/unmanned aircraft, and ground-based structures. Agricultural remote sensing is one of the backbone technologies for precision agriculture, which considers within-field variability for site-specific management instead of the uniform management of traditional agriculture. The key to agricultural remote sensing is to produce spatially varied data, combined with global positioning data and geographic information, for subsequent precision agricultural operations. Agricultural remote sensing data, like remote sensing data in general, have all the characteristics of big data. The acquisition, processing, storage, analysis, and visualization of agricultural remote sensing big data are critical to the success of precision agriculture. This paper overviews available remote sensing data resources, recent developments in technologies for remote sensing big-data management, and remote sensing data processing and management for precision agriculture. A five-layer-fifteen-level (FLFL) satellite remote sensing data management structure is described and adapted to create a more appropriate four-layer-twelve-level (FLTL) remote sensing data management structure for the management and application of agricultural remote sensing big data for precision agriculture, where the sensors are typically on high-resolution satellites, manned aircraft, unmanned aerial vehicles, and ground-based structures. The FLTL structure is the management and application framework of agricultural remote sensing big data for precision agriculture and local farm studies, and it outlines the future coordination of remote sensing big-data management and applications at local, regional, and farm scales.
基金supported by the Chinese Academy of Sciences(Grant No.XDB18000000)the National Natural Science Foundation of China(Grant No.41688103)the State Oceanography Bureau(Grant No.GASI-GEOGE-02)
Abstract: The roles of the subduction of the Pacific plate and the big mantle wedge (BMW) in the evolution of the east Asian continental margin have attracted much attention in recent years. This paper reviews recent progress regarding the composition and chemical heterogeneity of the BMW beneath eastern Asia and the geochemistry of Cenozoic basalts in the region, with the aim of putting forward a general model accounting for the generation of intraplate magma in a BMW system. Key points of this review are summarized as follows. (1) Cenozoic basalts from eastern China are interpreted as a mixture of high-Si melts and low-Si melts. Whether they are from northeast, north, or south China, Cenozoic basalts share a common low-Si basalt endmember, which is characterized by high alkali, Fe2O3(T), and TiO2 contents, a HIMU-like trace element composition, and relatively low 206Pb/204Pb compared to classic HIMU basalts. Their Nd-Hf isotopic compositions resemble those of the Pacific Mantle domain, and their source is composed of carbonated eclogites and peridotites. The high-Si basalt endmember is characterized by low alkali, Fe2O3(T), and TiO2 contents, Indian Mantle-type Pb-Nd-Hf isotopic compositions, and a predominantly garnet-pyroxenitic source. High-Si basalts show isotopic provinciality: those from North China and South China display EM1-type and EM2-type components, respectively, while basalts from Northeast China contain both EM1- and EM2-type components. (2) The source of Cenozoic basalts from eastern China contains abundant recycled materials, including oceanic crust and lithospheric mantle components as well as carbonate sediments and water. According to their spatial distribution and deep seismic tomography, it is inferred that the recycled components are mostly from stagnant slabs in the mantle transition zone, whereas the EM1 and EM2 components are from the shallow mantle. (3) Comparison of the solidi of garnet pyroxenite, carbonated eclogite, and peridotite with the regional geotherm constrains the initial melting depth of high…
Funding: Supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant Nos. XDA19030000 and XDA19090000) and the DG Research and Innovation of the European Commission (H2020 Grant No. 34538).
Abstract: The digital transformation of our society, coupled with the increasing exploitation of natural resources, makes sustainability challenges more complex and dynamic than ever before. These changes are unlikely to stop, or even decelerate, in the near future. There is an urgent need for a new scientific approach and an advanced form of evidence-based decision-making for the benefit of society, the economy, and the environment. To understand the impacts and interrelationships between humans as a society and natural Earth system processes, we propose a new engineering discipline, Big Earth Data science. This science is called upon to provide the methodologies and tools to generate knowledge from diverse, numerous, and complex data sources, knowledge that is necessary to ensure a sustainable human society and essential for the preservation of planet Earth. Big Earth Data science aims to utilize data from Earth observation and social sensing, and to develop theories for understanding the mechanisms of how such a social-physical system operates and evolves. The manuscript introduces the universe of discourse characterizing this new science, its foundational paradigms and methodologies, and a possible technological framework to be implemented by applying an ecosystem approach. CASEarth and GEOSS are presented as examples of international implementation attempts. The conclusions discuss important challenges and collaboration opportunities.