Purpose: This paper develops and validates a bibliometric framework for identifying the "princes" (PR) who wake up the "sleeping beauty" (SB) in challenge-type scientific discoveries, so as to clarify the awakening mechanisms and promote potentially valuable but not readily accepted innovative research. (A PR is a research study.) Design/methodology/approach: We propose that PR candidates must meet the following four criteria: (1) be published near the time when the SB began to attract a lot of citations; (2) be highly cited papers themselves; (3) receive a substantial number of co-citations with the SB; and (4) within the challenge-type discoveries which contradict established theories, exert a strong "pulling effect" on the SB. We test the usefulness of the bibliometric framework through a case study of a key publication by the 2014 chemistry Nobel laureate Stefan W. Hell, who negated Ernst Abbe's diffraction limit theory, one of the most prominent paradigms in the natural sciences. Findings: The first-ranked candidate PR article identified by the bibliometric framework is in line with historical facts. An SB may need one or more PRs and even "retinues" to be "awakened." Documents with potential awakening functionality tend to be published in prestigious multidisciplinary journals with higher impact and wider scope than the journals publishing SBs. Research limitations: The framework is only applicable to transformative innovations, and the conclusions are drawn from the analysis of one typical SB and her awakening process. Therefore, the generality of our work might be limited. Practical implications: Publications belonging to so-called transformative research, even when less frequently cited, should be given special attention as early as possible, because they may suddenly attract many citations after a period of sleep, as reflected in our case study. Originality/value: The definition of PR(s) as the first paper(s) t...
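The four PR criteria listed in this abstract can be read as a screening filter over candidate papers. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `Paper` fields, the function name `prince_candidates`, every threshold, and in particular the proxy used for the "pulling effect" (the share of the SB's citations that also cite the candidate) are hypothetical choices made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    title: str
    year: int                 # publication year of the candidate
    citations: int            # total citations received by the candidate
    cocitations_with_sb: int  # number of papers citing both candidate and SB


def prince_candidates(papers, awakening_year, sb_citations,
                      window=2, min_citations=100,
                      min_cocitations=20, min_pull=0.3):
    """Return candidates passing all four criteria, strongest pull first.

    All thresholds are hypothetical defaults; the abstract gives no values.
    """
    selected = []
    for p in papers:
        # (1) published near the time the SB began to attract citations
        near_awakening = abs(p.year - awakening_year) <= window
        # (2) highly cited itself
        highly_cited = p.citations >= min_citations
        # (3) substantial number of co-citations with the SB
        co_cited = p.cocitations_with_sb >= min_cocitations
        # (4) assumed proxy for the "pulling effect": share of the SB's
        #     citations that also cite the candidate
        pull = p.cocitations_with_sb / sb_citations if sb_citations else 0.0
        if near_awakening and highly_cited and co_cited and pull >= min_pull:
            selected.append((pull, p))
    selected.sort(key=lambda item: item[0], reverse=True)
    return [p for _, p in selected]
```

Ranking by the pull proxy mirrors the idea of a "first-ranked candidate", but the actual ranking measure used in the paper is not given in the abstract.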
Purpose: This study aims to identify potential industry-university-research collaboration (IURC) partners effectively and to analyze the conditions and dynamics of the IURC process based on innovation chain theory. Design/methodology/approach: The method utilizes multisource data, combining bibliometric and econometric analyses to capture the core network of the existing collaboration networks and institution competitiveness in the innovation chain. Furthermore, a new identification method is constructed that takes into account the law of scientific research cooperation and economic factors. Findings: Empirical analysis of the genetic engineering vaccine field shows that, through the distribution characteristics of creative technologies from different institutions, the analysis based on the innovation chain can identify the more complementary capacities among organizations. Research limitations: In this study, the overall approach is shaped by the theoretical concept of an innovation chain, a linear innovation model with specific types or stages of innovation activities in each phase of the chain, and may thus overlook important feedback mechanisms in the innovation process. Practical implications: Industry-university-research institution collaborations are extremely important in promoting the dissemination of innovative knowledge, enhancing the quality of innovation products, and facilitating the transformation of scientific achievements. Originality/value: Compared to previous studies, this study emulates the real conditions of IURC. Thus, the rule of technological innovation can be better revealed, the potential partners of IURC can be identified more readily, and the conclusion has more value.
With the rapid development of information technology, IoT devices play a huge role in physiological health data detection. The exponential growth of medical data requires us to reasonably allocate storage space for cloud servers and edge nodes. The storage capacity of edge nodes close to users is limited. We should store hotspot data in edge nodes as much as possible, so as to ensure response timeliness and access hit rate. However, the current scheme cannot guarantee that every sub-message in a complete data item stored by the edge node meets the requirements of hot data. How to complete the detection and deletion of redundant data in edge nodes under the premise of protecting user privacy and data dynamic integrity has become a challenging problem. Our paper proposes a redundant data detection method that meets the privacy protection requirements. By scanning the cipher text, it is determined whether each sub-message of the data in the edge node meets the requirements of hot data. It has the same effect as a zero-knowledge proof, and it will not reveal the privacy of users. In addition, for redundant sub-data that does not meet the requirements of hot data, our paper proposes a redundant data deletion scheme that preserves the dynamic integrity of the data. We use Content Extraction Signature (CES) to generate the signature of the remaining hot data after the redundant data is deleted. The feasibility of the scheme is demonstrated through security analysis and efficiency analysis.
Purpose: In the open science era, it is typical to share project-generated scientific data by depositing it in an open and accessible database. Moreover, scientific publications are preserved in a digital library archive. It is challenging to identify the data usage that is mentioned in literature and associate it with its source. Here, we investigated the data usage of a government-funded cancer genomics project, The Cancer Genome Atlas (TCGA), via a full-text literature analysis. Design/methodology/approach: We focused on identifying articles using the TCGA dataset and constructing linkages between the articles and the specific TCGA dataset. First, we collected 5,372 TCGA-related articles from PubMed Central (PMC). Second, we constructed a benchmark set with 25 full-text articles that truly used the TCGA data in their studies, and we summarized the key features of the benchmark set. Third, the key features were applied to the remaining full-text articles collected from PMC. Findings: The number of publications that use TCGA data has increased significantly since 2011, although the TCGA project was launched in 2005. Additionally, we found that the critical areas of focus in the studies that use the TCGA data were glioblastoma multiforme, lung cancer, and breast cancer; meanwhile, data from the RNA-sequencing (RNA-seq) platform is the most preferred for use. Research limitations: The current workflow to identify articles that truly used TCGA data is labor-intensive. An automatic method is expected to improve the performance. Practical implications: This study will help cancer genomics researchers determine the latest advancements in cancer molecular therapy, and it will promote data sharing and data-intensive scientific discovery. Originality/value: Few studies have been conducted to investigate data usage by government-funded projects/programs since their launch. In this preliminary study, we extracted articles that use TCGA data from PMC, and we created a link between the full-tex...
Against the background of the green transformation of the economy and society, the ESG performance of enterprises has received increasing attention in investment decision-making. However, previous studies have inadequately explored how ESG performance affects corporate financing costs. Based on information asymmetry theory, this paper analyzes the mechanism by which ESG performance affects corporate financing costs. Then, taking 1044 A-share listed companies from 2016 to 2020 as a sample, the company’s ESG performance indicators are obtained through the sorting and analysis of ESG report disclosure and rating data, and an empirical model is built to test the relationship between ESG performance and corporate financing costs. This paper constructs a panel regression model using ESG rating data and corporate financial data and finds that, in the overall sample, the higher the ESG performance, the lower the equity financing cost, and the higher the ESG performance, the lower the debt financing cost. In addition, the paper discusses the moderating effects of enterprise scale and media attention on the impact of ESG performance on enterprise financing costs. The empirical results show that company size has a positive moderating effect on the relationship between ESG performance and financing costs.
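A panel regression with a moderating variable of the kind described in this abstract is commonly written in the generic form below. This is an illustrative specification only, not the paper's model: the symbols, the choice of firm and year fixed effects, and the control vector are assumptions, since the abstract does not give the estimating equation.

```latex
\begin{equation}
\mathrm{Cost}_{i,t} = \beta_0 + \beta_1\,\mathrm{ESG}_{i,t} + \beta_2\,M_{i,t}
  + \beta_3\,\bigl(\mathrm{ESG}_{i,t}\times M_{i,t}\bigr)
  + \boldsymbol{\gamma}^{\top}\mathbf{X}_{i,t} + \mu_i + \lambda_t + \varepsilon_{i,t}
\end{equation}
```

Here $\mathrm{Cost}_{i,t}$ stands for the equity or debt financing cost of firm $i$ in year $t$, $M_{i,t}$ for a moderator such as firm size or media attention, $\mathbf{X}_{i,t}$ for control variables, and $\mu_i$ and $\lambda_t$ for assumed firm and year effects. Under this form, the reported finding that higher ESG performance goes with lower financing costs corresponds to $\beta_1 < 0$, and a moderating effect corresponds to $\beta_3 \neq 0$.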
The key activity in building the semantic web is building ontologies. But today, the theory and methodology of ontology construction are still far from ready. This paper proposes a theoretical framework for massive knowledge management, the knowledge domain framework (KDF), and introduces an integrated development environment (IDE) named large-scale ontology development environment (LODE), which implements the proposed theoretical framework. We also compare LODE with other popular ontology development environments in this paper. The practice of using LODE for the management and development of agriculture ontologies shows that the knowledge domain framework can handle the development activities of large-scale ontologies. Application studies based on the principle of the knowledge domain framework and LODE are described briefly.
Purpose: The number of retracted papers from Chinese university-affiliated hospitals is increasing, which has raised much concern. The aim of this study is to analyze the retracted papers from university-affiliated hospitals in mainland China from 2000 to 2021. Design/methodology/approach: Data for 1,031 retracted papers were identified from the Web of Science Core Collection database. The information of the hospitals involved was obtained from their official websites. We analyzed the chronological changes, journal distribution, discipline distribution and retraction reasons for the retracted papers. The grade and geographic locations of the hospitals involved were explored as well. Findings: We found a rapid increase in the number of retracted papers, while the retraction time interval is decreasing. The main reasons for retraction are plagiarism/self-plagiarism (n=255), invalid data/images/conclusions (n=212), fake peer review (n=175) and honest error (n=163). The disciplines are mainly distributed in oncology (n=320), pharmacology & pharmacy (n=198) and research & experimental medicine (n=166). About 43.8% of the retracted papers were from hospitals affiliated with prestigious universities. Research limitations: This study fails to differentiate between retractions due to honest error and retractions due to research misconduct. We believe that there is a fundamental difference between honest error retractions and misconduct retractions. Another limitation is that authors of the retracted papers have not been analyzed in this study. Practical implications: This study provides a reference for addressing research misconduct in Chinese university-affiliated hospitals. It is our recommendation that universities and hospitals should educate all their staff about the basic norms of research integrity, punish authors of papers retracted for scientific misconduct, and reform the unreasonable evaluation system. Originality/value: Based on the analysis of retracted papers, this study further analyzes the characteristics of instit...
1. INTRODUCTION
Metadata, as a type of data, describes content, provides context, documents transactions, and situates data. Interest in metadata has steadily grown over the last several decades, motivated initially by the increase in digital information, open access, early data sharing policies, and interoperability goals. This foundation has accelerated in more recent times, due to the increase in research data management policies and advances in AI. Specific to research data management, one of the larger factors has been the global adoption of the FAIR (findable, accessible, interoperable, and reusable) data principles [1,2], which are highly metadata-driven. Additionally, researchers across nearly every domain are interested in leveraging metadata for machine learning and other AI applications.
Purpose: Disseminating medical and health information is a mission of a public medical library. This paper describes a practice of a medical library in providing online access to health information for the general public. Design/methodology/approach: A four-step workflow is developed to integrate and disseminate heterogeneous health information from medical associations. First, a raw data repository is developed to manage the original submissions from information providers. Second, each document in the raw data repository is represented in a standardized XML schema. Third, the medical terms are identified and manually annotated, enriching the semantics of health information. Lastly, all the semantically enriched XML documents are converted to HTML pages for online dissemination. Findings: A health information website, CHealth, was developed for Chinese speakers. It provides free online access for all without any login or IP constraints. CHealth is available at www.chealth.org.cn. Research limitations: The current workflow is time-consuming and labor-intensive due to the lack of an information submission/exchange standard and a commonly agreed-on consumer health terminology in Chinese. Originality/value: In this work, the target audience of the medical library has been extended from traditional academic/professional users to the general public. Methodologies in library sciences have been combined with those in consumer health informatics in CHealth development.
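The four-step workflow in this abstract (raw submission, standardized XML, term annotation, HTML output) can be sketched with the Python standard library. This is a toy illustration under assumptions, not the CHealth pipeline: the element names (`document`, `title`, `body`, `terms`, `term`), the function names, and the use of a hand-curated term list as a stand-in for manual annotation are all hypothetical.

```python
import html
import xml.etree.ElementTree as ET


def to_xml(raw):
    """Step 2: represent a raw submission (a dict) in a simple XML layout."""
    doc = ET.Element("document")
    ET.SubElement(doc, "title").text = raw["title"]
    ET.SubElement(doc, "body").text = raw["body"]
    return doc


def annotate(doc, curated_terms):
    """Step 3: record which hand-curated medical terms occur in the body."""
    body = (doc.findtext("body") or "").lower()
    terms = ET.SubElement(doc, "terms")
    for term in curated_terms:
        if term.lower() in body:
            ET.SubElement(terms, "term").text = term
    return doc


def to_html(doc):
    """Step 4: convert the enriched XML document to a minimal HTML page."""
    title = html.escape(doc.findtext("title") or "")
    body = html.escape(doc.findtext("body") or "")
    items = "".join(
        f"<li>{html.escape(t.text or '')}</li>" for t in doc.iter("term")
    )
    return (
        f"<html><head><title>{title}</title></head>"
        f"<body><h1>{title}</h1><p>{body}</p><ul>{items}</ul></body></html>"
    )


# Step 1 is represented here by a plain dict standing in for a repository record.
raw_submission = {"title": "Flu & you", "body": "Influenza is treated with rest."}
page = to_html(annotate(to_xml(raw_submission), ["influenza", "diabetes"]))
```

Keeping the XML as the intermediate form means the annotation step and the HTML rendering can change independently, which is the main point of the standardized-schema step described above.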
At the Extraordinary G20 Leaders' Summit on COVID-19 on March 26, 2020, China launched an online knowledge center for the prevention and control of novel coronavirus pneumonia, a center open to all countries. Since the outbreak of coronavirus disease, China has been sharing knowledge about disease prevention and control with the World Health Organization and governments throughout the world.
Much research has been conducted abroad in recent years concerning the effects of acupuncture and moxibustion on the immunologic functions of the organism. An outline of this is presented as follows.
EFFECTS OF ACUPUNCTURE AND MOXIBUSTION ON NORMAL IMMUNOLOGIC FUNCTIONS
Purpose: This paper tries to understand the dynamics of scientific communication systems during crises by investigating, as a case study, the blogging activities that took place during the period of the 2011 earthquake and related events in Japan. Interactions between bloggers and registered users are studied quantitatively and qualitatively at Sciencenet.cn, an influential science-related blogosphere in China. Design/methodology/approach: The editors of Sciencenet.cn compiled a special issue of science blog articles under the title Analysis of the Japanese Earthquake. We developed a spider program and downloaded from this special issue the metadata about title, content, publishing time, total read count, reply count and recommendation count, and further collected information about bloggers and recommenders. We then sent a short message to the bloggers who wrote articles on these emergencies, asking for their educational and professional background. Findings: We found that knowledge reflected in the blog articles is strongly related to the educational and professional background of bloggers. Knowledge diffusion is facilitated by interactions, such as recommendations, comments and answers. Interactions via comments and recommendations are of an assortative nature: A blog article is more likely to be commented on and recommended by those bloggers who write on the same or similar topics than by those writing on a different one. Registered users tend to give comments on articles dealing with the topic that they recommend, and vice versa. Interaction in the intersection of two or three topics is more intense than that within one topic. The impact of blog articles is also influenced by other factors, such as the reputation of the blogger and the type of information they contain. Implications and limitations: It is confirmed that studying blogs is a valid approach within informetric studies. Yet, we only studied one triple (earthquake, tsunami, nuclear disaster) event based on data originating from one Chinese blog website. More...
The present study was an attempt to delineate potential groundwater zones in Kalikavu Panchayat of Malappuram district, Kerala, India. The geo-spatial database on geomorphology, land use, geology, slope and drainage network was generated in a geographic information system (GIS) environment from satellite data, Survey of India topographic sheets and field observations. To understand the movement and occurrence of groundwater, the geology, geomorphology, structural set-up and recharging conditions have to be well understood. In the present study, the potential recharge areas are delineated in terms of geology, geomorphology, land use, slope, drainage pattern, etc. Various thematic data generated were integrated using a heuristic method in the GIS domain to generate maps showing potential groundwater zones. The composite output map scores were reclassified into different zones using a decision rule. The final output map shows different zones of groundwater prospect, viz., very good (15.57% of the area), good (43.74%), moderate (28.38%) and poor (12.31%). Geomorphic units such as valley plains, valley fills and alluvial terraces were identified as good to excellent prospect zones, while the gently sloping lateritic uplands were identified as good to moderate zones. Steeply sloping hilly terrains underlain by hard rocks were identified as poor groundwater prospect zones.
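The integration and reclassification steps in this abstract resemble a weighted overlay: each thematic layer contributes a ranked value per map cell, the values are combined into a composite score, and a decision rule bins the score into prospect zones. The sketch below assumes that reading; the layer weights, the 1 to 4 rank scale, and the class cut-offs are hypothetical, since the abstract does not state the heuristic or the decision rule actually used.

```python
# Hypothetical layer weights (sum to 1.0); not taken from the study.
WEIGHTS = {
    "geomorphology": 0.30,
    "geology": 0.20,
    "landuse": 0.15,
    "slope": 0.20,
    "drainage": 0.15,
}


def composite_score(cell):
    """Weighted sum of per-layer ranks (assumed scale: 1 = poor ... 4 = very good)."""
    return sum(WEIGHTS[layer] * cell[layer] for layer in WEIGHTS)


def classify(score):
    """Assumed decision rule mapping a composite score to a prospect zone."""
    if score >= 3.5:
        return "very good"
    if score >= 2.5:
        return "good"
    if score >= 1.5:
        return "moderate"
    return "poor"


# Illustrative cells with made-up ranks for each thematic layer.
valley_plain = {"geomorphology": 4, "geology": 4, "landuse": 4, "slope": 4, "drainage": 4}
steep_hill = {"geomorphology": 1, "geology": 1, "landuse": 1, "slope": 1, "drainage": 1}
upland = {"geomorphology": 3, "geology": 2, "landuse": 3, "slope": 2, "drainage": 3}

zones = {
    name: classify(composite_score(cell))
    for name, cell in [("valley_plain", valley_plain),
                       ("steep_hill", steep_hill),
                       ("upland", upland)]
}
```

In a real GIS workflow the same arithmetic would run over raster grids rather than single cells, and the area percentages per zone would follow from counting cells in each class.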
This paper introduces the efforts and achievements of the Agriculture Ontology Service Research Group of the Agricultural Information Institute of the Chinese Academy of Agriculture Sciences over the last 10 years. It summarizes the research on ontology construction methodology, ontology management systems, and ontology applications.
Purpose: The study was carried out to construct a domain knowledge service system based on the Scientific & Technological Knowledge Organization Systems (STKOS). Design/methodology/approach: The framework of a domain knowledge service system is designed on the basis of the STKOS, and the STKOS science and technology vocabularies, category systems, and ontology networks are applied to realize the knowledge organization and semantic linking of the scientific and technological information resources. Meanwhile, related knowledge-mining analysis algorithms and models are improved, and some tools such as Solr and D3 are used for developing the system. This system integrates various knowledge service modules, including unified search of domain information resources and knowledge-linked navigation, domain hotspot and burst topics monitoring analysis, knowledge structure and evolution analysis, literature citation network, and research agents’ cooperative relationship network analysis. Findings: The system can help to refine descriptions, knowledge organization, and the semantic linking of various kinds of information resources closely related to science and technology. Such resources include domain literature, institutions, scientists, projects, and more. Research limitations: Trial assessment and performance improvement should be carried out for the knowledge service application on the basis of more types and larger quantities of domain information resources. Practical implications: The domain knowledge service system provides an integrated knowledge discovery tool, as well as several kinds of knowledge mining analysis services for researchers. Originality/value: Our practice can be used as a valuable guide for libraries and information institutions that plan to provide deep domain knowledge services.
Recognizing the importance of innovation in science and technology (S&T) as a driver of continued economic growth, China has introduced new policies to facilitate and encourage innovation and made significant progress in S&T innovation in the past decade. The total volume of innovation resources is steadily growing. Commensurate with its status as the second largest economy in the world, China now ranks second globally in research and development (R&D) investment.
Objective: To compare the performance of five machine learning models and the SAPS Ⅱ score in predicting the 30-day mortality amongst patients with sepsis. Methods: The sepsis patient-related data were extracted from the MIMIC-Ⅳ database. Clinical features were generated and selected by mutual information and grid search. Logistic regression, Random forest, LightGBM, XGBoost, and other machine learning models were constructed to predict the mortality probability. Five measurements including accuracy, precision, recall, F1 score, and area under curve (AUC) were acquired for model evaluation. An external validation was implemented to avoid conclusion bias. Results: LightGBM outperformed other methods, achieving the highest AUC (0.900), accuracy (0.808), and precision (0.559). All machine learning models performed better than the SAPS Ⅱ score (AUC=0.748). LightGBM achieved 0.883 in AUC in the external data validation. Conclusions: The machine learning models are more effective in predicting the 30-day mortality of patients with sepsis than the traditional SAPS Ⅱ score.
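The five evaluation measures named in this abstract can be computed directly from true labels and predicted probabilities. The following is a dependency-free sketch for illustration; the function names, the 0.5 decision threshold, and the toy data are assumptions, and the study itself presumably relied on a standard machine learning library rather than code like this.

```python
def classification_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, precision, recall and F1 at an assumed probability threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def auc(y_true, y_prob):
    """AUC as the probability that a positive case outranks a negative one.

    Pairwise (Mann-Whitney) formulation; ties count as half a win.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy labels (1 = died within 30 days) and predicted probabilities; made up.
y_true = [1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.4, 0.6, 0.2, 0.7, 0.1]
scores = classification_metrics(y_true, y_prob)
scores["auc"] = auc(y_true, y_prob)
```

AUC is threshold-free, whereas the other four measures depend on the chosen cut-off, which is one reason a model can lead on AUC while showing a modest precision.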
INTRODUCTION
Hart once proposed in "Corporate Governance: Some Theory and Implications" that governance issues emerge in a principal-agent relation whenever there is a conflict of interest between principal and agent that is not covered by a complete contract. Governance theory, which developed from this governance issue, holds that an effective institutional arrangement can avoid or reduce the occurrence of such issues and force the agent to maximize the principal's interest.
In the context of constructing national public cultural demonstration areas all over China, and taking the system innovation history of Baoji as an example, this paper studies the path choices and service system design for the development of the public library service system in the western region of China. The results show that the key issue facing China's public library service system is to promote system innovation; that the "chain system and grassroots sites of libraries" is a new synthesized innovative program for the public library service system, in accordance with the actual situations of the Baoji area and other areas in the western region of China; and that the service system design of local public libraries should insist on seeking truth from facts and on carrying out an integrated, simple and feasible reform according to local conditions.
China National Information and Documentation Standardization Technical Committee (SAC/TC4), founded in 1979, is the national technology standardization organization engaged in the field of information and documentation in China. Its institutional settings, scope and content of the work exactly correspond to the Information and Documentation Standardization Technical Committee of the International Organization for Standardization (ISO/TC46). For 30 years, SAC/TC4 has always harmonized and organized national standardization work in accordance with the standard working system of ISO, established a clear standard constituting strategy and principles, set up an open mechanism for standards development, promoted China’s information and documentation standardization, and obtained great achievements and valuable experiences. Following the rapid development of information and network technologies, standardization work in the field of international information and documentation is facing new challenges. SAC/TC4 also needs to cope with the situation by adopting a variety of strategies.
Funding (study on identifying the "princes" who awaken "sleeping beauties"): supported by the National Natural Science Foundation of China (Grant No.: 71373252) and the Project from the Institute of Medical Information of the Chinese Academy of Medical Sciences (Grant No.: 14R0106).
Funding (study on identifying industry-university-research collaboration partners): funded by the National Natural Science Foundation of China (Grant No. 71704170), the China Postdoctoral Science Foundation funded project (Grant No. 2016M590124), and the Youth Innovation Promotion Association, CAS (Grant No. 2016159).
Funding: Sponsored by the National Natural Science Foundation of China under grant numbers 62172353, 62302114, U20B2046 and 62172115; the Innovation Fund Program of the Engineering Research Center for Integration and Application of Digital Learning Technology of Ministry of Education (No. 1331007 and No. 1311022); the Natural Science Foundation of the Jiangsu Higher Education Institutions (Grant No. 17KJB520044); and the Six Talent Peaks Project in Jiangsu Province (No. XYDXX-108).
Abstract: With the rapid development of information technology, IoT devices play a huge role in physiological health data detection. The exponential growth of medical data requires storage space to be allocated reasonably between cloud servers and edge nodes. The storage capacity of edge nodes close to users is limited, so hotspot data should be stored in edge nodes as much as possible to ensure response timeliness and access hit rate. However, the current scheme cannot guarantee that every sub-message in a complete data item stored by the edge node meets the requirements of hot data. How to detect and delete redundant data in edge nodes while protecting user privacy and the dynamic integrity of the data has become a challenging problem. Our paper proposes a redundant data detection method that meets the privacy protection requirements. By scanning the ciphertext, it determines whether each sub-message of the data in the edge node meets the requirements of hot data. It has the same effect as a zero-knowledge proof and does not reveal the privacy of users. In addition, for redundant sub-data that does not meet the requirements of hot data, our paper proposes a redundant data deletion scheme that preserves the dynamic integrity of the data. We use Content Extraction Signature (CES) to generate the signature of the remaining hot data after the redundant data is deleted. The feasibility of the scheme is demonstrated through security analysis and efficiency analysis.
Funding: Supported by the National Population and Health Scientific Data Sharing Program of China, the Knowledge Centre for Engineering Sciences and Technology (Medical Centre), and the Fundamental Research Funds for the Central Universities (Grant No.: 13R0101).
Abstract: Purpose: In the open science era, it is typical to share project-generated scientific data by depositing it in an open and accessible database. Moreover, scientific publications are preserved in a digital library archive. It is challenging to identify the data usage that is mentioned in literature and associate it with its source. Here, we investigated the data usage of a government-funded cancer genomics project, The Cancer Genome Atlas (TCGA), via a full-text literature analysis. Design/methodology/approach: We focused on identifying articles using the TCGA dataset and constructing linkages between the articles and the specific TCGA dataset. First, we collected 5,372 TCGA-related articles from PubMed Central (PMC). Second, we constructed a benchmark set with 25 full-text articles that truly used the TCGA data in their studies, and we summarized the key features of the benchmark set. Third, the key features were applied to the remaining full-text articles collected from PMC. Findings: The number of publications that use TCGA data has increased significantly since 2011, although the TCGA project was launched in 2005. Additionally, we found that the critical areas of focus in the studies that use the TCGA data were glioblastoma multiforme, lung cancer, and breast cancer; meanwhile, data from the RNA-sequencing (RNA-seq) platform is the most frequently used. Research limitations: The current workflow to identify articles that truly used TCGA data is labor-intensive. An automatic method is expected to improve the performance. Practical implications: This study will help cancer genomics researchers determine the latest advancements in cancer molecular therapy, and it will promote data sharing and data-intensive scientific discovery. Originality/value: Few studies have been conducted to investigate data usage by government-funded projects/programs since their launch. In this preliminary study, we extracted articles that use TCGA data from PMC, and we created a link between the full-tex
Funding: Supported by the National Natural Science Foundation of China (72192843, 72334006) and the Fundamental Research Funds for the Central Universities (E1E40808X2).
Abstract: Against the background of the green transformation of the economy and society, the ESG performance of enterprises has received more and more attention in investment decision-making. However, previous studies have inadequately explored how ESG performance affects corporate financing costs. Based on information asymmetry theory, this paper analyzes the impact mechanism of ESG performance on corporate financing costs. Then, taking 1044 A-share listed companies in 2016–2020 as a sample, the company's ESG performance indicators are obtained by sorting and analyzing ESG report disclosure and rating data, and an empirical model is built to test the relationship between ESG performance and corporate financing costs. This paper constructs a panel regression model using ESG rating data and corporate financial data and finds that, in the overall sample, the higher the ESG performance, the lower the equity financing cost and the lower the debt financing cost. In addition, it discusses the moderating effects of enterprise scale and media attention on the impact of ESG performance on enterprise financing costs. The empirical results show that company size has a positive moderating effect on the influence of ESG performance on financing costs.
Funding: Supported by the Key Technology R&D Program of China during the 12th Five-Year Plan period: Super-Class Scientific and Technical Thesaurus and Ontology Construction Faced the Foreign Scientific and Technical Literature (2011BAH10B01)
Abstract: The key activity in building the semantic web is building ontologies. But today, the theory and methodology of ontology construction are still far from ready. This paper proposes a theoretical framework for massive knowledge management, the knowledge domain framework (KDF), and introduces an integrated development environment (IDE) named large-scale ontology development environment (LODE), which implements the proposed theoretical framework. We also compare LODE with other popular ontology development environments. The practice of using LODE for the management and development of agriculture ontologies shows that the knowledge domain framework can handle the development activities of large-scale ontologies. Application studies based on the principles of the knowledge domain framework and LODE are described briefly.
Funding: Supported by grants from the Humanity and Social Science Youth Foundation of Ministry of Education of China (21YJC870016).
Abstract: Purpose: The number of retracted papers from Chinese university-affiliated hospitals is increasing, which has raised much concern. The aim of this study is to analyze the retracted papers from university-affiliated hospitals in China's mainland from 2000 to 2021. Design/methodology/approach: Data for 1,031 retracted papers were identified from the Web of Science Core Collection database. Information on the hospitals involved was obtained from their official websites. We analyzed the chronological changes, journal distribution, discipline distribution and retraction reasons for the retracted papers. The grade and geographic locations of the hospitals involved were explored as well. Findings: We found a rapid increase in the number of retracted papers, while the retraction time interval is decreasing. The main reasons for retraction are plagiarism/self-plagiarism (n=255), invalid data/images/conclusions (n=212), fake peer review (n=175) and honest error (n=163). The disciplines are mainly distributed in oncology (n=320), pharmacology & pharmacy (n=198) and research & experimental medicine (n=166). About 43.8% of the retracted papers were from hospitals affiliated with prestigious universities. Research limitations: This study fails to differentiate between retractions due to honest error and retractions due to research misconduct. We believe that there is a fundamental difference between honest error retractions and misconduct retractions. Another limitation is that authors of the retracted papers have not been analyzed in this study. Practical implications: This study provides a reference for addressing research misconduct in Chinese university-affiliated hospitals. It is our recommendation that universities and hospitals should educate all their staff about the basic norms of research integrity, punish authors of papers retracted for scientific misconduct, and reform the unreasonable evaluation system. Originality/value: Based on the analysis of retracted papers, this study further analyzes the characteristics of instit
Abstract: 1. INTRODUCTION Metadata, as a type of data, describes content, provides context, documents transactions, and situates data. Interest in metadata has steadily grown over the last several decades, motivated initially by the increase in digital information, open access, early data sharing policies, and interoperability goals. This growth has accelerated in more recent times, due to the increase in research data management policies and advances in AI. Specific to research data management, one of the larger factors has been the global adoption of the FAIR (findable, accessible, interoperable, and reusable) data principles [1,2], which are highly metadata-driven. Additionally, researchers across nearly every domain are interested in leveraging metadata for machine learning and other AI applications.
Funding: Supported by the National Key Technology R&D Program of China (Grant No.: 2013BAI06B01)
Abstract: Purpose: Disseminating medical and health information is a mission of a public medical library. This paper describes the practice of a medical library in providing online access to health information for the general public. Design/methodology/approach: A four-step workflow is developed to integrate and disseminate heterogeneous health information from medical associations. First, a raw data repository is developed to manage the original submissions from information providers. Second, each document in the raw data repository is represented in a standardized XML schema. Third, the medical terms are identified and manually annotated, enriching the semantics of health information. Lastly, all the semantically enriched XML documents are converted to HTML for online dissemination. Findings: A health information website, CHealth, was developed for Chinese speakers. It provides free online access for all without any login or IP constraints. CHealth is available at www.chealth.org.cn. Research limitations: The current workflow is time-consuming and labor-intensive due to the lack of an information submission/exchange standard and a commonly agreed-on consumer health terminology in Chinese. Originality/value: In this work, the target audience of the medical library has been extended from traditional academic/professional users to the general public. Methodologies in library sciences have been combined with those in consumer health informatics in CHealth development.
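The last step of the four-step workflow above (semantically enriched XML converted to HTML) can be illustrated with a minimal sketch. The abstract does not publish the schema, so the element names (`doc`, `title`, `body`, `p`, `term`), the `concept` attribute, and the function names below are all hypothetical, chosen only to show the idea of carrying term annotations through to the web page.

```python
import html
import xml.etree.ElementTree as ET


def render_inline(elem):
    """Render the mixed content of one paragraph, keeping term annotations.

    Assumes (hypothetically) that annotated medical terms are marked as
    <term concept="...">text</term> inside the paragraph.
    """
    parts = [html.escape(elem.text or "")]
    for child in elem:
        text = html.escape(child.text or "")
        if child.tag == "term":
            concept = html.escape(child.get("concept", ""), quote=True)
            parts.append(f'<span class="term" data-concept="{concept}">{text}</span>')
        else:
            # Unknown inline elements are flattened to plain text.
            parts.append(text)
        parts.append(html.escape(child.tail or ""))
    return "".join(parts)


def xml_to_html(xml_text):
    """Convert one enriched XML document (hypothetical schema) to an HTML fragment."""
    root = ET.fromstring(xml_text)
    title = html.escape(root.findtext("title", default=""))
    paragraphs = "".join(f"<p>{render_inline(p)}</p>" for p in root.findall("body/p"))
    return f"<article><h1>{title}</h1>{paragraphs}</article>"


sample = (
    "<doc><title>Flu &amp; you</title>"
    '<body><p>Take <term concept="C001">rest</term> daily.</p></body></doc>'
)
page = xml_to_html(sample)
```

Keeping the annotation as a `data-concept` attribute is one possible design: the HTML stays readable for the general public while the term identifier remains available to scripts or search indexing.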
Abstract: At the Extraordinary G20 Leaders' Summit on COVID-19 on March 26, 2020, China launched an online knowledge center for the prevention and control of novel coronavirus pneumonia, a center open to all countries. Since the outbreak of coronavirus disease, China has been sharing knowledge about disease prevention and control with the World Health Organization and governments throughout the world.
Abstract: Much research has been conducted abroad in recent years concerning the effects of acupuncture and moxibustion on the immunologic functions of the organism. An outline of this is presented as follows. EFFECTS OF ACUPUNCTURE AND MOXIBUSTION ON NORMAL IMMUNOLOGIC FUNCTIONS
Funding: Supported by the National Natural Science Foundation of China (Grant No.: 71173154), the National Social Science Foundation of China (Grant No.: 08BZX076), and the Social Science Foundation of Tongji University (Grant No.: 3850219007).
Abstract: Purpose: This paper tries to understand the dynamics of scientific communication systems during crises by investigating, as a case study, the blogging activities that took place during the period of the 2011 earthquake and related events in Japan. Interactions between bloggers and registered users are studied quantitatively and qualitatively at Sciencenet.cn, an influential science-related blogosphere in China. Design/methodology/approach: The editors of Sciencenet.cn compiled a special issue of science blog articles under the title Analysis of the Japanese Earthquake. We developed a spider program and downloaded from this special issue the metadata about title, content, publishing time, total read count, reply count and recommendation count, and further collected information about bloggers and recommenders. We then sent a short message to the bloggers who wrote articles on these emergencies, asking for their educational and professional background. Findings: We found that knowledge reflected in the blog articles is strongly related to the educational and professional background of bloggers. Knowledge diffusion is facilitated by interactions, such as recommendations, comments and answers. Interactions via comments and recommendations are of an assortative nature: A blog article is more likely to be commented on and recommended by those bloggers who write on the same or similar topics than by those writing on a different one. Registered users tend to give comments on articles dealing with the topic that they recommend, and vice versa. Interaction in the intersection of two or three topics is more intense than that within one topic. The impact of blog articles is also influenced by other factors, such as the reputation of the blogger and the type of information they contain. Implications and limitations: It is confirmed that studying blogs is a valid approach within informetric studies. Yet, we only studied one triple (earthquake, tsunami, nuclear disaster) event based on data originating from one Chinese blog website. More
Abstract: The present study was an attempt to delineate potential groundwater zones in Kalikavu Panchayat of Malappuram district, Kerala, India. The geo-spatial database on geomorphology, land use, geology, slope and drainage network was generated in a geographic information system (GIS) environment from satellite data, Survey of India topographic sheets and field observations. To understand the movement and occurrence of groundwater, the geology, geomorphology, structural set-up and recharging conditions have to be well understood. In the present study, the potential recharge areas are delineated in terms of geology, geomorphology, land use, slope, drainage pattern, etc. Various thematic data generated were integrated using a heuristic method in the GIS domain to generate maps showing potential groundwater zones. The composite output map scores were reclassified into different zones using a decision rule. The final output map shows different zones of groundwater prospect, viz., very good (15.57% of the area), good (43.74%), moderate (28.38%) and poor (12.31%). Geomorphic units such as valley plains, valley fills and alluvial terraces were identified as good to excellent prospect zones, while the gently sloping lateritic uplands were identified as good to moderate zones. Steeply sloping hilly terrains underlain by hard rocks were identified as poor groundwater prospect zones.
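The integration and reclassification steps described above follow the general pattern of a weighted overlay: each thematic layer contributes a rank per map cell, the ranks are combined into a composite score, and a decision rule maps the score to a prospect zone. The sketch below illustrates that pattern only. The layer weights, the 1 to 4 rank scale and the zone thresholds are assumptions made for illustration; the abstract does not report the values the authors used.

```python
# Hypothetical layer weights (must sum to 1); not taken from the study.
WEIGHTS = {
    "geomorphology": 0.30,
    "geology": 0.25,
    "slope": 0.20,
    "landuse": 0.15,
    "drainage": 0.10,
}


def composite_score(cell_ranks, weights=WEIGHTS):
    """Weighted sum of per-layer ranks for one map cell.

    cell_ranks maps each layer name to a rank from 1 (least favourable
    for groundwater) to 4 (most favourable); the scale is an assumption.
    """
    return sum(weights[layer] * cell_ranks[layer] for layer in weights)


def classify(score):
    """Decision rule turning a composite score into a prospect zone.

    Thresholds are illustrative placeholders.
    """
    if score >= 3.5:
        return "very good"
    if score >= 2.5:
        return "good"
    if score >= 1.5:
        return "moderate"
    return "poor"


# Example cells: a valley fill and a steep hard-rock slope (ranks invented).
valley_fill = {"geomorphology": 4, "geology": 4, "slope": 4, "landuse": 4, "drainage": 4}
steep_hill = {"geomorphology": 1, "geology": 1, "slope": 1, "landuse": 1, "drainage": 1}
mixed_cell = {"geomorphology": 3, "geology": 3, "slope": 2, "landuse": 2, "drainage": 2}

zones = [classify(composite_score(c)) for c in (valley_fill, steep_hill, mixed_cell)]
```

In a real GIS workflow the same arithmetic would be applied to whole raster layers at once; the per-cell form is used here only to keep the example dependency-free.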
Funding: Supported by the Key Technology R&D Program of China during the 12th Five-Year Plan period: Super-Class Scientific and Technical Thesaurus and Ontology Construction Faced the Foreign Scientific and Technical Literature (2011BAH10B01)
Abstract: This paper introduces the efforts and achievements of the Agriculture Ontology Service Research Group of the Agricultural Information Institute of Chinese Academy of Agriculture Sciences over the last 10 years. It summarizes the research on ontology construction methodology, ontology management systems, and ontology applications.
Funding: Supported by the Ministry of Science and Technology of China (Project No.: 2011BAH10B06)
Abstract: Purpose: The study was carried out to construct a domain knowledge service system based on the Scientific & Technological Knowledge Organization Systems (STKOS). Design/methodology/approach: The framework of a domain knowledge service system is designed on the basis of the STKOS, and the STKOS science and technology vocabularies, category systems, and ontology networks are applied to realize the knowledge organization and semantic linking of the scientific and technological information resources. Meanwhile, related knowledge-mining analysis algorithms and models are improved, and tools such as Solr and D3 are used for developing the system. This system integrates various knowledge service modules, including unified search of domain information resources and knowledge-linked navigation, monitoring analysis of domain hotspots and burst topics, knowledge structure and evolution analysis, literature citation network analysis, and analysis of research agents' cooperative relationship networks. Findings: The system can help to refine descriptions, knowledge organization, and the semantic linking of various kinds of information resources closely related to science and technology. Such resources include domain literature, institutions, scientists, projects, and more. Research limitations: Trial assessment and performance improvement should be carried out for the knowledge service application on the basis of more types and larger quantities of domain information resources. Practical implications: The domain knowledge service system provides an integrated knowledge discovery tool, as well as several kinds of knowledge mining analysis services for researchers. Originality/value: Our practice can be used as a valuable guide for libraries and information institutions that plan to provide deep domain knowledge services.
Funding: Supported by the National Science and Technology Library (NSTL) and the ISTIC and Thomson Reuters Scientometrics Joint Lab
Abstract: Recognizing the importance of innovation in science and technology (S&T) as a driver of continued economic growth, China has introduced new policies to facilitate and encourage innovation and made significant progress in S&T innovation in the past decade. The total volume of innovation resources is steadily growing. Commensurate with its status as the second largest economy in the world, China now ranks second globally in research and development (R&D) investment.
Abstract: Objective: To compare the performance of five machine learning models and the SAPS II score in predicting 30-day mortality among patients with sepsis. Methods: The sepsis patient-related data were extracted from the MIMIC-IV database. Clinical features were generated and selected by mutual information and grid search. Logistic regression, Random forest, LightGBM, XGBoost, and other machine learning models were constructed to predict the mortality probability. Five measurements, including accuracy, precision, recall, F1 score, and area under curve (AUC), were acquired for model evaluation. An external validation was implemented to avoid conclusion bias. Results: LightGBM outperformed other methods, achieving the highest AUC (0.900), accuracy (0.808), and precision (0.559). All machine learning models performed better than the SAPS II score (AUC=0.748). LightGBM achieved 0.883 in AUC in the external data validation. Conclusions: The machine learning models are more effective in predicting the 30-day mortality of patients with sepsis than the traditional SAPS II score.
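For readers unfamiliar with the five evaluation measures named above, the following sketch computes them from scratch for a binary outcome. It is an illustration under stated assumptions, not the authors' evaluation code: the function names, the 0.5 decision threshold and the toy labels are hypothetical, and AUC is computed by the pairwise-ranking definition (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half).

```python
def classification_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, precision, recall and F1 for binary labels (1 = death, 0 = survival).

    The threshold of 0.5 is an assumption made for this example.
    """
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true, y_prob):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if pp > pn else 0.5 if pp == pn else 0.0 for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
labels = [1, 0, 1, 0]
scores = [0.9, 0.6, 0.4, 0.2]
metrics = classification_metrics(labels, scores)
auc = roc_auc(labels, scores)
```

Because AUC depends only on the ranking of scores, it allows a threshold-free comparison between model probabilities and a severity score such as SAPS II, whereas accuracy, precision, recall and F1 all change with the chosen threshold.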
Abstract: INTRODUCTION Hart once proposed in his work "Corporate Governance: Some Theory and Implications" that governance issues emerge whenever there is a conflict of interest between principal and agent in a principal-agent relationship that is not covered by a complete contract. Governance theory, which developed from this governance issue, holds that an effective institutional arrangement can avoid or reduce the occurrence of these issues and force the agent to maximize the principal's interest.
Funding: An interim research achievement of "Research on the Construction of the System of Main-branch Public Libraries (Sites)" (No. 38, Baoji Official Announcement in 2011), a system design project for Baoji's creation of a demonstration area of the national public cultural service system; "Research of Free Access to Public Libraries in Shaanxi Province in the Perspective of Public Cultural Space" (No. 2012C019), a research project on major issues in theory and practice of the circle of social sciences in Shaanxi Province; and "Research of the Construction of Library Union in Xi'an City" (No. 12WL26), a project of the fund of planning of social sciences in Xi'an City
Abstract: In the context of constructing national public cultural demonstration areas all over China, and taking the system innovation history of Baoji as an example, this paper studies the path choices and service system design for the development of the public library service system in the western region of China. The results show that the key issue facing China's public library service system is promoting system innovation; that the "chain system and grassroots sites of libraries" is a new synthesized innovative program for the public library service system, suited to the actual situations of the Baoji area and other areas in the western region of China; and that the service system design of local public libraries should insist on seeking truth from facts and on carrying out an integrated, simple and feasible reform according to local conditions.
Abstract: China National Information and Documentation Standardization Technical Committee (SAC/TC4), founded in 1979, is the national technology standardization organization engaged in the field of information and documentation in China. Its institutional settings, scope and content of work correspond exactly to those of the Information and Documentation Standardization Technical Committee of the International Organization for Standardization (ISO/TC46). For 30 years, SAC/TC4 has harmonized and organized national standardization work in accordance with the standard working system of ISO, established a clear strategy and principles for drafting standards, set up an open mechanism for standards development, promoted China's information and documentation standardization, and obtained great achievements and valuable experience. Following the rapid development of information and network technologies, standardization work in the field of international information and documentation is facing new challenges. SAC/TC4 also needs to cope with the situation by adopting a variety of strategies.