Software defect detection aims to automatically identify defective software modules for efficient software testing in order to improve the quality of a software system. Although many machine learning methods have been successfully applied to the task, most of them fail to consider two practical yet important issues in software defect detection. First, it is rather difficult to collect a large amount of labeled training data for learning a well-performing model; second, in a software system there are usually far fewer defective modules than defect-free modules, so learning has to be conducted over an imbalanced data set. In this paper, we address these two practical issues simultaneously by proposing a novel semi-supervised learning approach named Rocus. This method exploits the abundant unlabeled examples to improve the detection accuracy and employs under-sampling to tackle the class-imbalance problem in the learning process. Experimental results on real-world software defect detection tasks show that Rocus is effective for software defect detection. Its performance is better than that of a semi-supervised learning method that ignores the class-imbalance nature of the task and that of a class-imbalance learning method that does not make effective use of unlabeled data.
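The combination described in this abstract, learning from unlabeled modules while under-sampling the majority class, can be illustrated with a small sketch. This is not the Rocus algorithm, whose details the abstract does not give; it is a generic self-training loop with random under-sampling and a nearest-centroid classifier, chosen only to keep the example dependency-free. All function names and the toy data are assumptions made for illustration, and the sketch assumes both classes appear in the labeled data.

```python
import random
from statistics import mean


def undersample(examples, labels, rng):
    """Randomly drop majority-class examples until both classes have equal size."""
    pos = [x for x, y in zip(examples, labels) if y == 1]
    neg = [x for x, y in zip(examples, labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    minority_label = 1 if minority is pos else 0
    kept = rng.sample(majority, len(minority))
    data = [(x, minority_label) for x in minority]
    data += [(x, 1 - minority_label) for x in kept]
    rng.shuffle(data)
    return data


def fit_centroids(data):
    """Nearest-centroid 'model': one mean feature vector per class."""
    return {
        label: [mean(col) for col in zip(*[x for x, y in data if y == label])]
        for label in (0, 1)
    }


def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def predict(model, x):
    """Return the label (0 = defect-free, 1 = defective) of the closest centroid."""
    return min(model, key=lambda label: sq_dist(model[label], x))


def self_train(labeled_x, labeled_y, unlabeled_x, rounds=3, seed=0):
    """Each round: balance the labeled set, fit, then pseudo-label the
    unlabeled example with the largest distance margin between centroids."""
    rng = random.Random(seed)
    xs, ys = list(labeled_x), list(labeled_y)
    pool = list(unlabeled_x)
    for _ in range(rounds):
        if not pool:
            break
        model = fit_centroids(undersample(xs, ys, rng))
        best = max(pool, key=lambda x: abs(sq_dist(model[0], x) - sq_dist(model[1], x)))
        pool.remove(best)
        xs.append(best)
        ys.append(predict(model, best))
    return fit_centroids(undersample(xs, ys, rng))


# Toy data (invented): six defect-free modules, two defective ones.
labeled_x = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [0.2, 0.8], [5, 5], [6, 5]]
labeled_y = [0, 0, 0, 0, 0, 0, 1, 1]
unlabeled_x = [[5.5, 5.5], [0.1, 0.3], [4.8, 6.0], [0.9, 0.2]]

model = self_train(labeled_x, labeled_y, unlabeled_x)
print(predict(model, [5, 5]), predict(model, [0, 0]))
```

Under-sampling is applied before every fit so that the few defective modules are not outweighed by the defect-free majority; a real system would replace the centroid model with a stronger learner and a more careful confidence measure.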
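To address surface defects in hot-rolled coils, a new method for optimizing continuous casting process parameters, the prediction model method (PMM), is proposed based on a neural network prediction model from big-data mining. PMM yields multi-sample continuous variation curves of the probability of surface defects as a function of each continuous casting parameter, from which the corresponding influence patterns, key process parameters, and critical values are derived. The results show that, among the argon blowing parameters, the shielding argon flow rate has the most pronounced effect on surface defects in low-carbon steel hot-rolled coils and is negatively correlated with them; the optimal argon flow rates at the stopper rod and the nozzle are 3.0 and 1.8 L/min, respectively. Among the mold heat flux parameters, the water flow rate on the inner-arc side has the most pronounced effect; the optimal range of the water temperature difference on each face is 7 to 9 ℃, and the optimal inlet water temperature is around 35 ℃. The probability of surface defects also increases markedly with higher casting speed, greater slab width, and longer casting length, but decreases gradually as the mass of molten steel in the tundish increases. In addition, a comparison shows that casting speed, slab width, shielding argon flow rate, and mold cooling water flow rate are the key continuous casting parameters affecting the formation of surface defects in hot-rolled coils, and that the defect probability is most sensitive to fluctuations in the total mold cooling water flow rate, whose critical lower limit is 8700 L/min.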
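One plausible reading of the "multi-sample continuous variation curves" is a parameter sweep: for each value of one casting parameter, the trained model is evaluated on many recorded samples with that parameter overwritten, and the predicted defect probabilities are averaged. The sketch below assumes this reading. The function `toy_defect_probability` is a hypothetical stand-in for the paper's neural network, its coefficients are invented, and the parameter names and sample values are illustrative only.

```python
import math


def toy_defect_probability(sample):
    """Hypothetical stand-in for a trained network; coefficients are invented."""
    z = 0.8 * sample["casting_speed"] - 0.5 * sample["argon_flow"]
    return 1.0 / (1.0 + math.exp(-z))


def sweep_parameter(model, samples, name, grid):
    """For each grid value, overwrite one parameter in every sample and
    average the predicted defect probability across samples."""
    curve = []
    for value in grid:
        probs = [model({**sample, name: value}) for sample in samples]
        curve.append((value, sum(probs) / len(probs)))
    return curve


# Illustrative samples, not measurements from the paper.
samples = [
    {"casting_speed": 1.0, "argon_flow": 2.0},
    {"casting_speed": 1.4, "argon_flow": 3.0},
]
grid = [1.0, 2.0, 3.0, 4.0]

curve = sweep_parameter(toy_defect_probability, samples, "argon_flow", grid)
for value, probability in curve:
    print(f"argon_flow={value:.1f}  mean defect probability={probability:.3f}")
```

Reading such a curve for its slope and for the value beyond which the probability rises sharply is how an influence pattern and a critical value could be extracted, under the assumptions stated above.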
The quality of a product depends on both facilities/equipment and manufacturing processes. Any error or disorder in facilities and processes can cause a catastrophic failure. To avoid such failures, a zero-defect manufacturing (ZDM) system is necessary in order to increase the reliability and safety of manufacturing systems and reach zero-defect quality of products. One of the major challenges for ZDM is the analysis of massive raw datasets. This type of analysis needs an automated and self-organized decision-making system. Data mining (DM) is an effective methodology for discovering interesting knowledge within huge datasets. It plays an important role in developing a ZDM system. The paper presents a general framework of ZDM and explains how to apply DM approaches to manufacture products with zero defects. This paper also discusses three ongoing projects demonstrating the practice of using DM approaches for reaching the goal of ZDM.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 60975043, 60903103, and 60721002.