The random forests (RF) algorithm, which combines the predictions of an ensemble of random trees, has achieved significant improvements in classification accuracy. In many real-world applications, however, ranking is often required in order to make optimal decisions. Thus, we focus on the ranking performance of RF in this paper. Our experimental results, based on all 36 UC Irvine Machine Learning Repository (UCI) data sets published on the main website of the Weka platform, show that RF does not perform well in ranking and is only about as good as a single C4.4 tree. This raises the question of whether several improvements to RF can scale up its ranking performance. To answer this question, we single out an improved random forests (IRF) algorithm. Instead of the information gain measure and the maximum-likelihood estimate, IRF uses the average gain measure and the similarity-weighted estimate. Our experiments show that IRF significantly outperforms all the other algorithms compared in terms of ranking while maintaining the high classification accuracy that characterizes RF.
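The distinction between classification accuracy and ranking performance can be illustrated with a small sketch. The abstract does not say which ranking measure the authors use; the code below assumes AUC (the probability that a randomly chosen positive example is scored above a randomly chosen negative one), which is a common choice. The function names and the toy scores are hypothetical and are not taken from the paper.

```python
def accuracy(scores, labels, threshold=0.5):
    """Fraction of examples classified correctly when score >= threshold means 'positive'."""
    correct = sum((s >= threshold) == (y == 1) for s, y in zip(scores, labels))
    return correct / len(labels)


def auc(scores, labels):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (assumed for illustration): three positives followed by three negatives.
labels = [1, 1, 1, 0, 0, 0]
# Graded class-probability estimates.
graded = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
# Hard 0/1 outputs that make exactly the same decisions at threshold 0.5.
hard = [1.0, 1.0, 0.0, 1.0, 0.0, 0.0]

print(accuracy(graded, labels), accuracy(hard, labels))
print(auc(graded, labels), auc(hard, labels))
```

Both scorers make identical decisions at the 0.5 threshold, so their accuracy is the same, yet the graded estimates order the examples better. This is one way to see why a model can classify well and still rank poorly when its probability estimates are coarse.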
Very often, the cost of operating an Intrusion Detection System (IDS) exceeds the cost of purchasing the IDS itself. In such cases, regular operation and maintenance of the system become expensive. Thus, it becomes essential to reduce the operating cost of the IDS without compromising its performance and reliability. Apart from the initial cost of procuring the IDS, other costs include the cost of required accessories and the cost of administration. In this paper we calculate the cost-benefit tradeoffs of an IDS and propose a method to determine its optimum operating point. To solve the problems of previously proposed metrics, we propose a decision-tree-based approach to calculating the cost of operating an IDS in a mobile ad hoc network (MANET). Mathematically and programmatically, we deduce the minimum operating point of an IDS and generate its receiver operating characteristic (ROC) curve. To further verify this, we use available network packet capture data and calculate the minimum operating cost of an IDS. The main motive behind this paper is to show that the cost of operating an IDS in a MANET can be minimized and hence the effectiveness and performance of the IDS can be maximized.
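The idea of picking an operating point on a ROC curve by cost can be sketched as follows. The abstract does not give the authors' cost model, so this is only an illustration under a standard assumption: the expected cost per event is `p_attack * (1 - TPR) * cost_fn + (1 - p_attack) * FPR * cost_fp`. All names, costs, and scores here are hypothetical.

```python
def roc_points(scores, labels):
    """Return (threshold, FPR, TPR) triples; an alert is raised when score >= threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need both attack (1) and normal (0) examples")
    points = []
    # Every distinct score is a candidate threshold; +inf means 'never alert'.
    for t in sorted(set(scores)) + [float("inf")]:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((t, fp / n_neg, tp / n_pos))
    return points


def min_cost_point(points, p_attack, cost_fn, cost_fp):
    """Pick the ROC point with the lowest expected cost under the assumed cost model."""
    def expected_cost(point):
        _, fpr, tpr = point
        missed = p_attack * (1.0 - tpr) * cost_fn        # undetected attacks
        false_alarms = (1.0 - p_attack) * fpr * cost_fp  # alerts on normal traffic
        return missed + false_alarms

    best = min(points, key=expected_cost)
    return best, expected_cost(best)


# Toy detector scores (assumed): label 1 = attack, 0 = normal traffic.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels)

# Missed attacks are ten times as costly as false alarms.
best_point, best_cost = min_cost_point(points, p_attack=0.1, cost_fn=10.0, cost_fp=1.0)
print(best_point, best_cost)
```

Swapping the two costs moves the chosen threshold upward, since false alarms then dominate the expected cost. The attack prior and the cost ratio are the inputs that determine where on the curve the minimum falls.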