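Abstract: To address the lack of sound task partitioning and resource allocation in big data processing, this paper proposes a scheduling method for big data tasks. The method first introduces scheduling theory to handle big data tasks, which helps establish a sound management system for big data tasks and standardize the task processing workflow. Next, based on the nature of big data tasks, it analyzes and processes the data set and introduces a decision table for attribute reduction, in order to reduce the data volume of big data analysis tasks and improve analysis efficiency. Finally, it applies the fuzzy comprehensive evaluation method and uses the evaluation results as the basis for task scheduling, in order to make the allocation of resources to tasks more reasonable. In tests on UCI (University of California Irvine) data sets, the experimental results show that the average prediction accuracy of the scheduling algorithm is 7.42 percentage points higher than that of the Naive Bayes (NB) algorithm, 5.16 percentage points higher than that of the error Back Propagation (BP) algorithm, and 3.74 percentage points higher than that of the Root Mean Square Propagation (RMSProp) algorithm. For data sets with a larger number of features, the prediction accuracy of the proposed algorithm improves significantly over the other algorithms. Compared with the HCPFS (Heterogeneous Critical Path First Synthesis) and HIPLTS (Heterogeneous Improved Priority List for Task Scheduling) algorithms, the proposed algorithm reduces the average Scheduling Length Ratio (SLR) by 12.14% and 4.56%, respectively, and increases the average speedup by 7.14% and 42.56%, respectively, indicating that it can effectively improve the efficiency of task scheduling in big data systems. Overall, the comparative analysis shows that the proposed method has high prediction accuracy and is efficient and reliable.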
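The last step of the method, using fuzzy comprehensive evaluation results as the basis for scheduling, can be illustrated with a minimal sketch. It assumes the common weighted-average operator, B = W · R, where W is a factor weight vector and R is a membership matrix (one row per factor, one column per evaluation grade). The abstract does not give the factors, weights, grades, or operator, so every function name and number below is hypothetical.

```python
from typing import Dict, List, Sequence


def fuzzy_evaluate(weights: Sequence[float],
                   membership: Sequence[Sequence[float]]) -> List[float]:
    """Compute B = W . R with the weighted-average operator, then normalize.

    weights[i] is the weight of factor i; membership[i][j] is the degree to
    which factor i belongs to evaluation grade j.
    """
    if not membership or len(weights) != len(membership):
        raise ValueError("one non-empty membership row is required per factor")
    n_grades = len(membership[0])
    b = [sum(w * row[j] for w, row in zip(weights, membership))
         for j in range(n_grades)]
    total = sum(b)
    return [x / total for x in b] if total > 0 else b


def priority_score(evaluation: Sequence[float],
                   grade_values: Sequence[float]) -> float:
    """Collapse an evaluation vector into one number (assumed grade values)."""
    return sum(x * g for x, g in zip(evaluation, grade_values))


def rank_tasks(tasks: Dict[str, Sequence[Sequence[float]]],
               weights: Sequence[float],
               grade_values: Sequence[float]) -> List[str]:
    """Order task names from highest to lowest fuzzy evaluation score."""
    scores = {name: priority_score(fuzzy_evaluate(weights, r), grade_values)
              for name, r in tasks.items()}
    return sorted(scores, key=lambda name: scores[name], reverse=True)


if __name__ == "__main__":
    # Hypothetical data: three factors, three grades (high, medium, low).
    weights = [0.5, 0.3, 0.2]
    grade_values = [1.0, 0.5, 0.0]
    tasks = {
        "t1": [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.1, 0.4, 0.5]],
        "t2": [[0.1, 0.3, 0.6], [0.3, 0.4, 0.3], [0.2, 0.5, 0.3]],
    }
    print(rank_tasks(tasks, weights, grade_values))
```

A scheduler built on this sketch would dispatch tasks in the returned order. Other composition operators (for example max-min) are also used in fuzzy comprehensive evaluation and would slot into `fuzzy_evaluate` in the same way.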
Funding: Supported by ZTE Corporation and State Key Laboratory of Mobile Network and Mobile Multimedia Technology.
Abstract: This paper proposes a universal framework, termed Multi-Task Hybrid Convolutional Neural Network (MHCNN), for joint face detection, facial landmark detection, facial quality, and facial attribute analysis. MHCNN consists of a high-accuracy single stage detector (SSD) and an efficient tiny convolutional neural network (T-CNN) for joint face detection refinement, alignment, and attribute analysis. Although SSD face detectors achieve promising results, we find that applying a tiny CNN to the detections further improves the detected face scores and bounding boxes. Through multi-task training, our T-CNN aims to provide five facial landmarks, facial quality scores, and facial attributes such as wearing sunglasses and wearing masks. Since no public facial quality or facial attribute data meets our needs, we contribute two datasets, namely FaceQ and FaceA, which are collected from the Internet. Experiments show that our MHCNN achieves face detection performance comparable to the state of the art on the Face Detection Data Set and Benchmark (FDDB), and obtains reasonable results on AFLW, FaceQ, and FaceA.
Funding: The authors would like to thank the reviewers for their detailed reviews and constructive comments, which have helped improve the quality of this paper. The research has been partly supported by the National Natural Science Foundation of China (No. 61272528 and No. 61034005) and the Central University Fund (ID-ZYGX2013J073).
Abstract: Many Task Computing (MTC) is a new class of computing paradigm in which the aggregate number of tasks, quantity of computing, and volumes of data may be extremely large. With the advent of cloud computing and the big data era, scheduling and executing large-scale computing tasks efficiently and allocating resources to tasks reasonably have become quite challenging problems. To improve both task execution and resource utilization efficiency, we present a task scheduling algorithm with resource attribute selection, which selects the optimal node to execute a task according to the task's resource requirements and the fitness between the resource node and the task. Experimental results show significant improvements in execution throughput and resource utilization compared with three other algorithms and four scheduling frameworks. In the scheduling algorithm comparison, the throughput is 77% higher than that of the Min-Min algorithm and the resource utilization can reach 91%. In the scheduling framework comparison, the throughput (with work-stealing) is at least 30% higher than that of the other frameworks and the resource utilization reaches 94%. The scheduling algorithm can serve as a good model for practical MTC applications.
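The idea of selecting a node by the fitness between its resources and a task's requirements can be sketched as follows. The abstract does not define the fitness measure, so this sketch assumes a simple one: a node that cannot satisfy a requirement gets fitness 0, and otherwise fitness is the mean ratio of required to available capacity, so a tighter fit scores higher. The attribute names (`cpu`, `mem`) and all function names are hypothetical.

```python
from typing import Dict, Optional

Resources = Dict[str, float]


def fitness(required: Resources, available: Resources) -> float:
    """Assumed fitness in [0, 1]: mean of required/available per attribute.

    Returns 0.0 when the node cannot satisfy some requirement.
    """
    if not required:
        return 1.0  # a task with no requirements fits any node
    ratios = []
    for name, need in required.items():
        have = available.get(name, 0.0)
        if have < need:
            return 0.0
        ratios.append(need / have if have > 0 else 1.0)
    return sum(ratios) / len(ratios)


def select_node(required: Resources,
                nodes: Dict[str, Resources]) -> Optional[str]:
    """Return the feasible node with the highest fitness, or None."""
    best_name: Optional[str] = None
    best_fit = 0.0
    for name, available in nodes.items():
        fit = fitness(required, available)
        if fit > best_fit:
            best_name, best_fit = name, fit
    return best_name


if __name__ == "__main__":
    task = {"cpu": 2.0, "mem": 4.0}
    nodes = {
        "a": {"cpu": 8.0, "mem": 32.0},  # feasible but oversized
        "b": {"cpu": 2.0, "mem": 8.0},   # feasible and a tight fit
        "c": {"cpu": 1.0, "mem": 64.0},  # infeasible: too few CPUs
    }
    print(select_node(task, nodes))  # b
```

Preferring the tightest feasible fit leaves larger nodes free for more demanding tasks, which is one plausible way to raise resource utilization. The paper's actual fitness function may weigh attributes differently.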
Funding: Lee Howorko received funding from MacEwan University, Canada, through the Undergraduate Student Research Initiative Grant.
Abstract: Stacked bar charts are a visualization method for presenting multiple attributes of data, and many visualization tools support these charts. To assess the efficacy of stacked bar charts in supporting attribute comparison tasks, we conducted a user study to compare three types of stacked bar charts: classical, inverting, and diverging. Each chart type was used to visualize six attributes of data, where half of the attributes are 'lower better' and the other half are 'higher better.' Thirty participants were asked to perform two types of comparison tasks: single-attribute and overall-attribute comparisons. We measured the completion time, error rate, and perceived difficulty of the comparison tasks. The results of the study suggest that, for overall-attribute comparisons, the inverting stacked bar chart was the most effective with regard to completion time. The results also show that performing overall-attribute comparisons using the classical and diverging stacked bar charts required more time than performing single-attribute comparisons using these charts. Participants perceived the inverting and diverging stacked bar charts as easier to use than the classical stacked bar chart for overall-attribute comparisons. However, for single-attribute comparisons, all chart types delivered similar performance. We discuss how these findings can inform the better design of interactive stacked bar charts and visualization tools.