Funding: Supported by the National Key Research and Development Program of China (2021YFB3101300, 2021YFB3101303); the Natural Science Foundation of China (U22B2030, 62302374); the Shaanxi Provincial Key Research and Development Program (2023-ZDLGY-35); the China Postdoctoral Science Foundation (2022M722498); the Natural Science Basic Research Plan in Shaanxi Province of China (2023-JC-QN-0699); the Qin Chuangyuan Cited High-level Innovative and Entrepreneurial Talents Project (QCYRCXM-2022-244); and the Science and Technology on Communication Networks Laboratory (HHX23641X003).
Abstract: With the rapid development of location-based services and online social networks, POI recommendation services that consider geographic and social factors have received extensive attention. Meanwhile, the growth of cloud computing has prompted service providers to outsource data to the cloud to provide POI recommendation services. However, service providers do not fully trust the cloud. To protect their digital assets, they encrypt data before outsourcing it, but encryption reduces data availability and makes it harder to provide POI recommendation services in outsourcing scenarios. Some privacy-preserving schemes for geo-social-based POI recommendation have been presented, but their limitations in supporting group queries, in considering both geographic and social factors, and in query accuracy make them impractical. To solve this issue, we propose two practical and privacy-preserving geo-social-based POI recommendation schemes, GSPR-S for a single user and GSPR-G for group users. Specifically, we first use a quad tree to organize geographic data and the MinHash method to index social data. Then, we apply BGV fully homomorphic encryption to design several private algorithms, including a private max/min operation algorithm, a private rectangular set operation algorithm, and a private rectangular overlapping detection algorithm. After that, we use these algorithms as building blocks in our schemes to improve efficiency. Security analysis proves that our schemes are secure against honest-but-curious cloud servers, and experimental results show that our schemes perform well.
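As a rough illustration of the MinHash indexing idea mentioned in the abstract, the sketch below estimates the Jaccard similarity of two sets of friends from short signatures. It works on plaintext only and does not reproduce the encrypted GSPR-S/GSPR-G constructions; the function names, the seeded SHA-256 hash family, and the signature length of 128 are assumptions made for this example.

```python
import hashlib


def _seeded_hash(item: str, seed: int) -> int:
    """Hypothetical hash family: SHA-256 over 'seed:item', truncated to 64 bits."""
    digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(items, num_hashes=128):
    """Return one minimum hash value per seeded hash function."""
    items = set(items)
    if not items:
        raise ValueError("cannot build a signature for an empty set")
    return [min(_seeded_hash(x, seed) for x in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of matching signature positions, an estimate of Jaccard similarity."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


if __name__ == "__main__":
    alice_friends = {"bob", "carol", "dave", "erin"}
    frank_friends = {"carol", "dave", "grace", "heidi"}
    sig_a = minhash_signature(alice_friends)
    sig_b = minhash_signature(frank_friends)
    # The true Jaccard similarity is 2/6; the estimate fluctuates around it.
    print(estimate_jaccard(sig_a, sig_b))
```

Longer signatures reduce the variance of the estimate at the cost of more hash evaluations per set.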
Abstract: This paper studies a high-speed text-independent Automatic Speaker Recognition (ASR) algorithm based on the Gaussian Mixture Model (GMM) on a multicore system. The high speed is achieved by parallelizing the feature extraction and aggregation methods during the training and testing procedures. Shared-memory parallel programming techniques using both the OpenMP and PThreads libraries are developed to accelerate the code and improve the performance of the ASR algorithm. The experimental results show a speed-up of around 3.2 on a personal laptop with an Intel i5-6300HQ (2.3 GHz, four cores without hyper-threading, and 8 GB of RAM). In addition, 100% speaker recognition accuracy is achieved.
Funding: Supported by the National Natural Science Foundation of China under Grants No. 62273083 and No. 61973069, and the Natural Science Foundation of Hebei Province under Grant No. F2020501012.
Abstract: Wireless sensor network (WSN) positioning performs well indoors and has therefore received extensive attention in the field of positioning. Non-line-of-sight (NLOS) propagation is a primary challenge in complex indoor environments. In this paper, a robust localization algorithm based on the Gaussian mixture model and fitting polynomials is proposed to solve the problem of NLOS error. First, fitting polynomials are used to predict the measured values. The residuals between the predicted and measured values are clustered by a Gaussian mixture model (GMM), and the LOS and NLOS probabilities are calculated from the cluster centers. The measured values are then filtered in turn by a Kalman filter (KF), a variable parameter unscented Kalman filter (VPUKF), and a variable parameter particle filter (VPPF). The distance value processed by the KF and VPUKF and the distance value processed by the KF, VPUKF, and VPPF are combined according to these probabilities. Finally, the maximum likelihood method is used to estimate the position coordinates. Simulations show that the proposed algorithm achieves better positioning accuracy than the algorithms it is compared against and remains robust in strong NLOS environments.
Funding: Supported by the National Natural Science Foundation of China (61771367) and the Science and Technology on Communication Networks Laboratory (HHS19641X003).
Abstract: Because the computational cost of the joint probabilistic data association (JPDA) algorithm explodes as the number of targets increases, a multi-target tracking algorithm based on Gaussian mixture model (GMM) clustering is proposed. The algorithm clusters the measurements, and the association matrix between measurements and tracks is constructed from the posterior probabilities. Compared with the traditional data association algorithm, this algorithm has better tracking performance and lower computational complexity. Simulation results demonstrate the effectiveness of the proposed algorithm.
Abstract: A dynamic learning rate Gaussian mixture model (GMM) algorithm is proposed to address the slow adaptation of the GMM in moving object detection for outdoor surveillance, especially in the presence of sudden illumination changes. The GMM is widely used for detecting objects in complex scenes in intelligent monitoring systems. To solve this problem, a Gaussian mixture model is built for each pixel in the video frame, and the learning rate of the GMM is adjusted dynamically according to the scene change measured from the frame difference. The experiments show that the proposed method, with its adaptive learning rate, gives better results than a GMM with a fixed learning rate. Tests on a dataset with sudden natural light changes show that the method has better accuracy and a lower false alarm rate.
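The idea of tying the learning rate to the frame difference can be sketched as follows. This is only an illustration under stated assumptions, not the paper's update rule: it tracks a single running mean per pixel rather than a full mixture, and the linear mapping, the pixel threshold of 25, and the rate bounds 0.005 and 0.1 are hypothetical values chosen for the example.

```python
def changed_fraction(prev_frame, curr_frame, pixel_threshold=25):
    """Fraction of pixels whose intensity changed by more than pixel_threshold."""
    if len(prev_frame) != len(curr_frame) or not curr_frame:
        raise ValueError("frames must be non-empty and of equal length")
    changed = sum(
        1 for p, c in zip(prev_frame, curr_frame) if abs(c - p) > pixel_threshold
    )
    return changed / len(curr_frame)


def dynamic_learning_rate(fraction, base_rate=0.005, max_rate=0.1):
    """Interpolate linearly between base_rate (static scene) and max_rate (global change)."""
    fraction = min(max(fraction, 0.0), 1.0)
    return base_rate + (max_rate - base_rate) * fraction


def update_background_mean(mean, pixel, rate):
    """Running-average update of one background Gaussian's mean."""
    return mean + rate * (pixel - mean)


if __name__ == "__main__":
    previous = [100] * 8          # flattened grayscale frame
    current = [180] * 8           # sudden global illumination change
    rate = dynamic_learning_rate(changed_fraction(previous, current))
    print(rate, update_background_mean(100.0, 180.0, rate))
```

With a fixed small rate the background mean would need many frames to reach the new brightness; raising the rate when most pixels change at once shortens that adaptation period.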
Funding: Supported by the National High Technology Research and Development Program of China (863 Program, No. 2006AA010102).
Abstract: A voice conversion algorithm aims to provide a high level of similarity to the target voice with an acceptable level of quality. The main objective of this paper was to build a nonlinear relationship between the acoustic feature parameters of the source and target speakers using Non-Linear Canonical Correlation Analysis (NLCCA) based on a joint Gaussian mixture model. Speaker individuality transformation was achieved mainly by altering the vocal tract characteristics represented by Line Spectral Frequencies (LSF). To make the transformed speech sound more like the target voice, prosody modification was applied through residual prediction. Both objective and subjective evaluations were conducted. The experimental results demonstrated that the proposed algorithm was effective and outperformed the conventional conversion method based on Minimum Mean Square Error (MMSE) estimation.
Funding: Supported by the Visvesvaraya Ph.D. Scheme for Electronics and IT students launched by the Ministry of Electronics and Information Technology (MeiTY), Government of India, under Grant No. PhD-MLA/4(95)/2015-2016.
Abstract: In this paper, we present a comparison of Khasi speech representations using four different spectral features, together with a novel extension towards the development of Khasi speech corpora. The four features are linear predictive coding (LPC), linear prediction cepstrum coefficient (LPCC), perceptual linear prediction (PLP), and Mel frequency cepstral coefficient (MFCC). Ten hours of speech data were used for training and three hours for testing. For each spectral feature, different hidden Markov model (HMM) based recognizers were built, varying the number of HMM states and the Gaussian mixture models (GMMs). The performance was evaluated using the word error rate (WER). The experimental results show that MFCC provides a better representation of Khasi speech than the other three spectral features.
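For readers unfamiliar with the metric, the word error rate is conventionally computed as the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words. The sketch below shows that standard definition; the function name and the whitespace tokenization are assumptions for this example and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current.append(
                min(
                    previous[j] + 1,                       # deletion
                    current[j - 1] + 1,                    # insertion
                    previous[j - 1] + substitution_cost,   # match or substitution
                )
            )
        previous = current
    return previous[-1] / len(ref_words)


if __name__ == "__main__":
    print(word_error_rate("the cat sat down", "the cat sat"))  # one deletion out of four words
```

Because insertions count as errors, the WER can exceed 1.0 when the hypothesis is much longer than the reference.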
Funding: Supported by the National Key R&D Program of China [grant numbers 2019YFA0607201 and 2017YFA0604401]; the National Natural Science Foundation of China [grant number 41871306]; the Guangdong Natural Science Funds for Distinguished Young Scholar [grant number 2021B1515020104]; and the Fundamental Research Funds for the Central Universities [grant number 20lgzd09].
Abstract: Accurate delineation of urban form is essential to understanding the impacts of urbanization on the environment and regional climate. Conventional supervised classification of urban form requires a rigidly defined scheme and high-quality sample data with class labels. Due to the complexity of urban systems, it is challenging to define urban form types consistently and to collect metadata that describe them. Therefore, in this study, we propose a novel unsupervised deep learning method for urban form delineation that avoids the limitations of conventional supervised urban form classification methods. The novelty of the proposed method is the Multiscale Residual Convolutional Autoencoder (MRCAE), which can learn latent representations of different urban form types. These latent vectors can then be used to generalize urban form types with a Self-Organizing Map (SOM) and a Gaussian Mixture Model (GMM). The proposed method is applied to the Guangzhou-Foshan metropolitan area, China, where the MRCAE model, together with the SOM and GMM, is used to generalize the urban form types from satellite images. The physical and functional properties of each urban form type are also analyzed using several auxiliary datasets, including building footprints, Points-of-Interest (POIs), and Tencent User Density (TUD) data. The results reveal that the urban form map generated from the MRCAE can explain 55% of the building height distribution and 55% of the building area distribution, which are 2.1% and 3.3% higher than those derived from the conventional convolutional autoencoder. As urban form information is essential to urban climate models, the results presented in this study can serve as a basis for refining the quantification of urban climate parameters, thereby introducing urban heterogeneity to help understand the climate response to future urbanization.
Funding: Supported in part by the National Natural Science Foundation of China (Nos. 62076126, 52075031) and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (No. SJCX19_0013).
Abstract: Timely and accurate detection of abnormal aircraft trajectories is critical to improving flight safety. However, existing anomaly detection methods based on machine learning cannot characterize the features of aircraft trajectories well, and detection accuracy remains low because of the high dimensionality, heterogeneity, and temporality of flight trajectory data. To this end, this paper proposes an abnormal trajectory detection method based on the deep mixture density network (DMDN) to detect flights with unusual data patterns and evaluate flight trajectory safety. The technique consists of two components: a deep long short-term memory (LSTM) network that encodes the features of flight trajectories effectively, and a Gaussian mixture model (GMM) that parameterizes the statistical properties of the flight trajectory. Experimental results on the terminal airspace of Guangzhou Baiyun International Airport show that the proposed method can effectively capture the statistical patterns of aircraft trajectories. The model can detect abnormal flights with elevated risks, and its performance is superior to that of two mainstream methods. The proposed model can be used as a decision-support tool for air traffic controllers.
Abstract: Intrusion detection is the process of investigating information about system activities or data to detect malicious behavior or unauthorized activity. Most intrusion detection systems (IDS) implement the K-means clustering technique because of its linear complexity and fast computation. Nonetheless, its naive use of the mean data value as the cluster core is a major drawback. Two circular clusters with different radii can be centered at the same mean, a case that K-means cannot handle because the cluster means are nearly identical. K-means also fails when the clusters are not spherical. To overcome this issue, a new hybrid model is proposed that integrates expectation maximization (EM) clustering using a Gaussian mixture model (GMM) with a naïve Bayes classifier. In this model, the GMM gives more flexibility than K-means in terms of cluster covariance. It also uses probability functions and soft clustering, so a single data point can belong to multiple clusters. In a GMM, the cluster form is defined by two parameters, the mean and the standard deviation, which means that a cluster can take any kind of elliptical shape. EM-GMM is used to cluster data into the corresponding category based on data activity.
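The case of two clusters sharing a mean but differing in spread can be made concrete with a small one-dimensional sketch. It computes only the soft assignments (the E-step responsibilities) of a two-component Gaussian mixture with fixed, hypothetical parameters; it is not the paper's hybrid model and does not run a full EM fit.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def responsibilities(x, weights, means, stds):
    """Posterior probability of each mixture component given the point x."""
    joint = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
    total = sum(joint)
    return [j / total for j in joint]


if __name__ == "__main__":
    # Two components with the same mean but different spread (assumed values).
    weights, means, stds = [0.5, 0.5], [0.0, 0.0], [1.0, 5.0]
    for x in (0.5, 6.0):
        narrow, wide = responsibilities(x, weights, means, stds)
        print(f"x={x}: narrow={narrow:.3f}, wide={wide:.3f}")
```

A distance-to-mean rule such as the K-means assignment step cannot separate these two components because both centers coincide, whereas the mixture assigns points near the center mostly to the narrow component and distant points to the wide one.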
Funding: The work described in this paper was supported by a grant of the General Research Fund (GRF) from the Research Grant Council of Hong Kong SAR (Project No. CUHK418011E).
Abstract: Three Bayesian-related approaches, namely variational Bayesian (VB), minimum message length (MML), and Bayesian Ying-Yang (BYY) harmony learning, have been applied to automatically determine an appropriate number of components while learning a Gaussian mixture model (GMM). This paper provides a comparative investigation of these approaches with not only a Jeffreys prior but also a conjugate Dirichlet-Normal-Wishart (DNW) prior on the GMM. In addition to adopting the existing algorithms either directly or with some modifications, the algorithm for VB with the Jeffreys prior and the algorithm for BYY with the DNW prior are developed in this paper to fill the gap. The performance of automatic model selection is evaluated through extensive experiments, with several empirical findings: 1) When priors are placed only on the mixing weights, each of the three approaches makes biased mistakes, while placing priors on all the parameters of the GMM reduces the bias of each approach and improves its performance. 2) When the Jeffreys prior is replaced by the DNW prior, all three approaches improve. Moreover, the Jeffreys prior makes MML slightly better than VB, while the DNW prior makes VB better than MML. 3) When the hyper-parameters of the DNW prior are further optimized by each approach's own learning principle, BYY improves, while VB and MML deteriorate when there are too many free hyper-parameters. In fact, VB and MML lack a good guide for optimizing the hyper-parameters of the DNW prior. 4) BYY considerably outperforms both VB and MML for any type of prior, whether or not the hyper-parameters are optimized. Unlike VB and MML, which rely on appropriate priors to perform model selection, BYY does not depend strongly on the type of prior. It has model selection ability even without priors, already performs very well with the Jeffreys prior, and improves incrementally when the Jeffreys prior is replaced by the DNW prior. Finally, all algorithms are applied to the Berkeley segmentation database of real-world images. Again, BYY co...
Funding: Supported by the National Natural Science Foundation of China (Grant No. 60702032), the Natural Science Foundation of Heilongjiang Province (No. F201021), and the Natural Scientific Research Innovation Foundation in Harbin Institute of Technology (No. HIT.NSRIF.2008.63).
Abstract: Rain and snow seriously degrade outdoor video quality. In this work, a primary-secondary background model for the removal of rain and snow is built. First, we analyze the video noise and use a sliding-window sequence principal component analysis denoising algorithm to reduce white noise in the video. Next, we apply the Gaussian mixture model (GMM) to model the video and perform a primary segmentation of all foreground objects. After that, with reference to the result of the primary segmentation, we calculate the von Mises distribution of the velocity vectors and the ratio of the overlapped region, and extract the objects of interest. Finally, rain and snow streaks are inpainted using the background to improve the quality of the video data. Experiments show that the proposed method can effectively suppress noise and extract the targets of interest.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 60705019), the National High Technology Research and Development Program of China (Nos. 2006AA010102 and 2007AA01Z417), the NOKIA project, and the 111 Project (No. B08004).
Abstract: Traditional multi-class classification methods based on the Fisher kernel combine the generative models, such as Gaussian mixture models (GMMs), of all the classes. However, this combination generates high-dimensional feature vectors and leads to heavy computation. In this paper, a new classification method is proposed. The method adopts an intelligent feature space selection strategy that clusters similar Gaussian mixtures in order to reduce the feature dimensions. Audio classification experiments show that the proposed method is more accurate and effective, with less computation, than traditional methods.
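Abstract: The traditional Gaussian mixture model (GMM) algorithm does not consider the spatial information of pixels in image segmentation, which makes it very sensitive to noise. The Markov random field (MRF) model introduces the spatial information of the image through a Gibbs-distributed prior probability on the pixel class labels and can segment noisy images well; however, its segmentation results are prone to over-smoothing. To address these shortcomings, a new MRF image segmentation model based on an image-patch weighting method is proposed. It assigns different weights to the image patches within a neighborhood according to their similarity, so that it preserves image detail while overcoming the influence of noise. In addition, the KL distance is used to introduce an entropy penalty term between the prior and posterior probabilities, and this penalty term is smoothed to obtain the final segmentation result. Experimental results show that the algorithm is strongly adaptive, effectively overcomes the influence of noise on the segmentation results, and achieves high segmentation accuracy.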