Funding: Supported by the Guangdong Basic and Applied Basic Research Foundation under Grant No. 2024A1515012485 and in part by the Shenzhen Fundamental Research Program under Grant JCYJ20220810112354002.
Abstract: This paper addresses the problem of predicting population density from cellular station data. Because wireless communication devices are now in common use, cellular station data has become integral to estimating population figures and studying population movement, and thus contributes significantly to urban planning. However, existing research struggles with both the preprocessing of base station data and the modeling of population prediction. To address this, we propose methods for preprocessing cellular station data that eliminate irregular and redundant records. The preprocessed data reveal a distinct cyclical pattern and high-frequency variation in population shift. We further devise a multi-view enhancement model based on the Transformer (MVformer), aimed at improving the accuracy of long-term time-series population prediction. Comparative experiments against four other Transformer-based models on this population dataset indicate that the proposed MVformer model improves prediction accuracy by approximately 30% on both univariate and multivariate time-series prediction tasks. The model therefore performs well on population prediction tasks.
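The abstract does not say how irregular or redundant records are detected, so the following is only a minimal sketch of one plausible cleaning step. It assumes a hypothetical record layout of `(station_id, hour, count)`, treats repeated `(station_id, hour)` pairs as redundant, and flags irregular counts with a modified z-score based on the median absolute deviation; the function name, the layout, and the threshold of 3.5 are all illustrative assumptions, not the authors' method.

```python
from statistics import median


def preprocess(records, threshold=3.5):
    """Drop duplicate and irregular (station_id, hour, count) records.

    records:   list of (station_id, hour, count) tuples (hypothetical layout).
    threshold: cutoff on the modified z-score (assumed, a common default).
    """
    # 1. Remove redundant rows: keep the first record per (station, hour).
    seen = set()
    unique = []
    for station, hour, count in records:
        key = (station, hour)
        if key not in seen:
            seen.add(key)
            unique.append((station, hour, count))

    # 2. Group the remaining counts by station.
    by_station = {}
    for station, _, count in unique:
        by_station.setdefault(station, []).append(count)

    # 3. Remove irregular rows: per-station outliers by modified z-score,
    #    computed from the median absolute deviation (MAD).
    cleaned = []
    for station, hour, count in unique:
        counts = by_station[station]
        med = median(counts)
        mad = median(abs(c - med) for c in counts)
        # If MAD is zero, all typical values coincide; keep the row.
        if mad == 0 or 0.6745 * abs(count - med) / mad <= threshold:
            cleaned.append((station, hour, count))
    return cleaned
```

A median-based rule is used here because a single extreme reading does not distort the median the way it distorts a mean, which matters when counts already show strong daily cycles.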
Funding: Supported by the National Natural Science Foundation of China (No. 61976247).
Abstract: Aspect-based sentiment analysis (ABSA) consists of two subtasks: aspect term extraction and aspect sentiment prediction. Existing methods handle the two subtasks one after the other in a pipeline, which causes problems in both performance and real-world application. This study investigates end-to-end ABSA and proposes a novel multitask multiview network (MTMVN) architecture. Specifically, the architecture takes unified ABSA as the main task and the two subtasks as auxiliary tasks. The representation obtained from the branch network of the main task is regarded as the global view, whereas the representations of the two subtasks are regarded as two local views with different emphases. Through multitask learning, the main task benefits from additional, accurate aspect boundary information and sentiment polarity information. By enhancing the correlations between the views, following the idea of multiview learning, the representation of the global view can be optimized to improve the overall performance of the model. Experimental results on three benchmark datasets show that the proposed method outperforms existing pipeline and end-to-end methods, demonstrating the superiority of the MTMVN architecture.
Funding: This work was supported by Grant-in-Aid for Scientific Research (C) (No. 17500119).
Abstract: This paper describes a multiple-camera method for reconstructing the 3D shape of a human foot. From a foot database, an initial 3D model of the foot, represented by a cloud of points, is built. The shape parameters, which can characterize more than 92% of a foot, are defined using principal component analysis. Then, using "active shape models", the initial 3D model is adapted to the real foot captured in multiple images by applying constraints on edge points' distance and color variance. We emphasize the experimental part, where we demonstrate the efficiency of the proposed method on a plastic foot model and on real human feet of various shapes. We propose and compare different ways of texturing the foot, which is needed for reconstruction. We present an experiment performed on the plastic foot model and on human feet, and based on its results we propose two ways to improve the accuracy of the final 3D shape. The first improvement is the densification of the cloud of points used to represent the initial model and the foot database. The second concerns the projected patterns used to texture the foot. We conclude by showing the results obtained for a human foot, with an average computed shape error of only 1.06 mm.
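As a small illustration of how a figure such as "more than 92%" typically arises in principal component analysis, the sketch below picks the smallest number of components whose eigenvalues reach a target share of the total variance. It assumes that the 92% in the abstract refers to cumulative explained variance, which the abstract does not state; the function name and the eigenvalues in the usage note are made up for illustration.

```python
def components_for_variance(eigenvalues, target=0.92):
    """Return the smallest k such that the k largest eigenvalues
    account for at least `target` of the total variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= target:
            return k
    # Only reached if target exceeds 1.0; fall back to all components.
    return len(ordered)
```

For example, with the illustrative eigenvalues `[50, 25, 12, 6, 4, 3]`, the first three components explain 87% of the variance and the first four explain 93%, so four shape parameters would be retained.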
Abstract: Human activity recognition is an active area of research. Activity recognition has many applications: in smart homes it can be used to observe and track toddlers or elderly people for their safety, and it can also be used to monitor indoor and outdoor activities, develop tele-immersion systems, or detect abnormal activities. Three-dimensional (3D) skeleton data is robust and somewhat view-invariant, which makes it one of the popular choices for human action recognition. This paper proposes using a transversal tree built from 3D skeleton data to represent videos as a sequence. It further proposes two neural networks: convolutional neural network recurrent neural network_1 (CNN_RNN_1), used to find the optimal features, and convolutional neural network recurrent neural network_2 (CNN_RNN_2), used to classify actions. Both are deep neural network models built from convolutional neural network (CNN), long short-term memory (LSTM), and bidirectional long short-term memory (BiLSTM) layers. The system achieves an accuracy of 88.89%. The NTU RGB+D dataset, one of the large benchmark datasets for human activity recognition, is used for the experiments. The performance of the proposed model is compared with existing state-of-the-art models, and the comparison shows that the proposed model outperforms them.
Funding: Partially supported by Innoviris (3-DLicornea project) and FWO (project G.0256.15); supported by the National Natural Science Foundation of China (Nos. 61272226 and 61373069), the Research Grant of Beijing Higher Institution Engineering Research Center, the Tsinghua-Tencent Joint Laboratory for Internet Innovation Technology, and the Tsinghua University Initiative Scientific Research Program.
Abstract: Multiview video can provide more immersive perception than traditional single 2-D video. It enables both interactive free-navigation applications and high-end autostereoscopic displays on which multiple users can perceive genuine 3-D content without glasses. The multiview format also comprises much more visual information than classical 2-D or stereo 3-D content, which makes it possible to perform various editing operations at both the pixel level and the object level. This survey provides a comprehensive review of existing multiview video synthesis and editing algorithms and applications. For each topic, the related technologies in classical 2-D image and video processing are reviewed. We then discuss recent advanced techniques for multiview video virtual view synthesis and various interactive editing applications. Given the ongoing progress in multiview video synthesis and editing, we foresee that more and more immersive 3-D video applications will appear in the future.
Funding: This work was supported by the National Key Research and Development Program of China [grant number 2017YFC0803802] and the National Natural Science Foundation of China [grant number 41771486].
Abstract: Aiming at reliable matching of multiple remote sensing images for the generation of digital surface models, this paper presents a geometrically constrained multi-view image matching method based on an energy minimization framework. Using a geometric constraint, the cost value of the energy function is calculated from multiple images and aggregated in image space with a semi-global optimization approach. A method for calculating homography transform parameters is proposed so that the projected pixel on each image can be computed quickly when calculating cost values. It is based on the known interior orientation parameters, exterior orientation parameters, and a given elevation value. For efficient and reliable processing of multiple remote sensing images, the proposed matching method is performed in a coarse-to-fine manner through an image pyramid. Three sets of airborne remote sensing images were used to evaluate the performance of the proposed method. The results show that multi-view image matching improves matching reliability and that the proposed method performs better than traditional methods.
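The abstract does not give the authors' exact parameterization, so the sketch below shows only the standard plane-induced homography under a pinhole model. It assumes the interior and exterior orientation parameters have already been combined into a 3x4 projection matrix `P`, and that the given elevation defines a horizontal plane Z = z0; lens distortion is ignored, and all function names are hypothetical. For a point on that plane, P [X, Y, z0, 1]^T collapses to a 3x3 matrix applied to [X, Y, 1]^T, which is why the projection can be evaluated quickly.

```python
def plane_homography(P, z0):
    """3x3 homography mapping ground (X, Y, 1) on the plane Z = z0 to pixels.

    P is a 3x4 projection matrix, assumed to be composed from the interior
    and exterior orientation parameters (pinhole model, no distortion).
    """
    # P [X, Y, z0, 1]^T = [p1, p2, z0 * p3 + p4] [X, Y, 1]^T
    return [[row[0], row[1], z0 * row[2] + row[3]] for row in P]


def apply_homography(H, x, y):
    """Apply H to the point (x, y) and convert back from homogeneous form."""
    u, v, w = (r[0] * x + r[1] * y + r[2] for r in H)
    return u / w, v / w


def project(P, X, Y, Z):
    """Full pinhole projection of a 3D point, used here only as a check."""
    u, v, w = (r[0] * X + r[1] * Y + r[2] * Z + r[3] for r in P)
    return u / w, v / w
```

Under these assumptions, `apply_homography(plane_homography(P, z0), X, Y)` agrees with `project(P, X, Y, z0)` for any point on the plane. A pixel-to-pixel mapping between two images for the same elevation can be obtained by composing one image's homography with the inverse of the other's.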