Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11772131, 11772132, 11772134 & 11472109); the Natural Science Foundation of Guangdong Province, China (Grant Nos. 2015A030308017, 2015A030311046 & 2015B010131009); the Opening Fund of the State Key Laboratory of Nonlinear Mechanics (LNM), CAS; and the State Key Lab of Subtropical Building Science, South China University of Technology (Grant Nos. 2014ZC17 & 2017ZD096).
Abstract: Parallel computing techniques have been introduced into digital image correlation (DIC) in recent years, leading to a surge in computation speed. Graphics processing unit (GPU)-based parallel computing has demonstrated a surprising effect on accelerating iterative subpixel DIC compared with CPU-based parallel computing. In this paper, the performance of the two kinds of parallel computing techniques is compared for the previously proposed path-independent DIC method, in which the initial guess for the inverse compositional Gauss-Newton (IC-GN) algorithm at each point of interest (POI) is estimated through the fast Fourier transform-based cross-correlation (FFT-CC) algorithm. Based on the performance evaluation, a heterogeneous parallel computing (HPC) model with a hybrid mode of parallelism is proposed to combine the computing power of the GPU and the multicore CPU. A trial computation test scheme is developed to optimize the configuration of the HPC model on a specific computer. The proposed HPC model shows excellent performance on a mid-range desktop computer for real-time subpixel DIC at a high resolution of more than 10,000 POIs per frame.
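The FFT-CC step mentioned in this abstract can be illustrated with a minimal sketch. It assumes square subsets of equal size and a purely integer, circular shift; the function name `fft_cc_initial_guess` and the test data are hypothetical and are not taken from the paper. It only shows how a cross-correlation peak computed through the FFT yields an integer-pixel displacement that could seed a subpixel solver such as IC-GN.

```python
import numpy as np

def fft_cc_initial_guess(ref_subset, tar_subset):
    """Integer-pixel displacement of tar_subset relative to ref_subset.

    Uses zero-normalized circular cross-correlation evaluated with the FFT.
    Returns (dx, dy, zncc_peak). Illustrative only; not the paper's code.
    """
    f = ref_subset - ref_subset.mean()
    g = tar_subset - tar_subset.mean()
    # corr[d] = sum_x f[x] * g[x + d]  <=>  IFFT(conj(FFT(f)) * FFT(g))
    corr = np.fft.ifft2(np.conj(np.fft.fft2(f)) * np.fft.fft2(g)).real
    corr /= np.sqrt((f ** 2).sum() * (g ** 2).sum())
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peak indices beyond half the subset size represent negative shifts.
    dy, dx = [int(p) if p <= n // 2 else int(p) - n
              for p, n in zip(peak, corr.shape)]
    return dx, dy, float(corr[peak])

# Hypothetical data: a random speckle subset shifted by 3 rows and -2 columns.
rng = np.random.default_rng(0)
reference = rng.random((32, 32))
target = np.roll(reference, shift=(3, -2), axis=(0, 1))
dx, dy, zncc = fft_cc_initial_guess(reference, target)
```

Because each POI is processed independently of its neighbours, a loop over POIs calling such a function has no data dependencies between iterations, which is what makes the path-independent method suitable for parallel execution.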
Abstract: Multicore fiber (MCF), which contains more than one core in a single fiber cladding, has attracted ever-increasing attention for application in optical sensing systems owing to its unique capability of independent light transmission in multiple spatial channels. Unlike in standard single-mode fiber (SMF), fiber bending gives rise to tangential strain in off-center cores, and this feature has been employed for directional bending and shape sensing, where strain is measured using fiber Bragg gratings (FBGs), optical frequency-domain reflectometry (OFDR), or Brillouin distributed sensing techniques. In addition, the parallel spatial cores enable a space-division multiplexed (SDM) system configuration that allows multiple distributed sensing techniques to be multiplexed. As a result, multi-parameter sensing or performance-enhanced sensing can be achieved using MCF. In this paper, we review the research progress in MCF-based distributed fiber sensors. Brief introductions to MCF and the multiplexing/de-multiplexing methods are presented. The bending sensitivity of off-center cores is analyzed. Curvature and shape sensing, as well as various SDM distributed sensing schemes using MCF, are summarized, and the working principles of diverse MCF sensors are discussed. Finally, we present the challenges and prospects of MCF for distributed sensing applications.
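The bending sensitivity of off-center cores can be illustrated with a small sketch. It assumes the commonly used geometric model in which a core at radius r and angular position θ_i experiences strain ε_i = -κ r cos(θ_b - θ_i) + c, where κ is the curvature, θ_b the bending direction, and c a strain common to all cores (for example from temperature or axial load). The sign convention, the function name, and the numerical values are assumptions for illustration and are not taken from the review.

```python
import numpy as np

def curvature_from_core_strains(strains, core_angles, core_radius):
    """Recover curvature (1/m), bending direction (rad) and common strain.

    Fits eps_i = a*cos(theta_i) + b*sin(theta_i) + c by least squares, where
    a = -kappa*r*cos(theta_b) and b = -kappa*r*sin(theta_b) under the assumed
    sign convention.
    """
    strains = np.asarray(strains, dtype=float)
    core_angles = np.asarray(core_angles, dtype=float)
    A = np.column_stack([np.cos(core_angles),
                         np.sin(core_angles),
                         np.ones_like(core_angles)])
    (a, b, c), *_ = np.linalg.lstsq(A, strains, rcond=None)
    kappa = np.hypot(a, b) / core_radius
    theta_b = np.arctan2(-b, -a)
    return kappa, theta_b, c

# Hypothetical example: three outer cores 120 degrees apart, 35 um off-axis.
angles = np.deg2rad([0.0, 120.0, 240.0])
radius = 35e-6
true_kappa, true_theta, common = 10.0, 0.7, 50e-6
measured = -true_kappa * radius * np.cos(true_theta - angles) + common
kappa, theta_b, c = curvature_from_core_strains(measured, angles, radius)
```

Fitting a common term alongside the two bending terms is what lets the differential strain between cores separate bending from effects that act equally on every core.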
Funding: Financial support from the National Key R&D Program of China (2021YFA1401103) and the National Natural Science Foundation of China (61925502 and 51772145).
Abstract: Images and videos provide a wealth of information for people in production and daily life. Although most digital information is transmitted via optical fiber, image acquisition and transmission still rely heavily on electronic circuits. The development of all-optical transmission networks and optical computing frameworks points the way toward the next generation of data transmission and information processing. Here, we propose a high-speed, low-cost, multiplexed-parallel, one-piece all-fiber architecture for image acquisition, encoding, and transmission, called the Multicore Fiber Acquisition and Transmission Image System (MFAT). Based on the different spatial and modal channels of the multicore fiber, fiber-coupled self-encoding, and digital aperture decoding technology, scenes can be observed directly from up to 1 km away. The expanded capacity makes parallel coded transmission of multimodal high-quality data possible. MFAT requires no additional signal transmitting and receiving equipment. The all-fiber processing saves the time traditionally spent on signal conversion and image pre-processing (compression, encoding, and modulation). Additionally, it provides an effective solution for 2D information acquisition and transmission tasks in extreme environments such as high temperatures and electromagnetic interference.
Funding: Supported in part by the Joint Research Fund in Astronomy under a cooperative agreement between the National Natural Science Foundation of China (NSFC) and the Chinese Academy of Sciences (CAS) (Nos. U2031132 and U2031130); the National Natural Science Foundation of China (No. 12103015); and the Fundamental Research Funds for the Central Universities to Harbin Engineering University.
Abstract: We propose a high-sensitivity bidirectional torsion sensor using a helical seven-core fiber taper embedded in multimode fiber (MHSTM). Sensors with different taper waists and helical pitches are fabricated, and their transmission spectra are obtained and analyzed. The waist and length of the sandwiched seven-core fiber are finally determined to be 68 μm and 3 mm, respectively. The experimental results show that the clockwise and counterclockwise torsion sensitivities of the proposed sensor are 2.253 nm/(rad/m) and -1.123 nm/(rad/m), respectively. When the tapered waist diameter is reduced to 48 μm, a superior torsion sensitivity of 5.391 nm/(rad/m) in the range of 0-4.24 nm/(rad/m) is obtained, which is 46 times that of the traditional helical seven-core fiber structure. In addition, the MHSTM structure is relatively stable to temperature variations.
Funding: Supported by research grants from the National Key R&D Program (2019YFD0902003).
Abstract: Specific and sustained release of nutrients from capsules into the gastrointestinal tract has attracted much attention in the field of food and drug delivery. In this work, we report a monoaxial dispersion electrospraying-ionotropic gelation technique to prepare multicore millimeter-sized spherical capsules for specific and sustained release of fish oil. The capsule diameter decreased from 2.05 mm to 0.35 mm as the applied voltage increased. The capsules consisted of uniform (at applied voltages of ≤10 kV) or nonuniform (at applied voltages of >10 kV) multicores. The capsules had reasonable loading ratios (9.7%-6.3%) owing to the multicore structure. In addition, the capsules showed specific and sustained release of fish oil in the small intestinal phase of in vitro gastrointestinal tract and small intestinal tract models. The simple monoaxial dispersion electrospraying-ionotropic gelation technique does not involve complicated preparation formulations or polymer modification, which gives it potential applications in fish oil preparations and in the encapsulation of functional active substances in the food and drug industries.
Abstract: As multi-core processors become the de facto configuration in modern computers, the adoption of SMP virtual machines (VMs) has been increasing, allowing for more efficient use of computing resources. However, the existence of schedulers in both the hypervisor and the guest VMs creates a new research problem, namely double scheduling. Although double scheduling may cause many issues, including lock-holder preemption, vCPU stacking, CPU fragmentation, and priority inversion, prior approaches have either introduced new problems or addressed the problem incompletely. In this paper, we describe the design and implementation of FlexCore, a new scheduling scheme using vCPU ballooning, which dynamically adjusts the number of vCPUs of a VM at runtime. This essentially eliminates unnecessary scheduling in the hypervisor layer and thus boosts performance significantly. An evaluation using a complete KVM-based implementation shows that the average performance improvement for PARSEC applications on a 12-core Intel machine is approximately 52.9%, ranging from 35.4% to 79.6%.
Funding: Supported by the National High-Tech Research and Development 863 Program of China under Grant Nos. 2008AA010901, 2009AA01Z125, and 2009AA01Z103; the National Natural Science Foundation of China under Grant Nos. 60736012, 60921002, 60803029, and 61050002; the National Basic Research 973 Program of China under Grant No. 2005CB321600; and the Important National Science and Technology Specific Projects under Grant Nos. 2009ZX01028-002-003 and 2009ZX01029-001-003.
Abstract: This paper describes the design for testability (DFT) challenges and techniques of the Godson-3 microprocessor, a scalable multicore processor based on the scalable mesh of crossbar (SMOC) on-chip network and targeting high-end applications. Advanced techniques are adopted to make the DFT design scalable and to achieve low-power, low-cost testing with limited I/O resources. To achieve scalable and flexible test access, a highly elaborate test access mechanism (TAM) is implemented to support multiple test instructions and test modes. Taking advantage of the multiple identical cores embedded in the processor, scan partitioning and on-chip comparison are employed to reduce test power and test time. Test compression is also used to decrease test time. To further reduce test power, the clock control logic is designed with the ability to turn off the clocks of partitions not under test. In addition, scan collars for the caches are designed to perform functional tests with low-speed ATE for speed-binning purposes, which has low complexity and yields good correlation results.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 11772192).
Abstract: Heterogeneous multicore clusters are becoming more popular for high-performance computing because of their great computing power and cost-effectiveness. Nevertheless, parallel efficiency degradation is still a problem in large-scale structural analysis on heterogeneous multicore clusters. To solve it, a hybrid hierarchical parallel algorithm (HHPA) is proposed on the basis of the conventional domain decomposition algorithm (CDDA) and the parallel sparse solver. In this new algorithm, a three-layer parallelization of the computational procedure is introduced to separate communication between nodes, between heterogeneous core groups (HCGs), and inside HCGs by mapping computing tasks to the corresponding hardware layers. This approach not only achieves load balancing at the different layers efficiently but also improves the communication rate significantly through hierarchical communication. Additionally, the proposed hybrid parallel approach reduces the interface equation size and thereby the solution time, which compensates for the growth of communication overhead with interface equation size that occurs with CDDA. Moreover, distributed sparse storage of the large amount of data is introduced to improve memory access. Benchmark instances solved on the Shenwei-Taihuzhiguang supercomputer show that the proposed method obtains higher speedup and parallel efficiency than CDDA and superior extensibility of parallel partitioning compared with the two-level parallel computing algorithm (TPCA).
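The speedup and parallel efficiency used as comparison metrics above follow the standard definitions, sketched here for reference. The function name and the timing values are hypothetical; the abstract reports no timings, and the paper's exact baseline is not stated here, so the baseline core count is left as a parameter.

```python
def speedup_and_efficiency(t_base, t_parallel, p_base, p_parallel):
    """Relative speedup and parallel efficiency.

    speedup    = t_base / t_parallel
    efficiency = speedup * p_base / p_parallel
    With p_base = 1 this reduces to the classical S = T1/Tp and E = S/p.
    """
    speedup = t_base / t_parallel
    efficiency = speedup * p_base / p_parallel
    return speedup, efficiency

# Hypothetical timings: 1000 s on 1 core versus 150 s on 8 cores.
s, e = speedup_and_efficiency(1000.0, 150.0, 1, 8)
```

An efficiency well below 1 at large core counts is the degradation the abstract refers to; it typically reflects communication and load-imbalance costs growing faster than the per-core workload shrinks.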
Funding: Major Scientific Research Project of Zhejiang Laboratory (No. 2019MC0AD02); Innovation Project of Zhejiang Laboratory (No. 2022MG0AL03); National Science Foundation of China (Nos. 62204230, 62020106002, T2293750, 62205306, 92250304); National Key Research and Development Program of China (2021YFC2401403).
Abstract: Improving the accuracy of shape sensors based on multicore fibers (MCFs) is challenging but of great importance for real-time 3D shape detection, especially in visually inaccessible areas. In this work, a novel approach is proposed to improve MCF shape sensor accuracy using an ultraviolet-transparent, liquid-mediated fiber Bragg grating (FBG) inscription technique and a twist-isolating packaging method. A newly developed UV index-matching liquid (UV-IML) is used to generate a uniform light field at all the MCF cores, enabling FBG inscription with high accuracy. Additionally, a new stress fully released (SFR) packaging method is implemented to isolate the sensor from external twist. The MCF shape sensor shows a maximum relative error of only 3.33% and the lowest reported relative sensitivity error of 1.11% cm^(-1). Moreover, a real-time 3D shape sensing system with a response frequency above 30 Hz is constructed using this MCF shape sensor. The highly accurate real-time 3D shape sensing results indicate potential applications in in vivo shape estimation for endoscopes and soft robots.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 11772192).
Abstract: The strict, high-standard requirements for the safety and stability of major engineering systems pose a tough challenge for large-scale finite element modal analysis. At the same time, a systematic analysis of the entire structure of these engineering systems is extremely meaningful in practice. This article proposes a multilevel hierarchical parallel algorithm for large-scale finite element modal analysis to reduce the loss of parallel computational efficiency on heterogeneous multicore distributed-storage computers. Based on two-level partitioning and four-transformation strategies, the proposed algorithm not only improves the memory access rate through sparse distributed storage of the large amount of data but also reduces the solution time by reducing the scale of the generalized characteristic equations (GCEs). Moreover, a multilevel hierarchical parallelization approach is introduced into the computational procedure to separate communication between nodes, within nodes, between heterogeneous core groups (HCGs), and inside HCGs by mapping computing tasks to the corresponding hardware layers. This method efficiently achieves load balancing at the different layers and significantly improves the communication rate through hierarchical communication. It can therefore enhance the efficiency of parallel computing for large-scale finite element modal analysis by fully exploiting the architectural characteristics of heterogeneous multicore clusters. Finally, typical numerical experiments were used to validate the correctness and efficiency of the proposed method. A parallel modal analysis of a cross-river tunnel with over ten million degrees of freedom (DOFs) was then performed, and ten thousand processor cores were used to verify the feasibility of the algorithm.
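For readers unfamiliar with the generalized characteristic equation at the core of modal analysis, the following toy sketch solves K φ = ω² M φ for a two-degree-of-freedom spring-mass chain. It is a dense, serial illustration under assumed unit stiffness and mass; the function name is hypothetical, and it does not represent the partitioning, transformation, or parallelization strategies of the article.

```python
import numpy as np

def modal_analysis(K, M):
    """Solve the generalized eigenproblem K @ phi = omega**2 * M @ phi.

    M must be symmetric positive definite. The problem is reduced to a
    standard symmetric eigenproblem with the Cholesky factor of M.
    Returns natural angular frequencies (ascending) and mode shapes as columns.
    """
    L = np.linalg.cholesky(M)            # M = L @ L.T
    L_inv = np.linalg.inv(L)
    A = L_inv @ K @ L_inv.T              # symmetric standard-form matrix
    eigvals, y = np.linalg.eigh(A)
    modes = L_inv.T @ y                  # map back: phi = L^{-T} y
    return np.sqrt(eigvals), modes

# Assumed toy system: wall - spring k - mass m - spring k - mass m, k = m = 1.
K = np.array([[2.0, -1.0],
              [-1.0, 1.0]])
M = np.eye(2)
omega, phi = modal_analysis(K, M)
```

At the scale described in the abstract (over ten million DOFs), the matrices are sparse and distributed, so a dense factorization like this one is not feasible; reducing the size of the eigenproblem each processor must handle is what the article's partitioning strategy targets.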