Funding: Supported by the National Natural Science Foundation of China for Distinguished Young Scholars under Grant No. 60325205, the National Natural Science Foundation of China under Grant No. 60673146, the National High Technology Development 863 Program of China under Grants No. 2002AA110010, No. 2005AA110010, and No. 2005AA119020, and the National Grand Fundamental Research 973 Program of China under Grant No. 2005CB321600.
Abstract: This paper introduces the microarchitecture and physical implementation of the Godson-2E processor, a four-issue superscalar RISC processor that supports the 64-bit MIPS instruction set. The adoption of aggressive out-of-order execution and memory hierarchy techniques helps Godson-2E achieve high performance. The Godson-2E processor has been physically designed in a 7-metal 90 nm CMOS process using a cell-based methodology with some bit-sliced manual placement and a number of crafted cells and macros. The processor can run at 1 GHz and achieves a SPEC CPU2000 rate higher than 500.
Abstract: Very large scale integration (VLSI) circuit partitioning is an important problem in the design automation of VLSI chips and multichip systems; it is an NP-hard combinatorial optimization problem. In this paper, an effective hybrid multi-objective partitioning algorithm based on discrete particle swarm optimization (DPSO) with a local search strategy, called MDPSO-LS, is presented to solve VLSI two-way partitioning with simultaneous minimization of cutsize and circuit delay. Inspired by the principles of genetic algorithms, uniform crossover and random two-point exchange operators are designed to avoid generating infeasible solutions. Furthermore, a phenotype sharing function in the objective space is applied to circuit partitioning to obtain a better approximation of the true Pareto front, and Markov chain theory is used to prove global convergence. To improve local exploration, the Fiduccia-Mattheyses (FM) strategy is applied to further reduce the cutsize of each particle, and a local search strategy for improving the circuit delay objective is also designed. Experiments on the ISCAS89 benchmark circuits show that the proposed algorithm is efficient.
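As a rough illustration of the cutsize objective named in the abstract above, the following minimal sketch counts the nets cut by a two-way partition and computes the gain of moving a single cell, which is the quantity that FM-style passes rank moves by. The names `cutsize`, `move_gain`, and the netlist representation are assumptions made for this example and are not taken from the paper; balance constraints and the delay objective are omitted.

```python
from typing import Dict, List


def cutsize(nets: List[List[str]], side: Dict[str, int]) -> int:
    """Number of nets that have cells on both sides of a two-way partition."""
    return sum(1 for net in nets if len({side[cell] for cell in net}) > 1)


def move_gain(nets: List[List[str]], side: Dict[str, int], cell: str) -> int:
    """Reduction in cutsize obtained by moving `cell` to the other side.

    Computed naively by re-evaluating the cutsize; real FM implementations
    maintain gains incrementally (hypothetical helper for illustration).
    """
    moved = dict(side)
    moved[cell] = 1 - side[cell]
    return cutsize(nets, side) - cutsize(nets, moved)


# Hypothetical toy netlist: each net is a list of the cells it connects.
nets = [["a", "b", "c"], ["c", "d"], ["a", "d"]]
side = {"a": 0, "b": 0, "c": 1, "d": 1}

current_cut = cutsize(nets, side)
gain_a = move_gain(nets, side, "a")
```

Here a positive gain means the move reduces the number of cut nets; a practical partitioner would also reject moves that violate the balance constraint.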
Funding: This work was partially supported by the National Natural Science Foundation of China (NSFC) under Grant No. 60373012, and the Specialized Research Fund for the Doctoral Program of Higher Education (SRFDP) of China under Grant No. 20050003099. Some preliminary results of this work were presented at the IEEE International Conference on Communications, Circuits and Systems (ICCCAS), Chengdu, China, 2004.
Abstract: The rectilinear Steiner minimal tree (RSMT) problem, which is known to be NP-complete, is one of the fundamental problems in physical design, especially in routing. This paper presents an algorithm, called ACO-Steiner, for RSMT construction based on ant colony optimization (ACO). An RSMT is first constructed from the ants' movements on the Hanan grid; the Hanan grid constraint is then relaxed to accelerate the ants' movements and improve the performance of the algorithm. The algorithm has been implemented on a Sun workstation running Unix, and the results have been compared with the fastest exact RSMT algorithm, GeoSteiner 3.1, and with a recent heuristic using batched greedy triple construction (BGTC). Experimental results show that ACO-Steiner achieves a short running time while maintaining high solution quality. Furthermore, it is also found that ACO-Steiner can be easily extended to other problems, such as obstacle-avoiding rectilinear Steiner minimal trees and congestion reduction in global routing.
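The Hanan grid mentioned in the abstract above is formed by drawing a horizontal and a vertical line through every terminal; by Hanan's theorem, some RSMT uses only Steiner points at the intersections of these lines. The sketch below builds that grid for a small terminal set. The function names and the sample terminals are assumptions for illustration only, and the ant colony search itself is not shown.

```python
from itertools import product
from typing import List, Tuple

Point = Tuple[int, int]


def hanan_grid(terminals: List[Point]) -> List[Point]:
    """All intersections of the horizontal and vertical lines through the terminals."""
    xs = sorted({x for x, _ in terminals})
    ys = sorted({y for _, y in terminals})
    return list(product(xs, ys))


def candidate_steiner_points(terminals: List[Point]) -> List[Point]:
    """Hanan grid points that are not themselves terminals."""
    terminal_set = set(terminals)
    return [p for p in hanan_grid(terminals) if p not in terminal_set]


def rectilinear_distance(p: Point, q: Point) -> int:
    """Manhattan (L1) distance, the edge length used in rectilinear routing."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


# Hypothetical terminal set for illustration.
terminals = [(0, 0), (2, 1), (1, 3)]
grid = hanan_grid(terminals)
candidates = candidate_steiner_points(terminals)
```

For n terminals with distinct coordinates the grid has n² points, which is why restricting the search to it keeps the candidate set finite.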