Enlightenment-1 (QiMeng-1): The First Automatically Generated RISC-V CPU


1. State Key Lab of Processors, Institute of Computing Technology, Chinese Academy of Sciences
2. University of Chinese Academy of Sciences
3. Cambricon Technologies
4. Shanghai Innovation Center for Processor Technologies
5. University of Science and Technology of China

Overview

Designing a central processing unit (CPU) requires intensive manual work by talented experts to implement the circuit logic from design specifications, an iterative process that demands significant effort in programming, debugging, and verification (shown in Figure 1 (a)). Although considerable progress has been made in electronic design automation (EDA) to reduce human effort, all existing tools require hand-crafted formal program code (e.g., Verilog, Chisel, or C) as the input.

To automate CPU design without human programming, we are motivated to learn the CPU design from only input-output (IO) examples, which are generated from test cases of design specification (shown in Figure 1 (b)). The key challenge is that the learned CPU design must have near-zero tolerance for inaccuracy, rendering well-known approximate algorithms, such as neural networks, ineffective.

We propose a novel AI approach to generate the CPU design as a large-scale Boolean function, using only external IO examples instead of formal program code. This approach employs a new graph structure called the Binary Speculative Diagram (BSD) to accurately approximate the CPU-scale Boolean function. We introduce an efficient BSD expansion method based on Boolean Distance, a new metric to quantitatively measure the structural similarity between Boolean functions, gradually achieving 100% design accuracy.
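To make the idea of learning a Boolean function purely from IO examples concrete, here is a toy Python sketch. It is not the authors' BSD or Boolean Distance method: it simply expands a decision diagram one variable at a time and treats a branch as a constant whenever all sampled examples on that branch agree. The names `learn`, `evaluate`, and `majority`, the fixed variable order, and the 3-input target function are all assumptions made for illustration.

```python
from itertools import product


def learn(examples, variables):
    """Build a small decision diagram from (input_bits, output_bit) examples.

    Returns 0 or 1 for a leaf that is assumed constant, or a tuple
    (variable_index, low_branch, high_branch) for an expanded node.
    """
    outputs = {out for _, out in examples}
    if len(outputs) <= 1:
        # All sampled examples agree (or none reached this branch):
        # guess that the sub-function is constant. Default guess is 0.
        return outputs.pop() if outputs else 0
    if not variables:
        raise ValueError("examples contradict each other")
    var, rest = variables[0], variables[1:]
    low = [(x, y) for x, y in examples if x[var] == 0]
    high = [(x, y) for x, y in examples if x[var] == 1]
    return (var, learn(low, rest), learn(high, rest))


def evaluate(node, bits):
    """Follow the diagram from the root to a constant leaf."""
    while isinstance(node, tuple):
        var, low, high = node
        node = high if bits[var] else low
    return node


def majority(bits):
    """Toy target: carry-out of a 1-bit full adder (majority of 3 bits)."""
    return 1 if sum(bits) >= 2 else 0


all_examples = [(x, majority(x)) for x in product((0, 1), repeat=3)]

# With every IO example available, the learned diagram is exact.
full_diagram = learn(all_examples, [0, 1, 2])

# With only a subset, unseen branches are guessed and may be wrong.
partial_diagram = learn(all_examples[:6], [0, 1, 2])
partial_hits = sum(evaluate(partial_diagram, x) == y for x, y in all_examples)
```

In this toy setting the diagram built from a subset of examples always reproduces the examples it saw but can mispredict unseen inputs, which is why an approach of this kind needs further expansion and sampling to approach 100% accuracy.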

Our approach generates an industrial-scale RISC-V CPU design in just 5 hours. The design is over 1700× larger than those produced by existing work (shown in Table 1), and the design cycle is reduced by approximately 1000× without human involvement. The taped-out chip, Enlightenment-1, the world's first CPU designed by AI, successfully runs the Linux operating system and performs comparably to the human-designed Intel 80486SX CPU. Remarkably, our approach autonomously rediscovers human knowledge of the von Neumann architecture.

Figure 1: Comparison of CPU design flow. (a) The conventional design flow is an iterative process with huge manual efforts in programming, debugging, and verification of the circuit logic. (b) The proposed automated design flow, which learns the circuit logic only from input-output (IO) examples of test cases with the proposed BSD expansion, is a fully-automatic and iterative process that eliminates manual efforts on programming, debugging, and verification of the circuit logic.
Table 1: Comparison with automated circuit design tasks.

Enlightenment-1 (QiMeng-1): The World's First Automatically Generated CPU

We use the proposed approach to automatically generate a 32-bit RISC-V CPU, Enlightenment-1, within 5 hours, and demonstrate that the approach can discover human knowledge of von Neumann architecture.

Automatically Design a RISC-V CPU

We use the proposed approach to generate the CPU design from a relatively small set of IO examples. Concretely, the CPU has 1798 input bits and 1826 output bits, so the total number of IO examples is 1826 × 2^1798, while fewer than 2^40 IO examples are randomly sampled for training. Training takes less than 5 hours to achieve an accuracy of >99.99999999999% on validation tests. The generated CPU design then undergoes the physical design process with scripts at 65nm technology to generate the layout for fabrication, and the detailed hardware characteristics are listed in Table 2. The layout of the entire chip with major components marked, the manufactured chip with a frequency of 300 MHz, and the printed circuit board containing the chip are illustrated in Figure 2.
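The gap between the full IO space and the training sample is easier to appreciate with the arithmetic worked out. The short Python sketch below reads the figures in the paragraph above as a total of 1826 × 2^1798 IO examples and a sampling budget of at most 2^40; the constant and variable names are illustrative assumptions, not taken from the authors' code.

```python
import math

INPUT_BITS = 1798      # exponent of the IO-example count (assumed input width)
OUTPUT_BITS = 1826     # output bits produced per input combination
SAMPLED_LOG2 = 40      # upper bound on sampled examples: 2**40

# Python integers are arbitrary precision, so the total can be computed exactly.
total_examples = OUTPUT_BITS * 2**INPUT_BITS
sampled_upper_bound = 2**SAMPLED_LOG2

# Number of decimal digits in the total count of IO examples.
total_digits = len(str(total_examples))

# log2 of the largest possible sampled fraction (sampled / total).
fraction_log2 = SAMPLED_LOG2 - INPUT_BITS - math.log2(OUTPUT_BITS)

print(f"total IO examples has {total_digits} decimal digits")
print(f"sampled fraction is at most 2^{fraction_log2:.1f}")
```

Under these assumptions the total is on the order of 10^544 IO examples, so the training set covers a vanishingly small fraction of the space, which is why the learned design has to generalize rather than memorize.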

Table 2: Hardware characteristics of Enlightenment-1.
Figure 2: The (a) layout, (b) manufactured chip, and (c) printed circuit board of Enlightenment-1.

Perform Comparably to Intel 80486SX CPU

We successfully run the Linux (kernel 5.15) operating system and SPEC CINT2000 on Enlightenment-1 to validate its functionality (see Figure 3 (a)). We also use the widely used Dhrystone benchmark to evaluate performance. Figure 3 (b) compares the performance of Enlightenment-1 against different generations of commercial CPUs, e.g., Intel 80386 (1980s), Intel 80486SX (1990s), and Intel Pentium III (2000s). On the evaluated program, it performs comparably to the Intel 80486SX, designed in mid-1991. Though Enlightenment-1 performs worse than modern processors such as the Intel Core i7 3930K, it is the world's first automatically designed CPU, and its performance could be significantly improved with augmented algorithms, which is left as future work.

Figure 3: Functional validation and performance comparison. (a) The outputs of booting up the Linux operating system. (b) The performance of Enlightenment-1 is compared against commercial CPUs on the Dhrystone benchmark, and Enlightenment-1 performs comparably to the human-designed Intel 80486SX CPU.

Discover the von Neumann Architecture

By examining the generated circuit logic of Enlightenment-1 in detail, we demonstrate that our approach discovers human knowledge of the von Neumann architecture from the IO examples alone. Concretely, the generated CPU design, expressed as a BSD, contains the key components of the von Neumann architecture: it mainly consists of the control unit, which is generated first in the BSD for global control, and the arithmetic unit (see Figure 4). The control unit generates the control signals for the entire CPU, and the arithmetic unit performs arithmetic operations (e.g., ADD and SUB) and logic operations (e.g., AND and OR). Moreover, we observe that both the control unit and the arithmetic unit can be recursively decomposed into smaller functional modules, such as the instruction decoder, ALU, and LSU (load/store unit), by expanding more BSD layers.

Figure 4: Discovering von Neumann architecture from scratch. The generated BSD mainly consists of the control unit and arithmetic unit, which can be further decomposed into sub-modules in the BSD, e.g., the control unit contains the privilege controller and instruction decoder, and the arithmetic unit contains ALU and LSU.

BibTeX

@inproceedings{10.24963/ijcai.2024/425,
  author = {Cheng, Shuyao and Jin, Pengwei and Guo, Qi and Du, Zidong and Zhang, Rui and Hu, Xing and Zhao, Yongwei and Hao, Yifan and Guan, Xiangtao and Han, Husheng and Zhao, Zhengyue and Liu, Ximing and Zhang, Xishan and Chu, Yuejie and Mao, Weilong and Chen, Tianshi and Chen, Yunji},
  title = {Automated CPU design by learning from input-output examples},
  year = {2024},
  isbn = {978-1-956792-04-1},
  url = {https://doi.org/10.24963/ijcai.2024/425},
  doi = {10.24963/ijcai.2024/425},
  abstract = {Designing a central processing unit (CPU) requires intensive manual work of talented experts to implement the circuit logic from design specifications. Although considerable progress has been made in electronic design automation (EDA) to relieve human efforts, all existing tools require handcrafted formal program codes (e.g., Verilog, Chisel, or C) as the input. To automate the CPU design without human programming, we are motivated to learn the CPU design from only input-output (IO) examples, which are generated from test cases of design specification. The key challenge is that the learned CPU design should have almost zero tolerance for inaccuracy, which makes well-known approximate algorithms such as neural networks ineffective. We propose a new AI approach to generate the CPU design in the form of a large-scale Boolean function, from only external IO examples instead of formal program code. This approach employs a novel graph structure called Binary Speculative Diagram (BSD) to approximate the CPU-scale Boolean function accurately. We propose an efficient BSD expansion method based on Boolean Distance, a new metric to quantitatively measure the structural similarity between Boolean functions, gradually increasing the design accuracy up to 100\%. Our approach generates an industrial-scale RISC-V CPU design within 5 hours, reducing the design cycle by about 1000\texttimes{} without human involvement. The taped-out chip, Enlightenment-1, the world's first CPU designed by AI, successfully runs the Linux operating system and performs comparably against the human-design Intel 80486SX CPU. Our approach even autonomously discovers human knowledge of the von Neumann architecture.},
  booktitle = {Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence},
  articleno = {425},
  numpages = {11},
  location = {Jeju, Korea},
  series = {IJCAI '24}
}