AWS for Industries
Accelerated chip verification using AWS EC2 F1 and VeriFire
The generative artificial intelligence (AI) transformation is driving unprecedented demand for chips that are required to train and run increasingly sophisticated machine learning (ML) models. In response, the semiconductor industry is designing advanced node chips that are more complex and have more transistors than ever before. For instance, the latest Nvidia Blackwell 4nm chip contains a staggering 208 billion transistors.
Manufacturing these advanced node semiconductors is a capital- and time-intensive process. Figures for 2022 indicate that research and development spending exceeded $58 billion, which is more than half of the $110 billion invested across the industry. Of that $58 billion, 20 percent was spent on electronic design automation (EDA) tools. Another 30 percent was spent on testing and verification processes that must be carried out before fabrication can occur. More complex designs require more verification engineers to support this process.
The 2022 Wilson Research Group Functional Verification Study showed that engineers spend more than 50 percent of their time in the verification stage, during which they perform extensive simulation. Traditional simulation methods and debugging tools often create significant bottlenecks at this stage because simulations take so long to run.
To address this bottleneck, AWS Partner SilverLining EDA (SLE) uses Amazon Elastic Compute Cloud (Amazon EC2) F1 Instances in its VeriFire tool. F1 Instances use field-programmable gate arrays (FPGAs) to enable delivery of custom hardware accelerations, and VeriFire uses them to reduce simulation times by orders of magnitude over traditional Verilog simulators.
F1 Instances provide a seamless development environment, including an FPGA Developer AMI and support for hardware-level development, letting users program, simulate, debug, and compile their acceleration code. F1 Instances deliver high bandwidth, enhanced networking, and massive compute capabilities, making them ideal for solving complex problems and simulations in areas such as EDA, verification, genomics, video processing, network security, and big data analytics. With the ability to register and deploy Amazon FPGA Images (AFIs) across multiple instances, F1 Instances offer unparalleled flexibility and scalability while providing access to FPGA development tools without additional fees.
VeriFire is an EDA tool that accelerates the verification of semiconductor designs in the pre-silicon phase. It has two main components (see Figure 1 below):
- VeriFire Engine
- FireBolt
Figure 1. Reference architecture for SLE VeriFire on AWS
VeriFire Engine
A VeriFire Engine (VE) is a RISC-V–based, synthesizable IP that can be customized to interface with various types of designs under test (DUT). A VE provides a means to stimulate, monitor, check, and debug any DUT with which it is connected, and it can be connected to a DUT using standard interfaces, like AXI, as well as custom interfaces. All drivers, monitors, and scoreboards are coded in C or C++ targeted for the VE. Depending on the complexity of the DUT and the number of interfaces it has, the VeriFire environment will consist of one or more VEs.
A VE can be used for both active and passive mode verification. In active mode, it acts as a driver, monitor, and scoreboard for the DUT. In passive mode, it acts only as a monitor and scoreboard.
FireBolt
FireBolt is a set of software utilities that lets the verification engineer monitor the DUT in near real time while it is being simulated on the F1 Instances. FireBolt is also used to compile tests, load them onto the VE, and determine whether the test passed or failed.
FireBolt is built upon the GNU Debugger (GDB), an open-source debugging tool commonly used in systems development. GDB lets developers see what is happening inside their programs while they run or what the program was doing at the moment the test failed.
Building on GDB makes internal design state visible and generates log files on an FPGA run, similar to the log files that engineers are already familiar with. This groundbreaking capability increases the visibility of designs and makes debugging easier.
FireBolt runs on the host machine and communicates with the VeriFire Engine deployed on the FPGA through a high-speed PCI Express (PCIe) interface. The VeriFire Engine and DUT are synthesized into an AFI that is loaded onto the F1 Instances.
Figure 2 below shows the topology of a VeriFire-based verification environment with a single VE.
Figure 2. Topology of a VeriFire-based environment with a single VE
Based on DUT complexity, a multi-VE verification environment might be needed to drive, monitor, and check all the interfaces. Figure 3 below shows the topology of such an environment.
Figure 3. Topology of a VeriFire-based environment with multiple VEs
How VeriFire increases verification speed
Traditional verification environments use simulators like Verilator or other hardware description language (HDL) simulators from EDA vendors. Verilator reads the specified SystemVerilog code (including DUT and testbench), lints it, optionally adds coverage and waveform tracing support, and compiles the design into a source-level multithreaded C++ model. The output of this process is an executable that performs the actual simulation when it is run.
The simulation speed achieved by running this file depends on various factors, including complexity of the DUT, coding patterns of the testbench, and the available compute power. The simulation speeds achieved are generally considered very slow given that the designs being simulated will run in silicon at clock rates exceeding 2 GHz on the higher end.
VeriFire solves the speed problem by deploying the entire verification environment, including DUT and testbench components, on the FPGA integrated into the F1 Instance. This lets simulation occur at the clock rate used for FPGA synthesis. VeriFire-based environments can run at up to 125 MHz, which is up to 1 million times faster than a traditional verification environment that achieves a peak speed of 125 Hz.
Verification closure is directly affected by simulation speed, so speeding up simulation by a factor of 1 million can dramatically reduce verification schedule and resources. Using F1 Instances and VeriFire lets development teams rapidly develop and deploy VeriFire-based verification environments to the cloud and accelerate their tape-out.
Improving debugging process efficiency
VeriFire has a powerful feature that lets users seamlessly replicate failing test cases from the FPGA in a simulation environment, without any additional overhead. This, coupled with its ability to provide 100 percent visibility into the entire design, lets users do root-cause analysis of intricate design bugs and validate fixes, especially those exposed through long and complex test sequences.
The simulation framework underpinning VeriFire is built upon Verilator, a widely adopted open-source simulator, using its robust capabilities and community support. This combination of advanced features and proven technology lets VeriFire deliver a comprehensive and efficient solution for hardware design verification and debugging.
In sample VeriFire- and Verilator-based environments stood up for testing on F1 Instances, we observed Verilator simulation speeds of between 300 Hz and 1 kHz. For reference, a test that takes 1 second to run in the FPGA environment simulates 125 million cycles of the DUT and, at 1 kHz, would take 125,000 seconds (about 1.5 days) to run in a Verilator simulation.
FPU verification example
A floating-point unit (FPU) is a key part of CPUs and AI accelerators. FPU verification consumes significant resources because of the huge state space and the need to maintain compliance with specifications like IEEE 754.
SLE used an open-source FPU implementation as a DUT to set up the environment. The FPU interface was abstracted as a custom interface used to connect the VeriFire Engine to the DUT. It includes the following:
- Two operands, A and B
- The operation to be performed (Single Precision ADD)
- Ready signal
- Done signal
- Result
A test program (written in C or C++) encapsulates the interface described above and creates a packet for each transaction. The program polls the ‘Ready’ signal and sends a packet when the DUT is ready. The program then waits for the ‘Done’ signal to be asserted and captures the ‘Result’. (See Figures 4 and 5 below.)
Figure 4. Sample program to drive, monitor, and check FPU DUT
Figure 5. Sample program to compute reference values for FPU add operation
This result is compared against a reference value that has been computed by VeriFire Engine itself on the FPGA, letting computation remain local to the FPGA (and running at 125 MHz) without the need to go back to the host for reference value calculations.
Tests are loaded and debugged using FireBolt. See Figure 6 below for details.
Figure 6. Topology of VeriFire-based environment used to verify FPU
Conclusion
VeriFire is an EDA tool that accelerates the verification of semiconductor designs in the pre-silicon phase. It is suitable for a large spectrum of design types: in-order/out-of-order CPUs, accelerator ASICs for ML or cryptography, FPUs, custom arithmetic blocks, streaming data engines, matrix multiplication units, instruction fetch units, load-store units, memory hierarchies, instruction schedulers, and microcontrollers can all be verified using VeriFire.
EDA and verification cost the semiconductor industry roughly $30 billion in 2022, about half of all research and development expenditure. VeriFire significantly shortens the design cycle by accelerating the verification process, reducing the time and resources required to introduce new designs, which lets the semiconductor industry innovate and achieve faster time to market.
See how AWS helps the semiconductor industry innovate faster. To learn more about how VeriFire can improve verification speed, lower costs, and accelerate your tape-out, reach out to SilverLining EDA.