NEMU (NJU Emulator) is a simple but complete full-system emulator designed for teaching purposes. It originally supported x86, mips32, riscv64, and riscv32. This repo only guarantees support for riscv64.
The main features of NEMU include:
- a small monitor with a simple debugger
  - single step
  - register/memory examination
  - expression evaluation without the support of symbols
  - watch point
  - differential testing against reference design (e.g. QEMU)
  - snapshot
- CPU core with support of most common ISAs
  - x86
    - real mode is not supported
    - x87 floating point instructions are not supported
  - mips32
    - CP1 floating point instructions are not supported
  - riscv32
    - only RV32IM
  - riscv64
    - rv64gcbhk currently
    - rv64gcbhkv in the near future
- memory
- paging
  - TLB is optional (but necessary for mips32)
  - protection is not supported for most ISAs, but PMP is supported for riscv64
- interrupt and exception
  - protection is not supported
- 5 devices
  - serial, timer, keyboard, VGA, audio
  - most of them are simplified and unprogrammable
- 2 types of I/O
  - port-mapped I/O and memory-mapped I/O
Limitations of NEMU:
- Cannot directly run an ELF
  - GEM5's system call emulation is not supported. (What is system call emulation)
  - QEMU's user space emulation is not supported. (What is user space emulation)
- Checkpoints are not compatible with GEM5's SE checkpoints or m5 checkpoints.
  - Cannot produce GEM5's SE checkpoints or m5 checkpoints
  - Cannot run GEM5's SE checkpoints or m5 checkpoints
- Producing a checkpoint in M-mode is NOT recommended
- Please don't run a SimPoint bbv.gz with NEMU, XS-GEM5, or XiangShan processor, because it is not bootable
- Please don't open a new issue without reading the doc
- Please don't open a new issue without searching the issue list
- Please don't open issues about building Linux in NEMU's issue list; head to the XiangShan doc instead
NEMU plays the following roles in the XiangShan ecosystem:
- In reference mode, NEMU is the golden model of XiangShan processor (paper: MINJIE; code to adapt NEMU to XiangShan: Difftest)
- In standalone mode, NEMU is able to produce SimPoint BBVs and checkpoints for XS-GEM5 and XiangShan processor.
- In standalone mode, NEMU can also be used as a profiler for large programs.
NEMU can be used as a reference design to validate the correctness of XiangShan processor or XS-GEM5. The typical workflow is as follows. Concrete instructions are given in Section build-NEMU-as-ref.
graph TD;
build["Build NEMU in reference mode"]
so[/"./build/riscv64-nemu-interpreter-so"/]
cosim["Run XS-GEM5 or XiangShan processor, turn on difftest, specify riscv64-nemu-interpreter-so as reference design"]
build-->so
so-->cosim
The typical flow for running workloads is similar for NEMU, XS-GEM5, and XiangShan processor. All of them only support full-system simulation. To prepare workloads for full-system simulation, users need to either build a baremetal app or run user programs in an operating system.
graph TD;
am["Build a baremetal app with AM"]
linux["Build a Linux image containing user app"]
baremetal[/"Image of baremetal app or OS"/]
run["Run image with NEMU, XS-GEM5, or XiangShan processor"]
am-->baremetal
linux-->baremetal
baremetal-->run
Most enterprise users and researchers are more interested in running larger workloads, such as SPECCPU, on XS-GEM5 or XiangShan processor. To reduce the time spent in detailed simulation, NEMU serves as a checkpoint producer. The flow for producing and running checkpoints is as follows. Detailed instructions for each step are given in Section Howto.
graph TD;
linux["Build a Linux image containing NEMU trap app and user app"]
bin[/"Image containing Linux and app"/]
profiling["Boot image with NEMU with SimPoint profiling"]
bbv[/"SimPoint BBV, a .gz file"/]
cluster["Cluster BBV with SimPoint"]
points[/"SimPoint sampled points and weights"/]
take_cpt["Boot image with NEMU to produce checkpoints"]
checkpoints[/"Checkpoints, several .gz files of memory image"/]
run["Run checkpoints with XS-GEM5 or XiangShan processor"]
linux-->bin
bin-->profiling
profiling-->bbv
bbv-->cluster
cluster-->points
points-->take_cpt
take_cpt-->checkpoints
checkpoints-->run
Installation commands differ between distributions because each uses its own package manager. On Ubuntu, install the dependencies with the following command:
apt install build-essential man gcc gdb git libreadline-dev libsdl2-dev zstd libzstd-dev
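As a quick sanity check after installation, the sketch below reports which command-line tools are still missing from PATH. The helper name `missing_tools` is hypothetical, and the tool list (gcc, gdb, git, make, zstd) is an assumption derived from the packages above; the development libraries (libreadline-dev, libsdl2-dev, libzstd-dev) are not checked.

```shell
#!/bin/sh
# Hypothetical helper: print every command from the argument list
# that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed tool list, taken from the packages in the apt command above.
missing=$(missing_tools gcc gdb git make zstd)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "missing tools:" $missing
else
    echo "all checked command-line tools were found"
fi
```

An empty result only means the executables exist; it does not verify versions or headers.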
To build NEMU as a reference design, run
make menuconfig # only needed the first time after NEMU is downloaded
make xxx-ref_defconfig
make -j
The resulting ./build/riscv64-nemu-interpreter-so is the reference design.
Specifically, the value of xxx-ref_defconfig depends on the required ISA extensions:
| rv64gcb | rv64gcbh | rv64gcbv |
|---|---|---|
| riscv64-xs-ref_defconfig | riscv64-rvh-ref_defconfig | riscv64-rvv-ref_defconfig |
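The table can be expressed as a small lookup, which may help when scripting builds for several configurations. This is a minimal sketch: the function names `defconfig_for_isa` and `print_ref_build` are hypothetical, and the helper only prints the build commands instead of running them.

```shell
#!/bin/sh
# Map an ISA string from the table above to its reference defconfig.
defconfig_for_isa() {
    case "$1" in
        rv64gcb)  echo "riscv64-xs-ref_defconfig" ;;
        rv64gcbh) echo "riscv64-rvh-ref_defconfig" ;;
        rv64gcbv) echo "riscv64-rvv-ref_defconfig" ;;
        *) echo "unknown ISA: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: print the build commands for an ISA (dry run only).
print_ref_build() {
    cfg=$(defconfig_for_isa "$1") || return 1
    echo "make $cfg"
    echo "make -j"
}

print_ref_build rv64gcbh
```

Piping the printed commands to `sh` inside the NEMU source tree would perform the actual build.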
To test XS-GEM5 against NEMU, refer to the doc of XS-GEM5 Difftest.
To test XiangShan processor against NEMU, run
./emu \
-i test_workload.bin \
--diff $NEMU_HOME/build/riscv64-nemu-interpreter-so \
2> perf.out
Details can be found in the tutorial of XiangShan.
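Before starting a long co-simulation it can be worth confirming that the reference design was actually built. The sketch below only checks for the shared object at the path used in the command above; `ref_so_path` and `check_ref` are hypothetical helper names, and `/path/to/NEMU` is a placeholder used when NEMU_HOME is unset.

```shell
#!/bin/sh
# Location of the reference design inside a NEMU checkout.
ref_so_path() {
    echo "$1/build/riscv64-nemu-interpreter-so"
}

# Hypothetical helper: succeed only if the reference design file exists.
check_ref() {
    so=$(ref_so_path "$1")
    if [ -f "$so" ]; then
        echo "reference design found: $so"
    else
        echo "reference design missing: $so (build NEMU in reference mode first)" >&2
        return 1
    fi
}

# Placeholder path is used when NEMU_HOME is not set.
nemu_home=${NEMU_HOME:-/path/to/NEMU}
if check_ref "$nemu_home"; then
    echo "ready to run: ./emu -i test_workload.bin --diff $(ref_so_path "$nemu_home")"
fi
```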
As described in the workflow, NEMU takes either a baremetal app or an operating system image as input.
For baremetal apps, Abstract Machine is a lightweight baremetal library. Common simple apps such as coremark and dhrystone can be built with Abstract Machine.
To build an operating system image, please read the doc to build Linux.
Then modify NEMU_HOME and BBL_PATH in $NEMU_HOME/scripts/checkpoint_example/checkpoint_env.sh and the workload parameter passed to the function in each example script to get started.
Please read the doc to generate checkpoints.
Run a checkpoint with XiangShan processor
./build/emu -i /path/to/a/checkpoint.gz
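When several checkpoints have been produced, the same command can be generated for each of them. The sketch below is a dry run that prints one `./build/emu -i` command per checkpoint file. It assumes that checkpoints end in `.gz` and that BBV files contain `bbv` in their file name; both are naming assumptions, and `list_checkpoint_runs` is a hypothetical helper. The demo uses empty placeholder files in a temporary directory.

```shell
#!/bin/sh
# Print one emu command per checkpoint found under directory $1.
# Assumption: checkpoints are *.gz files, and BBV files have "bbv" in their name.
list_checkpoint_runs() {
    find "$1" -type f -name '*.gz' ! -name '*bbv*' | sort | while IFS= read -r cpt; do
        echo "./build/emu -i $cpt"
    done
}

# Demo on a throwaway directory with empty placeholder files.
demo=$(mktemp -d)
mkdir -p "$demo/100" "$demo/200"
: > "$demo/100/cpt_a.gz"
: > "$demo/200/cpt_b.gz"
: > "$demo/simpoint_bbv.gz"     # skipped: a BBV is not bootable
list_checkpoint_runs "$demo"
rm -rf "$demo"
```

Printing the commands first makes it easy to review the list before handing it to a job scheduler.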
Run checkpoints with XS-GEM5: the doc to run XS-GEM5
Read the source code of GCPT restorer
We restore a checkpoint in M mode, and the PC used to return to user mode is stored in the EPC register. This recovery method breaks the architectural state (EPC) if the checkpoint was produced in M mode. In contrast, if the checkpoint was produced in S mode or U mode, the return is just like a normal trap return and does not break the architectural state.
Please use the master branch. The checkpoint-related code is not merged from the tracing branch into master.
First, make sure you have obtained a checkpoint.gz, not a bbv.gz. Then, see the doc to run checkpoints.
First, make sure the interval size is smaller than the total instruction count of the application. Second, it is not necessary to produce checkpoints for small applications with few intervals.
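The relationship between interval size and instruction count is simple integer division, sketched below. The function name `interval_count` and the example numbers are hypothetical; the total instruction count would come from your own profiling run.

```shell
#!/bin/sh
# Number of complete intervals: total instruction count divided by interval size.
interval_count() {
    total=$1
    interval=$2
    echo $(( total / interval ))
}

# Hypothetical application: 1.5 billion instructions, 20M-instruction interval.
interval_count 1500000000 20000000    # prints 75
# An application shorter than one interval yields no complete interval.
interval_count 5000000 20000000       # prints 0
```

A result of 0 means the interval size is larger than the application, and a very small result suggests the application may not be worth checkpointing.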
The typical sampling interval size used in architecture research is 10M-200M instructions, while the typical warmup interval size is 20M-100M instructions. The right choice depends on your cache size and use case. For example, when studying the temporal locality of a cache, it is better to use a larger interval size (>=50M).
The simulation time depends on the IPC of the application and the complexity of the CPU model. For Verilator simulation of XiangShan processor, the simulation time varies from hours to days. For XS-GEM5, the simulation time typically ranges from 6 minutes to 1 hour.
First, check the FAQs of building the Linux kernel for XiangShan.
Then, search for a solution in the issue list of NEMU and the issue list of the XiangShan doc.
Finally, if you cannot find a solution, please open a new issue in the XiangShan doc.