emulsiV is a visual simulator for a simple RISC processor called Virgule.
Virgule is a 32-bit RISC processor core that implements a minimal subset of the RISC-V instruction set. Here, “minimal” means that Virgule accepts only the instructions that a C compiler would generate from a pure stand-alone C program.
Virgule and emulsiV are used for teaching computer architecture and digital design to beginners at ESEO. Before choosing a processor architecture, we had the following requirements in mind:
Among the candidate architectures, RISC-V met all our requirements:
These properties allow several teaching scenarios.
In a computer architecture course, students discover how a processor works and what kinds of languages and tools can be used to create low-level programs (assembly, C). A simulator is a great way to visualize how each instruction affects the data path.
In a digital circuit design course, students can implement a processor core in VHDL or Verilog, or instantiate one to make their own system-on-chip.
Virgule implements all the computational, control transfer, and memory access instructions from the integer subset RV32I of the RISC-V specification. It also provides an exception return instruction that can be used in interrupt handlers.
The following instructions are not available:
The RISC-V architecture defines three privilege levels: user, supervisor, and machine. Virgule only supports the machine level.
Virgule contains the following 32-bit registers:
- The general-purpose registers x0 to x31.
- The program counter pc. This register contains the address of the current instruction. Its reset value is zero and it is always a multiple of 4.
- The machine exception program counter mepc. This register receives the return address when a machine-level exception occurs.
Data formats for memory load and store instructions follow these conventions:
Name | Data size (bits) | Address is a multiple of |
---|---|---|
Byte (B ) | 8 | 1 |
Half word (H ) | 16 | 2 |
Word (W ) | 32 | 4 |
In memory, 16-bit and 32-bit data will follow the little-endian ordering.
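To make the little-endian convention concrete, here is a minimal C sketch of how a 32-bit word maps to four consecutive bytes. The helper names `store_word_le` and `load_word_le` and the byte-array memory model are illustrative assumptions, not part of emulsiV.

```c
#include <stdint.h>

/* Store a 32-bit word at mem[addr..addr+3], least significant byte first. */
static inline void store_word_le(uint8_t *mem, uint32_t addr, uint32_t value)
{
    mem[addr]     = (uint8_t)(value & 0xFFu);
    mem[addr + 1] = (uint8_t)((value >> 8) & 0xFFu);
    mem[addr + 2] = (uint8_t)((value >> 16) & 0xFFu);
    mem[addr + 3] = (uint8_t)((value >> 24) & 0xFFu);
}

/* Rebuild a 32-bit word from four bytes stored in little-endian order. */
static inline uint32_t load_word_le(const uint8_t *mem, uint32_t addr)
{
    return (uint32_t)mem[addr]
         | ((uint32_t)mem[addr + 1] << 8)
         | ((uint32_t)mem[addr + 2] << 16)
         | ((uint32_t)mem[addr + 3] << 24);
}
```

With this model, storing 0x12345678 at address 4 places 0x78 at address 4 and 0x12 at address 7.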
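The following two addresses have a specific role:
- 00000000 is the reset address: it is the reset value of pc, so execution starts there.
- 00000004 is the interrupt address: pc is set to 4 when an interrupt request is accepted.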
Virgule implements a simple hardware interrupt scheme in the machine privilege mode. There is no interrupt control or status register in the processor core itself.
When it receives an interrupt request, Virgule performs the following operations:
- Set mepc to the address of the next instruction.
- Set pc to 4, which will transfer control to the interrupt handler.
Returning from an interrupt handler is done with the mret instruction. This instruction has the following effect:
- Copy mepc to pc.
In the following table:
- rd is the destination general-purpose register.
- rs1 and rs2 are the source general-purpose registers.
- imm is a literal (immediate) integer value.
Instruction | Syntax | Operation |
---|---|---|
Load Upper Immediate | LUI rd, imm | rd ← imm |
Add Upper Immediate to PC | AUIPC rd, imm | rd ← pc + imm |
Jump And Link | JAL rd, imm | rd ← pc + 4; pc ← pc + imm |
Jump And Link Register | JALR rd, rs1, imm | rd ← pc + 4; pc ← rs1 + imm |
Branch if Equal | BEQ rs1, rs2, imm | if rs1 = rs2: pc ← pc + imm else: pc ← pc + 4 |
Branch if Not Equal | BNE rs1, rs2, imm | if rs1 ≠ rs2: pc ← pc + imm else: pc ← pc + 4 |
Branch if Less Than | BLT rs1, rs2, imm | if signed(rs1) < signed(rs2): pc ← pc + imm else: pc ← pc + 4 |
Branch if Greater or Equal | BGE rs1, rs2, imm | if signed(rs1) ≥ signed(rs2): pc ← pc + imm else: pc ← pc + 4 |
Branch if Less Than Unsigned | BLTU rs1, rs2, imm | if unsigned(rs1) < unsigned(rs2): pc ← pc + imm else: pc ← pc + 4 |
Branch if Greater or Equal Unsigned | BGEU rs1, rs2, imm | if unsigned(rs1) ≥ unsigned(rs2): pc ← pc + imm else: pc ← pc + 4 |
Load Byte | LB rd, imm(rs1) | rd ← signed(mem[rs1+imm]) |
Load Half word | LH rd, imm(rs1) | rd ← signed(mem[rs1+imm:rs1+imm+1]) |
Load Word | LW rd, imm(rs1) | rd ← signed(mem[rs1+imm:rs1+imm+3]) |
Load Byte Unsigned | LBU rd, imm(rs1) | rd ← unsigned(mem[rs1+imm]) |
Load Half word Unsigned | LHU rd, imm(rs1) | rd ← unsigned(mem[rs1+imm:rs1+imm+1]) |
Store Byte | SB rs2, imm(rs1) | mem[rs1+imm] ← rs2[7:0] |
Store Half word | SH rs2, imm(rs1) | mem[rs1+imm:rs1+imm+1] ← rs2[15:0] |
Store Word | SW rs2, imm(rs1) | mem[rs1+imm:rs1+imm+3] ← rs2 |
Add Immediate | ADDI rd, rs1, imm | rd ← rs1 + imm |
Shift Left Logical Immediate | SLLI rd, rs1, imm | rd ← rs1 sll imm |
Set on Less Than Immediate | SLTI rd, rs1, imm | if signed(rs1) < signed(imm): rd ← 1 else: rd ← 0 |
Set on Less Than Immediate Unsigned | SLTIU rd, rs1, imm | if unsigned(rs1) < unsigned(imm): rd ← 1 else: rd ← 0 |
Exclusive Or Immediate | XORI rd, rs1, imm | rd ← rs1 xor imm |
Shift Right Logical Immediate | SRLI rd, rs1, imm | rd ← rs1 srl imm |
Shift Right Arithmetic Immediate | SRAI rd, rs1, imm | rd ← rs1 sra imm |
Or Immediate | ORI rd, rs1, imm | rd ← rs1 or imm |
And Immediate | ANDI rd, rs1, imm | rd ← rs1 and imm |
Add | ADD rd, rs1, rs2 | rd ← rs1 + rs2 |
Subtract | SUB rd, rs1, rs2 | rd ← rs1 - rs2 |
Shift Left Logical | SLL rd, rs1, rs2 | rd ← rs1 sll rs2[4:0] |
Set on Less Than | SLT rd, rs1, rs2 | if signed(rs1) < signed(rs2): rd ← 1 else: rd ← 0 |
Set on Less Than Unsigned | SLTU rd, rs1, rs2 | if unsigned(rs1) < unsigned(rs2): rd ← 1 else: rd ← 0 |
Exclusive Or | XOR rd, rs1, rs2 | rd ← rs1 xor rs2 |
Shift Right Logical | SRL rd, rs1, rs2 | rd ← rs1 srl rs2[4:0] |
Shift Right Arithmetic | SRA rd, rs1, rs2 | rd ← rs1 sra rs2[4:0] |
Or | OR rd, rs1, rs2 | rd ← rs1 or rs2 |
And | AND rd, rs1, rs2 | rd ← rs1 and rs2 |
Machine Return | MRET | pc ← mepc |
In the above table, logical and shift operations have the following meanings:
Operator | Effect |
---|---|
and | Bitwise and |
or | Bitwise or |
xor | Bitwise exclusive or |
sll | Logical shift left |
srl | Logical shift right |
sra | Arithmetic shift right (with sign extension) |
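As an illustration of the three shift operators, the following C sketch models them on 32-bit values. The function names `op_sll`, `op_srl` and `op_sra` are hypothetical. The shift amount is masked to its five low-order bits, which is what the rs2[4:0] notation in the SRL and SRA rows expresses. The arithmetic shift is written out explicitly because right-shifting a negative signed value is implementation-defined in C.

```c
#include <stdint.h>

/* Logical shift left: vacated low-order bits are filled with 0. */
static inline uint32_t op_sll(uint32_t a, uint32_t b)
{
    return a << (b & 31u);
}

/* Logical shift right: vacated high-order bits are filled with 0. */
static inline uint32_t op_srl(uint32_t a, uint32_t b)
{
    return a >> (b & 31u);
}

/* Arithmetic shift right: vacated high-order bits copy the sign bit. */
static inline uint32_t op_sra(uint32_t a, uint32_t b)
{
    uint32_t n = b & 31u;
    uint32_t shifted = a >> n;
    if ((a & 0x80000000u) != 0u && n > 0u) {
        /* Set the n most significant bits to 1. */
        shifted |= (uint32_t)~(0xFFFFFFFFu >> n);
    }
    return shifted;
}
```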
An instruction word can be composed of the following fields:
- funct7, funct3 and opcode define the operation to perform.
- rd is the index of the destination general-purpose register (allowed values are in the range 0 to 31).
- rs1 and rs2 are the indices of the source general-purpose registers (allowed values are in the range 0 to 31).
- imm represents a literal (immediate) integer value.
The RISC-V instruction set defines six instruction formats:
Format / Bits | 31:25 | 24:20 | 19:15 | 14:12 | 11:7 | 6:0 |
---|---|---|---|---|---|---|
R | funct7 | rs2 | rs1 | funct3 | rd | opcode |
I | imm[11:5] / funct7 | imm[4:0] | rs1 | funct3 | rd | opcode |
S | imm[11:5] | rs2 | rs1 | funct3 | imm[4:0] | opcode |
B | imm[12,10:5] | rs2 | rs1 | funct3 | imm[4:1,11] | opcode |
U | imm[31:25] | imm[24:20] | imm[19:15] | imm[14:12] | rd | opcode |
J | imm[20,10:5] | imm[4:1,11] | imm[19:15] | imm[14:12] | rd | opcode |
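The non-immediate fields sit at the same bit positions in every format that uses them, so they can be extracted with plain shifts and masks. The sketch below is one possible way to do it in C; the `struct fields` type and the `split_fields` function are illustrative names, not taken from the Virgule sources.

```c
#include <stdint.h>

/* Raw fields of an instruction word, at the positions shown for format R. */
struct fields {
    uint32_t opcode;  /* bits 6:0   */
    uint32_t rd;      /* bits 11:7  */
    uint32_t funct3;  /* bits 14:12 */
    uint32_t rs1;     /* bits 19:15 */
    uint32_t rs2;     /* bits 24:20 */
    uint32_t funct7;  /* bits 31:25 */
};

static inline struct fields split_fields(uint32_t inst)
{
    struct fields f;
    f.opcode = inst & 0x7Fu;
    f.rd     = (inst >> 7) & 0x1Fu;
    f.funct3 = (inst >> 12) & 0x7u;
    f.rs1    = (inst >> 15) & 0x1Fu;
    f.rs2    = (inst >> 20) & 0x1Fu;
    f.funct7 = (inst >> 25) & 0x7Fu;
    return f;
}
```

For example, the word 0x002081B3 (ADD x3, x1, x2) splits into opcode 0110011, rd 3, rs1 1, rs2 2, with funct3 and funct7 both zero.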
Immediate values are sign-extended to 32 bits. When they are not explicitly encoded in the imm field, the least significant bits are 0.
In the specification, formats B and J are described as variants of formats S and U.
In formats B and J, immediate values represent offsets in relative branch instructions. They are encoded so that they share most of their bits with other formats while preserving their most significant bit at location 31 of the instruction word.
The following table shows the mapping between the bits of the instruction word and the bits of the immediate values:
Format | imm[31:25] | imm[24:21] | imm[20] | imm[19:15] | imm[14:12] | imm[11] | imm[10:5] | imm[4:1] | imm[0] |
---|---|---|---|---|---|---|---|---|---|
I | inst[31] | inst[31] | inst[31] | inst[31] | inst[31] | inst[31] | inst[30:25] | inst[24:21] | inst[20] |
S | inst[31] | inst[31] | inst[31] | inst[31] | inst[31] | inst[31] | inst[30:25] | inst[11:8] | inst[7] |
B | inst[31] | inst[31] | inst[31] | inst[31] | inst[31] | inst[7] | inst[30:25] | inst[11:8] | 0 |
U | inst[31:25] | inst[24:21] | inst[20] | inst[19:15] | inst[14:12] | 0 | 0 | 0 | 0 |
J | inst[31] | inst[31] | inst[31] | inst[19:15] | inst[14:12] | inst[20] | inst[30:25] | inst[24:21] | 0 |
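The mapping in the table above translates almost line by line into C. The following sketch is only one possible formulation; the helper names (`bits`, `sign`, `imm_i`, `imm_s`, `imm_b`, `imm_u`, `imm_j`) are assumptions made for this example. It also assumes a two's complement target, which holds for the usual compilers, when converting the assembled bit pattern to `int32_t`.

```c
#include <stdint.h>

/* Extract inst[hi:lo] as an unsigned value (field width must be below 32). */
static inline uint32_t bits(uint32_t inst, unsigned hi, unsigned lo)
{
    return (inst >> lo) & ((1u << (hi - lo + 1u)) - 1u);
}

/* All ones if inst[31] is set, all zeros otherwise (sign extension source). */
static inline uint32_t sign(uint32_t inst)
{
    return (inst >> 31) ? 0xFFFFFFFFu : 0u;
}

static inline int32_t imm_i(uint32_t inst)
{
    return (int32_t)((sign(inst) << 11) | (bits(inst, 30, 25) << 5)
                   | (bits(inst, 24, 21) << 1) | bits(inst, 20, 20));
}

static inline int32_t imm_s(uint32_t inst)
{
    return (int32_t)((sign(inst) << 11) | (bits(inst, 30, 25) << 5)
                   | (bits(inst, 11, 8) << 1) | bits(inst, 7, 7));
}

static inline int32_t imm_b(uint32_t inst)
{
    return (int32_t)((sign(inst) << 12) | (bits(inst, 7, 7) << 11)
                   | (bits(inst, 30, 25) << 5) | (bits(inst, 11, 8) << 1));
}

static inline int32_t imm_u(uint32_t inst)
{
    return (int32_t)(inst & 0xFFFFF000u);
}

static inline int32_t imm_j(uint32_t inst)
{
    return (int32_t)((sign(inst) << 20) | (bits(inst, 19, 12) << 12)
                   | (bits(inst, 20, 20) << 11) | (bits(inst, 30, 25) << 5)
                   | (bits(inst, 24, 21) << 1));
}
```

As a quick check, 0xFFF00093 (ADDI x1, x0, -1) yields -1 with `imm_i`, and 0xFE000EE3 (BEQ x0, x0, -4) yields -4 with `imm_b`.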
In Virgule, we have retained the following base opcodes from the RISC-V specification. Each opcode corresponds to a specific instruction format:
Name | opcode | Format |
---|---|---|
LOAD | 0000011 | I |
OP-IMM | 0010011 | I |
AUIPC | 0010111 | U |
STORE | 0100011 | S |
OP | 0110011 | R |
LUI | 0110111 | U |
BRANCH | 1100011 | B |
JALR | 1100111 | I |
JAL | 1101111 | J |
SYSTEM | 1110011 | I |
When decoding an instruction word, Virgule uses the following fields to identify the actual instruction. In this table, the opcode column refers to the names from the base opcode table above.
Instruction | opcode | funct3 | funct7 | rs2 |
---|---|---|---|---|
LUI | LUI | — | — | — |
AUIPC | AUIPC | — | — | — |
JAL | JAL | — | — | — |
JALR | JALR | 000 | — | — |
BEQ | BRANCH | 000 | — | — |
BNE | BRANCH | 001 | — | — |
BLT | BRANCH | 100 | — | — |
BGE | BRANCH | 101 | — | — |
BLTU | BRANCH | 110 | — | — |
BGEU | BRANCH | 111 | — | — |
LB | LOAD | 000 | — | — |
LH | LOAD | 001 | — | — |
LW | LOAD | 010 | — | — |
LBU | LOAD | 100 | — | — |
LHU | LOAD | 101 | — | — |
SB | STORE | 000 | — | — |
SH | STORE | 001 | — | — |
SW | STORE | 010 | — | — |
ADDI | OP-IMM | 000 | — | — |
SLLI | OP-IMM | 001 | 0000000 | — |
SLTI | OP-IMM | 010 | — | — |
SLTIU | OP-IMM | 011 | — | — |
XORI | OP-IMM | 100 | — | — |
SRLI | OP-IMM | 101 | 0000000 | — |
SRAI | OP-IMM | 101 | 0100000 | — |
ORI | OP-IMM | 110 | — | — |
ANDI | OP-IMM | 111 | — | — |
ADD | OP | 000 | 0000000 | — |
SUB | OP | 000 | 0100000 | — |
SLL | OP | 001 | 0000000 | — |
SLT | OP | 010 | 0000000 | — |
SLTU | OP | 011 | 0000000 | — |
XOR | OP | 100 | 0000000 | — |
SRL | OP | 101 | 0000000 | — |
SRA | OP | 101 | 0100000 | — |
OR | OP | 110 | 0000000 | — |
AND | OP | 111 | 0000000 | — |
MRET | SYSTEM | 000 | 0011000 | 00010 |
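To show how the decode table can be used, here is a small C sketch that identifies the register-register (OP) instructions and MRET from the opcode, funct3, funct7 and rs2 fields. It covers only those rows; the function name `decode_op_or_mret` and the choice of returning a mnemonic string are assumptions for this example, and it does not describe how Virgule or emulsiV actually implement decoding.

```c
#include <stdint.h>

/* Return the mnemonic of an OP or MRET instruction word, or "?" otherwise. */
static inline const char *decode_op_or_mret(uint32_t inst)
{
    static const char *const base_names[8] = {
        "ADD", "SLL", "SLT", "SLTU", "XOR", "SRL", "OR", "AND"
    };
    uint32_t opcode = inst & 0x7Fu;
    uint32_t funct3 = (inst >> 12) & 0x7u;
    uint32_t rs2    = (inst >> 20) & 0x1Fu;
    uint32_t funct7 = (inst >> 25) & 0x7Fu;

    if (opcode == 0x73u) {                    /* SYSTEM = 1110011 */
        /* MRET: funct3 = 000, funct7 = 0011000, rs2 = 00010 */
        return (funct3 == 0u && funct7 == 0x18u && rs2 == 0x02u) ? "MRET" : "?";
    }
    if (opcode != 0x33u) {                    /* OP = 0110011 */
        return "?";
    }
    if (funct7 == 0x20u) {                    /* funct7 = 0100000 */
        if (funct3 == 0u) return "SUB";
        if (funct3 == 5u) return "SRA";
        return "?";
    }
    if (funct7 != 0u) {
        return "?";
    }
    return base_names[funct3];                /* funct7 = 0000000 */
}
```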
The addressing space is organized as follows.
Address (hex) | Device |
---|---|
00000000 ⋮ 00000BFF | RAM (3072 bytes) |
00000C00 ⋮ 00000FFF | Bitmap RAM (1024 bytes) |
B0000000 B0000001 | Text input |
C0000000 | Text output |
D0000000 | General-purpose input/output |
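The address map can be summarized by a small lookup function. This is a sketch for illustration only: the `enum device` type and the `device_at` function are hypothetical names, and the upper bound used for the GPIO block (D0000013) is derived from the 32-bit GPIO registers documented for addresses D0000000 to D0000010.

```c
#include <stdint.h>

enum device {
    DEV_RAM, DEV_BITMAP, DEV_TEXT_IN, DEV_TEXT_OUT, DEV_GPIO, DEV_NONE
};

/* Map a byte address to the device that responds to it. */
static inline enum device device_at(uint32_t addr)
{
    if (addr <= 0x00000BFFu) return DEV_RAM;
    if (addr <= 0x00000FFFu) return DEV_BITMAP;
    if (addr == 0xB0000000u || addr == 0xB0000001u) return DEV_TEXT_IN;
    if (addr == 0xC0000000u) return DEV_TEXT_OUT;
    if (addr >= 0xD0000000u && addr <= 0xD0000013u) return DEV_GPIO;
    return DEV_NONE;
}
```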
The text input device is represented by a text field in the user interface of the simulator. It has two 8-bit registers:
Address (hex) | Role | Value |
---|---|---|
B0000000 | Control/Status | Bit 7: Interrupt enable Bit 6: Character received |
B0000001 | Data | The ASCII code of the last input character. |
The control/status register works as follows:
The text output device is represented by a text area in the user interface of the simulator. It has only one write-only register:
Address (hex) | Role | Value |
---|---|---|
C0000000 | Data | The ASCII code of the character to display. |
The GPIO (General-purpose input/output) peripheral allows connecting up to 32 simple user I/O devices:
The inputs are arranged in an 8×4 grid at the bottom of the General-purpose I/O section of the simulator. Right-click on a cell to change its type. Left-click to change the state of a button or switch.
It has the following 32-bit registers:
Address (hex) | Role | Value |
---|---|---|
D0000000 | Direction (dir) | The configuration of each pin (0 for an output, 1 for an input). |
D0000004 | Interrupt enable (ien) | Enable interrupts on input events. |
D0000008 | Rising-edge events (rev) | Each bit is set to 1 if the corresponding input pin has changed from 0 to 1. |
D000000C | Falling-edge events (fev) | Each bit is set to 1 if the corresponding input pin has changed from 1 to 0. |
D0000010 | Value (val) | The current value of each input or output. |
On reset, all pins are configured as inputs and interrupts are disabled.
The rising-edge and falling-edge events registers must be cleared in software using store instructions when the events have been processed.
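From C, the GPIO registers can be described with a structure whose layout matches the register table. The sketch below is a possible starting point, not an official driver: the names `gpio_regs`, `GPIO`, `gpio_set_output` and `gpio_take_rising_edge` are made up for this example, and it assumes that storing 0 into an event bit clears it, which this document does not state explicitly.

```c
#include <stddef.h>
#include <stdint.h>

/* Register layout, in the order of the table: offsets 0x00 to 0x10. */
struct gpio_regs {
    volatile uint32_t dir;  /* D0000000: 0 = output, 1 = input */
    volatile uint32_t ien;  /* D0000004: interrupt enable      */
    volatile uint32_t rev;  /* D0000008: rising-edge events    */
    volatile uint32_t fev;  /* D000000C: falling-edge events   */
    volatile uint32_t val;  /* D0000010: current pin values    */
};

/* Base address of the peripheral on the simulated system. */
#define GPIO ((struct gpio_regs *)(uintptr_t)0xD0000000u)

/* Configure one pin as an output (all pins are inputs after reset). */
static inline void gpio_set_output(struct gpio_regs *gpio, unsigned pin)
{
    gpio->dir &= ~(1u << pin);
}

/* Return 1 and clear the event if a rising edge was recorded on the pin. */
static inline int gpio_take_rising_edge(struct gpio_regs *gpio, unsigned pin)
{
    uint32_t mask = 1u << pin;
    if ((gpio->rev & mask) == 0u) {
        return 0;
    }
    gpio->rev &= ~mask;  /* Assumption: writing 0 clears the event bit. */
    return 1;
}
```

On the simulator, a program would call these helpers with `GPIO` as the first argument, for instance `gpio_take_rising_edge(GPIO, 0)` to poll a button on pin 0.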
The simulator provides a graphic display area with 32 rows of 32 pixels. Each pixel is mapped to a RAM byte and its color is encoded like this:
Bits | 7:5 | 4:2 | 1:0 |
---|---|---|---|
Color | Red | Green | Blue |
The bitmap RAM follows a raster-scan ordering. For each address, this table shows the (x, y) coordinates of the corresponding pixel:
Address (hex) | +0 | +1 | +2 | … | +1F |
---|---|---|---|---|---|
00000C00 | (0, 0) | (1, 0) | (2, 0) | … | (31, 0) |
00000C20 | (0, 1) | (1, 1) | (2, 1) | … | (31, 1) |
00000C40 | (0, 2) | (1, 2) | (2, 2) | … | (31, 2) |
⋮ | ⋮ | ⋮ | ⋮ | … | ⋮ |
00000FE0 | (0, 31) | (1, 31) | (2, 31) | … | (31, 31) |
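The raster-scan ordering means that the address of a pixel is the bitmap base address plus y × 32 + x. The following C sketch computes it; `pixel_address` and `pack_color` are hypothetical helper names. The color packing assumes 3 bits of red (7:5), 3 bits of green (4:2) and 2 bits of blue (1:0), which is an assumption about how the eight bits are shared between the three components.

```c
#include <stdint.h>

#define BITMAP_BASE  0x00000C00u
#define BITMAP_WIDTH 32u

/* Address of the byte that holds the pixel at column x, row y. */
static inline uint32_t pixel_address(uint32_t x, uint32_t y)
{
    return BITMAP_BASE + y * BITMAP_WIDTH + x;
}

/* Pack color components into one byte (assumed layout: RRRGGGBB). */
static inline uint8_t pack_color(uint8_t red3, uint8_t green3, uint8_t blue2)
{
    return (uint8_t)(((red3 & 7u) << 5) | ((green3 & 7u) << 2) | (blue2 & 3u));
}
```

For instance, the pixel at (0, 2) is stored at address 00000C40 and the pixel at (31, 31) at address 00000FFF, in agreement with the table.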
The simulator lets you create and edit programs by entering instructions in the assembly column of the memory view. Another option is to type your program in a text editor and generate an executable for emulsiV using the GNU toolchain.
The following instructions have been tested in Ubuntu 20.04.
If you are using Ubuntu, you can install the pre-built RISC-V bare-metal toolchain using this command:
sudo apt install gcc-riscv64-unknown-elf
Despite what the name suggests, this installs a toolchain that supports both the 32-bit and 64-bit architectures.
Here is a typical startup module (startup.s) that you can use for your programs. Another assembly or C source file should define the main subprogram, and optionally override the irq_handler subprogram.

        .section vectors, "x"

        .global __reset
    __reset:                        # Reset vector (address 0)
        j start

    __irq:                          # Interrupt vector (address 4)
        j irq_handler

        .text
        .align 4

        .weak irq_handler
    irq_handler:                    # Default handler: return immediately
        mret

    start:
        la gp, __global_pointer
        la sp, __stack_pointer

        la t0, __bss_start          # Clear the .bss section
        la t1, __bss_end
        bgeu t0, t1, memclr_done
    memclr:
        sw zero, (t0)
        addi t0, t0, 4
        bltu t0, t1, memclr

    memclr_done:
        call main
        j .                         # Loop forever if main returns
This command assembles the source file startup.s into an object file startup.o:
riscv64-unknown-elf-gcc -march=rv32i -mabi=ilp32 -c -o startup.o startup.s
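Here is an example C program (hello.c) that writes a message to the text output device. The register is accessed through a volatile pointer so that the compiler does not optimize the stores away.

    #define TEXT_OUT (*(volatile char*)0xC0000000)

    void print(const char *str) {
        while (*str) {
            TEXT_OUT = *str++;
        }
    }

    void main(void) {
        print("Virgule says\n<< Hello! >>\n");
    }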
This command compiles the source file hello.c into an object file hello.o:
riscv64-unknown-elf-gcc -march=rv32i -mabi=ilp32 -ffreestanding -c -o hello.o hello.c
If you want to write an interrupt handler in C, you can override the irq_handler subprogram, adding the interrupt attribute like this:

    __attribute__((interrupt("machine")))
    void irq_handler(void) {
        // Insert your code here.
    }
This command links startup.o and hello.o into a binary executable file hello.elf:
riscv64-unknown-elf-gcc -march=rv32i -mabi=ilp32 -nostdlib -T emulsiv.ld -o hello.elf startup.o hello.o
The memory map of the simulator is configured in the linker script emulsiv.ld below:

    ENTRY(__reset)

    MEM_SIZE    = 4K;
    STACK_SIZE  = 512;
    BITMAP_SIZE = 1K;

    SECTIONS {
        . = 0x0;

        .text : {
            *(vectors)
            *(.text)
            __text_end = .;
        }

        .data   : { *(.data) }
        .rodata : { *(.rodata) }

        __global_pointer = ALIGN(4);

        .bss ALIGN(4) : {
            __bss_start = .;
            *(.bss COMMON)
            __bss_end = ALIGN(4);
        }

        . = MEM_SIZE - STACK_SIZE - BITMAP_SIZE;

        .stack ALIGN(4) : {
            __stack_start = .;
            . += STACK_SIZE;
            __stack_pointer = .;
        }

        .bitmap ALIGN(4) : {
            __bitmap_start = .;
            *(bitmap)
        }

        __bitmap_end = __bitmap_start + BITMAP_SIZE;
    }
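Since the linker script collects input sections named bitmap into the .bitmap output section, a C program can presumably reserve the display memory as an ordinary array placed in that section. With the sizes given (4K − 512 − 1K = 0xA00 for the stack start, plus 512 bytes of stack), the array would land at address 0xC00. The sketch below illustrates the idea; the names `IN_BITMAP`, `framebuffer` and `set_pixel` are assumptions, and the section attribute is only applied when compiling for RISC-V so that the code can also be built and tried on a host machine.

```c
#include <stdint.h>

/* Place the array in the "bitmap" input section on the RISC-V target only. */
#if defined(__riscv)
#define IN_BITMAP __attribute__((section("bitmap")))
#else
#define IN_BITMAP
#endif

/* One byte per pixel, 32 rows of 32 pixels, in raster-scan order. */
IN_BITMAP uint8_t framebuffer[32][32];

/* Write a color byte at column x, row y. */
static inline void set_pixel(unsigned x, unsigned y, uint8_t color)
{
    framebuffer[y][x] = color;
}
```

Indexing the array as `framebuffer[y][x]` reproduces the y × 32 + x offset of the raster-scan ordering.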
This command converts an ELF binary file hello.elf into a text file hello.hex in the Intel Hex format:
riscv64-unknown-elf-objcopy -O ihex hello.elf hello.hex
In the simulator, use the button "Open an hex file from your computer" and choose hello.hex or any other hex file that you want to load.
emulsiV is free software and is distributed under the terms of the Mozilla Public License 2.0.
This document was created by Guillaume Savaton, ESEO. It is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.