About Virgule and its simulator emulsiV

Guillaume Savaton, ESEO

emulsiV is a visual simulator for a simple RISC processor called Virgule.

Virgule is a 32-bit RISC processor core that implements a minimal subset of the RISC-V instruction set. Here, “minimal” means that Virgule accepts only the instructions that a C compiler would generate from a pure stand-alone C program.

Motivation and scope

Virgule and emulsiV are used for teaching computer architecture and digital design to beginners at ESEO. Before choosing a processor architecture, we had the following requirements in mind:

Among the candidate architectures, RISC-V met all our requirements:

These properties allow several teaching scenarios.

In a computer architecture course, students discover how a processor works and what kinds of languages and tools can be used to create low-level programs (assembly, C). A simulator is a great way to visualize how each instruction affects the data path.

In a digital circuit design course, students can implement a processor core in VHDL or Verilog, or instantiate one to make their own system-on-chip.

Processor architecture

Relation with the RISC-V specification

Virgule implements all the computational, control transfer, and memory access instructions from the integer subset RV32I of the RISC-V specification. It also provides an exception return instruction that can be used in interrupt handlers.

The following instructions are not available:

The RISC-V architecture defines three privilege levels: user, supervisor, and machine. Virgule only supports the machine level.

Registers

Virgule contains the following 32-bit registers:

Memory organization

Data formats for memory load and store instructions follow these conventions:

Name Data size (bits) Address is a multiple of
Byte (B) 8 1
Half word (H) 16 2
Word (W) 32 4

In memory, 16-bit and 32-bit data follow the little-endian ordering: the least significant byte is stored at the lowest address.
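
The alignment and byte-ordering conventions above can be illustrated with a small host-side C sketch. The function names is_aligned and load_word_le are hypothetical helpers introduced for this example; they are not part of emulsiV.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when addr is a multiple of size_bytes (1, 2, or 4),
 * as required for byte, half word, and word accesses. */
bool is_aligned(uint32_t addr, uint32_t size_bytes) {
    return (addr % size_bytes) == 0;
}

/* Assemble a 32-bit word from four consecutive bytes of a memory
 * image, least significant byte first (little-endian). */
uint32_t load_word_le(const uint8_t *mem, uint32_t addr) {
    return (uint32_t)mem[addr]
         | ((uint32_t)mem[addr + 1] << 8)
         | ((uint32_t)mem[addr + 2] << 16)
         | ((uint32_t)mem[addr + 3] << 24);
}
```

For instance, the bytes 78 56 34 12 stored at increasing addresses are read back as the word 12345678 (hex).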

The following two addresses have a specific role:

Interrupts

Virgule implements a simple hardware interrupt scheme in the machine privilege mode. There is no interrupt control or status register in the processor core itself.

When it receives an interrupt request, Virgule performs the following operations:

  1. Complete the current instruction.
  2. Switch to an uninterruptible state.
  3. Set mepc to the address of the next instruction.
  4. Set pc to 4, which will transfer control to the interrupt handler.

Returning from an interrupt handler is done with the mret instruction. This instruction has the following effect:

  1. Copy mepc to pc.
  2. Switch to an interruptible state.
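
The entry and return sequences above can be modeled in a few lines of C. This is only a sketch: the cpu_state structure and the function names are hypothetical, and the model assumes that a request received in the uninterruptible state is simply not accepted yet.

```c
#include <stdbool.h>
#include <stdint.h>

#define IRQ_VECTOR 4u  /* pc value loaded on interrupt entry */

typedef struct {
    uint32_t pc;
    uint32_t mepc;
    bool interruptible;
} cpu_state;

/* Call once the current instruction has completed; next_pc is the
 * address of the next instruction. Returns true if the request was
 * accepted, false if the core is in the uninterruptible state. */
bool cpu_enter_irq(cpu_state *cpu, uint32_t next_pc) {
    if (!cpu->interruptible) {
        return false;             /* ASSUMPTION: request is not taken now */
    }
    cpu->interruptible = false;   /* switch to the uninterruptible state */
    cpu->mepc = next_pc;          /* remember where to resume */
    cpu->pc = IRQ_VECTOR;         /* transfer control to the handler */
    return true;
}

/* Effect of the mret instruction. */
void cpu_mret(cpu_state *cpu) {
    cpu->pc = cpu->mepc;
    cpu->interruptible = true;
}
```

Because interrupts stay masked between entry and mret, a handler in this model cannot be preempted by another request.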

Instruction set

In the following table:

Instruction Syntax Operation
Load Upper Immediate LUI rd, imm rd ← imm
Add Upper Immediate to PC AUIPC rd, imm rd ← pc + imm
Jump And Link JAL rd, imm rd ← pc + 4; pc ← pc + imm
Jump And Link Register JALR rd, rs1, imm rd ← pc + 4; pc ← rs1 + imm
Branch if Equal BEQ rs1, rs2, imm
if rs1 = rs2:
    pc ← pc + imm
else:
    pc ← pc + 4
Branch if Not Equal BNE rs1, rs2, imm
if rs1 ≠ rs2:
    pc ← pc + imm
else:
    pc ← pc + 4
Branch if Less Than BLT rs1, rs2, imm
if signed(rs1) < signed(rs2):
    pc ← pc + imm
else:
    pc ← pc + 4
Branch if Greater or Equal BGE rs1, rs2, imm
if signed(rs1) ≥ signed(rs2):
    pc ← pc + imm
else:
    pc ← pc + 4
Branch if Less Than Unsigned BLTU rs1, rs2, imm
if unsigned(rs1) < unsigned(rs2):
    pc ← pc + imm
else:
    pc ← pc + 4
Branch if Greater or Equal Unsigned BGEU rs1, rs2, imm
if unsigned(rs1) ≥ unsigned(rs2):
    pc ← pc + imm
else:
    pc ← pc + 4
Load Byte LB rd, imm(rs1) rd ← signed(mem[rs1+imm])
Load Half word LH rd, imm(rs1) rd ← signed(mem[rs1+imm:rs1+imm+1])
Load Word LW rd, imm(rs1) rd ← signed(mem[rs1+imm:rs1+imm+3])
Load Byte Unsigned LBU rd, imm(rs1) rd ← unsigned(mem[rs1+imm])
Load Half word Unsigned LHU rd, imm(rs1) rd ← unsigned(mem[rs1+imm:rs1+imm+1])
Store Byte SB rs2, imm(rs1) mem[rs1+imm] ← rs2[7:0]
Store Half word SH rs2, imm(rs1) mem[rs1+imm:rs1+imm+1] ← rs2[15:0]
Store Word SW rs2, imm(rs1) mem[rs1+imm:rs1+imm+3] ← rs2
Add Immediate ADDI rd, rs1, imm rd ← rs1 + imm
Shift Left Logical Immediate SLLI rd, rs1, imm rd ← rs1 sll imm
Set on Less Than Immediate SLTI rd, rs1, imm
if signed(rs1) < signed(imm):
    rd ← 1
else:
    rd ← 0
Set on Less Than Immediate Unsigned SLTIU rd, rs1, imm
if unsigned(rs1) < unsigned(imm):
    rd ← 1
else:
    rd ← 0
Exclusive Or Immediate XORI rd, rs1, imm rd ← rs1 xor imm
Shift Right Logical Immediate SRLI rd, rs1, imm rd ← rs1 srl imm
Shift Right Arithmetic Immediate SRAI rd, rs1, imm rd ← rs1 sra imm
Or Immediate ORI rd, rs1, imm rd ← rs1 or imm
And Immediate ANDI rd, rs1, imm rd ← rs1 and imm
Add ADD rd, rs1, rs2 rd ← rs1 + rs2
Subtract SUB rd, rs1, rs2 rd ← rs1 - rs2
Shift Left Logical SLL rd, rs1, rs2 rd ← rs1 sll rs2[4:0]
Set on Less Than SLT rd, rs1, rs2
if signed(rs1) < signed(rs2):
    rd ← 1
else:
    rd ← 0
Set on Less Than Unsigned SLTU rd, rs1, rs2
if unsigned(rs1) < unsigned(rs2):
    rd ← 1
else:
    rd ← 0
Exclusive Or XOR rd, rs1, rs2 rd ← rs1 xor rs2
Shift Right Logical SRL rd, rs1, rs2 rd ← rs1 srl rs2[4:0]
Shift Right Arithmetic SRA rd, rs1, rs2 rd ← rs1 sra rs2[4:0]
Or OR rd, rs1, rs2 rd ← rs1 or rs2
And AND rd, rs1, rs2 rd ← rs1 and rs2
Machine Return MRET pc ← mepc

In the above table, logical and shift operations have the following meanings:

Operator Effect
and Bitwise and
or Bitwise or
xor Bitwise exclusive or
sll Logical shift left
srl Logical shift right
sra Arithmetic shift right (with sign extension)
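
As an illustration, the shift operators and the signed/unsigned comparisons used in the instruction table could be written in portable C as follows. The function names are hypothetical. Shift amounts are masked to 5 bits, matching the rs2[4:0] notation, and the arithmetic shift is built by hand because right-shifting a negative signed value is implementation-defined in C.

```c
#include <stdint.h>

uint32_t op_sll(uint32_t a, uint32_t b) { return a << (b & 31u); }
uint32_t op_srl(uint32_t a, uint32_t b) { return a >> (b & 31u); }

/* Arithmetic shift right: the vacated upper bits are filled with
 * copies of the sign bit of a. */
uint32_t op_sra(uint32_t a, uint32_t b) {
    uint32_t n = b & 31u;
    uint32_t fill = (a & 0x80000000u) ? ~(0xFFFFFFFFu >> n) : 0u;
    return (a >> n) | fill;
}

/* Unsigned comparison (SLTU, BLTU). */
uint32_t op_sltu(uint32_t a, uint32_t b) { return a < b; }

/* Signed comparison (SLT, BLT): flipping the sign bit of both
 * operands maps the signed order onto the unsigned order. */
uint32_t op_slt(uint32_t a, uint32_t b) {
    return (a ^ 0x80000000u) < (b ^ 0x80000000u);
}
```

With these definitions, FFFFFFFF (hex) is less than 1 when compared as signed (it represents -1) but not when compared as unsigned.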

Instruction encoding

An instruction word can be composed of the following fields:

The RISC-V instruction set defines six instruction formats:

Format / Bits 31:25 24:20 19:15 14:12 11:7 6:0
R funct7 rs2 rs1 funct3 rd opcode
I imm[11:5]/funct7 imm[4:0] rs1 funct3 rd opcode
S imm[11:5] rs2 rs1 funct3 imm[4:0] opcode
B imm[12,10:5] rs2 rs1 funct3 imm[4:1,11] opcode
U imm[31:25] imm[24:20] imm[19:15] imm[14:12] rd opcode
J imm[20,10:5] imm[4:1,11] imm[19:15] imm[14:12] rd opcode
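
The fixed field positions shared by these formats make field extraction a matter of shifts and masks. The following C sketch shows one way to do it; the inst_fields type and the split_fields function are hypothetical names chosen for this example.

```c
#include <stdint.h>

typedef struct {
    uint32_t opcode;  /* bits 6:0   */
    uint32_t rd;      /* bits 11:7  */
    uint32_t funct3;  /* bits 14:12 */
    uint32_t rs1;     /* bits 19:15 */
    uint32_t rs2;     /* bits 24:20 */
    uint32_t funct7;  /* bits 31:25 */
} inst_fields;

/* Extract every field at its fixed position. Which fields are
 * meaningful depends on the format of the instruction. */
inst_fields split_fields(uint32_t inst) {
    inst_fields f;
    f.opcode = inst & 0x7Fu;
    f.rd     = (inst >> 7) & 0x1Fu;
    f.funct3 = (inst >> 12) & 0x7u;
    f.rs1    = (inst >> 15) & 0x1Fu;
    f.rs2    = (inst >> 20) & 0x1Fu;
    f.funct7 = (inst >> 25) & 0x7Fu;
    return f;
}
```

For example, the word 002081B3 (hex) encodes add x3, x1, x2: opcode 0110011, rd 3, rs1 1, rs2 2, with funct3 and funct7 both zero.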

Immediate values

Immediate values are sign-extended to 32 bits. When they are not explicitly encoded in the imm field, the least significant bits are 0.

In the specification, formats B and J are described as variants of formats S and U. In formats B and J, immediate values represent offsets in relative branch instructions. They are encoded so that they share most of their bits with other formats while preserving their most significant bit at location 31 of the instruction word.

The following table shows the mapping between the bits of the instruction word and the bits of the immediate values:

Format imm[31:25] imm[24:21] imm[20] imm[19:15] imm[14:12] imm[11] imm[10:5] imm[4:1] imm[0]
I inst[31] inst[31] inst[31] inst[31] inst[31] inst[31] inst[30:25] inst[24:21] inst[20]
S inst[31] inst[31] inst[31] inst[31] inst[31] inst[31] inst[30:25] inst[11:8] inst[7]
B inst[31] inst[31] inst[31] inst[31] inst[31] inst[7] inst[30:25] inst[11:8] 0
U inst[31:25] inst[24:21] inst[20] inst[19:15] inst[14:12] 0 0 0 0
J inst[31] inst[31] inst[31] inst[19:15] inst[14:12] inst[20] inst[30:25] inst[24:21] 0

Base opcodes

In Virgule, we have retained the following base opcodes from the RISC-V specification. Each opcode corresponds to a specific instruction format:

Name opcode Format
LOAD 0000011 I
OP-IMM 0010011 I
AUIPC 0010111 U
STORE 0100011 S
OP 0110011 R
LUI 0110111 U
BRANCH 1100011 B
JALR 1100111 I
JAL 1101111 J
SYSTEM 1110011 I
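
A decoder can use this table to select the instruction format before extracting the immediate. The following C sketch encodes the table as a switch statement; the enumerator names and the format_of function are hypothetical, and '?' is an arbitrary marker for opcodes that Virgule does not retain.

```c
#include <stdint.h>

enum {
    OPCODE_LOAD   = 0x03,  /* 0000011 */
    OPCODE_OP_IMM = 0x13,  /* 0010011 */
    OPCODE_AUIPC  = 0x17,  /* 0010111 */
    OPCODE_STORE  = 0x23,  /* 0100011 */
    OPCODE_OP     = 0x33,  /* 0110011 */
    OPCODE_LUI    = 0x37,  /* 0110111 */
    OPCODE_BRANCH = 0x63,  /* 1100011 */
    OPCODE_JALR   = 0x67,  /* 1100111 */
    OPCODE_JAL    = 0x6F,  /* 1101111 */
    OPCODE_SYSTEM = 0x73   /* 1110011 */
};

/* Return the format letter for a 7-bit base opcode,
 * or '?' when the opcode is not in the table. */
char format_of(uint32_t opcode) {
    switch (opcode) {
    case OPCODE_LOAD:
    case OPCODE_OP_IMM:
    case OPCODE_JALR:
    case OPCODE_SYSTEM: return 'I';
    case OPCODE_AUIPC:
    case OPCODE_LUI:    return 'U';
    case OPCODE_STORE:  return 'S';
    case OPCODE_OP:     return 'R';
    case OPCODE_BRANCH: return 'B';
    case OPCODE_JAL:    return 'J';
    default:            return '?';
    }
}
```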

Field values for each instruction

When decoding an instruction word, Virgule uses the following fields to identify the actual instruction. In this table, the opcode column refers to the names from the base opcode table above.

Instruction opcode funct3 funct7 rs2
LUI LUI
AUIPC AUIPC
JAL JAL
JALR JALR 000
BEQ BRANCH 000
BNE BRANCH 001
BLT BRANCH 100
BGE BRANCH 101
BLTU BRANCH 110
BGEU BRANCH 111
LB LOAD 000
LH LOAD 001
LW LOAD 010
LBU LOAD 100
LHU LOAD 101
SB STORE 000
SH STORE 001
SW STORE 010
ADDI OP-IMM 000
SLLI OP-IMM 001 0000000
SLTI OP-IMM 010
SLTIU OP-IMM 011
XORI OP-IMM 100
SRLI OP-IMM 101 0000000
SRAI OP-IMM 101 0100000
ORI OP-IMM 110
ANDI OP-IMM 111
ADD OP 000 0000000
SUB OP 000 0100000
SLL OP 001 0000000
SLT OP 010 0000000
SLTU OP 011 0000000
XOR OP 100 0000000
SRL OP 101 0000000
SRA OP 101 0100000
OR OP 110 0000000
AND OP 111 0000000
MRET SYSTEM 000 0011000 00010
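
To show how these field values can drive a decoder, here is a C sketch that identifies the register-register instructions of the OP group and recognizes MRET. The alu_op type and the function names are hypothetical. The MRET test compares only the fields listed in the table (opcode, funct3, funct7, and rs2).

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    ALU_ADD, ALU_SUB, ALU_SLL, ALU_SLT, ALU_SLTU,
    ALU_XOR, ALU_SRL, ALU_SRA, ALU_OR, ALU_AND,
    ALU_INVALID
} alu_op;

/* Decode an instruction of the OP group from funct3 and funct7. */
alu_op decode_op(uint32_t inst) {
    /* Indexed by funct3 when funct7 = 0000000. */
    static const alu_op base[8] = {
        ALU_ADD, ALU_SLL, ALU_SLT, ALU_SLTU,
        ALU_XOR, ALU_SRL, ALU_OR,  ALU_AND
    };
    uint32_t funct3 = (inst >> 12) & 0x7u;
    uint32_t funct7 = inst >> 25;

    if ((inst & 0x7Fu) != 0x33u) return ALU_INVALID;  /* not OP */
    if (funct7 == 0x00u) return base[funct3];
    if (funct7 == 0x20u && funct3 == 0u) return ALU_SUB;
    if (funct7 == 0x20u && funct3 == 5u) return ALU_SRA;
    return ALU_INVALID;
}

/* MRET: SYSTEM opcode, funct3 = 000, funct7 = 0011000, rs2 = 00010.
 * The mask keeps bits 31:20, 14:12, and 6:0. */
bool is_mret(uint32_t inst) {
    return (inst & 0xFFF0707Fu) == 0x30200073u;
}
```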

Memory and peripheral devices

Memory layout

The addressing space is organized as follows.

Address (hex) Device
00000000 to 00000BFF RAM (3072 bytes)
00000C00 to 00000FFF Bitmap RAM (1024 bytes)
B0000000, B0000001 Text input
C0000000 Text output
D0000000 General-purpose input/output
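
An address decoder for this map could look like the following C sketch. The device type and the device_at function are hypothetical names. The upper bound used for the GPIO block (D0000013) is derived from the GPIO register table later in this document, whose last 32-bit register sits at D0000010.

```c
#include <stdint.h>

typedef enum {
    DEV_NONE, DEV_RAM, DEV_BITMAP, DEV_TEXT_IN, DEV_TEXT_OUT, DEV_GPIO
} device;

/* Return the device that responds at a given byte address,
 * or DEV_NONE when the address is unmapped. */
device device_at(uint32_t addr) {
    if (addr <= 0x00000BFFu) return DEV_RAM;
    if (addr <= 0x00000FFFu) return DEV_BITMAP;
    if (addr == 0xB0000000u || addr == 0xB0000001u) return DEV_TEXT_IN;
    if (addr == 0xC0000000u) return DEV_TEXT_OUT;
    if (addr >= 0xD0000000u && addr <= 0xD0000013u) return DEV_GPIO;
    return DEV_NONE;
}
```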

Text input/output

The text input device is represented by a text field in the user interface of the simulator. It has two 8-bit registers:

Address (hex) Role Value
B0000000 Control/Status Bit 7: Interrupt enable; Bit 6: Character received
B0000001 Data The ASCII code of the last input character.

The control/status register works as follows:

The text output device is represented by a text area in the user interface of the simulator. It has only one write-only register:

Address (hex) Role Value
C0000000 Data The ASCII code of the character to display.
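
As an example of how a program might poll the text input device, here is a C sketch. The text_in_regs type, the macro names, and the text_in_poll function are hypothetical. The way the "character received" flag is acknowledged is an assumption made for this sketch: it writes the status register back with bit 6 cleared, which may differ from the actual behavior of the device.

```c
#include <stdbool.h>
#include <stdint.h>

#define TEXT_IN_IRQ_ENABLE 0x80u  /* bit 7 of the control/status register */
#define TEXT_IN_RECEIVED   0x40u  /* bit 6 of the control/status register */

typedef struct {
    volatile uint8_t status;  /* B0000000 */
    volatile uint8_t data;    /* B0000001 */
} text_in_regs;

/* Non-blocking read: store the last input character in *out and
 * return true if a character has been received, false otherwise. */
bool text_in_poll(text_in_regs *dev, char *out) {
    if (!(dev->status & TEXT_IN_RECEIVED)) {
        return false;
    }
    *out = (char)dev->data;
    /* ASSUMPTION: software acknowledges the flag by clearing bit 6. */
    dev->status &= (uint8_t)~TEXT_IN_RECEIVED;
    return true;
}
```

On the target, the device would be accessed as (text_in_regs *)0xB0000000. Passing the pointer as a parameter also makes it possible to exercise the function on a host machine with an ordinary structure.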

General-purpose input/output

The GPIO (general-purpose input/output) peripheral can connect up to 32 simple user I/O devices:

The inputs are arranged in an 8×4 grid at the bottom of the General-purpose I/O section of the simulator. Right-click on a cell to change its type. Left-click to change the state of a button or switch.

The GPIO peripheral has the following 32-bit registers:

Address (hex) Role Value
D0000000 Direction (dir) The configuration of each pin (0 for an output, 1 for an input).
D0000004 Interrupt enable (ien) Enable interrupts on input events.
D0000008 Rising-edge events (rev) Each bit is set to 1 if the corresponding input pin has changed from 0 to 1.
D000000C Falling-edge events (fev) Each bit is set to 1 if the corresponding input pin has changed from 1 to 0.
D0000010 Value (val) The current value of each input or output.

On reset, all pins are configured as inputs and interrupts are disabled.

The rising-edge and falling-edge events registers must be cleared in software using store instructions when the events have been processed.
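
The register block above maps naturally onto a C structure. The following sketch uses hypothetical names (gpio_regs, gpio_configure, gpio_take_rising) and makes two assumptions that the text does not state explicitly: that the interrupt enable register holds one bit per pin, and that storing 0 to an events register clears the recorded events.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    volatile uint32_t dir;  /* D0000000: 0 = output, 1 = input */
    volatile uint32_t ien;  /* D0000004: interrupt enable      */
    volatile uint32_t rev;  /* D0000008: rising-edge events    */
    volatile uint32_t fev;  /* D000000C: falling-edge events   */
    volatile uint32_t val;  /* D0000010: current pin values    */
} gpio_regs;

/* Configure the pins in input_mask as inputs and the others as
 * outputs. ASSUMPTION: ien is a per-pin mask; interrupts are only
 * enabled on pins that are inputs. */
void gpio_configure(gpio_regs *gpio, uint32_t input_mask, uint32_t irq_mask) {
    gpio->dir = input_mask;
    gpio->ien = irq_mask & input_mask;
}

/* Read the pending rising-edge events, then clear them with a store.
 * ASSUMPTION: storing 0 clears all recorded events. */
uint32_t gpio_take_rising(gpio_regs *gpio) {
    uint32_t events = gpio->rev;
    gpio->rev = 0;
    return events;
}
```

On the target, the peripheral would be accessed as (gpio_regs *)0xD0000000.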

Bitmap output

The simulator provides a graphic display area with 32 rows of 32 pixels. Each pixel is mapped to a RAM byte and its color is encoded like this:

Bit 7 6 5 4 3 2 1 0
Field Red Green Blue

The bitmap RAM follows a raster-scan ordering. For each address, this table shows the (x, y) coordinates of the corresponding pixel:

Address +0 +1 +2 ... +1F
00000C00 (0, 0) (1, 0) (2, 0) ... (31, 0)
00000C20 (0, 1) (1, 1) (2, 1) ... (31, 1)
00000C40 (0, 2) (1, 2) (2, 2) ... (31, 2)
...
00000FE0 (0, 31) (1, 31) (2, 31) ... (31, 31)
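
The raster-scan ordering amounts to a simple address computation, sketched below in C with hypothetical names. The pack_color helper additionally assumes a 3-3-2 split of the color byte (red in bits 7:5, green in bits 4:2, blue in bits 1:0); the text only states that the fields appear in the order red, green, blue, so check this split against the simulator before relying on it.

```c
#include <stdint.h>

#define BITMAP_BASE  0x00000C00u
#define BITMAP_WIDTH 32u

/* Address of the byte that holds pixel (x, y), with 0 <= x, y <= 31. */
uint32_t pixel_address(uint32_t x, uint32_t y) {
    return BITMAP_BASE + y * BITMAP_WIDTH + x;
}

/* ASSUMPTION: 3 bits of red, 3 bits of green, 2 bits of blue. */
uint8_t pack_color(uint8_t red, uint8_t green, uint8_t blue) {
    return (uint8_t)(((red & 7u) << 5) | ((green & 7u) << 2) | (blue & 3u));
}
```

For example, pixel (2, 2) is stored at address 00000C42 and pixel (31, 31) at address 00000FFF, the last byte of the bitmap RAM.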

Creating programs for emulsiV with the GNU toolchain

The simulator lets you create and edit programs by entering instructions in the assembly column of the memory view. Another option is to type your program in a text editor and generate an executable for emulsiV using the GNU toolchain.

The following instructions have been tested in Ubuntu 20.04.

Installation

If you are using Ubuntu, you can install the pre-built RISC-V bare-metal toolchain using this command:

sudo apt install gcc-riscv64-unknown-elf

Despite what the name suggests, this installs a toolchain that supports both the 32-bit and 64-bit architectures.

Using the assembler

Here is a typical startup module (startup.s) that you can use for your programs. Another assembly or C source file should define the main subprogram, and optionally override the irq_handler subprogram.

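    .section vectors, "x"

    .global __reset
__reset:                        # Address 0: reset vector
    j start

__irq:                          # Address 4: interrupt vector
    j irq_handler

    .text
    .align 4

    .weak irq_handler
irq_handler:                    # Default handler; override it in your own code
    mret

start:
    la gp, __global_pointer     # Symbols defined in the linker script
    la sp, __stack_pointer
    la t0, __bss_start          # Clear the .bss section word by word
    la t1, __bss_end
    bgeu t0, t1, memclr_done
memclr:
    sw zero, (t0)
    addi t0, t0, 4
    bltu t0, t1, memclr

memclr_done:
    call main
    j .                         # Loop forever if main returns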

This command assembles the source file startup.s into an object file startup.o:

riscv64-unknown-elf-gcc -march=rv32i -mabi=ilp32 -c -o startup.o startup.s

Using the C compiler

Here is an implementation of Hello World in C for emulsiV. The output register is declared volatile so that the compiler keeps every store to it:

#define TEXT_OUT (*(volatile char*)0xC0000000)

void print(const char *str) {
    while (*str) {
        TEXT_OUT = *str++;
    }
}

void main(void) {
    print("Virgule says\n<< Hello! >>\n");
}

This command compiles the source file hello.c into an object file hello.o:

riscv64-unknown-elf-gcc -march=rv32i -mabi=ilp32 -ffreestanding -c -o hello.o hello.c

If you want to write an interrupt handler in C, you can override the irq_handler subprogram, adding the interrupt attribute like this:

__attribute__((interrupt("machine")))
void irq_handler(void) {
    // Insert your code here.
}

Using the linker

This command links startup.o and hello.o into a binary executable file hello.elf:

riscv64-unknown-elf-gcc -march=rv32i -mabi=ilp32 -nostdlib -T emulsiv.ld -o hello.elf startup.o hello.o

The memory map of the simulator is configured in the linker script emulsiv.ld below:

ENTRY(__reset)

MEM_SIZE    = 4K;
STACK_SIZE  = 512;
BITMAP_SIZE = 1K;

SECTIONS {
    . = 0x0;

    .text : {
        *(vectors)
        *(.text)
        __text_end = .;
    }

    .data   : { *(.data) }
    .rodata : { *(.rodata) }

    __global_pointer = ALIGN(4);

    .bss ALIGN(4) : {
        __bss_start = .;
        *(.bss COMMON)
        __bss_end = ALIGN(4);
    }

    . = MEM_SIZE - STACK_SIZE - BITMAP_SIZE;

    .stack ALIGN(4) : {
        __stack_start = .;
        . += STACK_SIZE;
        __stack_pointer = .;
    }

    .bitmap ALIGN(4) : {
        __bitmap_start = .;
        *(bitmap)
    }

    __bitmap_end = __bitmap_start + BITMAP_SIZE;
}
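
The arithmetic performed by this script can be checked against the memory layout of the simulator. The C sketch below recomputes the region boundaries at compile time; the enumerator names mirror the linker symbols but are otherwise hypothetical, as is the program_space function.

```c
#include <stdint.h>

enum {
    MEM_SIZE      = 4 * 1024,
    STACK_SIZE    = 512,
    BITMAP_SIZE   = 1024,
    STACK_START   = MEM_SIZE - STACK_SIZE - BITMAP_SIZE,
    STACK_POINTER = STACK_START + STACK_SIZE,
    BITMAP_START  = STACK_POINTER,
    BITMAP_END    = BITMAP_START + BITMAP_SIZE
};

/* The bitmap section must land on the bitmap RAM of the simulator. */
_Static_assert(BITMAP_START == 0x0C00, "bitmap must start at 00000C00");
_Static_assert(BITMAP_END == 0x1000, "bitmap must end at 00000FFF");

/* Bytes available for code, data, and bss below the stack region. */
uint32_t program_space(void) {
    return STACK_START;
}
```

With the values above, the stack occupies addresses 00000A00 to 00000BFF, which leaves 2560 bytes for the program and its data.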

Converting an executable to the hex format

This command converts an ELF binary file hello.elf into a text file hello.hex in the Intel Hex format:

riscv64-unknown-elf-objcopy -O ihex hello.elf hello.hex

In the simulator, use the button "Open an hex file from your computer" and choose hello.hex or any other hex file that you want to load.
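
If a hex file fails to load, it can help to check its records by hand. In the Intel Hex format, each record is a colon followed by pairs of hexadecimal digits, and the byte values of a valid record add up to 0 modulo 256. The following C sketch verifies only this checksum rule; the function names are hypothetical, and the line is expected without a trailing newline.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Value of one hexadecimal digit, or -1 if c is not a hex digit. */
static int hex_digit(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* True when line is ':' followed by at least five bytes written as
 * hex digit pairs whose sum is 0 modulo 256. The byte count field is
 * not compared with the actual number of data bytes. */
bool ihex_checksum_ok(const char *line) {
    if (line[0] != ':') return false;
    size_t len = strlen(line + 1);  /* number of characters after ':' */
    if (len < 10 || len % 2 != 0) return false;
    unsigned sum = 0;
    for (size_t i = 1; i < len + 1; i += 2) {
        int hi = hex_digit(line[i]);
        int lo = hex_digit(line[i + 1]);
        if (hi < 0 || lo < 0) return false;
        sum += (unsigned)(hi * 16 + lo);
    }
    return (sum & 0xFFu) == 0;
}
```

For example, the end-of-file record :00000001FF passes this check because 00 + 00 + 00 + 01 + FF equals 100 (hex).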

License

emulsiV is free software and is distributed under the terms of the Mozilla Public License 2.0.

This document was created by Guillaume Savaton, ESEO. It is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.