An unofficial instruction reference for the RISC-V instruction set, oriented towards assembly programming.
One of the common nuisances with the RISC-V instruction set manual is the structure of the documents: they are written in a very prose-y style, which is great for reading them as learning documents, but a pain in the butt for actually looking up the details of various things as you are reading or writing assembly code. So I decided to write a simple reference of my own, since it would be useful and maybe not too large a project for one person to take on.
I hate formatting stuff, so all the instruction definitions are in a structured data file (TOML, at the moment) and there is a small Rust program that reads it and outputs whatever text format you want, as long as you want Markdown. It is very crude and not at all pretty, but for now it works. The easy way to get HTML or PDF out of it is to feed the markdown into Pandoc.
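As a rough illustration of the data-file-to-Markdown idea, here is a minimal sketch. The struct fields and the output layout are hypothetical and are not the generator's real schema, and the TOML parsing step is left out so the example needs nothing beyond the standard library.

```rust
/// Hypothetical shape of one instruction entry.
/// The field names are assumptions, not the real data file's schema.
struct Instruction {
    name: String,
    operands: String,
    extension: String,
    description: String,
}

/// Render one entry as a small Markdown section:
/// a heading, the assembly syntax, the extension, then the description.
fn to_markdown(inst: &Instruction) -> String {
    format!(
        "## {}\n\n`{} {}`\n\nExtension: {}\n\n{}\n",
        inst.name.to_uppercase(),
        inst.name,
        inst.operands,
        inst.extension,
        inst.description
    )
}
```

Concatenating the sections for every entry would give one Markdown document that can then be handed to Pandoc.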
Contributions are very welcome. This is a big project for one person, but also a project that is very easy to incrementally chip in a little here and there. To contribute, send patches to the mailing list.
License is Creative Commons Attribution 4.0 International, same as the RISC-V ISA manual.
The goal is to cover the entire RV64 user-level instruction set with all ratified extensions, pseudo-instructions, etc. The reference being used is the latest formally published one, dated 20191213, though there will probably be a new one along soon. Another file for the privileged instruction set would be nice but I'm not gonna go there myself yet.
RV32 is currently out of scope, but might be nice in the future. Apparently some instruction encodings are different between RV32 and RV64, such as SLLI, SRLI, and SRAI, so I'm not going to bother worrying about both for now.
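To make the SLLI difference concrete, here is a small sketch of an encoder. It assumes the standard I-type layout with the shift amount in the low bits of the immediate field: 5 bits of shamt on RV32, 6 bits on RV64. The function name and signature are made up for illustration.

```rust
/// Encode `slli rd, rs1, shamt` for the given XLEN (32 or 64).
/// Returns None if XLEN, a register number, or the shift amount is out of range.
fn encode_slli(xlen: u32, rd: u32, rs1: u32, shamt: u32) -> Option<u32> {
    if (xlen != 32 && xlen != 64) || rd > 31 || rs1 > 31 || shamt >= xlen {
        return None;
    }
    const OPCODE_OP_IMM: u32 = 0b0010011;
    const FUNCT3_SLLI: u32 = 0b001;
    // shamt starts at bit 20. RV32 allows shamt[4:0] only, so bit 25 must be 0;
    // RV64 uses bit 25 as shamt[5]. The range check above enforces that.
    Some((shamt << 20) | (rs1 << 15) | (FUNCT3_SLLI << 12) | (rd << 7) | OPCODE_OP_IMM)
}
```

A shift by 32 is a valid RV64 instruction, but there is no RV32 encoding for it, which is the kind of detail that makes covering both in one table fiddly.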
Possible states:
Extension and state:
Extensions to do later:
A glance at the 20200427 draft doesn't show too many differences, mainly the half-precision floats and floats-in-integer-registers extensions. I'm really waiting for a version to get released with an up-to-date V extension.
Some instructions have multiple encodings, such as compressed/noncompressed versions. It's a bit annoying to figure out how the instruction formats intertwine with each other. I'm sure it all makes perfect sense to some hardware people. But we DO need to document what kind of operands instructions take.
Instruction opcodes in RV are kinda wonky 'cause they're commonly broken up across 1-3 fields depending on the instruction format.
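For a sense of what "broken up across 1-3 fields" means in practice, here is a sketch that pulls the three possible opcode-carrying fields out of a 32-bit instruction word. The bit positions are the standard ones for the base formats; the struct and function names are just for illustration. Which of the fields actually mean anything depends on the instruction format.

```rust
/// The fields that can carry opcode bits in a 32-bit instruction word.
#[derive(Debug, PartialEq)]
struct OpcodeParts {
    opcode: u32, // bits 6:0, present in every format
    funct3: u32, // bits 14:12, not an opcode field in every format
    funct7: u32, // bits 31:25, only an opcode field in register-register formats
}

/// Extract all three fields without checking the format;
/// the caller decides which ones are meaningful.
fn opcode_parts(word: u32) -> OpcodeParts {
    OpcodeParts {
        opcode: word & 0x7f,
        funct3: (word >> 12) & 0x7,
        funct7: (word >> 25) & 0x7f,
    }
}
```

For example, `sub x3, x1, x2` assembles to 0x402081B3, and the only thing distinguishing it from `add x3, x1, x2` (0x002081B3) is the funct7 field.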
Pseudo-instructions are sometimes described in the description of the base instruction, and are sometimes not. This is, of course, the exact sort of annoyance this document is intended to clean up. Currently I THINK all pseudo-instructions are split out and listed as separate instructions, but I may have missed some. In particular there are pseudo-instructions that only change their operands, such as jal offset which translates to jal x1, offset and stuff like that.
Not sure how to handle those yet; I'd expected we could entirely describe an instruction's operands based on its instruction encoding format, buuuuuuut maybe not. Maybe we need to list operands explicitly, so that this is a more useful reference for assembly programmers.
Other guilty instructions: fence, jalr, ...
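One way to think about these operand-only pseudo-instructions is as a lookup from (mnemonic, operand count) to a fully spelled-out form. The sketch below covers only the three cases named above, using the expansions from the spec's pseudo-instruction table as I understand them. The function is hypothetical and is not how the reference itself handles them.

```rust
/// Expand a few pseudo-instructions that only fill in omitted operands.
/// Input is already split into a mnemonic and its operands;
/// anything that is not one of the known short forms returns None.
fn expand_pseudo(mnemonic: &str, operands: &[&str]) -> Option<String> {
    match (mnemonic, operands) {
        // `jal offset` implies the return address register x1.
        ("jal", [offset]) => Some(format!("jal x1, {}", offset)),
        // `jalr rs` implies x1 as the link register and a zero offset.
        ("jalr", [rs]) => Some(format!("jalr x1, {}, 0", rs)),
        // A bare `fence` orders all I/O and memory accesses.
        ("fence", []) => Some("fence iorw, iorw".to_string()),
        _ => None,
    }
}
```

The catch this illustrates is that the same mnemonic takes different operand lists, so the encoding format alone does not tell an assembly programmer what they are allowed to write.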
Instruction format types
op rd, rs1, rs2. 3 opcode parts, opcode, funct3 and funct7 (for some gorram reason) (Oh I think that's the number of bits in the section)
op rd, rs1, imm[12]. 2 opcode parts, opcode and funct3. Except on ECALL when it uses the immediate section for an opcode, called funct12
op rs1, rs2+imm[12]. 2 opcode parts, opcode and funct3.
opcode and funct3.
op rd, imm[20]. 1 opcode part, opcode.
opcode.
pseudo -- Pseudo-instruction
For now, the "opcode" section in the TOML just lists the text definitions of the opcodes the standard provides, starting from least significant bits first. Exact values can be found (in binary) in chapter 24 of the spec, "RV32/64G Instruction Set Listings". I do kinda want to have actual hex values for the opcodes somewhere, because it is (sometimes) useful to be able to eyeball instructions and at least say "that's an arithmetic op" or "this whole chunk is floating point", but the instruction format makes it tricky.
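As a sketch of the "eyeball it" idea, the low 7 bits of a 32-bit instruction word are enough for a coarse grouping. The hex values below are the standard major opcodes from the spec's opcode map; the group names and the function itself are made up for illustration, and anything not listed falls into "other".

```rust
/// Coarsely classify an instruction word by its major opcode (bits 6:0).
fn opcode_group(word: u32) -> &'static str {
    // Standard 32-bit instructions have both low bits set;
    // anything else is a 16-bit compressed instruction.
    if word & 0b11 != 0b11 {
        return "compressed (16-bit)";
    }
    match word & 0x7f {
        // OP-IMM, OP-IMM-32, OP, OP-32
        0x13 | 0x1b | 0x33 | 0x3b => "integer arithmetic",
        // LOAD, STORE
        0x03 | 0x23 => "integer load/store",
        // LOAD-FP, STORE-FP, MADD, MSUB, NMSUB, NMADD, OP-FP
        0x07 | 0x27 | 0x43 | 0x47 | 0x4b | 0x4f | 0x53 => "floating point",
        // BRANCH, JALR, JAL
        0x63 | 0x67 | 0x6f => "branch/jump",
        _ => "other",
    }
}
```

This only narrows things down to a family. Telling `add` from `sub` or `mul` still needs funct3 and funct7, which is where the per-format trickiness comes back in.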
MIPS standard doc:
RISC-V standard doc: