There are several commonly used RISC-V instruction pairs
with 32-bit immediates.
Below is an example of loading a 32-bit immediate
into a register using lui/addi:

    lui  rd, imm[31:12]
    addi rd, rd, imm[11:0]

Here lui loads a (sign-extended) 20-bit immediate into register rd
and fills the lowest 12 bits with zeros,
and addi adds a sign-extended 12-bit immediate to register rd.
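The semantics just described can be modeled in a few lines of Python. This is only a sketch: the helper names (sign_extend, lui, addi) and the values 0x12345 and 0x678 are illustrative assumptions, not taken from the text, and the model assumes that lui places the immediate in bits 31:12 and sign-extends the 32-bit result to the register width.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def lui(imm20: int, xlen: int) -> int:
    """Model of lui: imm20 goes to bits 31:12, the low 12 bits are zero,
    and the 32-bit result is sign-extended to xlen bits."""
    return sign_extend((imm20 & 0xFFFFF) << 12, 32) % (1 << xlen)


def addi(reg: int, imm12: int, xlen: int) -> int:
    """Model of addi: add the sign-extended 12-bit immediate, wrapping at xlen bits."""
    return (reg + sign_extend(imm12, 12)) % (1 << xlen)


# Illustrative values (not from the text): build 0x12345678 from 0x12345 and 0x678.
for xlen in (32, 64):
    rd = lui(0x12345, xlen)
    rd = addi(rd, 0x678, xlen)
    print(f"XLEN={xlen}: rd = {rd:#x}")
```

With these easy values (bit 31 and bit 11 of the immediate are both clear), neither sign extension has any visible effect, so both register widths produce the same result.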
Question: does this work for any 32-bit immediate? It may be trickier than you think.
Clearly, on 32-bit systems,
this instruction pair can be used to load any 32-bit immediate
in the range [-2^31, 2^31 - 1].
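For example, to load 0x7fffff00, one may use
lui with a 20-bit 0x80000 and
addi with a 12-bit 0xf00:
lui loads 0x80000000, and addi adds 0xffffff00 (sign-extended from 0xf00), which wraps around to 0x7fffff00.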
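How about 64-bit systems?
Let's look at loading the 32-bit immediate 0x7fffff00.
Note that the same 20-bit and 12-bit values used on 32-bit systems won't work,
as lui loads 0xffffffff_80000000 (sign-extended from 0x80000000)
and addi adds 0xffffffff_ffffff00 (sign-extended from 0xf00),
which yields 0xffffffff_7fffff00 rather than 0x7fffff00.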
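To see the difference concretely, here is a small Python sketch that evaluates the same pair on both register widths. It assumes the 20-bit value 0x80000 and the 12-bit value 0xf00 (the only values that sign-extend to the constants quoted above); the function names sext and lui_addi are made up for this illustration.

```python
def sext(value: int, bits: int) -> int:
    """Two's-complement interpretation of the low `bits` bits of `value`."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def lui_addi(imm20: int, imm12: int, xlen: int) -> int:
    """Register contents (as an unsigned xlen-bit value) after
    `lui rd, imm20` followed by `addi rd, rd, imm12`."""
    return ((sext(imm20, 20) << 12) + sext(imm12, 12)) % (1 << xlen)


TARGET = 0x7FFFFF00
IMM20, IMM12 = 0x80000, 0xF00  # assumed split of TARGET

rv32 = lui_addi(IMM20, IMM12, 32)
rv64 = lui_addi(IMM20, IMM12, 64)

print(f"RV32: {rv32:#010x}  matches target: {rv32 == TARGET}")
print(f"RV64: {rv64:#018x}  matches target: {rv64 == TARGET}")
```

On 32 bits the sum wraps around modulo 2^32 and lands on the target; on 64 bits there is no wraparound at bit 32, so the upper half of the register stays all ones.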
Do there exist any 20-bit and 12-bit values
that make lui/addi work on 64-bit systems?
Similar questions can be asked about other instruction pairs,
such as lui/ld for loading a value at a 32-bit address,
and auipc/jalr for jumping to a 32-bit pc-relative offset.
The short answer is no. You may be interested in the discussion in the RISC-V ISA Dev group started by Luke Nelson, which prompted the RISC-V ISA specification to clarify the range in the “RV64I Base Integer Instruction Set” chapter:
Note that the set of address offsets that can be formed by pairing LUI with LD, AUIPC with JALR, etc. in RV64I is [−2^31 − 2^11, 2^31 − 2^11 − 1].
In other words,
the 32-bit range reachable by such instruction pairs
in 64-bit RISC-V is shifted by -2^11
from [-2^31, 2^31 - 1].
For example, 0x7fffff00 doesn't fall in the range, as it exceeds the upper bound 2^31 - 2^11 - 1 = 0x7ffff7ff.
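The shifted range can be reproduced with plain integer arithmetic. The sketch below assumes a 64-bit register, so the sum of the two sign-extended parts never wraps around, and derives the bounds from the extremes of each part. The helper split is hypothetical: it returns one pair of 20-bit and 12-bit values for a reachable target, or None otherwise.

```python
# Extremes of the sign-extended upper part (20-bit immediate shifted left by 12).
HI_MIN = (-(1 << 19)) << 12        # -2^31
HI_MAX = ((1 << 19) - 1) << 12     # 2^31 - 2^12
# Extremes of the sign-extended 12-bit immediate.
LO_MIN = -(1 << 11)                # -2^11
LO_MAX = (1 << 11) - 1             # 2^11 - 1

lower = HI_MIN + LO_MIN            # -2^31 - 2^11
upper = HI_MAX + LO_MAX            # 2^31 - 2^11 - 1


def split(target: int):
    """Return (imm20, imm12) whose pair yields `target` on RV64, or None."""
    if not lower <= target <= upper:
        return None
    lo = ((target + 0x800) & 0xFFF) - 0x800  # signed low part in [-2048, 2047]
    hi = (target - lo) >> 12                 # signed 20-bit upper part
    return hi & 0xFFFFF, lo & 0xFFF


print(f"range: [{lower}, {upper}]")
print("0x7fffff00 ->", split(0x7FFFFF00))
print("0x7ffff7ff ->", split(0x7FFFF7FF))
```

The largest positive upper part is 2^31 - 2^12, and the 12-bit part can add back at most 2^11 - 1, which is why the top 2^11 values of the usual signed 32-bit range are lost while 2^11 extra values appear below -2^31.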
Intuitively, the shifting is caused by the choice of the sign extension of immediates in the RISC-V ISA. See examples of issues reported in coreboot and the BPF JIT for RV64 in the Linux kernel, as well as our upcoming Jitterbug paper.
To check the correctness of the range, I wrote a simple Rosette program, as follows:

    #lang rosette

    ; integer register width in bits
    (define XLEN 64)

    ; symbolic 20-bit and 12-bit values
    (define-symbolic imm20 (bitvector 20))
    (define-symbolic imm12 (bitvector 12))

    ; mimic the result of an instruction pair
    (define v
      (bvadd (sign-extend (concat imm20 (bv 0 12)) (bitvector XLEN))
             (sign-extend imm12 (bitvector XLEN))))

    ; lower and upper bounds
    (define-symbolic lower upper (bitvector XLEN))

    ; find the lower and upper bounds via optimization
    (optimize #:maximize (list lower)
              #:minimize (list upper)
              #:guarantee (assert (forall (list imm12 imm20)
                                          (&& (bvsge v lower)
                                              (bvsle v upper)))))

Rosette invokes the Z3 SMT solver to find the lower and upper bounds of the reachable range (you may also use SMT or the Z3 API directly). The output of the above program is:

    (model
     [lower (bv #xffffffff7ffff800 64)]
     [upper (bv #x000000007ffff7ff 64)])

This is consistent with the clarification in the RISC-V ISA specification.
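One way to double-check that claim is to reinterpret the two 64-bit bitvectors from the solver output as signed integers and compare them with the bounds quoted from the specification. A minimal sketch in Python, where to_signed is a helper written for this illustration:

```python
def to_signed(value: int, bits: int = 64) -> int:
    """Reinterpret an unsigned `bits`-wide value as two's complement."""
    return value - (1 << bits) if value >> (bits - 1) else value


# Bitvector constants copied from the solver output above.
lower = to_signed(0xFFFFFFFF7FFFF800)
upper = to_signed(0x000000007FFFF7FF)

print(lower, upper)  # -2147485696 2147481599

# Bounds as stated in the specification.
assert lower == -2**31 - 2**11
assert upper == 2**31 - 2**11 - 1
```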
If you were asked to make one change in the 64-bit RISC-V ISA
(e.g., the semantics of an instruction)
to make the range remain [-2^31, 2^31 - 1],
what would you do?
The above Rosette program may be helpful.
You may also want to check the
Acknowledgments: James Bornholt, Luke Nelson, and Emina Torlak provided feedback on a draft of this post.