Why I'm Documenting My RISC-V CPU Build in the Open
Building a RISC-V processor from scratch is slow and full of dead ends. Writing it up publicly turned out to matter more than the silicon ever will.
The RISC-V CPU core on this site is, on paper, a fairly ordinary project: implement the RV32I base integer instruction set in Verilog, three different ways — single-cycle, multi-cycle, and a fully pipelined five-stage design. None of that is novel. RV32I cores are a standard exercise, and dozens of public implementations already exist.
The point was never to be first. It was to actually understand what's happening between an instruction fetch and a write-back, instead of treating a CPU as a black box that "just runs code."
Three implementations, on purpose
Building the same instruction set three separate ways sounds redundant until you've done it. The single-cycle design forces you to confront the worst-case critical path through every instruction type in one clock period — and makes clear why nobody ships single-cycle CPUs. The multi-cycle version trades that for a control unit that has to track where in an instruction's execution you currently are, which is its own kind of complexity. The pipelined version is where the real lessons live: forwarding and stalling exist because a five-stage pipeline doesn't get to pretend instructions execute one at a time anymore, and a load whose result the very next instruction needs will expose every shortcut taken in the design.
Skipping straight to the pipelined version would have meant debugging hazards I didn't fully understand were hazards yet.
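To make the load case concrete, here is a minimal sketch of the two decisions a five-stage pipeline has to make: when to stall and where to forward from. It is a behavioral model in Python, not the Verilog from the actual core, and every name in it (Instr, needs_load_use_stall, forward_source) is hypothetical and chosen for illustration. It assumes the textbook arrangement where ALU results can be forwarded from the EX/MEM and MEM/WB pipeline registers and a load's data is only available after the MEM stage.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instr:
    """Hypothetical decoded instruction: only the fields hazard logic needs."""
    name: str
    rd: Optional[int] = None    # destination register, None if nothing is written
    rs1: Optional[int] = None   # first source register, None if unused
    rs2: Optional[int] = None   # second source register, None if unused
    is_load: bool = False


def needs_load_use_stall(in_ex: Instr, in_id: Instr) -> bool:
    """True when the instruction in ID reads the register a load in EX will write.

    The load's data does not exist until the end of MEM, so forwarding alone
    cannot help: the dependent instruction has to wait one cycle.
    """
    if not in_ex.is_load or in_ex.rd in (None, 0):  # x0 is hardwired to zero
        return False
    return in_ex.rd in (in_id.rs1, in_id.rs2)


def forward_source(src: Optional[int],
                   in_mem: Optional[Instr],
                   in_wb: Optional[Instr]) -> str:
    """Choose where an EX-stage operand comes from.

    The newest producer wins: EX/MEM is checked before MEM/WB.
    """
    if src in (None, 0):
        return "regfile"
    if in_mem is not None and in_mem.rd == src and not in_mem.is_load:
        return "ex_mem"   # ALU result from the instruction one ahead
    if in_wb is not None and in_wb.rd == src:
        return "mem_wb"   # ALU result or load data from two ahead
    return "regfile"


# lw x5, 0(x1) followed immediately by add x6, x5, x2: must stall.
lw = Instr("lw", rd=5, rs1=1, is_load=True)
add_dep = Instr("add", rd=6, rs1=5, rs2=2)
stall_on_load = needs_load_use_stall(lw, add_dep)

# add x5, x1, x2 followed by add x6, x5, x2: no stall, forward from EX/MEM.
add_prod = Instr("add", rd=5, rs1=1, rs2=2)
stall_on_alu = needs_load_use_stall(add_prod, add_dep)
alu_forward = forward_source(add_dep.rs1, in_mem=add_prod, in_wb=None)

# After the one-cycle stall, the load has reached WB when the add is in EX.
load_forward = forward_source(add_dep.rs1, in_mem=None, in_wb=lw)
```

The asymmetry is the point: a back-to-back ALU dependency costs nothing once forwarding is in place, while a back-to-back load dependency still costs a bubble, because the value has not left data memory by the time the next instruction wants it.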
Why the write-up is part of the project
It would have been faster to keep notes in a private scratch file and call the project "done" once the testbenches passed. Writing it up publicly — what's implemented, what's still a known gap (cache integration and formal verification, currently), what each design tradeoff actually cost in practice — does something a passing testbench doesn't: it forces the explanation to hold up to someone who wasn't in my head while I built it.
That's a higher bar than "the simulation didn't crash." A correct-looking waveform is not the same as understanding why it's correct, and the gap between those two things only shows up when you try to explain it to someone else, including a future version of yourself looking at this six months from now with the context gone.
What's next
Cache integration and formal verification are the two honest gaps left in the project, and they're staying listed as gaps rather than getting quietly dropped from scope. If you're working through something similar, the Embedded Systems Engineer roadmap on this site covers the layer below this one, circuit fundamentals and firmware, which makes a core like this easier to reason about once you connect it to real memory and peripherals instead of a testbench.