Draft Proposal for Revision of Programmable Decoder Grant

This is a draft of a proposal to NLnet to revise the Programmable Decoder grant that we applied for, which they are currently considering.

Two points that were mentioned in a previous meeting:

  • Some people thought the wages looked rather low for a senior-level software engineer.
  • The estimated budgets are based on how long I thought it would take me to finish each task, which doesn't account for how much more time other people would need for the same task, since they don't have the same experience level.

After looking around some, I found wage stats for Belgium on Glassdoor: apparently €5k/mo is the median wage for senior software engineers with 15+ years of experience working in IT. That basically matches our existing target wage of ~$69k/yr, which, assuming 1.15 USD/EUR, comes out to €5k/mo. (edit: seems this is a bit low, see comment below)
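For anyone who wants to re-check the conversion with a different exchange rate, here is a minimal Python sketch. The constant and function names are made up for illustration; the numbers are the ones quoted above.

```python
# Inputs quoted in the paragraph above; the names are illustrative only.
TARGET_WAGE_USD_PER_YEAR = 69_000
USD_PER_EUR = 1.15  # assumed exchange rate


def monthly_wage_eur(usd_per_year: float, usd_per_eur: float) -> float:
    """Convert a yearly wage in USD to a monthly wage in EUR."""
    return usd_per_year / usd_per_eur / 12


wage = monthly_wage_eur(TARGET_WAGE_USD_PER_YEAR, USD_PER_EUR)
print(f"€{wage:.0f}/mo")  # prints "€5000/mo"
```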

I spent around a week working on the compiler for extracting the decoder portion of QEMU as HDL. So far I've cross-compiled QEMU to powerpc64le-linux-gnu using LTO and lld to generate an LLVM bitcode file that contains almost all of QEMU. I'm now running the resulting bitcode through an LLVM IR interpreter I wrote that uses symbolic base addresses for most memory allocations. I got it to successfully run QEMU's global constructors and start running qemu_init_subsystems.

That makes me think I could finish that task in maybe 2.5mo, so I think the €20k budget is actually appropriate once you take into account that other people working on the task would take longer, since they wouldn't be as experienced with LLVM IR and the like.

Here's a starting point for adjusting the top-level tasks in the budget:

  • € 50000 Adding missing features to our CPU, such as memory paging, floating-point instructions, a better cache hierarchy, and better compatibility with the PowerISA specification.
    This task is basically just adding features until the budget runs out. I don't expect we'll cover all the features, but we can definitely get some done.

    • Memory paging: I think I could finish it in 1mo. Skills: Rust, HDL, familiarity with PPC radix paging and x86_64 page tables.

    • Floating-point instructions: I think I could do: Add/Sub in 3 weeks, Mul/FMA in 1mo, Div/Sqrt in 1.5mo (wouldn't be fast, but it would work), comparison in 1.5 weeks, and conversion from/to integer and rounding to integer in 3 weeks.
      Skills for all those: Rust, HDL, IEEE 754

    • Handling control/status and PowerISA's OV/SO: I think I could do it in 1.5mo, since it'd basically be feeding the rounding mode into the different units, adding denormal flushing support, and then making instructions cause a CPU pipeline flush if the status flags or SO change, treating them kinda like the program counter.
      Skills: Rust, HDL, speculative CPU design.

    • Better cache hierarchy could be 2-3mo for me and likely won't fit on FPGAs, so we could leave it out of the grant.
      Skills: Rust, HDL, cache coherency

    So, adding up all of the above durations without the cache hierarchy gives 27.5 weeks (counting 1mo as 4 weeks). Multiplying that by 1.5 to account for others taking more time gives ~10.3mo, which at €5k/mo is about €51.6k, so I think €50000 is good.

  • € 20000 Add the programmable decoder and µOp cache to our CPU design.
    I don't actually expect this task to be all that complex, since the programmable decoder will be pretty straightforward: basically a pipeline of adders and muxes. The µOp cache will be a bit more complex, but probably simpler than the d-cache. I think I could finish it in maybe 2mo.
    Skills: Rust, HDL, basic FPGA architecture

  • € 20000 Build a compiler that can extract the decoder portion of QEMU using pattern matching and some symbolic execution of LLVM IR, converting to an HDL IR more suitable for hardware.
    This one stays the same, as discussed above.
    Skills: Rust, C/C++ (reading QEMU and LLVM source as needed), HDL, LLVM IR, interpreters, QEMU's high-level architecture.

  • € 25000 Write code to convert the HDL IR to a bitstream we can program into the decoder.
    I expect this to be mostly running an FPGA toolchain like normal and invoking nextpnr with the appropriate custom FPGA model, since IIRC you can supply that with a config file.
    Skills: Rust, HDL toolchains, nextpnr custom FPGA config
    Kinda depends on: "add programmable decoder" and "compiler to extract decoder", which might be doable in parallel

  • € 20000 Start getting the fallback software decoder and the software instruction emulator to work, as well as misc. other parts of the compiler needed to make the whole system work together.
    This is somewhat open-ended and I don't have a good idea how much work would be needed, so I rephrased it as starting the work, without promising that we finish everything before we can get paid.
    Skills: Low-level programming, QEMU's architecture, C, maybe Rust.
    Depends on: "programmable decoder and µOp cache" and maybe "Write code to convert the HDL IR to a bitstream" with "Build a compiler that can extract the decoder portion of QEMU"

So, the new total is €135k.
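As a sanity check on the sums above, here is a rough Python sketch. It assumes 1mo = 4 weeks (that is what makes the 27.5-week figure work out) and a €5k/mo wage; the task labels are just shorthand I picked for the bullets.

```python
WEEKS_PER_MONTH = 4        # assumption: 1mo is counted as 4 weeks
SLOWDOWN = 1.5             # factor for other people taking longer
WAGE_EUR_PER_MONTH = 5000  # target wage discussed earlier

# My own estimates for the CPU feature sub-tasks, in weeks
# (cache hierarchy left out, as suggested above).
cpu_feature_weeks = {
    "memory paging": 1 * WEEKS_PER_MONTH,
    "fp add/sub": 3,
    "fp mul/fma": 1 * WEEKS_PER_MONTH,
    "fp div/sqrt": 1.5 * WEEKS_PER_MONTH,
    "fp comparison": 1.5,
    "fp int conversion and rounding": 3,
    "control/status and OV/SO": 1.5 * WEEKS_PER_MONTH,
}

total_weeks = sum(cpu_feature_weeks.values())
scaled_months = total_weeks * SLOWDOWN / WEEKS_PER_MONTH
estimated_cost_eur = scaled_months * WAGE_EUR_PER_MONTH

# Top-level budget lines, in EUR.
budget_eur = {
    "missing CPU features": 50_000,
    "programmable decoder and µOp cache": 20_000,
    "compiler to extract QEMU decoder": 20_000,
    "HDL IR to bitstream": 25_000,
    "fallback decoder and emulator (start)": 20_000,
}
total_budget_eur = sum(budget_eur.values())

print(f"{total_weeks} weeks, scaled to {scaled_months:.1f} months")
print(f"estimated cost for CPU features: €{estimated_cost_eur:.0f}")
print(f"total budget: €{total_budget_eur}")
```

With these inputs the CPU feature estimate comes to 27.5 weeks, about 10.3 months after scaling, and the budget lines sum to €135000.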

Let me know what you all think!

Jacob

I've since realized that EUR 5000/mo may be a bit low. Proposals are appreciated for a more appropriate wage, considering NLnet has a limit of EUR 150k for this grant.

Sent the email to NLnet.