Are you an engineer who loves understanding and improving the performance of systems? We are seeking versatile engineers to work on the performance of Oxide systems wherever suboptimality is to be found: from the depths of hardware, through the operating system kernel and hypervisor and into the guest operating system — and to the highest levels of the application stack.
As an engineer focused on systems performance, you will:
Work rigorously to understand existing limiters to performance, wherever those limiters may lie.
Prototype improvements to the system, be they small fixes, larger-scale rewrites, or entirely de novo subsystems.
Work on systems of a variety of ages, spanning from decades-old to entirely new — and everything in between.
Work on systems primarily written in Rust and C.
Work with a wide variety of our systems software, including (but not limited to!) our host operating system (Helios), our hypervisor (Propolis), our block storage service (Crucible), our embedded operating system (Hubris), and our control plane (Omicron).
Work with a variety of hardware as needed to understand and model the performance ramifications of different architectural or component decisions.
Develop infrastructure and tooling to better understand systems performance.
You will thrive in this role if you:
Believe that every instruction is sacred, every instruction is great.
Love to hunt slow, broken code — and replace it with a vastly improved alternative.
Are deeply analytical and data-driven.
Have used whatever tooling is at your disposal to understand systems behavior (e.g., DTrace/eBPF, snoop/tcpdump, truss/strace).
Have implemented your own tools where the right tool didn’t exist (or otherwise needed to be extended).
Have experience shipping software written in Rust, C, or another systems-oriented language.
Before applying for this role, you should:
Browse our public Requests for Discussion to get a flavor for how we work.
Listen to Hiring Processes with Gergely Orosz to familiarize yourself with the Oxide hiring process.
Listen to some of our episodes of Oxide and Friends. A few recommendations:
"When Async Attacks!" on a particularly pathological performance problem and the tooling we developed to understand it
"Mr. Nagle’s Wild Ride" on a timeless performance tale, re-told anew
"Crucible: The Oxide Storage Service" on our storage service and our approach to improving its performance
"Heterogeneous Computing with Raja Koduri" on how hardware comprises the ultimate limiter of performance — and why different approaches are called for by different problems