Systems — inline assembly, volatile, atomic
Nyx as a systems language
Most programming books stop at "build a web server." But Nyx is designed to go further — it can replace C for systems programming. This chapter covers the low-level tools that make that possible.
Inline assembly
When you need to execute a specific CPU instruction, Nyx lets you write assembly directly:
    fn main() {
        unsafe {
            asm("nop")   // no-operation: does nothing
        }
    }
Inline assembly uses GCC-style AT&T syntax and compiles to LLVM inline assembly.
Register manipulation
    fn main() {
        unsafe {
            asm("mov $42, %rax")   // move 42 into RAX register
            asm("add $8, %rax")    // add 8 to RAX (now 50)
        }
    }
When to use inline assembly
- CPU-specific instructions: CPUID, RDTSC (read timestamp counter), cache control
- Interrupt handlers: cli/sti to disable/enable interrupts
- Bare-metal programming: bootloaders, OS kernels
- Performance-critical sequences: when LLVM's optimizer does not generate optimal code
For almost all application-level code, you never need inline assembly. It exists for the rare cases where you need exact control over what the CPU does.
Function attributes
Nyx supports function attributes for systems programming:
    #[naked]
    fn interrupt_handler() {
        unsafe {
            asm("push %rax")
            asm("push %rbx")
            // ... handle interrupt ...
            asm("pop %rbx")
            asm("pop %rax")
            asm("iretq")
        }
    }

- #[naked]: no function prologue/epilogue (no stack frame setup)
- #[interrupt]: marks the function as a hardware interrupt handler
- #[link_section(".text.boot")]: places the function in a specific binary section
- #[export_name("_start")]: controls the symbol name in the binary
Volatile memory access
Normal memory access can be reordered or eliminated by the compiler. Volatile access cannot:
    fn read_hardware_register() -> int {
        unsafe {
            let mmio_addr: *int = 0x40000000 as *int
            return volatile_read(mmio_addr)
        }
    }

    fn write_hardware_register(value: int) {
        unsafe {
            let mmio_addr: *int = 0x40000000 as *int
            volatile_write(mmio_addr, value)
        }
    }
Why volatile matters
Without volatile, the compiler might:
- Read a hardware register once and cache the value (missing updates)
- Remove a write to a register because "no one reads it" (the hardware does!)
- Reorder reads and writes (hardware depends on exact ordering)
Volatile guarantees that every read actually loads from memory, every write actually stores to memory, and volatile accesses stay in exactly the order you wrote them relative to one another.
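For readers coming from C, the same idea can be sketched with the volatile qualifier. This is a comparison in C, not Nyx code: reg_read, reg_write, send_command and fake_device_register are hypothetical names, and an ordinary variable stands in for the memory-mapped register so the sketch runs on a normal host.

```c
#include <stdint.h>

/* Hypothetical helpers mirroring the chapter's volatile_read/volatile_write.
   The volatile qualifier forces one real load or store per call. */
static inline uint32_t reg_read(const volatile uint32_t *reg) {
    return *reg;
}

static inline void reg_write(volatile uint32_t *reg, uint32_t value) {
    *reg = value;
}

/* Stand-in for a memory-mapped register (assumption for this sketch).
   On real hardware this would be a fixed address such as 0x40000000. */
static uint32_t fake_device_register = 0;

/* Writes a command, then reads the register back.
   Because both accesses are volatile, the compiler may not drop the
   write or replace the read with the value it just wrote. */
static uint32_t send_command(uint32_t command) {
    reg_write(&fake_device_register, command);
    return reg_read(&fake_device_register);
}
```

With a plain (non-volatile) pointer, an optimizer would be free to fold send_command into "return command" and skip the memory traffic entirely, which is exactly the behavior the list above warns about.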
Atomic operations
Atomic operations are the foundation of lock-free programming. They guarantee that a load, a store, or a read-modify-write on a shared value completes as one indivisible step, so no other thread can observe a half-finished update:
    fn main() {
        unsafe {
            let flag: *int = alloc(8)
            atomic_store(flag, 0)

            // In one thread:
            atomic_store(flag, 1)   // signal "ready"

            // In another thread:
            let ready: int = atomic_load(flag)
            if ready == 1 {
                print("Other thread signaled ready")
            }

            free(flag)
        }
    }
Atomic vs mutex
| Atomic | Mutex |
|---|---|
| Single operations only | Protects code blocks |
| No waiting (lock-free) | Can block (threads wait) |
| Very fast (~1 nanosecond) | Slower (~50 nanoseconds) |
| Hard to use correctly | Easy to reason about |
Use atomics for simple flags, counters, and state machines. Use mutexes for complex critical sections.
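To make "counters and state machines" concrete, here is a minimal sketch in C11 using stdatomic.h. It is a comparison in C rather than Nyx: this chapter only shows atomic_load and atomic_store for Nyx, so whether Nyx offers fetch-add or compare-exchange equivalents is an assumption left open. The names request_count, count_request, state and try_transition are illustrative.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Counter: each call is one indivisible read-modify-write. */
static atomic_int request_count;   /* static storage, so it starts at 0 */

static int count_request(void) {
    /* atomic_fetch_add returns the value held before the addition. */
    return atomic_fetch_add(&request_count, 1) + 1;
}

/* State machine: IDLE -> RUNNING -> DONE. */
enum { STATE_IDLE = 0, STATE_RUNNING = 1, STATE_DONE = 2 };
static atomic_int state;           /* starts at STATE_IDLE (0) */

/* Moves the state from `from` to `to` only if it is currently `from`.
   If several threads race on the same transition, exactly one wins. */
static bool try_transition(int from, int to) {
    int expected = from;
    return atomic_compare_exchange_strong(&state, &expected, to);
}
```

Each function touches a single shared word, which is the case the table assigns to atomics. Once an update has to keep two or more values consistent with each other, a mutex is usually the simpler choice.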
Sized types for hardware
When interfacing with hardware or binary protocols, you need exact-sized types:
    var byte: i8 = 0xFF
    var word: i16 = 0x1234
    var dword: i32 = 0xDEADBEEF
    var qword: i64 = 0
    var ubyte: u8 = 255
    var uword: u16 = 65535
    var udword: u32 = 4294967295
    var uqword: u64 = 0
    var single: f32 = 3.14
These types map directly to hardware register widths and are essential for device drivers and binary protocol parsing.
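As an illustration of why exact widths matter for binary protocols, the following C sketch parses a made-up 7-byte packet header using the stdint.h equivalents of u8, u16 and u32. The header layout, the PacketHeader type and the helper names are all hypothetical and chosen only for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wire format, little-endian:
   byte 0      type      (u8)
   bytes 1-2   length    (u16)
   bytes 3-6   sequence  (u32) */
typedef struct {
    uint8_t  type;
    uint16_t length;
    uint32_t sequence;
} PacketHeader;

/* Assemble a 16-bit value from two bytes, low byte first. */
static uint16_t read_u16_le(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Assemble a 32-bit value from four bytes, low byte first. */
static uint32_t read_u32_le(const uint8_t *p) {
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Returns 1 on success, 0 if the buffer is shorter than the header. */
static int parse_header(const uint8_t *buf, size_t len, PacketHeader *out) {
    if (len < 7) {
        return 0;
    }
    out->type = buf[0];
    out->length = read_u16_le(buf + 1);
    out->sequence = read_u32_le(buf + 3);
    return 1;
}
```

Reading the fields byte by byte, instead of casting the buffer to a struct pointer, keeps the parser independent of the host's endianness and struct padding.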
No-GC mode
For systems programming, you often cannot have a garbage collector. Nyx supports compilation without the GC:
make run-no-gc FILE=program.nx
In no-GC mode:
- No Boehm GC linked
- All memory management is manual (alloc/free)
- Smaller binaries
- Deterministic performance (no GC pauses)
- You must free everything — leaks are your responsibility
Cross-compilation
Nyx can compile for different architectures:
make cross FILE=program.nx TARGET=aarch64-linux-gnu
This generates a binary for ARM64 Linux from an x86 machine. Combined with no-GC mode, you can target embedded systems and microcontrollers.
WebAssembly target
Nyx can also compile to WebAssembly for running in browsers:
make wasm FILE=program.nx
This generates a .wasm file that can be loaded by a web page.
Practical example: reading the CPU timestamp
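    fn rdtsc() -> int {
        var result: int = 0
        unsafe {
            // Read Time Stamp Counter: low 32 bits go to RAX, high 32 bits to RDX
            asm("rdtsc")
            asm("shl $32, %rdx")   // shift the high half into position
            asm("or %rdx, %rax")   // combine both halves in RAX
        }
        // Note: nothing shown here copies RAX into result, so as written this returns 0
        return result
    }

    fn main() {
        let start: int = time_us()
        // ... do some work ...
        var i: int = 0
        while i < 1000000 {
            i += 1
        }
        let end: int = time_us()
        print("Elapsed: " + int_to_string(end - start) + " microseconds")
    }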
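The listing above leaves the combined counter in RAX without binding it to a variable. For comparison, this is how the same read is commonly written in C with GCC/Clang extended inline assembly, where output constraints tie registers to variables. It is a sketch in C, not Nyx; combine_halves and read_tsc are illustrative names, and the function returns 0 on non-x86 targets so the example still compiles there.

```c
#include <stdint.h>

/* Joins the two 32-bit halves that RDTSC leaves in EDX (high) and EAX (low). */
static uint64_t combine_halves(uint32_t high, uint32_t low) {
    return ((uint64_t)high << 32) | low;
}

/* Reads the time stamp counter on x86 and x86-64.
   On any other architecture this sketch simply returns 0. */
static uint64_t read_tsc(void) {
#if defined(__x86_64__) || defined(__i386__)
    uint32_t low, high;
    /* "=a" binds EAX to `low`, "=d" binds EDX to `high`. */
    __asm__ volatile("rdtsc" : "=a"(low), "=d"(high));
    return combine_halves(high, low);
#else
    return 0;
#endif
}
```

The counter ticks in CPU-specific units, not microseconds, so differences between two read_tsc calls are mainly useful for comparing short code sequences on the same machine.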
Exercises
- Write a program that uses volatile_write and volatile_read to implement a simple spin-lock (busy-waiting lock).
- Use atomic_store and atomic_load to implement a thread-safe boolean flag that one thread sets and another thread polls.
- Write a program using sized types (i8, i16, i32) that packs three values into a single i64 using bit shifting.
- Compile a simple "hello world" program in no-GC mode and compare its binary size to the normal GC-enabled version.
- Write a function that uses inline assembly to execute a nop sled (100 nops in a row) and measure its execution time.
Summary
- unsafe { asm("...") } executes inline assembly (AT&T syntax).
- Function attributes: #[naked], #[interrupt], #[link_section], #[export_name].
- volatile_read/volatile_write prevent compiler reordering of memory access.
- atomic_load/atomic_store provide lock-free thread synchronization.
- Sized types (i8, i16, i32, u8, u16, u32, f32) match hardware widths.
- No-GC mode enables deterministic, manual memory management.
- Cross-compilation and WebAssembly targets extend Nyx beyond desktop Linux.