Systems — inline assembly, volatile, atomic

Nyx as a systems language

Most programming books stop at "build a web server." Nyx is designed to go further: it can replace C for systems programming. This chapter covers the low-level tools that make that possible.

Inline assembly

When you need to execute a specific CPU instruction, Nyx lets you write assembly directly:

fn main() {
    unsafe {
        asm("nop")    // no-operation: the CPU does nothing but advance to the next instruction
    }
}

Inline assembly uses GCC-style AT&T syntax and compiles to LLVM inline assembly.

Register manipulation

fn main() {
    unsafe {
        asm("mov $42, %rax")     // move 42 into RAX register
        asm("add $8, %rax")      // add 8 to RAX (50, provided nothing else touched RAX in between)
    }
}
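For comparison, here is how the same GCC-style AT&T syntax looks in C when the instruction is tied to a variable instead of a hard-coded register. This is a sketch in C, not Nyx; the chapter does not say whether Nyx exposes operand and clobber lists, and the function name `add_eight` is made up for illustration.

```c
#include <stdint.h>

/* Adds 8 to x with a single x86-64 instruction, in AT&T syntax.
 * "+r"(x) lets the compiler pick a register for x and marks it as
 * both read and written; "cc" declares that the CPU flags change.
 * On other architectures this sketch falls back to plain C. */
static uint64_t add_eight(uint64_t x) {
#if defined(__x86_64__)
    __asm__("addq $8, %0" : "+r"(x) : : "cc");
#else
    x += 8;
#endif
    return x;
}
```

Because the compiler knows which register holds `x` and that the flags are clobbered, it can keep the surrounding code correct without guessing what the assembly did.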

When to use inline assembly

For almost all application-level code, you never need inline assembly. It exists for the rare cases where you need exact control over what the CPU does.

Function attributes

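Nyx supports function attributes for systems programming. The example below uses #[naked] on an interrupt handler. In LLVM terms, a naked function is emitted without a compiler-generated prologue or epilogue, which is why the body saves and restores registers itself and returns with iretq: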

#[naked]
fn interrupt_handler() {
    unsafe {
        asm("push %rax")
        asm("push %rbx")
        // ... handle interrupt ...
        asm("pop %rbx")
        asm("pop %rax")
        asm("iretq")
    }
}

Volatile memory access

Normal memory access can be reordered or eliminated by the compiler. Volatile access cannot:

fn read_hardware_register() -> int {
    unsafe {
        let mmio_addr: *int = 0x40000000 as *int
        return volatile_read(mmio_addr)
    }
}

fn write_hardware_register(value: int) {
    unsafe {
        let mmio_addr: *int = 0x40000000 as *int
        volatile_write(mmio_addr, value)
    }
}

Why volatile matters

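Without volatile, the compiler might keep a value in a register instead of re-reading memory, drop a write that looks redundant, or reorder accesses.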

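Volatile guarantees that every read reads from memory and every write writes to memory, in exactly the order you specified relative to other volatile accesses. It constrains the compiler only: it does not make an access atomic or order it against other threads.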
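The classic case where this matters is a polling loop on a status register. The sketch below shows the idea in C rather than Nyx; the helper names (`reg_read`, `reg_write`, `wait_for_bits`) are hypothetical, and the register is passed in as a pointer so the code can be tried against an ordinary variable instead of real hardware.

```c
#include <stdint.h>

/* Every access through a volatile-qualified pointer is a real load or store. */
static inline uint32_t reg_read(volatile uint32_t *reg) { return *reg; }
static inline void reg_write(volatile uint32_t *reg, uint32_t value) { *reg = value; }

/* Polls until any bit in mask is set, or gives up after max_spins reads.
 * Returns 1 if a bit was seen, 0 on timeout. Without volatile, the
 * compiler could legally read *reg once and never look again. */
static int wait_for_bits(volatile uint32_t *reg, uint32_t mask, unsigned max_spins) {
    for (unsigned i = 0; i < max_spins; i++) {
        if (reg_read(reg) & mask) {
            return 1;
        }
    }
    return 0;
}
```

The bounded spin count is a design choice for the sketch: a driver that waits forever on a device that never responds is hard to debug.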

Atomic operations

Atomic operations are the foundation of lock-free programming. They guarantee that a load, a store, or a read-modify-write cycle completes as one indivisible step, so no other thread can observe a half-finished result:

fn main() {
    unsafe {
        let flag: *int = alloc(8)
        atomic_store(flag, 0)

        // In one thread:
        atomic_store(flag, 1)    // signal "ready"

        // In another thread:
        let ready: int = atomic_load(flag)
        if ready == 1 {
            print("Other thread signaled ready")
        }

        free(flag)
    }
}
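The "ready" flag pattern is most useful when the flag guards some other data. The following is a minimal sketch in C11 (`<stdatomic.h>`), not Nyx: the names `publish` and `try_consume` are invented for illustration, and the explicit release and acquire orderings are C11 concepts that this chapter does not describe for Nyx's atomic_store and atomic_load.

```c
#include <stdatomic.h>

static int payload;        /* ordinary data, written before the flag */
static atomic_int ready;   /* static storage, so it starts at 0 = not ready */

/* Producer side: write the data first, then raise the flag.
 * The release store makes the payload write visible to any thread
 * that later sees ready == 1 through an acquire load. */
static void publish(int value) {
    payload = value;
    atomic_store_explicit(&ready, 1, memory_order_release);
}

/* Consumer side: returns 1 and fills *out if data is ready, else 0. */
static int try_consume(int *out) {
    if (atomic_load_explicit(&ready, memory_order_acquire) != 1) {
        return 0;
    }
    *out = payload;
    return 1;
}
```

In a real program `publish` and `try_consume` would run on different threads; they are plain functions here so the logic can be checked on its own.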

Atomic vs mutex

| Atomic | Mutex |
|---|---|
| Single operations only | Protects code blocks |
| No waiting (lock-free) | Can block (threads wait) |
| Very fast (~1 nanosecond) | Slower (~50 nanoseconds) |
| Hard to use correctly | Easy to reason about |

Use atomics for simple flags, counters, and state machines. Use mutexes for complex critical sections.
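To make "counters and state machines" concrete, here is a hedged C11 sketch of both. It relies on `atomic_fetch_add` and `atomic_compare_exchange_strong` from `<stdatomic.h>`; the chapter only shows atomic_store and atomic_load for Nyx, so treat the read-modify-write calls as the C equivalents rather than as Nyx API. The names `take_ticket` and `try_start` are hypothetical.

```c
#include <stdatomic.h>

enum { IDLE = 0, RUNNING = 1 };

static atomic_uint next_ticket;   /* shared counter, starts at 0 */
static atomic_int state;          /* starts in IDLE */

/* Counter: hands out unique, increasing ticket numbers.
 * atomic_fetch_add returns the value held before the increment. */
static unsigned take_ticket(void) {
    return atomic_fetch_add(&next_ticket, 1u);
}

/* State machine: moves IDLE -> RUNNING exactly once.
 * Returns 1 for the single caller that wins the transition, 0 otherwise. */
static int try_start(void) {
    int expected = IDLE;
    return atomic_compare_exchange_strong(&state, &expected, RUNNING);
}
```

Each function is one atomic operation, which is the sweet spot for atomics. Once an update spans several variables, a mutex is usually the simpler choice.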

Sized types for hardware

When interfacing with hardware or binary protocols, you need exact-sized types:

var byte: i8 = 0x7F          // i8 range is -128..127
var word: i16 = 0x1234
var dword: i32 = 0x12345678  // i32 maximum is 0x7FFFFFFF
var qword: i64 = 0

var ubyte: u8 = 255
var uword: u16 = 65535
var udword: u32 = 4294967295
var uqword: u64 = 0

var single: f32 = 3.14

These types map directly to hardware register widths and are essential for device drivers and binary protocol parsing.
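As an illustration of binary protocol parsing with exact-width types, the sketch below reads big-endian (network order) fields out of a byte buffer. It is written in C with `<stdint.h>` types, which correspond to the u8, u16, and u32 types above; the function names are made up for this example.

```c
#include <stdint.h>

/* Reads a 16-bit big-endian value: first byte is the high half. */
static uint16_t read_be16(const uint8_t *p) {
    return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}

/* Reads a 32-bit big-endian value. Each byte is widened to 32 bits
 * before shifting so that no bits are lost. */
static uint32_t read_be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}
```

Assembling the value byte by byte gives the same result on little-endian and big-endian hosts, which a pointer cast would not.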

No-GC mode

For systems programming, you often cannot have a garbage collector. Nyx supports compilation without the GC:

make run-no-gc FILE=program.nx

In no-GC mode:

Cross-compilation

Nyx can compile for different architectures:

make cross FILE=program.nx TARGET=aarch64-linux-gnu

This generates a binary for ARM64 Linux from an x86 machine. Combined with no-GC mode, this lets you target embedded systems and microcontrollers.

WebAssembly target

Nyx can also compile to WebAssembly for running in browsers:

make wasm FILE=program.nx

This generates a .wasm file that can be loaded by a web page.

Practical example: reading the CPU timestamp

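fn rdtsc() -> int {
    var result: int = 0
    unsafe {
        // Read Time Stamp Counter: low 32 bits land in EAX, high 32 bits in EDX
        asm("rdtsc")
        asm("shl $32, %rdx")     // move the high half into bits 32-63
        asm("or %rdx, %rax")     // RAX now holds the full 64-bit count
    }
    // NOTE: nothing above copies RAX into result, so as written the function
    // returns the initial 0. The asm output still has to be bound to result.
    return result
}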

fn main() {
    let start: int = time_us()
    // ... do some work ...
    var i: int = 0
    while i < 1000000 { i += 1 }
    let end: int = time_us()
    print("Elapsed: " + int_to_string(end - start) + " microseconds")
}
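For reference, this is how the timestamp read looks in C, where GCC-style output operands tie the registers to variables. It is a sketch under stated assumptions: it is C rather than Nyx, the names `combine_halves` and `read_tsc` are hypothetical, and on anything other than x86-64 the function simply returns 0.

```c
#include <stdint.h>

/* Joins the two 32-bit halves that RDTSC leaves in EDX (high) and EAX (low). */
static uint64_t combine_halves(uint32_t hi, uint32_t lo) {
    return ((uint64_t)hi << 32) | lo;
}

/* Reads the time stamp counter on x86-64; returns 0 elsewhere. */
static uint64_t read_tsc(void) {
#if defined(__x86_64__)
    uint32_t lo, hi;
    /* "=a" binds EAX to lo and "=d" binds EDX to hi, so the compiler
     * knows where the results are and that both registers changed. */
    __asm__ volatile("rdtsc" : "=a"(lo), "=d"(hi));
    return combine_halves(hi, lo);
#else
    return 0;
#endif
}
```

The shift and OR happen in C instead of in assembly, which leaves the compiler free to choose registers for them.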

Exercises

  1. Write a program that uses volatile_write and volatile_read to implement a simple spin-lock (busy-waiting lock).
  2. Use atomic_store and atomic_load to implement a thread-safe boolean flag that one thread sets and another thread polls.
  3. Write a program using sized types (i8, i16, i32) that packs three values into a single i64 using bit shifting.
  4. Compile a simple "hello world" program in no-GC mode and compare its binary size to the normal GC-enabled version.
  5. Write a function that uses inline assembly to execute a nop sled (100 nops in a row) and measure its execution time.

Summary

Next chapter: Case study — How nyx-kv was built →
