kerf is a comprehensive multikernel management system designed to orchestrate and manage multiple kernel instances on a single host. Starting with advanced device tree compilation and validation, kerf provides the foundation for complete multikernel lifecycle management.
Unlike standard tools that only perform basic format conversion, kerf understands multikernel semantics and always validates resource allocations and detects conflicts. The system is architected to evolve into a complete multikernel runtime environment.
kerf currently provides the essential device tree compilation and validation capabilities needed for multikernel systems:
- Resource Conflict Detection: Multiple instances might accidentally be allocated the same CPUs, overlapping memory regions, or the same devices
- Over-Allocation: The sum of all allocations might exceed available resources
- Invalid References: Instances might reference non-existent hardware or devices reserved for the host
- Atomicity: All allocations should be validated together before deployment
The kerf system is designed to evolve into a comprehensive multikernel management platform:
- Kernel Loading & Execution: Load and execute multiple kernel instances with proper isolation
- Resource Management: Dynamic allocation and deallocation of system resources
- Instance Lifecycle: Start, stop, pause, and migrate kernel instances
- Monitoring & Debugging: Real-time monitoring of kernel instances and system health
- Security & Isolation: Advanced security policies and isolation mechanisms
- Orchestration: High-level orchestration of complex multikernel workloads
The current device tree foundation provides the critical infrastructure needed for these advanced capabilities.
The kerf system is built on foundational principles that support both current device tree capabilities and future multikernel runtime features:
- Single Source of Truth: Baseline DTS describes hardware resources available for allocation
- Mandatory Validation: Every operation validates the configuration - validation is not optional
- Fail-Fast: Catch resource conflicts immediately, never produce invalid output
- Overlay-based Management: Dynamic instance changes are managed via device tree overlays
- Extensible Architecture: Designed to support future kernel loading, execution, and management capabilities
- Developer-Friendly: Clear error messages with suggestions for fixing problems
- Runtime-Ready: Current design anticipates future kernel execution and lifecycle management needs
Baseline initialization:
Input: Baseline DTS (resources only)
│
▼
┌─────────┐
│ kerf │ ← Always validates
│ init │
└─────────┘
│
▼
Baseline DTB
(resources only)
→ /sys/fs/multikernel/device_tree
Overlay-based dynamic changes:
Current State Modified State
(Baseline + Overlays) (After change)
│ │
├───────────────────────┤
│ │
▼ ▼
┌─────────┐ ┌─────────┐
│ Compute │ │ Compute │
│ Delta │ │ Delta │
└─────────┘ └─────────┘
│ │
└───────────┬───────────┘
│
▼
┌─────────────┐
│ kerf │ ← Validates full state
│ (create/ │ before generating overlay
│ update/ │
│ delete) │
└─────────────┘
│
▼
DTBO Overlay
│
▼
→ /sys/fs/multikernel/overlays/new
│
▼
Applied Overlay
→ /sys/fs/multikernel/overlays/tx_XXX/
Complete system state:
Baseline DTB (static)
│
├─── Overlay tx_101 (instance: web-server)
├─── Overlay tx_102 (instance: database)
└─── Overlay tx_103 (update: web-server resources)
│
▼
Effective Device Tree
(Baseline + All Applied Overlays)
│
▼
Kernel Instance Views
/sys/fs/multikernel/instances/*
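The delta step in the overlay flow above can be illustrated with a small Python sketch. This is not kerf's implementation: the dictionary layout and the function name compute_delta are hypothetical, chosen only to show how comparing the current and modified instance sets yields the minimal set of changes an overlay would need to carry.

```python
from typing import Any, Dict

Instance = Dict[str, Any]
State = Dict[str, Instance]


def compute_delta(current: State, modified: State) -> Dict[str, State]:
    """Return the instances to create, update, and delete to reach `modified`."""
    created = {n: cfg for n, cfg in modified.items() if n not in current}
    deleted = {n: cfg for n, cfg in current.items() if n not in modified}
    updated = {
        n: cfg
        for n, cfg in modified.items()
        if n in current and current[n] != cfg
    }
    return {"create": created, "update": updated, "delete": deleted}


# Hypothetical state: two existing instances, one resized, one new.
current = {
    "web-server": {"cpus": [4, 5, 6, 7], "memory_gb": 2},
    "database": {"cpus": list(range(8, 16)), "memory_gb": 8},
}
modified = {
    "web-server": {"cpus": [4, 5, 6, 7], "memory_gb": 4},
    "database": {"cpus": list(range(8, 16)), "memory_gb": 8},
    "compute": {"cpus": list(range(16, 32)), "memory_gb": 2},
}

delta = compute_delta(current, modified)
print(sorted(delta["create"]), sorted(delta["update"]), sorted(delta["delete"]))
```

Unchanged instances (database here) do not appear in the delta, which is what keeps the generated overlay minimal.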
Key Points:
- Baseline contains only resources: Hardware inventory available for allocation, loaded once via kerf init
- Instances created via overlays: Dynamic instance lifecycle managed through device tree overlays (DTBO)
- Overlay generation: Computes delta between current and modified state, generates minimal DTBO
- Transactional overlays: Each overlay is a transaction with rollback support via rmdir
- Validation is mandatory: Always validates full state (baseline + all overlays) before applying
- Single source of truth: Baseline DTB is the authoritative resource configuration, overlays add instances dynamically
- Advanced Validation: Comprehensive resource conflict detection and validation
- Baseline Management: Initialize and manage baseline device tree containing hardware resources
- Format Support: DTS to DTB compilation for baseline configuration
- Error Reporting: Detailed error messages with actionable suggestions
- Resource Analysis: Complete resource utilization reporting
- CPU & NUMA Topology: Full support for CPU topology and NUMA-aware resource allocation
# Initialize baseline device tree (applies by default)
kerf init --input=baseline.dts
# Validate baseline without applying (dry-run)
kerf init --input=baseline.dts --dry-run
# Load kernel image with initrd and boot parameters
kerf load --kernel=/boot/vmlinuz --initrd=/boot/initrd.img \
--cmdline="root=/dev/sda1 ro" --id=1
# Create kernel instance with explicit CPU allocation
kerf create web-server --cpus=4-7 --memory=2GB
# Create instance with auto-allocated CPU count
kerf create database --cpu-count=8 --memory=16GB
# Create with topology-aware auto-allocation
kerf create compute --cpu-count=16 --memory=32GB --numa-nodes=0,1 --cpu-affinity=spread --memory-policy=interleave
# Validate instance creation without applying
kerf create test-instance --cpu-count=4 --memory=2GB --dry-run
# Show all kernel instances
kerf show
# Show specific instance information
kerf show web-server --verbose
# Boot a kernel instance (requires loaded kernel)
kerf exec web-server
kerf exec --id=1
# Unload kernel image from an instance
kerf unload web-server
kerf unload --id=1 --verbose
# Delete a kernel instance (must be unloaded first)
kerf delete web-server
kerf delete --id=1 --dry-run
The kerf system is designed with a modular architecture that supports incremental development:
- kerf init: Initialize baseline device tree (resources only) (current)
- kerf create: Create a kernel instance (current)
- kerf load: Kernel loading via kexec_file_load syscall (current)
- kerf exec: Kernel execution via reboot syscall with MULTIKERNEL command (current)
- kerf unload: Unload kernel image from a multikernel instance (current)
- kerf delete: Delete a kernel instance (current)
- kerf show: Show kernel instance information (current)
- kerf update: Update a kernel instance (future)
- kerf kill: Kill a kernel instance (future)
This modular design allows users to adopt kerf incrementally, starting with device tree validation and expanding to full multikernel management as features become available.
The current device tree foundation provides essential building blocks for future multikernel capabilities:
- Resource Validation: Ensures safe resource allocation before kernel execution
- Instance Isolation: Provides the foundation for secure kernel isolation
- Configuration Management: Enables consistent and validated system configurations
- Error Handling: Establishes patterns for robust error reporting and recovery
- Extensible Architecture: Designed to support future kernel management APIs
These foundational capabilities are essential for safe and reliable multikernel execution, making kerf the ideal platform for building comprehensive multikernel management systems.
The baseline device tree contains only the Resources section, which describes all physical hardware available for allocation. Instances and device references are added dynamically via overlays when using kerf create.
- Resources (/resources): Complete description of all physical resources (baseline only)
- Instances (/instances): Resource assignments for each spawned kernel (added via overlays)
- Device References: Linkage between instances and hardware devices (added via overlays)
The baseline DTS file contains only hardware resources. Instances are created dynamically via overlays using kerf create.
/multikernel-v1/;
/ {
compatible = "linux,multikernel-host";
resources {
cpus {
total = <32>;
host-reserved = <0 1 2 3>;
available = <4 5 6 7 8 9 10 11 12 13 14 15
16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31>;
};
memory {
total-bytes = <0x0 0x400000000>; // 16GB
host-reserved-bytes = <0x0 0x80000000>; // 2GB
memory-pool-base = <0x80000000>;
memory-pool-bytes = <0x0 0x380000000>; // 14GB
};
devices {
eth0: ethernet@0 {
compatible = "intel,i40e";
pci-id = "0000:01:00.0";
sriov-vfs = <8>;
host-reserved-vf = <0>;
available-vfs = <1 2 3 4 5 6 7>;
};
nvme0: storage@0 {
compatible = "nvme";
pci-id = "0000:02:00.0";
namespaces = <4>;
host-reserved-ns = <1>;
available-ns = <2 3 4>;
};
};
};
};
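The numbers in the baseline above are internally consistent, and that consistency can be checked mechanically. The following Python sketch mirrors the values as plain integers and verifies that the reserved and available sets partition the CPU range and that the memory pool fits inside total memory. The function name check_baseline and the dictionary keys are hypothetical, and whether kerf enforces exactly these checks is an assumption.

```python
from typing import Dict, List

# Values copied from the baseline DTS above.
CPUS = {
    "total": 32,
    "host_reserved": [0, 1, 2, 3],
    "available": list(range(4, 32)),
}
MEMORY = {
    "total_bytes": 0x400000000,          # 16GB
    "host_reserved_bytes": 0x80000000,   # 2GB
    "pool_base": 0x80000000,
    "pool_bytes": 0x380000000,           # 14GB
}


def check_baseline(cpus: Dict, memory: Dict) -> List[str]:
    """Return a list of consistency errors (empty when the baseline is sane)."""
    errors = []
    reserved = set(cpus["host_reserved"])
    available = set(cpus["available"])
    valid_ids = set(range(cpus["total"]))

    if reserved & available:
        errors.append(f"CPUs both reserved and available: {sorted(reserved & available)}")
    if (reserved | available) - valid_ids:
        errors.append("CPU IDs outside 0..total-1")

    if memory["host_reserved_bytes"] + memory["pool_bytes"] > memory["total_bytes"]:
        errors.append("host-reserved + memory pool exceeds total memory")
    if memory["pool_base"] + memory["pool_bytes"] > memory["total_bytes"]:
        errors.append("memory pool extends past end of memory")
    return errors


print(check_baseline(CPUS, MEMORY))
```

For the example baseline, 2GB reserved plus a 14GB pool equals the 16GB total, and the pool starting at 0x80000000 ends exactly at 0x400000000.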
Device Tree Structure → Kernel Filesystem Interface:
DTS: /resources → /sys/fs/multikernel/device_tree (writable, single source of truth)
DTS: /instances/web-server → /sys/fs/multikernel/instances/web-server/ (read-only)
DTS: /instances/database → /sys/fs/multikernel/instances/database/ (read-only)
DTS: /instances/compute → /sys/fs/multikernel/instances/compute/ (read-only)
Name-based addressing:
- Instance node name in DTS (web-server) = directory name in kernel filesystem (instances/web-server/)
- Kernel assigns numeric IDs, but users reference by name
- No manual ID coordination needed
- Instance directories and device_tree_source files are auto-generated by the kernel from the global device tree
All kerf operations perform validation automatically:
- Compiling DTS to DTB → validates
- Converting formats → validates
- Generating reports → validates first
Validation cannot be disabled or skipped.
Rules:
- All CPUs must exist in hardware inventory (0 to total-1)
- CPUs must be in the available list (not host-reserved)
- No CPU can be allocated to multiple instances
- CPU lists should be explicitly enumerated
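The CPU rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not kerf's code: parse_cpu_list and validate_cpus are hypothetical names, and accepting comma-separated ranges such as 4-7,12 is an assumption (the examples in this document only show single ranges like 4-7 and single CPUs like 8).

```python
from typing import Dict, List, Set


def parse_cpu_list(spec: str) -> List[int]:
    """Expand a spec such as '4-7' or '8' into an explicit CPU list."""
    cpus: List[int] = []
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def validate_cpus(
    instances: Dict[str, List[int]], total: int, available: Set[int]
) -> List[str]:
    """Apply the existence, availability, and exclusivity rules."""
    errors: List[str] = []
    owner: Dict[int, str] = {}
    for name, cpus in instances.items():
        for cpu in cpus:
            if not 0 <= cpu < total:
                errors.append(f"{name}: CPU {cpu} does not exist (valid: 0-{total - 1})")
            elif cpu not in available:
                errors.append(f"{name}: CPU {cpu} is not in the available list")
            elif cpu in owner:
                errors.append(f"{name}: CPU {cpu} is already allocated to {owner[cpu]}")
            else:
                owner[cpu] = name
    return errors


AVAILABLE = set(range(4, 32))  # CPUs 0-3 are host-reserved in the example baseline
requested = {
    "web-server": parse_cpu_list("4-7"),
    "database": parse_cpu_list("8-15"),
}
print(validate_cpus(requested, total=32, available=AVAILABLE))
```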
Rules:
- All memory regions must be within memory pool bounds
- Memory regions cannot overlap between instances
- Sum of all allocations must not exceed memory pool size
- Memory base addresses must be page-aligned (4KB = 0x1000)
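A minimal Python sketch of the four memory rules above, assuming each instance owns a single contiguous region described by a base address and a size. The function name validate_memory and the region layout are hypothetical; the pool values come from the example baseline.

```python
from typing import Dict, List, Tuple

PAGE_SIZE = 0x1000  # 4KB
GIB = 1 << 30


def validate_memory(
    regions: Dict[str, Tuple[int, int]], pool_base: int, pool_bytes: int
) -> List[str]:
    """Check alignment, pool bounds, overlap, and total size of (base, size) regions."""
    errors: List[str] = []
    pool_end = pool_base + pool_bytes

    for name, (base, size) in regions.items():
        if base % PAGE_SIZE:
            errors.append(f"{name}: base {base:#x} is not 4KB-aligned")
        if base < pool_base or base + size > pool_end:
            errors.append(f"{name}: region lies outside the memory pool")

    # After sorting by base, any overlapping pair implies that some
    # adjacent pair overlaps, so checking neighbours is sufficient
    # to detect that an overlap exists.
    ordered = sorted(regions.items(), key=lambda item: item[1][0])
    for (n1, (b1, s1)), (n2, (b2, _)) in zip(ordered, ordered[1:]):
        if b1 + s1 > b2:
            errors.append(f"{n1} and {n2}: memory regions overlap")

    if sum(size for _, size in regions.values()) > pool_bytes:
        errors.append("sum of allocations exceeds memory pool size")
    return errors


regions = {
    "web-server": (0x80000000, 2 * GIB),
    "database": (0x100000000, 8 * GIB),
}
print(validate_memory(regions, pool_base=0x80000000, pool_bytes=0x380000000))
```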
Rules:
- Referenced devices must exist in hardware inventory
- Devices can only be allocated to one instance (exclusive access)
- Device references must be valid (no dangling phandles)
- SR-IOV VF numbers must be within available range
- Namespace IDs must be within available range
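The device rules above can be sketched the same way. This Python example assumes that exclusivity is tracked per allocatable unit (an SR-IOV VF number or an NVMe namespace ID) rather than per whole device; that granularity, the function name validate_devices, and the request layout are all assumptions made for illustration. The inventory values are taken from the example baseline.

```python
from typing import Dict, List, Set, Tuple

# Allocatable units per device, from available-vfs and available-ns above.
INVENTORY: Dict[str, Set[int]] = {
    "eth0": {1, 2, 3, 4, 5, 6, 7},
    "nvme0": {2, 3, 4},
}


def validate_devices(
    requests: Dict[str, List[Tuple[str, int]]], inventory: Dict[str, Set[int]]
) -> List[str]:
    """Check that each (device, unit) request exists, is available, and is unshared."""
    errors: List[str] = []
    owner: Dict[Tuple[str, int], str] = {}
    for instance, units in requests.items():
        for device, unit in units:
            if device not in inventory:
                errors.append(f"{instance}: unknown device {device}")
            elif unit not in inventory[device]:
                errors.append(f"{instance}: {device} unit {unit} is not available")
            elif (device, unit) in owner:
                errors.append(
                    f"{instance}: {device} unit {unit} already allocated to "
                    f"{owner[(device, unit)]}"
                )
            else:
                owner[(device, unit)] = instance
    return errors


requests = {
    "web-server": [("eth0", 1)],
    "database": [("eth0", 2), ("nvme0", 2)],
}
print(validate_devices(requests, INVENTORY))
```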
Rules:
- Instance names must be unique
- Instance IDs must be unique
- All phandle references must resolve
- Hardware inventory must be complete and consistent
# Initialize baseline device tree
kerf init --input=baseline.dts
# Validate baseline without applying
kerf init --input=baseline.dts --dry-run
# Generate detailed validation report
kerf init --input=baseline.dts --report
# Validate with verbose output
kerf init --input=baseline.dts --verbose
# Human-readable text (default)
kerf init --input=baseline.dts --report
# JSON for tooling integration
kerf init --input=baseline.dts --report --format=json
# YAML for configuration management
kerf init --input=baseline.dts --report --format=yaml
The kernel exposes a filesystem interface (mounted at /sys/fs/multikernel/) that manages baseline resources and overlay-based instance changes:
Kernel Interface Structure:
/sys/fs/multikernel/
├── device_tree # Baseline DTB (resources only, writable via kerf init)
├── overlays/ # Overlay subsystem
│ ├── new # Write DTBO here to apply overlay
│ ├── tx_101/ # Applied overlay transaction
│ │ ├── id # Transaction ID: "101"
│ │ ├── status # "applied" | "failed" | "removed"
│ │ ├── dtbo # Original overlay blob (binary)
│ │ └── ...
│ └── tx_102/
│ └── ...
└── instances/ # Runtime kernel instances (read-only)
├── web-server/
│ ├── id # Instance ID
│ ├── status # Instance status
│ └── ...
└── ...
Key Design Principles:
- Baseline Separation: Baseline (device_tree) contains only resources - no instances
- Overlay-based Changes: All dynamic changes (create, update, delete instances) via overlays
- Rollback Support: Remove overlay transaction directory (rmdir /sys/fs/multikernel/overlays/tx_XXX/) to rollback changes
- Kernel-Generated: Instance directories auto-generated from baseline + applied overlays
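The apply and rollback operations described above map onto ordinary file operations. The Python sketch below shows that mapping using only the paths documented in this section; the helper names (apply_overlay, rollback, list_transactions) are hypothetical, and the root parameter exists only so the code can be exercised against a scratch directory instead of the real kernel interface.

```python
import os
from pathlib import Path
from typing import List

DEFAULT_ROOT = Path("/sys/fs/multikernel")


def apply_overlay(dtbo: bytes, root: Path = DEFAULT_ROOT) -> None:
    """Submit a compiled overlay by writing it to overlays/new."""
    (root / "overlays" / "new").write_bytes(dtbo)


def list_transactions(root: Path = DEFAULT_ROOT) -> List[str]:
    """Return the names of the applied overlay transaction directories."""
    overlays = root / "overlays"
    return sorted(p.name for p in overlays.glob("tx_*") if p.is_dir())


def rollback(tx_id: int, root: Path = DEFAULT_ROOT) -> None:
    """Roll back one transaction by removing its directory (rmdir)."""
    os.rmdir(root / "overlays" / f"tx_{tx_id}")
```

On a regular filesystem rmdir only succeeds on an empty directory; on the kernel interface the transaction directory contains attribute files such as id and status, and removing it is assumed to be handled by the kernel as described above.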
# Step 1: Write baseline DTS describing hardware resources only
vim baseline.dts
# Baseline contains only /resources - no instances
# Step 2: Initialize baseline device tree
kerf init --input=baseline.dts
# Output:
# ✓ Baseline validation passed
# ✓ Baseline applied to kernel successfully
# Baseline: /sys/fs/multikernel/device_tree
# Step 3: Create kernel instances via overlays
kerf create web-server --cpus=4-7 --memory=2GB
kerf create database --cpus=8-15 --memory=8GB
# Kernel now has baseline configuration:
# - Resources defined and available for allocation
# - Instances created via overlays
# - Ready for kernel loading via 'kerf load'
# Create new kernel instance via overlay
# Instance name is a positional argument (can appear anywhere after 'create')
kerf create web-server --cpus=4-7 --memory=2GB
# This applies an overlay adding the instance to the device tree
# Instance name can also appear after options
kerf create --cpus=8-15 --memory=8GB database
# Create with explicit CPU allocation (CPU 8)
kerf create compute --cpus=8 --memory=16GB
# Create with auto-allocated CPU count (topology-aware)
kerf create compute --cpu-count=8 --memory=16GB --numa-nodes=0 --cpu-affinity=compact --memory-policy=local
# Validate before applying (dry-run, auto-allocate 4 CPUs)
kerf create web-server --cpu-count=4 --memory=2GB --dry-run
# Update instance resources via overlay (future)
kerf update database --cpus=8-19 --memory=8GB
# This applies an overlay updating the instance configuration
# Delete instance via overlay removal
kerf delete compute
# This removes the overlay transaction, reverting the change
$ kerf init --input=baseline.dts --report
Multikernel Device Tree Validation Report
==========================================
Status: ✓ VALID
Hardware Inventory:
CPUs: 32 total
Host reserved: 0-3 (4 CPUs, 12%)
Memory pool: 4-31 (28 CPUs, 88%)
Memory: 16GB total
Host reserved: 2GB (12%)
Memory pool: 14GB at 0x80000000 (88%)
Devices: 2 network, 1 storage
✓ Baseline validation passed
✓ Baseline applied to kernel successfully
Baseline: /sys/fs/multikernel/device_tree
$ kerf init --input=bad_baseline.dts --report
Multikernel Device Tree Validation Report
==========================================
Status: ✗ INVALID
ERROR: Baseline must not contain instances. Instances should be created via overlays.
Baseline must contain:
✓ /resources (hardware inventory)
✗ /instances (must be empty or absent)
Suggestion: Remove instances section from baseline
Instances should be created via 'kerf create' using overlays
In file bad_baseline.dts:
Line 45: instances { web-server { ... } }
✗ Validation failed with 1 error
Exit code: 1
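Because a failed validation ends with exit code 1, the check is easy to script, for example as a gate in a CI job. The Python sketch below wraps the documented kerf init --input=... --dry-run invocation; the helper names are hypothetical, and treating exit code 0 as success is an assumption (only the failure code is shown above). The run parameter is injectable so the logic can be tested without kerf installed.

```python
import subprocess
from typing import Callable, List


def init_command(dts_path: str, dry_run: bool = True) -> List[str]:
    """Build the kerf init command line for a baseline DTS file."""
    cmd = ["kerf", "init", f"--input={dts_path}"]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


def baseline_is_valid(dts_path: str, run: Callable = subprocess.run) -> bool:
    """Return True when the dry-run validation exits with code 0 (assumed success)."""
    result = run(init_command(dts_path), capture_output=True, text=True)
    return result.returncode == 0
```

A caller could then stop a deployment early with something like: if not baseline_is_valid("baseline.dts"): raise SystemExit(1).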
[tool.poetry.dependencies]
python = "^3.8"
pylibfdt = "^1.7.0" # Device tree parsing (from dtc project)
# From source (recommended for development)
git clone https://github.com/multikernel/kerf.git
cd kerf
# Installs 'kerf' command to ~/.local/bin/kerf
pip install -e .
# Installs 'kerf' command to the system Python's scripts directory
# (typically /usr/local/bin/kerf, or /usr/bin/kerf if using system Python)
sudo pip install .
# Install in development mode
pip install -e .
# Test the installation
kerf --help
kerf init --help
# Try with example baseline configuration
kerf init --input=examples/baseline.dts --report
The examples/ directory contains sample baseline Device Tree Source (DTS) files demonstrating various hardware resource configurations:
- baseline.dts - Complete baseline with CPU, memory, and device resources (32 CPUs, 16GB memory)
- minimal.dts - Simple baseline for testing and development (8 CPUs, 8GB memory)
- edge_computing.dts - Edge computing baseline with GPU support for AI inference (16 CPUs, 32GB memory)
- numa_topology.dts - Advanced NUMA topology baseline with 4 NUMA nodes and topology-aware allocation
- system.dts - Example baseline with various device configurations
- conflict_example.dts - Intentionally invalid baseline demonstrating common validation errors
Note: All baseline files contain only hardware resources - no instances. Instances are created dynamically via overlays using the kerf create command.
kerf provides comprehensive support for CPU and NUMA topology management:
- CPU Topology: Socket, core, and thread mapping with SMT/hyperthreading support
- NUMA Awareness: NUMA node definition with memory regions and CPU assignments
- Topology Policies: CPU affinity (compact, spread, local) and memory policies (local, interleave, bind)
- Performance Validation: Automatic validation of topology constraints and performance warnings
resources {
cpus {
total = <32>;
host-reserved = <0 1 2 3>;
available = <4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
20 21 22 23 24 25 26 27 28 29 30 31>;
};
topology {
numa-nodes {
node@0 {
node-id = <0>;
memory-base = <0x0 0x0>;
memory-size = <0x0 0x800000000>; // 32GB
cpus = <0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15>;
};
node@1 {
node-id = <1>;
memory-base = <0x0 0x800000000>;
memory-size = <0x0 0x800000000>; // 32GB
cpus = <16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31>;
};
};
};
};
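To make the compact and spread affinity policies concrete, here is a Python sketch of one plausible interpretation against the two-node topology above: compact fills one NUMA node before moving to the next, while spread takes CPUs from the nodes in round-robin order. These semantics and the function name pick_cpus are assumptions for illustration, not kerf's documented behaviour, and the local policy is not covered.

```python
from typing import Dict, List, Set

# NUMA layout and available CPUs from the topology example above.
NODES: Dict[int, List[int]] = {0: list(range(0, 16)), 1: list(range(16, 32))}
AVAILABLE: Set[int] = set(range(4, 32))  # CPUs 0-3 are host-reserved


def pick_cpus(
    count: int, nodes: Dict[int, List[int]], available: Set[int], affinity: str
) -> List[int]:
    """Select `count` available CPUs according to an (assumed) affinity policy."""
    free = {n: [c for c in cpus if c in available] for n, cpus in nodes.items()}

    if affinity == "compact":
        # Exhaust each node in order before using the next one.
        chosen = [c for n in sorted(free) for c in free[n]][:count]
    elif affinity == "spread":
        # Take one CPU from each node in turn.
        queues = [list(free[n]) for n in sorted(free)]
        chosen = []
        while len(chosen) < count and any(queues):
            for queue in queues:
                if queue and len(chosen) < count:
                    chosen.append(queue.pop(0))
    else:
        raise ValueError(f"unsupported affinity: {affinity}")

    if len(chosen) < count:
        raise ValueError(f"only {len(chosen)} of {count} requested CPUs are available")
    return chosen


print(pick_cpus(4, NODES, AVAILABLE, "compact"))
print(pick_cpus(4, NODES, AVAILABLE, "spread"))
```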
For detailed information about CPU and NUMA topology support, see CPU_NUMA_TOPOLOGY.md.
- Device Tree Specification: https://devicetree-specification.readthedocs.io/
- libfdt Documentation: https://git.kernel.org/pub/scm/utils/dtc/dtc.git/tree/Documentation
