§2

Architecture

The perceptive-motor cycle, warm-up calibration, graduated response system, quarantine modes, habituation mechanisms, and key engineering design decisions.

Based on Whitepaper v2.1 — Section 5 · ~20 min read

1. Architectural Principles

The design of HOSA is governed by five non-negotiable principles. These are not aspirational guidelines — they are hard constraints that every design decision must satisfy.

# | Principle | Description
1 | Local Autonomy | HOSA must execute its complete detection-and-mitigation cycle without dependency on network, external APIs, or human intervention for its primary function.
2 | Zero External Runtime Dependencies | The agent does not depend on external services (TSDB, message brokers, cloud APIs) to operate. All dependencies are internal to the binary or the host kernel. Communication with external systems is opportunistic: performed when available, never required.
3 | Predictable Computational Footprint | CPU and memory consumption must be constant and predictable — O(1) in memory, configurable and bounded CPU percentage. The agent must not become the cause of the problem it aims to solve.
4 | Graduated Response | Mitigation is not binary. HOSA implements a spectrum of responses proportional to the severity and rate of change of the anomaly, from light priority adjustment to complete network isolation.
5 | Decision Observability | Every autonomous action is logged locally with its mathematical justification — DM value, derivative, threshold crossed, action executed. The agent is fully auditable.
On Principle 3 — Self-Containment

HOSA itself operates within a dedicated cgroup v2 with hard limits on memory.max and cpu.max. If the agent exceeds its own resource limits, the kernel contains it before it affects the system. HOSA practices what it preaches.

2. The Perceptive-Motor Cycle

HOSA operates in a continuous cycle with three functional layers, inspired by the biological separation between sensory system, nervous system, and motor system. These layers map directly to the reflex arc pattern described in §1 — Core Concepts.

Figure 1 The three-layer perceptive-motor cycle. In kernel space (eBPF): sensory probes (tracepoints, kprobes, PSI) and actuators (XDP, cgroup controllers). Events flow up to user space through the ring buffer; actuation commands flow back down through BPF maps. In user space (Go): the predictive cortex (Welford → DM → EWMA → derivatives → decision) and opportunistic communications (webhooks, metrics, audit log).

2.1. Sensory Layer — eBPF in Kernel Space

The sensory layer collects system state via eBPF probes attached directly to kernel tracepoints and kprobes. This is fundamentally different from the polling model used by traditional monitoring agents:

Traditional Agent

  • Reads /proc files periodically
  • Parses text output, converts to numbers
  • Interval: 10–60 seconds
  • Misses transient events between polls
  • Each read involves syscalls and context switches

HOSA eBPF Probes

  • Attached to kernel tracepoints at load time
  • Receives structured data as events fire
  • Continuous — every relevant kernel event captured
  • No transient events missed
  • Data flows via ring buffer with μs latency

The probes collect data across five resource dimensions:

Dimension | Source | Variables
CPU | Tracepoints: sched_switch, sched_process_fork | Utilization (aggregate + per-core), context switches, run queue depth
Memory | Tracepoints: mm_page_alloc, mm_page_free; PSI hooks | Usage, pressure (PSI some/full), swap activity, page faults
I/O | Tracepoints: block_rq_issue, block_rq_complete | Throughput (IOPS), latency, queue depth
Network | Tracepoints: net_dev_xmit, netif_receive_skb | Packet rate (rx/tx), byte rate, connection count
Scheduler | Tracepoints: sched_wakeup, sched_stat_runtime | Run queue depth, scheduling latency

These variables compose the state vector x(t) ∈ ℝⁿ that feeds the mathematical engine. The dimensionality n is determined automatically during the warm-up phase based on hardware topology (see §3).

2.2. Cortex Layer — Mathematical Engine

The cortex is the decision-making core of HOSA. It executes a nine-step pipeline on every sample received from the sensory layer:

  1. Receive events from the eBPF ring buffer
  2. Update state vector x(t) with current values
  3. Update μ and Σ incrementally via the Welford algorithm [1] — O(n²) per sample with O(1) memory allocation
  4. Calculate DM(x(t)) — the Mahalanobis Distance from the baseline profile (see §3 — Math Model for full derivation)
  5. Apply EWMA smoothing → D̄M(t) to suppress noise before differentiation
  6. Calculate derivatives — dD̄M/dt (velocity of deviation) and d²D̄M/dt² (acceleration of deviation)
  7. Evaluate against adaptive thresholds — θ₁, θ₂, θ₃, θ₄ calibrated during warm-up as multiples of the baseline standard deviation
  8. Determine response level (0–5) based on the combination of DM, its derivatives, and the Load Direction Index φ(t)
  9. Send actuation command via BPF maps back to kernel space

The entire pipeline executes in user space with zero heap allocation on the hot path. All matrix operations use pre-allocated slices; the Welford algorithm updates in-place; and the EWMA filter operates on scalar registers. This ensures that the mathematical engine does not trigger garbage collection during critical decision windows.
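Steps 3 and 4 of the pipeline can be sketched in Go. For brevity this sketch keeps only the diagonal of Σ (per-dimension variance), which reduces the Mahalanobis distance to a normalized Euclidean distance; the real engine maintains the full covariance matrix and its inverse. All names are illustrative, not from the HOSA codebase:

```go
package main

import "math"

// welfordState holds incremental per-dimension mean and variance (Welford [1]).
type welfordState struct {
	n    float64
	mean []float64
	m2   []float64 // running sum of squared deviations per dimension
}

func newWelfordState(dims int) *welfordState {
	return &welfordState{mean: make([]float64, dims), m2: make([]float64, dims)}
}

// update folds one sample x(t) into the running statistics in place:
// no heap allocation on the hot path.
func (w *welfordState) update(x []float64) {
	w.n++
	for j := range x {
		delta := x[j] - w.mean[j]
		w.mean[j] += delta / w.n
		w.m2[j] += delta * (x[j] - w.mean[j])
	}
}

// mahalanobis returns D_M(x) under the diagonal-covariance simplification:
// sqrt of the sum over j of (x_j - mean_j)^2 / variance_j.
func (w *welfordState) mahalanobis(x []float64) float64 {
	if w.n < 2 {
		return 0
	}
	var sum float64
	for j := range x {
		variance := w.m2[j] / (w.n - 1)
		if variance < 1e-12 {
			continue // degenerate dimension: no baseline spread yet
		}
		d := x[j] - w.mean[j]
		sum += d * d / variance
	}
	return math.Sqrt(sum)
}
```

A point near the baseline mean scores near zero; a point several baseline standard deviations away scores high, regardless of the raw units of each dimension.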

Latency Budget

The target latency for the complete cortex pipeline (steps 1–9) is <500μs for a state vector of n=10 dimensions on commodity hardware. The dominant cost is the matrix-vector multiplication in step 4 (O(n²)), which for n=10 involves 100 floating-point operations — trivial on modern CPUs. The ring buffer read (step 1) and BPF map write (step 9) contribute ~1–10μs each.
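The smoothing-before-differentiation discipline (steps 5 and 6) can be sketched as a small EWMA filter with finite-difference derivatives. The alpha and dt values here are placeholders for the per-resource values HOSA calibrates during warm-up:

```go
package main

// ewmaDiff smooths a raw D_M stream and estimates first and second derivatives
// by finite differences over the smoothed series.
type ewmaDiff struct {
	alpha, dt   float64
	smoothed    float64
	prevVel     float64
	initialized bool
}

// step consumes one raw D_M sample and returns the smoothed value plus the
// estimated velocity and acceleration of deviation.
func (e *ewmaDiff) step(dm float64) (smoothed, vel, acc float64) {
	if !e.initialized {
		e.smoothed, e.initialized = dm, true
		return dm, 0, 0
	}
	prev := e.smoothed
	e.smoothed = e.alpha*dm + (1-e.alpha)*e.smoothed
	vel = (e.smoothed - prev) / e.dt
	acc = (vel - e.prevVel) / e.dt
	e.prevVel = vel
	return e.smoothed, vel, acc
}
```

Differentiating the smoothed series rather than the raw one is what keeps the second derivative usable as an escalation signal.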

2.3. Motor Layer — Actuation

The motor layer translates response-level decisions into concrete kernel actions via two primary mechanisms:

Mechanism | Interface | Actions | Latency
cgroups v2 | Direct file writes to /sys/fs/cgroup/ | memory.high (apply memory backpressure); cpu.max (throttle CPU bandwidth); cgroup.freeze (freeze process group) | ~10–100μs
XDP | eBPF programs at network driver level | XDP_DROP (drop packets before stack processing); selective filtering (new connections dropped, existing preserved); healthcheck traffic always allowed | ~1–5μs per packet

A critical design distinction: HOSA's primary containment mechanism is throttling, not killing. Setting memory.high instructs the kernel to apply reclaim pressure on the target cgroup, slowing memory allocation without terminating the process. This preserves in-flight transactions — the process is degraded but alive.

2.4. On Kernel↔User Space Transition

The HOSA execution model involves transition between kernel space (eBPF collection and actuation) and user space (mathematical computation). This transition uses the eBPF ring buffer mechanism and BPF maps, with typical latency on the order of 1–10μs on modern hardware.

Terminology Clarification

The correct characterization of this model is "zero external runtime dependencies" — HOSA does not depend on processes, services, or infrastructure external to the agent binary and the host kernel. The kernel↔user transition is internal to the agent. An earlier version of the whitepaper incorrectly described this as "zero context switch," which has been corrected.

3. Warm-Up and Proprioceptive Calibration

Upon starting, HOSA executes a calibration phase termed Hardware Proprioception — a term borrowed from the biological sense by which an organism perceives its own body configuration. During this phase, HOSA learns both the hardware topology and the behavioral baseline of the node.

  1. Topological discovery. By reading /sys/devices/system/node/ and /sys/devices/system/cpu/, the agent identifies NUMA topology, physical and logical core counts, L1/L2/L3 cache sizes, and memory configuration.
  2. State vector definition. Based on topology, HOSA determines which variables to include in x(t) and their respective eBPF sources. A single-socket server with NVMe storage produces a different vector than a dual-socket machine with spinning disks.
  3. Baseline accumulation. During a configurable period (default: 5 minutes), the agent collects samples without executing mitigation, accumulating the initial μ₀ and Σ₀ via Welford incremental updates. This is the node's baseline profile.
  4. EWMA calibration. The smoothing factor α is calibrated for each resource based on the variance observed during warm-up. Higher-variance signals receive lower α (more smoothing) to prevent false derivatives.
  5. Adaptive threshold definition. The thresholds θ₁ through θ₄ for each response level are calculated as multiples of the standard deviation observed in the baseline regime (e.g., Level 1 = 2σ, Level 3 = 4σ).

After warm-up, μ and Σ continue to be updated incrementally, allowing the baseline profile to evolve with legitimate workload changes (see §5 — Habituation).

Cold Start Vulnerability

During the warm-up period, the agent does not have a sufficient baseline profile for reliable detection. In this interval, HOSA operates in conservative mode — logging only, no mitigation. This constitutes a known vulnerability window and is documented as a limitation in the whitepaper (§9.2). The duration is configurable and can be reduced if the node has a pre-computed baseline from a previous execution.

4. Graduated Response System

The graduated response is one of HOSA's most critical architectural decisions. Rather than implementing a binary switch (healthy → kill), the system defines a spectrum of six proportional response levels, each with specific activation conditions, actions, and reversibility guarantees.

4.1. Response Levels 0–5

Table 1 Complete specification of graduated response levels with activation conditions.
Level | Name | Activation Condition | Action | Reversibility
0 | Homeostasis | DM < θ₁ and dDM/dt ≤ 0 | None. Suppress redundant telemetry (Thalamic Filter). Heartbeat only. | N/A
1 | Vigilance | DM > θ₁ or sustained dDM/dt > 0 | Increase sampling rate (100ms → 10ms). Local logging. No system intervention. | Automatic — returns to L0 when condition ceases
2 | Soft Containment | DM > θ₂ and dDM/dt > 0 | renice non-essential processes via cgroups. Webhook notification (opportunistic). | Automatic — gradual renice relaxation
3 | Active Containment | DM > θ₃ and d²DM/dt² > 0 (positive acceleration) | CPU/memory throttling via cgroups on identified contributors. Partial load shedding via XDP (drop new connections, preserve existing). Urgent webhook. | Automatic with hysteresis — relaxation when DM < θ₂ for a sustained period
4 | Severe Containment | DM > θ₄, or convergence velocity indicates exhaustion within T seconds | Aggressive throttling. XDP blocks all inbound traffic except orchestrator healthchecks. Freeze non-critical cgroups. | Requires sustained DM reduction below θ₃ for an extended period
5 | Quarantine | Containment failure at prior levels; DM in uncontrolled ascent despite active mitigations | Network isolation. Non-essential processes frozen (SIGSTOP). Detailed log to persistent storage. Final webhook with quarantine state. | Manual — requires administrative intervention to restore

The key insight in the graduated response design is the use of both the value and the derivatives of DM for level determination. Level 3 requires not just a high DM, but positive acceleration — the system is not merely stressed but accelerating toward collapse. This prevents aggressive response during stable-but-elevated workloads (which are handled by the habituation mechanism instead).
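The decision logic of Table 1 can be approximated by a cascade of threshold-and-derivative checks. This sketch omits the Load Direction Index φ(t) and the sustained-condition timers, and models the Level 5 containment-failure path as an explicit flag; it is an illustration, not the production decision function:

```go
package main

// determineLevel maps the smoothed D_M, its estimated velocity and
// acceleration, and the calibrated thresholds theta[0..3] (theta_1..theta_4)
// to a response level 0-5, approximating Table 1.
func determineLevel(dm, vel, acc float64, theta [4]float64, containmentFailed bool) int {
	switch {
	case containmentFailed:
		return 5 // quarantine: mitigation at lower levels is not working
	case dm > theta[3]:
		return 4 // severe containment
	case dm > theta[2] && acc > 0:
		return 3 // active containment: elevated and accelerating
	case dm > theta[1] && vel > 0:
		return 2 // soft containment: elevated and rising
	case dm > theta[0] || vel > 0:
		return 1 // vigilance
	default:
		return 0 // homeostasis
	}
}
```

Note how Level 3 fires only when acceleration is positive: a high but stable DM falls through to the milder branches, matching the stable-but-elevated case handled by habituation.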

4.2. Quarantine Modes by Environment Class

The autonomous quarantine (Level 5) involves network isolation of the compromised node. The feasibility and strategy of this isolation vary fundamentally by infrastructure class. HOSA implements differentiated quarantine modes, selected automatically during Hardware Proprioception or configured explicitly by the operator.

Environment | Detection | Quarantine Strategy | Recovery
Bare metal with IPMI | IPMI interface detection via /sys/class/net/ and ipmi_* kernel modules | Disable all network interfaces except the out-of-band management interface (IPMI/iLO/iDRAC). Node remains accessible via management console. | Manual via IPMI console
Cloud VM (AWS, GCP, Azure) | DMI/SMBIOS, metadata service (169.254.169.254), hypervisor detection | Does not disable interfaces. Instead: (1) XDP drops all traffic except metadata service, DHCP, and orchestrator endpoint; (2) signals quarantine via cloud-native mechanism (instance tag, SNS, healthcheck → HTTP 503); (3) orchestrator decides terminate/replace. | Orchestrator terminates and replaces instance. Optional self-termination via cloud API if the orchestrator doesn't act within 5 min (disabled by default).
Kubernetes | Container detection via /proc/1/cgroup, KUBERNETES_SERVICE_HOST | Does not isolate the host node (no permission). Instead: (1) maximum cgroup containment on offending pods; (2) applies taint hosa.io/quarantine=true:NoExecute and condition HOSAQuarantine=True via the K8s API, causing pod evacuation; (3) emits a Warning Event. | Operator removes the taint after investigation. Node returns to the scheduling pool.
Edge/IoT with physical access | Explicit operator configuration | Complete network interface deactivation. Device operates in isolated mode. Logs preserved on local flash/eMMC. LED or display visual signaling if available. | Manual — field technician accesses the device, collects logs, restores.
Edge/IoT without physical access | Explicit operator configuration | Network deactivation + hardware watchdog timer (default: 30 min). If no intervention occurs, the watchdog reboots the device with the quarantine_recovery=true flag. The agent enters conservative mode post-reboot (logging only) to allow remote diagnosis. | Automatic via watchdog reboot with an observation period
Air-gapped (SCADA/ICS) | Explicit operator configuration | Identical to bare metal, with all opportunistic communication permanently disabled. Logs written exclusively to encrypted local storage, collected periodically by authorized personnel. | Manual via authorized physical access
Design Principle — Automatic Detection with Manual Override

HOSA attempts to automatically detect the environment class and select the appropriate quarantine mode. The operator can override this detection via explicit configuration. In case of ambiguity (e.g., a private cloud VM that doesn't respond to the standard metadata service), HOSA assumes the most conservative mode (cloud VM — does not disable interfaces), prioritizing recoverability over isolation.

4.3. Escalation and Hysteresis

Transitions between response levels are governed by two rules that prevent oscillation (flapping):

  • Escalation requires sustained condition. The activation condition for a higher level must be met for a minimum sustained period (configurable, default varies by level) before escalation occurs. HOSA does not jump from Level 0 to Level 4 in a single cycle.
  • De-escalation requires hysteresis. Returning to a lower level requires the condition for the lower level (not just the absence of the higher-level condition) to be sustained. For example, dropping from Level 3 to Level 2 requires DM < θ₂ (not just DM < θ₃) for a sustained period. This prevents rapid oscillation at threshold boundaries.

The combination of derivative-based escalation and hysteresis-based de-escalation produces a system that is fast to escalate (responds to acceleration, not just magnitude) but slow to de-escalate (requires confirmed recovery, not just momentary improvement).

5. Habituation — Adapting to the New Baseline

A recurring problem in anomaly detection systems is chronic false positives: when the legitimate workload changes permanently (e.g., deployment of a new application version that consumes more memory), the detector continues signaling anomaly indefinitely.

HOSA implements a habituation mechanism inspired by neuroplasticity:

  1. If DM remains elevated but stable (derivative near zero) for a configurable period without any real failure (no OOM, no timeout, no process crash);
  2. And the covariance structure is preserved (the deformation ratio ρ(t) is below threshold — resources still correlate in the same proportions, just at higher magnitude);
  3. And no indicators of compromise are present (syscall entropy ΔH and propagation index ICP below thresholds);
  4. Then HOSA recalibrates μ and Σ with increasing weight on recent samples, effectively shifting the baseline profile to the new operational regime.

This is implemented via exponential decay of weights in the Welford algorithm, assigning lower influence to older samples and allowing Σ to reflect the contemporary covariance of the system.
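The decay-weighted update can be sketched per dimension (diagonal-only, as in the earlier sketches). λ ∈ (0,1) controls how quickly old samples lose influence; the names and formulation are illustrative, following the standard exponentially weighted mean/variance recurrence rather than HOSA's exact implementation:

```go
package main

// ewWelford keeps an exponentially weighted per-dimension mean and variance:
// older samples decay by factor lambda, so the baseline drifts toward the
// contemporary regime once habituation is permitted.
type ewWelford struct {
	lambda float64
	w      float64 // total decayed weight, converges to 1/(1-lambda)
	mean   []float64
	s      []float64 // decayed sum of squared deviations
}

func newEWWelford(lambda float64, dims int) *ewWelford {
	return &ewWelford{lambda: lambda, mean: make([]float64, dims), s: make([]float64, dims)}
}

// update folds one sample in, giving recent samples exponentially more weight.
func (e *ewWelford) update(x []float64) {
	e.w = e.lambda*e.w + 1
	for j := range x {
		delta := x[j] - e.mean[j]
		e.mean[j] += delta / e.w
		e.s[j] = e.lambda*e.s[j] + delta*(x[j]-e.mean[j])
	}
}

// variance returns the decay-weighted variance estimate for dimension j.
func (e *ewWelford) variance(j int) float64 {
	if e.w <= 1 {
		return 0
	}
	return e.s[j] / e.w
}
```

With λ near 1 the effective memory is roughly 1/(1-λ) samples, so the operator can tune how fast a new regime is absorbed.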

Safeguards Against Premature Habituation

Habituation is blocked when: (a) stabilization occurs near the physical safety limit of a resource (e.g., memory > 90% — stabilizing at 92% is not a safe "new normal"); (b) the covariance deformation ratio ρ(t) exceeds threshold (structural change, not just magnitude change — potentially adversarial); (c) the propagation index ICP is elevated (viral behavior); (d) the derivative remains positive sustained (progressive failure, not stable plateau). The formal pre-condition is documented in the whitepaper §6.12.

6. Selectivity Policy — The Throttling Problem

While throttling via cgroups is an effective mitigation against resource exhaustion, it introduces secondary risks that must be explicitly addressed:

  • Cascading timeouts. A throttled HTTP backend can cause connection accumulation upstream, propagating degradation to healthy services.
  • Transaction deadlocks. A process throttled during a database transaction may hold locks indefinitely, blocking other processes.
  • Critical component starvation. If the Kubernetes kubelet is throttled, the node is marked NotReady and all pods are evacuated — potentially causing more damage than the original problem.

HOSA addresses these risks through a safelist — a protected list of processes and cgroups that are never targets of throttling:

  • Kernel processes (kthreadd, ksoftirqd, etc.)
  • The HOSA agent itself
  • Orchestration agents (kubelet, containerd, dockerd) when detected
  • Processes explicitly marked by the operator via configuration or cgroup label

Throttling is applied preferentially to the processes identified as greatest contributors to the anomaly, determined by the decomposition of x(t) — the processes whose resource consumption most contributes to the dimensions where DM diverges from baseline. This dimensional contribution analysis (cⱼ decomposition) is documented in §3 — Math Model.
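Under a diagonal covariance the contribution decomposition has a particularly simple form: each dimension's share of DM² is its squared, variance-normalized deviation. The full cⱼ decomposition in §3 accounts for cross-correlations; this independent-dimensions sketch is for illustration only:

```go
package main

// contributions returns each dimension's share of the squared Mahalanobis
// distance under a diagonal covariance: c_j = (x_j - mean_j)^2 / variance_j.
// The dimension with the largest share points at the resource — and, via
// per-process accounting, the processes — driving the anomaly.
func contributions(x, mean, variance []float64) []float64 {
	c := make([]float64, len(x))
	for j := range x {
		if variance[j] < 1e-12 {
			continue // degenerate dimension: no baseline spread, skip
		}
		d := x[j] - mean[j]
		c[j] = d * d / variance[j]
	}
	return c
}
```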

7. Project Structure

The codebase follows a layered organization that mirrors the biological metaphor:

hosa/
├── cmd/hosa/
│   └── main.go                # Entry point — agent initialization
├── internal/
│   ├── sysbpf/
│   │   └── syscall.go         # Custom eBPF loader via native syscalls
│   ├── linalg/
│   │   └── matrix.go          # Linear algebra (matrices, inversion, covariance)
│   ├── syscgroup/
│   │   └── file_edit.go       # Direct cgroup file manipulation via VFS
│   ├── bpf/
│   │   ├── sensors.c          # eBPF C code injected into the kernel
│   │   └── bpf_bpfeb.go       # Auto-generated Go↔C bridge (cilium/ebpf)
│   ├── sensor/                # ── The Sensory System
│   │   └── collector.go       # Reads eBPF maps → state vector x(t)
│   ├── brain/                 # ── The Predictive Cortex
│   │   ├── matrix.go          # Covariance matrix management (Welford)
│   │   ├── mahalanobis.go     # D_M calculation
│   │   └── predictor.go       # EWMA, derivatives, level determination
│   ├── motor/                 # ── The Reflex Arc (Actuators)
│   │   ├── cgroups.go         # Process throttling via cgroups v2
│   │   └── signals.go         # Process signaling (SIGSTOP/SIGCONT)
│   └── state/                 # ── The Limbic System
│       └── memory.go          # Short-term ring buffer for baseline
├── docs/
│   ├── whitepaper.pdf         # Full academic whitepaper v2.1
│   └── *.html                 # Documentation pages
└── Makefile                   # make build → compiles eBPF C + Go

The naming convention deliberately uses biological terms (brain/, sensor/, motor/, state/) to maintain the conceptual mapping between architecture and metaphor throughout the codebase.

8. Key Design Decisions

Decision | Rationale
Mahalanobis over ML/DL | O(n²) compute, constant memory, no GPU, no training pipeline; runs on a Raspberry Pi with 512MB RAM. Produces interpretable results (dimensional contributions cⱼ). Full rationale in §3 — Math Model.
Welford incremental updates | O(n²) per sample with O(1) memory allocation. No data windows stored. Predictable footprint regardless of uptime. For n=10 variables, the entire statistical state occupies <2KB.
EWMA before derivatives | Numerical differentiation of discrete, noisy data is an ill-posed problem in the sense of Hadamard [2]. The second derivative amplifies noise quadratically. EWMA smoothing before differentiation is mandatory for stable derivative estimates.
Go for user space | Pragmatic choice for research velocity. Go 1.22+ GC pauses are sub-millisecond. The hot path uses zero-allocation patterns (sync.Pool, pre-allocated slices, GOGC=off during critical cycles). The cilium/ebpf library provides a mature eBPF ecosystem. If GC pauses prove problematic in benchmarks, the hot path can be migrated to C via CGo.
Throttle, not kill | The OOM-Killer already exists and is destructive. HOSA's value proposition is preventing the need for kills by applying graduated pressure early. memory.high backpressure preserves in-flight transactions.
Complement, not replace | HOSA is the reflex arc; Prometheus/Datadog/Kubernetes are the cerebral cortex. Different temporal scales, different decision scopes. HOSA keeps the node alive during the Lethal Interval; the orchestrator handles strategic decisions afterward.

9. Self-Protection Mechanisms

A legitimate concern for any autonomous agent running with kernel privileges is: can the agent itself become the cause of the problem? HOSA addresses this through multiple layers of self-protection:

  1. Self-contained footprint. HOSA operates within its own cgroup v2 with hard limits on memory.max and cpu.max. If the agent exceeds its own limits, the kernel constrains it before it affects the system.
  2. Safelist self-inclusion. HOSA is the first entry in its own safelist — it never throttles itself. Kernel processes and orchestration agents are also protected by default.
  3. Reversible mitigation. Levels 0–4 are automatically reversible. No destructive action (process kill, interface deactivation) is executed below Level 5.
  4. Escalation hysteresis. Level transitions require sustained conditions, preventing oscillation. The agent cannot jump from Level 0 to Level 4 in a single cycle.
  5. Dry-run mode. The agent can be executed in observation mode (logging and decision calculation without action execution), allowing validation of decision quality before enabling actuation.
  6. Deterministic compilation. The binary is compiled statically with no dynamic dependencies. No risk of failure due to absent or incompatible shared libraries.
  7. eBPF verifier as safety net. All eBPF programs are validated by the kernel's eBPF verifier before loading. A bug in the eBPF C code causes the program to be rejected at load time (fail-safe), not at runtime.
On the Impossibility of Total Risk Elimination

The total elimination of risk is impossible for any software that executes with kernel-space privileges. Bugs in user space can cause incorrect decisions. The mitigation is: extensive testing, dry-run mode, and the recognition that an agent that makes an incorrect throttling decision (effect: temporary latency increase) is categorically less destructive than the complete absence of mitigation (effect: OOM-Kill, crash, data loss).

10. References

  1. Welford, B. P. (1962). Note on a Method for Calculating Corrected Sums of Squares and Products. Technometrics, 4(3), 419–420.
  2. Hadamard, J. (1902). Sur les problèmes aux dérivées partielles et leur signification physique. Princeton University Bulletin, 13, 49–52.
  3. Heo, T. (2015). Control Group v2. Linux Kernel Documentation. kernel.org
  4. Gregg, B. (2019). BPF Performance Tools: Linux System and Application Observability. Addison-Wesley Professional.
  5. Vieira, M. A., et al. (2020). Fast Packet Processing with eBPF and XDP: Concepts, Code, Challenges, and Applications. ACM Computing Surveys, 53(1), Article 16.
  6. Horn, P. (2001). Autonomic Computing: IBM's Perspective on the State of Information Technology. IBM Corporation.
  7. Hellerstein, J. L., Diao, Y., Parekh, S., & Tilbury, D. M. (2004). Feedback Control of Computing Systems. John Wiley & Sons.