1. The Classification Problem
The efficacy of an anomaly detection system depends fundamentally on its capacity to distinguish between legitimate variation and pathological deterioration. A detector that treats every deviation as a threat generates alert fatigue through false positives. A detector that is too tolerant allows sophisticated attacks to operate below the perception threshold.
The challenge is compounded by the fact that, from a purely metric perspective, radically different scenarios can produce superficially identical signatures. CPU at 85% can mean:
- A normal day for a video rendering server
- A predictable Black Friday seasonal peak
- The first milliseconds of a volumetric DDoS attack
- A silent cryptominer consuming idle cycles
The isolated metric is identical. What differentiates these scenarios is the multivariate structure of the stress — how variables correlate with each other — and the temporal dynamics — how that correlation evolves over time. This is precisely what the Mahalanobis Distance and its derivatives capture (see §3 — Mathematical Model).
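To make the point concrete, the sketch below computes a two-variable Mahalanobis distance for the same CPU reading under two different network readings. The baseline mean, the covariance values, and the function name `mahalanobis_2d` are hypothetical and chosen only for illustration; the full model is the one referenced in §3.

```python
import math

def mahalanobis_2d(x, mu, cov):
    """Mahalanobis distance for a 2-variable state, using the closed-form 2x2 inverse."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    dx0, dx1 = x[0] - mu[0], x[1] - mu[1]
    # Quadratic form d^T * inv(cov) * d, with inv(cov) = (1/det) * [[d, -b], [-c, a]]
    q = (d * dx0 * dx0 - (b + c) * dx0 * dx1 + a * dx1 * dx1) / det
    return math.sqrt(q)

# Hypothetical baseline: CPU (%) and network throughput (MB/s) normally rise together.
mu = (50.0, 100.0)
cov = ((100.0, 180.0), (180.0, 400.0))  # std 10 and 20, correlation 0.9

coherent = (85.0, 170.0)    # CPU at 85%, network proportionally high
incoherent = (85.0, 100.0)  # CPU at 85%, network unchanged

dm_coherent = mahalanobis_2d(coherent, mu, cov)      # about 3.59
dm_incoherent = mahalanobis_2d(incoherent, mu, cov)  # about 8.03
```

Under these assumed numbers, the reading that breaks the usual CPU-network relationship lies more than twice as far from baseline, even though the CPU value is the same in both cases.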
Equally critically, the taxonomy must recognize that anomaly is not exclusively a phenomenon of excess. A server processing zero requests when it should be handling thousands is not in homeostasis — it is in anomalous silence, with financial, energetic, and security implications that the anomaly detection literature historically ignores.
2. The Bipolar Spectrum — Architecture
2.1. Organizing Principle
HOSA models operational regimes as a continuous bipolar spectrum centered on homeostasis, where:
- The sign of the index indicates the direction of the deviation relative to the baseline profile
- The magnitude of the index indicates the severity of the deviation
Regime 0 (homeostasis) constitutes the central reference point. Negative deviations represent states of sub-demand (the node operates below expectations). Positive deviations represent states of over-demand or pathology (the node operates above expectations or under abnormal conditions).
2.2. Design Justification
The spectral organization solves three problems that ad-hoc taxonomies introduce:
- Conceptual symmetry. Biological homeostasis is inherently bidirectional: hypothermia and hyperthermia are both pathologies, with basal temperature as the central reference. HOSA treats sub-demand and over-demand as symmetric deviations, not as ontologically distinct categories.
- Numerical continuity. The integer regime index reflects a natural severity ordering in each semi-axis. Transitions between adjacent regimes are smooth and auditable (e.g., −1 → −2 when legitimate idleness reveals itself as structural; +3 → +4 when adversarial activity causes localized failure).
- Mathematical uniformity. The same primary metric (DM) and the same Load Direction Index (φ) position any observed state on the spectrum. The sign of φ determines the semi-axis; the combination of DM, its derivatives, and supplementary metrics determines the position within the semi-axis.
2.3. Directionality — The Load Direction Index (φ)
The Mahalanobis Distance is inherently non-directional — it measures how far the state has departed from baseline, but not in which direction. To position the system on the bipolar spectrum, HOSA computes the Load Direction Index:
Where:
- dj(t) = xj(t) − μj — deviation of the j-th variable from baseline
- σj — baseline standard deviation of the j-th variable
- sj ∈ {+1, −1} — the load sign (+1 if increase = more load, −1 if increase = less load)
| Value of φ(t) | Meaning | Spectrum Position |
|---|---|---|
| φ ≈ 0 | System near baseline | Regime 0 (Homeostasis) |
| φ > 0 | Deviation toward overload | Positive semi-axis (+1 to +5) |
| φ < 0 | Deviation toward idleness | Negative semi-axis (−1 to −3) |
Together, DM measures the magnitude of deviation and φ indicates the direction. They position the system in a two-dimensional space (magnitude × direction) that maps directly to the regime spectrum.
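As a minimal sketch, φ can be computed from the quantities defined above if one assumes it is the mean of the load-signed standardized deviations. The 1/n normalization, the function name, and the example variables are assumptions made here for illustration.

```python
def load_direction_index(x, mu, sigma, load_sign):
    """Mean of s_j * (x_j - mu_j) / sigma_j over all variables (assumed normalization)."""
    if not (len(x) == len(mu) == len(sigma) == len(load_sign)):
        raise ValueError("all inputs must have the same length")
    terms = [s * (xj - mj) / sj for xj, mj, sj, s in zip(x, mu, sigma, load_sign)]
    return sum(terms) / len(terms)

# Hypothetical variables: CPU utilization %, memory used %, CPU idle %.
# An increase in idle % means less load, so its load sign is -1.
mu = (50.0, 60.0, 50.0)
sigma = (10.0, 5.0, 10.0)
load_sign = (+1, +1, -1)

phi_over = load_direction_index((70.0, 70.0, 30.0), mu, sigma, load_sign)  # positive
phi_idle = load_direction_index((20.0, 50.0, 80.0), mu, sigma, load_sign)  # negative
```

The load sign is what lets a variable such as idle percentage contribute in the right direction: its rise pushes φ negative even though the raw deviation is positive.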
3. Regime 0 — Homeostasis
The steady-state normal operation. Resource variables fluctuate within a predictable range, reflecting the ordinary activity of hosted applications.
| Indicator | Behavior |
|---|---|
| DM(t) | Low and stable, fluctuating near the origin. Typically DM < θ₁. |
| φ(t) | Oscillates around zero. No sustained directional tendency. |
| dD̄M/dt | Oscillates around zero. No sustained trend. |
| d²D̄M/dt² | Low-amplitude stationary noise. |
| Σ | Stable. Correlations between variables are consistent over time. |
HOSA behavior:
- Response Level 0 — no intervention.
- Thalamic Filter active — redundant telemetry suppressed, heartbeat only (see §4 — Thalamic Filter).
- Baseline refinement — μ and Σ updated incrementally via Welford.
This is the reference state. All anomaly detection is measured as deviation from this regime.
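The incremental baseline refinement mentioned above can be sketched with the multivariate form of Welford's online update. The class and method names are hypothetical; only the update rule itself is the standard algorithm.

```python
class WelfordBaseline:
    """Incrementally tracks the mean vector and covariance matrix of n variables."""

    def __init__(self, n):
        self.n = n
        self.count = 0
        self.mean = [0.0] * n
        # Running sum of co-deviation products; covariance = m2 / (count - 1).
        self.m2 = [[0.0] * n for _ in range(n)]

    def update(self, x):
        self.count += 1
        # Deviation from the mean before this sample is absorbed.
        delta_before = [x[i] - self.mean[i] for i in range(self.n)]
        for i in range(self.n):
            self.mean[i] += delta_before[i] / self.count
        # Deviation from the mean after this sample is absorbed.
        delta_after = [x[i] - self.mean[i] for i in range(self.n)]
        for i in range(self.n):
            for j in range(self.n):
                self.m2[i][j] += delta_before[i] * delta_after[j]

    def covariance(self):
        if self.count < 2:
            raise ValueError("need at least two samples")
        return [[v / (self.count - 1) for v in row] for row in self.m2]

baseline = WelfordBaseline(2)
for sample in [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]:
    baseline.update(sample)
```

No sample history is stored: each update touches only the running mean and the n × n accumulator, which is why this style of update suits an agent that refines its baseline continuously.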
4. Sub-Demand Regimes (Negative Semi-Axis)
The inclusion of the negative semi-axis addresses a systematic gap in the anomaly detection literature, which focuses almost exclusively on anomalies of excess. A server processing zero requests when it should be handling thousands is not in homeostasis — it is in anomalous silence, with costs that are financial, energetic, and potentially indicative of security compromise.
4.1. Regime −1 — Legitimate Idleness
Definition: Demand reduction consistent with temporal or operational context. Resource consumption is below the global baseline but coherent with the expected baseline for the current temporal window.
Examples: Overnight for a corporate web application; weekends for an ERP server; scheduled upstream maintenance.
| Indicator | Behavior |
|---|---|
| DM vs. global | Moderate elevation |
| DM vs. temporal profile | Low (coherent with the segment baseline) |
| φ(t) | Negative, moderate |
| dφ/dt | Gradual transition |
| ρ(t) | Low — correlations preserved proportionally |
| Temporal coherence | Yes — period corresponds to a historically low-activity window |
HOSA behavior:
- Response Level 0 — the idleness is expected.
- Thalamic Filter maximally active — heartbeat only.
- FinOps reporting — idle hours logged, estimated cost recorded.
- GreenOps optimization — CPU frequency reduction via scaling governor, sampling interval increase, network polling reduction. All instantly reversible if φ begins to rise.
Habituation: Incorporated into seasonal profiles — each temporal segment accumulates its own baseline (see §5 — Seasonal Profiles).
4.2. Regime −2 — Structural Idleness
Definition: The node is permanently over-provisioned relative to actual demand. No temporal window shows full resource utilization.
Examples: Instance provisioned on incorrect capacity estimates; legacy server that lost operational relevance but was never decommissioned; infrastructure provisioned for projected peaks that never materialized.
| Indicator | Behavior |
|---|---|
| DM(t) | Chronically low — system rarely departs from low-consumption region |
| φ(t) | Persistently negative — across all temporal windows, including expected peaks |
| dD̄M/dt | ≈ 0 (stable at low plateau) |
| Seasonal profiles | No significant variation between peak and valley windows |
| IPE | Near 1 — Excess Provisioning Index indicates massive over-provisioning |
The Excess Provisioning Index (IPE) is dedicated to this regime:
An IPE close to 1 indicates that even during peak-activity windows, the node utilizes a minimal fraction of its provisioned capacity.
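One way to read this description is IPE = 1 − (highest peak utilization fraction across resources). The sketch below uses that assumed form. The function name, the resource keys, and the choice to take the most-utilized resource as the binding one are illustrative assumptions, not a confirmed definition.

```python
def excess_provisioning_index(peak_usage, provisioned):
    """1 minus the highest peak utilization fraction across resources (assumed form)."""
    fractions = []
    for resource, capacity in provisioned.items():
        if capacity <= 0:
            raise ValueError(f"capacity for {resource} must be positive")
        # Clamp at 1.0 so a saturated resource yields an index of 0, never negative.
        fractions.append(min(peak_usage[resource] / capacity, 1.0))
    return 1.0 - max(fractions)

# Hypothetical node: 16 cores and 64 GiB provisioned; the peaks are the highest
# values observed across all temporal windows.
provisioned = {"cpu_cores": 16.0, "memory_gib": 64.0}
peak_usage = {"cpu_cores": 1.6, "memory_gib": 4.8}

ipe = excess_provisioning_index(peak_usage, provisioned)  # 0.9
```

Using the most-utilized resource keeps the index conservative: a node is only reported as heavily over-provisioned when every resource is far from its capacity at peak.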
HOSA behavior:
- Response Level 0 — no operational risk.
- Critical FinOps signaling — generates a right-sizing report with: IPE value, per-resource utilization vs. provisioned capacity, suggested smaller instance type (when cloud catalog is configured), and estimated annual savings.
- Exposes a `hosa.io/structurally-idle=true` annotation for Kubernetes cluster autoscaler consideration.
Habituation: Permitted (with persistent FinOps flag). HOSA habituates for detection purposes but continues reporting the waste.
4.3. Regime −3 — Anomalous Silence
Definition: Abrupt or gradual activity drop incompatible with the expected temporal context. The node should be active and is not.
Examples: Traffic redirected by DNS hijack; silent load balancer failure stopping request forwarding; application process dead without restart; attacker who crashed the service before installing a payload.
| Indicator | Behavior |
|---|---|
| DM(t) | Abrupt elevation (even though load is dropping — deviation from baseline is large) |
| φ(t) | Strongly negative, with rapid transition |
| dD̄M/dt | Abrupt positive spike (DM rising rapidly) |
| ρ(t) | Potentially high — disproportionate drop across resources (network at zero but CPU remains elevated → residual processes, malware, or loop) |
| Temporal coherence | No — drop occurs during a period that should be active |
The critical distinction between Regime −1 and −3 is temporal coherence. A drop in activity at 03:00 is coherent with an overnight profile (Regime −1). A drop at 10:00 on a Tuesday, when the seasonal profile predicts peak load, is incoherent (Regime −3). When seasonal profiles are not yet calibrated, HOSA uses the velocity of transition as a provisional discriminant: an abrupt drop (high |dφ/dt|) is provisionally classified as −3, while a gradual drop is treated as potentially legitimate.
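The decision between Regime −1 and Regime −3 can be sketched as a small function. The parameter names (`velocity_limit`, `profiles_calibrated`) and the use of θ₁ as the coherence threshold against the temporal profile are assumptions for illustration. Regime −2 is deliberately left out of this sketch.

```python
def classify_activity_drop(dm_temporal, theta_1, dphi_dt, velocity_limit, profiles_calibrated):
    """Return -1 (legitimate idleness) or -3 (anomalous silence) for a negative-phi state."""
    if profiles_calibrated:
        # Temporal coherence: the drop is expected if DM against the
        # current segment's baseline stays below the threshold.
        return -1 if dm_temporal < theta_1 else -3
    # Cold start: no seasonal profile yet, so fall back to transition velocity.
    return -3 if abs(dphi_dt) > velocity_limit else -1

# Overnight drop that matches the segment baseline.
overnight = classify_activity_drop(0.8, 3.0, -0.1, 1.0, profiles_calibrated=True)
# Mid-morning drop that contradicts the segment baseline.
tuesday_morning = classify_activity_drop(6.5, 3.0, -2.5, 1.0, profiles_calibrated=True)
# Abrupt drop before profiles exist: provisionally treated as anomalous.
cold_start_abrupt = classify_activity_drop(0.0, 3.0, -2.5, 1.0, profiles_calibrated=False)
```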
HOSA behavior:
- Response Level 1–3, depending on speed and magnitude.
- Active investigation: process health checks, network interface verification, upstream health probes.
- Urgent webhook: "Node X reports activity significantly below expectations for the current temporal context."
- If accompanied by high ICP → reclassified as Regime +5 (Viral Propagation).
Habituation: Blocked. HOSA does not adapt to unexplained silence.
Anomalous Silence is, counterintuitively, one of HOSA's most valuable detection capabilities. Traditional monitors alert on excess. When a server stops receiving traffic and all metrics are "green" (CPU low, memory free, network quiet), the traditional monitor reports: "all healthy." HOSA, by modeling what should be happening (baseline profile), detects that the silence is anomalous and signals: "this node should be active and is not."
4.4. Consolidated Negative Signatures
| Indicator | −1 (Legitimate) | −2 (Structural) | −3 (Anomalous) |
|---|---|---|---|
| DM vs. global | Moderate | Low (chronic) | High (abrupt) |
| DM vs. temporal | Low (coherent) | Low in all windows | High (incoherent) |
| φ(t) | Negative moderate | Negative persistent | Strongly negative |
| dφ/dt | Gradual | ≈ 0 (stable) | Abrupt |
| ρ(t) | Low (preserved) | Low | Variable (possibly high) |
| Temporal coherence | Yes | Irrelevant (always idle) | No |
| IPE | Variable | Near 1 | Irrelevant |
5. Over-Demand Regimes (Positive Semi-Axis)
5.1. Regime +1 — Plateau Shift
Definition: A persistent, non-reverting elevation in resource consumption from legitimate workload change.
Examples: New application version deployed with higher memory footprint; additional microservice migrated to the node; organic user base growth; kernel or runtime update altering the consumption profile.
| Indicator | Behavior |
|---|---|
| DM(t) | Abrupt elevation followed by stabilization at a new plateau. Constant above θ₁. |
| φ(t) | Positive, stable after transient. |
| dD̄M/dt | Transient peak at change, then converges to zero. |
| ρ(t) | Low — covariance structure preserved. Ellipsoid scaled, not deformed. |
The critical discriminant: Derivative converging to zero while DM remains elevated, combined with preserved covariance structure (low ρ). This distinguishes +1 from +3 (where ρ is high) and +4 (where the derivative remains positive).
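A minimal sketch of this discriminant follows, assuming a short trailing window of DM samples and hypothetical tolerance parameters (`slope_tolerance`, `rho_threshold`, `window`). It checks the three conditions together: DM stays elevated, its slope has settled, and ρ is low.

```python
def is_plateau_shift(dm_series, rho, theta_1, rho_threshold, slope_tolerance, window=5):
    """True when DM is elevated, its recent slope is near zero, and rho is low."""
    recent = dm_series[-window:]
    if len(recent) < window:
        return False  # not enough samples to judge stabilization
    slopes = [b - a for a, b in zip(recent, recent[1:])]
    mean_slope = sum(slopes) / len(slopes)
    elevated = min(recent) > theta_1
    settled = abs(mean_slope) < slope_tolerance
    return elevated and settled and rho < rho_threshold

# Step change followed by stabilization near DM = 5.
stabilized = [1.0, 1.1, 4.0, 5.2, 5.0, 5.1, 5.0, 5.05, 5.0]
# DM still climbing: consistent with a progressive failure rather than a plateau.
growing = [3.5, 4.0, 4.5, 5.0, 5.5]

plateau = is_plateau_shift(stabilized, rho=0.1, theta_1=3.0, rho_threshold=0.3, slope_tolerance=0.1)
still_rising = is_plateau_shift(growing, rho=0.1, theta_1=3.0, rho_threshold=0.3, slope_tolerance=0.1)
deformed = is_plateau_shift(stabilized, rho=0.8, theta_1=3.0, rho_threshold=0.3, slope_tolerance=0.1)
```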
Habituation: Permitted if all pre-conditions satisfied. This is the primary use case for habituation (see §5 — Habituation).
5.2. Regime +2 — Seasonality
Definition: Demand variations following recurrent temporal patterns determined by predictable usage cycles.
Examples: Daily peak 09:00–11:00 in corporate applications; overnight traffic drop; weekly peaks (Monday ERPs, Friday e-commerce); monthly cycles (payroll, accounting close); annual events (Black Friday).
| Indicator | Behavior |
|---|---|
| DM(t) | Periodic oscillation with predictable amplitude and frequency. |
| φ(t) | Oscillates between positive (peaks) and negative (valleys) with corresponding periodicity. |
| ACF of DM | Significant peaks at lags corresponding to the seasonal period (24h for daily, 168h for weekly). |
| ρ(t) | Low — covariance structure may shift periodically but predictably. |
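The ACF indicator in the table can be illustrated with a plain sample autocorrelation over an hourly DM series. The synthetic sinusoidal series below is a stand-in for real data and is purely hypothetical.

```python
import math

def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    variance_sum = sum((v - mean) ** 2 for v in series)
    if variance_sum == 0:
        raise ValueError("series is constant")
    covariance_sum = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return covariance_sum / variance_sum

# Two weeks of hourly DM values with a synthetic 24-hour cycle.
hourly_dm = [2.0 + math.sin(2 * math.pi * h / 24) for h in range(24 * 14)]

acf_24 = autocorrelation(hourly_dm, 24)  # strongly positive: daily period
acf_12 = autocorrelation(hourly_dm, 12)  # strongly negative: half a period out of phase
```

A pronounced positive peak at lag 24 (and, for weekly cycles, at lag 168) is the signature that separates a recurring pattern from a one-off excursion.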
Solution: Temporal segmentation of the baseline — seasonal profiles (the "digital circadian rhythm"). Each temporal window accumulates its own independent (μi, Σi) via independent Welford accumulators (see §5 — Seasonal Profiles).
Habituation: Intra-segment. Each temporal window habituates independently.
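Temporal segmentation can be sketched as a mapping from timestamp to segment, with one independent accumulator per segment. The hour-of-week granularity (168 segments) and all names below are assumptions. A scalar running mean stands in for the full (μi, Σi) accumulator to keep the example short.

```python
from datetime import datetime, timezone

class RunningMean:
    """Scalar stand-in for a per-segment Welford accumulator."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        self.count += 1
        self.mean += (value - self.mean) / self.count

def segment_index(ts):
    """Hour-of-week bucket in [0, 168); Monday 00:00 maps to 0."""
    return ts.weekday() * 24 + ts.hour

profiles = {}

def observe(ts, value):
    # Each segment learns only from samples that fall inside its own window.
    profiles.setdefault(segment_index(ts), RunningMean()).update(value)

# Hypothetical CPU readings: two Tuesday 10:00 peaks and one Tuesday 03:00 valley.
observe(datetime(2024, 1, 2, 10, tzinfo=timezone.utc), 80.0)
observe(datetime(2024, 1, 9, 10, tzinfo=timezone.utc), 90.0)
observe(datetime(2024, 1, 2, 3, tzinfo=timezone.utc), 10.0)
```

Because the accumulators never mix, a quiet overnight segment does not dilute the baseline of a busy mid-morning segment, and each can habituate on its own schedule.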
5.3. Regime +3 — Adversarial Demand
Definition: Resource consumption caused by malicious activity that deliberately mimics legitimate demand patterns to evade detection.
Examples: Layer 7 DDoS (syntactically valid HTTP from botnets simulating human browsing); parasitic cryptomining at calculated levels; low-and-slow data exfiltration; gradual resource exhaustion attacks (socket handles, file descriptors, threads).
| Indicator | Legitimate Demand | Adversarial Demand |
|---|---|---|
| DM | Normal or elevated | May appear normal (evasion) |
| ρ(t) | Low (correlations preserved) | High (correlations altered) |
| CPU↔Network | Proportional | Disproportional (kernel CPU rises, not app CPU) |
| Syscall entropy | Stable | Altered |
| Work/Resource ratio | Proportional | Drops |
The central thesis of this classification: even when individual magnitudes are kept in normal range, malicious activity produces deformation in the covariance structure that legitimate demand does not produce.
Second-level metrics for structural deformation detection:
- ΔH(t): Syscall entropy change. Emergence of atypical syscalls (`connect()`, `mmap()`) without a corresponding change in application throughput.
- WEI(t): Work Efficiency Index. Application throughput / resource consumption. Drops under parasitic activity.
- Rku(t): Kernel/User CPU ratio. Rises under network attacks and malware.
Habituation: Blocked when ρ(t) exceeds its threshold. HOSA never learns to accept a deformed covariance structure as normal.
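The three second-level metrics can be sketched as follows. The sketch assumes ΔH is the difference in Shannon entropy between an observed and a baseline syscall window, and that WEI and Rku are plain ratios. The sample windows and units are hypothetical.

```python
import math
from collections import Counter

def syscall_entropy(syscalls):
    """Shannon entropy (bits) of the syscall name distribution in a window."""
    counts = Counter(syscalls)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("empty syscall window")
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def work_efficiency_index(throughput, resource_consumption):
    """Application throughput per unit of resource consumed."""
    return throughput / resource_consumption

def kernel_user_ratio(kernel_cpu, user_cpu):
    """Kernel-mode CPU time divided by user-mode CPU time."""
    return kernel_cpu / user_cpu

baseline_window = ["read"] * 50 + ["write"] * 50
observed_window = ["read"] * 25 + ["write"] * 25 + ["connect"] * 25 + ["mmap"] * 25

delta_h = syscall_entropy(observed_window) - syscall_entropy(baseline_window)  # 1 bit

# Same throughput (requests/s) at double the CPU cost: efficiency halves.
wei_before = work_efficiency_index(1000.0, 40.0)
wei_after = work_efficiency_index(1000.0, 80.0)
```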
5.4. Regime +4 — Local Failure
Definition: Resource deterioration caused by failure or pathology confined to the local node, with no propagation component.
Examples: Memory leak in application process; disk degradation (bad sectors, rising I/O latency); file descriptor / thread accumulation bug; fork bomb (accidental or intentional); deadlock causing request queue accumulation; CPU thermal throttling from hardware degradation.
| Indicator | Behavior |
|---|---|
| DM(t) | Progressive or abrupt elevation (depending on failure speed). |
| dD̄M/dt | Sustained positive. The anomaly does not self-correct. |
| d²D̄M/dt² | Variable. Memory leak: ≈ 0 (linear growth). Fork bomb: positive and growing (exponential). |
| ICP(t) | Low — no outbound propagation indicators. |
| Dimensional localization | One or two dominant dimensions in the cj decomposition. |
Habituation: Blocked while derivative remains positive. Monotonically growing anomalies are not "new normals" — they are progressive failures. Habituation is only considered after stabilization and confirmed functionality at the new plateau (transition to Regime +1).
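The role of the second derivative can be illustrated with finite differences on two synthetic DM trajectories: a linear one standing in for a memory leak and an exponential one standing in for a fork bomb. Both series are hypothetical.

```python
def finite_differences(series):
    """First-order differences between consecutive samples."""
    return [b - a for a, b in zip(series, series[1:])]

leak = [1.0 + 0.5 * t for t in range(8)]   # linear growth
fork_bomb = [1.5 ** t for t in range(8)]   # exponential growth

leak_d1 = finite_differences(leak)
leak_d2 = finite_differences(leak_d1)      # all zero: constant slope
bomb_d1 = finite_differences(fork_bomb)
bomb_d2 = finite_differences(bomb_d1)      # positive and increasing
```

In both cases the first difference stays positive, which is what blocks habituation. The second difference is what separates a steady leak from an accelerating failure that calls for faster containment.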
5.5. Regime +5 — Viral Propagation
Definition: Malicious activity or cascading failure with a propagation component — the affected node attempts to compromise, overload, or infect other systems.
Examples: Worms with lateral movement capability; post-compromise pivot (attacker using node as launching point); microservice cascade failure (degraded service causing backpressure on dependents); compromised node used as internal DDoS amplifier.
| Local Indicator | Significance |
|---|---|
| Outbound connection explosion | Abrupt increase in connect() to multiple IPs/ports not in the normal communication profile. |
| Destination entropy | High entropy in destination IP distribution, inconsistent with the node's normal peer set. |
| Anomalous process spawning | Application processes generating shells, downloads (curl, wget), or processes with randomized names. |
| Anomaly↔outbound correlation | Local anomaly coincides with increased outbound traffic (in non-viral failures, outbound typically decreases). |
These indicators are aggregated into the Propagation Behavior Index (ICP):
Response logic based on ICP:
| ICP | DM | Classification | HOSA Response |
|---|---|---|---|
| Low | High | Regime +4 (Local Failure) | Local containment, no network isolation |
| High | High | Regime +5 | Network isolation (Level 4–5) to protect the cluster |
| High | Moderate | Early propagation | Selective containment of contributing processes + outbound connection restriction via XDP |
Habituation: Categorically blocked. HOSA never habituates to propagation patterns.
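A sketch of the aggregation and of the response table follows. It assumes the ICP is a weighted sum of the four local indicators, each normalized to [0, 1], with uniform default weights. The function names, the threshold parameters, and the returned labels are hypothetical.

```python
def propagation_behavior_index(indicators, weights=None):
    """Weighted sum of normalized propagation indicators (assumed form)."""
    if weights is None:
        weights = [1.0 / len(indicators)] * len(indicators)  # uniform initialization
    if len(weights) != len(indicators):
        raise ValueError("weights and indicators must have the same length")
    return sum(w * v for w, v in zip(weights, indicators))

def propagation_response(icp, dm, icp_threshold, dm_moderate, dm_high):
    """Map (ICP, DM) to the rows of the response table."""
    icp_high = icp >= icp_threshold
    if icp_high and dm >= dm_high:
        return "+5: network isolation"
    if icp_high and dm >= dm_moderate:
        return "early propagation: selective containment"
    if not icp_high and dm >= dm_high:
        return "+4: local containment"
    return "no propagation response"

# Order: outbound connection explosion, destination entropy,
# anomalous process spawning, anomaly-outbound correlation.
icp = propagation_behavior_index([0.9, 0.8, 0.7, 0.6])  # 0.75 with uniform weights

response = propagation_response(icp, dm=9.0, icp_threshold=0.6, dm_moderate=3.0, dm_high=6.0)
```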
6. Integrated Classification Matrix
| Regime | DM | dDM/dt | d²DM/dt² | φ | ρ | ΔH | ICP | Classification |
|---|---|---|---|---|---|---|---|---|
| −3 | High (abrupt) | Spike | Variable | Strongly − | Variable | Variable | Variable | Silence → Investigate |
| −2 | Chronic low | ≈ 0 | ≈ 0 | Persistent − | Low | Low | Low | Over-provisioned → FinOps |
| −1 | Low vs. temporal | ≈ 0 | ≈ 0 | Negative | Low | Low | Low | Legitimate → GreenOps |
| 0 | Low | ≈ 0 | ≈ 0 | ≈ 0 | Low | Low | Low | Homeostasis |
| +1 | High, stable | ≈ 0 (post) | ≈ 0 | Positive | Low | Low | Low | Plateau → Habituation |
| +2 | Oscillates | Oscillates | Oscillates | Oscillates | Low | Low | Low | Seasonal → Profiles |
| +3 | Any | Any | Any | Positive | High | High | Variable | Adversarial → Contain |
| +4 | Growing | Sustained + | Variable | Positive | Variable | Low | Low | Failure → Grad. contain |
| +5 | Variable | Variable | Variable | Variable | Variable | Variable | High | Viral → Isolate |
7. FinOps and GreenOps Implications
The inclusion of the negative semi-axis enables three practical contributions that, as far as the literature review identifies, are not addressed by any existing local agent in an integrated manner:
| Contribution | Mechanism | Regimes |
|---|---|---|
| FinOps from endogenous evidence | Cloud cost optimization tools (AWS Cost Explorer, GCP Recommender, Kubecost) operate on billing data and aggregated metrics at hourly/daily intervals. HOSA provides under-utilization evidence at second-level granularity, with multivariate correlation and temporal context — enabling right-sizing recommendations with higher statistical confidence. | −1, −2 |
| GreenOps as a consequence of homeostasis | Energy optimization is not a separate module — it is the natural response of the agent to sub-demand regimes. CPU frequency reduction, sampling interval increase, and telemetry suppression are actions of the same graduated response system that applies throttling in over-demand. Homeostasis is bidirectional. | −1, −2 |
| Operational blackout detection | Anomalous Silence (Regime −3) is a genuine security scenario that traditional health-of-resource monitors are structurally incapable of detecting — because all capacity metrics are "healthy" when the server stops receiving work. Detection requires a model of "what should be happening" (contextual baseline), not just "what is dangerous" (capacity thresholds). | −3 |
HOSA does not autonomously shut down, resize, or decommission nodes — that would exceed its scope of local actuation and could violate availability contracts. It provides the mathematical evidence for the human or orchestrator to make informed decisions. Financial and energetic savings are a consequence of precise detection, not a direct action of the agent.
8. Handling Ambiguous Classification
In scenarios where indicators do not unambiguously point to a single regime, HOSA adopts the precautionary principle:
- Temporary classification as the highest-severity compatible regime. If indicators are compatible with both +1 (Plateau Shift) and +3 (Adversarial Demand), HOSA classifies the state as +3 and applies the corresponding response.
- Continuous re-evaluation. The classification is revised at every sample cycle as new evidence accumulates. If the covariance deformation ratio ρ(t) stabilizes at a low value, the classification may be revised downward to +1.
- Audit trail. The ambiguity and the indicators that led to the conservative decision are logged, ensuring transparency for post-incident analysis.
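The precautionary selection can be sketched in a few lines. Severity is taken as the magnitude of the regime index, as stated in §2.1. Breaking ties toward the positive semi-axis is an assumption made here for illustration.

```python
def precautionary_classification(compatible_regimes):
    """Pick the highest-severity regime among those compatible with the indicators."""
    # Severity = |index|; on a tie (e.g. -3 vs +3) prefer the positive regime (assumption).
    return max(compatible_regimes, key=lambda regime: (abs(regime), regime))

ambiguous_positive = precautionary_classification({+1, +3})  # +3
ambiguous_negative = precautionary_classification({-1, -3})  # -3
```

As later samples rule out the more severe candidate, the set of compatible regimes shrinks and the same function returns the lower-severity classification.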
Regime −3 (Anomalous Silence) can transition directly to the positive semi-axis when investigation reveals that the silence is accompanied by propagation indicators (elevated ICP, anomalous processes). In this case, the state is reclassified as Regime +5 (Viral Propagation). Such a jump across the zero point of the spectrum, with no intervening period in homeostasis, is logged as a high-priority event.
9. Theoretical Contribution
The regime taxonomy proposed here contributes to the anomaly detection literature for computational systems by formalizing two distinctions frequently treated in an ad-hoc manner in operational practice:
- Not every deviation is an anomaly, and not every anomaly is a threat. The spectral organization enables proportional responses: central regimes (−2 to +2) are treated with adaptation and optimization; extreme regimes (−3, +3 to +5) are treated with containment and isolation.
- Deficit anomaly is as significant as excess anomaly. The symmetry around Regime 0 establishes that HOSA implements genuine homeostasis — bidirectional equilibrium — not merely protection against overload.
The combination of magnitude analysis (DM), direction (φ), temporal dynamics (derivatives), structural integrity (ρ), behavioral profile (ΔH, WEI, Rku), and propagation intent (ICP) provides a classification framework that enables proportional, informed responses — reducing both false positives (habituation when appropriate) and false negatives (habituation blocked when deformation or propagation indicators are present).
"Anomaly is redefined as significant deviation from the baseline profile in any direction — not only toward excess. HOSA adapts to legitimate variation but refuses to normalize pathology."
10. Known Limitations
- Regime boundary ambiguity. The boundaries between adjacent regimes (e.g., +1 vs. +3, −1 vs. −3) are not crisp thresholds but regions of classification uncertainty. The precautionary principle (§8) addresses this operationally, but formal characterization of the ambiguity regions requires empirical data from production systems.
- ICP weight calibration. The weights w₁–w₄ in the Propagation Behavior Index are initialized uniformly and require empirical calibration against controlled attack scenarios with known ground truth. The calibration methodology (AUC-ROC maximization via grid search over the weight simplex) is documented; execution awaits the experimental phase.
- Adversarial evasion of structural detection. An attacker who understands HOSA's architecture and manages to preserve the covariance structure, syscall distribution, and propagation indicators while executing malicious activity will evade Regime +3 classification. Formal adversarial resistance analysis is a future research topic.
- Seasonal profile cold start. Temporal segmentation requires at least 7 days of accumulated data. During this period, Regime −1 vs. −3 discrimination relies on transition velocity rather than temporal coherence, which is a weaker discriminant.
- FinOps accuracy. Right-sizing recommendations (Regime −2) depend on the quality of peak detection across all temporal windows. If the observation period does not include the true annual peak (e.g., Black Friday for e-commerce), the IPE may overestimate over-provisioning.
- Threshold calibration for supplementary metrics. The thresholds ρthreshold, ΔHthreshold, ICPthreshold, and DM,safety are set to conservative defaults. Optimal values depend on the specific workload class and will be calibrated during the experimental phase against real production data and controlled attack scenarios.
11. References
- Mahalanobis, P. C. (1936). On the generalized distance in statistics. Proceedings of the National Institute of Sciences of India, 2(1), 49–55.
- Chandola, V., Banerjee, A., & Kumar, V. (2009). Anomaly Detection: A Survey. ACM Computing Surveys, 41(3), Article 15.
- Aggarwal, C. C. (2017). Outlier Analysis (2nd ed.). Springer.
- Forrest, S., Hofmeyr, S. A., & Somayaji, A. (1997). Computer immunology. Communications of the ACM, 40(10), 88–96.
- Horn, P. (2001). Autonomic Computing: IBM's Perspective on the State of Information Technology. IBM Corporation.
- Weiner, J. (2018). PSI — Pressure Stall Information. Linux Kernel Documentation.
- Gnanadesikan, R., & Kettenring, J. R. (1972). Robust Estimates, Residuals, and Outlier Detection with Multiresponse Data. Biometrics, 28(1), 81–124.
- Hellerstein, J. L., Diao, Y., Parekh, S., & Tilbury, D. M. (2004). Feedback Control of Computing Systems. John Wiley & Sons.