Docs | SpectraStrike | Nexus | Nyxera Labs

Orchestrator Architecture (Phase 2 Design)

Purpose

The orchestrator is the central control plane for SpectraStrike. It coordinates tool wrappers, enforces AAA policy, records audit trails, and publishes telemetry.

Core Components

OrchestratorEngine
- Owns runtime lifecycle (start, stop, health).
- Wires scheduler, telemetry pipeline, and policy enforcement.
TaskScheduler
- Accepts tasks from integrations and the manual API.
- Prioritizes and schedules async execution.
- Supports retry policy and failure isolation.
ExecutionWorkers
- Async workers that execute normalized tool tasks.
- Emit structured execution events for logging and telemetry.
AAAServiceAdapter
- Wraps pkg.security.aaa_framework.AAAService.
- Enforces authentication and role authorization for each action.
- Produces accounting records for all privileged operations.
TelemetryPipeline
- Normalizes runtime events.
- Buffers and batches telemetry for downstream export.
- Supports secure transport abstraction for VectorVue integration.
- Supports broker-backed async publishing via TelemetryPublisher.
AuditLogger
- Uses pkg.logging.framework.
- Writes structured audit events for auth decisions, task execution, and failures.

Data Contracts

OrchestratorTask
- task_id, source, tool, action, payload, requested_by, required_role.
ExecutionResult
- task_id, status, started_at, ended_at, output, error, metadata.
TelemetryEvent
- event_id, event_type, timestamp, actor, target, status, attributes.
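
The three contracts above could be modeled as plain dataclasses. This is a minimal sketch: the field names come from the lists above, while the field types, defaults, and the choice of frozen classes are assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Optional


@dataclass(frozen=True)
class OrchestratorTask:
    task_id: str
    source: str
    tool: str
    action: str
    payload: dict[str, Any]
    requested_by: str
    required_role: str


@dataclass
class ExecutionResult:
    task_id: str
    status: str
    started_at: datetime
    # Assumed optional: these are unknown until the task finishes.
    ended_at: Optional[datetime] = None
    output: Optional[str] = None
    error: Optional[str] = None
    metadata: dict[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class TelemetryEvent:
    event_id: str
    event_type: str
    timestamp: datetime
    actor: str
    target: str
    status: str
    attributes: dict[str, Any] = field(default_factory=dict)
```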

Runtime Flow

1. Receive task request.
2. Authenticate actor and authorize required role.
3. Record accounting event and enqueue task.
4. Worker executes task asynchronously.
5. Emit operational logs and audit events.
6. Emit telemetry events and batch for export.
7. Persist result and expose status query.
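
The seven steps above can be traced in a few lines of asyncio. This is an illustrative sketch only: ROLES, run_flow, and the tuple-shaped events are hypothetical stand-ins, not the orchestrator's actual interfaces, and the tool execution is a placeholder.

```python
import asyncio

# Hypothetical actor -> roles table standing in for the AAA service.
ROLES = {"alice": {"operator"}}


async def run_flow(task: dict, results: dict, events: list) -> None:
    # Steps 1-2: receive the request, then authenticate and authorize the actor.
    if task["required_role"] not in ROLES.get(task["requested_by"], set()):
        events.append(("audit", task["task_id"], "denied"))
        results[task["task_id"]] = "denied"
        return
    # Step 3: record the accounting event, then enqueue the task.
    events.append(("accounting", task["task_id"], "accepted"))
    queue: asyncio.Queue = asyncio.Queue()
    await queue.put(task)
    # Step 4: a worker picks the task up asynchronously.
    queued = await queue.get()
    status = "success"  # placeholder for the real tool execution
    # Steps 5-6: emit audit and telemetry events.
    events.append(("audit", queued["task_id"], status))
    events.append(("telemetry", queued["task_id"], status))
    # Step 7: persist the result so a status query can find it.
    results[queued["task_id"]] = status
```

Authorization happens before anything is enqueued, so a denied request leaves an audit event but never reaches a worker.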

Messaging Backbone (Sprint 9.5)

Broker standard: RabbitMQ (dockerized deployment target).
Logical model:
- Exchange: spectrastrike.telemetry
- Routing key: telemetry.events
- Main queue: telemetry.events
- Dead-letter queue: telemetry.events.dlq
Delivery policy:
- Idempotency key per event (event_id) to deduplicate replays.
- Bounded retry attempts for transient broker failures.
- Dead-letter routing when retries are exhausted.
Runtime adapters:
- RabbitMQTelemetryPublisher: in-memory RabbitMQ model for deterministic tests.
- PikaRabbitMQTelemetryPublisher: dockerized RabbitMQ adapter for real runtime publish.
Telemetry transport security:
- RabbitMQ listener is configured TLS-only (port 5671) with client-certificate verification.
- App-side publisher supports CA/cert/key configuration from RABBITMQ_SSL_*.
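
The delivery policy above (deduplicate on event_id, retry a bounded number of times, then dead-letter) can be sketched with an in-memory model. SketchPublisher and its send callback are hypothetical names for illustration; they do not reproduce RabbitMQTelemetryPublisher, and treating ConnectionError as the transient failure is an assumption.

```python
from typing import Callable


class SketchPublisher:
    def __init__(self, send: Callable[[dict], None], max_attempts: int = 3):
        self._send = send
        self._max_attempts = max_attempts
        self._seen: set = set()          # event_ids already published
        self.dead_letters: list = []     # stands in for telemetry.events.dlq

    def publish(self, event: dict) -> str:
        event_id = event["event_id"]
        # Idempotency: a replayed event_id is dropped, not re-sent.
        if event_id in self._seen:
            return "duplicate"
        # Bounded retry: only transient broker failures are retried.
        for _ in range(self._max_attempts):
            try:
                self._send(event)
            except ConnectionError:
                continue
            self._seen.add(event_id)
            return "published"
        # Retries exhausted: route to the dead-letter queue.
        self.dead_letters.append(event)
        return "dead-lettered"
```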

Cryptographic Signing (Sprint 10)

Signer integration: pkg.orchestrator.signing.VaultTransitSigner.
Key management:
- Transit key metadata and signing operations are delegated to HashiCorp Vault.
- Optional key bootstrap is supported through VAULT_TRANSIT_AUTO_CREATE_KEY=true.
Runtime config:
- VAULT_ADDR, VAULT_TOKEN, VAULT_NAMESPACE, VAULT_VERIFY_TLS.
- VAULT_TRANSIT_MOUNT, VAULT_TRANSIT_KEY_NAME, VAULT_TRANSIT_KEY_TYPE.
Security controls:
- HTTPS is required by default (VAULT_REQUIRE_HTTPS=true).
- Token and key material are never logged.
- Public key retrieval is version-aware for future manifest verification.
JWS generation:
- Compact JWS payload generation is handled by pkg.orchestrator.jws.CompactJWSGenerator.
- The header and payload are serialized as canonical JSON (sort_keys=True); the signing input is base64url(header).base64url(payload).
- Signatures from Vault transit are requested with JWS marshaling and normalized into compact JWS signature segment encoding.
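
As a rough illustration of the compact JWS layout described above, the sketch below builds the signing input and appends a signature segment. The helper names are hypothetical, the compact separators are an assumption, and the signature bytes are taken as an argument because the Vault transit call is out of scope here.

```python
import base64
import json


def b64url(data: bytes) -> str:
    # JWS uses URL-safe base64 with the trailing '=' padding removed.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def canonical_json(obj: dict) -> bytes:
    # sort_keys=True makes key order deterministic; compact separators are assumed.
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def signing_input(header: dict, payload: dict) -> str:
    return f"{b64url(canonical_json(header))}.{b64url(canonical_json(payload))}"


def compact_jws(header: dict, payload: dict, signature: bytes) -> str:
    # header.payload.signature, each segment base64url-encoded.
    return f"{signing_input(header, payload)}.{b64url(signature)}"
```
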
Execution Manifest schema:
- Schema is defined in pkg.orchestrator.manifest.ExecutionManifest.
- Required fields: task_context, target_urn, tool_sha256, parameters.
- task_context enforces tenant and operator attribution (tenant_id, operator_id) and task lineage (task_id, correlation_id).
- target_urn and tool_sha256 are strictly validated before signing to reduce forged/tampered dispatch risk.
Anti-replay controls:
- ExecutionManifest includes a per-request nonce and issued_at timestamp.
- pkg.orchestrator.anti_replay.AntiReplayGuard validates allowed time window and nonce uniqueness before dispatch.
- Replay storage is tenant-scoped (tenant_id + nonce) to preserve isolation boundaries.
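
The anti-replay checks above reduce to two tests: is issued_at inside the allowed window, and has this tenant already used this nonce? The sketch below is an in-memory illustration; SketchAntiReplayGuard, the 300-second window, and the epoch-seconds timestamp are assumptions, not the behavior of AntiReplayGuard.

```python
import time
from typing import Optional


class SketchAntiReplayGuard:
    def __init__(self, window_seconds: float = 300.0):
        self._window = window_seconds
        self._seen: set = set()  # holds (tenant_id, nonce) pairs

    def check(self, tenant_id: str, nonce: str, issued_at: float,
              now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        # Reject manifests issued too far in the past or the future.
        if abs(now - issued_at) > self._window:
            return False
        # Tenant-scoped key: the same nonce may appear under different tenants.
        key = (tenant_id, nonce)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

A production store would also expire entries once they fall outside the window; the sketch keeps them forever for brevity.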

Security and Reliability Requirements

No secrets in code or logs.
TLS-only transport for outbound telemetry.
Role-based authorization enforced before execution.
Audit trail for denied and successful operations.
Retry with bounded backoff and circuit-breaker-style failure protection.
Tamper-evident audit stream via hash-chained audit event records.

Armory + Universal Runner (Sprints 11-13)

Armory registry:
- Internal OCI registry service (armory-registry) is deployed in compose.
- Registry delete operations are disabled for immutable artifact posture.
Armory control service:
- pkg.armory.service.ArmoryService handles the ingest pipeline:
  - upload digesting (sha256),
  - SBOM metadata generation,
  - vulnerability summary generation,
  - signing metadata generation,
  - operator approval gating.
Runner cryptographic gate:
- pkg.runner.jws_verify.RunnerJWSVerifier validates compact JWS before execution admission.
- Forged signature attempts are hard-failed before any tool resolution.
Signed tool retrieval:
- pkg.runner.universal.UniversalEdgeRunner resolves only approved Armory digests that exactly match ExecutionManifest.tool_sha256.
Sandbox profile:
- Runner command contract enforces --runtime=runsc, AppArmor profile pinning, read-only rootfs, dropped capabilities, and no-network baseline.
Execution contract:
- stdout/stderr/exit_code are mapped into CloudEvents v1.0 via pkg.runner.cloudevents.map_execution_to_cloudevent.
QA guarantees:
- tests/qa/test_execution_fabric_qa.py validates forged-JWS rejection, tampered-digest rejection, and CloudEvents output integrity.
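
To make the execution contract concrete, here is a sketch of wrapping stdout/stderr/exit_code in a CloudEvents v1.0 envelope. The attribute names (specversion, id, source, type, time, datacontenttype, data) follow the CloudEvents specification; the source and type values and the to_cloudevent helper are hypothetical and do not reproduce map_execution_to_cloudevent.

```python
import uuid
from datetime import datetime, timezone


def to_cloudevent(task_id: str, stdout: str, stderr: str, exit_code: int) -> dict:
    return {
        # Required CloudEvents v1.0 context attributes.
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": f"urn:spectrastrike:runner:{task_id}",   # hypothetical value
        "type": "com.spectrastrike.execution.completed",   # hypothetical value
        # Optional attributes.
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        # Execution output carried as the event payload.
        "data": {"stdout": stdout, "stderr": stderr, "exit_code": exit_code},
    }
```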

OPA Capability Policies (Sprint 14)

Pre-sign authorization:
- Orchestrator pre-sign flow can query OPA before issuing compact JWS manifests.
Capability tuple model:
- Policy evaluation uses a tuple match on operator_id + tenant_id + tool_sha256 + target_urn.
Policy defaults:
- spectrastrike.capabilities.allow is deny-by-default.
- Input contract validation is exposed through spectrastrike.capabilities.input_contract_valid.
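
The real policies are evaluated by OPA; the Python below only mirrors the logic of a deny-by-default tuple match to show the intended semantics. Capability, allow, and input_contract_valid are hypothetical stand-ins for the rules named above.

```python
from typing import NamedTuple


class Capability(NamedTuple):
    operator_id: str
    tenant_id: str
    tool_sha256: str
    target_urn: str


def input_contract_valid(request: Capability) -> bool:
    # Every tuple member must be a non-empty string.
    return all(isinstance(value, str) and value for value in request)


def allow(request: Capability, granted: frozenset) -> bool:
    # Deny by default: only an exact match on all four fields is allowed.
    return input_contract_valid(request) and request in granted
```

Because the match is on the whole tuple, a grant for one target does not extend to another target, tool digest, or tenant.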

Wrapper Telemetry Migration (Sprint 16.5)

Direct telemetry emission from legacy wrappers (telemetry.ingest(...) calls inside wrappers) is deprecated.
Wrappers now build payloads with the SDK (pkg.telemetry.sdk) and submit them through the unified parser path (telemetry.ingest_payload(...)).
Security invariants are preserved by parser/enforcement gates:
- strict tenant context propagation (tenant_id required),
- unified schema validation before buffering/publishing.

Control Plane Integrity Hardening (Sprint 19)

Signed startup config gate:
- pkg.orchestrator.control_plane_integrity.ControlPlaneIntegrityEnforcer enforces JWS signature validation before startup acceptance.
- Unsigned or invalid signatures are hard-rejected.
Policy trust pinning:
- Startup config must include policy_sha256 that matches OPA_POLICY_PINNED_SHA256.
- Any mismatch is rejected with integrity audit evidence.
Runtime baseline integrity:
- Optional startup binary baseline (SPECTRASTRIKE_ENFORCE_BINARY_HASH=true) validates SHA-256 against signed envelope value.
Immutable configuration history:
- ImmutableConfigurationHistory stores append-only config versions with hash chaining and duplicate-version rejection.
Integrity audit channel:
- pkg.logging.framework.emit_integrity_audit_event writes hash-chained records to dedicated logger spectrastrike.audit.integrity.
Vault hardening workflow:
- pkg.orchestrator.vault_hardening.VaultHardeningWorkflow automates transit key rotation checks and unseal share policy enforcement.
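
The policy trust pinning rule above amounts to comparing the policy_sha256 declared in the startup config with the pinned value. A minimal sketch, assuming the pinned hash has already been read from OPA_POLICY_PINNED_SHA256; check_policy_pin is a hypothetical helper, not part of ControlPlaneIntegrityEnforcer.

```python
import hmac


def check_policy_pin(startup_config: dict, pinned_sha256: str) -> None:
    declared = str(startup_config.get("policy_sha256", ""))
    # Constant-time comparison of the hex digests; case is normalized first.
    matches = hmac.compare_digest(
        declared.lower().encode("utf-8"),
        pinned_sha256.lower().encode("utf-8"),
    )
    if not declared or not matches:
        raise ValueError("policy_sha256 does not match the pinned policy hash")
```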

High-Assurance AAA Controls (Sprint 20)

Hardware-backed MFA for privileged actions:
- pkg.security.aaa_framework.AAAService now supports hardware assertion verification via HardwareMFAVerifier.
- Privileged role authorization can require hardware_mfa_assertion in policy context.
Time-bound privilege elevation:
- pkg.security.high_assurance.PrivilegeElevationService issues short-lived, one-time elevation tokens.
- AAA privileged authorization can consume required elevation_token_id via validator hook.
Dual-control Armory approval:
- pkg.armory.service.ArmoryService enforces approval quorum (approval_quorum=2 default) for tool authorization.
- Distinct approvers are required before a digest is marked authorized.
Dual-signature high-risk manifests:
- pkg.orchestrator.dual_signature.HighRiskManifestDualSigner enforces independent second signature for high/critical risk levels.
Break-glass and session recording:
- Break-glass activation uses irreversible audit flag semantics.
- PrivilegedSessionRecorder provides structured session start/command/end event capture for privileged activity evidence.
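
The dual-control rule above (a quorum of distinct approvers before a digest is authorized) can be illustrated with a small in-memory model. SketchApprovals is a hypothetical class; only the approval_quorum=2 default comes from the text.

```python
class SketchApprovals:
    def __init__(self, approval_quorum: int = 2):
        self._quorum = approval_quorum
        self._approvers: dict = {}  # digest -> set of approver ids

    def approve(self, digest: str, approver_id: str) -> bool:
        # A set ignores repeat approvals from the same approver.
        self._approvers.setdefault(digest, set()).add(approver_id)
        return self.is_authorized(digest)

    def is_authorized(self, digest: str) -> bool:
        return len(self._approvers.get(digest, set())) >= self._quorum
```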

Deterministic Execution Guarantees (Sprint 21)

Canonical manifest serialization:
- pkg.orchestrator.manifest.canonical_manifest_json enforces deterministic compact JSON (sort_keys=True, fixed separators).
Deterministic hashing:
- pkg.orchestrator.manifest.deterministic_manifest_hash computes stable SHA-256 over canonical manifest payload.
Schema semantic versioning:
- ManifestSchemaVersionPolicy enforces MAJOR.MINOR.PATCH format and supported major compatibility bounds.
Non-canonical submission rejection:
- parse_and_validate_manifest_submission rejects payloads that are not canonical JSON before manifest construction.
Runtime ingress guard:
- OrchestratorEngine.validate_manifest_submission exposes canonical validation path for raw manifest intake.
CI regression guard:
- scripts/manifest_schema_regression.py validates stable schema hash and is executed in CI (.github/workflows/lint-test.yml).
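
The guarantees above rest on one property: a manifest has exactly one accepted byte representation. The sketch below shows one way to get there with the standard library. The function names are hypothetical (they are not the pkg.orchestrator.manifest implementations), and "," and ":" are assumed as the fixed separators.

```python
import hashlib
import json


def canonicalize(manifest: dict) -> str:
    # Sorted keys and fixed compact separators give one stable serialization.
    return json.dumps(manifest, sort_keys=True, separators=(",", ":"))


def manifest_hash(manifest: dict) -> str:
    # Stable SHA-256 over the canonical payload.
    return hashlib.sha256(canonicalize(manifest).encode("utf-8")).hexdigest()


def parse_canonical(raw: str) -> dict:
    parsed = json.loads(raw)
    # Re-serialize and compare: any difference means the input was not canonical.
    if not isinstance(parsed, dict) or canonicalize(parsed) != raw:
        raise ValueError("manifest is not canonical JSON")
    return parsed
```

Rejecting non-canonical input at intake means the hash computed by the sender and the hash computed by the receiver are taken over identical bytes.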

Federation Fingerprint Binding (Sprint 22)

Unified execution fingerprint schema:
- manifest_hash + tool_hash + operator_id + tenant_id + policy_decision_hash + timestamp.
- Implemented in pkg.orchestrator.execution_fingerprint.ExecutionFingerprintInput.
Fingerprint generation and validation:
- generate_execution_fingerprint creates deterministic SHA-256 execution fingerprint.
- validate_fingerprint_before_c2_dispatch enforces pre-dispatch integrity gate.
Tamper-evident fingerprint audit:
- Fingerprint bind/validate outcomes are emitted to integrity audit channel via emit_integrity_audit_event.
VectorVue federation payload binding:
- RabbitMQ bridge includes execution_fingerprint in outgoing telemetry metadata and federation bundle.
- Bridge uses federated gateway dispatch path (send_federated_telemetry).
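
A fingerprint over the six fields listed above might be computed as follows. The field names come from the schema; encoding them as canonical JSON before hashing, and the helper names, are assumptions rather than the behavior of generate_execution_fingerprint.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class FingerprintInput:
    manifest_hash: str
    tool_hash: str
    operator_id: str
    tenant_id: str
    policy_decision_hash: str
    timestamp: str


def fingerprint(data: FingerprintInput) -> str:
    # Assumed encoding: canonical JSON of all six fields, then SHA-256.
    canonical = json.dumps(asdict(data), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def validate_before_dispatch(data: FingerprintInput, expected: str) -> bool:
    # Recompute and compare in constant time before anything is dispatched.
    return hmac.compare_digest(fingerprint(data).encode(), expected.encode())
```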

Federation Channel Enforcement (Sprint 23)

Single outbound gateway:
- Bridge dispatch uses only internal federation endpoint (/internal/v1/telemetry) via send_federated_telemetry.
Legacy path removal:
- Direct bridge event/finding API emission path is removed from active bridge runtime.
mTLS-only federation:
- Federation dispatch requires TLS verification plus configured mTLS client cert/key.
Signed telemetry required:
- Federation dispatch requires payload signature secret configuration; unsigned federation payloads are denied.
Producer replay detection:
- Bridge tracks nonce replay window and denies duplicate producer nonce usage.
Idempotent bounded retry:
- Idempotency key for federation dispatch is execution fingerprint hash, aligning retries with deterministic replay-safe semantics.

Anti-Repudiation Closure (Sprint 24)

Operator-identity-bound fingerprint:
- Operator identity is required and validated when generating execution fingerprint.
Write-ahead execution intent:
- Pre-dispatch execution intent records are appended before outbound federation dispatch.
- Intent records are hash-chained (prev_hash -> intent_hash) for tamper evidence.
Execution intent verification API:
- verify_execution_intent_api exposes verification contract for execution_fingerprint and optional operator_id checks.
Reconciliation and repudiation detection:
- Operator-to-execution reconciliation confirms immutable attribution.
- Repudiation attempts (claiming wrong operator) are detected and emitted to integrity audit stream.
Federation bundle intent metadata:
- Outbound federation bundle now includes intent_id, intent_hash, and write_ahead=true.
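
The write-ahead intent log described above can be sketched as an append-only list in which each record commits to its predecessor. IntentLog, the all-zero genesis hash, and the record layout are hypothetical; only prev_hash, intent_hash, execution_fingerprint, and operator_id come from the text.

```python
import hashlib
import json
from typing import Optional

GENESIS = "0" * 64  # assumed prev_hash for the first record


def _hash(body: dict) -> str:
    encoded = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class IntentLog:
    def __init__(self):
        self.records: list = []

    def append(self, execution_fingerprint: str, operator_id: str) -> dict:
        # Written before dispatch; each record links to the previous intent_hash.
        prev_hash = self.records[-1]["intent_hash"] if self.records else GENESIS
        body = {"execution_fingerprint": execution_fingerprint,
                "operator_id": operator_id, "prev_hash": prev_hash}
        record = {**body, "intent_hash": _hash(body)}
        self.records.append(record)
        return record

    def verify_chain(self) -> bool:
        prev = GENESIS
        for record in self.records:
            body = {k: v for k, v in record.items() if k != "intent_hash"}
            if record["prev_hash"] != prev or _hash(body) != record["intent_hash"]:
                return False
            prev = record["intent_hash"]
        return True

    def verify_intent(self, execution_fingerprint: str,
                      operator_id: Optional[str] = None) -> bool:
        # A claim naming the wrong operator fails verification.
        for record in self.records:
            if record["execution_fingerprint"] == execution_fingerprint:
                return operator_id is None or record["operator_id"] == operator_id
        return False
```

Editing any stored field changes that record's recomputed hash, so verify_chain fails from the altered record onward.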

Testing Strategy (for next tasks)

Unit tests for scheduler ordering and retry behavior.
Unit tests for AAA enforcement on task submission.
Unit tests for telemetry event normalization.
Integration tests for async execution lifecycle.
