Orchestrator Architecture (Phase 2 Design)

Purpose
The orchestrator is the central control plane for SpectraStrike. It coordinates tool wrappers, enforces AAA policy, records audit trails, and publishes telemetry.
Core Components
- OrchestratorEngine
  - Owns runtime lifecycle (start, stop, health).
  - Wires scheduler, telemetry pipeline, and policy enforcement.
- TaskScheduler
  - Accepts tasks from integrations and manual API.
  - Prioritizes and schedules async execution.
  - Supports retry policy and failure isolation.
- ExecutionWorkers
  - Async workers that execute normalized tool tasks.
  - Emit structured execution events for logging and telemetry.
- AAAServiceAdapter
  - Wraps pkg.security.aaa_framework.AAAService.
  - Enforces authentication and role authorization for each action.
  - Produces accounting records for all privileged operations.
- TelemetryPipeline
  - Normalizes runtime events.
  - Buffers and batches telemetry for downstream export.
  - Supports secure transport abstraction for VectorVue integration.
  - Supports broker-backed async publishing via TelemetryPublisher.
- AuditLogger
  - Uses pkg.logging.framework.
  - Writes structured audit events for auth decisions, task execution, and failures.
Data Contracts
- OrchestratorTask: task_id, source, tool, action, payload, requested_by, required_role.
- ExecutionResult: task_id, status, started_at, ended_at, output, error, metadata.
- TelemetryEvent: event_id, event_type, timestamp, actor, target, status, attributes.
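The three contracts above could be expressed as plain dataclasses. This is a minimal sketch: the field names come from the list above, but every type annotation, default value, and the choice of frozen versus mutable classes is an assumption, not the project's actual definition.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Optional


@dataclass(frozen=True)
class OrchestratorTask:
    task_id: str
    source: str
    tool: str
    action: str
    payload: dict[str, Any]
    requested_by: str
    required_role: str


@dataclass
class ExecutionResult:
    # Mutable (assumed) so a worker can fill in ended_at/output/error later.
    task_id: str
    status: str
    started_at: datetime
    ended_at: Optional[datetime] = None
    output: Optional[str] = None
    error: Optional[str] = None
    metadata: dict[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class TelemetryEvent:
    event_id: str
    event_type: str
    timestamp: datetime
    actor: str
    target: str
    status: str
    attributes: dict[str, Any] = field(default_factory=dict)
```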
Runtime Flow
- Receive task request.
- Authenticate actor and authorize required role.
- Record accounting event and enqueue task.
- Worker executes task asynchronously.
- Emit operational logs and audit events.
- Emit telemetry events and batch for export.
- Persist result and expose status query.
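The flow above can be sketched with asyncio. This is an illustration only: the ROLES table, the dict-based task shape, and the in-memory audit_log and results stores are hypothetical stand-ins for AAAService, the audit logger, and result persistence, and telemetry batching is left out for brevity.

```python
import asyncio

# Hypothetical role table; the real check is delegated to AAAService.
ROLES = {"alice": {"operator"}}
audit_log: list[dict] = []
results: dict[str, dict] = {}


async def worker(queue: asyncio.Queue) -> None:
    """Take tasks off the queue, execute each one, and persist its result."""
    while True:
        task = await queue.get()
        try:
            # Placeholder for the real tool-wrapper call.
            output = f"{task['tool']}:{task['action']} done"
            results[task["task_id"]] = {"status": "success", "output": output}
        except Exception as exc:  # failure isolation: one task cannot kill the worker
            results[task["task_id"]] = {"status": "failed", "error": str(exc)}
        finally:
            audit_log.append({"event": "task_executed", "task_id": task["task_id"]})
            queue.task_done()


async def submit(queue: asyncio.Queue, task: dict) -> bool:
    """Authorize the actor, record an accounting event, then enqueue."""
    allowed = task["required_role"] in ROLES.get(task["requested_by"], set())
    audit_log.append({"event": "authz", "task_id": task["task_id"], "allowed": allowed})
    if allowed:
        await queue.put(task)
    return allowed


async def main() -> dict:
    queue: asyncio.Queue = asyncio.Queue()
    runner = asyncio.create_task(worker(queue))
    await submit(queue, {"task_id": "t-1", "tool": "demo", "action": "scan",
                         "requested_by": "alice", "required_role": "operator"})
    await submit(queue, {"task_id": "t-2", "tool": "demo", "action": "scan",
                         "requested_by": "mallory", "required_role": "operator"})
    await queue.join()  # wait until every accepted task has been processed
    runner.cancel()
    return results
```

Calling asyncio.run(main()) processes only the authorized task; the denied request still leaves an authz entry in audit_log, which matches the requirement to audit both denied and successful operations.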
Messaging Backbone (Sprint 9.5)
- Broker standard: RabbitMQ (dockerized deployment target).
- Logical model:
  - Exchange: spectrastrike.telemetry
  - Routing key: telemetry.events
  - Main queue: telemetry.events
  - Dead-letter queue: telemetry.events.dlq
- Delivery policy:
  - Idempotency key per event (event_id) to deduplicate replays.
  - Bounded retry attempts for transient broker failures.
  - Dead-letter routing when retries are exhausted.
- Runtime adapters:
  - RabbitMQTelemetryPublisher: in-memory RabbitMQ model for deterministic tests.
  - PikaRabbitMQTelemetryPublisher: dockerized RabbitMQ adapter for real runtime publish.
- Telemetry transport security:
  - RabbitMQ listener configured TLS-only (5671) with client-certificate verification.
  - App-side publisher supports CA/cert/key configuration from RABBITMQ_SSL_*.
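The delivery policy above (deduplicate by event_id, retry a bounded number of times, then dead-letter) can be modeled without a broker. The class below is a hypothetical sketch and not the project's RabbitMQTelemetryPublisher; the names InMemoryPublisher and TransientBrokerError and the default of three attempts are assumptions.

```python
class TransientBrokerError(Exception):
    """Stand-in for a recoverable broker failure."""


class InMemoryPublisher:
    """Illustrative model of the delivery policy; no real broker involved."""

    def __init__(self, send, max_attempts: int = 3):
        self._send = send                    # callable that delivers one event
        self._max_attempts = max_attempts
        self._seen: set[str] = set()         # event_id values already delivered
        self.dead_letters: list[dict] = []   # models telemetry.events.dlq

    def publish(self, event: dict) -> str:
        event_id = event["event_id"]
        if event_id in self._seen:
            return "duplicate"               # replay of a delivered event
        for _ in range(self._max_attempts):
            try:
                self._send(event)
            except TransientBrokerError:
                continue                     # bounded retry
            self._seen.add(event_id)
            return "published"
        self.dead_letters.append(event)      # retries exhausted
        return "dead-lettered"
```

An event is marked as seen only after a successful send, so a dead-lettered event can be republished later without being mistaken for a duplicate.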
Cryptographic Signing (Sprint 10)
- Signer integration: pkg.orchestrator.signing.VaultTransitSigner.
- Key management:
  - Transit key metadata and signing operations are delegated to HashiCorp Vault.
  - Optional key bootstrap is supported through VAULT_TRANSIT_AUTO_CREATE_KEY=true.
- Runtime config:
  - VAULT_ADDR, VAULT_TOKEN, VAULT_NAMESPACE, VAULT_VERIFY_TLS.
  - VAULT_TRANSIT_MOUNT, VAULT_TRANSIT_KEY_NAME, VAULT_TRANSIT_KEY_TYPE.
- Security controls:
  - HTTPS is required by default (VAULT_REQUIRE_HTTPS=true).
  - Token and key material are never logged.
  - Public key retrieval is version-aware for future manifest verification.
- JWS generation:
  - Compact JWS payload generation is handled by pkg.orchestrator.jws.CompactJWSGenerator.
  - The signing input is canonical JSON (sort_keys=True) and encoded as base64url(header).base64url(payload).
  - Signatures from Vault transit are requested with JWS marshaling and normalized into compact JWS signature segment encoding.
- Execution Manifest schema:
  - Schema is defined in pkg.orchestrator.manifest.ExecutionManifest.
  - Required fields: task_context, target_urn, tool_sha256, parameters.
  - task_context enforces tenant and operator attribution (tenant_id, operator_id) and task lineage (task_id, correlation_id).
  - target_urn and tool_sha256 are strictly validated before signing to reduce forged/tampered dispatch risk.
- Anti-replay controls:
  - ExecutionManifest includes a per-request nonce and issued_at timestamp.
  - pkg.orchestrator.anti_replay.AntiReplayGuard validates allowed time window and nonce uniqueness before dispatch.
  - Replay storage is tenant-scoped (tenant_id + nonce) to preserve isolation boundaries.
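The anti-replay controls above combine a time-window check with tenant-scoped nonce uniqueness. A minimal sketch follows, assuming in-memory storage and a 300-second window; the class name SimpleAntiReplayGuard, its method signature, and the ReplayError exception are hypothetical and do not describe the real AntiReplayGuard API.

```python
import time
from typing import Optional


class ReplayError(Exception):
    """Raised when a manifest is stale or its nonce was already used."""


class SimpleAntiReplayGuard:
    """Illustrative guard; storage is in-memory and never expires."""

    def __init__(self, max_skew_seconds: float = 300.0):
        self._max_skew = max_skew_seconds
        self._seen: set[tuple[str, str]] = set()   # (tenant_id, nonce)

    def check(self, tenant_id: str, nonce: str, issued_at: float,
              now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        if abs(now - issued_at) > self._max_skew:
            raise ReplayError("issued_at outside allowed window")
        key = (tenant_id, nonce)   # scoped per tenant for isolation
        if key in self._seen:
            raise ReplayError("nonce already used for this tenant")
        self._seen.add(key)
```

Because the key is the (tenant_id, nonce) pair, two tenants that happen to pick the same nonce do not collide, while a repeat within one tenant is rejected.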
Security and Reliability Requirements
- No secrets in code or logs.
- TLS-only transport for outbound telemetry.
- Role-based authorization enforced before execution.
- Audit trail for denied and successful operations.
- Retry with bounded backoff; circuit-break style failure protection.
- Tamper-evident audit stream via hash-chained audit event records.
Armory + Universal Runner (Sprints 11-13)
- Armory registry:
  - Internal OCI registry service (armory-registry) is deployed in compose.
  - Registry delete operations are disabled for immutable artifact posture.
- Armory control service:
  - pkg.armory.service.ArmoryService handles ingest pipeline:
    - upload digesting (sha256),
    - SBOM metadata generation,
    - vulnerability summary generation,
    - signing metadata generation,
    - operator approval gating.
- Runner cryptographic gate:
  - pkg.runner.jws_verify.RunnerJWSVerifier validates compact JWS before execution admission.
  - Forged signature attempts are hard-failed before any tool resolution.
- Signed tool retrieval:
  - pkg.runner.universal.UniversalEdgeRunner resolves only approved Armory digests that exactly match ExecutionManifest.tool_sha256.
- Sandbox profile:
  - Runner command contract enforces --runtime=runsc, AppArmor profile pinning, read-only rootfs, dropped capabilities, and no-network baseline.
- Execution contract:
  - stdout/stderr/exit_code are mapped into CloudEvents v1.0 via pkg.runner.cloudevents.map_execution_to_cloudevent.
- QA guarantees:
  - tests/qa/test_execution_fabric_qa.py validates forged-JWS rejection, tampered-digest rejection, and CloudEvents output integrity.
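The execution contract above maps stdout/stderr/exit_code into a CloudEvents v1.0 envelope. The sketch below shows one possible shape; it is not the project's map_execution_to_cloudevent. The function name and the source and type values are assumptions, while specversion, id, source, and type are the context attributes that CloudEvents v1.0 requires.

```python
import uuid
from datetime import datetime, timezone


def execution_to_cloudevent(task_id: str, stdout: str, stderr: str, exit_code: int) -> dict:
    """Wrap one execution result in a CloudEvents v1.0 JSON-style envelope."""
    return {
        # Required context attributes in CloudEvents v1.0.
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": "urn:example:runner",              # assumed value
        "type": "com.example.execution.completed",   # assumed value
        # Optional context attributes.
        "subject": task_id,
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        # Event payload: the runner's raw execution outcome.
        "data": {"stdout": stdout, "stderr": stderr, "exit_code": exit_code},
    }
```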
OPA Capability Policies (Sprint 14)
- Pre-sign authorization:
  - Orchestrator pre-sign flow can query OPA before issuing compact JWS manifests.
- Capability tuple model:
  - Policy evaluation uses a tuple match on operator_id + tenant_id + tool_sha256 + target_urn.
- Policy defaults:
  - spectrastrike.capabilities.allow is deny-by-default.
  - Input contract validation is exposed through spectrastrike.capabilities.input_contract_valid.
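The real decision is made by OPA, so the following Python is only a model of the semantics described above: an exact four-field tuple match with deny as the default. The Capability type, the allow function, and the in-memory set of grants are hypothetical.

```python
from typing import NamedTuple


class Capability(NamedTuple):
    operator_id: str
    tenant_id: str
    tool_sha256: str
    target_urn: str


def allow(request: dict, granted: set) -> bool:
    """Deny by default: only an exact four-field match is allowed."""
    try:
        wanted = Capability(
            request["operator_id"], request["tenant_id"],
            request["tool_sha256"], request["target_urn"],
        )
    except KeyError:
        return False  # incomplete input never matches
    return wanted in granted
```

Changing any one field, for example the tool digest, produces a different tuple and therefore a denial.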
Wrapper Telemetry Migration (Sprint 16.5)
- Legacy wrapper direct telemetry emission (telemetry.ingest(...) in wrappers) is deprecated.
- Wrappers emit SDK-built payloads (pkg.telemetry.sdk) and submit through unified parser path (telemetry.ingest_payload(...)).
- Security invariants are preserved by parser/enforcement gates:
  - strict tenant context propagation (tenant_id required),
  - unified schema validation before buffering/publishing.
Control Plane Integrity Hardening (Sprint 19)
- Signed startup config gate:
  - pkg.orchestrator.control_plane_integrity.ControlPlaneIntegrityEnforcer enforces JWS signature validation before startup acceptance.
  - Unsigned or invalid signatures are hard-rejected.
- Policy trust pinning:
  - Startup config must include policy_sha256 that matches OPA_POLICY_PINNED_SHA256.
  - Any mismatch is rejected with integrity audit evidence.
- Runtime baseline integrity:
  - Optional startup binary baseline (SPECTRASTRIKE_ENFORCE_BINARY_HASH=true) validates SHA-256 against signed envelope value.
- Immutable configuration history:
  - ImmutableConfigurationHistory stores append-only config versions with hash chaining and duplicate-version rejection.
- Integrity audit channel:
  - pkg.logging.framework.emit_integrity_audit_event writes hash-chained records to dedicated logger spectrastrike.audit.integrity.
- Vault hardening workflow:
  - pkg.orchestrator.vault_hardening.VaultHardeningWorkflow automates transit key rotation checks and unseal share policy enforcement.
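The policy trust pinning rule above amounts to comparing two hex digests and failing closed. A minimal sketch, assuming the startup config is already parsed into a dict and the pin is read from an environment-style mapping; the function name check_policy_pin is hypothetical.

```python
import hmac


def check_policy_pin(startup_config: dict, environ: dict) -> bool:
    """Return True only when policy_sha256 equals the pinned digest."""
    claimed = str(startup_config.get("policy_sha256", "")).lower()
    pinned = str(environ.get("OPA_POLICY_PINNED_SHA256", "")).lower()
    if len(pinned) != 64:
        return False  # missing or malformed pin fails closed
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(claimed.encode(), pinned.encode())
```

In practice the second argument would be os.environ, and a False result would be reported through the integrity audit channel before startup is refused.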
High-Assurance AAA Controls (Sprint 20)
- Hardware-backed MFA for privileged actions:
  - pkg.security.aaa_framework.AAAService now supports hardware assertion verification via HardwareMFAVerifier.
  - Privileged role authorization can require hardware_mfa_assertion in policy context.
- Time-bound privilege elevation:
  - pkg.security.high_assurance.PrivilegeElevationService issues short-lived, one-time elevation tokens.
  - AAA privileged authorization can consume required elevation_token_id via validator hook.
- Dual-control Armory approval:
  - pkg.armory.service.ArmoryService enforces approval quorum (approval_quorum=2 default) for tool authorization.
  - Distinct approvers are required before a digest is marked authorized.
- Dual-signature high-risk manifests:
  - pkg.orchestrator.dual_signature.HighRiskManifestDualSigner enforces independent second signature for high/critical risk levels.
- Break-glass and session recording:
  - Break-glass activation uses irreversible audit flag semantics.
  - PrivilegedSessionRecorder provides structured session start/command/end event capture for privileged activity evidence.
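The dual-control approval rule above (a quorum of distinct approvers, two by default) can be modeled with a set per digest. This is a sketch under assumptions: ApprovalTracker and its methods are hypothetical names, not the ArmoryService interface.

```python
class ApprovalTracker:
    """Illustrative quorum tracker keyed by tool digest."""

    def __init__(self, approval_quorum: int = 2):
        self._quorum = approval_quorum
        self._approvers: dict[str, set[str]] = {}   # digest -> approver ids

    def approve(self, digest: str, approver_id: str) -> bool:
        """Record an approval and return True once the digest is authorized."""
        self._approvers.setdefault(digest, set()).add(approver_id)
        return self.is_authorized(digest)

    def is_authorized(self, digest: str) -> bool:
        # A set ignores repeat approvals from the same person.
        return len(self._approvers.get(digest, ())) >= self._quorum
```

Storing approver IDs in a set is what enforces the "distinct approvers" requirement: the same operator approving twice still counts once.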
Deterministic Execution Guarantees (Sprint 21)
- Canonical manifest serialization:
  - pkg.orchestrator.manifest.canonical_manifest_json enforces deterministic compact JSON (sort_keys=True, fixed separators).
- Deterministic hashing:
  - pkg.orchestrator.manifest.deterministic_manifest_hash computes stable SHA-256 over canonical manifest payload.
- Schema semantic versioning:
  - ManifestSchemaVersionPolicy enforces MAJOR.MINOR.PATCH format and supported major compatibility bounds.
- Non-canonical submission rejection:
  - parse_and_validate_manifest_submission rejects payloads that are not canonical JSON before manifest construction.
- Runtime ingress guard:
  - OrchestratorEngine.validate_manifest_submission exposes canonical validation path for raw manifest intake.
- CI regression guard:
  - scripts/manifest_schema_regression.py validates stable schema hash and is executed in CI (.github/workflows/lint-test.yml).
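Canonical serialization, deterministic hashing, and non-canonical rejection fit together as shown in the sketch below. The function names are hypothetical simplifications of the ones listed above, and the exact separators (",", ":") and ensure_ascii=True are assumptions about what "fixed separators" means in the project.

```python
import hashlib
import json


def canonical_json(manifest: dict) -> str:
    """Serialize with sorted keys and no insignificant whitespace."""
    return json.dumps(manifest, sort_keys=True, separators=(",", ":"), ensure_ascii=True)


def manifest_hash(manifest: dict) -> str:
    """SHA-256 over the canonical form, so key order cannot change the hash."""
    return hashlib.sha256(canonical_json(manifest).encode("utf-8")).hexdigest()


def parse_canonical(raw: str) -> dict:
    """Reject input that differs from its own canonical re-serialization."""
    parsed = json.loads(raw)
    if not isinstance(parsed, dict) or canonical_json(parsed) != raw:
        raise ValueError("manifest is not canonical JSON")
    return parsed
```

The round-trip comparison in parse_canonical is a simple way to detect reordered keys or extra whitespace without writing a custom parser.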
Federation Fingerprint Binding (Sprint 22)
- Unified execution fingerprint schema:
  - manifest_hash + tool_hash + operator_id + tenant_id + policy_decision_hash + timestamp.
  - Implemented in pkg.orchestrator.execution_fingerprint.ExecutionFingerprintInput.
- Fingerprint generation and validation:
  - generate_execution_fingerprint creates deterministic SHA-256 execution fingerprint.
  - validate_fingerprint_before_c2_dispatch enforces pre-dispatch integrity gate.
- Tamper-evident fingerprint audit:
  - Fingerprint bind/validate outcomes are emitted to integrity audit channel via emit_integrity_audit_event.
- VectorVue federation payload binding:
  - RabbitMQ bridge includes execution_fingerprint in outgoing telemetry metadata and federation bundle.
  - Bridge uses federated gateway dispatch path (send_federated_telemetry).
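A fingerprint over the six fields listed above can be sketched as a SHA-256 of their canonical JSON encoding. The names FingerprintInput, fingerprint, and validate_before_dispatch are hypothetical, and the choice of JSON as the pre-hash encoding is an assumption; the project's generate_execution_fingerprint may combine the fields differently.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class FingerprintInput:
    manifest_hash: str
    tool_hash: str
    operator_id: str
    tenant_id: str
    policy_decision_hash: str
    timestamp: str


def fingerprint(data: FingerprintInput) -> str:
    """Deterministic SHA-256 over the sorted, compact JSON of all fields."""
    canonical = json.dumps(asdict(data), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def validate_before_dispatch(data: FingerprintInput, expected: str) -> bool:
    """Recompute the fingerprint and compare it in constant time."""
    return hmac.compare_digest(fingerprint(data), expected)
```

Hashing a keyed JSON object rather than a plain concatenation of the values avoids ambiguity about where one field ends and the next begins.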
Federation Channel Enforcement (Sprint 23)
- Single outbound gateway:
  - Bridge dispatch uses only internal federation endpoint (/internal/v1/telemetry) via send_federated_telemetry.
- Legacy path removal:
  - Direct bridge event/finding API emission path is removed from active bridge runtime.
- mTLS-only federation:
  - Federation dispatch requires TLS verification plus configured mTLS client cert/key.
- Signed telemetry required:
  - Federation dispatch requires payload signature secret configuration; unsigned federation payloads are denied.
- Producer replay detection:
  - Bridge tracks nonce replay window and denies duplicate producer nonce usage.
- Idempotent bounded retry:
  - Idempotency key for federation dispatch is execution fingerprint hash, aligning retries with deterministic replay-safe semantics.
Anti-Repudiation Closure (Sprint 24)
- Operator-identity-bound fingerprint:
  - Operator identity is required and validated when generating execution fingerprint.
- Write-ahead execution intent:
  - Pre-dispatch execution intent records are appended before outbound federation dispatch.
  - Intent records are hash-chained (prev_hash -> intent_hash) for tamper evidence.
- Execution intent verification API:
  - verify_execution_intent_api exposes verification contract for execution_fingerprint and optional operator_id checks.
- Reconciliation and repudiation detection:
  - Operator-to-execution reconciliation confirms immutable attribution.
  - Repudiation attempts (claiming wrong operator) are detected and emitted to integrity audit stream.
- Federation bundle intent metadata:
  - Outbound federation bundle now includes intent_id, intent_hash, and write_ahead=true.
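The hash-chained intent records above (prev_hash -> intent_hash) can be illustrated with an append-only in-memory log. Everything here is a sketch under assumptions: the IntentLog class, the all-zero genesis value, and the record fields other than prev_hash and intent_hash are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed prev_hash for the first record


def _digest(body: dict) -> str:
    encoded = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class IntentLog:
    """Append-only log where each record commits to the one before it."""

    def __init__(self):
        self.records: list[dict] = []

    def append(self, execution_fingerprint: str, operator_id: str) -> dict:
        prev_hash = self.records[-1]["intent_hash"] if self.records else GENESIS
        body = {"execution_fingerprint": execution_fingerprint,
                "operator_id": operator_id, "prev_hash": prev_hash}
        record = dict(body, intent_hash=_digest(body))
        self.records.append(record)
        return record

    def verify(self) -> bool:
        """Walk the chain and recompute every hash from the record body."""
        prev_hash = GENESIS
        for record in self.records:
            body = {k: v for k, v in record.items() if k != "intent_hash"}
            if body["prev_hash"] != prev_hash or _digest(body) != record["intent_hash"]:
                return False
            prev_hash = record["intent_hash"]
        return True
```

In a write-ahead arrangement, append would be called before the outbound dispatch, so a later claim of a different operator for the same fingerprint contradicts a record that verify can still check.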
Testing Strategy (for next tasks)
- Unit tests for scheduler ordering and retry behavior.
- Unit tests for AAA enforcement on task submission.
- Unit tests for telemetry event normalization.
- Integration tests for async execution lifecycle.