Operations - Website Offline Controls, Heartbeat, And API Observability

Overview

This document specifies legacy behavior for storefront offline controls, heartbeat endpoints, and API observability that is not yet covered by dedicated migration specs.

Checklist coverage:
  - OP-001 Offline middleware / website online-offline controls
  - OP-004 Request/response traffic recording middleware in APIs
  - OP-005 Sentry exception capture in APIs
  - OP-012 Heartbeat endpoint/admin heartbeat module

Primary legacy evidence:
  - _core/config/router.php
  - _www/index.php
  - _www/api.php
  - _www/appconfig/constants.php
  - src/functions.php
  - src/Atraxion/Front/Middleware/OfflineMiddleware.php
  - src/Atraxion/Common/Command/WebsiteOfflineCommand.php
  - src/Atraxion/Common/Command/WebsiteOnlineCommand.php
  - _api/index.php
  - _autodoc/index.php
  - src/Atraxion/Api/Middleware/TrafficRecordingMiddleware.php
  - src/Atraxion/Api/Repository/TrafficRecordingRepository.php
  - src/Atraxion/Api/Resources/doctrine/TrafficRecording.orm.xml
  - src/Atraxion/Request/RequestResponseRecorderMiddleware.php
  - src/Atraxion/Request/DbTableRequestResponseStorage.php
  - src/Atraxion/Api/Service/SentryService.php
  - _www/appconfig/bootstrap.php
  - _core/config/routing.php
  - _admin/config/routing.php
  - _core/controller/classes/atx/system/HeartbeatController.php
  - _admin/controller/classes/atx/system/AdminHeartbeatController.php

Legacy Capabilities (As-Is)

1. Website Offline Controls (OP-001)

State model:
  1. Offline state is represented by file presence at OFFLINE_FILE (APP_PUBLIC_DIR/appconfig/offline).
  2. File contents store the expected back-online timestamp.
  3. is_offline(bool $checkIp) supports privileged-IP bypass when $checkIp=true.

Control entry points:
  1. CLI command website:offline [back-online] writes/extends offline timestamp.
  2. CLI command website:online removes offline file.

Runtime enforcement:
  1. _www/index.php blocks public storefront early when is_offline(true) and renders offline page.
  2. _www/api.php returns HTTP 503 JSON when is_offline(true).
  3. Slim front app uses OfflineMiddleware in _core/config/router.php; middleware checks is_offline(false) and returns 503 HTML or JSON payload.
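
The file-backed state model and the two bypass modes can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the legacy PHP code. The privileged-IP set and the timestamp encoding (Unix epoch seconds as plain text) are assumptions, and `go_offline`, `go_online`, and `back_online_at` are hypothetical helper names.

```python
from pathlib import Path
from typing import Optional

# Hypothetical privileged-IP list; the real list lives in legacy configuration.
PRIVILEGED_IPS = {"203.0.113.10"}


def is_offline(offline_file: Path, check_ip: bool, client_ip: Optional[str] = None) -> bool:
    """Offline means the marker file exists; privileged IPs bypass only when check_ip is set."""
    if not offline_file.exists():
        return False
    if check_ip and client_ip in PRIVILEGED_IPS:
        return False
    return True


def go_offline(offline_file: Path, back_online_ts: int) -> None:
    # Assumption: the back-online timestamp is stored as epoch seconds in plain text.
    offline_file.write_text(str(back_online_ts))


def go_online(offline_file: Path) -> None:
    # Removing the marker file brings the site back online.
    offline_file.unlink(missing_ok=True)


def back_online_at(offline_file: Path) -> Optional[int]:
    """Return the stored back-online timestamp, or None when the site is online."""
    if not offline_file.exists():
        return None
    return int(offline_file.read_text().strip())
```

In this sketch the same privileged client sees the site as online through a `check_ip=True` entry point and as offline through a `check_ip=False` one, which is the behavioral split between the entrypoints and the middleware described above.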

2. API Traffic Recording (OP-004)

_api application:
  1. Uses TrafficRecordingMiddleware.
  2. Records request/response in api_traffic_logs Doctrine entity.
  3. Captures duration and runs probabilistic retention cleanup (~5%) for logs older than two months.
  4. On exceptions, writes a debug-level synthetic response into logs and returns safe error response.

_autodoc application:
  1. Uses RequestResponseRecorderMiddleware with autodoc_traffic_logger.
  2. Persists request, response, and optional exception fields in DB table autodoc_traffic_logs.
  3. Retention is probabilistic (~5%) with a two-week window.
  4. Middleware rethrows exceptions after recording; default error handler then builds JSON error response.
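
The probabilistic retention pattern shared by both recorders can be sketched as follows. This is an illustrative Python model, not the legacy PHP code: the in-memory list stands in for the database table, `record` is a hypothetical name, and the random source is injectable so the behavior can be demonstrated.

```python
import random
from datetime import datetime, timedelta
from typing import Callable, List, Tuple

CLEANUP_PROBABILITY = 0.05  # roughly 5% of writes also trigger cleanup


def record(
    logs: List[Tuple[datetime, str]],
    entry: str,
    now: datetime,
    retention: timedelta,
    rng: Callable[[], float] = random.random,
) -> bool:
    """Append a log entry and, on a small fraction of writes, purge expired entries.

    Returns True when this write also ran the cleanup.
    """
    logs.append((now, entry))
    if rng() >= CLEANUP_PROBABILITY:
        return False  # most writes skip cleanup entirely
    cutoff = now - retention
    logs[:] = [(ts, e) for ts, e in logs if ts >= cutoff]
    return True
```

Because cleanup only runs when a write happens to draw below the threshold, expired rows can outlive the retention window by an arbitrary amount on low-traffic systems.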

3. API Sentry Exception Capture (OP-005)

Global bootstrap:
  1. _www/appconfig/bootstrap.php initializes Sentry SDK.
  2. Shutdown handler captures fatal errors as Sentry messages.

Per-app capture points:
  1. _api default error handler captures exception with request context via SentryService.
  2. _api TrafficRecordingMiddleware also captures exception before building fallback response.
  3. _autodoc default error handler captures exception with request context via SentryService.

Context enrichment:
  1. SentryService attaches request metadata, locale tag, optional authenticated user info, endpoint, and IP.
  2. Selected sensitive fields are redacted for Sentry context (authorization, cookie, selected auth server params).
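
Header redaction of the kind described for the Sentry context might look like the sketch below. It is illustrative Python, not SentryService itself: the placeholder string and case-insensitive matching are assumptions, and the legacy list of redacted auth server params is not reproduced.

```python
from typing import Dict

REDACTED = "[redacted]"  # hypothetical placeholder value
# Only the two headers named in the legacy description are listed here.
SENSITIVE_HEADERS = {"authorization", "cookie"}


def redact_headers(headers: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the headers with sensitive values replaced by a placeholder."""
    return {
        name: REDACTED if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }
```

For example, `redact_headers({"Authorization": "Bearer abc", "Accept": "application/json"})` keeps the `Accept` value and replaces the `Authorization` value with the placeholder.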

4. Heartbeat Endpoint And Admin Heartbeat Module (OP-012)

Public heartbeat route:
  1. Legacy route key hb maps to HeartbeatController.
  2. init action returns lightweight runtime data (cart context and session-online updates for eligible authenticated users).
  3. Used as a periodic front signal path rather than a deep health diagnostics endpoint.

Admin heartbeat module:
  1. Admin route key hbadmin maps to AdminHeartbeatController.
  2. check aggregates dashboard metrics (orders, platform orders, manual orders, searches, active users, queue/general-running flags, unvalidated articles, offline status).
  3. Includes UI/session toggles (hideAdminInfo, hideHeartbeatMenu, showDebug) and version switch helper (switchVersion for eligible users).

Current Gaps And Risks To Resolve In Migration

  1. Offline checks are duplicated across entrypoints and middleware with different bypass semantics (is_offline(true) vs is_offline(false)).
  2. API offline behavior is not unified between legacy _www/api.php, _api, and _autodoc apps.
  3. _api may double-report exceptions to Sentry (middleware + error handler for same failure path).
  4. Traffic recording stores full request/response payloads and headers; redaction policy is inconsistent across apps.
  5. Retention windows differ (api_traffic_logs two months vs autodoc_traffic_logs two weeks) without a formal policy.
  6. Cleanup is probabilistic and piggybacked on writes, so deletion cadence is non-deterministic.
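  7. OfflineMiddleware detects JSON clients from the request Content-Type only. That header describes the request body, not the response format the client expects, so the 503 payload format can mismatch client expectations (for example, on requests without a body).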
  8. Heartbeat behavior and payloads are legacy-controller specific and undocumented as a stable contract.
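
One possible way to address the JSON-detection gap is to prefer the Accept header and fall back to Content-Type only when Accept is absent or a wildcard. The Python sketch below illustrates that idea under stated assumptions: `wants_json` is a hypothetical helper, q-values are ignored, and the document does not prescribe this negotiation rule.

```python
def wants_json(accept: str = "", content_type: str = "") -> bool:
    """Decide whether a maintenance response should be JSON.

    Assumption: Accept takes precedence; Content-Type is only a fallback.
    q-value weighting is deliberately ignored in this sketch.
    """
    accepted = [
        part.split(";")[0].strip().lower()
        for part in accept.split(",")
        if part.strip()
    ]
    specific = [media for media in accepted if media != "*/*"]
    if specific:
        return any(
            media == "application/json" or media.endswith("+json")
            for media in specific
        )
    # No usable Accept header: fall back to the request body type.
    return content_type.split(";")[0].strip().lower() == "application/json"
```

With this rule, a browser sending `Accept: text/html` alongside a JSON form post would receive the HTML offline page, while an API client sending `Accept: application/json` on a body-less GET would receive JSON.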

Target Migration Specification (Symfony)

Scope

In-scope:
  - Unified maintenance-mode domain behavior for storefront and APIs.
  - Explicit contract for storefront/admin heartbeat endpoints.
  - Unified API request/response recording policy with redaction and retention controls.
  - Deterministic API exception telemetry pipeline to Sentry.

Out-of-scope:
  - SIEM downstream integrations beyond Sentry.
  - Product-level alert routing ownership matrix.

Required Operating Rules

  1. Maintenance mode must be represented by one typed source of truth (not raw file checks scattered across apps).
  2. All HTTP surfaces (webshop, internal API, Autodoc API) must follow one maintenance policy contract and response format per channel.
  3. Exception-to-Sentry pipeline must capture each exception once per request path.
  4. Traffic recording must apply configurable redaction rules for authorization headers, cookies, tokens, and sensitive body fields.
  5. Retention and cleanup must be deterministic (scheduled cleanup command/job), not probabilistic on write.
  6. Recorder behavior must define body truncation/size limits to control storage and PII exposure.
  7. Heartbeat endpoints must expose a documented, versioned payload contract with clear auth boundaries.
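
Rules 4 and 6 can be illustrated together with a small sketch that redacts sensitive body fields and then enforces a size limit. This is Python for illustration only, not the target Symfony code. The key list, the byte limit, the placeholder, and the `prepare_body` name are all hypothetical and would come from configuration.

```python
import json
from typing import Any

REDACTED = "[redacted]"
SENSITIVE_KEYS = {"password", "token", "access_token"}  # hypothetical configurable list
MAX_BODY_BYTES = 2048  # hypothetical size limit


def redact(value: Any) -> Any:
    """Recursively replace values stored under sensitive keys."""
    if isinstance(value, dict):
        return {
            key: REDACTED if key.lower() in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


def prepare_body(raw: str, max_bytes: int = MAX_BODY_BYTES) -> str:
    """Redact sensitive JSON fields, then truncate the result to the size limit."""
    try:
        body = json.dumps(redact(json.loads(raw)))
    except ValueError:
        # Non-JSON bodies cannot be field-redacted here; only truncation applies.
        body = raw
    encoded = body.encode("utf-8")
    if len(encoded) <= max_bytes:
        return body
    # Cut on a byte boundary and drop any partial trailing character.
    return encoded[:max_bytes].decode("utf-8", errors="ignore") + "...[truncated]"
```

Redacting before truncating matters in this sketch: truncating first could break the JSON and leave a secret in the unparsed remainder.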

Proposed Target Components

Application services:
  - MaintenanceWindowService (state and back-online metadata)
  - MaintenanceModeResponder (HTML/JSON response contracts)
  - HeartbeatStatusService (front/admin heartbeat payload builder)
  - ApiTrafficRecorder (redaction, truncation, persistence)
  - ExceptionTelemetryService (Sentry envelope and dedup guard)

Middleware:
  - MaintenanceModeMiddleware (shared across apps, policy-driven)
  - ApiObservabilityMiddleware (records duration + traffic with consistent schema)
  - ApiExceptionMiddleware (single capture point to telemetry and response mapping)

Data stores:
  - api_traffic_logs replacement table(s) with explicit retention index strategy.
  - Optional separate table for exception metadata if query needs differ from traffic logs.
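
The dedup guard named for ExceptionTelemetryService could work along the lines of the sketch below, which marks each exception object once it has been forwarded. This is an illustrative Python model under assumptions: the class name, the marker attribute, and the sink callable (standing in for the Sentry client) are hypothetical, and a PHP implementation would need its own way of tracking captured exceptions.

```python
from typing import Callable


class ExceptionTelemetry:
    """Forwards each exception object to the telemetry sink at most once."""

    _MARKER = "_telemetry_captured"  # hypothetical marker attribute

    def __init__(self, sink: Callable[[BaseException], None]) -> None:
        self._sink = sink

    def capture(self, exc: BaseException) -> bool:
        """Send the exception unless it was already captured; report whether it was sent."""
        if getattr(exc, self._MARKER, False):
            return False
        setattr(exc, self._MARKER, True)
        self._sink(exc)
        return True
```

With a guard like this, a middleware and an error handler can both call `capture` on the same failure path and only the first call produces an event.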

Data And Migration Notes

  1. Preserve historical api_traffic_logs and autodoc_traffic_logs data for the agreed retention horizon before archive/drop.
  2. Define and run one migration for canonical observability schema (or explicitly keep separate schemas with policy parity).
  3. Migrate OFFLINE_FILE behavior to a typed configuration/state store with equivalent back-online timestamp semantics.
  4. Document allowed maintenance bypass policy (if any) as explicit role-based authorization, not implicit IP checks.
  5. Define compatibility strategy for legacy heartbeat response fields used by current admin/front polling code.

Acceptance Scenarios (Gherkin)

Feature: Maintenance mode and API observability

  Scenario: Storefront maintenance mode returns expected page
    Given maintenance mode is enabled until 15:30
    When an anonymous customer requests the webshop
    Then response status should be 503
    And response should contain back-online metadata

  Scenario: API maintenance mode returns structured JSON
    Given maintenance mode is enabled
    When a client calls an API endpoint
    Then response status should be 503
    And response body should follow the agreed maintenance JSON contract

  Scenario: API exceptions are captured once
    Given an API handler throws an exception
    When the request fails
    Then exactly one Sentry event should be emitted for that exception

  Scenario: Traffic recorder redacts sensitive fields
    Given a request includes Authorization and Cookie headers
    When the request and response are recorded
    Then stored traffic logs should not contain raw secret values

  Scenario: Traffic retention cleanup is deterministic
    Given logs older than configured retention exist
    When cleanup job runs
    Then stale logs should be deleted according to policy

  Scenario: Admin heartbeat returns operational snapshot
    Given an authenticated admin opens the heartbeat dashboard
    When heartbeat check is requested
    Then response should include orders/users/runtime-state metrics
    And the payload should follow the documented heartbeat contract

Open Decisions

  1. Whether maintenance-mode bypass should exist at all in target production and, if so, whether it should be role-based or network-based.
  2. Whether to keep separate observability stores for _api and _autodoc or merge into one canonical API telemetry schema.
  3. Which fields are mandatory to redact in recorded bodies (tenant/customer-specific legal constraints).
  4. Whether to sample successful request recording in production while always recording failures.
  5. Whether storefront heartbeat remains a dedicated endpoint or is replaced by a broader health/status API.