Operations Runbook

Comprehensive operational guide for tenant onboarding, service configuration, monitoring, contextual messaging operations, troubleshooting, and Terraform workflows.

Operations and Configuration Runbook

This guide covers day-to-day operational procedures for PlatformXe, from initial tenant setup through service configuration, monitoring, and troubleshooting.


Tenant onboarding

API key creation and scope assignment

Every tenant receives one or more API keys with scoped access. Create keys through the portal or the admin dashboard.

// TypeScript SDK -- verify key works
const client = new PlatformXeClient({
  apiKey: 'pxk_live_your_key_here',
});

const health = await client.healthCheck();
console.log(health.status); // "ok"

# Python SDK
from platformxe import PlatformXeClient

client = PlatformXeClient(api_key="pxk_live_your_key_here")

// Go SDK
client := platformxe.NewClient(platformxe.ClientConfig{
    APIKey: "pxk_live_your_key_here",
})

Recommended scope sets by use case:

| Use case | Scopes |
| --- | --- |
| Full-stack app | messaging:send, storage:upload, storage:read, permissions:check, events:read |
| Authorization only | permissions:check, permissions:manage, permissions:audit |
| Background jobs | exports:create, events:manage, webhooks:manage |
| Read-only dashboard | storage:read, events:read, permissions:check |
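Before issuing a key, it can help to check a proposed scope list against the recommended set. The following is a minimal sketch: the scope strings are copied from the table above, but the `missingScopes` helper and the use-case identifiers are hypothetical and not part of the SDK.

```typescript
type UseCase = 'fullStackApp' | 'authorizationOnly' | 'backgroundJobs' | 'readOnlyDashboard';

// Scope sets copied from the table above; the keys are illustrative names.
const RECOMMENDED_SCOPES: Record<UseCase, readonly string[]> = {
  fullStackApp: ['messaging:send', 'storage:upload', 'storage:read', 'permissions:check', 'events:read'],
  authorizationOnly: ['permissions:check', 'permissions:manage', 'permissions:audit'],
  backgroundJobs: ['exports:create', 'events:manage', 'webhooks:manage'],
  readOnlyDashboard: ['storage:read', 'events:read', 'permissions:check'],
};

// Returns the recommended scopes that the granted list does not include.
function missingScopes(useCase: UseCase, grantedScopes: readonly string[]): string[] {
  return RECOMMENDED_SCOPES[useCase].filter((scope) => grantedScopes.indexOf(scope) === -1);
}

// A dashboard key that was only granted storage:read.
console.log(missingScopes('readOnlyDashboard', ['storage:read']));
```

An empty result means the key covers the recommended set for that use case.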

Service processor initial configuration

After creating an API key, configure processors for each service your tenant will use.

// Enable and configure storage
await client.storage.updateProcessor({
  enabled: true,
  config: {
    maxFileSizeMb: 25,
    allowedMimeTypes: ['image/jpeg', 'image/png', 'application/pdf'],
    moderationEnabled: true,
  },
});

// Enable OCR
await client.ocr.updateProcessor({
  enabled: true,
  config: {
    confidenceThreshold: 0.85,
    supportedDocumentTypes: ['NIN_SLIP', 'DRIVERS_LICENSE'],
  },
});

// Enable identity resolution
await client.identity.updateProcessor({
  enabled: true,
  config: { retryOnFailure: true, maxRetries: 3 },
});

# Python equivalents
client.storage.update_processor(enabled=True, config={"maxFileSizeMb": 25})
client.ocr.update_processor(enabled=True, config={"confidenceThreshold": 0.85})
client.identity.update_processor(enabled=True, config={"retryOnFailure": True})

// Go equivalents
client.Storage.UpdateProcessor(map[string]interface{}{"enabled": true, "config": map[string]interface{}{"maxFileSizeMb": 25}})
client.Ocr.UpdateProcessor(map[string]interface{}{"enabled": true, "config": map[string]interface{}{"confidenceThreshold": 0.85}})

Channel setup for contextual messaging

Create channels before threads can be opened. Each channel maps to a domain entity type.

await client.threads.createChannel({
  slug: 'booking',
  displayName: 'Booking Conversations',
  entityType: 'BOOKING',
  participantRoles: ['GUEST', 'HOST', 'PLATFORM'],
  defaultVisibility: ['ALL'],
  lifecycleRules: {
    autoClose: { onEntityStatus: ['CHECKED_OUT', 'CANCELLED'] },
    autoArchive: { afterClosedDays: 90 },
  },
});

Webhook and event subscription setup

// Create a webhook endpoint
const webhook = await client.webhooks.create({
  url: 'https://your-app.com/webhooks/platformxe',
  events: ['email.message.*', 'permissions.role.*'],
  secret: 'whsec_your_signing_secret',
});

// Create an event subscription
await client.events.createSubscription({
  eventTypes: ['BOOKING_CONFIRMED', 'BOOKING_CANCELLED'],
  webhookUrl: 'https://your-app.com/events',
  isActive: true,
});

Service configuration

Processor types and defaults

PlatformXe has 7 configurable processor types. Each controls runtime behaviour for its service.

| Processor | Key settings | Default values |
| --- | --- | --- |
| Messaging | retryMaxAttempts, retryDelayMs, deadLetterAfter | 3 attempts, 2000 ms delay, dead-letter after 5 failures |
| Storage | maxFileSizeMb, allowedMimeTypes, moderationEnabled | 10 MB, all types, moderation off |
| OCR | confidenceThreshold, supportedDocumentTypes | 0.80 threshold, all document types |
| PDF | defaultPageSize, defaultMargins | A4, 20/15/20/15 margins |
| QR | defaultSize, defaultFormat, brandColor | 256 px, PNG, black |
| Exports | maxConcurrentJobs, retentionDays | 2 concurrent, 7 day retention |
| Identity | retryOnFailure, maxRetries, cacheTtlSeconds | Retry on, 2 retries, 3600 s cache |
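To preview the configuration a processor would end up with, the defaults can be combined with a set of overrides locally. This is only a sketch: it assumes that settings omitted from an update keep their default values, which this guide does not state explicitly, and the `withDefaults` helper and `IdentityConfig` type are hypothetical. The default values are taken from the Identity row of the table above.

```typescript
// Hypothetical type mirroring the Identity processor's key settings.
interface IdentityConfig {
  retryOnFailure: boolean;
  maxRetries: number;
  cacheTtlSeconds: number;
}

// Defaults from the table: retry on, 2 retries, 3600 s cache.
const IDENTITY_DEFAULTS: IdentityConfig = {
  retryOnFailure: true,
  maxRetries: 2,
  cacheTtlSeconds: 3600,
};

// Copies the defaults, then applies every override that is actually set.
// ASSUMPTION: omitted settings fall back to their defaults.
function withDefaults<T extends object>(defaults: T, overrides: Partial<T>): T {
  const merged = { ...defaults };
  for (const key of Object.keys(overrides) as (keyof T)[]) {
    const value = overrides[key];
    if (value !== undefined) {
      merged[key] = value as T[keyof T];
    }
  }
  return merged;
}

const previewConfig = withDefaults(IDENTITY_DEFAULTS, { maxRetries: 5, cacheTtlSeconds: 0 });
console.log(previewConfig);
```

The defaults object is copied rather than mutated, so it can be reused to preview several tenants.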

Recommended configurations by industry

Property management:

// Storage: large files, image moderation on
await client.storage.updateProcessor({
  enabled: true,
  config: { maxFileSizeMb: 50, moderationEnabled: true },
});

// PDF: A4 page size (margins stay at their defaults unless set in config)
await client.pdf.updateProcessor({
  enabled: true,
  config: { defaultPageSize: 'A4' },
});

Healthcare:

// Identity: retry enabled with more attempts; cache TTL set to 0 so results are not reused
await client.identity.updateProcessor({
  enabled: true,
  config: { retryOnFailure: true, maxRetries: 5, cacheTtlSeconds: 0 },
});

// OCR: high confidence for medical documents
await client.ocr.updateProcessor({
  enabled: true,
  config: { confidenceThreshold: 0.95 },
});

Legal services:

// Exports: longer retention for compliance
await client.exports.updateProcessor({
  enabled: true,
  config: { retentionDays: 90, maxConcurrentJobs: 1 },
});

Monitoring and health

Health check endpoint

const health = await client.healthCheck();
// health.status: "ok" | "degraded" | "down"
// health.timestamp: ISO 8601

# Python
health = client.health_check()

// Go
health, err := client.HealthCheck()

Usage monitoring

Track consumption against plan limits on a monthly basis.

const usage = await client.usage.summary({ month: '2026-04' });

console.log(`Emails: ${usage.emailsSent}`);
console.log(`API calls: ${usage.apiCalls}`);
console.log(`Storage: ${usage.storageUsedMb}MB`);
console.log(`Permission checks: ${usage.permissionChecks}`);
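A usage summary is most useful when compared against plan limits. The sketch below flags any metric at or above a warning threshold. The field names match the summary fields printed above, but the `PlanLimits` values, the 80% threshold, and the `usageWarnings` helper are assumptions for illustration; actual limits depend on the tenant's plan.

```typescript
// Fields as printed from the usage summary above.
interface UsageSummary {
  emailsSent: number;
  apiCalls: number;
  storageUsedMb: number;
  permissionChecks: number;
}

// Hypothetical limits; only metrics with a limit are checked.
type PlanLimits = Partial<Record<keyof UsageSummary, number>>;

// Returns one warning per metric whose usage is at or above the threshold.
function usageWarnings(usage: UsageSummary, limits: PlanLimits, threshold = 0.8): string[] {
  const warnings: string[] = [];
  const metrics = Object.keys(limits) as (keyof UsageSummary)[];
  for (const metric of metrics) {
    const limit = limits[metric];
    if (limit === undefined || limit <= 0) continue;
    const ratio = usage[metric] / limit;
    if (ratio >= threshold) {
      warnings.push(`${metric}: ${Math.round(ratio * 100)}% of limit`);
    }
  }
  return warnings;
}

const sampleUsage: UsageSummary = { emailsSent: 900, apiCalls: 100, storageUsedMb: 50, permissionChecks: 10 };
console.log(usageWarnings(sampleUsage, { emailsSent: 1000, apiCalls: 1000 }));
```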

Webhook delivery monitoring

Check webhook delivery health by inspecting the webhook resource and testing delivery.

const webhooks = await client.webhooks.list();

for (const wh of webhooks.data.webhooks) {
  console.log(`${wh.name}: ${wh.isActive ? 'active' : 'disabled'}`);
}

Event log monitoring

Monitor event processing by querying the event log.

const log = await client.events.log({
  from: new Date(Date.now() - 3600000).toISOString(), // last hour
  limit: '100',
});

Contextual messaging operations

Channel lifecycle rules configuration

Lifecycle rules control automatic thread state transitions. Configure them on channel creation or update.

Auto-close rules close threads when the associated entity reaches a terminal status:

await client.threads.updateChannel('ch_abc', {
  lifecycleRules: {
    autoClose: {
      onEntityStatus: ['CHECKED_OUT', 'CANCELLED', 'EXPIRED'],
    },
    autoArchive: {
      afterClosedDays: 90,
    },
    inactivityClose: {
      afterDays: 30,
      warningBeforeDays: 3,
    },
  },
});

Your application forwards entity status changes to trigger lifecycle evaluation:

await client.threads.entityEvent({
  channelSlug: 'booking',
  entityId: 'BK-2026-00451',
  event: 'STATUS_CHANGED',
  newStatus: 'CHECKED_OUT',
});
// If 'CHECKED_OUT' is in the autoClose list, the thread closes automatically
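The auto-close decision itself is a membership test on the configured status list. A minimal local sketch, useful for unit-testing channel configuration before forwarding events; the `ChannelLifecycleRules` type and `shouldAutoClose` function are hypothetical and do not reflect the platform's internal implementation.

```typescript
// Hypothetical shape, mirroring the lifecycleRules object shown above.
interface ChannelLifecycleRules {
  autoClose?: { onEntityStatus: string[] };
}

// True when the new entity status appears in the channel's autoClose list.
function shouldAutoClose(rules: ChannelLifecycleRules, newStatus: string): boolean {
  const statuses = rules.autoClose ? rules.autoClose.onEntityStatus : [];
  return statuses.indexOf(newStatus) !== -1;
}

const bookingRules: ChannelLifecycleRules = {
  autoClose: { onEntityStatus: ['CHECKED_OUT', 'CANCELLED', 'EXPIRED'] },
};

console.log(shouldAutoClose(bookingRules, 'CHECKED_OUT')); // true
console.log(shouldAutoClose(bookingRules, 'CONFIRMED'));   // false
```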

Escalation rule authoring

Escalation rules use JSON Logic conditions to match against flag data and trigger actions automatically.

Condition format (JSON Logic):

{
  "in": [{ "var": "flag.reason" }, ["SAFETY", "EMERGENCY"]]
}

{
  "and": [
    { "in": [{ "var": "flag.reason" }, ["DISPUTE"]] },
    { "==": [{ "var": "flag.severity" }, "HIGH"] }
  ]
}
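To test conditions like these locally before saving a rule, a small evaluator is enough. The sketch below supports only the four operators used in this guide (`var`, `in`, `==`, `and`) and simplifies JSON Logic semantics: `==` is treated as strict equality and `and` returns a boolean. It is an illustration, not the platform's evaluator; for full coverage, use a complete JSON Logic implementation.

```typescript
type JsonValue = string | number | boolean | null | JsonValue[] | { [key: string]: JsonValue };

// Resolves a dotted path such as "flag.reason"; missing paths yield null.
function lookupVar(path: string, data: JsonValue): JsonValue {
  let current: JsonValue = data;
  for (const segment of path.split('.')) {
    if (current === null || typeof current !== 'object' || Array.isArray(current)) {
      return null;
    }
    const next: JsonValue | undefined = current[segment];
    current = next === undefined ? null : next;
  }
  return current;
}

// Evaluates a condition against data. Supports var, in, == and and only.
function evaluateCondition(rule: JsonValue, data: JsonValue): JsonValue {
  if (rule === null || typeof rule !== 'object') return rule; // literal value
  if (Array.isArray(rule)) return rule.map((item) => evaluateCondition(item, data));

  const operator = Object.keys(rule)[0];
  if (operator === undefined) return rule;
  const rawArgs: JsonValue | undefined = rule[operator];
  if (operator === 'var') return lookupVar(String(rawArgs), data);

  const argList: JsonValue[] = Array.isArray(rawArgs) ? rawArgs : [rawArgs === undefined ? null : rawArgs];
  const args = argList.map((arg) => evaluateCondition(arg, data));
  const argAt = (index: number): JsonValue => {
    const value = args[index];
    return value === undefined ? null : value;
  };

  switch (operator) {
    case '==':
      return argAt(0) === argAt(1); // simplification: strict equality
    case 'in': {
      const haystack = argAt(1);
      return Array.isArray(haystack) && haystack.indexOf(argAt(0)) !== -1;
    }
    case 'and':
      return args.every((value) => Boolean(value));
    default:
      throw new Error(`Unsupported operator: ${operator}`);
  }
}

const sampleFlag: JsonValue = { flag: { reason: 'DISPUTE', severity: 'HIGH' } };
const safetyCondition: JsonValue = { in: [{ var: 'flag.reason' }, ['SAFETY', 'EMERGENCY']] };
const disputeCondition: JsonValue = {
  and: [
    { in: [{ var: 'flag.reason' }, ['DISPUTE']] },
    { '==': [{ var: 'flag.severity' }, 'HIGH'] },
  ],
};

console.log(evaluateCondition(safetyCondition, sampleFlag));  // false
console.log(evaluateCondition(disputeCondition, sampleFlag)); // true
```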

Action types:

| Action | Description | Config fields |
| --- | --- | --- |
| CREATE_ISSUE | Create an issue in the connected issue tracker | title, priority, assignee |
| NOTIFY_WEBHOOK | Send a webhook notification | webhookUrl, includeThread |
| SEND_EMAIL | Send an alert email | to, templateId |
| ASSIGN_AGENT | Auto-assign a platform agent | agentPool, strategy |
| CLOSE_THREAD | Force-close the thread | reason |

Autonomous escalation setup

Configure rules for safety, cleanliness, and refund scenarios that trigger without human intervention.

await client.threads.setEscalationConfig('ch_abc', {
  flagReasons: [
    { code: 'SAFETY', label: 'Safety concern', severity: 'HIGH' },
    { code: 'CLEANLINESS', label: 'Cleanliness issue', severity: 'MEDIUM' },
    { code: 'REFUND', label: 'Refund request', severity: 'LOW' },
    { code: 'EMERGENCY', label: 'Emergency', severity: 'HIGH' },
  ],
  rules: [
    {
      id: 'rule-safety',
      name: 'Safety auto-escalation',
      trigger: 'PARTICIPANT_FLAG',
      conditions: { in: [{ var: 'flag.reason' }, ['SAFETY', 'EMERGENCY']] },
      actions: [
        { type: 'CREATE_ISSUE', config: { title: 'SAFETY: {{thread.subject}}', priority: 'URGENT' } },
        { type: 'NOTIFY_WEBHOOK', config: { webhookUrl: 'https://ops.example.com/safety-alerts' } },
      ],
      priority: 1,
      isActive: true,
    },
    {
      id: 'rule-cleanliness',
      name: 'Cleanliness follow-up',
      trigger: 'PARTICIPANT_FLAG',
      conditions: { in: [{ var: 'flag.reason' }, ['CLEANLINESS']] },
      actions: [
        { type: 'ASSIGN_AGENT', config: { agentPool: 'housekeeping', strategy: 'round-robin' } },
      ],
      priority: 2,
      isActive: true,
    },
    {
      id: 'rule-refund',
      name: 'Refund request routing',
      trigger: 'PARTICIPANT_FLAG',
      conditions: { in: [{ var: 'flag.reason' }, ['REFUND']] },
      actions: [
        { type: 'SEND_EMAIL', config: { to: 'refunds@example.com', templateId: 'tmpl_refund_alert' } },
      ],
      priority: 3,
      isActive: true,
    },
  ],
});

Thread lifecycle processing

Inactivity close: Threads with no messages for the configured afterDays period are automatically closed. A warning system message is sent warningBeforeDays before closure.

Auto-archive: Closed threads are archived after afterClosedDays. Archived threads remain queryable but are excluded from inbox views.

Retention: Message and thread data is retained per the organization's data retention policy. Archived thread content is immutable.
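The timing rules above can be turned into concrete dates for a given thread. The following sketch assumes that a "day" is a fixed 24-hour period, that inactivity is measured from the last message, and that the archive countdown starts at the moment of closure; none of these details is confirmed by this guide, and `lifecycleSchedule` is a hypothetical helper.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Setting names follow the lifecycleRules fields used in this guide.
interface LifecycleTimings {
  afterDays: number;         // inactivityClose.afterDays
  warningBeforeDays: number; // inactivityClose.warningBeforeDays
  afterClosedDays: number;   // autoArchive.afterClosedDays
}

// Projects warning, close and archive times from the last message timestamp.
function lifecycleSchedule(lastMessageAt: Date, timings: LifecycleTimings) {
  const closeAt = new Date(lastMessageAt.getTime() + timings.afterDays * DAY_MS);
  const warnAt = new Date(closeAt.getTime() - timings.warningBeforeDays * DAY_MS);
  const archiveAt = new Date(closeAt.getTime() + timings.afterClosedDays * DAY_MS);
  return { warnAt, closeAt, archiveAt };
}

const projected = lifecycleSchedule(new Date('2026-04-01T00:00:00Z'), {
  afterDays: 30,
  warningBeforeDays: 3,
  afterClosedDays: 90,
});

console.log(projected.warnAt.toISOString());    // 2026-04-28T00:00:00.000Z
console.log(projected.closeAt.toISOString());   // 2026-05-01T00:00:00.000Z
console.log(projected.archiveAt.toISOString()); // 2026-07-30T00:00:00.000Z
```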

Audit trail verification

Every thread action produces an audit event. Query the event log to verify the trail.

const log = await client.events.log({
  eventType: 'THREAD_',
  entityId: 'th-001',
});
// Returns: THREAD_CREATED, THREAD_MESSAGE_SENT, THREAD_CLOSED, etc.

Troubleshooting

Common error codes

| Error code | HTTP | Cause | Resolution |
| --- | --- | --- | --- |
| UNAUTHORIZED | 401 | Missing or invalid API key | Verify the x-api-key header value |
| FORBIDDEN | 403 | API key lacks required scope | Add the missing scope to the API key |
| PLAN_REQUIRED | 403 | Feature requires a higher plan | Upgrade tenant plan (e.g., Federation requires Enterprise) |
| NOT_FOUND | 404 | Resource does not exist | Verify the resource ID |
| RATE_LIMITED | 429 | Rate limit exceeded | Back off and retry. See rate limits below |
| VALIDATION_ERROR | 400 | Invalid request body | Check the error message for field-level details |
| CONFLICT | 409 | Duplicate or conflicting state | Check for existing resources with the same unique fields |
| PROVIDER_ERROR | 502 | Upstream provider failure | Retry; the circuit breaker will fail over automatically |
| PROCESSOR_DISABLED | 400 | Service processor is disabled | Enable the processor via updateProcessor |

Rate limiting behaviour

| Route class | Limit | Scope |
| --- | --- | --- |
| Permission checks (check, resolve, batch) | 5,000/hr | permissions:check |
| Permission mutations (CRUD) | 500/hr | permissions:manage |
| Permission audit (logs, export) | 100/hr | permissions:audit |
| All other routes | 1,000/hr | Per API key |

When rate limited, the API returns a 429 response with a Retry-After header indicating the number of seconds until the next allowed request. SDKs with retry enabled (the default) handle this automatically with exponential backoff.
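For clients that call the API without an SDK, the wait time can be derived from the Retry-After value, falling back to exponential backoff when the header is absent. This is a sketch under stated assumptions: the 1-second base and 60-second cap are illustrative defaults, not documented SDK values, and `retryDelayMs` is a hypothetical helper.

```typescript
// Returns how long to wait before the next attempt, in milliseconds.
// retryAfter: the Retry-After header value in seconds, or null if absent.
// attempt: zero-based count of retries already made.
function retryDelayMs(retryAfter: string | null, attempt: number, baseMs = 1000, maxMs = 60000): number {
  if (retryAfter !== null && retryAfter.trim() !== '') {
    const seconds = Number(retryAfter);
    if (isFinite(seconds) && seconds >= 0) {
      return seconds * 1000; // honour the server's hint
    }
  }
  // Fallback: exponential backoff (base * 2^attempt), capped at maxMs.
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

console.log(retryDelayMs('30', 0));  // 30000
console.log(retryDelayMs(null, 3));  // 8000
console.log(retryDelayMs(null, 10)); // 60000
```

A caller would sleep for the returned duration before reissuing the request, and give up after a bounded number of attempts.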

Circuit breaker states

PlatformXe uses circuit breakers for external provider calls (email, SMS, identity resolution). The three states are:

| State | Behaviour |
| --- | --- |
| CLOSED | Normal operation. Requests go to the primary provider |
| OPEN | Primary provider has failed repeatedly. Requests route to the next provider in the fallback chain |
| HALF_OPEN | Testing if the primary provider has recovered. A small percentage of requests probe the primary |

Circuit breakers reset automatically. No manual intervention is required. The health check endpoint reflects provider circuit breaker states.
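The three states form a small state machine. The sketch below illustrates the transitions for readers unfamiliar with the pattern; the failure threshold and cooldown are hypothetical values, the `ProviderBreaker` class is not part of the SDK, and the probe-percentage behaviour of HALF_OPEN is not modelled.

```typescript
type BreakerState = 'CLOSED' | 'OPEN' | 'HALF_OPEN';

class ProviderBreaker {
  private state: BreakerState = 'CLOSED';
  private consecutiveFailures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;

  // Threshold and cooldown are illustrative, not PlatformXe's actual values.
  constructor(failureThreshold = 3, cooldownMs = 30000) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
  }

  // Returns the state at time `now` (ms); OPEN becomes HALF_OPEN after the cooldown.
  current(now: number): BreakerState {
    if (this.state === 'OPEN' && now - this.openedAt >= this.cooldownMs) {
      this.state = 'HALF_OPEN';
    }
    return this.state;
  }

  // A successful call (including a successful probe) closes the breaker.
  recordSuccess(): void {
    this.consecutiveFailures = 0;
    this.state = 'CLOSED';
  }

  // A failed probe, or too many consecutive failures, opens the breaker.
  recordFailure(now: number): void {
    this.consecutiveFailures += 1;
    if (this.state === 'HALF_OPEN' || this.consecutiveFailures >= this.failureThreshold) {
      this.state = 'OPEN';
      this.openedAt = now;
    }
  }
}

const demoBreaker = new ProviderBreaker(2, 1000);
demoBreaker.recordFailure(0);
demoBreaker.recordFailure(10);
console.log(demoBreaker.current(10));   // OPEN
console.log(demoBreaker.current(1010)); // HALF_OPEN
demoBreaker.recordSuccess();
console.log(demoBreaker.current(1020)); // CLOSED
```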

Escalation action failures

If an escalation rule action fails (e.g., webhook timeout, email delivery failure):

  1. The action failure is logged in the event log
  2. The flag remains in PENDING state
  3. The action is retried up to 3 times with exponential backoff
  4. After all retries fail, the action is marked FAILED and a THREAD_ESCALATION_FAILED event is emitted
  5. Manual intervention: review the flag and re-trigger escalation or process manually

// Query for failed escalation actions
const log = await client.events.log({
  eventType: 'THREAD_ESCALATION_FAILED',
});
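The retry sequence described in this section can be dry-run locally to see which delays apply and when the failure event would fire. This sketch records the backoff delays instead of sleeping. It assumes that "up to 3 times" means three retries after the initial attempt and that the base delay is one second; neither is specified in this guide, and `simulateEscalationAction` is a hypothetical helper.

```typescript
interface EscalationTrace {
  finalState: 'SUCCEEDED' | 'FAILED';
  attempts: number;       // total tries, including the initial one
  backoffMs: number[];    // delay that would precede each retry
  emittedEvent: 'THREAD_ESCALATION_FAILED' | null;
}

// tryAction receives the attempt number (0 = initial try) and returns true on success.
// ASSUMPTION: 3 retries after the first attempt, doubling from a 1000 ms base.
function simulateEscalationAction(
  tryAction: (attempt: number) => boolean,
  maxRetries = 3,
  baseDelayMs = 1000,
): EscalationTrace {
  const backoffMs: number[] = [];
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    if (tryAction(attempt)) {
      return { finalState: 'SUCCEEDED', attempts: attempt + 1, backoffMs, emittedEvent: null };
    }
    if (attempt < maxRetries) {
      backoffMs.push(baseDelayMs * Math.pow(2, attempt));
    }
  }
  return {
    finalState: 'FAILED',
    attempts: maxRetries + 1,
    backoffMs,
    emittedEvent: 'THREAD_ESCALATION_FAILED',
  };
}

// A webhook that never responds: all retries are used, then the failure event fires.
console.log(simulateEscalationAction(() => false));

// An action that succeeds on the third try.
console.log(simulateEscalationAction((attempt) => attempt === 2));
```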

Provider failover chain

For messaging services, PlatformXe uses a multi-provider fallback chain. If the primary provider fails, requests automatically route to the next available provider. The order is configured per tenant and is not exposed publicly.

Failed messages enter a persistent retry queue. Monitor queue health through the usage summary endpoint.
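The fallback behaviour amounts to trying each provider in order until one succeeds. The sketch below illustrates the idea with synchronous send functions to keep it short; real provider calls would be asynchronous. The provider identifiers and the `sendWithFailover` helper are hypothetical, and the platform's actual chain is not exposed.

```typescript
interface ProviderAttempt<T> {
  id: string;
  send: () => T; // throws on failure
}

interface FailoverResult<T> {
  providerId: string; // provider that handled the request
  value: T;
  errors: string[];   // failures from providers tried earlier in the chain
}

// Tries each provider in chain order and returns the first success.
function sendWithFailover<T>(chain: ProviderAttempt<T>[]): FailoverResult<T> {
  const errors: string[] = [];
  for (const provider of chain) {
    try {
      return { providerId: provider.id, value: provider.send(), errors };
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      errors.push(`${provider.id}: ${reason}`);
    }
  }
  throw new Error(`All providers failed: ${errors.join('; ')}`);
}

const delivery = sendWithFailover<string>([
  { id: 'primary', send: () => { throw new Error('timeout'); } },
  { id: 'secondary', send: () => 'sent' },
]);

console.log(delivery.providerId); // secondary
console.log(delivery.errors);     // [ 'primary: timeout' ]
```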


Terraform operations

Initial workflow

# 1. Initialize the provider
terraform init

# 2. Preview changes
terraform plan -var="platformxe_api_key=pxk_live_..."

# 3. Apply changes
terraform apply -var="platformxe_api_key=pxk_live_..."

Store your API key in a .tfvars file or environment variable rather than passing it on the command line.

# Using environment variable
export PLATFORMXE_API_KEY="pxk_live_your_key_here"
terraform plan
terraform apply

Resource import for existing infrastructure

If you have resources already created through the portal or SDK, import them into Terraform state before managing them as code.

# Import a role
terraform import platformxe_permissions_role.agent role_abc123

# Import a channel
terraform import platformxe_threads_channel.booking ch_abc123

# Import a processor
terraform import platformxe_storage_processor.config proc_abc123

After importing, run terraform plan to verify the imported state matches your configuration. Fix any drift before applying new changes.

State management best practices

  1. Remote state: Use a remote backend (S3, GCS, Terraform Cloud) for team environments.
  2. State locking: Enable state locking to prevent concurrent modifications.
  3. Workspaces: Use separate workspaces for staging and production tenants.
  4. Sensitive values: Mark API keys as sensitive = true in variable definitions.

variable "platformxe_api_key" {
  type      = string
  sensitive = true
}

Processor resource lifecycle

Processor resources are singletons per service per organization. They are created on first apply and updated in place on subsequent applies. Destroying a processor resource resets it to default values (it does not disable the service).

# View current processor state
terraform state show platformxe_storage_processor.config

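# Refresh state from remote (deprecated since Terraform 0.15.4;
# on newer versions prefer: terraform apply -refresh-only)
terraform refresh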

Handling plan changes

When changing your PlatformXe plan (e.g., upgrading from Basic to Enterprise), some resources may become available or unavailable. Run terraform plan after plan changes to detect drift:

terraform plan
# If federation resources are now available, they will show as "to create"