Depths is a FastAPI service (entry point: `depths.cli.app`) that accepts OpenTelemetry over HTTP and persists normalized rows into six Delta Lake tables. The service is built around a single orchestrator, `depths.core.logger.DepthsLogger`, which wires together ingestion, validation, batching, local storage, optional S3 shipping, and read APIs.
Signals and endpoints
Depths listens on the standard OTLP HTTP paths:

- `POST /v1/traces`
- `POST /v1/logs`
- `POST /v1/metrics`
- `GET /healthz` for liveness and minimal diagnostics
- `GET /api/spans`, `GET /api/logs`, `GET /api/metrics/points`, and `GET /api/metrics/hist` for quick reads

The ingest endpoints accept both `application/json` and `application/x-protobuf`. The default HTTP port is 4318, and most SDKs append `/v1/{signal}` to the base endpoint automatically.
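To sanity-check an endpoint without an SDK, you can POST a minimal OTLP JSON trace directly. A sketch, assuming Depths is listening on `localhost:4318`; the trace and span IDs are placeholder hex, and OTLP JSON encodes nanosecond timestamps as strings:

```python
import time
import requests  # assumes the `requests` package is installed

now_ns = time.time_ns()
payload = {
    "resourceSpans": [{
        "resource": {"attributes": [
            {"key": "service.name", "value": {"stringValue": "demo"}}
        ]},
        "scopeSpans": [{
            "scope": {"name": "manual-test"},
            "spans": [{
                "traceId": "5b8efff798038103d269b633813fc60c",  # placeholder 16-byte hex
                "spanId": "eee19b7ec3c1b174",                    # placeholder 8-byte hex
                "name": "hello",
                "kind": 2,  # SPAN_KIND_SERVER
                "startTimeUnixNano": str(now_ns - 1_000_000),
                "endTimeUnixNano": str(now_ns),
            }],
        }],
    }]
}

resp = requests.post(
    "http://localhost:4318/v1/traces",  # assumed host/port; adjust for your deployment
    json=payload,
    timeout=5,
)
print(resp.status_code)
```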
Core components
- Ingestion surface: `depths.cli.app` exposes the endpoints and owns the process lifecycle.
- Orchestrator: `depths.core.logger.DepthsLogger` constructs and coordinates everything under a single instance root.
- Mappers: `depths.core.otlp_mapper` converts decoded OTLP payloads to row dicts shaped for our tables, stamping resource and scope context.
- Producer: `depths.core.producer.LogProducer` validates and normalizes events against an `EventSchema`.
- Aggregator: `depths.core.aggregator.LogAggregator` drains the producer buffer, batches rows into typed DataFrames, and appends to Delta tables.
- Delta I/O: `depths.core.delta` provides safe creates, appends, compaction, checkpoint, and vacuum helpers over `deltalake` + Polars.
- S3 shipper: `depths.core.shipper` can seal a UTC day, upload to S3, and verify row counts, using `depths.core.config.S3Config`.
- Config: `depths.core.config` centralizes options for logger, producer, aggregator, shipper, and S3.
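Only the module and class names above come from the codebase; the sketch below shows how they might be wired from Python, with constructor arguments that are purely illustrative:

```python
# Hypothetical wiring sketch: every argument name here is illustrative, not the real API.
from depths.core.logger import DepthsLogger
from depths.core.config import S3Config

logger = DepthsLogger(
    instance_root="/var/lib/depths/instance-01",              # assumed option: the instance root
    s3=S3Config(bucket="my-telemetry", region="us-east-1"),   # assumed S3Config fields
)
# DepthsLogger constructs and coordinates the producer, aggregator, and shipper itself,
# so application code would not normally touch those components directly.
```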
Data model and tables
Depths persists six OTel-aligned tables under a per-day directory:

- Spans
- Span events
- Span links
- Logs
- Metric points (Gauge or Sum)
- Metric histograms (Histogram, Exponential Histogram, Summary)
Signal journey
- Receive: the FastAPI endpoint in `depths.cli.app` accepts OTLP JSON or protobuf. Payloads may be gzip-encoded. The app creates a process-wide `depths.core.logger.DepthsLogger` on first request.
- Map: `depths.core.otlp_mapper` converts the OTLP message to table-specific rows. Resource and scope attributes are normalized. Correlation IDs are coerced to stable formats.
- Produce: `depths.core.producer.LogProducer` applies the `depths.core.schema.EventSchema` contract: defaults, computed fields, type coercion, and required checks. Valid rows go into a bounded queue.
- Aggregate and persist: `depths.core.aggregator.LogAggregator` batches rows into typed Polars DataFrames and appends them to the correct Delta table under the current UTC day. The aggregator tracks triggers like batch age and size (see the sketch after this list).
- Seal and ship (optional): at day close, or on demand, `depths.core.shipper` seals the day: compact files, write a checkpoint, vacuum old debris, compute row counts and versions, then upload to S3 and verify. Optimize, checkpoint, and vacuum are standard Delta maintenance steps.
- Read: read endpoints and the `depths.core.logger.DepthsLogger.read_*` helpers construct Polars lazy scans over local paths or S3 URIs. Delta's transaction log lets Polars read only what is needed.
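As a concrete illustration of the aggregate-and-persist step, the sketch below batches row dicts into a Polars frame and appends it to a Delta table using Polars' own Delta writer. The column names, instance root, and table path are assumptions for the example; the real aggregator's internals may differ:

```python
from datetime import datetime, timezone
import polars as pl

# Hypothetical rows as the mapper might emit them; column names are illustrative.
rows = [
    {"trace_id": "5b8e...", "span_id": "eee1...", "name": "hello", "duration_ns": 1_000_000},
]
df = pl.DataFrame(rows)

# One Delta table per signal under the current UTC day (path layout per the text below).
day = datetime.now(timezone.utc).strftime("%Y-%m-%d")
table_path = f"/var/lib/depths/instance-01/staging/days/{day}/otel/spans"  # assumed root

df.write_delta(table_path, mode="append")  # Polars' Delta writer (deltalake under the hood)
```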
Instance layout and day boundaries
Each Depths instance has a root directory that contains configs, indexes, and a `staging/days/YYYY-MM-DD/otel/` tree with one Delta table per signal. Day boundaries are in UTC. Capturing the target table path at enqueue time keeps late batches on the correct day.
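Concretely, one day might look like this on disk; the per-signal directory names are assumptions mirroring the six tables above:

```
instance-root/
└── staging/
    └── days/
        └── 2025-01-15/
            └── otel/
                ├── spans/            # each a Delta table (Parquet + _delta_log/)
                ├── span_events/
                ├── span_links/
                ├── logs/
                ├── metric_points/
                └── metric_hists/
```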
Storage modes
Readers accept a `storage` selector:

- `auto` picks S3 if a sealed day exists remotely, otherwise local.
- `local` forces local Delta tables.
- `s3` forces S3 and uses `S3Config.to_delta_storage_options()` to construct reader options.
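The `s3` path maps naturally onto Polars' `storage_options` parameter. A sketch; `S3Config.to_delta_storage_options()` comes from the text above, but the constructor fields and the S3 key layout shown here are assumptions:

```python
import polars as pl
from depths.core.config import S3Config

s3 = S3Config(bucket="my-telemetry", region="us-east-1")  # assumed fields

lf = pl.scan_delta(
    "s3://my-telemetry/days/2025-01-15/otel/spans",       # assumed key layout
    storage_options=s3.to_delta_storage_options(),
)
print(lf.head(10).collect())
```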
Health, lifecycle, and reliability
- `GET /healthz` returns a lightweight JSON document with process and pipeline stats.
- On startup, the orchestrator prepares schemas and directories, installs exit hooks, and resumes any unshipped days.
- The producer buffer and the two aggregator threads create backpressure and clear durability points between network ingress and storage (see the sketch below).
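The buffer-plus-drain design described here is the standard bounded-queue pattern. As a generic Python illustration (not Depths code):

```python
import queue
import threading

buf: "queue.Queue[dict]" = queue.Queue(maxsize=10_000)  # bounded: a full buffer pushes back on ingest

def enqueue(row: dict, timeout: float = 0.5) -> bool:
    """Called from the ingest path; returns False instead of blocking forever."""
    try:
        buf.put(row, timeout=timeout)
        return True
    except queue.Full:
        return False  # caller can shed load or ask the client to retry

def persist(batch: list[dict]) -> None:
    ...  # e.g., build a DataFrame and append to the day's Delta table

def drain() -> None:
    """Aggregator-style consumer: drain, batch, persist."""
    batch: list[dict] = []
    while True:
        batch.append(buf.get())
        if len(batch) >= 500:  # size trigger (age triggers omitted for brevity)
            persist(batch)
            batch = []

threading.Thread(target=drain, daemon=True).start()
```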
Why Delta Lake and Polars
- Delta gives ACID tables on object storage and a transaction log. Operations like OPTIMIZE, CHECKPOINT, and VACUUM reduce file counts, make table state discovery fast, and clean up stale files.
- Polars integrates with Delta via `scan_delta` for lazy reads and supports Delta writes, which keeps ingestion simple and reads efficient.
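The maintenance steps from the sealing phase map onto the `deltalake` package roughly like this; the table path is assumed, and the retention window is an example value:

```python
import polars as pl
from deltalake import DeltaTable

table_path = "/var/lib/depths/instance-01/staging/days/2025-01-15/otel/spans"  # assumed path

dt = DeltaTable(table_path)
dt.optimize.compact()                          # merge small files (OPTIMIZE)
dt.create_checkpoint()                         # snapshot the transaction log (CHECKPOINT)
dt.vacuum(retention_hours=168, dry_run=False)  # drop unreferenced files older than 7 days (VACUUM)

# Lazy read: Polars consults the transaction log and prunes files before scanning.
lf = pl.scan_delta(table_path)
print(lf.select(pl.len()).collect())
```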
What this means for you
- Point any OTLP HTTP exporter to `http://host:4318`. Most exporters append `/v1/{signal}` automatically and support JSON or protobuf.
- Depths handles the rest: map, validate, batch, write to Delta, optionally ship to S3, and let you query back with simple endpoints or the Python API.
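For example, with the OpenTelemetry Python SDK (host and service name are placeholders; note that when you set the HTTP exporter's endpoint explicitly, it expects the full signal path):

```python
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

provider = TracerProvider(resource=Resource.create({"service.name": "demo"}))
provider.add_span_processor(
    BatchSpanProcessor(
        # An explicit endpoint includes the signal path; setting the base URL via
        # OTEL_EXPORTER_OTLP_ENDPOINT instead lets the SDK append /v1/traces itself.
        OTLPSpanExporter(endpoint="http://host:4318/v1/traces")
    )
)
trace.set_tracer_provider(provider)

with trace.get_tracer("demo").start_as_current_span("hello"):
    pass  # the span is exported to Depths on flush/shutdown
```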