Spark & Iceberg
Managed Spark & Iceberg with OpenLineage by default.
Built by the co-creators of OpenLineage and Marquez
Skip the dashboards. We build a reusable telemetry context layer where data reliability engineers and agents can solve data incidents together on day one.
We unify compute, storage, lineage, and observability so your team can move faster with shared context at every stage. Use any or all of our tools.
Build. Deploy. Debug. Let us deploy your Spark jobs and handle production monitoring, running investigations as soon as things go wrong.
Work on your own data with an integrated Iceberg storage and query layer, while keeping lineage and observability attached to every query. You can also bring your own Iceberg catalog along for the ride.
Anomaly investigation starts the moment alerts fire. Pull deep context from your telemetry lake to pinpoint the likely root cause before the impact spreads downstream.
Skip dashboard hopping. Debug production issues using an independent metadata context layer and trace root causes across your data infrastructure on day zero.
Correlate metrics, logs, traces, and lineage metadata instantly so you can understand the intent behind every deployed pipeline with zero context switching.
Reduce on-call burden. Every alert is paired with a detailed knowledge graph of your data infrastructure so you understand how everything fits together. Your team works from a shared context to solve incidents faster.
@alerting Investigate new production alerts and generate triage context with downstream blast radius and likely remediation steps.
Every run, commit, deployment and dataset is connected. Search it. Trace it. Understand it. Share it.
@insights Analyze the last 30 days of finance.billing.process_pending_invoices. Investigate the 15x drop in output volume and map the downstream impact.
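Mapping downstream impact, as in the prompt above, comes down to walking lineage edges outward from the affected job. The sketch below is only an illustration of that idea, not the product's implementation: the `LINEAGE` graph, every dataset name in it, and the `downstream` helper are hypothetical. Only the job name comes from the prompt.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the nodes listed in its value.
# Only the job name is taken from the prompt; the datasets are made up.
LINEAGE = {
    "finance.billing.process_pending_invoices": ["finance.invoices_processed"],
    "finance.invoices_processed": ["finance.revenue_daily", "finance.ar_aging"],
    "finance.revenue_daily": ["exec.kpi_dashboard"],
}


def downstream(graph, start):
    """Return every node reachable from `start`, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


impacted = downstream(LINEAGE, "finance.billing.process_pending_invoices")
for name in impacted:
    print(name)
```

Breadth-first order lists the closest consumers first, which is usually the order in which an on-call engineer wants to check them.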
Databricks and Amazon EMR are supported today. Because the platform is built on open standards, integrations keep pace with your continuously evolving data stack.
| Platform | Status |
|---|---|
| Databricks | supported |
| Amazon EMR | supported |
| Amazon SageMaker AI | supported |
| AWS Glue | supported |
| Google Cloud Dataproc | supported |
| Jupyter Notebook | supported |
| MLflow | coming soon |
Production intelligence, built on open standards. OpenTelemetry (OTel) captures execution traces for your Spark jobs, while OpenLineage (OLin) maps your high-level data flow context.
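To give a feel for the data-flow context OpenLineage carries, here is a minimal sketch that assembles a run event as a plain dictionary using the core fields of the OpenLineage spec (`eventType`, `eventTime`, `run`, `job`, `inputs`, `outputs`, `producer`, `schemaURL`). The `build_run_event` helper, the namespaces, the dataset names, the producer URL, and the spec version in `schemaURL` are all assumptions for illustration. In a managed Spark setup such events would normally be emitted by the OpenLineage integration rather than written by hand.

```python
import json
import uuid
from datetime import datetime, timezone


def build_run_event(event_type, job_namespace, job_name, inputs, outputs, run_id=None):
    """Assemble an OpenLineage-style run event as a plain dict.

    `inputs` and `outputs` are lists of (namespace, name) tuples.
    """
    return {
        "eventType": event_type,  # e.g. START, COMPLETE, FAIL
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": run_id or str(uuid.uuid4())},
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": [{"namespace": ns, "name": name} for ns, name in inputs],
        "outputs": [{"namespace": ns, "name": name} for ns, name in outputs],
        # Hypothetical producer URL; a real integration sets its own.
        "producer": "https://example.com/hypothetical-producer",
        # Assumed spec version; check the version your backend expects.
        "schemaURL": "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent",
    }


event = build_run_event(
    "COMPLETE",
    job_namespace="finance",  # hypothetical namespace
    job_name="billing.process_pending_invoices",
    inputs=[("iceberg://warehouse", "finance.pending_invoices")],  # hypothetical dataset
    outputs=[("iceberg://warehouse", "finance.invoices_processed")],  # hypothetical dataset
)
print(json.dumps(event, indent=2))
```

Because each event names the job's inputs and outputs, a backend can stitch events from many jobs into the dataset-level lineage graph.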
Backed by some of the creators behind the tools we support and use every day.