Perusing neighborhood complaints
Exploring San Francisco 311 case data with oleander, from CSV ingestion to district demographics.
Built by the co-creators of
OpenLineage and Marquez
Skip the dashboards. Built for fast-moving Spark & Iceberg teams. Autonomous reliability from day one.
As soon as your Spark job fails, oleander begins investigating across logs, spans, lineage, and Spark plan diffs. Go straight from your alert to a redeployed fix.
| Feature | oleander | Databricks |
|---|---|---|
| Compute | Serverless Spark via CLI | Managed Spark |
| Storage | Managed Iceberg & telemetry lake | Delta Lake + Iceberg via Unity Catalog |
| Observability | LLM-based RCA & auto-fix | Monitoring only, no auto-fix |
| Onboarding | Fast via CLI | Medium |
| Time to value | Fast | Medium |
| Modularity | Full stack, use any or all | Limited |
| Made for | Fast-moving Spark & Iceberg teams | Data & ML teams on Databricks |
Oleander gives agents everything they need to generate a fix. You review and merge; oleander verifies the fix in production.
Build. Deploy. Debug. Let us deploy your Spark jobs and handle production monitoring, running investigations as soon as things go wrong.
Work on your own data with an integrated Iceberg storage and query layer, while keeping lineage and observability attached to every query. You can also bring your own Iceberg catalog along for the ride.
Use your own agent or IDE that inherits our Spark & Iceberg expertise and context. Ask for PySpark against your oleander lake in plain language.
Build a PySpark script for oleander.default.global_flowers. Filter to poisonous flowers, normalize genus and continent values, and compute risk slices by continent, toxicity band, genus, and bloom season with record counts and confidence metrics. Write partitioned outputs to oleander.analytics.toxicity_by_continent, oleander.analytics.toxicity_by_genus, and oleander.analytics.high_risk_species, and include idempotent upsert behavior so downstream dashboards and anomaly monitors can consume each dataset safely.
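A job generated from a prompt like this typically makes its writes idempotent with an Iceberg `MERGE INTO` keyed on the slice dimensions, so reruns update matching rows instead of duplicating them. A minimal sketch of that pattern, as plain Python that builds the statement; the staging view name, key columns, and the surrounding Spark session are assumptions for illustration, not oleander's actual generated code:

```python
def merge_upsert_sql(target: str, staging_view: str, key_cols: list[str]) -> str:
    """Build an Iceberg MERGE INTO statement keyed on key_cols, so a rerun
    overwrites matching rows instead of appending duplicates (idempotent upsert)."""
    on = " AND ".join(f"t.{c} = s.{c}" for c in key_cols)
    return (
        f"MERGE INTO {target} t USING {staging_view} s ON {on} "
        "WHEN MATCHED THEN UPDATE SET * "
        "WHEN NOT MATCHED THEN INSERT *"
    )

# In the generated job, each risk slice would be written via something like:
#   slice_df.createOrReplaceTempView("staging_by_continent")   # hypothetical view
#   spark.sql(merge_upsert_sql(
#       "oleander.analytics.toxicity_by_continent",
#       "staging_by_continent",
#       ["continent", "toxicity_band"],                        # assumed keys
#   ))
```

Keying the merge on the grouping columns is what lets downstream dashboards and anomaly monitors consume each dataset safely even if the job is replayed.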
Anomaly investigation starts the moment alerts fire. Pull deep context from your telemetry lake to pinpoint root causes and assist with remediation before the impact spreads downstream.
Skip dashboard hopping. Debug production issues using an independent metadata context layer and trace root causes across your data infrastructure on day zero.
Correlate metrics, logs, traces, and lineage metadata instantly so you can understand the intent behind every deployed pipeline with zero context switching.
Reduce on-call burden. Every alert is paired with a detailed knowledge graph of your data infrastructure, so you understand how everything fits together and your team shares the context needed to solve incidents faster.
@alerting Investigate new production alerts and generate triage context with downstream blast radius and likely remediation steps.
Every run, commit, deployment and dataset is connected. Search it. Trace it. Understand it. Share it.
@insights Analyze the last 30 days of finance.billing.process_pending_invoices. Investigate the 15x drop in output volume and map the downstream impact.
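Behind a prompt like this, the investigation starts from a simple signal: the latest run's output volume compared against its recent baseline. A toy sketch of that check; the counts, window size, and threshold here are illustrative assumptions, not oleander's actual detector:

```python
from statistics import median

def drop_factor(daily_counts: list[int], window: int = 7) -> float:
    """Ratio of the baseline median output volume (prior `window` days)
    to the latest day's volume. A factor of ~15 corresponds to the
    '15x drop' called out in the alert above."""
    baseline = median(daily_counts[:-1][-window:])
    latest = daily_counts[-1]
    return baseline / max(latest, 1)  # guard against a zero-row day

# e.g. a steady ~30k rows/day, then a sudden 2k-row day:
print(drop_factor([30_000] * 7 + [2_000]))  # prints 15.0
```

A monitor would flag any run whose factor crosses a threshold and hand the run ID to the investigation agent.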
Databricks and Amazon EMR are supported today. Built on open standards, integration keeps pace with your continuously evolving data stack.
| Platform | Status |
|---|---|
| Databricks | Supported |
| Amazon EMR | Supported |
| Amazon SageMaker AI | Supported |
| AWS Glue | Supported |
| Google Cloud Dataproc | Supported |
| Jupyter Notebook | Supported |
| MLflow | Coming soon |
Production intelligence, built on open standards. OpenTelemetry (OTel) captures execution traces for your Spark jobs, while OpenLineage maps your high-level data flow context.
Backed by some of the creators behind the tools we support and use every day.