Perusing neighborhood complaints
Exploring San Francisco 311 case data with oleander, from CSV ingestion to district demographics.
Built by the co-creators of
OpenLineage and Marquez
Skip the dashboards. Built for fast-moving Spark & Iceberg teams. Autonomous reliability from day one.
As soon as your Spark job fails, oleander begins investigating across logs, spans, lineage, and Spark plan diffs. Go straight from your alert to a redeployed fix.
| Feature | oleander | Databricks |
|---|---|---|
| Compute | Serverless Spark via CLI | Managed Spark |
| Storage | Managed Iceberg & telemetry lake | Delta Lake + Iceberg via Unity Catalog |
| Observability | LLM-based RCA & auto-fix | Monitoring only, no auto-fix |
| Onboarding | Fast via CLI | Medium |
| Time to value | Fast | Medium |
| Modularity | Full stack, use any or all | Limited |
| Made for | Fast-moving Spark & Iceberg teams | Data & ML teams on Databricks |
Oleander gives agents everything they need to generate a fix. You review and merge; oleander verifies the fix in production.
Build. Deploy. Debug. Let us deploy your Spark jobs and handle production monitoring, running investigations as soon as things go wrong.
Work on your own data with an integrated Iceberg storage and query layer, while keeping lineage and observability attached to every query. You can also bring your own Iceberg catalog along for the ride.
Use your own agent or IDE that inherits our Spark & Iceberg expertise and context. Ask for PySpark against your oleander lake in plain language.
Build a PySpark script for oleander.default.global_flowers. Filter to poisonous flowers, normalize genus and continent values, and compute risk slices by continent, toxicity band, genus, and bloom season with record counts and confidence metrics. Write partitioned outputs to oleander.analytics.toxicity_by_continent, oleander.analytics.toxicity_by_genus, and oleander.analytics.high_risk_species, and include idempotent upsert behavior so downstream dashboards and anomaly monitors can consume each dataset safely.
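A job generated from a prompt like this typically makes its writes idempotent with an Iceberg `MERGE INTO` keyed on the slice dimensions, so reruns update matching rows instead of duplicating them. A minimal sketch of that pattern, as plain Python that builds the statement; the staging view name, key columns, and the surrounding Spark session are assumptions for illustration, not oleander's actual generated code:

```python
def merge_upsert_sql(target: str, staging_view: str, key_cols: list[str]) -> str:
    """Build an Iceberg MERGE INTO statement keyed on key_cols, so a rerun
    overwrites matching rows instead of appending duplicates (idempotent upsert)."""
    on = " AND ".join(f"t.{c} = s.{c}" for c in key_cols)
    return (
        f"MERGE INTO {target} t USING {staging_view} s ON {on} "
        "WHEN MATCHED THEN UPDATE SET * "
        "WHEN NOT MATCHED THEN INSERT *"
    )

# In the generated job, each risk slice would be written via something like:
#   slice_df.createOrReplaceTempView("staging_by_continent")   # hypothetical view
#   spark.sql(merge_upsert_sql(
#       "oleander.analytics.toxicity_by_continent",
#       "staging_by_continent",
#       ["continent", "toxicity_band"],                        # assumed keys
#   ))
```

Keying the merge on the grouping columns is what lets downstream dashboards and anomaly monitors consume each dataset safely even if the job is replayed.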
Anomaly investigation starts the moment alerts fire. Pull deep context from your telemetry lake to pinpoint root causes and assist with remediation before the impact spreads downstream.
Skip dashboard hopping. Debug production issues using an independent metadata context layer and trace root causes across your data infrastructure on day zero.
Correlate metrics, logs, traces, and lineage metadata instantly so you can understand the intent behind every deployed pipeline with zero context switching.
Reduce on-call burden. Every alert is paired with a detailed knowledge graph of your data infrastructure, so you understand how everything fits together and your team shares the context needed to solve incidents faster.
@alerting Investigate new production alerts and generate triage context with downstream blast radius and likely remediation steps.
Every run, commit, deployment and dataset is connected. Search it. Trace it. Understand it. Share it.
@insights Analyze the last 30 days of finance.billing.process_pending_invoices. Investigate the 15x drop in output volume and map the downstream impact.
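Behind a prompt like this, the investigation starts from a simple signal: the latest run's output volume compared against its recent baseline. A toy sketch of that check; the counts, window size, and threshold here are illustrative assumptions, not oleander's actual detector:

```python
from statistics import median

def drop_factor(daily_counts: list[int], window: int = 7) -> float:
    """Ratio of the baseline median output volume (prior `window` days)
    to the latest day's volume. A factor of ~15 corresponds to the
    '15x drop' called out in the alert above."""
    baseline = median(daily_counts[:-1][-window:])
    latest = daily_counts[-1]
    return baseline / max(latest, 1)  # guard against a zero-row day

# e.g. a steady ~30k rows/day, then a sudden 2k-row day:
print(drop_factor([30_000] * 7 + [2_000]))  # prints 15.0
```

A monitor would flag any run whose factor crosses a threshold and hand the run ID to the investigation agent.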
Databricks and Amazon EMR are supported today. Built on open standards, integration keeps pace with your continuously evolving data stack.
| Platform | Status |
|---|---|
| Databricks | Supported |
| Amazon EMR | Supported |
| Amazon SageMaker AI | Supported |
| AWS Glue | Supported |
| Google Cloud Dataproc | Supported |
| Jupyter Notebook | Supported |
| MLflow | Coming soon |
Production intelligence, built on open standards. OpenTelemetry (OTel) captures execution traces for your Spark jobs, while OpenLineage maps your high-level data flow context.
Backed by some of the creators behind the tools we support and use every day.