What is Databricks?

The Data Intelligence Platform — unifying data engineering, analytics, and AI on the lakehouse.

Category: Data & AI Platform (lakehouse)
Headquarters: San Francisco, California
Founded: 2013
Employees: ~10,000+ (est. 12,000–15,000)
Revenue run-rate: $6.9B (+80% YoY, Jun 2026)
Valuation: $134B (Series L, Dec 2025)

What is Databricks?

Databricks is the company behind the Data Intelligence Platform, a cloud-based system that unifies data engineering, analytics, data warehousing, and AI/ML on a single 'lakehouse' architecture. Founded in 2013 by the UC Berkeley team that created Apache Spark, it now serves more than 20,000 organizations and, as of its June 2026 Data + AI Summit, runs at a $6.9 billion revenue run-rate growing more than 80% year-over-year.

Databricks pioneered the 'lakehouse' — an architecture that merges the cheap, open storage of a data lake with the governance and SQL performance of a data warehouse, built on open-source projects it created or stewards: Apache Spark, Delta Lake, MLflow, and Unity Catalog. The platform runs natively on all three major clouds (AWS, Azure, and Google Cloud) and lets customers run ETL, BI/SQL, machine learning, and generative-AI workloads against one copy of their data.

The scale is large. At its June 16, 2026 Data + AI Summit, Databricks said it had crossed a $6.9 billion annualized revenue run-rate growing more than 80% year-over-year — a sharp acceleration from the $5.4 billion (+65%) it reported in February 2026. It serves more than 20,000 customers, over 800 of them spending more than $1 million a year and more than 70 spending over $10 million, with adoption by more than 60% of the Fortune 500. Net revenue retention exceeds 140%, and the company has been free-cash-flow positive over the trailing twelve months.

AI is now the fastest-growing part of the business: Databricks' AI products (its Mosaic AI stack, Genie assistant, and agent tooling) reached a $1.7 billion annualized run-rate, up from $1.4 billion four months earlier. That growth carries a cost — gross margins have slipped to around 74% (from above 80%) as AI agents drive heavier compute consumption. Gartner has named Databricks a Leader in its Magic Quadrant for Data Science and Machine Learning Platforms, positioning it as the primary challenger to Snowflake and the cloud hyperscalers' native analytics stacks.

What does Databricks offer?

A unified Data Intelligence Platform spanning data engineering, warehousing, governance, and AI — built on open lakehouse foundations.

  • Lakehouse Platform · Core
  • Delta Lake · Storage
  • Apache Spark · Compute
  • Databricks SQL · Warehousing
  • Unity Catalog · Governance
  • Mosaic AI · AI/ML
  • MLflow · AI/ML
  • Genie (AI assistant) · AI
  • Lakebase (serverless Postgres) · Database
  • Delta Live Tables / ETL · Data engineering
  • Databricks Apps · App platform
  • Marketplace & Delta Sharing · Data sharing

How does Databricks make money?

Databricks sells consumption-based access to its platform, billed in Databricks Units (DBUs) on top of the customer's cloud spend, plus subscription tiers that unlock governance, security, and AI features. Revenue is driven by usage growth inside existing accounts (net retention >140%) far more than by per-seat licensing.

The core unit of pricing is the DBU — a normalized measure of compute consumed per second. Rates vary by workload and tier: SQL Classic compute runs around $0.22/DBU on AWS, all-purpose compute is roughly $0.55/DBU on the Premium tier, and fully managed Serverless SQL is about $0.70/DBU in US regions (higher in the EU). Customers also pay their cloud provider separately for the underlying VMs and storage, except on serverless, where infrastructure is bundled into the DBU rate.
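
As a rough illustration of the billing model described above, the sketch below estimates a monthly bill from DBU consumption. The per-DBU rates are the approximate figures quoted in this section; the function name, the workload labels, and the example usage numbers are hypothetical, and real rates vary by cloud, region, tier, and contract.

```python
# Approximate list rates (USD per DBU) as quoted in the text above.
# Real rates vary by cloud, region, tier, and negotiated discounts.
RATES_PER_DBU = {
    "sql_classic": 0.22,     # SQL Classic compute on AWS
    "all_purpose": 0.55,     # all-purpose compute, Premium tier
    "serverless_sql": 0.70,  # Serverless SQL, US regions
}

# On serverless, infrastructure is bundled into the DBU rate,
# so there is no separate cloud bill to add.
BUNDLED_INFRA = {"serverless_sql"}


def estimate_monthly_cost(workload: str, dbus: float, cloud_infra_usd: float = 0.0) -> float:
    """Estimate total monthly cost in USD (hypothetical helper, not a Databricks API)."""
    databricks_charge = dbus * RATES_PER_DBU[workload]
    if workload in BUNDLED_INFRA:
        # Ignore any separate infrastructure figure for bundled workloads.
        cloud_infra_usd = 0.0
    return round(databricks_charge + cloud_infra_usd, 2)


if __name__ == "__main__":
    # Hypothetical usage: 10,000 DBUs in a month, plus $3,000 of VMs and storage.
    print(estimate_monthly_cost("all_purpose", dbus=10_000, cloud_infra_usd=3_000))
    print(estimate_monthly_cost("serverless_sql", dbus=10_000))
```

The point of the comparison is that a higher serverless DBU rate is not directly comparable to a classic rate, because the classic figure excludes the cloud provider's charges.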

Databricks packages features into tiers — historically Standard, Premium, and Enterprise. Premium (now effectively the default, including Unity Catalog, role-based access control, SQL warehouses, serverless, and the full Mosaic AI suite) is the volume tier; Enterprise adds compliance certifications, dedicated support, and custom SLAs at roughly 15–25% higher DBU rates. The legacy Standard tier is being retired across clouds, pushing accounts up to Premium.

Growth comes from landing a workload and expanding consumption as more teams move data engineering, BI, and AI onto the platform — which is why net revenue retention exceeds 140%. Large enterprises sign committed-use contracts (DBCUs) for volume discounts, and the AI products plus the newer Lakebase serverless database are the fastest-growing consumption drivers — AI alone reached a $1.7 billion run-rate, and the company's data-warehousing business also crossed $1 billion. The flip side of agent-heavy consumption is margin: gross margins have eased to about 74% as AI workloads burn more compute.
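
To make the retention figure concrete, here is a minimal sketch of how net revenue retention is commonly computed: revenue from a fixed cohort of customers one year later, divided by that cohort's revenue at the start. The text does not give Databricks' exact methodology, so the function names, the constant-rate projection, and the dollar amounts below are illustrative assumptions.

```python
def net_revenue_retention(start_revenue: float, end_revenue_same_customers: float) -> float:
    """Return NRR as a ratio for a fixed customer cohort (common definition, assumed here)."""
    if start_revenue <= 0:
        raise ValueError("start_revenue must be positive")
    return end_revenue_same_customers / start_revenue


def project_cohort_revenue(start_revenue: float, nrr: float, years: int) -> float:
    """Compound a cohort's revenue at a constant NRR (a simplifying assumption)."""
    return start_revenue * nrr ** years


if __name__ == "__main__":
    # Hypothetical cohort: $100M of revenue growing to $140M from the same customers.
    nrr = net_revenue_retention(100e6, 140e6)
    print(f"NRR: {nrr:.0%}")

    # Hypothetical account spending $1M a year, expanding at that rate for three years.
    print(f"Year-3 spend: ${project_cohort_revenue(1e6, nrr, 3):,.0f}")
```

An NRR above 100% means existing customers alone grow revenue before any new accounts are added, which is why the consumption model depends more on expansion than on new logos.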

Who leads Databricks?

Databricks is led by co-founder and CEO Ali Ghodsi, alongside the other UC Berkeley 'Apache Spark' founders and a public-company-caliber executive bench (CFO Dave Conte, plus revenue and operations leadership).

  • Ali Ghodsi, Co-founder & CEO (CEO since 2016; co-founder 2013). UC Berkeley adjunct professor and Spark contributor; took over as CEO from Ion Stoica and has led Databricks through its rise to a $134B valuation.
  • Ion Stoica, Co-founder & Executive Chairman (since 2013; first CEO). UC Berkeley professor; served as Databricks' founding CEO and now chairs the board.
  • Matei Zaharia, Co-founder & CTO (since 2013). Original creator of Apache Spark (his Berkeley PhD) and MLflow; also a professor at UC Berkeley.
  • Reynold Xin, Co-founder & Chief Architect (since 2013). Top Apache Spark committer; drives the technical architecture of the lakehouse platform.
  • Patrick Wendell, Co-founder & VP of Engineering (since 2013). Early Spark release manager; oversees core platform engineering.
  • Dave Conte, Chief Financial Officer (joined 2019). Former Splunk CFO who took that company public and scaled it past $2B in revenue; brought in to steer Databricks toward an IPO.
  • Ron Gabrisko, Chief Revenue Officer. Leads global sales and field operations; owns the go-to-market engine driving 80%+ revenue growth.

How do you contact Databricks's leadership?

Databricks does not publish individual executive email addresses, but its verified corporate pattern is first.last@databricks.com (used in ~86% of addresses). Any address built from that format is inferred, not officially published; for press, Databricks routes through press@databricks.com and its newsroom.

Email format: first.last@databricks.com

How much funding has Databricks raised?

Databricks has raised roughly $19 billion in equity across twelve disclosed rounds (Series A through L), most recently a Series L of more than $4 billion at a $134 billion valuation announced in December 2025 — making it one of the most valuable venture-backed companies in the world.

The early rounds were Andreessen Horowitz- and NEA-led: Series A in September 2013 ($13.9M, led by a16z), Series B in 2014 ($33M, NEA), Series C in 2016 ($60M, NEA), and Series D in 2017 ($140M, a16z). Series E in February 2019 raised $250M at a $2.75B valuation (a16z), followed by Series F in October 2019 ($400M at $6.2B, a16z).

The scale-up rounds came fast: Series G in February 2021 raised $1B at a $28B valuation (led by Franklin Templeton), Series H in August 2021 raised $1.6B at $38B (led by Morgan Stanley's Counterpoint), and Series I in September 2023 added $500M at a $43B valuation (led by T. Rowe Price, with Capital One Ventures and Nvidia joining as new strategic investors). Series J in December 2024 was a landmark $10B raise at a $62B valuation led by Thrive Capital (co-led by a16z, DST Global, GIC, Insight Partners, and WCM).

The most recent rounds reflect the AI-driven re-rating. Series K in September 2025 raised about $1B at a $100B-plus valuation (co-led by Thrive Capital, a16z, Insight Partners, MGX, and WCM). Then Series L, announced December 16, 2025, brought in more than $4 billion of equity (about $5B) at a $134B valuation — led by Insight Partners, Fidelity, and J.P. Morgan Asset Management — alongside roughly $2B of new debt, for over $7B of total capital. Databricks is free-cash-flow positive, and CEO Ali Ghodsi has signaled it is IPO-bound, with an S-1 widely expected around Q3 2026 and a debut in late 2026 or 2027.

How did Databricks get here?

From an open-source Spark project out of UC Berkeley to a $134B data-and-AI platform running at a $6.9B revenue run-rate in roughly a decade.

  1. 2013: Founded out of UC Berkeley. Seven AMPLab researchers, including Ali Ghodsi, Ion Stoica, and Matei Zaharia, commercialize Apache Spark; Series A from Andreessen Horowitz.
  2. 2020: Lakehouse and Delta Lake go mainstream. Databricks popularizes the 'lakehouse' architecture and open-sources Delta Lake; acquires Redash for dashboards.
  3. Jun 2023: $1.3B MosaicML acquisition. Buys generative-AI startup MosaicML to build out its AI/LLM stack, seeding the Mosaic AI product line.
  4. Dec 2024: $10B Series J at $62B. Thrive Capital leads a landmark $10B round, one of the largest private raises in tech history.
  5. 2025: Neon acquisition and Lakebase launch. Acquires serverless-Postgres startup Neon (~$1B) and launches Lakebase, a database built for AI agents; Series K pushes valuation past $100B.
  6. Dec 2025: Series L at $134B valuation. Raises >$4B equity plus ~$2B debt at a $134B valuation, the largest private software round on record at the time; led by Insight Partners, Fidelity, and J.P. Morgan.
  7. Jun 2026: $6.9B run-rate, IPO on the horizon. At its Data + AI Summit, Databricks reports a $6.9B revenue run-rate (+80% YoY) with AI products at $1.7B; CEO calls 2026 'a terrible year to go public' but reaffirms IPO plans.

Who are Databricks's competitors?

Databricks competes with the cloud data warehouses and the hyperscalers' native analytics stacks, with Snowflake as its closest head-to-head rival.

  • Snowflake: The closest rival; a SQL-first cloud data warehouse, stronger for managed BI/analytics where Databricks leans data-engineering and AI/ML. Databricks's $134B private valuation now exceeds Snowflake's ~$83B public market cap.
  • Google BigQuery: Serverless data warehouse native to Google Cloud; favored by GCP-centric teams over a multi-cloud lakehouse.
  • Amazon Redshift: AWS's native warehouse; the default for AWS-locked structured analytics, competing on cost and integration.
  • Microsoft Fabric: Microsoft's unified analytics SaaS (Synapse + Power BI); bundles warehouse, lakehouse, and BI for Azure shops.
  • Confluent: Real-time data streaming on Kafka; overlaps with Databricks on streaming ingestion and event pipelines.
  • Dremio: Open lakehouse query engine on Apache Iceberg; pitches lower lock-in versus a full Databricks platform.
