Databricks

Announcing Native Lakehouse Sync

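TL;DR · AI summary

Databricks has announced the Public Preview of Native Lakehouse Sync, which replicates Lakebase Postgres data into Unity Catalog managed tables with no pipelines or external compute.

Key points

  • Lakebase and the Lakehouse share the same open storage, so sync runs with no impact on Postgres performance and at no added cost.
  • Synced tables inherit Unity Catalog governance: lineage, access policies, tags, and audits.
  • Native Lakehouse Sync is available in all Lakebase regions on AWS and Azure.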

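Outline

  1. Introduces Native Lakehouse Sync.

  2. Explains how Native Lakehouse Sync works.

  3. Discusses the benefits for downstream consumers and governance.

  4. Covers availability across cloud platforms.

  5. Looks ahead to what comes next for Lakebase and the Lakehouse.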


Product · May 12, 2026

Announcing Native Lakehouse Sync

Opening Lakebase data to models, analytics, and other engines

by Pranav Aurora, Hristo Stoyanov and Cheng Chen

Summary

  • Native Lakehouse Sync (Public Preview) replicates Lakebase Postgres data into Unity Catalog managed tables automatically, with no pipelines or external compute.
  • Traditional CDC stacks break under agent-driven workloads. Because Lakebase and the Lakehouse share the same open storage, sync becomes a native database property with zero Postgres performance impact, no added cost, and automatic schema propagation.
  • It enables live ML features grounded in current app state, operational data as the Bronze layer of a medallion architecture with full SCD Type 2 history, and built-in audit capture for every change.

Today we are excited to announce the Public Preview of Native Lakehouse Sync, a core capability of Databricks Lakebase that replicates Lakebase data to Unity Catalog managed tables, without any pipelines or external compute. Native Lakehouse Sync is available in all Lakebase regions on AWS and Azure.

Why we built it

Applications used to run on a single operational database. As use cases expanded, one database stopped being enough. Analytics, ML, and search all live outside the operational database, meaning data has to move.

Historically, this meant daily batch dumps to a warehouse, which eventually evolved into Change Data Capture (CDC). Hyperscalers packaged this as "managed" syncs ("zero-ETL"), deploying data pipelines alongside the database. But these managed syncs rely on legacy assumptions: always-on workloads, stable schemas, predictable query volumes, and a single destination warehouse. The problem compounds with every new destination: operational performance degrades, schemas drift, and points of failure multiply across the stack.

Agent-first development breaks this model entirely. Agents branch data rapidly to iterate safely, scale to zero between tasks, and spin up short-lived environments. Managing a custom pipeline for every branch and every destination simply doesn’t scale.

Plumbing into a warehouse is the wrong approach. Downstream consumers are rarely just dashboards anymore; they are embedding models, LLMs, prediction services, and feature pipelines. Open table formats like Delta Lake and Apache Iceberg™ provide the ideal primitive: storing data once in cheap object storage to power every workload without duplication. It's a known known: you need a Lakehouse, and you want fresh operational data inside it.

But writing operational data into a Lakehouse created new challenges. Teams were forced to configure Postgres replication slots, Debezium connectors, stream processing engines to write into open formats, and separate compute just to optimize the tables. Every hop adds a point of failure.
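To make the point about compounding failure concrete, the sketch below multiplies per-hop availabilities for a serial pipeline. The hop names mirror the components listed above; the 99.9% figures are illustrative assumptions rather than measurements, and the model assumes hops fail independently.

```python
from math import prod


def end_to_end_availability(hop_availabilities):
    """Probability that every hop in a serial pipeline is up at once,
    assuming the hops fail independently of each other."""
    return prod(hop_availabilities)


# Hypothetical per-hop availabilities for the stack described above.
hops = {
    "replication_slot": 0.999,
    "debezium_connector": 0.999,
    "stream_processor": 0.999,
    "table_optimizer": 0.999,
}

pipeline = end_to_end_availability(hops.values())
print(f"end-to-end availability: {pipeline:.4f}")
```

Under these assumptions the pipeline as a whole is less available than any single hop, and each additional hop lowers the figure further.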

Sync as a property of Lakebase

Lakebase is built on a fundamentally different assumption: an operational database should run on the exact same open, low-cost cloud storage as your Lakehouse. Because OLTP and OLAP share this unified storage foundation, we can eliminate the ETL pipeline entirely. Data movement becomes a native property of the database itself.

With Native Lakehouse Sync, Lakebase decodes its write-ahead log (WAL) and writes directly to Unity Catalog managed tables. A single schema-level toggle enables it in under a minute. The sync has zero impact on Postgres performance and no additional cost. And because Databricks controls both ends, schema changes flow automatically, eliminating drift and lag.
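As a rough mental model of this flow, the following sketch routes decoded change records to per-table destinations in log order, forwarding only schemas that have sync enabled. The `WalRecord` shape, the `route_changes` function, and the schema and table names are all hypothetical; the post does not describe Lakebase's internal record format or API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class WalRecord:
    """Hypothetical decoded WAL entry; not Lakebase's actual format."""
    lsn: int      # log sequence number: position of the change in the WAL
    schema: str   # Postgres schema the table belongs to
    table: str
    op: str       # "insert", "update", or "delete"
    row: dict     # column values carried by the change


def route_changes(wal, synced_schemas):
    """Group changes by destination table, in LSN order.

    Only schemas with sync toggled on are forwarded, mirroring the
    schema-level switch described above.
    """
    routed = defaultdict(list)
    for record in sorted(wal, key=lambda r: r.lsn):
        if record.schema in synced_schemas:
            routed[f"{record.schema}.{record.table}"].append(record)
    return dict(routed)


wal = [
    WalRecord(102, "app", "orders", "update", {"id": 1, "status": "paid"}),
    WalRecord(101, "app", "orders", "insert", {"id": 1, "status": "new"}),
    WalRecord(103, "internal", "jobs", "insert", {"id": 7}),
    WalRecord(104, "app", "users", "insert", {"id": 9, "name": "Ada"}),
]
routed = route_changes(wal, synced_schemas={"app"})
for table, changes in routed.items():
    print(table, [(c.lsn, c.op) for c in changes])
```

Because routing is keyed on the schema, a table created later in `app` would be picked up without any extra configuration, which is the behavior the toggle is described as providing.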

Agent-first from end to end

Agents build apps on Lakebase. Agents like Databricks Genie analyze the data. To keep this entire lifecycle autonomous, Native Lakehouse Sync is built as a core property of Lakebase. It inherits the exact behaviors agents need to operate seamlessly:

  • Scale-to-zero: Sync pauses when the database scales to zero and resumes from the last LSN upon waking.
  • Zero compute management: Sync is a native part of Lakebase. All monitoring and observability stay within your Lakebase Project.
  • Automatic schema propagation: Schema changes flow automatically. Adding a column propagates instantly. Dropping a column retains it on the destination. Agents never have to recreate the sync.
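The resume and schema behaviors in the list above can be sketched with a small checkpointed consumer. This is a toy model under stated assumptions: `SyncState`, its fields, and the sample rows are hypothetical, and the real service's checkpointing is not described in the post.

```python
class SyncState:
    """Hypothetical destination-side state for one synced table."""

    def __init__(self):
        self.last_lsn = 0     # checkpoint: highest LSN already applied
        self.columns = []     # destination columns, in the order first seen
        self.rows = []

    def apply(self, changes):
        """Apply (lsn, row) pairs, skipping anything at or below the checkpoint."""
        for lsn, row in sorted(changes, key=lambda c: c[0]):
            if lsn <= self.last_lsn:
                continue      # replayed after a pause; already applied
            for column in row:
                if column not in self.columns:
                    self.columns.append(column)   # added column propagates
            self.rows.append(dict(row))
            self.last_lsn = lsn

    def table(self):
        """Rows padded to the full column set; missing values become None."""
        return [{c: r.get(c) for c in self.columns} for r in self.rows]


state = SyncState()
state.apply([(1, {"id": 1, "email": "a@example.com"})])

# The database scales to zero here. On wake, changes from LSN 1 are replayed.
state.apply([
    (1, {"id": 1, "email": "a@example.com"}),                 # duplicate, skipped
    (2, {"id": 2, "email": "b@example.com", "plan": "pro"}),  # new column added
    (3, {"id": 3, "plan": "free"}),                           # "email" dropped at source
])
print(state.last_lsn, state.columns)
print(state.table())
```

The column set only ever grows in this sketch, so a column dropped at the source stays on the destination and simply reads as `None` for later rows.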

Lakehouse primitives on the destination side

Because the destination is a Unity Catalog managed table, every Lakehouse capability is available on synced data from the moment it lands.

  • AI-native analytics: Immediately available for querying, analysis, and pipeline generation by agents like Databricks Genie and Genie Code.
  • Universal readability: Readable by Databricks SQL, Apache Spark, Lakeflow Spark Declarative Pipelines, ML notebooks, and any tool speaking Delta or Iceberg.
  • Unified governance: Lineage, access policies, tags, and audits are inherited from Unity Catalog.
  • Automatic optimization: Predictive Optimization and Liquid Clustering apply with zero setup.
  • Default versioning: Every insert, update, and delete lands as SCD Type 2 history. Audit logs, rewinds, and CDF semantics are built in.
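For readers unfamiliar with SCD Type 2, here is a minimal sketch of how inserts, updates, and deletes can each land as a history row. The column names (`key`, `op`, `valid_from`, `valid_to`) and the use of integer timestamps are assumptions for illustration; the post does not specify the layout that Native Lakehouse Sync writes.

```python
def apply_change(history, key, op, values, ts):
    """Append one change to an SCD Type 2 history, closing the prior version."""
    for version in history:
        if version["key"] == key and version["valid_to"] is None:
            version["valid_to"] = ts      # the previous version ends here
    history.append({
        "key": key,
        "op": op,                          # deletes are kept as rows too
        "values": dict(values),
        "valid_from": ts,
        "valid_to": None,                  # open-ended: the latest version
    })
    return history


history = []
apply_change(history, key=1, op="insert", values={"status": "new"}, ts=10)
apply_change(history, key=1, op="update", values={"status": "paid"}, ts=20)
apply_change(history, key=1, op="delete", values={}, ts=30)
for version in history:
    print(version)
```

Nothing is overwritten: each change closes the previous version's validity window and opens a new one, which is what makes audit and rewind queries possible later.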

What you can build with Native Lakehouse Sync

Together, these source and destination behaviors unlock three patterns that previously required a custom Change Data Capture (CDC) stack:

Agentic memory and live ML features. Application writes land in Unity Catalog within a minute, so models retrain and score against the current state of the application without a separate ingestion pipeline.

Operational data in the medallion architecture. Use Lakebase tables as the Bronze layer of the medallion architecture. High-velocity updates happen in Postgres, and the full change history flows into the Lakehouse automatically as SCD Type 2.
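A Silver table is often just the current state derived from Bronze history. The sketch below collapses a change history to one live row per key; the Bronze column names (`id`, `status`, `op`, `valid_from`) are hypothetical, and in practice this step would more likely be a SQL or Spark transformation than plain Python.

```python
def current_state(bronze_history):
    """Collapse an SCD Type 2 history to one row per key: the latest live version."""
    latest = {}
    for row in sorted(bronze_history, key=lambda r: r["valid_from"]):
        latest[row["id"]] = row            # later versions overwrite earlier ones
    return [
        {"id": r["id"], "status": r["status"]}
        for r in latest.values()
        if r["op"] != "delete"             # deleted keys leave the Silver table
    ]


bronze = [
    {"id": 1, "status": "new",     "op": "insert", "valid_from": 10},
    {"id": 2, "status": "new",     "op": "insert", "valid_from": 12},
    {"id": 1, "status": "paid",    "op": "update", "valid_from": 20},
    {"id": 2, "status": None,      "op": "delete", "valid_from": 25},
    {"id": 3, "status": "shipped", "op": "insert", "valid_from": 30},
]
silver = current_state(bronze)
print(silver)
```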

Compliance and audit. Every insert, update, and delete is captured as a history row in Unity Catalog. No application-side history tracking, no separate audit pipeline.
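An audit question usually takes the form "what did this record look like at time T?". Given history rows with validity windows, that is an interval lookup, sketched below. The row layout and the `as_of` helper are assumptions for illustration, not the schema or API the product exposes.

```python
def as_of(history, key, ts):
    """Return the version of `key` in effect at time `ts`, or None if absent."""
    for row in history:
        ended = row["valid_to"] is not None and row["valid_to"] <= ts
        if row["key"] == key and row["valid_from"] <= ts and not ended:
            # A delete row in effect means the record did not exist at `ts`.
            return None if row["op"] == "delete" else row
    return None


history = [
    {"key": 1, "op": "insert", "status": "new",  "valid_from": 10, "valid_to": 20},
    {"key": 1, "op": "update", "status": "paid", "valid_from": 20, "valid_to": 30},
    {"key": 1, "op": "delete", "status": None,   "valid_from": 30, "valid_to": None},
]

for ts in (5, 15, 20, 35):
    version = as_of(history, key=1, ts=ts)
    print(ts, version["status"] if version else None)
```

Validity windows are treated as half-open here, so a lookup at exactly the moment of an update returns the new version.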

Get started

Native Lakehouse Sync is in Public Preview. Spinning up a Lakebase is instant. Toggle sync on a schema once, and every existing and future table will appear in Unity Catalog within a minute.

Lakebase is built on the exact same open data foundation as the Lakehouse. Native Lakehouse Sync makes that vision a reality, allowing Lakebase data to flow into open formats automatically without a separate pipeline.

The next step: bringing that same openness from the Lakehouse to Lakebase tables. Stay tuned.


© Databricks 2026. All rights reserved. Apache, Apache Spark, Spark, the Spark Logo, Apache Iceberg, Iceberg, and the Apache Iceberg logo are trademarks of the Apache Software Foundation.
