When implementing Delta Sharing, here are some best practices to keep in mind.
  • Keep shares small and purposeful: one share per product, avoid “kitchen sink” shares
  • Schema contracts: version breaking changes (e.g., dp_orders_v2)
  • Select only columns you need; apply filters for predicate pushdown
  • For Power BI, mind the row limit behavior in Power Query
  • Incremental loads: prioritise CDF over re-reading entire tables
  • Security: favor OIDC federation over long-lived bearer tokens where possible (better rotation, MFA)
  • Auditing: one recipient per share simplifies revocation and tracking

Connecting to a share

Activation (open sharing):
  1. Provider creates a recipient (open sharing) and gets the activation link
  2. The recipient opens the link to download the credentials file (.share) or completes OIDC setup
  3. The recipient opens the .share credentials file, a JSON document
  4. They use the credentials to connect from Spark, pandas, Power BI, or Tableau
The credentials file is a JSON with the following structure:
{
  "shareCredentialsVersion": 1,
  "bearerToken": "fdij0238d209f94h2...",
  "endpoint": "https://northeurope-c2.azuredatabricks.net/api/2.0/delta-sharing/metastores/54fd8965-2ad0-4ee4",
  "expirationTime": "9999-12-31T23:59:59.999Z"
}
  • shareCredentialsVersion: The schema version of the credentials file; currently this is always 1
  • bearerToken: The token that authenticates your client against the provider’s sharing server. This is the secret part: treat it like a password and do not commit it to source control or share it
  • endpoint: The Delta Sharing server URL. This is identical for all shares from the same Databricks metastore; what differs between recipients is the token, not the endpoint
  • expirationTime: When the token expires. A value of 9999-12-31 means it is set to never expire, although the provider can revoke it at any time
The bearerToken is the sensitive part of this file; the endpoint is the same across your entire organisation’s metastore and is not secret. Many orgs now prefer to use OIDC to avoid managing long-lived tokens altogether. If using the credentials file, store it outside your repository and restrict file system access accordingly.
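For scripting around the credentials file, the fields above can be read with the standard library alone. A minimal sketch (the parse_profile and token_expired helpers are illustrative, not part of any connector):

```python
import json
from datetime import datetime, timezone

def parse_profile(text):
    """Parse the contents of a Delta Sharing credentials (.share) file."""
    profile = json.loads(text)
    # shareCredentialsVersion is currently always 1
    if profile.get("shareCredentialsVersion") != 1:
        raise ValueError("unsupported credential file version")
    return profile

def token_expired(profile, now=None):
    """True if the profile's expirationTime lies in the past (UTC)."""
    now = now or datetime.now(timezone.utc)
    # expirationTime is ISO-8601 with a trailing 'Z', e.g. 9999-12-31T23:59:59.999Z
    exp = datetime.fromisoformat(profile["expirationTime"].replace("Z", "+00:00"))
    return exp <= now
```

Typical use is `profile = parse_profile(open("/path/to/profile.share").read())`; if token_expired returns True, ask the provider for a new activation link.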
Some examples:
Load a data feed
table_url = "/path/to/profile.share#<share>.<schema>.<table>"

df = (spark.read
  .format("deltasharing")
  .load(table_url)
) # batch read

(df.filter("event_date = '2025-01-01'")
   .select("customer_id", "event_type", "event_date")
   .show())
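The table_url string packs the profile path and the three-level table name into one value. A tiny helper (hypothetical, stdlib only) that makes the profile#share.schema.table convention explicit:

```python
def make_table_url(profile_path, share, schema, table):
    """Build a Delta Sharing table URL: <profile>#<share>.<schema>.<table>."""
    for part in (share, schema, table):
        # Dots and '#' would break the three-part name, so reject them early
        if "." in part or "#" in part:
            raise ValueError(f"invalid name component: {part!r}")
    return f"{profile_path}#{share}.{schema}.{table}"
```

For example, `make_table_url("/path/to/profile.share", "dp_orders_v2", "sales", "orders")` yields a string you can pass straight to `spark.read.format("deltasharing").load(...)`.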
Read the change data feed (incremental):
cdf = (spark.read
      .format("deltasharing")
      .option("readChangeFeed", "true")
      .option("startingTimestamp", "2025-08-01 00:00:00")
      .load(table_url)
)
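Once change rows are fetched to the client (e.g. via collect or toLocalIterator), a common consumer-side step is collapsing them to the latest state per business key. A pure-Python sketch, assuming rows are dicts carrying the CDF metadata columns and an illustrative customer_id key:

```python
def latest_state(rows, key="customer_id"):
    """Reduce CDF rows to the newest post-image per key; deletes drop the key."""
    # Stable sort: within one commit, update_preimage stays before update_postimage
    rows = sorted(rows, key=lambda r: r["_commit_version"])
    state = {}
    for r in rows:
        if r["_change_type"] in ("insert", "update_postimage"):
            state[r[key]] = r
        elif r["_change_type"] == "delete":
            state.pop(r[key], None)
        # update_preimage rows carry the old values and can be ignored here
    return state
```

This mirrors what a MERGE into a local warehouse table would do, without requiring one.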
A pandas example:
import delta_sharing as ds

profile = "/path/to/profile.share"
table_url = f"{profile}#<share>.<schema>.<table>"

# Peek at the first 100 rows before loading the full table
sample = ds.load_as_pandas(table_url, limit=100)
pdf = ds.load_as_pandas(table_url)
Import data using Power BI or Power Query:
  1. Get Data → Delta Sharing
  2. Paste in Delta Sharing Server URL and Bearer token from the credential file (or use OIDC if configured).
  3. Choose your table or tables and load (Import).
Import data using Tableau:
  • Install “Delta Sharing by Databricks” from Tableau Exchange.
  • In Tableau: Connect → Delta Sharing by Databricks → either upload the .share file or enter Endpoint URL and Bearer Token.
For JVM pipelines, you can use the Delta Sharing Java connector (community/labs) via Maven/SBT and point it to the .share file. This is handy for embedding inside services.

Retention and history

Delta table history (metadata) is typically retained for 30 days. Older versions may be removed, and time-travel reads may no longer be possible after retention/VACUUM. Plan to copy or snapshot data locally if you need long-term historical access. Delta Sharing is read-only; recipients can always land extracts locally if they need to keep records beyond retention windows.

Useful links

  • Delta Sharing overview (Databricks docs)
  • Open-source Delta Sharing repo (protocol, Python & Spark connectors, examples)
  • Read shared data with Spark / pandas / Power BI using credential files
  • Spark format("deltasharing") examples (read, CDF, streaming)
  • Power BI / Power Query - Delta Sharing connector
  • Tableau - “Delta Sharing by Databricks” connector (Tableau Exchange)
  • Delta Sharing Java connector (labs/community)
If a partner cannot use Delta Sharing, they can sometimes read Delta format directly (e.g., Trino/Presto, DuckDB) with proper storage access controls.

Troubleshooting

  • 401/403 errors: credential expired or revoked, token missing, or OIDC not configured. Regenerate the activation link or confirm the OIDC setup.
  • CDF is not enabled: request provider to enable history sharing or use full reads.
  • Power BI shows limited rows: adjust Power Query row limits and apply filters.
  • Historical versions unavailable: likely vacuumed or beyond retention; snapshot/copy locally for long-term needs.
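For a quick credential check outside BI tools, the open Delta Sharing REST protocol exposes GET <endpoint>/shares, authenticated with the bearer token. A stdlib sketch that builds the request; actually sending it needs network access, and a 401/403 response points at an expired or revoked token:

```python
import urllib.request

def list_shares_request(endpoint, bearer_token):
    """Build a GET /shares request per the Delta Sharing REST protocol."""
    return urllib.request.Request(
        endpoint.rstrip("/") + "/shares",
        headers={"Authorization": f"Bearer {bearer_token}"},
    )

# To actually send it (requires connectivity and a valid token):
# with urllib.request.urlopen(list_shares_request(endpoint, token)) as resp:
#     print(resp.read())
```

A successful call returns a JSON list of the shares your token can see, which confirms both endpoint and credential in one step.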

Code snippets

table_url = "/path/to/profile.share#<share>.<schema>.<table>"
changes = (spark.read
  .format("deltasharing")
  .option("readChangeFeed", "true")
  .option("startingTimestamp", "2025-07-01 00:00:00")
  .option("endingTimestamp",   "2025-07-31 23:59:59")
  .load(table_url)
)
import delta_sharing as ds
ds.load_as_pandas(table_url, limit=50)

New delete pattern

To clarify which data is stored and made available through Delta Share, a new delete pattern has been introduced (as of January 2026). Depending on how consumers of Delta Share have implemented their integrations, this change may require adjustments on their side.

Delta Share allows customers to incrementally retrieve changes to their data and store those changes in their own data warehouse or similar storage solution. This is achieved by storing every batch of new or modified records with a commit version tag; consumers can then download all data associated with a specific commit version. When new or updated data is added to Delta Share, the record receives the value “insert” in the _change_type column. The _commit_version column indicates which version the record belongs to, and _commit_timestamp shows when the record was added.
The new delete pattern introduces a _change_type value named “delete”. A delete record is generated for any row whose import_date_time is older than 30 days. This record does not modify the original data; all fields remain unchanged. Instead, it serves as a technical log entry indicating that the corresponding row will be removed from Delta Share. Consumers of Delta Share may choose to ignore these delete records or apply them like ordinary deletes in their own storage.
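Consumer-side, these retention deletes can be recognised from the combination described above. A sketch, assuming import_date_time has already been parsed to a timezone-aware datetime (field name and the 30-day window come from this pattern; the helper itself is illustrative):

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)

def is_retention_delete(row, now=None):
    """True for 'delete' records emitted because the row aged past 30 days."""
    now = now or datetime.now(timezone.utc)
    return (row["_change_type"] == "delete"
            and now - row["import_date_time"] > RETENTION)
```

An integration that wants to keep aged rows locally would simply `continue` past records where this returns True, while still applying genuine business deletes.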

FAQ

Here are some frequently asked questions about this integration.
What does the “query” option return?
Using the “query” option returns all data currently available in the Delta Share; in other words, the latest full snapshot of that table. It does not include all historical data, only what is active right now. Some Delta setups allow “time travel” to older versions of the data, but this only works if history sharing is enabled for that table. Old versions are automatically cleaned up according to the retention policy (for Voyado, that is 30 days).
This means you only have access to what is currently part of the share, not every piece of data stored in Engage.
Is data removed from the share over time?
Yes. Delta Share tables follow a retention policy. In Voyado’s case, data older than 30 days is removed automatically. So, when you use a “query” you are getting the most recent snapshot; data outside the retention window is not available.
What does the “changes” option return?
The “changes” option provides incremental updates, meaning only the rows that were added, modified, or deleted between table versions. It works through Delta’s Change Data Feed (CDF), so the feature must be enabled for that table. Each update includes metadata columns such as:
  • _change_type
  • _commit_version
  • _commit_timestamp
Which versions can I access?
The versions you can access depend on the retention window, not the number of versions or files. Once versions are older than 30 days, they are no longer accessible. If you try to query a starting or ending version outside that 30-day window, no data will be returned.
Availability is based solely on time. Once data versions pass the 30-day retention threshold, they are cleaned up automatically, and there is no way to retrieve them afterward.
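Because availability is purely time-based, a client can pre-check whether a requested startingTimestamp still falls inside the window before issuing a CDF query. A minimal sketch, assuming the 30-day retention above and the "YYYY-MM-DD HH:MM:SS" timestamp format used in the Spark examples:

```python
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 30

def in_retention_window(starting_timestamp, now=None):
    """True if a 'YYYY-MM-DD HH:MM:SS' timestamp is within the last 30 days."""
    now = now or datetime.now(timezone.utc)
    ts = datetime.strptime(starting_timestamp, "%Y-%m-%d %H:%M:%S")
    ts = ts.replace(tzinfo=timezone.utc)  # assume timestamps are UTC
    return now - ts <= timedelta(days=RETENTION_DAYS)
```

Failing fast here gives a clearer error than an empty CDF result from the server.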
Why does a table show no version history?
If a table shows no version history, it typically means:
  • Change Data Feed (CDF) or version tracking is not enabled for that share, or
  • All historical versions have been cleaned up by the retention policy.
In such cases, you’ll only have access to the latest snapshot, with no historical data or changes available.
Why do I get an error querying older versions after a full export?
This occurs when a full export or reset is performed in Engage. During this process, all previous table versions and transaction logs are removed as part of the retention and cleanup policy. After the reset, a new baseline (latest/max) version becomes the valid starting point for queries. If a customer or integration tries to access version 0 or any version older than the current baseline, the system returns this error because those versions have been deleted.
The data version in the Delta Share protocol is not affected by the export itself: data continues to propagate to the latest version, ensuring ongoing updates are reflected from that new baseline. Versions in Delta Share are subject to the default retention period of 30 days; older versions are automatically cleaned up and no longer available after that window.
In short, after a full export, historical versions are removed and the data is accessible only from the latest (max) version onward. All queries must start from that version, as older ones are cleared by retention.