@HarshCasper
Member

No description provided.

@cloudflare-workers-and-pages

Deploying localstack-docs with Cloudflare Pages

Latest commit: 9b0fd1d
Status: ✅  Deploy successful!
Preview URL: https://902cbd9a.localstack-docs.pages.dev
Branch Preview URL: https://s3tables.localstack-docs.pages.dev

@bentsku left a comment

So nice to add this documentation now! @hovaesco did a phenomenal job implementing this service, and he has way more knowledge than me on managed Iceberg tables, so I'll let him give the final approval stamp.

I've shared what I know about the confusion around S3 Tables. The tutorial and the rest look really good, thanks a lot for adding this quickly! 🚀


## Introduction

Amazon S3 Tables are specialized S3 buckets for managing tabular data (for example, Apache Iceberg tables) with built-in maintenance features like automatic compaction and snapshot management.

@bentsku commented on Oct 23, 2025

@hovaesco correct me if I'm wrong, but I feel like S3 Tables is more of a "catalog" that will take care of creating the underlying S3 buckets for you transparently without you having to deal at all with them.
Saying that it's a managed Apache Iceberg solution using S3 storage might be a bit clearer and remove the confusion around S3 buckets altogether?

An "S3 Tables Bucket" is actually a collection of real S3 buckets, one per table.

Overall I think it might look like: S3Tables Namespace -> S3TablesBucket -> S3TablesTable -> S3 Bucket
(This is mostly just for context sharing, no need to write about the line above)

Contributor

Correct. Maybe also don't say "for example, Apache Iceberg tables", because Iceberg is the only format supported now.

Comment on lines +130 to +133
{
"versionToken": "0c0c1509",
"warehouseLocation": "s3://hqpdve6ni1lb7w5bdn24lruswomtsh5bdrw66oip--table-s3"
}

@hovaesco this is what you meant in the previous PR, right? S3 Tables will not return the MetadataLocation field if you did not execute any Iceberg request against it? So this response does not look too good, as it doesn't actually contain the metadata location? Or am I fully off track here?

Contributor

good catch, so after merging my latest PR the output would be:

awslocal s3tables get-table-metadata-location \
    --table-bucket-arn arn:aws:s3tables:us-east-1:000000000000:bucket/my-table-bucket \
    --namespace my_namespace \
    --name my_table
{
    "versionToken": "b69a6fbb",
    "metadataLocation": "s3://893wylknn9utyhrcv7xzlz5a9acqh1vn480tupa5--table-s3/metadata/00000-b6d96c57-403a-4387-ac59-ec55ac2e646b.metadata.json",
    "warehouseLocation": "s3://893wylknn9utyhrcv7xzlz5a9acqh1vn480tupa5--table-s3"
}

The AWS S3 Tables service doesn't return metadataLocation (see https://github.com/localstack/localstack-pro/blob/1e9aef0522806c4974e5605b585f9d528e16504a/localstack-pro-core/tests/aws/services/s3tables/test_s3tables.snapshot.json#L364-L366), which is why I added a skip on this field. Without it being returned, PyIceberg does not work correctly. To me it seems to be a bug on the AWS side, but I will investigate it further.
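
To make the difference between the two responses concrete, here is a minimal sketch of a check that a test or client could run on a `get-table-metadata-location` response. It is not LocalStack or PyIceberg code: the helper name `has_metadata_location` is hypothetical, and the two sample payloads are copied from the outputs quoted in this thread.

```python
import json


def has_metadata_location(response: dict) -> bool:
    """Return True if the response carries a non-empty metadataLocation.

    Hypothetical helper for illustration only; the key name follows the
    JSON output quoted in this thread.
    """
    return bool(response.get("metadataLocation"))


# Response currently shown in the docs (no metadataLocation field).
before = json.loads("""
{
    "versionToken": "0c0c1509",
    "warehouseLocation": "s3://hqpdve6ni1lb7w5bdn24lruswomtsh5bdrw66oip--table-s3"
}
""")

# Response quoted above as the expected output after the latest PR.
after = json.loads("""
{
    "versionToken": "b69a6fbb",
    "metadataLocation": "s3://893wylknn9utyhrcv7xzlz5a9acqh1vn480tupa5--table-s3/metadata/00000-b6d96c57-403a-4387-ac59-ec55ac2e646b.metadata.json",
    "warehouseLocation": "s3://893wylknn9utyhrcv7xzlz5a9acqh1vn480tupa5--table-s3"
}
""")

print(has_metadata_location(before))
print(has_metadata_location(after))
```

A check like this only inspects the response shape; it says nothing about whether the metadata file itself exists in the underlying bucket.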

@hovaesco (Contributor) left a comment

LGTM 🚀 % added some comments


## API Coverage

<FeatureCoverage service="s3tables" client:load />

Contributor

How is this generated? It says that the DeleteTableBucketPolicy and GetTablePolicy operations are supported, but that's not true, and there are others which are not supported as well.

---
title: "S3 Tables"
description: Get started with Amazon S3 Tables on LocalStack
persistence: supported

Contributor

Not sure whether persistence is supported; we don't run any tests to verify that. @bentsku, could you help here and clarify what is required to support persistence in an AWS service?

Sure! By default, persistence would be enabled for the "control plane" via the store (we have some magic in place to automatically pick up stores), and here the "data plane" is in S3 so it should work by default.

You can test it with our persistence tests suite framework. I'll send you our internal docs 👍


You can also create a table within the namespace.

Run the following command to create a table named `my_table` within the namespace `my_namespace`:

Contributor

The AWS API doc link is missing here, although it's present in the other sections (CreateNamespace, CreateTableBucket).

@quetzalliwrites
Collaborator

Hey @HarshCasper, have you addressed all feedback from @hovaesco? I didn't want to do my review until that technical round of reviews was done. 😸
