-
Notifications
You must be signed in to change notification settings - Fork 726
Add remove sub_columns portions normalizer #28559
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add remove sub_columns portions normalizer #28559
Conversation
|
🟢 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements a new normalizer to clean portions that use the SUB_COLUMNS data accessor. The normalizer identifies and removes these portions from the database during tablet reboot, which is useful for database maintenance and schema migration scenarios.
Key Changes
- Added
TCleanSubColumnsPortionsNormalizerthat detects and removes portions containing columns with SUB_COLUMNS accessor - Implemented comprehensive test coverage using Y_UNIT_TEST_TWIN to verify behavior with and without subcolumns
- Registered the new normalizer in the build system
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| ydb/core/tx/columnshard/normalizer/portion/clean_sub_columns_portions.h | Header file defining the new normalizer class interface |
| ydb/core/tx/columnshard/normalizer/portion/clean_sub_columns_portions.cpp | Implementation of the normalizer that identifies and marks SUB_COLUMNS portions for removal |
| ydb/core/tx/columnshard/normalizer/portion/ya.make | Build system registration for the new normalizer source file |
| ydb/core/tx/columnshard/ut_rw/ut_normalizer.cpp | Added test case and helper classes to verify the normalizer functionality, plus new includes |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const ui64 tableId = 1; | ||
| const std::vector<NArrow::NTest::TTestColumn> schema = { NArrow::NTest::TTestColumn("id", TTypeInfo(NTypeIds::Uint64)), | ||
| NArrow::NTest::TTestColumn("json_payload", TTypeInfo(NTypeIds::JsonDocument)) }; | ||
| const std::vector<ui32> columnsIds = { 1, 2 }; |
Copilot
AI
Nov 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The variable columnsIds is defined but never used. It should either be removed or used in the writer.Write() call on line 552 instead of the hardcoded {1, 2}.
|
⚪ ⚪ Ya make output | Test bloat | Test bloat
🟢
*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation |
|
⚪
🟢
*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation |
Changelog entry
...
Changelog category
Description for reviewers
...