Conversation

@martonvago
Contributor

Description

This PR adds a function for handling grouped errors under $.resources[x].schema.fields[x].
The problem is that each field type has its own JSON sub-schema, and every sub-schema flags an issue when a field's type is not its own. So, if a field has type="number" and something is wrong with the field, the sub-schemas for year, string, etc. also flag issues. The goal is to flag only the issues reported by the number sub-schema.

Part of #15

Needs an in-depth review.

Checklist

  • Formatted Markdown
  • Ran just run-all

errors_in_group = _get_errors_in_group(schema_errors, parent_error)
schema_errors.remove(parent_error)

field_type: str = parent_error.instance.get("type", "string")
Contributor Author

Defaulting to string if the type is not given. This is because the string-schema is the only one that doesn't require type.
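
To illustrate the default described above, here is a minimal sketch, not the PR's code. `resolve_field_type` and `is_known_field_type` are hypothetical helpers, and the `FIELD_TYPES` list is shortened for illustration; only the `"string"` fallback is taken from the diff.

```python
# Hypothetical, shortened list of field types (the real list is longer).
FIELD_TYPES = ["string", "number", "year"]


def resolve_field_type(field: dict) -> str:
    """Return the field's declared type, falling back to "string".

    Mirrors `parent_error.instance.get("type", "string")` from the diff.
    """
    return field.get("type", "string")


def is_known_field_type(field: dict) -> bool:
    """Check whether the resolved type is one of the known field types."""
    return resolve_field_type(field) in FIELD_TYPES


# Illustrative field descriptors, not taken from a real data package.
untyped_field = {"name": "title"}
typed_field = {"name": "price", "type": "number"}
unknown_field = {"name": "oops", "type": "nubmer"}
```

With this fallback, a field without a `type` is checked against the string sub-schema, while a misspelled type is still caught by the membership check.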

schema_index = FIELD_TYPES.index(field_type)
errors_for_other_types = _filter(
    errors_in_group,
    lambda error: f"fields/items/oneOf/{schema_index}/" not in error.schema_path,
Contributor Author

We can choose the sub-schema by filtering on the schema path. E.g., the schema path of the string-based schema is ...fields/items/oneOf/0/...
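
The schema-path filter can be sketched in isolation as follows. This is only an illustration under assumptions: `FakeSchemaError` is a hypothetical stand-in for the project's error class, the `FIELD_TYPES` list is shortened (only `"string"` at index 0 comes from the comment above), and the schema paths are made up.

```python
from dataclasses import dataclass

# Hypothetical, shortened list; only "string" at index 0 is from the comment.
FIELD_TYPES = ["string", "number", "year"]


@dataclass(frozen=True)
class FakeSchemaError:
    """Stand-in error holding only the attributes this sketch needs."""

    message: str
    schema_path: str


def errors_for_other_types(errors, field_type):
    """Return the errors that do NOT come from the sub-schema of `field_type`."""
    schema_index = FIELD_TYPES.index(field_type)
    # The trailing slash keeps index 1 from also matching index 10, 11, ...
    marker = f"fields/items/oneOf/{schema_index}/"
    return [error for error in errors if marker not in error.schema_path]


# Made-up schema paths, one per sub-schema.
errors_in_group = [
    FakeSchemaError("error from the string sub-schema", "fields/items/oneOf/0/example"),
    FakeSchemaError("error from the number sub-schema", "fields/items/oneOf/1/example"),
    FakeSchemaError("error from the year sub-schema", "fields/items/oneOf/2/example"),
]

to_drop = errors_for_other_types(errors_in_group, "number")
to_keep = [error for error in errors_in_group if error not in to_drop]
```

Here `to_keep` holds only the error from the number sub-schema, which matches the goal stated in the description.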

str(files("check_datapackage.schemas").joinpath("data-package-2-0.json"))
)

FIELD_TYPES = [
Contributor Author

We can extract this from the schema on the fly if that's better.
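
One way the on-the-fly extraction might look is sketched below. It assumes that each `oneOf` sub-schema pins its type through `properties.type.enum` or `properties.type.const`; whether data-package-2-0.json is actually shaped this way is an assumption. `extract_field_types` and `toy_fields_schema` are hypothetical names used only for illustration.

```python
def extract_field_types(fields_schema: dict) -> list:
    """Collect one type name per `oneOf` sub-schema, preserving order.

    Order matters because the position in the list doubles as the
    `oneOf` index used when filtering on the schema path.
    """
    field_types = []
    for sub_schema in fields_schema["items"]["oneOf"]:
        type_schema = sub_schema.get("properties", {}).get("type", {})
        if "const" in type_schema:
            field_types.append(type_schema["const"])
        elif type_schema.get("enum"):
            field_types.append(type_schema["enum"][0])
        else:
            # Skipping silently would shift every later index, so fail loudly.
            raise ValueError("sub-schema does not pin a single field type")
    return field_types


# Toy schema fragment written by hand for this sketch (assumed shape).
toy_fields_schema = {
    "items": {
        "oneOf": [
            {"properties": {"type": {"enum": ["string"]}}},
            {"properties": {"type": {"const": "number"}}, "required": ["type"]},
            {"properties": {"type": {"enum": ["year"]}}, "required": ["type"]},
        ]
    }
}

field_types = extract_field_types(toy_fields_schema)
```

Deriving the list this way would keep it in sync with the schema file, at the cost of depending on the schema's internal layout.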

@martonvago moved this from Todo to In Review in Iteration planning Oct 29, 2025
@martonvago marked this pull request as ready for review October 29, 2025 15:08
@martonvago requested a review from a team as a code owner October 29, 2025 15:08
@martonvago
Contributor Author

If we merge in #177, I will align this with that

Member

@lwjohnst86 left a comment

Nice! Only some small comments ☺️

@github-project-automation bot moved this from In Review to In Progress in Iteration planning Nov 3, 2025
@martonvago requested a review from lwjohnst86 November 3, 2025 14:17
@martonvago moved this from In Progress to In Review in Iteration planning Nov 3, 2025
Member

@lwjohnst86 left a comment

Very small change to the error message

if field_type not in FIELD_TYPES:
    unknown_field_error = SchemaError(
        message=(
            "Unknown Data Package field type. Please use one of"
Member

Suggested change
- "Unknown Data Package field type. Please use one of"
+ "The resource's schema's field property doesn't have the correct type. It should be one of"

Just to be explicit about what we mean. I think this is correct, right?

@github-project-automation bot moved this from In Review to In Progress in Iteration planning Nov 4, 2025