Skip to content

Conversation

semohr
Copy link
Contributor

@semohr semohr commented Oct 21, 2025

Description

Matching albums and singletons has slightly different behavior between singletons and albums.

Albums: When a user does not specify any search query in the manual prompt, the system defaults to using the corresponding value from the metadata.
Singletons: Overwrite metadata by any given noneempty search query.

Additionally I noticed that we have an unintentional type error in the metadata_plugins.candidates call. Since *arg and **kwargs are not typed they accepted artist and title/album of type str|None while they should only accept str.

Changes

This PR aligns both methods and uses a common function to parse the search terms. This intentionally couples these two occasions to hopefully prevent drift in the future and makes clear that we use the same logic in these two places.

Added explicit typehints to candidates and item_candidates function.

@semohr semohr requested a review from a team as a code owner October 21, 2025 09:53
@Copilot Copilot AI review requested due to automatic review settings October 21, 2025 09:53
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR standardizes search term handling between album and singleton matching in the autotag system. Previously, albums defaulted to metadata when no search query was provided, while singletons would overwrite metadata with any non-empty query. The changes introduce a unified _parse_search_terms_with_fallbacks function that applies consistent logic to both cases and adds explicit type hints to the candidates and item_candidates functions to catch type errors where str|None was accepted instead of the required str.

Key Changes:

  • Introduced _parse_search_terms_with_fallbacks function to unify search term parsing logic
  • Added explicit type hints to candidates and item_candidates functions to enforce str parameters
  • Updated documentation to clarify that empty strings are passed when no metadata is available

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
docs/changelog.rst Documents the standardization of search parameter handling
beets/metadata_plugins.py Adds explicit type hints to candidates and item_candidates functions and updates documentation
beets/autotag/match.py Introduces _parse_search_terms_with_fallbacks and refactors both tag_album and tag_item to use unified search term parsing

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes - here's some feedback:

  • Ensure that custom metadata plugins are updated to match the new named-argument signature for candidates and item_candidates to avoid breakage.
  • Consider renaming _parse_search_terms_with_fallbacks to a more concise name (e.g. _choose_search_terms) for better readability.
  • Clarify in the docstring how empty strings vs None values are treated by the fallback logic to avoid confusion.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Ensure that custom metadata plugins are updated to match the new named-argument signature for candidates and item_candidates to avoid breakage.
- Consider renaming _parse_search_terms_with_fallbacks to a more concise name (e.g. _choose_search_terms) for better readability.
- Clarify in the docstring how empty strings vs None values are treated by the fallback logic to avoid confusion.

## Individual Comments

### Comment 1
<location> `beets/autotag/match.py:252` </location>
<code_context>
def tag_album(
    items,
    search_artist: str | None = None,
    search_album: str | None = None,
    search_ids: list[str] = [],
) -> tuple[str, str, Proposal]:
    """Return a tuple of the current artist name, the current album
    name, and a `Proposal` containing `AlbumMatch` candidates.

    The artist and album are the most common values of these fields
    among `items`.

    The `AlbumMatch` objects are generated by searching the metadata
    backends. By default, the metadata of the items is used for the
    search. This can be customized by setting the parameters.
    `search_ids` is a list of metadata backend IDs: if specified,
    it will restrict the candidates to those IDs, ignoring
    `search_artist` and `search album`. The `mapping` field of the
    album has the matched `items` as keys.

    The recommendation is calculated from the match quality of the
    candidates.
    """
    # Get current metadata.
    likelies, consensus = get_most_common_tags(items)
    cur_artist: str = likelies["artist"]
    cur_album: str = likelies["album"]
    log.debug("Tagging {} - {}", cur_artist, cur_album)

    # The output result, keys are the MB album ID.
    candidates: dict[Any, AlbumMatch] = {}

    # Search by explicit ID.
    if search_ids:
        for search_id in search_ids:
            log.debug("Searching for album ID: {}", search_id)
            if info := metadata_plugins.album_for_id(search_id):
                _add_candidate(items, candidates, info)

    # Use existing metadata or text search.
    else:
        # Try search based on current ID.
        if info := match_by_id(items):
            _add_candidate(items, candidates, info)
            rec = _recommendation(list(candidates.values()))
            log.debug("Album ID match recommendation is {}", rec)
            if candidates and not config["import"]["timid"]:
                # If we have a very good MBID match, return immediately.
                # Otherwise, this match will compete against metadata-based
                # matches.
                if rec == Recommendation.strong:
                    log.debug("ID match.")
                    return (
                        cur_artist,
                        cur_album,
                        Proposal(list(candidates.values()), rec),
                    )

        # Manually provided search terms or fallbacks.
        _search_artist, _search_album = _parse_search_terms_with_fallbacks(
            (search_artist, cur_artist),
            (search_album, cur_album),
        )
        log.debug("Search terms: {} - {}", _search_artist, _search_album)

        # Is this album likely to be a "various artist" release?
        va_likely = (
            (not consensus["artist"])
            or (_search_artist.lower() in VA_ARTISTS)
            or any(item.comp for item in items)
        )
        log.debug("Album might be VA: {}", va_likely)

        # Get the results from the data sources.
        for matched_candidate in metadata_plugins.candidates(
            items, _search_artist, _search_album, va_likely
        ):
            _add_candidate(items, candidates, matched_candidate)

    log.debug("Evaluating {} candidates.", len(candidates))
    # Sort and get the recommendation.
    candidates_sorted = _sort_candidates(candidates.values())
    rec = _recommendation(candidates_sorted)
    return cur_artist, cur_album, Proposal(candidates_sorted, rec)

</code_context>

<issue_to_address>
**issue (code-quality):** Replace mutable default arguments with None ([`default-mutable-arg`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/default-mutable-arg/))
</issue_to_address>

### Comment 2
<location> `beets/autotag/match.py:340` </location>
<code_context>
def tag_item(
    item: Item,
    search_artist: str | None = None,
    search_title: str | None = None,
    search_ids: list[str] | None = None,
) -> Proposal:
    """Find metadata for a single track. Return a `Proposal` consisting
    of `TrackMatch` objects.

    `search_artist` and `search_title` may be used to override the item
    metadata in the search query. `search_ids` may be used for restricting the
    search to a list of metadata backend IDs.
    """
    # Holds candidates found so far: keys are MBIDs; values are
    # (distance, TrackInfo) pairs.
    candidates = {}
    rec: Recommendation | None = None

    # First, try matching by the external source ID.
    trackids = search_ids or [t for t in [item.mb_trackid] if t]
    if trackids:
        for trackid in trackids:
            log.debug("Searching for track ID: {}", trackid)
            if info := metadata_plugins.track_for_id(trackid):
                dist = track_distance(item, info, incl_artist=True)
                candidates[info.track_id] = hooks.TrackMatch(dist, info)
                # If this is a good match, then don't keep searching.
                rec = _recommendation(_sort_candidates(candidates.values()))
                if (
                    rec == Recommendation.strong
                    and not config["import"]["timid"]
                ):
                    log.debug("Track ID match.")
                    return Proposal(_sort_candidates(candidates.values()), rec)

    # If we're searching by ID, don't proceed.
    if search_ids:
        if candidates:
            assert rec is not None
            return Proposal(_sort_candidates(candidates.values()), rec)
        else:
            return Proposal([], Recommendation.none)

    # Manually provided search terms or fallbacks.
    _search_artist, _search_title = _parse_search_terms_with_fallbacks(
        (search_artist, item.artist),
        (search_title, item.title),
    )
    log.debug("Item search terms: {} - {}", _search_artist, _search_title)

    # Get and evaluate candidate metadata.
    for track_info in metadata_plugins.item_candidates(
        item,
        _search_artist,
        _search_title,
    ):
        dist = track_distance(item, track_info, incl_artist=True)
        candidates[track_info.track_id] = hooks.TrackMatch(dist, track_info)

    # Sort by distance and return with recommendation.
    log.debug("Found {} candidates.", len(candidates))
    candidates_sorted = _sort_candidates(candidates.values())
    rec = _recommendation(candidates_sorted)
    return Proposal(candidates_sorted, rec)

</code_context>

<issue_to_address>
**issue (code-quality):** We've found these issues:

- Use named expression to simplify assignment and conditional ([`use-named-expression`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-named-expression/))
- Swap if/else branches ([`swap-if-else-branches`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/swap-if-else-branches/))
- Remove unnecessary else after guard condition ([`remove-unnecessary-else`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/remove-unnecessary-else/))
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Copy link

codecov bot commented Oct 21, 2025

Codecov Report

❌ Patch coverage is 83.33333% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 66.99%. Comparing base (043581e) to head (7a250ef).
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
beets/metadata_plugins.py 50.00% 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #6117   +/-   ##
=======================================
  Coverage   66.98%   66.99%           
=======================================
  Files         118      118           
  Lines       18191    18193    +2     
  Branches     3079     3079           
=======================================
+ Hits        12186    12189    +3     
  Misses       5346     5346           
+ Partials      659      658    -1     
Files with missing lines Coverage Δ
beets/autotag/match.py 79.60% <100.00%> (+0.93%) ⬆️
beets/metadata_plugins.py 76.52% <50.00%> (ø)
🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@semohr semohr force-pushed the typing-candidates-and-item-candidates branch from 8f07788 to 7a250ef Compare October 21, 2025 10:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant