[PULP-1118] Add better error handling for repover duplicate content#7280

Open
pedro-psb wants to merge 2 commits into pulp:main from pedro-psb:fix/7184-report-conflicting-packages

pedro-psb (Member) commented Feb 4, 2026

Added a proper error class for duplicate content handling, plus extra logging that reports exactly which content units conflict.

When duplicates are detected, we do some extra work to collect the duplicate content.
A simple performance test suggests the overhead is acceptable (about 61 ms to collect versus about 35 ms to count, over 24,547 packages):

In [1]: import pulpcore.plugin.repo_version_utils as ut

In [2]: content_qs = Package.objects.all()

In [3]: unique_keys = Package.repo_key_fields

In [4]: content_qs.count()
Out[4]: 24547

In [5]: ut.count_duplicates(content_qs, unique_keys)
Out[5]: 7701

In [6]: %timeit ut.count_duplicates(content_qs, unique_keys)
35.1 ms ± 154 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [7]: %timeit ut.collect_duplicates(content_qs, unique_keys)
61.4 ms ± 266 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)
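
For readers unfamiliar with the distinction measured above, here is a minimal pure-Python sketch of counting versus collecting duplicates by a set of key fields. It is not the code from this PR: the real `count_duplicates` and `collect_duplicates` take a Django queryset, and their internals are not shown here. The helper names `count_dupes` and `collect_dupes`, the dict-based items, and the definition of "count" as the number of surplus items are all assumptions made for illustration.

```python
from collections import defaultdict


def count_dupes(items, unique_keys):
    """Count items whose key tuple was already seen (hypothetical helper)."""
    seen = set()
    dupes = 0
    for item in items:
        key = tuple(item[k] for k in unique_keys)
        if key in seen:
            dupes += 1
        else:
            seen.add(key)
    return dupes


def collect_dupes(items, unique_keys):
    """Group items by key tuple and keep only groups with more than one member."""
    groups = defaultdict(list)
    for item in items:
        groups[tuple(item[k] for k in unique_keys)].append(item)
    return {key: members for key, members in groups.items() if len(members) > 1}


# Illustrative data only; field names are assumptions, not Package.repo_key_fields.
packages = [
    {"id": 1, "name": "foo", "version": "1.0"},
    {"id": 2, "name": "foo", "version": "1.0"},
    {"id": 3, "name": "bar", "version": "2.0"},
]
keys = ("name", "version")
dupe_count = count_dupes(packages, keys)
dupe_groups = collect_dupes(packages, keys)
```

Counting only needs a set of seen keys, while collecting has to retain every conflicting item, which is one plausible reason the collecting step is slower in the timings above.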

Closes: #7184

📜 Checklist

  • Commits are cleanly separated with meaningful messages (simple features and bug fixes should be squashed to one commit)
  • A changelog entry or entries has been added for any significant changes
  • Follows the Pulp policy on AI Usage
  • (For new features) - User documentation and test coverage has been added

See: Pull Request Walkthrough

pedro-psb force-pushed the fix/7184-report-conflicting-packages branch from 7e9e749 to a42e341 on February 4, 2026 14:45
pedro-psb changed the title from "Add better error handling for repover duplicate content" to "[PULP-1118] Add better error handling for repover duplicate content" on Feb 4, 2026
"""

    def __init__(self, duplicate_count: int, correlation_id: str):
        self.dup_count = duplicate_count
Contributor

This seems to assume that RepositoryVersion creation always fails because of duplicates. Do we want to assume that? Should the error be more specific?

Member Author

Yeah, makes sense. I'll make a more specific one.
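
As a rough illustration of what "more specific" could mean here, the sketch below names the error after the failure cause (duplicate content) rather than after repository version creation in general. It is an assumption, not the class from this PR: the name `DuplicateContentError`, the message wording, and the plain `Exception` base are hypothetical, and only the `duplicate_count` and `correlation_id` parameters come from the snippet under review.

```python
class DuplicateContentError(Exception):
    """Hypothetical error raised only when duplicate content is the cause."""

    def __init__(self, duplicate_count: int, correlation_id: str):
        self.duplicate_count = duplicate_count
        self.correlation_id = correlation_id
        # The message wording is illustrative, not taken from the PR.
        super().__init__(
            f"Found {duplicate_count} duplicate content unit(s) while creating "
            f"the repository version (correlation id: {correlation_id})."
        )


# Example: construct the error the way a caller might after detecting duplicates.
err = DuplicateContentError(duplicate_count=3, correlation_id="abc123")
message = str(err)
```

With a dedicated class, callers can catch duplicate-related failures separately from any other reason a repository version might fail to be created.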

Development

Successfully merging this pull request may close these issues.

[PULP-1118] Sync failures due to NEVRA duplicates should also report the conflicting packages