Bump com.github.crawler-commons:crawler-commons from 1.5 to 1.6 by dependabot[bot] · Pull Request #1747 · apache/stormcrawler · GitHub
Bump com.github.crawler-commons:crawler-commons from 1.5 to 1.6 #1747

Merged
rzo1 merged 1 commit into main from dependabot/maven/com.github.crawler-commons-crawler-commons-1.6 on Dec 8, 2025

@dependabot dependabot Bot commented on behalf of github Dec 8, 2025

Bumps com.github.crawler-commons:crawler-commons from 1.5 to 1.6.

Release notes

Sourced from com.github.crawler-commons:crawler-commons's releases.

crawler-commons-1.6

Important Changes

  • This release adds support for IDNA2008 domain names and public suffixes in EffectiveTldFinder. If you rely on a recent version of the public suffix list, please upgrade to release 1.6! See [issue report #551](crawler-commons/crawler-commons#551) for more information.

Full List of Changes

  • Support IDNA2008 Unicode domains by using ALLOW_UNASSIGNED in IDN methods within TldFinder / Normalizer. (Richard Zowalla, sebastian-nagel) #551, #552
  • Add URLUtils class for URL resolution functionality (HamzaElzarw-2022, Richard Zowalla, sebastian-nagel) #526
  • Replace deprecated URL constructor in BasicURLNormalizer (HamzaElzarw-2022, Richard Zowalla, sebastian-nagel) #531
  • Add matchedWildcard flag to BaseRobotRules (CoGiang, sebastian-nagel) #530
  • [Domains] Update unit tests after change in public suffix list (sebastian-nagel, Richard Zowalla) #544
  • [Domains] Replace deleted *.uberspace.de with a wildcard from Google (Richard Zowalla) #532
  • Partial replacement of deprecated URL constructors (HamzaElzarw-2022, kkrugler, sebastian-nagel) #522, #524, #545, #536
  • Upgrade dependencies (dependabot) #521, #533, #534, #547, #549
  • Upgrade Maven plugins (dependabot) #520, #538, #539, #540, #546, #550, #553, #554, #555
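
The first entry in the list above names the `ALLOW_UNASSIGNED` flag. The following is a minimal sketch of what that flag does in the JDK's `java.net.IDN` class, which implements IDNA2003 and by default rejects code points it treats as unassigned. It assumes the entry refers to `java.net.IDN`; the class name `IdnFlagSketch` and its helper methods are hypothetical, and this is not the crawler-commons implementation.

```java
import java.net.IDN;

// Hypothetical illustration class; not part of crawler-commons.
public class IdnFlagSketch {

    // Default conversion: code points the JDK considers unassigned are rejected.
    static String toAsciiStrict(String host) {
        return IDN.toASCII(host);
    }

    // Lenient conversion: unassigned code points are passed on to Punycode encoding.
    static String toAsciiLenient(String host) {
        return IDN.toASCII(host, IDN.ALLOW_UNASSIGNED);
    }

    // Reports how a host name fares in both modes without assuming the outcome.
    static String describe(String host) {
        String strict;
        try {
            strict = toAsciiStrict(host);
        } catch (IllegalArgumentException e) {
            strict = "rejected (" + e.getMessage() + ")";
        }
        String lenient;
        try {
            lenient = toAsciiLenient(host);
        } catch (IllegalArgumentException e) {
            lenient = "rejected (" + e.getMessage() + ")";
        }
        return "strict: " + strict + " | lenient: " + lenient;
    }

    public static void main(String[] args) {
        // "bücher.de" only uses long-assigned code points, so both modes agree.
        System.out.println(describe("b\u00FCcher.de"));
        // A label with a more recent code point (an emoji) may differ between modes.
        System.out.println(describe("\uD83D\uDCA9.example"));
    }
}
```

Running `main` on your own JDK shows whether the two modes diverge for a given host name; names that matter to a crawl can be passed to `describe` the same way.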

New Contributors

Changelog

Sourced from com.github.crawler-commons:crawler-commons's changelog.

Crawler-Commons Change Log

Current Development 1.7-SNAPSHOT (yyyy-mm-dd)

Release 1.6 (2025-12-04)

  • Support IDNA2008 Unicode domains by using ALLOW_UNASSIGNED in IDN methods within TldFinder / Normalizer. (Richard Zowalla, sebastian-nagel) #551, #552
  • Add URLUtils class for URL resolution functionality (HamzaElzarw-2022, Richard Zowalla, sebastian-nagel) #526
  • Replace deprecated URL constructor in BasicURLNormalizer (HamzaElzarw-2022, Richard Zowalla, sebastian-nagel) #531
  • Add matchedWildcard flag to BaseRobotRules (CoGiang, sebastian-nagel) #530
  • [Domains] Update unit tests after change in public suffix list (sebastian-nagel, Richard Zowalla) #544
  • [Domains] Replace deleted *.uberspace.de with a wildcard from Google (Richard Zowalla) #532
  • Partial replacement of deprecated URL constructors (HamzaElzarw-2022, kkrugler, sebastian-nagel) #522, #524, #545, #536
  • Upgrade dependencies (dependabot) #521, #533, #534, #547, #549
  • Upgrade Maven plugins (dependabot) #520, #538, #539, #540, #546, #550, #553, #554, #555
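
Several 1.6 entries concern replacing the deprecated `java.net.URL` constructors. As a rough sketch of the usual JDK-only replacement pattern, the example below parses and resolves with `java.net.URI` and converts to `URL` at the end. The class `UrlFromUriSketch` and its `resolve` method are hypothetical names chosen for illustration; they do not show the API of the new `URLUtils` class or the actual changes made in crawler-commons.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical illustration class; not part of crawler-commons.
public class UrlFromUriSketch {

    // Resolves a relative reference against a base and returns a URL
    // without calling the deprecated new URL(...) constructors.
    static URL resolve(String base, String relative) {
        // URI.create throws IllegalArgumentException on invalid syntax.
        URI resolved = URI.create(base).resolve(relative);
        try {
            // toURL requires an absolute URI with a known protocol.
            return resolved.toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a usable URL: " + resolved, e);
        }
    }

    public static void main(String[] args) {
        // prints "https://example.com/a/c"
        System.out.println(resolve("https://example.com/a/b/", "../c"));
    }
}
```

Note that `URI` is stricter than the old `URL` constructors about characters such as unescaped spaces, so code migrated this way may reject inputs that previously parsed.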

Release 1.5 (2025-06-27)

  • Migrate publishing from OSSRH to Central Portal (jnioche, sebastian-nagel, Richard Zowalla, aecio) #510, #516
  • [Sitemaps] Add cross-submit feature (Avi Hayun, kkrugler, sebastian-nagel, Richard Zowalla) #85, #515
  • [Sitemaps] Complete sitemap extension attributes (sebastian-nagel, Richard Zowalla) #513, #514
  • [Sitemaps] Allow partial extension metadata (adriabonetmrf, sebastian-nagel, Richard Zowalla) #456, #458, #512
  • [Domains] EffectiveTldFinder to also take shorter suffix matches into account (sebastian-nagel, Richard Zowalla) #479, #505
  • Add package-info.java to all packages (sebastian-nagel, Richard Zowalla) #432, #504
  • [Robots.txt] Extend API to allow to check java.net.URL objects (sebastian-nagel, aecio, Richard Zowalla) #502
  • [Robots.txt] Incorrect robots.txt result for uppercase user agents (teammakdi, sebastian-nagel, aecio, Richard Zowalla) #453, #500
  • Remove class utils.Strings (sebastian-nagel, Richard Zowalla) #503
  • [BasicNormalizer] Complete normalization feature list of BasicURLNormalizer (sebastian-nagel, kkrugler) #494
  • [Robots] Document that URLs not properly normalized may not be matched by robots.txt parser (sebastian-nagel, kkrugler) #492, #493
  • [Sitemaps] Added https variants of namespaces (jnioche) #487
  • [Domains] Add version of public suffix list shipped with release packages enhancement (sebastian-nagel, Richard Zowalla) #433, #484
  • [Domains] Improve representation of public suffix match results by class EffectiveTLD (sebastian-nagel, Richard Zowalla) #478
  • Javadoc: fix links to Java core classes (sebastian-nagel, Richard Zowalla) #417, #483
  • [Sitemaps] Improve logging done by SiteMapParser (Valery Yatsynovich, sebastian-nagel) #457
  • [Sitemaps] Google Sitemap PageMap extensions (josepowera, sebastian-nagel, Richard Zowalla, jnioche) #388, #442
  • [Domains] Installation of a gzip-compressed public suffix list from Maven cache breaks EffectiveTldFinder to address (sebastian-nagel, Richard Zowalla) #441, #443
  • Upgrade dependencies (dependabot) #437, #444, #448, #451, #473, #465, #466, #468, #488, #491, #506, #511, #517
  • Upgrade Maven plugins (dependabot) #434, #438, #439, #449, #445, #452, #455, #459, #460, #464, #469, #467, #470, #471, #472, #474, #475, #476, #477, #480, #481, #482, #489, #490, #495, #496, #497, #498, #499, #508, #509, #518
  • Upgrade GitHub workflow actions v2 -> v4 (sebastian-nagel, Richard Zowalla) #501

Release 1.4 (2023-07-13)

  • [Robots.txt] Implement Robots Exclusion Protocol (REP) IETF Draft: port unit tests (sebastian-nagel, Richard Zowalla) #245, #360
  • [Robots.txt] Close groups of rules as defined in RFC 9309 (kkrugler, garyillyes, jnioche, sebastian-nagel) #114, #390, #430
  • [Robots.txt] Empty disallow statement not to clear other rules (sebastian-nagel, jnioche) #422, #424
  • [Robots.txt] SimpleRobotRulesParser main() to follow five redirects (sebastian-nagel, jnioche) #428
  • [Robots.txt] Add more spelling variants and typos of robots.txt directives (sebastian-nagel, jnioche) #425
  • [Robots.txt] Document effect of rules merging in combination with multiple agent names (sebastian-nagel, Richard Zowalla) #423, #426
  • [Robots.txt] Pass empty collection of agent names to select rules for any robot (wildcard user-agent name) (sebastian-nagel, Richard Zowalla) #427
  • [Robots.txt] Rename default user-agent / robot name in unit tests (sebastian-nagel, Richard Zowalla) #429
  • [Robots.txt] Add units test based on examples in RFC 9309 (sebastian-nagel, Richard Zowalla) #420
  • [BasicNormalizer] Query parameters normalization in BasicURLNormalizer (aecio, sebastian-nagel, Richard Zowalla) #308, #421
  • [Robots.txt] Deduplicate robots rules before matching (sebastian-nagel, jnioche) #416

... (truncated)

Commits
  • ce0fcb3 [maven-release-plugin] prepare release crawler-commons-1.6
  • 3d24b45 Update CHANGES.txt for release of 1.6
  • 9c82251 Bump org.apache.maven.plugins:maven-source-plugin from 3.3.1 to 3.4.0
  • 59cb351 Bump de.thetaphi:forbiddenapis from 3.9 to 3.10
  • 4730575 Bump org.apache.maven.plugins:maven-jar-plugin from 3.4.2 to 3.5.0
  • 5164d06 Update changelog
  • e58ecbc #551 – Support IDNA2008 Unicode domains by using ALLOW_UNASSIGNED in IDN meth...
  • 23aaeb2 Update Changelog
  • 8eb9c8f Unit tests for URLUtils.resolve(...)
  • 94658e3 Update Changelog
  • Additional commits viewable in compare view

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [com.github.crawler-commons:crawler-commons](https://github.com/crawler-commons/crawler-commons) from 1.5 to 1.6.
- [Release notes](https://github.com/crawler-commons/crawler-commons/releases)
- [Changelog](https://github.com/crawler-commons/crawler-commons/blob/master/CHANGES.txt)
- [Commits](crawler-commons/crawler-commons@crawler-commons-1.5...crawler-commons-1.6)

---
updated-dependencies:
- dependency-name: com.github.crawler-commons:crawler-commons
  dependency-version: '1.6'
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot Bot added the dependencies (Pull requests that update a dependency file) and java (Pull requests that update Java code) labels Dec 8, 2025
@rzo1 rzo1 added this to the 3.5.1 milestone Dec 8, 2025
@rzo1 rzo1 merged commit 72b4fbc into main Dec 8, 2025
2 checks passed
@dependabot dependabot Bot deleted the dependabot/maven/com.github.crawler-commons-crawler-commons-1.6 branch December 8, 2025 08:54

Labels

  • dependencies: Pull requests that update a dependency file
  • java: Pull requests that update Java code
