Issue #728: Adding asterisk for metadata transfer by michaeldinzinger · Pull Request #1117 · apache/stormcrawler · GitHub

Issue #728: Adding asterisk for metadata transfer #1117

Merged
jnioche merged 3 commits into apache:master from michaeldinzinger:devAsteriskMetadataTransfer
Nov 13, 2023

Conversation

@michaeldinzinger

Contributor

Hello all

Goal:
When many metadata fields share the same prefix, e.g. parse.[...], one can simply write

metadata.persist:
- parse.*

instead of listing them all.
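
The prefix matching described above could be sketched in plain Java as follows. This is only an illustration and not the code merged in this PR: the class `WildcardKeyFilter` and its methods are hypothetical, and the sketch assumes that a configured entry ending in an asterisk matches every key that starts with the text before the asterisk, while any other entry must match the key exactly.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical helper (not part of StormCrawler): selects metadata keys
 * either by exact name or by a prefix written with a trailing asterisk.
 */
final class WildcardKeyFilter {
    private final List<String> exactKeys = new ArrayList<>();
    private final List<String> prefixes = new ArrayList<>();

    WildcardKeyFilter(List<String> configuredEntries) {
        for (String entry : configuredEntries) {
            if (entry.endsWith("*")) {
                // "parse.*" is stored as the prefix "parse."
                prefixes.add(entry.substring(0, entry.length() - 1));
            } else {
                exactKeys.add(entry);
            }
        }
    }

    /** Returns true if the key is listed exactly or starts with a configured prefix. */
    boolean matches(String key) {
        if (exactKeys.contains(key)) {
            return true;
        }
        for (String prefix : prefixes) {
            if (key.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    /** Returns a new map holding only the entries whose key matches. */
    Map<String, String[]> filter(Map<String, String[]> metadata) {
        Map<String, String[]> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String[]> e : metadata.entrySet()) {
            if (matches(e.getKey())) {
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }
}
```

With the entries `parse.*` and `fetch.statusCode`, this sketch would keep `parse.title` and `fetch.statusCode` but drop `fetch.error`.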

In addition, this PR makes mdToTransfer, mdToPersistOnly, trackPath and trackDepth protected instead of private, which makes it easier to create a custom metadata transfer class.
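
To illustrate why the visibility change helps, here is a minimal, self-contained sketch. Both classes are hypothetical stand-ins and not StormCrawler code: the field names follow the description above, but their types and default values are assumptions made for the example.

```java
import java.util.HashSet;
import java.util.Set;

/**
 * Hypothetical stand-in for a metadata transfer base class.
 * Field names follow the PR description; types and defaults are assumed.
 */
class BaseMetadataTransfer {
    protected final Set<String> mdToTransfer = new HashSet<>();
    protected final Set<String> mdToPersistOnly = new HashSet<>();
    protected boolean trackPath = true;
    protected boolean trackDepth = true;
}

/**
 * Hypothetical subclass: it can read and adjust the inherited fields
 * directly because they are protected rather than private.
 */
class CustomMetadataTransfer extends BaseMetadataTransfer {
    CustomMetadataTransfer() {
        trackPath = false;              // override an inherited setting
        mdToTransfer.add("customKey");  // extend the inherited key set
    }

    boolean transfers(String key) {
        return mdToTransfer.contains(key);
    }

    boolean tracksPath() {
        return trackPath;
    }
}
```

If the fields were private, the subclass in this sketch would not compile, since it could neither read nor modify them.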

Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
@jnioche jnioche added this to the 2.11 milestone Nov 6, 2023

@jnioche jnioche left a comment

Contributor

Thanks! This would be very helpful. See the comments on possible improvements.

Comment thread core/src/main/java/com/digitalpebble/stormcrawler/util/MetadataTransfer.java Outdated
Comment thread core/src/main/java/com/digitalpebble/stormcrawler/util/MetadataTransfer.java Outdated
Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
@jnioche

jnioche commented Nov 12, 2023

Contributor

…persist + transfer

Signed-off-by: Michael Dinzinger <michael.dinzinger@uni-passau.de>
@michaeldinzinger

Contributor Author

@jnioche jnioche left a comment

Contributor

Perfect! Thanks a lot @michaeldinzinger, this is a great addition to StormCrawler

@jnioche jnioche merged commit 4d3340f into apache:master Nov 13, 2023
@michaeldinzinger michaeldinzinger deleted the devAsteriskMetadataTransfer branch December 8, 2023 15:16