fix(description-sanitizer): allow BlockEditor file attachment and callout attributes on div elements by Rohit0301 · Pull Request #27658 · open-metadata/OpenMetadata · GitHub
Skip to content

fix(description-sanitizer): allow BlockEditor file attachment and callout attributes on div elements#27658

Open
Rohit0301 wants to merge 6 commits intomainfrom
fix-image-upload
Open

fix(description-sanitizer): allow BlockEditor file attachment and callout attributes on div elements#27658
Rohit0301 wants to merge 6 commits intomainfrom
fix-image-upload

Conversation

@Rohit0301
Copy link
Copy Markdown
Contributor

@Rohit0301 Rohit0301 commented Apr 23, 2026

Describe your changes:

Extended DescriptionSanitizer to allow BlockEditor-specific data-* attributes on

elements — previously these were stripped on save, causing file attachments and callout nodes to lose their metadata
Attributes added: data-type, data-url, data-filename, data-filesize, data-mimetype, data-uploading, data-upload-progress, data-is-image, data-alt, data-callouttype
Also added data-textcontent to the existing allowlist on tags for entity mention/hashtag nodes
data-temp-file is intentionally excluded as it holds transient upload state that should not be persisted

Fixes #27666

I worked on ... because ...

Screenshot 2026-04-23 at 5 17 31 PM

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

@Rohit0301 Rohit0301 self-assigned this Apr 23, 2026
@Rohit0301 Rohit0301 added the safe to test Add this label to run secure Github workflows on PRs label Apr 23, 2026
mohityadav766
mohityadav766 previously approved these changes Apr 23, 2026
chirag-madlani
chirag-madlani previously approved these changes Apr 23, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 23, 2026

@github-actions
Copy link
Copy Markdown
Contributor

The Java checkstyle failed.

Please run mvn spotless:apply in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

1 similar comment
@github-actions
Copy link
Copy Markdown
Contributor

The Java checkstyle failed.

Please run mvn spotless:apply in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

@gitar-bot
Copy link
Copy Markdown

gitar-bot Bot commented Apr 23, 2026

Code Review ✅ Approved 3 resolved / 3 findings

Updates the description sanitizer to allow file attachments and callout attributes on div elements. Sanitization logic for data-url has been tightened, and missing test coverage for block editor attributes has been addressed.

✅ 3 resolved
Security: data-url on div lacks protocol validation unlike href on anchor

📄 openmetadata-service/src/main/java/org/openmetadata/service/util/DescriptionSanitizer.java:105-116
The data-url attribute is allowed on <div> elements without any .matching() policy to restrict URL protocols. Unlike href on <a> tags (which is validated against http/https/mailto/#/ prefixes at line 50-58), data-url accepts arbitrary values including javascript: or data:text/html URIs.

The frontend's FileNodeView.tsx uses this value to set src on <video>/<audio> elements and to trigger file downloads. While src on media elements is generally safer than href, a javascript: URI in a download handler or future code changes could become exploitable. Since this is a stored value (persisted in descriptions), it's worth applying the same protocol allowlist used for href.

Quality: Test missing for data-filesize, data-alt, data-callouttype, data-textcontent

📄 openmetadata-service/src/test/java/org/openmetadata/service/util/DescriptionSanitizerTest.java:233-247
The new test fileAttachmentDivAttributesArePreserved covers 7 of the 10 newly allowed div attributes but omits data-filesize, data-alt, and data-callouttype. Similarly, the newly added data-textcontent on <a> tags has no dedicated test assertion. While the allowlist is straightforward, covering all explicitly allowed attributes ensures regressions are caught if the policy is refactored.

Quality: Test coverage for data-textcontent on removed but attr still allowed

📄 openmetadata-service/src/test/java/org/openmetadata/service/util/DescriptionSanitizerTest.java:263-271 📄 openmetadata-service/src/main/java/org/openmetadata/service/util/DescriptionSanitizer.java:101
The delta commit removes both the data-textcontent="@admin" attribute from the test input (line 266) and the corresponding assertion assertTrue(result.contains("data-textcontent=…")) (line 274). However, data-textcontent is still in the sanitizer allowlist for <a> elements (added in the previous commit). This leaves the newly-added attribute completely untested — if someone accidentally removes it from the allowlist in the future, no test will catch the regression.

Since the previous review explicitly asked for test coverage of these attributes and this commit is titled "unit test fix", it looks like the attribute was removed from the test by mistake rather than intentionally.

Options

Display: compact → Showing less information.

Comment with these commands to change:

Compact
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

@sonarqubecloud
Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Image upload is failing due to description sanitisation

3 participants