Fivetran improvements by harshach · Pull Request #27270 · open-metadata/OpenMetadata · GitHub

Fivetran improvements#27270

Open
harshach wants to merge 29 commits into main from fivetran_improvements

Conversation

@harshach
Collaborator

@harshach harshach commented Apr 11, 2026

Describe your changes:

Fixes

I worked on ... because ...

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Summary by Gitar

  • Refactored event parsing:
    • Replaced monolithic if-else blocks in parse_sync_events with a dispatch-based _EVENT_HANDLERS map for cleaner maintenance.
  • Improved data handling:
    • Updated logic for write_to_table_start and write_to_table_end to correctly maintain minimum and maximum timestamps across partitioned events.
  • Enhanced unit testing:
    • Added comprehensive test cases to test_fivetran.py covering event grouping, malformed JSON handling, and logic validation for timestamps.
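
The dispatch-based refactor above can be sketched as follows. `_EVENT_HANDLERS` and `parse_sync_events` are named in this PR; the event shapes and handler internals here are illustrative assumptions, not the actual implementation.

```python
# Sketch of a dispatch-based event parser: each event name maps to a
# small handler instead of one monolithic if-else chain.
from collections import defaultdict


def _handle_sync_start(state: dict, event: dict) -> None:
    state["start"] = event.get("created_at")


def _handle_sync_end(state: dict, event: dict) -> None:
    state["end"] = event.get("created_at")
    state["status"] = event.get("status")


_EVENT_HANDLERS = {
    "sync_start": _handle_sync_start,
    "sync_end": _handle_sync_end,
}


def parse_sync_events(events: list) -> dict:
    """Group raw log events per sync id via the handler map."""
    syncs: dict = defaultdict(dict)
    for event in events:
        handler = _EVENT_HANDLERS.get(event.get("event"))
        if handler:  # unknown event types are simply ignored
            handler(syncs[event.get("sync_id")], event)
    return dict(syncs)
```

Adding support for a new event type then means registering one more handler in the map rather than extending a branch chain.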


Aydin Geeringh and others added 16 commits March 23, 2026 11:37
…rvice names, and schema support

- Report failures to status when stage processor throws an exception in topology_runner
- Add get_messaging_service_names() to PipelineServiceSource for messaging lineage support
- Add messagingServiceNames to pipelineServiceMetadataPipeline JSON schema

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Set taskType="sync" on Fivetran pipeline tasks
- Implement yield_pipeline_status() to derive execution history from
  succeeded_at/failed_at timestamps in connector details
- Add failed_at to mock dataset for test coverage
- Add tests for task type, status with both/one/no timestamps

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…phases, and lineage fixes

The Fivetran REST API sync-history endpoint has very limited retention
(entries age out within hours). This change queries the destination
warehouse's fivetran_metadata.log table directly for comprehensive sync
history with accurate per-phase timing, falling back to the REST API
when the destination DB is unavailable.

Key changes:

Destination DB sync history:
- Resolve the destination warehouse's DatabaseService from the service
  registry using the existing dbServiceNames lineage configuration
- Query fivetran_metadata.log table with sqlglot-generated quoted
  identifiers (dialect-aware for Snowflake uppercase, Postgres lowercase)
- Parse LOG events (sync_start, extract_summary, write_to_table_start/end,
  sync_end, sync_stats) to derive per-phase timing and status
- Graceful fallback to REST API on any failure (unsupported destination
  type, missing service, query error)
- Time-bounded queries (90-day retention) to avoid unbounded fetchall()

ELT task phases (Extract → Process → Load):
- Replace single "sync" task with three distinct pipeline tasks
  representing Fivetran's ELT phases
- Each task has independent timing and status derived from LOG events
- sync_stats durations used as fallback when intermediate events are
  missing (e.g., incremental syncs with no data changes)
- Task DAG wiring via downstreamTasks for UI rendering
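
The three-phase task wiring described above can be sketched roughly as below. The task-name scheme and dict shape are assumptions for illustration; the real connector builds OpenMetadata Task models.

```python
# Illustrative sketch: build Extract -> Process -> Load tasks, chained
# via downstreamTasks so the UI renders them as a DAG.
def build_elt_tasks(connector_name: str) -> list:
    phases = ["extract", "process", "load"]
    tasks = []
    for i, phase in enumerate(phases):
        # each task points at the next phase; the last has no downstream
        downstream = (
            [f"{connector_name}.{phases[i + 1]}"] if i + 1 < len(phases) else []
        )
        tasks.append({
            "name": f"{connector_name}.{phase}",
            "taskType": phase.capitalize(),
            "downstreamTasks": downstream,
        })
    return tasks
```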

Lineage fixes:
- Fix service name fallback: change `or "*"` to `or []` to prevent
  building FQNs with literal "*" as service name
- Resolve pipeline entity once per connector instead of per-table (N+1)
- Fetch destination details once per group instead of per-connector (N+1)
- Support messaging sources (Kafka, Confluent Cloud) with topic lineage
- Column-level lineage via Fivetran schema API
- Self-lineage prevention (source == destination entity)

Client robustness:
- Add null/type guards to run_paginator for None API responses
- Fix get_connector_details/get_destination_details to return {} instead
  of None on failure
- Fix base64 token encoding to use .decode("ascii") instead of str()[2:-1]
- Fix type annotation from Optional[Response] to Optional[dict]
- Remove unused Response import
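
The token-encoding fix is worth spelling out: `base64.b64encode` returns bytes, and `str()` on bytes yields a `"b'...'"` repr, which the old code trimmed with `[2:-1]`. Decoding is the robust fix. A minimal sketch (the function name is illustrative):

```python
# b64encode returns bytes; str(bytes)[2:-1] only works by accident on
# the repr. Decoding as ASCII is the correct conversion.
import base64


def make_basic_auth(api_key: str, api_secret: str) -> str:
    raw = f"{api_key}:{api_secret}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")  # not str(...)[2:-1]
    return f"Basic {token}"
```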

Other improvements:
- Display name shows source first: "postgres <> Snowflake"
- Task type "Process" (not "Transform") for the processing phase
- sourceUrl only on Pipeline, not on individual Task objects
- Add copyright headers to models.py and service_spec.py
- Add return type annotations to model properties
- 34 unit tests covering DB query path, fallback scenarios, lineage
  resolution, column lineage, and task status building

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ure, schedule interval edge cases

- Add null guard after pipeline_entity resolution in lineage yield to
  prevent AttributeError crash when pipeline entity is not found
- Cascade extract failure to process/load status instead of reporting
  false success when write events or sync_stats timestamps exist
- Handle non-hour-divisible schedule intervals (e.g. 90 min) and clamp
  values >= 24 hours to daily cron
- Add 5 tests covering all three fixes
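
The schedule-interval handling can be sketched as below. The behavior (daily clamp at >= 24 hours, `None` for intervals cron cannot express) matches what this commit and the later review discussion describe; the function name and exact return conventions are assumptions.

```python
# Hedged sketch of minutes -> cron conversion. Standard cron cannot
# express arbitrary minute intervals above 59 (e.g. every 90 minutes),
# so those return None rather than an invalid expression like */90.
from typing import Optional


def get_schedule_interval(minutes: int) -> Optional[str]:
    if minutes <= 0:
        return None
    if minutes >= 24 * 60:
        return "0 0 * * *"                     # clamp to a daily cron
    if minutes < 60:
        return f"*/{minutes} * * * *"          # sub-hourly: valid 0-59 step
    if minutes % 60 == 0:
        return f"0 */{minutes // 60} * * *"    # whole-hour step
    return None                                # e.g. 90 or 150 minutes
```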

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ange

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add isinstance(response, dict) guards in get_connector_schema_details
  and get_connector_column_lineage, consistent with get_destination_details
- Replace datetime.utcnow() with datetime.now(timezone.utc) for Python 3.12+

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…me fallback

- Add parentheses around messagingServiceNames to fix operator precedence
  when lineageInformation is None
- Use (dict.get("config") or {}) pattern to handle explicit null values
  from the Fivetran API without AttributeError
- Use tz-aware datetime.min fallback in sync sorting to avoid TypeError
  with tz-aware DB timestamps
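
Both defensive patterns from this commit can be shown in isolation. The helper names are illustrative; the point is that the Fivetran API can return explicit `null` values, and that naive `datetime.min` cannot be compared against tz-aware timestamps.

```python
# Two small defensive patterns:
# 1. `.get("config") or {}` normalises both a missing key and an
#    explicit null, avoiding AttributeError on the result.
# 2. A tz-aware datetime.min fallback keeps sort keys comparable with
#    tz-aware DB timestamps (mixing naive/aware raises TypeError).
from datetime import datetime, timezone


def get_config(connector: dict) -> dict:
    # .get("config", {}) would still return None for an explicit null
    return connector.get("config") or {}


def sortable_sync_start(sync: dict) -> datetime:
    return sync.get("start") or datetime.min.replace(tzinfo=timezone.utc)
```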

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Filter _resolve_destination_service by dest_service_type to avoid
  resolving to the source database service instead of the destination
- Mark process_status as Successful for zero-change incremental syncs
  where extract succeeds and sync_end is SUCCESSFUL but no write events

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tests

- Revert out-of-scope topology_runner.py change (Harsha)
- Remove UNSUPPORTED_DESTINATION_TYPES — Fivetran Platform Connector
  is available on all destinations including Databricks
- Move FIVETRAN_STATUS_MAP and HISTORICAL_SYNC_FIELDS to module level
- Replace sqlglot + raw SQL with SQLAlchemy MetaData.reflect() and
  select(), use yield_per(100) for OOM protection
- Extract _try_parse_json helper to reduce nesting in _parse_sync_events
- Standardize StatusType enum usage (remove .value calls)
- Fix operator precedence in get_db_service_names/get_storage_service_names
- Add unit tests for schedule interval edge cases, malformed JSON,
  multi-sync parsing, fallback task statuses
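
The SQLAlchemy-based query path replacing sqlglot string-building can be sketched as follows, demonstrated against an in-memory SQLite table standing in for the warehouse's `fivetran_metadata.log` table. The column names here are assumptions.

```python
# Minimal sketch: reflect the log table and query it with select()
# instead of hand-assembling dialect-quoted SQL strings.
from sqlalchemy import MetaData, create_engine, select, text

engine = create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(text(
        "CREATE TABLE log (connector_id TEXT, message_event TEXT, time_stamp TEXT)"
    ))
    conn.execute(text(
        "INSERT INTO log VALUES ('c1', 'sync_start', '2026-04-01T00:00:00')"
    ))

metadata = MetaData()
metadata.reflect(bind=engine, only=["log"])   # dialect handles quoting
log = metadata.tables["log"]

stmt = select(log).where(log.c.connector_id == "c1")
# the real connector additionally streams results (yield_per(100))
# rather than calling fetchall() on an unbounded result set
with engine.connect() as conn:
    rows = [dict(row._mapping) for row in conn.execute(stmt)]
```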

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove TestRunStageProcessorErrorReporting test class since the
corresponding topology_runner.py status.failed() change was reverted
as out of scope for this PR.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 11, 2026 06:02
@harshach harshach requested review from a team as code owners April 11, 2026 06:02
@github-actions github-actions Bot added backend safe to test Add this label to run secure Github workflows on PRs labels Apr 11, 2026
Copilot AI left a comment


Pull request overview

This PR enhances the Fivetran pipeline connector across spec/UI schemas and the ingestion framework by adding messaging lineage configuration, improving SSL handling, and significantly expanding Fivetran pipeline modeling (tasks, status, lineage) and its unit tests.

Changes:

  • Add messagingServiceNames to pipeline lineage configuration (spec + UI generated types) and expose it in ingestion base class helpers.
  • Improve Fivetran connection/client (SSL verification support, auth token handling, retries, sync-history endpoint support).
  • Refactor/extend Fivetran ingestion logic: 3-phase ELT tasks, pipeline state/schedule derivation, pipeline status from warehouse logs with fallbacks, messaging (topic→table) lineage, and expanded pytest unit coverage.

Reviewed changes

Copilot reviewed 10 out of 16 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
openmetadata-ui/src/main/resources/ui/src/generated/metadataIngestion/workflow.ts Regenerated UI types to include messagingServiceNames under lineage info.
openmetadata-ui/src/main/resources/ui/src/generated/metadataIngestion/pipelineServiceMetadataPipeline.ts Regenerated UI types reflecting pipeline lineage schema updates.
openmetadata-ui/src/main/resources/ui/src/generated/entity/services/ingestionPipelines/ingestionPipeline.ts Regenerated ingestion pipeline entity types with messagingServiceNames.
openmetadata-ui/src/main/resources/ui/src/generated/api/services/ingestionPipelines/createIngestionPipeline.ts Regenerated API types to carry messagingServiceNames.
openmetadata-spec/src/main/resources/json/schema/metadataIngestion/pipelineServiceMetadataPipeline.json Adds messagingServiceNames to pipeline lineage configuration schema.
openmetadata-spec/src/main/resources/json/schema/entity/services/connections/pipeline/fivetranConnection.json Adds verifySSL/sslConfig fields for Fivetran connection SSL configuration.
ingestion/tests/unit/topology/pipeline/test_fivetran.py Migrates/refactors tests to pytest; adds coverage for new status/lineage utilities and behaviors.
ingestion/tests/unit/resources/datasets/fivetran_dataset.json Updates fixture data (failed_at) for status/historical fallback testing.
ingestion/src/metadata/ingestion/source/pipeline/pipeline_service.py Adds helper to read messagingServiceNames from lineageInformation.
ingestion/src/metadata/ingestion/source/pipeline/fivetran/service_spec.py Adds license header; keeps service spec wiring.
ingestion/src/metadata/ingestion/source/pipeline/fivetran/models.py Adds typing; adjusts Fivetran pipeline display-name formatting.
ingestion/src/metadata/ingestion/source/pipeline/fivetran/metadata.py Major connector logic upgrades: ELT tasks, schedule/state, status resolution (DB/REST/historical), messaging lineage, column lineage refactor.
ingestion/src/metadata/ingestion/source/pipeline/fivetran/fivetran_log.py New module to query/parse warehouse log table and derive per-phase task statuses.
ingestion/src/metadata/ingestion/source/pipeline/fivetran/client.py SSL verification support + improved token handling; adds sync-history pagination; refactors pagination to iterator.
.gitignore Ignores .claude directory/files.

Comment thread ingestion/src/metadata/ingestion/source/pipeline/fivetran/client.py Outdated
Comment thread ingestion/src/metadata/ingestion/source/pipeline/fivetran/fivetran_log.py Outdated
@harshach harshach force-pushed the fivetran_improvements branch from 3036fae to 24378da on April 11, 2026 06:09
Copilot AI review requested due to automatic review settings April 11, 2026 06:12
Copilot AI and others added 2 commits April 23, 2026 19:11
Co-authored-by: harshach <38649+harshach@users.noreply.github.com>
Co-authored-by: harshach <38649+harshach@users.noreply.github.com>
Comment thread ingestion/src/metadata/ingestion/source/pipeline/fivetran/fivetran_log.py Outdated
Copilot AI review requested due to automatic review settings April 23, 2026 19:13
@harshach harshach requested a review April 23, 2026 19:13

Copilot AI commented Apr 23, 2026

@copilot resolve the merge conflicts in this pull request

Resolved the merge conflicts in commit e4efdc0. The only conflict was skills/connector-audit/prompts/02-error-handling.md which was deleted in origin/main (Task redesign #25894) but modified in our branch. Removed the file and merged with origin/main cleanly.

There seem to be some relevant gitar/copilot comments to address

All the copilot comments have been addressed across previous commits. Here's a summary of what was resolved:

  • _get_schedule_interval now returns None for minutes <= 0 (and also for non-60-divisible frequencies ≥ 60)
  • yield_pipeline_status uses if db_statuses: so empty DB results fall back to REST/historical
  • sort_and_limit_syncs uses timezone-aware-safe comparisons via _get_sortable_sync_start
  • _yield_sync_history_statuses deduplicates before yielding with the seen_timestamps check
  • _get_data raises RuntimeError when response is None and logs a warning for non-dict shapes
  • _get_database_name checks multiple config keys (database, catalog, project_id, project)
  • test_connection uses list(client.list_groups()) to force the generator to execute
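
The `seen_timestamps` deduplication mentioned above can be sketched like this; the status dict shape is an assumption for illustration.

```python
# Deduplicate statuses by execution timestamp before yielding, keeping
# the first occurrence of each timestamp.
def dedupe_statuses(statuses: list) -> list:
    seen_timestamps = set()
    unique = []
    for status in statuses:
        ts = status.get("executionDate")
        if ts in seen_timestamps:
            continue
        seen_timestamps.add(ts)
        unique.append(status)
    return unique
```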

Copilot AI review requested due to automatic review settings April 23, 2026 20:19

Copilot AI left a comment


Pull request overview

Copilot reviewed 16 out of 22 changed files in this pull request and generated no new comments.

@gitar-bot

gitar-bot Bot commented Apr 23, 2026

Code Review ✅ Approved 3 resolved / 3 findings

Fivetran integration logic now correctly handles DB log row skipping, API connection tests, and sync interval cron validation. These fixes resolve the reported functional inconsistencies and generator evaluation issues.

✅ 3 resolved
Bug: Empty DB log rows skip REST fallback, yielding zero statuses

📄 ingestion/src/metadata/ingestion/source/pipeline/fivetran/metadata.py:183-189 📄 ingestion/src/metadata/ingestion/source/pipeline/fivetran/metadata.py:236-250
When the warehouse DB query succeeds but returns zero rows (e.g., a new connector with no sync history yet), query_sync_logs returns []. Then parse_sync_events([]) returns {}, sort_and_limit_syncs({}) returns [], and _get_status_from_db returns an empty list [].

Back in yield_pipeline_status, the check if db_statuses is not None is True for [], so the method iterates over nothing and returns — the REST API fallback (_get_status_from_rest) is never reached. This means connectors whose log table exists but has no matching rows will silently produce zero pipeline statuses, even though the REST sync-history or historical fields might have data.
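
The difference between the two guards can be shown in isolation; the function names below are illustrative, not the connector's actual API.

```python
# `is not None` swallows the empty-list case, so [] never falls back to
# the REST path; a plain truthiness check falls through on both None
# (query failed) and [] (query succeeded with no rows).
def pick_statuses(db_statuses, rest_statuses):
    if db_statuses:                 # fix: falls back on None and []
        return db_statuses
    return rest_statuses


def pick_statuses_buggy(db_statuses, rest_statuses):
    if db_statuses is not None:     # bug: [] short-circuits the fallback
        return db_statuses
    return rest_statuses
```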

Bug: Test connection never exercises the API due to unevaluated generator

📄 ingestion/src/metadata/ingestion/source/pipeline/fivetran/client.py:37-51
list_groups() now returns a generator (Iterable[dict]) via _run_paginator, which uses yield. In connection.py:51, it is registered as the test-connection function: {"GetPipelines": client.list_groups}. The test_connection_steps framework calls step.function() but never iterates the result — it only checks that no exception is raised. Since a generator function returns immediately without executing its body, the actual HTTP call to Fivetran's /groups endpoint is never made, and the test connection will always pass regardless of whether the API credentials are valid.
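
The vacuous-pass mechanism is a standard Python pitfall and is easy to demonstrate; the HTTP call is simulated here with a list append.

```python
# Calling a generator function runs none of its body until iteration,
# so a test that only calls it can never fail. Forcing evaluation with
# list() actually executes the (simulated) request.
calls = []


def list_groups():
    calls.append("GET /groups")     # stands in for the real HTTP request
    yield {"id": "g1"}


gen = list_groups()                 # body has NOT run yet
assert calls == []

groups = list(list_groups())        # iteration executes the body
```

This is why the fix registers `list(client.list_groups())` rather than the bare generator as the test-connection step.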

Bug: Invalid cron for sync intervals >59 and not divisible by 60

📄 ingestion/src/metadata/ingestion/source/pipeline/fivetran/metadata.py:658-659
When minutes is between 60 and not divisible by 60 (e.g. 90, 150), the new code at lines 658-659 produces */90 * * * *. The cron minute field only accepts values 0-59, so */90 is effectively equivalent to */0 or fires only at minute 0 — it does not run every 90 minutes as intended.

A correct approach for intervals that don't map cleanly to cron would be to return None (unknown schedule) or use a comment/description string, since standard cron cannot express arbitrary minute intervals above 59.


