Jina API error logging is inconsistent with other search providers · Issue #2484 · bytedance/deer-flow · GitHub
Jina API error logging is inconsistent with other search providers #2484

@whhe

Description

In backend/packages/harness/deerflow/community/jina_ai/jina_client.py, when a request to the Jina API fails (e.g. due to a network timeout), the error is logged using logger.exception(...). This produces an ERROR-level log entry with a full multi-frame traceback (httpx / httpcore / anyio internals).

In contrast, the other search engines queried through ddgs (e.g. Brave, Yandex, Yahoo, DuckDuckGo, Mojeek, Grokipedia) log the same class of transient failures (timeouts, connection errors) at INFO level with a single concise message and no traceback.

As a result, a single offline or slow-network session produces dozens of ERROR-level stack traces for failures that are expected, transient, and recoverable, while semantically equivalent failures from other providers stay quiet. The noise makes it much harder to spot real problems in the logs.

Current behavior

jina_client.py (relevant excerpt):

except Exception as e:
    error_message = f"Request to Jina API failed: {str(e)}"
    logger.exception(error_message)
    return f"Error: {error_message}"

A single failed crawl produces output similar to:

[error] Request to Jina API failed:    [deerflow.community.jina_ai.jina_client] ...
Traceback (most recent call last):
  File ".../httpx/_transports/default.py", line 101, in map_httpcore_exceptions
    yield
  File ".../httpcore/_async/connection_pool.py", line 256, in handle_async_request
    raise exc from None
  ... (20+ more frames across httpx / httpcore / anyio) ...
httpcore.ConnectTimeout

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File ".../jina_ai/jina_client.py", line 27, in crawl
    response = await client.post("https://r.jina.ai/", ...)
  ... (more frames) ...
httpx.ConnectTimeout

Other providers, for the same kind of failure, log a single line at INFO level, for example:

[info] Error in engine brave: TimeoutException("Request timed out: ...") [ddgs.ddgs] ...
[info] Error in engine yandex: TimeoutException("Request timed out: ...") [ddgs.ddgs] ...
[info] Error in engine duckduckgo: TimeoutException("Request timed out: ...") [ddgs.ddgs] ...

Expected behavior

Failures from the Jina client should be logged in a way that is consistent with the other search/crawl providers: a single, concise log line that includes the exception type and message, without dumping the full httpx/httpcore traceback for every failed request.
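
One possible shape for this change is sketched below. It is a minimal, standalone illustration rather than the project's actual fix: the helper name `log_jina_failure` and the logger name are hypothetical, and the WARNING level is an assumption (INFO would match the ddgs lines shown above more closely). The point is that a plain `logger.warning(...)` call, unlike `logger.exception(...)`, does not attach the traceback, and `repr(exc)` keeps the exception type visible even when the message is empty.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("jina_client_sketch")


def log_jina_failure(exc: Exception) -> str:
    """Log a failed request as one concise line and return the error string.

    repr(exc) yields e.g. TimeoutError('timed out'), so the exception type
    is still shown when str(exc) would be empty.
    """
    error_message = f"Request to Jina API failed: {exc!r}"
    # No exc_info is passed, so no traceback is written to the log.
    # The level is an assumption; logger.info would mirror ddgs.
    logger.warning(error_message)
    return f"Error: {error_message}"
```

In the excerpt above, this would correspond to replacing the `logger.exception(error_message)` call inside the `except Exception as e:` block. If the full traceback is still wanted for debugging, it could additionally be emitted at DEBUG level with `exc_info=True`.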
