feat: add Tavily as configurable search engine in core search framework by tavily-integrations · Pull Request #891 · modelscope/ms-agent · GitHub
Skip to content

feat: add Tavily as configurable search engine in core search framework#891

Open
tavily-integrations wants to merge 1 commit into
modelscope:mainfrom
Tavily-FDE:feat/tavily-migration/core-search-framework
Open

feat: add Tavily as configurable search engine in core search framework#891
tavily-integrations wants to merge 1 commit into
modelscope:mainfrom
Tavily-FDE:feat/tavily-migration/core-search-framework

Conversation

@tavily-integrations

Copy link
Copy Markdown

Summary

  • Added TavilySearch as a new search engine option alongside Exa, SerpAPI, and Arxiv
  • Follows the existing engine pattern: schema dataclasses + SearchEngine subclass + registration in all registries
  • Tavily results are mapped to the shared BaseResult schema for seamless downstream consumption

Files Changed

  • ms_agent/tools/search/search_base.py — Added TAVILY to SearchEngineType enum and 'tavily': 'tavily_search' to ENGINE_TOOL_NAMES
  • ms_agent/tools/search/tavily/__init__.py — New package init exporting TavilySearch, TavilySearchRequest, TavilySearchResult
  • ms_agent/tools/search/tavily/schema.py — New file with TavilySearchRequest and TavilySearchResult dataclasses
  • ms_agent/tools/search/tavily/search.py — New file with TavilySearch class using tavily-python SDK
  • ms_agent/tools/search/websearch_tool.py — Added 'tavily' to SUPPORTED_ENGINES, get_search_engine_class(), get_search_engine(), _api_keys init, and connect() engine instantiation
  • ms_agent/tools/search_engine.py — Added TAVILY branch in get_web_search_tool() factory and TAVILY_API_KEY support in env overrides
  • requirements/research.txt — Added tavily-python dependency

Dependency Changes

  • Added tavily-python to requirements/research.txt

Environment Variable Changes

  • Added TAVILY_API_KEY — required when using the Tavily engine

Notes for Reviewers

  • This is an additive change — all existing engines remain fully functional
  • Tavily can be selected via engine: tavily in agent YAML config or via the FIN_RESEARCH_SEARCH_ENGINE override
  • The TavilySearch.search() method maps Tavily API results (contentsummary, urlid) to the shared BaseResult schema

🤖 Generated with Claude Code

Automated Review

  • Passed after 1 attempt(s)
  • Final review: The Tavily migration is well-implemented and follows all existing patterns in the codebase. The additive strategy is respected: no existing engines are removed or altered. All factory functions (get_search_engine_class, get_search_engine, get_web_search_tool, WebSearchTool.connect) are correctly updated. The TavilySearch class, schema dataclasses, __init__.py, enum entry, tool-name mapping, and SUPPORTED_ENGINES tuple are all consistent with how Exa and SerpAPI are implemented. The tavily-python dependency is added to requirements/research.txt. The Tavily SDK is used correctly (TavilyClient.search(**kwargs) returning a dict with a results list). No critical or major issues found; only three minor documentation/style gaps.

@gemini-code-assist

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new Tavily search engine integration into the existing search tool framework. This includes defining new data structures for Tavily search requests and results, implementing the Tavily search logic, and integrating it across various parts of the ms_agent search tool infrastructure, such as enums, mappings, and configuration. A new tavily-python dependency is also added. The review comments suggest several improvements: replacing assert with more robust error handling for API key validation, refining type hints for better type safety, and switching from print statements to the logger module for consistent warning and informational messages.

def __init__(self, api_key: str = None):

api_key = api_key or os.getenv('TAVILY_API_KEY')
assert api_key, 'TAVILY_API_KEY must be set either as an argument or as an environment variable'

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using assert for validating API keys is not recommended in production code, as assert statements can be optimized out by the Python interpreter, leading to unexpected behavior if the key is missing. A ValueError or RuntimeError provides a more robust and explicit error handling mechanism.

Suggested change
assert api_key, 'TAVILY_API_KEY must be set either as an argument or as an environment variable'
if not api_key:
raise ValueError('TAVILY_API_KEY must be set either as an argument or as an environment variable')

arguments: Dict[str, Any] = field(default_factory=dict)

# The response from the Tavily search API (dict with 'results' key)
response: SearchResponse = None

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The type hint for response should specify the generic type of SearchResponse. Since the TavilySearch.search method populates this with SearchResponse[BaseResult], the type hint here should reflect that for better type safety and clarity.

Suggested change
response: SearchResponse = None
response: Optional[SearchResponse[BaseResult]] = None

Convert the search results to a list of dictionaries.
"""
if not self.response or not self.response.results:
print('***Warning: No search results found.')

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using print for warnings is not ideal in a library context. It's better to use the logger module for consistent logging practices, which allows for configurable output and severity levels.

Suggested change
print('***Warning: No search results found.')
logger.warning('No search results found.')

print('***Warning: No search results found.')
return []

if not self.query:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using print for warnings is not ideal in a library context. It's better to use the logger module for consistent logging practices, which allows for configurable output and severity levels.

Suggested change
if not self.query:
logger.warning('No query provided for search results.')

"""
with open(file_path, 'r', encoding='utf-8') as f:
data = json.load(f)
print(f'Search results loaded from {file_path}')

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using print for informational messages is not ideal in a library context. It's better to use the logger module for consistent logging practices, which allows for configurable output and severity levels.

Suggested change

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant