Feature/skx pytest by Peterrpeterrr · Pull Request #800 · modelscope/ms-agent · GitHub
Skip to content

Feature/skx pytest#800

Open
Peterrpeterrr wants to merge 21 commits into
modelscope:mainfrom
Y-C-Fan:feature/skx-pytest
Open

Feature/skx pytest#800
Peterrpeterrr wants to merge 21 commits into
modelscope:mainfrom
Y-C-Fan:feature/skx-pytest

Conversation

@Peterrpeterrr

Copy link
Copy Markdown

Change Summary

Related issue number

Checklist

  • The pull request title is a good summary of the changes - it will be used in the changelog
  • Unit tests for the changes exist
  • Run pre-commit install and pre-commit run --all-files before git commit, and passed lint check.
  • Documentation reflects the changes where applicable

Y-C-Fan and others added 20 commits November 7, 2025 21:11
- Add comprehensive documentation for role C (Integration Engineer & Verification Loop)
- Create analysis, goals and task breakdown documents in skx-docs
- Integrate orchestrator components from chao/docs branch
- Update projects/code_scratch configurations
- Modify .gitignore to include skx-docs in version control
- Implement external integration modules for role C
- Implement CodeScratchCaller for blackbox calling of projects/code_scratch
- Implement DeepResearchCaller for blackbox calling of projects/deep_research
- Implement PromptInjector for meta-instruction construction
- Implement TestRunner for automated testing and error feedback
- Update CodeAdapter to use real implementations instead of mocks
- Create test file for external integration modules
- Update devcontainer.json to use Python 3.10 as default
- Sync devcontainer configuration with chao/docs branch
- Add user_operation_guide.md with complete workflow instructions
- Remove git_config_guide.md as it's no longer needed in docs
@gemini-code-assist

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a comprehensive development environment using DevContainers and a new orchestration layer for the agent. The setup is very thorough and includes detailed documentation, which is great.

My review focuses on improving the security, correctness, and maintainability of the new configuration and scripts. I've identified some critical security concerns in the Docker configuration, such as the use of seccomp:unconfined and host networking, which should be addressed. There are also a few inconsistencies and potential bugs in the setup scripts and configuration files. Additionally, I've provided suggestions for optimizing the Dockerfile and refactoring some Python code for better practices.

Overall, this is a significant contribution that will greatly improve the development workflow. Addressing the feedback will help ensure the environment is secure and robust.

Comment on lines +40 to +41
security_opt:
- seccomp:unconfined

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

Disabling seccomp with seccomp:unconfined significantly reduces container security by removing the default syscall restrictions. This, combined with --cap-add=SYS_PTRACE from devcontainer.json, gives the container extensive privileges on the host. This should be avoided unless absolutely necessary for debugging. If required, please document the reason clearly. Consider using a custom, more restrictive seccomp profile if possible.

"--hostname",
"ms-agent-dev",
"--add-host=host.docker.internal:host-gateway",
"--network=host"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using --network=host removes network isolation between the container and the host, which can be a security risk. The container gets access to the host's entire network stack. If this is for convenience (e.g., accessing services on localhost), consider using host.docker.internal or explicitly publishing ports instead. If host networking is required, please document the reason.

environment:
- PYTHONPATH=/workspace
- PYTHONUNBUFFERED=1
- PYTHON_VERSION=3.11

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The PYTHON_VERSION environment variable is set to 3.11, but the Dockerfile installs python3.10. This inconsistency can lead to confusion and potential issues if scripts rely on this environment variable. Please make them consistent.

Comment on lines +51 to +56
git config --global credential.helper store

# 创建 GitHub 凭据文件
mkdir -p ~/.git-credentials
echo "https://oauth2:${GITHUB_TOKEN}@github.com" > ~/.git-credentials
chmod 600 ~/.git-credentials

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using credential.helper store saves the GitHub token in plaintext in ~/.git-credentials. This is a security risk, even inside a container. Consider using a more secure credential helper like cache to store the token in memory for a limited time. If you need to automate authentication, look into using the GitHub CLI (gh) for authentication, which is more secure.

echo "🚀 开始安装项目依赖..."

# 切换到workspace目录(项目代码在此)
cd /workspaces/seu-ms-agent

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The path /workspaces/seu-ms-agent is hardcoded. The docker-compose.yml mounts the project to /workspace. This hardcoded path is likely incorrect and will cause the script to fail. It's better to use a relative path or an environment variable. Given the context, it should probably be /workspace.

Suggested change
cd /workspaces/seu-ms-agent
cd /workspace

Comment thread .devcontainer/Dockerfile
Comment on lines +19 to +60
RUN sed -i 's|http://archive.ubuntu.com/ubuntu/|https://mirrors.aliyun.com/ubuntu/|g' /etc/apt/sources.list \
&& sed -i 's|http://security.ubuntu.com/ubuntu/|https://mirrors.aliyun.com/ubuntu/|g' /etc/apt/sources.list \
&& printf 'Acquire::Retries "5";\nAcquire::http::Timeout "30";\n' > /etc/apt/apt.conf.d/80-retries

# 创建用户(避免使用root)
ARG USERNAME=vscode
ARG USER_UID=1000
ARG USER_GID=$USER_UID

# 更新系统并安装基础软件包
RUN apt-get update && apt-get install -y \
# 基础工具
sudo \
curl \
wget \
git \
vim \
nano \
unzip \
zip \
build-essential \
cmake \
pkg-config \
# Python 3.10相关
python3.10 \
python3-pip \
python3-venv \
python3-dev \
python3-distutils \
# 网络工具
openssh-client \
# 语言环境
locales \
# 其他有用工具
htop \
tree \
jq \
# 配置语言环境并清理缓存
&& locale-gen en_US.UTF-8 \
&& update-locale LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To optimize the Docker image size and build time, it's a good practice to chain RUN commands together. The instructions on lines 19-21 and 29-60 can be combined into a single RUN layer. This avoids running apt-get update and cleaning up apt caches multiple times.

Comment on lines +105 to +192
def _parse_pytest_output(self, stdout: str,
stderr: str) -> List[Dict[str, Any]]:
"""
解析pytest输出,提取关键错误信息

Args:
stdout: pytest的标准输出
stderr: pytest的错误输出

Returns:
解析后的错误信息列表
"""
errors = []

# 解析输出以提取错误信息
output = stdout + stderr

# 查找测试失败的相关信息
lines = output.split('\n')
current_error = None

for line in lines:
# 检查是否是错误行
if 'FAIL' in line and '::' in line:
# 这是一个失败的测试
parts = line.split()
for part in parts:
if '::' in part and 'FAIL' not in part:
test_name = part
errors.append({
'type': 'test_failure',
'test_name': test_name,
'message': line.strip()
})
break
elif 'ERROR' in line and '::' in line:
# 这是一个错误的测试
parts = line.split()
for part in parts:
if '::' in part and 'ERROR' not in part:
test_name = part
errors.append({
'type': 'test_error',
'test_name': test_name,
'message': line.strip()
})
break
elif line.strip().startswith('E '):
# 这是具体的错误详情
if errors:
errors[-1]['detailed_error'] = line.strip()[
4:] # 移除 'E ' 前缀

# 尝试解析Python异常
in_traceback = False
for i, line in enumerate(lines):
if 'Traceback' in line and '(most recent call last)' in line:
in_traceback = True
continue

if in_traceback:
if line.strip().startswith('File "'):
# 提取异常信息
error_info = {'type': 'exception', 'traceback': []}

# 收集整个traceback
j = i
while j < len(lines) and not (lines[j].strip() == ''
and j > i + 5):
if j >= len(lines):
break
error_info['traceback'].append(lines[j])
if j > i and not lines[j].strip().startswith(
' ') and not lines[j].strip().startswith(
'File "'):
# 可能是异常类型和消息
if ':' in lines[j]:
parts = lines[j].split(':', 1)
error_info['exception_type'] = parts[0].strip()
error_info['exception_message'] = parts[
1].strip()
break
j += 1

errors.append(error_info)
in_traceback = False

return errors

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Parsing pytest's stdout/stderr text is brittle and can easily break if pytest changes its output format. A more robust approach is to have pytest generate a machine-readable report, such as JUnit XML, using the --junit-xml=path/to/report.xml flag. You can then parse this XML file to reliably extract test results and failures.

Comment on lines +68 to +83

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The logic to get or create an asyncio event loop is a bit complex. Since Python 3.7, asyncio.run() is the recommended, simpler way to run a top-level async function from sync code, as it handles loop creation and teardown automatically. Consider refactoring this to use asyncio.run() for simplicity and better practice.

        # 5. 执行 Run (异步)
        # 使用 asyncio.run() 来简化异步代码的执行
        run_params = {
            'user_prompt': query,
            'breadth': 4,  # 广度
            'depth': 2,  # 深度
            'is_report': True,
            'show_progress': True
        }

        asyncio.run(workflow.run(**run_params))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants