ci: add deploy-stackrox composite action using roxie by davdhacs · Pull Request #20515 · stackrox/stackrox · GitHub

ci: add deploy-stackrox composite action using roxie #20515

Draft
davdhacs wants to merge 43 commits into master from davdhacs/deploy-stackrox-action

@davdhacs davdhacs commented May 12, 2026


Description

Reusable composite actions for deploying StackRox and connecting to infrastructure clusters.

New actions:

.github/actions/deploy-stackrox — deploys StackRox using roxie:

  • Extracts roxctl from quay.io/stackrox-io/roxctl:<tag> (no separate Go build)
  • Runs roxie deploy both with configurable scanner (v2/v4/both), resource sizing, and cluster name
  • Foreground/background modes, produces admin password + auth token + API endpoint
  • central-env: set env vars on Central (e.g. ROX_NODE_INDEX_ENABLED=true)
  • scanner-v4-env: set env vars on scanner-v4-matcher post-deploy
  • Copies KUBECONFIG to ~/.kube/config for tools that don't honor the env var (roxie)
  • Retries API token generation (12 attempts) for LoadBalancer readiness
  • Auto-enables OpenShift console plugin on OCP clusters
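
The token-generation retry in the list above could be sketched roughly as follows. This is a minimal illustration, not the action's actual code: the `retry` helper, the 10-second delay, the `generate_token` function, and the JSON request body are assumptions. Only the 12-attempt count, the `/v1/apitokens/generate` path, the `ui_tests` token name, and the `/tmp/rox-auth-token` file come from this PR.

```shell
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is exhausted.
retry() {
  local max="$1" delay="$2" attempt=1
  shift 2
  while [ "$attempt" -le "$max" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt ${attempt}/${max} failed: $*" >&2
    # Only sleep when another attempt will follow.
    if [ "$attempt" -lt "$max" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}

# Hypothetical token request. The request body is an assumption;
# check it against the Central API before relying on it.
generate_token() {
  curl -sk --fail -u "admin:${ROX_ADMIN_PASSWORD}" \
    -d '{"name":"ui_tests","roles":["Admin"]}' \
    "https://${API_ENDPOINT}/v1/apitokens/generate" \
    | jq -er '.token' > /tmp/rox-auth-token
}

# Example: retry 12 10 generate_token
```

Keeping the retry logic separate from the request makes the attempt count easy to tune for slow LoadBalancer provisioning without touching the request itself.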

.github/actions/connect-infra-cluster — connects to infractl-provisioned clusters:

  • Auto-detects cluster type from kubeconfig (GKE, OCP, ROSA HCP)
  • GKE: sets up GCP auth + installs gke-gcloud-auth-plugin
  • ROSA HCP: auto-refreshes expired OAuth tokens using console credentials
  • OCP: certificate-based auth, no refresh needed
  • Configurable wait timeout (25m default, 60m for OpenShift)

.github/actions/create-gke-cluster + .github/actions/create-kind-cluster — cluster provisioning

Also switches e2e-db-backup-restore-test.yaml from infractl + CI scripts to the new actions as proof of integration.

User-facing documentation

Testing and quality

  • the change is production ready: the change is GA, or otherwise the functionality is gated by a feature flag
  • CI results are inspected

Automated testing

  • added e2e tests
  • modified existing tests

How I validated my change

Actions validated across cluster types in the UI E2E PR (#20345):

  • deploy-stackrox: tested on KinD, GKE (native + infra), OCP, ROSA HCP
  • connect-infra-cluster: tested with GKE, OCP, and ROSA HCP (token refresh verified)
  • DB backup/restore test uses create-gke-cluster + deploy-stackrox as proof of integration

Adds a reusable composite action for deploying StackRox on any
Kubernetes cluster. Uses roxie (github.com/stackrox/roxie) for
the full deploy lifecycle — operator install, Central, Sensor,
port-forward, auth token generation.

Features:
- Cluster-agnostic: KinD, GKE, or any cluster kubectl can reach
- Auto-detects KinD → uses detached port-forward (exposure=none)
- Auto-detects GKE/cloud → uses LoadBalancer (exposure=lb)
- Runs in background (default) or foreground for shared clusters
- File-based outputs: /tmp/rox-{auth-token,admin-password,api-endpoint}
- Job outputs for shared cluster mode (api-endpoint, auth-token)
- Extracts roxctl from the main container image (no Go build needed)

Performance (measured):
- KinD: 34s deploy, all pods ready, Central API verified
- GKE: 2m19s deploy, LoadBalancer endpoint, Central API verified
- vs deploy.sh: 5x faster on KinD, 1.2x faster on GKE

Also switches e2e-db-backup-restore-test.yaml from the CI shell
script deploy (deploy_stackrox from tests/e2e/lib.sh) to the
new action as a proof of integration.

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

openshift-ci Bot commented May 12, 2026


@sourcery-ai sourcery-ai Bot left a comment


Hey - I've found 3 issues, and left some high level feedback:

  • The composite action assumes docker, gh, kubectl, curl, jq, and sudo are available on the runner; consider adding explicit precondition checks with clear error messages (or early exits) so failures are easier to diagnose in environments where one of these tools is missing or misconfigured.
  • In the Install roxie + roxctl step, the docker-based roxctl extraction and gh release download are both allowed to fail partially (e.g., missing image, no docker daemon, invalid roxie version) and then continue; adding explicit error handling/fallbacks and logging for these failure paths would make the deploy behavior and failure modes more predictable.
## Individual Comments

### Comment 1
<location path=".github/actions/deploy-stackrox/action.yaml" line_range="142-150" />
<code_context>
+            EXPOSURE="loadbalancer"
+          fi
+
+          roxie deploy both \
+            --resources=${{ inputs.resources }} \
+            --single-namespace \
+            --central-wait=10m \
+            --secured-cluster-wait=10m \
+            --override /tmp/roxie-override.yaml \
+            --tag ${{ inputs.tag }} \
+            --envrc /tmp/roxie-env.sh \
+            --exposure=${EXPOSURE}
+          ROXIE_RC=$?
+          ts "roxie deploy complete (exit ${ROXIE_RC})"
</code_context>
<issue_to_address>
**issue (bug_risk):** The `namespace` input is not passed to `roxie deploy`, which can desynchronize the assumed namespace from the actual deployment namespace.

Here the `namespace` input is only used by subsequent `kubectl` commands (e.g., `get pods`, port-forward) and not passed into `roxie deploy`. If `roxie`’s default namespace differs from `${{ inputs.namespace }}`, those `kubectl` commands will point at a different namespace than the one actually deployed. Please either propagate the namespace into `roxie deploy` (if supported) or ensure both `roxie` and `kubectl` derive the namespace from the same source to avoid mismatches when the default is overridden.
</issue_to_address>

### Comment 2
<location path=".github/actions/deploy-stackrox/action.yaml" line_range="115-120" />
<code_context>
+          T0=$(date +%s)
+          ts() { echo "+$(( $(date +%s) - T0 ))s $*"; }
+
+          ts "waiting for cluster..."
+          for _ in $(seq 1 120); do
+            kubectl cluster-info >/dev/null 2>&1 && break
+            sleep 2
+          done
+          ts "cluster reachable"
+
+          cat > /tmp/roxie-override.yaml << EOF
</code_context>
<issue_to_address>
**issue (bug_risk):** Cluster readiness loop logs "cluster reachable" even if `kubectl cluster-info` never succeeds.

If `kubectl cluster-info` never succeeds within the 120 attempts, the script still logs `cluster reachable` and continues, which can hide real connectivity problems. Please distinguish between a successful check and a timeout (e.g., via a flag or loop counter) and fail or log an explicit error when the cluster is not actually reachable.
</issue_to_address>
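
One way to separate success from timeout in the readiness loop is sketched below. It assumes a hypothetical `wait_for_cluster` helper. The `kubectl cluster-info` check and the 120 x 2s budget come from the reviewed snippet; the function name, its arguments, and the messages are illustrative only.

```shell
# wait_for_cluster ATTEMPTS DELAY_SECONDS [CHECK_COMMAND...]
# Returns 0 as soon as the check succeeds, 1 if every attempt fails.
wait_for_cluster() {
  local attempts="$1" delay="$2" i=1
  shift 2
  # Default to the same check used in the reviewed loop.
  [ "$#" -gt 0 ] || set -- kubectl cluster-info
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      echo "cluster reachable after ${i} attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  # Only reached when no attempt succeeded.
  echo "::error::cluster not reachable after ${attempts} attempts"
  return 1
}

# Example: wait_for_cluster 120 2 || exit 1
```

Because the success message is printed only inside the successful branch, the log can no longer claim the cluster is reachable after a timeout.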

### Comment 3
<location path=".github/actions/deploy-stackrox/action.yaml" line_range="201-203" />
<code_context>
+          ts "data ready (${dep_count} deployments)"
+        }
+
+        if [ "${{ inputs.background }}" = "true" ]; then
+          deploy_stackrox > /tmp/deploy.log 2>&1 &
+          echo "StackRox deploy started in background (PID $!)"
+        else
+          deploy_stackrox 2>&1 | tee /tmp/deploy.log
</code_context>
<issue_to_address>
**suggestion (bug_risk):** In background mode, failures in `deploy_stackrox` are not surfaced to the GitHub Action step status.

Because the process is backgrounded, the step always exits 0 even if `deploy_stackrox` later fails, so the workflow can report success while deployment actually failed and downstream jobs may hang or fail on missing artifacts. If you keep this behavior, please add a clear success/failure marker (e.g., alongside `/tmp/deploy.log`) that consumers can check, or provide a mode that waits for completion and returns the real exit code.

Suggested implementation:

```
        if [ "${{ inputs.background }}" = "true" ]; then
          (
            deploy_stackrox > /tmp/deploy.log 2>&1
            ec=$?
            echo "${ec}" > /tmp/deploy.exit_code
            if [ "${ec}" -eq 0 ]; then
              touch /tmp/deploy.success
            else
              touch /tmp/deploy.failure
            fi
          ) &
          echo "StackRox deploy started in background (PID $!)"
        else
          if deploy_stackrox 2>&1 | tee /tmp/deploy.log; then
            echo "0" > /tmp/deploy.exit_code
            touch /tmp/deploy.success
            # Expose outputs for downstream jobs (shared cluster mode).
            if [ -s /tmp/rox-auth-token ]; then
              echo "api-endpoint=$(cat /tmp/rox-api-endpoint)" >> "$GITHUB_OUTPUT"
              echo "auth-token=$(cat /tmp/rox-auth-token)" >> "$GITHUB_OUTPUT"
            fi
          else
            ec=$?
            echo "${ec}" > /tmp/deploy.exit_code
            touch /tmp/deploy.failure
            exit "${ec}"
          fi
        fi

```

1. Document the new status artifacts (`/tmp/deploy.exit_code`, `/tmp/deploy.success`, `/tmp/deploy.failure`) so downstream consumers know how to detect background deployment success/failure.
2. If there are existing consumers tailing `/tmp/deploy.log`, you may want to add guidance or helper scripts to check these markers for robust status handling.
</issue_to_address>
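
A downstream step could consume these markers along the lines of the sketch below. The `wait_for_deploy` helper, its arguments, and its return codes are hypothetical; only the marker file names (`deploy.success`, `deploy.failure`, `deploy.exit_code`) come from the suggestion above.

```shell
# wait_for_deploy MARKER_DIR TIMEOUT_SECONDS
# Returns 0 on success marker, 1 on failure marker, 2 on timeout.
wait_for_deploy() {
  local dir="$1" timeout="$2" waited=0
  while [ "$waited" -lt "$timeout" ]; do
    if [ -e "${dir}/deploy.success" ]; then
      return 0
    fi
    if [ -e "${dir}/deploy.failure" ]; then
      # Report the recorded exit code when it is available.
      echo "deploy failed (exit $(cat "${dir}/deploy.exit_code" 2>/dev/null || echo unknown))"
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  echo "timed out after ${timeout}s waiting for deploy markers in ${dir}"
  return 2
}

# Example: wait_for_deploy /tmp 900 || { cat /tmp/deploy.log; exit 1; }
```

Distinguishing a timeout from an explicit failure helps when the background deploy is still running but slower than expected.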


roxie already handles Central readiness, port-forwarding,
cluster auto-detection, and endpoint discovery. The action
was duplicating all of that.

Removed:
- Background/foreground mode toggle
- Manual cluster wait loop
- Exposure auto-detection (roxie does this)
- Manual port-forward (roxie's detached port-forward)
- Central API wait loop (roxie --central-wait)
- Deployment count polling
- Timestamp logging wrapper

What remains:
1. Install roxie + roxctl (from release + container image)
2. Generate admin password, run roxie deploy both
3. Generate API token (roxie doesn't do this yet)

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai Bot commented May 12, 2026


Walkthrough

The PR adds a composite GitHub action to install roxctl/roxie and run roxie deploy both (with background/foreground modes and admin token generation) and refactors the e2e DB backup/restore workflow to call that action in a separate "Deploy StackRox" step before running the DB test.

Changes

StackRox Deployment Action and Workflow Integration

| Layer / File(s) | Summary |
| --- | --- |
| **Action metadata and inputs**<br>`.github/actions/deploy-stackrox/action.yaml` | Declares composite action name/description and inputs: `tag` (required), `cluster-name` (default `remote`), `resources` (default `small`), `roxie-version` (default `v0.2.2-test2`), `github-token` (required), `background` (default `"false"`). |
| **Install roxctl**<br>`.github/actions/deploy-stackrox/action.yaml` | Checks for existing `roxctl`; if absent, extracts `roxctl` from `quay.io/stackrox-io/roxctl:latest` into `/usr/local/bin/roxctl`. |
| **Install roxie**<br>`.github/actions/deploy-stackrox/action.yaml` | Downloads the OS/arch-matched roxie release from `stackrox/roxie` using the provided GitHub token and installs it to `/usr/local/bin/roxie`. |
| **Deploy logic and token generation**<br>`.github/actions/deploy-stackrox/action.yaml` | Generates/masks `ROX_ADMIN_PASSWORD`, sets `MAIN_IMAGE_TAG`, runs `roxie deploy both` with resource and clusterName override in single-namespace mode, sources `/tmp/roxie-env.sh`, POSTs to `/v1/apitokens/generate` to create an admin token saved to `/tmp/rox-auth-token`, and supports background (logs+PID) or foreground execution. No GitHub outputs are declared in this file. |
| **Workflow: invoke deploy action and run test**<br>`.github/workflows/e2e-db-backup-restore-test.yaml` | Adds a dedicated "Deploy StackRox" step that calls `./.github/actions/deploy-stackrox` with `tag` and `github-token`, and changes the DB backup/restore test step to only source `tests/e2e/lib.sh`, call `export_test_environment`, and run `db_backup_and_restore_test` (removed previous inline deployment commands). |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly identifies the main change: adding a GitHub Actions composite action for deploying StackRox using roxie, which aligns with the primary purpose of the PR. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The pull request description covers all required sections: clear description of changes, documentation status checked, testing and quality requirements addressed, automated testing indicated, and validation approach explained. |


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (2)
.github/workflows/e2e-db-backup-restore-test.yaml (1)

128-136: 💤 Low value

Consider capturing action outputs explicitly.

The deploy-stackrox action exposes api-endpoint and auth-token outputs in foreground mode, but this workflow reads from /tmp files directly instead of capturing step outputs. While this works, using outputs would be more idiomatic.

♻️ Suggested enhancement
     - name: Deploy StackRox
+      id: deploy-stackrox
       uses: ./.github/actions/deploy-stackrox
       with:
         tag: ${{ env.tag }}
         background: "false"
         github-token: ${{ secrets.RHACS_BOT_GITHUB_TOKEN }}
         quay-user: ${{ secrets.QUAY_RHACS_ENG_RO_USERNAME }}
         quay-pass: ${{ secrets.QUAY_RHACS_ENG_RO_PASSWORD }}
 
     - name: Run DB backup/restore test
       env:
         QUAY_RHACS_ENG_RO_USERNAME: ${{ secrets.QUAY_RHACS_ENG_RO_USERNAME }}
         QUAY_RHACS_ENG_RO_PASSWORD: ${{ secrets.QUAY_RHACS_ENG_RO_PASSWORD }}
         REGISTRY_USERNAME: ${{ secrets.QUAY_RHACS_ENG_RO_USERNAME }}
         REGISTRY_PASSWORD: ${{ secrets.QUAY_RHACS_ENG_RO_PASSWORD }}
         ORCHESTRATOR_FLAVOR: k8s
         ROX_ADMIN_PASSWORD_FILE: /tmp/rox-admin-password
+        API_ENDPOINT: ${{ steps.deploy-stackrox.outputs.api-endpoint }}
       run: |
         set +x
         export ROX_ADMIN_PASSWORD="$(cat "$ROX_ADMIN_PASSWORD_FILE")"
-        export API_ENDPOINT="$(cat /tmp/rox-api-endpoint)"
         echo "API_ENDPOINT=${API_ENDPOINT}" >> "$GITHUB_ENV"

Note: This assumes the action outputs are available. If not, the file-based approach is appropriate.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/e2e-db-backup-restore-test.yaml around lines 128 - 136,
The workflow step that runs the deploy-stackrox action ("Deploy StackRox")
should capture the action outputs instead of reading /tmp files; add an id
(e.g., id: deploy-stackrox) to the step and then reference
steps.deploy-stackrox.outputs.api-endpoint and
steps.deploy-stackrox.outputs.auth-token wherever the workflow currently reads
/tmp/rox-api-endpoint or /tmp/rox-auth-token so downstream steps consume the
action outputs directly (ensure the action is run with background: "false" so it
exposes api-endpoint and auth-token).
.github/actions/deploy-stackrox/action.yaml (1)

187-187: 💤 Low value

Validate jq availability before use.

The script assumes jq is installed to parse JSON responses. While jq is typically available in standard GitHub runners, a missing dependency will cause silent failures due to set +e on line 111.

Consider adding a check or using alternative parsing.

♻️ Suggested validation

At the beginning of deploy_stackrox function:

         deploy_stackrox() {
           set +e
+          if ! command -v jq >/dev/null 2>&1; then
+            echo "::error::jq is required but not installed"
+            return 1
+          fi
           T0=$(date +%s)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/actions/deploy-stackrox/action.yaml at line 187, The deploy_stackrox
function uses jq (see the pipeline that writes '.token' to /tmp/rox-auth-token)
but doesn't validate jq is installed; add a pre-check at the start of
deploy_stackrox to verify jq is available (e.g., command -v jq or which jq) and
fail-fast with a clear error if missing, or provide a fallback JSON parsing
method (python -c or grep/sed) and use that to extract '.token' when jq is
absent; update any error messages to mention the required dependency so the
action fails clearly instead of silently.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/actions/deploy-stackrox/action.yaml:
- Line 111: The script currently disables errexit with "set +e", which can mask
failures (e.g., the cluster wait loop) and let "roxie deploy" run against an
unreachable cluster; change to "set -e" (or remove "set +e") and add an explicit
check of the cluster readiness command used in the workflow (the wait/retry loop
that precedes "roxie deploy")—if the wait command fails, log a clear error and
exit the job (non-zero) instead of continuing, ensuring "roxie deploy" and
subsequent steps only run when the cluster wait command succeeds.
- Line 80: The fallback build step that runs "go build ./roxctl" should first
validate assumptions: check that the "go" executable is available and that the
"./roxctl" source directory exists, and if either check fails, emit a clear,
actionable error and skip the silent failure; update the action.yaml where "go
build -o /usr/local/bin/roxctl ./roxctl" and the fallback "go build ./roxctl"
are invoked to perform these checks (e.g., using a command availability check
for "go" and a directory existence check for "./roxctl") and log an explicit
error message explaining which prerequisite is missing and how to resolve it.
- Around line 170-172: The background infinite retry loop running `kubectl -n
"$NS" port-forward svc/central 8000:443` can spawn unbounded processes; replace
it with a bounded retry mechanism (e.g., add a max attempts counter or an
overall timeout) and exit the loop when the limit is reached or after a timeout,
ensuring the background job is not left running indefinitely; update the shell
block that currently uses `while true; do ... done &` to use a loop with a retry
counter or a `timeout` wrapper around the port-forward command and log/exit when
retries are exhausted.


📥 Commits

Reviewing files that changed from the base of the PR and between 00a3293 and d39b20a.

📒 Files selected for processing (2)
  • .github/actions/deploy-stackrox/action.yaml
  • .github/workflows/e2e-db-backup-restore-test.yaml

davdhacs and others added 2 commits May 12, 2026 10:05
Background mode (background=true) returns immediately — poll
for /tmp/rox-auth-token to know when deploy is complete. Useful
for overlapping deploy with npm install or Gradle compile.

Also exports all roxie env vars (API_ENDPOINT, ROX_ADMIN_PASSWORD,
etc.) to GITHUB_ENV so downstream steps can use them directly
without sourcing files. Sensitive values are auto-masked.

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
roxie's envrc only contains deployment variables (endpoint,
password, cert paths). Just sed off the 'export ' prefix and
append to GITHUB_ENV — no parsing or masking logic needed.

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions Bot commented May 12, 2026


🚀 Build Images Ready

Images are ready for commit 515b0be. To use with deploy scripts:

export MAIN_IMAGE_TAG=4.12.x-143-g515b0be76a

davdhacs and others added 16 commits May 12, 2026 10:08
GITHUB_ENV expects raw KEY=value, not shell syntax. Strip
both export prefix and surrounding quotes in one sed.

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
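
The conversion described in this commit message (shell `export KEY="value"` lines to raw `KEY=value` lines for `GITHUB_ENV`) might look like the sketch below. The `envrc_to_github_env` function name and the exact expressions are assumptions, not the commit's actual code, and multi-line values are not handled.

```shell
# envrc_to_github_env: read `export KEY="value"` lines on stdin and
# write raw KEY=value lines on stdout, in a single sed invocation.
envrc_to_github_env() {
  sed -E \
    -e 's/^export[[:space:]]+//' \
    -e 's/^([A-Za-z_][A-Za-z0-9_]*)="(.*)"$/\1=\2/' \
    -e 's/^([A-Za-z_][A-Za-z0-9_]*)='\''(.*)'\''$/\1=\2/'
}

# Example: envrc_to_github_env < /tmp/roxie-env.sh >> "$GITHUB_ENV"
```

Anchoring the quote-stripping expressions to the whole line avoids removing quotes that are part of a value rather than wrapped around it.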
The deploy action exports roxie's env vars to GITHUB_ENV, so
downstream steps get ROX_ADMIN_PASSWORD, API_ENDPOINT, etc.
directly — no more cat /tmp/rox-admin-password.

Removed: /tmp/rox-admin-password file write, /tmp/rox-api-endpoint
file write, GITHUB_OUTPUT, outputs block, step id. The only file
kept is /tmp/rox-auth-token (completion signal for background mode).

Note: GITHUB_ENV writes only work in foreground mode. Background
subshells write after the step returns, so GHA won't pick them up.
Background callers should source /tmp/roxie-env.sh instead.

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
If roxie-version is set to empty string, use whatever roxie
is already on PATH without downloading. Default still pins
to a specific version for reproducibility.

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CI runners are ephemeral — no pre-installed roxie to worry
about. Just download the requested version every time.

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Ephemeral CI: no guards needed, always install both tools.
sudo install replaces chmod+mv. Chained docker commands.
Separate steps for roxctl/roxie/deploy for clear log grouping.

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
GHA renders ${{ inputs.* }} into the script source which appears
in the step header log. Moving secrets to env: block keeps them
out of the rendered script entirely.

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
roxie only uses roxctl for 'central crs generate' — an API call
to a running Central. Any recent roxctl version works, no need
to match the deploy image tag. Skip install if already on PATH.

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
quay.io/stackrox-io/roxctl is public — no auth needed.
roxie only uses roxctl for CRS generation (an API call to
a running Central), so any recent version works.

Removed: quay-user input, quay-pass input, docker login,
platform-specific image pull from private registry.

roxctl install is now 4 lines with zero credentials.

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
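
A credential-free roxctl install from the public image could be sketched as below. The `extract_roxctl` helper and the `INSTALL_PREFIX` variable are hypothetical. The image name, the `--platform linux/amd64` pull, the `/usr/bin/roxctl` path inside the image, and the use of `install` are taken from this PR's commits and review snippets.

```shell
# extract_roxctl IMAGE DEST
# Copies the roxctl binary out of IMAGE without running the container.
extract_roxctl() {
  local image="$1" dest="$2" cid tmp
  tmp="$(mktemp -d)"
  docker pull -q --platform linux/amd64 "$image" >/dev/null || return 1
  # Create (but never start) a container so its filesystem can be copied from.
  cid="$(docker create --platform linux/amd64 "$image")" || return 1
  if ! docker cp "${cid}:/usr/bin/roxctl" "${tmp}/roxctl"; then
    docker rm "$cid" >/dev/null
    rm -rf "$tmp"
    return 1
  fi
  docker rm "$cid" >/dev/null
  # INSTALL_PREFIX (hypothetical) lets callers prepend sudo for system paths.
  ${INSTALL_PREFIX:-} install -m 0755 "${tmp}/roxctl" "$dest" || { rm -rf "$tmp"; return 1; }
  rm -rf "$tmp"
}

# Example:
#   INSTALL_PREFIX=sudo extract_roxctl quay.io/stackrox-io/roxctl:latest /usr/local/bin/roxctl
```

Using `docker create` plus `docker cp` avoids depending on a shell or entrypoint inside the image.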
Touch a .done file at the end of deploy instead of relying
on /tmp/rox-auth-token (which is a test framework artifact,
not a deploy signal). Standard Linux pattern for background
process completion.

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
davdhacs and others added 2 commits May 12, 2026 15:46
Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
shell: bash
run: |
command -v roxctl >/dev/null && exit 0
docker pull -q --platform linux/amd64 quay.io/stackrox-io/roxctl:latest


we need to switch back to using the built roxctl for the tested commit instead

davdhacs and others added 8 commits May 13, 2026 12:39
Brings in improvements:
- scanner input (v2/v4/both) with override YAML via printf
- docker pull before docker create for roxctl
- /usr/bin/roxctl path fix for amd64 public image
- rox-admin-password file write for background mode
- cluster wait loop before roxie deploy
- combined install step (roxctl + roxie)

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- registry-username/password inputs: roxie needs Docker credentials for
  quay.io pre-flight check. KinD has docker login, GKE needs explicit
  env vars.
- Write API_ENDPOINT to /tmp/rox-api-endpoint for cross-job sharing
  (GKE shards download credentials as artifacts).
- Foreground deploy logs to /tmp/deploy-stackrox.log via tee for
  debugging.

AI-assisted.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
roxie has built-in --central-wait (10m) and --secured-cluster-wait
(10m) that handle cluster readiness internally.

AI-assisted.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai Bot commented May 15, 2026


CodeRabbit chat interactions are restricted to organization members for this repository. Ask an organization member to interact with CodeRabbit, or set chat.allow_non_org_members: true in your configuration.

yq is pre-installed on ubuntu-latest. Replaces the fragile printf and
its escaped newlines with readable yq expressions.

AI-assisted.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
davdhacs added a commit that referenced this pull request May 15, 2026
Brings in all improvements: yq for override YAML, registry credential
inputs, API endpoint file, foreground tee logging, roxie built-in
cluster readiness (no manual kubectl loop).

AI-assisted.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
davdhacs and others added 8 commits May 15, 2026 14:48
roxie retries kubeconfig access only 3 times internally.
The kubectl cluster-info loop is still needed when the KinD cluster is
created in the background.

AI-assisted.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
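
The wait loop mentioned in the commit above can be sketched as a bounded polling helper. This is an illustrative Python version under stated assumptions, not the shell loop used in the action: the function names, the 30 attempts, and the 10 second delay are hypothetical defaults. The only external command it relies on is `kubectl cluster-info`.

```python
import subprocess
import time
from typing import Callable


def cluster_reachable() -> bool:
    """Return True when `kubectl cluster-info` exits successfully."""
    try:
        result = subprocess.run(
            ["kubectl", "cluster-info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        # kubectl is missing or the API server did not answer in time.
        return False
    return result.returncode == 0


def wait_for(
    check: Callable[[], bool],
    attempts: int = 30,      # assumed default, not taken from the PR
    delay: float = 10.0,     # assumed default, not taken from the PR
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `check` until it succeeds or the attempts run out."""
    for attempt in range(1, attempts + 1):
        if check():
            return True
        if attempt < attempts:
            sleep(delay)
    return False
```

A caller would run `wait_for(cluster_reachable)` before starting the deploy and abort if it returns `False`. Passing `check` and `sleep` as parameters keeps the loop testable without a real cluster.
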
Supports: auto (default), none, loadbalancer, route. GKE shared
clusters need loadbalancer to get a real external IP. KinD uses
auto (port-forward).

AI-assisted.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The exception management Cypress tests clean up requests where
requester.name === 'ui_tests'. Prow's get-auth-token.sh creates
tokens with this name by default (UI_API_TOKEN_NAME). Changed
from 'ci-test' to 'ui_tests' so cleanup works correctly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
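
To illustrate why the token name matters, here is a small Python sketch of the filter the commit above describes. The record shape (a dict with a nested `requester.name`) is an assumption modeled on the `requester.name === 'ui_tests'` condition; the function name and sample data are hypothetical.

```python
from typing import Dict, Iterable, List

# Name the Cypress cleanup matches, per the commit message above.
CLEANUP_REQUESTER = "ui_tests"


def requests_to_clean(
    requests: Iterable[Dict], requester: str = CLEANUP_REQUESTER
) -> List[Dict]:
    """Select only the requests created by the given requester name."""
    return [
        r for r in requests
        if (r.get("requester") or {}).get("name") == requester
    ]


if __name__ == "__main__":
    # Hypothetical sample data: one request per token name.
    sample = [
        {"id": "a", "requester": {"name": "ui_tests"}},
        {"id": "b", "requester": {"name": "ci-test"}},
    ]
    print([r["id"] for r in requests_to_clean(sample)])
```

A request created with a token named `ci-test` never matches the filter, so it would be left behind after the test run.
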
Reusable composite actions for cluster provisioning:

- create-gke-cluster: GKE via gcloud with spot VM support, image
  pull secrets, outputs cluster-name and zone
- create-kind-cluster: KinD with background image pre-pull via ctr
- gke.sh: added --spot flag for preemptible VMs

Migrated e2e-db-backup-restore-test from infra/create-cluster
(infractl) to create-gke-cluster (gcloud direct). Removes the
INFRA_TOKEN and infractl dependencies.

AI-assisted.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Updates from ui-e2e branch testing:

deploy-stackrox action:
- Add central-env input for Central env vars (KEY=VALUE,KEY2=VALUE2)
- Add scanner-v4-env input for scanner-v4-matcher env vars
- Copy KUBECONFIG to ~/.kube/config for roxie compatibility
- Retry API token generation (12 attempts) for LoadBalancer readiness
- Enable OpenShift console plugin on OCP clusters
- Configure scanner-v4-matcher env vars post-deploy
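
The `KEY=VALUE,KEY2=VALUE2` format used by the `central-env` input can be parsed as sketched below. This is a hedged Python illustration of the format only: the function name `parse_env_list` is hypothetical, and how the action itself splits the string is not shown in this PR description. Note that this simple scheme cannot carry values that contain commas.

```python
from typing import Dict


def parse_env_list(raw: str) -> Dict[str, str]:
    """Parse 'KEY=VALUE,KEY2=VALUE2' into a dict.

    Splits entries on commas and each entry on its first '=' so that
    values may themselves contain '='. Empty entries are skipped.
    """
    env: Dict[str, str] = {}
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        key = key.strip()
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {item!r}")
        env[key] = value.strip()
    return env


if __name__ == "__main__":
    print(parse_env_list("ROX_NODE_INDEX_ENABLED=true"))
```

Rejecting malformed entries early gives a clear error in the workflow log rather than a silently ignored variable.
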

New connect-infra-cluster action:
- Connect to any infractl-provisioned cluster (GKE, OCP, ROSA HCP)
- Auto-detect GKE kubeconfigs and setup gcloud auth + plugin
- Auto-refresh expired ROSA HCP OAuth tokens using console credentials
- Configurable wait timeout (25m default, 60m for OpenShift)
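
One way to picture the kubeconfig auto-detection is the heuristic below. It is a sketch under loud assumptions, not the action's actual logic: it assumes GKE kubeconfigs reference `gke-gcloud-auth-plugin` as an exec command, that ROSA HCP kubeconfigs carry a bearer token, and that OCP kubeconfigs carry a client certificate, in line with the auth methods listed in the PR description. The function name and the returned labels are hypothetical, and the input is an already-parsed kubeconfig mapping.

```python
from typing import Any, Dict


def detect_cluster_type(kubeconfig: Dict[str, Any]) -> str:
    """Guess the cluster type from the first recognizable user entry.

    Assumed heuristics (not confirmed by the PR):
      exec command containing gke-gcloud-auth-plugin -> "gke"
      bearer token                                   -> "rosa-hcp"
      client certificate                             -> "ocp"
    """
    for entry in kubeconfig.get("users") or []:
        user = entry.get("user") or {}
        exec_cfg = user.get("exec") or {}
        if "gke-gcloud-auth-plugin" in str(exec_cfg.get("command", "")):
            return "gke"
        if user.get("token"):
            return "rosa-hcp"
        if user.get("client-certificate-data") or user.get("client-certificate"):
            return "ocp"
    return "unknown"


if __name__ == "__main__":
    # Hypothetical, heavily trimmed kubeconfig for illustration.
    example = {
        "users": [
            {"name": "u", "user": {"exec": {"command": "gke-gcloud-auth-plugin"}}}
        ]
    }
    print(detect_cluster_type(example))
```

Returning `"unknown"` instead of guessing lets the caller fall back to a default path or fail with a clear message.
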

Partially generated by AI.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
The test script expects central-loadbalancer service. Without
explicit exposure, roxie may not create it on GKE.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@davdhacs davdhacs requested a review from AlexVulaj June 15, 2026 20:58
@davdhacs

Contributor Author

@AlexVulaj what do you think of making these shared actions for other e2e tests? This PR adds roxie-based deployment, spot instances, and a KinD cluster option, which I hope will let us easily switch a test between GKE, KinD, or another cluster type, either hard-coded or dynamically. We don't need to apply all of this together, but it gives a complete picture of the changes I'm seeking.

- Remove unused tag/registry inputs
- Default spot to true
- Add post-step cleanup (gacts/run-and-post-run) with logged command
- Remove explicit Delete GKE cluster step (now handled by action)

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@openshift-ci

openshift-ci Bot commented Jun 21, 2026
