Conversation
Set up a Cucumber (TypeScript) test framework in the 'test/e2e' directory. This commit contains only the infrastructure needed to write and run the tests; the tests themselves will be added in future commits. Signed-off-by: Victoria Dye <vdye@github.com>
Add 'basic.feature' to test common workflows using the bundle server. Initially, add a common "Background" step starting the bundle web server at the specified port, and a common "After" step run when a test ends (successful or not) stopping that web server process. Signed-off-by: Victoria Dye <vdye@github.com>
derrickstolee
left a comment
There was a problem hiding this comment.
I was attempting to run these locally, but I'm having issues with node.js version at least 14 working with an npm install on my Ubuntu machine. Don't let my inability to install software block this effort.
There was a problem hiding this comment.
This is an interesting formal mechanism for describing tests. Each of these lines are repeatable steps that can be mixed and matched. For instance, I could imagine (not looking ahead) a test like:
Scenario: User fails to clone from a non-existing bundle URI
Given a remote repository 'https://github.com/vdye/asset-hash.git'
When I clone from the remote repo with a bundle URI
Then bundles are not downloaded and clone succeeds with a warning
The only new step to build is the Then statement. Neat!
| // Ensure warning wasn't thrown | ||
| clonedRepo.cloneResult.stderr.toString().split("\n").forEach(function (line) { | ||
| if (line.startsWith("warning: failed to download bundle from URI")) { | ||
| assert.fail(line) | ||
| } |
There was a problem hiding this comment.
I wonder if it would be worth asking git for-each-ref --format="%(refname)" "refs/bundles/*" if it has at least one ref?
| result = clonedRepo.runGit("for-each-ref", "--format=%(refname)") | ||
| utils.assertStatus(0, result, "git for-each-ref failed") | ||
|
|
||
| const bundleRefRegex = new RegExp('^refs/bundles/.*$') | ||
| const bundleRefs = result.stdout.toString().split("\n").filter(function (line) { | ||
| return bundleRefRegex.test(line) |
There was a problem hiding this comment.
Ah, you do it here! We could add the refs/bundles/* filter to the command, then we don't need to regex on the result, just look for at least one line.
| Scenario: A user can fetch with a bundle server that's behind and get all updates | ||
| Given a new remote repository with main branch 'main' | ||
| Given another user pushed 10 commits to 'main' | ||
| Given the bundle server has been initialized with the remote repo | ||
| Given I cloned from the remote repo with a bundle URI | ||
| Given another user pushed 2 commits to 'main' | ||
| When I fetch from the remote | ||
| Then I am up-to-date with 'main' | ||
| Then my repo's bundles are not up-to-date with 'main' | ||
|
|
||
| Scenario: A user will fetch incremental bundles to stay up-to-date | ||
| Given a new remote repository with main branch 'main' | ||
| Given another user pushed 10 commits to 'main' | ||
| Given the bundle server has been initialized with the remote repo | ||
| Given I cloned from the remote repo with a bundle URI | ||
| Given another user pushed 2 commits to 'main' | ||
| Given the bundle server was updated for the remote repo | ||
| When I fetch from the remote | ||
| Then I am up-to-date with 'main' | ||
| Then my repo's bundles are up-to-date with 'main' | ||
|
|
||
| Scenario: A user can fetch force-pushed refs from the bundle server | ||
| Given a new remote repository with main branch 'main' | ||
| Given another user pushed 10 commits to 'main' | ||
| Given the bundle server has been initialized with the remote repo | ||
| Given I cloned from the remote repo with a bundle URI | ||
| Given another user removed 2 commits and added 4 commits to 'main' | ||
| Given the bundle server was updated for the remote repo | ||
| When I fetch from the remote | ||
| Then I am up-to-date with 'main' | ||
| Then my repo's bundles are up-to-date with 'main' |
There was a problem hiding this comment.
😍 this is amazing to read and clearly understand the scenarios.
|
|
||
| const clonedRepo = this.getRepo(user) | ||
|
|
||
| // TODO: figure out a better way to check whether the repo is empty |
There was a problem hiding this comment.
git rev-list --all -n 1 has no output?
| e2e-test: build | ||
| @echo | ||
| @echo "======== Running end-to-end tests ========" | ||
| $(RM) -r $(TESTDIR) | ||
| @if [ -n "$(TEST_E2E_PERF)" ]; then \ |
There was a problem hiding this comment.
Looks like make e2e-test and make e2e-test TEST_E2E_PERF=1 will run the tests and performance tests, respectively.
Could you update the README with these test instructions?
There was a problem hiding this comment.
I also had to update my version of node.js to work with cucumber. It was really old, but maybe that should be listed as a dependency for developers?
| on: | ||
| pull_request: | ||
| workflow_dispatch: |
There was a problem hiding this comment.
As you mentioned, "only on PRs" is quite nice because that way it's not on arbitrary pushes and not re-testing after a PR merged (and was subject to required builds).
The manual trigger of the workflow (with optional perf tests!) is excellent.
| Examples: | ||
| | repo | | ||
| | https://github.com/git/git.git | # takes ~2 minutes | ||
| | https://github.com/torvalds/linux.git | # takes ~30(!) minutes |
There was a problem hiding this comment.
It took me a while to realize that this literally supplies the input to the tests. Awesome.
There was a problem hiding this comment.
Here are my test results for the data you have so far:
Clone execution time for https://github.com/git/git.git: 11.64416905105114s (bundle URI) vs. 22.594083791017532s (no bundle URI)
Clone execution time for https://github.com/torvalds/linux.git: 168.95288394904136s (bundle URI) vs. 302.48135003399847s (no bundle URI)
Here are some others I tested:
Clone execution time for https://github.com/git-for-windows/git.git: 14.297109727025033s (bundle URI) vs. 29.787424183011055s (no bundle URI)
Clone execution time for https://github.com/kubernetes/kubernetes.git: 37.84783263194561s (bundle URI) vs. 41.304983952999116s (no bundle URI)
Clone execution time for https://github.com/homebrew/homebrew-core: 31.01036406993866s (bundle URI) vs. 357.1608746279478s (no bundle URI)
If you want to include any of them, here is my example list (minus git-for-windows/git since it's not materially different from git/git):
Examples:
| repo |
| https://github.com/git/git.git | # takes ~2 minutes
| https://github.com/kubernetes/kubernetes.git | # takes ~3 minutes
| https://github.com/homebrew/homebrew-core | # takes ~15(!) minutes
| https://github.com/torvalds/linux.git | # takes ~30(!) minutes
There was a problem hiding this comment.
Thanks, I'll add those cases!
Seeing your results, my timing comments could be misleading - there will be a ton of variability in execution time depending on internet speed (primarily), system resources, etc. I'll remove them and add a disclaimer in README.md along the lines of "The performance tests clone very large repos and can take anywhere from ~30 minutes to multiple hours to run, depending on internet connectivity and other system resources".
On that note, I'll also add an @online tag and corresponding "offline" cucumber.js profile to allow people to run tests without internet.
ldennington
left a comment
There was a problem hiding this comment.
Great stuff! ✨ Thanks for laying this great foundation.
| @@ -0,0 +1,7 @@ | |||
| import { defineParameterType } from '@cucumber/cucumber' | |||
|
|
|||
| defineParameterType({ | |||
There was a problem hiding this comment.
You've found a lot of cool ways to integrate useful Cucumber features - it's been really helpful for ramping!
|
I played around with extending your tests, so I could get a feel of how it works. The result is de09244, which let me repeat the performance tests, but used a second machine with existing bundles instead of creating new ones from scratch. (Perhaps these should be the same, with the environment variable switching modes at runtime instead of failing without it.) |
Add a test and its corresponding step definitions verifying that a remote repository initialized in a bundle server can be cloned with '--bundle-uri' to download bundles. Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Victoria Dye <vdye@github.com>
Add scenarios to the end-to-end tests involving updates to the remote and/or bundle server. As part of this update, add a custom Cucumber expression for matching "are/are not" to a boolean. Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Victoria Dye <vdye@github.com>
Add a test that captures timing measurements (in 'console.log()' printouts) for cloning various large repositories. The tests are divided into two example sets: one for tests that run on modern hardware with a good internet connection for a *total* of less than 15 minutes (currently take ~7 minutes), and the other for tests that take longer than that. The latter set is tagged with '@slow' and the 'default' profile is updated to skip them. Add an 'all' profile that runs these slow performance tests alongside the "regular" tests. Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Victoria Dye <vdye@github.com>
Tag tests that require internet access with '@online' and add a Cucumber profile that skips those tests. This allows developers to easily run tests on the bundle server in an offline or bandwidth-limited environment. Signed-off-by: Victoria Dye <vdye@github.com>
Update the workflows that build the project to use the fixed version of Go tied to the project in 'go.mod'. The 'go.mod' can be used as the sole source of truth on the supported version of Go in the repository, making it easier to upgrade and use new features when needed. Signed-off-by: Victoria Dye <vdye@github.com>
Add a target to the Makefile that installs the end-to-end test NPM dependencies then runs the tests. By default, this will only run tests matching the "default" profile of the tests (excluding performance tests). The 'E2E_FLAGS' variable can be used to specify the profile(s) used by the tests (e.g., '--all' to include the tests excluded by default). Add documentation on how to run the end-to-end tests to 'README.md'. Signed-off-by: Victoria Dye <vdye@github.com>
Add a dedicated 'test.yml' workflow that executes the end-to-end tests. Unlike CI, these tests are only run automatically for pull requests (although they can be manually triggered via workflow dispatch). By default, the tests will run with the "default" profile; this can be changed via an input variable in the workflow dispatch. Because the end-to-end tests deal with external dependencies (namely, Git), add an 'install-dependencies.sh' script to install and configure the dependencies on a GitHub Actions runner. The goal of this workflow is for it to eventually be used to run multiple types of long-running tests, such as integration tests or installer validation tests. Signed-off-by: Victoria Dye <vdye@github.com>
derrickstolee
left a comment
There was a problem hiding this comment.
I like the new E2E_FLAGS interface and doc updates, in particular.

Closes #23
Summary
This pull request creates a Cucumber/TypeScript end-to-end testing framework for the Git Bundle Server (see #23 for background on the framework & reasoning for why TypeScript was selected). Two classes of tests are created to start with: "basic" tests (demonstrating expected typical workflows) and "performance" tests (comparing performance with & without use of the bundle server on large repositories).
The commits are broken down as follows:
e2e-testtarget to theMakefiletest.ymlworkflowPerformance results
The performance tests are quite slow, so they aren't run by default in the Actions workflow. For reference, some example results on my machine were:
Note that this only compares the clone time; it also took
torvalds/linux~10 minutes to clone & generate its 4.1Gb(!) base bundle. Since that scale is more reflective of the monorepos this bundle server would be used with than the small repos used in the "basic" tests, it's probably worth doing some more detailed profiling on the bundle server. Hopefully, we'd find some places where we can improve performance.