Official
Premium
GitHub source integration documentation
The CloudQuery GitHub plugin extracts data from the GitHub API and loads it into any supported CloudQuery destination
Publisher
cloudquery
Latest version
v15.5.1
Type
Source
Overview #
GitHub Source Plugin
The CloudQuery GitHub plugin extracts data from the GitHub API and loads it into any supported CloudQuery destination (e.g. PostgreSQL, BigQuery, or Snowflake).
Authentication #
Quickstart #
The GitHub source plugin supports two authentication methods: Personal Access Token and App authentication. To get started quickly, we recommend using a personal access token. For production deployments, GitHub Apps are preferable because they allow higher rate limits. Review the GitHub rate limits documentation for details.
CloudQuery requires only read permissions (we will never make any changes to your GitHub account or organizations). Following the principle of least privilege, we recommend granting it read-only permissions for all the resources you wish to sync.
Authenticating with a Personal Access Token #
Follow this guide on how to create a personal access token for CloudQuery.
See Authenticating with a GitHub App for details on how to configure GitHub App authentication with CloudQuery.
Authenticating with a GitHub App #
For App authentication, you need to create a GitHub App and install it on your organization. Follow this guide and install the App into your organization(s). Give it all the permissions you need (read-only is recommended).
Every organization will have a unique installation ID. You can find it by going to the organization's settings page, and clicking on the "Installed GitHub Apps" tab. The installation ID is the number in the URL of the page.
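As a minimal sketch of how these pieces fit together, the nested plugin spec for App authentication across two organizations might look like the following. The field names are the ones listed in the configuration section of this page; the organization names `my-org-a` and `my-org-b` are hypothetical, and the angle-bracket values are placeholders to replace with your own.

```yaml
# Sketch only: nested plugin spec using GitHub App authentication.
# my-org-a and my-org-b are hypothetical organization names.
spec:
  app_auth:
    # One entry per organization; each organization has its own installation ID.
    - org: my-org-a
      private_key_path: <PATH_TO_PRIVATE_KEY>   # use private_key_path or private_key, not both
      app_id: <YOUR_APP_ID>
      installation_id: <ORG_A_INSTALLATION_ID>  # the number in the installation page URL
    - org: my-org-b
      private_key_path: <PATH_TO_PRIVATE_KEY>
      app_id: <YOUR_APP_ID>
      installation_id: <ORG_B_INSTALLATION_ID>
  orgs: ["my-org-a", "my-org-b"]
```

Listing the same organizations under `orgs` tells the plugin what to sync, while `app_auth` tells it how to authenticate for each one.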
Passing private_key as plaintext #
You can use `|` to pass the multi-line private key as plaintext. For example:
- org: cloudquery
  private_key: |
    -----BEGIN RSA PRIVATE KEY-----
    MIIEpQIBAAKCAQEA3eVv6PCn9P8zO+EP8K7pLMfxcA2uVrSZ2f+H3GgYIavDxWtO
    vM9tE3jAA8mOjZpdLaG5yy4QfV1LQ3R7kO49JCB6VbClwN2lNvd8Iw49JCBDid7D
    ...
    -----END RSA PRIVATE KEY-----
  app_id: your_app_id
Referencing private_key as environment variable #
When referencing the `private_key` as a string from an environment variable, you need to replace every newline in your PEM file with \n; otherwise the newlines and indentation will prevent CloudQuery from reading the variable correctly. For example:
- org: cloudquery
  private_key: "${GITHUB_PRI_KEY}"
  app_id: your_app_id
  ...
where
GITHUB_PRI_KEY="-----BEGIN RSA PRIVATE KEY-----\nMIIEpQIBAAKCAQEA3eVv6PCn9P8zO+EP8K7pLMfxcA2uVrSZ2f+H3GgYIavDxWtO\n...vM9tE3jAA8mOjZpdLaG5yy4QfV1LQ3R7kO49JCB6VbClwN2lNvd8Iw==\n-----END RSA PRIVATE KEY-----"
Configuration #
To configure CloudQuery to extract from GitHub, create a `.yml` file in your CloudQuery configuration directory.
The following configuration will extract all issues from the `cloudquery/cloudquery` repository:
kind: source
spec:
  # Source spec section
  name: github
  path: cloudquery/github
  registry: cloudquery
  version: "v15.5.1"
  tables: ["github_issues"]
  destinations: ["postgresql"]
  # Learn more about the configuration options at https://cql.ink/github_source
  spec:
    access_token: "${GITHUB_PERSONAL_ACCESS_TOKEN}" # Personal Access Token, required if not using App Authentication.
    # # App Authentication (one per org):
    # app_auth:
    # - org: cloudquery
    #   private_key: <PRIVATE_KEY> # Private key as a string
    #   private_key_path: <PATH_TO_PRIVATE_KEY> # Path to private key file
    #   app_id: <YOUR_APP_ID> # App ID, required for App Authentication.
    #   installation_id: <ORG_INSTALLATION_ID> # Installation ID for this org
    # # List of organizations to sync from. You must specify either orgs or repos in the configuration.
    # orgs: []
    # List of repositories to sync from. The format is `owner/repo` (e.g. `cloudquery/cloudquery`). You must specify either `orgs` or `repos` in the configuration.
    repos: ["cloudquery/cloudquery"]
    # # List of GitHub Enterprise Cloud slugs to sync enterprise-level tables from.
    # # The authenticated user must be an enterprise owner or billing manager.
    # enterprises: ["my-enterprise"]
    # # GitHub Enterprise
    # # In order to enable GHE you have to provide two URLs, the base URL of the server and the upload URL.
    # # Quote from GitHub's client:
    # # If the base URL does not have the suffix "/api/v3/", it will be added automatically. If the upload URL does not have the suffix "/api/uploads", it will be added automatically.
    # # Another important thing is that by default, the GitHub Enterprise URL format should be http(s)://[hostname]/api/v3/ or you will always receive the 406 status code. The upload URL format should be http(s)://[hostname]/api/uploads/
    # # If you are not configuring against an enterprise server please omit the enterprise configuration below
    # enterprise:
    #   base_url: "http(s)://[your-ghe-hostname]/api/v3/"
    #   upload_url: "http(s)://[your-ghe-hostname]/api/uploads/"
    # # Optional parameters
    # concurrency: 1500 # Optional. The best effort maximum number of Go routines to use. Lower this number to reduce memory usage or to avoid hitting GitHub API rate limits. Default 1500.
    # discovery_concurrency: 1 # Optional. Number of concurrent requests to GitHub API during discovery phase. Default 1.
    # include_archived_repos: false # Optional. Include archived repositories in the sync. Default false.
    # local_cache_path: "" # Optional. Path to a local directory that will hold the cache. If set, the plugin will cache the GitHub API responses in this directory. Defaults to an empty string (no cache).
    # table_options:
    #   github_workflow_runs:
    #     created_since: "" # e.g. "7 days ago", defaults to all workflow runs
    #   github_issues:
    #     state: "" # e.g. "open, all, closed", defaults to `all`
See tables for a full list of available tables.
You must specify either `orgs` or `repos` in the configuration. If a repository is specified in both `orgs` and `repos`, it will be extracted only once, and other repositories from that organization will be ignored.
You can define either `private_key` or `private_key_path` in the configuration, but not both.
It is recommended that you use environment variable expansion for the access token in production. For example, if the access token is stored in an environment variable called `GITHUB_ACCESS_TOKEN`:
spec:
  access_token: ${GITHUB_ACCESS_TOKEN}
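For a GitHub Enterprise Server instance, the commented-out `enterprise` block in the configuration above could be filled in roughly as follows. This is a sketch under stated assumptions: `ghe.example.com` is a placeholder hostname, `my-org` is a hypothetical organization, and the URL suffixes follow the format described in the configuration comments.

```yaml
# Sketch only: nested plugin spec pointed at a GitHub Enterprise Server instance.
# ghe.example.com and my-org are placeholders, not real values.
spec:
  access_token: "${GITHUB_ACCESS_TOKEN}"
  orgs: ["my-org"]
  enterprise:
    base_url: "https://ghe.example.com/api/v3/"         # must end in /api/v3/
    upload_url: "https://ghe.example.com/api/uploads/"  # must end in /api/uploads/
```

If you are syncing from github.com rather than an enterprise server, omit the `enterprise` block entirely.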
GitHub Spec #
This is the (nested) spec used by the GitHub source plugin:
- `repos` ([]string, optional, default: empty): List of repositories to sync from. The format is `owner/repo` (e.g. `cloudquery/cloudquery`). You must specify either `orgs` or `repos` in the configuration.
- `orgs` ([]string, optional, default: empty): List of organizations to sync from. You must specify either `orgs` or `repos` in the configuration.
- `enterprises` ([]string, optional, default: empty): List of GitHub Enterprise Cloud slugs to sync enterprise-level tables from (e.g. `github_enterprise_copilot_metrics`, `github_enterprise_billing_advanced_security`). The authenticated user must be an enterprise owner or billing manager.
- `concurrency` (integer, optional, default: 1500): A best effort maximum number of Go routines to use. Lower this number to reduce memory usage or to avoid hitting GitHub API rate limits.
- `discovery_concurrency` (integer, optional, default: 1): During initialization the GitHub source plugin discovers all repositories under the organizations configured in `orgs`, to be used later on during the sync process. By default the plugin discovers repositories one organization at a time. You can increase `discovery_concurrency` to discover multiple organizations in parallel, or use a negative value to discover all organizations in parallel. Please note that it's possible to hit GitHub API rate limits when using a high value for `discovery_concurrency`.
- `scheduler` (string, optional, default: dfs): The scheduler to use when determining the priority of resources to sync. Supported values are `dfs` (depth-first search), `round-robin`, `shuffle` and `shuffle-queue`. For more information about this, see performance tuning.
- `include_archived_repos` (bool, optional, default: false): By default archived repositories are not included in the sync. To include archived repositories set `include_archived_repos` to `true`.
- `local_cache_path` (string, optional, default: empty): Path to a local directory that will hold the cache. If set, the plugin will cache the GitHub API responses in this directory. Defaults to an empty string (no cache). By using a cache, the plugin can use conditional requests when appropriate, and help avoid hitting GitHub API rate limits.
- `table_options` (optional): Options to apply to specific tables. The available options for each table are documented on that table's reference page under the Table Options section.
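To show how the optional parameters above combine, here is a hedged sketch of a nested spec tuned for a rate-limit-sensitive sync. The option names and allowed values come from the list above; the organization names, numeric values, and cache path are illustrative assumptions rather than recommendations.

```yaml
# Sketch only: nested plugin spec with optional tuning parameters.
# Organization names, numbers, and the cache path are illustrative.
spec:
  access_token: "${GITHUB_ACCESS_TOKEN}"
  orgs: ["my-org-a", "my-org-b"]
  concurrency: 500              # lower than the default of 1500 to reduce memory use and rate-limit pressure
  discovery_concurrency: 2      # discover two organizations in parallel (default is 1)
  scheduler: round-robin        # one of dfs (default), round-robin, shuffle, shuffle-queue
  include_archived_repos: true  # archived repositories are skipped by default
  local_cache_path: "/var/cache/cloudquery-github"  # enables caching of GitHub API responses
  table_options:
    github_workflow_runs:
      created_since: "7 days ago"  # defaults to all workflow runs when unset
    github_issues:
      state: "open"                # defaults to all when unset
```

Lowering `concurrency` and enabling the local cache both aim at the same goal of staying under GitHub API rate limits, so it may be worth changing one at a time to see which has the larger effect in your environment.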
FIPS compliance #
A FIPS-compliant version of this plugin is available if your environment requires it. You may enable it by updating the version string in the configuration like this:
kind: source
spec:
  name: github
  path: cloudquery/github
  registry: cloudquery
  version: "v15.5.1-fips"
  ...
