This will create a file called auth.json in your current directory containing the required value. To save the file at a different path or filename, use the --auth=myauth.json option.
As an alternative to using an auth.json file you can add your access token to an environment variable called GITHUB_TOKEN.
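The lookup order described here (token file if present, otherwise the environment variable) can be sketched in Python as follows. This is a minimal illustration, not the tool's actual code: the function name `load_token` and the JSON key `github_personal_token` are assumptions made for the example.

```python
import json
import os
from pathlib import Path


def load_token(auth_path="auth.json", env=None):
    """Return a token from auth_path if present, else from GITHUB_TOKEN, else None."""
    env = os.environ if env is None else env
    path = Path(auth_path)
    if path.exists():
        try:
            data = json.loads(path.read_text())
        except json.JSONDecodeError:
            data = {}
        # "github_personal_token" is an assumed key name for this sketch
        if isinstance(data, dict) and data.get("github_personal_token"):
            return data["github_personal_token"]
    # Fall back to the environment variable when no usable file token exists
    return env.get("GITHUB_TOKEN") or None
```

Passing `env` explicitly keeps the function easy to exercise without touching the real environment.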
Fetching issues for a repository
The issues command retrieves all of the issues belonging to a specified repository.
If an auth.json file is present it will use the token from that file. It works without authentication for public repositories but you should be aware that GitHub have strict IP-based rate limits for unauthenticated requests.
You can point to a different location of auth.json using -a:
$ github-to-sqlite issues github.db simonw/datasette -a /path/to/auth.json
You can use the --issue option one or more times to load specific issues, passing one issue number per option.
While pull requests are a type of issue, you will get more information on pull requests by pulling them separately, such as whether a pull request has been merged and when.
Following the API of issues, the pull-requests command retrieves all of the pull requests belonging to a specified repository.
Note that the merged_by column on the pull_requests table will only be populated for pull requests that are loaded using the --pull-request option - the GitHub API does not return this field for pull requests that are loaded in bulk.
The commits command retrieves details of all of the commits for one or more repositories. It currently fetches the sha, commit message and author and committer details. It does not retrieve the full commit body.
By default it will stop as soon as it sees a commit that has previously been retrieved. You can force it to retrieve all commits (including those that have been previously inserted) using --all.
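The stop-early behaviour can be pictured with a small Python sketch. It assumes commits arrive newest-first as dictionaries with a `sha` key, and that the set of previously stored shas is already known. The name `new_commits` and the `fetch_all` parameter are hypothetical stand-ins for the real logic behind --all.

```python
def new_commits(commits, seen_shas, fetch_all=False):
    """Yield commits in order, stopping at the first already-seen sha.

    With fetch_all=True every commit is yielded, including ones seen before.
    """
    for commit in commits:
        if commit["sha"] in seen_shas and not fetch_all:
            break  # everything after this point was retrieved on an earlier run
        yield commit


history = [{"sha": "c3"}, {"sha": "c2"}, {"sha": "c1"}]  # newest first
print([c["sha"] for c in new_commits(history, {"c2", "c1"})])
print([c["sha"] for c in new_commits(history, {"c2", "c1"}, fetch_all=True)])
```

Stopping at the first known commit keeps repeat runs cheap, at the cost of missing older commits if an earlier run was interrupted, which is the case --all covers.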
The command accepts one or more repositories. It populates a contributors table, with foreign keys to repos and users and a contributions table listing the number of commits to that repository for each contributor.
Fetching repos belonging to a user or organization
The repos command fetches repos belonging to a user or organization.
Without any other arguments, this command will fetch all repos that the currently authenticated user owns, collaborates on or can access via one of their organizations:
$ github-to-sqlite repos github.db
To fetch repos belonging to a specific user or organization, provide their username as an argument.
Add the --readme option to save the README for the repo in a column called readme. Add --readme-html to save the HTML-rendered version of the README into a column called readme_html.
You can specify one or more repositories using owner/repo syntax.
Users fetched using this command will be inserted into the users table. Many-to-many records showing which repository they starred will be added to the stars table.
Fetching GitHub Actions workflows
The workflows command fetches the YAML workflow configurations from each repository's .github/workflows directory and parses them to populate workflows, jobs and steps tables.
This data is not yet available through the GitHub API. The scrape-dependents command scrapes those pages and uses the GitHub API to load full versions of the dependent repositories.
The github-to-sqlite get command provides a convenient shortcut for making authenticated calls to the API. Once you have created your auth.json file (or set a GITHUB_TOKEN environment variable) you can use it like this:
$ github-to-sqlite get https://api.github.com/gists
This will make an authenticated call to the URL you provide and pretty-print the resulting JSON to the console.
You can omit the https://api.github.com/ prefix, for example:
$ github-to-sqlite get /gists
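One way such prefix handling could work is shown below. This is a sketch under the assumption that anything not starting with `http` is treated as a path on api.github.com. The helper name `full_url` is hypothetical.

```python
API_ROOT = "https://api.github.com/"


def full_url(url):
    """Expand a bare API path such as /gists into a full api.github.com URL."""
    if url.startswith("http"):
        return url  # already a complete URL, leave untouched
    return API_ROOT + url.lstrip("/")
```

With this helper, `full_url("/gists")` and `full_url("https://api.github.com/gists")` resolve to the same address.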
Many GitHub APIs are paginated using the HTTP Link header. You can follow this pagination and output a list of all of the resulting items using --paginate:
$ github-to-sqlite get /users/simonw/repos --paginate
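To illustrate what following Link-header pagination involves, here is a minimal Python sketch. It assumes the common `<url>; rel="name"` header layout and takes a caller-supplied `fetch` function, so it makes no network calls itself. The names `next_url`, `paginate` and `fetch` are illustrative, not part of the tool.

```python
import re

# Matches entries such as: <https://example/page2>; rel="next"
_LINK_RE = re.compile(r'<([^>]+)>\s*;\s*rel="([^"]+)"')


def next_url(link_header):
    """Return the rel="next" URL from a Link header value, or None."""
    if not link_header:
        return None
    for url, rel in _LINK_RE.findall(link_header):
        if rel == "next":
            return url
    return None


def paginate(fetch, url):
    """Yield every item across pages; fetch(url) returns (items, link_header)."""
    while url:
        items, link_header = fetch(url)
        yield from items
        url = next_url(link_header)
```

The loop ends when a response carries no `rel="next"` entry, which is how the final page is signalled.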
You can output newline-delimited JSON for each item using --nl. This can be useful for streaming items into another tool.
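For readers unfamiliar with the format, newline-delimited JSON puts one complete JSON value on each line, so a consumer can process items one at a time instead of parsing a single large array. A minimal Python sketch of writing and reading it follows. The helper names are illustrative only.

```python
import json


def to_ndjson(items):
    """Serialize each item as one JSON document per line."""
    return "".join(json.dumps(item) + "\n" for item in items)


def from_ndjson(text):
    """Parse newline-delimited JSON back into a list, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]
```

Because `json.dumps` escapes newlines inside strings, each item is guaranteed to occupy exactly one line.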