Chargeback in GitHub Enterprise Cloud
One of the most common questions asked by companies adopting GitHub Enterprise is how to know who’s using what. This information is critically important to government agencies, non-profits, consultancies, and any other customer-billable business. Even without these needs, it’s valuable to identify usage and growth trends for cost predictability, opportunities to coach teams to optimize usage, and much more.
At a high level, we’re going to need to understand four things to pull this off:
- How GitHub charges enterprises
- How to structure your account for chargeback
- How to get usage data out of GitHub
- How to process that data to summarize charges
Most places call this practice “chargeback” or “cost center billing”. Each contract or business unit is responsible for its own incurred charges. It can be confusing because there isn’t a clear one-to-one mapping from user licensing, repository/organization structures, and metered billing items to a “cost center”. Every company has some degree of uniqueness here, so while there isn’t an easily reusable solution, all follow more or less the same process shown below.
flowchart TD;
subgraph User-based seat licensing
A(GitHub<br>Enterprise)
B(Advanced<br>Security)
end
A --> Z{chargeback logic}
B --> Z
subgraph Metered usage
C(Copilot)
D(Actions<br>Packages<br>Codespaces<br>git-lfs data)
end
C --> Z
D --> Z
subgraph Identity provider
H(user identity<br>to<br>cost center)
end
H --> Z
subgraph Other mapping
E(organization<br>to<br>cost center)
end
E --> Z
Z --> Y(invoice)
This information is up to date as of February 2024. The end goal is a minimal, adaptable process to pass-through costs.
GitHub Enterprise cost structure 101
There are two types of costs from GitHub that an enterprise may expect to pay.
- User licensing for GitHub Enterprise and Advanced Security
- Metered (pay-as-you-go) product features
GitHub Enterprise licensing is a simple per-user seat cost. The list price is $21/user/month at time of publication. This is a flat rate per unique user account in your enterprise (documentation). All members consume at least this amount.
GitHub Advanced Security licenses are an add-on to a subset of the users above. This license provides application security tooling and is billed per unique active contributor across all private/internal repositories where it’s enabled (more on that in the documentation). The list price is $49/user/month.
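As a rough illustration of the seat arithmetic, here is a minimal sketch using the list prices quoted above. The function name and the seat counts are hypothetical, and negotiated contract pricing will differ from list price.

```python
# List prices quoted in this post (USD per user per month, February 2024).
# Assumption: your contract pricing may differ; treat these as placeholders.
ENTERPRISE_SEAT_PRICE = 21
ADVANCED_SECURITY_SEAT_PRICE = 49


def monthly_seat_cost(enterprise_users: int, advanced_security_committers: int) -> int:
    """Return the combined monthly list-price cost of both seat-based licenses."""
    return (
        enterprise_users * ENTERPRISE_SEAT_PRICE
        + advanced_security_committers * ADVANCED_SECURITY_SEAT_PRICE
    )


# Hypothetical enterprise: 100 licensed users, 25 of them active committers
# on repositories with Advanced Security enabled.
print(monthly_seat_cost(100, 25))
```

Metered charges are deliberately left out here, since they come from a separate export and are rarely per user.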
Metered products are where life gets difficult, as they’re all a little different. Some are per-user, but most are pay-as-you-go and correlated to a repository instead. These charges are all reported on the same spreadsheet by GitHub, and the monthly sum is what is charged against your Azure subscription or credit card.
- Copilot Business is charged per user per month, so it follows many of the same conventions above. However, it is a month-by-month user amount, so unlike annual contract licensing, your amount can vary as much as you’d like each billing cycle. The billing docs outline this in depth.
- Actions usage is tied to repositories. GitHub Actions incurs no charge from GitHub if you self-host your own compute, but comes at a pay-as-you-go rate if using the managed SaaS compute. That rate depends on the size of the compute used in the job, as outlined in the billing docs.
- Packages usage is tied to repositories or organizations. It is charged for storage and egress, each at some cost per GB/month. Only the self-hosted product incurs no Packages cost from GitHub. The billing docs have the most up-to-date costs.
- Codespaces provides cloud developer environments to work on projects without needing to keep code locally. It’s billed for both storage and instance size, as outlined in the billing docs.
- Large file storage allows storage of large files in git repositories. It’s billed in data packs, which are discrete chunks of 50 GiB of storage and bandwidth. The billing docs will be updated as this will move to a metered model in Q2 2024. Your chargeback process should not change much as a result of this.
Do not reconfigure costs in ways that don’t align to how GitHub charges for things, e.g., pricing per repository or issue opened [1]. There’s nothing built into the platform to reliably account for these. These sorts of schemes tend to be delicate and hard to maintain, as there is no guarantee that GitHub won’t change something they rely on.
Enterprise setup
All costs roll up to this “enterprise” umbrella, but are incurred by the people or repositories within an organization. This means that how these organizations and/or repositories are mapped to cost centers determines the logic of creating an invoice.
This “mapping” task is simpler if a couple of patterns are followed.
- One organization tracks to one cost center (or contract, department, business unit, etc.)
- Each user belongs to an organization or identity provider group that carries information about their cost center.
- There’s no “mixed use” billing, e.g., one user/repo/etc. isn’t part of multiple cost centers.
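A quick way to check the “no mixed use” pattern is to look for anything assigned to more than one cost center. The sketch below assumes you keep your own list of (item, cost center) pairs; the function name and the `CC-…` codes are hypothetical.

```python
from collections import defaultdict


def find_mixed_use(assignments):
    """Return every item (organization, user, repo) mapped to more than one cost center.

    `assignments` is an iterable of (item, cost_center) pairs that you maintain
    yourself; GitHub does not provide this mapping.
    """
    seen = defaultdict(set)
    for item, cost_center in assignments:
        seen[item].add(cost_center)
    return {item: sorted(centers) for item, centers in seen.items() if len(centers) > 1}


# Hypothetical mapping with one conflict: "octodemo" is claimed twice.
assignments = [
    ("octodemo", "CC-100"),
    ("actions-clubhouse", "CC-200"),
    ("octodemo", "CC-300"),
]
print(find_mixed_use(assignments))
```

An empty result means every item tracks to exactly one cost center, which keeps the later roll-up a simple lookup.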
GitHub recently shipped custom properties for repositories that can be enforced at the organization level, which would make having multiple cost centers in an organization simpler to manage if you are comfortable retrieving that data through the REST API. Since this data isn’t included in the CSV exports from GitHub, I’m not exploring this path here.
Getting usage data out of GitHub
Now that we know what data we need, let’s get it out of GitHub. We’re going to need two CSV exports to correlate into an invoice. Unfortunately, these are not both available via an API, so you’ll need to set a calendar reminder to pull these manually.
- GitHub Enterprise license usage - directions (or API documentation)
- Metered features usage - directions
The license usage CSV file is named `consumed_licenses-ENTERPRISENAME-TIMESTAMP.csv`. There is a ton of information in this spreadsheet. This export attempts to de-duplicate users across their cloud account and any connected GitHub Enterprise Server accounts, as well as if they are consuming a Visual Studio license that bundles GitHub. It looks something like this (scroll side-to-side):
github_com_login | github_com_name | enterprise_server_user_ids | github_com_user | enterprise_server_user | visual_studio_subscription_user | license_type | github_com_profile | github_com_member_roles | github_com_enterprise_roles | github_com_verified_domain_emails | github_com_saml_name_id | github_com_orgs_with_pending_invites | github_com_two_factor_auth | github_com_two_factor_auth_required_by_date | github_com_advanced_security_license_user | enterprise_server_primary_emails | visual_studio_license_status | visual_studio_subscription_email | total_user_accounts |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
user1 | User name 1 | | TRUE | FALSE | FALSE | Enterprise | https://github.com/user1 | octodemo-enablement:Member | Member | | | | TRUE | | FALSE | | | | 1 |
user2 | User name 2 | | TRUE | FALSE | FALSE | Enterprise | https://github.com/user2 | actions-clubhouse:Owner | Owner, Member | | | | TRUE | | FALSE | | | | 1 |
user3 | User lastname | | TRUE | FALSE | FALSE | Enterprise | https://github.com/user3 | | Outside collaborator | | | | TRUE | | TRUE | | | | 1 |
user4 | Test user | | TRUE | FALSE | FALSE | Enterprise | https://github.com/user4 | octodemo-enablement:Member, octodemo:Owner | Member | | | | TRUE | | FALSE | | | | 1 |
user5 | Alice Bob | 71:octodemo.com:Member | TRUE | TRUE | FALSE | Enterprise | https://github.com/user5 | GitHub-field-collab:Owner, nathos-test:Owner, octodemo-enablement:Member, octodemo-resources:Member, octodemo:Owner, universal-exports-ltd:Member | Owner, Member | | | | TRUE | | FALSE | test_server_user@test.com | | | 2 |
user6 | Named user6 | | TRUE | FALSE | FALSE | Enterprise | https://github.com/user6 | octodemo-enablement:Member | Member | | | | TRUE | | FALSE | | | | 1 |
user7 | Seventh user | | TRUE | FALSE | FALSE | Enterprise | https://github.com/user7 | octodemo-enablement:Member, octodemo:Owner | Member | | | | TRUE | | TRUE | | | | 1 |
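The `github_com_member_roles` column packs each user’s organizations and roles into a single cell, which is awkward to filter on in a spreadsheet. Here is a minimal sketch for splitting it apart, assuming the cell always looks like the sample rows above (comma-separated `organization:Role` entries). The function name is hypothetical.

```python
def parse_member_roles(cell: str) -> dict:
    """Split a github_com_member_roles cell into {organization: role}.

    Assumption: entries are comma-separated "organization:Role" pairs, as in
    the sample export. Entries without a colon are skipped.
    """
    roles = {}
    for entry in cell.split(","):
        organization, separator, role = entry.strip().partition(":")
        if not separator:
            continue  # blank cell or an entry in an unexpected shape
        roles[organization] = role
    return roles


# Value copied from the user4 row of the sample export.
print(parse_member_roles("octodemo-enablement:Member, octodemo:Owner"))
# An outside collaborator (user3) has an empty cell.
print(parse_member_roles(""))
```

With the organizations separated out, a user can be matched against an organization-to-cost-center list when the identity provider has no cost center for them.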
The resource usage file is named `RANDOMSTRING_DATE_DURATION.csv` and is only requestable by email. It can take an hour or so to generate depending on the number of line items. Here’s an example:
Date | Product | SKU | Quantity | Unit Type | Price Per Unit ($) | Multiplier | Owner | Repository Slug | Username | Actions Workflow | Notes |
---|---|---|---|---|---|---|---|---|---|---|---|
8/16/23 | Actions | Compute - MACOS | 10 | minute | 0.08 | 10 | octodemo | ghas-committer | dependabot[bot] | .github/workflows/build.yml | |
11/9/23 | Actions | Compute - MACOS_LARGE | 2 | minute | 0.12 | 1 | octodemo | larger-runner-matrix | user4 | .github/workflows/test-matrix.yml | |
2/12/24 | Actions | Compute - UBUNTU | 16 | minute | 0.008 | 1 | octodemo | audit-log-polling-service | user3 | .github/workflows/audit-log-export-service.js.yml | |
10/30/23 | Actions | Compute - UBUNTU_64_CORE | 53 | minute | 0.256 | 1 | some-fantastic | fedora-acs-override | user5 | .github/workflows/codeql.yml | |
2/12/24 | Actions | Compute - WINDOWS | 10 | minute | 0.016 | 2 | octodemo | neovim-private | user2 | .github/workflows/release.yml | |
8/16/23 | Copilot | Copilot Business | 27.2258 | user-month | 19 | 1 | octodemo | | | | |
11/2/23 | Packages | Data Transfer | 0 | gb | 0.5 | 1 | octodemo | Organization Packages - Data Transfer Out | user1 | | |
2/12/24 | Shared Storage | Shared Storage | 0.3549 | gb-day | 0.008 | 1 | octodemo | bookstore-demo-spring-java | | | |
Note that “Shared Storage” is storage for any compute product, as it’s all billed at the same rate.
At some point in the future, much of this will be revamped as part of a new billing overview that should make this data more accessible and customizable. This page does not cover any of these enhancements.
Processing the data
We’re going to take the simplest route - loading it all into a spreadsheet for some simple correlations. Create a new spreadsheet and paste the contents of each of those two CSV files into separate tabs.
Create another tab for identity provider data. You’ll get this from Azure AD, Okta, or other identity provider to map `some-natalie@github.com` (my account’s email address) to the arbitrary “cost center” or “business unit” field of your company’s organization chart. This information, plus the information for each user in the columns below, will give you a sum of license types by that cost center.
- `github_com_user`
- `visual_studio_subscription_user`
- `github_com_advanced_security_license_user`
In order to map a GitHub username to a human in your identity provider, you may need to have some initial user input to know that `some-natalie` is `natalie_somersall`. If you’re using GitHub’s Enterprise Managed Users feature, the username is generally the same as the User Principal Name followed by a suffix. It makes this additional step unnecessary.
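If you would rather script this than build it in a spreadsheet, the same correlation can be sketched in a few lines. This assumes the license export’s column headers match the sample shown earlier and that you already have a login-to-cost-center lookup from your identity provider. The function name, the `CC-…` codes, and the `UNMAPPED` bucket are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Columns from the license export that flag which licenses a user consumes.
LICENSE_COLUMNS = (
    "github_com_user",
    "visual_studio_subscription_user",
    "github_com_advanced_security_license_user",
)


def licenses_by_cost_center(license_csv: str, login_to_cost_center: dict) -> dict:
    """Count TRUE flags per license column, grouped by each user's cost center."""
    totals = defaultdict(lambda: dict.fromkeys(LICENSE_COLUMNS, 0))
    for row in csv.DictReader(io.StringIO(license_csv)):
        # Users missing from the identity provider data land in one bucket
        # so they can be chased down by hand.
        cost_center = login_to_cost_center.get(row["github_com_login"], "UNMAPPED")
        for column in LICENSE_COLUMNS:
            if (row.get(column) or "").strip().upper() == "TRUE":
                totals[cost_center][column] += 1
    return dict(totals)


# Trimmed-down sample: only the columns this sketch reads.
SAMPLE = """github_com_login,github_com_user,visual_studio_subscription_user,github_com_advanced_security_license_user
user1,TRUE,FALSE,FALSE
user3,TRUE,FALSE,TRUE
user7,TRUE,FALSE,TRUE
"""

print(licenses_by_cost_center(SAMPLE, {"user1": "CC-100", "user3": "CC-200"}))
```

Keeping unmapped users visible, rather than dropping them, makes it obvious when the identity provider data needs a manual fix.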
For metered consumables, remove any charges that are outside of your window of interest using the “Date” field. Create a new column to multiply the `Price Per Unit ($)` by `Quantity` for each line item. Then summarize the charges by the `Owner` column (the organization) using a PivotTable or similar functionality. This will give you a per-organization roll-up of all metered charges. Finally, map each organization to the cost center that owns it. That should be a small list to maintain in comparison to the many line items in the metered bill.
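The same steps can be sketched in code for anyone who prefers a script to a PivotTable. This assumes the export’s column headers and `M/D/YY` date format match the sample table above. It follows the text and multiplies only `Quantity` by `Price Per Unit ($)`, leaving the `Multiplier` column untouched. Function names and the `CC-…` codes are hypothetical.

```python
import csv
import io
from collections import defaultdict
from datetime import date, datetime
from decimal import Decimal


def metered_charges_by_owner(usage_csv: str, start: date, end: date) -> dict:
    """Sum Quantity x Price Per Unit ($) per Owner for rows dated within [start, end]."""
    totals = defaultdict(Decimal)  # Decimal() starts each owner at 0
    for row in csv.DictReader(io.StringIO(usage_csv)):
        day = datetime.strptime(row["Date"], "%m/%d/%y").date()
        if not start <= day <= end:
            continue  # outside the billing window of interest
        totals[row["Owner"]] += Decimal(row["Quantity"]) * Decimal(row["Price Per Unit ($)"])
    return dict(totals)


def roll_up_to_cost_centers(owner_totals: dict, owner_to_cost_center: dict) -> dict:
    """Re-group per-organization totals by cost center; unknown owners go to UNMAPPED."""
    totals = defaultdict(Decimal)
    for owner, amount in owner_totals.items():
        totals[owner_to_cost_center.get(owner, "UNMAPPED")] += amount
    return dict(totals)


# Trimmed-down rows from the sample export: only the columns this sketch reads.
SAMPLE = """Date,Product,SKU,Quantity,Price Per Unit ($),Owner
10/30/23,Actions,Compute - UBUNTU_64_CORE,53,0.256,some-fantastic
2/12/24,Actions,Compute - UBUNTU,16,0.008,octodemo
2/12/24,Actions,Compute - WINDOWS,10,0.016,octodemo
"""

february = metered_charges_by_owner(SAMPLE, date(2024, 2, 1), date(2024, 2, 29))
print(february)
print(roll_up_to_cost_centers(february, {"octodemo": "CC-100"}))
```

`Decimal` is used instead of floats so that fractions of a cent add up exactly before any rounding happens on the final invoice.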
Side note on GitHub Advanced Security
If you want to know why someone is licensed for Advanced Security, there’s a third spreadsheet that lists all users across all repositories that have GHAS enabled and when they last pushed code to each one. You can use it to estimate your 90-day license reclamation window (download directions). The Advanced Security license file is named `ghas_active_committers_ENTERPRISENAME_2024-02-12T0752.csv`.
User login | Organization / repository | Last pushed date | Last pushed email |
---|---|---|---|
user1 | user1-test-org/centralized-workflows | 12/15/23 | user1@test.com |
user1 | user1-test-org/copilot-arcade | 12/5/23 | user1@test.com |
user1 | user1-test-org/copilot-demo-dotnet | 1/19/24 | user1@test.com |
user2 | Dougs-Donuts/cupcakes | 1/24/24 | user2@test.com |
user2 | Dougs-Donuts/literate-sniffle | 12/12/23 | user2@test.com |
user3 | company-ghas-poc-kickoff/webgoat-js | 1/16/24 | user3@test.com |
user4 | ghas-company-demo/dependency-review-demo | 1/28/24 | user4@test.com |
user4 | ghas-company-demo/infrastructure | 2/12/24 | user4@test.com |
In this example, there would be 4 billed users. The activity of `user1` on `user1-test-org/copilot-arcade` will soon roll off of that “active” user threshold of 90 days. However, since that user is still active on other repositories more recently, they will still consume a seat for a while longer.
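To estimate when each seat might free up, take the most recent push per user and add the 90-day window. The sketch below assumes the column headers and `M/D/YY` dates match the sample table. The exact moment GitHub releases a seat is not something this post specifies, so treat the computed dates as estimates. Function names are hypothetical.

```python
import csv
import io
from datetime import date, datetime, timedelta


def latest_push_by_user(committers_csv: str) -> dict:
    """Return {login: most recent push date} across all GHAS-enabled repositories."""
    latest = {}
    for row in csv.DictReader(io.StringIO(committers_csv)):
        pushed = datetime.strptime(row["Last pushed date"], "%m/%d/%y").date()
        login = row["User login"]
        if login not in latest or pushed > latest[login]:
            latest[login] = pushed
    return latest


def estimated_release_dates(latest: dict, window_days: int = 90) -> dict:
    """Estimate when each seat rolls off, assuming no further pushes by that user."""
    return {login: pushed + timedelta(days=window_days) for login, pushed in latest.items()}


# Trimmed-down rows from the sample export: only the columns this sketch reads.
SAMPLE = """User login,Organization / repository,Last pushed date
user1,user1-test-org/centralized-workflows,12/15/23
user1,user1-test-org/copilot-arcade,12/5/23
user1,user1-test-org/copilot-demo-dotnet,1/19/24
user2,Dougs-Donuts/cupcakes,1/24/24
"""

latest = latest_push_by_user(SAMPLE)
print(len(latest), "billed users in this sample")
print(estimated_release_dates(latest))
```

The number of keys in `latest` is the count of unique committers, which is the figure to compare against the Advanced Security seats on your bill.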
Exclusions
This method only captures what GitHub will directly charge you for. Here’s an incomplete list of things that it will not capture:
- Any usage of self-hosted GitHub Actions runners. Capture this data from your hypervisor or cloud provider instead.
- Any ingest of audit log data. It’s hard to account for how much of the SIEM bill is one person or organization’s events versus another.
- Network ingress/egress beyond GitHub’s boundary. There’s no visibility from GitHub into artifacts or code once they leave GitHub.
- Identity provider or 2FA provider traffic … apparently charging for this is a thing some companies do internally? 🤷🏻‍♀️
- Integrations with GitHub that are paid for separately. If a third-party application (such as a project management or code security tool) has its own billing, that’s between you and that third party.
Conclusions
Each organization has something unique about their pass-through billing structure. Some treat GitHub licenses as basic overhead and absorb the cost without passing it through, the same as email or endpoint protection is treated, while others recoup them from the department or customer. It’s remarkably difficult to provide specific guidance on this practice given the amount of customization at play. These are patterns that I see work well for most companies where it is not worth having significant long-term investments of developer time in cost accounting.
Clever architecture decisions, such as trying to keep hierarchy as simple as possible, ease this process. They also reduce the fragility that a company’s reorganizations would otherwise introduce. While somewhat manual, this should not take long to perform month over month. [2]
Next time
Now let’s try this on hard mode - self-hosting all this goodness using GitHub Enterprise Server and still having an awesome chargeback model!
Disclosure
I work at GitHub as a solutions engineer at the time of writing this. All opinions are my own.
Footnotes
1. Yes, a customer was quite sincerely trying to figure out how to make charging their internal users by issue a thing. I don’t think that was ever implemented, though.
2. There are improvements to this that GitHub is working on and once released, I will mark this page as deprecated. This is more of a stop-gap measure than something that will be maintained in perpetuity.