Avoid suggesting bad practice of building in the same GHA job as publishing #14

Open
webknjaz opened this issue Aug 23, 2023 · 15 comments

Comments

@webknjaz
Contributor

Having access to OIDC opens up a can of worms related to privilege escalation during the build. And the compromised targets might not be just PyPI but any other OIDC integrations people may have set up (and sometimes misconfigured). A single-job setup also doesn't allow for adequate workflows for synchronized publishing of platform-specific wheels.
Here's a note regarding the security considerations that we have in pypi-publish, for example: https://github.com/marketplace/actions/pypi-publish#trusted-publishing.

Based on the above, I suggest modifying the README example to make use of two jobs and pass GHA artifacts between them. This also makes it possible to smoke-test the build even if publishing gets skipped.

We're currently upgrading the PyPUG guide with similar considerations: pypa/packaging.python.org#1261.
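
For illustration, a rough sketch of that split (the tool choices and action versions here are only placeholders), with the build job uploading the dist folder as an artifact and the publish job consuming it:

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      # build sdist and wheel into dist/ with whatever tool the project uses
      - run: pipx run build
      # hand the built distributions over to the next job
      - uses: actions/upload-artifact@v3
        with:
          name: dist
          path: dist

  publish:
    runs-on: ubuntu-latest
    needs: build
    # only the publishing job gets the OIDC permission
    permissions:
      id-token: write
    steps:
      - uses: actions/download-artifact@v3
        with:
          name: dist
          path: dist
      # upload dist/* from here, e.g. via a trusted-publishing action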

@tschm
Owner

tschm commented Aug 23, 2023

Thank you. Will split the job into two parts. Good idea.

@tschm
Owner

tschm commented Aug 23, 2023

poetry is striking back
https://github.com/orgs/python-poetry/discussions/8357

@tschm
Owner

tschm commented Aug 24, 2023

I shall use twine in the 2nd job. No poetry needed

@tschm
Owner

tschm commented Aug 24, 2023

@webknjaz Do we like this version more?

name: Upload Python Package

on:
  release:
    types: [published]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
        
      # step used to set up Python, install poetry and create
      # the virtual environment described in poetry.lock
      - uses: cvxgrp/.github/actions/setup-environment@main

      # change the version in pyproject.toml and build the package
      - name: Change version in pyproject.toml
        shell: bash
        run: |
           poetry version ${{ github.ref_name }}
           poetry build
      
      # Archive the dist folder
      - name: Archive the dist folder
        uses: actions/upload-artifact@v3
        with:
          name: dist
          path: dist
          retention-days: 1

  deploy:
    runs-on: ubuntu-latest
    needs: build
    environment: release

    permissions:
      # This permission is required for trusted publishing.
      id-token: write

    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v3
      
      # Download the dist folder from the build job
      - uses: actions/download-artifact@v3
        with:
          name: dist
          path: dist
    
      # Generate the token
      - name: Mint token
        id: mint
        uses: tschm/token-mint-action@v1.0.2
      
      # Install twine and upload the dist folder using the token
      - name: Publish the package
        shell: bash
        run: |
           pip install twine
           twine upload --verbose -u __token__ -p '${{ steps.mint.outputs.api-token }}' dist/*

@webknjaz
Contributor Author

I shall use twine in the 2nd job. No poetry needed

Well, at this point, why not just use my action that wraps twine and acquires tokens internally?
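
For reference, a minimal sketch of the publish job built on top of that action and trusted publishing (no token handling in the workflow; the environment name is only an example):

  publish:
    runs-on: ubuntu-latest
    needs: build
    environment: pypi
    permissions:
      # mandatory for trusted publishing
      id-token: write
    steps:
      # fetch the dist folder produced by the build job
      - uses: actions/download-artifact@v3
        with:
          name: dist
          path: dist
      # the action mints the OIDC token and uploads dist/* internally
      - uses: pypa/gh-action-pypi-publish@release/v1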

@webknjaz
Contributor Author

webknjaz commented Aug 24, 2023

Do we like this version more?

The publish job doesn't need the checkout step or the contents permission. Plus see my comment above — this is reinventing the wheel.

@tschm
Owner

tschm commented Aug 24, 2023

Do we like this version more?

The publish job doesn't need the checkout step or the contents permission. Plus see my comment above — this is reinventing the wheel.

I am well aware of gh-action-pypi-publish. The initial goal was not to give people a tool that can upload to PyPI in arbitrary ways. We wanted to restrict their options and enforce trusted publishing. For that, it was enough to generate the token and do a poetry publish. The token-mint action was a byproduct of this process. Following the GitHub mess with deployment to protected tags, I need to revisit our deployment process anyway.

@webknjaz
Contributor Author

webknjaz commented Aug 25, 2023

I am well aware of gh-action-pypi-publish. The initial goal was not to give people a tool that can upload to PyPI in arbitrary ways. We wanted to restrict their options and enforce trusted publishing. For that, it was enough to generate the token and do a poetry publish. The token-mint action was a byproduct of this process.

pypi-publish primarily advertises the Trusted Publishing flow, only mentioning other options at the very bottom of the document. This is also intended to steer people towards using secretless publishing. The updated PyPUG guide will do so as well, once complete.
We also chose to annoy users with educational warnings if they attempt publishing to PyPI w/o OIDC: pypa/gh-action-pypi-publish#164 / pypa/gh-action-pypi-publish#167.

Following the GitHub mess with deployment to protected tags, I need to revisit our deployment process anyway.

By the way, the example workflow should also recommend using GitHub Environments since that allows for extra protection, like requiring a manual button click to allow the job to even start.

P.S. If you're building a scalable process to enforce OIDC, I'd recommend using jsonschema-based linting. I like https://check-jsonschema.rtfd.io/en/latest/precommit_usage.html#example-usages that lets one check GHA workflow files and action repos for basic problems. There's an example of enforcing that timeout-minutes is set on jobs: https://check-jsonschema.readthedocs.io/en/latest/precommit_usage.html#check-a-builtin-schema. You can also make your own schema to enforce certain arguments to be present/absent from the action invocation inputs. It integrates well with pre-commit.com / pre-commit.ci too.
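
As a starting point, a minimal pre-commit configuration along those lines could look like this (the rev is just an example pin to update as needed):

repos:
  - repo: https://github.com/python-jsonschema/check-jsonschema
    rev: 0.26.3
    hooks:
      # validate .github/workflows/*.yml against the workflow schema
      - id: check-github-workflows
      # validate action.yml metadata in action repos
      - id: check-github-actions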

@tschm
Owner

tschm commented Sep 3, 2023

By the way, the example workflow should also recommend using GitHub Environments since that allows for extra protection, like requiring a manual button click to allow the job to even start.

We are using a release environment in the example? Do you mean we should mention it more explicitly, or are we overlooking something here?

P.S. If you're building a scalable process to enforce OIDC, I'd recommend using jsonschema-based linting. I like https://check-jsonschema.rtfd.io/en/latest/precommit_usage.html#example-usages that lets one check GHA workflow files and action repos for basic problems. There's an example of enforcing that timeout-minutes is set on jobs: https://check-jsonschema.readthedocs.io/en/latest/precommit_usage.html#check-a-builtin-schema. You can also make your own schema to enforce certain arguments to be present/absent from the action invocation inputs. It integrates well with pre-commit.com / pre-commit.ci too.

Ok, we run the checks on the workflows. I have not specified any schema though. I assume it will check against the standard schema for workflows?

@webknjaz
Contributor Author

webknjaz commented Sep 4, 2023

We are using a release environment in the example? Do you mean we should mention it more explicitly, or are we overlooking something here?

Well, in the current readme it's not used, only in the example above. But envs are auto-created. The problem is that they are created without any protection on. So it'd be important to call out that it's recommended to create the env manually and set up required reviewers there, for example, so the workflow is paused before starting the job. Otherwise, it's not doing much. Setting this up will make it so only trusted humans would be able to resume the workflow execution, letting the publishing actually happen.
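
In the workflow itself, that boils down to pointing the publishing job at the manually created environment, roughly like this (the URL is only illustrative; required reviewers and other protection rules live in the repository settings, not in the YAML):

  deploy:
    runs-on: ubuntu-latest
    needs: build
    # this environment should be created manually and given required
    # reviewers so the job pauses until a trusted human approves it
    environment:
      name: release
      url: https://pypi.org/project/<project-name>/
    permissions:
      id-token: write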

@webknjaz
Contributor Author

webknjaz commented Sep 4, 2023

Ok, we run the checks on the workflows. I have not specified any schema though. I assume it will check against the standard schema for workflows?

Yes, it runs the checks against some standard schemas published on the schema store.

You can add another check pointing at an in-repo schema file that you would make yourself.
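
For example, a custom policy check could be wired up like this (the schema path and file pattern are made up for illustration):

  - repo: https://github.com/python-jsonschema/check-jsonschema
    rev: 0.26.3
    hooks:
      - id: check-jsonschema
        name: enforce in-house publishing policy
        files: ^\.github/workflows/.*\.ya?ml$
        types: [yaml]
        args: ["--schemafile", ".github/publish-policy.schema.json"]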

@tschm
Owner

tschm commented Sep 5, 2023

Ok, we run the checks on the workflows. I have not specified any schema though. I assume it will check against the standard schema for workflows?

Yes, it runs the checks against some standard schemas published on the schema store.

You can add another check pointing at an in-repo schema file that you would make yourself.

Ok, understood. Might come in handy in other projects where the yaml file is, say, the configuration for a professional engine trading international stock markets :-)

@tschm
Owner

tschm commented Sep 5, 2023

We are using a release environment in the example? Do you mean we should mention it more explicitly, or are we overlooking something here?

Well, in the current readme it's not used, only in the example above. But envs are auto-created. The problem is that they are created without any protection on. So it'd be important to call out that it's recommended to create the env manually and set up required reviewers there, for example, so the workflow is paused before starting the job. Otherwise, it's not doing much. Setting this up will make it so only trusted humans would be able to resume the workflow execution, letting the publishing actually happen.

I see. I guess we have done this almost correctly then. I tend to use the release environment, and I have protection rules in place for this environment. So, I guess we ramp up the protection for this environment.

@webknjaz
Contributor Author

webknjaz commented Sep 5, 2023

I usually call the environment pypi because:

  1. There might also be testpypi, which requires different rules and secrets.
  2. It's a clearly separate publishing target.
  3. There might be other targets like GH releases, which will likely do semantically different things.
  4. The name release (that I also used to use in the past) is too broad and would match any of these separate jobs, so it doesn't help to accurately identify its purpose.
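
A rough sketch of what that separation could look like (the TestPyPI job is optional and shown only for illustration):

  publish-pypi:
    runs-on: ubuntu-latest
    needs: build
    environment: pypi
    permissions:
      id-token: write
    steps:
      - uses: actions/download-artifact@v3
        with:
          name: dist
          path: dist
      - uses: pypa/gh-action-pypi-publish@release/v1

  publish-testpypi:
    runs-on: ubuntu-latest
    needs: build
    environment: testpypi
    permissions:
      id-token: write
    steps:
      - uses: actions/download-artifact@v3
        with:
          name: dist
          path: dist
      - uses: pypa/gh-action-pypi-publish@release/v1
        with:
          # point the upload at TestPyPI instead of PyPI
          repository-url: https://test.pypi.org/legacy/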

@tschm
Owner

tschm commented Sep 5, 2023

I usually call the environment pypi because:

  1. There might also be testpypi, which requires different rules and secrets.
  2. It's a clearly separate publishing target.
  3. There might be other targets like GH releases, which will likely do semantically different things.
  4. The name release (that I also used to use in the past) is too broad and would match any of these separate jobs, so it doesn't help to accurately identify its purpose.

I like that idea.
