Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a bunch of missing urllib.parse.quote calls #1976

Merged
merged 57 commits into from Dec 12, 2023

Conversation

ExplodingCabbage
Copy link
Contributor

There were previously lots of places in PyGithub where public methods take a string value and interpolate it directly into a URL without URL-encoding it. This results in a bunch of bugs; for instance, trying to get a branch with a # in its name fails.

It also conceivably introduces some kind of security issue somewhere, since anywhere that an application is passing a user-provided value to one of these methods, a user can exploit this to cause a request to be sent to a malformed URL or to add arbitrary additional path segments to the URL after the interpolated value or to add an arbitrary query string (although I have not been able to think of a realistic exploit based on this, and am not confident that one really exists).

To try to fix this, I did a regex search across the whole codebase for

f"{self.url}/.+{

which finds expressions like

f"{self.url}/labels/{label}"

or

f"{self.url}/comments/{id}"

and audited each such expression. If the value being interpolated into the URL was already being explicitly URL-encoded in the method, I left it alone. If it was guaranteed via an assert isinstance(...) check to be a number, I left it alone. If it was the ._identity property of a PyGithub object whose _identity is guaranteed not to contain any special characters in need of URL-encoding, I left it alone. In all other cases, it was an arbitrary string, and in those cases, I URL-encoded it. I did this even in cases where the argument represented something like a GitHub username which cannot legally contain any special characters that need URL-encoding, for a couple of reasons:

  • there might be a security benefit to doing so in those cases, for the reason given above
  • if somebody tries to look up an illegal username like foo?bar, it still seems like the correct thing for the library to do is to faithfully encode that username and send it to the API, and get back an appropriate error from GitHub, rather than sending a malformed URL or a URL where ?bar is treated as a query string (which might produce a seemingly-nonsensical error that complains about user foo not existing, or worse, might actually return existing user foo).

I've manually tested my change to check it fixes getting a branch by name when the branch name contains a # character. Here's an attempt to get such a branch without this change:

>>> from github import Github
>>> g = Github()
>>> repo = g.get_repo('ExplodingCabbage/PyGithub')
>>> repo.get_branch('foo#bar')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mark/PyGithub/github/Repository.py", line 1643, in get_branch
    "GET", f"{self.url}/branches/{branch}"
  File "/home/mark/PyGithub/github/Requester.py", line 355, in requestJsonAndCheck
    verb, url, parameters, headers, input, self.__customConnection(url)
  File "/home/mark/PyGithub/github/Requester.py", line 378, in __check
    raise self.__createException(status, responseHeaders, output)
github.GithubException.GithubException: 404 {"message": "Branch not found", "documentation_url": "https://docs.github.com/rest/reference/repos#get-a-branch"}

And here's an attempt with this change:

>>> from github import Github
>>> g = Github()
>>> repo = g.get_repo('ExplodingCabbage/PyGithub')
>>> repo.get_branch('foo#bar')
Branch(name="foo#bar")

I haven't tested anything else, and indeed am not familiar with several of the API calls I've touched in this PR. I'd appreciate if a maintainer familiar with them could review my changes sceptically, and maybe test them if needed.

Note that this is kind of a breaking change, since some users may have worked around this bug by calling urllib.parse.quote on values in their own code before passing them to PyGithub, and such values will now get doubly encoded. As such, this should be called out as a breaking change in the change notes.

@stale
Copy link

stale bot commented Jan 8, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jan 8, 2022
@ExplodingCabbage
Copy link
Contributor Author

Go away, stale bot.

@stale stale bot removed the stale label Jan 9, 2022
@stale
Copy link

stale bot commented Apr 16, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Apr 16, 2022
@ExplodingCabbage
Copy link
Contributor Author

Any chance of this getting reviewed and merged, @s-t-e-v-e-n-k? It's a bugfix with possible security implications.

(I would be happy to leave for you to handle it whenever you get to it, but, well, by using the stale bot you're not leaving me that choice; I have to either nag you about it every few months or have it be automatically thrown away.)

@EnricoMi
Copy link
Collaborator

This is very interesting, thanks for the thorough explanation.

Can you please rerun the regexp with latest main HEAD and also consider the following regexp: f".*/[{]?

Those urls can also be constructed without self.url, which are then completed in Requester.__makeAbsoluteUrl.

@codecov-commenter
Copy link

codecov-commenter commented Nov 27, 2023

Codecov Report

Attention: 9 lines in your changes are missing coverage. Please review.

Comparison is base (a48c37f) 96.73% compared to head (5eb0e0a) 96.68%.

Files Patch % Lines
github/Repository.py 87.09% 4 Missing ⚠️
github/Team.py 25.00% 3 Missing ⚠️
github/GithubIntegration.py 71.42% 2 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1976      +/-   ##
==========================================
- Coverage   96.73%   96.68%   -0.05%     
==========================================
  Files         142      142              
  Lines       14499    14559      +60     
==========================================
+ Hits        14025    14077      +52     
- Misses        474      482       +8     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ExplodingCabbage
Copy link
Contributor Author

Okay, I've now:

  1. resolved 2.5 years of merge conflicts
  2. done a search with the regex you suggested and added a bunch more urllib.parse.quote calls
  3. fixed a test failure

Should be ready for another round of review!

Copy link
Collaborator

@EnricoMi EnricoMi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing work, thanks for the effort!

@EnricoMi EnricoMi merged commit 13194be into PyGithub:main Dec 12, 2023
15 checks passed
lettuce-bot bot added a commit to lettuce-financial/github-bot-signed-commit that referenced this pull request Jan 30, 2024
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
| [PyGithub](https://togithub.com/pygithub/pygithub) | `==2.1.1` ->
`==2.2.0` |
[![age](https://developer.mend.io/api/mc/badges/age/pypi/PyGithub/2.2.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![adoption](https://developer.mend.io/api/mc/badges/adoption/pypi/PyGithub/2.2.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![passing](https://developer.mend.io/api/mc/badges/compatibility/pypi/PyGithub/2.1.1/2.2.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![confidence](https://developer.mend.io/api/mc/badges/confidence/pypi/PyGithub/2.1.1/2.2.0?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|

---

### Release Notes

<details>
<summary>pygithub/pygithub (PyGithub)</summary>

###
[`v2.2.0`](https://togithub.com/PyGithub/PyGithub/releases/tag/v2.2.0)

[Compare
Source](https://togithub.com/pygithub/pygithub/compare/v2.1.1...v2.2.0)

#### Breaking Changes

The `github.Comparison.Comparison` instance returned by
`Repository.compare` provides a `commits` property that used to return a
`list[github.Commit.Commit]`, which has now been changed to
`PaginatedList[github.Commit.Commit]`. This breaks user code that
assumes a `list`:

```python
commits = repo.compare("v0.6", "v0.7").commits
no_of_commits = len(commits)  # will raise a TypeError
```

This will raise a `TypeError: object of type 'PaginatedList' has no
len()`, as the returned `PaginatedList`
does not support the `len()` method. Use the `totalCount` property
instead:

```python
commits = repo.compare("v0.6", "v0.7").commits
no_of_commits = commits.totalCount
```

#### New features

-   Add support to call GraphQL API

#### Improvements

- Add parent_team_id, maintainers and notification_setting for creating
and updating teams. by
[@&#8203;Cheshirez](https://togithub.com/Cheshirez) in
[PyGithub/PyGithub#2863
- Add support for issue reactions summary by
[@&#8203;smuzaffar](https://togithub.com/smuzaffar) in
[PyGithub/PyGithub#2866
- Support for DependabotAlert APIs by
[@&#8203;coopernetes](https://togithub.com/coopernetes) in
[PyGithub/PyGithub#2879
- Derive GraphQL URL from base_url by
[@&#8203;EnricoMi](https://togithub.com/EnricoMi) in
[PyGithub/PyGithub#2880
- Make `Repository.compare().commits` return paginated list by
[@&#8203;EnricoMi](https://togithub.com/EnricoMi) in
[PyGithub/PyGithub#2882
- Add missing branch protection fields by
[@&#8203;treee111](https://togithub.com/treee111) in
[PyGithub/PyGithub#2873
- Add `include_all_branches` to `create_repo_from_template` of
`AuthenticatedUser` and `Organization` by
[@&#8203;janssonoskar](https://togithub.com/janssonoskar) in
[PyGithub/PyGithub#2871
- Add and update organisation dependabot secrets by
[@&#8203;mohy01](https://togithub.com/mohy01) in
[PyGithub/PyGithub#2316
- Add missing params to `Organization.create_repo` by
[@&#8203;tekumara](https://togithub.com/tekumara) in
[PyGithub/PyGithub#2700
- Update allowed values for `Repository` collaborator permissions by
[@&#8203;flying-sheep](https://togithub.com/flying-sheep) in
[PyGithub/PyGithub#1996
- Support editing PullRequestReview by
[@&#8203;ColasGael](https://togithub.com/ColasGael) in
[PyGithub/PyGithub#2851
- Update attributes after calling `PullRequestReview.dismiss` by
[@&#8203;ColasGael](https://togithub.com/ColasGael) in
[PyGithub/PyGithub#2854
- Add `request_cve` on `RepositoryAdvisories` by
[@&#8203;JLLeitschuh](https://togithub.com/JLLeitschuh) in
[PyGithub/PyGithub#2855
- Filter collaborators of a repository by permissions by
[@&#8203;notmicaelfilipe](https://togithub.com/notmicaelfilipe) in
[PyGithub/PyGithub#2792
- Set pull request to auto merge via GraphQL API by
[@&#8203;heitorpolidoro](https://togithub.com/heitorpolidoro) in
[PyGithub/PyGithub#2816
- Support Environment Variables and Secrets by
[@&#8203;AndrewJDawes](https://togithub.com/AndrewJDawes) in
[PyGithub/PyGithub#2848
- Update workflow.get_runs & pullrequest.add_to_assignees function
signature by [@&#8203;sd-kialo](https://togithub.com/sd-kialo) in
[PyGithub/PyGithub#2799
- Add `GithubObject.last_modified_datetime` to have `last_modified` as a
`datetime` by [@&#8203;chouetz](https://togithub.com/chouetz) in
[PyGithub/PyGithub#2772
- Add support for global advisories and unify some shared logic with
repository advisories by
[@&#8203;crimsonknave](https://togithub.com/crimsonknave) in
[PyGithub/PyGithub#2702
- Add `internal` as valid Repository visibility value by
[@&#8203;AndrewJDawes](https://togithub.com/AndrewJDawes) in
[PyGithub/PyGithub#2806
- Add support for issue comments reactions summary by
[@&#8203;smuzaffar](https://togithub.com/smuzaffar) in
[PyGithub/PyGithub#2813

#### Bug Fixes

- Add a bunch of missing urllib.parse.quote calls by
[@&#8203;ExplodingCabbage](https://togithub.com/ExplodingCabbage) in
[PyGithub/PyGithub#1976
- Fix Variable and Secret url bugs by
[@&#8203;AndrewJDawes](https://togithub.com/AndrewJDawes) in
[PyGithub/PyGithub#2835

#### Maintenance

- Update the class name for NetrcAuth in the examples by
[@&#8203;vinnybod](https://togithub.com/vinnybod) in
[PyGithub/PyGithub#2860
- Move build to PEP517 by [@&#8203;trim21](https://togithub.com/trim21)
in
[PyGithub/PyGithub#2800
- Use new type assert functions in `Repository` by
[@&#8203;trim21](https://togithub.com/trim21) in
[PyGithub/PyGithub#2798
- PyTest: Move config to pyproject.toml by
[@&#8203;Borda](https://togithub.com/Borda) in
[PyGithub/PyGithub#2859
- codespell: ignore-words-list by
[@&#8203;Borda](https://togithub.com/Borda) in
[PyGithub/PyGithub#2858
- Improve fix-headers.py script by
[@&#8203;EnricoMi](https://togithub.com/EnricoMi) in
[PyGithub/PyGithub#2728
- Remove dependency on python-dateutil by
[@&#8203;lazka](https://togithub.com/lazka) in
[PyGithub/PyGithub#2804
- CI: update precommit & apply by
[@&#8203;Borda](https://togithub.com/Borda) in
[PyGithub/PyGithub#2600
- Docs: Fix parameter order according to Version 2.1.0 by
[@&#8203;nad182](https://togithub.com/nad182) in
[PyGithub/PyGithub#2786
- Add missing GitHub classes to docs by
[@&#8203;EnricoMi](https://togithub.com/EnricoMi) in
[PyGithub/PyGithub#2783
- CI: Fix mypy error by ignoring override by
[@&#8203;EnricoMi](https://togithub.com/EnricoMi) in
[PyGithub/PyGithub#2779

**Full Changelog**:
PyGithub/PyGithub@v2.1.1...v2.2.0

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about these
updates again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/lettuce-financial/github-bot-signed-commit).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy4xNTMuMiIsInVwZGF0ZWRJblZlciI6IjM3LjE1My4yIiwidGFyZ2V0QnJhbmNoIjoibWFpbiJ9-->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants