Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Zarf Deployment Push All Git Repository Branches #154

Closed
JustinFirsching opened this issue Nov 4, 2021 · 18 comments · Fixed by #155
Closed

Make Zarf Deployment Push All Git Repository Branches #154

JustinFirsching opened this issue Nov 4, 2021 · 18 comments · Fixed by #155

Comments

@JustinFirsching
Copy link
Contributor

Currently, when using Zarf to deploy a git repository only the master branch and tags are pushed to the Gitea mirror. It would be ideal if this mirror also contained the other branches that the original git repository contained.

My approach to this would be as follows:
When provided with a Git repository on package deployment, utilize the git refs in .git/refs/remotes to populate the refs in Gitea. This directory, unlike .git/refs/heads is updated on git fetch rather than a git checkout, which Zarf runs after cloning the repository, implying that the git refs are already there and just need to be referenced.

@RothAndrew
Copy link
Member

The intention is to actually only bring over the shallow clone of the exact ref that is declared in the config file. It's been a long-standing bug that it actually brings over more than is necessary

@JustinFirsching
Copy link
Contributor Author

I may be able to solve that issue along with this. Would it be fair to say that if there is no tag provided everything is brought over, but if a tag is provided only that tag is brought over?

@jeff-mccoy
Copy link
Member

Yeah this is a good catch. I can’t remember why but I was having issues pulling / pushing just the exact tag before and ended up where we are (with extra stuff). If I recall there was an issue with the underlying go library we are using and just pulling / pushing a single tag. I believe it’s something to do with the fact that tags are not directly related to a branch and on push it became a little funky. We track on tags as a convention because a future version of zarf will rely on this for calculations diffs of packages.

@JustinFirsching
Copy link
Contributor Author

The tags were pretty tricky. I've got it pulling and pushing a single tag right now, but it is also pushing a new branch (called HEAD). I'll have to get back around to that one to see what I changed to cause that later, probably just using the wrong ReferenceName or something. I don't expect that to be a large hang-up.

Regarding the opt-in for all git branches, what would you like to see there in terms of structure? Using something like @all could cause a conflict in a case where the desired tag was all (granted an unlikely case), which is what initially lead me to assuming a deep clone of all branches if there is no tag provided. It could be switched to a struct, but that would of course not be backwards compatible (although potentially look cleaner both in the codebase and the zarf.yaml).

@jeff-mccoy
Copy link
Member

Yes thank you that was the issue--the HEAD was making things funky for push to gitea, I believe the UI wouldn't show anything without first recreating a regular branch.

Regarding opt-in and format. I actually think leaving the tag off and making that an implicit "all" would be useful, especially for more generic uses of zarf for moving git repos around that aren't just for K8s gitops strategies. However we do have an issue with communicating the zarf.yaml syntax already, so that's just one more thing to add to the list.

Overall, if we can still keep tag-based as the default operation / examples and "all branches" as opt-in (by not including a @tag) then that's probably a good balance. Interested in @RothAndrew's thoughts on this approach and @btlghrants / @Madeline-UX regarding how we effectively communicate this option in docs without making it more confusing.

@RothAndrew
Copy link
Member

RothAndrew commented Nov 4, 2021

I'm curious to know what the use case is for wanting to pull the entire repository. The issue started off jumping straight into the implementation.

"As a user of Zarf, I want to be able to package all refs of a git repo, so that <SomeReasonHere>"

@jeff-mccoy
Copy link
Member

Some possibilities:

  • support multiple gitops clusters on different versions
  • control what version you are on in the airgap vs online
  • ability to rollback versions or support n-2 version history in the airgap
  • Just using zarf to bring arbitrary gitrepos into the airgap, like maybe some docs repos or tools for developers, etc

@JustinFirsching
Copy link
Contributor Author

You guessed it!

I'm looking to package all refs of the git repo as the existing repo currently supports multiple gitops clusters on different versions. It would be more convenient to have this in one package than creating a different package for each version and redeploying for a configuration change.

As for the restricting the repo to one tag piece not working without a branch. Would it be acceptable to have current master and just the desired tag? If so, I have that working and can clean things up a little bit and push that into the current PR as well.

@RothAndrew
Copy link
Member

There's no guarantee that the name of the trunk branch is master

@JustinFirsching
Copy link
Contributor Author

JustinFirsching commented Nov 5, 2021

Yeah that's a bit unfortunate. Running into that now testing it. Unfortunately we can't pull just HEAD either since go-git force pulls master if HEAD is pulled on a SingleBranch clone. Quite an interesting problem. I'll keep digging and report back. I'm sure there is a way to satisfy both the bloated deep clone and stripped single tag and branch clone.

@jeff-mccoy
Copy link
Member

So interestingly branch names don’t really matter….we just need to agree on a convention and communicate that convention. As long as the branch we pick is the default on gitea, we should be safe.

@JustinFirsching
Copy link
Contributor Author

It looks like the issue is more-so on the pull side than on the push side, but unfortunately both would be involved in my proposed solution.

Zarf is using go-git for it's underlying git implementation, which currently doesn't support single branch cloning of remotes where the default branch is not master (unless the branch is provided directly, issue: go-git/go-git#363). As such, all branches will have to be fetched (as they are currently).

However, I am noticing that the push logic here, is somewhat over complicated. The directory name is already the transformed version of the online repo's URL. So, rather than retrieving the remote and transforming it, that section could just use the directory basename to build the url with "https://"+config.Zarf.LocalIP+"/"+zarfGitUser+"/"+directoryBasename (excuse the pseudo-code). This change would allow for the removal of the online remote at the end of package creation when a tag is provided, removing all of the refs we didn't intentionally keep.

I haven't had a chance to actually test this in go-git, but from the git command line there seems to be no complaints pushing refs that don't exist (they just don't get pushed), and all of the refs that I intentionally retrieved (both fetched tags and checked out branches) stuck around in local refs after the remote was removed.

In summary, if a tag is provided, we fetch only that tag and remove the remote at the end. If no tag is provided, we fetch all tags and don't remove the remote at the end. All pushes are the same, pushing /refs/remotes/online-upstream/*, /refs/tags/*, and force pushing /refs/heads/* (to account for the branch name collisions that will happen with the remote upstream and heads).

If this sounds good, I can get #155 updated to include these changes shortly (assuming go-git supports everything I just said, which as far as I am aware, it does).

@jeff-mccoy
Copy link
Member

You know I iterated on the go git stuff so much I can’t even remember what worked and what didn’t. My main comment would be to make sure that you test with go-git and gitea to make sure the pair are happy. I chose go-git because it had no external dependencies, but that does have its downsides too.

I think you understand the intent: either pull as little as possible—we don’t technically do a shallow, but everything up to that tag or pull ** all the things **. And whatever you pull is exactly what you should push. If there is some weird stuff under the hood for go-git that’s probably fine but it should be reliable and be exactly what the users expect.

@JustinFirsching
Copy link
Contributor Author

It looks like deleting the remote didn't take the remotes with it, but I did find where the references were stored by go-git and was able to remove those (using functions provided by go-git, not some strange workaround). I updated my PR (#155) and included an update to an example file for testing.

I expect that the new implementation is more-so what you were aiming for with the repo.git@tag format, and also supports what I was hoping for with mirroring all branches and tags. It has grown to a rather large PR now relative to its old (+1/-4) state so please do feel free to ask questions there.

@jeff-mccoy
Copy link
Member

Sounds good, rebase seems to be the cleanest way to resync and make GitHub happy for PRs, I run git rebase master on the branch I’m working on to keep history clean.

@JustinFirsching
Copy link
Contributor Author

Agreed. Should I squash while I do this or would you prefer I wait on that?

@jeff-mccoy
Copy link
Member

You’ll have to force push anyway so squash wouldn’t be a bad idea. I’d squash before rebase to avoid multiple rebase breaks in case of merge conflicts

@JustinFirsching
Copy link
Contributor Author

Squash pushed! Rebasing to master reported that the branch was up to date, so I think we should be good there.

JustinFirsching added a commit to JustinFirsching/zarf that referenced this issue Nov 14, 2021
Update the pull function so that tag-provided mirrors do not fetch all
tags, instead later fetching (externally) only the tag they need.

Implement tag fetching function to retrieve only the desired tag when
creating a tag-provided repo mirror.

Checkout the tag into a detatched HEAD state at the end of the create
stage of tag-provided repository mirrors.

Implement ref removal and addition functions used during package
creation and mirror push to ensure that refs that aren't wanted on the
mirror won't be pushed.

Update the refspecs used in the push to push detatched HEAD, branches,
online remote, and tags to the offline mirror. Note that if we later
checkout a branch from the remote and do not clean up the remote ref it
will lead to a duplicate ref name and the push will fail on one of the
refs (likely the online one since it is later in the refspec slice).

Fixes defenseunicorns#154

feat: Allow for repos to be provided without a tag to mirror all branches/tags
feat: Make tag-provided repository mirrors use the tag as master
fix: Prevent tag-provided repo mirrors from storing extra refs
fix: tag-provided clones use trunk branch name
docs: Update gitops-data Example README
Zarf Roadmap (Old, Use Org-level project instead) automation moved this from Now to Recently Finished Nov 14, 2021
jeff-mccoy pushed a commit that referenced this issue Nov 14, 2021
Update the pull function so that tag-provided mirrors do not fetch all
tags, instead later fetching (externally) only the tag they need.

Implement tag fetching function to retrieve only the desired tag when
creating a tag-provided repo mirror.

Checkout the tag into a detatched HEAD state at the end of the create
stage of tag-provided repository mirrors.

Implement ref removal and addition functions used during package
creation and mirror push to ensure that refs that aren't wanted on the
mirror won't be pushed.

Update the refspecs used in the push to push detatched HEAD, branches,
online remote, and tags to the offline mirror. Note that if we later
checkout a branch from the remote and do not clean up the remote ref it
will lead to a duplicate ref name and the push will fail on one of the
refs (likely the online one since it is later in the refspec slice).

Fixes #154

feat: Allow for repos to be provided without a tag to mirror all branches/tags
feat: Make tag-provided repository mirrors use the tag as master
fix: Prevent tag-provided repo mirrors from storing extra refs
fix: tag-provided clones use trunk branch name
docs: Update gitops-data Example README
jeff-mccoy pushed a commit that referenced this issue Feb 8, 2022
Update the pull function so that tag-provided mirrors do not fetch all
tags, instead later fetching (externally) only the tag they need.

Implement tag fetching function to retrieve only the desired tag when
creating a tag-provided repo mirror.

Checkout the tag into a detatched HEAD state at the end of the create
stage of tag-provided repository mirrors.

Implement ref removal and addition functions used during package
creation and mirror push to ensure that refs that aren't wanted on the
mirror won't be pushed.

Update the refspecs used in the push to push detatched HEAD, branches,
online remote, and tags to the offline mirror. Note that if we later
checkout a branch from the remote and do not clean up the remote ref it
will lead to a duplicate ref name and the push will fail on one of the
refs (likely the online one since it is later in the refspec slice).

Fixes #154

feat: Allow for repos to be provided without a tag to mirror all branches/tags
feat: Make tag-provided repository mirrors use the tag as master
fix: Prevent tag-provided repo mirrors from storing extra refs
fix: tag-provided clones use trunk branch name
docs: Update gitops-data Example README
Signed-off-by: Jeff McCoy <code@jeffm.us>
Noxsios pushed a commit that referenced this issue Mar 8, 2023
Update the pull function so that tag-provided mirrors do not fetch all
tags, instead later fetching (externally) only the tag they need.

Implement tag fetching function to retrieve only the desired tag when
creating a tag-provided repo mirror.

Checkout the tag into a detatched HEAD state at the end of the create
stage of tag-provided repository mirrors.

Implement ref removal and addition functions used during package
creation and mirror push to ensure that refs that aren't wanted on the
mirror won't be pushed.

Update the refspecs used in the push to push detatched HEAD, branches,
online remote, and tags to the offline mirror. Note that if we later
checkout a branch from the remote and do not clean up the remote ref it
will lead to a duplicate ref name and the push will fail on one of the
refs (likely the online one since it is later in the refspec slice).

Fixes #154

feat: Allow for repos to be provided without a tag to mirror all branches/tags
feat: Make tag-provided repository mirrors use the tag as master
fix: Prevent tag-provided repo mirrors from storing extra refs
fix: tag-provided clones use trunk branch name
docs: Update gitops-data Example README
Signed-off-by: Jeff McCoy <code@jeffm.us>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

Successfully merging a pull request may close this issue.

3 participants