Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Task #419 - GitHub Actions workflows #93

Merged
merged 1 commit into from
Jan 15, 2022
Merged

Conversation

lovro-bikic
Copy link
Member

Task: #419

Aim

Replace Semaphore with GitHub Actions.

Solution

Add reusable workflows for building and deploying. Add workflows which will be copied to the repo and which use the reusable workflows. Separate bin/build into bin/lint and bin/test.

build.yml Show resolved Hide resolved
build.yml Outdated Show resolved Hide resolved
deploy-production.yml Show resolved Hide resolved
deploy-staging.yml Show resolved Hide resolved
template.rb Show resolved Hide resolved
workflows/build.yml Outdated Show resolved Hide resolved
workflows/deploy.yml Outdated Show resolved Hide resolved
workflows/deploy.yml Outdated Show resolved Hide resolved
workflows/deploy.yml Outdated Show resolved Hide resolved
workflows/deploy.yml Outdated Show resolved Hide resolved
@lovro-bikic lovro-bikic marked this pull request as ready for review January 3, 2022 15:52
Copy link
Contributor

@d4be4st d4be4st left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tested this out on any project?
I am wondering how the slacks /github integration has progressed.

Previously I had to use some slack/notifier actions to report to the slack which was not the best to be honest :D

deploy-production.yml Show resolved Hide resolved
deploy-production.yml Show resolved Hide resolved
template.rb Show resolved Hide resolved
workflows/build.yml Outdated Show resolved Hide resolved
workflows/build.yml Outdated Show resolved Hide resolved
- name: Install JS packages
if: ${{ inputs.use_node }}
run: yarn install
- name: Run linters
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in one of my previous ventures i setup the linters to run paralel to the specs
so each linter is run independently and once all have been run then the specs are run

it might save a couple of real world minutes because its parallel :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Screenshot 2022-01-10 at 11 03 48

Agreed, I think we care much more about the real world minutes then the billed minutes, our hour is not cheap 😄

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can separate the workflow into separate jobs, but I have some concerns.

First, pricing. GHA charges each job separately. In the above example, bundle-cache is charged 1 billable minute, specs is charged 4 billable minutes, lint is charged 2 billable minutes, and audit is charged 1 billable minute. The total is 8 billable minutes.

Each of those jobs invokes the checkout action, and setup-ruby action. This typically takes 10 seconds (if we assume that no new gems are downloaded, i.e. everything is pulled from cache).

Compare this to a single job which does the checkout, Ruby setup, audit, specs, and linters. With the above numbers, it would approximately take:
10s (checkout, setup) + 2m 50s (specs - checkout, setup) + 1m 20s (lint - checkout, setup) + 10s (audit - checkout, setup) = 4m 30s

This is 5 billable minutes. The time difference is 3 billable minutes, and the cost difference is $0,024 (Linux costs $0,008 per billable minute). In itself, that's not costly. However, these jobs will be run hundreds of times on tens of projects. Suddenly, 3 billable minutes become 3000 billable minutes. And costs aggregate with time, so at the end of the year, the cost difference becomes even greater.

I generally don't have a problem with GHA costing more if that's appropriate for our use-case, but I just want everyone to be on the same page about the cost aspect.

My second concern is, why would we want to run specs if linting and/or audit fails? If any one of those jobs fail, the workflow run is considered failed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we can compare the build times, but include the price of the developer in it. I think it is a safe assumption that during the build process, the developer is not doing anything other than waiting, on average.
I know some (hyper-productive) people want to say that they reply on slack, reply on emails, etc during the build process. But often those tasks take more than the build time so then the merge is waiting. Maybe even someone else merges while the busy-body developer is replying to his mails and then it requires him to rebase the branch before merging again. I for one am so small minded that if I do anything else (productive at least) in between, I completely forget about the branch build, and then merge half a day later when someone pings me regarding this.

So for sake of the exercise let's assume that the developer is waiting during build process that is required for the merge. let's say the developer's billed hour is 50 eur/h.
As you stated, the difference is 3 billable minutes. This saves $0,024 in container running time. The difference in developer billable minutes is thus 50 / 60 * 3 which is 2,5 €.
So in the example where we parallelise the build runs we save more than 100x the amount we pay more for the billable time of the runners.

For the second concern, I think it is beneficial to know that specs have completed successfully but only the audit failed. Same for any other combination. You'd also want to know if all of those fail, and not exit early if any of these 3 fail.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To save time, can we introduce tags to run just specific jobs?
For example, rubocop linter failed because you had more than 120 line, and you want to fix it but you really just want to run the linter again

also [skip ci] tag would be handy to skip everything.

maybe something for the next iteration?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps I'm going slightly of rails here, but my opinion was, and still is, we should not run Linting on each commit. It prolongs the build time with questionable usefulness and has the potential to cost us a lot.

What I would propose as an approach is:

  • Linting - run locally using pre-commit/overcommit/whatever - also run on PR (I believe @cilim once mentioned that he first opens up PR and then starts developing in it - in my opinion - that's the wrong approach. If ABSOLUTELY necessary, alternative could be to explore possibility of ignoring Linting only on "Draft PR" feature Github has. And utilize that from the developer's end
  • Security audit - should be part of the CI job on each PR to main branches - perhaps with option to merge regardless for the admins/TLs when situation requires it (:hankey: hits the fan and we need to fix something that's impairing production)
  • Other checks - depends on what do they check :-) But we need to be sensible and ask ourselves "where do we get most value from this, while keeping cost in mind"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe @cilim once mentioned that he first opens up PR and then starts developing in it

fyi: no 😄 I mostly develop everything before even pushing to remote

Copy link
Contributor

@uncoverd uncoverd Jan 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should be trading pennies for worse development outcomes.

The only positive I see with reducing linting and bundle audit is that we save a few bucks. As @JureCindro said above, our development time is much more valuable than an external service's minute. In reality the test execution time takes the vast majority of the build time. The split we're seeing here is due to the team under @JureCindro watchful eye and their diligence to produce a well balanced test suite that has 100% coverage and runs extremely fast for the number of cases. Most other projects with at least decent coverage are much slower. In those cases linting and auditing becomes an even smaller portion of the bill.

Very few projects have been able to keep up with bundle audit and rubocop errors back in the day when these weren't mandatory on CI. I remember the old dance too well

"hey X, did you just push a change to staging that broke rubocop? God I hate you, I have to fix it in my staging merge now!
"hey guys, did anyone notice that theres 5 bundle audit warnings?" "yeah I just ignored those, I needed to get 5 urgent fixes out"

@ivantomica I don't understand your argument for only running certain actions "on each PR", could you elaborate?
In general the CI pipeline runs when

  • you push a branch for the first time (usually minutes before opening a PR)
  • you push another commit to an existing branch (usually for PR fixes)
  • you rebase the existing branch to the latest master (usually before merging)
  • you merge a branch to master/uat/staging

I think we want the safety of the CI pipeline in all of these cases.

  • When you first push a branch you want to see if its ready to be merged (auditing, linting, specs).
  • Once you push a PR fix you again want to know if the change is safe (auditing, linting, specs).
  • When you're just about ready to merge the feature branch to master you again want to be sure that the rebased version is still valid since its likely there were some code changes. (auditing, linting, specs)
  • Finally when you merge a feature branch into master you are constructing a new variation of the code that contains the original master + your feature branch changes and you again want to be sure the build passes as that code will get shipped. (auditing, linting, specs)

I won't get too deep into the CI vs local debate again (we did this before, reference: https://github.com/infinum/rails-jsonapi-example-app/pull/42, #63). All of these checks being done on each change automatically mean that the team fixes security issues earlier (since they break the build), stays in sync as far as code style is concerned and is confident that the code that is going out is always up to standards. This is in my opinion the cheapest win you can get, education and oversight is much more expensive.

@d4be4st I belive skip-ci already works out of the box, my gwip commits skip CI
Screenshot 2022-01-12 at 19 17 15

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot one more thing - @lovro-bikic mentioned today that rubocop is awfully slow on CI. Locally it takes ~7s while on GHA it takes around a minute usually.

I was able to halve this using the --parallel flag and I'm betting theres more execution speed to be gained via some config options or profiling.
Screenshot 2022-01-12 at 20 14 19

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To summarize this whole thread: we've decided to run a single job, but run steps in parallel (using GNU parallel). This decision is a compromise: it's cheaper than the parallel jobs approach, but it still allows for auditing and linting on CI (for people who want that), without slowing down the single job.

Technically, this is achieved by splitting auditing, linting, and testing into three scripts, and calling them in parallel like this:

$ parallel --lb -k -j0 ::: bin/audit bin/lint bin/test

Flag explanations:
--lb: line-buffer, prints output line by line as it's produced (instead of waiting for the process to end before printing everything)
-k: keep, tells parallel not to mix output of different scripts (i.e. it will first output all lines from bin/audit, then all lines from bin/lint, and finally all lines from bin/test)
-j0: jobs, tells parallel to run as many jobs as possible in parallel. This will determined by the number of scripts parallel needs to run (by default it's 3, but some projects can use a different number of scripts)

@lovro-bikic
Copy link
Member Author

Have you tested this out on any project?

@d4be4st it's currently live on Revisor.

@nikajukic
Copy link
Contributor

@melcha @cilim @JureCindro @uncoverd @vr4b4c please take a look at this PR

README.md Outdated Show resolved Hide resolved
build.yml Outdated Show resolved Hide resolved
deploy-staging.yml Outdated Show resolved Hide resolved
deploy-production.yml Show resolved Hide resolved
deploy-production.yml Show resolved Hide resolved
build.yml Outdated Show resolved Hide resolved
deploy-production.yml Show resolved Hide resolved
workflows/build.yml Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
deploy-production.yml Show resolved Hide resolved
workflows/build.yml Outdated Show resolved Hide resolved
workflows/build.yml Outdated Show resolved Hide resolved
- name: Install JS packages
if: ${{ inputs.use_node }}
run: yarn install
- name: Run linters
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Screenshot 2022-01-10 at 11 03 48

Agreed, I think we care much more about the real world minutes then the billed minutes, our hour is not cheap 😄

workflows/deploy.yml Outdated Show resolved Hide resolved
workflows/deploy.yml Outdated Show resolved Hide resolved
@vr4b4c
Copy link
Member

vr4b4c commented Jan 11, 2022

👏

@cilim cilim self-requested a review January 12, 2022 07:36
Copy link
Member

@cilim cilim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome stuff! 👏

Copy link
Contributor

@melcha melcha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent, @lovro-bikic !

build.yml Outdated Show resolved Hide resolved
template.rb Outdated Show resolved Hide resolved
template.rb Show resolved Hide resolved
- name: Install JS packages
if: ${{ inputs.use_node }}
run: yarn install
- name: Run linters
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lovro-bikic could checkout and ruby setup be extracted into a separate job, and then parallel lint, test and audit jobs would depend on the extracted setup job?
Maybe we could save some minutes this way.
https://docs.github.com/en/actions/learn-github-actions/managing-complex-workflows#creating-dependent-jobs

template.rb Show resolved Hide resolved
Copy link
Contributor

@uncoverd uncoverd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙆‍♂️

@lovro-bikic
Copy link
Member Author

lovro-bikic commented Jan 13, 2022

There have been some changes since the last review, please take a look at the PR again.

Important changes:

  • added bin/audit which has some of the steps from bin/lint (ab144e3)
  • added bin/prepare_ci instead of the hardcoded step for pulling secrets. I noticed that some projects have a more elaborate CI setup, so this gives greater flexibility to those projects (5c2a875)
  • steps were changed to run in parallel with the parallel command. additionally, an input ci_steps was added so projects can define which files they want to run (by default, it's bin/audit bin/lint bin/test, but if a project has a different setup, they can override it). (cda03b9) (more info here)
  • Added RuboCop cache. On one project, the cache reduced RuboCop execution time from 60s to 1s (1486980)

template.rb Outdated Show resolved Hide resolved
Copy link
Contributor

@uncoverd uncoverd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥

Remove bundler setup from bin/deploy

Fix bin/build reference

Add reusable workflows for build and deploy

Add workflows for build and deploy to staging and production

Copy build/deploy workflows to .github/workflows folder

Add GA explanation to readme

Turn off node by default

Ask about manual deployers

Move postgres image prefix to reusable workflow

Update SSH key naming scheme

Add commented-out automatic deploy to production

Add interpolation marks

Change Postgres image to 13.2

Add --frozen-lockfile flag to yarn install

Remove cancel-in-progress for deploys

Add optional input for GHA runner

Revert "Update SSH key naming scheme"

This reverts commit f1df594.

Separate Mina commands

Add RAILS_ENV=test

Document workflow inputs

Add bin/audit, force color output

Add prepare_ci script

Run CI steps in parallel

Move workflows to .github/workflows folder

Remove postgres user

Use trust auth method

Add -j4 flag

Add rubocop cache step

Give names to all steps

Move rubocop cache step

Rename job to build

Use github format for rubocop

Use both simple and github formats

Fix workflow path

Make the ci_steps input required

Change location of rubocop cache

Change flag -j4 to -j0

Add example for deployers input

Create .node-version file

Add info about frontend to readme
@lovro-bikic lovro-bikic merged commit 27f2a6c into master Jan 15, 2022
@lovro-bikic lovro-bikic deleted the feature/github-actions branch January 15, 2022 11:31
This was referenced Jan 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants