At every single company I've ever worked at, the company-issued laptop has been significantly faster than the machines provisioned for CI (e.g. an M1 Mac vs. GitHub Actions free-tier runners). Consequently I don't usually push code without running the tests locally; the feedback loop is so much faster.
I've always wondered if it would be possible to design some proof-of-work scheme where you could hash your filesystem + test artifacts to verify that the tests passed for a specific change.
FWIW, in years of development I've never had an issue where "it works on my machine" doesn't equate to the exact same result in CI.
I've seen pieces of this without the actual proof of work but with cryptographically hashed inputs and commands such that your test can just be a cache hit.
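A minimal sketch of that caching idea, assuming a Python project where the test command plus the tracked source files fully determine the result (the ".test-cache" directory and the "pytest -q" command are just illustrative choices):

    import hashlib
    import pathlib
    import subprocess

    # Sketch: derive a cache key from the test command plus the content of
    # every source file, so a repeat run with identical inputs can be
    # treated as a cache hit instead of re-running the tests.
    def cache_key(command: str, root: str = ".") -> str:
        h = hashlib.sha256(command.encode())
        for path in sorted(pathlib.Path(root).rglob("*.py")):
            h.update(str(path).encode())   # the file name is an input too
            h.update(path.read_bytes())    # and so are its contents
        return h.hexdigest()

    def run_tests_cached(command: str = "pytest -q") -> int:
        marker = pathlib.Path(".test-cache") / cache_key(command)
        if marker.exists():
            print(f"cache hit ({marker.name[:12]}), skipping tests")
            return 0
        result = subprocess.run(command.split())
        if result.returncode == 0:
            marker.parent.mkdir(exist_ok=True)
            marker.touch()                 # record the passing run
        return result.returncode

    if __name__ == "__main__":
        raise SystemExit(run_tests_cached())

Of course a local cache like this only convinces you; the proof-of-work question above is really about making the result verifiable by someone else.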
One bug that's bitten me a time or two is macOS's rather annoying case-preserving but not case-sensitive filesystem. This can mean a link works locally, but not when e.g. deployed to a Linux server.
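A cheap way to catch that class of bug before it reaches a Linux box is to check references against the exact on-disk casing; a minimal sketch in Python (the example path is made up):

    import os
    from pathlib import Path

    # Sketch: on a case-insensitive filesystem (the macOS default),
    # Path.exists() returns True even when the casing is wrong, so walk
    # the path and require each component to match the casing on disk.
    def exists_case_sensitive(path: str, root: str = ".") -> bool:
        current = Path(root)
        for part in Path(path).parts:
            if part not in os.listdir(current):  # exact, case-sensitive match
                return False
            current = current / part
        return True

    # Hypothetical example: a page links to "assets/Logo.png" but the file
    # on disk is "assets/logo.png".
    #   Path("assets/Logo.png").exists()         -> True on macOS, False on Linux
    #   exists_case_sensitive("assets/Logo.png") -> False on both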
It's a valid reason to have stuff run in CI (i.e. a consistent environment). But for my line of work I can't think of a single scenario where the architecture/platform has ever caused issues in tests; typically it's caught at build time or when resolving dependencies/packages.
What kind of software do you develop? In my area, running all the tests on my laptop would take days, and I frequently have issues where the tests pass locally but not in CI.
A few jobs ago, we didn't bother with a separate CI server, for exactly this reason. We just had the output of the local test run appended to our commit messages.
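Something like that could still be wired up today as a git hook; a rough sketch (the test command and how much output to keep are assumptions):

    #!/usr/bin/env python3
    # Sketch of a .git/hooks/prepare-commit-msg hook that runs the local
    # test suite and appends a short summary to the commit message.
    # The test command ("pytest -q") is an assumption.
    import subprocess
    import sys

    def main() -> None:
        msg_file = sys.argv[1]  # git passes the path to the message file
        result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
        last_line = result.stdout.strip().splitlines()[-1:]  # e.g. "12 passed in 3.2s"
        with open(msg_file, "a") as f:
            f.write("\nLocal test run (exit {}): {}\n".format(
                result.returncode, " ".join(last_line) or "no output"))

    if __name__ == "__main__":
        main()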
I've always liked the theory of this but not the implementation. I'm a big fan of squash merges, so my branches are a mess of commits that each may not even build, let alone pass tests. If I had to run tests on each commit it would slow things down significantly for little benefit.
Seems to assume that people are actively/idly waiting for CI to pass and then start the review? Or that they've done the review and are waiting for CI to pass before hitting merge?
Most projects I've worked on with multiple contributors have had a merge queue where you simply tell it to merge if/when CI passes, then you can move on with your day.
Mostly I won't even look at a review until CI passes. I don't want to look at your code if some static analysis in CI would find issues. Once in a while the code is obvious and I won't need that (the one-line change "the is not spelled teh", which I've seen multiple times in different areas of code; anyone know of an IDE/tool that can spell check UI strings without tripping up on variable names?), but we have a lot of static analysis that we run as part of CI, some of it standard, some of it custom for people who violate our code style in some way.
Note that my CI system runs multiple builds in parallel (as yours should too!). Any individual build might only take 10 minutes, but it is 2 hours if I try to run them all on my local machine. Odds are that if my code builds for one Linux variant on x86 it will build on all the other ones and on ARM, so I don't expect someone to run those before starting a review.
> Mostly I won't even look at a review until CI passes. I don't want to look at your code if some static analysis in CI would find issues.
> but we have a lot of static analysis that we run as part of CI, some of it standard some of it custom for people who violate our code style in some way.
Yeah, but why would issues that static analysis find stop you from doing the review? I never review things that automated tools can find, I care more about reviewing things only a human could review. "Does this make sense here?", "Is this decoupled enough/too much?", "Does this work well with the overall architecture?", "Does this test test the right thing?" and so on.
I wouldn't spend my review time on spellchecking people or pointing out issues like that; that's for the tooling to do. And even if CI fails because of some analysis and the PR author fixes it, every review comment should still be applicable, otherwise you're just doing robot work.
I don't strictly wait for CI to pass before reviewing, but I don't think it's a crazy idea. There's no point in taking the time to understand a set of changes and evaluate the design and such if it doesn't actually work in the first place and needs to be reworked.
If you have something that's easy to test locally and the CI checks are just a backstop then the initial PR should nearly always pass, but if your CI checks are much more thorough than a developer can do manually then a PR initially being in a broken state may be a routine occurrence.
Sometimes static analysis finds something that needs significant code changes to fix. Other times it will tell you to use an algorithm instead of a hand-written loop, and the code is a lot easier to understand afterwards. Most of the time it doesn't matter, but I don't want to waste my time in the few cases where it does.
> anyone know of an IDE/tool that can spell check UI strings without tripping up on variable names
Sublime Text works pretty well for me. I think you need to turn on spell check in the settings (off by default?). Then you can choose which syntax highlighting scopes you want to spell check. I have mine configured to spell check comments and string literals.
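Outside of an editor, the same trick (check only comments and string literals, never identifiers) can be scripted; a rough sketch for Python sources, assuming a word list at /usr/share/dict/words is available:

    import ast
    import re
    import sys

    # Sketch: spell check only string literals in Python source, so variable
    # names never reach the checker. The dictionary path is an assumption;
    # substitute whatever word list you have.
    WORDS = {w.strip().lower() for w in open("/usr/share/dict/words")}

    def check_file(path: str) -> None:
        tree = ast.parse(open(path).read(), filename=path)
        for node in ast.walk(tree):
            if isinstance(node, ast.Constant) and isinstance(node.value, str):
                for word in re.findall(r"[A-Za-z]{3,}", node.value):
                    if word.lower() not in WORDS:
                        print(f"{path}:{node.lineno}: possible typo: {word}")

    if __name__ == "__main__":
        for source_file in sys.argv[1:]:
            check_file(source_file)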
Anecdata of one, obviously, but I've never worked anywhere using GitHub that has had auto-merge turned on. You just have to wait for the "CI complete" email (or keep refreshing the page) and hit the merge button (or, in some places, ask the appropriate person to do that).
> of all the things a human should be in the loop for, merging to main seems like one of them
That's what the review is for. And I'd argue that the review should be decoupled from CI, because you're not reviewing whether the tests pass before/after the merge, you're reviewing the code and docs themselves.
And once the review is done, it should be fine to merge at any point, today or tomorrow, barring any merge conflicts of course.
Merging to main should be the most mundane task ever, and even a robot should be able to do it with confidence.
>once the review is done, it should be fine to merge at any point, today or tomorrow
Sometimes there are considerations like QA/infrastructure resourcing that dictate when something can go into master. Ideally that's rare but I have worked in a place where it's a consideration for every single merge.
To expand on this a little: we had a simple branching model. To/from master for work, branch from master to cut a release. Being highly regulated, as a matter of (unchangeable) policy, every change needed testing by the QA team. QA team == QA guy. Among feature work without hard release dates, there were also bug fixes with some urgency, and regulatory work with hard release dates. Given we traded flexibility in our branching model for simplicity, we instead had to consider whether something could be merged to master without impacting the QA load of anything already merged but not released. It worked fine for us, but we were a team of <5. My remark about trading flexibility is really the crux of it all though. There are compromises that need to be made, and this one made sense for us at the time.
If main/master isn't your production branch then allowing auto merge makes a lot of sense.
And if your changes have been QA'd in staging and/or they aren't that hectic (and are backwards compatible) then I think auto merge into prod can be fine too.
> I basically do not think "auto merge into prod" is ever OK, but then again, I let GitHub Copilot write 40-60% of my doc comments, so...
Funny how different people can be :) I'll always try to make the main branch so clean that it can be deployed at any moment, and production will always use the latest commit as soon as possible.
Basically, I'm doing CI + CD, but I understand it's not for everyone.
I've seen too many cases where people treat CI as a first-pass reviewer and wait for it. I can understand it if people throw complete garbage at CI that will need to be reworked to the point it isn't worth reviewing. I've rarely seen that be the case though. I instead view CI as a peer taking care of the nitty gritty while I focus on the big picture and the non-automatable.
I've also worked on too many projects without a merge queue.
My favourite CI/CD build discovery was `find .` (a recursive file listing) being called after each step. (Maybe 12 total steps, 6 being the usual dependency fetch, unit tests, smoke tests etc., and the other 6 being just this file listing task.)
CircleCI dutifully polling the ~750k-line stdout for the find command was taking up about 3 minutes of our 5 minute process. Was done so someone could debug a failing build... supposedly.
> CircleCI dutifully polling the ~750k-line stdout for the find command was taking up about 3 minutes of our 5 minute process. Was done so someone could debug a failing build... supposedly.
Doing it for a one-time CI run is fine. Forgetting to remove it is less fine. Doing it on CircleCI who have had builds you can SSH into since forever is a lot less fine, it should never have hit the config at all.
The missing data in that research is org size. If you've got 5-10 developers in a group, I don't see any of this making much difference in anyone's productivity. OTOH, if you're in an org with 100+ developers and every merge that goes in before you means some sort of conflict resolution, then yes, this might actually be important.
If you have 100+ developers you should segregate your system so that merge conflicts outside your group of 10 are rare. Sure, someone needs to upgrade a build tool and that may sometimes mean changing everything, but those are not common operations and only a few people out of your 100 ever feel that pain. (Having been there, it is often best to announce it in advance and then merge on Saturday when everyone else is at home; my company will let me take a different day off during the week if I work a Saturday.)
I've worked in k8s hybrid cloud automation for the past few years, and it requires deploying a k8s cluster, installing our services, and then running tests against the cluster. Our pre-merge pipeline takes 3 hours minimum and often much longer. It's definitely a drag on productivity but in a lot of complex multi-cloud environments with tons of moving pieces, I haven't yet seen a CI system achieve much better.
"Developer Productivity for Humans, Part 4: Build Latency, Predictability, and Developer Productivity" [1] (discussion[2]) in the IEEE Software magazine earlier this year is a better look at this. TL;DR: less time than it currently takes.
The CI shows the entire team that the tests are passing, and doesn't require other devs to pull the branch and run the tests themselves. It prevents devs from accidentally forgetting to run tests, or thinking "this is just a typo fix, it shouldn't break any tests". Tests might also run on multiple platforms/OSes/architectures/versions, etc.
The CD ensures that the code is only deployed after all tests have passed and it has been approved and merged, and it ensures a deploy isn't skipped accidentally. Depending on what you're doing, it could take a while to fully build new binaries, build for multiple platforms/versions/architectures, etc.
Testing on your local machine won't catch portability issues (Linux vs macOS vs Windows, x64 vs Arm64).
Unless you take great care in writing hermetic tests, testing on your local machine won't catch problems hidden by non-standard versions of libraries and tools in your local development environment (e.g. you installed modern versions of bash and python on your Mac, but your users are using what the OS comes with).
And for large enough projects, running tests on a single machine is simply impractical: a full test suite run on a single machine would take hours; the only solution is to shard.
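The sharding itself can be as simple as hashing each test's name into one of N buckets so every CI worker gets a stable, disjoint slice; a sketch (the SHARD_INDEX/SHARD_TOTAL environment variables are assumed to be set by the CI system):

    import hashlib
    import os

    # Sketch of deterministic test sharding: each test name hashes into one
    # of SHARD_TOTAL buckets, so every worker runs a stable, disjoint slice.
    SHARD_INDEX = int(os.environ.get("SHARD_INDEX", "0"))
    SHARD_TOTAL = int(os.environ.get("SHARD_TOTAL", "1"))

    def belongs_to_this_shard(test_name: str) -> bool:
        digest = hashlib.sha1(test_name.encode()).hexdigest()
        return int(digest, 16) % SHARD_TOTAL == SHARD_INDEX

    # With pytest this could live in a conftest.py, dropping tests that
    # belong to other shards before the run starts.
    def pytest_collection_modifyitems(config, items):
        items[:] = [item for item in items if belongs_to_this_shard(item.nodeid)]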
Maybe not for a small 1-person project. If you have a team project you definitely need automated tests to run regularly, and everyone needs to be able to see when/where they fail.
Do you always remember to run every test? Does every other dev? Are there 100 other devs? If you check in bad code will it break something for 100 other devs? It all depends on the size of the org.
I'm waiting for a CI build as I write this. If I ran it locally it would take 3 hours, but the CI system can split it across multiple AWS nodes and so be done in 25 minutes (the time of the longest build). I tried getting a more powerful machine; it turns out my build becomes memory-bandwidth bound at about 48 cores (building on a RAM disk makes the build take longer, which is how I know it is memory bandwidth).
In my case, it would take about 45 hours to run on a single machine. Our distributed testing infrastructure gets it done in about 2. But there's a limited number of machines, so you usually have to add another 2 hours of queuing to that.
The whole point is to continuously integrate your work with that of others. If you are a solo operation then sure, no diff. On a team? Different story.