
So, while so many of the staff members are out this week, I'm taking the opportunity to train up interns and other junior devs with some semblance of "calmness" around the remote office.

So without any further context:

What's the general consensus on monorepos 'round these parts?

We have a sprawling layout of interlinked git repositories, and the distributed nature of the config, app code, etc. is startling to these jr devs.

Is a monorepo a good fix?

So far I'm loving the feedback on this subject. I kind of knew it would be subjective, shaped by personal experience with repo layouts, mixed with some academic-level analysis, all biased by the various experiences behind it.

This is one of those topics where I usually leave with more chin scratching than I began - but the perspectives! It's growing my very narrow tunnel vision immensely, so thanks for that!

@chuck switching to monorepo was one of the most helpful things we ever did. Improved our workflows significantly.

We still use modules for open source components to make it easy to upstream patches, but everything proprietary is in the monorepo.

@chuck There's not one answer, there's a trade-off.

The overhead of separate repos (versioning, dependencies, duplicate (or inconsistent) packaging/tooling) is almost always a mistake if the same team owns all the repos.

If you're large enough to have multiple, fully independent workstreams, the overhead of building everyone else's code for a simple change is a mistake.

But in my experience, the former is more common than the latter.

@chuck Monorepos have one use case, which in the experience of many developers is the only relevant one: when you just want a modular structure in your code, but effectively deliver a single monolithic product.

There are many, many situations in which that's not true.

And then, this is also highly dependent on your implementation language and build/packaging system. Some languages make modules and packages largely equivalent. A monorepo is actually overkill here.

@jens @chuck Facebook and Google both use gigantic monorepos. Neither "effectively delivers a single monolithic product."

@jens @chuck Actually that's wrong. Facebook mostly does. Google definitely does not, though.

@freakazoid @chuck You know how they say exceptions prove the rule? You know that you just picked the two biggest exceptions?

@jens @chuck The word "prove" in that particular idiom actually means "challenge". Which I think is accurate in this case.

@freakazoid @chuck That's fair.

All I'm trying to get at is that mentioning an unusual counter example or two doesn't substantially alter the rule. But I'm also not going to pretend that this rule must absolutely apply everywhere, just... as a rule.

Google AFAIK doesn't actually have everything in a huge monorepo. I have never worked there, but I guess I could ask. I wonder how representative the Android (AOSP) repo is.

Here, they use android.googlesource.com/tools ...

@freakazoid @chuck ... which is explicitly *not* working on a monorepo, but used to orchestrate multiple individual repos as one.

I guess from a usage point of view, it acts like a monorepo, but it still allows composition of individual repos into one such virtual monorepo.

I really wonder how representative it is. Gotta ask my insiders 🤔

@jens @chuck I worked at Google for 3 years. Their open source stuff, including Android, is not managed as part of their monorepo.

@freakazoid @chuck Anyway, completely unrelated to what they *actually* do, it's also a fair view IMHO that every product team within Google effectively delivers a single monolithic product. A bit of an oversimplification for sure.

The main point being, if you have some core tech used across many different products that release each at their own pace, an actual monorepo is more likely to hurt than help.

@jens @chuck I think it's also very different between software that's running on your own infrastructure and software that's going to be downloaded onto a device. When it's easy to release updates, the potential for accidentally incorporating a bug in a dependency isn't a big deal compared to the advantages for testing and not having to version your dependencies.

Google uses a "build horizon" instead of versions: anything running in prod must have been built within the past 3 months.

@jens @chuck When they have bugs that require updating stuff, they flag the range of changelists (commits, but completely linear because Perforce) that has the bug and force a rebuild of anything that was built between those commits.
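The two rules described above (a build horizon plus flagged changelist ranges) can be sketched as a small predicate. This is a hypothetical illustration, not Google's actual tooling; the 90-day horizon and the changelist numbers are assumptions made up for the example.

```python
from datetime import datetime, timedelta

# Anything running in prod must have been built within the horizon.
BUILD_HORIZON = timedelta(days=90)

# Inclusive changelist ranges known to contain a bug (made-up numbers).
FLAGGED_RANGES = [(101_200, 101_450)]

def needs_rebuild(built_at: datetime, changelist: int, now: datetime) -> bool:
    """True if an artifact must be rebuilt: either it is older than the
    build horizon, or it was built from a flagged changelist range."""
    if now - built_at > BUILD_HORIZON:
        return True  # past the horizon: rebuild regardless of contents
    return any(lo <= changelist <= hi for lo, hi in FLAGGED_RANGES)
```

The nice property is that "version pinning" disappears: the horizon guarantees every dependency is at most one horizon old, and flagged ranges force targeted rebuilds in between.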

@jens @chuck I actually have a client who ended up paying a fairly big price for splitting up their monorepo and would have been far better off just doing the work to deal with the problems they'd been having with it that led them to split it up in the first place. They still haven't solved dependency versioning.

@freakazoid @chuck dependency versioning is *hard* which is why I'd tackle it immediately. Much easier to build your development practices around it than try to solve it later. At least that's my experience.

@jens @chuck Yeah, it almost certainly won't pay to switch to a monorepo once you have separate repos working well.

I suspect this is a case of the general problem of people underestimating the work to make a big change compared to the work they know they have to do in order to not make the change. This leads to misguided rewrites all the time.

This sort of thing is often spearheaded by newcomers who never bother learning the old system very well and end up with similar or worse problems.

@jens @chuck The model I have in my head of the monorepo vs microrepo decision makes me think that they are isomorphic in some way. For each problem one has, the other seems to have a conjugate version of the same problem. Internal dependency versioning versus third party dependencies. Dependencies for testing versus needing to use a monolithic build system.

It seems like we need a meta-build system of some kind. And I don't mean a CI system but a declarative meta-build language.

@freakazoid @chuck that's something I rolled around in my head for a while, but didn't want to get into.

@jens @chuck Probably a wise choice. Build systems are notoriously difficult.

Nix and Guix seem to at least have gone the right direction. It seems like a lot of build systems exist because people couldn't be bothered to learn Make and Autoconf, and thus they failed to learn any actual lessons before creating systems that have many of the problems Make and Autoconf solve.

@freakazoid @chuck What I was actually thinking of was more of a meta build system plus a toolkit of utilities. As in, the only thing we really need to know about a dependency is where to find its files (and dependencies of its own). In particular, we don't need to know anything about how it arrives there.

So the trigger should be that the expected files aren't to be found, and the action to fix that... a script. It might invoke git or curl to get sources, or RPM for...

@freakazoid @chuck ... binaries. It might build or copy or install. Doesn't really matter, as long as the result is that the files are where they are.

The toolkit of utilities would just be a bunch of little helpers for managing repos and search paths and whatnot that you can use in your script.

@freakazoid @chuck mostly I figured that this low effort approach actually provides you with what you need, especially when each of your dependencies uses a different build system, distribution method, etc.

In that sense it's not significantly different from a packaging system. Might even zip up the final products with a manifest.
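The "trigger + script" idea above can be sketched in a few lines. This is a minimal illustration of the concept, not an existing tool; the function name and its interface are invented for the example.

```python
import subprocess
from pathlib import Path

def ensure_dependency(expected: list[Path], fetch_script: list[str]) -> None:
    """A dependency is described only by the files it must provide and an
    arbitrary script that can produce them (git clone, curl, rpm, a build -
    doesn't matter). We only act when the expected files are missing."""
    if all(p.exists() for p in expected):
        return  # files already in place; we don't care how they got there
    subprocess.run(fetch_script, check=True)
    missing = [p for p in expected if not p.exists()]
    if missing:
        raise RuntimeError(f"fetch script did not produce: {missing}")
```

Because the system only checks for the presence of files, each dependency can keep whatever build system and distribution method it already has, which is exactly the low-effort property described above.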


@chuck I use two git monorepos with some regularity, FreeBSD src and ports

the FreeBSD src repo is a bit clunky
the ports repo is very, very clunky

but, there's no gooder way to do ports right now
because there's a lot of common code (the framework) and many ports depend on others, so you might not know how many you'd have to clone to build the port you actually want to build

and then there's updating:
often there's updating of many ports for the same reason
now imagine instead of doing that in fifteen directories, you're doing it in fifteen repositories.
even if you do everything right, and manage to commit and push all fifteen of them without forgetting anything, you've now generated fifteen commit emails instead of one, going out to who knows how many thousand people

@chuck I like starting with a mono-repo and extracting components that outgrow the mono-repo nursery into their own repositories.
