Adding to the “Going Dark” and DVCS debate

On programmers “going dark” — Aristotle Pagaltzis writes:

Jeff Atwood argues that open source projects are in real danger of programmers “going dark,” which means they lock themselves away silently for a long time, then surface with a huge patch that implements a complex feature.

It seems to me that this is as much a technological problem as a social issue… and that we have the technological solution figured out: it’s called distributed version control. It means that that lone developer who locked himself in a room need not resurface with a single huge patch – instead, he can come back with a branch implementing the feature in individually comprehensible steps. At the same time, it allows the lone programmer to experiment in private and throw away the most embarrassing mistakes, addressing part of the social problem.

However, I don’t think he realised that the Jeff Atwood story he responded to was in fact an echo of Ben Collins-Sussman’s original article, which specifically picked out DVCS as a source of this danger:

A friend of mine works on several projects that use git or mercurial. He gave me this story recently. Basically, he was working with two groups on a project. One group published changes frequently…

“…and as a result, I was able to review consistently throughout the semester, offering design tweaks and code reviews regularly. And as a result of that, [their work] is now in the mainline, and mostly functional. The other group […] I haven’t heard a peep out of for 5 months. Despite many emails and IRC conversations inviting them to discuss their design and publish changes regularly, there is not a single line of code anywhere that I can see it. […] Last weekend, one of them walked up to me with a bug […] and I finally got to see the code to help them debug. I failed, because there are about 5000 lines of crappy code, and just reading through a single file I pointed out two or three major design flaws and a dozen wonky implementation issues. I had admonished them many times during these 5 months to publish their changes, so that we (the others) could take a look and offer feedback… but each time met with stony silence. I don’t know if they were afraid to publish it, or just don’t care. But either way, given the code I’ve seen, the net result is 5 wasted months.”

Before you scream; yes yes, I know that the potential for cave-hiding and writing code bombs is also possible with a centralized version control system like Subversion. But my friend has an interesting point:

“I think this failure is at least partially due to the fact that [DVCS] makes it so damn easy to wall yourself into a cave. Had we been using svn, I think the barrier to caving would have been too high, and I’d have seen the code.”

In other words, yes, this was fundamentally a social problem. A team was embarrassed to share code. But because they were using distributed version control, it gave them a sense of false security. “See, we’re committing changes to our repository every day… making progress!” If they had been using Subversion, it’s much less likely they would have sat on a 5000 line patch in their working copy for 5 months; they would have had to share the work much earlier.

To be honest, I tend to agree with Aristotle. Centralized VC does make it harder to maintain a “private branch”, and that “high barrier to caving” puts technical pressure on a social problem, but that doesn’t make it a good thing. I’d prefer to fix the DVCS tooling so that it applies the social pressure, and have both working tools and a working social organisation.
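As a rough illustration of what “fixing the DVCS to apply social pressure” might look like, here is a minimal sketch of a check that nags a developer once local-only work has piled up. Everything in it is hypothetical: the function name, the thresholds, and the message wording are assumptions for the sake of the example, not a feature of git, Mercurial, or any other tool.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds; the post does not propose specific numbers.
MAX_UNPUSHED_COMMITS = 20
MAX_UNPUSHED_AGE = timedelta(days=14)


def caving_warning(unpushed_commit_times, now):
    """Return a nag message if local-only work has piled up, else None.

    unpushed_commit_times: datetimes of commits not yet on the shared
    repository. In git these could plausibly be gathered with
    `git log @{u}..HEAD --format=%ct`; that wiring is left out here.
    """
    if not unpushed_commit_times:
        return None
    count = len(unpushed_commit_times)
    # Age of the oldest commit that nobody else has been able to see.
    age = now - min(unpushed_commit_times)
    if count > MAX_UNPUSHED_COMMITS or age > MAX_UNPUSHED_AGE:
        return (f"{count} unpublished commits, oldest {age.days} days old: "
                "consider pushing a branch for review.")
    return None
```

A check like this could run from a local hook or a shell prompt. It leaves private experimentation alone and only makes a long silence visible to the person keeping it.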

Another commenter on Ben’s original post put it well:

I [..] disagree, strongly, that DVCS makes code hiding any more difficult than single-branch VCS. When using a single branch, it’s usually a very small group of people who are allowed to commit. Any patches from non-core contributors get lost in a tangle of IRC pastebins, mailing lists, bug trackers, and blog posts. Furthermore, even if these patches are eventually committed, they have lost all their associated version information — the destructive rebase you complain about. DVCS allows anybody to branch from trunk, record their changes, and publish their branch in a service like Launchpad or github. For an example of this, look at the mass of user-created branches for popular projects like GNOME Do or AWN.

It’s very interesting to see those Launchpad sites, in my opinion.

I’ve spent many years shepherding contributions to SpamAssassin through our Bugzilla. We’ve often lost rule contributors, who are particularly hard to attract for some reason, due to the delays and human overhead involved in this method. :( So an improved interface for this would be very useful…


3 Comments

  1. Posted June 16, 2008 at 13:10

    Many of the comments over there work on the assumption that communication is easy, I think. It’s not, and evaluating others’ comments and taking on board their feedback can be a lot of work, if you’re lucky enough to have them.

    For the XEmacs work I do (and yes, I should do more, the project isn’t as active as it could be :-/ ) it can change the time needed to implement a feature from a matter of a weekend to a matter of a couple of months, with the latency of the back and forth and the misunderstandings and differences of opinion that arise; if I can anticipate what people will say, and pre-emptively address it, it’s less work for me in total.

    I do think that the git rebase approach is needless, though.

  2. Posted June 17, 2008 at 09:32

    DVCS is a terrible idea, IMO. Committing often has so many obvious and more subtle benefits. Allowing people to “commit” to a private repository makes them feel like they’re following good practices, but what’s the point if it doesn’t allow others to quietly point out the huge mistake you missed on page 1?

    If people don’t wish to commit their code (they may have valid reasons for doing so, although none spring to mind) they should at least be encouraged to commit the log entries so they can be checked for sanity.

  3. Posted June 18, 2008 at 17:11

    @Eoghan: Not true, you just have to add “Push” often into the mix when it comes to DVCS. Commit very often, push often. It’s just a single added semantic. The added bonus is that if they decide to roll back to code they committed yesterday, but that branch hasn’t been pushed to the main repository since yesterday, they can still do that.

    It just requires (in my opinion) a slightly changed mindset, from thinking “Commit” makes your changes public to thinking “Push” makes your changes public. (Of course commit still carries the rest of its original meaning, but you get what I mean.)

    DVCS is a new way of doing things, so it’s going to require a slightly different approach to what best practices are.