Commits don't cost money, but time wasted on "added forgotten files"-commits whi...

eropple · on July 31, 2011

I dunno, I think the claim that those commits "waste time" (in the sense of any meaningful amount of time, even cumulatively) is a little hyperbolic.

I guess you view history differently than I do: I consider all development history to be "public history" regardless of whether it was pulled in from a clone or not. If you commit it to a repository I am going to be fulfilling a pull request from, I want the history there.

planckscnst · on July 31, 2011

So, let's get down to business here, then. What specifically is better about this:

    commit 123facdf Add asynchThingerBopper() to Thinger class.
    commit 9f9babd8 Forgot to add Thinger.h file to 123facdf

compared with this:

    commit 123facdf Add asynchThingerBopper() to Thinger class.

What specific value does the first scenario add that the second does not?
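For reference, the second history is usually reached by amending the commit instead of adding a follow-up. A minimal sketch, assuming a throwaway repository, placeholder file contents, a made-up committer identity, and a first commit that has not been pushed yet:

```shell
set -e
# Hypothetical identity so the commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q

# Placeholder sources standing in for the Thinger class.
echo "void asynchThingerBopper() {}" > Thinger.cpp
echo "void asynchThingerBopper();" > Thinger.h

# The mistake: only one of the two files gets committed.
git add Thinger.cpp
git commit -q -m "Add asynchThingerBopper() to Thinger class."

# The fix: stage the forgotten file and fold it into the same commit.
git add Thinger.h
git commit -q --amend --no-edit

git log --oneline
```

Amending replaces the commit with a new one, so its hash changes. That is why this only suits commits nobody else has pulled.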

koopajah · on Aug 1, 2011

It can help for git bisect, for example, to avoid having half of your commits not compile because half the time you forgot to add one of the files to your commit.

fr0sty · on Aug 1, 2011

> It can help for git bisect

Not if the first commit doesn't compile.

Indelible history is good for public projects (and no one is arguing about that) and for change control systems but less good, in my estimation, in cases where mistakes are easily made, have small to non-existent consequences, and serve no historic purpose.

pilif · on July 31, 2011

and you will get that history.

But you will get it in the form that I would like you to have it, not as it happened.

But there is no way for you to know besides the fact that all commits you are going to pull are self-contained and none of them breaks the build.

Would you reject perfect looking commits, self-contained, perfectly documented and forming a perfect temporal history based on the fact that they are too perfect not to have been created using rebase? Because that's the only indication you have that rebase was used.

eropple · on Aug 1, 2011

> But you will get it in the form that I would like you to have it, not as it happened.

And that's actually a very large part of why I don't use something that makes rebase easy. Because, fundamentally, I do not care how you "want" me to get it. I want to get it how it was put into the repository to begin with. I want to get it how it was committed--because how it was done matters to me, as a developer and as a person. This is an entirely emotional position and I don't care: show me how it was done. It matters to me, and given my personal axioms there is a decent argument, in my mind, that doing otherwise is disrespecting people who might want to see how you "put it all together."

You're certainly welcome to disagree. But because I use Hg for my projects and for any project I contribute to, and because my usual co-workers aren't going to go through the hassle of enabling Hg's rebase extension and using it just for the hell of it, I generally get what I want in the areas I care about. =)

fr0sty · on Aug 1, 2011

If your workflow or VCS doesn't allow rebase natively then people will implement it in the filesystem by never committing.

Git's approach to committing is: "commit early, commit often". The corollary is: "don't worry about perfection, we can fix it later". Such flexibility to me is very enabling and allows for a lot of very beneficial experimentation in the process of developing features.

Honestly, I don't understand your fascination with the sausage-making. If I were to give you the pre-rebase and post-rebase version of patches I would wager you would find much more value in the latter. And even if not, even if seeing the wandering, hacking, slashing, typoing, re-indenting, etc. is instructive to you, any future maintainer of the code will be far less pleased.

astine · on Aug 1, 2011

"I want to get it how it was committed--because how it was done matters to me, as a developer and as a person."

Or as a micro-managing boss. I don't need someone staring over my shoulder as I work or after I work. My mistakes aren't your business and I find your attitude unpleasant.

westi · on July 31, 2011

I completely agree with this.

If a change was worth committing then it is worth sharing that commit with everyone. Otherwise you run the very real risk of losing important information about the design of a feature and the bugs that were found and addressed during development.

Every change should be accompanied by a well-described commit message, and big changes are much harder to review.

I can see a very small positive in "hiding" the commits which resolve process issues like forgetting to add a file, but in the long run you shouldn't have very many of these anyway, so you shouldn't worry about them.

pyre · on Aug 1, 2011

  > If a change was worth committing then it is worth
  > sharing that commit with everyone

You're misunderstanding some of the workflows that people are discussing. Sometimes I commit things that are half-finished, or even half-baked because I know that when it comes time to push I can rewrite things into a set of commits that makes sense.

This workflow makes sense because rewriting is easy enough. Obviously, I might not do this if I had to publish every commit that I made. But then I would just resort to using something like quilt to manage patches on top of SVN, which is ridiculous. Your VCS is a patch management system. The idea that someone would use a patch management system on top of a patch management system suggests that something is broken (yes, I have had people on HN claim that git sucks because SVN + quilt 'work for me').

baha_man · on Aug 1, 2011

"But then I would just resort to using something like quilt to manage patches on top of SVN, which is ridiculous."

This is exactly the way Mercurial patch queues work:

http://mercurial.selenic.com/wiki/MqExtension

westi · on Aug 1, 2011

My point is that these interim commits are useful information for other people too, not just you.

They help show how the feature developed, what other implementation ideas you tried along the way, etc.

chousuke · on Aug 1, 2011

How are they useful? Why would anyone care about how you developed a single bugfix or a feature?

In your model, often the commits are not even sequential in the log because you might find a mistake only after committing several other changes. I can't see how not rebasing makes commit history better in any way at all. I would like to hear your reasoning.

The way I see it, instead of a series of commits that implement something, you could have a single patch (commit) that implements something, making it much easier to

0) find all the code that implements a certain feature, because it's a self-contained commit,

1) find bugs via bisection,

2) port to other versions via cherry-picking and

3) read the changelog and figure out what the hell is actually happening because there are no trash commits around obscuring things.

EDIT:

Just to clarify, I do not think that this is a matter of opinion or preference. That would imply that both approaches are equally valid.

I consider rebasing a tool that enables a vastly superior workflow. I have given a few reasons why I think it is superior and I am interested in counterarguments or at least reasoning as to why not making perfect commits (to the best of your ability) is preferable or even acceptable at all.

westi · on Aug 1, 2011

> How are they useful? Why would anyone care about how you developed a single bugfix or a feature?

In my opinion in a collaborative environment it is immensely useful to know about how a feature / bug fix was developed.

Hiding away the development of a feature into one large commit makes it harder for people to review.

> I do not think that this is a matter of opinion or preference. That would imply that both approaches are equally valid.

I think it is a matter of preference and opinion and this is why we have such diametrically opposed views on the subject.

chousuke · on Aug 2, 2011

> Hiding away the development of a feature into one large commit makes it harder for people to review.

You have it backwards. One commit is much easier to review than three commits that you might not even know are related.

Assuming you know what you're doing, the commits you create with rebase are not large, they are just the perfect size. They contain the code needed for a single change and nothing else.

Sometimes a feature might actually take two or more commits, but then those commits represent two subfeatures... For example, you might first need to implement a new API, and then write a new feature that uses that API. That's two commits. If you forget something from the API or notice that it's problematic while coding the feature, then rebase will allow you to fix the first patch, instead of splitting code across multiple commits that make no sense separately.
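A sketch of that API-then-feature case, assuming a scratch repository with placeholder files (`api.txt` and `feature.txt` are hypothetical, as is the committer identity). `git commit --fixup` records the correction, and `git rebase -i --autosquash` folds it into the first patch. `GIT_SEQUENCE_EDITOR=true` simply accepts the generated todo list so the example runs unattended; normally you would review it in your editor:

```shell
set -e
# Hypothetical identity so the commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q

echo "base" > README
git add README
git commit -q -m "Initial commit"

# Commit 1: the new API.
echo "api v1" > api.txt
git add api.txt
git commit -q -m "Add new API"

# Commit 2: the feature that uses the API.
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature using the API"

# While coding the feature, a problem in the API turns up.
# Record the correction as a fixup of the API commit (HEAD~1 at this point).
echo "api v2" > api.txt
git add api.txt
git commit -q --fixup=HEAD~1

# Fold the fixup into "Add new API"; the feature commit is replayed on top.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~3

git log --oneline
```

The result is still two commits on top of the initial one, and the API commit already contains the corrected file.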

One might argue that in this situation the feature patch makes no sense without the API patch but in fact it's a feature dependency, not a code dependency... As long as the API exists, the feature implementation is just fine as a standalone commit.

When used properly, rebase gives you freedom to use as many WIP commits and make all the mistakes you want in your private branch, while still allowing you to create good, easy-to-review patches that result in a logical, clean, and informative history.

If you still hold the opinion that rebase is bad, I am interested in further arguments, since the one reason you gave does not hold water.

pilif · on Aug 2, 2011

> In my opinion in a collaborative environment it is immensely useful to know about how a feature / bug fix was developed.
>
> Hiding away the development of a feature into one large commit makes it harder for people to review.

This is exactly what we pro-rebasers were talking about.

None of us wants to do one big huge commit that contains the whole feature.

All of us want small, self-contained commits, each one fixing or adding one specific thing.

What we don't want is a commit adding a thing, quickly followed by another commit "forgot to add this file", because that later commit provides no value to a reviewer.

What we rebasers are talking about is forging the history in a way that a patch reviewer can go over every single commit and, in one glance, decide whether that patch makes sense or not.

Let's assume it's the old days of svn: A whole file is the smallest unit of change you can commit and there's no way to change history.

Let's further assume that you want to add a new feature to a file. While doing so, you also notice that there's a bug in another part of that code in the same file that became apparent while writing your feature.

Your feature only works with the bug fixed, but the bug fix also makes sense independently of the feature.

In the old days, when committing that file, you have two options for commit messages:

1) "adding feature foobar"

leaving out the fact that you also fixed a bug. This is bad if your bugfix contains another bug and I have to dig in the history, wondering why you changed this seemingly unrelated piece of code. If I have to review the code, I will have to ask you why you also changed a seemingly unrelated piece of code.

2) "adding feature foobar and fixing bug bar"

This is better, but weren't you taught that a commit should only do one thing? This clearly does two.

At that point, you could use diff, patch and an editor to remove the feature but leave the bugfix in. Then you commit that as "fixing bug bar", followed by more diff and patch to get the feature in, which you commit as "adding feature foobar".

Fine, but very cumbersome, so hardly ever done.

Git, on the other hand, with the help of "add -p" and "rebase -i" makes exactly this possible and turns something incredibly painful into something you can do with closed eyes in your sleep.

And this is why there is this vocal pro-rebasing crowd.

We are EXACTLY NOT talking about mushing everything together into one big commit.

We are talking about creating MANY, MANY more SMALLER commits that are independent of each other and thus much more maintainable.

Case in point: Since we migrated to git for our product and since everybody learned about rebase and began using it, we made the same number of commits in one year as we did in the three previous years.

I would be seriously upset if somebody used the power of rebase to create big huge commits and wanted to push them to our main repo. This is not what we are advocating rebase use for. Not at all.

pyre · on Aug 1, 2011

Personally, I think that this idea is somewhat bogus:

1) In most cases, all of those extra commits are just noise. They make it really confusing to determine what actually changed from point A to point B because of all of the dead ends that were hit and backed out in between.

2) This is like saying that every time that someone produces anything they should be required to save all of their dead ends for people that look at their work in the future. If a carpenter at a building site cuts a piece of wood wrong, should he just recut it correctly, or set it aside so that on the off-chance that someone needs to see how he mis-cut the piece of wood, they can?

3) The fundamental flaw here is that you're trying to use the tool to enforce the workflow. Would you feel the same way if the next version of Ubuntu enforced that no image files could be placed anywhere on disk except ~/Photos just because someone determined that that's what 'makes sense?'