Understanding the Git Workflow

pilif · on July 31, 2011

The minute I learned about "rebase -i" and "add -p" has changed how I think about commits. I learned how I could easily keep the history clean and conversely, I learned the huge value that a clean history has for maintenance.

Now, building the commits as self-contained entities that don't break the build in between not only helps me while searching bugs later on, it sometimes helps me detect code smells around unneeded dependencies.

That said, I still like to merge big features with --no-ff if they change a lot of code and evolved over a long time, as that, again, helps keeping history clean because a reader can clearly distinguish code before the big change from code after the big change.

Of course the individual commits in the branch are still clean and readable, but the explicit merge still helps if you look at the history after some time.

"you said 'a long time in development' - surely the merge target has changed in between. Why still -no-ff?" you might ask.

The reason, again, is clean history: before merging I usually rebase on top of the merge target to remove eventual bitrot and in order to keep the merge commit clean. Having merge commits with huge code changes in them which we're caused by fixing merge conflicts, again, feels bad.

But this is certainly a matter of taste.

false · on Aug 1, 2011

You will probably enjoy 'checkout -p' and 'reset -p' as well (revert and unstage changes hunk-by-hunk)

diminish · on July 31, 2011

just like you, i enjoy rebase -i, to change history; but I also hear some poeple claim the history should be kept as it is and should not be rewritten. What are your arguments for rebase?

pilif · on July 31, 2011

I would never ever rewrite the public history.

The public history is what ends up on the repository from where we deploy from. Whenever a commit is pushed there, it stays there. There will never be any rebasing (minus emergencies like removing accidentally committed files for which we don't own a license for - didn't happen so far though).

"rebase -i" is a tool for personal development use. It's not a tool to use on a public repository as it will make following history incredibly hard and it will screw with the clones other developers might have.

Conversely though, what I do on my personal development machine or on my personal public clone (everyone of us has a personal public clone we use for code reviews or discussions around code), is my business.

Nobody is telling me which editor to use and nobody is telling me whether I can clean up my commits or not.

Now in general, since learning that having clean commits is possible (it's not in subversion for example), I encourage my fellow developers to have clean commits and I discourage them from committing those famous "oops - removed typo" or "oops - added forgotten file" commits as they are completely useless for the overall history of the project.

Two months from now, nobody is going to care about you forgetting to add a file. But I'm likely going to care about when a feature has been added and why some lines have changed. So that's what I want to have in the public repository. Not a history of your personal forgetfulness.

If they manage to do that without ever rebasing (you can do it with add -p, it's just easy to make a mistake), then fine. In the end, I only care about a clean history on our public repository.

eropple · on July 31, 2011

IMO, that's really the wrong way to go, and it's one of the big reasons I absolutely loathe git. I want any changes that are in my tree, ever, to be in the order and position in which they happened. If somebody screwed up and forgot to add a file, fine--add it in another commit. It's not like commits cost money.

As far as rolling back later--meh? I've never had a trouble in 300Krev heavily branched SVN barf, I strongly doubt it's suddenly harder in a DVCS. Merge tags are your friend, and indelible history is a good thing.

pilif · on July 31, 2011

Commits don't cost money, but time wasted on "added forgotten files"-commits while parsing the history to trace a bug does cost money, so I'd rather not have the commits.

Additionally, it's impossible for you or anybody else to find out whether I have rebased my personal history before pushing. As such, it's totally inconsequential for the main repository whether I rebased or not.

As I said: I think rebase is a personal development tool, not one you would alter public history with.

eropple · on July 31, 2011

I dunno, I think the claim that those commits "waste time" (in the sense of any meaningful amount of time, even cumulatively) is a little hyperbolic.

I guess you view history differently than I do: I consider all development history to be "public history" regardless of whether it was pulled in from a clone or not. If you commit it to a repository I am going to be fulfilling a pull request from, I want the history there.

planckscnst · on July 31, 2011

So, let's get down to business here, then. What specifically is better about this:

    commit 123facdf Add asynchThingerBopper() to Thinger class.
    commit 9f9babd8 Forgot to add Thinger.h file to 1123facdf

compared with this:

    commit 123facdf Add asynchThingerBopper() to Thinger class.

What specific value does the first scenario add that the second does not?

koopajah · on Aug 1, 2011

It can help for git bisect for example to avoid having half of your commit not compiling properly because half the time you forgot to add one of the files to your commit.

fr0sty · on Aug 1, 2011

> It can help for git bisect

Not if the first commit doesn't compile.

Indelible history is good for public projects (and no one is arguing about that) and for change control systems but less good, in my estimation, in cases where mistakes are easily made, have small to non-existent consequenses, and serve no historic purpose.

pilif · on July 31, 2011

and you will get that history.

But you will get it in the form that I would like you to have it, not as it happened.

But there is no way for you to know besides the fact that all commits you are going to pull are self-contained and none of them breaks the build.

Would you reject perfect looking commits, self-contained, perfectly documented and forming a perfect temporal history based on the fact that they are too perfect not to have been created using rebase? Because that's the only indication you have that rebase was used.

eropple · on Aug 1, 2011

> But you will get it in the form that I would like you to have it, not as it happened.

And that's actually a very large part of why I don't use something that makes rebase easy. Because, fundamentally, I do not care how you "want" me to get it. I want to get it how it was put into the repository to begin with. I want to get it how it was committed--because how it was done matters to me, as a developer and as a person. This is an entirely emotional position and I don't care: show me how it was done. It matters to me, and given my personal axioms there is a decent argument, in my mind, that doing otherwise is disrespecting people who might want to see how you "put it all together."

You're certainly welcome to disagree. But because I use Hg for my projects and for any project I contribute to, and because my usual co-workers aren't going to go through the hassle of enabling Hg's rebase extension and using it just for the hell of it, I generally get what I want in the areas I care about. =)

fr0sty · on Aug 1, 2011

If your workflow or VCS doesn't allow rebase natively then people will implement it in the filesystem by never committing.

Git's approach to committing is: "commit early, commit often". The coro;lary is: "don't worry about perfection, we can fix it later". Such flexibility to me is very enabling and allows for a lot of very beneficial experimentation in the process of developing features.

Honestly, I don't understand your fascination with the sausage-making. If I were to give you the pre-rebase and post-rebase version of patches I would wager you would find much more value in the latter. And if not. Even if seeing the wandering, hacking, slashing, typoing, re-indenting, etc. is instructive to you, any future maintainer of the code will be far less pleased.

astine · on Aug 1, 2011

"I want to get it how it was committed--because how it was done matters to me, as a developer and as a person."

Or as a micro-managing boss. I don't need someone staring over my shoulder as I work or after I work. My mistakes aren't your business and I find your attitude unpleasant.

westi · on July 31, 2011

I completely agree with this.

If a change was worth committing then it is worth sharing that commit with everyone. Otherwise you run the very real risk of loosing important information about the design of a feature the bugs that were found and addressed during development.

Every change should been accompanied by a well described commit message and big changes are much harder to review.

I can see a very small positive in "hiding" the commits which resolve process issues like forgetting to add a file but in the long run you shouldn't have very many of these anyway so you should worry about them.

pyre · on Aug 1, 2011

  > If a change was worth committing then it is worth
  > sharing that commit with everyone

You're misunderstanding some of the workflows that people are discussing. Sometimes I commit things that are half-finished, or even half-baked because I know that when it comes time to push I can rewrite things into a set of commits that makes sense.

This workflow makes sense because rewriting is easy enough. Obviously, I might not do this if I had to publish every commit that I made. But then I would just resort to using something like quilt to manage patches onto of SVN, which is ridiculous. You VCS is a patch management system. The idea that someone would use a patch management system on top of a patch management system suggests that something is broken (yes, I have had people on HN claim that git sucks because SVN + quilt 'work for me').

baha_man · on Aug 1, 2011

"But then I would just resort to using something like quilt to manage patches onto of SVN, which is ridiculous."

This is exactly the way Mercurial patch queues work:

http://mercurial.selenic.com/wiki/MqExtension

westi · on Aug 1, 2011

My point is that these interim commits are useful information for other people too not just you.

They help show how the feature developed, what other implementation ideas you tried along the way etc.

chousuke · on Aug 1, 2011

How are they useful? Why would anyone care about how you developed a single bugfix or a feature?

In your model, often the commits are not even sequential in the log because you might find a mistake only after committing several other changes. I can't see how not rebasing makes commit history better in any way at all. I would like to hear your reasoning.

The way I see it, instead of a series of commits that implement something, you could have a single patch (commit) that implements something, making it much easier to

0) find all the code that implements a certain feature, because it's a self-contained commit,

1) find bugs via bisection,

2) port to other versions via cherry-picking and

3) read the changelog and figure out what the hell is actually happening because there are no trash commits around obscuring things.

EDIT:

Just to clarify, I do not think that this is a matter of opinion or preference. That would imply that both approaches are equally valid.

I consider rebasing a tool that enables a vastly superior workflow. I have given a few reasons why I think it is superior and I am interested in counterarguments or at least reasoning as to why not making perfect commits (to the best of your ability) is preferable or even acceptable at all.

westi · on Aug 1, 2011

> How are they useful? Why would anyone care about how you developed a single bugfix or a feature?

In my opinion in a collaborative environment it is immensely useful to know about how a feature / bug fix was developed.

Hiding away the development of a feature into one large commit makes it harder for people to review.

> I do not think that this is a matter of opinion or preference. That would imply that both approaches are equally valid.

I think it is a matter of preference and opinion and this is why we have such diametrically opposed views on the subject.

chousuke · on Aug 2, 2011

> Hiding away the development of a feature into one large commit makes it harder for people to review.

You have it backwards. One commit is much easier to review than three commits that you might not even know are related.

Assuming you know what you're doing, the commits you create with rebase are not large, they are just the perfect size. They contain the code needed for a single change and nothing else.

Sometimes a feature might actually take two or more commits, but then those commits represent two subfeatures... For example, you might first need to implement a new API, and then write a new feature that uses that API. That's two commits. If you forget something from the API or notice that it's problematic while coding the feature, then rebase will allow you to fix the first patch, instead of splitting code across multiple commits that make no sense separately.

One might argue that in this situation the feature patch makes no sense without the API patch but in fact it's a feature dependency, not a code dependency... As long as the API exists, the feature implementation is just fine as a standalone commit.

When used properly, rebase gives you freedom to use as many WIP commits and make all the mistakes you want in your private branch, while still allowing you to create good, easy-to-review patches that result in a logical, clean, and informative history.

If you still hold the opinion that rebase is bad, I am interested in further arguments, since the one reason you gave does not hold water.

pilif · on Aug 2, 2011

>In my opinion in a collaborative environment it is immensely >useful to know about how a feature / bug fix was developed. > >Hiding away the development of a feature into one large >commit makes it harder for people to review.

this is exactly what be pro-rebasers were talking about.

Nobody of us wants to do one big huge commit that contains the whole feature.

Everybody of us wants small, self-contained commits, every commit fixing or adding one specific thing.

What we don't want is a commit adding a thing, quickly followed by another commit "forgot to add this file", because that later commit provides no value to a reviewer.

What we rebasers are talking about is forging the history in a way that a patch reviewer can go over every single commit and, in one glance, decide whether that patch makes sense or not.

Let's assume it's the old days of svn: A whole file is the smallest unit of change you can commit and there's no way to change history.

Let's further assume that you want to add a new feature to a file. While doing so, you also notice that there's a bug in another part of that code in the same file that became apparent while writing your feature.

Your feature only works with the bug fixed, but the bug fix also makes sense independently of the feature.

In the old days, when committing that file, you have two options for commit messages:

1) "adding feature foobar"

leaving out the fact that you also fixed a bug. This is bad if your bugfix contains another bug and I have to dig in the history, wondering why you changed this seemingly unrelated piece of code. If I have to review the code, I will have to ask you, why you also changed a seeminly unrelated piece of code.

2) "adding feature foobar and fixing bug bar"

this is better, but weren't you thought that a commit should only do one thing? This clearly does two.

At that point, you could use diff, patch and an editor to remove the feature but leave the bugfix in. Then you commit that as "fixing bug bar", followed by more diff and patch to get the feature in, which you commit as "adding feature foobar".

Fine, but very cumbersome, so hardly ever done.

Git, on the other hand, with the help of "add -p" and "rebase -i" makes exactly this possible and turns something incredibly painful into something you can do with closed eyes in your sleep.

And this is why there is this vocal pro-rebasing-crowd.

We EXACTLY NOT talking about mushing everything together in a big commit

We are talking about creating MANY, MANY more SMALLER commits that are independent of each other and thus much more maintainable.

Case in point: Since we migrated to git for our product and since everybody learned about rebase and began using it, we made the same amount of commits in one year that we did in tree previous years.

I would be seriously upset if somebody used the power of rebase to create big huge commits and wanted to push them to our main repo. This is not what we are advocating rebase use for. Not at all.

pyre · on Aug 1, 2011

Personally, I think that this idea of somewhat bogus:

1) In most cases, all of those extra commits are just noise. They make it really confusing to determine what actually changed from point A to point B because of all of the dead ends that were hit and backed out in between.

2) This is like saying that every time that someone produces anything they should be required to save all of their dead ends for people that look at their work in the future. If a carpenter at a building site cuts a piece of wood wrong, should he just recut it correctly, or set it aside so that on the off-chance that someone needs to see how he mis-cut the piece of wood, they can?

3) The fundamental flaw here is that you're trying to use the tool to enforce the workflow. Would you feel the same way if the next version of Ubuntu enforced that no image files could be placed anywhere on disk except ~/Photos just because someone determined that that's what 'makes sense?'

gruseom · on July 31, 2011

Why, then, not commit every keystroke? After all, you're losing history every time someone types backspace.

I imagine the reason that seems absurd is that you don't consider all the false steps and reworking that go on while a commit is crafted to be part of its official meaning. The working set is malleable until it's ready, and then you commit it. Well, private branches as pilif describes them are malleable in just this way. In both cases, you work your code like clay until it's ready to be presented and then bake it in to the public history.

eropple · on July 31, 2011

I strongly dispute the assertion that a private clone is malleable in the way you're describing. I consider all history important enough to be committed to be "public" history. But (apparently unlike folks who are fast with their downvote buttons) I certainly acknowledge that it is a matter of taste.

gruseom · on July 31, 2011

I consider all history important enough to be committed to be "public" history

This argument seems to me to boil down to an attachment to a single meaning (the traditional one) of the word "commit".

p.s. Instead of complaining about being downvoted, it would be better to make your tone less aggressive in the first place. None of your other comments made it clear that you regard this as taste; actually quite the opposite.

pyre · on Aug 1, 2011

  > I want any changes that are in my tree, ever,
  > to be in the order and position in which they
  > happened.

Could you explain this better. When I read this I hear, "I absolutely loathe modern operating systems, I think that all code should go in ~/code and all documents in ~/Documents, but modern operating systems allow you to put things anywhere. The horror!"

Do you really loathe the tool because it allows for flexibility? Do you have so little faith in the end-user? Do you really understand git, or only at a cursory level? (That isn't meant to be an insult)

  > indelible history is a good thing

So long as you have a central git repository and manage it so that no one can rewrite the master/trunk branch, then you have that. If someone screws up the history in their local tree, then it won't allow them to overwrite the history on the canonical version.

(If you say, "well someone with access to the canonical version could do X," then you're just trolling, because someone with access to the central SVN repo could 'rm -rf' it too. It's the same issue.)

tomjen3 · on July 31, 2011

Commits don't cost money, but they increase complexity which takes brain power away from the development process.

Second, git won't change anything by itself, so things will always be exectly where you left them; so I am really interested in why you think rewriting is a bad idea.

ulisesroche · on July 31, 2011

For better integration with an issue tracker, and because a master branch should have a formal commit history.

I make tons of little commits on a local branch with messages such as "reformat styles", "whoops, fixed typos", "fixed the query", and such. I use "rebase -i" to squash all of them together with a better, and more formal, commit message, like "Repairs and styles main navigation, closes #9".

I hate a master branch with a commit history with casual commit messages.

decklin · on July 31, 2011

The idea that fast-forward merges are easier to follow is subjective. I find my --no-ff history easier to read. This author doesn't.

What always using fast-forward merges really means is that you rebase each branch onto master once it's ready to be public. Therefore, instead of resolving conflicts when the branch is merged, the commits are rewritten to avoid introducing the conflict in the first place.

Sometimes, this is really simple -- I added a line in one spot, you added another line in the same spot, you merged first, so I rewrite my commit to add my line next to yours instead of merging and resolving the conflict. Sometimes, it's not -- maybe there's not even any text-level conflict, but your feature and my feature interact in subtle and unanticipated ways and something breaks. Now, there's no "good" point in my branch to refer to, because I rewrote it on top of something where (I didn't realize) it was never really going to work. The unit test I now need couldn't have existed because it involves things that, when I was developing the branch, didn't exist.

Rebasing first is trading off when you do that work. There's more to review when the branch is ready, and there's a stronger incentive to get it right the first time. I think this may work better for the "two founders deploying from master when they feel like it" scenario -- you pay for manageability with context switches. If you have a formal QA process, I think being able to distinguish between "this branch failed QA" and "the combination of these branches failed" may be more helpful -- you can parallelize work and hack on a different private branch.

Git, thankfully, does not force us to choose one model or the other :-)

sandofsky · on July 31, 2011

In my experience, on large distributed projects the person integrating changes into master is rarely the same person who authored the change.

For example, when Linux branches are pulled upstream, if your code creates a conflict your branch will just be rejected and you'll be told to fix.

Rebase forces the author to solve more of these problems before submitting their change for integration.

I don't think rebase is an end-all solution for the reasons you've described. It's perfect for medium sized changes you can easily verify afterwards. My day-to-day work usually falls into this category.

In the case of larger sets of all-or-none changes, such as a site redesign, it makes perfect sense to maintain a parallel line of development. Cleanup probably isn't worth it, and the separate branch serves as documentation. You should consciously create a new public branch.

In this case, I can understand wanting a "no-ff" merge for documentation. I think you should first consider tags, but sometimes it makes sense to set a stake in the ground with a placebo commit.

The problem is that if you use "no-ff" all the time on trivial changes, then these branches lose meaning.

This post wasn't supposed to be an embargo on "no-ff." My case is that people default to "no-ff" to pave over deeper issues.

cpeterso · on Aug 1, 2011

A --no-ff merge also makes reverting a change from master easier because there is just one commit. You don't need to dig through the log to find the first commit from the merged branch fast-forwarded onto master.

fr0sty · on Aug 2, 2011

Do you actually mean "revert" there or are you talking about rewinding?

using "git revert <merge_commit>" is very nasty[1]. using 'git reset <before_bad_merge> is less so.

[1]http://kernel.org/pub/software/scm/git/docs/howto/revert-a-f...

sunchild · on July 31, 2011

This opened my eyes a bit. I am a walking, talking git anti-pattern today. I'm mostly on a two-man team, so I can get away with it. I'm definitely going to start thinking more about a clean history on master.

What are some other best-practice git workflows that HN readers use?

gruseom · on July 31, 2011

I work this way and agree about the value of a clean, linear history. It makes working with past versions of your code a breeze. There's one thing the OP doesn't mention that I've found important.

Say you're working on a major design change in a private branch and it has 100 commits. When it's ready to be put on top of master, you'd really like not to squash all 100 commits. Unfortunately, if there are conflicts, then rebasing B1,B2,...,B100 onto master is likely to be much harder than squashing B1,...,B99 into B100 and then rebasing. Why? In the squashed case you only have to deal with conflicts between B100 and master, while in the unsquashed case you have to deal with all the conflicts that ever existed as you progressed from B1 to B100. It's frustrating to find yourself fixing conflicts in code that you know doesn't exist any more. It's also error-prone since it forces you to remember what you were doing at all those steps. In such situations, I give up and squash. That's not great either, since you now have the disadvantages of a single monolithic commit.

The solution is to be diligent about rebasing B onto master as frequently as master changes, so B never has a chance to drift too far afield. This at least gets rid of the worst pain, which is conflicts that compounded unnecessarily. It also keeps you aware of what's happening on master.

js2 · on July 31, 2011

Here's a trick for you: make sure you have rerere enabled. Merge the end commit, resolve all the conflicts and commit the merge (or just run rerere to record the conflict resolution). Then abort the merge or reset back to undo it. Now do the rebase, which will re-use the resolutions for any identical conflicts. You still have to deal with conflicts unique to the intermediate state, but in my experience rerere helps a lot.

gruseom · on July 31, 2011

I tried rerere once and it felt too much like magic to me, i.e. too complicated in a way that I didn't trust. Experience with conflicts has led me to eschew magic merge tools and rely on the simplest strategies: 1. minimize conflicts; 2. bite the bullet and deal with them manually. (Edit: my question about rerere is: how identical is "identical"? How can I be sure that it will redo what I did before in exactly the way I would do it now? Doesn't it have to understand my intent to achieve that?)

The diligent-rebasing-along-the-way workflow I proposed is all about #1. You still have to deal with intermediate conflicts this way too, but at least they're minimized. If something you commit to master conflicts with my B49, I have to fix B1..B49 but at least I can write B50..B100 in a way that takes your work into account.

js2 · on July 31, 2011

There is nothing really magic about rerere. During a merge, it records each conflict. When you commit the merge, it records your resolution of the conflict. If that _identical_ conflict occurs again, it re-applies the same resolution. You can choose whether it marks the the file as resolved or not, which allows you to easily review what was done before committing the merge.

pflanze · on July 31, 2011

I've always been an extensive user of rebase -i. Committing partial work often using git commit -a is easier, or at least takes less concentration, than always being careful to commit selectively with git add -p, git commit $files, but it needs squashing of those partial commits later on. I found that git rebase -i wouldn't scale to several days worth of work: I would frequently make errors when dealing with conflicts, and restarting rebase -i from scratch would mean redoing much of the work.

Because of this, I wrote a tool[1] that lets me do the same thing as git rebase -i, but allows me to edit the history changes incrementally, by keeping the history edit file and conflict resolutions around between runs; it does this by creating git patch files from all commits in question. I now always use this whenever I need to do more than one or two changes on some history; also, I'm now often creating commits to just store a note about a thought/idea/issue (the tool automatically adds a tag to the original history head, so I can look at those later on).

I originally wrote this just for me, which is the reason its own history isn't particularly clean and that I'm relying on a set of never-released libraries of mine; also maybe there are other, perhaps more well-known or polished tools than this, I don't know. I guess I should announce this on the Git mailing list to get feedback by the core devs.

[1] https://github.com/pflanze/cj-git-patchtool

/plug

simonw · on July 31, 2011

This is the first argument for using rebase that I've found truly convincing - really worth reading. This will probably change the way I use git.

eropple · on July 31, 2011

It wouldn't mine, if I used git (I avoid git specifically for this reason, actually, and use Mercurial). If you're actually looking at your commit logs, I find that rolling back is trivial; I can't remember the last time I accidentally rolled back into an incremental commit.

Personally it feels more like an apology for git's bad behavior than a good method of development.

click170 · on Aug 1, 2011

"I can't remember the last time I accidentally rolled back into an incremental commit."

I can't remember the last time I had a heart attack and died, but that doesn't mean it isn't going to happen in the future to me or to someone else. It sounds like it's just good practice for a public repository too, since your not the only one with access to it.

gruseom · on July 31, 2011

What exactly does Mercurial do differently that is better?

eropple · on July 31, 2011

Short of explicitly installing a rebase extension, it simply does not allow you to do this sort of mucking about with the commit history. For "oops, typo" commits, you can very quickly (and I mean, "it's a button in Tortoise" quickly) roll back your change and keep it in abeyance until you've fixed the typo.

chousuke · on Aug 1, 2011

So basically you reject a powerful tool out of idealism ("must never ever edit history") and fix your commits manually, also forgoing the possibility of fixing earlier unpublished commits.

Instead you could code and commit (you know, use the VCS :P) without worry in a private branch, checking for problems afterwards and fixing them using rebase prior to merging the commits into the main branch. (where "must never ever edit history" actually applies)

Sorry if I sound snarky, but that's what this seems like to me.

wnight · on Aug 1, 2011

Then what's the difference? You were talking about how you preferred uneditable histories, why does only rebase count?

alunny · on July 31, 2011

For very short, "oh there's a syntax error I missed" commits, "commit --amend" is very useful, and quicker than "rebase -i".

daemin · on Aug 1, 2011

"git commit --amend" is very useful if you realise you forgot to include some files in the last commit.

Although if you committed since then you might be better off adding a new commit with the missing files and then doing a "git rebase -i" to move and squash the commits as appropriate.

mark_story · on Aug 1, 2011

For the extra lazy, you can use git aliases to make `git amend` or `git ca`.

zwieback · on July 31, 2011

Nice post, thanks.

I've been using traditional RCSs for years but find that whenever I introduce SVN (or CVS before that) to a team it's very easy for new users to fall into bad habits around branching and committing transitory changes.

I'd like to try git to help manage the mess during the prototyping phase but I'm wondering how suitable it is for new users to learn git vs. learning svn.

Any opions out there on the suitability of git as a first version control system? My team consists of highly experienced engineers (EE/FW) with little or no software engineering experience.

mooneater · on July 31, 2011

they sound like smart people. why hobble them with svn in 2011?

i put off the transition as long as i could out of inertia (switched from svn in 08 out of desperation when i started needing a lot of branch and merging). but once you go git, you dont look back, not one bit.

ulrich · on July 31, 2011

When you are used to the SVN/CVS workflow, it takes a long time to get over it. It took me a long time to understand why the distributed approach is better, despite having read a lot about them. In my company we are using git as well, but most developers refuse to work anywhere else than on master. They probably had their share of trouble with branching in other systems.

eropple · on July 31, 2011

It's Mercurial-based, not git-based, but you might have them read HGInit, by Joel Spolsky:

http://hginit.com

Really, really approachable guide to how to properly use a DVCS.

mark_story · on Aug 1, 2011

I found the thing that tipped the scales for people who were set in their ways was walking them through a merge and showing that tree conflicts don't exist. And that merging doesn't need to take more than 5 minutes.

I work on a codebase that changes pretty fast, and git has hugely reduced time wasted screwing around with tree conflicts and SVN losing its lunch.

andrew311 · on Aug 1, 2011

I'm wondering how people address one of the scenarios raised in the post, specifically this:

"It’s safest to keep private branches local. If you do need to push one, maybe to synchronize your work and home computers, tell your teammates that the branch you pushed is private so they don’t base work off of it.

You should never merge a private branch directly into a public branch with a vanilla merge. First, clean up your branch with tools like reset, rebase, squash merges, and commit amending."

I'm wonder how people address cleaning a private branch that has been pushed (when your goal is to get its changes into master cleanly). Rebasing the private branch is pretty much out of the picture since it has been pushed (unless you don't care about pushing it again). I can see some ways of doing this:

1) You could do a diff patch and apply it master, then commit.

2) You could checkout your private feature branch, do a git reset to master in such a way that your index is still from the private, then commit it. Ex:

currently on private branch git reset --soft master

Now all the changes from the private branch are changes to be committed on master. This is easy, but it puts everything in one commit.

If you wanted to do a few commits for different, but stable points, but you already pushed the private branch and can't rebase it, you could instead do "git reset --soft" on successive points in the private branch commit chain, committing to master as you go.

If you wanted to reorder commits from the private branch, I guess you could rebase the private branch (which means you can't push again since you pushed it already), then do the tactic from the last paragraph, then ditch the private branch cause it's no longer pushable.

Does anyone have better ways of putting changes to master for private branches that have already been pushed?

gruseom · on Aug 1, 2011

Whether a branch is private and therefore can be rebased has nothing to do with whether there's a copy of it on the server. I push my work-in-progress to the server often for backup purposes anyway. If I want to rebase, I just push -f.

I can't think of why that would be a problem, but if someone objected to push -f on a private branch, I'd just make a new branch with a new name and push that. And if that were a problem, I'd just find another server to push -f to and only ever commit to master on the official server. But these are silly workarounds. Why make things harder than they need to be?

andrew311 · on Aug 1, 2011

This is true. If the assumption is that it's a private branch, then other people shouldn't care if you push -f because no one else should be using it.

Sometimes there are cases where people want to pull a private branch because they are working on something that is in the same code path but will be deployed after the private branch is integrated and deployed. They want to work off the newest code and avoid a larger merge to their private branch later. Would rebasing that private branch make their life harder? If so, one could always stage changes in a feature branch at stable points for them. Thoughts?

Basically, my understanding is that push -f can be a hassle for others to pull if they made commits to the same branch already. You're totally right that if it's truly a private branch, though, this should be irrelevant.

gruseom · on Aug 1, 2011

If someone wants to work off my private stuff I would tell them "sure, but be careful cause I'm push -f'ing" (after all, it's usually pretty easy to fix) and give them a heads-up when I do. If that weren't acceptable, I'd add a tag like "stable" to my branch, tell them to use it only up to that tag, and move the tag forward as the work progresses. If that weren't acceptable I'd make a branch instead of a tag and tell them to use that.

andrew311 · on Aug 3, 2011

Thanks! Excellent advice.

motherwell · on Aug 1, 2011

https://github.com/nvie/gitflow works really well. The original post http://nvie.com/posts/a-successful-git-branching-model/ was really compelling, and using it has really helped, at least what I do.

stretchwithme · on Aug 1, 2011

What really helped me grasp git was attending one of Scott Chacon's speeches on the topic. Scott works for github, knows what he's talking about and explains things thoroughly.

  http://www.youtube.com/watch?v=QF_OlomyKQQ

joelhaasnoot · on July 31, 2011

Hmm, this makes sense to me: lots of Git features I'd forgotten or not used before.

Can anyone sketch my "merging" strategy I should be using in my scenario: - Have 3 branches dev, stage and master - Bugs are fixed on master, bigger bugs/changes on stage and new features on dev - Big functionality changes/additions come in the form of new branches, which currently I first merge with dev, then with stage and if everything is OK, with master. This doesn't always work well due to the timing of things: sometimes my dev branch is out of date with the master and needs fixes from the master before applying.

How should I handle merging the branches?

cvandyck76 · on July 31, 2011

I wouldn't have two separate branches for bugfixes and then one for new features - as you noted, it can get hairy. Personally I find the git-flow model very straightforward.

Do normal feature development and bug fixes on the develop branch; save master for production releases. When it's time to make a release, cut a release branch (e.g. r/1.0.1) from the develop branch. Bug fixes that are made on that branch should also be merged into develop. Once the release is made, merge r/1.0.1 back into master and develop and continue on as normal.

Also see: https://github.com/nvie/gitflow

joelhaasnoot · on July 31, 2011

The problem is that between a feature being ready and it going into production there is a certain amount of QA/tweaking that goes on. Before I was running into issues where I couldn't fix a much smaller bug than the new functionality due to not having a branch for that. The flow handles that with hotfixes, which I guess works well.

I do think the numbering is excessive however: web software releases so often on a one man team it's mostly extra work.

mark_story · on Aug 1, 2011

I would create feature branches for anything bigger than a few commits. Once the branch is done, you can merge it into dev, then merge that same branch into stage/master if you want.

Typically I would try to merge feature->dev->stage->master. With issues found on stage those could be put directly onto stage and merged into master. I guess it depends where you base new branches off of and where the 'stable code' is.

I usually aim to merge less stable (softer) branches into more stable (firmer) branches. And base all new feature branches off of the most firm branch I have.

Maro · on July 31, 2011

Great post. Calls attention to the importance of having clean, stable commits in the 'master' branch and thus avoiding plain vanilla 'git merge' for 'squash' and 'rebase'.

http://stackoverflow.com/questions/2427238/in-git-what-is-th...

swah · on July 31, 2011

He should start the article with the last paragraph.

sandofsky · on July 31, 2011

People could then read the summary and skim through the rest. The summary is there just to help you remember.

If people don't internalize the reasoning, it's a disaster waiting to happen.

echostar · on Aug 3, 2011

Under "Declaring Branch Bankruptcy", why does the author throw in a "git reset" as the last step in the example.

endlessvoid94 · on July 31, 2011

After reading this, I finally motivated myself to read through the man pages for git pull, fetch, merge, and rebase.

Thanks :-)

trusko · on July 31, 2011

Good article. Thanks

jebblue · on July 31, 2011

Git is plain scary. We should stick with SVN.

pyre · on Aug 1, 2011

Personally I find the "git isn't enough like SVN" argument to be basically the "I stick with Windows because Linux is scary" or "MacOS X sucks because I tried it once and none of my Windows shortcut keys worked."

It basically comes down to:

1) You probably shouldn't just dive into git if you don't have someone that does know it to help you out (or unless you're willing to go seek out help from mailing lists, irc, etc).

2) You shouldn't assume that just because RCS/CVS/SVN are the only VCS's that you've ever used that it means that's what all VCS's should look/work like.

j-kidd · on Aug 1, 2011

Try http://hginit.com for a fantastic introduction to Mercurial for people familiar with SVN (or not).

Git was designed to suit kernel development (as shown in the article). For us simple-minded mortals who like SVN, it is much easier to migrate to Mercurial.

jebblue · on Aug 1, 2011

Good article, well written. If we are to start using distributed version control then I guess it might as well be git since it seems to have the most traction in the press.

koenigdavidmj · on Aug 1, 2011

http://hg-git.github.com/

That is a plugin for Mercurial, written by the Github people, to support targeting git repositories.