Essays and rants on libraries, technology, webdev, etc. by Ruth Collings

Software Carpentry: Git

This past Tuesday and Wednesday I attended a Software Carpentry for Librarians session as a mentor. I had attended a previous session intended for scientists back in June. While talking to the lead instructor, Greg Wilson, on Tuesday, I mentioned that there is now a GitHub client for Mac that is surprisingly okay. He got very excited because SC has had a hard time teaching Git to people so far. Git is a really irritating and, one might say, overly complicated program. Even if you teach novices some of how to use the command line and Python first, Git remains advanced magic. So starting people off on a client might be a useful way to explain Git without having to teach them directly from the command line.

So, suddenly, I was teaching Git the following day. It turns out a client for Windows has also appeared since I last looked (a few months ago), and development on these clients is definitely happening very rapidly. I had taught myself Git from a book back in the fall and had brushed up on it since, but I am by no means a power user (as you can see from my GitHub account). Good thing I've taught stuff I don't know very well on short notice before!

Usually when I come up with a lesson plan it involves sitting and thinking a lot. What is my audience? Where should I start? How theoretical should I be, or practical? What can I use as example data? What can I gloss over and what is important? Should I come up with an exercise?

I did not have a lot of time to spend mulling over pedagogy, so I decided to mostly wing it when it comes to pace and level of difficulty. This is easier to do with adult learners than undergraduates, fortunately, and I knew it would be a group of < 10, so it's easier to keep everybody together. I came up with an example and ran through it to make sure it worked, but of course on the day it screwed up and I had to use something else. I didn't have notes or a script, although I read through Software Carpentry's lessons on Git to see what they thought was important, and then went back to the book I had learned from: Version Control with Git from O'Reilly Press.

It seems like I don't have the mental energies to be self-conscious and teach/speak at the same time, so I am going to include notes taken by one of the attendees below on the actual content I covered. I don't think we covered a whole lot, but we managed to hit all the major points I wanted to talk about in only 1 hour. If we had another hour I would have had everybody go through the entire lesson again on the command line now that they knew how to do it in the client.

One other thing I would have liked to have covered is how a whole workflow using Git looks. Starting a new local repository, working on files in it, taking time to commit changes periodically, and pushing to the remote repository. Then someone else pulling from the remote repository and making changes locally and then pushing them back up. And dealing with merges from an administrative point of view. Sharing teaching resources in Git repositories could be especially useful!
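
For the curious, the workflow described above can be sketched as a short Python script that shells out to git. This is not what we did in the session (we used the GUI client), just a rough sketch under some assumptions: git is installed, a local bare repository stands in for GitHub, and every name here (remote.git, mine, theirs, lesson.md, exercise.md, the `main` branch, the commit messages) is made up for illustration.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def git(*args, cwd):
    """Run one git command inside `cwd` and return its standard output."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def set_identity(repo, name):
    """Give a repository a throwaway author so commits work anywhere."""
    git("config", "user.name", name, cwd=repo)
    git("config", "user.email", "someone@example.org", cwd=repo)


def demo_workflow(base):
    """Init, commit, push, clone, push again, and pull, all inside `base`."""
    base = Path(base)
    remote = base / "remote.git"  # hypothetical stand-in for a hosted repository
    mine = base / "mine"          # first person's working copy
    theirs = base / "theirs"      # second person's working copy

    # A bare repository plays the role of the shared remote.
    git("init", "--bare", str(remote), cwd=base)
    git("symbolic-ref", "HEAD", "refs/heads/main", cwd=remote)

    # Start a new local repository, work on a file, and commit it.
    mine.mkdir()
    git("init", cwd=mine)
    git("symbolic-ref", "HEAD", "refs/heads/main", cwd=mine)
    set_identity(mine, "First Person")
    (mine / "lesson.md").write_text("# Lesson plan\n")
    git("add", "lesson.md", cwd=mine)
    git("commit", "-m", "Add lesson plan", cwd=mine)

    # Push the commit to the remote repository.
    git("remote", "add", "origin", str(remote), cwd=mine)
    git("push", "origin", "HEAD", cwd=mine)

    # Someone else clones the remote, makes a change, and pushes it back up.
    git("clone", str(remote), str(theirs), cwd=base)
    set_identity(theirs, "Second Person")
    (theirs / "exercise.md").write_text("# Exercise\n")
    git("add", "exercise.md", cwd=theirs)
    git("commit", "-m", "Add exercise", cwd=theirs)
    git("push", "origin", "HEAD", cwd=theirs)

    # Back in the first copy, pull the other person's change.
    git("pull", "--ff-only", "origin", "main", cwd=mine)

    # Commit summaries in the first copy, newest first.
    return git("log", "--format=%s", cwd=mine).splitlines()


if __name__ == "__main__":
    if shutil.which("git"):
        with tempfile.TemporaryDirectory() as scratch:
            for summary in demo_workflow(scratch):
                print(summary)
    else:
        print("git is not installed, so there is nothing to demonstrate")
```

Nothing here conflicts, so the pull is a simple fast-forward. The merge headaches show up when both people change the same lines before syncing, which this sketch deliberately avoids.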

Going back through the notes, there are things I said that were vast oversimplifications (e.g. difference between fork and branch) and I'm sure Real Nerds would love to freak out over it. Figuring out how much to simplify without actually saying things that are untrue is something that definitely requires more forethought.

The GUI client for GitHub isn't really functional at a beginner level yet. I have hope for it, but the hardest stuff to do in Git (i.e. merges) isn't supported in the GUI, so that kind of makes it pointless from a teaching perspective. Afterwards I was pointed to a Mac client that seems to have everything a power user could want: GitBox. But, again, it's only for Mac, which makes it impossible to use in Software Carpentry. I'm going to try it out for myself though.

At the beginning and the end of the session we had some good discussion about how one would go about implementing Git in your workplace as a librarian or in a corporate environment. We all know that most librarians work in Word and Excel, which are not very compatible with most version control software. Git works best with plain text, i.e. code. There was some talk about using Markdown for yourself instead and exporting to whatever format you need to share (PDF, usually). There was also conversation about how you could create a web interface instead of GitHub for self-hosting within organizations. If you were handling patron data, for example, that would probably fall under laws that require you to keep that information hosted on Canadian servers. There are plenty of options for that nowadays.
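
To show why plain text matters, here is a minimal sketch using Python's standard difflib module to pull out the changed lines between two versions of a text file. It is not how Git computes diffs internally, only an illustration of the kind of line-by-line comparison that plain text makes possible; the function name and the sample "opening hours" text are made up.

```python
import difflib


def line_changes(old_text, new_text):
    """Return only the removed ("- ") and added ("+ ") lines between two texts."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    return [
        line
        for line in difflib.ndiff(old_lines, new_lines)
        if line.startswith(("- ", "+ "))
    ]


if __name__ == "__main__":
    before = "# Hours\nMonday: 9-5\nTuesday: 9-5\n"
    after = "# Hours\nMonday: 9-5\nTuesday: 9-8\n"
    for change in line_changes(before, after):
        print(change)
```

A Word or Excel file is a binary blob rather than a series of readable lines, so a comparison like this has nothing meaningful to show for it.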

Overall I think this ended up being an interesting introduction to Git and all its weird terminology and uses, but didn't go far enough. I think I would need 2.5 hours to really solidly be sure a group of < 10 people came out okay in Git. And more than 12 hours of preparation. I hope my students learned enough for the session to have been worth it, despite everything, because I know Git is something everybody hears about nowadays. Given how close records management is to other aspects of librarianship, we should be all over version control! If only Git were easier to learn (and teach).

As an addendum, doing this session also pointed out how using a middle-aged computer can make teaching more difficult. The Mini DisplayPort (used to connect to HDMI) is super fussy and if it's moved in the wrong way it disconnects. For some reason my entire computer shut down after I plugged it into the monitor. And, of course, the battery has been telling me to replace it for almost a year now. Hang in there, Ghanima!


Thank you to Mari Vihuri for taking notes during the session.

Why version control
Recommended Git/version control resources:
Installing plain vanilla Git:
Version Control with Git from O'Reilly Press
Specially recommended for people who prefer to learn WHY rather than just WHAT
Tends to assume you know some command line stuff, use Google!
(hint, if your favorite library subscribes to Safari Books Online, chances are you can read this book online for free - here's the direct link if you have a Toronto Public Library account:
GitHub for Mac GUI (point and click) client
GitHub for Windows client:
Alternative git client for Mac
And in case you haven't already seen it, GitHub and Code School's git-in-fifteen-minutes tutorial:

  • Git is version control.

  • GitHub is a website that lets you put your files in repositories online. Lots of programmers use this to put their open source code online. Allows folks to collaborate on code.

  • Great for libraries! We like sharing things! When you make a new bash script using your new skills, you can use git to share it with the world!

  • Most people use git via the command line, but the commands are really esoteric. [Refer to Greg's rant about git being unfriendly to novice users.]

  • We're going to be using the GitHub GUI client (for Mac or for Windows).

  • Other popular option: Subversion. You can "check" files in and out of a central repository. Git, on the other hand, is decentralized. You can have multiple copies of things, and Git will allow you to manage them, merge them back together, etc. Pros and cons to both models.

  • We're going to start by setting up our own repository on our computer, and then show you how to sync your changes to GitHub.

  • Repositories on the left-hand side. You can have private repositories on your own computer if you want, or you can have repositories online.

  • Fork is the verb, branch is the noun. Split off the main code.

  • "The Internet doesn't do spaces." If you're going to turn a folder that already exists into a repository, suggest renaming it so it doesn't use spaces or special characters so it's web-friendly.

  • Commit your changes: signing the changes you make. You have to type in a summary that describes your changes.

  • Push your changes: upload your changes to the repository, make it public.

  • Roll back

  • Revert

  • Merge

  • You need to make sure you're paying attention to what branch you are on. GUI makes this easier to watch for than the command line.

  • Watch out for the order in which you merge things.

  • You can always roll things back!

  • Clone = copy.

  • Social features like watching and starring.

  • Clone in desktop.

  • Etiquette: you don't have to ask if you want someone to clone a project! That's the point of GitHub, it's up there for that purpose.

  • But, it's a bit rude to go right into someone's master directory and start making changes. That's why you make a branch and fork. Then you can message the owner to let them know you've done this and ask whether they want to add your changes to their master. This is called a pull request.

  • If they say yes, your branch will be merged into the master.

  • Unless you have a paid account everything is public.

  • You can use Markdown on git.

  • You can turn on the sync/public option so it's automatic, but it's safer not to.

  • Bitbucket is another hosting service that uses Git, Mercurial, or Subversion. Self-hosted options too.

  • Bottom line: it doesn't matter WHAT version control software you use, just that you use one.

  • diff - from Unix days.

  • GitHub is optimized for code, but you can use all sorts of other docs too (Word docs, etc.). It just won't be able to visually show you the line-by-line changes outside of plain text files.

  • The GitHub client fails once you reach the point where you have a lot of different commits on the same branch and they conflict.
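
The note above about renaming folders so they are web-friendly can be sketched in a few lines of Python. This is only one way to do it, and the rules are assumptions rather than anything Git or GitHub requires: lowercase everything, keep letters, digits, dots, underscores, and hyphens, and turn every other run of characters into a single hyphen. The function name is hypothetical.

```python
import re


def web_friendly_name(folder_name):
    """Turn a folder name into one without spaces or special characters."""
    name = folder_name.strip().lower()
    # Replace each run of disallowed characters with a single hyphen.
    name = re.sub(r"[^a-z0-9._-]+", "-", name)
    # Drop any hyphens left dangling at either end.
    return name.strip("-")


if __name__ == "__main__":
    print(web_friendly_name("My Lesson Plans (2013)!"))
```

Rename the folder before turning it into a repository, so the repository name and the folder name match from the start.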
