Sunday, August 23, 2009

Using a Mobile Source Code Repository

corona_laptop

If you’re a consultant or just do a lot of coding on the side either for profit or to keep your skills sharp, it’s often nice to be able to store your code in a source code repository. The benefits of a personal repository are almost the same as what you get with a repository shared by a team of developers (history-tracking of your code, the ability to branch and merge, easy roll-back of uncommitted changes); the only difference is that you’re the only user.

But setting up a reliable source code repository is not trivial, especially if you want to be able to access it while on the go. The easy route is to pay for a hosted solution, but that requires an internet connection (at least for source control operations), and a decent one usually costs money.

I’ve come up with a couple simple approaches that allow you to have a personal repository that you control, that can be accessed from anywhere whether you have an internet connection or not, and that can get backed up in the event of data corruption.

Virtualization

Before I jump into the details, I’d like to say a couple things about virtualization since it is a big component of the solutions I came up with (especially the second). While you can easily code using tools installed directly on your host OS, I highly recommend isolating your software development environments on one or more virtual machines (VM’s). There are several reasons for this:

  • It allows you to develop on operating systems that are more comparable to what your application may be running on in production. Most workstation OS’s (like Windows Vista or Windows 7) do not come with all of the server-side components or are just not quite configured the same as the server OS your app may be running on in production (Windows 2000 Server, Windows Server 2003, Windows Server 2008, etc). Some server-side application services (like SharePoint) won’t even run on a non-server OS.
  • It allows you to have multiple different development environments. If you have one project or client that uses Visual Studio 2005 and another that uses 2008, it’s safer to run them on separate operating systems to prevent interference.
  • It allows you to easily play with new beta development tools without hosing up your working development environments. Nothing like installing the latest beta of VS2010 and finding out VS2008 no longer loads your paying customer’s solution.
  • If your host OS is not Windows (ex: Mac OSX) and you want to develop Windows applications, you don’t have much of a choice!

Which virtualization option you go with is pretty irrelevant. For the longest time, I was a big user of Microsoft’s Virtual PC. It was free and it worked great. Since then I’ve moved to VMWare’s Workstation and Player on Windows and Fusion on the Mac. VMWare seems to have a slightly superior product at the moment, but Microsoft’s virtualization offering is not too far behind, especially when you consider the new functionality available in Windows 7.

Source Control System

There are a few good choices out there for source control.  Personally, I like Subversion (SVN) the best.  It’s free (open source) and a lot of very useful third-party tools are out there.  TortoiseSVN is an excellent Windows GUI client and is free.  AnkhSVN (now being developed by Collabnet, the company that hosts the SVN source control itself) provides source control integration into Visual Studio and is also free.  Finally, there’s VisualSVN.  They make a free server application called VisualSVN Server that allows you to get a Subversion repository up and running in minutes on a Windows server, complete with HTTPS access over Apache.  VisualSVN also makes a Visual Studio source control add-in that’s comparable to AnkhSVN, although arguably more robust.

The two solutions I came up with (Method 1 and Method 2 below) both use VMWare Workstation for the virtualization platform and Subversion for source control.  I’m sure one could adapt these steps fairly easily to work with other systems since the concepts behind them are what are important.

Method 1: Single VM with a Local Repository

When I first had the need to manage my own source control, I had a single VM I used to work on my code. 

single-vm

The source control solution in this case was quite simple since I could pretty much put everything on that VM.

Install a local repository

First I used TortoiseSVN to create a local file-based repository.  You can use the command-line Subversion tool to do this as well, but TortoiseSVN makes it easy:

 create-local-repository

I usually put the repository itself in a standard location (like C:\SVN) so the repository URL in my Working Copy becomes something like file:///C:/SVN/my-repository:

checkout
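For anyone who prefers the command-line tools over TortoiseSVN, here is a minimal Python sketch of the same two steps. It only builds the `svnadmin create` and `svn checkout` commands and shows how the Windows path maps to the file:/// URL. The working copy path C:\Projects\my-project and the function names are hypothetical, chosen just for illustration.

```python
from pathlib import PureWindowsPath


def repo_url(repo_dir):
    """Turn a Windows repository path into the file:/// URL Subversion expects."""
    return PureWindowsPath(repo_dir).as_uri()


def create_and_checkout_commands(repo_dir, working_copy):
    """Return the two commands: create the repository, then check it out."""
    return [
        ["svnadmin", "create", repo_dir],
        ["svn", "checkout", repo_url(repo_dir), working_copy],
    ]


# Hypothetical working copy location; the repository path is the one used above.
commands = create_and_checkout_commands(r"C:\SVN\my-repository", r"C:\Projects\my-project")
for command in commands:
    print(" ".join(command))
```

Each list could be handed to `subprocess.run` on a machine where the Subversion command-line client is installed and on the PATH.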

Configure repository backup

You can do this a number of ways, but you want to make sure that whatever is backing up the repository is doing it automatically on a scheduled basis, so you don’t have to think about it.  An online backup service (like Mozy.com) works well, but there’s a monthly service charge.  I do all my backups on my wife’s computer, which is a MacBook.  Therefore I can use TimeMachine, which is integrated into Mac OS, to back up everything important, including my source code repository.

However, my repository is still sitting in the file system of my VM which isn’t hosted on the MacBook – it’s on my laptop, which is running Windows.  To sync the repository files to the MacBook so they can get backed up I use Windows Live Sync (used to be called FolderShare).  It’s a free file syncing tool from Microsoft that can sync between multiple PC’s and Mac’s via your local network or over the internet and even through firewalls.  All you need is a Windows Live ID to get started.

So as long as all my source code is developed on a single VM, I can keep the source control repository locally on that VM as a simple file-based SVN repository and use Live Sync and TimeMachine to keep my repository files backed up.  Also, if I go mobile and/or the MacBook isn’t connected to the internet (so Live Sync can’t sync) I still have access to the repository so I can do all the source control operations I may need to keep developing code.  Then the next time my VM is able to connect to the Mac, it will sync the repository files so they can be included in the next backup.
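The setup above syncs the live repository folder directly. One optional variation, which is not part of the setup described here, is to take a snapshot with `svnadmin hotcopy` first and point the sync tool at the snapshot folder, so a commit in progress can’t be captured half-written. A rough sketch, assuming a hypothetical backup folder C:\SVN-Backup and a dated folder name:

```python
from datetime import date
from pathlib import PureWindowsPath


def hotcopy_command(repo_dir, backup_root, day):
    """Build an 'svnadmin hotcopy' command that writes a dated snapshot folder."""
    repo_name = PureWindowsPath(repo_dir).name
    target = PureWindowsPath(backup_root) / f"{repo_name}-{day.isoformat()}"
    return ["svnadmin", "hotcopy", repo_dir, str(target)]


# C:\SVN-Backup is a made-up location; the date is the day this post was written.
snapshot = hotcopy_command(r"C:\SVN\my-repository", r"C:\SVN-Backup", date(2009, 8, 23))
print(" ".join(snapshot))
```

Run from a scheduled task, something like this would leave one folder per day for the sync tool to pick up. Old snapshots would need pruning, which this sketch does not attempt.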

Method 2: Multiple VM’s with a Shared Repository

The above solution works great until you decide to write your code on more than one VM.  Perhaps you have a couple projects going that require different development stacks.  Maybe one is an ASP.NET application running on IIS6 (Windows Server 2003) and another is on IIS7 (Windows Server 2008).  In that case you really should develop your applications on separate VM’s, each running the correct version of Windows.

second-dev-vm

Of course, now the problem is accessing the file-based source control repository across multiple VM’s.  One option is to move the SVN repository files to the host OS.  Most virtualization solutions like Microsoft Virtual PC and VMWare Workstation have a file sharing feature where a VM can access files on the host OS.  While this works, there’s a very noticeable performance hit.  The other drawback to putting the SVN repository on the host OS itself is that it’s mixing a part of your development environment infrastructure into the host, which conflicts with why you moved everything to virtual machines in the first place.

A better approach is to move the repository to a dedicated VM that can act as a light-weight server and handle source control operation requests from all the client VM’s.  Subversion can be run as a server in this way.  It uses Apache to handle HTTP requests from clients and it seems to perform nearly as well as the local file-based repository.

Let’s take a look at how to set all this up.

Create a source control server VM

The first step is to create this server VM which will host our shared repository and handle client SVN requests from the other VM’s.  This VM won’t require a lot of RAM since it’s primarily just going to serve up requests for source control operations.  I created mine using Windows Server 2008, which has a minimum memory requirement of 512MB, so that’s what I went with.

server-vm

Before we move on to installing additional software on the source server VM, we need to take care of some necessary networking infrastructure.

Create a private virtual network to connect all the VM’s

One of the other advantages of most virtualization systems out there is the ability to create virtual networks.  If you have multiple VM’s that need to communicate with one another and especially if that communication needs to be private (or doesn’t need to occur over a public network), you can create a virtual network.

In my situation, I ended up using one of the available custom networks installed by VMWare Workstation (vmnet2 in my case).  You don’t want to use the Host-Only, Bridged, or NAT networks as we need something that’s private that we can dedicate to these VM’s.  To turn this on, I just added a network adapter to all the VM’s, including the server VM, and connected it to the vmnet2 virtual network:

virtual-network

This new network adapter showed up in the OS (Windows) of all the VM’s.  To make them easier to manage, I renamed the adapters from “Local Area Connection” and “Local Area Connection 2” to “Public LAN” and “Private LAN” in each VM:

Windows-network-adapters

Set up DNS and DHCP services on the server VM

Simply connecting the VM’s together with our new private network connection doesn’t make it so they can communicate (at least not very well) over the private network.  It’s the equivalent of connecting a bunch of computers to an Ethernet hub or switch with network cables.  Each host will default to a 169.254.x.x address, and name resolution won’t work at all if each OS has its default firewall turned on.

We need some way of handing out IP’s and, ideally, DNS name resolution to those IP’s (the most important one being the server).  To do that we first need to pick an IP subnet and a static IP for the server VM.  In my case I went with 192.168.76.x for the subnet and 192.168.76.1 for the static IP.  I then set the IP address of the “Private LAN” network adapter on the server VM to that static IP, which is done in Windows itself:

static-ip-config
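To make the addressing plan concrete, here is a small Python sketch using the standard `ipaddress` module. It assumes the 192.168.76.x subnet uses a 255.255.255.0 mask (a /24), which is not spelled out above, and the function names are just for illustration. It also shows how to recognize the self-assigned 169.254.x.x addresses that the VM’s fall back to when no DHCP server answers.

```python
import ipaddress

# Assumption: "192.168.76.x" means a /24 network (mask 255.255.255.0).
PRIVATE_SUBNET = ipaddress.ip_network("192.168.76.0/24")
SERVER_IP = ipaddress.ip_address("192.168.76.1")


def on_private_lan(address):
    """True if the address belongs to the private virtual network."""
    return ipaddress.ip_address(address) in PRIVATE_SUBNET


def is_self_assigned(address):
    """True for 169.254.x.x link-local addresses (no DHCP lease was obtained)."""
    return ipaddress.ip_address(address).is_link_local


print(SERVER_IP in PRIVATE_SUBNET)
print(is_self_assigned("169.254.10.20"))
```

A client VM reporting an address for which `is_self_assigned` is true on its “Private LAN” adapter has not yet received a lease from the server VM.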

Next we need to turn on DNS services on the server VM so it can resolve DNS names local to our virtual network.  After installing the service, I created a forward lookup zone for a private DNS domain called “ts-local”.  You can use whatever private DNS domain name you want as long as it isn’t something that might conflict with domain names on the internet or any other network you may be connected to.

Then I added an A-host for the server VM itself whose hostname is “MCP”:

dns-server

Next we need to enable the DHCP server service on the server VM so we can hand out the IP’s in our subnet.  Once you install the service, configure it to hand out IP’s in our subnet as well as specify the DNS server IP (same as the server VM) and the domain name (in my case “ts-local”):

dhcp-server

Finally, we should be able to release and renew the IP address on each of the client VM’s and get an IP in our subnet.  We should also be able to resolve the IP address of the server VM from the client VM’s using PING:

ping
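If you would rather script this check than eyeball PING output, a sketch along these lines could be run on each client VM. It assumes the server’s full name is mcp.ts-local (host “MCP” in the “ts-local” zone) and that it should resolve to 192.168.76.1. The `resolve` parameter is a hypothetical hook so the function can be tried out without a live DNS server.

```python
import socket


def server_resolves(hostname, expected_ip, resolve=socket.gethostbyname):
    """Return True if hostname resolves to expected_ip, False on mismatch or lookup failure."""
    try:
        return resolve(hostname) == expected_ip
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


if __name__ == "__main__":
    # On a client VM this uses the real resolver (the DNS server handed out by DHCP).
    print(server_resolves("mcp.ts-local", "192.168.76.1"))
```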

Install VisualSVN Server on the server VM

Now for the real reason we created this server VM in the first place.  Setting up a Subversion server that uses Apache (the web server that SVN uses) by hand is no easy task (trust me, I’ve done it).  VisualSVN Server makes this a snap.  It will install Apache, configure HTTPS, and create a location for the repositories.  By default, VisualSVN Server creates the repositories directory at C:\Repositories.  I prefer C:\SVN.

visualsvn-server-manager

After the install is complete, all you have to do is create a username for your client VM’s to use to access the repositories.

Move the repository

If you started with Method 1 like I did, you need to move the file-based repository from the single VM to our shiny new Subversion server VM.  VisualSVN Server makes this easy with its import option:

import-repository

Simply browse to where the repository files are and perform the import.  You may want to copy them over to a temporary directory on the server VM first to make them easily accessible from the server’s file system.

Once the repository has been imported, you’ll want to give the user account you created earlier read/write access to the repository:

asign-user

Your server is now ready to service source control requests!

Point the Working Copies to the new repository

If you already have Working Copies (the term Subversion uses for directories that are under source control) on your original development VM, you can point them at the new URL using TortoiseSVN’s Relocate function.  For example, if you had a working copy connected to file:///C:/SVN/my-repository, you can now point it to http://mcp.ts-local/SVN/my-repository:

relocate-1

relocate-2
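The command-line equivalent of TortoiseSVN’s Relocate is `svn switch --relocate`, run against the working copy. The sketch below only builds that command from the two URLs mentioned above. The working copy path and function name are hypothetical. On Subversion 1.7 and later the same operation is also available as `svn relocate`.

```python
def relocate_command(old_url, new_url, working_copy="."):
    """Build the command that repoints a working copy from old_url to new_url."""
    return ["svn", "switch", "--relocate", old_url, new_url, working_copy]


relocate = relocate_command(
    "file:///C:/SVN/my-repository",
    "http://mcp.ts-local/SVN/my-repository",
    r"C:\Projects\my-project",  # hypothetical working copy location
)
print(" ".join(relocate))
```

Relocate only rewrites the repository URL stored in the working copy; it assumes the new URL points at the same repository that was just imported.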

Configure repository backup

The final step to getting the multiple-VM source code repository operational is to configure the backup.  The approach is the same as with the single-VM approach – you just have to back up the repository files on the server VM instead.  In my case, that directory is C:\SVN.  I point Live Sync there which syncs the files to my MacBook which in turn backs everything up via TimeMachine.  Very nice!

The multi-VM solution also has the same disconnected benefits that the single-VM solution does.  You can perform all your source code operations anytime you want, even if your host machine is disconnected from the network.  The next time you are connected, Live Sync will sync any repository file changes to the MacBook so they can be backed up.

2 comments:

Richard Collette said...

Have you looked at Git or Mercurial which are peer to peer (local) source code repositories?

Pete said...

I haven't had a chance to dive into Git, although it is on my list. When it first came out, there wasn't very good support for Windows. However, I think things have improved significantly.

The way I understand Git is that everyone gets their own copy of the repository, almost like a giant branch. All commits go against the local. Then, when it makes sense, all changes are sync'd up with a central repository.

I think you could adapt my approach to use Git instead and it would probably work better than Subversion since you could get rid of the need for a 3rd party synchronization tool (like FolderShare). This is probably what you were alluding to in your comment.

Thanks!