Copying from a VirtualBox drive on Linux

How to install a virtual machine, and then copy to the real thing

[2008-04-12]

Our workstations all run Linux (except for one Windows box we use for testing, and Christian's machine that dual-boots for gaming).

And workstations should easily be swap-out-able. If something happens to one, you drop in another one that looks just like the first. But we use more than just vanilla Ubuntu. We authenticate against an OpenLDAP database with PAM LDAP, and our home directories are mounted with autofs. We all need PHP5, MySQL server, phpMyAdmin, and I'll chuck on Django too, because one day (cue fade-in dream of future) PHP will be a distant memory, and we'll be adding cunning features to glorious web applications written in beautiful Python. (PHP is to Python what Allen Ginsberg is to e. e. cummings. IMNSHO. Both very good at what they do, but only one succinctly and aesthetically gratifying.)

What I used to do was set up a workstation exactly as I wanted it. Then I'd back it up, and duplicate it on other workstations as necessary.

VirtualBox makes it one step easier. In summary, you install your operating system in a virtual machine, on a fixed size virtual drive, tweak it until you are happy with it, and then you can mount the virtual drive as a loop device, and copy everything from it that way.

Here are the details. I'm going to assume that you are running Linux as your operating system, that you have installed VirtualBox, that you have installed your to-be-cloned operating system on your virtual drive, and tweaked it until your codecs are installed, your .vimrc file is in place, and your wallpaper is perfect.


  1. Here comes the fun part. The trick to mounting a VirtualBox fixed size virtual drive is finding the offset -- you need to skip the first part of the virtual drive to get to the first partition. We're going to do this using VirtualBox's vditool. (And if you don't have a fixed size virtual drive, you will need to create one from your dynamic size drive, also using vditool.) You can download it from
    http://www.virtualbox.org/download/testcase/vditool
    Right-click and choose "Save Link As ...".

  2. It's a 32-bit binary. First, move it somewhere useful, like /usr/local/bin/. Then make it executable with
    sudo chmod +x /usr/local/bin/vditool

    (I'm going to write this as if your host operating system is Ubuntu. If not, where you see "sudo", you need to run the command as root.)

  3. Next, we need to make sure we have the libraries required by vditool:
    ldd vditool

    The VirtualBox package for some distributions places the VirtualBox libraries where the dynamic linker finds them without any help. Ubuntu's doesn't: it places them in /usr/lib/virtualbox/, so you have to point LD_LIBRARY_PATH at that directory yourself. Your distro might have put them in /opt/VirtualBox/. You can find out with ...
    locate VBoxDD.so

    Also, my workstation didn't have libstdc++5 installed. So I saw this:
    $ ldd vditool
    linux-gate.so.1 => (0xffffe000)
    libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0xb7f60000)
    libuuid.so.1 => /lib/libuuid.so.1 (0xb7f5d000)
    librt.so.1 => /lib/tls/i686/cmov/librt.so.1 (0xb7f53000)
    libdl.so.2 => /lib/tls/i686/cmov/libdl.so.2 (0xb7f4f000)
    VBoxDD.so => not found
    VBoxRT.so => not found
    libstdc++.so.5 => not found
    libm.so.6 => /lib/tls/i686/cmov/libm.so.6 (0xb7e6f000)
    libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xb7e64000)
    libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7d1a000)
    /lib/ld-linux.so.2 (0xb7f8a000)

    I installed libstdc++5 with:
    sudo apt-get install libstdc++5


  4. Now we're ready. Change into the directory where your VDI file is found.
    (If you have a dynamic size drive, ddcopy it to create a fixed size drive, like so...
    LD_LIBRARY_PATH=/usr/lib/virtualbox/ vditool DDCOPY mydynamicdrive.vdi mydrive.vdi

    (I did not do this step, so please correct me if I'm wrong.) Your LD_LIBRARY_PATH will be what you learned from the "locate" command. If "ldd" found the VirtualBox libraries, VBoxRT.so and VBoxDD.so, you can leave out the LD_LIBRARY_PATH stuff.)

  5. To find the offset to your data, type ...
    LD_LIBRARY_PATH=/usr/lib/virtualbox/ vditool DUMP mydrive.vdi | grep offData

    where mydrive.vdi is the name of your virtual drive.
    If your VirtualBox libraries VBoxDD.so and VBoxRT.so were found by ldd above, you can just use ...
    vditool DUMP mydrive.vdi | grep offData

    You'll get something like ...
    Header: offBlocks=512 offData=41472


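  6. The offData figure is the one we need. Add 32256 to that: the first partition traditionally starts 63 sectors of 512 bytes into the disk, and 63 x 512 = 32256. (See why at Forensic Incident Response.) So for this example, 41472 + 32256 gives us 73728. Use that as your mount offset ...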
    sudo mount -t ext3 -o ro,loop,offset=73728 mydrive.vdi /mnt
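
If you would rather not do the sum by hand, steps 5 and 6 can be sketched as a small shell function. This is only a sketch: the name mount_offset and the variables are made up for the example, and it assumes the DUMP output contains an "offData=" figure as shown in step 5. It prints the mount command instead of running it, so you can check it first.

```shell
#!/bin/sh
# Sketch only: mount_offset and PARTITION_GAP are names invented for this example.

# Gap before the first partition: 63 sectors of 512 bytes each.
PARTITION_GAP=32256

# Read "vditool DUMP" output on stdin and print offData + PARTITION_GAP.
mount_offset() {
    offdata=$(sed -n 's/.*offData=\([0-9][0-9]*\).*/\1/p' | head -n 1)
    [ -n "$offdata" ] || return 1   # no offData figure found
    echo $((offdata + PARTITION_GAP))
}

# Demonstration using the sample line from step 5.
offset=$(echo "Header: offBlocks=512 offData=41472" | mount_offset)

# Print the mount command rather than running it.
echo "sudo mount -t ext3 -o ro,loop,offset=$offset mydrive.vdi /mnt"
```

With a real drive you would feed it the vditool output instead of the sample line, for example: LD_LIBRARY_PATH=/usr/lib/virtualbox/ vditool DUMP mydrive.vdi | mount_offset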


Voila! All your files!

I tarred mine up, and then copied the tar to a flash drive. A useful way to omit "lost+found" is:


cd /mnt
sudo tar -czf /tmp/mydrive.tar.gz `ls -1 | egrep -v "^lost\+found$"`
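
The backtick trick skips hidden files at the top level and trips over names with spaces in them. An alternative, sketched here on the assumption that you have GNU tar, is to let tar do the excluding. The function name archive_tree is made up for the example, and the demonstration runs against a scratch directory rather than /mnt.

```shell
#!/bin/sh
# Sketch only, assuming GNU tar; archive_tree is a name invented for this example.

# archive_tree SRC_DIR ARCHIVE: tar up SRC_DIR, leaving out its top-level lost+found.
archive_tree() {
    tar -C "$1" --exclude=./lost+found -czf "$2" .
}

# Try it on a scratch directory first.
scratch=$(mktemp -d)
mkdir -p "$scratch/src/lost+found" "$scratch/src/home/my files"
touch "$scratch/src/.hidden" "$scratch/src/home/my files/notes.txt"
archive_tree "$scratch/src" "$scratch/mydrive.tar.gz"

# List the archive to confirm that lost+found is missing and the rest is there.
tar -tzf "$scratch/mydrive.tar.gz"
rm -rf "$scratch"
```

For the mounted drive, the equivalent call would be: sudo tar -C /mnt --exclude=./lost+found -czf /tmp/mydrive.tar.gz .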

Thank you,
hogfly at http://forensicir.blogspot.com/
and murali at http://muralipiyer.blogspot.com/



Mike Duskis writes:

Norman,

Thank you for your blog on copying a VirtualBox drive on Linux. Until I read it, I did not realize that .VDI files were just ordinary disk images with a prefix. After I read it, I did some exploration with a hex editor and made a discovery that might interest you.

There is no guarantee that Sun won't keep changing the format, but at least as of VirtualBox 2.1, the .VDI headers for a fixed hard drive are exactly 1024 bytes long, so the partition table starts at offset 1024 and the first partition starts at 1024+32256=33280.

I learned this by formatting the first partition as FAT16 and then searching for the telltale start of a FAT partition: 0xEB3C90. I was then able to mount it on a loopback device.

There was no need to fiddle with vditool with all of its unsupported weirdness.

-- Mike

Thank you very much, Mike.
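
For anyone who wants to check Mike's numbers without a hex editor, here is a minimal sketch. It does not search the file; it just reads three bytes at the offset he worked out and prints them as hex, so you can compare them with the FAT signature he mentions. The function name hex_at is made up for the example, and the 1024-byte header is Mike's figure for VirtualBox 2.1, so treat it as an assumption for any other version.

```shell
#!/bin/sh
# Sketch only: hex_at is a name invented for this example.

# hex_at FILE OFFSET COUNT: print COUNT bytes of FILE, starting at OFFSET, as lowercase hex.
hex_at() {
    dd if="$1" bs=1 skip="$2" count="$3" 2>/dev/null | od -An -tx1 | tr -d ' \n'
}

# Mike's figures: 1024 bytes of VDI header plus the 32256 bytes before the first partition.
offset=$((1024 + 32256))
echo "first partition expected at byte $offset"

# With a real image whose first partition is FAT16, you might then try:
#   [ "$(hex_at mydrive.vdi "$offset" 3)" = "eb3c90" ] && echo "FAT boot sector found"
#   sudo mount -t vfat -o ro,loop,offset="$offset" mydrive.vdi /mnt
```

If the three bytes are not eb3c90, either the header length has changed or the first partition is not FAT, and the vditool route above is the safer bet.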