This weekend I had no internet connection thanks to a DSL upgrade (well, more like downgrade, since I’m getting half the speed I got before) and since I had Python, PIL and a webcam, I decided to see what I could do.

After playing around a bit with PIL and motion recognition in images, I decided to write an image stitcher.

When writing something with a tight loop, you need a way to display progress; I was tired of guessing how long a run would take until it was done, so I wrote this little module (save it as progress.py if you want to use it).

The latest code is also available on GitHub. This code might be a bit behind, so get the GitHub version if possible.

Here’s the code:

# Progress module
# Copyright: Stavros Korokithakis, 2007
# License: GNU LGPL
import time
import unittest

class Progress:
    """Track the progress of an operation and calculate the projected time to
    its completion."""
    def __init__(self, totalitems, timeasstring = True):
        """Create a Progress instance. totalitems must be the total number of
        items we intend to process, so the class knows how far we've gone."""
        self.__totalitems = totalitems
        self.__starttime = time.time()
        self.__timeasstring = timeasstring

    def timetostr(self, duration):
        """Convert seconds to D:H:M:S format (whichever applicable)."""
        # The verbosity is because this class should be as light as possible,
        # since it will probably be used in loops (hopefully not called very
        # often, but still).
        duration = int(duration)
        timelist = [duration / 86400, (duration / 3600) % 24]
        timestring = ""
        printall = False
        for item in timelist:
            printall |= item
            if printall:
                timestring += str(item).zfill(2) + ":"
        timestring += str((duration / 60) % 60).zfill(2) + \
        ":" + str(duration % 60).zfill(2)
        return timestring

    def progress(self, itemnumber):
        """We have progressed itemnumber items, so return our completion
        percentage, items/total items, elapsed time and projected total
        time."""
        elapsed = time.time() - self.__starttime
        # Multiply by 1.0 to force conversion to float.
        percentcomplete = (1.0 * itemnumber) / self.__totalitems
        try:
            total = int(elapsed / percentcomplete)
        except ZeroDivisionError:
            total = 0
        if self.__timeasstring:
            return (
                int(percentcomplete * 100),
                itemnumber,
                self.__totalitems,
                self.timetostr(elapsed),
                self.timetostr(total),
            )
        return (
            int(percentcomplete * 100),
            itemnumber,
            self.__totalitems,
            int(elapsed),
            total,
        )

    def progressstring(self, itemnumber):
        """Return a string detailing the current progress."""
        timings = self.progress(itemnumber)
        if itemnumber == self.__totalitems:
            progstr = "Done in %s, processed %s items.        \r" % (timings[3], timings[2])
        else:
            progstr = "Progress: %s%%, %s/%s items, %s/%s time.\r" % timings
        return progstr

class UnitTests(unittest.TestCase):
    def setUp(self):
        self.progress = Progress(100)

    def testtimetostr(self):
        tests = [
            (10, "00:10"),
            (75, "01:15"),
            (4000, "01:06:40"),
            (87123, "01:00:12:03"),
            (187123, "02:03:58:43"),
        ]
        for test, result in tests:
            self.assertEqual(self.progress.timetostr(test), result)

if __name__ == "__main__":
    unittest.main()
Using it is pretty simple: you instantiate it with the total number of elements in your operation and then call progress.progressstring with the number of elements processed so far. The module counts the time spent, extrapolates the total time needed from the number of elements, and returns it in a string. It is pretty self-explanatory, I think, and I am very bad at writing comments.
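The extrapolation itself is just a proportion: if a fraction of the items took some amount of time, the whole run should take that time divided by the fraction. As a standalone sketch (the function name is mine, not part of the module):

```python
def projected_total(elapsed, itemnumber, totalitems):
    # If itemnumber out of totalitems items took `elapsed` seconds,
    # scale up linearly to estimate the total run time.
    fraction = float(itemnumber) / totalitems
    if fraction == 0:
        return 0
    return int(elapsed / fraction)
```

So 25 items done out of 100 after 10 seconds projects a 40-second total, which is exactly what the progress method computes before formatting.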

The nice thing about it is that it returns a string with a carriage return at the end, so if you print it like so:

print progress.progressstring(processed),

it will print each time on the same line, giving the illusion that the line changes (which, I guess, is what it actually does). Anyway, enough with that, and on to the actual image manipulation.

To give you an idea of what it does, imagine that you have taken some pictures (the more the better) of a busy street (from the exact same spot, i.e. using a tripod), and you want to make it appear empty. This script will look at all the pictures and decide which parts are background and which parts are people, cars and other useless stuff, and just keep the background, so you’re left with an empty street.

It accomplishes said feat thus: it measures one image and assumes that they're all the same dimensions (which they should be, otherwise wtf are you trying to do?), and for each (x, y) pair of coordinates it makes a list of that particular pixel's RGB values across all the pictures you provide. Then it searches for the most frequent values using a clustering algorithm I will explain later and selects the median as the colour of the pixel in question. This way, unless some dude was standing in the same spot for more than half or so of the pictures, it should select the correct background colour.

The clustering algorithm works thus: it treats the RGB values as coordinates of points in three-dimensional space, draws a sphere around each point (with a radius you specify) and then checks how many other points fall inside that sphere. Values that are not frequent (i.e. do not appear in many of the pictures) do not have as many close neighbours, so they get discarded in favour of the more frequent ones. This way, the correct background values are selected (well, hopefully, anyway).
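For a single pixel position, the whole selection can be sketched like this (a simplified standalone version of the idea; the function and variable names are mine, not the script's):

```python
def background_pixel(pixels, variance):
    # pixels: the RGB tuple at one (x, y) position in every photo.
    varsquare = variance ** 2
    # For each value, count how many of the others fall inside a sphere
    # of the given radius around it in RGB space.
    neighbours = []
    for i, p in enumerate(pixels):
        count = 0
        for j, q in enumerate(pixels):
            if i == j:
                continue
            distsquare = ((p[0] - q[0]) ** 2 +
                          (p[1] - q[1]) ** 2 +
                          (p[2] - q[2]) ** 2)
            if distsquare < varsquare:
                count += 1
        neighbours.append(count)
    # Keep only the values near the densest cluster and take their median.
    cutoff = round(max(neighbours) * 0.65)
    cluster = sorted(p for p, n in zip(pixels, neighbours) if n >= cutoff)
    return cluster[len(cluster) // 2]
```

Three near-identical values and one outlier, say, would leave the outlier with zero neighbours, so it never makes it into the cluster and the median of the close values wins.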

If you want to try it, just put it in a folder with a bunch of photos taken from the same spot (I would recommend about 10), download psyco (it speeds computation up by about 200%) and run the script. The function call arguments are (<number of images in folder>, <variance>, <image filename template>, <number of zeroes in image filenames>). E.g. if your files are named "photo_000.jpg" to "photo_020.jpg", you would use the default below, blendall(21, 30, "photo_%s.jpg", 3). I have found 30 to be a good variance for most purposes.
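To make the template argument concrete: the %s in the mask is replaced by the frame number, zero-padded to the given width, so the defaults expand like this:

```python
filemask = "photo_%s.jpg"
zeroes = 3

# Build the filename for each frame, zero-padding the counter.
names = [filemask % str(counter).zfill(zeroes) for counter in range(21)]
```

names[0] is "photo_000.jpg" and names[20] is "photo_020.jpg", matching the files the script expects to find in the folder.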

This is the actual script; copy and paste it into a file to run it (it takes about 3 minutes on my test set of 21 1600x1200 pics):

# Copyright: Stavros Korokithakis, 2007
# License: GNU LGPL

from PIL import Image, ImageOps, ImageChops
from progress import Progress

def blendall(images, variance, filemask, zeroes):
    """Blend the given images into a composite of their common background."""
    imagex, imagey = Image.open(filemask % "0".zfill(zeroes)).size
    print "Dimensions:", imagex, imagey
    image_list = []
    imagesminusone = images - 1
    halfimages = images / 2
    varsquare = variance ** 2
    imagerange = range(images)

    print "Opening..."
    progress = Progress(images)
    for counter in range(images):
        image_list.append(Image.open(filemask % str(counter).zfill(zeroes)).load())
        print progress.progressstring(counter + 1),

    # Create the composite image from the stored images.
    print "Compositing..."
    composite = Image.new("RGB", (imagex, imagey))
    ld = composite.load()
    progress = Progress(imagex)
    for x in range(imagex):
        if x % 10 == 0:
            print progress.progressstring(x),
        for y in range(imagey):
            pixellist = []
            for image in image_list:
                pixellist.append(image[x, y])

            temp = [0] * images
            maxneighbours = 0
            for counter1 in imagerange:
                for counter2 in imagerange:
                    if counter1 == counter2:
                        continue
                    radius = (pixellist[counter1][0] - pixellist[counter2][0]) ** 2 + \
                             (pixellist[counter1][1] - pixellist[counter2][1]) ** 2 + \
                             (pixellist[counter1][2] - pixellist[counter2][2]) ** 2
                    if radius < varsquare:
                        temp[counter1] += 1

                if temp[counter1] == imagesminusone:
                    # If a pixel has all the others as neighbours, we don't
                    # need anything else; they are all close to this one, so
                    # it will do.
                    ld[x, y] = tuple(pixellist[counter1])
                    break
                if temp[counter1] > maxneighbours:
                    maxneighbours = temp[counter1]
            else:
                # No early exit, so keep the values that are close to the
                # densest cluster and pick their median.
                templist = []
                maximum = round(maxneighbours * 0.65)

                for counter in imagerange:
                    # If the pixel has enough neighbours, add it to the
                    # candidate list.
                    if temp[counter] >= maximum:
                        templist.append(pixellist[counter])
                templist.sort()
                ld[x, y] = tuple(templist[len(templist) / 2])

    print "\nDone, saving composite image..."
    composite.save("composite.png")

try:
    # psyco is optional, but speeds the computation up considerably.
    import psyco
    psyco.full()
except ImportError:
    pass

blendall(21, 30, "photo_%s.jpg", 3)

I hope you've learnt something from it; comments, suggestions and optimisations are welcome.