Author Archives

Piki: Development Process

So, after all of 10 minutes of thought to naming, my girlfriend Melissa and I came up with the name Piki for the app, as a shortened form of Picture To Wiki. I’m planning on using my own preferred development process for this project, which is broken down into the following steps:

  1. Definiton phase
    Figure out basic goals

    This is what the last post was detailing, the basic usage of the app. At this stage it’s more of a conceptual use case than a formal use case. Essentially, enough needs to be decided here to allow filtering of ‘better’ and ‘worse’ in the exploration phase that follows.

  2. Exploration phase
    Find and verify enabling technologies

    This is where most technological architecture decisions are made, such as languages, libraries, communications channels. It consists of defining technical questions that we can use to decide which viable technology will be most beneficial, and answering those questions through a series of small demonstration apps. A primitive library can be developed in this phase to facilitate quick creation of the demonstration apps.

  3. Workspace Creation phase
    Create full-package process for chosen technologies

    This is where you make a shell architecture that verifies the packaging system and supports all the app-external inputs and outputs. At the end of this stage, you should have a working installable. The reason for this stage to occur here is that often the packaging process itself presents unique technical constraints on the application, and before the final development, you should be aware of them. This is conceptually the final part of exploration and the first part of development.

  4. Development phase
    Implement the features you’ve explored

    This is where the software architecture comes into play as you build out the application. You’ve already verified how to talk to everything, so the focus here is exclusively on how to make things talk together efficiently and intuitively, in a way that will support future goals.

  5. Maintenance phase
    Create new features, refactor, bug-fix

    New features and certain refactorings can potentially follow the whole process again, but bug-fixes mostly stay at the outer fringes of the development phase.

I’d contend that this differs from both ‘Agile’ and ‘Big Up-Front Design’ approaches. Agile methodologies go straight to the workspace phase, or in worse cases, the development phase, and often suffers architecturally as a result. Big Up-Front Design typically tries to design before the necessary exploration.

From Zero to App in ? Days

This is the first of several posts concerning the research & development of a small app concept that I’ve decided to make. The need for this app arose out of a conversation with the development manager at my current place of work. The work this particular company is involved with is maintenance work for a monolithic VB6/VB.NET application (in-memory size of over 700 MB). With a high turnover rate, the current situation at that company is that no one has a full sense of the application, and everyone is essentially paralyzed by lack of information. I was suggesting that with a wiki, we could begin to build a knowledge-base that would support bringing people up-to-speed quickly.

My manager raised the very real issue of wiki markup and document interaction being a significant barrier to entry compared to putting documentation in Word documents. At this company, screenshots form a significant portion of the needed documentation for our app and the interaction with it. In a word document, or in an Outlook email, you can simply paste the screenshot right into the document you are working on. Of course, this is not possible with current wikis, because editing of wikis is done in plaintext. To add an image to an article, you might do something like this from the Fractal entry on wikipedia:

[[Image:Mandel zoom 00 mandelbrot set.jpg|300px|right|thumb|The [[Mandelbrot set]] is a famous example of a ”’fractal”’.]]

When you add this image link, you can upload an image from a form accessible by clicking the link. The problem with this is it takes several steps to put a screenshot in.

For Word:

  1. Browse in the document to where you want the image
  2. Get the screenshot (Alt+PrintScreen or just PrintScreen)
  3. Bring the document into focus (click on it from the start bar, or Alt+Tab)
  4. Paste the image

For the Wiki:

  1. Make the image stub on the wiki
  2. Get the screenshot
  3. Open a picture editor capable of saving images in web format (png, jpg, gif)
  4. Paste the image
  5. Save the image to a temporary location
  6. Click the image stub to access the upload form
  7. Click the upload button to bring up the filesystem browse
  8. Upload the image

In addition to taking longer, the wiki version requires you to do two namings of middle products: one on the wiki when you name the image stub, and one on the filesystem when you save the screenshot to a temporary location. The wiki naming is good, because it supports versioning and having multiple wiki pages point to the same image, but the local naming is entirely un-necessary.

It’s easy to see that this takes significantly longer. My goal with this small app is to make an app that sits in the system tray and allows ‘pasting’ to the wiki, so that putting an image in the wiki is as simple as the following workflow:

  1. Get the screenshot
  2. Click on the system tray icon for the app, bringing up a window
  3. Paste, which brings up the image, handles putting it into web format, resampling to web-reasonable sizes, and allows cropping
  4. Either:
    • Paste New, creating a name for the image on the wiki
    • Paste Into, selecting the name from the available names on the wiki

Mingw Ocaml on Vista

So, earlier this past summer two friends and I used the ICFP contest as an opportunity to try to learn ocaml. For ease of installation, we just used ocaml under linux, but now I’m trying to set up a working environment on my new desktop, running Vista (oh no!)…

First off was installing MinGW/MSYS. This install worked pretty well. Next was downloading the Ocaml distribution for MinGW and installing it.

Then I tried to install findlib and extlib, two libraries/utilities we found valuable during the week of the competition, and consequently two things our makefile depends on…

This was where the problems started. First, findlib’s configure script wouldn’t work because ‘dirname’ kept on erroring out. The problem turned out to be a space in my install location (C:/Languages/Objective Caml). So, I reinstalled it to a more sane place (C:/MinGW/lang/ocaml). After fixing this, the configure script got farther, but died with the following error:

Detecting compiler arguments: FAILED

Upon some googling and inspection, I found that more info on the error got dumped into an ‘ocargs.log’ file in the same directory as the configure script. Inspecting this yielded the following error:

gcc: installation problem, cannot exec `cc1': 
No such file or directory

This, as it turns out, is a MinGW installation error on Vista. I found the fix courtesy of this message. I added the following line at the end of my /etc/profile:

export PATH=/mingw/libexec/gcc/mingw32/3.4.5:$PATH

This time, the configure script worked, and findlib was installed… On to extlib! A simple make, make opt, then make install seemed to work reasonably well.

Now, our makefile worked most of the way, creating the non-optimized version just fine, but still with an error on the optimized version:

ocamlc -o dna2rna -I `ocamlfind query extlib`  -g  extLib.cma
bigarray.cma unix.cma util.cmo genetic.cmo baseString.cmo 
matchreplace.cmo dnaLexer.cmo main.cmo dna2rnaprog.ml
ocamlopt -o dna2rna-opt -I `ocamlfind query extlib`  -p  
extLib.cmxa bigarray.cmxa unix.cmxa util.cmx genetic.cmx 
baseString.cmx matchreplace.cmx dnaLexer.cmx main.cmx 
dna2rnaprog.ml
Cannot find file std_exit.p.cmx

So, where was the problem? This message would seem to indicate that it was in trying to compile with profiling info on. So, for now I drop the -p, later, try to find a way to profile under mingw.

Dropping the -p got a little further. Now the error message is this:

ocamlopt -o dna2rna-opt -I `ocamlfind query extlib`    
extLib.cmxa bigarray.cmxa unix.cmxa util.cmx genetic.cmx 
baseString.cmx matchreplace.cmx dnaLexer.cmx main.cmx 
dna2rnaprog.ml
gcc: @C:/Users/user/AppData/Local/Temp\camlrespfefe77: 
Invalid argument
Error during linking
make: *** [dna2rnap] Error 2

A linker error on the optimized version… first, let’s see whether any optimized program will compile by making a test program. Here’s the code, taken from the Ocaml Tutorial example grtest1:

(* File: grtest1.ml *)

open Graphics;;

open_graph " 640x480";;
for i = 12 downto 1 do
  let radius = i * 20 in
  set_color (if (i mod 2) = 0 then red else yellow);
  fill_circle 320 240 radius
done;;
read_line ();;

Compiling this with ocamlc and ocamlopt yielded results that were non-deterministic. The code above should make a graphic of concentric yellow and red rings. Example results are shown below:

Example A Example B Example C Example D

As is easily seen, there’s clearly nondeterminacy going on here… For those who might have a clue why this is happening, I’m running 32-bit Vista on a Intel Core 2 Duo @ 2.66 GHz.

Well, I’m out for the time being. Still not sure why I’m encountering the linker error when building the optimized version…

Recursive generators, multisets, and Randall Munroe

Generators are an interesting language feature, somewhat related to the concept of lazy evaluation. Essentially, they are a way to lazily generate a sequence of values using a persistent local function state. Generator functions ‘pause’ by yielding a value, and ‘resume’ at the next line when another value is needed.

The classic example of a generator in Python is the following:

def fib():
    a, b = 0, 1
    while 1:
        yield b
        a, b = b, a+b

This fibonacci sequence generator is an example of an infinite lazy sequence. If we used this in a loop, the iterator would never terminate, so we would have to break out of the loop. To make this infinite sequence terminate, let’s wrap it in another iterator:

def fibs_up_to(n):
    for f in fib():
        if f > n:
            break
        else:
            yield f

When the infinite fib() generator yields a value greater than our threshold, the fibs_up_to() generator breaks out of the loop, and reaches the end of the function. When evaluation reaches the function bottom, the generator terminates itself and any enclosing ‘for’ loop.

In the following snip from the python shell, we see that the next fibonacci number (610) caused our wrapper iteration to break out of its loop and terminate the outer print loop.

>>> for f in fibs_up_to(500):
...     print '%d ' % (f,),
...
1  1  2  3  5  8  13  21  34  55  89  144  233  377

Now, with the background info out of the way, let’s dive into the fun part: recursive generators! The first time I noticed this pattern was while working on a fun problem Randall Munroe posted on his xkcd webcomic:

image of comic

One (inefficient) way of attacking this problem is to generate all sequences of orders, and test each to see whether it totals the target price. As it turns out, generating all sequences of a set is surprisingly easy using a recursive generator:

def all_seqs(size, alphabet):
    if size == 1:
        for c in alphabet:
            yield [c]
    else:
        for c in alphabet:
            for s in all_seqs(size - 1, alphabet):
                yield [c]+s

>>> for s in all_seqs(3, ‘ABC’):
…     print s
…
['A', 'A', 'A']
['A', 'A', 'B']
['A', 'A', 'C']
['A', 'B', 'A']
['A', 'B', 'B']
['A', 'B', 'C']
['A', 'C', 'A']
['A', 'C', 'B']
['A', 'C', 'C']
['B', 'A', 'A']

The generator starts with the terminal case: if the size of the sequence is one, simply yield each option element, one at a time. The recursive case is only slightly more complicated: prepend each option element to all of the sequences one smaller than the current size. Recursively calling the generator allows this to be represented very simply, without needing to waste memory on creating a complete list of all the one-smaller sequences.

While interesting, this approach has a problem when applied to the xkcd problem: it generates duplicates…

[('Fruit', 215), ('Wings', 355), 
    ('Wings', 355), ('Sampler', 580)]
[('Fruit', 215), ('Wings', 355), 
    ('Sampler', 580), ('Wings', 355)]
[('Fruit', 215), ('Sampler', 580), 
    ('Wings', 355), ('Wings', 355)]
…

These repeated results (12 in all) are all the permutations of the multiset with 1 Mixed Fruit, 2 Hot Wings, and 1 Sampler. So, what we’re really interested in is a generator that will yield successive multisets, thus making sure we don’t waste time evaluating multiple equivalent orders.

Like most abstract data structures, a multiset has several different concrete representations. To investigate this, we will look at how to represent several multisets taken from the set {A, B, C} in three different forms:

Element List Item Counts Count List
[A,A,A,C] {A:3, C:1} [3,0,1]
[A,B,B,C] {A:1, B:2, C:1} [1,2,1]
[B,B,B,C,C,C] {B:3, C:3} [0,3,3]

The item counts and count list both represent sets with repeated elements, related in the same way that sparse and dense matrices are. For sparser multisets, the item counts are better, but for denser sets, the count list saves a little space by not carrying a reference to the element around with each count.

First, we’ll take a look at how to generate the simple element list with a recursive generator. Here’s the code:

def all_multisets(cardinality, options):
    def all_multi_rec(card, ind):
        if card==0:
            yield []
        else:
            for i in xrange(ind, len(options)):
                for m in all_multi_rec(card-1,i):
                    m.append(options[i])
                    yield m
    for m in all_multi_rec(cardinality, 0):
        yield m

This is slightly more complicated than generating sequences, but still fairly simple. The recursive generator is the inner function ‘all_multi_rec’. The parameters to the outer function are the desired multiset cardinality (number of items in the multiset), and the options from which to take values. The recursion hinges on maintaining an index which is used to limit the already-used items from being re-included. in the recursive path, the ‘else’ in the above code, only items at the current index and beyond are added.

How close are we to being able to write a simple solution now? This close:

menu = [('Fruit', 215), ('Fries', 275),
        ('Salad', 335), ('Wings', 355),
        ('Sticks', 420), ('Sampler', 580)]

for c in xrange(1,8):
    for m in all_multisets(c, menu):
        if sum(x[1] for x in m) == 1505:
            print m

…
…
[('Sampler', 580), ('Wings', 355), 
    ('Wings', 355), ('Fruit', 215)]
[('Fruit', 215), ('Fruit', 215), 
    ('Fruit', 215), ('Fruit', 215), 
    ('Fruit', 215), ('Fruit', 215), 
    ('Fruit', 215)]

Admittedly, hard-coding the cardinality limits is clunky… maybe more on that later. For something more to chew on, here are versions for the other two representations of multisets:

def all_multiset_counts(cardinality, length):
    def all_multi_rec(card, ind):
        if card == 0:
            yield [0,]*length
        else:
            for i in range(ind, length):
                for m in all_multi_rec(card-1, i):
                    m[i] += 1
                    yield m
    for m in all_multi_rec(cardinality, 0):
        yield m

def all_multisets_dict(cardinality, options):
    def all_multi_rec(card, ind):
        if card==0:
            yield {}
        else:
            for i in xrange(ind, len(options)):
                for m in all_multi_rec(card-1, i):
                    item = options[i]
                    if m.has_key(item):
                        m[item] += 1
                    else:
                        m[item] = 1
                    yield m
    for m in all_multi_rec(cardinality, 0):
        yield m

Thoughts anyone? Hope you enjoyed,

-DH