Archive for the ‘code’ Category

Sieving Primes

Saturday, January 3rd, 2009

Here’s an interesting lazy variant of the sieve of Eratosthenes by Nick Chapman, from a discussion on LTU:

    primes = 2 : diff [3..] (composites primes)

    composites (p:ps) = cs
        where cs = (p*p) : merge (map (p*) (merge ps cs)) (composites ps)

    merge (x:xs) (y:ys) | x<y           = x : merge xs (y:ys)
                        | otherwise     = y : merge (x:xs) ys

    diff (x:xs) (y:ys)  | x==y          = diff xs ys
                        | x<y           = x : diff xs (y:ys)
                        | x>y           = diff (x:xs) ys

In words, the algorithm computes the list of primes as all numbers which are not composites, and the computes the list of composites by extremely constructing and merging all prime factorizations.

As the discussion shows, this is also a great example of why laziness hasn’t caught on: no one could figure out how fast it was.

To see that the algorithm computes each prime p after finite time, note that all recursive calls are “upwards” towards higher prime factors. If q 2 >p, composites(q:...) won’t be expanded beyond the first term q 2 , and therefore for r>q composites(r:...) will never be called.

Only O(n) multiplications are used (one for each composite), so the cost is the construction and merging of the partial lists of composite numbers. The total runtime complexity of the algorithm is equal to the total number of terms expanded out of the following lists:

  1. composites(p:ps) for each prime p.
  2. Each mergepscs.
  3. Each map(p*)(mergepscs).

Say we evaluate all primes less than some integer n. The cost of (2) and (3) are at most a constant factor greater than the corresponding (1), so we need to compute the number of terms computed from each composites(p:...). At most one number >n will appear in each, so we can restrict considering to numbers <n. Since a given composite k<n appears in composites(p:...) iff all primes dividing k are p, the total cost due to k is roughly π(lpf(k)) where lpf(k) is the least prime factor of k.

Thus the complexity of the algorithm is given by the partial sums of π(lpf()) over the composite numbers. That is,

C(n)= n>kπ(lpf(k))

I can’t find much information about the lpf function online, and I didn’t manage to estimate it analytically. Numerically, it looks like C(n) is between O(nlog 2 n) and O(nlog 3 n).

So, not O(n), but O˜(n) is pretty good, and it’s cool that you can do it with lists alone.

Update: my complexity estimate is wrong. Here’s the correct analysis.

Visualizing collisions

Friday, December 19th, 2008

We added visualization of collisions to physbam’s opengl viewer for the high resolution cloth paper. These turn out to be one of the prettiest things it generates. Here are screenshots of edge-edge collision pairs (two edges colliding) and point-face collision pairs (a point colliding with a face) for a scene with a bunch of soft deformable objects:

Never do this

Thursday, December 18th, 2008

Here’s a snippet of code to sneak into someone’s .pythonrc if you want them to hate you:

    import sys
    from numpy import *

    class Evil:
        size = array(1).itemsize
        __array_interface__ = {
            'shape' : (1,),
            'typestr' : '<i%d'%size,
            'data' : (id(1)+2*size, 0) } 

    asarray(Evil())[0] = 2
    assert 1 == 2

It’s even platform independent! (Note: it will be less innocuous if the class isn’t named Evil.)

Python closures and stack frames

Monday, December 1st, 2008

I use closures quite a lot in python, but the variable lifetime semantics of closures are undocumented as far as I can tell. For example, consider the following code:

    def f():
        a=1
        b=1
        def g():
            return b
        return g

If we call ‘f’ and store the value, ‘b’ will be kept alive. Is ‘a’ kept alive? A naive implementation of closures might store a reference to the entire stack frame of ‘f’ inside the ‘g’ closure, causing ‘a’ to live even though it will never be used. This would be very bad, since generating a closure inside a long function could potentially cause a large memory leak. I wouldn’t expect this since python is implemented by smart people, but it’s worth checking.

Happily, it appears the variable lifetime semantics are as expected. Here’s a test program:

    class A(object):
        def __init__(self,name):
            self.name=name
            print 'creating',name
        def __del__(self):
            print 'destroying',self.name

    def f():
        a=A('a')
        b=A('b')
        c=A('c')
        def g():
            print 'inspecting',b.name
        def h():
            print 'inspecting',c.name
        return g,h

    g,h=f()
    print 'only b and c should be alive'
    del h
    print 'only b should be alive'
    g()
    del g
    print 'done'

The output of this program in CPython is

    creating a
    creating b
    creating c
    destroying a
    only b and c should be alive
    destroying c
    only b should be alive
    inspecting b
    destroying b
    done

All variables die at the correct times, so CPython appears to have correct lifetime semantics in the presence of closures.