Sorting in linear time

If you assume fixed word length, integers can be sorted in linear time using radix sort. This paper gives a stunningly elegant generalization of this to essentially arbitrary data structures, including lists, sets, bags, etc.:

Henglein, F. (2008). “Generic Discrimination: Sorting and Partitioning Unshared Data in Linear Time”.

Wow. The most interesting aspect of this to me is that it provides some extra hope about the practicality of functional containers. Right now everyone is using hash tables, which are fast but extremely lacking in functionality (e.g., they’re not sorted). In many languages they’re also nondeterministic across multiple program runs for security reasons, which is horrible for debugging. Maybe some future abstraction along the lines of this paper (generalized tries?) will make ordering-based containers competitive with hash tables.

Unfortunately, Henglein notes that efficient implementations of these techniques are heavily imperative, which is fine for sorting but bad for use in flexible containers. I’ll keep hoping.

Note: Containers and tries are mentioned as future work in the paper. There are lots of challenges in the way, but Henglein seems optimistic. I expect that some fairly serious compiler support would be required to make an implementation of these ideas both clean and fast.

comments powered by Disqus