Imagine a computer stored in a box with a single small hole connecting it to the outside world. We are able to run programs inside the box and receive the results through the hole. In fact, in a sense results are all we can see; if the program makes efficient use of the hardware inside, the size of the hole will prevent us from knowing exactly what went on inside the box (unless we simulate the workings of the box somewhere else, but then the box is useless).
The human brain is a good example. To an outside observer, the hole is speech; other people can’t know what you’re thinking any faster than you can say it. However, speech is also the hole to ourselves. As I write these words, I am only fully aware of them as they appear on the screen in front of me. Until then, I do not consciously know what they are. I can choose to become conscious of them if I say the words to myself internally, but I must slow down in order to do this. The reason is that I am capable of being fully conscious of only one thing at a time, and it is more efficient to be conscious of the words visually on the screen rather than as I “think” of them.
Thus, we have a trade-off between consciousness and efficiency. In order to be fully aware of our thoughts, we must slow them down until they fit through the hole of consciousness. Conscious thought is necessary in order to correct mistakes in our thinking, remember our conclusions, and communicate with others. However, since our brains are internally structured as a massively parallel computer, the only way to use our brains efficiently is to not be aware of what we’re thinking.
This trade-off applies to thoughts of all kinds, and most areas of human endeavor require carefully switching between the different modes. For example, working out a mathematical proof is often said to require intuitive “leaps” of thought. The reason these appear as leaps is not that they are actually sudden; our brain does not sit around doing nothing and then suddenly have the idea. Rather, our unconscious mind is considering various different avenues of thought in parallel. If an avenue appears fruitful, the unconscious part will make it available to the conscious part, and it seems to “pop into our minds”. Similarly, it is very difficult to consciously try to remember a particular fact. If you let your mind wander, the unconscious part is better able to search for what you need in a massively parallel fashion, providing the answer asynchronously.
Movement is the same. When you dodge a ball thrown at you, your unconscious sees the ball and moves out of the way before there is time to articulate what is happening. Soon afterwards, your brain retroactively explains what happened: “If anyone asks, say we saw the ball coming towards us and dodged.” One of my hobbies is swiveling to fit through doors as they close without touching them. The last time I did this, I had a vague memory of my vision dimming during this motion, almost as if I was passing out. I think this effect was my consciousness temporarily shutting down to avoid interfering with the reflexive motion.
The same trade-off applies to computers and algorithms. Consider the problem of checking a C++ program for type errors. Current compilers do this by running full template instantiation, which generates code for each instance of a template. In other words, the compiler is “conscious” of the entire process. If all we want is the type errors, it would be much faster to use a specially tailored algorithm that checks for errors only, remembering just enough of the results to speed up the rest of the error checking process. It would be even faster if we want to know only whether errors exist, not what they are, since the program could forget line numbers and other details. The cool part is that it is possible to combine the different approaches to get the best of all worlds: we can run error checking before code generation to reduce the latency of reporting errors back to the user, and we can speed up error checking by first running the stripped-down yes/no algorithm and reprocessing any portion that has an error. In a decent programming language, all three variants can be generated from the same source using partial specialization.
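To make the idea concrete, here is a minimal sketch, assuming a toy “checker” over a list of tokens in which the token `?` stands in for a type error. The names (`Detail`, `check`) and the toy error model are hypothetical; the point is only that a single source can generate a yes/no variant, a line-numbers variant, and a full-messages variant, selected by a compile-time parameter:

```cpp
#include <string>
#include <vector>

// How much the checker is allowed to "remember".
enum class Detail { YesNo, Lines, Messages };

struct Result {
    bool has_error = false;
    std::vector<int> error_lines;       // filled for Lines and Messages only
    std::vector<std::string> messages;  // filled for Messages only
};

// One source; the Detail parameter selects which variant gets compiled.
// A token equal to "?" plays the role of a type error in this toy model.
template <Detail D>
Result check(const std::vector<std::string>& tokens) {
    Result r;
    for (int i = 0; i < static_cast<int>(tokens.size()); ++i) {
        if (tokens[i] == "?") {
            r.has_error = true;
            if constexpr (D == Detail::YesNo) {
                return r;  // forget everything, stop at the first error
            } else {
                r.error_lines.push_back(i + 1);
                if constexpr (D == Detail::Messages) {
                    r.messages.push_back("type error at line " +
                                         std::to_string(i + 1));
                }
            }
        }
    }
    return r;
}
```

A driver in this scheme would run `check<Detail::YesNo>` first and, only when it reports an error, rerun the offending region with `check<Detail::Messages>` — the combined strategy described above.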
This last point is very important, since it means that the trade-off between consciousness and efficiency can often be eliminated; we can start with “fully conscious” code (which remembers everything it does) and apply various “forgetfulness” transformations to shift towards the efficiency side. The different versions can be seamlessly interleaved so that it looks from the outside as if the fully conscious version is operating at the speed of the fastest version; the missing detail is recovered only when we need it. This is similar to human vision; any object we look at appears sharp, so we generally imagine that we see with uniformly high detail. What is actually happening is that almost all of our visual field is blurry, with one high resolution point in the center. We can shift this high resolution point to anywhere we wish, so we get the illusion of uniform sharpness for free.
Taking full advantage of this trade-off to speed up programs will require languages that combine low level and high level features and make it easy for the program to inspect and transform its own code. I’ll write more about this later.