## Archive for August, 2009

### Insurance choice can be bad

Sunday, August 30th, 2009

This is a followup to the previous post about health insurance, elaborating on the fact that it can be bad to let individuals make choices about their insurance policy. I stated without much detail that “assuming sufficient options and perfect competition, the result of this individual choice would be exactly the same as if the insurers were allowed to use knowledge of $K$.” The “sufficient options” assumption is important (and not necessarily realistic), so more explanation is warranted.

Imagine there’s a genetic test that predicts the occurrence of a particular disease with overwhelming probability. Let’s call this disease H (for Huntington’s disease or maybe HIV/AIDS). Further imagine that the disease is treatable, but the treatment is expensive (not true for Huntington’s yet, unfortunately). Say the price of treatment is $c$.

If insurers are allowed to administer the genetic test and adjust policy prices accordingly, prices will converge towards being $c$ greater for those with $H$. If you have $H$, you’ll pay the entire cost yourself. This situation is clearly bad, so we’ll ban insurers from knowing about the genetic test.

However, individuals still know about the genetic test, and are allowed to make decisions accordingly. Let’s say an insurer provides two insurance policies, identical except that one pays for the treatment for $H$ and one does not. Anyone who knows that they’re $H$-negative will buy the policy that doesn’t cover $H$, and anyone who has $H$ will buy the other. If the insurer is allowed to charge different amounts for the two policies, they will adjust the prices to match the different expected costs. This difference is exactly $c$.
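As a sanity check, here’s a tiny numerical sketch of this separating effect (all numbers invented for illustration):

```python
# Toy model of self-selection between two policies (illustrative numbers only).
base = 1000.0   # expected cost of non-H care, assumed the same for everyone
c = 50000.0     # cost of treating H
p_H = 0.01      # fraction of the population carrying H

# If neither side knows anyone's H status, a single pooled policy works:
pooled_premium = base + p_H * c
print(round(pooled_premium, 2))  # 1500.0

# Once customers know their own status and can choose between two policies,
# everyone self-selects, and competitive premiums split apart:
premium_without_H = base        # bought only by H-negatives
premium_with_H = base + c       # bought only by H-positives
print(premium_with_H - premium_without_H)  # 50000.0, i.e. exactly c
```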

Can we ban insurers from having one policy that covers $H$ and one that doesn’t? Possibly, but it’s hard. First, we have to choose all or none; if one insurer covers $H$ and a different one doesn’t, the non-$H$ people will flock to the second insurer, and the same thing happens. Second, the connection between $H$ and the treatment for $H$ may be far from obvious, and certainly can’t be expected to be known at the time we pass any particular law. For example, Down’s syndrome increases the likelihood of recurrent ear infections, so allowing policies to not cover recurrent ear infections would penalize anyone with Down’s syndrome. This might be harmless by itself, but a thousand similar options could add up quickly.

There are probably examples of insurance policy choices which wouldn’t be problematic, but figuring out which ones those are is an extremely subtle proposition. Moreover, this issue will rapidly become more important as our knowledge about genetic risk factors and relations between different diseases expands. I have no confidence that these nuances can be encoded in any kind of government regulation.

Does this mean we can only have one insurance policy for everyone if we want to be fair? Unfortunately, to a first approximation, the answer seems to be yes. I’d love to hear details if anyone knows of a type of policy choice which doesn’t suffer from this problem, though.

### Free market insurance is incompatible with knowledge

Sunday, August 30th, 2009

Yes, it's an extreme title, but it's true. The idea of insurance is to average risk over a large group of people. If advance information exists about the outcomes of individuals, it's impossible for a fully competitive free market to provide insurance.

In particular, free markets cannot provide health insurance.

To see this, consider a function $u:S\to R$ which assigns a utility value to each point of a state space $S$. For example, one of the elements of $S$ could be "you will have cancer in 23 years". This outcome is bad, so the corresponding $u\left(s\right)$ would be a large, negative number.

We also have a probability distribution $p:S\to R$ over $S$. Without insurance, the expected value of $u$ is $E(u)=\sum_{s\in S}p(s)u(s)$. With insurance, we can average over a large number of people to move each person's utility closer to the average. For simplicity, we'll consider only the case of perfect insurance, where the new utility function is exactly the average. In the perfect insurance model, we pay an insurance company a premium of $-E(u)+o$, and in return they agree to pay us $-u(s)$ depending on the particular outcome $s$ (the minus signs appear because bad outcomes have negative utility, so both the premium and the payouts come out positive). Here $o$ is an extra amount to cover administrative costs, risk due to lack of independence and finite numbers of customers, and profit (in the case of imperfect competition).
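To make the model concrete, here is a minimal numerical sketch with an invented three-state space, writing the premium as $-E(u)+o$ so that it comes out positive:

```python
# Minimal sketch of the perfect insurance model (states and numbers invented).
# u(s) is utility: zero for a good outcome, large and negative for a bad one.
states = {
    "healthy":       {"p": 0.90, "u": 0.0},
    "minor_illness": {"p": 0.09, "u": -1000.0},
    "cancer":        {"p": 0.01, "u": -100000.0},
}

E_u = sum(v["p"] * v["u"] for v in states.values())  # expected utility, negative
overhead = 50.0  # the "o" term: admin costs, risk margin, profit

# The customer pays a fixed premium, and the insurer pays back -u(s) in
# outcome s, so the customer's net position is the same in every state.
premium = -E_u + overhead
print(round(premium, 2))  # 1140.0
```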

Assuming no one has any prior knowledge of the state $s$, the only way for different insurers to compete in the perfect insurance model is to reduce overhead. Everyone looks the same, so there's no advantage in charging different amounts to different people. The insurers profit from anyone with $u(s)>E(u)$ and lose money on anyone with $u(s)<E(u)$, but there's nothing they can do about it if they can't tell the difference in advance.

Now assume there's some prior knowledge about the state, say $S=K\times U$ where $K$ is known in advance and $U$ is unknown. In the absence of regulation, it becomes possible for an insurance company to charge different amounts based on the different $k\in K$. In particular, an insurer can sell policies only to people with a favorable value of $k$ and charge the lower premium $-E(u\mid k)+o$ (lower because $E(u\mid k)>E(u)$ for favorable $k$). In a free market, anyone with a favorable value of $k$ will flock to these cheaper policies. Insurers offering policies to those with unfavorable values of $k$ will have to raise rates in order to stay in business, since they will have lost the customers they make money on. Assuming a sufficient level of competition, the price of every insurance policy will converge on $-E(u\mid k)+o(k)$.

The result is that we're now insuring only over the uncertainty contained in $U$, not $K$. In the worst case, if $K=S$, $E\left(u|k\right)=u\left(s\right)$ and insurance vanishes completely.
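A toy calculation (invented risk classes and numbers) of how the pooled premium splits apart once $k$ is usable:

```python
# Sketch of premium convergence when part of the state is known (S = K x U).
# Invented numbers: k is a known risk class; U is the residual uncertainty.
E_u_given_k = {"low_risk": -500.0, "high_risk": -20000.0}  # E(u|k)
pop = {"low_risk": 0.9, "high_risk": 0.1}                  # population mix
overhead = 50.0

# Pooled premium when insurers cannot use k:
E_u = sum(pop[k] * E_u_given_k[k] for k in pop)
pooled = -E_u + overhead
print(round(pooled, 2))  # 2500.0

# Under competition on k, premiums converge to the conditional values,
# so only the uncertainty remaining inside U is still insured:
segmented = {k: -E_u_given_k[k] + overhead for k in E_u_given_k}
print(segmented)  # {'low_risk': 550.0, 'high_risk': 20050.0}
```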

Whether this is good or bad policy-wise depends on what $K$ and $U$ look like. For car insurance, $K$ includes whether the driver was considered at fault in accidents in the past, whether they've driven drunk, whether they drive a muscle car or a Honda Civic, etc. Charging different amounts depending on these factors seems fair, since intuitively these factors can be considered the "fault" of the individual. Similarly, charging more for home owners insurance if you live in the path of a hurricane is also (arguably) reasonable.

In the case of car insurance, even with these known factors out of the way, the space of uncertainty $U$ is still quite large. It includes the actions of other drivers, random equipment failure, invisible road conditions, etc. It is impossible for insurers to predict these factors, which means that private, free market insurance can efficiently insure against them.

For health insurance, the space of known factors includes all past medical history and preexisting conditions, publicly visible genetic information such as gender and race, healthy or unhealthy lifestyle, etc. In many cases, it includes information about the current medical problem, since insurers have significant control over what kind of treatment people can receive once they are diagnosed. Now, we can argue about whether it's fair to blame people for unhealthy lifestyles, but I highly doubt anyone will argue that black men should be held responsible for their higher rates of prostate cancer.

If we accept that the space of known factors $K$ is too large, the only way to reduce it is to apply some type of regulation to reduce the effective size of $K$. A fair amount of subtlety is required to make such regulation effective. For example, let's say we ban insurers from discriminating based on race, but still allow them to collect information about healthy lifestyle. It's healthy to play sports, so the insurer might ask whether the person plays basketball. People who play basketball are more likely to be black than those who don't (caveat: I'm just guessing here), and therefore it's quite possible that they have higher risks of prostate cancer. Unless the government is smarter than the insurers (impossible, since the insurers have access to the text of laws), the only reliable way to solve this is to ban knowledge of $K$ entirely.
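The basketball example can be simulated directly. This is a toy population with invented correlation numbers; the point is only that conditioning on an allowed feature shifts the racial mix of each price bucket even though race is never observed:

```python
import random

random.seed(0)
population = []
for _ in range(100000):
    black = random.random() < 0.13
    # Invented correlation: basketball is more common among black members.
    basketball = random.random() < (0.40 if black else 0.15)
    population.append((black, basketball))

# The insurer never sees race, but the "plays basketball" bucket has a
# very different racial mix than the rest of the population:
players = [black for black, plays in population if plays]
others = [black for black, plays in population if not plays]
print(sum(players) / len(players))  # black share among players (around 0.29)
print(sum(others) / len(others))    # black share among the rest (around 0.10)
```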

However, banning insurers from using knowledge of $K$ is dangerous unless you also ban customers from using knowledge of $K$. In an extreme case, it would be very bad to allow people to buy insurance policies in response to accidents or unexpected diagnoses. Everyone would wait until they needed medical coverage to buy insurance, and all insurers would rapidly go out of business.

In general, if individuals are allowed to use any information prohibited to insurers, and the space of available policies is large enough, sufficiently diligent individuals with favorable $k$ values can use this information to lower their insurance premiums without raising their risk. Insurers will have to raise their premiums in response, which results in an increase in cost for those with unfavorable $k$ values.

In fact, assuming sufficient options and perfect competition, the result of this individual choice would be exactly the same as if the insurers were allowed to use knowledge of $K$! Wow. I didn't fully understand that point before writing this post.

The conclusion is that if we believe true health insurance is a good thing, and that health insurance means insuring over factors which can be known in advance, free markets don't work either for insurers or for individuals. We can't allow insurers to base prices on prior knowledge, and we can't even allow individuals to choose which policy they buy based on their knowledge of their own medical history.

Hmm. The individual side of this is somewhat unfortunate, but I don't see any way around this argument.

### Duck talk

Friday, August 21st, 2009

I gave a short presentation on the ideas behind duck at DESRES today. Here are the slides. Caveat: I made these slides in the two hours before the presentation.

### The Verbosity Difference

Tuesday, August 18th, 2009

I like conciseness. Syntactic sugar increases the amount of code I can fit on a single screen, which increases the amount of code I can read without scrolling. Eye saccades are a hell of a lot faster than keyboard scrolling, so not having to scroll is a good thing.

However, I recently realized that simple absolute size is actually the wrong metric with which to judge language verbosity, or at least not the most important one.

Consider the evolution of a chunk of C++ code. We start with a single idea, and encode it as a single class to encapsulate the structure. We add a class declaration, some constructors and a destructor, perhaps even a private operator= to disallow copying. Fine. After this boilerplate, we add various methods to the class to encode the actual behavior. The class also develops a few fields, because fields let us easily share data between the related methods.

Next we have another idea. Conceptually the new idea is distinct from the original one, so we should really make a new class. However, we’ve just gone through all the work of setting up a C++ class, with its constructors, destructor, private operator=, access specifiers, etc., and it’d be a shame to have to redo all that effort. Maybe it won’t be so bad if we just add the new idea into the same class…

Boom. Now we have two ideas merged into the same class. You can’t pass around one idea without passing around the other. You can’t rewrite one without analyzing dependency chains to make sure the class’s fields don’t overlap between concepts. After a while, we start to forget that the ideas were ever really distinct. That’s right: the language has actually made us stupider.

You can’t blame the programmer here. We were only maximizing our local utility. We might be smart, but we’re not omniscient, and we can’t always be bothered to follow style manuals. The problem also can’t be ascribed to the overall verbosity of C++; it’s quite possible that the code would be larger if it was written in C, since C++ class syntax, fields, etc. really can make for smaller (source) code.

The problem is that the marginal cost of adding a new class is greater than the marginal cost of extending an existing class. If it was easier to make a new class, we would have done so. But we would also have made a new class if it was harder to add methods to an existing class, because then the trade-off would have been different. In other words, what matters is the difference in verbosity between the “right way” and the “wrong way”, not the absolute level of verbosity.

Therefore, the conclusion is that any new abstraction with a large startup cost but a low marginal cost is bad, because people end up merging them in disgusting ways. Examples include interfaces (adding one more method is easier than splitting one interface into two), Haskell type classes (see fail), and monads (once you’ve converted your code into monadic form, making the monad do something else is easy).

Similarly, any abstraction which merges two benefits into one language construct is also bad, even if the extra benefits are free. The best example of this is inheritance, which merges the benefits of code reuse and subtyping. If I’m making a new class, and it would be really convenient to be able to call one of the methods of an old class, I may end up inheriting from that class in order to save typing even if subtyping makes no sense. By contrast, if I’d been writing the same code in C, that function I really wanted to call would probably just be a function, and I’d just call it. Object oriented programming makes you stupider.
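Here’s the trap rendered in Python (names invented): a class inherits from another solely to reuse one formatting method, next to the C-style alternative where the shared code is just a function:

```python
class Database:
    def format_id(self, n):
        return f"record-{n:06d}"
    def query(self, n):
        return f"SELECT * WHERE id = '{self.format_id(n)}'"

# The tempting shortcut: inherit purely to call format_id(), even though
# a Logger "is-a" Database makes no sense as a subtype.
class Logger(Database):
    def log(self, n, message):
        return f"{self.format_id(n)}: {message}"

# The C-style alternative: the shared code is a plain function, and no
# spurious subtype relationship is created.
def format_id(n):
    return f"record-{n:06d}"

class PlainLogger:
    def log(self, n, message):
        return f"{format_id(n)}: {message}"

print(Logger().log(7, "hi"))       # record-000007: hi
print(PlainLogger().log(7, "hi"))  # record-000007: hi
```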

Happily, it’s easy to notice when you’re running into one of these language flaws. Most of us have a good sense for what the right way of doing things is. If we set out to write a new piece of code, the right way will generally be the first thing that comes to mind, but then we’ll remember that doing it the right way is hard. We’ve probably trained ourselves not to notice this conflict after years of painful compromise, so all we have to do is untrain ourselves.

### Marshmallows and Achievement Gaps

Sunday, August 16th, 2009

Here are links related to a few interesting studies that came up in a discussion with Ross. I figured I’d post them here so I have somewhere to point other people:

#### Marshmallows and Delayed Gratification

Walter Mischel did a study where he put children in a room, gave them a single marshmallow, and told them that if they held off from eating the marshmallow for a while they would get two marshmallows later. He then left the room and watched via hidden camera to see how long they would hold out. Several years later, he happened to do a follow-up study on the same kids, and discovered that the time they held out was strongly correlated with their grades, SAT scores, whether they went to college, etc. Here are some links:

#### Racial and Gender Achievement Gaps

Claude Steele and Joshua Aronson did a study where they gave the GRE exam to African American and European American students. The two groups performed at roughly the same level. However, when the students were told they were taking an intelligence test, the black students performed significantly worse. There are a lot of variants of this experiment for different kinds of tests or gaps (physical activities, gender, etc.) with similar results. I.e., you can dramatically change test scores by saying or not saying a single particular sentence to the test takers before the test.

I agree with Dan Ariely that these studies can and should be interpreted extremely optimistically. If the driving factors behind success or failure are this simple or fragile, we should be able to find easy ways to make huge improvements.

### The Anonymous, Recursive Suggestion Box

Sunday, August 16th, 2009

Good discussion with Ross today, resulting in one nice, concrete idea.

Consider the problem of suggesting policy improvements to the government. In particular, let’s imagine someone has a specific, detailed policy change related to health care, financial regulation, etc. Presumably, the people who know the most about these industries are (or were) in the industries themselves, so you could argue that they can’t be trusted to propose ideas that aren’t just self-serving. Maybe it’s possible for someone to build a reputation of trustworthiness, but that’s hard and would ideally be unrelated to the actual ideas proposed. Instead of relying on reputation, we’ll remove the issue entirely by making the suggestion box anonymous.

Now we have an anonymous suggestion box on a website. People go to it and propose ideas. There are a few good ideas, and a vast number of bad, malicious, and nonsensical ones (including spam). Eliminating the spam is easy (I have a single, completely public email address and get roughly one spam message per day, from which I conclude that the spam issue is solved). In order to eliminate the bad or malicious ideas, we need to be able to judge their correctness in a logical manner. For this, we rely on distributed intelligence: other people are allowed to judge whether each idea is good or bad. To get loaded words out of the picture, let’s replace “good” and “bad” with the words “true” and “false”. “Ideas” become propositions of the form “Implementing this idea would be good” (yes, “good” is still there, but keep reading).

Let’s assume voting isn’t a completely reliable system for determining the truth or falsity of ideas (otherwise, we’re done). Therefore, some true propositions will get a lot of false votes, and vice versa. To solve this, we allow people to propose arguments for or against each proposition. These have the form of some statement, like “That proposition is false because the author is a moron”, together with a more detailed argument for why the statement is true. Now we let people vote on two more things:

1. Whether the truth of the statement would imply that the original proposition is true or false.
2. Whether the argument for the truth of the statement itself is sound.

If we get enough votes in favor of both (1) and (2), we conclude that the original idea is true (or false), and discount the votes for or against the original idea.

This is the key part, so I’ll restate it. If we have propositions $A$, $B$, and $B⇒A$, then enough votes for both $B$ and $B⇒A$ override any votes against $A$. You can’t kill a good idea unless you can kill the arguments for it as well.
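A minimal sketch of the override rule in Python (thresholds and vote counts invented; for brevity only the “argument in favor” direction is shown):

```python
THRESHOLD = 0.7  # invented: fraction of votes needed to accept a proposition

def accepted(votes_for, votes_against):
    total = votes_for + votes_against
    return total > 0 and votes_for / total >= THRESHOLD

def judge_A(a_votes, b_votes, b_implies_a_votes):
    """Each argument comes with a (for, against) vote pair.

    If both B and (B => A) are accepted, they decide A and the direct
    votes on A are discounted; otherwise fall back to the direct votes."""
    if accepted(*b_votes) and accepted(*b_implies_a_votes):
        return True
    return accepted(*a_votes)

# A is heavily voted down, but a well-supported argument overrides that:
print(judge_A(a_votes=(10, 90), b_votes=(80, 20), b_implies_a_votes=(75, 25)))  # True
```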

Now we have to get recursive. What if $B$ gets a lot of votes, but is actually wrong? Then we let people propose arguments for the falsity of $B$, and so on. What if there are two competing arguments which appear to contradict? Then we let people propose arguments about why there isn’t actually a contradiction. There are a lot of logical issues to deal with, but people can post arbitrary arguments written in normal human languages and we have the full power of human intelligence to judge them, so we’re not limited by artificial logical restrictions. This isn’t a formal proof system.

Unfortunately, we are limited by what happens along the full recursive tree. If people lie about the propositions all the way down, and manage to flood away all the counterarguments, the system will fail. However, this is basically a problem of spam, and can be solved in the usual way. If you detect that someone is consistently voting opposite the correct answer, you flag them as malicious and discount their votes. This rule is circular, but that’s what probabilistic analysis is for: we take all the data and compute the most likely joint assignment of truth values to propositions and spam flags to people. There’s some threshold of validity that you need to achieve in order for such a solver to converge to the correct answer, but that level of trust is often quite low due to network effects and self-reinforcement. In other words, contradictions don’t fit together.
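The circular rule can be sketched as a simple fixed-point iteration, a stand-in for a real probabilistic solver (voters and votes invented):

```python
# votes[v][p]: voter v's True/False vote on proposition p.
votes = {
    "alice": {"p1": True,  "p2": False},
    "bob":   {"p1": True,  "p2": False},
    "troll": {"p1": False, "p2": True},   # consistently votes wrong
}
props = ["p1", "p2"]

weight = {v: 1.0 for v in votes}  # start by trusting everyone equally
for _ in range(10):
    # 1. Weighted vote -> current best guess at the truth values.
    truth = {}
    for p in props:
        score = sum(weight[v] * (1 if votes[v][p] else -1) for v in votes)
        truth[p] = score > 0
    # 2. Re-weight each voter by agreement with the consensus.
    for v in votes:
        weight[v] = sum(votes[v][p] == truth[p] for p in props) / len(props)

print(truth)             # {'p1': True, 'p2': False}
print(weight["troll"])   # 0.0: the consistent liar's votes are discounted
```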

Since this is a website, we have to verify that the “users” are actually people. We could do this conventionally with a system like reCAPTCHA, but since we’re in recursive mode it’s much cooler to instead ask users to judge the correctness of randomly selected propositions. If you want to vote on whether a proposition is true or false, or propose a new proposition, you need to spend a little time judging the ideas of others. If someone comes up with a way to trick this system by writing a program that can judge the truth or falsity of arbitrary English propositions, this discussion may be obsolete (thanks to Ross for this particular bit of reasoning).

Other issues probably abound, but they can be fixed by allowing people to suggest improvements to the system. If deemed reasonable, these ideas can be implemented and tested in parallel with the existing system, resulting in a potentially large number of competing systems for determining truth values from the same data set. The data set itself could probably be made freely available (under a suitable license), so that others could build competing systems.

I don’t think this system would be all that difficult to implement. Thanks to the previous paragraph, if it reached a sufficient level of quality it would start to improve itself. Maybe that would even get scary.

Of course, if we apply this to a realm like politics, the truth or falsity of various statements will be very controversial, and different people will have legitimately different opinions. This can be solved by adding side conditions to the statements, like “If you believe in flat tax systems, we should do this” or “If you believe that health care is a basic human right, we should do that.” More importantly, however, there is a vast range of ideas that any rational person should agree to. Statements like “the proposed health care bills do not include death panels”, and “given an otherwise equivalent choice between taxing a public good and a public evil, we should tax the latter.” I think it’s fair to say that the U.S. would be better off if we could agree on the statements that don’t need side conditions.

Note: I’ve done zero checking to see if this has been proposed or implemented before (this discussion happened just now), so I’m curious if anyone knows related references or links.

Another note: presumably this would be set up as a nonprofit supported by donations of some kind. If this system actually existed, I would probably be willing to donate at least \$1000.