## Archive for the ‘games’ Category

### Inverse of a hash function

Sunday, March 18th, 2012

I’ve used Thomas Wang’s integer hash functions for years for various purposes. Building on techniques invented by Bob Jenkins for general hashing (e.g., hashes of strings), Wang derived several hash functions specialized for fixed-size integer input. His 64-bit version is

```c
uint64_t hash(uint64_t key) {
  key = (~key) + (key << 21); // key = (key << 21) - key - 1;
  key = key ^ (key >> 24);
  key = (key + (key << 3)) + (key << 8); // key * 265
  key = key ^ (key >> 14);
  key = (key + (key << 2)) + (key << 4); // key * 21
  key = key ^ (key >> 28);
  key = key + (key << 31);
  return key;
}
```

Key properties include avalanche (changing any input bit changes about half of the output bits) and invertibility. Recently I wanted to make explicit use of the inverse, in order to verify that zero would never arise as the hash of a given set of inputs. This property would allow me to initialize the hash table in question (which takes up several gigabytes) with zeros and avoid an explicit occupied bit on each entry. Thus, I needed `inverse_hash(0)`.

Our function is the composition of the functions on each line, so we need to invert each one. Multiplication by 21 and 265 is easy to invert; both numbers are odd, and therefore have multiplicative inverses mod $2^{64}$. The remaining lines are invertible because they’re Feistel functions: they break the key into two pieces, leave one piece alone, and run the other piece through an invertible function that depends on the first. For example, the line `key = key ^ (key >> 24)` leaves the top 24 bits alone. Once you know the top 24 bits, you can reconstruct the next 24 bits with an xor, and one more round gives the remaining 16 bits. The full inverse is

```c
uint64_t inverse_hash(uint64_t key) {
  uint64_t tmp;

  // Invert key = key + (key << 31)
  tmp = key - (key << 31);
  key = key - (tmp << 31);

  // Invert key = key ^ (key >> 28)
  tmp = key ^ key >> 28;
  key = key ^ tmp >> 28;

  // Invert key *= 21
  key *= 14933078535860113213u;

  // Invert key = key ^ (key >> 14)
  tmp = key ^ key >> 14;
  tmp = key ^ tmp >> 14;
  tmp = key ^ tmp >> 14;
  key = key ^ tmp >> 14;

  // Invert key *= 265
  key *= 15244667743933553977u;

  // Invert key = key ^ (key >> 24)
  tmp = key ^ key >> 24;
  key = key ^ tmp >> 24;

  // Invert key = (~key) + (key << 21)
  tmp = ~key;
  tmp = ~(key - (tmp << 21));
  tmp = ~(key - (tmp << 21));
  key = ~(key - (tmp << 21));

  return key;
}
```
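As a sanity check, both routines transliterate directly into Python (a verification sketch, not part of the original post; Python integers are unbounded, so each step is masked back to 64 bits, and three-argument `pow(k, -1, 1 << 64)`, available in Python ≥ 3.8, recovers the two magic multipliers):

```python
M = (1 << 64) - 1  # mask that emulates uint64_t wraparound

def hash64(key):
    key = (~key + (key << 21)) & M             # key = (key << 21) - key - 1
    key ^= key >> 24
    key = (key + (key << 3) + (key << 8)) & M  # key * 265
    key ^= key >> 14
    key = (key + (key << 2) + (key << 4)) & M  # key * 21
    key ^= key >> 28
    return (key + (key << 31)) & M

def inverse_hash64(key):
    # Invert key = key + (key << 31)
    tmp = (key - (key << 31)) & M
    key = (key - (tmp << 31)) & M
    # Invert key = key ^ (key >> 28)
    tmp = key ^ key >> 28
    key = key ^ tmp >> 28
    # Invert key *= 21
    key = (key * pow(21, -1, 1 << 64)) & M
    # Invert key = key ^ (key >> 14)
    tmp = key ^ key >> 14
    tmp = key ^ tmp >> 14
    tmp = key ^ tmp >> 14
    key = key ^ tmp >> 14
    # Invert key *= 265
    key = (key * pow(265, -1, 1 << 64)) & M
    # Invert key = key ^ (key >> 24)
    tmp = key ^ key >> 24
    key = key ^ tmp >> 24
    # Invert key = (~key) + (key << 21)
    tmp = ~key & M
    tmp = ~(key - (tmp << 21)) & M
    tmp = ~(key - (tmp << 21)) & M
    key = ~(key - (tmp << 21)) & M
    return key

# The modular inverses match the constants in the C code.
assert pow(21, -1, 1 << 64) == 14933078535860113213
assert pow(265, -1, 1 << 64) == 15244667743933553977
# The round trip is exact.
for x in (0, 1, 0xdeadbeef, (1 << 64) - 1):
    assert inverse_hash64(hash64(x)) == x
assert inverse_hash64(0) == 0x7ffffbffffdfffff
```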

The cleverness of the original hash function is that each invertible step is also extremely fast. The inverse is slower, but only moderately.

Finally, I did indeed luck out:

```
inverse_hash(0) = 0x7ffffbffffdfffff
```

This isn’t a valid pentago board in my packed representation, so zero initialization works. This turned out to be obvious in hindsight: all but the first step of the hash function leave zero alone, and that first step acts as a complement on the lower 21 bits, which is enough to know that `inverse_hash(0)` can’t be a valid pentago board. It’s still cool to have the full inverse, though.

I tried to email Thomas Wang in case he hadn’t had the occasion to write down the inverse explicitly, but unfortunately his HP email bounced. Impermanent email addresses make me sad.

### Near optimal holdem play with one round of betting

Sunday, December 25th, 2011

Consider the following simplified version of Texas holdem, with two players Alice and Bob:

1. Alice and Bob are each dealt two private cards.

2. Alice posts a small blind of 1, Bob posts a big blind of 2.

3. Alice either folds, calls, or raises by any amount $\ge 2$.

4. Bob either calls or folds.

5. Five more shared cards are dealt, and the winner is determined as usual.

6. Both Alice and Bob have infinite stack sizes, so only expected value matters.

Here are Alice’s and Bob’s near optimal strategies if Alice only raises by integers $\le 40$.  It turns out Alice can leverage her advantages (the smaller blind and control over the bet size) to gain an average of at least 0.39776:

one-round-strategy.txt

For Alice, the strategy is a mixture of actions (‘f’ for fold or ‘b<n>’ for raise by n) for each hand.  For Bob, the strategy is the probability of calling with each hand and each bet by Alice.  These results are definitely optimal for integer bets $\le 40$ to within an expected equity of 1e-4, which I can say with reasonable confidence since it’s much easier to check Nash equilibrium computations than it is to perform them.  It’s also unlikely that Alice could gain all that much more by varying her bet further; she rarely uses the full range of betting options (raising by 40 only with AA, and even then only 0.1% of the time, which is likely a numerical error), and I doubt that using fractional bets would dramatically increase her equity.

A few interesting aspects of the results:

1. Alice never folds.  This is clearly an artifact of Bob’s inability to reraise.

2. We have crazy bluffing!  With 63s Alice raises by 15 70% of the time!  63s is slightly worse equity-wise than 85o, but apparently the flush possibilities make the bluff worthwhile.

3. Alice’s strategy is extremely nonsmooth, presumably to hide information.  For example, with 87s Alice’s most likely raises are 13 (40% of the time), 16 (10% of the time), and 22 (29% of the time), with small probabilities in between and on either side.  In other words, remembering anything like the optimal strategy looks pretty hopeless (but let me know if you think otherwise!).

Overall, Alice tends to either flat call (around 43% of the time), raise a moderate amount between 4 and 7 (trailing off up to 16), or go all out with a raise of 20 or 21.  The plot follows.  Alice almost never raises by 2 or 3 or more than 23.

It turns out to be possible to compute the optimal strategy for both players by solving a single (though fairly large) linear program.  The case of a simple zero sum game with outcomes given by a payoff matrix is described on Wikipedia; the key is to let one of the players (Alice) rescale their probability vector so that their “expected” outcome ignoring the rescale is at least 1, then minimize the amount of scaling necessary (the sum of the elements of the vector).  A given strategy of Alice’s has utility at least 1 if the utility is at least one for each of Bob’s pure strategies, which gives one linear constraint for each of Bob’s strategies.
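To make the rescaling trick concrete, here is a sketch (not the post’s actual code, which goes through cvxpy/cvxopt) for a plain payoff-matrix game, using `scipy.optimize.linprog`: shift the payoffs positive, require “utility at least 1” against each of Bob’s pure strategies, and minimize the total scale.  Rock-paper-scissors recovers the uniform strategy:

```python
import numpy as np
from scipy.optimize import linprog

# Alice's payoff matrix for rock-paper-scissors (rows: Alice, cols: Bob).
G = np.array([[ 0, -1,  1],
              [ 1,  0, -1],
              [-1,  1,  0]], dtype=float)

shift = 2.0      # make all payoffs positive so the game value is positive
Gs = G + shift

# Find x >= 0 minimizing sum(x) subject to Gs^T x >= 1 (one linear
# constraint per pure strategy of Bob).  The game value is then
# 1/sum(x) - shift and Alice's mixed strategy is x/sum(x).
res = linprog(c=np.ones(3), A_ub=-Gs.T, b_ub=-np.ones(3), bounds=(0, None))
assert res.success
scale = res.x.sum()
value = 1 / scale - shift
strategy = res.x / scale
print(strategy, value)  # ≈ [1/3 1/3 1/3] and 0.0
```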

In our simplified poker game, the strategies for Alice and Bob can be expressed as a huge Cartesian product over the situations in which they have to decide.  Alice has a choice of bet for each hand she receives, and Bob has a choice of call/fold for each hand and value of Alice’s bet.  The rescaling trick still works in this more general context; we simply let Alice’s probability vector in each situation scale and constrain them to all have the same sum to provide a single global scale.  The constraint of “utility at least 1” is more complicated, though, since the total number of Bob’s pure strategies is exponential due to the Cartesian product.  To avoid the exponential blowup, we formulate Bob’s response to Alice as its own nested linear program, which gives us a problem of the form

$\min_x c^T x \quad \text{s.t.} \quad Ax = b,\ \ \forall y.\ Cy = d \Rightarrow x^T G y \ge 1$

Here $x$ and $y$ are Alice’s and Bob’s strategies, $Ax=b$ and $Cy=d$ enforce that probabilities sum to a fixed scaled amount or one, respectively, and $x^T G y$ is Alice’s payoff for a given pair of strategies.  See the code for more details if you’re curious; it’s fairly well commented.

Now we get to the cool part: we can apply linear programming duality to turn the universal quantification in the above problem into an existential inequality, which turns the whole thing into a standard linear program.  Thanks to Tomas Tinoco De Rubira for this idea, after I naively asked if it was possible to express nested linear programs in his general convex programming package cvxpy.  Flipping back and forth between universal and existential quantification is the whole point of convex duality: you can prove that a linear program can’t possibly have value greater than $v$ by exhibiting a single feasible solution to the dual program.  Strong duality has to hold to get an exact match, which we have in our case since the subprograms are always feasible.  Again, I’ll leave all the details to the comments in the code.
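Spelled out in my own notation (a sketch, assuming Bob’s inner variables are also constrained nonnegative, as probabilities are): for fixed $x$ the worst case over Bob is itself a linear program, and strong duality replaces it by its dual,

$\min_y x^T G y \quad \text{s.t.} \quad Cy = d,\ y \ge 0 \qquad \text{has dual} \qquad \max_\mu d^T \mu \quad \text{s.t.} \quad C^T \mu \le G^T x$

so the universally quantified constraint $\forall y.\ Cy = d \Rightarrow x^T G y \ge 1$ becomes the existential one $\exists \mu.\ d^T \mu \ge 1,\ C^T \mu \le G^T x$, which is jointly linear in $(x,\mu)$ and can simply be appended to the outer program.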

The unfortunate part is lack of sparsity: the payoff matrix $G$ in the above convex program (which remains after we dualize into a linear program) has $O(n^2 m)$ nonzeros, where $n=169$ is the number of holdem hands and $m$ is the number of Alice’s possible bets.  It’s possible to store the relevant information in only $O(n^2+m)$ space, but impossible to compute a matrix-vector multiply with it faster than $O(n^2 m)$ without using asymptotically faster (meaning slow) matrix multiplication.  With a bunch of structure-aware code it would be possible to solve the resulting KKT systems required for linear programming in low-constant $O(n^2 m)$ time and $O(n^2+m)$ space as well, but right now I’m expanding the whole thing out into an explicit sparse matrix and passing it to cvxopt, which causes a bunch of extra blowup via sparse factorization.  Bad.

Thus, without (fun but time consuming) structure awareness, my laptop is unable to scale beyond 40 bet options for Alice, and more importantly to the more interesting case of Bob retaliating with another round of betting.  However, the current structure wouldn’t suffer any additional cost for two rounds of betting each with $\lfloor\sqrt{40}\rfloor = 6$ bet options, so that’s something to try next.

### Minimal continuous poker

Sunday, November 13th, 2011

Consider the following poker-like game, played with two players: Alice and Bob.  Bob posts a blind of 1.  Both players are dealt a single, continuous hand chosen uniformly at random from $\left[0,1\right]$.  Alice can fold, call, or raise any amount $b>0$ (calling means $b=0$).  Bob either calls or folds.

My original plan was to work out the Nash equilibrium for this game, and therefore derive interesting smooth curves describing the optimal way for Alice to bluff and generally obscure her hand.  It didn’t quite work out that way, but the result is still interesting.

Let $A$ be the optimal expected payoff for Alice, and $A(x)$ the optimal expected payoff given Alice’s specific hand $x$.  Clearly $A(x)$ is a weakly increasing function of $x$: if $x' > x$ but $A(x') < A(x)$, Alice can improve $A(x')$ to $A(x)$ by pretending she has hand $x$ and using the same strategy.  If $A(x) > 0$ for some $x$, so that Alice’s optimal strategy is better than folding, then Alice will never fold with such a hand, since any nonzero probability of folding would reduce her utility.  Thus there is a threshold hand $x_t$ below which Alice always folds (technically, always “might as well” fold, but we’ll gloss over this detail for the moment), and above which Alice always calls but might vary the amount she raises by.  A similar argument applied to Bob gives a threshold function $y_t(b)$ s.t. Bob folds if $y < y_t(b)$ and calls if $y > y_t(b)$.

These and other similar preliminaries aside, I spent a while banging my head against the wall before temporarily giving up and picking a simpler game.  Specifically, make the bet $b$ a fixed constant.  If $b=0$, Bob always checks, and Alice’s optimal strategy is (unsurprisingly) to call if $x>1/2$.  This gives a payoff of $A=1/4$.  If $b>0$, Bob’s function ${y}_{t}\left(b\right)$ is a constant. The values of ${x}_{t}$, ${y}_{t}$, and $A$ at the Nash equilibrium turn out to be

$\begin{aligned} x_t &= \frac{2+2b+b^2}{(2+b)^2} \\ y_t &= \frac{1+b}{2+b} \\ A &= \frac{2+2b}{(2+b)^3} \end{aligned}$

A few notes: for any $b>0$, $x_t < y_t$, since Bob knows Alice’s threshold and gains nothing by calling with hands only slightly over $x_t$.  Interestingly, Alice’s payoff remains unchanged if she varies her threshold anywhere satisfying $x_t < y_t$; the particular value of $x_t$ only prevents Bob from increasing his utility with a different value of $y_t$.  However, the most important property of this result is that $A$ is a decreasing function of $b$.  If the bet is fixed, Alice would prefer it to be zero so that $A=1/4$.
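The threshold formulas can be checked against the indifference conditions that pin down the equilibrium (a verification sketch in Python, using this game’s payoffs: the folder loses the blind 1 to Alice, and a called bet transfers $1+b$): at $x = x_t$ Alice must be indifferent between folding and betting, and at $y = y_t$ Bob must be indifferent between calling and folding given that Alice bets exactly when $x > x_t$.

```python
def thresholds(b):
    x_t = (2 + 2*b + b*b) / (2 + b)**2
    y_t = (1 + b) / (2 + b)
    return x_t, y_t

for b in (0.5, 1.0, 2.0, 5.0):
    x_t, y_t = thresholds(b)
    assert x_t < y_t
    # Alice at her threshold hand x_t < y_t: Bob folds with probability y_t
    # (Alice wins 1); otherwise Bob calls with a better hand (Alice loses 1+b).
    alice_ev = y_t * 1 - (1 - y_t) * (1 + b)
    assert abs(alice_ev) < 1e-12
    # Bob at his threshold hand y_t, conditioned on Alice betting (x > x_t);
    # folding always costs Bob the blind, 1.
    p_win = (y_t - x_t) / (1 - x_t)
    bob_call_ev = (1 + b) * (p_win - (1 - p_win))
    assert abs(bob_call_ev - (-1)) < 1e-12

# As b -> 0 both thresholds collapse to 1/2, matching the check-only game.
assert thresholds(0) == (0.5, 0.5)
```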

Now back to the general game, where Alice chooses the bet.  We have $A\ge 1/4$, since Alice can choose to always call or fold.  Can she do better, varying the bet in some exotic fashion so as to milk more money out of Bob while keeping her hand disguised?  Well, no.  Update: after disproving a conjecture (see below), the answer is maybe.

Let’s see what happens if Bob follows the optimal strategy from the fixed-$b$ game, calling if $y>(1+b)/(2+b)$ and folding otherwise.  Since Bob’s strategy is fixed, Alice can maximize her utility without any extra randomness, and we need only consider bets which are a deterministic function of $x$.  For $x<x_t$ she folds and $A(x)=0$; for $x>x_t$ she bets $b=b(x)$ with payoff

$\begin{aligned} A(x) &= 1\cdot\Pr(y_t > y) + (1+b)\Pr(x > y > y_t) - (1+b)\Pr(y > \max(x, y_t)) \\ &= y_t + (1+b)\max(0,\, x-y_t) - (1+b)(1-\max(x, y_t)) \\ &= y_t + (1+b)\max(0,\, x-y_t) - (1+b) + (1+b)x + (1+b)\max(0,\, y_t-x) \\ &= y_t + (1+b)(x-1) + (1+b)\,|x-y_t| \end{aligned}$

Differentiating w.r.t. $b$,

$\begin{aligned} A(x)' &= x - 1 + y_t' + |x-y_t| + (1+b)\,\mathrm{sgn}(y_t-x)\,y_t' \\ y_t' &= \frac{1}{2+b} - \frac{1+b}{(2+b)^2} = \frac{1-y_t}{2+b} \end{aligned}$

If $x<y_t$, this becomes

$\begin{aligned} A(x)' &= x - 1 + y_t' + y_t - x + (1+b)\,y_t' \\ &= -1 + (2+b)(1-y_t)/(2+b) + y_t \\ &= -1 + 1 - y_t + y_t = 0 \end{aligned}$

so if $x<y_t$, Alice’s bet doesn’t matter.  If $x>y_t$, we have

$\begin{aligned} A(x)' &= x - 1 + y_t' + x - y_t - (1+b)\,y_t' \\ &= 2x - 1 - b\,y_t' - y_t \\ &= 2x - 1 - b(1-y_t)/(2+b) - y_t \\ &= 2x - 1 - b/(2+b) + b(1+b)/(2+b)^2 - (1+b)/(2+b) \\ &= 2x - 1 - \frac{2b+b^2-b-b^2+2+3b+b^2}{(2+b)^2} \\ &= 2x - 1 - \frac{2+4b+b^2}{(2+b)^2} \\ &= 2x - 2 + \frac{2}{(2+b)^2} \end{aligned}$

which is negative as $b \to \infty$ and zero at $b=\sqrt{1/(1-x)-2}$.

Thus $A(x)'=0$ iff $(2+b)^2 = 1/(1-x)$, or

$b=\sqrt{\frac{1}{1-x}-2}$

Unfortunately for Bob, $A(x)' > 0$ at $b=0$ iff $2x-2+1/2>0$ iff $x>3/4$.  Thus Alice can beat $A=1/4$ by increasing her bet whenever $x>3/4$.  Abandoning hand calculation and turning to Mathematica, it turns out Alice’s optimal strategy achieves $A=7/24\approx 0.29$.
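The $7/24$ figure can be reproduced numerically from the payoff formula $A(x) = y_t + (1+b)(x-1) + (1+b)|x-y_t|$ above, holding Bob to the fixed-$b$ responder strategy $y_t(b)=(1+b)/(2+b)$ and brute-forcing Alice’s bet over a grid (a sketch; the grid sizes and bet cap are my choices):

```python
import numpy as np

def alice_payoff(x, b):
    # A(x) = y_t + (1+b)(x-1) + (1+b)|x - y_t| with y_t = (1+b)/(2+b)
    y_t = (1 + b) / (2 + b)
    return y_t + (1 + b) * (x - 1) + (1 + b) * np.abs(x - y_t)

bets = np.linspace(0, 100, 2001)        # bet grid; the optimum stays below 100
xs = (np.arange(4000) + 0.5) / 4000     # midpoint rule over Alice's hands
best = np.zeros_like(xs)                # folding is always available: payoff 0
for b in bets:
    best = np.maximum(best, alice_payoff(xs, b))
A = best.mean()
print(A)  # ~ 7/24 = 0.2916...
assert abs(A - 7/24) < 1e-3
```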

So, we haven’t found the Nash equilibrium yet, but at least our minimal poker game isn’t (necessarily) incredibly boring.  More analysis is required.  For pretty pictures’ sake, here’s a plot of Alice’s outcome as $x$ and $b$ vary:

Alice's expected outcome for hand x and bet b