Currently, we (or at least I) don’t know how to do multigrid on general problems, so we’re stuck using conjugate gradient. The problem with conjugate gradient is that it is fundamentally about linear systems: given $Ax = b$, construct the Krylov subspace $b, Ax, A^2 x, \ldots$ and pick out the best available linear combination. It’s all in terms of linear spaces.
Interesting human scale physics is mostly not about linear spaces: it’s about half-linear subspaces, or linear spaces with inequality constraints. As we start cramming more complicated collision handling into algorithms, these inequalities play a larger and larger role in the behavior, and linearizing everything into a CG solve hurts.
Enter multigrid. Yes, it’s theoretically faster, but the more important thing is that the intuition for why multigrid works is less dependent on linearity: start with a fine grid, smooth it a bit, and then coarsen to handle the large scale cheaply. Why shouldn’t this work on a system that’s only half linear? There are probably a thousand reasons, but happily I haven’t read enough multigrid theory yet to know what they are.
So, I predict that eventually someone will find a reasonably flexible approach to algebraic multigrid that generalizes to LCPs, and we’ll be able to advance beyond the tyranny of linearization.