When hunting for solutions to problems with many
variables it is important to examine two aspects of the process we
use. The first is the search technique. How do we determine where to
start looking, and how do we adapt to the results we find before trying
again? Our search technique is recursive, as one would expect of
humans who learn from their errors. The second is our determination
of what is optimal. How do we know that nearby alternative solutions
aren't better? This translates into knowing when to stop searching; it
is thus a recursion termination issue and falls into the problem
of knowing when a computation is complete.
Consider the search technique first, though. If a
problem to be solved has only one variable it is easy to describe an
algorithm that does a fair job of finding a solution. Assume one
resource is used and only a finite amount of it is available. If we
use none of it we assign the variable a zero in the function that
determines what we do with it. If we use all of it we assign unity to
the variable. Some other value implies fractional use of the
resource. The nature of the function is purposely left ambiguous here
because it depends on the utility one derives from using the
resource. It is whatever it is and doesn't matter much. What matters
here is the domain of the function and not its range.
A typical approach, if one can assume the range of
the function is not pathological, is a binary-search-style method. One
might start in the middle of the domain and calculate the function
there and at the ¼ and ¾ values. If either value to the side of the
middle produces a better answer than the middle does, we re-center the
next iteration on that better value, add and subtract ⅛ from it to get
the new extremes, and examine the results again. If we know more about
the range of the function and can assume the existence of one or more
derivatives at the middle point being calculated, this is similar to
calculating the two extremes, but we might be able to delay the
calculations for a while. Either way, though, the information in this technique creates
a shifting net of calculations that examines the output of the
function through iterations and narrows in on a solution using our
belief that the marginal rates of substitution must be equal. In this
case, there is only one commodity, so what is being traded is our
inclination to prefer one solution over another.
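To make that one-variable technique concrete, here is a minimal sketch in Python. The quadratic utility function is purely a placeholder of my own choosing, since the real function is deliberately left ambiguous above; only the narrowing procedure matters.

    def utility(x):
        """Hypothetical stand-in for the utility of using fraction x of the resource."""
        return -(x - 0.3) ** 2

    def narrow_search(f, start=0.5, tol=1e-6):
        """Probe the middle and the two extremes, re-center on whichever of the
        three scores best, and halve the step so the extremes creep inward."""
        middle, step = start, 0.25
        while step > tol:
            candidates = [min(1.0, max(0.0, middle + d)) for d in (-step, 0.0, step)]
            middle = max(candidates, key=f)
            step /= 2
        return middle

    print(narrow_search(utility))  # roughly 0.3 for the placeholder utility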
Now consider problem termination. In one variable we
are already done. The iterations end when we don't know where to go
next in our screening of the domain. When the differences between the
middle value and the two extremes drop below some threshold, we are
done and call the middle value the solution. There is no guarantee
the solution is globally optimal, of course, but it is probably a
good candidate. We might try a few more times using random starting
points in the domain instead of the middle, but if we limit ourselves
to a finite number of retries, the process still ends the same way.
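A hedged sketch of that finite-retry idea, reusing the narrow_search sketch from above with random starting points; the retry count of five is an arbitrary choice for illustration:

    import random

    def restart_search(f, retries=5):
        """Run the narrowing search from the middle and from a few random starts,
        keeping whichever terminated answer scores best."""
        best = narrow_search(f, start=0.5)
        for _ in range(retries):
            candidate = narrow_search(f, start=random.random())
            if f(candidate) > f(best):
                best = candidate
        return best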
Things become more complicated when the number of
variables grows, though. Even at two variables we have an issue with
the search technique. We use the same fractional domains [0, 1] for
each of the variables and define the 'middle' as (½, ½). What are
the extremes to be calculated in the approximation of a derivative,
though? In two dimensions, a tangent line's slope isn't enough to reveal
the nature of the function's range as it was in one dimension. We need
something equivalent to a gradient operator where we don't have to
make too many assumptions about continuity and the existence of
derivatives along an arbitrary curve through the domain. We get this
by creating a mesh of points to examine with the middle point being
the start point for each iteration. The mesh points are always
defined relative to this middle and get closer to it as the iteration
step increases. In two dimensions we might pick the corners of a
square and the midpoints on the edges of the square and use the
middle of the square as the foot of the iteration. That means a jump
from one dimension to two increases the number of calculations from
three to nine each time. It's not hard to see where this goes if the
number of dimensions becomes much larger. In the computational sense,
the problem is intractable and we won't bother trying to solve it
that way.
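A short sketch of why the cost blows up, assuming the mesh simply offsets each coordinate of the foot by -step, 0, or +step as described above:

    from itertools import product

    def full_mesh(foot, step):
        """Every point whose coordinates are the foot's shifted by -step, 0, or +step."""
        return [tuple(x + d for x, d in zip(foot, offsets))
                for offsets in product((-step, 0.0, step), repeat=len(foot))]

    print(len(full_mesh((0.5,), 0.25)))      # 3 calculations in one dimension
    print(len(full_mesh((0.5, 0.5), 0.25)))  # 9 in two dimensions
    for n in (3, 5, 10, 20):
        print(n, 3 ** n)                     # 27, 243, 59049, 3486784401, ...

The per-iteration count is 3 raised to the number of variables, which is why the full mesh stops being an option long before the dimensions get interesting.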
In two dimensions we might use our nine point mesh
and be able to terminate our search much like we did in one
dimension. We might even use a similar approach to test whether the
solution we find is locally or globally optimal. In a theoretical
sense, the search techniques and termination strategies can be the
same even though we know the problem will take a lot longer to
calculate if the function is messy.
In higher-dimensional problems, a more likely search
technique involves choosing random points near the foot of the
iteration and examining them as our mesh. We can choose a
finite number to test and approximate the gradient operator. Doing it
this way keeps the problem tractable, but sacrifices exactness
regarding where the next iteration of the search can go. If through
randomness we wind up sampling one region of the domain more thoroughly
than another, our next iteration will be biased. We mitigate this risk
through statistical measures and only move to the next step when the
quality of the test points is sufficient. In this, though, we make it
a little more difficult to know when our problem ends. We can be
reasonably certain it WILL end, but it is hard to predict when beyond
a probabilistic statement.
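A minimal sketch of one iteration of that random-mesh idea. The sampling radius, the sample count, and the move-only-if-better rule are assumptions of mine for illustration rather than anything prescribed above:

    import random

    def random_mesh_step(f, foot, radius, samples=32):
        """Sample points uniformly in a box around the foot, clamped to [0, 1] in
        each coordinate, and move only if the best sample beats the foot."""
        points = [tuple(min(1.0, max(0.0, x + random.uniform(-radius, radius)))
                        for x in foot)
                  for _ in range(samples)]
        best = max(points, key=f)
        return best if f(best) > f(foot) else foot

Whether the search terminates, and when, now depends on how the random draws happen to fall, which is exactly the probabilistic caveat above.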
It is in the statistical measures that there are
issues, though. We should weight points near the foot more strongly
when determining the value of the gradient operator near that foot.
This is similar to what we do when calculating the slope of a tangent
line. The closer the points are to each other in the domain, the more
likely the slope of the line connecting them IS the slope of the
tangent line at the foot. We are merely extending that concept, along
with the continuity and derivative assumptions when they are available,
to higher dimensions. What does it mean to be near a foot in a
higher-dimensional domain, though? That requires something similar to a
distance formula, which implies the existence of n-balls in the domain
that map to something reasonable in the range. The problem is that
there are 'corners' in our problem if there are two or more
variables. The corners are underweighted in the test and might
contain legitimate solutions we gloss over. In fact, the corners are
a serious problem. It is easy to show that the ratio of the volume
of an n-ball to that of the n-cube containing it drops to zero in the
limit as the number of dimensions increases. An n-cube is mostly
corners relative to the foot in the middle. Try it with squares,
cubes, and hypercube formulas and things don't look too bad, but when
the number of dimensions is higher than five the point is quickly
demonstrated.
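The ratio is easy to check numerically. For the ball of radius ½ inscribed in the unit n-cube, the ratio of volumes is π^(n/2) / (2^n Γ(n/2 + 1)):

    import math

    def ball_to_cube_ratio(n):
        """Volume of the radius-1/2 n-ball divided by the volume of the unit n-cube."""
        return math.pi ** (n / 2) * 0.5 ** n / math.gamma(n / 2 + 1)

    for n in (1, 2, 3, 5, 10, 20):
        print(n, ball_to_cube_ratio(n))
    # 1.0, 0.785, 0.524, 0.164, 0.0025, 2.5e-08

By five dimensions the ball covers less than a sixth of the cube, and by twenty it is effectively nothing; nearly all of the volume sits in the corners.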
Can this underweighting of corners be mitigated?
Sure. We can increase the number of test points with the number of
dimensions and adjust the weighting function. Unfortunately, though,
we restore the intractability of our problem at some point. Exactly
where depends on how many variables are required and how much we wish
to avoid over-reliance on a technique that underweights corners. THAT
means we are also back to solving the less-than-ideal problem, settling
instead for an approximate solution that we can't know is globally
optimal even in a probabilistic sense.
If you are the head of a household and the solution
you seek is the typical family economic planning associated with
having enough to eat for everyone, there are a limited number of
variables that matter. In the simplest sense one must provide enough
carbohydrates and proteins to everyone. We can hope the rest works
out regarding minerals and vitamins and all that, or try to find local
solutions on occasion to make up for deficiencies. Exactly how we
approach this doesn't matter here, though. What matters is that we
CAN approach it and that we typically do so by dividing the effort
among the family members. One might focus on gathering carbohydrates
and optimizing what THEY do in order to accomplish that. Another
might focus on hunting protein and optimize for that. The combined
solution makes use of parallel processing among the minds of the
family and makes difficult tasks a little more tractable, but it
occurs through the sacrifice of any knowledge about the global
optimum solution. Fortunately, we usually don't care. A local
solution that keeps the family alive is good enough most of the time.
Global optima only matter in times of great stress.
If you are the central planner of a tribe or
community, though, the problem rapidly balloons to many, many
dimensions. The more your community gets along with each other, the
simpler the problem gets, but only because THEY coordinate their
preferences for you. This is the core concept behind group
identification, I suspect. A nation is such a group of people who
CHOOSE to coordinate preferences in order to facilitate the finding
of solutions to resource allocation problems. If they don't make that
choice, though, or only make it in the weakest sense, meaning they
won't steal from each other when they don't like a particular
solution, the ideal problem isn't solvable in any sense of the word.
Even the reduced problem is intractable. Any central planner who makes
the attempt is most likely going to find solutions that are
relatively good for a subset of the community with preferences near
the solution. The members in the corners of the preference space,
though, will not be satisfied and might actually be quite upset.
Since corners dominate in large dimensions, this could mean that most
members of a community of any size dislike a planned solution and
would view it as serving the desires of an elite subset.
This essay is about what humans do best, though.
Apparently what we do best is serve an elite subset of our
communities with a focus upon our families. This makes biological AND
cultural sense, since we benefit in the evolutionary sense when our
offspring prosper relative to other people. Remember that our
primary competitors on Earth are other people and this helps make
sense of why people will resist surrendering this problem to
automation. Yet, in a sense, we DO surrender it. We do so through
task delegation to make the problem moderately tractable and when we
voluntarily trade with people outside the groups for which we
plan. We can't optimize the resources used by those outside our
groups very well because we can't dictate their usage, but there IS a
mechanism by which we have managed to accomplish partial control. We
trade what they value, luring others to optimize in the way we desire, and that implies our markets play a critical role in finding solutions. More on that next time.