01 December 2013

What humans do best - part 3


When hunting for solutions to problems with many variables it is important to examine two aspects of the process we use. The first is the search technique. How do we determine where to start looking, and how do we adapt to the results we find before trying again? Our search technique is recursive, as one would expect of humans who learn from their errors. The second is our determination of what is optimal. How do we know that nearby alternative solutions aren't better? This translates to knowing when to stop searching; it is thus a recursion termination issue and falls under the problem of knowing when a computation is complete.

Consider the search technique first, though. If a problem to be solved has only one variable, it is easy to describe an algorithm that works fairly well at finding a solution. Assume one resource is used and only a finite amount of it is available. If we use none of it, we assign the variable a zero in the function that determines what we do with it. If we use all of it, we assign unity to the variable. Some other value implies fractional use of the resource. The nature of the function is purposely left ambiguous here because it depends on the utility one derives from using the resource. It is whatever it is and doesn't matter much. What matters here is the domain of the function and not its range.
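To make the setup concrete, here is a minimal sketch in Python. The particular utility function is a made-up stand-in for whatever value we actually derive from using a fraction of the resource; only its domain [0, 1] matters for what follows.

    # A hypothetical utility function over the normalized domain [0, 1].
    # 0.0 means none of the resource is used, 1.0 means all of it.
    # The shape is an arbitrary stand-in; only the domain matters here.
    def utility(x):
        return x * (1.0 - x) * (2.0 + x)  # a smooth curve with an interior peak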

A typical approach, if one can make a few assumptions about the non-pathological range of the function, is a binary search method. One might start in the middle of the domain and calculate the function there along with its values at ¼ and ¾. If either value to the side of the middle produces a better answer than the middle does, we recalculate the next iteration using the better value as the new middle, adding and subtracting ⅛ from it to get the new extremes, and then examine the results again. If we know more about the range of the function and can assume the existence of one or more derivatives at the middle point being calculated, this is similar to calculating the two extremes, but we might be able to delay the calculations for a while. Either way, though, this technique creates a shifting net of calculations that examines the output of the function through iterations and homes in on a solution using our belief that the marginal rates of substitution must be equal. In this case, there is only one commodity, so what is being traded is our inclination to prefer one solution over another.
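A rough sketch of that shifting net in Python, assuming the function is reasonably well behaved; the offset starts at ¼ and halves each iteration, mirroring the ¼/¾ then ⅛ steps described above:

    def narrowing_search(f, tol=1e-6):
        # Narrow in on a maximizer of f over [0, 1].  Start at the middle of
        # the domain, evaluate points a fixed offset to either side, recenter
        # on whichever of the three scores best, and halve the offset.
        # A sketch, not a robust optimizer.
        middle, offset = 0.5, 0.25
        while offset > tol:
            candidates = [middle - offset, middle, middle + offset]
            middle = max(candidates, key=f)  # recenter on the best of the three
            offset /= 2.0                    # tighten the net
        return middle

Applied to the toy utility sketched earlier, this settles near that function's interior peak.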

Now consider problem termination. In one variable we are already done. The iterations end when we don't know where to go next in our screening of the domain. When the differences between the middle value and the two extremes drop below some threshold, we are done and call the middle value the solution. There is no guarantee the solution is globally optimal, of course, but it is probably a good candidate. We might try a few more times using random starting points in the domain instead of the middle, but if we choose a finite number of retries, that problem also ends the same way.
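One hedged way to bolt that termination rule and a few random restarts onto the sketch above; the restart count and the seed here are arbitrary choices, not a recommendation:

    import random

    def search_with_restarts(f, restarts=5, tol=1e-6, seed=0):
        # Run the narrowing search from the middle and from a few random
        # starting points, keeping the best result found.  There is still no
        # guarantee of a global optimum; a finite number of retries just
        # improves the odds and keeps the computation finite.
        rng = random.Random(seed)

        def narrow(start):
            middle, offset = start, 0.25
            while offset > tol:  # stop once the net is tighter than the threshold
                candidates = [max(0.0, middle - offset), middle,
                              min(1.0, middle + offset)]
                middle = max(candidates, key=f)
                offset /= 2.0
            return middle

        starts = [0.5] + [rng.random() for _ in range(restarts)]
        return max((narrow(s) for s in starts), key=f)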

Things become more complicated when the number of variables grows, though. Even at two variables we have an issue with the search technique. We use the same fractional domain [0, 1] for each of the variables and define the 'middle' as (½, ½). What are the extremes to be calculated in the approximation of a derivative, though? In two dimensions a tangent line's slope isn't enough to know the nature of the function's range as it was in one dimension. We need something equivalent to a gradient operator that doesn't force us to make too many assumptions about continuity and the existence of derivatives along an arbitrary curve through the domain. We get this by creating a mesh of points to examine, with the middle point being the start point for each iteration. The mesh points are always defined relative to this middle and get closer to it as the iteration step increases. In two dimensions we might pick the corners of a square and the midpoints of its edges and use the middle of the square as the foot of the iteration. That means a jump from one dimension to two increases the number of calculations from three to nine each time. It's not hard to see where this goes if the number of dimensions becomes much larger. In the computational sense, the problem is intractable and we won't bother trying to solve it that way.
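A small sketch of that mesh, assuming the same [0, 1] domain in every coordinate; the point is just the count of evaluations, which grows as 3 to the power of the number of dimensions:

    from itertools import product

    def mesh_points(foot, offset):
        # Every point of the mesh around the foot: each coordinate is shifted
        # by -offset, 0, or +offset, giving 3**d points in d dimensions.
        return [tuple(x + dx for x, dx in zip(foot, deltas))
                for deltas in product((-offset, 0.0, +offset), repeat=len(foot))]

    # Two dimensions: 3**2 = 9 points around (0.5, 0.5).  Twenty dimensions
    # would need 3**20, roughly 3.5 billion points, on every iteration.
    print(len(mesh_points((0.5, 0.5), 0.25)))  # 9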

In two dimensions we might use our nine point mesh and be able to terminate our search much like we did in one dimension. We might even use a similar approach to test whether the solution we find is locally or globally optimal. In a theoretical sense, the search techniques and termination strategies can be the same even though we know the problem will take a lot longer to calculate if the function is messy.

In higher dimensional problems, a more likely search technique involves choosing random points near the foot of the iteration and examining them as our mesh. We can choose a finite number to test and approximate the gradient operator. Doing it this way keeps the problem tractable, but sacrifices exactness regarding where the next iteration of the search can go. If through randomness we wind up testing one region of the domain more thoroughly than another, our next iteration will be biased. We mitigate this risk through statistical measures and only move to the next step when the quality of the test points is sufficient. In this, though, we make it a little more difficult to know when our problem ends. We can be reasonably certain it WILL end, but it is hard to predict when beyond a probabilistic statement.
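A sketch of one such step, sampling a fixed number of random test points near the foot instead of building the full mesh; the Gaussian jitter and the sample count are my assumptions here, not a prescription:

    import random

    def random_probe_step(f, foot, scale, samples=50, rng=None):
        # One iteration of a random search: draw a fixed number of test points
        # near the current foot (a tuple of coordinates), clamp them to the
        # [0, 1] domain, and move to the best one found.  Tractable in many
        # dimensions, but the sampled directions may cover some regions better
        # than others, so the step can be biased.
        rng = rng or random.Random(0)
        best, best_val = foot, f(foot)
        for _ in range(samples):
            point = tuple(min(1.0, max(0.0, x + rng.gauss(0.0, scale)))
                          for x in foot)
            val = f(point)
            if val > best_val:
                best, best_val = point, val
        return best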

It is in the statistical measures that the issues lie, though. We should weight points near the foot more strongly when determining the value of the gradient operator near that foot. This is similar to what we do when calculating the slope of a tangent line. The closer the points are to each other in the domain, the more likely the slope of the line connecting them IS the slope of the tangent line at the foot. We are merely extending that concept, along with the continuity and derivative assumptions when they are available, to higher dimensions. What does it mean to be near a foot in a higher dimensional domain, though? That requires something similar to a distance formula, which implies the existence of n-balls in the domain that map to something reasonable in the range. The problem is that there are 'corners' in our problem if there are two or more variables. The corners are underweighted in the test and might contain legitimate solutions we gloss over. In fact, the corners are a serious problem. It is easy to show that the ratio of the volume of an n-ball to that of the n-cube containing it drops to zero in the limit as the number of dimensions increases. An n-cube is mostly corners relative to the foot in the middle. Try it with the formulas for squares, cubes, and hypercubes and things don't look too bad, but once the number of dimensions is higher than five the point is quickly demonstrated.
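That corner claim is easy to check numerically. A short sketch using the standard volume formula for the unit n-ball, pi**(n/2) / gamma(n/2 + 1), divided by the volume 2**n of the n-cube that contains it:

    from math import gamma, pi

    def ball_to_cube_ratio(n):
        # Volume of the unit n-ball divided by the volume of the enclosing
        # n-cube of side 2: pi**(n/2) / (2**n * gamma(n/2 + 1)).
        return pi ** (n / 2) / (2 ** n * gamma(n / 2 + 1))

    for n in (2, 3, 5, 10, 20):
        print(n, ball_to_cube_ratio(n))
    # Roughly 0.785, 0.524, 0.164, 0.0025, and 2.5e-8: past a handful of
    # dimensions the cube is almost entirely corners.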

Can this underweighting of corners be mitigated? Sure. We can increase the number of test points with the number of dimensions and adjust the weighting function. Unfortunately, though, at some point we restore the intractability of our problem. Exactly where depends on how many variables are required and how much we wish to avoid over-reliance on a technique that underweights corners. THAT means we are also back to solving the less-than-ideal problem and choosing instead to settle for an approximate solution that we can't know is globally optimal even in a probabilistic sense.

If you are the head of a household and the solution you seek is the typical family economic planning associated with having enough to eat for everyone, there are a limited number of variables that matter. In the simplest sense one must provide enough carbohydrates and proteins to everyone. We can hope the rest works out regarding minerals and vitamins and all that, or try to find local solutions on occasion to make up for deficiencies. Exactly how we approach this doesn't matter here, though. What matters is that we CAN approach it and that we typically do so by dividing the effort among the family members. One might focus on gathering carbohydrates and optimizing what THEY do in order to accomplish that. Another might focus on hunting protein and optimize for that. The combined solution makes use of parallel processing among the minds of the family and makes difficult tasks a little more tractable, but it comes at the sacrifice of any knowledge about the globally optimal solution. Fortunately, we usually don't care. A local solution that keeps the family alive is good enough most of the time. Global optima only matter in times of great stress.

If you are the central planner of a tribe or community, though, the problem rapidly balloons to many, many dimensions. The more your community gets along with each other, the simpler the problem gets, but only because THEY coordinate their preferences for you. This is the core concept behind group identification, I suspect. A nation is such a group of people who CHOOSE to coordinate preferences in order to facilitate the finding of solutions to resource allocation problems. If they don't make that choice, though, or only make it in the weakest sense, meaning they won't steal from each other when they don't like a particular solution, the ideal problem isn't solvable in any sense of the word. Even the reduced problem is intractable. Any central planner who makes the attempt is most likely going to find solutions that are relatively good for a subset of the community with preferences near the solution. The members in the corners of the preference space, though, will not be satisfied and might actually be quite upset. Since corners dominate in higher dimensions, this could mean that most members of a community of any size dislike a planned solution and would view it as serving the desires of an elite subset.

This essay is about what humans do best, though. Apparently what we do best is serve an elite subset of our communities with a focus upon our families. This makes biological AND cultural sense since we benefit in the evolutionary sense when our offspring prosper relative to other people. Remember that our primary competitors on Earth are other people, and this helps make sense of why people will resist surrendering this problem to automation. Yet, in a sense, we DO surrender it. We do so through task delegation to make the problem moderately tractable and when we voluntarily trade with people outside the groups for which we plan. We can't optimize the resources used by those outside our groups very well because we can't dictate their usage, but there IS a mechanism by which we have managed to accomplish partial control. We trade what they value and lure others to optimize in the way we desire, and that implies our markets play a critical role in finding solutions. More on that next time.
