Why it feels like everything is a trade-off

sunny-from-qad

Why it feels like everything is a trade-off

post by Sunny from QAD (Evan Rysdam) · 2019-07-18T01:33:04.764Z · LW · GW · 15 comments

  Example of a trade-off
  Question
  Answer
None
15 comments

Epistemic status: A cute insight that explains why it might feel like you always have to make sacrifices along one metric to get improvements along another. Seems tautological once you understand it. Might be obvious to everyone.

Meta-epistemic status or something: My first post. Testing the waters.

Tl;dr: Skip to the last paragraph.

Example of a trade-off

I'm a programmer. I'm also a design prude. I'm also lazy. This all means that I spend a lot of my time chasing some different metrics in my code:

1) How easy it is to read.

2) How long it takes to run.

3) How long it takes to write.

4) ...

These metrics are often at odds with one another. Just the other day I had to make a trade-off involving a function I'd written to evaluate a polynomial at a given point. Originally, it was written in a way that I felt was self-explanatory: it looped over the derivatives-at-zero of the polynomial, which were passed in as a list, and summed up the appropriate multiples of powers of x — a Taylor sum. Pseudo-code:

def apply_polynomial( deriv, x ):
    sum = 0
    for i from 0 to length( deriv ):
        sum += deriv[i] * pow(x, i) / factorial(i)
    return sum

It turned out that this function was a significant bottleneck in the execution time of my program: about 20% of it was spent inside this function. I was reasonably sure that the pow and factorial functions were the issue. I also knew that this function would only ever be called with cubics and lower-degree polynomials. So I re-wrote the code as follows:

def apply_cubic( deriv, x ):
    sum = 0
    len = length( deriv )

    if len > 0:
        sum += deriv[0]
        if len > 1:
            sum += deriv[1] * x
            if len > 2:
                square = x * x
                sum += deriv[2] * square / 2
                if len > 3:
                    cube = square * x
                    sum += deriv[3] * cube / 6

    return sum

Sure enough, this improved the runtime significantly — by nearly the whole 20% that had been being spent inside this function. But notice that the code no longer contains the elements that define a Taylor sum: the loop is gone, and the factorials (0!, 1!, 2!, 3!) have been replaced with their values (1, 1, 2, 6). It also isn't obvious why the length comparisons stop at 3. The code no longer explains itself, and must be commented to be understood. Readability has been sacrificed on the altar of efficiency.

Question

Why am I cursed so? Why can't these metrics go hand-in-hand? And in general, why am I always doing this sort of thing? Sacrificing flavor for nutrition in the cafeteria, sacrificing walking-speed for politeness on a crowded sidewalk? Why are my goals so often set against one another?

Answer

Let's take a second to think about what a trade-off is.

Suppose you're faced with a problem to solve. You have two solutions in mind (Solution A & Solution B), and you have two metrics (Metric 1 & Metric 2) by which to judge a solution. A trade-off occurs when the solution that scores better along Metric 1 scores worse along Metric 2:

For example:

Of course, there need not be only two possible solutions. Maybe I'm willing to spend two hours working on improving this function, and depending on what I focus on, I could achieve any of the following balances:

And — argh! The correlation is negative! Why!

Well, there's a reason, and the reason is that this isn't the full picture. This is:

See, there are a whole bunch of ways to write the code that are neither as efficient nor as readable as one of the filled-in circles on the perimeter. But I would never choose any of those ways, for obvious reasons. And if a new solution occurs to me that beats out some of my old solutions along both metrics...

...then this new solution would replace all the solutions strictly worse than it, which in turn would become part of the mass that resides below the curve:

No matter how many new solutions are introduced into the mix, and no matter by how far they out-perform the old solutions, the outer frontier of non-dominated solutions must have negative slope everywhere. A step to the right along this line must be accompanied by a step downward, because if it isn't, then the solution you just stepped off of is dominated by the one you just stepped onto, so the former wasn't on the line.

It doesn't take any math to generalize this result to situations where you have more than two metrics. Any solution that is dominated by another solution will be thrown out, so the options you end up considering form a set where no element dominates another, a.k.a. one where a gain along one metric entails a loss along at least one other, a.k.a. a trade-off. Throwing out obviously-poor solutions is easy (and is sometimes done subconsciously when you're working in your domain of expertise, or done by other people before a decision gets to you), but weighing the remaining options usually takes some consideration. So, our subjective experience is that we spend most of our time and energy thinking about trade-offs. Q.E.D.

Note: Down below [LW(p) · GW(p)], Dagon has written a comment explaining something that I meant to make more explicit in this post. Since they've expressed it better than I can, I'm just going to copy-paste their comment here:

One reason that most of your actual decisions involve tradeoffs is that easy no-tradeoff decisions get made quickly and don't take up much of your time, so you don't notice them. Many of the clear wins with no real downside are already baked into the context of the choices you're making. For the vast majority of topics you'll face, you're late in the idea-evolution process, and the trivial wins are already in there.

15 comments

Comments sorted by top scores.

comment by Dagon · 2019-07-18T18:44:40.629Z · LW(p) · GW(p)

Not always - every once in awhile you'll find a solution that is a pareto-improvement: better on all dimensions than the next-best alternative. Those are good days!

One reason that most of your actual decisions involve tradeoffs is that easy no-tradeoff decisions get made quickly and don't take up much of your time, so you don't notice them. Many of the clear wins with no real downside are already baked into the context of the choices you're making. For the vast majority of topics you'll face, you're late in the idea-evolution process, and the trivial wins are already in there.

Replies from: Evan Rysdam

↑ comment by Sunny from QAD (Evan Rysdam) · 2019-07-18T19:37:23.170Z · LW(p) · GW(p)

They are good days indeed! As for your second point, I whole-heartedly agree. I was trying to allude to this in my last two sentences, especially the part in parentheses:

Throwing out obviously-poor solutions is easy (and is sometimes done subconsciously when you're working in your domain of expertise, or done by other people before a decision gets to you), but weighing the remaining options usually takes some consideration. So, our subjective experience is that we spend most of our time and energy thinking about trade-offs.

But I think I should have dedicated more time to it. Originally I had a sentence in there that was something like "when I get dressed in the morning, I don't celebrate the fact that putting my pants on forwards instead of backwards is both easier and more comfortable -- I just do it that way automatically!". You've expressed it better than I have, so I'm going to add your paragraph into the main post (credited to you of course).

comment by dxu · 2019-07-18T17:00:18.813Z · LW(p) · GW(p)

Good post. Seems related to (possibly the same concept as) why the tails come apart [LW · GW].

comment by Donald Hobson (donald-hobson) · 2019-07-18T07:19:02.727Z · LW(p) · GW(p)

A more sensible way to code this would be

def apply_polynomial( deriv, x ):
sum = deriv[0]
div=1
xpow=1
for i from 1 to length( deriv ):
div*=i
xpow*=x
sum += deriv[i] * xpow / div
return sum

Its about as fast as the second, nearly as readable as the first, and works on any poly. (except the zero poly symbolized by the empty list. )

The bit about the tradeoffs is correct as far as I can tell.

Although if a single solution was by far the best under every metric, there wouldn't be any tradeoffs.

In most real cases, the solution space is large, and there are many metrics. This means that its unusual for one solution to be the best by all of them. And in those situations, you might not see a choice at all.

Replies from: Evan Rysdam

↑ comment by Sunny from QAD (Evan Rysdam) · 2019-07-18T14:28:55.801Z · LW(p) · GW(p)

Although if a single solution was by far the best under every metric, there wouldn't be any tradeoffs.

Yes, I debated mentioning this in the post. If a single solution was the best under any metric, then that solution would quickly fade into the background, I think. When I get dressed in the morning, I don't celebrate the fact that putting on socks before shoes is both easier and more comfortable than the reverse!

A more sensible way to code this would be [...]

I haven't tested it, but that involves extra multiplication for computing $x^{1}$ as $1 * x$ and for multiplying numbers together to get the factorial values. But I haven't tested it! Maybe I'll try that today and see if it really is as fast. (The function gets called ~1.7 million times, so even a small difference will be worth keeping the faster code.)

Replies from: donald-hobson

↑ comment by Donald Hobson (donald-hobson) · 2019-07-20T11:16:15.321Z · LW(p) · GW(p)

Your right. I did some python. My version took 1.26, yours 0.78 microseconds. My code is just another point on the Pareto boundary.

Replies from: Evan Rysdam

↑ comment by Sunny from QAD (Evan Rysdam) · 2019-07-20T16:43:27.854Z · LW(p) · GW(p)

Whoa! I wasn't expecting so much of a difference. Did you use ```for i in range(...)``` for your loop? That range() call returns an iterator, which I imagine isn't too fast.

comment by romeostevensit · 2019-07-18T06:55:39.923Z · LW(p) · GW(p)

Slipping out of production possibility frontier zero-sum relations involves slipping sideways along some other dimension. Saddle points in gradient descent. This is yet another reason why 'If a problem seems hard, the representation is probably wrong' often holds.

comment by Slider · 2019-07-18T03:34:16.496Z · LW(p) · GW(p)

Post is excellent in dividing the effect and deduction into small steps which use very little technical complexity.

Part of the final steps take a bit bigger steps. The "true improvement" steps do infact take place and it is not obvious why they could not be enjoyed. Processing less subconcouisly would maybe make it slower but would make it more enjoyable.

Replies from: Evan Rysdam

↑ comment by Sunny from QAD (Evan Rysdam) · 2019-07-18T04:41:21.855Z · LW(p) · GW(p)

Thanks for the feedback. The true improvement steps certainly can be enjoyed -- in fact, that's what I like about being a programmer! The effect of everything-feeling-like-a-tradeoff should be strongest when looking at the available solutions to well-known problems such as primality testing or sorting algorithms, where many people before us have already expended a lot of energy pushing the boundary outwards. History forgets the solutions that got dominated, and we are left with a trade-off solution set.

Your last sentence makes me think you read:

Throwing solutions out is easy (and is sometimes done subconsciously when you're working in your domain of expertise ...)

as referring to the moment that one discovers a new solution and throws out the ones that it dominates. I see why you read it that way. I'll edit the wording to make it clear that I mean throwing out solutions that are just obviously very poor. (Initially I had another sentence here where I said that deliberately comparing these solutions to the good ones could help mitigate the everything-is-a-tradeoff feeling, but then I noticed that you've already said the same thing! Though I'll note that you can still appreciate the gap between the bad ones and the good ones long after you've made your choice, which doesn't result in slowness.)

comment by limerott · 2019-07-19T03:45:05.588Z · LW(p) · GW(p)

Thank you for the post and the analysis. I enjoyed this insight and I recall having the feeling that everything is a trade-off at least once. This explains it clearly and succinctly.

Maybe we should make a collection of solutions that did not make it, just to acknowledge their existence, like all sorting algorithms that are strictly worse than mergesort.

comment by dsizemore0012 · 2019-07-18T16:55:30.543Z · LW(p) · GW(p)

I started tinkering with it before reading the comments but I might as well post my thoughts. My answer was basically the same as Donald Hobson's answer except I pre-calculated the powers and factorials you would use and stored them in an array or collection, then I calculated the sum using the stored values. It's basically the same solution but a little easier to follow. If you aren't worried about breaking taboos then you could stick the factorials in a global variable and have a function to get the factorial that checks to see if the factorial has been calculated and if it hasn't then calculate and store the factorials up to that number. Then again if your hard coded solution works for you then no need to bother adding more complications to it.

comment by tenthkrige · 2019-07-26T13:10:23.490Z · LW(p) · GW(p)

This feels related to Policy Debates Should Not Appear One-Sided [LW · GW]: anything that's obvious does not even enter into consideration, so you only have difficult choices to make.

comment by Pattern · 2019-07-18T06:00:13.092Z · LW(p) · GW(p)

a function I'd written to evaluate a polynomial at a given point.

I'm curious what you were using this for. (I haven't found a use for a lot of my mathematical knowledge at that level so far.)

Replies from: Evan Rysdam

↑ comment by Sunny from QAD (Evan Rysdam) · 2019-07-18T06:17:29.238Z · LW(p) · GW(p)

Without giving away too much "insider info", at my current job I'm working on some software that models the interaction of some different 1-dimensional surfaces. I represent these surfaces using splines, which are piece-wise polynomial functions. When I need to know the y-position of a surface at a given x-position, I evaluate the appropriate polynomial at that x-position to find it.

Why it feels like everything is a trade-off

Contents

Example of a trade-off

Question

Answer

15 comments