Posts

Individual Rationality Needn't Generalize to Rational Consensus 2020-05-04T22:53:15.442Z

Comments

Comment by Akshat Mahajan (AkshatM) on Machine learning could be fundamentally unexplainable · 2020-12-17T18:46:43.627Z · LW · GW

I have a huge problem with the "Some problems are boring" section, and it basically boils down to the following set of rebuttals:

  1. Some problems may seem boring, but are vital to solve anyway
  2. Some problems may seem boring, but their generalizations are interesting
  3. Problems that seem boring may have really interesting solutions we are unaware of

Every single one of the examples cited in that section falls under at least one of these rebuttals:

  1. Figuring out if a blotch on a dental CT scan is more likely to indicate a streptococcus or a lactobacillus infection.
  2. Understanding what makes an image used to advertise a hiking pole attractive to middle-class Slovenians over the age of 54.
  3. Figuring out, using l2 data, if the spread for the price of soybean oil is too wide, and whether the bias is towards the sell or buy.
  4. Finding the optimal price at which to pre-sell a new brand of luxury sparkling water based on yet uncertain bottling, transport, and branding cost.
  5. Figuring out if a credit card transaction is likely to be fraudulent based on the customer’s previous buying pattern.

They all have interesting generalizations, applications, and potential solutions. Identifying arbitrary blotches on dental CT scans can be generalized to early-stage gum disease prevention. Figuring out optimal pricing for any item can assist in optimal market regulation. Identifying fraud actively makes the world safer and gives us tools to understand how cheaters adapt in real time to detection events. And, be honest, if the answer to any of these turned out not to be trivial at all - if that is what our models point to - no one would suddenly be claiming the problem itself is boring.

I feel really strongly about this because dismissing any problem as "boring" is isomorphic to asking "why do we fund basic science at all if we get no applications from it" or "why study pure math", and we all ought to know better than to advance a position as thoroughly rebutted as "it seems really specific and not personally interesting to me, so why should we (as a society/field) care?"

Comment by Akshat Mahajan (AkshatM) on Individual Rationality Needn't Generalize to Rational Consensus · 2020-05-06T20:30:01.637Z · LW · GW

> It is standard wisdom in politics that if you control the agenda, it doesn't matter how people vote.

I think I understand the confusion. When I say "vote", I am not necessarily talking about electorates or plebiscites. In fact, Pettit's paper is remarkable precisely for also considering situations that have nothing to do with politics or government.

Consider the case of a trust fund whose executors must make decisions for the trust according to instructions laid down by its original creator. For example, they may be charged with making investment decisions that best support a specific community or need. The executors of this trust try their hardest to meet the spirit as well as the letter of these instructions, so they end up adopting rules that require members to vote separately on whether a proposed action meets the spirit of the instructions and whether it meets the letter of the instructions. The rationale is that this ensures the executors as a whole have done their homework and cannot be held liable for missing one or the other requirement through a single vote.

The doctrinal paradox in this case demonstrates that you can get different outcomes depending on whether the executors vote directly on whether a proposal meets both spirit and letter, or vote separately on the two components of that question.
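To make the divergence concrete, here is a minimal sketch in Python (the three executors and their particular judgments are hypothetical, chosen purely to exhibit the paradox):

```python
# A minimal sketch of the doctrinal paradox. Three hypothetical
# executors judge whether a proposal meets the spirit (A) and the
# letter (B) of the trust's instructions; approval (C) requires both.

# Each tuple is one executor's judgment: (meets_spirit, meets_letter).
executors = [
    (True, True),    # Executor 1: yes on both, so would approve
    (True, False),   # Executor 2: meets spirit but not letter
    (False, True),   # Executor 3: meets letter but not spirit
]

def majority(votes):
    """True iff a strict majority of the boolean votes are True."""
    return sum(votes) > len(votes) / 2

# Procedure 1: vote separately on each premise, then infer the conclusion.
spirit_majority = majority([a for a, _ in executors])  # True (2 of 3)
letter_majority = majority([b for _, b in executors])  # True (2 of 3)
premise_based = spirit_majority and letter_majority    # True -> approve

# Procedure 2: each executor forms C = A and B, then all vote on C directly.
conclusion_based = majority([a and b for a, b in executors])  # False (1 of 3)

print(premise_based, conclusion_based)  # True False: the procedures diverge
```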

I hope that this explains what I mean by "required to do it" by providing an incentive that has nothing to do with politics. I hope it also encourages a shift towards thinking in terms of systems and their consistency criteria.

I won't respond to the rest of the comment because discourse about political agenda is not relevant to this discussion.

Comment by Akshat Mahajan (AkshatM) on Individual Rationality Needn't Generalize to Rational Consensus · 2020-05-06T03:17:05.280Z · LW · GW

> Why is the collective decision of the three judges wrong? Two of the judges believe there was no breach of contract, although for different reasons. Therefore the defendant is acquitted.

The paradox demonstrates that there are differences in outcome based on the way you aggregate majorities. It doesn't claim that one aggregation rule is superior to another.

That's another way of saying the collective decision isn't "wrong" - the point of the paradox is to show that it depends on how you choose to measure that decision.
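For concreteness, here is the vote pattern in the standard presentation of the court case (liability requires both a valid contract and a breach of it; the premise votes below are the usual illustrative assignment):

| Judge | Valid contract (A) | Breach occurred (B) | Liable (C = A and B) |
|---|---|---|---|
| Judge 1 | Yes | Yes | Yes |
| Judge 2 | Yes | No | No |
| Judge 3 | No | Yes | No |
| Majority | Yes | Yes | No |

Voting on the conclusion acquits (two "No" votes on C, for different reasons), while voting on the premises convicts (majorities on both A and B).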

> It seems to me clearly wrong to prefer a separate vote on A and B, round the results off to true/false, and then use those fictitious values to infer C.

Naturally. But there are cases where you can't avoid separate votes on A and B. Pettit provides two cases, which I have reproduced above (see "it is not the case that organizations can always choose to vote directly and ignore preserving collective consistency").

The obvious case is where you are required to vote on A and B, and infer C from there. This can happen in a procedural context, because that's just the way someone specified it.

The less-obvious case is where acquiring consensus on C directly is prohibitive or does not reflect the same result as acquiring consensus on A and B. Perhaps C is controversial or people have incentives to lie about it, while A and B are not. Perhaps A and B were ratified by Congress and now it is up to constitutional scholars to decide on the merits of C without being able to consult Congress as a whole.

Whatever the case, the consequences for decision-making are clear. We cannot build inferences of the form "the majority agreed on A" and "the majority agreed on B", therefore "the majority agreed on C". Yet, as the foregoing illustrates, such inferences are sometimes made out of necessity.

Comment by Akshat Mahajan (AkshatM) on The new dot com bubble is here: it’s called online advertising · 2019-11-22T07:53:08.758Z · LW · GW

> However, my own experience in the industry suggests that most spend that goes beyond generating more than zero awareness is poorly spent. Much to the dismay of marketing departments, you can't usually spend your way through ads to growth. Other forms of marketing look better (content marketing can work really great and can be a win-win when done right).

This experience has been corroborated by countless reviews (summarized in my other comment), so I agree with you.

Comment by Akshat Mahajan (AkshatM) on The new dot com bubble is here: it’s called online advertising · 2019-11-22T07:49:03.265Z · LW · GW

> The question is, why should we believe advertising works at all?

This is a fair objection. I decided to look for a review paper summarizing the existing literature on the subject of advertising effectiveness.

Via Google Scholar, I was able to find a particularly useful review paper, summarizing both empirical effects and prior literature reviews for advertisements as well as political and health campaigns across multiple channels (print, TV, etc.). Overall, the literature paints a disjointed, inconclusive view of the value of advertising - there is insufficient data to conclude that advertising in general has no impact.

I invite you or anyone interested to read it in depth, but will, for the purpose of this discussion, summarize its concluding remarks (as available in the section "Behavioral Effects of Advertising or Commercial Campaigns"):

1. Advertising interventions appear to be correlated with short-term boosts in product sales.

> A set of case studies has shown strong short-term effects of campaigns on sales (Jones 2002). In a recent study, the buying of a service (use of the weight room in a training facility) increased to almost five times the initial use after an outdoor advertising campaign (Bhargava and Donthu 1999). In another study, exposure to printed store sale flyers led to a doubling of the number of advertised products bought, and more than a doubling of the amount spent on items in ads (Burton, Lichtenstein et al. 1999).

2. There *is* disagreement on the long-term effects.

> Ninety percent of advertising effects dissipate after three to fifteen months. The first response is most important; the share returns for advertising diminish fast. After the third exposure advertisers should focus on reach rather than frequency, according to research findings from advertising effects research (Vakratsas and Ambler 1999).
>
> While some claim that advertising seems not to be important for sales in the short term, although more important in the longer term (Tellis 1994; see also Tellis, Chandy et al. 2000), others disagree. Jones found that advertisements must work in the short term to be able to have any medium or long-range effect on sales (Jones 2002).

3. Despite contributing to a short-term boost, advertising by itself is weak compared to other kinds of promotional activities. Increased advertising spend yields diminishing returns:

> The influence of advertising has been estimated to be 9% of the variation in sales for consumer products. The effect of promotional activities – such as offers of reduced prices for shorter periods of time – was more than double that size (Jones 2002). In some studies price reductions have been found to be 20 times more effective for increasing sales than is advertising (Tellis 1994), a consequence being that since the late 1980s the industry has changed its emphasis from advertising to promotion (Turk and Katz 1992; Vakratsas and Ambler 1999; Jones 2002). The solution to the problem of small effects may be that most advertising research has not taken into consideration the fact that only a small amount of advertising seems to increase sales. Increased spending on advertising (increased number of exposures and increased gross rating points) has been found to induce larger sales when ads were persuasive, but not when they were not (Stewart, Paulos et al. 2002).

4. "Likeability", medium, and what it's selling matters a lot in the effectiveness of advertising:

> The advertising copy and novelty in ads seemed more important than the amount of advertising itself (Tellis 1994). The two most important qualities of ads that sell products are likeability of the ad (Biel 1998) and its ability to make people believe that a company has an excellent product (Joyce 1998: 20). A study has shown that advertising likeability predicted sales winners 87% of the time (Biel 1998). It is no news that copy research works (Caples 1997; for a review, see Jeffres 1997: 252-263), but new data-processing techniques have made it possible to apply this knowledge almost instantly to TV advertising as well (Woodside 1996). Channel selection may also be an important influence on sales (Tellis, Chandy et al. 2000). For some groups of products (lower-priced daily consumer goods) the first exposure to advertising may contain most of the ad’s effect on behavior (Jones 1995; Jones 2002).

There are many other aspects of advertising influence covered in the conclusions that I have not summarized - I have selected the few that seem most salient here.

Overall, I think a reasonable prior is that advertising *has* an impact, but has strong situational limits to its effectiveness compared to other sales growth techniques. Since digital advertising is a specific case of advertising in general and there are some effects for advertising in general, it would be difficult to make the case that no digital advertising works at all - it is much safer to expect that digital advertising has *some* (albeit situational and weak) impact.

---

The other half of this comment is re:

> the article is asserting that they have not met it

This is a reductive picture. It is true the article sets out to check marketers' claims about the effects of digital advertising. However, it also sets out to provide an overview of the evidence for whether digital advertising works in general. This last aspect was the focus of my prior comment.

My comment was meant to highlight the flaws in the article's methodology for establishing that digital advertising does not work. Its review is restricted to a very specific set of claims and cases involving large advertising platforms, and one should not generalize prematurely from those. To do a better job of truth-seeking, I think it is necessary to articulate what specifically remains unknown after the analysis - hence my last comment.

Comment by Akshat Mahajan (AkshatM) on The new dot com bubble is here: it’s called online advertising · 2019-11-19T18:37:11.414Z · LW · GW

I read this article and its referenced papers when it was published on Hacker News 12 days ago, and I have reservations about accepting its conclusions regarding the broken nature of digital advertising.

The article's conclusion is predicated on two specific kinds of evidence:

1. That brand-keyword ads overwhelmingly demonstrate selection effects.

2. That advertising by companies with a large advertising presence across different channels demonstrates more selection effects than advertising effects.

The evidence is compelling, but it doesn't warrant the conclusion that digital advertising is ineffective because:

1. Brand-keyword ads (when someone searching for "Macy's" gets an ad linking to Macy's website) are not the only kind, or even the most common kind, of keyword ads. Targeted keyword ads (showing an ad for Macy's website when someone looks up "cashmere sweater") are more common and more competitive, yet haven't been covered or studied in the provided literature.

2. All the studies cited in this article (such as Lewis and Rao 2015 and Gordon et al. 2018) either explicitly deal with firms that are described as "large" or "having millions of customers" (Lewis and Rao, or the eBay intervention), or neglect to disclose or characterize the firms involved in the study (such as Gordon et al.). A possible selection bias might be occurring where only brands with a large pre-existing brand identity are being studied - in such a case, it would not be surprising that the literature demonstrates more selection effects than advertising effects, as customers would have already heard about the brands by the time these studies ran.

Ideally, the following pieces of evidence would be needed to conclude that digital advertising as-is really is broken:

1. A survey of the effectiveness of targeted keyword ads.

2. The impact of digital advertising on companies with no large brand presence, across different channels.

I was unable to find anything in the literature for either, but I confess I did not try very hard beyond a perfunctory Google Scholar search.

---

I agree that the examples cited in this article are compelling evidence for an application of Goodhart's law in digital advertising.

Comment by Akshat Mahajan (AkshatM) on Conservation of Expected Evidence · 2019-11-19T08:13:57.379Z · LW · GW

I do not understand the validity of this statement:

> There is no possible plan you can devise, no clever strategy, no cunning device, by which you can legitimately expect your confidence in a fixed proposition to be higher (on average) than before.

Given a temporal proposition A within a set of mutually exclusive temporal propositions {A, B, C...}, demonstrating that B, C, and the other candidates do not fit the evidence so far, while A does, raises our confidence in A *continuing to hold*. This is standard Bayesian inference applied to temporal statements.

For example, we have higher confidence in the statement "the sun will come up tomorrow" than in the statement "the sun will not come up tomorrow", because the sun has come up many times in the past, whereas it has failed to come up comparatively few times. We have relied on the prior distribution to make confident statements about the result of an impending experiment, and can constrain our confidence using the number of prior experiments that conform to it - further, every new experiment that confirms "the sun will come up" makes it harder to argue that "the sun will not come up", because the latter statement now has to explain *why* it failed to apply in the prior cases as well as why it will hold now.
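To put rough numbers on this, here is a sketch using Laplace's rule of succession, one standard way to formalize the sunrise example (the trial counts below are arbitrary, chosen only for illustration):

```python
# A sketch of the sunrise reasoning via Laplace's rule of succession:
# with a uniform prior on the success rate, after s successes in n
# independent trials the posterior probability that the next trial
# succeeds is (s + 1) / (n + 2).

def prob_next_success(successes: int, trials: int) -> float:
    """Posterior probability that the next trial succeeds."""
    return (successes + 1) / (trials + 2)

# The sun has risen on every observed day; the day counts are arbitrary.
for n in [10, 100, 10_000]:
    print(n, prob_next_success(n, n))
# 10 -> 11/12 ~ 0.9167, 100 -> 101/102 ~ 0.9902, 10000 -> ~ 0.9999
# Each confirming observation pushes confidence in "the sun will come up
# tomorrow" higher, which is the updating described above.
```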

It would seem, then, that quantifying the prior distribution over a set of mutually exclusive statements *is* a valid strategy for raising confidence in a specific statement.

Maybe I'm misinterpreting what "fixed proposition" means here or am missing something more fundamental?