Posts

[linkpost] The Psychological Economy of Inaction by William Gillis 2021-09-12T22:04:43.991Z
Cliffnotes to Craft of Research parts I, II, and III 2021-06-26T14:00:53.560Z
What am I fighting for? 2021-04-20T23:27:26.416Z
Could degoogling be a practice run for something more important? 2021-04-17T00:03:42.790Z
[Recruiting for a Discord Server] AI Forecasting & Threat Modeling Workshop 2021-04-10T16:15:07.420Z
Aging: A Surprisingly Tractable Problem 2021-04-08T19:25:42.287Z
Averting suffering with sentience throttlers (proposal) 2021-04-05T10:54:09.755Z
TASP Ep 3 - Optimal Policies Tend to Seek Power 2021-03-11T01:44:02.814Z
Takeaways from the Intelligence Rising RPG 2021-03-05T10:27:55.867Z
Quinn's Shortform 2021-01-16T17:52:33.020Z
Is it the case that when humans approximate backward induction they violate the markov property? 2021-01-16T16:22:21.561Z
Infodemics: with Jeremy Blackburn and Aviv Ovadya 2021-01-08T15:44:57.852Z
Announcing the Technical AI Safety Podcast 2020-12-07T18:51:58.257Z
How ought I spend time? 2020-06-30T16:53:53.787Z
Have general decomposers been formalized? 2020-06-27T18:09:06.411Z
Do the best ideas float to the top? 2019-01-21T05:22:51.182Z
on wellunderstoodness 2018-12-16T07:22:19.250Z

Comment by Quinn (quinn-dougherty) on Quinn's Shortform · 2021-08-03T16:34:08.583Z · LW · GW

capabilities-prone research.

I come to you with a dollar I want to spend on AI. You can allocate p pennies to go to capabilities and 100-p pennies to go to alignment, but only if you know of a project that realizes that allocation. For example, we might think that GAN research sets p = 98 (providing 2 cents to alignment) while interpretability research sets p = 10 (providing 90 cents to alignment).

Is this remotely useful? This is a really rough model (you might think it's more of a Venn diagram, and that this model doesn't provide a way of reasoning about the double-counting problem).

A task: rate research areas, even whole agendas, with such a value p. Many people may disagree with my example assignments to GANs and interpretability, or think both of those are too broad.

What are some alternatives to the splitting a dollar intuition?

To say something is capabilities-prone is less to say a dollar has been cleanly split, and more to say that there are some dynamics that sort of tend toward or get pushed toward different directions. Perhaps I want some sort of fluid metaphor instead.
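As a toy sketch of the dollar-splitting intuition (the p values are the illustrative guesses from above; the function and names are mine, not a real assessment methodology):

```python
# Toy sketch: each research area gets a hypothetical p (pennies per dollar
# going to capabilities); alignment receives the remaining 100 - p pennies.
# The p values below are illustrative guesses, not real assessments.

def alignment_share(p: int) -> float:
    """Fraction of a research dollar that goes to alignment, given p pennies to capabilities."""
    assert 0 <= p <= 100
    return (100 - p) / 100

example_assignments = {
    "GAN research": 98,      # mostly capabilities
    "interpretability": 10,  # mostly alignment
}

for area, p in example_assignments.items():
    print(f"{area}: {alignment_share(p):.2f} of each dollar to alignment")
```

This makes the model's weakness concrete: a single scalar p forces a clean split, with no way to represent overlap or double counting.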

Comment by Quinn (quinn-dougherty) on Saving Time · 2021-07-27T19:03:36.075Z · LW · GW

Is logical time at all like cost semantics? Also this: the cost semantics of a programming language are sort of like the "number of beta reductions" from lambda calculus.

Comment by Quinn (quinn-dougherty) on Quinn's Shortform · 2021-07-25T14:13:37.508Z · LW · GW

New discord server dedicated to multi-multi delegation research

DM me for invite if you're at all interested in multipolar scenarios, cooperative AI, ARCHES, social applications & governance, computational social choice, heterogeneous takeoff, etc.

(side note I'm also working on figuring out what unipolar worlds and/or homogeneous takeoff worlds imply for MMD research).

Comment by Quinn (quinn-dougherty) on What are some claims or opinions about multi-multi delegation you've seen in the memeplex that you think deserve scrutiny? · 2021-06-27T19:40:36.913Z · LW · GW

I think both sets of bullets (multi-multi (eco?)systems either replicating cooperation-etc.-as-we-know-it or creating new forms of cooperation etc.) are important; I'll call them prosaic cooperation and nonprosaic cooperation, respectively, going forward. When I say "cooperation etc." I mean cooperation, coordination, competition, negotiation, and compromise.

You've provided crisp scenarios, so thanks for that!

In some sense "most" of that work will presumably be done by AI systems, but doing the work ourselves may unlock those benefits much earlier.

But if the AI does that work there will be an interpretability problem, an inferential distance. I'm imagining people ask a somewhat single-single aligned AI for solutions to multi-multi problems and the black box returns something inscrutable. Putting ourselves in a position where we can grok what its recommendations are seems aligned with researching it for ourselves so we won't have to ask the black box in the first place, though this probably only applies to prosaic cooperation.

Comment by Quinn (quinn-dougherty) on What are some claims or opinions about multi-multi delegation you've seen in the memeplex that you think deserve scrutiny? · 2021-06-27T19:18:27.595Z · LW · GW

I wrote out the 2x2 grid you suggested in MS paint

I'm not sure I'm catching how multi-inner is game theory. Except that I think "GT is the mesa- of SCT" is an interesting, reasonable (to me) claim that is sort of blowing my mind as I contemplate it, so far.

Comment by Quinn (quinn-dougherty) on What are some claims or opinions about multi-multi delegation you've seen in the memeplex that you think deserve scrutiny? · 2021-06-27T18:52:48.025Z · LW · GW

Thanks! Trust, compromise, and communication are all items in Dafoe et al. 2020, if you're interested in exploring. I agree that primitive forms of these issues are present in multi-single and single-multi; it's not clear to me whether we should think of solving these primitive forms and then extending the solutions to multi-multi, or of attacking problems unique to multi-multi directly. It's just not clear to me which of those better reflects the nature of what's going on.

Comment by Quinn (quinn-dougherty) on Quinn's Shortform · 2021-06-24T13:28:55.068Z · LW · GW

No you're right. I think I'm updating toward thinking there's a region of nonprosaic short-timelines universes. Overall it still seems like that region is relatively much smaller than prosaic short-timelines and nonprosaic long-timelines, though.

Comment by Quinn (quinn-dougherty) on Quinn's Shortform · 2021-06-23T21:21:07.804Z · LW · GW

That's totally fair, but I have a wild guess that the pipeline from google brain to google products is pretty nontrivial to traverse, and not wholly unlike the pipeline from arxiv to product.

Comment by Quinn (quinn-dougherty) on Quinn's Shortform · 2021-06-23T21:19:51.124Z · LW · GW

I should've mentioned in OP but I was lowkey thinking upper bound on "short" would be 10 years.

I think developer ecosystems are incredibly slow (longer than ten years for a new PL to gain penetration, for instance). I guess under a singleton "one company drives TAI on its own" scenario this doesn't matter, because tooling tailored for a few teams internal to the same company is enough which can move faster than a proper developer ecosystem. But under a CAIS-like scenario there would need to be a mature developer ecosystem, so that there could be competition.

Comment by Quinn (quinn-dougherty) on Quinn's Shortform · 2021-06-23T11:35:25.328Z · LW · GW

nonprosaic ai will not be on short timelines

I think a property of my theory of change is that academic and commercial speed is a bottleneck. I recently realized that my probability mass assignment for timelines is synchronized with my mass assignment for the prosaic/nonprosaic axis. The basic idea: suppose a radical new paper that blows up and supplants the entire optimization literature gets pushed to the arxiv tomorrow, signaling the start of some paradigm we would call nonprosaic. The lag time for academics and industry to figure out what's going on, to figure out how to build on that result, and for developer ecosystems to form would all compound to take us outside of what we would call "short timelines".

How flawed is this reasoning?

Comment by Quinn (quinn-dougherty) on johnswentworth's Shortform · 2021-05-23T11:23:18.820Z · LW · GW

You might check out Donald Braben's view, it says "transformative research" (i.e. fundamental results that create new fields and industries) is critical for the survival of civilization. He does not worry that transformative results might end civilization.

Comment by Quinn (quinn-dougherty) on Quinn's Shortform · 2021-05-23T10:50:30.200Z · LW · GW

Predictable disagreements include

• There are causes in addition to the one you claim
• I don't define X as you do, to me X means...

1. intrinsic soundness - "challenging the clarity of a claim, relevance of reasons, or quality of evidence"
2. extrinsic soundness - "different ways of framing the problem, evidence you've overlooked, or what others have written on the topic."

The idea is to anticipate, acknowledge, and respond to both kinds of questions. This is the path to making an argument that readers will trust and accept.

Voicing too many hypothetical objections up front can paralyze you. Instead, what you should do before anything else is focus on what you want to say. Give that some structure, some meat, some life. Then, an important exercise is to imagine readers' responses to it.

I think cleaving these into two highly separated steps is an interesting idea, doing this with intention may be a valuable exercise next time I'm writing something.

View your argument through the eyes of someone who has a stake in a different outcome, someone who wants you to be wrong.

1. Why do you think there's a problem at all?
2. Have you properly defined the problem?
3. Is your solution practical or conceptual?
4. Have you stated your claim too strongly?
5. Why is your practical/conceptual solution better than others?

1. "I want to see a different kind of evidence" i.e. hard numbers over anecdotes / real people over cold numbers
2. "It isn't accurate"
3. "It isn't precise enough"
4. "It isn't current"
5. "It isn't representative"
6. "It isn't authoritative"
7. "You need more evidence"

It builds credibility to play defense: to recognize your own argument's limitations. It builds even more credibility to play offense: to explore alternatives to your argument and bring them into your reasoning. If you can, you might develop those alternatives in your own imagination, but more likely you'd like to find alternatives in your sources.

What is the right number of objections to acknowledge? Acknowledging too many can distract readers from the core of your argument, while acknowledging too few signals laziness or even disrespect. You need to narrow your list of alternatives or objections by subjecting them to the following priorities

• plausible charges of weaknesses that you can rebut
• alternative lines of argument important in your field
• alternative conclusions that readers want to be true
• alternative evidence that readers know
• important counterexamples that you have to address

What if your argument is flawed? The best thing to do is candidly acknowledge the issue and respond that...
• the rest of your argument more than balances the flaw
• while the flaw is serious, more research will show a way around it
• while the flaw makes it impossible to accept your claim fully, your argument offers important insight into the question and suggests what a better answer would need.

It is wise to build up good faith by acknowledging questions you can't answer. Concessions are often interpreted as positive signals by the reader.

It is important for your responses to acknowledgments to be subordinate to your main point, or else the reader will miss the forest for the trees.

Remember to make an intentional decision about how much credence to give to an objection or alternative. Weaker ones warrant weaker credences, which in turn warrant less effort in your acknowledgment and response.

Comment by Quinn (quinn-dougherty) on Quinn's Shortform · 2021-05-19T16:33:07.571Z · LW · GW

I asked a friend whether I should TA for a codeschool called ${{codeschool}}. "You shouldn't hang around ${{codeschool}}. People at ${{codeschool}} are not pursuing excellence." The hidden claim that I would soak up the pursuit of non-excellence by proximity or osmosis isn't what's interesting (though I could see that turning out either way). What's interesting is the value of non-excellence, which I'll call adequacy. ${{codeschool}} in this case is effective and impactful at putting butts in seats at companies, and is thereby responsible for some negligible slice of economic growth. Its students and instructors are plentiful with the virtue of getting things done; do they really need the virtue of high craftsmanship? The student who reads SICP and TAPL because they're pursuing mastery over the very nature of computation is strictly less valuable to the economy than the student who reads React tutorials because they're pursuing some cash.

Obviously, my friend who was telling me this was of the SICP/TAPL type. In software, this is problematic: lisp and type theory will sharpen your thinking about the nature of computation, but will they sharpen your thinking about the social problem of steering a team? From an employer's perspective, it is naive to prefer excellence over adequacy; it is much wiser to saddle the excellent person with the burden of proving that they won't get bored easily.

Clever kids in Ravenclaw, evil kids in Slytherin, wannabe heroes in Gryffindor, and everyone who does the actual work in Hufflepuff.

Hufflepuffs can go far, and the fuel is adequacy. Enough competence to get it done, any more is egotistical, a sunk cost.

But what if it's not about industry/markets, what if it's about the world's biggest problems? Don't we want people who are more competent than strictly necessary to be working on them? Maybe, maybe not.

Related: explore/exploit, become great/become useful

For a long time I've operated in the excellence mindset: more energy for struggling with textbooks than for exploiting the skills I already have to ship projects and participate in the real world. Thinking it might be good to shift gears and flex my hufflepuff virtues more.

Comment by Quinn (quinn-dougherty) on Quinn's Shortform · 2021-05-19T11:59:24.792Z · LW · GW

thoughts on chapter 9 of Craft of Research

Getting the easy things right shows respect for your readers and is the best training for dealing with the hard things.

If they don't believe the evidence, they'll reject the reasons and, with them, your claim.

We saw previously that claims ought to be supported with reasons, and reasons ought to be based on evidence. Now we will look closer at reasons and evidence.

Reasons must be in a clear, logical order. Atomically, readers need to buy each of your reasons, but compositionally they need to buy your logic. Storyboarding is a useful technique for arranging reasons into a logical order: physical arrangements of index cards, or some DAG-like syntax. Here, you can list evidence you have for each reason or, if you're speculating, list the kind of evidence you would need.
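The "DAG-like syntax" for a storyboard could be as simple as a mapping from each claim or reason to the items that support it (the node names and example argument below are placeholders of my own, not from the book):

```python
# A minimal storyboard as a DAG: each node maps to the nodes supporting it.
# Node names and the example structure are invented for illustration.
storyboard = {
    "claim": ["reason A", "reason B"],
    "reason A": ["evidence A1", "evidence A2"],
    "reason B": ["evidence B1"],
}

def top_level_reasons(dag, root="claim"):
    """Read out only the top-level reasons, ignoring the evidence beneath them."""
    return dag[root]

# Check the high-level logic without looking at the details.
print(top_level_reasons(storyboard))
```

Index cards on a table encode exactly this structure; the point of either representation is that you can read the top level on its own.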

When storyboarding, you want to read out the top level reasons as a composite entity without looking at the details (evidence), because you want to make sure the high-level logic makes sense.

Readers will not accept a reason until they see it anchored in what they consider to be a bedrock of established fact. ... To count as evidence, a statement must report something that readers agree not to question, at least for the purposes of the argument. But if they do question it, what you think is hard factual evidence is for them only a reason, and you have not yet reached that bedrock of evidence on which your argument must rest.

I think there is a contract between you and the reader. You must agree to cite sources that are plausibly truthful, and your reader must agree to accept that these sources are reliable. A diligent and well-meaning reader can always second-guess whether, for instance, the bureau of subject matter statistics is collecting and reporting data correctly, but at a certain point this violates the social contract. If they're genuinely curious or concerned, it may fall on them to investigate the source, not on you. The bar you need to meet is that your sources are plausibly trustworthy. The book doesn't talk much about this contract, so there's little I can say about what "plausible" means.

Sometimes you have to be extra careful to distinguish reasons from evidence. A (<claim>, <reason>, <evidence>) tuple is subject to regress in the latter two components: (A, B, C) may need to be justified by (B, C, D), and so on. The example given of this regress is if I told you (American higher education must curb escalating tuition costs, because the price of college is becoming an impediment to the American dream, today a majority of students leave college with a crushing debt burden). In the context of this sentence, "a majority of students..." is evidence, but it would be reasonable to ask for more specifics. In principle, any time information is compressed it may be reasonable to ask for more specifics. A new tuple might look like (the price of college is becoming an impediment to the American dream, because today a majority of students leave college with a crushing debt burden, in 2013 nearly 70% of students borrowed money for college with loans averaging $30,000...). The third component is still compressing information, but it's not in the contract between you and the reader for the reader to demand the raw spreadsheet, so this second tuple might be a reasonable stopping point of the regress.

If you can imagine readers plausibly asking, not once but many times, how do you know that? What facts make it true?, you have not yet reached what readers want - a bedrock of uncontested evidence.

Sometimes you have to be careful to distinguish evidence from reports of it. Again, because we are necessarily dealing with compressed information, we can't often point directly to evidence. Even a spreadsheet, rather than summary statistics of it, is a compression of the phenomena in base reality that it tracks.

data you take from a source have invariably been shaped by that source, not to misrepresent them, but to put them in a form that serves that source's ends. ... when you in turn report those data as your own evidence, you cannot avoid manipulating them once again, at least by putting them in a new context.

There are criteria you want to screen your evidence against:

• sufficient
• representative
• accurate
• precise
• authoritative

Being honest about the reliability and prospective accuracy of evidence is always a positive signal. Evidence can be either too precise or not precise enough. The women in one or two of Shakespeare's plays do not represent all his women; they are not representative. Figure out what sorts of authority signals are considered credible in your community, and seek to emulate them.

Comment by Quinn (quinn-dougherty) on Quinn's Shortform · 2021-05-17T15:07:26.828Z · LW · GW

Claims - thoughts on chapter eight of Craft of Research

Broadly, the two kinds of claims are conceptual and practical.

Conceptual claims ask readers not to ask, but to understand. The flavors of conceptual claim are as follows:

• Claims of fact or existence
• Claims of definition and classification
• Claims of cause and consequence
• Claims of evaluation or appraisal

There's essentially one flavor of practical claim

• Claims of action or policy.

If you read between the lines, you might notice that a kind of claim of fact or cause/consequence is that a policy works or doesn't work to bring about some end. In this case, we see that practical claims deal in ought or should. There is a difference, perhaps subtle perhaps not, between "X brings about Y" and "to get Y we ought to X".

Readers expect a claim to be specific and significant. You can evaluate your claim along these two axes.

To make a claim specific, you can use precise language and explicit logic. Usually, precision comes at the cost of a higher word count. To gain explicitness, use words like "although" and "because". Note some fields might differ in norms.

You can think of a claim's significance as the degree to which it asks readers to change their minds, or I suppose even their behavior.

While we can't quantify significance, we can roughly estimate it: if readers accept a claim, how many other beliefs must they change?

Avoid arrogance.

As paradoxical as it seems, you make your argument stronger and more credible by modestly acknowledging its limits.

Two ways of avoiding arrogance are acknowledging limiting conditions and using hedges to limit certainty.

Don't run aground: there are innumerable caveats you could think of, so it's important to limit yourself to the most relevant ones, or the ones readers would most plausibly think of. Limiting certainty with hedging is illustrated by Watson and Crick, publishing what would become a high-impact result: "We wish to suggest ... in our opinion ... we believe ... Some ... appear"

without the hedges, Crick and Watson would be more concise but more aggressive.

In most fields, readers distrust flatfooted certainty

It is not obvious how to walk the line between hedging too little and hedging too much.

Comment by Quinn (quinn-dougherty) on Quinn's Shortform · 2021-05-12T13:36:35.097Z · LW · GW

Good arguments - notes on Craft of Research chapter 7

Arguments take place in 5 parts.

1. Claim: What do you want me to believe?
2. Reasons: Why should I agree?
3. Evidence: How do you know? Can you back it up?
4. Acknowledgment and Response: But what about ... ?
5. Warrant: How does that follow?

This can be modeled as a conversation with readers, where the reader prompts the writer to take the next step on the list.

Claims ought to be supported with reasons. Reasons ought to be based on evidence. Arguments are recursive: part of an argument is an acknowledgment of an anticipated response, and another argument addresses that response. Finally, when the distance between a claim and a reason grows large, we draw connections with something called warrants.

The logic of warrants proceeds in generalities and instances. A general circumstance predictably leads to a general consequence, and if you have an instance of the circumstance you can infer an instance of the consequence.

Arguing in real-life papers is more complex than the 5 steps, because

• Claims should be supported by two or more reasons
• A writer can anticipate and address numerous responses. As I mentioned, arguments are recursive, especially in the anticipated response stage, but also each reason and warrant can necessitate a subargument.

You might embrace a claim too early, perhaps even before you have done much research, because you "know" you can prove it. But falling back on that kind of certainty will just keep you from doing your best thinking.

Comment by Quinn (quinn-dougherty) on Quinn's Shortform · 2021-05-12T13:03:32.587Z · LW · GW

Sources - notes on Craft of Research chapters 5 and 6

Primary, secondary, and tertiary sources

Primary sources provide you with the "raw data" or evidence you will use to develop, test, and ultimately justify your hypothesis or claim. Secondary sources are books, articles, or reports that are based on primary sources and are intended for scholarly or professional audiences. Tertiary sources are books and articles that synthesize and report on secondary sources for general readers, such as textbooks, articles in encyclopedias, and articles in mass-circulation publications.

The distinction between primary and secondary sources comes from 19th century historians, and the idea of tertiary sources came later. The boundaries can be fuzzy, and are certainly dependent on the task at hand.

I want to reason about what these distinctions look like in the alignment community, and whether or not they're important.

The rest of chapter five is about how to use libraries and information technologies, and evaluating sources for relevance and reliability.

Chapter 6 starts off with the kind of thing you should be looking for while you read

Look for creative agreement

• Offer additional support. You can offer new evidence to support a source's claim.
• Confirm unsupported claims. You can prove something that a source only assumes or speculates about.
• Apply a claim more widely. You can extend a position.

Look for creative disagreement

• Contradictions of kind. A source says something is one kind of thing, but it's another.
• Part-whole contradictions. You can show that a source mistakes how the parts of something are related.
• Developmental or historical contradictions. You can show that a source mistakes the origin or development of a topic.
• External cause-effect contradictions. You can show that a source mistakes a causal relationship.
• Contradictions of perspective. Most contradictions don't change a conceptual framework, but when you contradict a "standard" view of things, you urge others to think in a new way.

The rest of chapter 6 is a few more notes about what you're looking for while reading (evidence, reasons), how to take notes, and how to stay organized while doing this.

The alignment community

I think I see the creative agreement modes and the creative disagreement modes floating around in posts. Would it be more helpful if writers decided on one or two of these modes before sitting down to write?

Moreover, what is a primary source in the alignment community? Surely if one is writing about inner alignment, a primary source is the Risks from Learned Optimization paper. But what are Risks' primary, secondary, tertiary sources? Does it matter?

Now look at Arbital. Arbital started off as a tertiary source, but articles that seemed more like primary sources started appearing there. I remember distinctly thinking "what's up with that?" It struck me as awkward for Arbital to change its identity like that, but I end up thinking about and citing the articles that seem more like primary sources.

There's also the problem that ideas circulating in the memeplex without being written down are the real "primary" source, while the first person who happens to write them down looks like they're writing a primary source when in fact what they're doing is more like writing a secondary or even tertiary source.

Comment by Quinn (quinn-dougherty) on Quinn's Shortform · 2021-05-10T15:42:36.017Z · LW · GW

Questions and Problems - thoughts on chapter 4 of Craft of Research

Last time we discussed the difference between information and a question or a problem, and I suggested that the novelty-satisfying mode of information presentation isn't as good as addressing actual questions or problems. In chapter 3, which I have not typed up thoughts about, a three-step procedure is introduced:

1. Topic: "I am studying ..."
2. Question: "... because I want to find out what/why/how ..."
3. Significance: "... to help my reader understand ..."

As we elaborate on the different kinds of problems, we will vary this framework and launch exercises from it.

Some questions raise problems, others do not. A question raises a problem if not answering it keeps us from knowing something more important than its answer.

The basic feedback loop introduced in this chapter relates practical with conceptual problems and relates research questions with research answers.

Practical problem -> motivates -> research question -> defines -> conceptual/research problem -> leads to -> research answer -> helps to solve -> practical problem (loop)

What should we do vs. what do we know - practical vs conceptual problems

Opposite eachother in the loop are practical problems and conceptual problems. Practical problems are simply those which imply uncertainty over decisions or actions, while conceptual problems are those which only imply uncertainty over understanding. Concretely, your bike chain breaking is a practical problem because you don't know where to get it fixed, implying that the research task of finding bike shops will reduce your uncertainty about how to fix the bike chain.

Conditions and consequences

The structure of a problem is that it has a condition (or situation) and the (undesirable) consequences of that condition. The consequences-costs model of problems holds both for practical problems and conceptual problems, but comes in slightly different flavors. In the practical problem case, the condition and costs are immediate and observed. However, a chain of "so what?" must be walked.

Readers judge the significance of your problem not by the cost you pay but by the cost they pay if you don't solve it... To make your problem their problem, you must frame it from their point of view, so that they see its cost to them.

One person's cost may be another person's condition, so when stating the cost you ought to imagine a socratic "so what?" voice, forcing you to articulate more immediate costs until the socratic voice has to really reach in order to say that it's not a real cost.

The conceptual problem case is where intangibles play in. The condition in that case is always the simple lack of knowledge or understanding of something. The cost in that case is simple ignorance.

Modus tollens

A helpful exercise: if you find yourself saying "we want to understand x so that we can y", try flipping it to "we can't y if we don't understand x". This shifts the burden onto the reader to provide ways in which we can y without understanding x. You can do this iteratively: come up with z's which you can't do without y, and so on.

Pure vs. applied research

Research is pure when the significance stage of the topic-question-significance frame refers only to knowing, not to doing. Research is applied when the significance step refers to doing. Notice that the question step, even in applied research, refers to knowing or understanding.

Connecting research to practical consequences

You might find that the significance stage is stretching a bit to relate the conceptual understanding gained from the question stage. Sometimes you can modify and add a fourth step to the topic-question-significance frame and make it into topic-conceptual question-conceptual significance-possible practical application. Splitting significance into two helps you draw reasonable, plausible applications. A claimed application is a stretch when it is not plausible. Note: the authors suggest that there is a class of conceptual papers in which you want to save practical implications entirely for the conclusion, that for a certain kind of paper practical applications do not belong in the introduction.

AI safety

One characteristic of AI safety that makes it difficult both to do and to interface with is that the chains of "so what?" are often very long. The path from deconfusion research to everyone dying or not dying feels like a stretch if not done carefully, and has a lot of steps when done carefully. As I mentioned in my last post, it's easy to get sucked into the "novel information for its own sake" regime, at least as a reader. More practically oriented approaches are perhaps those that seek new regimes for how to even train models, where the "so what?" is answered "so we have dramatically fewer OODR failures" or something. The condition-costs framework seems really beneficial for articulating alignment agendas and directions.

Misc

• "Researchers often begin a project without a clear idea of what the problem even is."
• Look for problems as you read. When you see contradictions, inconsistencies, incomplete explanations tentatively assume that readers would or should feel the same.
• Ask not "Can I solve it?" but "will my readers think it ought to be solved?"
• "Try to formulate a question you think is worth answering, so that down the road, you'll know how to find a problem others think is worth solving."

Comment by Quinn (quinn-dougherty) on Quinn's Shortform · 2021-05-10T11:45:24.043Z · LW · GW

The audience models of research - thoughts on Craft of Doing Research chapter 2

Writers can't avoid creating some role for themselves and their readers, planned or not

1. I've found some new and interesting information - I have information for you
2. I've found a solution to an important practical problem - I can help you fix a problem
3. I've found an answer to an important question - I can help you understand something better

The authors recommend assuming one of these three roles. There is of course a wider gap between information and the neighborhood of problems and questions than there is between problems and questions! Later, in chapter four, the authors provide a graph illustrating problems and questions: Practical problem -> motivates -> Research question -> defines -> Conceptual/research problem. Information provided mostly for novelty, however, is not in this cycle. Information can be leveled at problems or questions, and plays a role in providing solutions or answers, but can also be for "its own sake".

I'm reminded of a paper/post I started but never finished, on providing a poset-like structure to capabilities. I thought it would be useful if you could give a precise ordering on a set of agents, to assign supervising/overseeing responsibilities. Looking back, providing this poset would just be a cool piece of information, effectively: I wasn't motivated by a question or problem so much as "look at what we can do". Yes, I can post-hoc think of a question or a problem that the research would address, but that was not my prevailing seed of a reason for starting the project. Is the role of the researcher primarily a writing thing, though, applying mostly to the final draft? Perhaps it's appropriate for early stages of the research to involve multi-role drifting, even if it's better for the reader experience if you settle on one role in the end.

Additionally, it occurs to me that maybe "I have information for you" mode is just a cheaper version of the question/problem modes. Sometimes I think of something that might lead to cool new information (either a theory or an experiment), and I'm engaged more by the potential for novelty than by the potential for applications.

I think I'd like to become more problem-driven: to derive possibilities for research from problems, and make sure I'm not just seeking novelty. At the end of the day, I don't think these roles are "equal"; I think the problem-driven role is the best one, the one we should aspire to.

[When you adopt one of these three roles, you must] cast your readers in a complementary role by offering them a social contract: "I'll play my part if you play yours ... if you cast them in a role they won't accept, you're likely to lose them entirely ... You must report your research in a way that motivates your readers to play the role you have imagined for them."

The three reader roles complementing the three writer roles are

1. Entertain me
2. Help me solve my practical problem
3. Help me understand something better

It's basically stated that your choice of writer role implies a particular reader role, 1 mapping to 1, 2 mapping to 2, and 3 mapping to 3.

Role 1 speaks to an important difficulty in the x-risk/EA/alignment community: how not to get drawn into the phenomenal sensation of insight when something isn't going to help you on a problem. At my local EA meetup I sometimes worry that the impact of our speaker events is low, because the audience may not meaningfully update even though they're intellectually engaged. Put another way, intellectual engagement is goodhartable: the sensation of insight can distract you from your resolve to shatter your bottlenecks and save the world if it becomes an end in itself. Should researchers who want to be careful about this avoid the first role entirely? Should the alignment literature look upon the first reader role as a failure mode? We talk about a lot of cool stuff, and it can be easy to be drawn in by the cool factor, like some of the non-EA rationalists I've met at meetups.

I'm not saying reader role number two absolutely must dominate, because it can diverge from deconfusion which is better captured by reader role number three.

Division of labor between reader and writer, writer roles do not always imply exactly one reader role

Isn't it the case that deconfusion (writer role three) research can be disseminated to practically-minded (as opposed to theoretically-minded) people, who then turn question-answer into problem-solution? You can write in the question-answer regime, but there may be that (rare) reader who interprets it in the problem-solution regime! This seems like an extremely good thing that we should find a way to encourage. In general, reading that drifts across multiple roles seems like the most engaged kind of reading.

Comment by quinn-dougherty on [deleted post] 2021-05-05T10:24:09.205Z
Comment by Quinn (quinn-dougherty) on [timeboxed exercise] write me your model of AI human-existential safety and the alignment problems in 15 minutes · 2021-05-04T19:11:36.097Z · LW · GW

Given that systems of software which learn can eventually bring about 'transformative' impact (defined as 'impact comparable to the industrial revolution'), the most important thing to work on is AI. Given that the open problems in learning software between now and its transformativity can be solved in a multitude of ways, some of those solutions will be more or less beneficial, less or more dangerous, meaning there's a lever that altruistic researchers can use to steer outcomes in these open problems. Given the difficulty of social dilemmas and coordination, we need research that is aimed at improving single-multi, multi-single, and multi-multi capabilities until those capabilities outpace single-single capabilities. Given the increase in economic and military power implied by transformative systems, civilization could be irrevocably damaged by simple coordination failures.

Comment by Quinn (quinn-dougherty) on Quinn's Shortform · 2021-05-04T15:10:38.222Z · LW · GW

There's a gap in my inside view of the problem. Part of me thinks that capabilities progress, such as out-of-distribution robustness or the four tenets described in Open Problems in Cooperative AI, is necessary for AI to be transformative, i.e. a prereq of TAI; another part of me thinks AI will be x-risky and unstable if it progresses along other axes but not along those capabilities.

There's a geometry here: transformative / not transformative crossed with dangerous / not dangerous.

To have an inside view I must be able to adequately navigate between the quadrants with respect to outcomes, interventions, etc.

Comment by Quinn (quinn-dougherty) on We need a career path for invention · 2021-05-04T12:40:32.429Z · LW · GW

You might like Scientific Freedom by Donald Braben. It's a whole book about the problem of developing incentives for basic research.

Comment by Quinn (quinn-dougherty) on Quinn's Shortform · 2021-05-04T11:52:48.493Z · LW · GW

notes (from a very jr researcher) on alignment training pipeline

Training for alignment research is one part competence (at math, cs, philosophy) and another part having an inside view / gears-level model of the actual problem. Competence can be outsourced to universities and independent study, but inside view / gears-level model of the actual problem requires community support.

A background assumption I'm working with is that training as a longtermist is not always synchronized with legible-to-academia training. It might be the case that jr researchers ought to publication-maximize for a period of time even if it's at the expense of their training. This does not mean that training as a longtermist is always or even often orthogonal to legible-to-academia training, it can be highly synchronized, but it depends on the occasion.

It's common to ask what relative ratio should be assigned to competence building (textbooks, exercises) vs. understanding the literature (reading papers and the Alignment Forum), but perhaps there is a third category: honing your threat model and theory of change.

I spoke with a sr researcher recently who roughly said that a threat model paired with a theory of change is almost sufficient for an inside view / gears-level model. I'm working from the theory that a honed threat model and your theory of change are important for calculating interventions. See Alice and Bob in Rohin's FAQ.

I've been trying to hone my inside view / gears-level model of the actual problem by doing exercises weekly with a group of peers. But the sr researcher I spoke to said mentorship trees of 1:1 time, not exercises that jrs can do independently or in groups, are the only way it can happen. This is troublesome to me, as the bottleneck becomes mentors' time. I'm not so much worried about the hopefully merit-based process of mentors figuring out who's worth their time as I am about the overall throughput. It gets worse, though: what if the process is credentialist?

Take a look at the Critch quote from the top of Rohin's faq:

I get a lot of emails from folks with strong math backgrounds (mostly, PhD students in math at top schools) who are looking to transition to working on AI alignment / AI x-risk.

Is he implicitly saying that he offloads some of the filtering work to admissions people at top schools? Presumably people from non-top schools are also emailing him, but he doesn't mention them.

I'd like to see a claim that admissions people at top schools are trustworthy; no one has argued this to my knowledge. I think the movement sometimes falls back on status games, unless there is some intrinsic benefit to "top schools" (besides building social power/capital) that everyone is aware of. (Indeed, if someone's argument is that they identified a lever that requires a lot of social power/capital, then a top school on their resume may be of use; but if the lever is strictly high-quality research (instead of, say, steering a federal government), this doesn't seem to apply.)

Comment by Quinn (quinn-dougherty) on AMA: Paul Christiano, alignment researcher · 2021-05-02T11:45:09.039Z · LW · GW

If anyone's interested, I took a crack at writing down a good successor criterion.

Comment by Quinn (quinn-dougherty) on Announcing the Technical AI Safety Podcast · 2021-05-01T22:00:55.731Z · LW · GW

Thanks for reaching out! Alex had passed onto me the note about transcripts, I hope to get to it (including the backlog of already released episodes) in the next few months.

Comment by Quinn (quinn-dougherty) on Averting suffering with sentience throttlers (proposal) · 2021-04-26T23:55:10.063Z · LW · GW

Right, I feel like there's a tradeoff between interestingness of consciousness theory and the viability of computational predicates. IIT gives you a nice computer program, but isn't very interesting.

Comment by Quinn (quinn-dougherty) on Could degoogling be a practice run for something more important? · 2021-04-22T12:24:03.711Z · LW · GW

I think the litmus test for the value of reducing dependency on a given product/technology is whether we think it's empowering or enfeebling. Consider arithmetic calculators: is it empowering to delegate boring stuff to subroutines freeing up your mind to do harder stuff, or is it enfeebling because it reduces incentive to learn to do mental arithmetic well? Dependence can be a problem in either case.

Each product needs to be assessed individually.

Comment by Quinn (quinn-dougherty) on Open and Welcome Thread - April 2021 · 2021-04-05T17:12:50.852Z · LW · GW

I'm trying to decide if i'm going to write up a thought about longtermism I had.

I think there are two schools of thought-- that the graph of a value function over time is continuous or discontinuous. The continuous school of thought suggests that you get near term evidence about long term consequences, and the discontinuous school of thought does not interpret local perturbation in this way at all.

I'm sure this is covered in one of the many posts about longtermism, and the language of continuous functions could either make it clearer or less clear depending on the audience.

Comment by Quinn (quinn-dougherty) on Takeaways from the Intelligence Rising RPG · 2021-03-05T11:30:34.527Z · LW · GW

I can't post a complete ruleset, but I can add some insight: each party had "stats" representing hard power, soft power, budget, that sort of thing. Each turn you could spend "talent" stats on research arbitrarily, and you could take two "actions", which were GM-mediated expenditures of things like soft power, budget, etc. The game board was a list of papers and products that could be unlocked; unlocking papers released new products onto the board.

Comment by Quinn (quinn-dougherty) on Reading recommendations on social technology: looking for the third way between technocracy and populism · 2021-02-24T20:46:38.606Z · LW · GW

Isn't increasing the competence of the voter akin to increasing the competence of the official, by proxy? I'm pattern-matching this to yet another push-pull compromise between the ends of the spectrum, with a strong lean toward technocracy's side.

I'm assuming I'll have to read Brennan for his response to the criticism that this was tried in the U.S. and made a lot of people very upset / is widely regarded as a bad move.

I agree with Gerald Monroe about the overall implementation problems even if you assume it wouldn't just be a proxy for race or class war (which I think is a hefty "if").

Just doesn't seem like "off the spectrum" thinking to me, though it may be the case that reading Brennan will improve my appreciation of the problem.

Comment by Quinn (quinn-dougherty) on Scott and Rohin doublecrux on AI with human models · 2021-02-22T18:16:17.682Z · LW · GW

should i be subscribed to a particular youtube channel where these things get posted?

Comment by Quinn (quinn-dougherty) on Anki decks by LW users · 2021-02-21T14:49:31.346Z · LW · GW

Quick Bayes Table, by alexvermeer. A simple deck of cards for internalizing conversions between percent, odds, and decibels of evidence.
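For the curious, the conversions that deck drills can be sketched in a few lines (the function names here are mine, not the deck's):

```python
import math

def prob_to_odds(p):
    """Convert a probability to odds in favor, p : (1 - p)."""
    return p / (1 - p)

def odds_to_prob(odds):
    """Invert prob_to_odds."""
    return odds / (1 + odds)

def odds_to_decibels(odds):
    """Decibels of evidence: 10 * log10(odds)."""
    return 10 * math.log10(odds)

def decibels_to_odds(db):
    """Invert odds_to_decibels."""
    return 10 ** (db / 10)
```

So 50% is 1:1 odds, which is 0 dB, and a +10 dB update takes you from 1:1 to 10:1 odds, i.e. about 91%.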

Comment by Quinn (quinn-dougherty) on Lessons I've Learned from Self-Teaching · 2021-01-25T12:23:27.640Z · LW · GW

Leverage the Pareto principle, get 80% of the benefit out of the key 20/30/40% of the concepts and exercises, and then move on.

This is hard to instrumentalize regarding difficulty. I find that the hardest exercises are likeliest to be skipped (after struggling with them for an hour or two), but it doesn't follow that I can expect the easier ones (which I happened to have completed) to lie in that key 20%.

Comment by Quinn (quinn-dougherty) on Quinn's Shortform · 2021-01-16T17:59:13.686Z · LW · GW

::: latex :::

Comment by Quinn (quinn-dougherty) on Quinn's Shortform · 2021-01-16T17:57:39.185Z · LW · GW

:::hm? x :: Bool -> Int -> String :::

Comment by Quinn (quinn-dougherty) on Quinn's Shortform · 2021-01-16T17:52:33.634Z · LW · GW

testing latex in spoiler tag

Testing code block in spoiler tag

Comment by Quinn (quinn-dougherty) on Infodemics: with Jeremy Blackburn and Aviv Ovadya · 2021-01-08T15:46:11.239Z · LW · GW

7p on Thursday the 14th for New York, 4p in San Francisco

Comment by Quinn (quinn-dougherty) on Announcing the Technical AI Safety Podcast · 2020-12-08T19:02:02.930Z · LW · GW

When I submitted to pocketcasts it said we were already on it :) https://pca.st/9froevor

Comment by Quinn (quinn-dougherty) on Have general decomposers been formalized? · 2020-07-10T15:11:55.172Z · LW · GW

Thank you Abram. Yes, factored cognition is more what I had in mind. However, I think it's possible to speak of decomposition generally enough to say that PCA/SVD is a decomposer, albeit an incredibly parochial one that's not very useful to factored cognition.

Like, my read of IDA is that the distillation step is proposing a class of algorithms, and we may find that SVD was a member of that class all along.

Comment by Quinn (quinn-dougherty) on How ought I spend time? · 2020-06-30T23:43:23.413Z · LW · GW

I'll check out Lynette's post.

I'd like to take a shot at technical AI alignment

Comment by Quinn (quinn-dougherty) on How ought I spend time? · 2020-06-30T21:14:48.829Z · LW · GW

What granularity of time are you talking about? When you "never maintain 1 and 2 at the same time", is that any given minute, or any given decade?

I would say every couple months is an opportunity to either pivot or continue.

Comment by Quinn (quinn-dougherty) on Have general decomposers been formalized? · 2020-06-28T17:35:55.391Z · LW · GW

Sorry, I think I might have a superficial understanding of encoders and embeddings. Would you be able to try pointing out for me how decomposition is performed in that case (or point me toward a favorite reading on the subject)? When I think of feeding a sentence into an encoder, I can think of multiple ways in which some compositional structure might be inferred.

I'm drawing up a proof of concept with seq2seq learners right now, but my hypothesis is that they will be inadequate decomposers suitable only for benchmarking a baseline.

Comment by Quinn (quinn-dougherty) on The Politics of Age (the Young vs. the Old) · 2019-03-30T04:35:35.852Z · LW · GW

SITG-suffrage: Sorry, by this point OP and I had established "right to vote weighted by stake" as a concept, using the words "skin in the game", so SITG was an acronym for skin-in-the-game, and suffrage referred to the right to vote.

Parents are different from any other group in my comment because I was referencing Richard Kennaway's question "Does having children whose future you care about also count as skin in the game?"

Comment by Quinn (quinn-dougherty) on The Unexpected Philosophical Depths of the Clicker Game Universal Paperclips · 2019-03-30T04:29:21.883Z · LW · GW

A year or two before the paperclip version came out I played a lot of AdVenture Capitalist (and its sequel, wait for it, AdVenture Communist). I wondered to myself whether reinforcement learning researchers would find it interesting, and whether DeepMind would start training up agents to compete in AdVenture Capitalist tournaments.

Comment by Quinn (quinn-dougherty) on The Politics of Age (the Young vs. the Old) · 2019-03-27T18:29:16.647Z · LW · GW

Does having children whose future you care about also count as skin in the game?

Unclear. There's a lot to unpack, because we don't know the 1. narcissism or 2. epistemic competence distributions across parents. I.e., we can't expect that what parents say is in their kids' interests actually is in their kids' interests (whether through willful misdirection or through earnest mistakes).

Or you can say that your skin-in-the-game factor is proportional to how much you've already invested in the status quo. If you've spent 50 years working towards a goal, it seems unfair that a 16-year-old know-nothing should be able, on a whim, to throw all of that away.

I don't mean to guilt-by-association dismiss this, but it strongly reminds me of the property/land interpretation of SITG-suffrage.

The risk of 16-year-old know-nothings throwing things away on a whim is measured against the risk of bad "tradition is the democracy of the dead" / "the most insolent of tyrannies is to govern from beyond the grave" scenarios. Which equilibrium is worse: a civilization unable to cooperate across lifetimes (because kids constantly throw everything away and start over, reinventing wheels and repeating mistakes), or one where adults only inherit agency at age 70, by which point all they care about is the same stuff the previous 70+ cohort cared about? I think "epistemically defer to the elderly when it seems wise to do so" is a more beneficial heuristic than "we owe the elderly deference for the sacrifices they made before I was born", and if we're going to bet on the distribution of how responsibly we expect these heuristics to scale, I'd much rather bet on the former.

Comment by Quinn (quinn-dougherty) on The Politics of Age (the Young vs. the Old) · 2019-03-24T20:20:08.536Z · LW · GW

A skin-in-the-game vote multiplier based on age might look like mean lifespan minus your age: the logical consequence of saying that people who have to live with outcomes longer ought to weigh more heavily in shaping them. It should floor out at around 1 at the upper limit, and the lower limit should come from the enforceability of anti-fraud measures (i.e., effectiveness at stopping parents from using kids who can't walk yet for extra votes) rather than from anyone's intuitions about when kids can think for themselves.

If some experts got together and said that brain development, knowledge, wisdom, etc. peaks at N, then you'd want the multiplier to be convex with a max at N.

With functions like these, averages between them, etc., there's a lot of material to play with, in terms of starting with one-person-one-vote and fixing its weirdness with multipliers.
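A minimal sketch of the multipliers above (the constants, function names, and the particular blend are mine, purely illustrative):

```python
def remaining_lifespan_multiplier(age, mean_lifespan=80):
    # "mean lifespan - your age", floored at 1 at the upper limit
    # so the elderly still retain a full vote
    return max(1.0, mean_lifespan - age)

def peak_multiplier(age, peak_age=40, width=40):
    # a bump maxing out at the age N where development/knowledge/wisdom
    # supposedly peaks, falling off on either side of N
    return max(0.0, 1.0 - ((age - peak_age) / width) ** 2)

def averaged_multiplier(age):
    # one of many possible blends of the two heuristics
    return (remaining_lifespan_multiplier(age) + peak_multiplier(age)) / 2
```

For example, under these particular constants a 30-year-old's vote would carry a remaining-lifespan weight of 50, while a 90-year-old's would floor out at 1.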

Maybe the latest in voting theory or the current stage of quadratic voting research already considered all this and came up with something more promising.

Comment by Quinn (quinn-dougherty) on Do the best ideas float to the top? · 2019-02-24T16:42:18.867Z · LW · GW

IMO, this is what I briefly suggested by linking to Scott's Against Murderism with the words "misleading compression", i.e., I think describing a policy as murderistic and optimizing for stories are each instances of misleading compression.

If it’s only stories which matter, yet you split your efforts between stories and reality, then you will likely be outcompeted by someone who spent all of their resources on crafting good stories.

This is 100% what I find alarming about misinformation (both the malicious kind and the emergent/inadequate kind), and I don't know a reason why alignment via debate would be resilient.

Comment by Quinn (quinn-dougherty) on Do the best ideas float to the top? · 2019-02-24T16:25:46.303Z · LW · GW

Sorry. The point was NAT; density_{1,2,3} was devised scaffolding for the MVB (minimum viable blogpost). I imagine NAT has already been discovered, discussed, problematized, etc. somewhere, but I couldn't find it. I have a background assumption that attention economists are competent and well-intentioned people, so I trust that they have the situation under control.