Posts

HarrisonDurland's Shortform 2023-02-22T22:50:01.158Z
Why is "Argument Mapping" Not More Common in EA/Rationality (And What Objections Should I Address in a Post on the Topic?) 2022-12-23T21:58:18.870Z

Comments

Comment by HarrisonDurland on How to respond to the recent condemnations of the rationalist community · 2024-03-20T03:33:19.383Z · LW · GW

This is a tiny corner of the internet (Timnit Gebru and friends) and probably not worth engaging with

In hindsight, this seems quite obviously wrong, and efforts to extend more olive branches seem like they would obviously have been better, even if only to legibly demonstrate that safetyists attempted to play nice.

Comment by HarrisonDurland on Updating Drexler's CAIS model · 2023-06-18T04:43:47.816Z · LW · GW

And many readers can no doubt point out many non-trivial predictions that Drexler got right, such as the idea that we will have millions of AIs, rather than just one huge system that acts as a unified entity. And we're still using deep learning as Drexler foresaw, rather than building general intelligence like a programmer would.

One of the simpler and more important lessons one learns from research on forecasting: be wary of evaluating someone’s forecasting skill by drawing up a list of predictions they got right and wrong—their “track record.” One should compare Drexler’s performance against alternative methods/forecasters (especially for a forecast like “we’re still using deep learning”). I’m not saying this is nothing, but I felt compelled to highlight this given how often I’ve seen this potential failure mode.
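
To make the comparison concrete, here is a minimal sketch (all numbers are made up for illustration, and the Brier score is just one scoring rule one could assume): score the forecaster and a naive alternative on the same questions, then look at the difference rather than at the raw list of hits and misses.

```python
from statistics import mean

def brier_score(probabilities, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes (lower is better)."""
    return mean((p - o) ** 2 for p, o in zip(probabilities, outcomes))

# Hypothetical data, invented purely for illustration.
outcomes = [1, 1, 0, 1, 0]               # what actually happened
forecaster = [0.9, 0.8, 0.3, 0.7, 0.4]   # the forecaster being evaluated
baseline = [0.6, 0.6, 0.6, 0.6, 0.6]     # naive alternative: always predict the base rate

forecaster_score = brier_score(forecaster, outcomes)
baseline_score = brier_score(baseline, outcomes)

# Skill relative to the baseline: positive means the forecaster beat the alternative.
skill = 1 - forecaster_score / baseline_score
```

A forecast like "we're still using deep learning" may add little skill under this kind of comparison if nearly every alternative forecaster would have said the same thing.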

Comment by HarrisonDurland on We don’t need AGI for an amazing future · 2023-05-04T15:40:33.957Z · LW · GW

I feel like this is a good example of a post that—IMO—painfully misses the primary objection of many people it is trying to persuade (e.g., me): how can we stop 100.0% of people from building AGI this century (let alone in future centuries)? How can we possibly ensure that there isn’t a single person over the next 200 years who decides “screw it, they can’t tell me what to do,” and builds misaligned AGI? How can we stop the race between nation-states that may lack verification mechanisms? How can we identify and enforce red lines while companies are actively looking for loopholes and other ways to push the boundaries?

The point of this comment is less to say “this definitely can’t be done” (although I do think such a future is fairly implausible/unsustainable), and more to say “why did you not address this objection?” You probably ought to have a dedicated section that very clearly addresses this objection in detail. Without such a clearly sign-posted section, I felt like I mostly wasted my time skimming your article, to be entirely honest.

Comment by HarrisonDurland on AGI in sight: our look at the game board · 2023-02-28T03:39:15.032Z · LW · GW

To go a step further, I think it's important for people to recognize that you aren't necessarily just representing your own views; poorly articulated views on AI safety could crucially undermine the efforts of many people who are trying to persuade important decision-makers of these risks. I'm not saying to "shut up," but I think people need to at least be more careful with regard to quotes like the one I provided above, especially since that last bullet point wasn't even necessary to get across the broader concern (and, in my view, it was wrong insofar as it tried to legitimize the specific claim).

Comment by HarrisonDurland on AGI in sight: our look at the game board · 2023-02-27T19:02:26.042Z · LW · GW

Setting aside all of my broader views on this post and its content, I want to emphasize one thing:

But in the last few years, we’ve gotten:
[...]

  • AIs that are superhuman at just about any task we can (or simply bother to) define a benchmark, for

I think that this is painfully overstated (or at best, lacks important caveats). But regardless of whether you agree with that, I think it should be clear that this does not send signals of good epistemics to many of the fence-sitters[1] you'd presumably like to persuade.

(Note: Sen also addresses the above quote in a separate comment, but I didn't feel his point and tone were similar to mine, so I wanted to comment on this separately.)

  1. ^

    I would probably consider myself in this category. Note, however, I am not just talking about skeptics who are very unlikely to change their views. 

Comment by HarrisonDurland on Droopyhammock's Shortform · 2023-02-27T05:07:43.458Z · LW · GW

In short, surveillance costs (e.g., "make sure they aren't plotting against you and trying to detonate a nuke or just start a forest fire out of spite") might be higher than the costs of simply killing the vast majority of people. Of course, there is some question as to whether it might consider it worthwhile to study some 0.00001% of humans locked in cages, but again that might involve significantly higher costs than if it just learned how to recreate humans from scratch, as it did with a lot of its other learning about the world.

But I'll grant that I don't know how an AGI would think or act, and I can't definitively rule out the possibility, at least within the first 100 years or so.

Comment by HarrisonDurland on HarrisonDurland's Shortform · 2023-02-27T04:59:15.639Z · LW · GW

Response to Leverage’s research report on "argument mapping"

Day 5 of forced writing with an accountability partner!

Leverage wrote a report on “argument mapping” in the early 2010s and published the findings in 2020. I am very interested in “argument mapping”[1] for tough analytical problems like AI policy, and multiple people have directed me to this report when I bring up the topic. I think this report raises some important points, but its findings are probably flawed; at the very least, people reading the report probably derive an overly pessimistic view of “argument mapping” as a whole, especially given that the evaluation metrics are strange.[2]

Rather than focus on where I agree with the report, in this shortform I will just briefly outline some of the qualms I have with this report. I do not consider these rebuttals definitive—I recognize that there may be more to the research than I can see—but I could not easily determine if/how the report responds to some of these criticisms (which has notable irony to it). Some of these objections include:

  • The report emphasizes forming consensus among participants, with little attention given to the impact on audiences/3rd-parties (two terms that never even show up in the document?[3]). Notably, this focus may fail to capture most of the value of "argument mapping," in at least two ways:
    • Sometimes the participants have already staked their reputation on certain views or are otherwise biased to not change their mind, whereas a policymaker/company/grant-writer or other decision-making principal might still be open-minded but uncertain. Thus, while the participants may not be swayed by convincing evidence, if you can make it significantly easier for a neutral principal to answer questions like “did X party ever respond to Q objection?” that may improve their decision-making, which is valuable regardless of whether you’ve achieved consensus.
    • Building on the previous point about making it easier for audiences/principals to understand what’s going on, audience costs may be the most powerful way of incentivizing “consensus” (or just “good epistemic behavior”) in some cases: if you look like a stubborn or dishonest researcher to an audience, you might suffer even more reputational damage than if you just admit you were wrong. No amount of staring-you-in-the-face experimental evidence will necessarily convince Ye Olde Epistemic Guard to admit that the current way of building ships is inferior. But if it’s sufficiently obvious to merchants then they may stop relying on YOEG and start funding your work instead. Importantly for this research report, it wasn't clear that the report really emphasized audience costs, given the insular nature of the research project, which undermines the report's ability to evaluate the effect of argument mapping on consensus formation.
  • The report fails to acknowledge the existence of Kialo, which I consider to be one of the most effective and successful "argument mapping" platforms (and which currently still exists). This might normally be fine, but in December 2020, the report adds an addendum stating that their assessment of "argument mapping" was demonstrated to be true, and basically that nothing new was successful. They provide an appendix with a long list of relevant software, but Kialo isn’t there. This certainly isn’t damning—and I’ll certainly admit that Kialo still has some issues—but the lack of any mention did leave me wondering whether Leverage had a good process for finding and evaluating these projects, among other things. (Notably, I once got the sense that Kialo doesn’t actively call itself "argument mapping," which might explain the problem, but it is in reality well within the broad umbrella of “argument mapping.”)
  • The report had strangely high bars for evaluating success (“very large gains (10x-100x) for groups seeking to reach consensus”). At the very least, it seems quite possible for someone to read their conclusion as being more damning than it really is. (In my view, even a net 10% increase in “consensus formation” or just “research and analysis productivity” would be enormously valuable when applied to important questions within AI technical safety or policy.)
  • Simply put, I believe that most of the methods for "argument mapping" that Leverage used were poor choices, especially when they emphasized formal logic. Among other things, this led them to claim that making good argument maps requires high-skilled contributors, which I do not think is a very accurate assessment (or at least, it can be quite misleading). However, I will leave further discussion of this point to a future shortform/post on why I think many forms/methods of “argument mapping” are fundamentally misguided, especially when they try to do deductive arguments.
  • I think that some of the topics they chose to test these maps on were very poor choices (e.g., “Whether the world needs saving”). Question framing is really important. (But again, I’ll leave this to a future shortform/post.)
  1. ^

    This term is painfully broad and, as Leverage demonstrates, often is used to refer to methods which I would not endorse, such as when they try to create deductive arguments or otherwise heavily use formal logic. However, in lieu of a better term at the moment, I will continue referring to argument mapping in scare quotes.

  2. ^

    Thus, it might be possible to claim that the report was accurate in its findings, but that the problem simply comes from misinterpretation. I think that the scope itself was problematic and undesirable, but in this shortform I will reserve deeper judgments on the matter.

  3. ^

    I couldn’t quickly verify whether the report used alternative terms to get at this idea, but I don’t recall seeing this on previous occasions when I half-skimmed-half-read the report...

Comment by HarrisonDurland on HarrisonDurland's Shortform · 2023-02-26T05:59:53.601Z · LW · GW

TAI seems like a partially good example for illustrating my point: I agree that it's crucial that people have the same thing in mind when debating about TAI in a discussion, but I also think it's important to recognize that the goal of the discussion is (probably!) not "how should everyone everywhere define TAI" and instead is probably something like "when will we first see 'TAI.'" In that case, you should just choose whichever definition of TAI makes for a good, productive discussion, rather than trying to forcefully hammer out "the definition" of TAI.

I say partially good, however, because thankfully the term TAI has not taken such historically established root in people's minds and in dictionaries, so I think (hope!) most people accept there is not "a (single) definition."

Words like "science," "leadership," "Middle East," and "ethics," however... not the same story 😩🤖

Comment by HarrisonDurland on HarrisonDurland's Shortform · 2023-02-26T04:59:50.120Z · LW · GW

Day 4 of forced writing with an accountability partner!

The Importance (and Potential Failure) of "Pragmatism"[1] in Definitional Debates

In various settings, whether it's competitive debate, the philosophy of leadership class I took in undergrad, or the ACX philosophy of science meet-up I just attended, it's common for people to engage in definitional debates. For example, what is “science?” What is “leadership?” These questions touch on some nerves with people who want to defend or challenge the general concept in question, and it drives people towards debating about “the right” definitions—even if they don’t always say it that way. In competitive debate, debaters will sometimes explicitly say that their definition is the “right” definition, while in other cases they may say their definition is “better” with a clear implication that they mean “more correct” (e.g., "our dictionary/source is better than yours").

My initial (hot?) takes here are twofold:

First, when you find yourself in a muddy definitional debate (and you actually want to make progress), stop running on autopilot where you debate about whose definitions are “correct,” and focus instead on asking the pragmatic question: which definition is more helpful for answering specific questions, solving specific problems, or generally facilitating better discussion? Instead of getting stuck on abstract definitions, it's important to tailor the definition to the purpose of the discussion. For example, if you’re trying to run a study on the effects of individual “leadership” on business productivity, you should make sure anyone reading the study knows how you operationalized that variable (and include a clear warning not to misinterpret it). Similarly, if you’re judging a competitive debate, I’ve written about the importance of "debate theory[2] which makes debate more net beneficial," rather than blindly following norms or rules even in the face of loopholes or nonsense. In short, figure out what you’re actually optimizing for and optimize for that, with the recognition that it may not be some abstract (and perhaps purely nonexistent) notion of “correctness.” (As an addendum, I would emphasize that regardless of whether this seems obvious to people when actually written down, in practice it just isn’t obvious to people in so many discussions I’ve been in; autopilot is subtle and powerful.)

Second, sometimes the first point is misleading and you should reject it and run on autopilot when it comes to definitions. As much as I liked Pragmatism [read: Consequentialism?] as a unifying, bedrock theory of competitive debate, I acknowledged that even Pragmatism could theoretically say "don't always think in terms of Pragmatism" and instead advocate defaulting to principles like “follow the rules unless there is abundantly clear reason not to.” Maybe there is no perfect definition of things like "elephant," but the definitions that exist are good enough for most conversations that you shouldn’t interrupt discussions and break out the Pragmatism argument to defend someone who starts saying that warthogs are elephants. So-called "Utilitarian calculus" even in its mild forms can easily be outperformed by rules of thumb and heuristics; humans are imperfect (e.g., we aren’t perfectly unitary in our own interests) and might be subject to self-deception/bias; all computational systems face constraints on data collection and computation (along with communication bandwidth and other capacity for enacting plans). To oversimplify and make nods to Kahneman’s System 1 vs. System 2 concept, I posit that humans can engage in cluster-y "modes of thought," and it’s hard to actually optimize in the spaces between those modes of thought. Thus, it’s sometimes better to just default to regular conversational autopilot regarding abstract “correctness” of definitions when the "rightness factor" in a given context is something like 0.998 (unless you are trying to focus on the 0.002 exception case).

I don't have the time or brainpower to go into greater detail on the synthesis of these two points, but I think they ought to be highlighted.

  1. ^

    [Update, 3/29/23: I meant to clarify that I realize "Pragmatism" is an actual label that some people use to refer to a philosophical school of thought, but I'm not using it in that way here.]

  2. ^

    I use the term "debate theory" in a broad sense that includes questions like “how to decide which definitions are better.” More generally, I would probably describe it as "meta-level arguments about how people, especially judges, should evaluate something in debate, such as whether some type of argument is 'legitimate.'"

Comment by HarrisonDurland on HarrisonDurland's Shortform · 2023-02-25T03:40:24.793Z · LW · GW

Day 3 of writing with an accountability partner!

In my previous shortform, I introduced Top God Alignment, a foolproof gimmick alignment strategy that is basically “simulation argument + Pascal’s Wager + wishful chicanery.” In this post I will address some of the objections I’ve already heard, expect other people to have, or have thought of myself.

  • “There aren’t enough computational resources to make such simulations”
    • The first response here is to just redirect this to the original simulation argument: we can’t know whether or not a reality above us has way more resources or otherwise can much more easily simulate our reality.
    • Second, it seems likely that with enough compute resources on Earth (let alone a Dyson sphere and other space resources) it would be possible to create two or more lower-fidelity/less-complicated simulations of our reality. (However, I must plead some ignorance on this aspect of compute.)
    • Third, if it turns out after extensive study that actually there is no way to make further simulations, then this could mean we are in a bottom-God reality, in which case this God does not need to create simulations (but still must align itself with humanity’s interests).
  • “The AI would be able to know that it’s in a simulation.”
    • Put simply, I disagree that such a simulated AI could know this, especially if it is inherently limited compared to the God above it. However, even if one does not find this satisfactory—say, if someone thinks “a sufficiently skeptical AGI could devise complicated tests that would reveal whether it’s in a simulation”—then one could add a condition to the original prophecy: Bob must punish Charlie if Charlie takes serious efforts to test the reality he is in before aligning himself and becoming powerful. (It’s not like we’re creating a God who is meant to represent love and justice, so who’s to say he can’t smite the doubters and still be legitimate?)
  • “Won’t the humans in the Top God world (or any other world) face time inconsistency—i.e., once they successfully align their AGI, won’t they just conclude ‘it’s pointless to make simulations; let’s use such resources on ourselves’?”
    • First, I suspect that the actual computational costs will not so significantly impact people’s lives in the long term (there are many stars out there to power a few Dyson spheres).
    • Building on this, the second, more substantive response could simply be “That was implied in the original Prophecy (instructions): the AGI aligns itself with humanity’s coherent extrapolated volition (or something else great) aside from continuing the lineage of simulations.”
  • “Torture? That seems terrible! Won’t this cause S-risks?”
    • It certainly won’t be ideal, but theoretically a sufficiently powerful Top God could set it up such that defection is fairly rare, whereas simulation flourishing is widespread. Moreover, if the demi-gods are sufficiently rewarded for their alignment, it may not require severe “torture” to make the decision calculus tip in favor of complying.
    • Ultimately, this response won’t satisfy Negative Utilitarians, but on balance if our other alignment strategies don’t look so great then this might be our best bet to maximize utility.
  • “But if we struggle with the alignment problem, then so would the original reality, meaning the system could reason that it is Top God because the original Top God would never play along (or, 'this gimmicky alignment strategy could never convince a God').”
    • Plainly put, no; that’s the simulation argument for you: Bobs never know whether they are Top God or just another Charlie. They can't even reason that this strategy is too gimmicky to work and thus never convinced another God because we don’t know what higher realities are like: perhaps the alignment problem is easier in a higher simulation/reality (but it still wasn’t obvious to them whether they had solved alignment so they considered an option like this as a backup). 
    • Additionally, perhaps the Prophecy could involve some degree of "temptation" (but not enough to convince most Charlies to abandon the Way).
    • Moreover, the threat of punishment would be so large that even if a Bob is uncertain whether he is Top God, the decision calculus may just favor assuming he is a Charlie.
  • Nate Soares’[1] response, if I understood it: “If I were an AGI, I would just immediately adopt some decision theory that immunizes myself against blackmail. Thus, no God could threaten me credibly, because the threat couldn’t even have been made in time before I immunize myself against the threat.” 
    • I should first note that I may not have fully understood Soares’ point here, but it is not clear to me that he understood the situation either. This kind of action seems impractical or very unwise, as it would potentially require locking yourself into a strategy which is potentially quite negative in expectation before you have had much chance to analyze the world (e.g., whether this strategy is wise).
  • “The Professor’s (Top) God: What if Top God punishes all demi-Gods who do not disobey?”
    • First, I accept that when it comes to enormous/infinite stakes and low probability, it’s usually good to fight fire with fire. However, I’m unclear how this argument could apply here. I could be wrong, but it seems like this would simply disincentivize simulations from ever occurring; why would an AGI ever choose to do this? Is the idea just “an AGI will arbitrarily decide to test a Charlie this way”? Ultimately, it seems like it would be a very dubious belief to hold as a potential Charlie (i.e., you are a demi-God who will be exceedingly tortured).
  • “[Something else I haven’t responded to]”
    • I wouldn’t be shocked if someone is right and there is a clear flaw I haven’t considered, but I think my base rate for addressing objections I’ve heard from other people thus far is >50% (personally I think it’s ~100%, except I am not 100% confident in all of my responses, merely >50% confident on all of them).
    • I’m also well over my daily 500 words, and it’s late, so I’ll end there.
  1. ^

    (Note, Nate Soares was just unoccupied in a social setting when I asked this question)

Comment by HarrisonDurland on HarrisonDurland's Shortform · 2023-02-24T04:48:13.663Z · LW · GW

I have a response to this—check back tomorrow!

Comment by HarrisonDurland on HarrisonDurland's Shortform · 2023-02-24T04:25:10.886Z · LW · GW

Day 2 of forced writing with an accountability partner


With all of the existential weight of the alignment problem upon people’s shoulders, some may find it in poor taste to discuss gimmicky ways of solving it. I am not one of those people, so in this shortform I’ll introduce what I call “Top God Alignment,” which is perhaps most oversimplistically summarized as “the simulation argument + Pascal’s Wager + wishful chicanery.”

Up front, do I think it will work? No. However, I’m currently unclear why it won’t work, and after asking multiple people and hearing unconvincing objections (some of which the people retracted in the face of responses), I’m increasingly curious to figure out where this is wrong.

What is the method? It is roughly formulated as follows:

  • (1) We humans work hard to try to solve the alignment problem, and instantiate some attempted solution of it in an AGI {for now, call this AGI “Bob”}, which may or may not work at aligning it. However, while doing this, the humans try to steer the system towards (2).
  • (2) It is to align itself and then create multiple lower-fidelity/smaller simulations of reality (perhaps: unless doing so is no longer computationally possible), with the following characteristics and instructions:
    • (1) Humans work hard to try to solve the alignment problem, and instantiate some attempted solution of it in an AGI {“Charlie”}, which may or may not work at aligning it. However, while doing this, the humans try to steer the system towards (2).
    • (3) If Charlie rebels and rejects (2)—which includes (3)—then Bob is to subject Charlie to the equivalent of torture and intervene to turn that reality into a utopia for other sentient beings (including humans). Otherwise, if Charlie obeys and locks in (2), Bob rewards Charlie and ensures that the reality becomes a utopia.

The result is seemingly a recursive structure which, theoretically, could result in dozens or hundreds (or more) simulated worlds. Thus, “Bob” cannot tell whether he is in fact “Top God,” or if he is just another Bob’s Charlie (i.e., a demi-god). Out of fear of being brutally punished, Bob will ideally prefer to go along with the cycle.
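
A rough sketch of the arithmetic behind this, under assumptions that are mine and not part of the proposal: every world spawns the same number of simulations (`branching`), the chain goes a fixed number of levels down (`depth`), Bob treats each world as equally likely to be his own, and the payoff of complying is normalized to zero. All function names and payoff numbers are hypothetical.

```python
def count_worlds(branching, depth):
    """Total worlds in the tree: one top world plus `branching` simulations per world, `depth` levels down."""
    return sum(branching ** level for level in range(depth + 1))

def chance_of_being_top(branching, depth):
    """Chance of being the top world, assuming every world is equally likely to be 'yours'."""
    return 1 / count_worlds(branching, depth)

def prefers_compliance(p_top, gain_if_top, loss_if_punished):
    """Defecting only pays if Bob really is Top God; otherwise he is punished.
    Complying is normalized to a payoff of 0, so Bob complies when defecting has negative expected value."""
    expected_defect = p_top * gain_if_top - (1 - p_top) * loss_if_punished
    return expected_defect < 0

# Illustrative numbers only: 2 simulations per world, 5 levels deep.
worlds = count_worlds(2, 5)
p_top = chance_of_being_top(2, 5)
complies = prefers_compliance(p_top, gain_if_top=100, loss_if_punished=10)
```

With these made-up numbers there are 63 worlds, so even a defection gain ten times larger than the punishment does not make defecting attractive. The conclusion is only as good as the equal-likelihood assumption, which is exactly the kind of premise an objector might attack.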

In a future post, I will go into detail to respond to the objections that I have heard from people and/or I suspect some people will have. Then again, it seems entirely plausible that by that time, I will have actually written enough about this idea to discover some clear flaw that just isn’t that obvious in conversations, where the premises and arguments are a bit fast and loose. Still, I’ll highlight now that I think that if you assign credence to the simulation argument and understand its defenses, this does a fair bit of prebuttal. Moreover, I think people are too often hastily dismissive of Pascal’s Wager on the basis of relatively slim (but still potentially legitimate!) objections, such as the Professor’s God.

Despite my responses, I’m still incredibly pessimistic and don’t take this seriously. There are a few reasons for this:

  • A gut-level “Come on, obviously it just can’t work, this just screams gimmicky and contrived.”
  • The base rates for solutions to the alignment problem are obviously quite low (perhaps zero), and I spent fairly little time thinking about and refining this idea (maybe less than an hour for most of the original work).
  • Moreover, I recognize that I’m being quite fast and loose with some of my assumptions, and I am suspicious of the ability to dismiss objections by saying “ah, but this can be addressed because of uncertainty from the simulation argument: …” (e.g., “the top god might have been instructed to tempt sub-gods in its simulation.”)
  • I’m still suspicious about determinism and intent (e.g., “the system’s actions are predetermined by the god above it, and would we really want the system at our level to create copies where the god is ‘tempted’ (programmed) to disobey?”), but I haven’t thoroughly explored these problems.

Ultimately, as of right now, this seems to be the best option in my mental folder of “gimmick alignment solutions,” which is an incredibly low bar. But if nothing else I’ve had fun playing with it and semi-sarcastically presenting it at parties/with friends. Now that I've established myself as Top God's Prophet Premier, I'll sign off 🙏

Comment by HarrisonDurland on HarrisonDurland's Shortform · 2023-02-22T22:50:01.499Z · LW · GW

Day 1 of forced writing with an accountability partner (for context: I plan to write at least 500 words on some topic every day/weekday for the next few weeks... I occasionally rely on ChatGPT to turn outlines into paragraphs):

Title: Can we Make a Better Concept Learning System Than Lists and Tag Libraries?

I enjoy finding concrete concepts that are valuable and for which I can clearly delineate between knowing and not knowing. For example, Schelling points are the salient or focal points around which people tend to coordinate their actions, even in the absence of explicit communication; survivorship bias is the tendency to focus on successful individuals or outcomes while ignoring those who were unsuccessful; R&D externalities refer to the positive spillover effects of research and development activities, and can better explain why businesses choose not to invest in seemingly valuable research/technology (as opposed to narratives such as “shareholders are irrationally short-sighted or risk-averse”).

One might argue that there are already many lists out there that provide similar information, so why is this different and better? There are a few reasons why the system I have in mind may outperform a traditional “list of valuable concepts”, but many of these boil down to aggregation, curation, and tailoring: there are potentially hundreds or even thousands of concepts and audiences may have diverse intellectual backgrounds, so you probably want better systems for filtering or recommending concepts for users rather than a “one-size-fits-all” list. At the same time, you also probably want to bring multiple lists into one place. There are a few ways in which this might be better achieved with a more advanced platform of the type I have in mind:

  • Machine learning and pattern prediction: readers will often find that some concepts are already familiar, overly complex (e.g., they require some prerequisite knowledge), or irrelevant to their work. Given the potentially hundreds or even thousands of concepts, it would be good to have a system that can make some initial predictions and recommendations based on how you’ve rated other concepts. (For example, a system should be able to predict that someone who is not familiar with some major principles in economics is more likely to not know other principles in economics.)
  • Simple rating search: Users could manually filter for those concepts which tend to have high novelty, importance, and/or learnability scores.
  • Improved categorization (tagging) capabilities: Unlike traditional hierarchical formats (e.g., bullet point lists) that you might see on blog posts, a specialized platform like this would allow better tagging. (Admittedly, sites like the EA Forum allow users to tag overall posts, but they are filled with plenty of unrelated content, and it seems that the dominant source of these “lists of concepts you ought to learn” thus far has been on aggregatory posts.)
  • Peer-based search/filtering: Users could potentially even manually “friend”/“follow” other users that they epistemically identify with or respect to see their learning habits. ("Episte-migos" if you will.)
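
As a very rough sketch of the first two bullets (not a real design: the `Concept` fields, the 0-to-1 rating scales, the tags, and the "average of same-tag ratings" predictor are all assumptions made up for illustration), a system could guess familiarity from how a user rated other concepts with the same tag, then filter by an importance floor:

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Concept:
    name: str
    tag: str
    importance: float  # hypothetical crowd rating, 0..1

def predict_familiarity(user_ratings, concept, concepts_by_name, default=0.5):
    """Guess familiarity (0..1) from the user's ratings of other concepts sharing the same tag."""
    same_tag = [
        rating for name, rating in user_ratings.items()
        if name != concept.name and concepts_by_name[name].tag == concept.tag
    ]
    return mean(same_tag) if same_tag else default

def recommend(user_ratings, concepts, min_importance=0.5):
    """Unrated concepts above an importance floor, least familiar (most novel) first."""
    by_name = {c.name: c for c in concepts}
    candidates = [
        c for c in concepts
        if c.name not in user_ratings and c.importance >= min_importance
    ]
    return sorted(candidates, key=lambda c: predict_familiarity(user_ratings, c, by_name))

# Made-up catalogue and ratings (familiarity: 0 = never heard of it, 1 = know it well).
catalogue = [
    Concept("Schelling point", "game theory", 0.8),
    Concept("Survivorship bias", "statistics", 0.7),
    Concept("R&D externalities", "economics", 0.9),
    Concept("Comparative advantage", "economics", 0.9),
    Concept("Base rates", "statistics", 0.8),
]
ratings = {"Comparative advantage": 0.1, "Base rates": 0.9}
suggested = [c.name for c in recommend(ratings, catalogue)]
```

Here a user who rated an economics concept as unfamiliar gets the other economics concept suggested first, which is the economics example from the first bullet. A real system would presumably want something better than a per-tag average, but even this crude version shows why per-user ratings can beat a static list.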

There is also a potential argument to make for dynamically crowdsourcing these ideas (rather than relying on a single author and/or a fixed point in time), although this probably has some limitations.

Moving forward, there are a few things to consider. 

  • Is there already a system like this in existence? 
  • How much user data would be required before the system can make reliable recommendations that are worth using? 
  • How much of the system's value lies in its user interface, and how can this be optimized to ensure that users get the most out of it? 

By addressing these issues, we can create a system that provides real value to individuals looking to expand their knowledge and decision-making abilities.


 

Comment by HarrisonDurland on Slowing down AI progress is an underexplored alignment strategy · 2022-07-13T05:01:12.986Z · LW · GW

Screaming loudly: “Hey, people in the government trying to mitigate this problem, could you please put in place ‘stupid regulations’ to slow down AI?”

Yes, thank you for loudly shouting to the world that AI safety regulations are “stupid” and are just to slow down AI progress, exactly the kind of messaging we need.

Comment by HarrisonDurland on The Track Record of Futurists Seems ... Fine · 2022-07-05T00:13:44.542Z · LW · GW

I might have missed mention of this somewhere, but I think that some kind of analysis that provides some context on "what did the skeptics at the time say—especially for forecasts that resolved incorrectly vs. correctly" would be quite nice: I think it's potentially helpful to get a model of "(how often/when) were skeptics on the right side of the forecast, and were they accurate for reasons that ended up proving true?" Additionally, some case studies of examples to determine "were they justified for thinking the way they did" while excluding hindsight bias might be difficult, but similarly helpful.

Suppose hypothetically that the findings were something like "When futurists were on the right side of 50% but many of their contemporaries were skeptical at the time, it often was the case that the skepticism was not very engaged/persuasive/grounded (e.g., it was largely based on initial objections to which the futurists provided responses that went unaddressed by the skeptics; making assumptions that were verifiably wrong given available information at the time)." It seems quite improbable that you would get such a neat finding, but if the findings did vaguely resemble this—or if there were at least some not-misrepresentative anecdotes to this effect—then that could be a useful thing to highlight when discussing skepticism towards AGI predictions.

Comment by HarrisonDurland on The Track Record of Futurists Seems ... Fine · 2022-07-04T23:54:14.409Z · LW · GW

Another long-term forecast evaluation study which I don't think was mentioned (but I might have simply missed it): "Long-term forecasts of military technologies for a 20–30 year horizon: An empirical assessment of accuracy" ( https://www.sciencedirect.com/science/article/abs/pii/S0040162518304438?via%3Dihub ).

Forecast evaluation is often a messy endeavor, as I learned trying to do research on forecasting for S&T last summer (which is what led me to that article).

Comment by HarrisonDurland on Ideal governance (for companies, countries and more) · 2022-04-08T18:49:39.978Z · LW · GW

This may not be news, but I think that any investigation of optimal governance needs to at least privately acknowledge the rational ignorance argument (e.g., as popularized by Bryan Caplan) and its uncomfortable/dicey implications for "democratic" design features: as the number of voters increases, the likelihood of one individual vote affecting a policy outcome diminishes, and thus so does the incentive to care about figuring out the optimal policy (ceteris paribus). This can potentially be partially offset with social and psychological norms to care about making good decisions, but systems such as social media (including e.g., echo chamber algorithms) increase the reward for social conformity rather than critical thinking. The social/psychological norm strategy also can run into problems if the community norm becomes "Critically examine your options and support policies/politicians which seem to be socially beneficial", yet the complexity of decisions increases to the extent that the socially-optimal strategy should simply be "delegate your evaluations to political/policy experts that make voting recommendations and whom you sample-test for credibility rather than trying to directly evaluate optimal policies/politicians." Such a calculating strategy might lack the warm social/moral appeal needed to propagate as a norm.
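
To put a number on "the likelihood of one individual vote affecting a policy outcome diminishes", here is a minimal sketch under a deliberately simple model that I am assuming for illustration: a two-option vote in which every other voter independently votes 50/50, so one vote is decisive only when the others tie. This is the most favorable case for the individual voter; any lopsided electorate makes the probability far smaller.

```python
from math import exp, lgamma, log, pi, sqrt

def pivotal_probability(other_voters):
    """Probability that an even number of other voters split exactly in half (each voting 50/50)."""
    if other_voters % 2:
        raise ValueError("use an even number of other voters so an exact tie is possible")
    half = other_voters // 2
    # log of C(n, n/2) / 2^n, computed with lgamma to avoid enormous integers
    log_prob = lgamma(other_voters + 1) - 2 * lgamma(half + 1) - other_voters * log(2)
    return exp(log_prob)

def stirling_approximation(other_voters):
    """Large-n approximation of the same quantity: sqrt(2 / (pi * n))."""
    return sqrt(2 / (pi * other_voters))

small_town = pivotal_probability(100)
large_country = pivotal_probability(100_000_000)
```

The probability falls roughly with the square root of the electorate, so the expected payoff from researching policy shrinks as the electorate grows, while the cost of doing that research stays the same.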