aaron-kaufman

Posts

SAINT PAUL – ACX Meetups Everywhere Spring 2025 2025-03-25T23:48:47.896Z

ARENA4.0 Capstone: Hyperparameter tuning for MELBO + replication on Llama-3.2-1b-Instruct 2024-10-05T11:30:11.953Z

St. Paul USA - ACX Meetups Everywhere Fall 2024 2024-08-29T18:42:21.899Z

MSP ACX Hangout: Davanni's Pizza 2024-05-03T18:56:15.401Z

Minneapolis-St Paul ACX Article Club: Problem Actors and Groups! 2024-04-13T17:57:57.876Z

St. Paul – ACX Meetups Everywhere Spring 2024 2024-03-30T11:26:29.098Z

Minneapolis-St Paul ACX Article Club: Prosaic Life Skills and Agency 2024-03-12T03:56:55.072Z

Minneapolis-St Paul ACX Article Club: Super-Illegal Crypto Shenanigans 2024-02-26T00:09:54.168Z

Minneapolis-St Paul ACX Article Club: Meditation and LSD 2024-01-29T01:24:14.764Z

MSP Article Discussion Meetup: The EMH, Long-Term Investing, and Leveraged ETFs 2023-12-27T16:50:03.094Z

Article Discussion And Free Pizza - St Paul 2023-11-24T21:02:45.491Z

Perhaps vastly more people should be on FDA-approved weight loss medication 2021-08-14T17:22:30.651Z

Comments

Comment by 25Hour (aaron-kaufman) on SAINT PAUL – ACX Meetups Everywhere Spring 2025 · 2025-04-05T19:40:09.778Z · LW · GW

IMPORTANT CHANGE: I'm moving this to May 4th because April 13th is on Passover and Jewish people can't eat pizza on Passover.

Comment by 25Hour (aaron-kaufman) on St. Paul USA - ACX Meetups Everywhere Fall 2024 · 2024-10-07T21:13:00.515Z · LW · GW

oh wait you mean the email invite. Yeah, that's a great point, i'll kick that off again.

Comment by 25Hour (aaron-kaufman) on St. Paul USA - ACX Meetups Everywhere Fall 2024 · 2024-10-07T21:12:41.052Z · LW · GW

sorry, didn't see this previously! No, we actually still have those roughly weekly. We post them on the discord at https://discord.gg/m2xJcuC937

Comment by 25Hour (aaron-kaufman) on MSP ACX Hangout: Davanni's Pizza · 2024-05-04T17:20:20.934Z · LW · GW

Primarily people come to this on the discord, so I just have this on lw for visibility

Comment by 25Hour (aaron-kaufman) on MSP Article Discussion Meetup: The EMH, Long-Term Investing, and Leveraged ETFs · 2024-01-20T20:17:06.458Z · LW · GW

Hey people! Sorry, due to uber related issues going to be a few minutes late. Shouldn't be more than 10 though.

Comment by 25Hour (aaron-kaufman) on How to (hopefully ethically) make money off of AGI · 2023-11-08T15:18:57.299Z · LW · GW

So this all makes sense and I appreciate you all writing it! Just a couple notes:

(1) I think it makes sense to put a sum of money into hedging against disaster e.g. with either short term treasuries, commodities, or gold. Futures in which AGI is delayed by a big war or similar disaster are futures where your tech investments will perform poorly (and depending on your p(doom) + views on anthropics, they are disproportionately futures you can expect to experience as a living human).

(2) I would caution against either shorting or investing in cryptocurrency as a long-term AI play; as patio11 in his Bits About Money has discussed (most recently in A review of Number Go Up, on crypto shenanigans (bitsaboutmoney.com) ), cryptocurrency is absolutely rife with market manipulation and other skullduggery; shorting it can therefore easily result in losing your shirt even in a situation where cryptocurrencies otherwise ought to be cratering.

Comment by 25Hour (aaron-kaufman) on The AI apocalypse myth. · 2023-09-08T17:55:41.796Z · LW · GW

Worth considering that humans are basically just fleshy robots, and we do our own basic maintenance and reproduction tasks just fine. If you had a sufficiently intelligent AI, it would be able to:

(1) persuade humans to build it a general-purpose robot chassis that can do complex manipulation tasks, along the lines of Google's experiments with SayCan

(2) use instances of itself that control that chassis to perform its own maintenance and power generation functions

(2.1) use instances of itself to build a factory, also controlled by itself, to build further instances of the robot as necessary.

(3) kill all humans once it can do without them.

I will also point out that humans' dependence on plants and animals has resulted in the vast majority of animals on earth being livestock, which isn't exactly "good end".

Comment by 25Hour (aaron-kaufman) on Douglas Hofstadter changes his mind on Deep Learning & AI risk (June 2023)? · 2023-07-03T20:26:38.717Z · LW · GW

This seems doubtful to me; if Yann truly believed that AI was an imminent extinction risk, or even thought it was credible, what would he be hoping to do or gain by ridiculing people who are similarly worried?

Comment by 25Hour (aaron-kaufman) on Charting Is Mostly Superstition · 2023-06-20T03:18:00.259Z · LW · GW

Hey, I really appreciated this series, particularly in that it introduced me to the fact that leveraged etfs (1) exist and (2) can function well as a fixed proportion of overall holdings over long periods.

Is the lesswrong investing seminar still around/open to new participants, by any chance? I've been doing lots of research on this topic (though more for long-term than short-term strategies) and am curious about how deep the unconventional investing rabbit hole goes.

Comment by 25Hour (aaron-kaufman) on How to respond to the recent condemnations of the rationalist community · 2023-04-04T22:19:58.528Z · LW · GW

It's a beautiful dream, but I dunno, man. Have you ever seen Timnit engage charitably and in-good-faith with anyone she's ever disagreed publicly with?

And absent such charity and good faith, what good could come of any interaction whatsoever?

Comment by 25Hour (aaron-kaufman) on How to respond to the recent condemnations of the rationalist community · 2023-04-04T02:16:36.563Z · LW · GW

This is a tiny corner of the internet (Timnit Gebru and friends) and probably not worth engaging with, since they consider themselves diametrically opposed to techies/rationalists/etc and will not engage with them in good faith. They are also probably a single-digit number of people, albeit a group really good at getting under techies' skin.

Comment by 25Hour (aaron-kaufman) on The Power of High Speed Stupidity · 2023-03-18T05:42:12.779Z · LW · GW

Re: blameless postmortems, i think the primary reason for blamelessness is that if you have blameful postmortems, they will rapidly transform (at least in perception) into punishments, and consequently will rarely occur except when management is really cheesed off at someone. This was how the postmortem system ended up at Amazon while i was there.

Blameful postmortems also result in workers who are very motivated to hide issues they have caused, which is obviously unproductive.

Comment by 25Hour (aaron-kaufman) on Some Thoughts on AI Art · 2023-01-26T08:50:35.719Z · LW · GW

Reasonable points, all! I agree that the conflation of legality and morality has warped the discourse around this; in particular the idea of Stable Diffusion and such regurgitating copyrighted imagery strikes me as a red herring, since the ability to do this is as old as the photocopier and legally quite well-understood.

It actually does seem to me, then, that style copying is a bigger problem than straightforward regurgitation, since new images in a style are the thing that you would ordinarily need to go to an artist for; but the biggest problem of all is that fundamentally all art styles are imperfect but pretty good substitutes in the market for all other art styles.

(Most popular of all the art styles-- to judge by a sampling of images online-- is hyperrealism, which is obviously a style that nobody can lay either legal OR moral claim to.)

So i think that if Stability tomorrow came out with a totally unimpeachable version of SD with no copyrighted data of any kind (but with a similarly high quality of output) we would have, essentially, the same set of problems for artists.

Comment by 25Hour (aaron-kaufman) on Some Thoughts on AI Art · 2023-01-26T08:37:36.408Z · LW · GW

Interestingly i believe this is a limitation that one of the newest (as yet unreleased) diffusion models has overcome, called DeepFloyd; a number of examples have been teased already, such as the following Corgi sitting in a sushi doghouse:

https://twitter.com/EMostaque/status/1615884867304054785?t=jmvO8rvQOD1YJ56JxiWQKQ&s=19

As such the quoted paragraphs surprised me as an instance of a straightforwardly falsifiable claim in the documents.

Comment by 25Hour (aaron-kaufman) on How to Convince my Son that Drugs are Bad · 2022-12-19T18:05:53.273Z · LW · GW

I think that your son is incorrectly treating heroin/other opiate cravings as analogous to "desire for sugar" or "desire to use X social media app" or whatever. These are not comparable. People do not get checked into sugar rehab clinics (which they subsequently break out of); they do not burn down each one of their social connections to get to use an hour of TikTok or whatever; they do not break their own arms in order to get to go to the ER which then pumps them full of Twitter likes. They do routinely do these things, and worse, to delay opiate withdrawal symptoms.

(For reference, my wife is a paramedic and she has seen this last one firsthand. Tell me: have you ever, in your life, had something you wanted so much that you would break one of your own limbs to get it?)

Another way of putting this is that opiate use frequently gives you a new utility function where the overwhelmingly dominant term is "getting to consume opiates."

For reference, I'm not automatically suspicious of drugs-- I wrote https://www.lesswrong.com/posts/NDmbnaniJ2xJnBASx/perhaps-vastly-more-people-should-be-on-fda-approved-weight .

> believes he has enough self control to not get addicted

So first, as poster above points out, there is not a good way to establish this. You have certainty on this topic well above what the evidence merits.

But leaving that aside. A lot of the core issue here is that the risk/reward profile absolutely sucks for recreational opiates given almost any reasonable set of initial assumptions.

Like, suppose you're right and you don't get addicted. I guess you have... discovered a new hobby, I guess? Whereas if you're wrong then your life is pretty much destroyed, as is the life of everyone who loves you most.

EDIT: Another pretty-routine circumstance my wife runs into at work: Narcan injections are used to bring somebody back if they've stopped breathing due to opiate overdose. Patients need to be restrained beforehand since they will frequently attack providers out of anger for ruining their high, even after it is pointed out to them that they weren't breathing and were approx. 1 minute from death.

Comment by 25Hour (aaron-kaufman) on Semi-conductor/AI Stock Discussion. · 2022-11-27T05:32:52.016Z · LW · GW

I actually think you can get an acceptable picture of whether something is priced in by reading stock analysts on the topic, since one useful thing you can get from them is a holistic perspective of what is on/off the radar of finance types, and what they perceive as important.

Having done this for various stocks, i actually do not think LLM-based advances are on anyone's radar and i do not believe they are priced in meaningfully.

Comment by 25Hour (aaron-kaufman) on Semi-conductor/AI Stock Discussion. · 2022-11-27T05:22:36.814Z · LW · GW

I don't think i ever heard about tesla doing LLM stuff, which seems like the most relevant paradigm for TAI purposes. Can you elaborate?

Comment by 25Hour (aaron-kaufman) on Semi-conductor/AI Stock Discussion. · 2022-11-27T05:17:33.204Z · LW · GW

One possible options play is puts on shutterstock, since as of about 2 weeks ago midjourney got up to a level where you can for a pittance replicate the most common and popular stock image varieties at an extremely high level of quality. (E.g. girl holding a credit card and smiling).

I think the most likely way this shakes out is adobe integrates image generation with figma and its other products, leaving "buying a stock image" as an increasingly niche and limited option for people who want an image to decorate a thing where they aren't all that particular about what the image is.

Primary question to me is on what time scale the SSTK business model dissolves, since these changes take time.

Comment by 25Hour (aaron-kaufman) on How would I know if a PhD is the right career path? · 2022-09-27T06:38:56.875Z · LW · GW

Having a Ph.D. confers relatively few benefits outside of academia. The writing style and skills taught in academia are very, very different from those of industry, and the opportunity cost of pursuing a Ph.D. vs going into software engineering (or something similarly remunerative) is in the hundreds of thousands of dollars.

I would suggest that if you don't know exactly what you want to do with your life, you would be well-suited to doing something that earns you a bunch of money. This money can later be used to finance grander ambitions when you have figured out what you want to do.

I'll turn this question around on you: why is a Ph.D. the best way of accomplishing what you want to do?

As to the drudgery of office work-- "office work" is, i think, a false category. I spent hours of unbearable tedium performing repetitive reactions in lab during my PhD, and my current cushy Microsoft engineering job is enormously more creative and interesting while paying approximately 10x as much. For someone with the smarts to get a ph.d., retraining into engineering is very, very easy.

One other generally undiscussed aspect of the working world is that, for a number of reasons, your employers mostly treat you with respect roughly proportional to your salary. Ph.D.s, consequently, are often treated very poorly. This probably contributes to their poor mental health, as documented elsewhere.

Comment by 25Hour (aaron-kaufman) on We will be around in 30 years · 2022-06-07T22:19:38.757Z · LW · GW

My response comes in two parts.

First part! Even if, by chance, we successfully detect and turn off the first AGI (say, Deepmind's), that just means we're "safe" until Facebook releases its new AGI. Without an alignment solution, this is a game we play more or less forever until either (A) we figure out alignment, (B) we die, or (C) we collectively, every nation, shutter all AI development forever. (C) seems deeply unlikely given the world's demonstrated capabilities around collective action.

Second part:

I like Bitcoin as a proof-of-concept here, since it's a technology that:

Imposes broadly distributed costs in the form of global warming and energy consumption, which everyone acknowledges.
Is greatly disliked by the powers-that-be for enabling various kinds of regulatory evasion; and in fact has one authority (China) actively taking steps to eradicate it from their society, which per reports has not been successful.
Is strictly worse at defending itself than AGI, since Bitcoin is non-sentient and will not take any steps whatsoever to defend itself.

This is an existence proof that there are some software architectures that today, right now, cannot be eradicated in spite of a great deal of concerted societal effort going into just that. Presumably an AGI can just ape their successful characteristics in addition to anything else it does; hell, there's no reason an AGI couldn't just distribute itself as particularly profitable bitcoin mining software.

After all, are people really going to turn off a computer making them hundreds of dollars per month just because a few unpopular weirdos are yelling about far-fetched doomsday scenarios around AGI takeover?

Comment by 25Hour (aaron-kaufman) on We will be around in 30 years · 2022-06-07T12:05:39.590Z · LW · GW

"If you think this is a simplistic or distorted version of what EY is saying, you are not paying attention. If you think that EY is merely saying that an AGI can kill a big fraction of humans in accident and so on but there will be survivors, you are not paying attention."

Not sure why this functions as a rebuttal to anything i'm saying.

Comment by 25Hour (aaron-kaufman) on We will be around in 30 years · 2022-06-07T11:37:29.781Z · LW · GW

You ask elsewhere for commenters to sit down and think for 5 minutes about why an agi might fail. This seems beside the point, since averting human extinction doesn't require averting one possible attack from an agi. It involves averting every single one of them, because if even one succeeds everyone dies.

In this it's similar to human security-- "why might a hacker fail" is not an interesting question to system designers, because the hacker gets as many attempts as he wants. For what attempts might look like, i think other posts have provided some reasonable guesses.
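The asymmetry between attacker and defender can be made concrete with a little arithmetic. The sketch below is purely illustrative: it assumes each attempt succeeds independently with the same probability, and both the function name and the 1% figure are hypothetical, chosen only to show how fast the defender's odds decay as attempts pile up.

```python
def probability_all_attempts_fail(p_success: float, attempts: int) -> float:
    """Chance that every one of `attempts` independent tries fails.

    Assumes (hypothetically) that each try succeeds with the same
    probability `p_success`, independently of the others.
    """
    if not 0.0 <= p_success <= 1.0:
        raise ValueError("p_success must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    return (1.0 - p_success) ** attempts


# Illustrative numbers only: a 1% per-attempt success chance.
for n in (1, 10, 100, 1000):
    print(n, probability_all_attempts_fail(0.01, n))
```

Under these assumptions the defender's chance of stopping every attempt shrinks geometrically with the number of tries, which is why "why might one attempt fail" is the less useful question.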

I also note that there already exist (non-intelligent) distributed computer systems entirely beyond the ability of any motivated human individual, government or organization to shut down. I refer, of course, to cryptocurrencies, which have this property as an explicit goal of their design.

So. Imagine that an AGI distributes itself among human computer systems in the same way as bitcoin mining software is today. Then it starts executing on someone's list of doomsday ideas, probably in a way secretive enough to be deniable.

Who's gonna shut it down? And what would such an action even look like?

(A possible suggestion is "everyone realizes their best interest is in coordinating shutting down their computers so that the AGI lacks a substrate to run on". To which i would suggest considering the last three years' worth of response to an obvious, threatening, global enemy that's not even sentient and will not attempt to defend itself.)

Comment by 25Hour (aaron-kaufman) on Beyond micromarriages · 2022-03-11T21:36:14.646Z · LW · GW

but it's such a good pun!

Comment by 25Hour (aaron-kaufman) on Late 2021 MIRI Conversations: AMA / Discussion · 2022-03-03T00:56:28.050Z · LW · GW

I'm not sure whether the unspoken context of this comment is "We tried to hire Terry Tao and he declined, citing lack of interest in AI alignment" vs "we assume, based on not having been contacted by Terry Tao, that he is not interested in AI alignment."

If the latter: the implicit assumption seems to be that if Terry Tao would find AI alignment to be an interesting project, we should strongly expect him to both know about it and have approached MIRI regarding it, neither of which seems particularly likely given the low public profile of both AI alignment in general and MIRI in particular.

If the former: bummer.

Comment by 25Hour (aaron-kaufman) on Have You Tried Hiring People? · 2022-03-03T00:43:35.732Z · LW · GW

From a rando outsider's perspective, MIRI has not made any public indication that they are funding-constrained, particularly given that their donation page says explicitly that:

> We’re not running a formal fundraiser this year but are participating in end-of-year matching events, including Giving Tuesday.

Which more or less sounds like "we don't need any more money but if you want to give us some that's cool"

Comment by 25Hour (aaron-kaufman) on What are some ways to do a PhD without an educational institution · 2022-03-02T23:53:19.316Z · LW · GW

It might be worth doing some goal-factoring on why you want the PhD in the first place.

If you just want to advance human knowledge, one plausible option is to get a fancy tech job, save up enough money to fund the project you're interested in, then commission someone to do the project. Feasibility naturally depends on the specifics of the project.

PhDs can involve dealing with a lot of financial insecurity and oftentimes personal hardship to get through (with six years of opportunity cost and no guarantee of getting funding for your research interests at the end), so it's probably worth verifying that a PhD is actually your best option for whatever your personal goals are.

Comment by 25Hour (aaron-kaufman) on Have You Tried Hiring People? · 2022-03-02T06:12:43.767Z · LW · GW

> Give Terrence Tao 500 000$ to work on AI alignement six months a year, letting him free to research crazy Navier-Stokes/Halting problem links the rest of his time... If money really isn't a problem, this kind of thing should be easy to do.
Literally that idea has been proposed multiple times before that I know of, and probably many more times many years ago before I was around.

What was the response? (I mean, obviously it was "not interested", otherwise it would've happened by now, but why?)

Comment by 25Hour (aaron-kaufman) on Money-generating environments vs. wealth-building environments (or "my thoughts on the stock market") · 2022-02-04T21:36:44.962Z · LW · GW

I think you're totally right that to the extent that the stock market is a zero-sum game retail traders will lose almost every time, since the big players on the other end will always have more information and power to leverage that information than retail.

I think a lot of the relevance of this comment depends on your view of stock-market-as-casino vs stock-market-as-generator-of-wealth-at-several-steps-removed. I take the view that it's mostly the latter: a widget maker IPOs, accepts money from a big institutional IPO investor in exchange for a share of its proceeds, and buys capital with that money; the IPO investor then (effectively, after several intermediate trades) sells that share of proceeds-generated-from-capital to a retail trader. The capital is still doing stuff for people! It's just that exactly what it's doing is totally opaque to almost everyone.

Comment by 25Hour (aaron-kaufman) on Money-generating environments vs. wealth-building environments (or "my thoughts on the stock market") · 2022-02-04T21:23:03.089Z · LW · GW

I'd like to see the intuition expanded upon here:

> And yet when I write that, I start asking myself “but what is a dollar if not an investment that is only worth what someone else is willing to trade for it” and then “wait, what if a stock is a better investment than a dollar” and then “no no no no no investing on top of investing is like double risk”

Is it double risk? We're going from a situation where we're talking to a widget producer and saying "yes I would like to exchange a dollar for a widget" to a situation where we're saying "I would like to exchange a fractional share of Microsoft for a widget." Seems basically analogous.

Now, obviously in our society all transactions are denominated in dollars, and you have to do the conversion to dollars beforehand because no retailer is actually able to accept shares-of-stock at the counter, but the fact that purchases have to be converted to dollars beforehand doesn't imply you're taking on the risk of that currency increasing or decreasing in value if you don't hold any of it at baseline.

And I guess if you accept this, the question is what defines a "better" or "worse" investment. It sounds like you're making an assessment that trading risk for money is fundamentally not worthwhile above a certain savings amount; I suppose that's fair; it just means that to maintain a specific retirement withdrawal rate you have to have a bunch more money saved up pre-retirement (in expectation) than someone who doesn't, though having done that you also face less risk of ruin from the stock market crashing.

I'm wondering if that mindset can be trivially extended to "but actually the really foolproof asset is freeze-dried meals, since 1 meal=1 meal, as opposed to one dollar which could equal any number of fractional meals in the future".
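The trade-off above between risk and required savings can be sketched numerically. This is a minimal illustration, not financial advice: the function name, the spending figure, and both withdrawal rates are hypothetical assumptions, and the model is just the simple rule that required savings equal annual spending divided by the sustainable withdrawal rate.

```python
def required_savings(annual_spending: float, withdrawal_rate: float) -> float:
    """Savings needed so that `withdrawal_rate` of the pot covers `annual_spending`.

    `withdrawal_rate` is a fraction per year (e.g. 0.04 for 4%). This is the
    simple "spending / rate" rule of thumb, not a full retirement model.
    """
    if withdrawal_rate <= 0:
        raise ValueError("withdrawal_rate must be positive")
    return annual_spending / withdrawal_rate


# Hypothetical figures: the same spending, funded at two assumed sustainable rates.
spending = 40_000
stock_heavy = required_savings(spending, 0.04)  # assumed rate for a riskier portfolio
cash_heavy = required_savings(spending, 0.02)   # assumed rate for a low-risk portfolio
print(stock_heavy, cash_heavy, cash_heavy / stock_heavy)
```

Under these assumed rates, halving the sustainable withdrawal rate doubles the savings needed for the same spending, which is the cost of accepting less market risk.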

Comment by 25Hour (aaron-kaufman) on Covid 8/19: Cracking the Booster · 2021-08-21T04:00:32.237Z · LW · GW

I look forward to Thursdays specifically for these updates.

Comment by 25Hour (aaron-kaufman) on Perhaps vastly more people should be on FDA-approved weight loss medication · 2021-08-17T22:21:40.841Z · LW · GW

This is an interesting argument! I certainly acknowledge that if you can become non-obese via purely dietary means, that is best.

I wonder whether your analogy holds in the circumstance where dietary means have been attempted and failed, as often happens judging by the truly staggering number of posts online on this very topic-- whether becoming non-obese via medication constitutes a short-term win outweighed by long-term detriments, and whether the effects of the pills turn out to be more harmful than the original obesity they were meant to treat.

But it's not totally clear to me that you have attempted to make an affirmative case for this being true, as opposed to suggesting it as a pure hypothetical.

Comment by 25Hour (aaron-kaufman) on Perhaps vastly more people should be on FDA-approved weight loss medication · 2021-08-14T22:52:48.323Z · LW · GW

Oh, Wellbutrin (bupropion) is totally a thing you can use for weight loss, and is even found in Contrave (one of the drugs I listed) for that reason. Lesser effect, though, since its weight loss effects are additive with naltrexone.

Berberine is one I hadn't heard of before; unfortunately I can't find any articles discussing its use in weight loss.

Comment by 25Hour (aaron-kaufman) on Perhaps vastly more people should be on FDA-approved weight loss medication · 2021-08-14T21:15:22.807Z · LW · GW

I suppose that's reasonable, though i will point out that this is a fully-general argument against taking any drugs long-term at all.

Comment by 25Hour (aaron-kaufman) on Perhaps vastly more people should be on FDA-approved weight loss medication · 2021-08-14T20:49:29.165Z · LW · GW

Oh, I'm guessing based on purely correlational studies, with all the uncertainty and fuzziness that implies. Added a disclaimer to the relevant section to this effect, since it's worth calling out.

That said, I'd be shocked if the whole effect was due to confounders, since there are so many negative conditions comorbid with obesity, along with the existence of some animal studies also pointing in the direction of improved lifespan with caloric restriction.

Unfortunately, we don't have the ability to run controlled studies over a human lifespan, so we end up needing to do correlational studies and control for what we can. It seems like a bad idea to simply throw up our hands in complete epistemic helplessness and say that we don't know anything for sure; we need to act in the presence of incomplete information.

Also, re: the specific point of

> Diet might fix these, while drugs might not.

Keep in mind that these drugs cause weight loss by way of causing dietary changes.

Comment by 25Hour (aaron-kaufman) on Perhaps vastly more people should be on FDA-approved weight loss medication · 2021-08-14T20:40:11.741Z · LW · GW

Yup! It's branded as "Topamax", but I've heard that some users refer to it as "Stupamax" because of the brain fog effect. It doesn't sound awesome.

Also, it sounded like it increases probability of getting a kidney stone by a lot, though I'd need to track down the reference. All told, feels like one of the worse options out there.

As far as I understand it, "combination" drugs don't really do anything together that each component doesn't do alone. For example, bupropion causes weight loss if you take it alone; it just causes more when you pair it with naltrexone, which also causes weight loss.

Comment by 25Hour (aaron-kaufman) on Perhaps vastly more people should be on FDA-approved weight loss medication · 2021-08-14T20:35:57.788Z · LW · GW

Also, good point about highlighting the uncertainty; I've added a disclaimer to that effect at the beginning of the section.

Comment by 25Hour (aaron-kaufman) on Perhaps vastly more people should be on FDA-approved weight loss medication · 2021-08-14T20:30:15.805Z · LW · GW

Can you give any examples of that happening, where a drug reduces lifespan but not by causing any specific fatal effect?

Comment by 25Hour (aaron-kaufman) on Perhaps vastly more people should be on FDA-approved weight loss medication · 2021-08-14T19:44:02.388Z · LW · GW

All fair points! That said, I think extended lifespan is a very reasonable thing to expect, since IIRC longevity research has found that caloric restriction extends lifespan (in animal studies); this seems like a very natural extrapolation from that.

Comment by 25Hour (aaron-kaufman) on Thoughts on Iason Gabriel’s Artificial Intelligence, Values, and Alignment · 2021-01-15T21:44:47.644Z · LW · GW

I'd be concerned that our instincts toward vengeance in particular would result in extremely poor outcomes if you give humans near-unlimited power (which is mostly granted by being put in charge of an otherwise-sovereign AGI); one potential example is the AGI-controller sending a murderer to an artificial, semi-eternal version of Hell as punishment for his crimes. I believe there's a Black Mirror episode exploring this. In a hypothetical AGI good outcome, this cannot occur.

The idea of a committee of ordinary humans, ems, and semi-aligned AI which are required to agree in order to perform actions does, in principle, avoid this failure mode. Though worth pointing out that this also requires AI-alignment to be solved-- if the AGI is effectively a whatever-maximizer with the requirement that it gets agreement from the ems and humans before acting, it will acquire that agreement by whatever means, which brings up the question of whether the participation of the humans and ems is at all meaningful in this scenario.

The obvious retort is that the AGI would be a tool-ai without the desire to maximize any specific property of the world beyond the degree to which it obeys the humans and ems in the loop; the usual objections to the idea of tool-ais as an AI alignment solution apply here.

Another possible retort, which you bring up, is that the AGI needs to (instead of maximizing a specific property of the world) understand the balance between various competing human values, which is (as I understand it) another way of saying it needs to be capable of understanding and implementing coherent extrapolated volition. Which is fine, though if you posit this then CEV simply becomes the value to maximize instead; which brings us back to the objection that if it wants to maximize a thing conditional on getting consent from a human, it will figure out a way to do that. Which turns "AI governs the world, in a fashion overseen by humans and ems" into "AI governs the world."

Comment by 25Hour (aaron-kaufman) on Thoughts on Iason Gabriel’s Artificial Intelligence, Values, and Alignment · 2021-01-14T19:52:33.720Z · LW · GW

I'd definitely agree with this. Human institutions are very bad at making a lot of extremely crucial decisions; the Stanislav Petrov story, the Holocaust, and the prison system are all pretty good examples of cases where the institutions humans have created have (1) been invested with a ton of power, human and technological, and (2) made really terrible decisions with that power which either could have or did cause untold suffering.

Which I guess is mostly a longer way of saying +1.
