Comments

Comment by Amalthea (nikolas-kuhn) on AI Regulation is Unsafe · 2024-04-27T04:14:25.158Z · LW · GW
  1. I agree that the benefits can potentially go to everyone. The point is that as the person pursuing AGI you are making the choice for everyone else.
  2. The asymmetry is that if you do something that creates risk for everyone else, I believe that does single you out as an aggressor, while conversely, enforcing norms that prevent such risky behavior seems justified. The fact that by default people are mortal is tragic, but doesn't have much bearing here. (You'd still be free to pursue life-extension technology in other ways, perhaps including limited AI tools.)
  3. Ideally, of course, there'd be some sort of democratic process here that lets people in aggregate make informed (!) choices. In the real world, it's unclear what a good solution here would be. What we have right now is the big labs creating facts that society has trouble catching up with, which I think many people are reasonably uncomfortable with.
Comment by Amalthea (nikolas-kuhn) on AI Regulation is Unsafe · 2024-04-27T02:32:02.573Z · LW · GW

I think the perspective that you're missing regarding 2. is that by building AGI one is taking the chance of non-consensually killing vast numbers of people and their children for some chance of improving one's own longevity.

Even if one thinks it's a better deal for them, a key point is that by unilaterally building AGI you are making the decision for them. So in that sense it is quite reasonable to see working towards that outcome as an "evil" action.

Comment by Amalthea (nikolas-kuhn) on AI Regulation is Unsafe · 2024-04-27T00:11:32.134Z · LW · GW

Somewhat of a nitpick, but the relevant number would be p(doom | strong AGI being built) (maybe contrasted with p(utopia | strong AGI)), not overall p(doom).

Comment by Amalthea (nikolas-kuhn) on AI Regulation is Unsafe · 2024-04-24T01:49:25.849Z · LW · GW

I was downvoting this particular post because I perceived it as mostly ideological and making few arguments, only stating strongly that government action will be bad. I found the author's replies in the comments much more nuanced and would not have downvoted if I'd perceived the original post to be of the same quality.

Comment by Amalthea (nikolas-kuhn) on Tamsin Leake's Shortform · 2024-04-22T08:00:28.654Z · LW · GW

Basically, I think whether or not one thinks alignment is hard is much more of the crux than whether or not they're utilitarian.

Personally, I don't find Pope & Belrose very convincing, although I do commend them for the reasonable effort - but if I did believe that AI is likely to go well, I'd probably also be all for it. I just don't see how this is related to utilitarianism (maybe for all but a very small subset of people in EA).

Comment by Amalthea (nikolas-kuhn) on Tamsin Leake's Shortform · 2024-04-22T02:08:01.516Z · LW · GW

"One reason that goes overlooked is that most human beings are not utilitarians" I think this point is just straightforwardly wrong. Even from a purely selfish perspective, it's reasonable to want to stop AI.

The main reason humanity is not going to stop seems to be coordination problems, or something close to learned helplessness in these kinds of competitive dynamics.

Comment by Amalthea (nikolas-kuhn) on Tamsin Leake's Shortform · 2024-04-21T22:42:12.690Z · LW · GW

"Unambiguously evil" seems unnecessarily strong. Something like "almost certainly misguided" might be more appropriate? (Still strong, but arguably defensible.)

Comment by Amalthea (nikolas-kuhn) on What's with all the bans recently? · 2024-04-04T12:04:26.276Z · LW · GW

Do you have an example for where better conversations are happening?

Comment by Amalthea (nikolas-kuhn) on My Interview With Cade Metz on His Reporting About Slate Star Codex · 2024-03-27T13:05:57.057Z · LW · GW

You're conflating "have important consequences" and "can be used as weapons in discourse".

Comment by Amalthea (nikolas-kuhn) on My Interview With Cade Metz on His Reporting About Slate Star Codex · 2024-03-27T09:37:26.051Z · LW · GW

What do you mean by "example" here? That this is demonstrating a broader property, or that in this situation there was a tribal dynamic?

Comment by Amalthea (nikolas-kuhn) on Intuition for 1 + 2 + 3 + … = -1/12 · 2024-02-18T23:54:21.036Z · LW · GW

Typo, thanks for pointing it out. Also, see here for the physics reference: https://en.m.wikipedia.org/wiki/1_%2B_2_%2B_3_%2B_4_%2B_⋯

Comment by Amalthea (nikolas-kuhn) on Intuition for 1 + 2 + 3 + … = -1/12 · 2024-02-18T21:57:11.574Z · LW · GW

I'm one of those professional mathematicians, and I'll say that this article completely fails to demonstrate its central thesis that there is a valid intuitive argument for concluding that 1 + 2 + 3 + ... = -1/12 makes sense. What's worse, it only pretends to do so by what's essentially a swindle. In my understanding, it's relatively easy to reason that a given divergent series "should" take an arbitrary finite value by the kinds of arguments employed here, so what is being done is taking a foregone conclusion and providing some false intuition for why it should be true.
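
To illustrate the kind of ambiguity I have in mind (a standard textbook example, not one taken from the article): the same naive shift-and-group manipulations already "justify" three different values for Grandi's series.

```latex
% A standard illustration (not from the article): naive manipulations applied
% to the divergent series S = 1 - 1 + 1 - 1 + \dots
\begin{align*}
S &= (1 - 1) + (1 - 1) + \dots = 0, \\
S &= 1 + (-1 + 1) + (-1 + 1) + \dots = 1, \\
S &= 1 - (1 - 1 + 1 - \dots) = 1 - S \;\Longrightarrow\; S = \tfrac{1}{2}.
\end{align*}
% None of these steps is legitimate for a divergent series, which is exactly
% why such arguments can "support" whichever value one has already decided on.
```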

On a less serious note, speaking to the real reason why 1 + 2 + 3 + ... = -1/12, that's actually what physicists will tell you, and we all know one should be careful around those.

Comment by Amalthea (nikolas-kuhn) on Intuition for 1 + 2 + 3 + … = -1/12 · 2024-02-18T21:45:05.460Z · LW · GW

Sure, but you're just claiming that, and I don't think it's actually true.

Comment by Amalthea (nikolas-kuhn) on Intuition for 1 + 2 + 3 + … = -1/12 · 2024-02-18T19:37:21.346Z · LW · GW

You run into the trouble of having to defend why your way of fitting the divergent series into a pattern is the right one - other approaches may give different results.
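
As a standard example (not one from this thread): two perfectly respectable summation methods can simply disagree about whether the series has a finite value at all.

```latex
% Abel summation: evaluate \sum_{n \ge 1} n x^n for 0 < x < 1 and let x \to 1^-;
% the limit diverges, so Abel summation assigns no finite value to 1 + 2 + 3 + \dots
\sum_{n \ge 1} n x^{n} = \frac{x}{(1-x)^{2}} \longrightarrow \infty
\quad \text{as } x \to 1^{-},
% whereas zeta regularization continues \sum_{n \ge 1} n^{-s} = \zeta(s) to s = -1:
\zeta(-1) = -\tfrac{1}{12}.
```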

Comment by Amalthea (nikolas-kuhn) on Sam Altman’s Chip Ambitions Undercut OpenAI’s Safety Strategy · 2024-02-14T19:58:28.616Z · LW · GW

I think it's quite unlikely that GPT 5 will destroy the world. That said, I think it's generally reasonable to doubt prediction markets on questions that can't be fairly evaluated both ways.

Comment by Amalthea (nikolas-kuhn) on Sam Altman’s Chip Ambitions Undercut OpenAI’s Safety Strategy · 2024-02-14T19:55:33.375Z · LW · GW

I think the possibility of a compute overhang seems plausible given the technological realities, but generalizing from this to a second-order overhang etc. seems to be taking it too far.

If there is an argument that we should push compute due to the danger of another "overhang" down the line, it should be made explicitly and not by generalization from one (debatable!) example.

Comment by Amalthea (nikolas-kuhn) on There is way too much serendipity · 2024-01-23T08:59:06.883Z · LW · GW

Sorry, low effort comment on my side. Still, I think the original link seems misleading in the point it's purportedly trying to make.

Comment by Amalthea (nikolas-kuhn) on There is way too much serendipity · 2024-01-22T17:28:01.409Z · LW · GW

That doesn't have any bearing historically. It also seems more like a brute-force search, where the component of studying the materials' properties has been made more efficient (by partially replacing lab experiments with deep learning).

Comment by Amalthea (nikolas-kuhn) on The impossible problem of due process · 2024-01-21T16:25:11.117Z · LW · GW

Which document, and in what way?

Comment by Amalthea (nikolas-kuhn) on An Introduction To The Mandelbrot Set That Doesn't Mention Complex Numbers · 2024-01-18T10:21:40.938Z · LW · GW

To be quite frank, you're avoiding complex numbers only in the sense that you spell out the operations involved in handling complex numbers explicitly - so of course there's no added benefit, you're simply lifting the lid of the box...

That being said, as you discover by decomposing complex multiplication into its parts (rotation and scaling), you get to play with them separately, which already leads you to discover interesting new variations on the theme.
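
For concreteness, here's a minimal sketch of what I mean by "lifting the lid of the box" - the standard escape-time iteration with the complex arithmetic spelled out componentwise. The function name, escape threshold, and the specific variation mentioned in the comments are illustrative choices of mine, not taken from the post.

```python
# Sketch: the Mandelbrot iteration z -> z^2 + c written entirely in real
# arithmetic, i.e. with the complex multiplication "unpacked".

def mandelbrot_escape_time(cx: float, cy: float, max_iter: int = 100) -> int:
    """Return the iteration at which (x, y) escapes |z| > 2, or max_iter if it never does."""
    x, y = 0.0, 0.0
    for i in range(max_iter):
        # (x + iy)^2 = (x^2 - y^2) + i(2xy): exactly complex squaring,
        # just spelled out componentwise.
        x, y = x * x - y * y + cx, 2 * x * y + cy
        if x * x + y * y > 4.0:  # |z| > 2 guarantees divergence
            return i
    return max_iter

# Once the squaring step is written this way, the "rotation" (angle doubling)
# and the "scaling" (squaring of the modulus) can be tweaked separately,
# e.g. double the angle but raise the modulus to a different power - one of
# the kinds of variations alluded to above.

if __name__ == "__main__":
    print(mandelbrot_escape_time(-0.5, 0.5))  # inside the set, prints 100
    print(mandelbrot_escape_time(1.0, 1.0))   # escapes almost immediately, prints 1
```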

Comment by Amalthea (nikolas-kuhn) on The impossible problem of due process · 2024-01-17T15:11:56.526Z · LW · GW

I don't understand what you're trying to say - would you mind trying to illustrate what you mean?

Comment by Amalthea (nikolas-kuhn) on The impossible problem of due process · 2024-01-17T10:57:11.439Z · LW · GW

"if I keep having "misunderstandings" with more people who have no past record of similar behavior, after two or three it cumulatively becomes a strong Bayesian evidence that I am actually the bad guy." It's not quite that easy. Abuser's may particularly tend to seek out vulnerable people, and it's a real effect that when you are already raising a complaint about someone, this may open you up to further abuse by other bad actors, who can now have exploit that you now have spent your social capital. In other words, Bayesian considerations have a place, but you need to be extra careful that you're not misattributing the correlations.

Comment by Amalthea (nikolas-kuhn) on The impossible problem of due process · 2024-01-17T09:30:13.355Z · LW · GW

Fighting dirty can involve looking reasonable to the outside, e.g. being willing to lie or bend the truth, or distracting from the key issues of the matter - these can all be done civilly.

Comment by Amalthea (nikolas-kuhn) on Most People Don't Realize We Have No Idea How Our AIs Work · 2024-01-10T10:27:36.130Z · LW · GW

I think it's also easy to falsely conflate progress in understanding with having achieved some notable level of understanding. Whether one has the latter will likely only become clear after a significant passage of time, so it's hard to make a judgement right away. That said, it's fair to say "No idea" is overstating the case compared to e.g. "We understand very little about".

Comment by Amalthea (nikolas-kuhn) on re: Yudkowsky on biological materials · 2023-12-15T16:20:58.832Z · LW · GW

"The others give humanity a chance to see what is happening and change the rules 'in flight '."

This is possible in non-Foom scenarios, but not a given (e.g. super-human persuasion AIs).

Comment by Amalthea (nikolas-kuhn) on AI #41: Bring in the Other Gemini · 2023-12-07T20:08:06.939Z · LW · GW

I tentatively agree, but it also seems difficult to think of a better alternative.

Comment by Amalthea (nikolas-kuhn) on Quick takes on "AI is easy to control" · 2023-12-04T16:01:08.907Z · LW · GW

I agree. It's rare enough to get reasonable arguments for optimistic outlooks, so this seems worth it for someone to engage with openly and in some detail.

Comment by Amalthea (nikolas-kuhn) on Neither EA nor e/acc is what we need to build the future · 2023-11-29T09:27:29.692Z · LW · GW

I mean, he didn't threaten to take the team with him; he was just going to do so.

We also don't know what went on behind the scenes, and it seems plausible that many OpenAI employees were (mildly) pressured into signing by the pro-Sam crowd.

So if counterfactually he hadn't been willing to destroy the company, he could have assuaged the people closest to him, and likely the dynamics would have been much different.

Comment by Amalthea (nikolas-kuhn) on Neither EA nor e/acc is what we need to build the future · 2023-11-28T16:33:20.371Z · LW · GW

One also might argue that Altman was willing to see the organization destroyed, and that he was the one raising the threat of taking OpenAI with him if he went down.

Comment by Amalthea (nikolas-kuhn) on Stephanie Zolayvar's Shortform · 2023-11-23T13:56:41.284Z · LW · GW

I completely understand that sentiment and am myself concerned about the social dynamics we could witness there.

Nevertheless, I think it is unclear how much these events matter in the end. Personally, I updated a bit towards people at OpenAI not really knowing what they're doing, which is good to know if true.

Comment by Amalthea (nikolas-kuhn) on OpenAI: Facts from a Weekend · 2023-11-20T20:11:31.581Z · LW · GW

It's hard to know for sure, but I think this is a reasonable and potentially helpful perspective. Some of the perceived repercussions on the state of AI safety might be "the band-aid being ripped off". 

Comment by Amalthea (nikolas-kuhn) on OpenAI: Facts from a Weekend · 2023-11-20T17:59:58.011Z · LW · GW

It doesn't seem to me like e/acc has contributed a whole lot to this beyond commentary. The rallying of OpenAI employees behind Altman is quite plausibly due to his general popularity + ability to gain control of a situation.

At least that seems likely if Paul Graham's assessment of him as a master persuader is to be believed (and why wouldn't it be?).

Comment by Amalthea (nikolas-kuhn) on OpenAI: Facts from a Weekend · 2023-11-20T17:45:23.231Z · LW · GW

Whatever else, there were likely mistakes from the side of the board, but man does the personality cult around Altman make me uncomfortable. 

Comment by Amalthea (nikolas-kuhn) on Sam Altman fired from OpenAI · 2023-11-17T22:24:33.485Z · LW · GW

Adam D'Angelo via X:

Oct 25

This should help access to AI diffuse throughout the world more quickly, and help those smaller researchers generate the large amounts of revenue that are needed to train bigger models and further fund their research.

Oct 25

We are especially excited about enabling a new class of smaller AI research groups or companies to reach a large audience, those who have unique talent or technology but don’t have the resources to build and market a consumer application to mainstream consumers.

Sep 17

This is a pretty good articulation of the unintended consequences of trying to pause AI research in the hope of reducing risk: [citing Nora Belrose's tweet linking her article]

Aug 25

We (or our artificial descendants) will look back and divide history into pre-AGI and post-AGI eras, the way we look back at prehistoric vs "modern" times today.

Aug 20

It’s so incredible that we are going to live through the creation of AGI. It will probably be the most important event in the history of the world and it will happen in our lifetimes.

Comment by Amalthea (nikolas-kuhn) on Sam Altman fired from OpenAI · 2023-11-17T22:12:13.095Z · LW · GW

Judging from his tweets, D'Angelo seems significantly unconcerned with AI risk, so I was quite taken aback to find out he was on the OpenAI board. That said, I might be misinterpreting his views based on vibes.

Comment by Amalthea (nikolas-kuhn) on A framing for interpretability · 2023-11-14T18:47:01.992Z · LW · GW

"And we have a good idea of what signals we care about."

Seems dubious. Or, understood narrowly, it's an irrelevant tautology, and the real question is which signals are important (what we should care about) - and again it's unclear whether we know that.

At least it'd be good to give further evidence (sorry if that is elsewhere and I missed it). 

Comment by nikolas-kuhn on [deleted post] 2023-10-30T20:02:15.467Z

Minor point: It's unclear to me that a model that doesn't contain harmful information in the training set is significantly bad. One problem with current models is that they provide the information in an easily accessible form - so if bad actors have to assemble it themselves for fine-tuning, that at least makes their job partially harder. Also, it seems at least plausible that the fine-tuned model will still underperform compared to one that contained dangerous information in the original training data.

Comment by Amalthea (nikolas-kuhn) on math terminology as convolution · 2023-10-30T18:16:41.909Z · LW · GW

"But incentives in mathematics aren't structured around that being the case, perhaps because elegance of proofs is harder for institutions to measure."

I don't think this is true in a strong form (although it's arguably not "as important"). If you found a proof of the 4-color theorem that doesn't rely on an exhaustive machine search, that would be quite a noteworthy achievement.

Comment by Amalthea (nikolas-kuhn) on Techno-humanism is techno-optimism for the 21st century · 2023-10-28T08:48:14.398Z · LW · GW

I find it quite alienating that you seem to be conflating "techno-optimism" with "technological progress".

Particularly, I think "techno-optimism", beyond "recognizing that technological progress is often good (and maybe to a larger extent than is often recognized)", easily rises to the level of an ideology in that it diverts from truth-seeking (exemplified by Andreessen).

Basically, I agree on most of the object-level points you make, but, in my intuition, having an additional emotional layer of attachment to this or that belief is not a thing we want in and of itself.

Comment by Amalthea (nikolas-kuhn) on AI as a science, and three obstacles to alignment strategies · 2023-10-26T14:17:31.250Z · LW · GW

"That deep learning systems are a kind of artifact produced by a few undifferentiated commodity inputs, one of which is called 'parameters', one called 'compute', and one called 'data', and that the details of these commodities aren't important. Or that the details aren't important to the people building the systems."

That seems mostly true so far for the most capable systems? Of course, some details matter and there's opportunity to do research on these systems now, but centrally it seems like you are much more able to forge ahead without a detailed understanding of what you're doing than e.g. in the case of the Wright brothers.

Comment by Amalthea (nikolas-kuhn) on Arguments for optimism on AI Alignment (I don't endorse this version, will reupload a new version soon.) · 2023-10-16T15:02:58.966Z · LW · GW

Hmm, I guess the point of using the term "white box" is then to illustrate that it is not a literal black box, while the point of the term "black box" is that, although it's literally a transparent system, we still don't understand it in the ways that matter. There's something that feels really off about the dynamic of term use here, but I can't quite articulate it.

Comment by Amalthea (nikolas-kuhn) on Arguments for optimism on AI Alignment (I don't endorse this version, will reupload a new version soon.) · 2023-10-16T08:51:08.257Z · LW · GW

"The track record of mech interp for alignment is quite poor, especially compared to gradient based methods like RLHF."

I think this is essentially what people mean when they say "LLMs are a black box", and since you seem to be agreeing, I find myself very confused that you've been pushing a "white box" talking point.

Comment by Amalthea (nikolas-kuhn) on AI Alignment Breakthroughs this week (10/08/23) · 2023-10-11T13:37:37.420Z · LW · GW

I don't think that essay does what you want. It seems to be about "you can't always capture the meaning of something by writing down a simple precise definition", while the complaint is that you're not using the word according to its widely agreed-upon meaning. If you don't want to keep explaining what you specifically mean by "breakthrough" in your title each time, you could simply change to a more descriptive word.

Comment by Amalthea (nikolas-kuhn) on EA Vegan Advocacy is not truthseeking, and it’s everyone’s problem · 2023-10-06T14:07:32.565Z · LW · GW

"AI doom arguments are more intuitive than AI safety by default arguments, making AI doom arguments requires less technical knowledge than AI safety by default arguments, and critically the AI doom arguments are basically entirely wrong, and the AI safety by default arguments are mostly correct."

I really don't like that you make repeated assertions like this. Simply claiming that your side is right doesn't add anything to the discussion and easily becomes obnoxious.

Comment by Amalthea (nikolas-kuhn) on Evaluating the historical value misspecification argument · 2023-10-06T07:12:25.346Z · LW · GW

I don't understand your objection. A more capable AI might understand that it's completely sufficient to tell you that your mother is doing fine, and simulate a phone call with her to keep you happy. Or it just talks you into not wanting to confirm in more detail, etc. I'd expect that the problem wouldn't be to get the AI to do what you want in a specific supervised setting, but to remain in control of the overall situation, which includes being able to rely on the AI's actions not having any ramifications beyond its narrow task.

The question is how you even train the AI under the current paradigm once "human preferences" stops being a standard for evaluation and just becomes another aspect of the AI's world model that needs to be navigated.

Comment by Amalthea (nikolas-kuhn) on Evaluating the historical value misspecification argument · 2023-10-06T06:56:42.773Z · LW · GW

Can you explain how this comment applies to Zvi's post? In particular, what is the "subtle claim" that Zvi is not addressing? I don't particularly care about what MIRI people think, just about the object level.

Comment by Amalthea (nikolas-kuhn) on AI Alignment Breakthroughs this Week [new substack] · 2023-10-02T13:03:53.951Z · LW · GW

It sounds like what you call a breakthrough, I'd just call a "result". In my understanding, it'd either have to open up an unexpected + promising new direction, or solve a longstanding problem in order to be considered a breakthrough.

Unfortunately, significant insights into alignment seem much rarer than "capabilities breakthroughs" (which are probably also more due to an accumulation of smaller insights, so even there one might simply say the field is moving fast).

Comment by Amalthea (nikolas-kuhn) on AI Alignment Breakthroughs this Week [new substack] · 2023-10-02T08:22:16.882Z · LW · GW

In general, it seems interesting to track the state of AI-alignment techniques, and how different ideas develop! 

I strongly suggest not using the term "Breakthrough" so casually, in order to avoid unnecessary hype. It's unclear we've had any alignment breakthrough so far, and talking about "weekly breakthroughs" seems absurd at best.

Comment by Amalthea (nikolas-kuhn) on The King and the Golem · 2023-09-30T14:17:32.082Z · LW · GW

In short, the king has given up a situation with known unknowns in favor of one with unknown unknowns and some additional economic gain.

Comment by Amalthea (nikolas-kuhn) on My Current Thoughts on the AI Strategic Landscape · 2023-09-30T12:01:37.421Z · LW · GW

I disagree in the sense that I don't think current systems are intelligent enough for "aligned" to be a relevant adjective. "Safe" or "controllable" seem much better, while I would reserve the term "aligned" for the much stronger property that a system is robustly behaving in accordance with our interests. I agree with Steven Byrnes that "locally aligned" doesn't even make much sense ("performing as intended under xyz circumstances" would be much more descriptive).