Comments
Thanks! I should have been more clear that the trajectory toward level 5 (with all human virtue/trust being hackable for instrumental gains) itself is concerning, not just the eventual leap when it gets there.
The next goodwill-inducing paradigm that has outlived its utility seems to be the concept of "AGI":
From here:
Oddly, that could be the key to getting out from under its contract with Microsoft. The contract contains a clause that says that if OpenAI builds artificial general intelligence, or A.G.I. — roughly speaking, a machine that matches the power of the human brain — Microsoft loses access to OpenAI’s technologies.
The clause was meant to ensure that a company like Microsoft did not misuse this machine of the future, but today, OpenAI executives see it as a path to a better contract, according to a person familiar with the company’s negotiations. Under the terms of the contract, the OpenAI board could decide when A.G.I. has arrived.
Despite OpenAI being founded on the precept of developing AGI, and structuring the company and many major contracts around the idea while never precisely defining it, there now seems to be deliberate distancing, as evidenced here. Notably, Sam's recent vision of the future, "The Intelligence Age", does not mention AGI.
I expect more tweets like this from OpenAI employees in the coming weeks/months, expressing doubts about the notion of AGI, often taking care to say that the causal motivations are altruistic/epistemic.
I categorically disagree with Eliezer's tweet that "OpenAI fired everyone with a conscience", and all of this might not be egregious as far as corporate sleights-of-hand/dissonance go - but scaled up recursively, eg. when extended to principles relating to alignment/warning shots/surveillance/misinformation/weapons, this does not bode well.
- Explain why you're concerned in public.
I'm concerned about OpenAI's behavior in the context of their stated trajectory towards level 5 intelligence - running an organization. If the model for a successful organization rests on a dissonance, where actions intended to foster goodwill (open research/open source/non-profit/safety concerned/benefit all of humanity) turn out to be instrumental rather than intrinsic and need NDAs/financial pressure/lobbying to be whitewashed, then scaling that up with AGI (which would have more intimate and expansive data, greater persuasiveness, more emotional detachment, less moral hesitation) seems clearly problematic.
In the spirit of Situational Awareness, I'm curious how people are parsing some apparent contradictions:
- OpenAI is explicitly pursuing AGI
- Most/many people in the field (eg. Leopold Aschenbrenner, who worked with Ilya Sutskever) presume that (approximately) when AGI is reached, we'll have automated software engineers and ASI will follow very soon
- SSI is explicitly pursuing straight-shot superintelligence - the announcement starts off by claiming ASI is "within reach"
- In his departing message from OpenAI, Sutskever said "I’m confident that OpenAI will build AGI that is both safe and beneficial...I am excited for what comes next - a project that is very personally meaningful to me about which I will share details in due time"
- At the same time, Sam Altman said "I am forever grateful for what he did here and committed to finishing the mission we started together"
Does this point to increased likelihood of a timeline in which somehow OpenAI develops AGI before anyone else, and also SSI develops superintelligence before anyone else?
Does it seem at all likely from the announcement that by "straight-shot" SSI is strongly hinting that it aims to develop superintelligence while somehow sidestepping AGI (which they won't release anyway) and automated software engineers?
Or is it all obviously just speculative talk/PR, not to be taken too literally, and we don't really need to put much weight on the differences between AGI/ASI for now? If that were the case, it just seems like more specificity than warranted.
Could you clarify how binding "OpenAI’s mission is to ensure that artificial general intelligence benefits all of humanity." is?
I'm curious how people are parsing this rumor (part of Connor's tweets):
I recall a story of how a group of AI researchers at a leading org (consider this rumor completely fictional and illustrative, but if you wanted to find its source it's not that hard to find in Berkeley) became extremely depressed about AGI and alignment, thinking that they were doomed if their company kept building AGI like this. So what did they do? Quit? Organize a protest? Petition the government? They drove out, deep into the desert, and did a shit ton of acid...and when they were back, they all just didn't feel quite so stressed out about this whole AGI doom thing anymore, and there was no need for them to have to have a stressful confrontation with their big, scary, CEO.
Do people who are in proximity to the relevant community consider this anecdote fictional/not-pertinent/exaggerated/but-of-course with respect to AI safety?
Sure! I think a bunch of other answers touch upon this though.
The idea is that it's not determinism in itself that's causing the demotivation; that's just a narrative your subconscious mind brings forward when faced with a tough task, to protect you from thinking about something that is more difficult to face, but often actionable, e.g. "I feel I'm not smart enough", "I think I will fail", "I'm embarrassed about what others will think". By explicitly asking yourself what that 'other' cause is (by phrasing it as above, or perhaps by imagining a stern parent/coach giving you a reality check), you can focus on something that might be very tough but, unlike the universe being deterministic, is not literally impossible to solve.
The tool you essentially have in the face of determinism despair is awareness of distributed causality. It is the 'thinking about/sense of' part that is (or seems to be) causing it. A practical exercise I like is asking "If I had to bring myself to face the most 'makes me feel bad about myself' cause of my demotivation, what would it be?". Existential despair often masks some other pertinent but deeply invalidating anxiety.
I'm a former quant now figuring out how to talk to tech people about love (I guess it's telling that I feel a compelling pressure to qualify this).
Currently reading
https://www.nytimes.com/2023/10/16/science/free-will-sapolsky.html
Open to talking about anything in this ballpark!
Ok, this is me discarding my 'rationalist' hat, and I'm not quite sure of the rules and norms applicable to shortforms, but I simply cannot resist pointing out the sheer poetry of the situation.
I made a post about unconditional love and it got voted down to the point I didn't have enough 'karma' to post for a while. I'm an immigrant from India and took Sanskrit for six years - let's just say there is a core epistemic clash in its usage on this site[1]. A (very intelligent and kind) person whose id happens to be Christian takes pity and suggests, among other things, a syntactic distancing from the term 'love'.
TMI: I'm married to a practicing Catholic - named Christian.
[1] Not complaining - I'm out of karma jail now and it's a terrific system. Specifically saying that the essence of 'karma', etymologically speaking, lies in its intangibility and implicit nature.
Thank you - I agree with you on all counts, and your comment on my thesis needing to be falsifiable is a helpful direction for me to focus.
I alluded to this above - this constraint to operate within provability was specifically what led me away from rationalist thinking a few years ago - I felt that when it really mattered (Trump, SBF, existential risk, consciousness), there tended to be this edge-case Gödelian incompleteness where the models stopped working and people ended up fighting and fitting theories to justify their biases and incentives, or choosing to focus instead on the optimal temperature for heating toast.
So for the most part, I'm not very surprised. I have been re-acquainting myself over the last couple of weeks to try and speak the language better. However, it's sad to see, for instance, the thread on MIRI drama, and hard not to correlate that with the dissonance from real life, especially given the very real-life context of p(doom).
The use of 'love' and 'unconditional love' from the get-go was very intentional, partly because they seem to bring up strong priors and aversion-reflexes, and I wanted to face that head on. But that's a great idea - to try and arrive at these conclusions without using the word.
Regardless, I'm sure my paper needs a lot of work and can be improved substantially. If you have more thoughts, or want to start a dialogue, I'd be interested.
But, your phrasing here feels a bit like a weird demand for exceptional rigor.
No - the opposite. I was implying that there's clearly a deeper underpinning to these patterns that any amount of rigor will be insufficient in solving, but my point has been articulated within KurtB's excellent later comment, and solutions in the earlier comment by jsteinhardt.
it's not that weird for a company to have an intense manager
I agree; that's very true. However, this usually occurs in companies that are chasing zero-sum goals. Employees treated in this manner might often resort to a combination of complaining to HR, being bound by NDAs, or biting the bullet while waiting for their paydays. It's just particularly disheartening to hear of this years-long pattern, especially given the induced discomfort in speaking out and the efforts to downplay, in an organization that publicly aims to save the world.
Thanks - that's fair on all levels. Where I'm coming from is an unyielding first-principles belief in the power and importance of love. It took me some life experience and introspection to acquire, and it doesn't translate well to strictly provable models. It takes a lot of iterations of examining things like "people (including very smart ones) just end up believing the world models that make them feel good about themselves", "people are panicked about AI and their beliefs are just rationalizations of their innate biases", "if my family or any social circle don't really love each other, it always comes through", and "Elon's inclination to cage fight or fly to Mars is just repressed fight or flight" to arrive at it.
I tried to justify it through a model of recurrence and self-similarity in consciousness, but clearly that's not sufficient or well articulated enough.
So yeah, I hear you on the inferential distance from LW ideas, and your model of "unconditional love" being more cloistered. For what it's worth - it really isn't, maybe I should find an analogue in diffusion models, I dunno. The negative, anti-harmonic effects at least are clearly visible and pervasive everywhere - there is no model that adequately captures our pandemic trauma and disunity, but it ends up shaping everything because we are animals and not machines, and quite good at hiding our fears and insecurities when posting on social media or being surveyed or even being honest with ourselves.
Thank you for taking the time to reply and engage - it's an unconditionally kind act!
Three points that might be somewhat revealing:
- There was never an ask for reciprocal documents from employees. "Here's a document describing how to communicate with me. I'd appreciate you sending me pointers on how to communicate with you, since I am aware of my communication issues." was never considered.
- There are multiple independent examples of people in various capacities, including his girlfriend, expressing that their opinions were not valued, and a clear hierarchical model was in play.
- The more humble "my list of warnings" was highlighted immediately as justification but never broadcast broadly, and there seems to be no cognizance that it's not something anyone else would ever take upon themselves to share.
So I posted my paper, and it did get downvoted, with no comments, to the point I can't comment or post for a while.
That's alright - the post is still up, and I am not blind to the issue with trying to convince rationalists that love is real and super important biologically and obviously all that actually matters to save the world and exponentially more so because AI people are optimizing for everything else - without coming off as insulting or condescending. This presumption, of course, is just me rephrasing my past issues with rationalism, but it was always going to be hard to find an overlap of people who value emotions and understand AI.
For now, I'm taking this as a challenge to articulate my idea better, so I can at least get some critique. Maybe I'll take your suggestion and try distilling it in some way.
Thank you!
I'd call myself a lapsed rationalist. I have an idea I've been thinking about that I'd really like feedback on, have it picked apart etc. - and strongly feel that LessWrong is a good venue for it.
As I'm going through the final edits, while also re-engaging with other posts here, I'm discovering that I keep modifying my writing to make it 'fit' LW's guidelines and norms, and it's not been made easy by the fact that my world-lens has evolved significantly in the last five-ish years since I drifted away from this modality.
Specifically, I keep second-guessing myself with stuff like "ugh, this is obvious but I should really spell it out", "this is too explicit to the point of being condescending", "this is too philosophical", "this is trivial".
I haven't actually ever posted anything or gotten feedback here, so I'm sure it's some combination of overthinking, simply being intimidated and being hyper-aware of the naivete in my erstwhile world view.
My goal really is to get to the point that I'm reasonably confident it doesn't get deleted for some reason after I post.
I guess this is serving to dip my toe in the water and express my frustration. Thoughts?