Comments
Thanks! I'm excited to go over the things I never heard of
So far,
- Elevenlabs app: great, obviously
- Bolt: I didn't like it
- I asked it to create a React Native app that prints my GPS coordinates to the screen (as a POC), and it couldn't do it. I also asked for a podcast app (someone must, and no one else will..); it did less well than Replit (though Replit built a web app). Anyway, my main use case would be mobile apps (I don't have a reasonable solution for that yet), and I hardly have any mobile development experience, so this is an extra interesting use case for me. (A minimal sketch of the GPS POC I had in mind is after this list.)
- It sounds like maybe you're missing templates to start from? I do think Bolt's templates have something cool about them, but I don't think
- Warp: I already use the free version and I like it very much. Great for things like "stop this docker container and also remove the volume"
- Speech to text: I use ChatGPT voice. My use case is "I'm riding my bike and I want to use the time to write a document", so we chat about it back and forth
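For reference, here is roughly what I mean by the GPS POC; a minimal sketch, assuming an Expo-managed React Native app with the expo-location package (my own assumptions for the example, not anything Bolt generated):

```tsx
import React, { useEffect, useState } from 'react';
import { Text, View } from 'react-native';
import * as Location from 'expo-location';

export default function App() {
  const [coords, setCoords] = useState('Waiting for GPS...');

  useEffect(() => {
    (async () => {
      // Ask for foreground location permission (required on both iOS and Android).
      const { status } = await Location.requestForegroundPermissionsAsync();
      if (status !== 'granted') {
        setCoords('Location permission denied');
        return;
      }
      // Read the current position once and print it to the screen.
      const position = await Location.getCurrentPositionAsync({});
      setCoords(`${position.coords.latitude}, ${position.coords.longitude}`);
    })();
  }, []);

  return (
    <View style={{ flex: 1, alignItems: 'center', justifyContent: 'center' }}>
      <Text>{coords}</Text>
    </View>
  );
}
```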
Q:
5. How do you "Use o1-mini for more complex changes across the codebase"? (what tool knows your code and can query o1 about it?)
5.1. OMG, is that what Cursor Composer is? I have got to try that
I don't think so (?)
There are physical things that make me have more nightmares, like being too hot, or needing to pee
Sounds like I might be missing something obvious?
I find lucid dreams to be effective "against" nightmares (for 10+ years already).
AMA if you want
Thanks for sharing <3
My main concern about trying SSRIs is that they'll make me stop noticing certain things that I care about, things that currently manifest as anxiety or so.
Opinions?
As AIs become more capable, we may at least want the option of discussing them out of their earshot.
If I wanted to discuss something outside of an AI's earshot, I'd use something like Signal, or something that would keep out a human too.
AIs sometimes have internet access, and robots.txt won't keep them out.
I don't think having this info in their training set is a big difference (but maybe I don't see the problem you're pointing out, so this isn't confident).
Scaling matters, but it's not all that matters.
For example, RLHF
@habryka, would you reply to this comment if there's an opportunity to donate to either? Another person and I are interested, and others could follow this comment too if they wanted to
(only if it's easy for you, I don't want to add an annoying task to your plate)
+1, you convinced me.
I worry this will distract from risks like "making an AI that is smart enough to learn how to hack computers from scratch", but I don't buy the general "don't distract with true things" argument.
"I don't think that there is more that 1% that support direct violence against non-terrorists for its own sake": This seems definitely wrong to me, if you also count Israelies who consider everyone in Gaza as potential terrorists or something like that.
If you offer Israelies:
Button 1: Kill all of Hamas
Button 2: Kill all of Gaza
Then definitely more than 1% will choose Button 2
I haven't heard of anything like that (but not sure if I would).
Note there are also problems in trying to set up a government using force, in setting up a police force there if they're not interested in it, and in building an education system (which is currently, afaik, very anti-Israel and wouldn't accept Israel's opinions on changes, I think) ((not that I'm excited about Israel's internal education system either)).
I do think Israel provides water, electricity, internet, equipment, medical equipment (subsidized? free? I'm not sure of all this anyway) to Gaza. I don't know if you count that as something like "building a stockpile of equipment for providing clean drinking water to residents of occupied territory".
I don't claim the current solution is good, I'm just pointing out some problems with what I think you're suggesting (and I'm not judging whether those problems are bigger or smaller).
What do you mean by "building capacity" in this context? (maybe my English isn't good enough, I didn't understand your question)
I was a software developer in the Israeli military (not a data scientist), and I was part of a course that constantly trains software developers for various units to use.
The big picture is that the military is a huge organization, and there is a ton of room for software to improve everything. I can't talk about specific uses (just like I can't describe our tanks or whatever, sorry if that's what you're asking, and sorry I'm not giving the full picture), but even things like logistics or servers or healthcare have big teams working on them.
Also remember that the military started a long time ago, when there weren't good off-the-shelf solutions for everything, and imagine how big the companies are that make many of the products that you (or orgs) use.
- There are also many Israelis that don't consider Palestinians to be humans worth protecting, but rather as evil beings / outgroup / whatever you'd call that.
- Also (with much less confidence), I do think many Palestinians want to kill Israelis because of things that I'd consider brainwashing.
- Hard question - what to do about a huge population that's been brainwashed like that (if my estimation here is correct), or how might a peaceful resolution look?
Not a question, but seems relevant for people who read this post:
Meni Rosenfeld, one of the early LessWrong Israel members, has enlisted:
Source: https://www.facebook.com/meni.rosenfeld/posts/pfbid0bkvfrb3qFTF7U82eMgkZzgMjMT4s3pbGUx7ahgKX1B8hr2n1viYqg9Msz6t3dBUPl (a public post by him)
Any ideas on how much to read this as "Sam's actual opinions" vs "Sam trying to say things that will satisfy the maximum number of people"?
(do we have priors on his writings? do we have information about him absolutely not meaning one or more of the things here?)
Hey Kaj :)
The part-hiding-complexity here seems to me like "how exactly do you take a-simulation/prediction-of-a-person and get from it the-preferences-of-the-person".
For example, would you simulate a negotiation with the human and see how the negotiation would turn out? Would you simulate asking the human and then do whatever the human answers? (there were a few suggestions in the post, I don't know if you endorse a specific one or if you even think this question is important)
Because (I assume) once OpenAI[1] says "trust our models", that's the point when it would be useful to publish our breaks.
Breaks that haven't been published yet, so OpenAI hasn't been able to patch them yet.
[unconfident; I can see counterarguments too]
[1] Or maybe when regulators, experts, or public opinion say "this model is trustworthy, don't worry"
I'm confused: Wouldn't we prefer to keep such findings private? (at least, keep them until OpenAI says something like "this model is reliable/safe"?)
My guess: You'd reply that finding good talent is worth it?
This seems like great advice, thanks!
I'd be interested in an example of what "a believable story in which this project reduces AI x-risk" looks like, if Dane (or someone else) would like to share.
A link directly to the corrigibility part (skipping unrelated things that are on the same page):
This post got me to do something like exposure therapy to myself in 10+ situations, which felt like the "obvious" thing to do in those situations. This is a huge amount of life-change-per-post
My thoughts:
[Epistemic status + impostor syndrome: Just learning, posting my ideas to hear how they are wrong and in hope to interact with others in the community. Don't learn from my ideas]
A)
Victoria: “I don't think that the internet has a lot of particularly effective plans to disempower humanity.”
I think:
- Having ready plans on the internet and using them is not part of the normal threat model from an AGI. If that was the problem, we could just filter out those plans from the training set.
- (The internet does have such ideas. I will briefly mention biosecurity, but I prefer not spreading ideas on how to disempower humanity)
B)
[Victoria:] I think coming up with a plan that gets past the defenses of human society requires thinking differently from humans.
TL;DR: I think some ways to disempower humanity don't require thinking differently than humans
I'll split up AI's attack vectors into 3 buckets:
- Attacks that humans didn't even think of (such as what we can do to apes)
- Attacks that humans did think of but are not defending against (for example, we thought about pandemic risks but we didn't defend against them so well). Note this does not require thinking about things that humans didn't think about.
- Attacks that humans are actively defending against, such as using robots with guns or trading in the stock market or playing Go (winning at Go probably won't help with taking over the world, but humans are actively working on winning Go games, so I put the example here). Having an AI beat us in one of these does require it to be in some important (to me) sense smarter than us, but not all attacks are in this bucket.
C)
[...] requires thinking differently from humans
I think AIs already today think differently than humans in any reasonable way we could mean that. In fact, if we could make them NOT think differently than humans, my [untrustworthy] opinion is that this would be non-negligible progress towards solving alignment. No?
D)
The intelligence threshold for planning to take over the world isn't low
First, disclaimers:
(1) I'm not an expert and this isn't widely reviewed, (2) I'm intentionally not being detailed, in order to not spread ideas on how to take over the world; I'm aware this is epistemically bad and I'm sorry for it, it's the tradeoff I'm picking
So, mainly based on A, I think a person who is 90% as intelligent as Elon Musk in all dimensions would probably be able to destroy humanity, and so (if I'm right), the intelligence threshold is lower than "the world's smartest human". Again sorry for the lack of detail. [mods, if this was already too much, feel free to edit/delete my comment]
"Doing a Turing test" is a solution to something. What's the problem you're trying to solve?
As a judge, I'd ask the test subject to write me a rap song about turing tests. If it succeeds, I guess it's a ChatGPT ;P
More seriously - it would be nice to find a judge that doesn't know the capabilities and limitations of GPT models. Knowing those is very very useful
[I also just got funded (FTX) to work on this for realsies 😸🙀 ]
I'm still in "learn the field" mode, I didn't pick any direction to dive into, but I am asking myself questions like "how would someone armed with a pretty strong AI take over the world?".
Regarding commitment from the mentor: My current format is "live blogging" in a Slack channel. A mentor could look whenever they want, and comment only on whatever they want to. wdyt?
(But I don't know who to add to such a channel which would also contain the potentially harmful ideas)
This is a problem for me, a few days after starting to (try to) do this kind of research. Any directions?
The main reason for me is that I want feedback on my ideas, to push me away from directions that are totally useless, which I'm afraid to fall into since I'm not an experienced researcher.
I recommend discussing in the original comment as opposed to splitting up the comments between places, if you have something to ask/say
Poll: Agree/Disagree:
Working for a company that advances AI capabilities is a good idea for advancing-safety because you can speak up if you disagree with something, and this outweighs the downside of how you'd help them advance capabilities
Poll: Agree/disagree:
Working for companies that advance AI capabilities is generally a good idea for people worried about AI risk
Could you help me imagine an AGI that "took over" well enough to modify its own code or variables - but chooses not to "wirehead" its utility variable but rather prefers to do something in the outside world?
This looks like a guide for [working in a company that already has a research agenda, and doing engineering work for them based on what they ask for] and not for [trying to come up with a new research direction that is better than what everyone else is doing], right?
Ah,
I thought it was "I'm going to sacrifice sleep time to get a few extra hours of work"
My bad
- I try to both [be useful] and [have a good life / don't burn out]
- I started thinking a few days ago about investments. Initial thoughts:
  - Given we're not all dead, what happens and how to get rich?
    - Guess 1: There was a world war or something similar that got to all AI labs worldwide
      - My brain: OMG, that sounds really bad. Can I somehow avoid the crossfire?
    - Guess 2: One specific org can generate 100x more tech and science progress than the entire rest of the world combined
      - My brain: I hope they will be publicly tradable, still respect the stock market, and I can buy their stock in advance?
      - Problem: Everyone wants to invest in AI companies already. Do I have an advantage?
  - If there will be a few years of vast strangeness before we'll probably all die, can I get very rich beforehand and maybe use that for something?
    - (Similar to Guess 2 above, and also doesn't seem promising)
This is just initial; I'm happy for anyone to join the brainstorm, it's easier together
I disagree with "sleeping less well at night".
I think if you're able to sleep well (if you can handle the logistics/motivation around it, or perhaps if sleeping well is a null action with no cost), it will be a win after a few days (or at most, weeks)
When I ask this question, my formulation is "50% of the AI capabilities researchers [however you define those] stop [however you define that] for 6 months or more".
I think that your definition of "making people change their mind" misses the point that they might, for example, "change their mind and work full time on making the AGI first since they know how to solve that very specific failure mode" or whatever
Epistemic Status: Trying to form my own views, excuse me if I'm asking a silly question.
TL;DR: Maybe this is overfitted to 2020 information?
My Data Science friends tell me that to train a model, we take ~80% of the data, and then we test our model on the last 20%.
Regarding your post: I wonder how you'd form your model based on only 2018 information. Would your model nicely predict the 2020 information, or would it need an update (hinting that it is overfitted)? I'm asking this because it seems like the model here depends very much on cutting-edge results, which I would guess makes it very sensitive to new information.
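To make the analogy concrete, here is a minimal sketch of that kind of holdout test (hypothetical names, and a time-based split rather than a random 80/20 one, since the question is whether a model fit on pre-2018 observations also predicts the 2020 ones):

```typescript
// Hypothetical illustration of the train/test idea, applied as a time-based holdout:
// fit the model only on older observations, then check it against newer ones.
interface Observation {
  year: number;
  value: number; // whatever quantity the model is supposed to predict
}

function splitByYear(data: Observation[], cutoffYear: number) {
  return {
    train: data.filter((d) => d.year <= cutoffYear), // e.g. everything up to 2018
    test: data.filter((d) => d.year > cutoffYear),   // e.g. the 2019-2020 observations
  };
}

// A model that fits the training years well but misses the held-out years
// is a hint that it is overfitted to the information it was built on.
function meanAbsoluteError(predict: (d: Observation) => number, test: Observation[]): number {
  const errors = test.map((d) => Math.abs(predict(d) - d.value));
  return errors.reduce((a, b) => a + b, 0) / errors.length;
}
```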
May I ask what you are calling "general alignment sympathy"? Could you say it in other words or give some examples?
I don't think "infinite space" is enough to have infinite copies of me. You'd also need infinite matter, no?
[putting aside "many worlds" for a moment]
Anonymous question (ask here):
Why do so many Rationalists assign a negligible probability to unaligned AI wiping itself out before it wipes humanity out?
What if it becomes incredibly powerful before it becomes intelligent enough to not make existential mistakes? (The obvious analogy being: If we're so certain that human wisdom can't keep up with human power, why is AI any different? Or even: If we're so certain that humans will wipe themselves out before they wipe out monkeys, why is AI any different?)
I'm imagining something like: In a bid to gain a decisive strategic advantage over humans and aligned AIs, an unaligned AI amasses an astonishing amount of power, then messes up somewhere (like AlphaGo making a weird, self-destructive move, or humans failing at coordination and nearly nuking each other), and ends up permanently destroying its code and backups and maybe even melting all GPUs and probably taking half the planet with it, but enough humans survive to continue/rebuild civilisation. And maybe it's even the case that hundreds of years later, we've made AI again, and an unaligned AI messes up again, and the cycle repeats itself potentially many, many times because in practice it turns out humans always put up a good fight and it's really hard to kill them all off without AI killing itself first.
Or is this scenario considered doom? (Because we need superintelligent AI in order to spread to the stars?)
(Inspired by Paul's reasoning here: "Most importantly, it seems like AI systems have huge structural advantages (like their high speed and low cost) that suggest they will have a transformative impact on the world (and obsolete human contributions to alignment retracted) well before they need to develop superhuman understanding of much of the world or tricks about how to think, and so even if they have a very different profile of abilities to humans they may still be subhuman in many important ways." and similar to his thoughts here: "One way of looking at this is that Eliezer is appropriately open-minded about existential quantifiers applied to future AI systems thinking about how to cause trouble, but seems to treat existential quantifiers applied to future humans in a qualitatively rather than quantitatively different way (and as described throughout this list I think he overestimates the quantitative difference).")
If this question becomes important, there are people in our community who are.. domain experts. We can ask
Hey,
TL;DR I know a researcher who's going to start studying C. elegans worms in a way that seems interesting as far as I can tell. Should I do something about that?
I'm trying to understand if this is interesting for our community, specifically as a path to brain emulation, which, I wonder, could be used to (A) prevent people from dying, and/or (B) create a relatively-aligned AGI.
This is the most relevant post I found on LW/EA (so far).
I'm hoping someone with more domain expertise can say something like:
- "OMG we should totally extra fund this researcher and send developers to help with the software and data science and everything!"
- "This sounds pretty close to something useful but there are changes I'd really like to see in that research"
- "Whole brain emulation is science fiction, we'll obviously destroy the world or something before we can implement it"
- "There is a debate on whether this is useful, the main positions are [link] and [link], also totally talk to [person]"
Any chance someone can give me direction?
Thx!
(My background is in software, not biology or neurology)
I heard "kiwi" is a company with a good reputation, but I didn't try their head strap myself. I have their controller-straps which I really like
- (I'm not sure but why would this be important? Sorry for the silly answer, feel free to reply in the anonymous form again)
- I think a good baseline for comparison would be:
  - Training large ML models (expensive)
  - Running trained ML models (much cheaper)
- I think comparing to blockchain is wrong, because:
  - It was explicitly designed to be resource-intensive (this adds to the security of proof-of-work blockchains)
  - There is a financial incentive to spend a specific (very high) amount of resources on blockchain mining (because what you get is literally a currency, and this currency has a certain value, so it's worthwhile to spend any amount of money lower than that value on the mining process)
  - Neither of these is true for ML/AI, where your incentive is more something like "do useful things"
+1 for the Abusive Relationships section.
I think there's a lot of expected value in a project that raises awareness of "these are good reasons to break up" and/or "here are common-but-very-bad reasons to stay in an abusive relationship", perhaps with support for people who choose to break up. It's a project I sometimes think of launching, but I'm not sure where I'd start
Anonymous question (ask here):
Given all the computation it would be carrying out, wouldn't an AGI be extremely resource-intensive? Something relatively simple like bitcoin mining (simple when compared to the sort of intellectual/engineering feats that AGIs are supposed to be capable of) famously uses up more energy than some industrialized nations.
If you buy a VR headset (especially if it's an Oculus Quest 2), here's my getting-started guide
Just saying I appreciate this post being so short <3
(and still informative)
Ok,
I'm willing to assume for sake of the conversation that the AGI can't get internet-disconnected weapons.
Do you think that would be enough to stop it?
("verified programmatically": I'm not sure what you mean. That new software needs to be digitally signed with a key that is not connected to the internet?)