Comments
The term that personality psychologists use for this trait is "industriousness".
I've been pasting into Emacs. If you're a Linux user, I would be interested to know what program you paste math into. Or if the thing you paste into "uses web tech" (and consequently is independent of OS), tell me which web site or program it is.
Great! Being given some way to obtain the original LaTeX as written by the author is the solution I have tended to imagine over the years when imagining how LW might be changed to accommodate my current workflow!
Thanks for pointing it out!
BTW, I'd like to learn more about the workflows of people who work with math all day every day.
If LW suddenly changed so that math could be saved for reading at a later time when I'm not connected to the internet, the amount of thinking I do about math would probably suddenly triple.
Details: the main way I save text from the web for reading at a later time is by copying part of the web page, then pasting it into a file on my local machine. That does not work for most text containing math.
I'm curious why you think that.
My probability that there will be any human alive 100 years from now is .07. If MIRI were magically given effective control over all AI research starting now or if all AI research were magically stopped somehow, my probability would change to .98.
Is what I just wrote also badly misleading? Surely your objection is not to ambiguity in the definition of "alive", but I'm failing to imagine what your objection might be unless we think very differently about how probability works.
(I'm interested in answers from anyone.)
What I want to be notified about is signs (e.g., satellite photos) that Russia or China is evacuating its cities.
Also, any threat by Russia or China to attack the US with nukes if Washington does not take specific action X before date Y.
Russia's using a nuke in Ukraine wouldn't increase my P(nuclear attack on the US) enough to cause me to relocate to a rural area, but those 2 other things would.
This channel seems pretty good:
Before the internet became a "mass medium" (i.e., before 1993) it was drastically less agreeable (and drastically less extroverted) than it is today. The difference between then and now is absolutely huge.
If I needed to find disagreeable people today (that I didn't already know), I'd hang around in my county's law library or maybe the Clerk of the Court's office.
We have yet to touch on the topic of timing: comedians who perform in front of an audience often say that timing is important.
There can be a delay of a few seconds between punchline and the start of the laughter, and once the laughter begins, it usually gets loud very quickly. This behavior suggests that there is a (social?) cost to laughing at something most of the audience choose not to laugh at and also probably a cost to not laughing when most are laughing.
Destroying the fabric of the universe sounds hard even for a superintelligence. "hard": probably impossible even if the superintelligence makes it its only priority.
In one of his appearances on video this year, Eliezer said IIRC that all of the intent-alignment techniques he knows of stop working once the AI's capabilities improve enough, mentioning RLHF. Other than that I am not knowledgeable enough to answer you.
The people who coined the term "AI alignment" were worried about AI research's causing the end of humanity. The source of your confusion seems to be the fact that the term has taken on other meanings. So let us take a step back and look at the original meaning of "the race between capabilities research and alignment research".
None of the AIs so far deployed are a danger to the survival of the human species because they're not capable enough. For example, although it is much more capable than a human or a team of humans at Go and chess, Alpha Zero does not know that, e.g., it is running on a computer and does not know that there are people who could at any moment decide to shut that computer off, which would severely curtail its ability to win the chess game it is currently playing. Its lack of knowledge of reality means Alpha Zero is not a danger to our survival. GPT-4 has vast knowledge of reality, and if you ask it to make a plan, it will make a plan, but there's a good chance that the plan has a bad flaw in it. This unreliability in its planning capability means that GPT-4 is not a danger to our survival.
Soon the AI labs will create an AI that is much more capable at every important endeavor than our most capable institutions (e.g., Cambridge University or the FBI) as well of course as our most capable individuals. If that AI does not care about human survival or human preferences, then we are done for, but no one knows (and no one even has a good plan for finding out) how to make an AI care about us even a little tiny bit. MIRI has been trying to figure it out for about 20 years, and they're telling us that maybe they can do it given another 3 or 4 decades, but we probably don't have 3 or 4 decades--unless by some miracle all the AI labs get shut down.
Repeating myself: the only thing preventing AI from accidentally or intentionally wiping us out is the fact that so far no AI has the necessary capabilities (e.g., in making plans that can survive determined human opposition, e.g., in knowledge of reality), but the AI labs are bent on giving AI the relevant capabilities despite the fact that they don't have any good plan (nor even a halfway-decent plan) for ensuring the first AI with the capabilities to wipe us out cares enough about us to refrain from doing so. (Once an AI becomes capable enough to kill us all, the only way it can be made safe is to make it care about us right from when the AI is first turned on, i.e., in its initial design.)
AI alignment is often presented as conceptually distinct from capabilities. However, the distinction seems somewhat fuzzy
From my point of view, the distinction is not fuzzy at all, and I hope this comment has enabled you to see my point of view: namely, our civilization is getting pretty good at capabilities research, but progress in alignment research has proved much more difficult. Here I am using "alignment" in its original, "strict" sense to refer only to methods that continue to work even after the AI becomes much more capable than people and human organizations.
I’ve had these kinds of thoughts [referring to the OP] many times. I don’t think they were very healthy for me
I suspect that they are unhealthy for a lot of people.
Jordan Peterson asserts that many mass murders are motivated by a desire for revenge against the Universe and offers as evidence the observation that many mass murderers choose the most innocent victims they can. (More precisely, since Peterson is religious, he says "revenge against God", but the nearest translation for non-religious people would be "revenge against the Universe".) And of course for a person to accumulate a large pool of vengeful feelings, the person must believe or at least strongly suspect that the target of the feelings is responsible for the bad things that happened to the person.
I cannot recall whether Peterson ever explicitly said so, but I am left with the impression that Peterson considers vengeful feelings directed at the Universe (and possibly also at other large collective entities, e.g., the Cambodian nation) to be more likely than vengeful feelings directed at a mere person or clique of persons to accumulate to the point that they blot out the ability to think rationally in the pursuit of one's own self-interest. (Harboring vengeful feelings directed at the Cambodian nation would be particularly harmful to you if you were Cambodian or lived in Cambodia.)
The statements strike me as more credible and more interesting than they would be if delivered in the speaking style of the people who usually talk on the topic, but then it is no surprise that winners of presidential elections have compelling vocal skills.
I sometimes imagine that making it so that anyone who works for or invests in an AI lab is unwelcome at the best Bay Area parties would be a worthwhile state of affairs to work towards, which is sort of along the same lines as what you write.
There is a questionable trend to equate ML skills with the ability to do alignment work.
Yes!
I was going to upvote this (for reporting what LeCun said on Twitter) till I got to "I asked ChatGPT 3.5 to criticize this".
According to my notes, what got me to resolve to avoid water fasting is the first two-thirds of the following interview with longevity researcher Valter Longo:
https://thedoctorskitchen.com/podcasts/62-fasting-and-medicine-with-prof-valter-longo
You can avoid the 90-second ads on that page by using yt-dlp to download the interview audio.
The next paragraph in my notes is the URL below, which describes what I replaced water fasts with, namely the fasting-mimicking diet.
https://kahn642.medium.com/265fc68f8e19
But the usual purpose of the fasting-mimicking diet is not fat loss. (It is autophagy.) It's low on protein. So maybe that last URL is irrelevant to you.
Please let me know if there's anything else I can do for you. You seem to have the potential to contribute to the desperate fight to save the world from AI, so I want you as healthy as possible.
It definitely is easier to stop eating completely! A water fast trades convenience against a significant risk of permanent damage (e.g., never regaining all the muscle you lost) or death.
Weight lifting during a water fast will not help
I agree! I was using the obvious unsuitability of strength training during a water fast as an argument against the water fast relative to the other ways to burn fat. (Weight lifting plus eating enough protein often enough is better at preserving muscle mass during an attempt to lose fat than eating enough protein often enough without the weight lifting.)
Yes. I basically cook everything I eat from scratch. I don't eat any seeds or any fats or oils except for coconut, avocado, olive oil and fat from cow's milk and lamb's meat.
Hmm. I was going to write, "cows and lambs which I know not to have been fed seed oils," but on second thought I do not know that to be the case. In particular, I use Kerrygold butter, which promises to be from cows fed on at least 95% grass, but as far as I know, the remaining 5% could include a large dose of seed oils. Kerrygold melts (or more precisely gets soft) at a much lower temperature than another brand of butter that claims to be 100% grass-fed, which means that its fatty-acid composition is much different from that of the other butter. The addition of seed oil to the cows' diet could explain the difference.
My BMI is under 25. My motivation in entering this conversation is to try to talk you out of the water fast, not to learn how I might lose fat.
Water fasting strikes me as an inefficient way to do what you want to do. Sure, obviously, if you keep on getting no calories, eventually the body is going to burn fat, but paradoxically it prefers to burn protein from the muscles first (!), and the only way to stop that while trying to lose fat is with sufficiently intense exercise of the muscles, i.e., weight lifting.
This next Andrew Huberman lecture describes how to burn fat via "using cold to create shiver", exercise, "non-exercise movements such as fidgeting", supplements and prescription drugs. I don't recall any mention of fasting in this lecture, though it has been a few months since I listened to it. To be precise: because I would not have been surprised to hear Huberman warn against water fasting, I probably would not have remembered that, but I would have been quite surprised to hear him recommend it and almost certainly would have remembered that. (So he almost certainly does not recommend it in the lecture.)
https://www.youtube.com/watch?v=GqPGXG5TlZw
(Sadly, to burn fat using exercise, you have to exercise continuously for about 90 minutes IIRC.)
I did 6.5 days of water fasting once out of ignorance of the risks and disadvantages, and I'll never voluntarily go that long again without protein and without calories.
There's a natural human tendency to believe that if one is basically healthy, one's health interventions should be mild, whereas if one is severely ill, drastic health interventions should be chosen. In reality, for most cases of severe chronic illness, it takes great expertise or heroic efforts of rationality, usually sustained over years, to identify or imagine any intervention that has more than a negligible chance of positively affecting the illness, and people dealing with severe illness should spend a significant fraction of their thinking time and mental energy on avoiding making the situation worse.
Happily for you, there are probably people with very deep expertise on fat loss although I don't know enough about the subject to tell you who those people are. (And the fact that the drug companies hope to make a lot of money on fat loss makes it much harder to identify the deep experts.)
I personally have had 2 friends who've killed themselves by being overly aggressive about trying to rid themselves of chronic illness: one was being prescribed an anti-coagulant and took more than he should have because of a strong desire to return to his previous healthy lifestyle; the other was enamored of smart drugs, and one of the many combinations of drugs he tried gave him Parkinsonism.
I've stayed completely the hell away from seed oils for decades: it's just your plan for a long water fast that alarms me.
We cannot follow that link into Gmail unless you give us your Gmail username and password.
Yeah, but even if the advice VCs give to people in general is worthless, it remains the case that (like Viliam said) once the VC has invested, its interests are aligned with the interests of any founder whose utility function grows linearly with money. And VCs usually advise the startups they've invested in to try for a huge exit (typically an IPO).
you could easily set the starting point much earlier than 1989.
OP is not asserting that the current period of instability started in 1989: his reference to that year is his (too concise?) way of saying that if the Soviet Union could suddenly unravel, it could happen here.
IMHO the current period of instability started about 10 years ago. Occupy Wall Street happened in 2011, and although it got a decent amount of press (and the most dissatisfied elements of society hoped that it would be the start of a broader upturning of society), it had almost no effect on the broader society. Moreover, the mere fact that the more radical parts of the press put so much of their hope into it is a sign that in 2011 there was actually no hope for radical change.
In other words, I'm taking Occupy as evidence of stability: radical journalists are pretty smart as a group, so if the only thing they can find to write about is Occupy, that is a strong sign that there's actually no hope at present of radical change.
George Friedman says that the US undergoes intervals of unrest about every 60 years. The previous period of unrest IMO ended about 1973, the year when the US military completely withdrew from Vietnam. Remember that during the previous period of unrest, there was an organization (the Weather Underground) of at least 1000 members actively trying to overthrow the US government using riots and bombs.
During the period from 1973 to about 2013, which I claim was a period of stability, the professionals that run election campaigns learned that conservative Christians could be harnessed as a powerful voting bloc because they're "well organized" (i.e., can be effectively led by church leaders), and of course the liberals reacted against that (successfully, IMO) in what is usually called the Culture Wars. But IIRC the Culture Wars never went beyond words and votes, whereas the 2020 BLM protests had a few politically-motivated killings, e.g., https://en.wikipedia.org/wiki/Killings_of_Aaron_Danielson_and_Michael_Reinoehl
It depends on person 1's motivation. If his or her motivation is selfish, then I agree with you, but if the motivation is altruistic, that makes the utility of money linear, and startups are a potent way to maximize expected money.
Yes, the "epistemic status" is me telling you how confident I am.
I can learn something (become more capable at a task) without being able to describe in words what I learned unless I spend much more time and effort to create the verbal description than I spent to learn the thing. I've seen this happen enough times that it is very unlikely that I am mistaken although I haven't observed how other people learn things closely enough to know whether what I just said generalizes to other people.
This has happened when I've learned a new skill in math, philosophy, or "self-psychotherapy"; i.e., the phenomenon is not restricted to skills (e.g., how to lift weights while minimizing the risk of injury) in which the advantage of a non-verbal means of communication (e.g., video) is obvious.
Something you just wrote makes me wonder whether what I just described is foreign to you.
The OP's argument can be modified to be immune to your objection:
Technical alignment is harder than capability research. Technical alignment will take longer than we have before capability research kills us all.
Even if Eliezer's argument in that Twitter thread is completely worthless, it remains the case that "merely hoping" that the AI turns out nice is an insufficiently good argument for continuing to create smarter and smarter AIs. I would describe as "merely hoping" the argument that since humans (in some societies) turned out nice (even though there was no designer that ensured they would), the AI might turn out nice. Also insufficiently good is any hope stemming from the observation that if we pick two humans at random out of the humans we know, the smarter of the two is more likely than not to be the nicer of the two. I certainly do not want the survival of the human race to depend on either one of those two hopes or arguments! Do you?
Eliezer finds posting on the internet enjoyable, like lots of people do. He posts a lot about, e.g., superconductors and macroeconomic policy. It is far from clear to me that he considers this Twitter thread to be relevant to the case against continuing to create smarter AIs. But more to the point: do you consider it relevant?
One solution I’ve been exploring is forcing myself to write down my thought process
Keeping notes about everything I tried works well for me--at least for a problem I will probably solve in a day or 3. For longer-range problems and concerns, I'm less enthusiastic about making notes because it tends to be difficult to figure out how to store or catalog the notes so that I have a decent chance of finding the notes again.
But my brain does not try to explore every possibility in the solution space, even when I'm not taking notes, so maybe my experience is not relevant to you.
Good catch.
However, if you and I have the seed of a super-intelligence in front of us, waiting only on our specifying a utility function and pressing the "start" button, and if we can individually specify what we want for the world in the form of a utility function, then it would prove easy for us to work around the first of the two gotchas you point out.
As for the second gotcha, if we were at all pressed for time, I'd go ahead with my normalization method on the theory that the probability of the sum's turning out to be exactly zero is very low.
I am interested however in hearing from readers who are better at math than I am: how can the normalization method be improved to remove the two gotchas?
ADDED. What I wrote so far in this comment fails to get at the heart of the matter. The purpose of a utility function is to encode preferences. Restricting our discourse to utility functions such that for every o in O, U(o) is a real number greater than or equal to zero and less than or equal to one does not restrict the kinds of preferences that can be encoded. And when we do that, every utility function in our universe of discourse can be normalized using the method already given--free from the two gotchas you pointed out. (In other words, instead of describing a gotcha-free method for normalizing arbitrary utility functions, I propose that we simply avoid defining the sort of utility function that might trigger one of the gotchas.)
Specifically, if o_worst is the worst outcome according to the agent under discussion and o_best is its best outcome, set U(o_worst)=0, U(o_best)=1 and for every other outcome o, set U(o) = p where p is the probability for which the agent is indifferent between o and the lottery [p, o_best; 1-p, o_worst].
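Restating that definition in symbols, plus (my addition, a standard fact about such utilities rather than part of the original comment) the equivalent affine rescaling for a utility function U that is already given numerically:

U(o_worst) = 0, U(o_best) = 1, and U(o) = p where the agent is indifferent between o and the lottery [p, o_best; 1-p, o_worst];

U'(o) = (U(o) - U(o_worst)) / (U(o_best) - U(o_worst)).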
if that function increases fast enough
Nit: I'm not seeing how "increase" is well defined here, but I probably know what you mean anyways.
I thought we were talking about combining utility functions, but I see only one utility function here, not counting the combined one:
U_{combined} = \min{U_{agent}} + \arctan(\max{U_{agent}} - \min{U_{agent}})
If I wanted to combine 2 utility functions fairly, I'd add them, but first I'd normalize them by multiplying each one by the constant that makes its sum over the set of possible outcomes equal to 1. In symbols:
U_combined(o) = U_1(o) / (\sum_{o2 in O} U_1(o2)) + U_2(o) / (\sum_{o2 in O} U_2(o2)) for all o in O where O is the set of outcomes (world states or more generally world histories).
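Here is a minimal sketch of that rule in Python (my own illustration, not from the thread; the outcome set and utility values are made-up toy numbers, and it assumes each utility function sums to something nonzero over the outcomes, i.e., the "gotcha" discussed above does not bite):

```python
def normalize_by_sum(u, outcomes):
    """Scale a utility function so its values sum to 1 over the outcome set."""
    total = sum(u[o] for o in outcomes)  # assumed nonzero
    return {o: u[o] / total for o in outcomes}

def combine(u1, u2, outcomes):
    """U_combined(o) = normalized U_1(o) + normalized U_2(o)."""
    n1 = normalize_by_sum(u1, outcomes)
    n2 = normalize_by_sum(u2, outcomes)
    return {o: n1[o] + n2[o] for o in outcomes}

# Toy example (hypothetical outcomes and utilities):
outcomes = ["o1", "o2", "o3"]
u1 = {"o1": 10.0, "o2": 5.0, "o3": 1.0}
u2 = {"o1": 0.2, "o2": 0.9, "o3": 0.4}
print(combine(u1, u2, outcomes))
```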
Epistemic status: shaky. Offered because a quick answer is often better than a completely reliable one.
An ontology is a comprehensive account of reality.
The field of AI uses the term to refer to the "binding" of the AI's map of reality to the territory. If the AI for example ends up believing that the internet is reality and all this talk of physics and galaxies and such is just a conversational ploy for one faction on the internet to gain status relative to another faction, the AI has an ontological failure.
ADDED. A more realistic example would be the AI's confusing its internal representation of the thing to be optimized with the thing the programmers hoped the AI would optimize. Maybe I'm not the right person to answer because it is extremely unlikely I'd ever use the word ontology in a conversation about AI.
It's only safer for the occupants of the SUV. For everyone else, an SUV is more dangerous.
Nick Szabo thinks that Europe's history of relatively high intensity in use of animal labor helped it undergo the industrial revolution. One of the first uses of the steam engine was pumping water out of mines--at a time when the status quo was to use horses and oxen to do the same thing. He shows street scenes of China in the 1900s where there is a lot of transportation of heavy loads going on--all done by human labor.
The people of northern Europe were not the first to domesticate cows, but they were the first farming (i.e., non-nomadic) people to domesticate cows on a large scale--many millennia ago. Domesticated animals have been important in Europe ever since--more so in northern Europe than southern Europe (getting back to the original question of why no Roman industrial revolution). The Romans had cavalry because it was an important component of military force, and they certainly weren't going to ignore a significant military factor, but they were much more likely to "outsource" cavalry to non-Romans than to "outsource" infantry, which is a sign that domesticated animals were less important in Roman society than they were in surrounding areas.
I expect it to be much harder to measure the "smarts" of an AI than it is to measure the smarts of a person (because all people share a large amount of detail in their cognitive architecture), so any approach that employs "near-human level" AI runs the risk that at least one of those AIs is not near human level at all.
If I'm a superintelligent AI, killing all the people is probably the easiest way to prevent people from interfering with my objectives, which people might do for example by creating a second superintelligence. It's just easier for me to kill them all (supposing I care nothing about human values, which will probably be the case for the first superintelligent AI given the way things are going in our civilization) than to keep an eye on them or to determine which ones might have the skills to contribute to the creation of a second superintelligence (and kill only those).
(I'm slightly worried what people will think of me when I write this way, but the topic is important enough I wrote it anyways.)
Eliezer said in one of this year's interviews that gradient descent "knows" the derivative of the function it is trying to optimize whereas natural selection does not have access to that information--or is not equipped to exploit that information.
Maybe that clue will help you search for the answer to your question?
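For concreteness, here is a toy sketch of the difference (my own illustration, not anything from the interview): gradient descent consults the derivative to choose its next step, while an evolution-style search can only evaluate candidates and keep the fitter one.

```python
import random

def f(x):   # the function being minimized
    return (x - 3.0) ** 2

def df(x):  # its derivative -- information gradient descent gets to use
    return 2.0 * (x - 3.0)

# Gradient descent: the derivative says which way (and roughly how far) to move.
x = 0.0
for _ in range(100):
    x -= 0.1 * df(x)

# Natural-selection-style search: no derivative, just mutate and keep the fitter variant.
y = 0.0
for _ in range(100):
    mutant = y + random.gauss(0.0, 0.1)
    if f(mutant) < f(y):
        y = mutant

print(x, y)  # both approach 3.0, but only the first used derivative information
```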
To do better, you need to refine your causal model of the doom.
Basically, smart AI researchers with stars in their eyes organized into teams and into communities that communicate constantly are the process that will cause the doom.
Refine that sentence into many paragraphs, and then you can start to tell which interventions decrease doom a lot and which ones decrease it only a little.
E.g., convincing most of the young people with the most talent for AI research that AI research is evil similar to how most of the most talented programmers starting their careers in the 1990s were convinced that working for Microsoft is evil? That would postpone our doom a lot--years probably.
Slowing down the rate of improvement of GPUs? That helps less, but still buys us significant time in expectation, as far as I can tell. Still, there is a decent chance that AI researchers can create an AI able to kill us with just the GPU designs currently available on the market, so it is not as potent an intervention as denying the doom-causing process the necessary talent to create new AI designs and new AI insights.
You can try to refine your causal model of how young people with talent for AI research decide what career to follow.
Buying humanity time by slowing down AI research is not sufficient: some "pivotal act" will have to happen that removes the danger of AI research permanently. The creation of an aligned super-intelligent AI is the prospect that gets the most ink around here, but there might be other paths that lead to a successful exit of the crisis period. You might try to refine your model of what those paths might look like.
You can make the long reply its own post (and put a link to the post in a brief reply).
I agree with the factual correctness of this, but I don't personally consider the outcome you describe an improvement over the status quo.
The suggestion was to consider all companies currently looking to hire -- because that function entails advertising, and advertising is by its nature difficult to hide from people like you who are trying to learn more about the company. (More precisely, the function (hiring) does not strictly entail advertising, but refraining from advertising will greatly slow down the execution of the function.)
Because AI companies are in an adversarial relationship with us (the people who understand that AI research is very dangerous), we should expect them to gather information about us and to try to prevent us from gathering accurate information about them.
It is possible that there currently exists a significantly-sized AI company that you and I know nothing about and cannot hope to learn anything about (assuming it chose to forgo the advertising that would speed up its hiring) except by laborious efforts such as having face-to-face conversations with hundreds of AI experts located in diverse locations around the world.
Same here. More information.
Private companies can (at least in the US) easily keep their funding, revenue and the number of parameters in their big models secret, but it is hard to keep secret the fact that they are hiring AI experts or experts-in-training (without slowing down that hiring process quite a lot and probably also making it much more expensive).