# Daniel Kokotajlo's Shortform

post by Daniel Kokotajlo (daniel-kokotajlo) · 2019-10-08T18:53:22.087Z · LW · GW · 158 comments

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-10-17T10:38:41.959Z · LW(p) · GW(p)

Technologies I take for granted now but remember thinking were exciting and cool when they came out

• Smart phones
• Video calls
• DeepDream (whoa! This is like drug hallucinations... I wonder if they share a similar underlying mechanism? This is evidence that ANNs are more similar to brains than I thought!)
• AlphaGo
• AlphaStar (Whoa! AI can handle hidden information!)
• OpenAI Five (Whoa! AI can work on a team!)
• GPT-2 (Whoa! AI can write coherent, stylistically appropriate sentences about novel topics like unicorns in the Andes!)
• GPT-3

I'm sure there are a bunch more I'm missing, please comment and add some!

Replies from: Dagon, gworley
comment by Dagon · 2021-10-19T00:10:28.310Z · LW(p) · GW(p)

Heh.  In my youth, home computers were somewhat rare, and modems even more so.  I remember my excitement at upgrading to 2400bps, as it was about as fast as I could read the text coming across.  My current pocket computer is about 4000 times faster, has 30,000 times as much RAM, has hundreds of times more pixels and colors, and has worldwide connectivity thousands of times faster.  And I don't even have to yell at my folks to stay off the phone while I'm using it!

I lived through the entire popularity cycle of fax machines.

My parents grew up with black-and-white CRTs based on vacuum tubes - the transistor was invented in 1947. They had just a few channels of broadcast TV, and even audio recording media were somewhat uncommon (cassette tapes arrived in the mid-60s; video tapes didn't take off until the late 70s).

comment by G Gordon Worley III (gworley) · 2021-10-17T19:17:34.607Z · LW(p) · GW(p)

Some of my own:

• SSDs
• laptops
• CDs
• digital cameras
• modems
• genome sequencing
• automatic transmissions for cars that perform better than a moderately skilled human using a manual transmission can
• cheap shipping
• solar panels with reasonable power generation
• breathable wrinkle free fabrics that you can put in the washing machine
• bamboo textiles
• good virtual keyboards for phones
• scissor switches
• USB
• GPS
Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-10-18T12:02:50.458Z · LW(p) · GW(p)

Oh yeah, cheap shipping! I grew up in a military family, all around the world, and I remember thinking it was so cool that my parents could go on "ebay" and order things and then they would be shipped to us! And then now look where we are -- groceries delivered in ten minutes! Almost everything I buy, I buy online!

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2019-10-08T18:53:22.262Z · LW(p) · GW(p)

My baby daughter was born two weeks ago, and in honor of her existence I'm building a list of about 100 technology-related forecasting questions, which will resolve in 5, 10, and 20 years. Questions like "By the time my daughter is 5/10/20 years old, the average US citizen will be able to hail a driverless taxi in most major US cities." (The idea is, tying it to my daughter's age will make it more fun and also increase the likelihood that I actually go back and look at it 10 years later.)

I'd love it if the questions were online somewhere so other people could record their answers too. Does this seem like a good idea? Hive mind, I beseech you: Help me spot ways in which this could end badly!

On a more positive note, any suggestions for how to do it? Any expressions of interest in making predictions with me?

Thanks!

EDIT: Now it's done. I have yet to import it to Foretold.io, but it works perfectly fine in spreadsheet form.

Replies from: riceissa, bgold, Pablo_Stafforini
comment by riceissa · 2020-11-13T00:58:53.501Z · LW(p) · GW(p)

I find the conjunction of your decision to have kids and your short AI timelines pretty confusing. The possibilities I can think of are (1) you're more optimistic than me about AI alignment (but I don't get this impression from your writings), (2) you think that even a short human life is worth living/net-positive, (3) since you distinguish between the time when humans lose control and the time when catastrophe actually happens, you think this delay will give more years to your child's life, (4) your decision to have kids was made before your AI timelines became short. Or maybe something else I'm not thinking of? I'm curious to hear your thinking on this.

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-11-14T12:03:09.190Z · LW(p) · GW(p)

4 is correct. :/

Replies from: riceissa
comment by riceissa · 2020-11-15T19:30:47.039Z · LW(p) · GW(p)

Oh :0

comment by bgold · 2019-10-14T19:18:57.220Z · LW(p) · GW(p)

I'm interested, and I'd suggest using https://foretold.io for this

comment by Pablo (Pablo_Stafforini) · 2019-10-08T19:34:46.067Z · LW(p) · GW(p)

I love the idea. Some questions and their associated resolution dates may be of interest to the wider community of forecasters, so you could post them to Metaculus. Otherwise you could perhaps persuade the Metaculus admins to create a subforum, similar to ai.metaculus.com, for the other questions to be posted. Since Metaculus already has the subforum functionality, it seems a good idea to extend it in this way (perhaps a user's subforum could be associated with the corresponding username: e.g. user kokotajlo can post his own questions at kokotajlo.metaculus.com).

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-10-28T00:59:15.780Z · LW(p) · GW(p)

I made this a while back to organize my thoughts about how all philosophy fits together:

comment by Measure · 2021-10-28T18:50:02.298Z · LW(p) · GW(p)

I find the bright green text on white background difficult to read even on a large screen. I would recommend black or dark gray text instead.

Replies from: Pattern
comment by Pattern · 2021-10-28T21:06:40.710Z · LW(p) · GW(p)

Invert the colors, and it's more readable.

comment by acylhalide (samuel-shadrach) · 2021-11-04T18:10:44.383Z · LW(p) · GW(p)

I think it's important to mention that this map is only useful if you are meta-reasoning about humans as if they are ideal rational agents, not about how reasoning actually happens in the brain. A System-1 and System-2 mapping would be more realistic, and so would a study based on brain lobes.

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-11-05T14:45:48.977Z · LW(p) · GW(p)

Yeah, this is a map of how philosophy fits together, so it's about ideal agents/minds not actual ones. Though obviously there's some connection between the two.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-10-15T12:32:54.703Z · LW(p) · GW(p)

Elon Musk is a real-life epic tragic hero, authored by someone trying specifically to impart lessons to EAs/rationalists:

--Young Elon thinks about the future, is worried about x-risk. Decides to devote his life to fighting x-risk. Decides the best way to do this is via developing new technologies, in particular electric vehicles (to fight climate change) and space colonization (to make humanity a multiplanetary species and thus robust to local catastrophes)

--Manages to succeed to a legendary extent; builds two of the world's leading tech giants, each with a business model notoriously hard to get right and each founded on technology most believed to be impossible. At every step of the way, mainstream expert opinion is that each of his companies will run out of steam and fail to accomplish whatever impossible goal they have set for themselves at the moment. They keep meeting their goals. SpaceX in particular brought cost to orbit down by an order of magnitude, and if Starship works out will get one or two more OOMs on top of that. Their overarching goal is to make a self-sustaining city on Mars and holy shit it looks like they are actually succeeding. Did all this on a shoestring budget compared to various rivals, made loads of enemies who employed various dirty tricks against him, etc. Succeeded anyway.

--Starts to worry more about AI x-risk. Tries to convince people to take it more seriously. People don't listen to him. Doesn't like Demis Hassabis' plan for how to handle the situation. Founds OpenAI with what seems to be a better plan.

--Oops! Turns out the better plan was actually a worse plan. Also turns out AI risk is a bigger deal than he initially realized; it's big enough that everything else he's doing won't matter (unaligned AI can follow us to Mars...). Oh well. All the x-risk-reduction accomplished by all the amazing successes Elon had, undone in an instant, by a single insufficiently thought-through decision.

This story is hitting us over the head with morals/lessons.

--Heavy Tails Hypothesis: Distribution of interventions by impact is heavy-tailed, you'll do a bunch of things in life and one of them will be the most important thing and if it's good, it'll outweigh all the bad stuff and if it's bad, it'll outweigh all the good stuff. This is true even if you are doing MANY very important, very impactful things.

--Importance of research and reflection: It's not obvious in advance what the most important thing is, or whether it's good or bad. You need to do research and careful analysis, and even that isn't a silver bullet, it just improves your odds.

Replies from: ea247, Pattern, Pattern
comment by KatWoods (ea247) · 2021-10-16T08:25:27.286Z · LW(p) · GW(p)

I agree with you completely and think this is very important to emphasize.

I also think the law of equal and opposite advice applies. Most people act too quickly without thinking. EAs tend towards the opposite, where it’s always “more research is needed”. This can also lead to bad outcomes if the results of the status quo are bad.

I can’t find it, but recently there was a post about the EU policy on AI and the author said something along the lines of “We often want to wait to advise policy until we know what would be good advice. Unfortunately, the choice isn’t between giving suboptimal advice now and giving great advice in 10 years. It’s between giving suboptimal advice now and never giving advice at all, with politicians probably doing something much worse. Because the world is moving, and it won’t wait for EAs to figure it all out.”

I think this all largely depends on what you think the outcome is if you don’t act. If you think that if EAs do nothing, the default outcome is positive, you should err on the side of extreme caution. If you think that the default is bad, you should be more willing to act, because an informed, altruistic actor increases the value of the outcome in expectation, all else being equal.

comment by Pattern · 2021-10-15T21:58:53.115Z · LW(p) · GW(p)
> in particular EV's [Electric Vehicles]

It wasn't clear what this meant.

> Manages to succeed to a legendary extent; builds not one but two of the worlds leading tech giants.

This made it seem like it was a word for a type of company.

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-10-15T22:40:21.223Z · LW(p) · GW(p)

Thanks, made some edits. I still don't get your second point though I'm afraid.

Replies from: Pattern
comment by Pattern · 2021-10-16T03:41:40.509Z · LW(p) · GW(p)

The second point isn't important, it's an incorrect inference/hypothesis, predicated on the first bit of information being missing. (So it's fixed.)

comment by Pattern · 2021-10-15T21:54:38.088Z · LW(p) · GW(p)
> --Importance of research and reflection: It's not obvious in advance what the most important thing is, or whether it's good or bad. You need to do research and careful analysis, and even that isn't a silver bullet, it just improves your odds.

It's not clear that would have been sufficient to change the outcome (above).

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-10-15T22:31:57.268Z · LW(p) · GW(p)

I feel optimistic that if he had spent a lot more time reading, talking, and thinking carefully about it, he would have concluded that founding OpenAI was a bad idea. (Or else maybe it's actually a good idea and I'm wrong.)

Can you say more about what you have in mind here? Do you think his values are such that it actually was a good idea by his lights? Or do you think it's just so hard to figure this stuff out that thinking more about it wouldn't have helped?

Replies from: Pattern
comment by Pattern · 2021-10-16T03:39:05.793Z · LW(p) · GW(p)

My point was just:

How much thinking/researching would have been necessary to avoid the failure?

5 hours? 5 days? 5 years? 50? What does it take to not make a mistake? (Or just, that one in particular?)

Expanding on what you said:

> Or do you think it's just so hard to figure this stuff out that thinking more about it wouldn't have helped?

Is it a mistake that wouldn't have been solved that way? (Or...solved that way easily? Or another way that would have fixed that problem faster?)

For research to trivially solve a problem, it has...someone pointing out it's a bad idea. (Maybe talking with someone and having them say _ is the fix.)

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-10-14T18:24:48.696Z · LW(p) · GW(p)

When I first read the now-classic arguments for slow takeoff -- e.g. from Paul and Katja -- I was excited; I thought they described a serious alternative scenario to the classic FOOM scenarios. However I never thought, and still do not think, that the classic FOOM scenarios were very unlikely; I feel that the slow takeoff and fast takeoff scenarios are probably within a factor of 2 of each other in probability.

Yet more and more nowadays I get the impression that people think slow takeoff is the only serious possibility. For example, Ajeya and Rohin seem very confident that if TAI was coming in the next five to ten years we would see loads more economic applications of AI now, therefore TAI isn't coming in the next five to ten years...

I need to process my thoughts more on this, and reread their claims; maybe they aren't as confident as they sound to me. But I worry that I need to go back to doing AI forecasting work after all (I left AI Impacts for CLR because I thought AI forecasting was less neglected) since so many people seem to have wrong views. ;)

This random rant/musing probably isn't valuable to anyone besides me, but hey, it's just a shortform. If you are reading this and you have thoughts or advice for me I'd love to hear it.

Replies from: jacob_cannell
comment by jacob_cannell · 2021-11-30T05:29:16.988Z · LW(p) · GW(p)

So there is a distribution over AGI plan costs. The max cost is some powerful bureaucrat/CEO/etc. who has no idea how to do it at all but has access to huge amounts of funds, so their best bet is to try to brute-force it by hiring all the respected scientists (e.g. the Manhattan Project). But notice - if any of these scientists (or small teams) actually could do it mostly on their own (perhaps, say, with VC funding) - then usually they'd get a dramatically better deal doing it on their own rather than for bigcorp.

The min cost is the lucky smart researcher who has mostly figured out the solution, but probably has little funds, because they spent career time only on a direct path. Think of the Wright brothers after the wing-warping control trick they got from observing bird flight. Could a bigcorp or government have beaten them? Of course, but the bigcorp would have had to spend OOMs more.

Now add a second dimension, call it vision variance: the distribution of AGI plan cost over all entities pursuing it. If that distribution is very flat, then everyone has the same obvious vision plan (or different but equivalently costly plans) and the winner is inevitably a big central player. However, if the variance over visions/plans is high, then the winner is inevitably a garage researcher.

Software is much like flight in this regard - high vision variance. Nearly all major software tech companies were scrappy garage startups - Google, Microsoft, Apple, Facebook, etc. Why? Because it simply doesn't matter at all how much money the existing bigcorp has - when the idea for X new software thing first occurs in human minds, it only occurs in a few, and those few minds are smart enough to realize its value, and they can implement it. The big central player is a dinosaur with zero leverage, and doesn't see it coming until it's too late.

AGI could be like software because... it probably will be software. Alternatively, it could be more like the Manhattan Project in that it fits into a well-known and widely shared sci-fi-level vision: all the relevant players know AGI is coming. By contrast, it wasn't at all obvious that a new graph connectivity algorithm would enable a search engine that actually works, which would then take over advertising - what?

Finally, another difference between the Manhattan Project and software is that the Manhattan Project required a non-trivial amount of tech tree climbing that was all done in secret, which is much harder for a small team to bootstrap. Software research is done almost entirely in the open, which makes it much easier for a small team because they usually just need to provide the final recombinative innovation or two, building off the communal tech tree. Likewise, aviation research was in the open; the Wright brothers quite literally started with a big book of known airplane designs, like a training dataset.

So anyway one take of this is one shouldn't discount AGI being created by an unknown garage researcher, as the probability mass in "AGI will be like other software" is non-trivial.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2019-11-03T17:47:33.554Z · LW(p) · GW(p)

It seems to me that human society might go collectively insane sometime in the next few decades. I want to be able to succinctly articulate the possibility and why it is plausible, but I'm not happy with my current spiel. So I'm putting it up here in the hopes that someone can give me constructive criticism:

I am aware of three mutually-reinforcing ways society could go collectively insane:

1. Echo chambers/filter bubbles/polarization: Arguably political polarization is increasing across the world of liberal democracies today. Perhaps the internet has something to do with this--it’s easy to self-select into a newsfeed and community that reinforces and extremizes your stances on issues. Arguably recommendation algorithms have contributed to this problem in various ways--see e.g. “Sort by controversial” and Stuart Russell’s claims in Human Compatible. At any rate, perhaps some combination of new technology and new cultural or political developments will turbocharge this phenomenon. This could lead to civil wars, or more mundanely, societal dysfunction. We can’t coordinate to solve collective action problems relating to AGI if we are all arguing bitterly with each other about culture war issues.
2. Deepfakes/propaganda/persuasion tools: Already a significant portion of online content is deliberately shaped by powerful political agendas--e.g. Russia, China, and the US political tribes. Much of the rest is deliberately shaped by less powerful apolitical agendas, e.g. corporations managing their brands or teenagers in Estonia making money by spreading fake news during US elections. Perhaps this trend will continue; technology like chatbots, language models, deepfakes, etc. might make it cheaper and more effective to spew this sort of propaganda, to the point where most online content is propaganda of some sort or other. This could lead to the deterioration of our collective epistemology.
3. Memetic evolution: Ideas (“Memes”) can be thought of as infectious pathogens that spread, mutate, and evolve to coexist with their hosts. The reasons why they survive and spread are various; perhaps they are useful to their hosts, or true, or perhaps they are able to fool the host’s mental machinery into keeping them around even though they are not useful or true. As the internet accelerates the process of memetic evolution--more memes, more hosts, more chances to spread, more chances to mutate--the mental machinery humans have to select for useful and/or true memes may become increasingly outclassed. Analogy: The human body has also evolved to select for pathogens (e.g. gut, skin, saliva bacteria) that are useful, and against those that are harmful (e.g. the flu). But when e.g. colonialism, globalism, and World War One brought more people together and allowed diseases to spread more easily, great plagues swept the globe. Perhaps something similar will happen (is already happening?) with memes.
Replies from: Benito, Zack_M_Davis
comment by Ben Pace (Benito) · 2019-11-03T18:01:10.467Z · LW(p) · GW(p)

All good points, but I feel like objecting to the assumption that society is currently sane and then we'll see a discontinuity, rather than any insanity being a continuation of current trajectories.

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2019-11-03T23:36:00.977Z · LW(p) · GW(p)

I agree with that actually; I should correct the spiel to make it clear that I do. Thanks!

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-03-06T14:47:40.587Z · LW(p) · GW(p)

I just want to say: Well done, Robin Hanson, for successfully predicting the course of the coronavirus over the past year. I remember a conversation with him in, like, March 2020 where he predicted that there would be a control system, basically: Insofar as things get better, restrictions would loosen and people would take more risks and then things would get worse, and trigger harsher restrictions which would make it get better again, etc. forever until vaccines were found. I think he didn't quite phrase it that way but that was the content of what he said. (IIRC he focused more on how different regions would have different levels of crackdown at different times, so there would always be places where the coronavirus was thriving to reinfect other places.) Anyhow, this was not at all what I predicted at the time, and nobody I know besides him made this prediction at the time.
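The control-system dynamic described above can be illustrated with a toy simulation. This is only a sketch under made-up assumptions: the growth factors, thresholds, and the `simulate` function are hypothetical illustrations, not anything Hanson proposed and not an epidemiological model.

```python
from typing import List, Tuple


def simulate(
    weeks: int = 60,
    cases: float = 100.0,
    loose_growth: float = 1.3,   # hypothetical weekly growth when restrictions are loose
    strict_growth: float = 0.7,  # hypothetical weekly shrinkage when restrictions are strict
    tighten_at: float = 1000.0,  # hypothetical case count that triggers a crackdown
    loosen_at: float = 200.0,    # hypothetical case count at which restrictions relax
) -> List[Tuple[float, bool]]:
    """Return (cases, restrictions_strict) for each simulated week."""
    strict = False
    history = []
    for _ in range(weeks):
        # The "controller": react to the current case count.
        if cases >= tighten_at:
            strict = True
        elif cases <= loosen_at:
            strict = False
        # The "plant": cases respond to the current level of restriction.
        cases *= strict_growth if strict else loose_growth
        history.append((cases, strict))
    return history


history = simulate()
# Count how many times the policy flipped between loose and strict.
flips = sum(1 for a, b in zip(history, history[1:]) if a[1] != b[1])
print(f"policy flipped {flips} times in {len(history)} weeks")
```

In this toy version the case count never settles. It bounces between the two thresholds for as long as the loop runs, which is the "etc. forever until vaccines were found" pattern.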

Replies from: Yoav Ravid
comment by Yoav Ravid · 2021-03-07T12:50:46.490Z · LW(p) · GW(p)

I wonder why he didn't (if he didn't) talk about it in public too. I imagine it could have been helpful - Anyone who took him seriously could have done better.

Replies from: juliawise, juliawise
comment by juliawise · 2021-03-08T20:33:30.590Z · LW(p) · GW(p)

He did write something along similar lines here: https://www.overcomingbias.com/2020/03/do-you-feel-lucky-punk.html

comment by juliawise · 2021-03-08T20:24:20.268Z · LW(p) · GW(p)

He did write publicly: https://www.overcomingbias.com/2020/03/variolation-may-cut-covid19-deaths-3-30x.html

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-10-05T22:49:02.784Z · LW(p) · GW(p)

I recommend listening to it yourself. I'm sorry I didn't take timestamped notes, then maybe you wouldn't have to. I think that listening to it has subtly improved my intuitions/models/priors about how US government and society might react to developments in AI in the future.

In a sense, this is already an example of an "AI warning shot" and the public's reaction to it. This hearing contains lots of discussion about Facebook's algorithms, discussion about how the profit-maximizing thing is often harmful but corporations have an incentive to do it anyway, discussion about how nobody understands what these algorithms really think & how the algorithms are probably doing very precisely targeted ads/marketing even though officially they aren't being instructed to. So, basically, this is a case of unaligned AI causing damage -- literally killing people, according to the politicians here.

And how do people react to it? Well, the push in this meeting here seems to be to name Facebook upper management as responsible and punish them, while also rolling out a grab bag of fixes such as eliminating like trackers for underage kids and changing the algorithm to not maximize engagement. I hope they would also apply such fixes to other major social media platforms, but this hearing does seem very focused on Facebook in particular. One thing that I think is probably a mistake: The people here constantly rip into Facebook for doing internal research that concluded their algorithms were causing harms, and then not sharing that research with the world. I feel like the predictable consequence of this is that no tech company will do research on topics like this in the future, and they'll hoard their data so that no one else can do the research either. In a sense, one of the outcomes of this warning shot will be to dismantle our warning shot detection system.

We'll see what comes of this.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-01-07T01:50:11.495Z · LW(p) · GW(p)

This article says OpenAI's big computer is somewhere in the top 5 largest supercomputers. I reckon it's fair to say their big computer is probably about 100 petaflops, or 10^17 flop per second. How much of that was used for GPT-3? Let's calculate.

I'm told that GPT-3 was 3x10^23 FLOP. So that's three million seconds. Which is 35 days.
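The arithmetic above can be checked in a few lines. This is a rough sketch that uses only the two figures quoted in this comment; the variable names are my own, and it assumes the machine runs at 100% utilization, which a real training run would not achieve.

```python
# Back-of-the-envelope check, assuming perfect utilization of the machine.
SUPERCOMPUTER_FLOP_PER_S = 1e17  # ~100 petaflops, the estimate given above
GPT3_TRAINING_FLOP = 3e23        # total training compute quoted above

SECONDS_PER_DAY = 24 * 60 * 60

training_seconds = GPT3_TRAINING_FLOP / SUPERCOMPUTER_FLOP_PER_S
training_days = training_seconds / SECONDS_PER_DAY

print(f"{training_seconds:.0f} seconds, or about {training_days:.1f} days")
```

This gives three million seconds, a little under 35 days. Lower utilization would stretch the wall-clock time proportionally.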

So, what else have they been using that computer for? It's been probably about 10 months since they did GPT-3. They've released a few things since then, but nothing within an order of magnitude as big as GPT-3 except possibly DALL-E, which was about an order of magnitude smaller. So it seems unlikely to me that their publicly-released stuff in total uses more than, say, 10% of the compute they must have available in that supercomputer. Since this computer is exclusively for the use of OpenAI, presumably they are using it, but for things which are not publicly released yet.

Is this analysis basically correct?

Replies from: jacob_cannell
comment by jacob_cannell · 2021-11-30T05:59:08.240Z · LW(p) · GW(p)

100 petaflops is 'only' about 1,000 GPUs, or considerably less if they are able to use lower precision modes. I'm guessing they have almost 100 researchers now? Which is only about 10 GPUs per researcher, and still a small budget fraction (perhaps $20/hr ish vs >$100/hr for the researcher).  It doesn't seem like they have a noticeable compute advantage per capita.
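To make the implied per-unit numbers explicit, here is a small sketch using only the round figures from this comment. The per-GPU throughput and per-GPU-hour price are derived from those guesses, not independently sourced, and $100/hr is treated as a lower bound on researcher cost.

```python
# All inputs are the rough guesses stated in the comment above.
TOTAL_FLOP_PER_S = 100e15                # 100 petaflops
GPU_COUNT = 1_000                        # "about 1,000 GPUs"
RESEARCHERS = 100                        # "almost 100 researchers"
COMPUTE_COST_PER_RESEARCHER_HOUR = 20.0  # "$20/hr ish"
RESEARCHER_COST_PER_HOUR = 100.0         # ">$100/hr", used as a lower bound

flop_per_gpu = TOTAL_FLOP_PER_S / GPU_COUNT    # implied throughput per GPU
gpus_per_researcher = GPU_COUNT / RESEARCHERS  # GPUs available per head
cost_per_gpu_hour = COMPUTE_COST_PER_RESEARCHER_HOUR / gpus_per_researcher

# Share of the combined (compute + salary) hourly budget spent on compute.
compute_share = COMPUTE_COST_PER_RESEARCHER_HOUR / (
    COMPUTE_COST_PER_RESEARCHER_HOUR + RESEARCHER_COST_PER_HOUR
)

print(f"{flop_per_gpu / 1e12:.0f} teraflops per GPU")
print(f"{gpus_per_researcher:.0f} GPUs per researcher at ${cost_per_gpu_hour:.2f}/GPU-hour")
print(f"compute is at most {compute_share:.0%} of the per-researcher hourly budget")
```

Under these guesses compute comes to at most about a sixth of the per-researcher budget, which is the sense in which it is "still a small budget fraction".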

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-12-17T20:08:49.859Z · LW(p) · GW(p)

When I saw this cool new OpenAI paper, I thought of Yudkowsky's Law of Earlier/Undignified Failure:

Relevant quote:

In addition to these deployment risks, our approach introduces new risks at train time by giving the model access to the web. Our browsing environment does not allow full web access, but allows the model to send queries to the Microsoft Bing Web Search API and follow links that already exist on the web, which can have side-effects. From our experience with GPT-3, the model does not appear to be anywhere near capable enough to dangerously exploit these side-effects. However, these risks increase with model capability, and we are working on establishing internal safeguards against them.

To be clear I am not criticizing OpenAI here; other people would have done this anyway even if they didn't. I'm just saying: It does seem like we are heading towards a world like the one depicted in What 2026 Looks Like where by the time AIs develop the capability to strategically steer the future in ways unaligned to human values... they are already roaming freely around the internet, learning constantly, and conversing with millions of human allies/followers. The relevant decision won't be "Do we let the AI out of the box?" but rather "Do we petition the government and tech companies to shut down an entire category of very popular and profitable apps, and do it immediately?"

Replies from: gwern
comment by gwern · 2021-12-17T21:59:36.016Z · LW(p) · GW(p)

"Tool AIs want to be agent AIs."

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-03-10T10:29:53.746Z · LW(p) · GW(p)

I just found Eric Drexler's "Paretotopia" idea/talk. It seems great to me; it seems like it should be one of the pillars of AI governance strategy. It also seems highly relevant to technical AI safety (though that takes much more work to explain).

Why isn't this being discussed more? What are the arguments against it?

Replies from: gerald-monroe
comment by Gerald Monroe (gerald-monroe) · 2021-03-11T04:11:06.407Z · LW(p) · GW(p)

I haven't watched the video, but prior knowledge of the nanomachinery proposals shows that a simple safety mechanism is feasible.

No nanoscale robotic system should be permitted to store more than a small fraction of the digital file containing the instructions to replicate itself. Nor should it have sufficient general-purpose memory to be capable of this.

This simple rule makes nanotechnology safe from grey goo. Grey goo becomes nearly impossible, as any system that gets out of control will have a large, macroscale component you can turn off. The rule is also testable: you can look at the design of a system and determine whether it meets the rule or not.

AI alignment is kinda fuzzy and I haven't heard of a simple testable rule.  Umm, also if such a rule exists then MIRI would have an incentive not to discuss it.

At least for near term agents we can talk about such rules.  They have to do with domain bounding.  For example, the heuristic for a "paperclip manufacturing subsystem" must include terms in the heuristic for "success" that limit the size  of the paperclip manufacturing machinery.  These terms should be redundant and apply more than a single check.  So for example, the agent might:

Seek maximum paperclips produced, with a large penalty for any of: (greater than A volume of machinery, greater than B tonnage of machinery, machinery outside of markers C, greater than D probability of a human killed, greater than E probability of an animal harmed, greater than F total network devices, greater than G ...)

Essentially, each of these redundant terms is a "circuit breaker", and if any one trips, the agent will not consider that action further.

"Does the agent have scope-limiting redundant circuit breakers" is a testable design constraint.  While "is it going to be friendly to humans" is rather more difficult.
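A minimal sketch of what such redundant circuit breakers could look like in code. Everything here is hypothetical: the limit names and values stand in for the unspecified A, B, D, and F above, and the predicted outcomes are assumed to come from some planner that is not shown. It implements the hard-veto reading (a tripped breaker removes the action from consideration) rather than a soft penalty.

```python
from typing import Dict, List, Optional

# Hypothetical limits standing in for A, B, D and F in the comment above.
LIMITS: Dict[str, float] = {
    "machinery_volume_m3": 500.0,
    "machinery_tonnes": 200.0,
    "p_human_killed": 1e-9,
    "network_devices": 50.0,
}


def tripped_breakers(predicted: Dict[str, float],
                     limits: Dict[str, float] = LIMITS) -> List[str]:
    """Return the name of every limit the predicted outcome exceeds.

    A missing measurement counts as tripped, so the check fails closed.
    """
    return [name for name, limit in limits.items()
            if predicted.get(name, float("inf")) > limit]


def choose_action(candidates: Dict[str, Dict[str, float]],
                  limits: Dict[str, float] = LIMITS) -> Optional[str]:
    """Pick the allowed action with the most paperclips, or None if all are vetoed."""
    allowed = {name: outcome for name, outcome in candidates.items()
               if not tripped_breakers(outcome, limits)}
    if not allowed:
        return None
    return max(allowed, key=lambda name: allowed[name]["paperclips"])


candidates = {
    "modest_factory": {
        "paperclips": 1e6, "machinery_volume_m3": 300.0,
        "machinery_tonnes": 120.0, "p_human_killed": 1e-12,
        "network_devices": 10.0,
    },
    "sprawling_factory": {
        "paperclips": 1e9, "machinery_volume_m3": 9000.0,
        "machinery_tonnes": 150.0, "p_human_killed": 1e-12,
        "network_devices": 10.0,
    },
}

print(choose_action(candidates))
print(tripped_breakers(candidates["sprawling_factory"]))
```

The sprawling option produces far more paperclips but is never scored, because one breaker (machinery volume) trips. The testable part of the design is `LIMITS` and `tripped_breakers`: an auditor can read them without needing to understand how the planner predicts outcomes.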

Replies from: Mitchell_Porter
comment by Mitchell_Porter · 2021-03-11T05:32:05.665Z · LW(p) · GW(p)

> No nanoscale robotic system ... should be permitted to store more than a small fraction of the digital file containing the instructions to replicate itself.

Will you outlaw bacteria?

Replies from: gilch, gerald-monroe
comment by gilch · 2021-03-11T05:57:40.209Z · LW(p) · GW(p)

The point was to outlaw artificial molecular assemblers like Drexler described in Engines of Creation. Think of maybe something like bacteria but with cell walls made of diamond. They might be hard to deal with once released into the wild. Diamond is just carbon, so they could potentially consume carbon-based life, but no natural organism could eat them. This is the "ecophagy" scenario.

But, I still think this is a fair objection. Some paths to molecular nanotechnology might go through bio-engineering, the so-called "wet nanotechnology" approach. We'd start with something like a natural bacterium, and then gradually replace components of the cell with synthetic chemicals, like amino acid analogues or extra base pairs or codons, which lets us work in an expanded universe of "proteins" that might be easier to engineer as well as having capabilities natural biology couldn't match. This kind of thing is already starting to happen. At what point does the law against self-replication kick in? The wet path is infeasible without it, at least early on.

Replies from: gerald-monroe
comment by Gerald Monroe (gerald-monroe) · 2021-03-11T06:28:53.016Z · LW(p) · GW(p)

> The point was to outlaw artificial molecular assemblers like Drexler described in Engines of Creation.

Not outlaw.  Prohibit "free floating" ones that can work without any further input (besides raw materials).  Allowed assemblers would be connected via network ports to a host computer system that has the needed digital files, kept in something that is large enough for humans to see it/break it with a fire axe or shotgun.

Note that making bacteria with gene knockouts so they can't replicate solely on their own, but have to be given specific amino acids in a nutrient broth, would be a way to retain control if you needed to do it the 'wet' way.

The law against self-replication is the same testable principle, actually: putting the gene knockouts back would break the law, because each wet-modified bacterium would then have all the components it needs to replicate itself again.

comment by Gerald Monroe (gerald-monroe) · 2021-03-11T06:18:37.170Z · LW(p) · GW(p)

I didn't create this rule.  But succinctly:

Life on Earth is more than likely stuck at a local maximum among the set of all possible self-replicating nanorobotic systems.

The grey goo scenario posits you could build tiny fully artificial nanotechnological 'cells', made of more durable and reliable parts, that could be closer to the global maximum for self-replicating nanorobotic systems.

These would then outcompete all life, bacteria included, and convert the biosphere to an ocean of copies of this single system.  People imagine each cellular unit might be made of metal, hence it would look grey to the naked eye, hence 'grey goo'.  (I won't speculate how they might be constructed, except to note that you would use AI agents to find designs for these machines.  The AI agents would do most of their exploring in a simulation and some exploring using a vast array of prototype 'nanoforges' that are capable of assembling test components and full designs.  So the AI agents would be capable of considering any known element and any design pattern known at the time or discovered in the process, then they would be capable of combining these ideas into possible 'global maximum' designs.  This sharing of information - where any piece from any prototype can be adapted and rescaled to be used in a different new prototype - is something nature can't do with conventional evolution - hence it could be many times faster.)

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-01-03T14:34:16.598Z · LW(p) · GW(p)

I heard a rumor that not that many people are writing reviews for the LessWrong 2019 Review. I know I'm not, haha. It feels like a lot of work and I have other things to do. Lame, I know. Anyhow, I'm struck by how academia's solution to this problem is bad, but still better than ours!

--In academia, the journal editor reaches out to someone personally to beg them to review a specific piece. This is psychologically much more effective than just posting a general announcement calling for volunteers.

--In academia, reviews are anonymous, so you can half-ass them and be super critical without fear of repercussions, which makes you more inclined to do it. (And more inclined to be honest too!)

Here are some ideas for things we could do:

--Model our process after Academia's process, except try to improve on it as well. Maybe we actually pay people to write reviews. Maybe we give the LessWrong team a magic Karma Wand, and they take all the karma that the anonymous reviews got and bestow it (plus or minus some random noise) to the actual authors. Maybe we have some sort of series of Review Parties where people gather together, chat and drink tasty beverages, and crank out reviews for a few hours.

Replies from: David Hornbein, Raemon, ChristianKl, niplav
comment by David Hornbein · 2021-01-03T22:02:38.373Z · LW(p) · GW(p)

In general I approve of the impulse to copy social technology from functional parts of society, but I really don't think contemporary academia should be copied by default. Frankly I think this site has a much healthier epistemic environment than you see in most academic communities that study similar subjects. For example, a random LW post with >75 points is *much* less likely to have an embarrassingly obvious crippling flaw in its core argument, compared to a random study in a peer-reviewed psychology journal.

Anonymous reviews in particular strike me as a terrible idea. Bureaucratic "peer review" in its current form is relatively recent for academia, and some of academia's most productive periods were eras where critiques came with names attached, e.g. the physicists of the early 20th century, or the Republic of Letters. I don't think the era of Elsevier journals with anonymous reviewers is an improvement—too much unaccountable bureaucracy, too much room for hidden politicking, not enough of the purifying fire of public argument.

If someone is worried about repercussions, which I doubt happens very often, then I think a better solution is to use a new pseudonym. (This isn't the reason I posted my critique of an FHI paper [LW · GW] under the "David Hornbein" pseudonym rather than under my real name, but it remains a proof of possibility.)

Some of these ideas seem worth adopting on their merits, maybe with minor tweaks, but I don't think we should adopt anything *because* it's what academics do.

comment by Raemon · 2021-01-03T15:47:11.194Z · LW(p) · GW(p)

Yeah, several of those ideas are "obviously good", and the reason we haven't done them yet is mostly because the first half of December was full of competing priorities (marketing the 2018 books, running Solstice). But I expect us to be much more active/agenty about this starting this upcoming Monday.

comment by ChristianKl · 2021-01-03T15:15:30.974Z · LW(p) · GW(p)

Maybe we have some sort of series of Review Parties where people gather together, chat and drink tasty beverages, and crank out reviews for a few hours.

Maybe that should be an event that happens in the garden?

comment by niplav · 2021-01-03T22:15:11.200Z · LW(p) · GW(p)

Maybe we give the LessWrong team a magic Karma Wand, and they take all the karma that the anonymous reviews got and bestow it (plus or minus some random noise) to the actual authors.

Wouldn't this achieve the opposite of what we want, disincentivize reviews? Unless coupled with paying people to write reviews, this would remove the remaining incentive.

I'd prefer going into the opposite direction, making reviews more visible (giving them a more prominent spot on the front page/on allPosts, so that more people vote on them/interact with them). At the moment, they still feel a bit disconnected from the rest of the site.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-10-03T16:22:21.488Z · LW(p) · GW(p)

GPT-3 app idea: Web assistant. Sometimes people want to block out the internet from their lives for a period, because it is distracting from work. But sometimes one needs the internet for work, e.g. you want to google a few things or fire off an email or look up a citation or find a stock image for the diagram you are making. Solution: An app that can do stuff like this for you. You put in your request, and it googles and finds and summarizes the answer, maybe uses GPT-3 to also check whether the answer it returns seems like a good answer to the request you made, etc. It doesn't have to work all the time, or for all requests, to be useful. As long as it doesn't mislead you, the worst that happens is that you have to wait till your internet fast is over (or break your fast).

I don't think this is a great idea but I think there'd be a niche for it.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-07-01T21:46:53.545Z · LW(p) · GW(p)

For fun:

“I must not step foot in the politics. Politics is the mind-killer. Politics is the little-death that brings total obliteration. I will face my politics. I will permit it to pass over me and through me. And when it has gone past I will turn the inner eye to see its path. Where the politics has gone there will be nothing. Only I will remain.”

Makes about as much sense as the original quote, I guess. :P

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-03-26T14:15:42.546Z · LW(p) · GW(p)

A few years ago there was talk of trying to make Certificates of Impact a thing in EA circles. There are lots of theoretical reasons why they would be great. One of the big practical objections was "but seriously though, who would actually pay money to buy one of them? What would be the point? The impact already happened, and no one is going to actually give you the credit for it just because you paid for the CoI."

Well, now NFT's are a thing. I feel like CoI's suddenly seem a lot more viable!

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-03-10T09:29:56.108Z · LW(p) · GW(p)

Here's my AI Theory reading list as of 1/3/2022. I'd love to hear suggestions for more things to add! You may be interested to know that the lessons from this list are part of why my timelines are so short.

On scaling laws:
https://arxiv.org/abs/2001.08361
(Original scaling laws paper, contains the IMO super-important graph showing that bigger models are more data-efficient)

https://arxiv.org/abs/2010.14701 (Newer scaling laws paper, with more cool results and graphs, in particular graphs showing how you can extrapolate GPT performance seemingly forever)

https://www.youtube.com/watch?v=QMqPAM_knrE (Excellent presentation by Kaplan on the scaling laws stuff, also talks a bit about the theory of why it's happening)

On the bayesian-ness and simplicity-bias of neural networks (which explains why scaling works and should be expected to continue, IMO):

https://www.lesswrong.com/posts/YSFJosoHYFyXjoYWa/why-neural-networks-generalise-and-why-they-are-kind-of [LW · GW] (more like, the linked posts and papers) [LW · GW]

How To Think About Overparameterized Models - LessWrong [LW · GW] (John making a similar point but from a different angle I think)

https://www.lesswrong.com/posts/FRv7ryoqtvSuqBxuT/understanding-deep-double-descent [LW · GW] (and the linked papers I guess though they are probably less important than this post)

(Maybe also https://www.lesswrong.com/posts/nGqzNC6uNueum2w8T/inductive-biases-stick-around [LW · GW] but this is less important)

https://arxiv.org/abs/2102.01293 (Scaling laws for transfer learning)

https://arxiv.org/abs/2006.04182 (Brains = predictive processing = backprop = artificial neural nets)

https://www.biorxiv.org/content/10.1101/764258v2.full (IIRC this provides support for Kaplan's view that human ability to extrapolate is really just interpolation done by a bigger brain on more and better data.)

https://arxiv.org/abs/2004.10802 (I haven't even read this one, but it seems to be providing more theoretical justification for exactly why the scaling laws are the way they are)

Added 5/14/2021: A bunch of stuff related to Lottery Ticket Hypotheses and SGD's implicit biases and how NN's work more generally. Super important for mesa-optimization and inner alignment.

https://arxiv.org/abs/2103.09377

"In this paper, we propose (and prove) a stronger Multi-Prize Lottery Ticket Hypothesis:

A sufficiently over-parameterized neural network with random weights contains several subnetworks (winning tickets) that (a) have comparable accuracy to a dense target network with learned weights (prize 1), (b) do not require any further training to achieve prize 1 (prize 2), and (c) is robust to extreme forms of quantization (i.e., binary weights and/or activation) (prize 3)."

https://arxiv.org/abs/2006.12156

"An even stronger conjecture has been proven recently: Every sufficiently overparameterized network contains a subnetwork that, at random initialization, but without training, achieves comparable accuracy to the trained large network."

https://arxiv.org/abs/2006.07990

The strong lottery ticket hypothesis (LTH) postulates that one can approximate any target neural network by only pruning the weights of a sufficiently over-parameterized random network. A recent work by Malach et al. establishes the first theoretical analysis for the strong LTH: one can provably approximate a neural network of width d and depth l, by pruning a random one that is a factor O(d^4 l^2) wider and twice as deep. This polynomial over-parameterization requirement is at odds with recent experimental research that achieves good approximation with networks that are a small factor wider than the target. In this work, we close the gap and offer an exponential improvement to the over-parameterization requirement for the existence of lottery tickets. We show that any target network of width d and depth l can be approximated by pruning a random network that is a factor O(log(dl)) wider and twice as deep.
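To get a feel for what "exponential improvement" means in the abstract above, here is a rough numerical sketch comparing the two width factors, d^4 l^2 versus log(dl). It assumes the constants hidden by the big-O notation are 1 and uses the natural log, neither of which the abstract specifies, so the numbers are only illustrative of the growth rates.

```python
import math


def poly_width_factor(d: int, l: int) -> int:
    """Width factor d^4 * l^2 (big-O constant assumed to be 1)."""
    return d**4 * l**2


def log_width_factor(d: int, l: int) -> float:
    """Width factor log(d * l) (natural log, big-O constant assumed to be 1)."""
    return math.log(d * l)


# Illustrative target-network sizes, not taken from the paper.
for d, l in [(10, 2), (100, 10), (1000, 50)]:
    print(f"d={d}, l={l}: polynomial factor {poly_width_factor(d, l):.3g}, "
          f"logarithmic factor {log_width_factor(d, l):.3g}")
```

Even for modest d and l the polynomial factor is astronomically large while the logarithmic one stays in the single or low double digits, which is the gap the abstract says it closes.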

https://arxiv.org/abs/2103.16547

"Based on these results, we articulate the Elastic Lottery Ticket Hypothesis (E-LTH): by mindfully replicating (or dropping) and re-ordering layers for one network, its corresponding winning ticket could be stretched (or squeezed) into a subnetwork for another deeper (or shallower) network from the same family, whose performance is nearly as competitive as the latter's winning ticket directly found by IMP."

https://arxiv.org/abs/2010.11354

Sparse neural networks have generated substantial interest recently because they can be more efficient in learning and inference, without any significant drop in performance. The "lottery ticket hypothesis" has showed the existence of such sparse subnetworks at initialization. Given a fully-connected initialized architecture, our aim is to find such "winning ticket" networks, without any training data. We first show the advantages of forming input-output paths, over pruning individual connections, to avoid bottlenecks in gradient propagation. Then, we show that Paths with Higher Edge-Weights (PHEW) at initialization have higher loss gradient magnitude, resulting in more efficient training. Selecting such paths can be performed without any data.

http://proceedings.mlr.press/v119/frankle20a.html

We study whether a neural network optimizes to the same, linearly connected minimum under different samples of SGD noise (e.g., random data order and augmentation). We find that standard vision models become stable to SGD noise in this way early in training. From then on, the outcome of optimization is determined to a linearly connected region. We use this technique to study iterative magnitude pruning (IMP), the procedure used by work on the lottery ticket hypothesis to identify subnetworks that could have trained in isolation to full accuracy. We find that these subnetworks only reach full accuracy when they are stable to SGD noise, which either occurs at initialization for small-scale settings (MNIST) or early in training for large-scale settings (ResNet-50 and Inception-v3 on ImageNet).

https://mathai-iclr.github.io/papers/papers/MATHAI_29_paper.pdf “In some situations we show that neural networks learn through a process of “grokking” a pattern in the data, improving generalization performance from random chance level to perfect generalization, and that this improvement in generalization can happen well past the point of overfitting.”

Proves some sort of theorem along the lines of "If you want to fit N data points smoothly then you need P parameters."

Seems relevant to NTK stuff maybe, haven't read it yet.

Argues that regularization causes modularity causes generalization. I'm especially interested in the bit about modularity causing generalization. Insofar as that's true, it's reason to expect future AI systems to be fractally modular even if they are big black-box end-to-end neural nets. And that has some big implications I think.

comment by adamShimi · 2021-03-10T10:49:27.072Z · LW(p) · GW(p)

Cool list! I'll look into the ones I don't know or haven't read yet.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2019-10-08T22:07:48.572Z · LW(p) · GW(p)

For the past year I've been thinking about the Agent vs. Tool debate (e.g. thanks to reading CAIS/Reframing Superintelligence) and also about embedded agency and mesa-optimizers and all of these topics seem very related now... I keep finding myself attracted to the following argument skeleton:

Rule 1: If you want anything unusual to happen, you gotta execute a good plan.

Rule 2: If you want a good plan, you gotta have a good planner and a good world-model.

Rule 3: If you want a good world-model, you gotta have a good learner and good data.

Rule 4: Having good data is itself an unusual happenstance, so by Rule 1 if you want good data you gotta execute a good plan.

Putting it all together: Agents are things which have good planner and learner capacities and are hooked up to actuators in some way. Perhaps they also are "seeded" with a decent world-model to start off with. Then, they get a nifty feedback loop going: They make decent plans, which allow them to get decent data, which allows them to get better world-models, which allows them to make better plans and get better data so they can get great world-models and make great plans and... etc. (The best agents will also be improving on their learning and planning algorithms! Humans do this, for example.)

Empirical conjecture: Tools suck; agents rock, and that's why. It's also why agenty mesa-optimizers will arise, and it's also why humans with tools will eventually be outcompeted by agent AGI.

Replies from: Pattern
comment by Pattern · 2019-10-10T04:15:31.130Z · LW(p) · GW(p)

How would you test the conjecture?

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2019-10-10T16:27:03.528Z · LW(p) · GW(p)

The ultimate test will be seeing whether the predictions it makes come true--whether agenty mesa-optimizers arise often, whether humans with tools get outcompeted by agent AGI.

In the meantime, it's not too hard to look for confirming or disconfirming evidence. For example, the fact that militaries and corporations that make a plan and then task their subordinates with strictly following the plan invariably do worse than those who make a plan and then give their subordinates initiative and flexibility to learn and adapt on the fly... seems like confirming evidence. (See: agile development model, the importance of iteration and feedback loops in startup culture, etc.) Whereas perhaps the fact that AlphaZero is so good despite lacking a learning module is disconfirming evidence.

As for a test, well we'd need to have something that proponents and opponents agree to disagree on, and that might be hard to find. Most tests I can think of now don't work because everyone would agree on what the probable outcome is. I think the best I can do is: Someday soon we might be able to test an agenty architecture and a non-agenty architecture in some big complex novel game environment, and this conjecture would predict that for sufficiently complex and novel environments the agenty architecture would win.

Replies from: bgold
comment by bgold · 2019-10-14T17:47:21.312Z · LW(p) · GW(p)

I'd agree w/ the point that giving subordinates plans and the freedom to execute them as best as they can tends to work out better, but that seems to be strongly dependent on other context, in particular the field they're working in (ex. software engineering vs. civil engineering vs. military engineering), cultural norms (ex. is this a place where agile engineering norms have taken hold?), and reward distributions (ex. does experimenting by individuals hold the potential for big rewards, or are all rewards likely to be distributed in a normal fashion such that we don't expect to find outliers).

My general model is in certain fields humans look more tool shaped and in others more agent shaped. For example an Uber driver when they're executing instructions from the central command and control algo doesn't require as much of the planning, world modeling behavior. One way this could apply to AI is that sub-agents of an agent AI would be tools.

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2019-10-15T15:27:57.249Z · LW(p) · GW(p)

I agree. I don't think agents will outcompete tools in every domain; indeed in most domains perhaps specialized tools will eventually win (already, we see humans being replaced by expensive specialized machinery, or expensive human specialists, lots of places). But I still think that there will be strong competitive pressure to create agent AGI, because there are many important domains where agency is an advantage.

Replies from: gwern
comment by gwern · 2019-10-15T16:12:40.266Z · LW(p) · GW(p)

Expensive specialized tools are themselves learned by and embedded inside an agent to achieve goals. They're simply mesa-optimization in another guise. E.g. AlphaGo learns a reactive policy which does nothing which you'd recognize as 'planning' or 'agentiness' - it just maps a grid of numbers (board state) to another grid of numbers (value function estimates of a move's value). A company, beholden to evolutionary imperatives, can implement internal 'markets' with 'agents' if it finds that useful for allocating resources across departments, or use top-down mandates if those work better, but no matter how it allocates resources, it's all in the service of an agent, and any distinction between the 'tool' and 'agent' parts of the company is somewhat illusory.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-04-25T10:15:09.904Z · LW(p) · GW(p)

The International Energy Agency releases regular reports in which it forecasts the growth of various energy technologies for the next few decades. It's been astoundingly terrible at forecasting solar energy for some reason. Marvel at this chart:

This is from an article criticizing the IEA's terrible track record of predictions. The article goes on to say that there should be about 500GW of installed capacity by 2020. This article was published in 2020; a year later, the 2020 data is in, and it's actually 714 GW. Even the article criticizing the IEA for their terrible track record managed to underestimate solar's rise one year out!

Anyhow, probably there were other people who successfully predicted it. But not these people. (I'd be interested to hear more about this--was the IEA representative of mainstream opinion? Or was it being laughed at even at the time? EDIT: Zac comments to say that yeah, plausibly they were being laughed at even then, and certainly now. Whew.)

Meanwhile, here was Carl Shulman in 2012: [LW(p) · GW(p)]

The continuation of the solar cell and battery cost curves are pretty darn impressive. Costs halving about once a decade, for several decades, is pretty darn impressive. One more decade until solar is cheaper than coal is today, and then it gets cheaper (vast areas of equatorial desert could produce thousands of times current electricity production and export in the form of computation, the products of electricity-intensive manufacturing, high-voltage lines, electrolysis to make hydrogen and hydrocarbons, etc). These trends may end before that, but the outside view looks good.

There have also been continued incremental improvements in robotics and machine learning that are worth mentioning, and look like they can continue for a while longer. Vision, voice recognition, language translation, and the like have been doing well.  [LW(p) · GW(p)]

"One more decade until solar is cheaper than coal is today..."

Anyhow, all of this makes me giggle, so I thought I'd share it. When money is abundant, knowledge is the real wealth. [LW · GW] In other words, many important kinds of knowledge are not for sale. If you were a rich person who didn't have generalist research skills and didn't know anything about solar energy, and relied on paying other people to give you knowledge, you would have listened to the International Energy Agency's official forecasts rather than Carl Shulman or people like him, because you wouldn't know how to distinguish Carl from the various other smart opinionated uncredentialed people all saying different things.

Replies from: zac-hatfield-dodds
comment by Zac Hatfield-Dodds (zac-hatfield-dodds) · 2021-04-25T12:16:00.185Z · LW(p) · GW(p)

The IEA is a running joke in climate policy circles; they're transparently in favour of fossil fuels and their "forecasts" are motivated by political (or perhaps commercial, hard to untangle with oil) interests rather than any attempt at predictive accuracy.

Replies from: daniel-kokotajlo, Sherrinford
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-04-25T13:07:22.744Z · LW(p) · GW(p)

OH ok thanks! Glad to hear that. I'll edit.

comment by Sherrinford · 2021-04-26T19:46:23.938Z · LW(p) · GW(p)

What do you mean by "transparently" in favour of fossil fuels? Is there anything like a direct quote e.g. of Fatih Birol backing this up?

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-11-19T13:29:39.848Z · LW(p) · GW(p)

Maybe a tax on compute would be a good and feasible idea?

--Currently the AI community is mostly resource-poor academics struggling to compete with a minority of corporate researchers at places like DeepMind and OpenAI with huge compute budgets. So maybe the community would mostly support this tax, as it levels the playing field. The revenue from the tax could be earmarked to fund "AI for good" research projects. Perhaps we could package the tax with additional spending for such grants, so that overall money flows into the AI community, whilst reducing compute usage. This will hopefully make the proposal acceptable and therefore feasible.

--The tax could be set so that it is basically 0 for everything except for AI projects above a certain threshold of size, and then it's prohibitive. To some extent this happens naturally since compute is normally measured on a log scale: If we have a tax that is 1000% of the cost of compute, this won't be a big deal for academic researchers spending $100 or so per experiment (Oh no! Now I have to spend an extra $1,000! No big deal, I'll fill out an expense form and bill it to the university) but it would be prohibitive for a corporation trying to spend a billion dollars to make GPT-5. And the tax can also have a threshold such that only big-budget training runs get taxed at all, so that academics are completely untouched by the tax, as are small businesses, and big businesses making AI without the use of massive scale.

--The AI corporations and most of all the chip manufacturers would probably be against this. But maybe this opposition can be overcome.
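A minimal sketch of the two tax variants described above: a flat 1000% tax, and the same tax with a size threshold below which nothing is owed. The function name, the $1M threshold, and the choice to tax the full cost of any run at or above the threshold are all assumptions made for illustration; the proposal itself doesn't fix any of them.

```python
def compute_tax(cost_usd: float, rate: float = 10.0, threshold_usd: float = 0.0) -> float:
    """Return the tax owed on a training run costing `cost_usd`.

    rate=10.0 corresponds to the 1000% figure in the proposal.
    Runs cheaper than `threshold_usd` owe nothing (assumption: runs at or
    above the threshold are taxed on their full cost).
    """
    if cost_usd < threshold_usd:
        return 0.0
    return rate * cost_usd


# Hypothetical threshold, chosen only for illustration.
THRESHOLD_USD = 1_000_000.0

for label, cost in [("academic experiment", 100.0), ("billion-dollar run", 1e9)]:
    flat = compute_tax(cost)
    gated = compute_tax(cost, threshold_usd=THRESHOLD_USD)
    print(f"{label}: flat tax ${flat:,.0f}, thresholded tax ${gated:,.0f}")
```

Under the flat variant the academic owes $1,000 and the corporation $10 billion; with the threshold the academic owes nothing and the corporation's bill is unchanged.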

Replies from: riceissa, daniel-kokotajlo
comment by riceissa · 2020-11-24T05:48:29.630Z · LW(p) · GW(p)

Would this work across different countries (and if so how)? It seems like if one country implemented such a tax, the research groups in that country would be out-competed by research groups in other countries without such a tax (which seems worse than the status quo, since now the first AGI is likely to be created in a country that didn't try to slow down AI progress or "level the playing field").

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-11-24T07:05:19.509Z · LW(p) · GW(p)

Yeah, probably not. It would need to be an international agreement I guess. But this is true for lots of proposals. On the bright side, you could maybe tax the chip manufacturers instead of the AI projects? Idk.

Maybe one way it could be avoided is if it came packaged with loads of extra funding for safe AGI research, so that overall it is still cheapest to work from the US.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-11-20T11:10:38.329Z · LW(p) · GW(p)

Another cool thing about this tax is that it would automatically counteract decreases in the cost of compute. Say we make the tax 10% of the current cost of compute. Then when the next generation of chips comes online, and the price drops by an order of magnitude, automatically the tax will be 100% of the cost. Then when the next generation comes online, the tax will be 1000%.

This means that we could make the tax basically nothing even for major corporations today, and only start to pinch them later.
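The arithmetic above can be sketched as follows, assuming the tax is fixed as a dollar amount per unit of compute (set at 10% of today's price) and that each chip generation cuts the price by exactly 10x. Both the per-unit framing and the clean 10x drop are simplifying assumptions, and the function name is made up for this example.

```python
def tax_share_by_generation(initial_price: float,
                            initial_tax_rate: float,
                            price_drop_factor: float,
                            generations: int) -> list[float]:
    """Tax as a fraction of the compute price, for generation 0..generations.

    The tax per unit of compute is fixed once, at
    initial_tax_rate * initial_price, and never updated afterwards.
    """
    tax_per_unit = initial_price * initial_tax_rate
    shares = []
    price = initial_price
    for _ in range(generations + 1):
        shares.append(tax_per_unit / price)
        price /= price_drop_factor  # next generation of chips is cheaper
    return shares


shares = tax_share_by_generation(initial_price=1.0, initial_tax_rate=0.10,
                                 price_drop_factor=10.0, generations=2)
for gen, share in enumerate(shares):
    print(f"generation {gen}: tax is about {share:.0%} of the compute price")
```

The tax's share of the price grows by the same factor the price falls, which is why it could start out negligible and still bite later.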

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-09-04T10:06:56.888Z · LW(p) · GW(p)

Eric Drexler has argued that the computational capacity of the human brain is equivalent to about 1 PFlop/s, that is, we are already past the human-brain-human-lifetime milestone. (Here is a gdoc.) The idea is that we can identify parts of the human brain that seem to perform similar tasks to certain already-existing AI systems. It turns out that e.g. 1-thousandth of the human brain is used to do the same sort of image processing tasks that seem to be handled by modern image processing AI... so then that means an AI 1000x bigger than said AI should be able to do the same things as the whole human brain, at least in principle.

Has this analysis been run with nonhuman animals? For example, a chicken brain is a lot smaller than a human brain, but can still do image recognition, so perhaps the part of the chicken that does image recognition is smaller than the part of the human that does image recognition.

Replies from: avturchin
comment by avturchin · 2020-09-04T12:05:30.887Z · LW(p) · GW(p)

It is known that bird brains are much more mass-efficient than mammalian brains.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-07-21T12:04:06.107Z · LW(p) · GW(p)

One thing I find impressive about GPT-3 is that it's not even trying to generate text.

Imagine that someone gave you a snippet of random internet text, and told you to predict the next word. You give a probability distribution over possible next words. The end.

Then, your twin brother gets a snippet of random internet text, and is told to predict the next word. Etc. Unbeknownst to either of you, the text your brother gets is the text you got, with a new word added to it according to the probability distribution you predicted.

Is it any wonder that sometimes the result doesn't make sense? All it takes for the chain of words to get derailed is for one unlucky word to be drawn from someone's distribution of next-word prediction. GPT-3 doesn't have the ability to "undo" words it has written; it can't even tell its future self what its past self had in mind when it "wrote" a word!
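The twin-brother setup above is essentially autoregressive sampling, and a toy version makes the "no undo, no memory of intent" point concrete. This is only a sketch: `TOY_MODEL` is a made-up lookup table standing in for a real language model, and it conditions on just the last word, whereas GPT-3 conditions on the whole preceding text.

```python
import random

# Hypothetical stand-in for a language model: last word -> next-word distribution.
TOY_MODEL = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"sat": 0.2, "ran": 0.8},
    "sat": {"the": 1.0},
    "ran": {"the": 1.0},
}


def predict_next(text: list[str]) -> dict[str, float]:
    """One 'twin': sees only the text so far, returns a distribution. The end."""
    return TOY_MODEL.get(text[-1], {"the": 1.0})


def generate(prompt: list[str], n_words: int, seed: int) -> list[str]:
    """Repeatedly predict, sample one word, append it, and hand the text on."""
    rng = random.Random(seed)
    text = list(prompt)
    for _ in range(n_words):
        dist = predict_next(text)
        word = rng.choices(list(dist), weights=list(dist.values()))[0]
        text.append(word)  # appended words are never revised or removed
    return text


print(" ".join(generate(["the"], n_words=8, seed=0)))
```

Nothing is passed from one step to the next except the text itself, so an unlucky draw at any step becomes part of the context that every later prediction has to work with.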

EDIT: I just remembered Ought's experiment with getting groups of humans to solve coding problems by giving each human 10 minutes to work on it and then passing it on to the next. The results? Humans sucked. The overall process was way less effective than giving 1 human a long period to solve the problem. Well, GPT-3 is like this chain of humans!

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-04-14T10:48:35.793Z · LW(p) · GW(p)

Years after I first thought of it, I continue to think that this chain reaction is the core of what it means for something to be an agent, AND why agency is such a big deal, the sort of thing we should expect to arise and outcompete non-agents. Here's a diagram:

Roughly, plans are necessary for generalizing to new situations, for being competitive in contests for which there hasn't been time for natural selection to do lots of optimization of policies. But plans are only as good as the knowledge they are based on. And knowledge doesn't come a priori; it needs to be learned from data. And, crucially, data is of varying quality, because it's almost always irrelevant/unimportant. High-quality data, the kind that gives you useful knowledge, is hard to come by. Indeed, you may need to make a plan for how to get it. (Or more generally, being better at making plans makes you better at getting higher-quality data, which makes you more knowledgeable, which makes your plans better.)

Replies from: Yoav Ravid, gworley
comment by Yoav Ravid · 2021-04-14T16:33:48.667Z · LW(p) · GW(p)

Seems similar to the OODA loop

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-04-14T19:22:15.359Z · LW(p) · GW(p)

Yep! I prefer my terminology but it's basically the same concept I think.

comment by G Gordon Worley III (gworley) · 2021-04-14T14:15:05.195Z · LW(p) · GW(p)

I think it's probably even simpler than that: feedback loops are the minimum viable agent, i.e. a thermostat is the simplest kind of agent possible. Stuff like knowledge and planning are elaborations on the simple theme of the negative feedback circuit.

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-04-14T15:22:11.347Z · LW(p) · GW(p)

I disagree; I think we go astray by counting things like thermostats as agents. I'm proposing that this particular feedback loop I diagrammed is really important, a much more interesting phenomenon to study than the more general category of feedback loop that includes thermostats.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-01-18T16:00:56.051Z · LW(p) · GW(p)

In this post [LW · GW], Jessicata describes an organization which believes:

1. AGI is probably coming in the next 20 years.
2. Many of the reasons we have for believing this are secret.
3. They're secret because if we told people about those reasons, they'd learn things that would let them make an AGI even sooner than they would otherwise.

At the time, I didn't understand why an organization would believe that. I figured they thought they had some insights into the nature of intelligence or something, some special new architecture for AI designs, that would accelerate AI progress if more people knew about it. I was skeptical, because what are the odds that breakthroughs in fundamental AI science would come from such an organization? Surely we'd expect such breakthroughs to come from e.g. DeepMind.

Now I realize: Of course! The secret wasn't a dramatically new architecture, it was that dramatically new architectures aren't needed. It was the scaling hypothesis. [LW · GW] This seems much more plausible to me.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-11-19T13:20:42.609Z · LW(p) · GW(p)

The other day I heard this anecdote: Someone's friend was several years ago dismissive of AI risk concerns, thinking that AGI was very far in the future. When pressed about what it would take to change their mind, they said their fire alarm would be AI solving Montezuma's Revenge. Well, now it's solved, what do they say? Nothing; if they noticed they didn't say. Probably if they were pressed on it they would say they were wrong before to call that their fire alarm.

This story fits with the worldview expressed in "There's No Fire Alarm for AGI." I expect this sort of thing to keep happening well past the point of no return.

Replies from: Viliam
comment by Viliam · 2020-11-21T14:15:05.537Z · LW(p) · GW(p)

Also related: Is That Your True Rejection? [LW · GW]

There is this pattern when people say: "X is the true test of intelligence", and after a computer does X, they switch to "X is just a mechanical problem, but Y is the true test of intelligence". (Past values of X include: chess, go, poetry...) There was a meme about it that I can't find now.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-08-21T21:00:11.356Z · LW(p) · GW(p)

Notes on Tesla AI day presentation:

https://youtu.be/j0z4FweCy4M?t=6309 Here they claim they've got more than 10,000 GPUs in their supercomputer, and that this means their computer is more powerful than the top 5 publicly known supercomputers in the world. Consulting this list https://www.top500.org/lists/top500/2021/06/ it seems that this would put their computer at just over 1 Exaflop per second, which checks out (I think I had heard rumors this was the case) and also if you look at this https://en.wikipedia.org/wiki/Computer_performance_by_orders_of_magnitude to get a sense of how many flops GPUs do these days (10^13? 10^14?) it checks out as well, maybe.

Anyhow. 1 Exaflop means it would take about a day to do as much computation as was used to train GPT-3. (10^23 ops is 5 ooms more than 10^18, so 100,000 seconds.) So in 100 days, could do something 2 OOMs bigger than GPT-3. So... I guess the conclusion is that scaling up the AI compute trend by +2 OOMs will be fairly easy, and scaling up by +3 or +4 should be feasible in the next five years. But +5 or +6 seems probably out of reach? IDK I'd love to see someone model this in more detail. Probably the main thing to look at is total NVIDIA or TSMC annual AI-focused chip production.
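A rough sanity check of the arithmetic above. The two round figures (10^23 operations for GPT-3's training run, 10^18 operations per second for the cluster) are the ones used in this comment; perfect hardware utilization is an added simplifying assumption, so a real run would take longer.

```python
import math

GPT3_TRAINING_OPS = 1e23     # round figure used in the comment above
CLUSTER_OPS_PER_SEC = 1e18   # 1 exaflop, assuming perfect utilization
SECONDS_PER_DAY = 86_400


def days_to_train(total_ops, ops_per_sec=CLUSTER_OPS_PER_SEC):
    """Days needed to perform total_ops operations on the cluster."""
    return total_ops / ops_per_sec / SECONDS_PER_DAY


def ooms_over_gpt3(days, ops_per_sec=CLUSTER_OPS_PER_SEC):
    """Orders of magnitude by which a run of `days` exceeds GPT-3's compute."""
    total_ops = days * SECONDS_PER_DAY * ops_per_sec
    return math.log10(total_ops / GPT3_TRAINING_OPS)


gpt3_days = days_to_train(GPT3_TRAINING_OPS)
ooms_100_days = ooms_over_gpt3(100)
print(f"GPT-3-sized run: {gpt3_days:.2f} days")
print(f"100-day run: {ooms_100_days:.2f} OOMs above GPT-3")
```

Under these assumptions a GPT-3-sized run takes a little over a day, and a 100-day run lands just under +2 OOMs.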

https://youtu.be/j0z4FweCy4M?t=7467 Here it says Tesla's AI training computer is:

4x more compute per dollar

1.3x more energy-efficient

than... what? I didn't catch what they were comparing to.

They say they can do another 10x improvement in the next generation.

If they are comparing to the state of the art, that's a big deal I guess?

https://www.youtube.com/watch?v=j0z4FweCy4M Here Elon says their new robot is designed to be slow and weak so that humans can outrun it and overpower it if need be, because "you never know. [pause] Five miles an hour, you'll be fine. HAHAHA. Anyways... "

Example tasks it should be able to do: "Pick up that bolt and attach it to that car. Go to the store and buy me some groceries."

Prototype should be built next year.

The code name for the robot is Optimus. The Dojo simulation lead guy said they'll be focusing on helping out with the Optimus project in the near term. That's exciting because the software part is the hard part; it really seems like they'll be working on humanoid robot software/NN training in the next year or two.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-07-07T10:12:41.044Z · LW(p) · GW(p)

In a recent conversation, someone said the truism about how young people have more years of their life ahead of them and that's exciting. I replied that everyone has the same number of years of life ahead of them now, because AI timelines. (Everyone = everyone in the conversation, none of whom were above 30)

I'm interested in the question of whether it's generally helpful or harmful to say awkward truths like that. If anyone is reading this and wants to comment, I'd appreciate thoughts.

comment by Steven Byrnes (steve2152) · 2021-07-07T11:37:54.010Z · LW(p) · GW(p)

I've been going with the compromise position of "saying it while laughing such that it's unclear whether you're joking or not" :-P

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-07-07T12:52:07.740Z · LW(p) · GW(p)

The people who know me know I'm not joking, I think. For people who don't know me well enough to realize this, I typically don't make these comments.

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2021-07-07T17:32:52.716Z · LW(p) · GW(p)

I sometimes kinda have this attitude that this whole situation is just completely hilarious and absurd, i.e. that I believe what I believe about the singularity and apocalypse and whatnot, but that the world keeps spinning and these ideas have basically zero impact. And it makes me laugh. So when I shrug and say "I'm not saving enough for retirement; oh well, by then probably we'll all be dead or living in a radical post-work utopia", I'm not just laughing because it's ambiguously a joke, I'm also laughing because this kind of thing reminds me of how ridiculous this all is. :-P

Replies from: Pattern
comment by Pattern · 2021-07-07T20:37:53.215Z · LW(p) · GW(p)

What if things foom later than you're expecting - say during retirement?

What if anti-aging enters the scene and retirement can last, much, much longer, before the foom?

Replies from: steve2152
comment by Steven Byrnes (steve2152) · 2021-07-07T20:54:11.366Z · LW(p) · GW(p)

Tbc my professional opinion is that people should continue to save for retirement :-P

I mean, I don't have as much retirement savings as the experts say I should at my age ... but does anyone? Oh well...

comment by Vladimir_Nesov · 2021-07-09T08:06:41.451Z · LW(p) · GW(p)

"Truths" are persuasion, unless expected to be treated as hypotheses with the potential to evoke curiosity. This is charity, continuous progress on improving understanding of circumstances that produce claims you don't agree with, a key skill for actually changing your mind. By default charity is dysfunctional in popular culture, so non-adversarial use of factual claims that are not expected to become evident in short order depends on knowing that your interlocutor practices charity. Non-awkward factual claims are actually more insidious, as the threat of succeeding in unjustified persuasion is higher. So in a regular conversation, there is a place for arguments, not for "truths", awkward or not. Which in this instance entails turning the conversation to the topic of AI timelines.

I don't think there are awkward arguments here in the sense of treading a social taboo minefield, so there is no problem with that, except it's work on what at this point happens automatically via stuff already written up online, and it's more efficient to put effort in growing what's available online than doing anything in person, unless there is a plausible path to influencing someone who might have high impact down the line.

comment by Nisan · 2021-07-08T00:21:49.703Z · LW(p) · GW(p)

It's fine to say that if you want the conversation to become a discussion of AI timelines. Maybe you do! But not every conversation needs to be about AI timelines.

comment by Borasko · 2021-07-07T14:48:35.635Z · LW(p) · GW(p)

I've stopped bringing up the awkward truths around my current friends. I started to feel like I was using too much of my built-up esoteric social capital on things they were not going to accept (or at least want to accept). How can I blame them? If somebody else told me there was some random field in which a select few interested people will be deciding the fate of all of humanity for the rest of time, and I had no interest in that field, I would want to be skeptical of it as well. Especially if they were to throw out some figures, like saying that 15 - 25 years from now (my current timelines) is when humanity's reign over the earth will end because of this field.

I found that when I stopped bringing it up, conversations were lighter and more fun. I've accepted we will just be screwing around talking about personal issues and the issues du jour, and I don't mind it. The truth is a bitter pill to get down, and if they have no interest in helping AI research it's probably best they don't live their lives worrying about things they won't be able to change. So for me at least, I saw personal life improvements from not bringing some of those awkward truths up.

comment by Dagon · 2021-07-07T13:39:54.993Z · LW(p) · GW(p)

Depends on the audience and what they'll do with the reminder.  But that goes for the original statement as well (which remains true - there's enough uncertainty about AI timelines and impact on individual human lives that younger people have more years of EXPECTED (aka average across possible futures) life).

comment by ChristianKl · 2021-07-07T13:54:14.156Z · LW(p) · GW(p)

Whether it makes sense to tell someone an awkward truth often depends more on the person than on the truth.

Replies from: Pattern
comment by Pattern · 2021-07-07T20:35:32.823Z · LW(p) · GW(p)

Truths in general:

This is especially true when the truth isn't in the words, but something you're trying to point at with them.

Awkward truths:

What makes something an awkward truth is the person, anyway, so your statement seems tautological.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-04-30T10:56:42.024Z · LW(p) · GW(p)

Probably, when we reach an AI-induced point of no return, AI systems will still be "brittle" and "narrow" in the sense used in arguments against short timelines.

Argument: Consider AI Impacts' excellent point that "human-level" is superhuman (bottom of this page)

The point of no return, if caused by AI, could come in a variety of ways that don't involve human-level AI in this sense. See this post [LW · GW] for more. The general idea is that being superhuman at some skills can compensate for being subhuman at others. We should expect the point of no return to be reached at a time when even the most powerful AIs have weak points, brittleness, narrowness, etc. -- that is, even they have various things that they totally fail at, compared to humans. (Note that the situation is symmetric; humans totally fail at various things compared to AIs even today)

I was inspired to make this argument by reading this blast from the past which argued that the singularity can't be near because AI is still brittle/narrow. I expect arguments like this to continue being made up until (and beyond) the point of no return, because even if future AI systems are significantly less brittle/narrow than today's, they will still be bad at various things (relative to humans), and so skeptics will still have materials with which to make arguments like this.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-01-26T11:42:34.561Z · LW(p) · GW(p)

How much video data is there? It seems there is plenty:

This https://www.brandwatch.com/blog/youtube-stats/ says 500 hours of video are uploaded to youtube every minute. This https://lumen5.com/learn/youtube-video-dimension-and-size/ says standard definition for youtube video is 854x480 = 409920 pixels. At 48fps, that’s 3.5e13 pixels of data every minute. Over the course of a whole year, that’s +5 OOMs, it comes out to 1.8e19 pixels of data every year. So yeah, even if we use some encoding that crunches pixels down to 10x10 vokens or whatever, we’ll have 10^17 data points of video. That seems like plenty given that according to the scaling laws and the OpenPhil brain estimate we only need 10^15 data points to train an AI the size of a human brain. (And also according to Lanrian's GPT-N extrapolations, human-level performance and test loss will be reached by 10^15 or so.)

But maybe I've done my math wrong or something?
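One way to check the math is to redo it step by step. This sketch uses only the figures quoted above (500 hours uploaded per minute, 854x480 pixels per frame, 48 fps, 10x10 pixels per voken); the 365-day year and the variable names are my own choices.

```python
HOURS_UPLOADED_PER_MINUTE = 500   # upload rate cited above
PIXELS_PER_FRAME = 854 * 480      # standard definition, as cited above
FRAMES_PER_SECOND = 48            # frame rate assumed in the comment
MINUTES_PER_YEAR = 60 * 24 * 365  # assumes a 365-day year
PIXELS_PER_VOKEN = 10 * 10        # the "10x10 vokens" encoding

# Seconds of video that arrive during one wall-clock minute.
video_seconds_per_minute = HOURS_UPLOADED_PER_MINUTE * 3600

pixels_per_minute = video_seconds_per_minute * FRAMES_PER_SECOND * PIXELS_PER_FRAME
pixels_per_year = pixels_per_minute * MINUTES_PER_YEAR
vokens_per_year = pixels_per_year / PIXELS_PER_VOKEN

print(f"pixels per minute: {pixels_per_minute:.2e}")
print(f"pixels per year:   {pixels_per_year:.2e}")
print(f"vokens per year:   {vokens_per_year:.2e}")
```

With these inputs the numbers come out at roughly 3.5e13 pixels per minute, a bit under 1.9e19 pixels per year, and on the order of 10^17 vokens per year, so the arithmetic holds up; the real uncertainty is in the inputs (frame rate, resolution, and how much a voken actually compresses).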

Replies from: gerald-monroe
comment by Gerald Monroe (gerald-monroe) · 2021-01-26T23:45:43.425Z · LW(p) · GW(p)

So have you thought about what "data points" mean? If the data is random samples from the mandelbrot set, the maximum information the AI can ever learn is just the root equation used to generate the set.

Human agents control a robotics system where we take actions and observe the results on our immediate environment. This sort of information seems to lead to very rapid learning, especially for things where the consequences are near term and observable. You are essentially performing a series of experiments where you try action A vs B and observe what the environment does. This lets you rapidly cancel out data that doesn't matter; it's how you learn that lighting conditions don't affect how a rock falls when you drop it.

The point is that the obvious training data for an AI would be similar. It needs to manipulate, both in sims and in reality, the things we need it to learn about.

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-01-27T08:18:15.864Z · LW(p) · GW(p)

I've thought about it enough to know I'm confused! :)

I like your point about active learning (is that the right term?). I wonder how powerful GPT-3 would be if instead of being force-fed random internet text from a firehose, it had access to a browser and could explore (would need some sort of curiosity reward signal?). Idk, probably this isn't a good idea or else someone would have done it.

Replies from: gerald-monroe
comment by Gerald Monroe (gerald-monroe) · 2021-01-27T19:35:34.467Z · LW(p) · GW(p)

I don't know that GPT-3 is the best metric for 'progress towards general intelligence'.  One example of the agents receiving 'active' data that resulted in interesting results is this OpenAI experiment.

In this case the agents cannot emit text - which is what GPT-3 is doing that makes us feel it's "intelligent" - but can cleverly manipulate their environment in complex ways not hardcoded in.  The agents in this experiment are learning both movement to control how they view the environment and to use a few simple tools to accomplish a goal.

To me this seems like the most promising way forward.  I think that robust agents that can control real robots to do things, with those things becoming increasingly complex and difficult as the technology improves, might in fact be the "skeleton" of what would later allow for "real" sentience.

Because from our perspective, this is our goal.  We don't want an agent that can babble and seem smart, we want an agent that can do useful things - things we were paying humans to do - and thus extend what we can ultimately accomplish.  (Yes, in the immediate short term it unemploys lots of humans, but it also would make possible new things that previously we needed lots of humans to do.  It also should allow for doing things we know how to do now, but with better quality or on a broader scale.)

More exactly, how do babies learn?  Yes, they learn to babble, but they also learn a set of basic manipulations of their body - adjusting their viewpoint - and manipulate the environment with their hands - learning how it responds.

We can discuss more, I think I know how we will "get there from here" in broad strokes.  I don't think it will be done by someone writing a relatively simple algorithm and getting a sudden breakthrough that allows for sentience, I think it will be done by using well defined narrow domain agents that each do something extremely well - and by building higher level agents on top of this foundation in a series of layers, over years to decades, until you reach the level of abstraction of "modify thy own code to be more productive".

As a trivial example of abstraction, you can make a low-level agent whose only job is to grab things with a simulated/real robotic hand, and an agent one level up that decides what to grab by picking the grab with the highest predicted reward.
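A toy sketch of that two-layer arrangement, just to make the layering concrete. Everything here is hypothetical: the class names, the stubbed-out grab, and the hand-written reward table stand in for a real robot controller and a learned reward predictor.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Grab:
    """A candidate grab, identified only by its target (hypothetical)."""
    target: str


class GrabAgent:
    """Low-level agent: all it does is carry out one grab (stubbed here)."""

    def execute(self, grab: Grab) -> str:
        return f"grabbed {grab.target}"


class GrabSelector:
    """Agent one level up: chooses which grab to hand to the low-level agent."""

    def __init__(self, low_level: GrabAgent,
                 predict_reward: Callable[[Grab], float]):
        self.low_level = low_level
        self.predict_reward = predict_reward

    def act(self, candidates: Sequence[Grab]) -> str:
        # Pick the candidate with the highest predicted reward, then delegate.
        best = max(candidates, key=self.predict_reward)
        return self.low_level.execute(best)


# Stand-in for a learned reward model: a fixed lookup table.
toy_rewards = {"bolt": 0.9, "cup": 0.2}
selector = GrabSelector(GrabAgent(), lambda g: toy_rewards[g.target])
result = selector.act([Grab("cup"), Grab("bolt")])
print(result)
```

The higher-level agent never needs to know how a grab is performed, only which one to request, which is the sense in which further layers could be stacked on top.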

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-01-28T17:15:28.408Z · LW(p) · GW(p)

We can discuss more, I think I know how we will "get there from here" in broad strokes.  I don't think it will be done by someone writing a relatively simple algorithm and getting a sudden breakthrough that allows for sentience, I think it will be done by using well defined narrow domain agents that each do something extremely well - and by building higher level agents on top of this foundation in a series of layers, over years to decades, until you reach the level of abstraction of "modify thy own code to be more productive".

I'd be interested to hear more about this. It sounds like this could maybe happen pretty soon with large, general language models like GPT-3 + prompt programming + a bit of RL.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-06-16T19:28:17.456Z · LW(p) · GW(p)

I recommend The Meme Machine, it's a shame it didn't spawn a huge literature. I was thinking a lot about memetics before reading it, yet still I feel like I learned a few important things.

Anyhow, here's an idea inspired by it:

First, here is my favorite right way to draw analogies between AI and evolution:

Evolution : AI research over time throughout the world

Gene : Bit of code on Github

Organism : The weights of a model

Past experiences of an organism : Training run of a model

With that as background context, I can now present the idea.

With humans, memetic evolution is a thing. It influences genetic evolution and even happens fast enough to influence the learning of a single organism over time. With AIs, memetic evolution is pretty much not a thing. Sure, the memetic environment will change somewhat between 2020 and whenever APS-AI is built, but the change will be much less than all the changes that happened over the course of human evolution. And the AI training run that produces the first APS-AI may literally involve no memetic change at all (e.g. if it's trained like GPT-3).

So. Human genes must code for the construction of a brain + learning procedure that works for many different memetic environments, and isn't overspecialized to any particular memetic environment. Whereas the first APS-AI might be super-specialized to the memetic environment it was trained in.

This might be a barrier to building APS-AI; maybe it'll be hard to induce a neural net to have the right sort of generality/flexibility because we don't have lots of different memetic environments for it to learn from (and even if we did, there's the further issue that the memetic environments wouldn't be responding to it simultaneously) and maybe this is somehow a major block to having APS capabilities.

More likely, I think, is that APS-AI will still happen but it'll just lack the human memetic generality. It'll be "overfit" to the current memetic landscape. Maybe.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-05-16T19:00:42.623Z · LW(p) · GW(p)

I spent way too much time today fantasizing about metal 3D printers. I understand they typically work by using lasers to melt a fine layer of metal powder into solid metal, and then they add another layer of powder and melt more of it, and repeat, then drain away the unmelted powder. Currently they cost several hundred thousand dollars and can build stuff in something like 20cm x 20cm x 20cm volume. Well, here's my fantasy design for an industrial-scale metal 3D printer, that would probably be orders of magnitude better in every way, and thus hopefully orders of magnitude cheaper. Warning/disclaimer: I don't have any engineering or physics background and haven't bothered to check several things so probably I'm making stupid mistakes somewhere.

1. The facility consists of a big strong concrete-and-steel dome, with big airlocks in the sides. The interior is evacuated, i.e. made a vacuum.

2. The ground floor of the facility is flat concrete, mostly clear. On it roll big robot carts, carrying big trays of metal powder. They look like minecarts because they have high sides and need to be able to support a lot of weight. They have a telescoping platform on the interior so that the surface of the powder is always near the top of the cart, whether the powder is a meter deep or a millimeter. The carts heat the powder via induction coils, so that it is as close as possible to the melting point without actually being in danger of melting.

3. The ceiling of the facility (not literally the ceiling, more like a rack that hangs from it) is a dense array of laser turrets. Rows and rows and rows of them. They don't have to be particularly high-powered lasers because the metal they are heating up is already warm and there isn't any atmosphere to diffuse the beam or dissipate heat from the target. When laser turrets break they can be removed and replaced easily, perhaps by robots crawling above them.

4. Off in one corner of the facility is the powder-recycling unit. It has a bunch of robot arms to lift out the finished pieces while the carts dump their contents, the unused powder being swept up and fed to powder-sprinkler-bots.

5. There are a bunch of powder-sprinkler-bots that perhaps look like the letter n. They roll over carts and deposit a fresh layer of powder on them. The powder is slightly cooler than the powder in the carts, to counteract the extra heat being added by the lasers.

6. The overall flow is: Carts arrange themselves like a parking lot on the factory floor; powder-sprinklers pair up with them and go back and forth depositing powder. The laser arrays above do their thing. When carts are finished or powder-sprinklers need more powder they use the lanes to go to the powder-recycling unit. Said unit also passes finished product through an airlock to the outside world, and takes in fresh shipments of powder through another airlock.

7. Pipes of oil or water manage heat in the facility, ensuring that various components (e.g. the lasers) stay cool, transporting the waste heat outside.

8. When machines break down, as they no doubt will, (a) teleoperated fixer robots might be able to troubleshoot the problem, and if not, (b) they can just wheel the faulty machine through an airlock to be repaired by humans outside.

9. I think current powder printer designs involve a second step in which the product is baked for a while to finish it? If this is still required, then the facility can have a station that does that. There are probably massive economies of scale for this sort of thing.

Overall this facility would allow almost-arbitrarily-large 3D objects to be printed; you are only limited by the size of your carts (and you can have one or two gigantic carts in there). Very little time would be wasted, because laser turrets will almost always have at least one chunk of cart within their line of fire that isn't blocked by something. There are relatively few kinds of machine (carts, turrets, sprinklers) but lots of each kind, resulting in economies of scale in manufacturing and also they are interchangeable so that several of them can be breaking down at any given time and the overall flow won't be interrupted. And the vacuum environment probably (I don't know for sure, and have no way of quantifying) helps a lot in many ways--lasers can be cheaper, less worry about dust particles messing things up, hotter powder melts quicker and better, idk. I'd love to hear an actual physicist's opinion. Does the vacuum help? Do open cart beds with ceiling lasers work, or is e.g. flying molten metal from neighboring carts a problem, or lasers not being able to focus at that range?

Outside the vacuum dome, printed parts would be finished and packaged into boxes and delivered to whoever ordered them.

If one of these facilities set up shop in your city, why would you ever need a metal 3D printer of your own? You could submit your order online and immediately some unused corner of some powder bed would start getting lasered, and the finished product would be available for pickup or delivery, just like a pizza.

If we ever start doing fancier things with our 3D printers, like using different types of metal or embedding other substances into them, the facility could be easily retrofitted to do this--just add another type of sprinkler-robot. (I guess the heated powder could be a problem? But some carts could be non-heated, and rely on extra-duration laser fire instead, or multiple redundant lasers.)

Replies from: wzp
comment by wzp · 2021-05-16T20:39:52.224Z · LW(p) · GW(p)

Cool sci-fi-ish idea, but my impression has been that 3D printing is viable for smaller and/or specific objects for which there is not enough demand to set up a separate production line. If economies of scale start to play a role, then setting up a more specifically optimized process wins over a general-purpose 3D printing plant.

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-05-16T20:52:34.212Z · LW(p) · GW(p)

This sort of thing would bring the cost down a lot, I think. Orders of magnitude, maybe. With costs that low, many more components and products would be profitable to 3D print instead of manufacture the regular way. Currently, Tesla uses 3D printing for some components of their cars I believe; this proves that for some components (tricky, complex ones typically) it's already cost-effective. When the price drops by orders of magnitude, this will be true for many more components. I'm pretty sure there would be sufficient demand, therefore, to keep at least a few facilities like this running 24/7.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-11-04T12:18:37.857Z · LW(p) · GW(p)

Ballistics thought experiment: (Warning: I am not an engineer and barely remember my high school physics)

You make a hollow round steel shield and make it a vacuum inside. You put a slightly smaller steel disc inside the vacuum region. You make that second disc rotate very fast. Very fast indeed.

An incoming projectile hits your shield. It is several times thicker than all three layers of your shield combined. It easily penetrates the first layer, passing through to the vacuum region. It contacts the speedily spinning inner layer, and then things get crazy... the tip of the projectile gets eroded by the spinning inner layer (Yes, it's spinning faster than the projectile is travelling, I told you it was fast).

I'm not sure what happens next, but I can think of two plausible possibilities:

1. The projectile bounces off the inner layer, because it's effectively been blocked not just by the material in front of it but by a whole ring of material, which collectively has more mass than the projectile. The ring slows down a bit and maybe has a dent in it now, a perfectly circular groove.

2. The inner layer disintegrates, but so does the projectile. The momentum of the ring of material that impacts the projectile is transferred to the projectile, knocking it sideways and probably breaking it into little melty pieces. The part of the inner layer outside the ring of material is now a free-spinning hula-hoop, probably a very unstable one which splits and shoots fragments in many directions.

Either way, the projectile doesn't make it through your shield.

So, does this work?

If it works, would it be useful for anything? It seems to be a relatively light form of armor; most armor has to be of similar (or greater) thickness to the projectile in order to block it, but this armor can be an order of magnitude less thick, possibly two or three. (Well, you also have to remember to account for the weight of the outer layer that holds back the vacuum... But I expect that won't be much, at least when we are armoring against tank shells and the like.)

So maybe it would be useful on tanks or planes, or ships. Something where weight is a big concern.

The center of the shield is a weak point I think. You can just put conventional armor there.

Problem: Aren't spinning discs harder to move about than still discs? They act like they have lots of mass, without having lots of weight? I feel like I vaguely recall this being the case, and my system 1 physics simulator agrees. This wouldn't reduce the armor's effectiveness, but it would make it really difficult to maneuver the tank/plane/ship equipped with it.

Problem: Probably this shield would become explosively useless after blocking the first shot. It's ablative armor in the most extreme sense. This makes it much less useful.

Solution: Maybe just use them in bunkers? Or ships? The ideal use case would be to defend some target that doesn't need to be nimble and which becomes much less vulnerable after surviving the first hit. Not sure if there is any such use case, realistically...

Replies from: lsusr, Dagon
comment by lsusr · 2020-11-04T22:00:50.590Z · LW(p) · GW(p)

This is well beyond today's technology to build. By the time we have the technology to build one of these shields we will also have prolific railguns. The muzzle velocity of a railgun today exceeds 3 km/s[1] for a 3 kg slug. As a Fermi estimate, I will treat the impact velocity of a railgun as 1 km/s. The shield must have a rim velocity much larger than that of the incoming projectile. Suppose the rim velocity of the shield is 100 km/s, its mass is $m$, and you cover a target with 100 such shields.

The kinetic energy of each shield is $E = \frac{1}{2} I \omega^2$. The moment of inertia is $I = \frac{1}{2} m r^2$. We can calculate the total kinetic energy of the 100 shields: $E_{\text{total}} = 100 \cdot \frac{1}{2} I \omega^2 = 25\, m v^2$, where $v = \omega r$ is the rim velocity.

This is an unstable system. If anything goes wrong like an earthquake or a power interruption, the collective shield is likely to explode in a cascading failure. Within Earth's atmosphere, a cascading explosion is guaranteed the first time it is hit. Such a failure would release 75 terajoules of energy.
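A minimal sketch of the energy bookkeeping, assuming each shield is a uniform solid disc so that its rotational energy depends only on mass and rim speed. The rim speed (100 km/s), the shield count (100), and the 75 TJ total are taken from the text; the per-shield mass printed below is back-calculated from those under the disc assumption and is not a figure stated in the comment.

```python
RIM_SPEED_M_PER_S = 100e3   # 100 km/s, from the text
NUM_SHIELDS = 100           # from the text
TOTAL_ENERGY_J = 75e12      # 75 terajoules, from the text


def disc_kinetic_energy(mass_kg, rim_speed_m_per_s):
    """Rotational energy of a uniform solid disc.

    E = 1/2 * I * w**2 with I = 1/2 * m * r**2 and w = v / r,
    so the radius cancels and E = 1/4 * m * v**2.
    """
    return 0.25 * mass_kg * rim_speed_m_per_s ** 2


def implied_mass_per_shield(total_energy_j, num_shields, rim_speed_m_per_s):
    """Invert E = 1/4 * m * v**2 for the mass of one shield."""
    energy_per_shield = total_energy_j / num_shields
    return 4 * energy_per_shield / rim_speed_m_per_s ** 2


mass_kg = implied_mass_per_shield(TOTAL_ENERGY_J, NUM_SHIELDS, RIM_SPEED_M_PER_S)
total_j = NUM_SHIELDS * disc_kinetic_energy(mass_kg, RIM_SPEED_M_PER_S)
print(f"implied mass per shield: {mass_kg:.0f} kg")
print(f"total stored energy: {total_j / 1e12:.0f} TJ")
```

Under the solid-disc assumption the stated total corresponds to a few hundred kilograms per shield; a thin ring of the same rim speed would store the same energy with half the mass.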

For comparison, the Trinity nuclear test released 92 terajoules of energy. This proposed shield amounts to detonating a fission bomb on yourself to block a bullet.

So, does this work?

Yes. The bullet is destroyed.

Aren't spinning discs harder to move about than still discs? They act like they have lots of mass, without having lots of weight?

Spinning a disk keeps its mass and weight the same. The disk becomes a gyroscope. Gyroscopes are hard to rotate but no harder to move than non-rotating objects. This is annoying for Earth-based buildings because the Earth rotates under them while the disks stay still.

1. In reality this is significantly limited by air resistance but air resistance can be partially mitigated by firing on trajectories that mostly go over the atmosphere. ↩︎

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-11-04T22:44:25.653Z · LW(p) · GW(p)

Lovely, thanks! This was fun to think about.

I had been hoping that the shield, while disintegrating and spraying bits of metal, would do so in a way that mostly doesn't harm the thing it is protecting, since all its energy is directed outwards from the rotating center, and thus orthogonal to the direction the target lies in. Do you think nevertheless the target would probably be hit by shrapnel or something?

Huh. If gyroscopes are no harder to move than non-rotating objects... Does this mean that these spinning vacuum disks can be sorta like a "poor man's nuke?" Have a missile with a spinning disc as the payload? Maybe you have to spend a few months gradually accelerating the disc beforehand, no problem. Just dump the excess energy your grid creates into accelerating the disc... I guess to match a nuke in energy you'd have to charge them up for a long time with today's grids... More of a rich man's nuke, ironically, in that for all that effort it's probably cheaper to just steal or buy some uranium and do the normal thing.
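To get a feel for "a long time", here is a rough sketch of the charging time for a disc storing the 75 TJ quoted above. The two power levels are illustrative assumptions, not figures from this thread, and all losses are ignored.

```python
# How long would it take to spin a disc up to a given stored energy?
# The power levels are hypothetical; friction and drive losses are ignored.

SECONDS_PER_DAY = 86_400


def charge_time_seconds(energy_j: float, power_w: float) -> float:
    """Time to deliver `energy_j` at constant power `power_w`."""
    return energy_j / power_w


TARGET_ENERGY = 75e12  # 75 TJ, the figure quoted earlier in the thread

for label, power in [("1 MW", 1e6), ("1 GW", 1e9)]:
    days = charge_time_seconds(TARGET_ENERGY, power) / SECONDS_PER_DAY
    print(f"{label}: {days:.1f} days")
```

The answer scales inversely with the power you can dedicate, so whether this is hours or years depends entirely on that assumption.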

Replies from: gilch
comment by gilch · 2020-11-04T23:05:44.154Z · LW(p) · GW(p)

Reminds me of cookie cutters from The Diamond Age

A cookie-cutter was shaped like an aspirin tablet, except that the top and bottom were domed more to withstand ambient pressure; for like most other nanotechnological devices a cookie-cutter was filled with vacuum. Inside were two centrifuges, rotating on the same axis but in opposite directions, preventing the unit from acting like a gyroscope. The device could be triggered in various ways; the most primitive were simple seven-minute time bombs.

Detonation dissolved the bonds holding the centrifuges together so that each of a thousand or so ballisticules suddenly flew outward. The enclosing shell shattered easily, and each ballisticule kicked up a shock wave, doing surprisingly little damage at first, tracing narrow linear disturbances and occasionally taking a chip out of a bone. But soon they slowed to near the speed of sound, where shock wave piled on top of shock wave to produce a sonic boom. Then all the damage happened at once. Depending on the initial speed of the centrifuge, this could happen at varying distances from the detonation point; most everything inside the radius was undamaged but everything near it was pulped; hence "cookie-cutter." The victim then made a loud noise like the crack of a whip, as a few fragments exited his or her flesh and dropped through the sound barrier in air. Startled witnesses would turn just in time to see the victim flushing bright pink. Bloodred crescents would suddenly appear all over the body; these marked the geometric intersection of detonation surfaces with skin and were a boon to forensic types, who could thereby identify the type of cookie-cutter by comparing the marks against a handy pocket reference card. The victim was just a big leaky sack of undifferentiated gore at this point and, of course, never survived.

comment by Dagon · 2020-11-04T17:40:46.572Z · LW(p) · GW(p)

Somewhat similar to https://en.wikipedia.org/wiki/Ablative_armor, but I don't think it actually works.  You'd have to put actual masses and speeds into a calculation to be sure, but "spinning much faster than the bullet/shrapnel moves" seems problematic.  At the very least, you have to figure out how to keep the inner sphere suspended so it doesn't contact the outer sphere.  You might be able to ignore that bit by just calculating this as a space-borne defense mechanism: drop the outer shield, spin a sphere around your ship/habitat.  I think you'll still find that you have to spin it so fast that it deforms or disintegrates even without attack, for conventional materials.

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-11-04T19:11:36.747Z · LW(p) · GW(p)

Mmm, good point about space-based system, that's probably a much better use case!

Replies from: Dagon
comment by Dagon · 2020-11-05T15:36:51.383Z · LW(p) · GW(p)

It's the easy solution to many problems in mechanics - put it in space, where you don't have to worry about gravity, air friction, etc.  You already specified that your elephant is uniform and spherical, so those complexities are already taken care of.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-03-02T16:34:59.034Z · LW(p) · GW(p)

Searching for equilibria can be infohazardous. You might not like the one you find first, but you might end up sticking with it (or worse, deviating from it and being punished). This is because which equilibrium gets played by other people depends (causally or, in some cases, acausally) not just on what equilibrium you play but even on which equilibria you think about, for reasons having to do with Schelling points. A strategy that sometimes works to avoid these hazards is to impose constraints on which equilibria you think about, or at any rate to perform a search through equilibria-space that is guided in some manner so as to be unlikely to find equilibria you won't like. For example, here is one such strategy: Start with a proposal that is great for you and would make you very happy. Then, think of the ways in which this proposal is unlikely to be accepted by other people, and modify it slightly to make it more acceptable to them while keeping it pretty good for you. Repeat until you get something they'll probably accept.
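The example strategy can be written as a tiny loop. What follows is a toy sketch only: the split-the-pie setting, the acceptance model, the threshold, and the step size are all hypothetical choices made for illustration.

```python
# A toy version of the "start from your favourite proposal and concede slowly" search.
# The setting, the acceptance model, and the parameters are hypothetical.

from typing import Callable


def guided_search(
    start_share: float,
    acceptance_prob: Callable[[float], float],
    threshold: float = 0.75,
    step: float = 0.05,
) -> float:
    """Return the first proposal, scanning downward from `start_share`, that the
    other party is estimated to accept with probability >= `threshold`.

    Because the scan starts at the proposer's ideal point and moves in small
    steps, it never examines splits worse for the proposer than the one returned.
    """
    share = start_share
    while share > 0 and acceptance_prob(share) < threshold:
        share = round(share - step, 10)  # concede a little; rounding limits float drift
    return max(share, 0.0)


def toy_acceptance(my_share: float) -> float:
    """Hypothetical model: the other party is more willing the more they get."""
    return min(1.0, 2.0 * (1.0 - my_share))


print(guided_search(start_share=0.95, acceptance_prob=toy_acceptance))
```

The point of the design is the direction of the scan: the loop stops at the first acceptable proposal, so less favourable equilibria are never even evaluated.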

Replies from: Dagon
comment by Dagon · 2020-03-02T17:12:08.177Z · LW(p) · GW(p)

I'm not sure I follow the logic. When you say "searching for equilibria", do you mean "internally predicting likelihood of points and durations of an equilibrium" (as most of what we worry about aren't stable)? Or do you mean the process of application of forces and observation of counter forces in which the system is "finding its level"? Or do you mean "discussion about possible equilibria, where that discussion is in fact a force that affects the system"?

Only the third seems to fit your description, and I think that's already covered by standard infohazard writings - the risk that you'll teach others something that can be used against you.

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-03-02T17:48:24.794Z · LW(p) · GW(p)

I meant the third, and I agree it's not a particularly new idea, though I've never seen it said this succinctly or specifically. (For example, it's distinct from "the risk that you'll teach others something that can be used against you," except maybe in the broadest sense.)

Replies from: Dagon
comment by Dagon · 2020-03-02T18:24:26.813Z · LW(p) · GW(p)

Interesting. I'd like to explore the distinction between "risk of converging on a dis-preferred social equilibrium" (which I'd frame as "making others aware that this equilibrium is feasible") and other kinds of revealing information which others use to act in ways you don't like. I don't see much difference.

The more obvious cases ("here are plans to a gun that I'm especially vulnerable to") don't get used much unless you have explicit enemies, while the more subtle ones ("I can imagine living in a world where people judge you for scratching your nose with your left hand") require less intentionality of harm directed at you. But it's the same mechanism and info-risk.

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-03-02T19:57:33.987Z · LW(p) · GW(p)

For one thing, the equilibrium might not actually be feasible, but making others aware that you have thought about it might nevertheless have harmful effects (e.g. they might mistakenly think that it is, or they might correctly realize something in the vicinity is.) For another, "teach others something that can be used against you" while technically describing the sort of thing I'm talking about, tends to conjure up a very different image in the mind of the reader -- an image more like your gun plans example.

I agree there is not a sharp distinction between these, probably. (I don't know, didn't think about it.) I wrote this shortform because, well, I guess I thought of this as a somewhat new idea -- I thought of most infohazards talk as being focused on other kinds of examples. Thank you for telling me otherwise!

Replies from: Dagon
comment by Dagon · 2020-03-02T20:57:24.678Z · LW(p) · GW(p)

(oops. I now realize this probably came across wrong). Sorry! I didn't intend to be telling you things, nor did I mean to imply that pointing out more subtle variants of known info-hazards was useless. I really appreciate the topic, and I'm happy to have exactly as much text as we have in exploring non-trivial application of the infohazard concept, and helping identify whether further categorization is helpful (I'm not convinced, but I probably don't have to be).

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-06-23T10:25:36.675Z · LW(p) · GW(p)

In stories (and in the past) important secrets are kept in buried chests, hidden compartments, or guarded vaults. In the real world today, almost the opposite is true: Anyone with a cheap smartphone can roam freely across the Internet, a vast sea of words and images that includes the opinions and conversations of almost every community. The people who will appear in future history books are right now blogging about their worldview and strategy! The most important events of the next century are right now being accurately predicted by someone, somewhere, and you could be reading about them in five minutes if you knew where to look! The plans of the powerful, the knowledge of the wise, and all the other important secrets are there for all to see -- but they are hiding in plain sight. To find these secrets, you need to be able to swim swiftly through the sea of words, skimming and moving on when they aren't relevant or useful, slowing down and understanding deeply when they are, and using what you learn to decide where to look next. You need to be able to distinguish the true and the important from countless pretenders. You need to be like a detective on a case with an abundance of witnesses and evidence, but where the witnesses are biased and unreliable and sometimes conspiring against you, and the evidence has been tampered with. The virtue you need most is rationality.

Replies from: Dagon, ChristianKl
comment by Dagon · 2021-06-23T17:55:44.448Z · LW(p) · GW(p)

There's a lot to unpack in the categorization of "important secrets".  I'd argue that the secret-est data isn't actually known by anyone yet, closely followed by secrets kept in someone's head, not in any vault or hidden compartment. Then there's "unpublished but theoretically discoverable" information, such as encrypted data, or data in limited-access locations (chests/caves, or just firewalled servers).

Then comes contextually-important insights buried in an avalanche of unimportant crap, which is the vast majority of interesting information, as you point out.

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-06-23T20:29:52.537Z · LW(p) · GW(p)

I think I agree with all that. My claim is that the vast majority of interesting+important secrets (weighted by interestingness+importance) is in the "buried in an avalanche of crap" category.

Replies from: Dagon
comment by Dagon · 2021-06-23T21:55:38.856Z · LW(p) · GW(p)

Makes sense.  Part of the problem is that this applies to non-secrets as well - in fact, I'm not sure "secret" is a useful descriptor in this.  The vast majority of interesting+important information, even that which is actively published and intended for dissemination, is buried in crap.

comment by ChristianKl · 2021-06-23T14:33:55.948Z · LW(p) · GW(p)

I don't think that's true. There's information available online but a lot of information isn't. A person who had access to the US cables showing security concerns at the WIV, and who had access to NSA surveillance, might have become alarmed that something problematic was happening when the cell phone traffic dropped in October 2019 and they took their database down.

On the other hand, I don't think that there's any way I could have known about the problems at the WIV in October of 2019 by accessing public information. Completely unrelated, it's probably just a coincidence that October 2019 was also the time when the exercise by US policy makers about how a Coronavirus pandemic played out was done.

I don't think there's any public source that I could access that tells me about whether or not it was a coincidence. It's not the most important question but it leaves questions about how warning signs are handled open.

The information that Dong Jingwei just gave the US government is more like a buried chest.

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-06-23T14:52:59.473Z · LW(p) · GW(p)

I mean, fair enough -- but I wasn't claiming that there aren't any buried-chest secrets. I'll put it this way: Superforecasters outperformed US intelligence community analysts in the prediction tournament; whatever secrets the latter had access to, they weren't important enough to outweigh the (presumably minor! Intelligence analysts aren't fools!) rationality advantage the superforecasters had!

Replies from: ChristianKl
comment by ChristianKl · 2021-06-23T18:23:59.251Z · LW(p) · GW(p)

When doing deep research I consider it very important to be mindful that a lot of information is just not available.

I think it's true that a lot can be done with publicly available information but it's important to keep in mind the battle that's fought for it. If an organization like US Right to Know didn't wage lawsuits, then the FOIA requests they make couldn't be used to inform our decisions.

While going through FOIA documents I really miss Julian. If he were still around, Wikileaks would likely host the COVID-19 related emails in a nice searchable fashion, and given that he isn't, I have to work through PDF documents.

Finding out how the WHO coordinated their censorship partnership on COVID-19 with Google and Twitter in a single day on the 3rd of February 2020, to prevent the lab-leak hypothesis from spreading further, requires access to internal documents. Between FOIA requests and official statements we can narrow it down to that day, but there's a limit to the depth that you can access with public information. It needs either a Senate committee to subpoena Google and Twitter or someone in those companies leaking the information.

How much information is available and how easy it is to access is the result of a constant battle for freedom of information. I think it's great to encourage people to do research but it's also important to be aware that a lot of information is withheld and that there's room for pushing the available information further.

There might be effects like the environment in which intelligence analysts operate training them to be biased towards what their boss wants to hear.

I also think that the more specific questions happen to be, the more important specialized sources of information become.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2022-01-16T02:47:31.767Z · LW(p) · GW(p)

Tonight my family and I played a trivia game (Wits & Wagers) with GPT-3 as one of the players! It lost, but not by much. It got 3 questions right out of 13. One of the questions it got right it didn't get exactly right, but was the closest and so got the points. (This is interesting because it means it was guessing correctly rather than regurgitating memorized answers. Presumably the other two it got right were memorized facts.)

Anyhow, having GPT-3 playing made the whole experience more fun for me. I recommend it. :) We plan to do this every year with whatever the most advanced publicly available AI (that doesn't have access to the internet) is.

Replies from: Mitchell_Porter
comment by Mitchell_Porter · 2022-01-16T05:31:20.596Z · LW(p) · GW(p)

How did GPT-3 participate?

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2022-01-16T13:53:14.063Z · LW(p) · GW(p)

I typed in the questions to GPT-3 and pressed "generate" to see its answers. I used a pretty simple prompt.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-12-26T02:32:54.644Z · LW(p) · GW(p)
When we remember we are all mad, all the mysteries disappear and life stands explained.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-05-16T10:45:06.455Z · LW(p) · GW(p)

Came across this in a SpaceX AMA:

Q: I write software for stuff that isn't life or death. Because of this, I feel comfortable guessing & checking, copying & pasting, not having full test coverage, etc. and consequently bugs get through every so often. How different is it to work on safety critical software?
A: Having worked on both safety critical and non-safety critical software, you absolutely need to have a different mentality. The most important thing is making sure you know how your software will behave in all different scenarios. This affects the entire development process including design, implementation and test. Design and implementation will tend towards smaller components with clear boundaries. This enables those components to be fully tested before they are integrated into a wider system. However, the full system still needs to be tested, which makes end to end testing and observability an important part of the process as well. By exposing information about the decisions the software is making in telemetry, we are able to automate monitoring of the software. This automation can be used in development, regression testing, as well as against software running on the real vehicles during missions. This helps us to be confident the software is working as expected throughout its entire life cycle, especially when we have crew onboard.

I am reminded of the Rocket Alignment Problem and the posts on Security Mindset.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-02-07T06:55:55.465Z · LW(p) · GW(p)

On a bunch of different occasions, I've come up with an important idea only to realize later that someone else came up with the same idea earlier. For example, the Problem of Induction/Measure Problem. Also modal realism and Tegmark Level IV. And the anti-souls argument from determinism of physical laws. There were more but I stopped keeping track when I got to college and realized this sort of thing happens all the time.

Now I wish I kept track. I suspect useful data might come from it. Like, my impression is that these scooped ideas tend to be scooped surprisingly recently; more of them were scooped in the past twenty years than in the twenty years prior, more in the past hundred years than in the hundred years prior, etc. This is surprising because it conflicts with the model of scientific/academic progress as being dominated by diminishing returns / low-hanging-fruit effects. Then again, maybe it's not so conflicting after all -- alternate explanations include something about ideas being forgotten over time, or something about my idea-generating process being tied to the culture and context in which I was raised. Still though now that I think about it that model is probably wrong anyway -- what would it even look like for this pattern of ideas being scooped more recently not to hold? That they'd be evenly spread between now and Socrates?

OK, so the example that came to mind turned out to be not a good one. But I still feel like having the data -- and ideally not just from me but from everyone -- would tell us something about the nature of intellectual progress, maybe about how path-dependent it is or something.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-09-15T11:07:40.288Z · LW(p) · GW(p)

Does the lottery ticket hypothesis have weird philosophical implications?

As I understand it, the LTH says that insofar as an artificial neural net eventually acquires a competency, it's because even at the beginning when it was randomly initialized there was a sub-network that happened to already have that competency to some extent at least. The training process was mostly a process of strengthening that sub-network relative to all the others, rather than making that sub-network more competent.
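To make "sub-network" concrete, here is a toy sketch in which a sub-network is just a binary mask over randomly initialised weights. Picking the mask by initial weight magnitude is a stand-in chosen for simplicity; it is an assumption of this sketch, not a claim about how such sub-networks are actually found, and all the names are made up.

```python
# Toy illustration of a "sub-network" as a binary mask over random initial weights.
# Magnitude-based selection is an illustrative assumption of this sketch.

import random


def random_weights(n: int, seed: int = 0) -> list[float]:
    """Randomly initialised weights for a single linear unit."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def top_k_mask(weights: list[float], k: int) -> list[int]:
    """1 for the k largest-magnitude weights, 0 elsewhere."""
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    keep = set(order[:k])
    return [1 if i in keep else 0 for i in range(len(weights))]


def forward(weights: list[float], mask: list[int], inputs: list[float]) -> float:
    """Output of the linear unit using only the unmasked weights."""
    return sum(w * m * x for w, m, x in zip(weights, mask, inputs))


weights = random_weights(10)
mask = top_k_mask(weights, k=3)   # the candidate sub-network
inputs = [1.0] * 10

full_output = forward(weights, [1] * 10, inputs)
sub_output = forward(weights, mask, inputs)
print(sum(mask), full_output, sub_output)
```

On this picture, "strengthening the sub-network relative to the others" would correspond to training mostly changing which weights effectively matter, rather than what the surviving weights compute.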

Suppose the LTH is true of human brains as well. Apparently at birth we have almost all the neurons that we will ever have. So... it follows that the competencies we have later in life are already present in sub-networks of our brain at birth.

So does this mean e.g. that there's some sub-network of my 1yo daughter's brain that is doing deep philosophical reflection on the meaning of life right now? It's just drowned out by all the other random noise and thus makes no difference to her behavior?

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2019-12-17T23:54:31.183Z · LW(p) · GW(p)

Two months ago [LW(p) · GW(p)] I said I'd be creating a list of predictions about the future in honor of my baby daughter Artemis. Well, I've done it, in spreadsheet form. The prediction questions all have a theme: "Cyberpunk." I intend to make it a fun new year's activity to go through and make my guesses, and then every five years on her birthdays I'll dig up the spreadsheet and compare prediction to reality.

I hereby invite anybody who is interested to go in and add their own predictions to the spreadsheet. Also feel free to leave comments asking for clarifications and proposing new questions or resolution conditions.

I'm thinking about making a version in Foretold.io, since that's where people who are excited about making predictions live. But the spreadsheet I have is fine as far as I'm concerned. Let me know if you have an opinion one way or another.

(Thanks to Kavana Ramaswamy and Ramana Kumar for helping out!)

Replies from: ozziegooen, DanielFilan
comment by ozziegooen · 2019-12-18T01:10:32.229Z · LW(p) · GW(p)

Hi Daniel!

We (Foretold) have been recently experimenting with "notebooks", which help structure tables for things like this.

I think a notebook/table setup for your spreadsheet could be a decent fit. These take a bit of time to set up now (because we need to generate each cell using a separate tool), but we could help with that if this looks interesting to you.

You can click on cells to add predictions to them.

Foretold is more experimental than Metaculus and doesn't have as large a community. But it could be a decent fit for this (and this should get better in the next 1-3 months, as notebooks get improved)

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2019-12-18T10:45:10.064Z · LW(p) · GW(p)

OK, thanks Ozzie; on your recommendation I'll try to make this work. I'll see how it works, see if I can do it myself, and reach out to you if it seems hard.

Replies from: ozziegooen
comment by ozziegooen · 2019-12-18T12:46:47.299Z · LW(p) · GW(p)

Sure thing. We don't have documentation for how to do this yet, but you can get an idea from seeing the "Markdown" of some of those examples.

The steps to do this:

1. Make a bunch of measurables.
2. Get the IDs of all of those measurables (you can see these in the Details tabs on the bottom)
3. Create the right notebook/table, and add all the correct IDs to the right places within them.
Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2019-12-22T15:06:21.805Z · LW(p) · GW(p)

OK, some questions:

1. By measurables you mean questions, right? Using the "New question" button? Is there a way for me to have a single question of the form "X is true" and then have four columns, one for each year (2025, 2030, 2035, 2040) where people can put in four credences for whether X will be true at each of those years?

2. I created a notebook/table with what I think are correctly formatted columns. Before I can add a "data" section to it, I need IDs, and for those I need to have made questions, right?

Replies from: ozziegooen
comment by ozziegooen · 2019-12-22T16:42:32.581Z · LW(p) · GW(p)
1. Yes, sorry. Yep, you need to use the "New question" button. If you want separate things for 4 different years, you need to make 4 different questions. Note that you can edit the names & descriptions in the notebook view, so you can make them initially with simple names, then later add the true names to be more organized.

2. You are correct. In the "details" sections of questions, you can see their IDs. These are the items to use.

You can of course edit notebooks after making them, so you may want to first make it without the IDs, then once you make the questions, add the IDs in, if you'd prefer.

comment by DanielFilan · 2019-12-18T00:24:36.858Z · LW(p) · GW(p)

I'm thinking about making a version in Foretold.io, since that's where people who are excited about making predictions live.

Well, many of them live on Metaculus.

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2019-12-18T10:44:05.025Z · LW(p) · GW(p)

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-11-29T14:57:25.004Z · LW(p) · GW(p)

I used to think that current AI methods just aren't nearly as sample/data-efficient as humans. For example, GPT-3 had to read 300B tokens of text whereas humans encounter 2-3 OOMs less, various game-playing AIs had to play hundreds of years worth of games to get gud, etc.
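For concreteness, the order-of-magnitude comparison works out as follows. This is plain arithmetic on the 300B figure above; the helper name is made up.

```python
# Order-of-magnitude arithmetic for the GPT-3 vs. human comparison.

GPT3_TOKENS = 300e9  # 300B tokens, as stated above


def reduce_by_ooms(quantity: float, ooms: int) -> float:
    """Divide a quantity by 10**ooms (one OOM is a factor of ten)."""
    return quantity / 10 ** ooms


for ooms in (2, 3):
    print(f"{ooms} OOMs less: {reduce_by_ooms(GPT3_TOKENS, ooms):.0e} tokens")
```

So "2-3 OOMs less" puts human language exposure somewhere between a few hundred million and a few billion tokens.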

Plus various people with 20-40 year AI timelines seem to think it's plausible -- in fact, probable -- that unless we get radically new and better architectures, this will continue for decades, meaning that we'll get AGI only when we can actually train AIs on medium or long-horizon tasks for a ridiculously large amount of data/episodes.

So EfficientZero came as a surprise to me, though it wouldn't have surprised me if I had been paying more attention to that part of the literature.

What gives?

Inspired by this comment [LW(p) · GW(p)]:

In linguistics there is an argument called the poverty of the stimulus. The claim is that children must figure out the rules of language using only a limited number of unlabeled examples. This is taken as evidence that the brain has some kind of hard-wired grammar framework that serves as a canvas for further learning while growing up.
Is it possible that tools like EfficientZero help find the fundamental limits for how much training data you need to figure out a set of rules? If an artificial neural network ever manages to reconstruct the rules of English by using only the stimulus that the average child is exposed to, that would be a strong counter-argument against poverty of the stimulus.
Replies from: gwern
comment by gwern · 2021-11-29T23:07:22.925Z · LW(p) · GW(p)

The 'poverty of stimulus' argument proves too much, and is just a rehash of the problem of induction, IMO. Everything that humans learn is ill-posed/underdetermined/vulnerable to skeptical arguments and problems like Duhem-Quine or the grue paradox. There's nothing special about language. And so - it all adds up to normality - since we solve those other inferential problems, why shouldn't we solve language equally easily and for the same reasons? If we are not surprised that lasso can fit a good linear model by having an informative prior about coefficients being sparse/simple, we shouldn't be surprised if human children can learn a language without seeing an infinity of every possible instance of a language or if a deep neural net can do similar things.
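The lasso point can be made concrete with a minimal sketch. It assumes an orthonormal design, in which case the lasso solution for the objective 0.5 * ||y - Xb||^2 + lam * ||b||_1 is just each least-squares coefficient passed through a soft-threshold. The coefficient values and the penalty below are made up for illustration.

```python
# Soft-thresholding: the lasso solution per coefficient under an orthonormal design.
# The input values and penalty strength are hypothetical.


def soft_threshold(z: float, lam: float) -> float:
    """Minimise 0.5 * (z - b)**2 + lam * abs(b) over b.

    `z` is the ordinary least-squares estimate of the coefficient.
    """
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


ols_estimates = [2.0, -1.5, 0.3, -0.1]  # hypothetical least-squares coefficients
lam = 0.5                               # hypothetical penalty strength
lasso_estimates = [soft_threshold(z, lam) for z in ols_estimates]
print(lasso_estimates)  # [1.5, -1.0, 0.0, 0.0]
```

The prior toward sparsity sets the small coefficients to exactly zero rather than merely shrinking them, which is how an underdetermined problem ends up with a simple answer.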

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-11-30T00:45:12.353Z · LW(p) · GW(p)

Right. So, what do you think about the AI-timelines-related claim then? Will we need medium or long-horizon training for a number of episodes within an OOM or three of parameter count to get something x-risky?

ETA: To put it more provocatively: If EfficientZero can beat humans at Atari using less game experience starting from a completely blank slate whereas humans have decades of pre-training, then shouldn't a human-brain-sized EfficientZero beat humans at any intellectual task given decades of experience at those tasks + decades of pre-training similar to human pre-training?

Replies from: gwern, conor-sullivan
comment by gwern · 2021-11-30T02:04:56.703Z · LW(p) · GW(p)

I have no good argument that a human-sized EfficientZero would somehow need to be much slower than humans.

Arguing otherwise sounds suspiciously like moving the goalposts after an AI effect: "look how stupid DL agents are, they need tons of data to few-shot stuff like challenging text tasks or image classifications, and they need OOMs more data on even something as simple as ALE games! So inefficient! So un-human-like! This should deeply concern any naive DL enthusiast, that the archs are so bad & inefficient." [later] "Oh no. Well... 'the curves cross', you know, this merely shows that DL agents can get good performance on uninteresting tasks, but human brains will surely continue showing their tremendous sample-efficiency in any real problem domain, no matter how you scale your little toys."

As I've said before, I continue to ask myself what it is that the human brain does with all the resources it uses, particularly with the estimates that put it at like 7 OOMs more than models like GPT-3 or other wackily high FLOPS-equivalence. It does not seem like those models do '0.0000001% of human performance', in some sense.

comment by Conor Sullivan (conor-sullivan) · 2021-11-30T02:24:29.213Z · LW(p) · GW(p)

Can EfficientZero beat Montezuma's Revenge?

Replies from: gwern
comment by gwern · 2021-11-30T02:28:43.622Z · LW(p) · GW(p)

Not out of the box, but it's also not designed at all for doing exploration. Exploration in MuZero is an obvious but largely (ahem) unexplored topic. Such is research: only a few people in the world can do research with MuZero on meaningful problems like ALE, and not everything will happen at once. I think the model-based nature of MuZero means that a lot of past approaches (like training an ensemble of MuZeros and targeting parts of the game tree where the models disagree most on their predictions) ought to port into it pretty easily. We'll see if that's enough to match Go-Explore.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2019-12-17T12:42:06.216Z · LW(p) · GW(p)

I think it is useful to distinguish between two dimensions of competitiveness: Resource-competitiveness and date-competitiveness. We can imagine a world in which AI safety is date-competitive with unsafe AI systems but not resource-competitive, i.e. the insights and techniques that allow us to build unsafe AI systems also allow us to build equally powerful safe AI systems, but it costs a lot more. We can imagine a world in which AI safety is resource-competitive but not date-competitive, i.e. for a few months it is possible to make unsafe powerful AI systems but no one knows how to make a safe version, and then finally people figure out how to make a similarly-powerful safe version and moreover it costs about the same.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-12-12T17:56:22.812Z · LW(p) · GW(p)

Has anyone done an expected value calculation, or otherwise thought seriously about, whether to save for retirement? Specifically, whether to put money into an account that can't be accessed (or is very difficult to access) for another twenty years or so, to get various employer matching or tax benefits?

I did, and came to the conclusion that it didn't make sense, so I didn't do it. But I wonder if anyone else came to the opposite conclusion. I'd be interested to hear their reasoning.

ETA: To be clear, I have AI timelines in mind here. I expect to be either dead, imprisoned, or in some sort of posthuman utopia by the time I turn 60 or whenever it was that my retirement account would unlock. Thus I expect the money in said account to be worthless; thus unless the tax and matching benefits were, like, 10X compared to regular investment, it wouldn't be worth it. And that's not even taking into account the fact that money is more valuable to me in the next 10 years than after I retire. And that's not even taking into account that I'm an altruist and am looking to spend money in ways that benefit the world -- I'm just talking about "selfish spending" here.
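The shape of this calculation can be sketched in a few lines. It is a deliberately crude model with assumptions of my own choosing: money kept liquid is normalised to a value of 1, money in the locked account is either fully useful at unlock time or worthless, and time preference is ignored. The 10% probability is a hypothetical input, not a number stated above.

```python
# Crude expected-value comparison: locked retirement account vs. liquid savings.
# Binary outcome model and the probability used are illustrative assumptions.


def locked_account_ev(p_useful: float, benefit_multiple: float) -> float:
    """Expected value of a unit of money in the locked account,
    relative to 1.0 for the same unit kept liquid.

    p_useful: probability the money is still worth something at unlock time
    benefit_multiple: combined tax + employer-match advantage (1.0 = none)
    """
    return p_useful * benefit_multiple


def break_even_multiple(p_useful: float) -> float:
    """Benefit multiple at which locking the money up matches keeping it liquid."""
    return 1.0 / p_useful


p = 0.10  # hypothetical: 10% chance the retirement account ends up mattering
print(break_even_multiple(p))
print(locked_account_ev(p, benefit_multiple=1.5))
```

In this toy model a "10X" break-even corresponds to roughly a 10% chance that the money is useful at unlock time; higher probabilities lower the bar proportionally.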

Replies from: Dagon, steven0461, Charlie Steiner, samuel-shadrach
comment by Dagon · 2021-12-12T21:03:05.454Z · LW(p) · GW(p)

There's a lot of detail behind "expect to be" that matters here.  It comes down to "when is the optimal time to spend this money" - with decent investment options, if your satisfactory lifestyle has unspent income, the answer is likely to be "later".  And then the next question is "how much notice will I have when it's time to spend it all"?

For most retirement savings, the tax and match options are enough to push some amount of your savings into that medium.  And it's not really locked up - early withdrawal carries penalties, generally not much worse than not getting the advantages in the first place.

And if you're liquidating because you think money is soon to be meaningless (for you, or generally), you can also borrow a lot, probably more than you could if you didn't have long-term assets to point to.

For me, the EV calculation comes out in favor of retirement savings.  I'm likely closer to it than you, but even so, the range of outcomes includes all of "unexpected death/singularity making savings irrelevant", "early liquidation for a pre-retirement use", and "actual retirement usage".  And all of that outweighs by a fair bit "marginal spending today".

Fundamentally, the question isn't "should I use investment vehicles targeted for retirement", but "What else am I going to do with the money that's higher-value for my range of projected future experiences"?

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-12-12T21:17:54.433Z · LW(p) · GW(p)

Very good point that I may not be doing much else with the money. I'm still saving it, just in more liquid, easy-to-access forms (e.g. stocks, crypto). I'm thinking it might come in handy sometime in the next 20 years during some sort of emergency or crunch time, or to handle unforeseen family expenses or something, or to donate to a good cause.

comment by steven0461 · 2021-12-12T21:44:19.444Z · LW(p) · GW(p)

"imprisoned"?

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-12-12T22:38:16.457Z · LW(p) · GW(p)

It's not obvious that unaligned AI would kill us. For example, we might be bargaining chips in some future negotiation with aliens.

comment by Charlie Steiner · 2021-12-12T19:04:53.014Z · LW(p) · GW(p)

My decision was pretty easy because I don't have any employer matching or any similarly large incentives. I don't think the tax incentives are big enough to make up for the inconvenience in the ~60% case where I want to use my savings before old age. However, maybe a mixed strategy would be more optimal.

comment by acylhalide (samuel-shadrach) · 2021-12-14T15:53:29.669Z · LW(p) · GW(p)

How much is the rate of return from your employer matching program? What about tax rate and policies in your country?

Stocks typically pay more than bonds or fixed deposits (FDs), and they don't lock your money up for a fixed period. Although yeah, it's still bad luck if you have to sell after a bear market.

The standard retirement strategy is to start stock-heavy and slowly shift money to bonds and cash as you near whatever age you want to start spending the money. This doesn't have to be a fixed date; for instance, if you expect to spend 20% of your money in the next 10 years and another 30% in the 10 years after that, you can calculate a strategy around that.
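
One rough way to "calculate a strategy around that" is to ask, at each point in time, what share of the remaining money is due within some window, and hold about that share in bonds and cash. The sketch below uses the 20% and 30% figures from the example; the 40-year deadline for the remaining 50%, the 10-year window, and the function name `near_term_share` are assumptions for illustration. It also treats each chunk as spent all at once at its deadline, which is a simplification.

```python
def near_term_share(schedule, now, window=10.0):
    """Share of the remaining money that is due within `window` years of `now`.

    schedule: list of (deadline_year, fraction_of_original_money) pairs.
    Money whose deadline has already passed is treated as spent.
    """
    remaining = sum(frac for deadline, frac in schedule if deadline > now)
    if remaining == 0:
        return 0.0
    near = sum(frac for deadline, frac in schedule
               if now < deadline <= now + window)
    return near / remaining


# 20% within 10 years, 30% within the following 10, the rest later (assumed: year 40).
schedule = [(10, 0.2), (20, 0.3), (40, 0.5)]
for year in (0, 10, 20, 30):
    share = near_term_share(schedule, year)
    print(f"year {year}: about {share:.0%} in bonds/cash, the rest in stocks")
```

The point is only that the safe share follows from the spending schedule rather than from a single fixed retirement date.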

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-06-16T20:52:28.123Z · LW(p) · GW(p)

Historical precedents for general vs. narrow AI

• Household robots vs. household appliances: Score one for Team Narrow
• Vehicles on roads vs. a network of pipes, tubes, and rails: Score one for Team General
• Ships that can go anywhere vs. a trade network of ships optimized for one specific route: Score one for Team General

(On the ships thing -- apparently the Indian Ocean trade was specialized prior to the Europeans, with cargo being transferred from one type of ship to another to handle different parts of the route, especially the Red Sea, which was dangerous for the type of oceangoing ship popular at the time. But then the Age of Sail happened.)

Obviously this is just three data points, two of which seem sorta similar because they both have to do with transporting stuff. It would be good to have more examples.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-04-09T10:27:26.233Z · LW(p) · GW(p)

Productivity app idea:

You set a schedule of times you want to be productive, and a frequency, and then it rings you at random (but with that frequency) to bug you with questions like:

--Are you "in the zone" right now? [Y] [N]

--(if no) What are you doing? [text box] [common answer] [common answer] [...]

The point is to cheaply collect data about when you are most productive and what your main time-wasters are, while also giving you gentle nudges to stop procrastinating/browsing/daydreaming/doomscrolling/working-sluggishly, take a deep breath, reconsider your priorities for the day, and start afresh.

Probably wouldn't work for most people but it feels like it might for me.
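
A minimal sketch of the "rings you at random (but with that frequency)" part. One way to do it, and this is my assumption rather than anything specified above, is to draw exponentially distributed gaps between pings (a Poisson process), so the next ping can't be predicted but the average rate matches the chosen frequency. The function name and parameters are hypothetical.

```python
import random


def ping_times(start_hour, end_hour, pings_per_hour, rng):
    """Return random ping times (in hours) within [start_hour, end_hour).

    Gaps between pings are exponentially distributed with mean
    1 / pings_per_hour, so pings arrive at the requested average rate.
    """
    times = []
    t = start_hour
    while True:
        t += rng.expovariate(pings_per_hour)
        if t >= end_hour:
            return times
        times.append(t)


# Example: a 9:00-17:00 schedule with about 1.5 pings per hour on average.
rng = random.Random(0)  # seeded only so the example is repeatable
for t in ping_times(9, 17, 1.5, rng):
    hours, minutes = int(t), int((t % 1) * 60)
    print(f"{hours:02d}:{minutes:02d}  Are you \"in the zone\" right now? [Y] [N]")
```

Logging each answer alongside its ping time would give the data on productive hours and common time-wasters.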

Replies from: gjm, An1lam, becausecurious
comment by gjm · 2021-04-09T11:38:45.176Z · LW(p) · GW(p)

"Are you in the zone right now?"

"... Well, I was."

Replies from: daniel-kokotajlo
comment by Daniel Kokotajlo (daniel-kokotajlo) · 2021-04-09T12:10:56.721Z · LW(p) · GW(p)

I'm betting that a little buzz on my phone which I can dismiss with a tap won't kill my focus. We'll see.

comment by NaiveTortoise (An1lam) · 2021-04-09T11:29:42.261Z · LW(p) · GW(p)

This is basically a souped-up version of TagTime (by the Beeminder folks), so you might be able to start with their implementation.

comment by becausecurious · 2021-04-09T11:17:39.717Z · LW(p) · GW(p)

I've been thinking about a similar idea.

This format of data collection is called "experience sampling". I suspect there might be ready-made solutions already.

Would you pay for such an app? If so, how much?

Also, it looks like your crux is actually becoming more productive (i.e. experience sampling is just a means to reach that). Perhaps just understanding your motivation better would help (basically http://mindingourway.com/guilt/)?