Daniel Kokotajlo's Shortform

post by Daniel Kokotajlo (daniel-kokotajlo) · 2019-10-08T18:53:22.087Z · score: 5 (2 votes) · LW · GW · 35 comments

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2019-10-08T18:53:22.262Z · score: 29 (10 votes) · LW(p) · GW(p)

My baby daughter was born two weeks ago, and in honor of her existence I'm building a list of about 100 technology-related forecasting questions, which will resolve in 5, 10, and 20 years. Questions like "By the time my daughter is 5/10/20 years old, the average US citizen will be able to hail a driverless taxi in most major US cities." (The idea is, tying it to my daughter's age will make it more fun and also increase the likelihood that I actually go back and look at it 10 years later.)

I'd love it if the questions were online somewhere so other people could record their answers too. Does this seem like a good idea? Hive mind, I beseech you: Help me spot ways in which this could end badly!

On a more positive note, any suggestions for how to do it? Any expressions of interest in making predictions with me?

Thanks!

EDIT: Now it's done. Though I have yet to import it to Foretold.io, it works perfectly fine in spreadsheet form.

comment by bgold · 2019-10-14T19:18:57.220Z · score: 8 (3 votes) · LW(p) · GW(p)

I'm interested, and I'd suggest using https://foretold.io for this

comment by Pablo_Stafforini · 2019-10-08T19:34:46.067Z · score: 5 (3 votes) · LW(p) · GW(p)

I love the idea. Some questions and their associated resolution dates may be of interest to the wider community of forecasters, so you could post them to Metaculus. Otherwise you could perhaps persuade the Metaculus admins to create a subforum, similar to ai.metaculus.com, for the other questions to be posted. Since Metaculus already has the subforum functionality, it seems a good idea to extend it in this way (perhaps a user's subforum could be associated with the corresponding username: e.g. user kokotajlo can post his own questions at kokotajlo.metaculus.com).

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-10-14T18:24:48.696Z · score: 18 (10 votes) · LW(p) · GW(p)

When I first read the now-classic arguments for slow takeoff -- e.g. from Paul and Katja -- I was excited; I thought they described a serious alternative scenario to the classic FOOM scenarios. However, I never thought, and still do not think, that the classic FOOM scenarios were very unlikely; I feel that the slow takeoff and fast takeoff scenarios are probably within a factor of 2 of each other in probability.

Yet more and more nowadays I get the impression that people think slow takeoff is the only serious possibility. For example, Ajeya and Rohin seem very confident that if TAI were coming in the next five to ten years we would already be seeing loads more economic applications of AI, and therefore that TAI isn't coming in the next five to ten years...

I need to process my thoughts more on this, and reread their claims; maybe they aren't as confident as they sound to me. But I worry that I need to go back to doing AI forecasting work after all (I left AI Impacts for CLR because I thought AI forecasting was less neglected) since so many people seem to have wrong views. ;)

This random rant/musing probably isn't valuable to anyone besides me, but hey, it's just a shortform. If you are reading this and you have thoughts or advice for me I'd love to hear it.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2019-11-03T17:47:33.554Z · score: 14 (5 votes) · LW(p) · GW(p)

It seems to me that human society might go collectively insane sometime in the next few decades. I want to be able to succinctly articulate the possibility and why it is plausible, but I'm not happy with my current spiel. So I'm putting it up here in the hopes that someone can give me constructive criticism:

I am aware of three mutually-reinforcing ways society could go collectively insane:

    1. Echo chambers/filter bubbles/polarization: Arguably political polarization is increasing across the world of liberal democracies today. Perhaps the internet has something to do with this--it’s easy to self-select into a newsfeed and community that reinforces and extremizes your stances on issues. Arguably recommendation algorithms have contributed to this problem in various ways--see e.g. “Sort by controversial” and Stuart Russell’s claims in Human Compatible. At any rate, perhaps some combination of new technology and new cultural or political developments will turbocharge this phenomenon. This could lead to civil wars, or more mundanely, societal dysfunction. We can’t coordinate to solve collective action problems relating to AGI if we are all arguing bitterly with each other about culture war issues.
    2. Deepfakes/propaganda/persuasion tools: Already a significant portion of online content is deliberately shaped by powerful political agendas--e.g. Russia, China, and the US political tribes. Much of the rest is deliberately shaped by less powerful apolitical agendas, e.g. corporations managing their brands or teenagers in Macedonia making money by spreading fake news during US elections. Perhaps this trend will continue; technology like chatbots, language models, deepfakes, etc. might make it cheaper and more effective to spew this sort of propaganda, to the point where most online content is propaganda of some sort or other. This could lead to the deterioration of our collective epistemology.
    3. Memetic evolution: Ideas ("Memes") can be thought of as infectious pathogens that spread, mutate, and evolve to coexist with their hosts. The reasons why they survive and spread are various; perhaps they are useful to their hosts, or true, or perhaps they are able to fool the host's mental machinery into keeping them around even though they are not useful or true. As the internet accelerates the process of memetic evolution--more memes, more hosts, more chances to spread, more chances to mutate--the mental machinery humans have for selecting useful and/or true memes may become increasingly outclassed. Analogy: The human body has also evolved to select for microbes that are useful (e.g. gut, skin, and saliva bacteria) and against those that are harmful (e.g. the flu virus). But when, e.g., colonialism, globalism, and World War One brought more people together and allowed diseases to spread more easily, great plagues swept the globe. Perhaps something similar will happen (is already happening?) with memes.

comment by Ben Pace (Benito) · 2019-11-03T18:01:10.467Z · score: 5 (3 votes) · LW(p) · GW(p)

All good points, but I feel like objecting to the assumption that society is currently sane and that we'll then see a discontinuity, rather than any insanity being a continuation of current trajectories.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2019-11-03T23:36:00.977Z · score: 1 (1 votes) · LW(p) · GW(p)

I agree with that actually; I should correct the spiel to make it clear that I do. Thanks!

comment by Zack_M_Davis · 2019-11-03T19:52:03.399Z · score: 3 (2 votes) · LW(p) · GW(p)

Related: "Is Clickbait Destroying Our General Intelligence?" [LW · GW]

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-07-01T21:46:53.545Z · score: 10 (5 votes) · LW(p) · GW(p)

For fun:

“I must not step foot in the politics. Politics is the mind-killer. Politics is the little-death that brings total obliteration. I will face my politics. I will permit it to pass over me and through me. And when it has gone past I will turn the inner eye to see its path. Where the politics has gone there will be nothing. Only I will remain.”

Makes about as much sense as the original quote, I guess. :P

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-09-04T10:06:56.888Z · score: 9 (5 votes) · LW(p) · GW(p)

Eric Drexler has argued that the computational capacity of the human brain is equivalent to about 1 PFlop/s, that is, we are already past the human-brain-human-lifetime milestone. (Here is a gdoc.) The idea is that we can identify parts of the human brain that seem to perform similar tasks to certain already-existing AI systems. It turns out that, e.g., roughly one-thousandth of the human brain is used to do the same sort of image-processing tasks that seem to be handled by modern image-processing AI... so an AI 1000x bigger than said AI should be able to do the same things as the whole human brain, at least in principle.
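
To make the scaling step concrete, here is a minimal back-of-the-envelope sketch; the 1 TFLOP/s figure for the vision model is an illustrative assumption, not a number taken from Drexler's doc:

```python
# Back-of-the-envelope version of the scaling argument.
# The vision-model figure is an illustrative assumption, not a number from Drexler's doc.

fraction_of_brain_for_vision = 1 / 1000  # assumed share of the brain doing these image-processing tasks
vision_model_flops = 1e12                # assumed FLOP/s to run a comparable vision model (illustrative)

# If ~1/1000 of the brain matches a ~1 TFLOP/s vision model, the whole brain
# scales to roughly 1000x that, i.e. about 1 PFLOP/s.
whole_brain_estimate = vision_model_flops / fraction_of_brain_for_vision
print(f"Estimated whole-brain compute: {whole_brain_estimate:.0e} FLOP/s")  # ~1e+15
```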

Has this analysis been run with nonhuman animals? For example, a chicken brain is a lot smaller than a human brain, but chickens can still do image recognition, so perhaps the part of the chicken's brain that does image recognition is smaller than the corresponding part of the human brain.

comment by avturchin · 2020-09-04T12:05:30.887Z · score: 6 (3 votes) · LW(p) · GW(p)

It is known that birds' brains are much more mass-efficient than mammalian brains.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2019-10-08T22:07:48.572Z · score: 9 (6 votes) · LW(p) · GW(p)

For the past year I've been thinking about the Agent vs. Tool debate (e.g. thanks to reading CAIS/Reframing Superintelligence) and also about embedded agency and mesa-optimizers and all of these topics seem very related now... I keep finding myself attracted to the following argument skeleton:

Rule 1: If you want anything unusual to happen, you gotta execute a good plan.

Rule 2: If you want a good plan, you gotta have a good planner and a good world-model.

Rule 3: If you want a good world-model, you gotta have a good learner and good data.

Rule 4: Having good data is itself an unusual happenstance, so by Rule 1 if you want good data you gotta execute a good plan.

Putting it all together: Agents are things which have good planning and learning capacities and are hooked up to actuators in some way. Perhaps they are also "seeded" with a decent world-model to start off with. Then they get a nifty feedback loop going: they make decent plans, which allow them to get decent data, which allows them to build better world-models, which allows them to make better plans and get better data, so they can get great world-models and make great plans and... etc. (The best agents will also be improving their learning and planning algorithms! Humans do this, for example.)
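
A toy sketch of that loop, just to make the shape of the argument explicit; every function here is a placeholder stub I made up, not a claim about how a real agent would be built:

```python
# Toy rendering of the plan -> act -> data -> world-model loop described above.
# Every component is a placeholder stub; only the shape of the feedback loop matters here.

def learn(world_model, data):
    """Update the world-model with newly gathered data (stub: just accumulate it)."""
    return world_model + data

def plan(world_model, goal):
    """Produce a plan using the current world-model (stub)."""
    return f"plan for {goal!r} informed by {len(world_model)} observations"

def act(current_plan):
    """Execute the plan and return the data gathered along the way (stub)."""
    return [f"observation produced by {current_plan!r}"]

world_model = ["seed knowledge"]  # the agent starts "seeded" with a decent world-model
for _ in range(3):
    current_plan = plan(world_model, goal="something unusual")  # Rule 2: good world-model -> good plan
    new_data = act(current_plan)                                # Rule 4: good plan -> good data
    world_model = learn(world_model, new_data)                  # Rule 3: good data -> better world-model
print(world_model)
```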

Empirical conjecture: Tools suck; agents rock, and that's why. It's also why agenty mesa-optimizers will arise, and it's also why humans with tools will eventually be outcompeted by agent AGI.

comment by Pattern · 2019-10-10T04:15:31.130Z · score: 2 (2 votes) · LW(p) · GW(p)

How would you test the conjecture?

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2019-10-10T16:27:03.528Z · score: 2 (2 votes) · LW(p) · GW(p)

The ultimate test will be seeing whether the predictions it makes come true--whether agenty mesa-optimizers arise often, whether humans with tools get outcompeted by agent AGI.

In the meantime, it's not too hard to look for confirming or disconfirming evidence. For example, the fact that militaries and corporations that make a plan and then task their subordinates with strictly following the plan invariably do worse than those who make a plan and then give their subordinates initiative and flexibility to learn and adapt on the fly... seems like confirming evidence. (See: agile development model, the importance of iteration and feedback loops in startup culture, etc.) Whereas perhaps the fact that AlphaZero is so good despite lacking a learning module is disconfirming evidence.

As for a test, well, we'd need to have something that proponents and opponents agree to disagree on, and that might be hard to find. Most tests I can think of now don't work because everyone would agree on what the probable outcome is. I think the best I can do is: someday soon we might be able to test an agenty architecture and a non-agenty architecture in some big, complex, novel game environment, and this conjecture would predict that for sufficiently complex and novel environments the agenty architecture would win.

comment by bgold · 2019-10-14T17:47:21.312Z · score: 2 (2 votes) · LW(p) · GW(p)

I'd agree w/ the point that giving subordinates plans and the freedom to execute them as best as they can tends to work out better, but that seems to be strongly dependent on other context, in particular the field they're working in (ex. software engineering vs. civil engineering vs. military engineering), cultural norms (ex. is this a place where agile engineering norms have taken hold?), and reward distributions (ex. does experimenting by individuals hold the potential for big rewards, or are all rewards likely to be distributed in a normal fashion such that we don't expect to find outliers).

My general model is that in certain fields humans look more tool-shaped and in others more agent-shaped. For example, an Uber driver executing instructions from the central command-and-control algorithm doesn't require as much of the planning and world-modeling behavior. One way this could apply to AI is that sub-agents of an agent AI would be tools.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2019-10-15T15:27:57.249Z · score: 3 (3 votes) · LW(p) · GW(p)

I agree. I don't think agents will outcompete tools in every domain; indeed in most domains perhaps specialized tools will eventually win (already, we see humans being replaced by expensive specialized machinery, or by expensive human specialists, in lots of places). But I still think that there will be strong competitive pressure to create agent AGI, because there are many important domains where agency is an advantage.

comment by gwern · 2019-10-15T16:12:40.266Z · score: 9 (4 votes) · LW(p) · GW(p)

Expensive specialized tools are themselves learned by and embedded inside an agent to achieve goals. They're simply mesa-optimization in another guise. eg AlphaGo learns a reactive policy which does nothing which you'd recognize as 'planning' or 'agentiness' - it just maps a grid of numbers (board state) to another grid of numbers (value function estimates of a move's value). A company, beholden to evolutionary imperatives, can implement internal 'markets' with 'agents' if it finds that useful for allocating resources across departments, or use top-down mandates if those work better, but no matter how it allocates resources, it's all in the service of an agent, and any distinction between the 'tool' and 'agent' parts of the company is somewhat illusory.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-07-21T12:04:06.107Z · score: 8 (4 votes) · LW(p) · GW(p)

One thing I find impressive about GPT-3 is that it's not even trying to generate text.

Imagine that someone gave you a snippet of random internet text, and told you to predict the next word. You give a probability distribution over possible next words. The end.

Then, your twin brother gets a snippet of random internet text, and is told to predict the next word. Etc. Unbeknownst to either of you, the text your brother gets is the text you got, with a new word added to it according to the probability distribution you predicted.

Then we repeat with your triplet brother, then your quadruplet brother, and so on.

Is it any wonder that sometimes the result doesn't make sense? All it takes for the chain of words to get derailed is one unlucky word drawn from someone's distribution of next-word predictions. GPT-3 doesn't have the ability to "undo" words it has written; it can't even tell its future self what its past self had in mind when it "wrote" a word!
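
In other words, generation is just repeated sampling from next-word distributions with no backtracking. A minimal sketch of that loop, where `next_word_distribution` is a made-up stand-in for the model rather than anything from the real GPT-3 API:

```python
import random

def next_word_distribution(text):
    """Made-up stand-in for the model: return candidate next words and their probabilities."""
    # A real model would condition on `text`; this stub just returns a fixed distribution.
    return ["the", "cat", "sat", "quux"], [0.5, 0.3, 0.15, 0.05]

def generate(prompt, n_words=10):
    text = prompt
    for _ in range(n_words):
        words, probs = next_word_distribution(text)
        # One unlucky draw here is enough to derail the rest of the chain,
        # and there is no mechanism for going back and undoing it.
        text += " " + random.choices(words, weights=probs, k=1)[0]
    return text

print(generate("Once upon a time"))
```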

EDIT: I just remembered Ought's experiment with getting groups of humans to solve coding problems by giving each human 10 minutes to work on it and then passing it on to the next. The results? Humans sucked. The overall process was way less effective than giving 1 human a long period to solve the problem. Well, GPT-3 is like this chain of humans!

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-10-03T16:22:21.488Z · score: 6 (3 votes) · LW(p) · GW(p)

GPT-3 app idea: Web assistant. Sometimes people want to block out the internet from their lives for a period, because it is distracting from work. But one sometimes needs the internet for work anyway, e.g. to google a few things, fire off an email, look up a citation, or find a stock image for the diagram you are making. Solution: an app that can do stuff like this for you. You put in your request, and it googles, finds, and summarizes the answer, maybe using GPT-3 to also check whether the answer it returns seems like a good answer to the request you made, etc. It doesn't have to work all the time, or for all requests, to be useful. As long as it doesn't mislead you, the worst that happens is that you have to wait till your internet fast is over (or break your fast).
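
A rough sketch of how such an assistant might be wired together; `search_web`, `summarize`, and `looks_responsive` are hypothetical stubs standing in for a search backend and a language-model backend, not real APIs:

```python
# Hypothetical sketch of the "web assistant" idea described above.
# search_web, summarize, and looks_responsive are stand-in stubs, not real APIs.

def search_web(query):
    """Stub for a search backend: return a few text snippets for the query."""
    return [f"snippet 1 about {query}", f"snippet 2 about {query}"]

def summarize(snippets, query):
    """Stub for a language model condensing the snippets into an answer."""
    return f"Summary of {len(snippets)} results for {query!r}"

def looks_responsive(answer, query):
    """Stub for a second language-model pass that checks the answer addresses the request."""
    return len(answer) > 0

def handle_request(query):
    snippets = search_web(query)
    answer = summarize(snippets, query)
    if looks_responsive(answer, query):
        return answer
    # Failing gracefully is fine: the user just waits until their internet fast is over.
    return "Couldn't find a good answer to that one."

print(handle_request("stock image of a feedback loop"))
```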

I don't think this is a great idea but I think there'd be a niche for it.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-03-02T16:34:59.034Z · score: 6 (3 votes) · LW(p) · GW(p)

Searching for equilibria can be infohazardous. You might not like the one you find first, but you might end up sticking with it (or worse, deviating from it and being punished). This is because which equilibrium gets played by other people depends (causally or, in some cases, acausally) not just on which equilibrium you play but even on which equilibria you think about, for reasons having to do with Schelling points. A strategy that sometimes works to avoid these hazards is to impose constraints on which equilibria you think about, or at any rate to perform a search through equilibria-space that is guided in some manner so as to be unlikely to find equilibria you won't like. For example, here is one such strategy: start with a proposal that is great for you and would make you very happy. Then think of the ways in which this proposal is unlikely to be accepted by other people, and modify it slightly to make it more acceptable to them while keeping it pretty good for you. Repeat until you get something they'll probably accept.
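
A toy sketch of that last strategy, just to show the shape of the guided search; both scoring functions are made-up stand-ins for "how good this is for me" and "how likely they are to accept it":

```python
# Toy sketch of the guided search: start from a proposal that is great for me,
# then concede a little at a time until it is likely to be accepted.
# Both scoring functions are made-up stand-ins for illustration.

def my_value(proposal):
    """How good the proposal is for me (stand-in: I prefer smaller concessions)."""
    return 10 - proposal

def their_acceptability(proposal):
    """How likely they are to accept (stand-in: they prefer bigger concessions)."""
    return proposal / 10

proposal = 1  # start with something that would make me very happy
while their_acceptability(proposal) < 0.7 and my_value(proposal) > 2:
    proposal += 1  # modify it slightly to make it more acceptable to them

# Only proposals along this path were ever considered; the search never wanders
# off toward equilibria I would hate to end up stuck with.
print(proposal, my_value(proposal), their_acceptability(proposal))
```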

comment by Dagon · 2020-03-02T17:12:08.177Z · score: 4 (2 votes) · LW(p) · GW(p)

I'm not sure I follow the logic. When you say "searching for equilibria", do you mean "internally predicting the likelihood of points and durations of an equilibrium (as most of what we worry about aren't stable)"? Or do you mean the process of application of forces and observation of counter-forces in which the system is "finding its level"? Or do you mean "discussion about possible equilibria, where that discussion is in fact a force that affects the system"?

Only the third seems to fit your description, and I think that's already covered by standard infohazard writings - the risk that you'll teach others something that can be used against you.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-03-02T17:48:24.794Z · score: 4 (2 votes) · LW(p) · GW(p)

I meant the third, and I agree it's not a particularly new idea, though I've never seen it said this succinctly or specifically. (For example, it's distinct from "the risk that you'll teach others something that can be used against you," except maybe in the broadest sense.)

comment by Dagon · 2020-03-02T18:24:26.813Z · score: 2 (1 votes) · LW(p) · GW(p)

Interesting. I'd like to explore the distinction between "risk of converging on a dis-preferred social equilibrium" (which I'd frame as "making others aware that this equilibrium is feasible") and other kinds of revealing information which others use to act in ways you don't like. I don't see much difference.

The more obvious cases ("here are plans to a gun that I'm especially vulnerable to") don't get used much unless you have explicit enemies, while the more subtle ones ("I can imagine living in a world where people judge you for scratching your nose with your left hand") require less intentionality of harm directed at you. But it's the same mechanism and info-risk.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-03-02T19:57:33.987Z · score: 2 (1 votes) · LW(p) · GW(p)

For one thing, the equilibrium might not actually be feasible, but making others aware that you have thought about it might nevertheless have harmful effects (e.g. they might mistakenly think that it is, or they might correctly realize something in the vicinity is.) For another, "teach others something that can be used against you" while technically describing the sort of thing I'm talking about, tends to conjure up a very different image in the mind of the reader -- an image more like your gun plans example.

I agree there is not a sharp distinction between these, probably. (I don't know, didn't think about it.) I wrote this shortform because, well, I guess I thought of this as a somewhat new idea -- I thought of most infohazards talk as being focused on other kinds of examples. Thank you for telling me otherwise!

comment by Dagon · 2020-03-02T20:57:24.678Z · score: 4 (2 votes) · LW(p) · GW(p)

(Oops. I now realize this probably came across wrong.) Sorry! I didn't intend to be telling you things, nor did I mean to imply that pointing out more subtle variants of known info-hazards was useless. I really appreciate the topic, and I'm happy to have exactly as much text as we have exploring non-trivial applications of the infohazard concept and helping identify whether further categorization is helpful (I'm not convinced, but I probably don't have to be).

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2019-12-17T23:54:31.183Z · score: 5 (3 votes) · LW(p) · GW(p)

Two months ago [LW(p) · GW(p)] I said I'd be creating a list of predictions about the future in honor of my baby daughter Artemis. Well, I've done it, in spreadsheet form. The prediction questions all have a theme: "Cyberpunk." I intend to make it a fun new year's activity to go through and make my guesses, and then every five years on her birthdays I'll dig up the spreadsheet and compare predictions to reality.

I hereby invite anybody who is interested to go in and add their own predictions to the spreadsheet. Also feel free to leave comments asking for clarifications and proposing new questions or resolution conditions.

I'm thinking about making a version in Foretold.io, since that's where people who are excited about making predictions live. But the spreadsheet I have is fine as far as I'm concerned. Let me know if you have an opinion one way or another.

(Thanks to Kavana Ramaswamy and Ramana Kumar for helping out!)

comment by ozziegooen · 2019-12-18T01:10:32.229Z · score: 8 (4 votes) · LW(p) · GW(p)

Hi Daniel!

We (Foretold) have been recently experimenting with "notebooks", which help structure tables for things like this.

I think a notebook/table setup for your spreadsheet could be a decent fit. These take a bit of time to set up now (because we need to generate each cell using a separate tool), but we could help with that if this looks interesting to you.

Here are some examples:
https://www.foretold.io/c/0104d8e8-07e4-464b-8b32-74ef22b49f21/n/6532621b-c16b-46f2-993f-f72009d16c6b
https://www.foretold.io/c/47ff5c49-9c20-4f3d-bd57-1897c35cd42d/n/2216ee6e-ea42-4c74-9b11-1bde30c7dd02
https://www.foretold.io/c/1bea107b-6a7f-4f39-a599-0a2d285ae101/n/5ceba5ae-60fc-4bd3-93aa-eeb333a15464

You can click on cells to add predictions to them.

Foretold is more experimental than Metaculus and doesn't have as large a community. But it could be a decent fit for this (and this should get better in the next 1-3 months, as notebooks get improved).

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2019-12-18T10:45:10.064Z · score: 5 (3 votes) · LW(p) · GW(p)

OK, thanks Ozzie. On your recommendation I'll try to make this work: I'll see how it goes, see if I can do it myself, and reach out to you if it seems hard.

comment by ozziegooen · 2019-12-18T12:46:47.299Z · score: 2 (1 votes) · LW(p) · GW(p)

Sure thing. We don't have documentation for how to do this yet, but you can get an idea from seeing the "Markdown" of some of those examples.

The steps to do this:

  1. Make a bunch of measurables.
  2. Get the IDs of all of those measurables (you can see these in the Details tabs on the bottom)
  3. Create the right notebook/table, and add all the correct IDs to the right places within them.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2019-12-22T15:06:21.805Z · score: 4 (2 votes) · LW(p) · GW(p)

OK, some questions:

1. By measurables you mean questions, right? Using the "New question" button? Is there a way for me to have a single question of the form "X is true" and then have four columns, one for each year (2025, 2030, 2035, 2040), where people can put in four credences for whether X will be true in each of those years?

2. I created a notebook/table with what I think are correctly formatted columns. Before I can add a "data" section to it, I need IDs, and for those I need to have made questions, right?

comment by ozziegooen · 2019-12-22T16:42:32.581Z · score: 2 (1 votes) · LW(p) · GW(p)
  1. Yes, sorry. Yep, you need to use the "New question" button. If you want separate things for 4 different years, you need to make 4 different questions. Note that you can edit the names & descriptions in the notebook view, so you can make them initially with simple names, then later add the true names to be more organized.

  2. You are correct. In the "details" sections of questions, you can see their IDs. These are the items to use.

You can of course edit notebooks after making them, so you may want to first make it without the IDs, then once you make the questions, add the IDs in, if you'd prefer.

comment by DanielFilan · 2019-12-18T00:24:36.858Z · score: 4 (2 votes) · LW(p) · GW(p)

I'm thinking about making a version in Foretold.io, since that's where people who are excited about making predictions live.

Well, many of them live on Metaculus.

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2019-12-18T10:44:05.025Z · score: 5 (3 votes) · LW(p) · GW(p)

Right, no offense intended, haha! (I already made a post about this on Metaculus; don't worry, I didn't forget them, except in this post here!)

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-09-15T11:07:40.288Z · score: 4 (2 votes) · LW(p) · GW(p)

Does the lottery ticket hypothesis have weird philosophical implications?

As I understand it, the LTH says that insofar as an artificial neural net eventually acquires a competency, it's because even at the beginning, when it was randomly initialized, there was a sub-network that happened to already have that competency to some extent. The training process was mostly a process of strengthening that sub-network relative to all the others, rather than making that sub-network more competent.
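
A toy sketch of that picture, assuming numpy; the random mask below just stands in for the "winning ticket" sub-network, since actually finding such a mask is the hard part in practice:

```python
import numpy as np

rng = np.random.default_rng(0)

# A randomly initialized "network": just one weight matrix, for illustration.
weights = rng.normal(size=(8, 8))

# The LTH picture: some sparse sub-network (a mask over the weights) already
# performs the task passably well at initialization...
winning_ticket_mask = rng.random(size=weights.shape) < 0.1  # illustrative; finding it is the hard part

# ...and training mostly strengthens that sub-network relative to the rest,
# rather than creating the competency from scratch.
trained = weights * winning_ticket_mask * 3.0 + weights * ~winning_ticket_mask * 0.1
print(np.count_nonzero(winning_ticket_mask), "weights in the 'winning ticket'")
```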

Suppose the LTH is true of human brains as well. Apparently at birth we have almost all the neurons that we will ever have. So... it follows that the competencies we have later in life are already present in sub-networks of our brain at birth.

So does this mean e.g. that there's some sub-network of my 1yo daughter's brain that is doing deep philosophical reflection on the meaning of life right now? It's just drowned out by all the other random noise and thus makes no difference to her behavior?

comment by Daniel Kokotajlo (daniel-kokotajlo) · 2019-12-17T12:42:06.216Z · score: 3 (2 votes) · LW(p) · GW(p)

I think it is useful to distinguish between two dimensions of competitiveness: Resource-competitiveness and date-competitiveness. We can imagine a world in which AI safety is date-competitive with unsafe AI systems but not resource-competitive, i.e. the insights and techniques that allow us to build unsafe AI systems also allow us to build equally powerful safe AI systems, but it costs a lot more. We can imagine a world in which AI safety is resource-competitive but not date-competitive, i.e. for a few months it is possible to make unsafe powerful AI systems but no one knows how to make a safe version, and then finally people figure out how to make a similarly-powerful safe version and moreover it costs about the same.