Comments

Comment by lawchan on Stable Pointers to Value III: Recursive Quantilization · 2018-07-29T01:30:20.334Z · score: 3 (3 votes) · LW · GW

I share both of these intuitions.

That being said, I'm not convinced that the space of concepts is smaller as you get more meta. (Naively speaking, there are ~exponentially more distributions over distributions than distributions, though some strong simplicity biases can cut this down a lot.) I suspect that one reason it seems that the space of concepts is "smaller" is because we're worse at differentiating concepts at higher levels of meta-ness. For example, it seems that it's often easier to figure out what the consequences of concrete action X are than the consequences of adopting a particular ethical system, and a lot of philosophy on metaethics seems more confused than philosophy on ethics. I think this is related to the "it's more difficult to get feedback" intuition, where we have fewer distinct buckets because it's too hard to distinguish between similar theories at sufficiently high meta-levels.
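
To make the counting in that parenthetical concrete, here is a toy discretization of my own (coarse-graining every probability to a multiple of 1/k, an arbitrary choice), not something from the original discussion:

```latex
% Toy counting sketch (my own discretization, not from the original comment):
% restrict every probability to a multiple of 1/k over n base outcomes.
\[
  N \;=\; \#\{\text{distributions over } n \text{ outcomes}\}
    \;=\; \binom{n + k - 1}{n - 1},
\]
\[
  \#\{\text{distributions over those } N \text{ distributions}\}
    \;=\; \binom{N + k - 1}{N - 1},
\]
% so each step up the meta-ladder feeds the (already much larger) previous count N
% back in as the new number of "outcomes"; a strong simplicity prior that puts almost
% all its mass on a few of the N points is what keeps this blow-up in check.
```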

Comment by lawchan on Non-Adversarial Goodhart and AI Risks · 2018-03-28T19:05:24.718Z · score: 7 (2 votes) · LW · GW

I'm pretty sure that solving the "hard problem of correctly identifying causality" is a major goal of MIRI's decision theory research.

In what sense is discovering causality NP-hard? There's the trivial sense in which you can embed an NP-hard problem (or tasks of higher complexity) into the real world, and there's the sense in which exact inference in Bayesian networks is itself NP-hard.
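
For concreteness, the reduction I have in mind for the latter goes roughly like this (a toy sketch with a made-up formula, computing the relevant marginal by brute force rather than with a real Bayes-net library):

```python
from itertools import product

# Toy sketch of the SAT-to-inference reduction (the clauses are made up):
# treat X1..Xn as independent fair coins and add a deterministic node that is
# true iff the 3-SAT formula holds. The marginal P(formula holds) is positive
# iff the formula is satisfiable, so exact inference answers SAT.
clauses = [(1, -2, 3), (-1, 2, -3)]  # literal i means Xi, -i means "not Xi"
n = 3

def formula_holds(assignment):
    # assignment maps variable index -> bool
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses)

# Brute-force the marginal of the deterministic "formula holds" node.
p_true = sum(formula_holds(dict(zip(range(1, n + 1), bits)))
             for bits in product([False, True], repeat=n)) / 2 ** n

print(p_true)      # marginal probability of the clause node
print(p_true > 0)  # True iff the formula is satisfiable
```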

Can you elaborate on why AIXI/Solomonoff induction is an unsafe utility maximizer, even for Cartesian agents?

Comment by LawChan on [deleted post] 2018-03-20T05:07:13.610Z

I skimmed some of Crick and read some commentary on him, and Crick seems to take the Hobbesian "politics as a necessary compromise" viewpoint. (I wasn't convinced by his definition of the word politics, which seemed not to point at what I would point at as politics.)

My best guess: I think they're arguing not that immature discourse is okay, but that we need to be more polite toward people's views in general for political reasons, as long as the people are acting somewhat in good faith (I suspect they think that you're not being sufficiently polite toward those you're trying to throw out of the Overton window). As a result, we need to engage less in harsh criticism when it might be seen as threatening.

That being said, I also suspect that Duncan would agree that we need to be charitable. I suspect the actual disagreement is whether the behavior of the critics Duncan is replying to is actually the sort of behavior we want/need to accept in our community.

(Personally, I think we need to be more willing to do real-life experiments, even if they risk going somewhat wrong. And I think some of the Tumblr criticism definitely fell outside of what I would want in the Overton window. So I'm okay with Duncan's parenthetical, though it would have been nicer if it had been more explicit about who it was responding to.)

Comment by LawChan on [deleted post] 2018-03-20T04:45:55.494Z

I also think I wouldn't have understood his comments without knowing MTG, or at least having read Duncan's explanation of the MTG color wheel.

(Nitpicking) Though I'd add that MTG doesn't have a literal Blue Knight card either, so I doubt it's that reference. (There are knights that are blue and green, but none with the exact names "Blue Knight" or "Green Knight".)

Comment by LawChan on [deleted post] 2018-03-19T04:10:39.434Z

Thanks for posting this. I found the framing of the different characters very insightful.

Comment by lawchan on Raising funds to establish a new AI Safety charity · 2018-03-17T18:32:09.572Z · score: 44 (12 votes) · LW · GW

After looking into the prototype course, I updated upwards on this project, as I think it is a decent introduction to Dylan's Off-Switch Game paper. Could I ask what other material RAISE wants to cover in the course? What other work on corrigibility are you planning to cover? (For example, Dylan's other work, MIRI's work on this subject, and Smitha Milli's paper?)

Could you also write more about who your course is targeting? Why does RAISE believe that the best way to fix the talent gap in AI safety is to help EAs change careers via introductory AI Safety material, instead of, say, making it easier for CS PhD students to do research on AI Safety-relevant topics? Why do we need to build a campus, instead of co-opting the existing education mechanisms of academia?

Finally, could you link some of the mind maps and summaries RAISE has created?

Comment by lawchan on The Utility of Human Atoms for the Paperclip Maximizer · 2018-02-03T17:07:21.358Z · score: 8 (2 votes) · LW · GW

Makes sense.

Comment by lawchan on The Utility of Human Atoms for the Paperclip Maximizer · 2018-02-03T17:06:35.933Z · score: 3 (1 votes) · LW · GW

Thanks! I think it makes sense to link it at the start, so new readers can get context for what you're trying to do.

Comment by lawchan on Sources of intuitions and data on AGI · 2018-02-03T16:56:30.459Z · score: 8 (2 votes) · LW · GW

Yeah, I think Ben captures my objection - IDA captures what is different between your approach and MIRI's agenda, but not what is different between some existing AI systems and your approach.

This might not be a bad thing - perhaps you want to choose a name that is evocative of existing approaches to stress that your approach is the natural next step for AI development, for example.

Comment by lawchan on The Utility of Human Atoms for the Paperclip Maximizer · 2018-02-03T05:15:35.014Z · score: 7 (2 votes) · LW · GW

Could I ask what the motivation behind this post was?

Comment by lawchan on The Utility of Human Atoms for the Paperclip Maximizer · 2018-02-03T05:13:35.369Z · score: 9 (3 votes) · LW · GW

I think they're referring to the fact that they wouldn't expect a Friendly AI to deconstruct them.

Also, for some reason, the link is wonky - likely because LessWrong 2.0 parses text contained in underscores as italics. Here's the fixed link:

https://en.wikipedia.org/wiki/Friendly_artificial_intelligence

Comment by lawchan on The Utility of Human Atoms for the Paperclip Maximizer · 2018-02-03T05:11:10.241Z · score: 7 (2 votes) · LW · GW

Hm, I noticed that your link showed up quite wonky. Here's a fixed version:

http://lesswrong.com/lw/mgf/a_map_agi_failures_modes_and_levels/

Comment by lawchan on Paper Trauma · 2018-02-03T05:09:56.278Z · score: 6 (3 votes) · LW · GW

> I am always intensely skeptical of people who don't bring notebooks to meetings. Sometimes I'm the only one present with a notebook. What, you think you're going to just remember the twenty details and action items that were agreed on?

I generally don't bring a notebook to meetings when I expect a decent quality note-taker. I find that taking notes while listening often distracts from my ability to generate novel thoughts, especially if I'm spending more than half the time just taking notes. (And as I don't write particularly fast, this tends to happen unless I stick only to writing down a very small fraction of interesting conversations!)

Comment by lawchan on Paper Trauma · 2018-02-03T05:07:24.546Z · score: 16 (5 votes) · LW · GW

For reference: Andrew Critch's post arguing for using a large notebook to think.

Comment by lawchan on The Utility of Human Atoms for the Paperclip Maximizer · 2018-02-03T05:00:18.522Z · score: 6 (2 votes) · LW · GW

"Friendly", presumably.

Comment by lawchan on Paper Trauma · 2018-02-03T04:56:43.142Z · score: 16 (6 votes) · LW · GW

I strongly second the stick-to-the-wall whiteboards recommendation!

I actually suspect that the performance improvement for markers over pens is due in part to increased legibility - both from the tendency to write larger when using a marker (I know that I tend to draw really tiny diagrams with pens) and because markers leave a much thicker mark on the paper.

Comment by lawchan on Sources of intuitions and data on AGI · 2018-02-03T04:50:25.234Z · score: 17 (4 votes) · LW · GW

I remember hearing people call it iterative distillation and amplification (IDA), but I think this name might be too general.

Comment by lawchan on Hammers and Nails · 2018-01-31T18:54:11.766Z · score: 3 (1 votes) · LW · GW

I think a lot of the intuitions and thought processes that let you come up with new discoveries in mathematics and machine learning aren't generally taught in classes or covered in textbooks. People are also quite bad at conveying their intuitions about topics directly when asked to in Q&As and speeches. I think that, at least in machine learning, hanging out with good ML researchers teaches me a lot about how to think about problems, in a way that I haven't been able to get even after reading their course notes and listening to their presentations. Similarly, I suspect that autobiographies may help convey the experience of solving problems in a way that actually lets you learn the intuitions or thought processes used by the author.

Comment by lawchan on Hammers and Nails · 2018-01-31T18:40:43.643Z · score: 11 (3 votes) · LW · GW

Yeah, I agree on the stretching point.

The main distinguishing thing about Feynman, at least from reading Feynman's two autobiographies, seemed to be how irreverent he is. He doesn't do science because it's super important; he does the science he finds fun or interesting. He is constantly going on rants about the default way of looking at things (at least his inner monologue is) and ignoring authority, whether by blowing up at the science textbooks he was asked to read, ignoring how presidential committees traditionally functioned, or disagreeing with doctors. He goes to strip clubs because he likes interacting with pretty girls. It's really quite different from the rather stodgy utilitarian/outside mindset I tend to reference by default, and I think reading his autobiographies gave me a lot more of what Critch calls "Entitlement to believe".

When I adopt this "Feynman mindset" in my head, this feels like letting my inner child out. I feel like I can just go and look at things and form hypotheses and ask questions, irrespective of what other people think. I abandon the feeling that I need to do what is immediately important, and instead go look at what I find interesting and fun.

From Watson's autobiography, I mainly got a sense of how even great scientists are driven a lot by petty desires, such as the fear that someone else will beat them to a discovery, or annoyance at their collaborators. For example, a major driver of Watson and Crick's work on DNA was the fear that Linus Pauling would discover the true structure first. A lot of their failure to collaborate better with Rosalind Franklin was due to personal clashes with her. Of course, Watson does also display some irreverence toward authority; he held fast to his belief that their approach to finding the structure of DNA would work, even when multiple more senior scientists disagreed with him. But I think the main thing I got out of the book was a visceral appreciation for how important social situations are for motivating even important science.

When I adopt this "Watson mindset" in my head, I think about the social situation I'm in, and use that to motivate me. I call upon the irritation I feel when people are just acting a little too suboptimal, or that people are doing things for the wrong reasons. I see how absolutely easy many of the problems I'm working on are, and use my irritation at people having thus failed to solve them to push me to work harder. This probably isn't a very healthy mindset to have in the long term, and there are obvious problems with it, but it feels very effective to get me to push past schleps.

Comment by lawchan on Hammers and Nails · 2018-01-24T04:18:59.056Z · score: 32 (9 votes) · LW · GW

Following Swerve's example above, I've also decided to try out your exercise and post my results. My favorite instrumental rationality technique is Oliver Habryka's Fermi Modeling. The way I usually explain it (with profuse apologies to Habryka for possibly butchering the technique) is that you quickly generate models of the problem using various frameworks and from various perspectives, then weight the conclusions of those models based on how closely they seem to conform to reality. (@habryka, please correct me if this is not what Fermi Modeling is.)
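
As a toy illustration of the "weight the conclusions by fit to reality" step (this is my own gloss rather than Habryka's actual procedure, and every framework name and number below is made up):

```python
import math

# Toy illustration of weighting several quick models by how well they fit reality.
# The "frameworks" and all numbers are made up; each model here is just a
# probability it assigns to a binary outcome (e.g. "this project ships on time").
models = {
    "reference-class forecast": 0.30,
    "inside-view plan":         0.70,
    "ask-a-colleague":          0.50,
}

# Outcomes of similar past situations: 1 = shipped on time, 0 = late.
observations = [1, 0, 0, 1, 0]

def log_likelihood(p, data):
    """How well a single probability p explains the observed 0/1 data."""
    return sum(math.log(p if x == 1 else 1.0 - p) for x in data)

# Weight each model by how closely it conforms to the data
# (a softmax over log-likelihoods).
logs = {name: log_likelihood(p, observations) for name, p in models.items()}
best = max(logs.values())
raw = {name: math.exp(ll - best) for name, ll in logs.items()}
total = sum(raw.values())
weights = {name: w / total for name, w in raw.items()}

# Combine the models' conclusions using the weights.
combined = sum(weights[name] * p for name, p in models.items())
print(weights)
print(combined)
```

The interesting part of the technique is generating the frameworks in the first place, of course; the weighting arithmetic is just bookkeeping.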

For your exercise, I'll try to come up with variants/applications of Fermi modeling that are useful in other contexts.

  1. Instead of using different perspectives or frameworks, take one framework and vary the inputs, then weight the conclusions drawn by how likely the inputs are, as well as how consistent they are with the data.
  2. Likewise, instead of checking one story on either side when engaged in Pyrrhonian skepticism, tell a bunch of stories that are consistent with either side, then weight them by how likely the stories are.
  3. To test what your mental model actually says, try varying parts of the model inputs/outputs randomly and see which combinations fit well/horribly with your model.
  4. When working in domains where you have detailed mental simulations (for example dealing with people you're very familiar with, or for simple manual tasks such as picking up a glass of water), instead of using the inner sim technique once with the most likely/most available set of starting conditions, do as many simulations as possible and weight them based on how likely the starting conditions are.
  5. When doing reference class forecasting, vary the reference class used to test for model robustness.
  6. Instead of producing a probability judgment for a given thing directly from a gut feeling, try to imagine different possibilities under which the thing happens or doesn't happen, and then vary the specific scenarios (and simulate them in your head) to see how robust each possibility is. Come up with your probability judgment after consulting the results of these robustness checks.
  7. When I am developing and testing (relatively easy to communicate) rationality techniques in the future, I will try to vary the technique in different ways when presenting it to people, and see how robust the technique is to different forms of noise.
  8. I should do more mental simulations to calibrate myself on how good the actions I didn't take were, instead of just relying on my gut feeling/how good other people who took those actions seem to be doing.
  9. Instead of using different perspectives or frameworks, I could do Fermi modeling with different instrumental rationality techniques when approaching a difficult problem. I would quickly go through my list of instrumental rationality techniques, then weight the suggestions made by each of them based on how applicable the technique is to the specific problem I'm stuck on.
  10. Recently, I've been reading a lot of biographies/autobiographies of great scientists of the 20th century, for example Feynman and James Watson. When encountering a novel scientific problem, instead of only thinking about what the most recently read-about scientist would say, I should keep a list of scientists whose thought processes have been inspirational to me, and try to imagine what each of them would do, weighting them by how applicable (my mental model of) their experiences are to the specific problem.

I guess Fermi modeling isn't so much a single hammer as the "hammer" of the nail mindset. So some of the applications or variants I generated above seem to be ways of applying more hammers to a fixed nail, instead of applying the same fixed hammer to different nails.

Comment by lawchan on Babble · 2018-01-24T03:18:40.603Z · score: 7 (2 votes) · LW · GW

I find the analogy between modern chatbots and the babble/prune model more apt. For example, the recent MILA chatbot uses several response models to generate candidate responses based on the dialogue history, and then a response selection policy to select which of the responses to return.

More generally, the concept of separate algorithms for action proposal and action evaluation is quite widespread in modern deep learning. For example, you can think of AlphaGo's policy network as serving the action proposal/babble role, while the MCTS procedure does the action evaluation/pruning. (More generally, you can see this with any sort of game tree search algorithm that uses a heuristic to expand promising nodes.) Or, with some stretching, you can think of actor-critic reinforcement learning algorithms as being composed of babble/prune parts.
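
Here's a minimal sketch of that generic proposal/evaluation split (the functions are stand-ins I made up, not the actual MILA or AlphaGo machinery):

```python
import random

def propose(history, n_candidates=8):
    """Babble: cheaply generate many candidate responses.
    (Stand-in for MILA's response models or a policy network; here it's just noise.)"""
    return [f"candidate {random.randint(0, 999)} replying to {history[-1]!r}"
            for _ in range(n_candidates)]

def evaluate(history, candidate):
    """Prune: score a candidate more carefully.
    (Stand-in for a selection policy, value network, or tree search; here, a dummy
    heuristic that prefers medium-length responses, purely for illustration.)"""
    return -abs(len(candidate) - 40)

def respond(history):
    candidates = propose(history)
    return max(candidates, key=lambda c: evaluate(history, c))

print(respond(["hello there"]))
```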

GANs fall into the babble/prune model mainly insofar as there are two parts, one serving as action proposal and the other serving as action evaluation; beyond this high level, the fit feels very forced. Of the examples from modern deep learning, I think the MILA chatbot and AlphaGo's MCTS procedure are much better fits to the babble/prune model than GANs.

Comment by lawchan on A model I use when making plans to reduce AI x-risk · 2018-01-19T22:43:50.807Z · score: 12 (3 votes) · LW · GW

Regarding the "fun to argue about" point - maybe a positive recommendation would be "focus on hitting the target"? Or "focus on technical issues"? I don't think there is a nice, concise phrase that captures what to do.

From Yudkowsky's Twelve Virtues of Rationality:

> Before these eleven virtues is a virtue which is nameless.
>
> Miyamoto Musashi wrote, in The Book of Five Rings:
>
> “The primary thing when you take a sword in your hands is your intention to cut the enemy, whatever the means. Whenever you parry, hit, spring, strike or touch the enemy’s cutting sword, you must cut the enemy in the same movement. It is essential to attain this. If you think only of hitting, springing, striking or touching the enemy, you will not be able actually to cut him. More than anything, you must be thinking of carrying your movement through to cutting him.”
>
> Every step of your reasoning must cut through to the correct answer in the same movement. More than anything, you must think of carrying your map through to reflecting the territory.

Musashi calls this virtue "the way of the void", but I think this name is sufficiently counterintuitive that we should not try to adopt it.

I'm also not sure this is something that's informed by your models of AI per se; getting nerd-sniped is a common issue for intellectual communities, and being able to actually do things that contribute to your goals is a super important skill.

Comment by lawchan on A model I use when making plans to reduce AI x-risk · 2018-01-19T22:37:45.699Z · score: 26 (11 votes) · LW · GW

Minor nitpick - these don't seem to be models (at least not gears-level models) so much as background assumptions or heuristics.

Comment by lawchan on Beware of black boxes in AI alignment research · 2018-01-19T22:31:09.583Z · score: 7 (2 votes) · LW · GW

I agree! There's a distinction between "we know exactly what knowledge is represented in this complicated black box" and "we have formal guarantees about properties of the black box". It's indeed very different to say "the AI will have a black box representing a model of human preferences" and "we will train the AI to build a model of human preferences using a bootstrapping scheme such as HCH, which we believe works because of these strong arguments".

Perhaps more crisply, we should distinguish between black boxes where we have a good grasp of why the box will behave as expected, and black boxes whose behavior we have little ability to reason about at all. I believe that both cousin_it and Eliezer (in the Artificial Mysterious Intelligence post) are referring to the folly of using the second type of black box in AI designs.

Perhaps related: Jessica Taylor's discussion on top-level vs subsystem reasoning.

Comment by lawchan on Insights from 'The Strategy of Conflict' · 2018-01-16T18:15:05.440Z · score: 7 (2 votes) · LW · GW

(Cross-posted from Daniel's FB)

Regarding the Bioweapons MAD point: I think detecting that a novel bioweapon has been deployed might be less trivial than you think.

A (possibly) more serious problem is identifying who deployed the bioweapon: it’s easy to tell where land-based missiles come from, but much harder to verify that the weird symptoms reported in one part of the country come from a weapon deployed by a specific adversary.

Comment by lawchan on Field-Building and Deep Models · 2018-01-16T17:59:08.263Z · score: 13 (4 votes) · LW · GW

> Albert: I agree that having made progress on issues like logical induction is impressive and has a solid chance of being very useful for AGI design. And I have a better understanding of your position - sharing deep models of a problem is important. I just think that some other top thinkers will be able to make a lot of the key inferences themselves - look at Stuart Russell for example - and we can help that along by providing funding and infrastructure.

I think the problem isn't just that other people might not be able to make the key inferences, but that there won't be common knowledge of the models/assumptions that people have. For example, Stuart Russell has thought a lot about research topics in AI safety, but I'm not actually aware of any write-ups detailing his models of the AI safety landscape and problem. (The best I could find was his "Provably Beneficial AI" Asilomar slides, the 2015 Research Agenda, and his AI FAQ, though all three are intended for a general audience.) It's possible, albeit unlikely, that he has grokked MIRI's models and still thinks that value uncertainty is the most important thing to work on (or call for people to work on) for AI safety. But even if this were the case, I'm not sure how we'd find out.

> For example, events where top AI researchers in academia are given the space to share models with researchers closer to our community.

Yup. I think this may help resolve the problem.

Comment by lawchan on Field-Building and Deep Models · 2018-01-16T17:48:02.235Z · score: 7 (2 votes) · LW · GW

> I've not seen any significant effort adopt this mind set.

Could I ask where you've looked? MIRI seems to be trying pretty explicitly to develop this mindset, while Paul Christiano has had extensive back-and-forth on the assumptions behind his general approach on his Medium blog.

Comment by lawchan on Pascal’s Muggle Pays · 2017-12-18T20:36:34.229Z · score: 9 (3 votes) · LW · GW

This is *really* good.

>They yell at us for voting, and/or asking us to justify not living in a van down by the river on microwaved ramen noodles in terms of our expected additional future earnings from our resulting increased motivation and the networking effects of increased social status.

Could I ask how FDT justifies not living in a van down by the river?

Comment by lawchan on Big Advance in Infinite Ethics · 2017-11-29T22:00:03.231Z · score: 4 (1 votes) · LW · GW

>My hope is that one day someone will show LDU (or a similarly intuitive algorithm) can compare any two computable sequences, but I don’t think that this is that proof.

I'm pretty sure you can't use a computable algorithm to do this for general computable sequences while maintaining weak Pareto efficiency, due to a diagonalization argument. Let $A$ be the algorithm you use to choose between two computable sequences, which returns 0 if the first sequence is better and 1 otherwise. Let $C$ be the infinite sequence whose value is always 0.5. Consider the sequence $D$ whose value at every timestep is $1 - A(C, D)$ (the self-reference can be implemented by quining, since $A$ is computable). That is, if $A$ chooses $C$, then $D$ is an infinite sequence of 1s, and if $A$ chooses $D$, then $D$ is an infinite sequence of 0s. Either way, $A$ violates weak Pareto efficiency, since it tells you to choose a sequence whose value at every timestep is less than the other sequence's.

Comment by lawchan on The set of Logical Inductors is not Convex · 2017-08-13T00:26:43.000Z · score: 0 (0 votes) · LW · GW

From conversation with Scott and Michael Dennis: there aren't enough logical inductors to make the set of limits convex, since there are an uncountable number of convex combinations but only a countable number of inductors (and thus limits). The interesting question would be whether any rational/computable convex combination of limits of logical inductors is a logical inductor.