riceissa's Shortform 2021-03-27T04:51:43.513Z
Timeline of AI safety 2021-02-07T22:29:00.811Z
Discovery fiction for the Pythagorean theorem 2021-01-19T02:09:37.259Z
Gems from the Wiki: Do The Math, Then Burn The Math and Go With Your Gut 2020-09-17T22:41:24.097Z
Plausible cases for HRAD work, and locating the crux in the "realism about rationality" debate 2020-06-22T01:10:23.757Z
Source code size vs learned model size in ML and in humans? 2020-05-20T08:47:14.563Z
How does iterated amplification exceed human abilities? 2020-05-02T23:44:31.036Z
What are some exercises for building/generating intuitions about key disagreements in AI alignment? 2020-03-16T07:41:58.775Z
What does Solomonoff induction say about brain duplication/consciousness? 2020-03-02T23:07:28.604Z
Is it harder to become a MIRI mathematician in 2019 compared to in 2013? 2019-10-29T03:28:52.949Z
Deliberation as a method to find the "actual preferences" of humans 2019-10-22T09:23:30.700Z
What are the differences between all the iterative/recursive approaches to AI alignment? 2019-09-21T02:09:13.410Z
Inversion of theorems into definitions when generalizing 2019-08-04T17:44:07.044Z
Degree of duplication and coordination in projects that examine computing prices, AI progress, and related topics? 2019-04-23T12:27:18.314Z
Comparison of decision theories (with a focus on logical-counterfactual decision theories) 2019-03-16T21:15:28.768Z
GraphQL tutorial for LessWrong and Effective Altruism Forum 2018-12-08T19:51:59.514Z
Timeline of Future of Humanity Institute 2018-03-18T18:45:58.743Z
Timeline of Machine Intelligence Research Institute 2017-07-15T16:57:16.096Z
LessWrong analytics (February 2009 to January 2017) 2017-04-16T22:45:35.807Z
Wikipedia usage survey results 2016-07-15T00:49:34.596Z


Comment by riceissa on Raj Thimmiah's Shortform · 2021-04-27T22:14:00.787Z · LW · GW

There is a map on the community page. (You might need to change something in your user settings to be able to see it.)

Comment by riceissa on You Can Now Embed Flashcard Quizzes in Your LessWrong posts! · 2021-04-19T18:06:33.160Z · LW · GW

I'm curious why you decided to make an entirely new platform (Thought Saver) rather than using Andy's Orbit platform.

Comment by riceissa on Using Flashcards for Deliberate Practice · 2021-04-15T01:54:41.445Z · LW · GW

Messaging sounds good to start with (I find calls exhausting so only want to do it when I feel it adds a lot of value).

Comment by riceissa on Using Flashcards for Deliberate Practice · 2021-04-15T01:36:15.122Z · LW · GW

Ah ok cool. I've been doing something similar for the past few years and this post is somewhat similar to the approach I've been using for reviewing math, so I was curious how it was working out for you.

Comment by riceissa on Using Flashcards for Deliberate Practice · 2021-04-14T19:54:30.796Z · LW · GW

Have you actually tried this approach, and if so for how long and how has it worked?

Comment by riceissa on Progressive Highlighting: Picking What To Make Into Flashcards · 2021-03-30T20:14:15.739Z · LW · GW

So there's a need for an intermediate stage between creating an extract and creating a flashcard. This need is what progressive highlighting seeks to address.

I haven't actually done incremental reading in SuperMemo so I'm not sure about this, but I believe extract processing is meant to be recursive: first you extract a larger portion of the text that seems relevant, then when you encounter it again the extract is itself treated like an original article, so you might extract just a single sentence, then when you encounter that sentence again you might make a cloze deletion or Q&A card.

Comment by riceissa on Progressive Highlighting: Picking What To Make Into Flashcards · 2021-03-30T06:02:37.102Z · LW · GW

This sounds a lot like (a subset of) incremental reading. Instead of highlighting, one creates "extracts" and reviews those extracts over time to see if any of them can be turned into flashcards. As you suggest, there is no pressure to immediately turn things into flashcards on a first-pass of the reading material. These two articles about incremental reading emphasize this point. A quote from the first of these:

Initially, you make extracts because “Well it seems important”. Yet to what degree (the number of clozes/Q&As) and in what formats (cloze/Q&A/both) are mostly fuzzy at this point. You can’t decide wisely on what to do with an extract because you lack the clarity and relevant information to determine it. In other words, you don’t know the extract (or in general, the whole article) well enough to know what to do with it.

In this case, if you immediately process an extract, you’ll tend to make mistakes. For example, for an extract, you should have dismissed it but you made two clozed items instead; you may have dismissed it when it’s actually very important to you, unbeknown to you at that moment. With lowered quality of metamemory judgments, skewed by all the cognitive biases, the resulting clozed/Q&A item(s) is just far from optimal.

Comment by riceissa on riceissa's Shortform · 2021-03-27T04:51:43.779Z · LW · GW

Does life extension (without other technological progress to make the world in general safer) lead to more cautious lifestyles? The longer the expected years left, the more value there is in just staying alive compared to taking risks. Since death would mean missing out on all the positive experiences for the rest of one's life, I think an expected value calculation would show that even a small risk is not worth taking. Does this mean all risks that don't get magically fixed due to life extension (for example, activities like riding a motorcycle or driving on the highway seem risky even if we have life extension technology) are not worth taking? (There is the obvious exception where if one knows when one is going to die, then one can take more risks just like in a pre-life-extension world as one reaches the end of one's life.)

I haven't thought about this much, and wouldn't be surprised if I am making a silly error (in which case, I would appreciate having it pointed out to me!).
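
A rough sketch of the expected value calculation mentioned above, under a deliberately crude model that is my own assumption: an activity is worth some fixed benefit measured in life-year equivalents, and dying forfeits all remaining expected years. The function name and all numbers are hypothetical, chosen only to show how the break-even risk shrinks as expected lifespan grows.

```python
def break_even_risk(benefit_years, remaining_years):
    """Largest probability of death at which the activity still breaks even.

    Assumed model: expected value = benefit - p_death * remaining_years,
    so the activity is worth taking only if p_death < benefit / remaining_years.
    """
    return benefit_years / remaining_years

# Hypothetical numbers: the activity is "worth" 0.01 life-years of enjoyment.
benefit = 0.01
risk_without_extension = break_even_risk(benefit, remaining_years=50)
risk_with_extension = break_even_risk(benefit, remaining_years=1000)

print(f"break-even risk with 50 years left:   {risk_without_extension:.6f}")
print(f"break-even risk with 1000 years left: {risk_with_extension:.6f}")
```

In this model the acceptable risk falls in inverse proportion to the expected years left, which is the intuition in the comment. It ignores discounting, diminishing returns to extra years, and any benefit of the activity that itself scales with lifespan.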

Comment by riceissa on [deleted post] 2021-03-12T21:56:45.469Z

I like this tag! I think the current version of the page is missing the insight that influence gained via asymmetric weapons/institutions is restricted/inflexible, i.e. an asymmetric weapon not only helps only the "good guys" but also constrains the "good guys" into only being able to do "good things". See this comment by Carl Shulman. (I might eventually come back to edit this in, but I don't have the time right now.)

Comment by riceissa on [deleted post] 2021-03-03T00:10:00.158Z

The EA Forum wiki has stubs for a bunch of people, including a somewhat detailed article on Carl Shulman. I wonder if you feel similarly unexcited about the articles there (if so, it seems good to discuss this with people working on the EA wiki as well), or if you have different policies for the two wikis.

Comment by riceissa on Spaced Repetition Systems for Intuitions? · 2021-02-27T01:49:48.414Z · LW · GW

I also just encountered Flashcards for your soul.

Comment by riceissa on Probability vs Likelihood · 2021-02-26T18:43:10.246Z · LW · GW

Ah ok, that makes sense. Thanks for clarifying!

Comment by riceissa on Open & Welcome Thread – February 2021 · 2021-02-26T05:27:53.236Z · LW · GW

It seems to already be on LW.

Edit: oops, looks like the essay was posted on LW in response to this comment.

Comment by riceissa on [deleted post] 2021-02-26T00:04:19.519Z

I'm unable to apply this tag to posts (this tag doesn't show up when I search to add a tag).

Comment by riceissa on Learn Bayes Nets! · 2021-02-24T20:28:07.557Z · LW · GW

For people who find this post in the future, Abram discussed several of the points in the bullet-point list above in Probability vs Likelihood.

Comment by riceissa on Probability vs Likelihood · 2021-02-24T20:22:05.341Z · LW · GW

Regarding base-rate neglect, I've noticed that in some situations my mind seems to automatically do the correct thing. For example if a car alarm or fire alarm goes off, I don't think "someone is stealing the car" or "there's a fire". L(theft|alarm) is high, but P(theft|alarm) is low, and my mind seems to naturally know this difference. So I suspect something more is going on here than just confusing probability and likelihood, though that may be part of the answer.
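
To make the alarm example concrete, here is a minimal sketch using Bayes' rule, where L(theft|alarm) is read as P(alarm|theft). The prior, the likelihood, and the false-alarm rate are all made-up numbers for illustration, not estimates of anything real.

```python
def posterior(prior, likelihood, false_alarm_rate):
    """P(H | E) via Bayes' rule for a binary hypothesis H.

    likelihood       = P(E | H)
    false_alarm_rate = P(E | not H)
    """
    evidence = likelihood * prior + false_alarm_rate * (1.0 - prior)
    return likelihood * prior / evidence

# All three numbers are assumptions chosen only to illustrate the gap.
prior_theft = 0.0001       # P(theft): thefts are rare
likelihood_theft = 0.95    # L(theft|alarm) = P(alarm|theft): thefts usually trip the alarm
false_alarm_rate = 0.01    # P(alarm|no theft): alarms also go off for no reason

p_theft_given_alarm = posterior(prior_theft, likelihood_theft, false_alarm_rate)
print(f"L(theft|alarm) = {likelihood_theft:.2f}")
print(f"P(theft|alarm) = {p_theft_given_alarm:.4f}")
```

With these numbers the likelihood is high while the posterior stays under one percent, because false alarms vastly outnumber thefts.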

Comment by riceissa on Probability vs Likelihood · 2021-02-24T19:59:39.003Z · LW · GW

I understood all of the other examples, but this one confused me:

A scenario is likely if it explains the data well. For example, many conspiracy theories are very likely because they have an answer for every question: a powerful group is conspiring to cover up the truth, meaning that the evidence we see is exactly what they'd want us to see.

If the conspiracy theory really was very likely, then we should be updating on this to have a higher posterior probability on the conspiracy theory. But in almost all cases we don't actually believe the conspiracy theory is any more likely than we started out with. I think what's actually going on is the thing Eliezer talked about in Technical Explanation where the conspiracy theory originally has the probability mass very spread out across different outcomes, but then as soon as it learns the actual outcome, it retroactively concentrates the probability mass on that outcome. So I want to say that the conspiracy theory is both unlikely (because it did not make an advance prediction) and improbable (very low prior combined with the unlikeliness). I'm curious if you agree with that or if I've misunderstood the example somehow.
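
Here is a small sketch of the distinction I have in mind, comparing the likelihood the conspiracy theory earns from its advance prediction with the likelihood it claims after seeing the outcome. The number of outcomes, the prior, and the likelihood of the mundane explanation are all hypothetical.

```python
def posterior(prior_h, lik_h, lik_alt):
    """P(H | data) for hypothesis H against a single alternative."""
    joint_h = prior_h * lik_h
    joint_alt = (1.0 - prior_h) * lik_alt
    return joint_h / (joint_h + joint_alt)

# Hypothetical setup: 10 possible outcomes, one of which is observed.
n_outcomes = 10
prior_conspiracy = 0.01   # assumed low prior
lik_mundane = 0.6         # mundane explanation predicted the outcome fairly well

# In advance, the conspiracy theory spreads its mass evenly over all outcomes.
lik_advance = 1.0 / n_outcomes
# After the fact, it claims it "would have predicted" exactly what happened.
lik_retroactive = 1.0

honest = posterior(prior_conspiracy, lik_advance, lik_mundane)
inflated = posterior(prior_conspiracy, lik_retroactive, lik_mundane)

print(f"posterior using advance likelihood:     {honest:.4f}")
print(f"posterior using retroactive likelihood: {inflated:.4f}")
```

Scored on its advance prediction, the conspiracy theory ends up below its prior; it only gains ground if we let it concentrate its probability mass after the fact.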

Comment by riceissa on [deleted post] 2021-02-02T23:17:15.592Z

Thanks, I like your rewrite and will post questions instead in the future.

I think I understand your concerns and agree with most of it. One thing that does still feel "off" to me is: given that there seems to be a lot of in-person-only discussions about "cutting edge" ideas and "inside scoop" like things (that trickle out via venues like Twitter and random Facebook threads, and only much later get written up as blog posts), how can people who primarily interact with the community online (such as me) keep up with this? I don't want to have to pay attention to everything that's out there on Twitter or Facebook, and would like a short document that gets to the point and links out to other things if I feel curious. (I'm willing to grant that my emotional experience might be rare, and that the typical user would instead feel alienated in just the way you describe.)

Comment by riceissa on Spaced Repetition Systems for Intuitions? · 2021-01-30T03:43:45.634Z · LW · GW

The closest thing I've seen is Unusual applications of spaced repetition memory systems.

Comment by riceissa on Judgment Day: Insights from 'Judgment in Managerial Decision Making' · 2021-01-24T19:51:20.231Z · LW · GW

For those reading this thread in the future, Alex has now adopted a more structured approach to reviewing the math he has learned.

Comment by riceissa on The new Editor · 2021-01-19T03:37:35.909Z · LW · GW

Thanks, that worked and I was able to fix the rest of the images.

Comment by riceissa on The new Editor · 2021-01-19T02:13:14.799Z · LW · GW

I just tried doing this in a post, and while the images look fine in the editor, they come out huge once the post is published. Any ideas on what I can do to fix this? (I don't see any option in the editor to resize the images, and I'm scared of converting the post to markdown.)

Comment by riceissa on [deleted post] 2021-01-18T20:48:30.790Z

Some thoughts in response:

  • I agree that it's better to focus on ideas instead of people. I might have a disagreement about how successfully LessWrong has managed this, so that from your perspective it looks like this page is pushing the status quo toward something we don't want vs looking from my perspective like it's just doing things more explicitly/transparently (which I prefer).
  • I agree that writing about people can be dicey. I might have disagreement about how well this problem can be avoided.
  • Maybe I'm misunderstanding what you mean by "defensible style", but I'm taking it to mean something like "obsession with having citations from respected sources for every assertion, like what you see on Wikipedia". So the concern is that once we allow lots of pages about people, that will force us to write defensibly, and this culture will infect pages not about people to also be written similarly defensibly. I hadn't thought of this, and I'm not sure how I feel about it. It seems possible to have separate norms/rules for different kinds of pages (Wikipedia does in fact have extra rules for biographies of living persons). But I also don't think I can point to any particularly good examples of wikis that cover people (other than Wikipedia, which I guess is sort of a counterexample).
  • I agree that summarizing his ideas or intellectual culture would be better, but that takes way more work, e.g. to figure out what this culture is/how to carve up the space, how to name it, and figuring out what his core ideas are.

Comment by riceissa on [deleted post] 2021-01-18T20:03:30.047Z

Currently the wiki has basically no entries for people (we have one for Eliezer, but none for Scott Alexander or Lukeprog for example)

There do seem to be stubs for both Scott Alexander and Lukeprog, both similar in size to this Vervaeke page. So I think I'm confused about what the status quo is vs what you are saying the status quo is.

Comment by riceissa on [deleted post] 2021-01-18T03:56:04.906Z

I'm not sure what cluster you are trying to point to by saying "wiki pages like this".

For this page in particular: I've been hearing more and more about Vervaeke, so I wanted to find out what the community has already figured out about him. It seems like the answer so far is "not much", but as the situation changes I'm excited to have some canonical place where this information can be written up. He seems like an interesting enough guy, or at any rate he seems to have caught the attention of other interesting people, and that seems like a good enough reason to have some place like this.

If that's not a good enough reason, I'm curious to hear of a concrete alternative policy and how it applies to this situation. Vervaeke isn't notable enough to have a page on Wikipedia. Maybe I could write a LW question asking something like "What do people know about this guy?" Or maybe I could write a post with the above content. A shortform post would be easy, but seems difficult to find (not canonical enough). Or maybe you would recommend no action at all?

Comment by riceissa on The Wiki is Dead, Long Live the Wiki! [help wanted] · 2021-01-18T03:12:56.350Z · LW · GW


Comment by riceissa on Wiki-Tag FAQ · 2021-01-17T21:53:22.593Z · LW · GW

I tried creating a wiki-tag page today, and here are some questions I ran into that don't seem to be answered by this FAQ:

  • Is there a way to add wiki-links like on the old wiki? I tried using the [[double square brackets]] like on MediaWiki, but this did not work (at least on the new editor).
  • Is there a way to quickly see if a wiki-tag page on a topic already exists? On the creation page, typing something in the box does not show existing pages with that substring. What I'm doing right now is looking on the all tags page (searching with my browser) and then searching the wiki 1.0 imported pages list in the same way. I feel like there must be a better way than this, but I couldn't figure it out.
  • Is there a way to add MediaWiki-like <ref> tags? Or is there some preferred alternative way to add references on wiki-tag pages?

Comment by riceissa on The Wiki is Dead, Long Live the Wiki! [help wanted] · 2021-01-17T21:39:24.117Z · LW · GW

The Slack invite link seems to have expired. Is there a new one I can use?

Comment by riceissa on Matt Goldenberg's Short Form Feed · 2020-12-05T20:48:04.866Z · LW · GW

That makes sense, thanks for clarifying. What I've seen most often on LessWrong is to come up with reasons for preferring simple interpretations in the course of trying to solve other philosophical problems such as anthropics, the problem of induction, and infinite ethics. For example, if we try to explain why our world seems to be simple we might end up with something like UDASSA or Scott Garrabrant's idea of preferring simple worlds (this section is also relevant). Once we have something like UDASSA, we can say that joke interpretations do not have much weight since it takes many more bits to specify how to "extract" the observer moments given a description of our physical world.

Comment by riceissa on The LessWrong 2019 Review · 2020-12-03T04:15:33.868Z · LW · GW

Thanks! That does make me feel a bit better about the annual reviews.

Comment by riceissa on The LessWrong 2019 Review · 2020-12-03T04:00:27.412Z · LW · GW

I see, that wasn't clear from the post. In that case I am wondering if the 2018 review caused anyone to write better explanations or rewrite the existing posts. (It seems like the LessWrong 2018 Book just included the original posts without much rewriting, at least based on scanning the table of contents.)

Comment by riceissa on The LessWrong 2019 Review · 2020-12-03T03:46:46.049Z · LW · GW

This is a minor point, but I am somewhat worried that the idea of research debt/research distillation seems to be getting diluted over time. The original article (which this post links to) says:

Distillation is also hard. It’s tempting to think of explaining an idea as just putting a layer of polish on it, but good explanations often involve transforming the idea. This kind of refinement of an idea can take just as much effort and deep understanding as the initial discovery.

I think the kind of cleanup and polish that is encouraged by the review process is insufficient to qualify as distillation (I know this post didn't use the word "distillation", but it does talk about research debt, and distillation is presented as the solution to debt in the original article), and to adequately deal with research debt.

There seems to be a pattern where a term is introduced first in a strong form, then it accumulates a lot of positive connotations, and that causes people to stretch the term to use it for things that don't quite qualify. I'm not confident that is what is happening here (it's hard to tell what happens in people's heads), but from the outside it's a bit worrying.

I actually made a similar comment a while ago about a different term.

Comment by riceissa on Introduction to Cartesian Frames · 2020-12-01T21:24:21.245Z · LW · GW

So the existence of this interface implies that A is “weaker” in a sense than A’.

Should that say B instead of A', or have I misunderstood? (I haven't read most of the sequence.)

Comment by riceissa on Matt Goldenberg's Short Form Feed · 2020-12-01T10:15:45.172Z · LW · GW

Have you seen Brian Tomasik's page about this? If so what do you find unconvincing, and if not what do you think of it?

Comment by riceissa on Daniel Kokotajlo's Shortform · 2020-11-24T05:48:29.630Z · LW · GW

Would this work across different countries (and if so how)? It seems like if one country implemented such a tax, the research groups in that country would be out-competed by research groups in other countries without such a tax (which seems worse than the status quo, since now the first AGI is likely to be created in a country that didn't try to slow down AI progress or "level the playing field").

Comment by riceissa on Embedded Interactive Predictions on LessWrong · 2020-11-23T00:33:49.642Z · LW · GW

Is there a way to see all the users who predicted within a single "bucket" using the LW UI? Right now when I hover over a bucket, it will show all users if the number of users is small enough, but it will show a small number of users followed by "..." if the number of users is too large. I'd like to be able to see all the users. (I know I can find the corresponding prediction on the Elicit website, but this is cumbersome.)

Comment by riceissa on Open & Welcome Thread – November 2020 · 2020-11-19T02:48:48.148Z · LW · GW

Ok. Since visiting your office hours is somewhat costly for me, I was trying to gather more information (about e.g. what kind of moral uncertainty or prior discussion you had in mind, why you decided to capitalize the term, whether this is something I might disagree with you on and might want to discuss further) to make the decision.

More generally, I've attended two LW Zoom events so far, both times because I felt excited about the topics discussed, and both times felt like I didn't learn anything/would have preferred the info to just be a text dump so I could skim and move on. So I am feeling like I should be more confident that I will find an event useful now before attending.

Comment by riceissa on Open & Welcome Thread – November 2020 · 2020-11-18T23:41:25.219Z · LW · GW

Is any of the stuff around Moral Uncertainty real? I think it’s probably all fake, but if you disagree, let’s debate!

Can you say more about this? I only found this comment after a quick search.

Comment by riceissa on Daniel Kokotajlo's Shortform · 2020-11-15T19:30:47.039Z · LW · GW

Oh :0

Comment by riceissa on Daniel Kokotajlo's Shortform · 2020-11-13T00:58:53.501Z · LW · GW

I find the conjunction of your decision to have kids and your short AI timelines pretty confusing. The possibilities I can think of are (1) you're more optimistic than me about AI alignment (but I don't get this impression from your writings), (2) you think that even a short human life is worth living/net-positive, (3) since you distinguish between the time when humans lose control and the time when catastrophe actually happens, you think this delay will give more years to your child's life, (4) your decision to have kids was made before your AI timelines became short. Or maybe something else I'm not thinking of? I'm curious to hear your thinking on this.

Comment by riceissa on DARPA Digital Tutor: Four Months to Total Technical Expertise? · 2020-11-03T23:39:08.674Z · LW · GW

Does anyone know if the actual software and contents for the digital tutor are published anywhere? I tried looking in the linked report but couldn't find anything like that there. I am feeling a bit skeptical that the digital tutor was teaching anything difficult. Right now I can't even tell if the digital tutor was doing something closer to "automate teaching people how to use MS Excel" (sounds believable) vs "automate teaching people real analysis given AP Calculus level knowledge of math" (sounds really hard, unless the people are already competent at self-studying).

Comment by riceissa on Responses to Christiano on takeoff speeds? · 2020-10-30T22:46:53.153Z · LW · GW

I am also interested in this.

Comment by riceissa on Responses to Christiano on takeoff speeds? · 2020-10-30T22:41:52.691Z · LW · GW

There was "My Thoughts on Takeoff Speeds" by tristanm.

Comment by riceissa on Considerations on Cryonics · 2020-10-16T21:49:27.449Z · LW · GW

Thanks! I think I would have guessed that the optimal signup is around age 35-55 so this motivates me to dig closer into your model to see if I disagree with some parameter or modeling assumption (alternatively, I would be able to fix some mistaken intuition that I have). I've made a note to myself to come back to this when I have more free time.

Comment by riceissa on How much to worry about the US election unrest? · 2020-10-12T19:17:09.837Z · LW · GW

There was a similar question a few months ago: Plans / prepping for possible political violence from upcoming US election?

Comment by riceissa on The Alignment Problem: Machine Learning and Human Values · 2020-10-07T07:38:50.322Z · LW · GW

Does anyone know how Brian Christian came to be interested in AI alignment and why he decided to write this book instead of a book about a different topic? (I haven't read the book; I looked at the Amazon preview but couldn't find the answer there.)

Comment by riceissa on My Understanding of Paul Christiano's Iterated Amplification AI Safety Research Agenda · 2020-10-05T21:57:31.002Z · LW · GW

Here is part of Paul's definition of intent alignment:

In particular, this is the problem of getting your AI to try to do the right thing, not the problem of figuring out which thing is right. An aligned AI would try to figure out which thing is right, and like a human it may or may not succeed.

So in your first example, the partition seems intent aligned to me.

Comment by riceissa on My Understanding of Paul Christiano's Iterated Amplification AI Safety Research Agenda · 2020-10-05T21:18:07.341Z · LW · GW

HCH is the result of a potentially infinite exponential process (see figure 1) and thereby, computationally intractable. In reality, we can not break down any task into its smallest parts and solve these subtasks one after another because that would take too much computation. This is why we need to iterate distillation and amplification and cannot just amplify.

In general your post talks about amplification (and HCH) as increasing the capability of the system and distillation as saving on computation/making things more efficient. But my understanding, based on this conversation with Rohin Shah, is that amplification is also intended to save on computation (otherwise we could just try to imitate humans). In other words, the distillation procedure is able to learn more quickly by training on data provided by the amplified system compared to just training on the unamplified system. So I don't like the phrasing that distillation is the part that's there to save on computation, because both parts seem to be aimed at that.

(I am making this comment because I want to check my understanding with you, or make sure you understand this point, because it doesn't seem to be stated in your post. It was one of the most confusing things about IDA to me and I'm still not sure I fully understand it.)

Comment by riceissa on My Understanding of Paul Christiano's Iterated Amplification AI Safety Research Agenda · 2020-10-05T20:51:48.067Z · LW · GW

I still don't understand how corrigibility and intent alignment are different. If neither implies the other (as Paul says in his comment starting with "I don't really think this is true"), then there must be examples of AI systems that have one property but not the other. What would a corrigible but not-intent-aligned AI system look like?

I also had the thought that the implicative structure (between corrigibility and intent alignment) seems to depend on how the AI is used, i.e. on the particulars of the user/overseer. For example if you have an intent-aligned AI and the user is careful about not deploying the AI in scenarios that would leave them disempowered, then that seems like a corrigible AI. So for this particular user, it seems like intent alignment implies corrigibility. Is that right?

The implicative structure might also be different depending on the capability of the AI, e.g. a dumb AI might have corrigibility and intent alignment equivalent, but the two concepts might come apart for more capable AI.

Comment by riceissa on My Understanding of Paul Christiano's Iterated Amplification AI Safety Research Agenda · 2020-10-05T20:35:46.581Z · LW · GW

IDA tries to prevent catastrophic outcomes by searching for a competitive AI that never intentionally optimises for something harmful to us and that we can still correct once it’s running.

I don't see how the "we can still correct once it’s running" part can be true given this footnote:

However, I think at some point we will probably have the AI system autonomously execute the distillation and amplification steps or otherwise get outcompeted. And even before that point we might find some other way to train the AI in breaking down tasks that doesn’t involve human interaction.

After a certain point it seems like the thing that is overseeing the AI system is another AI system and saying that "we" can correct the first AI system seems like a confusing way to phrase this situation. Do you think I've understood this correctly / what do you think?