Gems from the Wiki: Do The Math, Then Burn The Math and Go With Your Gut 2020-09-17T22:41:24.097Z
Plausible cases for HRAD work, and locating the crux in the "realism about rationality" debate 2020-06-22T01:10:23.757Z
Source code size vs learned model size in ML and in humans? 2020-05-20T08:47:14.563Z
How does iterated amplification exceed human abilities? 2020-05-02T23:44:31.036Z
What are some exercises for building/generating intuitions about key disagreements in AI alignment? 2020-03-16T07:41:58.775Z
What does Solomonoff induction say about brain duplication/consciousness? 2020-03-02T23:07:28.604Z
Is it harder to become a MIRI mathematician in 2019 compared to in 2013? 2019-10-29T03:28:52.949Z
Deliberation as a method to find the "actual preferences" of humans 2019-10-22T09:23:30.700Z
What are the differences between all the iterative/recursive approaches to AI alignment? 2019-09-21T02:09:13.410Z
Inversion of theorems into definitions when generalizing 2019-08-04T17:44:07.044Z
Degree of duplication and coordination in projects that examine computing prices, AI progress, and related topics? 2019-04-23T12:27:18.314Z
Comparison of decision theories (with a focus on logical-counterfactual decision theories) 2019-03-16T21:15:28.768Z
GraphQL tutorial for LessWrong and Effective Altruism Forum 2018-12-08T19:51:59.514Z
Timeline of Future of Humanity Institute 2018-03-18T18:45:58.743Z
Timeline of Machine Intelligence Research Institute 2017-07-15T16:57:16.096Z
LessWrong analytics (February 2009 to January 2017) 2017-04-16T22:45:35.807Z
Wikipedia usage survey results 2016-07-15T00:49:34.596Z


Comment by riceissa on Matt Goldenberg's Short Form Feed · 2020-12-05T20:48:04.866Z · LW · GW

That makes sense, thanks for clarifying. What I've seen most often on LessWrong is to come up with reasons for preferring simple interpretations in the course of trying to solve other philosophical problems such as anthropics, the problem of induction, and infinite ethics. For example, if we try to explain why our world seems to be simple we might end up with something like UDASSA or Scott Garrabrant's idea of preferring simple worlds (this section is also relevant). Once we have something like UDASSA, we can say that joke interpretations do not have much weight since it takes many more bits to specify how to "extract" the observer moments given a description of our physical world.

Comment by riceissa on The LessWrong 2019 Review · 2020-12-03T04:15:33.868Z · LW · GW

Thanks! That does make me feel a bit better about the annual reviews.

Comment by riceissa on The LessWrong 2019 Review · 2020-12-03T04:00:27.412Z · LW · GW

I see, that wasn't clear from the post. In that case I am wondering if the 2018 review caused anyone to write better explanations or rewrite the existing posts. (It seems like the LessWrong 2018 Book just included the original posts without much rewriting, at least based on scanning the table of contents.)

Comment by riceissa on The LessWrong 2019 Review · 2020-12-03T03:46:46.049Z · LW · GW

This is a minor point, but I am somewhat worried that the idea of research debt/research distillation seems to be getting diluted over time. The original article (which this post links to) says:

Distillation is also hard. It’s tempting to think of explaining an idea as just putting a layer of polish on it, but good explanations often involve transforming the idea. This kind of refinement of an idea can take just as much effort and deep understanding as the initial discovery.

I think the kind of cleanup and polish that is encouraged by the review process is insufficient to qualify as distillation (I know this post didn't use the word "distillation", but it does talk about research debt, and distillation is presented as the solution to debt in the original article), and to adequately deal with research debt.

There seems to be a pattern where a term is introduced first in a strong form, then it accumulates a lot of positive connotations, and that causes people to stretch the term to use it for things that don't quite qualify. I'm not confident that is what is happening here (it's hard to tell what happens in people's heads), but from the outside it's a bit worrying.

I actually made a similar comment a while ago about a different term.

Comment by riceissa on Introduction to Cartesian Frames · 2020-12-01T21:24:21.245Z · LW · GW

So the existence of this interface implies that A is “weaker” in a sense than A’.

Should that say B instead of A', or have I misunderstood? (I haven't read most of the sequence.)

Comment by riceissa on Matt Goldenberg's Short Form Feed · 2020-12-01T10:15:45.172Z · LW · GW

Have you seen Brian Tomasik's page about this? If so what do you find unconvincing, and if not what do you think of it?

Comment by riceissa on Daniel Kokotajlo's Shortform · 2020-11-24T05:48:29.630Z · LW · GW

Would this work across different countries (and if so how)? It seems like if one country implemented such a tax, the research groups in that country would be out-competed by research groups in other countries without such a tax (which seems worse than the status quo, since now the first AGI is likely to be created in a country that didn't try to slow down AI progress or "level the playing field").

Comment by riceissa on Embedded Interactive Predictions on LessWrong · 2020-11-23T00:33:49.642Z · LW · GW

Is there a way to see all the users who predicted within a single "bucket" using the LW UI? Right now when I hover over a bucket, it will show all users if the number of users is small enough, but it will show a small number of users followed by "..." if the number of users is too large. I'd like to be able to see all the users. (I know I can find the corresponding prediction on the Elicit website, but this is cumbersome.)

Comment by riceissa on Open & Welcome Thread – November 2020 · 2020-11-19T02:48:48.148Z · LW · GW

Ok. Since visiting your office hours is somewhat costly for me, I was trying to gather more information (about e.g. what kind of moral uncertainty or prior discussion you had in mind, why you decided to capitalize the term, whether this is something I might disagree with you on and might want to discuss further) to make the decision.

More generally, I've attended two LW Zoom events so far, both times because I felt excited about the topics discussed, and both times felt like I didn't learn anything/would have preferred the info to just be a text dump so I could skim and move on. So I am feeling like I should be more confident that I will find an event useful now before attending.

Comment by riceissa on Open & Welcome Thread – November 2020 · 2020-11-18T23:41:25.219Z · LW · GW

Is any of the stuff around Moral Uncertainty real? I think it’s probably all fake, but if you disagree, let’s debate!

Can you say more about this? I only found this comment after a quick search.

Comment by riceissa on Daniel Kokotajlo's Shortform · 2020-11-15T19:30:47.039Z · LW · GW

Oh :0

Comment by riceissa on Daniel Kokotajlo's Shortform · 2020-11-13T00:58:53.501Z · LW · GW

I find the conjunction of your decision to have kids and your short AI timelines pretty confusing. The possibilities I can think of are (1) you're more optimistic than me about AI alignment (but I don't get this impression from your writings), (2) you think that even a short human life is worth living/net-positive, (3) since you distinguish between the time when humans lose control and the time when catastrophe actually happens, you think this delay will give more years to your child's life, (4) your decision to have kids was made before your AI timelines became short. Or maybe something else I'm not thinking of? I'm curious to hear your thinking on this.

Comment by riceissa on DARPA Digital Tutor: Four Months to Total Technical Expertise? · 2020-11-03T23:39:08.674Z · LW · GW

Does anyone know if the actual software and contents for the digital tutor are published anywhere? I tried looking in the linked report but couldn't find anything like that there. I am feeling a bit skeptical that the digital tutor was teaching anything difficult. Right now I can't even tell if the digital tutor was doing something closer to "automate teaching people how to use MS Excel" (sounds believable) vs "automate teaching people real analysis given AP Calculus level knowledge of math" (sounds really hard, unless the people are already competent at self-studying).

Comment by riceissa on Responses to Christiano on takeoff speeds? · 2020-10-30T22:46:53.153Z · LW · GW

I am also interested in this.

Comment by riceissa on Responses to Christiano on takeoff speeds? · 2020-10-30T22:41:52.691Z · LW · GW

There was "My Thoughts on Takeoff Speeds" by tristanm.

Comment by riceissa on Considerations on Cryonics · 2020-10-16T21:49:27.449Z · LW · GW

Thanks! I think I would have guessed that the optimal signup age is around 35-55, so this motivates me to dig deeper into your model to see if I disagree with some parameter or modeling assumption (alternatively, I would be able to fix some mistaken intuition that I have). I've made a note to myself to come back to this when I have more free time.

Comment by riceissa on How much to worry about the US election unrest? · 2020-10-12T19:17:09.837Z · LW · GW

There was a similar question a few months ago: Plans / prepping for possible political violence from upcoming US election?

Comment by riceissa on The Alignment Problem: Machine Learning and Human Values · 2020-10-07T07:38:50.322Z · LW · GW

Does anyone know how Brian Christian came to be interested in AI alignment and why he decided to write this book instead of a book about a different topic? (I haven't read the book; I looked at the Amazon preview but couldn't find the answer there.)

Comment by riceissa on My Understanding of Paul Christiano's Iterated Amplification AI Safety Research Agenda · 2020-10-05T21:57:31.002Z · LW · GW

Here is part of Paul's definition of intent alignment:

In particular, this is the problem of getting your AI to try to do the right thing, not the problem of figuring out which thing is right. An aligned AI would try to figure out which thing is right, and like a human it may or may not succeed.

So in your first example, the partition seems intent aligned to me.

Comment by riceissa on My Understanding of Paul Christiano's Iterated Amplification AI Safety Research Agenda · 2020-10-05T21:18:07.341Z · LW · GW

HCH is the result of a potentially infinite exponential process (see figure 1) and thereby, computationally intractable. In reality, we can not break down any task into its smallest parts and solve these subtasks one after another because that would take too much computation. This is why we need to iterate distillation and amplification and cannot just amplify.

In general your post talks about amplification (and HCH) as increasing the capability of the system and distillation as saving on computation/making things more efficient. But my understanding, based on this conversation with Rohin Shah, is that amplification is also intended to save on computation (otherwise we could just try to imitate humans). In other words, the distillation procedure is able to learn more quickly by training on data provided by the amplified system compared to just training on the unamplified system. So I don't like the phrasing that distillation is the part that's there to save on computation, because both parts seem to be aimed at that.

(I am making this comment because I want to check my understanding with you, or make sure you understand this point, because it doesn't seem to be stated in your post. It was one of the most confusing things about IDA to me and I'm still not sure I fully understand it.)

Comment by riceissa on My Understanding of Paul Christiano's Iterated Amplification AI Safety Research Agenda · 2020-10-05T20:51:48.067Z · LW · GW

I still don't understand how corrigibility and intent alignment are different. If neither implies the other (as Paul says in his comment starting with "I don't really think this is true"), then there must be examples of AI systems that have one property but not the other. What would a corrigible but not-intent-aligned AI system look like?

I also had the thought that the implicative structure (between corrigibility and intent alignment) seems to depend on how the AI is used, i.e. on the particulars of the user/overseer. For example if you have an intent-aligned AI and the user is careful about not deploying the AI in scenarios that would leave them disempowered, then that seems like a corrigible AI. So for this particular user, it seems like intent alignment implies corrigibility. Is that right?

The implicative structure might also be different depending on the capability of the AI, e.g. a dumb AI might have corrigibility and intent alignment equivalent, but the two concepts might come apart for more capable AI.

Comment by riceissa on My Understanding of Paul Christiano's Iterated Amplification AI Safety Research Agenda · 2020-10-05T20:35:46.581Z · LW · GW

IDA tries to prevent catastrophic outcomes by searching for a competitive AI that never intentionally optimises for something harmful to us and that we can still correct once it’s running.

I don't see how the "we can still correct once it’s running" part can be true given this footnote:

However, I think at some point we will probably have the AI system autonomously execute the distillation and amplification steps or otherwise get outcompeted. And even before that point we might find some other way to train the AI in breaking down tasks that doesn’t involve human interaction.

After a certain point it seems like the thing that is overseeing the AI system is another AI system and saying that "we" can correct the first AI system seems like a confusing way to phrase this situation. Do you think I've understood this correctly / what do you think?

Comment by riceissa on Most Prisoner's Dilemmas are Stag Hunts; Most Stag Hunts are Battle of the Sexes · 2020-09-16T04:52:31.429Z · LW · GW

In the Alice/Bob diagrams, I am confused why the strategies are parameterized by the frequency of cooperation. Don't these frequencies depend on what the other player does, so that the same strategy can have different frequencies of cooperation depending on who the other player is?
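
To make the question concrete, here is a minimal sketch (an illustration only, not taken from the post). The strategy names and the `cooperation_frequency` helper are hypothetical, and the game is assumed to be a plain iterated Prisoner's Dilemma with moves "C" and "D". It shows one fixed strategy, tit-for-tat, ending up with very different cooperation frequencies depending on the opponent.

```python
from typing import Callable, List

# A strategy maps the opponent's past moves to the next move ("C" or "D").
Strategy = Callable[[List[str]], str]


def tit_for_tat(opponent_history: List[str]) -> str:
    # Cooperate first, then copy the opponent's previous move.
    return opponent_history[-1] if opponent_history else "C"


def always_cooperate(opponent_history: List[str]) -> str:
    return "C"


def always_defect(opponent_history: List[str]) -> str:
    return "D"


def cooperation_frequency(player: Strategy, opponent: Strategy, rounds: int = 100) -> float:
    """Fraction of rounds in which `player` cooperates against `opponent`."""
    player_moves: List[str] = []
    opponent_moves: List[str] = []
    for _ in range(rounds):
        # Both players choose simultaneously, seeing only past moves.
        p = player(opponent_moves)
        o = opponent(player_moves)
        player_moves.append(p)
        opponent_moves.append(o)
    return player_moves.count("C") / rounds


if __name__ == "__main__":
    print(cooperation_frequency(tit_for_tat, always_cooperate))  # 1.0
    print(cooperation_frequency(tit_for_tat, always_defect))     # 0.01
```

Under these assumptions the frequency is a property of the pair of strategies rather than of either strategy alone, which is the worry raised above.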

Comment by riceissa on Eli's shortform feed · 2020-09-13T04:43:46.666Z · LW · GW

I am curious how good you think the conversation/facilitation was in the AI takeoff double crux between Oliver Habryka and Buck Shlegeris. I am looking for something like "the quality of facilitation at that event was X percentile among all the conversation facilitation I have done".

Comment by riceissa on Covid 8/20: A Little Progress · 2020-08-21T05:21:45.131Z · LW · GW

Tyler Cowen would be distributing the money personally

According to Tyler Cowen's blog post about the saliva test, this grant was made via Fast Grants. From the Fast Grants homepage:

Who will make grant decisions?
A panel of biomedical scientists will make funding recommendations to Emergent Ventures.

The Fast Grants website does not mention Cowen, and his level of involvement is unclear to me. Some of the phrasing in your post like "Funded By Blogger’s Personal Fund" gave me the impression that Cowen was more involved in the decision-making process than I can find evidence for. I'm curious if you have more information on this.

Comment by riceissa on Considerations on Cryonics · 2020-08-03T23:02:20.723Z · LW · GW

Does this analysis take into account the fact that young people are most likely to die in ways that are unlikely to result in successful cryopreservation? If not, I'm wondering what the numbers look like if you re-run the simulation after taking this into account. As a young person myself, if I die in the next decade I think it is most likely to be from injury or suicide (neither of which seems likely to lead to successful cryopreservation), and this is one of the main reasons I have been cryocrastinating. See also this discussion.

Comment by riceissa on Open & Welcome Thread - July 2020 · 2020-07-13T00:32:15.645Z · LW · GW

GreaterWrong has a meta view:

I'm not sure how it's populated or if a similar page exists on LW.

Comment by riceissa on What are the high-level approaches to AI alignment? · 2020-06-17T00:21:06.321Z · LW · GW

Comment by riceissa on Open & Welcome Thread - June 2020 · 2020-06-08T09:42:50.079Z · LW · GW

“Consume rationalist and effective altruist content” makes sense but some more specific advice would be helpful, like what material to introduce, when, and how to encourage their interest if they’re not immediately interested. Have any parents done this and can share their experience?

I don't have kids (yet) and I'm planning to delay any potential detailed research until I do have kids, so I don't have specific advice. You could talk to James Miller and his son. Bryan Caplan seems to also be doing well in terms of keeping his sons' views similar to his own; he does homeschool, but maybe you could learn something from looking at what he does anyway. There are a few other rationalist parents, but I haven't seen any detailed info on what they do in terms of introducing rationality/EA stuff. Duncan Sabien has also thought a lot about teaching children, including designing a rationality camp for kids.

I can also give my own data point: Before discovering LessWrong (age 13-15?), I consumed a bunch of traditional rationality content like Feynman, popular science, online philosophy lectures, and lower quality online discourse like the xkcd forums. I discovered LessWrong when I was 14-16 (I don't remember the exact date) and read a bunch of posts in an unstructured way (e.g. I think I read about half of the Sequences but not in order), and concurrently read things like GEB and started learning how to write mathematical proofs. That was enough to get me to stick around, and led to me discovering EA, getting much deeper into rationality, AI safety, LessWrongian philosophy, etc. I feel like I could have started much earlier though (maybe 9-10?) and that it was only because of my bad environment (in particular, having nobody tell me that LessWrong/Overcoming Bias existed) and poor English ability (I moved to the US when I was 10 and couldn't read/write English at the level of my peers until age 16 or so) that I had to start when I did.

Comment by riceissa on Open & Welcome Thread - June 2020 · 2020-06-08T06:14:46.907Z · LW · GW

Do you think that having your kids consume rationalist and effective altruist content and/or doing homeschooling/unschooling are insufficient for protecting your kids against mind viruses? If so, I want to understand why you think so (maybe you're imagining some sort of AI-powered memetic warfare?).

Eliezer has a Facebook post where he talks about how being socialized by old science fiction was helpful for him.

For myself, I think the biggest factors that helped me become/stay sane were spending a lot of time on the internet (which led to me discovering LessWrong, effective altruism, Cognito Mentoring) and not talking to other kids (I didn't have any friends from US public school during grades 4 to 11).

Comment by riceissa on The Stopped Clock Problem · 2020-06-04T23:42:50.232Z · LW · GW

If randomness/noise is a factor, there is also regression to the mean when the luck disappears on the following rounds.
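
A rough simulation of this effect, as a sketch under stated assumptions: each forecaster's score is a fixed "skill" plus independent Gaussian "luck" each round, both with unit variance. The `simulate` function and its parameters are made up for illustration. The top scorers from round one are selected and their average is compared across the two rounds.

```python
import random
from statistics import mean


def simulate(num_players: int = 10_000, top_fraction: float = 0.1, seed: int = 0):
    """Return (round-1 mean, round-2 mean) for the top round-1 scorers."""
    rng = random.Random(seed)
    # Assumed model: score = persistent skill + fresh luck each round.
    skill = [rng.gauss(0, 1) for _ in range(num_players)]
    round1 = [s + rng.gauss(0, 1) for s in skill]
    round2 = [s + rng.gauss(0, 1) for s in skill]

    # Select the players who did best in round one.
    cutoff = int(num_players * top_fraction)
    top = sorted(range(num_players), key=lambda i: round1[i], reverse=True)[:cutoff]

    return mean(round1[i] for i in top), mean(round2[i] for i in top)


if __name__ == "__main__":
    first, second = simulate()
    print(f"top group, round 1: {first:.2f}; same group, round 2: {second:.2f}")
```

In this model the selected group still scores above average in round two (their skill persists) but noticeably below their round-one average (their luck does not).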

Comment by riceissa on Open & Welcome Thread - June 2020 · 2020-06-04T02:10:52.556Z · LW · GW

People I followed on Twitter for their credible takes on COVID-19 now sound insane. Sigh...

Are you saying that you initially followed people for their good thoughts on COVID-19, but (a) now they switched to talking about other topics (George Floyd protests?), and their thoughts are much worse on these other topics, (b) their thoughts on COVID-19 became worse over time, (c) they made some COVID-19-related predictions/statements that now look obviously wrong, so that what they previously said sounds obviously wrong, or (d) something else?

Comment by riceissa on Source code size vs learned model size in ML and in humans? · 2020-05-25T03:06:42.309Z · LW · GW

I'm not sure exactly what you're trying to learn here, or what debate you're trying to resolve. (Do you have a reference?)

I'm not entirely sure what I'm trying to learn here (which is part of what I was trying to express with the final paragraph of my question); this just seemed like a natural question to ask as I started thinking more about AI takeoff.

In "I Heart CYC", Robin Hanson writes: "So we need to explicitly code knowledge by hand until we have enough to build systems effective at asking questions, reading, and learning for themselves. Prior AI researchers were too comfortable starting every project over from scratch; they needed to join to create larger integrated knowledge bases."

It sounds like he expects early AGI systems to have lots of hand-coded knowledge, i.e. the minimum number of bits needed to specify a seed AI is large compared to what Eliezer Yudkowsky expects. (I wish people gave numbers for this so it's clear whether there really is a disagreement.) It also sounds like Robin Hanson expects progress in AI capabilities to come from piling on more hand-coded content.

If ML source code is small and isn't growing in size, that seems like evidence against Hanson's view.

If ML source code is much smaller than the human genome, I can do a better job of visualizing the kind of AI development trajectory that Robin Hanson expects, where we stick in a bunch of content and share content among AI systems. If ML source code is already quite large, then it's harder for me to visualize this (in this case, it seems like we don't know what we're doing, and progress will come from better understanding).

If the human genome is small, I think that makes a discontinuity in capabilities more likely. When I try to visualize where progress comes from in this case, it seems like it would come from a small number of insights. We can take some extreme cases: if we knew that the code for a seed AGI could fit in a 500-line Python program (I don't know if anybody expects this), a FOOM seems more likely (there's just less surface area for making lots of small improvements). Whereas if I knew that the smallest program for a seed AGI required gigabytes of source code, I feel like progress would come in smaller pieces.

If an algorithm uses data structures that are specifically suited to doing Task X, and a different set of data structures that are suited to Task Y, would you call that two units of content or two units of architecture?

I'm not sure. The content/architecture split doesn't seem clean to me, and I haven't seen anyone give a clear definition. Specialized data structures seem like a good example of something that's in between.

Comment by riceissa on NaiveTortoise's Short Form Feed · 2020-05-23T22:47:27.657Z · LW · GW

Somewhat related:

Last Thursday on the Discord we had people any% speedrunning and racing the Lean tutorial project. This fits very well into my general worldview: I think that doing mathematics in Lean is like solving levels in a computer puzzle game, the exciting thing being that mathematics is so rich that there are many many kinds of puzzles which you can solve.

Comment by riceissa on What are Michael Vassar's beliefs? · 2020-05-18T22:33:35.854Z · LW · GW

I've had this same question and wrote the Wikiquote page on Vassar while doing research on him.

See also this comment thread. The Harper's piece from that post also talks a lot about Vassar.

Comment by riceissa on Offer of collaboration and/or mentorship · 2020-05-17T21:00:05.988Z · LW · GW

I'm curious how this has turned out. Could you give an update (or point me to an existing one, in case I missed it)?

Comment by riceissa on How does iterated amplification exceed human abilities? · 2020-05-13T08:21:03.661Z · LW · GW

I'm confused about the tradeoff you're describing. Why is the first bullet point "Generating better ground truth data"? It would make more sense to me if it said instead something like "Generating large amounts of non-ground-truth data". In other words, the thing that amplification seems to be providing is access to more data (even if that data isn't the ground truth that is provided by the original human).

Also in the second bullet point, by "increasing the amount of data that you train on" I think you mean increasing the amount of data from the original human (rather than data coming from the amplified system), but I want to confirm.

Aside from that, I think my main confusion now is pedagogical (rather than technical). I don't understand why the IDA post and paper don't emphasize the efficiency of training. The post even says "Resource and time cost during training is a more open question; I haven’t explored the assumptions that would have to hold for the IDA training process to be practically feasible or resource-competitive with other AI projects" which makes it sound like the efficiency of training isn't important.

Comment by riceissa on Is AI safety research less parallelizable than AI research? · 2020-05-10T23:52:17.571Z · LW · GW

And I've seen Eliezer make the claim a few times. But I can't find an article describing the idea. Does anyone have a link?

Eliezer talks about this in Do Earths with slower economic growth have a better chance at FAI? e.g.

Relative to UFAI, FAI work seems like it would be mathier and more insight-based, where UFAI can more easily cobble together lots of pieces. This means that UFAI parallelizes better than FAI.

Comment by riceissa on How does iterated amplification exceed human abilities? · 2020-05-04T00:34:04.114Z · LW · GW

The addition of the distillation step is an extra confounder, but we hope that it doesn't distort anything too much -- its purpose is to improve speed without affecting anything else (though in practice it will reduce capabilities somewhat).

I think this is the crux of my confusion, so I would appreciate it if you could elaborate on this. (Everything else in your answer makes sense to me.) In Evans et al., during the distillation step, the model learns to solve the difficult tasks directly by using example solutions from the amplification step. But if it can do that, then why can't it also learn directly from examples provided by the human?

To use your analogy, I have no doubt that a team of Rohins or a single Rohin thinking for days can answer any question that I can (given a single day). But with distillation you're saying there's a robot that can learn to answer any question I can (given a single day) by first observing the team of Rohins for long enough. If the robot can do that, why can't the robot also learn to do the same thing by observing me for long enough?

Comment by riceissa on NaiveTortoise's Short Form Feed · 2020-04-24T21:34:23.664Z · LW · GW

I want to highlight a potential ambiguity, which is that "Newton's approximation" is sometimes used to mean Newton's method for finding roots, but the "Newton's approximation" I had in mind is the one given in Tao's Analysis I, Proposition 10.1.7, which is a way of restating the definition of the derivative. (Here is the statement in Tao's notes in case you don't have access to the book.)
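
For reference, the statement is roughly the following (paraphrased from memory, so check the book for the exact wording and hypotheses). Here $X \subseteq \mathbb{R}$, $x_0 \in X$ is a limit point of $X$, $f : X \to \mathbb{R}$, and $L \in \mathbb{R}$.

```latex
% Newton's approximation (paraphrase of Tao, Analysis I, Prop. 10.1.7).
% The following two statements are equivalent:
%
% (a) f is differentiable at x_0 on X with derivative L:
\[
  \lim_{x \to x_0;\; x \in X \setminus \{x_0\}} \frac{f(x) - f(x_0)}{x - x_0} = L .
\]
% (b) f is well approximated near x_0 by the linear function x \mapsto f(x_0) + L(x - x_0):
\[
  \forall \varepsilon > 0 \;\; \exists \delta > 0 \;\; \forall x \in X :\quad
  |x - x_0| \le \delta
  \;\Longrightarrow\;
  \bigl| f(x) - \bigl( f(x_0) + L (x - x_0) \bigr) \bigr| \le \varepsilon \, |x - x_0| .
\]
```

Unlike Newton's method, nothing here is iterative: (b) just rewrites the limit in (a) without the division, which is why it reads as a restatement of the definition of the derivative.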

Comment by riceissa on NaiveTortoise's Short Form Feed · 2020-04-17T03:18:19.152Z · LW · GW

I had a similar idea which was also based on an analogy with video games (where the analogy came from let's play videos rather than speedruns), and called it a live math video.

Comment by riceissa on Takeaways from safety by default interviews · 2020-04-04T01:23:42.211Z · LW · GW

What is the plan going forward for interviews? Are you planning to interview people who are more pessimistic?

Comment by riceissa on Categorization of Meta-Ethical Theories (a flowchart) · 2020-04-01T07:30:45.397Z · LW · GW

In the first categorization scheme, I'm also not exactly sure what nihilism is referring to. Do you know? Is it just referring to Error Theory (and maybe incoherentism)?

Yes, Huemer writes: "Nihilism (a.k.a. 'the error theory') holds that evaluative statements are generally false."

Usually non-cognitivism would fall within nihilism, no?

I'm not sure how the term "nihilism" is typically used in philosophical writing, but if we take nihilism=error theory then it looks like non-cognitivism wouldn't fall within nihilism (just like non-cognitivism doesn't fall within error theory in your flowchart).

I actually don't think either of these diagrams place Nihilism correctly.

For the first diagram, Huemer writes "if we say 'good' purports to refer to a property, some things have that property, and the property does not depend on observers, then we have moral realism." So for Huemer, nihilism fails the middle condition, so is classified as anti-realist. For the second diagram, see the quote below about dualism vs monism.

I'm not super well acquainted with the monism/dualism distinction, but in the common conception don't they both generally assume that morality is real, at least in some semi-robust sense?

Huemer writes:

Here, dualism is the idea that there are two fundamentally different kinds of facts (or properties) in the world: evaluative facts (properties) and non-evaluative facts (properties). Only the intuitionists embrace this.

Everyone else is a monist: they say there is only one fundamental kind of fact in the world, and it is the non-evaluative kind; there aren't any value facts over and above the other facts. This implies that either there are no value facts at all (eliminativism), or value facts are entirely explicable in terms of non-evaluative facts (reductionism).

Comment by riceissa on How special are human brains among animal brains? · 2020-04-01T06:42:50.783Z · LW · GW

It seems like "agricultural revolution" is used to mean both the beginning of agriculture ("First Agricultural Revolution") and the 18th century agricultural revolution ("Second Agricultural Revolution").

Comment by riceissa on Categorization of Meta-Ethical Theories (a flowchart) · 2020-03-30T20:19:12.364Z · LW · GW

Michael Huemer gives two taxonomies of metaethical views in section 1.4 of his book Ethical Intuitionism:

As the preceding section suggests, metaethical theories are traditionally divided first into realist and anti-realist views, and then into two forms of realism and three forms of anti-realism:

           Naturalism
          /
   Realism
  /       \
 /         Intuitionism
 \              Subjectivism
  \            /
   Anti-Realism -- Non-Cognitivism
               \
                Nihilism

This is not the most illuminating way of classifying positions. It implies that the most fundamental division in metaethics is between realists and anti-realists over the question of objectivity. The dispute between naturalism and intuitionism is then seen as relatively minor, with the naturalists being much closer to the intuitionists than they are, say, to the subjectivists. That isn't how I see things. As I see it, the most fundamental division in metaethics is between the intuitionists, on the one hand, and everyone else, on the other. I would classify the positions as follows:

   Dualism -- Intuitionism
 /                      Subjectivism
/                      /
\          Reductionism
 \        /            \
  \      /              Naturalism
   Monism
         \               Non-Cognitivism
          \             /
           Eliminativism
                        \
                         Nihilism

Comment by riceissa on Open & Welcome Thread - March 2020 · 2020-03-20T01:19:06.145Z · LW · GW

Do you have prior positions on relationships that you don’t want to get corrupted through the dating process, or something else?

I think that's one way of putting it. I'm fine with my prior positions on relationships changing because of better introspection (aided by dating), but not fine with my prior positions changing because they are getting corrupted.

Intelligence beyond your cone of tolerance is usually a trait that people pursue because they think it’s “ethical”

I'm not sure I understand what you mean. Could you try re-stating this in different words?

Comment by riceissa on Open & Welcome Thread - March 2020 · 2020-03-20T00:04:27.811Z · LW · GW

A question about romantic relationships: Let's say currently I think that a girl needs to have a certain level of smartness in order for me to date her long-term/marry her. Suppose I then start dating a girl and decide that actually, being smart isn't as important as I thought because the girl makes up for it in other ways (e.g. being very pretty/pleasant/submissive). I think this kind of change of mind is legitimate in some cases (e.g. because I got better at figuring out what I value in a woman) and illegitimate in other cases (e.g. because the girl I'm dating managed to seduce me and mess up my introspection). My question is, is this distinction real, and if so, is there any way for me to tell which situation I am in (legitimate vs illegitimate change of mind) once I've already begun dating the girl?

This problem arises because I think dating is important for introspecting about what I want, i.e. there is a point after which I can no longer obtain new information about my preferences via thinking alone. The problem is that dating is also potentially a values-corrupting process, i.e. dating someone who doesn't meet certain criteria I think I might have means that I can get trapped in a relationship.

I'm also curious to hear if people think this isn't a big problem (and if so, why).

Comment by riceissa on What are some exercises for building/generating intuitions about key disagreements in AI alignment? · 2020-03-16T23:52:35.883Z · LW · GW

I have only a very vague idea of what you mean. Could you give an example of how one would do this?

Comment by riceissa on Name of Problem? · 2020-03-09T23:09:54.493Z · LW · GW

I think that makes sense, thanks.

Comment by riceissa on Name of Problem? · 2020-03-09T22:30:12.772Z · LW · GW

Just to make sure I understand, the first few expansions of the second one are:

  • f(n)
  • f(n+1)
  • f((n+1) + 1)
  • f(((n+1) + 1) + 1)
  • f((((n+1) + 1) + 1) + 1)

Is that right? If so, wouldn't the infinite expansion look like f((((...) + 1) + 1) + 1) instead of what you wrote?
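
The pattern in the list above can be generated mechanically. This is a small sketch; the rewriting rule is inferred only from the five expansions listed here (each step wraps the previous argument in parentheses and appends "+ 1"), and the `expansions` helper is hypothetical.

```python
from typing import List


def expansions(count: int) -> List[str]:
    """Return the first `count` expansions, starting from f(n)."""
    args = ["n"]
    for _ in range(count - 1):
        prev = args[-1]
        # The first step is written "n+1"; later steps wrap and append " + 1".
        args.append("n+1" if prev == "n" else f"({prev}) + 1")
    return [f"f({arg})" for arg in args]


if __name__ == "__main__":
    for line in expansions(5):
        print(line)
```

Every expansion from the third onward ends in "+ 1)", while the innermost n+1 sits one level deeper each time. That is the sense in which the limit would look like f((((...) + 1) + 1) + 1), with the nesting growing inward rather than outward.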