Posts

Their romantic partner offering lots of value in other ways. I'm skeptical of this one because female partners are typically notoriously high maintenance in money, attention, and emotional labor. Sure, she might be great in a lot of ways, but it's hard for that to add up enough to outweigh the usual costs.

Imagine a woman is a romantic relationship with somebody else. Are they still so great a person that you would still enjoy hanging out with them as a friend? If not that woman should not be your girlfriend. Friendship first. At least in my model romantic stuff should be stacked ontop of platonic love.

Comment by Johannes C. Mayer (johannes-c-mayer) on Self's Shortform · 2025-02-05T10:02:48.349Z · LW · GW

Don't spend all your time compressing knowledge that's not that useful to begin with, if there are higher value things to be learned.

Comment by Johannes C. Mayer (johannes-c-mayer) on Johannes C. Mayer's Shortform · 2025-02-05T10:01:43.277Z · LW · GW

An extreme test of how much you trust a persons intention is to consider whether you would upload them, if you could. Then they would presumably be the only (speed) superintelligence.

Comment by Johannes C. Mayer (johannes-c-mayer) on I'm offering free math consultations! · 2025-01-27T16:35:28.215Z · LW · GW

Maybe better name: Let me help debug your math via programming

Comment by Johannes C. Mayer (johannes-c-mayer) on Exercise: Planmaking, Surprise Anticipation, and "Baba is You" · 2025-01-20T11:59:39.883Z · LW · GW

If you've tried this earnestly 3 times, after the 3rd time, I think it's fine to switch to just trying to solve the level however you want (i.e. moving your character around the screen, experimenting).

After you failed 3 times, wouldn't it be a better exercise to just play around in the level until you get a new pice of information that you predict will allow you to reformulate better plans, and then step back into planning mode again?

Comment by Johannes C. Mayer (johannes-c-mayer) on johnswentworth's Shortform · 2025-01-11T13:07:47.132Z · LW · GW

Another one: We manage to solve alignment to a significant extend. The AI who is much smarter than a human thinks that it is aligned, and takes aligned actions. The AI even predicts that it will never become unaligned to humans. However, at some point in the future as the AI naturally unrolles into a reflectively stable equilibrium it becomes unaligned.

Comment by Johannes C. Mayer (johannes-c-mayer) on The Field of AI Alignment: A Postmortem, and What To Do About It · 2024-12-31T17:09:28.196Z · LW · GW

Why not AI? Is it that AI alignment is too hard? Or do you think it's likely one would fall into the "try a bunch of random stuff" paradigm popular in AI, which wouldn't help much in getting better at solving hard problems?

What do you think about the strategy of instead of learning a textbook e.g. on information theory, or compilers you try to write the textbook and only look at existing material if you are really stuck. That's my primary learning strategy.

It's very slow and I probably do it too much, but it allows me to train to solve hard problems that aren't super hard. If you read all the text books all the practice problems remaining are very hard.

Comment by Johannes C. Mayer (johannes-c-mayer) on Vegans need to eat just enough Meat - emperically evaluate the minimum ammount of meat that maximizes utility · 2024-12-29T12:14:16.848Z · LW · GW

How about we meet, you do research, and I observe, and then try to subtly steer you, ideally such that you learn faster how to do it well. Basically do this, but without it being an interview.

Comment by Johannes C. Mayer (johannes-c-mayer) on The Field of AI Alignment: A Postmortem, and What To Do About It · 2024-12-28T22:25:41.131Z · LW · GW

What are some concrete examples of the of research that MIRI insufficiently engaged with? Are there general categories of prior research that you think are most underutilized by alignment researchers?

Comment by Johannes C. Mayer (johannes-c-mayer) on The Field of AI Alignment: A Postmortem, and What To Do About It · 2024-12-28T22:17:41.293Z · LW · GW

... and Carol's thoughts run into a blank wall. In the first few seconds, she sees no toeholds, not even a starting point. And so she reflexively flinches away from that problem, and turns back to some easier problems.

I spend ~10 hours trying to teach people how to think. I sometimes try to intentionally cause this to happen. Usually you can recognize it by them starting to be quiet (I usually give the instruction that they should do all their thinking out loud). And this seems to be when actual cognitive labor is happening, instead of saying things that you already knew. Though usually they by default fail earlier than "realizing the hard parts of ELK".

Usually I need to tell them that actually they are doing great by thinking about the black wall more, and shouldn't now switch the topic.

Infact it seem to be a good general idea generation strategy to just write down all the easy ideas first, until you hit this wall, such that you can start to actually think.

Why Physicists are competent

Here is my current model after thinking about this for 30 minutes of why physicists are good at solving hard problems (not ever having studied physics extensively myself).

The job description of a physicist is basically "understand the world", meaning make models that have predictive power over the real world.

This is very different from math. In some sense a lot harder. In math you know everything. There is no uncertainty. And you have a very good method to verify that you are correct. If you have generated a proof, it's correct. It's also different from computer science for similar reasons.

But of cause physicists need to be very skilled at math, because if you are not skilled at math you can't make good models that have predictive power. Similarly physicists need to be good at computer science, to implement physicsal simulations, which often involve complex algorithms. And to be able to actually implement these algorithms such that they are fast enough, and run at all, they need to also be decent at software engeneering.

Also understanding the scientific method is a lot more important when you are physicist. It's sort of not required to understand science for doing math and theoretical CS.

Another thing is that physicists need actually do things that work. You can do some random math that's not useful at all. It seems harder to make a random model of reality that predicts some aspect of reality that you couldn't predict before, and have you not figure out anything important. As a physicist you are actually measured by how reality is. You can't go "hmm maybe this just doesn't work" like in math. Obviously somehow it works because it's reality, you just haven't figured out how to properly capture how reality is in your model.

Perhaps this trains physicist to not give up on problems, because the default assumption is that clearly there must be some way to model some part of reality, because reality is in some sense already a model of itself.

I think this is the most important cognitive skill. Not giving up. I think this is much more important than any particular pice of technical knowledge. Having technical knowledge is of cause required, but it seems that if you where to not give up on thinking how to solve a problem (that is hard but important) would make you end up learning whatever is required.

And in some sense it is this simple. When I see people run into a wall, and then have them stare at a wall they often have ideas that I like so much that I feel the need to write them down.

I watched this video, and I semi trust this guy (more than anybody else) about not getting it completely wrong. So you can eat too much soy. But eating a bit is actually healthy, is my current model.

Here is also a calculation I did that it is possible to get all amino acids from soy without eating too much.

Haven't thought about, nor experimented with that. If you think clams would be ok to eat, you could perform the experiment yourself.

Comment by Johannes C. Mayer (johannes-c-mayer) on What Have Been Your Most Valuable Casual Conversations At Conferences? · 2024-12-25T08:14:51.589Z · LW · GW

At the 2024 LessWrong Community weekend I met somebody who I have been working with for perhaps 50 hours so far. They are better at certain programming related tasks than me, in a way provided utility. Before meeting them they where not even considering working on AI alignment related things. The conversation wen't something like this:

Johannes: What are you working on.
Other Person: Web development. What are you working on?
Johannes: I am trying to understand intelligence such that we can build a system that is capable enough to prevent other misaligned AI's from being build, and that we understand enough such that we can be sure that it wouldn't kill us. [...] Why are you not working on it? Other Person: (I forgot what he said)
Johannes: Oh then now is the perfect time to start working on it.
Other Person: So what are you actually doing.
Johannes: (Describes some methodologies.)
Other Person: (Questions whether these methodologies are actually good, and thinks about how they could be better.)
[...]

Actually this all happened after the event when traveling from the venue to the train station.

It doesn't happen that often that I get something really good out of a random meeting. Most of them are bad. However, I think the most important thing I do to get something out is to just immediately talk about the things that I am interested in. This efficiently filters out people, either because they are not interested, or because they can't talk it.

You can overdo this. Starting a conversation with "AI seems very powerful, I think it will likely destroy the world" can make other people feel awkward (I know from experience). However, the above formula of "what do you do" and then "and I do this" get's to the point very quickly without inducing awkwardness.

Basically you can think of this as making random encounters (like walking back to the train station with randomly sampled people) non-random by always trying to steer any encounter such that it becomes useful.

I probably did it badly. I would eat hole grain bread pretty regularly, but not consistently. I might not eat it for 1 week in a row sometimes. That was before I knew that amino acids are important.

It was ferritin. However the levels where actually barely within acceptable levels. I hypothesise that because I started to eat steamed blood for perhaps 2 weaks prior every day, and that blood contains a lot of heme iron, that I was deficient before.

I think running this experiment is generally worth it. It's very different to read a study and to run the experiment and see the effect yourself. You may also try to figure out if you are amino acid deficient. See this comment, as well as others in that comment stack.

The reason I mention chicken is that last time I ran this experiment with beef my body started to hurt really bad such that I woke up in the middle of the night. I am pretty sure that the beef was the reason. Maybe something weird was going on in my body at the same time. However, when I tried the same one week later with chicken I didn't have this issue.

I ate tons of beluga lentils. Sometimes 1kg (cooked) a day. That wasn't enough. However, now I switched to eating 600g (cooked) soybeans every day, and that was a very significant improvement (like solving the problem to 75% or so). Soy is a complete protein. Soy beans are also very cheap.

Comment by Johannes C. Mayer (johannes-c-mayer) on johnswentworth's Shortform · 2024-12-14T11:42:38.632Z · LW · GW

Note this 50% likely only holds if you are using a main stream language. For some non-main stream language I have gotten responses that where really unbelivably bad. Things like "the name of this variable wrong" which literally could never be the problem (it was a valid identifier).

And similarly, if you are trying to encode novel concepts, it's very different from gluing together libraries, or implementing standard well known tasks, which I would guess is what habryka is mostly doing (not that this is a bad thing to do).

Comment by Johannes C. Mayer (johannes-c-mayer) on johnswentworth's Shortform · 2024-12-14T11:39:11.503Z · LW · GW

Maybe you include this in "stack overflow substitute", but the main thing I use LLMs for is to understand well known technical things. The workflow is: 1) I am interested in understanding something, e.g. how a multiplexed barrel bit shifter works. 2) I ask the LLM to explain the concept. 3) Based on the initial response I create seperate conversation branches with questions I have (to save money and have the context be closer. Didn't evaluate if this actually makes the LLM better.). 4) Once I think I understood the concept or part of the concept I explain it to GPT. (Really I do this all the time during the entire process.) 5) The LLM (hopeful) corrects me if I am wrong (it seems it detects mistakes more often than not).

The last part of the conversation can then looks like this:

User: Ah, now I think I understand. You have mechanism that shifts by a fixed ammount (powers of 2). Each of these multiplexers can be set up such that the data flows through them, or though an ID wire, into the next multiplexer. Now by controlling whether the value flows through the multiplexer or the ID wire, we can "sum up the shifts". Specifically if we have shifters of 1, 2, 4, ..., 32, then we can achive any shift (for 64-bit numbers) by activating the appropirate multiplexers. This design has not 64^64 gates, but instead 64^6 gates. 64 gates for each multiplexer.

AI: Exactly, you've got the concept! The design you described efficiently uses a series of multiplexers to perform shifts by fixed amounts that are powers of two (1, 2, 4, ..., 32 for a 64-bit number). By selecting which multiplexers to activate, you can achieve any desired shift amount.

I had probably ~200,000 words worth of conversation with LLMs, mainly in this format.

I am not sure what next leap you are talking about. But I intuit based on some observations that GPT-4o is much better for this than GPT-3 (you might talk about more recent "leaps"). (Didn't test o1 extensively because it's so expensive).

Comment by Johannes C. Mayer (johannes-c-mayer) on Goal: Understand Intelligence · 2024-11-11T15:45:54.730Z · LW · GW

I totally agree with this. I expect the majority early AI researchers where falling into this trap. The main problem I am focusing on is how a mind can construct a model of the world in the first place.

Comment by Johannes C. Mayer (johannes-c-mayer) on Goal: Understand Intelligence · 2024-11-11T15:38:42.454Z · LW · GW

The goal is to have a system where there are no unlabeled parameters ideally. That would be the world modeling system. It then would build a world model that would have many unlabeled parameters. By understanding the world modeler system you can ensure that the world model has certain properties. E.g. there is some property (which I don't know) of how to make the world model not contain dangerous minds.

E.g. imagine the AI is really good at world modeling, and now it models you (you are part of the world) so accurately that you are now basically copied into the AI. Now you might try to escape the AI, which would actually be really good because then you could save the world as a speed intelligence (assuming the model of you would really accurate which is probably wouldn't be). But if it models another mind (maybe it considers dangerous adversaries) then maybe they could also escape, and would not be aligned.

By understanding the system you could put constraints on what world models can be generated, such that all generated world models can't contain such dangerous minds, or at least make such minds much less likely.

I propose that a more realistic example would be “classifying images via a ConvNet with 100,000,000 weights” versus “classifying images via 5,000,000 lines of Python code involving 1,000,000 nonsense variable names”. The latter is obviously less inscrutable on the margin but it’s not a huge difference.

Python code is a discrete structure. You can do proofs on more easily than for a NN. You could try to apply program transformations on it that preserve functional equality, trying to optimize for some measure of "human understandable structure". There are image classification alogrithms iirc that are worse than NN but much more interpretable, and these algorithms would at most be hundets of lines of code I guess (haven't really looked a lot at them).

Anyway, it’s fine to brainstorm on things like this, but I claim that you can do that brainstorming perfectly well by assuming that the world model is a Bayes net (or use OpenCog AtomSpace, or Soar, or whatever), or even just talk about it generically.

You give examples of recognizing problems. I tried to give examples of how you can solve these problems. I'm not brainstorming on "how could this system fail". Instead I understand something, and then I just notice without really trying, that now I can do a thing that seems very useful, like making the system not think about human psycology given certain constraints.

Probably I completely failed at making clear why I think that, because my explanation was terrible. In any case I think your suggested brainstorming this is completely different from the thing that I am actually doing.

To me it just seems that limiting the depth of a tree search is better that limiting the compute of a black box neural network. It seems like you can get a much better grip on what it means to limit the depth, and what this implies about the system behavior, when you actually understand how tree search works. Of cause tree search here is only an example.

Comment by Johannes C. Mayer (johannes-c-mayer) on Johannes C. Mayer's Shortform · 2024-11-11T15:17:59.449Z · LW · GW

Here. There is a method you can have. This is just a small pice of what I do. I also probably haven't figured out many important methodological things yet.

Also this is very important.

Comment by Johannes C. Mayer (johannes-c-mayer) on Goal: Understand Intelligence · 2024-11-09T19:25:24.584Z · LW · GW

John's post is quite wierd, because it only says true things, and implicitly implies a conclusion, namely that NNs are not less interpretable than some other thing, which is totally wrong.

Example: A neural network implements modular arithmetic with furier transforms. If you implement that furier algorithm in python, it's harder to understand for a human than the obvious modular arithmetic implementation in python.

It doesn't matter if the world model is inscruitable when looking directly at it, if you can change the generating code such that certain properties must hold. Figuring out what these properties is not directly solved by understading intelligence of cause.

This is bad because, if AGI is very compute-efficient, then when we have AGI at all, we will have AGI that a great many actors around the world will be able to program and run, and that makes governance very much harder.

This is bad because, if AGI is very compute-efficient, then when we have AGI at all, we will have AGI that a great many actors around the world will be able to program and run, and that makes governance very much harder.

Totally agree, so obviously try super hard to not leak the working AGI code if you had it.

But you won’t get insight into those distinctions, or how to ensure them in an AGI, by thinking about whether world-model stuff is stored as connections on graphs versus induction heads or whatever.

No you can. E.g. I could define theoretically a general algoritm that identifies the minimum concrepts neccesary, if I know enough about the structure of the system, specifically how concepts are stored, for solving a task. That's of cause not perfect, but it would seem that for very many problems it would make the AI unable to think about things like human manipulation, or that it is a constrained AI, even if that knowledge was somewhere in a learned black box world model. This is just an example of something you can do by knowing the structure of a system.

If your system is some plain code with for loops, just reduce the number the for loops of seach processes do. Now decreasing/incleasing the iterations somewhat will correspond to making the system dumber/smarter. Again obviously not solving the problem completely, but clearly a powerful thing to be able to do.

Of cause many low level details do not matter. Often you'd only care that something is a sequence, or a set. I am talking about a higher level program structure.

It feels like you are somewhat missing the point. The goal is to understand how intelligence works. Clearly that would be very useful for alignment? Even if you would get a blackbox world model. But of cause it would also enable you to think about how to make such a world model more interpretable. I think that is possible, it's just not what I am focusing on now.

Comment by Johannes C. Mayer (johannes-c-mayer) on Johannes C. Mayer's Shortform · 2024-11-09T19:03:39.295Z · LW · GW

I specifically am talking about solving problems that nobody knows the answer to, where you are probably even wrong about what the problem even is. I am not talking about taking notes on existing material. I am talking about documenting the process of generating knowledge.

I am saying that I forget important ideas that I generated in the past, probably they are not yet so refined that they are impossible to forget.

Comment by Johannes C. Mayer (johannes-c-mayer) on Goal: Understand Intelligence · 2024-11-07T08:14:24.231Z · LW · GW

A robust alignment scheme would likely be trivial to transform into an AGI recipe.

Perhaps if you did have the full solution, but it feels like that there are some things of a solution that you could figure out, such that that part of the solution doesn't tell you as much about the other parts of the solution.

And it also feels like there could be a book such that if you read it you would gain a lot of knowledge about how to align AIs without knowing that much more about how to build one. E.g. a theoretical solution to the stop button problem seems like it would not tell you that much about how to build an AGI compared to figuring out how to properly learn a world model of Minecraft. And knowing how to build a world model of minecraft probably helps a lot with solving the stop button problem, but it doesn't just trivially yield a solution.

Comment by Johannes C. Mayer (johannes-c-mayer) on Goal: Understand Intelligence · 2024-11-07T08:05:58.215Z · LW · GW

If you had a system with “ENTITY 92852384 implies ENTITY 8593483" it would be a lot of progress, as currently in neural networks we don't even understand the interal structures.

I want to have an algorithm that creates a world model. The world is large. A world model is uninterpretable by default through it's sheer size, even if you had interpretable but low level abels. By default we don't get any interpretable labels. I think there are ways to have generic dataprocessing procedures that don't talk about the human mind at all, that would yield more interpretable world model. Similar to how you could probably specify some very general property about python programs, such that that program becomes easier to understand by humans. E.g. a formalism of what it means that the control flow is straightforward: Don't use goto in C.

But even if you wouldn't have this, understanding the system still allows you to understand what the structure of the knowledge would be. It seems plausible that one could simply by understanding the system very well, make it such that the learned datastrucutres need to take particular shapes, such that these shapes correspond some relevant alignment properties.

In any case, it seems that this is a problem that any possible way to build an intelligence runs into? So I don't think it is a case against the project. When building an AI with NN you might not even think about that the interal representations might be wierd and alien (even for an LLM trained on human text)^[1], but the same problem persists.

^{^}
I haven't looked into this, or thought about at all, though that's what I expect.

Comment by Johannes C. Mayer (johannes-c-mayer) on Johannes C. Mayer's Shortform · 2024-11-07T07:54:05.922Z · LW · GW

You Need a Research Log

I definitely very often run into the problem that I forget why something was good to do in the first place. What are the important bits? Often I get sidetracked, and then the thing that I am doing seems not so got, so I stop and do something completely different. But then later on I realize that actually the original reason that led me down the path was good and that it would have been better to only backtrack a bit to the important piece. But often I just don't remember the important piece in the moment.

E.g. I think that having some kind of linking structure in your world model, that links objects in the model to the real world is important such that you can travel backward on the links to identify where exactly in your world model the error is. Then I go off and construct some formalism for a bit, but before I got to the point of adding the links I forgot that that was the original motivation, and so I just analyzed the model for a couple of hours before realizing that I still haven't added the linking structure. So it even happens during the same research session for me if I am not careful. And if you want to continue the next day, or a week later, having organized your thoughts in a way that isn't so painful to go through that you won't do it is extremely helpful.

I recognized a couple of things as important so far for being able to do it correctly:

Make it fun to make the notes. If you can't make this information processing activity fun you basically can't do it.

My brain somehow seems to like doing it much more when I put all the notes on a website.
Also taking lots of ADHD medication helps.

Make the notes high quality enough such that they are readable, instead of a wall of garbage text.

Writing thoughts mainly on a whiteboard, and analog journals (including reflection) seems to help a lot (in general actually).

Integrate note-taking tightly into your research workflow.

Don't rely on postprocessing, i.e. having a separate step of producing research notes. At least I didn't manage to get this to work at all so far. As much as possible make the content you produce in the first place as good as possible (analog tools help a lot with this). That means writing up notes and reflections as you are working, not at some time later (which never actually comes).

Comment by Johannes C. Mayer (johannes-c-mayer) on Goal: Understand Intelligence · 2024-11-05T00:08:38.289Z · LW · GW

I'd think you can define a tedrahedron for non-euclidean space. And you can talk about and reason about a set of polyhedra with 10 verticies as an abstract object without talking or defining any specific such polyhedra.

Just consider if you take the assumption that the system would not change in arbitrary ways in response to it's environment. There might be certain constrains. You can think about what the constrains need to be such that e.g. a self modifying agent would never change itself such that it would expect that in the future it would get less utility than if it would not selfmodify.

And that is just a random thing that came to mind without me trying. I would expect that you can learn useful things about alignment by thinking about such things. Infact the line between understanding intelligence and figuring out alignment in advance really doesn't exist I think. Clearly understanding something about alignment is understanding something about intelligence.

When people say to only figure out alignment thing, maybe what they mean is to figure out things about intelligence that won't actually get you much closer to being able to build a dangerous intelligence. And there do seem to be such things. It is just that I expect that just trying to work on these will not actually make you generate the most useful models about intelligence in your mind, making you worse/slower at thinking on average per unit of time working.

And that's of cause not a law. Probably there are some things that you want to understand through an abstract theoretical lens at certain points in time. Do whatever works best.

Comment by Johannes C. Mayer (johannes-c-mayer) on Seeking Collaborators · 2024-11-04T23:17:20.208Z · LW · GW

The way I would approach this problem (after not much thought): Come up with a concrete system architecture A of a maimizing computer program that has an explicit utility function, and is known to behave optimally. E.g. maybe it plays tic tac toe or 4-in a row optimally.

Now mutate the source code of A slightly such that it is no longer optimal to get a system B. The objective is not modified. Now B still "wants" to basically be A, in the sense that if it is a general enough optimizer and has access to selfmodification facilities, it would try to make itself be A, because A is better at optimizing the objective.

I predict by creating a setup where the delta between B and A is small, you can create a tractable problem, without sidestepping the core bottlecks, i.e. solving "correct selfmodification" for small delta between A and B, seems like it needs to solve some hard part of the problem. Once you solved it increase the delta, and solve it again.

Unsure about the exact setup for giving the systems the ability to selfmodify. I intuit one can construct a toy setup that can generate good insight such that B doesn't actually need to be that powerful, or that general of an optimizer.

Comment by Johannes C. Mayer (johannes-c-mayer) on Goal: Understand Intelligence · 2024-11-04T23:02:03.141Z · LW · GW

To me it seems that understanding how a system that you are building actually works (i.e. have good models about its internal) is the most basic requirement to be able to reason about the system coherently at all.

Yes if you'd actually understood how intelligence works in a deep way you don't automatically solve alignment. But it sure will make it a lot more tractable in many ways. Especially when only aiming for a pivotal act.

I am pretty sure you can figure out alignment in advance as you suggest. That might be the overall saver route... if we didn't have coordination problems. But it seems slower, and we don't have time.

Obviously, if you figure out the intelligence algorithm before you know how to steer it, don't put it on GitHub or the universe's fate will be sealed momentarily. Ideally don't even run it at all.

So far working on this project seems to have created ontologies in my brain that are good for thinking about alignment. There are a couple of approaches that now seem obvious, which I think wouldn't seem obvious before. Again having good models about intelligence (which is really what this is about) is actually useful for thinking about intelligence. And Alignment research is mainly thinking about intelligence.

The approach many people take of trying to pick some alignment problem seems somewhat backward to me. E.g. embedded agency is a very important problem, and you need to solve it at some point. But it doesn't feel like the problem such that when you work on it, you build up the most useful models of intelligence in your brain.

As an imperfect analogy consider trying to understand how a computer works by first understanding how modern DRAM works. To build a practical computer you might need to use DRAM. But in principle, you could build a computer with only S-R latch memory. So clearly while important it is not at the very core. First, you understand how NAND gates work, the ALU, and so on. Once you have a good understanding of the fundamentals, DRAM will be much easier to understand. It becomes obvious how it needs to work at a high level: You can write and read bits. If you don't understand how a computer works you might not even know why storing a bit is an important thing to be able to do.

Comment by Johannes C. Mayer (johannes-c-mayer) on johnswentworth's Shortform · 2024-10-31T13:50:33.928Z · LW · GW

It's becomes more interresting when the people constrain their output based on what they expect is true information that the other person does not yet know. It's useful to talk to an expert, who tells you a bunch of random stuff they know that you don't.

Often some of it will be useful. This only works if they understand what you have said though (which presumably is something that you are interested in). And often the problem is that people's models about what is useful are wrong. This is especially likely if you are an expert in something. Then the thing that most people will say will be worse what you would think on the topic. This is especially bad if the people can't immediately even see why what you are saying is right.

The best strategy around this I have found so far is just to switch the topic to the actually interesting/important things. Suprisingly usually people go along with it.

Comment by Johannes C. Mayer (johannes-c-mayer) on Whiteboard Pen Magazines are Useful · 2024-10-14T09:57:41.865Z · LW · GW

2024-10-14 Added the "FUUU 754 extensions M and S" section.

Comment by Johannes C. Mayer (johannes-c-mayer) on Whiteboard Pen Magazines are Useful · 2024-10-14T09:56:58.140Z · LW · GW

Update History

Comment by Johannes C. Mayer (johannes-c-mayer) on Why is o1 so deceptive? · 2024-09-28T12:14:19.644Z · LW · GW

It seems potentially important to compare this to GPT4o. In my experience when asking GPT4 for research papers on particular subjects it seemed to make up non-existent research papers (at least I didn't find them after multiple minutes of searching the web). I don't have any precise statistics on this.

Comment by Johannes C. Mayer (johannes-c-mayer) on We Don't Know Our Own Values, but Reward Bridges The Is-Ought Gap · 2024-09-20T20:54:51.631Z · LW · GW

Yes exactly. The larva example illustrates that there are different kinds of values. I thought it was underexplored in the OP to characterize exactly what these different kinds of values are.

In the sadist example we have:

the hardcoded pleasure of hurting people.
And we have, let's assume, the wish to make other people happy.

These two things both seem like values. However, they seem to be qualitatively different kinds of values. I intuit that more precisely characterizing this difference is important. I have a bunch of thoughts on this that I failed to write up so far.

Comment by Johannes C. Mayer (johannes-c-mayer) on We Don't Know Our Own Values, but Reward Bridges The Is-Ought Gap · 2024-09-20T10:58:36.106Z · LW · GW

reward is the evidence from which we learn about our values

A sadist might feel good each time they hurt somebody. I am pretty sure it is possible for a sadist to exist who does not endorse hurting people, meaning they feel good if they hurt people, but they avoid it nonetheless.

So to what extent is hurting people a value? It's like the sadist's brain tries to tell them that they ought to want to hurt people, but they don't want to. Intuitively the "they don't want to" seems to be the value.

Comment by Johannes C. Mayer (johannes-c-mayer) on Johannes C. Mayer's Shortform · 2024-09-08T19:55:16.258Z · LW · GW

Any n-arity function can be simulated with an an (n+1)-arity predicate. Let a and b be constants. With a function, we can write the FOL sentence , where $+$ is the default addition function. We can write the same as $\forall x, (+_{P} (a, b, x) \to x > 1)$ where $+_{P}$ is now a predicate that returns true iff $a$ added to $b$ is $x$ .

Comment by Johannes C. Mayer (johannes-c-mayer) on Johannes C. Mayer's Shortform · 2024-09-08T18:56:46.963Z · LW · GW

How to Sleep

Here are a few observations I have made when it comes to going to bed on time.

Bedtime Alarms

I set up an alarm that reminds me when my target bedtime has arrived. Many times when I am lost in an activity, the alarm makes me remember that I made the commitment to go to bed on time.

I only allow myself to dismiss the alarm when I lay down in bed. Before laying down I am only allowed to snooze it for 8 minutes. To dismiss the alarm I need to solve a puzzle which takes 10s, making dismissing more convenient. Make sure to carry your phone around with you at bedtime.

This is probably the single best thing I have done to improve my sleep hygiene.

Avoid Hard to Stop Activities

It is hard for me to go to bed when doing any engaging activity that I just want to finish up. For example:

Finishing up some Nixos, xmonad, exwm, etc. configuration.
Programming such that I get something working.
Watch a video and feel I need to watch it to the end.

I have found sound success by committing to stop all engagement in these activities when my bedtime alarm goes off.

Don't Fail by Abandon

Once I get past my bedtime by a bit, I am likely to go past my bedtime by a lot.

Somehow it feels like I have already lost. "Did I go to bed on time" is binary.

[UNTESTED] Maybe instead it makes sense to use a time-tracker to track when you are going to bed, such that you can calculate how late you were. Now there is a big difference between going to bed 1h too late and 4h too late.

[UNTESTED] Potentially one could use a sleep right that then automatically records when you sleep. Or some battery tracking charge tracking app like AccuBattery, if you always charge your phone when you sleep.

[UNTESTED] Try to sleep

At the target time, try to sleep for 5-15 minutes. If you can't sleep, you are allowed to get back up. You can use a very subtle self dismissing alarm for notification.

Comment by Johannes C. Mayer (johannes-c-mayer) on quila's Shortform · 2024-09-03T14:17:51.449Z · LW · GW

Consider all the programs that encode uncomputable numbers up to $n$ digits. There are infinitely many of these programs. Now consider the set of programs $P^{'} := {call-10-times (p) | p \in P}$ . Each program in P' has some pattern. But it's always a different one.

Comment by Johannes C. Mayer (johannes-c-mayer) on Johannes C. Mayer's Shortform · 2024-09-03T14:07:23.987Z · LW · GW

You need the right relationship with confusion. By default confusion makes you stop your thinking. Being confused feels like you are doing something wrong. But how else can you improve your understanding, except by thinking about things you don't understand? Confusion tells you that you don't yet understand. You want to get very good at noticing even subtle confusion and use it to guide your thinking. However, thinking about confusing things isn't enough. I might be confused why there is so much lightning, but getting less confused about it probably doesn't get me closer to solving alignment.

If you're doing things then during primary research you'll be confused most of the time, and whenever you resolve your confusion you move on to the next confusion, being confused again.

Comment by Johannes C. Mayer (johannes-c-mayer) on Johannes C. Mayer's Shortform · 2024-09-02T11:30:33.396Z · LW · GW

Here is an AI called GameNGen that generates a game in real-time as the player interacts with the model. (It simulates doom at >20fps.) It uses a diffusion model. People are only slightly better than random chance at identifying if it was generated by the AI or by the Doom program.

Comment by Johannes C. Mayer (johannes-c-mayer) on Johannes C. Mayer's Shortform · 2024-08-30T14:01:06.251Z · LW · GW

There are muscles in your nose I just realized. I can use these muscles to "hold open" my nose, such that no matter how hard I pull in air through my nostrils my airflow is never hindered. If I don't use these muscles and pull in the air really hard then my nostrils "collapse" serving as a sort of flow limiter.

Comment by Johannes C. Mayer (johannes-c-mayer) on Johannes C. Mayer's Shortform · 2024-08-24T11:26:08.737Z · LW · GW

The next time you buy a laptop, and you don't want a Mac, it's likely you want to buy one with a snapdragon CPU. That's an ARM chip, meaning you get very good battery life (just like the M-series Apple chips). On Snapdragon though you can easily run Windows, and eventually Linux (Linux support is a few months out though).

Comment by Johannes C. Mayer (johannes-c-mayer) on What is "True Love"? · 2024-08-19T10:18:16.812Z · LW · GW

IMO the most important factor in interpersonal relations is that it needs to be possible to have engaging/useful conversations. There are many others.

The problem: Somebody who scores low on these, can be pushed up unreasonably high in your ranking through feelings of sexual desire.

The worst thing: Sexual desire drops temporarily in the short term after orgasm, and (I heard) permanently after a 2-year period.

To probe the nature of your love:

If you like women imagine the other person is a man. Did your liking shift? If you like men... etc.
- Generally imagine them being physically unattractive, e.g. them being a giant slime monster.
Masturbate and check how much your liking shifted immediately after orgasm.

This helps disentangle lust and love.

Everlasting Honeymoon

I heard some people never leave the "honeymoon phase". The initial strong feelings of love persist indefinitely. IIRC scientists determined this by measuring oxytocin or something in couples married for decades. Possibly genetic, so it's not re-creatable.

If the person is a good fit, there's perhaps nothing wrong with loving them even more on top of that.

Appearance can be compelling in non-primary-sexual ways. Porn is closed, NEPPUU opened.

Comment by Johannes C. Mayer (johannes-c-mayer) on Johannes C. Mayer's Shortform · 2024-08-14T12:48:18.909Z · LW · GW

Typst is better than Latex

I started to use Typst. I feel a lot more productive in it. Latex feels like a slug. Typst doesn't feel like it slows me down when typing math, or code. That and the fact that it has an online collaborative editor, and that rendering is very very fast are the most important features. Here are some more:

It has an online collaborative editor.
It compiles instantly (at least for my main 30-page document)
The online editor has Vim support.
It's free.
It can syntax highlight lots of languages (e.g. LISP and Lean3 are supported).
It's embedded scripting language is much easier to use than Latex Macros.
The paid version has Google Doc-style comment support.
It's open source and you can compile documents locally, though the online editor is closed source.

Here is a comparison of encoding the games of life in logic:

Latex

$$
\forall i, j \in \mathbb{Z}, A_{t+1}(i, j) = \begin{cases}
                                0 &\text{if} \quad A_t(i, j) = 1 \land N_t(i, j) < 2 \\
                                1 &\text{if} \quad A_t(i, j) = 1 \land N_t(i, j) \in \{2, 3\} \\
                                0 &\text{if} \quad A_t(i, j) = 1 \land N_t(i, j) > 3 \\
                                1 &\text{if} \quad A_t(i, j) = 0 \land N_t(i, j) = 3 \\
                                0 &\text{otherwise}
                              \end{cases}
$$

Typst

$
forall i, j in ZZ, A_(t+1)(i, j) = cases(
                                0 "if" A_t(i, j) = 1 and N_t(i, j) < 2 \
                                1 "if" A_t(i, j) = 1 and N_t(i, j) in {2, 3} \
                                0 "if" A_t(i, j) = 1 and N_t(i, j) > 3 \
                                1 "if" A_t(i, j) = 0 and N_t(i, j) = 3 \
                                0 "otherwise")
$

Typst in Emacs Org Mode

Here is some elisp to treat latex blocks in emacs org-mode as typst math, when exporting to HTML (renders/embeds as SVG images):

;;;; Typst Exporter
;;; This exporter requires that you have inkscape and typst in your path.
;;; Call org-typst-enabled-html-export

;;; TODO
;;; - Error if inskape or typst is not installed.
;;; - Make it such that it shows up in the org-dispatch exporter and we can
;;;   automatically not export only to output.html.
;;; - Automatically setup the HTML header, and possible also automatically start the server as described in: [[id:d9f72e91-7e8d-426d-af46-037378bc9b15][Setting up org-typst-html-exporter]]
;;; - Make it such that the temporary buffers are deleted after use.


(require 'org)
(require 'ox-html) ; Make sure the HTML backend is loaded

(defun spawn-trim-svg (svg-file-path output-file-path)
  (start-process svg-file-path
		 nil
		 "inkscape"
		 svg-file-path
		 "--export-area-drawing"
		 "--export-plain-svg"
		 (format "--export-filename=%s" output-file-path)))

(defun correct-dollar-sings (typst-src)
  (replace-regexp-in-string "\\$\\$$"
			    " $" ; Replace inital $$ with '$ '
			    (replace-regexp-in-string "^\\$\\$" "$ " ; same for ending $$
						      typst-src)))

(defun math-block-p (typst-src)
  (string-match "^\\$\\$\\(\\(?:.\\|\n\\)*?\\)\\$\\$$" typst-src))

(defun html-image-centered (image-path)
  (format "<div style=\"display: flex; justify-content: center; align-items: center;\">\n<img src=\"%s\" alt=\"Centered Image\">\n</div>" image-path))

(defun html-image-inline (image-path)
  (format " <img hspace=3px src=\"%s\"> " image-path))

(defun spawn-render-typst (file-format input-file output-file)
  (start-process input-file nil "typst" "compile" "-f" file-format input-file output-file))

(defun generate-typst-buffer (typst-source)
  "Given typst-source code, make a buffer with this code and neccesary preamble."
  (let ((buffer (generate-new-buffer (generate-new-buffer-name "tmp-typst-source-buffer"))))
    (with-current-buffer buffer
      (insert "#set text(16pt)\n")
      (insert "#show math.equation: set text(14pt)\n")
      (insert "#set page(width: auto, height: auto)\n")1
      (insert typst-source))
    buffer))
  
(defun embed-math (is-math-block typst-image-path)
    (if is-math-block
	(html-image-centered typst-image-path)
        (html-image-inline typst-image-path)))

(defun generate-math-image (output-path typst-source-file)
  (let* ((raw-typst-render-output (make-temp-file "my-temp-file-2" nil ".typ")))
    (spawn-render-typst file-format typst-source-file raw-typst-render-output)
    (spawn-trim-svg raw-typst-render-output typst-image-path)))

(defun my-typst-math (latex-fragment contents info)
  ;; Extract LaTeX source from the fragment's plist
  (let* ((typst-source-raw (org-element-property :value latex-fragment))
	 (is-math-block (math-block-p typst-source-raw))
	 (typst-source (correct-dollar-sings typst-source-raw))
	 (file-format "svg") ;; This is the only supported format.
         (typst-image-dir (concat "./typst-svg"))
	 (typst-buffer (generate-typst-buffer typst-source)) ; buffer of full typst code to render
	 (typst-source-file (make-temp-file "my-temp-file-1" nil ".typ"))
	 ;; Name is unique for every typst source we render to enable caching.
	 (typst-image-path (concat typst-image-dir "/"
				   (secure-hash 'sha256 (with-current-buffer typst-buffer (buffer-string)))
				   "." file-format)))
    ;; Only render if neccesary
    (unless (file-exists-p typst-image-path)
      (message (format "Rendering: %s" typst-source))
      ;; Write the typst code to a file
      (with-current-buffer typst-buffer
	(write-region (point-min) (point-max) typst-source-file))
      (generate-math-image typst-image-path typst-source-file))
    (kill-buffer typst-buffer)
    (embed-math is-math-block typst-image-path)))

(org-export-define-derived-backend 'my-html 'html
    :translate-alist '((latex-fragment . my-typst-math))
    :menu-entry
    '(?M "Export to My HTML"
	((?h "To HTML file" org-html-export-to-html))))

;; Ensure org-html-export-to-html is bound correctly to your backend:
(defun org-html-export-to-html-with-typst (&optional async subtreep visible-only body-only ext-plist)
  (interactive)
  (let* ((buffer-file-name (buffer-file-name (window-buffer (minibuffer-selected-window))))
	 (html-output-name (concat (file-name-sans-extension buffer-file-name) ".html")))
    (org-export-to-file 'my-html html-output-name
      async subtreep visible-only body-only ext-plist)))

(setq org-export-backends (remove 'html org-export-backends))
(add-to-list 'org-export-backends 'my-html)

Simply eval this code and then call org-html-export-to-html-with-typst.

Comment by Johannes C. Mayer (johannes-c-mayer) on Being at peace with Doom · 2024-08-09T13:58:10.952Z · LW · GW

Now I need to link the Always Look on the Bright Side of Life song.

Comment by Johannes C. Mayer (johannes-c-mayer) on My Advice for Incoming SERI MATS Scholars · 2024-08-01T18:21:39.665Z · LW · GW

Probably not useful but just in case here are some other medications that are prescribed for narcolepsy (i.e. stuff that makes you not tired):

Solriamfetol is supposed to be more effective than Modafinil. Possibly hard to impossible to get without a prescription. Haven't tried that yet.

Pitolisant is interesting because it has a novel mechanism of action. Possibly impossible to get even with a prescription, as it is super expensive if you don't have the right health insurance. For me, it did not work that well. Only lasted 2-4 hours, and taking multiple doses makes me not be able to sleep.

Comment by Johannes C. Mayer (johannes-c-mayer) on My Advice for Incoming SERI MATS Scholars · 2024-07-24T21:17:03.767Z · LW · GW

I am now diagnosed with sleep apnea and type 2 narcolepsy. CPAP and a Modafinil prescription seem to help pretty well so far. You were the first iirc to point me in that direction, so thank you. Any things that helped you that I did not list?

Comment by Johannes C. Mayer (johannes-c-mayer) on johnswentworth's Shortform · 2024-07-24T20:58:17.800Z · LW · GW

For the reasonable price of $300 dollars per month, I insure anybody against the destruction of the known world. Should the world be destroyed by AGI I'll give you your money back fold.

That said, if there were insurers, they would probably be more likely than average to look into AI X-risk. Some might then be convinced that it is important and that they should do something about it.

User info