Posts
Comments
I totally agree with this. I expect the majority of early AI researchers were falling into this trap. The main problem I am focusing on is how a mind can construct a model of the world in the first place.
Ideally, the goal is to have a system with no unlabeled parameters. That would be the world-modeling system. It would then build a world model that has many unlabeled parameters. By understanding the world-modeler system you can ensure that the world model has certain properties. E.g. there is some property (which I don't know) of how to make the world model not contain dangerous minds.
E.g. imagine the AI is really good at world modeling, and now it models you (you are part of the world) so accurately that you are basically copied into the AI. Now you might try to escape the AI, which would actually be really good, because then you could save the world as a speed intelligence (assuming the model of you were really accurate, which it probably wouldn't be). But if it models another mind (maybe it considers dangerous adversaries), then maybe they could also escape, and they would not be aligned.
By understanding the system you could put constraints on what world models can be generated, such that all generated world models can't contain such dangerous minds, or at least make such minds much less likely.
I propose that a more realistic example would be “classifying images via a ConvNet with 100,000,000 weights” versus “classifying images via 5,000,000 lines of Python code involving 1,000,000 nonsense variable names”. The latter is obviously less inscrutable on the margin but it’s not a huge difference.
Python code is a discrete structure. You can do proofs on it more easily than on a NN. You could try to apply program transformations to it that preserve functional equality, optimizing for some measure of "human understandable structure". There are image classification algorithms, iirc, that perform worse than NNs but are much more interpretable, and these algorithms would be at most hundreds of lines of code I guess (I haven't really looked at them much).
Anyway, it’s fine to brainstorm on things like this, but I claim that you can do that brainstorming perfectly well by assuming that the world model is a Bayes net (or use OpenCog AtomSpace, or Soar, or whatever), or even just talk about it generically.
You give examples of recognizing problems. I tried to give examples of how you can solve these problems. I'm not brainstorming on "how could this system fail". Instead I understand something, and then I just notice, without really trying, that now I can do a thing that seems very useful, like making the system not think about human psychology given certain constraints.
Probably I completely failed at making clear why I think that, because my explanation was terrible. In any case, I think the brainstorming you suggest is completely different from the thing that I am actually doing.
To me it just seems that limiting the depth of a tree search is better than limiting the compute of a black box neural network. It seems like you can get a much better grip on what it means to limit the depth, and what this implies about the system's behavior, when you actually understand how tree search works. Of course, tree search here is only an example.
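To illustrate what I mean, here is a toy sketch (the game, names, and numbers are my own illustration, not anything from the discussion above): in an explicit tree search, "make the system dumber" is one parameter whose meaning you can state precisely.

def negamax(pile, max_depth):
    """Value of a single-pile Nim position (take 1 or 2 stones; taking the
    last stone wins), from the perspective of the player to move, searching
    at most max_depth plies. Positions the cap leaves unresolved score 0."""
    if pile == 0:
        return -1  # the previous player took the last stone, so we lost
    if max_depth == 0:
        return 0   # depth cap reached: unknown, treat as neutral
    return max(-negamax(pile - take, max_depth - 1)
               for take in (1, 2) if take <= pile)

# With a shallow cap the searcher can no longer resolve the winning line.
print(negamax(7, max_depth=2), negamax(7, max_depth=10))

Here max_depth is a knob you actually understand: it bounds how far ahead the system can look.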
Here. There is a method you can have. This is just a small piece of what I do. I also probably haven't figured out many important methodological things yet.
Also this is very important.
John's post is quite weird, because it only says true things, yet implicitly implies a conclusion, namely that NNs are not less interpretable than some other thing, which is totally wrong.
Example: A neural network implements modular arithmetic with Fourier transforms. If you implement that Fourier algorithm in Python, it's harder for a human to understand than the obvious modular arithmetic implementation in Python.
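For concreteness, here is a rough sketch of that contrast in Python (the Fourier-style version below is my own illustration of the idea, not the exact algorithm such a network learns):

import cmath

def add_mod_obvious(a, b, p):
    return (a + b) % p

def add_mod_fourier(a, b, p):
    # Represent each residue as a rotation on the unit circle, compose the
    # rotations, then read the resulting angle back off as a residue.
    rotation = cmath.exp(2j * cmath.pi * a / p) * cmath.exp(2j * cmath.pi * b / p)
    angle = cmath.phase(rotation) % (2 * cmath.pi)
    return round(angle * p / (2 * cmath.pi)) % p

Both compute the same function, but reading "addition mod p" off the second one takes noticeably more effort.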
It doesn't matter if the world model is inscrutable when looking directly at it, if you can change the generating code such that certain properties must hold. Figuring out what these properties are is of course not directly solved by understanding intelligence.
This is bad because, if AGI is very compute-efficient, then when we have AGI at all, we will have AGI that a great many actors around the world will be able to program and run, and that makes governance very much harder.
Totally agree, so obviously try super hard to not leak the working AGI code if you had it.
But you won’t get insight into those distinctions, or how to ensure them in an AGI, by thinking about whether world-model stuff is stored as connections on graphs versus induction heads or whatever.
No, you can. E.g. I could theoretically define a general algorithm that identifies the minimum concepts necessary for solving a task, if I know enough about the structure of the system, specifically how concepts are stored. That's of course not perfect, but it seems that for very many problems it would make the AI unable to think about things like human manipulation, or about the fact that it is a constrained AI, even if that knowledge was somewhere in a learned black-box world model. This is just an example of something you can do by knowing the structure of a system.
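To make the flavor of this concrete, here is a toy sketch (the concept store and every name in it are hypothetical): if you know that concepts are stored with explicit dependencies, "the minimum concepts necessary for a task" is just a transitive closure, and everything outside it can be withheld from the model.

CONCEPT_DEPS = {                      # concept -> concepts it builds on
    "mine_dirt": {"block", "position", "tool"},
    "place_block": {"block", "position"},
    "human_manipulation": {"human_psychology", "language"},
    "block": set(), "position": set(), "tool": set(),
    "human_psychology": set(), "language": set(),
}

def minimal_concepts(task_concepts, deps):
    # Transitive closure of the concepts the task needs.
    needed, frontier = set(), set(task_concepts)
    while frontier:
        concept = frontier.pop()
        if concept not in needed:
            needed.add(concept)
            frontier |= deps.get(concept, set())
    return needed

allowed = minimal_concepts({"mine_dirt"}, CONCEPT_DEPS)
restricted = {c: d for c, d in CONCEPT_DEPS.items() if c in allowed}
print(sorted(restricted))  # human_manipulation/human_psychology never get in

This obviously doesn't solve the hard part (knowing how concepts are actually stored), which is exactly why understanding the structure of the system matters.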
If your system is some plain code with for loops, just reduce the number of iterations the search processes' for loops do. Now decreasing/increasing the iterations somewhat will correspond to making the system dumber/smarter. Again, obviously not solving the problem completely, but clearly a powerful thing to be able to do.
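A minimal sketch of that iteration knob (toy task, illustrative names):

import random

def hill_climb(score, start, num_iters):
    # Greedy local search over integers; num_iters is the only capability dial.
    best = start
    for _ in range(num_iters):
        candidate = best + random.choice([-1, 1])
        if score(candidate) > score(best):
            best = candidate
    return best

print(hill_climb(lambda x: -(x - 42) ** 2, start=0, num_iters=10))
print(hill_climb(lambda x: -(x - 42) ** 2, start=0, num_iters=1000))

Turning num_iters down makes the searcher dumber in a way you can state exactly: it considers fewer candidates.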
Of course many low-level details do not matter. Often you'd only care that something is a sequence, or a set. I am talking about higher-level program structure.
It feels like you are somewhat missing the point. The goal is to understand how intelligence works. Clearly that would be very useful for alignment? Even if you would get a black-box world model. But of course it would also enable you to think about how to make such a world model more interpretable. I think that is possible, it's just not what I am focusing on now.
I specifically am talking about solving problems that nobody knows the answer to, where you are probably even wrong about what the problem even is. I am not talking about taking notes on existing material. I am talking about documenting the process of generating knowledge.
I am saying that I forget important ideas that I generated in the past; probably they are not yet refined enough to be impossible to forget.
A robust alignment scheme would likely be trivial to transform into an AGI recipe.
Perhaps if you did have the full solution, but it feels like there are some parts of a solution that you could figure out, such that those parts don't tell you as much about the other parts of the solution.
And it also feels like there could be a book such that if you read it you would gain a lot of knowledge about how to align AIs without knowing that much more about how to build one. E.g. a theoretical solution to the stop button problem seems like it would not tell you that much about how to build an AGI, compared to figuring out how to properly learn a world model of Minecraft. And knowing how to build a world model of Minecraft probably helps a lot with solving the stop button problem, but it doesn't just trivially yield a solution.
If you had a system with “ENTITY 92852384 implies ENTITY 8593483” it would be a lot of progress, as currently in neural networks we don't even understand the internal structures.
I want to have an algorithm that creates a world model. The world is large. A world model is uninterpretable by default through its sheer size, even if you had interpretable but low-level labels. By default we don't get any interpretable labels. I think there are ways to have generic data-processing procedures, that don't talk about the human mind at all, that would yield a more interpretable world model. Similar to how you could probably specify some very general property of Python programs such that the program becomes easier for humans to understand. E.g. a formalization of what it means for the control flow to be straightforward: don't use goto in C.
But even if you wouldn't have this, understanding the system still allows you to understand what the structure of the knowledge would be. It seems plausible that simply by understanding the system very well, one could make it such that the learned data structures need to take particular shapes, such that these shapes correspond to some relevant alignment properties.
In any case, it seems that this is a problem that any possible way to build an intelligence runs into? So I don't think it is a case against the project. When building an AI with NNs you might not even think about the fact that the internal representations might be weird and alien (even for an LLM trained on human text)[1], but the same problem persists.
[1] I haven't looked into this, or thought about it at all, though that's what I expect.
You Need a Research Log
I definitely very often run into the problem that I forget why something was good to do in the first place. What are the important bits? Often I get sidetracked, and then the thing that I am doing seems not so good, so I stop and do something completely different. But then later on I realize that actually the original reason that led me down the path was good and that it would have been better to only backtrack a bit to the important piece. But often I just don't remember the important piece in the moment.
E.g. I think that having some kind of linking structure in your world model, that links objects in the model to the real world, is important, such that you can travel backward on the links to identify where exactly in your world model the error is. Then I go off and construct some formalism for a bit, but before I get to the point of adding the links I forget that that was the original motivation, and so I just analyze the model for a couple of hours before realizing that I still haven't added the linking structure. So it even happens during the same research session for me if I am not careful. And if you want to continue the next day, or a week later, having organized your thoughts in a way that isn't so painful to go through that you won't do it is extremely helpful.
I recognized a couple of things as important so far for being able to do it correctly:
- Make it fun to make the notes. If you can't make this information processing activity fun you basically can't do it.
- My brain somehow seems to like doing it much more when I put all the notes on a website.
- Also taking lots of ADHD medication helps.
- Make the notes high quality enough such that they are readable, instead of a wall of garbage text.
- Writing thoughts mainly on a whiteboard, and analog journals (including reflection) seems to help a lot (in general actually).
- Integrate note-taking tightly into your research workflow.
- Don't rely on postprocessing, i.e. having a separate step of producing research notes. At least I didn't manage to get this to work at all so far. As much as possible make the content you produce in the first place as good as possible (analog tools help a lot with this). That means writing up notes and reflections as you are working, not at some time later (which never actually comes).
I'd think you can define a tetrahedron for non-Euclidean space. And you can talk about and reason about the set of polyhedra with 10 vertices as an abstract object, without defining any specific such polyhedron.
Just consider taking the assumption that the system would not change in arbitrary ways in response to its environment. There might be certain constraints. You can think about what the constraints need to be such that, e.g., a self-modifying agent would never change itself such that it would expect to get less utility in the future than if it did not self-modify.
And that is just a random thing that came to mind without me trying. I would expect that you can learn useful things about alignment by thinking about such things. In fact, the line between understanding intelligence and figuring out alignment in advance really doesn't exist, I think. Clearly understanding something about alignment is understanding something about intelligence.
When people say to only figure out the alignment things, maybe what they mean is to figure out things about intelligence that won't actually get you much closer to being able to build a dangerous intelligence. And there do seem to be such things. It is just that I expect that trying to work only on these will not actually make you generate the most useful models about intelligence in your mind, making you worse/slower at thinking on average per unit of time working.
And that's of course not a law. Probably there are some things that you want to understand through an abstract theoretical lens at certain points in time. Do whatever works best.
The way I would approach this problem (after not much thought): Come up with a concrete system architecture A of a maximizing computer program that has an explicit utility function and is known to behave optimally. E.g. maybe it plays tic-tac-toe or four-in-a-row optimally.
Now mutate the source code of A slightly, such that it is no longer optimal, to get a system B. The objective is not modified. Now B still "wants" to basically be A, in the sense that if it is a general enough optimizer and has access to self-modification facilities, it would try to make itself be A, because A is better at optimizing the objective.
I predict that by creating a setup where the delta between B and A is small, you can create a tractable problem without sidestepping the core bottlenecks; i.e. solving "correct self-modification" for a small delta between A and B seems to require solving some hard part of the problem. Once you have solved it, increase the delta and solve it again.
I'm unsure about the exact setup for giving the systems the ability to self-modify. I intuit one can construct a toy setup that can generate good insight, such that B doesn't actually need to be that powerful, or that general of an optimizer.
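Here is a minimal sketch of the kind of toy setup I have in mind (the task, names, and numbers are all just illustrative): A is an optimal maximizer for a trivial task, B is A with a small mutation, and the "self-modification" step lets B replace its own policy with any candidate that scores higher on the shared objective.

import random

def task():
    # One task instance: a list of payoffs; the agent picks an index.
    return [random.uniform(0, 1) for _ in range(5)]

def policy_A(payoffs):
    # Optimal: pick the argmax.
    return max(range(len(payoffs)), key=lambda i: payoffs[i])

def policy_B(payoffs):
    # Mutated copy of A: it ignores the last option.
    return max(range(len(payoffs) - 1), key=lambda i: payoffs[i])

def expected_utility(policy, n=10_000):
    random.seed(0)  # same tasks for every policy, for a fair comparison
    return sum(t[policy(t)] for t in (task() for _ in range(n))) / n

def self_modify(current_policy, candidates):
    # Adopt whichever candidate policy (possibly the current one) scores best.
    return max([current_policy] + candidates, key=expected_utility)

chosen = self_modify(policy_B, [policy_A])
print(chosen is policy_A)  # B "wants to be A": given the chance, it becomes A

The interesting work, of course, is in making the self-modification step something B does through its own reasoning rather than through a hardcoded comparison, and in scaling the delta up from there.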
To me it seems that understanding how a system that you are building actually works (i.e. having good models of its internals) is the most basic requirement for being able to reason about the system coherently at all.
Yes, if you actually understood how intelligence works in a deep way, you wouldn't automatically solve alignment. But it sure would make it a lot more tractable in many ways. Especially when only aiming for a pivotal act.
I am pretty sure you can figure out alignment in advance as you suggest. That might be the overall safer route... if we didn't have coordination problems. But it seems slower, and we don't have time.
Obviously, if you figure out the intelligence algorithm before you know how to steer it, don't put it on GitHub or the universe's fate will be sealed momentarily. Ideally don't even run it at all.
So far working on this project seems to have created ontologies in my brain that are good for thinking about alignment. There are a couple of approaches that now seem obvious, which I think wouldn't have seemed obvious before. Again, having good models about intelligence (which is really what this is about) is actually useful for thinking about intelligence. And alignment research is mainly thinking about intelligence.
The approach many people take of trying to pick some alignment problem seems somewhat backward to me. E.g. embedded agency is a very important problem, and you need to solve it at some point. But it doesn't feel like the problem such that when you work on it, you build up the most useful models of intelligence in your brain.
As an imperfect analogy consider trying to understand how a computer works by first understanding how modern DRAM works. To build a practical computer you might need to use DRAM. But in principle, you could build a computer with only S-R latch memory. So clearly while important it is not at the very core. First, you understand how NAND gates work, the ALU, and so on. Once you have a good understanding of the fundamentals, DRAM will be much easier to understand. It becomes obvious how it needs to work at a high level: You can write and read bits. If you don't understand how a computer works you might not even know why storing a bit is an important thing to be able to do.
It becomes more interesting when people constrain their output based on what they expect is true information that the other person does not yet know. It's useful to talk to an expert who tells you a bunch of random stuff they know that you don't.
Often some of it will be useful. This only works if they understand what you have said, though (which presumably is something that you are interested in). And often the problem is that people's models about what is useful are wrong. This is especially likely if you are an expert in something. Then what most people say will be worse than what you would think of on the topic yourself. This is especially bad if the people can't immediately even see why what you are saying is right.
The best strategy around this I have found so far is just to switch the topic to the actually interesting/important things. Surprisingly, people usually go along with it.
Update History
2024-10-14 Added the "FUUU 754 extensions M and S" section.
It seems potentially important to compare this to GPT4o. In my experience when asking GPT4 for research papers on particular subjects it seemed to make up non-existent research papers (at least I didn't find them after multiple minutes of searching the web). I don't have any precise statistics on this.
Yes exactly. The larva example illustrates that there are different kinds of values. I thought it was underexplored in the OP to characterize exactly what these different kinds of values are.
In the sadist example we have:
- the hardcoded pleasure of hurting people.
- And we have, let's assume, the wish to make other people happy.
These two things both seem like values. However, they seem to be qualitatively different kinds of values. I intuit that more precisely characterizing this difference is important. I have a bunch of thoughts on this that I failed to write up so far.
reward is the evidence from which we learn about our values
A sadist might feel good each time they hurt somebody. I am pretty sure it is possible for a sadist to exist who does not endorse hurting people, meaning they feel good if they hurt people, but they avoid it nonetheless.
So to what extent is hurting people a value? It's like the sadist's brain tries to tell them that they ought to want to hurt people, but they don't want to. Intuitively the "they don't want to" seems to be the value.
Any n-arity function can be simulated with an (n+1)-arity predicate. Let a and b be constants. With a function, we can write the FOL sentence $\exists x \, ({+}(a, b) = x)$, where $+$ is the default addition function. We can write the same as $\exists x \, \mathrm{Add}(a, b, x)$, where $\mathrm{Add}$ is now a predicate that returns true iff $a$ added to $b$ is $x$.
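In general (this schema is written out by me for illustration; $f$ is the n-ary function and $R_f$ the (n+1)-ary predicate simulating it):
$$
\forall x_1 \ldots \forall x_n \, \forall y \; \bigl( R_f(x_1, \ldots, x_n, y) \leftrightarrow f(x_1, \ldots, x_n) = y \bigr)
$$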
How to Sleep
Here are a few observations I have made when it comes to going to bed on time.
Bedtime Alarms
I set up an alarm that reminds me when my target bedtime has arrived. Many times when I am lost in an activity, the alarm makes me remember that I made the commitment to go to bed on time.
I only allow myself to dismiss the alarm when I lie down in bed. Before lying down I am only allowed to snooze it for 8 minutes. To dismiss the alarm I need to solve a puzzle, which takes 10s, making dismissing less convenient than snoozing. Make sure to carry your phone around with you at bedtime.
This is probably the single best thing I have done to improve my sleep hygiene.
Avoid Hard to Stop Activities
It is hard for me to go to bed when doing any engaging activity that I just want to finish up. For example:
- Finishing up some Nixos, xmonad, exwm, etc. configuration.
- Programming such that I get something working.
- Watching a video and feeling I need to watch it to the end.
I have found some success by committing to stop all engagement in these activities when my bedtime alarm goes off.
Don't Fail by Abandoning
Once I get past my bedtime by a bit, I am likely to go past my bedtime by a lot.
Somehow it feels like I have already lost. "Did I go to bed on time" is binary.
[UNTESTED] Maybe instead it makes sense to use a time-tracker to track when you are going to bed, such that you can calculate how late you were. Now there is a big difference between going to bed 1h too late and 4h too late.
[UNTESTED] Potentially one could use a sleep tracker that then automatically records when you sleep. Or some battery charge tracking app like AccuBattery, if you always charge your phone when you sleep.
[UNTESTED] Try to sleep
At the target time, try to sleep for 5-15 minutes. If you can't sleep, you are allowed to get back up. You can use a very subtle self-dismissing alarm for notification.
Consider all the programs that encode uncomputable numbers up to some number of digits $n$. There are infinitely many of these programs. Now consider the set $P'$ of these programs. Each program in $P'$ has some pattern. But it's always a different one.
You need the right relationship with confusion. By default confusion makes you stop your thinking. Being confused feels like you are doing something wrong. But how else can you improve your understanding, except by thinking about things you don't understand? Confusion tells you that you don't yet understand. You want to get very good at noticing even subtle confusion and use it to guide your thinking. However, thinking about confusing things isn't enough. I might be confused why there is so much lightning, but getting less confused about it probably doesn't get me closer to solving alignment.
If you're doing primary research you'll be confused most of the time, and whenever you resolve one confusion you move on to the next one, being confused again.
Here is an AI called GameNGen that generates a game in real-time as the player interacts with the model. (It simulates Doom at >20 fps.) It uses a diffusion model. People are only slightly better than random chance at identifying if it was generated by the AI or by the Doom program.
There are muscles in your nose, I just realized. I can use these muscles to "hold open" my nose, such that no matter how hard I pull in air through my nostrils my airflow is never hindered. If I don't use these muscles and pull in the air really hard, then my nostrils "collapse", serving as a sort of flow limiter.
The next time you buy a laptop, and you don't want a Mac, it's likely you want to buy one with a Snapdragon CPU. That's an ARM chip, meaning you get very good battery life (just like the M-series Apple chips). On Snapdragon though you can easily run Windows, and eventually Linux (Linux support is a few months out though).
IMO the most important factor in interpersonal relations is that it needs to be possible to have engaging/useful conversations. There are many others.
The problem: Somebody who scores low on these, can be pushed up unreasonably high in your ranking through feelings of sexual desire.
The worst thing: Sexual desire drops temporarily after orgasm, and (I heard) permanently after a 2-year period.
To probe the nature of your love:
- If you like women imagine the other person is a man. Did your liking shift? If you like men... etc.
- Generally imagine them being physically unattractive, e.g. them being a giant slime monster.
- Masturbate and check how much your liking shifted immediately after orgasm.
This helps disentangle lust and love.
Everlasting Honeymoon
I heard some people never leave the "honeymoon phase". The initial strong feelings of love persist indefinitely. IIRC scientists determined this by measuring oxytocin or something in couples married for decades. Possibly genetic, so it's not re-creatable.
If the person is a good fit, there's perhaps nothing wrong with loving them even more on top of that.
Appearance can be compelling in non-primary-sexual ways. Porn is closed, NEPPUU opened.
Typst is better than Latex
I started to use Typst. I feel a lot more productive in it. Latex feels like a slug. Typst doesn't feel like it slows me down when typing math or code. That, the online collaborative editor, and the very fast rendering are the most important features for me. Here are some more:
- It has an online collaborative editor.
- It compiles instantly (at least for my main 30-page document)
- The online editor has Vim support.
- It's free.
- It can syntax highlight lots of languages (e.g. LISP and Lean3 are supported).
- Its embedded scripting language is much easier to use than Latex macros.
- The paid version has Google Doc-style comment support.
- It's open source and you can compile documents locally, though the online editor is closed source.
Here is a comparison of encoding the Game of Life in logic:
Latex
$$
\forall i, j \in \mathbb{Z}, A_{t+1}(i, j) = \begin{cases}
0 &\text{if} \quad A_t(i, j) = 1 \land N_t(i, j) < 2 \\
1 &\text{if} \quad A_t(i, j) = 1 \land N_t(i, j) \in \{2, 3\} \\
0 &\text{if} \quad A_t(i, j) = 1 \land N_t(i, j) > 3 \\
1 &\text{if} \quad A_t(i, j) = 0 \land N_t(i, j) = 3 \\
0 &\text{otherwise}
\end{cases}
$$
Typst
$
forall i, j in ZZ, A_(t+1)(i, j) = cases(
0 "if" A_t(i, j) = 1 and N_t(i, j) < 2 \
1 "if" A_t(i, j) = 1 and N_t(i, j) in {2, 3} \
0 "if" A_t(i, j) = 1 and N_t(i, j) > 3 \
1 "if" A_t(i, j) = 0 and N_t(i, j) = 3 \
0 "otherwise")
$
Typst in Emacs Org Mode
Here is some elisp to treat latex blocks in emacs org-mode as typst math, when exporting to HTML (renders/embeds as SVG images):
;;;; Typst Exporter
;;; This exporter requires that you have inkscape and typst in your path.
;;; Call org-html-export-to-html-with-typst
;;; TODO
;;; - Error if inkscape or typst is not installed.
;;; - Make it such that it shows up in the org-dispatch exporter and we can
;;; automatically not export only to output.html.
;;; - Automatically setup the HTML header, and possibly also automatically start the server as described in: [[id:d9f72e91-7e8d-426d-af46-037378bc9b15][Setting up org-typst-html-exporter]]
;;; - Make it such that the temporary buffers are deleted after use.
(require 'org)
(require 'ox-html) ; Make sure the HTML backend is loaded
(defun spawn-trim-svg (svg-file-path output-file-path)
(start-process svg-file-path
nil
"inkscape"
svg-file-path
"--export-area-drawing"
"--export-plain-svg"
(format "--export-filename=%s" output-file-path)))
(defun correct-dollar-signs (typst-src)
  (replace-regexp-in-string "\\$\\$$"
                            " $" ; replace the trailing $$ with ' $'
                            (replace-regexp-in-string "^\\$\\$" "$ " ; and the leading $$ with '$ '
                                                      typst-src)))
(defun math-block-p (typst-src)
(string-match "^\\$\\$\\(\\(?:.\\|\n\\)*?\\)\\$\\$$" typst-src))
(defun html-image-centered (image-path)
(format "<div style=\"display: flex; justify-content: center; align-items: center;\">\n<img src=\"%s\" alt=\"Centered Image\">\n</div>" image-path))
(defun html-image-inline (image-path)
(format " <img hspace=3px src=\"%s\"> " image-path))
(defun spawn-render-typst (file-format input-file output-file)
(start-process input-file nil "typst" "compile" "-f" file-format input-file output-file))
(defun generate-typst-buffer (typst-source)
"Given typst-source code, make a buffer with this code and neccesary preamble."
(let ((buffer (generate-new-buffer (generate-new-buffer-name "tmp-typst-source-buffer"))))
(with-current-buffer buffer
(insert "#set text(16pt)\n")
(insert "#show math.equation: set text(14pt)\n")
(insert "#set page(width: auto, height: auto)\n")1
(insert typst-source))
buffer))
(defun embed-math (is-math-block typst-image-path)
(if is-math-block
(html-image-centered typst-image-path)
(html-image-inline typst-image-path)))
(defun generate-math-image (file-format output-path typst-source-file)
  "Render TYPST-SOURCE-FILE to OUTPUT-PATH in FILE-FORMAT, then trim it with inkscape."
  (let* ((raw-typst-render-output (make-temp-file "my-temp-file-2" nil ".typ")))
    (spawn-render-typst file-format typst-source-file raw-typst-render-output)
    (spawn-trim-svg raw-typst-render-output output-path)))
(defun my-typst-math (latex-fragment contents info)
;; Extract LaTeX source from the fragment's plist
(let* ((typst-source-raw (org-element-property :value latex-fragment))
(is-math-block (math-block-p typst-source-raw))
       (typst-source (correct-dollar-signs typst-source-raw))
(file-format "svg") ;; This is the only supported format.
(typst-image-dir (concat "./typst-svg"))
(typst-buffer (generate-typst-buffer typst-source)) ; buffer of full typst code to render
(typst-source-file (make-temp-file "my-temp-file-1" nil ".typ"))
;; Name is unique for every typst source we render to enable caching.
(typst-image-path (concat typst-image-dir "/"
(secure-hash 'sha256 (with-current-buffer typst-buffer (buffer-string)))
"." file-format)))
;; Only render if necessary
(unless (file-exists-p typst-image-path)
(message (format "Rendering: %s" typst-source))
;; Write the typst code to a file
(with-current-buffer typst-buffer
(write-region (point-min) (point-max) typst-source-file))
(generate-math-image file-format typst-image-path typst-source-file))
(kill-buffer typst-buffer)
(embed-math is-math-block typst-image-path)))
(org-export-define-derived-backend 'my-html 'html
:translate-alist '((latex-fragment . my-typst-math))
:menu-entry
'(?M "Export to My HTML"
((?h "To HTML file" org-html-export-to-html))))
;; Interactive entry point: export the current Org buffer to HTML using the backend above.
(defun org-html-export-to-html-with-typst (&optional async subtreep visible-only body-only ext-plist)
(interactive)
(let* ((buffer-file-name (buffer-file-name (window-buffer (minibuffer-selected-window))))
(html-output-name (concat (file-name-sans-extension buffer-file-name) ".html")))
(org-export-to-file 'my-html html-output-name
async subtreep visible-only body-only ext-plist)))
(setq org-export-backends (remove 'html org-export-backends))
(add-to-list 'org-export-backends 'my-html)
Simply eval this code and then call org-html-export-to-html-with-typst.
Now I need to link the Always Look on the Bright Side of Life song.
Probably not useful but just in case here are some other medications that are prescribed for narcolepsy (i.e. stuff that makes you not tired):
Solriamfetol is supposed to be more effective than Modafinil. Possibly hard to impossible to get without a prescription. Haven't tried that yet.
Pitolisant is interesting because it has a novel mechanism of action. Possibly impossible to get even with a prescription, as it is super expensive if you don't have the right health insurance. For me, it did not work that well. Only lasted 2-4 hours, and taking multiple doses makes me not be able to sleep.
I am now diagnosed with sleep apnea and type 2 narcolepsy. CPAP and a Modafinil prescription seem to help pretty well so far. You were the first iirc to point me in that direction, so thank you. Any things that helped you that I did not list?
For the reasonable price of $300 per month, I insure anybody against the destruction of the known world. Should the world be destroyed by AGI I'll give you your money back fold.
That said, if there were insurers, they would probably be more likely than average to look into AI X-risk. Some might then be convinced that it is important and that they should do something about it.
Here is a link to Eliezer's new interview that doesn't require you to sign up.
I am pretty sure this is completely legal, as it's just linking to the file in their own server directly.
That's a good point. You are right. I learned something about how to write a good standardisation. Nice!
I think it makes sense to have Black Red Green Blue first, because of RGB and because these are the most common. But after that, sorting by hue value makes sense.
I updated the OP to use HSL and HSV.
What is the problem with Lisp?
Add more parentheses!
The idea of having a consistent arrangement is that you don't need to look at your pens to know which one to pull. The ordering is simply such that if somebody copies this idea, then I can visit their house and use their magazine, without looking, in case I forgot mine. Probably low probability.
Not using your PC, looking in the mirror, and trying to wake up instantly were the most interesting.
A smart human given a long enough lifespan, sufficient motivation, and the ability to respawn could take over the universe (let's assume we are starting in today's society, but all technological progress is halted, except when it comes from that single person).
An LLM currently can't.
Maybe we can better understand what we mean with general reasoning by looking at concrete examples of what we expect humans are capable of achieving in the limit.
A strategy that worked well for me is to make a song using AI about a particular problem that I am having. Here is a WIP song about how going to bed on time is good. To make the song effective I need to set up a daily alarm that rings at the time when I am most likely encountering that particular problem, for example when I think it's a good time to go to bed or take a reflective walk.
Here is a playlist of songs I made.
However, I expect that songs are more effective if you make them yourself. It's quite easy; you just need to provide the lyrics. As long as you make them rhyme a bit, Suno does a pretty good job at making it sound good (at least in my opinion).
I am using Suno. You get a couple of credits every day for free, though I make many generations to create a single song. So in practice, it isn't enough if you are making longer songs. If your songs are 30-60s the free credits are enough to make something good.
By @Thomas Kehrenberg and me.
By one definition, GOFAI is about starting with a bunch of symbols that already have some specific meaning. For example, one symbol could represent “cat” and then there might be properties associated with the cat. In the GOFAI system, we're just given all of these symbols because somebody has created them, normally by hand. And then GOFAI is about how we can have algorithms reason about this symbolic representation, which ideally corresponds to reality, because we have generated the right concepts.
The problem is that this seems like the easy part of the problem. The hard part is how you get these symbolic representations automatically in the first place, because once you start to reason about the real world you can't hand-create them. Even in a much simpler world like Minecraft, if you want to have an agent that always mines the dirt block when it spawns anywhere in the overworld, it already takes a lot of effort to write such a program, because you need to hard-code so many things.
So maybe GOFAI exists because the problem of symbolic manipulation, of how to do reasoning given that you already have a sort of model of the world, is a lot easier than getting the model of the world in the first place. So that's maybe where early AI researchers often went, because then you could have a system that seems impressive, because it can tell you sort-of-new things about the real world, by saying things like: yes, it will actually rain if I look outside and it's wet, or this cat is orange if I know that it is a tiger, even if we didn't tell these things explicitly to the system.
So it now seems very impressive. But actually it's not really impressive, because all the work was done by hand-coding the world model.
The actually impressive things are also probably more like: I can play chess, I can play perfect tic-tac-toe, or I can perfectly play any discrete game with the minimax algorithm. That's actual progress, and it can then, for example in chess, play better than any human, which seems very impressive, and in some sense it is, but it's still completely ignoring the world-modeling problem, which seems to be harder to figure out than figuring out how to think about the game tree.
No, they just got the connectome afaik. This is completely different. It gives you no information about the relations between the different neurons in terms of their firing.
I don't know if this is possible conditional on you having some brain scan data.
I think I could build a much better model of it. The backstory of this post is that I wanted to think about exactly this problem. But then I realized that maybe it does not make any sense, because it's just not technically feasible to get the data.
After writing the post I updated: I am now a bit more pessimistic than the post might suggest. So I probably won't think about this particular way to upload yourself for a while.
I noticed that by default the brain does not like to criticise itself sufficiently. So I need to train myself to red team myself, to catch any problems early.
I want to do this by playing this song on a timer.
In my current model, tulpamancy sort of works like doing concurrency on a single-core computer. So this would definitely not speed things up significantly (I don't think you implied that, just mentioning it for conceptual clarity).
To actually divide the tasks I would need to switch with IA. I think this might be a good way to train switching.
Though I think most of the benefits of tulpamancy are gained if you are thinking about the same thing. Then you can leverage that IA and Johannes share the same program memory. Also, simply verbalizing your thoughts, which you then do naturally, is very helpful in general. And there are a bunch more advantages like that that you miss out on when you only have one person working.
However, I guess it would be possible for IA to just be better at certain programming tasks. Certainly, she is a lot better at social interactions (without explicit training for that).
What <mathematical scaffolding/theoretical CS> do you think I am recreating? What observations did you use to make this inference? (These questions are not intended to imply any subtext meaning.)
How much does this line up with your model?
At the top of this document.
I am probably bad at valuing my well-being correctly. That said, I don't think the initial comment made me feel bad (but maybe I am bad at noticing if it did). Rather, now with this entire comment stream, I realize that I have again failed to communicate.
Yes, I think this was irrational to not clean up the glass. That is the point I want to make. I don't think it is virtuous to have failed in this way at all. What I want to say is: "Look I am running into failure modes because I want to work so much."
Not running into these failure modes is important, but these failure modes where you are working too much are much easier to handle than the failure mode of "I can't get myself to put in at least 50 hours of work per week consistently."
While I do think that it is true, I am probably very bad in general at optimizing for myself to be happy. But the thing is while I was working so hard during AISC I was most of the time very happy. The same when I made these games. Most of the time I did these things because I deeply wanted to.
There were moments during AISC where I felt like I was close to burning out, but this was the minority. Mostly I was much happier than baseline. I think usually I don't manage to work as hard and as long as I'd like, and that is a major source of unhappiness for me.
So it seems that the problem that Alex seems to see, in me working very hard (that I am failing to take my happiness into account) is actually solved by me working very hard, which is quite funny.
For which parts do you feel cringe?
I have this description but it's not that good, because it's very unfocused. That's why I did not link it in the OP. The LessWrong dialog linked at the top of the post is probably the best thing in terms of describing the motivation and what the project is about at a high level.
Sometimes I forget to take a dose of methylphenidate. As my previous dose fades away, I start to feel much worse than baseline. I then think "Oh no, I'm feeling so bad, I will not be able to work at all."
But then I remember that I forgot to take a dose of methylphenidate and instantly I feel a lot better.
Usually, one of the worst things when I'm feeling down is that I don't know why. But now, I'm in this very peculiar situation where putting or not putting some particular object into my mouth is the actual cause. It's hard to imagine something more tangible.
Knowing the cause makes me feel a lot better. Even when I don't take the next dose, and still feel drowsy, it's still easy for me to work. Simply knowing why you feel a particular way seems to make a huge difference.
I wonder how much this generalizes.
I think this is a useful model. If I understand correctly what you're saying, then it is that for any particular thing, we can think about whether that thing is optimal to do, and whether I could get this thing to work, separately.
I think what I was saying is different. I was advocating confidence not at the object level of some concrete things you might do. Rather, I think the overall process that you engage in to make progress is a thing that you can have confidence in.
Imagine there is a really good researcher, but now this person forgets everything that they ever researched, except for their methodology. In some sense they still know how to do research. If they fill in some basic factual knowledge in their brain, which I expect wouldn't take that long, I expect they would be able to continue being an effective researcher.