Hang up a tear-off calendar?
(You can find his ten mentions of that ~hashtag via the looking glass on thezvi.substack.com. Huh, less regular than I thought.)
Zvi's AI newsletter (latest installment: https://www.lesswrong.com/posts/LBzRWoTQagRnbPWG4/ai-93-happy-tuesday) has a regular segment, Pick Up the Phone, arguing against this.
Why not just one global project?
https://www.google.com/search?q=spx+futures
I was specifically looking at Nov 5th 0:00-6:00, which twitched enough to show aliveness, while Manifold and Polymarket moved in smooth synchrony.
As the prediction markets on Trump winning went from ~50% to ~100% over 6 hours, S&P 500 futures moved less than the rest of the time. Why?
The public will Goodhart any metric you hand over to it. If you provide evaluation as a service, you will know how many attempts an AI lab made at your test.
If you say heads every time, half of all futures contain you; likewise with tails.
https://www.lesswrong.com/posts/Mc6QcrsbH5NRXbCRX/dissolving-the-question
What is going to be done with these numbers? If Sleeping Beauty is to gamble her money, she should accept the same betting odds as a thirder. If she has to decide which coinflip result kills her, she should be indifferent like a halfer.
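A minimal sketch of the per-awakening bookkeeping, assuming she bets once per awakening:

```python
import random

# Heads: woken once; tails: woken twice. Count awakenings of each kind.
trials = 100_000
heads_awakenings = tails_awakenings = 0
for _ in range(trials):
    if random.random() < 0.5:
        heads_awakenings += 1
    else:
        tails_awakenings += 2

# Fair per-awakening betting odds match the thirder credence of 1/3.
print(heads_awakenings / (heads_awakenings + tails_awakenings))  # ~0.333
```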
Your experiment is contaminated: If a training document said that AI texts are overly verbose, and then announced that the following is a piece of AI-written text, it'd be a natural guess that the document would continue with overly verbose text, and so that's what an autocomplete engine will generate.
Due to RLHF, AI is no longer cleanly modelled as an autocomplete engine, but the point stands. For science, you could try having AI assist in the writing of an article making the opposite claim :).
Ask something only they would know.
Among monotonic, boolean quantifiers that don't ignore their input, exists is maximal because it returns true as often as possible; forall is minimal because it returns true as rarely as possible.
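A brute-force check over a 3-element domain, as a sketch (encoding subsets as bitmasks):

```python
from itertools import product

# A subset of a 3-element domain is a bitmask 0..7; a quantifier maps
# each subset to a bool, encoded as a tuple of 8 bools.
SUBSETS = range(8)

def is_subset(s, t):
    return s & ~t == 0

def monotone(q):
    return all(q[s] <= q[t]
               for s in SUBSETS for t in SUBSETS if is_subset(s, t))

# All monotone quantifiers that don't ignore their input (non-constant):
quantifiers = [q for q in product([False, True], repeat=8)
               if monotone(q) and len(set(q)) == 2]

exists_ = tuple(s != 0 for s in SUBSETS)  # true on every non-empty subset
forall_ = tuple(s == 7 for s in SUBSETS)  # true only on the full subset

# exists returns true as often as possible, forall as rarely as possible:
assert all(q[s] <= exists_[s] for q in quantifiers for s in SUBSETS)
assert all(forall_[s] <= q[s] for q in quantifiers for s in SUBSETS)
print(len(quantifiers), "quantifiers checked")
```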
For concreteness, let's say the basic income is the same in every city, same for a paraplegic or Elon Musk. Anyone who can vote gets it, it's a dividend on your share of the country.
I am surprised at section 3; I don't remember anyone who seriously argues that women should be dependent on men. By amusing coincidence, my last paragraph makes your reasoning out of scope; you can abolish women's suffrage in a separate bill.
In section 5, you are led astray by assuming a fixed demand for labor. You notice that we have yet to become obsolete. Well, of course: For as long as human inputs remain cheaper than their outputs, employment statistics will fail to reflect our dwindling comparative advantage. But we are on track to turn every graphics card into a cheaper white collar worker. Humans have to be trained for jobs, software can be copied. Human hands might remain SOTA for a few years longer. Horses weren't reduced to pets because we built too many cars, but because cars became possible to build.
factor out alpha
⌊x⌋ is floor(x), the greatest integer that's at most x.
People with sufficiently good models of each other to use them in their social protocols.
I'd call those absences of drawbacks, not benefits - you would have had them without the job.
I was alone in a room of computers, and I had set out to take no positive action but grading homework. I ended up sitting and pacing and occasionally moving the mouse in the direction it would need to go next. What I remember of what my mind was on was the misery of the situation.
I tried that for a weekend once. I did nothing.
It has been pointed out to me that no, what this presumably means is the past decisions of the patients.
Q2 Is it ethically permissible to consider an individual's past decisions when determining their access to medical resources?
You assume the conclusion:
A lot of the AI alignment success seems to me to stem from the question of whether the problem is easy or not, and is not very elastic to human effort.
AI races are bad because they select for contestants that put in less alignment effort.
Sure, he's trying to cause alarm via alleged excerpts from his life. Surely society should have some way to move to a state of alarm iff that's appropriate; do you see a better protocol than this one?
Recall that every vector space is the finitely supported functions from some set to ℝ, and every Hilbert space is the square-integrable functions from some measure space to ℝ.
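In symbols, with S a basis (for the Hilbert case, an orthonormal basis, so the measure space can be taken to be S with counting measure):

$$V \;\cong\; \{\, f : S \to \mathbb{R} \mid f(s) = 0 \text{ for all but finitely many } s \,\}, \qquad H \;\cong\; \ell^2(S) = \{\, f : S \to \mathbb{R} \mid \textstyle\sum_{s} f(s)^2 < \infty \,\}.$$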
I'm guessing that similarly, the physical theory that you're putting in terms of maximizing entropy lies in a large class of "Bostock" theories such that we could put each of them in terms of maximizing entropy, by warping the space with respect to which we're computing entropy. Do you have an idea of the operators and properties that define a Bostock theory?
that thing about affine transformations
If the purpose of a utility function is to provide evidence about the behavior of the group, we can preprocess the data structure into that form: Suppose Alice may update the distribution over group decisions by ε. Then the direction she pushes in is her utility function, and the constraints "add up to 100%" and "size ε" cancel out the "affine transformation" degrees of freedom. Now such directions can be added up.
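A minimal sketch of that preprocessing, assuming utilities are given as vectors over a finite list of outcomes (EPS is a stand-in):

```python
import numpy as np

EPS = 0.01  # stand-in: how much one agent may shift the group distribution

def to_direction(utility, eps=EPS):
    """Quotient out the affine degrees of freedom of a (non-constant) utility.

    Adding a constant is removed by centering: a push that keeps the
    distribution summing to 100% must have components summing to 0.
    Positive scaling is removed by normalizing to length eps.
    """
    u = np.asarray(utility, dtype=float)
    u = u - u.mean()
    return eps * u / np.linalg.norm(u)

# Utilities that differ by an affine transformation give the same direction,
# and directions of different agents can simply be added up.
alice = to_direction([1.0, 3.0, 2.0])
also_alice = to_direction([12.0, 32.0, 22.0])  # 10*u + 2
bob = to_direction([2.0, 1.0, 3.0])
assert np.allclose(alice, also_alice)
print(alice + bob)
```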
Let's investigate whether functions must necessarily contain an agent in order to do sufficiently useful cognitive work. Pick some function such that an oracle for it would let you save the world.
Hmmmm. What if I said "an enumeration of the first-order theory of (union(Q,{our number}),<)"? Then any number can claim to be equal to one of the constants.
If Earth had intelligent species with different minds, an LLM could end up identical to a member of at most one of them.
Is the idea that "they seceded because we broke their veto" is more of a casus belli than "we can't break their veto"?
Sure! Fortunately, while you can use this to prove any rational real innocent of being irrational, you can't use this to prove any irrational real guilty of being irrational, since every first-order formula can only check against finitely many constants.
Chaitin's constant, right. I should have taken my own advice and said "an enumeration of all properties of our number that can be written in the first-order logic (Q,<)".
Oh, I misunderstood the point of your first paragraph. What if we require an enumeration of all rationals our number is greater than?
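For concreteness, a sketch of such an enumeration for sqrt(2), dovetailing through the positive rationals (the negative ones can be prepended the same way):

```python
from fractions import Fraction
from itertools import count, islice

def positive_rationals():
    """Dovetail through every positive rational (with repeats)."""
    for total in count(2):
        for p in range(1, total):
            yield Fraction(p, total - p)

def lower_cut_of_sqrt2():
    """Enumerate every positive rational that sqrt(2) is greater than."""
    for q in positive_rationals():
        if q * q < 2:
            yield q

print(sorted(set(islice(lower_cut_of_sqrt2(), 20))))
```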
If you want to transfer definitions into another context (constructive, in this case), you should treat such concrete, intuitive properties as theorems, not axioms, because the abstract formulation will generalize further. (remark: "close" is about distances, not order.)
If constructivism adds a degree of freedom in the definition of convergence, I'd try to use it to rescue the theorem that the Dedekind (order) and Cauchy (distance) structures on ℚ agree about the completion. Potential rewards include survival of the theory built on top and evidence about the ideal definition of convergence. (I bet it's not epsilon-N, because why would a natural property of maps from ℕ to ℚ introduce the variable of type ℚ before the variable of type ℕ?)
I claim Dedekind cuts should be defined in a less hardcoded manner. Galaxy brain meme:
- An irrational number is something that can sneak into (Q,<), such as sqrt(2)="the number which is greater than all rational numbers whose square is less than 2" (see the sketch after this list). So infinity is not a real number because there is no greatest rational number, and epsilon is not a real number because there is no smallest rational number greater than zero.
- An irrational number is a one-element elementary extension of (Q,<). (Of course, the proper definition would waive the constraint that the new element be original, instead of treating rationals and irrationals separately.)
- The real numbers are the colimit of the finite elementary extensions of (Q,<).
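A sketch of the first bullet in Python (below_sqrt2 and closer_from_below are names I made up): the newcomer slots decidably into the order, and it occupies a genuinely new position because no rational below it is greatest:

```python
from fractions import Fraction

def below_sqrt2(q: Fraction) -> bool:
    """Decide, with rational arithmetic only, on which side of sqrt(2) q falls."""
    return q <= 0 or q * q < 2

def closer_from_below(q: Fraction) -> Fraction:
    """Given q < sqrt(2), return q' with q < q' < sqrt(2)."""
    if q <= 0:
        return Fraction(1)
    return q + (2 - q * q) / 4  # stays below sqrt(2) for 0 < q < sqrt(2)

q = Fraction(1)
for _ in range(5):
    q = closer_from_below(q)
    assert below_sqrt2(q)
print(float(q))  # approaching sqrt(2) ~ 1.41421
```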
I claim Cauchy sequences should be defined in a less hardcoded manner, too: A sequence is Cauchy (e.g. in (Q,Euclidean distance)) iff it converges in some (wlog one-element) extension of the space.
Yeah, the TLDR sounds worse than the story, so the story might sound worse than the correspondence.
But Igor presumably had some reasoning for not publishing it immediately. Preserving privacy? An opportunity for the fund to save face? The former would have worked better without the name drop, and the latter seems antithetical to local culture...
If a future decision is to shape the present, we need to predict it.
The decision-theoretic strategy "Figure out where you are, then act accordingly" is merely an approximation to "Use the policy that leads to the multiverse you prefer". You *can* bring your present loyalties with you behind the veil, it might just start to feel farcically Goodhartish at some point.
There are of course no probabilities of being born into one position or another, there are only various avatars through which your decisions affect the multiverse. The closest thing to probabilities you'll find is how much leverage each avatar offers: The least wrong probabilistic anthropics translates "the effect of your decisions through avatar A is twice as important as through avatar B" into "you are twice as likely to be A as B".
So if we need probabilities of being born early vs. late, we can compare their leverage. We find:
- Quantum physics shows that the timeline splits a bazillion times a second. So each second, you become a bazillion yous, but the portions of the multiverse you could first-order impact are divided among them. Therefore, you aren't significantly more or less likely to find yourself a second earlier or later.
- Astronomy shows that there's a mazillion stars up there. So we build a Dyson sphere and huge artificial womb clusters, and one generation later we launch one colony ship at each star. But in that generation, the fate of the universe becomes a lot more certain, so we should expect to find ourselves before that point, not after.
- Physics shows that several constants are finely tuned to support organized matter. We can infer that elsewhere, they aren't. Since you'd think that there are other, less precarious arrangements of physical law with complex consequences, we can also moderately update towards that very precariousness granting us unusual leverage about something valuable in the acausal marketplace.
- History shows that we got lucky during the Cold War. We can slightly update towards:
- Current events are important.
- Current events are more likely after a Cold War.
- Nuclear winter would settle the universe's fate.
- The news shows that ours is the era of inadequate AI alignment theory. We can moderately update towards being in a position to affect that.
Can the simulators tell whether an AI is dumb or just playing dumb, though? You can get the right meme out there with a very light touch.
Yeah, it'd be safer to skip the simulations altogether and just build a philosopher from the criteria by which you were going to select a civilization.
To be blunt, sample a published piece of philosophy! Its author wanted others to adopt it. But you're well within your rights to go "If this set is so large, surely it has an element?", so here's a fun couple of paragraphs on the topic.
If an AI intuits that policy, it can subvert it - nothing says that it has to announce its presence, or openly take over immediately. Shutting it down when they build computers should work.
If the "human in a box" degenerates into a loop like LLMs do, try the next species.
I agree on your last paragraph, though humans have produced loads of philosophy that both works for them and benefits them for others to adopt.
How do you tell when to stop the simulation? Apparently not at the almost human-level AI we have now.
Do you have an example piece of philosophical progress made by a civilization?
I admit that the human could turn against you, but if a human can eat you, you certainly shouldn't be watching a planet full of humans.
Sorry, our timeline is dangerous because we're on track to create AI that can eat unsophisticated simulators for breakfast, such as by helpfully handing them a "solution to philosophy".
Yes, instantiate a philosopher. Not having solved philosophy is a good reason to use fewer moving parts you don't understand. Just because you can use arbitrary compute doesn't mean you should.
You'd be a proper fool to simulate the Call of Cthulhu timeline before solving philosophy.
That said, if you can steal the philosophy, why not steal the philosopher?
Building an ancestor sim for intellectual labor is like building the Matrix for energy production. You simulate a timeline to figure out what happens there.
That said, the decision-theoretic strategy of "figure out where you are, then act accordingly" is just an approximation to "follow the policy that produces the multiverse you want", so counting a number of simulations is silly: Every future ancestor sim merely grants your decisions an extra way to affect a timeline they could already affect through your meatspace avatar.
I previously told an org incubator one simple idea against failure cases like this. Do you think you should have tried the like?
Funnily enough, I spotted this at the top of LessWrong on the way to write the following, so let's do it here:
What less simple ideas are there? Can an option to buy an org be conditional on arbitrary hard facts such as an arbitrator finding it in breach of a promise?
My idea can be Goodharted through its reliance on what the org seems to be worth, though "This only spawns secret AI labs." isn't all bad. Add a cheaper option to audit the company?
It can also be Goodharted through its reliance on what the org seems to be worth. OpenAI shows that devs can just walk out.
You hand-patched several inadequacies out of the judge. Shouldn't you use the techniques that made the debaters more persuasive to make the judge more accurate?
Absent feedback, today I read further, to the premise of the maxent conjecture. Let X be 100 numbers up to 1 million, rerolled until the remainder of their sum modulo 1000000 ends up 0 or 1. (X' will have sum-remainder circa 50 or circa -50.) Given X', X1 has a 25%/50%/25% pattern around X'1. Given X2 through X100, X1 has a 50%/50% distribution. So the (First/Strong) Universal Natural Latent Conjecture fails, right?
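The 50%/50% claim is quick to verify directly; a sketch, simplifying each Xi to be uniform on 0..M-1:

```python
import random

M = 1_000_000
rest = [random.randrange(M) for _ in range(99)]  # X2..X100
s = sum(rest) % M
# Exactly two values of X1 make the total sum 0 or 1 mod M:
valid_x1 = [(-s) % M, (1 - s) % M]
print(valid_x1)  # a uniform prior puts 50%/50% on these two values
```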
I claim that the way to properly solve embedded agency is to do abstract agent foundations such that embedded agency falls out naturally as one adds an embedding.
In the abstract, an agent doesn't terminally care to use an ability to modify its utility function.
Suppose a clique of spherical children in a vacuum [edit: ...pictured on the right] found each other by selecting for their utility functions to be equal on all situations considered so far. They invest in their ability to work together, as nature incentivizes them to.
They face a coordination problem: As they encounter new situations, they might find disagreements. Thus, they agree to shift their utility functions precisely in the direction of satisfying whatever each other's preferences turn out to be.
This is the simplest case I yet see where alignment as a concept falls out explicitly. It smells like it fails to scale in any number of ways, which is worrisome for our prospects. Another point for not trying to build a utility maximizer.
What do you mean by them memorizing the songs, if they don't repeat them word for word? Do you only require that all the events in the version they heard happen again in the version they sing? Are there audio recordings of their singing? Those should help reduce confusion here.
A USB microscope. Just point it at an arbitrary thing and learn more about it! (Say "Examine" for good luck.)
I don't have the following, but I wish I did: A heat camera, an ultrasound probe, a sound camera, an e-nose. Sensors ought to have high bandwidth, in order to give you a chance to notice any anomalies.
Then all zeroes maps to all zeroes.
(1,1,1,1,1,1,1,1,1) maps to (1,0,0,0,0,0,0,0,0).
Fix some atom of information. It's contained in some of Lambda, X1, X2, and Lambda'. Call the corresponding four statements a,b,c,d. Then you assume "b&c implies a, c&d implies b, b&d implies c, a&d implies b or c.".
These compress into "b&c implies a, d implies a=b=c."; after concluding that, I read that you conclude "d&(b or c) implies a", which seems to be a special case. My approach feels suspiciously simpler, so I'm checking in to ask whether it fails.
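A truth-table check of that compression, as a sketch:

```python
from itertools import product

def premises(a, b, c, d):
    return ((not (b and c) or a)            # b&c implies a
            and (not (c and d) or b)        # c&d implies b
            and (not (b and d) or c)        # b&d implies c
            and (not (a and d) or b or c))  # a&d implies b or c

def compressed(a, b, c, d):
    return (not (b and c) or a) and (not d or (a == b == c))

def special_case(a, b, c, d):
    return not (d and (b or c)) or a        # d&(b or c) implies a

for v in product([False, True], repeat=4):
    assert premises(*v) == compressed(*v)
    assert not compressed(*v) or special_case(*v)
print("compression checks out; the quoted conclusion is implied")
```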