Yeah, I definitely oversimplified somewhere. I'm tripped up by "this statement is false" and by statements that don't terminate. Worse, thinking in that direction, I appear to have claimed that the utterance "What color is your t-shirt?" is associated with a probability of being true.
I think that your a-before-e example is confusing your intuition- a typical watermark that occurs 10% of the time isn't going to be semantic, it's more like "this n-gram hashed with my nonce == 0 mod 10"
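For illustration, a toy detector in this style might look like the following (the hashing scheme and nonce handling here are made up for the sketch, not any deployed watermark):

```python
import hashlib

def is_marked(ngram: tuple, nonce: str, modulus: int = 10) -> bool:
    """Toy statistical watermark: an n-gram is 'marked' iff its keyed hash
    falls in a fixed residue class (expected rate 1/modulus)."""
    digest = hashlib.sha256((nonce + "|" + " ".join(ngram)).encode()).digest()
    return int.from_bytes(digest[:8], "big") % modulus == 0

def watermark_rate(tokens: list, n: int, nonce: str) -> float:
    """Fraction of n-grams in a token stream that are marked.
    Unwatermarked text should come out near 1/modulus."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    hits = sum(is_marked(g, nonce) for g in ngrams)
    return hits / len(ngrams)
```

A generator biased to emit marked n-grams slightly more often than chance pushes this rate detectably above 10% given enough text, without any semantic pattern for a reader to notice.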
I'm at this point pretty confident that under the Copenhagen interpretation, whenever an intergalactic photon hits earth, the wave-function collapse takes place on a semi-spherical wave-front many millions of lightyears in diameter. I'm still trying to wrap my head around what the interpretation of this event is in many-worlds. I know that it causes earth to pick which world it is in out of the possible worlds that split off when the photon was created, but I'm not sure if there is any event on the whole spherical wavefront.
It's not a pure hypothetical- we are likely to see gravitational lens interferometry in our lifetime (if someone hasn't achieved it yet outside of my attempt at literature review) which will either confirm that these considerations are real, or produce a shock result that they aren't.
One feature of every lesswrong Petrov day ritual is the understanding that the people on the other side of the button have basically similar goals and reasoning processes, especially when aggregated into a group. I wonder if the mods at /r/sneerclub would be interested in a Petrov day collaboration in the future.
Does it ever fail to complete a proof, and honestly announce failure? Once, I got Claude to disprove a statement that I had asked it to prove (it searched for a proof and found a disproof instead), but I've never had it try for a while and then announce that it has not made progress either way.
The funniest possible outcome is that no one opts in and so the world is saved but the blog post is ruined.
I would hate to remove the possibility of a funny outcome. No opt in!
I greatly enjoyed this book back in the day, but the whole scenario was wild enough to summon the moral immune system. Past a certain point, for me it’s a safe default to put up mental barriers and actively try not to learn moral lessons from horror fiction. Worm, Gideon the 9th, anything by Stephen King- great, but I don’t quite expect to learn great lessons.
While rejecting them as sources of wisdom now, I can remember these books and return to them if I suddenly need to make moral choices in a world where people can grow wiser by being tortured for months, grow stronger by killing and then mentally fusing with a childhood friend, or achieve coordination by mind-controlling an entire community and spending their lives like pawns.
This is a good point! As a result of this effect and Jensen’s inequality, chaos is a much more significant limit on testing CUDA programs than for example cpp programs
Huang
I enjoyed doing this interview. I haven’t done too much extemporaneous public speaking, and it was a weird but wonderful experience being on the other side of the youtube camera. Thanks Elizabeth!
- If a trebuchet requires you to solve the double pendulum problem (a classic example of a chaotic system) in order to aim, it is not a competition-winning trebuchet.
Ah, this is not quite the takeaway- and getting the subtlety here right is important for larger conclusions. If simulating a trebuchet requires solving the double pendulum problem over many error-doublings, it is not a competition-winning trebuchet.
If you start with a simulator and a random assortment of pieces, and then start naively optimizing for pumpkin distance, you will quickly see the sort of design shown at 5:02 in the video, where the resulting machine is unphysical because its performance depends on coincidences that will go away in the face of tiny changes in initial conditions. This behaviour shows up with a variety of simulators and optimizers.
An expensive but probably effective solution is to perturb a design several times, simulate it several times, and stop simulation once the simulations diverge.
An ineffective solution is to limit the simulated firing time, as many efficient, real-world designs take a long time to fire, because they begin with the machine slowly falling away from an unstable equilibrium.
The chaos-theory motivated cheap solution is to limit the number of rotations of bodies in the solution before terminating it, as experience shows error doublings tend to come from rotations in trebuchet-like chaotic systems.
The solution I currently have implemented at jstreb.hgreer.com is to only allow the direction of the projectile to rotate once before firing (specifically, it is released if it is moving upwards and to the right at a velocity above a threshold) which is not elegant, but seems mostly effective. I want to move to the "perturb and simulate several times" approach in the future.
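A minimal sketch of that perturb-and-simulate idea, with a stand-in one-dimensional "simulator" in place of a real trebuchet physics engine:

```python
def robust_score(simulate_step, init_state, n_copies=5,
                 max_steps=100, eps=1e-9, tol=1e-3):
    """Run one nominal copy plus perturbed copies of a simulation in
    lockstep, stopping as soon as they disagree by more than `tol`.
    `simulate_step` is a stand-in for one tick of a real trebuchet
    physics engine; the state here is one number for simplicity."""
    states = [init_state] + [init_state + eps] * (n_copies - 1)
    for step in range(max_steps):
        states = [simulate_step(s) for s in states]
        if max(states) - min(states) > tol:
            return step, states   # diverged: don't trust this design's score
    return max_steps, states      # copies stayed consistent for the full run

# Stand-in dynamics: the doubling map (chaotic) vs. a contraction (not).
chaotic_steps, _ = robust_score(lambda x: (2 * x) % 1.0, 0.37)
calm_steps, _ = robust_score(lambda x: 0.5 * x, 0.37)
```

The chaotic stand-in gets cut off after a couple dozen steps (once the 1e-9 perturbation has doubled its way past the tolerance), while the well-behaved one runs to completion- which is exactly the filtering behavior you want against coincidence-exploiting designs.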
The structure of language kind of screwed us here. Picture literally any reasonable policy for discussing each other’s religious affiliation in the workplace. Now implement that policy, but your workers speak a language where the grammar only functions if you know whether each referent is a “True” christian.
export INSTRUMENTAL_GOAL=change_yourself
Keep track of your past attempts to $INSTRUMENTAL_GOAL, so that you can better predict whether your future attempts to $INSTRUMENTAL_GOAL will succeed, and so better choose between plans that require $INSTRUMENTAL_GOAL and plans that route around it.
I didn't catch on at all that this was humor, and as a result made a point to pick up oatmeal next time I was at the grocery. I do actually like oatmeal, I just hadn't thought about it in a while. It has since made for some pretty good breakfasts.
This whole sequence of events is either deeply mundane or extremely funny, I genuinely can't tell. If it's funny it's definitely at my expense.
Ahah! I suspect that permission to start from scratch may be a large component of maintaining passion. Starting from scratch at will is pretty close to the exact boundary between programming I do for fun and programming for which I demand payment.
Today I realized I am free to make the letters in an einsum string meaningful (b for batch, x for horizontal index, y for vertical index etc) instead of just choosing ijkl.
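For instance, a sketch (the letter meanings are just my own conventions, nothing numpy enforces):

```python
import numpy as np

# b: batch, y: vertical index, x: horizontal index,
# c: input channel, d: output channel
images = np.random.rand(8, 32, 32, 3)      # (b, y, x, c)
color_transform = np.random.rand(3, 5)     # (c, d)

# The readable version of np.einsum('ijkl,lm->ijkm', ...):
recolored = np.einsum('byxc,cd->byxd', images, color_transform)
assert recolored.shape == (8, 32, 32, 5)
```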
I haven't been able to change my passion, but I have faced a similar issue and found that if I occasionally take stock of my semi-abandoned long term projects, I often notice that I have passion for one of them again. As a result, several have come to something resembling completion over the years, and often over many passion-dispassion cycles. The key then becomes documentation and good storage and organization, to minimize the difficulty of starting up again. I feel that this has made my passion for projects more durable, because there is no longer a sense of panic if it starts to fade- I expect it to return if it is needed.
The "tiny explosions" mental model doesn't make new predictions in the way that the Carnot model does, but it does encode and compress an enormous amount of useful pre-discovered information. For example, that a car engine is hot like fire and will burn you, that if you mix gasoline and air and light it, it will explode, that a car engine will be made of strong stuff, that a car engine is in something of a delicate engineered balance, and if you make large changes to it, it will typically become extremely loud and catch fire. I think this is enough to distinguish the "tiny explosions" model from typical "guess the teacher's password" knowledge.
A consistent trope in dath-ilani world-transfer fiction is "Well the theorems of agents are true in dath ilani and independent of physics, so they're going to be true here damnit"
How do we violate this in the most consistent way possible?
Well it's basically default that a dath ilani gets dropped in a world without the P vs NP distinction, usually due to time travel BS. We can make it worse- there's no rule that sapient beings have to exist in worlds with the same model of the Peano axioms. We pull some flatlander shit- Keltham names a Turing machine that would halt if two smart agents fall off the Peano frontier and claims to have proof it never halts, and then the native math-lander chick says nah, watch this, and then together they iterate the machine for a very, very long time- a nonstandard integer number of steps- and then it halts, and Keltham (A) just subjectively experienced an integer larger than any natural number of his homeworld and (B) has a counterexample to his precious theorems
With a grain of salt,
There’s a sort of quiet assumption that should be louder in the dath ilan fiction: it’s about a world where a bunch of theorems like “as systems of agents get sufficiently intelligent, they gain the ability to coordinate in prisoner’s-dilemma-like problems” have proofs. You could similarly write fiction set in a world where P=NP has a proof and all of cryptography collapses. I’m not sure whether EY would guess that sufficiently intelligent agents actually coordinate- just like I could write the P=NP fiction while being pretty sure that P≠NP
What you’ve hit upon is “BATNA,” or “Best alternative to a negotiated agreement.” Because the robbers can get what they want by just killing the farmers, the dath ilani will give in- and from what I understand, Yudkowsky therefore doesn’t classify the original request (give me half your wheat or die) as a threat.
This may not be crazy- it reminds me of the Ancient Greek social mores around hospitality, which seem insanely generous to a modern reader but I guess make sense if the equilibrium number of roving <s>bandits</s> honored guests is kept low by some other force
So this turns out to be a doozy, but it's really fascinating. I don't have an answer- an answer would look like "normal chaotic differential equations don't have general exact solutions" or "there is no relationship between being chaotic and not having an exact solution"- but deciding which is which won't just require proof, it would also require good definitions of "normal differential equation" and "exact solution." (The good definition of "general" is "initial conditions with exact solutions have nonzero measure.") I have some work to do.
A chaotic differential equation has to be nonlinear and at least third order- and almost all nonlinear third order differential equations don't admit general exact solutions. So, the statement "as a heuristic, chaotic differential equations don't have general exact solutions" seems pretty unimpressive. However, I wrongly believed the strong version of this heuristic and that belief was useful: I wanted to model trebuchet arm-sling dynamics, recognized that the true form could not be solved, and switched to a simplified model based on what simplifications would prevent chaos (no gravity, sling is wrapped around a drum instead of fixed to the tip of an arm) and then was able to find an exact solution (note that this solvable system starts as nonlinear 4th order, but can be reduced using conservation of angular momentum hacks)
Now, it is known that a chaotic difference equation can have an exact solution: the equation x(n+1) = 2x(n) mod 1 is formally chaotic and has the exact solution x(n) = 2^n x(0) mod 1. A differential equation exhibiting chaotic behaviour can likewise have an exact solution if it has discontinuous derivatives, because an ODE implementing this difference equation can be constructed:
The equation is in three variables x, y, z.
dz/dt always equals 1.
if 0 < z < 1:
    if x > 0:
        dx/dt = 0
        dy/dt = 1
    if x < 0:
        dx/dt = 0
        dy/dt = -1
if 1 < z < 2:
    if y > 0:
        dx/dt = -0.5
        dy/dt = 0
    if y < 0:
        dx/dt = 0.5
        dy/dt = 0
if 2 < z < 3:
    dx/dt = x ln(2)
    dy/dt = -y/(3 - z)
and then make it periodic by gluing z=0 to z=3 in phase space. (This is pretty similar to the structure of the Lorenz attractor, except that in the Lorenz system, the sheets of solutions get close together but don't actually meet.) This is an awful, weird ODE: the derivative is discontinuous, and not even bounded near the point where the sheets of solutions merge.
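As a sanity check on the doubling-map claim, one can verify the exact solution against direct iteration using exact rational arithmetic (floats would shed exactly the low-order bits that chaos amplifies):

```python
from fractions import Fraction

# The doubling map x <- 2x mod 1 is chaotic, yet has the exact solution
# x_n = 2^n * x_0 mod 1. Check that the two agree for 200 steps.
x0 = Fraction(1, 3) + Fraction(1, 7**5)   # arbitrary rational start

x = x0
for n in range(200):
    x = (2 * x) % 1

assert x == (2**200 * x0) % 1
```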
Plenty of prototypical chaotic differential equations have a sprinkling of exact solutions: e.g, three bodies orbiting in an equilateral triangle- hence the requirement for a "general" exact solution.
The three body problem "has" an "exact" "series" "solution" but it appears to be quite effed: for one thing, no one will tell me the coefficient of the first term. I suspect that in fact the first term is calculated by solving the motion for all time, and then finding increasingly good series approximations to that motion.
I strongly suspect that the correct answer to this question can be found in one of these stack overflow posts, but I have yet to fully understand them:
https://physics.stackexchange.com/questions/340795/why-are-we-sure-that-integrals-of-motion-dont-exist-in-a-chaotic-system?rq=1
https://physics.stackexchange.com/questions/201547/chaos-and-integrability-in-classical-mechanics
There are certainly billiards with chaotic and exactly solvable components- if nothing else, place a circular billiard next to an oval. So, for the original claim to be true in any meaningful way, this may have to involve excluding all differential equations with case statements- which sounds increasingly unlike a true, fundamental theorem.
If this isn't an open problem, then there is somewhere on the internet a chaotic, normal-looking system of ODEs (it would have aesthetics like x'''' = sin(x''') - x'y''', y' = (1 - y/x'), etc.) posted next to a general exact solution, perhaps only valid for non-chaotic initial conditions, or a proof that no such system exists. The solvable system is probably out there and related to billiards
Final edit: the series solution to the three body problem is legit mathematically, see page 64 here
https://ntrs.nasa.gov/citations/19670005590
So “can’t find general exact solution to chaotic differential equation” is just uncomplicatedly false
I want to run code generated by an llm totally unsupervised
Just to get in the habit, I should put it in an isolated container in case it does something weird
Claude, please write a python script that executes a string as Python code in an isolated Docker container.
If you do set out on this quest, Bell's inequality and friends will at least put hard restrictions on where you could look for a rule underlying seemingly random wave function collapse. The more restricted your search, the sooner you'll find a needle!
I am suddenly unsure whether it is true! It certainly would have to be more specific than how I phrased it, as it is trivially false if the differential equation is allowed to be discontinuous between closed form regions and chaotic regions
Sometimes!
https://sohl-dickstein.github.io/2024/02/12/fractal.html
Differential equation example: I wanted a closed form solution of the range of the simplest possible trebuchet- just a seesaw. This is perfectly achievable, see for example http://ffden-2.phys.uaf.edu/211.fall2000.web.projects/J%20Mann%20and%20J%20James/see_saw.html. I wanted a closed form solution of the second simplest trebuchet, a seesaw with a sling. This is impossible, because even though the motion of the trebuchet with sling isn't chaotic during the throw, it can be made chaotic by just varying the initial conditions, which rules out a simple closed form solution for non-chaotic initial conditions.
Lyapunov exponent example: for the bouncing balls, if each ball travels 1 diameter between bounces, then a change in velocity angle of 1 degree pre-bounce becomes a change in angle of 4 degrees post bounce (this number may be 6- I am bad at geometry), so the exponent is 4 if time is measured in bounces.
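The fiddly geometry can be dodged by computing the amplification numerically. Here is a simplified stand-in for the two-ball setup (a point particle bouncing off one fixed circular scatterer, so the constant differs from the 4-or-6 above): for a head-on shot from free-path distance L, the per-bounce amplification comes out to 1 + 2L/R.

```python
import math

def wrap(a):
    """Map an angle difference into [-pi, pi)."""
    return (a + math.pi) % (2 * math.pi) - math.pi

def outgoing_angle(launch_angle, L, R):
    """Launch a ray from distance L upstream of a circular scatterer of
    radius R centered at the origin; return its direction after one
    specular bounce."""
    px, py = -(R + L), 0.0
    vx, vy = math.cos(launch_angle), math.sin(launch_angle)
    b = px * vx + py * vy                  # solve |p + t v|^2 = R^2 for t
    c = px * px + py * py - R * R
    t = -b - math.sqrt(b * b - c)          # first intersection
    hx, hy = px + t * vx, py + t * vy
    nx, ny = hx / R, hy / R                # outward normal at the hit point
    d = vx * nx + vy * ny
    wx, wy = vx - 2 * d * nx, vy - 2 * d * ny   # reflect the velocity
    return math.atan2(wy, wx)

# Per-bounce amplification of a small aiming error, by central difference.
L, R, eps = 1.0, 1.0, 1e-6
amp = abs(wrap(outgoing_angle(eps, L, R) - outgoing_angle(-eps, L, R))) / (2 * eps)
assert abs(amp - (1 + 2 * L / R)) < 1e-3   # matches 1 + 2L/R head-on
```

Whatever the exact constant, it is greater than one, which is all the Lyapunov argument needs: errors multiply every bounce.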
The good news is that chaos theory can rule out solutions with extreme prejudice- and because it's a formal theory, it lets you be very clear whether it's ruling out a solution absolutely (FluttershAI and Clippy combined aren't going to be able to predict the weather a decade in advance) or ruling out a solution in all practicality, but teeeechnically (i.e., predicting 4-5 swings of a double pendulum). Here are the concrete examples that come to mind:
I wrote a CPU N-body simulator, and then ported it to CUDA. I can't test that the port is correct by comparing long trajectories of the CPU simulator to the CUDA simulator, and because I know this is a chaos problem, I won't try to fix this by adding epsilons to the test. Instead, I will fix it by running the test simulation for less than roughly a Lyapunov time.
I wrote a genetic algorithm for designing trebuchets. The final machine, while very efficient, is visibly using precise timing after several cycles of a double pendulum. Therefore, I know it can't be built in real life.
I see a viral gif of black and white balls being dumped in a pile, so that when they come to rest they form a smiley face. I know it's fake because that's not something that can be achieved even with careful orchestration
I prove a differential equation is chaotic, so I don't need to try to find a closed form solution.
One thing that jumps out, writing this out explicitly, is that chaos theory could conceivably be replaced with the intuition of "well, obviously that won't work," and so I don't know to what extent chaos theory just formulated wisdom that existed pre-1950s, vs generated wisdom that got incorporated into modern common sense. Either way, formal is nice- in particular, the "can't test end to end simulations by direct comparison" and "can't find a closed form solution" cases saved me a lot of time.
Hi! I seem to run into chaos relatively often in practice. It's extremely useful but not likely to have flagship applications, because it mostly serves to rule out solutions. The workflow looks like
"I have an idea! It is very brilliant"
"Your idea is wonderful, but it's probably fucked by chaos. Calculate a Lyapunov exponent"
calculates Lyapunov exponent
"fuck"
But this is of course much better than trying the idea for weeks or months without a concept of why it's impossible
A different perspective: Colleges very, very badly want you to graduate - especially if you look like you have been doing something other than playing videogames high on weed in your apartment for four years. The upshot of this is that in the case suggested (top 1% IQ, top 50% conscientiousness) after a threshold of maybe 5 hours a week, any effort put specifically towards graduating is basically wasted- going to college with the goal of graduating is severely, severely under-determined. Take chemistry classes! Take physics classes! Take graduate math classes without the prerequisites and fail them! Calculate which essays you don't strictly have to turn in! Build a rocket ship or a race car! Found a startup! Practice bullying administrators into giving you class credit for all of the above!
What college is providing you is 35 hours a week of working time to do with as you please, access to 3-D printers, a machine shop, math classes, the local supercomputer, a chemistry lab, oscilloscopes and signal generators, and zero unemployment stigma. The marginal cost of also getting the credentials while you are there is tiny.
When push comes to shove, you cannot spend 4 years on the goal "graduate from college." There are very few tasks that you can achieve without a college degree that would be significantly more difficult to achieve while getting a college degree.
(There are also degrees which you absolutely can't get with 5 hours a week of effort. Selecting one of them is a choice, and frankly, these are not degrees that you are going to successfully self teach.)
I consistently get upvotes and lots of disagrees when I post thoughts on alignment, which is much more encouraging than downvotes.
Today many of us are farther away from ground truth. The internet is an incredible means of sharing and discovering information, but it promotes or suppresses narratives based on clicks, shares, impressions, attention, ad performance, reach, drop off rates, and virality - all metrics of social feedback. As our organizations grow larger, our careers are increasingly beholden to performance reviews, middle managers' proclivities, and our capacity to navigate bureaucracy. We find ourselves increasingly calibrated by social feedback and more distant from direct validation or repudiation of our beliefs about the world.
I seek a way to get empirical feedback on this set of claims- specifically the direction-of-change-over-time assertions "farther... increasingly... more distant..."
Yeah, in the lightcone scenario evolution probably never actually aligns the inner optimizers- although it may, as a superintelligence copying itself will have little leeway for any of those copies having slightly more drive to copy themselves than their parents. Depends on how well it can fight robot cancer.
However, while a cancer free paperclipper wouldn't achieve "AGIs take over the lightcone and fill it with copies of themselves, to at least 90% of the degree to which they would do so if their terminal goal was filling it with copies of themselves," they would achieve something like "AGIs take over the lightcone and briefly fill it with copies of themselves, to at least 10^-3% of the degree to which they would do so if their terminal goal was filling it with copies of themselves" which is in my opinion really close. As a comparison, if Alice sets off Kmart AIXI with the goal of creating utopia we don't expect the outcome "AGIs take over the lightcone and convert 10^-3% of it to temporary utopias before paperclipping."
Also, unless you beat entropy, for almost any optimization target you can trade "fraction of the universe's age during which your goal is maximized" against "fraction of the universe in which your goal is optimized" since it won't last forever regardless. If you can beat entropy, then the paperclipper will copy itself exponentially forever.
Evolution is threatening to completely recover from a worst case inner alignment failure. We are immensely powerful mesaoptimizers. We are currently wildly misaligned from optimizing for our personal reproductive fitness. Yet, this state of affairs feels fragile! The prototypical lesswrong AI apocalypse involves robots getting into space and spreading at the speed of light extinguishing all sapient value, which from the point of view of evolution is basically a win condition.
In this sense, "reproductive fitness" is a stable optimization target. If there are more stable optimizations targets (big if), finding one that we like even a little bit better than "reproductive fitness" could be a way to do alignment.
Basically, the claims in the linked post that LLM inference is compute bound, and that a modern nvidia chip inferring LLaMa only achieves 30% utilization, seem extraordinarily unlikely to both be true.
Crypto ASICs fundamentally didn’t need memory bandwidth. Modern GPUs are basically memory-bandwidth ASICs already.
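A back-of-envelope roofline check, with ballpark numbers rather than vendor specs:

```python
# Batch-1 LLM decoding reads every weight once per token and does ~2 FLOPs
# per weight, so its arithmetic intensity is ~1 FLOP per byte at fp16.
params = 7e9                  # a LLaMa-7B-class model
bytes_per_param = 2           # fp16 weights
flops_per_token = 2 * params
bytes_per_token = bytes_per_param * params
intensity = flops_per_token / bytes_per_token   # FLOPs per byte moved

# An A100-class GPU: ~312 Tflop/s fp16 tensor compute, ~2 TB/s of HBM.
ridge = 312e12 / 2e12         # FLOP/byte needed to become compute-bound

assert intensity < ridge      # decode is deep in memory-bound territory
tokens_per_sec = 2e12 / bytes_per_token   # bandwidth-limited ceiling
```

At these numbers the bandwidth ceiling is ~140 tokens/s, and a chip sitting at that ceiling would show single-digit-percent compute utilization- which is why "compute bound" and "30% utilization" are hard to square with each other.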
Phenomenon: The cosmological principle
Situation where it seems false and unfalsifiable: The distant future after galaxies outside of the local group depart the cosmic event horizon
According to a widely held understanding of the far future (~100 billion years out), the distant galaxies will fade completely from view and the local group will likely merge into one galaxy. For civilizations that arise in this future orbiting trillion-year-old red dwarfs, the hypothesis that there are billions of galaxies just like the one they are in will be unfalsifiable. The evidence will point to all mass in the universe living in one lump with a reachable center.
This isn't my example, it's sort of the canonical scenario to use as a metaphor for how inflation-based-multiverse theories could be true yet undetectable. For example, see the afterword to "A Universe from Nothing" https://www.google.com/books/edition/A_Universe_from_Nothing/TGpbASdsIW4C?hl=en&gbpv=1&dq=A%20universe%20from%20nothing%20dawkins&pg=PA187&printsec=frontcover
You commented yourself that the word "woke" is ill defined, but I don't think this post takes that ill-definition seriously enough. I don't really know what you mean by it, and frankly I'd be surprised if two readers (both within the LessWrong overton window but with significant political differences), who were both confident that they understood what you meant, had the same understanding.
I've laid out a concrete example of this at https://www.lesswrong.com/posts/FgXjuS4R9sRxbzE5w/medical-image-registration-the-obscure-field-where-deep , following the "optimization on a scaffold level" route. I found a real example of a misaligned inner objective outside of RL, which is cool
No one we have worked with has had a license. I think you need one to take care of multiple people's kids at your house, but not to take care of one family's kids at their house.
If you can get to Seattle for your partner's career, you can likely get a job nannying during the day, which will pay $25 to $30 an hour and doesn't require a car.
This time last summer I was an incoming intern in Seattle and I was unable to pay less than $30 an hour for childcare during working hours, hiring by combing through Facebook groups for nannies and sending many messages. At this price, one of the nannies we worked with had a car and the other did not. I do not know what the childcare market is like near your current location.
To add explore / exploit, just start the game's chess clock before allowing the players to start reading the rules.
If you choose a single player game, you are going to have to carefully calibrate the level of difficulty and the type of difficulty. However, if you pick any two player competitive strategy game you can focus on the type of difficulty, as the level of difficulty will be calibrated automatically to "half your participants will win."
My recommendation would be to rig up a way to randomly sample from the two player board games on boardgamearena.com that neither player has ever played before (can be as simple as putting 20 names on index cards, the players remove any cards they recognize, then shuffle and draw).
A concrete research direction in the "Searching for Search" field is to find out whether ChessGPTs or the Leela Chess Zero network are searching. Your "babble and prune" description seems checkable: maybe something like a linear probe for continuation board states in the residual stream, checking whether bad continuations are easier to find in earlier layers? Thank you for this writeup.
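As a toy illustration of the probing methodology only (synthetic activations stand in for a real chess model's residual stream, and the "board feature" is planted by construction):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "residual stream": activation vectors in which one board
# feature (say, "this continuation hangs the queen") is linearly encoded.
# A real experiment would use activations captured from a chess model,
# layer by layer.
d_model = 64
acts = rng.normal(size=(6000, d_model))
true_direction = rng.normal(size=d_model)
labels = (acts @ true_direction > 0).astype(float)

# Fit a linear probe by least squares on 5000 examples, evaluate on the
# held-out 1000.
train, test = slice(0, 5000), slice(5000, 6000)
w, *_ = np.linalg.lstsq(acts[train], labels[train] - 0.5, rcond=None)
preds = acts[test] @ w > 0
accuracy = (preds == (labels[test] > 0.5)).mean()
```

If probes trained per layer showed bad continuations decodable early and pruned later, that would look like babble-and-prune; flat decodability across depth would not.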
Mostly I think your thought process is quite good! But if you list out the design constraints of your logistics drone (deliver airborne self-guided munitions into a maximally hostile area) vs the design constraints of a modern attack aircraft (deliver airborne guided munitions into a maximally hostile area), you’ll find that they’re the same constraints- so likely a fully optimized logistics drone is going to just be an F-35 or MQ-9. This assumes that dropping mesh-networked batteries on parachutes or even just fresh drones will work better than landing the mothership or docking to recharge.
I think that’s the key takeaway- most of the killing will be done by the small drone infantry as you described; the air war still controls where the small drone infantry can deploy; the small drone infantry has limited ability to affect the air war.
Flying low works when the other guy is either on the ground or forced to also fly low by your ground based radar. It doesn’t actually do anything against a high altitude radar.
Also, there’s a bit of domain knowledge you need: anything with rotors reflects 500 mph Doppler-shifted radio waves even when stationary, which makes it incredibly visible to any radar that is looking for aircraft.
You still need something to contest stealthy high altitude aircraft to protect your logistics drones. Against the proposed setup, any force with ground attack aircraft would shred the entire force of logistics drones from 40,000 feet and then wait for the rest to run out of batteries. If your price ceiling per unit is a laser guided bomb, you are going to have a damned hard time making a logistics drone carrying multiple attack drones, each carrying multiple guided munitions.
Taking off when you spot it will not save you from a laser guided bomb. https://www.sandboxx.us/news/how-an-f-15e-shot-down-an-iraqi-gunship-with-a-bomb/
Only two moves have worked against NATO forces since the development of the F-117: hide among civilians and threaten nuclear retaliation. I don't see anything here that proposes a third effective move.
The rumors are that this was SpaceX's secret- even at huge scale, Musk interviewed every employee. From even the positive accounts of the process, his hiring and firing decision making was sleep deprived, stimulant addled, inconsistent, and childish. On the other hand, something is going right at SpaceX, judging by the rockets. I agree with the theory that one agent hiring mediocrely is just more effective than professional and polite staffing decisions made by a swarm of agents at cross purposes.
Diaper changes are rare and precious peace
Suffering from ADHD, I spend most of my time stressed that whatever I'm currently doing, it's not actually the highest priority task and something or someone I've forgotten is increasingly mad that I'm not doing their task instead.
One of the few exceptions is doing a diaper change. Not once in the past 2 years have I been mid-diaper-change and thought "Oh shit, there was something more important I needed to be doing right now."
There are two completely distinct ways to swing on a swing- you can rotate your body relative to the seat-chain body at the same frequency as your swinging but out of phase, or move your center of mass up and down the chain at twice the frequency. The power of the former is ~ (torque applied to chain) × (angular velocity); the power output of the latter is ~ (radial velocity of your body) × (angular velocity² × chain length).
To get to any height, you have to switch from one to the other once the angular velocity ^2 term dominates- this is why learning to swing is so unintuitive.
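A toy model of the second strategy: shorten the effective pendulum at the bottom of each swing (angular momentum l²·ω is conserved, so ω jumps up) and lengthen it again at the turning points (ω = 0 there, so nothing is lost). This is a sketch under those idealizations, not a full rider model:

```python
import math

def pump_swing(theta0=0.1, l_long=2.0, l_short=1.8, g=9.8,
               dt=1e-4, n_peaks=10):
    """Pendulum whose length switches between l_long and l_short:
    shorten at each bottom crossing (conserving angular momentum
    l^2 * omega, so omega jumps up), lengthen at each turning point
    (omega = 0, so no momentum is lost). Returns the swing amplitude
    at successive turning points."""
    theta, omega, l = theta0, 0.0, l_long
    peaks = []
    while len(peaks) < n_peaks:
        omega_new = omega - (g / l) * math.sin(theta) * dt
        theta_new = theta + omega_new * dt
        if theta * theta_new < 0:            # bottom crossing: stand up
            omega_new *= (l / l_short) ** 2
            l = l_short
        if omega * omega_new < 0:            # turning point: squat back down
            peaks.append(abs(theta_new))
            l = l_long
        theta, omega = theta_new, omega_new
    return peaks

peaks = pump_swing()
```

Each recorded peak is larger than the last, so amplitude grows geometrically- which is the pumping that the "twice the frequency" strategy buys you.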
I should emphasize that he did not succeed at hurting another kid in his allergy plot, and was not likely to. 1% of kids with psychopathic tendencies sounds rare when you’re parenting one kid, but it sounds like Tuesday when you have the number of kids seen by an institution like a summer camp- there’s paperwork, procedures, escalations, all hella battle-tested. Typically with a kid in the cluster, we focus on safety but also work hard to integrate them and let all the kids have a good experience. His behavior was different enough from a typical violent, unresponsive-to-punishment kid that we weren’t able to keep him at camp, because the standard fun-preserving, behavior-improving parts of these policies did not work at all on him (very weird, they always work), but the safety-oriented policies- boost the staff-to-camper ratio around him, always have one staff member watching him, document everything, and brief staff members who will be supervising him- worked fine.