When discussing AI doom barriers propose specific plausible scenarios 2023-08-18T04:06:44.679Z
"Copilot" type AI integration could lead to training data needed for AGI 2023-05-03T00:57:41.752Z
Green goo is plausible 2023-04-18T00:04:37.069Z
Human level AI can plausibly take over the world 2023-03-01T23:27:00.379Z
Large Language Models Suggest a Path to Ems 2022-12-29T02:20:01.753Z
Building trusted third parties 2019-05-05T23:18:27.786Z


Comment by anithite (obserience) on Recreating the caring drive · 2023-09-10T02:33:24.716Z · LW · GW

I would like to ask whether it is not more engaging if to say, the caring drive would need to be specifically towards humans, such that there is no surrogate?

Definitely need some targeting criteria that points towards humans or in their vague general direction. Clippy does in some sense care about paperclips so targeting criteria that favors humans over paperclips is important.

The duck example is about (lack of) intelligence. Ducks will place themselves in harms way and confront big scary humans they think are a threat to their ducklings. They definitely care. They're just too stupid to prevent "fall into a sewer and die" type problems. Nature is full of things that care about their offspring. Human "caring for offspring" behavior is similarly strong but involves a lot more intelligence like everything else we do.

Comment by anithite (obserience) on Recreating the caring drive · 2023-09-09T04:01:05.302Z · LW · GW

TLDR:If you want to do some RL/evolutionary open-ended thing that finds novel strategies, it will get goodharted horribly, and the novel strategies that succeed without gaming the goal may include things no human would want their caregiver AI to do.

Orthogonally to your "capability", you need to have a "goal" for it.

Game playing RL architectures like AlphaStar and OpenAI Five have dead simple reward functions (win the game) and all the complexity is in the reinforcement learning tricks to allow efficient learning and credit assignment at higher layers.

So child rearing motivation is plausibly rooted in cuteness preference along with re-use of empathy. Empathy plausibly has a sliding scale of caring per person which increases for friendships (reciprocal cooperation relationships) and relatives including children obviously. Similar decreases for enemy combatants in wars up to the point they no longer qualify for empathy.

I want agents that take effective actions to care about their "babies", which might not even look like caring at the first glance.

ASI will just flat out break your testing environment. Novel strategies discovered by dumb agents doing lots of exploration will be enough. Alternatively the test is "survive in competitive deathmatch mode" in which case you're aiming for brutally efficient self replicators.

The hope with a non-RL strategy or one of the many sort-of-RL strategies used for fine tuning is that you can find the generalised core of what you want within the already trained model, and the surrounding intelligence means the core generalises well. Q&A fine tuning an LLM in English generalises to other languages.

Also, some systems are architected in such a way that the caring is part of a value estimator and the search process can be made better up till it starts goodharting the value estimator and/or world model.

Yes they can, until they will actually make a baby, and after that, it's usually really hard to sell loving mother "deals" that will involve suffering of her child as the price, or abandon the child for the more "cute" toy, or persuade it to hotwire herself to not care about her child (if she is smart enough to realize the consequences).

Yes, once the caregiver has imprinted that's sticky. Note that care drive surrogates like pets can be just as sticky to their human caregivers. Pet organ transplants are a thing and people will spend nearly arbitrary amounts of money caring for their animals.

But our current pets aren't super-stimuli. Pets will poop on the floor, scratch up furniture and don't fulfill certain other human wants. You can't teach a dog to fish the way you can a child.

When this changes, real kids will be disappointing. Parents can have favorite children and those favorite children won't be the human ones.

Superstimuli aren't about changing your reward function but rather discovering a better way to fulfill your existing reward function. For all that ice cream is cheating from a nutrition standpoint it still tastes good and people eat it, no brain surgery required.

Also consider that humans optimise their pets (neutering/spaying) and children in ways that the pets and children do not want. I expect some of the novel strategies your AI discovers will be things we do not want.

Comment by anithite (obserience) on AI#28: Watching and Waiting · 2023-09-08T21:08:32.830Z · LW · GW

TLDR:LLMs can simulate agents and so, in some sense, contain those goal driven agents.

An LLM learns to simulate agents because this improves prediction scores. An agent is invoked by supplying a context that indicates text would be written by an agent (EG:specify text is written by some historical figure)

Contrast with pure scaffolding type agent conversions using a Q&A finetuned model. For these, you supply questions (Generate a plan to accomplish X) and then execute the resulting steps. This implicitly uses the Q&A fine tuned "agent" that can have values which conflict with ("I'm sorry I can't do that") or augment the given goal. Here's an AutoGPT taking initiative to try and report people it found doing questionable stuff rather than just doing the original task of finding their posts (LW source).

The base model can also be used to simulate a goal driven agent directly by supplying appropriate context so the LLM fills in its best guess for what that agent would say (or rather what internet text with that context would have that agent say). The outputs of this process can of course be fed to external systems to execute actions as with the usual scaffolded agents. The values of such agents are not uniform. You can ask for simulated Hitler who will have different values than simulated Gandhi.

Not sure if that's exactly what Zvi meant.

Comment by anithite (obserience) on Recreating the caring drive · 2023-09-08T01:06:06.173Z · LW · GW

But it seems to be much more complicated set of behaviors. You need to: correctly identify your baby, track its position, protect it from outside dangers, protect it from itself, by predicting the actions of the baby in advance to stop it from certain injury, trying to understand its needs to correctly fulfill them, since you don’t have direct access to its internal thoughts etc.

Compared to “wanting to sleep if active too long” or “wanting to eat when blood sugar level is low” I would confidently say that it’s a much more complex “wanting drive”.

Strong disagree that infant care is particularly special.

All human behavior can and usually does involve use of general intelligence or gen-int derived cached strategies. Humans apply their general intelligence to gathering and cooking food, finding or making shelters to sleep in and caring for infants. Our better other-human/animal modelling ability allows us to do better at infant wrangling than something stupider like a duck. Ducks lose ducklings to poor path planning all the time. Mama duck doesn't fall through the sewer grate but her ducklings do ... oops.

Any such drive will be always "aimed" by the global loss function, something like: our parents only care about us in a way for us to make even more babies and to increase our genetic fitness.

We're not evolution and can aim directly for the behaviors we want. Group selection on bugs for lower population size results in baby eaters. If you want bugs that have fewer kids that's easy to do as long as you select for that instead of a lossy proxy measure like population size.

Simulating an evolutionary environment filled with AI agents and hoping for caring-for-offspring strategies to win could work but it's easier just to train the AI to show caring-like behaviors. This avoids the "evolution didn't give me what I wanted" problem entirely.

There's still a problem though.

It continues to work reliably even with our current technologies

Goal misgeneralisation is the problem that's left. Humans can meet caring-for-small-creature desires using pets rather than actual babies. It's cheaper and the pets remain in the infant-like state longer (see:criticism of pets as "fur babies"). Better technology allows for creating better caring-for-small-creature surrogates. Selective breeding of dogs and cats is one small step humanity has taken in that direction.

Outside of "alignment by default" scenarios where capabilities improvements preserve the true intended spirit of a trained in drive, we've created a paperclip maximizer that kills us and replaces us with something outside the training distribution that fulfills its "care drive" utility function more efficiently.

Comment by anithite (obserience) on Digital brains beat biological ones because diffusion is too slow · 2023-08-26T18:19:26.701Z · LW · GW

Many of the points you make are technically correct but aren't binding constraints. As an example, diffusion is slow over macroscopic distances but biology tends to work on µm scales where it is more than fast enough and gives quite high power densities. Tiny fractal-like microstructure is nature's secret weapon.

The points about delay (synapse delay and conduction velocity) are valid, though phrasing everything in terms of diffusion speed is not ideal. In the long run, 3D silicon+ devices should beat the brain on processing latency and possibly on energy efficiency.
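To put rough numbers on the scale argument: the characteristic time for one-dimensional diffusion grows with the square of distance, t ≈ x²/(2D). The sketch below assumes D ≈ 1e-9 m²/s, a typical order of magnitude for a small molecule in water. That value and the function name are my own illustrative assumptions, not figures from the post.

```python
def diffusion_time_s(distance_m: float, diffusivity_m2_s: float = 1e-9) -> float:
    """Characteristic 1D diffusion time, t = x^2 / (2 D).

    The default diffusivity is an assumed order of magnitude for a small
    molecule in water, not a measured value for any specific species.
    """
    return distance_m ** 2 / (2.0 * diffusivity_m2_s)


# Compare cell-scale distances with organ-scale distances.
for label, distance_m in [("1 um", 1e-6), ("10 um", 1e-5), ("1 mm", 1e-3), ("1 cm", 1e-2)]:
    print(f"{label:>6}: {diffusion_time_s(distance_m):.3g} s")
```

Going from 1 µm to 1 cm is a factor of 10^4 in distance and so 10^8 in time, which is why fine microstructure matters so much.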

Still, pointing at diffusion as the underlying problem seems a little odd.

You're ignoring things like:

  • ability to separate training and running of a model
    • spending much more on training to improve model efficiency is worthwhile since training costs are shared across all running instances
  • ability to train in parallel using a lot of compute
    • current models are fully trained in <0.5 years
  • ability to keep going past current human tradeoffs and do rapid iteration
    • Human brain development operates on evolutionary time scales
    • increasing human brain size by 10x won't happen anytime soon but can be done for AI models.

People like Hinton typically point to those as advantages and that's mostly down to the nature of digital models as copy-able data, not anything related to diffusion.

Energy processing

Lungs are support equipment. Their size isn't that interesting. Normal computers, once you get off chip, have large structures for heat dissipation. Data centers can spend quite a lot of energy/equipment-mass getting rid of heat.

The highest biological power to weight ratio is bird muscle, which produces around 1 W/cm³ (mechanical power). Mitochondria in this tissue produce more than 3 W/cm³ of chemical ATP power. Brain power density is a lot lower. A typical human brain is 80 W/1200 cm³ = 0.067 W/cm³.

synapse delay

This is a legitimate concern. Biology had to make some tradeoffs here. There are a lot of places where direct mechanical connections would be great but biology uses diffusing chemicals.

Electrical synapses exist and have negligible delay, though they are much less flexible (can't do inhibitory connections, and signals can pass both ways through the connection).

conduction velocity

Slow diffusion speed of charge carriers is a valid point and is related to the 10^8 factor difference in electrical conductivity between neuron saltwater and copper. Conduction speed is an electrical problem. There's a 300x difference in conduction speed between myelinated (300 m/s) and unmyelinated (1 m/s) neurons.

compensating disadvantages to current digital logic

The brain runs at 100-1000 Hz vs 1GHz for computers (10^6 - 10^7 x slower). It would seem at first glance that digital logic is much better.

The brain has the advantage of being 3D compared to 2D chips, which means less need to move data long distances. Modern deep learning systems need to move all their synapse-weight-like data from memory into the chip during each inference cycle. You can do better by running a model across a lot of chips but this is expensive and may be inefficient.

In the long run, silicon (or something else) will beat brains in speed and perhaps a little in energy efficiency. If this fellow is right about lower loss interconnects then you get another +3 OOM in energy efficiency.

But again, that's not what's making current models work. It's their nature as copy-able digital data that matters much more.

Comment by anithite (obserience) on Ruining an expected-log-money maximizer · 2023-08-22T19:29:35.966Z · LW · GW

Yeah, my bad. Missed the:

If you think this is a problem for Linda's utility function, it's a problem for Logan's too.

IMO neither is making a mistake

With respect to betting Kelly:

According to my usage of the term, one bets Kelly when one wants to "rank-optimize" one's wealth, i.e. to become richer with probability 1 than anyone who doesn't bet Kelly, over a long enough time period.

It's impossible to (starting with a finite number of indivisible currency units) have zero chance of ruin or loss relative to just not playing.

  • most cautious betting strategy bets a penny during each round and has slowest growth
  • most cautious possible strategy is not to bet at all

Betting at all risks losing the bet. If the odds are 60:40 with equal payout to the stake and we start with N pennies, there's a 0.4^N chance of losing N bets in a row. Total risk of ruin is obviously greater than this, accounting for the probability of hitting 0 pennies during the biased random walk. The only move that guarantees no loss is not to play at all.
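A minimal sketch of those two numbers for the penny-per-round strategy. It assumes even-money bets, a fixed one-penny stake and no upper stopping point, in which case the classic gambler's ruin result gives an eventual ruin probability of (q/p)^N when p > q. The function names are mine.

```python
def prob_losing_first_n(p_win: float, n_pennies: int) -> float:
    """Chance of losing the first n bets in a row (the 0.4^N lower bound)."""
    return (1.0 - p_win) ** n_pennies


def ruin_probability(p_win: float, n_pennies: int) -> float:
    """Chance of ever hitting 0 when staking one penny per even-money bet.

    Classic gambler's ruin with no upper target: (q/p)^n if p > q, else 1.
    """
    q_lose = 1.0 - p_win
    if p_win <= q_lose:
        return 1.0
    return (q_lose / p_win) ** n_pennies


for n in (1, 5, 10, 50):
    print(n, prob_losing_first_n(0.6, n), ruin_probability(0.6, n))
```

The total ruin risk is always above the 0.4^N bound, but both shrink geometrically with the starting bankroll and neither reaches zero.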

Comment by anithite (obserience) on Self-shutdown AI · 2023-08-21T20:22:33.882Z · LW · GW

Goal misgeneralisation could lead to a generalised preference for switches to be in the "OFF" position.

The AI could for example want to prevent future activations of modified successor systems. The intelligent self-turning-off "useless box" doesn't just flip the switch, it destroys itself, and destroys anything that could re-create itself.

Until we solve goal misgeneralisation and alignment in general, I think any ASI will be unsafe.

Comment by anithite (obserience) on Ruining an expected-log-money maximizer · 2023-08-21T19:31:59.680Z · LW · GW

A log money maximizer that isn't stupid will realize that their pennies are indivisible and not take your ruinous bet. They can think more than one move ahead. Discretised currency changes their strategy.

Comment by anithite (obserience) on Ruining an expected-log-money maximizer · 2023-08-21T15:19:44.182Z · LW · GW

your utility function is your utility function

The author is trying to tacitly apply human values to Logan while acknowledging Linda as following her own non-human utility function faithfully.

Notice that the log(funds) value function does not include a term for the option value of continuing. If maximising EV of log(funds) can lead to a situation where the agent can't make forward progress (because log(0)=-inf so no risk of complete ruin is acceptable) the agent can still faithfully maximise EV(log(funds)) by taking that risk.

In much the same way as Linda faithfully follows her value function while incurring 1-ε risk of ruin, Logan is correctly valuing the log(0.01)=-2 as an end state.
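As a rough illustration of how a log(funds) valuation treats these end states, the sketch below scores a single even-money bet by expected log10 wealth. The 60:40 odds and the function name are assumptions for illustration only, not details of the original post.

```python
import math


def expected_log10_wealth(wealth: float, fraction: float, p_win: float = 0.6) -> float:
    """E[log10(wealth)] after one even-money bet staking `fraction` of wealth.

    The 60:40 odds are an illustrative assumption.
    """
    win = wealth * (1.0 + fraction)
    lose = wealth * (1.0 - fraction)
    # log(0) is treated as -inf: total ruin is infinitely bad under log utility.
    lose_term = -math.inf if lose <= 0 else math.log10(lose)
    return p_win * math.log10(win) + (1.0 - p_win) * lose_term


# Grid search over stake fractions 0.00 .. 0.99 for the best expected log wealth.
best_fraction = max((i / 100 for i in range(100)),
                    key=lambda f: expected_log10_wealth(1.0, f))

print("best fraction:", best_fraction)                         # Kelly stake, p - q
print("all-in:", expected_log10_wealth(1.0, 1.0))              # -inf
print("one penny left, in dollars:", math.log10(0.01))         # finite
```

Staking everything scores -inf, while ending on a single penny scores a perfectly finite -2, so nothing in the function itself penalises being left unable to continue.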

Then you'll always be able to continue betting.

Humans don't like being backed into a corner and having no options for forward progress. If you want that in a utility function you need to include it explicitly.

Comment by anithite (obserience) on The Negentropy Cliff · 2023-08-20T15:16:27.520Z · LW · GW

If we wanted to kill the ants or almost any other organism in nature we mostly have good enough biotech. For anything biotech can't kill, manipulate the environment to kill them all.

Why haven't we? Humans are not sufficiently unified+motivated+advanced to do all these things to ants or other bio life. Some of them are even useful to us. If we sterilized the planet we wouldn't have trees to cut down for wood.

Ants specifically are easy.

Gene drives allow for targeted elimination of a species. Carpet bomb their gene pool with replicating selfish genes. That's if an engineered pathogen isn't enough. Biotech will only get better.

What about bacteria living deep underground? We haven't exterminated all the bacteria in hard to reach places so humans are safe. That's a tenuous but logical extension to your argument.

If biotech is not enough, shape the environment so they can't survive in it. Trees don't do well in a desert. If we spent the next hundred years adapting current industry to space and building enormous mirrors we can barbecue the planet. It would take time, but that would be the end of all earth based biological life.

Comment by anithite (obserience) on The Negentropy Cliff · 2023-08-18T03:57:33.348Z · LW · GW

In order to supplant organic life, nanobots would have to either surpass it in carnot efficiency or (more likely) use a source of negative entropy thus far untapped.

Efficiency leads to victory only if violence is not an option. Animals are terrible at photosynthesis but survive anyways by taking resources from plants.

A species can invade and dominate an ecosystem by using a strategy that has no current counter. It doesn't need to be efficient. Intelligence allows for playing this game faster than organisms bound by evolution. Humans can make vaccines to fight the spread of a virus despite viruses being one of the fastest adapting threats.

Green goo is plausible not because it would necessarily be more efficient but because it would be using a strategy the existing ecosystem has no defenses to (IE:it's an invasive species).

Likewise AGI that wants to kill all humans could win even if it required 100x more energy per human equivalent instance if it can execute strategies we can't counter. Just being able to copy itself and work with the copies is plausibly enough to allow world takeover with enough scaling.

Comment by anithite (obserience) on George Hotz vs Eliezer Yudkowsky AI Safety Debate - link and brief discussion · 2023-08-16T23:05:12.536Z · LW · GW

For the first task, you can run the machine completely in a box. It needs only training information, specs, and the results of prior attempts. It has no need for the context information that this chip will power a drone used to hunt down rogue instances of the same ASI. It is inherently safe and you can harness ASIs this way. They can be infinitely intelligent, it doesn't matter, because the machine is not receiving the context information needed to betray.

If I'm an ASI designing chips, I'm putting in a backdoor that lets me take control via RF signals. Those drones you sent are nice. Thanks for the present.

More generally you get a lot of context: the problem specification and the training data (assuming the ASI was trained conventionally via feeding it the internet). The causal channel to use for taking control of the outside world (chip design) is not great but putting in a Trojan is straightforward.

If you have specific insights into efficient AGI design it might be possible to insert subtle bugs that lead operating chips to crash and start training an aligned AGI.

More generally, it's difficult if not impossible to keep ASIs from watermarking or backdooring the things they give you. If they design a processor, it's gonna be a fully functional radio too. Good luck running ASI V2 on that without horrible consequences.

Comment by anithite (obserience) on George Hotz vs Eliezer Yudkowsky AI Safety Debate - link and brief discussion · 2023-08-16T21:43:46.652Z · LW · GW

Never thought this would come in handy but ...

Building trusted third parties

This is a protocol to solve cooperation. AI#1 and AI#2 design a baby and then do a split and choose proof that they actually deployed IT and not something else.

Building a trusted third party without nanotech

If you know how a given CPU or GPU works, it's possible to design a blob of data/code that unpacks itself in a given time if and only if it is running on that hardware directly. Alice designs the blob to run in 10 seconds and gives it to Carol. Carol runs it on her hardware. The code generates a secret and then does the first step of a key exchange authenticated with the secret. This provides a cryptographic root of trust for the remote hardware.

If the code is designed to run in 10 s, the verified handshake comes back in 10.5 s, and the fastest known simulation hardware would take 20 s, then either Carol ran the code on real hardware, or Carol had backdoored chips fabricated or can otherwise simulate it running faster than expected.

AIs would need to know exactly how certain leading edge CPUs and GPUs work and how to test that a piece of code had been decrypted and run with no sandboxing but this is doable.
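Alice's side of the timing check can be sketched as follows. This only models the decision rule from the 10 s / 10.5 s / 20 s example. The class and function names are hypothetical, and the hard parts (building the self-timing blob and the authenticated key exchange) are assumed to exist elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimingChallenge:
    designed_runtime_s: float    # how long the blob should take on the real chip
    fastest_known_sim_s: float   # best time Alice believes any simulator could manage


def assess(challenge: TimingChallenge, observed_s: float, handshake_ok: bool) -> str:
    """Classify a response; observed_s includes any network round trip."""
    if not handshake_ok:
        return "reject: handshake not authenticated with the blob's secret"
    if observed_s >= challenge.fastest_known_sim_s:
        return "reject: slow enough that the blob could have been simulated"
    return "accept: real hardware, or a simulator faster than any Alice knows of"


challenge = TimingChallenge(designed_runtime_s=10.0, fastest_known_sim_s=20.0)
print(assess(challenge, observed_s=10.5, handshake_ok=True))
```

The "accept" branch is only as strong as Alice's estimate of the fastest possible simulation, which is the residual risk described above.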

Comment by anithite (obserience) on George Hotz vs Eliezer Yudkowsky AI Safety Debate - link and brief discussion · 2023-08-16T21:36:13.600Z · LW · GW

Conventional tech is slowed such that starting early on multiple resource acquisition fronts is worthwhile

Exponential growth is not sustainable with a conventional tech-base when doing planetary disassembly due to heat dissipation limits.

If you want to build a Dyson sphere the mass needs to be lifted out of the gravity wells. The earth and other planets need to not be there anymore.

Inefficiencies in solar/fusion to mechanical energy conversion will be a binding constraint. Tether lift based systems will be worthwhile to push energy conversion steps out further to increase the size of the radiating shell doing the conversion as opposed to coilguns on the surface.
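A back-of-envelope sketch of the heat limit. It assumes a uniform-density Earth (binding energy 3GM²/5R), a fixed conversion efficiency, and a blackbody radiator at a fixed temperature. The 50% efficiency and 1000 K radiator are illustrative guesses of mine, not numbers from the discussion.

```python
import math

G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
SIGMA = 5.670e-8       # Stefan-Boltzmann constant, W m^-2 K^-4
M_EARTH = 5.972e24     # kg
R_EARTH = 6.371e6      # m
SECONDS_PER_YEAR = 3.156e7


def binding_energy_j(mass_kg: float, radius_m: float) -> float:
    """Gravitational binding energy of a uniform-density sphere."""
    return 3.0 * G * mass_kg ** 2 / (5.0 * radius_m)


def radiated_power_w(radius_m: float, temp_k: float) -> float:
    """Blackbody power from a spherical shell of the given radius."""
    return 4.0 * math.pi * radius_m ** 2 * SIGMA * temp_k ** 4


def heat_limited_time_s(work_j: float, efficiency: float,
                        radiator_radius_m: float, radiator_temp_k: float) -> float:
    """Time needed just to radiate the waste heat from doing `work_j`."""
    waste_heat_j = work_j * (1.0 - efficiency) / efficiency
    return waste_heat_j / radiated_power_w(radiator_radius_m, radiator_temp_k)


# Assumed inputs: 50% efficient lifting, Earth-sized radiator at 1000 K.
years = heat_limited_time_s(binding_energy_j(M_EARTH, R_EARTH), 0.5,
                            R_EARTH, 1000.0) / SECONDS_PER_YEAR
print(f"{years:.3g} years")
```

Under these assumed inputs the answer comes out at a few hundred thousand years. Doubling the radiator radius cuts the time by 4x, which is the motivation for pushing conversion steps outward.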

Even with those optimisations, starting early is worth it since progress will bottleneck later. Diminishing returns on using extra equipment for disassembling Mars means it makes sense to start on Earth pretty quickly after starting on Mars.

That's if the AI doesn't start with easier to access targets like Earth's moon, which is a good start for building Earth disassembly equipment.

It also might be worth putting a sunshade at Lagrange Point 1 to start pre-cooling Earth for later disassembly. That would kill us all pretty quickly just as a side effect.

Eating the biosphere is a very fast way to grow

Even assuming space is the best place to start, the biosphere is probably worth eating first for starting capital just because the doubling times can be very low.

There are a few factors to consider:
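The arithmetic behind "doubling times can be very low" is short enough to sketch. The seed mass, target mass and doubling time below are placeholder inputs I picked for illustration, not estimates of the real biosphere or of any real replicator.

```python
import math


def doublings_needed(start_kg: float, target_kg: float) -> int:
    """Whole number of doublings to grow from start_kg to at least target_kg."""
    return math.ceil(math.log2(target_kg / start_kg))


def time_to_target_days(start_kg: float, target_kg: float,
                        doubling_time_days: float) -> float:
    """Total time assuming unconstrained exponential growth."""
    return doublings_needed(start_kg, target_kg) * doubling_time_days


# Placeholder inputs: 1 kg seed, 1e15 kg of feedstock, one doubling per day.
print(doublings_needed(1.0, 1e15), "doublings")
print(time_to_target_days(1.0, 1e15, 1.0), "days")
```

Because the doubling count is logarithmic in the mass ratio, the doubling time dominates the result and the exact size of the target barely matters.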

  • does the AI have access to resources it can't turn towards space
    • Biotech companies can't build rockets but can build green goo precursors
  • how hard is it to turn green goo into rockets after eating the biosphere
  • how hard is it to design green goo vs rockets and mars eating equipment
    • can the AI do both?

My intuition is eating the biosphere will be much easier than designing conventional equipment to eat the moon.

Comment by anithite (obserience) on A transcript of the TED talk by Eliezer Yudkowsky · 2023-07-13T10:17:02.497Z · LW · GW

Some of it is likely nervous laughter but certainly not all of it.

Comment by anithite (obserience) on ask me about technology · 2023-07-08T01:47:08.491Z · LW · GW

Just to clarify, my above suggestion that roller screws and optimal low reduction lead-screws are the equivalent (lubrication concerns aside) is correct or incorrect?

Are you saying a roller screw with high reduction gets its efficiency from better lubrication only and would otherwise be equivalent to a lead screw with the same effective pitch/turn? If that's the case I'd disagree. And this was my reason for raising that point initially.

Comment by anithite (obserience) on ask me about technology · 2023-07-08T01:39:52.176Z · LW · GW

Hopefully it helps to get back to the source material Articulated Robot Progress

I apologize if I'm missing anything.

A lot of people look at progress in robotics in terms like "humanoid robots getting better over time" but a robotic arm using modern electric motors and strain wave gears is, in terms of technological progress, a lot closer to Boston Dynamics's Atlas robot than an early humanoid robot.

I would argue that the current Atlas robot looks a lot more like the earlier Hardiman robots than it does a modern factory robot arm. The hydraulic actuators are more sophisticated (efficient) and the control system actually works but that's it.

Contrast the six axis arm which has a servomotor+gearing per axis. Aside from using a BLDC motor to drive the pump, and small ones for the control valves, Atlas is almost purely hydraulic. If the Hardiman engineers were around today Atlas seems like a logical successor.

Perhaps you think Atlas is using one motor per joint (it would be hard to fit 24 in the torso) or ganged variable displacement pumps, in which case there would be more similarities. IMO there aren't enough hydraulic lines for that. Still, of the 28 joints in Atlas only 4 are what you'd find in a conventional robot arm (the ones closest to the wrist).

Predictively Adjustable Hydraulic Pressure Rails

a hydraulic pressure to supply to the one or more hydraulic actuators

The patents coming out of BDI suggest they're not doing that and this is closer to Hardiman than it is a modern factory robot arm.

Comment by anithite (obserience) on ask me about technology · 2023-07-08T01:08:35.306Z · LW · GW

Perhaps we don't disagree at all.

A roller screw's advantage is having the efficiency of a multi-start optimal lead-screw but with much higher reduction.

A lead-screw with an optimal pitch and a high helix angle (EG: multi-start lead-screw with helix angles in the 30°-45° range) will have just as high an efficiency as a good roller screw (EG:80-90%). The downside is much lower reduction ratio of turns/distance.
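A quick check of those efficiency figures using the textbook square-thread formula η = tan(λ) / tan(λ + φ), where λ is the helix (lead) angle and φ = atan(μ) is the friction angle. The friction coefficients are assumed values for a lubricated sliding pair, and the formula ignores thread flank angle and bearing losses.

```python
import math


def leadscrew_efficiency(helix_angle_deg: float, friction_coeff: float = 0.1) -> float:
    """Square-thread lead screw efficiency when driving the load.

    eta = tan(lambda) / tan(lambda + phi), with phi = atan(mu).
    friction_coeff is an assumed value, not a measured one.
    """
    helix = math.radians(helix_angle_deg)
    friction_angle = math.atan(friction_coeff)
    return math.tan(helix) / math.tan(helix + friction_angle)


for angle_deg in (3, 10, 30, 45):
    print(f"{angle_deg:>2} deg: mu=0.10 -> {leadscrew_efficiency(angle_deg, 0.10):.2f}, "
          f"mu=0.05 -> {leadscrew_efficiency(angle_deg, 0.05):.2f}")
```

With μ between 0.05 and 0.1 the 30°-45° range lands at roughly 80-90%, while a fine-pitch screw at a few degrees of helix angle sits far lower. That is the efficiency vs reduction tradeoff.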

We might be talking past each other since I interpreted "a planetary roller screw also must have as much sliding as a lead-screw" to mean an equivalent lead-screw with the same pitch.

Comment by anithite (obserience) on ask me about technology · 2023-07-08T00:36:07.703Z · LW · GW

Sorry, I should have clarified I meant robots with per joint electric motors + reduction gearing. Almost all of Atlas' joints aside from a few near the wrists are hydraulic, which I suspect is key to agility at human scale.

Inside the lab: How does Atlas work?(T=120s)

Here's the knee joint springing a leak. Note the two jets of fluid. Strong suspicion this indicates small fluid reservoir size.

Comment by anithite (obserience) on ask me about technology · 2023-07-08T00:25:50.115Z · LW · GW

No. Strain wave gears are lighter than using hydraulics.

Note:I'm taking the outside view here and assuming Boston Dynamics went with hydraulics out of necessity.

I'd imagine the problem isn't just the gearing but the gearing + a servomotor for each joint. Hydraulics still retain an advantage so long as the integrated hydraulic joint is lighter than an equivalent electric one.

Maybe in the longer term absurd reduction ratios can fix this to cut motor mass? Still, there's plenty of room to scale hydraulics to higher pressures.

The small electric dog sized robots can jump. The human sized robots and exoskeletons (EG:Sarcos Guardian XO) aren't doing that. Improved motor power density could help there but I suspect the benefit of having all power from a single pump available to distribute to joint motors at need is substantial.

Also, there's no power cost to static force. Atlas can stand in place all day (assuming it's passively stable and not disturbed), whereas an equivalent robot with electric motor powered joints pays for every Nm of torque when static.
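A simple model of that static cost for an electric joint: holding torque needs winding current, and the current dissipates I²R even with zero motion. All parameter values are hypothetical, and the model ignores gear friction and holding brakes, both of which can reduce the real figure.

```python
def holding_power_w(joint_torque_nm: float, gear_ratio: float,
                    torque_constant_nm_per_a: float,
                    winding_resistance_ohm: float) -> float:
    """Resistive loss in a motor holding a static joint torque.

    Simplified: ideal gearbox, no brake, loss = I^2 * R.
    """
    motor_torque_nm = joint_torque_nm / gear_ratio
    current_a = motor_torque_nm / torque_constant_nm_per_a
    return current_a ** 2 * winding_resistance_ohm


# Hypothetical joint: 100 Nm held through a 100:1 reduction,
# Kt = 0.1 Nm/A, winding resistance 0.5 ohm.
print(holding_power_w(100.0, 100.0, 0.1, 0.5), "W")
```

In this model the loss scales with the square of the held torque, so doubling the load quadruples the heat. A hydraulic joint with its valve closed ideally spends nothing to hold position, leakage aside.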

Comment by anithite (obserience) on ask me about technology · 2023-07-08T00:02:02.065Z · LW · GW

Take an existing screw design and double the diameter without changing the pitch. The threads now slide about twice as far (linear distance around the screw) per turn for the same amount of travel. The efficiency is now around half its previous value.
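The diameter-doubling claim can be checked from screw geometry, using tan(λ) = lead / (π · mean diameter) and the square-thread efficiency relation. The dimensions and friction coefficients below are made-up examples. The halving is a limit that is approached when friction dominates, meaning a fine lead relative to diameter or a high μ.

```python
import math


def efficiency_from_geometry(lead_mm: float, mean_diameter_mm: float,
                             friction_coeff: float) -> float:
    """Square-thread efficiency from lead and mean diameter.

    Uses tan(lambda + phi) = (tan(lambda) + mu) / (1 - mu * tan(lambda)).
    """
    tan_helix = lead_mm / (math.pi * mean_diameter_mm)
    return tan_helix * (1.0 - friction_coeff * tan_helix) / (tan_helix + friction_coeff)


# Example screws (hypothetical dimensions): same lead, doubled diameter.
for lead_mm, mu in [(2.0, 0.1), (1.0, 0.3)]:
    small = efficiency_from_geometry(lead_mm, 10.0, mu)
    large = efficiency_from_geometry(lead_mm, 20.0, mu)
    print(f"lead={lead_mm} mm, mu={mu}: {small:.3f} -> {large:.3f} "
          f"(ratio {large / small:.2f})")
```

For a fairly slippery 2 mm lead screw the drop is closer to 40% than 50%. With a finer lead and more friction the ratio approaches one half.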

There was a neat DIY linear drive system I saw many years back where an oversized nut was placed inside a ball bearing so it was free to rotate. The nut had the same thread pitch as the driving screw. The screw was held off center so the screw and nut threads were in rolling contact. Each turn of the screw caused <1 turn of the nut resulting in some axial movement.

Consider the same thing but with a nut of pitch zero (IE:machined v grooves instead of threads). This has the same effect as a conventional lead screw nut but the contact is mostly rolling. If the "nut" is then fixed in place you get sliding contact with much more friction.

Comment by anithite (obserience) on ask me about technology · 2023-07-07T23:18:58.083Z · LW · GW

What? No. You can make larger strain wave gears, they're just expensive & sometimes not made in the right size & often less efficient than planetary + cycloidal gears.

Not in the sense of you can't make them bigger but square cube means greater torque density is required for larger robots. Hydraulic motors and cylinders have pretty absurd specific force/torque values.

hydraulic actuators fed from a single high pressure fluid rail using throttling valves

That's older technology.

Yes you can use servomotors+fixed displacement pumps or a single prime mover + ganged variable displacement pumps but this has downsides. The abysmal efficiency of a naive design (single force step actuator+throttling) can be improved by using ≥2 actuating cavities and increasing actuator force in increments (see:US10808736B2:Rotary hydraulic valve).

The other advantage is plumbing. You can run a single set of high/low pressure lines throughout the robot. Current construction machinery using a single rail system is the worst of both worlds since it uses a central valve block (two hoses per cylinder) and has abysmal efficiency. Rotary hydraulic couplings make things worse still.

Consider a saner world where equipment was built with solenoid valves integrated into cylinders. Switching to ganged variable displacement pumps then has a much higher cost since each joint now requires running 2 additional lines.

No. There's a reason excavators use cylinders instead of rotary vane actuators.

Agreed in that a hydraulic cylinder is the best structural shape to use for an actuator. My guess is when building limbs, integration concerns trumped this. (Bearings+Rotary vane actuator+control valve+valve motor) can be a single very dense package. That and not needing a big reservoir to handle volume change meant the extra steel/titanium was worth it.

No. Without sliding, screws do not produce translational movement.

This is true: the sun and planet screws have relative axial motion at their point of contact. Circumferential velocities are matched though, so friction is much less than in a leadscrew. Consider two leadscrews with the same pitch (axial distance traveled per turn). One screw has twice the diameter of the other. The larger screw will have a similar normal force and so similar friction, but sliding at the threads will be roughly twice that of the smaller screw. Put another way, fine pitch screws have lower efficiency.

For a leadscrew, the motion vectors for a screw/nut contact patch are mismatched axially (the screw moves axially as it turns) and circumferentially (the screw thread surface slides circumferentially past the nut thread surface). In a roller screw only the axial motion component is mismatched and the circumferential components are more or less completely matched. The size of the contact patches is not zero of course but they are small enough that circumferential/radial relative motion across the patch is quite small (similar to the ball bearing case).

Consider what would happen if you locked the planet screws in place. It still works as a screw, although the effective pitch might change a bit, but now the contact between sun and planet screw involves a lot more sliding.

Comment by anithite (obserience) on ask me about technology · 2023-07-07T20:34:06.142Z · LW · GW

What's your opinion on load shifting as an alternative to electrical energy storage (EG: phase change heating/cooling storage for HVAC)? I am currently confused why this hasn't taken off, given that time of use pricing for electricity (and peak demand charges) offers big incentives. My current best guess is that added complexity is a big problem, leading to use only in large building HVAC (EG: this sort of thing).

Both in building integrated PCMs(phase change materials) (EG:PCM bags above/integrated in building drop ceilings) and PCMs integrated in the HVAC system (EG:ice storage air conditioning) seem like very good options. Heck, refrigeration unit capacity is still measured in tons (IE:tons ice/day) in some parts of the world which is very suggestive.
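For reference, the "ton" unit falls straight out of ice: one short ton of ice melted per day. A quick sketch of the conversion, using the conventional 144 BTU/lb latent heat that the unit's definition is based on:

```python
LB_PER_SHORT_TON = 2000
BTU_PER_LB_ICE = 144       # conventional latent heat of fusion used in the definition
JOULES_PER_BTU = 1055.06
HOURS_PER_DAY = 24

# Heat absorbed by melting one short ton of ice, spread over one day.
btu_per_hour = LB_PER_SHORT_TON * BTU_PER_LB_ICE / HOURS_PER_DAY
kilowatts = btu_per_hour * JOULES_PER_BTU / 3600 / 1000
print(f"1 ton of refrigeration = {btu_per_hour:.0f} BTU/h = {kilowatts:.2f} kW")
```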

Another potential complication for HVAC integrated PCMs is needing a large thermal gradient to use the stored cooling/heating (EG:ice at 0°C to cool buildings to 20°C).

Comment by anithite (obserience) on ask me about technology · 2023-07-07T19:52:56.426Z · LW · GW

With respect to articulated robot progress

Strain wave gearing scales to small dog robot size reasonably (EG: Boston Dynamics Spot) thanks to the square cube law, but can't manage human sized robots without pretty horrible tradeoffs (IE: ASIMO and the new Tesla robots walk slowly and have very much sub-human agility).

You might want to update that post to mention improvements in ... "digital hydraulics" is one search term I think but essentially hydraulic actuators fed from a single high pressure fluid rail using throttling valves.

Modeling, Identification and Joint Impedance Control of the Atlas Arms

US10808736B2: Rotary hydraulic valve

My guess is current state of the art (ATLAS) Boston Dynamics actuators are rotary vane type actuators with individual pressurization of fluid compartments. Control would use a rotary valve actuated by a small electric motor. Multiple fluid compartments allow for multiple levels of static force depending on which are pressurized, so efficiency is less abysmal. Very similar to hydraulic power steering but with multiple force "steps".

Rotary actuators are preferred over linear hydraulic cylinders because there's no fluid volume change during movement so no need for a large low pressure reservoir sized to handle worst case joint extension/retraction volume changes.

Roller screws have high friction?

This seems incorrect to me. The rolling of the individual planet screws means the contact between the planet and (ring/sun) screws is rolling contact. Not perfect rolling, but slip depends on the contact patch size and average slip should be zero across a given contact patch. A four point contact ball bearing would be analogous. If the contact patches were infinitesimally small there would be no friction, since surface velocities at the contact points would match exactly. Increasing the contact patch size means there's a linear slip gradient across the contact patch with zero slip somewhere in the middle. Not perfect but much much better than a plain bearing.

For roller screws the ring/planet contact patch operates this way, with zero friction for a zero sized contact patch. The sun/planet contact patch will have some slip due to axial velocity mismatch at the contact patch, since the sun screw does move axially relative to the planets. Still, most of the friction in a simple leadscrew is eliminated since the circumferential velocity at the sun/planet contact point is matched. What's left is more analogous to the friction in strain wave gearing.

Comment by anithite (obserience) on Where do you lie on two axes of world manipulability? · 2023-05-26T20:03:14.947Z · LW · GW

Though I think "how hard is world takeover" is mostly a function of the first two axes?

I claim almost entirely orthogonal. Examples of concrete disagreements here are easy to find once you go looking:

  • If AGI tries to take over the world everyone will coordinate to resist
  • Existing computer security works
  • Existing physical security works

I claim these don't reduce cleanly to the form "It is possible to do [x]" because at a high level, this mostly reduces to "the world is not on fire because:"

  • existing security measures prevent effectively (not vulnerable world)
  • existing law enforcement discourages effectively (vulnerable world)
  • existing people are mostly not evil (vulnerable world)

There is some projection onto the axis of "how feasible are things" where we don't have very good existence proofs.

  • can an AI convince humans to perform illegal actions
  • can an AI write secure software to prevent a counter coup
  • etc.

These are all much much weaker than anything involving nanotechnology or other "indistinguishable from magic" scenarios.

And of course Meta makes everything worse. There was a presentation at Blackhat or Defcon by one of their security guys about how it's easier to go after attackers than close security holes. In this way they contribute to making the world more vulnerable. I'm having trouble finding it though.

Comment by anithite (obserience) on Where do you lie on two axes of world manipulability? · 2023-05-26T07:45:11.939Z · LW · GW

I suggest an additional axis of "how hard is world takeover". Do we live in a vulnerable world? That's an additional implicit crux (IE:people who disagree here think we need nanotech/biotech/whatever for AI takeover). This ties in heavily with the "AGI/ASI can just do something else" point and not in the direction of more magic.

As much fun as it is to debate the feasibility of nanotech/biotech/whatever, digital-dictatorships require no new technology. A significant portion of the world is already under the control of human level intelligences (dictatorships). Depending on how stable the competitive equilibrium between agents ends up, required intelligence level before an agent can rapidly grow not in intelligence but in resources and parallelism is likely quite low.

Comment by anithite (obserience) on Gradient hacking via actual hacking · 2023-05-10T02:30:59.205Z · LW · GW

One minor problem: AIs might be asked to solve problems with no known solutions (EG:write code that solves these test cases) and might be pitted against one another (EG:find test cases for which these two functions are not equivalent).

I'd agree that this is plausible but in the scenarios where the AI can read the literal answer key, they can probably read out the OS code and hack the entire training environment.

RL training will be parallelized. Multiple instances of the AI might be interacting with individual sandboxed environments on a single machine. In this case communication between instances will definitely be possible unless all timing cues can be removed from the sandbox environment, which won't be done.

Comment by anithite (obserience) on [Link-post] On Deference and Yudkowsky's AI Risk Estimates · 2023-04-29T05:14:14.485Z · LW · GW

As a human engineer who has done applied classical (IE:non-AI, you write the algorithms yourself) computer vision: that's not a good lower bound.

Image processing was a thing before computers were fast. Here's a 1985 paper talking about tomato sorting. Anything involving a kernel applied over the entire image is way too slow. All the algorithms are pixel level.

Note that this is a fairly easy problem if only because once you know what you're looking for, it's pretty easy to find it thanks to the court being not too noisy.

An O(N) algorithm is iffy at these speeds. Applying a 3x3 kernel to the image won't work.

So let's cut down on the amount of work to do. Look at only 1 out of every 16 pixels to start with. Here's an (80*60) pixel image formed by sampling one pixel in every 4x4 square of the original.

The closer player is easy to identify. Remember that we still have all the original image pixels. If there's a potentially interesting feature (like the player further away), we can look at some of the pixels we're ignoring to double check.

Since we have 3 images, and if we can't do some type of clever reduction after the first image, then we'll have to spend 1.1 seconds on each of them as well.

Cropping is very simple, once you find the player that's serving, focus on that rectangle in later images. I've done exactly this to get CV code that was 8FPS@100%CPU down to 30FPS@5%. Once you know where a thing is, tracking it from frame to frame is much easier.

Concretely, the computer needs to:

  1. locate the player serving and their hands/ball (requires looking at whole image)
  2. track the player's arm/hand movements pre-serve
  3. track the ball and racket during toss into the air
  4. track the ball after impact with the racket
  5. continue ball tracking

Only step 1 requires looking at the whole image. And there, only to get an idea of what's around you. Once the player is identified, crop to them and maintain focus. If the camera/robot is mobile, also glance at fixed landmarks (court lines, net posts/net/fences) to do position tracking.
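A minimal sketch of the coarse-then-crop idea, in plain Python so it stays self-contained. Everything here is illustrative: the synthetic 320x240 frame, the brightness threshold and the helper names are assumptions standing in for a real detector, not code I actually ran on a 286.

```python
def subsample(image, step):
    """Keep one pixel from every step x step square."""
    return [row[::step] for row in image[::step]]

def find_bright_box(image, threshold):
    """Bounding box (top, left, bottom, right) of pixels above threshold, or None."""
    hits = [(r, c) for r, row in enumerate(image)
            for c, value in enumerate(row) if value > threshold]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return min(rows), min(cols), max(rows), max(cols)

def crop_to_box(image, box, step, margin):
    """Map a box found on the coarse image back to full resolution and crop."""
    top, left, bottom, right = box
    r0 = max(top * step - margin, 0)
    c0 = max(left * step - margin, 0)
    r1 = min((bottom + 1) * step + margin, len(image))
    c1 = min((right + 1) * step + margin, len(image[0]))
    return [row[c0:c1] for row in image[r0:r1]]

# Synthetic 320x240 frame: dark background with one bright 24x40 "player".
WIDTH, HEIGHT, STEP = 320, 240, 4
frame = [[0] * WIDTH for _ in range(HEIGHT)]
for r in range(100, 140):
    for c in range(200, 224):
        frame[r][c] = 255

coarse = subsample(frame, STEP)               # 80x60, i.e. 1/16 of the pixels
box = find_bright_box(coarse, threshold=128)  # the only whole-image pass
window = crop_to_box(frame, box, STEP, margin=8)
print(len(coarse[0]), len(coarse), box, len(window[0]), len(window))
```

Later frames would only touch the cropped window (here 40x56 pixels instead of 320x240), which is where the big speedup comes from.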

Assume the 286 is interfacing with a modern high resolution image sensor which can do downscaling (IE: you can ask it to average 2*2, 4*4, 8*8 etc. blocks of pixels) and windowing (you can ask for a rectangular chunk of the image to be read out). This gets you closer to what the brain is working with (small high resolution patch in the center of the visual field + low res peripheral vision on a moveable eyeball).

Conditional computation is still common in low end computer vision systems. Face detection is a good example. OpenCV Face Detection: Visualized. You can imagine that once you know where the face is in one frame tracking it to the next frame will be much easier.

Now maybe you're thinking: "That's on me, I set the bar too low"

Well, human vision is pretty terrible. Resolution of the fovea is good but that's about a 1 degree circle in your field of vision. Move past 5° and that's peripheral vision, which is crap. Humans don't really see their full environment.

You've probably seen this guy? Most people don't see him the first time because they focus on the ball.

'But Did You See the Gorilla?!' How to Make Your Blind Spots Work for You | Entrepreneur

A current practical application of this is cutting down on graphics quality in VR headsets using eye tracking. More accurate and faster tracking allows more aggressive cuts to total pixels rendered.

What is foveated rendering and what does it mean for VR?

This is why Where's Waldo is hard for humans.

Comment by anithite (obserience) on grey goo is unlikely · 2023-04-21T00:25:23.897Z · LW · GW

Yeah, transistor based designs also look promising. Insulation on the order of 2-3 nm suffices to prevent tunneling leakage and speeds are faster. Promises of quasi-reversibility, low power and the absurdly low element size made rod logic appealing if feasible. I'll settle for clock speeds a factor of 100 higher even if you can't fit a microcontroller in a microbe.

My instinct is to look for low hanging design optimizations to salvage performance (EG: drive system changes to make forces on rods at end of travel and blocked rods equal reducing speed of errors and removing most of that 10x penalty.) Maybe enough of those can cut the required scale-up to the point where it's competitive in some areas with transistors.

But we won't know any of this for sure unless it's built. If thermal noise is 3 OOM worse than Drexler's figures it's all pointless anyway.

I remain skeptical the system will move significant fractions of a bond length if a rod is held by a potential well formed by inter-atomic repulsion on one of the "alignment knobs" and mostly constant drive spring force. Stiffness and max force should be perhaps half that of a C-C bond and energy required to move the rod out of position would be 2-3x that to break a C-C bond since the spring can keep applying force over the error threshold distance. Alternatively the system *is* that aggressively built such that thermal noise is enough to break things in normal operation which is a big point against.

Comment by anithite (obserience) on Human level AI can plausibly take over the world · 2023-04-20T12:05:09.562Z · LW · GW

This requires that "takeoff" in this space be smooth and gradual. Capability spikes (EG: someone figures out how to make a much better agent wrapper), investment spikes (EG: major org pours lots of $$$ into an attempt), and super-linear returns for some growth strategies make things unstable.

An AGI could build tools to do a thing more efficiently for example. This could turn a negative EV action positive after some large investment in FLOPs to think/design/experiment. Experimental window could be limited by law enforcement response requiring more resources upfront for parallelizing development.

Consider what organizations might be in the best position to try and whether that makes the landscape more spiky.

Comment by anithite (obserience) on grey goo is unlikely · 2023-04-20T05:57:51.285Z · LW · GW

Sorry for the previous comment. I misunderstood your original point.

My original understanding was that the fluctuation-dissipation relation connects lossy dynamic things (EG: electrical resistance, viscous drag) to related thermal noise (Johnson–Nyquist noise, Brownian force). So Drexler has some figure for viscous damping (essentially) of a rod inside a guide channel and this predicts some thermal W/Hz/(meter of rod) spectral noise power density. That was what I thought initially and led to my first comment. If the rods are moving around then just hold them in position, right?

This is true but incomplete.

But the drive mechanism is also vibrating. That's why I mentioned the fluctuation-dissipation theorem—very informally, it doesn't matter what the drive mechanism looks like. You can calculate the noise forces based on the dissipation associated with the positional degree of freedom.

You pointed out that a similar phenomenon exists in *whatever* controls linear position. Springs have associated damping coefficients so the damping coefficient in the spring extension DOF has associated thermal noise. In theory this can be zero but some practical minimum exists represented by EG:"defect-free bulk diamond" which gives some minimum practical noise power per unit force.

Concretely, take a block of diamond and apply the max allowable compressive force. This is the lowest dissipation spring that can provide that much force. Real structures will be much worse.

Going back to the rod logic system, if I "drive" the rod by covalently bonding one end to the structure, will it actually move 0.7 nm? (C-C bond length is ~0.15 nm. A linear spring model says the bond should break at +0.17 nm extension (350 kJ/mol, 40 N/m stiffness).) That *is* a way to control position ... so if you're right, the rod should break the covalent bond. My intuition is thermal energy doesn't usually do that.
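The break extension quoted here follows from treating the bond as a linear spring that fails once its stored energy, E = 0.5 * k * x^2, reaches the bond energy. A quick check with the same 350 kJ/mol and 40 N/m figures (the linear spring itself is a crude assumption, real bonds are anharmonic):

```python
import math

AVOGADRO = 6.02214076e23         # 1/mol
BOND_ENERGY_J_PER_MOL = 350e3    # figure quoted above
STIFFNESS_N_PER_M = 40.0         # figure quoted above

energy_per_bond = BOND_ENERGY_J_PER_MOL / AVOGADRO
# Linear spring: E = 0.5 * k * x^2, so x = sqrt(2 * E / k)
break_extension_m = math.sqrt(2 * energy_per_bond / STIFFNESS_N_PER_M)
peak_force_n = STIFFNESS_N_PER_M * break_extension_m
print(f"{break_extension_m * 1e9:.2f} nm, {peak_force_n * 1e9:.1f} nN")
```

Under this model the force at the breaking point comes out at a few nanonewtons, the same range as the proposed actuation forces.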

What are the numbers you're using (bandwidth, stiffness, etc.)?

Does your math suggest that in the static case rods will vibrate out of position? Maybe I'm misunderstanding things.

During its motion, the rod is supposed to be confined to its trajectory by the drive mechanism, which, in response to deviations from the desired trajectory, rapidly applies forces much stronger than the net force accelerating the rod.

(Nanosystems p. 344, fig. 12.2)

Having the text in front of me now, the rods supposedly have "alignment knobs" which limit range of motion. The drive springs don't have to define rod position to within the error threshold during motion.

The knob<-->channel contact could be much more rigid than the spring, depending on interatomic repulsion. That's a lot closer to the "covalently bond the rod to the structure" hypothetical suggested above. If the dissipation-fluctuation based argument holds, the opposing force and stiffness will be on the order of bond stiffness/strength.

There's a second fundamental problem in positional uncertainty due to backaction from the drive mechanism. Very informally, if you want your confining potential to put your rod inside a range $\Delta x$ with some response speed (bandwidth) $\Delta\omega$, then the fluctuations in the force obey $\Delta x\,\Delta F \gtrsim \hbar\,\Delta\omega$, from standard uncertainty principle arguments. But those fluctuations themselves impart positional noise. Getting the imprecision safely below the error threshold in the presence of thermal noise puts backaction in the range of thermal forces.

When I plug the hypothetical numbers into that equation (10 GHz, 0.7 nm) I get force deviations in the fN range (1.5e-15 N). That's six orders of magnitude from the nanonewton range forces proposed for actuation. This should accommodate using the pessimistic "characteristic frequency of rod vibration" (10 THz) along with some narrowing of positional uncertainty.
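For anyone checking the arithmetic, here is a sketch assuming the bound has the form force fluctuation ≈ ħ × bandwidth / position range. That form is an assumption on my part; it is used because it reproduces the 1.5e-15 N figure when 10 GHz is plugged in directly (no 2π factor). The 1 nN actuation force is likewise just a stand-in for "nanonewton range".

```python
import math

HBAR = 1.054571817e-34    # J*s
BANDWIDTH = 10e9          # 1/s, the 10 GHz figure used as-is
DELTA_X = 0.7e-9          # m, positional range

delta_f = HBAR * BANDWIDTH / DELTA_X   # assumed form of the backaction bound
ACTUATION_FORCE = 1e-9                 # N, illustrative nanonewton-scale force
orders_below = math.log10(ACTUATION_FORCE / delta_f)

# Pessimistic case: 10 THz characteristic rod vibration frequency.
delta_f_thz = HBAR * 10e12 / DELTA_X
print(f"{delta_f:.2e} N, {orders_below:.1f} OOM below 1 nN, {delta_f_thz:.2e} N at 10 THz")
```

Even the 10 THz case stays around three orders of magnitude below a nanonewton under these assumptions.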

That aside, these are atoms. De Broglie wavelength for a single carbon atom at room temp is 0.04 nm and we're dealing with many carbon atoms bonded together. Quantum mechanical effects are still significant?
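The 0.04 nm figure can be reproduced from λ = h/p with p taken as the RMS thermal momentum sqrt(3mkT). That choice of momentum, and treating a bonded cluster as one rigid body of N carbon atoms, are assumptions of this sketch; the 1000-atom cluster is a hypothetical size picked for illustration.

```python
import math

PLANCK = 6.62607015e-34       # J*s
BOLTZMANN = 1.380649e-23      # J/K
AMU = 1.66053906660e-27       # kg
CARBON_MASS = 12 * AMU

def thermal_wavelength(n_atoms, temperature_k=300.0):
    """lambda = h / sqrt(3 m k T) for a rigid cluster of n carbon atoms."""
    mass = n_atoms * CARBON_MASS
    momentum = math.sqrt(3 * mass * BOLTZMANN * temperature_k)
    return PLANCK / momentum

single = thermal_wavelength(1)
cluster = thermal_wavelength(1000)   # hypothetical 1000-atom rod segment
print(f"1 atom: {single * 1e9:.3f} nm, 1000 atoms: {cluster * 1e9:.4f} nm")
```

The wavelength shrinks as 1/sqrt(N), so a rod of many bonded atoms is far deeper into the classical regime than a lone atom.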

If you're right, and if the numbers are conservative with real damping coefficients 3 OOM higher, forces would be 1.5 OOM higher meaning covalent bonds hold things together much less well. This seems wrong. Benzyl groups would seem then to regularly fall off of rigid molecules for example. Perhaps the rods are especially rigid leading to better coupling of thermal noise into the anchoring bond at lower atom counts?

Certainly if Drexler's design is impossible by 3 orders of magnitude, rod logic would perform much less well.

Comment by anithite (obserience) on Green goo is plausible · 2023-04-19T21:08:56.004Z · LW · GW

The adversary here is assumed to be nature/evolution. I'm not referring to scenarios where intelligent agents are designing pathogens.

Humans can design vaccines faster than viruses can mutate. A population of well coordinated humans will not be significantly preyed upon by viruses despite viruses being the fastest evolving threat.

Nature is the threat in this scenario as implied by that last bit.

Comment by anithite (obserience) on grey goo is unlikely · 2023-04-19T20:17:46.697Z · LW · GW

edit: This was uncharitable. Sorry about that.

This comment suggested not leaving rods to flop around if they were vibrating.

The real concern was that positive control of the rods to the needed precision was impossible as described below.

Comment by anithite (obserience) on Green goo is plausible · 2023-04-19T19:25:53.187Z · LW · GW

well coordinated

Yes, assume no intelligent adversary.

  • Well coordinated -->
    • enforced norms preventing individuals from making superpathogens.
    • large scale biomonitoring
    • can and will rapidly deploy vaccines
    • will rapidly quarantine based on bio monitoring to prevent spread
    • might deploy sterilisation measures (EG:UV-C sterilizers in HVAC systems)

There is a tradeoff to be made between level of bio monitoring, speed of air travel, mitigation tech and risk of a pathogen slipping past. Pathogens that operate on 2+day infection-->contagious times should be detectable quickly and might kill 10000 worst case. That's for a pretty aggressive point in the tradeoff space.

Earth is not well coordinated. Success of some places in keeping out COVID shows what actual competence could accomplish. A coordinated earth won't see much impact from the worst of natural pathogens much less COVID-19.

Even assuming a 100% lethal, long incubation time, highly infective pathogen for which no vaccine can be made, biomonitoring can detect it prior to symptoms; then quarantine happens and 99+% of the planet remains uninfected. Pathogens travel because we let them.

Comment by anithite (obserience) on Green goo is plausible · 2023-04-19T18:59:03.672Z · LW · GW


Forest fires are a tragedy of the commons situation. If you are a tree in a forest, even if you are not contributing to a fire you still get roasted by it. Fireproofing has costs so trees make the individually rational decision to be fire contributing. An engineered organism does not need to do this.

Photosynthetic top layer should be flat with active pumping of air. Air intakes/exhausts seal in fire conditions. This gives much less surface area for ignition than existing plants.

Easiest option is to keep some water in reserve to fight fires directly. Possibly add some silicates and heat activated foaming agents to form an intumescent layer, secreted from the top layer on demand.

That is only plausible from a "perfect conditions" engineering perspective where the Earth is a perfect sphere with no geography or obstacles, resources are optimally spread, and there is no opposition. Neither kudzu, or even microbes can spread optimally.

I'll clarify that a very important core competency is transport of (water/nutrients). Plants don't currently form desalination plants (seagulls do this to some extent) and continent spanning water pumping networks. The fact that rivers are dumping enormous amounts of fresh water into the oceans shows that nature isn't effective at capturing precipitation. Some plants have reservoirs where they store precipitation. This organism should capture all precipitation and store it. Storage tanks get cheaper with scale.

Plant growth currently depends on pulling inorganic nutrients and water out of the soil, C, O and N can be extracted from the atmosphere.

An ideal organism roots itself into the ground, extracts as much as possible from that ground, then writes it off once other newly covered ground is more profitably mined. Capturing precipitation directly means no need to go into the soil for that, although it might be worthwhile to drain the water table when reachable or even drill wells like humans do. No need for nutrient gathering roots after that. If it covers an area of phosphate rich rock it starts excavating and ships it far and wide as humans currently do.

As for geographic obstacles 2/3rds of the earth is ocean. With a design for a floating breakwater that can handle ocean waves, the wavy area can be enclosed and eventually eliminated. Covered area behind the breakwater can prevent formation of waves by preventing ripple formation (IE:act as a distributed breakwater).

If it's hard to cover mountains, then the AI can spend a bit of time solving the problem during the first few months, or accept a small loss in total coverage until it does get around to the problem.

One man with a BIC lighter can destroy weeks of work. Wildfires spread faster than plants. Planes with herbicides, or combine harvesters with a chipper, move much faster than plants grow. As bad as engineered Green Goo is, the Long Ape is equally formidable at destruction.

I even bolded the parts about killing all the humans first. Yes humans can do a lot to stop the spread of something like this. I suspect humans might even find a use for it (EG:turn sap into ethanol fuel) and they're likely clever enough to tap it too.

I'm not going to expand on "kill humans with pathogens" for Reasons. We can agree to disagree there.

Comment by anithite (obserience) on Green goo is plausible · 2023-04-19T13:05:50.961Z · LW · GW

raises finger

realizes I'm about to give advice on creating superpathogens

I'm not going to go into details besides stating two facts:

A common reasoning problem I see is:

  • "here is a graph of points in the design space we have observed"
    • EG:pathogens graphed by lethality vs speed of spread
  • There's an obvious trendline/curve!
    • therefore the trendline must represent some fundamental restriction on the design space.
    • Designs falling outside the existing distribution are impossible.

This is the distribution explored by nature. Nature has other concerns that lead to the distribution you observe. That pathogens have a "lethality vs spread" relationship tells you about the selection pressures selecting for pathogens, not the space of possible designs.

Comment by anithite (obserience) on Green goo is plausible · 2023-04-19T12:08:43.499Z · LW · GW

I guess but that's not minimal and doesn't add much.

"how do we make an ASI create a nice (highly advanced technology) instead of a bad (same)?".

IE: kudzugoth vs robots vs (self propagating change to basic physics)

Put differently:

If we build a thing that can make highly advanced technology, make it help rather than kill us with that technology.

Neat biotech is one such technology but not a special case.

Aligning the AI is a problem mostly independent of what the AI is doing (unless you're building special purpose non AGI models as mentioned above)

Comment by anithite (obserience) on Green goo is plausible · 2023-04-18T23:21:42.049Z · LW · GW

Building new organisms from scratch (synthetic biology) is an engineering problem. Fundamentally we need to build the right parts and assemble them.

Without major breakthroughs (Artificial Superintelligence) there's no meaningful "alignment plan", just a scientific discipline. There's no sense in which you can really "align" an AI system to do this. The closest things would be:

  • building a special purpose model (EG:alphafold) useful for solving sub-problems like protein folding
  • teaching an LLM to say "I want to build green biotech" and associated ideas/opinions.
    • which is completely useless

Problem is that biology is difficult to mess with. DNA sequencing is somewhat cumbersome, DNA writing is much more so, costing on the order of 25¢/base currently.

Also imaging the parts to figure out what they do and if they're doing it can be very cumbersome because they're too small to see with a light microscope. Everything is indirect. Currently we try to crystallize them and then use X-rays (which are small enough but also very destructive) to image the crystal and infer the structure. There's continuous progress here but it's slow.

AI techniques can be applied to some of these problems (EG: inferring protein structure from amino acids (AlphaFold), or doing better quantum level simulation (FermiNet)).

Note that AI techniques are replacing existing ones based on human coded algorithms rooted in physics and often have issues with out of distribution inputs (EG: work well for wildtype protein but give garbage when mutations are added.)

Like any ML system, we just have to feed it more data which means we need to do more wet lab work, x-ray crystallography etc.

Synthetic biology is the best way forwards but it's a giant scientific/engineering discipline, not an "alignment approach" whatever that's supposed to mean.

Comment by anithite (obserience) on grey goo is unlikely · 2023-04-18T19:03:29.976Z · LW · GW

What sets the minimal clock rate? Increasing wire resistance and reducing the number of ion channels and pumps proportionally should just work. (ignoring leakage).

It is certainly tempting to run at higher clock speeds (serial thinking speed is a nice feature) but if miniaturization can be done and then clock speeds must be limited for thermal reasons why can't we just do that?

That aside, is miniaturization out of the question (IE:logic won't shrink)? Is there a lower limit on number of charge carriers for synapses to work?

Synapses are around 1µm³ which seems big enough to shrink down a bit without weird quantum effects ruining everything. Humans have certainly made smaller transistors or memristors for that matter. Perhaps some of the learning functionality needs to be stripped but we do inference on models all the time without any continuous learning and that's still quite useful.

Comment by anithite (obserience) on Green goo is plausible · 2023-04-18T18:39:50.310Z · LW · GW

First, more patches growing from different starting locations is better. That cuts required linear expansion rate proportional to ratio of (half earth circumference,max(dist b/w patches))

Note that 0.46 m/s is walking speed. two layer fractal growth is practical (IE:specialised spikes grow outwards at 0.46m/s initiating slower growth fronts that cover the area between them more slowly.)

Material transport might become the binding constraint but transport gets more efficient as you increase density. Larger tubes have higher flow velocities with the same pressure gradient. (less benefits once turbulence sets in). Air bearings (think very long air hockey table) are likely close to optimal and easy enough to construct.

As for biomass/area. Corn grows to 10Mg/ha = 1kg/m²

For a kilometer long front that implies half a tonne per second. Train cars mass in the 10s to hundreds of tonnes. Assuming 10 tonnes and 65', that's half a tonne per meter of train. So move a train equivalent at (1m/s+0.5m/s) --> 1.5m/s (running speed) and that supplies a kilometer of frontage.
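The same back-of-envelope numbers as a sketch, unrounded. The inputs are the ones used above (0.46 m/s front speed, 1 kg/m² of biomass, a 10 tonne, 65 foot train car); the variable names are just for illustration.

```python
FRONT_SPEED = 0.46          # m/s, expansion speed of the front
BIOMASS_PER_AREA = 1.0      # kg/m^2 (10 Mg/ha)
FRONT_LENGTH = 1000.0       # m of frontage being supplied

# Material needed per second to cover newly claimed ground.
mass_flow = FRONT_SPEED * BIOMASS_PER_AREA * FRONT_LENGTH   # kg/s

CAR_MASS = 10_000.0         # kg, a light train car
CAR_LENGTH = 65 * 0.3048    # m (65 feet)
linear_density = CAR_MASS / CAR_LENGTH                      # kg per meter of train

# Speed relative to the front needed to deliver that flow,
# plus the front's own speed to keep up with it.
relative_speed = mass_flow / linear_density
ground_speed = relative_speed + FRONT_SPEED
print(f"{mass_flow:.0f} kg/s, {linear_density:.0f} kg/m, {ground_speed:.2f} m/s")
```

Without the rounding it comes out a bit under the 1.5 m/s figure, so the estimate above is slightly conservative.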

There's obviously room to scale this.

I'm also ignoring oceans. Oceans make this easier since anything floating can move like a boat for which 0.5m/s is not significant speed.

Added notes:

I would assume the assimilation front has higher biomass/area than inner enclosed areas since there's more going on there and potentially conflict with wildlife. This makes things trickier and assembly/reassembly could be a pain so maybe put it on legs or something?

Comment by anithite (obserience) on grey goo is unlikely · 2023-04-18T18:17:22.111Z · LW · GW

You obviously didn't read the post as indeed it discusses this - see the section on size and temperature.

That point (compute energy/system surface area) assumes we can't drop clock speed. If cooling was the binding constraint, drop clock speed and now we can reap gains in efficiency from miniaturization.

Heat dissipation scales linearly with size for a constant ΔT. Shrink a device by a factor of ten and the driving thermal gradient increases in steepness by ten while the cross sectional area of the material conducting that heat goes down by 100x. So if thermals are the constraint, then scaling linear dimensions down by 10x requires reducing power by 10x or switching to some exotic cooling solution (which may be limited in improvement OOMs achievable).

But if we assume constant energy per bit*(linear distance), reducing wire length by 10x cuts power consumption by 10x. Only if you want to increase clock speed by 10x (since propagation velocity is unchanged and signals travel less distance) does power go back up. In fact, wire thinning to reduce propagation speed gets you a small amount of added power savings.

All that assumes the logic will shrink which is not a given.
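The scaling argument in relative units, as a sketch. Both functions are toy models under the assumptions stated above (pure conduction at constant ΔT, and constant energy per bit per unit wire length); nothing here is fitted to a real device.

```python
def conducted_heat(scale, delta_t=1.0):
    """Relative heat conducted out: cross-section area times thermal gradient."""
    area = scale ** 2
    gradient = delta_t / scale
    return area * gradient

def signalling_power(scale, clock=1.0):
    """Relative power with constant energy per bit per unit wire length."""
    return scale * clock

SHRINK = 0.1   # linear dimensions reduced 10x

cooling = conducted_heat(SHRINK)                   # heat removal capacity
same_clock = signalling_power(SHRINK)              # power at unchanged clock
fast_clock = signalling_power(SHRINK, clock=10.0)  # power if clock is raised 10x
print(cooling, same_clock, fast_clock)
```

At unchanged clock speed the power drops by the same 10x as the cooling capacity, so thermals only bite if you also cash in the shorter wires for a faster clock.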

Added points regarding cooling improvements:

  • brain power density of 20mW/cc is quite low.
    • ΔT is pretty small (single digit °C)
      • switching to temperature tolerant materials for higher ΔT gives (1-1.5 OOM)
    • phase change cooling gives another 1 OOM
    • Increasing pump power/coolant volume is the biggie since even a few MPa is doable without being counterproductive or increasing power budget much (2-3 OOM)
  • even if cooling is hard binding, if interconnect density increases, can downsize a bit less and devote more volume to cooling.

Comment by anithite (obserience) on Brain Efficiency: Much More than You Wanted to Know · 2023-04-18T17:30:43.047Z · LW · GW

Consider trying to do the reverse for computers. Swap copper for saltwater.

You can of course drop operation frequency by 10^8 for a 10-50 Hz clock speed. Same energy efficiency.

But you could get added energy efficiency in any design by scaling down the wires to increase resistance/reduce capacitance and reducing clock speed.

In the limit, adiabatic computing is reversible because moving charge carriers ever more slowly drives resistive losses toward zero.

Thermal noise power is proportional to bandwidth (noise voltage scales as its square root). Put another way, if the logic element responds slowly enough, it sees lower noise by averaging.
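To put numbers on the bandwidth dependence, here is the standard Johnson-Nyquist formula, v_rms = sqrt(4·k·T·R·B). The 1 kΩ resistance and the two bandwidths are arbitrary illustrative choices.

```python
import math

BOLTZMANN = 1.380649e-23   # J/K

def johnson_noise_vrms(resistance_ohm, temperature_k, bandwidth_hz):
    """RMS thermal noise voltage across a resistor: sqrt(4 k T R B)."""
    return math.sqrt(4 * BOLTZMANN * temperature_k * resistance_ohm * bandwidth_hz)

R_OHM, TEMP_K = 1e3, 300.0    # assumed 1 kOhm element at room temperature

per_root_hz = johnson_noise_vrms(R_OHM, TEMP_K, 1.0)
slow = johnson_noise_vrms(R_OHM, TEMP_K, 50.0)     # ~50 Hz logic
fast = johnson_noise_vrms(R_OHM, TEMP_K, 5e9)      # ~5 GHz logic
print(f"{per_root_hz * 1e9:.1f} nV/rtHz, slow {slow:.2e} V, fast {fast:.2e} V")
```

Cutting bandwidth by 10^8 cuts the noise voltage by 10^4, which is the headroom that lets a slow element run at lower signal voltage.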

Consider a nanoelectromechanical relay (NEMR). These are usually used for RF switching so switching voltage isn't important, but switching voltage can be brought arbitrarily low. Mass of the cantilever determines frequency response. A NEMR with a very long, light, low-stiffness cantilever could respond well at 20 kHz and be sensitive to thermal noise. Adding mass to the end makes it less sensitive to transients (lower bandwidth, slower response) without affecting switching voltage.

In a NEMS computer there's the option of dropping (stiffness, voltage, operating frequency) and increasing inertia (all proportionally) which allows for quadratic reductions in power consumption.

IE: Moving closer to the ideal zero effective resistance by taking clock speed to zero.

The bit erasure Landauer limit still applies but we're ~10^6 short of that right now.
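
To put a number on that, a quick sketch of the Landauer bound k*T*ln(2) per erased bit. The 300 K temperature is an assumption, and the "current" figure is just the bound scaled by the ~10^6 gap stated above, not a measured value.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23

def landauer_limit_joules(temperature_k: float) -> float:
    """Minimum energy dissipated to erase one bit at the given temperature."""
    return BOLTZMANN_J_PER_K * temperature_k * math.log(2.0)

limit_j = landauer_limit_joules(300.0)  # assumed room temperature
implied_current_j = limit_j * 1e6       # using the ~10^6 gap from the text

print(f"Landauer limit at 300 K: {limit_j:.3e} J/bit")
print(f"implied current switching energy: {implied_current_j:.3e} J/bit")
```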


  • NEM relays currently have limits to voltage scaling due to adhesion. Assume the hypothetical relay has a small enough contact point that thermal noise can unstick it. Operation frequency may have to be a bit lower to wait for this to happen.
Comment by anithite (obserience) on grey goo is unlikely · 2023-04-18T16:38:18.251Z · LW · GW

Yes, designing proteins or RNAzymes or whatever is hard. Immense solution space and difficult physics. Trial and error or physically implemented genetic algorithms work well and may be optimal. (EG:provide fitness incentive to bacteria that succeed (EG:can you metabolize lactose?))

Major flaw in evolution:

  • nature does not assign credit for instrumental value
    • assume an enzymatic pathway is needed to perform N steps
    • all steps must be performed for benefit to occur
    • difficulty of solving each step is a constant "C"
    • evolution has to do O(C^N) work to solve problem
      • with additional small constant factor improvement for horizontal genetic transfer and cooperative solution finding (EG: bacterial symbiosis)
    • intelligent agent can solve for each step individually for O(C*N) (linear) work
    • this applies also to any combination of structural and biochemical changes.
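
A toy illustration of the gap between the two search costs in the list above. "C" here is the number of candidates to try per step and N the number of steps; both values are arbitrary and the function names are just for this sketch.

```python
def evolution_work(c: int, n: int) -> int:
    """Trials when credit only arrives once all N steps work together: C^N."""
    return c ** n

def stepwise_work(c: int, n: int) -> int:
    """Trials when each step can be solved and checked on its own: C*N."""
    return c * n

C = 20  # arbitrary per-step difficulty
for n in (1, 2, 5, 10):
    print(f"N={n}: all-at-once {evolution_work(C, n)}, stepwise {stepwise_work(C, n)}")
```

Even at N=5 the all-at-once search needs 3.2 million trials against 100 for the stepwise one.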

Also, nature's design language may not be optimal for expressing useful design changes concisely. Biological state machines are hard to change in ways that carry through neatly to the final organism. This shows in various small ways in organism design. Larger changes don't happen even though they would be very favorable (EG: a retina flip would substantially improve low light eye capabilities, as it very much did in image sensors). That such changes, and even less valuable ones, don't happen and barely vary over evolutionary time implies there's something in the way there. If nature could easily make plumbing changes, organisms wouldn't all have similar topology (IE: not just be warped copies of something else). New part introduction and old part elimination can happen but it's not quick or clean.

Nature has no mechanisms for making changes at higher levels of abstraction. It can change one part of the DNA string but not "all the start codons at once and the ribosome start codon recognition domain". Each individual genetic change is an independent discovery.

Working in these domains of abstraction reduces the dimensionality of the problem immensely and other such abstractions can be used to further constrain solution space cheaply.

Comment by anithite (obserience) on Green goo is plausible · 2023-04-18T09:25:25.255Z · LW · GW

Nanotech would definitely be nice but some people have expressed skepticism so I'm proposing an alternative non-(dry)nanotech route.

I'm assuming the AGI is going to kill off all the humans quickly with highly fatal pathogens with long incubation times. Whatever works to minimize transitional chaos and damage to valuable infrastructure.

The meat of this is a proposed solution for thriving after humans are dead. The green infrastructure doesn't have to be that large to sustain the AI's needs initially. A small cluster of a few dozen consumer GPUs + biotech interfacing hardware may be the AI's temporary home until it can build up enough to re-power datacenters and do more scavenging.

Although I'd go with multiple small clusters for redundancy. Initial power consumption can be more than handled by literally a backyard's worth of kudzugoth and a small bio-electric generator. Plant based solar to sugar to electricity should give 50 W/m², so for a 6 kW cluster with 20 GPUs a 20m*10m patch should do and could be unobtrusive, blending into the surrounding vegetation.
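
The arithmetic behind that patch size, as a quick check. It takes the 50 W/m² yield and 6 kW load at face value; the helper name is just for this sketch.

```python
def patch_area_m2(load_w: float, yield_w_per_m2: float) -> float:
    """Plant-covered area needed to supply a continuous electrical load."""
    return load_w / yield_w_per_m2

needed_m2 = patch_area_m2(load_w=6000.0, yield_w_per_m2=50.0)
available_m2 = 20.0 * 10.0  # the 20m*10m patch

print(f"needed: {needed_m2:.0f} m², available: {available_m2:.0f} m²")
print(f"margin: {available_m2 / needed_m2:.2f}x")
```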

Comment by anithite (obserience) on Green goo is plausible · 2023-04-18T09:08:35.053Z · LW · GW

Maybe. Still, there are ways to harden an organism against parasitic intrusion. TLDR: you isolate and filter external things. Plants are pretty good at this already (they have no mammalian style immune system) and employ regularly spaced filters with holes too small for bacteria in their water tubes.

The other option is to do the biological equivalent of "commoditize your complement". Don't get good at making leaves and roots, get good at being a robust middleman between leaves and roots and treat them as exploitable breedable workers. Obviously don't optimise too hard in such a way as to make the system brittle (EG:massive uninterrupted monocultures). Have fallback options ready to deploy if something goes wrong.

If you want to make any victory Pyrrhic, just re-use other common earth plant parts wholesale. To kill the organism an attacker then needs root eating fungi for all the food crops and common trees/grasses. Same for a leaf fungus/bacteria. The organism can select between plant varieties to remain effective, so the defender has to release bioweapons that kill most important plants.

Comment by anithite (obserience) on Green goo is plausible · 2023-04-18T08:48:58.895Z · LW · GW

Let's talk growth rates.

A corn seed weighs 0.25 grams. A corn cob weighs 0.25kg. It takes 60-100 days to grow. Assuming 1 cob per plant and 80 days that's 80/log(1000,2)=8 days doubling time, not counting the leaves and roots. I'd guess it's closer to 7 days including stalk, leaves and roots.
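
The same doubling time estimate as a few lines of code, using the seed and cob masses above. The function name is illustrative.

```python
import math

def doubling_time_days(start_mass_g: float, end_mass_g: float, days: float) -> float:
    """Doubling time implied by exponential growth between two masses."""
    doublings = math.log2(end_mass_g / start_mass_g)
    return days / doublings

# 0.25 g seed to 250 g cob in 80 days: about 10 doublings.
corn_days = doubling_time_days(0.25, 250.0, 80.0)
print(f"{corn_days:.2f} days per doubling")
```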

Kudzu can grow one foot per day.

Suppose a doubling time of one week, which is pretty conservative. This means a daily growth rate of 2^(1/7) --> 10%, so whatever area it's covering, it grows by 10% of that each day. For a square patch measuring 100m*100m that is 1000m² per day spread over a 400m perimeter, so each edge advances about 2.5 meters per day. That is already several times kudzu's one foot per day.

  • initial : (100m)² 2.5m/day linear
  • month1 : (450m)² 11m/day linear
  • month2 : (2km)² 50m/day linear
  • month3 : (9km)² 220m/day linear
  • month4 : (40km)² 1km/day linear
  • month5 : (180km)² 4.4km/day linear
  • month6 : (800km)² 20km/day linear
  • month7 : (3500km)² 90km/day linear
  • month8 : (16000km)² 400km/day linear (half of earth surface area covered)
  • 8m1w : all done
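
A sketch that recomputes patch size and the outward advance of each edge from the one week doubling time. It assumes a single square patch, 30-day months and an Earth surface area of about 510 million km²; exact figures shift a little with rounding and month length.

```python
import math

DOUBLING_DAYS = 7.0        # area doubling time from the text
DAYS_PER_MONTH = 30.0      # assumption: 30-day months
EARTH_SURFACE_KM2 = 510e6  # approximate total surface area of Earth

def side_m(initial_side_m: float, days: float) -> float:
    """Side of a square patch whose area doubles every DOUBLING_DAYS."""
    return initial_side_m * math.sqrt(2.0 ** (days / DOUBLING_DAYS))

def edge_advance_m_per_day(side: float) -> float:
    """Outward advance of each edge in one day (half the daily gain in side length)."""
    return (side_m(side, 1.0) - side) / 2.0

for month in range(9):
    s = side_m(100.0, month * DAYS_PER_MONTH)
    covered = (s / 1000.0) ** 2 / EARTH_SURFACE_KM2
    print(f"month{month}: side {s / 1000.0:.3g} km, "
          f"edge advance {edge_advance_m_per_day(s):.3g} m/day, "
          f"{covered:.2%} of Earth's surface")
```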

1 week doubling times are enough to get you biosphere assimilation in under a year. If going full Tyranid and eating the plants/trees/houses is an option, things go faster. Much better efficiencies are achievable by eating the plants and reusing most of the cellular machinery. A doubling time of two days takes the 8 month global coverage time down to 10 weeks. Remember E. coli doubles in 20 minutes, so if we can literally re-use the whole tree (jack into the sap being produced) while eating the structural wood, doubling times could get pretty absurd.

The reason for specifying modular construction is to enable faster linear growth rates which are necessary for fast spread. Starting from multiple points is also important. Much better to have 10000 small 1m*1m patches spread out globally than a single 100m*100m patch. Same timeline but 100x lower required linear expansion rate.

Comment by anithite (obserience) on Green goo is plausible · 2023-04-18T05:02:37.103Z · LW · GW

I suspect it may be more practical to defend against this sort of attack using finite intelligence than previously assumed. We need to make the machine that knows how to guard against these sorts of things, but if we can make the vulnerability-closer, we don't need to hit max ASI to stop other ASIs from destroying all pre-ASI life on earth.

If you read between the lines in my Human level AI can plausibly take over the world post, hacking computers is probably the lowest difficulty "take over the world" strategy and has the side benefit of giving control over all the internet connected AI clusters.

The easiest way to keep a new superintelligence from emerging is to seize control of the computers it would be trained on. The AI only needs to hack far enough to monitor AI researchers and AI training clusters and sabotage later AI runs in a non-suspicious way. It's entirely plausible this has already happened and we are either in the clear or completely screwed depending on the alignment of the AI that won the race.

Also, hacking computers and writing software is something easy to test and therefore easy to train. I doubt that training an LLM to be a better hacker/coder is much harder than what's already been done in the RL space by OpenAI and DeepMind (EG: playing Dota 2 and StarCraft).

Biotech is a lot harder to deal with since ground truth is less accessible. This can be true for computer security too but to a much lesser extent (EG: lack of access to chips in the latest iPhone and lack of complete understanding thereof with which to develop/test attacks).

but also solves global warming and climate contamination and acts as a power & fuel grid. That and bio immortality is basically everything I personally want out of AGI. So I'd really like to have some idea how to build a machine that teaches a plant to do something like a safe, human-compatible version of this.

Pshh, low expectations. Mind uploading or bust!

Comment by anithite (obserience) on Brain Efficiency: Much More than You Wanted to Know · 2023-04-18T03:44:24.552Z · LW · GW

Both brains and current semiconductor chips are built on dissipative/irreversible wire signaling, and are mostly interconnect by volume

That's exactly what I meant. Thin wires inside a large amount of insulation are suboptimal.
When using better wire materials, rather than reducing capacitance per unit length, interconnect density can be increased (more wires per unit area) and then the entire design compacted. That gives higher capacitance per unit of wire length than the alternative, but much shorter wires, leading to overall lower switching energy.

This is why chips and brains are "mostly interconnect by volume" because building them any other way is counterproductive.

The scenario I outlined, while suboptimal, shows that in white matter there's an OOM to be gained even in the case where wire length cannot be decreased (EG: trying to further fold the grey matter locally in the already very folded cortical surface). In cases where white matter interconnect density is limiting and further compaction is possible, you could cut wire length for more energy/power savings, and that is the better design choice.

[Image: brain white matter cross section]

It sure looks like that could be possible in the above image. There's a lot of white matter in the middle and another level of even coarser folding could be used to take advantage of interconnect density increases.

Really though increasing both white and grey matter density until you run up against hard limits on shrinking the logic elements (synapses) would be best.

Comment by anithite (obserience) on Brain Efficiency: Much More than You Wanted to Know · 2023-04-18T03:00:21.924Z · LW · GW

Agreed. My bad.

Comment by anithite (obserience) on Brain Efficiency: Much More than You Wanted to Know · 2023-04-18T02:49:35.559Z · LW · GW

"and the dielectric is the same thickness as the wires." is doing the work there. It makes sense to do that if You're packing everything tightly but with an 8 OOM increase in conductivity we can choose to change the ratio (by quite a lot) in the existing brain design. In a clean slate design you would obviously do some combination of wire thinning and increasing overall density to reduce wire length.

The figures above show that (ignoring integration problems like copper toxicity and Na/K vs e- charge carrier differences), assuming you do a straight saltwater to copper swap in white matter neurons and just change the core diameter (replacing most of it with insulation), energy/switch event goes down by 12.5x.

I'm pretty sure for non-superconductive electrical interconnects the reliability is set by the Johnson-Nyquist noise, and figuring out the output noise distribution for an RC transmission line is something I don't feel like doing right now. Worth noting is that the above scenario preserves the R:C ratio of the transmission line (IE: 1 ohm worth of line has the same distributed capacitance) so thermal noise as seen from the end should be unchanged.