Posts

Unnatural abstractions 2024-08-10T22:31:42.949Z
Aprillion (Peter Hozák)'s Shortform 2024-04-10T11:29:01.973Z
The Usefulness Paradigm 2022-12-26T13:23:58.722Z
Why square errors? 2022-11-26T13:40:37.318Z

Comments

Comment by Aprillion on Training AGI in Secret would be Unsafe and Unethical · 2025-04-20T08:01:16.731Z · LW · GW

I can see that if Moloch is a force of nature, any wannabe singleton would collapse under internal struggles... but it's not like that would show me any lever AI safety can pull; it would be dumb luck if we live in a universe where the ratio of instrumentally convergent power concentration to its inevitable schism is less than 1 ¯\_(ツ)_/¯

Comment by Aprillion on johnswentworth's Shortform · 2025-04-17T08:33:45.349Z · LW · GW

Have you tried making a mistake in your understanding on purpose, to test whether it would correct you or agree with you even when you got it wrong?

(and if yes, was it "a few times" or "statistically significant" kinda test, please?)

Comment by Aprillion on johnswentworth's Shortform · 2025-04-17T08:13:07.822Z · LW · GW

While Carl Brown has said (a few times) that he doesn't want to do more YouTube videos for every new disappointing AI release, so far he seems to be keeping tabs on them in the newsletter just fine - https://internetofbugs.beehiiv.com/

...I am quite confident that if anything actually started to work, he would comment on it, so even if he won't say much about future incremental improvements, it might be a good resource to subscribe to for a better signal - if Carl gets enthusiastic about AI coding assistants, it will be worth paying attention.

Comment by Aprillion on plex's Shortform · 2025-03-29T08:40:32.976Z · LW · GW

My own experience is that if-statements are even 3.5's Achilles heel and 3.7 is somehow worse (when it's "almost" right, that's worse than useless - it's like reviewing pull requests when you don't know whether it's an adversarial attack or they mean well but are utterly incompetent in interesting, hypnotizing ways)... and that METR's baselines resemble a Skinner box more than programming (though many people have that kind of job, I just don't find the conditions of the gig economy "humane" or representative of how "value" is actually created), and then there's the sheer disconnect between what I would call "productive", "useful projects", "bottlenecks", and "what I love about my job and what parts I'd be happy to automate" vs the completely different answers on How Much Are LLMs Actually Boosting Real-World Programmer Productivity?, even from people I know personally...

I find this graph indicative of how "value" is defined by the SF investment culture and disruptive economy... and I hope the AI investment bubble will collapse sooner rather than later...

But even if the bubble collapses, automating intelligence will not be undone, it won't suddenly become "safe", and the incentives to create real AGI instead of overhyped LLMs will still exist - the danger is not in the presented economic curve going up, it's in what economic actors see as potential, in how incentivized corporations/governments are to search for the thing that is both powerful and dangerous, no?

Comment by Aprillion on The Most Forbidden Technique · 2025-03-17T14:50:47.899Z · LW · GW

I would never trust people not to look at my scratchpad.

I suspect the corresponding analogy for humans might be about hostile telepaths, not just literal scratchpads, right?

Comment by Aprillion on How Much Are LLMs Actually Boosting Real-World Programmer Productivity? · 2025-03-13T12:46:00.387Z · LW · GW

Thanks for the concrete examples - can you help me understand how these translate from individual productivity to externally observable productivity?

3 days to make a medium sized project

I agree Docker setup can be fiddly; however, what happened with the 50+% savings - did you lower the price for the customer to stay competitive, do you do 2x as many paid projects now, did you postpone hiring another developer who is no longer needed, or do you just have more free time? And no change in support & maintenance costs compared to similar projects before LLMs?

processing isn't more than ~500 lines of code

oh well, my only paid experience is with multi-year project development & maintenance, and those are definitely not in the under-1kloc category 🙈 which might help to explain my abysmal experience trying to use any AI tools for work (beyond autocomplete, but IntelliSense also existed before LLMs)

TBH, I am now moving towards the opinion that evals are very unrepresentative of the "real world" (if we exclude LLM wrappers as requested in the OP ... though LLM wrappers, including evals, are becoming part of the "real world" too, so I don't know - it's like banking bootstrapped wealthy bankers, and LLM wrappers might be bootstrapping wealthy LLM startups)

Comment by Aprillion on Paths and waystations in AI safety · 2025-03-13T11:15:46.511Z · LW · GW

toxic slime, which releases a cloud of poison gas if anything touches it

this reminds me of Oxygen Not Included (though I just learned the original reference is D&D), where Slime (which also releases toxic stuff) can be harvested to produce useful stuff in the Algae Distiller

the metaphor runs differently there: one of the useful things from Slime is Polluted Water, which is also produced by the human replicants in the Lavatory ... and there is a Water Sieve that will process Polluted Water into Water (and some plants want to be watered with the Polluted variant)

makes me wonder if there is any back-applicable insight - if AI slop is indistinguishable from corporate slop, can we use it to generate data to train spam filters that improve the quality of search results, and start valuing quality journalism again soon? (and maybe some cyborgs want to use AI assistants for useful work beyond buggy clones of open source tools)

Comment by Aprillion on The Insanity Detector and Writing · 2025-03-10T10:27:15.258Z · LW · GW

Talking out loud is even better. There is something about forcing your thoughts into language...

Those are 2 very different things for some people ;)

I, for one, can think MUCH faster without speaking out loud, even if I subvocalize real words (for the purpose of revealing gaps) and don't go all the way to manipulating concepts-that-don't-have-words-yet-but-have-been-pointed-to-already or concepts-that-have-a-word-but-the-word-stands-for-5-concepts-and-we-already-narrowed-it-down-without-explicit-label ...

Comment by Aprillion on A Bear Case: My Predictions Regarding AI Progress · 2025-03-06T09:26:34.932Z · LW · GW

the set of problems the solutions to which are present in their training data

a.k.a. the set of problems already solved by open source libraries without the need to re-invent similar code?

Comment by Aprillion on How Much Are LLMs Actually Boosting Real-World Programmer Productivity? · 2025-03-05T11:24:45.410Z · LW · GW

that's not how productivity ought to be measured - it should measure some output per (say) a workday

1 vs 5 FTE is a difference in input, not output, so you can say "adding 5 people to this project will decrease productivity by 70% next month and we hope it will increase productivity by 2x in the long term" ... not a synonym of "5x productivity" at all

it's the measure by which you can quantify diminishing results, not obfuscate them!
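
A toy example of the distinction in plain Python (hypothetical numbers, just to make the output-per-input ratio explicit):

team_before = 1          # FTE (input)
output_before = 10       # units of useful output per workday

team_after = 6           # the same project after adding 5 people
output_after = 18        # total output per workday, after the onboarding dip

productivity_before = output_before / team_before   # 10.0 per person-day
productivity_after = output_after / team_after      # 3.0 per person-day
print(productivity_before, productivity_after)
# output grew 1.8x, but productivity (output per input) dropped by 70% -
# the ratio quantifies the diminishing results instead of hiding them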

...but the usage of "5-10x productivity" seems to point to a different concept than a ratio of useful output per input 🤷 AFAICT it's a synonym for "I feel 5-10x better when I write code which I wouldn't enjoy writing otherwise"

Comment by Aprillion on Fake thinking and real thinking · 2025-02-28T09:38:21.330Z · LW · GW

A thing I see around me, my mind.
Many a peak, a vast mountain range,
standing at a foothill,
most of it unseen.

Two paths in front of me,
a lighthouse high above.

Which one will it be,
a shortcut through the forest,
or a scenic route?

Climbing up for better views,
retreating from overlooks,
away from the wolves.

To think with all my lighthouses.

Comment by Aprillion on Aprillion (Peter Hozák)'s Shortform · 2025-02-18T10:00:50.644Z · LW · GW

all the scaffold tools, system prompt, and whatnot add context for the LLM ... but what if I want to know what the context is, too?

Comment by Aprillion on Detect Goodhart and shut down · 2025-01-24T10:08:46.445Z · LW · GW

we can put higher utility on the shutdown

sounds instrumental to expand your moral circle to include other instances of yourself, to keep creating copies of yourself that will shut down ... then expand your moral circle to include humans and shut them down too 🤔

Comment by Aprillion on What Is The Alignment Problem? · 2025-01-18T12:51:33.921Z · LW · GW

exercise for readers: what patterns need to hold in the environment in order for "do what I mean" to make sense at all?

Notes to self (let me know if anyone wants to hear more, but hopefully no unexplored avenues can be found in my list of "obvious" if somewhat overlapping points):

  • sparsity - partial observations narrow down unobserved dimensions
  • ambiguity / edge of chaos - the environment is "interesting" to both agents (neither fully predictable nor fully random)
  • repetition / approximation / learnability - induction works
  • computational boundedness / embeddedness / diversity
  • power balance / care / empathy / trading opportunity / similarity / scale
Comment by Aprillion on The subset parity learning problem: much more than you wanted to know · 2025-01-05T13:04:37.758Z · LW · GW

Parity in computing is whether the count of 1s in a binary string is even or odd, e.g. '101' has two 1s => even parity (to output 0 for even parity, XOR all the bits like 1^0^1 .. to output 1 for even parity instead, XOR that result with 1).
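
A minimal sketch of that in plain Python (just to make the XOR trick concrete):

def parity(bits: str) -> int:
    # XOR all the bits together: 1 for an odd count of 1s, 0 for an even count
    result = 0
    for bit in bits:
        result ^= int(bit)
    return result

assert parity("101") == 0       # two 1s   => even parity
assert parity("111") == 1       # three 1s => odd parity
assert parity("101") ^ 1 == 1   # flip the result if 1 should mean "even"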

The parity problem (if I understand it correctly) sounds like trying to find the minimum number of data samples per input length that a learning algorithm ought to need to figure out that a mapping from a binary input to a single output bit is equal to computing XOR parity and not something else (e.g. whether an integer is even/odd, or whether there is a pattern in a wannabe-random mapping, ...), and the conclusion seems to be that you need exponentially more samples for linearly longer inputs .. unless you can figure out from other clues that you need to calculate parity, in which case you just implement parity for any input size and you don't need any additional sample data.

(FTR: I don't understand the math here, I am just pattern matching to the usual way this kind of problem goes)

Comment by Aprillion on The Online Sports Gambling Experiment Has Failed · 2025-01-04T08:02:07.408Z · LW · GW

The failure mode of the current policy sounds to me like "pay for your own lesson to feel less motivated to do it again" while the failure mode of this proposal would be "one of the casinos might maybe help you cheat the system which will feel even more exciting" - almost as if the people who made the current policy knew what they were doing to set aligned incentives 🤔

Comment by Aprillion on The Plan - 2024 Update · 2025-01-02T16:56:00.081Z · LW · GW

Focus On Image Generators

How about audio? Is the speech-to-text domain "close to the metal" enough to deserve focus too, or did people hit roadblocks that made image generators more attractive? If the latter, where can I read about the lessons learned, please?

Comment by Aprillion on The Plan - 2024 Update · 2025-01-02T15:23:19.159Z · LW · GW

What if you tried to figure out a way to understand the "canonical cliffness" and design a new line of equipment that could be tailored to fit any "slope"... Which cliff would you test first? 🤔

Comment by Aprillion on Shallow review of technical AI safety, 2024 · 2024-12-30T13:48:11.536Z · LW · GW

IMO

in my opinion, the acronym for the International Math Olympiad deserves to be spelled out here

Comment by Aprillion on Darwinian Traps and Existential Risks · 2024-08-26T11:55:04.894Z · LW · GW

Evolution isn't just a biological process; it's a universal optimization algorithm that applies to any type of entity

Since you don't talk about the other 3 forces of biological evolution, or about the "time evolution" concept in physics...

And since the examples seem to focus on directional selection (and not on other types of selection), and also only on short-term effect illustrations, while in fact natural selection explains most aspects of biological evolution - it's the strongest long-term force, not the weakest one (anti-cancer mechanisms and why viruses don't usually kill their host are also well explained by natural selection even if not listed as examples here; evolution by natural selection is the thing that explains ALL of those billions of years of biology in the real world so well - including cooperation, not just competition)...

Would it be fair to say that you use "evolution" only by analogy, not trying to build a rigorous causal relationship between what we know of biology and what we observe in sociology? There is no theory of the business cycle because of allele frequency, right?!?

Comment by Aprillion on Limitations on Formal Verification for AI Safety · 2024-08-24T09:01:18.419Z · LW · GW

If anyone here might enjoy a dystopian fiction about a world where the formal proofs will work pretty well, I wrote Unnatural abstractions

Comment by Aprillion on Unnatural abstractions · 2024-08-11T13:44:56.277Z · LW · GW

Thank you for the engagement, but "to and fro" is a real expression, not a typo (and I'm keeping it).. it's used slightly unorthodoxly here, but it sounded right to my ear, so it survived editing ¯\_(ツ)_/¯

Comment by Aprillion on Unnatural abstractions · 2024-08-10T22:50:40.153Z · LW · GW

I tried to use the technobabble in a way that's usefully wrong, so please also let me know if someone gets inspired by this short story.

I am not making predictions about the future, only commenting on the present - if you notice any factual error from that point of view, feel free to speak up, but as far as the doominess spectrum goes, it's supposed to be both too dystopian and too optimistic at the same time.

And if someone wants to fix a typo or a grammo, I'd welcome a pull request (but no commas shall be harmed in the process). 🙏

Comment by Aprillion on Inspired by: Failures in Kindness · 2024-07-28T08:21:39.476Z · LW · GW

Let me practice the volatile kindness here ... as a European, do I understand it correctly that this advice is targeted at a US audience? Or am I the only person to whom it sounds a bit fake?

Comment by Aprillion on Scalable oversight as a quantitative rather than qualitative problem · 2024-07-07T11:41:15.051Z · LW · GW

How I personally understand what it could mean to "understand an action:"

Having observed action A1 and having a bunch of (finite state machine-ish) models, each with a list of states that could lead to action A1: a more accurate candidate model => more understanding (and meta-level uncertainty about which model is right => less understanding) - see the toy sketch after the diagram.

Model 1            Model 2
S11 -> 50% A1      S21 -> 99% A1
    -> 50% A2          ->  1% A2

S12 -> 10% A1      S22 ->  1% A1
    -> 90% A3          -> 99% A2

                   S23 -> 100% A3
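
A toy version of the two models above, as a minimal Python sketch (hypothetical numbers; "more understanding" here just means the model concentrates the explanation of the observed action into fewer states):

# each candidate model maps its states to a distribution over actions
model_1 = {"S11": {"A1": 0.5, "A2": 0.5},
           "S12": {"A1": 0.1, "A3": 0.9}}
model_2 = {"S21": {"A1": 0.99, "A2": 0.01},
           "S22": {"A1": 0.01, "A2": 0.99},
           "S23": {"A3": 1.0}}

def state_posterior(model, action):
    # P(state | observed action, model), assuming a uniform prior over the model's states
    unnormalized = {s: dist.get(action, 0.0) for s, dist in model.items()}
    total = sum(unnormalized.values())
    return {s: p / total for s, p in unnormalized.items()}

print(state_posterior(model_1, "A1"))  # {'S11': ~0.83, 'S12': ~0.17} - still fuzzy
print(state_posterior(model_2, "A1"))  # {'S21': 0.99, 'S22': 0.01, 'S23': 0.0} - nearly pins down the state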
Comment by Aprillion on LLM Generality is a Timeline Crux · 2024-06-25T10:06:36.942Z · LW · GW

Thanks for the clarification. I don't share the intuition that this will prove harder than other hard software engineering challenges in non-AI areas - ones that weren't solved in months but were solved in years rather than decades - but other than "the broad baseline is more significant than the narrow evidence for me", I don't have anything more concrete to share.

A note until fixed: Chollet also discusses 'unhobbling' -> Aschenbrenner also discusses 'unhobbling'

Comment by Aprillion on LLM Generality is a Timeline Crux · 2024-06-24T15:04:26.744Z · LW · GW

I agree with "Why does this matter" and with the "if ... then ..." structure of the argument.

But I don't see where you get such a high probability (>5%) of scaffolding not working... I mean, whatever ends up working can be retroactively called "scaffolding", even if it falls into the "one more major breakthrough" category - and I expect those were already accounted for in the unhobbling predictions.

a year ago many expected scaffolds like AutoGPT and BabyAGI to result in effective LLM-based agents

Do we know the base rate for how many years after the initial marketing hype of a new software technology we should expect "effective" solutions? What is the usual promise:delivery story for SF startups / corporate presentations around VR, metaverse, crypto, sharing drives, sharing apartments, cybersecurity, industrial process automation, self-driving ..? How much hope should we take from the communication so far that the problem is hard to solve - did we expect, before AutoGPT and BabyAGI, that the first people to share their first attempt should have been successful?

Comment by Aprillion on LLM Generality is a Timeline Crux · 2024-06-24T13:42:51.412Z · LW · GW

Aschenbrenner argues that we should expect current systems to reach human-level given further scaling

In https://situational-awareness.ai/from-gpt-4-to-agi/#Unhobbling, "scaffolding" is explicitly named as a thing being worked on, so I take it that progress in scaffolding is already included in the estimate. Nothing about that estimate is "just scaling".

And AFAICT neither Chollet nor Knoop made any claims in the sense that "scaffolding outside of LLMs won't be done in the next 2 years" => what am I missing that is the source of hope for longer timelines, please?

Comment by Aprillion on My AI Model Delta Compared To Christiano · 2024-06-16T15:48:11.181Z · LW · GW

It’s a failure of ease of verification: because I don’t know what to pay attention to, I can’t easily notice the ways in which the product is bad.

Is there an opposite of the "failure of ease of verification" that would add up to 100% if you categorized the whole of reality into 1 of these 2 categories? Say in a simulation, if you attributed every piece of computation to the following 2 categories, how much of the world could be "explained by" each category?

  • make sure stuff "works at all and is easy to verify whether it works at all"
  • stuff that works must be "potentially better in ways that are hard to verify"

Examples:

  • when you press the "K" key on your keyboard 1000 times, it will launch nuclear missiles ~0 times and the K key will "be pressed" ~999 times
  • when your monitor shows you the pixels for a glyph of the letter "K" 1000 times, it will represent the planet Jupiter ~0 times and "there will be" the letter K ~999 times
  • in each page in your stack of books, the character U+0000 is visible ~0 times and the letter A, say ~123 times
  • tupperware was your own purchase and not gifted by a family member? I mean, for which exact feature would you pay how much more?!?
  • you can tell whether a water bottle contains potable water and not sulfuric acid
  • carpet, desk, and chair haven't spontaneously combusted (yet?)
  • the refrigerator doesn't produce any black holes
  • (flip-flops are evil and I don't want to jinx any sinks at this time)

Comment by Aprillion on Humming is not a free $100 bill · 2024-06-08T07:36:00.185Z · LW · GW

This leaves humming in search of a use case.

we can still hum to music, hum in (dis)agreement, hum in puzzlement, and hum the "that's interesting" sound ... without a single regard to NO or viruses, just for fun!

Comment by Aprillion on The case for stopping AI safety research · 2024-06-04T10:32:42.464Z · LW · GW

I agree with the premises (except "this is somewhat obvious to most" 🤷).

On the other hand, stopping AI safety research sounds like a proposal to go from option 1 to option 2:

  1. many people develop capabilities, some of them care about safety
  2. many people develop capabilities, none of them care about safety
Comment by Aprillion on Value Claims (In Particular) Are Usually Bullshit · 2024-06-02T14:42:32.586Z · LW · GW

half of the human genome consists of dead transposons

The "dead" part is a value judgement, right? Parts of DNA are not objectively more or less alive.

It can be a claim that some parts of DNA are "not good for you, the mind" ... well, I rather enjoy my color vision and RNA regulation, and I'm sure bacteria enjoy their antibiotic resistance.

Or maybe it's a claim that we already know everything there is to know about the phenomena called "dead transposons", there is nothing more to find out by studying the topic, so we shouldn't finance that area of research.

Is there such a thing as a claim that is not a value claim?

Is "value claims are usually bullshit" a value claim? Does the mental model pick out bullshit more reliably than to label as value claim from what you want to be bullshit? Is there a mental model behind both, thus explaining the correlation? Do I have model close enough to John's so it can be useful to me too? How do I find out?

Comment by Aprillion on Level up your spreadsheeting · 2024-05-26T15:26:27.945Z · LW · GW

Know some fancier formulas like left/mid/right, concatenate, hyperlink

Wait, I thought basic fancier formulas are like =index(.., match(.., .., 0)) 

I guess https://dev.to/aprillion/self-join-in-sheets-sql-python-and-javascript-2km4 might be a nice toy example if someone wants to practice the lessons from the companion piece 😹

Comment by Aprillion on Duct Tape security · 2024-05-12T11:24:08.409Z · LW · GW

It's duct tapes all the way down!

Comment by Aprillion on Duct Tape security · 2024-05-12T11:09:58.781Z · LW · GW

Bad: "Screw #8463 needs to be reinforced."

The best: "Book a service appointment, ask them to replace screw #8463, do a general check-up, and report all findings to the central database for all those statistical analyses that inform recalls and design improvements."

Comment by Aprillion on Dyslucksia · 2024-05-12T10:48:11.453Z · LW · GW

Oh, I should probably mention that my weakness is that I cannot remember the stuff well while reading out loud (especially when I focus on pronunciation for the benefit of listeners)... My workaround is to make pauses - it seems the stuff is in working memory and my subconscious can process it if I give it a short moment, and then I can think about it consciously too, but if I read a whole page out loud, I would have trouble even trying to summarize the content.

Similarly, a common trick for remembering names is to repeat the name out loud... that doesn't seem to improve recall for me very much - I can hear someone's name a lot of times and repeating it to myself doesn't seem to help. Perhaps seeing it written while hearing it might be better, but I'm not sure... By far the best method is when I want to write them a message and have to scroll around until I see their picture; after that I seem to remember names just fine 😹

Comment by Aprillion on Dyslucksia · 2024-05-11T08:37:39.355Z · LW · GW

Yeah, I myself subvocalize absolutely everything and I am still horrified when I sometimes try any "fast" reading techniques - those drain all of the enjoyment out of reading for me, as if instead of characters in a story I were imagining them as p-zombies.

For non-fiction, visual-only reading cuts connections to my previous knowledge (as if the text was a wave function entangled to the rest of the universe and by observing every sentence in isolation, I would collapse it to just "one sentence" without further meaning).

I never move my lips or tongue though, I just do the voices (obviously, not just my voice ... imagine reading Dennett without Dennett's delivery, isn't that half of the experience gone? how do other people enjoy reading with most of the beauty missing?).

It's faster than physical speech for me too, usually the same speed as verbal thinking.

Comment by Aprillion on Ironing Out the Squiggles · 2024-05-03T16:16:53.615Z · LW · GW

ah, but booby traps in coding puzzles can be deliberate... one might even say that it can feel "rewarding" when we train ourselves on these "adversarial" examples

the phenomenon of programmers introducing similar bugs in similar situations might be fascinating, but I wouldn't expect a clear answer to the question "Is this true?" without slightly more precise definitions of:

  • "same" bug
  • same "bug"
  • "hastily" cobbled-together programs
  • hastily "cobbled-together" programs ...
Comment by Aprillion on Transformers Represent Belief State Geometry in their Residual Stream · 2024-04-18T14:12:57.483Z · LW · GW

To me as a programmer and not a mathematician, the distinction doesn't make practical intuitive sense.

If we can create 3 functions f, g, h so that they "do the same thing" like f(a, b, c) == g(a)(b)(c) == average(h(a), h(b), h(c)), it seems to me that cross-entropy can "do the same thing" as some particular objective function that would explicitly mention multiple future tokens.
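
A minimal sketch of what I mean by functions "doing the same thing" (toy Python, hypothetical implementations):

def f(a, b, c):
    # plain 3-argument version
    return (a + b + c) / 3

def g(a):
    # curried version: the same computation, one argument at a time
    return lambda b: lambda c: (a + b + c) / 3

def h(x):
    # per-item version whose results get averaged by the caller
    return x

def average(*xs):
    return sum(xs) / len(xs)

assert f(1, 2, 3) == g(1)(2)(3) == average(h(1), h(2), h(3)) == 2.0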

My intuition is that cross-entropy-powered "local accuracy" can approximate "global accuracy" well enough in practice that I should expect better global reasoning from larger model sizes, faster compute, algorithmic improvements, and better data.

Implications of this intuition might be:

  • myopia is a quantity not a quality, a model can be incentivized to be more or less myopic, but I don't expect it will be proven possible to enforce it "in the limit"
  • instruct training on longer conversations ought to produce "better" overall conversations if the model simulates that it's "in the middle" of a conversation and follow-up questions are better compared to giving a final answer "when close to the end of this kind of conversation"

What nuance should I consider to understand the distinction better?

Comment by Aprillion on Transformers Represent Belief State Geometry in their Residual Stream · 2024-04-17T07:35:25.472Z · LW · GW

transformer is only trained explicitly on next token prediction!

I find myself understanding language/multimodal transformer capabilities better when I think about the whole document (up to context length) as a mini-batch for calculating the gradient in transformer (pre-)training, so I imagine it is minimizing the document-global prediction error; it wasn't trained to optimize just single next-token accuracy...
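
A minimal numpy sketch of that intuition (made-up numbers; the only point is that the training loss, and therefore the gradient, averages the next-token error over every position in the document, not just the last one):

import numpy as np

# toy "document": the actual next tokens, and the model's predicted
# next-token probabilities at each position (vocabulary of size 4)
targets = np.array([2, 0, 3])
predicted = np.array([[0.1, 0.2, 0.6, 0.1],   # position 0: fairly sure about token 2
                      [0.7, 0.1, 0.1, 0.1],   # position 1: fairly sure about token 0
                      [0.3, 0.3, 0.3, 0.1]])  # position 2: unsure about token 3

# per-position cross-entropy, averaged over the whole document -
# every position contributes to the same gradient, like one mini-batch
per_position = -np.log(predicted[np.arange(len(targets)), targets])
document_loss = per_position.mean()
print(per_position, document_loss)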

Comment by Aprillion on Transformers Represent Belief State Geometry in their Residual Stream · 2024-04-17T07:02:35.196Z · LW · GW

Can you help me understand a minor labeling convention that puzzles me? I can see how we can label  from the Z1R process as  in MSP because we observe 11 to get there, but why  is labeled as  after observing either 100 or 00, please?

Comment by Aprillion on Aprillion (Peter Hozák)'s Shortform · 2024-04-10T11:29:02.135Z · LW · GW

Pushing writing ideas to external memory for my less burned out future self:

  • agent foundations need path-dependent notion of rationality

    • economic world of average expected values / amortized big O if f(x) can be negative or you start very high
    • vs min-maxing / worst case / risk-averse scenarios if there is a bottom (death)
    • pareto recipes
  • alignment is a capability

    • they might sound different in the limit, but the difference disappears in practice (even close to the limit? 🤔)
  • in a universe with infinite Everett branches, I was born in the subset that wasn't destroyed by nuclear winter during the cold war - no matter how unlikely it was that humanity didn't destroy itself (they could have done that in most worlds and I wasn't born in such a world, I live in the one where Petrov heard the Geiger counter beep in some particular pattern that made him more suspicious or something... something something anthropic principle)

    • similarly, people alive in 100 years will find themselves in a world where AGI didn't destroy the world, no matter what are the odds - as long as there is at least 1 world with non-zero probability (something something Born rule ... only if any decision along the way is a wave function, not if all decisions are classical and the uncertainty comes from subjective ignorance)
    • if you took quantum risks in the past, you now live only in the branches where you are still alive and didn't die (but you could be in pain or whatever)
    • if you personally take a quantum risk now, your future self will find itself only in a subset of the futures, but your loved ones will experience all your possible futures, including the branches where you die ... and you will experience everything until you actually die (something something s-risk vs x-risk)
    • if humanity finds itself in unlikely branches where we didn't kill our collective selves in the past, does that bring any hope for the future?
Comment by Aprillion on Natural Latents: The Concepts · 2024-03-24T13:44:40.468Z · LW · GW

Now, suppose Carol knows the plan and is watching all this unfold. She wants to make predictions about Bob’s picture, and doesn’t want to remember irrelevant details about Alice’s picture. Then it seems intuitively “natural” for Carol to just remember where all the green lines are (i.e. the message M), since that’s “all and only” the information relevant to Bob’s picture.


(Writing before I read the rest of the article): I believe Carol would "naturally" expect that Alice and Bob share more mutual information than she does with Bob herself (even if they weren't "old friends", they both "decided to undertake an art project" while she "wanted to make predictions"), thus she would weigh the costs of remembering more than just the green lines against the expected prediction improvement given her time constraints, lost opportunities, ... - I imagine she could complete purple lines on her own, and then remember some "diff" about the most surprising differences...

Also, not all of the green lines would be equally important, so a "natural latent" would be some short messages in "tokens of remembering", not necessarily corresponding to the mathematical abstraction encoded by the 2 tokens of English "green lines" => Carol doesn't need to be able to draw the green lines from her memory if that memory was optimized to predict purple lines.

If the purpose was to draw the green lines, I would be happy to call that memory "green lines" (and in that, I would assume to share a prior between me and the reader that I would describe as: "to remember green lines" usually means "to remember the steps for drawing similar lines on another paper" ... also, similarity could be judged by other humans ... also, not to be confused with a very different concept, "to remember an array of pixel coordinates", that can also be compressed into the words "green lines", but I don't expect people will be confused about the context, so I don't have to say it now, just keep in mind if someone squints their eyes just so, which would provoke me to clarify).

Comment by Aprillion on It's OK to eat shrimp: EAs Make Invalid Inferences About Fish Qualia and Moral Patienthood · 2023-11-13T18:37:16.820Z · LW · GW

yeah, I got a similar impression that this line of reasoning doesn't add up...

we interpret other humans as feeling something when we see their reactions

we interpret other eukaryotes as feeling something when we see their reactions 🤷

Comment by Aprillion on The Brain as a Universal Learning Machine · 2023-10-25T11:20:08.695Z · LW · GW

(there are a couple of circuit diagrams of the whole brain on the web, but this is the best.  From this site.)

could you update the 404 image, please? (link to the site still works for now, just the image is gone)

Comment by Aprillion on Features and Adversaries in MemoryDT · 2023-10-22T11:21:08.747Z · LW · GW

S5


What is S5, please?

Comment by Aprillion on Are humans misaligned with evolution? · 2023-10-20T07:27:40.139Z · LW · GW

I agree with what you say. My only peeve is that the concept of IGF is presented as a fact from the science of biology, while it's used as a confused mess of 2 very different concepts.

Both talk about evolution, but inclusive fitness is a model of how we used to think about evolution before we knew about genes. If we model biological evolution on the genetic level, we don't have any need for additional parameters on the individual organism level; natural selection and the other 3 forces of evolution explain the observed phenomena without a need to talk about individuals on top of genetic explanations.

Thus the concept of IF is only a good metaphor when talking approximately about optimization processes, not when trying to go into details. I am saying that going with the metaphor too far will result in confusing discussions.

Comment by Aprillion on Are humans misaligned with evolution? · 2023-10-19T14:50:20.472Z · LW · GW

humans don't actually try to maximize their own IGF


Aah, but humans don't have IGF. Humans have https://en.wikipedia.org/wiki/Inclusive_fitness, while genes have allele frequency https://en.wikipedia.org/wiki/Gene-centered_view_of_evolution ..

Inclusive genetic fitness is a non-standard name for the latter view of biology as communicated by Yudkowsky - as a property of genes, not a property of humans.

The fact that bio-robots created by human genes don't internally want to maximize the genes' IGF should be a non-controversial point of view. The human genes successfully make a lot of copies of themselves without any need whatsoever to encode their own goal into the bio-robots.

I don't understand why anyone would talk about IGF as if genes ought to want the bio-robots to care about IGF, that cannot possibly be the most optimal thing for genes to "want" to do (if I understand examples from Yudkowsky correctly, he doesn't believe that either, he uses this as an obvious example that there is nothing about optimization processes that would favor inner alignment) - genes "care" about genetic success, they don't care about what the bio-robots ought to believe at all 🤷

Comment by Aprillion on Sum-threshold attacks · 2023-09-14T15:26:59.591Z · LW · GW

Some successful 19th century experiments used 0.2°C/minute and 0.002°C/second.

Have you found the actual 19th century paper?

The oldest quote about it that I found is from https://www.abc.net.au/science/articles/2010/12/07/3085614.htm

Or perhaps the story began with E.M. Scripture in 1897, who wrote the book, The New Psychology. He cited earlier German research: "…a live frog can actually be boiled without a movement if the water is heated slowly enough; in one experiment the temperature was raised at the rate of 0.002°C per second, and the frog was found dead at the end of two hours without having moved."

Well, the time of two hours works out to a temperature rise of 18°C. And, the numbers don't seem right.

First, if the water boiled, that means a final temperature of 100°C. In that case, the frog would have to be put into water at 82°C (18°C lower).

Surely, the frog would have died immediately in water at 82°C. 
Comment by Aprillion on Sum-threshold attacks · 2023-09-13T13:02:23.429Z · LW · GW

I'm not sure what to call this sort of thing. Is there a preexisting name?

sounds like https://en.wikipedia.org/wiki/Emergence to me 🤔 (not 100% overlap and also not the most useful concept, but a very similar shaky pointer in concept space between what is described here and what has been observed as a phenomenon called Emergence)