This argument against subagents is important and made me genuinely less confused. I love the concrete pizza example and the visualization of both agents' utilities in this post. Those led me to actually remember the technical argument when it came up in conversation.
I found Steven Byrnes's valence concept really useful for my own thinking about psychology, both broadly and concretely when reading text messages from my contextualizing friend (when a message was ambiguous, guessing the correct interpretation based on valence worked surprisingly well for me).
I ended up dodging the bullet of losing money here, because I was a bit worried that Nate Silver's model might have been behind, since the last poll at that point was from the 23rd. I was also too busy with other important work to resolve my confusions before the election. My current best guesses are:
- The French whale did not have an edge.
- The neighbour polling method is a just-so story to spread confusion, but he actually did have an edge.
- I don't correctly understand how this neighbour polling method is supposed to work.
In any case, if Polymarket is still legal in 4 years, I expect the prediction market on the election to be efficient relative to me, and I will not bet on it.
I had a discussion with @Towards_Keeperhood about what we would expect in the worlds where orcas either are or aren't more intellectually capable than humans if trained. The main pieces of evidence I remember were: orcas already dominating the planet (like humans do), and large sea creatures having gone extinct due to orcas (similar to how humans drove several species extinct; Megalodon? Probably extinct for different reasons, so weak evidence against? Most other large whales are still around). I argued that @Towards_Keeperhood was also underestimating the intricacies that hunter-gatherers are capable of, and gave the book review of The Secret of Our Success as an example. I think @Towards_Keeperhood did update in that direction after reading that post. I also reread that post and, funnily enough, stumbled over some evidence that orcas might have fallen into a similar "culture attractor" for intelligence as humans:
> Learn from old people. Humans are almost unique in having menopause; most animals keep reproducing until they die in late middle-age. Why does evolution want humans to stick around without reproducing?
> Because old people have already learned the local culture and can teach it to others. Heinrich asks us to throw out any personal experience we have of elders; we live in a rapidly-changing world where an old person is probably “behind the times”. But for most of history, change happened glacially slowly, and old people would have spent their entire lives accumulating relevant knowledge. Imagine a world where when a Silicon Valley programmer can’t figure out how to make his code run, he calls up his grandfather, who spent fifty years coding apps for Google and knows every programming language inside and out.
A quick Google search revealed that orcas have menopause too, while chimpanzees don't! I would not have predicted that.
Typo in the linked document:
> There is no one is coming to save us.
Can someone who is already trading on Polymarket, or is planning to do so soon, tell me whether there are any hidden fees (or ways my money might be locked up for longer than I expect)? Four years ago I got hit by enormous Ether gas fees on Augur, which still left my bet positive EV, but only barely (I had to wait quite a while for the gas cost to drop that low, and was losing out on investing the money and on my attention). I plan to bet ~$3K-$7K and think Kamala Harris has a 45% chance of winning. Is that enough for all the transaction costs to vanish?
One confounder: depression/mania. Recently (the last ~two weeks) I have been having bad sleep: waking up 3-7 am and not feeling sleepy anymore (usually I sleep from midnight to 9). My current best guess is that my life has been going too well recently, leading to a self-sustaining equilibrium of little sleep and mania. I reduced my medication today (~55mg instead of 70mg), which seems to have helped with the mania. I had another day with slight mania a month ago, when sleeping little in order to travel to a conference, so in the future I'll probably reduce my medication dose on such days. It took a friend describing his symptoms on too much medication for me to realize what was going on.
> I am also interested in finding a space to explore ideas which are not well-formed. It isn’t clear to me that this is intended to be such a space. This may simply be due to my ignorance of the mechanics around here.
For not-yet-well-formed ideas, you can write a Quick Take (found by clicking on your profile name in the top right corner) or start a dialogue if you want to develop the idea together with someone (found in the same corner).
I feel like there should exist a more advanced sequence that explains problems with filtered evidence leading to “confirmation bias”. I think the Luna sequence is already a great step in the right direction, but there is a lack of an equivalent non-fiction version that just plainly lays out the issue. Maybe what I am envisioning is just a version of What Evidence Filtered Evidence? with more examples of how to practice this skill (applied to search engines, language models, one's own thought process, information actively hidden from you, rationality in groups, etc.).
> adult augmentation 2-3std for the average person seems plausible, but for the few +6std people on earth it might just give +0.2std or +0.3std, which tbc I think is incredibly worthwhile.
Such strongly diminishing genetic returns in g seem quite implausible to me, but I would be happy if you can point to evidence to the contrary. If it works well for people with average intelligence, I'd expect it to work at worst half as well at +6std.
I am a bit confused about why some of these theories would be so hard to test. Some core pathways that seem like they wouldn't be reversible even in naive stem cells under any circumstances (like transposons having copied themselves successfully) could be tested by checking whether clones derived from older cells age faster, or something along those lines. The same goes for children of older parents. (I'm not sure to what extent that test would be made harder by all the mechanisms keeping the germ line immortal.)
I don't know where anger fits into this. Also I should look at how these behaviors manifest in other animals.
Hypothesis, based on the fact that status is a strong drive and that people on the outer ends of that spectrum get classified as having a "personality disorder" and are going to be very resistant to therapy:
- weak status-fear == psychopathy: psychopathy is caused by the loop producing fear of losing status being weaker than average, or possibly broken. (Psychopathy is probably on a spectrum; I don't see a reason why having a little of this feeling would be less optimal than having none.)
- strong status-fear == (?histrionic personality disorder)
- weak status-seeking-loop == (?schizoid personality disorder)
- strong status-seeking-loop == (?narcissism)
I was thinking about Steven Byrnes's agenda to figure out social drives and what makes a psychopath a psychopath. One clearly existing social drive seemed to be "status-seeking", alongside "status-fear" (fear of losing status). Both of these could themselves be made of several drives? The idea that status-seeking and status-fear are different came to me when trying to think of the simplest hypothesis explaining psychopathy, and from introspecting that these two feelings feel very different to me, and distinct from other fears. The two could be mostly separate loops, but I can't complicate my fake framework even more just yet.
If someone is interested, I'd write a post on how to stress-test this fake framework and what I'd expect in the worlds where it is or isn't true. (Most interesting would be social drives that are distinct from the above? Or maybe they use some of the same sub-circuitry? Jealousy, for example, seems like it would obviously fit under strong status-fear, so histrionic personality disorder would go with being more jealous.)
Since I am already on the fancy note-taking train, I'd find examples of your actual note files way more interesting.
On my phone, rotating the screen by 180° quickly reverses the direction and then I rotate it back slowly.
I think that from the perspective of a radical probabilist, it is very natural not only to have a word for where your current point estimate is, but also to have some tagging for those words, indicating how much computation went into the estimate, or whether it already tries to take the listener's model into account.
I misread your whole post: I thought the title implied the post would question whether entropy increases (i.e., argue that it decreases), and then I was reading sloppily and didn't notice my error.
Also you should halt and reevaluate your intuitions if they lead you to believe there is a perpetual motion machine.
Photosynthesis? Most of the carbon is bound from CO2 by using solar exergy.
Cool post. I agree with the many-shot part in principle. It strikes me that in a few years (hopefully not months?), this will look naive in a similar way that all the thinking on how a well-boxed AI might be controlled looks naive now. If I understand correctly, these kinds of simulations would require a certain amount of slowing down and doing slightly inconvenient things once you hit a certain capability level. I don't trust labs like OpenAI, DeepMind (Anthropic maybe?) to execute such a strategy well.
If legibility of expertise is a bottleneck to progress and adequacy of civilization, it seems like creating better benchmarks for knowledge and expertise for humans might be a valuable public good. While that seems difficult for aesthetics, it seems easier for engineering? I'd rather listen to a physics PhD, who gets Thinking Physics questions right (with good calibration), years into their professional career, than one who doesn't.
One way to do that is to force experts to make forecasts, but this takes a lot of time to hash out and even more time to resolve.
One idea I just had related to this: the same way we use datasets like MMLU and MMMU to evaluate language models, we could use a small dataset like this for humans; experts are allowed to take the test, performance on it is always public, and then you make a new test every month or year.
Maybe you also get some participants to do these questions in a quiz show format and put it on YouTube, so the test becomes more popular? I would watch that.
The disadvantage of this method, compared to the tests people prepare for in academia, would be that the data is quite noisy. On the other hand, this measure could be more robust to Goodharting and fraud (although of course that would become a harder problem once someone actually cared about the test). This process would inevitably miss genius hedgehogs, of course, but maybe not their ideas, if the generalists can properly evaluate them.
There are also some obvious issues in choosing what kinds of questions one uses as representative.
Its not being linked on Twitter and Facebook seems more like a feature than a bug, given that when I asked Gwern why a page like this doesn't already exist, he wrote me that he doesn't want people to mock it.
> I really like the importance Tags, but what I would really like is a page
> where I can just go through all the posts ordered by importance. I just
> stumbled over another importance 9 post (iron rules) when I thought I had
> read all of them. Clicking on the importance tag, just leads to a page
> explaining the importance tag.
Yeah, that is a mix of 'too hard to fix' and 'I'm not sure I want to
fix it'. (I don't know how Hakyll works well enough to do it
'normally', and while I think I can just treat it as a tag-directory,
like 'meta/importance/1', 'meta/importance/2' etc, that's a little
awkward.) Do I *want* people to be able to go through a list of
articles sorted by importance and be able to easily mock it - avoiding
any actual substantive critique?
I went through Gwern’s posts and collected all the posts with importance 8 and higher as of 2024-09-04 in case someone else was searching for something like this.
Importance 10
- How Often Does Correlation=Causality?
- Why Correlation Usually ≠ Causation
- The Scaling Hypothesis
- Complexity no Bar to AI
- The Algernon Argument
- Embryo Selection For Intelligence
- Life Extension Cost-Benefits
- Colder Wars
- Metamagical Themas: Sanity and Survival
- The Existential Risk of Math Errors
Importance 9
- Iodine and Adult IQ meta-analysis
- Lithium in ground-water and well-being
- Melatonin
- Modafinil
- Modafinil community survey
- Nicotine
- Why Tool AIs Want to Be Agent AIs
- Machine Learning Scaling
- The Hyperbolic Time Chamber & Brain Emulation
- The Iron Law Of Evaluation And Other Metallic Rules
- Spaced Repetition for Efficient Learning
- Are Sunk Costs Fallacies?
- Plastination versus Cryonics
- Silk Road 1: Theory & Practice
- Darknet Market Archives (2013–2015)
- The Melancholy of Subculture Society
Importance 8
- The Replication Crisis: Flaws in Mainstream Science
- How Should We Critique Research?
- The Narrowing Circle
- Zeo sleep self-experiments
- When Should I Check The Mail?
- Candy Japan’s new box A/B test
- The Kelly Coin-Flipping Game: Exact Solutions
- In Defense of Inclusionism
- Bitcoin Is Worse Is Better
- Time-lock encryption
- Easy Cryptographic Timestamping of Files
- GPT-3 Creative Fiction
- RNN Metadata for Mimicking Author Style
- Terrorism Is Not About Terror
- Terrorism Is Not Effective
- Archiving URLs
- Conscientiousness & Online Education
- Radiance: A Novel
The recent post on reliability and automation reminded me that my "text expansion" tool Espanso is not reliable enough on Linux (Ubuntu, GNOME, X11). Is anyone here using reliable alternatives?
I've been using Espanso for a while now, but its text expansions miss characters too often, which is worse than useless. I fiddled with Espanso's settings just now and set the backend to Clipboard (roughly as in the snippet below), which seems to help with that, but it still has bugs, like special characters remaining ("@my_email_shorthand" -> "@myemail@gmail.com").
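For reference, the change was roughly the following (a sketch from memory; the exact file path and option casing vary between Espanso versions, so check the docs for yours):

```yaml
# ~/.config/espanso/default.yml (location differs by version and platform)
backend: Clipboard  # paste expansions via the clipboard instead of simulating keystrokes
```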
> In particular, I think you might need to catch many escape attempts before you can make a strong case for shutting down. (For concreteness, I mostly imagine situations where we need to catch the model trying to escape 30 times.)
So instead of leaving the race once the models start scheming against you, you keep going to gather more instances of scheming until you can finally convince people? As an outside reader of that story, I'd just be screaming at the protagonists that clearly everyone can see where this is going: scheming attempt number 11 is just good enough to be successful. And in the worlds where we do catch them 30 times, it feels like people would argue: this is clear evidence that the models aren't "actually dangerous" yet, so let's keep scaling "responsibly".
There is probably a lot of variation between people on that. In my family, meds improved people's sleep across the board (by making people less sleepy during the day, hence more active, with fewer naps). When I reduced my medication from 70mg to 50mg for a month to test whether I still needed the full dose, the most annoying thing was my sleep (waking up at night and not falling asleep again became more frequent; falling asleep initially was maybe slightly easier). Taking it too late in the afternoon is really bad for my sleep, though.
Things I learned that surprised me from a deep dive into how the medication I've been taking for years (Vyvanse) actually gets metabolized:
- The instructions say it works for 13 hours, and my psychiatrist informed me that it has a slow onset of about an hour. What that actually means: after ~1h you reach half the peak concentration, and after 13 hours you are at half the peak concentration again, because the half-life is 12h (and someone decided at some point that half the peak is where the exponential "starts" and "ends"?). Importantly, this means 1/4 of the medication is still present the next day! (I sketch this arithmetic in code after this list.)
Here is some real data, which fits the simple exponential decay rather well. (It's from children, though, who metabolize dextroamphetamine faster, which is why the half-life there is only ~10h.)
- If you eat ~1-3 grams of baking soda, you can make the amount of medication you lose through urine (usually ~50%) go to 0[1] (don't do this! Your body probably keeps its urine pH at the level it does for a reason! You could get kidney stones).
- I thought the opposite effect (acidic urine gets rid of the medication quickly) explained why my ADHD psychologist had told me that the medication works less well combined with citrus fruit, but no! Citrus fruits actually increase your urine pH (or mostly don't affect it much)? Probably because of the citric acid cycle, which means more of the acid leaves as CO2 through your lungs? (I have this from GPT-4, and a rough gloss over the details checked out against Wikipedia, but it could be wrong; I barely remember my chemistry from school.)
- Instead, grapefruit has some ingredients known to inhibit the metabolizing enzymes of several drugs, including dextroamphetamine (though I don't yet understand whether this inhibitory effect is actually large enough to be a problem).
- This brings me to another observation: apparently each of these enzymes (CYP3A4/5, CYP2D6, CYP2C9) is involved in metabolizing >10-20% of drugs. Wow! Seems worth learning more about them! CYP2D6 is actually used twice in the metabolism of dextroamphetamine: once for producing and once for degrading an active metabolite.
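To make the half-life arithmetic from the first bullet concrete, here is a minimal sketch (my own illustration, assuming the ~1h onset and 12h half-life quoted above, with a crude linear ramp-up to the peak):

```python
# Toy model of the concentration curve described above (an illustration,
# not medical advice): peak ~1 h after the dose, then exponential decay
# with a 12 h half-life.
def relative_concentration(hours: float, t_peak: float = 1.0,
                           half_life: float = 12.0) -> float:
    """Concentration as a fraction of the peak."""
    if hours <= t_peak:
        return hours / t_peak  # crude linear onset, just for illustration
    return 0.5 ** ((hours - t_peak) / half_life)

print(relative_concentration(13))  # 0.5  -> the quoted "works for 13 hours"
print(relative_concentration(25))  # 0.25 -> a quarter still present next day
```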
I am currently still learning the basics of neurotransmitters from a textbook, and I might write another update once/if I reach the point where I feel comfortable writing about the effects of dextroamphetamine on signal transmission.
Urinary excretion of methylamphetamine in man (Sci-Hub is your friend) ↩︎
Looking forward to the rest of the sequence! On my current model, I think I agree with ~50% of the "scientism" replies (roughly I agree with those relating to thinking of things as binary vs. continuous, while I disagree with the outlier/heavy-tailed replies), so I'll see if you can change my mind.
> The technical background is important, but in a somewhat different way than I'd thought when I wrote it. When I was writing it, I was hoping to help transmit my model of how things work so that people could use it to make their own decisions. I still think it's good to try to do this, however imperfectly it might happen in practice. But I think the main reason it is important is because people want to know where I'm coming from, what kinds of things I considered, and how deeply I have investigated the matter.
Yes! I think it is beneficial and important that someone who has a lot of knowledge about this transmits their model on the internet. Maybe my Google-fu is bad, but I usually have a hard time finding articles like this when there doesn't happen to be one on LessWrong (I can only think of this one counterexample that I remember finding reasonably quickly).
Raising children better doesn't scale well: neither in how much oomph you get out of it per person, nor in how many people you can reach with this special treatment.
What (human or not) phenomena do you think are well explained by this model? I tried to think of some for 5 minutes, and the best I came up with was the strong egalitarianism among hunter-gatherers. I don't actually know that much about hunter-gatherers, though. In the modern world, something where "high-IQ" people do worse is sex, but that doesn't seem to fit your model.
So on the meta-level you need to correct weakly in the other direction again.
I used Alex Turner's entire shortform as context in my prompt for GPT-4, which worked well enough to make the task difficult for me, but maybe I just suck at this task.
By the way, if you want to donate to this but thought, like me, that you need to be an “accredited investor” to fund Manifund projects, that only applies to their impact certificate projects, not this one.
> My point is more that 'regular' languages form a core to the edifice because the edifice was built on it, and tailored to it
If that was the point of the edifice, it failed successfully, because those closure properties made me notice that visibly pushdown languages are nicer than context-free languages, but still allow matching parentheses and are arguably what regexp should have been built upon.
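To illustrate why visibly pushdown languages still capture matched parentheses, here is a minimal sketch of my own (not from the original exchange): the defining restriction is that the input symbol alone determines the stack action, which is what buys the closure properties (e.g., under intersection) that general context-free languages lack.

```python
# Minimal sketch: a visibly pushdown recognizer for balanced parentheses.
# '(' is a call symbol (always pushes), ')' is a return symbol (always pops),
# and every other symbol is internal (never touches the stack) -- so the
# stack action is "visible" from the input symbol alone.
def accepts_balanced(s: str) -> bool:
    stack = []
    for ch in s:
        if ch == '(':          # call symbol: must push
            stack.append(ch)
        elif ch == ')':        # return symbol: must pop
            if not stack:
                return False
            stack.pop()
        # internal symbols: no stack operation
    return not stack           # accept iff every call was matched

assert accepts_balanced("(a(b)c)")
assert not accepts_balanced("(()")
```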
My comment was just based on a misunderstanding of this sentence:
> The 'regular' here is not well-defined, as Kleene concedes, and is a gesture towards modeling 'regularly occurring events' (that the neural net automaton must process and respond to).
I think you just meant that there's really no satisfying analogy explaining why it's called 'regular'. What I thought you were implying is that this class wasn't crisply characterized, then or now, in terms of math (it is). Thanks to your comment, though, I noticed a large gap in the CS-theory understanding I thought I had. I had thought that the 4 levels usually mentioned in the Chomsky hierarchy are the only strict subsets of languages that are well characterized by a grammar, an automaton, and a whole lot of closure properties. Apparently the emphasis on these languages in my two stacked classes on the subject 2 years ago was a historical accident? (Looking at Wikipedia, visibly pushdown languages are closed under intersection, so from my quick skim they're arguably more natural than context-free languages.) They were only discovered in 2004, so perhaps I can forgive my two classes on the subject for not having included developments from 15 years in the past. Does anyone have post recommendations for propagating this update?
I noticed some time ago that there is a big overlap between the lines of hope mentioned in Garret Baker's post and lines of hope I already had. The remaining things he mentions are lines of hope that I at least can't antipredict, which is rare. It's currently the top plan/model of alignment that I would want to read a critique of (to destroy or strengthen my hopes). Since no one else seems to have written that critique yet, I might write a post myself (leave a comment if you'd be interested in reviewing a draft or have feedback on the points below).
- If singular learning theory is roughly correct in explaining confusing phenomena about neural nets (double descent, grokking), then the things that are confusing about these architectures are pretty straightforward implications of probability theory (implying we might expect fewer differences in priors between humans and neural nets, because biases are less architecture-dependent).
- The idea that something like "reinforcing shards" can be stable if your internals are part of the context during training, even if you don't have perfect interpretability.
- The idea that maybe the two ideas above can stack: if training data is the most crucial ingredient for both humans and AI, then perhaps we can develop methods for comparing human brains and AI. If we get to the point of being able to do this in detail (a big if; especially on the neuroscience side this seems possibly hopeless?), then we could get further guarantees that the AI we are training is not a "psychopath".
Quite possibly further reflection or feedback would change my mind, so counterarguments/feedback would be appreciated. I am quite worried about motivated reasoning making this plan look better to me than it is, because it would give me something tractable to work on. I also wonder to what extent people planning to work on methods robust enough to survive a sharp left turn are pessimistic about lines of research like this only because of the capability externalities; I have a hard time evaluating the capability externalities of publishing research on plans like the above. If someone is interested in writing a post about this, or in reading it, feel free to leave a comment.
Aren't regular languages really well-defined, as the weakest level in the Chomsky hierarchy?
Would it change your mind if GPT-4 were able to do the grid tasks after I manually transcribed them into different tokens? I tried to manually have GPT-4 turn the image into a Python array, but it indeed has trouble performing just that task alone.
> That propagates into a huge difference in worldviews. Like, I walk around my house and look at all the random goods I’ve paid for - the keyboard and monitor I’m using right now, a stack of books, a tupperware, waterbottle, flip-flops, carpet, desk and chair, refrigerator, sink, etc. Under my models, if I pick one of these objects at random and do a deep dive researching that object, it will usually turn out to be bad in ways which were either nonobvious or nonsalient to me, but unambiguously make my life worse and would unambiguously have been worth-to-me the cost to make better.
Based on my one deep dive on pens a few years ago, this seems true. Maybe it would be too high-dimensional and unfocused a post, but maybe there should be a post on "the best X of every common product people use every day"? And then we somehow filter for people with actual expertise? Like for pens, you want to go with the recommendations of The Pen Addict.
For concreteness: in this task it fails to recognize that all of the cells get filled, not only the largest one. To me that gives the impression that the image is just not getting compressed well, and that the reasoning GPT-4 is doing is actually fine.
I think humans just have a better visual cortex, and I expect this benchmark, too, to just fall with scale.
Looking at how GPT-4 did on the benchmark when I gave it some screenshots, the thing it failed at was the visual "pattern matching" (the part completely solved by my System 1) rather than the abstract reasoning.
Thanks for clarifying! I just tried a few simple ones by prompting GPT-4o and GPT-4, and they do an absolutely horrific job! Maybe actually good prompting could help solve it, but this is definitely already an update for me!
> LLMs have failed at ARC for the last 4 years because they are simply not intelligent and basically pattern-match and interpolate to whatever is within their training distribution. You can say, "Well, there's no difference between interpolation and extrapolation once you have a big enough model trained on enough data," but the point remains that LLMs fail at the Abstract Reasoning and Concepts benchmark precisely because they have never seen such examples.
> No matter how 'smart' GPT-4 may be, it fails at simple ARC tasks that a human child can do. The child does not need to be fed thousands of ARC-like examples; it can just generalize and adapt to solve the novel problem.
I don't get it. I just looked at ARC, and it seemed obvious that GPT-4/GPT-4o can easily solve these problems by writing Python. Then I looked it up on Papers with Code, and it seems close to solved? Probably the ones remaining would be hard for children also. Did the benchmark leak into the training data, and is that why they don't count those results?
Feel free to write a post if you find something worthwhile. I didn't know how likely the whole Biden-leaving-the-race thing was, so 5% seemed prudent. At those odds, even if I believe the FiveThirtyEight numbers, I'd rather leave my money in ETFs; I'd probably need something like a >>1.2 multiplier in expected value before I'd bother. Last year when I was betting on Augur I was also heavily bitten by gas fees ($150 in transaction costs to get my money back, because ETH gas fees exploded), so it would be good to know whether this is a problem on Polymarket too.
Heuristics I heard: cutting away the moldy bits is OK for solid food (like cheese or carrots). Don't eat moldy bread, because of mycotoxins (Googling this, I don't know why people mention bread in particular here). GPT-4 gave me the same heuristics.
Has anyone here investigated whether washing vegetables/fruits is worth it? Until recently I never washed my vegetables, because I classified that as a bullshit value claim.
Intuitively, if I am otherwise not super hygienic (e.g., about washing my hands before eating), it doesn't seem that plausible to me that vegetables are where I am going to get infected from other people having touched the carrots etc. Being in quarantine during a pandemic might be an exception, but then again, I don't know whether I'd even get rid of viruses by lazily rinsing them with water in my sink. In general, washing vegetables is a trivial inconvenience I'd like to avoid, because it leads me to eat fewer vegetables/fruits (like raw carrots or apples).
Also, I assume a little pesticide and dirt are not that bad (which might be wrong).
This sounds like the right kind of question to ask, but without more concrete data on which questions your predictions were off by how much, it is hard to give any better advice than: if your gut judgement tends to be 20% off after considering all evidence, move the number 20% up.
Personally, my partner and I have a similar bias, but only about ourselves, so making predictions together on things like "application for xyz will succeed" or "Y will read, be glad about, and reply to the message I send them" can be helpful in cases where there are large disagreements.