> “THEY BELIEVE YOU CAN CARVE UP THE DIFFERENT FEATURES OF THE UNIVERSE, ENTIRELY UNLIKE CARVING A FISH,” the angel corrected himself. “BUT IN FACT EVERY PART OF THE BLUEPRINT IS CONTAINED IN EVERY OBJECT AS WELL AS IN THE ENTIRETY OF THE UNIVERSE. THINK OF IT AS A FRACTAL, IN WHICH EVERY PART CONTAINS THE WHOLE. IT MAY BE TRANSFORMED ALMOST BEYOND RECOGNITION. BUT THE WHOLE IS THERE. THUS, STUDYING ANY OBJECT GIVES US CERTAIN DOMAIN-GENERAL KNOWLEDGE WHICH APPLIES TO EVERY OTHER OBJECT. HOWEVER, BECAUSE ADAM KADMON IS ARRANGED IN A WAY DRAMATICALLY DIFFERENT FROM HOW OUR OWN MINDS ARRANGE INFORMATION, THIS KNOWLEDGE IS FIENDISHLY DIFFICULT TO DETECT AND APPLY. YOU MUST FIRST CUT THROUGH THE THICK SKIN OF CONTINGENT APPEARANCES BEFORE REACHING THE HEART OF -”
https://arxiv.org/abs/1806.00952 gives a theoretical argument that suggests SGD will converge to a point that is very close in L2 norm to the initialization. Since NNs are often initialized with extremely small weights, this amounts to implicit L2 regularization.
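A toy illustration of the "stays close to init" effect (my own sketch, not the paper's setting: the paper analyzes neural nets, while this uses overparameterized linear regression, where the effect is provable): gradient descent only ever moves the weights within the row space of the data, so it converges to the interpolating solution closest in L2 to the initialization.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 10, 50                       # overparameterized: more weights than data points
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)
w0 = 0.01 * rng.standard_normal(d)  # small init, as is typical for NNs
w = w0.copy()

# Plain gradient descent on squared error.
for _ in range(50_000):
    w -= 1e-3 * X.T @ (X @ w - y)

# Closed form for the interpolating solution closest (in L2) to w0.
w_star = w0 + X.T @ np.linalg.solve(X @ X.T, y - X @ w0)

print(np.abs(X @ w - y).max())     # ~0: the data is fit exactly
print(np.linalg.norm(w - w_star))  # ~0: GD found the min-distance-from-init solution
```

Since w0 is tiny, "closest to w0" is nearly "smallest L2 norm," which is the implicit-regularization reading.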
I've found one of the main benefits of getting a virtual assistant type device (Alexa, Google Home) is allowing me to capture ideas by verbalizing them. This is especially useful if I'm falling asleep and don't want to pull out a notebook/phone.
This looks like me saying things like "Alexa, add 'is it meaningful to say that winning the lottery is difficult' to my todo list".
I went to a CFAR workshop more recently, so there might be some content that is slightly newer. Additionally, my sequence is not yet complete, and I am a worse writer.
The most important thing about reading any such sequence is to actually practice the techniques. I suggest reading the sequence that is most likely to get you to do that. If you think both are equally likely, I would recommend the Hammertime sequence.
copying my comment from https://www.lesswrong.com/posts/PX7AdEkpuChKqrNoj/what-are-your-greatest-one-shot-life-improvements?commentId=t3HfbDYpr8h2NHqBD
Note that this is in reference to voting on question answers.
> Downvoting in general confuses me, but I think that downvoting to 0 is appropriate if the answer isn't quite answering the question, but downvoting past zero doesn't make sense. Downvoting to 0 feels like saying "this isn't that helpful" whereas downvoting past 0 feels like "this is actively harmful".
My rough take: https://elicit.ought.org/builder/oTN0tXrHQ
3 buckets, similar to Ben Pace's
5% chance that current techniques just get us all the way there, e.g. something like GPT-6 is basically AGI
10% chance AGI doesn't happen this century, e.g. humanity sort of starts taking this seriously and decides we ought to hold off, combined with the problem being technically difficult enough that small groups can't really make AGI themselves
50% chance that something like current techniques and some number of new insights gets us to AGI.
If I thought about this for 5 additional hours, I can imagine assigning the following ranges to the scenarios:
This is personal to me, but I once took a class at school where all the problems were multiple choice, required a moderate amount of thought, and were relatively easy. I got 1/50 wrong, giving me a 2% base rate for making the class of dumb mistakes like misreading inequalities or circling the wrong answer.
This isn't quite a meta-prior, but it seemed sort of related?
One of my similar tools is trying to avoid keeping my phone in my pocket. Using my phone is a fine thing to do, but having my default state be "can use my phone within 5 seconds" is generally distracting and causes more phone use than necessary. For this reason, I own an iPod Touch: I need access to my calendar/todo-list at all times, but don't want to keep my phone on me.
An example of the benefits of examples: Eli Tyre once told me you know you have understood the point someone was trying to make if you can give them a better example of their point than the one they already had. I am not often able to do this in practice, but the general act of paraphrasing things and including examples has been very beneficial for me. One of my favorite questions to ask myself and others is "for example?"
This experiment reminded me of Scott's story about Alchemy, where each generation of Alchemists had to spend the first N years of their life learning, and could only make progress after they were caught up.
In the story, the art of Alchemy is advanced by running previous knowledge through layers of redactors, who make it faster for the alchemists to catch up to the current knowledge.
In the experiment, there seems to be some level of redaction that was attempted:
> where some of the subsequent rounds were spent cleaning up these notes and formulating clear next steps.
In the experiment it seemed there were some problems that couldn't be solved in the "participant's lifespan" of 10 minutes. I'm curious whether problems can go from "not solvable" to "solvable" if every Nth participant were explicitly instructed to focus only on organizing information and making it easier for the next participants to get up to speed quickly.
I'm imagining explicit instruction to be important because if the default mode is to code, the participant would be reading the document trying to get to a place where they can make progress, get frustrated, then decide to reorganize part of the document, which would cost them time. Explicit instruction to reorganize information seems potentially many times more efficient, especially if that participant is a skilled technical writer.
My experience with trying the above led to the discovery of a double crux that felt like it wasn't that useful for bridging the disagreement. If you think of disagreeing about whether to drink tea as "surface level" and disagreeing about whether tea causes cancer as "deeper", then the crux that was identified felt like it was "sideways."
The disagreement was about whether or not advertising was net bad. The identified crux was a disagreement about how bad the negative parts of advertising were. In some sense, if I changed my mind about how bad the negative parts about advertising were, then I _would_ change my mind about advertising - however, the crux seems like it's just a restatement of the initial disagreement.
The thing I think that would have helped with this is identifying multiple cruxes and picking the one that felt most substantive to continue with.
I was answering a bunch of questions from Open Phil's calibration test of the form "when did <thing> happen?". A lot of the time, I had no knowledge of <thing>, so I gave a fairly large confidence interval as a "maximum ignorance" type prediction (1900-2015, for example).
However, the fact that I have no knowledge of <thing> is actually moderate evidence that it happened "before my time".
Example: "when did <person> die?" If I was alive when <person> died, there's a higher chance of me hearing about their death. Thus not having heard of <person> is evidence that they died some time ago.
I like this post and think that I have personally gotten really large gains by just noticing various sensations.
A modification of the algorithm is to purposefully place yourself in situations where you experience the mental state instead of just imagining it. For instance, something I found valuable was opening reddit and just kind of sitting in the state of wanting-to-mindlessly-consume-content. The caveat is to not do this in a way that might be damaging.
To add to this, I've noticed something that I do when I'm programming is rely on various constructs/built-ins that I have a solid gears-level model of. I often find myself unwilling to use functions/constructions where I don't understand why they do what they do because I'm worried that the code might cause unintended effects that cause bugs, especially since debugging is extremely difficult without gears-level models.
I think the ideal programmer maintains probability distributions over whether or not they understand what various parts of their code are doing. If there is a bug, they then have a weighting over where the bug probably is, enabling faster debugging.
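As a toy sketch of that weighting (all numbers invented, purely to show the shape of the Bayesian update): keep a prior over where a bug could live, based on how shaky your model of each part is, then update on which test failed.

```python
# Toy Bayesian "where is the bug?" weighting. All numbers are invented
# for illustration; only the shape of the update matters.
prior = {                      # P(bug is here), from how shaky my model of each part is
    "well_understood_core": 0.1,
    "copied_regex":         0.4,  # I don't really get why this regex works
    "new_caching_layer":    0.5,
}
likelihood = {                 # P(this test fails | bug is here)
    "well_understood_core": 0.2,
    "copied_regex":         0.9,  # the failing test exercises string parsing heavily
    "new_caching_layer":    0.3,
}

# Bayes: posterior is proportional to prior * likelihood.
unnorm = {part: prior[part] * likelihood[part] for part in prior}
total = sum(unnorm.values())
posterior = {part: p / total for part, p in unnorm.items()}

for part, p in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(f"{part}: {p:.2f}")
```

The part you understand least, weighted by how consistent it is with the failure, is where you look first.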
The Pilot Hi-Tec-C Coleto 4 is the best pen of the 10+ I've used. The ink is very smooth, it's highly customizable, and it's narrow enough to fit comfortably in the hand. The Coleto 5 is too thick. The downside is that the ink in the individual cartridges runs out very quickly. This is mitigated by the fact that replacing ink cartridges in a pen feels exciting.
It's not a random walk among probabilities, it's a random walk among questions, which have associated probabilities. This results in a non-random walk downwards in probability.
The underlying distribution might be described best as "nearly all questions cannot be decided with probabilities that are as certain as 0.999999".
There is a difference between "error in calculation" and "error in interpreting the question". The former is roughly as likely to move the result up as down. If you err in interpreting the question, you're placing higher probability mass on other questions, which you are less than 0.999999 certain about on average. Roughly, I'm saying that you should expect regression-to-the-mean effects in proportion to the uncertainty. E.g. if I tell you I scored a 90% on a test whose average was 70%, then you expect me to score a bit lower on a test of equal difficulty. However, if I tell you that I guessed on half the questions, then you should expect me to score a lot lower than you would have if you assumed I guessed on 0 questions.
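To make the guessing example concrete (numbers invented): suppose 4 answer choices and 95% accuracy on questions I genuinely knew.

```python
# Invented numbers for the "I guessed on half the questions" example.
p_known = 0.95  # accuracy on questions I actually knew
p_guess = 0.25  # accuracy when guessing among 4 choices

expected_no_guessing = p_known                          # if you assume I guessed on nothing
expected_half_guessing = 0.5 * p_known + 0.5 * p_guess  # if I guessed on half

print(expected_no_guessing)    # 0.95
print(expected_half_guessing)  # 0.6
```

The more of my answer you learn was guesswork, the further toward chance your estimate of my score should regress.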
I don't know why the last comment is relevant. I agree that 1 in a million odds happen 1 in a million times. I also agree that people win the lottery. My interpretation is that it means "sometimes people say impossible when they really mean extremely unlikely", which I agree is true.
Not so. "X is guilty" is a very specific hypothesis, and 0.99999999 is Very Confident, so general increases in uncertainty should make you think it's less likely that "X is guilty" is true. For example, if I'm told I misread the question, I now have non-trivial probability mass on other questions, and since I will not be 0.99999999 confident on nearly any of those, I should become less confident.
The result is that it takes a specific misreading to make you more confident, while most misreadings will make you less confident, so on net you should become less confident.
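A quick numerical illustration (numbers invented): even a 1% chance of having misread the question caps overall confidence far below eight nines.

```python
# Invented numbers: a small misreading probability caps achievable confidence.
p_misread = 0.01            # chance I answered a different question than the one asked
conf_intended = 0.99999999  # confidence assuming I read the question correctly
conf_other = 0.9            # average confidence over questions I might have answered instead

overall = (1 - p_misread) * conf_intended + p_misread * conf_other
print(overall)  # ~0.999: the misreading term dominates the uncertainty
```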
When I was quite young, one of the guests at our house refused to eat processed food. I remember that I offered her some Fritos and she refused. I was fairly astonished, and young enough to be socially inept. I asked, incredulous, how someone could not like Fritos. To my surprise, she didn't brush me off or feed me banal lines about how different people have different tastes. She gave me the answer of someone who had recently stopped liking Fritos through an act of will. Her answer went something like this: "Just start noticing how greasy they are, and how the grease gets all over your fingers and coats the inside of the bag. Notice that you don't want to eat things soaked in that much grease. Become repulsed by it, and then you won't like them either."
Now, I was a stubborn and contrary child, so her ploy failed. But to this day, I still notice the grease. This woman's technique stuck with me. She picked out a very specific property of a thing she wanted to stop enjoying and convinced herself that it repulsed her.
I'm fine with uncertain answers if the response is qualified, e.g. "I did one-shot-things A, B along with non-one-shot-thing C and observed that Y problem was solved after. Subjectively, it feels like A solved most of the problem."
I agree that many answers aren't the sort of one-shot things that I was looking for.
Downvoting in general confuses me, but I think that downvoting to 0 is appropriate if the answer isn't quite answering the question, but downvoting past zero doesn't make sense. Downvoting to 0 feels like saying "this isn't that helpful" whereas downvoting past 0 feels like "this is actively harmful".
I frequently got trapped browsing the internet on my phone, so I removed the web browser from my phone. You would think that I would just reinstall the browser, but adding 5 extra seconds delay is apparently sufficient for me to have impulse control.
At the January 2020 CFAR workshop, we played this game extensively, although with 8+ people instead of the recommended 4. It took a while to get past even level 1 because of the amount of synchronization required. We also didn't play with stars or lives.
I can second the feeling of close calls being amazing. To quote someone I played with, "this is the most excited I've ever felt."
Here are some haiku-ish things that I wrote that attempted to capture the experience of playing.
Someone has blue eyes. 100 days pass. I have mud on my forehead.
The absence of counting. A groan of disappointment. The jabber didn't jab.
Moments accumulate. A flurry of plays. The sound of a jab.
Arms creeping forward. Sequentially numbered cards. A hum of satisfaction.
A standoff begins. Arms creeping forward. Signalling intensifies.
2) is something that I sort of thought about but not with as much nuance. I agree that such an infographic would be only useful for people who were looking for an alternate meal preparation strategy or something.
3) If it's true that people want to do meal-preppy type things but don't have enough to pay the upfront costs, there might be gains from 0-interest microloans, maybe via some MLM-type structure: I loan you money; then, once you've saved some money and paid me back, you loan other people money too.
I think this breaks because it results in people upvoting based on the title. I recall some study about how people did things they predicted they would do with higher than ~60% chance almost 95% of the time (numbers made up; I think I remember the direction and rough order of magnitude of the effect size, but I don't know if it survived the replication crisis).