Pattern's Shortform Feed 2019-05-30T21:21:23.726Z · score: 13 (3 votes)
[Accidental Post.] 2018-09-13T20:41:17.282Z · score: -6 (1 votes)


Comment by pattern on Two senses of “optimizer” · 2019-08-23T05:59:16.041Z · score: 1 (1 votes) · LW · GW

I meant the reason people think the jump is possible is because of evolution. (Evolution was just running, business as usual, and suddenly! We're on the moon!)

I have seen better versions of this argument that mention humans or evolutionary algorithms explicitly. And you're right - the argument isn't clear on that. The jump is 'a search for better plans will find better plans'.

Comment by pattern on Two senses of “optimizer” · 2019-08-23T05:43:04.100Z · score: 1 (1 votes) · LW · GW

The way SAT solvers work is by trying to satisfy the maximum number of constraints without conflict. With a finite number of constraints, the maximum is at most all of them, so the search stops. (If the people who write one assume there's a solution, then they probably don't include code to keep the (current) "best" assignment around.)

As Joar Skalse noted below, this could be considered intrinsic optimization. (Maybe if you wrote an SAT solver which consisted solely of trying random assignments for all variables at once until it happened to find one that satisfied all constraints it wouldn't be an optimizer, absent additional code to retain the best solution so far.)

In contrast, I haven't seen any work on this* - optimization is always described as finding solutions that fit all constraints. (The closest thing to an exception is Lagrange multipliers - that method produces a set of candidate solutions which includes the maximum/minimum, and then one has to check all of them to find which one it is. But all candidates satisfy the other constraints.)

The way it's described would have one think it maximizes a utility function - one which only returns 0 or 1. But the way the solver works is by assembling pieces toward that end - navigating the space of possible solutions in a fashion that will explore/eliminate every possibility before giving up.

*Which really stuck out once I had a problem in that form. I'm still working on it. Maybe work on chess AI includes this because the problem can't be fully solved, and that's what it takes to get people to give up on "optimal" solutions which take forever to find.
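A minimal sketch of the "keep the best partial solution" idea, assuming nothing about any real solver's internals - the `max_sat` function below is a made-up brute-force MAX-SAT loop, purely illustrative, not how production SAT solvers work:

```python
from itertools import product

def max_sat(clauses, n_vars):
    """Brute-force MAX-SAT: try every assignment and keep the one
    satisfying the most clauses (the 'best so far' described above).
    Clauses are lists of ints: 3 means x3 is true, -3 means x3 is false."""
    best_assignment, best_count = None, -1
    for bits in product([False, True], repeat=n_vars):
        count = sum(
            any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
            for clause in clauses
        )
        if count > best_count:  # retain the best solution so far
            best_assignment, best_count = bits, count
    return best_assignment, best_count
```

If the instance is satisfiable, this finds a full solution; if not, it still returns the assignment satisfying the most clauses - the "best" answer the comment says an ordinary solver might not bother to keep.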

Comment by pattern on Computational Model: Causal Diagrams with Symmetry · 2019-08-23T04:49:34.094Z · score: 1 (1 votes) · LW · GW
the DAG consists of an infinite sequence of copies of the same subcircuit.

If it's infinite then the program doesn't terminate. (In theory this may be the case if n is a non-negative integer.)

Ideally, we’d gain enough information to avoid needing to search over all possible programs.

Over all shorter possible programs.

(In principle, we could shoehorn this whole thing into traditional Solomonoff Induction by treating information about the internal DAG structure as normal old data, but that doesn’t give us a good way to extract the learned DAG structure.)

Sounds like SI is being treated as a black box.

We don’t care about graphs which can be generated by a short program but don’t have these sorts of symmetries.

Why do we like programs which use recursion instead of iteration?

Comment by pattern on Two senses of “optimizer” · 2019-08-22T03:11:08.372Z · score: 3 (2 votes) · LW · GW
The jump between the two is not explained.

Generalizing from Evolution.

Comment by pattern on Two senses of “optimizer” · 2019-08-22T03:09:34.338Z · score: 7 (6 votes) · LW · GW
But how is an SAT solver an optimizer? There’s not an implied goal as far as I can tell.

There's a bunch of things you want to fit in your backpack, but only so much space. Fortunately, you're carrying these things to sell, and you work in a business where you're guaranteed to sell everything you bring with you, and the prices are fixed. You write a program with the same goal as you - finding the combination which yields maximum profit.
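The backpack scenario is the classic 0/1 knapsack problem; a minimal dynamic-programming sketch (item sizes and prices are invented):

```python
def knapsack(capacity, items):
    """0/1 knapsack: items are (size, price) pairs; returns the
    maximum total price that fits in the backpack."""
    # best[c] = best price achievable with capacity c
    best = [0] * (capacity + 1)
    for size, price in items:
        # iterate capacity downward so each item is used at most once
        for c in range(capacity, size - 1, -1):
            best[c] = max(best[c], best[c - size] + price)
    return best[capacity]
```

Here the implied goal is explicit: maximize total price subject to the capacity constraint.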

Comment by pattern on Odds are not easier · 2019-08-21T15:32:54.437Z · score: 3 (2 votes) · LW · GW

If you roll a fair six-sided die once, there is a probability of 1/3 of rolling a "1" or a "2". While a probability (a number) is followed by a description of what happens, this information is interlaced within the odds:

1:2 means there's 1 set* where you get what you're looking for ("1" or "2") and 2 where you don't ("3" or "4", "5" or "6"). It can also be read as 1/3.

I tried to come up with a specific difference between odds and probability that would suggest where to use one or the other, aside from speed/comfort and multiplication versus addition**, and the only thing I came up with is that you used "0.25" as a probability where I'd have used "1/4".

*This relies on the sets all having equal probability.

**Adding 0.333... (3 repeating) to 0.25 isn't too hard: 0.58333... (3s repeating). Multiplying those sounds like a mess. (I do not want to multiply anything by 0.58333... ever. (58 + 1/3)/100 doesn't look a lot better. 7/12 seems reasonable.)

Multiplying with odds (as when updating prior odds by a likelihood ratio): 1:2 x 1:3 = 1:6 = 1/7.

Adding: 1:2 + 1:3 = ? Simply adding the parts (2:5) doesn't work. Over a common 12 worlds, 1:2 is 4 favorable and 1:3 is 3 favorable, so 7 favorable to 5 not: 7:5. Double checking: 1/3 + 1/4 = (4+3)/12 = 7/12.

a:b + c:d = (2ac + ad + bc):(bd - ac)
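Double-checking that arithmetic with exact fractions - a small sketch, with made-up function names:

```python
from fractions import Fraction

def odds_to_prob(a, b):
    """Odds a:b -> probability a/(a+b)."""
    return Fraction(a, a + b)

def prob_to_odds(p):
    """Probability p (a Fraction, kept in lowest terms) -> odds favorable:unfavorable."""
    return p.numerator, p.denominator - p.numerator

def add_odds(a, b, c, d):
    """Add the probabilities behind odds a:b and c:d; return the sum as odds."""
    return prob_to_odds(odds_to_prob(a, b) + odds_to_prob(c, d))

def update_odds(prior, likelihood_ratio):
    """Bayesian update: multiply odds componentwise."""
    return (prior[0] * likelihood_ratio[0], prior[1] * likelihood_ratio[1])
```

`add_odds(1, 2, 1, 3)` gives `(7, 5)`, matching 1/3 + 1/4 = 7/12, while `update_odds((1, 2), (1, 3))` gives `(1, 6)`, i.e. probability 1/7.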

Comment by pattern on Odds are not easier · 2019-08-21T15:16:07.216Z · score: 3 (2 votes) · LW · GW
So please, whatever you write, stop saying that odds are easier. They are possibly more intuitive to manipulate, but they need exactly the same amount of information.

But do they need the same amount of computation?

Comment by pattern on Why I Am Not a Technocrat · 2019-08-21T04:10:16.523Z · score: 1 (1 votes) · LW · GW

I'm interested in why the link was posted.

(I could read it via Chrome.)

I didn't find the linked post very interesting, mostly because

1) I don't see a lot of self-identified technocrats explaining their position - prior to reading this, I might have classified "RadicalxChange" as technocratic. Now I'd categorize them as "Technocrats who don't want to be called that."

2) Claims that aren't backed up, particularly of impossibility/intangibility, and similarly rejecting things without making it clear why (like consequentialism).

If an important task is impossible then we should give up. If it is possible, solutions seem like the place to start.

3) Saying 'one of the problems with Technocratic governance is focusing on legibility - with historically disastrous results.' And then following that up with 'We should fix Technocracy/governance by focusing on legibility...'

Comment by pattern on Unstriving · 2019-08-21T03:26:11.855Z · score: 3 (2 votes) · LW · GW

I'm not sure how well "Unstriving" captures this idea - though that might depend on what comes to mind when you hear the word "Striving". Is it:

  • Failing because you tried to do too many things? (1)
  • Failing most things because you sacrificed most of your goals/slack to trying to succeed at one of your goals? (2)
  • Failing because you tried too hard and pushed past the point of usefulness? (3)
  • Failing because you were unlucky*/didn't experiment enough/that one thing you were nervous about was something to be worried about/you did what you were supposed to do and it wasn't good enough? (4)
  • Failing because you did X and succeeded, then tried doing Y - which looks a lot like X - almost the same way, and, between the difference between X and Y and your approaches, it didn't pan out? (5)
  • Succeeding because you planned in advance so you didn't do 1. (6)
  • Succeeding because when the data/theory said "X is the worst possible thing" you did the opposite of X. (6)
  • Succeeding because you're Superman so you never fail. (7)
  • Failing because you're Superman, and there was Kryptonite in the building. (8)
  • Succeeding because you don't challenge yourself/aren't challenged enough. (9)
  • "Failing" because you challenged yourself. (10)
  • Failing/Succeeding because you did too much of 9/enough of 10. (11)

Basically, what's the problem with optimization?

*Unlucky can mean a lot of things.

Comment by pattern on Do We Change Our Minds Less Often Than We Think? · 2019-08-21T02:54:17.972Z · score: 1 (1 votes) · LW · GW

It also might be advice for timed tests. Even if the probability of getting a question right increases after it's scrutinized more closely, doing that before answering all the questions might not be the strategy with the highest expected score: if the extra time spent doesn't raise the expected score by more than answering an additional question would, then the opportunity cost is too high.*

*This may rely on all questions having equal value.
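As a toy illustration of that opportunity-cost comparison (hypothetical numbers, all questions worth one point):

```python
def better_use_of_a_minute(p_gain_from_rechecking, p_answer_new_question):
    """Toy timed-test decision: spend the next minute re-checking an
    answered question (raising its chance of being right by
    p_gain_from_rechecking) or answering a fresh question you'd get
    right with probability p_answer_new_question. With equal point
    values, just compare expected-point gains."""
    if p_gain_from_rechecking > p_answer_new_question:
        return "re-check"
    return "answer a new question"
```

For example, a 5% chance of fixing an old answer loses to a 50% shot at a new question; the ordering flips once re-checking gains more expected points.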

Comment by pattern on Davis_Kingsley's Shortform · 2019-08-21T02:33:06.773Z · score: 1 (1 votes) · LW · GW

In MTG the card "Browbeat" is pretty effective: any player can choose to take 5 damage, or you* get to draw 3 cards.

*Technically the target.

Comment by pattern on A Primer on Matrix Calculus, Part 2: Jacobians and other fun · 2019-08-15T20:09:50.023Z · score: 2 (2 votes) · LW · GW

The final set of images looks a bit like someone zooming in on a map*. (The blue part of the first image looks like the head of a cat.)

*ETA: specifically the yellow region. (Not because it's small.)

Comment by pattern on Partial summary of debate with Benquo and Jessicata [pt 1] · 2019-08-15T19:52:52.392Z · score: 1 (1 votes) · LW · GW
People are right/just to fear you developing political language if you appear to be actively

The end of this sentence appears to be missing.

More generally, I appreciate this post, and I think it's a good distillation - as someone who can't read what it's a distillation of.

I also think that evaluating distillation quality well is easier with access to the conversation/data being distilled.

Absent any examples of conversations becoming public, it looks like distillation is the way things are going. While I don't have any reason to suspect there are one or more conspiracies, given this:

It's *also* the case that private channels enable collusion

was brought up, I am curious how robust distillations are (intended to be) against such things, as well as how one goes about incentivizing "publishing". For example, I have a model where pre-registered results are better* because they limit certain things like publication bias. I don't have such a model for "conversations", which, while valuable, are a different research paradigm. (I don't have as much of a model for, in general, how to figure out the best thing to do, absent experiments.)

*"Better" in terms of result strength, and not necessarily the best thing (in a utilitarian sense).

Comment by pattern on Diana Fleischman and Geoffrey Miller - Audience Q&A · 2019-08-14T21:41:25.961Z · score: 1 (1 votes) · LW · GW

For those who prefer to read the post on Putanumonit:


The post says there are 13* comments, but only 6* can be seen.

*After this is posted: 14 and 7, respectively.

Comment by pattern on "Designing agent incentives to avoid reward tampering", DeepMind · 2019-08-14T20:43:42.432Z · score: 3 (4 votes) · LW · GW
I feel like this same set of problems gets re-solved a lot. I'm worried that it's a sign of ill health for the field.

Maybe the problem is getting everyone on the same page.

Comment by pattern on William_Darwin's Shortform · 2019-08-14T19:07:21.305Z · score: 5 (4 votes) · LW · GW

(This is meant to be purely illustrative, not taken seriously. Also, given how hard it was to come up with frames, it might be better to replace using lenses this way with 'questions that are always good to ask'.)

| Idea [1] | 1 | 2 | 3 |
| --- | --- | --- | --- |
| Lenses | Economic | Narrative [3] | Empiricism [4] |

[1] Movies these days seem to be lacking realism[2]/X.


Is there not an audience for realism/X? (Or is this a market failure?)

Are most movies produced by studios that aren't good at writing realism/X?

Is it more expensive to produce movies which are more realistic/X?

Harder to make money off of?

Is this the result of government regulation? Self-regulation?

Do the people involved in making movies (scriptwriters, directors, etc.) prefer less realistic**/X movies? Find it easier to make such movies?

[2] What is the alternative to realism that is more common?

(Realistically, the best way to make progress on a question like this is probably by unpacking 'What you mean by "realism"/X.')

[3] A struggle between the forces of good and evil.


Who controls Hollywood, good or evil? Both? Neither?

What reasons might they have for doing this?


It's easier to have the good guys win in movies if you're less realistic. It also delivers a particular message: 'you will win if you're good, no matter how ridiculous that sounds.'


It lulls people into a false sense of security. "All evil needs to prevail is every good person doing nothing." As it is hard to get people to do nothing, the nothing must be obscured by an illusion of doing something - thus, meaningless visual media, Netflix, etc.



Life isn't perfect. People go to the movies to get away from it all/see the people they agree with win. It doesn't have to make sense, it just has to be entertaining and end happily.


Movie makers don't care about realism. Conflicts of Good versus Evil, where the good guys always win, in movies that don't make sense aren't about Good versus Evil. They're just another opportunity for movie makers to set up their side as "Good" and the other side as "Evil". This is why movies today are getting political (to the detriment of their quality).

[4] Looking at data


Assess the quality of a sample of movies, perhaps across time periods, perhaps highly rated/popular movies.

Has the factor we're interested in changed over time?

(Is realism going down, up, in a cycle, or randomly - say, based on really popular movies coming out which do or don't have features (such as realism), and then more movies like that getting made, Y number of Years later.)

Have other factors? Are there any relationships in the data?

Comment by pattern on William_Darwin's Shortform · 2019-08-14T18:38:27.439Z · score: 3 (2 votes) · LW · GW

(Update: it might be better to just come up with a list of questions that are always good to ask/have been really useful in the past (in other domains) and use that instead. The chart is just 1) a row of such questions, and 2) a row where you add a checkmark after you've answered that question (about the topic you're trying to understand).)

A lens/frame/framework is a way of looking at things. I meant this to be a suggestion to see how a lens can be applied to other domains by constructing a chart/checklist as follows:

There are n+1 columns and 2 rows, where n is the number of lenses. The first column tells you what is in each row: its top element contains the name of your idea, and its bottom element can contain the word "lens". The bottom row (after the first element) contains the name you have given each lens. The top row (after the first element) contains either blank spaces or check marks*.

Coming up with a procedure to see if you've thought through all the implications of a model may also be useful.

*One could also put a page number in it, and write about that idea through that lens on that page.
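The chart described above can be sketched in a few lines of Python - the `lens_chart` helper below is hypothetical, purely to make the layout concrete:

```python
def lens_chart(idea, lenses, checked=()):
    """Render the two-row lens checklist as plain text: the top row is
    the idea name plus a check mark (or blank) per lens, the bottom row
    labels each column with its lens name."""
    top = [idea] + ["x" if lens in checked else "_" for lens in lenses]
    bottom = ["Lens"] + list(lenses)
    width = max(len(s) for s in top + bottom) + 2
    return "\n".join(
        "".join(cell.ljust(width) for cell in row) for row in (top, bottom)
    )
```

Calling it with an idea, a list of lens names, and the lenses already applied prints the n+1 columns by 2 rows layout directly.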

Comment by pattern on William_Darwin's Shortform · 2019-08-13T05:11:12.780Z · score: 4 (3 votes) · LW · GW

Great post, by the way - it was really useful to see all of these together as one system.

Comment by pattern on William_Darwin's Shortform · 2019-08-13T05:10:55.620Z · score: 2 (2 votes) · LW · GW
and experiences you interact with

Relatedly, I'd suggest 'Things you create or skills you have.'

you develop the ability to eliminate the channels/methods which present the most negative feedback.

The least useful feedback?

Start by developing ideas from activities you enjoy:

Try more things, see if you enjoy them. (Also, sometimes you can learn from things you don't like - a story with a specific form of bad storytelling might teach you something about the right way to tell stories.)

Structure of Information Flows Record every idea you have:


This section didn't have bold parts.

Idea Rate = Number of Ideas / Number of Opportunities

Why wouldn't "Idea Rate = Ideas per (Unit of Time)" ? It would seem one could increase the amount/rate of ideas they have, not only by increases their ideas per opportunity, but also by increasing their number of opportunities.

Constants: Number of Opportunities = 800
Ideas/Opportunity = Idea Rate = 20%
Good Ideas/Idea = Success Rate = 10%

I didn't understand this until I copied it here, and the formatting clicked, and then it all made sense.
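For reference, the quoted constants multiply out as a one-line sketch:

```python
def expected_good_ideas(opportunities, idea_rate, success_rate):
    """Good ideas = opportunities x (ideas per opportunity) x (good ideas per idea)."""
    return opportunities * idea_rate * success_rate
```

With the post's numbers, 800 opportunities x 20% x 10% comes to 16 expected good ideas.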

Arming yourself with a vast knowledge of any particular situation or topic gives you a better chance of coming up with the correct solution to a given problem because as your network of understanding grows,

(Emphasis added.)

Problems are important, as is coming up with them/having good sources for them.

Each individual you come in contact with is an opportunity to glean unique and valuable information from.

It might be useful to come up with frames and give them names and put them in a list, so you can do this:

[New Idea]: (check)

Frames [Frame 1] [Frame 2] etc.

Reduce Cognitive Delay[**]

Also see how you can implement an idea, particularly - quickly. (This one can be hard.)

More generally, reduce "delays", period. Having a model of the process can help with this. (It's possible the low-hanging fruit has been picked in communication technology on the raw speed front*, but it's useful to note how this can speed up things we do.)

*To such an extent that some may find it detrimental. One could compare the quality of comments on Twitter with the quality of letters, or the quality of moves in a live chess game versus one by post. (It's also easier to draw on paper.)

Looking for HARSH criticism

That's hard to do with such a good idea.

[**] "5. Gain Around Positive Feedback Loops, a. Find a receptive audience:"

The section above is related to the fact that "idea quality" can be subjective - coming up with ideas that sound great is all well and good, but reality is the final arbiter. (Though this kind of depends on what you're working on.) Finding ways to implement things or ideas/testing things out might help. I'd also ask where these "first principles" come from.

If you enjoy something, you might not learn as much. Consider the popular Lord of the Rings. Did you learn something by reading/watching it? The first time? The n-th time?

By engaging in more conversation about your ideas, you develop a better grasp of why you receive negative feedback about particular topics.

Eh. Can you really change the world with an idea that doesn't upset people?

This ties in to the idea that it is possible for you to produce a synthesis of contrarian idea.

This could use some elaboration.

Comment by pattern on Mesa-Optimizers and Over-optimization Failure (Optimizing and Goodhart Effects, Clarifying Thoughts - Part 4) · 2019-08-13T04:10:27.386Z · score: 3 (3 votes) · LW · GW
This isn't quite embedded agency, but it requires the base optimizer to be "larger" than the mesa-optimizer, only allowing mesa-suboptimizers, which is unlikely to be guaranteed in general.

Size might be easier to handle if some parts of the design are shared. For example, if the mesa-optimizer's design was the same as the agent, and the agent understood itself, and knew the mesa-optimizer's design, then it seems like them being the same size wouldn't be (as much of) an issue.

Principal optimization failures occur either if the mesa-optimiser itself falls prey to a Goodhart failure due to shared failures in the model, or if the mesa-optimizer model or goals are different than the principal's in ways that allow the metrics not to align with the principals' goals. (Abrams correctly noted in an earlier comment that this is misalignment. I'm not sure, but it seems this is principally a terminology issue.)

1) It seems like there's a difference between the two cases. If I write a program to take the CRT, and then we both take it, and we both get the same score (and that isn't a perfect score), because it solved them the way I solve them, that doesn't sound like misalignment.

2) Calling the issues between the agents because of model differences "terminology issues" could also work well - this may be a little like people talking past each other.

Lastly, there are mesa-transoptimizers, where typical human types of principle-agent failures can occur because the mesa-optimizer has different goals. The other way this occurs is if the mesa-optimizer has access to or builds a different model than the base-optimzer.

Some efforts require multiple parties being on the same page. Perhaps a self driving car that drives on the wrong side of the road could be called "unsynchronized" or "out of sync". (If we really like using the word "alignment" the ideal state could be called "model alignment".)

Comment by pattern on Intransitive Preferences You Can't Pump · 2019-08-11T03:48:50.432Z · score: 1 (1 votes) · LW · GW
But arbitrarily low probabilities need not exist.

I believe for practical purposes, "I (or you) buy a cheap lottery ticket, and if it's the winning ticket, then you pay me $1" is low enough.

Comment by pattern on Raemon's Scratchpad · 2019-08-11T03:31:45.514Z · score: 0 (1 votes) · LW · GW

Create a machine that creates lightning strikes.

Comment by pattern on Why do humans not have built-in neural i/o channels? · 2019-08-09T18:25:09.747Z · score: 2 (2 votes) · LW · GW

The first issue isn't humans abusing the system. It's opening your brain/etc. up to attack by parasites, to say nothing of disease.

And that would probably be an issue way before the system would be developed enough to have a lot of, if any, upsides from functionality, let alone downsides.

Comment by pattern on Why do humans not have built-in neural i/o channels? · 2019-08-09T01:32:13.533Z · score: 1 (1 votes) · LW · GW

I think the issue isn't trial and error making one (securely) - it's that it'd be expensive, and 2 parties would have to have it, to use it.

Comment by pattern on Why do humans not have built-in neural i/o channels? · 2019-08-09T01:25:16.180Z · score: 3 (2 votes) · LW · GW
What's the closest thing to this we see in any species?

I recommend asking this as its own question, and note that communication across species may be more interesting.

Comment by pattern on Help forecast study replication in this social science prediction market · 2019-08-07T19:26:36.800Z · score: 3 (3 votes) · LW · GW

Links still work, I think it's a cross-posting issue.


(Details: If you click on it, it goes to:

If you shorten that to:

It redirects to: )

Comment by pattern on In defense of Oracle ("Tool") AI research · 2019-08-07T18:06:49.058Z · score: 1 (1 votes) · LW · GW

What does it take for something to qualify as agent AI?

Consider something like Siri. Suppose you could not only ask for information ("What will the weather be like today?"), but you could also ask for action ("Call 911/the hospital"). Does this cross the line from "Oracle" to "Agent"?

Comment by pattern on In defense of Oracle ("Tool") AI research · 2019-08-07T18:00:05.631Z · score: 3 (2 votes) · LW · GW
when the capabilities of an unconstrained Agent AI will essentially always surpass those of an Oracle-human synthesis.

Nitpick: the capabilities of either a) unconstrained Agent AI/s, or b) artificial agent-human synthesis, will essentially always surpass those of an Oracle-human synthesis. We might have to work our way up to AIs being more effective without humans.

Comment by pattern on Raemon's Scratchpad · 2019-08-07T17:19:21.797Z · score: 1 (1 votes) · LW · GW

Perhaps degree of investment. Consider the amount of time it takes for someone to grow up, and the effort involved in teaching them (how to talk, read, etc.). (And before that, pregnancy.)

There is at least one book that plays with this - the protagonist finds out they were stolen from 'their family' as a baby (or really small child), and the people who stole them raised them, and up to that point they had no idea. I don't remember the title.

Comment by pattern on Open & Welcome Thread August 2019 · 2019-08-07T06:31:28.348Z · score: 1 (1 votes) · LW · GW

A more extreme version of the interpersonal is that (one might suppose) you could have two (otherwise*) identical universes such that in the first Bob answers "3" and in the second Bob answers "5", where both Bobs feel (currently) the same way (largely*), but a) used different reference points within their life, or b) focus on different things - perhaps in universe 1 Bob thinks or says "my life could be better - 3", but in universe 2 Bob thinks or says "my life could be worse** - 5".

*This might require some assumptions about Bob, that don't necessarily apply to everyone.

**Perhaps in universe 2, it occurs to Bob that he's never had cancer, or anything of similar magnitude.

Comment by pattern on Percent reduction of gun-related deaths by color of gun. · 2019-08-07T03:57:12.416Z · score: 1 (1 votes) · LW · GW
Assume all guns were pink by law tomorrow.

All guns, or all new guns made from here on out?

Comment by pattern on Do you do weekly or daily reviews? What are they like? · 2019-08-06T04:25:07.803Z · score: 3 (2 votes) · LW · GW
I have a bunch of low friction systems for taking notes which I can expand on if people are interested.

That sounds interesting.

Comment by pattern on Cephaloponderings · 2019-08-06T04:21:15.213Z · score: 4 (2 votes) · LW · GW
suggests they can't see in colour

They can't see color, or their eyes can't see color?

Comment by pattern on The Planning Problem · 2019-08-06T04:17:27.724Z · score: 2 (2 votes) · LW · GW

I should have split things up into multiple comments. Most (if not all) of that should be read as "I think this might be useful in practice* for planning/executing, rather than improving the model".

*Advice which, if followed, would or could have led to a) doing LaTeX sooner, b) changing how the math was handled or turned out, or making it less unpredictable w.r.t. time estimates, c) formatting the writing sooner.

Hopefully, it's clearer why it's impossible to go further without a good model for how tasks are sub-divided.

I suggested

1) that if writing the math* was a substantial piece which took longer than expected, then you might find it useful to have a personal model which says "Math* will take longer than I expect"/update future expectations based on the result - how things turned out for this post.

2) If you change the way you divide up tasks that might affect outcomes**, if not predictability.

*Or anything else this applies to.

**Such as how long things take, as well as how they turn out.

Comment by pattern on The Planning Problem · 2019-08-05T20:58:06.463Z · score: 1 (1 votes) · LW · GW

Specific commentary:

W(x) - V(x) = max(W(x) - V(x)) > 0

I didn't follow this step (locally). Max(a) = a: since max can only return a quantity it has been given, it needs at least 2 inputs to have the option of not returning its first input.

Globally, I didn't follow the math, and probably have a lot more to learn in this space.

More general commentary:

Section 1. (of this comment.)

we naturally make hierarchical or tree-like plans.

Did this plan need to be a tree? Or could it have been done as follows [1]:

(This is intended to address the 'Math took longer than expected' part)

1) write all of the article except the math (with a placeholder called "Math Section" or "Formalization).

2) The "Math Section" placeholder could be expanded to a description of what needed to be there.

(A very short version might be: "Let’s consider a semi-realistic example of a planning function.")


3) The (written parts of the) article could be formatted.

4) The Math Section could be written.

5) The math Section could be formatted.

This breaks down 'how long does it take to write this article' into three parts: a) 1-2. b) 3. c) 4-5. While 5 might be treated as its own (large) step, that distinction is the benefit of hindsight.

Hindsight which offers further variations - formatting the math section could be viewed as a different process from formatting the rest of the article: learning how to format with LaTeX.

Section 2. (of this comment.)

In the spirit of "plans are worthless, but planning is everything":

1) Outline article.

a) Write out what each section should contain.

b) Get an idea of what each section should look like. For everything, this means creating sections - a header and a sentence each, or something. For math, it additionally means writing one equation in LaTeX. [5]

Section 3 - Overview

The plan in section 1 was supposed to illustrate that 'how long will it take to write this article' could be broken up into 2 questions 'how long will it take to write the text' and 'how long will it take to do the math'. (And maybe after that, 'how long will it take to write the conclusion'.) If you correctly predicted the length of the first, but not the second, then splitting those seems like a good fix.

The application to plans in general is that when you discover a new sub-action A, the original time estimate T should be figured as T(doing everything except A) instead of T(doing everything). (This isn't quite perfect in the event there are other undiscovered actions - but LaTeX seems like a sub-action of A.)

What ultimately happened was that while I created an initial plan for the article, the unexpected addition of mathematical arguments broke all my estimates. I should've realized that formatting latex would take a long time since it was my first attempt.

The plan in section 2 was supposed to address whether the main issue was LaTeX. The more general rule is: do the thing that you've never done/know the least about how long it will take/will give you the most information (about how long things will take).

The plan in [4] addresses formatting, among other things, by suggesting loops in place of stages. This can be treated as more general planning/executing advice. It's interesting that similar things did occur in the proof:

Thus, we can use this tool and assume we can construct a sequence of plans {π_t}_{t≥0} and incur unit cost for each plan we create.

(for each sounds like a loop.)

To accommodate error in estimation we introduce a transition function to map π_t → π_{t+1}.

Though they were about the amount of time it takes to plan, rather than interleaving planning and execution (such as 'plan the next step', execute it, repeat until done).

I thought there'd be math in this section, but it seems I haven't formalized these models that much because I haven't (explicitly) tried them out experimentally yet.

Overall, I spent about 14 hours on this article while my original estimate was about 7.

I thought that this was Hofstadter's Law, but it doesn't seem to include the specifics I thought it did - namely the scale factor of 2, which seems to be applicable here. (Empirical study of this might be interesting.)

[1] I am aware that the end conclusion might rely on the results from the math section. This analysis is starting with a simpler case. (It is also worth noting that an idea might be expressed in multiple parts, such as n posts, rather than one post with n sections.)

[3] There are multiple ways of handling this - "A tree of what?" is the real question.

[4] This sounds a bit like a recursive process for article/post generation - a series of descriptions expanded in detail step by step, until complete. This does make more sense to handle as a tree [3], in order to enable getting things done within the desired amount of time (or length of post). While "Formatting the post" could be treated as a separate process from "writing the post", if the post is handled as a set of sections in this manner, as described in steps 1) and 2), and this [4], it might be easier to integrate formatting into writing the post.

[5] Some would break this down further, and have "go get a picture of an equation displayed via LaTeX" here, then work out how to use it later. A similar process is 'write the smallest/fastest version of the article that you can', then a) 'decide if you want to make it bigger/spend more time on it', b) 'make the smallest/fastest change that's self-contained'. (Such a process might have suggested doing LaTeX sooner, because you thought it wouldn't take long.)

Comment by pattern on Cephaloponderings · 2019-08-05T05:46:33.591Z · score: 2 (2 votes) · LW · GW
Perhaps most octopus parents have to die simply because they’re way too likely to eat their own offspring?

Or perhaps that part of the brain serves some other purpose? (Which could be tested by removing it earlier in life.) The advantages of one-shot mate selection aren't obvious. Unless there's some downside to having later clutches, or this is what a species implementing population control** looks like (the benefits of age capping aren't clear*), all that's left seems to be guessing that RNA evolution changes things.

*Unless you're trying to accelerate generation rate, or affect how evolution works. (Intuitively, getting rid of adults that are good at staying alive means slowing things down. Maybe it helps with making sure they don't compete with themselves too much?**)

**It's not super clear how that would evolve, unless there were conditions that drove the species without it to extinction, and we're looking at the survivors? Even that doesn't seem to explain the evolutionary pressure leading to all of them having the trait, though.

The only thing it seems to guarantee is a) no intergenerational breeding - of parents and children, and b) maybe regular generation lengths.

ETA: Or making sure they don't compete with, or teach, their young.

Comment by pattern on Inversion of theorems into definitions when generalizing · 2019-08-05T02:21:22.674Z · score: 1 (1 votes) · LW · GW

Was this intended to gesture at this process:

1) Mathematics (Axioms -> Theorems), 2) Reverse Mathematics (Theorems -> sets of axioms* from which they could be proved)

or this process:

1) See what may be proved in System A. 2) Create system B out of what was proved in system A, and prove things.

*made as small as possible
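The second process can be illustrated on a toy system. The Horn-clause encoding and the `derivable`/`minimal_axiom_sets` names below are assumptions made for the sketch, not actual reverse-mathematics machinery:

```python
from itertools import combinations

def derivable(axioms, goal):
    """Forward-chain over Horn clauses.

    Each axiom is a pair (premises, conclusion): once all premises are
    known, the conclusion becomes known.
    """
    known = set()
    changed = True
    while changed:
        changed = False
        for premises, conclusion in axioms:
            if conclusion not in known and set(premises) <= known:
                known.add(conclusion)
                changed = True
    return goal in known

def minimal_axiom_sets(axioms, goal):
    """Reverse mathematics in miniature: find all smallest-size subsets
    of the axioms from which the goal is derivable."""
    for size in range(len(axioms) + 1):
        hits = [subset for subset in combinations(axioms, size)
                if derivable(subset, goal)]
        if hits:
            return hits
    return []

# Toy usage: "B" follows either from {A, A->B} or from {B} directly;
# the minimal set is just the single axiom asserting B.
axioms = [((), "A"), (("A",), "B"), ((), "B")]
minimal = minimal_axiom_sets(axioms, "B")
```

The brute-force subset search is exponential, of course; the point is only to make the "theorem -> minimal axiom sets" direction concrete.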

Comment by pattern on To Spread Science, Keep It Secret · 2019-08-03T15:36:12.687Z · score: 1 (1 votes) · LW · GW
  And this effect is especially strong with information—we're much more likely to try to obtain information that we believe is secret, and to value it more when we do obtain it.

There is an additional benefit to the process - filtering. Today there is so much information that finding what you're looking for can be hard. And when the quality of sources varies so much, and can be difficult to judge, that driving a lack of interest does make sense. (As does forcing people to do a thing badly.)

For this world, I might recommend employing the Ikea effect - don't study X, build X/do what sounds fun in that space. Are there limits to what you can build? In this, Empiricism may be the way to go - the impossible hasn't been done yet, and you don't know until you've tried.

Perhaps in a world where calculus hadn't been invented it would be harder to reinvent - I still don't see why derivatives are important as a thing unto themselves, rather than as the limiting case of the difference quotient as h→0. But if you make something, you have not only a better grasp of it, and its context, but of where it does and doesn't work - yet. What someone else gives you, you may forget. What you have made once, you can make again - perhaps better this time.

Perhaps making a computer and an OS from scratch and programming languages and reinventing everything would take too long. But just as the journey of a thousand miles begins with a single step, the pursuit of growth that never ends may go far indeed.

Comment by pattern on Occam's Razor: In need of sharpening? · 2019-08-03T12:57:51.476Z · score: 4 (2 votes) · LW · GW
Can we do anything to make it more precise?

There are some posts on complexity measures, which make note of a) degrees of freedom, b) free variables, ways of penalizing them, and how to calculate these measures. They probably rely on formalization though.

Links: Model Comparison Sequence, What are principled ways for penalising complexity in practice?*, Complexity Penalties in Statistical Learning,

*This might be someone asking the same question as you. While the other links might hold an answer, this one has multiple.
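One concrete version of such a penalty is BIC, which adds k·ln(n) for k free parameters. The data and model choices below are made up purely for illustration:

```python
import math
import numpy as np

def bic(y, y_pred, k):
    """Bayesian Information Criterion: a fit term plus a penalty of
    k * ln(n) for the k free parameters. Lower is better."""
    n = len(y)
    rss = float(np.sum((y - y_pred) ** 2))
    return n * math.log(rss / n) + k * math.log(n)

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = 2 * x + 1 + rng.normal(0, 0.05, 50)  # truly linear data plus noise

scores = {}
for degree in (1, 5):
    coeffs = np.polyfit(x, y, degree)
    scores[degree] = bic(y, np.polyval(coeffs, x), k=degree + 1)
# The degree-5 fit has slightly lower error, but the parameter
# penalty should favor the degree-1 model on linear data.
```

This is the "penalizing free variables" idea in miniature: the extra four coefficients of the quintic must buy a large enough error reduction to pay a penalty of 4·ln(50), and on genuinely linear data they shouldn't.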

Comment by pattern on Rethinking Batch Normalization · 2019-08-03T12:41:56.328Z · score: 2 (2 votes) · LW · GW
And to top that off, they found that even in networks where they artificially increased ICS, performance barely suffered.

All networks, or just ones with batch normalization?

Comment by pattern on Proposed algorithm to fight anchoring bias · 2019-08-03T04:50:20.807Z · score: 3 (2 votes) · LW · GW

I wonder if running the algorithm would affect the performance/results of those who haven't been anchored.

Comment by pattern on How to Ignore Your Emotions (while also thinking you're awesome at emotions) · 2019-08-02T19:48:42.864Z · score: 3 (2 votes) · LW · GW
Do you have a source for that claim that most people's bodies are capable of it?

The assurances/anecdotes of someone who can:

If so, is there any good way to learn it?

"You are already able to move your ears. The trick is learning how to move just your ears, and not everything else with them. Practice with a mirror." (Paraphrased from memory.)

Comment by pattern on Mistake Versus Conflict Theory of Against Billionaire Philanthropy · 2019-08-02T17:34:52.423Z · score: 4 (2 votes) · LW · GW

Thank you for posting this post, and this last comment*. Both you and Scott made good points; I appreciated seeing this nuanced and interesting presentation of this other side of the issue. I really enjoy your blog.

*The heads up was appreciated:

and so no one expects further responses.
Comment by pattern on Dagon's Shortform · 2019-08-01T18:22:09.823Z · score: 1 (1 votes) · LW · GW

There's two sides of discussing incentives, wrt. X:

  • Incentivize X/Make tools that make it easier for people to do X [1].
  • Get rid of incentives that push people to not do X[2] /Remove obstacles to people doing X.

Even if alignment can't be created with incentives, it can be made easier. I'm also curious about how the current incentives on LW are a bad proxy right now.

[1] There's a moderation log somewhere (whatever that's for?), GW is great for formatting things like bulleted lists, and we can make Sequences if we want.

[2] For example, someone made a post about "drive by criticism" a while back. I saw this post, and others, as being about "How can we make participating (on LW) easier (for people it's hard for right now)?"

Comment by pattern on Mistake Versus Conflict Theory of Against Billionaire Philanthropy · 2019-08-01T18:12:58.882Z · score: 10 (5 votes) · LW · GW
We disagree that Scott’s post is a useful thing to write. I agree with everything he says, but expect it to convince less than zero people to support his position.

I found it useful, as someone who wasn't aware of the issue (having gotten this bad). I might also find further value in the discussion the post has led to. For example, this post I'm commenting on, has shed some light on (how people use the words) "conflict theorist" and "mistake theorist".

I also found its focus on the benefits of diverse approaches to problem solving useful - it's nice to read something where people take principles seriously, instead of using "virtue labels" when they like things and "vice labels" when they don't.

Thus, I expect the post to backfire.

Backfire how? Your arguments suggest a possible waste of time, not negative causal effects.

Comment by pattern on Gathering thoughts on Distillation · 2019-08-01T17:25:48.703Z · score: 1 (1 votes) · LW · GW

It almost[1] sounds like auto-generated sequences - that is, linked posts being collected together into a sequence, automatically - would be a good idea.

[1] This might make people more hesitant to link posts together, or create a mess - it might be hard to come up with algorithms that can tell when to make 1 sequence, and when to make 2.

The closest alternative seems like a general graph navigation system. (With post titles as nodes [2], links between linked posts, and possibly colors indicating direction (A linked to B is red if you have A selected, but blue if B is selected.))

[2] And a link for going to a post if you've selected that node.
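A minimal sketch of the grouping step, assuming links are available as a flat pair list (the `candidate_sequences` name and the input format are hypothetical; real site data would differ). Connected components give candidate sequences, which also shows why "1 sequence or 2" is hard: a single stray link merges two groups:

```python
from collections import defaultdict

def candidate_sequences(links):
    """Group linked posts into candidate auto-sequences via connected
    components of the (undirected) link graph.

    links: list of (post_a, post_b) pairs.
    """
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)
    seen, components = set(), []
    for node in graph:
        if node in seen:
            continue
        stack, component = [node], []
        while stack:  # depth-first traversal of one component
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            component.append(current)
            stack.extend(graph[current] - seen)
        components.append(sorted(component))
    return components

# Toy usage: A-B-C form one chain of links, X-Y another.
links = [("A", "B"), ("B", "C"), ("X", "Y")]
groups = candidate_sequences(links)
```

Here `groups` comes out as two candidate sequences; add one link ("C", "X") and they collapse into one, which is exactly the ambiguity footnote [1] worries about.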

at the bottom of a post

With regards to distillation, I'd note that if a post is really long this might not be ideal, especially for users who don't know their way around this site - those who do know their way around, and about the feature, can just click on the Comments link in the title, then scroll up to see the bottom of the post.

Comment by pattern on Another case of "common sense" not being common? · 2019-08-01T17:11:48.755Z · score: 1 (1 votes) · LW · GW
What is the relationship you see between common sense and surprisingly simple solutions to problems?

It sounds something like applying a pre-existing solution (as opposed to starting a completely new field).

Comment by pattern on How to Ignore Your Emotions (while also thinking you're awesome at emotions) · 2019-07-31T19:59:43.770Z · score: 1 (1 votes) · LW · GW

Great post. It also looks amazing on your blog - that picture goes with it well.

Is this related to risk aversion?

Comment by pattern on How to Ignore Your Emotions (while also thinking you're awesome at emotions) · 2019-07-31T19:52:56.750Z · score: 2 (2 votes) · LW · GW


the thing were you're legs... absorb shock?"
It's hard to know how to give queues that will lead to someone making the right mental/muscle connection.

Where your


Comment by pattern on Pattern's Shortform Feed · 2019-07-31T19:19:54.407Z · score: 4 (2 votes) · LW · GW

Thanks to this question, I recently started thinking about how progress on open problems in math [1] could be made faster, at least with regard to low hanging fruit. I made a comment there about modeling the problem (how can progress be made faster) and a possible solution. This brings me to a few questions:

  • Modeling the problem.
  • Solving the problem.
  • Is this a big enough deal that people want it solved? Or are people only interested in something like a) More narrow areas with obvious value being improved? b) The creation of a platform where people can put money on specific things that they want solved being solved/progress being made. c) Something else?
  • How to test all of the above (and implement where applicable).
  • Meta: Should these all be posted as separate Questions? What should they be called? Have any of these questions already been asked?

[1] They have a certain formal/empirical quality which makes things simpler. It also might be easier to use this as a metric for 'how good is our X [2] at advancing research (progress)'?

[2] Anything that could make a difference - a Platform, Organization, Program, a set of Math courses...