"Future of Go" summit with AlphaGo 2017-04-10T11:10:40.249Z · score: 3 (4 votes)
Buying happiness 2016-06-16T17:08:53.802Z · score: 38 (38 votes)
AlphaGo versus Lee Sedol 2016-03-09T12:22:53.237Z · score: 19 (19 votes)
[LINK] "The current state of machine intelligence" 2015-12-16T15:22:26.596Z · score: 3 (4 votes)
[LINK] Scott Aaronson: Common knowledge and Aumann's agreement theorem 2015-08-17T08:41:45.179Z · score: 15 (15 votes)
Group Rationality Diary, March 22 to April 4 2015-03-23T12:17:27.193Z · score: 6 (7 votes)
Group Rationality Diary, March 1-21 2015-03-06T15:29:01.325Z · score: 4 (5 votes)
Open thread, September 15-21, 2014 2014-09-15T12:24:53.165Z · score: 6 (7 votes)
Proportional Giving 2014-03-02T21:09:07.597Z · score: 10 (13 votes)
A few remarks about mass-downvoting 2014-02-13T17:06:43.216Z · score: 27 (42 votes)
[Link] False memories of fabricated political events 2013-02-10T22:25:15.535Z · score: 17 (20 votes)
[LINK] Breaking the illusion of understanding 2012-10-26T23:09:25.790Z · score: 19 (20 votes)
The Problem of Thinking Too Much [LINK] 2012-04-27T14:31:26.552Z · score: 7 (11 votes)
General textbook comparison thread 2011-08-26T13:27:35.095Z · score: 9 (10 votes)
Harry Potter and the Methods of Rationality discussion thread, part 4 2010-10-07T21:12:58.038Z · score: 5 (7 votes)
The uniquely awful example of theism 2009-04-10T00:30:08.149Z · score: 38 (48 votes)
Voting etiquette 2009-04-05T14:28:31.031Z · score: 10 (16 votes)
Open Thread: April 2009 2009-04-03T13:57:49.099Z · score: 5 (6 votes)


Comment by gjm on ML is an inefficient market · 2019-10-16T02:02:16.311Z · score: 14 (4 votes) · LW · GW

In my opinion this makes your post valueless.

(Not to say that you should explain what tools. But I think either saying nothing or being informative must be better than posting this as it is.)

Comment by gjm on [deleted post] 2019-10-15T13:35:12.526Z

There is no objective fact of the matter regarding moral standards. Rather, we want a moral system that can be widely adopted and that when widely adopted promotes things we find good.

A moral system that said "you have to spend every waking moment curing malaria and feeding the hungry" would probably either just make people feel burned out and miserable or else be rejected outright. Many imaginable and prima facie plausible moral systems turn out to say that. A moral system that said "just do whatever the hell you want" would probably lead to few people bothering to cure malaria and feed the hungry.

It seems plausible to me that a system that says "you should be making things better for others but it's fine to devote most of your time and energy and resources to your own welfare and that of your family" does, given human nature, actually roughly maximize net good done. I expect the optimum is more demanding than the average person's actual moral system, but probably not (much?) more demanding than the average effective altruist's.

Comment by gjm on Open & Welcome Thread - October 2019 · 2019-10-13T22:04:26.265Z · score: 4 (2 votes) · LW · GW

Immediately after the bit about monkeys there's this

The usual goal in the typing monkeys thought experiment is the production of the complete works of Shakespeare. Having a spell checker and a grammar checker in the loop would drastically increase the odds. The analog of a type checker would go even further by making sure that, once Romeo is declared a human being, he doesn’t sprout leaves or trap photons in his powerful gravitational field.

which feels like a bit of an own goal to me, because I suspect the analogue of a type checker would actually make sure that once Romeo is declared a Montague it's a type error for him to have any friendly interactions with a Capulet, thus preventing the entire plot of the play.
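To make that concrete, here's a toy runtime version of the stricter "house-aware" checker I'm imagining — entirely my own illustrative construction, not anything from the linked post:

```python
class Montague:
    pass

class Capulet:
    pass

def befriend(a, b):
    # The "type checker": friendships are only well-typed within a house.
    if type(a) is not type(b):
        raise TypeError("cross-house friendship is ill-typed")
    return f"{type(a).__name__} befriends {type(b).__name__}"

befriend(Montague(), Montague())   # fine
# befriend(Montague(), Capulet())  # TypeError: the entire plot is rejected
```

A static checker enforcing the analogous signature would flag Romeo and Juliet's first meeting before the play could even be run.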

Comment by gjm on Categories: models of models · 2019-10-10T03:28:22.783Z · score: 4 (2 votes) · LW · GW

Let's take a somewhat-concrete example. Your post mentions birds. OK, so let's consider e.g. a model of birds flying in a flock, how they position themselves relative to one another, and so on. You suggest that we consider the birds as objects: so far, so good. And then you say "they do stuff like fly, tweet, lay eggs, eat, etc. I.e., verbs (morphisms)." For the purpose of a flocking model, the most relevant one of those is flying. How are you going to consider flying as a morphism in a category of birds? If A and B are birds, what is this morphism from A to B that represents flying? I'm not seeing how that could work.

In the context of a flocking model, there are some things involving two birds. E.g., one bird might be following another, tending to fly toward it. Or it might be staying away from another, not getting too close. Obviously you can compose these relations if you want. (You can compose any relations whose types are compatible.) But it's not obvious to me that e.g. "following a bird that stays away from another bird" is actually a useful notion in modelling flocks of birds. It might turn out to be, but I would expect a number of other notions to be more useful: you might be interested in some sort of centre of mass of a whole flock, or the density of birds in the flock; you might want to consider something like a velocity field of which the individual birds' velocities are samples; etc. None of these things feel very categorical to me (though of course e.g. velocities live in a vector space and there is a category of vector spaces).

Maybe flocking was a bad choice of example. Let's try another: let the birds be hens on a farm, kept for breeding and/or egg-laying. We might want to understand how much space to give them, what to feed them, when to collect their eggs, whether and when to kill them, and so on. Maybe we're interested in optimizing taste or profit or chicken-happiness or some combination of those. So, according to your original comment, the birds are again objects in a category, and now when they "lay eggs, etc., etc." these are morphisms. What morphisms? When a bird lays an egg, what are the two objects the morphism goes between? When are we going to compose these morphisms and what good will it do us?

How does it actually help anything to consider birds as objects of a category?

Here's the best I can do. We take the birds, and their eggs, and whatever else, as objects in a category, and we somehow cook up some morphisms relating them. The category will be bizarre and jury-rigged because none of the things we care about are really very categorical, but its structure will somehow correspond to some of the things about the birds that we care about. And then we make whatever sort of mathematical or computational model of the birds we would have made without category theory. So now instead of birds and eggs we have tuples (position, velocity, number of eggs sat on) or objects of C++ classes or something. Now since we've designed our mathematical model to match up, kinda, to what the birds actually do, maybe we can find a morphism between these two jury-rigged categories corresponding to "making a mathematical model of". And then maybe there's some category-theoretic thing we can do with this model and other mathematical models of birds, or something. But I gravely doubt that any of this will actually deliver any insight that we didn't ourselves put into it. I'd be intrigued to be proved wrong.
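For comparison, the plain non-categorical model is genuinely cheap to write down; a minimal sketch with birds as (position, velocity) tuples and a made-up cohesion constant, purely for illustration:

```python
# Minimal flocking sketch: each bird is just a (position, velocity) pair,
# with positions and velocities as 2-vectors. No category theory required.
def centre_of_mass(birds):
    n = len(birds)
    return [sum(pos[i] for pos, _ in birds) / n for i in range(2)]

def step(birds, cohesion=0.01, dt=1.0):
    com = centre_of_mass(birds)
    new = []
    for pos, vel in birds:
        # steer slightly toward the flock's centre of mass
        vel = [v + cohesion * (c - p) for v, c, p in zip(vel, com, pos)]
        pos = [p + dt * v for p, v in zip(pos, vel)]
        new.append((pos, vel))
    return new

flock = [([0.0, 0.0], [1.0, 0.0]), ([10.0, 0.0], [0.0, 0.0])]
flock = step(flock)
```

The centre of mass and the velocity updates here are ordinary vector arithmetic; nothing in the model is crying out to be an object or a morphism.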

Comment by gjm on Categories: models of models · 2019-10-10T03:06:36.842Z · score: 10 (6 votes) · LW · GW

I'm really not convinced by this framing in terms of "objects doing things to other objects".

Let's take a typical example of a morphism: let's say f : Z+ → R (note for non-mathematicians: that is, f is a function that takes a positive integer and gives you a real number) given by f(n) = √n. How is it helpful to think about this as Z+ doing something to R? How is it even slightly like "Alice pushes Bob"? You say "Every model is ultimately found in how one object changes another object" -- are you saying here that the integers change the real numbers? Or vice versa? (After that's done, what have the integers or the real numbers become?)

The only thing here that looks to me like something changing something else is that f (the morphism, not either of the objects) kinda-sorta "changes" an individual positive integer n to which it's applied (an element of one of the objects, again not either of the objects) by replacing it with its square root.

But even that much isn't true for many morphisms, because they aren't all functions and the objects of a category don't always have elements to "change". For instance, there's a category whose objects are the positive integers and which has a single morphism from m to n if and only if m ≤ n; when we observe that 5 ≤ 9, is 5 changing 9? or 9 changing 5? No, nothing is changing anything else here.
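For concreteness, this thin category is easy to write out in full; a small illustrative sketch (a morphism carries no data beyond its endpoints, so composition is just transitivity of ≤):

```python
# The category whose objects are positive integers, with exactly one
# morphism m -> n when m <= n. Represent that morphism as the pair (m, n).
def hom(m, n):
    """Return the unique morphism m -> n if it exists, else None."""
    return (m, n) if m <= n else None

def compose(f, g):
    """Compose f: a -> b with g: b -> c to get the morphism a -> c."""
    assert f[1] == g[0], "morphisms not composable"
    return (f[0], g[1])

f = hom(5, 9)             # the one morphism 5 -> 9
assert hom(9, 5) is None  # and there is no morphism 9 -> 5
```

Nothing in this data is "doing" anything to anything; a morphism here is just a witness that a relation holds.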

So far as I can see, the only actual analogy here is with the bare syntactic structure: you can take "A pushes B" and "A has a morphism f to B" and match the pieces up. But the match isn't very good -- the second of those is a really unnatural way of writing it, and really you'd say "f is a morphism from A to B", and the things you can do with morphisms and the things you can do with sentences don't have much to do with one another. (You can say "A pushes B with a stick", and "A will push B", and so forth, and there are no obvious category-theoretic analogues of these; there's nothing grammatical that really corresponds to composition of morphisms; if A pushes B and B eats C, there really isn't any way other than that to describe the relationship between A and C, and indeed most of us wouldn't consider there to be any relationship worth mentioning between A and C in this situation.)

Comment by gjm on What are your strategies for avoiding micro-mistakes? · 2019-10-06T20:23:12.962Z · score: 4 (2 votes) · LW · GW

This also helps to train your intuition, in the cases where careful calculation reveals that in fact the intuitive answer was wrong.

Comment by gjm on What is category theory? · 2019-10-06T15:00:22.902Z · score: 17 (6 votes) · LW · GW

It seems a bit odd to offer lambda calculus as an example of how category theory is useful in computing, when lambda calculus predates category theory by about a decade (1932 to 1942).

Comment by gjm on Open & Welcome Thread - October 2019 · 2019-10-06T09:57:11.984Z · score: 2 (1 votes) · LW · GW

It's more usual for topology to motivate category theory than the other way around. (That's where category theory originally came from, historically.)

Comment by gjm on Honoring Petrov Day on LessWrong, in 2019 · 2019-09-27T19:33:42.476Z · score: 11 (4 votes) · LW · GW

It seems extremely unfortunate that the terminology apparently shifted from "counterfactually valid" (which means the right thing) to "counterfactual" (which means almost the opposite of the right thing).

Comment by gjm on Honoring Petrov Day on LessWrong, in 2019 · 2019-09-27T19:26:17.293Z · score: 3 (2 votes) · LW · GW

I would be interested to know how you see spite as "not necessarily negative".

Comment by gjm on Honoring Petrov Day on LessWrong, in 2019 · 2019-09-26T22:34:21.985Z · score: 5 (4 votes) · LW · GW

I don't see the big shiny red button on the front page. If I visit LW in private mode, it's there. I have the map turned off. I haven't tried logging out or turning the map back on. I'm guessing that when Ben says it's "over the frontpage map" that means it's implemented in a way that makes it disappear if the map isn't there. That seems a bit odd, though it probably isn't worth the effort of fixing.

(I have a launch code but hereby declare my intention not to use it. I am intrigued by the discussions of trading launch codes, or promises to use or not use them, for valuable things like effective charitable donations, but am not interested in taking either side of any such trade.)

Comment by gjm on Open & Welcome Thread - September 2019 · 2019-09-08T15:17:18.058Z · score: 3 (2 votes) · LW · GW

Aha, thanks. Sorry for being grumpy about it! (I hadn't known there was a profile setting to turn it off.)

Comment by gjm on Open & Welcome Thread - September 2019 · 2019-09-07T19:54:24.904Z · score: 4 (2 votes) · LW · GW

LW admins -- Is there a good reason why half my browser window when viewing the LW home page needs to be taken up with an enormous map? It's pretty horrible (and somehow pushes the same mental buttons as those whole-screen "why not sign up for our mailing list?" popups some sites give you, though obviously it's not actually very similar to those). I guess the idea is to encourage more people to go to meetups or something, but I promise it does not make me the least bit more inclined to do so.

Comment by gjm on Predicted AI alignment event/meeting calendar · 2019-08-15T18:50:13.704Z · score: 5 (3 votes) · LW · GW

Dunno. I don't think the way it is does any actual harm. Maybe something with "meetings" in it, as per Teerth Aloke's suggestion.

Comment by gjm on Predicted AI alignment event/meeting calendar · 2019-08-15T00:26:16.274Z · score: 7 (4 votes) · LW · GW

Somehow the word "predicted" in the title (as opposed to, say, "future" or "planned") led me to expect entries for things like "OpenAI releases explicit model of human utility function" and "Entire mass of planet earth converted to paperclips"...

Comment by gjm on Rethinking Batch Normalization · 2019-08-03T13:33:53.992Z · score: 4 (3 votes) · LW · GW

The Lipschitz constant of a function gives an indication of how horizontal it is rather than how locally linear it is. Naively I'd expect that the second of those things matters more than the first. Has anyone looked at what batch normalization does to that?

More specifically: Define the 2-Lipschitz constant of a function f at x to be something like inf_L sup_{y≠x} |f(y) − f(x) − L(y−x)| / |y−x|², where L ranges over linear maps, and its overall 2-Lipschitz constant to be the sup of these over x. This measures how well f is locally approximable by linear functions. (I expect someone's already defined a better version of this, probably with a different name, but I think this'll do.) Does batch normalization tend to reduce the 2-Lipschitz constant of the loss function?

[EDITED to add:] I think having a 2-Lipschitz constant in this sense may be equivalent to having a derivative which is a Lipschitz function (and the constant may be its Lipschitz constant, or something like that). So maybe a simpler question is: For networks with activation functions making the loss function differentiable, does batchnorm tend to reduce the Lipschitz constant of its derivative? But given how well rectified linear units work, and that they have a non-differentiable activation function (which will surely make the loss functions fail to be 2-Lipschitz in the sense above) I'm now thinking that if anything like this works it will need to be more sophisticated...
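As a sanity check on the definition, here's a numerical sketch (my own, with a deliberately simple test function) estimating the pointwise 2-Lipschitz quantity in one dimension, using f'(x) as the linear approximant, which is the optimal choice for smooth f:

```python
def two_lipschitz_at(f, dfdx, x, radius=1.0, samples=1000):
    """Numerically estimate sup_y |f(y) - f(x) - f'(x)(y - x)| / (y - x)^2
    over y within `radius` of x, taking f'(x) as the linear approximant."""
    best = 0.0
    for i in range(1, samples + 1):
        h = radius * i / samples
        for y in (x - h, x + h):
            ratio = abs(f(y) - f(x) - dfdx(x) * (y - x)) / (y - x) ** 2
            best = max(best, ratio)
    return best

# For f(x) = x^2 the ratio is exactly f''/2 = 1 for every x and y,
# which the estimate should recover up to floating-point noise.
est = two_lipschitz_at(lambda t: t * t, lambda t: 2 * t, x=3.0)
```

One could run the same estimator on a small network's loss surface with and without batchnorm, though that experiment is beyond this sketch.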

Comment by gjm on Why Subagents? · 2019-08-03T13:17:57.786Z · score: 2 (1 votes) · LW · GW

Consider a pizza-eating agent with the following "grass is always greener on the other side of the fence" preference: it has no "initial" preference between toppings but as soon as it has one it realises it doesn't like it and then prefers all other not-yet-tried toppings to the one it's got (and to others it's tried).

There aren't any preference cycles here -- if you give it mushroom it then prefers pepperoni, but having switched to pepperoni it then doesn't want to switch back to mushroom. If our agent has no opinion about comparisons between all toppings it's tried, and between all toppings it hasn't tried, then there are no outright inconsistencies either.

Can you model this situation in terms of committees of subagents? Can you do it without requiring an unreasonably large number of subagents?
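To verify the no-cycles claim mechanically: every preferred switch strictly enlarges the set of tried toppings, so no chain of switches can ever revisit a state. A small sketch of that check (topping names arbitrary, my own illustration):

```python
TOPPINGS = frozenset({"mushroom", "pepperoni", "olive"})

def preferred_switches(current, tried):
    """The agent prefers any not-yet-tried topping over the current one."""
    return [(t, tried | {t}) for t in TOPPINGS - tried]

def find_cycle(current, tried, visited):
    """Depth-first search for a preference cycle: returns True if some
    chain of preferred switches revisits an earlier state."""
    state = (current, tried)
    if state in visited:
        return True
    for nxt, nxt_tried in preferred_switches(current, tried):
        if find_cycle(nxt, nxt_tried, visited | {state}):
            return True
    return False

# Start having just tried mushroom: no cycle is ever found.
assert not find_cycle("mushroom", frozenset({"mushroom"}), frozenset())
```

The search terminates because the tried-set is monotonically growing, which is exactly why the preferences are acyclic despite being unstable.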

Comment by gjm on Shortform Beta Launch · 2019-07-29T00:21:01.038Z · score: 2 (1 votes) · LW · GW

The MVP described here doesn't seem functionally any different from an open thread.

The future features clearly go beyond that, and the current MVP seems a reasonable stepping stone towards those. But ... is it worth considering just adding those features to comments generally, or comments in threads with some special flag set (which would then need to be set on the open threads), rather than introducing a whole new thing?

(I'll hazard a guess that that's actually roughly how the current implementation works.)

I'm thinking, e.g., that "convert a comment into a full post" might be something people sometimes want to do to comments anywhere, not just ones they called shortform posts. And that it's not entirely impossible that someone might want to be able to subscribe to a feed of all of some other user's comments, though that seems a bit extreme.

Comment by gjm on How to take smart notes (Ahrens, 2017) · 2019-07-25T18:16:44.139Z · score: 11 (2 votes) · LW · GW

Apparently "slip box" is roughly equivalent to "card index" and Luhmann's system is as follows:

  • Make notes on small cards / pieces of paper.
  • Don't attempt to categorize them with things like alphabetical order of subject or Dewey decimal notation.
  • Give them all unique identifiers, and allow these to have a "nested" structure when one note leads to others which lead to others.
  • Cross-link them by adding to each note references (via those unique IDs) to other notes that you know are related to it.

Obviously something very similar could be done on a computer, with many practical advantages over the version made out of pieces of paper.

I have a suspicion that Luhmann's alleged great productivity ("alleged" only because I haven't verified for myself) is best ascribed either (1) to things other than his use of a card-index system or (2) to idiosyncratic things about _how_ he used it that are not captured by what I wrote above or by the contents of the post here...

Comment by gjm on How to take smart notes (Ahrens, 2017) · 2019-07-25T18:08:45.776Z · score: 11 (2 votes) · LW · GW

Unless I am missing something, this post never actually says what a slip-box is or how to make and use one. What then am I supposed to do with the advice that "to start the habit of using slip-boxes" I should "start by making literature notes"? I can make some literature notes ... and what then? What do I do with them for them to be slip-box notes?

It seems like the single most important piece of information here is being wilfully withheld...

Comment by gjm on 87,000 Hours or: Thoughts on Home Ownership · 2019-07-07T02:35:26.345Z · score: 6 (3 votes) · LW · GW

I'm sure you took into account things like house price appreciation. But what you said about investing the difference between what renters pay and what owners pay was misleading and wrong. You can do as careful and accurate and insightful a simulation as you please, but if what you say is "people pay more in mortgage + fees + taxes than in rent, which is bad because they could have invested the difference in the market and then they'd have got some actual returns" or "rent is bad because it's just throwing money away" then you are making a broken argument and I think you shouldn't do that. Not even if the conclusion of the broken argument happens to be the same as the conclusion of the careful accurate insightful simulation.

The question isn't whether back-testing is hard, it's how well you did it and whether whatever assumptions you made seem reasonable to any given reader. Again, my complaint isn't that your final results are bad, it's that we have no way of telling whether your final results are good or bad because you didn't show us any of the information we'd need to decide.

[EDITED to add:] This is all coming out more aggressive-sounding than I would like, and I hope I'm not giving the impression that I think the OP is terrible or that you're a bad person or any such thing. It just seems as if your responses to my comments aren't engaging with the actual points those comments are trying to make. That may be wholly or partly because of lack of clarity in what I've written, in which case please accept my apologies.

Comment by gjm on 87,000 Hours or: Thoughts on Home Ownership · 2019-07-07T01:40:46.917Z · score: 8 (4 votes) · LW · GW

Repayment versus interest etc.: Looking at total money in and total money out is fine so long as you do it carefully; in this case "carefully" means you can't just say "If that money were invested in the market it would be earning you a return" as if those mortgage payments aren't earning any return. I don't know how a typical homeowner's payments divide between repayment, interest, and other things, but it seems quite possible to me that more than the difference you're talking about is repayment, in which case it could even be that the higher returns you (might hope to) get from investing the money in index funds rather than housing are more than counterbalanced by the fact that more of the money is being invested.

Of course the right way to answer this is to do the actual calculations, and you very reasonably suggest that your readers go and find a rent-versus-buy calculator and try to do them. Nothing wrong with that. But I think the way you describe the situation is no more accurate than the "if you rent you're throwing the money away" line one hears from people arguing on the opposite side.

Stock market investors: If those numbers for hypothetical investors A,B,C come from a simulation you did then I think the article should say so, and should say something about what assumptions you made. As it stands, you're just asking us to take them on faith. I don't find them terribly implausible, and your simulations may be excellent, but we can't tell that.

Anecdotal nonsense: Yup, the forced savings thing is a good point (and presumably not very applicable to anyone who's bothering to read a lengthy article about the financial merits of renting versus buying). My suspicion is that the "poorer people rent, richer people buy" dynamic is an even bigger reason why my observations aren't much evidence about what any given person here should do. But I don't think the counterfactual is relevant here, because the observation I was drawing attention to wasn't "buyers are doing OK" but "buyers are doing better than renters".

Comment by gjm on Opting into Experimental LW Features · 2019-07-07T01:20:12.890Z · score: 2 (1 votes) · LW · GW

I'm not feeling any glaring lacks. Of course it's possible that there are possible changes that once made would be obvious improvements :-).

I do use the "recent discussion" section. I actually don't mind the collapsing there -- it's not trying to present the whole of any discussion, and clearly space is at a big premium there, so collapsing might not be a bad tradeoff.

Comment by gjm on 87,000 Hours or: Thoughts on Home Ownership · 2019-07-06T21:33:06.363Z · score: 7 (4 votes) · LW · GW

There is a lot of good sense in this article, but I have some problems with it. Perhaps the biggest: It appears to be written as if a house is only an investment; as if the buy-or-rent decision is made entirely on the basis of what will maximize your future overall wealth. I'm all in favour of maximizing wealth (all else being equal, at any rate) but a house is not only an investment; it is, as you point out, also a place where you are likely to be spending ~87k hours of your life, and you may reasonably choose to do something that makes you poorer overall if those 87k hours are substantially more pleasant for it.

Those non-financial factors aren't all in favour of buying over renting. The article mentions one that goes the other way, though of course only in the context of what it means for your financial welfare: if you rent and don't own, then it's often easier to move. Here are some others:

  • Buy, because if you rent there are likely to be a ton of onerous restrictions on what you're allowed to do. No pets! No children! No replacing the crappy kitchen appliances! No changing the horrible carpets!
  • Buy, if the properties available to buy (where you are) are more to your taste than the ones available for rent. Or: Rent, if the properties available to rent are more to your taste than the ones available to buy.
  • Buy, because if you rent then your landlord can kick you out on a whim. (Exactly how readily they can do that varies from place to place, but they always have some ability to do it and it's a risk that basically doesn't exist if you buy.)
  • Buy, if you get a sense of satisfaction or security from owning the place you live in. Or: Rent, if you find that being responsible for the maintenance of the place feels oppressive.

Some other quibbles.

People spend on average 50% more money on mortgages, taxes, and fees than they spend on rent. If that money were invested in the market it would be earning you a return.

Does that include mortgage repayment as well as mortgage interest? Mortgage interest, taxes, and fees are straightforwardly "lost" in the same sort of way as rent is; mortgage repayment, though, is just buying a house slowly. That money is invested, in real estate rather than in equities, and it is earning you a return. (Possibly a smaller, more volatile, and/or less diversified return, for sure; I'm not disagreeing with that bit.)
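To make the repayment/interest split concrete, here is the standard fixed-rate amortization arithmetic, with purely illustrative numbers (a $300k loan at 4% over 30 years; none of these figures come from the post):

```python
def monthly_payment(principal, annual_rate, years):
    """Standard fixed-rate mortgage payment formula."""
    r = annual_rate / 12
    n = years * 12
    return principal * r / (1 - (1 + r) ** -n)

def first_month_split(principal, annual_rate, years):
    """Split the first month's payment into interest and repayment."""
    pay = monthly_payment(principal, annual_rate, years)
    interest = principal * annual_rate / 12
    return interest, pay - interest

interest, repayment = first_month_split(300_000, 0.04, 30)
```

Early on most of the payment is interest ($1000 of a roughly $1430 payment here), but the remaining ~$430 is repayment, i.e. money invested in the house rather than lost.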

let's look at three examples of people trying to time the market. All three people invest at a rate of $200/mo. [...]

OK, let's look. Where are the actual numbers? All I see is vague descriptions (e.g., A invests immediately before each crash, but how much do they invest on each occasion? An amount derived somehow from that $200/mo, but how? E.g., is the idea that the money sits there until a crash is about to happen and then however much is available gets invested, or what?) and final numbers with no explanation or justification or anything. For all I can tell, romeostevensit might just have made those numbers up. I bet he didn't, but without seeing the details no one can tell anything.

Also: $200/mo 40 years ago is a very different figure from $200/mo now. Does anyone do anything at all like investing at a constant nominal rate over 40 years? It seems unlikely. If instead of "$200/month" you make it "whatever corresponds to $200/month now, according to some standard measure of inflation" then the numbers will look quite different. (For the avoidance of doubt, I expect that A,B,C would still come out in the same order.)
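The gap between constant-nominal and constant-real contributions is easy to quantify; a sketch assuming a flat 3% inflation rate (my number, purely for illustration):

```python
def total_contributed(monthly, years, inflation=0.0):
    """Total paid in when the monthly contribution grows with inflation."""
    total = 0.0
    m = monthly
    for year in range(years):
        total += m * 12
        m *= 1 + inflation
    return total

nominal = total_contributed(200, 40)               # flat $200/mo for 40 years
real = total_contributed(200, 40, inflation=0.03)  # $200/mo in today's terms
```

At 3% inflation the adjusted stream pays in nearly twice as much over 40 years, so the headline numbers for A, B, and C really would look quite different.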

[EDITED to add:] Anecdotally, the people I know who have bought houses generally seem to have done OK, largely independent of when they've bought; the people I know who have rented seem to have had no end of problems. But (1) this is a small and doubtless unrepresentative sample, and (2) I think richer people are more likely to buy and poorer people more likely to rent (especially as I live somewhere where buying is generally assumed to be something you want to do), so I don't think this is very strong evidence of anything.

Comment by gjm on Opting into Experimental LW Features · 2019-07-06T20:21:52.733Z · score: 2 (1 votes) · LW · GW

Sure. But the thing I was saying might be useful (which, I understand, has nothing to speak of in common with what's on offer right now) is auto-collapsing all comments I can be presumed to have read or decided not to bother reading on the grounds that they were already there the last time I visited the discussion. That would be useful even on posts with <=50 comments. (At least, it would be useful there if useful at all; it might be that I'm wrong in thinking it would be useful.)

Comment by gjm on Opting into Experimental LW Features · 2019-07-06T20:19:50.525Z · score: 3 (2 votes) · LW · GW

If someone's writing a whole post then for sure they should try to make its structure clear, perhaps with headings and tables of contents and introductory paragraphs and bullet points and whatnot.

I don't think that's usually appropriate for comments, which are usually rather short.

So, e.g., I don't think your comment to which I'm replying right now would have been improved by adding such signposts. But, even so, I don't see how I could tell whether I want to read the whole thing from knowing that it begins "I want to argue that this is a huge problem".

There might be benefit in providing some sort of guidance for readers of a whole comment thread. But it's hard to see how, especially as comment threads are dynamic: new material could appear anywhere at any time, and if order of presentation is partly determined by scores then that too can be rearranged pretty much arbitrarily. (And who'd do it?)

You might hope that a collapsed pile of comments is itself a sort of roadmap to the comments themselves, but I think that just doesn't work, just as you wouldn't get a useful summary of A Tale of Two Cities or A Brief History of Time by just taking the first half-sentence of each paragraph.

Comment by gjm on Opting into Experimental LW Features · 2019-07-05T15:11:46.043Z · score: 4 (2 votes) · LW · GW

If comments took up the whole width of my browser window, I would find them acutely unpleasant to read. I like my text fairly small and I have my window full-screen; text becomes difficult to read when each line is more than about 15-20 words. (The exact figure depends on size, personal preference, how long those words are, line spacing, etc.)

Comment by gjm on Opting into Experimental LW Features · 2019-07-05T15:08:43.303Z · score: 2 (1 votes) · LW · GW

Yes, I _do_ tend to read every comment on a thread. Or, sometimes, none, but usually if I'm bothering even to look at the comments on a post then I'm going to look at them all.

I don't eat at restaurants with literally no menu. If a restaurant has a large intimidating menu, I read it all anyway; if I visit it a few times I will get to know what I like. I won't be helped by a menu that says "Something containing chicken. Something containing turmeric. Something roughly round in shape." which is roughly what the collapsed comments provide. The menu is only useful in so far as it (together with my past experience, the overall look of the place, etc.) gives me a good enough mental picture of each dish to have a good idea whether I'll like it.

(An important distinction between visiting a restaurant and visiting Less Wrong: When I go to a restaurant, typically I intend to eat a roughly fixed, fairly small number of dishes. I don't generally go to LW with the intention of reading three comments and then leaving. I don't so much mind there being no menu if what there is instead is a great multitude of little snacks I can try dozens of without feeling ill.)

Collapsed comments don't (for me; I must stress that I'm not claiming to speak for anyone else) give me any useful view of what's available; seeing the first few words of a comment tells me little, and I try not to prejudge things too much on the basis of author or score.

Expanding comments that are new since you last visited is a good idea. If I could get that (on all posts, not just ones with 50+ comments; why would I want that restriction) without any collapsing of comments I haven't read yet, that would probably be useful. I'm interested in tools that let me read all the comments efficiently. I'm interested in tools that help me decide which comments are worthy of more attention. I am not interested in tools that try to decide for me which comments I will want to read. If I want that then I can go and read Facebook with "top stories" mode turned on instead of LW.

Comment by gjm on Opting into Experimental LW Features · 2019-07-05T00:59:43.301Z · score: 6 (3 votes) · LW · GW

OK, so my feedback is that I have never ever seen a feature of this sort (unless intended only to hide outright spam and the like) that is any improvement over just leaving everything there, and this instance is no exception.

Perhaps the ability to collapse discussions, rather than having them start collapsed and providing a way to un-collapse them, might be useful sometimes, though I'm struggling to think when I'd actually use it. But this is (for me) a waste of time. Seeing half a line of a comment is usually not enough information to decide whether reading the whole thing is worth while, so the overall minimum-hassle solution is just to uncollapse everything. Thankfully I can make that the global default here on LW; if not, I'd just be clicking the "please expand everything" button every time. (Or, if there weren't one, expanding everything by hand and cursing the admins.)

Perhaps it might be different if I were reading LW in order to acquire information about some particular thing as efficiently as possible. And perhaps there is some desire for LW to be more the sort of place one goes to acquire information efficiently, rather than e.g. a chatty social venue. Except that if LW were more "academic" in that way -- something more like the Alignment Forum, perhaps -- then I would expect an (even) higher proportion of the comments to be ones well worth reading for anyone interested in the topic at hand, so auto-collapsing would still be a net loss.

Comment by gjm on Opting into Experimental LW Features · 2019-07-03T16:01:51.835Z · score: 2 (1 votes) · LW · GW

Are you looking for feedback on single line comments?

Information leakage alert: the screenshot there includes a comment from a now-deleted user; ordinary LW users viewing the same thing will see neither that user's name nor their comment. I don't think exposing that will do any grave harm, but you might consider using a different screenshot. (Also, if anyone thinks that deleting their account on LW _actually deletes their account_ they should be advised that apparently it does not.) [EDITED to add: to be more precise, in collapsed view ordinary LW users will see an error message in red instead of the username and truncated comment text; when expanded they _will_ see the text but _won't_ see the username.]

Comment by gjm on Is AlphaZero any good without the tree search? · 2019-07-02T01:47:40.739Z · score: 4 (2 votes) · LW · GW

Either your understanding is correct or mine isn't: AlphaGo Zero and AlphaZero _do_ do a tree search that the DM papers call "Monte Carlo Tree Search" but that doesn't involve actual Monte Carlo playouts and that doesn't match e.g. the description on the Wikipedia page about MCTS.

Comment by gjm on What does the word "collaborative" mean in the phrase "collaborative truthseeking"? · 2019-06-26T21:44:05.123Z · score: 4 (2 votes) · LW · GW

If multiple parties engage in adversarial interactions (e.g., debate, criminal trial, ...) with the shared goal of arriving at the truth then as far as I'm concerned that's still an instance of collaborative truth-seeking.

On the other hand, if at least one party is aiming to win rather than to arrive at the truth then I don't think they're engaging in truth-seeking at all. (Though maybe it might sometimes be effective to have a bunch of adversaries all just trying to win, and then some other people, who had better be extremely smart and aware of how they might be being manipulated, trying to combine what they hear from those adversaries in order to get to the truth. Hard to do well, though, I think.)

Comment by gjm on What does the word "collaborative" mean in the phrase "collaborative truthseeking"? · 2019-06-26T13:15:58.554Z · score: 18 (5 votes) · LW · GW

I share Richard Kennaway's feeling that this is a rather strange question because the answer seems so obvious; perhaps I'm missing something important. But:

"Collaborative" just means "working together". Collaborative truthseeking means multiple people working together in order to distinguish truth from error. They might do this for a number of reasons, such as these:

  • They have different skills that mesh together to let them do jointly what they could not do so well separately.
  • The particular truths they're after require a lot of effort to pin down, and having more people working on that can get it done quicker.
  • They know different things; perhaps the truth in question can be deduced by putting together multiple people's knowledge.
  • There are economies of scale; e.g., a group of people could get together and buy a bunch of books or a fast computer or a subscription to some information source, which is almost as useful to each of them as if they'd paid its full price on their own.
  • There are things they can do together that nudge their brains into working more effectively (e.g., maybe adversarial debate gets each person to dig deeper for arguments in a particular direction than they would have done without the impetus to compete and win).

There is a sense in which collaborative truth-seeking is built out of individual truth-seeking. It just happens that sometimes the most effective way for an individual to find what's true in a particular area involves working together with other individuals who also want to do that.

Collaborative truth-seeking may involve activities that individual truth-seeking (at least if that's interpreted rather strictly) doesn't because they fundamentally require multiple people, such as adversarial debate or double-cruxing.

Being "collaborative" isn't a thing that in itself brings benefits. It's a name for a variety of things people do that bring benefits. Speech-induced state changes don't result in better predictions because they're "collaborative"; engaging in the sort of speech whose induced state changes seem likely to result in better predictions is collaboration.

And yes, there are circumstances in which collaboration could be counterproductive. E.g., it might be easier to fall into groupthink. Sufficiently smart collaboration might be able to avoid this by explicitly pushing the participants to explore more diverse positions, but empirically it doesn't look as if that usually happens.

Related: collaborative money-seeking, where people join together to form a "company" or "business" that pools their work in order to produce goods or services that they can sell for profit, more effectively than they could if not working together. Collaborative sex-seeking, where people join together to form a "marriage" or "relationship" or "orgy" from which they can derive more pleasure than they could individually. Collaborative good-doing, where people join together to form a "charity" which helps other people more effectively than the individuals could do it on their own. Etc.

(Of course businesses, marriages, charities, etc., may have other purposes besides the ones listed above, and often do; so might groups of people getting together to seek the truth.)

Comment by gjm on "The Bitter Lesson", an article about compute vs human knowledge in AI · 2019-06-23T13:37:24.375Z · score: 2 (1 votes) · LW · GW

Yes, that's fair. (Though I'm not sure about the terms "domain-agnostic" and "domain-specific"; e.g., the AlphaZero approach seems to work well for a wide variety of board games played on "regular" boards but would need substantial modification to apply even to other board games and isn't obviously applicable at all to anything that isn't more or less a board game.)

Comment by gjm on "The Bitter Lesson", an article about compute vs human knowledge in AI · 2019-06-22T01:47:02.176Z · score: 10 (6 votes) · LW · GW

What he says about computer go seems wrong to me.

Two things came together to make really strong computer go players. The first was Monte Carlo tree search. Yes, that name has "search" in it, but it amounted to making programs less search-y, for two reasons. First, the original versions of MCTS did evaluations at the leaves of their trees by random full-game playouts, which took substantial time, so the trees were smaller than for earlier tree-searching programs. Second, the way MCTS propagates results up the tree moves away from the minimax paradigm and allows for the possibility that some tree nodes that seem to arise only from inferior moves (and hence should be irrelevant, minimaximally) actually have useful information in them about the position. The second was the use of big neural networks for evaluation instead of those Monte Carlo playouts. This amounts to having a really fancy static evaluation function, and (with the use of a policy net/head as in AlphaGo and its successors) a highly selective search that chooses what moves to include in the search according to how promising they look.
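To make the "moves away from minimax" point concrete, here is a minimal UCT-style sketch of the two relevant pieces (selection and back-up). The class names, the exploration constant, and the 1e-9 smoothing are my own illustrative choices, not taken from any DeepMind paper:

```python
import math

class Node:
    def __init__(self):
        self.visits = 0
        self.value_sum = 0.0   # running sum of backed-up evaluations
        self.children = {}     # move -> Node

def ucb_score(parent, child, c=1.4):
    # Mean value plus an exploration bonus; unvisited children score highest,
    # so the search is highly selective rather than exhaustive.
    mean = child.value_sum / (child.visits + 1e-9)
    explore = c * math.sqrt(math.log(parent.visits + 1) / (child.visits + 1e-9))
    return mean + explore

def select_child(node):
    return max(node.children.items(), key=lambda mc: ucb_score(node, mc[1]))

def backup(path, value):
    # Each node accumulates an *average* of the evaluations seen below it,
    # rather than taking a strict minimax over its children -- this is the
    # departure from the classical alpha-beta paradigm described above.
    for node in path:
        node.visits += 1
        node.value_sum += value
        value = -value  # flip perspective between the two players
```

In AlphaGo-style programs the evaluation backed up by `backup` comes from a value network rather than a random playout, and `ucb_score` is further biased by a policy network's prior over moves.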

So when Sutton says this

Enormous initial efforts went into avoiding search by taking advantage of human knowledge, or of the special features of the game, but all those efforts proved irrelevant, or worse, once search was applied effectively at scale.

that seems misleading; while it's true that the currently best go programs minimize their use of human knowledge, it's not by "applying search effectively at scale" that they do it.

Now, this does all fit into the broader pattern of "leveraging computation". Fair enough, I guess, but what else would you expect? That one can play superhumanly strong go by doing only a few thousand elementary steps of computation per move? That applying more computation wouldn't help? (There's a reason why human players, when they want to play well, take more time to think.)

Comment by gjm on Decisions are hard, words feel easier · 2019-06-21T17:30:24.770Z · score: 11 (2 votes) · LW · GW

I think -- especially as, IIRC, this is intended as material for people unfamiliar with this sort of thinking -- this could use a thorough reworking in search of greater clarity. I'll pick a particular bit to illustrate.

If I'm understanding your intentions right, when you ask

What do I know about someone who is "racist"?

what you intend it to mean is something like "What do I know about someone, as a result of knowing that the word 'racist' is applied to them with its usual meaning?". Note that this is not the exact same use of quotation marks as when you put "racism" in quotation marks so that you can talk about the word itself -- you aren't asking "What do I know about someone who is the word 'racist'?", which is just as well because that wouldn't make any sense. But it took me a moment's thought to figure out what I think you do mean there, and I suspect that for some readers it will be a real stumbling block. And I'm not certain that I've interpreted it the way you intend; maybe what you mean is "... that the word 'racist' is applied to them by many other people?" or something, which is similar but not the same. So it might be better to say something like

What do I know about someone who is "racist"? That is: what do I know if I know that the word "racist" applies to them? This is no longer a question about my own concepts, but about how a word is used.

But ... I think this words-versus-concepts distinction is the wrong distinction here. If you ask "What do I know about someone to whom I apply the word 'racist'?" then it's a question about your own usage, in just the same way as "What do I know about someone who is racist?" allegedly is. And when you ask "What do I know about someone who is racist?", that can be universalized or particularized just as word-meanings can. "What do I know about someone I consider racist?" "What do I know about someone who would generally be considered racist?". You could, in principle, consider these questions without invoking words at all.

A couple of other specific points:

This is the lawyer.

What? Maybe there's something in one of your earlier posts that indicates what this is about, but as it stands it doesn't seem to me like it makes any sense. If it is a reference to some earlier thing where you (e.g.) introduce some fictional characters and explain that you're going to analyse things in terms of what they would say, then you should probably provide a link back to the earlier thing in question. Similarly for "This is the student again" near the end.

Oops, we've got a situation on our hands that incentives being a lawyer.

You probably want "incentivizes" or something of the sort.

But, generally, I'm afraid the whole thing feels a bit impressionistic and unfinished. Sorry!

Comment by gjm on Decisions are hard, words feel easier · 2019-06-21T17:15:19.045Z · score: 2 (1 votes) · LW · GW

When you say "parentheses" in your "explicit note", do you mean "quotation marks"?

Comment by gjm on LessWrong FAQ · 2019-06-16T15:02:58.220Z · score: 6 (3 votes) · LW · GW

You accidentally a word: "you will mention of Eliezer not infrequently".

Comment by gjm on Some Ways Coordination is Hard · 2019-06-13T17:07:27.753Z · score: 9 (2 votes) · LW · GW

Zvi has "Shilling" in the title of Raymond's earlier post, which definitely said "Schelling", so I bet it's just a mis-schpelling.

Comment by gjm on Learning magic · 2019-06-09T23:59:03.742Z · score: 2 (1 votes) · LW · GW

If you're saying "Manipulating people like that wouldn't work, because you always get to choose whether to do what a hypnotist tells you to" then I see two objections.

  • The fact that you think you could choose not to do it doesn't mean you actually could in any very strong sense. Perhaps it just feels that way.
  • It could be that when someone's explicitly, blatantly trying to manipulate you via your subconscious, you get to choose, but that a sufficiently skilled manipulator can do it without your ever noticing, in which case you don't have the chance to say no.

(I am not sure that that is what you're saying, though, and if it isn't then those points may be irrelevant.)

Comment by gjm on Paternal Formats · 2019-06-09T23:50:39.361Z · score: 4 (3 votes) · LW · GW

I expected "coercive" to mean something like "attempting to persuade and manipulate as well as inform".

Of course this interpretation couldn't survive a careful reading even of the bullet points, never mind the rest of your post, which is why more or less the next thing I did after making that guess was to look in the document you linked to to find out what it was actually meant to mean.

Comment by gjm on Paternal Formats · 2019-06-09T14:19:31.203Z · score: 5 (3 votes) · LW · GW

If I am understanding right, something is "coercive" to the extent that it doesn't support a wide variety of ways of interacting with it in order to get whatever benefits it offers. An extreme example would be a document that starts out written in (say) English, but keeps introducing neologisms and new symbols and unorthodox grammatical constructions, so that it ends up written in a language and notation entirely of the author's devising that you can only make sense of by starting at the beginning and working through it in order.

The examples here are all about information sources, for which "interacting with" mostly means "reading", but I think the notion generalizes further. I suspect that the choice of a better term -- I think "coercive" is bad -- will be tangled up with the choice of how far (if at all) to generalize.

Comment by gjm on Paternal Formats · 2019-06-09T02:42:43.206Z · score: 19 (7 votes) · LW · GW

I think this would be improved by a short paragraph (or less) explaining what "coercive" means in this context. (I made a guess after glancing at the bullet-lists, and then looked at the linked document; my guess was not correct.)

Comment by gjm on Mistakes with Conservation of Expected Evidence · 2019-06-09T02:40:02.811Z · score: 14 (7 votes) · LW · GW

On "If you can't provide me with a reason ...", I think the correct position is: when someone says X (and apparently means it, is someone whose opinions you expect to have some correlation with reality, etc.) you update towards X, and if they then can't give good reasons why X you then update towards not-X. Your overall update could end up being in either direction; if the person in question is particularly wise but not great at articulating reasons, or if X is the sort of thing whose supporting evidence you expect to be hard to articulate, the overall update is probably towards X.

Comment by gjm on Asymmetric Justice · 2019-06-05T11:09:49.330Z · score: 2 (1 votes) · LW · GW

The parent of this comment says that the last line of this post

was previously "Asymmetric systems of judgment are systems for opposing all action."

which (as of 2019-06-05) is right now the last line of the post. There's discussion elsewhere about a last line saying something like "Let us all be wise enough to aim higher". Would I be right in guessing that that is the last line that was removed, and that the comment above has merely transcribed the wrong text? Or am I more deeply confused than I think?

Comment by gjm on Welcome and Open Thread June 2019 · 2019-06-03T17:04:35.237Z · score: 3 (2 votes) · LW · GW

(Not replying "at the original post" because others haven't and now this discussion is here.)

That fragment of "Final Words" is in a paragraph listing consequences of underconfidence.

Suppose (to take a standard sort of toy problem) you have a coin which you know either comes up heads 60% of the time or comes up heads 40% of the time. (Note: in the real world there are probably no such coins, at least not if they're tossed in a manner not designed to enable bias. But never mind.) And suppose you have some quantity of evidence about which sort of coin it is -- perhaps derived from seeing the results of many tosses. If you've been tallying them up carefully then there's not much room for doubt about the strength of your evidence, so let's say you've just been watching and formed a general idea.

Underconfidence would mean e.g. that you've seen an excess of T over H over a long period, but your sense of how much information that gives you is wrong, so you think (let's say) there's a 55% chance that it's a T>H coin rather than an H>T coin. So then someone trustworthy comes along and tells you he tossed the coin once and it came up H. That has probability 60% on the H>T hypothesis and probability 40% on the T>H hypothesis, so it's 3:2 evidence for H>T, so if you immediately have to bet a large sum on either H or T you should bet it on H.

But maybe the _real_ state of your evidence before this person's new information justifies 90% confidence that it's a T>H coin, in which case that new information leaves you still thinking it's more likely T>H, and if you immediately have to bet a large sum you should bet it on T.

Thus: if you are underconfident you may take advice you shouldn't, because you underweight what you already know relative to what others can tell you.

Note that this is all true even if the other person is scrupulously honest, has your best interests at heart, and agrees with you about what those interests are.
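The coin arithmetic in the two scenarios above can be checked directly. This just mechanizes the numbers already given (60%/40% coin, one observed head, priors of 55% and 90% for T>H):

```python
def posterior_p_th(prior_p_th):
    """Posterior P(coin is T>H) after seeing a single head, where
    P(H | T>H coin) = 0.4 and P(H | H>T coin) = 0.6."""
    numerator = prior_p_th * 0.4
    denominator = prior_p_th * 0.4 + (1 - prior_p_th) * 0.6
    return numerator / denominator

# Underconfident 55% prior: the head tips the balance, so bet on H.
print(posterior_p_th(0.55))  # ~0.449, i.e. now slightly favours H>T
# Well-calibrated 90% prior: still favours T>H, so bet on T.
print(posterior_p_th(0.90))  # ~0.857
```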

Comment by gjm on Drowning children are rare · 2019-05-31T10:56:28.395Z · score: 17 (6 votes) · LW · GW

Many people find that thinking about effectiveness rapidly makes local giving seem a less attractive option.
The thought processes I can see that might lead someone to give locally in pursuit of effectiveness are quite complex ones:

  • Trading off being able to do more good per dollar in poorer places against the difficulty of ensuring that useful things actually happen with those dollars. Requires careful thought about just how severe the principal/agent problems, lack of visibility, etc, are.
  • Giving explicitly higher weighting to the importance of people and causes located near to oneself, and trading off effectiveness against closeness. Requires careful thought about one's own values, and some sort of principled way of mapping closeness to importance.

Those are both certainly possible but I think they take more than a "small amount of thinking". Of course there are other ways to end up prioritizing local causes, but I think those go in the "without reflecting much" category. It seems to me that a modest amount of (serious) thinking about effectiveness makes local giving very hard to justify for its effectiveness, unless you happen to have a really exceptional local cause on your doorstep.

Comment by gjm on Open Thread May 2019 · 2019-05-25T22:15:32.524Z · score: 4 (3 votes) · LW · GW

(I'm coming to this rather late, and will entirely understand if Said doesn't want to resurrect an old discussion; if not, I hope no reader will take the lack of response to indicate that my points were just so compelling that Said had no answer to them.)

There is no such thing as “purple photons” [...]

I am unconvinced by your argument here (which may just indicate that I haven't grasped what it is). The following position seems pretty reasonable to me: There are violet photons but not magenta photons. There are orange photons, but because of metamerism you can see something orange without any orange photons being involved. Violet photons are not violet in the same way as a piece of paper covered in violet dye is violet, but it's reasonable to use the same word for both.

Would you care to make more explicit your reasons for saying that there is no such thing as a violet photon? (You actually said purple, as Charlie had done in the comment you were replying to, but I'm fairly sure your position is not "there are photons of some colours but not of others". My apologies if I've misunderstood.)

What would the atom look like in ordinary illumination [...]

Like many things, it would look different in different conditions of illumination. Unlike most things we look at, its spectrum is composed of a few sharp peaks, so the relationship between its illumination and the colour we see (given sufficient "amplification" since of course a single atom can neither emit nor scatter very much light) is a bit unusual. (One can make macroscopic materials with similar properties and I don't see any particular reason to deny that they have colour.)

There is no fact of the matter as to whether Neptune "is blue". [...]

This seems like a matter of definitions, and I don't like your definitions. Whatever difficulties there are about assigning a colour to Neptune are simply a consequence of the fact that it's a long way away from us. (I think it's clear that there's a fact of the matter as to whether Mars is red: it is. If Neptune's orbit were where Mars's is, there'd be no difficulty saying that Neptune is blue.) Are you sure you want to say that whether a thing has a definite colour can change merely on account of the distance between it and us? If we sent astronauts to Neptune instead of just probes, would there then be a fact of the matter as to whether Neptune has a colour? If there were an accident and the astronauts died, would Neptune's colour-having-ness change? If I take an orange, lock it in a vault (illuminated, let's say, by an incandescent bulb in the vault), and throw away the key, does there stop being a fact of the matter as to whether the orange is orange?

Incidentally: yes, the reporter is wrong about something. A quarter of a nanometre is not 2.5 x 10^-7 metres.
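For the record, the arithmetic behind that correction:

```python
metre_per_nm = 1e-9                  # one nanometre in metres
quarter_nm_in_m = 0.25 * metre_per_nm
print(quarter_nm_in_m)               # 2.5e-10 m, three orders of magnitude
                                     # smaller than the reported 2.5e-07 m
```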

Comment by gjm on On the Nature of Programming Languages · 2019-04-23T11:54:54.887Z · score: 2 (1 votes) · LW · GW

I'm intrigued by your topic for another day.

How do you define "lifeform" so as to make us not examples? (Is the point e.g. that "we" are our _minds_ which could in principle exist without our _bodies_? Or do you consider that _Homo sapiens_ bodies don't constitute a lifeform?)

Comment by gjm on On the Nature of Programming Languages · 2019-04-23T00:39:53.441Z · score: 3 (2 votes) · LW · GW

I think your h4ck3r-versus-n00b dichotomy may need a little adjustment.

It's true that some hackers prefer mathematics-y languages like, say, Haskell or Scheme, with elegantly minimal syntax and a modest selection of powerful features that add up to something tremendous.

But _plenty_ of highly skilled and experienced software-makers program in, for instance, C++, which really doesn't score too highly on the elegance-and-abstraction front. Plenty more like to program in C, which does better on elegance and worse on abstraction and is certainly a long way from mathematical elegance. Plenty more like to program in Python, which was originally designed to be (inter alia) a noob-friendly language, and is in fact a pretty good choice for a first language to teach to a learner. And, on the other side of things, Scheme -- which seems like it has a bunch of the characteristics you're saying are typical of "expert-focused" languages -- has always had a great deal of educational use, by (among others) the very people who were and are designing it.

If you're designing a programming language, you certainly need to figure out whether to focus on newcomers or experts, but I don't think that choice alone nails down very much about the language, and I don't think it aligns with elegance-versus-let's-politely-call-it-richness.