PredictionBook itself has a bunch more than three participants and functions as an always-running contest for calibration, although it's easy to cheat since it's possible to make and resolve whatever predictions you want. I also participate in GJ Open, which has an eternally ongoing prediction contest. So there's stuff out there where people who want to compete on running score can do so.
The objective of the contest was less to bring such an opportunity into existence than to see if it'd incentivise some people who had been "meaning" to practice prediction-making but not gotten around to it yet to do so on one of the platforms, by offering a kind of "reason to get around to it now"; the answer was no, though.
I don't participate much on Metaculus because for my actual, non-contest prediction-making practice I tend to favour predictions that resolve within about six weeks; the longer the time between prediction and resolution, the slower the iteration process for improving calibration. If I predict on 100 things that happen in four years, it takes four years for me to learn whether I'm over or under confident at the 90% or so mark, and then another four years to learn whether my reaction to that was an over or under reaction. Metaculus seems to favour predictions two to four or more years out, and requires sticking with private predictions if you want to make your own short-term ones in any number, which is interesting for getting a crowd read on the future, but doesn't offer me much opportunity to iterate and improve. It's a nice project, though.
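As a toy illustration of that feedback loop (illustrative numbers only, not contest or PredictionBook data): you can only check your calibration at a given confidence level once the predictions made at that level have resolved, so four-year questions mean a four-year wait per round of feedback.

```python
# Toy example: checking calibration at the ~90% mark once predictions resolve.
predictions = [  # (assigned probability, resolved true?)
    (0.9, True), (0.9, True), (0.9, False), (0.9, True), (0.9, True),
    (0.6, True), (0.6, False), (0.6, False), (0.6, True),
]

def observed_frequency(level, preds, tol=0.05):
    # Outcomes of all predictions assigned a probability near `level`.
    outcomes = [hit for p, hit in preds if abs(p - level) <= tol]
    return sum(outcomes) / len(outcomes) if outcomes else None

print(observed_frequency(0.9, predictions))  # 0.8 -> somewhat overconfident at 90%
```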
It's not a novel algorithm type, just a learning project I did in the process of learning ML frameworks: a fairly simple LSTM plus one dense layer, trained on the predictions and resolutions of about 60% of the resolved predictions from PredictionBook as of September last year (which doesn't include any of the ones in the contest). The remaining resolved predictions were used for cross-validation or set aside as a test set. An even simpler RNN is only very slightly less good, though.
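For reference, the general shape of the model is roughly the following. This is a minimal sketch of the kind of architecture described; the hyperparameters, padding scheme, and names are illustrative assumptions, not the actual moonbird-predictor-keras code.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

MAX_ASSIGNMENTS = 20  # assumed padding length for a proposition's sequence of assignments

# Input: a padded sequence of probability assignments for one proposition.
# Output: estimated probability that the proposition resolves true.
model = keras.Sequential([
    layers.Masking(mask_value=-1.0, input_shape=(MAX_ASSIGNMENTS, 1)),
    layers.LSTM(16),                        # the LSTM layer (assumed size)
    layers.Dense(1, activation="sigmoid"),  # the single dense output layer
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Two toy propositions: one resolved true, one resolved false.
x = np.full((2, MAX_ASSIGNMENTS, 1), -1.0)
x[0, :3, 0] = [0.6, 0.7, 0.9]
x[1, :2, 0] = [0.2, 0.1]
y = np.array([1.0, 0.0])
model.fit(x, y, epochs=2, verbose=0)
print(model.predict(x, verbose=0))
```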
The details of how the algorithm works are thus somewhat opaque, but from observing the way it reacts to input, it seems to lean on the average, weight predictions later in the sequence more heavily (so order matters), and grow more confident as the number of predictions increases, while treating propositions with only one probability assignment as probably being heavily overconfident. It seems to have more or less learnt on its own the insight Tetlock pointed out. Disagreement might also matter to it; I'm not sure.
It's on GitHub at https://github.com/jbeshir/moonbird-predictor-keras; this doesn't include the data, which I downloaded using https://github.com/jbeshir/predictionbook-extractor. It's not particularly tidy though, and still includes a lot of unused functionality for input features (the words of the proposition, the time between a probability assignment and the due time, etc.) which I didn't end up using because the dataset was too small for it to learn any signal in them.
I'm currently working on making the online frontend to the model automatically retrain the model at intervals using freshly resolved predictions, mostly for practice building a simple "online" ML system before I move on to trying to build things with more practical application.
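The retraining piece itself doesn't need to be anything fancier than a scheduled loop along these lines (a rough sketch; the helper functions are hypothetical stand-ins, not the actual frontend code):

```python
import time

RETRAIN_INTERVAL_SECONDS = 7 * 24 * 3600  # assumed weekly schedule

def retrain_forever(fetch_resolved_predictions, build_and_fit, save_model):
    """All three arguments are assumed callables supplied by the surrounding
    system: fetch freshly resolved predictions, fit a new model, publish it."""
    while True:
        sequences, outcomes = fetch_resolved_predictions()
        model = build_and_fit(sequences, outcomes)
        save_model(model)  # the serving frontend picks up the new model
        time.sleep(RETRAIN_INTERVAL_SECONDS)
```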
The main reason I ran figures for it against the contest was that some of its individual confidences seemed strange to me, and while the cross-validation stuff was saying it was good, I was suspicious I was getting something wrong in the process.
I'm concerned that the described examples of holding individual comments to high epistemic standards don't seem to necessarily apply to top-level posts or linked content. One reason I think this is bad is that it is hard to precisely critique something which is not in itself precise, or which contains metaphor, or which contains example-but-actually-pointing-at-a-class writing where the class can be construed in various different ways.
Critique of fuzzy intuitions and impressions and feelings often involves fuzzy intuitions and impressions and feelings, I think, and if this stuff is restricted in critiques but not in top-level content, it makes top-level content involving fuzzy intuitions and impressions and feelings hard to critique, despite being, I think, exactly the content which needs critiquing the most.
Strong comment standards seem like they would be good for a space (no strong opinion on whether LW should be that space), but such a space would probably also want high standards for top-level posts, possibly with review and feedback prior to publication, to hold them to the same epistemic standards. Otherwise I think moderation arguments over which interpretations of vague content were reasonable would dominate.
Additionally, strong disagree on "weaken the stigma around defensiveness" as an objective of moderation. One should post arguments because one believes they are valid, and clarify misunderstandings because they are wrong, not argue or post or moderate to try to save personal status. It may be desirable to post and act with the objective of making it easier not to be defensive, but we still want people themselves to try to avoid taking critique as a referendum on their person. In terms of fairness, I'm not sure how you'd judge it: even in formal peer review, I think it is valid for the part people have the most concerns about not to be the part the author most wants attention given to. It's also valid for most people to disagree with and have critiques of a piece of content. The top-level post author (or the link post's author) doesn't have a right to "win"; it is permissible for the community to just not think a post's object-level content is all that good. If there were to be a fairness standard that justified anything, it'd certainly want to be spelled out in more detail and checked by someone other than the person feeling they were treated unfairly.
It might be nice to have a set of twenty EA questions, a set of twenty ongoing-academic-research questions, a set of twenty general tech industry questions, a set of twenty world politics questions for the people who like them maybe, and run multiple contests at some point which refine predictive ability within a particular domain, yeah.
It'd be tough to source that many, and I feel that twenty is already about the minimum sample size I'd want to use; for research questions it'd probably require some crowdsourcing of interesting upcoming experiments to predict on. But particularly if help turns out to be available, it'd be worth considering if the smaller thing works.
The usefulness of a model of the particular area was something I considered in choosing between questions, but I had a hard time finding a set of good non-personal questions which had very high value to model. I tried to pick questions which in some way depended on interesting underlying questions. For example, the Tesla one hinges on your ability to predict the performance of a known-to-overpromise entrepreneur in a manner that's more precise than either maximum cynicism or full trust, and on the ability to predict the ongoing ramp-up of manufacturing of tech facing manufacturing difficulties, both of which I think have value.
World politics is, I think, the weakest section in that regard, and this is a big part of why, rather than just taking twenty questions from the various sources of world politics predictions I had available, I looked for other questions and made a bunch of my own EA-related ones by going through EA org posts looking for uncertain pieces of the future, reducing the world politics questions to only a little over a third of the set.
That said, I think the world politics questions do have transferability in calibration if not precision (you can learn to be accurate on topics you don't have a precise model for by having a good grasp of how confident you should be), and in the general skill of skimming a topic, arriving at impressions about it, and knowing how much to trust those impressions. I think there are general skills of rationality being practiced here, beyond gaining specific models.
And while it is the weakest section, I think it does have some value. There's utility in having a reasonable grasp of how governments behave, and in particular how quickly they change under various circumstances: the way governments behave and react in the future will set the regulatory environment for future technological development, and the way they behave in geopolitics affects risk from political instability, both as a civilisational risk in itself and as something that could require mitigation in other work. There was an ongoing line of questioning about how good it is, exactly, to have a massive chunk of AGI safety orgs in one coastal American city (in particular during the worst of the North Korea stuff), and a good model for that is useful for deciding whether it's worth trying to fund the creation, expansion, and focusing of orgs elsewhere as a "backup", for example; that's a decision which can be taken individually on the basis of a good grasp of how concerned you should be, exactly, about particular geopolitical issues.
These world politics questions are probably not perfectly optimised for that (I had to avoid anything on NK in particular due to the current rate of change), and it'd be nice to find better ones, and maybe more useful questions of other kinds, and shrink the section further next year. I think they probably have some value to practice predicting on, though.
I need to take a good look over what GJO has to offer here- I'm not sure if running a challenge for score on it would meet the goals here well (in particular I think it needs to be bounded in amount of prediction it requires in order to motivate doing it, and yet not gameable by just doing easy questions, and I'd like to be able to see what the probability assignments on specific questions were), but I've not looked at it closely with this in mind. I should at least hopefully be able to crib a few questions, or more.
Sounds good. I've looked over them and I could definitely use a fair few of those.
Thanks for letting me know! I've sent them a PM, and hopefully they'll get back to me once they're free.
On the positive side, I think an experiment in a more centrally managed model makes sense, and group activity that has become integrated into routine is an incredibly good commitment device for getting the activity done- the kind of social technology used in workplaces everywhere that people struggle to apply to their other projects and self-improvement efforts. Collaborative self-improvement is good; it was a big part of what I was interested in for the Accelerator Project before that became defunct.
On the skulls side, though, the big risk factor that comes to mind for me for any authoritarian project wasn't addressed directly. You've done a lot of review of failed projects and successful projects, but I don't get the impression you've done much review of abusive projects. The big common element I've seen in abusive projects is that unreasonable demands were made that any sensible person should have 'defected' on: people were asked for things, or placed under demands, which from the outside and in retrospect were in no way worth meeting in order to stay in the group, and they didn't defect. They stayed in the abusive situation.
A lot of abusive relationships involve people trading off their work performance and prospects, and their outside relationship prospects, in order to live up to commitments made within those relationships, when they should have walked. They concede arguments when they can't find a reason that will be accepted because the other person rejects everything they say, rather than deciding to defect on the personhood norm of use of reasons. I see people who have been in abusive relationships in the past anxiously worrying about how they will find a way to justify themselves in circumstances where I would have been willing to bite the bullet and say "No, I'm afraid not, I have reasons but I can't really talk about them.", because the option of simply putting their foot down without reasons- a costly last resort but an option- is mentally unavailable to them.
What I draw from the case studies of abusive situations I've encountered is that humans have false negatives as well as false positives about 'defection'; that is, people maintain commitments when they should have defected, as well as defecting when they should have maintained commitments. Some of us are more prone to the former, and others are more prone to the latter. The people prone to the former are often impressively bad at boundaries, at knowing when to say no, at making a continually updated cost/benefit analysis of their continued presence in an environment, at protecting themselves. Making self-protection a mantra indicates that you've kind of seen a part of this, but a model of "humans defect on commitments too much", rather than "humans are lousy at knowing when to commit and when not to", seems like it will often miss what various ideas will do to the people prone to false negatives.
The rationalist community as a whole probably is mostly people with relatively few false negatives and mostly false positives. Most of us know when to walk and are independent enough to be keeping an eye on the door when things get worrying, and have no trouble saying "you seem to be under the mistaken impression I need to give you a reason" if people try to reject our reasons. So I can understand failures the other way not being the most salient thing. But the rationalist community as a whole is mostly people who won't be part of this project.
When you select out the minority who are interested in this project, I think you will get a considerably higher rate of people who fail in the direction of backing down if they can't find a reason that (they think) others will accept, in the direction of not having good boundaries, and more generally in the direction of not 'defecting' enough to protect themselves. And I've met enough of them in rationalist-adjacent spaces that I know they're nearby, they're smart, they're helpful, some are reliable, and they're kind of vulnerable.
I think as leader you need to do more than say "protect yourself". I think you need to expect that some people you are leading will /not/ say no when they should, and that you won't successfully filter all of them out before starting, any more than you'll filter out all the people who will fail in any other way. And you need to take responsibility for protecting them, rather than delegating it exclusively to them to handle. To be a bit rough, "protect yourself" seems like trying to avoid a part of the leadership role that isn't actually optional: if you fail in the wrong way you will hurt people, and you as leader are responsible for not failing in that way, and 95% isn't good enough. The drill instructor persona does not come off as the sort of person who would do that, with its unidirectional emphasis on committing more, and I think that is part of why people who don't know you personally find it kind of alarming in this context.
(The military, of course, from which the stereotype originates, deals with this by simply not giving two shits about causing psychological harm, and is fine either severely hurting people to turn them into what it needs or severely hurting them before spitting them out if they are people who are harmed by what it does.)
On the somewhat more object level, the exit plan discussed seems wildly inadequate, and very likely to be a strong barrier against anyone who isn't one of our exceptional libertines leaving when they should. This isn't a normal house share, and it is significantly more important than in a regular house share that people are not prevented from leaving by financial constraints or by inability to find a replacement who's interested. The harsh terms typical of an SF house share are not suitable, I think.
The finding-a-replacement part seems especially impractical: most people trend towards an average of their friends, so if their friends on one side are DA people and they themselves are unsuited to DA, their other friends are probably even more unsuited to DA on average. I would strongly suggest taking only financial recompense from someone leaving, capped at a limited number of months of rent if a replacement is not secured, and either permitting that recompense to be paid back at a later date after immediate departure, or requiring it as an upfront deposit, to guarantee safety of exit.
If there are financial costs involved with ensuring exit is readily available, there are enough people who think that this is valuable that it should be possible to secure capital for use in that scenario.
Assuming by "it" you refer to the decision theory work, that UFAI is a threat, Many Worlds Interpretation, things they actually have endorsed in some fashion, it would be fair enough to talk about how the administrators have posted those things and described them as conclusions of the content, but it should accurately convey that that was the extent of "pushing" them. Written from a neutral point of view with the beliefs accurately represented, informing people that the community's "leaders" have posted arguments for some unusual beliefs (which readers are entitled to judge as they wish) as part of the content would be perfectly reasonable.
It would also be reasonable to talk about the extent to which atheism is implicitly pushed in stronger fashion; theism is treated as assumed wrong in examples around the place, not constantly but to a much greater degree. I vaguely recall that the community has non-theists as a strong majority.
The problem is that this is simply not what the articles say. The articles imply strongly that the more unusual beliefs posted above are widely accepted- not that they are posted in the content but that they are believed by Less Wrong members, part of the identity of someone who is a Less Wrong user. This is simply wrong. And the difference is significant; it is incorrectly accusing all people interested in the works of a writer of being proponents of that writer's most unusual beliefs, discussed only in a small portion of their total writings. And this should be fixed so they convey an accurate impression.
The Scientology comparison is misleading in that Scientology attempts to use cult practices to achieve homogeneity of beliefs, whereas Less Wrong does not- the poll solidly demonstrates that homogeneity of beliefs is not a thing which is happening. A better analogy would be a community of fans of the works of a philosopher who wrote a lot of stuff and came to some outlandish conclusions in parts, but the fans don't largely believe that outlandish stuff. Yeah, their outlandish stuff is worth discussing- but presenting it as the belief of the community is wrong even if the philosopher alleges it all fits together. Having an accurate belief here matters, because it has greatly different consequences. There are major practical differences in how useful you'd expect the rest of the content to be, and how you'd perceive members of the community.
At present, the articles are largely written as "smear pieces" against Less Wrong's community. As a clear and egregious example, they allege the community is "libertarian", clearly a shot at LW given RW's readerbase, when surveys tell us that the most common political affiliation is "liberalism"; "libertarianism" is second, and "socialism" third. This is done while citing one of the surveys in the article itself. Many of the problems here are not subtle.
If by "it" you meant the evil AI from the future thing, it most certainly is not "the belief pushed by the organization running this place"; any reasonable definition of "pushing" something would have to meancommunicating it to people and attempting to convince them of it, and if anything they're credibly trying to stop people from learning about it. There are no secret "higher levels" of Less Wrong content only shown to the "prepared", no private venues conveying it to members as they become ready, so we can be fairly certain given publicly visible evidence that they aren't communicating it or endorsing it as a belief to even 'selected' members.
It doesn't obviously follow from anything posted on Less Wrong, it requires putting a whole bunch of parts together and assuming it is true.
The pattern matching's conclusions are wrong because the information it is matching on is misleading. The article implied that there was widespread belief that the future AI should be assisted, and this was wrong. Last I looked it still implied widespread support for other beliefs incorrectly.
This isn't an indictment of pattern matching so much as a need for the information to be corrected.
It would be nice if you'd also address the extent to which it misrepresents other LessWrong contributors as thinking it is feasible or important (sometimes to the point of mocking them based on its own misrepresentation). People around LessWrong engage in hypothetical what-if discussions a lot; it doesn't mean that they're seriously concerned.
Lines like "Though it must be noted that LessWrong does not believe in or advocate the basilisk ... just in almost all of the pieces that add up to it." are also pretty terrible given we know only a fairly small percentage of "LessWrong" as a whole even consider unfriendly AI to be the biggest current existential risk. Really, this kind of misrepresentation of alleged, dubiously actually held extreme views as the perspective of the entire community is the bigger problem with both the LessWrong article and this one.
First, examining the dispute over whether scalable systems can actually implement a distributed AI...
This is one reason why even Google's datastore, AFAIK, does not implement exactly this kind of architecture -- though it is still heavily sharded. This type of a datastructure does not easily lend itself to purely general computation, either, since it relies on precomputed indexes, and generally exploits some very specific property of the data that is known in advance.
That's untrue; Google App Engine's datastore is not built on exactly this architecture, but is built on one with these scalability properties, and they do not inhibit its operation. It is built on BigTable, which builds on multiple instances of Google File System, each of which has multiple chunk servers. They describe this as intended to scale to hundreds of thousands of machines and petabytes of data. They do not define a design scaling to an arbitrary number of levels, but there is no reason an architecturally similar system like it couldn't simply add another level and add on another potential roundtrip. I also omit discussion of fault-tolerance, but this doesn't present any additional fundamental issues for the described functionality.
In actual application, its architecture is used in conjunction with a large number of interchangeable non-data-holding compute nodes which communicate only with the datastore and end users rather than each other, running identical instances of software deployed on App Engine. This layout runs all websites and services backed by Google App Engine as distributed, scalable software, assuming they don't do anything to break scalability. There is no particular reliance on "special properties" of the data being stored, merely limits on the types of searching of the data which are possible. Even this is less limited than you might imagine; full text search of large texts has been implemented fairly recently. A wide range of websites, services, and applications are built on top of it.
The implication of this is that there could well be limitations on what you can build scalably, but they are not all that restrictive. They definitely don't rule out anything for which you can split the data into independently processed chunks. Looking at GAE some more, because it's a good example of a generalised scalable distributed platform: the software run on the nodes is written in standard Turing-complete languages (Python, Java, and Go), and datastore access includes reads and writes by key and equality queries on specific fields, as well as cursors. A scalable task queue and cron system mean you aren't dependent on outside requests to drive anything. It's fairly simple to build any such chunk processing on top of it.
So as long as an AI can implement its work in such chunks, it certainly can scale to huge sizes and be a scalable system.
And, as you also mentioned, even with these drastic tradeoffs you still get O(n log(n)).
And as I demonstrated, O(n log n) is big enough for a Singularity.
And now on whether scalable systems can actually grow big in general...
You mention Amazon (in addition to Google) as one example of a massively distributed system, but note that both Google and Amazon are already forced to build redundant data centers in separate areas of the Earth, in order to reduce network latency.
Speed of light as an issue is not a problem for building huge systems in general, so long as the number of roundtrips rises as O(n log n) or less, because for any system capable of at least tolerating roundtrips to the other side of the planet (a few hundred milliseconds), it doesn't become more of an issue as the system gets bigger, until you start running out of space on the planet's surface to run fibre between locations or build servers.
The GAE datastore already tolerates latencies sufficient to cover the distances between cities, in order to permit data duplication over wide areas for fault tolerance. If it were to expand into all the space between those cities, the time for each roundtrip would not increase until after it had filled all the space between them with more servers.
Google and Amazon are not at all forced to build data centres in different parts of the Earth to reduce latency; this is a misunderstanding. There is no technical performance degradation caused by the size of their systems which forces them to need the latency improvements to end users, or the region-scale fault tolerance, that spread-out datacentres permit; they can just afford these things more easily. You could argue there are social/political/legal reasons they need them more, such as higher expectations of their systems, but these aren't relevant here. The spreading out is actually largely detrimental to their systems, since it increases latency between them, but they can tolerate this.
Heat dissipation, power generation, and network cabling needs all also scale as O(n log n), since computation and communication do, and those are the processes which create those needs. Looking at my previous example, the amount of heat output, power needed, and network cabling required per amount of data processed would increase by maybe an order of magnitude in scaling such a system upwards by tens of orders of magnitude: 5x for 40 orders of magnitude in the example I gave. This assumes your base amount of latency is still enough to cover the distance between the most distant nodes (for an Earth-bound system, one side of the planet to the other), which is entirely reasonable latency-wise for most systems; a total of about 1.5 seconds for a planet-sized system.
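Making that arithmetic explicit (with the assumptions from that example: a fanout of around ten billion shards indexed per dictionary level, and roughly 0.3 s per worst-case planetary roundtrip):

```latex
% Assumed fanout B per dictionary level; N = total bottom-level shards.
\[
  \text{levels}(N) \approx \lceil \log_{B} N \rceil, \qquad B = 10^{10}
\]
\[
  \text{per-operation roundtrips} \propto \text{levels}(N)
  \quad\Rightarrow\quad
  \text{total communication} \propto N \cdot \text{levels}(N) = O(n \log n)
\]
\[
  \text{levels}(10^{10}) = 1, \quad \text{levels}(10^{50}) = 5
  \quad\Rightarrow\quad
  5\times \text{ the per-unit overhead for } 10^{40}\times \text{ the data}
\]
\[
  \text{worst-case latency} \approx 5 \times 0.3\ \text{s} \approx 1.5\ \text{s}
\]
```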
This means that no, these do not become an increasing problem as you make a scalable system expand, any more than provision of the nodes themselves does. You are right that heat dissipation, power generation, and network cabling mean you might start to hit problems before literally "running out of planet", using up all the matter of the planet; that example was intended to demonstrate the scalability of the architecture. You also might run out of specific elements or surface area.
These practical hardware issues don't really create a problem for a Singularity, though. Clusters exist now with 560k processors, so systems at least this big can feasibly be constructed at reasonable cost. So long as the software can scale without substantial overhead, this is enough, unless you think an AI would need even more processors; that the software could scale is the point my planet-scale example was trying to show. You're already "post-Singularity" by the time you seriously become unable to dissipate heat or run cables between any more nodes.
This means that, even in an absolutely ideal situation where we can ignore power, heat dissipation, and network congestion, you will still run into the speed of light as a limiting factor. In fact, high-frequency trading systems are already running up against this limit even today.
HFT systems want extremely low latency; this is the sole reason they wish to be close to the exchange and accept various internal scalability limitations in order to improve speed of processing. These issues don't generalise to typical systems, and for typical bigger systems they don't get worse at a rate above O(n log n).
It is conceivable that speed of light limitations might force a massive, distributed AI to have high latency, maybe over a second, in actions relying on knowledge from all over the planet, if prefetching, caching, and similar measures all fail. But this doesn't seem like nearly enough to render one at all ineffective.
There really aren't any rules of distributed systems which say that it can't work, or even that it is likely not to.
Restricting the topic to distributed computation, the short answer is "essentially no". The rule is that you get at best linear returns, not that your returns diminish greatly. There are a lot of problems which are described as "embarrassingly parallel", in that scaling them out is easy to do with quite low overhead. In general, any processing of a data set which permits it to be broken into chunks that can be processed independently would qualify, so long as you were looking to increase the amount of data processed by adding more processors rather than to process the same data faster.
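For concreteness, a minimal sketch of that chunked, embarrassingly parallel case (illustrative only): workers process chunks with no communication between them, so capacity scales by adding processes.

```python
from multiprocessing import Pool

def process_chunk(chunk):
    # Stand-in for arbitrary independent per-chunk work.
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    chunks = [data[i:i + 10_000] for i in range(0, len(data), 10_000)]
    with Pool() as pool:
        partials = pool.map(process_chunk, chunks)  # no cross-chunk communication
    print(sum(partials))
```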
For scalable distributed computation, you use a system design whose total communication overhead rises as O(n log n) or lower. That is superlinear, but only slightly: the overhead per unit of work grows only logarithmically as capacity is added, so with a good implementation you can run out of planet to make the system out of before it gets too slow. Such systems are quite achievable.
The DNS system would be an important example of a scalable distributed system; if adding more capacity to the DNS system had substantially diminishing returns, we would have a very different Internet today.
An example I know well enough to walk through in detail is a scalable database in which data is allocated to shards, which manage storage of that data. You need a dictionary server to locate data (DNS-style) and handle moving blocks of it between shards, but this can then be sharded in turn. The result is akin to a really big tree; number of lookups (latency) to find the data rises with the log of the data stored, and the total number of dictionary servers at all levels does not rise faster than the number of shards with Actual Data at the bottom level. Queries can be supported by precomputed indexes stored in the database themselves. This is similar to how Google App Engine's datastore operates (but much simplified).
With this fairly simple structure, the total cost of all reads/writes/queries theoretically rises superlinearly with the amount of storage (presuming reads/writes/queries and the amount of data scale linearly with each other), due to the dictionary server lookups, but only as O(n log(n)). With current-day commodity hard disks and a conceptually simple on-disk tree, a dictionary server could reasonably store information for ten billion shards (500 bytes × 10 billion = ~5 TB); two levels of sharding give you a hundred billion billion data-storing shards, and three give a thousand billion billion billion data-storing shards. Five levels, five latency delays, would give you more bottom-level shards than there are atoms on Earth. This is why, while scalability will eventually limit an O(n log(n)) architecture, in this case because the cost of communicating with subshards of subshards becomes too high, you can run out of planet first.
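A toy sketch of that dictionary-server tree (illustrative only, not GAE or BigTable code, with the fanouts shrunk to a handful so it runs): each lookup walks one level per tier, so the per-operation cost grows with the log of the total number of shards while each node only indexes a bounded number of children.

```python
class Shard:
    """Bottom-level shard holding actual key/value data."""
    def __init__(self):
        self.data = {}
    def get(self, key):
        return self.data.get(key)
    def put(self, key, value):
        self.data[key] = value

class DictionaryServer:
    """Routes a key to one of a bounded number of children, which may be
    data-holding shards or further dictionary servers (a very wide tree)."""
    def __init__(self, children):
        self.children = children
    def route(self, key):
        return self.children[hash(key) % len(self.children)]

def find_shard(node, key):
    hops = 0
    while isinstance(node, DictionaryServer):
        node = node.route(key)
        hops += 1  # one extra lookup (roundtrip) per dictionary level
    return node, hops

# Two dictionary levels over 4 x 4 = 16 shards; a real system would use a
# fanout of billions per level rather than 4.
leaves = [DictionaryServer([Shard() for _ in range(4)]) for _ in range(4)]
root = DictionaryServer(leaves)

shard, hops = find_shard(root, "example-key")
shard.put("example-key", "value")
shard, hops = find_shard(root, "example-key")
print(shard.get("example-key"), hops)  # 'value' found after 2 dictionary hops
```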
This can be generalised: if you imagine that each shard performs arbitrary work on the data sent to it, and that when the data is read back you get the results of the processing on that data, you get a scalable system which does any processing on a dataset that can be done by processing chunks of data independently from one another. Image or voice recognition matching a single sample against a huge dataset would be an example.
This isn't to trivialise the issues of parallelising algorithms. Figuring out a scalable equivalent to a non-parallel algorithm is hard. Scalable databases, for example, don't support the same set of queries as a simple MySQL server, because a MySQL server implements some queries by iterating over all the data and there's no known way to perform them in a scalable way. Instead, software using them finds other ways to implement the feature.
However, scalable-until-you-run-out-of-planet distributed systems are quite possible, and there are some scalable distributed systems doing pretty complex tasks. Search engines are the best example which comes to mind of systems which bring data together and do complex synthesis with it. Amazon's store would be another scalable system which coordinates a substantial amount of real world work.
The only question is whether a (U)FAI specifically can be implemented as a scalable distributed system, and considering the things we know can be divided or done scalably, as well as everything which can be done with somewhat-desynchronised subsystems which correct errors later (or even are just sometimes wrong), it seems quite likely that (assuming one can be implemented at all) it could implement its work in the form of problems which can be solved in a scalable fashion.
This model trivially shows that censoring the espousing of violence is a bad idea if and only if you accept the given premise that such censorship is a substantial PR negative. This premise is a large part of what the dispute is about, though.
Not everyone is you; a lot of people feel positively about refusing to provide a platform to certain messages. I observe a substantial amount of time expended by organisations on simply signalling opposition to things commonly accepted as negative, and avoiding association with those things. LW barring espousing violence would certainly have a positive effect through this.
Negative effects from the policy would be that people who do feel negatively about censorship, even of espousing violence, would view LW less well.
The poll in this thread indicates that a majority of people here would be in favour of moderators being able to censor people espousing violence. This suggests that for the majority here it is not bad PR by reason of censorship alone, since they agree with its imposition. I would expect people outside LW to have an even stronger preference in favour of censorship of advocacy of unthinkable dangerous ideas, suggesting a positive PR effect.
Whether people should react to it in this manner is a completely different matter, a question of the just world rather than the real one.
And this is before requiring any actual message to be censored, before considering the impact of any such censorship, and before considering what the particular concerns of the people who particularly need to be attracted are.
I think in this context, "asking about" might include raising for neutral discussion without drawing moral judgements.
The connection I see between them is that if someone starts neutral discussion about a possible action, actions which would reasonably be classified as advocacy have to be permitted if the discussion is going to progress smoothly. We can't discuss whether some action is good or bad without letting people put forward arguments that it is good.
I think that a discussion in which only most people are mindkilled can still be a fairly productive one on these questions in the LW format. LW is actually one of the few places where you would get some people who aren't mindkilled, so I think it is actually good that it achieves this much.
They seem fairly ancillary to LW as a place for improving instrumental or epistemic rationality, though. If you think testing the extreme cases of your models of your own decision-making is likely to result in practical improvements in your thinking, or you just want to test yourself on difficult questions, these things seem like they might be a bit helpful, but I'm comfortable with them being censored as a side effect of a policy with useful effects.
Ah, I see. That makes sense. They weren't actually asked to remove the whole of the quoting, just to remove some unrelated lines, which has been complied with, so there are no unimplemented requests as far as I know.
Of course, it might just not have been asked for because having it pulled at this point could cause a worse mess than leaving it up, with more reputation damage. Some third-party moderator could request it to avoid that issue, but I think at this point the horse is long gone and going to the work of closing the barn door might not be worth it.
It'd be reasonable for a hypothetical moderator taking an appropriate action to request they replace the whole thing with a summary, though; that makes sense.
Quoting without permission was clearly a mistake, but describing it as a "rather clear privacy agreement" is not particularly apt; Freenode policy on this is written as strong advice rather than "rules" as such, and the channel itself had no clear policy. As it was, it was mostly a social convention violation. I thus have to disagree that an indefinite ban for ignorance of advice or an unwritten policy would be an appropriate or optimum response. What's happened so far- the person being corrected quite sharply here and on the channel, and a clear privacy agreement added to the IRC channel topic for next time- seems like a reasonable remedy.
More specifically, the Freenode policy item in question is entitled "If you're considering publishing channel logs, think it through.", the section on constant public logging by the channel staff says "should" throughout, and the bit at the end about quoting publicly as a user ends with "Avoid the temptation to publish or distribute logs without permission in order to portray someone in a bad light. The reputation you save will most likely be your own." rather than stating that it is actually a violation of anything in particular.
What is fairly solid Freenode policy, though, is that unofficial channels have to use the ## format, with the # format reserved for generally official project channels. I don't know if the Less Wrong site admins and the #lesswrong admins overlap, but if hypothetically Less Wrong wanted to disaffiliate #lesswrong, it would actually be entirely possible for Less Wrong administrators to force #lesswrong to, at the least, migrate to ##lesswrong or to a different IRC network.
As a #lesswrong user since I started reading the Sequences, though, I don't think this is a good idea. Having a real-time discussion channel is a nice thing for those who benefit from it. The IRC channel, listed on the wiki, was the first place I gravitated towards for discussing LW stuff, preferring it to comments. It is fairly Less Wrong focused; links to and discussions of Less Wrong posts are the key focus, even if a lot of other interesting conversations, evaluations, thoughts, etc. perhaps get more actual conversation time. What you remember as having bled over is unrepresentative, I feel.
"The morally (and socially) appropriate thing to do" would be to learn the difference between a chat and a public forum before jumping to hasty conclusions.
The conclusions drawn, while erroneous, were erroneous for reasons unrelated to the difference between an IRC channel and a public forum. Being wrong to post logs is not what made them wrong to think they were being insulted. Strongly establishing that they made an error in quoting from the channel here does not establish that their issue is groundless.
Conflation of issues like this is exactly why it is normally a faux pas to mention errors by a person which are unrelated to their complaint when responding to that complaint; such errors should be brought up separately.
Edit: To be more specific about the conflation I'm pointing at... the "hasty conclusions" they came to are not made less plausible by knowing about the "difference between a chat and a public forum". Knowing that quoting is not socially normal does not make the conclusion "the things said were serious insults and there are unfriendly social norms here" less likely. That lack of knowledge thus does not invalidate the conclusion, or the presence of issues or mistakes leading them to that conclusion.
Edit 2: And to be more specific about why this matters: it's a claim which doesn't actually make any sense but works as a snappy comeback. It's not actually a rebuttal to what it's replying to, because the two don't conflict, seeing as they're talking about the morally/socially correct actions for different people, but it takes the form of one, carrying negative signals about what it replies to which it doesn't actually justify. It also conveys substantial negative connotations towards the person complaining, and rhetoric running people down isn't nice. Its not making sense is a thing which should be noticed, so it can be deliberately discounted.
This is just so utterly over the top I'm mystified that it was taken as anything but ritual insulting for the purpose of bonding/hazing in an informal group.
You've been lucky to avoid seeing jokes like this more often when moving around the Internet, then. Over-the-top jokes at the expense of minority groups are popular as expressions of actual opinions, not just as jokes to people you already know, particularly in communities where those opinions are accepted norms and the group in question is an acceptable target. The desire to score points often leads to gross caricatures of such acceptable targets being thrown around. It's repugnant, but not that unusual. I've seen plenty of worse things said about gay people when trawling around.
To anyone who knows that these opinions aren't actually accepted norms, from time spent in #lesswrong, they're obvious jokes. But for a fairly new arrival, in the absence of this knowledge, and possibly with more experience of genuinely unpleasant communities, it's not an unreasonable interpretation.
It's true that with all the information available now, a simple private message would have cleared it up. It's also true, though, that with all the information available now, simply not saying those specific lines would have avoided the whole issue in the first place. It was not realistic to expect either party to have known that at the time.
It isn't reasonable to expect someone who feels they have been insulted, and who has already responded in public with complaints like "what a disgusting thing to say", and observed everyone fail to care, to go PM the person- the very high status person- with a direct complaint. As far as they're concerned, they already tried complaining and the person didn't care. There would be no reason for them to expect this to be productive, and it would likely feel very intimidating. No one in the channel seemed a reasonable source for help; the operators were presumably fine with it, gwern being one of them.
Considering the situation myself, with the knowledge that one would actually have in the situation, the only reasonable alternative to asking for help on Less Wrong itself is leaving the channel, and we should be glad they didn't take that option, because if they did, we not only lose them, but never know why, and lose the chance to reduce the odds of this happening in future.
And as far as gwern was concerned, he was just joking and startling was playing along. He didn't recognise that this was actual offence at the time, and that's not something he can be blamed for either. Double illusion of transparency never stopped being a thing.
This mess did not arise because either party was an idiot, and advice and reactions to it are going to need to be more complex than "should have just done the obvious thing, stupid". There are some good results already. The clarification to those around now that the people in the channel do not collectively-or-in-general endorse the views, which were originally said as a joke, is at the least a good thing. This should also at least result in some updating on the probable meaning of other people's responses.
Avoiding misunderstandings like this happening again is not an easy problem. To an extent I'd expect events like this to be an ongoing cost of operating a community where jokes of that nature are accepted. One shouldn't expect moderation policy debates to be one-sided. But I think we can do better. The ombudsman idea is interesting. Another is anyone in the channel saying something which clarifies the situation when someone seems like they might be insulted; I feel kinda guilty for not doing this myself when the first quoted event happened (I'm Namegduf there), since I was around at the time and talked to at least one other person who was genuinely bothered by it. There's useful discussion to be had there.
No one has ever prefaced such a statement with "for your purposes." There is a reason for that.
It actually occurs fairly often. A good reason to preface such a statement with "for your purposes" is to indicate that modelling the statement as true is effective for achieving your purposes, without getting into a more complex discussion of whether it actually is true or not.
For example, "for your purposes, the movement of objects is described by Newtonian physics". The statement after "for your purposes" is ill-defined (what exactly does 'described' mean?) as an actual claim about the universe, but the sentence as a whole is a useful empirical and falsifiable statement, saying that you can assume Newtonian physics are accurate enough for whatever you're currently doing.
As a second example, it might be true to say to an individual walker arriving at a bridge, "for your purposes, the bridge is safe to walk over", while for the purposes of a parade organiser, they cannot simply model the bridge as safe to walk over, but may need to consider weight tolerances and think in terms of more precise statements about what the bridge can support.
For your purposes as a human being in a typical situation who doesn't want to signal negative things about and to any and all transgender people, you should behave in line with gender identity as innate. It is a reasonable piece of advice relating to social etiquette in this area.
It isn't necessary to get into demonstrating the probable truth of this (including breaking down the definition of 'innate') to give this advice and the original quote decided to avoid starting that argument, which seems like a reasonable call.
Statements like this do make some assumptions about what your purposes are- in the bridge example I gave, the speaker is assuming the walker is not a parade organiser considering leading a parade over the bridge. Such assumptions and guesses about the audience's purposes are unavoidable when giving advice, though, and this particular one seems quite reasonable. In no case do these assumptions "subjugate" you to make them correct.
Took the survey; doing all the extra tests for the last few extra questions was fairly interesting, not having done many personality tests or taken online IQ tests before.
This is interesting, particularly the idea of comparing wage growth against welfare growth predicting success of "free money" welfare. I agree that it seems reasonably unlikely that a welfare system paying more than typical wages, without restrictions conflicting with the "detached from work" principle, would be sustainable, and identifying unsustainable trends in such systems seems like an interesting way to recognise where something is going to have to change, long-term.
I appreciate the clarification; it provides what I was missing in terms of evidence or reasoned probability estimates over narrative/untested model. I'm taking a hint from feedback that I likely still communicated this poorly, and will revise my approach in future.
Back on the topic of taking these ideas as principles, perhaps more practical near-term goals which provide a subset of the guarantee, like detaching the availability of resources for basic survival from the availability of work, might be more likely to be achievable. There is a wider range of options available for implementing these ideas, and of incentives/disincentives to avoid long-term use. An example which comes to mind is providing users with credit usable only to order basic supplies and basic food. My rough estimate is that it seems likely that something in this space could be designed to operate sustainably with only the technology we have now.
On the side, relating to generation Facebook, my model of the typical 16-22 year old today would predict that they'd like to be able to buy an iPad, go to movies, afford alcohol, drive a nice car, go on holidays, and eventually get most of the same goals previous generations sought, and that their friends will also want these things. At younger ages, I agree that parental pressure wouldn't be typically classified as "peer pressure", but I still think it likely to provide significant incentive to do school work; the parents can punish them by taking away their toys if they don't, as effectively as for earlier generations. My model is only based on my personal experience, so mostly this is an example of anecdotal data leading to different untested models.
It is true that in the long run, things could work out worse with a guarantee of sufficient food/supplies for everyone. I think, though, that this post answers the wrong question; the question to answer in order to compare consequences is how probable it is to be better or worse, and by what amounts. Showing that it "could" be worse merely answers the question "can I justify holding this belief" rather than the question "what belief should I hold". The potential benefits of a world where people are guaranteed food seem quite high on the face of it, so it is a question well worth asking seriously... or would be if one were in a position to actually do anything about it, anyway.
Prisoners' dilemmas amongst humans with reputation and social pressure effects do not reliably work out with consistent defection, and models of societies (and of students) can easily predict almost any result by varying the factors they model and how they do so, so they contribute very little evidence in the absence of other evidence that they generate accurate predictions.
The only reliable information that I am aware of is that we know that states making such guarantees can exist for multiple generations with no obvious signs of failure, at least with the right starting conditions, because we have such states existing in the world today. The welfare systems of some European countries have worked this way for quite a long time, and while some are doing poorly economically, others are doing comparably well.
I think that it is worth assessing the consequences of deciding to live by the idea of universal availability of supplies, but they are not so straightforwardly likely to be dire as this post suggests, requiring a longer analysis.
The problem with this argument is that there are costs to causing things to happen via spreading misinformation; you're essentially biasing other people doing expected utility evaluations by providing inaccurate data to them. People drawing conclusions based on inaccurate data would have other effects; in this example, some people would avoid flying, suffering additional costs. People are also likely to continue to support the goals the conspiracy theory pushes towards past the point that they actually would have the greater expected utility without the conspiracy theory's influence on probability estimates, causing bad decisions later.
It's possible that after factoring all this in, it could be worthwhile in some cases. But given the costs involved I think, prior to any deeper study of the situation, it would be more likely harmful than beneficial in this specific example.
Looking at the press association example, I think that one problem here is that similar ideas are being blurred, and given a single probability instead of separate ones.
A lot of the theories involving press/politician association involve conspiracy to conceal specific, high-impact information from the public, or similar levels of dysfunction of the media. Most of these are low probability (I can't think of any counterexamples offhand); as far as I know, either none or only a very small percentage of such theories have been demonstrated as true over time.
Different theories involving association have different probabilities. The Leveson Inquiry is providing reasonably strong evidence for influence and close social connections, so the proposition that that existed would seem to have been fairly accurate.
I don't know what exactly you heard described as a conspiracy theory, in the fairly large space of possible theories, but it seems to me that that example is a good case where it is important to review the evidence for, and recognise fallacies (including overestimation of the probability of agency) in a specific theory, rather than decide what classification of theory it falls into, and judge it based on whether theories in that classification are generally "conspiracy theories".