And a couple of years later, I've not adopted this full-time, but I keep coming back to it and making incremental improvements.
This resonated with me instantly, thank you!
I now remember, I used to do something similar if I needed to make decisions, even minor ones, when drunk. I'd ask, "What would I think of this decision sober?" If the answer was "it's silly" or "I'd want to do it but be embarrassed", I'd go ahead and do it. But if the answer was "Eek, obviously unsafe", I'd assume my sober self was right and I was currently overconfident.
For what it's worth, I quibbled with this at the time, but now I find it an incredibly useful concept. I still wish it had a more transparent name -- we always call it "the worst argument in the world", and can't remember "noncentral fallacy", but it's been really useful to have a name for it at all.
I think this is a useful idea, although I'm not sure how useful this particular example is. FWIW, I definitely remember this from revising maths proofs -- each proof had some number of non-obvious steps, and you needed to remember those. Sometimes there was just one, and once you had the first line right, even if there was a lot of work to do afterwards, it was always "simplifying in the obvious way", so the rest of the proof was basically "work it out, don't memorise it". Other proofs had LOTS of non-obvious ideas and were a lot harder to remember even if they were short.
FWIW I think of activities that cost time like activities that cost money: I decide how much money/time I want to spend on leisure, and then insist I spend that much, hopefully choosing the best way possible. But I don't know if that would help other people.
I guess "unknown knowns" are the counterpoint to "unknown unknowns" -- things it never occurred to you to consider, but didn't. Eg. "We completely failed to consider the possibility that the economy would mutate into a continent-sized piano-devouring shrimp, and it turned out we were right to ignore that."
FWIW, I always struggle to embrace it when I change my mind ("Yay, I'm less wrong!")
But I admit I find it hard: "advocating a new point of view" is a lot easier than "admitting I was wrong about a previous point of view", so maybe striving to do #1 whether or not you've done #2 would help me change my mind in response to new information a lot quicker?
http://phenomena.nationalgeographic.com/2013/11/26/welcome-to-the-era-of-big-replication/
When he studied which psychological studies were replicable, and had to choose whether to disbelieve some he'd previously based a lot of work on, Brian Nosek said:
I choose the red pill. That's what doing science is.
(via ciphergoth on twitter)
I don't like a lot of things he did, but that's the second piece of very good advice I've heard from Rumsfeld. Maybe I need to start respecting his competence more.
Do we make suggestions here or wait for another post?
A few friends are Anglo-Catholic (ie. members of the Church of England or equivalent -- not Roman Catholic, but catholic; I believe it's similar to Episcopalian in the USA?), and they weren't sure if they counted as "Catholic", "Protestant" or "Other". It might be good to tweak the names slightly to cover that case. (I can ask for preferred options if it helps.)
https://fbcdn-sphotos-f-a.akamaihd.net/hphotos-ak-prn2/1453250_492554064192905_1417321927_n.jpg http://en.wikipedia.org/wiki/Anglo-Catholicism
I took the survey.
I think most of my answers were the same as last year, although I think my estimates have improved a little, and my hours of internet have gone down, both of which I like.
Many of the questions are considerably cleaned up -- many thanks to Yvain and everyone else who helped. It's very good it has sensible responses for gender. And IIRC, the "family's religious background" question was tidied up a bit. I wonder if anyone can answer "atheist" as their religious background? I hesitated over the response, since the last religious observance I know of for sure was G being brought up catholic, but I honestly think living in a protestant (or at least, anglican) culture is a bigger influence on my parents' cultural background, so I answered like that.
I have no idea what's going to happen in the raffle. I answered "cooperate" because I want to encourage cooperating in as many situations as possible, and don't really care about a slightly-increased chance of < $60.
I emphatically agree with that, and I apologise for choosing a less-than-perfect example.
But when I'm thinking of "ways in which an obviously true statement can be wrong", I think one of the prominent ones is "having a different definition than the person you're talking to, but both assuming your definition is universal". That doesn't matter if you're always careful to distinguish between "this statement is true according to my internal definition" and "this statement is true according to commonly accepted definitions", but if you're 99.99% sure of your definition, it's easy NOT to specify (eg. in the first sentence of the post).
Yeah, that's interesting.
I agree with Eliezer's post, but I think that's a good nitpick. Even if I can't be that certain about 10,000 statements consecutively because I get tired, I think it's plausible that there are 10,000 simple arithmetic statements which, if I understand them, check them from my own knowledge, and remember seeing them in a list on wikipedia (which is what I did for 53), I've only ever been wrong on once. I find it hard to judge the exact amount, but I definitely remember thinking "I thought that was prime but I didn't really check and I was wrong", whereas I don't remember thinking "I checked that statement and then it turned out I was still wrong" for something that simple.
Of course, it's hard to be much more certain. I don't know what the chance is that (eg) mathematicians change the definition of prime -- that's pretty unlikely, but similar things have happened before that I thought I was certain of. But rarely.
I think the problem may be what counts as correlated. If I toss two coins and both get heads, that's probably coincidence. If I toss two coins N times and get HH TT HH HH HH TT HH HH HH HH TT HH HH HH HH HH TT HH TT TT HH then there's probably a common cause of some sort.
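(To put a rough number on why the second case looks like more than coincidence: assuming fair, independent coins, and counting the 21 matching pairs in the example above, a quick back-of-the-envelope check -- purely my own illustration -- gives:)

```python
# If the two coins were fair and independent, each toss would match (HH or TT)
# with probability 1/2. The example sequence above has 21 tosses, all matching.
n_matching_tosses = 21
p_if_independent = 0.5 ** n_matching_tosses
print(p_if_independent)  # ~4.8e-07, so "pure coincidence" is a very poor explanation
```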
But real life is littered with things that look sort of correlated -- like the price of X and the price of Y both (a) going up over time and (b) shooting up temporarily when the roads are closed, but not otherwise being correlated -- and it's not clear when the principle should apply (even though I agree it's a good one).
Note: this isn't always right. Anyone giving advice is going to SAY it's true and non-obvious even if it isn't. "Don't fall into temptation" etc etc. But that essay was talking about mistakes which he'd personally, empirically observed many times and proposed counter-actions to, and which he obviously could describe in much more detail if necessary.
And the fourth largest country of any sort :)
That's an interesting thought, but it makes me rather uncomfortable.
I think partly, I find it hard to believe the numbers (if I have time I will read the methodology in more detail and possibly be convinced).
And partly, I think there's a difference between offsetting good things, and offsetting bad things. I think it's plausible to say "I give this much to charity, or maybe this other charity, or maybe donate more time, or...". But even though it sort of makes sense from a utilitarian perspective, I think it's wrong (and most people would agree it's wrong) to say "I'll kick this puppy to death because I'm a sadist/a modern artist/whatever, but I'll give more money to charity afterwards." Paying someone else to be vegetarian sounds at least as much like the latter as the former to me.
Hm. My answers were:
Anti-procrastination: "This fit with things I'd tried to do before in a small way, but went a lot farther, and I've repeatedly come back to it and feel I've made intermittent improvements by applying one or another part of it, but not really in a systematic way, so can't be sure that that's due to the technique rather than just ascribing any good results I happen to get to this because it sounded good."
Pomodoro: "I've tried something similar before with intermittently good results and would like to do so more than I do. I don't know whether the trappings of pomodoro significantly improve on that."
Exercise: "I feel good on the occasions when I exercise, but it doesn't seem to produce a measurable performance increase -- it may halt a performance decrease."
I decided to fit those into the boxes as best I could rather than write in, but I wasn't sure.
I agree that the answers to these questions depend on definitions
I think he meant that those questions depend ONLY on definitions.
As in, there's a lot of interesting real-world knowledge that goes into getting a submarine to propel itself, but now that we know that, asking "can a submarine swim?" is only interesting for deciding "should the English word 'swim' apply to the motion of a submarine, which is somewhat like the motion of swimming, but not entirely?". That example sounds stupid, but people waste a lot of time on the similar case of "think" instead of "swim".
Anna Salamon and I usually apply the Tarski Method by visualizing a world that is not-how-we'd-like or not-how-we-previously-believed, and ourselves as believing the contrary, and the disaster that would then follow.
I find just that description really, really useful. I knew about the Litany of Tarski (or Diax's Rake, or believing something just because you wanted it to be true) and have the habit of trying to preemptively prevent it. But that description makes it a lot easier to grok at a gut level.
And yet it seems to me - and I hope to you as well - that the statement "The photon suddenly blinks out of existence as soon as we can't see it, violating Conservation of Energy and behaving unlike all photons we can actually see" is false, while the statement "The photon continues to exist, heading off to nowhere" is true.
I remember when you drew this analogy to different interpretations of QM, and I've been thinking it over.
The way I put it to myself was that the difference between "laws of physics apply" and "everything acts AS IF the laws of physics apply, but the photon blinks out of existence" is not falsifiable, so for our current physics, the two theories are actually just different reformulations of the same theory.
However, Occam's razor says that, of the two theories, the right one to use is "laws of physics apply" for two reasons: firstly, that it's a lot simpler to calculate, and secondly, if we ever DO find any way of testing it, we're 99.9% sure that we'll discover that the theory consistent with conservation of energy will apply.
And this sort of belief can have behavioral consequences! ... If we thought the colonization ship would just blink out of existence before it arrived, we wouldn't bother sending it.
Excellent point!
FWIW, this is one of my favourite articles. I can't say how much it would help everyone -- I think I read it when I was just at the right point to think about procrastination seriously. But I found the analytical breakdown into components an incredibly helpful way to think about it (and I love the sniper rifle joke).
Tone arguments are not necessarily logical errors
I think people's objections to tone arguments have often been misinterpreted because (ironically) the objections are often explained more emotively and less dispassionately.
As I understand it, the problem with "tone arguments" is NOT that they're inherently fallacious, but rather that they're USUALLY (although not necessarily) rude and inflammatory.
I think a stereotypical exchange might be:
A says something inadvertently offensive to subgroup Beta.
B says "How dare you? Blah blah blah."
A says "Don't get so emotional! Also, what you said is wrong because p, q and r."
C says "Hey, no tone arguments, please."
A is correct that B's point might be more persuasive if it were less emotional and were well-crafted to be persuasive to people regardless of whether they're already aware of the issues, and A is often correct about p, q and r (whether they're substantive rebuttals of the main point, or just quibbles). But if B fails to put B's argument in the strongest possible form, it's A's responsibility to evaluate the stronger form of the argument, not just critique B for not providing it. And C pointed that out, just in a way that might unfortunately be opaque to A.
I don't know if the idea works in general, but if it works as described I think it would still be useful even if it doesn't meet this objection. I don't foresee any authentication system which can distinguish between "user wants money" and "user has been blackmailed to say they want money as convincingly as possible and not to trigger any hidden panic buttons", but even if it doesn't, a password you can't tell someone would still be more secure because:
- you're not vulnerable to people ringing you up and asking what your password is for a security audit, unless they can persuade you to log on to the system for them
- you're not vulnerable to being kidnapped and coerced remotely, you have to be coerced wherever the log-on system is
I think the "stress detector" idea is one that is unlikely to work unless someone works on it specifically to tell the difference between "hurried" and "coerced", but I don't think the system is useless because it doesn't solve every problem at once.
OTOH, there are downsides to being too secure: you're less likely to be kidnapped, but it's likely to be worse if you ARE.
The impression I've formed is that physicists have a pretty good idea what's pretty reliable (the standard model) and what's still completely speculative (string theory), but at some point the popular-science pipeline communicating that difference to intelligent, scientifically literate non-physicists broke down. So I became broadly cynical about non-experimentally-verified physics in general, when with more information I'd have been able to make much more accurate predictions about which ideas were very likely and which were basically just guesses.
I'd not seen Eliezer's post on "0 and 1 are not probabilities" before. It was a very interesting point. The link at the end was very amusing.
However, it seems he meant "it would be more useful to define probabilities excluding 0 and 1" (which may well be true), but phrased it as if it were a statement of fact. I think this is dangerous and almost always counterproductive -- if you mean "I think you are using these words wrong" you should say that, not give the impression you mean "that statement you made is false according to your own interpretation of those words".
I once skimmed "How to Win Friends and Influence People". I didn't read enough to have a good opinion of the advice (I suspect djcb's description is right: it's fairly good advice as long as the author's experience generalises well, which HTWFAIP probably does better than many books, though not perfectly).
However, what had a profound influence on me was that though there's an unfortunate stereotype of people who've read too much Carnegie seeming slimy and fake, the author seemed to genuinely want to like people and be nice to them, which I thought was lovely.
It seems to me that Eliezer's post was a list of things that typically seem, in the real world, to be components of people's happiness, but are commonly missed out when people propose putative (fictional or futuristic) utopias.
It seemed to me that Eliezer was saying "If you propose a utopia without any challenge, humans will not find it satisfying", not "It's possible to artificially provide challenge in a utopia".
Hm. Now you say it, I think I've definitely read some excellent non-Eliezer articles on Less Wrong. But not as systematically. Are they collated together ("The further sequences") anywhere? I mean, in some sense, "all promoted articles" is supposed to serve that function, but I'm not sure that's the best way to start reading. And there are some good "collections of best articles". But they don't seem as promoted as the sequences.
If there's not already, maybe there should be a bit of work in collecting the best articles by theme, and seeing which of them could do with some revising to make whatever is (in retrospect) the best point more clear. Preferably enough revising (or just disclaimers) to make it clear that they're not the Word of God, but not so much that they become bland.
Awesome avoidance of potential disagreement in favour of cooperation for a positive-sum result :)
I agree (as a comparative outsider) that the polite response to Holden is excellent. Many (most?) communities -- both online communities and real-world organisations, especially long-standing ones -- are not good at it for lots of reasons, and I think the measured response of evaluating and promoting Holden's post is exactly what LessWrong members would hope LessWrong could do, and they showed it succeeded.
I agree that this is good evidence that LessWrong isn't just an Eliezer-cult. (The true test would be if Eliezer and another long-standing poster were dismissive of the post, and then other people persuaded them otherwise. In fact, maybe people should roleplay that or something, just to avoid getting stuck in an argument-from-authority trap, but that's a silly idea. Either way, the fact that other people spoke positively, and Eliezer and other long-standing posters did too, is a good thing.)
However, I'm not sure it's as uniquely a victory for the rationality of LessWrong as it sounds. In response to srdiamond, Luke quoted tenlier saying "[Holden's] critique mostly consists of points that are pretty persistently bubbling beneath the surface around here, and get brought up quite a bit. Don't most people regard this as a great summary of their current views, rather than persuasive in any way?" To me, that suggests that Holden did a really excellent job expressing these views clearly and persuasively. However, it also suggests that previous people had tried to express something similar, but it hadn't been expressed well enough to be widely accepted, and people reading had failed to sufficiently apply the dictum of "fix your opponents' arguments for them". I'm not sure if that's true (it's certainly not automatically true), but I suspect it might be. What do people think?
If there's any truth to that, it suggests one good answer to the recent post http://lesswrong.com/lw/btc/how_can_we_get_more_and_better_lw_contrarians (whether or not more contrarians are desirable in general) would be a rationalist exercise for someone familiar with/to the community and good at writing rationally: take a survey of contrarian views on a topic that people in the community may have had but not been able to express, skip the showmanship of pretending to believe them yourself, and just say "I think what some people think is [well-expressed argument]. Do you agree that's fair? If so, do I and other people think they have a point?" Whether or not that argument is right, it's still good to engage with it if many people are thinking it.
Yes, I'd agree. (I meant to include that in (B).) I mean, in fact, I'd say that "there are no biological differences between races other than appearance" is basically accurate, apart from a few medical things, without any need for tiptoeing around human biases. Even if the differences were a bit larger (as with gender, or even larger than that), I agree with your last parenthesis that it would probably still be a good idea to (usually) _act_ as if there weren't any.
From context, it seems "race realism" refers to the idea that there are legitimate differences between races -- is that correct? However, I'm not sure if it's supposed to refer to biological differences specifically, or any cultural differences. And it seems to be so heavily loaded with connotations I'm unaware of that I would be hesitant to say it was "true" or "not true" even if I knew the answers to the questions in the first two sentences.
Let me try to summarise the obvious parts of the situation as I understand it. I contend that:
(A) There are some measurable differences between ethnicities that are most plausibly attributed to biological differences. (There are some famous examples, such as greater susceptibility of some people to skin cancer, or sickle cell anemia. I assume there are smaller differences elsewhere. If anyone seriously disagrees, say so.)
(B) These are massively dwarfed by the correlation of ethnicity with cultural differences in almost all cases.
(C) There is a social taboo against admitting (A)
(D) There is a large correlation between ethnicity and various cultural factors, and between those cultural factors themselves.
(E) It is sometimes possible to draw probabilistic inferences based on (D). Eg. With no other information, you may guess that someone on the street in London is more likely to be a British citizen if they are Indian than East Asian (or vice versa, whichever is true).
(F) The human brain deals very badly with probabilistic inferences. If you guess someone's culture based on their ethnicity or dress, you are likely to maintain that view as long as possible even in the face of new information, until you suddenly flip to the opposite view. Because of this, there is (rightly IMHO) a social taboo against doing (E) even when it might make sense.
(G) People who are and/or think they are good at drawing logical inferences a la (E) but don't have as much personal experience of the pitfalls described in (F) are likely to resent the social taboo described in (F) because it seems fussy and nonsensical to them. I am somewhat prone to this error (not so much with race, but with other things).
(H) The word "racist" is horrendously undefined. It is used both to mean "someone or something which treats people differently based on 'race', rightly or wrongly" (including examples where treating people differently is the only possible thing to do, such as preventative advice for medical conditions, or advice on how to avoid bad racism from other people) and to mean "someone or something which is morally wrong to discriminate based on race." Thus a description of whether something is "racist" is typically counterproductive.
I admit I only skimmed the OP's transcript, but my impression is that he fairly describes why he is frustrated that it is difficult to talk about these issues, but I am extremely leery of a lot of the examples he uses.
I was going to write more, but am not sure how to put it. How am I doing so far...? :)
I would say add [Video]: [Link] would perpetuate the misunderstanding that there may be no immediate content, while [Video] correctly warns people who (for whatever reason) can't easily view arguments in video format.
I think this is directly relevant to the idea of embracing contrarian comments.
The idea of having extra categories of voting is problematic, because it's always easy to suggest, but only worthwhile if people will often want to distinguish them, and distinguishing them will be useful. So I think normally it's a well-meaning but doomed suggestion, and better to stick to just one.
However, whether or not it would be a good idea to actually implement, I think separating "interested" and "agree" is a good way of expressing what happens to contrarian comments. I don't have first-hand experience, but based on what I usually see happening at message boards, I suspect a common case is something like:
Someone posts a contrarian comment. Because they are not already a community stalwart, they also compose the comment in a way which is low-status within the community (eg. bits of bad reasoning, waffle, embedded in other assumptions which disagree with the community).
Thus, people choose between "there's something interesting here" and "In general, this comment doesn't support the norms we want this community to represent." The latter usually wins except when the commenter happens to be popular or very articulate.
The interesting/agree distinction would be relevant in cases like this, for instance:
- I'm pretty sure this is wrong, but I can't explain why, I'd like to see someone else tackle it and agree/disagree
- I think this comment is mostly sub-par, but the core idea is really, really interesting
- I might click "upvote" for a comment I thought was funny, but want a greater level of agreement for a comment I specifically wanted to endorse.
There's a possibly similar distinction between Stack Overflow and Stack Overflow Meta, because negative votes affect user rank on Stack Overflow but not on Meta. On Stack Overflow, voting generally reflects perceived quality; on Meta, it normally means agreement.
I'm not sure I'd advocate this as a good idea, but it seemed an interesting possibility given the problem proposed. FWIW, if it were implemented, it'd want a lot of scrutiny and brainstorming, but my first reaction would be to leave the voting as supposedly meaning "interesting", and usually sort by that, but add a secondary vote meaning "agree" or "disagree" or similar terms that can add a nuance to it.
Edit: Come to think of it, a similar effect is achieved by a social convention of people upvoting the comment, but also upvoting a reply that says "this part good, this part bad". If that happens, it should fulfil the same niche, but I don't know if it is happening enough.
That's an awesome comment. I'm interested which specific cues came up that you realised each other didn't get :)
Perhaps the right level of warning is to say "Cambridge UK" in the title and first line, but not take a position on whether other people are likely to be interested or not..?
I've been reading the answers and trying to put words into what I want to say. Ideally people will experience not just being more specific, but experience that when they're more specific, they immediately communicate more effectively.
For instance, think of three or four topics people probably have an opinion on, starting with innocuous (do you like movie X) and going on to controversial (what do you think of abortion). Either have a list in advance, or ask people for examples. Perhaps have a shortlist and let people choose, or suggest something else if they really want?
I picked the movie example because it's something people usually feel happy to talk about, but can be very invested in their opinion of. Ideally it's something people will immediately disagree about. I don't think this is difficult -- in a group of 10, I'd expect to name only one or two movies before people disagreed, even though social pressure usually means they won't immediately say so.
Step 1 Establish that people disagree, and find it hard to come to an agreement. This should take about 30s. People will hopefully "agree to disagree" but not actually understand each other's position. Eg. "Star Wars was great, it was so exciting." "Star Wars was boring and sucked and didn't make any sense."
Step 2 Ask WHAT people like about it. Encourage people to give specific examples at first ("eg. I loved it when Luke did X") and then draw generalisations from them ("I really empathised with Luke and I was excited that he won" "I've read stories about farmboys who became heroes before, I already know what happens, bring me some intellectual psychological fare instead"). Emphasise that everyone is on the same side, and they shouldn't worry about being embarrassed or being "wrong".
Step 3 Establish that (probably) they interpreted what the other person said in terms of what they were thinking (eg. "How can blowing up a spaceship be boring") when actually the other person was thinking about something they hadn't thought of (eg. "OK, I guess if you care about the physics, it would be annoying that they are completely and utterly made up, it just never occurred to me that anyone would worry about that.")
I may be hoping too much, but this is definitely the sort of process I've gone through to rapidly reach an understanding with someone when we previously differed a lot, and for some simple examples, it doesn't seem too much to hope we can do so that rapidly. Now, go through the process with two to four statements, ending with something fairly controversial.
Hopefully (this is pure speculation, I've not tried it), giving specific examples will lead to people actually reaching understandings, imprinting the experience as a positive and successful one. Then encourage people to say "Can you give me an example of when [bad thing] would be as bad as you feel?" as often as possible. Give examples where being specific is more persuasive (eg. "We value quality" vs "We aim for as few bugs as possible" vs "We triage bug reports as they come in. All bugs we decide to fix are fixed before the next version is released", or "we will close loopholes in the tax code" vs "we will remove the tax exemption on X"), and encourage people to shout out more.
For that matter, I couldn't stop my mind throwing up objections like "Frodo buys off-the-rack clothes? From where exactly? Surely he'd have tailor made? Wouldn't he be translated into British English as saying 'trousers'? Hobbit feet are big and hairy for Hobbits, but how big are they compared to human feet -- are their feet and inches 2/3 the size?"
It didn't occur to me until I'd read past the first two paragraphs that we were even theoretically supposed to ACTUALLY guess what size Frodo would wear. And I'm still unsure if the badness of the Frodo example was supposed to be part of the joke or not -- I mean, it's fairly funny if it is, but it's the sort of mistake (a bad example for a good point) I'd expect to see even made by intelligent, competent writers.
And I mean, I'm fairly sure that the fictional bias effect is real :)
The explanation of the current system, and how to view it in a rationalist manner was really interesting.
The problem as you state it seems to be that the court (and people in general) have a tendency to evaluate each link in a chain separately. For instance, if there was one link with an 80% chance of being valid, both a court and a bayesian would say "ok, lets accept it provisionally for now", but if there's three or four links, a court might say "each individual link seems ok, so the whole chain is ok" but a Bayesian would say "taken together that's only about 50%, that's nowhere near good enough."
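(To make the arithmetic concrete -- this is just my own illustration, assuming the links are independent and each has an 80% chance of being valid:)

```python
# Probability the whole chain holds, if every link must hold and the links
# are independent, each with an 80% chance of being valid.
p_link = 0.8
for n_links in (1, 2, 3, 4):
    print(n_links, round(p_link ** n_links, 2))
# 1 0.8, 2 0.64, 3 0.51, 4 0.41 -- each link "seems ok", but the chain doesn't
```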
It seems likely this is a systematic problem in the legal system -- I can imagine many situations where one side just about persuades the jury or judge of a long chain of things, but the null hypothesis of "it's just coincidence" isn't given sufficient consideration. However, all the examples you give show the bar to introducing it is already very high, so do you have a good reason to think introducing a hard limit would do more good than harm?
It would be incredible if there could be a general principle in court that chaining together evidence had to include some sort of warning from the judge about how it could be unreliable and when it was too unreliable to admit (although I agree putting specific probabilities on things is probably never going to happen). It would be good if such a principle could be applied to some specific genre of evidence that was currently a real problem, in order to illustrate the general principle. Hearsay is conceptually appropriate, but would it really help? And if not, what kind of case does exhibit systematic miscarriages of justice?
I would be interested in going along to a meet-up at some point, but am not normally free with less than a week's notice :)
[Late night paraphrasing deleted as more misunderstanding/derailing than helpful. Edit left for honesty purposes. Hopeful more useful comment later.]
That sounds reasonable. I agree a complete discussion is probably too complicated, but it certainly seems a few simple examples of the sort I eventually gave would probably help most people understand -- it certainly helped me, and I think many other people were puzzled, whereas with the simple examples I have now, I think (although I can't be sure) I have a simplistic but essentially accurate idea of the possibilities.
I'm sorry if I sounded overly negative before: I definitely had problems with the post, but didn't mean to be negative about it.
If I were breaking down the post into several, I would probably do:
(i) the fact of homomorphic encryption's (apparent) existence, how that can be used to run algorithms on unknown data and a few theoretical applications of that, and a mention that this is unlikely to be practical atm. That it can in principle be used to execute an unknown algorithm on unknown data, but that that is really, really impractical, though it might become more practical with some sort of parallel processing design. And at this point, I think most people would accept when you say it can be used to run an unfriendly AI.
(ii) If you like, more mathematical details, although this probably isn't necessary
(iii) A discussion of friendliness-testing, which wasn't in the original premise, but is something people evidently want to think about
(iv) any other discussion of running an unfriendly AI safely
Thank you.
But if you're running something vaguely related to a normal program, if the program wants to access memory location X, but you're not supposed to know which memory location is accessed, doesn't that mean you have to evaluate the memory write instruction in combination with every memory location for every instruction (whether or not that instruction is a memory write)?
So if your memory is -- say -- 500 MB, that means the evaluation is at least 500,000,000 times slower? I agree there are probably some optimisations, but I'm leery of being able to run a traditional linear program at all. That's why I suggested something massively parallel like the game of life (although I agree that's not actually designed for computation) -- if you have to evaluate all locations anyway, they might as well all do something useful. For instance, if your hypothetical AI was a simulation of a neural network, that would be perfect: if you're allowed to know the connections as part of the source code, you can evaluate each simulated neuron in combination with only the ones it's connected to, which is barely less efficient than evaluating them linearly, since you have to evaluate all of them anyway.
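(Here's a plaintext sketch of the kind of "touch every cell" memory access I have in mind -- the function name and the equality trick are just my own illustration; in a real homomorphic scheme the comparison and the multiply-add would operate on ciphertexts:)

```python
def oblivious_read(memory, secret_addr):
    """Read memory[secret_addr] without the access pattern revealing the address.

    Every cell is touched on every access, which is where the "at least
    500,000,000 times slower" estimate comes from: each read or write costs
    O(len(memory)) work. In a homomorphic setting the selector and the
    arithmetic would be computed on encrypted values; this plaintext version
    only shows the shape of the computation.
    """
    result = 0
    for i, cell in enumerate(memory):
        selector = 1 if i == secret_addr else 0  # homomorphic equality test in the real scheme
        result += selector * cell                # homomorphic multiply-and-add
    return result

memory = [10, 20, 30, 40]
print(oblivious_read(memory, 2))  # 30, but all four cells were touched
```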
I think that's what my brain choked on when I imagined it the first time -- yes, technically, 500,000,000 times slower is "possible in principle, but much slower", but I didn't realise that's what you meant, hence my assumption that parts of the algorithm would not be encrypted. I think if you rewrite this post to make the central point more accessible (which would make it a very interesting post), it would not be hard to include this much explanation about how the encryption would actually be used.
Unless I'm wrong that that is necessary? It wouldn't be hard, just something like "you can even run a whole program which is encrypted, even if you have to simulate every memory location. That's obviously insanely impractical, but if we were to do this for real we could hopefully find a way of doing it that doesn't require that." I think that would be 10,000 times clearer than just repeatedly insisting "it's theoretically possible" :)
Thank you.
Isn't "encrypt random things with the public key, until it finds something that produces [some specific] ciphertext" exactly what encryption is supposed to prevent?? :)
(Not all encryption, but commonly)
"You don't quite understand homomorphic encryption."
I definitely DON'T understand it. However, I'm not sure you're making it clear -- is this something you work with, or just have read about? It's obviously complicated to explain in a blog post (though Schneier http://www.schneier.com/blog/archives/2009/07/homomorphic_enc.html makes a good start), but it surely ought to be possible to make a start on explaining what's possible and what's not!
It's clearly possible for Alice to have some data, Eve to have a large computer, and Alice to want to run a well-known algorithm on her data. It's possible to translate the algorithm into a series of addition and multiplication of data, and thus have Eve run the algorithm on the encrypted data using her big computer, and send the result back to Alice. Eve never knows what the content is, and Alice gets the answer back (assuming she has a way of checking it).
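(As a toy illustration of "computing on encrypted data" -- not the scheme under discussion, just unpadded textbook RSA, which happens to be multiplicatively homomorphic; the tiny numbers are purely for demonstration:)

```python
# Textbook RSA with toy parameters: n = 61 * 53 = 3233, e = 17, d = 2753.
n, e, d = 3233, 17, 2753
encrypt = lambda m: pow(m, e, n)
decrypt = lambda c: pow(c, d, n)

c1, c2 = encrypt(7), encrypt(3)
# Eve multiplies the ciphertexts without ever seeing 7 or 3...
c_product = (c1 * c2) % n
# ...and Alice decrypts the result of the computation.
print(decrypt(c_product))  # 21 == 7 * 3
```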
However, if Alice has a secret algorithm and secret data, and wants Eve to run one on the other, is it possible? Is this analogous to our case?
Is the answer "in principle, yes, since applying an algorithm to data is itself an algorithm, so can be applied to an encryoted algorithm with encryopted data without ever knowing what the algortihm is"? But is it much less practical to do so, because each step involves evaluating much redundant data, or not?
I'm imagining, say, using the game of life to express an algorithm (or an AI). You can iterate a game of life set-up by combining each set of adjacent cells in a straightforward way. However, you can't use any of the normal optimisations on it, because you can't easily tell what data is unchanged? Is any of this relevant? Is it better if you build dedicated hardware?
If my understanding is correct (that you do run with the program and data encrypted at all times) then I agree the scheme does in theory work: the AI has no way of affecting its environment in a systematic way if it can't predict how its data appears non-encrypted.
"I believe that in the future we will be able to resolve this problem in a limited sense by destroying the original private key and leaving a gimped private key which can only be used to decrypt a legitimate output."
I mentioned this possibility, but had no idea if it would be plausible. Is this an inherent part of the cryptosystem, or something you hope it would be possible to design a cryptosystem to do? Do you have any idea if it IS possible?
I definitely have the impression that even if the hard problem a cryptosystem is based on actually is hard (which is yet to be proved, but I agree is almost certainly true), most of the time the algorithm used to actually encrypt stuff is not completely without flaws, which are successively patched and exploited. I thought this was obvious, just how everyone assumed it worked! Obviously an algorithm which (a) uses a long key length and (b) is optimised for simplicity rather than speed is more likely to be secure, but is it really the consensus that some cryptosystems are free from obscure flaws? Haven't now-broken systems in the past been considered nearly infallible? Perhaps someone who knows professional cryptography systems (you, if you do) can clear that up; it ought to be pretty obvious.
LOL. Good point. Although it's a two way street: I think people did genuinely want to talk about the AI issues raised here, even though they were presented as hypothetical premises for a different problem, rather than as talking points.
Perhaps the orthonormal law of less wrong should be, "if your post is meaningful without fAI, but may be relevant to fAI, make the point in the least distracting example possible, and then go on to say how, if it holds, it may be relevant to fAI". Although that's not as snappy as Godwin's :)