Comments
EDIT: The original post now has updated times and links, so refer to that instead.
Here are links to the times suggested, for convenience:
- New York City: Fridays at 1pm
- Paris: Tuesdays at 1pm
- San Francisco: Wednesdays at 1pm
- Melbourne: Tuesdays at 9pm (edited to actually coincide with Paris)
I'd suggest posting meeting times using timeanddate.com, to help avoid confusion about time zones and daylight saving time.
Perhaps what is missing is these rules:
AT = A (1)
AF = F (2)
A + T = T (3)
A + F = A (4)
These can apparently be derived from the given axioms. I'm not sure whether some necessary axioms were omitted.
Using some of these, here's one way to derive B!A=!A from !B=AD:
!B = AD
!B + A = AD + A
!B + A = AD + AT (1)
!B + A = A(D + T) (Distributivity)
!B + A = AT (3)
!B + A = A (1)
!!B!A = !A (negate both sides, De Morgan)
B!A = !A (double negation)
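As a quick sanity check (my own addition, not part of the original derivation), the identity can be verified by brute force over a truth table in Python:

```python
from itertools import product

# Check that whenever the premise !B = A·D holds, the conclusion B·!A = !A
# holds too, over all assignments of A, B, D.
for A, B, D in product([False, True], repeat=3):
    if (not B) == (A and D):                 # premise: !B = AD
        assert (B and not A) == (not A)      # conclusion: B!A = !A

print("B!A = !A holds in every assignment satisfying !B = AD")
```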
I'm in.
The post also mentioned Tolerate Tolerance.
Hi.
I registered and started posting a while back, but since then have reverted to lurking. Partly due to not having time, but I can also identify with reasons some others have given.
Voted down for being off-topic.
Verifying a proof is quite a bit simpler than coming up with the proof in the first place.
Perhaps keep anonymous votes too, but make them worth less or only use them to break ties.
A sequence of wins and non-wins is enough to tell you whether a given approach can result in intelligent behaviour. That alone is enough to make it a useful experiment.
A man with one watch might have the wrong time; a man with two watches is more aware of his own ignorance.
The basement is the biggest.
I like that turn of phrase.
I was about to point out that the fascinating and horrible dynamics of over-the-top threats are covered at length in Strategy of Conflict. But then I realised you're the one who made that post in the first place. Thanks, I enjoyed that book.
It's much easier to limit output than input, since the source code of the AI itself provides it with some patchy "input" about what the external world is like. So there is always some input, even if you do not allow human input at run-time.
ETA: I think I misinterpreted your comment. I agree that input should not be unrestricted.
You sir, have made a gender assumption.
Similar topics were discussed in an Open Thread.
Luckily digital constructs are easier to perfect than wooden ones, although you wouldn't think so given the current state of most software.
It is difficult to constrain the input we give to the AI, but the output can be constrained severely. A smart guy could wake up alone in a room and infer how he evolved, but so long as his only link to the outside world is a light switch that can only be switched once, there is no risk that he will escape.
If Dave holds a consequentialist ethical theory that only values his own life, then yes we are screwed.
If Dave's consequentialism is about maximizing something external to himself (like the probable state of the universe in the future, regardless of whether he is in it), then his decision has little or no weight if he is a simulation, but massive weight if he is the real Dave. So the expected value of his decision is dominated by the possibility of him being real.
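To put rough numbers on that expected-value point (the figures below are entirely my own illustration, not from the comment):

```python
# Even if Dave is almost certainly one of many simulations, the expected
# impact of his choice is dominated by the small chance that he is real.
p_real = 1e-6
impact_if_real = 1e12   # hypothetical stakes if this is the real world
impact_if_sim = 1.0     # hypothetical stakes if he is just one simulation
expected_impact = p_real * impact_if_real + (1 - p_real) * impact_if_sim
print(expected_impact)  # ~1e6, almost all of it from the "real Dave" branch
```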
If we accept the simulation hypothesis, then there are already gzillions of copies of us, being simulated under a wide variety of torture conditions (and other conditions, but torture seems to be the theme here). An extortionist in our world can only create a relatively small number of simulations of us, small enough that they are not worth taking into account. The distribution of simulation types in this world bears no relation to the distribution of simulations we could possibly be in.
If we want to gain information about what sort of simulation we are in, evidence needs to come directly from properties of our universe (stars twinkling in a weird way, messages embedded in π), rather than from properties of simulations nested in our universe.
So I'm safe from the AI ... for now.
Here's an idea for how a LW-based commercial polling website could operate. Basically it is a variation on PredictionBook with a business model similar to TopCoder.
The website has business clients, and a large number of "forecasters" who have accounts on the website. Clients pay to have their questions added to the website, and forecasters give their probability estimates for whichever questions they like. Once the answer to a question has been verified, each forecaster is financially rewarded using some proper scoring rule. The more money assigned to a question, the higher the incentive for a forecaster to have good discrimination and calibration. Some clever software would also be needed to combine and summarize data in a way that is useful to clients.
The main advantage of this over other prediction markets is that the scoring rule encourages forecasters to give accurate probability estimates.
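As a concrete illustration of how such a payout might be computed, here is a minimal sketch using the Brier score, one example of a proper scoring rule (the function, names, and numbers are my own, not part of the proposal):

```python
def brier_payout(forecast: float, outcome: bool, stake: float) -> float:
    """Payout = stake * (1 - (forecast - outcome)^2), bounded in [0, stake].
    Expected payout is maximized by reporting one's true probability,
    which is what makes the scoring rule 'proper'."""
    return stake * (1.0 - (forecast - float(outcome)) ** 2)

# A forecaster who assigned 80% to an event that occurred:
print(brier_payout(0.8, True, stake=100.0))   # 96.0
# The same forecast if the event had not occurred:
print(brier_payout(0.8, False, stake=100.0))  # 36.0
```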
Perhaps start by giving it away, or sell it to small buyers (e.g. individuals).
But I've got to admit I don't have experience in this area, so my suggestions are mostly naive speculation (but hopefully my speculation is of high quality!). Research into existing prediction companies is called for.
If we are the kind of people who would delete lots of AIs, I don't see why AIs would not see it as similarly ethical to delete lots of us.
So just in case we are a simulated AI's simulation of its creators, we should not simulate an AI in a way it might not like? That's 3 levels of a very specific simulation hypothesis. Is there some property of our universe that suggests to you that this particular scenario is likely? For the purpose of seriously considering the simulation hypothesis and how to respond to it, we should make as few assumptions as possible.
More to the point, I think you are suggesting that the AI will have human-like morality, like taking moral cues from others, or responding to actions in a tit-for-tat manner. This is unlikely, unless we specifically program it to do so, or it thinks that is the best way to leverage our cooperation.
I'm concerned about the moral implications of creating intelligent beings with the intent of destroying them after they have served our needs [...]
Personally, I would rather be purposefully brought into existence for some limited time than to never exist at all, especially if my short life was enjoyable.
I evaluate the morality of possible AI experiments in a consequentialist way. If choosing to perform AI experiments significantly increases the likelihood of reaching our goals in this world, it is worth considering. The experiences of one sentient AI would be outweighed by the expected future gains in this world. (But nevertheless, we'd rather create an AI that experiences some sort of enjoyment, or at least does not experience pain.) A more important consideration is social side-effects of the decision - does choosing to experiment in this way set a bad precedent that could make us more likely to de-value artificial life in other situations in the future? And will this affect our long-term goals in other ways?
I merely wanted to point out to Kaj that some "meaningful testing" could be done, even if the simulated world was drastically different from ours. I suspect that some core properties of intelligence would be the same regardless of what sort of world it existed in - so we are not crippling the AI by putting it in a world removed from our own.
Perhaps "if released into our world" wasn't the best choice of words... more likely, you would want to use the simulated AI as an empirical test of some design ideas, which could then be used in a separate AI being carefully designed to be friendly to our world.
People would hire the firm if it could be demonstrated that the firm consistently produced accurate results. So initial interest might be low, but pick up over time as the track record gets longer.
You could observe how it acts in its simulated world, and hope it would act in a similar way if released into our world. ETA: Also, see my reply for possible single-bit tests.
I have had some similar thoughts.
The AI box experiment argues that a "test AI" will be able to escape even if it has no I/O (input/output) other than a channel of communication with a human. So we conclude that this is not a secure enough restraint. Eliezer seems to argue that it is best not to create an AI testbed at all - instead get it right the first time.
But I can think of other variations on an AI box that are more strict than human-communication, but less strict than no-test-AI-at-all. The strictest such example would be an AI simulation in which the input consisted of only the simulator and initial conditions, and the output consisted only of a single bit of data (you destroy the rest of the simulation after it has finished its run). The single bit could be enough to answer some interesting questions ("Did the AI expand to use more than 50% of the available resources?", "Did the AI maximize utility function F?", "Did the AI break simulated deontological rule R?").
Obviously these are still more dangerous than no-test-AI-at-all, but the information gained from such constructions might outweigh the risks. Perhaps if I/O is restricted to few enough bits, we could guarantee safety in some information-theoretic way.
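To make the restriction concrete, here is a toy sketch of the single-bit construction (my own illustration, with a stand-in simulator; nothing here is a real containment mechanism):

```python
from typing import Any, Callable, Dict

def run_boxed_simulation(
    simulator: Callable[[Dict[str, Any]], Dict[str, Any]],
    initial_conditions: Dict[str, Any],
    question: Callable[[Dict[str, Any]], bool],
) -> bool:
    """Run the simulation to completion in isolation and return only a single
    boolean answer to a question fixed before the run; all other state is
    discarded."""
    final_state = simulator(initial_conditions)
    answer = bool(question(final_state))
    del final_state
    return answer

# Example with a dummy simulator: did the simulated agent use >50% of resources?
dummy_simulator = lambda init: {**init, "resources_used": 0.42}
print(run_boxed_simulation(dummy_simulator, {"seed": 1},
                           lambda s: s["resources_used"] > 0.5))  # False
```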
What do people think of this? Any similar ideas along the same lines?
I personally don't mind "tl;dr", but I agree that where practical it is best to use language that will be understood by as wide an audience as possible. (Start using "tl;dr" again when it becomes mainstream :) )
Are extensional and intensional definitions related to outside views and inside views? I suppose extensional definitions and the outside view are about drawing conclusions from a class of things, while intensional definitions and the inside view use specific details more unique to the thing in question.
I agree with the gist of your post, but this paragraph:
Conversely, we know the LHC is not going to destroy the world, because nature has been banging particles together at much higher energy levels for billions of years...
is a common argument that doesn't really stand up once you take anthropic bias into account.
Saying "oh sorry I hurt your feelings" is just plain being nice, which is a good idea whether you are aiming to be rational or not.
Someone actually made a top-level post on this the other day. Just sayin'.
You can have emotions while being rational, and you can be rational while having emotions. They are opposed sometimes, but they do not always have to be. But when there is a conflict between them, rationality (so long as you practice it properly) is more reliable in reaching correct, useful conclusions.
It would be great for this rationalist community to be able to discuss any topic, but in a way that insulates the main rationality discussions from off-topic discussions. Perhaps forum software separate from the main format of LessWrong? Are monthly open threads enough for off-topic discussions?
Add a term granting a large disutility for deaths, and this should do the trick.
What if death isn't well-defined? What if the AI has the option of cryonically freezing a person to save their life - but then, being frozen, that person does not have any "current" utility function, so the AI can then disregard them completely. Situations like this also demonstrate that more generally, trying to satisfy someone's utility function may have an unavoidable side-effect of changing their utility function. These side-effects may be complex enough that the person does not foresee them, and it is not possible for the AI to explain them to the person.
I think your "simple hack" is not actually that simple or well-defined.
I sent an email on January the 10th, and haven't yet got a reply. Has my email made it to you? Granted, it is over a month since this article was posted, so I understand if you are working on things other than applications at this point...
Or to take an open source term, "Benevolent Dictator for Life".
If there is, I'd like to know too, for when(/if) I try my hand at a top-level post. Hopefully the rating and moderation system is good enough that no formal rule is needed.
Or the two are fairly independent - you can be good or bad at seeking status, intelligent or not-so-intelligent, and it is possible to have any combination of those, including being unintelligent and yet still good at obtaining status.
That claims of this type are sometimes made to advance agendas does not mean we shouldn't make these claims, or that all such claims are false. It means such claims need to be scrutinised more carefully.
I agree that more often than not there is not a simple solution, and people often accept a false simple solution too readily. But the absence of a simple solution does not mean there is no theoretical optimal strategy for continually working through the difficulty.
I agree with the message of the article, but I do not think it is forever going to be impossible to query what science currently knows.
Improvements in search technology cause a decrease in the time taken to do a reasonable search for any existing knowledge on a topic. Before the internet you might have had to read dozens of journals to have a vague idea of whether a field had discovered something in particular. Now you can do an online search. Conceivably, a future search engine could be good enough that it could take some imprecise (non-jargon) search terms, and bring up the most relevant and up-to-date research on the topic.
This would be great for public perception of science. Consider your uncle typing "How does gravity work?", and being presented not only with a pop-science description (as you would today), but also with the latest peer-reviewed work and a list of prerequisite reading, should he want a more thorough understanding. It'd be harder for people to worship ignorance if there was less of a barrier between them and knowledge.
... seems to have this belief that there is some perfect cure for any problem.
There may not be a single strategy that is perfect on its own, but there will always be an optimum course of action, which may be a mixture of strategies (e.g. dump $X into nanotech safety, $Y into intelligence enhancement, and $Z into AGI development). You might never have enough information to know the optimal strategy to maximise your utility function, but one still exists, and it is worth trying to estimate it.
I mention this because previously I have heard "there is no perfect solution" as an excuse to give up and abandon systematic/mathematical analysis of a problem, and just settle with some arbitrary suggestion of a "good enough" course of action.
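To make the "mixture of strategies" point concrete, here is a toy sketch of searching for the best allocation even when no single option is perfect (the utility model and numbers are entirely my own, purely for illustration):

```python
from itertools import product

def estimated_utility(x: int, y: int, z: int) -> float:
    # Hypothetical diminishing-returns model for three spending options.
    return 3 * x**0.5 + 2 * y**0.5 + 1 * z**0.5

budget = 100
best = max(
    ((x, y, budget - x - y)
     for x, y in product(range(budget + 1), repeat=2) if x + y <= budget),
    key=lambda alloc: estimated_utility(*alloc),
)
print(best)  # the mixed allocation this crude model happens to favour
```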
For some things (especially concrete things like animals or toothpaste products), it is easy to find a useful reference class, while for other things it is difficult to find which possible reference class, if any, is useful. Some things just do not fit nicely enough into an existing reference class to make the method useful - they are unclassreferencable, and it is unlikely to be worth the effort attempting to use the method, when you could just look at more specific details instead. ("Unclassreferencable" suggests a dichotomy, but it's more of a spectrum.) ETA: I see this point has already been made here.
Humans naturally use an ad-hoc method that is like reference class forecasting (that may not be perfect or completely rational, but does a reasonable job sometimes). It is useful when we first encounter something and do not yet have enough specific details to evaluate it on its own terms. Once we have those details, the forecasting method is not needed. We use forecasting to get a heuristic on which things are worth us investigating further, so we can make that more detailed evaluation. Often something that is unclassreferencable is more worth investigating - we are curious about things that do not fit nicely into our existing categories.
There are a couple of ways promoters of a product/idea can exploit humans' natural forecasting habits. Sometimes the phrase "defies categorisation" or "doesn't fit into the normal genres" is applied to a new piece of music, to suggest that it is unclassreferencable and therefore worth checking out (which is better than a potential listener lumping it into a category that they don't like). On the other hand, sometimes promoters purposefully put themselves into a reference class, hoping that no one investigates the finer details - like a new product claiming to be "environmentally friendly", or people wearing certain clothes to appear to have higher status.
Let me know if I'm suffering from man-with-hammer syndrome here, but it seems reference class forecasting is a useful way to think about many promotional strategies in a more systematic way.
- Handle: arbimote
- Gender: Male
- Age: 22 (born 1987)
- Location: Australia
- Occupation: Student of computer science
I've been lurking since May 2009. My views on some issues that are often brought up on LW are:
- It's a good idea to sign up to cryonics if you have the money, due to a Pascal's Wager type argument. I have not signed up, since I do not yet have the money (and AFAIK there are further complications due to being in Australia).
- It is possible and desirable for humans to create AGI.
- MWI seems intuitive to me, but I have not read enough about the subject to form a decent estimate of its correctness.
I feel like I should pad out this intro with more information, but that'll have to do for now.