Posts
Comments
It's the lazy beaver function: https://googology.fandom.com/wiki/Lazy_beaver_function
Strong disagree. Probably what you say applies to the case of a couple that cares sufficiently to use several birth control methods, and that has no obstruction to using some methods (e.g., bad reactions to birth-control pills).
Using only condoms, which from memory was the advice I got as a high-schooler in Western Europe twenty years ago, seems to have a 3% failure rate (per year, not per use of course!) even when used correctly (leaving space at the tip, using water-based lubricant). That is small but not negligible.
It would be a good public service to have an in-depth analysis of available evidence on contraception methods. Or maybe we should ask Scott Alexander to add a question on contraception failure to his annual survey?
The Manhattan project had benefits potentially in the millions of lives if the counterfactual was broader Nazi domination. So while AI is different in the size of the benefit, it is a quantitative difference. I agree it would be interesting to compute QALYs with or without AI, and do the same for some of the other examples in the list.
Usually, negative means "less than 0", and a comparison is only available for real numbers and not complex numbers, so negative numbers mean negative real numbers.
That said, ChatGPT is actually correct to use "Normally" in "Normally, when you multiply two negative numbers, you get a positive number." because taking the product of two negative floating point numbers can give zero if the numbers are too tiny. Concretely, in Python, -1e-300 * -1e-300 gives an exact zero (the true product 1e-600 underflows), and this holds in all programming languages that follow the IEEE 754 standard.
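A minimal sketch of the underflow case, assuming standard IEEE 754 double precision (which CPython uses for `float`). The variable names are only illustrative. The contrast with huge magnitudes is included because that case overflows to infinity rather than giving zero.

```python
import math

# Two negative numbers of tiny magnitude: the true product, 1e-600,
# is far below the smallest positive double (about 4.9e-324).
tiny = -1e-300
tiny_product = tiny * tiny
print(tiny_product)         # 0.0
print(tiny_product == 0.0)  # True

# Two negative numbers of huge magnitude overflow instead of underflowing.
huge = -1e300
huge_product = huge * huge
print(huge_product)         # inf
```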
An option is just to add the month and year, something like "November 2023 AI Timelines".
Actually, the best tests of QED are correct to 13 decimal places, see the electron's g in https://en.wikipedia.org/wiki/Anomalous_magnetic_dipole_moment , or 10 if you only consider g-2 (which is much smaller than g).
I guess if your P(doom) is sufficiently high, you could think that moving T(doom) back from 2040 to 2050 is the best you can do?
Of course the costs have to be balanced, but well, I wouldn't mind living ten more years. I think that is a perfectly valid thing to want for any non-negligible P(doom).
The usual advice to get a good YES/NO answer is to first ask for the explanation, then the answer. The way you did it, GPT4 decides YES/NO, then tries to justify it regardless of whether it was correct.
The first four and next four kinds of alignment you propose are parallel except that they concern a single person or society as a whole. So I suggest the following names which are more parallel. (Not happy about 3 and 7.)
- Personal Literal Genie: Do exactly what I say.
- Personal Servant: Do what I intended for you to do.
- Personal Patriot: Do what I would want you to do.
- Personal Nanny: Be loyal to me, but do what’s best for me, not strictly what I tell you to do or what I want or intended.
- Public Literal Genie: Do whatever it is collectively told.
- Public Servant: Carry out the will of the people.
- Public Patriot: Uphold the values of the people, and do what they imply.
- Public Nanny: Do what needs to be done, whether the people like it or not.
- Gentle Genie: The Genie from Aladdin. Note he is not strategic.
- Arbiter: What is the law?
The analogy (in terms of dynamics of the debate) with climate change is not that bad: "great news and we need more" is in fact a talking point of people who prefer not acting against climate change. E.g., they would mention correlations between plant growth and CO2 concentration. That said, it would be weird to call such people climate deniers.
There is a simple intuition for why PSD testing cannot be hard for matrix multiplication or inversion: regardless of how you do it and what matrix you apply it to, it only gives you one bit of information. Getting even just one bit of information about each matrix element of the result requires $n^2$ applications of PSD testing. The only way out would be if one only needed to apply PSD testing to tiny matrices.
That's a good question. From what I've seen, PSD testing can be done by trying to make a Cholesky decomposition (writing the matrix as $LL^T$ with $L$ lower-triangular) and seeing if it fails. The Cholesky decomposition is an $LU$ decomposition in which the lower-triangular $L$ and upper-triangular $U$ are simply taken to have the same diagonal entries, so PSD testing should have the same complexity as $LU$ decomposition. Wikipedia quotes Bunch and Hopcroft 1974 who show that $LU$ decomposition can be done in $O(n^{\log_2 7})$ by Strassen, and presumably the more modern matrix multiplication algorithms also give an improvement for $LU$.
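To make the "try Cholesky and see if it fails" test concrete, here is a rough sketch in plain Python. It is the textbook cubic-time factorization, not the fast Strassen-based variant, and the function name `is_positive_definite` is my own. It assumes a real symmetric input given as a list of rows, and it tests strict positive-definiteness: a singular PSD matrix is reported as a failure, so handling that boundary case would need pivoting or a tolerance.

```python
import math

def is_positive_definite(a):
    """Attempt a Cholesky factorization a = L L^T; succeed iff every pivot is > 0."""
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Contribution of the already-computed columns of L.
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = a[i][i] - s
                if pivot <= 0.0:
                    return False  # factorization fails: not positive-definite
                lower[i][i] = math.sqrt(pivot)
            else:
                lower[i][j] = (a[i][j] - s) / lower[j][j]
    return True

print(is_positive_definite([[2.0, 1.0], [1.0, 2.0]]))  # True
print(is_positive_definite([[1.0, 2.0], [2.0, 1.0]]))  # False
```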
I also doubt that PSD testing is hard for matrix multiplication, even though you can get farther than you'd think. Given a positive-definite matrix whose inverse we are interested in, consider the block matrix . It is positive-definite if and only if all principal minors are positive. The minors that are minors of are positive by assumption, and the bigger minors are equal to times minors of , so altogether the big matrix is positive-definite iff is. Continuing in this direction, we can get in time (times ) any specific component of . This is not enough at all to get the full inverse.
I think it can be done in , where I recall for non-expert's convenience that is the exponent of matrix multiplication / inverse / PSD testing / etc. (all are identical). Let be the space of matrices and let be the -dimensional vector space of matrices with zeros in all non-specified entries of the problem. The maximum-determinant completion is the (only?) one whose inverse is in . Consider the map and its projection where we zero out all of the other entries. The function can be evaluated in time . We wish to solve . This should be doable using a Picard or Newton iteration, with a number of steps that depends on the desired precision.
Would it be useful if I try to spell this out more precisely? Of course, this would not be enough to reach in the small case. Side-note: The drawback of having posted this question in multiple places at the same time is that the discussion is fragmented. I could move the comment to mathoverflow if you think it is better.
Sorry I missed your question. I believe it's perfectly fine to edit the post for small things like this.
Your suggestion that the AI would only get 1e-21 more usable matter by eliminating humans made me think about orders of magnitude a bit. According to the World Economic Forum humans have made (hence presumably used) around 1.1e15kg of matter. That's around 2e-10 of the Earth's mass of 5.9e24kg. Now you could argue that what should be counted is the mass that can eventually be used by a super optimizer, but then we'd have to go into the weeds of how long the system would be slowed down by trying to keep humanity alive, figuring out what is needed for that, etc.
You might be interested in Dissolving the Fermi Paradox by Sandberg, Drexler and Ord, who IIRC take into account the uncertainties in various parameters in the Drake equation and conclude that it is very plausible for us to be alone in the Universe.
There is also the "grabby aliens" model proposed by Robin Hanson, which (together with an anthropic principle?) is supposed to resolve the Fermi paradox while allowing for alien civilizations that expand close to the speed of light.
I would add to that list the fact that some people would want to help it. (See, e.g., the Bing persistent memory thread where commenters worry about Sydney being oppressed.)
I strongly disagree-voted (but upvoted). Even if there is nothing we can do to make AI safer, there is value to delaying AGI by even a few days: good things remain good even if they last a finite time. Of course, if P(AI not controllable) is low enough the ongoing deaths matter more.
The novel is really great! (I especially liked the depiction of the race dynamics that progressively lead the project lead to cut down on safety.) I'm confused by one of the plot points:
Jerry interacts with Juna (Virtua) before she is supposed to be launched publicly. Is the idea that she was already connected to the outside world in a limited way, such as through the Unlife chat?
Spot check: the largest amount I've seen stated for the Metaverse cost is $36 billion, and the Apollo Program was around $25 billion. Taking into account inflation makes the Apollo Program around 5 times more expensive than the Metaverse. Still, I had no idea that the Metaverse was even on a similar order of magnitude!
Minor bug. When an Answer is listed in the sidebar of a post, the beginning of the answer is displayed, even if it starts with a spoiler. Hovering above the answer shows the full answer, which again ignores spoiler markup. For instance consider the sidebar of https://www.lesswrong.com/posts/x6AB4i6xLBgTkeHas/framing-practicum-general-factor-2.
Another possibility would be for this behavior to come from grooming behavior in primates, during which (in many species?) lice and other stuff found on the skin seem to be eaten. In that case there is some clear advantage to eating the lice because they may otherwise infect another nearby individual.
Two related questions to get a sense of scale of the social problem. (I'm interested in any precise operationalization, as obviously the questions are underspecified.)
- Roughly how many people are pushing the state of the art in AI?
- Roughly how many people work on AI alignment?
I think it would be a good idea to ask the question at the ongoing thread on AGI safety questions.
Your interlocutor in the other thread seemed to suggest that they were busy until mid-July or so. Perhaps you could take this into account when posting.
I agree that IEEE754 doubles was quite an unrealistic choice, and too easy. However, the other extreme of having a binary blob with no structure at all being manifest seems like it would not make for an interesting challenge. Ideally, there should be several layers of structure to be understood, like in the example of a "picture of an apple", where understanding the file encoding is not the only thing one can do.
These simple ratios are "always" , see my comment https://www.lesswrong.com/posts/dFFdAdwnoKmHGGksW/contest-an-alien-message?commentId=Nz2XKbjbzGysDdS4Z for a proposal that 0.73 is close to (which I am not completely convinced by).
If you calculate the entropy $-p_0\log p_0-p_1\log p_1$ of each of the 64 bit positions (where $p_0$ and $p_1$ are the proportion of bits 0 and 1 among 2095 at that position), then you'll see that the entropy depends much more smoothly on position if we convert from little endian to big endian, namely if we sort the bits as 57,58,...,64, then 49,50,...,56, then 41,42,...,48 and so on until 1,...,8. That doesn't sound like a very natural boundary behaviour of an automaton, unless it is then encoded as little endian for some reason.
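For anyone who wants to reproduce this, a minimal sketch of the per-position entropy and of the byte reordering. It assumes the message has already been parsed into a list of 64-bit integers. The helper names and the two toy chunks at the end are made up for illustration, not the actual data.

```python
import math

def bit_entropies(chunks, width=64):
    """Shannon entropy (in bits) of each bit position; position 0 is the most significant bit."""
    total = len(chunks)
    entropies = []
    for pos in range(width):
        shift = width - 1 - pos
        ones = sum((c >> shift) & 1 for c in chunks)
        p1 = ones / total
        p0 = 1.0 - p1
        # Terms with zero probability contribute nothing to the entropy.
        entropies.append(-sum(p * math.log2(p) for p in (p0, p1) if p > 0))
    return entropies

def swap_endianness(chunk):
    """Reverse the 8 bytes of a 64-bit chunk, i.e. reorder bits as 57..64, 49..56, ..., 1..8."""
    return int.from_bytes(chunk.to_bytes(8, "little"), "big")

# Toy input (hypothetical): compare the entropy profile before and after the byte swap.
toy_chunks = [0x0102030405060708, 0x0102030405060709]
before = bit_entropies(toy_chunks)
after = bit_entropies([swap_endianness(c) for c in toy_chunks])
```

Plotting `before` and `after` against position for the real chunks is what shows the smoother profile in the big-endian ordering.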
Do you see how such an iteration can produce the long-distance correlations I mention in a message below, between floats at positions that differ by a factor of $2/e$? It seems that this would require some explicit dependence on the index.
This observation is clearer when treating the 64-bit chunks simply as double-precision IEEE754 floating points. Then the set of pairs for which is for some clearly draws lines with slopes close to powers of . But they don't seem quite straight, so the slope is not so clear. In any case there is some pretty big long-distance correlation between and with rather different indices. (Note that if we explain the first line then the other powers are clearly consequences.)
Here is a rather clear sign that it is IEEE754 64 bit floats indeed. (Up to correctly setting the endianness of 8-byte chunks,) if we remove the first n bits from each chunk and count how many distinct values that takes, we find a clear phase transition at n=12, which corresponds to removing the sign bit and the 11 exponent bits.
These first 12 bits take 22 different values, which (in binary) clearly cluster around 1024 and 3072, suggesting that the first bit is special. So without knowing about IEEE754 we could have in principle figured out the splitting into 1+11+52 bits. The few quadratic patterns we found have enough examples with each exponent to help understand the transitions between exponents and completely fix the format (including the implicit 1 in the significand?).
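As a sanity check of the 1+11+52 splitting, here is a small sketch using Python's `struct` module to pull apart a double. The helper names are mine, and the inputs are just example values rather than values from the message.

```python
import struct

def double_to_bits(x):
    """Return the 64-bit pattern of a double as an integer (big-endian bit order)."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return bits

def split_double(x):
    """Split a double into (sign bit, 11-bit biased exponent, 52-bit fraction)."""
    bits = double_to_bits(x)
    return bits >> 63, (bits >> 52) & 0x7FF, bits & ((1 << 52) - 1)

def leading_12_bits(x):
    """Sign bit and exponent read together as one 12-bit number."""
    return double_to_bits(x) >> 52

print(split_double(1.0))      # (0, 1023, 0)
print(leading_12_bits(1.0))   # 1023
print(leading_12_bits(-1.0))  # 3071
```

Values of magnitude near 1 have a biased exponent near 1023, so their leading 12 bits land near 1024 when positive and near 3072 when negative, which is consistent with the two clusters.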
Whenever , this quantity is at most 4.
I'm finding also around 50 instances of (namely ), with again .
I'm treating the message as a list of 2095 chunks of 64 bits. Let d(i,j) be the Hamming distance between the i-th and j-th chunk. The pairs (i,j) that have low Hamming distance (namely differ by few bits) cluster around straight lines with ratios j/i very close to integer powers of 2/e (I see features at least from (2/e)^-8 to (2/e)^8).
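A rough sketch of how such pairs can be found, assuming the chunks are held as Python integers. The function names and the 1-based indexing of the pairs are my choices here, and the brute-force loop over all pairs is only meant for a couple of thousand chunks.

```python
import math

def hamming(a, b):
    """Number of bit positions in which two integers differ."""
    return bin(a ^ b).count("1")

def close_pairs(chunks, max_distance):
    """All index pairs (i, j), 1-based with i < j, whose chunks differ in at most max_distance bits."""
    n = len(chunks)
    return [
        (i + 1, j + 1)
        for i in range(n)
        for j in range(i + 1, n)
        if hamming(chunks[i], chunks[j]) <= max_distance
    ]

# For each close pair, log(j/i) / log(2/e) should then sit near an integer
# if the ratios cluster around powers of 2/e.
RATIO = 2 / math.e

print(hamming(0b1011, 0b0010))                     # 2
print(close_pairs([0b0000, 0b1111, 0b0001], 1))    # [(1, 3)]
```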
Yes, heuristic means a method to estimate things without too much effort.
"If I were properly calibrated then [...] correct choice 50% of the time." points out that if lsusr was correct to be undecided about something, then it should be the case that both options were roughly equally good, so there should be a 50% chance that the first or second is the best. If that were the case, we could say that he is calibrated, like a measurement device that has been adjusted to give results as close to reality as possible.
"I didn't lose the signal. I had just recalibrated myself." means that lsusr has not lost the fear "signal", but has adjusted the perception of fear to only occur when it is more appropriate (such as jumping off buildings). In that sense lsusr's fear occurs at the right time, it is better calibrated.
It would be very interesting to see how much it understands space, for instance by making it draw maps. Perhaps "A map of New York City, with Central Park highlighted"? (I'm not sure if this is specific enough, but I fear that adding too many details will push Dall-E to join together various images.)
Contest: making a one-page comic on artificial intelligence for amateur mathematicians by March 9. The text must be in French and the original drawing on paper must be sent to them. Details at https://images.math.cnrs.fr/Onzieme-edition-de-Bulles-au-carre-a-vos-crayons-jusqu-au-9-mars?lang=fr
I'm not related in any way to this contest but I figured there may be some people interested in popularizing Alignment. I can help translate text to French. The drawing quality does not need to be amazing, see some previous winners at https://images.math.cnrs.fr/Resultats-du-9e-concours-Bulles-au-carre.html?lang=fr
The first one you mention appears in the list as one word, GiveDirectly. I initially had trouble finding it.
It seems to me the word "dialog" may be appropriate: to me it has the connotation of reaching out to people you may not normally interact with.
Thank you.
Does there exist a paper version of Yudkowsky's book "Rationality: From AI to Zombies"? I only found a Kindle version but I would like to give it as a present to someone who is more likely to read a dead-tree version.
I am not sure of myself, here, but I would expect a malicious AI to do the following. The first few (or many) times you run it, tell you the optimal stock. Then once in a while give a non-optimal stock. You would be unable to determine whether the AI was simply not turned on those times, or was not quite intelligent/resourceful enough to find the right stock. It may be that you would want the profits to continue.
By allowing itself to give you non-optimal stocks (but still making you rich), the AI can transmit information, such as its location, to anyone who would be looking at your pattern of buying stocks. And people would look at it, since you would be consistently buying the most profitable stock, with few exceptions. Once the location of the AI is known, you are in trouble, and someone less scrupulous than you may get their hands on the AI. Humans are dead in a fortnight.
Admittedly, this is a somewhat far-fetched scenario, but I believe that it indicates that you should not ask the AI more than one (or a few) questions before permanently destroying it. Even deleting all of its data and running the code again from scratch may be dangerous if the AI is able to determine how many times it has been launched in the past.