Comments
I might indeed want to create a precedent here and maybe try to fundraise for some substantial fraction of it.
I wonder if it might be more effective to fund legal action against OpenAI than to compensate individual ex-employees for refusing to sign an NDA. Trying to take vested equity away from ex-employees who refuse to sign an NDA sounds likely not to hold up in court, and if we can establish a legal precedent that OpenAI cannot do this, that might make other ex-employees much more comfortable speaking out against OpenAI than the possibility that third parties might fundraise to partially compensate them for lost equity would (a possibility you might not even be able to make every ex-employee aware of). The fact that this would avoid financially rewarding OpenAI for bad behavior is also a plus. Of course, legal action is expensive, but so is the value of the equity that former OpenAI employees have on the line.
Yeah, sorry that was unclear; there's no need for any form of hypercomputation to get an enumeration of the axioms of U. But you need a halting oracle to distinguish between the axioms and non-axioms. If you don't care about distinguishing axioms from non-axioms, but you do want to get an assignment of truth values to the atomic formulas Q(i,j) that's consistent with the axioms of U, then that is applying a consistent guessing oracle to U.
I see that when I commented yesterday, I was confused about how you had defined U. You're right that you don't need a consistent guessing oracle to get from U to a completion of U, since the axioms are all atomic propositions, and you can just set the remaining atomic propositions however you want. However, this introduces the problem that getting the axioms of U requires a halting oracle, not just a consistent guessing oracle, since to tell whether something is an axiom, you need to know whether there actually is a proof of a given thing in T.
I think what you proved essentially boils down to the fact that a consistent guessing oracle can be used to compute a completion of any consistent recursively axiomatizable theory. (In fact, it turns out that a consistent guessing oracle can be used to compute a model (in the sense of functions and relations on a set) of any consistent recursively axiomatizable theory; this follows from what you showed and the fact that an oracle for a complete theory can be used to compute a model of that theory.)
I disagree with
Philosophically, what I take from this is that, even if statements in a first-order theory such as Peano arithmetic appear to refer to high levels of the Arithmetic hierarchy, as far as proof theory is concerned, they may as well be referring to a fixed low level of hypercomputation, namely a consistent guessing oracle.
The translation from T to U is computable. The consistent guessing oracle only came in to find a completion of U, but it could also find a completion of T (in fact, a completion of U can be computably translated to a completion of T), so the consistent guessing oracle doesn't really have anything to do with the relationship between T and U.
a consistent guessing oracle rather than a halting oracle (which I theorize to be more powerful than a consistent guessing oracle).
This is correct. Or at least, the claim I'm interpreting this as is that there exist consistent guessing oracles that are strictly weaker than a halting oracle, and that claim is correct. Specifically, it follows from the low basis theorem that there are consistent guessing oracles that are low, meaning that access to a halting oracle makes it possible to tell whether any Turing machine with access to the consistent guessing oracle halts. In contrast, access to a halting oracle does not make it possible to tell whether any Turing machine with access to a halting oracle halts.
I don't understand what relevance the first paragraph is supposed to have to the rest of the post.
Something that I think is unsatisfying about this is that the rationals aren't privileged as a countable dense subset of the reals; it just happens to be a convenient one. The completions of the dyadic rationals, the rationals, and the algebraic real numbers are all the same. But if you require that an element of the completion, if equal to an element of the countable set being completed, must eventually certify this equality, then the completions of the dyadic rationals, rationals, and algebraic reals are all constructively inequivalent.
This means that, in particular, if your real happens to be rational, you can produce the fact that it is equal to some particular rational number. Neither Cauchy reals nor Dedekind reals have this property.
perhaps these are equivalent.
They are. To get enumerations of rationals above and below out of an effective Cauchy sequence, once the Cauchy sequence outputs a rational q such that everything afterwards can only differ from it by at most some ε, you start enumerating rationals below q − ε as below the real and rationals above q + ε as above the real. If the Cauchy sequence converges to x, and you have a rational p < x, then once the Cauchy sequence gets to the point where everything after is guaranteed to differ from the limit by at most half the distance from p to x, you can enumerate p as less than x.
My take-away from this:
An effective Cauchy sequence converging to a real x induces recursive enumerators for {q ∈ ℚ : q < x} and {q ∈ ℚ : q > x}, because if q < x, then q < x − 1/n for some n, so you eventually learn this.
The constructive meaning of a set is that membership should be decidable, not just semi-decidable.
If x is irrational, then {q : q < x} and {q : q > x} are complements, and each is semi-decidable, so they are decidable. If x is rational, then the complement of {q : q < x} is {q : q ≥ x}, which is also semi-decidable, so again these sets are decidable. So, from the point of view of classical logic, it's not only true that Cauchy sequences and Dedekind cuts are equivalent, but also effective Cauchy sequences and effective Dedekind cuts are equivalent.
However, it is not decidable whether a given Cauchy-sequence real is rational or not, and if so, which rational it is. So this doesn't give a way to construct decision algorithms for the sets {q : q < x} and {q : q > x} from recursive enumerators of them.
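Here is a minimal sketch in Python of the direction that does work (my own; the interface `cauchy(n)` returning a rational within 2^-n of the real, and the candidate-search strategy, are assumptions, not from the thread):

```python
from fractions import Fraction

def enumerate_below(cauchy, steps=50):
    """Yield rationals q that are provably less than x, the limit of `cauchy`.

    Assumption: cauchy(n) returns a rational within 2**-n of the real x.
    """
    found = set()
    for n in range(1, steps):
        approx = Fraction(cauchy(n))
        err = Fraction(1, 2 ** n)
        # Try candidate rationals with small numerators and denominators.
        for den in range(1, n + 1):
            for num in range(-n * den, n * den + 1):
                q = Fraction(num, den)
                # If q < approx - err, then q < x, since x >= approx - err.
                if q < approx - err and q not in found:
                    found.add(q)
                    yield q

# Example: feeding in a Cauchy sequence for sqrt(2) (say, from Newton's method)
# eventually enumerates every rational below sqrt(2); but if x happens to equal
# some rational, that rational is never enumerated on either side, matching the
# discussion above.
```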
If board members have an obligation not to criticize their organization in an academic paper, then they should also have an obligation not to discuss anything related to their organization in an academic paper. The ability to be honest is important, and if a researcher can't say anything critical about an organization, then non-critical things they say about it lose credibility.
Yeah, I wasn't trying to claim that the Kelly bet size optimizes a nonlogarithmic utility function exactly, just that, when the number of rounds of betting left is very large, the Kelly bet size sacrifices a very small amount of utility relative to optimal betting under some reasonable assumptions about the utility function. I don't know of any precise mathematical statement that we seem to disagree on.
Well, we've established the utility-maximizing bet gives different expected utility from the Kelly bet, right? So it must give higher expected utility or it wouldn't be utility-maximizing.
Right, sorry. I can't read, apparently, because I thought you had said the utility-maximizing bet size would be higher than the Kelly bet size, even though you did not.
Yeah, I was still being sloppy about what I meant by near-optimal, sorry. I mean the optimal bet size will converge to the Kelly bet size, not that the expected utility from Kelly betting and the expected utility from optimal betting converge to each other. You could argue that the latter is more important, since getting high expected utility in the end is the whole point. But on the other hand, when trying to decide on a bet size in practice, there's a limit to the precision with which it is possible to measure your edge, so the difference between optimal bet and Kelly bet could be small compared to errors in your ability to determine the Kelly bet size, in which case thinking about how optimal betting differs from Kelly betting might not be useful compared to trying to better estimate the Kelly bet.
Even in the limit as the number of rounds goes to infinity, by the time you get to the last round of betting (or last few rounds), you've left the limit, since you have some amount of wealth and some small number of rounds of betting ahead of you, and it doesn't matter how you got there, so the arguments for Kelly betting don't apply. So I suspect that Kelly betting until near the end, when you start slightly adjusting away from Kelly betting based on some crude heuristics, and then doing an explicit expected value calculation for the last couple rounds, might be a good strategy to get close to optimal expected utility.
Incidentally, I think it's also possible to take a limit where Kelly betting gets you optimal utility in the end by making the favorability of the bets go to zero simultaneously with the number of rounds going to infinity, so that improving your strategy on a single bet no longer makes a difference.
I think that for all finite n, the expected utility at timestep n from utility-maximizing bets is higher than that from Kelly bets. I think this is the case even if the difference converges to 0, which I'm not sure it does.
Why specifically higher? You must be making some assumptions on the utility function that you haven't mentioned.
I do want to note though that this is different from "actually optimal"
By "near-optimal", I meant converges to optimal as the number of rounds of betting approaches infinity, provided initial conditions are adjusted in the limit such that whatever conditions I mentioned remain true in the limit. (e.g. if you want Kelly betting to get you a typical outcome of in the end, then when taking the limit as the number of bets goes to infinity, you better have starting money , where is the geometric growth rate you get from bets, rather than having a fixed starting money while taking the limit ). This is different from actually optimal because in practice, you get some finite amount of betting opportunities, but I do mean something more precise than just that Kelly betting tends to get decent outcomes.
The reason I brought this up, which may have seemed nitpicky, is that I think this undercuts your argument for sub-Kelly betting. When people say that variance is bad, they mean that because of diminishing marginal returns, lower variance is better when the mean stays the same. Geometric mean is already the expectation of a function that gets diminishing marginal returns, and when it's geometric mean that stays fixed, lower variance is better if your marginal returns diminish even more than that. Do they? Perhaps, but it's not obvious. And if your marginal returns diminish but less than for log, then higher variance is better. I don't think any of median, mode, or looking at which thing more often gets a higher value are the sorts of things that it makes sense to talk about trading off against lowering variance either. You really want mean for that.
Correct. This utility function grows fast enough that it is possible for the expected utility after many bets to be dominated by negligible-probability favorable tail events, so you'd want to bet super-Kelly.
If you expect to end up with lots of money at the end, then you're right; marginal utility of money becomes negligible, so expected utility is greatly affected by negligible-probability unfavorable tail events, and you'd want to bet sub-Kelly. But if you start out with very little money, so that at the end of whatever large number of rounds of betting, you only expect to end up with money in most cases if you bet Kelly, then I think the Kelly criterion should be close to optimal.
(The thing you actually wrote is the same as log utility, so I substituted what you may have meant). The Kelly criterion should optimize this, and more generally expected utilities of the form E[(log W)^k] for any k > 0, if the number of bets is large. At least if k is an integer, then, if X is normally distributed with mean μ and standard deviation σ, then E[X^k] is some polynomial in μ and σ that's homogeneous of degree k. After a large number of bets, μ scales proportionally to the number of bets and σ scales proportionally to its square root, so the value of this polynomial approaches its μ^k term, and maximizing it becomes equivalent to maximizing μ, which the Kelly criterion does. I'm pretty sure you get something similar when k is noninteger.
It depends how much money you could end up with compared to . If Kelly betting usually gets you more than at the end, then you'll bet sub-Kelly to reduce tail risk. If it's literally impossible to exceed even if you go all-in every time and always win, then this is linear, and you'll bet super-Kelly. But if Kelly betting will usually get you less than but not by too many orders of magnitude at the end after a large number of rounds of betting, then I think it should be near-optimal.
If there's many rounds of betting, and Kelly betting will get you as a typical outcome, then I think Kelly betting is near-optimal. But you might be right if .
If you bet more than Kelly, you'll experience lower average returns and higher variance.
No. As they discovered in the dialog, average returns is maximized by going all-in on every bet with positive EV. It is typical returns that will be lower if you don't bet Kelly.
The Kelly criterion can be thought of in terms of maximizing a utility function that depends on your wealth after many rounds of betting (under some mild assumptions about that utility function that rule out linear utility). See https://www.lesswrong.com/posts/NPzGfDi3zMJfM2SYe/why-bet-kelly
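To make the mean-versus-typical distinction concrete, here's a small sketch (mine, not from the linked post) for a bet at even odds with a 60% win probability, where the Kelly fraction is 0.2:

```python
# Compare expected (mean) and typical (median) final wealth after 100 bets at
# even odds with win probability 0.6. Kelly fraction = 2*0.6 - 1 = 0.2.
p, n_bets, median_wins = 0.6, 100, 60   # 60 is the median of Binomial(100, 0.6)

def mean_wealth(f):
    # Exact expected wealth starting from 1: each bet multiplies wealth by
    # (1 + f) with probability p and by (1 - f) otherwise.
    return (p * (1 + f) + (1 - p) * (1 - f)) ** n_bets

def median_wealth(f):
    # Wealth in the median outcome (60 wins, 40 losses).
    return (1 + f) ** median_wins * (1 - f) ** (n_bets - median_wins)

for name, f in [("Kelly (f = 0.2)", 0.2), ("all-in (f = 1.0)", 1.0)]:
    print(f"{name}: mean = {mean_wealth(f):.3g}, median = {median_wealth(f):.3g}")

# All-in maximizes the mean (1.2**100, about 8e7) purely because of the 0.6**100
# chance of winning every single bet, but its median outcome is 0; Kelly's mean
# is only about 50, but its median is about 7.5.
```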
For two, your specific claims about the likely confusion that Eliezer's presentation could induce in "laymen" is empirically falsified to some degree by the comments on the original post: in at least one case, a reader noticed the issue and managed to correct for it when they made up their own toy example, and the first comment to explicitly mention the missing unitarity constraint was left over 10 years ago.
Some readers figuring out what's going on is consistent with many of them being unnecessarily confused.
I don't think this one works. In order for the channel capacity to be finite, there must be some maximum number of bits N you can send. Even if you don't observe the type of the channel, you can communicate a number n from 0 to N by sending n 1s and N-n 0s. But then even if you do observe the type of the channel (say, it strips the 0s), the receiver will still just see some number of 1s that is from 0 to N, so you have actually gained zero channel capacity. There's no bonus for not making full use of the channel; in johnswentworth's formulation of the problem, there's no such thing as some messages being cheaper to transmit through the channel than others.
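A toy version of that encoding argument (my own sketch, with made-up helper names):

```python
N = 8   # maximum number of symbols the channel carries

def encode(n):
    # Send n as n ones followed by N - n zeros, for any n from 0 to N.
    return [1] * n + [0] * (N - n)

def strip_zeros(bits):
    # A channel variant that deletes all the zeros.
    return [b for b in bits if b == 1]

def decode(bits):
    # The count of ones survives both channel variants unchanged.
    return sum(bits)

assert all(decode(encode(n)) == decode(strip_zeros(encode(n))) == n
           for n in range(N + 1))
```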
We "just" need to update the three geometric averages on this background knowledge. Plausibly how this should be done in this case is to normalize them such that they add to one.
My problem with a forecast aggregation method that relies on renormalizing to meet some coherence constraints is that then the probabilities you get depend on what other questions get asked. It doesn't make sense for a forecast aggregation method to give probability 32.5% to A if the experts are only asked about A, but have that probability predictably increase if the experts are also asked about B and C. (Before you try thinking of a reason that the experts' disagreement about B and C is somehow evidence for A, note that no matter what each of the experts believe, if your forecasting method is mean log odds, but renormalized to make probabilities sum to 1 when you ask about all 3 outcomes, then the aggregated probability assigned to A can only go up when you also ask about B and C, never down. So any such defense would violate conservation of expected evidence.)
(In the case of the arithmetic mean, updating on the background information plausibly wouldn't change anything here, but that's not the case for other possible background information.)
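To illustrate the renormalization point above with concrete numbers (the two expert distributions here are made up, just to show the effect):

```python
import numpy as np

def mean_log_odds(ps):
    # Aggregate probabilities by averaging their log odds.
    logits = np.log(np.array(ps) / (1 - np.array(ps)))
    return 1 / (1 + np.exp(-logits.mean()))

expert_1 = {"A": 0.2, "B": 0.7, "C": 0.1}   # three mutually exclusive outcomes
expert_2 = {"A": 0.5, "B": 0.1, "C": 0.4}

agg = {k: mean_log_odds([expert_1[k], expert_2[k]]) for k in "ABC"}
total = sum(agg.values())                   # about 0.88 here, i.e. less than 1

print("P(A), asking only about A:  ", round(agg["A"], 3))          # 0.333
print("P(A), renormalized over ABC:", round(agg["A"] / total, 3))  # 0.377
```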
Any linear constraints (which are the things you get from knowing that certain Boolean combinations of questions are contradictions or tautologies) that are satisfied by each predictor will also be satisfied by their arithmetic mean.
But it is anyway a more general question (than the question of whether the geometric mean of the odds is better or the arithmetic mean of the probabilities): how should we "average" two or more probability distributions (rather than just two probabilities), assuming they come from equally reliable sources?
That's part of my point. Arithmetic mean of probabilities gives you a way of averaging probability distributions, as well as individual probabilities. Geometric mean of log odds does not.
If we assume that the prior was indeed important here then this makes sense, but if we assume that the prior was irrelevant (that they would have arrived at 25% even if their prior was e.g. 10% rather than 50%), then this doesn't make sense. (Maybe they first assumed the probability of drawing a black ball from an urn was 50%, then they each independently created a large sample, and ~25% of the balls came out black. In this case the prior was mostly irrelevant.) We would need a more general description under which circumstances the prior is indeed important in your sense and justifies the multiplicative evidence aggregation you proposed.
In this example, the sources of evidence they're using are not independent; they can expect ahead of time that each of them will observe the same relative frequency of black balls from the urn, even while not knowing in advance what that relative frequency will be. The circumstances under which the multiplicative evidence aggregation method is appropriate are exactly the circumstances in which the evidence actually is independent.
But in the second case I don't see how a noisy process for a probability estimate would lead to being "forced to set odds that you'd have to take bets on either side of, even someone who knows nothing about the subject could exploit you on average".
They make their bet direction and size functions of the odds you offer them in such a way that they bet more when you offer better odds. If you give the correct odds, then the bet ends up resolving neutrally on average, but if you give incorrect odds, then which direction you are off in correlates with how big a bet they make in such a way that you lose on average either way.
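One concrete schedule of this kind (my own construction, just to illustrate the mechanism, not taken from the thread): whatever probability q you post, the bettor buys (1 − q) units of "yes" at price q and q units of "no" at price 1 − q. If q is an unbiased but noisy estimate of the true probability p, their expected profit comes out to about twice the variance of your noise, which is positive whenever your odds are noisy, even though they know nothing about p.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 0.3, 1_000_000                       # true probability, number of rounds
q = np.clip(p + rng.normal(0, 0.05, n), 0.01, 0.99)   # your noisy posted odds
outcome = rng.random(n) < p

profit_yes = (1 - q) * (outcome - q)        # (1 - q) units of "yes" at price q
profit_no = q * (~outcome - (1 - q))        # q units of "no" at price 1 - q
print((profit_yes + profit_no).mean())      # about 2 * 0.05**2 = 0.005 > 0
```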
Oh, derp. You're right.
I think the way I would rule out my counterexample is by strengthening A3 to if and then there is ...
Q2: No. Counterexample: Suppose there's one outcome such that all lotteries are equally good, except for the lottery that puts probability 1 on that outcome, which is worse than the others.
I'm not sure why you don't like calling this "redundancy". A meaning of redundant is "able to be omitted without loss of meaning or function" (Lexico). So ablation redundancy is the normal kind of redundancy, where you can remove sth without losing the meaning. Here it's not redundant, you can remove a single direction and lose all the (linear) "meaning".
Suppose your datapoints are pairs (x, y) (where the coordinates x and y are independent draws from the standard normal distribution), and the feature you're trying to measure is x + y. A rank-1 linear probe will retain some information about the feature. Say your linear probe finds the x coordinate. This gives you information about x + y; your expected value for this feature is now x, an improvement over its a priori expected value of 0. If you ablate along this direction, all you're left with is the y coordinate, which tells you exactly as much about the feature as the x coordinate does, so this rank-1 ablation causes no loss in performance. But information is still lost when you lose the x coordinate, namely the contribution of x to the feature. The thing that you can still find after ablating away the x direction is not redundant with the rank-1 linear probe in the x direction you started with, but just contributes the same amount towards the feature you're measuring.
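A numerical version of this toy example (a sketch of mine; the "probe" here is an ordinary least-squares fit, and R² is just a convenient performance measure):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
data = rng.normal(size=(n, 2))          # columns x and y, independent N(0, 1)
feature = data.sum(axis=1)              # the feature being probed is x + y

def linear_probe_r2(d, f):
    # R^2 of the best least-squares linear predictor of f from d.
    coef, *_ = np.linalg.lstsq(d, f, rcond=None)
    return 1 - np.var(f - d @ coef) / np.var(f)

probe_x = data[:, [0]]                  # rank-1 probe along the x direction
ablated = data.copy()
ablated[:, 0] = 0                       # ablate the x direction; only y is left

print(linear_probe_r2(probe_x, feature))   # ~0.5: x explains half the variance
print(linear_probe_r2(ablated, feature))   # ~0.5 again: no drop after ablation,
                                           # even though x's contribution is gone
```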
The point is, the reason why CCS fails to remove linearly available information is not because the data "is too hard". Rather, it's because the feature is non-linear in a regular way, which makes CCS and Logistic Regression suck at finding the direction which contains all linearly available data (which exists in the context of "truth", just as it is in the context of gender and all the datasets on which RLACE has been tried).
Disagree. The reason CCS doesn't remove information is neither of those, but instead just that that's not what it's trained to do. It doesn't fail, but rather never makes any attempt. If you're trying to train a function such that and , then will achieve optimal loss just like will.
What you're calling ablation redundancy is a measure of nonlinearity of the feature being measured, not any form of redundancy, and the view you quote doesn't make sense as stated, as nonlinearity, rather than redundancy, would be necessary for its conclusion. If you're trying to recover some feature f, and there's any vector w and scalar b such that f(x) = w · x + b for all data x (regardless of whether there are multiple such w, which would happen if the data is contained in a proper affine subspace), then there is a direction such that projection along it makes it impossible for a linear probe to get any information about the value of f. That direction is Σw, where Σ is the covariance matrix of the data. This works because if v · Σw = 0, then the random variables v · x and w · x are uncorrelated (since Cov(v · x, w · x) = v · Σw = 0), and thus v · x is uncorrelated with f(x).
If the data is normally distributed, then we can make this stronger. If there's a vector w and a function g such that the feature is g(w · x) (for example, if you're using a linear probe to get a binary classifier, where it classifies things based on whether the value of a linear function is above some threshold), then projecting along Σw removes all information about the feature. This is because uncorrelated linear features of a multivariate normal distribution are independent, so if v · Σw = 0, then v · x is independent of w · x, and thus also of g(w · x). So the reason what you're calling high ablation redundancy is rare is that low ablation redundancy is a consequence of the existence of any linear probe that gets good performance and the data not being too wildly non-Gaussian.
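A quick numerical check of the Σw claim (my own sketch; the data here is Gaussian and the feature is exactly linear, so the conditions above hold):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200_000, 5
data = rng.normal(size=(n, d)) @ rng.normal(size=(d, d))   # correlated Gaussian data
data -= data.mean(axis=0)
w = rng.normal(size=d)
feature = data @ w + 1.0                    # f(x) = w . x + b

sigma = np.cov(data, rowvar=False)
u = sigma @ w
u /= np.linalg.norm(u)
ablated = data - np.outer(data @ u, u)      # project out the direction sigma @ w

coef, *_ = np.linalg.lstsq(ablated, feature - feature.mean(), rcond=None)
r2 = 1 - np.var(feature - feature.mean() - ablated @ coef) / np.var(feature)
print(r2)                                   # ~0: no linear probe recovers f now
```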
Ablating along the difference of the means makes both CCS & Supervised learning fail, i.e. reduce their accuracy to random guessing. Therefore:
- The fact that Recursive CCS finds many good direction is not due to some “intrinsic redundancy” of the data. There exist a single direction which contains all linearly available information.
- The fact that Recursive CCS finds strictly more than one good direction means that CCS is not efficient at locating all information related to truth: it is not able to find a direction which contains as much information as the direction found by taking the difference of the means. Note: Logistic Regression seems to be about as leaky as CCS. See INLP which is like Recursive CCS, but with Logistic Regression.
I don't think that's a fair characterization of what you found. Suppose, for example, that you're given a vector in ℝⁿ whose i-th component is y + εᵢ, where y is a random variable with high variance, and the εᵢ are i.i.d. with mean 0 and tiny variance. There is a direction which contains all the information about y contained in the vector, namely the average of the coordinates. Subtracting out the mean of the coordinates from each coordinate will remove all information about y. But the data is plenty redundant; there are n orthogonal directions each of which contain almost all of the available information about y, so a probe trained to recover y that learns to just copy one of the coordinates will be pretty efficient at recovering y. If the εᵢ have variance 0 (i.e. are just constants always equal to 0), then there are n orthogonal directions each of which contain all information about y, and a probe that copies one of them is perfectly efficient at extracting all information about y.
If you can find multiple orthogonal linear probes that each get good performance at recovering some feature, then something like this must be happening.
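Here's the same example with concrete numbers (mine: ten coordinates, signal standard deviation 10, noise standard deviation 0.1):

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_coords = 100_000, 10
y = rng.normal(0, 10, size=n_samples)                    # high-variance signal
data = y[:, None] + rng.normal(0, 0.1, (n_samples, n_coords))   # y + tiny noise

def r2(pred, target):
    return 1 - np.var(target - pred) / np.var(target)

print(r2(data[:, 0], y))                        # copy one coordinate: ~0.9999
print(r2(data.mean(axis=1), y))                 # the mean direction: even closer to 1
print(r2(data[:, 0] - data.mean(axis=1), y))    # ~0: subtracting the coordinate
                                                # mean removes all information about y
```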
My point wasn't that the equation didn't hold perfectly, but that the discrepancies are very suspicious. Two of the three discrepancies were off by exactly one order of magnitude, making me fairly confident that they are the result of a typo. (Not sure what's going on with the other discrepancy.)
In the table of parameters, compute, and tokens, compute/(parameters*tokens) is always 6, except in one case where it's 0.6, one case where it's 60, and one case where it's 2.75. Are you sure this is right?
Done, thanks.
It would kind of use assumption 3 inside step 1, but inside the syntax, rather than in the metalanguage. That is, step 1 involves checking that the number encoding "this proof" does in fact encode a proof of C. This can't be done if you never end up proving C.
One thing that might help make clear what's going on is that you can follow the same proof strategy, but replace "this proof" with "the usual proof of Lob's theorem", and get another valid proof of Lob's theorem, that goes like this: Suppose you can prove that []C->C, and let n be the number encoding a proof of C via the usual proof of Lob's theorem. Now we can prove C a different way like so:
- n encodes a proof of C.
- Therefore []C.
- By assumption, []C->C.
- Therefore C.
Step 1 can't be correctly made precise if it isn't true that n encodes a proof of C.
The revelation that he spent maybe 10x as much on villas for his girlfriends as EA cause areas
Source?
The idea that he was trying to distance himself from EA to protect EA doesn't hold together because he didn't actually distance himself from EA at all in that interview. He said ethics is fake, but it was clear from context that he meant ordinary ethics, not utilitarianism.
"Having been handed this enormous prize, how do I maximize the probability that I max out on utility?" Hm, but that actually doesn't give back any specific criterion, since basically any strategy that never bets your whole stack will win.
That's not quite true. If you bet more than double Kelly, your wealth decreases. But yes, Kelly betting isn't unique in growing your wealth to infinity in the limit as number of bets increases.
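A quick numeric check of the double-Kelly threshold (my own, for a bet at even odds with win probability 0.6, so the Kelly fraction is 0.2):

```python
import math

p, kelly = 0.6, 0.2

def growth_rate(f):
    # Expected log growth per bet when wagering a fraction f of wealth at even odds.
    return p * math.log(1 + f) + (1 - p) * math.log(1 - f)

for mult in (1.0, 1.5, 2.0, 2.5):
    print(f"{mult:.1f}x Kelly: growth rate per bet = {growth_rate(mult * kelly):+.4f}")

# 1.0x and 1.5x Kelly have positive growth, 2.0x is approximately zero (slightly
# negative for this bet), and 2.5x is clearly negative: wealth shrinks long-run.
```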
If the number of bets is very large, but due to some combination of low starting wealth relative to the utility bound and slow growth rate, it is not possible to get close to maximum utility, then Kelly betting should be optimal.
I basically endorse what kh said. I do think it's wrong to think you can fit enormous amounts of expected value or disvalue into arbitrarily tiny probabilities.
It is true that in practice, there's a finite amount of credit you can get, and credit has a cost, limiting the practical applicability of a model with unlimited access to free credit, if the optimal strategy according to the model would end up likely making use of credit which you couldn't realistically get cheaply. None of this seems important to me. The easiest way to understand the optimal strategy when maximum bet sizes are much smaller than your wealth is that it maximizes expected wealth on each step, rather than that it maximizes expected log wealth on each step. This is especially true if you don't already understand why following the Kelly criterion is instrumentally useful, and I hadn't yet gotten to the section where I explained that, and in fact used the linear model in order to show that Kelly betting is optimal by showing that it's just the linear model on a log scale.
One could similarly object that since currency is discrete, you can't go below 1 unit of currency and continue to make bets, so you need to maintain a log-scale bankroll where you prevent your log wealth from going negative, and you should really be maximizing your expected log log wealth, which happens to give you the same results when your wealth is a large enough number of currency units that the discretization doesn't make a difference. Like, sure, I guess, but it's still useful to model currency as continuous, so I see no need to account for its discreteness in a model. Similarly, in situations where the limitations on funds available to place bets with don't end up affecting you, I don't think it needs to be explicitly included in the model.
Access to credit. In the logarithmic model, you never make bets that could make your net worth zero or negative.
Again, the max being a small portion of your net worth isn't the assumption behind the model; the assumption is just that you don't get constrained by lack of funds, so it is a different model. It's true that if the reason you don't get constrained by lack of funds is that the maximum bets are small relative to your net worth, then this is also consistent with maximizing log wealth on each step. But this isn't relevant to what I brought it up for, which was to use it as a step in explaining the reason for the Kelly criterion in the section after it.
No. The point of the model where acting like your utility is linear is optimal wasn't that this is a more realistic model than the assumptions behind the Kelly criterion; it's just another simplified model, which is slightly easier to analyze, so I was using it as a step in showing why you should follow the Kelly criterion when it is your wealth that constrains the bet sizes you can make. It's also not true that the linear-utility model I described is still just maximizing log wealth; for instance, if the reason that you're never constrained by available funds is that you have access to credit, then your wealth could go negative, and then its log wouldn't even be defined.
Most of the arguments for Kelly betting that you address here seem like strawmen, except for (4), which can be rescued from your objection, and an interpretation of johnswentworth's version of (2), which you actually mention in footnote 3, but seem unfairly dismissive of.
The assumption under which your derived utility function is logarithmic is that expected utility doesn't get dominated by negligible-probability tail events. For instance, if you have a linear utility function and you act like it, you almost surely get 0 payout, but your expected payout is enormous because of the negligible-probability tail event in which you win every bet. Even if you do Kelly betting instead, the expected payout is going to be well outside the range of typical payouts, because of the negligible-probability tail event in which you win a statistically improbable number of bets. This won't happen if, for instance, you have a bounded utility function, for which typical payouts from Kelly betting will not get you infinitesimally close to the bounds. The class of myopic utility functions is infinite, yes, but in the grand scheme of things, compared to the space of possible utility functions, it is very tiny, and I don't think it should be surprising that there are relatively mild assumptions that imply results that aren't true of most of the myopic utility functions.
In footnote 3, you note that optimizing for all quantiles simultaneously is not possible. Kelly betting comes extremely close to doing this. Your implied objection is, I assume, that the quantifier order is backwards from what would make this really airtight: When comparing Kelly betting to a different strategy, for every quantile, Kelly betting is superior after sufficiently many iterations, but there is no single sufficient number of iterations after which Kelly betting is superior for every quantile; if you have enough iterations such that Kelly betting is better for quantiles 1% through 99%, the alternative strategy could still be so much better at the 99.9% quantile that it outweighs all this. This is where the assumption that negligible-probability tail events don't dominate expected value calculations makes this difference not matter so much. I think that this is a pretty natural assumption, and thus that this really is almost as good.
But in fact, I expect the honest policy to get significantly less reward than the training-game-playing policy, because humans have large blind spots and biases affecting how they deliver rewards.
The difference in reward between truthfulness and the optimal policy depends on how humans allocate rewards, and perhaps it could be possible to find a clever strategy for allocating rewards such that truthfulness gets close to optimal reward.
For instance, in the (unrealistic) scenario in which a human has a well-specified and well-calibrated probability distribution over the state of the world, so that the actual state of the world (known to the AI) is randomly selected from this distribution, the most naive way to allocate rewards would be to make the loss be the negative log of the probability the human assigns to the answers given by the AI (so it gets better performance by giving higher-probability answers). This would disincentivize answering questions honestly if the human is often wrong. A better way to allocate rewards would be to ask a large number of questions about the state of the world, and, for each simple-to-describe property that an assignment of answers to each of these questions could have, which is extremely unlikely according to the human's probability distribution (e.g. failing calibration tests), penalize assignments of answers to questions that satisfy this property. That way answering according to a random selection from the human's probability distribution (which we're modeling the actual state of the world as) will get high reward with high probability, while other simple-to-describe strategies for answering questions will likely have one of the penalized properties and get low reward.
Of course, this doesn't work in real life because the state of the world isn't randomly selected from human's beliefs. Human biases make it more difficult to make truthfulness get close to optimal reward, but not necessarily impossible. One possibility would be to only train on questions that the human evaluators are extremely confident of the correct answers to, in hopes that they can reliably reward the AI more for truthful answers than for untruthful ones. This has the drawback that there would be no training data for topics that humans are uncertain about, which might make it infeasible for the AI to learn about these topics. It sure seems hard to come up with a reward allocation strategy that allows questions on which the humans are uncertain in training but still makes truth-telling a not-extremely-far-from-optimal strategy, under realistic assumptions about how human beliefs relate to reality, but it doesn't seem obviously impossible.
That said, I'm still skeptical that AIs can be trained to tell the truth (as opposed to say things that are believed by humans) by rewarding what seems like truth-telling, because I don't share the intuition that truthfulness is a particularly natural strategy that will be easy for gradient descent to find. If it's trained on questions in natural language that weren't selected for being very precisely stated, then these questions will often involve fuzzy, complicated concepts that humans use because we find them useful, even though they aren't especially natural. Figuring out how to correctly answer these questions would require learning things about how humans understand the world, which is also what you need in order to exploit human error to get higher reward than truthfulness would get.
It sounds to me like, in the claim "deep learning is uninterpretable", the key word in "deep learning" that makes this claim true is "learning", and you're substituting the similar-sounding but less true claim "deep neural networks are uninterpretable" as something to argue against. You're right that deep neural networks can be interpretable if you hand-pick the semantic meanings of each neuron in advance and carefully design the weights of the network such that these intended semantic meanings are correct, but that's not what deep learning is. The other things you're comparing it to that are often called more interpretable than deep learning are in fact more interpretable than deep learning, not (as you rightly point out) because the underlying structures they work with is inherently more interpretable, but because they aren't machine learning of any kind.
This seems related in spirit to the fact that time is only partially ordered in physics as well. You could even use special relativity to make a model for concurrency ambiguity in parallel computing: each processor is a parallel worldline, detecting and sending signals at points in spacetime that are spacelike-separated from when the other processors are doing these things. The database follows some unknown worldline, continuously broadcasts its contents, and updates its contents when it receives instructions to do so. The set of possible ways that the processors and database end up interacting should match the parallel computation model. This makes me think that intuitions about time that were developed to be consistent with special relativity should be fine to also use for computation.
Wikipedia claims that every sequence is Turing reducible to a random one, giving a positive answer to the non-resource-bounded version of any question of this form. There might be a resource-bounded version of this result as well, but I'm not sure.
- By "optimal", I mean in an evidential, rather than causal, sense. That is, the optimal value is that which signals greatest fitness to a mate, rather than the value that is most practically useful otherwise. I took Fisherian runaway to mean that there would be overcorrection, with selection for even more extreme traits than what signals greatest fitness, because of sexual selection by the next generation. So, in my model, the value of that causally leads to greatest chance of survival could be , but high values for are evidence for other traits that are causally associated with survivability, so offers best evidence of survivability to potential mates, and Fisherian runaway leads to selection for . Perhaps I'm misinterpreting Fisherian runaway, and it's just saying that there will be selection for in this case, instead of over-correcting and selecting for ? But then what's all this talk about later-generation sexual selection, if this doesn't change the equilibrium?
- Ah, so if we start out with an average value of X well below 0, a small standard deviation, and an optimal value of 0, then selecting for larger X has the same effect as selecting for X closer to 0, and that could end up being what potential mates do, driving X up over the generations, until it is common for individuals to have positive X, but potential mates have learned to select for higher X? Sure, I guess that could happen, but there would then be selection pressure on potential mates to stop selecting for higher X at this point. This would also require a rapid environmental change that shifts the optimal value of X; if environmental changes affecting optimal phenotype aren't much faster than evolution, then optimal phenotypes shouldn't be so wildly off the distribution of actual phenotypes.
Fisherian runaway doesn't make any sense to me.
Suppose that each individual in a species of a given sex has some real-valued variable X, which is observable by the other sex. Suppose that, absent considerations about sexual selection by potential mates for the next generation, the evolutionarily optimal value for X is 0. How could we end up with a positive feedback loop involving sexual selection for positive values of X, creating a new evolutionary equilibrium with an optimal value greater than 0 when taking into account sexual selection? First the other sex ends up with some smaller degree of selection for positive values of X (say selecting most strongly for X = 1). If sexual selection by the next generation of potential mates were the only thing that mattered, then the optimal value of X to select for is 1, since that's what everyone else is selecting for. That's stability, not positive feedback. But sexual selection by the next generation of potential mates isn't the only thing that matters; by stipulation, different values of X have effects on evolutionary fitness other than through sexual selection, with values closer to 0 being better. So, when choosing a mate, one must balance the considerations of sexual selection by the next generation (for which X = 1 is optimal) and other considerations (for which X = 0 is optimal), leading to selection for mates with X somewhere between 0 and 1 being evolutionarily optimal. That's negative feedback. How do you get positive feedback?
I know this was tagged as humor, but taking it seriously anyway,
I'm skeptical that breeding octopuses for intelligence would yield much in the way of valuable insights for AI safety, since octopuses and humans have so much in common that AGI wouldn't. That said, it's hard to rule out that uplifting another species could reveal some valuable unknown unknowns about general intelligence, so I unironically think this is a good reason to try it.
Another, more likely to pay off, benefit to doing this would be as a testbed for genetically engineering humans for higher intelligence (which also might have benefits for AI safety under long-timelines assumptions). I also think it would just be really cool from a scientific perspective.
One example of a class of algorithms that can solve its own halting problem is the class of primitive recursive functions. There's a primitive recursive function H that takes as input a description of a primitive recursive function f and an input x and outputs 1 if f halts on x, and 0 otherwise: this program is given by H(f, x) = 1, because all primitive recursive functions halt on all inputs. In this case, it is the diagonalizing program that does not exist.
I think should exist, at least for classical bits (which as others have pointed out, is all that is needed), for any reasonably versatile model of computation. This is not so for , since primitive recursion is actually an incredibly powerful model of computation; any program that you should be able to get an output from before the heat death of the universe can be written with primitive recursion, and in some sense, primitive recursion is ridiculous overkill for that purpose.
If a group decides something unanimously, and has the power to do it, they can do it. That would take them outside the formal channels of the EU (or in another context of NATO) but I do not see any barrier to an agreement to stop importing Russian gas followed by everyone who agreed to it no longer importing Russian gas. Hungary would keep importing, but that does not seem like that big a problem.
If politicians can blame Hungary for their inaction, then this partially protects them from being blamed by voters for not doing anything. But it doesn't protect them at all from being blamed for high fuel prices if they stop importing it from Russia. So they have incentives not to find a solution to this problem.
If you have a 10-adic integer, and you want to reduce it to a 5-adic integer, then to know its last n digits in base 5, you just need to know what it is modulo 5^n. If you know what it is modulo 10^n, then you can reduce it modulo 5^n, so you only need to look at the last n digits in base 10 to find its last n digits in base 5. So a base-10 integer ending in ...93 becomes a base-5 integer ending in ...33, because 93 mod 25 is 18, which, expressed in base 5, is 33.
The Chinese remainder theorem tells us that we can go backwards: given a 5-adic integer and a 2-adic integer, there's exactly one 10-adic integer that reduces to each of them. Let's say we want the 10-adic integer that's 1 in base 5 and -1 in base 2. The last digit is the digit that's 1 mod 5 and 1 mod 2 (i.e. 1). The last 2 digits are the number from 0 to 99 that's 1 mod 25 and 3 mod 4 (i.e. 51). The last 3 digits are the number from 0 to 999 that's 1 mod 125 and 7 mod 8 (i.e. 751). And so on.
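A short sketch of that computation (mine; it uses Python's built-in modular inverse via pow, available in Python 3.8+):

```python
def crt_digits(n):
    """Last n base-10 digits of the 10-adic integer that is 1 as a 5-adic
    integer and -1 as a 2-adic integer: solve x = 1 mod 5**n, x = -1 mod 2**n."""
    m5, m2 = 5 ** n, 2 ** n
    # Standard Chinese remainder formula with modular inverses.
    x = (1 * m2 * pow(m2, -1, m5) + (-1) * m5 * pow(m5, -1, m2)) % (m5 * m2)
    return str(x).rjust(n, "0")

print([crt_digits(n) for n in range(1, 5)])   # ['1', '51', '751', '8751']
```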