RE: GPT getting dumber, that paper is horrendous.
The code gen portion was completely thrown off by Markdown syntax (the authors mistook back-ticks for single-quotes, afaict). I think the update to make is that this is decent evidence that there was some RLHF on ChatGPT outputs. If you remember the "a human being will die if you don't reply with pure JSON" tweet, even that final JSON code was wrapped in markdown. My modal guess is that markdown was inserted via kludge to make the ChatGPT UX better, and then RLHF was done on that kludged output. Code sections are often mislabeled for what language they contain. My secondary guess is that the authors used an API which had this kludge added on top of it, such that GPT just wouldn't output plaintext code, though that is hard to square with there being any passing examples at all.
In the math portion they say GPT-4-0613 only averaged 3.8 CHARACTERS per response. Note that "[NO]" and "[YES]" both contain more than 3.8 characters. Note that GPT-4 hardly ever answers a query with a single word. Note that the paper's example answer for the primality question ran about 1000 characters, so the remaining questions apparently averaged 3 characters flat. Even if you think they only fucked up that data analysis: I also replicated GPT-4 failing to solve "large" number primality, and am close to calling that a cherry-picked example. It is a legitimately difficult problem for GPT; anyone who goes to ChatGPT to replicate will find the answer they get back is a coin flip at best. But we need to say it again for the kids in the back: the claim is that GPT-4 got 2% on yes/no questions. What do we call a process that gets 2% on coin flip questions?
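To put a number on how damning that is, here's a quick back-of-envelope check (the question count of 500 is my hypothetical, not the paper's):

```python
from math import comb

# Hypothetical setup: 500 yes/no questions, answered by a fair coin flip.
n = 500
k = int(0.02 * n)  # scoring 2% means only 10 correct answers

# Exact binomial probability of a coin-flipper doing that badly or worse;
# each outcome sequence has probability 0.5**n
prob_by_chance = sum(comb(n, i) for i in range(k + 1)) * 0.5**n
print(prob_by_chance)  # astronomically small
```

Scoring that far below chance isn't evidence the model is guessing; it's evidence the scoring process is systematically inverted (e.g. the answers were parsed wrong).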
If you take the distance between the North and South pole and divide it by ten million: voilà, you have a meter!
NB: The circumference of the Earth is ~40k km - this definition of a meter should instead mention the distance from the North or South pole to the Equator.
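The arithmetic, with the round illustrative figure:

```python
# Quarter meridian (pole to Equator), using the round ~40,000 km circumference
circumference_km = 40_000
pole_to_equator_m = circumference_km / 4 * 1_000   # 10,000 km, in meters

# The historical definition: one ten-millionth of that arc
meter = pole_to_equator_m / 10_000_000
print(meter)  # 1.0
```

Dividing the pole-to-pole distance (half the circumference) by ten million would instead give a two-meter "meter".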
The problem with this is that you get whatever giant risks you aren’t measuring properly. That’s what happened at SVB, they bought tons of ‘safe’ assets while taking on a giant unsafe bet on interest rates because the system didn’t check for that. Also they cheated on the accounting, because the system allowed that too.
A very good example of Goodhart's Law/misalignment. Highlighting for the skimmers. Thanks for the write up Zvi!
Tidbit to make this comment useful: "duration" is the (negative) derivative of price with respect to yield; a bond with a duration of 10 will lose 5% of its value (relative to par) after a 50 bp (0.5%) rate hike. So why do they call it duration? Well, suppose you buy a 10 year bond that pays 2% interest, and then tomorrow someone offers you a 3% 10 year bond. How much money do you have to pay to trade in yesterday's bond? Pretty much, you have to pay an extra 1% for each year of the bond's life!
This is probably dead obvious to everyone in finance, but I only got into finance by joining fintech after a math undergrad, and it took me years to figure out why they call it duration when they are nice enough to call the second derivative "convexity".
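A minimal sketch of the first-order approximation described above (numbers illustrative; convexity is the second-order correction this ignores):

```python
duration = 10        # modified duration, in years
rate_hike = 0.005    # 50 bp

# First-order change in price relative to par: dP/P ≈ -duration * dy
price_change = -duration * rate_hike
print(price_change)  # ≈ -0.05: the bond drops about 5%
```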
New U.S. sanctions on Russia (70%): Scott holds, I sell to 60%.
This seems like a better sale than the sale on Russia going to war, by a substantial amount. So if I was being consistent I should have sold more here. Given that I was wrong about the chances of the war, the sale would have been bad, but I didn’t know that at the time. Therefore this still counts as a mistake not to sell more.
This seems like a conjunctive fallacy. "US sanctions Russia" is very possible outside "Russia goes to war", even if "Russia goes to war" implies "US sanctions Russia". You had 30% on "major flare up in Russia-Ukraine". Perhaps you are anchoring your relative sells or something?
I obviously agree that you know these things, and am only noting a self-flagellation that seemed unearned. Thanks for writing Zvi!
What prompts maximize the chance of returning these tokens?
Idle speculation: cloneembedreportprint and similar end up encoding similar to /EOF.
I am sorry for insulting you. My experience in the rationality community is that many people choose abstinence from alcohol, which I can respect, but I forgot that likely in many social circles that choice leads to feelings of alienation. While I thought you were signaling in-group allegiance, I can see that you might not have that connection. I will attempt to model better in the future, since this seems generalizable.
I'm still interested in whether the beet margarita with OJ was good~
Did you try the beet margarita with orange juice? Was it good?
To be honest, this exchange seems completely normal for descriptions of alcohol. Tequila is canonically described as sweet. You are completely correct that when people say "tequila is sweet" they are not trying to compare it to superstimulants like orange juice and Coke. GPT might not understand this fact. GPT knows that the canonical flavor profile for tequila includes "sweet", and your friend knows that it'd be weird to call tequila a sweet drink.
I think the gaslighting angle is rather overblown. GPT knows that tequila is sweet. GPT knows that most of the sugar in tequila has been converted to alcohol. GPT may not know how to reconcile these facts.
Also, I get weird vibes from this post, as generally performative about sobriety. You don't know the flavor profiles of alcohol, and the AI isn't communicating the flavor profiles of alcohol well. Why are you writing about the AI's lack of knowledge about the difference between tequila's sweetness and orange juice's sweetness? You seem ill-informed on the topic, and like you have no intention of becoming better informed. From where I stand, it seems like you understand alcohol taste less than GPT does.
I wish this post talked about object level trade offs. It did that somewhat with the reference to the importance of "have a decision theory that makes it easier to be traded with". However, the opening was extremely strong and was not supported:
I care deeply about the future of humanity—more so than I care about anything else in the world. And I believe that Sam and others at FTX shared that care for the world. Nevertheless, if some hypothetical person had come to me several years ago and asked “Is it worth it to engage in fraud to send billions of dollars to effective causes?”, I would have said unequivocally no.
What level of funding would make fraud worth it?
Edit to expand: I do not believe the answer is infinite. I believe the answer is possibly less than the amount I understand FTX has contributed (assuming they honor their commitments, which they maybe can't). I think this post gestures at trading off sacred values, in a way that feels like it signals for applause, without actually examining the trade.
Thanks for feedback, I am new to writing in this style and may have erred too much towards deleting sentences while editing. But, if you never cut too much you're always too verbose, as they say. I in particular appreciate that, when talking about how I am updating, I should make clear where I am updating from.
For instance, regarding human level intelligence, I was also describing an update relative to "me a year/month ago". I relistened to the Sam Harris/Yudkowsky podcast yesterday, and they detour for a solid 10 minutes into how "human level" intelligence is a straw target. I think their arguments were persuasive, and that I would have endorsed them a year ago, but that they don't really apply to GPT. I had pretty much concluded that the difference between a 150 IQ AI and a 350 IQ AI would be a matter of scale. GPT as a simulator/platform seems to me like an existence proof for a not-artificially-handicapped human level AI attractor state. Since I had previously thought the entire idea was a distraction, this is an update towards human level AI.
The impact on AI timelines mostly follows from diversion of investment. I will think on if I have anything additional to add on that front.
Right, okay. I am trying to learn your ontology here, but the inferential distance from my current concepts is large. I don't understand what the 95% means. I don't understand why the d100 has a 99% chance to be fixed after one roll, while a d10 only has 90%. By the second roll I think I can start to stomach the logic here, though, so maybe we can set that aside.
In my terms, when you say that a Bayesian wouldn't bet $1bil:$1 that the sun will rise tomorrow, that doesn't seem correct to me. It's true that I wouldn't actually make that nightly bet, because the risk free rate is like 3% per annum so it'd be a pretty terrible allocation of risk, plus it seems like it'd be an assassination market on the rotation of Earth and I don't like incentivizing that as a matter of course. But does the math of likelihood ratios not work as well to bury bad theories under a mountain of evidence?
I think not assigning 1e-40 chance to an event is an epistemological choice separate from Bayesianism. The math seems quite capable of leading to that conclusion, and recovering from that state quickly enough.
I think maybe the crux is "There is no way for a Bayesian to be wrong. Everything is just an update. But a Frequentist who said the die was fair can be proven wrong to arbitrary precision." You can, if the Bayesian announces their prior, know precisely how much of your arbitrary evidence they will require to believe the die is loaded.
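A concrete version of "how much evidence they will require" (all numbers are mine for illustration: prior odds of 1,000,000:1 on the die being fair, and a loaded-die hypothesis under which the suspect face comes up half the time):

```python
prior_odds_fair = 1e6          # announced prior odds that the die is fair
lr_per_eight = (1/2) / (1/20)  # likelihood ratio per observed 8: 10x toward "loaded"

# Count the consecutive 8s needed before the posterior favors "loaded"
posterior_odds_fair = prior_odds_fair
rolls = 0
while posterior_odds_fair >= 1:
    posterior_odds_fair /= lr_per_eight
    rolls += 1
print(rolls)  # 7
```

Announce the prior, and the size of the required mountain of evidence falls right out.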
Again, I hope this is taken in the spirit I mean it, which is "you are the only self proclaimed Frequentist on this board I know of, so you are a very valuable source of epistemic variation that I should learn how to model".
I am not sure I understand, probably because I am too preprogrammed by Bayesianism.
You roll a d20, it comes up with a number (let's say 8). The Frequentist now believes there is a 95% chance the die is loaded to produce 8s? But they won't bet 20:1 on the result, and instead they will do something else with that 95% number? Maybe use it to publish a journal article, I guess.
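For what it's worth, my reading of where those percentages come from: under the "fair die" null hypothesis, the face that actually came up had probability 1/n, so the quoted confidence is just 1 - 1/n:

```python
# "Confidence" a naive frequentist might quote after one roll of an n-sided die:
# the fair-die null assigns probability 1/n to the observed face
confidences = {sides: 1 - 1 / sides for sides in (10, 20, 100)}
print(confidences)
```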
I would like to note that the naive version of this is bad. First, the naive version falls prey to new grads (who generally have nothing) declaring bankruptcy immediately after graduation. Then, lenders are forced to ask for collateral, which gets rid of a GREAT quality our current system has - you can go to college even if your parents weren't frugal, no matter their income. I think this criticism probably still lands with a 5 year time horizon, maybe less for a 10 year.
I like the concept that lenders would take an interest in which major you were getting, since that seems like something that could use an actuarial table. I think we would benefit from more directly incentivizing STEM (and other profitable) degrees, which IDR doesn't seem to do. What if IDR left lenders holding the bag?
This was the UX I was going to mention - watching GSL (SC:BW) VoDs. There it is tricky, especially since individual games can vary so heavily.
This article was great! Please define WIC much earlier; that was my feeling while reading, and also the first feedback I got after sharing it. Thanks for writing this!
My understanding is that the math textbooks were banned in Florida for their use of the "Common Core" framework. I was a math educator, and my experience is that resistance to Common Core comes primarily from parents who hate math, and are confused why they can't do their child's math, and who somehow take this as a failure mode.
I really appreciate this post. In Chinese, the spoken pronouns for "he" and "she" are identical (they are distinguished only in writing). It is common for Chinese ESL students to mix up "she" and "he" when speaking. I have been trying to understand this, and relate it to my (embarrassingly recent) understanding that probabilistic forecasts (which I now use ubiquitously) are a different "epistemology" than I used to have. This post is a very concrete exploration of the subject. Thank you!
I think finding the correct link required a good heart. In the hope that Zvi will see it, I am commenting to further boost visibility.
I think top level posts generate much more than 10x the value of the entire comments section combined, based off my impression that the majority of lurkers don't get deep into the comments. I wonder if giving top level posts an x^1.5 karma exponent would get closer to the ideal... That would also disincentivize post series...
No, since if I had rolled low I wouldn't want to like, give them significantly more notice than necessary as I job hunted. I offered to do something like hash a seed to use on a RNG, they didn't think that was necessary.
There is a "going going" in this chapter as well
Actually, for any given P which works, P'(x)=P(x)/10 is also a valid algorithm.
If I am following, it seems like an agent which says "bet 'higher' if positive and 'lower' otherwise" does well
Neither
Thanks!
I do not believe that "any monotonically increasing bounded function over the reals is continuous". For instance, choose some monotonically increasing function bounded to (0, 0.4) for x < -1, another bounded to (0.45, 0.55) for -1 < x < 1, and a third bounded to (0.6, 1) for x > 1.
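To make the counterexample concrete (the specific formulas are mine; any monotone pieces within those bands would do):

```python
import math

def f(x):
    """Monotonically increasing and bounded in (0, 1), but with jumps at -1 and 1."""
    if x < -1:
        return 0.4 / (1 + math.exp(-x))      # stays inside (0, 0.4)
    elif x <= 1:
        return 0.5 + 0.05 * math.tanh(x)     # stays inside (0.45, 0.55)
    else:
        return 0.6 + 0.4 * (1 - 1 / x)       # stays inside (0.6, 1)

# Discontinuity at x = 1: the left side tops out near 0.54,
# the right side starts above 0.6
print(f(0.999), f(1.001))
```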
I did not check the rest of the argument, sorry
Please spoiler your edit
Could you explain why you are almost certain?
Could you explain why it's clearly impossible to produce an algorithm that gives better than 50% chance of success on the first round? I think I follow the rest of your argument.
Good questions! It's a forum with posts between two users "Iarwain" (Yudkowsky) and "lintamande", who is co-authoring this piece. There are no extraneous posts, although there are (very rarely) OOC posts, for instance announcing the discord or linking to a thread for a side story.
In each post, either user will post as either a character (ie Keltham, Carissa, and others - each user writes multiple characters) or without a character (for 3rd person exposition). I usually use the avatars when possible to quickly identify who is speaking.
You don't need to pay attention to history or post time until you catch up to the current spot and start playing the F5 game (they are writing at a very quick pace).
By "better than 50% accuracy" I am trying to convey "Provide an algorithm such that if you ran a casino where the players acted as ROB, the casino can price the game at even money and come out on top, given the law of large numbers".
(Perhaps?) more precisely I mean that for any given instantiation of ROB's strategy, then for any given target reward R and payoff probability P<1 there exists a number N such that if you ran N trials betting even money with ROB you would have P probability to have at least R payoff (assuming you start with 1 dollar or whatever).
You can assume ROB will know your algorithm when choosing his distribution of choices.
Computability is not important. I only meant to cut off possibilities like "ROB hands TABI 1 and 'the number which is 0 if the goldbach conjecture is true and 2 otherwise'"
You can restrict yourself to arbitrary integers, and perhaps I should have
I don't see how 2 is true.
If you always answer that your number is lower, you definitely have exactly 50% accuracy, right? So ROB isn't constraining you to less than 50% accuracy.
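For contrast, here is the classic randomized-threshold strategy that does beat 50% against any fixed pair (my formulation of the game, which I believe matches the thread's setup: ROB picks two distinct numbers, one is shown uniformly at random, and you guess whether the hidden number is lower):

```python
import random

random.seed(0)

def play(a, b, trials=100_000):
    """Fraction of rounds the threshold strategy wins against the pair (a, b)."""
    wins = 0
    for _ in range(trials):
        shown, hidden = (a, b) if random.random() < 0.5 else (b, a)
        t = random.gauss(0, 10)          # random threshold with full support
        guess_hidden_lower = shown > t   # "shown looks big, so hidden is lower"
        if guess_hidden_lower == (hidden < shown):
            wins += 1
    return wins / trials

print(play(3, 7))  # ~0.57: above 50% because t sometimes lands between the numbers
```

Whenever t falls strictly between the two numbers the guess is always right; otherwise it's a coin flip, so the win rate is 1/2 + P(t between)/2 > 1/2. An adversarial ROB who knows the strategy can make the edge arbitrarily small, but never zero.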