Posts
Comments
I remember writing a note a few years ago on who I wished would create a long site, and your pseudonym was on the list. Happy to see that this has happened, even if for unfortunate reasons.
There are two pre-existing Manifold Markets questions on whether LLM scaling laws will hold until 2027 and 2028, respectively, with currently little trading volume.
Not published anywhere except here and on my site.
I guess a problem here would be the legal issues around checking whether the file is copyright-protected.
Finally, note to self, probably still don’t use SQLite if you have a good alternative? Twice is suspicious, although they did fix the bug same day and it wasn’t ever released.
SQLite is well-known for its incredibly thorough test suite and relatively few CVEs, and at ~156kloc (excluding tests) it's not a very large project, so I think this would be an over-reaction. I'd guess that other databases have more and worse security vulnerabilities due to their larger attack surface—see MySQL with its ~4.4mloc (including tests). Big Sleep was probably used on SQLite because it's a fairly small project, large parts of which can fit into an LLM's context window.
Maybe someone will try to translate the SQLite code to Rust or Zig using LLMs—until then we're stuck.
Not surprising to me: I've lived in a city with many stray dogs for less than half a year, and got "attacked" ("harassed" is maybe a better term) by a stray dog twice.
Not a Pascal's mugging to the best of my knowledge.
I think normally "agile" would fulfill the same function (per its etymology), but it's very entangled with agile software engineering.
A few others that come to mind:
- Metaculus (used to be better though)
- lobste.rs (quite specialized)
- Quanta Magazine has some good comments, e.g. this article has the original researcher showing up & clarifying some questions in the comments
Apparently a Thompson-hack-like bug occurred in LLVM (haven't read the post in detail yet). Interesting.
Submission statement: I mostly finished this a year ago, but held off on posting because I was planning on improving it and writing a corresponding "here's the concepts without the math" post. Might still happen, but now I'm not aiming at a specific timeline.
Things I now want to change:
- Soften the confidence in the vNM axioms, since there's been some good criticisms
- Revamp the whole ontological crisis section to be more general
- Rewrite the academese into plainer prose
- Move proofs to an appendix
- Create some manim videos to illustrate
- Merge with this post
- Many other things
Still, I hope this is kinda useful for some people.
Edit: Also, there's some issues with the MathJax and dollar signs, I will fix this later.
Apologies for the soldier mindset react, I pattern-matched to some more hostile comment. Communication is hard.
Grants to Redwood Research, SERI MATS, the NYU alignment group under Sam Bowman for scalable supervision, Palisade Research, and many dozens more, most of which seem net positive wrt TAI risk.
Yudkowsky 2017, AronT 2023 and Gwern 2019, if you're curious why you're getting downvoted.
(I tried to figure out whether this method of estimation works, and it seemed more accurate than I thought, but then I got distracted).
There's two arguments you've made, one is very gnarly, the other is wrong :-):
- "the sheer number of parameters you have chosen arbitrarily or said "eh, let's assume this is normally distributed" demonstrates the futility of approaching this question numerically."
- "simply stating your preference ordering"
I didn't just state a preference ordering over futures, I also ass-numbered their probabilities and ass-guessed ways of getting there. To estimate the expected value of an action, one requires two things: a list of probabilities and a list of utilities—you propose giving only one of those.
(As for the "false precision": I feel like that debate has run its course; I consider Scott Alexander, 2017 to be the best rejoinder here. The world is likely not structured in a way that makes trying harder to estimate less accurate in expectation; under what I'd dub the Taoist assumption, thinking & estimating more should narrow one's credences over time. This is the same reason I've defended the bioanchors report against accusations that its distributions spanning 14 orders of magnitude make it useless.)
The way I do this is to use the browser's Print to PDF functionality on every single post, and then concatenate the results using pdfunite.
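For reference, pdfunite (from poppler-utils) takes the input files in order with the output path last; the filenames here are hypothetical:

```shell
# Merge per-post PDFs into one file; the output path comes last.
# Zero-padded names keep a shell glob in the intended order.
pdfunite post-001.pdf post-002.pdf post-003.pdf all-posts.pdf
```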
- Building a superintelligence under current conditions will turn out fine.
- No one will build a superintelligence under anything like current conditions.
- We must prevent at almost all costs anyone building superintelligence soon.
I don't think this is a valid trilemma: Between fine and worth preventing at "almost all costs" there is a pretty large gap. I think "fine" was intended to mean "we don't all die" or something as bad as that.
Thanks, I'll improve the data and then analyse it when I have more time.
Relevant: When pooling forecasts, use the geometric mean of odds.
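As a quick illustration of the rule (a sketch of my own, not code from the linked post; the function name is mine):

```python
import math

def pool_geo_mean_odds(probs):
    """Pool probabilities by taking the geometric mean of their odds,
    then converting the pooled odds back into a probability."""
    odds = [p / (1 - p) for p in probs]
    geo = math.prod(odds) ** (1 / len(odds))
    return geo / (1 + geo)

# Odds of 0.9 and 0.5 are 9:1 and 1:1; their geometric mean is 3:1.
print(pool_geo_mean_odds([0.9, 0.5]))  # ≈ 0.75 (arithmetic mean would give 0.7)
```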
You're right. I'll rerun the analysis and include 2x2 games as well.
This is interesting, thank you—I hadn't considered the case where an existing contract needs to be renewed.
I wonder why your understanding predicts stagnating or decreasing salaries in this world. Currently, employees sometimes quit if they haven't gotten a raise in a while, and go to other companies where they can earn more. In this mechanism, that can be encoded by choosing a higher minimum rate, set just at the level where the employee would be indifferent between staying at the company and going to job-hunt again.
I agree that this would have downsides for candidates with few other options, and feel a bit bad about that. Not sure whether it's economically efficient, though.
The question is, would it be better for companies than the current situation? Because it's the company who decides the form of the interview, so if the answer is negative, this is not going to happen.
Yeah, I don't think this is going to be adopted very soon. My best guess at how that could happen is if people try it in low-stakes contexts where the parties are ~symmetric in power, and this then spreads through e.g. people who do consulting for small startups, to salaries for high-value employees in small startups, to salaries for high-value employees in general etc.
Another way this could happen is if unions push for it, but I don't see that happening anytime soon.
(I'm going to see whether me putting this up as a way of determining rates can work, but probably not.)
Yeah, the spherical cow system would be using the VCG mechanism with the Clarke pivot rule, but that would usually require some subsidy. There can be no spherical cow system which elicits truthful bids without subsidy, sadly :-/.
Does this require some sort of enforcement mechanism to ensure that neither party puts in a bad-faith bid as a discovery mechanism for what number to seek in their real negotiations?
Maybe there's a misunderstanding here—the mechanism I was writing about would be the "real negotiations" (whatever result falls out of the mechanism is what's going to happen). As in, there can be a lot of talking about salaries before this two-sided sealed-bid auction is performed, but the salary is decided through the auction.
I know of some software engineers who have published their salary history online.
Maybe because the URL is an http URL instead of https.
When will this be revealed?
I also have this market for GPQA, on a longer time-horizon:
https://manifold.markets/NiplavYushtun/will-the-gap-between-openweights-an
I'm surprised that the paper doesn't mention analytic continuations of complex functions—maybe that is also taken as an instance of extrapolation?
The current state of the art for salary negotiations is really bad. It rewards disagreeableness, stubbornness and social skills, and is just so inelegant.
Here's a better way of doing salary negotiation:
Procedure via a two-sided sealed-bid auction, splitting the difference in bids[1]:
- Normal interviewing happens.
- Job-seeker decides on their minimum acceptable rate $m$.
- Employee-seeker decides on the maximum acceptable payment $M$.
- Both reveal $m$ and $M$, either first through hashsums of the numbers (with random text appended) and then the cleartext, or simply at the same time.
$m$ and $M$ do not need to be positive! It might be that the potential employee likes the project so much that they set $m$ to zero or even negative—an exceptionally great idea might be worth paying for. Or $M$ might be negative, in which case one party would be selling something.
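A minimal sketch of the commit-reveal variant in Python (the salting scheme, function names, and numbers are illustrative assumptions on my part, not part of the proposal):

```python
import hashlib
import secrets

def commit(bid):
    """Commit to a bid: publish the hash now, keep the salted cleartext secret."""
    salted = f"{bid}:{secrets.token_hex(16)}"  # random text appended as salt
    return hashlib.sha256(salted.encode()).hexdigest(), salted

def verify(commitment, salted):
    """Check a revealed bid against its earlier commitment."""
    assert hashlib.sha256(salted.encode()).hexdigest() == commitment
    return float(salted.split(":")[0])

def settle(minimum_rate, maximum_payment):
    """Split the difference in bids; no deal if the bids don't overlap."""
    if minimum_rate > maximum_payment:
        return None
    return (minimum_rate + maximum_payment) / 2

# Job-seeker commits to their minimum rate, employer to their maximum payment.
c_seeker, s_seeker = commit(90_000)
c_employer, s_employer = commit(120_000)
# Both reveal; each side verifies the other's commitment before the split.
salary = settle(verify(c_seeker, s_seeker), verify(c_employer, s_employer))
print(salary)  # 105000.0
```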
I'm not aware of anyone proposing this kind of auction for salary negotiation in particular; Claude 3.5 Sonnet states that it's similar to Vickrey auctions, but in this case there is no second price, and both parties are symmetrical.
I think that the setup described is probably not incentive-compatible due to the Myerson-Satterthwaite theorem, like the first-price sealed-bid auction. (I still think it's a vast improvement over the current state of the art, however.) For an incentive-compatible truthful mechanism the Vickrey-Clarke-Groves mechanism can be used, but I'm still a bit unsure how the subsidising would work. ↩︎
Thanks, that updates me. I've been enjoying your well-informed comments on big training runs, thank you!
On priors I think that Google Deepmind is currently running the biggest training run.
The Variety-Uninterested Can Buy Schelling-Products
Having many different products in the same category, such as many different kinds of clothes or cars or houses, is probably very expensive.
Some of us might not care enough about variety of products in a certain category to pay the extra cost of variety, and may even resent the variety-interested for imposing that cost.
But the variety-uninterested can try to recover some of the gains from eschewing variety by all buying the same product in some category. Often, this will mean buying the cheapest acceptable product from some category, or the product with the least amount of ornamentation or special features.
E.g. one can buy only black t-shirts, featureless cheap black socks, and simple metal cutlery. Next time I buy a laptop or a smartphone, I'll think about what the Schelling-laptop is. I suspect it's not a ThinkPad.
"Then let them all have the same kind of cake."
And: yes, the games weren't normalized to be zero-sum.
I wrote a short reply to Dagon, maybe that helps.
Otherwise I might write up a full post explaining this with examples &c.
Updated the link to the actual code. I computed the equilibria for the full game, and then computed the payoff per equilibrium for each player, and then took the mean for each player. I did the same but with the game with one option removed. The number in the chart is the proportion of games where removing one option from player A improved the payoff (averaged over equilibria).
If the number is >0.5, then that means that for that player, removing one option from A on average improves their payoffs. (The number of options is pre-removal). I also found this interesting, but the charts are maybe a bit misleading because often removing one option from A doesn't change the equilibria. I'll maybe generate some charts for this.
I'll perhaps also write a clearer explanation of what is happening and repost as a top-level post.
How Often Does Taking Away Options Help?
In some game-theoretic setups, taking options away from a player improves their situation. I ran a Monte-Carlo simulation to figure out how often that is the case, generating random normal form games with payoffs in , removing a random option from the first player, and comparing the Nash equilibria found via vertex enumeration of the best response polytope (using nashpy)—the Lemke-Howson algorithm was giving me duplicate results.
Code here, largely written by Claude 3.5 Sonnet.
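For a flavor of what a single trial looks like, here's a simplified self-contained sketch of my own that only checks pure-strategy equilibria (the actual code uses nashpy's vertex enumeration to find mixed equilibria; the payoff range and function names here are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def pure_equilibria(A, B):
    """All pure-strategy Nash equilibria (i, j) of the bimatrix game (A, B):
    i is a best response to j, and j is a best response to i."""
    return [(i, j)
            for i in range(A.shape[0]) for j in range(A.shape[1])
            if A[i, j] >= A[:, j].max() and B[i, j] >= B[i, :].max()]

def mean_payoff_A(A, B):
    """Player A's payoff averaged over pure equilibria (None if there are none)."""
    eqs = pure_equilibria(A, B)
    return np.mean([A[i, j] for i, j in eqs]) if eqs else None

def removal_helps_A(n=3, m=3):
    """One trial: does deleting a random row (one of A's options) raise A's
    mean equilibrium payoff? None when either game lacks a pure equilibrium."""
    A, B = rng.uniform(-1, 1, (n, m)), rng.uniform(-1, 1, (n, m))
    before = mean_payoff_A(A, B)
    row = rng.integers(n)
    after = mean_payoff_A(np.delete(A, row, 0), np.delete(B, row, 0))
    return None if before is None or after is None else after > before

# Monte-Carlo estimate of the proportion of games where removal helps A.
trials = [t for t in (removal_helps_A() for _ in range(500)) if t is not None]
print(sum(trials) / len(trials))
```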
I find the Thompson hack very fascinating from an agent foundations perspective. It's basically a small version of reflective stability in the context of operating systems.
I used to find compilers written in their own language kind of—…distasteful, in some way? Some of that is still present, because in reality it's just that the bootstrapping chains become very long and difficult to follow. But I think a small part of that distaste was the worry that Thompson-hack-style errors could occur accidentally at some point and just be propagated through the bootstrapping chain. After thinking about this for a few seconds, it was of course patently ridiculous.
But under this lens reflective stability becomes really difficult, because every replicating/successor-generating subsystem needs to be adapted to have the property of reflective stability.
E.g. corrigibility is really hard if one imagines it as a type of Thompson hack, especially under relative robustness to scale. You don't just get a basin of Thompson-hackness when writing compilers and making mistakes.
I think the splash images are >95th percentile in beauty among AI-generated images in posts, especially as they still carry some of the Midjourney v1-v3 vibe, which was much more gritty and earnest (if not realistic) than the current outputs.
I really like some of the images people have used for sequences, e.g. here, here, here and here. Wikimedia has tons of creative commons images as well which I'd use if I were more into that.
Funnily enough, I feel pretty similar about AI-generated images by now. I've always been struck by how people stick huge useless images on top of their blogposts, and other people want to read such articles more?? But now with AI-generated images this has multiplied to a point that's just frustrating—it means more time spent scrolling, and it evokes in me the feeling of someone trying to (and failing to) set an æsthetic atmosphere for a post and then convincing me through that atmosphere, instead of providing arguments or evidence.
I have seen this being done well, especially in some of Joseph Carlsmith's posts, in which he uses a lot of images of old paintings, real photographs etc. I always thought of myself as having no taste, but now seeing other people sticking the same-flavored AI images on top of their posts makes me reconsider. (And I notice that there's a clear difference in beauty between /r/museum and /r/Art.)
You didn't even try to prompt it to write in a style you like more.
For all their virtues, fine-tuned language models are pretty bad at imitating the style of certain writers. I've instructed them to write like well-known writers, or given them medium amounts of text I've written, but they almost always fall back on that horribly dreadful HR-speak college-essayist style. Bleh.
That argument seems plausibly wrong to me.
Take like 20% of my portfolio and throw it into some more tech/AI focused index fund. Maybe look around for something that covers some of the companies listed here on the brokerage interface that is presented to me (probably do a bit more research here)
Did you find a suitable index fund?
This response is specific to AI/AI alignment, right? I wasn't "sub-tweeting" the state of AI alignment, and was more thinking of other endeavours (quantified self, paradise engineering, forecasting research).
In general, the bias towards framing can be swamped by other considerations.
I hadn't thought of that! Good consideration.
I checked whether the BIG-bench canary string is online in rot-13 or base64, and it's not.
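The check amounts to encoding the canary both ways and searching the web for the results (the string below is a placeholder, not the actual canary):

```python
import base64
import codecs

# Placeholder: substitute the actual BIG-bench canary string here.
canary = "CANARY STRING GOES HERE"

# The two encodings to search for:
print(codecs.encode(canary, "rot13"))                # rot-13 variant
print(base64.b64encode(canary.encode()).decode())    # base64 variant
```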
Strong agree. I think this is because in the rest of the world, framing is a higher status activity than filling, so independent thinkers gravitate towards the higher-status activity of framing.
I don't know about signup numbers in general (the last comprehensive analysis was in 2020, when there was a clear trend)—but it definitely looks like people are still signing up for Alcor membership (six from January to March 2023).
However, in recent history, two cryonics organizations have been founded, Tomorrow.bio in Europe in 2019 (350 members signed up) and Southern Cryonics in Australia. People are being preserved, and Southern Cryonics recently suspended their first member.
- Watermarks in the Sand: Impossibility of Strong Watermarking for Language Models provides nice theory for the intuition that robust watermarking is likely impossible
The link here leads nowhere, I'm afraid.
This seems true, thanks for your editing on the related pages.
Trying to collect & link the relevant pages: