LessWrong 2.0 Reader

Recent comments

justus on What's the risk that AI tortures us all?

When do you think it would happen if it did happen?

justus on What's the risk that AI tortures us all?

What do you think the likelihood of extinction is and when would it probably happen?

dave-orr on What's the risk that AI tortures us all?

If you want a far future fictional treatment of this kind of situation, I recommend Surface Detail by Iain Banks.

mako-yass on mesaoptimizer's Shortform

That isn't anyone's first/preferred plan. I assure you everyone born in a liberal democracy has considered another plan before arriving at that one.

akash-wasil on robo's Shortform

Oh, good point. I think my original phrasing was too broad. I didn't mean to suggest that there were no high-quality policy discussions on LW; I meant to claim that the proportion/frequency of policy content is relatively limited. I've edited to reflect a more precise claim:

The vast majority of high-quality content on LessWrong is about technical stuff, and it's pretty rare to see high-quality policy discussions on LW these days (Zvi's coverage of various bills would be a notable exception). Partially as a result of this, some "serious policy people" don't really think LW users will have much to add.

(I haven't seen much from Scott or Robin about AI policy topics recently; agree that Zvi's posts have been helpful.)

(I also don't know of many public places that have good AI policy discussions. I do think the difference in quality between "public discussions" and "private discussions" is quite high in policy. I'm not quite sure what the difference looks like for people who are deep into technical research, but it seems likely to me that policy culture is more private/secretive than technical culture.)

viliam on On Privilege

What are the advantages of noticing all of this?

better model of the world;
not being an asshole, i.e. not assuming that other people could do just as well as you, if they only were not so fucking lazy;
realizing that your chances to achieve something may be better than you expected, because you have all these advantages over most potential competitors, so if you hesitated to do something because "there are so many people, many of them could do it much better than I could", the actual number of people who could do it may be much smaller than you have assumed, and most of them will be busy doing something else instead.

johnvon on Ilya Sutskever and Jan Leike resign from OpenAI [updated]

This interview was terrifying to me (and I think to Dwarkesh as well). Schulman continually demonstrates that he hasn't really thought about the AGI future scenarios in that much depth, and he sort of handwaves away any talk of future dangers.

Right off the bat he acknowledges that they reasonably expect AGI in 1-5 years or so, and even though Dwarkesh pushes him he doesn't present any more detailed plan for safety than "Oh we'll need to be careful and cooperate with the other companies...I guess..."

vladimir_nesov on Alexander Gietelink Oldenziel's Shortform

We start with an LLM trained on 50T tokens of real data, however capable it ends up being, and ask how to reach the same level of capability with synthetic data. If it takes more than 50T tokens of synthetic data, then it was less valuable per token than real data.

But at the same time, 500T tokens of synthetic data might train an LLM more capable than if trained on the 50T tokens of real data for 10 epochs. In that case, synthetic data helps with scaling capabilities beyond what real data enables, even though it's still less valuable per token.

With Go, we might just be running into the contingent fact of there not being enough real data to be worth talking about, compared with LLM data for general intelligence. If we run out of real data before some threshold of usefulness, synthetic data becomes crucial (which is the case with Go). It's unclear if this is the case for general intelligence with LLMs, but if it is, then there won't be enough compute to improve the situation unless synthetic data also becomes better per token, and not merely mitigates the data bottleneck and enables further improvement given unbounded compute.

I would be genuinely surprised if training a transformer on the pre-2014 human Go data over and over would lead it to spontaneously develop AlphaZero capacity.

I expect that if we could magically sample much more pre-2024 unique human Go data than was actually generated by actual humans (rather than repeating the limited data we have), from the same platonic source and without changing the level of play, then it would be possible to cheaply tune an LLM trained on it to play superhuman Go.

steve-kommrusch on Super-Exponential versus Exponential Growth in Compute Price-Performance

The Tom's Hardware article is interesting, thanks. It makes the point that the price quoted may not include the full 'cost of revenue' for the product, in that it might be the bare die price and not the tested and packaged part (yields from fabs aren't 100%, so extensive functional testing of every part adds cost). The article also notes that R&D costs aren't included in that figure; R&D spending at NVIDIA (and TSMC, Intel, AMD, etc.) is what keeps that exponential perf-per-dollar moving along.

For my own curiosity, I looked into current and past income statements for companies. Today, NVIDIA's latest income statement for the fiscal year ending 1/31/2024 has $61B in revenue, $17B for cost of revenue (that would include the die cost, as well as testing and packaging), R&D of $9B, and a total operating income of $33B. AMD for their fiscal year ending 12/31/2023 had $23B revenue, $12B cost of revenue, $6B R&D, and $0.4B operating income. Certainly NVIDIA is making more profit, but the original author and Wikipedia picked the AMD RX 7600 as the 2023 price-performance leader, and there isn't much room in AMD's income statement to lower those prices. While NVIDIA could cut their revenue in half and still make a profit in 2023, in 2022 their profit was $4B on $27B in revenue.

FWIW, Goodyear Tire, selected by me 'randomly' as an example of a company making a product with lower technology innovation year-to-year, had $20B revenue for the most recent year, $17B cost of revenue, and no R&D expense. So if we someday plateau silicon technology (even if ASI can help us build transistors smaller than atoms, the Planck length is out there at some point), then maybe silicon companies will start cutting costs down to bare manufacturing costs.

As a last study, the Wikipedia page on FLOPS cited the Pentium Pro from Intel as part of the 1997 perf-per-dollar system. For 1997, Intel reported $25B in revenues, $10B cost of sales (die, testing, packaging, etc.), $2B in R&D, and an operating income of $10B; so it was spending a decent amount on R&D too in order to stay on the Moore's law curve.
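To make the comparison easier to eyeball, here is a minimal sketch that turns the rounded figures quoted above into shares of revenue. The dictionary layout and the names `FIGURES`, `as_share_of_revenue`, and `SHARES` are illustrative assumptions, not anything from the filings; the numbers are only the rounded ones from this comment, and Goodyear's operating income is left blank because it isn't quoted.

```python
# Rounded figures in billions of USD, copied from the comment above.
# Goodyear's operating income is not quoted there, so it is left as None.
FIGURES = {
    "NVIDIA FY2024": {"revenue": 61, "cost_of_revenue": 17, "rnd": 9, "operating_income": 33},
    "AMD FY2023": {"revenue": 23, "cost_of_revenue": 12, "rnd": 6, "operating_income": 0.4},
    "Goodyear (recent year)": {"revenue": 20, "cost_of_revenue": 17, "rnd": 0, "operating_income": None},
    "Intel 1997": {"revenue": 25, "cost_of_revenue": 10, "rnd": 2, "operating_income": 10},
}


def as_share_of_revenue(row):
    """Return each quoted line item as a percentage of revenue (None if not quoted)."""
    revenue = row["revenue"]
    return {
        key: None if value is None else 100 * value / revenue
        for key, value in row.items()
        if key != "revenue"
    }


def fmt(pct):
    """Format a percentage for printing, tolerating missing values."""
    return "n/a" if pct is None else f"{pct:.0f}%"


SHARES = {name: as_share_of_revenue(row) for name, row in FIGURES.items()}

for name, share in SHARES.items():
    print(
        f"{name}: cost of revenue {fmt(share['cost_of_revenue'])}, "
        f"R&D {fmt(share['rnd'])}, operating income {fmt(share['operating_income'])}"
    )
```

On these rounded inputs, R&D comes out as a much larger slice of revenue for AMD than for 1997 Intel, and cost of revenue dominates for Goodyear, which is the contrast the paragraphs above are pointing at.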

I agree with Foyle's point that even with successful AGI alignment the socioeconomic implications are huge, but that's a discussion for another day...

yanni-kyriacos on Examples of Highly Counterfactual Discoveries?

Thanks :) Uh, good question. Making some good links? Have you done much nondual practice? I highly recommend Loch Kelly :)
