Posts

Davidmanheim's Shortform 2025-01-16T08:23:40.952Z
Exploring Cooperation: The Path to Utopia 2024-12-25T18:31:55.565Z
Moderately Skeptical of "Risks of Mirror Biology" 2024-12-20T12:57:31.824Z
Most Minds are Irrational 2024-12-10T09:36:33.144Z
Refuting Searle’s wall, Putnam’s rock, and Johnson’s popcorn 2024-12-09T08:24:26.594Z
Mitigating Geomagnetic Storm and EMP Risks to the Electrical Grid (Shallow Dive) 2024-11-26T08:00:04.810Z
Proveably Safe Self Driving Cars [Modulo Assumptions] 2024-09-15T13:58:19.472Z
Are LLMs on the Path to AGI? 2024-08-30T03:14:04.710Z
Scaling Laws and Likely Limits to AI 2024-08-18T17:19:46.597Z
Misnaming and Other Issues with OpenAI's “Human Level” Superintelligence Hierarchy 2024-07-15T05:50:17.770Z
Biorisk is an Unhelpful Analogy for AI Risk 2024-05-06T06:20:28.899Z
A Dozen Ways to Get More Dakka 2024-04-08T04:45:19.427Z
"Open Source AI" isn't Open Source 2024-02-15T08:59:59.034Z
Technologies and Terminology: AI isn't Software, it's... Deepware? 2024-02-13T13:37:10.364Z
Safe Stasis Fallacy 2024-02-05T10:54:44.061Z
AI Is Not Software 2024-01-02T07:58:04.992Z
Public Call for Interest in Mathematical Alignment 2023-11-22T13:22:09.558Z
What is autonomy, and how does it lead to greater risk from AI? 2023-08-01T07:58:06.366Z
A Defense of Work on Mathematical AI Safety 2023-07-06T14:15:21.074Z
"Safety Culture for AI" is important, but isn't going to be easy 2023-06-26T12:52:47.368Z
"LLMs Don't Have a Coherent Model of the World" - What it Means, Why it Matters 2023-06-01T07:46:37.075Z
Systems that cannot be unsafe cannot be safe 2023-05-02T08:53:35.115Z
Beyond a better world 2022-12-14T10:18:26.810Z
Far-UVC Light Update: No, LEDs are not around the corner (tweetstorm) 2022-11-02T12:57:23.445Z
Announcing AISIC 2022 - the AI Safety Israel Conference, October 19-20 2022-09-21T19:32:35.581Z
Rehovot, Israel – ACX Meetups Everywhere 2022 2022-08-25T18:01:16.106Z
AI Governance across Slow/Fast Takeoff and Easy/Hard Alignment spectra 2022-04-03T07:45:57.592Z
Arguments about Highly Reliable Agent Designs as a Useful Path to Artificial Intelligence Safety 2022-01-27T13:13:11.011Z
Elicitation for Modeling Transformative AI Risks 2021-12-16T15:24:04.926Z
Modelling Transformative AI Risks (MTAIR) Project: Introduction 2021-08-16T07:12:22.277Z
Maybe Antivirals aren’t a Useful Priority for Pandemics? 2021-06-20T10:04:08.425Z
A Cruciverbalist’s Introduction to Bayesian reasoning 2021-04-04T08:50:07.729Z
Systematizing Epistemics: Principles for Resolving Forecasts 2021-03-29T20:46:06.923Z
Resolutions to the Challenge of Resolving Forecasts 2021-03-11T19:08:16.290Z
The Upper Limit of Value 2021-01-27T14:13:09.510Z
Multitudinous outside views 2020-08-18T06:21:47.566Z
Update more slowly! 2020-07-13T07:10:50.164Z
A Personal (Interim) COVID-19 Postmortem 2020-06-25T18:10:40.885Z
Market-shaping approaches to accelerate COVID-19 response: a role for option-based guarantees? 2020-04-27T22:43:26.034Z
Potential High-Leverage and Inexpensive Mitigations (which are still feasible) for Pandemics 2020-03-09T06:59:19.610Z
Ineffective Response to COVID-19 and Risk Compensation 2020-03-08T09:21:55.888Z
Link: Does the following seem like a reasonable brief summary of the key disagreements regarding AI risk? 2019-12-26T20:14:52.509Z
Updating a Complex Mental Model - An Applied Election Odds Example 2019-11-28T09:29:56.753Z
Theater Tickets, Sleeping Pills, and the Idiosyncrasies of Delegated Risk Management 2019-10-30T10:33:16.240Z
Divergence on Evidence Due to Differing Priors - A Political Case Study 2019-09-16T11:01:11.341Z
Hackable Rewards as a Safety Valve? 2019-09-10T10:33:40.238Z
What Programming Language Characteristics Would Allow Provably Safe AI? 2019-08-28T10:46:32.643Z
Mesa-Optimizers and Over-optimization Failure (Optimizing and Goodhart Effects, Clarifying Thoughts - Part 4) 2019-08-12T08:07:01.769Z
Applying Overoptimization to Selection vs. Control (Optimizing and Goodhart Effects - Clarifying Thoughts, Part 3) 2019-07-28T09:32:25.878Z
What does Optimization Mean, Again? (Optimizing and Goodhart Effects - Clarifying Thoughts, Part 2) 2019-07-28T09:30:29.792Z

Comments

Comment by Davidmanheim on Gradual Disempowerment: Systemic Existential Risks from Incremental AI Development · 2025-01-30T21:26:15.491Z · LW · GW

I think this is correct, but it doesn't seem to note the broader trend towards human disempowerment in favor of bureaucratic and corporate systems - a trend this gradual disempowerment would continue - and hence it elides or ignores why AI risk is distinct.

Comment by Davidmanheim on Should you go with your best guess?: Against precise Bayesianism and related views · 2025-01-28T15:53:48.801Z · LW · GW

"when, if ever, our credences ought to capture indeterminacy in how we weigh up considerations/evidence"


The obvious answer is only when there is enough indeterminacy to matter; I'm not sure if anyone would disagree. Because the question isn't whether there is indeterminacy, it's how much, and whether it's worth the costs of using a more complex model instead of doing it the Bayesian way.

I'd be surprised if many/most infra-Bayesians would endorse suspending judgment in the motivating example in this post

You also didn't quite endorse suspending judgement in that case - "If someone forced you to give a best guess one way or the other, you suppose you’d say “decrease”. Yet, this feels so arbitrary that you can’t help but wonder whether you really need to give a best guess at all…" So, yes, if it's not directly decision relevant, sure, don't pick, say you're uncertain. Which is best practice even if you use precise probability - you can have a preference for robust decisions, or a rule for withholding judgement when your confidence is low. But if it is decision relevant, and there is only a binary choice available, your best guess matters. And this is exactly why Eliezer says that when there is a decision, you need to focus your indeterminacy, and why he was dismissive of DS and similar approaches.

Comment by Davidmanheim on Should you go with your best guess?: Against precise Bayesianism and related views · 2025-01-28T11:48:53.277Z · LW · GW

I’m not merely saying that agents shouldn’t have precise credences when modeling environments more complex than themselves


You seem to be underestimating how pervasive / universal this critique is - essentially every environment is more complex than we are, at the very least when we're embedded agents or when other humans are involved. So I'm not sure how your criticism (which I agree with) does more than a very strong version of the basic argument already does - it just seems to state it more clearly.
 

 The problem is that Kolmogorov complexity depends on the language in which algorithms are described. Whatever you want to say about invariances with respect to the description language, this has the following unfortunate consequence for agents making decisions on the basis of finite amounts of data: For any finite sequence of observations, we can always find a silly-looking language in which the length of the shortest program outputting those observations is much lower than that in a natural-looking language (but which makes wildly different predictions of future data).

Far less confident here, but I think this isn't correct as a matter of practice. Conceptually, Solomonoff doesn't say "pick an arbitrary language once you've seen the data and then do the math"; it says "pick an arbitrary language before you've seen any data and then do the math." And if we need to implement the silly-looking language, there is a complexity penalty for doing that, one that's going to be similarly large regardless of what baseline we choose, and we can determine how large it is by reducing the language to some other language. (And I may be wrong, but picking a language cleverly should not mean that Kolmogorov complexity turns something requiring NP programs to encode into something that P programs can encode, so this criticism seems weak anyways outside of toy examples.)
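To make the "similarly large penalty" point concrete, here is the standard invariance theorem for Kolmogorov complexity (a textbook fact, stated here as a sketch rather than anything from the thread):

```latex
% For any two universal description languages L_1 and L_2 there is a constant
% c_{L_1,L_2}, roughly the length of an interpreter for L_2 written in L_1,
% that does not depend on the string x:
K_{L_1}(x) \le K_{L_2}(x) + c_{L_1,L_2}
```

The constant depends only on the pair of languages, not on the data, which is the formal version of the complexity-penalty point: cooking up a data-specific "silly-looking" language just moves the description length into the interpreter for that language.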

Comment by Davidmanheim on Davidmanheim's Shortform · 2025-01-18T21:59:27.860Z · LW · GW

Strongly agree. I was making a narrower point, but the metric is clearly different from the goal - if anything, it's more surprising that we see as much correlation as we do, given how much it has been optimized.

Comment by Davidmanheim on Davidmanheim's Shortform · 2025-01-16T08:23:41.386Z · LW · GW

Toby Ord writes that "the required resources [for LLM training] grow polynomially with the desired level of accuracy [measured by log-loss]." He then concludes that this shows "very poor returns to scale," and christens it the "Scaling Paradox." (He goes on to point out that this doesn't imply it can't create superintelligence, and I agree with him about that.)

But what would it look like if this were untrue? That is, what would be the conceptual alternative, where required resources grow more slowly? I think the answer is that it's conceptually impossible.

To start, there is a fundamental bound on loss at zero, since the best possible model perfectly predicts everything - it exactly learns the distribution. This can happen when overfitting a model, but it can also happen when there is a learnable ground truth; models that are trained to learn a polynomial function can learn it exactly.

But there is strong reason to expect the bound to be significantly above zero loss. The training data for LLMs contains lots of aleatory randomness, things that are fundamentally conceptually unpredictable. I think it's likely that things like RAND's random number book are in the training data, and it's fundamentally impossible to predict randomness. I think something similar is generally true for many other things - predicting word choice among semantically equivalent words, predicting where typos occur, etc.
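As a concrete version of that floor (standard information theory, not anything specific to LLMs): the expected log-loss of a model q on data drawn from the true distribution p decomposes as

```latex
\mathbb{E}_{x \sim p}\left[-\log q(x)\right] = H(p) + D_{\mathrm{KL}}(p \,\|\, q) \ge H(p)
```

so the loss can approach zero only when the data is deterministic (H(p) = 0). Any aleatory randomness in the corpus, from random number tables to typos to arbitrary word choices, sets a hard floor strictly above zero no matter how much compute is spent.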

Aside from being bound well above zero, there's a strong reason to expect that scaling is required to reduce loss for some tasks. In fact, it's mathematically guaranteed to require significant computation to get near that level for many tasks that are in the training data. Eliezer pointed out that GPTs are predictors, and gave the example of a list of numbers followed by their two prime factors. It's easy to generate such a list by picking pairs of primes and multiplying them, then writing the answer first - but decreasing loss when generating the next token, to predict the primes from the product, is definitionally going to require exponentially more computation to perform better for larger primes.
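A toy sketch of that asymmetry in Python (my own illustration of the setup Eliezer describes; the ranges and helper names are arbitrary): generating a training line is cheap, but predicting its continuation from the product alone means factoring.

```python
import random

def is_prime(n: int) -> bool:
    """Trial division; fine for the small illustrative range used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def random_prime(lo: int, hi: int) -> int:
    """Rejection-sample a prime in [lo, hi)."""
    while True:
        n = random.randrange(lo, hi)
        if is_prime(n):
            return n

# Forward direction (how the training data gets made): pick primes, multiply,
# and write the product *before* its factors.
p, q = random_prime(10**3, 10**4), random_prime(10**3, 10**4)
line = f"{p * q} = {min(p, q)} x {max(p, q)}"

# Backward direction (what next-token prediction demands): recover the factors
# from the product alone, i.e. factor N. Trial division is exponential in the
# number of digits of N, while the forward direction above stayed cheap.
def factor_semiprime(N: int) -> tuple[int, int]:
    d = 2
    while d * d <= N:
        if N % d == 0:
            return d, N // d
        d += 1
    return N, 1

print(line)
print(factor_semiprime(p * q))
```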

And I don't think this is the exception, I think it's at least often the rule. The training data for LLMs contains lots of data where the order of the input doesn’t follow the computational order of building that input. When I write an essay, I sometimes arrive at conclusions and then edit the beginning to make sense. When I write code, the functions placed earlier often don’t make sense until you see how they get used later. Mathematical proofs are another example where this would often be true.

An obvious response is that we've been using exponentially more compute to better accomplish tasks that aren't impossible in this way - but I'm unsure if that is true. Benchmarks keep getting saturated, and there's no natural scale for intelligence. So I'm left wondering whether there's any actual content in the "Scaling Paradox."

(Edit: now also posted to my substack.)

Comment by Davidmanheim on AI Safety as a YC Startup · 2025-01-15T09:23:18.665Z · LW · GW

True, and even more, if optimizing for impact or magnitude has Goodhart effects, of various types, then even otherwise good directions are likely to be ruined by pushing on them too hard. (In large part because it seems likely that the space we care about won't have linear divisions into good and bad; there will be much more complex regions, and even when pointed in a direction that is locally better, pushing too far is possible, and very hard to predict from local features even if people try, which they mostly don't.)

Comment by Davidmanheim on AI Safety as a YC Startup · 2025-01-15T09:20:07.855Z · LW · GW

I think the point wasn't having a unit norm; it was that impact wasn't defined as directional, so we'd need to remove the dimensionality from a multidimensionally defined direction.

So to continue the nitpicking, I'd argue impact = || Magnitude * Direction ||, or better, ||Impact|| = Magnitude * Direction, so that we can talk about size of impact. And that makes my point in a different comment even clearer - because almost by assumption, the vast majority of those with large impact are pointed in net-negative directions, unless you think either a significant proportion of directions are positive, or that people are selecting for it very strongly, which seems not to be the case.

Comment by Davidmanheim on We probably won't just play status games with each other after AGI · 2025-01-15T08:23:12.304Z · LW · GW

I think some of this is on target, but I also think there's insufficient attention to a couple of factors.

First, in the short and intermediate term, I think you're overestimating how much most people will actually update their personal feelings around AI systems. I agree that there is a fundamental reason that fairly near-term AI will be able to function as a better companion and assistant than humans - but as a useful parallel, we know that nuclear power is fundamentally better than most other power sources that were available in the 1960s, yet people's semi-irrational yuck reaction to "dirty" or "unclean" radiation - far more than the actual risks - made it publicly unacceptable. Similarly, I think the public perception of artificial minds will be generally pretty negative, especially looking at current public views of AI. (Regardless of how appropriate or good this is in relation to loss-of-control and misalignment, it seems pretty clearly maladaptive for generally friendly near-AGI and AGI systems.)

Second, I think there is a paperclip maximizer aspect to status competition, in the sense Eliezer uses the concept. That is, given massively increased wealth, abilities, and capacity, even if an implausibly large 99% of humans find great ways to enhance their lives in ways that don't devolve into status competition, there are few other domains where an indefinite amount of wealth and optimization power can be applied usefully. Obviously, this is at best zero-sum, but I think there aren't lots of obvious alternative places for positive-sum indefinite investments. And even where such positive-sum options exist, they often are harder to arrive at as equilibria. (We see a similar dynamic with education, housing, and healthcare, where increasing wealth leads to competition over often artificially-constrained resources rather than expansion of useful capacity.)

Finally and more specifically, your idea that we'd see intelligence enhancement as a new (instrumental) goal in the intermediate term seems possible and even likely, but not a strong competitor for, nor inhibitor of, status competition. (Even ignoring the fact that intelligence itself is often an instrumental goal for status competition!) Even aside from the instrumental nature of the goal, I will posit that points of strongly diminishing returns to investment will exist - regardless of the fact that it's unlikely on priors that these limits are near the current levels. Once those points are reached, further indefinite investment of resources will trade off between more direct status competition and further intelligence increases, and as the latter shows decreasing returns, as noted above, the former becomes the metaphorical paperclip into which individuals can invest indefinitely.

Comment by Davidmanheim on AI Safety as a YC Startup · 2025-01-08T16:50:21.156Z · LW · GW

my uninformed intuition is that the people with the biggest positive impact on the world have prioritized the Magnitude

 

That's probably true, but it's selecting on the outcome variable. And I'll bet that the people with the biggest negative impact are even more overwhelmingly also those who prioritized magnitude.

Comment by Davidmanheim on When Is Insurance Worth It? · 2025-01-01T15:00:35.064Z · LW · GW

"If you already know that an adverse event is highly likely for your specific circumstances, then it is likely that the insurer will refuse to pay out for not disclosing "material information" - a breach of contract."

Having worked in insurance, that's not what the companies usually do. Denying explicitly for clear but legally hard to defend reasons, especially those which a jury would likely rule against, isn't a good way to reduce costs and losses. (They usually will just say no and wait to see if you bother following up. Anyone determined enough to push to get a reasonable claim is gonna be cheaper to pay out for than to fight.)

Comment by Davidmanheim on Alexander Gietelink Oldenziel's Shortform · 2024-12-30T00:46:14.112Z · LW · GW

Yes - the word 'global' is a minimum necessary qualification for referring to catastrophes of the type we plausibly care about - and even then, it is not always clear that something like COVID-19 was too small an event to qualify.

Comment by Davidmanheim on Refuting Searle’s wall, Putnam’s rock, and Johnson’s popcorn · 2024-12-29T19:36:49.664Z · LW · GW

I definitely appreciate that confusion. I think it's a good reason to read the sequence and think through the questions clearly; https://www.lesswrong.com/s/p3TndjYbdYaiWwm9x - I think this resolves the vast majority of the confusion people have, even if it doesn't "answer" the questions.

Comment by Davidmanheim on When Is Insurance Worth It? · 2024-12-26T22:22:04.360Z · LW · GW

The math is good, the point is useful, the explanations are fine, but embracing the straw-Vulcan version of rationality and dismissing any notion of people legitimately wanting things other than money seems really quite bad, which leaves me wishing this wasn't being highlighted for visitors to the site.

Comment by Davidmanheim on Moderately Skeptical of "Risks of Mirror Biology" · 2024-12-22T05:08:20.673Z · LW · GW

I don't understand. You shouldn't get any changes from changing the encoding if it produces the same proteins - the difference for mirror life is that it would also mirror proteins, etc.

Comment by Davidmanheim on Announcement: AI for Math Fund · 2024-12-21T16:20:01.961Z · LW · GW

Plausibly, yes. But so does programming capability, which is actually a bigger deal. (And it's unclear that a traditionally envisioned intelligence explosion is possible with systems built on LLMs, though I'm certainly not convinced by that argument.)

Comment by Davidmanheim on Announcement: AI for Math Fund · 2024-12-20T13:41:46.139Z · LW · GW

But "creating safety guarantees into which you can plug in AI capabilities once they arise anyway" is the point, and it requires at least some non-trivial advances in AI capabilities.

You should probably read the current programme thesis.

Comment by Davidmanheim on Announcement: AI for Math Fund · 2024-12-20T13:04:48.853Z · LW · GW

It is speculative in the sense that any new technology being developed is speculative - but closely related approaches are already used for assurance in practice, so provable safety isn't actually just speculative; there are concrete benefits in the near term. And I would challenge you to name a different and less speculative framework that actually deals with any issues of ASI risk that isn't pure hopium.

Uncharitably, but I think not entirely inaccurately, these include: "maybe AI can't be that much smarter than humans anyways," "let's get everyone to stop forever," "we'll use AI to figure it out, even though we have no real ideas," "we just will trust that no-one makes it agentic," "the agents will be able to be supervised by other AI which will magically be easier to align," "maybe multiple AIs will compete in ways that aren't a disaster," "maybe we can just rely on prosaic approaches forever and nothing bad happens," "maybe it will be better than humans at having massive amounts of unchecked power by default." These all certainly seem to rely far more on speculative claims, with far less concrete ideas about how to validate or ensure them.

Comment by Davidmanheim on Announcement: AI for Math Fund · 2024-12-20T06:06:43.701Z · LW · GW

It is critical for guaranteed safe AI and many non-prosaic alignment agendas. I agree it has risks, since all AI capabilities and advances pose control risks, but it seems better than most types of general capabilities investments.

Do you have a more specific model of why it might be negative?

Comment by Davidmanheim on Trying to translate when people talk past each other · 2024-12-18T07:23:15.261Z · LW · GW

I don't think it was betrayal, I think it was skipping verbal steps, which left intent unclear.

If A had said "I promised to do X; is it OK now if I do Y instead?" there would presumably have been no confusion. Instead, they announced their plan before doing Y, leaving the permission request implicit. The point that "she needed A to acknowledge that he'd unilaterally changed an agreement" was critical to B, but I suspect A thought that stating the new plan did that implicitly.

Comment by Davidmanheim on MIRI's June 2024 Newsletter · 2024-12-14T19:56:26.729Z · LW · GW

Strongly agree that there needs to be an institutional home. My biggest problem is that there is still no such new home!

Comment by Davidmanheim on Refuting Searle’s wall, Putnam’s rock, and Johnson’s popcorn · 2024-12-12T08:44:18.935Z · LW · GW

You should also read the relevant sequence about dissolving the problem of free will: https://www.lesswrong.com/s/p3TndjYbdYaiWwm9x

Comment by Davidmanheim on Refuting Searle’s wall, Putnam’s rock, and Johnson’s popcorn · 2024-12-12T08:42:29.488Z · LW · GW

You believe that something inert cannot be doing computation. I agree. But you seem to think it's coherent that a system with no action - a post-hoc mapping of states - can be doing computation.

The place where comprehension of Chinese exists in the "Chinese room" is the creation of the mapping - the mapping itself is a static object, and the person in the room, by assumption, is doing no cognitive work, just looking up entries. "But wait!" we can object, "this means that the Chinese room doesn't understand Chinese!" And I think that's the point of confusion - repeating answers someone else tells you isn't the same as understanding. The fact that the "someone else" wrote down the answers changes nothing. The question is where and when the computation occurred.

In our scenarios, there are a couple different computations - but the creation of the mapping unfairly sneaks in the conclusion that the execution of the computation, which is required to build the mapping, isn't what creates consciousness!

Comment by Davidmanheim on Refuting Searle’s wall, Putnam’s rock, and Johnson’s popcorn · 2024-12-12T05:55:13.337Z · LW · GW

Good point. The problem I have with that is that in every listed example, the mapping either requires the execution of the conscious mind and a readout of its output and process in order to build it, or it stipulates that it is well enough understood that it can be mapped to an arbitrary process, thereby implicitly also requiring that it was run elsewhere.

Comment by Davidmanheim on Refuting Searle’s wall, Putnam’s rock, and Johnson’s popcorn · 2024-12-11T16:36:48.560Z · LW · GW

That seems like a reasonable idea. It seems not at all related to what any of the philosophers proposed.

For their proposals, it seems like the computational process is more like:
1. Extract a specific string of 0s and 1s from the sandstorm's initial position, and another from its final position, each with the same length as the full description of the mind.
2. Calculate the bitwise sum of the initial mind state and the initial sand position.
3. Calculate the bitwise sum of the final mind state and the final sand position.
4. Take the output of step 2 and replace it with the output of step 3.
5. Declare that the sandstorm is doing something isomorphic to what the mind did. Ignore the fact that the internal process is completely unrelated, that all of the computation was done inside of the mind, and that you're just copying answers. (See the sketch below.)
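A minimal sketch of why step 5 is a cheat (my own illustration; the state values and bit-width are made up): the per-step XOR keys can only be constructed from a mind trace that was already computed somewhere else.

```python
import secrets

# Suppose the mind was already run somewhere, and its state trace recorded as bitstrings.
mind_trace = [0b1011, 0b0110, 0b1110]                    # hypothetical recorded mind states
sand_trace = [secrets.randbits(4) for _ in mind_trace]   # arbitrary "sandstorm" states

# The post-hoc mapping: one XOR key per step, chosen so that (sand state XOR key)
# reproduces the corresponding mind state. This is the "bitwise sum" in steps 2-4.
keys = [s ^ m for s, m in zip(sand_trace, mind_trace)]

# Now the sandstorm appears to "compute" the mind's trajectory...
decoded = [s ^ k for s, k in zip(sand_trace, keys)]
assert decoded == mind_trace

# ...but the keys could only be built because mind_trace already existed. The sandstorm
# contributes nothing; the mapping is just a copy of the answers.
```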

Comment by Davidmanheim on Most Minds are Irrational · 2024-12-11T11:30:39.867Z · LW · GW

I agree that's a more interesting question, and computational complexity theorists have done work on it which I don't fully understand, but it also doesn't seem as relevant for AI safety questions.

Comment by Davidmanheim on Most Minds are Irrational · 2024-12-10T13:05:38.149Z · LW · GW

Regarding chess agents, Vanessa pointed out that while only perfect play is optimal, informally we would consider agents to have an objective that is better served by slightly better play; for example, an agent rated 2500 Elo is better than one rated 1800, which is better than one rated 1000, etc. That means that lots of non-optimal "chess minds" are still somewhat rational with respect to their goal.

I think that it's very likely that even according to this looser definition, almost all chess moves, and therefore almost all "possible" chess bots, fail to do much to accomplish the goal. 
We could check this informally by evaluating what proportion of the possible moves in normal games would be classified as blunders, using a method such as the one used here to evaluate what proportion of actual moves made by players are blunders. Figure 1 there implies that in positions with many legal moves, a larger proportion are blunders - but this is looking at the empirical blunder rate among those good enough to be playing ranked chess. Another method would be to look at a bot that actually implements "pick a random legal move" - namely Brutus RND. It has an Elo of 255 when ranked against other amateur chess bots, and wins only occasionally against some of the worst bots; it seems hard to figure out from that what proportion of moves are good, but it's evidently a fairly small proportion.
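For a rough sense of what an Elo of 255 implies, here is the standard Elo expected-score formula as a back-of-the-envelope check (the 800 rating for a weak opponent is my assumption, not taken from the linked rankings):

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

# Brutus RND (random legal moves, ~255 Elo) against a hypothetical 800-rated weak bot:
print(round(expected_score(255, 800), 3))  # ~0.04, i.e. roughly one point every ~24 games
```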

Comment by Davidmanheim on Refuting Searle’s wall, Putnam’s rock, and Johnson’s popcorn · 2024-12-09T23:14:36.300Z · LW · GW

We earlier mentioned that it is required that the finite mapping be precomputed. If it is for arbitrary Turing machines, including those that don't halt, we need infinite time, so the claim that we can map to arbitrary Turing machines fails. If we restrict it to those which halt, we need to check that before providing the map, which requires solving the halting problem to provide the map.

Edit to add: I'm confused why this is getting "disagree" votes - can someone explain why or how this is an incorrect logical step?

Comment by Davidmanheim on Refuting Searle’s wall, Putnam’s rock, and Johnson’s popcorn · 2024-12-09T21:40:56.055Z · LW · GW

OK, so this is helpful, but if I understood you correctly, I think it's assuming too much about the setup. For #1, in the examples we're discussing, the states of the object aren't predictably changing in complex ways - the object just changes "states" in ways that can be predicted to follow a specific path, which can be mapped to some set of states. The states are arbitrary, and per the argument don't vary in some way that does any work - and so, as I argued, they can be mapped to some set of consecutive integers. But this means that the actions of the physical object are predetermined in the mapping.

And the difference between that situation and the CNS is that we know the neural circuitry is doing work - the exact features are complex and only partly understood, but the result is clearly capable of doing computation in the sense of Turing machines.

Comment by Davidmanheim on Language Models are a Potentially Safe Path to Human-Level AGI · 2024-12-09T16:58:41.306Z · LW · GW

I think this was a valuable post, albeit ending up somewhat incorrect about whether LLMs would be agentic - not because they developed the capacity on their own, but because people intentionally built and are building structure around LLMs to enable agency. That said, the underlying point stands - it is very possible that LLMs could be a safe foundation for non-agentic AI, and many research groups are pursuing that today.

Comment by Davidmanheim on Five Worlds of AI (by Scott Aaronson and Boaz Barak) · 2024-12-09T16:55:26.651Z · LW · GW

The blogpost this points to was an important contribution at the time, more clearly laying out extreme cases for the future.  (The replies there were also particularly valuable.)

Comment by Davidmanheim on "Publish or Perish" (a quick note on why you should try to make your work legible to existing academic communities) · 2024-12-09T16:45:32.863Z · LW · GW

I think this post makes an important and still neglected claim that people should write their work more clearly and get it published in academia, instead of embracing the norms of the narrower community they interact with. There has been significant movement in this direction in the past 2 years, and I think this post marks a critical change in what the community suggests and values in terms of output.

Comment by Davidmanheim on Refuting Searle’s wall, Putnam’s rock, and Johnson’s popcorn · 2024-12-09T16:29:45.446Z · LW · GW

"the actual thinking-action that the mapping interprets"


I don't think this is conceptually correct. Looking at the chess playing waterfall that Aaronson discusses, the mapping itself is doing all of the computation. The fact that the mapping ran in the past doesn't change the fact that it's the location of the computation, any more than the fact that it takes milliseconds for my nerve impulses to reach my fingers means that my fingers are doing the thinking in writing this essay. (Though given the typos you found, it would be convenient to blame them.)

they assume ad arguendo that you can instantiate the computations we're interested in (consciousness) in a headful of meat, and then try to show that if this is the case, many other finite collections of matter ought to be able to do the job just as well.

Yes, they assume that whatever runs the algorithm is experiencing running the algorithm from the inside. And yes, many specific finite systems can do so - namely, GPUs and CPUs, as well as the wetware in our head. But without the claim that arbitrary items can do these computations, it seems that the arguendo is saying nothing different than the conclusion - right?

Comment by Davidmanheim on Refuting Searle’s wall, Putnam’s rock, and Johnson’s popcorn · 2024-12-09T16:23:23.790Z · LW · GW

Looks like I messed up cutting and pasting - thanks!

Comment by Davidmanheim on Refuting Searle’s wall, Putnam’s rock, and Johnson’s popcorn · 2024-12-09T14:33:36.233Z · LW · GW

Thanks - fixed!

Comment by Davidmanheim on Refuting Searle’s wall, Putnam’s rock, and Johnson’s popcorn · 2024-12-09T14:31:39.493Z · LW · GW

Yeah, perhaps refuting is too strong given that the central claim is that we can't know what is and is not doing computation - which I think is wrong, but requires a more nuanced discussion. However, the narrow claims they made, inter alia, were strong enough to refute, specifically by showing that their claims are equivalent to saying the integers are doing arbitrary computation - when making the claim itself requires the computation to take place elsewhere!

Comment by Davidmanheim on Do simulacra dream of digital sheep? · 2024-12-09T13:52:43.772Z · LW · GW

Seems worth noting that the claim of most of the philosophers being cited here is (1) - that even rocks are doing the same computation as minds.

Comment by Davidmanheim on Refuting Searle’s wall, Putnam’s rock, and Johnson’s popcorn · 2024-12-09T13:49:05.415Z · LW · GW

I agree that this wasn't intended as an introduction to the topic. For that, I will once again recommend Scott Aaronson's excellent mini-book explaining computational complexity to philosophers.

I agree that the post isn't a definition of what computation is - but I don't need to be able to define fire to be able to point out something that definitely isn't on fire! So I don't really understand your claim. I agree that it's objectively hard to interpret computation, but it's not at all hard to interpret the fact that the integers are less complex and doing less complex computation than, say, an exponential-time Turing machine - and given the specific arguments being made, neither is a wall or a bag of popcorn. Which, as I just noted in my response to the linked comment, was how I understood the position being taken by Searle, Putnam, and Johnson. (And even this ignores that one implication of the difference in complexity is that the wall / bag of popcorn / whatever is not mappable to arbitrary computations, since the number of steps required for a computation may not be finite!)

Comment by Davidmanheim on Do simulacra dream of digital sheep? · 2024-12-09T08:24:56.033Z · LW · GW

I've written my point more clearly here: https://www.lesswrong.com/posts/zxLbepy29tPg8qMnw/refuting-searle-s-wall-putnam-s-rock-and-johnson-s-popcorn

Comment by Davidmanheim on Detection of Asymptomatically Spreading Pathogens · 2024-12-06T11:06:33.182Z · LW · GW

I think 'we estimate... to be'

Comment by Davidmanheim on Do simulacra dream of digital sheep? · 2024-12-05T04:33:34.784Z · LW · GW

Your/Aaronson's claim is that only the fully connected, sensibly interacting calculation matters.

Not at all. I'm not making any claim about what matters or counts here, just pointing out a confusion in the claims that were made here and by many philosophers who discussed the topic.

Comment by Davidmanheim on Do simulacra dream of digital sheep? · 2024-12-04T17:11:19.965Z · LW · GW

You disagree with Aaronson that the location of the complexity is in the interpreter, or you disagree that it matters?

In the first case, I'll defer to him as the expert. But in the second, the complexity is an internal property of the system! (And it's a property in a sense stronger than almost anything we talk about in philosophy; it's not just a property of the world around us, because as Gödel and others showed, complexity is a necessary fact about the nature of mathematics!)

Comment by Davidmanheim on Do simulacra dream of digital sheep? · 2024-12-04T17:07:42.109Z · LW · GW

Yeah, something like that. See my response to Euan in the other reply to my post.

Comment by Davidmanheim on Do simulacra dream of digital sheep? · 2024-12-04T17:06:50.193Z · LW · GW

Yes, and no, it does not boil down to Chalmers' argument (as Aaronson makes clear in the paragraph before the one you quote, where he cites the Chalmers argument!). The argument from complexity is about the nature and complexity of systems capable of playing chess - which is why I think you need to carefully read the entire piece and think about what it says.

But as a small rejoinder, if we're talking about playing a single game, the entire argument is ridiculous; I can write the entire "algorithm" in a kilobyte of specific instructions. So it's not that an algorithm must be capable of playing multiple counterfactual games to qualify, or that counterfactuals are required for moral weight - it's that the argument hinges on a misunderstanding of how complex different classes of system need to be to do the things they do.

PS. Apologies that the original response comes off as combative - I really think this discussion is important, and wanted to engage to correct an important point, but have very little time to do so at the moment!

Comment by Davidmanheim on Do simulacra dream of digital sheep? · 2024-12-04T07:22:28.904Z · LW · GW

As with OP, I strongly recommend Aaronson, who explains why waterfalls aren't doing computation in ways that refute the rock example you discuss: https://www.scottaaronson.com/papers/philos.pdf

Comment by Davidmanheim on Do simulacra dream of digital sheep? · 2024-12-04T07:19:44.741Z · LW · GW

You seem to fundamentally misunderstand computation, in ways similar to Searle. I can't engage deeply, but recommend Scott Aaronson's primer on computational complexity: https://www.scottaaronson.com/papers/philos.pdf

Comment by Davidmanheim on Is the mind a program? · 2024-12-04T07:17:52.493Z · LW · GW

You seem deeply confused about computation, in ways similar to Searle et al. I cannot engage deeply on this at present, but recommend Aaronson's primer on the topic: https://www.scottaaronson.com/papers/philos.pdf

Comment by Davidmanheim on Hierarchical Agency: A Missing Piece in AI Alignment · 2024-12-02T12:16:48.553Z · LW · GW

Norms can accomplish this as well - I wrote about this a couple weeks ago.

Comment by Davidmanheim on Hierarchical Agency: A Missing Piece in AI Alignment · 2024-12-02T12:01:41.751Z · LW · GW

Are you familiar with Davidad's program working on compositional world modeling? (The linked notes are from before the program was launched; there is ongoing work on the topic.)

The reason I ask is because embedded agents and agents in multi-agent settings should need compositional world models that include models of themselves and other agents, which implies that hierarchical agency is included in what they would need to solve. 

It also relates closely to work Vanessa is doing (as an "ARIA Creator") in learning-theoretic AI, related to what she has called "Frugal Compositional Languages" - and see this work by @alcatal - though I understand neither is yet addressing multi-agent world models, nor explicitly modeling the agents themselves in a compositional / embedded-agent way, though those are presumably desiderata.

Comment by Davidmanheim on Mitigating Geomagnetic Storm and EMP Risks to the Electrical Grid (Shallow Dive) · 2024-11-28T21:43:02.681Z · LW · GW

That is an interesting question, but I unfortunately do not know enough to even figure out how to answer it.

Comment by Davidmanheim on Mitigating Geomagnetic Storm and EMP Risks to the Electrical Grid (Shallow Dive) · 2024-11-27T06:51:59.400Z · LW · GW

Good points. Yes, storage definitely helps, and microgrids are generally able to have some storage, if only to smooth out variation in power generation for local use. But solar storms can last days, even if a large long-lasting event is very, very unlikely. And it's definitely true that if large facilities have storage, shutdowns will have reduced impact - but I understand that the transformers are used for power transmission, so having local storage at the large generators won't change the need to shut down the transformers used for sending that power to consumers.