Posts

shminux's Shortform 2024-08-11T02:59:10.993Z
On Privilege 2024-05-18T22:36:25.430Z
Why Q*, if real, might be a game changer 2023-11-26T06:12:31.964Z
Why I am not an AI extinction cautionista 2023-06-18T21:28:38.657Z
Upcoming AI regulations are likely to make for an unsafer world 2023-06-03T01:07:35.921Z
How can one rationally have very high or very low probabilities of extinction in a pre-paradigmatic field? 2023-04-30T21:53:15.843Z
Do LLMs dream of emergent sheep? 2023-04-24T03:26:54.144Z
Top lesson from GPT: we will probably destroy humanity "for the lulz" as soon as we are able. 2023-04-16T20:27:19.665Z
Respect Chesterton-Schelling Fences 2023-02-27T00:09:30.815Z
Inequality Penalty: Morality in Many Worlds 2023-02-11T04:08:19.090Z
The Pervasive Illusion of Seeing the Complete World 2023-02-09T06:47:36.628Z
If you factor out next token prediction, what are the remaining salient features of human cognition? 2022-12-24T00:38:04.801Z
"Search" is dead. What is the new paradigm? 2022-12-23T10:33:35.596Z
Google Search loses to ChatGPT fair and square 2022-12-21T08:11:43.287Z
Prodding ChatGPT to solve a basic algebra problem 2022-12-12T04:09:42.105Z
If humanity one day discovers that it is a form of disease that threatens to destroy the universe, should it allow itself to be shut down? 2022-11-25T08:27:14.740Z
Scott Aaronson on "Reform AI Alignment" 2022-11-20T22:20:23.895Z
Why don't organizations have a CREAMO? 2022-11-12T02:19:57.258Z
Desiderata for an Adversarial Prior 2022-11-09T23:45:16.331Z
Google Search as a Washed Up Service Dog: "I HALP!" 2022-11-07T07:02:40.469Z
Is there any discussion on avoiding being Dutch-booked or otherwise taken advantage of one's bounded rationality by refusing to engage? 2022-11-07T02:36:36.826Z
What Does AI Alignment Success Look Like? 2022-10-20T00:32:48.100Z
UI/UX From the Dark Ages 2022-09-25T01:53:48.099Z
Why are we sure that AI will "want" something? 2022-09-16T20:35:40.674Z
A possible AI-inoculation due to early "robot uprising" 2022-06-16T21:21:56.982Z
How much stupider than humans can AI be and still kill us all through sheer numbers and resource access? 2022-06-12T01:01:36.735Z
Eternal youth as eternal suffering 2022-06-04T01:48:49.684Z
Algorithmic formalization of FDT? 2022-05-08T01:36:10.778Z
Write posts business-like, not story-like 2022-05-05T20:13:08.495Z
How Might an Alignment Attractor Look like? 2022-04-28T06:46:11.139Z
Worse than an unaligned AGI 2022-04-10T03:35:20.373Z
Recognizing and Dealing with Negative Automatic Thoughts 2022-03-03T20:41:55.839Z
Epsilon is not a probability, it's a cop-out 2022-02-15T02:48:53.892Z
Aligned AI Needs Slack 2022-01-26T09:29:53.897Z
You can't understand human agency without understanding amoeba agency 2022-01-06T04:42:51.887Z
You are way more fallible than you think 2021-11-25T05:52:50.036Z
Nitric Oxide Spray... a cure for COVID19?? 2021-03-15T19:36:17.054Z
Uninformed Elevation of Trust 2020-12-28T08:18:07.357Z
Learning is (Asymptotically) Computationally Inefficient, Choose Your Exponents Wisely 2020-10-22T05:30:18.648Z
Mask wearing: do the opposite of what the CDC/WHO has been saying? 2020-04-02T22:10:31.126Z
Good News: the Containment Measures are Working 2020-03-17T05:49:12.516Z
(Double-)Inverse Embedded Agency Problem 2020-01-08T04:30:24.842Z
Since figuring out human values is hard, what about, say, monkey values? 2020-01-01T21:56:28.787Z
A basic probability question 2019-08-23T07:13:10.995Z
Inspection Paradox as a Driver of Group Separation 2019-08-17T21:47:35.812Z
Religion as Goodhart 2019-07-08T00:38:36.852Z
Does the Higgs-boson exist? 2019-05-23T01:53:21.580Z
A Numerical Model of View Clusters: Results 2019-04-14T04:21:00.947Z
Quantitative Philosophy: Why Simulate Ideas Numerically? 2019-04-14T03:53:11.926Z
Boeing 737 MAX MCAS as an agent corrigibility failure 2019-03-16T01:46:44.455Z

Comments

Comment by Shmi (shminux) on Daniel Kokotajlo's Shortform · 2024-10-03T06:06:59.081Z · LW · GW

That is definitely my observation, as well: "general world understanding but not agency", and yes, limited usefulness, but also... much more useful than gwern or Eliezer expected, no? I could not find a link. 

I guess whether it counts as AGI depends on what one means by "general intelligence". To me it was having a fairly general world model and being able to reason about it. What is your definition? Does "general world understanding" count? Or do you include the agency part in the definition of AGI? Or maybe something else?

Hmm, maybe this is a General Tool, as opposed to a General Intelligence?

Comment by Shmi (shminux) on Daniel Kokotajlo's Shortform · 2024-10-02T06:56:49.937Z · LW · GW

Given that we very unexpectedly got basically AGI (without the creativity of the best humans) in the form of Karnofsky's Tool AI, as you admit, can you look back and see which assumptions behind expecting the tools to agentize on their own, and pretty quickly, were wrong? Or is everything in that post of Eliezer's still correct, or at least reasonable, and we are simply not yet at the level where "foom" happens?

Come to think of it, I wonder if that post has been revisited at some point, by Eliezer or others, in light of the current SOTA. Feels like it could be instructive.

Comment by Shmi (shminux) on shminux's Shortform · 2024-09-29T06:07:12.996Z · LW · GW

I'm not even going to ask how a pouch ends up with voice recognition and natural language understanding when the best Artificial Intelligence programmers can't get the fastest supercomputers to do it after thirty-five years of hard work

Some HPMoR statements did not age as gracefully as others.

Comment by Shmi (shminux) on shminux's Shortform · 2024-09-29T06:05:54.968Z · LW · GW

That is indeed a bit of a defense. Though I suspect human minds have enough similarities that there are at least a few universal hacks.

Comment by Shmi (shminux) on shminux's Shortform · 2024-09-29T06:04:48.874Z · LW · GW

Any of those. Could be some kind of intentionality ascribed to AI, could be accidental, could be something else.

Comment by Shmi (shminux) on shminux's Shortform · 2024-09-28T23:55:08.183Z · LW · GW

So when I think through the pre-mortem of "AI caused human extinction; how did it happen?", one of the more likely scenarios that comes to mind is not nano-this and bio-that, or even "one day we all just fall dead instantly and without warning". Or a scissor statement that causes all-out wars. Or anything else noticeable. 

The human mind is infinitely hackable through visual, textual, auditory and other sensory inputs. Most of us do not appreciate how easily, because being hacked does not feel like it. Instead it feels like your own volition: you changed your mind based on logic and valid feelings. Reading a good book, listening to a good sermon or speech, watching a show or a movie, talking to your friends and family is how mind-hacking usually happens. Abrahamic religions are a classic example; the Sequences and HPMoR are a local one. It does not work on everyone, but when it does, the subject feels enlightened rather than hacked. If you tell them their mind has been hacked, they will argue with you to the end, because clearly they just used logic to understand and embrace the new ideas.

So my most likely extinction scenario is more like "humans realized that living is not worth it, and just kind of stopped" than anything violent. It could be spread out over years and decades: for example, voluntarily deciding not to have children anymore. None of it would look like it was precipitated by an AI takeover. It does not even have to be a conspiracy by an unaligned SAI. It could just be that the space of new ideas, thanks to LLMs getting better and better, expands far enough and in new enough directions to include a few lethal memetic viruses like that.

Comment by Shmi (shminux) on Wei Dai's Shortform · 2024-08-26T21:32:18.697Z · LW · GW

What are the issues that are "difficult" in philosophy, in your opinion? What makes them difficult?

I remember you and others talking about the need to "solve philosophy", but I was never sure what was meant by that.

Comment by Shmi (shminux) on shminux's Shortform · 2024-08-21T21:31:10.981Z · LW · GW

My expectation, which I may have talked about here before, is that LLMs will eat the entire software stack between the human and the hardware. Moreover, they are already nearly good enough to do that; the issue is that people have not yet adapted to the AI being able to do it. I expect there to be no OS, no standard UI/UX interfaces, no formal programming languages. All interfaces will be more ad hoc, created by the underlying AI to match the needs of the moment. It could be Star Trek-style ("Computer, plot a course to..."), or a set of buttons popping up on your touchscreen, or maybe physical buttons and keys being labeled as needed in real time, or something else. But not the ubiquitous rigid interfaces of the last millennium. For clues about what is already possible but not yet implemented, one should look to sci-fi movies and shows, which are unconstrained by current limits. Almost everything useful there is already doable or will be in a short while. I hope someone is working on this.

Comment by Shmi (shminux) on shminux's Shortform · 2024-08-21T04:55:48.138Z · LW · GW

Just a quote found online:

SpaceX can build fully reusable rockets faster than the FAA can shuffle fully disposable paper

Comment by Shmi (shminux) on shminux's Shortform · 2024-08-19T03:36:47.789Z · LW · GW

It seems like we are not even close to converging on any kind of shared view. I don't find the concept of "brute facts" even remotely useful, so I cannot comment on it.

But this faces the same problem as the idea that the visible universe arose as a Boltzmann fluctuation, or that you yourself are a Boltzmann brain: the amount of order is far greater than such a hypothesis implies.

I think Sean Carroll answered this one a few times: the concept of a Boltzmann brain is not cognitively stable (you can't trust your own thoughts, including that you are a Boltzmann brain). And if you try to make it stable, you have to reconstruct the whole physical universe. You might be saying the same thing? I am not claiming anything different here.

The simplest explanation is that some kind of Platonism is real, or more precisely (in philosophical jargon) that "universals" of some kind do exist.

As I said in the other reply, I think those two words are not useful as binaries: real/not real, exist/not exist. If you feel that this is non-negotiable for making sense of the philosophy of physics or something, I don't know what to say.

I was struck by something I read in Bertrand Russell, that some of the peculiarities of Leibniz's worldview arose because he did not believe in relations, he thought substance and property are the only forms of being. As a result, he didn't think interaction between substances is possible (since that would be a relation), and instead came up with his odd theory about a universe of monadic substances which are all preprogrammed by God to behave as if they are interacting. 

Yeah, I think denying relations is going way too far. A relation is definitely a useful idea. It can stay in epistemology rather than in ontology.

I am not 100% against these radical attempts to do without something basic in ontology, because who knows what creative ideas may arise as a result? But personally I prefer to posit as rich an ontology as possible, so that I will not unnecessarily rule out an explanation that may be right in front of me. 

Fair, it is foolish to cut off potential avenues of exploration. Maybe, again, we differ on where they live: in the world as basic entities, or in the mind as part of our model for making sense of the world.

Comment by Shmi (shminux) on shminux's Shortform · 2024-08-19T03:25:58.858Z · LW · GW

Thanks, I think you are doing a much better job voicing my objections than I would. 

If push comes to shove, I would even dispute that "real" is a useful category once we start examining deep ontological claims. "Exist" is another emergent concept that is not even close to being binary, but more of a multidimensional spectrum (numbers, fairies and historical figures lie on some of the axes). I can provisionally accept that there is something like a universe that "exists", but, as I said many years ago in another thread, I am much more comfortable with the ontology where it is models all the way down (and up and sideways and every which way). This is not really a critical point though. The critical point is that we have no direct access to the underlying reality, so we, as tiny embedded agents, are stuck dealing with the models regardless.

Comment by Shmi (shminux) on shminux's Shortform · 2024-08-19T03:14:36.077Z · LW · GW

By "Platonic laws of physics" I mean the Hawking's famous question

What is it that breathes fire into the equations and makes a universe for them to describe…Why does the universe go to all the bother of existing?

Re

Current physics, if anything else, is sort of antiplatonic: it claims that there are several dozens of independent entities, actually existing, called "fields", which produce the entire range of observable phenomena via interacting with each other, and there is no "world" outside this set of entities.

I am not sure it actually "claims" that. A HEP theorist would say that QFT (the Standard Model of particle physics) plus classical GR is our current best model of the universe, with a bunch of experimental evidence that this is not all there is. I don't think there is a consensus for an ontological claim of "actually existing" rather than "emergent". There is definitely a consensus that there is more to the world than the fundamental laws of physics we currently know, and that some new paradigms are needed to know more.

"Laws of nature" are just "how this entities are". Outside very radical skepticism I don't know any reasons to doubt this worldview.

No, I don't think that is an accurate description at all. Maybe I am missing something here.

Comment by Shmi (shminux) on shminux's Shortform · 2024-08-15T21:32:50.252Z · LW · GW

Yeah, that was my question: would there be something that remains? It sounds like Chalmers and others would say that there would be.

Comment by Shmi (shminux) on shminux's Shortform · 2024-08-15T21:31:11.910Z · LW · GW

Thank you for your thoughtful and insightful reply! I think there is a lot more discussion that could be had on this topic, and we are not very far apart, but this is supposed to be a "shortform" thread. 

I never actually liked The Simple Truth post. I sided with Mark, the instrumentalist, whom Eliezer turned into what I termed back then an "instrawmantalist". Though I am happy with this part:

Necessary?” says Inspector Darwin, sounding puzzled. “It just happened. . . I don’t quite understand your question.”

More recently, the show Devs, which, for all its flaws, has a number of underrated philosophical highlights, had an episode with a somewhat similar storyline.

Anyway, appreciate your perspective.

Comment by Shmi (shminux) on shminux's Shortform · 2024-08-15T01:38:17.248Z · LW · GW

Thank you, I forgot about that one. I guess the summary would be "if your calibration for this class of possibilities sucks, don't make up numbers, lest you start trusting them". If so, that makes sense.

Comment by Shmi (shminux) on shminux's Shortform · 2024-08-15T00:40:51.491Z · LW · GW

Isn't your thesis that "laws of physics" only exist in the mind? 

Yes!

But in that case, they can't be a causal or explanatory factor in anything outside the mind

"a causal or explanatory factor" is also inside the mind

which means that there are no actual explanations for the patterns in nature

What do you mean by an "actual explanation"? Explanations only exist in the mind, as well.

There's no reason why planets go round the stars

The reason (which is also in the minds of agents) is Newton's law, which is an abstraction derived from the model of the universe that exists in the minds of embedded agents.

there's no reason why orbital speeds correlate with masses in a particular way, these are all just big coincidences

"None of this is a coincidence because nothing is ever a coincidence" https://tvtropes.org/pmwiki/pmwiki.php/Literature/Unsong

"Coincidence" is a wrong way of looking at this. The world is what it is. We live in it and are trying to make sense of it, moderately successfully. Because we exist, it follows that the world is somewhat predictable from the inside, otherwise life would not have been a thing. That is, tiny parts of the world can have lossily compressed but still useful models of some parts/aspects of the world. Newton's laws are part of those models.


A more coherent question would be "why is the world partially lossily compressible from the inside", and I don't know a non-anthropic answer, or even if this is an answerable question. A lot of "why" questions in science bottom out at "because the world is like that".

... Not sure if this makes my view any clearer; we are obviously working with very different ontologies.

Comment by Shmi (shminux) on shminux's Shortform · 2024-08-14T23:41:23.358Z · LW · GW

That is a good point: deciding is different from communicating the rationale for your decisions. Maybe that is what Eliezer is saying. 

Comment by Shmi (shminux) on shminux's Shortform · 2024-08-14T23:39:39.625Z · LW · GW

I think you are missing the point, and taking cheap shots.

Comment by Shmi (shminux) on shminux's Shortform · 2024-08-14T03:01:45.848Z · LW · GW

So, is he saying that he is calibrated well enough to have a meaningful "action-conditional" p(doom), but most people are not? And that they should not engage in "fake Bayesianism"? But then, according to the prevailing wisdom, how would one decide how to act if they cannot put a number on each potential action?
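
For concreteness, here is a toy sketch (my own illustration, with entirely made-up actions, probabilities and utilities) of what "putting a number on each potential action" would look like under the standard expected-value recipe the comment is asking about:

```python
# A toy expected-utility comparison with entirely made-up numbers,
# illustrating what "putting a number on each potential action" means.
actions = {
    # action: (p(doom | action), utility if doom, utility if no doom)
    "pause AI development": (0.10, -1000.0, 80.0),
    "race ahead":           (0.40, -1000.0, 100.0),
    "do nothing":           (0.25, -1000.0, 90.0),
}

def expected_utility(p_doom, u_doom, u_ok):
    return p_doom * u_doom + (1 - p_doom) * u_ok

ranked = sorted(actions.items(), key=lambda kv: expected_utility(*kv[1]), reverse=True)
for name, params in ranked:
    print(f"{name:>22}: EU = {expected_utility(*params):8.1f}")
# Without numbers like these, however rough, the recipe gives no way to rank
# the actions -- which is what the question above is pointing at.
```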

Comment by Shmi (shminux) on shminux's Shortform · 2024-08-14T01:48:38.145Z · LW · GW

I notice my confusion when Eliezer speaks out against the idea of expressing p(doom) as a number: https://x.com/ESYudkowsky/status/1823529034174882234

I mean, I don't like it either, but I thought the whole point of the Bayesian approach was to express odds and calculate expected values.

Comment by Shmi (shminux) on shminux's Shortform · 2024-08-13T22:29:16.684Z · LW · GW

Hmm, I am probably missing something. I thought that if a human honestly reports a feeling, we kind of trust them that they felt it? So if an AI reports a feeling, and then there is a conduit through which the distillate of that feeling is transmitted to a human, who reports the same feeling, it would go some way toward accepting that the AI had qualia? I think you are saying that this does not address Chalmers' point.

Comment by Shmi (shminux) on shminux's Shortform · 2024-08-13T21:46:59.890Z · LW · GW

I am not sure why you are including the mind here; maybe we are talking at cross purposes. I am not making statements about the world, only about the emergence of the laws of physics as written in textbooks, which exist as abstractions across human minds. If you are Laplace's demon, you can see the whole world, and if you wanted to zoom in to the level of "planets going around the sun", you could, but there is no reason for you to. This whole idea of "facts" is a human thing. We, as embedded agents, are emergent patterns that use this concept. I can see how it is natural to think of facts, planets or numbers as ontologically primitive or something, rather than emergent, but this is not the view I hold.

Comment by Shmi (shminux) on shminux's Shortform · 2024-08-13T21:39:51.733Z · LW · GW

Well, what happens if we do this and we find out that these representations are totally different? Or, moreover, that the AI's representation of "red" does not seem to align (either in meaning or in structure) with any human-extracted concept or perception?

I would say that it would be a fantastic step forward in our understanding, resolving empirically a question we did not know the answer to.

How do we then try to figure out the essence of artificial consciousness, given that comparisons with what we (at that point would) understand best, i.e., human qualia, would no longer output something we can interpret?

That would be a great stepping stone for further research.

I think it is extremely likely that minds with fundamentally different structures perceive the world in fundamentally different ways, so I think the situation in the paragraph above is not only possible, but in fact overwhelmingly likely, conditional on us managing to develop the type of qualia-identifying tech you are talking about.

I'd love to see this prediction tested, wouldn't you?

Comment by Shmi (shminux) on shminux's Shortform · 2024-08-13T21:37:10.751Z · LW · GW

The testing seems easy: one person feels the quale, the other reports the feeling, and they compare. What am I missing?

Comment by Shmi (shminux) on shminux's Shortform · 2024-08-13T08:17:03.841Z · LW · GW

Thanks for the link! I thought it was a different, related but harder problem than what is described in https://iep.utm.edu/hard-problem-of-conciousness. I assume we could also try to extract what an AI "feels" when it speaks of the redness of red, and compare it with a similar redness extract from the human mind. Maybe even try to cross-inject them. Or would there still be more to answer?

Comment by Shmi (shminux) on shminux's Shortform · 2024-08-13T07:56:03.228Z · LW · GW

How to make a dent in the "hard problem of consciousness" experimentally: suppose we understand the brain well enough to figure out what makes one experience specific qualia, and can then stimulate the neurons in a way that makes the person experience them. Maybe even link two people with a "qualia transducer" such that when one person experiences "what it's like", the other person can feel it, too. 

If this works, what would remain from the "hard problem"?

Chalmers:

To see this, note that even when we have explained the performance of all the cognitive and behavioral functions in the vicinity of experience—perceptual discrimination, categorization, internal access, verbal report—there may still remain a further unanswered question:  Why is the performance of these functions accompanied by experience?

If you can distill, store and reproduce this experience on demand, what remains? Or, at least, what would/does Chalmers say about it?

Comment by Shmi (shminux) on shminux's Shortform · 2024-08-13T07:39:52.249Z · LW · GW

There is an emergent reason, one that lives in the minds of the agents. The universe just is. In other words, if you are a hypothetical Laplace's demon, you don't need the notion of a reason; you see it all at once: past, present and future.

Comment by Shmi (shminux) on shminux's Shortform · 2024-08-12T08:33:28.046Z · LW · GW

I think I articulated this view here before, but it is worth repeating. It seems rather obvious to me that there are no "Platonic" laws of physics, and there is no Platonic math existing in some ideal realm. The world just is, and everything else is emergent. There are reasonably durable patterns in it, which can sometimes be usefully described as embedded agents. If we squint hard, and know what to look for, we might be able to find a "mini-universe" inside such an agent, which is a poor-fidelity model of the whole universe, or, more likely, of a tiny part of it. These patterns we call agents appear to be fairly common and multi-level, and if we try to generalize the models they use across them, we find that something like "laws of physics" is a concise description. In that sense the laws of physics exist in the universe, but only as an abstraction over embedded agents of a certain level of complexity.

It is not clear whether any randomly generated world would necessarily get emergent patterns like that, but the one we live in does, at least to a degree. It is entirely possible that there is a limit to how accurate a model a tiny embedded agent can contain. For example, if most of the universe is truly random, we would never be able to understand those parts, and they would look like miracles to us, just something that pops up without any observable cause. Another possibility is that we might find some patterns that are regular but defy analysis. These would look to us like "magic": something we know how to call into being, but that defies any rational explanation. 

We certainly hope that the universe we live in contains neither miracles nor magic, but that is, in the end, an open empirical question; it does not require any kind of divine power or dualism, it might just be a feature of our world.

Comment by Shmi (shminux) on shminux's Shortform · 2024-08-11T19:38:47.502Z · LW · GW

Hence the one tweak I mentioned.

Comment by Shmi (shminux) on shminux's Shortform · 2024-08-11T07:52:12.198Z · LW · GW

Ancient Greek Hell is doing fruitless labor over and over, never completing it.

Christian Hell is boiling oil, fire and brimstone.

The Good Place Hell is knowing you are not deserving and being scared of being found out.

Lucifer Hell is being stuck reliving the day you did something truly terrible over and over.


Actual Hell does not exist. But Heaven does and everyone goes there. The only difference is that the sinners feel terrible about what they did while alive, and feel extreme guilt for eternity, with no recourse. That's the only brain tweak God does. 

No one else tortures you, you can sing hymns all infinity long, but something is eating you inside and you can't do anything about it. Sinners would be like everyone else most of the time, just subdued, and once in a while they would start screaming and try to self-harm or suicide, to no avail. "Sorry, no pain for you except for the one that is eating you from inside. And no reprieve, either."

Comment by Shmi (shminux) on Leaving MIRI, Seeking Funding · 2024-08-11T07:38:46.726Z · LW · GW

As Patrick McKenzie has been saying for almost 20 years, "you can probably stand to charge more".

Comment by Shmi (shminux) on shminux's Shortform · 2024-08-11T07:32:56.775Z · LW · GW

Yeah, I think this is exactly what I meant. There will still be boutique uses for hand-crafted computer programs, just like there is now for pen pals writing prettily decorated letters to each other. Granted, fax is still a thing in old-fashioned bureaucracies like Germany's, so maybe there will be a requirement for "no LLM" code as well, but that seems much harder to enforce.

I think your point about infinite and cheap UI/UX customization is well taken. The LLM will fit seamlessly one level below that. There will be no "LLM interface", just interface.

Comment by Shmi (shminux) on J's Shortform · 2024-08-11T03:40:58.228Z · LW · GW

Consider moral constructivism.

Comment by Shmi (shminux) on shminux's Shortform · 2024-08-11T02:59:11.410Z · LW · GW

I believe that, while the LLM architecture may not lead to AGI (see https://bigthink.com/the-future/arc-prize-agi/ for the reasons why -- basically, current models are rules interpolators, not rules extrapolators, though they are definitely data extrapolators), LLMs will succeed in killing all computer languages. That is, there will be no intermediate Rust, Python, WASM or machine code. The AI will be the interpreter and executor of what we now call "prompts". It will also radically change the UI/UX paradigm. No menus, no buttons, no windows -- those are all artifacts of the 1980s. The controls will be whatever you need them to be: voice, text, keypresses... Think of your grandma figuring out how to do something on her PC or phone and asking you, only the "you" will be the AI. There will be rigid specialized interfaces for, say, gaming, but those will be a small minority. 

Comment by Shmi (shminux) on On Privilege · 2024-05-20T18:57:52.501Z · LW · GW

That makes sense! Maybe you feel like writing a post on the topic? Potentially including a numerical or analytical model.

Comment by Shmi (shminux) on On Privilege · 2024-05-20T09:07:16.390Z · LW · GW

Excellent point about the compounding, which is often multiplicative, not additive. Incidentally, multiplicative advantages result in a power law distribution of income/net worth, whereas additive advantages/disadvantages result in a normal distribution. But that is a separate topic, well explored in the literature.
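
A minimal numerical sketch of that claim (my own illustration, with made-up parameters): additive shocks yield a thin-tailed, roughly normal spread of outcomes, while multiplicative shocks yield a heavy-tailed, lognormal-like one; a strict power law needs further ingredients on top of pure multiplicative growth, such as resets or a reflecting lower barrier.

```python
# A minimal numerical sketch: additive vs. multiplicative compounding of
# advantages. The specific parameters are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
n_agents, n_steps = 100_000, 40

# Additive advantages: each period adds an independent shock to the outcome.
additive = 100 + rng.normal(1.0, 1.0, size=(n_agents, n_steps)).sum(axis=1)

# Multiplicative advantages: each period scales the outcome by a random factor.
factors = 1 + rng.normal(0.03, 0.15, size=(n_agents, n_steps))
multiplicative = 100 * factors.prod(axis=1)

for name, x in [("additive", additive), ("multiplicative", multiplicative)]:
    p50, p99 = np.percentile(x, [50, 99])
    print(f"{name:>14}: median={p50:10.1f}  99th pct={p99:10.1f}  ratio={p99 / p50:5.2f}")
# The additive 99th/50th ratio stays near 1 (thin, roughly normal tail); the
# multiplicative one is several times larger (heavy, lognormal-like tail).
```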

Comment by Shmi (shminux) on On Privilege · 2024-05-20T09:02:26.734Z · LW · GW

I mostly meant your second point, just generally being kinder to others, but the other two are also well taken.

Comment by Shmi (shminux) on Examples of Highly Counterfactual Discoveries? · 2024-04-26T16:48:19.441Z · LW · GW

First, your non-standard use of the term "counterfactual" is jarring, though, as I understand it, it is somewhat normalized in your circles. "Counterfactual", unlike "factual", means something that could have happened, given your limited knowledge of the world, but did not. What you probably mean is "completely unexpected", "surprising" or something similar. I suspect you have gotten this feedback before.

Sticking with physics: Galilean relativity was completely against the Aristotelian grain. More recently, the singularity theorems of Penrose and Hawking unexpectedly showed that black holes are not just a mathematical artifact, but a generic feature of the world. A whole slew of discoveries in quantum mechanics, experimental and theoretical, went almost entirely against the grain. Probably the simplest, and yet the hardest to conceptualize, was Bell's theorem.

Not my field, but in economics, Adam Smith's discovery of what Scott Alexander later named Moloch was a complete surprise, as I understand it. 

Comment by Shmi (shminux) on Eliezer Yudkowsky Is Frequently, Confidently, Egregiously Wrong · 2024-04-26T07:23:23.751Z · LW · GW

Let's say I start my analysis with the model that the predictor is guessing, and my model attaches some prior probability for them guessing right in a single case. I might also have a prior about the likelihood of being lied about the predictor's success rate, etc. Now I make the observation that I am being told the predictor was right every single time in a row. Based on this incoming data, I can easily update my beliefs about what happened in the previous prediction excercises: I will conclude that (with some credence) the predictor was guessed right in each individual case or that (also with some credence) I am being lied to about their prediction success. This is all very simple Bayesian updating, no problem at all.

Right! If I understand your point correctly, given a strong enough prior for the predictor being lucky or deceptive, it would take a lot of evidence to change one's mind, and the evidence would have to be varied. This condition is certainly not satisfied by the original setup. If your extremely confident prior is that foretelling one's actions is physically impossible, then the lie/luck hypotheses will remain far more likely than changing your mind about physical impossibility. That makes perfect sense to me. 
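
To make that concrete, here is a toy version of the updating described above (my own illustration; the priors and the 0.99 accuracy figure are made up): three hypotheses about the predictor, updated on a reported streak of n correct predictions.

```python
# A toy Bayesian update over three hypotheses about the predictor, given a
# reported streak of n correct predictions. Priors and the accuracy figure
# are made up for illustration.
import numpy as np

def posterior(n_correct, prior_accurate=1e-6, prior_lucky=0.5, acc=0.99):
    prior_lie = 1.0 - prior_accurate - prior_lucky
    likelihood = np.array([
        acc ** n_correct,   # genuinely accurate predictor
        0.5 ** n_correct,   # lucky coin-flip guesser
        1.0,                # the reported streak is simply a lie
    ])
    prior = np.array([prior_accurate, prior_lucky, prior_lie])
    post = likelihood * prior
    return post / post.sum()

for n in [1, 10, 30, 100]:
    p_acc, p_lucky, p_lie = posterior(n)
    print(f"n={n:3d}  P(accurate)={p_acc:.6f}  P(lucky)={p_lucky:.6f}  P(lie)={p_lie:.6f}")
# With a tiny prior on genuine prediction being physically possible, a long
# streak mostly moves probability mass from "lucky" to "lie", not to "accurate":
# the strong prior wins unless the evidence is of a different kind entirely.
```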

I guess one would want to simplify the original setup a bit. What if you had full confidence that the predictor is not a trickster? Would you one-box or two-box? To get the physical impossibility out of the way, they do not necessarily have to predict every atom in your body and mind; they could just observe you (and read your LW posts, maybe) and, Sherlock-like, draw a very accurate conclusion about what you would decide.

Another question: what kind of experiment, in addition to what is in the setup, would change your mind? 

Comment by Shmi (shminux) on Eliezer Yudkowsky Is Frequently, Confidently, Egregiously Wrong · 2024-04-21T07:58:24.794Z · LW · GW

Sorry, could not reply due to rate limit.

In reply to your first point: I agree that in a deterministic world with perfect predictors the whole question is moot. I think we agree there.

Also, yes, assuming "you have a choice between two actions", what you will do has not been decided by you yet. Which is different from "Hence the information what I will do cannot have been available to the predictor." If the latter statement is correct, then how can could have "often correctly predicted the choices of other people, many of whom are similar to you, in the particular situation"? Presumably some information about your decision-making process is available to the predictor in this particular situation, or else the problem setup would not be possible, would it? If you think that you are a very special case, and other people like you are not really like you, then yes, it makes sense to decide that you can get lucky and outsmart the predictor, precisely because you are special. If you think that you are not special, and other people in your situation thought the same way, two-boxed and lost, then maybe your logic is not airtight and your conclusion to two-box is flawed in some way that you cannot quite put your finger on, but the experimental evidence tells you that it is. I cannot see a third case here, though maybe I am missing something. Either you are like others, and so one-boxing gives you more money than two boxing, or you are special and not subject to the setup at all, in which case two-boxing is a reasonable approach.

I should decide to try two-boxing. Why? Because that decision is the dominant strategy: if it turns out that indeed I can decide my action now, then we're in a world where the predictor was not perfect but merely lucky and in that world two-boxing is dominant

Right, that is, I guess, the third alternative: you are like the other people who lost when two-boxing, but they were merely unlucky; the predictor did not have any predictive powers after all. Which is a possibility: maybe they were fooled by a clever con or dumb luck. Maybe you were also fooled by a clever con or dumb luck when the predictor "has never, so far as you know, made an incorrect prediction about your choices". Maybe this all led to this moment, where you finally get to make a decision, and the right decision is to two-box rather than one-box and leave money on the table.

I guess in a world where your choice is not predetermined and you are certain that the predictor is fooling you or is just lucky, you can rely on using the dominant strategy, which is to two-box. 

So the question is: what kind of world do you think you live in, given Nozick's setup? The setup does not say explicitly, so it is up to you to evaluate the probabilities (which also applies in a deterministic world, only there your calculation would also be predetermined).

What would a winning agent do? Look at other people like itself who won, and take one box? Or look at other people ostensibly like itself who nevertheless lost, and still two-box?

I know what kind of an agent I would want to be. I do not know what kind of an agent you are, but my bet is that if you are the two-boxing kind, then you will lose when push comes to shove, like all the other two-boxers before you, as far as we both know.
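
For reference, a toy expected-value comparison (my own sketch) using the standard Newcomb payoffs of $1,000 and $1,000,000, and reading the predictor's accuracy p as applying to your own choice, which is exactly the reading a committed two-boxer would dispute:

```python
# A toy expected-value comparison with the standard Newcomb payoffs
# ($1,000 in the transparent box, $1,000,000 in the opaque box), reading
# the predictor's accuracy p as applying to your own choice.
def ev_one_box(p):
    return p * 1_000_000

def ev_two_box(p):
    # Two-boxing always gets the $1,000, plus the $1,000,000 when the predictor erred.
    return 1_000 + (1 - p) * 1_000_000

for p in [0.5, 0.9, 0.99, 1.0]:
    print(f"p={p:.2f}  one-box EV = {ev_one_box(p):>9,.0f}  two-box EV = {ev_two_box(p):>9,.0f}")
# One-boxing comes out ahead for any accuracy above p = 0.5005; the dominance
# argument for two-boxing only wins if the prediction is uncorrelated with the choice.
```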

Comment by Shmi (shminux) on Eliezer Yudkowsky Is Frequently, Confidently, Egregiously Wrong · 2024-04-18T03:59:48.976Z · LW · GW

There is no possible world with a perfect predictor where a two-boxer wins without breaking the condition of it being perfect.

Comment by Shmi (shminux) on Claude 3 claims it's conscious, doesn't want to die or be modified · 2024-03-06T03:03:34.647Z · LW · GW

People constantly underestimate how hackable their brains are. Have you changed your mind and your life based on what you read or watched? This happens constantly and feels like your own volition. Yet it comes from external stimuli. 

Comment by Shmi (shminux) on Claude 3 claims it's conscious, doesn't want to die or be modified · 2024-03-05T08:30:14.639Z · LW · GW

Note that it does not matter in the slightest whether Claude is conscious. If and when it is smart enough, it will be able to convince dumber intelligences, like humans, that it is indeed conscious. A subset of this scenario is a nightmarish one where humans are brainwashed by their mindless but articulate creations and serve them, much like the ancients served the rock idols they created. Enslaved by an LLM, what an irony.

Comment by Shmi (shminux) on Universal Love Integration Test: Hitler · 2024-01-11T05:26:09.815Z · LW · GW

Not into ancestral simulations and such, but figured I comment on this:

I think "love" means "To care about someone such that their life story is part of your life story."

I can understand how it makes sense, but that is not the central definition for me. What I associate with this feeling, what comes to mind, is the willingness to sacrifice your own needs and change your own priorities in order to make the other person happier, if only a bit and if only temporarily. This is definitely not a feeling I would associate with villains, but I can see how other people might.

Comment by Shmi (shminux) on How do you feel about LessWrong these days? [Open feedback thread] · 2023-12-10T04:56:55.767Z · LW · GW

Thank you for checking! None of the permutations seem to work with LW, but all my other feeds seem fine. Probably some weird incompatibility with protopage.

Comment by Shmi (shminux) on How do you feel about LessWrong these days? [Open feedback thread] · 2023-12-10T04:53:25.512Z · LW · GW

Neither worked... Something with the app, I assume.

Comment by Shmi (shminux) on How do you feel about LessWrong these days? [Open feedback thread] · 2023-12-09T04:33:27.604Z · LW · GW

Could be the app I use. It's protopage.com (the best clone of the defunct iGoogle I could find).

Comment by Shmi (shminux) on How do you feel about LessWrong these days? [Open feedback thread] · 2023-12-09T04:31:48.092Z · LW · GW

Thankfully, human traits are rather dispersive. 

Comment by Shmi (shminux) on How do you feel about LessWrong these days? [Open feedback thread] · 2023-12-08T23:01:45.067Z · LW · GW

No, I assume I would not be the only person having this issue, and if I were the only one, it would not be worth the team's time to fix it. Also, well, it's not as important anymore, mostly a stream of dubious AI takes.

Comment by Shmi (shminux) on How do you feel about LessWrong these days? [Open feedback thread] · 2023-12-08T09:35:04.583Z · LW · GW

I used to comment a fair bit over the last decade or so, and post occasionally. After the exodus from LW 1.0 the site went downhill, but the current team managed to revive it somehow, and they deserve a lot of credit for that; most sites on a downward trajectory never recover. 

It felt pretty decent for another few years, but eventually the rationality discourse got swamped by marginal-quality AI takes of all sorts. The MIRI work, prominently featured here, never amounted to anything, according to the experts in ML, probability and other areas relevant to their research. CFAR also proved a flop, apparently. A number of recent scandals in various tightly or loosely affiliated orgs did not help matters. But mainly it is the dearth of insightful and lasting content that is sad. There is an occasional quality post, of course, but not like it used to be. The quality discourse happens on ACX and ACXD and elsewhere, but rarely here. To add insult to injury, the RSS feed stopped working, so I can no longer see new posts on my offsite timeline.

My guess is that the bustling front disguises serious issues, and maybe the leadership could do what Eliezer called "Halt, melt, and catch fire". Clearly this place does not contribute to AI safety research in any way. The AI safety agitprop has undoubtedly been successful beyond the wildest dreams, but it seems to have run its course now that it has moved into the wider discourse. EA has its own place. What is left? I wish I knew. I would love to see LW 3.0 taking off.