- If you spend 8000 times less on AI alignment (compared to the military),
- You must also believe that AI risk is 8000 times less (than military risk).[1]
No. You must believe that spending on military is 8000 times more helpful to your goals. And really, in a democracy or other multilateral decision framework, nobody actually has to believe this, it just has to be 8000 times easier to agree to spend a marginal amount, which is quite path-dependent.
Even if you DO believe the median estimates as given, you have to weight it by the marginal change that spending makes. Military spending keeps the status quo, rewards your constituents, makes you look good, etc. AI spending is ... really confusing and doesn't really help any political goals. It's absolutely not clear that spending more can increase safety - the obvious thing that happens when you spend is acceleration, not slowdown.
Ah, yes - bargaining solutions that ignore or hide a significant underlying power disparity are rampant in wishful-thinking academic circles, and irrelevant in real life. That's the context I was missing; my confusion is resolved. Thanks!
I'm missing some context here. Is this not obvious, and well-supported by the vast majority of "treaties" between Europeans and natives in the 16th through 19th centuries? For legal settlements, the outcome is generally between the extremes that each party would prefer, but that range can still include "quite bad", even if not completely arbitrary.
"We'll kill you quickly and painlessly" isn't actually arbitrarily bad, it's only quite bad. There are possibly worse outcomes available if no agreement was available.
Overall this feels comfortable and reasonable to me in some situations, but I had a very strong negative reaction to the opening, as I applied it to other situations until I'd read the whole thing.
one is constantly called to account for one’s behavior. At any moment, one may be asked “what are you doing?” or “why did you do that?” And one is expected to provide a reasonable answer.
This sounds like a nightmare. But that depends a whole lot on the frequency and intensity of such questions and discussion. "constantly called to account" just isn't going to work for me. "able to discuss goals and behaviors when useful and appropriate" is mandatory for happy coexistence (for me). And they're the same thing, just slight variants.
I think the key underlying context to call out is "presumption of alignment". Among people who overall share a philosophy and at least some goals, this all just works. Among less-trusted acquaintances, it does not.
The ecosystem (econo-system?) of drug regulation and approval is the primary cost/required-investment for much of this. The tension between protecting profits (and making sure all agencies and participants get their cut) and selling the system as protecting the public is really hard to break.
One of the biggest online threats to rational discourse,
If this is true, I'm relieved, because it means there are no serious threats. But I doubt it. I hadn't heard the name for a number of years, and even in the heyday it only mattered in a tiny part of a tiny sub-community.
You can put those options into .ssh/config, which makes it work for things which use SSH directly (scp, git, other tools) when they don't know to go through your script.
Thanks for writing this, but my personal experience of valuing things is a direct contradiction to this. Almost all valuations have some kind of non-linear aggregation. "Declining marginal utility" is observationally and reflectively true for me, at least, and there are many cases outside myself which are more consistent with nonlinear aggregation than linear.
In a lot of cases, the margin is tiny, so it's hard to notice and not very important. Going from 9 billion to 9.01 or 9.5 billion is close to linear. Going from 0 to 1 or 1 to 2 or 9 to 10 is often VERY different in utility-change.
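To make the arithmetic concrete, here is a minimal sketch. It assumes a log-shaped utility curve, u(x) = ln(1 + x), which is only one hypothetical example of declining marginal utility and not a claim about anyone's actual preferences.

```python
import math


def utility(x: float) -> float:
    """Hypothetical concave utility: u(x) = ln(1 + x)."""
    return math.log1p(x)


def gain(a: float, b: float) -> float:
    """Utility change from moving a quantity from a to b."""
    return utility(b) - utility(a)


# Same-sized steps at small quantities produce very different utility changes.
small_steps = {
    "0 -> 1": gain(0, 1),
    "1 -> 2": gain(1, 2),
    "9 -> 10": gain(9, 10),
}

# Same-sized steps at large quantities produce almost identical changes,
# which is what "close to linear" means locally.
big_first = gain(9e9, 9.01e9)
big_second = gain(9.01e9, 9.02e9)

if __name__ == "__main__":
    for label, value in small_steps.items():
        print(f"{label}: {value:.4f}")
    print(f"ratio of consecutive large steps: {big_first / big_second:.4f}")
```

Under this assumed curve, the three small steps shrink quickly, while the two consecutive large steps differ by a fraction of a percent.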
Interesting, but I worry that the word "Karma" as a label for a legibly-usable resource token makes it VERY different from common karma systems on social websites, and that the bid/distribute system is even further from common usage.
For the system described, "karma" is a very misleading label. Why not just use "dollars" or "resource tokens"?
The rabbit hole can go deep, and probably isn't worth getting too fancy for single-digit hosts. Fleets of thousands of spot instances benefit from the effort. Like everything, dev-time vs runtime-complexity vs cost-efficiency is a tough balance.
When I was doing this often, I had different modes for "dev mode, which includes human-timeframe messing about" and "prod mode", which was only for monitored workloads. In both cases, automating the "provision, spin up, and initial setup", as well as the "auto-shutdown if not measurably used for N minutes (60 was my default)" with a one-command script made my life much easier.
I've seen scripts (though I don't have links handy) that do this based on no active logins and no CPU load for X minutes as well. On the other tack, I've seen a lot of one-off processes that trigger a shutdown when they complete (and write their output/logs to S3 or somewhere durable). Often a Lambda is used for the control plane - it responds to signals and runs outside the actual host.
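A minimal sketch of the "no logins and no CPU load for N minutes" idea, for a Unix host. The load threshold, the use of `who` to count logins, and the `sudo shutdown -h now` call are all assumptions for illustration, not a reconstruction of any specific script mentioned above.

```python
import os
import subprocess
import time
from dataclasses import dataclass
from typing import Optional

IDLE_MINUTES = 60       # the default mentioned above
LOAD_THRESHOLD = 0.05   # hypothetical cutoff for "no CPU load"


@dataclass
class IdleTracker:
    """Tracks how long the host has been continuously idle."""
    idle_limit_s: float = IDLE_MINUTES * 60
    idle_since: Optional[float] = None

    def update(self, now: float, load: float, logins: int) -> bool:
        """Record one observation; return True once idle for the full limit."""
        busy = logins > 0 or load > LOAD_THRESHOLD
        if busy:
            self.idle_since = None  # any activity resets the idle clock
            return False
        if self.idle_since is None:
            self.idle_since = now
        return now - self.idle_since >= self.idle_limit_s


def active_logins() -> int:
    """Count login sessions by counting non-empty lines printed by `who`."""
    out = subprocess.run(["who"], capture_output=True, text=True).stdout
    return len([line for line in out.splitlines() if line.strip()])


def run_watchdog(poll_s: float = 60.0) -> None:
    """Poll until the host has been idle long enough, then power off.

    Assumes passwordless sudo for `shutdown`; adjust for your environment.
    """
    tracker = IdleTracker()
    while True:
        load_1min = os.getloadavg()[0]
        if tracker.update(time.monotonic(), load_1min, active_logins()):
            subprocess.run(["sudo", "shutdown", "-h", "now"], check=False)
            return
        time.sleep(poll_s)
```

Nothing runs on import; calling `run_watchdog()` from a systemd unit or a cron `@reboot` entry would be one way to start it. Keeping the decision logic in `IdleTracker.update` means it can be tested without touching a real host.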
There's a big presumption there. If he was a p-zombie to start with, he still has non-experience after the training. We still have no experience-o-meter, or even a unit of measure that would apply.
For children without major brain abnormalities or injuries, who CAN talk about it, it's a pretty good assumption that they have experiences. As you get more distant from your own structure, your assumptions about qualia should get more tentative.
Do you think that as each psychological continuation plays out, they'll remain identical to one another?
They'll differ from one another, and differ from their past singleton self. Much like future-you differs from present-you. Which one to privilege for what purposes, though, is completely arbitrary and not based on anything.
Which psychological stream one-at-the-moment-of-brain-scan ends up in is a matter of chance.
I think this is a crux. It's not a matter of chance, it's all of them. They all have qualia. They all have continuity back to the pre-upload self. They have different continuity, but all of them have equally valid continuity.
Think of it like this: if one had one continuation in which one lived a perfect life, one would be guaranteed to live that perfect life. But if one had 10 copies in which one lived a perfect life, one does not benefit at all. It's the average that matters.
Sure, just like if a parent has one child or 10 children, they have identical expectations.
I think we're unlikely to converge here - our models seem too distant from each other to bridge. Thanks for the post, though!
Reminder to all: thought experiments are limited in what you can learn. Situations which are significantly out-of-domain for our evolved and trained experiences simply cannot be analyzed by our intuitions. You can sometimes test a model to see if it remains useful in novel/fictional situations, but you really can't trust the results.
For real decisions and behaviors, details matter. And thought experiments CANNOT provide the details, or they'd be just situations, not hypotheticals.
Once we identify an optimal SOA
This is quite difficult, even without switching costs or fear of change. Definition of optimal is elusive, and most SOA have so many measurable and unmeasurable, correlated and uncorrelated factors to them that comparison is not directly possible.
Add to this the common moral beliefs (incorrect IMO, but still very common) of "inaction is less blameworthy than wrong action, and only slightly blameworthy compared to correct action", and there needs to be a pretty significant expected gain from switching in order to undertake it.
With that in mind, suppose you are asexual. Would you take a pill to make you not asexual?
I'm not asexual, but sex is less important to me than for most humans, as far as I can tell. I know of no pills to shift in either direction that are actually effective and side-effect-free, and it's not meta-important enough to me to seek out change in either direction. This does NOT mean that I judge it optimal, just that I judge the risk and cost of adjusting myself to be higher than the value.
In fact, I suspect such pills would be very popular if they existed, and I would likely try them out if common, to find out if it's actually better in either direction.
You could make this argument about a LOT of things - for any trait or metric about yourself, why is this exact value the best one? Wouldn't you like to raise or lower it? In fact, most people DO attempt to change things about themselves. It's just not actually as easy as taking a pill, so the cost of actually working toward a change is nonzero, and can't be handwaved away.
Wow, a lot of assumptions without much justification.
Let's assume computationalism and the feasibility of brain scanning and mind upload. And let's suppose one is a person with a large compute budget.
Already well into fiction.
But one is not both. This means that when one is creating a copy one can treat it as a gamble: there's a 50% chance they find themselves in each of the continuations.
There's a 100% chance that each of the continuations will find themselves to be ... themselves. Do you have a mechanism to designate one as the "true" copy? I don't.
What matters to one is then the average quality of one's continuations.
Disagree, but I'm not sure that my preference (some aggregation function with declining marginal impact) is any more justifiable. It's no less.
Before even a small fraction of one's life has played out, one's copy will bear no relation to oneself. To spend one's compute on this person, effectively a stranger, is just altruism. One would be better off donating the compute to ASI.
Huh? This supposes that one of them "really" is you, not the actual truth that they all are equal continuations of you. Once they diverge, they're still closer to twin siblings to each other, and there is no fact that would elevate one as primary.
This is a topic where macro and micro have a pretty big gap.
If you're asking about measured large-group unemployment, you probably don't get very good causality from any given change, and there's no useful, simple model of the motivations and frictions of potential employers and potential employees. It's a very complicated matching market.
If you're asking about some specific reasons that an individual may be out of work or become out of work, you'll get a lot better result and some concrete reasons. But everyone you talk to will say "that doesn't scale!".
At its most useless modeling level, unemployment happens when some people don't want to (or aren't allowed to) accept the wage that someone can and will offer.
I don't understand the question. What intuition for not smoking are you talking about? CDT prefers smoking. Are you asking why EDT abstains from smoking? I'm not the best defender, as I don't really think EDT is workable, but as I understand it EDT updates its world state based on actions, meaning that it prefers the world where you don't have the lesion and don't WANT to smoke.
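A rough sketch of the difference, under assumptions: every probability and utility below is a made-up number chosen only to illustrate the standard smoking-lesion setup, where the lesion causes both the urge to smoke and the cancer, and smoking itself causes no harm.

```python
# All numbers are hypothetical, chosen only for illustration.
P_LESION = 0.2                               # prior probability of the lesion
P_SMOKE_GIVEN = {True: 0.9, False: 0.1}      # P(smoke | lesion present?)
P_CANCER_GIVEN = {True: 0.8, False: 0.01}    # P(cancer | lesion present?)
U_SMOKE = 10.0                               # enjoyment from smoking
U_CANCER = -1000.0                           # cost of cancer


def p_cancer(p_lesion: float) -> float:
    """Probability of cancer given a probability of having the lesion."""
    return p_lesion * P_CANCER_GIVEN[True] + (1 - p_lesion) * P_CANCER_GIVEN[False]


def p_lesion_given_action(smoke: bool) -> float:
    """Bayes update on the lesion, treating the action as evidence (EDT-style)."""
    like_lesion = P_SMOKE_GIVEN[True] if smoke else 1 - P_SMOKE_GIVEN[True]
    like_clear = P_SMOKE_GIVEN[False] if smoke else 1 - P_SMOKE_GIVEN[False]
    joint_lesion = like_lesion * P_LESION
    joint_clear = like_clear * (1 - P_LESION)
    return joint_lesion / (joint_lesion + joint_clear)


def cdt_eu(smoke: bool) -> float:
    """CDT: the action cannot cause the lesion, so the prior is left unchanged."""
    return (U_SMOKE if smoke else 0.0) + U_CANCER * p_cancer(P_LESION)


def edt_eu(smoke: bool) -> float:
    """EDT: condition the lesion probability on the action taken."""
    return (U_SMOKE if smoke else 0.0) + U_CANCER * p_cancer(p_lesion_given_action(smoke))


if __name__ == "__main__":
    for name, eu in (("CDT", cdt_eu), ("EDT", edt_eu)):
        choice = "smoke" if eu(True) > eu(False) else "abstain"
        print(f"{name}: EU(smoke)={eu(True):.1f}, EU(abstain)={eu(False):.1f} -> {choice}")
```

With these assumed numbers, CDT's two options differ only by the enjoyment term, so it smokes, while EDT treats abstaining as evidence of being in the no-lesion world and so abstains.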
The first one is only a metaphor - it's not possible now, and we don't know if it ever will be (because we don't know how to scan a being in enough detail to recreate it well enough).
The second one is WAY TOO limited. If you put a radio anywhere near your head, or really any other-controlled media, you can be programmed. By trivial extension, you have been programmed. Get used to it.
Economists and other social theorists often take the concept of utility for granted.
Armchair economists and EAs even more so. Take for granted, and fail to document WHICH version of the utility concept they're using.
For me, utility is a convenient placeholder for the underlying model that our ordinal preferences are expressed through action (I did X, meaning I prefer the expected sum of value of outcomes likely from X). Utility is the "value" that is preferred. Note that it's kind of a circular definition - it's the thing that drives decisions, proven by the fact that actions take place.
More expansive uses of the term come about by forgetting that this definition doesn't carry much information about anything. It would be nice if we could find underlying consistent preferences, and this would be a good term for the unification of them. And if they're long-term consistent preferences, maybe it should add up over time to explain time-preferences. And if everyone is equal, then clearly we can sum this thing up to get a group value.
I think it's a different level of abstraction. Decision theory works just fine if you separate the action of predicting a future action from the action itself. Whether your prior-prediction influences your action when the time comes will vary by decision theory.
I think, for most problems we use to compare decision theories, it doesn't matter much whether considering, planning, preparing, replanning, and acting are correlated time-separated decisions or whether it all collapses into a sum of "how to act at point-in-time". I haven't seen much detailed exploration of decision theory X embedded agents or capacity/memory-limited ongoing decisions, but it would be interesting and important, I think.
Decision theory is fine, as long as we don't think it applies to most things we colloquially call "decisions". In terms of instantaneous discrete choose-an-action-and-complete-it-before-the-next-processing-cycle, it's quite a reasonable topic of study.
But if you only have a belief that you will do something in the future, you still have to decide, when the time comes, whether to carry out the action or not. So your previous belief doesn't seem to be an actual decision, but rather just a belief about a future decision -- about which action you will pick in the future
Correct. There are different levels of abstraction of predictions and intent, and observation/memory of past actions which all get labeled "decision". I decide to attend a play in London next month. This is an intent and a belief. It's not guaranteed. I buy tickets for the train and for the show. The sub-decisions to click "buy" on the websites are in the past, and therefore committed. The overall decision has more evidence, and gets more confident. The cancelation window passes. Again, a bit more evidence. I board the train - that sub-decision is in the past, so is committed, but there's STILL some chance I won't see the play.
Anything you call a "decision" that hasn't actually already happened is really a prediction or an intent. Even DURING an action, you only have intent and prediction. While the impulse is traveling down my arm to click the mouse, the power could still go out and I don't buy the ticket. There is past, which is pretty immutable, and future, which cannot be known precisely.
I think this is compatible with Spohn's example (at least the part you pasted), and contradicts OP's claim that "you did not make a decision" for all the cases where the future is uncertain. ALL decisions are actually predictions, until they are in the past tense. One can argue whether that's a p(1) prediction or a different thing entirely, but that doesn't matter to this point.
"If, on making a decision, your next thought is “Was that the right decision?” then you did not make a decision." is actually good directional advice in many cases, but it's factually simply incorrect.
When the decision is made, consideration ends. The action must be wholehearted in spite of uncertainty.
This seems like hyperbolic exhortation rather than simple description. This is not how many decisions feel to me - many decisions are exactly a belief (complete with Bayesian uncertainty). A belief in future action, to be sure, but it's distinct in time from the action itself.
I do agree with this as advice, in fact - many decisions one faces should be treated as a commitment rather than an ongoing reconsideration. It's not actually true in most cases, and the ability to change one's plan when circumstances or knowledge changes is sometimes quite valuable. Knowing when to commit and when to be flexible is left as an exercise...
I only see one downvoted post, and a bunch of comments and a few posts with very low voting at all. That seems pretty normal to me, and the advice of "lurk for quite a bit, and comment occasionally" is usually good for any new users on any site.
A lot depends on what you mean by "required", and what specific classes or functions you're talking about. The core skill of committing a position to writing and supporting it with logic is never going away. It will shift from "do this with minimal spelling and grammar assistance" to "ensure that the prompt-review-revise loop generates output you can stand behind".
This is already happening in many businesses and practical (grant-writing) aspects of academia. It'll take a while for undergrad and MS programs to admit that their academic theories of what they're teaching need revision.
This seems generally applicable. Any significant money transaction includes expectations, both legible and il-, which some participants will classify as bullshit. Those holding the expectations may believe it to be legitimately useful, or semi-legitimately necessary due to lack of perfect alignment.
If you want to specify a bit, we can probably guess at why it's being required.
[Note: I apologize for being somewhat combative - I tend to focus on the interesting parts, which are those parts that don't add up in my mind. I thank you for exploring interesting ideas, and I have enjoyed the discussion!]
I was only saying that I don't see anything proving it won't work
Sure, proving a negative is always difficult.
I agree that this missile problem shouldn't happen in the first place. But it did happen in the past
Can you provide details on which incident you're talking about, and why the money-bond is the problem that caused it, rather than simply not having any communications loop to the controllers on the ground or decent identification systems in the missile?
I've been in networking long enough to know that "can be less than", "often faster", and "can run" are all verbal ways of saying "I haven't thought about reliability or measured the behavior of any real systems beyond whole percentiles."
But really, I'm having trouble understanding why a civilian plane is flying in a war zone, and why current IFF systems can't handle the identification problem of a permitted entry.
Kind of unfortunate that a comms or systems latency destroys civilian airliners. But nice to live in a world where all flyers have $10B per missile/aircraft pair lying around, and everyone trusts each other enough to hand it over (and hand it back later).
Sure. There's lots of things that aren't yet possible to collect evidence about. No given conception of God or afterlife options has been disproven. However, there are lots of competing, incompatible theories, none of which have any evidence for or against. Assigning any significant probability (more than a percent, say) to any of them is unjustified. Even if you want to say 50/50 that some form of deism will be revealed after death, there are literally thousands of incompatible conceptions of how that works. And near-infinite possibilities that haven't become popular. Note that if it turns out that consciousness is physical and just ends when the physical support for it terminates, then nobody will be able to observe that. It's a permanent "no evidence" situation.
All that said, it's hard to argue against someone else's choice of priors (what they believe before evidence becomes available). Maybe they have access to experiences you don't. Maybe they weight some kinds of social evidence more heavily (the 'prophets' theory that there are historical or current people with more direct connections). Maybe they're even right - you don't have access to any counterevidence, right? By "hard to argue", I mostly mean "hard to be sure yourself", but also literally "not worth arguing". We'll all find out soon enough, right?
Or maybe it's all relative - it's true for them, and not for you. Or maybe it's weirder than we can imagine.
I didn't downvote this, because it seems good-faith and isn't harmful. But I really dislike this "friendly" style of writing, and it doesn't fit well on LessWrong. It's very hard to find things that are concrete enough to understand whether I disagree or not. Rhetorical questions (especially ones that you don't answer) really detract from understanding your POV. Some specifics:
But most of us patch together a little of this and a little of that and try to muddle through with a philosophy that’s something of a crazy quilt.
Citation needed. In fact, purpose of statement needed - what does this actually assert, and how does it help in understanding ... anything?
In either case you are asked whether you will sacrifice one life to save many, and so from one perspective the two story variants seem to be essentially identical, only differing in inessential details.
But only a TINY bit of reflection indicates that the differences are nowhere near inessential. Choosing between 5 people who've gotten themselves tied to the railroad tracks vs 1 person who has is simply a different situation than pushing an innocent bystander off a bridge. You don't need to resort to a sledgehammer of different fundamental structures.
And you desperately want to be able to tell the next-of-kin “It wasn’t me—I had no choice!”
Not even close. It was me; I made the choice, and I understand that it sucks. For those who did not pull the lever/push the innocent, they ALSO made the choice, and it sucks.
This would be a lot stronger if it acknowledged how few lies have the convenient fatal flaw of a chocolate allergy. Many do, and it's a good overall process, but it's nowhere near as robust as implied.
Note that I disagree that it's not applicable when you don't already suspect deception. It's useful to look for details and inconsistency when dealing with any fallible source of information. It doesn't matter whether it's an intentional lie, a confused reporter, or an inapplicable model: truth is the only thing that's consistent with itself and with observations.
This is a fundamental truth for all commodities and valuable things. They're fungible, but not positionally identical, and not linearly aggregable. This is why we prefer to talk about "utility" over "quantity" in game theory discussions.
Market cap is meaningful in some sense - the price in a liquid market isn't just randomly the last price used, it's the equilibrium price of a marginal share. That's the price below which current holders don't want to sell, and above which people with money don't want to buy. That equilibrium is real information.
But it's still only for the marginal share (or block). We don't know what the value/quantity curve looks like, or how it will change if demand changes. A thing is worth what it will bring. We just don't know what the total mass of shares will bring.
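As a toy illustration only: since the real value/quantity curve is unknown, the linear demand curve below is a purely hypothetical assumption, as are the share count and price. It shows how far "last price times shares outstanding" can sit from what the whole mass of shares would bring.

```python
SHARES_OUTSTANDING = 1_000   # hypothetical
LAST_PRICE = 100.0           # hypothetical equilibrium price of the marginal share


def bid_price(k: int) -> float:
    """Hypothetical price the k-th share sold would fetch (linear decline)."""
    return LAST_PRICE * (1 - k / 2_000)


# Conventional market cap: every share valued at the marginal price.
market_cap = LAST_PRICE * SHARES_OUTSTANDING

# What selling every share down the assumed demand curve would actually bring.
liquidation_value = sum(bid_price(k) for k in range(SHARES_OUTSTANDING))

if __name__ == "__main__":
    print(f"market cap:        {market_cap:,.0f}")
    print(f"liquidation value: {liquidation_value:,.0f}")
```

A steeper or flatter assumed curve changes the gap arbitrarily, which is the point: the marginal price alone doesn't pin down the total.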
"something like that" isn't open enough. "or something else entirely" seems more likely than "something like that". Many more than 2 groups (family-sized coalitions) is an obvious possibility, but there are plenty of other strategies used by primitive Malthusian societies - infanticide being a big one, and ritual killings being another. According to Wikipedia, Jared Diamond suggests cannibalism for Rapa Nui.
Looking at Wikipedia (which I should have done earlier), there's very little evidence for what specific things changed during the collapse.
In any case, it's tenuous enough that one shouldn't take any lessons or update one's models based on this.
In the medium-term reduced-scarcity future, the answer is: lock them into a VR/experience-machine pod.
edit: sorry, misspoke. In this future, humans are ALREADY mostly in these pods. Criminals or individuals who can't behave in a shared virtual space simply get firewalled into their own sandbox by the AI. Or those behaviors are shadowbanned - the perpetrator experiences them, the victim doesn't.
I nominate NYC, and I assert that LA is an inferior choice for this. Source: John Carpenter/Kurt Russell movies.
In a sufficiently wealthy society we would never kill anyone for their crimes.
In a sufficiently wealthy society, there are far fewer forgivable/tolerable crimes. I'm opposed to the death penalty in the current US situation, mostly for knowledge and incentive reasons (too easy to abuse, too hard to be sure). All of the arguments shift in weight by a lot if the situation changes. If the equilibrium shifts significantly so that there are fewer economic reasons for crimes, and fewer economic reasons not to investigate very deeply, and fewer economic reasons not to have good advice and oversight, there may well be a place for it.
This was my thinking as well. On further reflection, and based on OP's response, I realize there IS a balance that's unclear. The list contains some false-positives. This is very likely just by the nature of things - some are trolls, some are pure fantasy, some will have moved on, and only a very few are real threats.
So the harm of making a public, anonymous, accusation and warning is definitely nonzero - it escalates tension for a situation that has passed. The harm of failing to do so in the real cases is also nonzero, but I expect many of the putative victims know they have a stalker or deranged enemy who'd wish them dead, and the information is "just" that this particular avenue has been explored.
That balance is difficult. I philosophically lean toward "open is better than secret, and neither is as good as organized curation and controlled disclosure". Since there's no clear interest by authorities, I'd publish. And probably I'd do so anonymously as I don't want the hassle of having potential murderers know about me.
Can you explore a bit more about why you can't ethically dump it on the internet? From my understanding, this is information you have not broken any laws to obtain, and have made no promises as to confidentiality.
If not true publication, what keeps you from sending it to prosecutors and police? They may or may not act, but that's true no matter who you give it to (and true NOW of you).
People who have a lot of political power or own a lot of capital, are unlikely to be adversely affected if (say) 90% of human labor becomes obsolete and replaced by AI.
That's certainly the hope of the powerful. It's unclear whether there is a tipping point where the 90% decide not to respect the on-paper ownership of capital.
so long as property rights are enforced, and humans retain a monopoly on decisionmaking/political power, such people are not-unlikely to benefit from the economic boost that such automation would bring.
Don't use passive voice for this. Who is enforcing which rights, and how well can they maintain the control? This is a HUGE variable that's hard to control in large-scale social changes.
Specifically, "So, the islanders split into two groups and went to war." is fiction - there's no evidence, and it doesn't seem particularly likely.
Well, there are possible outcomes that make resources per human literally infinite. They're not great either, by my preferences.
In less extreme cases, a lot depends on your definition of "poverty", and the weight you put on relative poverty vs absolute poverty. Already in most parts of the world the literal starvation rate is extremely low. It can get lower, and probably will in a "useful AI" or "aligned AGI" world. A lot of capabilities and technologies have already moved from "wealthy only" to "almost everyone, including technically impoverished people", and this can easily continue.
- There's a wide range of techniques and behaviors that can be called "hypnosis", and an even wider range of what can be called "a real thing, right?". Things in the realm of hypnosis (meditation, guided-meditation, self-hypnosis, daily affirmations, etc.) have plenty of anecdotal support from adherents, and not a lot of RCTs or formal proof of who it will work for and who it won't.
- There's a TON of self-help and descriptive writing on the topics of meditation and self-hypnosis. For many people, daily affirmations seem to be somewhat effective in changing their attitude over time. For many, a therapist or guide may be helpful in setting up and framing the hypnosis.
What does "unsafe" mean for this prediction/wager? I don't expect the murder rate to go up very much, nor life expectancy to reverse its upward trend. "Erosion of rights" is pretty general and needs more specifics to have any idea what changes are relevant.
I think things will get a little tougher and less pleasant for some minorities, both cultural and skin-color. There will be a return of some amount of discrimination and persecution. Probably not as harsh as it was in the 70s-90s, certainly not as bad as earlier than that, but worse than the last decade. It'll probably FEEL terrible, because it was on such a good trend recently, and the reversal (temporary and shallow, I hope) will dash hopes of the direction being strictly monotonic.
This seems like a story that's unsupported by any evidence, and no better than fiction.
They could have fought over resources in a scramble of each against all, but anarchy isn't stable.
This seems most likely, and "stable" isn't a filter in this situation - 1/3 of the population will die, nothing is stable. It wouldn't really be "each against all", but "small (usually family) coalitions against some of the other small-ish coalitions". The optimal size of coalition will be dependent on a lot of factors, including ease of defection and strength of non-economic bonds between members.
- If you could greatly help her at small cost, you should do so.
This needs to be quantified to determine whether or not I agree. In most cases I imagine (and a few I've experienced), I would (and did) kill the animal to end its suffering and to prevent harm to others if the animal might be subject to death throes or other violent reactions to its fear and pain.
In other cases I imagine, I'd walk away or drive on, without a second thought. Neither the benefit nor the costs are simple, linear, measurable things.
- Her suffering is bad.
I don't have an operational definition of "bad". I prefer less suffering, all else equal. All else is never equal - I don't know what alternatives and what suffering (or reduced joy) any given remediation would require, and only really try to estimate them when faced with a specific case.
For the aggregate case, I don't buy into a simple or linear aggregation of suffering (or of joy or of net value of distinct parts of the universe). I care about myself perhaps two dozen orders of magnitude more than the ant I killed in my kitchen this morning. And I care about a lot of things with a non-additive function - somewhere in the realm of logarithmic. I care about the quarter-million remaining gorillas, but I care about a marginal gorilla much less than 1/250K of that caring.
One challenge I'd have for you / others who feel similar to you, is to try to get more concrete on measures like this, and then to show that they have been declining.
I've given some thought to this over the last few decades, and have yet to find ANY satisfying measures, let alone a good set. I reject the trap of "if it's not objective and quantitative, it's not important" - that's one of the underlying attitudes causing the decline.
I definitely acknowledge that my memory of the last quarter of the previous century is fuzzy and selective, and beyond that is secondhand and not-well-supported. But I also don't deny my own experience that the people I am aware of as individuals (a tiny subset of humanity) have gotten much less hopeful and agentic over time. This may well be for reasons of media attention, but that doesn't make it not real.
Do you think that the world is getting worse each year?
Good clarification question! My answer probably isn’t satisfying, though. “It’s complicated” (meaning: multidimensional and not ordinally comparable).
On a lot of metrics, it’s better by far, for most of the distribution. On harder-to-operationally-define dimensions (sense of hope and agency for the 25th through 75th percentile of culturally normal people), it’s quite a bit worse.
would consider the end of any story a loss.
Unfortunately, now you have to solve the fractal-story problem. Is the universe one story, or does each galaxy have its own? Each planet? Continent? Human? Subpersonal individual goals/plotlines? Each cell?