Quite possibly, but without SOME framework for evaluating wishes, it's hard to know which wishes (even of oneself) to support and which to fight/deprioritize.
Humans (or at least this one) often have desires or ideas that aren't, when considered, actually good ideas. Also, humans (again, at least this one) have conflicting desires, only a subset of which CAN be pursued.
It's not perfect, and it doesn't work when extended too far into the tails (because nothing does), but consequentialism is one of the better options for judging one's desires and picking which to pursue.
Thank you, this is interesting and important. I worry that it overstates similarity of different points on a spectrum, though.
in a certain sense, you are doing the exact same thing as the more overtly irrational person, just hiding it better!
In a certain sense, yes. In other, critical senses, no. This is a case where quantitative differences are big enough to be qualitative. When someone is clinically delusional, there are a few things which distinguish it from the more common wrong ideas. Among them, the inability to shut up about it when it's not relevant, and the large negative impact on relationships and daily life. For many many purposes, "hiding it better" is the distinction that matters.
I fully agree that "He's not wrong but he's still crazy" is valid (though I'd usually use less-direct phrasing). It's pretty rare that "this sounds like a classic crazy-person thought, but I still separately have to check whether it's true" happens to me, but it's definitely not never.
This may be a complaint about legibilism, not specifically consequentialism. Gödel was pretty clear - any sufficiently powerful formal system is either incomplete or inconsistent. Any moral or decision system that demands that everything important about a decision is clear and well-understood is going to have similar problems. Your TRUE reasons for a lot of things are not accessible, so you will look for legible reasons to do what you want, and you will find yourself a rationalizing agent, rather than a rational one.
That said, consequentialism is still a useful framework for evaluating how closely your analytic self matches with your acting self. It's not going to be perfect, but you can choose to get closer, and you can get better at understanding which consequences actually matter to you.
Climbing a mountain has a lot of consequences that you didn't mention, but probably should consider. It connects you to people in new ways. It gives you interesting stories to tell at parties. It's a framework for improving your body in various ways. If you die, it lets you serve as a warning to others. It changes your self-image (honestly, this one may be the most important impact).
We can be virtually certain that 2+2=4 based on priors.
I don't understand this model. For me, 2+2=4 is an abstract analytic concept that is outside of Bayesian probability. For others, it may be "just" a probability, about which they might be virtually certain, but it won't be on priors, it'll be on mountains of evidence and literally zero counterevidence (presumably because every experience that contradicts it gets re-framed as having a different cause).
There's no way to update on evidence outside of your light cone, let alone on theoretical other universes or containing universes. Because there's no way to GET evidence from them.
I did some analysis of audio watermarks a long time ago, and concluded that the answer to the title is "yes, this is a lost cause analytically, but viable socially and legally for quite some time". The reason is mostly economics and standards of proof, not technical. You start out with
1) The watermark should be decisive: It should be almost impossible for normal text to appear marked by accident.
But then most of your analysis is statistical. Without defining "almost impossible", you don't actually know what's successful. What will hold up in court, what will be good enough to send a threatening letter, and what will allow you to enforce an internal rule are all somewhat different requirements.
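To make "decisive" concrete, here's a minimal sketch of the arithmetic, using a toy green-list scheme of my own invention (not anything from the post): the detector counts how many tokens fall on a secret list covering half the vocabulary, so for unmarked text the false-positive rate is just a binomial tail.

```python
# Toy false-positive calculation for a green-list style watermark (my own
# illustrative model, not the post's scheme).  Unmarked text hits the secret
# list with p = 0.5 per token, so "marked by accident" is a binomial tail.
from math import comb

def false_positive_rate(n_tokens: int, k_green: int, p: float = 0.5) -> float:
    """P(at least k_green list-hits in n_tokens of unmarked text)."""
    return sum(comb(n_tokens, k) * p**k * (1 - p)**(n_tokens - k)
               for k in range(k_green, n_tokens + 1))

# 500 tokens with a 300-hit threshold: roughly 5-in-a-million false positives.
print(false_positive_rate(500, 300))
# The same 60% threshold on a 100-token snippet: about 3%, nowhere near
# "beyond a reasonable doubt".
print(false_positive_rate(100, 60))
```

Whether 5-in-a-million or 3% counts as "almost impossible" is exactly the court / threatening-letter / internal-rule question.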
Further, you assume random distribution, rather than the adversarial alterations of your point 3:
3) The watermark should be robust and able to withstand alterations to the text. Perhaps even from a malicious actor trying to destroy the watermark.
Alteration is adversarial - people are TRYING to remove or reduce the level of proof provided by the watermark. For audio, it took a long time for the knowledge of acoustic modeling and MP3 compression to filter into transcoding tools that removed watermarks. It remains the case that specific watermarking choices are kept secret for as long as possible, and are easily removed once known. The balance of not affecting perceptible sound and not trivially removed with a re-encode is still debated, and for most purposes, it's not bothered with anymore.
Text is, at first glance, EVEN HARDER to watermark invisibly - word choice is distinctive, especially if it deviates enough to reach a "beyond a reasonable doubt" threshold. For most schemes, simple rewrites (including automatic rewrites) are probably enough to raise the doubt level to the point of deniability.
Then there's the question of where the watermark is being introduced. Intentional, "text came from X edition of Y source" watermarking is different from "an LLM generated this chunk of text". The latter is often called "fingerprinting" more than "watermarking", because it's not intentionally added for the purpose. It's also doomed in the general case, but most plagiarists are too lazy to bother, so there are current use cases.
I would definitely call the bee response "smart" but thats hard to define.
I think a solid attempt at defining it is required for this post to make sense. I'd call the bee response "effective", but I can't talk myself into thinking it's "smart" in the way you talk about coordination and individual identity. It's a different axis entirely.
Do you have any references to quantity, rate of extraction/delivery, and downstream refinement processes? The book looks to be 6 years old, and likely took a while to research, so it probably describes a world that's evolved since.
My mental model is that there are economic tipping points for both refinement of lower-grade materials and synthesizing the material directly that mean "uneconomical" is a point-in-time judgement, not a long-term stable impossibility.
Also, if it's all that valuable, it seems unlikely that road disruptions will stop the flow - it could reduce and delay it by a bit, but for highly-valuable stuff, there are lots of transport options that can work, if the price is allowed to adjust.
Hmm. I worry that, without significant modeling and measurement of relevant variables, the causality is suspect. Neither the bee nor the bird response is on the same dimension as what we usually mean by "smart", and they're very likely optimized for different cost-benefit values (specifically, protection of a central hive vs distributed nests or individual birds, and the cost of replacing individuals lost in the action).
The leap to humans, with type-2 reasoning in addition to automatic behaviors, is pretty huge.
There are (at least) two different meanings of "costing" in large-scale economic impact thinking. The narrow meaning is "actual amount spent on this topic". The more common (because it's a bigger number) meaning is "how much bigger would the economy be in the counterfactual world that doesn't have this feature".
The article linked from Wikipedia says
The damage cost estimation is based on historical cybercrime figures including recent year-over-year growth, a dramatic increase in hostile nation-state sponsored and organized crime gang hacking activities, and a cyberattack surface which will be an order of magnitude greater in 2025 than it is today.
Cybercrime costs include damage and destruction of data, stolen money, lost productivity, theft of intellectual property, theft of personal and financial data, embezzlement, fraud, post-attack disruption to the normal course of business, forensic investigation, restoration and deletion of hacked data and systems, reputational harm, legal costs, and potentially, regulatory fines.
Which puts it in the second category - most of these costs are NOT direct expenses, but indirect and foregone value. That doesn't make it wrong, exactly, just not comparable to "real" measures (which GDP and GPP aren't either, but they're more defensible).
It's extremely unclear whether LLM adoption and increasing capabilities will shift the equilibrium between attack and defense on these fronts. Actually, it's almost certain that it will shift it, but it's uncertain how much and in what direction, on what timeframes.
It's further unclear whether legislation can slow the attacks more than it hinders defense.
Mostly, it's not a useful estimate or model for reasoning about decisions.
so basically, we could make an AI that wants to maximize a variable called Utility
Oh, maybe this is the confusion. It's not a variable called Utility. It's the actual true goal of the agent. We call it "utility" when analyzing decisions, and VNM-rational agents act as if they have a utility function over states of the world, but it doesn't have to be external or programmable.
I'd taken your pseudocode as a shorthand for "design the rational agent such that what it wants is ...". It's not literally a variable, nor a simple piece of code that non-simple code could change.
I don't follow the scenario. If the AI is VNM-rational and has a utility function that is linear with number of paperclips created, then it doesn't WANT to edit the utility function, because no other function maximizes paperclips.
Conversely, if an agent WANTS to modify its utility function, that implies it's not actually the utility function.
The utility function defines what "want" means.
I know some pretty senior people in security at 2 FAANG companies, have a passing acquaintance with people at others, and currently work in the Security org at a comparable company. All of them have reporting channels for specific threats, and none (that I know) are ignorant of the range of AI-enabled attacks that are likely in the near future (shockingly many already). The conversations I've had (regarding products or components I do know pretty well) have convinced me that everything I come up with is already on their radar (though some are of the form "Yeah, that's gonna happen and it's gonna suck. Current strategy is to watch for it and not talk much about it, in order not to encourage it").
Without disclosing some details, there's probably no way to determine whether your knowledge or theory is something they can update on. I'm happy to pass on any information, but I can't see why you'd trust me more than more direct employees of the future victims.
If I didn't remember discussion from previous Petrov Days here, I'd probably have skipped it, on the heuristic that big red buttons on public websites are usually advertisements, which I'd rather not deal with. I doubt there's any visual that could, without a LOT of explanation, make it seem like an ominous thing, rather than an annoying attention-grab. With the context of what Petrov Day is, and what LW has done in the past, it's neither - just an interesting view into what the admins think is worth doing and what users weirdly extrapolate from.
I pressed the button, and it asked me to confirm that I want to participate in a "social deception game", which I did. I'm somewhat skeptical that this event does anything except raise awareness of a specific thing that happened during the cold war (and presumably has been patched away in current response procedures), but it's interesting to see people taking it seriously.
Depending on the framing, I will or will not press it if I'm chosen. I'll try to play along with the rules and guidelines as published, including the in-game motivations specified.
Thanks for posting this - I never look at the front page, so would have missed it.
Be VERY careful distinguishing different uses of "most" ("many", "very many", "almost all", "all except a few exceptions"), especially when applying to outliers on other unmeasured dimensions.
My expectation is that mental faculties aren't the most critical feature of popes, so they're typically selected such that even a reduction is still sufficient. And there's likely also active obfuscation of older popes' mental acuity, so it's not as obvious to the public.
I won't speculate on whether there are other measures taken to make sure that living popes are at least somewhat mentally capable. It's quite likely that they're lucky in that it doesn't get to that point very often.
edit: I bothered to actually look for articles on the topic. It seems Benedict XVI did retire in 2013, and didn't die until 2022. And the current pope is denying he's planning to retire, but is open to the idea that a health decline could change that. Previous to Benedict XVI's retirement, it had been ~600 years since a pope had failed to die in office.
I think my speculation stands that a pope's duties (especially until very recently, before video coverage made papal activities and appearances so extremely widely visible) can be compatible with a LOT of health and cognitive degradation.
Depending on your devops pipeline (that is, how often you reimage with dd, as opposed to updating in place, and what changes you make AFTER imaging), you might consider a few timesaving options:
- small root partition, rest as a data partition. This lets you image only the root with dd. It's probably FASTER to image a 4GB partition of a 32GB SD card than to image a 4GB card directly (see the sketch at the end of this list).
- Go smaller than 4GB (by partition; you probably don't want to buy smaller cards). Raspberry Pi OS Lite should install in under 1GB, and a very usable command-line system with space for updates fits in 2GB. If you only have 300MB free in 4GB of storage, something is wrong. It may not be worth fiddling with, though.
- For multiple copies of a master, it's common to have a large/fast setup that you install, update, tune, and prepare on, then prepare the image for distribution. There are lots of options for that; the simplest is to clean and shrink the filesystem, and use that shrunken filesystem as the image to distribute.
- Use a USB "real" SSD, not an SD card. RPi 4 and above can boot from it; earlier models need to boot from an SD card but can have root and all data on USB. This can be MASSIVELY faster and somewhat more reliable than an SD card.
- Probably not worth the effort, but PiLFS or Yocto make it possible to build a "custom application distribution" for your pi. It will contain only the packages you need, and needs no update mechanism on the devices - to do an update, you rebuild and re-distribute. It can be TINY and still quite functional.
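For the "image only the root partition" option above, here's a minimal sketch, assuming the typical Raspberry Pi device layout (/dev/mmcblk0p2 as the root partition) and a root.img output file - the names are just illustrative, and it only wraps dd:

```python
# Sketch: copy just the root partition instead of the whole card.
# Device and file names are assumptions - adjust to your own layout.
import subprocess

ROOT_PARTITION = "/dev/mmcblk0p2"   # second partition is usually root on Pi images
IMAGE_FILE = "root.img"

def image_root_partition() -> None:
    """Read only the root partition with dd (needs root privileges)."""
    subprocess.run(
        ["dd", f"if={ROOT_PARTITION}", f"of={IMAGE_FILE}", "bs=4M", "status=progress"],
        check=True,
    )

def restore_root_partition() -> None:
    """Write the saved image back to a partition of at least the same size."""
    subprocess.run(
        ["dd", f"if={IMAGE_FILE}", f"of={ROOT_PARTITION}", "bs=4M", "status=progress"],
        check=True,
    )

if __name__ == "__main__":
    image_root_partition()
```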
This seems really dependent on the distribution of games and choices faced by participants. Also the specifics of why external limits are possible but normal commitments aren’t.
Edit: this is interesting work that I have never seen explored before. My concern is about applying it too directly to the real world, not about the underlying exploration.
I think this is mixing up colloquial "know nothing" and literal "know nothing". It's impossible to identify a thing about which one knows nothing, as that identification is something about the thing. It can be wrong, and it can be very imprecise, but it's not nothing.
50/50 are the odds of A when we know nothing about A.
No. 50/50 is a reasonable universal prior, but that's both very theoretical and deeply unclear how to categorize quantum waveforms into things over which a probability is even applicable. In most real cases, 50/50 are the odds to start with when all you know is that it's common enough to come to your attention, and that it "feels" balanced whether or not it'll happen.
In other words, "undefined and inapplicable" is the probability for things you know nothing about. Almost all things you can apply probability to, you know SOMETHING about.
You add another layer of mixing literal and figurative "don't know anything" to the term "singularity". Also, don't forget to multiply by the probability that a singularity-on-relevant-factors might not have happened for the thing you're predicting.
Oh. Your model of LW readers is very different from mine - I doubt there exists anything that "the vast majority" will accept as "actual, bonafide truth". In fact, words like actual and bonafide are likely to confuse most of us, and we'll want clarification.
Then how can anyone prove, in the future, whether an AGI exists, or not?
I don't think anyone can (and I don't think they'll have to - it'll either be self-evident or irrelevant). But you said you didn't want to extrapolate to that anyway.
I think anytime you say "what is the probability that", as if probability were an objective fact or measure rather than an agent's tool for prediction (which would be framed as "what is this agent's probability assignment over ..."), you're somewhat outside of a Bayesian framework.
In my view, those are incomplete propositions - your probability assignment may be a convenience in making predictions, but it's not directly updatable. Bayesian calculations are about how to predict evidence, and how to update on that evidence. "What is the chance that this decider can solve the halting problem for this program in that timeframe" is something that can use evidence to update. Likewise "what is the chance that I will measure this constant next week and have it off by more than 10% from last week".
"What is true of the universe, in an unobservable way" is not really a question for Bayes-style probability calculations. That doesn't keep agents from having beliefs, just that there's no general mechanism for correctly making them better.
It means your model was inapplicable to the event. Careful Bayesian reasoners don't have any 0s or 1s in predictions of observations. They may have explicit separation of observation and probability, such as "all circles in Euclidean planes have pi as their ratio of circumference to their diameter", with the non-1 probability falling into "is that thing I see actually truly a circle in a flat plane?"
Likewise, it's fine to give probability 1 to "a fair die will roll integers between 1 and 6 inclusive with equal probability", and then when a 7 rolls, say "that's evidence that it's not a fair die".
Anyone who assigns a probability of 0 or 1 for a future experience is wrong. There's an infinitesimal chance that the simulation ends or your Boltzmann brain has a glitch or aliens are messing with gravity or whatever. In casual use, we often round these off, which is convenient but not strictly applicable.
Note that there's absolutely no way to GET a 0 or 1 probability in Bayesian calculations, unless it's a prior. Any sane prior can adjust arbitrarily close to 0 or 1 with sufficient observations but can't actually get all the way there - update size is proportional to surprise, so it takes a LOT of evidence to shift a tiny bit closer when it's already close to 0 or 1.
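A toy illustration of that last point, with made-up likelihoods and exact fractions so floating-point rounding doesn't hide the gap:

```python
# Repeated Bayes updates on a binary hypothesis, in exact arithmetic.
# The posterior approaches 1 but never reaches it, and each additional
# confirming observation moves it by less.
from fractions import Fraction

def update(prior: Fraction, p_obs_if_true: Fraction, p_obs_if_false: Fraction) -> Fraction:
    """One Bayes update for hypothesis H after a single observation."""
    numerator = prior * p_obs_if_true
    return numerator / (numerator + (1 - prior) * p_obs_if_false)

belief = Fraction(1, 2)                     # 50/50 prior
for i in range(1, 11):
    belief = update(belief, Fraction(9, 10), Fraction(1, 10))
    print(f"after {i:2d} observations, gap to certainty = {float(1 - belief):.3e}")
# The gap shrinks by roughly a factor of 9 per observation but never hits zero;
# only a prior of exactly 0 or 1 would be immune to further evidence.
```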
For real, limited-calculation agents and humans, one can also model a meta-credence about "is my model and the probability assignments I have vaguely close to correct", which ALSO is not 1.
I think competition is an important element to capture in the name. I'm not sure that any of sacrifice, race, or trap are defining concepts, though they usually apply. I think of it as over-focus on a few somewhat-visible (if not perfectly legible) dimensions, at the expense of harder-to-notice-analytically value dimensions. So maybe "goodharted value optimization", but that's not great either.
In my head, I tend to think of it as related to "Moneyball over-focus" - taking the wrong lesson from Billy Beane's success: he optimized for winning on a budget, not for great baseball athleticism or developing player excellence (except as part of the visible optimization). He was very successful at that, but the world lost something for it.
The antidote is Slack, which is the ultimate illegible value. I'm not sure how/whether to work that into the name.
That's a lot of text, and I wasn't able to find a particular thesis to debate or elucidate (but I didn't try all that hard). Instead, I'll react to an early statement that made me question the rigor of the exploration:
I remember how a judge pulled out an encyclopedia on cow milking 🐄 🤠 read it overnight to adjudicate a complex legal case about a farm.
That memory is from fiction, ancient history, or from a legal system very different from modern Western countries.
Proving to whom, and to what degree of credence, and avoiding what correlated measures? Standardized tests, like the SAT, are perfectly fine at showing "above average", for most purposes. They're not very fine-grained, so it's hard to tell "very slightly above average", and not very accurate for outliers, so can't distinguish between 90th and 99th percentile with much certainty. But they're just fine for showing "very likely above average" for a lot of uses.
I'm not sure there exists a formal operational definition of "general intelligence", and there's no direct measurement possible. Still, if "above average or not above average" is the criterion, most correlates are usable.
I'm having trouble following whether this categorizes the definition/concept of a soul, or the causality and content of this conception of soul. Is "sideways soul" about structure and material implementation, or about weights and connectivity, independent of substrate? WHICH factors are removed from "upwards"? "Genes" and "utility function" are VERY different dimensions, both tiny parts of what I expect create (for genes) or comprise (for utility function) a soul. What about memory? What about multiple levels of values and preferences (including meta-preferences in how to abstract into "values")?
Putting "downwards" supernatural ideas into the same framework as more logical/materialist ideas confuses me - I can't tell if that makes it a more useful model or less.
For singleton events (large-scale nuclear attack and counterattack), deception plays an important role. This isn't a problem, apparently, in dath ilan - everyone has common knowledge of others' rationality.
It's not how the game would be played between dath ilan and true aliens
This is a very important caveat. Many humans or CDT agents could be classified as “true aliens” by someone not part of their ingroup.
Good discussion. I don't think anyone (certainly not me) is arguing that consciousness isn't a physical thing ("real", in that sense). I'm arguing that "consciousness" may not be a coherent category. In the same sense that long ago, dolphins and whales were considered to be "fish", but then more fully understood to be marine mammals. Nobody EVER thought they weren't real. Only that the category was wrong.
Same with the orbiting rock called "Pluto". Nobody sane has claimed it's not real, it's just that some believe it's not a planet. "fish" and "planet" are not real, although every instance of them is real. In fact, many things that are incorrectly thought to be them are real as well. It's not about "real", it's about modeling and categorization.
"Consciousness" is similar - it's not a real thing, though every instance that's categorized (and miscategorized) that way is real. There's no underlying truth or mechanism of resolving the categorization of observable matter as "conscious" or "behavior, but not conscious" - it's just an agreement among taxonomists.
(note: personally, I find it easiest to categorize most complex behavior in brains as "conscious" - I don't actually know how it feels to be them, and don't REALLY know that they self-model in any way I could understand, but it's a fine simplification to make for my own modeling. I can't make the claim that this is objectively true, and I can't even design theoretical tests that would distinguish it from other theories. In this way, it's similar to MWI vs Copenhagen interpretations of QM - there's no testable distinction, so use whichever one fits your needs best.)
I don't know that "the AI doomer argument" is a coherent thing. At least I haven't seen an attempt to gather or summarize it in an authoritative way. In fact, it's not really an argument (as far as I've seen), it's somewhere between a vibe and a prediction.
For me, when I'm in a doomer mood, it's easy to give a high probability to the idea that humanity will be extinct fairly soon (it may take centuries to fully die out, but the path will be fully irreversible within 10-50 years, if it's not already). Note that this has been a common belief long before AI was a thing - nuclear war/winter, ecological collapse, pandemic, etc. are pretty scary, and humans are fragile.
My optimistic "argument" is really not better-formed. Humans are clever, and when they can no longer ignore a problem, they solve it. We might lose 90%+ of the current global population, and a whole lot of supply-chain and tech capability, but that's really only a few doublings lost, maybe a millennium to recover, and maybe we'll be smarter/luckier in the next cycle.
From your perspective, what do you think the argument is, in terms of thesis and support?
The great insight though is that consciousness is part of reality. It is a real phenomenon.
That is somewhat contentious. MY consciousness and internal experiences are certainly part of reality. I do not know how to show that YOUR consciousness is similar enough to say that it's real. You could be a p-zombie that does not have consciousness, even though you use words that claim you do. Or you could be conscious, but in a way that feels (to you) so alien to my own perceptions that it's misleading to use the same word.
Because we're somewhat mechanically similar, it's probable that our conscious experiences (qualia) are also similar, but we're not identical, and we have no way, even in theory, to measure which, if any, differences between us are important to that question.
In other words, consciousness is a real phenomenon, but it's not guaranteed that it's the SAME phenomenon for me as for anything else.
This uncertainty flows into your thoughts on morals - there's no testable model for which variations in local reality cause what variations in morals, so no tie from individual experience to universal "should".
Does this require some sort of enforcement mechanism to ensure that neither party puts in a bad-faith bid as a discovery mechanism for what number to seek in their real negotiations? In fact, does anyone have actual data on what negotiations are even available in most employment situations - many companies seem to have figured out how to reduce this quite a bit over the last decade or two.
What goals (for Metaculus, for observers, or for predictors) does that serve?
I worry about concepts like this being discussed FAR too generally. Some rich people deserve some of their wealth, surely. But it's hard to have the detailed discussions and compromises needed to agree on WHICH specific assets are undeserved and are therefore candidates for redistribution.
So we fall back on collective punishment and assert that, since some part of some wealth is not deserved, we are justified in taking a bunch from everyone who has any legible, take-able things. IMO, even if MOST wealth is undeserved, that doesn't justify taking anything without identifying that specific thing as correct to take.
Code link gives a 404, so I can't look and see, but I'm curious what the ratio is actually comparing. Is it the percentage of removals that were an improvement, the ratio of improved to degraded (ignoring irrelevant removals), the mean change (unlikely, since all are positive), or something else? Does >0.5 imply a benefit and <0.5 imply a harm (I'm assuming so)?
It's interesting that removing random options from A is never beneficial to A, but is also harmful to B unless B starts out with more actions than A. I presume these payouts weren't normalized to zero-sum, so that's down to the distribution of outcomes-to-actions, and who "has more control".
A lot depends on your definition of "matter". Interesting and important debates are always on margins of disagreement. The median member likely has a TON of important beliefs and activities that are uncontroversial and ignored for most things. Those things matter, and they matter more than 95% of what gets debated and focused on.
The question isn't whether the entities matter, but whether the highlighted, debated topics matter.
I think this makes a lot of sense, and it points out that even people who write about status hierarchies don't REALLY take it seriously that it's one of the top motivations for most humans.
If status and dominance hierarchies are the only known way to make most humans (say, 70% of the intrinsic-extrinsic motivation spectrum) actually cooperate, and there is an economy of scale in setting up and maintaining a hierarchy, then the current world of large organizations is pretty easy to understand.
Like, in a world where the median person is John Wentworth (“Wentworld”), I’m pretty sure there just aren’t large organizations of the sort our world has.
Yeah, Dagonworld is pretty dysfunctional - the garbage doesn't get collected, customers get yelled at when they call with a complaint, shelves get restocked rather randomly, etc.
On some level and timescale, we can't really separate economic questions from any other decisions we're interested in. "why would someone do a thing" is a pretty big question in any domain.
For this specifically, there are a couple of distinct hurdles:
- what does it cost and who's willing to scan YOU?
- Note: in no case is it actually you - it's some future version of you who's lived and died. The you that is now disappears every night and a slight modification awakes in the morning.
- what does it cost and who's willing to store YOU until someone is willing and able to run sims?
- What does it cost to run you, at what level of completeness and time factor, and who is willing and able to do so?
For all current humans, they will not be scanned. For MOST current humans, not even future iterations that identify as them will be scanned. It's extremely hard to predict the resource needs and availability to guess at what percentage of scans will be run, let alone for how long or how often. Or how often they'll be branched and reset, or any other option for what to do with such a dataset.
Agreed, but I hope I don't need a new laptop for some time. The BEST options depend on your very specific needs and expected lifecycle (do you buy for 2 years, or 5). Eliminate these in order until you get to the one you can accept (or ignore the list entirely - a lot of personal preference goes into this; one size really does not fit all):
1. M3 Macbook Air.
2. Wait ~1 year for Windows-on-ARM to fully normalize
3. Acer or Asus (or maybe Lenovo) Snapdragon X
4. Surface Laptop Snapdragon X
5. Intel laptop
6. Snapdragon X Plus laptop
I think you're using the word "real" in a confusing way. The possibility/likelihood that there are things outside your light cone or possible causal web is completely irrelevant to any experiences you have had or will ever have.
Dreams are "real", in that the dream is a thing which has an impact on your memories and perceptions. Events and specifics in dreams are "unreal", in that they have no predictive power nor causal effect on waking experiences.
Likewise your imagining/predicting things in alternate universes (or this universe outside your lightcone) - your beliefs are real, in that they impact your perceptions and actions. The specific instantiations are "unreal" because they have no causal impact on anything in the real world.
I feel like more and more LLM use is inevitable, and at least some of it is no different from the age-old (as far as "old" applies to LessWrong and online forums) problem of filtering new users who are more enthusiastic than organized, and generate a lot of spew before really slowing down and writing FOR THIS AUDIENCE (or getting mad and going away, in the case of bad fit). LLMs make it easy to increase that volume of low-value stuff.
I really want to enable the high-value uses of LLM, because I want more from many posts and I think LLMs CAN BE a good writing partner. My mental model is that binary approaches ("identify LLM-generated content") are going to fail, because the incentives are wrong, but also because it discourages the good uses.
The problem of voluminous bad posts has two main tactics for us to use against it.
1) identification and filtering (downvoting and admin intervention). This works today (though it's time-consuming and uncomfortable), and is likely to continue to work for quite some time. I haven't really played with LLM as evaluation/summary for things I read and comment on, but I think I'm going to try it. I wonder if I can get GPT to estimate how much effort went into a post...
2) assistance and encouragement of posters to think and write for this audience, and to engage with comments (which are sometimes very direct). This COULD include recommendations for LLM critiques or assistance with organization or presentation, along with warnings that LLMs can't actually predict your ideas or reason for posting - you have to prompt it well and concretely.
We need both of these, and I'm not sure the balance changes all that much in an LLM world.
Most communication questions will have different options depending on audience. Who are you communicating to, and how high-bandwidth is the discussion (lots of questions and back-and-forth with one or two people is VERY different from, say, posting on a public forum).
For your examples, it seems you're looking for one-shot outbound communication, to a relatively wide and mostly educated audience. I personally don't find the ambiguity in your examples particularly harmful, and any of them are probably acceptable.
If anyone complains or it bugs you, I'd go with one of:
- an end-note that all percentages are chance-to-win
- a VERY short descriptor like (43% win) or even (43%W).
- reduce the text rather than the quantifier - "Kamala is 54% to win" without having to say that means "slight favorite".
Thank you for saying this! It's easy to have a very limited typical-mind-fallacy view of LessWrong readers, and hearing about preferences very different from my own is extremely important.
explaining them all to the AI is almost the same as writing the post itself.
Depending on your skill with writing clear, concise English, this may be true. For many, it may be that the effort level is the same between using AI well and just writing it yourself, but the effort type is different, and the quality is improved.
I think the potential value of LLM-assisted writing is very high, but it requires similar levels of clarity and attention to detail either way. Low-effort posts will remain low-value; high-effort posts could get quite a boost.
One can take a hard anti-realism stance while still having values and beliefs about others' values. It requires more humility and acknowledgement of boundaries than most people want from their moral systems. Especially around edge cases, distant extrapolation, and counterexamples - if you forget that most of your intuitions come from some mix of evolution, social learning, and idiosyncratic brain configuration, you're likely to strongly believe untrue things.
That's awesome. One of my worries about this (which applies to most harm-reduction programs) is that I'd rather have less current-quality-LLM-generated stuff on LW overall, and making it a first-class feature makes it seem like I want more of it.
Having a very transparent not-the-same-as-a-post mechanism solves this worry very well.
Upvoted and agreed, but the ship has sailed. We need to help people use LLM article generation well, to help them focus and express their thoughts, rather than just making longer versions of a brief thesis. This is usually a multi-step process, generating and iterating on an outline, then individually generating and iterating on each section (of a paragraph or two). The iteration - human control of the conciseness and correctness of each portion - is absolutely key.
As a low-effort very small change to your prompt to demonstrate the variance, I did a one-shot with GPT 4o that specified LessWrong and added an instruction for tone.
I have recently seen a substantial increase in the proportion of postings to the LessWrong online forum that were clearly generated by AI, although not marked as such. The writing is verbose and meandering, full of "on the one hand" and "on the other", never making an assertion without also suggesting its opposite, sounding oh-so-measured and even-handed that it ends up saying nothing at all. Please write a brief article for me, written in the first person, arguing that AI tools in the current state of the art should not be used for this purpose. Keep it factual and direct, without unnecessary elaboration or verbosity.
and got
I've noticed a growing trend on the LessWrong forum where AI-generated content is becoming more prevalent, and it's not marked as such. This raises significant concerns, especially when the writing is verbose, indecisive, and ultimately devoid of substance. The hallmark of these posts is a constant balancing act—presenting multiple sides of an argument without ever committing to a clear stance. This approach, while superficially measured, often results in content that says nothing meaningful.
AI tools, as they currently stand, are excellent at generating text that mimics human writing, but they lack the ability to produce genuinely insightful or novel ideas. Their output tends to be safe, repetitive, and lacking in depth because they are designed to avoid controversy and adhere to patterns seen in their training data. This often leads to the production of content that feels empty—words strung together in a way that looks polished but fails to engage or challenge the reader.
The problem with using AI for content creation in thoughtful forums like LessWrong is that it undermines the quality of discussion. The platform thrives on clear, well-reasoned arguments that push the boundaries of rational thought. When AI-generated content is introduced into this space without proper identification, it dilutes the discourse. Readers are left sifting through noise to find genuine contributions, which can erode the overall value of the community.
In its current state, AI should not be used for generating posts on platforms that prioritize intellectual rigor and meaningful debate. Until AI can contribute something more than surface-level analysis and pseudo-argumentation, its use in these contexts should be reconsidered. Identifying AI-generated content clearly when it is used is also essential to maintain the integrity of the forum and respect the expectations of its users.
This is much better (IMO) than your initial attempt, and I believe it could be much better with more significant effort into multi-shot prompt engineering and light edits afterward.
I'd rephrase your title as "If you're going to use AI to write for you, use it as a fairly limited assistant rather than an open-ended delegation of responsibility". People need to put in the effort to clarify their own thoughts and put them clearly into text - AI is really helpful for grammar and the mechanics of writing, and not (yet) any good at conceptual presentation that's not basically a summary of something else.
I agree that it's not terribly useful beyond identifying someone's fears. Using almost any taxonomy to specify what the speaker is actually worried about lets you stop saying "infohazard" and start talking about "bad actor misuse of information" or "naive user tricked by partial (but true) information". These ARE often useful, even though the aggregate term "infohazard" is limited.
Close, but not quite. There well may be colors and gradients, but we're all very nearsighted and can't tell what anyone else is seeing, and there's no photography or ability to share the experiences and intuitions. There is no authority to tell us, no way to be sure that what we think is important is what matters to somebody else.
Measuring utility is an unsolved, and probably unsolvable, problem. It's hard to say that someone who values the cultural continuity and linguistic diversity of keeping multiple languages alive is wrong about that value, only that I don't share it.
How you aggregate stated or inferred values across different beings is the next level of unsolved (and IMO unsolvable) problem for Utilitarianism. Maybe they DO value that highly, but you should devalue their utility compared to some other person's desires.
Oh, that'd be nice - hide it for maybe 6 hours or until there are 4 voters, whichever comes first. That lets solid signals come through quickly, and weak ones settle a bit before hitting.