I think this post is very funny (disclaimer: I wrote this post).
A number of commenters (both here and on r/slatestarcodex) think it's also profound, basically because of its reference to the anti-critical-thinking position better argued in the Michael Huemer paper that I cite about halfway through the post.
The question of when to defer to experts and when to think for yourself is important. This post is fun as satire or hyperbole, though it ultimately doesn't take any real stance on the question.
I think this post is very good (note: I am the author).
Nietzsche is brought up often in different contexts related to ethics, politics, and the best way to live. This post is the best summary on the Internet of his substantive moral theory, as opposed to vague gesturing based on selected quotes. So it's useful for people who
- are interested in what Nietzsche's arguments actually are, as a result of their secondhand impressions
- have specific questions like "Why does Nietzsche think that the best people are more important?"
- want to know whether something can be well-described as "Nietzschean"
It's able to answer questions like this and describe Nietzsche's moral theory concisely because it focuses on his lines of argument and avoids any description of his metaphors or historical narratives: no references are made to the Übermensch, Last Man, the "death of God," the blond beast, or other concepts that aren't needed for an analytic account of his theory.
By "calligraphy" do you mean cursive writing?
So why don't the four states sign a compact to assign all their electoral votes in 2028 and future presidential elections to the winner of the aggregate popular vote in those four states? Would this even be legal?
It would be legal to make an agreement like this (states are authorized to appoint electors and direct their votes however they like; see Chiafalo v. Washington) but it's not enforceable in the sense that if one of the states reneges, the outcome of the presidential election won't be reversed.
lol fixed thanks
Yeah it's for the bounty. Hanson suggested that a list of links might be preferred to a printed book, at least for now, since he might want to edit the posts.
What was it called?
Brief comments on what's bad about the output:
The instruction is to write an article arguing that AI-generated posts suffer from verbosity, hedging, and unclear trains of thought. But ChatGPT makes that complaint in a single sentence in the first paragraph and then spends 6 paragraphs adding a bunch of its own arguments:
- that the "nature of conversation itself" draws value from "human experience, emotion, and authenticity" that AI content replaces with "a hollow imitation of dialogue"
- that AI content creates "an artificial sense of expertise," i.e. that a dumb take can be made to seem smarter than it is
- that the option to use AI content discourages posters from "engag[ing] deeply with the topic themselves"
It wasn't instructed to make these arguments and they aren't really defended as important.
Then lastly ChatGPT offers a "solution": that people just disclose when they're using AI assistance. But it's not explained how this actually prevents the problems above, besides, I guess, allowing people to downvote AI content or scrutinize it more. Importantly, this proposed solution wouldn't solve the concern the model was instructed to present.
So in general instructions aren't followed and the thinking/argumentation isn't clear.
I added to your prompt the instructions
Be brief and write concise prose in the style of Paul Graham. Don't hedge or repeat yourself or go on tangents.
And the output is still bad, but now mostly for the flaw (also present in your output) that ChatGPT can't resist making the complaint about "human authenticity" and "transparency/trust" when that's not what you're talking about:
I've noticed a troubling trend on online forums: a surge in posts that clearly seem to be generated by AI. These posts are verbose, meandering, and devoid of real substance. They present every argument with an opposing one, wrapped in so much "on the one hand" and "on the other" that they end up saying nothing at all.
AI in its current form shouldn’t be used for this purpose. The problem isn’t just that these posts waste time, though they do. It’s that they dilute meaningful discourse. When an AI-generated post floods a forum, it clogs the flow of genuine ideas. The writing might seem polished, but it’s hollow, a shell of real thought. And that’s a disservice to any community that values actual discussion.
Forums are places for people to exchange ideas, not for machines to simulate them. Using AI to generate content in this way erodes the authenticity of conversation. It’s a misuse of technology that contributes to the noise rather than to meaningful exchange.
AI can be powerful, but that power should be wielded carefully. Right now, the tools aren’t good enough to replace real human input, and pretending they are does more harm than good.
EDIT: Below is what it looks like if I replace "in the style of Paul Graham" with "in the style of LessWrong":
AI-generated content in online forums is a growing problem. These tools are designed to mimic human language, but they often fail to deliver meaningful or original contributions. Instead, they produce verbose, meandering prose that tries to cover all bases but ultimately says nothing. This kind of content is worse than useless—it clogs up discussions with noise, making it harder for genuine insights to emerge.
AI, in its current state, lacks the ability to form and defend coherent, novel ideas. It doesn't understand context, nuance, or the importance of taking a clear stance. The result is writing that hedges endlessly, offering a shallow imitation of thoughtful discourse but devoid of real substance. When forums are flooded with such content, it dilutes the value of the platform. It wastes the time of readers who are searching for real engagement and stifles the development of ideas that can only come from genuine human interaction.
AI tools should not be used to generate content for forums, period. They don't contribute to the quality of discussion and instead, undermine it. The promise of AI lies in its ability to augment human capabilities, not replace them in areas where human judgment, creativity, and insight are paramount. Until AI can meaningfully contribute to a discussion, we should resist the temptation to use it in contexts where clarity, originality, and human understanding are crucial.
Is this word long or short? Only say "long" or "short". The word is: {word}.
To test out Cursor for fun I asked models whether various words of different lengths were "long" and measured the relative probability of "long" vs. "short" answers to get a P(long) out of them. But when I use scrambled words of the same length and letter distribution, GPT-3.5 doesn't think any of them are long.
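For reference, a minimal sketch of how this kind of P(long) measurement can be done from first-token logprobs (assuming the openai client; the harness and parsing here are illustrative, not exactly what I ran in Cursor):

```python
# Minimal sketch: estimate P(long) from the model's first-token logprobs.
# Assumes the openai client; prompt wording follows the quick take above.
import math
from openai import OpenAI

client = OpenAI()

def p_long(word: str, model: str = "gpt-3.5-turbo") -> float:
    resp = client.chat.completions.create(
        model=model,
        messages=[{
            "role": "user",
            "content": f'Is this word long or short? Only say "long" or '
                       f'"short". The word is: {word}.',
        }],
        max_tokens=1,
        logprobs=True,
        top_logprobs=10,
    )
    top = resp.choices[0].logprobs.content[0].top_logprobs
    probs = {t.token.strip().lower(): math.exp(t.logprob) for t in top}
    p_l, p_s = probs.get("long", 0.0), probs.get("short", 0.0)
    # Renormalize over the two admissible answers
    return p_l / (p_l + p_s) if p_l + p_s > 0 else 0.5
```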
Update: I got Claude to generate many words with connotations related to long ("mile" or "anaconda" or "immeasurable") and short ("wee" or "monosyllabic" or "inconspicuous" or "infinitesimal"). It looks like the models have a slight bias toward the connotation of the word.
What's the actual probability of casting a decisive vote in a presidential election (by state)?
I remember the Gelman/Silver/Edlin "What is the probability your vote will make a difference?" (2012) methodology:
1. Let E be the number of electoral votes in your state. We estimate the probability that these are necessary for an electoral college win by computing the proportion of the 10,000 simulations for which the electoral vote margin based on all the other states is less than E, plus 1/2 the proportion of simulations for which the margin based on all other states equals E. (This last part assumes implicitly that we have no idea who would win in the event of an electoral vote tie.) [Footnote: We ignored the splitting of Nebraska’s and Maine’s electoral votes, which retrospectively turned out to be a mistake in 2008, when Obama won an electoral vote from one of Nebraska’s districts.]
2. We estimate the probability that your vote is decisive, if your state’s electoral votes are necessary, by working with the subset of the 10,000 simulations for which the electoral vote margin based on all the other states is less than or equal to E. We compute the mean M and standard deviation S of the vote margin among that subset of simulations and then compute the probability of an exact tie as the density at 0 of the Student-t distribution with 4 degrees of freedom (df), mean M, and scale S.
The product of the two probabilities above gives the probability of a decisive vote in the state.
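In code, the two steps might look roughly like this (a sketch, assuming you have arrays of simulated outcomes; scipy's Student-t density stands in for their df=4 fit):

```python
import numpy as np
from scipy import stats

def p_decisive(margins_ev, margins_state, E):
    """Two-step decisive-vote estimate from simulation output.

    margins_ev:    electoral-vote margin in each simulation, computed
                   over all states *except* yours
    margins_state: your state's popular-vote margin in each simulation
    E:             your state's number of electoral votes
    """
    margins_ev = np.asarray(margins_ev)
    margins_state = np.asarray(margins_state)
    # Step 1: P(your state's E electoral votes are necessary)
    p_necessary = (np.mean(np.abs(margins_ev) < E)
                   + 0.5 * np.mean(np.abs(margins_ev) == E))
    # Step 2: P(exact popular-vote tie | state is necessary), via the
    # density at 0 of a t(df=4) fit to the conditional vote margins
    subset = margins_state[np.abs(margins_ev) <= E]
    M, S = subset.mean(), subset.std(ddof=1)
    p_tie = stats.t.pdf(0, df=4, loc=M, scale=S)
    return p_necessary * p_tie
```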
This gives the following results for the 2008 presidential election, where they estimate that you had less than one chance in a hundred billion of deciding the election in DC, but better than a one in ten million chance in New Mexico. (For reference, 131 million people voted in the election.)
Is this basically correct?
(I guess you also have to adjust for your confidence that you are voting for the better candidate. Maybe if you think you're outside the top ~20% in "voting skill"—ability to pick the best candidate—you should abstain. See also.)
FiveThirtyEight released their prediction today that Biden currently has a 53% chance of winning the election | Tweet
The other day I asked:
Should we anticipate easy profit on Polymarket election markets this year? Its markets seem to think that
- Biden will die or otherwise withdraw from the race with 23% likelihood
- Biden will fail to be the Democratic nominee for whatever reason at 13% likelihood
- either Biden or Trump will fail to win nomination at their respective conventions with 14% likelihood
- Biden will win the election with only 34% likelihood
Even if gas fees take a few percentage points off we should expect to make money trading on some of this stuff, right (the money is only locked up for 5 months)? And maybe there are cheap ways to transfer into and out of Polymarket?
Probably worthwhile to think about this further, including ways to make leveraged bets.
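As a toy illustration of the arithmetic (the 95% credence and the 3-point fee haircut are assumed numbers, not market data):

```python
# Toy numbers: buy "Biden is the nominee" at the implied $0.87
# when your own credence is (say) 95%; the fee haircut is assumed.
price, credence = 0.87, 0.95
gross = credence / price - 1            # ~9.2% expected return
net = gross - 0.03                      # minus ~3pp of gas/transfer fees
annualized = (1 + net) ** (12 / 5) - 1  # capital locked up ~5 months
print(f"{gross:.1%} gross, {net:.1%} net, ~{annualized:.0%} annualized")
```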
I like "Could you repeat that in the same words?" so that people don't try to rephrase their point for no reason.
In addition to daydreaming, sometimes you're just thinking about the first of a series of points that your interlocutor made one after the other (a lot of rationalists talk too fast).
By "subscriber growth" in OP I meant both paid and free subscribers.
My thinking was that people subscribe after seeing posts they like, so if they get to see the body of a good post they're more likely to subscribe than if they only see the title and the paywall. But I guess if this effect mostly affects would-be free subscribers then the effect mostly matters insofar as free subscribers lead to (other) paid subscriptions.
(I say mostly since I think high view/subscriber counts are nice to have even without pay.)
Paid-only Substack posts get you money from people who are willing to pay for the posts, but reduce both (a) views on the paid posts themselves and (b) related subscriber growth (which could in theory drive longer-term profit).
So if two strategies are
- entice users with free posts but keep the best posts behind a paywall
- make the best posts free but put the worst posts behind the paywall
then, regarding (b) above, the second strategy has less risk of prematurely stunting subscriber growth, since the best posts are still free. Regarding (a), it's much less bad to lose view counts on your worst posts.
[Book Review] The 8 Mansion Murders by Takemaru Abiko
As a kid I read a lot of the Sherlock Holmes and Hercule Poirot canon. Recently I learned that there's a Japanese genre of honkaku ("orthodox") mystery novels whose gimmick is a fastidious devotion to the "fair play" principles of Golden Age detective fiction, where the author is expected to provide everything that the attentive reader needs to come up with the solution himself. It looks like a lot of these honkaku mysteries include diagrams of relevant locations, genre-savvy characters, and a puzzle-like aesthetic. A bunch have been translated by Locked Room International.
The title of The 8 Mansion Murders doesn't refer to the number of murders, but to murders committed in the "8 Mansion," a mansion designed in the shape of an 8 by the eccentric industrialist who lives there with his family (diagrams show the reader the layout). The book is pleasant and quick—it didn't feel like much over 50,000 words. Some elements feel very Japanese, like the detective's comic-relief sidekick who suffers increasingly serious physical-comedy injuries. The conclusion definitely fits the fair-play genre in that it makes sense, could be inferred from the clues, is generally ridiculous, and doesn't offer much in the way of motive.
If you like mystery novels, I would recommend reading one of these honkaku mysteries for the novelty. Maybe not this one, since there are more famous ones (this one was on libgen).
Ask LLMs for feedback on "the" rather than "my" essay/response/code, to get more critical feedback.
Seems true anecdotally, and prompting GPT-4 to give a score between 1 and 5 for ~100 poems/stories/descriptions resulted in an average score of 4.26 when prompted with "Score my ..." versus an average score of 4.0 when prompted with "Score the ..." (code).
https://x.com/panickssery/status/1792586407623393435
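A hypothetical sketch of that comparison (prompt wording and score parsing here are illustrative; the linked code is the actual source):

```python
# Compare average scores under "Score my ..." vs. "Score the ..." prompts.
from openai import OpenAI

client = OpenAI()

def score(text: str, determiner: str) -> float | None:
    prompt = (f"Score {determiner} writing between 1 and 5. "
              f"Respond with a single number.\n\n{text}")
    reply = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=3,
        temperature=0,
    ).choices[0].message.content
    try:
        return float(reply.strip())
    except ValueError:
        return None  # skip replies that aren't a bare number

texts = ["..."]  # ~100 poems/stories/descriptions in the real run
for determiner in ("my", "the"):
    scores = [s for t in texts if (s := score(t, determiner)) is not None]
    if scores:
        print(determiner, round(sum(scores) / len(scores), 2))
```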
If I understand the term "double crux" correctly, to say that something is a double crux is just to say that it is "crucial to our disagreement."
Quick Take: People should not say the word "cruxy" when already there exists the word "crucial." | Twitter
Crucial sometimes just means "important" but has a primary meaning of "decisive" or "pivotal" (it also derives from the word "crux"). This is what's meant by a "crucial battle" or "crucial role" or "crucial game (in a tournament)" and so on.
So if Alice and Bob agree that Alice will work hard on her upcoming exam, but only Bob thinks that she will fail her exam—because he thinks that she will study the wrong topics (h/t @Saul Munn)—then they might have this conversation:
Bob: You'll fail.
Alice: I won't, because I'll study hard.
Bob: That's not crucial to our disagreement.
The older nickname was "Cornell of the West." Stanford was modeled after Cornell.
This story is inspired by The Trouble With Being Born, a collection of aphorisms by the Romanian philosopher Emil Cioran (discussed more here), including the following aphorisms:
A stranger comes and tells me he has killed someone. He is not wanted by the police because no one suspects him. I am the only one who knows he is the killer. What am I to do? I lack the courage as well as the treachery (for he has entrusted me with a secret—and what a secret!) to turn him in. I feel I am his accomplice, and resign myself to being arrested and punished as such. At the same time, I tell myself this would be too ridiculous. Perhaps I shall go and denounce him all the same. And so on, until I wake up.
The interminable is the specialty of the indecisive. They cannot mark life out for their own, and still less their dreams, in which they perpetuate their hesitations, pusillanimities, scruples. They are ideally qualified for nightmare.
Here on the coast of Normandy, at this hour of the morning, I needed no one. The very gulls’ presence bothered me: I drove them off with stones. And hearing their supernatural shrieks, I realized that that was just what I wanted, that only the Sinister could soothe me, and that it was for such a confrontation that I had got up before dawn.
Large Language Models (LLMs) have demonstrated remarkable capabilities in various NLP tasks. However, previous works have shown these models are sensitive towards prompt wording, and few-shot demonstrations and their order, posing challenges to fair assessment of these models. As these models become more powerful, it becomes imperative to understand and address these limitations. In this paper, we focus on LLMs robustness on the task of multiple-choice questions -- commonly adopted task to study reasoning and fact-retrieving capability of LLMs. Investigating the sensitivity of LLMs towards the order of options in multiple-choice questions, we demonstrate a considerable performance gap of approximately 13% to 75% in LLMs on different benchmarks, when answer options are reordered, even when using demonstrations in a few-shot setting. Through a detailed analysis, we conjecture that this sensitivity arises when LLMs are uncertain about the prediction between the top-2/3 choices, and specific options placements may favor certain prediction between those top choices depending on the question caused by positional bias. We also identify patterns in top-2 choices that amplify or mitigate the model's bias toward option placement. We found that for amplifying bias, the optimal strategy involves positioning the top two choices as the first and last options. Conversely, to mitigate bias, we recommend placing these choices among the adjacent options. To validate our conjecture, we conduct various experiments and adopt two approaches to calibrate LLMs' predictions, leading to up to 8 percentage points improvement across different models and benchmarks.
Also "Benchmarking Cognitive Biases in Large Language Models as Evaluators" (Koo et al., 2023):
Order Bias is an evaluation bias we observe when a model tends to favor the model based on the order of the responses rather than their content quality. Order bias has been extensively studied (Jung et al., 2019; Wang et al., 2023a; Zheng et al., 2023), and it is well-known that state-of-the-art models are still often influenced by the ordering of the responses in their evaluations. To verify the existence of order bias, we prompt both orderings of each pair and count the evaluation as a “first order” or “last order” bias if the evaluator chooses the first ordered (or last ordered) output in both arrangements respectively.
Do non-elite groups factor into OP's analysis? I interpreted it as inter-elite veto, e.g. between the regional factions of the U.S. or between religious factions, and less about any "people who didn't go to Oxbridge and don't live in London"-type factions.
I can't think of examples where a movement that wasn't elite-led destabilized and successfully destroyed a regime, but I might be cheating in the way I define "elites" or "led."
But, as other commenters have noted, the UK government does not have structural checks and balances. In my understanding, what they have instead is a bizarrely, miraculously strong respect for precedent and consensus about what "is constitutional" despite (or maybe because of?) the lack of a written constitution. For the UK, and maybe other, less-established democracies (i.e. all of them), I'm tempted to attribute this to the "repeated game" nature of politics: when your democracy has been around long enough, you come to expect that you and the other faction will share power (roughly at 50-50 for median voter theorem reasons), so voices within your own faction start saying "well, hold on, we actually do want to keep the norms around."
The UK is also a small country, both literally, having a 4-5x smaller population than e.g. France during several centuries of Parliamentary rule before the Second Industrial Revolution, and figuratively, since they have an unusually concentrated elite that mostly goes to the same university and lives in London (whose metro area has 20% of the country's population).
Changes my view, edited the post.
Thanks for taking the time to respond; I didn't figure the post would get so much reach.
Wow, thanks for replying.
If the model has beaten GMs at all, then it can only be so weak, right? I'm glad I didn't make stronger claims than I did.
I think my questions about what humans-who-challenge-bots are like were fair, and the point about smurfing is interesting. I'd be interested in other impressions you have about those players.
Is the model's Lichess profile/game history available?
Powerful
Could refer to them in writing as "MC-effectiveness measures"
Could someone explain how Rawls's veil of ignorance justifies the kind of society he supports? (To be clear I have an SEP-level understanding and wouldn't be surprised to be misunderstanding him.)
It seems to fail at every step individually:
- At best, the support of people in the OP provides necessary but probably insufficient conditions for justice, unless he refutes all the other proposed conditions involving whatever rights, desert, etc.
- And really the conditions of the OP are actively contrary to good decision-making, e.g. the parties don't know their particular conceptions of the good (??) and are essentially self-interested...
- There's no reason to think, generally, that people disagree with John Rawls only because of their social position or psychological quirks
- There's no reason to think, specifically, that people would have the literally infinite risk aversion required to support the maximin principle.
- Even given everything, the best social setup could easily be optimized for the long-term (in consideration of future people) in a way that makes it very different (e.g. harsher for the poor living today) from the kind of egalitarian society I understand Rawls to support.
More concretely:
(A) I imagine that if Aristotle were under a thin veil of ignorance, he would just say "Well if I turn out to be born a slave then I will deserve it"; it's unfair and not very convincing to say that people would just agree with a long list of your specific ideas if not for their personal advantages.
(B) If you won the lottery and I demanded that you sell your ticket to me for $100 on the grounds that you would have, hypothetically, agreed to do this yesterday (before you knew it was a winner), you don't have to do this; the hypothetical situation doesn't actually bear on reality in this way.
Another frame is that his argument involves a bunch of provisions that seem designed to avoid common counterarguments but are otherwise arbitrary (utility monsters, utilitarianism, etc).
Here's Resolution 2712 from a few weeks ago, on "The situation in the Middle East, including the Palestinian question":
The Security Council,
(here I skip preambulatory clauses that altogether are as long as the rest of the text),
1. Demands that all parties comply with their obligations under international law, including international humanitarian law, notably with regard to the protection of civilians, especially children;
2. Calls for urgent and extended humanitarian pauses and corridors throughout the Gaza Strip for a sufficient number of days to enable, consistent with international humanitarian law, the full, rapid, safe, and unhindered humanitarian access for United Nations humanitarian agencies and their implementing partners, the International Committee of the Red Cross and other impartial humanitarian organizations, to facilitate the continuous, sufficient and unhindered provision of essential goods and services important to the well-being of civilians, especially children, throughout the Gaza Strip, including water, electricity, fuel, food, and medical supplies, as well as emergency repairs to essential infrastructure, and to enable urgent rescue and recovery efforts, including for missing children in damaged and destroyed buildings, and including the medical evacuation of sick or injured children and their care givers;
3. Calls for the immediate and unconditional release of all hostages held by Hamas and other groups, especially children, as well as ensuring immediate humanitarian access;
4. Calls on all parties to refrain from depriving the civilian population in the Gaza Strip of basic services and humanitarian assistance indispensable to their survival, consistent with international humanitarian law, which has a disproportionate impact on children, welcomes the initial, although limited, provision of humanitarian supplies to civilians in the Gaza Strip and calls for the scaling up of the provision of such supplies to meet the humanitarian needs of the civilian population, especially children;
5. Underscores the importance of coordination, humanitarian notification, and deconfliction mechanisms, to protect all medical and humanitarian staff, vehicles including ambulances, humanitarian sites, and critical infrastructure, including UN facilities, and to help facilitate the movement of aid convoys and patients, in particular sick and injured children and their care-givers;
6. Requests the Secretary-General to report orally to the Security Council on the implementation of this resolution at the next mandated meeting of the Security Council on the situation in the Middle East, and further requests the Secretary-General to identify options to effectively monitor the implementation of this resolution as a matter of prime concern;
7. Decides to remain seized of the matter.
But herd morality is not just hostile to higher men, it's hostile to all positive development in mankind in general. If you glorify everything which makes weak and weary, you trap society in a prison of its own making.
Sometimes Nietzsche will use terms like "life" in e.g. "[a] tendency hostile to life is therefore characteristic of [herd] morality." But in context this refers to the higher type (in this specific passage to the man "raised to his greatest power and splendor"). The term "anti-nature" is the same way.
This is complicated by the sense in which herd morality is considered harmful to life in an indirect way, because Nietzsche's response to Schopenhauer's challenge of life's suffering is that "it is only as an aesthetic phenomenon that existence and the world are eternally justified." So herd morality is also injurious to life generally because it hinders the Goethe/Beethoven-style aesthetic spectacle that makes life worthwhile.
Importantly, I'd say with confidence that Nietzsche's opposition to herd morality is driven only by its direct effect on the norms of higher men, without any consideration for its good or bad effects on those of the lower men.
The Übermensch is discussed as an ideal kind of higher man only in Thus Spoke Zarathustra and disappears afterward. Zarathustra is often especially obscure and the Übermensch's importance in understanding Nietzsche is overstated in popular culture compared to the broader higher type of person exemplified by actual persons like Goethe.
My first guess was that it's noise from the label ordering (some of the digits must be harder to learn than others). Ran it 10 times with the labels shuffled each time:
Still unsure.
It'll be a public good
I agree
Typo, thanks for spotting
Conditional of course
Second the recommendation for Steven Pinker's The Sense of Style. His own summary here: https://davidlabaree.com/2021/07/08/pinker-why-academics-stink-at-writing/
The guiding metaphor of classic style is seeing the world. The writer can see something that the reader has not yet noticed, and he orients the reader so she can see for herself. The purpose of writing is presentation, and its motive is disinterested truth. It succeeds when it aligns language with truth, the proof of success being clarity and simplicity. The truth can be known and is not the same as the language that reveals it; prose is a window onto the world. The writer knows the truth before putting it into words; he is not using the occasion of writing to sort out what he thinks. The writer and the reader are equals: The reader can recognize the truth when she sees it, as long as she is given an unobstructed view. And the process of directing the reader’s gaze takes the form of a conversation.
Most academic writing, in contrast, is a blend of two styles. The first is practical style, in which the writer’s goal is to satisfy a reader’s need for a particular kind of information, and the form of the communication falls into a fixed template, such as the five-paragraph student essay or the standardized structure of a scientific article. The second is a style that Thomas and Turner call self-conscious, relativistic, ironic, or postmodern, in which “the writer’s chief, if unstated, concern is to escape being convicted of philosophical naïveté about his own enterprise.”
The prime example of this is the relation between parents and children
For what it's worth, I would not be surprised if Huemer argued that children have no general obligation to obey their parents.
In these proposals, what is to stop these security forces from simply conquering anyone and everyone who isn't under the protection of one? Nothing. Security forces have no reason to fight each other to protect your right not to belong to one. And they will conquer, since the ones that don't conquer won't grow to keep pace. It is thus the same as the example given of a job offer you can't refuse, except that here the deal offered is likely terrible (since they have no reason to give you a good one).
Channeling Huemer, I'd say that the world's states are in a kind of anarchy and they don't simply gobble each other up all the time.