arjun-panickssery

Comment by Arjun Panickssery (arjun-panickssery) on To Understand History, Keep Former Population Distributions In Mind · 2025-04-23T20:47:02.180Z · LW · GW

Regarding the Russians and East Slavs more broadly, Anatoly Karlin has some napkin math that at the very least shows the huge toll that the world wars took on their populations, which barely grew or even shrank:

(8a) Russia just within its current borders, assuming otherwise analogous fertility and migration trends, would have had 261.8 million people by 2017 without the triple demographic disasters of Bolshevism, WW2, and the 1990s – that’s double its actual population of 146 million.
Source: Демографические итоги послереволюционного столетия & Демографические катастрофы ХХ века by Anatoly Vishnevsky
(8b) According to my very rough calculations, based on various sources, the population change for each of the following in their current borders between 1913/14 and 1945/46 was about as follows:
Russia – 91M/97M
Ukraine – 35M/34M
Belarus – 7.5M/7.7M
Assuming a threefold expansion in all of these populations, we could have been looking to a Russian Empire or Republic with a further ~120M fully Russified Belorussians and largely Russified Ukrainians, for a total Slavic population of almost 400M.
That’s twice bigger than the number of White Americans today, the most populous single European ethnicity, and almost as much as all of today’s Western Europe.
(8c) Total population of a hypothetical Russian Empire that also retained Central Asia and the Caucasus, and that hadn’t been bled white by commies, Nazis, and Westernizers during the course of the 20th century, would likely have been not that far off from Dmitry Mendeleev’s 1906 projection of 594 million for 2000.

Comment by Arjun Panickssery (arjun-panickssery) on Three Months In, Evaluating Three Rationalist Cases for Trump · 2025-04-18T20:28:57.729Z · LW · GW

These indices are probably not meaningful. It's easy to find news stories of ordinary people in England and Germany being arrested for their opinions or even for mocking elected officials.

German criminal law actually adds special penalties for "defaming" politicians (Defamation of persons in the political arena, Section 188, German Criminal Code) as part of a dozen free-speech limitations that would violate the First Amendment here. And truth isn't an ironclad defense the way it is here.

In England, content that's merely "grossly offensive" or "menacing" is illegal under their Communications Act 2003, and similar laws date back earlier.

There are multiple cases even in the last few months alone of these laws being used vigorously in both countries (e.g. this case in England in which someone was jailed for insulting a politician).

In contrast, in America since Brandenburg v. Ohio (1969), speech is criminal only when it is (a) intended to produce imminent lawless action and (b) likely to do so, or when it falls into a narrow category such as true threats, obscenity, or child pornography.

Comment by Arjun Panickssery (arjun-panickssery) on American College Admissions Doesn't Need to Be So Competitive · 2025-04-08T20:55:43.511Z · LW · GW

> So after removing the international students from the calculations...
Having a sizable portion of International students necessarily subsidizes the cost of higher education for domestic students

Maybe I should rephrase the sentence in OP. What I mean is that "After assuming that half of international students scored at or above 1550 and half scored below, the remaining spots are divided among domestic students in such-and-such way."

Comment by Arjun Panickssery (arjun-panickssery) on American College Admissions Doesn't Need to Be So Competitive · 2025-04-08T17:23:03.427Z · LW · GW

Why is it obvious?

Comment by Arjun Panickssery (arjun-panickssery) on American College Admissions Doesn't Need to Be So Competitive · 2025-04-08T07:10:53.417Z · LW · GW

At the start of the post I describe an argument I often hear:

But many people are under the misconception that the resulting “rat race”—the highly competitive and strenuous admissions ordeal—is the inevitable result of the limited class sizes among top schools and the strong talent in the applicant pools, and that it isn’t merely because of the reasons listed in (2). Some even go so far as to suggest that a better system would be to run a lottery for any applicant who meets a minimum “qualification” standard—under the assumption that there would be many such qualified students.

This is the argument that I'm responding to and refuting.

Comment by Arjun Panickssery (arjun-panickssery) on American College Admissions Doesn't Need to Be So Competitive · 2025-04-07T19:25:00.498Z · LW · GW

You're phrasing this as though it's rebutting some remark I made; if so, I'm not sure what remark that is. I know that admissions offices are admitting students according to an intentional system.

Comment by Arjun Panickssery (arjun-panickssery) on Thomas Kwa's Shortform · 2025-04-04T01:22:14.233Z · LW · GW

Didn't watch the video but is there a short version of this argument? France is at the 90th percentile of population sizes and also has the 4th-most nukes.

Comment by Arjun Panickssery (arjun-panickssery) on Why Have Sentence Lengths Decreased? · 2025-04-03T22:27:25.758Z · LW · GW

Shorter sentences are better. Why? Because they communicate clearly. I used to speak in long sentences. And they were abstract. Thus I was hard to understand. Now I use short sentences. Clear sentences.

It's been net-positive. It even makes my thinking clearer. Why? Because you need to deeply understand something to explain it simply.

Comment by Arjun Panickssery (arjun-panickssery) on Explaining British Naval Dominance During the Age of Sail · 2025-04-03T18:09:27.686Z · LW · GW

More object-level content: Why Have Sentence Lengths Decreased? — LessWrong

Comment by Arjun Panickssery (arjun-panickssery) on Explaining British Naval Dominance During the Age of Sail · 2025-03-29T02:48:20.463Z · LW · GW

That part encourages captains to avoid shirking in general (rather than to use aggressive tactics in particular) because it increases the costs of job loss (due to high compensation) and because there are captains in reserve that can replace them quickly.

Comment by Arjun Panickssery (arjun-panickssery) on Childhood and Education #9: School is Hell · 2025-03-08T19:54:48.717Z · LW · GW

The prior should be towards liberty

Right, I was the first strong-disagree and I disagreed because of the implied premise that homeschooling was deviant in a way that warranted a high degree of scrutiny, rather than being the natural right/default that should be restricted only in extreme cases. I figure that's why others disagreed as well.

Elaborated in this thread: https://www.lesswrong.com/posts/MJFeDGCRLwgBxkmfs/childhood-and-education-9-school-is-hell?commentId=W8FoHDkB3xAkfcwyw

Comment by Arjun Panickssery (arjun-panickssery) on Childhood and Education #9: School is Hell · 2025-03-08T19:49:10.447Z · LW · GW

This can hold true even if we grant that mandatory public school is itself abusive to children

My point doesn't have anything to do with the relative merits of public school vs homeschooling. My point is that your listed interventions aren't reasonable because they involve too much government intrusion into a parent's freedom to educate his children how he wants, for reasons based on dubious and marginal "safeguarding" grounds. My example of a different education regime was to consider a different status quo that makes the intrusiveness clearer.

I feel like all my position needs to argue for is that some children have parents/caretakers where it would be worse if they had 100% control and no accountability than if the children also spend some time outside the household in public school [emphasis mine]

This might be the crux: I'm saying that this position isn't based on any reasonable principle of government intrusion on people's lives. The government shouldn't intrude on basic parenting rights with a ton of surveillance just to see whether it can make what it thinks is a marginal improvement.

More concretely, do you think parents should have to pass a criminal background check (assuming this is what you meant by "background check") in order to homeschool, even if they retain custody of their children otherwise?

Re-reading your previous reply, I noticed this:

So, what you could do to get some of the monitoring back: have homeschooling with (e.g.) yearly check-ins with the affected children from a social worker. I don't know the details, but my guess is that some states have this and others don't.

No states use this policy and it doesn't make sense that parents should have to submit to yearly inspections of their parenting practices. It only makes sense from the point of view where homeschooling is highly deviant and basically impermissible without special dispensation, and where the government has the authority to decide in very specific terms what composes children's educations.

Comment by Arjun Panickssery (arjun-panickssery) on Childhood and Education #9: School is Hell · 2025-03-08T18:24:21.300Z · LW · GW

I strong-disagreed since I don't think any of your listed criticisms are reasonable. The implied premise is that homeschooling is deviant in a way that justifies a lot of government scrutiny, when really parents have a natural right to educate their children the way they want (with government intervention being reasonable in extreme cases that pass a high bar of scrutiny).

In particular, I think that outside of an existing norm where most students go to public school, the things you listed would be obviously unjust. Do you think that parents who fail a criminal background check shouldn't be allowed to educate their own children (given that they still have custody of their children), or that a CPS investigation should make some kind of intermediate judgment of "not abusive but not worthy to educate the children without state intervention"?

If we were under a different education regime like universal school vouchers, or just totally unfunded private education, do you think it would be reasonable to have parents' freedom to educate their children restricted for reasons like the ones you gave, or to introduce mandatory public schooling just for this kind of monitoring?

Comment by Arjun Panickssery (arjun-panickssery) on Why You Should Never Update Your Beliefs · 2024-12-16T02:17:09.745Z · LW · GW

I think this post is very funny (disclaimer: I wrote this post).

A number of commenters (both here and on r/slatestarcodex) think it's also profound, basically because of its reference to the anti-critical-thinking position better argued in the Michael Huemer paper that I cite about halfway through the post.

The question of when to defer to experts and when to think for yourself is important. This post is fun as satire or hyperbole, though it ultimately doesn't take any real stance on the question.

Comment by Arjun Panickssery (arjun-panickssery) on Nietzsche's Morality in Plain English · 2024-12-07T22:35:09.227Z · LW · GW

I think this post is very good (note: I am the author).

Nietzsche is brought up often in different contexts related to ethics, politics, and the best way to live. This post is the best summary on the Internet of his substantive moral theory, as opposed to vague gesturing based on selected quotes. So it's useful for people who

are interested in what Nietzsche's arguments are, as a result of their secondhand impressions
have specific questions like "Why does Nietzsche think that the best people are more important?"
want to know whether something can be well-described as "Nietzschean"

It's able to answer questions like this and describe Nietzsche's moral theory concisely because it focuses on his lines of argument and avoids any description of his metaphors or historical narratives: no references are made to the Ubermensch, Last Man, the "death of God," the blond beast, or other concepts that aren't needed for an analytic account of his theory.

Comment by Arjun Panickssery (arjun-panickssery) on Personal AI Planning · 2024-11-12T04:43:41.353Z · LW · GW

By "calligraphy" do you mean cursive writing?

Comment by Arjun Panickssery (arjun-panickssery) on Should CA, TX, OK, and LA merge into a giant swing state, just for elections? · 2024-11-07T11:42:30.362Z · LW · GW

So why don't the four states sign a compact to assign all their electoral votes in 2028 and future presidential elections to the winner of the aggregate popular vote in those four states? Would this even be legal?

It would be legal to make an agreement like this (states are authorized to appoint electors and direct their votes however they like; see Chiafalo v. Washington) but it's not enforceable in the sense that if one of the states reneges, the outcome of the presidential election won't be reversed.

Comment by Arjun Panickssery (arjun-panickssery) on Overcoming Bias Anthology · 2024-10-21T22:52:45.423Z · LW · GW

lol fixed thanks

Comment by Arjun Panickssery (arjun-panickssery) on Overcoming Bias Anthology · 2024-10-20T20:00:09.138Z · LW · GW

Yeah it's for the bounty. Hanson suggested that a list of links might be preferred to a printed book, at least for now, since he might want to edit the posts.

Comment by Arjun Panickssery (arjun-panickssery) on What are your favorite books or blogs that are out of print, or whose domains have expired (especially if they also aren't on LibGen/Wayback/etc, or on Amazon)? · 2024-10-16T21:35:40.917Z · LW · GW

What was it called?

Comment by Arjun Panickssery (arjun-panickssery) on Please do not use AI to write for you · 2024-08-22T05:25:57.431Z · LW · GW

Brief comments on what's bad about the output:

The instruction is to write an article arguing that AI-generated posts suffer from verbosity, hedging, and unclear trains of thought. But ChatGPT makes that complaint in a single sentence in the first paragraph and then spends 6 paragraphs adding a bunch of its own arguments:

that the "nature of conversation itself" draws value from "human experience, emotion, and authenticity" that AI content replaces with "a hollow imitation of dialogue"
that AI content creates "an artificial sense of expertise," i.e. that a dumb take can be made to seem smarter than it is
that the option to use AI content discourages posters from "engag[ing] deeply with the topic themselves"

It wasn't instructed to make these arguments and they aren't really defended as important.

Then lastly ChatGPT offers a "solution": that people just disclose when they're using AI assistance. But it's not explained how this actually prevents the problems above, besides I guess allowing people to downvote AI content or scrutinize it more. Importantly this proposed solution wouldn't solve the concern the model was instructed to present.

So in general instructions aren't followed and the thinking/argumentation isn't clear.

Comment by Arjun Panickssery (arjun-panickssery) on Please do not use AI to write for you · 2024-08-22T05:16:17.185Z · LW · GW

I added to your prompt the instructions

Be brief and write concise prose in the style of Paul Graham. Don't hedge or repeat yourself or go on tangents.

And the output is still bad, but now mostly for the flaw (also present in your output) that ChatGPT can't resist making the complaint about "human authenticity" and "transparency/trust" when that's not what you're talking about:

I've noticed a troubling trend on online forums: a surge in posts that clearly seem to be generated by AI. These posts are verbose, meandering, and devoid of real substance. They present every argument with an opposing one, wrapped in so much "on the one hand" and "on the other" that they end up saying nothing at all.
AI in its current form shouldn’t be used for this purpose. The problem isn’t just that these posts waste time, though they do. It’s that they dilute meaningful discourse. When an AI-generated post floods a forum, it clogs the flow of genuine ideas. The writing might seem polished, but it’s hollow, a shell of real thought. And that’s a disservice to any community that values actual discussion.
Forums are places for people to exchange ideas, not for machines to simulate them. Using AI to generate content in this way erodes the authenticity of conversation. It’s a misuse of technology that contributes to the noise rather than to meaningful exchange.
AI can be powerful, but that power should be wielded carefully. Right now, the tools aren’t good enough to replace real human input, and pretending they are does more harm than good.

EDIT: Below is what it looks like if I replace "in the style of Paul Graham" with "in the style of LessWrong":

AI-generated content in online forums is a growing problem. These tools are designed to mimic human language, but they often fail to deliver meaningful or original contributions. Instead, they produce verbose, meandering prose that tries to cover all bases but ultimately says nothing. This kind of content is worse than useless—it clogs up discussions with noise, making it harder for genuine insights to emerge.
AI, in its current state, lacks the ability to form and defend coherent, novel ideas. It doesn't understand context, nuance, or the importance of taking a clear stance. The result is writing that hedges endlessly, offering a shallow imitation of thoughtful discourse but devoid of real substance. When forums are flooded with such content, it dilutes the value of the platform. It wastes the time of readers who are searching for real engagement and stifles the development of ideas that can only come from genuine human interaction.
AI tools should not be used to generate content for forums, period. They don't contribute to the quality of discussion and instead, undermine it. The promise of AI lies in its ability to augment human capabilities, not replace them in areas where human judgment, creativity, and insight are paramount. Until AI can meaningfully contribute to a discussion, we should resist the temptation to use it in contexts where clarity, originality, and human understanding are crucial.

Comment by Arjun Panickssery (arjun-panickssery) on Arjun Panickssery's Shortform · 2024-07-28T07:29:43.898Z · LW · GW

Is this word long or short? Only say "long" or "short". The word is: {word}.

Code: https://github.com/ArjunPanickssery/long_short

Comment by Arjun Panickssery (arjun-panickssery) on Arjun Panickssery's Shortform · 2024-07-28T05:37:05.928Z · LW · GW

To test out Cursor for fun I asked models whether various words of different lengths were "long" and measured the relative probability of "Yes" vs "No" answers to get a P(long) out of them. But when I use scrambled words of the same length and letter distribution, GPT 3.5 doesn't think any of them are long.

Update: I got Claude to generate many words with connotations related to long ("mile" or "anaconda" or "immeasurable") and short ("wee" or "monosyllabic" or "inconspicuous" or "infinitesimal"). It looks like the models have a slight bias toward the connotation of the word.
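
For anyone who wants to try something similar, here is a minimal sketch of the two pieces described above: turning the log-probabilities of the two candidate answers into a relative P(long), and building a scrambled control word with the same length and letter counts. This is not the code from the linked repo. The function names and the example log-probabilities are assumptions for illustration, and fetching the log-probabilities from a model is left out.

```python
import math
import random


def p_long(logprob_yes: float, logprob_no: float) -> float:
    """Relative probability of the 'long' answer, given the log-probabilities
    of the two candidate answers (hypothetical inputs from some model API)."""
    # Two-way softmax, shifted by the max for numerical stability.
    m = max(logprob_yes, logprob_no)
    a = math.exp(logprob_yes - m)
    b = math.exp(logprob_no - m)
    return a / (a + b)


def scramble(word: str, rng: random.Random) -> str:
    """Shuffle the letters: same length and letter counts, different order."""
    letters = list(word)
    rng.shuffle(letters)
    return "".join(letters)


rng = random.Random(0)
word = "immeasurable"
control = scramble(word, rng)
print(word, control)

# Made-up log-probabilities: answer mass 0.6 vs 0.2 gives a relative P of 0.75.
print(round(p_long(math.log(0.6), math.log(0.2)), 3))  # 0.75
```

Normalizing over just the two answers ignores any probability mass the model puts on other tokens, which seems fine when only the relative preference matters.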

Comment by Arjun Panickssery (arjun-panickssery) on Arjun Panickssery's Shortform · 2024-07-25T21:38:30.735Z · LW · GW

What's the actual probability of casting a decisive vote in a presidential election (by state)?

I remember the Gelman/Silver/Edlin "What is the probability your vote will make a difference?" (2012) methodology:

1. Let E be the number of electoral votes in your state. We estimate the probability that these are necessary for an electoral college win by computing the proportion of the 10,000 simulations for which the electoral vote margin based on all the other states is less than E, plus 1/2 the proportion of simulations for which the margin based on all other states equals E. (This last part assumes implicitly that we have no idea who would win in the event of an electoral vote tie.) [Footnote: We ignored the splitting of Nebraska’s and Maine’s electoral votes, which retrospectively turned out to be a mistake in 2008, when Obama won an electoral vote from one of Nebraska’s districts.]
2. We estimate the probability that your vote is decisive, if your state’s electoral votes are necessary, by working with the subset of the 10,000 simulations for which the electoral vote margin based on all the other states is less than or equal to E. We compute the mean M and standard deviation S of the vote margin among that subset of simulations and then compute the probability of an exact tie as the density at 0 of the Student-t distribution with 4 degrees of freedom (df), mean M, and scale S.
The product of two probabilities above gives the probability of a decisive vote in the state.

This gives the following results for the 2008 presidential election, where they estimate that you had less than one chance in a hundred billion of deciding the election in DC, but better than a one in ten million chance in New Mexico. (For reference, 131 million people voted in the election.)

Is this basically correct?

(I guess you also have to adjust for your confidence that you are voting for the better candidate. Maybe if you think you're outside the top ~20% in "voting skill"—ability to pick the best candidate—you should abstain. See also.)
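
The two-step estimate quoted above can be sketched roughly as follows. This is my reading of the quoted description, not the authors' code. I assume each simulation is a pair (electoral-vote margin from all other states, vote margin in your state), that the electoral-vote margin is compared in absolute value, and that the state margin is measured in raw votes so that the density at 0 approximates the probability of an exact tie. The toy simulation at the bottom uses made-up numbers.

```python
import math
import random
from statistics import mean, stdev


def student_t_pdf(x: float, df: float, loc: float, scale: float) -> float:
    """Density of a location-scale Student-t distribution at x."""
    z = (x - loc) / scale
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + z * z / df) ** (-(df + 1) / 2) / scale


def p_decisive(sims, electoral_votes: int) -> float:
    """sims: list of (ev_margin_other_states, state_vote_margin) pairs."""
    n = len(sims)
    # Step 1: probability that the state's electoral votes are necessary.
    less = sum(1 for ev, _ in sims if abs(ev) < electoral_votes)
    equal = sum(1 for ev, _ in sims if abs(ev) == electoral_votes)
    p_necessary = (less + 0.5 * equal) / n
    # Step 2: probability of an exact tie in the state, conditional on step 1.
    subset = [vm for ev, vm in sims if abs(ev) <= electoral_votes]
    if len(subset) < 2:
        return 0.0
    m, s = mean(subset), stdev(subset)
    if s == 0:
        return 0.0
    p_tie = student_t_pdf(0.0, 4, m, s)
    return p_necessary * p_tie


# Toy data only: 10,000 fake simulations for a 5-electoral-vote state.
rng = random.Random(0)
sims = [(rng.randint(-200, 200), rng.gauss(20_000, 50_000)) for _ in range(10_000)]
print(p_decisive(sims, 5))
```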

Comment by Arjun Panickssery (arjun-panickssery) on Arjun Panickssery's Shortform · 2024-06-11T15:27:15.633Z · LW · GW

FiveThirtyEight released their prediction today that Biden currently has a 53% chance of winning the election | Tweet

The other day I asked:

Should we anticipate easy profit on Polymarket election markets this year? Its markets seem to think that
Biden will die or otherwise withdraw from the race with 23% likelihood
Biden will fail to be the Democratic nominee for whatever reason at 13% likelihood
either Biden or Trump will fail to win nomination at their respective conventions with 14% likelihood
Biden will win the election with only 34% likelihood
Even if gas fees take a few percentage points off we should expect to make money trading on some of this stuff, right (the money is only locked up for 5 months)? And maybe there are cheap ways to transfer into and out of Polymarket?

Probably worthwhile to think about this further, including ways to make leveraged bets.
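
As a rough sanity check on the size of the edge, here is a sketch of the expected return from buying a binary contract at the market price when you believe a different probability. It assumes the contract pays $1 if the event happens, that fees come off the stake up front, and that the 3% fee is just a placeholder. It also mixes the 34% market price from the other day with today's 53% forecast.

```python
def expected_return(price: float, prob: float, fee: float = 0.0) -> float:
    """Expected profit per dollar staked on a contract paying $1 if the event occurs.

    price: market price of one share (e.g. 0.34)
    prob:  your own probability for the event (e.g. 0.53)
    fee:   fraction of the stake lost to fees (placeholder assumption)
    """
    shares = (1.0 - fee) / price  # shares bought per dollar after fees
    return prob * shares - 1.0


print(round(expected_return(0.34, 0.53), 3))        # 0.559
print(round(expected_return(0.34, 0.53, 0.03), 3))  # 0.512
```

This ignores the variance of the bet and the cost of having the money locked up, so it only says whether the trade is positive in expectation.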

Comment by Arjun Panickssery (arjun-panickssery) on Arjun Panickssery's Shortform · 2024-06-07T15:57:02.873Z · LW · GW

Should we anticipate easy profit on Polymarket election markets this year? Its markets seem to think that

Biden will die or otherwise withdraw from the race with 23% likelihood
Biden will fail to be the Democratic nominee for whatever reason at 13% likelihood
either Biden or Trump will fail to win nomination at their respective conventions with 14% likelihood
Biden will win the election with only 34% likelihood

Even if gas fees take a few percentage points off we should expect to make money trading on some of this stuff, right (the money is only locked up for 5 months)? And maybe there are cheap ways to transfer into and out of Polymarket?

Comment by Arjun Panickssery (arjun-panickssery) on Just admit that you’ve zoned out · 2024-06-04T12:13:17.366Z · LW · GW

I like "Could you repeat that in the same words?" so that people don't try to rephrase their point for no reason.

In addition to daydreaming, sometimes you're just thinking about the first of a series of points that your interlocutor made one after the other (a lot of rationalists talk too fast).

Comment by Arjun Panickssery (arjun-panickssery) on Arjun Panickssery's Shortform · 2024-06-03T19:44:15.819Z · LW · GW

By "subscriber growth" in OP I meant both paid and free subscribers.

My thinking was that people subscribe after seeing posts they like, so if they get to see the body of a good post they're more likely to subscribe than if they only see the title and the paywall. But I guess if this effect mostly affects would-be free subscribers then the effect mostly matters insofar as free subscribers lead to (other) paid subscriptions.

(I say mostly since I think high view/subscriber counts are nice to have even without pay.)

Comment by Arjun Panickssery (arjun-panickssery) on Arjun Panickssery's Shortform · 2024-06-03T19:31:31.180Z · LW · GW

Paid-only Substack posts get you money from people who are willing to pay for the posts, but reduce both (a) views on the paid posts themselves and (b) related subscriber growth (which could in theory drive longer-term profit).

So if two strategies are

entice users with free posts but keep the best posts behind a paywall
make the best posts free but put the worst posts behind the paywall

then regarding (b) above, the second strategy has less risk of prematurely stunting subscriber growth, since the best posts are still free. Regarding (a), it's much less bad to lose view counts on your worst posts.

Comment by Arjun Panickssery (arjun-panickssery) on Arjun Panickssery's Shortform · 2024-05-24T14:07:54.763Z · LW · GW

[Book Review] The 8 Mansion Murders by Takemaru Abiko

As a kid I read a lot of the Sherlock Holmes and Hercule Poirot canon. Recently I learned that there's a Japanese genre of honkaku ("orthodox") mystery novels whose gimmick is a fastidious devotion to the "fair play" principles of Golden Age detective fiction, where the author is expected to provide everything that the attentive reader would need to come up with the solution himself. It looks like a lot of these honkaku mysteries include diagrams of relevant locations, genre-savvy characters, and a puzzle-like aesthetic. A bunch have been translated by Locked Room International.

The title of The 8 Mansion Murders doesn't refer to the number of murders, but to murders committed in the "8 Mansion," a mansion designed in the shape of an 8 by the eccentric industrialist who lives there with his family (diagrams show the reader the layout). The book is pleasant and quick—it didn't feel like much over 50,000 words. Some elements feel very Japanese, like the detective's comic-relief sidekick who suffers increasingly serious physical-comedy injuries. The conclusion definitely fits the fair-play genre in that it makes sense, could be inferred from the clues, is generally ridiculous, and doesn't offer much in the way of motive.

If you like mystery novels, I would recommend reading one of these honkaku mysteries for the novelty. Maybe not this one, since there are more famous ones (this one was on libgen).

Comment by Arjun Panickssery (arjun-panickssery) on Arjun Panickssery's Shortform · 2024-05-23T15:13:17.206Z · LW · GW

Ask LLMs for feedback on "the" rather than "my" essay/response/code, to get more critical feedback.

Seems true anecdotally, and prompting GPT-4 to give a score between 1 and 5 for ~100 poems/stories/descriptions resulted in an average score of 4.26 when prompted with "Score my ..." versus an average score of 4.0 when prompted with "Score the ..." (code).
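
A minimal sketch of how a comparison like this can be set up (not the linked code): the prompt template, the `compare_determiners` name, and the `score_fn` callable are all assumptions for illustration. `score_fn` stands in for whatever sends a prompt to a model and parses a 1-5 score out of the reply.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

# Hypothetical template; the exact wording of the original prompts is not shown above.
TEMPLATE = "Score {det} {kind} on a scale of 1 to 5.\n\n{text}"


def compare_determiners(
    items: List[Tuple[str, str]],
    score_fn: Callable[[str], float],
) -> Dict[str, float]:
    """Average score per determiner ('my' vs 'the') over the same items.

    items: (kind, text) pairs, e.g. ("poem", "...").
    score_fn: placeholder for a model call that returns a numeric score.
    """
    results = {}
    for det in ("my", "the"):
        scores = [
            score_fn(TEMPLATE.format(det=det, kind=kind, text=text))
            for kind, text in items
        ]
        results[det] = mean(scores)
    return results


# Usage with a stub scorer; swap in a real model call to run the experiment.
items = [("poem", "Roses are red..."), ("story", "Once upon a time...")]
print(compare_determiners(items, lambda prompt: 4.0))
```

Scoring the same items under both wordings keeps the comparison paired, so differences in item quality cancel out.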

Comment by Arjun Panickssery (arjun-panickssery) on Stephen Fowler's Shortform · 2024-05-20T16:02:10.059Z · LW · GW

https://x.com/panickssery/status/1792586407623393435

Comment by Arjun Panickssery (arjun-panickssery) on Arjun Panickssery's Shortform · 2024-04-29T16:30:50.528Z · LW · GW

If I understand the term "double crux" correctly, to say that something is a double crux is just to say that it is "crucial to our disagreement."

Comment by Arjun Panickssery (arjun-panickssery) on Arjun Panickssery's Shortform · 2024-04-29T14:08:56.007Z · LW · GW

Quick Take: People should not say the word "cruxy" when the word "crucial" already exists. | Twitter

Crucial sometimes just means "important" but has a primary meaning of "decisive" or "pivotal" (it also derives from the word "crux"). This is what's meant by a "crucial battle" or "crucial role" or "crucial game (in a tournament)" and so on.

So if Alice and Bob agree that Alice will work hard on her upcoming exam, but only Bob thinks that she will fail her exam—because he thinks that she will study the wrong topics (h/t @Saul Munn)—then they might have this conversation:

Bob: You'll fail.
Alice: I won't, because I'll study hard.
Bob: That's not crucial to our disagreement.

Comment by Arjun Panickssery (arjun-panickssery) on Express interest in an "FHI of the West" · 2024-04-24T02:07:16.862Z · LW · GW

The older nickname was "Cornell of the West." Stanford was modeled after Cornell.

Comment by Arjun Panickssery (arjun-panickssery) on [Fiction] A Confession · 2024-04-18T19:36:32.208Z · LW · GW

This story is inspired by The Trouble With Being Born, a collection of aphorisms by the Romanian philosopher Emil Cioran (discussed more here), including the following aphorisms:

A stranger comes and tells me he has killed someone. He is not wanted by the police because no one suspects him. I am the only one who knows he is the killer. What am I to do? I lack the courage as well as the treachery (for he has entrusted me with a secret—and what a secret!) to turn him in. I feel I am his accomplice, and resign myself to being arrested and punished as such. At the same time, I tell myself this would be too ridiculous. Perhaps I shall go and denounce him all the same. And so on, until I wake up.

The interminable is the specialty of the indecisive. They cannot mark life out for their own, and still less their dreams, in which they perpetuate their hesitations, pusillanimities, scruples. They are ideally qualified for nightmare.

Here on the coast of Normandy, at this hour of the morning, I needed no one. The very gulls’ presence bothered me: I drove them off with stones. And hearing their supernatural shrieks, I realized that that was just what I wanted, that only the Sinister could soothe me, and that it was for such a confrontation that I had got up before dawn.

Comment by Arjun Panickssery (arjun-panickssery) on Your LLM Judge may be biased · 2024-03-31T22:16:16.511Z · LW · GW

See "Large Language Models Sensitivity to The Order of Options in Multiple-Choice Questions" (Pezeshkpour and Hruschka, 2023):

Large Language Models (LLMs) have demonstrated remarkable capabilities in various NLP tasks. However, previous works have shown these models are sensitive towards prompt wording, and few-shot demonstrations and their order, posing challenges to fair assessment of these models. As these models become more powerful, it becomes imperative to understand and address these limitations. In this paper, we focus on LLMs robustness on the task of multiple-choice questions -- commonly adopted task to study reasoning and fact-retrieving capability of LLMs. Investigating the sensitivity of LLMs towards the order of options in multiple-choice questions, we demonstrate a considerable performance gap of approximately 13% to 75% in LLMs on different benchmarks, when answer options are reordered, even when using demonstrations in a few-shot setting. Through a detailed analysis, we conjecture that this sensitivity arises when LLMs are uncertain about the prediction between the top-2/3 choices, and specific options placements may favor certain prediction between those top choices depending on the question caused by positional bias. We also identify patterns in top-2 choices that amplify or mitigate the model's bias toward option placement. We found that for amplifying bias, the optimal strategy involves positioning the top two choices as the first and last options. Conversely, to mitigate bias, we recommend placing these choices among the adjacent options. To validate our conjecture, we conduct various experiments and adopt two approaches to calibrate LLMs' predictions, leading to up to 8 percentage points improvement across different models and benchmarks.

Also "Benchmarking Cognitive Biases in Large Language Models as Evaluators" (Koo et al., 2023):

Order Bias is an evaluation bias we observe when a model tends to favor the model based on the order of the responses rather than their content quality. Order bias has been extensively studied (Jung et al., 2019; Wang et al., 2023a; Zheng et al., 2023), and it is well-known that state-of-the-art models are still often influenced by the ordering of the responses in their evaluations. To verify the existence of order bias, we prompt both orderings of each pair and count the evaluation as a “first order” or “last order” bias if the evaluator chooses the first ordered (or last ordered) output in both arrangements respectively.
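
The check described in the Koo et al. excerpt can be sketched like this. It's my paraphrase of the quoted procedure rather than the paper's code, and the `judge` callable and its return convention are assumptions: it takes the two responses in display order and returns 0 if it prefers the first-shown one, 1 if it prefers the second.

```python
from typing import Callable


def order_bias(a: str, b: str, judge: Callable[[str, str], int]) -> str:
    """Run the judge on both orderings of a pair and classify the result."""
    pick_ab = judge(a, b)  # a shown first
    pick_ba = judge(b, a)  # b shown first
    if pick_ab == 0 and pick_ba == 0:
        return "first-order bias"  # picked whichever response came first
    if pick_ab == 1 and pick_ba == 1:
        return "last-order bias"   # picked whichever response came last
    return "consistent"            # same response preferred in both orderings


# Stub judges for illustration only.
always_first = lambda first, second: 0
prefers_longer = lambda first, second: 0 if len(first) >= len(second) else 1

print(order_bias("short", "a longer answer", always_first))    # first-order bias
print(order_bias("short", "a longer answer", prefers_longer))  # consistent
```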

Comment by Arjun Panickssery (arjun-panickssery) on The Worst Form Of Government (Except For Everything Else We've Tried) · 2024-03-26T22:56:49.101Z · LW · GW

Do non-elite groups factor into OP's analysis? I interpreted it as inter-elite veto, e.g. between the regional factions of the U.S. or between religious factions, and less about any "people who didn't go to Oxbridge and don't live in London"-type factions.

I can't think of examples where a movement that wasn't elite-led destabilized and successfully destroyed a regime, but I might be cheating in the way I define "elites" or "led."

Comment by Arjun Panickssery (arjun-panickssery) on The Worst Form Of Government (Except For Everything Else We've Tried) · 2024-03-26T17:26:05.228Z · LW · GW

But, as other commenters have noted, the UK government does not have structural checks and balances. In my understanding, what they have instead is a bizarrely, miraculously strong respect for precedent and consensus about what "is constitutional" despite (or maybe because of?) the lack of a written constitution. For the UK, and maybe other, less-established democracies (i.e. all of them), I'm tempted to attribute this to the "repeated game" nature of politics: when your democracy has been around long enough, you come to expect that you and the other faction will share power (roughly at 50-50 for median voter theorem reasons), so voices within your own faction start saying "well, hold on, we actually do want to keep the norms around."

The UK is also a small country, both literally, having a 4-5x smaller population than e.g. France during several centuries of Parliamentary rule before the Second Industrial Revolution, and figuratively, since they have an unusually concentrated elite that mostly goes to the same university and lives in London (whose metro area has 20% of the country's population).

https://www.youtube.com/watch?app=desktop&v=dkhcNoMNHA0

Comment by Arjun Panickssery (arjun-panickssery) on Skepticism About DeepMind's "Grandmaster-Level" Chess Without Search · 2024-02-14T19:17:20.638Z · LW · GW

Changes my view, edited the post.

Thanks for taking the time to respond; I didn't figure the post would get so much reach.

Comment by Arjun Panickssery (arjun-panickssery) on Skepticism About DeepMind's "Grandmaster-Level" Chess Without Search · 2024-02-13T20:23:59.266Z · LW · GW

Wow, thanks for replying.

If the model has beaten GMs at all, then it can only be so weak, right? I'm glad I didn't make stronger claims than I did.

I think my questions about what humans-who-challenge-bots are like were fair, and the point about smurfing is interesting. I'd be interested in other impressions you have about those players.

Is the model's Lichess profile/game history available?

Comment by Arjun Panickssery (arjun-panickssery) on More Hyphenation · 2024-02-08T03:10:39.500Z · LW · GW

Powerful

Comment by Arjun Panickssery (arjun-panickssery) on More Hyphenation · 2024-02-08T01:36:01.227Z · LW · GW

Could refer to them in writing as "MC-effectiveness measures"

Comment by Arjun Panickssery (arjun-panickssery) on Arjun Panickssery's Shortform · 2023-12-30T10:34:24.991Z · LW · GW

Could someone explain how Rawls's veil of ignorance justifies the kind of society he supports? (To be clear I have an SEP-level understanding and wouldn't be surprised to be misunderstanding him.)

It seems to fail at every step individually:

At best, the support of people in the OP provides necessary but probably insufficient conditions for justice, unless he refutes all the other proposed conditions involving whatever rights, desert, etc.
And really the conditions of the OP are actively contrary to good decision-making, e.g. you don't know your particular conception of the good (??) or that they're essentially self-interested. . .
There's no reason to think, generally, that people disagree with John Rawls only because of their social position or psychological quirks
There's no reason to think, specifically, that people would have the literally infinite risk aversion required to support the maximin principle.
Even given everything, the best social setup could easily be optimized for the long-term (in consideration of future people) in a way that makes it very different (e.g. harsher for the poor living today) from the kind of egalitarian society I understand Rawls to support.
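To make the risk-aversion point concrete, here is a rough sketch. Everything in it is my own assumption rather than anything from Rawls: the CRRA utility form, the two made-up societies, and the 50/50 chance of landing in either position. It only shows that an agent with finite risk aversion, even a very high one, can still prefer the society with the slightly worse worst-off position, which maximin never does.

```python
import math
from typing import List


def crra_utility(x: float, gamma: float) -> float:
    """Constant-relative-risk-aversion utility; gamma = 0 is risk-neutral."""
    if gamma == 1:
        return math.log(x)
    return x ** (1 - gamma) / (1 - gamma)


def expected_utility(outcomes: List[float], gamma: float) -> float:
    """Expected utility when every position is equally likely (an assumption)."""
    return sum(crra_utility(x, gamma) for x in outcomes) / len(outcomes)


def eu_choice(a: List[float], b: List[float], gamma: float) -> str:
    return "A" if expected_utility(a, gamma) > expected_utility(b, gamma) else "B"


def maximin_choice(a: List[float], b: List[float]) -> str:
    """Maximin looks only at the worst-off position."""
    return "A" if min(a) > min(b) else "B"


# Hypothetical societies: A has a slightly worse floor but a huge upside.
society_a = [11.9, 1000.0]
society_b = [12.0, 12.0]

for gamma in (0, 2, 5, 50):
    print(gamma, eu_choice(society_a, society_b, gamma))
print("maximin", maximin_choice(society_a, society_b))
```

With these numbers the expected-utility agent only switches to B once gamma is above roughly 80, and shrinking the gap between the two floors pushes that threshold higher without limit.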

More concretely:

(A) I imagine that if Aristotle were under a thin veil of ignorance, he would just say "Well if I turn out to be born a slave then I will deserve it"; it's unfair and not very convincing to say that people would just agree with a long list of your specific ideas if not for their personal advantages.

(B) If you won the lottery and I demanded that you sell your ticket to me for $100 on the grounds that you would have, hypothetically, agreed to do this yesterday (before you knew that it was a winner), you don't have to do this; the hypothetical situation doesn't actually bear on reality in this way.

Another frame is that his argument involves a bunch of provisions that seem designed to avoid common counterarguments but are otherwise arbitrary (utility monsters, utilitarianism, etc).

Comment by Arjun Panickssery (arjun-panickssery) on "Model UN Solutions" · 2023-12-09T00:58:50.019Z · LW · GW

Here's Resolution 2712 from a few weeks ago, on "The situation in the Middle East, including the Palestinian question":

The Security Council,
(here I skip preambulatory clauses that altogether are as long as the rest of the text),
1. Demands that all parties comply with their obligations under international law, including international humanitarian law, notably with regard to the protection of civilians, especially children;
2. Calls for urgent and extended humanitarian pauses and corridors throughout the Gaza Strip for a sufficient number of days to enable, consistent with international humanitarian law, the full, rapid, safe, and unhindered humanitarian access for United Nations humanitarian agencies and their implementing partners, the International Committee of the Red Cross and other impartial humanitarian organizations, to facilitate the continuous, sufficient and unhindered provision of essential goods and services important to the well-being of civilians, especially children, throughout the Gaza Strip, including water, electricity, fuel, food, and medical supplies, as well as emergency repairs to essential infrastructure, and to enable urgent rescue and recovery efforts, including for missing children in damaged and destroyed buildings, and including the medical evacuation of sick or injured children and their care givers;
3. Calls for the immediate and unconditional release of all hostages held by Hamas and other groups, especially children, as well as ensuring immediate humanitarian access;
4. Calls on all parties to refrain from depriving the civilian population in the Gaza Strip of basic services and humanitarian assistance indispensable to their survival, consistent with international humanitarian law, which has a disproportionate impact on children, welcomes the initial, although limited, provision of humanitarian supplies to civilians in the Gaza Strip and calls for the scaling up of the provision of such supplies to meet the humanitarian needs of the civilian population, especially children;
5. Underscores the importance of coordination, humanitarian notification, and deconfliction mechanisms, to protect all medical and humanitarian staff, vehicles including ambulances, humanitarian sites, and critical infrastructure, including UN facilities, and to help facilitate the movement of aid convoys and patients, in particular sick and injured children and their care-givers;
6. Requests the Secretary-General to report orally to the Security Council on the implementation of this resolution at the next mandated meeting of the Security Council on the situation in the Middle East, and further requests the Secretary-General to identify options to effectively monitor the implementation of this resolution as a matter of prime concern;
7. Decides to remain seized of the matter.

Comment by Arjun Panickssery (arjun-panickssery) on Nietzsche's Morality in Plain English · 2023-12-05T20:06:41.903Z · LW · GW

But herd morality is not just hostile to higher men, it's hostile to all positive development in mankind in general. If you glorify everything which makes weak and weary, you trap society in a prison of its own making.

Sometimes Nietzsche will use terms like "life" in e.g. "[a] tendency hostile to life is therefore characteristic of [herd] morality." But in context this refers to the higher type (in this specific passage to the man "raised to his greatest power and splendor"). The term "anti-nature" is the same way.

This is complicated by the sense in which herd morality is considered harmful to life in an indirect way, because Nietzsche's response to Schopenhauer's challenge of life's suffering is that "it is only as an aesthetic phenomenon that existence and the world are eternally justified." So herd morality is also injurious to life generally because it hinders the Goethe/Beethoven-style aesthetic spectacle that makes life worthwhile.

Importantly, I'd say with confidence that Nietzsche's opposition to herd morality is driven only by its direct effect on the norms of higher men, without any consideration for its good or bad effects on those of the lower men.

Comment by Arjun Panickssery (arjun-panickssery) on Nietzsche's Morality in Plain English · 2023-12-04T20:34:07.198Z · LW · GW

The Übermensch is discussed as an ideal kind of higher man only in Thus Spoke Zarathustra and disappears afterward. Zarathustra is often especially obscure and the Übermensch's importance in understanding Nietzsche is overstated in popular culture compared to the broader higher type of person exemplified by actual persons like Goethe.

Comment by Arjun Panickssery (arjun-panickssery) on Estimating effective dimensionality of MNIST models · 2023-11-04T17:38:45.641Z · LW · GW

My first guess was that it's noise from the label ordering (some of the digits must be harder to learn than others). Ran it 10 times with the labels shuffled each time:

Still unsure.

Comment by Arjun Panickssery (arjun-panickssery) on Dominant Assurance Contract Experiment #2: Berkeley House Dinners · 2023-07-05T13:30:49.869Z · LW · GW

It'll be a public good
