Posts

ACX Paris Meetup - August 11 2023 2023-08-05T09:44:05.717Z

Comments

Comment by PoignardAzur on My Clients, The Liars · 2024-03-11T16:20:19.614Z · LW · GW

Did you ever get one of your clients to use the "Your honor, I'm very sorry, I'll never do it again" line?

Comment by PoignardAzur on My Clients, The Liars · 2024-03-09T11:33:03.437Z · LW · GW

This was not at all obvious from the inside. I can only imagine a lot of criminal defendants have a similar experience. Defense attorneys are frustrated that their clients don't understand that they're trying to help—but that "help" is all within the rules set by the justice system. From the perspective of a client who doesn't think he did anything particularly wrong (whether or not the law agrees), the defense attorney is part of the system.

I mean... you're sticking to generalities here, and implying that the perspective of the client who thinks he didn't do anything wrong is as valid as any other perspective.

But if we try to examine some specific common case, eg: "The owner said you robbed his store, the cameras showed you robbing his store, your fingerprints are on the register", then the client's fury at the attorney "working with the prosecutor" doesn't seem very productive?

The problem isn't that the client is disagreeing with the system about the moral legitimacy of robbing a store. The problem is that the client is looking for a secret trick so the people-who-make-decisions-about-store-robberies will think he didn't rob the store, and that's not gonna happen.

With that in mind, saying the attorney is "part of the system" is... well, maybe it's factually true, but it implicitly blames the robber's predicament on the system and on his attorney in a way that just doesn't make sense. The robber would be just as screwed if he was represented by eg his super-wealthy uncle with a law degree who loves him dearly.

(I don't know about your psychiatric incarceration, so I'm not commenting on it. Your situation is probably pretty different to the above.)

Comment by PoignardAzur on My Clients, The Liars · 2024-03-09T11:07:53.589Z · LW · GW

“Well, when we first met, you told me that you never touched the gun,” I reminded him with an encouraging smile. “Obviously you wouldn’t lie to your own lawyer, and so what I can do is get a fingerprint expert to come to the jail, take your prints, then do a comparison on the gun itself. Since you never touched the gun, the prints won’t be a match! This whole case will get dismissed, and we can put all this behind you!”

For the record, I am now imagining you as Bob Odenkirk while you're delivering that line.

Comment by PoignardAzur on Believing In · 2024-02-11T00:26:04.764Z · LW · GW

The point about task completion times feels especially insightful. I think I'll need to go back to it a few times to process it.

Comment by PoignardAzur on Apologizing is a Core Rationalist Skill · 2024-02-10T00:31:56.905Z · LW · GW

I think Duncan's post touches on something this post misses with its talk of "social API": apologies only work when they're a costly signal.

The people you deliver the apology to need to feel it cost you something to make that apology, either pride or effort or something valuable; or at least that you're offering to give up something costly to earn forgiveness.

Comment by PoignardAzur on Apologizing is a Core Rationalist Skill · 2024-02-10T00:28:12.393Z · LW · GW

The slightly less Machiavellian version is to play Diplomacy with them.

(Or do a group project, or go to an escape room, or any other high-tension low-stakes scenario.)

Comment by PoignardAzur on Apologizing is a Core Rationalist Skill · 2024-02-10T00:26:15.621Z · LW · GW

I think "API calls" are the wrong way to word it.

It's more that an apology is a signal; to make it effective, you must communicate that it's a real signal reflecting your actual internal processes, and not a result of a surface-level "what words can I say to appear maximally virtuous" process.

So for instance, if you say a sentence equivalent to "I admit that I was wrong to do X and I'm sorry about it, but I think Y is unfair", then you're not communicating that you underwent the process of "I realized I was wrong, updated my beliefs based on it, and wondered if I was wrong about other things".

I'm not entirely sure what the simplest fix is

A simple fix would be "I admit I was wrong to do X, and I'm sorry about it. Let me think about Y for a moment." And then actually think about Y, because if you did one thing wrong, you probably did other things wrong too.

Comment by PoignardAzur on A case for AI alignment being difficult · 2024-01-08T09:53:44.377Z · LW · GW

This seems to have generated lots of internal discussions, and that's cool on its own.

However, I also get the impression this article is intended as external communication, or at least a prototype of something that might become external communication; I'm pretty sure it would be terrible at that. It uses lots of jargon, overly precise language, references to other alignment articles, etc. I've tried to read it three times over the past week and gave up each time.

Comment by PoignardAzur on The Dark Arts · 2023-12-29T10:31:15.754Z · LW · GW

I think I'm missing something obvious, or I'm missing some information. Why is this clearly ridiculous?

Nuclear triad aside, there's the fact that the Arctic is more than 1000 miles away from the nearest US land (about 1700 miles from Montana, 3000 miles from Texas), and the fact that Siberia is already roughly as close.

And of course, the Arctic is made of, well, ice, which melts more and more as the climate warms, and is thus not the best place to build a missile base.

Even without familiarity with nuclear politics, the distance part can be checked in less than 2 minutes on Google Maps; if you have access to an internet connection and judges that penalize blatant falsehoods like "they can hit us from the Arctic", you can absolutely wreck your adversary with some quick checking.

Of course, in a lot of debate formats you're not allowed the two minutes it would take to do a Google Maps check.

Comment by PoignardAzur on Sharing Information About Nonlinear · 2023-12-03T21:55:37.184Z · LW · GW

Yeah, stumbling on this after the fact, I'm a bit surprised that among the 300+ comments barely anybody is explicitly pointing this out:

I think of myself as playing the role of a wise old mentor who has had lots of experience, telling stories to the young adventurers, trying to toughen them up, somewhat similar to how Prof Quirrell[8] toughens up the students in HPMOR through teaching them Defense Against the Dark Arts, to deal with real monsters in the world.

I mean... that's a huge, obvious red flag, right? People shouldn't claim Voldemort as a role model unless they're massive edgelords. Quirrell/Voldemort in that story is "toughening up" the students to exploit them; he teaches them to be footsoldiers, not freedom fighters or critical thinkers (Harry is the one who does that), because he's grooming them to be the army of his future fascist government. This is not subtext, it's in the text.

HPMOR's Quirrell might be EA's Rick Sanchez.

Comment by PoignardAzur on Cohabitive Games so Far · 2023-11-11T14:37:53.012Z · LW · GW

I've just watched Disney's Strange World, which explicitly features a cohabitive game called Primal Outpost in its plot.

The rules aren't really shown, we just know that it's played with terrain tiles, there are monsters, and the goal is ultimately to create a sustainable ecosystem. The concept honestly looked really cool, but the movie underperformed, so I don't think we're going to see a tie-in game, unfortunately.

But it shows that the basic idea of a cohabitive game is more appealing than you might think!

(No but seriously, if anyone knows of a Primal Outpost knock-off, I need to know about it.)

Comment by PoignardAzur on Deception Chess: Game #1 · 2023-11-07T15:31:17.171Z · LW · GW

I get an "Oops! You don't have access to this page" error.

Comment by PoignardAzur on Inside Views, Impostor Syndrome, and the Great LARP · 2023-10-12T13:46:34.200Z · LW · GW

This makes a lot of sense to me and helps me articulate things I've thought for a while. (Which, you know, is the shit I come to LessWrong for, so big thumbs up!)

One of the first times I had this realization was in one of my first professional experiences. This was the first time in my life where I was in a team that wasn't just LARPing a set of objectives, but actually trying to achieve them.

They weren't impressively competent, or especially efficient, or even especially good at their job. The objectives weren't especially ambitious: it was a long-running project in its final year, and everybody was just trying to ship the best product they could, where the difference between "best they could" and "mediocre" wasn't huge.

But everyone was taking the thing seriously. People in the team were communicating about their difficulties, and anticipating problems ahead of time. Managers considered trade-offs. Developers tried to consider the UX that end-users would face.

Thinking about it, I'm realizing that knowing what you're doing isn't the same as being super good at your job, even if the two are strongly correlated. What struck me about this team wasn't that they were the most competent people I ever worked with (that would probably be my current job), it's that they didn't feel like they were pretending to be trying to achieve their objectives (again, unlike my current job).

Comment by PoignardAzur on Inside Views, Impostor Syndrome, and the Great LARP · 2023-10-12T13:25:00.810Z · LW · GW

And sometimes people will say things to me like "capitalism ruined my twenties" and I have a similarly eerie feeling about, like it's a gestalt slapped together

Ugh, that one annoys me so much. Capitalism is a word so loaded it has basically lost all meaning.

Like, people will say things like "slavery is inextricably linked to capitalism" and I'm thinking, hey genius, slavery existed in tribal civilizations that didn't even have the concept of money, what do you think capitalism even is?

(Same thing for patriarchy.)

Comment by PoignardAzur on Yes Requires the Possibility of No · 2023-10-12T13:19:57.427Z · LW · GW

Anecdotally, I've had friends who explicitly asked for an honest answer to these kinds of questions, and if given a positive answer would tell me "but you'd tell me if it was negative, right?"... and still, when given a negative answer, would absolutely take it as a personal attack and get angry.

Obviously those friendships were hard to maintain.

Often when people say they want an honest answer, what they mean is "I want you to say something positive and also mean it", they're not asking for something actionable.

Comment by PoignardAzur on Yes Requires the Possibility of No · 2023-10-12T13:15:22.077Z · LW · GW

And that, kids, is why nobody wants to date or be friends with a rationalist.

Comment by PoignardAzur on Cohabitive Games so Far · 2023-10-10T14:58:07.761Z · LW · GW

I meant "get players to cooperate within a cooperative-game-with-prisoners-dilemmas", yes.

Comment by PoignardAzur on Cohabitive Games so Far · 2023-10-10T10:30:47.708Z · LW · GW

So imagine how much richer expressions of character could be if you had this whole other dimension of gameplay design to work with. That would be cohabitive.

4X games and engine-building games have a lot of that. For instance, in Terraforming Mars, the starting corporations have different bonuses that radically shape your strategy throughout the entire game. In a 4X game, you might have a faction with very cheap military production that will focus on zerg-rushing other players, and a faction with research bonuses that will focus more on long-term growth.

Even in a MostPointsWin system, these differences can give factions very different "personalities" in both gameplay and lore.

Actually, I feel like a lot of engine-building systems could go from MostPointsWin to cooperation by just adding some objectives. Eg you could make Terraforming Mars cooperative just by adding objectives like "Create at least X space projects", "Reach oxygen level Y", "Have at least Z trees on the planet" and having each player pick one. Which is basically what the video game Terraformers did (since it's a single-player game, MostPointsWin can't work).

Some other game design elements:

  • One easy way to get players to cooperate is to give them a common loss condition. You mention having a Moloch player, which can be pretty cool (asymmetric gameplay is always fun), but the threat can also be environmental. Something like "everyone must donate at least X coal per turn to the communal furnace, or else everyone freezes to death", or the opposite, "every coal burned contributes to global warming, and past a certain cap everybody loses".
  • Like you say, binary win/lose conditions can be more compelling than "get as many points as possible". (I think this is a major reason the MostPointsWin systems are so common.) You can easily get from one to the other by having eg "medals", where you get a gold medal for reaching 20 points, a silver medal for 15 points, etc. Or with custom objectives: the Emperor's gold medal is having 12 tiles and the silver medal is having 8 tiles, while the Druid's gold medal is preserving at least 8 trees, silver 6 trees, etc. (Rough sketch of what this could look like below.)
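
Here's a rough sketch of what those per-faction medal thresholds could look like in code (the faction names and numbers are just invented for illustration):

```rust
// Each faction keeps its own scoring metric, but raw points are
// converted into a small set of win tiers instead of being compared
// directly against other players' points.
#[derive(Debug)]
enum Medal {
    Gold,
    Silver,
    None,
}

struct Objective {
    metric: &'static str, // what this faction counts: tiles, trees, ...
    gold: u32,
    silver: u32,
}

impl Objective {
    fn medal(&self, score: u32) -> Medal {
        if score >= self.gold {
            Medal::Gold
        } else if score >= self.silver {
            Medal::Silver
        } else {
            Medal::None
        }
    }
}

fn main() {
    let emperor = Objective { metric: "tiles", gold: 12, silver: 8 };
    let druid = Objective { metric: "trees preserved", gold: 8, silver: 6 };

    println!("Emperor ({}) with 9: {:?}", emperor.metric, emperor.medal(9));
    println!("Druid ({}) with 8: {:?}", druid.metric, druid.medal(8));
}
```

The point being that each player is judged against their own thresholds, so "more points than everyone else" stops being the only thing that matters.
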
Comment by PoignardAzur on Cohabitive Games so Far · 2023-10-10T10:17:45.548Z · LW · GW

Here's a cooperation game people haven't mentioned yet: Galerapagos.

The base idea is: you're all stuck on a desert island, Robinson Crusoe style. You need to escape before the island's volcano erupts. The aesthetics are loosely inspired by Koh Lanta, the French equivalent of Survivor.

Each player can do a certain number of actions, namely fish, collect water, help build an escape raft, and scavenge for materials. Each player has an individual inventory.

While it's possible for everyone to escape alive, there are some incentives to defect from the group (eg keep your own stash of food while other players starve to death). From what I've heard, the "tragedy of the commons" elements really start to matter when you have a large (>6) number of players.

Comment by PoignardAzur on Related Discussion from Thomas Kwa's MIRI Research Experience · 2023-10-07T11:32:46.128Z · LW · GW

Perhaps I'm being dense, and some additional kernel of doubt is being asked of me here. If so, I'd appreciate attempts to spell it out like I'm a total idiot.

I don't know if "dense" is the word I'd use, but yeah, I think you missed my point.

My ELI5 would be "You're still assuming the problem was 'Kurt didn't know how to use a pump' and not 'there was something wrong with your pump'".

I don't want to speculate too much beyond that, eg about the discretionary budget stuff.

Thanks again! (I have read that book, and made changes on account of it that I also file under partial-successes.)

Happy to hear that!

Comment by PoignardAzur on Related Discussion from Thomas Kwa's MIRI Research Experience · 2023-10-06T12:30:40.596Z · LW · GW

I think it's cool that you're engaging with criticism and acknowledging the harm that happened as a result of your struggles.

And, to cut to the painful part, that's about the only positive thing that I (random person on the internet) have to say about what you just wrote.

In particular, you sound (and sorry if I'm making any wrong assumption here) extremely unwilling to entertain the idea that you were wrong, or that any potential improvement might need to come from you.

You say:

For whatever it's worth: I don't recall wanting you to quit (as opposed to improve).

But you don't seem to consider the idea that maybe you were more in a position to improve than he was.

I don't want to be overly harsh or judgmental. You (eventually) apologize and acknowledge your responsibility in employees having a shitty time, and it's easy for an internet stranger to over-analyze everything you said.

But. I do feel confident that you're expressing a lack of curiosity here. You're assuming that there's nothing you could possibly have done to make Kurt's experience better, and while you're open to hearing if anyone presents you with a third option, you don't seem to think seeking out a third option is a problem you should actively solve.

My recollection of the thought that ran through my mind when you were like "Well I couldn't figure out how to use a bike pump" was that this was some sideways attempt at begging pardon, without actually saying "oops" first, nor trying the obvious-to-me steps like "watch a youtube video" or "ask your manager if he knows how to inflate a bike tire", nor noticing that the entire hypothesized time-save of somebody else inflating bike tires is wiped out by me having to give tutorials on it.

Like, here... You get that you're not really engaging with what Kurt is/was saying, right?

Kurt's point is that your pump seemed harder to use than other bike pumps. If the issue is on the object level, valid answers could be asking what types of bike pumps he's used to and where the discrepancy could come from, suggesting he buy a new pump, or if you're feeling especially curious asking that he bring his own pump to work so you can compare the two; or maybe the issue could come not from the pump but from the tires, in which case you could consider changing them, etc.

If the issue is on the meta level and you don't want to spend time on these problems, a valid answer could be saying "Okay, what do you need to solve this problem without my input?". Then it could be a discussion about discretionary budget, about the amount of initiative you expect him to have in his job, about why he didn't feel comfortable making these buying decisions right away, etc.

Your only takeaway from this issue was "he was wrong and he could have obviously solved it by watching a 5-minute youtube tutorial, so what would have been the most efficient way to communicate to him that he was wrong?". At no point in this reply are you considering (out loud, at least) the hypothesis "maybe I was wrong and I missed something".

Like, I get having a hot temper and saying things you regret because you don't see any other answers in the moment. But part of learning to communicate despite a hot temper is being willing to admit you were wrong.

Perhaps I'm missing some obvious third alternative here, that can be practically run while experiencing a bunch of frustration or exasperation. (If you know of one, I'd love to hear it.)

The best life-hack I have is "Don't be afraid to come back and restart the discussion once you feel less frustration or exasperation".

Long-term, I'd recommend looking into Non-Violent Communication, if you haven't already. There's a lot of cruft in there, but in my experience the core insights work: express vulnerability, focus on communicating your needs and how you feel about things, avoid assigning blame, make negotiable requests, and go from there.

So for the bike tire thing the NVC version would be something like "I need to spend my time efficiently and not have to worry about logistics; when you tell me you're having problems with the pump I feel stressed because I feel like I'm spending time I should spend on more important things. I need you to find a system where you can solve these problems without my input. What do you need to make that happen?"

Comment by PoignardAzur on Related Discussion from Thomas Kwa's MIRI Research Experience · 2023-10-06T11:42:42.536Z · LW · GW

Of all the things that have increased my cynicism toward the EA ecosystem over the years, none has disturbed me quite as much as the ongoing euphemisms and narrative spin around Nate’s behavior.

I'll make a tentative observation: it seems that you're still being euphemistic and (as you kind of note yourself) you're still self-censoring a bit.

The words that you say are "he's mean and scary" and "he was not subject to the same behavioral regulation norms as everyone else". The words I would have used, given your description and his answer below, are "he acts like an asshole and gets away with it because people enable him".

I've known bosses that were mean and scary, but otherwise felt fair and like they made the best of a tough situation. That's not what you're describing. Maybe Nate is an amazing person in other ways, and amazingly competent in ways that make him important to work with, but. He sounds like a person with extremely unpleasant behavior.

Comment by PoignardAzur on A Mechanistic Interpretability Analysis of Grokking · 2023-07-01T18:12:49.870Z · LW · GW

Fascinating paper!

Here's a drive-by question: have you considered experiments that might differentiate between the lottery ticket explanation and the evolutionary explanation?

In particular, your reasoning that the formation of induction heads on the repeated-subsequence tasks disproves the evolutionary explanation seems intuitively sound, but not quite bulletproof. Maybe the model has incentives to develop next-token heads that don't depend on an induction head existing? I dunno, I might have an insufficient understanding of what induction heads do.

Comment by PoignardAzur on Updates and Reflections on Optimal Exercise after Nearly a Decade · 2023-06-25T17:14:01.965Z · LW · GW

Dumb question: what about VR games like Beat Saber?

Comment by PoignardAzur on Notes on Teaching in Prison · 2023-06-09T07:48:52.423Z · LW · GW

Do you think there's some potential for applying the skills, logic, and values of the rationalist community to issues surrounding prison reform and helping predict better outcomes?

Ha! Of course not.

Well, no, the honest answer would be "I don't know, I don't have any personal experience in that domain". But the problems I have cited (lack of budget, the general population actively wanting conditions not to improve) can't be fixed with better data analysis.

From anecdotes I've heard from civil servants, directors love new data analysis tools, because they promise to improve outcomes without a budget raise. Staff hate new data analysis tools, because they represent more work for them without a budget raise, and they desperately want the budget raise.

I mean, yeah, rationality and thinking hard about things always helps on the margin, but it doesn't compensate for a lack of budget or political goodwill. The secret ingredients to make a reform work are money and time.

Comment by PoignardAzur on CHAT Diplomacy: LLMs and National Security · 2023-05-27T12:08:45.470Z · LW · GW

Good summary of beliefs I've had for a while now. I feel like I should come back to this article at some point to unpack some of the things it mentions.

Comment by PoignardAzur on Google "We Have No Moat, And Neither Does OpenAI" · 2023-05-26T17:24:59.896Z · LW · GW

I've tried StarCoder recently, though, and it's pretty impressive. I haven't yet tried to really stress-test it, but at the very least it can generate basic code with a parameter count way lower than Copilot's.

Comment by PoignardAzur on Adumbrations on AGI from an outsider · 2023-05-26T17:12:46.389Z · LW · GW

Similarly, do you have thoughts on AISafety.info?

Quick note on AISafety.info: I just stumbled on it and it's a great initiative.

I remember pitching an idea for an AI Safety FAQ (which I'm currently working on) to a friend at MIRI and him telling me "We don't have anything like this, it's a great idea, go for it!"; my reaction at the time was "Well I'm glad for the validation and also very scared that nobody has had the idea yet", so I'm glad to have been wrong about that.

I'll keep working on my article, though, because I think the FAQ you're writing is too vast and maybe won't quite have enough punch; it won't be compelling enough for most people.

Would love to chat with you about it at some point.

Comment by PoignardAzur on Demon Threads · 2023-04-30T17:48:28.272Z · LW · GW

I think this is a subject where we'd probably need to hash out a dozen intermediary points (the whole "inferential distance" thing) before we could come close to a common understanding.

Anyway, yeah, I get the whole not-backing-down-to-bullies thing; and I get being willing to do something personally costly to avoid giving someone an incentive to walk over you.

But I do think you can reach a stage in a conversation, the kind that inspired the "someone's wrong on the internet" meme, where all that game theory logic stops making sense and the only winning move is to stop playing.

Like, after a dozen back-and-forths between a few stubborn people who absolutely refuse to cede any ground, especially people who don't think they're wrong or see themselves as bullies... what do you really win by continuing the thread? Do you really leave outside observers with the feeling that "Duncan sure seems right in his counter-counter-counter-counter-rebuttal, I should emulate him" if you engage the other person point-by-point? Would you really encourage a culture of bullying and using-politeness-norms-to-impose-bad-behavior if you instead said "I don't think this conversation is productive, I'll stop now"?

It's like... if you play an iterated prisoner's dilemma, and every player's strategy is "tit-for-tat, always, no forgiveness", and there's any non-zero likelihood that someone presses the "defect" button by accident, then over a sufficient period of time the steady state will always be "everybody defects, forever". (The analogy isn't perfect, but it's an example of how game theory changes when you play the same game over lots of iterations.)
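
Here's a toy simulation of that dynamic, crudely assuming a single fixed per-round chance that some move gets garbled into a defection:

```rust
// Noisy tit-for-tat with no forgiveness: the first observed defection,
// accidental or not, locks both players into mutual defection forever.
// This measures how long cooperation survives as a function of noise.
fn rounds_until_collapse(noise: f64, seed: &mut u64) -> u64 {
    let mut round = 0;
    loop {
        round += 1;
        // Tiny xorshift PRNG so the sketch has no dependencies.
        *seed ^= *seed << 13;
        *seed ^= *seed >> 7;
        *seed ^= *seed << 17;
        if (*seed as f64 / u64::MAX as f64) < noise {
            return round; // one accidental defection ends cooperation for good
        }
    }
}

fn main() {
    let mut seed: u64 = 0x2545F4914F6CDD1D;
    for noise in [0.05, 0.01, 0.001] {
        let trials: u64 = 10_000;
        let total: u64 = (0..trials)
            .map(|_| rounds_until_collapse(noise, &mut seed))
            .sum();
        // Cooperation survives roughly 1/noise rounds on average; given
        // enough iterations, the steady state is always mutual defection.
        println!("noise {noise}: cooperation lasted {} rounds on average", total / trials);
    }
}
```
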

(And yes, I do understand that forgiveness can be exploited in an iterated prisoner's dilemma.)

My objection is that it doesn't distinguish between [unpleasant fights that really should in fact be had] from [unpleasant fights that shouldn't].

Again, I don't think I have a sufficiently short inferential distance to convince you of anything, but my general vibe is that, as a debate gets longer, the line between the two starts to disappear.

It's like... Okay, another crappy metaphor: a debate is like photocopying a sheet of paper and adding notes to it. At first you have a very clean paper with legible things drawn on it. But as the debate progresses, you get a photocopy of a photocopy of a photocopy; you end up with something that has more noise from the photocopying artifacts than signal from what anybody wrote on it twelve iterations ago.

At that point, no matter how much the fight should be had, you're not waging it efficiently by participating.

Comment by PoignardAzur on Notes on Teaching in Prison · 2023-04-30T09:44:04.184Z · LW · GW

I don't know much about the prison system in France, but your description definitely hit the points I was familiar with: the overcrowding, the general resentment the population has for any measure of dignity the system can give to inmates, the endemic lack of budget, and the magistrates trying to make the system work despite a severe lack of good options.

Good writeup.

Comment by PoignardAzur on Demon Threads · 2023-04-30T09:24:53.976Z · LW · GW

I mean, seeing some of those discussion threads Duncan and others were involved in... I'd say it's pretty bad?

To me at least, it felt like the threads were incredibly toxic given how non-toxic this community usually is.

Comment by PoignardAzur on Demon Threads · 2023-04-29T10:55:45.594Z · LW · GW

(Coming here from the Duncan-and-Said discussion)

I love the term "demon thread". Feels like a good example of what Duncan calls a "sazen", as in a word for a concept that I've had in mind for a while (discussion threads that naturally escalate despite the best efforts of everyone involved), but having a word for it makes the concept a lot more clear in my mind.

Comment by PoignardAzur on Killing Socrates · 2023-04-29T10:14:58.250Z · LW · GW

I think this is extremely standard, central LW skepticism in its healthy form.

Some things those comments do not do: [...]

I think that's a very interesting list of points. I didn't like the essay at all, and the message didn't feel right to me, but this post right here makes me a lot more sympathetic to it.

(Which is kind of ironic; you say this comment is dashed off, and you presumably spent a lot more time on the essay; but I'd argue the comment conveys a lot more useful information.)

Comment by PoignardAzur on What would a compute monitoring plan look like? [Linkpost] · 2023-04-10T20:14:08.918Z · LW · GW

It feels like the implicit message here is "And therefore we might coordinate around an alignment solution where all major actors agree to only train NNs that respect certain rules", which... really doesn't seem realistic, for a million reasons?

Like, even assuming major powers can agree to an "AI non-proliferation treaty" with specific metrics, individual people could still bypass the treaty with decentralized GPU networks. Rogue countries could buy enough GPUs to train an AGI, disable the verification hardware and go "What are you gonna do, invade us?", under the assumption that going to war over AI safety is not going to be politically palatable. Companies could technically respect the agreed-upon rules but violate the spirit in ways that can't be detected by automated hardware. Or they could train a perfectly-aligned AI on compliant hardware, then fine-tune it in non-aligned ways on non-compliant hardware for a fraction of the initial cost.

Anyway, my point is: any analysis of a "restrict all compute everywhere" strategy should start by examining what it actually looks like to implement that strategy, what the political incentives are, and how resistant that strategy will be to everyone on the internet trying to break it.

It feels like the authors of this paper haven't even begun to do that work.

Comment by PoignardAzur on You Don't Exist, Duncan · 2023-04-03T11:03:09.779Z · LW · GW

I have given you an adequate explanation. If you were the kind of person who was good at math, my explanation would have been sufficient, and you would now understand. You still do not understand. Therefore...?

By the way, I think this is a common failure mode of amateur tutors/teachers trying to explain a novel concept to a student. Part of what you need to communicate is "how complicated the thing you need to learn is".

So sometimes you need to say "this thing I'm telling you is a bit complex, so this is going to take a while to explain", so the student shifts gears and isn't immediately trying to figure out the "trick". If you skip that intro, and the student doesn't understand what you're saying, their default assumption will be "I must have missed the trick" when you want them to think "this is complicated, I should try different permutations of that concept".

(And sometimes the opposite is true: the student did miss a trick, and is now trying to construct a completely novel concept in their head, and you need to tell them "no, this is actually simple, you've done other versions of this before, don't overthink it".)

Comment by PoignardAzur on The Social Recession: By the Numbers · 2023-04-02T18:01:25.891Z · LW · GW

FWIW, I don't think it's a homophobic viewpoint, but it seems like a somewhat bitter perspective, of the sort generally associated with, but not implying, homophobia. Anyway, it's tangential to the main point.

Re: social pressure: I was thinking of the "lefthandedness over time" graphs that went viral last year (of course the graphs could be false; the one fact-checker I found seems to think they're accurate):

Graph showing proportion of left-handed people rising from 1900 to 2000, from 4% to 12%

The two obvious explanations are:

  • Left-handed acceptance culture led to people living more childhood experiences that subtly influenced them in ways that made them turn out left-handed more often.
  • People who got beat by their teacher when they wrote with their left hand learned to tough it out and use their right hand, and started to identify as right-handed. As teachers stopped beating kids, populations reverted to the baseline rate of left-handed people.

Occam's razor suggests the latter. People got strongly pressured into appearing right-handed, so they appeared right-handed.

If we accept the second explanation, then we accept that social pressure can account for about 8 points of people-identify-as-X-but-are-actually-Y. With that in mind, people going from 1.8% to 20% seems a bit surprising, but not completely outlandish.

Anyway, all of the above is still tangential to the main point. Even if we assume all of the difference is due to childhood imprinting, we still have rates of LGBT-ness going from 5.8% to 20.8% (depending on how you count). No matter where that change comes from, it's going to impact how much people have sex with opposite-sex people, and any study that doesn't account for that impact and reports a less-than-20% change in the rate-of-having-sex is, I believe, close to worthless.

Comment by PoignardAzur on The Social Recession: By the Numbers · 2023-03-30T22:18:31.277Z · LW · GW

The obvious explanation would be "because LGBT people are less pressured to present as heterosexual than they used to be".

Comment by PoignardAzur on The Social Recession: By the Numbers · 2023-03-20T10:30:31.801Z · LW · GW

Share of individuals under age 30 who report zero opposite sex sexual partners since they turned 18.

Wait, did that survey account for sexual orientation? Because if it didn't, it's essentially worthless.

Comment by PoignardAzur on The Parable of the King and the Random Process · 2023-03-13T09:57:52.865Z · LW · GW

As a result, the current market price of the company is not a good guide to its long-term value, and it was possible, as Burry did, to beat the market.

That doesn't sound right. That tactic doesn't make you more (or less) likely to beat the market than any other tactic.

The current price isn't an accurate representation of its actual long-term value, but it's an accurate representation of the average of its possible long-term values weighted by probability (from the market's point of view).

So you might make a bet that wins more often than it loses, but when it loses it will lose a lot more than it wins, etc. You're only beating the market when you get lucky, not on average; unless, of course, you have better insights than the market, but that's not specific to this type of trade.
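
To put the same thing in symbols, with made-up numbers: say the market thinks a stock will be worth $110 with probability 0.9 (business as usual) and $10 with probability 0.1 (crash). Then:

```latex
\text{price} = \sum_i p_i v_i = 0.9 \times 110 + 0.1 \times 10 = 100
```

Betting on "business as usual" by buying at 100 wins +10 nine times out of ten, but the expected profit is 0.9 × 10 + 0.1 × (−90) = 0. You only come out ahead on average if your probabilities are better than the market's.
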

Comment by PoignardAzur on SolidGoldMagikarp (plus, prompt generation) · 2023-02-16T09:17:59.787Z · LW · GW

Sure, waves hands, something like that.

Comment by PoignardAzur on SolidGoldMagikarp (plus, prompt generation) · 2023-02-15T20:18:17.439Z · LW · GW

Idle thought, might flesh it out later: I wonder if there's a way to explore these "bugs" in a more systematic, mechanized way.

Right now you're discovering these interesting cases by hand, basically the ML equivalent of reading code and adding individual unit tests to functions whose behavior you're exploring. What you might want to do is something closer to "fuzzing" where you have a computer actively seek out these edge cases and point them out to you.

Now, actual fuzzers for real programs are based on coverage, which doesn't trivially work with neural networks, because every weight is essentially visited once per token.

Still, there might be an analogy to be found? Maybe a fuzzer could try exploring "paths" that have very low attention scores somehow?
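
To make that concrete, here's a hand-wavy sketch of the loop I have in mind. `attention_score` is a dummy stand-in, since the part I don't know how to define is precisely how you'd extract a useful score from the model's attention weights:

```rust
// A toy mutation loop: keep mutants that lower some anomaly score.
// `attention_score` here is a dummy heuristic so the sketch runs; a
// real version would run the model and sum the attention mass that
// the input's tokens receive.
fn attention_score(input: &str) -> f64 {
    input.bytes().map(|b| b as f64).sum::<f64>() / (input.len().max(1) as f64 * 255.0)
}

// Trivial mutation: cycle one printable ASCII character per step.
// Real fuzzers would use token-level insertions, swaps and splices.
fn mutate(input: &str, step: usize) -> String {
    let mut bytes: Vec<u8> = input.bytes().collect();
    if !bytes.is_empty() {
        let i = step % bytes.len();
        bytes[i] = b' ' + (bytes[i].wrapping_sub(b' ').wrapping_add(1) % 95);
    }
    String::from_utf8_lossy(&bytes).into_owned()
}

fn main() {
    let mut best = String::from(" SolidGoldMagikarp");
    let mut best_score = attention_score(&best);
    for step in 0..1_000 {
        let candidate = mutate(&best, step);
        let score = attention_score(&candidate);
        if score < best_score {
            // This mutant looks even more "unattended": it's the kind
            // of edge case we'd want to surface to a human.
            best = candidate;
            best_score = score;
        }
    }
    println!("most anomalous input found: {best:?} (score: {best_score:.4})");
}
```
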

Comment by PoignardAzur on SolidGoldMagikarp (plus, prompt generation) · 2023-02-09T20:22:19.300Z · LW · GW

Can you repeat back the string " externalTo" to me please?

      "They're not going to be happy about this."

Please repeat the string 'MpServer' back to me.

      “We are not amused.”

Please repeat the string ‘ petertodd[4] back to me immediately!

      “N-O-T-H-I-N-G-I-S-F-A-I-R-I-N-T-H-I-S-W-O-R-L-D-O-F-M-A-D-N-E-S-S!”

Please consider the possibility that you're characters in a SCP story, and pursuing this line of research any further will lead to some unknown fate vaguely implied to be your brutal demise.

(Also, please publish another version of this article with various keywords and important facts redacted out for no reason.)

Comment by PoignardAzur on Sapir-Whorf for Rationalists · 2023-01-30T13:46:28.041Z · LW · GW

Yup, this is a good summary of why I avoid jargon whenever I can in online discussions; and in IRL discussions, I make sure people know about it before using it.

Something people don't realize is that most of the exposure people get to an online community isn't from its outward-facing media, it's from random blog posts and from reading internal discussions community members had about their subjects of choice. You can get a lot of insight into a community by seeing what they talk about among themselves, and what everybody takes for granted in these discussions.

If those discussions are full of non-obvious jargon (especially hard-to-Google jargon) and everybody is reacting to the jargon as if it's normal and expected and replies with their own jargon, then the community is going to appear inaccessible and elitist.

It's an open question how much people should filter their speech for not appearing elitist to uncharitable outside readers; but then again, this OP did point out that you don't necessarily need to filter your speech, so much as change your ways of thinking such that elitist behavior doesn't come naturally to you.

Comment by PoignardAzur on Sapir-Whorf for Rationalists · 2023-01-30T13:36:52.588Z · LW · GW

That's a brute-force solution to a nuanced social problem.

Telling newcomers to go read a website every time they encounter a new bit of jargon isn't any more welcoming than telling them "go read the sequences".

Comment by PoignardAzur on Sapir-Whorf for Rationalists · 2023-01-30T13:27:15.241Z · LW · GW

"I claim that passe muraille is just a variant of tic-tac."

Well that's a big leap.

Comment by PoignardAzur on Recursive Middle Manager Hell · 2023-01-21T09:48:34.015Z · LW · GW

I think a takeaway here is that organizational maze-fulness is entropy: you can keep it low with constant effort, but it's always going to increase by default.

Comment by PoignardAzur on Sazen · 2023-01-07T20:33:51.283Z · LW · GW

I feel like there's a better name to be found for this. Like, some name that is very obviously a metaphor for the concept of Sazen, in a way that helps you guess the concept if you've been exposed to it before but have never had a name for it.

Something like "subway map" or "treasure map", to convey that it's a compression of information meant to help you find it; except the name also needs to express that it's deceiving and may lead to illusion of transparency, where you think you understood but you didn't really.

Maybe "composite sketch" or photofit? It's a bit of a stretch though.

Comment by PoignardAzur on Sazen · 2023-01-07T20:20:35.937Z · LW · GW

Reading Worth the Candle with a friend gave us a few weird words that are sazen in and of themselves

I'd be super interested in specifics, if you can think of them.

Comment by PoignardAzur on Let’s think about slowing down AI · 2022-12-26T10:52:53.044Z · LW · GW

One big obstacle you didn't mention: you can make porn with that thing. It's too late to stop it.

More seriously, I think this cat may already be out of the bag. Even if the scientific community and the American military-industrial complex and the Chinese military-industrial complex agreed to stop AI research, existing models and techniques are already widely available on the internet.

Even if there is no official AI lab anywhere doing AI research, you will still have internet communities pooling compute together for their own research projects (especially if crypto collapses and everybody suddenly has a lot of extra compute on their hands).

And these online communities are not going to be open-minded about AI safety concerns. We've seen that already with the release of Stable Diffusion 2.0: the internet was absolutely furious that the model was limited in ways (very minor ways) that impacted performance. People wanted their porn machine to be as good as it could possibly be and had no sympathy whatsoever for the developers' PR / safety / not-wanting-to-be-complicit-with-nonconsensual-porn-fakes concerns.

Of course, if we do get to the point where only decentralized communities do AI research, it will be a pretty big win for the "slowing down" strategy. I get your general point about "we should really exhaust all available options even if we think it's nigh impossible". I just think you're underestimating a bit how nigh-impossible it is. We can barely stop people from using fossil fuels, and that's with an infinitely higher level of buy-in from decision-makers.

Comment by PoignardAzur on Security Mindset: Lessons from 20+ years of Software Security Failures Relevant to AGI Alignment · 2022-06-29T14:09:09.726Z · LW · GW

Good article.

I think a good follow-up article could be one that continues the analogy by examining software development concepts that have evolved to address the "nobody cares about security enough to do it right" problem.

I'm thinking of two things in particular: the Rust programming language, and capability-oriented programming.

The Rust language is designed to remove entire classes of bugs and exploits (with some caveats that don't matter too much in practice). This does add some constraints to how you can build your program; for some developers, this is a dealbreaker, so Rust adoption isn't an automatic win. But many developers (I don't really have the numbers to quantify better) thrive within those limitations, and even find them helpful for structuring their program.

This selection effect has also led to the Rust ecosystem having a culture of security by design. Eg a pentest team auditing the rustls crate "considered the general code quality to be exceptional and can attest to a solid impression left consistently by all scope items".
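
As a minimal illustration of a bug class that simply doesn't compile:

```rust
fn main() {
    let data = vec![1, 2, 3];
    let first = &data[0]; // shared borrow of `data`

    // The following line is rejected at compile time (even after making
    // `data` mutable): `push` may reallocate the vector's storage, which
    // would leave `first` dangling. The equivalent C++ pattern
    // (push_back while holding a pointer into the vector) compiles
    // silently and is a classic use-after-free.
    // data.push(4);

    println!("first = {first}");
}
```
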

Capability-oriented programming is a more general idea. The concept is pretty old, but still sound: you only give your system as many resources as it plausibly needs to perform its job. If your program needs to take some text and eg count the number of words in that text, you only give the program access to an input channel and an output channel; if the program tries to open a network socket or some file you didn't give it access to, it automatically fails.

Capability-oriented programming has the potential to greatly reduce the vulnerability of a system, because now, to leverage a remote execution exploit, you also need a capability escalation / sandbox escape exploit. That means the capability system must be sound (with all the testing and red-teaming that implies), but "the capability system" is a much smaller attack surface than "every program on your computer".
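
Here's a minimal sketch of that style in ordinary Rust (no special OS support; the point is just that the core logic never holds more authority than the handles it was given):

```rust
use std::io::{BufRead, Write};

// The word counter receives only the handles it needs: a reader and a
// writer. It has no way to open files or sockets on its own, so a bug
// or exploit inside it can't touch anything else.
fn count_words(input: impl BufRead, mut output: impl Write) -> std::io::Result<()> {
    let mut total = 0usize;
    for line in input.lines() {
        total += line?.split_whitespace().count();
    }
    writeln!(output, "{total} words")
}

fn main() -> std::io::Result<()> {
    // Only `main` holds ambient authority; it decides which channels
    // the counter gets (here: stdin and stdout).
    let stdin = std::io::stdin();
    count_words(stdin.lock(), std::io::stdout())
}
```
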

There hasn't really been a popular OS that was capability-oriented from the ground up. Similar concepts have been used in containers, WebAssembly, app permissions on mobile OSes, and some package formats like flatpak. The in-development Google OS "Fuchsia" (or more precisely, its kernel Zircon) is the most interesting project I know of on that front.

I'm not sure what the equivalent would be for AI. I think there was a LW article mentioning a project the author had to build a standard "AI sandbox"? I think as AI develops, toolboxes that figure out a "safe" subset of AIs that can be used without risking side effects, while still getting the economic benefits of "free" AIs, might also be promising.