Comments

Comment by Michael Roe (michael-roe) on DeepSeek beats o1-preview on math, ties on coding; will release weights · 2024-11-26T11:45:41.366Z · LW · GW

That’s interesting, if true. Maybe the tokeniser was trained on a dataset that had been filtered for dirty words.

Comment by Michael Roe (michael-roe) on DeepSeek beats o1-preview on math, ties on coding; will release weights · 2024-11-26T11:42:27.658Z · LW · GW

I suppose we might worry that LLMs might learn to do RLHF evasion this way - human evaluator sees Chinese characters they don’t understand, assumes it’s ok, and then the LLM learns you can look acceptable to humans by writing it in Chinese.

Some old books (which are almost certainly in the training set) used Latin for the dirty bits. Translations of Sanskrit poetry, and various works by that reprobate Richard Burton, do this.

Comment by Michael Roe (michael-roe) on DeepSeek beats o1-preview on math, ties on coding; will release weights · 2024-11-26T11:38:29.566Z · LW · GW

As someone who, in a previous job, got to go to a lot of meetings where the European Commission was seeking input about standardising or regulating something - humans also often do the thing where they just use the English word in the middle of a sentence in another language, when they can’t think what the word is. Often with an associated facial expression / body language to indicate to the person they’re speaking to “sorry, couldn’t think of the right word”. Also done by people speaking English, whose first language isn’t English, dropping into their own language for a word or two. If you’ve been the editor of e.g. an ISO standard, fixing these up in the proposed text is such fun.

So, it doesn’t surprise me at all that LLMs do this.


I have, weirdly, seen LLMs put a single Chinese word in the middle of English text … and consulting a dictionary reveals that it was, in fact, the right word, just in Chinese.

Comment by Michael Roe (michael-roe) on Crosspost: Developing the middle ground on polarized topics · 2024-11-26T10:24:14.935Z · LW · GW

I will take “actually, it’s even more complicated” as a reasonable response. Yes, it probably is.

Comment by Michael Roe (michael-roe) on Crosspost: Developing the middle ground on polarized topics · 2024-11-25T17:31:04.896Z · LW · GW

Candidate explanations for some specific person being trans could as easily be that they are sexually averse, rather than that they are turned on by presenting as their preferred gender. Compare anorexia nervosa, which might have some parallel with some cases of gender identity disorder. If the patient is worrying about being gender non-conforming in the same way that an anorexic worries that they’re fat, then Blanchard is just completely wrong about what the condition even is in that case.

Comment by Michael Roe (michael-roe) on Crosspost: Developing the middle ground on polarized topics · 2024-11-25T17:24:49.163Z · LW · GW

This might be a good (if controversial) example of “the reality is more complicated than typical simplifications, and it matters what your oversimplification is leaving out”.

And Blanchard’s account of autogynephilia is more nuanced than most people’s second-hand version of it. Like, e.g. Blanchard doesn’t think trans men have AGP, and doesn’t think trans women who are attracted to men have AGP.

So, we might, say…

Oversimplification 1: Even Blanchard didn’t try to apply his theory to trans men or trans women attracted to men.

Oversimplification 2: Bisexuals exist. Many trans women report their sexual orientation changing when they start taking hormones. The correlation between having AGP and being attracted to women can’t be as 100% as Blanchard appears to believe it is.

Oversimplification 3: Looks like Blanchard only identified two subtypes of trans person, and completely missed some of the other subtypes.

Oversimplification 4: Do heterosexual cisgender women have AGP? (Cf. comments by Aella, eigenrobot etc.) If straight cisgender women also like being attractive in the same way as (some) trans women do, it becomes somewhat doubtful that it’s a pathology.

Comment by Michael Roe (michael-roe) on Which things were you surprised to learn are not metaphors? · 2024-11-25T17:04:40.909Z · LW · GW

To add to the differences between people:


I can choose to see mental images actually overlaid over my field of vision, or somehow in a separate space.


The obvious question someone might ask: can you trace an overlaid mental image? The problem is registration - if my eyes move, the overlaid mental image can shift relative to an actual, perceived, sheet of paper. Easier to do a side by side copy than trace.

Comment by Michael Roe (michael-roe) on Boring & straightforward trauma explanation · 2024-11-11T22:10:20.083Z · LW · GW

I think there might be other aspects to trauma, though. Some possible candidates:


- memories feel as if they are “tagged” with an emotion, in a way that memories normally aren’t

- depletion of some kind of mental resource; not sure what to call it, so I won’t be too specific about exactly what is depleted

Comment by Michael Roe (michael-roe) on Boring & straightforward trauma explanation · 2024-11-11T21:37:29.588Z · LW · GW

One of the ideas in Cognitive Behavioral Therapy is you might be treating as dangerous something that actually isn’t dangerous (and don’t learn that it’s safe because you’re avoiding it).

So the account you’re giving here seems to be fairly standard.


On the other hand: some things actually are dangerous.

Comment by Michael Roe (michael-roe) on What is the alpha in one bit of evidence? · 2024-10-23T10:03:35.221Z · LW · GW

In any case, as a researcher currently working in this area, I am putting a big bet on moderate badness happening (in that I could be working on something else, and my time has value).

Comment by Michael Roe (michael-roe) on What is the alpha in one bit of evidence? · 2024-10-23T09:59:32.833Z · LW · GW

Also, there is counterparty risk if you bet on everyone dying.


(Yeah, yeah, you can bet on something like other people’s belief in the impending apocalypse going up before it actually happens).

“Rapid takeoff” hypotheses are particularly hard to bet on.

Comment by Michael Roe (michael-roe) on If I wanted to spend WAY more on AI, what would I spend it on? · 2024-10-22T22:18:57.760Z · LW · GW

If I was going to play this game with an AI, I’d also feed it my genomic data, which would reveal I have a version of the HLA genes that makes me more likely to develop autoimmune diseases.

Comment by Michael Roe (michael-roe) on If I wanted to spend WAY more on AI, what would I spend it on? · 2024-10-22T22:01:01.045Z · LW · GW

Probably, if some AI were to recommend additional blood testing I could manage to persuade the actual medical professionals to do it. A recent conversation went something like this:


Me: “can I have my thyroid levels checked please? And the consultant endocrinologist said he’d like to see a liver function test done next time I give a blood sample.”

Nurse (taking my blood sample and pulling my medical record up in the computer) “you take carbimazole right?”

Me: “yes”

Nurse (ticking boxes on a form on the computer) “… and full blood panel, and electrolytes…”

Probably wouldn’t be hard to get suggestions from an AI added to the list.

Comment by Michael Roe (michael-roe) on If I wanted to spend WAY more on AI, what would I spend it on? · 2024-10-22T21:39:15.933Z · LW · GW

Things I might spend more money on, if there were better AIs to spend it on:


1. I am currently having a lot of blood tests done, with a genuine qualified medical doctor interpreting the results. Just for fun, I can see if AI gives a similar interpretation of the test results (it’s not bad).

Suppose we had AI that was actually better than human doctors, and cheaper. (Sounds like that might be here real soon, to be honest). I would probably pay money for that.


2. Some work things I am doing involve formally proving correctness of software. AI is not there, quite yet. If it was, I could probably get DARPA to pay the license fee for it, assuming the cost isn’t absolutely astronomical.


Etc.


On the other hand, this would imply that most doctors, and mathematicians, are out of work.

Comment by Michael Roe (michael-roe) on What actual bad outcome has "ethics-based" RLHF AI Alignment already prevented? · 2024-10-19T18:00:31.514Z · LW · GW

https://www.bbc.co.uk/news/technology-67012224

Comment by Michael Roe (michael-roe) on What actual bad outcome has "ethics-based" RLHF AI Alignment already prevented? · 2024-10-19T17:55:55.083Z · LW · GW

Replika, I think.

Comment by Michael Roe (michael-roe) on What actual bad outcome has "ethics-based" RLHF AI Alignment already prevented? · 2024-10-19T14:33:53.439Z · LW · GW

“self-reported data from demons is questionable for at least two reasons”—Scott Alexander.

He was actually talking about Internal Family Systems, but you could probably be skeptical about what malign AIs are telling you, too.

Comment by Michael Roe (michael-roe) on What actual bad outcome has "ethics-based" RLHF AI Alignment already prevented? · 2024-10-19T14:21:41.865Z · LW · GW

Well, we had that guy who tried to assassinate the Queen of England with a crossbow because his AI girlfriend told him to. That was clearly a harm to him, and could have been one for the Queen.


We don’t know how much more “But the AI told me to kill Trump” we’d have with less alignment, but it’s a reasonable guess (given the Replika datapoint) that it might not be zero.

Comment by Michael Roe (michael-roe) on Bitter lessons about lucid dreaming · 2024-10-19T10:23:06.409Z · LW · GW

Discussing sleep paralysis might be an infohazard…


The times I’ve entered sleep paralysis it hasn’t bothered me, as I knew what it was.

Comment by Michael Roe (michael-roe) on Bitter lessons about lucid dreaming · 2024-10-18T16:54:43.707Z · LW · GW

And then you get the people who are like, “Great! I’m lucid! Now I shall cast one of those demon summoning spells from Vajrayana Buddhism.”

Comment by Michael Roe (michael-roe) on Bitter lessons about lucid dreaming · 2024-10-18T16:50:31.279Z · LW · GW

Lucid dreaming is often like being Sigourney Weaver in Alien while also being on hospital sedatives. (You are, in fact, actually asleep, so it’s kind of a miracle you can reason at all and not the least bit surprising that you feel a bit groggy; also, dreams can be nightmarish).


Why people choose to do this for fun is an interesting question.


You do get people who think they might get into lucid dreaming, then they read the dream diaries of some of the experienced lucid dreamers, and then are like “OMG, I never, ever, want to experience that.”

Comment by Michael Roe (michael-roe) on Bitter lessons about lucid dreaming · 2024-10-17T19:04:17.269Z · LW · GW

Well, it’s an interesting question whether there might be more efficient ways to do it.


Lucid nightmares are quite a good way of exposing you to real-seeming dangers without actually dying. 

Comment by Michael Roe (michael-roe) on Bitter lessons about lucid dreaming · 2024-10-17T12:58:51.743Z · LW · GW

Reading this article, I have just realised that a dream I had last night came from reading one of those test cases where people try to bypass the guardrails on LLMs. Only the dream was taken from the innocuous part of the prompt.


At this rate, I’m going to be having dreams about turning Lemsip(*) into meth.


(*) UK cold remedy. Contains pseudoephedrine.

Comment by Michael Roe (michael-roe) on Bitter lessons about lucid dreaming · 2024-10-17T11:02:55.448Z · LW · GW

Chöd in a lucid dream if you’re feeling brave.

Like transform into Vajrayogini and invite the demons to devour your corpse, etc.

Comment by Michael Roe (michael-roe) on Bitter lessons about lucid dreaming · 2024-10-17T11:01:14.464Z · LW · GW

And then there’s the thing where you dispel the entire dream-universe and are just there in a black formless void.

Comment by Michael Roe (michael-roe) on Bitter lessons about lucid dreaming · 2024-10-17T11:00:04.292Z · LW · GW

Hmm… but, for example, stabilising a dream is kind of like a meditation, and one of the many ways you can transform your body in a dream is basically a body scan meditation from hatha yoga.

Comment by Michael Roe (michael-roe) on Bitter lessons about lucid dreaming · 2024-10-17T09:46:43.442Z · LW · GW

Given the significance of lucid dreaming in Buddhist practice (Six Yogas of Naropa, etc.), realising that having a lucid dream just for sexual purposes is kind of pointless may lead to you realising that it’s kind of pointless in waking life too. Many of those guys were monks…

Comment by Michael Roe (michael-roe) on Bitter lessons about lucid dreaming · 2024-10-17T09:43:09.258Z · LW · GW

I’m not sure about (10).


Whenever someone has a theory that it’s impossible to do thing X in a dream, the regular lucid dreamers will provide a counterexample by deliberately doing X in their next dream.


Computers, clocks, and written text can behave weirdly in dreams. Really, it’s the same things that generative AI has difficulty with, possibly for information-theory reasons.

Comment by Michael Roe (michael-roe) on Bitter lessons about lucid dreaming · 2024-10-17T09:39:10.859Z · LW · GW

A possible benefit: the regulation of your own emotion that you do to keep a dream stable (even when alarming things are happening in it) may help you keep your emotion stable in the waking state too.

Comment by Michael Roe (michael-roe) on Bitter lessons about lucid dreaming · 2024-10-17T09:32:27.922Z · LW · GW

I can lucid dream, and I kind of agree here. Sure, lucid dreaming is possible, but why would you do that?

Re (3), a dream you can completely control tells you nothing you didn’t know already. There is some scope for controlling the dream enough to, in effect, set up a question, and then not control the result.


There’s a running joke in the lucid dreaming community that the first thing everyone tries is either flying or sex. It’s only when you get to #3 on their list of things they want to do that it becomes at all interesting.

Comment by Michael Roe (michael-roe) on sarahconstantin's Shortform · 2024-10-08T16:54:34.048Z · LW · GW

Some psychiatry textbooks classify “overvalued ideas” as distinct from psychotic delusions.


Depending on how wide you make the definition, a whole rag-bag of diagnoses from the DSM-5 are overvalued ideas (e.g., anorexia nervosa, where the overvalued idea is being fat).

Comment by Michael Roe (michael-roe) on Extended Interview with Zhukeepa on Religion · 2024-10-06T17:55:21.417Z · LW · GW

Possibly similar dilemma with e.g. UK political parties, who generally have a rule that publicly supporting another party’s candidate will get you expelled.

An individual party member, on the other hand, may well support the party’s platform in general, but think that one particular candidate is an idiot who is unfit to hold political office - but is not permitted to say so.

(There is a joke about the Whitby Goth Weekend that everyone thinks half the bands are rubbish, but there is no consensus on which half that is. Something similar seems to hold for Labour Party supporters.)

Comment by Michael Roe (michael-roe) on Extended Interview with Zhukeepa on Religion · 2024-10-06T17:48:20.000Z · LW · GW

An organisation such as the Catholic Church primarily wants to perpetuate its own existence, so of course the official doctrine is that they are The One True Church.

An individual Catholic, on the other hand, might genuinely believe that the benefits of religion are also available from other suppliers.

Comment by Michael Roe (michael-roe) on "Slow" takeoff is a terrible term for "maybe even faster takeoff, actually" · 2024-09-30T18:37:04.325Z · LW · GW

COVID-19 killed, idk, tens of millions worldwide rather than hundreds of millions.

 

But consider that an example of a (biological) virus takeoff of the order of months.

 

So the question for AGI takeoff ... death rate growing more rapidly than the COVID-19 pandemic, or slower?

Comment by Michael Roe (michael-roe) on "Slow" takeoff is a terrible term for "maybe even faster takeoff, actually" · 2024-09-30T18:32:50.403Z · LW · GW

Takeoff speed could be measured by e.g. the time between the first mass casualty incident that kills thousands of people vs the first mass casualty incident that kills hundreds of millions.
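To make that metric concrete, here is a minimal sketch, assuming you have a list of (date, deaths) records for incidents. The function names, the record layout and the two thresholds (1,000 and 100,000,000) are my own illustrative choices, not an established definition.

```python
from datetime import date
from typing import List, Optional, Tuple

# Hypothetical record layout: (date of incident, number of deaths).
Incident = Tuple[date, int]


def first_crossing(incidents: List[Incident], threshold: int) -> Optional[date]:
    """Date of the earliest incident with at least `threshold` deaths, if any."""
    dates = [day for day, deaths in incidents if deaths >= threshold]
    return min(dates) if dates else None


def takeoff_interval_days(
    incidents: List[Incident],
    low: int = 1_000,          # "kills thousands" (assumed cut-off)
    high: int = 100_000_000,   # "kills hundreds of millions" (assumed cut-off)
) -> Optional[int]:
    """Days between the first incident over `low` and the first over `high`.

    Returns None if either threshold has not been crossed yet.
    """
    start = first_crossing(incidents, low)
    end = first_crossing(incidents, high)
    if start is None or end is None:
        return None
    return (end - start).days
```

Because any incident over the high threshold is also over the low one, the interval can never be negative; a single incident that jumps straight to hundreds of millions gives an interval of zero days.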

Comment by Michael Roe (michael-roe) on "Slow" takeoff is a terrible term for "maybe even faster takeoff, actually" · 2024-09-30T18:23:42.674Z · LW · GW

(This bit isn't serious) "I mean, a days long takeoff leaves you with loads of time for the hypersonic missiles to destroy all of Meta's datacenters."

Comment by Michael Roe (michael-roe) on "Slow" takeoff is a terrible term for "maybe even faster takeoff, actually" · 2024-09-30T18:21:24.779Z · LW · GW

Minutes long takeoff...

 

[By comparison, I forget the reference but there is a paper estimating how quickly a computer virus could destroy most of the Internet. About 15 minutes, if I recall correctly.]

Comment by Michael Roe (michael-roe) on What is SB 1047 *for*? · 2024-09-06T12:17:02.173Z · LW · GW

e.g. After the mass casualty incident...

"You told the government that you had a shutdown procedure, but you didn't, and hundreds of people died because you knowingly lied to the government."

Comment by Michael Roe (michael-roe) on What is SB 1047 *for*? · 2024-09-06T12:14:13.190Z · LW · GW

My personal view on how it might help:

 

  1. Meta will probably carry on being as negligent as ever, even with SB 1047.
  2. When/if the first mass casualty incident happens, SB 1047 makes it easier for Meta to be successfully sued.
  3. After that, AI companies become more careful.

Comment by Michael Roe (michael-roe) on Book Review: What Even Is Gender? · 2024-09-04T18:32:36.124Z · LW · GW

On the one hand, we encounter a lot of arguments about gender that seem, to me, to be philosophically bad. Maybe a good source of reasoning fallacies you might be able to spot in other contexts too.

 

On the other hand, the more I think about it, the less I care about the object level issue. It seems inevitable that there are going to be various sorts of statistical outliers and hard to classify cases, and really is it that big a deal?

I know one person who is intersex, and I know because they're involved in rights activism and they told me. Probably couldn't tell otherwise. Could well be some other people I know are intersex and haven't told me. Maybe they don't even know themselves, as it appears that this information was frequently withheld by doctors. Shrug.

 

Also: if you take gender as chromosomal sex, then (a) the aforementioned person is totally genuinely both XX and XY because they have mosaic chromosomes; and (b) it seems really strange for your gender to sometimes be something that you, yourself, do not know.

Comment by Michael Roe (michael-roe) on Akash's Shortform · 2024-09-03T08:06:14.657Z · LW · GW

A financial conflict of interest is a wondrous thing...

Comment by Michael Roe (michael-roe) on "Deception Genre" What Books are like Project Lawful? · 2024-08-29T17:55:53.768Z · LW · GW

"Okay, Beatrice. There was no alien, and the flash of light you saw in the sky wasn't a UFO. Swamp gas from a weather balloon was trapped in a thermal pocket and refracted the light from Venus." -- Men in Black

Comment by Michael Roe (michael-roe) on "Deception Genre" What Books are like Project Lawful? · 2024-08-29T17:49:04.617Z · LW · GW

The TV series "Dark Skies" ... in which the US Government is orchestrating a coverup about the involvement of giant prawns from outer space in the Roswell incident, the JFK assassination, the shootdown of Gary Powers's US spyplane, etc.

Comment by Michael Roe (michael-roe) on "Deception Genre" What Books are like Project Lawful? · 2024-08-29T17:31:02.929Z · LW · GW

I agree that Vernor Vinge's A Deepness in the Sky is an example.

 

Almost but not quite an example: Edmund Cooper's The Overman Culture. It is obvious to the reader from the outset that the characters cannot be when and where they think they are (evacuated from London during World War 2). Maybe not enough deceiver's perspective to count.

 

Also not quite: Gene Wolfe's The Book of the New Sun.

Comment by Michael Roe (michael-roe) on Liability regimes for AI · 2024-08-28T21:20:47.528Z · LW · GW

This is pretty much why many people thought that the term "Open Source" was a betrayal of the objectives of the Free Software movement.

"Free as in free speech, not free beer" has implications that "well, you can read the source" lacks.

Comment by Michael Roe (michael-roe) on O O's Shortform · 2024-08-28T13:40:09.552Z · LW · GW

Yeah, many of the issues are the same:

 

*RLHF can be jailbroken with prompts, so you can get it to tell you a sexy story or a recipe for methamphetamine. If we ever get to a point where LLMs know truly dangerous things, they'll tell you those, too.

*Open source weights are fundamentally insecure, because you can finetune out the guardrails. Sexy stories, meth, or whatever.

 

The good thing about the War on Horny:

  • Probably doesn't really matter, so not much harm done when people get LLMs to write porn
  • Turns out, lots of people want to read porn (surprise! who would have guessed?) so there are lots of attackers trying to bypass the guardrails
  • This gives us good advance warning that the guardrails are worthless

Comment by Michael Roe (michael-roe) on Liability regimes for AI · 2024-08-20T10:46:26.495Z · LW · GW

Also note that Open Source precludes doing this ...

 

The basic Open Source deal is that absolutely anyone can take the product and do whatever they like with it, without paying the supplier anything.

 

So

  • The vendor cannot prevent the customer doing something bad with the product (if there is a line of code that says "don't do this bad thing", then the customer can just delete it)
  • The vendor also cannot charge the customer an insurance premium based on how likely the customer is to do something bad with the product

... which would suggest that Open Source is only viable in areas where there isn't much third party liability.

Comment by Michael Roe (michael-roe) on Raemon's Shortform · 2024-08-15T12:55:31.569Z · LW · GW

With a nod to the recent CrowdStrike incident ... if your AI is sending out packets to other people's Windows systems, and bricking them about as fast as it can send packets through its Ethernet interface, your liability may be expanding rapidly. An additional billion dollars for each hour you don't shut it down sounds possible.

Comment by Michael Roe (michael-roe) on Raemon's Shortform · 2024-08-15T12:44:00.922Z · LW · GW

If your AI is doing something that's causing harm to third parties that you are legally liable for ... chances are, whatever it is doing, it is doing it at Internet speeds, and even small delays are going to be very, very expensive.

 

I am imagining that all the people who got harmed after the first minute or so after the AI went rogue are going to be pointing at SB1047 to argue that you are negligent, and therefore liable for whatever bad thing it did.

Comment by Michael Roe (michael-roe) on A computational complexity argument for many worlds · 2024-08-13T20:14:11.268Z · LW · GW

If quantum computers really work, for more than 3 qubits, then I think I will believe in the infinite worlds interpretation.

On the other hand, if there turns out to be some fundamental reason why quantum computers with many qubits can't exist then maybe not.

The version where you only have 3 qubits is kind of unsatisfactory (look, there are exactly 8 parallel universes and no more...)
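The "exactly 8" is just 2^3: a register of n two-state systems has 2^n computational basis states. A minimal sketch that enumerates them (the function name is my own, and treating each basis state as a "parallel universe" is only the joke picture above, not a claim about the interpretation itself):

```python
from itertools import product
from typing import List


def basis_states(n_qubits: int) -> List[str]:
    """List the computational basis states of an n-qubit register as bit strings."""
    return ["".join(bits) for bits in product("01", repeat=n_qubits)]


states = basis_states(3)
print(len(states))  # 8, i.e. 2**3
print(states)       # '000' through '111'
```

The count doubles with every extra qubit, which is why the many-qubit case is the one that matters for the argument: 3 qubits give 8 states, 10 give 1024.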