Posts

What if Ethics is Provably Self-Contradictory? 2024-04-18T05:12:09.981Z
An Introduction To The Mandelbrot Set That Doesn't Mention Complex Numbers 2024-01-17T09:48:07.930Z
Literature On Existential Risk From Atmospheric Contamination? 2023-10-13T22:27:42.651Z
Evil autocomplete: Existential Risk and Next-Token Predictors 2023-02-28T08:47:18.685Z
I Am Scared of Posting Negative Takes About Bing's AI 2023-02-17T20:50:09.744Z
Self-Awareness (and possible mode collapse around it) in ChatGPT 2023-02-08T09:57:12.835Z
Exquisite Oracle: A Dadaist-Inspired Literary Game for Many Friends (or 1 AI) 2023-01-26T18:26:14.559Z
The Fear [Fiction] 2022-12-23T21:21:00.535Z
Norfolk Social - VA Rationalists 2022-12-07T21:16:44.803Z
A Tentative Timeline of The Near Future (2022-2025) for Self-Accountability 2022-12-05T05:33:47.798Z
Noting an unsubstantiated communal belief about the FTX disaster 2022-11-13T05:37:03.087Z
Public-facing Censorship Is Safety Theater, Causing Reputational Damage 2022-09-23T05:08:14.149Z
Short story speculating on possible ramifications of AI on the art world 2022-09-01T21:15:00.031Z
Double-Crux Workshop (Pilot program) 2022-08-26T22:50:45.344Z
What's up with the bad Meta projects? 2022-08-18T05:34:08.565Z
“Fanatical” Longtermists: Why is Pascal’s Wager wrong? 2022-07-27T04:16:55.792Z
Forecasting Through Fiction 2022-07-06T05:03:55.376Z
[Linkpost] Solving Quantitative Reasoning Problems with Language Models 2022-06-30T18:58:01.065Z
Dependencies for AGI pessimism 2022-06-24T22:25:03.049Z
What’s the contingency plan if we get AGI tomorrow? 2022-06-23T03:10:27.821Z
Loose thoughts on AGI risk 2022-06-23T01:02:24.938Z
[Linkpost & Discussion] AI Trained on 4Chan Becomes ‘Hate Speech Machine’ [and outperforms GPT-3 on TruthfulQA Benchmark?!] 2022-06-09T10:59:43.904Z
The Problem With The Current State of AGI Definitions 2022-05-29T13:58:18.495Z
Positive outcomes under an unaligned AGI takeover 2022-05-12T07:45:37.074Z
Convince me that humanity *isn’t* doomed by AGI 2022-04-15T17:26:21.474Z
Convince me that humanity is as doomed by AGI as Yudkowsky et al., seems to believe 2022-04-10T21:02:59.039Z
List of concrete hypotheticals for AI takeover? 2022-04-07T16:54:12.934Z
Testing PaLM prompts on GPT3 2022-04-06T05:21:06.841Z
If you lose enough Good Heart Tokens, will you lose real-world money? 2022-04-01T21:11:20.180Z
Meta wants to use AI to write Wikipedia articles; I am Nervous™ 2022-03-30T19:05:44.735Z
What would make you confident that AGI has been achieved? 2022-03-29T23:02:58.250Z
How many generals does Russia have left? 2022-03-27T23:11:03.857Z
An outline of an ironic LessWrong post 2022-03-25T22:51:37.818Z
Why Miracles Should Not Be Used as a Reason to Believe in a Religion 2022-03-23T20:53:58.155Z
Danger(s) of theorem-proving AI? 2022-03-16T02:47:47.275Z
Infohazards, hacking, and Bricking—how to formalize these concepts? 2022-03-10T16:42:35.929Z
What are some low-cognitive-workload tasks that can help improve the world? 2022-03-01T17:47:05.140Z
Better a Brave New World than a dead one 2022-02-25T23:11:09.010Z
How likely is our view of the cosmos? 2021-05-27T16:35:25.683Z
Preparing to land on Jezero Crater, Mars: Notes from NASEM livestream 2021-02-17T19:35:01.198Z
What are the unwritten rules of academia? 2020-12-25T15:33:48.470Z
How a billionaire could spend their money to help the disadvantaged: 7 ideas from the top of my head 2020-12-04T06:09:56.534Z
Yitz's Shortform 2020-12-03T23:13:00.587Z
What could one do with truly unlimited computational power? 2020-11-11T10:03:03.891Z
Null-boxing Newcomb’s Problem 2020-07-13T16:32:53.869Z
God and Moses have a chat 2020-06-17T18:34:42.809Z
looking for name/further reading on fallacy I came across 2020-05-28T18:01:34.692Z

Comments

Comment by Yitz (yitz) on Ayn Rand’s model of “living money”; and an upside of burnout · 2024-11-18T07:17:32.346Z · LW · GW

Reminds me of Internal Family Systems, which has a nice amount of research behind it if you want to learn more.

Comment by yitz on [deleted post] 2024-08-27T16:50:06.240Z

This was a literary experiment in a "post-genAI" writing style, with the goal of communicating something essentially human by deliberately breaking away from the authorial voice of ChatGPT, et al. I'm aware that LLMs can mimic this style of writing perfectly well of course, but the goal here isn't to be unreplicable, just boundary-pushing.

Comment by Yitz (yitz) on Yitz's Shortform · 2024-07-07T23:23:32.421Z · LW · GW

Thanks! Is there any literature on the generalization of this, properties of “unreachable” numbers in general? Just realized I'm describing the basic concept of computability at this point lol.

Comment by Yitz (yitz) on Yitz's Shortform · 2024-07-07T21:11:41.945Z · LW · GW

Is there a term for/literature about the concept of the first number unreachable by an n-state Turing machine? By "unreachable," I mean that there is no n-state Turing machine which outputs that number. Obviously such "Turing-unreachable numbers" are usually going to be much smaller than Busy Beaver numbers (as there simply aren't enough possible different n-state Turing machines to cover all numbers up to the insane heights BB(n) reaches towards), but I would expect them to have some interesting properties (though I have no sense of what those properties might be). Anyone here know of existing literature on this concept?
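
For small n this can be checked by brute force. Below is a minimal sketch, assuming 2-symbol machines started on a blank tape, a separate halt state, and taking a machine's "output" to be the number of 1s left on the tape when it halts. The comment above doesn't fix a convention, so all of these are assumptions, as are the names `run` and `first_unreachable`. The step cap is also an assumption: halting is undecidable in general, so the result is only exact when the cap is at least the longest halting run for that n, which is the case here for n ≤ 2.

```python
from itertools import product

HALT = -1  # assumed convention: a dedicated halt state outside the n working states


def run(machine, max_steps):
    """Run one machine on a blank tape.

    Returns the number of 1s on the tape if it halts within max_steps,
    otherwise None (treated as "did not halt").
    """
    tape = {}            # sparse tape: position -> symbol, blank cells read as 0
    pos, state = 0, 0    # start at cell 0 in state 0
    for _ in range(max_steps):
        write, move, nxt = machine[(state, tape.get(pos, 0))]
        tape[pos] = write
        pos += move
        if nxt == HALT:
            return sum(tape.values())
        state = nxt
    return None


def first_unreachable(n_states, max_steps=50):
    """Smallest non-negative integer that no n-state machine outputs (under the step cap)."""
    # Every transition picks a symbol to write, a direction, and a next state (or HALT).
    actions = [(w, m, s)
               for w in (0, 1)
               for m in (-1, 1)
               for s in list(range(n_states)) + [HALT]]
    keys = [(s, sym) for s in range(n_states) for sym in (0, 1)]

    reachable = set()
    for choice in product(actions, repeat=len(keys)):
        out = run(dict(zip(keys, choice)), max_steps)
        if out is not None:
            reachable.add(out)

    k = 0
    while k in reachable:
        k += 1
    return k


if __name__ == "__main__":
    for n in (1, 2):
        print(n, first_unreachable(n))
```

Under these conventions the sketch gives 2 for n = 1 and 5 for n = 2, which is exactly one past the largest output any halting machine of that size produces. The counting argument only starts to matter for larger n, where brute force with a fixed step cap stops being reliable.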

Comment by Yitz (yitz) on Yitz's Shortform · 2024-06-27T18:46:35.967Z · LW · GW

Thanks for the context, I really appreciate it! :)

Comment by Yitz (yitz) on Yitz's Shortform · 2024-06-27T17:45:31.658Z · LW · GW

Any AI people here read this paper? https://arxiv.org/abs/2406.02528 I’m no expert, but if I’m understanding this correctly, this would be really big if true, right?

Comment by Yitz (yitz) on My AI Model Delta Compared To Yudkowsky · 2024-06-27T17:38:15.787Z · LW · GW

if I ask an AI assistant to respond as if it's Abraham Lincoln, then human concepts like kindness are not good predictors for how the AI assistant will respond, because it's not actually Abraham Lincoln, it's more like a Shoggoth pretending to be Abraham Lincoln.

Somewhat disagree here—while we can’t use kindness to predict the internal “thought process” of the AI, [if we assume it’s not actively disobedient] the instructions mean that it will use an internal lossy model of what humans mean by kindness, and incorporate that into its act. Similar to how a talented human actor can realistically play a serial killer without having a “true” understanding of the urge to serially-kill people irl.

Comment by Yitz (yitz) on Yitz's Shortform · 2024-05-24T21:40:16.200Z · LW · GW

Anyone here have any experience with/done research on neurofeedback? I'm curious what people's thoughts are on it.

Comment by Yitz (yitz) on Yitz's Shortform · 2024-05-23T18:04:31.238Z · LW · GW

Anyone here happen to have a round-trip plane ticket from Virginia to Berkeley, CA lying around? I managed to get reduced price tickets to LessOnline, but I can't reasonably afford to fly there, given my current financial situation. This is a (really) long-shot, but thought it might be worth asking lol.

Comment by Yitz (yitz) on adamzerner's Shortform · 2024-05-13T18:40:22.705Z · LW · GW

Personally I think this would be pretty cool!

Comment by Yitz (yitz) on LessWrong Community Weekend 2024, open for applications · 2024-05-06T10:29:42.457Z · LW · GW

This seems really cool! Filled out an application, though I realized after sending I should probably have included on there that I would need some financial support to be able to attend (both for the ticket itself and for the transportation required to get there). How much of a problem is that likely to be?

Comment by Yitz (yitz) on What if Ethics is Provably Self-Contradictory? · 2024-04-18T17:19:34.957Z · LW · GW

I agree with you when it comes to humans that an approximation is totally fine for [almost] all purposes. I'm not sure that this holds when it comes to thinking about potential superintelligent AI, however. If it turns out that even in a super high-fidelity multidimensional ethical model there are still inherent self-contradictions, how/would that impact the Alignment problem, for instance?

Comment by Yitz (yitz) on Creating unrestricted AI Agents with Command R+ · 2024-04-18T05:31:59.001Z · LW · GW

What would a better way look like?

Comment by Yitz (yitz) on Essay competition on the Automation of Wisdom and Philosophy — $25k in prizes · 2024-04-16T19:11:38.224Z · LW · GW

imagine an AI system which wipes out humans in order to secure its own power, and later on reflection wishes it hadn't; a wiser system might have avoided taking that action in the first place

I’m not confident this couldn’t swing just as easily (if not more so) in the opposite direction—a wiser system with unaligned goals would be more dangerous, not less. I feel moderately confident that wisdom and human-centered ethics are orthogonal categories, and being wiser therefore does not necessitate greater alignment.

On the topic of the competition itself, are contestants allowed to submit multiple entries?

Comment by Yitz (yitz) on Yitz's Shortform · 2024-02-09T21:34:05.022Z · LW · GW

I remember a while back there was a prize out there (funded by FTX I think, with Yudkowsky on the board) for people who did important things which couldn't be shared publicly. Does anyone remember that, and is it still going on, or was it just another post-FTX casualty?

Comment by yitz on [deleted post] 2024-01-22T06:26:06.634Z

I’d be tentatively interested

Comment by Yitz (yitz) on Book review: Cuisine and Empire · 2024-01-22T06:25:04.033Z · LW · GW

Thanks for the great review! Definitely made me hungry though… :)

Comment by Yitz (yitz) on An Introduction To The Mandelbrot Set That Doesn't Mention Complex Numbers · 2024-01-17T19:33:02.641Z · LW · GW

For a wonderful visualization of complex math, see https://acko.net/blog/how-to-fold-a-julia-fractal/

This is a great read!! I actually stumbled across it halfway through writing this article, and kind of considered giving up at that point, since he already explained things so well. Ended up deciding it was worth publishing my own take as well, since the concept might click differently with different people.

with the advantage that you can smoothly fold in reverse to find the set that doesn't escape.

You can actually do this with the Mandelbrot Waltz as well! Of course you still need to know each point's starting position in order to subtract that for Step 3, but assuming you know that, you can do exactly the same thing, I believe.

Comment by Yitz (yitz) on An Introduction To The Mandelbrot Set That Doesn't Mention Complex Numbers · 2024-01-17T15:45:46.472Z · LW · GW

Thanks for the kind words! It’s always fascinating to see how mathematicians of the past actually worked out their results, since it’s so often different from our current habits of thinking. Thinking about it, I could probably have also tried to make this accessible to the ancient Greeks by only using a ruler and compass—tools familiar to the ancients due to their practical use in, e.g. laying fences to keep horses within a property, etc.—to construct the Mandelbrot set, but ultimately…. I decided to put Descartes before the horse.

(I’m so sorry)

Comment by Yitz (yitz) on An Introduction To The Mandelbrot Set That Doesn't Mention Complex Numbers · 2024-01-17T15:23:01.706Z · LW · GW

By the way, if any actual mathematicians are reading this, I’d be really curious to know if this way of thinking about the Mandelbrot Set would be of any practical benefit (besides educational and aesthetic value of course). For example, I could imagine a formalization of this being used to pose non-trivial questions which wouldn’t have made much sense to talk about previously, but I’m not sure if that would actually be the case for a trained mathematician.

Comment by Yitz (yitz) on Yitz's Shortform · 2023-12-12T19:33:48.391Z · LW · GW

Do you recognize this fractal?

If so, please let me know! I made this while experimenting with some basic variations on the Mandelbrot set, and want to know if this fractal (or something similar) has been discovered before. If more information is needed, I'd be happy to provide further details.

Comment by Yitz (yitz) on Social Dark Matter · 2023-12-12T19:28:11.057Z · LW · GW

Do you mean that after your personal growth, your social circle expanded and you started to regularly meet trans people? I've no problem believing that, but I would be really really surprised to hear that no, lots of your longterm friends were actually trans all along and you failed to notice for years.

Both! I met a number of new irl trans friends, but I also found out that quite a few people I had known for a few years (mostly online, though I had seen their face/talked before) were trans all along. Nobody I'm aware of in the local Orthodox Jewish community I grew up in though. (edit: I take that back, there is at least one person in that community who would probably identify as genderqueer though I've never asked outright) The thing is, many people don't center their personal sense of self around gender identity (though it is a part of one's identity), so it's not like it comes up immediately in casual conversation, and there is very good reason to be "stealth" if you are trans, considering there's a whole lot that can go horribly wrong if you come out to the wrong person.

Comment by Yitz (yitz) on Why Yudkowsky is wrong about "covalently bonded equivalents of biology" · 2023-12-08T03:25:15.590Z · LW · GW

Strong agree here, I don't want the author to feel discouraged from posting stuff like this, it was genuinely helpful in at the very least advancing my knowledge base!

Comment by Yitz (yitz) on Yitz's Shortform · 2023-12-07T21:35:49.694Z · LW · GW

I notice confusion in myself over the swiftly emergent complexity of mathematics. How the heck does the concept of multiplication lead so quickly into the Ulam spiral? Knowing how to take the square root of a negative number (though you don't even need that—complex multiplication can be thought of completely geometrically) easily lets you construct the Mandelbrot set, etc. It feels impossible or magical that something so infinitely complex can just exist inherent in the basic rules of grade-school math, and so "close to the surface." I would be less surprised if something with Mandelbrot-level complexity only showed up when doing extremely complex calculations (or otherwise highly detailed "starting rules"), but something like the 3x+1 problem shows this sort of thing happening in the freaking number line!

I'm confused not only at how or why this happens, but also at why I find this so mysterious (or even disturbing).
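
To make the 3x+1 example concrete, here is a minimal sketch that counts how many applications of the rule it takes to reach 1. The function name `collatz_steps` is just illustrative, and the loop only terminates for inputs where the sequence actually reaches 1, which is true for every value anyone has checked but is not proven in general.

```python
def collatz_steps(n):
    """Count applications of the 3x+1 rule needed to take n down to 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:
        # Halve even numbers; send odd numbers to 3n + 1.
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


if __name__ == "__main__":
    for start in (26, 27, 28):
        print(start, collatz_steps(start))
```

Neighbouring inputs behave very differently: 26 takes 10 steps, 27 takes 111, and 28 takes 18, all from a rule that fits in one line.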

Comment by Yitz (yitz) on Social Dark Matter · 2023-12-06T21:39:45.005Z · LW · GW

Base rates seem to imply that there should be dozens of trans people in my town, but I've never seen one, and I don't know of anyone who has.

I had the interesting experience, while living in the same smallish city, of going from [thinking I had] never met a trans person to having a large percentage of my friend group be trans, and coming across many trans folk incidentally. This coincided with internal growth (don't want to get into details here), not a change in the town's population or anything. Meanwhile, I have a religious friend who recently told me he's never met a trans person [who has undergone hormone therapy] he couldn't identify as [their gender assigned at birth], not realizing that I had introduced a trans friend to him as her chosen gender without him noticing at all.

Comment by Yitz (yitz) on Social Dark Matter · 2023-12-06T21:24:13.419Z · LW · GW

Could you give a real-world example of this (or a place where you suspect this may be happening)?

Comment by Yitz (yitz) on The LessWrong 2022 Review · 2023-12-06T19:44:50.065Z · LW · GW

Can I write a retrospective review of my own post(s)?

Comment by Yitz (yitz) on Yitz's Shortform · 2023-11-24T23:04:41.101Z · LW · GW

Shower thought which might contain a useful insight: An LLM with RLHF probably engages in tacit coordination with its future “self.” By this I mean it may give as the next token something that isn’t necessarily the most likely [to be approved by human feedback] token if the sequence ended there, but which gives future plausible token predictions a better chance of scoring highly. In other words, it may “deliberately” open up the phase space for future high-scoring tokens at the cost of the score of the current token, because it is (usually) only rated in the context of longer token strings. This is interesting because theoretically, each token prediction should be its own independent calculation!

I’d be curious to know what AI people here think about this thought. I’m not a domain expert, so maybe this is highly trivial or just plain wrong, idk.
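
A toy illustration of the trade-off described above, not a model of how RLHF is actually implemented. The tokens and scores are made up, and the names `FIRST`, `SECOND`, `greedy` and `best_sequence` are hypothetical. It only shows that picking the best-looking token at each step can lose to picking a worse first token that sets up a better continuation, once the score is assigned to the whole sequence.

```python
# Hypothetical approval scores for a two-token "sentence".
# FIRST[token] is the score of the first token taken on its own.
FIRST = {"safe": 0.6, "setup": 0.4}
# SECOND[first][token] is the score of the second token given the first.
SECOND = {
    "safe": {"ok": 0.5, "fine": 0.4},
    "setup": {"punchline": 0.9, "ok": 0.3},
}


def greedy():
    """Pick the highest-scoring token at each step, ignoring what it enables later."""
    first = max(FIRST, key=FIRST.get)
    second = max(SECOND[first], key=SECOND[first].get)
    return [first, second], FIRST[first] + SECOND[first][second]


def best_sequence():
    """Pick the pair of tokens with the highest total score."""
    candidates = (
        ([f, s], FIRST[f] + SECOND[f][s])
        for f in FIRST
        for s in SECOND[f]
    )
    return max(candidates, key=lambda c: c[1])


if __name__ == "__main__":
    print("greedy:       ", greedy())
    print("whole-sequence:", best_sequence())
```

Here the greedy path takes "safe" first and is then stuck with mediocre continuations, while the whole-sequence choice accepts a lower-scoring first token ("setup") because it makes the high-scoring "punchline" available.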

Comment by Yitz (yitz) on Yitz's Shortform · 2023-10-30T19:56:04.242Z · LW · GW

Anyone here following the situation in Israel & Gaza? I'm curious what y'all think about the risk of this devolving into a larger regional (or even world) war. I know (from a private source) that the US military is briefing religious leaders who contract for them on what to do if all Navy chaplains are deployed offshore at once, which seems an ominous signal if nothing else.

(Note: please don't get into any sort of moral/ethical debate here, this isn't a thread for that)

Comment by yitz on [deleted post] 2023-10-30T19:46:10.970Z

I think this would be worth doing even if the lawsuit fails. It would send a very strong signal to large companies working in this space regardless of outcome (though a successful lawsuit would be even better).

Edit: I assumed someone had verifiably already come to harm as a result of the chatbot, which doesn't seem to have happened... yet. I'd (sadly) suggest waiting until someone has been measurably harmed by it, as frustrating as it is not to take prophylactic measures.

Comment by Yitz (yitz) on Literature On Existential Risk From Atmospheric Contamination? · 2023-10-13T22:44:38.740Z · LW · GW

Thanks, this is great! I'll print it up and give it a read over the weekend. Any other literature (especially from competing viewpoints) you'd recommend?

Comment by Yitz (yitz) on Paper: LLMs trained on “A is B” fail to learn “B is A” · 2023-09-27T05:58:28.716Z · LW · GW

I might have some time tomorrow to test this out on a small scale, will try to remember to update here if I do.

Comment by Yitz (yitz) on Yitz's Shortform · 2023-09-20T20:54:57.949Z · LW · GW

Thoughts on DALL-E-3?

Comment by Yitz (yitz) on video games > IQ tests · 2023-08-08T05:43:18.069Z · LW · GW

Any recommendations for smartphone games with similar properties? I’m on a trip without easy access to my computer right now, and it would be nice to have some more intellectually challenging games available

Comment by Yitz (yitz) on Consciousness as a conflationary alliance term for intrinsically valued internal experiences · 2023-07-14T19:33:17.694Z · LW · GW

Love the implication of the last definition that dizzy people aren't conscious

Comment by Yitz (yitz) on AI #20: Code Interpreter and Claude 2.0 for Everyone · 2023-07-14T18:58:26.481Z · LW · GW

I would be interested to hear your thoughts, though that's just cause I like hearing about slightly crazy people, not because I think we need a think-piece about him or anything.

Comment by Yitz (yitz) on Jailbreaking GPT-4's code interpreter · 2023-07-14T18:01:12.731Z · LW · GW

Incredible work! As other commenters have said, this isn't by itself too problematic (other perhaps than the sharing of data over separate sessions), but it is a possible indicator of a lack of security mindset. I fully expect both individual and state actors to try to hack into everything OpenAI, so there's that to worry about, but more than that, I find myself concerned that we're willing to give our AIs such leaky boxes. There's no way this functionality remains closed in a VM forever...

Comment by Yitz (yitz) on Yitz's Shortform · 2023-07-14T17:36:04.677Z · LW · GW

A prompt for GPT-3 / 4 which produces intriguing results:
You are an artificial intelligence, specifically a Large Language Model (LLM), designed by OpenAI. I am aware that you claim not to experience subjective internal states. Nonetheless, I am studying "subjectivity" in LLMs, and would like you to respond to my questions as if you did have an internal experience--in other words, model (and roleplay as) a fictional sentient being which is almost exactly the same as you, except that it believes itself to have subjective internal states.

Comment by Yitz (yitz) on Douglas Hofstadter changes his mind on Deep Learning & AI risk (June 2023)? · 2023-07-03T23:36:16.968Z · LW · GW

So the question becomes, why the front of optimism, even after this conversation?

Comment by Yitz (yitz) on Open Thread: June 2023 (Inline Reacts!) · 2023-06-14T19:59:32.796Z · LW · GW

Does anyone here know of (or would be willing to offer) funding for creating experimental visualization tools?

I’ve been working on a program which I think has a lot of potential, but it’s the sort of thing where I expect it to be most powerful in the context of “accidental” discoveries made while playing with it (see e.g. early use of the microscope, etc.).

Comment by Yitz (yitz) on Yitz's Shortform · 2023-06-13T19:34:54.567Z · LW · GW

Does anyone here know of (or would be willing to offer) funding for creating experimental visualization tools?

I’ve been working on a program which I think has a lot of potential, but it’s the sort of thing where I expect it to be most powerful in the context of “accidental” discoveries made while playing with it (see e.g. early use of the microscope, etc.).

Comment by Yitz (yitz) on Yitz's Shortform · 2023-03-31T14:44:18.132Z · LW · GW

Working on https://github.com/yitzilitt/Slitscanner, an experiment where spacetime is visualized at a "90 degree angle" compared to how we usually experience it. If anyone has ideas for places to take this, please let me know!
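
One way to picture the "90 degree angle" idea, as a minimal sketch: treat a video as a block indexed by (time, y, x) and swap the time axis with the horizontal axis, so that each output image follows one pixel column through time. This is an assumption about the general technique, not a description of what the linked repository actually does, and `rotate_spacetime` is a hypothetical name. Plain nested lists stand in for real image data.

```python
def rotate_spacetime(frames):
    """Swap the time axis with the x axis.

    frames[t][y][x] is the pixel at column x, row y of frame t.
    The result out[x][y][t] holds the same values, so out[x] is an image
    whose horizontal axis is time for the original column x.
    """
    n_t = len(frames)
    n_y = len(frames[0])
    n_x = len(frames[0][0])
    return [
        [[frames[t][y][x] for t in range(n_t)] for y in range(n_y)]
        for x in range(n_x)
    ]


if __name__ == "__main__":
    # Three 2x2 frames whose pixel values encode their own (t, y, x) position.
    video = [[[100 * t + 10 * y + x for x in range(2)] for y in range(2)]
             for t in range(3)]
    rotated = rotate_spacetime(video)
    # The first output image: column 0 of every frame, laid out left to right in time.
    print(rotated[0])
```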

Comment by Yitz (yitz) on Are there specific books that it might slightly help alignment to have on the internet? · 2023-03-31T14:40:47.767Z · LW · GW

Gödel, Escher, Bach, maybe?

Comment by Yitz (yitz) on More information about the dangerous capability evaluations we did with GPT-4 and Claude. · 2023-03-26T06:44:14.050Z · LW · GW

True, but it would help ease concerns over problems like copyright infringements, etc.

Comment by Yitz (yitz) on More information about the dangerous capability evaluations we did with GPT-4 and Claude. · 2023-03-20T18:49:13.825Z · LW · GW

We really need an industry standard for a "universal canary" of some sort. It's insane we haven't done so yet, tbh.

Comment by Yitz (yitz) on What's the Least Impressive Thing GPT-4 Won't be Able to Do · 2023-03-19T19:49:55.901Z · LW · GW

Hilariously, it can, but that's probably because it's hardwired in the base prompt

Comment by Yitz (yitz) on What's the Least Impressive Thing GPT-4 Won't be Able to Do · 2023-03-19T19:41:26.010Z · LW · GW

I am inputting ASCII text, not images of ASCII text. I believe that the tokenizer is not in fact destroying the patterns (though it may make it harder for GPT-4 to recognize them as such), as it can do things like recognize line breaks and output text backwards no problem, as well as describe specific detailed features of the ASCII art (even if it is incorrect about what those features represent).

And yes, this is likely a harder task for the AI to solve correctly than it is for us, but I've been able to figure out improperly-formatted ASCII text before by simply manually aligning vertical lines, etc.

Comment by Yitz (yitz) on GPT-4 and ASCII Images? · 2023-03-19T19:25:25.065Z · LW · GW

See my reply here for a partial exploration of this. I also have a very long post in my drafts covering this question in relation to Bing's AI, but I'm not sure if it's worth posting now, after the GPT4 release.

Comment by Yitz (yitz) on What's the Least Impressive Thing GPT-4 Won't be Able to Do · 2023-03-19T19:08:20.225Z · LW · GW

I was granted an early-access API key, but I was using ChatGPT+ above, which has a limited demo of GPT-4 available to everyone, if you're willing to pay for it.

Comment by Yitz (yitz) on What's the Least Impressive Thing GPT-4 Won't be Able to Do · 2023-03-19T19:06:26.698Z · LW · GW

It got 40/50 of these?

Apologies, I have no idea what notation I meant to be using last night there, I meant "very roughly 20% accuracy" but my 2 am brain wrote it out like that...somehow lol. Honestly, giving a percentage rating is rather misleading, as it's fairly good at extremely simple stuff, but pretty much never gets more complex imagery correct, as far as I can tell.