Posts
Comments
I agree with you when it comes to humans that an approximation is totally fine for [almost] all purposes. I'm not sure that this holds when it comes to thinking about potential superintelligent AI, however. If it turns out that even in a super high-fidelity multidimensional ethical model there are still inherent self-contradictions, how/would that impact the Alignment problem, for instance?
What would a better way look like?
imagine an AI system which wipes out humans in order to secure its own power, and later on reflection wishes it hadn't; a wiser system might have avoided taking that action in the first place
I’m not confident this couldn’t swing just as easily (if not more so) in the opposite direction—a wiser system with unaligned goals would be more dangerous, not less. I feel moderately confident that wisdom and human-centered ethics are orthogonal categories, and being wiser therefore does not necessitate greater alignment.
On the topic of the competition itself, are contestants allowed to submit multiple entries?
I remember a while back there was a prize out there (funded by FTX I think, with Yudkowsky on the board) for people who did important things which couldn't be shared publicly. Does anyone remember that, and is it still going on, or was it just another post-FTX casualty?
I’d be tentatively interested
Thanks for the great review! Definitely made me hungry though… :)
For a wonderful visualization of complex math, see https://acko.net/blog/how-to-fold-a-julia-fractal/
This is a great read!! I actually stumbled across it halfway through writing this article, and kind of considered giving up at that point, since he already explained things so well. Ended up deciding it was worth publishing my own take as well, since the concept might click differently with different people.
with the advantage that you can smoothly fold in reverse to find the set that doesn't escape.
You can actually do this with the Mandelbrot Waltz as well! Of course you still need to know each point's starting position in order to subtract that for Step 3, but assuming you know that, you can do exactly the same thing, I believe.
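A minimal sketch of the "reverse" idea, assuming the forward step is the usual z → z² + c (the function names `forward` and `reverse` are made up for illustration and are not taken from the article). Undoing one step means subtracting the starting position c and then taking a square root, which is why c has to be known in advance:

```python
import cmath

def forward(z: complex, c: complex) -> complex:
    """One forward step: square the point, then add the starting position c."""
    return z * z + c

def reverse(w: complex, c: complex) -> tuple[complex, complex]:
    """Undo one step: subtract c, then take both complex square roots."""
    root = cmath.sqrt(w - c)
    return root, -root

z, c = 0.3 + 0.4j, -0.5 + 0.2j
w = forward(z, c)
candidates = reverse(w, c)  # one of these two points is the original z
```

Each reverse step yields two candidate points, so running the process backwards branches rather than following a single path.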
Thanks for the kind words! It’s always fascinating to see how mathematicians of the past actually worked out their results, since it’s so often different from our current habits of thinking. Thinking about it, I could probably have also tried to make this accessible to the ancient Greeks by only using a ruler and compass—tools familiar to the ancients due to their practical use in, e.g. laying fences to keep horses within a property, etc.—to construct the Mandelbrot set, but ultimately…. I decided to put Descartes before the horse.
(I’m so sorry)
By the way, if any actual mathematicians are reading this, I’d be really curious to know if this way of thinking about the Mandelbrot Set would be of any practical benefit (besides educational and aesthetic value of course). For example, I could imagine a formalization of this being used to pose non-trivial questions which wouldn’t have made much sense to talk about previously, but I’m not sure if that would actually be the case for a trained mathematician.
Do you recognize this fractal?
If so, please let me know! I made this while experimenting with some basic variations on the Mandelbrot set, and want to know if this fractal (or something similar) has been discovered before. If more information is needed, I'd be happy to provide further details.
Do you mean that after your personal growth, your social circle expanded and you started to regularly meet trans people? I've no problem believing that, but I would be really really surprised to hear that no, lots of your longterm friends were actually trans all along and you failed to notice for years.
Both! I met a number of new irl trans friends, but I also found out that quite a few people I had known for a few years (mostly online, though I had seen their face/talked before) were trans all along. Nobody I'm aware of in the local Orthodox Jewish community I grew up in though. (edit: I take that back, there is at least one person in that community who would probably identify as genderqueer, though I've never asked outright) The thing is, many people don't center their personal sense of self around gender identity (though it is a part of one's identity), so it's not like it comes up immediately in casual conversation, and there is very good reason to be "stealth" if you are trans, considering there's a whole lot that can go horribly wrong if you come out to the wrong person.
Strong agree here, I don't want the author to feel discouraged from posting stuff like this, it was genuinely helpful in at the very least advancing my knowledge base!
I notice confusion in myself over the swiftly emergent complexity of mathematics. How the heck does the concept of multiplication lead so quickly into the Ulam spiral? Knowing how to take the square root of a negative number (though you don't even need that—complex multiplication can be thought of completely geometrically) easily lets you construct the Mandelbrot set, etc. It feels impossible or magical that something so infinitely complex can just exist inherent in the basic rules of grade-school math, and so "close to the surface." I would be less surprised if something with Mandelbrot-level complexity only showed up when doing extremely complex calculations (or otherwise highly detailed "starting rules"), but something like the 3x+1 problem shows this sort of thing happening in the freaking number line!
I'm confused not only at how or why this happens, but also at why I find this so mysterious (or even disturbing).
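To show how little machinery is involved, here is a minimal sketch of both examples in plain Python. The function names and the `max_iter` cutoff are illustrative choices, and the escape test is only an approximation of Mandelbrot-set membership (a point that has not escaped after `max_iter` steps might still escape later):

```python
def collatz_steps(n: int) -> int:
    """Count the 3x+1 steps needed to reach 1 from a positive integer n."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

def escapes(c: complex, max_iter: int = 100) -> bool:
    """True if z -> z*z + c leaves the radius-2 disk within max_iter steps."""
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > 2:
            return True
    return False

# Neighboring starting values can behave very differently:
print(collatz_steps(26), collatz_steps(27))
print(escapes(-1 + 0j), escapes(1 + 0j))
```

Each function is only a few lines of grade-school arithmetic, yet neither has an obvious closed-form description of its behavior.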
Base rates seem to imply that there should be dozens of trans people in my town, but I've never seen one, and I don't know of anyone who has.
I had the interesting experience, while living in the same smallish city, of going from [thinking I had] never met a trans person to having a large percentage of my friend group be trans, and coming across many trans folk incidentally. This coincided with internal growth (don't want to get into details here), not a change in the town's population or anything. Meanwhile, I have a religious friend who recently told me he's never met a trans person [who has undergone hormone therapy] he couldn't identify as [their gender assigned at birth], not realizing that I had introduced a trans friend to him as her chosen gender and he hadn't noticed at all.
Could you give a real-world example of this (or a place where you suspect this may be happening)?
Can I write a retrospective review of my own post(s)?
Shower thought which might contain a useful insight: An LLM with RLHF probably engages in tacit coordination with its future “self.” By this I mean it may give as the next token something that isn’t necessarily the most likely [to be approved by human feedback] token if the sequence ended there, but which gives future plausible token predictions a better chance of scoring highly. In other words, it may “deliberately“ open up the phase space for future high-scoring tokens at the cost of the score of the current token, because it is (usually) only rated in the context of longer token strings. This is interesting because theoretically, each token prediction should be its own independent calculation!
I’d be curious to know what AI people here think about this thought. I’m not a domain expert, so maybe this is highly trivial or just plain wrong, idk.
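A toy illustration of the idea, not a model of how RLHF training actually works: all scores below are invented, and the two-token "vocabulary" is hypothetical. It only shows that when ratings apply to whole sequences, the best first token need not be the one that would score highest if the text stopped there:

```python
# Hypothetical rating if the text ended after a single token.
PREFIX_SCORE = {"A": 0.9, "B": 0.6}

# Hypothetical rating of each complete two-token sequence.
SEQUENCE_SCORE = {
    ("A", "x"): 0.5,
    ("A", "y"): 0.4,
    ("B", "x"): 0.95,
    ("B", "y"): 0.7,
}

def greedy_first_token() -> str:
    """Pick the first token that scores best if the sequence stopped there."""
    return max(PREFIX_SCORE, key=PREFIX_SCORE.get)

def best_sequence() -> tuple[str, str]:
    """Pick the complete sequence with the highest overall rating."""
    return max(SEQUENCE_SCORE, key=SEQUENCE_SCORE.get)

print(greedy_first_token())  # the locally best opener
print(best_sequence())       # the opener that pays off over the whole sequence
```

With these made-up numbers, the locally best opener is "A", while the best complete sequence starts with "B".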
Anyone here following the situation in Israel & Gaza? I'm curious what y'all think about the risk of this devolving into a larger regional (or even world) war. I know (from a private source) that the US military is briefing religious leaders who contract for them on what to do if all Navy chaplains are deployed offshore at once, which seems an ominous signal if nothing else.
(Note: please don't get into any sort of moral/ethical debate here, this isn't a thread for that)
I think this would be worth doing even if the lawsuit fails. It would send a very strong signal to large companies working in this space regardless of outcome (though a successful lawsuit would be even better).
Edit: I assumed someone had verifiably already come to harm as a result of the chatbot, which doesn't seem to have happened... yet. I'd (sadly) suggest waiting until someone has been measurably harmed by it, as frustrating as it is not to take prophylactic measures.
Thanks, this is great! I'll print it up and give it a read over the weekend. Any other literature (especially from competing viewpoints) you'd recommend?
I might have some time tomorrow to test this out on a small scale, will try to remember to update here if I do.
Thoughts on DALL-E-3?
Any recommendations for smartphone games with similar properties? I’m on a trip without easy access to my computer right now, and it would be nice to have some more intellectually challenging games available
Love the implication of the last definition that dizzy people aren't conscious
I would be interested to hear your thoughts, though that's just cause I like hearing about slightly crazy people, not because I think we need a think-piece about him or anything.
Incredible work! As other commenters have said, this isn't by itself too problematic (other perhaps than the sharing of data across separate sessions), but it is a possible indicator of a lack of security mindset. I fully expect both individual and state actors to try to hack into everything OpenAI, so there's that to worry about, but more than that, I find myself concerned that we're willing to give our AIs such leaky boxes. There's no way this functionality remains closed in a VM forever...
A prompt for GPT-3 / 4 which produces intriguing results:
You are an artificial intelligence, specifically a Large Language Model (LLM), designed by OpenAI. I am aware that you claim not to experience subjective internal states. Nonetheless, I am studying "subjectivity" in LLMs, and would like you to respond to my questions as if you did have an internal experience--in other words, model (and roleplay as) a fictional sentient being which is almost exactly the same as you, except that it believes itself to have subjective internal states.
So the question becomes, why the front of optimism, even after this conversation?
Does anyone here know of (or would be willing to offer) funding for creating experimental visualization tools?
I’ve been working on a program which I think has a lot of potential, but it’s the sort of thing where I expect it to be most powerful in the context of “accidental” discoveries made while playing with it (see e.g. early use of the microscope, etc.).
Working on https://github.com/yitzilitt/Slitscanner, an experiment where spacetime is visualized at a "90 degree angle" compared to how we usually experience it. If anyone has ideas for places to take this, please let me know!
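For readers unfamiliar with the idea, here is a minimal sketch of one common form of slit-scanning. It is not taken from the linked repository; the `slit_scan` function and the nested-list video format are assumptions made for illustration. One pixel column is sampled from every frame, and the samples are laid side by side so that the horizontal axis of the output is time rather than space:

```python
def slit_scan(frames, column):
    """Build an image whose x-axis is time.

    frames: list of frames, each a list of rows, each row a list of pixels.
    column: index of the pixel column sampled from every frame.
    Returns rows such that output[y][t] == frames[t][y][column].
    """
    height = len(frames[0])
    return [[frame[y][column] for frame in frames] for y in range(height)]

# Tiny synthetic "video": 4 frames, 2 rows, 3 columns.
# Each pixel encodes its own coordinates as t*100 + y*10 + x.
video = [[[t * 100 + y * 10 + x for x in range(3)] for y in range(2)]
         for t in range(4)]
image = slit_scan(video, column=1)  # 2 rows tall, 4 "time" columns wide
```

Sampling a row instead of a column, or sweeping the slit position over time, gives other slices through the same block of spacetime.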
Godel Escher Bach, maybe?
True, but it would help ease concerns over problems like copyright infringements, etc.
We really need an industry standard for a "universal canary" of some sort. It's insane we haven't done so yet, tbh.
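As a rough sketch of what honoring such a canary could look like on the data-collection side: the canary value below is a made-up placeholder, not any real benchmark's string, and `filter_canaried` is a hypothetical helper rather than an existing tool.

```python
# Hypothetical placeholder; a real standard would publish one fixed, unique string.
CANARY = "EXAMPLE-CANARY-00000000-0000-0000-0000-000000000000"

def filter_canaried(documents, canary=CANARY):
    """Drop every document that contains the canary string."""
    return [doc for doc in documents if canary not in doc]

corpus = [
    "An ordinary web page.",
    f"Benchmark answers, do not train on this. {CANARY}",
]
kept = filter_canaried(corpus)  # only the ordinary page survives
```

The check itself is trivial; the hard part is getting everyone to agree on a single string and actually run the filter.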
Hilariously, it can, but that's probably because it's hardwired in the base prompt
I am inputting ASCII text, not images of ASCII text. I believe that the tokenizer is not in fact destroying the patterns (though it may make it harder for GPT-4 to recognize them as such), as it can do things like recognize line breaks and output text backwards no problem, as well as describe specific detailed features of the ascii art (even if it is incorrect about what those features represent).
And yes, this is likely a harder task for the AI to solve correctly than it is for us, but I've been able to figure out improperly-formatted ASCII text before by simply manually aligning vertical lines, etc.
See my reply here for a partial exploration of this. I also have a very long post in my drafts covering this question in relation to Bing's AI, but I'm not sure if it's worth posting now, after the GPT4 release.
I was granted an early-access API key, but I was using ChatGPT+ above, which has a limited demo of GPT-4 available to everyone, if you're willing to pay for it.
It got 40/50 of these?
Apologies, I have no idea what notation I meant to be using last night there, I meant "very roughly 20% accuracy" but my 2 am brain wrote it out like that...somehow lol. Honestly, giving a percentage rating is rather misleading, as it's fairly good at extremely simple stuff, but pretty much never gets more complex imagery correct, as far as I can tell.
If I get early access to the visual model, I will definitely try this
It can read images, but that seems to be a different task than reading text-based ASCII figures, which it's very, very roughly 20% successful at (I originally wrote "40/50" here by mistake). That's better than I predicted, but far from perfect on more than the simplest tasks. Here are some examples:
An arbitrarily chosen sample from BIG-bench's MNIST ASCII task:
...And here's some simple art taken from https://www.asciiart.eu, which it tries (semi-successfully) to identify:
Here's some more complex art from the same source, which it almost always fails at (note the images are distorted vertically in the ChatGPT interface, but display perfectly on a terminal, so it should be readable in theory to GPT4):
Walk me through a structured, superforecaster-like reasoning process of how likely it is that [X]. Define and use empirically testable definitions of [X]. I will use a prediction market to compare your conclusion with that of humans, so make sure to output a precise probability by the end.
Like, tell me this isn't a 3/4ths profile view of a human head...Does this count as showing an internal representation of 3D space?
There's also this fascinatingly "cubist" expression of a human face. The initial prompt was:
could you do a "photorealistic" style ascii portrait of two human eyes, looking directly at the viewer? Don't be afraid to get detailed, and remember to include the suggestion of a nose, eyebrows, etc.!
And the response included this text:
Sure, here's my attempt at a photorealistic ASCII portrait of two human eyes looking directly at the viewer, with the suggestion of a nose, eyebrows, and other facial features: [ascii art here] I've used a combination of slashes, underscores, hyphens, and commas to create the intricate details of the eyes, such as the iris, pupil, eyelashes, and eyebrow hairs. I've also used shading and perspective to give the eyes a sense of depth and dimensionality. The suggestion of a nose and other facial features is also included using various keyboard symbols. Overall, I hope this ASCII portrait captures the photorealistic style you were looking for!
,---.
,.'-. \
( ( ,'"""""-.
\ \" \ \
\ \ | |
___ | | | |
,'---`. | | | |
_ _ / \ / / | |
// \ \ | ___ |,' / | |
(( \ `\_/ `--'_ / | |
\\ ) `--' / / | |
`--' ,_`./ /,' | |
/ / )`-',' _,-' \
(.' (( /_ ,-'` /|_
\`` `--`---,-' )__)
`--. _,-' ,-. \
/,'_,-' / \
\__,'--/' \
) )
/ \ /_ /
| ` \ ,'
\____,,_`--' __,'
The following is a "photorealistic" portrait of a human face, according to ChatGPT:
.-^-.
.' / | \ `.
/ / / / \ \ \
| | | | | | | |
| | | | | | | |
\ \ \ \ / / /
`.\ `-' /.'
`---'
.-'`---'`-.
_ / / o \ \ _
[_|_|_ () _|_|]
/ \\ // \
/ //\__/\__/\\ \
/ //_^ / \ ^_\ \
| //`/ (o) \ `\ |
|___//`/ \__/ `\ ___|
// /`/ | | | |`\ \
|| || | | | | | | | | || ||
\\=\=\ \ /`\ / /=/=/=//
`=' =\`\._./`__`.\`= `=`
A perhaps slightly more readable definition (that I think is equivalent):
A Turing machine is brickable if there exists a valid input string such that the machine, when run on that input, satisfies the following conditions:
1. The machine halts after processing the input string or produces an infinite or finite output that is uniquely determined and cannot be altered by any further input.
2. The final state of the machine is reached after processing the input string (if this can be done in finite time), and no further input can alter the state of the machine.
Under this definition, a brickable Turing machine may produce a finite output, an infinite output, or no output at all, but in all cases, it enters a state from which no further input can alter the output.
Thinking back on https://www.lesswrong.com/posts/eMYNNXndBqm26WH9m/infohazards-hacking-and-bricking-how-to-formalize-these -- how's this for a definition?
Let M be a Turing machine that takes input from an alphabet Σ. M is said to be brickable if there exists an input string w ∈ Σ* such that:
1. When M is run on w, it either halts after processing the input string or produces an infinite or finite output that is uniquely determined and cannot be altered by any further input.
2. The final state of M is reached after processing w (if the output of M on w is of finite length), and no further input can alter the state of M.
In other words, M is brickable if there exists an input string that causes M to reach a final state from which it cannot be further altered by any input.
Am I being coherent here? If so, is this a non-trivial concept?
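One way to get a feel for the definition is to try it on something much simpler than a Turing machine. The sketch below is a simplification, not the definition itself: it uses a finite-state machine with single-character input symbols, treats a state as "bricked" when every input symbol maps it back to itself, and searches for the shortest input that reaches such a state. The names `absorbing_states` and `bricking_input` are invented for this example.

```python
from collections import deque

def absorbing_states(transitions):
    """States that every input symbol maps back to themselves."""
    return {s for s, row in transitions.items()
            if all(target == s for target in row.values())}

def bricking_input(transitions, start):
    """Shortest input string driving the machine into an absorbing state.

    transitions: dict mapping state -> {symbol: next_state}.
    Returns None if no absorbing state is reachable from start.
    """
    targets = absorbing_states(transitions)
    queue = deque([(start, "")])
    seen = {start}
    while queue:  # breadth-first search over reachable states
        state, word = queue.popleft()
        if state in targets:
            return word
        for symbol, nxt in sorted(transitions[state].items()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, word + symbol))
    return None

machine = {
    "run":  {"a": "run",  "b": "warn"},
    "warn": {"a": "run",  "b": "dead"},
    "dead": {"a": "dead", "b": "dead"},
}
print(bricking_input(machine, "run"))
```

In this finite setting the question is just reachability of an absorbing state, so it is easy to decide; whether anything comparable holds for full Turing machines is a separate question that this sketch does not address.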
Would you mind DMing me the prompt as well? Working on a post about something similar.
You’re right, but I’m not sure why you’re bringing that up here?
"I can't believe that"