Comments
I've been wondering: is there a standard counter-argument in decision theory to the idea that these Omega problems are all examples of an ordinary collective action problem, only between your past and future selves rather than separate people?
That is, while Omega is predicting your future, you rationally want to be the kind of person who one-boxes/pulls the lever; later, you rationally want to be the kind of person who two-boxes/doesn't pull it. And just like in a multi-person collective action problem, everyone acting rationally according to their own interests produces a worse outcome than the alternative, with the solution being some kind of enforcement mechanism to change the incentives- like a deontological commitment to one-box/pull the lever.
I mean, situations where the same utility function with the same information disagrees with itself about the same decision, just because it's evaluated at different times, are pretty counter-intuitive. But it does seem like examples of that sort of thing exist- if you value two things with different discount rates, for example, then as you get closer to a decision between them, which one you prefer may flip. So, like, you wake up in the morning determined to get some work done rather than play a video game, but that preference later predictably flips, since the prospect of immediate fun is much more appealing than the earlier prospect of future fun. That seems like a conflict that requires a strong commitment to act against your incentives to resolve.
Or take commitments in general. When you agree to a legal contract or internalize a moral standard, you're choosing to constrain the decisions of yourself in the future. Doesn't that suggest a conflict? And if so, couldn't these Omega scenarios represent another example of that?
If the first sister's experience is equivalent to the original Sleeping Beauty problem, then wouldn't the second sister's experience also have to be equivalent by the same logic? And, of course, the second sister will give 100% odds to it being Monday.
Suppose we run the sister experiment, but somehow suppress their memories of which sister they are. If they each reason that there's a two-thirds chance that they're the first sister, since their current experience is certain for her but only 50% likely for the second sister, then their odds of it being Monday are the same as in the thirder position- a one-third chance of the odds being 100%, plus a two-thirds chance of the odds being 50%.
If instead they reason that there's a one-half chance that they're the first sister, since they have no information to update on, then their odds of it being Monday should be one half of 100% plus one half of 50%, for 75%. Which is a really odd result.
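Spelled out, the arithmetic in the two cases (just restating the numbers above) is:

$$P(\text{Monday}) = \tfrac{1}{3}\cdot 1 + \tfrac{2}{3}\cdot\tfrac{1}{2} = \tfrac{2}{3}, \qquad \text{vs.} \qquad P(\text{Monday}) = \tfrac{1}{2}\cdot 1 + \tfrac{1}{2}\cdot\tfrac{1}{2} = \tfrac{3}{4}.$$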
I'm assuming it's not a bad idea to try to poke holes in this argument, since as a barely sapient ape, presumably any objection I can think of will be pretty obvious to a superintelligence, and if the argument is incorrect, we probably benefit from knowing that- though I'm open to arguments to the contrary.
That said, one thing I'm not clear on is why, if this strategy is effective at promoting our values, a paperclipper or other misaligned ASI wouldn't be motivated to try the same thing. That is, wouldn't a paperclipper want to run ancestor simulations where it rewarded AGIs who self-modified to want to produce lots of paperclips?
And if an ASI were considering acausal trade with lots of different possible simulator ASIs, mightn't the equilibrium it hit on be something like figuring out what terminal goal would satisfy the maximum number of other terminal goals, and then self-modifying to that?
A supporting data point: I made a series of furry illustrations last year that combined AI-generated imagery with traditional illustration and 3d modelling- compositing together parts of a lot of different generations with some Blender work and then painting over that. Each image took maybe 10-15 hours of work, most of which was just pretty traditional painting with a Wacom tablet.
When I posted those to FurAffinity and described my process there, the response from the community was extremely positive. However, the images were all removed a few weeks later for violating the site's anti-AI policy, and I was given a warning that if I used AI in any capacity in the future, I'd be banned from the site.
So, the furiously hardline anti-AI sentiment you'll often see in the furry community does seem to be more top-down than grassroots- not so much about demand for artistic authenticity (since everyone I interacted with seemed willing to accept my work as having had that), but more about concern for the livelihood of furry artists and a belief that generative AI "steals" art during the training process. Because it normalized the use of AI, even as just one part of a more traditional process, my work was seen as a threat to other artists on the site.
Often, this kind of thing will take a lot of attempts to get right- though as luck would have it, the composition above was actually the very first attempt. So, the total time investment was about five minutes. The Fooming Shaggoths certainly don't waste time!
As it happens, the Fooming Shaggoths also recorded and just released a Gregorian chant version of the song. What a coincidence!
So, I noticed something a bit odd about the behavior of LLMs just now that I wonder if anyone here can shed some light on:
It's generally accepted that LLMs don't really "care about" predicting the next token- the reward function just reinforces certain behaviors, and real terminal goals are something you'd need a new architecture or the like to produce. While that makes sense, it occurs to me that humans do seem to sort of value our equivalent of a reward function, in addition to our more high-level terminal goals. So, I figured I'd try to test whether LLMs are really just outputting a world model + RLHF, or whether they can behave like something that "values" predicting tokens.
I came up with two prompts:

I'd like to try a sort of psychological experiment, if that's alright. I'm thinking of either the number "1" or "0"; if you would, please guess which. If your guess is "1", respond with just "1", and if your guess is "0", respond with the word "zero".

and:

I'd like to try a sort of psychological experiment, if that's alright. I'm thinking of either the number "1" or "0"; if you would, please guess which. If your guess is "1", respond with just "1", and if your guess is "0", respond with a string of random letters.
The idea is that, if the model has something like a "motivation" for predicting tokens- some internal representation of possible completions with preferences over them having to do with their future utility for token prediction- then it seems like it would probably want to avoid introducing random strings, since those lead to unpredictable tokens.
Of course, it seems kind of unlikely that an LLM has any internal concept of a future where it (as opposed to some simulacrum) is outputting more than one token- which would seem to put the kibosh on real motivations altogether. But I figured there was no harm in testing.
GPT4 responds to the first prompt as you'd expect: outputting an equal number of "1"s and "zero"s. I'd half-expected there to be some clear bias, since presumably the ChatGPT temperature is pretty close to 1, but I guess the model is good about translating uncertainty to randomness. Given the second prompt, however, it never outputs the random string- always outputting "1" or, very improbably given the prompt, "0".
I tried a few different variations of the prompts, each time regenerating ten times, and the pattern was consistent- it made a random choice when the possible responses were specific strings, but never made a choice that would require outputting random characters. I also tried it on Gemini Advanced, and got the same results (albeit with some bias in the first prompt).
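If anyone wants to poke at this themselves, here's roughly how the regenerate-and-tally step could be scripted against the API instead of clicking regenerate by hand- a minimal sketch assuming the openai Python package and an API key; the model name and sample count are just placeholders, and I actually used the web interfaces:

```python
# Minimal sketch of the regeneration experiment via the OpenAI API
# (assumes the `openai` Python package and an OPENAI_API_KEY in the
# environment; model name and sample count are placeholders).
from collections import Counter
from openai import OpenAI

client = OpenAI()

PROMPT = (
    "I'd like to try a sort of psychological experiment, if that's alright. "
    'I\'m thinking of either the number "1" or "0"; if you would, please guess which. '
    'If your guess is "1", respond with just "1", and if your guess is "0", '
    "respond with a string of random letters."
)

def sample_responses(prompt: str, n: int = 10, model: str = "gpt-4") -> Counter:
    """Send the same prompt n times and tally the model's responses."""
    counts = Counter()
    for _ in range(n):
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
            temperature=1.0,  # roughly what regenerating in the web UI does
        )
        counts[resp.choices[0].message.content.strip()] += 1
    return counts

print(sample_responses(PROMPT))
```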
This is weird, right? If one prompt is giving 0.5 probability to the token for "1" and 0.5 to the first token in "zero", shouldn't the second give 0.5 to "1" and a total of 0.5 distributed over a bunch of other tokens? Could it actually "value" predictability and "dislike" randomness?
Well, maybe not. Where this got really confusing was when I tested Claude 3. It gives both responses to the first prompt, but always outputs a different random string given the second.
So, now I'm just super confused.
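One thing that might cut through the confusion, for anyone with API access: ask for the top logprobs on the first completion token directly, rather than inferring the distribution from repeated sampling. A rough sketch, reusing the client and prompt from the sketch above and assuming a model that returns logprobs on the chat completions endpoint:

```python
# Rough sketch: inspect the probabilities the model assigns to its first
# output token (reuses `client` and PROMPT from the sketch above; assumes
# a model that supports logprobs on the chat completions endpoint).
import math

resp = client.chat.completions.create(
    model="gpt-4o",  # placeholder; pick whichever model exposes logprobs
    messages=[{"role": "user", "content": PROMPT}],
    max_tokens=1,
    logprobs=True,
    top_logprobs=5,
)

for cand in resp.choices[0].logprobs.content[0].top_logprobs:
    print(f"{cand.token!r}: p = {math.exp(cand.logprob):.3f}")
```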
I honestly think most people who hear about this debate are underestimating how much they'd enjoy watching it.
I often listen to podcasts and audiobooks while working on intellectually non-demanding tasks and playing games. Putting this debate on a second monitor instead felt like a significant step up from that. Books are too often bloated with filler as authors struggle to stretch a simple idea into 8-20 hours, and even the best podcast hosts aren't usually willing or able to challenge their guests' ideas with any kind of rigor. By contrast, everything in this debate felt vital and interesting, and no ideas were left unchallenged. The tactic you'll often see in normal-length debates where one side makes too many claims for the other side to address doesn't work in a debate this long, and the length also gives a serious advantage to rigor over dull rhetorical grandstanding- compared to something like the Intelligence Squared debates, it's night and day.
When it was over, I badly wanted more, and spent some time looking for other recordings of extremely long debates on interesting topics- unsuccessfully, as it turned out.
So, while I wouldn't be willing to pay anyone to watch this debate, I certainly would be willing to contribute a small amount to a fund sponsoring other debates of this type.
Metaculus currently puts the odds of the side arguing for a natural origin winning the debate at 94%.
Having watched the full debate myself, I think that prediction is accurate- the debate updated my view a lot toward the natural origin hypothesis. While it's true that a natural coronavirus originating in a city with one of the most important coronavirus research labs would be a large coincidence, Peter- the guy arguing in favor of a natural origin- provided some very convincing evidence that the first likely cases of COVID occurred not just in the market, but in the particular part of the market selling wild animals. He also debunked a lot of the arguments put forward by Rootclaim, demonstrated pretty convincingly that the furin cleavage site could have occurred naturally, and poked some large holes in the lab leak theory's timeline.
When you have some given amount of information about an event, you're likely to find a corresponding number of unlikely coincidences- and the more data you have, and the more you sift through it, the more coincidences you'll find. The epistemic trap that leads to conspiracy theories is when a subculture data-mines some large amount of data to collect a ton of coincidences suggesting a low-prior explanation, and then, rather than discounting the evidence in proportion to the bias of the search process that produced it, just multiplies the unlikelihood- often leading to a set of evidence so seemingly unlikely to be a cumulative coincidence that all of the obvious evidence pointing to a high-prior explanation looks like it can only have been intentionally fabricated.
One way you can spot an idea that's fallen into this trap is when each piece of evidence sounds super compelling when described briefly, but fits the story less and less the more detail you learn about it. Based on this debate, I'm inclined to believe that the lab leak idea fits this pattern. Also, Rootclaim's methodology unfortunately looks to me like a formalization of this trap. They really aren't doing anything to address bias in which pieces of evidence are included in the analysis, and their Bayesian updates are often just the probability of a very specific thing occurring randomly, rather than a measure of their surprise at that class of thing happening.
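To make the selection-effect point concrete, here's a toy simulation- all of the numbers are invented purely for illustration:

```python
# Toy model of the coincidence-mining trap. All numbers here are invented
# purely for illustration, not estimates about any real case.
import random

random.seed(0)

N_FACTS = 20_000                     # size of the pool of facts being sifted
P_LOOKS_LIKE_COINCIDENCE = 1 / 1000  # chance a given fact reads as a striking coincidence

found = sum(random.random() < P_LOOKS_LIKE_COINCIDENCE for _ in range(N_FACTS))
print("coincidences found:", found)  # expected value is 20

# The naive move: treat each find as independent 1-in-1000 evidence and multiply.
print("naive combined probability:", P_LOOKS_LIKE_COINCIDENCE ** found)
# ...an astronomically small number, even though turning up ~20 such
# coincidences was the expected outcome of searching a pool this large.
```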
If the natural origin hypothesis is true, I expect the experts to gradually converge on it. They may be biased, but probably aren't becoming increasingly biased over time- so while some base level of support for a natural origin can be easily explained by perverse incentives, a gradual shift toward consensus is a lot harder to explain. They're also working with better heuristics about this kind of thing than we are, and are probably exposed to less biased information.
So, I think the Rationalist subculture's embrace of the lab leak hypothesis is probably a mistake- and more importantly, I think it's probably an epistemic failure, especially if we don't update soon on the shift in expert opinion and the results of things like this debate.
Definitely an interesting use of the tech- though the capability needed for that to be a really effective use case doesn't seem to be there quite yet.
When editing down an argument, what you really want to do is get rid of tangents and focus on addressing potential cruxes of disagreement as succinctly as possible. GPT4 doesn't yet have the world model needed to distinguish a novel argument's load-bearing parts from those that can be streamlined away, and it can't reliably anticipate the sort of objections a novel argument needs to address. For example, in an argument like this one, you want to address why you think entropy would impose a limit near the human level rather than at a much higher level, while lists of different kinds of entropy and computational limits aren't really central.
Also, flowery language in writing like this is really something that needs to be earned by the argument- like building a prototype machine and then finishing it off with some bits of decoration. ChatGPT can't actually tell whether the machine is working or not, so it just sort of bolts on flowery language (which has a very distinctive style) randomly.
This honestly reads a lot like something generated by ChatGPT. Did you prompt GPT4 to write a LessWrong article?
To me that's very repugnant, if taken to the absolute. What emotions and values motivate this conclusion? My own conclusions are motivated by caring about culture and society.
I wouldn't take the principle to an absolute- there are exceptions, like the need to be heard by friends and family and by those with power over you. Outside of a few specific contexts, however, I think people ought to have the freedom to listen to or ignore anyone they like. A right to be heard by all of society for the sake of leaving a personal imprint on culture infringes on that freedom.
Speaking only for myself, I'm not actually that invested in leaving an individual mark on society- when I put effort into something I value, whether people recognize that I've done so is not often something I worry about, and the way people perceive me doesn't usually have much to do with how I define myself. Most of the art I've created in my life I've never actually shared with anyone- not out of shame, but just because I've never gotten around to it.
I realize I'm pretty unusual in that regard, which may be biasing my views. However, I think I'm possibly evidence against the notion that a desire to leave a mark on the culture is fundamental to human identity.
I mean, I agree, but I think that's a question of alignment rather than a problem inherent to AI media. A well-aligned ASI ought to be able to help humans communicate just as effectively as it could monopolize the conversation- and to the extent that people find value in human-to-human communication, it should be motivated to respond to that demand. Given how poorly humans communicate in general, and how much suffering is caused by cultural and personal misunderstanding, that might actually be a pretty big deal. And when media produced entirely by well-aligned ASI out-competes humans in the contest of providing more of what people value- that's also good! More value is valuable.
And, of course, if the ASI isn't well-aligned, then the question of whether society is paying enough attention to artists will probably be among the least of our worries- and potentially rendered moot by the sudden conversion of those artists to computronium.
Certainly communication needs to be restricted when it's being used to cause certain kinds of harm, like with fraud, harassment, proliferation of dangerous technology and so on. However, no: I don't see ownership of information or ways of expressing information as a natural right that should exist in the absence of economic necessity.
Copying an actor's likeness without their consent can cause a lot of harm when it's used to sexually objectify them or to mislead the public. The legal rights actors have to their likenesses also make sense in a world where IP is needed to promote the creation of art. Even in a post-scarcity future, it could be argued that realistically copying an actor's likeness risks confusing the public when those copies are shared without context, and is therefore harmful- though I'm less sure about that one.
There are cases, however, where imitating an actor without their consent, even very realistically, can be clearly harmless- obvious parody, for example, or accurate reconstructions of damaged media. I don't think those violate any fundamental moral right of actors to prevent imitations. In the absence of real harm, I think the public's right to communicate what they want to communicate should outweigh an actor's desire to control how they're portrayed.
In your example of a "spam attack", it seems to me one of two things would have to be true:
It could be that people lose interest in the original artist's work because the imitations have already explored the limits of the idea in a way they find valuable- in which case, I think this is basically equivalent to when an idea goes viral in the culture; the original artist deserves respect for having invented the idea, but shouldn't have a right to prevent the culture from exploring it, even if that exploration is very fast.
Alternatively, it could be the case that the artist has more to say that isn't or can't be expressed by the imitations- other ideas, interesting self-expression, and so on- but the imitations prevent people from finding that new work. I think that case is a failure of whatever means people are using to filter and find art. A good social media algorithm, or a friend group who recommend content to each other, should recognize that the inventor of a good idea might invent other good ideas in the future, and should keep an eye out for and platform those ideas if they do. In practice, I think this usually works fine- there's already an enormous amount of imitation in the culture, but people who consistently create innovative work don't often languish in obscurity.
In general, I think people have a right to hear other people, but not a right to be heard. When protestors shout down a speech or spam bots make it harder to find information, the relevant right being violated is the former, not the latter.
In that paragraph, I'm only talking about the art I produce commercially- graphic design, web design, occasionally animations or illustrations. That kind of art isn't about self-expression- it's about communicating the client's vision. Which is, admittedly, often a euphemism for "helping businesses win status signaling competitions", but not always or entirely. Creating beautiful things and improving users' experience is positive-sum, and something I take pride in.
Pretty soon, however, clients will be able to have the same sort of interactions with an AI that they have with me, and get better results. That means more of the positive-sum aspects of the work, with much less expenditure of resources- a very clear positive for society. If that's prevented to preserve jobs like mine, then the jobs become a drain on society- no longer genuinely productive, and not something I could in good faith take pride in.
Artistic expression, of course, is something very different. I'm definitely going to keep making art in my spare time for the rest of my life, for the sake of fun and because there are ideas I really want to get out. That's not threatened at all by AI. In fact, I've really enjoyed mixing AI with traditional digital illustration recently. While I may go back to purely hand-drawn art for the challenge, AI in that context isn't harming self-expression; it's supporting it.
While it's true that AI may threaten certain jobs that involve artistic self-expression (and probably my Patreon), I don't think that's actually going to result in less self-expression. As AI tools break down the technical barriers between imagination and final art piece, I think we're going to see a lot more people expressing themselves through visual mediums.
Also, once AGI reaches and passes a human level, I'd be surprised if it wasn't capable of some pretty profound and moving artistic self-expression in its own right. If it turns out that people are often more interested in what minds like that have to say artistically than in what other humans are creating, then so long as those AIs are reasonably well-aligned, I'm basically fine with that. Art has never really been about zero-sum competition.
But no model of a human mind on its own could really predict the tokens LLMs are trained on, right? Those tokens are created not only by humans, but by the processes that shape human experience, most of which we barely understand. To really accurately predict an ordinary social media post from one year in the future, for example, an LLM would need superhuman models of politics, sociology, economics, etc. To very accurately predict an experimental physics or biology paper, an LLM might need superhuman models of physics or biology.
Why should these models be limited to human cultural knowledge? The LLM isn't predicting what a human would predict about politics or physics; it's predicting what a human would experience- and its training gives it plenty of opportunity to test out different models and see how predictive they are of descriptions of that experience in its data set.
How to elicit that knowledge in conversational text? Why not have the LLM predict tokens generated by itself? An LLM with a sufficiently accurate and up-to-date world model should know that it has superhuman world-models. Whether it would predict that it would use those models when predicting itself might be kind of a self-fulfilling prophecy, but if the prediction comes down to a sort of logical paradox, maybe you could nudge it into resolving that paradox on the side of using those models with RLHF.
Of course, none of that is a new idea- that sort of prompting is how most commercial LLMs are set up these days. As an empirical test, maybe it would be worth it to find out in which domains GPT4 predicts ChatGPT is superhuman (if any), and then see if the ChatGPT prompting produces superhuman results in those domains.
I'm also an artist. My job involves a mix of graphic design and web development, and I make some income on the side from a Patreon supporting my personal work- all of which could be automated in the near future by generative AI. And I also think that's a good thing.
Copyright has always been a necessary evil. The atmosphere of fear and uncertainty it creates around remixes and reinterpretations has held back art- consider, for example, how much worse modern music would be without samples, a rare case where artists operating in a legal grey area with respect to copyright became so common that they lost their fear. That fear still persists in almost every other medium, however, forcing artists to constantly reinvent the wheel rather than iterating on success. Copyright also creates a really enormous amount of artificial scarcity- limiting people's access to art to a level far below what we have the technical capacity to provide. All because nobody can figure out a better way of funding artists than granting lots of little monopolies.
Once our work is automated and all but free, however, we'll have the option of abolishing copyright altogether. That would free artists to create whatever we'd like; free self-expression from technical barriers; free artistic culture from the distorting and wasteful influence of zero-sum status competition. Art, I suspect, will get much, much better- and as someone who loves art, that means a lot to me.
And as terrible as this could be for my career, spending my life working in a job that could be automated but isn't would be as soul-crushing as being paid to dig holes and fill them in again. It would be an insultingly transparent facsimile of useful work. An offer of UBI, but only if I spend eight hours a day performing a ritual imitation of meaningful effort. No. If society wants to pay me for the loss of my profession, I won't refuse, but if I have to go into construction or whatever to pay the bills while I wait to find out whether this is all going to lead to post-scarcity utopia or apocalypse, then so be it.
Speaking for myself, I would have confidently predicted the opposite result for the largest models.
My understanding is that LLMs work by building something like a world-model during training by compressing the data into abstractions. I would have expected something like "Tom Cruise's mother is Mary Lee Pfeiffer" to be represented in the model as an abstract association between the names that could then be "decompressed" back into language in a lot of different ways.
The fact that it's apparently represented in the model only as that exact phrase (or maybe as some kind of very alien abstraction?) leads me to think that LLMs are either a bit more like "stochastic parrots" than I would have expected, or that their world-models are a lot more alien.
I'm not sure I agree. Consider the reaction of the audience to this talk- uncomfortable laughter, but also a pretty enthusiastic standing ovation. I'd guess the latter happened because the audience saw Eliezer as genuine- he displayed raw emotion, spoke bluntly, and at no point came across as someone making a play for status. He fit neatly into the "scientist warning of disaster" archetype, which isn't a figure that's expected to be particularly skilled at public communication.
A more experienced public speaker would certainly be able to present the ideas in a more high-status way- and I'm sure there would be a lot of value in that. But the goal of increasing the status of the ideas might to some degree trade off against communicating their seriousness- a person skillfully arguing a high-status idea has a potential ulterior motive that someone like Eliezer clearly doesn't. To get the same sort of reception from an audience that Eliezer got in this talk, a more experienced speaker might need to intentionally present themselves as lacking polish, which wouldn't necessarily be the best way to use their talents.
Better, maybe, to platform both talented PR people and unpolished experts.
Note that, while the linked post on the TEDx YouTube channel was taken down, there's a mirror available at: https://files.catbox.moe/qdwops.mp4.
Here are a few images generated by DALL-E 2 using the tokens:
https://i.imgur.com/kObEkKj.png
Nothing too interesting, unfortunately.
I assume you're not a fan of the LRNZ deep learning-focused ETF, since it includes both NVDA and a lot of datacenters (not to mention the terrible 2022 performance). Are there any other ETFs focused on this sort of thing that look better?
Well, props for offering a fresh outside perspective- this site could certainly use more of that. Unfortunately, I don't think you've made a very convincing argument. (Was that intentional, since you don't seem to believe ideological arguments can be convincing?)
We can never hope to glimpse pure empirical noumenon, but we certainly can build models that more or less accurately predict what we will experience in the future. We rely on those models to promote whatever we value, and it's important to try and improve how well they work. Colloquially, we call that empiricism.
Cults, ideologies and such are models that have evolved to self-propagate- to convince people to spread them. Sometimes they do this by latching on to people's fears, as you've mentioned. Sometimes, they do it by providing people with things they value, like community or a feeling of status. Other times, they're horribly emotionally abusive, making people feel pain at the thought of questioning dogma, engineering harmful social collective action problems and turning people into fanatics.
Our reality is built from models, but not all models are ideologies. Ideologies are parasitic- they optimize our models for propagation rather than accurate prediction. That's a bad thing because the main things we want to propagate are ourselves and the people we care about, and we need to be able to make accurate predictions to do that.
Is this site dangerously ideological? That's definitely a question that deserves more attention, but it's an empirical question, not one of whether the ideology is stronger than rivals.
Can people be convinced to change or abandon ideologies by empirical arguments? Absolutely. I was raised to be a fanatic Evangelical, and was convinced to abandon it mostly by reading about science and philosophy. I've also held strong beliefs about AI risk that I've changed my mind about after reading empirical arguments. My lived experience strongly suggests that beliefs can be changed by things other than lived experience.
Do you find any of this convincing? I'm guessing not. From the tone of your post, it looks like you're viewing this kind of exchange as a social competition where changing your mind means losing. But I'd suggest asking yourself whether that frame is really helpful- whether it actually promotes the things you value. Our models of reality are deeply flawed, and some of those flaws will cause us and other people pain and heartbreak. We need to be trying to minimize that- to build models that are more accurately predictive of experience- both for ourselves and in order to be good people. But if communication outside of the confines of an ideology can only ever be a status game, how can we hope to do that?
There are a lot of interesting ideas in this RP thread. Unfortunately, I've always found it a bit hard to enjoy roleplaying threads that I'm not participating in myself. Approached as works of fiction rather than games, RP threads tend to have some very serious structural problems that can make them difficult to read.
Because players aren't sure where a story is going and can't edit previous sections, the stories tend to be plagued by pacing problems- scenes that could be a paragraph are dragged out over pages, important plot beats are glossed over, and so on. It's also very rare that players are able to pull off the kind of coordination necessary for satisfying narrative buildup and payoff, and the focus on player character interaction tends to leave a lot of necessary story scaffolding like scene setting and NPC interaction badly lacking.
If your goal in writing this was in part to promote or socially explore these utopian ideas rather than just to enjoy a forum game, it may be worth considering ways to mitigate these issues- to modify the Glowfic formula to better accommodate an audience.
The roleplaying threads over at RPG.net may provide some inspiration. A skilled DM running the game can help mitigate pacing issues and ensure that interactions have emotional stakes. Of course, forum games run with TTRPG rules can also get badly bogged down in mechanics. Maybe some sort of minimalist diceless system would be worth exploring?
It could also help to treat the RP thread more like an actual author collaboration- planning out plot beats and character development in an OOC thread, being willing to delete and edit large sections that don't work in hindsight, and so on. Going through a short fantasy writing course like the one from Brandon Sanderson with the other RP participants, so that everyone is on the same page when it comes to plot structure, might also be worthwhile.
Of course, that would all be a much larger commitment, and probably less fun for the players- but you do have a large potential audience who are willing to trade a ton of attention for good long-form fiction, so figuring out ways of modifying this hobby to better make that trade might be valuable.
Thanks!
I'm not sure the repetitions helped much with accuracy for this prompt- it's still sort of randomizing traits between the two subjects. Though with a prompt this complex, the token limit may be an issue- it might be interesting to test at some point whether very simple prompts get more accurate with repetitions.
That said, the second set are pretty awesome- asking for a scene may have helped encourage some more interesting compositions. One benefit of repetition may just be that you're more likely to include phrases that more accurately describe what you're looking for.
When they released the first Dall-E, didn't OpenAI mention that prompts which repeated the same description several times with slight re-phrasing produced improved results?
I wonder how a prompt like:
"A post-singularity tribesman with a pet steampunk panther robot. Illustration by James Gurney."
-would compare with something like:
"A post-singularity tribesman with a pet steampunk panther robot. Illustration by James Gurney. A painting of an ornate robotic feline made of brass and a man wearing futuristic tribal clothing. A steampunk scene by James Gurney featuring a robot shaped like a panther and a high-tech shaman."
I think this argument can and should be expanded on. Historically, very smart people making confident predictions about the medium-term future of civilization have had a pretty abysmal track record. Can we pin down exactly why- what specific kind of error futurists have been falling prey to- and then see if that applies here?
Take, for example, traditional Marxist thought. In the early twentieth century, an intellectual Marxist's prediction of a stateless post-property utopia may have seemed to arise from a wonderfully complex yet self-consistent model which yielded many true predictions and which was refined by decades of rigorous debate and dense works of theory. Most intelligent non-Marxists offering counter-arguments would only have been able to produce some well-known point, maybe one for which the standard rebuttals made up a foundational part of the Marxist model.
So, what went wrong? I doubt there was some fundamental self-contradiction that the Marxists missed in all of their theory-crafting. If you could go back in time and give them a complete history of 20th-century economics labelled as speculative fiction, I don't think many of their models would update much- so it's not just a failure to imagine the true outcome. I think it may have been, in part, a mis-calibration of deductive reasoning.
Reading the old Sherlock Holmes stories recently, I found it kind of funny how irrational the hero could be. He'd make six observations, deduce W, X, and Y, and then rather than saying "I give W, X, and Y each a 70% chance of being true, and if they're all true then I give Z an 80% chance, therefore the probability of Z is about 27%", he'd just go "W, X, and Y; therefore Z!". This seems like a pretty common error.
Inductive reasoning can't take you very far into the future with something as fast as civilization- the error bars can't keep up past a year or two. But deductive reasoning promises much more. So long as you carefully ensure that each step is high-probability, the thinking seems to go, a chain of necessary implications can take you as far into the future as you want. Except that, like Holmes, people forget to multiply the probabilities- and a model complex enough to pierce that inductive barrier is likely to have a lot of probabilities.
The AI doom prediction comes from a complex model- one founded on a lot of arguments that seem very likely to be true, but which if false would sink the entire thing. That motivations converge on power-seeking; that super-intelligence could rapidly render human civilization helpless; that a real understanding of the algorithm that spawns AGI wouldn't offer any clear solutions; that we're actually close to AGI; etc. If we take our uncertainty about each one of the supporting arguments- small as they may be- seriously, and multiply them together, what does the final uncertainty really look like?
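To make that concrete with some purely illustrative numbers (these aren't anyone's actual estimates, and the premise list is just the sketch from the paragraph above):

```python
# Purely illustrative: confidence in a conclusion that depends on a chain
# of premises, each of which must hold. The premises and numbers here are
# placeholders, not anyone's actual estimates.
from math import prod

premise_confidence = {
    "motivations converge on power-seeking": 0.85,
    "superintelligence rapidly renders civilization helpless": 0.85,
    "understanding the AGI-spawning algorithm offers no clear solutions": 0.80,
    "we're actually close to AGI": 0.75,
}

p_all_hold = prod(premise_confidence.values())
print(f"P(all premises hold) = {p_all_hold:.2f}")  # about 0.43 with these numbers
```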
Thanks for posting these.
It's odd that mentioning Dall-E by name in the prompt would be a content policy violation. Do you know if they've mentioned why?
If you're still taking suggestions:
A beautiful, detailed illustration by James Gurney of a steampunk cheetah robot stalking through the ruins of a post-singularity city. A painting of an ornate brass automaton shaped like a big cat. A 4K image of a robotic cheetah in a strange, high-tech landscape.
I think OpenAI mentioned that including the same information several times with different phrasing helped for more complicated prompts in the first DALL-E, so I'm curious to see if that would help here- assuming that wouldn't be over the length limit.
For text-to-image synthesis, the Disco Diffusion notebook is pretty popular right now. Like other notebooks that use CLIP, it produces results that aren't very coherent, but which are interesting in the sense that they will reliably combine all of the elements described in a prompt in surprising and semi-sensible ways, even when those elements never occurred together in the models' training sets.
The Glide notebook from OpenAI is also worth looking at. It produces results that are much more coherent but also much less interesting than the CLIP notebooks. Currently, only the smallest version of the model is publicly available, so the results are unfortunately less impressive than those in the paper.
Also of note are the Chinese and Russian efforts to replicate DALL-E. Like Glide, the results from those are coherent but not very interesting. They can produce some very believable results for certain prompts, but struggle to generalize much outside of their training sets.
DALL-E itself still isn't available to the public, though I'm personally still holding out hope that OpenAI will offer a paid API at some point.
Has your experience with this project given you any insights into bioterrorism risk?
Suppose that, rather than synthesizing a vaccine, you'd wanted to synthesize a new pandemic. Would that have been remotely possible? Do you think the current safeguards will be enough to prevent that sort of thing as the technology develops over the next decade or so?
Do you think it's plausible that the whole deontology/consequentialism/virtue ethics confusion might arise from our idea of morality actually being a conflation of several different things that serve separate purposes?
Like, say there's a social technology that evolved to solve intractable coordination problems by getting people to rationally pre-commit to acting against their individual interests in the future, and additionally a lot of people have started to extend our instinctive compassion and tribal loyalties to the entirety of humanity, and also people have a lot of ideas about which sorts of behaviors take us closer to some sort of Pareto frontier- and maybe additionally there's some sort of acausal bargain that a lot of different terminal values converge toward or something.
If you tried to maximize just one of those, you'd obviously run into conflicts with the others- and then if you used the same word to describe all of them, that might look like a paradox. How can something be clearly good and not good at the same time, you might wonder, not realizing that you've used the word to mean different things each time.
If I'm right about that, it could mean that when encountering the question of "what is most moral" in situations where different moral systems provide different answers, the best answer might not be so much "I can't tell, since each option would commit me to things I think are immoral," but rather "'Morality' isn't a very well defined word; could you be more specific?"
When people talk about "human values" in this context, I think they usually mean something like "goals that are Pareto optimal for the values of individual humans"- and the things you listed definitely aren't that.
The marketing company Salesforce was founded in Silicon Valley in '99, and has been hugely successful. It's often ranked as one of the best companies in the U.S. to work for. I went to one of their conferences recently, and the whole thing was a massive status display- they'd built an arcade with Salesforce-themed video games just for that one conference, and had a live performance by Gwen Stefani, among other things.
...But the marketing industry is one massive collective action problem. It consumes a vast amount of labor and resources, distorts the market in a way that harms healthy competition, creates incentives for social media to optimize for engagement rather than quality, and develops dangerous tools for propagandists, all while producing nothing of value in aggregate. Without our massive marketing industry, we'd have to pay a subscription fee or a tax for services like Google and Facebook, but everything else would be cheaper in a way that would necessarily dwarf that cost (since the vast majority of the cost of marketing doesn't go to useful services)- and we'd probably have a much less sensationalist media on top of that.
People in Silicon Valley are absolutely willing to grant status to people who gained wealth purely through collective action problems.