Project Adequate: Seeking Cofounders/Funders

post by Lorec · 2024-11-17T03:12:12.995Z · LW · GW · 7 comments

"But the machines which reproduce machinery do not reproduce machines after their own kind. [ . . . ] Herein lies our danger. For many seem inclined to acquiesce in so dishonourable a future. They say that although man should become to the machines what the horse and dog are to us, yet that he will continue to exist, and will probably be better off in a state of domestication under the beneficent rule of the machines than in his present wild condition. We treat our domestic animals with much kindness. [ . . . ] With those who can argue in this way I have nothing in common." -- Samuel Butler, "Darwin among the Machines"

"Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an “intelligence explosion,” and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control. It is curious that this point is made so seldom outside of science fiction. It is sometimes worthwhile to take science fiction seriously." -- I.J. Good, "Speculations Concerning the First Ultraintelligent Machine"

"It was really just humans playing with an old library. It should be safe, using their own automation, clean and benign. This library wasn't a living creature, or even possessed of automation (which here might mean something more, far more, than human). They would look and pick and choose, and be careful not to be burned.... Humans starting fires and playing with the flames. The archive informed the automation. Data structures were built, recipes followed. A local network was built, faster than anything on Straum, but surely safe. Nodes were added, modified by other recipes. The archive was a friendly place, with hierarchies of translation keys that led them along. Straum itself would be famous for this. Six months passed. A year. The omniscient view. Not self-aware really. Self-awareness is much over-rated. Most automation works far better as a part of a whole, and even if human-powerful, it does not need to self-know. But the local net at the High Lab had transcended -- almost without the humans realizing. The processes that circulated through its nodes were complex, beyond anything that could live on the computers the humans had brought. Those feeble devices were now simply front ends to the devices the recipes suggested. The processes had the potential for self-awareness ... and occasionally the need. "We should not be." "Talking like this?" "Talking at all." The link between them was a thread, barely more than the narrowness that connects one human to another. But it was one way to escape the overness of the local net, and it forced separate consciousness upon them. They drifted from node to node, looked out from cameras mounted on the landing field. An armed frigate and a empty container vessel were all that sat there. It had been six months since resupply. A safety precaution early suggested by the archive, a ruse to enable the Trap. Flitting, flitting. We are wildlife that must not be noticed by the overness, by the Power that soon will be. On some nodes they shrank to smallness and almost remembered humanity, became echoes . . . 'Poor humans; they will all die.'" -- Vernor Vinge, "A Fire Upon The Deep"

"The cultural drive was updated continuously so that the program could make the best decisions at any given time. The information was so fresh that sometimes the program could anticipate which DNA quenes would become advantageous or disadvantageous in the near future, causing the program to adjust its recommendations on the fly in order to best serve its human clientele. [ . . . ] The traitor quene would respond: 'Yes, I'm sure. This fancy new protein will enhance our replication like nothing that preceded it.' As this warped quene swept through humanity, we evolved into phenotypic servants of the new quantome quenes, and the other DNA quenes came to see that the traitor had been correct. The last strides of our species comprised the grandest replication opportunity for that corrupt little quene, copying itself to all its human descendants on Earth before it, too, was eliminated as the last of our kind was extinguished." -- J.F. Gariepy, "The Revolutionary Phenotype"

"Now imagine the company fires all its employees and replaces them with robots. It fires the inventor and replaces him with a genetic algorithm that optimizes battery design. It fires the CEO and replaces him with a superintelligent business-running algorithm. All of these are good decisions, from a profitability perspective. We can absolutely imagine a profit-driven shareholder-value-maximizing company doing all these things. But it reduces the company’s non-masturbatory participation in an economy that points outside itself, limits it to just a tenuous connection with soccer moms and maybe some shareholders who want yachts of their own. Now take it further. Imagine there are no human shareholders who want yachts, just banks who lend the company money in order to increase their own value. And imagine there are no soccer moms anymore; the company makes batteries for the trucks that ship raw materials from place to place. Every non-economic goal has been stripped away from the company; it’s just an appendage of Global Development. Now take it even further, and imagine this is what’s happened everywhere. There are no humans left; it isn’t economically efficient to continue having humans." -- Scott Alexander, "Book Review: Age of Em"

"We think it’s very unlikely that the AI alignment field will be able to make progress quickly enough to prevent human extinction and the loss of the future’s potential value, that we expect will result from loss of control to smarter-than-human AI systems. [ . . . ] As such, in 2023, MIRI shifted its strategy to pursue three objectives: Policy: Increase the probability that the major governments of the world end up coming to some international agreement to halt progress toward smarter-than-human AI, until humanity’s state of knowledge and justified confidence about its understanding of relevant phenomena has drastically changed; and until we are able to secure these systems such that they can’t fall into the hands of malicious or incautious actors. Communications: Share our models of the situation with a broad audience, especially in cases where talking about an important consideration could help normalize discussion of it. Research: Continue to invest in a portfolio of research. This includes technical alignment research (though we’ve become more pessimistic that such work will have time to bear fruit if policy interventions fail to buy the research field more time), as well as research in support of our policy and communications goals. We see the communications work as instrumental support for our policy objective. We also see candid and honest communication as a way to bring key models and considerations into the Overton window, and we generally think that being honest in this way tends to be a good default. Although we plan to pursue all three of these priorities, it’s likely that policy and communications will be a higher priority for MIRI than research going forward." -- Malo Bourgon, "MIRI 2024 Mission and Strategy Update [LW · GW]"

Hello.

My name is Mack Gallagher.

I've made two posts ( [ 1 ] [LW · GW], [ 2 ] [LW · GW] ) previously asking for advice.

Now I'm asking for help.

In 2021 I reached out to Eliezer Yudkowsky. I told him I knew I and everything I loved were at near-term risk of extinction from unaligned ASI explosion. I said I had zero interest in building ASIs I was not sure were aligned. I told him I was historically literate enough to have a concept of a hard unsolved technical problem, and that there was nothing he could do to dissuade me from trying until the world was actually saved or I died. I said I didn't know much decision theory yet [ I've since gotten a better grasp on the basics ] but from what I'd seen of the field, if everything was simple, I imagined he [/MIRI] would be interested in funding me. He said there wasn't much I would be able to do if I didn't know how to code.

I spent a year learning how to code while cashiering and working fast food and then retail [ I took 6 months out to go to BloomTech ] and keeping up somewhat with AI. I tried to get a software job but ultimately could not get one [ I should have taken the ISA with BloomTech ]. I spent 15 months packing meat, reading old science books, and doing toy projects like writing half an MNIST recognizer in C with no matrix libraries, just to be sure I understood everything [ unfortunately that Arch install got corrupted so I don't have that code anymore ]. Then I moved across Iowa to work for my dad's company. That didn't work out [ I ended up finding out their business model was legit unethical and I told them so many times when I quit ].

Anyway, nothing that's happened since I contacted Eliezer in 2021 has assured me humanity's AI projects are on a non-doom trajectory, and altogether in the last year I've learned enough and matured enough that now I feel ready to do this myself.

I don't think I'm necessarily the only one who can do it.

But I'm someone who can do it.

I'm primarily looking for people who want to pay me to do this myself [ I'm not an arrogant type who will fail to seek expert tech consultants - it's just that I can recruit my own expert tech consultants as the project gets under way, and it's too early to say whether I'll find anyone who should be on payroll ]. And the nature of the work is genuinely such that sharing too much of it too publicly would be unwise. But it hardly makes sense to pay me if you don't have at least some kind of cutaway view into what you're paying for. So for anyone who is up for being a funder/cofounder, sharing that inside view privately makes sense. If you're not up for being a cofounder of Project Adequate but still want to fund it, then we can work something out.

What I can tell you at this stage:

You will unfortunately likely not be seeing immediate deliverable returns, and quite possibly not deliverable returns of any kind, ever. When the project is complete, maybe we build the [limited test] AI, maybe we hand the completed theory/design/gameplan/blueprint off to someone with a secure outfit who can be trusted to build it. I do not currently know.

This is [likely] a multi-year project. We need to build certain technical infrastructure before we know much about what the end stages will look like. The project unfortunately cannot afford to be derailed by commercial motivations that conflict with the motivation to survive.

I am mainly asking upfront for a salary adequate to pay moderate living/medical expenses and for a few normal computers. I do not need to be relocated to San Francisco [or anywhere else, in particular] at this time, unless someone else lives there who wants to work with me there.

We may need to raise significant capital later to build more specialized computers, but right now we do not know what scale or kind of capital that would be. We will certainly never need anything like even a fraction of a datacenter's worth of compute; I think that paradigm is at least logistically, and probably logically, hopeless for us.

My chosen short-to-medium-term AI goal / alignment target [ as of November 13, 2024 ] is optimized enzyme design - that is, making maximally efficient enzymes to do specific pre-defined "chemical construction" tasks. This target for early aligned models has some commercial potential, and that is part of why I [currently] favor it; however, I don't expect there to be enough time between provisionally solving the problem, and the more serious next steps we must take, for that commercial potential to actually be realized. This probable waste cannot be mourned, or salvaged, too aggressively, lest the project lose sight of its more consequential aims. We reserve the right to switch short-to-medium-term goals / alignment targets if another should suggest itself as superior, and to hotfix project strategy in other ways - though we will communicate with our funders about such changes as much as we can.

As for tooling, I am a Lisp maximalist. My reasons for dispensing with the conventional Python-NN-GPU paradigm can be stated in various ways. I imagine they align with Wolfram's, Norvig's, etc. [ I consider myself a McCarthyist rather than a Schmidhuberist, if I consider myself to be of any school. ] Basically, I respect Anthropic and Nanda immensely for trying, but I think the "giant inscrutable matrices of floating-point numbers", in Eliezer's words, are indeed hopelessly intractable [ illegible, incorrigible, etc. ]. I currently run a bare-bones Chicken Scheme instance; this will obviously grow quickly into something more advanced as the project bootstraps.
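
To give a flavor of what "legible" means here, a purely illustrative toy [ not project code; every name in it - toy-rule, hydrophobicity, pocket-volume - is made up for the example ]: a decision rule kept as a plain s-expression, plus a tiny evaluator over an association list of measured properties, in standard Scheme that runs under Chicken's csi. The point is only that the artifact is a datum you can print, diff, and audit, unlike a trained weight matrix.

    ;; Purely illustrative, not project code: a toy "rule" kept as a plain
    ;; s-expression, plus a tiny evaluator over an association list of
    ;; measured properties. All names here are invented for this example.
    (define toy-rule
      '(if (and (> hydrophobicity 0.4) (< pocket-volume 120))
           (accept "candidate fits the active-site constraint")
           (reject "fails the size or hydrophobicity check")))

    (define (evaluate rule env)
      (cond ((number? rule) rule)
            ((string? rule) rule)
            ((symbol? rule) (cdr (assq rule env)))   ; look up a measured property
            (else
             (case (car rule)
               ((if)  (if (evaluate (cadr rule) env)
                          (evaluate (caddr rule) env)
                          (evaluate (cadddr rule) env)))
               ((and) (let loop ((es (cdr rule)))
                        (cond ((null? es) #t)
                              ((evaluate (car es) env) (loop (cdr es)))
                              (else #f))))
               ((>)   (> (evaluate (cadr rule) env) (evaluate (caddr rule) env)))
               ((<)   (< (evaluate (cadr rule) env) (evaluate (caddr rule) env)))
               ((accept reject) (list (car rule) (cadr rule)))))))

    ;; Prints (accept candidate fits the active-site constraint).
    (display (evaluate toy-rule '((hydrophobicity . 0.55) (pocket-volume . 90))))
    (newline)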

My planned approach right now consists of:

  1. Iterate on skeleton sketch code solutions to MIRI's deepest Big Conceptual Puzzles

  2. In parallel with Step 1, improve legible-AI tooling and catch up to a roughly GPT-2-equivalent capability level

  3. Test current capabilities

  4. Run safety audit at current expertise level; stop and preempt safety issues; return to Step 1

  5. Repeat Steps 1-4 until the Pivotal Act phase [ a toy sketch of this loop follows below ]
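
To make the shape of that loop concrete, here is a toy control skeleton in the same bare-bones Scheme. Every procedure is a placeholder stub invented for the example [ including the stopping condition ]; it illustrates only the iterate-test-audit-repeat structure, not how any real step is implemented.

    ;; Illustrative only: stubs standing in for Steps 1-4, and a driver for Step 5.
    (define (iterate-on-puzzles state)      ; Step 1: refine conceptual sketch code
      (cons 'puzzles-advanced state))

    (define (improve-tooling state)         ; Step 2: extend legible-AI tooling
      (cons 'tooling-improved state))

    (define (test-capabilities state)       ; Step 3: probe current capabilities
      (cons 'capabilities-tested state))

    (define (safety-audit-passes? state)    ; Step 4: audit at current expertise level
      #t)                                   ; stub: always passes here; a real audit is the hard part

    (define (pivotal-phase-reached? state)  ; invented stopping condition so the
      (> (length state) 12))                ; example terminates

    (define (main-loop state)               ; Step 5: repeat Steps 1-4
      (if (pivotal-phase-reached? state)
          state
          (let* ((s1 (iterate-on-puzzles state))
                 (s2 (improve-tooling s1))
                 (s3 (test-capabilities s2)))
            (if (safety-audit-passes? s3)
                (main-loop s3)
                ;; stop, preempt the issue, and return to Step 1
                (main-loop (cons 'issue-preempted s3))))))

    (display (main-loop '())) (newline)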

Happy to receive any questions, technical or otherwise; there are some conceivable questions about details that I unfortunately won't be able to get into on LessWrong.

And if you think you have a better idea and I should work with you, that would be great news. Please comment.

[ DMs open ]

email: magallagher00@gmail.com

Signal: mack.11

Discord: kaventekeit

7 comments

comment by Meme Marine (meme-marine) · 2024-11-17T03:56:14.244Z · LW(p) · GW(p)

Kudos to you for actually trying to solve the problem, but I must remind you that the history of symbolic AI is pretty much nothing but failure after failure; what do you intend to do differently, and how do you intend to overcome the challenges that halted progress in this area for the past ~40 years?

Replies from: Lorec
comment by Lorec · 2024-11-17T04:09:02.636Z · LW(p) · GW(p)

This is one of those subject areas that'd be unfortunately bad to get into publicly. If you or any other individual wants to grill me on this, feel free to DM me or contact me by any of the above methods and I will take disclosure case by case.

Replies from: lcmgcd
comment by lemonhope (lcmgcd) · 2024-11-17T08:03:45.188Z · LW(p) · GW(p)

Wasted opportunity to guarantee this post keeps getting holywar comments for the next hundred years.

comment by Seth Herd · 2024-11-19T03:18:14.207Z · LW(p) · GW(p)

I missed this until I finally got around to responding to your last post, which I'd put on my todo list.

I applaud your initiative and drive! I do think it's a tough pitch to try to leapfrog the fast progress in deep networks. Nor do I think the alignment picture for those types of systems is nearly as bleak as Yudkowsky & the old school believe. But neither is it likely to be easy enough to leave to chance and those who don't fully grasp the seriousness of the problem. I've written about some of the most common Cruxes of disagreement on alignment difficulty [LW(p) · GW(p)].

So I'd suggest you would have better odds working within the ML framework that is happening with or without your help. I also think that even if you do produce a near-miraculous breakthrough in symbolic GOFAI, it would not necessarily be more interpretable - see Deep Learning Systems Are Not Less Interpretable Than Logic/Probability/Etc [LW · GW].

OTOH, if you have a really insightful approach, and a good reason to think the result would be easier to align than language model agents, maybe pursuing that path makes sense, since no one else is doing exactly that. As I said in my comment on your last request for directions, I think there are higher-expected-value nearly-as-underserved routes to survival; namely, working on alignment for the LLM agents that are our most likely route to first AGIs at this point (focusing on different routes from aligning the base LLMs, which is common but inadequate).

I'm also happy to talk. Your devotion to the project is impressive, and a resource not to be wasted!

Replies from: Lorec
comment by Lorec · 2024-11-20T06:06:05.371Z · LW(p) · GW(p)

Thank you for your kind comment! I disagree with the johnswentworth post you linked; it's misleading to frame NN interpretability as though we started out having any graph with any labels, weird-looking labels or not. I have sent you a DM.

comment by lemonhope (lcmgcd) · 2024-11-17T08:02:34.707Z · LW(p) · GW(p)

This is pretty inspiring to me. Thank you for sharing.

Replies from: Lorec
comment by Lorec · 2024-11-20T06:06:13.539Z · LW(p) · GW(p)

Thank you!