Posts

Biorisk is an Unhelpful Analogy for AI Risk 2024-05-06T06:20:28.899Z
A Dozen Ways to Get More Dakka 2024-04-08T04:45:19.427Z
"Open Source AI" isn't Open Source 2024-02-15T08:59:59.034Z
Technologies and Terminology: AI isn't Software, it's... Deepware? 2024-02-13T13:37:10.364Z
Safe Stasis Fallacy 2024-02-05T10:54:44.061Z
AI Is Not Software 2024-01-02T07:58:04.992Z
Public Call for Interest in Mathematical Alignment 2023-11-22T13:22:09.558Z
What is autonomy, and how does it lead to greater risk from AI? 2023-08-01T07:58:06.366Z
A Defense of Work on Mathematical AI Safety 2023-07-06T14:15:21.074Z
"Safety Culture for AI" is important, but isn't going to be easy 2023-06-26T12:52:47.368Z
"LLMs Don't Have a Coherent Model of the World" - What it Means, Why it Matters 2023-06-01T07:46:37.075Z
Systems that cannot be unsafe cannot be safe 2023-05-02T08:53:35.115Z
Beyond a better world 2022-12-14T10:18:26.810Z
Far-UVC Light Update: No, LEDs are not around the corner (tweetstorm) 2022-11-02T12:57:23.445Z
Announcing AISIC 2022 - the AI Safety Israel Conference, October 19-20 2022-09-21T19:32:35.581Z
Rehovot, Israel – ACX Meetups Everywhere 2022 2022-08-25T18:01:16.106Z
AI Governance across Slow/Fast Takeoff and Easy/Hard Alignment spectra 2022-04-03T07:45:57.592Z
Arguments about Highly Reliable Agent Designs as a Useful Path to Artificial Intelligence Safety 2022-01-27T13:13:11.011Z
Elicitation for Modeling Transformative AI Risks 2021-12-16T15:24:04.926Z
Modelling Transformative AI Risks (MTAIR) Project: Introduction 2021-08-16T07:12:22.277Z
Maybe Antivirals aren’t a Useful Priority for Pandemics? 2021-06-20T10:04:08.425Z
A Cruciverbalist’s Introduction to Bayesian reasoning 2021-04-04T08:50:07.729Z
Systematizing Epistemics: Principles for Resolving Forecasts 2021-03-29T20:46:06.923Z
Resolutions to the Challenge of Resolving Forecasts 2021-03-11T19:08:16.290Z
The Upper Limit of Value 2021-01-27T14:13:09.510Z
Multitudinous outside views 2020-08-18T06:21:47.566Z
Update more slowly! 2020-07-13T07:10:50.164Z
A Personal (Interim) COVID-19 Postmortem 2020-06-25T18:10:40.885Z
Market-shaping approaches to accelerate COVID-19 response: a role for option-based guarantees? 2020-04-27T22:43:26.034Z
Potential High-Leverage and Inexpensive Mitigations (which are still feasible) for Pandemics 2020-03-09T06:59:19.610Z
Ineffective Response to COVID-19 and Risk Compensation 2020-03-08T09:21:55.888Z
Link: Does the following seem like a reasonable brief summary of the key disagreements regarding AI risk? 2019-12-26T20:14:52.509Z
Updating a Complex Mental Model - An Applied Election Odds Example 2019-11-28T09:29:56.753Z
Theater Tickets, Sleeping Pills, and the Idiosyncrasies of Delegated Risk Management 2019-10-30T10:33:16.240Z
Divergence on Evidence Due to Differing Priors - A Political Case Study 2019-09-16T11:01:11.341Z
Hackable Rewards as a Safety Valve? 2019-09-10T10:33:40.238Z
What Programming Language Characteristics Would Allow Provably Safe AI? 2019-08-28T10:46:32.643Z
Mesa-Optimizers and Over-optimization Failure (Optimizing and Goodhart Effects, Clarifying Thoughts - Part 4) 2019-08-12T08:07:01.769Z
Applying Overoptimization to Selection vs. Control (Optimizing and Goodhart Effects - Clarifying Thoughts, Part 3) 2019-07-28T09:32:25.878Z
What does Optimization Mean, Again? (Optimizing and Goodhart Effects - Clarifying Thoughts, Part 2) 2019-07-28T09:30:29.792Z
Re-introducing Selection vs Control for Optimization (Optimizing and Goodhart Effects - Clarifying Thoughts, Part 1) 2019-07-02T15:36:51.071Z
Schelling Fences versus Marginal Thinking 2019-05-22T10:22:32.213Z
Values Weren't Complex, Once. 2018-11-25T09:17:02.207Z
Oversight of Unsafe Systems via Dynamic Safety Envelopes 2018-11-23T08:37:30.401Z
Collaboration-by-Design versus Emergent Collaboration 2018-11-18T07:22:16.340Z
Multi-Agent Overoptimization, and Embedded Agent World Models 2018-11-08T20:33:00.499Z
Policy Beats Morality 2018-10-17T06:39:40.398Z
(Some?) Possible Multi-Agent Goodhart Interactions 2018-09-22T17:48:22.356Z
Lotuses and Loot Boxes 2018-05-17T00:21:12.583Z
Non-Adversarial Goodhart and AI Risks 2018-03-27T01:39:30.539Z

Comments

Comment by Davidmanheim on Linear infra-Bayesian Bandits · 2024-05-15T09:31:05.472Z · LW · GW

I'll note that I think this is a mistake that lots of people working in AI safety have made, ignoring the benefits of academic credentials and prestige because of the obvious costs and annoyance. It's not always better to work in academia, but it's also worth really appreciating the costs of not doing so, in foregone opportunities and experience, as Vanessa highlighted. (Founder effects matter; Eliezer had good reasons not to pursue this path, but I think others followed that path instead of evaluating the question clearly for their own work.)

And in my experience, much of the good work coming out of AI safety has been sidelined because it fails the academic prestige test, and so it fails to engage with academics who could contribute or who have done closely related work. Other work avoids or fails the publication process because the authors don't have the right kind of guidance and experience to get their papers into the right conferences and journals; not only is that work often worse for not getting feedback from peer review, it also doesn't engage others in the research area.

Comment by Davidmanheim on Tools to discern between real and AI · 2024-05-13T15:34:07.970Z · LW · GW

There aren't good ways to do this automatically for text, and the state of the art is rapidly evolving.
https://arxiv.org/abs/2403.05750v1

For photographic images that contain detailed depictions of humans, or non-standard objects with fine details, there are still some reasonably good heuristics for when AIs will mess up those details, but I'm not sure how long they will remain valid.

Comment by Davidmanheim on yanni's Shortform · 2024-05-12T06:36:20.179Z · LW · GW

This is one of the key reasons that the term alignment was invented and used instead of control; I can be aligned with the interests of my infant, or my pet, without any control on their part.

Comment by Davidmanheim on How do top AI labs vet architecture/algorithm changes? · 2024-05-08T20:26:26.167Z · LW · GW

Most of this seems to be subsumed in the general question of how to do research, and there's lots of advice, but it's (ironically) not at all a science. From my limited understanding of what goes on in the research groups inside these companies, it's a combination of research intuition, small-scale testing, checking with others and discussing the new approach, validating your ideas, and getting buy-in from people higher up that it's worth your and their time to try the new idea. Which is the same as research generally.

At that point, I'll speculate and assume whatever idea they have is validated in smaller but still relatively large settings. For things like sample efficiency, they might, say, train a GPT-3 size model, which now costs only a fraction of the researcher's salary to do. (Yes, I'm sure they all have very large compute budgets for their research.) If the results are still impressive, I'm sure there is lots more discussion and testing before actually using the method in training the next round of frontier models that cost huge amounts of money - and those decisions are ultimately made by the teams building those models, and management.

Comment by Davidmanheim on Zero-Sum Defeats Nash Equilibrium · 2024-05-08T20:12:52.331Z · LW · GW

It seems like you're not being clear about how you are thinking about the cases, or are misusing some of the terms. Nash equilibria exist in zero-sum games, so those aren't different things. If you're familiar with how to do game theory, I think you should carefully set up the situation you're claiming as a payoff matrix, and then check whether, given the set of actions you posit people have in each case, the scenarios you're calling Nash equilibria actually are Nash equilibria.
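
To illustrate the kind of check I mean, here's a minimal sketch using a hypothetical 2x2 game (the payoffs are made up for illustration, not taken from your post): a strategy profile only counts as a Nash equilibrium if neither player can gain by unilaterally deviating.

```python
# Minimal sketch with a hypothetical 2x2 game (payoffs are illustrative only).
# A strategy profile is a (pure) Nash equilibrium iff no player can gain by
# unilaterally switching to another of their actions.

import itertools

# payoffs[(row_action, col_action)] = (row_payoff, col_payoff)
payoffs = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"):    (0, 5),
    ("defect",    "cooperate"): (5, 0),
    ("defect",    "defect"):    (1, 1),
}
actions = ["cooperate", "defect"]

def is_nash(row_a, col_a):
    row_pay, col_pay = payoffs[(row_a, col_a)]
    # Can the row player do better by deviating, holding the column player fixed?
    if any(payoffs[(alt, col_a)][0] > row_pay for alt in actions if alt != row_a):
        return False
    # Can the column player do better by deviating, holding the row player fixed?
    if any(payoffs[(row_a, alt)][1] > col_pay for alt in actions if alt != col_a):
        return False
    return True

for profile in itertools.product(actions, actions):
    print(profile, "Nash equilibrium" if is_nash(*profile) else "not an equilibrium")
```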

Comment by Davidmanheim on an effective ai safety initiative · 2024-05-08T20:08:26.160Z · LW · GW

...but there are a number of EAs working on cybersecurity in the context of AI risks, so one premise of the argument here is off.

And a rapid response site for the public to report cybersecurity issues and account hacking generally would do nothing to address the problems that face the groups that most need to secure their systems, and wouldn't even solve the narrower problem of reducing those hacks, so this seems like the wrong approach even given the assumptions you suggest. 

Comment by Davidmanheim on Biorisk is an Unhelpful Analogy for AI Risk · 2024-05-08T20:00:05.370Z · LW · GW

I agree that your question is weird and confused, and agree that if that were the context, my post would be hard to understand. But I think it's a bad analogy! That's because there are people who have made analogies between AI and Bio very poorly, and those analogies are misleading and lead to sloppy thinking. In my experience seeing discussions on the topic, either the comparisons are drawn carefully and the relevant dissimilarities are discussed clearly, or they are bad analogies.
 
To stretch your analogy, if the context were that I'd recently heard people say "Steve and David are both people I know, and if you don't like Steve, you probably won't like David," and also "Steve and David are both concerned about AI risks, so they agree on how to discuss the issue," I'd wonder if there was some confusion, and I'd feel comfortable saying that in general, Steve is an unhelpful analog for David, and all these people should stop and be much more careful in how they think about comparisons between us.

Comment by Davidmanheim on Biorisk is an Unhelpful Analogy for AI Risk · 2024-05-08T08:34:39.530Z · LW · GW

I agree with you that analogies are needed, but they are also inevitably limited. So I'm fine with saying "AI is concerning because its progress is exponential, and we have seen from COVID-19 that we need to intervene early," or "AI is concerning because it can proliferate as a technology like nuclear weapons," or "AI is like biological weapons in that countries will pursue and use these because they seem powerful, without appreciating the dangers they create if they escape control." But what concerns me is that you seem to be suggesting we should make the general claim "AI poses uncontrollable risks like pathogens do," or "AI needs to be regulated the way biological pathogens are," and that's something I strongly oppose. By ignoring all of the specifics, the analogy fails.

In other words, "while I think the disanalogies are compelling, comparison can still be useful as an analytic tool - while keeping in mind that the ability to directly learn lessons from biorisk to apply to AI is limited by the vast array of other disanalogies."

Comment by Davidmanheim on Biorisk is an Unhelpful Analogy for AI Risk · 2024-05-08T03:37:38.155Z · LW · GW

I said:

disanalogies listed here aren’t in and of themselves reasons that similar strategies cannot sometimes be useful, once the limitations are understood. For that reason, disanalogies should be a reminder and a caution against analogizing, not a reason on its own to reject parallel approaches in the different domains.

You seem to be simultaneously claiming that I had plenty of room to make a more nuanced argument, and saying that you think I'm claiming something which exactly the nuance I included addresses. Yes, people could cite the title of the blog post to make a misleading claim, assuming others won't read it - and if that's your concern, perhaps it would be enough to change the title to "Biorisk is Often an Unhelpful Analogy for AI Risk," or "Biorisk is Misleading as a General Analogy for AI Risk"?

Comment by Davidmanheim on Biorisk is an Unhelpful Analogy for AI Risk · 2024-05-07T10:26:26.923Z · LW · GW

I agree that we do not have an exact model for anything in immunology, unlike physics, and there is a huge amount of uncertainty. But that's different than saying it's not well-understood; we have clear gold-standard methods for determining answers, even if they are very expensive. This stands in stark contrast to AI, where we don't have the ability to verify that something works or is safe at all without deploying it, and even that isn't much of a check on its later potential for misuse.

But aside from that, I think your position is agreeing with mine much more than you imply. My understanding is that we have newer predictive models which can give uncertain but fairly accurate answers to many narrow questions. (Older, non-ML methods also exist, but I'm less familiar with them.) In your hypothetical case, I expect that the right experts can absolutely give indicative answers about whether a novel vaccine peptide is likely or unlikely to have cross-reactivity with various immune targets, and the biggest problem is that it's socially unacceptable to assert confidence in anything short of a tested and verified case. But the models can get, in the case of the Zhang et al paper above, 70% accurate answers, which can help narrow the problem for drug or vaccine discovery, though they do need to be followed with in vitro tests and trials.

Comment by Davidmanheim on Biorisk is an Unhelpful Analogy for AI Risk · 2024-05-07T05:25:54.291Z · LW · GW

I'm arguing exactly the opposite; experts want to make comparisons carefully, and those trying to transmit the case to the general public should, at this point, stop using these rhetorical shortcuts that imply wrong and misleading things.

Comment by Davidmanheim on Biorisk is an Unhelpful Analogy for AI Risk · 2024-05-07T05:23:30.184Z · LW · GW

On net, the analogies being used to try to explain are bad and misleading.

I agree that I could have tried to convey a different message, but I don't think it's the right one. Anyone who wants to dig in can decide for themselves, but you're arguing that ideal reasoners won't conflate different things and can disentangle the similarities and differences. I agree with that; I'm just noting that people aren't actually doing it, and others seem to agree.

Comment by Davidmanheim on Biorisk is an Unhelpful Analogy for AI Risk · 2024-05-07T03:16:10.402Z · LW · GW

I don't understand why you disagree. Sure, pathogens can have many hosts, but other hosts generally follow the same logic as humans: their attack surfaces are static and well adapted, and they are similarly increasingly well understood.

Comment by Davidmanheim on Paul Christiano named as US AI Safety Institute Head of AI Safety · 2024-04-26T15:28:42.601Z · LW · GW

That doesn't seem like "consistently and catastrophically," it seems like "far too often, but with thankfully fairly limited local consequences."

Comment by Davidmanheim on Paul Christiano named as US AI Safety Institute Head of AI Safety · 2024-04-26T05:32:12.332Z · LW · GW

BSL isn't the thing that defines "appropriate units of risk"; that's pathogen risk-group levels, and I agree that those are a problem because they focus on pathogen lists rather than actual risks. I actually think BSL levels are good at what they do, and the problem is regulation and oversight, which is patchy, as well as transparency, of which there is far too little. But those are issues with oversight, not with the types of biosecurity measures that are available.

Comment by Davidmanheim on Paul Christiano named as US AI Safety Institute Head of AI Safety · 2024-04-26T05:26:17.301Z · LW · GW

If you're appealing to OpenPhil, it might be useful to ask one of the people who was working with them on this as well.

And you've now equivocated between "they've induced an EA cause area" and a list of the range of risks covered by biosecurity - not what their primary concerns are - citing this as "one of them." I certainly agree that biosecurity levels are one of the things biosecurity is about, and that "the possibility of accidental deployment of biological agents" is a key issue, but that's incredibly far removed from the original claim that the failure of BSL levels induced the cause area!

Comment by Davidmanheim on Paul Christiano named as US AI Safety Institute Head of AI Safety · 2024-04-25T18:36:42.142Z · LW · GW

I mean, I'm sure something more restrictive is possible. 

But what? Should we insist that the entire time someone's inside a BSL-4 lab, we have a second person who is an expert in biosafety visually monitoring them to ensure they don't make mistakes? Or should their air supply not use filters and completely safe PAPRs, and instead feed them outside air through a tube that restricts their ability to move around? (Edit to add: These are already both required in BSL-4 labs. When I said I don't know of anything more restrictive they could do, I was being essentially literal - they do everything, including quite a number of unreasonable things, to prevent human infection, short of just not doing the research.)

Or do you have some new idea that isn't just a ban with more words?
 

"lists of restrictions" are a poor way of managing risk when the attack surface is enormous 

Sure, list-based approaches are insufficient, but they have relatively little to do with the biosafety levels of labs; they have to do with risk groups, which are distinct but often conflated. (So Ebola or smallpox isn't a "BSL-4" pathogen, because there is no such thing.)

I just meant "gain of function" in the standard, common-use sense—e.g., that used in the 2014 ban on federal funding for such research.

That ban didn't go far enough: it only applied to 3 pathogen types, and wouldn't have banned what Wuhan was doing with novel viruses, since that work wasn't with SARS or MERS but with other virus species. So sure, we could enforce a broader version of that ban, but getting a good definition that's extensive enough to prevent dangerous work and doesn't ban obviously useful research is very hard.

Comment by Davidmanheim on Paul Christiano named as US AI Safety Institute Head of AI Safety · 2024-04-25T18:23:18.405Z · LW · GW

Having written extensively about it, I promise you I'm aware. But please, tell me more about how this supports the original claim which I have been disagreeing with, that this class of incidents was or is the primary concern of the EA biosecurity community, the one that led to it being a cause area.

Comment by Davidmanheim on Paul Christiano named as US AI Safety Institute Head of AI Safety · 2024-04-24T17:14:51.346Z · LW · GW

The OP claimed a failure of BSL levels was the single thing that induced biorisk as a cause area, and I said that was a confused claim. Feel free to find someone who disagrees with me here, but the proximate causes of EAs worrying about biorisk have nothing to do with BSL lab designations. It's not BSL levels that failed in allowing things like the Soviet bioweapons program, or led to the underfunded and largely unenforceable BWC, or the way that newer technologies are reducing the barriers to terrorists and others being able to pursue bioweapons.

Comment by Davidmanheim on Paul Christiano named as US AI Safety Institute Head of AI Safety · 2024-04-24T15:58:20.803Z · LW · GW

I did not say that they didn't want to ban things; I explicitly said "whether to allow certain classes of research at all." And when I said "happy to rely on those levels," I meant that the idea that we should have "BSL-5" is the kind of silly thing that novice EAs propose; it doesn't make sense because there literally isn't anything significantly more restrictive available short of just banning the research.

I also think that "nearly all EA's focused on biorisk think gain of function research should be banned" is obviously underspecified, and wrong because of the details. Yes, we all think that there is a class of work that should be banned, but tons of work that would be called gain of function isn't in that class.

Comment by Davidmanheim on Paul Christiano named as US AI Safety Institute Head of AI Safety · 2024-04-22T12:24:55.014Z · LW · GW

BSL levels, which have failed so consistently and catastrophically they've induced an EA cause area,

This is confused and wrong, in my view. The EA cause area around biorisk is mostly happy to rely on those levels, and unlike for AI, the (very useful) levels predate EA interest and give us something to build on. The questions are largely instead about whether to allow certain classes of research at all, the risks of those who intentionally do things that are forbidden, and how new technology changes the risk.

Comment by Davidmanheim on Partial value takeover without world takeover · 2024-04-18T20:19:48.852Z · LW · GW

and then the 2nd AI pays some trivial amount to the 1st for the inconvenience

Completely as an aside, coordination problems among ASIs don't go away, so this is a highly non-trivial claim.

Comment by Davidmanheim on Staged release · 2024-04-18T11:59:54.646Z · LW · GW

I thought that the point was that either managed-interface-only access, or API access with rate limits, monitoring, and appropriate terms of service, can prevent use of some forms of scaffolding. If it's a staged release, this makes sense to do, at least for a brief period while confirming that there are no security or safety issues.

Comment by Davidmanheim on Staged release · 2024-04-18T11:56:54.615Z · LW · GW

These days it's rare for a release to advance the frontier substantially.

This seems to be one crux. Sure, there's no need for staged release if the model doesn't actually do much more than previous models, and doesn't have unpatched vulnerabilities of types that would be identified by somewhat broader testing.

The other crux, I think, is around public release of model weights. (Often referred to, incorrectly, as "open sourcing.") Staged release implies not releasing weights immediately - and I think this is one of the critical issues with what companies like X have done, and it makes it important to demand staged release for any model claiming to be as powerful as or more powerful than current frontier models. (In addition to testing and red-teaming, which they also don't do.)

Comment by Davidmanheim on OMMC Announces RIP · 2024-04-08T13:31:37.454Z · LW · GW

It is funny, but it also showed up on April 2nd in Europe and anywhere farther east...

Comment by Davidmanheim on A Dozen Ways to Get More Dakka · 2024-04-08T13:29:01.018Z · LW · GW

I think there are two very different cases of "almost works" that are being referred to. The first is where the added effort is going in the right direction, and the second is where it is slightly wrong. For the first case, if you have a drug that doesn't quite treat your symptoms, it might be because it addresses all of them somewhat, in which case increasing the dose might make sense. For the second case, you could have one that addresses most of the symptoms very well, but makes one worse, or has an unacceptable side effect, in which case increasing the dose wouldn't help. Similarly, we could imagine a muscle that is uncomfortable. The second case might then be a stretch that targets almost the right muscle. That isn't going to help if you do it more. The first case, on the other hand, would be a stretch that targets the right muscle but isn't doing enough, and obviously it could be great to do more often, or for a longer time.

Comment by Davidmanheim on My Clients, The Liars · 2024-03-12T08:29:34.414Z · LW · GW

Again, I think it was a fine and enjoyable post.

But I didn't see where you "demonstrate how I used very basic rationalist tools to uncover lies," which could have improved the post, and I don't think this really explored any underappreciated parts of "deception and how it can manifest in the real world" - which I agree is underappreciated. Unfortunately, this post didn't provide much clarity about how to find it, or how to think about it. So again, it's a fine post, good stories, and I agree they illustrate being more confused by fiction than reality, and other rationalist virtues, but as I said, it was not "the type of post that leads people to a more nuanced or better view of any of the things discussed." 

Comment by Davidmanheim on My Clients, The Liars · 2024-03-10T10:35:34.025Z · LW · GW

I disagree with this decision, not because I think it was a bad post, but because it doesn't seem like the type of post that leads people to a more nuanced or better view of any of the things discussed, much less a post that provided insight or better understanding of critical things in the broader world. It was enjoyable, but not what I'd like to see more of on Less Wrong.

(Note: I posted this response primarily because I saw that lots of others also disagreed with this, and think it's worth having on the record why at least one of us did so.)

Comment by Davidmanheim on The World in 2029 · 2024-03-05T11:34:13.977Z · LW · GW

"Climate change is seen as a bit less of a significant problem"
 

That seems shockingly unlikely (5%) - even if we have essentially eliminated all net emissions (10%), we will still be seeing continued warming (99%) unless we have widely embraced geoengineering (10%). If we have, it is a source of significant geopolitical contention (75%) due to uneven impacts (50%) and pressure from environmental groups (90%) worried that it is promoting continued emissions and / or causes other harms. Progress on carbon capture is starting to pay off (70%) but is not (90%) deployed at anything like the scale needed to stop or reverse warming.

Adaptation to climate change has continued (99%), but it is increasingly obvious how expensive it is and how badly it is impacting the developing world. The public still seems to think this is the fault of current emissions (70%), and carbon taxes or similar legal limits are in place for a majority of G7 countries (50%) but less than half of other countries (70%).

Comment by Davidmanheim on On coincidences and Bayesian reasoning, as applied to the origins of COVID-19 · 2024-02-22T15:26:35.129Z · LW · GW

To start, the claim that it was found 2 miles from the facility is an important mistake, because WIV is 8 miles from the market. For comparison to another city people might know better, in New York, that's the distance between the World Trade Center and either Columbia University or Newark Airport. Wuhan's downtown is around 16 miles across. 8 miles away just means it was in the same city.

And you're over-reliant on the evidence you want to pay attention to. For example, even restricting ourselves to "nearby coincidence" evidence, the Huanan market is the largest in central China - so what are the odds that a natural spillover event occurs immediately surrounding the largest animal market? If the disease actually emerged from WIV, what are the odds that the cases centered around the Huanan market, 8 miles away, instead of the Baishazhou live animal market, 3 miles away, or the Dijiao market, also 8 miles away?

So I agree that an update can be that strong, but this one simply isn't.

Comment by Davidmanheim on On coincidences and Bayesian reasoning, as applied to the origins of COVID-19 · 2024-02-20T22:12:27.135Z · LW · GW

Yeah, but I think it's more than it not being taken literally; the exercise is fundamentally flawed when used as an argument rather than very narrowly for honest truth-seeking, which is almost never possible in a discussion without unreasonably high levels of trust and confidence in others' epistemic reliability.

Comment by Davidmanheim on On coincidences and Bayesian reasoning, as applied to the origins of COVID-19 · 2024-02-19T14:28:53.309Z · LW · GW
  1. What is the relevance of the "posterior" that you get after updating on a single claim that's being chosen, post-hoc, as the one that you want to use as an example?
  2. Using a weak prior biases towards thinking the information you have to update with is strong evidence. How did you decide on that particular prior? You should presumably have some reference class for your prior. (If you can't do that, you should at least have equipoise between all reasonable hypotheses being considered. Instead, you're updating "Yes Lableak" versus "No Lableak" - but in fact, "from a Bayesian perspective, you need an amount of evidence roughly equivalent to the complexity of the hypothesis just to locate the hypothesis in theory-space. It’s not a question of justifying anything to anyone.") 
  3. How confident are you in your estimate of the Bayes factor here? Do you have calibration data for roughly similar estimates you have made? Should you be adjusting for less than perfect confidence? (A small numerical sketch of points 2 and 3 follows this list.)
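
To make points 2 and 3 concrete, here is a minimal numerical sketch (the priors and Bayes factors are made-up illustrations, not estimates for this case) of how much the posterior depends on which prior you start from, and on whether the claimed Bayes factor is taken at face value or discounted for imperfect calibration:

```python
# Hedged sketch with hypothetical numbers: the chosen prior and any
# discounting of the claimed Bayes factor largely drive the posterior.

def posterior_prob(prior_prob: float, bayes_factor: float) -> float:
    """Convert a prior probability to a posterior via odds times Bayes factor."""
    prior_odds = prior_prob / (1 - prior_prob)
    post_odds = prior_odds * bayes_factor
    return post_odds / (1 + post_odds)

claimed_bf = 20.0              # a single "coincidence" update, taken at face value
shrunk_bf = claimed_bf ** 0.5  # one crude way to discount for imperfect calibration

for prior in (0.5, 0.1, 0.01):  # equipoise vs. reference-class-informed priors
    print(f"prior={prior:.2f}  "
          f"posterior(BF=20)={posterior_prob(prior, claimed_bf):.2f}  "
          f"posterior(shrunk BF)={posterior_prob(prior, shrunk_bf):.2f}")
```
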
Comment by Davidmanheim on On coincidences and Bayesian reasoning, as applied to the origins of COVID-19 · 2024-02-19T13:53:55.928Z · LW · GW

Thank you for writing this.

I think most points here are good points to make, but I also think it's useful as a general caution against this type of exercise being used as an argument at all! So I'd obviously caution against anyone taking your response itself as a reasonable attempt at an estimate of the "correct" Bayes factors, because this is all very bad epistemic practice!  Public explanations and arguments are social claims, and usually contain heavily filtered evidence (even if unconsciously). Don't do this in public.

That is, this type of informal Bayesian estimate is useful as part of a ritual for changing your own mind, when done carefully. That requires a significant degree of self-composure, a willingness to change one's mind, and a high degree of justified confidence in your own mastery of unbiased reasoning.

Here, though, it is presented as an argument, which is not how any of this should work. And in this case, it was written by someone who already had a strong view of what the outcome should be, repeated frequently in public, which makes it doubly hard to accept at face value the implicit claim that the exercise started from an unbiased point! At the very least, we would need strong evidence that it was not an exercise in motivated reasoning - that the bottom line wasn't written before the evaluation started - and no such statement appears, though to be fair, it would be hard to believe even if it had been made.

Comment by Davidmanheim on "Open Source AI" isn't Open Source · 2024-02-16T07:59:30.042Z · LW · GW

I agree that releasing model weights is "partially open sourcing" - in much the same way that freeware is "partially open sourcing" software, or that restrictive licenses with code availability are.

But that's exactly the point; you don't get to call something X because it's kind-of-like X, it needs to actually fulfill the requirements in order to get the label. What is being called Open Source AI doesn't actually do the thing that it needs to.

Comment by Davidmanheim on "Open Source AI" isn't Open Source · 2024-02-15T20:55:55.385Z · LW · GW

Thanks - I agree that this discusses the licenses, which would be enough to make Llama not qualify, but I think there's a strong claim I put forward in the full linked piece that even if the model weights were released under a GPL license, those "open" model weights wouldn't make it open in the sense that Open Source means elsewhere.

Comment by Davidmanheim on "Open Source AI" isn't Open Source · 2024-02-15T20:51:38.471Z · LW · GW

I agree that the reasons someone wants the dataset generally aren't the same reasons they'd want to compile from source code. But there's a lot of utility for research in having access to the dataset even if you don't recompile. Checking whether there was test-set leakage for metrics, for example, or assessing how much of LLM ability is stochastic parroting of specific passages versus recombination. And if it were actually open, these would not be hidden from researchers.

And supply chain is a reasonable analogy - but many open-source advocates make sure that their code doesn't depend on closed / proprietary libraries. It's not actually "libre" if you need to have a closed source component or pay someone to make the thing work. Some advocates, those who built or control quite a lot of the total open source ecosystem, also put effort into ensuring that the entire toolchain needed to compile their code is open, because replicability shouldn't be contingent on companies that can restrict usage or hide things in the code. It's not strictly required, but it's certainly relevant.

Comment by Davidmanheim on "Open Source AI" isn't Open Source · 2024-02-15T20:44:46.435Z · LW · GW

The vast majority of uses of software are via changing configuration and inputs, not modifying code and recompiling the software. (Though lots of Software as a Service doesn't even let you change configuration directly.) But software is not open in this sense unless you can recompile, because it's not actually giving you full access to what was used to build it.

The same is the case for what Facebook call open-source LLMs; it's not actually giving you full access to what was used to build it.

Comment by Davidmanheim on "Open Source AI" isn't Open Source · 2024-02-15T15:49:52.875Z · LW · GW

Thanks - Redpajama definitely looks like it fits the bill, but it shouldn't need to bill itself as making "fully-open, reproducible models," since that's what "open source" is already supposed to mean. (Unfortunately, the largest model they have is 7B.)

Comment by Davidmanheim on "Open Source AI" isn't Open Source · 2024-02-15T15:45:06.113Z · LW · GW

Yes, agreed - as I said in the post, "Open Source AI simply means that the models have the model weights released - the equivalent of software which makes the compiled code available. (This is otherwise known as software.)"

Comment by Davidmanheim on "Open Source AI" isn't Open Source · 2024-02-15T13:57:39.654Z · LW · GW

"Freely remixable" models don't generally have open datasets used for training. If you know of one, that's great, and would be closer to open source. (Not Mistral. And Phi-2 is using synthetic data from other LLMs - I don't know what they released about the methods used to generate or select the text, but it's not open.)

But the entire point is that weights are not the source code for an LLM; they are the compiled program. Yes, it's modifiable via LoRA and similar, but that's not open source! Open source would mean I could replicate it from the ground up. For Facebook's models, at least, the details of the training methods, the RLHF training they do, where they get the data - all of those things are secrets. But they call it "Open Source AI" anyway.

Comment by Davidmanheim on Technologies and Terminology: AI isn't Software, it's... Deepware? · 2024-02-13T16:51:37.116Z · LW · GW

Good point, and I agree that it's possible that what I see as essential features might go away - "floppy disks" turned out to be a bad name when they ended up inside hard plastic covers, and "deepware" could end up the same - but I am skeptical that it will.

I agree that early electronics were buggy until we learned to build them reliably - and perhaps we can solve this for gradient-descent based learning, though many are skeptical of that, since many of the problems have been shown to be pretty fundamental. I also agree that any system is inscrutable until you understand it, but unlike early electronics, no one understands these massive lists of numbers that produce text, and humans can't build them by hand; they just program a process to grow them. (Yes, composable NNs could solve some of this, as you point out when mentioning separable systems, but I still predict they won't be well understood, because the components individually are still deepware.)

Comment by Davidmanheim on Safe Stasis Fallacy · 2024-02-11T19:29:32.807Z · LW · GW

You talk about "governance by Friendly AGI" as if it's a solved problem we're just waiting to deploy, not speculation that might simply not be feasible even if we solve AGI alignment, which itself is plausibly unsolvable in the near term. You also conflate AI safety research with AI governance regimes. And note that the problems with governance generally aren't a lack of intelligence by those in charge, it's largely conflicting values and requirements. And with that said, you talk about modern liberal governments as if they are the worst thing we've experienced, "riddled with brokenness," as if that's the fault of the people in charge, not the deeply conflicting mandates that the populace gives them. And to the extent that the systemic failure is the fault of the untrustworthy incentives of those in charge, why would controllable or aligned AGI fix that?

Yes, stasis isn't safe by default, but undirected progress isn't a panacea, and governance certainly isn't any closer to solved just because we have AI progress.

Comment by Davidmanheim on Brute Force Manufactured Consensus is Hiding the Crime of the Century · 2024-02-11T19:21:38.765Z · LW · GW

Thanks. I was unaware of the law, and yes, that does seem to be strong evidence that the agencies in question don't have any evidence specific enough to come to any conclusion. That, or they are foolishly risking pissing off Congress, which can subpoena them, and seems happy to do exactly that in other situations - and they would do so knowing that it's eventually going to come out that they withheld evidence?!?

Again, it's winter, people get sick, that's very weak Bayesian evidence of an outbreak, at best. On priors, how many people at an institute that size get influenza every month during the winter?
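
As a rough illustration of that base-rate point - using entirely hypothetical numbers for staff size and the monthly winter rate of influenza-like illness, since I don't have the real figures - a few sick employees in a winter month is roughly what you'd expect by default:

```python
# Rough base-rate sketch with hypothetical numbers: how many influenza-like
# illnesses (ILI) should we expect at a research institute in a winter month anyway?
from math import comb

staff = 300               # hypothetical institute size
monthly_ili_rate = 0.02   # hypothetical chance a given person gets an ILI in a winter month

expected_cases = staff * monthly_ili_rate
# Probability of at least 3 cases in a month, treating cases as independent (binomial)
p_at_least_3 = 1 - sum(
    comb(staff, k) * monthly_ili_rate**k * (1 - monthly_ili_rate)**(staff - k)
    for k in range(3)
)
print(f"Expected cases: {expected_cases:.1f}; P(>=3 cases in one month): {p_at_least_3:.2f}")
```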

And the fact that it was only 3 people, months earlier, seems to indicate moderately strongly that it wasn't the source of the full COVID-19 outbreak: if it had already infected 3 different people, then given the lack of precautions against spread at the time, it seems likely it would have spread more widely within China starting then.

Comment by Davidmanheim on Brute Force Manufactured Consensus is Hiding the Crime of the Century · 2024-02-11T07:17:30.260Z · LW · GW

Sorry, I'm having trouble following. You're saying that 1) it's unlikely to be a lab leak known to US Intel because it would have been known to us via leaks, and 2) you think that Intel agencies have evidence about WIV employees having COVID and that it's being withheld?

First, I think you're overestimating both how much information from highly sensitive sources would leak, and how much Chinese leaders would know if it were a lab leak. This seems on net to be mostly uninformative. 

Second, if they have evidence about WIV members having COVID (and not, you know, any other respiratory disease in the middle of flu/cold season), I still don't know why you think you would know that it was withheld from Congress. Intel agencies share classified information with certain members of Congress routinely, but you'd never know what was or was not said. You think a lack of a leak is evidence that information would have been illegally withheld from Congress - but it's not illegal for intel agencies to keep information secret, in a wide variety of cases.

And on that second point, even without the above arguments, not having seen such evidence publicly leaked can't plausibly be more likely in a world where it was a lab leak that was hidden, than it would be in a world where it wasn't a lab leak and the evidence you're not seeing simply doesn't exist!

Comment by Davidmanheim on Brute Force Manufactured Consensus is Hiding the Crime of the Century · 2024-02-09T13:30:01.457Z · LW · GW

The State Department isn't part of "US intelligence agencies and military," and faces very, very different pressures. And despite this, as you point out, there are limits to internal pressures in intel agencies - which at least makes it clear that the intel agencies don't have strong and convincing non-public evidence for the leak hypothesis.

Comment by Davidmanheim on Brute Force Manufactured Consensus is Hiding the Crime of the Century · 2024-02-09T13:27:08.545Z · LW · GW

I'm not saying it's impossible; I'm saying it's implausible. (So if this is a necessary precondition for believing in a lab leak, it is clear evidence against it.)

Comment by Davidmanheim on Brute Force Manufactured Consensus is Hiding the Crime of the Century · 2024-02-07T15:18:50.700Z · LW · GW

"(and likely parts of the US intelligence agencies and military) desperately wanted this to not be a lab leak."

 

As I said in another comment, that seems very, very hard to continue to believe, even if it might have seemed plausible on priors.

Comment by Davidmanheim on Brute Force Manufactured Consensus is Hiding the Crime of the Century · 2024-02-07T15:16:38.454Z · LW · GW

Whoever publishes or sends out notices may or may not have others they check with. That's sometimes the local health authority directly, but it may go through the national government. I don't know enough about how that works in China to say in general who might have been able to tell the Wuhan Municipal Health Committee or the WCDC what they were and were not supposed to say when they made their announcements. However, we have lots of information about what was said in the public statements and hospital records from that time, most of which is mentioned here. (You don't need to trust him much; the descriptions of the systems and what happened when are well known.) But data is also disseminated informally through lots of channels, and I don't know who would have been getting updates from colleagues or sending data to the WHO or the US CDC.

Comment by Davidmanheim on Brute Force Manufactured Consensus is Hiding the Crime of the Century · 2024-02-07T13:46:42.080Z · LW · GW

But the government would need to have started the cover-up while they were suppressing evidence. It's weird to think they were simultaneously covering up transmission and faking the data about the cases to make it fit the claim that it originated in the wet market.

Comment by Davidmanheim on Brute Force Manufactured Consensus is Hiding the Crime of the Century · 2024-02-07T13:41:15.167Z · LW · GW

Most very large changes in viral evolution are lateral transfer between viruses, rather than accumulation of point mutations. The better claim would be that this was acquired by a proto-SARS-CoV-2 virus that way, not that it was the result of cross-species changes alone.