Posts

Meetup : Durham: Ugh Fields Followup 2014-04-28T00:32:15.869Z
Meetup : Ugh Fields 2014-04-16T16:32:51.417Z
Meetup : Durham: Stupid Rationality Questions 2014-03-18T17:11:40.975Z
Meetup : Social meetup in Raleigh 2014-01-21T14:27:35.517Z
Meetup : Durham: New Years' Resolutions etc. 2013-12-18T03:02:17.243Z
Meetup : Durham: Luminosity followup 2013-06-19T17:32:09.517Z
Rationality witticisms suitable for t-shirts or bumper stickers 2013-06-15T12:56:19.245Z
Meetup : Zendo and discussion 2013-06-05T00:35:23.107Z
Meetup : Durham: Luminosity (New location!) 2013-04-03T03:03:28.992Z
Meetup : Durham HPMoR Discussion, chapters 51-55 2013-04-03T02:56:22.161Z
Meetup : Durham: Status Quo Bias 2013-02-11T04:32:36.747Z
Meetup : Durham HPMoR Discussion, chapters 34-38 2013-02-09T03:59:45.544Z
Meetup : Durham HPMoR Discussion, chapters 30-33 2013-01-24T04:51:22.959Z
Meetup : Durham: Calibration Exercises 2013-01-16T21:31:02.538Z
Meetup : Durham HPMoR Discussion, chapters 27-29 2013-01-11T05:12:12.780Z
Meetup : Durham LW Meetup: Zendo 2012-12-30T19:30:34.095Z
Meetup : Durham HPMoR Discussion, chapters 24-26 2012-12-28T23:40:42.616Z
Meetup : Durham LW discussion 2012-12-18T18:38:13.094Z
Meetup : Durham HPMoR Discussion, chapters 21-23 2012-12-13T20:25:40.330Z
Meetup : Durham HPMoR Discussion, chapters 18-20 2012-11-30T17:16:21.069Z
Meetup : Durham HPMoR Discussion, chapters 15-17 2012-11-14T17:52:49.431Z
Meetup : Durham LW: Technical explanation, meta 2012-10-31T18:19:46.577Z
Meetup : Durham HPMoR discussion, ch 12-14 2012-10-31T18:13:57.163Z
Meetup : Durham Meetup: Article discussions 2012-10-16T23:00:14.776Z
Meetup : Durham HPMoR Discussion group 2012-10-16T22:34:31.727Z
Meetup : Durham NC HPMoR Discussion, chapters 4-7 2012-09-26T03:53:48.227Z
Meetup : Research Triangle Less Wrong 2012-09-19T20:03:21.463Z

Comments

Comment by evand on Range and Forecasting Accuracy · 2023-12-10T05:20:32.704Z · LW · GW

Have you considered looking at the old Foresight Exchange / Ideosphere data? It should be available, and it's old enough that there might be a useful number of long-term forecasts there.

http://www.ideosphere.com/

Comment by evand on Experiments as a Third Alternative · 2023-10-29T03:56:44.702Z · LW · GW

Yep, those are extremely different drugs with very different effects. They do in fact take a while and the effects can be much more subtle.

Comment by evand on Experiments as a Third Alternative · 2023-10-29T02:09:15.205Z · LW · GW

having read a few books about ADHD, my best guess is that I have (and had) a moderate case of it.

when I'm behind the wheel of a car and X = "the stoplight in front of me" and Y = "that new Indian restaurant", it is bad

Which is one of the reasons why I don't drive.

If your ADHD is interfering with your driving, that does not sound like a moderate case!

But option #3 is much better: take medication for six weeks and see what happens.

My expectation is that you will likely get a lot of information from trying the meds for 6 days, quite possibly even from 2-3 days or just 1; 6 weeks sounds like a very long experiment. 6 weeks sounds like enough time to test out a few doses and types (time release vs not, for example) and form opinions. And possibly get an understanding of whether you want to take it every day or only some days, and maybe even how to figure out which is which.

All of which is to say: yes, perform cheap experiments! They're great! This one is probably far faster (if not much cheaper in dollar terms) than you're predicting.

Comment by evand on How to Resolve Forecasts With No Central Authority? · 2023-10-26T01:27:16.117Z · LW · GW

Bitcoin Hivemind (nee Truthcoin) is the authority on doing this in a truly decentralized fashion. The original whitepaper is well worth a read. The fundamental insight: it's easier to coordinate on the truth than on something else; incentivizing defection to the truth works well.

Judges have reputation. If you judge against the consensus, you lose reputation, and the other (consensus) judges gain it. The amount you lose depends on the consensus: being the lone dissenter has a very small cost (mistakes have costs, but small ones), but being part of a near-majority is very costly. So if your conspiracy is almost but not quite winning, the gain from defecting against the conspiracy is very high.

Adaptations to a non-blockchain context should be fairly easy. In that case, you have a central algorithmic resolver that tracks reputation, but defers to its users for resolution of a particular market (and applies the reputation algorithm).
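
A toy sketch of that incentive structure (emphatically not the actual Hivemind algorithm, which is considerably more sophisticated; this is just one simple way to get the qualitative behavior described above):

```python
# Toy reputation update: judges who vote against the consensus forfeit some
# reputation to the consensus side, and the penalty grows with the size of the
# dissenting bloc (lone dissenters lose little; near-majorities lose a lot).

def update_reputation(reputations, votes):
    """reputations: {judge: rep}, votes: {judge: bool} on a single outcome."""
    total = sum(reputations.values())
    yes_weight = sum(rep for judge, rep in reputations.items() if votes[judge])
    consensus = yes_weight >= total / 2          # reputation-weighted majority
    minority_share = min(yes_weight, total - yes_weight) / total   # 0 .. 0.5
    penalty_rate = minority_share ** 2           # small for lone dissenters

    new_reps = {}
    forfeited = 0.0
    for judge, rep in reputations.items():
        if votes[judge] == consensus:
            new_reps[judge] = rep
        else:
            loss = rep * penalty_rate
            new_reps[judge] = rep - loss
            forfeited += loss

    # Redistribute the forfeited reputation to the consensus judges, pro rata.
    consensus_weight = sum(rep for judge, rep in new_reps.items()
                           if votes[judge] == consensus)
    for judge in new_reps:
        if votes[judge] == consensus:
            new_reps[judge] += forfeited * new_reps[judge] / consensus_weight
    return new_reps

# Lone dissenter "e" loses only a little; a two-judge dissenting bloc loses more.
reps = {j: 1.0 for j in "abcde"}
print(update_reputation(reps, {"a": True, "b": True, "c": True, "d": True, "e": False}))
print(update_reputation(reps, {"a": True, "b": True, "c": True, "d": False, "e": False}))
```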

Comment by evand on Show LW: Get a phone call if prediction markets predict nuclear war · 2023-09-19T04:45:09.447Z · LW · GW

Thank you!

I went ahead and created the 2024 version of one of the questions. If you're looking for high-liquidity questions to include, which seems like a good way to avoid false alarms / pranks, this one seems like a good inclusion.

There are a bunch of lower-liquidity questions; including a mix of those with some majority-rule type logic might or might not be worth it.

Comment by evand on Logical Share Splitting · 2023-09-14T04:18:06.627Z · LW · GW

Thank you! Much to think about, but later...

If there are a large number of true-but-not-publicly-proven statements, does that impose a large computational cost on the market making mechanism?

I expect that the computers running this system might have to be fairly beefy, but they're only checking proofs.

They're not, though. They're making markets on all the interrelated statements. How do they know when they're done exhausting the standing limit orders and AMM liquidity pools? My working assumption is that this is equivalent to a full Bayesian network and explodes exponentially for all the same reasons. In practice it's not maximally intractable, but you don't avoid the exponential explosion either -- it's just slower than the theoretical worst case.

If every new order placed has to be checked against the limit orders on every existing market, you have a problem.

For thickly traded propositions, I can make money by investing in a proposition first, then publishing a proof. That sends the price to $1 and I can make money off the difference. Usually, it would be more lucrative to keep my proof secret, though.

The problem I'm imagining comes when the market is trading at .999, but life would really be simplified for the market maker if it were actually, provably, at 1. So it could stop tracking that price as something interesting, and stop worrying about the combinatorial explosion.

So you'd really like to find a world where, once everyone has bothered to run the SAT-solver trick and figured out what route someone is minting free shares through, that route just becomes common knowledge and everyone's computational costs stop growing exponentially in that particular direction. And furthermore, the first person to figure out the exact route is actually rewarded for publishing it, rather than being able to extract money at slowly declining rates of return.

In other words: at what point does a random observer start turning "probably true, the market said so" into "definitely true, I can download the Coq proof"? And after that point, is the market maker still pretending to be ignorant?

Comment by evand on Logical Share Splitting · 2023-09-12T14:40:57.360Z · LW · GW

This is very neat work, thank you. One of those delightful things that seems obvious in retrospect, but that I've never seen expressed like this before. A few questions, or maybe implementation details that aren't obvious:

For complicated proofs, the fully formally verified statement all the way back to axioms might be very long. In practice, do we end up with markets for all of those? Do they each need liquidity from an automated market maker? Presumably not if you're starting from axioms and building a full proof, and that applies to implications and conjunctions and so on as well, because the market doesn't need to keep tracking things that are proven. However:

First Alice, who can prove , produces many many shares of  for free. This is doable if you have a proof for  by starting from a bunch of free  shares and using equivalent exchange. She sells these for $0.2 each to Bob, pure profit.

In order for this to work, the market must be willing to maintain a price for these shares in the face of a proof that they're equivalent to . Presumably the proof is not yet public, and if Alice has secret knowledge she can sell with a profit-maximizing strategy.

She could simply not provide the proof to the exchange, generating  and  pairs and selling only the latter, equivalent to just investing in A, but that requires capital. It's far more interesting if she can do it without tying up the capital.

So how does the market work for shares of proven things, and how does the proof eventually become public? Is there any way to incentivize publishing proofs, or do we simply get a weird world where everyone is pretty sure some things are true but the only "proof" is the market price?

If there are a large number of true-but-not-publicly-proven statements, does that impose a large computational cost on the market making mechanism?

Second question: how does this work in different axiom systems? Do we need separate markets, or can they be tied together well? How does the market deal with "provable from ZFC but not Peano"? "Theorem X implies corollary Y" is a thing we can prove, and if there's a price on shares of "Theorem X" then that makes perfect sense, but does it make sense to put a "price" on the "truth" of the ZFC axioms?

Presumably if we have a functional market that distinguishes Peano proofs from ZFC proofs, we'd like to distinguish more axiom sets. What happens if someone sets up an inconsistent axiom set, and that inconsistency is found? Presumably all dependent markets become a mess and there's a race to the exits that extracts all the liquidity from the AMMs; that seems basically fine. But can that be contained to only those markets, without causing weird problems in Peano-only markets?

Probably some of this would be clearer if I knew a bit more about modern proof formalisms.

Comment by evand on My current LK99 questions · 2023-08-04T03:10:21.933Z · LW · GW

My background: educated amateur. I can design simple to not-quite-simple analog circuits and have taken ordinary but fiddly material property measurements with electronics test equipment and gotten industrially-useful results.

One person alleges an online rumor that poorly connected electrical leads can produce the same graph.  Is that a conventional view?

I'm not seeing it. With a bad enough setup, poor technique can do almost anything. I'm not seeing the authors as that awful, though. I don't think they're immune from mistakes, but I give low odds on the arbitrarily-awful end of mistakes.

You can model electrical mistakes as some mix of resistors and switches. Fiddly loose contacts are switches, actuated by forces. Those can be magnetic, thermal expansion, unknown gremlins, etc. So "critical magnetic field" could be "magnetic field adequate to move the thing". Ditto temperature. But managing both problems at the same time in a way that looks like a plausible superconductor critical curve is... weird. The gremlins could be anything, but gremlins highly correlated with interesting properties demand explanation.

Materials with grains can have conducting and not-conducting regions. Those would likely have different thermal expansion behaviors. Complex oxides with grain boundaries are ripe for diode-like behavior. So you could have a fairly complex circuit with fairly complex temperature dependence.

I think this piece basically comes down to two things:

  1. Can you get this level of complex behavior out of a simple model? One curve I'd believe, but the multiple curves with the relationship between temperature and critical current don't seem right. The level of mistake to produce this seems complicated, with very low base rate.
  2. Did they manage to demonstrate resistivity low enough to rule out simple conduction in the zero-voltage regime? (For example, lower resistivity than copper by an order of magnitude.) The papers are remarkably short on details to this effect. They claim yes, but details are hard to come by. (Copper has resistivity ~ 1.7e-6 ohm*cm, they claim < 10^-10 in the 3-author paper for the thin-film sample, but details are in short supply.) Four point probe technique to measure the resistivity of copper in a bulk sample is remarkably challenging. You measure the resistivity of copper with thin films or long thin wires if you want good data. I'd love to see more here.

If the noise floor doesn't rule out copper, you can get the curves with adequately well chosen thermal and magnetic switches from loose contacts. But there are enough graphs that those errors have to be remarkably precisely targeted, if the graphs aren't fraud.
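
To put rough numbers on why that low-resistance regime is so hard to measure (my illustrative geometry and current, not the papers'):

```python
# Voltage a four-point probe must resolve for a bulk sample: V = (rho * L / A) * I.
# Copper is already in microvolt territory; the claimed bound is ~4 orders of
# magnitude below that.

rho_cu = 1.7e-6      # ohm*cm, copper
rho_claim = 1e-10    # ohm*cm, claimed upper bound for the thin-film sample
length_cm = 1.0      # spacing between voltage probes (assumed)
area_cm2 = 0.1       # sample cross-section (assumed)
current_a = 0.1      # measurement current (assumed)

for name, rho in [("copper", rho_cu), ("claimed", rho_claim)]:
    resistance = rho * length_cm / area_cm2   # ohms
    voltage = resistance * current_a          # volts
    print(f"{name}: R = {resistance:.3g} ohm, V = {voltage:.3g} V")
# copper:  ~1.7e-05 ohm, ~1.7e-06 V  (microvolts)
# claimed: ~1e-09 ohm,   ~1e-10 V    (a tenth of a nanovolt)
```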

Another thing I'd love to see on this front: multiple graphs of the same sort from the same sample (take it apart and put it back together), from different locations on the sample, from multiple samples. Bad measurement setups don't repeat cleanly.

My question for the NO side: what does the schematic of the bad measurement look like? Where do you put the diodes? How do you manage the sharp transition out of the zero-resistance regime without arbitrarily-fine-tuned switches?

Do any other results from the 6-person or journal-submitted LK papers stand out as having the property, "This is either superconductivity or fraud?"

The field-cooled vs zero-field-cooled magnetization graph (1d in the 3-author paper, 4a in the 6-author paper). I'm far less confident in this than the above; I understand the physics much less well. I mostly mention it because it seems under-discussed from what I've seen on twitter and such. This is an extremely specific form of thermal/magnetic hysteresis that I don't know of an alternate explanation for. I suspect this says more about my ignorance than anything else, but I'm surprised I haven't seen a proposed explanation from the NO camp.

Comment by evand on Lessons On How To Get Things Right On The First Try · 2023-06-21T05:36:54.943Z · LW · GW

The comparison between the calculations saying igniting the atmosphere was impossible and the catastrophic mistake on Castle Bravo is apposite as the initial calculations for both were done by the same people at the same gathering!

One out of two isn't bad, right?

https://twitter.com/tobyordoxford/status/1659659089658388545

Comment by evand on Why I am not an AI extinction cautionista · 2023-06-18T22:17:26.028Z · LW · GW

Of course a superintelligence could read your keys off your computer's power light, if it found it worthwhile. Most of the time it would not need to, it would find easier ways to do whatever humans do by pressing keys. Or make the human press the keys.

FYI, the referenced thing is not about what keys are being pressed on a keyboard, it's about extracting the secret keys used for encryption or authentication. You're using the wrong meaning of "keys".

Comment by evand on UFO Betting: Put Up or Shut Up · 2023-06-18T05:20:22.166Z · LW · GW

If you think the true likelihood is 10%, and are being offered odds of 50:1 on the bet, then the Kelly Criterion suggests you should bet about 8% of your bankroll. For various reasons (mostly human fallibility and an asymmetry in the curve of the Kelly utility), lots of people recommend betting at fractions of the Kelly amount. So someone in the position you suggest might reasonably wish to bet something like $2-5k per $100k of bankroll. That strategy, your proposed credences, and the behavior observed so far would imply a bankroll of a few hundred thousand dollars. That's not trivial, but also far from implausible in this community.
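
Spelling out that arithmetic (standard Kelly formula; the fractional-Kelly range is my own choice of illustration):

```python
# Kelly fraction for a bet at net odds b:1 with win probability p:
#   f* = (b*p - (1 - p)) / b
def kelly_fraction(p, b):
    return (b * p - (1 - p)) / b

p, b = 0.10, 50              # 10% credence, 50:1 offered odds
f = kelly_fraction(p, b)     # (5 - 0.9) / 50 = 0.082, i.e. ~8% of bankroll
print(f)

# Fractional Kelly (say, a quarter to a half of f*) on a $100k bankroll:
for frac in (0.25, 0.5):
    print(frac, round(100_000 * f * frac))   # ~$2k and ~$4k
```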

I'd also guess that the proper accounting of the spending here is partly a bet for positive expected value, and partly some sort of marketing push for higher credibility of their idea. I'm not sure of the exact mechanism or goal, and this is not a confident prediction, but it has that feel to it.

Comment by evand on What is the foundation of me experiencing the present moment being right now and not at some other point in time? · 2023-06-17T21:05:59.618Z · LW · GW

"Now" is the time at which you can make interventions. Subjective experience lines up with that because it can't be casually compatible with being in the future, and it maximizes the info available to make the decision with. Or rather, approximately maximizes subject to processing constraints: things get weird if you start really trying to ask whether "now" is "now" or "100 ms ago".

That's sort of an answer that seems like it depends on a concept of free will, though. To which my personal favorite response is... how good is your understanding of counterfactuals? Have you written a program that tries to play a two-player game, like checkers or go? If you have, you'll discover that your program is completely deterministic, yet has concepts like "now" and "if I choose X instead of Y" and they all just work.

Build an intuitive understanding of how that program works, and how it has both a self-model and understanding of counterfactuals while being deterministic in a very limited domain, and you'll be well under way to dissolving this confusion. (Or at least, I've spent a bunch of hours on such programs and I find the analogy super useful; YMMV and I'm probably typical-minding too much here.)
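
For concreteness, a minimal sketch of the kind of program I mean (generic minimax, with the game rules passed in as functions):

```python
# A completely deterministic game player that still reasons about
# counterfactuals: "if I play m from the current state, what happens?"

def best_move(state, moves, result, score, depth):
    """moves(s) -> legal moves, result(s, m) -> next state, score(s) -> number."""
    def value(s, d, maximizing):
        if d == 0 or not moves(s):
            return score(s)
        vals = [value(result(s, m), d - 1, not maximizing) for m in moves(s)]
        return max(vals) if maximizing else min(vals)

    # Evaluate each counterfactual "what if I chose m instead?" from "now"
    # (the current state), then commit to the best one.
    return max(moves(state), key=lambda m: value(result(state, m), depth - 1, False))
```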

Comment by evand on MetaAI: less is less for alignment. · 2023-06-14T02:53:59.485Z · LW · GW

My concern with conflating those two definitions of alignment is largely with the degree of reliability that's relevant.

The definition "does what the developer wanted" seems like it could cash out as something like "x% of the responses are good". So, if 99.7% of responses are "good", it's "99.7% aligned". You could even strengthen that as something like "99.7% aligned against adversarial prompting".

On the other hand, from a safety perspective, the relevant metric is something more like "probabilistic confidence that it's aligned against any input". So "99.7% aligned" means something more like "99.7% chance that it will always be safe, regardless of who provides the inputs, how many inputs they provide, and how adversarial they are".

In the former case, that sounds like a horrifyingly low number. What do you mean we only get to ask the AI 300 things in total before everyone dies? How is that possibly a good situation to be in? But in the latter case, I would roll those dice in a heartbeat if I could be convinced the odds were justified.
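
A quick back-of-the-envelope on that contrast, assuming (unrealistically) that responses fail independently:

```python
# Under the "99.7% of responses are good" reading, the chance of at least one
# bad response grows quickly with the number of queries.
p_good_per_response = 0.997
for n in (1, 100, 300, 1000):
    p_at_least_one_bad = 1 - p_good_per_response ** n
    print(n, round(p_at_least_one_bad, 3))
# 1 -> 0.003, 100 -> ~0.26, 300 -> ~0.59, 1000 -> ~0.95
```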

So anyway, I still object to using the "alignment" term to cover both situations.

Comment by evand on UFO Betting: Put Up or Shut Up · 2023-06-14T02:38:47.603Z · LW · GW

If there are reasons to refuse bets in general, that apply to the LessWrong community in aggregate, something has gone horribly horribly wrong.

No one is requiring you personally to participate, and I doubt anyone here is going to judge you for reluctance to engage in bets with people from the Internet who you don't know. Certainly I wouldn't. But if no one took up this bet, it would have a meaningful impact on my view of the community as a whole.

Comment by evand on A plea for solutionism on AI safety · 2023-06-11T00:03:02.675Z · LW · GW

I don't know how it prevents us from dying either! I don't have a plan that accomplishes that; I don't think anyone else does either. If I did, I promise I'd be trying to explain it.

That said, I think there are pieces of plans that might help buy time, or might combine with other pieces to do something more useful. For example, we could implement regulations that take effect above a certain model size or training effort. Or that prevent putting too many flops worth of compute in one tightly-coupled cluster.

One problem with implementing those regulations is that there's disagreement about whether they would help. But that's not the only problem. Other problems are things like: how hard would they be to comply with and audit compliance with? Is compliance even possible in an open-source setting? Will those open questions get used as excuses to oppose them by people who actually object for other reasons?

And then there's the policy question of how we move from the no-regulations world of today to a world with useful regulations, assuming that's a useful move. So the question I'm trying to attack is: what's the next step in that plan? Maybe we don't know because we don't know what the complete plan is or whether the later steps can work at all, but are there things that look likely to be useful next steps that we can implement today?

One set of answers to that starts with voluntary compliance. Signing an open letter creates common knowledge that people think there's a problem. Widespread voluntary compliance provides common knowledge that people agree on a next step. But before the former can happen, someone has to write the letter and circulate it and coordinate getting signatures. And before the latter can happen, someone has to write the tools.

So a solutionism-focused approach, as called for by the post I'm replying to, is to ask what the next step is. And when the answer isn't yet actionable, break that down further until it is. My suggestion was intended to be one small step of many, that I haven't seen discussed much as a useful next step.

Comment by evand on A plea for solutionism on AI safety · 2023-06-10T20:51:32.777Z · LW · GW

I think neither. Or rather, I support it, but that's not quite what I had in mind with the above comment, unless there's specific stuff they're doing that I'm not aware of. (Which is entirely possible; I'm following this work only loosely, and not in detail. If I'm missing something, I would be very grateful for more specific links to stuff I should be reading. Git links to usable software packages would be great.)

What I'm looking for mostly, at the moment, is software tools that could be put to use. A library, a tutorial, a guide for how to incorporate that library into your training run, and, as a result, better compliance with voluntary reporting. What I've seen so far is mostly high-effort investigative reports and red-teaming efforts.

Best practices around how to evaluate models and high-effort things you can do while making them are also great. But I'm specifically looking for tools that enable low effort compliance and reporting options while people are doing the same stuff they otherwise would be. I think that would complement the suggestions for high-effort best practices.

The output I'd like to see is things like machine-parseable quantification of flops used to generate a model, such that a derivative model would specify both total and marginal flops used to create it.
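
Concretely, something like this, where the field names are a hypothetical schema I just made up rather than any existing standard:

```python
# Hypothetical example of a machine-parseable training report for a
# derivative model; every field name here is illustrative only.
training_report = {
    "model_id": "example-model-v2",
    "base_model": "example-model-v1",        # the model this one derives from
    "marginal_training_flops": 3.1e21,       # flops spent on this run alone
    "total_training_flops": 2.4e23,          # including all upstream runs
    "datasets": ["<hash of dataset A>", "<hash of dataset B>"],
}
```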

Comment by evand on A plea for solutionism on AI safety · 2023-06-10T02:56:09.758Z · LW · GW

One thing I'd like to see more of: attempts at voluntary compliance with proposed plans, and libraries and tools to support that.

I've seen suggestions to limit the compute power used on large training runs. Sounds great; might or might not be the answer, but if folks want to give it a try, let's help them. Where are the libraries that make it super easy to report the compute power used on a training run? To show a Merkle tree of what other models or input data that training run depends on? (Or, if extinction risk isn't your highest priority, to report which media by which people got incorporated, and what licenses they were used under?) How do those libraries support reporting by open-source efforts, and incremental reporting?
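
As a purely illustrative sketch of the Merkle-tree idea (hypothetical helper names, not any existing library):

```python
# Hash each dependency (datasets, parent model weights), then commit to the
# sorted list of those hashes with a single root that a derivative model's
# report can reference. A real implementation would keep the full tree.
import hashlib

def h(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def dependency_root(dependency_hashes):
    return h("".join(sorted(dependency_hashes)).encode())

deps = [h(b"dataset A contents"), h(b"parent model weights")]
print(dependency_root(deps))  # changes if any upstream input changes
```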

What if the plan is alarm bells and shutdowns of concerning training runs? Or you're worried about model exfiltration by spies or rogue employees? Are there tools that make it easy to report what steps you're taking to prevent that? That make it easy to provide good security against those threat models? Where's the best practices guide?

We don't have a complete answer. But we have some partial answers, or steps that might move in the right direction. And right now actually taking those next steps, for marginal people kinda on the fence about how to trade capabilities progress against security and alignment work, looks like it's hard. Or at least harder than I can imagine it being.

(On a related note, I think the intersection of security and alignment is a fruitful area to apply more effort.)

Comment by evand on What's the best way to streamline two-party sale negotiations between real humans? · 2023-05-23T02:58:58.689Z · LW · GW

Aren't the other used cars available nearby, and the potential other buyers should you walk away, relevant to that negotiation?

Comment by evand on A Walkthrough of A Mathematical Framework for Transformer Circuits · 2023-05-22T00:55:53.612Z · LW · GW

This was fantastic; thank you! I still haven't quite figured it out, I'll definitely have to watch it a second time (or at least some parts of it).

I think some sort of improved interface for your math annotations and diagrams would be a big benefit, whether that's a drawing tablet or typing out some LaTeX or something else.

I think the section on induction heads and how they work could have used a bit more depth. Maybe a couple more examples, maybe some additional demos of how to play around with PySvelte, maybe something else. That's the section I had the most trouble following.

You mentioned a couple additional papers in the video; having links in the description would be handy. I suspect I can find them easily enough as it is, though.

Comment by evand on The Unexpected Clanging · 2023-05-18T17:37:07.429Z · LW · GW

Yes, if Omega accurately simulates me and wants me to be wrong, Omega wins. But why do I need to get the answer exactly "right"? What does it matter if I'm slightly off?

This would be a (very slightly) more interesting problem if Omega was offering a bet or a reward and my goal was to maximize reward or utility or whatever. It sure looks like for this setup, combined with a non-adversarial reward schedule, I can get arbitrarily close to maximizing the reward.

Comment by evand on New OpenAI Paper - Language models can explain neurons in language models · 2023-05-11T02:53:51.097Z · LW · GW

This feels reminiscent of:

If the human brain were so simple that we could understand it, we would be so simple that we couldn’t.

And while it's a well-constructed pithy quote, I don't think it's true. Can a system understand itself? Can a quining computer program exist? Where is the line between being able to recite itself and being able to understand itself?

You need a model above some threshold of capability at which it can provide useful interpretations, yes, but I don't see any obvious reason why that threshold would move up with the size of the model under interpretation.

Agreed. A quine needs some minimum complexity and/or language / environment support, but once you have one it's usually easy to expand it. Things could go either way, and the question is an interesting one needing investigation, not bare assertion.

And the answer might depend fairly strongly on whether you take steps to make the model interpretable or a spaghetti-code turing-tar-pit mess.

Comment by evand on Formalizing the "AI x-risk is unlikely because it is ridiculous" argument · 2023-05-04T00:44:08.657Z · LW · GW

I think that sounds about right. Collecting the arguments in one place is definitely helpful, and I think they carry some weight as initial heuristics, which this post helps clarify.

But I also think the technical arguments should (mostly) screen off the heuristics; the heuristics are better for evaluating whether it's worth paying attention to the details. By the time you're having a long debate, it's better to spend (at least some) time looking instead of continuing to rely on the heuristics. Rhymes with Argument Screens Off Authority. (And in both cases, only mostly screens off.)

Comment by evand on The Rocket Alignment Problem, Part 2 · 2023-05-03T01:13:09.772Z · LW · GW

That's the point. SpaceX can afford to fail at this; the decision makers know it. Eliezer can afford to fail at tweet writing and knows it. So they naturally ratchet up the difficulty of the problem until they're working on problems that maximize their expected return (in utility, not necessarily dollars). At least approximately. And then fail sometimes.

Or, for the trapeze artist... how long do they keep practicing? Do they do the no-net route when they estimate their odds of failure are 1/100? 1/10,000? 1e-6? They don't push them to zero, at some point they make a call and accept the risk and go.

Why should it be any different for an entity that can one-shot those problems? Why would they wait until they had invested enough effort to one-shot it, and then do so? When instead they could just... invest less effort, attempt it earlier, take some risk of failure, and reap a greater expected reward?

The analogy suggests that entities capable of one-shotting problem X (presumably, by putting in a lot of preparatory effort, running analysis, and so on) will do so. I don't think that's true.

(And I think the tweet writing problem is actually an especially strong example of this -- hypercompetitive social environments absolutely produce problems calibrated to be barely-solvable and that scale with ability, assuming your capability is in line with the other participants, which I assert is the case for Eliezer. He might be smarter / better at writing tweets than most, but he's not that far ahead.)

Comment by evand on The Rocket Alignment Problem, Part 2 · 2023-05-02T03:03:10.647Z · LW · GW

Perhaps I'm missing something obvious, and just continuing the misunderstanding, but...

It seems to me that if you're the sort of thing capable of one-shotting Starship launches, you don't just hang around doing so. You tackle harder problems. The basic Umeshism: if you're not failing sometimes, you're not trying hard enough problems.

Even the "existential" risk of SpaceX getting permanently and entirely shut down, or just Starship getting shut down, is much closer in magnitude to the payoff than is the case in AI risk scenarios.

Some problems are well calibrated to our difficulties, because we basically understand them and there's a feedback loop providing at least rough calibration. AI is not such a problem, rockets are, and so the analogy is a bad analogy. The problem isn't just one of communication, the analogy breaks for important and relevant reasons.

This is extremely true for hypercompetitive domains like writing tweets that do well.

Comment by evand on How can one rationally have very high or very low probabilities of extinction in a pre-paradigmatic field? · 2023-05-01T02:19:55.396Z · LW · GW

Taboo "rationally".

I think the question you want is more like: "how can one have well-calibrated strong probabilities?". Or maybe "correct". I don't think you need the word "rationally" here, and it's almost never helpful at the object level -- it's a tool for meta-level discussions, training habits, discussing patterns, and so on.

To answer the object-level question... well, do you have well-calibrated beliefs in other domains? Did you test that? What do you think you know about your belief calibration, and how do you think you know it?

Personally, I think you mostly get there by looking at the argument structure. You can start with "well, I don't know anything about proposition P, so it gets a 50%", but as soon as you start looking at the details that probability shifts. What paths lead there, what don't? If you keep coming up with complex conjunctive arguments against, and multiple-path disjunctive arguments for, the probability rapidly goes up, and can go up quite high. And that's true even if you don't know much about the details of those arguments, if you have any confidence at all that the process producing those is only somewhat biased. When you do have the ability to evaluate those in detail, you can get fairly high confidence.
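
Toy numbers, just to illustrate the shape of that effect:

```python
import math

# Several independent disjunctive paths, each individually unlikely,
# push the total probability up fast...
disjunctive_paths = [0.3, 0.3, 0.3, 0.3]
print(round(1 - math.prod(1 - p for p in disjunctive_paths), 2))   # ~0.76

# ...while a long conjunctive chain pushes it down fast.
conjunctive_steps = [0.8] * 6
print(round(math.prod(conjunctive_steps), 2))                      # ~0.26
```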

That said, my current way of expressing my confidence on this topic is more like "on my main line scenarios..." or "conditional on no near-term giant surprises..." or "if we keep on with business as usual...". I like the conditional predictions a lot more, partly because I feel more confident in them and partly because conditional predictions are the correct way to provide inputs to policy decisions. Different policies have different results, even if I'm not confident in our ability to enact the good ones.

Comment by evand on Moderation notes re: recent Said/Duncan threads · 2023-04-24T01:45:23.158Z · LW · GW

I'm still uncertain how I feel about a lot of the details on this (and am enough of a lurker rather than poster that I suspect it's not worth my time to figure that out / write it publicly), but I just wanted to say that I think this is an extremely good thing to include:

I will probably build something that let's people Opt Into More Said. I think it's fairly likely the mod team will probably generally do some more heavier handed moderation in the nearish future, and I think a reasonable countermeasure to build, to alleviate some downsides of this, is to also give authors a "let this user comment unfettered on my posts, even though the mod teams have generally restricted them in some way."

This strikes me basically as a way to move the mod team's role more into "setting good defaults" and less "setting the only way things work". How much y'all should move in that direction seems an open question, as it does limit how much cultivation you can do, but it seems like a very useful tool to make use of in some cases.

Comment by evand on What if we Align the AI and nobody cares? · 2023-04-20T04:27:54.901Z · LW · GW

i could believe this number’s within 3 orders of magnitude of truth, which is probably good enough for the point of this article

It's not. As best I can tell it's off by more like 4+ OOM. A very quick search suggests actual usage was maybe more like 1 GWh. Back of the envelope guess: thousands of GPUs, thousands of hours, < 1kW/GPU, a few GWh.
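
Spelling that out with round, illustrative numbers (not measured values):

```python
# Back-of-the-envelope training energy: GPUs * hours * power per GPU.
gpus = 2_000
hours = 1_000
kw_per_gpu = 0.4    # assumed average draw per GPU including overhead
energy_gwh = gpus * hours * kw_per_gpu / 1e6
print(energy_gwh)   # 0.8 GWh -- order 1 GWh, consistent with the figure above
```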

https://www.theregister.com/2020/11/04/gpt3_carbon_footprint_estimate/

https://www.numenta.com/blog/2022/05/24/ai-is-harming-our-planet/

i am a little surprised if you just took it 100% at face value.

Same. That doesn't seem to rise to the quality standards I'd expect.

Comment by evand on Outrage and Statistics into Policy · 2023-04-11T03:02:01.264Z · LW · GW

I feel like the argumentation here is kinda misleading.

Here's a pattern that doesn't work very well: a tragedy catches our attention, we point to statistics to show it's an example of a distressingly common problem, and we propose laws to address the issue.

The post promises to discuss a pattern. It's obviously a culture-war-relevant pattern, and I can see a pretty good case for it being one where all the examples are going to be at least culture-war-adjacent. It's an important pattern, if true, so far that seems justified and worth worrying about how to improve on.

But then the post provides one example. Is it a pattern? If so, what are the other cases, and why doesn't the post concern itself with them? If the post is about the pattern of mistakes and how to address them, and why this is a thing to worry about in general, shouldn't there be more like 3+ examples?

The case made that gun violence is not being addressed well, and is conflating at least two different but related problems, seems quite strong. It's a good read, and the statistics presented make a strong case that we're not addressing it well. I liked the argument presented, and felt better informed for having read the post (note to self: I should probably dig a bit deeper on the statistics, since the conclusion is one I'm pretty sympathetic to). But if that's the argument the post is going to make, why isn't the title / intro paragraph something more like "we're not addressing the main causes of gun violence" or "mass shootings that make headlines aren't central examples of the problem" or something like that?

I assume that the gun violence example is one case of many, and that there are both general and specific lessons to be learned. I assume it would be possible to over-generalize from one example, so looking for common features and common failures would be instructive.

Overall, it felt like the post made several good points, but that the structure was a bit of a bait and switch.

Comment by evand on LW Team is adjusting moderation policy · 2023-04-08T20:13:51.617Z · LW · GW

Something like an h-index might be better than a total.

Comment by evand on All AGI Safety questions welcome (especially basic ones) [April 2023] · 2023-04-08T16:33:50.383Z · LW · GW

For an extremely brief summary of the problem, I like this from Zvi:

The core reason we so often have seen creation of things we value win over destruction is, once again, that most of the optimization pressure by strong intelligences was pointing in that directly, that it was coming from humans, and the tools weren’t applying intelligence or optimization pressure. That’s about to change.

https://thezvi.wordpress.com/2023/03/28/response-to-tyler-cowens-existential-risk-ai-and-the-inevitable-turn-in-human-history/

Comment by evand on Eliezer Yudkowsky’s Letter in Time Magazine · 2023-04-07T01:28:38.329Z · LW · GW

Intelligence has no upper limit, instead of diminishing sharply in relative utility

It seems to me that there is a large space of intermediate claims that I interpret the letter as falling into. Namely, that if there exists an upper limit to intelligence, or a point at which the utility diminishes enough to not be worth throwing more compute cycles at it, humans are not yet approaching that limit. Returns can diminish for a long time while still being worth pursuing.

you have NO EVIDENCE that AGI is hostile or is as capable as you claim or support for any of your claims.

"No evidence" is a very different thing from "have not yet directly observed the phenomenon in question". There is, in fact, evidence from other observations. It has not yet raised the probability to [probability 1](https://www.lesswrong.com/posts/QGkYCwyC7wTDyt3yT/0-and-1-are-not-probabilities), but there does exist such a thing as weak evidence, or strong-but-inconclusive evidence. There is evidence for this claim, and evidence for the counterclaim; we find ourselves in the position of actually needing to look at and weigh the evidence in question.

Comment by evand on Policy discussions follow strong contextualizing norms · 2023-04-02T20:43:56.955Z · LW · GW

It's always possible to fudge the numbers and decide that some values are unimportant and some are super important and lo and behold, the calculation turns in your favour! In the end it's no better than deontology or simply saying "I think this is good"; there is no point trying to vest it with a semblance of objectivity that just isn't there.

Is this not simply the fallacy of gray?

As the saying goes, it's easy to lie with statistics, but even easier to lie without them. Certainly you can fudge the numbers to make the result say anything, but if you show your work then the fudging gets more obvious.

Comment by evand on Shannon's Surprising Discovery · 2023-04-02T17:49:56.640Z · LW · GW

I think you missed a follow-on edit:

"Let’s unpack what that 0.36 bits means,"

Comment by evand on "Rationalist Discourse" Is Like "Physicist Motors" · 2023-03-02T03:03:40.882Z · LW · GW

Metrics are only useful for comparison if they're accepted by a sufficient broad cross section of society. Since nearly everyone engages in discourse.

I note that "sufficiently broad" might mean something like "most of LessWrong users" or "most people attending this [set of] meetups". Just as communication is targeted at a particular audience, discourse norms are (presumably) intended for a specific context. That context probably includes things like intended users, audience, goals, and so on. I doubt "rationalist discourse" norms will align well with "televised political debate discourse" norms any time soon.

Nonetheless, I think we can discuss, measure, and improve rationalist discourse norms; and I don't think we should concern ourselves overly much with how well those norms would work in a presidential debate or a TV ad. I suspect there are still norms that apply very broadly, with broad agreement -- but those mostly aren't the ones we're talking about here on LessWrong.

Comment by evand on "Rationalist Discourse" Is Like "Physicist Motors" · 2023-02-26T17:57:28.643Z · LW · GW

"Physicist motors" makes little sense because that position won out so completely that the alternative is not readily available when we think about "motor design". But this was not always so! For a long time, wind mills and water wheels were based on intuition.

But in fact one can apply math and physics and take a "physicist motors" approach to motor design, which we see appearing in the 18th and 19th centuries. We see huge improvements in the efficiency of things like water wheels, the invention of gas thermodynamics, steam engines, and so on, playing a major role in the industrial revolution.

The difference is that motor performance is an easy target to measure and understand, and very closely related to what we actually care about (low Goodhart susceptibility). There are a bunch of parameters -- cost, efficiency, energy source, size, and so on -- but the number of parameters is fairly tractable. So it was very easy for the "physicist motor designers" to produce better motors, convince their customers the motors were better, and win out in the marketplace. (And no need for them to convince anyone who had contrary financial incentives.)

But "discourse" is a much more complex target, with extremely high dimensionality, and no easy way to simply win out in the market. So showing what a better approach looks like takes a huge amount of work and care, not only to develop it, but even to show that it's better and why.

If you want to find it, the "non-physicist motors" camp is still alive and well, living in the "free energy" niche on YouTube among other places.

Comment by evand on Satoshi Nakamoto? · 2017-11-03T01:26:13.777Z · LW · GW

Obvious kinds of humans include:

  • Dead humans. (Who didn't manage to leave the coins to their heirs.)
  • Cryonically preserved humans hoping to use them later. (Including an obvious specific candidate.)
  • Humans optimistic enough about Bitcoin to think current prices are too low. (We know Nakamoto had resources, so it seems a safe bet that they could keep living on ordinary means for now.)

And the obvious: you don't know that all of Nakamoto's coins fit the standard assumed profile. It's entirely possible they intentionally mined some with the regular setup and are spending a few from that pool.

Comment by evand on Inadequacy and Modesty · 2017-10-30T03:18:34.924Z · LW · GW

The advanced answer to this is to create conditional prediction markets. For example: a market for whether or not the Bank of Japan implements a policy, a market for the future GDP or inflation rate of Japan (or whatever your preferred metric is), and a conditional market for (GDP given policy) and (GDP given no policy).

Then people can make conditional bets as desired, and you can report your track record, and so on. Without a prediction market you can't, in general, solve the problem of "how good is this prediction track record really" except by looking at it in detail and making judgment calls.
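
A toy sketch of the settlement rule for a single conditional bet (simplified; real conditional markets do this with paired markets and prices, but this is the core idea):

```python
# A conditional bet is void (stake returned) if the condition never occurs;
# otherwise it pays out or loses on the underlying metric as usual.
def settle_conditional_bet(stake, net_odds, condition_happened, prediction_correct):
    if not condition_happened:
        return stake                                  # bet is void: money back
    return stake * (1 + net_odds) if prediction_correct else 0.0

# $100 at even odds on "metric improves, given the policy is implemented":
print(settle_conditional_bet(100, 1.0, condition_happened=False, prediction_correct=False))  # 100
print(settle_conditional_bet(100, 1.0, condition_happened=True,  prediction_correct=True))   # 200
```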

Comment by evand on Scope Insensitivity · 2017-06-20T05:32:49.309Z · LW · GW

I hope you have renter's insurance, knowledge of a couple evacuation routes, and backups for any important data and papers and such.

Comment by evand on Bet or update: fixing the will-to-wager assumption · 2017-06-11T16:54:39.793Z · LW · GW

I'm not aware of any legal implications in the US. US gambling laws basically only apply when there is a "house" taking a cut or betting to their own advantage or similar. Bets between friends where someone wins the whole stake are permitted.

As for the shady implications... spend more time hanging out with aspiring rationalists and their ilk?

Comment by evand on Bet or update: fixing the will-to-wager assumption · 2017-06-08T14:24:10.022Z · LW · GW

The richer structure you seek for those two coins is your distribution over their probabilities. They're both 50% likely to come up heads, given the information you have. You should be willing to make exactly the same bets about them, assuming the person offering you the bet has no more information than you do. However, if you flip each coin once and observe the results, your new probability estimates for the next flips are now different.

For example, for the second coin you might have a uniform distribution (ignorance prior) over the set of all possible probabilities. In that case, if you observe a single flip that comes up heads, your probability that the next flip will be heads is now 2/3.
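
That 2/3 is Laplace's rule of succession; a quick check:

```python
# With a uniform prior over the coin's bias, after h heads in n flips the
# probability of heads on the next flip is (h + 1) / (n + 2).
def predictive_heads(h, n):
    return (h + 1) / (n + 2)

print(predictive_heads(0, 0))  # 0.5       -- before any flips
print(predictive_heads(1, 1))  # 0.666...  -- the 2/3 mentioned above
```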

Comment by evand on [deleted post] 2017-05-30T19:57:56.918Z

Well, in general, I'd say achieving that reliability through redundant means is totally reasonable, whether in engineering or people-based systems.

At a component level? Lots of structural components, for example. Airplane wings stay attached at fairly high reliability, and my impression is that while there is plenty of margin in the strength of the attachment, it's not like the underlying bolts are being replaced because they failed with any regularity.

I remember an aerospace discussion about a component (a pressure switch, I think?). NASA wanted documentation for 6 9s of reliability, and expected some sort of very careful fault tree analysis and testing plan. The contractor instead used an automotive component (brake system, I think?), and produced documentation of field reliability at a level high enough to meet the requirements. Definitely an example where working to get the underlying component that reliable was probably better than building complex redundancy on top of an unreliable component.

Comment by evand on [deleted post] 2017-05-26T13:11:22.155Z

You might also want a mechanism to handle "staples" that individuals want. I have a few foods / ingredients I like to keep on hand at all times, and be able to rely on having. I'd have no objections to other people eating them, but if they did I'd want them to take responsibility for never leaving the house in a state of "no X on hand".

Comment by evand on [deleted post] 2017-05-26T13:06:27.086Z

Those numbers sound like reasonable estimates and goals. Having taught classes at TechShop, that first handful of hours is important. 20 hours of welding instruction ought to be enough that you know whether you like it and can build some useful things, but probably not enough to get even an intro-level job. It should give you a clue as to whether signing up for a community college class is a good idea or not.

Also I'm really confused by your inclusion of EE in that list; I'd have put it on the other one.

Comment by evand on [deleted post] 2017-05-25T23:54:06.058Z

However, I'm skeptical of systems that require 99.99% reliability to work. Heuristically, I expect complex systems to be stable only if they are highly fault-tolerant and degrade gracefully.

On the other hand... look at what happens when you simply demand that level of reliability, put in the effort, and get it. From my engineering perspective, that difference looks huge. And it doesn't stop at 99.99%; the next couple nines are useful too! The level of complexity and usefulness you can build from those components is breathtaking. It's what makes the 21st century work.

I'd be really curious to see what happens when that same level of uncompromising reliability is demanded of social systems. Maybe it doesn't work, maybe the analogy fails. But I want to see the answer!

Comment by evand on Hidden universal expansion: stopping runaways · 2017-05-16T17:29:19.070Z · LW · GW

What happens when the committed scorched-earth-defender meets the committed extortionist? Surely a strong precommitment to extortion by a powerful attacker can defeat a weak commitment to scorched earth by a defender?

It seems to me this bears a resemblance to Chicken or something, and that on a large scale we might reasonably expect to see both sets of outcomes.

Comment by evand on Change utility, reduce extortion · 2017-04-28T18:11:17.999Z · LW · GW

What's that? If I don't give into your threat, you'll shoot me in the foot? Well, two can play at that game. If you shoot me in the foot, just watch, I'll shoot my other foot in revenge.

Comment by evand on Defining the normal computer control problem · 2017-04-27T16:27:38.800Z · LW · GW

On the other hand... what level do you want to examine this at?

We actually have pretty good control of our web browsers. We load random untrusted programs, and they mostly behave ok.

It's far from perfect, but it's a lot better than the desktop OS case. Asking why one case seems to be so much farther along than the other might be instructive.

Comment by evand on Defining the normal computer control problem · 2017-04-27T15:58:28.810Z · LW · GW

Again, I'm going to import the "normal computer control" problem assumptions by analogy:

  • The normal control problem allows minor misbehaviour, but that it should not persist over time

Take a modern milling machine. Modern CNC mills can include a lot of QC. They can probe part locations, so that the setup can be imperfect. They can measure part features, in case a raw casting isn't perfectly consistent. They can measure the part after rough machining, so that the finish pass can account for imperfections from things like temperature variation. They can measure the finished part, and reject or warn if there are errors. They can measure their cutting tools, and respond correctly to variation in tool installation. They can measure their cutting tools to compensate for wear, detect broken tools, switch to the spare cutting bit, and stop work and wait for new tools when needed.

Again, I say: we've solved the problem, for things literally as simple as pounding a nail, and a good deal more complicated. Including variation in the nails, the wood, and the hammer. Obviously the solution doesn't look like a fixed set of voltages sent to servo motors. It does look like a fixed set of parts that get made.

How involved in the field of factory automation are you? I suspect the problem here may simply be that the field is more advanced than you give it credit for.

Yes, the solutions are expensive. We don't always use these solutions, and often it's because using the solution would cost more and take more time than not using it, especially for small quantity production. But the trend is toward more of this sort of stuff being implemented in more areas.

The "normal computer control problem" permits some defects, and a greater than 0% error rate, provided things don't completely fall apart. I think a good definition of the "hammer control problem" is similar.

Comment by evand on Defining the normal computer control problem · 2017-04-27T15:25:36.065Z · LW · GW

It bends the nails, leaves dents in the surface and given the slightest chance will even attack your fingers!

We've mostly solved that problem.

I'm not sure that being able to nearly perfectly replicate a fixed set of physical actions is the same thing as solving a control problem.

It's precisely what's required to solve the problem of a hammer that bends nails and leaves dents, isn't it?

Stuxnet-type attacks

I think that's outside the scope of the "hammer control problem" for the same reasons that "an unfriendly AI convinced my co-worker to sabotage my computer" is outside the scope of the "normal computer control problem" or "powerful space aliens messed with my FAI safety code" is outside the scope of the "AI control problem".

It is worth noting that the type of control that you mention (e.g. "computer-controlled robots") is all about getting as far from "agenty" as possible.

I don't think it is, or at least not exactly. Many of the hammer failures you mentioned aren't "agenty" problems, they're control problems in the most classical engineering sense: the feedback loop my brain implements between hammer state and muscle output is incorrect. The problem exists with humans, but also with shoddily-built nail guns. Solving it isn't about removing "agency" from the bad nail gun.

Sure, if agency gets involved in your hammer control problem you might have other problems too. But if the "hammer control problem" is to be a useful problem, you need to define it as not including all of the "normal computer control problem" or "AI control problem"! It's exactly the same situation as the original post:

  • The normal control problem assumes that no specific agency in the programs (especially not super-intelligent agency)
Comment by evand on [Stub] Extortion and Pascal's wager · 2017-04-27T14:57:59.936Z · LW · GW

They usually don't have any way to leverage their models to increase the cost of not buying their product or service though; so such a situation is still missing at least one criterion.

Modern social networks and messaging networks would seem to be a strong counterexample. Any software with both network effects and intentional lock-in mechanisms, really.

And honestly, calling such products a blend of extortion and trade seems intuitively about right.

To try to get at the extortion / trade distinction a bit better:

Schelling gives us definitions of promises and threats, and also observes there are things that are a blend of the two. The blend is actually fairly common! I expect there's something analogous with extortion and trade: you can probably come up with pure examples of both, but in practice a lot of examples will be a blend. And a lot of the 'things we want to allow' will look like 'mostly trade with a dash of extortion' or 'mostly trade but both sides also seem to be doing some extortion'.