huh. was it the particular meme (brave dude telling the truth), the size, or some third thing?
context note: Jacob is also a mod/works for LessWrong, kave isn't doing this to random users.
the same argument for a different virtue, allegedly from C.S. Lewis
I think this is beautiful but incorrect, mostly because it discounts the virtue in keeping yourself in situations where virtue is easy.
I interpret your comment as assuming that new researchers with good ideas produce more impact on their own than in teams working towards a shared goal
I don't believe that, although I see how my summary could be interpreted that way. I agree with basically all the reasons in your recent comment and most in the original comment. I could add a few reasons of my own why doing independent grant-funded work sucks. But I think it's really important to track how well founding projects translates into increased safety, rather than intermediates, and push hard against potential tail-wagging-the-dog scenarios.
I was trying to figure out why this was important to me, given how many of your points I agree with. I think it's a few things:
- Alignment work seems to be prone to wagging the dog, and is harder to correct, due to poor feedback loops.
- The consequences of this can be dire
- making it harder to identify and support the best projects.
- making it harder to identify and stop harmful projects
- making it harder to identify when a decent idea isn't panning out, leading to people and money getting stuck in the mediocre project instead of moving on.
- One of the general concerns about MATS is it spins up potential capabilities researchers. If the market can't absorb the talent, that suggests maybe MATS should shrink.
- OTOH if you told me that for every 10 entrants MATS spins up 1 amazing safety researcher and 9 people who need makework to prevent going into capabilities, I'd be open to arguments that that was a good trade.
Everyone who waits longer than me to publicly share their ideas is a coward, afraid to expose their ideas to the harsh light of day. Everyone who publicly shares their ideas earlier than me is a maniac, wasting other people's time with stream-of-consciousness bullshit.
This still reads to me as advocating for a jobs program for the benefit of MATS grads, not safety. My guess is you're aiming for something more like "there is talent that could do useful work under someone else's direction, but not on their own, and we can increase safety by utilizing it".
- Talent leaves MATS/ARENA and sometimes struggles to find meaningful work
I'm surprised this one was included, it feels tail-wagging-the-dog to me.
Good question. My revised belief is that OpenAI will not sufficiently slow down production in order to boost safety. It may still produce theoretical safety work that is useful to others, and to itself if the changes are cheap to implement.
I do also expect many people assigned to safety to end up doing more work on capabilities, because the distinction is not always obvious and they will have so many reasons to err in the direction of agreeing with their boss's instructions.
how likely do you believe it is that OAI has a position where at least 90% of people who are both (A) qualified skill-wise (e.g., ML and interpretability experts) and (B) believe that AIXR is a serious problem would increase safety faster than capabilities in that position?
The cheap answer here is 0, because I don't think there is any position where that level of skill and belief in AIXR has a 90% chance of increasing net safety. Ability to do meaningful work in this field is rarer than that.
So the real question is how does OpenAI compare to other possibilities? To be specific, let's say being an LTFF-funded solo researcher, academia, and working at Anthropic.
Working at OpenAI seems much more likely to boost capabilities than solo research and probably academia. Some of that is because they're both less likely to do anything. But that's because they face OOM less pressure to produce anything, which is an advantage in this case. LTFF is not a pressure- or fad-free zone, but they have nothing near the leverage of paying someone millions of dollars, or providing tens of hours each week surrounded by people who are also paid millions of dollars to believe they're doing safe work.
I feel less certain about Anthropic. It doesn't have any of the terrible signs OpenAI did (like the repeated safety exoduses, the board coup, and clawbacks on employee equity), but we didn't know about most of those a year ago.
If we're talking about a generic skilled and concerned person, probably the most valuable thing they can do is support someone with good research vision. My impression is that these people are more abundant at Anthropic than OpenAI, especially after the latest exodus, but I could be wrong. This isn't a crux for me for the 80k board[1] but it is a crux for how much good could be done in the role.
Some additional bits of my model:
- I doubt OpenAI is going to tell a dedicated safetyist they're off the safety team and on direct capabilities. But the distinction is not always obvious, and employees will be very motivated to not fight OpenAI on marginal cases.
- You know those people who stand too close, so you back away, and then they move closer? Your choices in that situation are to steel yourself for an intense battle, accept the distance they want, or leave. Employers can easily pull that off at scale. They make the question become "am I sure this will never be helpful to safety?" rather than "what is the expected safety value of this research?"
- Alternate frame: How many times will an entry level engineer get to say no before he's fired?
- I have a friend who worked at OAI. They'd done all the right soul searching and concluded they were doing good alignment work. Then they quit, and a few months later were aghast at concerns they'd previously dismissed. Once you are in the situation it is very hard to maintain accurate perceptions.
- Something @Buck said made me realize I was conflating "produce useful theoretical safety work" with "improve the safety of OpenAI's products." I don't think OpenAI will stop production for safety reasons[2], but they might fund theoretical work that is useful to others, or that is cheap to follow themselves (perhaps because it boosts capabilities as well...).
This is a good point and you mentioning it updates me towards believing that you are more motivated by (1) finding out what's true regarding AIXR and (2) reducing AIXR, than something like (3) shit talking OAI.
Thank you. My internal experience is that my concerns stem from around x-risk (and belatedly the wage theft). But OpenAI has enough signs of harm and enough signs of hiding harm that I'm fine shit talking as a side effect, where normally I'd try for something more cooperative and with lines of retreat.
[1] I think the clawbacks are disqualifying on their own, even if they had no safety implications. They stole money from employees! That's one of the top 5 signs you're in a bad workplace. 80k doesn't even mention this.
[2] to ballpark quantify: I think there is <5% chance that OpenAI slows production by 20% or more in order to reduce AIXR. And I believe frontier AI companies need to be prepared to slow by more than that.
I'd define "genuine safety role" as "any qualified person will increase safety faster than capabilities in the role". I put ~0 likelihood that OAI has such a position. The best you could hope for is being a marginal support for a safety-based coup (which has already been attempted, and failed).
There's a different question of "could a strategic person advance net safety by working at OpenAI, more so than any other option?". I believe people like that exist, but they don't need 80k to tell them about OpenAI.
reposting comment from another post, with edits:
re: accumulating status in hope of future counterfactual impact.
I model status-qua-status (as opposed to status as a side effect of something real) as something like a score for "how good are you at cooperating with this particular machine?". The more you demonstrate cooperation, the more the machine will trust and reward you. But you can't leverage that into getting the machine to do something different; that would immediately zero out your status/cooperation score.
There are exceptions. If you're exceptionally strategic you might make good use of that status by e.g. changing what the machine thinks it wants, or coopting the resources and splintering. It is also pretty useful to accumulate evidence you're a generally responsible adult before you go off and do something weird. But this isn't the vibe I get from people I talk to with the 'status then impact' plan, or from any of 80k's advice. Their plans only make sense if either that status is a fungible resource like money, or if you plan on cooperating with the machine indefinitely.
So I don't think people should pursue status as a goal in and of itself, especially if there isn't a clear sign for when they'd stop and prioritize something else.
From Conor's response on EAForum, it sounds like the answer is "we trust OpenAI to tell us". In light of what we already know (safety team exodus, punitive and hidden NDAs, lack of disclosure to OpenAI's governing board), that level of trust seems completely unjustified to me.
When I did my vegan nutrition write-ups, I directed people to Examine.com's Guide to Vegan+Vegetarian Supplements. Unfortunately, it is paywalled. Fortunately, it is now possible to ask your library to buy access, so you can read that guide plus their normal supplement reviews at no cost to yourself.
Library explainer: https://examine.com/plus/public-libraries/
Ven*n guide: https://examine.com/guides/vegetarians-vegans/
How does 80k identify actual safety roles, vs. safety-washed capabilities roles?
I would say Epistemic Daddies are deferred to, for action and strategy, although sometimes with a gloss of giving object level information. But I think you're right that there's a distinction between "giving you strategy" and "telling you your current strategy is so good it's going right on the fridge", and Daddy/Mommy is a decent split for that.
re: accumulating status in hope of future counterfactual impact.
I model status-qua-status (as opposed to status as a side effect of something real) as something like a score for "how good are you at cooperating with this particular machine?". The more you demonstrate cooperation, the more the machine will trust you. But you can't leverage that into getting the machine to do something different; that would immediately zero out your status/cooperation score.
There are exceptions. If you're exceptionally strategic you might make good use of that status by e.g. changing what the machine thinks it wants, or coopting the resources and splintering. But this isn't the vibe I get from people I talk to with the 'status then impact' plan, or from any of 80k's advice. They sound like they think status is a fungible resource that can be spent anywhere, like money[1].
So unless you start with a goal and authentically backchain into a plan where a set amount of a specific form of status is a key resource, you probably shouldn't accumulate status.
I think money-then-impact plans risk being nonterminating, but are great if they are responsive and will terminate.
I also think getting a few years of normal work under your belt between college and crazy independent work can be a real asset, as long as you avoid the just-one-more-year trap.
Part of me likes the idea of making solstice higher investment. But I feel like the right balance is one high investment event and one very low investment event, and high investment is a much better fit for winter solstice.
I like that split because I see value in both high investment, high meaning things that will alienate a lot of people (because they're too much work, or the meaning doesn't resonate with them), and in shelling points for social gathering. These can't coexist, so better to have separate events specializing in each.
I'm much more likely to take existing karma into account when strong voting. For weak votes I'll just vote my opinion unless the karma total is way out from what I think is deserved. This comes up mostly with comments that are bad but not so bad I want to beat the dead horse, or that express a popular sentiment without adding much.
While writing the email to give mentioned people and orgs a chance to comment, I wasn't sure whether to BCC (more risk of going to spam) or CC (shares their email). I took a FB poll, which got responses from the class of people who might receive emails like this, but not the specific people I emailed. Of the responses, 6 said CC and one said either. I also didn't receive any objections from the people I actually emailed. So it seems like CCing is fine.
I wish this list didn't equally weight harms, groundwork for harms, and weirdness.
I'm sad this got voted down to zero (before I strong-upvoted), because I think "how can we have a good version of this discussion?" is a good question to ask. I'm not happy with how lesswrong discusses sensitive topics and would love to see those go better.
I started writing out some specific ideas, and then got overwhelmed by how much work they'd be to write and then deal with the comments. Just writing up the ideas is an afternoon project.
things I found interesting about this video:
- Brennan's mix of agency (organizing 100 person LARPs at 15, becoming creative director at LARP camp by 19), and mindless track following (thinking the goal of arts school was grades).
- He's so proactively submissive about starting community college at 14. "Oh man I was so annoying. I apologize to anyone who had to be around me back then". You can really see the childhood bullying trauma.
- This isn't conjecture, he says outright he still expects every new group he meets to put him in a trashcan.
- I imagine hearing him talk about this would be bad for a 14yo in a similar position, which is a shame because the introspection around choosing his own goals vs. having them picked for him seems really useful.
- About his recommendation of a social strategy: "Am I lying or telling the truth? I'm telling the truth to myself but you shouldn't do it".
- Frank discussion of how financial constraints affected his life.
- A happy ending where all the weird side threads from his life came together to create the best possible life for him.
1 day later, my retraction has more karma than the original humming post
3. put the spiciest posts behind a paywall, because you have something to say but don't want the entire internet freaking out about it.
This seems like a great thing to exist and you have my encouragement to write it.
Well in that case I was the one who was unnecessarily anxious so still feels like a cost, although one well worth paying to get the information faster.
I'm not quite sure what you're asking here. Do you want people interested in solving your particular problem? Solving the class of problems you're in (probably not an option)? Solving mysterious illnesses in general?
Are you wondering why there's no one to hire? No one will help you for free? Not enough research money is spent on the topic?
The phenomenon extends beyond math
Additionally, prodigies are amongst the most likely people to experience this, because they spend so much of their early life being the best in the room. Math grad students aren't comparing themselves to the 8 billion people who are worse than them at math, they're comparing themselves to each other, field leaders, and dead people who accomplished more at a given age.
I've heard that a lot of skill in poker is not when to draw or what to discard, it's knowing how much to bet on a given hand. There isn't that much you can do to improve any given hand, but folding earlier and betting more on good hands are within your control.
feels like a metaphor for something.
I don't know what you mean by "total amount" because ppm is a concentration
The spray is clearly delivering a set amount, but describing it in ppm. Since the volume and density of air inside the nose isn't changing, you can treat the change as a count rather than a concentration.
that tweet's interpretation agrees with mine.
My understanding of the tweet's model is that [actual released amount] * [8 hours] = 0.11ppm, so [released amount] = 0.11/8.
I still don't understand your number. Could you expand the equation behind "If NO is produced and reacts immediately, say in 20 seconds, this means the concentration achieved is 19.8 ppm"?
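For concreteness, here's the arithmetic I think is in play, as a sketch (assuming "0.11 ppm·hr" is a time integral; the 19.8 figure is what I get if all the NO is assumed to react within a 20-second window):

```python
# Two readings of the trial's "0.11 ppm*hr" figure (assumption: it's a
# time integral of concentration, not an instantaneous concentration).
integral_ppm_hr = 0.11

# Reading 1: spread evenly over the 8-hour dosing window.
avg_over_8h = integral_ppm_hr / 8              # ~0.014 ppm average

# Reading 2: the whole integral delivered in a ~20-second reaction burst.
reaction_window_s = 20
peak_ppm = integral_ppm_hr * 3600 / reaction_window_s   # 19.8 ppm

print(avg_over_8h, peak_ppm)
```

If your 19.8 ppm comes from a different equation than reading 2, that's the part I'd like spelled out.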
EA organizations frequently ask for people to run criticism by them ahead of time. I’ve been wary of the push for this norm. My big concerns were that orgs wouldn’t comment until a post was nearly done, and that it would take a lot of time. My recent post mentioned a lot of people and organizations, so it seemed like useful data.
I reached out to 12 email addresses, plus one person in FB DMs and one open call for information on a particular topic. This doesn’t quite match what you see in the post because some people/orgs were used more than once, and other mentions were cut. The post was in a fairly crude state when I sent it out.
Of those 14: 10 had replied by the start of next day. More than half of those replied within a few hours. I expect this was faster than usual because no one had more than a few paragraphs relevant to them or their org, but is still impressive.
It’s hard to say how sending an early draft changed things. Austin Chen joked about being anxious because their paragraph was full of TODOs (because it was positive and I hadn’t worked as hard fleshing out the positive mentions ahead of time). Turns out they were fine, but then I was worried I'd stressed them out. I could maybe have saved myself one stressful interaction if I’d realized I was going to cut an example ahead of time.
Only 80,000 Hours, Anima International, and GiveDirectly failed to respond before publication (7 days after I emailed them).
I didn’t keep as close track of changes, but at a minimum replies led to 2 examples being removed entirely, 2 clarifications and some additional information that made the post better. So overall I'm very glad I solicited comments, and found the process easier than expected.
Wait if 0.11ppm*hr is the integral, doesn't that suggest the total amount is 0.11ppm? My biologist friends have failed me but that's this twitter comment's interpretation.
on the reagent math: I believe the methylcellulose is fairly bulky (because it's sold separately as a powder to inhale), which makes the lower amount of NO more believable.
Yeah I definitely misread that ppm/hour. I'm unsure how to interpret *hrs; that seems nonsensical. I'm under a tight deadline right now but have reached out to some bio friends for help. Assuming this doesn't turn out to be a typo, I'd like to give you a bounty for catching this; can you PM me your PayPal info?
I tried the imported ketoconazole shampoo recently and it indeed worked where American OTC shampoo had failed.
This is consistent with the dose being 130µl of a dilute liquid
Can you clarify this part? The liquid is a reactive solution (and contains other ingredients) so I don't understand how you calculated it.
I agree the integral is a reasonable interpretation and appreciate you pointing it out. My guess is low frequent applications are better than infrequent high doses, but I don't know what the conversion rate is and this definitely undermines the hundred-dollar-bill case.
To go one step further, potentially any and every major decision they have played a part in needs to be reevaluated by objective third parties.
I like a lot of this post, but the sentence above seems very out of touch to me. Who are these third parties who are completely objective? Why is objective the adjective here, instead of "good judgement" or "predicted this problem at the time"?
I haven't looked into it; seems plausible it helps, but since it's a signalling molecule I'm wary of amplifying it too much.
The best known amplifier of NO in the bloodstream is viagra. My understanding is they haven't found general health effects from it, despite looking really hard and first investigating it as a treatment for heart disease.
Yeah when I was writing this, part of me kept saying "but humming is so cheap, why shouldn't everyone do it all the time?", and I had to remind myself that attention is a cost. This is despite the fact that it's not cheap for me (due to trigeminal neuralgia; I'll probably stick with Enovid myself) and attention is a limiting reagent for me. The too-cheap-to-meter argument is really seductive.
I think some poly scenarios save money (although most are accessible without poly), but poly also gives you new and exciting ways to lose it (these can be replicated without poly too, but it's harder).
If you can't afford your home without everyone's income, then your housing stability is dependent on every relationship in the house. Hope everyone is chore compatible. And agrees on whether the house allows kids. And everyone's work is near each other. And...
I've seen poly housing (and found-family shared housing) go well and save money, but mostly when at least one person had a lot of financial slack (to paper over housemate losses) and no one was so badly off they couldn't afford to leave. If someone needs the house sharing to work, issues will fester until they become toxic.
citric acid and a polymer
https://www.modestneeds.org/ will give one time cash infusions to people with capital intensive problems (like moving costs, or keeping a vehicle). I haven't looked into them in a while; a few years ago there was a requirement that the cash infusion would get recipients on a stable track, I think that might be looser now.
All of the problems you list seem harder with repeated within-person trials.
I found the gotcha: Enovid has two other mechanisms of action. Someone pointed this out to me on my previous nitric oxide post, but it didn't quite sink in till I did more reading.
Is there a lesswrong canon post for the quantified impact of different masks? I want to compare a different intervention to masks and it would be nice to use a reference that's gone through battle testing.
Yep, that's my main contender for the better formulations referred to in the intro.
I don't think the original comment was a troll, but I also don't think it was a helpful contribution on this post. OP specifically framed the post as their own experience, not a universal cure. Comments explaining why it won't work for a specific person aren't relevant.
I think that's their guess but they don't directly check here.
I also suspect that it doesn't matter very much.
- The sinuses have so much NO compared to the nose that this probably doesn't materially lower sinus concentrations.
- the power of humming goes down with each breath but is fully restored in 3 minutes, suggesting that whatever change happens in the sinuses is restored quickly
- From my limited understanding of virology and immunology, alternating intensity of NO between sinuses and nose every three minutes is probably better than keeping sinus concentrations high[1]. The first second of NO does the most damage to microbes[2], so alternation isn't that bad.
I'd love to test this. The device you linked works via the mouth, and we'd need something that works via the nose. From a quick google it does look like it's the same test, so we'd just need a nasal adaptor.
Other options:
- Nnoxx. Consumer skin device, meant for muscle measurements
- There are lots of devices for measuring concentration in the air; maybe they could be repurposed. Just breathing on one might be enough for useful relative metrics, even if they're low-precision.
I'm also going to try to talk my asthma specialist into letting me use their oral machine to test my nose under multiple circumstances, but it seems unlikely she'll go for it.
[1] obvious question: so why didn't evolution do that? The ancestral environment didn't have nearly this disease (or pollution) load. This doesn't mean I'm right, but it means I'm discounting that specific evolutionary argument.
[2] although NO is also an immune system signal molecule, so the average does matter.
Check my math: how does Enovid compare to to humming?
Nitric Oxide is an antimicrobial and immune booster. Normal nasal nitric oxide is 0.14ppm for women and 0.18ppm for men (sinus levels are 100x higher). journals.sagepub.com/doi/pdf/10.117…
Enovid is a nasal spray that produces NO. I had the damndest time quantifying Enovid, but this trial registration says 0.11ppm NO/hour. They deliver every 8h and I think that dose is amortized, so the true dose is 0.88ppm. But maybe it's more complicated. I've got an email out to the PI but am not hopeful about a response clinicaltrials.gov/study/NCT05109…
so Enovid increases nasal NO levels somewhere between 75% and 600% compared to baseline- not shabby. Except humming increases nasal NO levels by 1500-2000%. atsjournals.org/doi/pdf/10.116….
Enovid stings and humming doesn't, so it seems like Enovid should have the larger dose. But the spray doesn't contain NO itself, only compounds that react to form NO. Maybe that's where the sting comes from? Cystic fibrosis and burn patients are sometimes given stratospheric levels of NO for hours or days; if the burn from Enovid came from the NO itself, then those patients would be in agony.
I'm not finding any data on humming and respiratory infections. Google scholar gives me information on CF and COPD, @Elicit brought me a bunch of studies about honey.
With better keywords, google scholar brought me a bunch of descriptions of yogic breathing with no empirical backing.
There are some very circumstantial studies on illness in mouth breathers vs. nasal, but that design has too many confounders for me to take seriously.
Where I'm most likely wrong:
- misinterpreted the dosage in the RCT
- dosage in RCT is lower than in Enovid
- Enovid's dose per spray is 0.5ml, so pretty close to the new study. But it recommends two sprays per nostril, so real dose is 2x that. Which is still not quite as powerful as a single hum.
I curated this post because
- this is a rare productivity system post that made me consider actually implementing it. Right now I can’t because my energy levels are too variable, but if that weren’t true I would definitely be trying it.
- lots of details, on lots of levels. Things like “I fail 5% of the time” and then translating that to “therefore I price things such that if I could pay 5% of the failure fee to just have it done, I would do so.”
- Practical advice like “yes verification sometimes takes a stupid amount of time, the habit is nonetheless worth it” or “arrange things to verify the day after”