I still think that if you want to know where X is on someone's TODO list, you should ask that instead of asking for their full TODO list. This feels nearly as wrong as asking for someone's top 5 movies of the year, instead of whether or not they liked Oppenheimer (when you want to know if they liked Oppenheimer).
I don't think this level of trickery is a good idea.
If you're working with someone honest, you should ask for the info you want. On the other hand, if you're working with someone who will obfuscate when asked "Are you working on X?", I don't see a strong reason to believe that they will give better info when instead asking about their top priorities.
In regard to Waymo (and Cruise, though I know less there) in San Francisco: at the last CPUC meeting, the vote on allowing Waymo to charge for driverless service was delayed. Last I checked, Waymo operates in more areas and at more times of day in SF than Cruise does.
https://abc7news.com/sf-self-driving-cars-robotaxis-waymo-cruise/13491184/
I feel like Paul's right that the only crystal clear 'yes' is Waymo in Phoenix, and the other deployments are more debatable (due to scale and scope restrictions).
You gave the caveats, but I'm still curious to hear what companies you felt had this engineer vs manager conflict routinely about code quality. Mostly, I'd like to know so I can avoid working at those companies.
I suspect the conflict might be exacerbated at places where managers don't write code (especially if they've never written code). My managers at Google and Waymo have tended to be very supportive of code health projects. The discussion of how to trade off code debt against velocity is also very explicit. In some quarters we've gotten explicit guidance along the lines of 'We are sprinting and expect to accumulate debt' vs. 'We are slowing down to pay off tech debt'. This makes it pretty easy to tell whether a given code health project is something that company leadership wants me doing right now.
Agreed, although it feels like in that case we should be comparing 'donating to X-risk organizations' vs 'working at X-risk organizations'. I think that by default I would assume that the money vs talent trade-off is similar at global health and X-risk organizations though.
Fair point that GiveWell has updated their RFMF and increased their estimated cost per QALY.
I do think that 300K EAs doing something equivalent to eliminating the global disease burden is substantially more plausible than 66K doing so. This seems trivially true since more people can do more than fewer people. I agree that it still sounds ambitious, but saying that ~3X the people involved in the Manhattan project could eliminate the disease burden certainly sounds easier than doing the same with half the Manhattan project's workforce size.
This is getting into nits, but ruling out all arguments of the form 'this seems to imply' seems really strong? Naively, it limits me to discussing only the implications the argument's maker explicitly acknowledges. I'm probably misinterpreting you here, since that seems really silly! This is usually what I'm trying to say when I ask about implications: I note something odd to see whether the oddness is actually implied or whether I misinterpreted something.
Agreed that X-risk is very important and also hard to quantify.
I'm surprised that you think that direct work has such a high impact multiplier relative to one's normal salary. The footnote seems to suggest that you expect someone who could donate a $100K salary while earning to give could instead provide $3M in impact per year through direct work.
I think GiveWell still estimates it can save a life for ~$6K on the margin, which is ~50 QALYs.
(1 life / $6K) × (50 QALY / life) × ($3 million / EA-year) ≈ 25K QALY per EA-year
Which both seems like a very high figure and seems to imply that 66K EAs would be sufficient to do good equivalent to totally eliminating the burden of all disease (I'm ignoring decreasing marginal returns). This seems like an optimistic figure to me, unless you're very optimistic about X-risk charities being effective? I'd be curious to hear how you got to the ~3 million figure intuitively.
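To make the arithmetic behind that 66K figure explicit, here is a minimal sketch. The $6K/life, 50 QALY/life, and $3M/EA-year numbers come from the thread above; the ~1.65 billion DALY global disease burden is an assumed round number chosen for illustration (real GBD estimates vary), not a sourced figure.

```python
# Back-of-envelope check of the QALY-per-EA-year arithmetic.
cost_per_life = 6_000            # dollars, GiveWell-style marginal cost per life
qaly_per_life = 50               # QALYs gained per life saved
impact_per_ea_year = 3_000_000   # dollars of donation-equivalent impact per EA-year

lives_per_ea_year = impact_per_ea_year / cost_per_life    # 500 lives
qaly_per_ea_year = lives_per_ea_year * qaly_per_life      # 25,000 QALYs

# Assumed global disease burden, for illustration only.
global_disease_burden_daly = 1.65e9

eas_needed = global_disease_burden_daly / qaly_per_ea_year
print(qaly_per_ea_year, eas_needed)  # 25000.0 66000.0
```

Note how sensitive the headcount is to the $3M figure: at a 5-10X multiplier on a $100K salary instead of 30X, the required number of EAs grows by the same factor.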
I would guess something closer to 5-10X impact relative to industry salary, rather than a 30X impact.
Note that it might be very legally difficult to open source much of Space-X technology, due to the US classifying rockets as advanced weapons technology (because they could be used as such).
I'm not sure that contagiousness is a good reason to believe that an (in)action is particularly harmful, outside of the multiplier contagiousness creates by generating a larger total harm. It seems clear that we'd all agree that murder is much worse than visiting a restaurant with a common cold, despite the fact that the latter is a contagious harm.
Although there is a good point that the analogy breaks down: a DUI doesn't cause harm during your job (assuming you don't drive for work), whereas being unvaccinated does cause expected harm to colleagues and customers.
Perhaps too tongue in cheek, but there is a strong theoretical upper bound on R0 for humans as of ~2021. It's around 8 billion, the current world population.
I think you're correct that the difference between R0 and Rt is that Rt takes into account the proportion of the population already immune.
However, R0 is still dependent on its environment. A completely naive (uninfected) population of hermits living in caves hundreds of miles distant from one another has an R0 of 0 for nearly anything. A completely naive population of immunocompromised packed-warehouse rave attendees would probably have an R0 of 100+ for measles.
I don't know if there is another Rt-style variable that tries to capture the infectiveness of a disease given both the prevalence of immunity and the environment. It seems like most folks just assume that the environment, other than the immune proportion, is constant when comparing R0/Rt figures.
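The usual relation between the two can be sketched in a couple of lines. This assumes the common simplification R_t ≈ R0 × (susceptible fraction), i.e. that everything about the environment except immunity is held fixed; the numbers are illustrative, not epidemiological estimates.

```python
def effective_r(r0: float, immune_fraction: float) -> float:
    """R_t given a baseline R0 and the share of the population immune,
    holding all other environmental factors constant."""
    return r0 * (1.0 - immune_fraction)

# A measles-like R0 in a typical setting, with 95% of the population immune:
print(effective_r(15.0, 0.95))  # 0.75 -> below 1, so an outbreak dies out
```

The point above is exactly that `r0` itself hides the environment: the same pathogen could warrant `r0=0` for distant hermits and `r0=100` for a packed warehouse rave, and this formula only adjusts for immunity.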
A one point improvement (measured on a ten point scale) feels like a massive change to expect. I like the guts to bet that it'll happen and change your mind otherwise, but I'm curious if you actually expected that scale of change.
For me, a one point change requires super drastic measures (ex. getting 2+ hours too little sleep for 3+ days straight). Although I may well be arbitrarily compressing too much of my life into the 6-9 range on the ten point scale.
One of GiveDirectly's blog posts on survey and focus group results, by the way.
https://www.givedirectly.org/what-its-like-to-receive-a-basic-income/
Fair points!
I don't know if I'd consider JPAL directly EA, but they at least claim to conduct regular qualitative fieldwork before/after/during their formal interventions (source from Poor Economics, I've sadly forgotten the exact point but they mention it several times). Similarly, GiveDirectly regularly meets with program participants for both structured polls and unstructured focus groups if I recall correctly. Regardless, I agree with the concrete point that this is an important thing to do and EA/rationality folks are less inclined to collect unstructured qualitative feedback than its importance deserves.
I found it immensely refreshing to see valid criticisms of EA. I very much related to the note that many criticisms of EA come off as vague or misinformed. I really appreciated that this post called out specific instances of what you saw as significant issues, and also engaged with the areas where particular EA aligned groups have already taken steps to address the criticisms you mention.
I think I disagree on the degree to which EA folks expect results to be universal and generalizable (this is in response to your note at the end of point 3). As a concrete example, I think GiveWell/EAers in general would be unlikely to agree that GiveDirectly style charity would have similarly sized benefits in the developed world (even if scaled to equivalent size given normal incomes in the nation) without RCTs suggesting as much. I expect that the evidence from other nations would be taken into account, but the consensus would be that experiments should be conducted before immediately concluding large benefits would result.
Picking a Schelling point is hard. Since the post focused on very recent results, I thought that a one year time horizon was an obvious line. Vanguard does note that the performance numbers I quoted are time weighted averages.
You are of course correct that over the long run you should expect closer to 5-8% returns from the stock market at large.
I currently have a roughly 50/50 split between VTIAX and VTSAX. I would of course not expect to continue to get 30% returns moving forward (I expect 5% return after inflation), but that is the figure I got when I selected a one year time horizon for showing my return on Vanguard.com.
If I instead compute from 01/2020 to 01/2021, I had a roughly 18% rate of return. I don't know how your Wealthfront is setup, but I'll note that I have a relatively aggressive split of 100% stock and nothing in bonds.
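For clarity on what I mean by these figures, here is the point-to-point calculation in sketch form. The dollar amounts are made up for illustration, not my actual balances, and the time-weighted version is my understanding of what Vanguard's 'time weighted' label does (chaining sub-period returns to strip out the effect of deposits and withdrawals).

```python
def simple_return(start_value: float, end_value: float) -> float:
    """Point-to-point return over one period, ignoring cash flows."""
    return end_value / start_value - 1.0

def time_weighted_return(period_returns: list[float]) -> float:
    """Chain sub-period returns, so deposits/withdrawals between
    periods don't distort the figure."""
    growth = 1.0
    for r in period_returns:
        growth *= 1.0 + r
    return growth - 1.0

print(f"{simple_return(100_000, 118_000):.1%}")        # 18.0%
print(f"{time_weighted_return([0.10, 0.10]):.1%}")     # 21.0%
```

The difference matters exactly for the comparison in this thread: if either of us added money mid-year, the naive point-to-point number would overstate or understate the strategy's actual performance.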
Thanks for the reply! I find those numbers more persuasive than anything else. Well done!
You're claiming you've been correctly noticing good investment opportunities over a several month period. What has been your effective return over the last year (real return on all actual investments, not hypothetical)?
I feel like the strongest way to address the "If you are so smart why aren't you rich?" question is to show that you are in fact rich.
My Vanguard has gotten a 30.4% return over the last year. I have a very simple everything-in-the-basic-large-funds strategy (I can share the exact mix if it's relevant). Your advice is substantially harder to execute than this, so it would be great to know the actual relative return.
knot shirt, force under door, pull
twist knob back and forth for hours, slowly wearing through a hole
kick down door
rip out teeth, use to scratch through door
text owner
call police
call friends to open door
hire locksmith
windows
air duct
pry away floorboards
give up concern with the door
die
smash phone, cut through door
hack through electronic door with phone
order pizza to door
post criminal pictures from phone, include locations (SWAT self)
break apart phone, use wires and battery to burn hinge
break phone, use screen protector as 'credit card trick'
rope ladder using clothes
feed shredded clothes through lock as a long thread, yank door knob out of door
break through wall with kicks
scream for help
fashion battery into key using teeth
add water to door hinges periodically, wait for rust
log back on to video game from respawn point
wait for release from good behavior
scratch through wood, waiting for nails to grow back
purchase building, order demolition
Dig out (use tooth to dig out small section of floor, then small section of floor can be used to dig further)
use teeth to scratch through hinges
use the internet to escape into video games
pee repetitively on the wood, weakening it to break through
wait, there is a decent probability the property will change hands over the next 10 years and be inspected
use teeth to dig around hinges to collapse door
go out the door that is not locked
unlock the latch
enter the key code
declare 'I'm in the room, so it's not empty' and leave via the logical contradiction
program as a job via the phone, acquire sufficient money to buy the property
make a lash out of untied clothing and teeth, use it to wear through doors
plant sufficient incriminatory evidence to get a drone strike called just nearby, leave through wreckage
remotely pilot drone purchased on amazon and wired up via task rabbit to unlock door
hire task rabbit to unlock the door
start fire using battery and wetted cloth threads near door, triggering fire alarm and emergency door unlock
climb out via punching through ceiling
spread reports that building is condemned, hire contractor to demolish building, escape via wreckage
launch gofundme for ransom to kidnappers
persuade kidnappers I am on their side, and should be released
step over the door (it is a locked doggie door)
These questions are way too 'Eureka!'/trivia for my taste. The first relies on language specifics and is really much more a 'do you know the moderately obscure sorting algorithms?' question than an actual algorithms question. The second involves an oddly dilution-resistant test. The third, again, seems to test moderately well known crypto trivia.
I've conducted ~60 interviews for Google and Waymo. To be clear, when I or most other interviewers I've seen say that a question covers 'algorithms' it means that it covers algorithm design. Ex. how do you find all pairs in an array whose sum is a given value? Such a question can be answered in several different ways, and the 'best' way involves using some moderately interesting data structures.
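To show what I mean by 'several different ways', here is one reasonable answer to that example question: a single hash-set pass gives O(n) time versus the O(n²) brute force. This is my own sketch of a strong answer, not any official rubric solution.

```python
def pairs_with_sum(nums: list[int], target: int) -> set[tuple[int, int]]:
    """Return all distinct value pairs in nums that sum to target.

    One pass with a hash set: for each element, check whether its
    complement has already been seen. O(n) time, O(n) extra space.
    """
    seen: set[int] = set()
    pairs: set[tuple[int, int]] = set()
    for x in nums:
        complement = target - x
        if complement in seen:
            # Store in sorted order so (3, 4) and (4, 3) dedupe.
            pairs.add((min(x, complement), max(x, complement)))
        seen.add(x)
    return pairs

print(pairs_with_sum([1, 4, 3, 6, 2, 5], 7))  # {(1, 6), (3, 4), (2, 5)}
```

Weaker but still valid answers include sorting plus two pointers (O(n log n), O(1) extra space) or brute-force nested loops; part of what the question probes is whether the candidate can name and compare these trade-offs.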
I'm super curious what kind of rubric you use in grading these questions.