Posts

Alex Ray's Shortform 2020-11-08T20:37:18.327Z

Comments

Comment by alex-ray on Alex Ray's Shortform · 2020-11-28T03:52:30.881Z · LW · GW

1. What am I missing from church?

(Or, in general, by lacking a religious/spiritual practice I share with others)

For the past few months I've been thinking about this question.

I haven't regularly attended church in over ten years.  Given how prevalent it is as part of human existence, and how much I have changed in a decade, it seems like "trying it out" or experimenting is at least somewhat warranted.

I predict that there is a church in my city that is culturally compatible with me.

Compatible means a lot of things, but mostly means that I'm better off with them than without them, and they're better off with me than without me.

Unpacking that probably will get into a bunch of specifics about beliefs, epistemics, and related topics -- which seem pretty germane to rationality.

2. John Vervaeke's Awakening from the Meaning Crisis is bizarrely excellent.

I don't quite have handles for everything it is, or exactly why I like it so much, but I'll try to do it some justice.

It feels like rationality / cognitive tech, in that it cuts at the root of how we think and how we think about how we think.

(I'm less than 20% through the series, but I expect it continues in the way it has been going.)

Maybe it's partially his speaking style, and partially the topics and discussion, but it reminded me strongly of sermons from childhood.

In particular: they have a timeless quality to them.  By "timeless" I mean I think I would take away different learnings from them if I saw them at different points in my life.

In my work & research (and communicating this) -- I've largely strived to be clear and concise.  Designing for layered meaning seems antithetical to clarity.

However I think this "timelessness" is a missing nutrient to me, and has me interested in seeking it out elsewhere.

For the time being I at least have a bunch more lectures in the series to go!

Comment by alex-ray on Alex Ray's Shortform · 2020-11-25T03:37:53.035Z · LW · GW

I don't know if he used that phrasing, but he's definitely talked about the risks (and advantages) posed by singletons.

Comment by alex-ray on Alex Ray's Shortform · 2020-11-22T18:34:47.117Z · LW · GW

Thinking more about the singleton risk / global stable totalitarian government risk from Bostrom's Superintelligence, human factors, and theory of the firm.

Human factors represent human capacities or limits that are unlikely to change in the short term.  For example, the number of people one can "know" (for some definition of that term), limits to long-term and working memory, etc.

Theory of the firm tries to answer "why are economies markets but businesses autocracies" and related questions.  I'm interested in the subquestion of "what factors give the upper bound on coordination for a single business", related to "how big can a business be".

I think this is related to "how big can an autocracy (robustly/stably) be", which is how it relates to the singleton risk.

Some thoughts this produces for me:

  • Communication and coordination technologies (telephones, email, etc) that increase the upper bound of coordination for businesses ALSO increase the upper bound on coordination for autocracies/singletons
  • My belief is that the current max size (in people) of a singleton is much lower than the current global population
  • This weakly suggests that a large global population is a good preventative for a singleton
  • I don't think this means we can "war of the cradle" our way out of singleton risk, given how fast tech moves and how slow population moves
  • I think this does mean that any non-extinction event that dramatically reduces population also dramatically increases singleton risk
  • I think that it's possible to get a long-term government aligned with the values of the governed, and "singleton risk" is the risk of an unaligned global government

So I think I'd be interested in tracking two "competing" technologies (for a hand-wavy definition of the term)

  1. communication and coordination technologies -- tools which increase the maximum effective size of coordination
  2. soft/human alignment technologies -- tools which increase alignment between government and governed
Comment by alex-ray on The tech left behind · 2020-11-18T07:20:08.884Z · LW · GW

+1 Plan 9.

I think it (weirdly) occupies a strange place with respect to being "forgotten", in that pieces of it keep getting rediscovered (sometimes multiple times).

I got to work w/ some of the Plan 9 folks, and they would point out (with citations) when the ideas behind highly regarded OSDI papers had already been built (and published) in Plan 9, sometimes 10-20 years prior.

One form of this "forgotten" tech is tech that we keep forgetting and rediscovering, but:

  1. maybe this isn't the type of forget the original question is about, and
  2. possibly academia itself incentivizes this (if a good idea can be re-used instead of yielding only one paper, that's good for grad students / labs that need publications)
Comment by alex-ray on Alex Ray's Shortform · 2020-11-18T01:09:56.929Z · LW · GW

Future City Idea: an interface for safe AI-control of traffic lights

We want a traffic light that:
* Can function autonomously if there is no network connection
* Meets some minimum timing guidelines (for example, green in a particular direction no less than 15 seconds and no more than 30 seconds, etc)
* Has a secure interface to communicate with city-central control
* Has sensors that allow some feedback for measuring traffic efficiency or throughput

This gives constraints, and I bet an AI system could be trained to optimize efficiency or throughput within the constraints.  Additionally, you can narrow the constraints (for example, only choosing 15 or 16 seconds for green) and slowly widen them in order to change flows slowly.
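
As a very rough illustration (this is not any real traffic-control API -- the function names, the 15-30 second band, and the "rollout fraction" scheme are all assumptions of mine), the constraint logic might look something like this: the light falls back to a fixed cycle when the network is down, and any AI-proposed timing gets clipped to a band the city can widen gradually.

```python
from typing import Optional, Tuple

HARD_MIN, HARD_MAX = 15.0, 30.0  # absolute green-time limits, in seconds


def clamp(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


def fallback_green_seconds() -> float:
    """Autonomous behavior when the connection to city-central control is down."""
    return 20.0


def allowed_band(rollout_fraction: float) -> Tuple[float, float]:
    """Start with a narrow band (roughly 15-16 s) and widen it toward the hard limits."""
    rollout_fraction = clamp(rollout_fraction, 0.0, 1.0)
    high = HARD_MIN + 1.0 + rollout_fraction * (HARD_MAX - HARD_MIN - 1.0)
    return HARD_MIN, high


def green_seconds(proposed: Optional[float], rollout_fraction: float) -> float:
    """Clip whatever the optimizer proposes; ignore it entirely if it's missing."""
    if proposed is None:
        return fallback_green_seconds()
    low, high = allowed_band(rollout_fraction)
    return clamp(proposed, low, high)


print(green_seconds(28.0, rollout_fraction=0.1))  # early rollout: clipped to 17.4 s
print(green_seconds(28.0, rollout_fraction=1.0))  # fully rolled out: 28.0 s allowed
print(green_seconds(None, rollout_fraction=1.0))  # no controller input: fixed fallback
```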

This is the sort of thing Hash would be great for, simulation-wise.  There are probably dedicated traffic simulators as well.

At something like a quarter million dollars a traffic light, I think there's an opportunity here for a startup.

(I don't know Matt Gentzel's LW handle but credit for inspiration to him)

Comment by alex-ray on The tech left behind · 2020-11-18T00:52:29.817Z · LW · GW

I think commercial applications of nuclear fission sources are another good example.

Through the 1940s, there were lots of industrial processes and commercial products which used nuclear fission or nuclear materials in some way.  Beta sources are good supplies of high-energy electrons (used in a bunch of polymer processes, among other things), and alpha sources are good supplies of positively charged nuclei (used in electrostatic discharge and some sensing applications).

I think one of the big turning points was the Atomic Energy Act, in the US, though international agreements might also be important factors here.

The world seems to have collectively agreed that nuclear risks are high, and we seem to have chosen to restrict proliferation (by regulating production and sale of nuclear materials) -- and as a side effect have "forgotten" the consumer nuclear technology industry.

I am interested in this because it's also an example where we seem to have collectively chosen to stifle/prevent innovation in an area of technology to reduce downside risk (dirty bombs and other nuclear attacks).

Comment by alex-ray on The tech left behind · 2020-11-18T00:29:08.380Z · LW · GW

I think Google Wave/Apache Wave is a good candidate here, at least for the crowd familiar with it.

Designed to be a new modality of digital communication, it combined features of email, messengers/chat, collaborative document editing, etc.

It got a ton of excitement from a niche crowd while it was in a closed beta.

It never got off the ground, though, and less than a year after finishing the beta, it was slowly wound down and eventually handed over to Apache.

Comment by alex-ray on How to get the benefits of moving without moving (babble) · 2020-11-15T01:25:28.971Z · LW · GW

I really appreciate how this and the previous posts do a lot to describe and frame the problems that moving would solve, such that it's possible to make progress on them.

I think it's harder to clearly frame the problem (or clearly break a big/vague problem into concrete subproblems).

Anyways, some babbling:

  • Home
    • Redecorate or remodel
    • Arrange a furniture swap with friends and neighbors
    • Marie Kondo your stuff
    • Give friends a virtual tour of your space and ask what they would change
    • Ask your friends for virtual tours of their spaces for ideas
    • Have a garage sale / get rid of a bunch of stuff
    • Buy land
    • Buy a house
    • Buy a condo
  • Work
    • Ask for a raise
    • Ask for a promotion
    • Interview at other companies
    • Get career coaching advice (80k / EA folks are pretty practiced at this!)
    • Give career coaching advice (showing up, as a professional in my 30s, to EA / early career events has been more fulfilling than I thought it would be)
    • Go to conferences in your field
    • Go to conferences in the field you want to be in (strongly recommend this)
    • Join vocational groups (meetups, etc) to connect with folks in your field
    • Give presentations about your job/work at vocational groups
    • Try moving around weekend days (Wednesday/Sunday?)
    • Try working from home (okay this would probably have been more useful pre-pandemic)
    • Change up your commute (bike or walk or public transit or drive or motorcycle)
    • Join a carpool with people you think are cool to be in cars with
    • Start a carpool with people you think are cool to be in cars with
    • Take an online course (bonus: get your work to pay for it)
    • Get a professional degree (online or night school or whatever)
  • Relationships (mostly not romantic)
    • Make it easy for other people to schedule 1:1s with you (calendly / etc)
    • Reach out to possible mentors you would want to have
    • Reach out to possible mentees (many people I know could be great mentors but are too uncertain to take the first step towards mentorship themselves)
    • Retire a mentor of yours that you're no longer getting lots of value out of (but maybe still be friends?)
    • Start a book club
    • Start a paper reading group
    • Start a running group or gym group
    • Host more parties (post-vaccine)
    • Go to more parties (post-vaccine)
    • Join a monastery / religion / church / etc (not for everyone)
    • Start a religion / etc (I think the anti-digital movement has room for a neo-luddite group to coalesce, but thats a separate post)
  • Other
    • Find a counselor or therapist
    • Turn off your internet after dark
    • Go on a 10 day silent retreat
    • Spend a month in an RV
    • Travel much more often (for me this could be "travel at least 3 days every month" but for others could vary)
Comment by alex-ray on Alex Ray's Shortform · 2020-11-15T00:25:51.543Z · LW · GW

Looking at this more, I think my uncertainty is resolving towards "No".

Some things:
- It's hard to bet against the bonds themselves, since we're unlikely to hold them as individuals
- It's hard to make money on the "this will experience a sharp decline at an uncertain point in the future" kind of prediction (much easier to do this for the "will go up in price" version, which is just buying/long)
- It's not clear anyone was able to time this properly for Detroit, which is the closest analog in many ways
- Precise timing would be difficult, much more so while being far away from the state

I'll continue to track this just because of my family in the state, though.

Point of data: it was 3 years between Detroit bonds hitting "junk" status, and the city going bankrupt (in the legal filing sense), which is useful for me for intuitions as to the speed of these.

Comment by alex-ray on Alex Ray's Shortform · 2020-11-08T20:37:18.807Z · LW · GW

Can LessWrong pull another "crypto" with Illinois?

I have been following the issue with the US state Illinois' debt with growing horror.

Their bond status has been heavily degraded -- most states' bonds are "high quality" with the ratings agencies (Moody's, Standard & Poor's, Fitch), and Illinois is "low quality".  If they get downgraded further, their bonds become "junk", and they lose access to a bunch of the institutional buyers that would otherwise continue lending.

COVID has increased many states' costs, for reasons I can go into later, so it seems reasonable to think we're much closer to a tipping point than we were last year.

As much as I would like to work to make the situation better I don't know what to do.  In the meantime I'm left thinking about how to "bet my beliefs" and how one could stake a position against Illinois.

Separately, I want to look more into EU debt / restructuring / etc., as it's probably a good historical example of how this could go.  Additionally, the largest entity to previously go bankrupt in the USA was the city of Detroit, which is probably another good example to learn from.

Comment by alex-ray on Nuclear war is unlikely to cause human extinction · 2020-11-07T08:29:28.295Z · LW · GW

Thanks for writing this up!  I think having more well-researched and well-written dives into things like this is great.

A bunch of scattered thoughts and replies to these:

Overall I agree with the central idea (Nuclear War is unlikely to cause human extinction), but I disagree enough with the reasoning to want to hammer it into better shape.

Writing this, I feel like I'm in an "editing/constructive feedback" mood, but I'd welcome you to throw it all out if it's not what you're going for.  To the feedback!

This seems to only consider currently known nuclear weapons arsenals.  It seems worth including probabilities that different kinds of weapons are built before such a war -- in particular, longer-lived varieties of bombs (e.g. salted bombs such as cobalt bombs).

I think I want to separate "kill everyone with acute radiation right away" and "kill everyone with radiation in all of the food/water", and the latter seems less addressed by the energy/half life argument.  I think the weapon design space here is pretty huge, so ruling these out seems hard to me.  (Though I do think if we're assigning probabilities, they should get lower probabilities than conventional weapons)

In general I would prefer approximate numbers or at least likelihood ratios for what you think this evidence balances out to, and what odds you would put on different outcomes.

(For example: "what is the likelihood ratio of the 3.C evidence that nuclear war planners are familiar with ideas like nuclear winter" -- I don't think these are strictly required, but they really help me contextualize and integrate this information)

In particular, Toby Ord gives a bunch of excellent quantitative analysis of X-risks, including nuclear war risk, in The Precipice.

(In fact, if your main point of the post was to present a different model from that one, adding numbers would greatly help in comparing and contrasting and integrating them)

Finally, I think my mental models for case 3) are basically the same as for any event that is a significant change to the biosphere -- and it seems reasoning about this gets harder given your premise.

A hypothetical: if there are 3 major climate events in the next 100 years (of which one is a bellicose nuclear exchange), and humanity goes extinct due to climate related symptoms, does the nuclear war "cause" the human extinction in a way you're trying to capture?

Maybe what I want is for the premise to be more precise: define a time limit (extinct within X years) and maybe spell out what it means to "cause" (for example, it seems like this suggests that an economic collapse triggered by a nuclear war, which triggers other things that eventually lead to extinction, is not as clearly "caused by nuclear war")

Also maybe define a bit what "full-scale" means?  I assume that it means total war (as opposed to limited war), but good to clear up in any case.

That's all that came to mind for now.  Thanks again for sharing!~

Comment by alex-ray on Location Discussion Takeaways · 2020-11-05T06:21:33.401Z · LW · GW

This is probably better put in another post, but I think I agree with your read of the situation and recommendations, and want to follow it with the (to me) logical next step: "how to get some of the good things we could get by moving, without moving"

I like this post because it does talk about a bunch of things that could be gained (a sense of safety, isolation from political unrest, etc).

It seems not-easy and also not-impossible to brainstorm ways of dealing with this as a community in a sane and cost-effective way.

One way this could go:  (sketch of a vision)

Right now SF and most cities have a bunch of "earthquake disaster preparation" advice.  Things you should have on hand, ready to go, plans you should have made ahead of time, people to contact in case communications go down, things to do to prepare your home structure, attach your furniture to the walls, etc.

We could make some community version of that, pointed directly at the things we want to point at.

Comment by alex-ray on Where do (did?) stable, cooperative institutions come from? · 2020-11-04T19:49:25.785Z · LW · GW

Sharing an idea that came to mind while reading it, low confidence.

Maybe "forming great cultures" is really just the upper tail of "forming cultures" -- the more cultures we form, the more great cultures we get.

In this case the interesting thing is tracking how many cultures we form, and what factors control this rate.


I think over the timescales described, humans haven't really gotten much more interesting to other humans.  Humans are pretty great, hanging out together is (to many) more fun and exciting than hanging out alone.

A difference could be that the alternatives have been getting more and more interesting -- wandering in the woods is pretty but also boring.  Walking around town might be less boring.  Reading a book less boring still.  Listening to music is better for some.  The internet has created a whole lot of less-boring activities.  Somewhere in there we crossed a threshold for "forming cultures" becoming less and less interesting.

This is basically the idea "we form cultures when we get bored, and we're less bored".

But let's say I personally find the idea of starting a culture exciting, does this still affect me?

I think 'yes', because the people I try to recruit for participating in my culture will also have to choose between joining me and the BATNA.

Things that would update me against this: models that show "starting a culture" continues to be exciting, models that show "hedonic setpoint" reasons for the 'BATNA gets better' idea to be broken, evidence that more cultures are being started now than ever before.

Of all the things here I think the idea I'm most interested in inspecting is "formation of great cultures tracks formation of cultures in general".

Comment by alex-ray on Where do (did?) stable, cooperative institutions come from? · 2020-11-04T19:33:06.209Z · LW · GW

I think the point about people Goodhart-ing the things seen as greatness makes sense.  These incentives would have been around for a long time and don't predict recent changes, though.

One thing that is different now is that more of the words/sentences/pictures/ideas in my interactions are mediated by some form of manipulable media (websites, podcasts, radio, television, etc) rather than by flesh-and-blood humans.  Here I'm trying to capture something like the amount of beliefs, knowledge, and ideas moved, rather than amount of time or attention.

So this predicts things like 'it's easier to form institutional cultures when there is more human-human interaction', which would point to a decline in recent decades, but would also predict significant events at past points in history.

Radio, television, internet, etc would probably be interesting points to study.

The recent pandemic is then interesting, because this would predict that in places that shut down for the pandemic, it became acutely more difficult to build/maintain the cultures of great institutions, because human-human interaction was sharply curtailed.

Comment by alex-ray on Where do (did?) stable, cooperative institutions come from? · 2020-11-04T19:23:59.310Z · LW · GW

Maybe not the right place, but my understanding is that Robert Gordon's hypothesis is very different from the others.

The common view among these folks is that our expectation is for growth, and with this expectation come plans/strategies/policies which are breaking down as our growth has been slower than expected (at this point for decades).

(I think I know more about this one) Gordon's view is that stagnation is because our growth has come from discovering, scaling, and rolling out a sequence of "once only" inventions.  We can only disseminate germ theory once, we can only add women to the workforce once, we can only widely deploy indoor plumbing once, etc.  This means the expectation is that as we get all the easy improvements, growth will slow down.  Gordon's view is importantly independent of culture, and makes similar predictions for the US, UK, JP, SK, CN, FR, DE (which they'll arrive at on different dates given the convergence model, but eventually all trend the same).  Gordon's prediction is that we're just now in a world where we're stuck at ~1% TFP growth.

(I know some about this) Cowen's hypothesis in The Great Stagnation is similar to Gordon's, but he seems to argue that the stagnation he's describing is 1) more specific to American culture and 2) reversible, in that he predicts that, given some policy changes, we can get back to the higher growth of earlier decades.  I don't know how much Cowen's thinking has changed since publishing that book.

(I know less about this) E. Weinstein's hypothesis is that there's something in the cultural zeitgeist that is causing the stagnation.  I am interested in learning more about this take, and would appreciate references.

Comment by alex-ray on What is the current bottleneck on genetic engineering of human embryos for improved IQ · 2020-10-23T04:17:34.098Z · LW · GW

I don't really have much but this is at least from last year:

Steve Hsu discusses Human Genetic Engineering and CRISPR babies in this (the first?) episode of the podcast he has w/ Corey Washington https://manifoldlearning.com/podcast-episode-1-crispr-babies/?utm_source=rss&utm_medium=rss&utm_campaign=podcast-episode-1-crispr-babies

Transcript: https://manifoldlearning.com/episode-001-transcript/

Comment by alex-ray on Reviews of TV show NeXt (about AI safety) · 2020-10-11T17:17:08.948Z · LW · GW

In case other folks would be interested, here is the trailer on youtube: https://www.youtube.com/watch?v=micrLvzThs8&feature=emb_title

Not obvious from the review (to me): it's a fictional drama about a conflict between humans and a rogue AI.

Comment by alex-ray on Forecasting Thread: AI Timelines · 2020-08-25T15:47:06.990Z · LW · GW

It might be useful for every person responding to attempt to define precisely what they mean by human-level AGI.

Comment by alex-ray on What's a Decomposable Alignment Topic? · 2020-08-25T07:10:54.142Z · LW · GW

I work at OpenAI on safety.  In the past it seems like there's been a gap between what I'd consider to be alignment topics that need to be worked on, and the general consensus of this forum.  A good friend poked me to write something for this, so here I am.

Topics w/ strategies/breakdown:

  • Fine-tuning GPT-2 from human preferences, to solve small scale alignment issues
    • Brainstorm small/simple alignment failures: ways that existing generative language models are not aligned with human values
    • Design some evaluations or metrics for measuring a specific alignment failure (which lets you measure whether you’ve improved a model or not)
    • Gather human feedback data / labels / whatever you think you can try training on
    • Try training on your data (there are tutorials on how to use Google Colab to fine-tune GPT-2 with a new dataset)
    • Forecast scaling laws: figure out how performance on your evaluation or metric varies with the amount of human input data; compare to how much time it takes to generate each labelled example (be quantitative! -- a toy curve-fit sketch follows this list)
  • Multi-objective reinforcement learning — instead of optimizing a single objective, optimize multiple objectives together (and some of the objectives can be constraints)
    • What are ways we can break down existing AI alignment failures in RL-like settings into multi-objective problems, where some of the objectives are safety objectives and some are goal/task objectives
    • How can we design safety objectives such that they can transfer across a wide variety of systems, machines, situations, environments, etc?
    • How can we measure and evaluate our safety objectives, and what should we expect to observe during training/deployment?
    • How can we incentivize individual development and sharing of safety objectives
    • How can we augment RL methods to allow transferrable safety objectives between domains (e.g., if using actor critic methods, how to integrate a separate critic for each safety objective)
    • What are good benchmark environments or scenarios for multi-objective RL with safety objectives (classic RL environments like Go or Chess aren’t natively well-suited to these topics)
  • Forecasting the Economics of AGI (turn ‘fast/slow/big/etc’ into real numbers with units)
    • This is more “AI Impacts” style work than you might be asking for, but I think it’s particularly well-suited for clever folks that can look things up on the internet.
    • Identify vague terms in AI alignment forecasts, like the “fast” in “fast takeoff”, that can be operationalized
    • Come up with units that measure the quantity in question, and procedures for measurements that result in those units
    • Try applying traditional economics growth models, such as experience curves, to AI development, and see how well you can get things to fit (much harder to do this for AI than making cars — is a single unit a single model trained? Maybe a single week of a researchers time? Is the cost decreasing in dollars or flops or person-hours or something else? Etc etc)
    • Sketch models for systems (here the system is the whole ai field) with feedback loops, and inspect/explore parts of the system which might respond most to different variables (additional attention, new people, dollars, hours, public discourse, philanthropic capital, etc)
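
On the scaling-law bullet above: here is a toy sketch of the curve-fitting step.  Everything in it is an invented placeholder (the label counts, failure rates, and the power-law-plus-floor form are my assumptions, not real results); the point is just that fitting and extrapolating the curve is a few lines of work.

```python
# Toy sketch: fit "alignment-failure rate vs. number of human labels" to a power
# law so you can extrapolate how many labels a target failure rate would cost.
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical measurements: (num human labels, failure rate on your metric).
labels = np.array([100, 300, 1_000, 3_000, 10_000], dtype=float)
failure_rate = np.array([0.42, 0.31, 0.22, 0.16, 0.11])

def power_law(n, a, b, c):
    # failure rate ~ a * n^-b + c, where c is the floor you never get below
    return a * np.power(n, -b) + c

params, _ = curve_fit(power_law, labels, failure_rate, p0=[1.0, 0.3, 0.05])
a, b, c = params
print(f"fit: failure_rate ~= {a:.2f} * n^-{b:.2f} + {c:.3f}")
print("predicted failure rate at 100k labels:", power_law(100_000, *params))
```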

Topics not important enough to make it into my first 30 minutes of writing:

  • Cross disciplinary integration with other safety fields, what will and won’t work
  • Systems safety for organizations building AGI
  • Safety acceleration loops — how/where can good safety research make us better and faster at doing safety research
  • Cataloguing alignment failures in the wild, and creating a taxonomy of them

Anti topics: Things I would have put on here a year ago

  • Too late for me to keep writing so saving this for another time I guess

I’m available tomorrow to chat about these w/ the group. Happy to talk then (or later, in replies here) about any of these if folks want me to expand further.

Comment by alex-ray on What problem would you like to see Reinforcement Learning applied to? · 2020-07-17T01:34:16.992Z · LW · GW

I'm surprised this hasn't got more comments. Julian, I've been incredibly impressed by your work in RL so far, and I'm super excited to see what you end up working on next.

I hope folks will forgive me just putting down some opinions about what problems in RL to work on:

I think I want us (RL, as a field) to move past games -- board games, video games, etc -- and into more real-world problems.

Where to go looking for problems?

These are much harder to make tractable! Most of the unsolved problems are very hard. I like referencing the NAE's Engineering Grand Challenges and the UN's Sustainable/Millennium Development Goals when I want to think about global challenges. Each one is much bigger than a research project, but I find them "food for thought" when I think about problems to work on.

What characteristics probably make for good problems for deep RL?

1. Outside of human factors -- either too big for humans, or too small, or too fast, or too precise, etc.

2. Episodic/resettable -- has some sort of short periodicity, giving bounds on long-term credit assignment

3. Already connected to computers -- solving a task with RL in a domain that isn't already hooked up to software/sensors/computers is going to be 99% setup and 1% RL

4. Supervised/Unsupervised failed -- I think in general it makes sense only to try RL after we've tried the simpler methods and they've failed to work (perhaps the data is too scarce, or the labels too weak/noisy)

What are candidate problem domains?

Robotics is usually the first thing people say, so best just get it out of the way first. I think this is exactly right, but I think the robots we have access to today are terrible, so this turns into mostly a robot design problem with a comparatively smaller ML problem on top. (After working with robots & RL for years I have hours of thoughts on this, but I'm saving that for another time)

Automatic control systems are underrated as a domain. Many problems involving manufacturing with machines involve all sorts of small/strange adjustments to things like "feed rate", "rotor speed", "head pressure", etc. Often these are tuned/adjusted by people who build up intuition over time, then transfer that intuition to other humans. I expect it would be possible for RL to learn how to "play" these machines better and faster than any human. (Machines include: CNC machines, chemical processing steps, textile manufacture machines, etc etc etc)

Language models have been very exciting to me lately, and I really like this approach to RL with language models: https://openai.com/blog/fine-tuning-gpt-2/ I think the large language models are a really great substrate to work with (so far much better than robots!) but specializing them to particular purposes remains difficult. I think having much better RL science here would be really great.

Some 'basic research' topics

Fundamental research into RL scaling. It seems to me that we still don't really have a great understanding of the science of RL. Compared to scaling laws in other domains, RL is hard to predict, and has a much less well understood set of scaling laws (model size, batch size, etc etc). https://arxiv.org/abs/2001.08361 is a great example of the sort of thing I'd like to have for RL.

Multi-objective RL. In general if you ask RL people about multi-objective, you'll get a "why don't you just combine them into a single objective" or "just use one as an aux goal", but it's much more complex than that in the deep RL case, where the objective changes the exploration distribution. I think having multiple objectives is a much more natural way of expressing what we want systems to do. I'm very excited about at some point having transferrable objectives (since there are many things we want many systems to do, like "don't run into the human" and "don't knock over the shelf", in addition to whatever specific goal).
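
To make the "more complex than just combining them" point a bit more concrete, here is a toy sketch of one standard way to treat a safety objective as a constraint: a separate critic per objective, plus a Lagrange-style weight that grows while the constraint is violated. All the shapes, data, and hyperparameters below are invented stand-ins (there's no real environment or policy-gradient machinery here), and this is an illustration of the general idea rather than a claim about how any particular system does it.

```python
# Toy sketch: task critic + separate safety critic, with a dual-ascent weight.
import torch
import torch.nn as nn
import torch.nn.functional as F

obs_dim, act_dim = 8, 4
policy = nn.Sequential(nn.Linear(obs_dim, 32), nn.Tanh(), nn.Linear(32, act_dim))
task_critic = nn.Sequential(nn.Linear(obs_dim, 32), nn.Tanh(), nn.Linear(32, 1))
safety_critic = nn.Sequential(nn.Linear(obs_dim, 32), nn.Tanh(), nn.Linear(32, 1))

safety_budget = 0.1   # max acceptable safety cost per step (made-up number)
lagrange_lr = 0.05    # how fast the safety weight adapts
lam = 0.5             # current weight on the safety objective

# Stand-ins for one batch of rollout data from the current policy.
obs = torch.randn(64, obs_dim)
actions = torch.randint(0, act_dim, (64,))
task_returns = torch.randn(64)
safety_costs = torch.rand(64)

# Each critic regresses its own signal.
task_value = task_critic(obs).squeeze(-1)
safety_value = safety_critic(obs).squeeze(-1)
critic_loss = ((task_value - task_returns) ** 2).mean() + \
              ((safety_value - safety_costs) ** 2).mean()

# REINFORCE-style policy loss: push up task advantage, push down safety advantage.
logp = F.log_softmax(policy(obs), dim=-1).gather(1, actions[:, None]).squeeze(-1)
task_adv = (task_returns - task_value).detach()
safety_adv = (safety_costs - safety_value).detach()
policy_loss = -(logp * (task_adv - lam * safety_adv)).mean()

(policy_loss + critic_loss).backward()  # gradients for one combined update step

# Dual ascent on the safety weight: grow it while the constraint is violated.
lam = max(0.0, lam + lagrange_lr * (safety_costs.mean().item() - safety_budget))
print(f"updated safety weight lambda = {lam:.2f}")
```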

Trying to find some concrete examples, I'm coming up short.

I'm sorry I didn't meet the recommendation for replies, but glad to have put something on here. I think this is far too few replies for a question like this.

Comment by alex-ray on Should I wear wrist-weights while playing Beat Saber? · 2019-07-22T23:30:33.511Z · LW · GW

Posting this as a comment because it's answering a different question than "should I use wrist weights".

I have found that a weight vest is a nice improvement to the game. I’d recommend trying it, and it possibly might have some of the common benefits with the wrist weights without some of the downsides.

Comment by alex-ray on 87,000 Hours or: Thoughts on Home Ownership · 2019-07-07T06:48:05.249Z · LW · GW

I'm pretty surprised this entire argument goes without using any amount of quantitative modeling or data analysis.

I do think it presents a bunch of persuasive and philosophical arguments in the direction of your conclusions, but it's easy to imagine (and find, searching on the internet) persuasive and philosophical arguments in the opposite direction.

(Caveat: I'm a bit new to this forum and how things work, but surely for folks here this is better answered by building a model and incorporating uncertainty?)

A few of the specifics you give I've found are not borne out in the research I've done (e.g. price sensitivity has more to do with location/centrality than it does with luxury, though more luxurious homes tend to be built farther from city centers). This could just be from different sources, but I notice I want to wave several large [CITATION NEEDED] flags.

Also maybe it'd be useful to share the quantitative analysis I've done? Basically "modeling the financial, legal, and social implications of buying a house together" has been the biggest project of mine for 2019 outside of work, but I'm not an expert (most of us in my house, myself included, are first-time home buyers). I'd consider myself better informed than the average person who has not owned a house for an extended period of time, but would be very interested in learning more and learning where my models are bad.

For interested folks, I found Shiller's Irrational Exuberance to give a bunch of nice solid models (backed with data!) on speculative pricing bubbles in ways that seem to apply to SF Bay Area housing in particular.