A Bayesian Aggregation Paradox 2021-11-22T10:39:59.935Z
Jsevillamol's Shortform 2021-11-20T16:00:10.434Z
[Link post] When pooling forecasts, use the geometric mean of odds 2021-09-06T06:45:01.244Z
Analysis of World Records in Speedrunning [LINKPOST] 2021-08-04T15:26:35.463Z
Work on Bayesian fitting of AI trends of performance? 2021-07-19T18:45:19.148Z
Trying to approximate Statistical Models as Scoring Tables 2021-06-29T17:20:11.050Z
Parameter counts in Machine Learning 2021-06-19T16:04:34.733Z
How to Write Science Fiction and Fantasy - A Short Summary 2021-05-29T11:47:30.613Z
Parameter count of ML systems through time? 2021-04-19T12:54:26.504Z
Survey on cortical uniformity - an expert amplification exercise 2021-02-23T22:13:24.157Z
Critiques of the Agent Foundations agenda? 2020-11-24T16:11:22.495Z
Spend twice as much effort every time you attempt to solve a problem 2020-11-15T18:37:24.372Z
Aggregating forecasts 2020-07-23T18:04:37.477Z
What confidence interval should one report? 2020-04-20T10:31:54.107Z
On characterizing heavy-tailedness 2020-02-16T00:14:06.197Z
Implications of Quantum Computing for Artificial Intelligence Alignment Research 2019-08-22T10:33:27.502Z
Map of (old) MIRI's Research Agendas 2019-06-07T07:22:42.002Z
Standing on a pile of corpses 2018-12-21T10:36:50.454Z
EA Tourism: London, Blackpool and Prague 2018-08-07T10:41:06.900Z
Learning strategies and the Pokemon league parable 2018-08-07T09:37:27.689Z
EA Spain Community Meeting 2018-07-10T07:24:59.310Z
Estimating the consequences of device detection tech 2018-07-08T18:25:15.277Z
Advocating for factual advocacy 2018-05-06T08:47:46.599Z
The most important step 2018-03-24T12:34:01.643Z


Comment by Jsevillamol on Why Study Physics? · 2021-11-28T12:46:13.635Z · LW · GW

Related work:

Also, while this post focuses a lot on Physics, my experience is that top level math people are quite comfortable with informal math reasoning.

Comment by Jsevillamol on Finding the Central Limit Theorem in Bayes' rule · 2021-11-27T09:20:28.501Z · LW · GW

Great post! Helped me build an intuition of why this is true, and I came off pretty convinced it is.

I especially liked how each step is well motivated, so that by the end I already knew where this was going.

One note:

In the last section you write that the convolution of the distribution equals the Fourier transform of the pointwise distributions.

But I think you meant to write that the Fourier transform of the convolution of the distributions is the pointwise product of their Fourier transforms (?).

This does not break the proof.
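
To make the corrected statement concrete, here is a minimal numerical check. It assumes two arbitrary discrete distributions (the arrays `p` and `q` are made up for illustration, they are not from the post) and verifies that the Fourier transform of their convolution matches the pointwise product of their Fourier transforms.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two arbitrary discrete probability distributions (illustrative only)
p = rng.random(8)
p /= p.sum()
q = rng.random(8)
q /= q.sum()

# Distribution of the sum of two independent draws: the convolution of p and q
conv = np.convolve(p, q)
n = len(conv)  # 8 + 8 - 1 = 15

# Zero-pad both inputs to length n so that the circular convolution
# implied by the discrete Fourier transform coincides with the linear one
lhs = np.fft.fft(conv)
rhs = np.fft.fft(p, n) * np.fft.fft(q, n)

print(np.allclose(lhs, rhs))  # True
```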

Comment by Jsevillamol on EfficientZero: How It Works · 2021-11-26T16:07:18.579Z · LW · GW

Great post!

Do you mind if I ask what the number of free parameters and the training compute of EfficientZero are?

I tried scanning the paper but didn't find them readily available.

Comment by Jsevillamol on A Bayesian Aggregation Paradox · 2021-11-24T13:24:16.029Z · LW · GW

average log odds could make sense in the context in which there is a uniform prior


This is something I have heard from other people too, and I still cannot make sense of it. Why would questions where uninformed forecasters produce uniform priors make log-odds averaging work better?

A tendency for the questions asked to have priors of near 50% according to the typical unknowledgeable person would explain why more knowledgeable forecasters would assign more extreme probabilities on average: it takes more expertise to justifiably bring their probabilities further from 50%.

I don't understand your point. Why would forecasters care about what other people would do? They only want to maximize their own score.

If A, B, and C are mutually exclusive, then they can't all have 50% prior probability, so a pooling method that implicitly assumes that they do will not give coherent results.

This also doesn't make much sense to me, though it might be because I still don't understand the point about needing uniform priors for log-odds pooling.


Different implicit priors don't appear to be ruining anything.



I conclude that the incoherent results in my ABC example cannot be blamed on switching between the uniform prior on {A,B,C} and the uniform prior on {A,¬A}, and, instead, should be blamed entirely on the experts having different beliefs conditional on ¬A, which is taken into account in the calculation using A,B,C, but not in the calculation using A,¬A.

I agree with this.
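
A small sketch of the kind of incoherence being discussed, under made-up numbers: two hypothetical experts agree that A has probability 0.5 but disagree on how to split the rest between B and C. Pooling with the normalized geometric mean over {A, B, C} then gives a different probability for A than pooling over {A, not A} (where the normalized geometric mean is the same thing as the geometric mean of odds). The function name and the expert forecasts are assumptions for illustration only.

```python
import numpy as np

def pool_geometric(prob_rows):
    """Geometric mean of each outcome's probability across experts, renormalized."""
    probs = np.asarray(prob_rows, dtype=float)
    geo = np.exp(np.log(probs).mean(axis=0))
    return geo / geo.sum()

# Hypothetical experts: same belief in A, different beliefs conditional on not-A
expert_1 = [0.5, 0.4, 0.1]
expert_2 = [0.5, 0.1, 0.4]

# Pool over the three outcomes {A, B, C}
p_a_three_way = pool_geometric([expert_1, expert_2])[0]

# Pool over the two outcomes {A, not A}
binary = [[e[0], 1 - e[0]] for e in (expert_1, expert_2)]
p_a_binary = pool_geometric(binary)[0]

print(p_a_three_way, p_a_binary)  # roughly 0.556 versus 0.5
```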

Comment by Jsevillamol on Laplace's rule of succession · 2021-11-23T23:42:31.822Z · LW · GW

Solid post - it is good to have the full reasoning for Laplace's rule of succession in a single post instead of buried in a statistics post. I also liked the discussion on how to use it in practice - I'd love to see a full example using actual numbers if you feel like writing one! 

On this topic I also recently enjoyed UnexpectedValues' post. He provides a cool proof / intuition for the rule of succession.

Comment by Jsevillamol on A Bayesian Aggregation Paradox · 2021-11-23T08:53:53.973Z · LW · GW

Note that you are making the same mistake as me! Updates are not summarized in the same way as beliefs - for the update the "correct" way is to take an average of the likelihoods:


This does not invalidate the example though!

Thanks for suggesting it, I think it helps clarify the conundrum.

Comment by Jsevillamol on A Bayesian Aggregation Paradox · 2021-11-22T14:40:39.982Z · LW · GW

I like this framing.

This seems to imply that summarizing beliefs and summarizing updates are two distinct operations.

For summarizing beliefs we can still resort to summing:


But for summarizing updates we need to use an average - which in the absence of prior information will be a simple average:


Annoyingly, and as you point out, this is not a perfect summary - we are definitely losing information here and subsequent updates will not be as exact as if we were working with the disaggregated odds.

I still find it quite disturbing that the update after summarizing depends on prior information - but I can't see how to do better than this, pragmatically speaking.
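
A minimal sketch of the two operations, assuming a made-up prior and made-up likelihoods over three mutually exclusive hypotheses A, B, C. Beliefs about "A or B" are summarized by summing, while the update for "A or B" is a prior-weighted average of the likelihoods (which reduces to a simple average when A and B have equal priors). With that weighting the summarized update agrees with the disaggregated one, which also makes the dependence on the prior explicit.

```python
import numpy as np

# Hypothetical prior over A, B, C and hypothetical likelihoods P(E | hypothesis)
prior = np.array([0.2, 0.3, 0.5])
likelihood = np.array([0.9, 0.1, 0.4])

# Disaggregated Bayesian update
posterior = prior * likelihood
posterior /= posterior.sum()

# Summarizing beliefs: the probability of "A or B" is a plain sum
prior_ab = prior[0] + prior[1]

# Summarizing updates: the likelihood of "A or B" is a prior-weighted average
likelihood_ab = (prior[0] * likelihood[0] + prior[1] * likelihood[1]) / prior_ab

# Update on the summarized hypotheses {A or B, C}
summary = np.array([prior_ab * likelihood_ab, prior[2] * likelihood[2]])
summary /= summary.sum()

print(np.isclose(summary[0], posterior[0] + posterior[1]))  # True
```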

Comment by Jsevillamol on Jsevillamol's Shortform · 2021-11-20T16:00:10.792Z · LW · GW

From OpenAI Five's blogpost:

We’re still fixing bugs. The chart shows a training run of the code that defeated amateur players, compared to a version where we simply fixed a number of bugs, such as rare crashes during training, or a bug which resulted in a large negative reward for reaching level 25. It turns out it’s possible to beat good humans while still hiding serious bugs!

One common line of thought is thinking that goals are very brittle - small misspecifications will be amplified after optimizing.

Yet OpenAI Five managed to wrangle a good performance out of a seriously buggy reward function.

Hardly conclusive, but it would be interesting to see more examples of this. One could also do deliberate experiments to see how much you can distort a reward function before behaviour breaks.

Comment by Jsevillamol on My ML Scaling bibliography · 2021-10-24T13:57:05.939Z · LW · GW

For people interested in scaling laws, check out "Parameter, Compute and Data Trends in Machine Learning" - the biggest public database of milestone AI systems annotated with data on parameters, compute and data.

A visualization of the current dataset is available here.

(disclaimer: I am the main coordinator of the project)

Comment by Jsevillamol on Emergent modularity and safety · 2021-10-21T06:39:59.121Z · LW · GW

Relevant related work: NNs are surprisingly modular

On the topic of pruning neural networks, see the lottery ticket hypothesis

Comment by Jsevillamol on Optimization Concepts in the Game of Life · 2021-10-17T06:25:58.211Z · LW · GW

How might we quantify size in our definitions above?

Random K complexity inspired measure of size for a context / property / pattern.

Least number of squares you need to turn on, starting from an empty board, so that the grid eventually evolves into the context.

It doesn't work for infinite contexts though.
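
A brute-force sketch of this measure for small finite patterns. Everything here is an assumption made for illustration: the function names, the restriction of seeds to a small box, the cap on the number of steps, and the choice to require an exact match of the live cells (no translations). Because of those limits the result is only an upper bound within the search space, not a general algorithm.

```python
from collections import Counter
from itertools import combinations, product

def step(live):
    """One Game of Life generation on an unbounded board; `live` is a set of (x, y) cells."""
    counts = Counter(
        (x + dx, y + dy)
        for x, y in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    return {c for c, n in counts.items() if n == 3 or (n == 2 and c in live)}

def min_seed_size(target, box=4, max_steps=4, max_cells=4):
    """Smallest number of initially live cells (inside a box x box square)
    whose evolution equals `target` within `max_steps` generations."""
    target = set(target)
    cells = list(product(range(box), repeat=2))
    for k in range(1, max_cells + 1):
        for seed in combinations(cells, k):
            live = set(seed)
            for _ in range(max_steps + 1):
                if live == target:
                    return k
                live = step(live)
    return None  # nothing found within the search limits

# The 2x2 block has four cells, but an L-shaped seed of three cells grows into it
block = {(0, 0), (0, 1), (1, 0), (1, 1)}
print(min_seed_size(block))  # 3
```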

Comment by Jsevillamol on NVIDIA and Microsoft releases 530B parameter transformer model, Megatron-Turing NLG · 2021-10-12T08:00:20.016Z · LW · GW

Here is your estimate in the context of other big AI models:

The top 3 systems by training compute in the graph are Megatron-Turing, GPT-3 and AlphaZero

An interactive visualization is available here.

Comment by Jsevillamol on Sigmoids behaving badly: arXiv paper · 2021-09-22T08:32:42.393Z · LW · GW

I skimmed this paper and liked it. My personal takeaway is that historical fit to a sigmoid model is not predictive of future fit. If you want to have a good sigmoid forecast, you need to have good priors on the mechanisms causing the acceleration and dampening of the curve.

Thank you for sharing this!

Comment by Jsevillamol on AI Safety Papers: An App for the TAI Safety Database · 2021-08-23T17:29:35.064Z · LW · GW

My user experience

When I first load the page, I am greeted by an empty space.


From here I didn't know what to look for, since I didn't remember what kind of things were in the database.

I tried clicking on table to see what content is there.

Ok, too much information, hard to navigate.

I remember that one of my manuscripts made it to the database, so I look up my surname


That was easy! (and it loaded very fast)

The interface is very neat too. I want to see more papers, so I click on one of the tags.

I get what I wanted.

Now I want to find a list of all the tags. Hmmm I cannot find this anywhere.

I give up and look at another paper:

Oh cool! The Alignment Newsletter summary is really great. Whenever I read something in Google Scholar it is really hard to find commentary on any particular piece.

I tried now to look for my current topic of research to find related work

Meh, not really anything interesting for my research.

Ok, now I want to see if Open AI's "AI and compute" post is in the dataset:

Huhhh it is not here. The bitter lesson is definitely relevant, but I am not sure about the other articles.

Can I search for work specific to open ai?

Hmm that didn't quite work. The top result is from OpenAI, but the rest are not.

Maybe I should spell it differently?

Oh cool that worked! So apparently the blogpost is not in the dataset.

Anyway, enough browsing for today.

Alright, feedback: 

  1. This is a very cool tool. The interface is neatly designed.
  2. Discovering new content seems hard. Some things that could help include a) adding recommended content on load (perhaps things with most citations, or even ~10 random papers) and b) having a list of tags somewhere
  3. The reviewer blurbs are very nice. However I do not expect to use this tool. Or rather I cannot think right now of what exactly I would use this tool for. It has made me consider reaching out to the database maintainers to suggest the inclusion of an article of mine. So maybe like that, to promote my work?
Comment by Jsevillamol on "AI and Compute" trend isn't predictive of what is happening · 2021-08-20T09:20:29.092Z · LW · GW

Deep or shallow version?

Comment by Jsevillamol on "AI and Compute" trend isn't predictive of what is happening · 2021-08-16T14:31:18.973Z · LW · GW

One more question: for the BigGAN which model do your calculations refer to?

Could it be the 256x256 deep version?

Comment by Jsevillamol on Outline of Galef's "Scout Mindset" · 2021-08-11T16:36:56.569Z · LW · GW

Some discussion from the OP here

Comment by Jsevillamol on Analysis of World Records in Speedrunning [LINKPOST] · 2021-08-10T14:09:01.978Z · LW · GW

Update: I tried regressing on the ordinal position of the world records and found a much better fit, and better (above baseline!) forecasts of the last WR of each category.

This makes me update further towards the hypothesis that date is a bad predictive variable. Sadly this would mean that we really need to track whatever the index in WR is correlated with (presumably the cumulative number of runs overall by the speedrunning community). 

Linear regression with logarithmic transforms on both axes
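
A minimal sketch of this kind of fit, run on synthetic data rather than the actual speedrunning dataset. The power-law shape, the noise level and all variable names are assumptions for illustration; the real analysis lives in the linked code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for one category: the k-th world record time
# follows a noisy power law in its ordinal position k
index = np.arange(1, 31)
times = 100.0 * index ** -0.3 * np.exp(rng.normal(0.0, 0.01, size=index.size))

# Linear regression with logarithmic transforms on both axes,
# holding out the last world record
slope, intercept = np.polyfit(np.log(index[:-1]), np.log(times[:-1]), deg=1)

# Forecast the held-out last record from its ordinal position
forecast = np.exp(intercept + slope * np.log(index[-1]))
print(slope, forecast, times[-1])
```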
Comment by Jsevillamol on Analysis of World Records in Speedrunning [LINKPOST] · 2021-08-10T10:59:28.328Z · LW · GW

This is so cool!

It seems like the learning curves are reasonably close to the diagonal, which means that:

  • Given the logarithmic X-axis, it seems like improvements become increasingly harder over time. You need to invest exponentially more time to get a linear improvement.
  • The rate of logarithmic improvement is overall relatively constant.

On the other hand, despite all curves being close to the diagonal, they seem to mostly undershoot it. This might imply that the rate of improvement is slightly decreasing over time.

One thing that tripped me up in this graph, noted here for other readers: the relative attempt is with respect to the number of WR improvements. That means that if there are 100 WRs, the point with relative attempt = 0.5 is the 50th WR improvement, not the one whose time is closest to the average between the dates of the first and last attempts.

So this graph is giving information about "conditional on you putting enough effort to beat the record, by how much should you expect to beat it?" rather than on "conditional on spending X amount of effort on the margin, by how much should you expect to improve the record?".

Here is the plot that would correspond to the other question, where the x axis value is not proportional to the ordinal index of WR improvement but to the date when the WR was submitted.

It shows a far weaker correlation. This suggests that a) the best predictor of new WRs is the number of runs overall being put into the game and b) the number of new WRs around a given time is a good estimate of the number of runs overall being put into the game.

This has made me update a bit against plotting WR vs time, and in favor of plotting WR vs cumulative number of runs. Here are some suggestions about how one could go about estimating the number of runs being put into the game, if somebody wants to look into this!

PS: the code for the graph above, and code to replicate Andy's graph, is now here

Comment by Jsevillamol on Analysis of World Records in Speedrunning [LINKPOST] · 2021-08-10T10:14:43.028Z · LW · GW

Is there any way to estimate how many cumulative games that speedrunners have run at a given point?


One should be able to use the API to search for the number of runs submitted by a certain date, as a proxy for the cumulative games (though it will not reflect all attempts since AFAIK many runners only submit their personal bests to

Additionally, provides some stats on the amount of runs and players for each game, for example the current stats for Super Metroid can be found here: 

Current stats for Super Metroid according to

There are some problems with this approach too. 

  1. These are aggregated by game, not by category, so one would need to somehow split the runs among popular categories of the same game.
  2. There is only current data available through the webpage. There might be a way to access historical data through the API. If not, one would need to use archived versions of the pages and interpolate the scraped stats.

I'd be excited about learning about the results of either approach if anybody ends up scraping this data!

Comment by Jsevillamol on Analysis of World Records in Speedrunning [LINKPOST] · 2021-08-06T12:20:56.068Z · LW · GW

Those are good suggestions! 

Here is what happens when we align the start dates and plot the improvements relative to the time of the first run.

Relative improvement vs days since first run for most popular categories

I am slightly nervous about using the first run as the reference, since early data in a category is quite unreliable and basically reflects the time of the first person who thought to submit a run. But I think it should not create any problems.

Interestingly, plotting the relative improvement reveals some S-curve patterns, with phases of increasing returns followed by phases of diminishing returns.

I did not manage to beat the baseline by extrapolating the relative improvement times either. Interestingly, using a grid to count non-improvements as observations made the extrapolation worse, so this time the best fit was achieved with log linear regression over the last 8 weeks of data in each category.

Log linear extrapolation of relative improvements 

As before, the code to replicate my analysis is available here.

Haven't had time yet to include logistic models or do analysis of the derivative of the improvements - if you feel so inclined feel free to reuse my code to perform the analysis yourself and if you share them here we can comment on the results!

PS: there is a sentence missing an ending in your comment

Comment by Jsevillamol on How much compute was used to train DeepMind's generally capable agents? · 2021-07-31T01:26:29.520Z · LW · GW

Do you mind sharing your guesstimate on number of parameters?

Also, do you by any chance have guesstimates on number of parameters / compute of other systems?

Comment by Jsevillamol on Incorrect hypotheses point to correct observations · 2021-07-30T22:26:40.268Z · LW · GW

Ah, I just realized this is the norm with curated posts. FWIW I feel a bit awkward to have the curation comments displayed so prominently, since it alters the author's intended reading experience in a way I find a bit weird / offputting.

If it was up to me, I would remove the curator's words at the top of a post in favor of comments like this one, where the reasons for curation are explained but it's not the first thing that readers see when reading the post.

Comment by Jsevillamol on Incorrect hypotheses point to correct observations · 2021-07-30T22:20:53.964Z · LW · GW

Meta: it seems like you have accidentally added this comment also at the beginning of the post besides commenting?

Comment by Jsevillamol on "AI and Compute" trend isn't predictive of what is happening · 2021-07-21T10:25:25.627Z · LW · GW

What is the GShard dense transformer you are referring to in this post?

Comment by Jsevillamol on How much chess engine progress is about adapting to bigger computers? · 2021-07-08T22:39:32.195Z · LW · GW

Very tangential to the discussion so feel free to ignore, but given that you have put some thought before into prize structures I am curious about the reasoning for why you would award a different prize for something done in the past versus something done in the future.

Comment by Jsevillamol on Spend twice as much effort every time you attempt to solve a problem · 2021-07-05T10:03:31.068Z · LW · GW

Nicely done!

I think this improper prior approach makes sense.

I am a bit confused about the step where you go from an improper prior to saying that the "expected" effort would land in the middle of these numbers. This is because the continuous part of the total effort spent vs doubling factor is concave, so I would expect the "expected" effort to be weighted more in favor of the lower bound.

I tried coding up a simple setup where I average the graphs across a space of difficulties to approximate the "improper prior" but it is very hard to draw a conclusion from it. I think the graph suggests that the asymptotic minimum is somewhere above 2.5 but I am not sure at all. 

Doubling factor (x-axis) vs expected total effort spent (y-axis), averaged across 1e5 difficulty levels uniformly spaced between d=2 and d=1e6

Also I guess it is unclear to me whether a flat uninformative prior is best, vs an uninformative prior over logspace of difficulties. 

What do you think about both of these things?

Code for the graph:

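import numpy as np
import matplotlib.pyplot as plt

# Total effort when attempts cost 1, b, b**2, ... until one attempt reaches difficulty d
def effort_spent(d, b):
  n = np.ceil(np.log(d) / np.log(b))  # index of the first attempt with b**n >= d
  return (b**(n + 1) - 1) / (b - 1)

ds = np.linspace(2, 1000000, 100000)  # difficulty levels
bs = np.linspace(1.1, 5, 1000)        # doubling factors
hist = np.zeros_like(bs)
for d in ds:
  # effort_spent broadcasts over the array of doubling factors
  hist += effort_spent(d, bs) / len(ds)
plt.plot(bs, hist)
plt.show()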
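
For the question about a prior over the log space of difficulties, here is a hedged variant of the code above. It samples difficulties uniformly in log space and, as an extra assumption of this sketch, divides the effort by the difficulty so that easy and hard problems contribute comparably to the average. That normalization is a design choice made for illustration, not something argued for in the thread, and the printed minimizer should not be read as a settled answer.

```python
import numpy as np

def effort_spent(d, b):
    """Total effort when attempts cost 1, b, b**2, ... until one attempt reaches difficulty d."""
    n = np.ceil(np.log(d) / np.log(b))
    return (b ** (n + 1) - 1) / (b - 1)

bs = np.linspace(1.1, 5, 1000)          # candidate doubling factors
ds = np.logspace(np.log10(2), 6, 2000)  # difficulties spaced uniformly in log space

# Effort relative to the difficulty, averaged over the log-uniform grid
relative_effort = np.mean([effort_spent(d, bs) / d for d in ds], axis=0)

best_b = bs[np.argmin(relative_effort)]
print(best_b)
```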
Comment by Jsevillamol on The Generalized Product Rule · 2021-07-02T11:52:37.015Z · LW · GW

I really like this article.

It has helped me appreciate how product rules (or additivity, if we apply a log transform) arise in many contexts. One thing I hadn't appreciated when studying Cox's theorem is that you do not need to respect "commutativity" to get a product rule (though obviously this restricts how you can group information). This was made very clear to me in example 3.

One thing that confused me in the first reading was that I misunderstood you as referring to the third requirement as associativity of . Rereading this is not the case; you just say that the third requirement implies that F is associative. But I wish you had spelled out the implication, ie saying that.

Comment by Jsevillamol on Parameter counts in Machine Learning · 2021-06-28T09:54:14.517Z · LW · GW

Good suggestion! Understanding the trend of record-setting would be interesting indeed so that we avoid the pesky influence of the systems which are below the trend like CURL in the game domain.

The problem with the naive setup of just regressing on record-setters is that it is quite sensitive to noise - one early outlier in the trend can completely alter the result.

I explore a similar problem in my paper Forecasting timelines of quantum computing, where we try to extrapolate progress on some key metrics like qubit count and gate error rate. The method we use in the paper to address this issue is to bootstrap the input and predict a range of possible growth rates - that way outliers do not completely dominate the result.
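
A minimal sketch of the bootstrap idea on synthetic data. This is not the code from the paper: the data, the size of the outlier, the number of resamples and the 5%-95% range are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a log-scale metric growing linearly over time, with one early outlier
t = np.arange(20)
log_metric = 0.5 * t + rng.normal(0.0, 0.3, size=t.size)
log_metric[1] += 4.0  # early outlier

def growth_rate(x, y):
    """Slope of an ordinary least squares fit of y on x."""
    return np.polyfit(x, y, deg=1)[0]

# Resample the data points with replacement and refit each time
slopes = []
for _ in range(1000):
    sample = rng.integers(0, t.size, size=t.size)
    if np.unique(t[sample]).size < 2:
        continue  # a slope needs at least two distinct x values
    slopes.append(growth_rate(t[sample], log_metric[sample]))

# Report a range of plausible growth rates instead of a single point estimate
low, high = np.percentile(slopes, [5, 95])
print(low, high)
```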

I will probably not do it right now for this dataset, though I'd be interested in having other people try that if they are so inclined!

Comment by Jsevillamol on Parameter counts in Machine Learning · 2021-06-28T09:45:18.679Z · LW · GW

This is now fixed; see the updated graphs. We have also updated the eyeball estimates accordingly.

Comment by Jsevillamol on The Point of Trade · 2021-06-26T18:42:51.501Z · LW · GW

Trying to think a bit harder about this - maybe companies are sort of like this? To manage my online shop I need someone to maintain the web, someone to handle marketing, etc. I need many people to work for me to make it work, and I need all of them at once. To make it more obvious, let's suppose that I pay my workers in direct proportion to the amount of sales they manage.

As I painted it, this is not about amortizing a fixed cost. And I cannot subdivide the task - if I tell my team I expect to make only 10 sales and pay accordingly they are going to tell me to go eff myself (though maybe in the magical world where there are no task-switching costs this breaks down).

Another try: maybe a fairness constraint can force a minimum. The government has given me the okay to sell my new cryonics procedure, but only if I can make enough for everyone.

Comment by Jsevillamol on The Point of Trade · 2021-06-26T17:18:36.106Z · LW · GW

You are quite right that 1 and 2 are related, but the way I was thinking about them I didn't have them as equivalent.

1 is about fixed costs; each additional sheet of paper I produce amortizes part of the initial, fixed cost 

2 is about a threshold of operation. Even if there are no fixed costs, it would happen in a world where I can only produce in large batches and not in individual units.

Then again, I am struggling to think of a real-life example of 2, so maybe it is not something that happens in our universe.

Comment by Jsevillamol on The Point of Trade · 2021-06-22T18:28:49.494Z · LW · GW

I'm confused. Why would diminishing marginal returns incentivize trade? If the first unit of everything was very cheap then I would rather produce it myself than produce extra of one thing (which costs more) and then trade.

Comment by Jsevillamol on The Point of Trade · 2021-06-22T18:26:12.660Z · LW · GW

Other magical powers of trade:

  1. Economies of scale. It is basically as easy for me to produce 20 sheets of paper as to produce 1 ; after paying the set up costs the marginal costs are much smaller in comparison. So all in all I would rather specialize in paper-making, have somebody else specialize in pencil-making, then trade.
  2. Investment. Often I need A LOT of capital to get something started, more than I could reasonably accumulate over a lifetime. So I would rather trade the starting capital for IOUs I will get from profit.
  3. Insurance. I may have a particularly bad harvest this year and a very good one the next one, while my neighbour might have the opposite problem. All in all I would rather we pool our harvests each year, so that we can have food both years. So we are "trading" part of our harvest for insurance.
Comment by Jsevillamol on Parameter counts in Machine Learning · 2021-06-21T14:40:12.807Z · LW · GW

Thank you! The shapes mean the same as the color (ie domain) - they were meant to make the graph more clear. Ideally both shape and color would be reflected in the legend. But whenever I tried adding shapes to the legend instead a new legend was created, which was more confusing.

If somebody reading this knows how to make the code produce a correct legend I'd be very keen on hearing it!

EDIT: Now fixed

Comment by Jsevillamol on Parameter counts in Machine Learning · 2021-06-21T14:33:04.360Z · LW · GW

Thank you! I think you are right - by default the Altair library (what we used to plot the regressions) does OLS fitting of an exponential instead of fitting a linear model over the log transform. We'll look into this and report back.

Comment by Jsevillamol on How to Write Science Fiction and Fantasy - A Short Summary · 2021-05-31T08:33:26.852Z · LW · GW

Thank you! Now fixed :)

Comment by Jsevillamol on Parameter count of ML systems through time? · 2021-04-19T15:22:29.883Z · LW · GW

Thank you for the feedback, I think what you say makes sense.

I'd be interested in seeing whether we can pin down exactly in what sense Switch parameters are "weaker". Is it because of the lower precision? Model sparsity (is Switch sparse on parameters or just sparsely activated?)?

What do you think, what typology of parameters would make sense / be useful to include?

Comment by Jsevillamol on "New EA cause area: voting"; or, "what's wrong with this calculation?" · 2021-02-27T22:05:44.960Z · LW · GW

Can you explain to me why the probability of a swing is $1/\sqrt{NVoters}$? :)

Comment by Jsevillamol on Survey on cortical uniformity - an expert amplification exercise · 2021-02-24T09:27:10.213Z · LW · GW

re: "I'd expect experts to care more about the specific details than I would"

Good point. We tried to account for this by making it so that the experts do not have to agree or disagree directly with each sentence but instead choose the least bad of two extreme positions.

But in practice one of the experts bypassed the system by refusing to answer Q1 and Q2 and leaving an answer in the space for comments.

Comment by Jsevillamol on Survey on cortical uniformity - an expert amplification exercise · 2021-02-24T09:20:03.563Z · LW · GW

Street fighting math:

Let's model experts as independent draws of a binary random variable with a bias $P$. Our initial prior over their chance of choosing the pro-uniformity option (ie $P$) is uniform. Then if our sample is $A$ people who choose the pro-uniformity option and $B$ people who choose the anti-uniformity option we update our beliefs over $P$ to a $Beta(1+A,1+B)$, with the usual Laplace's rule calculation.

To scale this up to eg a sample of $n$ people we compute the mean of $n$ independent draws of a $Bernoulli(P)$, where $P$ is drawn from the posterior Beta. By the central limit theorem this mean is approximately a normal of mean $P$ and variance equal to the variance of the Bernoulli divided by $n$, ie $\frac{1}{n}P(1-P)$.

We can use this to compute the approximate probability that the majority of experts in the expanded sample will be pro-uniformity, by integrating the probability that this normal is greater than $1/2$ over the possible values of $P$.

So for example we have $A=1$, $B=3$ in Q1, so for a survey of $n=100$ participants we can approximate the chance of the majority selecting option $A$ as:

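import numpy as np
import scipy.stats as stats

A = 1    # experts who chose the pro-uniformity option
B = 3    # experts who chose the anti-uniformity option
n = 100  # size of the expanded survey

posterior = stats.beta(A + 1, B + 1)
ps = np.linspace(0.0001, 0.9999, 10000)

# Normal approximation: chance that the pro-uniformity share exceeds 1/2, given bias p
p_majority = stats.norm.sf(0.5, loc=ps, scale=np.sqrt(ps * (1 - ps) / n))

# The mean over a uniform grid on (0, 1) approximates the integral over the posterior of P
print(np.mean(p_majority * posterior.pdf(ps)))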

which gives about $0.19$.

For Q2 we have $A=1$, $B=4$, so the probability of the majority selecting option $A$ is about $0.12$.

For Q3 we have $A=6$, $B=0$, so the probability of the majority selecting option $A$ is about $0.99$.
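
As a sanity check on these numbers, here is a short sketch of the large-sample limit. As $n$ grows the normal concentrates around $P$, so the probability of a pro-uniformity majority tends to the posterior mass of the Beta above $1/2$. The dictionary and variable names are made up for this illustration; the counts are the ones quoted above.

```python
import scipy.stats as stats

# (A, B) counts for Q1, Q2 and Q3 as quoted above
answers = {"Q1": (1, 3), "Q2": (1, 4), "Q3": (6, 0)}

# Posterior probability that the underlying bias P exceeds 1/2
limits = {q: stats.beta(a + 1, b + 1).sf(0.5) for q, (a, b) in answers.items()}
print(limits)  # about 0.19, 0.11 and 0.99
```

These limits are close to the $n=100$ estimates, which suggests the finite-sample correction is small here.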

EDIT: rephrased the estimations so they match the probability one would enter in the Elicit questions 

Comment by Jsevillamol on Implications of Quantum Computing for Artificial Intelligence Alignment Research · 2021-02-23T11:59:01.039Z · LW · GW

re: importance of oversight

I do not think we really disagree on this point. I also believe that looking at the state of the computer is not as important as having an understanding of how the program is going to operate and how to shape its incentives. 

Maybe this could be better emphasized, but the way I think about this article is showing that even the strongest case for looking at the intersection of quantum computing and AI alignment does not look very promising. 


re: How quantum computing will affect ML

I basically agree that the most plausible way QC can affect AI alignment is by providing computational speedups - but I think this mostly changes the timelines rather than violating any specific assumptions in usual AI alignment research.

Relatedly, I am bearish on seeing better than quadratic speedups (ie better than Grover) - to get better-than-quadratic speedups you need to surpass many challenges that right now it is not clear can be surpassed outside of very contrived problem setups [REF].

In fact I think that the speedups will not even be quadratic because you "lose" the quadratic speedup when parallelizing quantum computing (in the sense that the speedup does not scale quadratically with the number of cores).
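
A rough sketch of that scaling argument, ignoring constants and error correction overheads. It assumes the standard picture in which an unstructured search over N items is split evenly across k cores, so that each classical core needs about N/k steps and each Grover core about sqrt(N/k) steps. The function names and the problem size are made up for illustration.

```python
import math

def classical_steps(n_items, cores):
    """Unstructured search split evenly across classical cores (constants ignored)."""
    return n_items / cores

def grover_steps(n_items, cores):
    """Grover search with the search space split evenly across quantum cores (constants ignored)."""
    return math.sqrt(n_items / cores)

n_items = 10 ** 12
for cores in (1, 100, 10_000):
    advantage = classical_steps(n_items, cores) / grover_steps(n_items, cores)
    print(cores, advantage)
```

Under these assumptions the quantum advantage is sqrt(N/k), so it shrinks as more cores are added.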

Comment by Jsevillamol on Suggestions of posts on the AF to review · 2021-02-18T12:55:46.559Z · LW · GW

Suggestion 1: Utility != reward by Vladimir Mikulik. This post attempts to distill the core ideas of mesa alignment. This kind of distillation increases the surface area of AI Alignment, which is one of the key bottlenecks of the area (that is, getting people familiarized with the field, motivated to work on it and with a handle on some open questions to work on). I would like an in-depth review because it might help us learn how to do it better!

Suggestion 2: my coauthor Pablo Moreno and I would be interested in feedback on our post about quantum computing and AI alignment. We do not think that the ideas of the paper are useful in the sense of getting us closer to AI alignment, but I think it is useful to have signposts explaining why avenues that might seem attractive to people coming into the field are not worth exploring, while introducing them to the field in a familiar way (in this case our audience is quantum computing experts). One thing that confuses me is that some people have approached me after publishing the post asking me why I think that quantum computing is useful for AI alignment, so I'd be interested in feedback on what went wrong in the communication process given the deflationary nature of the article.

Comment by Jsevillamol on Making Vaccine · 2021-02-10T12:55:37.856Z · LW · GW

Amazing initiative John - you might give yourself a D but I am giving you an A+ no doubt.

Trying to decide if I should recommend this to my family.

In Spain, we have 18000 confirmed COVID cases in January 2021. I assume real cases are at least 20000. Some projections estimate that laypeople might not get vaccinated in 10 months, so the potential benefit of a widespread DIY vaccine is avoiding 200k cases of COVID19 (optimistically assuming linear growth of cases). 

Spain's population is 47 million, so the naïve chance of COVID for an individual before vaccines are widely available is 2e4*10 / 5e7, i.e. about 1 in 250.

Let's say that the DIY vaccine has a 10% chance of working on a given individual. If we take the side effects of the vaccine to be as bad as catching COVID19 itself, then I want the chances of a serious side effect to be lower than 1 in 2500 for the DIY vaccine to be worth it.

Taking into account the risk of preparing it incorrectly plus general precaution, the chances of a serious side effect look to me more like 1 in 100 than 1 in 1000.

So I do not think, given my beliefs, that I should recommend it. Is this reasoning broadly correct? What is a good baseline for the chances of a side effect in a new peptide vaccine?
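The back-of-envelope estimate above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the inputs are the rough figures quoted in the comment, the function and variable names are made up for this sketch, and it keeps the same simplifying assumptions (linear growth of cases, a serious side effect counted as being as bad as catching COVID19). Using the unrounded 47 million gives roughly 1 in 235 rather than the rounded 1 in 250, which does not change the conclusion.

```python
# Inputs are the rough figures quoted in the comment; names are illustrative.
REAL_CASES_PER_MONTH = 20_000   # assumed real (not just confirmed) monthly cases
MONTHS_UNTIL_VACCINE = 10       # projected wait before laypeople get vaccinated
POPULATION = 47_000_000         # population of Spain
P_VACCINE_WORKS = 0.10          # assumed chance the DIY vaccine works
P_SIDE_EFFECT_GUESS = 1 / 100   # the comment's rough guess for a serious side effect


def infection_risk(cases_per_month, months, population):
    """Naive individual chance of catching COVID, assuming linear growth of cases."""
    return cases_per_month * months / population


def break_even_side_effect_risk(p_infection, p_works):
    """Side-effect risk at which expected harm equals expected benefit,
    assuming a serious side effect is as bad as catching COVID itself."""
    return p_infection * p_works


p_infection = infection_risk(REAL_CASES_PER_MONTH, MONTHS_UNTIL_VACCINE, POPULATION)
threshold = break_even_side_effect_risk(p_infection, P_VACCINE_WORKS)
worth_it = P_SIDE_EFFECT_GUESS < threshold

print(f"Chance of infection: about 1 in {1 / p_infection:.0f}")
print(f"Break-even side-effect risk: about 1 in {1 / threshold:.0f}")
print(f"Worth it under these assumptions: {worth_it}")
```

Changing `P_VACCINE_WORKS` or `P_SIDE_EFFECT_GUESS` shows how sensitive the recommendation is to those two guesses.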

Comment by Jsevillamol on How long does it take to become Gaussian? · 2020-12-10T02:26:46.337Z · LW · GW

This post is great! I love the visualizations. And I hadn't made the explicit connection between iterated convolution and CLT!

Comment by Jsevillamol on Spend twice as much effort every time you attempt to solve a problem · 2020-11-16T16:00:30.733Z · LW · GW

I don't think so.

What I am describing is a strategy to manage your efforts in order to spend as little as possible while still meeting your goals (when you do not know in advance how much effort will be needed to solve a given problem).

So presumably if this heuristic applies to the problems you want to solve, you spend less on each problem and thus you'll tackle more problems in total. 
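A minimal sketch of the doubling heuristic, under a toy model that is my assumption rather than something stated here: an attempt succeeds only if its budget is at least the (unknown) required effort, and the effort spent on failed attempts is lost. The names `doubling_attempts`, `required_effort` and `first_budget` are hypothetical.

```python
def doubling_attempts(required_effort, first_budget=1.0):
    """Simulate doubling the effort budget until an attempt succeeds.

    Toy model (an assumption): an attempt succeeds iff its budget is at
    least `required_effort`, and failed attempts are fully wasted.
    Returns the list of budgets tried and the total effort spent.
    """
    if required_effort <= 0 or first_budget <= 0:
        raise ValueError("efforts must be positive")
    budgets = []
    budget = float(first_budget)
    while True:
        budgets.append(budget)
        if budget >= required_effort:
            break  # this attempt had enough effort behind it
        budget *= 2  # spend twice as much on the next attempt
    return budgets, sum(budgets)


budgets, total = doubling_attempts(required_effort=10)
print(budgets, total)
```

In this toy model, whenever the required effort is at least the first budget, the total spent stays below four times the effort that was actually needed, without knowing that amount in advance.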

Comment by Jsevillamol on AGI safety from first principles: Goals and Agency · 2020-10-22T10:42:46.330Z · LW · GW

I think this helped me understand you a bit better - thank you.

Let me try paraphrasing this:

> Humans are our best example of a sort-of-general intelligence. And humans have a lazy, satisficing, 'small-scale' kind of reasoning that is mostly only well suited for activities close to their 'training regime'. Hence AGIs may also be the same - and in particular if AGIs are trained with Reinforcement Learning and heavily rewarded for following human intentions this may be a likely outcome.

Is that pointing in the direction you intended?

Comment by Jsevillamol on Babble challenge: 50 ways to escape a locked room · 2020-10-13T18:10:53.454Z · LW · GW

(I realized I missed the part of the instructions about an empty room - so my solutions involve other objects)

Comment by Jsevillamol on Babble challenge: 50 ways to escape a locked room · 2020-10-13T18:00:51.097Z · LW · GW
  1. Break the door with your shoulders
  2. Use the window
  3. Break the wall with your fists
  4. Scream for help until somebody comes
  5. Call a locksmith
  6. Set fire to a piece of paper to trigger the smoke alarm, then wait for the firemen to rescue you
  7. Hide in the closet and wait for your captors to come back - then run for your life
  8. Discover how to time travel - time travel forward into the future until there is no room
  9. Wait until the house becomes old and crumbles
  10. Pick the lock with a paperclip
  11. Shred the bed into a string, pass it through the pet door, lasso the lock and open it
  12. Google how to make a bomb and blast the wall
  13. Open the door
  14. Wait for somebody to pass by, attract their attention by hitting the window and ask for help by writing on a notepad
  15. Write your location on a piece of paper and slide it under the door, hoping it will find its way to someone who can help
  16. Use the vents
  17. Use that handy secret door you built a while ago, the one your wife called you crazy for building
  18. Send a message through the internet asking for help
  19. Order a pizza, ask for help when they arrive
  20. Burn the door
  21. Melt the door with a smelting tool
  22. Shoot at the lock with a gun
  23. Push against the door until you quantum tunnel through it
  24. Melt the lock with the Breaking Bad melting lock stuff (probably google that first)
  25. There is no door - overcome your fears and cross the emptiness
  26. Split your mattress in half with a kitchen knife, fit the split mattress through the window to make a landing spot and jump onto it
  27. Make a paper plane with instructions for someone to help and throw it out of the window
  28. Make a rope with your duvet and slide yourself down to the street
  29. Make a makeshift glider with your duvet and jump out of the window - hopefully it will slow you down enough to not die
  30. Climb out of the window and into the next room
  31. Dig the soil under the door until you can fit through
  32. Set your speaker to maximum volume and ask for help
  33. Break the window with a chair and climb outside
  34. Grow a tree under the door and let it lift the door for you
  35. Use a clothes hanger to slide along the clothesline between your building and your neighbour's. Apologize to the neighbour for disrupting their sleep.
  36. Hit the ceiling with a broom to make the house rats come out. Attach a message to them and send them back into their hole, and on to your neighbour
  37. Meditate until somebody opens the door
  38. Train your flexibility for years until you fit through the dog door
  39. Build a makeshift battering ram with the wooden frame of the bed
  40. Unmount the hinges with a screwdriver and remove the door
  41. Try random combinations until you find the password
  42. Look for the key over the door frame
  43. Collect dust and blow it over the numpad. The dust collects over the three most greasy digits. Try the 6 possible combinations until the door opens.
  44. Find the model number of the lock. Call the manufacturer pretending to be the owner. Wait five minutes while listening to hold music. Explain you are locked in. Realize you are talking to an automated receiver. Ask to talk with a real person. Explain you are locked in. Follow all instructions.
  45. Do not be in the room in the first place
  46. Try figuring out if you really need to escape in the first place
  47. Swap consciousness with the other body you left outside the room
  48. Complain to your captor that the room is too small and you are claustrophobic. Hope they are understanding.
  49. Pretend to have a heart attack, wait for your captor to carry you outside
  50. Check out ideas on how to escape in the LessWrong babble challenge
Comment by Jsevillamol on AGI safety from first principles: Goals and Agency · 2020-10-02T13:53:26.281Z · LW · GW

Let me try to paraphrase this: 

In the first paragraph you are saying that "seeking influence" is not something that a system will learn to do if that was not a possible strategy in the training regime. (but couldn't it appear as an emergent property? Certainly humans were not trained to launch rockets - but they nevertheless did?)

In the second paragraph you are saying that common sense sometimes allows you to modify the goals you were given (but for this to apply to AI systems, wouldn't they need to have common sense in the first place, which kind of assumes that the AI is already aligned?)

In the third paragraph it seems to me that you are saying that humans have some goals that have a built-in override mechanism in them - eg in general humans have a goal of eating delicious cake, but they will forego this goal in the interest of seeking water if they are about to die of dehydration (but doesn't this seem to be a consequence of these goals being just instrumental things that proxy the complex thing that humans actually care about?)

I think I am confused because I do not understand your overall point, so the three paragraphs seem to be saying wildly different things to me.