Pseudorandomness contest, Round 2
post by Eric Neyman (UnexpectedValues) · 2020-12-20T08:35:09.266Z · LW · GW · 6 comments
This is a link post for https://ericneyman.wordpress.com/2020/12/17/pseudorandomness-contest-round-2/
[Note: if you’re reading this on 12/20 I recommend checking again tomorrow in case I need to make any clarifications to the rules.]
Last week, I asked you all to take 10 minutes to write down and submit a 150-bit string, with the goal of making your string “seem random” without the aid of any sources of randomness (except your brain and anything you had memorized). I received 62 submissions; thank you to everyone who participated!
Now it’s time for Round 2, which I think of as the “real” part of the contest. In addition to the 62 strings you all submitted, I created 62 “truly” random strings with my computer, and I have put the 124 strings in a random order. Your goal is to figure out which of the strings are truly random and which ones were submitted by a Round 1 participant. Note that you can participate in Round 2 even if you did not participate in Round 1. Just as for Round 1, there will be prizes for doing well! (See below for details.)
Click here to see the binary strings, but you should read the rules before starting!
The rules: For each string, you will be asked to submit your guess about the probability that the string is truly random. For example, you might say “70%” for a string you think is probably random and “5%” for a string you’re pretty sure isn’t random.
Unlike in Round 1, this time you are free to use almost any resources you want to complete this task. You can write a computer program. You may Google for advice about how to tell apart truly random and fake-random strings. You may look up numerical constants people might have used to generate their random strings. There are two things you are not allowed to do:
- You may not interact with a person, e.g. ask questions on Stack Exchange or talk to a friend, as part of completing this task, unless you are on the same team (see the team policy below).
- You may not use code or pseudocode that was written by someone else, if that code is meant to be used for distinguishing random and pseudorandom strings. That is, you’re allowed to read articles describing how to tell strings apart, but may not use code that someone else has written, or something that is written in a format that may as well be code. (You may use code that someone else wrote for some other purpose, e.g. most libraries.) I’ll trust you in terms of where to draw the line between “algorithm description” and “pseudocode”, but the line I have in mind is something like: if it’s mostly text then it’s fine to look at and if it looks basically like code then it’s not.
Team policy: Teams of up to three people are allowed. However, if multiple members of your team participated in Round 1, as part of your submission you must indicate which Round 1 strings were submitted by your team members, and the weights of those strings will be reduced in Round 2 scoring (so that the total weight of your team’s strings is 1).
[Rules change 12/20: Previously the rules said that I would normalize your probabilities to add to 62 for incentives reasons. It was pointed out to me that there’s little or no incentives issue, so I no longer plan to do this.]
The deadline: Sunday, December 27th at 11:59 pm ET. However, I reserve the right to extend the deadline by one week. Specifically, I am hoping that the following things will happen by December 27th:
- There will be at least 10 submissions.
- There will be at least 3 submissions with a score of at least 15 (this is my cutoff for considering someone to have done a good job).
In the (I think unlikely) event that one of these doesn’t happen, I will probably extend the deadline.
Click here to submit your probabilities! (But I encourage you to read the details below.)
Round 2 scoring: Let’s say you submit a probability p for a string. If the string is truly random, your score will be
$$1 - 4(1-p)^2,$$
and if the string was submitted by someone in Round 1 then your score will be
$$1 - 4p^2.$$
Basically, this means that your score will be 0 no matter what if you say 50% for a string. The highest score you can get is 1 (if you say 100% and it’s truly random, or if you say 0% and it’s not truly random), and the lowest score you can get is -3 (if you say 100% and it’s not truly random, or the reverse). Your total score will be the sum of your scores for all the strings (weighted as per the team policy above). Note that if you’re going for maximizing your expected score, this scoring system incentivizes you to be honest.
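For anyone who wants to sanity-check their submission, here is a minimal Python sketch of the scoring described above. It assumes a quadratic rule, 1 - 4(1-p)^2 for a truly random string and 1 - 4p^2 otherwise, which matches the stated properties (0 at 50%, a maximum of 1, a minimum of -3). The function names and the optional `weights` argument are illustrative, not part of the official scoring code.

```python
def score(p: float, truly_random: bool) -> float:
    """Score for reporting probability p that a string is truly random.

    Assumed quadratic rule: 1 - 4(1-p)^2 if random, 1 - 4p^2 otherwise.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    miss = (1.0 - p) if truly_random else p
    return 1.0 - 4.0 * miss ** 2


def total_score(probs, labels, weights=None):
    """Weighted sum of per-string scores (weights default to 1)."""
    if weights is None:
        weights = [1.0] * len(probs)
    return sum(w * score(p, t) for p, t, w in zip(probs, labels, weights))


def expected_score(report: float, belief: float) -> float:
    """Expected score of reporting `report` when your true belief is `belief`."""
    return belief * score(report, True) + (1.0 - belief) * score(report, False)


# Honesty check: with a true belief of 0.7, search a grid of possible reports
# for the one that maximizes expected score.
grid = [i / 100 for i in range(101)]
best_report = max(grid, key=lambda r: expected_score(r, 0.7))
```

Under this assumed rule, the expected score for belief q and report p is q(1 - 4(1-p)^2) + (1-q)(1 - 4p^2), whose derivative in p is 8(q - p), so the maximum is at p = q. That is the sense in which reporting your honest probability is optimal.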
Round 1 scoring: If you participated in Round 1, your score for your Round 1 string will be an average of all probabilities assigned to your string by all Round 2 participants (excluding you), weighted by their Round 2 score. (Entries with negative scores won’t count for Round 1 scoring.) So basically, you’ll get a good score in Round 1 if you manage to trick Round 2 participants into thinking that your string is truly random, with higher weights assigned to Round 2 participants who did well. (The weighting feature of the scoring is new compared to what I wrote in the Round 1 post.)
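As a rough illustration of the Round 1 weighting, the sketch below computes the weighted average for a single string. The function name and input layout are hypothetical; it assumes the string's own author has already been removed from both lists, and that each participant's weight is simply their total Round 2 score.

```python
def round1_score(assigned_probs, round2_scores):
    """Weighted average of the probabilities given to one Round 1 string.

    assigned_probs[i]: probability participant i assigned to the string.
    round2_scores[i]:  participant i's total Round 2 score (used as the weight).
    Participants whose scores are not positive contribute nothing.
    """
    pairs = [(p, s) for p, s in zip(assigned_probs, round2_scores) if s > 0]
    total_weight = sum(s for _, s in pairs)
    if total_weight == 0:
        return None  # no positively scored participants: average is undefined
    return sum(p * s for p, s in pairs) / total_weight
```

For example, if three participants assign 0.8, 0.2, and 0.9 to a string and have Round 2 scores of 30, 10, and -5, the third is ignored and the string's score is (0.8·30 + 0.2·10) / 40 = 0.65.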
Note to people who participate in both rounds: You have a very slight advantage in Round 2 if you know your Round 1 string because you can get a free point by saying 0 for your string. You should have received an email (sent to the email you entered when you submitted your Round 1 entry) with your submission, so you should be able to figure out your string. If you didn’t receive the email, you can email me (see here for my email) and I’ll try to figure out your string.
Prizes: If you do well, you will be able to decide what charity I send some amount of money to (subject to my approval). Round 1 amounts will total to at least $50 (probably exactly $50). Round 2 amounts will total to at least $50, and at least $100 (quite possibly more) if the “10 submissions, 3 good submissions” criterion described above is met. I haven’t decided on the particulars of how I will award these prizes, but I hope you trust me to be impartial in such determinations.
Once more, here is the link to the binary strings, and here is the link to submit your Round 2 entry. Good luck and have fun!
(Questions? Comment below, or here if you want me to notice right away!)
6 comments
Comments sorted by top scores.
comment by Rafael Harth (sil-ver) · 2020-12-20T12:12:50.024Z · LW(p) · GW(p)
My thanks for making me practice python :-)
comment by SarahSrinivasan (GuySrinivasan) · 2020-12-24T23:13:40.364Z · LW(p) · GW(p)
I submitted an answer. It took me longer than I expected or originally wanted, but the side effects were pleasant.
comment by Ericf · 2020-12-20T21:08:29.945Z · LW(p) · GW(p)
Can you make a version available that converts all the numbers to letters (1=A and 0=B or vice versa)? It's a pain to try and copy them over into Excel without the long strings turning into 1.0011 e149 and losing data, and I bet I'm not the only one who would benefit.
Replies from: UnexpectedValues
comment by Eric Neyman (UnexpectedValues) · 2020-12-20T21:15:53.340Z · LW(p) · GW(p)
Thanks for the suggestion. I might do that later; in the meantime, the following should work for pasting the strings as text (at least on Windows).
- Format the cells you are planning to paste the strings in as "text". (Right click -> format -> text)
- Copy the strings
- Right click -> paste special -> values only (if you hover over the options you should be able to find that one) -> text.
comment by Eric Neyman (UnexpectedValues) · 2020-12-20T18:49:28.815Z · LW(p) · GW(p)
I've changed the rules to get rid of the normalization of probabilities clause, because it was pointed out to me that if someone says 0 to everything in an attempt to do well in Round 1, their Round 2 submission will receive a weight of 0 for Round 1 scoring anyway. It's still possible that there are some incentives issues here, but I doubt there's anything major, and I don't want to mess too much with what people submit.
comment by Multicore (KaynanK) · 2020-12-20T16:19:32.898Z · LW(p) · GW(p)
It seems reasonably possible to be confident that a string is human-generated, but if anyone did their job well in round 1, it probably won't be possible to be confident that a string is computer-generated.
Maybe some of the ones left over will seem slightly more or less random, but probably at some point I'll just have n strings left over and assign them all probability 62/n, adjusted for whatever uncertainty I had about the ones that seemed human-generated.