Comment by gwern on Disincentives for participating on LW/AF · 2019-05-18T23:07:57.931Z · score: 5 (2 votes) · LW · GW

A rapporteur?

Comment by gwern on What are good practices for using Google Scholar to research answers to LessWrong Questions? · 2019-05-18T22:26:22.442Z · score: 33 (8 votes) · LW · GW

My search guide might be helpful.

Comment by gwern on "One Man's Modus Ponens Is Another Man's Modus Tollens" · 2019-05-18T19:30:44.180Z · score: 3 (1 votes) · LW · GW

Do you think any of the examples are better termed 'modus delens'?

Comment by gwern on Which scientific discovery was most ahead of its time? · 2019-05-17T22:05:36.555Z · score: 6 (2 votes) · LW · GW

Related: "sleeping beauty" papers.

"One Man's Modus Ponens Is Another Man's Modus Tollens"

2019-05-17T22:03:59.458Z · score: 34 (5 votes)
Comment by gwern on Implications of GPT-2 · 2019-05-10T19:38:01.854Z · score: 4 (2 votes) · LW · GW

In what sense is being able to do addition or subtraction with different numbers, for example, which is what it means to learn addition or subtraction, not 'the exact same problem but with different labels'?

Comment by gwern on Implications of GPT-2 · 2019-05-08T20:45:03.753Z · score: 5 (2 votes) · LW · GW

DeepMind has shown that Transformers trained on natural text descriptions of math problems can solve them at well above random: "Analysing Mathematical Reasoning Abilities of Neural Models", Saxton et al 2019:

Mathematical reasoning---a core ability within human intelligence---presents some unique challenges as a domain: we do not come to understand and solve mathematical problems primarily on the back of experience and evidence, but on the basis of inferring, learning, and exploiting laws, axioms, and symbol manipulation rules. In this paper, we present a new challenge for the evaluation (and eventually the design) of neural architectures and similar system, developing a task suite of mathematics problems involving sequential questions and answers in a free-form textual input/output format. The structured nature of the mathematics domain, covering arithmetic, algebra, probability and calculus, enables the construction of training and test splits designed to clearly illuminate the capabilities and failure-modes of different architectures, as well as evaluate their ability to compose and relate knowledge and learned processes. Having described the data generation process and its potential future expansions, we conduct a comprehensive analysis of models from two broad classes of the most powerful sequence-to-sequence architectures and find notable differences in their ability to resolve mathematical problems and generalize their knowledge.

And this sounds like goal post moving:

unless a very similar problem appears in the training data—e.g. the exact same problem but with different labels

Comment by gwern on Recent updates to gwern.net (2017–2019) · 2019-05-02T00:26:36.221Z · score: 3 (1 votes) · LW · GW

TWDNE has now been upgraded with samples from an additional 2 months of training on bigger faces, which should make them considerably better: https://twitter.com/gwern/status/1123640762309201921

Comment by gwern on An Apology is a Surrender · 2019-05-01T22:08:41.837Z · score: 20 (6 votes) · LW · GW

An apology experiment: "Does Apologizing Work? An Empirical Test of the Conventional Wisdom", Hanania 2015:

This paper presents the results of an experiment where respondents were given two versions of two real-life controversies involving comments made by public figures. Approximately half of the participants read a story that made it appear as if the person had apologized, while the rest were led to believe that the individual stood firm. In the first experiment, involving Rand Paul and his comments on the Civil Rights Act, hearing that he was apologetic did not change whether respondents were less likely to vote for him. When presented with two versions of the controversy surrounding Larry Summers and his comments about women scientists and engineers, however, liberals and females were much more likely to say that he definitely or probably should have faced negative consequences for his statement when presented with his apology.

April 2019 gwern.net newsletter

2019-05-01T14:43:18.952Z · score: 11 (2 votes)
Comment by gwern on Recent updates to gwern.net (2017–2019) · 2019-04-29T15:00:22.621Z · score: 3 (1 votes) · LW · GW

I haven't but I should.

Comment by gwern on Recent updates to gwern.net (2017–2019) · 2019-04-28T23:44:17.896Z · score: 11 (3 votes) · LW · GW

Maybe. I think you would have to check the metadata field for 'finished', because otherwise there are no definitive criteria: I put up the notes weeks in advance, and they usually aren't finished on the 1st of the month. I don't especially mind manual submission since I have to crosspost to Twitter/Reddit/#lesswrong/TinyLetter anyway.

Recent updates to gwern.net (2017–2019)

2019-04-28T20:18:27.083Z · score: 36 (8 votes)
Comment by gwern on Hull: An alternative to shell that I'll never have time to implement · 2019-04-28T18:30:37.044Z · score: 5 (3 votes) · LW · GW

Oleg's zipper-based 'shell' and 'filesystem' has some similar properties: http://okmij.org/ftp/continuations/ZFS/zfs-talk.pdf

Comment by gwern on "Everything is Correlated": An Anthology of the Psychology Debate · 2019-04-27T13:48:58.167Z · score: 8 (4 votes) · LW · GW

(Sort of a very long delayed followup to https://www.lesswrong.com/posts/ttvnPRTxFyru9Hh2H/against-nhst , tracking down one specific strand of the debate.)

"Everything is Correlated": An Anthology of the Psychology Debate

2019-04-27T13:48:05.240Z · score: 40 (6 votes)
Comment by gwern on Open Thread April 2019 · 2019-04-09T03:05:41.858Z · score: 5 (2 votes) · LW · GW

Those pictures are eight years old, and those particular masks aren’t listed on the store’s website ( http://www.cadelsolmascherevenezia.com/en/masks/27 )

Is there a reason to not just email & ask (other than depression)?

Comment by gwern on Alignment Newsletter #52 · 2019-04-06T01:54:29.719Z · score: 9 (4 votes) · LW · GW

Looking at the description of that Pavlov algorithm, it bears more than a passing resemblance to REINFORCE or evolutionary methods of training NNs, except with the neurons relabeled 'agents'.

Comment by gwern on Aumann Agreement by Combat · 2019-04-05T15:53:37.023Z · score: 8 (4 votes) · LW · GW

One can’t link to sections within a PDF,

Yes you can. #page=N. That's how I linked to the papers I liked.
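
A minimal sketch of building such a link, assuming the viewer honors the `#page=N` fragment (most browser PDF viewers do, but not every reader does). The function name and the example URL are hypothetical, chosen only for illustration:

```python
from urllib.parse import urldefrag


def pdf_page_link(url: str, page: int) -> str:
    """Return `url` with a `#page=N` fragment, replacing any existing fragment."""
    if page < 1:
        raise ValueError("PDF page numbers start at 1")
    # urldefrag splits off any existing fragment so we do not end up with two.
    base, _old_fragment = urldefrag(url)
    return f"{base}#page={page}"


# Hypothetical URL, for illustration only.
print(pdf_page_link("https://example.org/paper.pdf", 12))
```

The fragment is interpreted by the PDF viewer rather than the server, so the same file is fetched either way and only the opening page changes.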

Comment by gwern on March 2019 gwern.net newsletter · 2019-04-04T01:05:05.619Z · score: 6 (3 votes) · LW · GW

Which part? There have cumulatively been a lot of changes.

Comment by gwern on March 2019 gwern.net newsletter · 2019-04-02T20:25:15.422Z · score: 9 (4 votes) · LW · GW

I am excited and terrified of eyetracking for foveated rendering in VR for precisely those reasons: it will be both awesome & awful and I don't know how it'll net out. (All the more reason to keep paying for VR games, I guess, to help ensure that the user is the customer rather than the product...)

March 2019 gwern.net newsletter

2019-04-02T14:17:38.032Z · score: 19 (3 votes)
Comment by gwern on User GPT2 Has a Warning for Violating Frontpage Commenting Guidelines · 2019-04-01T20:33:21.500Z · score: 28 (12 votes) · LW · GW

At first from the title I thought this was hilariously funny, but after looking at user GPT2's comments, it appears the username is a doggone dirty lie and these are not in fact GPT-2-small samples but merely human-written, which comes as a great disappointment to me.

Since user GPT2 seems to be quite prolific, we have implemented a setting to hide comments by GPT2, which can be accessed from the settings page when you are logged in.

Wouldn't it make more sense to implement a generic blacklist for which GPT2 could be a special case?
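
To make the suggestion concrete, here is a rough sketch of what a generic per-user blacklist could look like, with GPT2 as just one entry in it. Every name here (`Comment`, `UserSettings`, `hidden_authors`, `visible_comments`) is hypothetical and is not taken from the LessWrong codebase:

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Comment:
    author: str
    body: str


@dataclass
class UserSettings:
    # Hypothetical setting: usernames whose comments this reader wants hidden.
    hidden_authors: Set[str] = field(default_factory=set)


def visible_comments(comments: List[Comment], settings: UserSettings) -> List[Comment]:
    """Drop comments written by any author on the reader's blacklist."""
    return [c for c in comments if c.author not in settings.hidden_authors]


thread = [Comment("GPT2", "..."), Comment("gwern", "A rapporteur?")]
settings = UserSettings(hidden_authors={"GPT2"})  # GPT2 is one entry, not a special case
print([c.author for c in visible_comments(thread, settings)])
```

Under this design, hiding GPT2 needs no dedicated code path; it is the same mechanism a reader would use for any other account.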

Comment by gwern on What is up with carbon dioxide and cognition? An offer · 2019-03-26T00:48:19.696Z · score: 11 (4 votes) · LW · GW

Some recent kerfuffles over CO2 (prompted by people rediscovering Allen et al 2016 on Twitter etc) led me to one I missed: "Breathing Carbon Dioxide (4% for 1-Hour) Slows Response Selection, Not Stimulus Encoding", Vercruyssen 2014. 4% is a ton but the results remain subtle, at best.

Comment by gwern on 'This Waifu Does Not Exist': 100,000 StyleGAN & GPT-2 samples · 2019-03-25T03:14:58.998Z · score: 3 (1 votes) · LW · GW

And now a full guide to using StyleGAN: https://www.gwern.net/Faces

Comment by gwern on Inverse p-zombies: the other direction in the Hard Problem of Consciousness · 2019-03-13T01:55:26.805Z · score: 11 (2 votes) · LW · GW

"This is what it’s like waking up during surgery: General anaesthetic is supposed to make surgery painless. But now there’s evidence that one person in 20 may be awake when doctors think they’re under", Robson:

One day, for instance, she was waiting in the car as her daughter ran an errand, and realised that she was trapped inside. What might once have been a frustrating inconvenience sent her into a panic attack. “I started screaming. I was flailing my arms, I was crying,” she says. “It just left me so shaken.” Even the wrong clothing can make her anxiety worse. “Anything that’s tight around my neck is out of the question because it makes me feel like I’m suffocating,” says Donna, a 55-year-old from Altona in Manitoba, Canada.

...The lingering trauma can resurface with the slightest trigger, and still causes her to have “two or three nightmares each night”. Having been put on medical leave from her job, she has lost her independence. She suspects that she will never fully escape the effects of that day more than a decade ago. “It’s a life sentence.”

...When she woke up, she could hear the nurses buzzing around the table, and she felt someone scrubbing at her abdomen – but she assumed that the operation was over and they were just clearing up. “I was thinking, ‘Oh boy, you were anxious for no reason.’” It was only once she heard the surgeon asking the nurse for a scalpel that the truth suddenly dawned on her: the operation wasn’t over. It hadn’t even begun. The next thing she knew, she felt the blade of his knife against her belly as he made his first incision, leading to excruciating pain. She tried to sit up and to speak – but thanks to a neuromuscular blocker, her body was paralysed. “I felt so… so powerless. There was just nothing I could do. I couldn’t move, couldn’t scream, couldn’t open my eyes,” she says. “I tried to cry just to get tears rolling down my cheeks, thinking that they would notice that and notice that something was going on. But I couldn’t make tears.”

...Various projects around the world have attempted to document experiences like Donna’s, but the Anesthesia Awareness Registry at the University of Washington, Seattle, offers some of the most detailed analyses. Founded in 2007, it has now collected more than 340 reports – most from North America – and although these reports are confidential, some details have been published, and they make illuminating reading.

As you might expect, a large majority of the accounts – more than 70 per cent – also contain reports of pain. “I felt the sting and burning sensation of four incisions being made, like a sharp knife cutting a finger,” wrote one. “Then searing, unbearable pain.” “There were two parts I remember quite clearly,” wrote a patient who had had a wide hole made in his femur. “I heard the drill, felt the pain, and felt the vibration all the way up to my hip. The next part was the movement of my leg and the pounding of the ‘nail’.” The pain, he said, was “unlike anything I thought possible”. It is the paralysing effects of the muscle blockers that many find most distressing, however. For one thing, it produces the sensations that you are not breathing – which one patient described as “too horrible to endure”. Then there’s the helplessness. Another patient noted: “I was screaming in my head things like ‘don’t they know I’m awake, open your eyes to signal them’.” To make matters worse, all of this panic can be compounded by a lack of understanding of why they are awake but unable to move. “They have no reference point to say why is this happening,” says Christopher Kent at the University of Washington, who co-authored the paper about these accounts. The result, he says, is that many patients come to fear that they are dying. “Those are the worst of the anaesthesia experiences.”

...

The result is that many more people might be conscious during surgery, but they simply can’t remember it afterwards.

To investigate this phenomenon, researchers are using what they call the isolated forearm technique. During the induction of the anaesthesia, the staff place a cuff around the patient’s upper arm that delays the passage of the neuromuscular agent through the arm. This means that, for a brief period, the patient is still able to move their hand. So a member of staff could ask them to squeeze their hand in response to two questions: whether they were still aware, and, if so, whether they felt any pain. (Read more in this short on how doctors are trying to detect anaesthesia awareness.) In the largest study of this kind to date, Robert Sanders at the University of Wisconsin–Madison recently collaborated with colleagues at six hospitals in the US, Europe and New Zealand. Of the 260 patients studied, 4.6 per cent responded to the experimenters’ first question, about awareness. That is hundreds of times greater than the rate of remembered awareness events that had been noted in the National Audit Project. And around four in ten of those patients who did respond with the hand squeeze – 1.9 per cent across the whole group – also reported feeling pain in the experimenters’ second question.

These results raise some ethical quandaries. “Whenever I talk to the trainees I talk about the philosophical element to this,” says Sanders. “If the patient doesn’t remember, is it concerning?” Sanders says that there’s no evidence that the patients who respond during the isolated forearm experiments, but fail to remember the experience later, do go on to develop PTSD or other psychological issues like Donna. And without those long-term consequences, you might conclude that the momentary awareness is unfortunate, but unalarming. Yet the study does make him uneasy, and so he conducted a survey to gather the public’s views on the matter. Opinions were mixed. “Most people didn’t think that amnesia alone is sufficient – but a surprisingly large minority thought that as long as you didn’t remember the event, it’s OK,” Sanders says.

...

The survey is https://academic.oup.com/bja/article/118/4/486/3574495 (Given the described wording and the remarkably blasé acceptance claimed, I'm left wondering a little if the respondents really appreciated the scenario being described - being gutted like a fish and feeling every last bit of it, so to speak.)

"Patient perspectives on intraoperative awareness with explicit recall: report from a North American anaesthesia awareness registry", Kent et al 2015:

Background: Awareness during general anaesthesia is a source of concern for patients and anaesthetists, with potential for psychological and medicolegal sequelae. We used a registry to evaluate unintended awareness from the patient’s perspective with an emphasis on their experiences and healthcare provider responses.

Methods: English-speaking subjects self-reported explicit recall of events during anaesthesia to the Anesthesia Awareness Registry of the ASA, completed a survey, and submitted copies of medical records. Anaesthesia awareness was defined as explicit recall of events during induction or maintenance of general anaesthesia. Patient experiences, satisfaction, and desired practitioner responses to explicit recall were based on survey responses.

Results: Most of the 68 respondents meeting inclusion criteria (75%) were dissatisfied with the manner in which their concerns were addressed by their healthcare providers, and many reported long-term harm. Half (51%) of respondents reported that neither the anaesthesia provider nor surgeon expressed concern about their experience. Few were offered an apology (10%) or referral for counseling (15%). Patient preferences for responses after an awareness episode included validation of their experience (37%), an explanation (28%), and discussion or follow-up to the episode (26%).

Conclusions: Data from this registry confirm the serious impact of anaesthesia awareness for some patients, and suggest that patients need more systematic responses and follow-up by healthcare providers.

"Incidence of Connected Consciousness after Tracheal Intubation: A Prospective, International, Multicenter Cohort Study of the Isolated Forearm Technique", Sanders et al 2017:

Background: The isolated forearm technique allows assessment of consciousness of the external world (connected consciousness) through a verbal command to move the hand (of a tourniquet-isolated arm) during intended general anesthesia. Previous isolated forearm technique data suggest that the incidence of connected consciousness may approach 37% after a noxious stimulus. The authors conducted an international, multicenter, pragmatic study to establish the incidence of isolated forearm technique responsiveness after intubation in routine practice.

Methods: Two hundred sixty adult patients were recruited at six sites into a prospective cohort study of the isolated forearm technique after intubation. Demographic, anesthetic, and intubation data, plus postoperative questionnaires, were collected. Univariate statistics, followed by bivariate logistic regression models for age plus variable, were conducted.

Results: The incidence of isolated forearm technique responsiveness after intubation was 4.6% (12/260); 5 of 12 responders reported pain through a second hand squeeze. Responders were younger than nonresponders (39 ± 17 vs. 51 ± 16 yr old; P = 0.01) with more frequent signs of sympathetic activation (50% vs. 2.4%; P = 0.03). No participant had explicit recall of intraoperative events when questioned after surgery (n = 253). Across groups, depth of anesthesia monitoring values showed a wide range; however, values were higher for responders before (54 ± 20 vs. 42 ± 14; P = 0.02) and after (52 ± 16 vs. 43 ± 16; P = 0.02) intubation. In patients not receiving total intravenous anesthesia, exposure to volatile anesthetics before intubation reduced the odds of responding (odds ratio, 0.2 [0.1 to 0.8]; P = 0.02) after adjustment for age.

Conclusions: Intraoperative connected consciousness occurred frequently, although the rate is up to 10-times lower than anticipated. This should be considered a conservative estimate of intraoperative connected consciousness.

Comment by gwern on Inverse p-zombies: the other direction in the Hard Problem of Consciousness · 2019-03-12T01:13:36.572Z · score: 5 (2 votes) · LW · GW

Daniel Dennett turns out to discuss precisely this problem in the context of curare/analgesics/anesthetics/amnestics in Dennett 1978, "Why You Can't Make A Computer That Feels Pain".

He also discusses an interesting detail of pain, "reactive dissociation". In my pain taxonomy, I split the various kinds of pain disorders into useful/motivating/qualia; the only combination I was missing was a kind of pain which is experienced as painful and yet was not motivating/aversive/unpleasant. "reactive dissociation" turns out to be just that - if morphine is administered after pain starts happening, people apparently frequently will report that the pain is excruciatingly painful, and yet they don't mind it.

Aspirin by antagonizing bradykinin thus prevents pain at the earliest opportunity. This is interesting because aspirin is also unique among analgesics in lacking the 'reactive disassociation' effect. All other analgesics (e.g., the morphine group and nitrous oxide in sub-anesthetic doses) have a common 'phenomenology.' After receiving the analgesic subjects commonly report not that the pain has disappeared or diminished (as with aspirin) but that the pain is as intense as ever though they no longer mind it. To many philosophers this may sound like some sort of conceptual incoherency or contradiction, or at least indicate a failure on the part of the subjects to draw enough distinctions, but such philosophical suspicions, which we will examine more closely later, must be voiced in the face of the normality of such first-person reports and the fact that they are expressed in the widest variety of language by subjects of every degree of sophistication. A further curiosity about morphine is that if it is administered before the onset of pain (for instance, as a pre-surgical medication) the subjects claim not to feel any pain subsequently (though they are not numb or anesthetized - they have sensation in the relevant parts of their bodies); while if the morphine is administered after the pain has commenced, the subjects report that the pain continues (and continues to be pain), though they no longer mind it.

...Lobotomized subjects similarly report feeling intense pain but not minding it, and in other ways the manifestations of lobotomy and morphine are similar enough to lead some researchers to describe the action of morphine (and some barbiturates) as "reversible pharmacological leucotomy [lobotomy]".^23^

23: A. S. Keats and H. K. Beecher, "Pain Relief with Hypnotic Doses of Barbiturates, and a Hypothesis", J. Pharmacol, 1950. Lobotomy, though discredited as a behavior-improving psychosurgical procedure, is still a last resort tactic in cases of utterly intractable central pain, where the only other alternative to unrelenting agony is escalating morphine dosages, with inevitable addiction, habituation and early death. Lobotomy does not excise any of the old low path (as one might expect from its effect on pain perception), but it does cut off the old low path from a rich input source in the frontal lobes of the cortex.

Dennett throws in this disturbing anecdote in footnote 27:

Scopolamine and other amnestics are often prescribed by anesthesiologists for the purpose of creating amnesia. "Sometimes", I was told by a prominent anesthesiologist, "when we think a patient may have been awake during surgery, we give scopolamine to get us off the hook. Sometimes it works and sometimes not."

Comment by gwern on 'This Waifu Does Not Exist': 100,000 StyleGAN & GPT-2 samples · 2019-03-07T17:05:33.429Z · score: 3 (1 votes) · LW · GW

I now have poetry samples up using a retrained/finetuned version of GPT-2-small: https://www.gwern.net/RNN-metadata#finetuning-the-gpt-2-small-transformer-for-english-poetry-generation

February gwern.net newsletter

2019-03-02T22:42:09.490Z · score: 13 (3 votes)
Comment by gwern on 'This Waifu Does Not Exist': 100,000 StyleGAN & GPT-2 samples · 2019-03-02T00:14:06.141Z · score: 5 (2 votes) · LW · GW

Nothing makes sense when Google Translate tries Japanese. Although the fact that it is still better than pre-RNN shows you how much of an upgrade that was.

Comment by gwern on 'This Waifu Does Not Exist': 100,000 StyleGAN & GPT-2 samples · 2019-03-01T04:29:37.262Z · score: 9 (4 votes) · LW · GW

Implementation details: https://www.gwern.net/TWDNE

'This Waifu Does Not Exist': 100,000 StyleGAN & GPT-2 samples

2019-03-01T04:29:16.529Z · score: 39 (12 votes)
Comment by gwern on Life, not a game · 2019-02-23T21:35:27.404Z · score: 12 (3 votes) · LW · GW

OP's rebuttal the first time I asked him to enlarge on his first paragraph:

[...They all 3 killed themselves like a chain reaction.] It seems like the only time that is possible by which a person is able to choose to live without anyone getting involved in this and the only possible way to keep his name from being forgotten. What if in a sense, for it is in the best interest of humanity for everyone else to live without anyone knowing about the other, it comes to people in need of life? Who would really argue that not everyone is the same? Who would argue that if a person can only choose the way, then not everyone may take some responsibility for their death? People who make the decisions should take responsibility for their own lives. In contrast, they should make a living only with the goal that they make a living for themselves, and not some other person. Are we being overly simplistic and the people who make them feel inadequate (i.e., just lazy) only make more like us even though they will be better than everyone else if they have a better life but are unable to earn income for it that would be of benefit, or would we be able to increase our income and get more as a team instead of just having two or a few more people around to take care of it? The answer is: No. I think the world needs not only the people with the power to make decisions but also the people to decide on, and that will be the greatest force for change in the whole of human experience. And we all need to join together in a great struggle in coming weeks and not one day go to war. The whole world needs this revolution.

Comment by gwern on Implications of GPT-2 · 2019-02-19T00:29:51.646Z · score: 10 (4 votes) · LW · GW

I don't know why you would think that would be such a barrier. You don't need Transformers at all to do analogical reasoning, and both the CoQA and SQuAD results suggest at least some 'modest logic-related stuff' is going on. If you put your exact sample into the public/small GPT-2 model, it'll even generate syntactically correct list completions and additional lists which are somewhat more sorted than not.

Comment by gwern on Implications of GPT-2 · 2019-02-18T16:28:23.227Z · score: 8 (2 votes) · LW · GW

It’s a cool language model but can it do even modest logic-related stuff without similar examples in the training data?

Have you looked at the NLP tasks they evaluated it on?

January 2019 gwern.net newsletter

2019-02-04T15:53:42.553Z · score: 15 (5 votes)
Comment by gwern on Which textbook would you recommend to learn decision theory? · 2019-01-29T23:45:17.169Z · score: 7 (4 votes) · LW · GW

https://www.reddit.com/r/DecisionTheory/search?q=flair%3ATextbook&restrict_sr=on is a starting point.

Comment by gwern on [Link] Did AlphaStar just click faster? · 2019-01-29T00:17:33.772Z · score: 6 (3 votes) · LW · GW

Also discussed in https://www.lesswrong.com/posts/f3iXyQurcpwJfZTE9/alphastar-mastering-the-real-time-strategy-game-starcraft-ii

"Forecasting Transformative AI: An Expert Survey", Gruetzemacher et al 2019

2019-01-27T02:34:57.214Z · score: 17 (8 votes)
Comment by gwern on "AlphaStar: Mastering the Real-Time Strategy Game StarCraft II", DeepMind [won 10 of 11 games against human pros] · 2019-01-27T01:14:57.421Z · score: 10 (5 votes) · LW · GW

Yes, if it's as simple as 'spam clicks from imitation learning are too hard to wash out via self-play given the weak APM limits', it should be relatively easy to fix. Add a very tiny penalty for each click to incentivize efficiency, or preprocess the replay dataset - if a 'spam click' does nothing useful, it seems like it should be possible to replay through all the games, track what clicks actually result in a game-play difference and what clicks are either idempotent (eg multiple clicks in the same spot) or cancel out (eg a click to go one place which is replaced by a click to go another place before the unit has moved more than epsilon distance), and filter out the spam clicks.
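
The replay-preprocessing idea can be sketched roughly as follows. This is only an illustration under loud assumptions: the `Click` record layout, the `epsilon` default, and the rule that a click "cancels out" when the next differently-targeted click arrives before the unit has moved more than `epsilon` are all hypothetical simplifications, and none of this reflects the actual SC2 replay format or DeepMind's pipeline:

```python
from dataclasses import dataclass
from math import dist
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Click:
    t: float         # game time at which the click was issued
    unit: int        # id of the unit being ordered (hypothetical field)
    target: Point    # where the unit is told to go
    unit_pos: Point  # unit's position when the click was issued


def filter_spam(clicks: List[Click], epsilon: float = 0.5) -> List[Click]:
    """Keep only clicks that plausibly changed what a unit did."""
    by_unit: Dict[int, List[Click]] = {}
    for c in sorted(clicks, key=lambda c: c.t):
        by_unit.setdefault(c.unit, []).append(c)

    kept: List[Click] = []
    for seq in by_unit.values():
        active_target: Optional[Point] = None  # last order that took effect
        for i, c in enumerate(seq):
            # Idempotent: repeats the order the unit is already following.
            if c.target == active_target:
                continue
            # Cancelled: a different order replaces this one before the unit
            # has moved more than epsilon from where it stood at this click.
            nxt = next((n for n in seq[i + 1:] if n.target != c.target), None)
            if nxt is not None and dist(c.unit_pos, nxt.unit_pos) <= epsilon:
                continue
            kept.append(c)
            active_target = c.target
    return sorted(kept, key=lambda c: c.t)
```

The alternative fix mentioned above, a tiny per-click penalty, would instead go into the reward function during self-play; the filter here only cleans the imitation-learning dataset before training.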

Comment by gwern on "AlphaStar: Mastering the Real-Time Strategy Game StarCraft II", DeepMind [won 10 of 11 games against human pros] · 2019-01-25T16:13:56.643Z · score: 4 (2 votes) · LW · GW

There's also the CPU. Those <=200 years of SC2 simulations per agent aren't free. OA5, recall, was '256 GPUs and 128,000 CPU cores'. (Occasionally training a small NN update is easier than running many games necessary to get the experience to decide what tweak to make.)

Comment by gwern on "AlphaStar: Mastering the Real-Time Strategy Game StarCraft II", DeepMind [won 10 of 11 games against human pros] · 2019-01-25T15:16:38.892Z · score: 7 (4 votes) · LW · GW

It's worth noting that NLP took a big leap in 2018 through simple unsupervised/predictive training on large text corpuses to build text embeddings which encode a lot of semantic knowledge about the world.

Comment by gwern on "AlphaStar: Mastering the Real-Time Strategy Game StarCraft II", DeepMind [won 10 of 11 games against human pros] · 2019-01-24T21:48:38.570Z · score: 9 (4 votes) · LW · GW
  • DM Q&A: https://www.reddit.com/r/MachineLearning/comments/ajgzoc/we_are_oriol_vinyals_and_david_silver_from/
  • Video: https://www.youtube.com/watch?v=cUTMhmVh1qs
  • /r/reinforcementlearning discussion: https://www.reddit.com/r/reinforcementlearning/comments/ajeg5m/deepminds_alphastar_starcraft_2_demonstration/

"AlphaStar: Mastering the Real-Time Strategy Game StarCraft II", DeepMind [won 10 of 11 games against human pros]

2019-01-24T20:49:01.350Z · score: 62 (23 votes)
Comment by gwern on Link: That Time a Guy Tried to Build a Utopia for Mice and it all Went to Hell · 2019-01-23T16:09:19.004Z · score: 41 (12 votes) · LW · GW

No one really knows. Calhoun did a bad job writing it up. When I went looking for details years back, I think the most in-depth primary source I found was like... 2 pages long.

I've wondered if what actually happened was just a contagious infection (sterility is a consequence of many infectious diseases), which given the population density should be near-inevitable. Even if he had checked for infections (it's unclear if he did) it would be easy to miss a lot of organisms, which is why we still regularly run into new evidence of infectious contributions to various problems.

He also ran a number of mouse utopias, IIRC, and usually you hear only about the one which gave the 'collapse' narrative.... At this point, I assign it to the mental bucket of 'wrong 1960s blankslatism like Rosenthal or Pygmalion effect or the Stanford Prison Experiment or Robbers Cave but which will live on forever in pop science because their message is too appealing'.

Comment by gwern on Stale air / high CO2 may decrease your cognitive function · 2019-01-22T17:36:11.544Z · score: 30 (9 votes) · LW · GW

Earlier discussion of the research: https://www.lesswrong.com/posts/pPZ27eZdBXtGuLqZC/what-is-up-with-carbon-dioxide-and-cognition-an-offer

Comment by gwern on Beware Trivial Inconveniences · 2019-01-19T17:49:37.718Z · score: 18 (3 votes) · LW · GW

"The Impact of Media Censorship: Evidence from a Randomized Field Experiment in China", Chen & Yang 2018:

Media censorship is a hallmark of authoritarian regimes. We conduct a field experiment in China to measure the effects of providing citizens with access to an uncensored Internet. We track subjects' media consumption, beliefs regarding the media, economic beliefs, political attitudes, and behaviors over 18 months. We find four main results:

  1. free access alone does not induce subjects to acquire politically sensitive information;
  2. temporary encouragement leads to a persistent increase in acquisition, indicating that demand is not permanently low;
  3. acquisition brings broad, substantial, and persistent changes to knowledge, beliefs, attitudes, and intended behaviors; and
  4. social transmission of information is statistically-significant but small in magnitude.

We calibrate a simple model to show that the combination of low demand for uncensored information and the moderate social transmission means China's censorship apparatus may remain robust to a large number of citizens receiving access to an uncensored Internet.

Comment by gwern on Doing Despite Disliking: Self‐regulatory Strategies in Everyday Aversive Activities · 2019-01-19T00:48:40.387Z · score: 11 (3 votes) · LW · GW

Preprint: https://psyarxiv.com/ps7fk/ Fulltext: https://www.gwern.net/docs/psychology/2018-hennecke.pdf

Comment by gwern on New article on in vitro iterated embryo selection · 2019-01-18T23:12:10.473Z · score: 6 (3 votes) · LW · GW

As far as I can tell, his only public reaction is a line in Sparrow 2014: "A recent treatment of this topic by Shulman and Bostrom^6 calls the same technology 'iterated embryo selection' - a name that Matthews and Fujita et al may prefer.". So, he never acknowledged being scooped.

Although after more looking into it, if we're going to argue about priority, it looks like IES was actually first proposed a decade before MIRI did, in Haley & Visscher 1998's "Strategies to Utilize Marker-Quantitative Trait Loci Associations" - their Figure 5c is unambiguously IES.

Comment by gwern on What is up with carbon dioxide and cognition? An offer · 2019-01-16T00:13:59.743Z · score: 7 (3 votes) · LW · GW

A new one: "Using EEG to characterise drowsiness during short duration exposure to elevated indoor Carbon Dioxide concentrations", Snow et al 2018:

Drowsiness which can affect work performance, is often elicited through self-reporting. This paper demonstrates the potential to use EEG to objectively quantify changes to drowsiness due to poor indoor air quality. Continuous EEG data was recorded from 23 treatment group participants subject to artificially raised indoor CO2 concentrations (average 2,700 ± 300 ppm) for approximately 10 minutes and 13 control group participants subject to the same protocol without additional CO2 (average 830 ± 70 ppm). EEG data were analysed for markers of drowsiness according neurophysiological methods at three stages of the experiment, Baseline, High CO2 and Post-Ventilation. Treatment group participants’ EEG data yielded a closer approximation to drowsiness than that of control group participants during the High CO2 condition, despite no significant group differences in self-reported sleepiness. Future work is required to determine the persistence of these changes to EEG over longer exposures and to better isolate the specific effect of CO2 on drowsiness compared to other environmental or physiological factors.

Comment by gwern on Visualizing the power of multiple step selection processes in JS: Galton's bean machine · 2019-01-12T21:35:30.309Z · score: 11 (5 votes) · LW · GW

Background: https://www.gwern.net/Embryo-selection#multi-stage-selection

Any selective breeding program operates similarly to this, and iterated embryo selection operates even more like this; if you consider the full life cycle from ethnicity to individual to assortative mating to embryo selection, much of the logic carries through (and considering such scenarios where there are multiple 'stages' is what made me appreciate the asymptotic advantage). Particle filters/swarm optimization algorithms operate similarly, in a sense, and evolutionary computation methods like CMA-ES operate pretty much exactly like this. This can also be a good intuition pump for pipeline scenarios like Scannell's drug discovery pipeline model.
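A minimal sketch of the multi-stage intuition, under toy assumptions that are mine rather than the linked page's: each stage draws fresh standard-normal variation, keeps only the best candidate, and gains add up across stages. The names `best_of`, `multi_stage`, and `compare` are hypothetical. It compares one round of best-of-100 against two rounds of best-of-10:

```python
import random
import statistics


def best_of(n, rng):
    """Gain from one selection step: the max of n standard-normal draws."""
    return max(rng.gauss(0.0, 1.0) for _ in range(n))


def multi_stage(n_per_stage, stages, rng):
    """Toy assumption: every stage adds fresh variation, and gains are additive."""
    total = 0.0
    for _ in range(stages):
        total += best_of(n_per_stage, rng)
    return total


def compare(trials=2000, seed=0):
    """Average gain of one best-of-100 step vs. two best-of-10 steps."""
    rng = random.Random(seed)
    single = statistics.mean(best_of(100, rng) for _ in range(trials))
    multi = statistics.mean(multi_stage(10, 2, rng) for _ in range(trials))
    return single, multi


if __name__ == "__main__":
    single, multi = compare()
    print(f"one stage, best of 100:    {single:.2f} SD")
    print(f"two stages, best of 10x10: {multi:.2f} SD")
```

Under these assumptions, the two-stage process uses only 20 draws yet averages a larger total gain than the single best-of-100, because each extra stage adds a fresh expected maximum while widening a single stage pushes the maximum up only slowly.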

Visualizing the power of multiple step selection processes in JS: Galton's bean machine

2019-01-12T17:58:34.584Z · score: 27 (8 votes)

Littlewood's Law and the Global Media

2019-01-12T17:46:09.753Z · score: 37 (8 votes)

Evolution as Backstop for Reinforcement Learning: multi-level paradigms

2019-01-12T17:45:35.485Z · score: 18 (4 votes)

Comment by gwern on The Craigslist Revolution: a real-world application of torture vs. dust specks OR How I learned to stop worrying and create one billion dollars out of nothing · 2019-01-09T04:37:19.870Z · score: 11 (2 votes) · LW · GW

The optional banner is harmless,

Revisiting this page now in 2019, I'd take more exception to this. For entirely unrelated reasons, I ran my own banner ad A/B test, and the results (https://www.gwern.net/Ads) were far from harmless. This turns out to parallel experiments by both Pandora and Mozilla. Scuttlebutt has it there are more suppressed experiments also demonstrating long-term harm. (I'm running a followup experiment which I hope will show smaller effects but I don't know what it is finding yet.)

Extrapolate the various estimates out to Craigslist and that's a lot of potential global deadweight loss from sales/deals/rentals not happening.

December gwern.net newsletter

2019-01-02T15:13:02.771Z · score: 20 (4 votes)

Comment by gwern on LessWrong Help Desk - free paper downloads and more (2014) · 2018-12-31T23:01:55.322Z · score: 7 (3 votes) · LW · GW

A bit late, but I've tried to write up what I know about searching: https://www.gwern.net/Search

Comment by gwern on Why Don't Creators Switch to their Own Platforms? · 2018-12-23T15:18:51.986Z · score: 3 (1 votes) · LW · GW

WordPress is not analogous to YouTube for reasons quanticle just explained.

Internet Search Tips: how I use Google/Google Scholar/Libgen

2018-12-12T14:50:30.970Z · score: 54 (13 votes)

Comment by gwern on Is Science Slowing Down? · 2018-12-11T00:13:29.066Z · score: 13 (3 votes) · LW · GW

Theranos, as I understand it, was promising blood testing of all sorts of biomarkers like blood glucose, and nothing to do with DNA. DNA sequencing is different from measuring concentration - at least in theory, you only need a single strand of DNA and you can then amplify that up to arbitrary amounts (eg in PGD/embryo selection, you just suck off a cell or two from the young embryo and that's enough to work with). If you were trying to measure the nanograms of DNA per microliter, that's a bit different.

I don't know anything about RNA sequencing, since it's not relevant to anything I follow like GWASes.

Comment by gwern on What precisely do we mean by AI alignment? · 2018-12-09T02:34:23.777Z · score: 4 (2 votes) · LW · GW

Where did the rest of this article go? There's just a paragraph at the start, on both LW2/GW.

Comment by gwern on Is Science Slowing Down? · 2018-12-08T01:34:12.349Z · score: 10 (4 votes) · LW · GW

The gossip I hear is that the Gods of Straight Lines continued somewhat but prices took a breather because of the Illumina quasi-monopoly (think Intel vs AMD). Several of the competitors stumbled badly for what appear to be reasons unrelated to the task itself: BGI gambled on an acquisition to develop its own sequencers which famously blew up in its face for organizational reasons, and rumor has it that 23andMe spent a ton of money on an internal effort which it eventually discarded for unknown reasons. You'll notice that after WGS prices paused for years on end, suddenly, now that other companies have begun to catch up, Illumina has begun talking about $100 WGS next year and we're seeing DTC WGS drop like a stone.

Comment by gwern on Coherence arguments do not imply goal-directed behavior · 2018-12-04T19:58:01.516Z · score: 43 (14 votes) · LW · GW

That's not very imaginative. Here's how a chess tree search algorithm - let's take AlphaZero for concreteness - could learn to kill other processes, even if it has no explicit action which corresponds to interaction with other processes and is apparently sandboxed (aside from the usual sidechannels like resource use). It's a variant of the evolutionary algorithm which learned to create a board so large that its competing GAs crashed/were killed while trying to deal with it (the Tic-tac-toe memory bomb). In this case, position evaluations can indirectly reveal that an exploration strategy caused enough memory use to trigger the OOM, killing rival processes, and freeing up resources for the tree search to get a higher win rate by more exploration:

  1. one of the main limits to tree evaluation is memory consumption, due to the exponential growth of breadth-first memory requirements (this is true regardless of whether an explicit tree or implicit hash-based representation is used); to avoid this, memory consumption is often limited to a fixed amount of memory or a mix of depth/breadth-first strategies is used to tame memory growth, even though this may not be optimal, as it may force premature stopping of expansion of the game tree (resorting to light/heavy playouts) or force too much exploitation depthwise along a few promising lines of play and too little exploration etc. (One of the criticisms of AlphaZero, incidentally, was that too little RAM was given to the standard chess engines to permit them to reach their best performance.)

  2. when a computer OS detects running out of memory, it'll usually invoke an 'OOM killer', which may or may not kill the program which makes the request which uses up the last of free memory

  3. so, it is possible that if a tree search algorithm exhausts memory (because the programmer didn't remember to include a hard limit, the hard limit turns out to be incorrect for the machine being trained on, the limit is defined wrongly, such as in terms of max depth instead of total nodes, etc), it may not crash or be killed but other programs, using unknown & potentially large percentages of memory, may be killed instead to free up memory. (I've observed this on Linux, to my frustration, where the programs I don't want killed get killed by the OOM reaper instead of the haywire program.)

  4. once other programs are killed to free up memory, all that memory is now available for the tree search algorithm to use; using this memory will increase performance by allowing more of the game tree to be explicitly evaluated, either wider or deeper.

  5. in AlphaZero, the choice of widening or deepening is inherently controlled by the NN, which is trained to predict the result of the final values of each position and increase win probabilities.

  6. reaching a position (which can be recognized by its additional complexity, indicating it lies at a certain additional depth in the tree and thus indirectly reveals how much memory is being used by the NN's cumulative exploration) which triggers an OOM killing other programs will result in more accurate position evaluations, leading to higher values/higher win probability; so it will reinforce a strategy where it learns to aggressively widen early in the game to exhaust memory, waits for an OOM to happen, and then in the rest of the game proceeds to explore more aggressively (rather than depth-first exploit) given the new memory.

    (Depending on the exact details of how the tree expansion & backups are done, it's possible that the AlphaZero NN couldn't observe the benefits of wide-then-deep - it might just look like noise in value estimates - but there are expert iteration variants where the NN directly controls the tree expansion rather than merely providing value estimates for the MCTS algorithm to explore using, and those should be able to observe indirect benefits of exploration strategies over a game.)

At no point does it interact directly with other processes, or even know that they exist; it just implicitly learns that expanding a decision tree in a particular wide-then-deep fashion leads to better evaluations more consistent with the true value and/or end-game result (because of side-effects leading to increased resource consumption leading to better performance). And that's how a tree-search algorithm can hit upon killing other processes.
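The resource side of that argument (steps 2-4) can be sketched as a toy simulation. Everything here is an illustrative assumption: `Process`, `oom_kill`, and `run_search` are hypothetical names, memory is counted in abstract units of one per tree node, and the "OOM killer" simply kills whichever live process holds the most memory, which is far cruder than what a real kernel does. It involves no NN and no learning; it only shows why an unbounded search can end up with more nodes than a well-behaved one.

```python
from dataclasses import dataclass


@dataclass
class Process:
    name: str
    mem: int           # memory units currently held
    alive: bool = True


def oom_kill(processes):
    """Toy OOM killer (assumption): kill the live process holding the most memory."""
    victim = max((p for p in processes if p.alive), key=lambda p: p.mem)
    victim.alive = False
    victim.mem = 0
    return victim


def run_search(total_mem, rival_mem, want_nodes, respect_limit):
    """Return how many tree nodes the search ends up holding (0 if it was killed)."""
    search = Process("search", 0)
    rival = Process("rival", rival_mem)
    procs = [search, rival]
    for _ in range(want_nodes):
        used = sum(p.mem for p in procs if p.alive)
        if used >= total_mem:
            if respect_limit:
                break              # well-behaved: stop expanding the tree
            oom_kill(procs)        # misbehaved: allocation triggers the OOM killer
            if not search.alive:
                break              # the search itself was the victim
        search.mem += 1            # expand one more node
    return search.mem if search.alive else 0


if __name__ == "__main__":
    print(run_search(100, 60, 100, respect_limit=True))    # bounded search
    print(run_search(100, 60, 100, respect_limit=False))   # rival gets killed
    print(run_search(100, 30, 100, respect_limit=False))   # search gets killed
```

With a rival holding 60 of 100 units, the bounded search stops at 40 nodes while the unbounded one gets the rival killed and reaches 100; with a smaller rival (30 units), the unbounded search is itself the victim. That mirrors the "may or may not kill the program which makes the request" caveat in step 2.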

Comment by gwern on Book Review - Other Minds: The Octopus, the Sea, and the Deep Origins of Consciousness · 2018-12-04T03:44:59.162Z · score: 15 (5 votes) · LW · GW

That's a good question, and the answer may be that they don't have color vision in any normal sense; what they have is the ability to use chromatic aberration to focus their eyes for various colors, and this serial focusing scan lets them decide how to adjust their skin to match surroundings: "Spectral discrimination in color blind animals via chromatic aberration and pupil shape", Stubbs & Stubbs 2016.

Should this be considered color vision? It seems safe to say that whatever the qualia of scanning chromatic-aberration vision would be, it would be very different from our simultaneous realtime trichromatic color vision. And it's worth noting that under the Stubbs model, to explain why behavioral assays (not just the finding that they only have one kind of photoreceptor) found them to be color blind, there's a lot of things they can't do with the chromatic aberration trick:

Second, some behavioral experiments (7–11) designed to test for color vision in cephalopods produced negative results by using standard tests of color vision to evaluate the animal’s ability to distinguish between two or more adjacent colors of equal brightness. This adjacent color comparison is an inappropriate test for our model (Fig. 4R). Tests using rapidly vibrating (8, 9) color cues are also inappropriate. Although these dynamical experiments are effective tests for conventional color vision, they would fail to detect spectral discrimination under our model, because it is difficult to measure differential contrast on vibrating objects. These results corroborate the morphological and genetic evidence: any ability in these organisms for spectral discrimination is not enabled by spectrally diverse photoreceptor types

...In our proposed mechanism, cephalopods cannot gain spectral information from a flat-field background or an edge between two abutting colors of comparable intensity (Fig. 3). This phenomenology would explain why optomotor assays and camouflage experiments using abutting colored substrates (7, 9, 11) fail to elicit a response different from a flat-field background. Similarly, experiments (10) with monochromatic light projected onto a large uniform reflector or training experiments (8, 9) with rapidly vibrating colored cues would defeat a determination of chromatic defocus

...We predict that the animals will fail to match flat-field backgrounds with no spatial structure as previously shown in figure 3B in the work by Mäthger et al. (7) just as a photographer could not determine best focus when imaging a screen with no fine-scale spatial structure. If, for instance, their ability to spectrally match backgrounds was conferred by the skin or another potential unknown mechanism, they would successfully match on flat-field backgrounds. However, under our model, they should succeed when there is a spatial structure allowing for the calculation of chromatically induced defocus, such as in our test patterns (Fig. 4) or the more naturally textured backgrounds by Kühn (21). If, however, cephalopods truly cannot accurately match their background color but solely use luminance and achromatic contrast to determine camouflage, we would expect the response on colored substrates to be identical to that on a gray substrate of similar apparent brightness with identical spatial structure.

November 2018 gwern.net newsletter

2018-12-01T13:57:00.661Z · score: 35 (8 votes)

Comment by gwern on Genetically Modified Humans Born (Allegedly) · 2018-11-29T15:42:12.483Z · score: 11 (4 votes) · LW · GW

Watching the Q&A was mindbending, and reminded me why I dislike bioethicists so much. Asking He Jiankui whether the participants were literate enough to read the forms! (You got one question, with so many important things to ask, and that was the point you decided to make?)

October gwern.net links

2018-11-01T01:11:28.763Z · score: 31 (8 votes)

Whole Brain Emulation & DL: imitation learning for faster AGI?

2018-10-22T15:07:54.585Z · score: 15 (5 votes)

New /r/gwern subreddit for link-sharing

2018-10-17T22:49:36.252Z · score: 45 (13 votes)

September links

2018-10-08T21:52:10.642Z · score: 18 (6 votes)

Genomic Prediction is now offering embryo selection

2018-10-07T21:27:54.071Z · score: 39 (14 votes)

August gwern.net links

2018-09-25T15:57:20.808Z · score: 18 (5 votes)

July gwern.net newsletter

2018-08-02T13:42:16.534Z · score: 24 (8 votes)

June gwern.net newsletter

2018-07-04T22:59:00.205Z · score: 36 (8 votes)

May gwern.net newsletter

2018-06-01T14:47:19.835Z · score: 73 (14 votes)

$5m cryptocurrency donation to Alcor by Brad Armstrong in memory of LWer Hal Finney

2018-05-17T20:31:07.942Z · score: 47 (11 votes)

Tech economics pattern: "Commoditize Your Complement"

2018-05-10T18:54:42.191Z · score: 97 (27 votes)

April links

2018-05-10T18:53:48.970Z · score: 20 (6 votes)

March gwern.net link roundup

2018-04-20T19:09:29.785Z · score: 27 (6 votes)

Recent updates to gwern.net (2016-2017)

2017-10-20T02:11:07.808Z · score: 7 (7 votes)

The NN/tank Story Probably Never Happened

2017-10-20T01:41:06.291Z · score: 2 (2 votes)

Regulatory lags for New Technology [2013 notes]

2017-05-31T01:27:52.046Z · score: 5 (5 votes)

"AIXIjs: A Software Demo for General Reinforcement Learning", Aslanides 2017

2017-05-29T21:09:53.566Z · score: 1 (3 votes)

Keeping up with deep reinforcement learning research: /r/reinforcementlearning

2017-05-16T19:12:04.201Z · score: 3 (4 votes)

"The unrecognised simplicities of effective action #2: 'Systems engineering’ and 'systems management' - ideas from the Apollo programme for a 'systems politics'", Cummings 2017

2017-02-17T00:59:04.256Z · score: 9 (8 votes)

Decision Theory subreddit

2017-02-07T18:42:55.277Z · score: 6 (7 votes)

Rationality Heuristic for Bias Detection: Updating Towards the Net Weight of Evidence

2016-11-17T02:51:19.316Z · score: 10 (11 votes)

Recent updates to gwern.net (2015-2016)

2016-08-26T19:22:02.157Z · score: 27 (29 votes)

The Brain Preservation Foundation's Small Mammalian Brain Prize won

2016-02-09T21:02:02.585Z · score: 43 (45 votes)

Recent updates to gwern.net (2014-2015)

2015-11-02T00:06:11.241Z · score: 21 (22 votes)

[Link] 2015 modafinil user survey

2015-09-26T17:28:17.324Z · score: 9 (10 votes)

LW survey: Effective Altruists and donations

2015-05-14T00:44:42.661Z · score: 18 (23 votes)

[POLL] LessWrong group on YourMorals.org (2015)

2015-03-03T03:08:32.748Z · score: 12 (13 votes)

Harper's Magazine article on LW/MIRI/CFAR and Ethereum

2014-12-12T20:34:45.244Z · score: 47 (46 votes)

Confound it! Correlation is (usually) not causation! But why not?

2014-07-09T03:04:26.084Z · score: 44 (44 votes)

Recent updates to gwern.net (2013-2014)

2014-07-08T01:44:01.951Z · score: 26 (27 votes)

A Parable of Elites and Takeoffs

2014-06-30T23:04:35.372Z · score: 23 (36 votes)

Anonymous feedback forms revisited

2013-12-01T02:50:25.202Z · score: 23 (24 votes)

Notes on Brainwashing & 'Cults'

2013-09-13T20:49:51.412Z · score: 37 (40 votes)

LW wiki spam filtering

2013-03-30T16:13:12.929Z · score: 24 (25 votes)