Signs of the Times

post by nixtaken · 2020-08-04T07:09:41.553Z · score: -7 (15 votes) · LW · GW · 11 comments

One morning I woke up to see the front page of every newspaper across the world covered with pictures of what looked like a big orange doughnut. What could this represent? I learned that the doughnut picture was rendered by combining images taken from telescopes all around the world.

Petabytes of data had been mailed to one location for analysis, and a few groups of PhD students wrote code to combine and overlay the images. The student responsible for combining all of the work into one image got something that looked like a ‘black hole’, and newspapers and magazines around the world put it on the front page, even though each individual image in the data set showed no doughnut, just blur.

The woman behind the first black hole image

I thought: PhD student work is great, but it often contains mistakes, so I sure hope her advisor went through her code very carefully. If you overlay petabytes of images with individually tuned contrast ratios, and construct your algorithm so that the overlay is centered on a region of interest, it is certainly possible to create a black hole in the middle of that region. You’d think that sort of scrutiny is routinely applied to new results, but in the darkened conference rooms in which such data is presented, skepticism is expressed in hushed and veiled terms. Sometimes, nothing stands in the way of an idea marching ahead.

The first article I read had a bit of detail on how the algorithm was developed:

Bouman adopted a clever algebraic solution to this problem: If the measurements from three telescopes are multiplied, the extra delays caused by atmospheric noise cancel each other out. This does mean that each new measurement requires data from three telescopes, not just two, but the increase in precision makes up for the loss of information.
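
What this passage describes is the standard ‘closure phase’ (bispectrum) idea from radio interferometry: multiplying the three complex visibilities around a triangle of telescopes adds their phases, and the per-station atmospheric delays cancel. Here is a toy numerical check of that cancellation – my own sketch with made-up numbers, not code from the project:

```python
import numpy as np

# A baseline between stations i and j measures the true visibility phase
# plus an unknown atmospheric delay at each station. All numbers invented.
true_phase = {("A", "B"): 0.7, ("B", "C"): -1.2, ("C", "A"): 0.4}

rng = np.random.default_rng(0)
atm = {s: rng.uniform(-np.pi, np.pi) for s in "ABC"}  # per-station errors

def measured(i, j):
    # Station i's delay enters with +, station j's with -.
    return true_phase[(i, j)] + atm[i] - atm[j]

# Around the closed triangle, each station's delay appears once with + and
# once with -, so the atmospheric terms cancel exactly.
closure = measured("A", "B") + measured("B", "C") + measured("C", "A")
true_closure = sum(true_phase.values())
```

The cost is exactly the trade-off the article mentions: you need three stations per measurement, and you recover only the combined (closure) phase rather than the individual baseline phases.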

I wanted to see her presentation of her data, but what I found was a TED talk.

The thing that concerned me most in her TED talk came at 6:40, when she said (paraphrased):

Some images are less likely than others and it is my job to design an algorithm that gives more weight to the images which are more likely.

as in, she told her algorithm to look for pictures that look like black holes and, lo and behold, her algorithm found black holes by ignoring the data that didn’t look like black holes.
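
As I understand it, the formal version of “weighting more likely images” is maximum a posteriori (MAP) estimation: pick the image that maximizes a data-fit term plus a prior score. Here is a toy sketch of that general recipe – my own illustration with invented numbers and an invented smoothness prior, not her algorithm:

```python
import numpy as np

# Toy MAP reconstruction: score each candidate image by how well it fits
# the noisy data, plus a prior that penalizes "unlikely" (rough) images.
rng = np.random.default_rng(1)
truth = np.array([0.0, 1.0, 1.0, 0.0])       # toy one-dimensional "sky"
data = truth + rng.normal(0, 0.1, size=4)    # noisy measurement of it

def log_likelihood(img):
    # Data-fit term: how close the candidate is to what we measured.
    return -np.sum((img - data) ** 2)

def log_prior(img):
    # "Weirdness" penalty: rough images score worse than smooth ones.
    return -0.1 * np.sum(np.diff(img) ** 2)

candidates = [truth, np.zeros(4), np.ones(4)]
best = max(candidates, key=lambda img: log_likelihood(img) + log_prior(img))
```

Whether this counts as ignoring inconvenient data or as principled regularization depends entirely on how the prior is built – which is what the argument in the comments is about.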

LIGO did something similar in their algorithms, so if they got away with it, why can’t she?

Finally, I found a real, academic presentation from shortly after the media blitz and front page news stories.

It is an hour long and the technical stuff starts about ten minutes in.

At 14:40, she talks about the ‘challenge’ of dealing with data that had an ‘error’ of 100%.

At 16:00, she talks about how the ‘CLEAN’ algorithm is guided a lot by the user – as in, the user makes the image look how they think it should look.

At 19:30, she said, “Most people use this method to do calibration before imaging, but we set it up so that we could do calibration during imaging.” Gaaaah!

At 31:40, she shows four images that look the same in the amplitude domain – showing the extent to which this measurement relies on information in the phase domain. An image with a hole or without a hole looks the same in the amplitude domain.

At 39:30, she says that the phase data is unusable and the amplitude data is noisy. To me, this sounds like she just contradicted herself.

Inspecting what she wrote about her data analysis methods turns up even stranger choices, for example deciding whether or not to include a data point based on its ‘weirdness’ – as in, based on whether or not it contributed to the result she wanted to see.

A clever commenter who read this material when I first posted it wrote:

Take a look at this picture of my cat.

You don’t see a cat? You just need to apply the right cat-shaped filters.

When I see a picture of a black hole, I see that our academic system has succumbed to the overwhelming noise of our dark age.

I should mention my pre-existing biases: my default setting is skeptical. I don’t really believe that black holes exist because I think that theorists got drunk on general relativity and invented them. Astronomers got drunk on interpreting the meanings of tiny dots of light and convinced themselves that they had seen these invisible, theoretical beasts of the night sky.

Seriously, any time you use a singularity to describe a physical phenomenon, it usually turns out to be wrong. Just think about the equations governing water swirling around in a basin: an idealized, inviscid model gives you a singularity at the center of the vortex, while a viscous model does not.
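
To make the vortex example concrete (my own illustration, not from the post’s sources): an idealized inviscid point vortex has azimuthal velocity

$$v_\theta(r) = \frac{\Gamma}{2\pi r},$$

which blows up as $r \to 0$, while the viscous Lamb–Oseen vortex,

$$v_\theta(r) = \frac{\Gamma}{2\pi r}\left(1 - e^{-r^2/4\nu t}\right),$$

stays finite everywhere: expanding the exponential near the center gives $v_\theta \approx \Gamma r / (8\pi\nu t)$, which goes smoothly to zero. The singularity is an artifact of the idealization, not a property of the water.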

That the whole universe is full of singularities floating around in space requires a suspension of disbelief with which I am not really comfortable. I am not alone in this, apparently. Believe it or not, there are still scientists out there who do not believe that black holes exist.

Some black hole researchers will display a picture of a star with a dark spot in it and then follow that up with a simulation showing the same picture. There is something important to know about simulations. If you have a picture of something, it is easy to make a simulation that copies the picture. What is hard is to make a simulation of something you have never seen before, predict how and where you will see it, and then record an observation of it. That is really the only way to do science. Any other route can lead to self-trickery.

Here is a precursor to the black hole image that made Katie Bouman’s PhD work famous. It was what theorists predicted that the astronomers would see.

Saturation effects must be rather difficult to deal with in such images, but, as we saw recently, that hasn’t stopped them from ending up on the front page of newspapers across the world.

Here is a study which claims to support the existence of black holes, but really just tracked some stars near the center of our galaxy. They write:

“The stars in the innermost region are in random orbits, like a swarm of bees,” says Gillessen. “However, further out, six of the 28 stars orbit the black hole in a disc. In this respect the new study has also confirmed explicitly earlier work in which the disc had been found, but only in a statistical sense. Ordered motion outside the central light-month, randomly oriented orbits inside – that’s how the dynamics of the young stars in the Galactic Centre are best described.”

(Unprecedented 16-Year Long Study Tracks Stars Orbiting Milky Way Black Hole)

I believe their measurements, but I don’t always believe the interpretation which scientists give to their results.

A lot of money goes into making these sorts of simulations and studying black holes, so one should expect resistance to any change in belief.

Although the author of this Veritasium video does not come out and directly question the existence of black holes, I like how he describes the way in which astronomers pick unlikely scenarios out of thin air and use them to explain the qualities of blurry blobs of light. I find it amazing how ‘artist’s renditions’ and just-so stories pass as science in this day and age.

Science should produce progress, but when it swirls around in an eddy of self-citation, you end up with a black hole – in a figurative, not literal sense.


(If you would like to hear this post read aloud, try this video.)

I was advised that the articles I first posted here on Less Wrong were too long, wide-ranging, and technical, so with this article, I'm trying a more scaled-back, focused, 'reveal culture' style. Does it work better than the approach in the article below? [LW · GW]


Comments sorted by top scores.

comment by korin43 · 2020-08-04T16:17:40.026Z · score: 15 (10 votes) · LW(p) · GW(p)

I'm assuming it wasn't your goal, but this feels like too much of a personal attack on a particular scientist.

As far as I can tell, Dr. Bouman is just one of many users of the CLEAN algorithm, which Wikipedia says was invented in 1974.

The algorithm she was apparently instrumental in creating is CHIRP, which Wikipedia says is useful exactly because it doesn't require the user-input that you're complaining about:

While the BSMEM and SQUEEZE algorithms may perform better with hand-tuned parameters, tests show CHIRP can do better with less user expertise.

Your points about how noisy this data is and the limitations of the algorithms used to construct these images are really interesting, but the narrative of this article puts too much blame for the problems in astrophysics on a single scientist.

comment by lsusr · 2020-08-04T20:46:51.167Z · score: 8 (5 votes) · LW(p) · GW(p)

The photo especially makes it feel like a personal attack. It is not clear to me what purpose this photo serves other than to make the attack extra personal.

comment by nixtaken · 2020-08-04T16:26:24.252Z · score: 5 (4 votes) · LW(p) · GW(p)

There are tools and there are how those tools are used by high-profile researchers. I think that it is unfortunate that she was put out into the public eye at such an early stage of her career, before she could fully understand what she was doing.

If you do calibration prior to looking at your result, you can be sure that you are not biasing your result, but if you do calibration at the same time that you decide which data to include in your result, you are misusing a tool, and she showed no awareness that this was what she was doing.

Just because a tool is widely used doesn't mean that it is used correctly. If her team was using the CLEAN algorithm to encourage the sort of image they wanted to see, they were not using it correctly, and she showed no awareness that this was what she was doing.

In her talk she compared two methods to generate the black hole image:

one allowed the user to hand tune the image by selecting areas that should be brighter (CLEAN)

the other allowed the user to bias the 'priors' in a form of automated delusion (CHIRP).

comment by korin43 · 2020-08-04T19:50:39.000Z · score: 14 (8 votes) · LW(p) · GW(p)

It looks like gjm already explained how you're giving a misleading account of what these algorithms do and how Dr. Bouman used them in a comment 18 days ago [LW(p) · GW(p)]:

The "weirdness" term in the CHIRP algorithm is a so-called "patch prior", which means that you get it by computing individual weirdness measures for little patches of the image, and you do that over lots of patches that cover the image, and add up the results. (This is what she's trying to get at with the business about random image fragments.) The patches used by CHIRP are only 8x8 pixels, which means they can't encode very much in the way of prejudices about the structure of a black hole.


For CHIRP, they have a way of building a patch prior from a large database of images, which amounts to learning what tiny bits of those images tend to look like, so that the algorithm will tend to produce output whose tiny pieces look like tiny pieces of those images. You might worry that this would also tend to produce output that looks like those images on a larger scale, somehow. That's a reasonable concern! Which is why they explicitly checked for that. (That's what is shown by the slide from the TEDx talk that I thought might be misleading you, above.) The idea is: take several very different large databases of images, use each of them to build a different patch prior, and then run the algorithm using a variety of inputs and see how different the outputs are with differently-learned patch priors. And the answer is that the outputs look almost identical whatever set of images they use to build the prior. So whatever features of those 8x8 patches the algorithm is learning, they seem to be generic enough that they can be learned equally well from synthetic black hole images, from real astronomical images, or from photos of objects here on earth.


Oh, a bonus: you remember I said that one extreme is where the "weirdness" term is zero, so it definitely doesn't import any problematic assumptions about the nature of the data? Well, if you look at the CalTech talk at around 38:00 you'll see that Bouman actually shows you what you get when you do almost exactly that. (It's not quite a weirdness term of zero; they impose two constraints, first that the amount of emission in each place is non-negative, and second a "field-of-view constraint" which I assume means that they're only interested in radio waves coming from the region of space they were actually trying to measure.) ... And it still looks pretty decent and produces output with much the same form as the published image.


Bouman says (CalTech, 16:00) “the CLEAN algorithm is guided a lot by the user.” Yes, and she is pointing out that this is an unfortunate feature of the ("self-calibrating") CLEAN algorithm, and a way in which her algorithm is better. (Also, if you listen at about 35:00, you'll find that they actually developed a way to make CLEAN not need human guidance.)
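
To make the quoted "patch prior" concrete, here is a toy sketch – my own illustration; the training set and scoring rule are invented and far simpler than CHIRP's actual learned prior:

```python
import numpy as np

# Toy "patch prior": learn what small patches of training images look like,
# then score a new image by summing a per-patch "weirdness" over its 8x8
# patches. Here the learned prior is just the average training patch.
PATCH = 8

def patches(img):
    h, w = img.shape
    return [img[i:i + PATCH, j:j + PATCH]
            for i in range(0, h - PATCH + 1, PATCH)
            for j in range(0, w - PATCH + 1, PATCH)]

def learn_mean_patch(train_imgs):
    return np.mean([p for img in train_imgs for p in patches(img)], axis=0)

def weirdness(img, mean_patch):
    # Sum of squared distances of each patch from the learned average.
    return sum(np.sum((p - mean_patch) ** 2) for p in patches(img))

rng = np.random.default_rng(0)
train = [rng.normal(0, 1, (32, 32)) for _ in range(20)]
mean_patch = learn_mean_patch(train)

smooth = np.zeros((32, 32))      # patches close to the learned average
spiky = np.zeros((32, 32))
spiky[0, 0] = 100.0              # one patch is wildly atypical
```

In this toy version the prior only ever sees local 8x8 statistics, which is the point being made above: such a prior can penalize implausible texture, but it has no way to impose large-scale structure like a ring.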

comment by lsusr · 2020-08-04T21:25:28.192Z · score: 9 (4 votes) · LW(p) · GW(p)

I think that it is unfortunate that she was put out into the public eye at such an early stage of her career, before she could fully understand what she was doing. [emphasis mine]

This is yet another example of a personal attack. Please refrain from making unsubstantiated claims belittling other people.

comment by Raemon · 2020-08-04T21:29:16.378Z · score: 14 (7 votes) · LW(p) · GW(p)

This is a reminder to me to write up an article that looks at what the norms actually should be for speculating about other people's motivations. 

(I think there is some sort of general "Be careful psychologizing people, especially when criticizing them" norm that's sort of in the water, but I don't know that I've seen it written up clearly. I currently endorse Duncan Sabien's proposed norm of "if you're going to do that, clearly label your cruxes for what would change your mind about their psychology", but it's a non-obvious norm IMO and took me a while to come around on.)

comment by lsusr · 2020-08-04T20:57:40.560Z · score: 10 (6 votes) · LW(p) · GW(p)

I was advised that the articles I first posted here on Less Wrong were too long, wide-ranging, and technical, so with this article, I'm trying a more scaled-back, focused, 'reveal culture' style. Does it work better than the approach in the article below?

This is better but it still contains two separate ideas. One idea is "I don’t really believe that black holes exist because I think that theorists got drunk on general relativity and invented them." The other idea has to do with the creation of this specific image.

I agree [LW(p) · GW(p)] with korin43 [LW(p) · GW(p)] that the second idea already "feels like too much of a personal attack on a particular scientist". However, if you really want to continue in that direction then you should read all of the relevant scientific papers, write a post (or posts) explaining exactly, in mathematical terms[1] how the algorithms work and then explain the error in careful unambiguous mathematical terms. Strip out everything else[2]. Less Wrong is read by theoretical physicists, quantitative hedge fund managers, specialists in machine learning, and so on. We can handle the math. We do not have time for unnecessary words.

I think it would be more constructive to go in the direction of "black holes do not exist". The problem is not that your articles are too technical. The problem is that they depend unnecessarily upon deep technical knowledge from unrelated domains. An article explaining why black holes do not exist would require technical knowledge in only a single domain.

  1. Less Wrong supports MathJax. ↩︎

  2. In your comments and articles, you often speculate on the motives and thought processes of people you disagree with. I think your writing would benefit from leaving this out and sticking to the facts. Instead of arguing "[person] who espouses [popular idea] is wrong because ", your writing would improve if you wrote "[unpopular idea] is right because ". ↩︎

comment by korin43 · 2020-08-04T21:12:39.142Z · score: 6 (4 votes) · LW(p) · GW(p)

I would add that an article showing problems with either CLEAN or CHIRP would be interesting (especially if you can demonstrate them by actually running the algorithm, or point at other people's results doing that), but an article about both at the same time is needlessly complex.

comment by nixtaken · 2020-08-05T09:41:21.594Z · score: 1 (1 votes) · LW(p) · GW(p)

Compartmentalization is a good way to make sure that no one ever understands the big picture. What I am writing about are things going on at the meta level, which are obscured or hidden when people focus only on myopic detail.

comment by Dagon · 2020-08-06T17:48:48.873Z · score: 4 (2 votes) · LW(p) · GW(p)

Compartmentalization is a good way to make sure that no one ever understands the big picture

We're not asking for compartmentalization, we're asking for clearer composition. The argument about meta-level trends or aggregates fully depends on multiple strong examples, each of which is independently verifiable. It's fine to give pointers between the arguments, but they shouldn't actually depend on each other in a circular way.

comment by korin43 · 2020-08-05T20:07:27.548Z · score: 1 (1 votes) · LW(p) · GW(p)

That's a good point. I think I'm more interested in your meta-point about science in general anyway, but the problem is that, at first glance, your supporting arguments seem to be wrong. Given that Dr. Bouman worked on one of several teams trying to avoid exactly the problem you're talking about by using multiple different methods, and that her CHIRP algorithm was created specifically to avoid the biases that CLEAN introduces, your meta-argument doesn't work unless you go deeper and make a stronger case that CHIRP is biased or broken in the way you're claiming.