GPT-4o Can In Some Cases Solve Moderately Complicated Captchas

post by dirk (abandon) · 2024-11-09T04:04:37.782Z · LW · GW · 2 comments

Here are several examples; I found these captchas via the web rather than generating them anew, but none of them came attached to solutions so I'm not sure their presence in the training data would affect things in any case. (That said, it's possible that the lower resolution of the latter two degraded the adversarial perturbation; I would appreciate a source of higher-resolution captchas if anyone happens to know one.)

It clearly couldn't see all the objects, but the owl was in fact the correct answer
Entertaining failure at basic numerals while nonetheless answering correctly here
This one I was surprised by; I expected the image to be too low-resolution to be comprehensible, but 8/9 are correct here (the middle left image is a chair with an unusually low back)

2 comments

Comments sorted by top scores.

comment by jbash · 2024-11-09T15:45:32.513Z · LW(p) · GW(p)

CAPTCHAs have "adversarial perturbations"? Is that in the sense of "things not visible to humans, but specifically adversarial to deep learning networks"? I thought they just had a bunch of random noise and weird ad hoc patterns thrown over them.

Anyway, CAPTCHAs can't die soon enough. Although the fact that they persist in the face of multiple commercial services offering to solve 1000 for a dollar doesn't give me much hope...

comment by dirk (abandon) · 2024-11-09T16:27:37.472Z · LW(p) · GW(p)

I don't know if it's aimed at neural nets specifically (and of course it is in fact visible to the naked eye), but AFAIK the noise is there to disrupt computer-vision systems, yes. (And in the first one it seems to have been effective in keeping 4o from recognizing the light bulb or the buildings, though interestingly Claude was able to see the buildings but not the bulb or the teapot.) Agreed re: hoping they die soon XD