[LINK] Computer program that aces 'guess next' in IQ test

post by Dmytry · 2012-02-16T09:01:23.172Z · LW · GW · Legacy · 14 comments

Contents

14 comments

From

http://astrobio.net/pressrelease/4569/computers-that-think-like-humans

The group is therefore using a psychological model of human patterns in their computer programmes. They have integrated a mathematical model that models human-like problem solving. The programme that solves progressive matrices scores IQ 100 and has the unique ability of being able to solve the problems without having access to any response alternatives. The group has improved the programme that specialises in number sequences to the point where it is now able to ace the tests, implying an IQ of at least 150.

'Our programmes are beating the conventional math programmes because we are combining mathematics and psychology. Our method can potentially be used to identify patterns in any data with a psychological component, such as financial data. But it is not as good at finding patterns in more science-type data, such as weather data, since then the human psyche is not involved,' says Strannegård.

 

That's an awesome study.

I always thought the variations of continue series test (progressive matrices, number sequences, word A is to word B as word  C is to ?? etc) are very culturally biased. You solve those best and easiest by sharing with the test maker the learning environment (and for visual ones, sharing visual environment), as well as sharing neural architecture. That lets you pick same choice as the test maker [edit: and do so easily and naturally]. And this research provides very good demonstration.

Of course there will be a correlation of ability to guess the same or secondguess the test maker with intelligence, but so does e.g. height correlate with intelligence (via effect of nutrition on both); perhaps we should add 'what is your height' question to IQ test and then let some giant robot score a genius.

Note: one might think of sequence guessing as task of minimizing Kolmogorov complexity. That's not quite so, sequences are too short, shorter than the generators. Consider sequence 2,3,5,7,11,? . Obviously the answer on IQ test would be 13 (primes). Good luck writing primes generating program that is simpler than this sequence itself, though [edit: i mean, simpler than a program which just prints those numbers followed by whatever garbage. Unless you have a language where 'print primes' is a basic command]. (and of course the length of program will be very dependent on the machine being used)

14 comments

Comments sorted by top scores.

comment by jmmcd · 2012-02-16T13:38:07.683Z · LW(p) · GW(p)

This looks really interesting. It reminds me of Hofstadter's work described in Fluid Concepts and Creative Analogies.

Can anyone point to the paper behind this press release?

Replies from: Richard_Kennaway
comment by Richard_Kennaway · 2012-02-16T17:11:25.476Z · LW(p) · GW(p)

There doesn't appear to be one yet, just a talk given at a philosophy conference in 2011. The abstract is here (p.21), but it doesn't say anything that isn't in the newspaper account.

comment by Jayson_Virissimo · 2012-02-18T08:28:22.053Z · LW(p) · GW(p)

I always thought the variations of continue series test (progressive matrices, number sequences, word A is to word B as word C is to ?? etc) are very culturally biased. You solve those best and easiest by sharing with the test maker the learning environment (and for visual ones, sharing visual environment), as well as sharing neural architecture. That lets you pick same choice as the test maker [edit: and do so easily and naturally]. And this research provides very good demonstration.

If IQ testing determined how culturally similar test-takers are to the test-makers, then you would expect that people the are the most similar culturally to score the highest. They don't (for instance, East Asians score higher than White Americans). Therefore, IQ testing is probably not mostly about cultural similarity between test-makers and test-takers.

comment by Kaj_Sotala · 2012-02-18T07:58:24.704Z · LW(p) · GW(p)

I don't see how "IQ tests are very culturally biased" follows from this result.

comment by Luke_A_Somers · 2012-02-16T15:24:32.646Z · LW(p) · GW(p)

If we're told that the data is a rule-generated sequence, then programs that are just lists can be excluded. That's a conclusion from the data, not from the prior.

Plus, if you come across, out of nowhere, the sequence of prime numbers up to 11, what probability ought you assign that it is the sequence of primes instead of arbitrary numbers? I think the overhead of a computer program in comparison to a statement of mathematical requirement may be too great to give a realistic estimate. In particular, generating a nonexistence test in code carries a lot of overhead, while in math it's quite compact.

Just listing them, we have "2,3,5,7,11"

That's 10 symbols, if we let 11 be considered 1 symbol (its short enough that a reasonable representation would leave it at its present length), but leave the commas as delimiters.

Via mathematical criterion, "a:!b,c>1|b*c=a"

That's 14 characters, of which 2 can be considered overhead (could leave 'a:' implicit). Note, we dropped order information here. I suppose in the event of non-uniformly-increasing sequences, we'll need the spot used here for 'a:' for something else.

Now, code. Count each line as 2 characters, and I'm completely neglecting any runtime optimization.

  • Assign register 1 the value 2
  • (label NEXT) Assign register 2 the value 2.
  • (label CANDIDATE) test for equality between register 2 and register 1
  • (on success) jump to KEEP
  • Assign register 3 the value in register 1
  • apply modulo base register 2 to register 3
  • test for register 3 being 0.
  • (on success) jump to NEXT while incrementing register 1
  • jump to CANDIDATE while incrementing register 2
  • (label KEEP) output register 1
  • jump to NEXT while incrementing register 1

22 'characters'

It may be possible to trim a few more away, if the instruction set is optimized for efficient short-range jumps. If the test commands can include longer jumps than 1 step in them (e.g. a skip-2 test and a go-back-4 test) then you get to save 2 lines. If the modulo command takes 3 register arguments, you can save one more line.

With all of these, we're down to 16 characters, still 4 more than the briefer variant of the mathematical criterion. There's no way 'ascending order' counts for 4 characters, or even 2 for the longer variant (since those 2 characters could be taken to specify that), to make up that difference.

Yet... I really suspect that if you run across the sequence "2, 3, 5, 7, 11" with no context other than that 2 was the beginning of the sequence, you ought to assign more than a 1/(sizeof char)^2 probability that it's the sequence of primes - that would be somewhere around 0.0001 probability.

(edited: mathematical statement of primality did not exclude factors of 1)

Replies from: Dmytry, kpreid
comment by Dmytry · 2012-02-16T19:17:12.483Z · LW(p) · GW(p)

Well, I dunno.

If I had random C program print me out 2,3,5,7,11 , I would still assume VERY low probability that it is going to print out primes correctly up to reasonably big number. Even more so for Turing machines or anything of this kind. Ditto for any natural processes. Ditto for seeing those numbers in idk child doodling when child didn't learn the primes yet. Child might have invented primes but that list is not remotely enough evidence.

If a human writer of a test tells me this sequence, I would guess that he knows of primes, and is telling primes to see if I know of primes too. But if he didn't get taught primes, I would not assume he invented primes from just this sequence.

The Kolmogorov's complexity really is dependent to language. If we do it informally with human language, then 'primes' is an answer that is shorter than 'two three five seven eleven'. But then the complexity depends to language and expectations of what's more complex.

Consider the 'petals around the roses' joke as very extreme example. Most educated individuals just try to search some giant solution space and are blind to the dots themselves, seeing it as numbers. Unless they have encountered something of this kind before (e.g. the other joke about counting loops in numbers), in which case they solve it quite easily.

edit: There is other slightly less related example with regards to how distribution is different for humans. If you look at natural data, it most often begins with 1. If you look at human-forged data, the frequencies of first digits are much more equal unless the person committing the forgery is very clever.

Replies from: dlthomas, Luke_A_Somers
comment by dlthomas · 2012-02-17T00:44:52.281Z · LW(p) · GW(p)

Very low probability, but much, much larger than most other specific sequences of comparable length!

comment by Luke_A_Somers · 2012-02-17T04:01:42.665Z · LW(p) · GW(p)

All the cases in your first paragraph provide context. After the first few, the context essentially tells you whether it's possible for the sequence to be an enumeration of primes.

In the first few cases, of unknown computer programs, do you really think that the prime number hypothesis should be struck with a 40 decibel probability penalty? I'd love to bet with you. Lots and lots of money, as often as possible.

Replies from: Dmytry
comment by Dmytry · 2012-02-17T08:32:37.322Z · LW(p) · GW(p)

Well, if the programming language had some primes(); function that prints primes, then no, it shouldn't. If this is a random choice of a program written by a human being, ditto.

If we are talking of some programming language like C, or assembly, or especially Turing machine, and randomly generated programs, then i'm pretty sure if you see 2,3,5,7,11 it is still quite unlikely (on order of 10^-4 at least) that program would print primes correctly. (However the chance that program prints 13 next would be way higher than 10^-4)

In general, the generated programs have a tendency to do really weird stuff. There was an example posted right here:

http://lesswrong.com/lw/9pl/automatic_programming_an_example/

And the likehood btw depends on programming language. With wolfram alpha, '5 primes' will print you the primes. With x-86 assembly, division instruction may be used. With z-80 assembly, there's no division or multiplication instruction. With Turing machine or anything of this sort even the addition needs to be 'reinvented'.

With a language that has huge library of functions, with huffman-coded names (approximately the human language), the complexity will greatly depend to how often people who made that library expected to use primes.

comment by kpreid · 2012-02-16T16:29:28.943Z · LW(p) · GW(p)

I think if you want to write symbol-short programs for small problems, you should probably look at models which are more composition-based, so as to remove the identifiers/register names which are not really part of the complexity of the problem. To name some actual existant programming languages, (APL or J) or (Forth or Factor). APL in particular uses single-character symbols in a character set designed for the language.

Then there's compression. If you want to keep it reasonably human-understandable and compositional, how about Huffman-coding the set of symbols?

Replies from: Luke_A_Somers
comment by Luke_A_Somers · 2012-02-17T03:56:32.341Z · LW(p) · GW(p)

As noted, language-dependent for sure. APL looks appropriate for this... but... Wikipedia says this code snipped finds all of the prime numbers up to R...

(~R∊R∘.×R)/R←1↓⍳R

which is 17 characters, and you need to feed it a top of range. Machine code wins!

By the way, machine-code symbols are already pretty close to Huffman-coded.

comment by Normal_Anomaly · 2012-02-16T13:40:27.051Z · LW(p) · GW(p)

Note: one might think of sequence guessing as task of minimizing Kolmogorov complexity. That's not quite so, sequences are too short, shorter than the generators. Consider sequence 2,3,5,7,11,? . Obviously the answer on IQ test would be 13 (primes). Good luck writing primes generating program that is simpler than this sequence itself, though. (and of course the length of program will be very dependent on the machine being used)

I'm having trouble understanding this part. I thought that "predicting the next number by minimizing Kolmogorov complexity" meant "finding the shortest program that would generate that sequence and the next bit", not "finding a program shorter than that sequence." Can you explain?

Replies from: lukstafi
comment by lukstafi · 2012-02-16T13:57:52.105Z · LW(p) · GW(p)

It is generally assumed that the program that returns a predefined sequence and then always "1" is negligibly longer than that sequence.

Replies from: Dmytry
comment by Dmytry · 2012-02-16T19:02:39.226Z · LW(p) · GW(p)

Precisely.

Also, to clarify. Kolmogorov's complexity of printing out primes vs printing this exact sequence followed by ones, is a matter of language. One could imagine a language where there's 1-symbol command that prints primes. Or a language that is like actual programming languages, where you have to implement printing of primes.