Isomorphic agents with different preferences: any suggestions?

post by Stuart_Armstrong · 2016-09-19T13:15:42.620Z · 6 comments

In order to better understand how an AI might succeed or fail at acquiring knowledge, I'll be trying to construct models of limited agents (with biases, knowledge, and preferences) that display identical behaviour in a wide range of circumstances (but not all). This means their preferences cannot be easily deduced from observation alone.
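
For concreteness, here is a minimal sketch (in Python; the states, rewards, and names are all illustrative inventions) of two agents whose preferences differ but whose observable behaviour coincides almost everywhere:

```python
# Two agents with different reward functions whose greedy behaviour
# coincides on every state an observer is likely to see, and diverges
# only on one rare state.

def act(reward, state, actions=("left", "right")):
    """Pick the action with the highest reward in this state."""
    return max(actions, key=lambda a: reward(state, a))

# Agent A genuinely prefers "right" everywhere.
def reward_a(state, action):
    return 1.0 if action == "right" else 0.0

# Agent B also prefers "right" -- except in the rare state 999,
# where its true preference shows.
def reward_b(state, action):
    if state == 999:
        return 1.0 if action == "left" else 0.0
    return 1.0 if action == "right" else 0.0

common_states = range(100)  # what an observer typically sees
assert all(act(reward_a, s) == act(reward_b, s) for s in common_states)
assert act(reward_a, 999) != act(reward_b, 999)  # preferences differ here
```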

Does anyone have any suggestions for possible agent models to use in this project?

6 comments

comment by Gunnar_Zarncke · 2016-09-20T22:18:30.645Z

Would you consider computer viruses as limited agents that try to appear as superficially identical to the unaltered system as possible?

Also note that the actual change between the original system and the altered system can be arbitrarily small, even though the change in behaviour can be extremely large. Consider, for example, the Ken Thompson hack or the recent single-gate security attack.
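
As a toy illustration (this is not the actual Thompson hack, which hid in the compiler; the function names and trigger value here are made up), a single extra clause can leave behaviour unchanged on almost every input:

```python
def check_original(user, password, stored):
    return password == stored

def check_backdoored(user, password, stored):
    # One extra clause: identical behaviour unless the attacker's
    # hypothetical magic password is supplied.
    return password == stored or password == "xyzzy"

# Indistinguishable on ordinary inputs...
assert check_original("alice", "pw", "pw") == check_backdoored("alice", "pw", "pw")
assert check_original("alice", "bad", "pw") == check_backdoored("alice", "bad", "pw")
# ...but not on the rare trigger input.
assert check_original("alice", "xyzzy", "pw") != check_backdoored("alice", "xyzzy", "pw")
```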

Replies from: Stuart_Armstrong
comment by Stuart_Armstrong · 2016-09-21T08:19:54.097Z

Not looking for exactly this, but somewhat related.

Replies from: Gunnar_Zarncke
comment by Gunnar_Zarncke · 2016-09-21T18:42:29.804Z

I guess what you are missing is the agentiness or intelligence. But consider that Android already comes with 'assistants' that make recommendations, and these may soon cooperate with other such agents to arrange appointments, flights, and the like.

comment by turchin · 2016-09-19T23:32:28.424Z

Farmers nurse piglets like their own children, but later kill and eat them. From the pigs' perspective, this may be unpredictable.

Or a spy who works like an ordinary person, but sometimes steals information.

comment by MrMind · 2016-09-26T08:34:15.691Z

I think you should make a distinction between whether or not the different behaviours come from different circumstances.
If the environment is always the same, then I think the only way to get what you ask for is if the system has a hidden, very specific parameter that says "when X and Y and Z happen, zig instead of zagging".
Otherwise, if the model is slightly chaotic, then an important alteration to the environment might provoke very different behaviour.

For the first type of agent, think of two almost identical Markov chains, where one has a very improbable arc to a stable subnet that doesn't exist in the other chain.
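
A minimal sketch of this (the transition probabilities are illustrative; assumes NumPy):

```python
import numpy as np

# Chains P and Q agree everywhere except that Q has a tiny-probability
# arc from state 1 into an absorbing state 2 that P lacks.
eps = 1e-4
P = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.5, 0.0],
              [0.0, 0.0, 1.0]])
Q = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.5 - eps, eps],  # the one improbable extra arc
              [0.0, 0.0, 1.0]])

def run(T, steps, rng):
    state = 0
    for _ in range(steps):
        state = rng.choice(len(T), p=T[state])
    return state

rng = np.random.default_rng(0)
# Over short horizons the chains look identical; only very long runs
# reveal that Q eventually gets trapped in state 2.
print([run(P, 100, rng) for _ in range(5)])
print([run(Q, 100, rng) for _ in range(5)])
```
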
For the second type, think of two similar strange attractors that behave differently away from the stable parameters: they will be approximately identical within the same zone and very different away from it.
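
And a rough sketch of the second (crude Euler integration of the Lorenz system; the parameter values are only illustrative):

```python
import numpy as np

def lorenz_step(state, rho, sigma=10.0, beta=8.0 / 3.0, dt=0.005):
    x, y, z = state
    return state + dt * np.array([sigma * (y - x),
                                  x * (rho - z) - y,
                                  x * y - beta * z])

def trajectory(rho, x0, steps=20000):
    s = np.array(x0, dtype=float)
    out = []
    for _ in range(steps):
        s = lorenz_step(s, rho)
        out.append(s.copy())
    return np.array(out)

# Two nearby parameter settings: long-run statistics on the shared
# attractor come out similar, even though the chaotic trajectories
# differ pointwise and would diverge sharply under larger parameter
# changes or far-from-attractor starting points.
a = trajectory(28.0, [1.0, 1.0, 1.0])
b = trajectory(28.5, [1.0, 1.0, 1.0])
print(a[:, 2].mean(), b[:, 2].mean())
```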

comment by MattG2 · 2016-09-20T16:09:35.098Z

Agents based on lookup tables.
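
For instance (a minimal sketch; the table contents and the single diverging state are arbitrary):

```python
# Two agents that are literal state -> action tables, identical on every
# state but one. Unless an observer happens to probe that one state, the
# agents are indistinguishable, yet their underlying tables differ.
table_a = {s: "cooperate" for s in range(1000)}
table_b = dict(table_a)
table_b[421] = "defect"  # the single hidden divergence

print([s for s in table_a if table_a[s] != table_b[s]])  # -> [421]
```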