Explicit model visualizations

post by Dorian Stern vukotic (dorian-stern-vukotic) · 2021-12-05T18:37:25.737Z

Contents

  What is this about?
  Origin of (most) disagreements
  Explicit modelling
  Models in simulations and games 
  Models in “real life”
  Discuss4Real - useful discussions 
  Modelling and rationality in institutions
  Alignment maybe?
  Ok, so now what?

What is this about?

Motivated by two recent posts, I have decided to write this love letter to explicit model visualizations, and how I believe they make explaining systems easier. If we had the tools to effectively work with them, they could massively improve the quality of debates, and even contribute to increasing rationality in institutions.

In Lies, Damn Lies and Fabricated Options, the basic idea is that the mental models of reality that people hold are often wrong. Not only that, it's hard to estimate just how wrong they are.

In Prioritization Research for Wisdom, the basic idea is that good things happen with wisdom, and we should find ways to be wiser.

Wisdom is “the quality of having experience, knowledge, and good judgement”, which roughly amounts to “high quality mental models and the skill to use them to achieve the right goals”.

I believe that creation and support of high quality mental models could benefit from infrastructure that goes beyond simple textual explanations. We have all the tools, like rationality and the scientific method for their creation, but little real infrastructure beyond linear text to effectively express, share and discuss them. I also believe we can build that infrastructure in the form of software tools. 

Such tools can contribute in 3 areas: making discussions more useful, increasing rationality in institutions, and possibly AI alignment.

Origin of (most) disagreements

It should not be a controversial statement that most of the disagreements in politics and business are not due to some core values misalignment, but rather an effect of mismatching mental models of how the world works.

The statement “prosperity is good” is not controversial, but saying “the best way to cause prosperity is capitalism” or “the best way to cause prosperity is communism” is extremely controversial, even though this is arguably the most important question of the 20th century.

This is an infrastructure problem

(This is not to say that the incentives of opinionmakers like the media do not exploit the issues with politics, or that conflict theory does not exist.)

When we compare the two options, like communism and capitalism, we run a simulation using our mental model, and make predictions about how they will contribute to our core values, for example prosperity. 

Simulations like: “Since people are incentivized to work harder under capitalism than communism, they will on average be more prosperous” or “Since the greedy elites cannot hoard wealth, median prosperity will be higher under communism”. Both of these internally make perfect sense, and objectively, both are possible. The real question in this oversimplified example should be: “how does the effect of incentivizing hard work in a capitalist system compare to the effect of disincentivizing hoarding in a communist system, with regard to median/average prosperity?”

 

This question has a solution.



Ideally, we would be able to simulate the entire universe, seed the conditions for either A or B, observe the results after a while and decide which universe we like better. In most cases, the core values of a vast majority of people heavily align with one of these solutions compared to the other. We’ve pretty much globally agreed on some issues that were physically simulated, like “Which is better, trade or plunder?” or “Should murder be punished?”
 

This approach is somewhat computationally expensive to use every day, so heuristics need to be used.

If we could accurately depict our mental models, the relationships between the ‘decision nodes’, the relationship intensities (weights) between the nodes and how they relate to the core values, it would be much easier to figure out the exact points of disagreement and focus on clearing those up. You can read more and different words about it here and also here.
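One way to picture this is to store each person's mental model as a weighted directed graph and then diff two such graphs. The sketch below is only an illustration, not a design for an actual tool: the node names, the weights and the `disagreements` helper are all invented for the example.

```python
# Hypothetical sketch: a mental model as a weighted directed graph.
# Keys are (cause, effect) pairs; values are weights in [-1, 1], where the
# sign is the direction of the effect and the magnitude is its strength.
Model = dict[tuple[str, str], float]

# Two made-up models of the prosperity debate (all weights are illustrative).
alice: Model = {
    ("capitalism", "work incentive"): 0.8,
    ("work incentive", "prosperity"): 0.7,
    ("capitalism", "wealth hoarding"): 0.3,
    ("wealth hoarding", "prosperity"): -0.2,
}
bob: Model = {
    ("capitalism", "work incentive"): 0.7,
    ("work incentive", "prosperity"): 0.3,
    ("capitalism", "wealth hoarding"): 0.8,
    ("wealth hoarding", "prosperity"): -0.8,
}

def disagreements(a: Model, b: Model, threshold: float = 0.25):
    """Return edges whose weights differ by more than `threshold`, largest gap first.

    An edge missing from one model is treated as weight 0.0 in that model.
    """
    edges = set(a) | set(b)
    gaps = [(edge, abs(a.get(edge, 0.0) - b.get(edge, 0.0))) for edge in edges]
    big = [(edge, gap) for edge, gap in gaps if gap > threshold]
    return sorted(big, key=lambda item: item[1], reverse=True)

for (cause, effect), gap in disagreements(alice, bob):
    print(f"{cause} -> {effect}: weights differ by {gap:.1f}")
```

In this toy case both people agree that capitalism incentivizes work; the debate worth having is about how much hoarding hurts prosperity.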
 

One plausible conclusion of the prosperity debate would be that, while both the motivation and hoarding issues exist to some extent, it's easier to solve the harmful hoarding issue than the motivation issue, so a plausible solution would be “The best way to cause prosperity is capitalism with regulations against corruption and monopolies”. Of course, this was simplified to only account for the human laziness/greediness variables and their effect on a single issue, economic prosperity, which is itself fuzzy.

 

Explicit modelling

Let's use a famous example of whether or not soldiers in WW1 should wear helmets to show how explicit modelling would have helped.

The nations of WW1 had a problem - artillery barrages would explode around the entrenched positions and blow up a whole bunch of dirt and rocks, which gravity then dropped on the soldiers hiding in the trenches. This caused a high number of head wounds, which is bad. The solution was for soldiers to wear helmets, which should reduce the number of head wounds. But what actually happened was… the number of head wounds famously increased! There were theories about why this happened, like “Soldiers feel safer and therefore become less cautious”, even leading to the consideration of recalling the helmets altogether.


What was causing this? Well, let's explain how soldiers get wounded and create a modelled visualization of what happens during an artillery barrage.
 

The rocks fly up, and without the helmets, when they come down, small rocks cause wounds, but medium and large ones cause death.
 



 

What would happen if helmets are introduced?

(arguably small shrapnel wouldn’t cause a wound, but since I only included 3 categories I had to overstate the effect)


 

The prediction of this model is that introducing helmets changes the ratio of wound/death scenarios from 1 wound / 2 deaths without helmets to 2 wounds / 1 death with helmets, which is what happened in reality. The people looking at the reports falsely assumed helmets had a negative effect, because they were only looking at the wounds and not the deaths. Even though the specific goal failed (reduce head wounds), the decision had a positive effect on the underlying ‘core value’ (our casualties = bad).
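The three-category model is small enough to write down directly. This is a minimal sketch that assumes, as the visualization implicitly does, that small, medium and large rocks are equally common; the names `WITHOUT_HELMET`, `WITH_HELMET` and `tally` are made up for the example.

```python
from collections import Counter

# What happens to a soldier hit by a rock of each size, per the model above.
# Assumption for this sketch: the three sizes are equally common.
WITHOUT_HELMET = {"small": "wound", "medium": "death", "large": "death"}
WITH_HELMET = {"small": "wound", "medium": "wound", "large": "death"}

def tally(outcomes: dict[str, str]) -> Counter:
    """Count wounds and deaths across the three rock sizes."""
    return Counter(outcomes.values())

before, after = tally(WITHOUT_HELMET), tally(WITH_HELMET)
print("wounds:", before["wound"], "->", after["wound"])  # the number in the reports
print("deaths:", before["death"], "->", after["death"])  # the number nobody looked at
```

Tracking only the first line makes helmets look harmful; tracking both shows a would-be death turning into a wound.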
 

Of course, with the power of hindsight and since this is a common example used when teaching survivorship bias, I knew to draw the correct model which includes deaths. The competing model would be to include the effect of helmets on soldier carelessness, which would then cause more wounds. How do we know which one is correct?

Well, aside from the option of going into the trench to see for yourself, it would be a good idea to continuously track all relevant datapoints (wounds, deaths, attacks, etc), and see how a single change impacts them - while this is essentially experimental science, explicit modelling can also help explain the observed changes and find their real causes. 

What is the likelihood that helmets cause soldiers to be careless in a way that causes them to receive more wounds specifically to the head but die less overall, compared to the likelihood that helmets simply turn would-be deaths into wounds? The models can also make predictions like: “If helmets cause carelessness, then our soldiers would die to enemy snipers at a higher rate when wearing them”. If that doesn't happen (and presumably it didn’t), the carelessness model is likely more wrong than “Helmets turn some would-be shrapnel deaths into injuries”, which predicts that the number of sniper deaths stays the same, since helmets don't stop bullets.
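Comparing competing models against tracked data points can be sketched as a simple check of predicted versus observed directions. Everything here is illustrative: the metric names and the `mismatches` helper are invented, and the "unchanged sniper deaths" observation is only what this post presumes, not a verified historical figure.

```python
# Qualitative predictions after helmets are issued: +1 = goes up, 0 = unchanged.
PREDICTIONS = {
    "helmets cause carelessness": {"head wounds": +1, "sniper deaths": +1},
    "helmets turn deaths into wounds": {"head wounds": +1, "sniper deaths": 0},
}

# Observations. Rising head wounds come from the story above; unchanged
# sniper deaths is an assumption made for this sketch.
OBSERVED = {"head wounds": +1, "sniper deaths": 0}

def mismatches(prediction: dict[str, int], observed: dict[str, int]) -> list[str]:
    """Metrics where the model's predicted direction differs from what was observed."""
    return [m for m, direction in observed.items() if prediction.get(m) != direction]

for name, prediction in PREDICTIONS.items():
    print(name, "-> wrong about:", mismatches(prediction, OBSERVED))
```

Both models explain the rise in head wounds, so that metric alone cannot separate them; only the extra data point does.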

Drawing explicit models in this situation would have revealed other relevant data points, such as deaths, and solved the mystery.


 

Models in simulations and games
 

I’ll get to actual model building and selection a bit later, but I want to point out that building and visualizing models in this way is to some extent already very common in computer games and simulations.

In Civilization VI, it's not really debatable whether democracy, communism or fascism is the best way to cause prosperity within the rules set by the game. Millions of games of Civ VI have been played, and it's almost provable that in most cases, democracy is best for pursuing a culture victory, communism is best for a space colonization victory, and fascism for a military domination victory. Of course, there are specific situations in which democracy is better for a space colonization win, such as when getting lucky with the right ‘great person’ rolls, but that case is also predictable from the game state to those familiar with the game's model, i.e. experienced Civ VI players.

It's also not debatable whether helmets work or not inside games, as you usually know their exact armor rating and other relevant stats.

Funnily, efforts to build a simulation can be repurposed into a game because 1: simulating reality is hard and 2: simulations are fun. If I remember correctly, eRepublik started out as a political simulation, the developers decided it was too difficult, and turned it into a game/social experiment.

 

A game I particularly like for its explicit depiction of models is Democracy (4), in which you are put in the position of a policymaker, so you try to optimize the policies of your country. Well, the actual goal is just to stay in power, but I'd wager most people find it fun for playing around with policies, ideologies, and imagining parallels to the real world.

The model in Democracy is relatively clearly represented. If you want to increase the GDP, you would hover over GDP and see a visual representation of everything that affects GDP in-game. There are dozens of policy nodes and metrics that have an effect, and you can see how big of an effect they have. While a 95% income tax would have a directly negative effect on the GDP, in some cases there are many steps or nodes between a policy and its end effects, some of which feed back into one another over time. For example, allowing wiretapping helps your police crack down on organized crime syndicates, which results in less crime, which results in more business confidence, which results in more foreign investment, which results in higher GDP, among a dozen other things. GDP, in turn, increases the use of technology, which increases the annoyance of your citizens over being wiretapped.
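The wiretapping chain can be written as a signed graph and traversed to get the net direction of a policy's effect. This is a rough sketch and not how Democracy actually stores its model: only the signs are taken from the description above, the node names are paraphrased, and `net_sign` is a helper made up for the example.

```python
# Signed causal edges: +1 means "more of the cause gives more of the effect",
# -1 means "more of the cause gives less of the effect".
EDGES = {
    "wiretapping": [("crime", -1)],
    "crime": [("business confidence", -1)],
    "business confidence": [("foreign investment", +1)],
    "foreign investment": [("GDP", +1)],
    "GDP": [("technology use", +1)],
    "technology use": [("annoyance over wiretapping", +1)],
}

def net_sign(start, goal, seen=None):
    """Sign of the effect of `start` on `goal` along the first path found, or None."""
    if start == goal:
        return 1
    seen = (seen or set()) | {start}  # guards against walking around a loop
    for target, sign in EDGES.get(start, []):
        if target in seen:
            continue
        downstream = net_sign(target, goal, seen)
        if downstream is not None:
            return sign * downstream
    return None

print(net_sign("wiretapping", "GDP"))
print(net_sign("wiretapping", "annoyance over wiretapping"))
```

Two negative links in a row cancel out, which is why a policy that reduces crime ends up raising GDP several nodes later.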


 

It’s a thing of beauty

Main screen with all policy and variable nodes.



 

Expanded GDP node relationships, negatively affected by corruption

 

Expanded corruption node relationship, negatively affected by wiretapping



 

Wiretapping, reducing corruption and pissing off liberals

 

Models in games are directly useful, because they are guaranteed to be correct since the game, its rules and simulations are governed by the model itself (the game just finds an entertaining way to make you aware of it). This is not true for our reality until we figure physics out.



 

Models in “real life”

Even though they are not strictly speaking correct, models and games can still be useful in a real world scenario.
 

Taking simplified models is a useful tool for discussion.
Creating models is directly useful in predictable real world systems like factories, logistics and software architecture.

Neural networks are basically the science/art of creating models, though often not human readable ones, which is a technical problem that may or may not kill us all relatively soon.
 

I believe that if we had the will and the tools to explicitly create and share any arbitrary models, the practice of modelling could be useful in many more instances and fields.

Let's take a look at a debate that often causes a great deal of misunderstandings: “Should we use solar or nuclear power as the primary solution to climate change?”

The way to approach this via models is to first figure out what the relevant core values are, say environmental impact, cost and safety.

Then, we would branch out all the relevant aspects of the technology, recursively breaking each thread into separate subthreads until an uncontroversial, inseparable ‘atom’ of discussion is reached, and connect each node to its subnodes and, ultimately, to the core values. I call this process atomization.

 

The environmental impact of a traditional light water reactor comes from:

 

Now that the impact has been atomized into separate nodes, each can be analyzed and discussed separately. Note that, even though we branched aspects of environmental impact, each of the subnodes also relates to cost, but not necessarily safety (building the reactor has low environmental impact, high cost, and no meltdown safety concerns). 

Optimizing the model so it is understandable and reasonably accurate will probably not be achieved on the first try, but that is fine.

This process can be continued recursively, attaching data and multithreaded discussions to each node and relationship if needed. I would assume the environmental impact of building the reactor is not controversial, but nuclear waste is. Recursively branching out the node into smaller and less controversial claims will eventually force understanding to come out, as misunderstanding will have fewer places to hide in.
 

What is the risk if we spread out the nuclear waste in the atmosphere? That would be bad. What is the risk if we store it in a mountain? What is the risk if we store it in this specific mountain? Do we care if some specific area becomes uninhabitable if storage fails? Who cares, and how much? Can nuclear waste be turned into fuel in the future and be 100% used up so it stops existing, and how likely is that scenario?
 

Most of these question atoms have definite answers. 
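Atomization is naturally a tree, so a recursive structure is enough to sketch it. The `Node` class and its `atoms` method below are hypothetical, and the claims are loosely paraphrased from the questions above rather than a complete breakdown of the topic.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One claim in the model, optionally broken down into smaller claims."""
    claim: str
    children: list["Node"] = field(default_factory=list)

    def atoms(self) -> list[str]:
        """Return the leaf claims under this node, depth first."""
        if not self.children:
            return [self.claim]
        result = []
        for child in self.children:
            result.extend(child.atoms())
        return result

waste = Node("nuclear waste", [
    Node("risk if stored in a mountain", [
        Node("risk if stored in this specific mountain"),
        Node("who cares if the area becomes uninhabitable, and how much"),
    ]),
    Node("likelihood that waste can later be fully used up as fuel"),
])
impact = Node("environmental impact", [Node("building the reactor"), waste])

for atom in impact.atoms():
    print("-", atom)
```

A controversial node such as "nuclear waste" never shows up among the atoms; only the smaller questions it was broken into do.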
 

I’ve had countless discussions about these topics (surprisingly, supporting nuclear or renewables feels like it has a religious zeal attached to it), but only when forcing a system in which extreme depth of discussion is pursued has there been some, if minuscule, change of opinions.


 

Discuss4Real - useful discussions
 

Some method of judging the quality of the model and its intelligibility is needed. 

To facilitate mass participation, I suggest an in-depth voting system that rates relevant nodes and relationships on 3 axes: perceived truth of the facts, agreement, and quality of explanation.

 

For example, it may be true that nuclear waste in theory is dangerous for a million years, but if someone makes a subnode nuclear-waste->price connection by assuming armed 24/7 guard patrols for a million years, I would label that node and connection as Facts True and Disagree, since the facts and maths are true, but the proposed solution itself is stupid.
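Taking the three ratings to be perceived truth, agreement and quality of explanation (as this post names them), a vote and its aggregation might look like the sketch below. The `Vote` fields, the 1 to 5 clarity scale and the sample votes are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Vote:
    facts_true: bool  # are the stated facts and maths correct?
    agree: bool       # does the voter agree with the claim or proposal?
    clarity: int      # quality of explanation, 1 (unclear) to 5 (clear); scale is assumed

def summarize(votes: list[Vote]) -> dict[str, float]:
    """Average each axis over all votes on one node or relationship."""
    n = len(votes)
    return {
        "facts_true": sum(v.facts_true for v in votes) / n,
        "agree": sum(v.agree for v in votes) / n,
        "clarity": sum(v.clarity for v in votes) / n,
    }

# Made-up votes on the "armed guards for a million years" price connection.
guard_patrol_votes = [
    Vote(facts_true=True, agree=False, clarity=4),
    Vote(facts_true=True, agree=False, clarity=5),
    Vote(facts_true=True, agree=True, clarity=3),
    Vote(facts_true=False, agree=False, clarity=4),
]
print(summarize(guard_patrol_votes))
```

Keeping the axes separate is what lets a node read as "true but disputed" instead of collapsing into a single up/down score.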

This voting system should help steer humans in a more productive direction of discussion, as it will be clearer what exactly is controversial.


To recap, the proposed system is software for creating models by selecting “core values”, then recursively atomizing them into distinct nodes and weighted relationships. Visually, the models would look similar to the previous Democracy or helmet examples. Each node and relationship can have its own multithreaded discussion attached so its existence and properties can be questioned and justified through discourse (and providing sources where applicable).

 

Each of these subnodes and its relationships is rated on perceived truth, agreement and quality of explanation so that discussion efforts can be focused on what actually matters. Through discussion-guided changes and iteration, the models get more complex, and hopefully more accurate and less controversial.

The end product is a robust, more or less accurate, publishable model that clearly shows how an approach (like nuclear or renewable energy) affects the core values.

In theory, it's impossible to misunderstand each other when using such a model, as any source of misunderstanding will be made clear through the use of this infrastructure, and then that specific node or relationship can be further analyzed and discussed.

 

Now, all of this seems like an awful lot of work to use. And it is. But imagine what would happen if a system like this gets adopted, and its results shared, by even a few thousand users with diverse opinions. By looking up any topic in the public database of finished models, you can gain an understanding of the topic's model - how different approaches affect the relevant core values, and exactly what (if anything) is controversial (and why) - so you can easily update your own mental model with the distilled wisdom of modelling, discussing crowds.

Ideally, many of the currently controversial topics will, through using such a system, result in an accurate model that most relevant people can agree and act upon, thus making discussions more useful.


 

Modelling and rationality in institutions

The same software infrastructure, used in a different way, could help alleviate the problem of institutional decision inertia by documenting processes and their requirements/goals so that they can be reevaluated when conditions change.
 

How do humans make decisions? When faced with a new situation we have no previous knowledge of, we use first principles reasoning, basically starting from the most basic of basics and trying to figure out how to combine them to solve some problem.

Say we know basic math concepts, but not multiplication tables. 


Then we get a task: “3 * 3 = x. Find x”.

We can deduce that 3 * 3 = 3 + 3 + 3 = 9, just by using the first principles of basic math. Now that we know the solution, we can remember and cache it, to use that knowledge in further problems.
 

If we get a new problem, “3 ^ 3 = x. Find x”, we can immediately know that

3 ^ 3 = 3 * 3 * 3 = 9 * 3 = 9 + 9 + 9 = 27, skipping the 3 + 3 + 3 step that finds the 9 because we cached that result earlier to save time and energy, even though

3 ^ 3 = 3 * 3 * 3 = (3 + 3 + 3) * 3 = 9 * 3 = 9 + 9 + 9 = 27 is technically the more complete derivation.
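In code, this kind of caching is memoization. The sketch below uses Python's standard `functools.lru_cache`; the `calls` log and the `multiply` and `power` helpers are made up for the example, and `power` assumes an exponent of at least 1.

```python
from functools import lru_cache

calls = []  # every product that had to be worked out from first principles

@lru_cache(maxsize=None)
def multiply(a: int, b: int) -> int:
    """Multiply by repeated addition, i.e. from first principles."""
    calls.append((a, b))
    return sum(a for _ in range(b))

def power(base: int, exponent: int) -> int:
    """Raise `base` to `exponent` (>= 1) by repeated multiplication."""
    result = base
    for _ in range(exponent - 1):
        result = multiply(result, base)
    return result

multiply(3, 3)        # worked out as 3 + 3 + 3, then cached
answer = power(3, 3)  # reuses the cached 3 * 3 and only works out 9 * 3
print(answer, calls)
```

The log shows that 3 * 3 was derived only once, which is exactly the saving, and also exactly the place where a stale cache would go unnoticed.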

In this example, the conditions we cached before - the rules of mathematics - never change. In real life, however, they constantly do. Chattel slavery was one very good method for cotton production in the 1700s, but try that shit today and you will find your shareholders very unhappy, because mechanization is more effective, and societal norms have also changed, which would affect your brand image somewhat.
 

First principles thinking is very accurate but expensive, while thinking from cached conclusions is cheaper but runs the risk of the cache being wrong. To avoid this, the cache needs to be periodically verified/cleaned… Or better yet, some system should exist to notify you when the conditions for your cached decisions have changed so you can reevaluate them.

I believe that a good portion of bad institutional inertia is due to decisions that were good at the time of their implementation, that just get grandfathered in and never questioned afterwards. Laws with expiration dates would be an example of clearing the cache to solve that problem, but laws with modelled specific goals and explanations on how and why they will be achieved can be better tracked and revisited exactly when needed.

If decisions were expressed as models, the exact conditions on which they rely would be clear. For car factories, “Since storage costs money, and glass is easy to acquire, we do not need to store glass and will order it so it arrives Just In Time to gain efficiency”, while true, should not be cached as “Just In Time = more efficiency”, as it relies on materials being easy to acquire, which may be true for glass but not for computer chips, as we’ve seen with the pandemic and the resulting dip in car production.

The same kind of software described in the previous chapter can be used for explicit modelling of institutional decisions, by setting the desired goals and required conditions as ‘root’ values, then branching out decision and process nodes and their relationships from there. Using the JIT example, storage costs money and adds reliability if the thing stockpiled is not easy to obtain from multiple, fungible sources. Should we store glass? No, since it's easy to obtain from many factories. Should we store computer chips? Yes, since there are only a few factories producing them.
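A cached decision that carries its own assumptions can tell you when it needs revisiting. This is a minimal sketch under that idea; the `CachedDecision` class, the condition names and the `world` dictionary are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class CachedDecision:
    conclusion: str
    assumptions: dict[str, bool]  # conditions that held when the decision was made

    def stale_assumptions(self, world: dict[str, bool]) -> list[str]:
        """Names of assumptions whose current value differs from the cached one."""
        return [name for name, expected in self.assumptions.items()
                if world.get(name) != expected]

jit_glass = CachedDecision(
    "order glass just in time, do not stockpile",
    {"glass is easy to acquire": True},
)
jit_chips = CachedDecision(
    "order computer chips just in time, do not stockpile",
    {"chips are easy to acquire": True},
)

# Made-up snapshot of current conditions, e.g. during a chip shortage.
world = {"glass is easy to acquire": True, "chips are easy to acquire": False}

for decision in (jit_glass, jit_chips):
    stale = decision.stale_assumptions(world)
    status = f"REVISIT, no longer true: {stale}" if stale else "still valid"
    print(f"{decision.conclusion}: {status}")
```

The rule "Just In Time = more efficiency" is never stored on its own here; each decision is only as valid as the conditions attached to it.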

 

Making a somewhat accurate model requires deep understanding of the modelled issue, and initially most simplified models in many fields will simply be wrong. But if enough attempts are made, and skill in doing so is gained, eventually some of those models will be useful, and we will know exactly why they are useful, so they can be replicated in most similar situations.

To incentivize creating better models, teaching and implementing those models in institutions could be sold as a service by consultants who would use such a system.


 

Alignment maybe?

If such a system were used for discussions and for making institutional decisions, it would generate a vast amount of data about decision making, as well as effective and explicit explanations of mental models. That data could potentially be useful to an AI for learning to explain itself to humans, and to make higher level decisions.


 

Ok, so now what?

I tried building the infrastructure to solve this problem in my teens, when I had no skills, contacts or resources, and I mostly learned that those 3 are a positive modifier on my chances to do anything in the future.

The system at first was only supposed to be a tool for discussions through explicit modelling, then tested and refined enough to be useful for institutional decision-making. Over time, enough data about explicitly understandable high level decisions would be generated, and testing its usefulness to understandable AI would begin. However, I stopped working on it after Kialo launched (and I had to continue paying rent).
 

While Kialo does a… job of supplying discussion infrastructure through multithreading and voting systems, focusing on debate is treating the symptoms of misunderstandings and not the cause. Analysis, model building and visualization, in my opinion, force understanding and avoid most useless discussion in the first place.
 

The ideas described in this post seem like low hanging fruit, so it's unlikely that no one has thought of them before. Is there any obvious reason I am not seeing why a system like this would not be useful, other than requiring more effort to use?


 

The project was called Discuss4Real, and looked cringe.




 
