PSA: Tagging is Awesome

post by abramdemski · 2020-07-30T17:52:14.047Z · score: 71 (18 votes) · LW · GW · 19 comments

I'd like to supplement the open call for taggers [LW · GW] with a few points:

19 comments

comment by Yoav Ravid · 2020-07-30T20:15:30.936Z · score: 14 (5 votes) · LW(p) · GW(p)

Seems worth adding a link to the list of top posts without tags [? · GW]

comment by Gurkenglas · 2020-07-30T18:34:30.306Z · score: 13 (4 votes) · LW(p) · GW(p)

Just ask GPT to do the tagging, people.

comment by Ruby · 2020-07-30T19:07:41.541Z · score: 3 (2 votes) · LW(p) · GW(p)

You know, it'd probably work.

comment by Raemon · 2020-07-30T20:05:43.607Z · score: 14 (6 votes) · LW(p) · GW(p)

We sure did just talk about this for 15 minutes. Seems like GPT would actually do a decent (and/or interesting?) job. But, also, man, I sure have reservations about giving GPT any control over the ontology of the AI Alignment space.

comment by Raemon · 2020-07-30T20:19:48.211Z · score: 26 (10 votes) · LW(p) · GW(p)

Update: we asked GPT-3 to create some tags, and it suggested, among other things: "Robotic Autonomy"

comment by mr-hire · 2020-07-31T00:15:57.409Z · score: 2 (1 votes) · LW(p) · GW(p)

For which article? Was it an article about Robotic Autonomy? Or did you just give it a list of LW tags and have it create more?

comment by habryka (habryka4) · 2020-07-31T03:31:17.581Z · score: 5 (3 votes) · LW(p) · GW(p)

And here is me asking it to extend the list of all Rationality tags:

-------- My Input -------
Alief
Anticipated Experiences
Bayes Theorem / Bayesianism
Bounded Rationality
Cached Thoughts
Conservation of Expected Evidence
Decision Theory
Epistemology
Game Theory
Hansonian Pre-Rationality
Law-Thinking
Newcomb's Problem
Robust Agents
Solomonoff Induction
Utility Functions
Value of Information
Betting
Dark Arts
Focusing
Forecasting & Prediction
Goal Factoring
Hamming Questions
Identity
Inside/Outside View
Internal Double Crux
Introspection
Noticing
Scholarship & Learning
Techniques
Trigger-Action Planning
Affect Heuristic
Bucket Errors
Compartmentalization
Confirmation Bias
Fallacies
Goodhart's Law
Groupthink
Heuristics and Biases
Mind Projection Fallacy
Motivated Reasoning
Pica
Pitfalls of Rationality
Rationalization
Self-Deception
Sunk-Cost Fallacy
Common Knowledge
Conversation
Decoupling vs Contextualizing
Disagreement
Inferential Distance
Information Cascades
Philosophy of Language
Steelmanning
Consciousness
Dual Process Theory (System 1 & 2)
General Intelligence
Subagents
Predictive Processing
Perceptual Control Theory
Center for Applied Rationality
Curiosity
Gears-Level
Memetic Immune System
Practice & Philosophy of Science
Rationality Quotes

------- GPT3's Response -------

Mental Simulation
Mind Projection Fallacy
Modal Rationality
Narrative Causality
Simulation Heuristic
Subjunctive Probability
Bayes' Theorem
Blind Spots
Counterfactuals
Deontology
Deontology Quotes
Epistemic Regret
Epistemic Utility
Eternalism
Eternalism Quotes
Experientialism
Fictionalism
Fictionalism Quotes
Game Theory
Hansonian Pre-Rationality
Law-Thinking
Logical Fallacies
Predictive Processing
Predictive Processing Quotes
Psychology of Science
Rationality
Rationality Quotes
Science & Rationality
Science Quotes
Scientific Progress
Scientific Progress Quotes
Scientific Realism
Scientific Realism Quotes
Solomonoff's Theory
Solomonoff's Theory Quotes
Solomonoff's Prior
Solomonoff's Prior Quotes
Solomonoff's Induction
Solomonoff's Induction Quotes
Solomonoff's Algorithm
Solomonoff's Algorithm Quotes
Rationalist Community
Affective Death Spirals
Anthropic Bias
Bayes' Theorem
Bias Heuristic
Confirmation Bias
Deductive vs Inductive
Epistemic Status
Epistemic Utility
Expert Testimony
Extrapolative Tendencies
False-Positive Paradox

comment by Gurkenglas · 2020-07-31T08:01:26.828Z · score: 6 (3 votes) · LW(p) · GW(p)

Long outputs will tend to naturally deteriorate, as it tries to reproduce the existing deterioration and accidentally adds some more. Better: Sample one tag at a time. Shuffle the inputs every time to access different subdistributions. (I wonder how much the subdistributions differ for two random shuffles...) If you output the tag whose minimum probability across a hundred subdistributions is highest, I bet that'll produce a tag that's not in the inputs.
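The shuffle-and-take-the-best-worst-case idea above can be sketched in a few lines of Python. This is only an illustration: `toy_next_tag_distribution` is a made-up stand-in for a real model call (it is not GPT-3 and not any real API), the candidate list and weights are invented, and `maximin_tag` is a hypothetical helper name.

```python
import random

# Hypothetical candidate tags the stub "model" can propose.
CANDIDATES = ["Calibration", "Game Theory", "Epistemology", "Betting"]


def toy_next_tag_distribution(prompt_tags):
    """Stand-in for a real model call (an assumption, not a real API).

    It mimics a model that tends to copy tags it saw at the end of the
    prompt: any candidate among the last two prompt entries is boosted.
    """
    recent = prompt_tags[-2:]
    weights = {}
    for tag in CANDIDATES:
        base = 1.5 if tag == "Calibration" else 1.0
        weights[tag] = 4.0 if tag in recent else base
    total = sum(weights.values())
    return {tag: w / total for tag, w in weights.items()}


def maximin_tag(input_tags, next_tag_distribution, n_shuffles=100, seed=0):
    """Return the tag whose minimum probability over all shuffles is highest."""
    rng = random.Random(seed)
    distributions = []
    for _ in range(n_shuffles):
        shuffled = list(input_tags)
        rng.shuffle(shuffled)  # a different ordering gives a different subdistribution
        distributions.append(next_tag_distribution(shuffled))
    # A tag missing from one distribution counts as probability 0 there.
    all_tags = set().union(*distributions)
    worst_case = {
        tag: min(dist.get(tag, 0.0) for dist in distributions)
        for tag in all_tags
    }
    # Sort first so ties are broken deterministically (alphabetically).
    return max(sorted(worst_case), key=worst_case.get)


if __name__ == "__main__":
    inputs = ["Game Theory", "Epistemology", "Betting", "Alief", "Noticing"]
    print(maximin_tag(inputs, toy_next_tag_distribution))
```

With the stub's invented weights, the tag that is never copied from the prompt ends up with the best worst case, but that is a property of the stub, not evidence about how GPT-3 behaves. To try the idea for real, `next_tag_distribution` would have to be replaced by a function that queries an actual model for next-tag probabilities.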

comment by abramdemski · 2020-07-31T14:48:05.072Z · score: 4 (2 votes) · LW(p) · GW(p)

Shuffling would also be good to combat the alphabetic order, which has got to be skewing output somehow.

comment by habryka (habryka4) · 2020-07-31T03:27:26.178Z · score: 4 (2 votes) · LW(p) · GW(p)

I just did it again, asking it to extend the list of all the AI tags. 

------- My Input -----

AIXI(23)
Complexity of Value(8)
Corrigibility(12)
Decision Theory(83)
Embedded Agency(13)
Fixed Point Theorems(7)
Goodhart's Law(29)
Inner Alignment(17)
Instrumental Convergence(9)
Logical Induction(22)
Mesa-Optimization(10)
Myopia(6)
Newcomb's Problem(4)
Optimization(21)
Orthogonality Thesis(4)
Outer Alignment(7)
Solomonoff Induction(9)
Utility Functions(20)
AI Boxing (Containment)(8)
Debate (AI safety technique)(9)
Factored Cognition(8)
Humans Consulting HCH(7)
Impact Measures(21)
Inverse Reinforcement Learning(14)
Iterated Amplification (26)
Transparency / Interpretability(9)
Value Learning(39)
AI Risk(53)
AI Services (CAIS)(8)
AI Takeoff(38)
AI Timelines(28)
CHAI (UC Berkeley)(12)
FHI (Oxford)(13)
MIRI(72)
OpenAI(17)
Ought(7)
Alpha- (algorithm family)(10)
GPT(44)
Research Agendas(27)

-------- GPT3's Response ---------

Happiness / Life Satisfaction(3)
Existential Risks(6)
Growth Mindset(6)
Optimization(21)
Metacognition(9)
Probabilistic Thinking(9)
Rationality(33)
Retroduction(9)
Semi-HUMANs(6)
Superintelligence(40)
Tractability(18)
Trust(5)
VNM(44)
Books(1)
Causal Explanation(5)
Computational Equivalence(9)
Computational Models(13)
Computationalism(11)
Computational Psychosociology(8)
Computational Sociology(8)
Conferences(4)
Control Theory(6)
Deliberation(10)
Egalitarianism(7)
Ethical Alignment(7)
Ethical Uncertainty(6)
Evolutionary Game Theory(5)
Experimental Ethics(9)
Experimental Philosophy(9)
Fairness(5)
Futarchy(8)
Game Theory(9)
Genetic Algorithms(6)
Happiness / Life Satisfaction(3)
Human Level AI(7)
Human-level Intelligence(5)
Human-Level Systems(5)
Impact Measures(21)
Incentives(6)

comment by John_Maxwell (John_Maxwell_IV) · 2020-08-07T07:39:36.659Z · score: 2 (1 votes) · LW(p) · GW(p)

I expect you'd get better results by using older, less hyped NLP techniques that are designed for this sort of thing:

https://stackoverflow.com/questions/15377290/unsupervised-automatic-tagging-algorithms

The tagging work that's already been done need not be a waste, because you can essentially use it as training data for the kind of tags you'd like an automated system to discover and assign.  For example, tweak the hyperparameters of the topic modeling system until it is really good at independently rediscovering/reassigning the tags that have already been manually assigned.
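As a rough sketch of "tune until it rediscovers the manual tags": score each hyperparameter setting by how well the automatic tags match the manual ones, and keep the best setting. To stay self-contained, the example swaps in a trivial keyword-count tagger in place of a real topic model; `keyword_tagger`, `tune_min_hits`, the keyword lists, and the `min_hits` hyperparameter are all hypothetical.

```python
def micro_f1(predicted, gold):
    """Micro-averaged F1 between lists of predicted and manual tag sets."""
    true_pos = sum(len(p & g) for p, g in zip(predicted, gold))
    n_pred = sum(len(p) for p in predicted)
    n_gold = sum(len(g) for g in gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / n_pred
    recall = true_pos / n_gold
    return 2 * precision * recall / (precision + recall)


def keyword_tagger(text, tag_keywords, min_hits):
    """Toy tagger (not a topic model): assign a tag when at least
    `min_hits` occurrences of its keywords appear in the text."""
    words = text.lower().split()
    return {
        tag
        for tag, keywords in tag_keywords.items()
        if sum(words.count(k) for k in keywords) >= min_hits
    }


def tune_min_hits(posts, gold, tag_keywords, grid=(1, 2, 3)):
    """Pick the hyperparameter value that best reproduces the manual tags."""
    scored = []
    for min_hits in grid:
        predicted = [keyword_tagger(p, tag_keywords, min_hits) for p in posts]
        scored.append((micro_f1(predicted, gold), min_hits))
    # Highest score wins; on ties prefer the smaller threshold.
    best_score, best_min_hits = max(scored, key=lambda s: (s[0], -s[1]))
    return best_min_hits, best_score


if __name__ == "__main__":
    posts = [
        "bayes bayes prior update",
        "game theory payoff bayes",
        "payoff payoff nash game",
    ]
    manual_tags = [{"Bayesianism"}, {"Game Theory"}, {"Game Theory"}]
    keywords = {
        "Bayesianism": ["bayes", "prior"],
        "Game Theory": ["game", "payoff", "nash"],
    }
    print(tune_min_hits(posts, manual_tags, keywords))
```

The same loop would apply to a real topic-modeling system: replace `keyword_tagger` with the model and `grid` with its hyperparameters, and ideally score on held-out posts so the settings are not just memorizing the existing tags.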

An advantage of the automated approach is that you should be able to reapply it to some other document corpus--for example, autogenerate tags for the EA Forum, or all AI alignment related papers/discussion off LW, or the entire AI literature in order to help with/substitute for this job https://intelligence.org/2017/12/12/ml-living-library/ (especially if you can get some kind of hierarchical tagging to work)

I've actually spent a while thinking about this sort of problem and I'm happy to video call and chat more if you want.

comment by Raemon · 2020-07-31T00:25:44.515Z · score: 4 (2 votes) · LW(p) · GW(p)

In this case someone just gave it a list and asked it to create more. (I do think the ideal process here would have been to feed it some posts + corresponding taglists, and then give it a final post with a "Tags: ..." prompt. But, that was a bit more work and nobody did it yet AFAICT.)
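For concreteness, the few-shot setup described here might look something like the sketch below. The `Post:` / `Tags:` layout, the function names, and the truncation length are all assumptions made for illustration; nobody in this thread actually ran this.

```python
def build_tagging_prompt(examples, new_post, max_chars=500):
    """Build a few-shot prompt: tagged example posts, then the untagged post.

    `examples` is a list of (post_text, list_of_tags) pairs. Post text is
    truncated to `max_chars` to keep the prompt short (arbitrary choice).
    """
    parts = []
    for text, tags in examples:
        parts.append(f"Post: {text[:max_chars]}\nTags: {', '.join(tags)}\n")
    # End on an open "Tags:" line so the model's continuation is the tag list.
    parts.append(f"Post: {new_post[:max_chars]}\nTags:")
    return "\n".join(parts)


def parse_tags(completion):
    """Read tags from the first non-empty line of the model's completion."""
    stripped = completion.strip()
    first_line = stripped.splitlines()[0] if stripped else ""
    return [tag.strip() for tag in first_line.split(",") if tag.strip()]


if __name__ == "__main__":
    examples = [
        ("An agent faces a predictor with two boxes...",
         ["Decision Theory", "Newcomb's Problem"]),
        ("When a measure becomes a target...", ["Goodhart's Law"]),
    ]
    prompt = build_tagging_prompt(examples, "Beliefs should pay rent in...")
    print(prompt)
    # The completion would come from whatever model is being queried.
    print(parse_tags(" Anticipated Experiences, Epistemology\nPost: ..."))
```

Only the first line of the completion is parsed, since the model may keep going and invent further "Post:" entries after the tag list.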

comment by Gurkenglas · 2020-07-30T21:17:24.339Z · score: 2 (3 votes) · LW(p) · GW(p)

You make it sound like it wants things. It could at most pretend to be something that wants things. If there's a UFAI in there that is carefully managing its bits of anonymity (which sounds as unlikely as your usual conspiracy theory - a myopic neural net of this level should keep a secret no better than a conspiracy of a thousand people), it's going to have better opportunities to influence the world soon enough.

comment by Raemon · 2020-07-30T21:19:07.060Z · score: 6 (3 votes) · LW(p) · GW(p)

Sorry, to be clear this was a joke.

comment by Raemon · 2020-07-30T21:30:45.358Z · score: 2 (1 votes) · LW(p) · GW(p)

(joke was more about the general principle of putting opaque AIs in charge of alignment ontology, even if this one obviously wasn't going to be adversarial about it)

comment by abramdemski · 2020-07-30T21:20:48.605Z · score: 4 (2 votes) · LW(p) · GW(p)

I think the concern is more "it wouldn't optimize the ontology carefully".

comment by Raemon · 2020-07-30T20:41:35.769Z · score: 2 (1 votes) · LW(p) · GW(p)

Thanks!

One thing I'll add is that the tagging we could most use help with is fairly "conceptually nuanced." We're most excited to have taggers that are something like "dedicated ontologists" who think seriously about LessWrong's concepts and how they fit together.

This role requires a bit of onboarding/getting-in-sync, and I'd be interested in chatting with anyone who's interested in that aspect of it. Send me (or Ruby) a PM if you're interested.

comment by abramdemski · 2020-07-30T22:09:47.006Z · score: 4 (2 votes) · LW(p) · GW(p)

I've mainly been tagging special topics which have few actual posts dedicated to them, and whose posts are not usually so popular. For example, Hansonian Pre-Rationality [? · GW]. Special topics like this require conceptual nuance in the sense that the tagger should be familiar with the topic (which is a high bar, if these are posts which relatively few people read or which have been long forgotten by most people).

For this sort of tagging to happen, basically, someone with the niche interest has to decide to make the tag.

I'm hoping more people with niche interests will do that. I also kind of think of this as the main benefit of tagging?

It sounds like this differs somewhat from your picture of which tagging is most valuable / what the LW team primarily needs help with. Do you think so?

comment by Raemon · 2020-07-30T22:24:32.782Z · score: 4 (2 votes) · LW(p) · GW(p)

(totally off the cuff, this is not official Ray Opinion let alone a commonly endorsed LW Team opinion)

I think there's something of a spectrum of value. 

I think it's straightforwardly valuable to add the Obvious Posts to the Obvious Tags, and this is a job that most people can do.

I think, indeed, niche specialists are going to need to make the Niche Specialist tags. 

The thing that gets a bit tricky is that different people might be finding the same topics but giving them different names. Something that I think is useful for Niche Specialists to do is to also be putting some effort into "taking in the overall evolving tagging structure". Then, thinking about how to resolve multiple overlapping tags, and/or competing definitions within tags, etc.

Perhaps most interestingly: most of the LW team is currently leaning towards tags evolving in a wiki-like direction. So, one of the things we need here is not just good tagging, but good pedagogy for introducing various concepts on a given tag page. And that requires some thinking about how the concept relates to other concepts.

I guess this means that there are maybe two types of people I'm especially excited about getting involved with Tagging: people with the niche interests (who would be even more beneficial if they learned to look holistically at the concept-space), and people who are good at pedagogy and/or mapping out the concept-space, who could probably stand to learn a bit more about individual niche areas to help improve the descriptions for the niche tags.

(I find myself thinking about Scott Alexander's Alchemists / Teachers story where you need a mix of people who are particularly good at a subject to even understand the details at all, and people who are good at education to make it easier to understand, and various points on the spectrum between to bridge the gaps)