Roman Malov's Shortform

post by Roman Malov · 2024-12-19T21:14:54.805Z · LW · GW · 2 comments

2 comments

Comments sorted by top scores.

comment by Roman Malov · 2024-12-19T21:14:55.985Z · LW(p) · GW(p)

I recently prepared an overview lecture about research directions in AI alignment for the Moscow AI Safety Hub. I had limited time, so I did the following: I reviewed all the sites on the AI safety map, examined the 'research' sections, and attempted to classify the problems they tackle and the research paths they pursue. I encountered difficulties in this process, partly because most sites lack a brief summary of their activities and objectives (Conjecture is one of the counterexamples). I believe that the field of AI safety would greatly benefit from improved communication, and providing a brief summary of a research direction seems like low-hanging fruit.

comment by Roman Malov · 2025-04-16T21:59:38.788Z · LW(p) · GW(p)

Rule and Example

Rules can generate examples. For instance, DALLE-3 is a rule according to which different examples (images) are generated.

Rules can also be inferred from examples. For instance, given a sufficiently large dataset of images and their names, a DALLE-3 model can be trained on it.
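Both directions can be sketched with a toy case much smaller than an image model. The snippet below is only an illustration under assumed names: `rule`, `generate_examples`, and `infer_rule` are hypothetical helpers, and the "rule" is assumed to be a straight line so that ordinary least squares can recover it.

```python
def rule(x: float) -> float:
    """A toy rule (assumed for illustration): a straight line."""
    return 3.0 * x + 2.0

def generate_examples(n: int) -> list[tuple[float, float]]:
    """Rule -> examples: apply the rule to the inputs 0, 1, ..., n - 1."""
    return [(float(x), rule(float(x))) for x in range(n)]

def infer_rule(examples: list[tuple[float, float]]) -> tuple[float, float]:
    """Examples -> rule: least-squares fit of slope and intercept."""
    n = len(examples)
    mean_x = sum(x for x, _ in examples) / n
    mean_y = sum(y for _, y in examples) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in examples)
    variance = sum((x - mean_x) ** 2 for x, _ in examples)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

examples = generate_examples(5)
slope, intercept = infer_rule(examples)
print(examples)
print(slope, intercept)
```

Here the inferred slope and intercept match the original rule because the examples are noise-free and the model family (a line) was chosen to contain the rule; real training has neither luxury.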

In computer science, there is a concept called Kolmogorov complexity of data. It is (roughly) defined as the length of the shortest program capable of producing that data.

Some data are simple and can be compressed easily; some are complex and harder to compress. In a sense, the task of machine learning is to find a program of a given size that serves as a "compression" of the dataset.
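Kolmogorov complexity itself is uncomputable, but the size of a compressed file gives a crude upper-bound proxy for it. The sketch below uses Python's standard `zlib` module for that purpose; the helper name `compressed_size` and the two sample inputs are assumptions made for illustration.

```python
import os
import zlib

def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data.

    This is only an upper-bound proxy for Kolmogorov complexity:
    a cleverer program could describe the data more briefly.
    """
    return len(zlib.compress(data, 9))

# 10,000 bytes produced by a tiny rule: repeat "ab" 5,000 times.
simple = b"ab" * 5000

# 10,000 bytes from the OS random source: almost surely no short description.
noisy = os.urandom(10_000)

print(len(simple), compressed_size(simple))
print(len(noisy), compressed_size(noisy))
```

The repetitive input shrinks to a small fraction of its length, while the random input does not shrink at all (it typically grows by a few bytes of format overhead).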

In the real world, although knowing the underlying rule is often very useful, sometimes it is more practical to use a giant look-up table (GLUT) of examples. Sometimes you need to memorize the material instead of trying to "understand" it.

Sometimes an example is more complex than the rule that generated it. For example, the interval [0;1] is quite easy to describe (the rule being: all numbers not greater than 1 and not less than 0), yet it contains a number whose digits encode all the works of Shakespeare, and that number definitely cannot be compressed to a description comparable to that of the interval [0;1].

Or consider a program that outputs every natural number from 1 to some huge number N. The program is very short, because the Kolmogorov complexity of N is low, yet at some point it will produce a binary encoding of LOTR. In that case, the complexity lies in the starting index: the map for finding the needle in the haystack is as valuable (and as complex) as the needle itself.
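The needle-and-index point can be made concrete with a small sketch. The encoding below is an assumption chosen for illustration (UTF-8 bytes read as a big-endian integer, with a leading 0x01 byte so that no leading zeros are lost), and `text_to_index`, `index_to_bytes`, and `find_by_counting` are hypothetical helper names.

```python
def text_to_index(text: str) -> int:
    """The natural number at which a counting program 'produces' text."""
    return int.from_bytes(b"\x01" + text.encode("utf-8"), "big")

def index_to_bytes(index: int) -> bytes:
    """Decode a natural number back to bytes (empty if the sentinel is absent)."""
    raw = index.to_bytes((index.bit_length() + 7) // 8, "big")
    return raw[1:] if raw[:1] == b"\x01" else b""

def find_by_counting(needle: str) -> int:
    """Count upward from 1 until the current number decodes to needle."""
    target = needle.encode("utf-8")
    n = 1
    while index_to_bytes(n) != target:
        n += 1
    return n

# Brute-force counting is only feasible for a tiny needle.
print(find_by_counting("hi"))

# For a longer needle, compute the index directly and compare sizes.
needle = "To be, or not to be"
index = text_to_index(needle)
print(index.bit_length(), 8 * len(needle.encode("utf-8")))
```

The counting loop is a few lines long no matter what it is asked to find, but the index it stops at needs one more bit than the needle itself: writing down where to look costs as much as writing down the text.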

Properties follow from rules. It is not necessary to know every example of a rule in order to have some information about all of them. Moreover, all the examples taken together can carry less information (have lower Kolmogorov complexity) than the sum of their individual Kolmogorov complexities (as in the example above).
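For the counting program, this gap can be written out roughly as follows. This is a sketch, not a careful theorem: N here stands for the upper limit of the counting program, K is Kolmogorov complexity, the angle brackets denote some fixed encoding of the whole list as one string, and the constants depend on the chosen description language.

```latex
K(\langle 1, 2, \dots, N \rangle) \le K(N) + O(1),
\qquad
\sum_{i=1}^{N} K(i) \ge \frac{N}{2}\,(\log_2 N - 1).
```

The first bound holds because a description of N plus a constant-size loop prints the whole list. The second follows from counting: fewer than N/2 programs are shorter than log_2 N - 1 bits, so more than half of the numbers up to N each need at least that many bits on their own.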