FactorialCode's Shortform 2019-07-30T22:53:24.631Z · score: 1 (1 votes)


Comment by factorialcode on Have epistemic conditions always been this bad? · 2020-02-03T23:27:53.011Z · score: 3 (2 votes) · LW · GW

Wei_Dai, in the past

A bit off topic, but does LW have username pinging?

Comment by factorialcode on FactorialCode's Shortform · 2020-01-30T07:24:16.378Z · score: 2 (2 votes) · LW · GW

I think for it to work you'd definitely need to do it on a larger scale. When you go on a cruise, you pay money to abstract away all of the upkeep associated with operating a boat.

I read the post and did some more research. The closest analog to what I'm thinking looks to be Google's barge project, which encountered some regulatory barriers during the construction phase. However, the next closest thing is this startup, and they seem to be the real deal. With regard to what you brought up:

It's pretty hard to get permits for large-berth ships (and not that easy for small berth ships either)

Correct me if I'm wrong, but AFAICT most regulation is for parking a boat at a berth. I don't think the permits are nearly as strict if you are out in the bay. I don't think coastal housing can scale. That's why I mentioned the amphibious vehicles. Or more realistically, a ferry to move people to and from the housing ship.

Ships have all kinds of hidden costs you don't notice at first

There are huge upfront costs to figuring out how to go about this

Yeah, there's no getting around that. It's the kind of thing that you contract out to a maritime engineering firm. $10 million for the ship, maybe $5 million for the housing units, maybe another $5 million to pay an engineering firm to design the thing to comply with relevant regulations. Throw in another $5 million to assemble the thing. Then who knows how much to cut through all the red tape. However, rents keep rising significantly faster than inflation, and condos in SF seem to be on the order of ~$1 million per bedroom. I think you could easily recoup your costs after deploying a single 60-120 bedroom ship.

It seems likely that if you attempt to do this at scale, you'll just trigger some kind of government response that renders your operation not-much-cheaper, so you pay the upfront costs but not reap the scale benefits

I think this is the big one: rent-seekers are going to do everything they can to stop you if you're a credible threat. I really don't know how politics works at that level. I'd imagine you'd need to appease the right people and make PR moves that make your opponents look like monsters. Then again, if you go with the cargo ship model, it's not like you've lost your capital investment; you can just pack up and move to a different city anywhere in the world. You can also spread a fleet around multiple cities across the world so as to avoid triggering said response while building up credibility/power.

Comment by factorialcode on FactorialCode's Shortform · 2020-01-30T03:14:04.854Z · score: 4 (3 votes) · LW · GW

Here's an idea. Buy a container ship, and retrofit it with amphibious vehicles, shipping container houses, and associated utility and safety infrastructure. Then take it to any major coastal city and rent the shipping container houses at a fraction of the price of the local rent.

You could also convert some of the space into office space and stores.

Assuming people can live in a single 40' shipping container, the price per person should be minimal. You can buy some pretty big old ships for less than individual houses, and we can probably upper-bound the cost per unit by looking at cruise ship berth prices.

The best part? You can do all your construction where wages are cheap and ship the apartment anywhere it's needed.

Comment by factorialcode on Raemon's Scratchpad · 2020-01-29T06:31:19.183Z · score: 1 (1 votes) · LW · GW

Has there been any discussion about showing the up/down vote counts? I know reddit used to do it a long time ago. I don't know why they stopped though.

Comment by factorialcode on Using vector fields to visualise preferences and make them consistent · 2020-01-29T03:45:15.499Z · score: 3 (3 votes) · LW · GW

You can work around this by making your "state space" descriptions of sequences of states. And defining preferences between these sequences.

Comment by factorialcode on Using vector fields to visualise preferences and make them consistent · 2020-01-29T03:44:58.398Z · score: 10 (6 votes) · LW · GW

I think you can extend this idea to graphs.

Comment by factorialcode on Algorithms vs Compute · 2020-01-29T03:31:26.536Z · score: 5 (3 votes) · LW · GW

I think they would both suck, honestly. Many things have changed in 20 years: datasets, metrics, and architectures are all significantly different.

I think the relationship between algorithms and compute looks something like this.

For instance, look at language models. Twenty years ago, LSTMs had only been introduced 3 years prior, and people mainly used n-gram Markov models for language modelling. N-grams don't really scale, and training a transformer using only as many resources as you need to train an n-gram model would probably not work at all. In fact, I don't think you even really "train" an n-gram model.
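To make the "you don't really train it" point concrete, here's a toy sketch (my own illustration, not from any of the sources above): fitting a bigram language model is just counting co-occurrences in one pass over the corpus, with no gradient descent involved.

```python
from collections import Counter, defaultdict

# A bigram model is a table of conditional frequencies, built in one pass.
corpus = "the cat sat on the mat the cat ran".split()

counts = defaultdict(Counter)
for prev, word in zip(corpus, corpus[1:]):
    counts[prev][word] += 1

def p(word, prev):
    """P(word | prev) by maximum likelihood (no smoothing)."""
    total = sum(counts[prev].values())
    return counts[prev][word] / total if total else 0.0

print(p("cat", "the"))  # 2/3: "the" is followed by "cat" twice, "mat" once
```

Scaling this up just means counting longer contexts over more text, which is exactly why the method hits a wall: the table grows combinatorially while most contexts are never observed.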

The same goes for computer vision. SVMs using the kernel trick have terrible scaling properties (O(N^3) in the number of datapoints), but until compute increased, they worked better. See the last slide here.

You often hear the complaint that the algorithms we use were invented 50 years ago, and many NN techniques fall in and out of fashion.

I think this is all because of the interactions between algorithms and compute/data. The best algorithm for the job changes as a function of compute, so as compute grows, new methods that previously weren't competitive suddenly start to outperform older methods.

I think this is a general trend in much of CS. Look at matrix multiplication. The naive algorithm has a small constant overhead, but N^3 scaling. You can use group theory to come up with algorithms that have better scaling, but have a larger overhead. As compute grows, the best matmul algorithm changes.
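Here's a minimal numerical sketch of that crossover. The cost-model constants below are invented for illustration (real crossover points depend heavily on hardware and implementation), but the shape of the argument is exactly the matmul case: a low-overhead O(n^3) algorithm versus a high-overhead algorithm with better asymptotics, like Strassen's O(n^log2(7)).

```python
import math

# Illustrative cost models for multiplying two n x n matrices.
# Constants are made up; only the scaling exponents are real.
NAIVE_OVERHEAD = 1.0      # small constant factor, O(n^3) scaling
STRASSEN_OVERHEAD = 40.0  # larger constant factor, O(n^2.807) scaling

def naive_cost(n):
    return NAIVE_OVERHEAD * n**3

def strassen_cost(n):
    return STRASSEN_OVERHEAD * n**math.log2(7)  # log2(7) ≈ 2.807

def best_algorithm(n):
    return "naive" if naive_cost(n) <= strassen_cost(n) else "strassen"

# For small problems the low-overhead naive algorithm wins;
# once n is large enough, the better asymptotics dominate.
print(best_algorithm(10))     # naive
print(best_algorithm(10**9))  # strassen
```

The analogy to ML methods is that "available compute" plays the role of n: as it grows, the best algorithm for the job flips, even though neither algorithm changed.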

Comment by factorialcode on Have epistemic conditions always been this bad? · 2020-01-27T19:03:20.971Z · score: 20 (6 votes) · LW · GW

It confuses me that I seem to be the first person to talk much about this on either LW or EA Forum, given that there must be people who have been exposed to the current political environment earlier or to a greater extent than me. On the other hand, all my posts/comments on the subject have generally been upvoted on both forums, and nobody has specifically said that I'm being too alarmist. One possible explanation for nobody else raising an alarm about this is that they're afraid of the current political climate and they're not as "cancel-proof" as I am, or don't feel that they have as much leeway to talk about politics-adjacent issues here as I do.

I think Scott put it best when he said:

No, you don’t understand. It’s not just the predictable and natural reputational consequences of having some embarrassing material in a branded space. It’s enemy action.

Every Twitter influencer who wants to profit off of outrage culture is going to be posting 24-7 about how the New York Times endorses pedophilia. Breitbart or some other group that doesn’t like the Times for some reason will publish article after article on New York Times‘ secret pro-pedophile agenda. Allowing any aspect of your brand to come anywhere near something unpopular and taboo is like a giant Christmas present for people who hate you, people who hate everybody and will take whatever targets of opportunity present themselves, and a thousand self-appointed moral crusaders and protectors of the public virtue. It doesn’t matter if taboo material makes up 1% of your comment section; it will inevitably make up 100% of what people hear about your comment section and then of what people think is in your comment section. Finally, it will make up 100% of what people associate with you and your brand. The Chinese Robber Fallacy is a harsh master; all you need is a tiny number of cringeworthy comments, and your political enemies, power-hungry opportunists, and 4channers just in it for the lulz can convince everyone that your entire brand is about being pro-pedophile, catering to the pedophilia demographic, and providing a platform for pedophile supporters. And if you ban the pedophiles, they’ll do the same thing for the next-most-offensive opinion in your comments, and then the next-most-offensive, until you’ve censored everything except “Our benevolent leadership really is doing a great job today, aren’t they?” and the comment section becomes a mockery of its original goal.

So let me tell you about my experience hosting the Culture War thread.

(“hosting” isn’t entirely accurate. The Culture War thread was hosted on the r/slatestarcodex subreddit, which I did not create and do not own. I am an honorary moderator of that subreddit, but aside from the very occasional quick action against spam nobody else caught, I do not actively play a part in its moderation. Still, people correctly determined that I was probably the weakest link, and chose me as the target.)

People settled on a narrative. The Culture War thread was made up entirely of homophobic transphobic alt-right neo-Nazis. I freely admit there were people who were against homosexuality in the thread (according to my survey, 13%), people who opposed using trans people’s preferred pronouns (according to my survey, 9%), people who identified as alt-right (7%), and a single person who identified as a neo-Nazi (who as far as I know never posted about it). Less outrageous ideas were proportionally more popular: people who were mostly feminists but thought there were differences between male and female brains, people who supported the fight against racial discrimination but thought there could be genetic differences between races. All these people definitely existed, some of them in droves. All of them had the right to speak; sometimes I sympathized with some of their points. If this had been the complaint, I would have admitted to it right away. If the New York Times can’t avoid attracting these people to its comment section, no way r/ssc is going to manage it.

But instead it was always that the thread was “dominated by” or “only had” or “was an echo chamber for” homophobic transphobic alt-right neo-Nazis, which always grew into the claim that the subreddit was dominated by homophobic etc neo-Nazis, which always grew into the claim that the SSC community was dominated by homophobic etc neo-Nazis, which always grew into the claim that I personally was a homophobic etc neo-Nazi. I am a pro-gay Jew who has dated trans people and votes pretty much straight Democrat. I lost distant family in the Holocaust. You can imagine how much fun this was for me.

People would message me on Twitter to shame me for my Nazism. People who linked my blog on social media would get replies from people “educating” them that they were supporting Nazism, or asking them to justify why they thought it was appropriate to share Nazi sites. I wrote a silly blog post about mathematics and corn-eating. It reached the front page of a math subreddit and got a lot of upvotes. Somebody found it, asked if people knew that the blog post about corn was from a pro-alt-right neo-Nazi site that tolerated racists and sexists. There was a big argument in the comments about whether it should ever be acceptable to link to or read my website. Any further conversation about math and corn was abandoned. This kept happening, to the point where I wouldn’t even read Reddit discussions of my work anymore. The New York Times already has a reputation, but for some people this was all they’d heard about me.

Some people started an article about me on a left-wing wiki that listed the most offensive things I have ever said, and the most offensive things that have ever been said by anyone on the SSC subreddit and CW thread over its three years of activity, all presented in the most damning context possible; it started steadily rising in the Google search results for my name. A subreddit devoted to insulting and mocking me personally and Culture War thread participants in general got started; it now has over 2,000 readers. People started threatening to use my bad reputation to discredit the communities I was in and the causes I cared about most.

Some people found my real name and started posting it on Twitter. Some people made entire accounts devoted to doxxing me in Twitter discussions whenever an opportunity came up. A few people just messaged me letting me know they knew my real name and reminding me that they could do this if they wanted to.

Some people started messaging my real-life friends, telling them to stop being friends with me because I supported racists and sexists and Nazis. Somebody posted a monetary reward for information that could be used to discredit me.

One person called the clinic where I worked, pretended to be a patient, and tried to get me fired.

Many of the users on LW have their real names and reputations attached to this website. If LW were to come under this kind of loosely coordinated memetic attack, many people would find themselves harassed and their reputations and careers could easily be put in danger. I don't want to sound overly dramatic, but the entire truth seeking and AI safety project could be hampered by association.

That's why, even though I remain anonymous, I think it's best if I refrain from discussing these topics at anything except the meta level on LW. Even having this discussion strikes me as risky. That doesn't mean we shouldn't discuss these topics at all, but it needs to be in a place like r/TheMotte where there is no attack vector. This includes using different usernames so we can't be traced back here. Even then, Reddit's AEO team and the admins are technically weak points.

Comment by factorialcode on Have epistemic conditions always been this bad? · 2020-01-27T18:13:18.953Z · score: 3 (2 votes) · LW · GW

I think much of it can be attributed to the Eternal September. The quality of discussion on the internet has declined steadily since its inception. The barriers to entry have steadily lowered, and I think those barriers selected for people who valued better epistemic conditions. Now that anyone can participate, the overall quality of discussion has gone down the drain. Furthermore, I think platforms have adapted to appeal to a general population that does not value good epistemic conditions. (Compare the Reddit redesign to old Reddit with custom CSS turned off.)

A look at history will show that people have always had terrible epistemic practices. The major Abrahamic religions all had components that encouraged their members to spread the religion through violence and demanded unquestioning faith from believers. Communist Russia had Lysenkoism. Propaganda posters during World War II, from all sides, were straightforward appeals to emotion.

So things have probably always been around this bad. However, I don't know how the presence of the internet and social media will change things compared to before.

Comment by factorialcode on [AN #83]: Sample-efficient deep learning with ReMixMatch · 2020-01-23T01:49:13.862Z · score: 3 (2 votes) · LW · GW

I'm surprised at how simple the FixMatch paper is. I wonder how sensitive the method is to all the hyperparameters it needs for the pseudo-labeling and the data augmentation.

Comment by factorialcode on Embedded Agents · 2020-01-18T01:04:49.470Z · score: 8 (4 votes) · LW · GW

In the comments of this post, Scott Garrabrant says:

I think that Embedded Agency is basically a refactoring of Agent Foundations in a way that gives one central curiosity based goalpost, rather than making it look like a bunch of independent problems. It is mostly all the same problems, but it was previously packaged as "Here are a bunch of things we wish we understood about aligning AI," and is repackaged as "Here is a central mystery of the universe, and here are a bunch of things we don't understand about it." It is not a coincidence that they are the same problems, since they were generated in the first place by people paying close attention to what mysteries of the universe related to AI we haven't solved yet.

This entire sequence has made that clear for me. Most notably it has helped me understand the relationship between the various problems in decision theory that have been discussed on this site for years, along with their proposed solutions such as TDT, UDT, and FDT. These problems are a direct consequence of agents being embedded in their environments.

Furthermore, it's made me think more clearly about some of my high-level models of ideal AI and RL systems. For instance, the limitations of the AIXI framework and some of its derivatives have become clearer to me.

Comment by factorialcode on Underappreciated points about utility functions (of both sorts) · 2020-01-08T15:46:10.143Z · score: 3 (2 votes) · LW · GW

By "a specific gamble" do you mean "a specific pair of gambles"? Remember, preferences are between two things! And you hardly need a utility function to express a preference between a single pair of gambles.

That's true. In that case, the preferences would only be defined between a specific subset of gambles.

I don't understand how to make sense of what you're saying. Agent's preferences are the starting point -- preferences as in, given a choice between the two, which do you pick? It's not clear to me how you have a notion of preference that allows for this to be undefined (the agent can be indifferent, but that's distinct).

I mean, you could try to come up with such a thing, but I'd be pretty skeptical of its meaningfulness. (What happens if you program these preferences into an FAI and then it hits a choice for which its preference is undefined? Does it act arbitrarily? How does this differ from indifference, then? By lack of transitivity, maybe? But then that's effectively just nontransitive indifference, which seems like it would be a problem...)

I think you should be able to set things up so that you never encounter a pair of gambles where this is undefined. I'll illustrate with an example. Suppose you start with a prior over the integers, such that:

p(n) = C/F(n), where F(n) is a function that grows really fast and C is a normalization constant. Then the set of gambles that we're considering would be posteriors on the integers given that they obey certain properties. For instance, we could ask the agent to choose between the posterior over integers given that n is odd vs the posterior given that n is even.

I'm pretty sure that you can construct an agent that behaves as if it had an unbounded utility function in this case. So long as the utility associated with an integer n grows sufficiently slower than F(n), all expectations over posteriors on the integers should be well defined.

If you were to build an FAI this way, it would never end up in a belief state where the expected utility diverges between two outcomes. The expected utility would be well defined over any posterior on its prior, so its choice given a pair of gambles would also be well defined for any belief state it could find itself in.
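Here's a numerical sketch of the construction. The specific choices are mine for illustration: F(n) = n! (fast-growing) and the unbounded utility u(n) = n, which grows far slower than F(n), so every conditional expectation converges and the odd-vs-even comparison is well defined.

```python
import math

# Prior p(n) ∝ 1/n! over n >= 1, unbounded utility u(n) = n.
N = 50  # truncation point; 1/n! decays so fast the tail is negligible

def expected_utility(predicate):
    """E[u(n) | predicate(n)] under the prior p(n) ∝ 1/n!."""
    mass = sum(1 / math.factorial(n) for n in range(1, N) if predicate(n))
    total = sum(n / math.factorial(n) for n in range(1, N) if predicate(n))
    return total / mass

eu_odd = expected_utility(lambda n: n % 2 == 1)
eu_even = expected_utility(lambda n: n % 2 == 0)

# Both expectations are finite despite u being unbounded, so the agent's
# choice between the "odd" gamble and the "even" gamble is well defined.
print(eu_odd, eu_even)
```

(For this particular choice the sums have closed forms: E[n | odd] = cosh(1)/sinh(1) and E[n | even] = sinh(1)/(cosh(1) - 1), which the truncated sums match to high precision.)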

Comment by factorialcode on TurnTrout's shortform feed · 2020-01-07T17:01:17.079Z · score: 1 (1 votes) · LW · GW

Probably 3-5 years then. I'd use it to get a stronger foundation in low-level programming skills, math, and physics. The limiting factors would be entertainment in the library to keep me sane and the inevitable degradation of my social skills from so much time spent alone.

Comment by factorialcode on TurnTrout's shortform feed · 2020-01-06T19:38:59.584Z · score: 1 (1 votes) · LW · GW

How good are the computers?

Comment by factorialcode on FactorialCode's Shortform · 2020-01-06T19:36:11.598Z · score: 8 (3 votes) · LW · GW

I've been thinking about arxiv-sanity lately and I think it would be cool to have a sort of "LW papers" where we share papers that are relevant to the topics discussed on this website. I know that's what link posts are supposed to be for, but I don't want to clutter up the main feed. Many of the papers I want to share are related to the topics we discuss, but I don't think they're worthy of their own posts.

I might start linking papers in my short-form feed.

Comment by factorialcode on What were the biggest discoveries / innovations in AI and ML? · 2020-01-06T16:22:14.363Z · score: 6 (4 votes) · LW · GW

- Reverse-mode autodiff

- Using GPUs for computation

These are the two big ones. Yes, there are some others, but those two ideas together are the backbone of the current AI and ML boom.
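For the first one, here's a toy scalar reverse-mode autodiff sketch (my own illustration of the core idea, not how any real framework implements it): record the computation graph on the forward pass, then push gradients backward with the chain rule, which is exactly backpropagation.

```python
class Var:
    """A scalar that remembers how it was computed."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs of (parent_var, local_gradient)
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def backward(self, seed=1.0):
        # Naive per-path recursion: correct because gradients accumulate,
        # though real frameworks topologically sort the graph instead.
        self.grad += seed
        for parent, local_grad in self.parents:
            parent.backward(seed * local_grad)

# f(x, y) = x * y + x, so df/dx = y + 1 and df/dy = x
x, y = Var(3.0), Var(4.0)
f = x * y + x
f.backward()
print(f.value, x.grad, y.grad)  # 15.0 5.0 3.0
```

The key property is that one backward pass yields the gradient with respect to every input at once, which is what makes gradient descent on millions of parameters affordable.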

Comment by factorialcode on Underappreciated points about utility functions (of both sorts) · 2020-01-04T22:28:26.514Z · score: 1 (1 votes) · LW · GW

you can't have both unbounded utility functions, and meaningful expected utility comparisons for infinite gambles

Are you sure about this? My intuition is that there should be a formalism where the utility functions can be unbounded so long as their expectations remain well defined and bounded. The price to pay is obviously that your agent won't have well-defined preferences for all gambles, but often we're talking about an agent's preferences on a specific gamble or subset of gambles; in those cases, certain unbounded utilities should be fine.

Comment by factorialcode on [AN #80]: Why AI risk might be solved without additional intervention from longtermists · 2020-01-04T04:16:44.959Z · score: 5 (4 votes) · LW · GW

My impression is that people working on self-driving cars are incredibly safety-conscious, because the risks are very salient.

Safety-conscious people working on self-driving cars don't program their cars to not take evasive action after detecting that a collision is imminent.

(It's notable to me that this doesn't already happen, given the insane hype around AI.)

I think it already has. (It was for extra care, not drugs, but it's a clear-cut case of a misspecified objective function leading to suboptimal decisions for a multitude of individuals.) I'll note, perhaps unfairly, that the fact that this study was not salient enough to come to your attention, even with a culture war signal boost, is evidence that it needs to be a Chernobyl-level event.

Comment by factorialcode on [AN #80]: Why AI risk might be solved without additional intervention from longtermists · 2020-01-04T03:24:45.945Z · score: 1 (1 votes) · LW · GW

Why isn't the threat clear once the problems are discovered?

I think I should be more specific, when you say:

Suppose that we had extremely compelling evidence that any AI system run with > X amount of compute would definitely kill us all. Do you expect that problem to get swept under the rug?

I mean that no one sane who knows that will run that AI system with > X amount of computing power. When I wrote that comment, I also thought that anyone sane would blow the whistle in that event. See my note at the end of this comment.*

However, when presented with that evidence, I don't expect the AI community to react appropriately. The correct response to that evidence is to stop what you're doing and revisit the entire process and culture that led to the creation of an algorithm that will kill us all if run with > X amount of compute. What I expect will happen is that the AI community will try to solve the problem the same way it's solved every other problem it has encountered: it will try an inordinate number of unprincipled hacks to get around the issue.

Part of my claim is that we probably will get that (assuming AI really is risky), though perhaps not Chernobyl-level disaster, but still something with real negative consequences that "could be worse".

Conditional on no FOOM, I can definitely see plenty of events with real negative consequences that "could be worse". However, I claim that anything short of a Chernobyl-level event won't shock the community and the world into changing its culture or trying to coordinate. I also claim that the capabilities gap between a Chernobyl-level event and a global catastrophic event is small, such that even in a non-FOOM scenario the former might not happen before the latter. Together, I think there is a high probability that we will not get a disaster scary enough to get the AI community to change its culture and coordinate before it's too late.

*Now that I think about it more, though, I'm less sure. Undergraduate engineers get entire lectures dedicated to how and when to blow the whistle when faced with unethical corporate practices and dangerous projects or designs. When working, they also have insurance and some degree of legal protection from vengeful employers. Even then, you still see cover-ups of shortcomings that lead to major industrial disasters. For instance, long before the disaster, someone had determined that the Fukushima plant was indeed vulnerable to large tsunami impacts. The pattern where someone knows that something will go wrong, but nothing is done to prevent it for one reason or another, is not that uncommon in engineering disasters. Regardless of whether this is due to hindsight bias or an inadequate process for addressing safety issues, these disasters still happen regularly in fields with far more conservative, cautious, and safety-oriented cultures.

I find it unlikely that the field of AI will change its culture from one of moving fast and hacking to something even more conservative and cautious than the cultures of consumer aerospace and nuclear engineering.

Comment by factorialcode on [AN #80]: Why AI risk might be solved without additional intervention from longtermists · 2020-01-04T01:39:45.103Z · score: 2 (2 votes) · LW · GW

My worry is less that we wouldn't survive AI-Chernobyl as much as it is that we won't get an AI-Chernobyl.

I think that this is where there's a difference in models. Even in a non-FOOM scenario, I'm having a hard time envisioning a world where the gap in capabilities between AI-Chernobyl and global catastrophic UFAI is that large. I used Chernobyl as an example because it scared the public and the industry into making things very safe. It had a lot going for it to make that happen. Radiation is invisible and hurts you by either killing you instantly, making your skin fall off, or giving you cancer and birth defects. The disaster was also extremely expensive, with total costs on the order of 10^11 USD.

If a defective AI system manages to do something that instils the same level of fear into researchers and the public as Chernobyl did, I would expect that we were on the cusp of building systems that we couldn't control at all.

If I'm right and the gap between those two events is small, then there's a significant risk that nothing will happen in that window. We'll get plenty of warnings that won't be sufficient to instil the necessary level of caution into the community, and later down the road we'll find ourselves in a situation we can't recover from.

Comment by factorialcode on [AN #80]: Why AI risk might be solved without additional intervention from longtermists · 2020-01-03T05:02:48.876Z · score: 7 (4 votes) · LW · GW

I agree that ML often does this, but only in situations where the results don't immediately matter. I'd find it much more compelling to see examples where the "random fix" caused actual bad consequences in the real world.


Perhaps people are optimizing for "making pretty pictures" instead of "negative log likelihood". I wouldn't be surprised if for many applications of GANs, diversity of images is not actually that important, and what you really want is that the few images you do generate look really good. In that case, it makes complete sense to push primarily on GANs, and while you try to address mode collapse, when faced with a tradeoff you choose GANs over VAEs anyway.

This is fair. However, the point of the example is more that mode dropping and bad NLL were not noticed when people started optimizing GANs for image quality. As far as I can tell, it took a while for individuals to notice, longer for it to become common knowledge, and even more time for anyone to do anything about it. Even now, the "solutions" are hacks that don't completely resolve the issue.

There was a large window of time where a practitioner could implement a GAN expecting it to cover all the modes. If there were a world where failing to cover all the modes of the distribution led to large negative consequences, the failure would probably have gone unnoticed until it was too late.

Here's a real example. This is the NTSB crash report for the Uber autonomous vehicle that killed a pedestrian. Someone should probably do an in depth analysis of the whole thing, but for now I'll draw your attention to section 1.6.2. Hazard Avoidance and Emergency Braking. In it they say:

When the system detects an emergency situation, it initiates action suppression. This is a one-second period during which the ADS suppresses planned braking while the (1) system verifies the nature of the detected hazard and calculates an alternative path, or (2) vehicle operator takes control of the vehicle. ATG stated that it implemented action suppression process due to the concerns of the developmental ADS identifying false alarms—detection of a hazardous situation when none exists—causing the vehicle to engage in unnecessary extreme maneuvers.


if the collision cannot be avoided with the application of the maximum allowed braking, the system is designed to provide an auditory warning to the vehicle operator while simultaneously initiating gradual vehicle slowdown. In such circumstance, ADS would not apply the maximum braking to only mitigate the collision.

This strikes me as a "random fix" where the core issue was that the system did not have sufficient discriminatory power to tell apart a safe situation from an unsafe situation. Instead of properly solving this problem, the researchers put in a hack.

Suppose that we had extremely compelling evidence that any AI system run with > X amount of compute would definitely kill us all. Do you expect that problem to get swept under the rug?

I agree that we shouldn't be worried about situations where there is a clear threat. But that's not quite the class of failures that I'm worried about. Fairness, bias, and adversarial examples are all closer to what I'm getting at. The general pattern is that ML researchers hack together a system that works, but has some problems they're unaware of. Later, the problems are discovered and the reaction is to hack together a solution. This is pretty much the opposite of the safety mindset EY was talking about. It leaves room for catastrophe in the initial window when the problem goes undetected, and indefinitely afterwards if the hack is insufficient to deal with the issue.

More specifically, I'm worried about a situation where at some point during grad student descent someone says, "That's funny..." and then goes on to publish their work. Later, someone else deploys their idea plus 3 orders of magnitude more computing power and we all die. That, or we don't all die; instead we resolve the issue with a hack, and then a couple of bumps in computing power and capabilities later, we all die.

The above comes across as both paranoid and far-fetched, and I'm not sure the AI community will take on the required level of caution to prevent it unless we get an AI equivalent of Chernobyl before we get UFAI. Nuclear reactor design is the only domain I know of where people are close to sufficiently paranoid.

Comment by factorialcode on [AN #80]: Why AI risk might be solved without additional intervention from longtermists · 2020-01-03T00:34:09.825Z · score: 14 (5 votes) · LW · GW

A likely crux is that I think that the ML community will actually solve the problems, as opposed to applying a bandaid fix that doesn't scale. I don't know why there are different underlying intuitions here.

I'd be interested to hear a bit more about your position on this.

I'm going to argue for the "applying bandaid fixes that don't scale" position for a second. To me, it seems that there's a strong culture in ML of "apply random fixes until something looks like it works" and then just rolling with whatever comes out of that algorithm.

I'll draw attention to image modelling to illustrate what I'm pointing at. Up until about 2014, the main metric for evaluating image model quality was the negative log likelihood (NLL). As far as I can tell, this goes all the way back to at least "To Recognize Shapes, First Learn to Generate Images", where the CD algorithm acts to (approximately) maximize the log likelihood of the data. This can be seen in the VAE paper and also the original GAN paper. However, after GANs became popular, the log likelihood metric seems to have gone out the window. GANs made really compelling images, and due to the difficulty of evaluating NLL, people invented new metrics: IS and FID were used to assess the quality of the generated images. I might be wrong, but I think it took a while after that for people to realize that SOTA GANs were getting terrible NLLs compared to SOTA VAEs, even though the VAEs generated images that were significantly blurrier/noisier. It also became obvious that GANs were dropping modes of the distribution, effectively failing to model entire classes of images.

As far as I can tell, there's been a lot of work to get GANs to model all image modes. The most salient and recent would be DeepMind's PresGAN, where they clearly show the issue and how PresGAN addresses it in Figure 1. However, looking at Table 5, there's still a huge gap in NLL between PresGAN and VAEs. It seems to me that most of the attempts to solve this issue are very similar to "bandaid fixes that don't scale", in the sense that they mostly feel like hacks. None of them really addresses the gap in likelihood between VAEs and GANs.

I'm worried that a similar story could happen with AI safety. A problem arises and gets swept under the rug for a bit. Later, it's rediscovered and becomes common knowledge. Then, instead of solving it before moving forward, we see massive increases in capabilities. Simultaneously, the problem is at most addressed with hacks that don't really solve the problem, or solve it just enough to prevent the increase in capabilities from becoming obviously unjustified.

Comment by factorialcode on What will quantum computers be used for? · 2020-01-02T22:05:21.381Z · score: 5 (4 votes) · LW · GW

Given the cooling requirements, I would expect quantum computing to be offered as a cloud service, similar to AWS. You can shrink a circuit, but not a refrigerator.

The first application will probably be to break most current cryptography. I'm sure there are plenty of governments that have intercepted communications and are holding onto the data until they can decrypt it. Also bitcoin mining and generating hash collisions.

Beyond that, the main applications I can think of are solving optimization problems and quantum simulations. The optimization applications are too broad for me to speculate about, but the simulations will probably be useful for protein modelling and design. The other main application I can think of is materials science. Modelling the properties of materials is currently pretty hard and heavily determined by quantum phenomena. (Solids wouldn't really be a thing if electrons and nuclei obeyed classical electrodynamics.)

I wouldn't expect the average consumer to have a direct need for quantum computing. At least, I can't think of anything off of the top of my head. Rather they'll see the downstream effects of QC. New and better materials, new encryption algorithms, better medical technology.

Comment by factorialcode on Programmers Should Plan For Lower Pay · 2019-12-29T18:35:57.917Z · score: 5 (3 votes) · LW · GW

They are? I know several people who've pivoted to becoming software developers. I think it's just that growth in demand is keeping up or outpacing growth in supply.

Comment by factorialcode on Programmers Should Plan For Lower Pay · 2019-12-29T18:07:20.472Z · score: 5 (4 votes) · LW · GW

I think coding just generates a ridiculous and growing amount of value. Look at this list of companies with large earnings per employee, and note that they all specialize in some form of tech or finance. With a regular job, you're bottlenecked by how much work you can accomplish as an individual. With programming, the value you generate is proportional to how much value your code generates. A lawyer might generate $100,000 in value per year. The company that makes lawyers 5% more efficient generates $5,000 per lawyer per year. The lawyer has to dedicate his life to doing that; the company does it once and moves on to something else that goes on to produce even more value. It might even be the case that developers are worth even more, but companies are underpaying them. This is before factoring in that developers get more and more powerful each year: computers become more capable, the internet gains new users, and software development tools become more sophisticated yet easier and more intuitive to use.
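The back-of-envelope above can be made concrete (all figures are the hypothetical ones from this comment, not real data):

```python
# Toy comparison: one lawyer's labour vs. a tool that makes every lawyer
# 5% more efficient. All numbers are hypothetical.
lawyer_value_per_year = 100_000          # value one lawyer generates annually
efficiency_gain = 0.05                   # the tool's efficiency improvement

value_per_lawyer = lawyer_value_per_year * efficiency_gain  # $5,000 per lawyer per year

# The lawyer's output is capped at one person's labour; the tool's output
# scales with the number of lawyers who adopt it.
for n_lawyers in (1, 1_000, 100_000):
    print(n_lawyers, value_per_lawyer * n_lawyers)
```

The point is the last line: the tool's total value grows linearly with adoption, while the lawyer's does not.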

Comment by factorialcode on Vaccine... Help? Deprogramming? Something? · 2019-12-27T23:34:39.136Z · score: 3 (2 votes) · LW · GW

Google scholar + Sci Hub should get you 95% of what you need.

Comment by factorialcode on FactorialCode's Shortform · 2019-12-24T19:00:56.643Z · score: 2 (2 votes) · LW · GW

I've been thinking about 2 things lately:

-Nuclear marine propulsion

-Marine cloud brightening

For some context, marine cloud brightening (MCB) is a geoengineering proposal to disperse aerosolized salt water particles into the lower atmosphere using ships. MCB appears to be one of the more promising geoengineering proposals, with more optimistic estimates placing the cost of reversing climate change at ~$200 million per year. However, one of the main technical issues with the proposal is how to actually lift and aerosolize the water. Current proposals call for wind-powered ships, but if wind can't deliver enough power, nuclear power could be used. Estimates for how much power needs to be delivered to aerosolize the water are as low as 30 MW. This compares favourably to the estimated 60 MW from a Russian nuclear icebreaker. Furthermore, the extra 200-300 MW of waste heat from the reactor could be used to heat the water before spraying it. When hot aerosolized water comes into contact with the air, it expands and floats up, allowing the salt to rise in the same way that marine ship exhaust currently rises into the atmosphere.
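A quick sanity check on the waste-heat idea (the 200-300 MW figure is from the estimate above; the 250 MW midpoint and the 50 K temperature rise are assumptions I'm making for illustration):

```python
# How much spray water could the reactor's waste heat preheat?
# Q = m_dot * c * dT  =>  m_dot = Q / (c * dT)
c_water = 4186        # specific heat of water, J/(kg*K)
waste_heat = 250e6    # W, assumed midpoint of the 200-300 MW quoted above
delta_t = 50          # K, assumed temperature rise of the spray water

mass_flow = waste_heat / (c_water * delta_t)
print(f"{mass_flow:.0f} kg/s")   # on the order of 1200 kg/s
```

So the waste heat alone could preheat over a tonne of seawater per second by 50 K, which suggests heating the water isn't the bottleneck.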

Comment by factorialcode on Free Speech and Triskaidekaphobic Calculators: A Reply to Hubinger on the Relevance of Public Online Discussion to Existential Risk · 2019-12-21T04:29:57.000Z · score: 9 (4 votes) · LW · GW

I think the whole thing is a very relevant case study. It should be investigated if we want to develop methods that will allow rationalists to discuss and understand object level politics. All CW threads were initially quarantined to a single thread on the subreddit. Then the entire thread needed to be moved to a separate subreddit.

Building a calculator that adds 6+7 properly can be exceptionally difficult in an environment full of Triskaidekaphobes who want to smash it and harass its creators.

Comment by factorialcode on Causal Abstraction Intro · 2019-12-20T01:24:36.686Z · score: 7 (4 votes) · LW · GW

Two points.

First, I don't mind the new format as long as there is some equivalent written reference I can go to. The same way the embedded agency sequence has the full written document and the fun diagrams. This is to make it easier to reference individual components of the material for later discussion. On reddit, I find it's far more difficult to have a discussion about specific points in video content because it requires me to transcribe the section I want to talk about in order to quote it properly.

Second, I might have missed this, but is there a reason we're limiting ourselves to abstract causal models? I get that they're useful for answering queries with the do() operator, but there are many situations where it doesn't make sense to model the system as a DAG.

Comment by factorialcode on Propagating Facts into Aesthetics · 2019-12-19T18:41:14.995Z · score: 4 (2 votes) · LW · GW

I'm having a hard time wrapping my head around the problem this post is trying to address. Are you trying to resolve disagreements between people about 'aesthetics' as you've defined them in this post?

When you say:

But this all left me with a nagging, frustrated sense that something important and beautiful being lost. I want to live in a world where people help each other out in small ways. It’s the particular kind of beauty that a small town in a Miyazaki movie embodies. It feels important to me.

I think that something you value is being lost, because you want to live in a world where people help each other in small ways. But the cost of preserving that might outweigh the benefits: for instance, if the community exists to serve some downstream purpose, and so everything must be optimized for efficiency. There's nothing wrong with noticing that this comes at a cost.

Under what circumstances should I change how I feel about that?

Approximating humans as "rational agents": in the same situations where it makes sense for an agent to modify its utility function. For instance, if you're being offered a deal where changing your values will end up giving you more of what you currently value. Generally though, as long as your beliefs about reality are accurate, I think it's a mistake to change the way you feel, since that seems dangerously close to ignoring your own preferences.

It seems to me that a person's preferences about a thing can be factored into what they value and their beliefs about that thing. I can see two parties coming to an agreement about the reality of a thing. I can see them coming to an agreement about what each of them finds valuable. I can't see them coming to a consensus about whether a thing is beautiful/ugly, good/bad, or tasteful/distasteful.

Comment by factorialcode on Inductive biases stick around · 2019-12-19T06:27:47.854Z · score: 5 (3 votes) · LW · GW

Does anyone know if double descent happens when you look at the posterior predictive rather than just the output of SGD? I wouldn't be too surprised if it does, but before we start talking about the Bayesian perspective, I'd like to see evidence that this isn't just an artifact of using optimization instead of integration.

Comment by factorialcode on Against Premature Abstraction of Political Issues · 2019-12-18T23:54:29.602Z · score: 13 (7 votes) · LW · GW

I agree that this is a real limitation of exclusively meta level political discussion.

However, I'm going to voice my strong opposition to any sort of object level political discussion on LW. The main reason is that my model of the present political climate is that it consumes everything valuable it comes into contact with. Having any sort of object level discussion of politics could attract the attention of actors with a substantial amount of power who have an interest in controlling the conversation.

I would even go so far as to say that the combination of "politics is the mindkiller", EY's terrible PR, and the fact that "lesswrong cult" is still the second result after typing "lesswrong" into google has done us a huge favor. Together, it's ensured that this site has never had any strategic importance whatsoever to anyone trying to advance their political agenda.

That being said, I think it would be a good idea to have a rat-adjacent space for discussing these topics. For now, the closest thing I can think of is r/themotte on reddit. If we set up a space for this, then it should be on a separate website with a separate domain and separate usernames that can't be easily traced back to us on LW. That way, we can break all ties with it/nuke it from orbit if things go south.

Comment by factorialcode on Is Causality in the Map or the Territory? · 2019-12-18T05:50:02.165Z · score: 6 (4 votes) · LW · GW

The idea of using a causal model to model the behaviour of a steady state circuit strikes me as unnatural. Doubly so for a simple linear circuit. Their behaviour can be well predicted by solving a set of equations. If you had to force me to create a causal model it would be:

Circuit description(topology, elements) -> circuit state(voltage at nodes, current between nodes)

IIRC, this is basically how SPICE does its DC and linear AC analysis. The circuit defines a set of equations, whose solution gives the voltages and currents that describe the steady state of the system. That way, it's easy to look at both mentioned counterfactuals, since each is just a change in the circuit description. The process of solving those equations is best abstracted as acausal, even if the underlying process is not.
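A minimal sketch of that acausal "solve the equations" view, using a two-resistor voltage divider (component values are made up):

```python
# Nodal analysis of Vs -- R1 -- node -- R2 -- ground.
# KCL at the one unknown node: (V - Vs)/R1 + V/R2 = 0, i.e. G*V = I.
import numpy as np

Vs, R1, R2 = 10.0, 1000.0, 2000.0

G = np.array([[1 / R1 + 1 / R2]])   # conductance matrix (1 node -> 1x1)
I = np.array([Vs / R1])             # source current injected at the node
V = np.linalg.solve(G, I)[0]

print(V)  # ~6.667 V, matching the familiar Vs * R2 / (R1 + R2)
```

Nothing here is causal: either counterfactual is just a change to G and I, after which you re-solve.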

This changes when you start doing transient analysis of the circuit. In that case, it starts to make sense to model the circuit using state variables and to model how those state variables evolve. Then you can describe the system using differential equations and boundary conditions, which can be thought of as a continuous limit of causal DAGs. But even then, the state variables and the equations that describe their evolution are not unique; it's just a matter of what makes the math easier.

Not that this takes away from your point that you need different abstractions to answer different queries. For instance, the mass and physical space occupied by the circuit is relevant to most mechanical design queries, but not to most electrical design queries.

Comment by factorialcode on Under what circumstances is "don't look at existing research" good advice? · 2019-12-13T22:53:51.329Z · score: 1 (1 votes) · LW · GW

I think this is mainly a function of how established the field is and how much time you're willing to spend on the subject. The point of thinking about a field before looking at the literature is to avoid getting stuck in the same local optima as everyone else. However, making progress by yourself is far slower than just reading what everyone has already figured out.

Thus, if you don't plan to spend a large amount of time in a field, it's far quicker and more effective to just read the literature. However, if you're going to spend a large amount of time on the problems in the field, then you want to be able to "see with fresh eyes" before looking at what everyone else is doing. This prevents everyone's approaches from clustering together.

Likewise, in a very well established field like math or physics, we can expect everyone to have already clustered around the "correct answer". It doesn't make as much sense to try to look at the problem from a new perspective, because we already have a very good understanding of the field. This reasoning breaks down once you get to the unsolved problems in the field. In that case, you want to do your own thinking to make sure you don't immediately bias your thinking towards solutions that others are already working on.

Comment by factorialcode on What determines the balance between intelligence signaling and virtue signaling? · 2019-12-12T17:50:26.337Z · score: 4 (2 votes) · LW · GW

I don't think so, jockeying can only get you so far, and even then only in situations where physical reality doesn't matter. If you're in a group of ~50 people, and your rival brings home a rabbit, but you and your friend each bring back half a stag because of your superior coordination capabilities, the guy who brought back the rabbit can say all the clever things he wants, but it's going to be clear to everyone who's actually deserving of status. The two of you will gain a significant fitness advantage over the rest of the members of the tribe, and so you will outcompete them.

Comment by factorialcode on What determines the balance between intelligence signaling and virtue signaling? · 2019-12-12T17:36:02.742Z · score: 4 (2 votes) · LW · GW

why go from a four-grunt language to the full variety of human speech?

Bandwidth. 4 grunts let you communicate 2 bits of information per grunt; n grunts let you communicate log2(n) bits per grunt. In addition, without a code or compositional language, that's the most information you can communicate. Even the simple agents in the OpenAI link were developing a binary code to communicate because 2 bits wasn't enough:

The first problem we ran into was the agents’ tendency to create a single utterance and intersperse it with spaces to create meaning.

In my model, the marginal utility of extra bandwidth and a more expressive code is large and positive when cooperating. This goes on up to the information processing limits of the brain, at which point further bandwidth is probably less beneficial. I think we don't talk as fast as Marshall Mathers simply because our brains can't keep up. Evolution is just following the gradient.
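For concreteness, the bandwidth claim above is just log2 of the signal inventory:

```python
# Bits per signal for an inventory of n distinguishable signals.
import math

for n in (4, 26, 44):   # 4 grunts; 26 letters; ~44 English phonemes (rough figure)
    print(n, math.log2(n))
# 4 signals give exactly 2.0 bits per signal; larger inventories give log2(n).
```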

The main reason I don't think runaway dynamics are a major factor is simply because language is very grounded. Most of our language is dedicated to referencing reality. If language evolved because of a signalling spiral, especially an IQ signalling spiral, I'd expect language to look like a game, something like verbal chess. Sometimes it does look like that, but it's the exception, not the rule. Social signalling seems to be mediated through other communication mechanisms, such as body-language and tone or things like vibing. In all cases, the actual content of the language is mostly irrelevant and doesn't need the expressive, grounded, and compositional machinery of language to fulfill its purpose.

Comment by factorialcode on What determines the balance between intelligence signaling and virtue signaling? · 2019-12-11T20:57:18.334Z · score: 2 (2 votes) · LW · GW

I think this explanation misses something very important. Namely, language lets small groups of agents coordinate for their collective gain. The richer the language and the higher the bandwidth, the more effectively the agents can work together and the more complicated the tasks that they can solve. Agents that can work together will mop the floor with agents that can't. It's easy to construct tasks that can only be achieved by letting agents communicate with each other, and I suspect the ancestral environment provided plenty of challenges that were easier to solve with communication. I wouldn't be surprised if a large amount of the sophistication of our language comes from its ability to let us jockey for status or to deceive others to more effectively propagate our genes, but I don't think we should discount that language vastly increases the power of agents that are willing to cooperate.

Comment by factorialcode on What determines the balance between intelligence signaling and virtue signaling? · 2019-12-11T13:49:24.859Z · score: 9 (6 votes) · LW · GW

I have a boring hypothesis. It's similar to the social media hypothesis. One signals virtue or IQ based on how much other people confer status for either of those things. In the early days of the internet, when there were barriers to entry preventing people from participating online, the internet was populated with people who disproportionately valued IQ over virtue. As a result, in order to gain status in old school online communities, you needed to signal IQ. However, as the barriers to entry were lowered, a more representative sample of the population began to emerge online. The general population does not confer status based on signals of IQ; if anything, it's the opposite. Nerds have always been low status in the general population. Thus, the parts of the internet that conferred status for IQ have become insignificant compared to those that confer status for virtue. So people respond to the new incentives and signal virtue. Since our online views are connected to our real personas, this virtue signalling leaks out into reality, with observable phenomena such as people being fired from their jobs due to internet flash-mobs.

If this hypothesis is correct, then there isn't much we can do about it. Maybe we can make people aware of the dangers of conferring status for fake virtue in places with barriers to entry that are strong enough to keep out the people who would displace you for speaking about those dangers.

In short, we're all becoming part of the global village, and the village is chock full of people who "manipulate the social world, the world of popularity and offense and status, with the same ease that you manipulate the world of nature. But not to the same end. There is no goal for them, nothing to be maintained, just the endless twittering of I’m-better-than-you and how-dare-you-say-that."

Comment by factorialcode on What determines the balance between intelligence signaling and virtue signaling? · 2019-12-11T13:25:15.007Z · score: 8 (5 votes) · LW · GW

With the exception of a nootropics arms race, I don't think runaway IQ signalling looks like anything that was mentioned. Runaway IQ signalling might start off like that, but as time goes on, people's views on all of the above will start to flip-flop as they try to distinguish themselves from their peers at a similar level of IQ. Leaning too hard on any of the above-mentioned signalling mechanisms exposes you to arguments against them, allowing someone to signal that they're smarter than you. But then if you over-correct, you become exposed to counter-arguments, allowing someone else to signal that they're smarter than you. I think EY captured the basic idea in this post on meta-contrarianism. A better model might be that it looks a lot more like old school internet arguments: several people trying hard to out-manoeuvre each other in a long series of comments in a thread, mailing list, or debate, with each of them saying some version of "Ah, this is true, but you've forgotten to account for..." in order to prove that they are Right and everyone else is Wrong. Or mathematicians trying to prove difficult and well recognized theorems, since those are solid benchmarks for demonstrating intelligence.

Comment by factorialcode on Is Rationalist Self-Improvement Real? · 2019-12-11T06:21:00.580Z · score: 7 (4 votes) · LW · GW

This is a properly controversial comment and I find that confusing. Could someone who's down-voted it explain their reasoning for doing so? My best guess is the first sentence: "I haven't finished reading this post yet, but one thing jumped out at me, which is that…" Perhaps we have a norm against responding before reading the entire post, or perhaps comments that start off that way often end up being vapid or clearly wrong.

Comment by factorialcode on Is Rationalist Self-Improvement Real? · 2019-12-11T06:07:06.548Z · score: 13 (7 votes) · LW · GW

I think most of the strength of rationalism hasn't come from the skill of being rational, but rather from the rationality memeplex that has developed around it. Forming accurate beliefs from scratch is a lot more work and happens very slowly compared to learning them from someone else. Compare how much an individual PhD student achieves in their doctorate to the body of knowledge that they learn before making an original contribution. Likewise, someone who's practised the art of rationality will be instinctively better at distilling information into accurate beliefs. As a result they might put together slightly more accurate beliefs about the world given what they've seen, but that slight increase in knowledge due to rationality isn't going to hold a candle to the massive pile of cached ideas and arguments that have been debated to hell and back in the rat-sphere.

To paint a clearer picture of what I'm getting at, suppose we model success as:

[Success] ~ [Knowledge] * [Effort/Opportunities]

I.e., success is proportional to how much you know, times how many opportunities you have to make use of it / how much effort you put into performing actions that you have chosen based on that knowledge. I think this could be further broken down into:

[Success] ~ ([Local Cultural Knowledge] + [Personal Knowledge]) * [Effort/Opportunities]

I.E. Some knowledge is culturally transmitted, and some knowledge is acquired by interacting with the world. Then being a good practitioner of rationality outside the rationality community corresponds to maybe this:

[Success] ~ (1.05 * [Local Cultural Knowledge] + 1.05 * [Personal Knowledge]) * [Effort/Opportunities]

In other words, a good rationalist will be better than the equivalent person at filtering through their cultural knowledge to identify the valuable aspects, and they'll be better at taking in all the information that they have personally observed and using it to form beliefs. However, humans already need to be good at this, and evolution has spent a large amount of time honing heuristics for making this happen. So any additional gains in information efficiency will be hard-won, and possibly pyrrhic, even after extensive practice.

However, I think it's better to model an individual in the rationalist community as:

[Success] ~ ([Local Cultural Knowledge] + [Personal Knowledge] + [Rationalist Cultural Knowledge]) * [Effort/Opportunities]

LessWrong and the adjacent rat-sphere has fostered a community of individuals who all care about forming accurate beliefs about a very broad set of topics, and a focus on the topic of forming accurate beliefs. I think that because of this, the cultural knowledge that has been put together by the rationalist community has exceptionally high "quality" compared to what you would find locally.

Anyone who comes to LW will be exposed to a very large pile of new cultural knowledge, and will be able to put this knowledge into practice when the opportunity comes or they put in effort. I think that the value of this cultural knowledge vastly exceeds any gains that one could reasonably make by being a good rationalist.
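To make the toy model concrete (the numbers are arbitrary, chosen only to show the relative sizes of the terms):

```python
# Toy instantiation of the success model above.
local, personal, rat_culture = 100.0, 20.0, 50.0   # arbitrary knowledge "amounts"
effort = 1.0

baseline         = (local + personal) * effort                  # ordinary person
good_rationalist = (1.05 * local + 1.05 * personal) * effort    # 5% better filtering
lw_reader        = (local + personal + rat_culture) * effort    # access to the memeplex

print(baseline, good_rationalist, lw_reader)
```

On these made-up numbers, the memeplex term dwarfs the 5% individual-skill term, which is the claim of this comment.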

A rationalist benefiting from LW doesn't look like someone going over a bunch of information and pulling out better conclusions or ideas than an equivalent non-rationalist. Rather, it looks like someone remembering some concept or idea they read in a blog post, and putting it into practice or letting it influence their decision when the opportunity shows itself.

Off the top of my head, some of the ideas that I've invoked or have influenced me IRL to positive effect have been rationalist taboo, meditation, CO2 Levels, everything surrounding AI risk, and the notion of effective altruism. To a lesser extent, I would also say that rat fiction has also influenced me. I can't produce any evidence that I've benefited from it, but characters like HJPEV and the Comet King have become something like role models for me and this has definitely influenced my personality and behaviour.

The question I think we should ask ourselves, then, is: has the art of rationality allowed us to put together a repository of cultural knowledge with an unusually high quality compared to what you would see in daily life?

Comment by factorialcode on Recent Progress in the Theory of Neural Networks · 2019-12-06T22:17:25.656Z · score: 3 (2 votes) · LW · GW

Indeed, it's entirely possible that the training data and the test data are of qualitatively different types, drawn from entirely different distributions. A Bayesian method with a well-informed model can often work well in such circumstances. In that case, the performance on the training and test sets aren't even comparable-in-principle.

For instance, we could have some experiment trying to measure the gravitational constant, and use a Bayesian model to estimate the constant from whatever data we've collected. Our "test data" is then the "true" value of G, as measured by better experiments than ours. Here, we can compare our expected performance to actual performance, but there's no notion of performance comparison between train and test.

I think this is beyond the scope of what the post is trying to address. One of the stated assumptions is:

The data is independent and identically distributed and comes separated in a training set and a test set.

In that case, a naive estimate of the expected test loss would be the average training loss using samples of the posterior. The author shows that this is an underestimate and gives us a much better alternative in the form of the WAIC.
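A minimal sketch of how such a WAIC estimate is typically computed from posterior samples (the log-likelihood matrix here is random stand-in data, not a real posterior):

```python
# WAIC from a matrix log_lik[s, i] = log p(y_i | theta_s),
# for S posterior samples and n data points.
import numpy as np

rng = np.random.default_rng(0)
log_lik = rng.normal(-1.0, 0.1, size=(1000, 50))  # stand-in values

# log pointwise predictive density: log of the posterior-mean likelihood
lppd = np.sum(np.log(np.mean(np.exp(log_lik), axis=0)))
# effective number of parameters: pointwise variance of the log-likelihood
p_waic = np.sum(np.var(log_lik, axis=0, ddof=1))

waic = -2 * (lppd - p_waic)   # deviance scale; lower is better
print(waic)
```

The p_waic penalty is the correction for the underestimate mentioned above: averaging training loss over posterior samples alone would give you only the lppd term.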

Comment by factorialcode on Recent Progress in the Theory of Neural Networks · 2019-12-06T22:00:02.069Z · score: 1 (1 votes) · LW · GW

There is some progress in that direction though. The bigger problem, as mentioned in the link, is that that estimator seems to completely break down if you try to use an approximation to the posterior, although there seems to be ongoing work to estimate generalisation error just from MCMC samples.

Comment by factorialcode on Recent Progress in the Theory of Neural Networks · 2019-12-06T19:15:24.634Z · score: 2 (2 votes) · LW · GW

For small datasets, the PAC-Bayes bounds suffer because they scale as sqrt(KL/N)

I agree with you about the current PAC-Bayes bounds, but there are other results which I think are more powerful and useful.

Not sure if I agree regarding the real-world usefulness. For the non-IID case, PAC-Bayes bounds fail, and to re-instate them you'd need assumptions about how quickly the distribution changes, but then it's plausible that you could get high probability bounds based on the most recent performance.

I think you can make even looser assumptions than that, as how quickly and in what way the distribution changes are themselves quantities that can be estimated. I wouldn't be surprised if you could get some very general results by using equally expressive time series models.
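To illustrate the sqrt(KL/N) scaling being discussed (this is a McAllester-style form of the bound; the KL value is arbitrary):

```python
# PAC-Bayes bound on the generalization gap, McAllester-style:
# gap <= sqrt((KL(Q||P) + ln(2*sqrt(N)/delta)) / (2N))
import math

def pac_bayes_gap(kl, n, delta=0.05):
    return math.sqrt((kl + math.log(2 * math.sqrt(n) / delta)) / (2 * n))

for n in (100, 10_000, 1_000_000):
    print(n, pac_bayes_gap(kl=50.0, n=n))
# The bound is nearly vacuous for small n and tightens as n grows,
# which is the small-dataset problem mentioned in the quote.
```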

Comment by factorialcode on Understanding “Deep Double Descent” · 2019-12-06T18:42:53.042Z · score: 3 (2 votes) · LW · GW

I wonder if this is a neural network thing, an SGD thing, or both? I would love to see what happens when you swap out SGD for something like HMC, NUTS, or ATMC if we're resource constrained. If we still see the same effects, then that tells us that this is because of the distribution of functions that neural networks represent, since we're effectively drawing samples from an approximation to the posterior. Otherwise, it would mean that SGD plays a role.

what exactly are the magical inductive biases of modern ML that make interpolation work so well?

Are you aware of this work and the papers it cites?

From the abstract:

We prove that the binary classifiers of bit strings generated by random wide deep neural networks with ReLU activation function are biased towards simple functions. The simplicity is captured by the following two properties. For any given input bit string, the average Hamming distance of the closest input bit string with a different classification is at least sqrt(n / (2π log n)), where n is the length of the string. Moreover, if the bits of the initial string are flipped randomly, the average number of flips required to change the classification grows linearly with n. These results are confirmed by numerical experiments on deep neural networks with two hidden layers, and settle the conjecture stating that random deep neural networks are biased towards simple functions. This conjecture was proposed and numerically explored in [Valle Pérez et al., ICLR 2019] to explain the unreasonably good generalization properties of deep learning algorithms. The probability distribution of the functions generated by random deep neural networks is a good choice for the prior probability distribution in the PAC-Bayesian generalization bounds. Our results constitute a fundamental step forward in the characterization of this distribution, therefore contributing to the understanding of the generalization properties of deep learning algorithms.

I would float the hypothesis that large volumes of neural network space are devoted to functions that are similar to functions with low K-complexity, and small volumes of NN-space are devoted to functions that are similar to high K-complexity functions, leading to a Solomonoff-like prior over functions.
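The paper's second robustness property is cheap to check empirically. Here's a minimal sketch (not the paper's actual experiment; network width, input length, and sample counts are all assumed for illustration) that samples random two-hidden-layer ReLU networks on bit strings and estimates how many random bit flips it takes to change the classification:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20  # input bit-string length (assumed; the paper uses longer strings)

def random_relu_net(n, hidden=40):
    """Sample a random two-hidden-layer ReLU network with Gaussian weights."""
    W1 = rng.standard_normal((hidden, n)) / np.sqrt(n)
    b1 = rng.standard_normal(hidden)
    W2 = rng.standard_normal((hidden, hidden)) / np.sqrt(hidden)
    b2 = rng.standard_normal(hidden)
    w3 = rng.standard_normal(hidden) / np.sqrt(hidden)
    def f(x):
        h = np.maximum(W1 @ x + b1, 0)
        h = np.maximum(W2 @ h + b2, 0)
        return int(w3 @ h > 0)
    return f

def flips_to_change(f, x):
    """Flip random bits one at a time until the classification changes."""
    y0 = f(x)
    x = x.copy()
    for k, i in enumerate(rng.permutation(len(x)), start=1):
        x[i] = 1.0 - x[i]
        if f(x) != y0:
            return k
    return len(x)  # classification never changed

# Average over random nets and random inputs; the paper predicts this
# average grows roughly linearly with n for random (untrained) nets.
samples = [flips_to_change(random_relu_net(n),
                           rng.integers(0, 2, n).astype(float))
           for _ in range(200)]
avg = np.mean(samples)
print(f"average flips to change classification (n={n}): {avg:.1f}")
```

Repeating this for several values of n and checking whether the average scales linearly would be the actual test of the conjecture.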

Comment by factorialcode on Recent Progress in the Theory of Neural Networks · 2019-12-06T01:09:33.035Z · score: 4 (3 votes) · LW · GW

Deriving bounds on the generalization error might seem pointless when it's easy to do this by just holding out a validation set. I think the main value is in providing a test of purported theories: your 'explanation' for why neural networks generalize ought to be able to produce non-trivial bounds on their generalization error.

I think there's more value to the exercise than just that. It may be less useful in the IID case with lots of data, where holding out a "validation set" makes sense, but there are many non-IID time-series problems where your "dataset" effectively consists of one datapoint; slicing a "validation set" out of it is sketchy at best, and a great recipe for watching an overconfident model fail catastrophically at worst. There are also situations where data is so scarce or expensive to collect that slicing out a validation or test set would leave you without enough data to fit a useful model. Being able to form generalisation bounds without relying on a validation set in non-IID situations would be extremely useful for understanding the behaviour of AI or AGI systems deployed in the "real world".

Even if you can't derive general or tight bounds, understanding how those bounds change as we add assumptions (Markov property, IID, etc.) can tell us more about when it's safe to deploy AI systems.

Comment by factorialcode on Could someone please start a bright home lighting company? · 2019-12-02T15:46:32.677Z · score: 6 (3 votes) · LW · GW

LED bulbs aren't made of tungsten and so cannot heat up to 3000+ degrees without taking damage. LEDs are much more sensitive to heat and will burn out very quickly if not properly cooled.
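To see why cooling dominates the design, here's a rough junction-temperature estimate. All the numbers below are assumed for illustration; real values come from the specific LED's datasheet:

```python
# Back-of-envelope LED junction-temperature estimate (hypothetical numbers).
power_w = 30.0            # electrical power into the emitter (assumed)
efficiency = 0.4          # fraction emitted as light; the rest is heat (assumed)
heat_w = power_w * (1 - efficiency)

t_ambient_c = 25.0        # room temperature
r_junction_to_case = 0.5  # thermal resistance, deg C per watt (assumed)
r_case_to_air = 2.0       # heatsink thermal resistance (assumed)

# Junction temperature = ambient + heat * total thermal resistance.
t_junction_c = t_ambient_c + heat_w * (r_junction_to_case + r_case_to_air)
print(f"estimated junction temperature: {t_junction_c:.0f} C")
```

Even with a decent heatsink in this sketch, a 30 W emitter runs tens of degrees above ambient; with no heatsink the thermal resistance to air is far higher and the junction quickly exceeds its rated maximum, which is why a very bright LED fixture is largely a cooling problem.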

Comment by factorialcode on CO2 Stripper Postmortem Thoughts · 2019-12-02T00:25:57.702Z · score: 1 (1 votes) · LW · GW

I really can't help too much there; how these capstone projects work depends not only on the university but also on the department. I have a detailed understanding of who to talk to and what the process looks like for my own department and university, but not for others. It's very much the kind of thing where you need to go fire off some emails or walk into an office. As some examples, here's the UofT ECE website for industrial project sponsorship, and here's the general UofC website for the same kind of thing. Note that in both cases they don't really tell you anything except to just email them. However, do look around the UofT site, as it gives you some idea of what these projects are like and what the students are expected to accomplish.

Also, I suspect it might be harder to get larger departments or more reputable universities to help you with your problem, because they tend to be much more flush with cash and their students' work is in higher demand.

Comment by factorialcode on CO2 Stripper Postmortem Thoughts · 2019-12-01T19:57:25.879Z · score: 3 (2 votes) · LW · GW

I've been on the side actually doing the projects, have worked at companies that farm out the investigation of experimental ideas to engineering students, and acted as a point of contact for those students. Come summer I'll be a project sponsor.

Comment by factorialcode on CO2 Stripper Postmortem Thoughts · 2019-12-01T02:31:07.551Z · score: 10 (6 votes) · LW · GW

Pro tip: When you come up with an idea like this, or want to refine it further, go to your local university's engineering departments and pitch it as an undergraduate capstone project. This is exactly the kind of project and scope they're looking for. Be prepared to put down a couple grand to fund it, and make sure you discuss who owns the resulting IP. This will give you basically free labour from a team of 3-8 people with more expertise and equipment at their disposal than you could ever hope to have individually. They and their professors will be much better equipped to deal with all the BS that comes with bridging the gap between ideas and reality.