Posts

Automating Mechanistic Interpretability via Program Synthesis 2025-04-17T10:58:46.748Z
Why are neuro-symbolic systems not considered when it comes to AI Safety? 2025-04-11T09:41:45.199Z
What are the main arguments against AGI? 2024-12-24T15:49:03.196Z

Comments

Comment by Edy Nastase (edy-nastase) on Why are neuro-symbolic systems not considered when it comes to AI Safety? · 2025-04-12T14:36:58.351Z · LW · GW

These are some very valid points, and it does indeed make sense to ask "who would actually do it/advocate for it/steer the industry, etc.". I was just wondering what the chances are of such an approach taking off, but maybe the current climate does not really allow for such major changes to systems' architectures.

Maybe my thinking is flawed, but the hope with this post was to confirm whether or not it would be harmful to work on neuro-symbolic systems. Another point was to use such a system on benchmarks like ARC-AGI to prove that an alternative to the dominant LLMs is possible, while also being interpretable to some degree. The linked post by @tailcalled makes a good point, but I also noticed some criticism in the comments regarding concrete examples of how interpretable (or not) such probabilistic/symbolic systems really are. Perhaps some research on this question would not be harmful at all, though that is just my opinion.

Comment by Edy Nastase (edy-nastase) on What are the main arguments against AGI? · 2024-12-26T23:15:03.143Z · LW · GW

I would have wanted my focus to be purely on AGI. I guess the addition of AGI x-risks was sort of trying to hint at the fact that they would come together with it (that is why I mentioned LeCun).

I feel like there is this strong belief that AGI is just around the corner (and it might very well be), but I wanted to know what the opposition to such a statement is. I know there is a lot of solid evidence that we are moving towards more intelligent systems, but understanding the gaps in this "given prediction" might provide useful information (whether for updating timelines, changing research focus, etc.).

Personally, I might be in an "in-between" position, where I am not sure what to believe (in terms of timelines). I am safety-inclined, and I applaud the effort of people in the field, but there might be a blind spot in believing that AGI is coming soon (when the reality might be very different). What if that is not the case? What then? What are the implications for safety research? More importantly, what are the implications for the field of AI? Companies and researchers might very well be riding the hype wave to keep getting funded, gain recognition, etc.

Perhaps an analogy would help. Think about cancer. Everyone knows it is real, and that is not something that is going to be argued about (hopefully). I cannot come in and ask what the arguments in support of the existence of cancer are, because it is already proven to exist. In the context of AGI, though, I feel like there is a lot of speculation and a lot of people trying to claim that they knew the exact day AGI would arrive. It feels sort of like a distraction to me. Even the posts around "getting your things in order" feel like this: it seems wrong to just give up on everything without even considering the arguments against the truth you believe in.

Comment by Edy Nastase (edy-nastase) on We are in a New Paradigm of AI Progress - OpenAI's o3 model makes huge gains on the toughest AI benchmarks in the world · 2024-12-22T23:24:13.548Z · LW · GW

I like this post, especially as I think o3 went under the mainstream radar. I only took notice of the announcement today, and I have not seen many reactions yet (but perhaps people are waiting to get their hands on the system first?). Is there a genuine lack of reactions (given that this post also does not have a lot of engagement), or is my Twitter feed just not very up to date?

Mike Knoop also mentioned in his Twitter post that this shows how good deep learning program synthesis is. Does this refer to the way o3 was prompted to solve the ARC questions? If not, what suggests this paradigm?

Comment by Edy Nastase (edy-nastase) on Alignment Gaps · 2024-12-22T22:02:53.216Z · LW · GW

That's quite an interesting analysis. I feel like a lot of the AI safety field either intentionally isolates itself from academia or simply does not want to engage with academic research. Perhaps the feeling goes both ways (academics might distrust AI safety research, especially from independent researchers).

Regardless, this post motivated me to attempt a similarly deep analysis of the connections between interpretability and the fields of genetic programming, program synthesis, and neuro-symbolic AI. There might be some connections there, and on top of that, formal analysis seems like it would add another helpful layer.