00:45:08 Chris Wendler: The point is that blue and orange are both negative. They get multiplied with each other.
00:47:36 Can Rager: Blue is the weights (MLP value vectors), not the MLP output activations.
00:48:10 Chris Wendler: The presentation was confusing, but it makes sense.
00:48:39 Chris Wendler: A value vector multiplied by its activation actually points in the correct direction.
00:48:57 Chris Wendler: Interesting that the finetuning leverages many value vectors that would normally be inactive.
00:49:42 Chris Wendler: @Andrew Lee maybe on this slide you can just plot the distribution of the product as well.
00:57:53 Chris Wendler: You can have properly negative coefficients.
01:04:51 Chris Wendler: This is Qwen.
01:26:44 Chris Wendler: 50% is still huge.
01:29:25 Nikhil Prakash: Need to leave now. Thanks for the great talk!
01:45:04 Chris Wendler: Really great!
01:52:01 Chris Wendler: Because they are literally polytopes.
01:53:48 Chris Wendler: (For ReLUs I think it would be polytopes.)
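
A minimal sketch of the sign argument from the 00:45:08 to 00:48:39 messages, with made-up numbers rather than values from the talk. It assumes a single MLP neuron whose coefficient can be negative; the names `target_direction`, `value_vector`, and `coefficient` are hypothetical. A value vector that is anti-aligned with the target direction, scaled by a negative coefficient, ends up contributing along the target direction.

```python
# Toy illustration only: all numbers below are invented for the example.

def dot(u, v):
    """Plain dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def scale(c, v):
    """Multiply every entry of v by the scalar c."""
    return [c * a for a in v]

target_direction = [1.0, 0.0]  # assumed "correct" direction in the residual stream
value_vector = [-2.0, 0.5]     # MLP value vector (weights), anti-aligned with the target
coefficient = -0.3             # assumed negative coefficient multiplying the value vector

# Alignment of the raw weights with the target: negative.
alignment_of_weights = dot(value_vector, target_direction)

# What the neuron actually writes is coefficient * value_vector.
contribution = scale(coefficient, value_vector)

# Alignment of the written vector with the target: positive,
# because the two negative signs cancel.
alignment_of_contribution = dot(contribution, target_direction)

print("weights vs. target:     ", alignment_of_weights)
print("contribution vs. target:", alignment_of_contribution)
```

Looking only at the value vectors (the weights) would suggest the neuron pushes away from the target, which is why the distribution of the product is the more informative quantity to plot.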