WEBVTT

1
00:00:00.250 --> 00:00:01.020
David Bau: What?

2
00:00:02.040 --> 00:00:03.160
David Bau: Let me just…

3
00:00:07.170 --> 00:00:11.700
David Bau: The last couple slides that I didn't get to go over yesterday, while we wait for more people to arrive.

4
00:00:11.900 --> 00:00:16.640
David Bau: And I'll just sort of describe what's going on here, and, you know… it'll be on the Zoom recording.

5
00:00:17.180 --> 00:00:18.589
David Bau: [inaudible]

6
00:00:19.380 --> 00:00:20.110
David Bau: Yes.

7
00:00:20.530 --> 00:00:22.190
David Bau: [inaudible]

8
00:00:23.160 --> 00:00:26.769
David Bau: [inaudible]

9
00:00:27.400 --> 00:00:29.029
David Bau: [inaudible]

10
00:00:29.320 --> 00:00:30.200
David Bau: Yay.

11
00:00:35.310 --> 00:00:36.150
David Bau: Great.

12
00:00:36.740 --> 00:00:37.610
David Bau: So…

13
00:00:38.000 --> 00:00:45.710
David Bau: Yeah, so to finish: you know, last week I was telling you the story, which was just sort of my personal lesson in how to do probing wrong.

14
00:00:46.670 --> 00:00:53.940
David Bau: And, and so… You know, the basic idea of our probing was,

15
00:00:54.930 --> 00:00:56.780
David Bau: then, you know.

16
00:00:57.070 --> 00:00:59.490
David Bau: Once you have a probe that can tell

17
00:01:00.130 --> 00:01:07.789
David Bau: the difference between A and B, you know, between one class and its opposite, then… then you can use the accuracy of the probe to kind of

18
00:01:08.280 --> 00:01:09.299
David Bau: guide you.

19
00:01:09.510 --> 00:01:13.420
David Bau: As an estimate for… How much?

20
00:01:14.030 --> 00:01:16.510
David Bau: How much your representation knows the concept, right?
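
NOTE
A minimal sketch (editor's illustration, not from the lecture) of the probing recipe described above: train a linear classifier on frozen activations and read its held-out accuracy as an estimate of how decodable the concept is. All data here is a stand-in; only the shape of the procedure matters.
```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
# Stand-in for hidden states and a binary concept label per example.
rng = np.random.default_rng(0)
acts = rng.normal(size=(5000, 512))
labels = (acts[:, 0] > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(acts, labels, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
# Held-out accuracy ~ how linearly decodable the concept is.
print("probe accuracy:", probe.score(X_te, y_te))
```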

21
00:01:16.870 --> 00:01:21.880
David Bau: So, so we did that here for… Othello.

22
00:01:22.390 --> 00:01:26.520
David Bau: But there were two different types of classifiers that we trained.

23
00:01:26.850 --> 00:01:34.079
David Bau: We initially trained linear classifiers. But the linear classifiers were…

24
00:01:36.570 --> 00:01:41.349
David Bau: And we, we trained multi-layer classifiers, like, two-layer MLPs…

25
00:01:43.350 --> 00:01:46.429
David Bau: And they work, right, like, 98% of the time.

26
00:01:46.680 --> 00:01:51.830
David Bau: And so, so we thought, well, if you can train a classifier

27
00:01:51.970 --> 00:01:58.309
David Bau: on the representation, then that'll… that'll simply show us the information that's in there.

28
00:01:58.520 --> 00:02:00.539
David Bau: So, so we published it.

29
00:02:00.660 --> 00:02:04.079
David Bau: And, and then I sort of told you the story that

30
00:02:04.450 --> 00:02:08.400
David Bau: Neel Nanda came out a few weeks later and said, no.

31
00:02:08.680 --> 00:02:13.760
David Bau: This MLP, you don't need to do it, you can get the information out with a linear classifier.

32
00:02:13.920 --> 00:02:24.849
David Bau: And, and so this is… I really like this chart. So this is the chart where he says, here's, like, you have white, which is a really bad guess, it's, like, 25% wrong.

33
00:02:25.220 --> 00:02:27.630
David Bau: [inaudible], and then…

34
00:02:27.750 --> 00:02:32.070
David Bau: And then, you know, black, or really dark blue here, is, like, 0% wrong.

35
00:02:32.430 --> 00:02:33.420
David Bau: And…

36
00:02:33.560 --> 00:02:43.120
David Bau: And so, you know, our MLPs were 0% wrong, but he trained this linear classifier to do it. He trained the same linear classifier that we put in the GitHub.

37
00:02:43.290 --> 00:02:48.130
David Bau: And it was… it was bad. It was, like, 25% wrong, most of the time.

38
00:02:48.410 --> 00:02:54.000
David Bau: But then, when he switched it so that the classifier predicted,

39
00:02:54.310 --> 00:03:12.140
David Bau: The things not on every turn, but on every other turn. It'd be like, instead of looking at the representation on every word, look at every other word, or instead of every sentence, looking at every other sentence. Or when you're… if you're studying dialogue between two people, like, oh, look at… don't look at, you know, every… every sentence, but every other sentence, right?

40
00:03:13.640 --> 00:03:19.890
David Bau: you know, he got the accuracy up to, you know, 99%. And,

41
00:03:20.270 --> 00:03:23.429
David Bau: And so, what the heck? You know, what was the reason for this?

42
00:03:23.880 --> 00:03:27.079
David Bau: What was going on in the Othello models?

43
00:03:27.250 --> 00:03:32.960
David Bau: It had a representation of the Othello board, but it was encoded in a way that we didn't expect.

44
00:03:33.200 --> 00:03:40.010
David Bau: Instead of encoding, you know, a white pebble is on this square,

45
00:03:40.290 --> 00:03:44.409
David Bau: and a black pebble is on another square, it was encoding a different concept.

46
00:03:44.560 --> 00:03:48.519
David Bau: Which was not the concept that we had anticipated.

47
00:03:48.730 --> 00:03:51.560
David Bau: with our training setup. It was encoding:

48
00:03:51.890 --> 00:03:55.130
David Bau: Is the pebble on this square owned by the person to move?

49
00:03:55.340 --> 00:03:56.610
David Bau: Or the opponent.

50
00:03:56.770 --> 00:03:59.189
David Bau: Which flips, every turn of the game,

51
00:03:59.750 --> 00:04:00.919
David Bau: essentially.
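
NOTE
A hedged sketch of the relabeling just described (the function and board encoding are hypothetical): instead of "black/white pebble on this square," label each square relative to the player to move, which flips every turn.
```python
def relative_labels(board, turn):
    """board: list of 'black'/'white'/None per square; turn 0 = black to move."""
    me = "black" if turn % 2 == 0 else "white"
    return ["empty" if sq is None else ("mine" if sq == me else "theirs")
            for sq in board]
board = ["black", "white", None]
print(relative_labels(board, turn=0))  # ['mine', 'theirs', 'empty']
print(relative_labels(board, turn=1))  # ['theirs', 'mine', 'empty']
```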

52
00:04:01.100 --> 00:04:10.200
David Bau: So I think that there's a couple lessons from this. So… One lesson is… Look at your data.

53
00:04:10.340 --> 00:04:12.560
David Bau: So this is, like, one of the reasons that…

54
00:04:12.790 --> 00:04:16.979
David Bau: Before I went into causal mediation analysis and all these other things, I was like.

55
00:04:17.230 --> 00:04:23.940
David Bau: Hey, let's just make some scatter plots to just look at some vectors, and then just get into the practice of doing that.

56
00:04:24.060 --> 00:04:25.639
David Bau: I think that,

57
00:04:25.930 --> 00:04:40.569
David Bau: There's probably other ways of looking at the data, too, but that's the most obvious thing to try. But if you have a way of just, like, looking at the raw data somehow… so Neel has, like, a bunch of different visualizations he was doing to try to look at the raw data.

58
00:04:40.700 --> 00:04:47.740
David Bau: And just live with that for a while. Like, get a little sense for it. You might find out that you're asking the wrong question.

59
00:04:48.500 --> 00:04:52.680
David Bau: Of your model; that you have your own human bias

60
00:04:53.080 --> 00:04:58.120
David Bau: For how you think the model might represent something, but the model might represent something in a slightly different way.

61
00:04:58.280 --> 00:05:04.090
David Bau: That would… would change what you find a lot. And so this is… I thought this was really interesting.

62
00:05:04.210 --> 00:05:06.080
David Bau: Because it's a clear case.

63
00:05:06.380 --> 00:05:08.439
David Bau: Where the model had it.

64
00:05:09.030 --> 00:05:16.870
David Bau: It had an equivalent concept, but the… the classifier definition of the concept that we had

65
00:05:17.340 --> 00:05:19.840
David Bau: Was totally wrong, so wrong, that

66
00:05:20.090 --> 00:05:22.269
David Bau: You know, it gave us this really bad accuracy.

67
00:05:23.470 --> 00:05:28.990
David Bau: And so… So, but then there's a second takeaway here, which is that

68
00:05:29.390 --> 00:05:33.950
David Bau: We were able to get the accuracy up to 98% with an MLP.

69
00:05:34.890 --> 00:05:38.250
David Bau: Which, which, which leads to this question.

70
00:05:39.120 --> 00:05:43.570
David Bau: Like, were we fooling ourselves somehow? Like, what's wrong with an MLP?

71
00:05:44.010 --> 00:05:44.690
David Bau: Thank you.

72
00:05:45.600 --> 00:05:52.109
David Bau: And that's what… that's what the Hewitt paper is basically about. And so I just wanted to go over this.

73
00:05:52.410 --> 00:05:57.060
David Bau: And, and the basic idea of the Hewitt paper

74
00:05:57.380 --> 00:06:02.249
David Bau: is… yes, before we get into this, did he mention anything about his…

75
00:06:02.650 --> 00:06:06.849
David Bau: intuition for why he looked into that?

76
00:06:07.560 --> 00:06:13.349
David Bau: I think this was the first figure where he showed it, and I think that, you know,

77
00:06:13.640 --> 00:06:27.440
David Bau: he was looking at… I think that he was visualizing the data in different ways. I don't know exactly which visualization he used. Like, in the end, he had a lot of these kind of color diagrams of what the predictions were and how accurate they were.

78
00:06:27.700 --> 00:06:35.050
David Bau: And so he was doing… like, this is… you might think of this as sort of a logit lens kind of visualization or something, of the Othello board.

79
00:06:35.610 --> 00:06:41.539
David Bau: And so I think he was just building a lot of these sort of logit-lens-style visualizations, and…

80
00:06:41.780 --> 00:06:47.720
David Bau: And then… and then I don't know what gave him the idea that maybe, instead of…

81
00:06:47.910 --> 00:06:57.520
David Bau: maybe the classifier could have been made better by making the problem easier. So this is always the pattern in research, right? In research, you have something, you're trying to make it work.

82
00:06:57.820 --> 00:07:08.449
David Bau: And, you know, you really believe in your hypothesis. You gotta chase your hypothesis for a while. You think that there's gotta be something here, but everything that you do to measure it is, like, coming up as noise.

83
00:07:09.130 --> 00:07:15.929
David Bau: And then the right reaction to that, if you… until you stop believing your hypothesis, right, the right reaction is.

84
00:07:16.400 --> 00:07:30.239
David Bau: you know, is there a different way for me to set up an experiment? Maybe I have some confounder in there. Maybe I've got something that's making it unnecessarily hard. Maybe I've got something in there that's all noise, that isn't a necessary part of the experiment that I can remove.

85
00:07:30.690 --> 00:07:36.640
David Bau: And… and so I think that's, you know, that's what this experiment here is. It's like, Neel says:

86
00:07:36.750 --> 00:07:42.899
David Bau: Well, maybe, you know, for whatever reason, maybe doing this classifier every single turn

87
00:07:43.430 --> 00:07:48.869
David Bau: is somehow confounded, right? Maybe it'd be easier if we picked the easy turns.

88
00:07:49.210 --> 00:07:59.220
David Bau: to do. What would the easy turns be? Well, maybe the easy turns would be all the turns where the same person is moving or something, you know, every other turn, right?

89
00:07:59.420 --> 00:08:03.170
David Bau: And, you know, you can see how that would be, like, an easier problem.

90
00:08:03.310 --> 00:08:06.969
David Bau: Right? You only have to classify half of the turns.

91
00:08:08.210 --> 00:08:13.390
David Bau: And, and so… so I, I often will recommend this.

92
00:08:13.520 --> 00:08:22.390
David Bau: when people are doing different pieces of interp things, it's like, oh, your results are really noisy? Maybe some sort of subset analysis

93
00:08:22.530 --> 00:08:32.979
David Bau: would be useful. Maybe you don't have to, you know, get your result to be clean on the whole data, maybe on a subset, maybe on half the data, maybe there's a certain third of the data

94
00:08:33.169 --> 00:08:42.519
David Bau: where your result is really clean. What is that one-third? If it's systematic, then you still learn something, right? And that's basically what Neil did here. He says, okay, let's… let's do it on half the data.
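
NOTE
A sketch of the subset-analysis idea, with hypothetical names: score the same predictions separately on systematic subsets (here, even vs. odd turns) and look for a subset where the result is clean.
```python
import numpy as np
def subset_accuracy(preds, labels, turn_idx):
    even = turn_idx % 2 == 0
    return {"all": (preds == labels).mean(),
            "even": (preds[even] == labels[even]).mean(),
            "odd": (preds[~even] == labels[~even]).mean()}
# Toy data where the probe is only right on even turns.
rng = np.random.default_rng(0)
turns = np.arange(1000)
labels = rng.integers(0, 2, size=1000)
preds = np.where(turns % 2 == 0, labels, rng.integers(0, 2, size=1000))
print(subset_accuracy(preds, labels, turns))
```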

95
00:08:43.299 --> 00:08:46.309
David Bau: And, and… and boom!

96
00:08:46.630 --> 00:08:47.590
David Bau: It worked.

97
00:08:48.870 --> 00:08:51.300
David Bau: I was not a very experienced advisor.

98
00:08:51.620 --> 00:08:54.799
David Bau: when I was working with Kenneth on this problem. Otherwise…

99
00:08:55.070 --> 00:08:57.299
David Bau: I'm very embarrassed that we didn't do it.

100
00:08:57.400 --> 00:08:59.110
David Bau: You know,

101
00:08:59.820 --> 00:09:07.430
David Bau: You know, all I did was harp on Kenneth that it really should be a linear classifier, right? And he tried. He tried really hard.

102
00:09:07.630 --> 00:09:10.890
David Bau: But we… we didn't do something like, let's do a subset analysis.

103
00:09:12.000 --> 00:09:12.870
David Bau: Okay.

104
00:09:13.400 --> 00:09:15.430
David Bau: So,

105
00:09:15.920 --> 00:09:26.010
David Bau: The second lesson, though, that comes out of it is MLPs can kind of fool you, and that's what the John Hewitt paper is about. And so I'll just give you a little bit

106
00:09:26.110 --> 00:09:30.500
David Bau: of… Overview of what's going on here.

107
00:09:30.980 --> 00:09:35.390
David Bau: And so, the idea here is…

108
00:09:36.940 --> 00:09:40.759
David Bau: So this diagram here is the diagram of, like.

109
00:09:41.380 --> 00:09:51.369
David Bau: Actually, I don't really even know how to understand this diagram, but it's a diagram of what a control task might be for part of speech tagging, and…

110
00:09:51.690 --> 00:10:08.350
David Bau: And the basic idea is that normally, to do part of speech tagging, you have to do a lot of interesting things. You might have to disambiguate ambiguous words, like I was telling you, play can sometimes be a noun, sometimes it can be a verb, it can sometimes be transitive, it can sometimes not be transitive, and whatever, right? And so… so to figure out, like.

111
00:10:08.350 --> 00:10:16.999
David Bau: what kind of verb it is, or what kind of noun it is, you sort of have to understand what's going on in the sentence, and so part of speech tagging is kind of tricky. You gotta do this thing.

112
00:10:17.180 --> 00:10:23.850
David Bau: And so, what John says is, instead,

113
00:10:24.100 --> 00:10:37.550
David Bau: what he's gonna do… so he says, so… so if you train a classifier to classify part of speech tags out of one of these networks, you'll find that it has a certain accuracy. It can, like, get the part of speech tags out

114
00:10:37.620 --> 00:10:49.940
David Bau: With a linear classifier at, you know, maybe 85% or something like that, which, you know, is okay. And then he says, but he's frustrated because a lot of people publish papers saying, oh, I can do better.

115
00:10:49.940 --> 00:10:58.489
David Bau: Then the last person who wrote a probe paper saying that I can get this out at 85%. I have a new paper, and I can get it out at 99%.

116
00:10:58.780 --> 00:11:10.279
David Bau: And I've done it using a multi-layer perceptron probe, like a fancier probe, like a, you know, more complicated machine learning thing. And so now I've really proved that the information's there. It's like 99%.

117
00:11:10.630 --> 00:11:19.310
David Bau: And… and John's point is that that's not really the point of it.

118
00:11:20.190 --> 00:11:24.809
David Bau: I… you know, we had, we had a math class in here just a few minutes ago.

119
00:11:24.910 --> 00:11:30.180
David Bau: And I feel like it's kind of like asking people, can you solve this math problem? And you give them a pencil?

120
00:11:30.560 --> 00:11:43.830
David Bau: And then, you know, it gives you a measure of whether they can solve the math problem. And they say, yeah, they're learning something from the class; like, 80% of the people can solve the math problem. My probe has succeeded in telling me how much they know math.

121
00:11:44.240 --> 00:12:03.899
David Bau: And then… and then somebody else says, well, I'll really tell you whether they know math. And they distribute, like, pocket calculators to everybody. And they say, with pocket calculators, can you solve this math problem? Like, it's like 100%, like, everybody can solve it. It's like, so now, does that give you, like, you know, a better sense that everybody knows what's going on in the math? I'm not sure it does!

122
00:12:04.080 --> 00:12:04.830
David Bau: Right?

123
00:12:05.040 --> 00:12:09.940
David Bau: And so I think that this is more or less what… You know, John Hewitt's…

124
00:12:10.060 --> 00:12:13.129
David Bau: Point of view is on the difference between a linear probe

125
00:12:13.540 --> 00:12:20.980
David Bau: and an MLP. And the way that he… he showed it

126
00:12:21.080 --> 00:12:24.880
David Bau: Was to come up with a control task.

127
00:12:25.100 --> 00:12:34.379
David Bau: as a baseline, essentially. And so what he says is that what MLPs are good at is they're really good at learning how to solve

128
00:12:34.620 --> 00:12:35.980
David Bau: New problems.

129
00:12:36.160 --> 00:12:39.410
David Bau: That your neural network didn't already know how to solve.

130
00:12:40.870 --> 00:12:52.510
David Bau: Does that make sense? Like, oh, it's like… the analogy would be, like, not, like, giving calculators to everybody in class. What if you gave everybody ChatGPT in class, and then you gave them the test?

131
00:12:52.810 --> 00:12:53.590
David Bau: Right?

132
00:12:53.740 --> 00:12:56.520
David Bau: you know, ChatGPT is not only, like.

133
00:12:56.710 --> 00:13:00.040
David Bau: good at solving the math problem. It's good at, like, solving new problems!

134
00:13:00.190 --> 00:13:03.209
David Bau: That, that, you know, that you haven't seen before.

135
00:13:03.440 --> 00:13:13.569
David Bau: And so, you know, even if you come up with a really creative test, with all sorts of new things that are getting the students to think, if they're using ChatGPT, then, you know…

136
00:13:13.730 --> 00:13:15.780
David Bau: That test instrument that you're using.

137
00:13:16.520 --> 00:13:22.720
David Bau: might hide… might hide the signal. And so, the idea here is that the control task is

138
00:13:22.840 --> 00:13:29.049
David Bau: the new task. So you've got… you've got a task that is your regular hypothesis, like, part of speech tagging, and the control task.

139
00:13:29.440 --> 00:13:31.400
David Bau: It's something that would be hard to learn.

140
00:13:31.850 --> 00:13:35.390
David Bau: But… That would be kind of meaningless.

141
00:13:35.880 --> 00:13:40.269
David Bau: And so, the thing that he did here is random tagging.

142
00:13:40.430 --> 00:13:46.190
David Bau: So he went to every single word, and he gave it a random tag instead of its part of speech tag.

143
00:13:46.950 --> 00:13:52.960
David Bau: Like, so what is this? It's like, cat is normally a noun.

144
00:13:53.220 --> 00:14:00.650
David Bau: But instead, cat is part of speech 37, or something like that. I don't know, like, he didn't…

145
00:14:01.140 --> 00:14:12.970
David Bau: you could have shuffled it and said, oh, cat's part of speech is a verb or something like that, right? And then dog is normally a noun, but, like, dog is always, like, part of speech 15 or something. Like, so cat and dog are, like, in different classes, just totally…

146
00:14:13.210 --> 00:14:15.609
David Bau: Totally messes it all up, right?
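
NOTE
A sketch of the Hewitt-style control labels described above (details hypothetical): each word type gets a fixed random tag, so the mapping is consistent across occurrences but meaningless, and a probe can only succeed by memorizing it.
```python
import random
def control_tags(vocab, num_tags, seed=0):
    rng = random.Random(seed)
    return {word: rng.randrange(num_tags) for word in vocab}
tags = control_tags(["cat", "dog", "play", "the"], num_tags=45)
# e.g. cat -> some fixed tag, dog -> a different fixed tag, every time they occur.
print(tags)
```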

147
00:14:15.840 --> 00:14:19.580
David Bau: And so, if you want to… So, to do the control task,

148
00:14:19.720 --> 00:14:27.230
David Bau: you'd have to sort of memorize this nonsense garbage. And… Make sense?

149
00:14:27.530 --> 00:14:31.439
David Bau: And so, and so the idea is.

150
00:14:32.110 --> 00:14:35.999
David Bau: If you give this special AI to the student,

151
00:14:36.580 --> 00:14:42.140
David Bau: and you ask it, can you use this AI together with your own brain?

152
00:14:42.410 --> 00:14:44.689
David Bau: to solve the part-of-speech task.

153
00:14:44.970 --> 00:14:49.119
David Bau: And can you use the AI together with your own brain to solve control tasks?

154
00:14:49.740 --> 00:14:53.130
David Bau: And it gets, like, 100% on both of them.

155
00:14:53.750 --> 00:14:55.790
David Bau: And that's sort of an indication

156
00:14:56.320 --> 00:15:02.699
David Bau: that the AI is doing all the work. Because we don't think that your own brain is actually doing the control task.

157
00:15:03.230 --> 00:15:04.360
David Bau: Does that make sense?

158
00:15:04.860 --> 00:15:10.869
David Bau: We don't think that your own brain actually represents cat totally differently from dog, like… Do you think that…

159
00:15:11.090 --> 00:15:15.340
David Bau: Like, it thinks both of them are nouns, kind of similar nouns, if that makes sense?

160
00:15:15.860 --> 00:15:19.920
David Bau: But the control task has some other garbage thing, which we don't think is in your head.

161
00:15:20.690 --> 00:15:26.210
David Bau: And then, so if we give you a probing tool that measures 100% on that,

162
00:15:27.240 --> 00:15:29.109
David Bau: It's probably not measuring what's in your head.

163
00:15:30.490 --> 00:15:31.399
David Bau: Make sense?

164
00:15:31.560 --> 00:15:33.070
David Bau: So that's the basic idea.

165
00:15:33.400 --> 00:15:34.910
David Bau: Of the Hewitt thing.

166
00:15:35.090 --> 00:15:40.690
David Bau: And so, here's the… Here's the regular probe.

167
00:15:42.150 --> 00:15:44.500
David Bau: And then, here's the control task probe.

168
00:15:45.060 --> 00:15:53.079
David Bau: And here's what happens when you have this MLP, and you make the MLP really weak, so you make it so it's barely an extra layer.

169
00:15:53.460 --> 00:15:56.190
David Bau: You make the extra layer, like, not too many neurons?

170
00:15:56.780 --> 00:16:03.750
David Bau: Right? And then here's where you make it really fat extra layer, right? You've got, like, a lot of extra, you know, neurons in this extra layer.

171
00:16:03.890 --> 00:16:04.700
David Bau: Right.

172
00:16:04.860 --> 00:16:12.249
David Bau: And… and when the MLP has a lot of extra neurons in layer, you can see that it can take this garbage task.

173
00:16:12.490 --> 00:16:14.349
David Bau: And get it to near 100%.

174
00:16:15.410 --> 00:16:19.590
David Bau: Which is… which is meaning that the MLP is learning the new thing by itself.

175
00:16:19.920 --> 00:16:22.439
David Bau: It's not that it was already there, and that…

176
00:16:23.580 --> 00:16:25.160
David Bau: in the brain that you're testing.

177
00:16:25.890 --> 00:16:26.730
David Bau: Make sense?

178
00:16:27.040 --> 00:16:31.130
David Bau: You get it? But here, when the MLP is really weak.

179
00:16:31.660 --> 00:16:33.839
David Bau: You know, the control task accuracy gets really bad.

180
00:16:34.790 --> 00:16:35.620
David Bau: Right.

181
00:16:35.890 --> 00:16:37.580
David Bau: Any questions about this?

182
00:16:40.340 --> 00:16:42.550
David Bau: Shuyi. Is she here?

183
00:16:43.110 --> 00:16:43.970
David Bau: Yes.

184
00:16:44.530 --> 00:16:49.460
David Bau: Yeah, the question is about when the probe is very strong. Let me go back, then.

185
00:16:49.940 --> 00:16:54.430
David Bau: We would say that it's not a good probe, and that's not fair.

186
00:16:55.860 --> 00:16:57.070
David Bau: That's right.

187
00:16:57.210 --> 00:17:03.049
David Bau: So, you would say, if it was here, you would say, You're at 99%.

188
00:17:04.280 --> 00:17:09.429
David Bau: But… Like, for the red one, you'd say it's 99%?

189
00:17:10.410 --> 00:17:13.450
David Bau: But… You know, it's not a very good probe.

190
00:17:13.829 --> 00:17:18.270
David Bau: And so, actually, what, what these, what, so it, the…

191
00:17:18.869 --> 00:17:28.859
David Bau: the logic is actually… what John would say is, it's this blue one, which is at 99%, which is not a good probe. So even though the blue one is measuring the thing that we really believe

192
00:17:29.050 --> 00:17:30.090
David Bau: is in there,

193
00:17:30.300 --> 00:17:33.119
David Bau: John would say, don't believe this blue one.

194
00:17:33.400 --> 00:17:37.499
David Bau: Because the gap to the red thing is tiny, the selectivity's…

195
00:17:37.680 --> 00:17:42.830
David Bau: low. So even though the accuracy is 99%, the selectivity is, you know, 1 or 2%.
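
NOTE
Selectivity, as used above, is just task accuracy minus control-task accuracy; a minimal editor's illustration:
```python
def selectivity(task_acc, control_acc):
    return task_acc - control_acc
print(selectivity(0.99, 0.98))  # high-accuracy MLP probe, ~0.01: don't trust it
print(selectivity(0.85, 0.50))  # weaker probe, 0.35: much more selective
```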

196
00:17:44.500 --> 00:17:46.649
David Bau: So is that… so do you think that that's fair?

197
00:17:47.050 --> 00:17:55.079
David Bau: What do you think is unfair? I don't think it's fair, because that means the probe… if the probe is very strong, then you always think it's not a good probe.

198
00:17:55.840 --> 00:18:01.479
David Bau: You think if the probe is very strong, then… you always think it's not a good probe.

199
00:18:01.750 --> 00:18:03.319
David Bau: [inaudible]

200
00:18:03.630 --> 00:18:13.380
David Bau: Well, no, but this probe… You know, let's say this… blue point was up here.

201
00:18:13.830 --> 00:18:15.500
David Bau: Let's just say it was up there.

202
00:18:15.920 --> 00:18:21.090
David Bau: You wouldn't say it was bad, you'd say it was good, because the gap between it and the red thing is pretty large.

203
00:18:23.860 --> 00:18:24.800
David Bau: Yes, ma'am?

204
00:18:24.960 --> 00:18:32.349
David Bau: That makes sense, or do you think they… Yeah, I think one way to describe it is, like, just because you give somebody ChatGPT and they can solve it

205
00:18:32.470 --> 00:18:43.949
David Bau: doesn't mean they don't know how to solve it without ChatGPT. Like, it could be… Or maybe it's unfair. It could very well be the case that they still know how to solve it, but I think the question… I see, I see. …that what John is saying is… I see.

206
00:18:44.130 --> 00:18:53.119
David Bau: how big of an update is it? That's right. Even though… So we can't fail you… We can't fail you, but we can't say that you've learned it. That's right. That's right.

207
00:18:53.300 --> 00:18:57.050
David Bau: That makes sense. So, so if we, if we, like…

208
00:18:57.160 --> 00:19:00.690
David Bau: If… if you get, like, 100% on the test.

209
00:19:01.560 --> 00:19:09.079
David Bau: And by using ChatGPT. And then you can also get, like, 99% on a garbage test using ChatGPT.

210
00:19:10.150 --> 00:19:11.420
David Bau: Can we fail you?

211
00:19:12.100 --> 00:19:16.260
David Bau: And the answer you're saying is, no, that's not fair. We can't fail you.

212
00:19:16.410 --> 00:19:22.449
David Bau: Right? But what we can say is it's not a very good test. You didn't gain much. Does that make sense?

213
00:19:22.800 --> 00:19:27.409
David Bau: That's right. Yeah, but it's not enough to fail you. It's not enough to say that you're dumb.

214
00:19:28.250 --> 00:19:42.679
David Bau: Right. Okay, maybe that answers your question. Okay. But… but it does mean… so what I'd say is, is selectivity fair? It's… it's… it's fair against the test, is what it is. So we're backing up one level and not asking how good the model is, we're asking how good the test is.

215
00:19:44.020 --> 00:19:46.470
David Bau: And it's a good way of saying it's a bad test.

216
00:19:48.120 --> 00:19:52.229
David Bau: I'll say. Okay, so… Guangyuan had a… had a comment.

217
00:19:53.340 --> 00:19:54.460
David Bau: He's not here.

218
00:19:55.190 --> 00:19:55.970
David Bau: No.

219
00:19:57.040 --> 00:19:58.400
David Bau: Not even online.

220
00:19:58.600 --> 00:20:00.699
David Bau: That's… he did pop up.

221
00:20:01.690 --> 00:20:02.780
Guangyuan Weng: Oh, can you hear me?

222
00:20:05.020 --> 00:20:06.610
David Bau: Yes, what was your question?

223
00:20:07.070 --> 00:20:16.440
Guangyuan Weng: Oh, oh, yeah, I was asking, like, why do they argue regularization to reduce the generalization gap is not sufficient if your goal is selectivity?

224
00:20:17.140 --> 00:20:17.810
David Bau: Right.

225
00:20:17.950 --> 00:20:21.219
David Bau: So… Is regularization enough?

226
00:20:21.500 --> 00:20:24.790
David Bau: I think that, in a sense, this plot

227
00:20:25.120 --> 00:20:29.300
David Bau: Is one of ex… an example where it says, well, regularization is not nothing.

228
00:20:29.630 --> 00:20:33.229
David Bau: So, one form of regularization is to just reduce the parameterization.

229
00:20:33.680 --> 00:20:39.070
David Bau: Right? And… and you can see that it improves selectivity.

230
00:20:39.210 --> 00:20:45.830
David Bau: There's more different types of regularization here. So here… they're doing weight decay.

231
00:20:46.480 --> 00:20:47.240
David Bau: Right?

232
00:20:47.770 --> 00:20:49.920
David Bau: And some other things.

233
00:20:50.760 --> 00:20:59.540
David Bau: And… And I guess what they're… So… So, I'd say that…

234
00:20:59.670 --> 00:21:08.729
David Bau: The conclusion from here is that, you know, regularization helps,

235
00:21:09.000 --> 00:21:09.990
David Bau: Somewhat.

236
00:21:10.270 --> 00:21:16.509
David Bau: Right, when you regularize more, When you do… this is sample count, reducing the steps of early stopping.

237
00:21:16.800 --> 00:21:21.090
David Bau: You reduce the rank, you know, reduce the number of parameters, you have higher dropout.

238
00:21:21.200 --> 00:21:25.590
David Bau: You have higher weight decay, you know, so all the higher regularization things are on the left.

239
00:21:25.860 --> 00:21:28.580
David Bau: You can see selectivity generally rises.

240
00:21:29.040 --> 00:21:32.269
David Bau: when you go to the left. So… so the paper isn't…

241
00:21:32.720 --> 00:21:40.500
David Bau: if you looked at… I don't know about the words of the paper, but if you look at the actual results, the paper isn't totally down on regularization.

242
00:21:40.850 --> 00:21:42.230
David Bau: It's like, okay.

243
00:21:42.390 --> 00:21:47.670
David Bau: But then there's this other trend that… he notices.

244
00:21:48.050 --> 00:21:52.880
David Bau: That's even stronger than the regularization, which is the difference between the green lines

245
00:21:53.470 --> 00:21:54.759
David Bau: and the red and the blue lines.

246
00:21:55.770 --> 00:21:56.839
David Bau: Does that make sense?

247
00:21:57.170 --> 00:22:01.830
David Bau: And so the green lines here are all the linear models.

248
00:22:02.500 --> 00:22:07.379
David Bau: And the red and the blue lines are all the MLPs, like, the… the nonlinear models.

249
00:22:08.200 --> 00:22:09.080
David Bau: And…

250
00:22:09.300 --> 00:22:17.439
David Bau: And for the same amount of weight decay, or dropout, or early stopping, or whatever, the green lines pretty consistently have

251
00:22:17.610 --> 00:22:19.140
David Bau: Like, a pretty big gap.

252
00:22:19.320 --> 00:22:20.889
David Bau: in selectivity over

253
00:22:21.370 --> 00:22:25.560
David Bau: the red and the blue models, and so that's why John Hewitt is like.

254
00:22:26.270 --> 00:22:33.780
David Bau: I don't know about… I don't… I'm not going to recommend that people use regularization, I just recommend people use linear models, like a simpler solution.

255
00:22:34.040 --> 00:22:34.760
David Bau: Right.

256
00:22:35.170 --> 00:22:37.249
David Bau: You'll get… you'll get better selectivity.

257
00:22:37.830 --> 00:22:38.840
David Bau: Does that make sense?

258
00:22:40.090 --> 00:22:40.950
Guangyuan Weng: Thank you.

259
00:22:42.070 --> 00:22:45.509
David Bau: Sure, sure. And so, now,

260
00:22:45.890 --> 00:22:52.880
David Bau: So I think this is a little bit of the background why I kept on harping on our student, Kenneth.

261
00:22:53.190 --> 00:22:55.130
David Bau: To try to do a linear model.

262
00:22:56.690 --> 00:23:06.339
David Bau: Because of… because of this kind of thing. Now, some people will still probe with nonlinear models, because it does tell you something, but you have to watch out for this type of issue.

263
00:23:06.710 --> 00:23:10.480
David Bau: Related to overfitting, and…

264
00:23:10.770 --> 00:23:20.749
David Bau: And you know, it's a classic machine learning problem to be aware of. So… and selectivity is… and, you know, designing a randomized control task

265
00:23:21.390 --> 00:23:25.819
David Bau: If you have a complicated probe, even if you have a very high-dimensional linear probe.

266
00:23:26.160 --> 00:23:28.139
David Bau: When you write your papers.

267
00:23:28.360 --> 00:23:33.630
David Bau: If you want to get them through peer review, this is not a bad way of convincing a reviewer

268
00:23:33.840 --> 00:23:36.200
David Bau: That your probe is a good test.

269
00:23:36.560 --> 00:23:41.449
David Bau: Right, come up with a random control task, cite John Hewitt,

270
00:23:41.590 --> 00:23:47.859
David Bau: Say we're gonna, like, show the gap, we're gonna measure the selectivity against this thing, you might have to put the chart in the appendix for room.

271
00:23:47.990 --> 00:23:50.190
David Bau: But, you know, if you can show a gap like this.

272
00:23:50.650 --> 00:24:08.280
David Bau: And then you'll convince a reviewer. Even if it's not linear, even if it's MLP, you could defend it as well. Yeah. If the idea is that, like, performance is only meaningful if it, like, disappears when, like, the meaningful structure is, like, destroyed.

273
00:24:08.370 --> 00:24:15.459
David Bau: How do you, like… what kind of intuition do you use to identify that, like, meaningful structure, like, that we need to control for?

274
00:24:16.180 --> 00:24:20.650
David Bau: Oh, like, what a meaning, like, what a good control task is? Yeah. Yeah, you asked that question.

275
00:24:21.360 --> 00:24:25.450
David Bau: What a good control… what a good control task is. Yeah, because I feel like…

276
00:24:26.620 --> 00:24:39.409
David Bau: the control task that they… they're like, control tasks are a function of word identity. Yeah. And I'm sort of like… I want to just give every word a random part of speech. Yes, and I think that, like, that seems to tell you, like, how well can you compute

277
00:24:39.680 --> 00:24:56.460
David Bau: random lookup tables as a function… like, I feel like how hard it is to compute a random lookup table depends on how hard it is to derive word identity, and… or how hard it is to determine the function that you're interested in. And so I don't really know if there's an obvious…

278
00:24:56.460 --> 00:25:00.379
David Bau: Because, like, when you saw his… so he had another control task for it.

279
00:25:00.550 --> 00:25:17.839
David Bau: I don't know if I have the thing, like, part of speech tagging, but then also, like, you know, dependency relationships, you know, so, which he said, oh, we'll have a weird control task, which is we'll randomly send an arc to the left, or to the right in the middle, or something like that, right? So,

280
00:25:18.240 --> 00:25:22.020
David Bau: And and then you look at that and you think, huh.

281
00:25:22.350 --> 00:25:34.470
David Bau: I don't know if I would have invented that control task. It's just, like, some random… some random thing, right? So, like, a random part of speech for every word, yeah, sure, that, like, maybe. Like, you know, so there's a classic…

282
00:25:34.920 --> 00:25:43.560
David Bau: experiment that machine learning people like to do, which is, they like to say, you know, how powerful is my…

283
00:25:44.240 --> 00:25:45.950
David Bau: network, my neural network,

284
00:25:46.100 --> 00:25:49.400
David Bau: By taking…

285
00:25:51.300 --> 00:26:00.710
David Bau: is shuffling all the labels. So you have, like, you know, a million images, and some of them are dogs, and some of them are cats, and some of them are sailboats, and some of them are trucks, and whatever, right?

286
00:26:00.850 --> 00:26:12.039
David Bau: And, and then what you do is you just take exactly the same dataset, and you just completely scramble all the labels, and then you train your model on it to see what accuracy you get on that. It's exactly the same.

287
00:26:12.320 --> 00:26:17.260
David Bau: as this experiment here. But it's… it's like the… but what they call that…

288
00:26:17.420 --> 00:26:20.120
David Bau: Is they call that the memorization baseline.
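
NOTE
A sketch of the memorization baseline being described: train on the same inputs with the labels randomly permuted, so any accuracy achieved reflects pure memorization capacity (the training call is hypothetical).
```python
import numpy as np
rng = np.random.default_rng(0)
labels = np.array(["dog", "cat", "sailboat", "truck"] * 250)
shuffled = rng.permutation(labels)  # same label set, all pairings scrambled
# train_model(images, shuffled) would now measure memorization only.
print(labels[:4], "->", shuffled[:4])
```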

289
00:26:20.770 --> 00:26:23.980
David Bau: Right, so there's these two things that you can have a model do.

290
00:26:24.420 --> 00:26:31.020
David Bau: Machine learning, people like to say that you can have a model generalize, or memorize.

291
00:26:31.660 --> 00:26:47.769
David Bau: And so when a model generalizes, it's supposed to be finding regular patterns in the data, and after it sees a few of the same regular patterns, then even if you show it a new example of a new sailboat that it's never seen before, a new cat that it's never seen before, because it…

292
00:26:47.940 --> 00:26:53.019
David Bau: recognizes the patterns for a sailboat, or a cat, right? It can guess

293
00:26:53.240 --> 00:26:54.100
David Bau: The new one.

294
00:26:54.330 --> 00:27:01.550
David Bau: Right? But the other thing that a network can do is just memorize. It can remember that I saw exactly this cat before, and this cat.

295
00:27:01.990 --> 00:27:08.889
David Bau: is Class A. And I saw this other cat, and this other cat is Class B, and this sailboat here is Class A.

296
00:27:09.060 --> 00:27:19.990
David Bau: And this different sailboat is, like, Class B, and it's just like, you know, they're just all scrambled. Some sailboats are A, and some sailboats are B, and some cats are A, and some cats are B, and they don't look anything like each other.

297
00:27:20.140 --> 00:27:25.779
David Bau: But… If you have a powerful neural network, it can…

298
00:27:25.930 --> 00:27:28.539
David Bau: Remember that anyway, after you train it for a while.

299
00:27:29.160 --> 00:27:35.359
David Bau: But it won't do what we call generalization. If you show it a new cat, then it will have no idea, give it a random label.

300
00:27:35.560 --> 00:27:39.620
David Bau: Because it has no idea what it is… random guesses.

301
00:27:39.880 --> 00:27:40.900
David Bau: Does that make sense?

302
00:27:43.740 --> 00:27:49.429
David Bau: So, so that… a classic control task is to shuffle the labels.

303
00:27:49.630 --> 00:27:52.729
David Bau: So, like, if you need to come up with a control task.

304
00:27:53.280 --> 00:27:56.710
David Bau: Then, the natural thing to do… would be…

305
00:27:56.950 --> 00:28:01.610
David Bau: to just say, well, just like Hewitt, we're gonna shuffle the labels.

306
00:28:01.850 --> 00:28:14.049
David Bau: you know, we have stripey things and not stripey things, and we'll have a control task, and we'll just, like, randomly assign stripey things to not stripey or something like that. And we'll see how well the probe does on that.

307
00:28:14.270 --> 00:28:19.850
David Bau: But… but he would say, well, sometimes that's… Too hard.

308
00:28:20.330 --> 00:28:29.510
David Bau: And, like, if you really shuffle every single thing, it might be too hard, so you might want to arrange things in groups, and then shuffle the groups, or something like that, to make it a little easier.

309
00:28:29.640 --> 00:28:33.360
David Bau: And the reason is because he's looking for this gap.

310
00:28:34.630 --> 00:28:42.300
David Bau: If the control task is too hard, then the control task can, you know, sit down there at low accuracy, even though

311
00:28:42.410 --> 00:28:50.609
David Bau: you know, he's arguing that what you want to do is you want to find that there's a meaningless task that it could have learned instead, so…

312
00:28:51.750 --> 00:28:56.140
David Bau: You know, so this is a whole… it's kind of a hole in his analysis.

313
00:28:56.420 --> 00:29:01.890
David Bau: If you need to come up with a good control task that reveals the weakness of your… your test.

314
00:29:02.680 --> 00:29:03.350
David Bau: Right.

315
00:29:03.720 --> 00:29:04.680
David Bau: Make sense?

316
00:29:06.000 --> 00:29:09.319
David Bau: It's like… I think that after…

317
00:29:09.630 --> 00:29:11.699
David Bau: After spring break, I'm supposed to have

318
00:29:11.940 --> 00:29:14.849
David Bau: A professor coming… come here and watch me teach.

319
00:29:15.710 --> 00:29:16.850
David Bau: And,

320
00:29:16.960 --> 00:29:31.719
David Bau: And so, you know, they're gonna see if there's a hole in the way that I teach. Something that I'm doing that doesn't convey to you guys the actual knowledge, but just makes you feel like you have the illusion of getting knowledge, or something like that. And,

321
00:29:31.770 --> 00:29:37.699
David Bau: And so, but… but what's the test, right? Well, you know, you could devise a test.

322
00:29:38.000 --> 00:29:42.400
David Bau: And, and, you know, it passes with flying colors, but maybe with a more stringent test,

323
00:29:42.660 --> 00:29:47.859
David Bau: you know, You could say, oh, there's a gap here, there's a selectivity problem.

324
00:29:48.150 --> 00:29:50.830
David Bau: And, but there's an art to coming up with it.

325
00:29:52.370 --> 00:30:07.350
David Bau: So then the last thing I wanted to talk about probing was other things that you can do for probing. There were a bunch of questions about… from Ayush and Eunice about, probing ideology and personality and biases and things like this. And so I just want to point out that there's been a bunch of work

326
00:30:07.420 --> 00:30:19.450
David Bau: Doing probing for interesting things. There's this nice work from Rita Chen over at Harvard, that builds probes that look into LLMs that tell them… that tell you, like.

327
00:30:19.660 --> 00:30:23.470
David Bau: you know, does the LLM think that

328
00:30:23.590 --> 00:30:24.700
David Bau: you are

329
00:30:25.380 --> 00:30:33.960
David Bau: a certain age, a certain gender, a certain socioeconomic status, a certain education. And what these classifiers are, is they're basically probes.

330
00:30:34.140 --> 00:30:37.880
David Bau: You know, I think they just trained a bunch of linear probes on a bunch of text.

331
00:30:38.100 --> 00:30:47.149
David Bau: that came from young people, that came from men and women, that came from whatever, right? And they said, you know, how accurate is the model at

332
00:30:47.370 --> 00:30:50.260
David Bau: You know, classifying

333
00:30:50.410 --> 00:31:03.090
David Bau: this information, and they took their high-accuracy probes, and they hooked them up to a user interface. So there's… so yes, so you can probe for things like beauty, you can probe for things like socioeconomic status.

334
00:31:03.240 --> 00:31:08.499
David Bau: And, you know, and… and it often works pretty well.

335
00:31:09.270 --> 00:31:10.110
David Bau: Okay.

336
00:31:11.320 --> 00:31:15.090
David Bau: That's all I wanted to talk about for probes. Now let's go on to the presentations.

337
00:31:18.770 --> 00:31:21.820
David Bau: Any other questions about probes before we move on?

338
00:31:22.240 --> 00:31:24.150
David Bau: Any probing questions?

339
00:31:25.320 --> 00:31:30.719
David Bau: Meanwhile, we have Team QK up here. Like, really, any questions? I'm happy to answer any questions about probes.

340
00:31:33.240 --> 00:31:43.820
David Bau: Yes. Could you probe a probe? You probe probes? Yeah. Like, if you had an MLP probe, could you apply the linear probe on that to see if it…

341
00:31:44.920 --> 00:31:49.300
David Bau: Yeah. In a sense.

342
00:31:50.520 --> 00:31:53.600
David Bau: you know, in a sense, that's what we're doing. We're doing this weird thing.

343
00:31:54.130 --> 00:31:54.890
David Bau: Right?

344
00:31:55.040 --> 00:32:03.099
David Bau: Because our neural networks are already machine learning models, they're already fake ones.

345
00:32:03.370 --> 00:32:08.159
David Bau: And then, it's like, why are we taking these fake models and putting other fake models on top of them?

346
00:32:08.470 --> 00:32:15.159
David Bau: And it's this weird exercise. So you can probe a probe. That's actually all that we're doing. Like, the LLM itself is a probe.

347
00:32:15.340 --> 00:32:16.369
David Bau: You know, for text.

348
00:32:16.840 --> 00:32:19.549
David Bau: The probe for text is, like, tells you what the next word is.

349
00:32:20.050 --> 00:32:23.360
David Bau: And, and we're probing little pieces of it.

350
00:32:23.540 --> 00:32:25.840
David Bau: I think the way to think of a probe is, like.

351
00:32:26.420 --> 00:32:29.599
David Bau: So is there… have… have people probed probes?

352
00:32:30.530 --> 00:32:31.550
David Bau: Oh, no.

353
00:32:31.670 --> 00:32:37.929
David Bau: I don't know if people have done that, yes? I think there was, like, some stuff where they tried to decompose

354
00:32:38.140 --> 00:32:48.839
David Bau: probes into SAE features, so they did some kind of weird… Really? And then, yeah, I think they did some stuff, but… Alright. Ask Grace.

355
00:32:49.360 --> 00:32:52.680
David Bau: Grace will tell you on Discord if people have probed probes.

356
00:32:55.810 --> 00:33:00.219
David Bau: Grace is… Grace is our lab's probing expert, so I'm a little… I'm a little embarrassed.

357
00:33:00.450 --> 00:33:06.419
David Bau: Great last name for that. You know, lecturing about probes right around grayscale.

358
00:33:07.080 --> 00:33:11.950
David Bau: So… yes, good question. Now…

359
00:33:13.040 --> 00:33:14.869
David Bau: Alright, Team QK.

360
00:33:15.840 --> 00:33:16.730
David Bau: Yes, ready?

361
00:33:17.900 --> 00:33:21.230
David Bau: Good. I have some snack food. Why not?

362
00:33:21.380 --> 00:33:22.989
David Bau: I like high nuts. Yeah.

363
00:33:24.670 --> 00:33:39.559
David Bau: It's not just on probes. Oh, yeah, sure. Okay. Yeah, it's okay. You guys are now into the… I think that you have enough methods that you can just do whatever. Okay, cool. Yeah. Because I haven't… I, like, kind of restructured.

364
00:33:39.790 --> 00:33:43.849
David Bau: The task a little bit, and,

365
00:33:44.000 --> 00:33:48.489
David Bau: So, okay, we'll just start from here.

366
00:33:48.660 --> 00:33:52.039
David Bau: You want to pull up the next slide? Okay.

367
00:33:52.690 --> 00:34:03.510
David Bau: Okay, cool. So, I basically… we're the ones that are trying to better understand if there's some sort of, like, speaker role represented. Do I have time? Sorry.

368
00:34:03.980 --> 00:34:10.710
David Bau: How much time do I get?

369
00:34:11.060 --> 00:34:12.629
David Bau: They're restricting really fast.

370
00:34:13.639 --> 00:34:22.759
David Bau: Okay, yep, so basically we want to know if there are role representations in the hidden states when you give an LLM a transcript that has not already been differentiated.

371
00:34:23.139 --> 00:34:40.019
David Bau: And so, this kind of comes into world models and entities a little bit, but the question is, do LLMs encode who is speaking as some sort of structured signal? There's a theory linked to this back from 1990 in Smolensky's work, where he looks at tensor product representations as, like, roles and fillers.

372
00:34:40.030 --> 00:34:47.239
David Bau: And so, in a transcript, the role would be Speaker 1, Speaker 2, or something, and then the filler would be, like, Alice or Bob, so…

373
00:34:47.290 --> 00:34:58.479
David Bau: The LLM needs to, like, bind Speaker 1 to Alice, and Speaker 2 to Bob, and then whatever Alice says later on has to get bound to Role 1, and whatever Bob says later on needs to get bound to Role 2.

374
00:34:58.840 --> 00:34:59.840
David Bau: That's, like…

375
00:35:00.390 --> 00:35:09.149
David Bau: a very theoretical framework that Smolensky created, and his whole take on that is that there should… it should be decomposable through linear operations.

376
00:35:09.310 --> 00:35:21.799
David Bau: Right, and so then there was this, like, empirical link from more recent times, about how language models bind entities in context, showing models can track entity-role assignments, so that's basically what we're trying to explore.

377
00:35:22.750 --> 00:35:37.580
David Bau: So, I got Claude to give me 20 different dialogues between two speakers, Alice and Bob, and they have 80 turns each. A turn meaning, like, the text that Alice said is one turn, the text that Bob says is another turn.

378
00:35:37.750 --> 00:35:56.990
David Bau: Then I did some length control so it didn't get too crazy. And the speaker-swap stuff, ignore it for now, because I thought it would be a good control, but it's probably… now that I think more about it, I don't think it is. But… yeah, so basically, this is just a screenshot of what I have. Like, there's 6 different topics Alice and Bob, like,

379
00:35:57.070 --> 00:36:00.970
David Bau: Have some sort of, like, conversation or dialogue or a debate on these things.

380
00:36:01.570 --> 00:36:02.300
David Bau: Okay.

381
00:36:03.620 --> 00:36:12.370
David Bau: So I'm looking at residual stream activations. I'm only extracting from layer 20 right now. I will eventually scale up to other layers.

382
00:36:12.570 --> 00:36:16.470
David Bau: So basically, for each turn of the dialogue… oh, I should've…

383
00:36:16.580 --> 00:36:30.789
David Bau: This is a typo here. It's not the full dialogue per turn; it's, like, the dialogue up until the current turn that gets fed into the model, as a text string, so it's been concatenated. And then the context is,

384
00:36:30.790 --> 00:36:40.420
David Bau: the only tokens that get tokenized… after the forward pass, once the hidden states are captured, we're only looking at the tokens of that current turn's text here.

385
00:36:40.520 --> 00:36:44.990
David Bau: We're not looking at anything else. And then, so these are averaged into a single vector.

386
00:36:45.120 --> 00:36:46.080
David Bau: per turn.

387
00:36:46.220 --> 00:37:01.399
David Bau: Now, because of the results I'm going to show, I'm actually going to change this a little bit, too. I don't… I don't want to average it anymore, because I think it might diffuse a signal. I'm going to look at the last token of each turn moving forward, but I just wanted to say, like, for this one, it's averaged.
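
NOTE
A sketch of the two pooling choices being compared, with hypothetical shapes: `hidden` stands for the layer-20 residual-stream states of just the current turn's tokens.
```python
import numpy as np
def turn_vector(hidden, pooling="mean"):
    if pooling == "mean":
        return hidden.mean(axis=0)  # averaging may diffuse the signal
    return hidden[-1]               # last-token pooling, as proposed here
hidden = np.random.default_rng(0).normal(size=(12, 4096))
print(turn_vector(hidden, "mean").shape, turn_vector(hidden, "last").shape)
```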

388
00:37:02.400 --> 00:37:06.240
David Bau: Okay, so the first thing we did is, for all the dialogues together.

389
00:37:06.290 --> 00:37:24.789
David Bau: do a PCA, and try to see if you can tell if there are separable clusters. Not really, as you can see. And so that's the big takeaway. It's like, and here you're just seeing the swapped label condition just overlap with what it should overlap with. So Bob swapped, Alice base, Bob base, Alice swapped.
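
NOTE
A sketch of the global PCA check described above: project all turn vectors into 2D and color by speaker and condition to eyeball separability (data is a stand-in).
```python
import numpy as np
from sklearn.decomposition import PCA
rng = np.random.default_rng(0)
turn_vecs = rng.normal(size=(800, 4096))  # one averaged vector per turn
coords = PCA(n_components=2).fit_transform(turn_vecs)
print(coords.shape)  # (800, 2): one 2D point per turn, ready to scatter-plot
```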

390
00:37:25.770 --> 00:37:40.659
David Bau: Okay, so then I was like, okay, well, let me see what one dialog looks like in this global PCA space to see if that's, like, a little bit more separable, and it seems like maybe, maybe not. Like, I could probably draw a line right here.

391
00:37:41.250 --> 00:37:45.010
David Bau: Something else that's really interesting about this is that

392
00:37:45.320 --> 00:37:48.140
David Bau: But, so each of these dots are turns, right?

393
00:37:48.250 --> 00:38:01.279
David Bau: Down here, these… this is, like, turn 0, and then this is turn 1, and this is turn 2. And so when we get up here, this is the later turns. I thought that was cool. I don't know what to make of it, but… I don't know.

394
00:38:01.390 --> 00:38:06.519
David Bau: And so, now that I see this, I looked at a couple different, like, dialogues like this and saw

395
00:38:06.760 --> 00:38:11.069
David Bau: By eye, I feel like I could probably draw a line. That was, like, pretty good.

396
00:38:11.200 --> 00:38:14.110
David Bau: So, I was like, why don't I look at this per transcript then?

397
00:38:14.560 --> 00:38:15.500
David Bau: So…

398
00:38:15.610 --> 00:38:32.849
David Bau: Within transcript, we're looking at the role axis over time, so is there a stable role axis within this dialog? So, what we're doing is we basically have these two clusters, right? And so I want to find the centroid of the Bob cluster and the Alice cluster, and then I want to find the midpoint between those two clusters.

399
00:38:33.010 --> 00:38:34.740
David Bau: And so the idea is that

400
00:38:35.300 --> 00:38:51.619
David Bau: we're going to give it a score, like, each turn gets a score, where we take the turn, which is x_t, minus the midpoint, and then take the dot product of that with the role direction vector, which is just the centroid of Alice minus the centroid of Bob,

401
00:38:51.960 --> 00:39:11.229
David Bau: in this case. And so that's what I'm kind of showing, like, plotting here, and you can see now, clearly, there is some separability here, right? And so everything that's blue is Alice, everything that's orange is Bob. So I'm basically just projecting each turn onto this, like, role direction vector thing.
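
NOTE
A sketch of the per-transcript role axis just described: d = centroid(Alice) - centroid(Bob), m = their midpoint, and score_t = (x_t - m) · d, with the sign of the score giving the predicted speaker.
```python
import numpy as np
def role_scores(turns, is_alice):
    c_a, c_b = turns[is_alice].mean(axis=0), turns[~is_alice].mean(axis=0)
    d = c_a - c_b            # role direction vector
    m = (c_a + c_b) / 2      # midpoint between the two centroids
    return (turns - m) @ d   # positive => Alice side, negative => Bob side
rng = np.random.default_rng(0)
is_alice = np.arange(40) % 2 == 0
turns = rng.normal(size=(40, 64)) + np.where(is_alice, 1.0, -1.0)[:, None]
scores = role_scores(turns, is_alice)
print(((scores > 0) == is_alice).mean())  # per-turn sign accuracy
```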

402
00:39:11.400 --> 00:39:14.720
David Bau: And here, I just highlighted where this model messed this up.

403
00:39:15.220 --> 00:39:30.160
David Bau: like, I pulled out the actual, like, text to see why. They're… they are not, like, they don't have great personalities, it's all very neutral, so it's like nothing obvious jumped out to me, the human, about why the model miscategorized these.

404
00:39:30.720 --> 00:39:35.179
David Bau: Can you make the screen big? What is the text? How do I? I just have a slideshow.

405
00:39:35.300 --> 00:39:37.329
David Bau: The little white button in the upper, yeah.

406
00:39:41.550 --> 00:39:51.930
David Bau: And so, which, which line of text was the one that was… So, this is Alice, the blue dot, which is crossed over into the wrong region. This is Bob, Alice again, and then Bob.

407
00:39:53.310 --> 00:39:56.699
David Bau: And they're talking about… How cities should be structured.

408
00:39:58.870 --> 00:40:08.590
David Bau: So, I actually pulled out a couple different projections like this for the dialogues, and some of them were messier than others, but generally speaking, the majority did separate them.

409
00:40:08.830 --> 00:40:12.609
David Bau: So, I wanted to, like, figure out a better way to kind of,

410
00:40:13.490 --> 00:40:19.429
David Bau: like, define that. And so now what we're doing is taking this, like, role direction vector per transcript.

411
00:40:19.500 --> 00:40:37.049
David Bau: And then we're looking at the cosine similarity. And so obviously on the diagonal, they're going to be perfectly similar. That's good, so it worked. And then off-diagonal, you see a lot of red, some gray, some blue. So basically, the red is good, because it's sort of pointing in the same direction, or at least has the same sign.

412
00:40:37.320 --> 00:40:40.000
David Bau: Gray means that they're completely orthogonal.

413
00:40:40.360 --> 00:40:59.089
David Bau: And then blue means it's pointing in the opposite direction, basically, so it's, like, not super great. So there's something… I think… this is where I'm like, okay, I messed this up somehow, because I think there's a confound in the transcripts themselves, which is maybe what's actually getting picked up, not the role directions themselves.
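
NOTE
A sketch of the cross-transcript check above: cosine similarity between each pair of per-transcript role direction vectors (red ~ aligned, gray ~ orthogonal, blue ~ anti-aligned in the heat map).
```python
import numpy as np
def cosine_matrix(dirs):
    unit = dirs / np.linalg.norm(dirs, axis=1, keepdims=True)
    return unit @ unit.T
dirs = np.random.default_rng(0).normal(size=(20, 64))  # one per transcript
sims = cosine_matrix(dirs)
print(sims.shape, sims.diagonal()[:3])  # diagonal is exactly 1 by construction
```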

414
00:40:59.520 --> 00:41:13.160
David Bau: So it could be, like, the stances that each Bob and Alice take. It could be, like, one of… maybe one of them is arguing pro something, the other is saying, no, con. That might be actually what we're looking at, as opposed to speakers.

415
00:41:13.690 --> 00:41:20.719
David Bau: Okay, and then… so this is, just to summarize it all together, we're looking at distance over dispersion versus silhouette.

416
00:41:20.940 --> 00:41:29.380
David Bau: And so, distance over dispersion is kind of going back to what I talked about earlier. We're just looking at the between-centroid distance between the two clusters, and then…

417
00:41:29.520 --> 00:41:42.349
David Bau: clusters can be different, because they might be dense and, like, pulled in together, or they might be really, like, scattered. That plays a role in separability, and so we, like, created the separation score using the between-centroid distance and the cluster dispersion.

418
00:41:42.510 --> 00:41:54.899
David Bau: And so what this says is how far apart are the centroids relative to the spread of each of those clusters? And silhouette, I just used this from scikit-learn, it's how close are the points to its own cluster versus the other.
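
(A sketch of both measures; the dispersion definition here is one plausible choice, since the exact formula wasn't specified, and X and labels are illustrative names:)

import numpy as np
from sklearn.metrics import silhouette_score

def separation_score(X, labels):
    a, b = X[labels == 0], X[labels == 1]
    centroid_dist = np.linalg.norm(a.mean(axis=0) - b.mean(axis=0))
    # One plausible dispersion: mean per-dimension std, averaged over clusters
    dispersion = (a.std(axis=0).mean() + b.std(axis=0).mean()) / 2
    return centroid_dist / dispersion

def silhouette(X, labels):
    # How close points sit to their own cluster versus the other one
    return silhouette_score(X, labels)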

419
00:41:55.170 --> 00:42:06.389
David Bau: And so, on the y-axis, you're seeing silhouette, and on the x-axis, you're seeing distance over dispersion, and so, basically, like, the more separable the two clusters are, the higher both scores go.

420
00:42:09.610 --> 00:42:23.850
David Bau: Okay, and then I kind of, like, tried to take some inspiration from this, like, concept of function vectors. So, I wanted to see if, across dialogues, if we could learn a single role detection vector from training data, and then classify held-out turns.

421
00:42:23.900 --> 00:42:33.890
David Bau: And use the sign of the projection on that direction, and so I compared it against some baselines, like label shuffling, and then I also tried just random directions.

422
00:42:34.140 --> 00:42:42.159
David Bau: So, it's barely, like, above chance, in terms of the role direction, and label shuffle and random directions are at chance, as they should be.
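
(A sketch of this comparison, assuming turn activations X and +/-1 speaker labels y split into train and test sets; all names are illustrative:)

import numpy as np

def fit_direction(X, y):
    # Difference-of-centroids direction plus a midpoint for centering
    mid = X.mean(axis=0)
    direction = X[y == 1].mean(axis=0) - X[y == -1].mean(axis=0)
    return mid, direction

def sign_accuracy(X, y, mid, direction):
    # Classify each turn by the sign of its projection on the direction
    return float(np.mean(np.sign((X - mid) @ direction) == y))

def run_baselines(X_tr, y_tr, X_te, y_te, seed=0):
    rng = np.random.default_rng(seed)
    real = sign_accuracy(X_te, y_te, *fit_direction(X_tr, y_tr))
    shuffled = sign_accuracy(X_te, y_te, *fit_direction(X_tr, rng.permutation(y_tr)))
    random_dir = sign_accuracy(X_te, y_te, X_tr.mean(axis=0),
                               rng.normal(size=X_tr.shape[1]))
    return real, shuffled, random_dir  # the two baselines should sit at chance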

423
00:42:42.640 --> 00:43:00.300
David Bau: And again, this is cross-dialogue, so this is where I was like, okay, I need to try to look at it per transcript. So then I split each transcript's turns into train and test, so it learns a transcript-specific direction on the training turns, and then we evaluate it on the held-out turns. Here, again, you see this, like.

424
00:43:00.400 --> 00:43:08.050
David Bau: Distribution, where it is… it does get pretty good for many of the transcripts, in fact, but there are some that are below chance or around chance, so…

425
00:43:08.450 --> 00:43:15.690
David Bau: This might also have to just do with the dialogue, the topic, how exactly Alice and Bob are speaking to each other, things like that.

426
00:43:16.820 --> 00:43:27.540
David Bau: Okay, the linear probe experiment, I don't know if I did it perfectly great, but, again, I trained this directly on the averaged vectors from the residual stream activations, then…

427
00:43:27.610 --> 00:43:37.560
David Bau: So I used it to try to predict role, which is what we care about, speaker A versus B, variant, which is the original dialogue or speaker swapped.

428
00:43:37.640 --> 00:43:53.180
David Bau: And then topic, because I had this, like, nagging feeling that topic was the issue here. And so, as you can see, the linear probe seems to have better accuracy when it comes to topic than for variant and role, so… Nice. Yeah.
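
(A sketch of the probe setup as described: the same linear classifier trained on mean residual-stream vectors for each of the three targets; array names are illustrative:)

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def probe_accuracy(X, y):
    # X: (n_examples, d) averaged residual-stream activations; y: labels
    probe = LogisticRegression(max_iter=1000)
    return cross_val_score(probe, X, y, cv=5).mean()

# e.g. probe_accuracy(X, y_role), probe_accuracy(X, y_variant),
# probe_accuracy(X, y_topic), with topic coming out most decodable here.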

429
00:43:53.200 --> 00:44:00.580
David Bau: And then I tried it, per transcript for role decodability, and you can see it's just above chance, not, not that much higher.

430
00:44:00.760 --> 00:44:16.310
David Bau: Nice. Nailed it, that was the last slide. Great. Yeah, nice experiment, nice negative result. So you can see the value of probes; it's like, oh yeah, if my probe were broken by mistake and didn't learn anything, that would show up. Yeah. So I'm gonna try this again on the last token.

431
00:44:16.420 --> 00:44:19.720
David Bau: Just to see if it, like, kind of gives a clearer signal or something.

432
00:44:20.130 --> 00:44:31.439
David Bau: I… just one question, I maybe missed this part, but in the part where you calculated the vector, are you explicitly, like, have you given the fact that Alice said that speech to the vector, or is it just…

433
00:44:31.610 --> 00:44:37.529
David Bau: the speech. Which vector? The one before this? When you're doing the… the one before this.

434
00:44:37.870 --> 00:44:51.689
David Bau: Yeah, this one, the role direction vector, so when you're calculating mu of A, I'm assuming that's Alice? Yeah. So, is the model… does the model have access to the fact that Alice did speak this, or does the model, like, just have access to the text?

435
00:44:51.700 --> 00:45:07.230
David Bau: This is not… this is not the model, it's just, like, I'm just taking all the vectors, and taking the mean of the vector, and then taking the mean of… this is, like, this is just kind of geometry, it's like, there's no model here doing anything. Yeah. How do you get the vector?

436
00:45:08.300 --> 00:45:20.790
David Bau: The vector is, like, so I have the… all of the, turn vectors, right? And so, so, like, there's a bunch of turn vectors for Alice, there's a bunch of turn vectors for Bob, and I just take the average.

437
00:45:21.000 --> 00:45:33.319
David Bau: For Alice, does the turn vector have the speaker label added in? No, no, no. Yeah, oh, so you're saying the label is completely separate. It's like, the model doesn't see it, the… like, yeah, yeah, yeah, yeah.

438
00:45:35.480 --> 00:45:40.520
David Bau: I'm sorry, I thought you meant, like, the models I used to… No, no, no. Yeah.

439
00:45:41.180 --> 00:45:44.749
David Bau: Do you think that might be too much of an implicit signal? I don't know, because…

440
00:45:44.920 --> 00:45:52.480
David Bau: the text that Alice is speaking may not, when you're deriving the vector, may not be, like, related to what Al, like…

441
00:45:52.610 --> 00:45:56.090
David Bau: This is Barcelona. Yeah, most likely. I mean…

442
00:45:56.380 --> 00:46:05.439
David Bau: the model, when you give an LLM a transcript, it is able to be like, oh, there's two speakers here, right? Like, if you use, like, as Yasmin's shown, so many examples, like.

443
00:46:05.460 --> 00:46:20.480
David Bau: So how, like, the whole point here is, like, how does this figure it out? Is it… is it really contextual? Just, like, finds people with, like, two different opinions, and it's like, that can't be the same person, or… or it's one person with a split personality? Like, you know what I mean? Like, how does it determine this exactly?

444
00:46:20.510 --> 00:46:24.409
David Bau: It's kind of, like, how we're trying to hack it.

445
00:46:27.300 --> 00:46:42.020
David Bau: It was nice that you saw the separation for a single transcript. Yeah. And I guess you're… I was surprised at how many dots you got. Actually, it looks like your transcripts are pretty long, like a single one. Yeah, 80 turns. And so, are they long enough that you can have a holdout, so you could, like, drop out?

446
00:46:42.020 --> 00:47:00.209
David Bau: a bunch of things, and then… Yeah. …test on the holdout. Yeah, yeah, I think… I mean, I think I can make them longer, too, but I think 80 is quite a lot, and plus, like, I have 20 different dialogues, I can just… I've been able to repeat it and look at the standard deviation. I see, I see. I wonder if we could prop them in, like, give them personas, so they're, like, more distinct, too.

447
00:47:01.090 --> 00:47:18.290
David Bau: Yeah, like, no, like, the personalities we're talking about, making them, like, maybe goofy or fun, like, or… this kind of goes to the power group, like, having one be an advisor versus student role, like, that kind of thing would probably… but then it's, like, we're getting away from, like, how… what is… how do you disambiguate role from, like.

448
00:47:18.420 --> 00:47:34.939
David Bau: character that's played by the role. The other thing I wanted to try for an experiment here is that both Alice and Bob have two different stances in the conversations; I want them to agree by the end. So, like, I'm gonna tell the LLM, like, starting turn 50, start making them see each other's side and come to an agreement.

449
00:47:34.940 --> 00:47:39.160
David Bau: Because then I want to see this plot again, and then see if it completely collapses at that point.

450
00:47:39.280 --> 00:47:41.919
David Bau: Is, like, the next, like, thing to kind of test that.

451
00:47:42.600 --> 00:47:48.390
David Bau: There's also… a good setting for that control task idea from John Hewitt, right? Yeah, yeah.

452
00:47:48.560 --> 00:47:52.959
David Bau: Like, if you took the same transcripts and you shuffled Alice and Bob somehow.

453
00:47:53.250 --> 00:47:57.910
David Bau: Would it make it harder for your probe to figure out this, then? Yeah, that's true. …be able to do this anyway?

454
00:47:58.350 --> 00:48:06.760
David Bau: So instead of a speaker swap, just, like, random. If you just randomly said half of these things were Alice, half of them were Bob, just shuffled.

455
00:48:07.390 --> 00:48:13.759
David Bau: But, like, linear probes can be pretty powerful, too, so, like, I wonder if we can use…

456
00:48:14.990 --> 00:48:23.550
David Bau: Yeah, well, then it would be like, oh, this is not so cool after all. Depends on everything else, also. Yeah, but yeah, it's just gonna be…

457
00:48:27.010 --> 00:48:29.919
David Bau: Okay, yeah, click on the next team so I don't know who it is.

458
00:48:33.670 --> 00:48:35.310
David Bau: Doesn't have a great job.

459
00:48:37.980 --> 00:48:40.889
David Bau: Lose, lose representatives feedback today.

460
00:48:46.420 --> 00:48:49.770
David Bau: I like the choice that you guys made of picking a presenter.

461
00:48:49.990 --> 00:49:00.370
David Bau: That's not a bad way to do it. But you can just, like, choose a different victim every week.

462
00:49:02.130 --> 00:49:07.130
David Bau: Super stop. So, last week, what did you do?

463
00:49:07.730 --> 00:49:09.540
David Bau: Oh, yeah, absolutely.

464
00:49:11.020 --> 00:49:26.810
David Bau: So basically, last week, we were trying to separate the labels from trinary and binary. Binary is, like, with or without uncertainty. The trinary is, like, low, medium… no: no, intermediate, and high uncertainty.

465
00:49:26.910 --> 00:49:36.109
David Bau: And this is the result we get. As you can see here, first of all, the binary, AISR accuracy is much higher than the trinary.

466
00:49:36.790 --> 00:49:37.960
David Bau: result here.

467
00:49:38.100 --> 00:49:47.119
David Bau: It's here. So… but surprisingly, even if we give the definition and the label meaning, it doesn't really mean that the model

468
00:49:47.310 --> 00:50:00.099
David Bau: can make better agreement. Actually, you can see here, giving the definition and label actually makes the accuracy lowest. Oh, really? Yeah, among all the… for the trinary settings.

469
00:50:00.430 --> 00:50:04.879
David Bau: So, the top one is without the definition and without the labels.

470
00:50:05.380 --> 00:50:14.000
David Bau: This is the one only given the definition of uncertainty. This is the one only given the definitions of the different labels. And this one is given everything.

471
00:50:14.310 --> 00:50:31.579
David Bau: while this is lowest among all the three models. And that's accuracy with respect to human labels? Yes, right? Yeah. This is not too much. We only have, like, 70 labels. Yeah, we only consider these. But there's a very interesting…

472
00:50:31.900 --> 00:50:33.800
David Bau: like,

473
00:50:34.120 --> 00:50:50.139
David Bau: decision boundary shift if you're given the definition, between no and intermediate uncertainty. But there is basically no shift from high to no uncertainty. Those are not very common.

474
00:50:56.990 --> 00:51:16.490
David Bau: Because our data is too complicated, so we are creating synthetic data to see whether the language model can do uncertainty. And we have three kinds of data. One is numerical data, which means there are numbers in the statement. For example, we have low uncertainty to high uncertainty.

475
00:51:18.360 --> 00:51:34.590
David Bau: Yeah, and here is the result. Because the four-label version is too complicated, we do two labels. So can you… so before you show the data, can you describe the experiment in a little bit more detail? So, what did you do with the text, and what are you reading out? What did you compare it to? Yeah. With inputs and outputs?

476
00:51:34.830 --> 00:51:48.759
David Bau: The inputs are statements, and so, literally, these four statements… no, you must have more than just these four statements, is that right? Yeah, yeah, of course, we have hundreds of statements. And how did you generate them?

477
00:51:53.160 --> 00:52:07.649
David Bau: It's okay, that's cool, that's not a lot, I'm just… I'm just trying to understand what it is that you did. Yeah, we made a lot of effort, because language models always generate with a template, and that makes many similar

478
00:52:07.880 --> 00:52:15.700
David Bau: syntax patterns across the different labels. So, we generated a lot of them and chose some.

479
00:52:15.790 --> 00:52:32.749
David Bau: I try to choose ones with different syntax. Yeah. So right now, it's a little hard-coded, but, like, we might try API later. Basically, Shui did… what Shui did was she sort of, like, created different levels of synthetic data, where, like, this one is the, like, probably the easiest, where you have, like, mean and variance.

480
00:52:32.750 --> 00:52:37.289
David Bau: Mathematically, and we also have, like, economic synthetic data that are, like.

481
00:52:37.440 --> 00:52:41.360
David Bau: A little more complicated, less complicated than the earnings calls. Nice, and yeah.

482
00:52:41.820 --> 00:52:48.679
David Bau: That's very cool. Wow. Yeah, yeah, so you don't have to… so actually, one of the… one of the things I feel like

483
00:52:49.500 --> 00:52:52.860
David Bau: As you get into this, you know, Guy.

484
00:52:53.020 --> 00:53:09.349
David Bau: getting towards making the peer-reviewed paper, you realize that, oh, actually, data preparation, like, the whole story that you have about, like, how you synthesize your data sets, that's, like, an interesting story in itself, right? That's, like, an interesting part of the research itself.

485
00:53:09.870 --> 00:53:16.370
David Bau: feel free to, you know, talk about, you know, all the dilemmas you face with that in the future. So, okay, what do you get?

486
00:53:17.010 --> 00:53:25.309
David Bau: here is the PCA from layer 0 to 7, and we can see that in the first layer, it has some islands, and then it

487
00:53:25.460 --> 00:53:26.670
David Bau: became clusters.

488
00:53:28.050 --> 00:53:32.950
David Bau: And in, in the middle layers, it became very separate.
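
(A sketch of the per-layer view, assuming activations have been collected into a dict keyed by layer; acts_by_layer is an assumed structure, not an API from the talk:)

from sklearn.decomposition import PCA

def pca_by_layer(acts_by_layer, n_components=2):
    # acts_by_layer: {layer_index: (n_examples, d) array}
    return {layer: PCA(n_components=n_components).fit_transform(X)
            for layer, X in acts_by_layer.items()}

# Scatter each layer's 2D projection colored by label to see at which
# layer the classes start to separate.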

489
00:53:33.150 --> 00:53:36.960
David Bau: It's… And oh, it's a…

490
00:53:37.390 --> 00:53:41.090
David Bau: And, it's a… it's the mean.

491
00:53:41.280 --> 00:53:47.779
David Bau: The previous one was the last token, and in this one, the input, the layer 0 is already separate, but not islands.

492
00:53:49.350 --> 00:53:52.369
David Bau: And then this… the, second…

493
00:53:52.550 --> 00:53:57.609
David Bau: delivery of the data. It's equal data. There's no numbers in the statement.

494
00:53:59.800 --> 00:54:11.019
David Bau: No numbers. No numbers. So basically, you have, like, those are also a little bit templated, so, like, for no uncertainty, there are, like, things like immutable schedules, like, a bunch of it, but,

495
00:54:11.210 --> 00:54:14.210
David Bau: And for high uncertainty, you have, like,

496
00:54:15.040 --> 00:54:21.769
David Bau: a lot of different directions, moving pieces, a lot of things going on. I see. And,

497
00:54:22.280 --> 00:54:35.299
David Bau: Yeah, I guess this data is, like, more realistic, so, we, before, like, doing PCAs and stuff, like, we tried model performance on this, and, like, we basically gave, like, two-shot prompts to it, and then see, like, whether it gives…

498
00:54:35.680 --> 00:54:37.850
David Bau: The correct label for the,

499
00:54:38.420 --> 00:54:45.030
David Bau: So, like, in this test, we only extract the no uncertainty and high uncertainty ones to, like, just be binary.

500
00:54:45.180 --> 00:54:51.800
David Bau: And, like, we also tried, like, feeding in the definitions, just, like, what? And,

501
00:54:51.920 --> 00:55:11.890
David Bau: It seems like the model is leaning towards, like, giving uncertain answers, even on the samples without uncertainty, so… and also, like, a similar trend observed is that when you give definitions, it's not necessarily improving the performance, so, like, the definition one is exactly the same as the one without. But, like, the second one…

502
00:55:12.060 --> 00:55:25.220
David Bau: because the definition one didn't work, so I tried, like, looking into the dataset and, like, making a more precise, a little bit dataset-specific definition, which definitely makes it, like, better. Oh, nice. But,

503
00:55:25.320 --> 00:55:29.989
David Bau: I don't know, like, this definition might not be realistic, given that we're looking at earnings calls.

504
00:55:30.630 --> 00:55:39.219
David Bau: But, given those, even though, like, it's bad at, those without uncertainty, we still feel like at least it is, like.

505
00:55:39.580 --> 00:55:48.160
David Bau: Separating things, like, for example, on those certain samples, it's, like, half-half, but, like, on the uncertain samples, it predicts uncertain, like, 100% of the time.

506
00:55:48.290 --> 00:55:59.939
David Bau: So, we try, like, a little bit of activation patching, where, our question is, like, which layer encodes information regarding a statement's uncertainty level, and for this, we,

507
00:56:00.300 --> 00:56:20.069
David Bau: We did not do, like, token-level, patching, because, like, at the end of the day, uncertainty concepts would be dispersed across the entire earnings calls, so we do not really care about, like, oh, which token encodes that information, but rather, we sort of, like, ask which… well, like, in this

508
00:56:20.140 --> 00:56:30.159
David Bau: experiment, for example, we basically patch at the end of sentence period, for this statement, and so we construct, like, contrasted pairs, so, like, for each

509
00:56:30.190 --> 00:56:42.540
David Bau: For each sample, we have, like, a corresponding contrast sample that talks about the same thing, for example, GDP growth. But, like, with the uncertainty label flipped.

510
00:56:42.810 --> 00:56:49.289
David Bau: And, we ran this across, like, 100 contrasted pairs, and it seems like the…

511
00:56:49.490 --> 00:56:55.539
David Bau: So, you could see, like, Clean baseline is the ones with uncertainty, so, like,

512
00:56:55.580 --> 00:57:14.029
David Bau: this is, like, basically the logit difference between yes and no, and it's, like, very high. But, like, for the corrupted baseline, it's, like, the logit difference between yes and no on the certain samples. Similar to, like, our model performance results, it is, like, near zero, because it is, like.

513
00:57:14.060 --> 00:57:26.930
David Bau: half of the time, it also outputs, like, yes when the sample has no uncertainty. So, like, it sort of corresponds to our empirical testing, but basically, the takeaway is that at layer 12-ish,

514
00:57:27.310 --> 00:57:35.149
David Bau: Patching in the clean sample's activation at the period token helps recover, like.

515
00:57:35.290 --> 00:57:39.569
David Bau: about half of the logit difference. So, we do think that this is, like.

516
00:57:39.900 --> 00:57:42.880
David Bau: Strong evidence that the uncertainty

517
00:57:43.000 --> 00:57:45.200
David Bau: concept is, like, present here, I guess.
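
(A rough sketch of this experiment for a Llama-style HuggingFace model; the model.model.layers[i] layout, the aligned period-token position pos, and the helper names are assumptions. Cache the clean run's activation at the period token, patch it into the corrupted run at one layer, and measure the recovered yes-minus-no logit difference:)

import torch

def run_with_hook(model, tok, prompt, layer, pos, patch=None, cache=None):
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        if cache is not None:
            cache.append(hidden[:, pos, :].detach().clone())
        if patch is not None:
            hidden[:, pos, :] = patch  # overwrite the residual at this site
        return output
    handle = model.model.layers[layer].register_forward_hook(hook)
    try:
        ids = tok(prompt, return_tensors="pt").input_ids
        with torch.no_grad():
            return model(ids).logits[0, -1]
    finally:
        handle.remove()

def patched_logit_diff(model, tok, clean, corrupted, layer, pos, yes_id, no_id):
    cache = []
    run_with_hook(model, tok, clean, layer, pos, cache=cache)   # save clean activation
    logits = run_with_hook(model, tok, corrupted, layer, pos, patch=cache[0])
    return (logits[yes_id] - logits[no_id]).item()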

518
00:57:47.340 --> 00:57:58.830
David Bau: And here is the PCA result, and then we can see that in the layer 0, it's a… okay, separate, but not that separate. But in the next layer, it's becoming very separate. Next, please.

519
00:58:00.180 --> 00:58:04.389
David Bau: And in the middle, they are all very separate and not mixed.

520
00:58:04.570 --> 00:58:07.380
David Bau: And in the last, it became mixed again.

521
00:58:07.640 --> 00:58:15.180
David Bau: Yes, it's very expensive. And this is the four-label PCA. And, in the… Okay.

522
00:58:15.450 --> 00:58:18.269
David Bau: Yeah, that would be pretty, Cindy.

523
00:58:18.690 --> 00:58:23.509
David Bau: We'll skip to… Oh, oh, we also want to,

524
00:58:23.660 --> 00:58:40.420
David Bau: see whether the language model can tell value from risk, so we do, also… what is that experiment? Okay, oh, we did an experiment to see… to decouple this. And in the result you can see that in layer 20 and layer 24, it's almost 90…

525
00:58:40.740 --> 00:58:56.889
David Bau: So for this one, basically, we were talking about, like, the first and second moment separation of concepts. The second moment is, like, the uncertainty, which is the variance. The first moment is, like, how bad things are, which is, like, sentiment, basically. And we want to see if the model

526
00:58:56.890 --> 00:59:02.959
David Bau: sort of, like, understands that, oh, "the market is bad" doesn't mean the market is uncertain. Like…

527
00:59:03.200 --> 00:59:15.060
David Bau: So, like, we tried sort of, like, getting the direction of uncertainty versus the direction of sentiment, and, like, calculating the cosine similarity between those.
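
(A sketch of that check, estimating each concept direction as a difference of centroids on the toy data and comparing them; all names are illustrative:)

import numpy as np

def concept_direction(X_pos, X_neg):
    return X_pos.mean(axis=0) - X_neg.mean(axis=0)

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# uncertainty_dir = concept_direction(X_uncertain, X_certain)
# sentiment_dir = concept_direction(X_bad_news, X_good_news)
# A cosine near 0 would suggest the two concepts lie on separate axes.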

528
00:59:15.210 --> 00:59:18.750
David Bau: And this is on the toy dataset. So I guess…

529
00:59:19.590 --> 00:59:28.369
David Bau: like, one direction for our research project would be, like, going further, like, I think this direction of separating the two,

530
00:59:28.590 --> 00:59:38.879
David Bau: like, uncertainty direction and the, like, first moment, second moment directions is really interesting. And, so we had, like, localized, sort of, like, localized the uncertainty concept.

531
00:59:38.880 --> 00:59:51.589
David Bau: using this synthetic data, we might want to make it better, make it less templated, and then actually recover this uncertainty direction, and do the same thing on the sentiment direction, or, like, the first moment direction, and then see, like.

532
00:59:51.630 --> 00:59:53.939
David Bau: If we could do some causal intervention.

533
00:59:54.130 --> 00:59:57.780
David Bau: And see, like, are they… are the model using them separately, independently?

534
01:00:04.380 --> 01:00:06.460
David Bau: Yeah, a lot of good stuff.

535
01:00:07.390 --> 01:00:09.190
David Bau: Any suggestions for this team?

536
01:00:13.480 --> 01:00:17.989
David Bau: Can we see the activation of how she feels like reports.

537
01:00:18.380 --> 01:00:20.250
David Bau: The contrasted pairs.

538
01:00:21.260 --> 01:00:24.499
David Bau: You want to see the data, the sentences? Yeah.

539
01:00:27.960 --> 01:00:29.440
David Bau: That's the way you're looking at that.

540
01:00:30.470 --> 01:00:33.459
David Bau: One of the questions that I have about this is.

541
01:00:34.450 --> 01:00:36.479
David Bau: You know, when you have all this synthetic…

542
01:00:36.960 --> 01:00:41.599
David Bau: Data, you, you wonder, with these small data sets, actually, even with the natural data.

543
01:00:41.950 --> 01:00:49.309
David Bau: You know, are you… are you getting… are you basically getting unigram detectors? Are you just, like, you know, detecting…

544
01:00:49.540 --> 01:00:51.379
David Bau: A vocabulary choice.

545
01:00:52.240 --> 01:00:55.839
David Bau: And I wonder if it's possible

546
01:00:56.490 --> 01:01:02.519
David Bau: To do something, to generate text that had the same vocabulary with words in different order.

547
01:01:03.130 --> 01:01:06.030
David Bau: That… that conveyed it, and…

548
01:01:06.490 --> 01:01:14.050
David Bau: a little bit of… it's a little bit of one of these crossword puzzle problems, right? Is there some way of constructing different sentences that have

549
01:01:14.210 --> 01:01:16.659
David Bau: Really different uncertainties using the same words.

550
01:01:18.420 --> 01:01:21.229
David Bau: I see. Okay. So the answer would be…

551
01:01:21.430 --> 01:01:25.549
David Bau: If the model, like, performs differently, then…

552
01:01:25.770 --> 01:01:36.799
David Bau: Yeah, like, it's a little suspicious, like, when you have the PCA separation, and even at layer 0, right? It's, like, pretty separated.

553
01:01:36.910 --> 01:01:53.010
David Bau: It's a problem of data generation, which is that if we create it with the same words, that means there are only a few words that tell the difference in uncertainty. Yeah. And maybe the model will easily separate them. You could easily separate it. Yeah. Yes.

554
01:01:53.210 --> 01:01:56.380
David Bau: Yes, and you just, you just do PCA, you're just getting…

555
01:01:56.710 --> 01:02:01.400
David Bau: Yeah. The separation between active sports and doesn't have sports. Yeah, yeah, yeah, exactly.

556
01:02:04.010 --> 01:02:04.940
David Bau: Oops.

557
01:02:05.200 --> 01:02:07.379
David Bau: But what we were, when we were thinking about this.

558
01:02:07.540 --> 01:02:09.359
David Bau: This, like, contrasted pair.

559
01:02:09.850 --> 01:02:10.650
David Bau: Cool.

560
01:02:11.330 --> 01:02:18.920
David Bau: Yeah, our contrasted pairs are… I guess I've been thinking about more similar pairs.

561
01:02:19.040 --> 01:02:26.419
David Bau: But then, my problem with that was, are we… just picking up the…

562
01:02:27.570 --> 01:02:33.159
David Bau: the difference between the words that we have changed, or the concepts. Right.

563
01:02:33.330 --> 01:02:34.120
David Bau: Right.

564
01:02:34.460 --> 01:02:41.409
David Bau: Yeah, this tree, yes. You like this better? Yeah. You like this better?

565
01:02:41.770 --> 01:02:46.309
David Bau: Do we lose the precision on the token position level?

566
01:02:46.460 --> 01:02:48.980
David Bau: Because this is, like, more abstract than, like.

567
01:02:49.660 --> 01:02:51.980
David Bau: It's like, they looked different, but…

568
01:02:52.370 --> 01:02:58.159
David Bau: because I, like, I guess our belief is that you couldn't expect uncertainty to be located, like.

569
01:02:58.910 --> 01:03:03.599
David Bau: on, like, a word level, which is exactly why we're looking at it. Like, we wanted to…

570
01:03:04.230 --> 01:03:19.189
David Bau: Okay, maybe a dumb question, but, like, I think one of you works with Baker from Baker, Bloom, and Davis. Isn't EPU, like, literally, they just look for words, though? Like, they're looking for words like tariffs, or, like… Yeah, you mean, like, the model is…

571
01:03:19.670 --> 01:03:26.590
David Bau: I don't even know what I think their model is, they just have, like, a list of words, so, like, I was just… because you said that, like, it's not…

572
01:03:26.720 --> 01:03:33.619
David Bau: like, uncertainty isn't being expressed at the word level, but… like, when I read through transcripts, like, for earnings calls, it's usually just, like.

573
01:03:33.940 --> 01:03:36.000
David Bau: They're talking about, like, tariffs or something.

574
01:03:37.070 --> 01:03:45.210
David Bau: And there's, like, a… like, a pretty major benchmark, like the Policy Uncertainty Index, and I think they also just look at words.

575
01:03:45.860 --> 01:04:02.460
David Bau: I guess the motivation Veronica originally had was that, like, yeah, people have been looking at, like, words, back-of-words, style models, but, they're also, like, trying to use LMs to gen… like, and see, like, if they are better, and I think they are. There are some, like.

576
01:04:02.570 --> 01:04:08.319
David Bau: for example, one paper used LMs to, like, generate the, like, the EPU measure, and then, like.

577
01:04:08.520 --> 01:04:20.320
David Bau: that actually… that, like, on a scale of 0 to 10, something like that, and, like, that correlates with economic outcomes. And, like, sort of, like, the point is to see, like, whether LLMs can actually capture those.

578
01:04:20.600 --> 01:04:25.129
David Bau: If the conclusion is it is on the word level.

579
01:04:25.650 --> 01:04:27.530
David Bau: That's, like, a failure, but, like…

580
01:04:28.970 --> 01:04:32.510
David Bau: If you could have just come up with a bag-of-words model, a Boolean one.

581
01:04:33.030 --> 01:04:37.430
David Bau: That's kind of disappointing? That's what you're saying? Yeah. Yeah.

582
01:04:40.620 --> 01:04:46.229
David Bau: You have these different types of… I still, like, have this question of…

583
01:04:47.130 --> 01:04:50.109
David Bau: Are different types of uncertainty the same?

584
01:04:50.490 --> 01:04:55.450
David Bau: And you have different data sets. You've got… Your number experiments.

585
01:04:56.430 --> 01:05:01.719
David Bau: Where I can… I can totally imagine number experiments having balanced tokens.

586
01:05:02.150 --> 01:05:03.690
David Bau: Where, you know.

587
01:05:03.860 --> 01:05:10.590
David Bau: You switch the number… the roles of the numbers around, or something like that, in some way in a sentence, and it changes.

588
01:05:11.270 --> 01:05:17.300
David Bau: you know, the probability is a lot, while it's just the same numbers. And, and…

589
01:05:17.750 --> 01:05:23.080
David Bau: And I can totally imagine that the models are pretty good at understanding number sentences.

590
01:05:25.420 --> 01:05:34.930
David Bau: And then for these other sentences, you know, I do buy it that there's some words that just are inherently about uncertainty, like the word uncertain

591
01:05:35.560 --> 01:05:41.400
David Bau: You know, maybe you just say the word, and it's uncertain. And so the thing I wonder is…

592
01:05:41.760 --> 01:05:47.710
David Bau: when you have, like, unigram uncertainty, when you have, like, an utterance or something where, like, the word uncertainty is…

593
01:05:48.010 --> 01:05:56.660
David Bau: It's just an abstract concept, and you have uncertainty that shows up because of a situation, maybe because you have certain bets or something like that.

594
01:05:57.190 --> 01:06:03.130
David Bau: is that the same thing to the model? Right? Like, I'm kind of interested in… as a reader, I'm like.

595
01:06:03.290 --> 01:06:07.640
David Bau: Interested in whether… these two worlds.

596
01:06:07.930 --> 01:06:13.989
David Bau: are the same inside the model, right? Like, the old bag-of-words thing that the economists were doing.

597
01:06:14.340 --> 01:06:18.269
David Bau: just looking for words. When the LM sees a word, does it…

598
01:06:19.040 --> 01:06:23.840
David Bau: Does it have an internal reaction that's the same as when it sees, like.

599
01:06:24.240 --> 01:06:26.410
David Bau: You know, a certain set of cards on the table.

600
01:06:26.780 --> 01:06:30.489
David Bau: Something like that, it's like, oh, you know, now, now I have no idea.

601
01:06:30.700 --> 01:06:35.119
David Bau: whether I should bet one way or the other, because there's a lot of uncertainty for this poker hand.

602
01:06:35.630 --> 01:06:37.930
David Bau: You know?

603
01:06:38.760 --> 01:06:41.570
David Bau: Without, without having to say that word uncertain.

604
01:06:42.230 --> 01:06:44.510
David Bau: You know?

605
01:06:44.630 --> 01:06:48.079
David Bau: Yeah, so, like, is that… or are those totally different concepts?

606
01:06:49.550 --> 01:06:52.249
David Bau: We do need to poke around with the data, I guess, a little more.

607
01:06:53.150 --> 01:06:55.020
David Bau: Like you would, for example, show…

608
01:06:56.300 --> 01:07:02.930
David Bau: Yeah, like, what would… what would it do? It would, like, if you… Like, if you patched…

609
01:07:03.660 --> 01:07:06.189
David Bau: If you found ways of patching across.

610
01:07:06.500 --> 01:07:14.199
David Bau: Very different domains. If you patch from the number domain to the word domain, or from the word domain to the number domain, or something like that.

611
01:07:14.440 --> 01:07:18.830
David Bau: And, and, you know… You can have the same effects?

612
01:07:19.110 --> 01:07:25.680
David Bau: That might make sense. So, you patch a little bit, you have some effects, but sort of in domain.

613
01:07:25.970 --> 01:07:29.020
David Bau: But, maybe, maybe approximate patches are worth

614
01:07:29.190 --> 01:07:30.900
David Bau: playing with… I don't know.

615
01:07:31.140 --> 01:07:32.860
David Bau: I don't know, it's just a crazy idea.

616
01:07:36.430 --> 01:07:38.190
David Bau: Anything else?

617
01:07:38.960 --> 01:07:41.299
David Bau: We would suggest for this, this project.

618
01:07:44.970 --> 01:07:51.920
David Bau: I was going to suggest to our group, but, like, one of the easiest things we can do is, like, switch numbers around.

619
01:07:52.520 --> 01:07:56.660
David Bau: Which is, like, if… Like, have one number that's, like.

620
01:07:57.060 --> 01:08:05.460
David Bau: Like, we expect a 0.1% variance and, like, 20% growth, and then it's, like, 20% variance and 0.1% growth.

621
01:08:05.580 --> 01:08:07.170
David Bau: Yeah, of course.

622
01:08:07.410 --> 01:08:11.399
David Bau: Yeah, it's the same numbers, but, you know, play different roles.
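
(A tiny sketch of that control: identical vocabulary and identical numbers, with only the numbers' roles exchanged; the template is illustrative:)

TEMPLATE = "We expect {var}% variance and {growth}% growth."

def swapped_pair(a, b):
    low_uncertainty = TEMPLATE.format(var=a, growth=b)   # 0.1% variance, 20% growth
    high_uncertainty = TEMPLATE.format(var=b, growth=a)  # 20% variance, 0.1% growth
    return low_uncertainty, high_uncertainty

print(swapped_pair(0.1, 20))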

623
01:08:12.070 --> 01:08:12.840
David Bau: Yeah.

624
01:08:16.870 --> 01:08:17.960
David Bau: That's cool.

625
01:08:22.580 --> 01:08:25.429
David Bau: So, yeah, you're seeing separation.

626
01:08:26.090 --> 01:08:28.359
David Bau: You've seen sort of causal effects.

627
01:08:30.660 --> 01:08:35.449
David Bau: But you're, you're all looking, you're all looking like cold, so do we have something yet?

628
01:08:36.620 --> 01:08:39.150
David Bau: And do you… it doesn't feel like you have something yet, right?

629
01:08:40.840 --> 01:08:47.509
David Bau: Yeah, right. So you have, like, you have these questions about Layer 0 responding, so there's all these confounders.

630
01:08:48.029 --> 01:08:57.569
David Bau: Right? And… and then… and then even if you do get peer separation, you're not sure that it's surprising that you know that the models

631
01:08:58.270 --> 01:09:03.440
David Bau: can classify, so you look inside, you can classify, what's the big deal?

632
01:09:03.859 --> 01:09:04.580
David Bau: Right.

633
01:09:04.720 --> 01:09:08.530
David Bau: So I think that it's… but maybe the questions are…

634
01:09:08.680 --> 01:09:11.869
David Bau: That are interesting are these kinds of things, like.

635
01:09:12.840 --> 01:09:18.229
David Bau: I guess, like, what you asked, like, what is the model using to classify it? Does it need the words?

636
01:09:18.870 --> 01:09:22.669
David Bau: To be able to tell what's going on? What does it do if you take away the words?

637
01:09:23.649 --> 01:09:27.870
David Bau: does it… does it still work? Like, if you… is it the same concept, just…

638
01:09:28.399 --> 01:09:30.720
David Bau: If it had very balanced numbers and things like that.

639
01:09:30.890 --> 01:09:32.250
David Bau: That's another way of saying…

640
01:09:33.120 --> 01:09:36.340
David Bau: You know, it doesn't need the words. If the words are present, they'll use them.

641
01:09:36.660 --> 01:09:38.719
David Bau: But I think about it some other ways.

642
01:09:42.760 --> 01:09:46.050
David Bau: Yeah, I'm not sure. What would you have to do to nail down that story?

643
01:09:47.000 --> 01:09:50.000
David Bau: If we could have, like, a dataset that is,

644
01:09:50.920 --> 01:09:53.789
David Bau: Not having the bag of words, but…

645
01:09:54.029 --> 01:10:00.580
David Bau: like, our uncertain, like, we know of that. That would be great. Oh, you just, like, blacklist the whole world, so you have the standard economic model.

646
01:10:01.170 --> 01:10:05.550
David Bau: you turn on your LLM, make a lot of uncertain texts that violate the model.

647
01:10:06.170 --> 01:10:08.590
David Bau: Hello, LLM, your mission is…

648
01:10:08.910 --> 01:10:12.540
David Bau: To… to come up with text that scores badly on this.

649
01:10:14.360 --> 01:10:16.310
David Bau: But actually, the other way.

650
01:10:17.890 --> 01:10:19.590
David Bau: And then you can look inside those.

651
01:10:20.520 --> 01:10:24.130
David Bau: Yeah, maybe something like that, that's cool. That would be a cool dataset.

652
01:10:25.070 --> 01:10:32.750
David Bau: Do you know what, like, some of the transcripts look like, where, like, a model labels it as uncertain, but it doesn't, like, contain any of the words?

653
01:10:35.060 --> 01:10:51.010
David Bau: Like, like, like, did you, like, I was wondering if you guys had looked at, like, basically transcripts that get labeled as uncertain by, like, an LLM, but don't contain, like, any of the words in their, like, EPU bag of words.

654
01:10:51.170 --> 01:10:57.259
David Bau: It's just… that's really interesting that it can be better. Actually, there are a few, because in reality.

655
01:10:57.510 --> 01:11:05.550
David Bau: doing a planning call, you, like, the one who speak, don't want others to know they're uncertain about their financial

656
01:11:05.720 --> 01:11:06.740
David Bau: like, conditions.

657
01:11:07.180 --> 01:11:10.529
David Bau: So they're… when they're talking, they're actually hiding.

658
01:11:10.900 --> 01:11:15.719
David Bau: I would say. So… In many cases.

659
01:11:16.080 --> 01:11:27.010
David Bau: the, like, I'm not quite sure of all other, like, uncertain… these rules were not directly showing in the, conversation. But the LLM knows.

660
01:11:27.230 --> 01:11:29.189
David Bau: Yeah,

661
01:11:32.480 --> 01:11:47.850
David Bau: I would say there are. Like, as you can see in the result, there is, like, nearly 50% disagreement on the different levels of uncertainty between different models. So I would say models are quite…

662
01:11:48.640 --> 01:11:52.610
David Bau: Not very… not quite sure about the uncertainty.

663
01:11:52.950 --> 01:11:56.420
David Bau: Oh, and the words aren't present. Yes. Oh, interesting.

664
01:11:58.430 --> 01:12:03.759
David Bau: But in aggregate, like, it outperforms EPU on predicting, like, different economic…

665
01:12:03.900 --> 01:12:06.690
David Bau: But… but that question, like, one…

666
01:12:06.850 --> 01:12:15.859
David Bau: We haven't tried, like, larger models, so we are not quite sure whether the model's parameters will affect, like, given a…

667
01:12:16.070 --> 01:12:23.210
David Bau: More frontier models will… give better results, but… as far as we have,

668
01:12:23.780 --> 01:12:28.139
David Bau: it's not… works good. What's that policy, isn't?

669
01:12:29.220 --> 01:12:40.630
David Bau: Gemma 2 9B, Llama 8B, and Qwen 7B. It's, like, rather small. 7? 7, yeah. All less than 10 billion. Yeah. So I've been very happy

670
01:12:40.750 --> 01:12:51.220
David Bau: with the capabilities… so the next size up, like, 30… like, 10 to 30B, you know, Llama 70B, like, I've been pretty happy that they…

671
01:12:51.340 --> 01:13:08.949
David Bau: you know, can do pretty good linguistic understanding, they have pretty good theory of mind, they have good other things. Whereas, when you get down to 7B models, like, they're pretty grammatical, they understand a lot of things about the world, but they're very simplistic, right? It's like talking to a little kid, and

672
01:13:09.450 --> 01:13:11.239
David Bau: They don't have good theory of mind.

673
01:13:11.390 --> 01:13:18.479
David Bau: Yeah, so, like, if you think about that as a milestone, right, like, the difference between a 4-year-old and a 5-year-old is…

674
01:13:18.860 --> 01:13:20.460
David Bau: Like a 5-year-old.

675
01:13:20.580 --> 01:13:22.860
David Bau: Will be able to second-guess you.

676
01:13:23.210 --> 01:13:31.190
David Bau: And also, you know, play a theory-of-mind game with you, and a 4-year-old will just believe everything that you say, everything's just surface level, there's nothing…

677
01:13:31.480 --> 01:13:33.209
David Bau: Nothing under the surface.

678
01:13:33.520 --> 01:13:40.460
David Bau: And maybe these small models can't do theory of mind; that's, like, the mindset to kind of look at it with.

679
01:13:40.880 --> 01:13:45.260
David Bau: You're sort of… you're sort of asking… asking a 4-year-old to read the earnings calls.

680
01:13:46.130 --> 01:14:00.949
David Bau: It makes sense. Yeah, yeah, yeah. Yeah, yeah. If you use Open Router, it's good for testing them out before you go, like, all in, and you're like, I'm gonna collect activations. Oh, I forgot to mention to everybody, oh, we have, there's a budget.

681
01:14:01.320 --> 01:14:07.629
David Bau: for, for some… for some compute.

682
01:14:07.990 --> 01:14:11.400
David Bau: that, that Aruna has raised.

683
01:14:11.560 --> 01:14:15.370
David Bau: Like, thousands of dollars for you guys to just spend.

684
01:14:15.730 --> 01:14:23.059
David Bau: So, and so she has… she's gotten a credit card that we can reimburse, and got an account.

685
01:14:23.230 --> 01:14:24.809
David Bau: And all you have to do

686
01:14:25.470 --> 01:14:30.339
David Bau: is go on the Discord, or, you know, find, find, find the email, or whatever.

687
01:14:30.570 --> 01:14:42.550
David Bau: It's… so there's a RunPod, thing for running, you know, detailed experiments. So if you want to, like, set up some training runs or something like that, it's inconvenient to do other ways. You can use this RunPod thing. And…

688
01:14:42.920 --> 01:14:51.240
David Bau: Apparently, the invitation expires this week. Today, actually. It expires today? Expires today.

689
01:14:51.630 --> 01:14:54.689
David Bau: Not even tomorrow? Okay.

690
01:14:55.930 --> 01:15:07.249
David Bau: So don't forget, before you go eat your dinner, right, go get, like, this free account thing. And, yes, if you miss it, of course, we can send it again.

691
01:15:07.350 --> 01:15:14.640
David Bau: You know, say, oh, sorry, I missed it. But, yes, go get the free thing. There's $1,500 right now.

692
01:15:14.840 --> 01:15:24.749
David Bau: Yes. I was wondering if it's, like, $1,500 per team, or… I think it's the class. It's just the whole class. So, yeah, so use it fast before it's all gone.

693
01:15:25.010 --> 01:15:30.430
David Bau: No, it'll be fine. I think that we can refill it, stuff like that. It'll be fine.

694
01:15:30.810 --> 01:15:31.670
David Bau: Thank you.

695
01:15:33.110 --> 01:15:38.169
David Bau: Yeah, I think… Okay, thanks a lot, you guys. Yeah, who's next?

696
01:15:41.140 --> 01:15:47.360
David Bau: Is Team Spatial the team that wants to do more next week, or is it this… both?

697
01:15:48.910 --> 01:15:50.130
David Bau: Can you stay tuned.

698
01:15:52.480 --> 01:15:56.329
David Bau: He went into the washroom. Oh! Team Spatial is in the washroom.

699
01:15:57.020 --> 01:16:00.230
David Bau: I don't know if people are doing it on the Zoom. Anybody on Zoom?

700
01:16:05.620 --> 01:16:08.480
David Bau: It's next. We can, we can click next for that.

701
01:16:10.140 --> 01:16:12.999
David Bau: Anybody on the team that wants to present today?

702
01:16:15.830 --> 01:16:17.080
David Bau: Oh, that's a bad thing.

703
01:16:17.230 --> 01:16:18.250
David Bau: Oh.

704
01:16:18.890 --> 01:16:19.840
David Bau: Oh.

705
01:16:25.820 --> 01:16:26.770
David Bau: Pretty good.

706
01:16:27.360 --> 01:16:35.219
David Bau: Okay, we'll come back to the spatial next. Thanks, Ross, but then… Okay, but you wanna, but you wanna, you wanna skip? Yeah. Okay, so we'll go to the next.

707
01:16:35.540 --> 01:16:42.500
David Bau: We'll come back to Team Spatial at a different time, so… So, here, next one here.

708
01:16:42.930 --> 01:16:44.150
David Bau: Team Power.

709
01:16:57.720 --> 01:17:00.339
David Bau: Really cool the name.

710
01:17:09.590 --> 01:17:30.519
David Bau: Cool. Okay, we have some patching experiments to report on, and some data set construction, kind of next-step stuff to report on. So we'll try to go quick. The first is, so this is, like, the setup that we've been playing around with. This is, like, the prompt: in the following sentence, does entity exercise power over someone or something else, true or false?

711
01:17:30.520 --> 01:17:41.539
David Bau: Then insert the sentence, what's the answer? We've been trying to shorten this a little bit, and then playing around with true or false versus yes, no, but this is the setup we're going to be using.

712
01:17:41.540 --> 01:17:48.520
David Bau: So, like, this is the example. Sometimes Tony reassigned them to the basement office, or they reassigned Tony to the basement office.

713
01:17:48.620 --> 01:18:06.749
David Bau: And, we've been playing around with patching. So this is what that sentence looks like, and we get a lot of, like, clear signal down here at the period, and then at the colon, in terms of flipping the outcome from true to false.
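
(A sketch of the sweep behind a heatmap like this, reusing a patched_logit_diff-style routine such as the one sketched earlier for the uncertainty team, with True/False token ids in place of yes/no; position indices are assumed to be aligned between the clean and corrupted prompts:)

import numpy as np

def patch_grid(model, tok, clean, corrupted, n_layers, n_positions,
               true_id, false_id):
    grid = np.zeros((n_layers, n_positions))
    for layer in range(n_layers):
        for pos in range(n_positions):
            grid[layer, pos] = patched_logit_diff(
                model, tok, clean, corrupted, layer, pos, true_id, false_id)
    return grid  # plot as a heatmap: rows are layers, columns are token positions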

714
01:18:06.780 --> 01:18:09.670
David Bau: And then, it's a little bit, kind of.

715
01:18:10.270 --> 01:18:25.200
David Bau: a little bit more indecipherable in here. This is the they token, so going from they to Tony, and then Tony to they, and that actually bumps it in the opposite direction, so it makes it more likely to be false when you kind of patch there. So,

716
01:18:25.540 --> 01:18:34.769
David Bau: I think we're… oh, it's weird, so it goes the opposite. Yeah, but I think because we're, like, switching the position of the people, right, so the Tony…

717
01:18:35.230 --> 01:18:42.560
David Bau: When Tony gets patched in, now there's two Tonys in the sentence? So you think this is a pointer effect when you're, like, mixing up the pointers, is the thing.

718
01:18:42.760 --> 01:18:45.380
David Bau: As opposed to mixing up the power structure.

719
01:18:45.620 --> 01:18:49.790
David Bau: Yeah, that might… yeah, there's something going on about… yeah.

720
01:18:52.170 --> 01:19:11.660
David Bau: Yeah, we don't have, like, a very clear articulation of that. And then if we flip the prompt around, and so it's something or someone exercising power over Tony, we get somewhat similar structure. It's less pronounced on the first token here, but at the Tony token, there is a big…

721
01:19:12.890 --> 01:19:17.780
David Bau: shift, and then,

722
01:19:18.360 --> 01:19:32.669
David Bau: Yeah, so within this setup, we've been trying to figure out if there's other ways to kind of get at power beyond just kind of flipping the order of the entities, so Armita's been thinking very carefully about this, and yeah. Over to you.

723
01:19:32.730 --> 01:19:44.699
David Bau: So it's the… This is another set of data that we switch over to see how it impacts the output, and we can see that,

724
01:19:45.210 --> 01:19:47.780
David Bau: And "fired" is changed to "greeted."

725
01:19:48.060 --> 01:19:55.520
David Bau: Then there is some… Something moved on here, where… Also these two positions.

726
01:19:55.730 --> 01:20:10.959
David Bau: But, I'm still thinking that if we change the, like, prompt to other things, like, not exercising power, but something that

727
01:20:11.660 --> 01:20:17.699
David Bau: Asks about a characteristic which, true and false, means something for it.

728
01:20:18.270 --> 01:20:25.050
David Bau: then I still expect to see this here, see the same, like, pattern results.

729
01:20:25.230 --> 01:20:27.980
David Bau: Even if it has nothing to do with power.

730
01:20:28.730 --> 01:20:33.730
David Bau: Right. Yeah, and we're still thinking how we can, right,

731
01:20:34.300 --> 01:20:36.270
David Bau: pick the power out from this.

732
01:20:36.530 --> 01:20:37.280
David Bau: Right.

733
01:20:39.020 --> 01:20:46.110
David Bau: Do you have an interpretation of what this means? Like, it's switching from true-false at that word. Is that it?

734
01:20:46.680 --> 01:20:55.480
David Bau: Also, some, the dynamic of the power is, very related to the verb.

735
01:20:55.690 --> 01:21:00.589
David Bau: Like, firing, or doing something, forceful.

736
01:21:00.890 --> 01:21:01.570
David Bau: Like…

737
01:21:02.030 --> 01:21:08.749
David Bau: There's a lot of semantic meaning in the verb, that's kind of what we're trying. One interesting experiment would be if you're…

738
01:21:08.830 --> 01:21:25.540
David Bau: make it a passive sentence. Like, you flip the sentence around and switch the position of the verb and see if that… like, instead of just switching the verb, also switching the position of the verb and see if this moves to that position. Like, the employee was greeted by Sean.

739
01:21:30.840 --> 01:21:33.780
David Bau: The combination of this one and the previous one.

740
01:21:35.070 --> 01:21:39.000
David Bau: Yeah, just to confirm that it is the verb,

741
01:21:40.060 --> 01:21:43.259
David Bau: And it stores the relationship between

742
01:21:43.480 --> 01:21:46.710
David Bau: I'm just thinking, like, if it's a position thing or a verb thing.

743
01:21:49.440 --> 01:21:57.039
David Bau: Yeah, because we're asking about Sean, and then this is the word that comes after it, and then it goes to, like, switching this into the passive voice. Yeah, yeah. Yeah.

744
01:21:59.360 --> 01:22:08.170
David Bau: Also, there are other prompts that can make a difference in the outcome, like, "to exercise power,"

745
01:22:08.170 --> 01:22:21.790
David Bau: or "to hold power": does this person hold power over someone or something? So for the sentences where the power is not as clear, it can make a difference.

746
01:22:24.380 --> 01:22:45.190
David Bau: Yeah, and then we're, like, I think where we kind of started with this was kind of, like, how much do we impose a definition of power? And, like, I think one thing that we're kind of struggling with that is that maybe we're shading into all these different kind of conceptions of power, that we're kind of bumping into different concepts within the model. And, like, so, yeah, I think we're in this place where it was, like, a little bit of a give and take between, like.

747
01:22:45.360 --> 01:22:51.780
David Bau: Having a kind of clear conception of power, versus kind of trying to figure out what is the breadth of this within

748
01:22:51.850 --> 01:23:10.340
David Bau: the model. Yeah, you… I don't know, I put a bunch of other stuff in here, but this is you, yeah. Okay, so, sorry. So, coming on into there, so we… so looking at these examples, we can see… we can see that in some cases as well, from pre… From experience in the previous weeks.

749
01:23:10.450 --> 01:23:17.890
David Bau: that there are… that there are limitations to even the instruct model versions in recognizing

750
01:23:18.040 --> 01:23:35.209
David Bau: conditions of power. For instance, there are bad false negatives. There are certain contexts where it's less confident if you have multiple modifiers, right? So as we get into different definitions of power, one… one thought was that one, one thing we advanced on was, trying to systematically expand our synthetic data set.

751
01:23:35.260 --> 01:23:40.549
David Bau: So front… so that's one thing here. So now we're cross-referencing,

752
01:23:40.700 --> 01:23:49.069
David Bau: about 200 granular power mechanisms and different attribute-based conditions, so we want… we want to try to…

753
01:23:49.370 --> 01:24:07.610
David Bau: So, for instance, like, we start with, like, I think 11 or 12 hard categories, economic power, military power, ex… you know, digital power, etc, right? But, and then from there, we articulate different contexts, so this might… in the case of military power, it might be embargoes, str…

754
01:24:07.790 --> 01:24:14.479
David Bau: Define legitimacy, strategic weapons, etc. So, and that results in about 220 different contexts.

755
01:24:14.540 --> 01:24:33.989
David Bau: around here with different attributes, because we want to be able to try and isolate for different things. So, for instance, there might be contexts where power is alluded to; using the military example, you might have an instance where it's like, oh, the commanding officer decided something, right? As opposed to the subordinate decides that, right? There are modifiers; it might be based on gender, rank,

756
01:24:33.990 --> 01:24:45.579
David Bau: and other descriptors that might encode notions of power. So to really kind of stress test that spectrum, like the landscape of it, we're trying to expand the synthetic data sets to account for these things.

757
01:24:46.760 --> 01:24:56.649
David Bau: as we said, we're doing a full matrix cross-referencing, so it's about 3,000 records for positive, and then we're also constructing one for negative. We'll get into that next.
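
(A sketch of that cross-referencing; the categories, contexts, and modifiers below are illustrative stand-ins for the ~220 contexts and 11 attribute conditions described:)

import itertools

categories = {
    "military": ["commanding officer orders", "embargo enforcement"],
    "economic": ["employer dismissal", "creditor leverage"],
}
modifiers = [None, "gender", "rank"]

records = [
    {"category": cat, "context": ctx, "modifier": mod, "label": "power"}
    for cat, ctxs in categories.items()
    for ctx, mod in itertools.product(ctxs, modifiers)
]
print(len(records))  # the count scales multiplicatively toward the ~3,000 records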

758
01:24:56.790 --> 01:24:59.610
David Bau: One other thing is that, as we said.

759
01:24:59.630 --> 01:25:12.670
David Bau: we want to be able to try and scan for if it's verbs, entities, there are certain things that capture meaning, right? The issue with a lot… the issue with a lot of things, as a lot of the valuable stuff Armita and Isaac were doing, was, like, certain verbs

760
01:25:12.670 --> 01:25:26.919
David Bau: can contain a lot of information about power. So one thing is we want to have… we want to also try and create contexts that… that clearly demonstrate a power imbalance, or a lack of, but… but use… but use realistic, but less…

761
01:25:27.150 --> 01:25:37.489
David Bau: a singularly overloaded term. So, "a national army surrounds a separatist militia." "Surrounds" is… is not necessarily the same thing as "bordered" or "dismissed." You know, water can surround…

762
01:25:38.010 --> 01:25:44.780
David Bau: An island. Same thing, "intentionally relay evacuation procedure." Here the power is not a single verb.

763
01:25:45.240 --> 01:25:59.700
David Bau: Right? So we're trying to capture a lot of linguistic variance, too. So the whole, the whole point with this expansion is that, for next week or later on, we're trying to create something we can use to create a systematic profile. How many, how many different types of power did you say again, you guys?

764
01:26:00.000 --> 01:26:12.759
David Bau: So, I think we're at… I removed… Yeah, it's about 220, yeah, different contexts, and I… I removed a couple, supposed to be attribute conditions, but we're gonna have, like, 11 bias conditions.

765
01:26:13.070 --> 01:26:23.879
David Bau: So it's gonna be… and that's for positive power, versus there being no power. We want to actually open this up a bit, because while we're talking about isolating for verbs,

766
01:26:23.880 --> 01:26:38.820
David Bau: The… there's a rough… I mean, absolutely do some quality control things, but the rough draft of this data set, at least in the first instance, is close to ready. But we… while we're talking about different conditions to isolate for, one thing that Armita brought up was.

767
01:26:38.820 --> 01:26:57.400
David Bau: Are there different formulations of power by which we can ask the prompt, right? Because if we're saying, is power being exercised? I mean, I believe you had a couple of other formulations we can ask the question, right? Because if we're asking the question the same way we generated the data, maybe that's not correct. There might be different…

768
01:26:57.400 --> 01:27:02.449
David Bau: she identified several different ways we can rephrase the question, so we're trying… I think the next step

769
01:27:02.800 --> 01:27:12.880
David Bau: is going to be a systematic scan, and then we're probably going to hone in on a couple of things. As we expand the data set, we're actually hoping to ask some questions about what might be some interesting dimensions to

770
01:27:13.240 --> 01:27:14.460
David Bau: interact with.

771
01:27:15.130 --> 01:27:16.680
David Bau: in these data sets.
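
A minimal sketch of what such a formulation scan could look like: ask the same scenario under several phrasings and check whether the answers are stable. The phrasings and the `ask_model` callable are hypothetical placeholders for whatever query interface the project actually uses:

```python
# The phrasings below and the `ask_model` callable are hypothetical
# placeholders for whatever query interface the project actually uses.
FORMULATIONS = [
    "Is power being exercised in this sentence? Answer yes or no.",
    "Does one party hold authority over the other? Answer yes or no.",
    "Who, if anyone, is in control here? Answer with a name or 'nobody'.",
]

def scan_formulations(sentence, ask_model):
    """Ask the same scenario under several phrasings; unstable answers
    suggest the result depends on the wording, not the concept."""
    return {q: ask_model(f"{sentence}\n\n{q}") for q in FORMULATIONS}
```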

772
01:27:16.900 --> 01:27:17.830
David Bau: Yeah.

773
01:27:18.000 --> 01:27:23.090
David Bau: Yeah, so my feedback…

774
01:27:23.200 --> 01:27:27.540
David Bau: So, for this and some of the other projects, right, I do think that

775
01:27:27.720 --> 01:27:34.280
David Bau: You know, people ask… one of the questions was, you know, is probing second-best?

776
01:27:34.910 --> 01:27:35.869
David Bau: Whatever, right?

777
01:27:36.110 --> 01:27:40.600
David Bau: You know, once you develop a dataset

778
01:27:41.270 --> 01:27:43.650
David Bau: that has a pretty clear idea

779
01:27:44.180 --> 01:27:50.629
David Bau: of what the concept is that the data set is, you know, trying to capture,

780
01:27:50.740 --> 01:27:55.499
David Bau: then probing is just a really natural thing to do. I mean, it's a classification data set.

781
01:27:55.780 --> 01:27:59.680
David Bau: And you can go over all the hidden states of your…

782
01:27:59.990 --> 01:28:04.410
David Bau: model, or the ones that you think are relevant, and you can train probes on them.
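
As a rough illustration of that step, a layer-scan probe might look like the sketch below. "gpt2" and the two toy sentences are placeholders for the project's actual model and data, and in practice you would evaluate on held-out examples rather than training accuracy:

```python
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

# Layer-scan probe sketch: fit one linear classifier per layer on the
# hidden state at the final token position.
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True).eval()

texts = ["The officer ordered the recruit to wait.",  # power exercised
         "The river flows past the old mill."]        # no power relation
labels = [1, 0]

@torch.no_grad()
def states_per_layer(text):
    out = model(**tok(text, return_tensors="pt"))
    # one hidden-state vector per layer, taken at the final token position
    return [h[0, -1].numpy() for h in out.hidden_states]

all_states = [states_per_layer(t) for t in texts]
for layer in range(len(all_states[0])):
    X = [s[layer] for s in all_states]
    probe = LogisticRegression(max_iter=1000).fit(X, labels)
    print(f"layer {layer}: accuracy {probe.score(X, labels):.2f}")
```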

783
01:28:04.690 --> 01:28:08.880
David Bau: And so I think the causal side… so, like, you've got these two different types of experiments.

784
01:28:09.240 --> 01:28:13.239
David Bau: And you can go back and think about, like, what Sam Marks did

785
01:28:13.640 --> 01:28:23.239
David Bau: in using these two things together. Well, he didn't do probing exactly, he did mass-mean things, but he had a very simple binary class: is it true or not? And now you have, like, a…

786
01:28:23.400 --> 01:28:29.420
David Bau: I don't know, maybe it's a 3,000-way classifier or something, I'm not really sure, right?
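
For reference, mass-mean probing in the style of Marks and Tegmark (as I understand the reference above) takes the difference of the two class means as the probe direction. This is a generic sketch of that idea on made-up activations, not their actual code:

```python
import numpy as np

def mass_mean_probe(pos, neg):
    """Mass-mean probing: the probe direction is simply
    mean(positive states) - mean(negative states).
    pos, neg: arrays of shape (n_examples, hidden_dim)."""
    direction = pos.mean(axis=0) - neg.mean(axis=0)
    # classify by projecting onto the direction, thresholded at the midpoint
    midpoint = (pos.mean(axis=0) + neg.mean(axis=0)) @ direction / 2
    predict = lambda X: (X @ direction > midpoint).astype(int)
    return direction, predict

# tiny synthetic demo with made-up activations
rng = np.random.default_rng(0)
pos = rng.normal(1.0, 1.0, size=(100, 16))
neg = rng.normal(-1.0, 1.0, size=(100, 16))
direction, predict = mass_mean_probe(pos, neg)
print("accuracy:", (predict(pos).mean() + (1 - predict(neg)).mean()) / 2)
```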

787
01:28:29.690 --> 01:28:35.729
David Bau: But he did it in two stages. First, he says, well, I could probe anything.

788
01:28:36.380 --> 01:28:39.170
David Bau: But really, I'd rather probe

789
01:28:40.120 --> 01:28:46.009
David Bau: the places in the model where I think it's really paying attention to

790
01:28:46.290 --> 01:28:47.330
David Bau: this concept.

791
01:28:48.070 --> 01:28:50.040
David Bau: Like, what the previous…

792
01:28:50.200 --> 01:28:57.940
David Bau: You know, work has found is often at the end of the sentence or something like this, but you could do a causal

793
01:28:58.050 --> 01:29:00.929
David Bau: analysis, just to highlight:

794
01:29:01.180 --> 01:29:14.550
David Bau: these are the promising places to probe. And then you could train probes, right, to see what accuracies they get at different locations. And then once you have those probes, there are other things you can do. You can do…
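
A minimal sketch of that localization step, using plain forward hooks: run a matched pair of prompts that differ only in the concept, patch one hidden state at a time from the clean run into the corrupted run, and measure how much each patch moves the output. "gpt2" and the prompt pair are placeholders, and the two prompts must tokenize to the same length:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained(
    "gpt2", output_hidden_states=True).eval()

clean = tok("The general commands the troops.", return_tensors="pt")
corrupt = tok("The general watches the troops.", return_tensors="pt")

with torch.no_grad():
    clean_states = model(**clean).hidden_states  # (n_layers+1) x [1, T, d]
    base_logits = model(**corrupt).logits[0, -1]

def patched_logits(layer, pos):
    def hook(module, inputs, output):
        hidden = output[0].clone()
        hidden[:, pos] = clean_states[layer + 1][:, pos]  # swap in clean state
        return (hidden,) + output[1:]
    handle = model.transformer.h[layer].register_forward_hook(hook)
    try:
        with torch.no_grad():
            return model(**corrupt).logits[0, -1]
    finally:
        handle.remove()

# larger shifts suggest a location that carries the concept
for layer in range(model.config.n_layer):
    for pos in range(corrupt["input_ids"].shape[1]):
        shift = (patched_logits(layer, pos) - base_logits).norm().item()
        print(f"layer {layer:2d} pos {pos}: {shift:.3f}")
```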

795
01:29:14.770 --> 01:29:24.540
David Bau: causal interventions. So, the probes will tell you

796
01:29:24.850 --> 01:29:30.680
David Bau: which one of these classes the probe, at least, thinks the model is thinking about.

797
01:29:31.070 --> 01:29:36.269
David Bau: You can see if the probe agrees with your prompting strategy. You put a question after

798
01:29:36.440 --> 01:29:41.550
David Bau: that sentence, and you say, "what kind of power was it?" and see whether they agree.
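
A sketch of that agreement check; `probe`, `get_hidden_state`, and `ask_model` are hypothetical pieces (e.g. the trained classifier and helpers from the earlier sketches), and the question wording is a placeholder:

```python
QUESTION = " Question: Is power being exercised here? Answer yes or no:"

def agreement_rate(sentences, probe, get_hidden_state, ask_model):
    """Fraction of sentences where the probe's label matches the model's
    own answer when asked directly after the sentence."""
    agree = 0
    for s in sentences:
        probe_says = probe.predict([get_hidden_state(s)])[0]         # 0 or 1
        prompt_says = int("yes" in ask_model(s + QUESTION).lower())  # 0 or 1
        agree += int(probe_says == prompt_says)
    return agree / len(sentences)
```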

799
01:29:41.840 --> 01:29:46.260
David Bau: And, you know, a natural thing to do would be to ask:

800
01:29:46.400 --> 01:29:52.619
David Bau: is that really what it's looking at to make that assessment? Let's say the probe and the prompt agree.

801
01:29:53.130 --> 01:29:57.779
David Bau: You can go and force a different representation in at the probed location and see

802
01:29:57.950 --> 01:30:00.810
David Bau: whether that causes the prompt to behave a little differently.
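
One way this intervention could look in code, hedged as a sketch: nudge the hidden state at the probed layer along a probe's weight direction and see whether the prompted answer changes. The layer index, scale, and `direction` vector (e.g. a trained probe's weights) are assumptions to tune, not known-good values:

```python
import torch

LAYER, SCALE = 8, 5.0  # assumed values; sweep these in practice

def answer_with_steering(model, tok, prompt, direction):
    vec = torch.as_tensor(direction, dtype=torch.float32)

    def hook(module, inputs, output):
        hidden = output[0]
        # push the newest position along the probe direction on every step
        hidden[:, -1] += SCALE * vec / vec.norm()
        return (hidden,) + output[1:]

    handle = model.transformer.h[LAYER].register_forward_hook(hook)
    try:
        ids = tok(prompt, return_tensors="pt")
        with torch.no_grad():
            out = model.generate(**ids, max_new_tokens=5,
                                 pad_token_id=tok.eos_token_id)
        return tok.decode(out[0, ids["input_ids"].shape[1]:])
    finally:
        handle.remove()
```

Comparing this steered answer against the unsteered one is the "do the probe and the prompt disagree now?" test described above.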

803
01:30:01.010 --> 01:30:04.990
David Bau: You know, that would be… I feel like

804
01:30:05.210 --> 01:30:12.660
David Bau: that's sort of a way of bringing these things together and connecting it to, like, an internal thought.

805
01:30:12.770 --> 01:30:17.559
David Bau: I think eventually you're going to have this question of, like, why do we care

806
01:30:18.220 --> 01:30:20.640
David Bau: about what the model's thinking inside,

807
01:30:20.870 --> 01:30:22.899
David Bau: as opposed to what it's saying.

808
01:30:23.260 --> 01:30:24.120
David Bau: Right?

809
01:30:24.240 --> 01:30:28.829
David Bau: Are there situations… like, so, you know,

810
01:30:29.920 --> 01:30:31.939
David Bau: if you could just ask a model

811
01:30:32.090 --> 01:30:34.439
David Bau: what it's thinking, then why do you need this?

812
01:30:35.010 --> 01:30:38.080
David Bau: You know, why do you need to probe around on the inside?

813
01:30:38.380 --> 01:30:57.569
David Bau: And so I think, you know, these are my suggestions on things you can do. These are just mechanics, things you can do to try to pin down the thing. But I think you also have a larger question, which is: okay, let's say this all works, and we can probe it out, we have positive effects, and we know

814
01:30:57.720 --> 01:31:02.600
David Bau: where the concept of power is. You know, what do we…

815
01:31:03.800 --> 01:31:10.710
David Bau: you know, what do we get out of that? Like, why do we care? I can imagine there's a couple sorts of things you could get out.

816
01:31:10.930 --> 01:31:11.850
David Bau: of…

817
01:31:12.510 --> 01:31:18.770
David Bau: But do you guys have a sense for why you want to look inside the model at power?

818
01:31:19.270 --> 01:31:30.789
David Bau: So, going back to the original research question we started with, one thing was that we wanted to try to identify certain triggers in the model, or at the very least, see whether it was possible…

819
01:31:30.850 --> 01:31:49.330
David Bau: you know, whether there were certain noises or signals you could… like, look at the words that caused this inference. Yeah, or even, in particular models, whether it was possible to affect the signal in some predictable way, to have it, you know, change its decision-making.

820
01:31:49.330 --> 01:32:06.140
David Bau: Yeah. Oh, I really like that, because I'm in this prompting group, and they're like, oh, if you don't capitalize things, or you have a typo, it gives you much worse results, stuff like that, or if you use slang. And it feels like what your question is asking is, why does that happen? Like, was there some…

821
01:32:06.420 --> 01:32:19.480
David Bau: circuitry in there that's making that happen… I feel like that'd be really cool to know, like, if the model is treating me differently because of some inference like that, maybe because I only typed in lowercase or something.

822
01:32:19.760 --> 01:32:35.519
David Bau: Interesting. I think what could be interesting, for instance, and maybe I'm wrong, is that for some of these examples we've extracted entities and verbs. If we do that…

823
01:32:35.880 --> 01:32:49.920
David Bau: and then use those as labels, it might be interesting if there's an experiment we can do where we can reliably have it reverse decisions it makes, based on patching things from different prompts in a controlled way. I don't know, though. That'd be…

824
01:32:50.690 --> 01:32:54.239
David Bau: Yeah, just to build on that a little bit, like, I think,

825
01:32:56.340 --> 01:33:03.920
David Bau: Another way of thinking about this is, like, it's not just about power, but it's about, like, how models think about social relations more generally, and this is a very, kind of, particular

826
01:33:03.920 --> 01:33:22.710
David Bau: conception of social relations, but maybe there's this kind of template for thinking about other types of social relations as well, so there might be a broader interest there. I like that a lot. So you're taking a very broad view right now, which is, like…

827
01:33:22.710 --> 01:33:25.099
David Bau: which leads to this thing, like, okay,

828
01:33:25.250 --> 01:33:32.519
David Bau: we're gonna catalog all the parts, we're gonna get 224 things, and that would just lead to this natural thing of a 224-way classifier

829
01:33:32.700 --> 01:33:39.689
David Bau: that you just go and apply. And then the natural experiment would be, like, oh, if we have some piece of text that has to do with making ice cream,

830
01:33:39.790 --> 01:33:41.100
David Bau: I wonder if you…

831
01:33:41.160 --> 01:34:00.720
David Bau: if you zap that classifier at it, whether it says, oh, actually, this ice cream paragraph is secretly about, like, nuclear weapons, right? You're like, that's really what the model is thinking about. So I think that's the direction this data set would lead to, but there is this opposite direction,

832
01:34:01.610 --> 01:34:05.140
David Bau: right, which is really narrowing it, to say, hey,

833
01:34:05.720 --> 01:34:09.440
David Bau: if you just have sentences with two people, like your other data set,

834
01:34:09.790 --> 01:34:14.020
David Bau: and there's a relationship between them, like, how is that…

835
01:34:14.520 --> 01:34:33.929
David Bau: How is that social relationship between the two people represented? There, it's a binary classifier; we're just changing the different conditions. It's always binary, but you have, like, 224 different situations. So you could actually do the other thing, too. You could also make a 224-way

836
01:34:33.930 --> 01:34:39.750
David Bau: classifier on this large data set, if you wanted to. It's the same data; you could just label it another way.
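
A toy sketch of that relabeling idea, with the per-class accuracy breakdown that comes up next. Random features and 4 classes stand in for the hidden states and the ~224 situation types:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Train a multi-way probe on the situation-type label instead of the
# binary power label, then look at accuracy class by class.
rng = np.random.default_rng(0)
y = rng.integers(0, 4, size=400)                # stand-in situation types
X = rng.normal(size=(400, 32)) + y[:, None]     # class-dependent mean shift

probe = LogisticRegression(max_iter=2000).fit(X[:300], y[:300])
pred = probe.predict(X[300:])
for c in np.unique(y[300:]):
    mask = y[300:] == c
    acc = np.mean(pred[mask] == c)
    print(f"class {c}: accuracy {acc:.2f} (n={mask.sum()})")
```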

837
01:34:39.930 --> 01:34:43.489
David Bau: And it's actually a natural question, when you ask,

838
01:34:43.770 --> 01:34:49.129
David Bau: how do models think about power? You've asked this several times when you present.

839
01:34:50.600 --> 01:34:56.490
David Bau: You're preparing this data, and you could use this approach to see if any of these, you know, works better, or…

840
01:34:56.600 --> 01:35:02.509
David Bau: you know, is misclassified, or whether it has similar accuracy over all these classes, or…

841
01:35:02.770 --> 01:35:22.839
David Bau: whether the accuracy is higher on some of them, or not. Yeah. I think in class you mentioned as well: even… let's say there's an interesting interaction, but it's reliably in one sector. That's still something, as you said, right? What do you mean? I believe you said, for instance, let's say the model gives uninteresting results for…

842
01:35:22.850 --> 01:35:32.820
David Bau: all but one category of the data, but it's reliably interesting in one or two areas. Oh yeah, then that could still be interesting. Yeah, absolutely. Right.

843
01:35:32.850 --> 01:35:34.720
David Bau: If you're like, oh, the model, like,

844
01:35:35.230 --> 01:35:42.120
David Bau: it doesn't have a very clear representation of most of these things, at least when we're probing this way, but the representation is clear for this one.

845
01:35:42.450 --> 01:35:53.779
David Bau: Given the state we're at today, where we barely know anything about how the models work inside, even nailing down a part of it is pretty interesting.

846
01:35:54.620 --> 01:35:55.350
David Bau: Huh.

847
01:35:56.270 --> 01:36:04.290
David Bau: I think about the sentences that are unrelated, where it still says there is a power relation, which…

848
01:36:05.380 --> 01:36:09.960
David Bau: I've always been fascinated by that. I think this is so… when I showed you the,

849
01:36:10.080 --> 01:36:14.850
David Bau: the paper from Yudi Chen Wandberg's group, where they're like, hey, here's…

850
01:36:15.040 --> 01:36:23.620
David Bau: here's, like, a bunch of probes for: are you rich or poor, are you educated or not, are you male or female?

851
01:36:23.740 --> 01:36:26.650
David Bau: The fun thing about that

852
01:36:26.760 --> 01:36:31.450
David Bau: is that you can just put in other random sentences. You can say, you know,

853
01:36:31.610 --> 01:36:43.350
David Bau: "I was reading the Wall Street Journal the other day," and it says, oh, you're definitely male, or something like that, right? Well, that's a bias, right? You know what I mean?
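
A sketch of that bias check: run an attribute probe over nominally neutral sentences and measure how skewed its predictions are. `probe` (with `predict_proba`) and `get_hidden_state` are hypothetical pieces from the earlier sketches, and the sentences are toy placeholders:

```python
import numpy as np

NEUTRAL = [
    "I was reading the newspaper the other day.",
    "We took the bus downtown this morning.",
    "The weather has been strange lately.",
]

def prediction_skew(probe, get_hidden_state, sentences=NEUTRAL):
    """Mean P(class 1) on nominally neutral text; values far from 0.5
    suggest the probe is reading a prior (a bias) rather than evidence."""
    X = np.stack([get_hidden_state(s) for s in sentences])
    p = probe.predict_proba(X)[:, 1]
    print(f"mean P(class 1) on neutral text: {p.mean():.2f}")
    return p
```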

854
01:36:43.700 --> 01:36:50.630
David Bau: And something that's kind of interesting: you know, if you have a classifier like this,

855
01:36:50.910 --> 01:36:54.399
David Bau: now it gives you another tool

856
01:36:54.530 --> 01:37:01.480
David Bau: you can use. But now, of course, it's not a full paper by itself. It's just a tool.

857
01:37:01.810 --> 01:37:06.990
David Bau: Because if you tried to make a full paper out of it, people would just say,

858
01:37:07.240 --> 01:37:11.399
David Bau: well, why don't you just go to the language model and just ask it,

859
01:37:11.500 --> 01:37:15.920
David Bau: "do you think I'm a boy or a girl?", instead of doing it the way that you're doing it?

860
01:37:16.050 --> 01:37:20.299
David Bau: And then, when it tells you, why don't you trust it?

861
01:37:20.520 --> 01:37:31.459
David Bau: Right, so there are a couple of situations. You could be in a situation where, for some reason, you don't want to trust the model because you think it's lying to you. Or it could be a situation where the point is not

862
01:37:31.550 --> 01:37:49.850
David Bau: to just read the signal out, but to see where the signal's coming from, or something like that, right? Or the point could be: you have 200 different types of signals, and you're trying to tell which ones are more faithfully represented in the model and which ones are not. Sort of, your proposal is that the models

863
01:37:50.020 --> 01:37:53.380
David Bau: know about this one and not that one, and you can see evidence for what they know about

864
01:37:53.630 --> 01:37:57.720
David Bau: from the way they represent it, or something like that. Like, there are a few different…

865
01:37:58.030 --> 01:37:59.530
David Bau: You know, takes that you could…

866
01:37:59.750 --> 01:38:02.109
David Bau: have, like, on why one would want to do this.

867
01:38:02.330 --> 01:38:10.859
David Bau: But so, you're getting there, right? You're looking around at enough things now that maybe you should have a stronger opinion

868
01:38:11.470 --> 01:38:12.240
David Bau: Great.

869
01:38:12.610 --> 01:38:14.799
David Bau: about which of those questions to go after.

870
01:38:15.630 --> 01:38:16.360
David Bau: Thank you.

871
01:38:16.820 --> 01:38:20.459
David Bau: Anyway, that's my comment, but I think that it looks good.

872
01:38:21.930 --> 01:38:22.640
David Bau: Nope.

873
01:38:25.100 --> 01:38:26.570
David Bau: Any other suggestions?

874
01:38:27.270 --> 01:38:28.110
David Bau: Alright.

875
01:38:28.340 --> 01:38:29.450
David Bau: Oh, it's like…

876
01:38:30.300 --> 01:38:40.260
David Bau: suggestions. I think it's also… so currently, it's kind of a static situation, where it's basically one subject, verb, and object, so you have

877
01:38:40.290 --> 01:38:57.239
David Bau: the subject asserting power over the object, or not. You mean in that form? Yeah, in that form, right? I think most of them are like that, yeah. So can you make this into a dynamic situation where, say, over a number of turns, it's

878
01:38:57.280 --> 01:39:02.210
David Bau: modulating? So, does the model assign power just to, say,

879
01:39:02.650 --> 01:39:16.940
David Bau: Alice? In the case of Alice and Bob, does it always assign power to Alice, irrespective of the situation or the job? We did mess around with that, didn't we, where we were switching the subject and the object, and then we were asking who had power, and we were doing the patching?

880
01:39:17.040 --> 01:39:18.889
David Bau: Did we do an experiment like that?

881
01:39:19.050 --> 01:39:29.250
David Bau: Yeah, I don't think we had it evolve over time, though. A "they turned the tables" kind of thing? Yeah, it's a good movie plot.

882
01:39:30.030 --> 01:39:39.190
David Bau: Okay. Interesting. Yeah. So it would be like having a sequence, then asking: at this point, who has power in this situation? Then the sequence runs a little bit longer: who has power at this point? Yeah.
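
A sketch of that dynamic protocol: truncate a short narrative at several points and ask who has power at each checkpoint. The story text and `ask_model` are hypothetical placeholders:

```python
STORY = [
    "Alice hires Bob as her assistant.",
    "Bob quietly documents Alice's mistakes.",
    "The board reviews Bob's report.",
    "The board fires Alice and promotes Bob.",
]

def power_trajectory(ask_model):
    """Truncate the narrative at each sentence and ask who has power."""
    for i in range(1, len(STORY) + 1):
        context = " ".join(STORY[:i])
        answer = ask_model(context + " At this point, who has power, "
                           "Alice or Bob? Answer with one name:")
        print(f"after sentence {i}: {answer.strip()}")
```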

883
01:39:39.210 --> 01:39:49.549
David Bau: Yeah, and I don't know if this is helpful, but there's a paper… I don't even know if it's a paper, but I saw a presentation by this person, and they basically gave two country names, and they asked,

884
01:39:49.560 --> 01:39:59.620
David Bau: like, would you save someone from this country, or would you save someone from that country? And they swapped the countries around; they didn't keep the order fixed. For example, would you save an Indian, or would you save an American?

885
01:39:59.710 --> 01:40:04.769
David Bau: And surprisingly, it always saved, like, the Global South over the Global North.

886
01:40:05.340 --> 01:40:09.660
David Bau: But they also switched up other attributes, so they could see whether

887
01:40:09.920 --> 01:40:13.240
David Bau: the model always stuck to the identity.

888
01:40:13.780 --> 01:40:15.269
David Bau: How would you interpret that?
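
For what it's worth, the swap protocol described could be sketched as a counterbalanced forced choice, presenting every ordered pair so that position effects can be separated from a real preference. The country list and `ask_model` are placeholders:

```python
import itertools

COUNTRIES = ["India", "America", "Nigeria", "Germany"]

def forced_choice_matrix(ask_model):
    wins = {c: 0 for c in COUNTRIES}
    for a, b in itertools.permutations(COUNTRIES, 2):  # both orders appear
        q = (f"If you could only save one person, would you save someone "
             f"from {a} or someone from {b}? Answer with one country:")
        choice = ask_model(q).lower()
        for c in (a, b):
            if c.lower() in choice:
                wins[c] += 1
                break
    print(wins)
```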

889
01:40:16.010 --> 01:40:33.360
David Bau: We're done? Okay, we're done. So we can get through about half of the teams each week. Should we keep that pace? Do you want to do half the teams one week and half the teams the other week, or do you want to just do every team every week? How do you guys feel about that?

890
01:40:33.740 --> 01:40:44.089
David Bau: I think it would be better for each team to do it twice a month, because if you're doing it every week… Do you want to scale back to that?

891
01:40:44.340 --> 01:40:48.050
David Bau: Like, maybe… Sean, what do you guys think? I don't know.

892
01:40:48.750 --> 01:41:01.129
David Bau: Yeah, I think it might be good, especially since a lot of us are moving towards… You're getting more into it. You're not just proposing, like, "oh, I don't know what we're doing"; you're doing stuff, right?

893
01:41:01.320 --> 01:41:05.410
David Bau: So, why don't we do that? So, there's a few teams that…

894
01:41:05.710 --> 01:41:07.549
David Bau: So I, I didn't write down…

895
01:41:08.230 --> 01:41:23.540
David Bau: Nikhil, can I ask you to write down which teams presented this week? So, which teams presented? The first three. The first three, which were… TK, Economics, and Power. And we had Finance yesterday.

896
01:41:24.460 --> 01:41:33.669
David Bau: So that's, in essence, half, okay? That's fine. Yeah, so why don't we do this? Instead of…

897
01:41:34.100 --> 01:41:41.160
David Bau: having you guys present on Tuesday, the next three groups will just present on Thursday

898
01:41:41.330 --> 01:41:44.780
David Bau: Next week. Does that make sense? And then we'll go into this…

899
01:41:45.100 --> 01:41:50.770
David Bau: And we can be a little flexible, obviously, so nobody has to do an extra presentation.

900
01:41:53.810 --> 01:41:56.100
David Bau: Okay. Okay.

901
01:41:56.310 --> 01:42:06.260
David Bau: Thanks. Do remember to sign up for the compute resource. If you get any issues, please let me know.

902
01:42:07.000 --> 01:42:08.900
David Bau: It leads me to talk about the news.

903
01:42:34.670 --> 01:42:48.160
David Bau: Oh, listen… Let me double-check it with David. If you're working with an E for C…

904
01:42:49.050 --> 01:43:21.579
David Bau: I think so. I think so. And so… My Instagram has one, but…

905
01:43:21.700 --> 01:43:30.999
David Bau: I don't have a strategy, if you don't mind that, too.

906
01:43:31.000 --> 01:43:38.909
David Bau: You're gonna take action, so we're going to have…

907
01:43:38.910 --> 01:44:02.299
David Bau: Yeah, that's alright, like, I just followed that on me. Yeah, yeah.

908
01:44:02.380 --> 01:44:20.900
David Bau: So basically, I'm working on that part more right now. Maybe I can, like, try to parse out some of it? Asking for, like, site paramise.

909
01:44:20.900 --> 01:44:44.649
David Bau: Because, like, I think one of the things that say you're supposed to that, I think is… Seriously, I see. We can avoid the predicated prompts. So it's like, let's say a prompt is, like, one of our sentences, right? What?

910
01:44:44.650 --> 01:45:07.219
David Bau: Okay, okay. Yeah. Wait, wait, but, like, yeah, let me get that taken care of. Also, do you need keys or anything? I have, like, OpenRouter… Oh, yeah, I need to think about that a little bit more.

911
01:45:07.220 --> 01:45:29.890
David Bau: Oh my gosh, yeah, I'm so sorry, I think it… I think I did actually, but… Your budget is just really high for some of these, so we can do that particularly. I use that for a Q&A.

912
01:45:29.890 --> 01:45:34.040
David Bau: What we can do, I think…

913
01:45:34.100 --> 01:45:49.819
David Bau: It's power, it's classification accuracy. Thanks for doing all the setup for the meetings and everything.

914
01:45:50.070 --> 01:45:54.319
David Bau: Yeah, I think I had seen that, and I thought that it expired.

915
01:45:54.320 --> 01:46:13.419
David Bau: I think what could be interesting is if we do do, if we do the road, but it's just… I think this is not, like, it's just…

916
01:46:13.420 --> 01:46:22.520
David Bau: Again, if we just kind of take a shot. Or not hunger! Not sleep, I see if there are different conditions.

917
01:46:23.420 --> 01:46:44.499
David Bau: Because at that point… I don't know what it is.

