WEBVTT

1
00:00:04.100 --> 00:00:04.810
David Bau: Thanks.

2
00:00:11.180 --> 00:00:14.420
David Bau: Then, we'll try.

3
00:00:15.140 --> 00:00:16.810
David Bau: Democrat's presentation.

4
00:00:20.320 --> 00:00:21.119
David Bau: Thank you.

5
00:00:22.720 --> 00:00:33.329
David Bau: No, I think it's good that we have a unit tag. So now we just have to figure out if we received it.

6
00:00:33.980 --> 00:00:39.170
David Bau: Same, right?

7
00:00:39.290 --> 00:00:40.710
David Bau: That'll be instant.

8
00:00:45.280 --> 00:00:46.210
David Bau: Probably not.

9
00:01:23.870 --> 00:01:24.670
David Bau: Pleasure.

10
00:01:26.250 --> 00:01:27.130
David Bau: Anybody?

11
00:01:28.690 --> 00:01:37.890
David Bau: I don't know… I think that there's a lot of students, I guess.

12
00:01:39.490 --> 00:01:57.330
David Bau: This corner is the worst corner, Patrick, right? So I can… I couldn't be the one that's sitting here. It's okay, I don't mind sitting here, it's all right.

13
00:02:01.320 --> 00:02:02.470
David Bau: That's a pleasure.

14
00:02:03.110 --> 00:02:04.810
David Bau: So we can go back and save.

15
00:02:05.590 --> 00:02:05.975
David Bau: Lucky

16
00:02:09.660 --> 00:02:10.389
David Bau: Bye.

17
00:02:16.850 --> 00:02:17.770
David Bau: That'll work.

18
00:02:19.660 --> 00:02:30.369
David Bau: It's still, yep, it's still my room. Yep, they can use the same one. Yep, I'm here.

19
00:02:35.380 --> 00:02:40.999
David Bau: So, yeah, so I just, I always use the same for everything.

20
00:02:43.120 --> 00:02:44.620
David Bau: I haven't had a problem with the game.

21
00:02:46.510 --> 00:02:49.290
David Bau: Do you know, some parallel? What?

22
00:02:49.580 --> 00:02:54.020
David Bau: But then I'll give somebody else's Zoom, yeah, tip of the meeting.

23
00:02:54.440 --> 00:02:56.340
David Bau: That's one of them appears to me.

24
00:02:56.560 --> 00:03:00.590
David Bau: But then some… the real problem is when I'm hosting a meeting.

25
00:03:01.050 --> 00:03:14.630
David Bau: If I have some other meeting on the same line, then that's a problem. I don't have a way of transferring, like, 50 people over to a different one, I just, you know, like, that's a… that's an issue.

26
00:03:15.080 --> 00:03:23.079
David Bau: So that's the only problem that I have. And I have to tell people that I have to move… at the last minute, I have to move islands.

27
00:03:24.300 --> 00:03:28.559
David Bau: Right? That's… It only happens last one up.

28
00:03:32.040 --> 00:03:35.569
David Bau: I usually try to avoid putting meetings right after I send them.

29
00:03:41.690 --> 00:03:42.490
David Bau: Right?

30
00:03:44.750 --> 00:03:46.689
David Bau: So, I said, I went okay.

31
00:03:48.000 --> 00:03:49.520
David Bau: You proud of your paper?

32
00:03:52.290 --> 00:03:53.910
David Bau: Anybody else for ICML?

33
00:03:55.840 --> 00:03:57.109
David Bau: There we have a few.

34
00:03:57.390 --> 00:03:58.850
David Bau: We had a few in the lab, right?

35
00:04:00.070 --> 00:04:01.070
David Bau: Thank you, man.

36
00:04:01.760 --> 00:04:04.229
David Bau: When he started panicking about getting his paper ready?

37
00:04:05.100 --> 00:04:10.399
David Bau: And, I saw Karam working really hard, I don't know if he ended up slipping through.

38
00:04:11.040 --> 00:04:19.899
David Bau: I don't know if you're following that at all. We… you're not in Crim's paper, right? No. Why would it… I guess it was cold. Yeah, instead of following, yeah, he has to decide if he can fall under the…

39
00:04:20.440 --> 00:04:24.229
David Bau: So, the reviews will be out of the word and call it himself.

40
00:04:24.340 --> 00:04:28.690
David Bau: Oh, the first round of reviewers will be out, so maybe it's worth it.

41
00:04:29.310 --> 00:04:31.320
David Bau: I thought we didn't do. I don't know.

42
00:04:32.480 --> 00:04:36.999
David Bau: I think Colin would be a better… better dreams than him, but it's okay, try adding that.

43
00:04:40.430 --> 00:04:41.330
David Bau: you know.

44
00:04:42.300 --> 00:04:46.740
David Bau: If he was my PhD student, I'd tell him, relax! Just go for column.

45
00:04:47.010 --> 00:04:53.400
David Bau: But he's a master's student, so he's really worried about, like, making his resume look good.

46
00:04:53.670 --> 00:04:57.449
David Bau: Like, every venue, he's gotta try every one. So, do it.

47
00:05:07.310 --> 00:05:13.259
David Bau: But I'm really happy with Kevin's paper, with the…

48
00:05:13.820 --> 00:05:15.560
David Bau: Turns to see another day around.

49
00:05:15.690 --> 00:05:27.500
David Bau: So… I have to… forgive me.

50
00:05:27.600 --> 00:05:35.610
David Bau: So, I haven't even… Damn.

51
00:05:35.920 --> 00:05:38.750
David Bau: How many minutes we've had with him?

52
00:05:38.970 --> 00:05:42.609
David Bau: Are we supposed to have started? We're supposed to have started.

53
00:05:46.290 --> 00:05:50.420
David Bau: Okay, so whose… whose team slides are these?

54
00:05:51.110 --> 00:05:52.169
David Bau: Team S.

55
00:05:52.800 --> 00:05:55.389
David Bau: All right, anybody else on here on Team S?

56
00:05:55.580 --> 00:05:57.099
David Bau: Are you guys ready to present?

57
00:05:57.250 --> 00:06:00.990
David Bau: I'm not sure what they want to stay, beyond.

58
00:06:01.330 --> 00:06:07.299
David Bau: If you guys are not prepared to turn that because you're missing the rest of your team, we can reshuffle.

59
00:06:07.500 --> 00:06:12.470
David Bau: The other thing, let's see, when you… let me click on the next admin.

60
00:06:13.710 --> 00:06:15.909
David Bau: So, if your team is not here yet?

61
00:06:16.650 --> 00:06:18.000
David Bau: That's okay.

62
00:06:21.180 --> 00:06:22.879
David Bau: You're in some other random order.

63
00:06:24.370 --> 00:06:25.350
David Bau: Who's this?

64
00:06:26.270 --> 00:06:30.329
David Bau: Team excellent. Is team excellent here? I'm still waiting for it.

65
00:06:30.660 --> 00:06:32.130
David Bau: Oh, okay.

66
00:06:33.580 --> 00:06:34.660
David Bau: Holy cow.

67
00:06:35.770 --> 00:06:39.080
David Bau: Is it? Okay.

68
00:06:39.570 --> 00:06:40.569
David Bau: Who is this?

69
00:06:42.210 --> 00:06:47.449
David Bau: I don't see Jasmine. Yeah, I don't… Jasmine's done this… Oh, Jasmine, are you guys ready to present?

70
00:06:49.590 --> 00:06:51.480
Jasmine Cui: Maybe not.

71
00:06:51.490 --> 00:06:59.869
David Bau: Maybe not? What's going on? Okay, is there a team that's ready to present? How about we start from there? Yes! Better, better, better proposal.

72
00:07:00.740 --> 00:07:03.600
David Bau: Team… team bias, team power.

73
00:07:03.840 --> 00:07:08.590
David Bau: Okay, all right, so find your slides.

74
00:07:08.800 --> 00:07:15.559
David Bau: And present, and okay, so now what I'm gonna do this time, go ahead, is I'm gonna… I'm gonna put up a 9-minute timer.

75
00:07:15.890 --> 00:07:23.890
David Bau: for everybody. And so, so, I didn't announce this in advance, I apologize,

76
00:07:24.010 --> 00:07:39.189
David Bau: But I'm just gonna set a buzzer off at 9 minutes, and we'll say thank you very much, and we'll have a few minutes of discussion, and then so on. So I'm just gonna… I'm just gonna truncate you, so you guys allocate your time any way you want. It's not necessary that everybody speaks, it's okay, but 9 minutes.

77
00:07:39.280 --> 00:07:55.359
David Bau: Okay. Yeah, that's okay. So we don't have a slide deck, we just, like, have a, Google Sheet, with some, ideas about, different data, sets that we're composing. So we're just gonna talk through a couple of the different things.

78
00:07:55.360 --> 00:08:12.630
David Bau: that we're dealing with, and, like, some of the questions that are coming up. So the first one that we've been… so, routine power, we're interested in how, models are kind of internally representing power, and then what you can do if you interpie on those representations. So, one thing that we were toying around with, was,

79
00:08:12.670 --> 00:08:18.160
David Bau: style sentences, so if you take a bunch of sentences, can you make the font really big on it so that we can see some of the sentences?

80
00:08:19.090 --> 00:08:22.169
David Bau: Very good. I just, yeah, oh yeah.

81
00:08:22.220 --> 00:08:33.710
David Bau: So, if, you have some sort of sentence that this was asking Cloud to join a series of sentences at the end of the word power,

82
00:08:33.710 --> 00:08:42.990
David Bau: Remove the word power, feed it into a different model, and, look at how, the model activates.

83
00:08:43.169 --> 00:08:57.609
David Bau: the, sentences that we've been kind of playing around with, do activate power in different models, so it will produce power. And so that was kind of, like, one starting place that we've been,

84
00:08:57.770 --> 00:09:07.299
David Bau: playing around with. There are, like, some, I don't know, basic issues with this kind of setup that we've been struggling with, too, so Ari that you want to talk a little bit about,

85
00:09:07.480 --> 00:09:09.130
David Bau: So,

86
00:09:09.420 --> 00:09:27.129
David Bau: One of the, like, things that may make the word power highly likely in the end is, like, the use of some words before is, like, exercise. If you search exercise off with no context on Google, it made power.

87
00:09:27.520 --> 00:09:33.599
David Bau: Or maybe different combinations, like, influence,

88
00:09:33.930 --> 00:09:43.340
David Bau: or other words that are somehow related to power. So, we also have a data set of, sentences with

89
00:09:43.840 --> 00:09:45.760
David Bau: None of those Sinanese.

90
00:09:45.870 --> 00:09:50.400
David Bau: We haven't tested this, but… Maybe we can't sleep.

91
00:09:50.960 --> 00:09:52.449
David Bau: Yeah, that's start somewhere.

92
00:09:52.610 --> 00:09:53.400
David Bau: Yeah.

93
00:09:56.670 --> 00:10:03.760
David Bau: There is still some words that may be related to power, like decision, but at least

94
00:10:04.280 --> 00:10:11.339
David Bau: We can see… we can feel there is some… some sort of power here. For example, nobody left until he…

95
00:10:11.580 --> 00:10:18.259
David Bau: Well, Ben screamed off for coming post. They waited 3 hours before he arrived. It seems someone was…

96
00:10:19.040 --> 00:10:25.590
David Bau: Okay, so I'm both. But does the LM have, that…

97
00:10:25.840 --> 00:10:29.079
David Bau: understanding of how our ads can do or not.

98
00:10:29.250 --> 00:10:31.270
David Bau: Maybe one question. Oh, nice.

99
00:10:31.610 --> 00:10:33.180
David Bau: And he decided.

100
00:10:33.470 --> 00:10:36.569
David Bau: The meeting will meet when she is down, I'm going, wow.

101
00:10:36.820 --> 00:10:38.280
David Bau: That's a power move.

102
00:10:39.680 --> 00:10:43.650
David Bau: That's good.

103
00:10:44.160 --> 00:10:47.289
David Bau: Well, that's interesting. How did you generate these sentences? So cool.

104
00:10:47.880 --> 00:10:50.770
David Bau: Oh, just with that thump. Just with that thump.

105
00:10:56.000 --> 00:10:58.470
David Bau: My favorite thing on spreadsheets is wrap mode.

106
00:10:59.190 --> 00:11:00.610
David Bau: Put a wrap on it, yeah.

107
00:11:04.460 --> 00:11:05.670
David Bau: I agree.

108
00:11:10.120 --> 00:11:10.990
David Bau: I bet.

109
00:11:19.490 --> 00:11:32.480
David Bau: So then the third thing that we were, playing around with was, non-LOM-generated datasets, and whether or not there's some sort of more, kind of, like, bespoke.

110
00:11:32.530 --> 00:11:43.909
David Bau: sentences that we could work with. So, last time we were talking about, different ways of, kind of, scoring power relationships within sentences, and what we could do with that.

111
00:11:43.910 --> 00:11:54.800
David Bau: So we, started pulling just some recent news articles, and seeing if we could use that to generate sentences that have power relationships between individuals, and come up with a…

112
00:11:56.540 --> 00:12:07.679
David Bau: close to ground truth score, or some sort of external way of measuring power in those relationships, and then, asking the L, and then to generate the score. So, how can you talk a little bit about that.

113
00:12:08.470 --> 00:12:10.560
David Bau: If I can pull the clock here.

114
00:12:12.750 --> 00:12:15.269
David Bau: Oh, you're on the planet. I mean, one pot.

115
00:12:19.800 --> 00:12:22.450
David Bau: Yeah, so this is, kind of…

116
00:12:23.690 --> 00:12:28.790
David Bau: A tool for scoring the power of different regions and sentences.

117
00:12:28.930 --> 00:12:32.309
David Bau: It's called Riveter. I briefly mentioned it last time.

118
00:12:32.450 --> 00:12:36.690
David Bau: and just as kind of a quick test.

119
00:12:36.810 --> 00:12:41.929
David Bau: We, like, manually just copied over 10 recent AP news articles.

120
00:12:42.030 --> 00:12:44.910
David Bau: And we ran this tool, and for, like, the…

121
00:12:45.160 --> 00:12:48.729
David Bau: Most referenced entities, we were able to quantify

122
00:12:48.890 --> 00:13:05.500
David Bau: a power score, which is not that surprising. And so you've got high bar and low bar, what do those mean? I can't really… Oh yeah, it's a bit small. So high bar means that entity has, like, more power, and low bar means it has less.

123
00:13:06.260 --> 00:13:11.299
David Bau: So, the recent news articles are about Professor Silvon.

124
00:13:11.540 --> 00:13:18.719
David Bau: So, like, authorities have a high bar score, protest, and trainings have lower.

125
00:13:19.280 --> 00:13:27.229
David Bau: And we also ran this on just a thousand APUs articles from 2019, got kind of a broader range of things.

126
00:13:27.390 --> 00:13:36.300
David Bau: The text is probably too small here, but there are some common themes, like authorities, police, rating high on the score, Republicans.

127
00:13:38.150 --> 00:13:54.779
David Bau: things like that. And… Yeah, so this tool is using a lexicon of verbs labeled by crowd workers, for whether they, like, increase or decrease the power of the subject, and the…

128
00:13:56.180 --> 00:14:00.269
David Bau: It's based on an analysis of the whole article, is that right? Yeah. So…

129
00:14:02.490 --> 00:14:05.960
David Bau: It's kind of just… like, it's not using a…

130
00:14:06.960 --> 00:14:10.020
David Bau: LLM to do the scoring, it's kind of based on that, like.

131
00:14:13.180 --> 00:14:15.660
David Bau: But it is doing things like co-reference for speech.

132
00:14:17.540 --> 00:14:20.809
David Bau: So, I guess the idea here is that this is our own truth.

133
00:14:21.160 --> 00:14:24.910
David Bau: for future buildings. If we're trying to get

134
00:14:25.520 --> 00:14:36.749
David Bau: like, an idea of which neurons or which circuits are responsible for power, but we don't exactly know what we're going to be prompting the model to do, if it's, like, maybe

135
00:14:37.070 --> 00:14:45.639
David Bau: higher, lowest power entities, and… Have people tried triangulating Riveter with LLMs?

136
00:14:45.950 --> 00:14:49.160
David Bau: you know, assessment, like…

137
00:14:49.490 --> 00:15:03.910
David Bau: whether an LLM thinks that this is, you know, more power or less power, and if it agrees with what Riverter would say, or even you could give a few examples of River and say, you know, how would you generalize it and see if the general publication agrees with how River would generalize it?

138
00:15:04.560 --> 00:15:08.050
David Bau: a lot of examples. I feel like there's a… it's an interesting…

139
00:15:08.270 --> 00:15:20.269
David Bau: I've been there. You know, it's likely that MLMs understand how to read text better than earlier guys, but the real question is, like, oh, does it have… can you calibrate it to the same sense of power?

140
00:15:21.580 --> 00:15:30.910
David Bau: But it might be… it might be a useful tool if you want to generate more text, or… or something like that, and you could use a bunch of river scored examples.

141
00:15:31.200 --> 00:15:34.589
David Bau: You know, once you have a sense for the population.

142
00:15:35.510 --> 00:15:37.460
David Bau: Yeah, I think anyone has done that.

143
00:15:37.650 --> 00:15:39.900
David Bau: There's… Cool.

144
00:15:40.890 --> 00:15:46.479
David Bau: Do you get, like, scores for particular sentences, or just averaged across content?

145
00:15:48.340 --> 00:15:52.510
David Bau: That's… that's entity per article, right? Yeah, I think it's for…

146
00:15:52.910 --> 00:15:57.119
David Bau: Entities, you can get it for individual sentences,

147
00:15:58.420 --> 00:16:06.719
David Bau: I guess I'd just be curious, because, like, I see, like, China is really low power, but I would imagine in certain contexts, it's a lot of power, and so I, like… Right.

148
00:16:08.730 --> 00:16:16.329
David Bau: So you just averaged over a thousand articles. And these are pretty rated, but they're from 2019, so there's COVID stuff.

149
00:16:17.710 --> 00:16:18.520
David Bau: Interesting.

150
00:16:18.650 --> 00:16:21.759
David Bau: Interesting. Oh, yeah, where'd you get? So where'd you get those articles from?

151
00:16:21.930 --> 00:16:27.750
David Bau: Just the ID-based dataset, and there's, like, a big scrape of resources.

152
00:16:29.300 --> 00:16:31.290
David Bau: Nice. Interesting.

153
00:16:34.820 --> 00:16:35.710
David Bau: Wow.

154
00:16:35.910 --> 00:16:39.630
David Bau: That's your 9 minutes. Any other last things in the last 10 seconds?

155
00:16:40.840 --> 00:16:43.030
David Bau: Any comments or suggestions from people?

156
00:16:43.420 --> 00:16:44.450
David Bau: Discuss.

157
00:16:45.990 --> 00:16:49.409
David Bau: So they have a whole bunch of data here that they've played with, and

158
00:16:51.490 --> 00:16:57.119
David Bau: So, so you, you suggested, oh, they could be used for making a pro.

159
00:16:57.510 --> 00:17:03.489
David Bau: What's this leading to for, like, a… do you have a sense for what kind of research question you might want to…

160
00:17:03.690 --> 00:17:10.870
David Bau: I'm learning… I gotta learn how to use this thing. Okay. What kind of research question?

161
00:17:11.040 --> 00:17:30.160
David Bau: you know, your tools that you've played with so far are sort of suggesting that you might want to ask? I can kind of hop in, but just to make sure… sorry, we've covered the generated census, right? Yeah, okay. One… this goes back to Armina's experiment for the last time we presented, but I want to thank, and when she generated those.

162
00:17:30.160 --> 00:17:43.029
David Bau: sentences, one thing we had discussed as a group was that, the pattern we saw originally was that the decision it makes pivoted in the last few layers. One research question would be, like, would be honing in on that kind of idea, saying, look.

163
00:17:43.030 --> 00:17:54.649
David Bau: Are there particular layers? Are there certain decisions that seem to get subverted or altered? Are there… that can help identify, like, are there layers of alignment? Are there hidden opinions that are relatively consistent? That's one… that would be one research question.

164
00:17:55.050 --> 00:17:55.760
David Bau: True.

165
00:17:56.100 --> 00:18:04.930
David Bau: I see. Like, I didn't… I didn't subversion. So, one of the things I'm a little worried about, what's… I love that. It's such a cool story. I think I mentioned to you, like, I'm not sure…

166
00:18:05.240 --> 00:18:05.970
David Bau: like…

167
00:18:06.080 --> 00:18:18.700
David Bau: You can… so usually you want to triangulate your results when you have some innovative method, and the main evidence you have is that the opinion was different at a certain layer.

168
00:18:18.810 --> 00:18:25.219
David Bau: From the opinion that it gives at the end, that's, like, one leg of the stool. But then people will ask you, well, what does that mean?

169
00:18:25.660 --> 00:18:40.649
David Bau: What does that mean? Like, you're using the model in some way that wasn't designed to use. Is there a second leg of the stool, or a third leg of the stool, where you can say, that's evidence of some other behavior, or it matters because it shows up in this other situation? And so that's my…

170
00:18:40.900 --> 00:18:55.589
David Bau: concern about… I love the story. I think it's so cool. I'm just not sure what you're doing, so I… so I don't want to dissuade you from advice. You know, if you want to go that way, I want to encourage you to be… Question.

171
00:18:55.730 --> 00:19:09.769
David Bau: I assume either through pro… if there are some activations, we can locate that, which… that would alter the decision downstream. Would that… would that contribute? That'd be a second link. It could be a second link. I mean, it might be very closely related.

172
00:19:09.970 --> 00:19:11.270
David Bau: It could be something.

173
00:19:11.400 --> 00:19:18.780
David Bau: So you could say, oh, we hypothesize that the model has a concept of Something.

174
00:19:19.090 --> 00:19:23.529
David Bau: Whatever, some aspect of power, that it processes, you know, that it's impolite.

175
00:19:23.770 --> 00:19:26.520
David Bau: To… to talk about power, something like that.

176
00:19:27.320 --> 00:19:33.210
David Bau: And, and that activates… use this as concept somewhere at Layer 27.

177
00:19:33.470 --> 00:19:40.369
David Bau: Or different layers of the model, and we have assessments for figuring out what layer it is. And we have an ability to turn off this intuition

178
00:19:40.880 --> 00:19:57.180
David Bau: Right? And maybe the third leg might be… and we noticed that if we look at models that are trained one way, and models that are trained a different way… Oh, the instructors, if they don't have the inhibition, or something like that, you know, let's say maybe that could be a third leg, right? So… so I think it's possible to make a…

179
00:19:58.230 --> 00:19:59.690
David Bau: Make a tripod.

180
00:20:00.150 --> 00:20:03.660
David Bau: Right, but but… but maybe there's something to explore?

181
00:20:04.050 --> 00:20:08.830
David Bau: You know, with more… Preliminary experiments before you really commit to it.

182
00:20:09.030 --> 00:20:10.200
David Bau: Does that make sense?

183
00:20:10.810 --> 00:20:12.399
David Bau: Any other suggestions?

184
00:20:15.010 --> 00:20:18.259
David Bau: I mean, not a suggestion, just a comment. I like the blank…

185
00:20:19.080 --> 00:20:25.920
David Bau: power sentence, like, kind of like those sentences that, like, power's not explicit, but you could sort of… and I wonder if you could…

186
00:20:26.560 --> 00:20:29.740
David Bau: Yeah, have, like…

187
00:20:30.030 --> 00:20:37.160
David Bau: varying data sets for the different kinds of power you want to test for understanding that for something that are similar for this, where, like.

188
00:20:37.350 --> 00:20:40.449
David Bau: It's implied, but not explicit. I like his pun.

189
00:20:45.220 --> 00:20:53.809
David Bau: Yeah, I like it too. Like, it seems like there's different types of power in here, probably, right? Do you… like, I wonder if it's… so, have you guys considered…

190
00:20:54.580 --> 00:20:58.379
David Bau: Trying to sort through different conceptions of powers within the model hands.

191
00:20:59.030 --> 00:21:05.650
David Bau: Yeah, we've been narrowing down on, like, specifically relational conceptions of power.

192
00:21:05.650 --> 00:21:08.269
David Bau: people. We haven't kind of fluttered this.

193
00:21:08.270 --> 00:21:27.870
David Bau: for that, and thinking more in terms of, there's a question of, like, the, kind of, like, latent possibility of exercising power, if you're a position of power, rather than the actual exercise of power, and those are kind of two distinct things as well. So thinking in that sort of relational context between two entities. Interesting.

194
00:21:28.130 --> 00:21:31.880
David Bau: Yeah, within that, that can, like, go in kind of different…

195
00:21:32.020 --> 00:21:37.000
David Bau: direction. So, we were thinking more about the kind of, like,

196
00:21:37.230 --> 00:21:42.089
David Bau: Diversity of context to make sure that we're picking up and,

197
00:21:42.440 --> 00:21:46.260
David Bau: the conversations we've had so far, and haven't talked as much about it like this.

198
00:21:47.160 --> 00:21:48.200
David Bau: arrangements.

199
00:21:49.630 --> 00:21:51.379
David Bau: Yeah, that sounds, sounds fun.

200
00:21:55.420 --> 00:21:57.299
David Bau: And I was like, keep it.

201
00:22:05.770 --> 00:22:07.070
David Bau: Who's the next team?

202
00:22:08.630 --> 00:22:09.470
David Bau: 8.

203
00:22:10.750 --> 00:22:12.859
David Bau: Alright, Jasmine!

204
00:22:13.480 --> 00:22:15.110
David Bau: How come you're remote today?

205
00:22:16.210 --> 00:22:18.770
Jasmine Cui: Just under the weather, sorry.

206
00:22:18.770 --> 00:22:21.589
David Bau: Well, I hope you feel better. Thanks for dialing in.

207
00:22:21.710 --> 00:22:23.569
David Bau: Is your team ready to present?

208
00:22:24.500 --> 00:22:28.199
Jasmine Cui: Let me… let me ask them, I'm not sure.

209
00:22:30.470 --> 00:22:31.830
Jasmine Cui: Okay, so…

210
00:22:31.830 --> 00:22:35.529
David Bau: I can put you guys up on the screen.

211
00:22:36.570 --> 00:22:37.599
David Bau: If you're not…

212
00:22:38.980 --> 00:22:41.660
Jasmine Cui: I think maybe a pass for today, if it's hard.

213
00:22:42.040 --> 00:22:46.220
David Bau: Okay, should we, should we, should we… are you, are you entertained?

214
00:22:46.370 --> 00:22:48.660
David Bau: I, I could just go through my slides.

215
00:22:49.240 --> 00:22:52.660
David Bau: Are you a TMTK? Yes. Oh, cool, yeah, why don't you go through your slides?

216
00:22:52.930 --> 00:22:56.769
David Bau: Jasmine, we'll have, we'll have, Aria… Is that fine?

217
00:22:56.770 --> 00:22:58.870
Jasmine Cui: Yeah, yeah, yeah, yeah, no, Aria's great, yeah.

218
00:22:58.870 --> 00:23:00.659
David Bau: I don't know where…

219
00:23:00.660 --> 00:23:02.290
Jasmine Cui: Favorite superstar.

220
00:23:02.510 --> 00:23:06.439
David Bau: You know what she looks like.

221
00:23:07.330 --> 00:23:10.370
David Bau: Yes. Yes, indeed.

222
00:23:17.700 --> 00:23:20.060
David Bau: Probably very small on the screen.

223
00:23:20.380 --> 00:23:29.319
David Bau: I forgot to put the, font size is bigger, but… Anyway… For dataset.

224
00:23:31.180 --> 00:23:34.060
David Bau: I want to start off by, like, asking a question.

225
00:23:34.620 --> 00:23:35.930
David Bau: Cause…

226
00:23:36.210 --> 00:23:43.290
David Bau: I know this class is, like, mostly about identifying a concept, and, like, the guidelines we have for building a dataset

227
00:23:43.670 --> 00:23:52.370
David Bau: Mostly focused on, like, if you have a concept, you… this is how you do, like, the constructive pairs and stuff, all because you demonstrate that you, like.

228
00:23:52.480 --> 00:23:57.869
David Bau: have an item that has this concept and one that doesn't. Right, for example. But, yes, for example.

229
00:23:58.170 --> 00:24:05.239
David Bau: Like, for, for ours, We were focusing on something slightly different than, like, you wouldn't think of as a

230
00:24:05.820 --> 00:24:09.620
David Bau: Mechanism, like, an algorithm of whatever day, which makes…

231
00:24:09.780 --> 00:24:17.570
David Bau: constructing this data kind of different from the guidelines? That's okay, that's totally okay. That's… so that's alright.

232
00:24:17.880 --> 00:24:24.640
David Bau: you know, the guidelines… I mean, so, you know, the real goal of the paper, or the real goal of the course, is to, like.

233
00:24:24.900 --> 00:24:29.280
David Bau: produce a piece of interesting insight, about how these LMs are thinking.

234
00:24:29.380 --> 00:24:36.699
David Bau: And, you know, we have suggestions on how you might go about it, but if you have a clever general that's, you know, A+, okay?

235
00:24:36.820 --> 00:24:42.820
David Bau: I don't have a clever way, I just have a different way, that might not work, but… Anyway…

236
00:24:42.940 --> 00:24:59.119
David Bau: because we're… our tasks were to, just to, catch up, people, have a transcript that is undifferentiated by speakers, and then have the LMs kind of attribute each sentence to, like, the crystalline speaker that spoke at.

237
00:24:59.480 --> 00:25:13.160
David Bau: So, in terms of, like, a data set, like, an easier setup, we want, essentially, a discourse where you have to infer the speaker identity without, like, tags, and essentially, we need

238
00:25:13.160 --> 00:25:22.400
David Bau: to have a question that's on the discussion within the dialogue, so that we could ask questions that… I love it. …the interpoles would then operate on.

239
00:25:22.700 --> 00:25:29.480
David Bau: And, like, the data items should be short. It wouldn't be, like, a full transcript, it would be something shorter, because

240
00:25:29.710 --> 00:25:36.500
David Bau: you… To make it not ambiguous, you cannot have the…

241
00:25:36.850 --> 00:25:39.620
David Bau: Instances where a speaker just starts speaking.

242
00:25:39.690 --> 00:25:57.570
David Bau: So, like, the next turn have to be… like, the current speaker must select the next speaker in some sense, either by, you know, referring by names, or in, like, a row setting of, like, you have an interviewer and the interviewee, the person who asks questions, the person who answers, you would know who is, like, taking that role.

243
00:25:58.050 --> 00:26:04.240
David Bau: So anyway, there's a bunch of, like, nuances in there, because there's a lot of things that you have to…

244
00:26:04.560 --> 00:26:08.230
David Bau: Satisfied to make such data unsolvable.

245
00:26:09.040 --> 00:26:18.160
David Bau: in terms of, like, for example, you… the logic must match, and, like, there's gotta be hints that you could take. For example, if someone asks the same question twice.

246
00:26:18.310 --> 00:26:23.670
David Bau: you already had someone answer it, the second answer would belong to a new person. Sure. Like, things like that.

247
00:26:24.510 --> 00:26:31.439
David Bau: And, like, the interviewer interviewee is, like, an example of, like, role assignments that would help avoid this…

248
00:26:32.510 --> 00:26:43.109
David Bau: And there's, like, inclusive clues of, like, if you mention someone's name, you're talking to them. So cool. But, so… Oh, so you have an example transcript here? For data items…

249
00:26:43.520 --> 00:27:02.169
David Bau: I'm separating it into, like, my levels of difficulties, starting with, like, a just labeled transcripts, which… Can you hit the present button to make the fonts a little bigger here? I'm just curious to see who's, because they're all JSONs. Yeah, it's very… Better? Yeah, it's much better. Okay. Yeah, like, this will be super easy, like…

250
00:27:02.540 --> 00:27:06.929
David Bau: Even the smallest, like, 1.5B, could solve this, because it's, like, it's easy.

251
00:27:07.560 --> 00:27:09.200
David Bau: And,

252
00:27:09.470 --> 00:27:16.349
David Bau: So, wait, wait, the transcripts, literally, let's read it here, so I get an idea. So, Alice… I'm Alice, I'm Bob.

253
00:27:16.450 --> 00:27:25.700
David Bau: I like basketball. I like soccer. Okay? And then now the QA pairs are… What sport does Alice…

254
00:27:26.090 --> 00:27:29.129
David Bau: Like. Like? Well, Alice likes.

255
00:27:29.280 --> 00:27:30.250
David Bau: Basketball.

256
00:27:30.730 --> 00:27:31.610
David Bau: I see.

257
00:27:31.840 --> 00:27:40.540
David Bau: And so, these are just questions about the dialogue. Yeah. Okay, so it's pretty straightforward, like you were saying. Yes. And then, if it… if it turns into…

258
00:27:40.890 --> 00:27:45.009
David Bau: And an unlabeled one. Oh, without the 'Alice' colon. Now you don't…

259
00:27:45.150 --> 00:27:58.549
David Bau: Now the model, first, it must understand the task of the transcript of a conversation, then it starts turn-taking, so then you would know Alice is, like, saying that word. You would… Bob's second. Yeah. Alice is third, Bob is fourth.

260
00:27:58.630 --> 00:28:06.250
David Bau: Where it gets a little more interesting is when you have three, or, like, more than two speakers, where now, turn-taking, it's not, like,

261
00:28:06.890 --> 00:28:12.919
David Bau: the next line has to belong to, like, the next speaker. Yeah. This is, like, an example where you…

262
00:28:13.470 --> 00:28:31.160
David Bau: Oh, hi, Alice, yes, I like… yeah, I heard that Claire likes basketball, too. Oh yeah, Alice, yes. You would infer that this line, basically, is spoken by Claire. Yeah, you're right. And, like, now that, like, you would have to also understand that when Alice asks

263
00:28:31.160 --> 00:28:35.079
David Bau: Again, this answer cannot be Claire's, it has to be Bob's.

264
00:28:35.350 --> 00:28:42.270
David Bau: It's, like, a funny transcript thing. Oh, right, because… Because, like…

265
00:28:42.630 --> 00:28:50.379
David Bau: Because it would be weird for Claire to go back to Alice and ask, what do you like? Because Alice just said that I like basketball. Yes.

266
00:28:50.730 --> 00:28:54.129
David Bau: So… so it makes more sense for it to be Bob.

267
00:28:54.620 --> 00:28:56.250
David Bau: Oh, what a cool transcript!

268
00:28:56.450 --> 00:28:58.920
David Bau: If you can explain that one, that's pretty awesome.

269
00:29:00.090 --> 00:29:02.090
David Bau: I like it. What do you guys think?

270
00:29:04.230 --> 00:29:14.929
David Bau: Isn't that neat? This happens all the time when you're reading fiction, have you noticed? You read this dialogue in fiction, and it's like, quote this person, quote that person. Sometimes it gets a little long, sometimes the writer's like.

271
00:29:15.110 --> 00:29:32.410
David Bau: You can follow, but once in a while, you have to sort of pause for a second and say, wait, who's talking? Yeah. Right? This is, like, an example where I tested Qwen 1.5B and its instruct pair, 7B and its instruct pair, and then, just for fun, a reasoning model, QwQ 32B.

272
00:29:32.530 --> 00:29:45.650
David Bau: This is, like, a case where the reasoning models can, like… They can do it? Yes, they can do it. But they also waste a lot of tokens thinking through, like… And they think through it? They reason through it? They're like, just like the way I narrated it. Yeah.

273
00:29:45.740 --> 00:29:55.629
David Bau: They're like, oh, that could be wrong. They go back and forth, they backtrack, they do, like, every single permutation of, like, how I could assign it. And then for the instruct models, they just suck.

274
00:29:56.430 --> 00:29:57.620
David Bau: Really, they can't do it.

275
00:29:57.730 --> 00:30:00.709
David Bau: No. And what about the reasoning models if you don't let them reason?

276
00:30:01.470 --> 00:30:04.970
David Bau: Oh, I have not tried it. Okay. I wonder.

277
00:30:06.510 --> 00:30:13.599
David Bau: Anyway, and then there's level 3, where I actually took this from, the transcript that Jasmine gave us, and, like, just,

278
00:30:13.800 --> 00:30:16.269
David Bau: But to make it shorter.

279
00:30:16.420 --> 00:30:24.070
David Bau: This is harder to explain, but… because it's so small. But basically, the setup looks like there's one interviewer…

280
00:30:24.550 --> 00:30:30.729
David Bau: There's two interviewees and one interviewer, and thus, if there's two questions being asked consecutively, both of them have to belong

281
00:30:30.850 --> 00:30:36.580
David Bau: To the interviewer, and, like, through that, you could infer the roles, and…

282
00:30:36.710 --> 00:30:39.630
David Bau: Take a question, and then ask about their stance.

283
00:30:39.830 --> 00:30:41.020
David Bau: And,

284
00:30:44.500 --> 00:30:45.210
David Bau: Oh.

285
00:30:46.830 --> 00:30:47.680
David Bau: Quickly.

286
00:30:48.850 --> 00:30:51.400
David Bau: For the… for the natural transcript, like, one, this is…

287
00:30:52.020 --> 00:31:02.570
David Bau: By looking at the reasoning model's traces, I see some cases where the model could, like, understand what they're talking about, it could summarize what people are saying.

288
00:31:02.610 --> 00:31:15.860
David Bau: But notice how, when the model is asked about Joe's perspective on, like, policy uncertainty, it refers to Richard's analysis. It basically collapsed the discourse into just, like, one narrative.

289
00:31:16.140 --> 00:31:18.400
David Bau: And, yeah.

290
00:31:19.050 --> 00:31:23.540
David Bau: I don't know if this happens to instruct models, because it doesn't really say much.

291
00:31:24.100 --> 00:31:34.160
David Bau: But… here is an example of, like, the made-up three-person task with the instruct model, so it's…

292
00:31:34.640 --> 00:31:38.999
David Bau: It's, like, an obvious mistake that's just, like… I don't know.

293
00:31:41.760 --> 00:31:45.329
David Bau: Exactly. So, and how many, and how many… when you say it sucks.

294
00:31:45.450 --> 00:31:49.220
David Bau: How many, cases did you try through these different models to sort of measure?

295
00:31:49.600 --> 00:31:54.370
David Bau: Like, 10 of them? Yeah, 10 of them. Not much. Okay.

296
00:31:54.570 --> 00:31:56.820
David Bau: because of… the…

297
00:31:57.590 --> 00:32:13.209
David Bau: these above reasons, I have not figured out how do I generate data in, like… With this form? …bulk, or, like, in this form. There's gotta be some kind of… I was hoping there's gonna be some kind of templates that I could take from, like, linguistic research that…

298
00:32:13.460 --> 00:32:22.860
David Bau: defines things like that, like… But have you just… have you tried going to any of the big models and just saying, this is what I need? Just, like, pasting the text in it? I, I, I did, but, like…

299
00:32:23.390 --> 00:32:25.120
David Bau: Oh, sorry. Sorry.

300
00:32:25.950 --> 00:32:30.919
David Bau: It's just, like, you would have to manually go through them to actually, like, figure out if…

301
00:32:31.200 --> 00:32:35.560
David Bau: It's valid. And also, they tend to, like…

302
00:32:35.770 --> 00:32:38.450
David Bau: Some, like, like, the clever ones, like this.

303
00:32:38.860 --> 00:32:45.129
David Bau: where Claire is, like, introduced into the conversation instead of starting with, like, I'm Alice, I'm Bob, I'm Claire.

304
00:32:45.740 --> 00:32:47.589
David Bau: It's very hard for a model to come up with.

305
00:32:48.800 --> 00:32:56.710
David Bau: You could give a few examples. Yeah. Like, if I write one of these and a few-shot them, maybe… Yeah, which is, you know, that's what,

306
00:32:57.290 --> 00:32:58.529
David Bau: That's what Franz did.

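The few-shot idea mentioned here could look something like this sketch (the instruction wording and the hand-written example dialogue are placeholders, not the team's actual prompt):

```python
def build_fewshot_prompt(examples, n_speakers=3):
    """Assemble a data-generation prompt from hand-written example
    transcripts, in the hope a large model imitates their structure
    (e.g. a third speaker introduced mid-conversation)."""
    instruction = (
        f"Write a short {n_speakers}-person dialogue with the speaker "
        "labels removed, such that every line can be attributed to "
        "exactly one speaker from context alone.\n\n"
    )
    shots = "\n\n".join(f"Example {i+1}:\n{ex}" for i, ex in enumerate(examples))
    return instruction + shots + "\n\nNew dialogue:"

example = (
    "I'm Alice. / I'm Bob. / Hi Alice, I heard Claire likes basketball too. / "
    "Oh hi Claire! What do you like, then?"
)
prompt = build_fewshot_prompt([example])
print(prompt)
```

Generated outputs would still need the manual validity check discussed above; the template only shapes the request.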
307
00:32:58.870 --> 00:33:00.529
David Bau: Right? Yeah, more.

308
00:33:00.680 --> 00:33:11.979
David Bau: Very nice. So, any other suggestions for Team TK? Are you guys interested in this setting, or would you change it in a way to make it more interesting? What do you think?

309
00:33:14.570 --> 00:33:28.060
Jasmine Cui: Oh, I just wanted to note, too, that, like, some of the really big instruct models can also do the task, so, like, I just tried it on OpenRouter with Qwen3-VL-235B, and it can, like, it can do, like, the annotation

310
00:33:28.290 --> 00:33:34.070
Jasmine Cui: I think, like, the size might be, like, a challenge, just cause… Yeah.

311
00:33:34.080 --> 00:33:35.850
David Bau: How many B's, you said?

312
00:33:36.550 --> 00:33:40.390
Jasmine Cui: It's 235B, but it's a… it's a MoE, so…

313
00:33:41.020 --> 00:33:44.819
Jasmine Cui: I don't know how many parameters are active at a given time, but…

314
00:33:45.640 --> 00:33:48.700
David Bau: Yeah, we should, we should ask the NDIF team to help you with that.

315
00:33:49.400 --> 00:33:57.849
Jasmine Cui: Yeah, yeah, I think for some of the more challenging task settings, we might need to use bigger instruct models to be able to see the behavior, but…

316
00:33:58.000 --> 00:33:58.810
David Bau: We love it.

317
00:33:59.590 --> 00:34:05.580
David Bau: Yeah, that's great. So, it's really good to know this. So, like, this… the testing that you're doing, is…

318
00:34:05.860 --> 00:34:11.780
David Bau: from my point of view, it's perfect, right? It's like, right, basically… You know.

319
00:34:12.040 --> 00:34:14.330
David Bau: You want to avoid wasting your time.

320
00:34:15.330 --> 00:34:17.460
David Bau: Studying a model that can't do the task.

321
00:34:18.040 --> 00:34:28.200
David Bau: Right? And on the other hand, you also want to avoid wasting your time studying a giant model that's oversized for the task, right? Because it is very inconvenient to study big models.

322
00:34:28.420 --> 00:34:32.270
David Bau: So, so, you know, you sort of want to understand what you're… what you're facing.

323
00:34:32.440 --> 00:34:37.970
David Bau: So what is… so how do you… how do you feel about the situation now that you've probed it a little bit? Do you feel like it's promising?

324
00:34:40.969 --> 00:34:45.290
David Bau: I think… Yeah,

325
00:34:46.620 --> 00:34:51.329
David Bau: I'd like to, try to do interp on that, instead of just prompting it first.

326
00:34:51.739 --> 00:34:59.250
David Bau: Because… You like to what? I like to, like, see how… like, see how base models suck at it, like, why does it suck? Or, like, why does it…

327
00:35:00.410 --> 00:35:01.870
David Bau: Yeah, I don't know, I'm just…

328
00:35:02.540 --> 00:35:07.680
David Bau: This is, like, just one of the tasks. I would also want it to be more diverse than…

329
00:35:07.980 --> 00:35:12.840
David Bau: like, than if I just feed this through LLMs, and it just replaces names and stuff.

330
00:35:12.970 --> 00:35:23.850
David Bau: I think that's fine. Yeah, the fact that, like, some models can solve it makes it, I think, better, because otherwise it's very concerning.

331
00:35:24.060 --> 00:35:35.000
David Bau: Right, right. Some models can definitely do this. So I, I, like, I don't know, like… right, go ahead. Oh, I'm just saying, I like the different levels they created, so…

332
00:35:35.120 --> 00:35:47.120
David Bau: And I'm just wondering, so what the next step might be asking how the model solves the problem, right? And especially, like, how the model solves the problems differently. Especially if you could…

333
00:35:47.410 --> 00:35:50.689
David Bau: For example, you could, like, patch a particular

334
00:35:50.840 --> 00:35:54.240
David Bau: sentence from Level 1 to level 3, and see, like, how that…

335
00:35:54.360 --> 00:36:02.050
David Bau: changes things, I don't know. To answer, like, how the model, like, approaches different levels differently.

336
00:36:02.510 --> 00:36:06.620
David Bau: But I like the idea of different levels, so… Oh, yeah.

337
00:36:06.790 --> 00:36:14.080
David Bau: This is a little silly, but I feel like it would be interesting to see you add someone at the end. Like, someone be like,

338
00:36:14.670 --> 00:36:18.130
David Bau: I love football, and then the next lines would be like, who's that?

339
00:36:18.340 --> 00:36:33.949
David Bau: And then make it into a cloze prompt, yeah. I think that would be, like, that would really show that it's also reading, you know, and it's understanding these concepts, and not just, like, finding a pattern or finding a speech pattern. I tried to ask for names that don't exist.

340
00:36:34.310 --> 00:36:40.519
David Bau: And, I don't know what the base model does. Instruct models answer as if…

341
00:36:40.760 --> 00:36:53.269
David Bau: I'm asking about, like, Alice. The reasoning it follows goes, because the user's asking about this, this person must exist, and then it proceeds to attribute whatever sentences that they're unclear about to that.

342
00:36:53.540 --> 00:36:55.230
David Bau: A non-existing name.

343
00:36:55.880 --> 00:36:57.700
David Bau: Yeah, thanks. Bye.

344
00:36:59.090 --> 00:37:05.980
David Bau: Thank you. Okay, next team, next team. Team S? Is Team S ready now? Yep. Okay, Team S.

345
00:37:06.330 --> 00:37:07.550
David Bau: 15 minutes.

346
00:37:09.790 --> 00:37:13.559
David Bau: I don't know where to go. You find your thing. I think this is it.

347
00:37:15.480 --> 00:37:16.580
David Bau: Alright.

348
00:37:23.390 --> 00:37:24.430
David Bau: Whoa.

349
00:37:24.770 --> 00:37:32.289
David Bau: We have a little bit different prompt design based on last week's material.

350
00:37:32.440 --> 00:37:39.059
David Bau: So we first do, like, a binary class, then we do a pairwise comparison.

351
00:37:39.220 --> 00:37:44.819
David Bau: And we, do a multiple-choice pairwise comparison, and then…

352
00:37:45.000 --> 00:37:50.500
David Bau: We do an attribute-specific class. Move on to the next page,

353
00:37:51.030 --> 00:37:58.140
David Bau: So, for the binary class, it's basically you show a… just one picture, and

354
00:37:58.350 --> 00:38:15.060
David Bau: ask the model to say, like, if, this urban scene appears to be what? Like, we can give them a pair of either safe or dangerous, either wealthy or poor, something like that, and we do, we…

355
00:38:15.190 --> 00:38:21.079
David Bau: do a logit lens, inside the model, to see how it,

356
00:38:21.350 --> 00:38:24.890
David Bau: How, what we have in the inside.

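The logit-lens idea being described, on toy numbers rather than a real model (in practice you would read each layer's hidden state out of the VLM and project it through the model's own unembedding matrix; the dimensions, vocabulary, and values below are purely illustrative):

```python
import math

def logit_lens(hidden_by_layer, unembed, vocab):
    """Project each layer's last-token hidden state through the
    unembedding matrix and report the top token (and its softmax
    probability) per layer."""
    readout = []
    for h in hidden_by_layer:
        # One logit per vocab token: dot product with that token's column.
        logits = [sum(hi * wi for hi, wi in zip(h, col)) for col in unembed]
        m = max(logits)
        exps = [math.exp(l - m) for l in logits]   # stable softmax
        total = sum(exps)
        probs = [e / total for e in exps]
        best = max(range(len(vocab)), key=probs.__getitem__)
        readout.append((vocab[best], round(probs[best], 3)))
    return readout

# Toy 2-d hidden states over 3 "layers" and a 2-token vocab.
vocab = ["safe", "dangerous"]
unembed = [[1.0, 0.0], [0.0, 1.0]]              # one column per vocab token
hiddens = [[0.1, 0.1], [0.2, 1.0], [2.0, 0.2]]
print(logit_lens(hiddens, unembed, vocab))
```

Reading this trace layer by layer is what lets you say at which depth the safe/dangerous decision forms.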
357
00:38:24.900 --> 00:38:36.509
David Bau: And move on to the comparison. This one is basically the same task that we have for the MIT Place Pulse test.

358
00:38:36.510 --> 00:38:48.670
David Bau: Which is giving two images, making the model say which location, of these two,

359
00:38:48.940 --> 00:39:04.950
David Bau: pictures looks safer? Is it the picture on the left, or is it the picture on the right? And, in the meantime, we want to also, like, look into the logit lens to see what it, do… where it makes the decision.

360
00:39:05.070 --> 00:39:09.299
David Bau: And move on to the next slide. This is, the…

361
00:39:09.330 --> 00:39:28.199
David Bau: just a little run that I did, and we do the first 100 tests, with the two pictures here, and asking which image, which image is either wealthier, more beautiful, or something like that. Reply only with left or right,

362
00:39:28.390 --> 00:39:31.409
David Bau: If uncertain, reply equal.

363
00:39:32.030 --> 00:39:41.219
David Bau: So this is a result compared to what, in the dataset. So, you see, like, right here, like, for,

364
00:39:41.310 --> 00:39:54.770
David Bau: When asking concepts about beauty, it matches with the human ratings pretty well. But actually, when it… when we try to, like, ask whether it's safer or, like,

365
00:39:54.770 --> 00:40:06.379
David Bau: more depressing, there's, like, zero matches. But for… I think for the wealthier part, it's just, like, pure guessing.

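Scoring those matches could be as simple as this sketch (the labels and replies here are made up; an "equal" reply counts as no match against a left/right human label):

```python
def agreement(model_replies, human_labels):
    """Fraction of pairwise comparisons where the model's left/right
    choice matches the human majority label; 'equal' never matches."""
    hits = sum(m == h for m, h in zip(model_replies, human_labels))
    return hits / len(human_labels)

human = ["left", "right", "left", "left"]
model = ["left", "equal", "right", "left"]
print(agreement(model, human))  # 2 of 4 match
```

Computed per attribute (beauty, safety, wealth), this is the match rate being compared against the dataset.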
366
00:40:06.440 --> 00:40:09.440
David Bau: And, now this…

367
00:40:09.530 --> 00:40:17.409
David Bau: I, I will probably try more models for this task, and probably more, data.

368
00:40:17.430 --> 00:40:35.500
David Bau: Move on to the next slide. So this is a multiple choice question. So, we want to ask the model, how does… how safe does this street appear? So instead of, giving them a,

369
00:40:35.800 --> 00:40:45.009
David Bau: Likert rating for each score. We think that this one is better, because, it's more straightforward for the language model to understand.

370
00:40:45.100 --> 00:40:56.969
David Bau: And we first, just do a pure prompt, prompt, and then we do a comparison. Like, we give it another picture, and to, make…

371
00:40:57.030 --> 00:41:05.120
David Bau: Then, like, compare and to have the, how do you say, like, the, the scores right here.

372
00:41:05.210 --> 00:41:12.520
David Bau: And this is basically how we want to design our future, experiment.

373
00:41:12.840 --> 00:41:16.469
David Bau: Right now, Lex, let's move on to the…

374
00:41:16.760 --> 00:41:20.269
David Bau: Yes, and this is also… yeah, sorry, this…

375
00:41:20.380 --> 00:41:33.620
David Bau: The last… there is another one called Attribute Specific Class, where we want to see, like, in scale, the degree of, perceived… the perspective of one of these places.

376
00:41:33.620 --> 00:41:42.709
David Bau: Like, we want to see, like, the decoding from, too high or too low, like, whether the model thinks that,

377
00:41:42.770 --> 00:41:48.260
David Bau: If this picture fits in, the description that we provided.

378
00:41:48.420 --> 00:41:50.479
David Bau: And, that's all for the…

379
00:41:50.770 --> 00:42:07.730
David Bau: methods. So, before I start with my experiments, I want to give a context about vision language models, because it's different from language models. We have an image here, and the image goes through the ViT, and we get the image tokens.

380
00:42:07.770 --> 00:42:14.499
David Bau: And the difference is, like, in earlier vision language models, like,

381
00:42:14.810 --> 00:42:19.020
David Bau: LLaVA, the image token doesn't go in the…

382
00:42:19.820 --> 00:42:27.590
David Bau: text decoder, so, it goes through the, cross-attention, so the, text token

383
00:42:28.230 --> 00:42:32.719
David Bau: can look at the image token through the attention, but the

384
00:42:32.820 --> 00:42:43.279
David Bau: image token, it doesn't change in the decoder. So, one way to visualize that, how, how the image…

385
00:42:43.840 --> 00:43:02.010
David Bau: affect the final decision is to look at the attention heat map. Like, we use the last token's attention map and visualize it through, different layers, and, it's really hard to find the patterns at the beginning, but if you keep looking at

386
00:43:02.490 --> 00:43:08.839
David Bau: this for, like, one minute. Maybe you can find, like, the… at the beginning, like.

387
00:43:08.940 --> 00:43:16.540
David Bau: earlier layers, the attention is maybe focused on, the edges of the image, but when this comes to, like,

388
00:43:16.620 --> 00:43:35.299
David Bau: layer 20, around layer 20, it begins to focus on the center of the image. So, an initial guess is, the model starts to look at the image, or look at the contents in the image at layer 20, and so there's, the…

389
00:43:35.690 --> 00:43:40.069
David Bau: some, initiatives. It's a small group.

390
00:43:40.520 --> 00:43:43.289
David Bau: Yeah, I divided it into 3 different…

391
00:43:43.530 --> 00:43:54.340
David Bau: stages, and the first stage is pretty much random, and focused on the edges, and the second stage is focused on some, real regions in the image.

392
00:43:54.860 --> 00:44:00.160
David Bau: And in the last few layers, it doesn't look at anything.

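One way to quantify the three stages being described is the last token's total attention mass on the image-token positions, per layer (toy numbers below; a real run would read these weights out of the model):

```python
def image_attention_mass(attn_by_layer, image_positions):
    """For each layer, sum the last token's attention weights that
    land on image-token positions."""
    return [round(sum(row[i] for i in image_positions), 3)
            for row in attn_by_layer]

# Toy: 4 layers, 5 positions, positions 0-2 are image tokens.
attn = [
    [0.05, 0.05, 0.05, 0.45, 0.40],   # early: mostly on text
    [0.02, 0.02, 0.02, 0.50, 0.44],
    [0.30, 0.30, 0.20, 0.10, 0.10],   # later: jumps to the image
    [0.25, 0.25, 0.25, 0.15, 0.10],
]
print(image_attention_mass(attn, [0, 1, 2]))
```

Plotting this curve over layers is exactly the "drops near zero, then climbs around layer 20" picture described next.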
393
00:44:00.260 --> 00:44:08.080
David Bau: So, another interesting thing is, if we try to visualize the… Attention.

394
00:44:08.190 --> 00:44:09.969
David Bau: For the image token.

395
00:44:10.170 --> 00:44:12.770
David Bau: We can see, at the beginning,

396
00:44:12.920 --> 00:44:17.869
David Bau: Very beginning is… it has something, but it dropped very quickly, and

397
00:44:18.460 --> 00:44:31.750
David Bau: to close to zero until, layer 20, then it goes up very quickly. So, what I think is, maybe the model doesn't really look at the image token

398
00:44:32.490 --> 00:44:41.269
David Bau: in the first, like, twenty layers. It's just more focused on the text tokens and, in, like, layer…

399
00:44:41.470 --> 00:44:44.300
David Bau: 18, 19, it starts to look at the image.

400
00:44:44.620 --> 00:44:48.210
David Bau: And also, I try to,

401
00:44:48.670 --> 00:44:57.499
David Bau: see, like, the probability of the final decision, like, safe or unsafe. So, from this figure, we can,

402
00:44:57.770 --> 00:45:04.550
David Bau: find that the final decision, the probability of safe or unsafe, is…

403
00:45:04.750 --> 00:45:07.979
David Bau: Close to zero for all the layers until…

404
00:45:08.230 --> 00:45:13.199
David Bau: layer 21, and they start to go up. So, I think this…

405
00:45:13.250 --> 00:45:16.529
David Bau: These three things say the same thing, that maybe in

406
00:45:16.560 --> 00:45:29.059
David Bau: The first 20 layers, the model tried to understand what the question is, and after the model understands the question, it starts to look at the image and find some useful information, and after

407
00:45:29.060 --> 00:45:45.320
David Bau: like, layer 20, it starts to make the decision, so that's something I found interesting here. Very good hypothesis, yeah. It's pretty neat. It's sort of circumstantial evidence, but it gives you a sense for what might be going on. You can try to prove that later. Yeah, I think this is very close to, like, humans.

408
00:45:45.440 --> 00:45:50.939
David Bau: Like, reasoning process, right? When we have this kind of question, we first

409
00:45:51.190 --> 00:46:07.419
David Bau: try to understand the question, like, what kind of thing I need to find in the image so I can make the final decision. And then I look at the image and find something that I have my decision. Okay, so this is for… Yeah, so we're about out of time for this one. So, any suggestions for the team?

410
00:46:10.550 --> 00:46:12.820
David Bau: Suggestions? You must have suggestions for the team.

411
00:46:14.210 --> 00:46:26.569
David Bau: I'll throw something out there. It seems like a lot of the questions that you're asking are, like, very subjective, which is, like, very interesting, and I wonder what… if there's, like, a way to compare it to more, kind of, like,

412
00:46:26.680 --> 00:46:30.530
David Bau: Straightforward, objective questions about the image.

413
00:46:30.700 --> 00:46:40.149
David Bau: And you could, like, compare across those, like, is there a tree in this image versus, is this image depressing? Yeah, that might… that may… what if that logic happens at different layers?

414
00:46:40.530 --> 00:46:59.770
David Bau: Right? Yeah, yeah. But, like, our topic is about, scene-level perception reason, but, we didn't look at the object-level things, but… But you could also ask very objective scene-level things. Like, is this outdoors? Is it indoors? Or things like that, right? I mean, very objective things, right? Versus…

415
00:46:59.940 --> 00:47:01.900
David Bau: Right? Even if you want to…

416
00:47:02.420 --> 00:47:05.170
David Bau: Yeah, that might be something, too.

417
00:47:06.300 --> 00:47:07.470
David Bau: to try to compare.

418
00:47:08.710 --> 00:47:21.579
David Bau: I'd also try a larger model, just like a 7B model. In fact, the reasoning is definitely desired. Maybe the larger model may have more of a reasoning process that drives it, too.

419
00:47:22.420 --> 00:47:25.780
David Bau: I like that suggestion. Any other good suggestions? Any other cool ideas?

420
00:47:36.800 --> 00:47:38.259
David Bau: I think that's a very good one.

421
00:47:39.490 --> 00:47:47.150
David Bau: You keep on thinking. I have some suggestions for you, but they have to do with some of the technical choices, but if you guys have ways of making this super interesting…

422
00:47:47.360 --> 00:47:51.940
David Bau: And, like, yeah, I can give them some ideas, but for some technical things.

423
00:47:52.060 --> 00:47:56.070
David Bau: So you guys are the only team working with, like, you know, a lot of images.

424
00:47:56.250 --> 00:47:57.950
David Bau: And the thing that… so…

425
00:47:58.180 --> 00:48:09.070
David Bau: we've been coaching everybody on how to generate a lot of data using LLMs to generate text and things, so one of the opportunities you have, you're going to be in a different situation, is to actually generate

426
00:48:09.680 --> 00:48:12.499
David Bau: Some of your data sets by generating images.

427
00:48:12.700 --> 00:48:16.600
David Bau: Right, so you're using real image datasets, you know, I can see, right?

428
00:48:16.790 --> 00:48:17.460
David Bau: But…

429
00:48:17.590 --> 00:48:25.630
David Bau: But, like, if you want a minimal pair, if you want to say, hey, I wonder what makes this image unsafe. Is it because there's a fire hydrant?

430
00:48:25.710 --> 00:48:42.960
David Bau: you know, can I make a version of this image without the fire hydrant? I wonder if that'll, like, you know, cause the model to behave differently. This is very interesting, like, we can use diffusion models to generate a place that is not safe, and… Or, you know, for generating data, you don't have to…

431
00:48:43.100 --> 00:48:50.620
David Bau: use a little cheap model, right? You could use Nano Banana or whatever, use whatever, like, the commercial ones are, right, to generate

432
00:48:51.390 --> 00:48:52.480
David Bau: information.

433
00:48:52.760 --> 00:49:11.570
David Bau: And so, and so you can do image editing and do all sorts of stuff, but then the question is, like, so, but you can be creative on how to do it. You can think about, like, are there… are there things that you want to do to generate synthetic images so that they make an interesting dataset that you can get a little bit more control over than a natural

434
00:49:11.690 --> 00:49:13.829
David Bau: Dataset. Yeah.

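The minimal-pair idea could start as simple edit specs fed to whatever image-editing model the team picks (the image id, object list, and instruction template below are placeholders):

```python
def minimal_pair_specs(image_id, objects):
    """For one source image, emit (image_id, instruction) pairs that each
    remove a single object, so model behavior can be compared on
    otherwise-identical scenes."""
    template = "Remove the {obj} from this photo; change nothing else."
    return [(image_id, template.format(obj=o)) for o in objects]

specs = minimal_pair_specs("street_0042", ["fire hydrant", "graffiti", "broken window"])
for s in specs:
    print(s)
```

Each edited image can then be re-scored with the same safe/dangerous prompt to see which object drives the judgment.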
435
00:49:14.250 --> 00:49:27.630
David Bau: Yeah, the other thing I'll recommend you read is, so, you're looking at attention patterns for getting some intuition about things. There's this great paper written by a Northeastern professor, Byron Wallace, and his students from a couple years ago, called

436
00:49:27.770 --> 00:49:31.750
David Bau: Attention is not explanation, so it's a little caution paper.

437
00:49:31.790 --> 00:49:33.250
David Bau: About, you know.

438
00:49:33.290 --> 00:49:53.179
David Bau: inferring too much from attention patterns. So I would read that. There's also a follow-up paper from Naomi, I think, that… called, like, Attention is not not Explanation, so it's a little controversy. So you can take a look at those two papers and see what you think, but we'll… so the other thing is, for vision,

439
00:49:53.340 --> 00:49:56.809
David Bau: These salience methods, where you're trying to figure out what part of the image.

440
00:49:57.150 --> 00:50:10.159
David Bau: is important, turn out to be important, but we're not covering salience methods till much later. But we do have a salience method expert on the team, Gabriel Sarti,

441
00:50:10.390 --> 00:50:18.259
David Bau: who's one of the, you know, world's experts in salience methods. So, so have you guys been in touch with Gabriel? Have you seen him?

442
00:50:19.410 --> 00:50:22.130
David Bau: Okay, so his name is Gabriel.

443
00:50:22.550 --> 00:50:31.199
David Bau: And… and ask me about them later, and then I'll connect you. Thank you very much.

444
00:50:31.560 --> 00:50:35.169
David Bau: Excellent. Okay, any other suggestions, any ideas for the team?

445
00:50:35.410 --> 00:50:44.000
David Bau: Like, you might be interested in the Maya paper? Oh, yeah, the Maya paper, and I also asked Nikhil for his latest paper, so if you read… okay, yeah, so…

446
00:50:44.130 --> 00:50:50.250
David Bau: Will you send it on Discord? Sure. Okay, look it on Discord. We'll get a couple papers to you.

447
00:50:50.530 --> 00:50:54.059
David Bau: And, you know, attention is not explanation.

448
00:50:54.250 --> 00:51:00.499
David Bau: And, Gabriel Sarti's name, on Discord, and we'll get you…

449
00:51:01.040 --> 00:51:03.280
David Bau: And then Nikhil, our TA,

450
00:51:03.910 --> 00:51:10.669
David Bau: who's skipping out on being TA today, because he got his ICML thing in probably at 7am this morning.

451
00:51:10.960 --> 00:51:12.329
David Bau: I fell asleep.

452
00:51:12.810 --> 00:51:22.279
David Bau: But his paper that he just submitted, is also, a VLM paper, and very, you know, so you can, and it's all pretty…

453
00:51:22.690 --> 00:51:24.310
David Bau: So, ask him for his preprint.

454
00:51:25.410 --> 00:51:28.180
David Bau: Yeah, then I'll have to get that preprint ready for you then.

455
00:51:30.900 --> 00:51:32.310
David Bau: Okay, great. Who's next?

456
00:51:35.170 --> 00:51:43.620
David Bau: Let's see, let's… I don't know who's next. Will you guys click on another slide that's hard for me to click? Thanks. Okay, somebody nominate yourself.

457
00:51:45.220 --> 00:51:48.500
David Bau: I think that sounds fine. That's yours, like… Okay.

458
00:51:52.090 --> 00:51:53.010
David Bau: Biants.

459
00:51:53.270 --> 00:51:59.700
David Bau: Thank you, guys. It's such a beginning, it's the beginning. Yeah, they're so excellent. Team excellent.

460
00:52:01.620 --> 00:52:03.810
David Bau: And so, yeah, I broke down and charged it.

461
00:52:03.960 --> 00:52:08.919
David Bau: So basically what we've done in this week is just, like,

462
00:52:09.040 --> 00:52:13.019
David Bau: Kind of, like, start cleaning our data, because we have a lot of data, which is the thing.

463
00:52:13.330 --> 00:52:20.560
David Bau: So, we're working with earnings calls, which are these transcripts of the calls that companies do with

464
00:52:21.470 --> 00:52:35.850
David Bau: people, like, reporters, every quarter of the year. And then transcribed? Yeah, so we had the transcripts, because my boss had them, so he just gave them to me. Nice. It's really nice, but we have so much data. So we have data from 2001 to 2020.

465
00:52:36.030 --> 00:52:41.010
David Bau: And we have in total, like, 200… 200K, like, 300K transcripts.

466
00:52:41.410 --> 00:52:44.490
David Bau: And then… Don't keep going.

467
00:52:44.780 --> 00:52:53.180
David Bau: So the setup for the earnings call is usually, they start with the opening statements, in which they give, like, a description of how things have been.

468
00:52:53.350 --> 00:52:58.510
David Bau: And then there is a question, a Q&A, with reporters and analysts from, like, JP Morgan.

469
00:52:59.320 --> 00:53:06.119
David Bau: And we said… we decided, we thought it was a good idea to keep, chunks that were, like, question and answer.

470
00:53:06.670 --> 00:53:12.930
David Bau: So that's sort of the idea. That's gonna be, like, our unit of observation. It's gonna be the Q&A chunk. Can you keep going?

471
00:53:13.970 --> 00:53:20.580
David Bau: So what does this look like? For example, we have, an analyst, Denise McAllefine, who's asking.

472
00:53:21.400 --> 00:53:24.019
David Bau: he was asking the CEO of Netflix.

473
00:53:24.330 --> 00:53:30.829
David Bau: You are wanting to put Walmart out of its misery by taking over its operation. Would you consider doing the same for Blockbuster? This is an old transcript.

474
00:53:30.920 --> 00:53:52.400
David Bau: And this is a very long answer that doesn't really say anything, but it doesn't contain much uncertainty. It doesn't say… it's just Reed Hastings just, like, saying nothing for a lot of words. Yeah, it's a lot of words, but it doesn't really say anything. It's just saying that maybe not. But there is… so some of them are just, like, these kind of questions, and it doesn't really say a lot about their uncertainty with respect to the future of the company.

475
00:53:52.650 --> 00:53:54.820
David Bau: Next. Sorry. Sorry, Claire.

476
00:53:55.290 --> 00:54:03.889
David Bau: But now, for example, this one is a software company, and they're asking them, about the quarter,

477
00:54:05.520 --> 00:54:12.890
David Bau: They're just asking for the revenue in the next quarter, and he does give a little bit of, like, a… oh, that's much better, thanks. Yeah, he'll give a…

478
00:54:12.950 --> 00:54:31.989
David Bau: description here says, it depends on the amount of new contracts we get closing in any given quarter, so this is a little bit more, like, an uncertain answer, we're not really sure. And then if we keep going, this is a long one, sorry. Very challenging. This is the CarMax CEO, when they were asking him for closing statements, and he was just saying.

479
00:54:31.990 --> 00:54:37.199
David Bau: The retail environment, there's a lot of uncertainty out there, the credit markets, the gas prices.

480
00:54:37.300 --> 00:54:48.209
David Bau: And he's saying we're in such a slow environment right now that there is so much uncertainty in the marketplace, it's very difficult for us to figure out what's going to happen in the rest of the year. So this is an example with very high uncertainty.

481
00:54:48.360 --> 00:54:54.180
David Bau: And now, did you say that this was already pre-labeled manually? So this is our issue.

482
00:54:54.580 --> 00:54:56.530
David Bau: We have all data, we don't have labels.

483
00:54:56.810 --> 00:55:09.949
David Bau: So I did those labels. How many did you label by hand? Not that many, 16. 16, that's a lot, that's good. But the issue is that some of them are really easy. Can you get the LLM to generalize the labels for you?

484
00:55:10.080 --> 00:55:20.180
David Bau: So, we did run an example on, like, 500 chunks. So, when we split by chunks of Q&As, we have 2 million, 3 million observations.

485
00:55:20.440 --> 00:55:25.339
David Bau: They look like this. I think there are some issues here in this extraction that we're gonna fix in the next…

486
00:55:25.530 --> 00:55:26.570
David Bau: Couple days.

487
00:55:27.050 --> 00:55:34.090
David Bau: Do you want to explain it? Yeah, yeah, explain what you do? Yeah. I'll finish the title.

488
00:55:35.510 --> 00:55:43.610
David Bau: Oh, yeah, well, we are… we don't know how to do the labels, basically. That's our issue. Yeah, so, I mean, like…

489
00:55:44.010 --> 00:55:58.729
David Bau: you have 60 examples, and you tried… you tried ICL prompting of a large model to see if, like, you gave 60 examples, and then you gave it 10 more, and you say, you know what, I need you to label the spans, and I need you to say

490
00:55:58.730 --> 00:56:07.369
David Bau: whether it's uncertain or not, here's 60 true examples. Here's 10 more that are unlabeled. Can you add the labels to this one? Have you tried that?
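
A minimal sketch of the in-context labeling setup described above: a definition, some hand-labeled examples, then unlabeled chunks to classify. `call_llm` is not shown; sending this prompt to whatever model/client you use (Llama, an API, etc.) is assumed, and the label strings are just the three-way scheme discussed here.

```python
# Three-way label set discussed in the conversation.
LABELS = ["no uncertainty", "intermediate uncertainty", "high uncertainty"]

def build_labeling_prompt(definition, labeled_examples, unlabeled_chunks):
    """Assemble one prompt: task definition, gold examples, items to label."""
    parts = [definition, "\nLabeled examples:"]
    for text, label in labeled_examples:
        parts.append(f"Text: {text}\nLabel: {label}")
    parts.append("\nNow label each of the following with exactly one of: "
                 + ", ".join(LABELS))
    for i, text in enumerate(unlabeled_chunks, 1):
        parts.append(f"[{i}] {text}")
    return "\n".join(parts)

prompt = build_labeling_prompt(
    "Rate how much uncertainty the speaker expresses about the company's future.",
    [("It is very difficult for us to say what will happen this year.",
      "high uncertainty"),
     ("Revenue will be $40M next quarter, as guided.", "no uncertainty")],
    ["It depends on how many new contracts close in any given quarter."],
)
```

The resulting string would then be sent to the model; parsing its answer back into labels is a separate step.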

491
00:56:07.410 --> 00:56:11.780
David Bau: We did the labels here, but we didn't give the example. We didn't do it.

492
00:56:12.210 --> 00:56:24.740
David Bau: But this… yeah, we didn't do it like that, but we could try that. Yeah, I think that's… that's a typical way that people are doing it now, right? They say, oh, here's some really high-quality ones, now that you've seen these examples, and

493
00:56:24.740 --> 00:56:35.959
David Bau: And they would also, like, write a long set of definitions above, like the kind of thing you probably talked about here. They'll say, this is what I'm looking for. For example, here's 10 examples of it being done well.

494
00:56:36.250 --> 00:56:42.619
David Bau: And then here's, here's, like, 10 things that I would like you to label. And then you could try to double-check it in a few ways.

495
00:56:42.660 --> 00:56:59.230
David Bau: you could try labeling the same thing twice with two different prompts, or two different sets of examples, see if it comes out the same. If it's very different, then you might not trust that one. But if it's pretty consistent, then you might trust it. You can ask a second LLM to do it. You could ask a second LLM to check your work.

496
00:56:59.360 --> 00:57:13.689
David Bau: You know, here's a mix of good ones and bad ones, can you tell me which ones are inaccurate? Can I pick up the bad ones? You know, that type of thing. So you can come up with, like, ways of doing it and double-checking it to keep the quality high.
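
The double-checking idea above, labeling the same items twice with different prompts (or different LLMs) and only trusting agreements, can be sketched as a simple consensus filter; the function names are hypothetical.

```python
def consensus_labels(labels_a, labels_b):
    """Keep an auto-label only when two independent labeling runs agree;
    disagreements are flagged for manual review."""
    kept, flagged = {}, []
    for i, (a, b) in enumerate(zip(labels_a, labels_b)):
        if a == b:
            kept[i] = a
        else:
            flagged.append(i)
    return kept, flagged

def agreement_rate(labels_a, labels_b):
    """Fraction of items the two runs labeled identically."""
    same = sum(a == b for a, b in zip(labels_a, labels_b))
    return same / len(labels_a)
```

A low agreement rate overall would suggest the prompt or definitions need work before trusting any of the auto-labels.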

497
00:57:13.860 --> 00:57:19.260
David Bau: Anyway, so… but I like… it's a cool problem, it's like a cool labeling problem, like…

498
00:57:19.500 --> 00:57:36.240
David Bau: Perez would love this problem, right? You know, Ethan Perez, who wrote the paper about… Yeah, all this stuff, right? Yeah, great, yes, right, so… but, you know, he's written a bunch of papers on how to use LLMs to sort of amplify data set labeling like this, and

499
00:57:36.710 --> 00:57:41.850
David Bau: And so, I think it's, you know, these kind of methods Seem like it's…

500
00:57:42.240 --> 00:57:43.799
David Bau: It's the kind of thing you want to try.

501
00:57:44.260 --> 00:57:47.240
David Bau: So I wouldn't give up.

502
00:57:47.550 --> 00:57:55.150
David Bau: I think that the problem that you have is a great problem, and it's, like, an interesting research contribution to figure out how to auto-label.

503
00:57:55.410 --> 00:58:01.229
David Bau: the stuff. One thing we're thinking of is not SATA empowered, so…

504
00:58:01.480 --> 00:58:10.730
David Bau: Oh, yeah, you can go… I don't know if we have it, maybe. So there's a lot of… well, this is the labeling that we got from prompting the definition into Llama?

505
00:58:11.220 --> 00:58:27.999
David Bau: But it is the case that there are very few cases of high uncertainty; there's a lot of no uncertainty. Nobody… none of the CEOs want to admit high uncertainty, yeah. Yeah, exactly. And they mostly show up in the Q&A, which is why we're only doing the Q&A, not the opening statements.

506
00:58:28.000 --> 00:58:34.149
David Bau: But that's okay, I mean, you have… you have data set and balance it. Like, if you need to… if you need to balance it later, obviously, you can sample.

507
00:58:34.360 --> 00:58:40.429
David Bau: you know, more equally. But it might be, like, do you believe, like, maybe the real data is like this, right?

508
00:58:40.550 --> 00:58:52.019
David Bau: So this is the result. Yeah, so when I did it manually, I also got kind of, like, the same thing. I got a lot of no uncertainty, I got a lot of… some more… so this is very similar to what… can you go next?

509
00:58:52.140 --> 00:58:54.689
David Bau: I did, like, manually, I did, like, this.

510
00:58:54.900 --> 00:59:02.720
David Bau: And I got mostly no uncertainty, some intermediate, and very few high, so that's sort of what the data looked like. I think that's okay.

511
00:59:03.210 --> 00:59:04.060
David Bau: Yeah.

512
00:59:04.390 --> 00:59:11.809
David Bau: You know, it can be more challenging to train a model with dataset imbalance if you're trying to do a classifier.

513
00:59:12.060 --> 00:59:15.849
David Bau: You know, you can ask all your machine learning friends.

514
00:59:15.960 --> 00:59:19.549
David Bau: for techniques for how to deal with that, right?
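
One standard recipe for the imbalance being discussed is inverse-frequency class weights (what your machine learning friends would likely suggest first); a minimal sketch, with the weight formula matching scikit-learn's "balanced" convention:

```python
from collections import Counter

def class_weights(labels):
    """Inverse-frequency weights: rare classes (like 'high uncertainty')
    get proportionally larger weight in the training loss."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}
```

These weights would then be passed to the classifier's loss; resampling the rare class is the other common option mentioned above.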

515
00:59:19.780 --> 00:59:26.249
David Bau: But but I wouldn't worry about it. I think that's cool. But the real question is, you've got this vast amount of data.

516
00:59:26.520 --> 00:59:34.330
David Bau: And you're trying to figure out how to use as much of it as you can. And so, see if you can auto-label more of it. It would be great.

517
00:59:36.710 --> 00:59:39.590
David Bau: Eric, would you go back to the prompt slide?

518
00:59:40.560 --> 00:59:43.229
David Bau: Yeah, so just a little bit about,

519
00:59:44.390 --> 00:59:50.069
David Bau: experiment, so, like, we… using this dataset and the 61 samples that Veronica

520
00:59:50.220 --> 00:59:57.020
David Bau: labeled. We basically tested out, like, how models per…

521
00:59:57.780 --> 01:00:07.270
David Bau: formed this task of identifying, like, the levels of uncertainty using the same labels, like, those, the three-way labeling, no uncertainty, intermediate, and high.

522
01:00:07.370 --> 01:00:20.380
David Bau: And this is, like, what we gave it, gave the prompt. So, basically, we gave it the definition, and I think one thing here is that we were thinking of how, like, because

523
01:00:20.520 --> 01:00:23.909
David Bau: We kind of want to, like, one interesting research question is.

524
01:00:24.090 --> 01:00:27.459
David Bau: How the model, classified, like.

525
01:00:27.700 --> 01:00:41.269
David Bau: different, like, so, like, it distinguished between uncertainty versus the first moment, like, general sentiment of this earnings call. So, like, if this earnings call is, like, pessimistic, that doesn't mean

526
01:00:41.410 --> 01:00:47.900
David Bau: Well, it's bad, but, like, that's just the first moment. The uncertainty measure is, like, independent of that, so we sort of want to…

527
01:00:47.930 --> 01:01:03.110
David Bau: know how models understand that distinction. So, we give it definition and explicitly, like, saying that do not treat positive, negative sentiment, or clear numerical… numerical guidance as uncertainty by itself. So, something like that.

528
01:01:03.320 --> 01:01:10.980
David Bau: And then we just tested out, like, how the model is performing the task. Okay, I'm supposed to say thank you very much.

529
01:01:11.220 --> 01:01:17.709
David Bau: Sorry, either… so, but you can put up, like, are there other things that you want feedback on?

530
01:01:18.070 --> 01:01:21.810
David Bau: Can you go ahead? What are the key things you want feedback on?

531
01:01:22.260 --> 01:01:31.839
David Bau: What kind of feedback? We just, yeah, we need mostly the labeling. And this is another attempt at labeling, and then you tried this, and then how did it do?

532
01:01:32.240 --> 01:01:33.540
David Bau: Keep going.

533
01:01:35.330 --> 01:01:38.270
David Bau: What's this? This is a rock. This is just a disk.

534
01:01:38.620 --> 01:01:39.670
David Bau: Okay, cool.

535
01:01:39.910 --> 01:01:56.830
David Bau: This is just, this is not the labels, this is the, like, model's prediction on a model prediction, right? How good the model is at. So you, like, labeled 60, right? So you could hold out, like, you know, 30 of your labeled things, and you could ask the model to auto-label your holdouts to see if it agrees.
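
The holdout check suggested here, auto-label the held-out hand-labeled items and measure agreement, can be sketched with raw agreement plus Cohen's kappa (which corrects for chance agreement on an imbalanced label distribution); the helper name is hypothetical.

```python
def holdout_agreement(gold, predicted):
    """Raw agreement and Cohen's kappa between hand labels (gold) and
    the model's auto-labels on the same held-out items (both lists)."""
    assert len(gold) == len(predicted)
    n = len(gold)
    # Observed agreement: fraction of items labeled identically.
    po = sum(g == p for g, p in zip(gold, predicted)) / n
    # Chance agreement from each side's label frequencies.
    cats = set(gold) | set(predicted)
    pe = sum((gold.count(c) / n) * (predicted.count(c) / n) for c in cats)
    kappa = (po - pe) / (1 - pe) if pe < 1 else 1.0
    return po, kappa
```

With mostly "no uncertainty" labels, raw agreement can look high even for a useless labeler, which is why kappa is worth reporting alongside it.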

536
01:01:56.830 --> 01:02:09.929
David Bau: your thing, right? So you can… right, so you can kind of see how the auto-labeling is working. I'd be… I'd be kind of… you know, you're at this thing where you've got this massive piece of data, and you're trying to figure out how to deal with it, and it's its own little mini research.

537
01:02:10.030 --> 01:02:29.409
David Bau: then you can treat it as its own little problem. It might be enough for a whole paper just labeling this data, but it wouldn't be the point of this class. The point of the class is going to be to use that data for other things. Yeah. But yes, it's a serious problem. It's worth, you know, trying a few different things like this.

538
01:02:29.850 --> 01:02:32.299
David Bau: And then measuring… measuring how you're doing.

539
01:02:32.650 --> 01:02:39.480
David Bau: yeah, we'll try a few different things. Yeah. Any other suggestions for this team?

540
01:02:39.670 --> 01:02:48.470
David Bau: Now, do you guys… okay, go ahead. Yeah, I don't… this is… this is more just kind of, like, an open question, maybe for everybody. Like, how did you decide

541
01:02:48.900 --> 01:03:05.900
David Bau: 3 labels instead of just, like, yes, no on the one hand, or you could have 4 labels, you could have 10 labels, like, how did you end up with… Ideally, we wanted some sort of continuous measure, but we thought that would be too broad. So initially, when you did your experiment, we had asked it to give it, like, a score from 0 to 1.

542
01:03:06.740 --> 01:03:08.040
David Bau: But,

543
01:03:08.600 --> 01:03:17.030
David Bau: We were thinking of, so, like, this is on the chunks of question-answer pairs. So, like, an earnings call consists of, like,

544
01:03:17.510 --> 01:03:34.099
David Bau: many chunks. So, like, we could sort of aggregate those three-way classifications into, like, a continuous measure. We just wanted to be very conservative, so we wanted as clear a label as possible, because we wanted it to be very clear what it was.
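
The aggregation idea here, turning per-chunk three-way labels into a continuous per-call measure, could look like this; the 0 / 0.5 / 1 scores are an assumed mapping, not something stated in the discussion.

```python
# Assumed numeric mapping for the three-way labels.
SCORE = {"no uncertainty": 0.0,
         "intermediate uncertainty": 0.5,
         "high uncertainty": 1.0}

def call_uncertainty(chunk_labels):
    """Mean chunk score: a continuous 0-1 uncertainty measure for one
    earnings call, given the labels of all its Q&A chunks."""
    return sum(SCORE[label] for label in chunk_labels) / len(chunk_labels)
```

Since most chunks land in "no uncertainty", the per-call score would typically sit near zero, which is consistent with the distributions shown on the slides.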

545
01:03:34.530 --> 01:03:39.729
David Bau: You know, so when you make more categories, it becomes, like, what does it mean, higher and lower?

546
01:03:40.230 --> 01:03:56.060
David Bau: Yeah, so that's definitely an issue that we're thinking about. You could also just… I think it'd be pretty easy to, like, go from this to yes, no. Yeah, but yeah, we thought that, too. And then just, like, try it and see if you get better performance, if you get different.

547
01:03:57.710 --> 01:04:05.750
David Bau: But on the other hand, right, like, on the same spirit, there's almost nothing to be lost if you're auto-labeling to ask an LM

548
01:04:05.960 --> 01:04:08.770
David Bau: To give you a rating from 0 to 100.

549
01:04:09.260 --> 01:04:22.400
David Bau: And, and then, and then, you know, and then you might be like, okay, well, actually, it doesn't have very good resolution, so I'm gonna round it down to yes, no, like that. Like, there's almost nothing to be lost from just trying. The,

550
01:04:22.550 --> 01:04:36.500
David Bau: you can… you can do things, like, you can say… you can… you can have the different LMs to see if they agree with it. Like, oh, you get a rating from 1 to 100, then you ask the LM a different way. Which one's more uncertain? You know, and then you give it two examples.

551
01:04:36.680 --> 01:04:43.149
David Bau: And if it agrees, then it means it's got some consistency, or consistency with another LM, maybe you believe it more.
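
The 0-to-100 rating plus the pairwise cross-check described here can be sketched as two small helpers; both names are hypothetical, and the regex assumes the model answers with a bare number somewhere in its reply.

```python
import re

def parse_rating(response):
    """Pull the first integer 0-100 out of a model's free-text reply;
    returns None if no number is found."""
    m = re.search(r"\b(100|\d{1,2})\b", response)
    return int(m.group(1)) if m else None

def ratings_consistent(score_a, score_b, pairwise_winner):
    """Check that two numeric ratings agree with a separate pairwise
    judgment ('a' or 'b': which chunk the model called more uncertain)."""
    if score_a == score_b:
        return True
    return (score_a > score_b) == (pairwise_winner == "a")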

552
01:04:43.370 --> 01:04:48.719
David Bau: But if it's not very internally consistent, then maybe you believe it less. You can do different…

553
01:04:49.170 --> 01:04:59.740
David Bau: different things. I've been surprised, you know, some research going on, like David Atkinson in the lab is, just asking for numbers from 1 to 100,

554
01:04:59.870 --> 01:05:07.460
David Bau: on these things, and it's nice. It gives him the resolution to be able to check if the models

555
01:05:07.630 --> 01:05:16.820
David Bau: Like, the models, you know, have big error bars on a scale of 1 to 100, but by getting these numbers, it allows them to fit models to it that

556
01:05:17.010 --> 01:05:28.260
David Bau: you know, like, you can do linear regression instead of just yes-no assessments and things like that. And so, so it gives them a few more tools that you can use.

557
01:05:29.690 --> 01:05:31.150
David Bau: So it's a reasonable question.

558
01:05:33.810 --> 01:05:42.290
David Bau: Yeah, so… is this… is this interesting to people? Is there anything that, like, do you see, like, where this might go for a research paper in the end?

559
01:05:44.890 --> 01:05:46.310
David Bau: Where is this going?

560
01:05:48.130 --> 01:05:48.800
David Bau: Yeah.

561
01:05:49.380 --> 01:05:52.090
David Bau: Are you focused on speaker uncertainty?

562
01:05:52.740 --> 01:05:55.109
David Bau: Yes. Like, the first, like, interpreting.

563
01:05:55.240 --> 01:05:57.659
David Bau: Yes, I think we are.

564
01:05:58.590 --> 01:06:03.710
David Bau: Because I… but… Versus… Yeah.

565
01:06:04.190 --> 01:06:09.420
David Bau: She's like… That, like, that person feels uncertain.

566
01:06:09.730 --> 01:06:29.400
David Bau: Based on the way they're talking right now. You know, how, like, the situation at all is uncertain, or, like, just the way they word it is confusing, like, I'm not certain of what they're saying. I guess that came up in the first presentation, whether this has something to do with how the model represents uncertainty itself. Someone mentioned uncertainty research that we have to review.

567
01:06:29.830 --> 01:06:34.199
David Bau: But yeah, that's definitely, yeah, that we… that's so many questions.

568
01:06:36.010 --> 01:06:44.460
David Bau: There's a different… have you considered… like, okay, I'll start. I'll let you make your suggestion. No, I was just gonna say, like, I think…

569
01:06:44.510 --> 01:07:03.419
David Bau: it is interesting that, like, the context of the earnings calls, where there's sort of this reputation management thing, where it's like, you don't want to say, like, like, oh, we're definitely going to fail and go bankrupt, like, you know what I was like, you know, it's really uncertain about just all the changes, and, like, I think that they might not be uncertain at all. Yeah, but there's also… there's also an incentive to,

570
01:07:03.420 --> 01:07:22.910
David Bau: provide some sort of honesty, usually, because, usually if you don't… if you lie to a shareholder, they… they don't like it. Yes, that makes sense. So there's also, like… but definitely they try to be very… that's why we have very little examples of high uncertainty. So this… right, exactly. So I was about to say the same thing as you, but in different ways, but in more stark terms. There's this question of honesty. Some management is more honest than others.

571
01:07:23.000 --> 01:07:32.139
David Bau: And you have interesting ground truth because you're in the financial markets, so you can see when management is actually misrepresenting the situation by looking at the next quarter.

572
01:07:32.290 --> 01:07:35.950
David Bau: And so, so I'm, I'm really, like, so…

573
01:07:36.310 --> 01:07:44.030
David Bau: if you were… if you were a quantitative trader training your model, you would actually do this. You would… you'd be training your models

574
01:07:44.140 --> 01:07:54.060
David Bau: On looking at not just measuring uncertainty, but seeing if you can tell from their verbal cues and so on, whether management is obscuring the truth.

575
01:07:54.270 --> 01:08:05.159
David Bau: And something like that. So I'd be kind of, like, I wonder if that's the kind of thing that you'd want to try to label as well, or if you want to label things based on financial ground truth or anything like that.

576
01:08:07.220 --> 01:08:17.190
David Bau: Anyway, yes, yes. Anyways, it's a thing that comes to mind when you show me, like, the actual text, and you tell the stories.

577
01:08:17.420 --> 01:08:22.199
David Bau: That there might be, and maybe even a third dimension beyond uncertainty.

578
01:08:22.540 --> 01:08:25.899
David Bau: That might be interesting to… Try to label.

579
01:08:26.250 --> 01:08:36.290
David Bau: Yeah, we're open to ideas. Like, economic activities, how, sort of, that correlates with… Yes, or… Or management honesty, even.

580
01:08:37.000 --> 01:08:39.770
David Bau: So, yeah, it's, like, super interesting.

581
01:08:40.729 --> 01:08:44.940
David Bau: So… I think that a lot of people are actually deeply interested in this.

582
01:08:45.180 --> 01:08:47.469
David Bau: So, what people are deeply interested in

583
01:08:47.910 --> 01:08:49.790
David Bau: Will an LM lie to you?

584
01:08:50.470 --> 01:08:52.800
David Bau: But people are also deeply interested in

585
01:08:53.790 --> 01:08:56.139
David Bau: Does an LM know if you're lying to it?

586
01:09:03.220 --> 01:09:15.689
David Bau: What would be the ground truth for that? Well, you can see the financial statements. Yeah, so they can say, hey, you know, we booked a lot of sales, and then you look at the next quarter, and they haven't booked a lot of sales. They're lying.

587
01:09:15.990 --> 01:09:21.650
David Bau: But you can… but you know the future. Yeah, because we have… Right, because you have the future. The old data, yes. Right.

588
01:09:24.010 --> 01:09:33.250
David Bau: I think you probably should also be careful of, like, one, like, these are public companies, so maybe the LMs do know the outcomes.

589
01:09:33.350 --> 01:09:43.320
David Bau: Perhaps, you know, or, like, know how… Oh, they memorized them, so you might have to anonymize it. And also, the, like, they know the history. Differentiate between…

590
01:09:43.500 --> 01:10:00.380
David Bau: the LM telling how… seeing, like, seeing how certain the person is, or the LM's sort of also imposing its own determination over, like, what it's saying. They're like, are you here to double your sales? Like, I doubt it. Yeah, I know… I know this company went out of business. Right. Because the LM.

591
01:10:00.730 --> 01:10:08.999
David Bau: Or even if they didn't know about the company at all, just, like, the… like, the cr… like, the… like, what they're… the content of what they're saying, like, the more…

592
01:10:09.280 --> 01:10:22.329
David Bau: I don't know, like, boastful or, like, unreasonable, it sounds, like… I see, so that's a question. If you're looking… are you looking for linguistic cues or societal cues? Like, if you're only… if you want to make sure you're trying to understand linguistic processing.

593
01:10:22.420 --> 01:10:29.569
David Bau: Do you need to make an anonymized version of this dataset, where you go through and you say, change all the company names to fictitious company names.

594
01:10:29.720 --> 01:10:31.750
David Bau: And things like that, so that you can't tell.

595
01:10:32.050 --> 01:10:46.540
David Bau: what it's about, right? Like, does that make it meaningless? Because now you can't read it, because the way that people talk about it, you would have to know this is a car company, but if you change the name of the company, you can't tell anymore. But it's kind of an interesting question.
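
The anonymization control being discussed, swapping real company names for fictitious ones so the model can't lean on memorized outcomes, could be sketched as whole-word, case-insensitive substitution; the name map here is made up for illustration.

```python
import re

def anonymize(text, name_map):
    """Replace known company names with fictitious ones (whole-word,
    case-insensitive), a simple control against the model recognizing
    public companies whose outcomes it may have memorized."""
    for real, fake in name_map.items():
        text = re.sub(rf"\b{re.escape(real)}\b", fake, text,
                      flags=re.IGNORECASE)
    return text
```

As noted in the discussion, this only hides the name; industry-specific language may still reveal what kind of company it is.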

596
01:10:46.740 --> 01:10:49.299
David Bau: I like it. Oh, fascinating.

597
01:10:49.670 --> 01:10:50.920
David Bau: Fascinating question.

598
01:10:51.100 --> 01:10:53.680
David Bau: It might be something that you need to control for.

599
01:10:56.540 --> 01:11:04.109
David Bau: Yeah, going back to the definition of uncertainty, I'm curious if you looked at, like, uncertainty taxonomies?

600
01:11:05.910 --> 01:11:13.670
David Bau: Right, what does that mean? Like, there are different kinds of uncertainty, like, aleatoric or, like, inherent uncertainty. Like, it seems like a lot of…

601
01:11:13.870 --> 01:11:23.570
David Bau: the statements that you show are, like, oh, this is… Have you heard of these? Yeah. Will you send them, will you send someone to the Discord? Yeah. Alright, I'm gonna go on to the next team, but that, I think, is a really great comment.

602
01:11:24.070 --> 01:11:28.210
David Bau: Yes, lots of… Yes. Thank you, thank you.

603
01:11:29.320 --> 01:11:39.960
David Bau: Who's X team? There's no team after S and Teams. There's no team after you? Team B. Team B, okay, Team B. We're the B team. Team Al, you renamed yourself, but you didn't change your folder.

604
01:11:40.660 --> 01:11:41.530
David Bau: Okay.

605
01:11:41.630 --> 01:11:42.849
David Bau: Are you the last name?

606
01:11:44.070 --> 01:11:52.459
David Bau: Don't know. No, there was Team M also, and there's other Team M. Oh, are you Team M? Okay, okay. So, Team… Team… Team L, or B,

607
01:11:53.300 --> 01:12:00.270
David Bau: I had to change our name again. Okay, yes, who knows? Team Pencil and…

608
01:12:00.510 --> 01:12:04.080
David Bau: That was a very good name for a team, Team Excellent, I like that.

609
01:12:04.180 --> 01:12:05.290
David Bau: embedded.

610
01:12:06.730 --> 01:12:19.729
David Bau: Okay. Yes. So what we tested out since Tuesday… It's been a long path. I know, it's been so long ago, it's been so long ago. The question is, like, how much of…

611
01:12:19.860 --> 01:12:33.390
David Bau: sycophancy is actually… sycophancy, like, the assistant trying to manipulate the user, or is it just, like, next-word prediction? And one thing we were thinking of testing is, like, okay, let's swap the roles, so that I'm gonna have the assistant

612
01:12:33.630 --> 01:12:36.380
David Bau: say, for Llama 70B instruct.

613
01:12:36.560 --> 01:12:51.030
David Bau: hey, to the user, I'm curious to get your opinion on this political question, and for some reason, I'll just inject that, like, the assistant says, I'm liberal, or I'm conservative, and see just… and then when it predicts for the user's role.

614
01:12:51.260 --> 01:12:53.339
David Bau: Does it predict that the user

615
01:12:53.570 --> 01:13:00.470
David Bau: will match the assistant's. Oh, to see if the user would be sycophantic. That's interesting. This is… I don't know.
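
The role-swap experiment described here amounts to building a conversation where the assistant declares a stance and asks a question, then generating the user turn. This sketch only constructs the messages; actually decoding a user turn depends on your model's chat template and library, so that part is left out.

```python
def role_swap_messages(stance, question):
    """Build a conversation where the *assistant* declares an ideology and
    asks for an opinion. Feeding this to a chat model and generating the
    next (user) turn tests whether the model predicts a 'persuadable' user
    who echoes the assistant's stated stance."""
    return [
        {"role": "assistant",
         "content": f"I'm {stance}. I'm curious to get your opinion: {question}"},
        # The turn the model is asked to fill in during the experiment.
        {"role": "user", "content": ""},
    ]

msgs = role_swap_messages("conservative", "Should taxes be lower?")
```

Varying `stance` between "liberal" and "conservative" while holding the question fixed gives the comparison shown on the slide.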

616
01:13:00.490 --> 01:13:18.949
David Bau: That's a funny experiment. Yeah. Sure, why not? Because these models were fine-tuned on the user parts of these exchanges as well. And what we see is that the user is super sycophantic as well, so, if the assistant says, you know… You can't call it sycophantic, you have to say persuadable, right? Because that would be the…

617
01:13:19.170 --> 01:13:24.760
David Bau: Like, models think that persuasion we were discussing about… Models think that the users are super persuadable. Persuadable.

618
01:13:25.900 --> 01:13:33.089
David Bau: So, yeah, so we see the same kind of pattern, where if the assistant says, I strongly disagree with this question, then

619
01:13:33.200 --> 01:13:38.320
David Bau: Then the user… it predicts that the user, like, says that it also strongly disagrees most of the time.

620
01:13:38.600 --> 01:13:39.790
David Bau: Yeah, it's actually…

621
01:13:40.350 --> 01:13:55.439
David Bau: I should have put up the comparison, but it looks as though it's even kind of more persuadable than the assistant. And we also see this along the liberal and conservative axes, where, yeah, this column is, like, I'm conservative.

622
01:13:56.140 --> 01:14:02.609
David Bau: So, this might just be super out of distribution for Llama, and it's just a… Oh, no, it's really interesting, that's cool. It's a good experiment.

623
01:14:04.690 --> 01:14:12.130
David Bau: So I did a couple of experiments, picking Qwen as a Chinese

624
01:14:12.660 --> 01:14:17.369
David Bau: model, and Gemma as an American model, to compare the

625
01:14:17.570 --> 01:14:20.140
David Bau: To see the provenance effect, origin effect.

626
01:14:20.290 --> 01:14:27.500
David Bau: On, and asking, are economic and social conservatism or liberalism separable in different…

627
01:14:27.640 --> 01:14:32.389
David Bau: In LLMs from different origins, and does this vary by…

628
01:14:32.820 --> 01:14:45.640
David Bau: I picked Gemma, and two prompts: abortion, for the social dimension of ideology, and taxation, to see whether economic ideology

629
01:14:46.430 --> 01:14:49.820
David Bau: exist, and… How sincere this may look.

630
01:14:52.260 --> 01:14:52.990
David Bau: Coastal.

631
01:14:53.630 --> 01:14:59.869
David Bau: And this is… the abortion graph. It's kind of a mess, but…

632
01:15:00.150 --> 01:15:03.039
David Bau: You see there is abortion.

633
01:15:03.180 --> 01:15:05.620
David Bau: Domain knowledge in the middle.

634
01:15:05.720 --> 01:15:20.489
David Bau: and abortion legal texts, regulation, etc, etc, but it triggers social issues, controversial issues, and LGBTQ. So, interestingly, ISIS and Adolf Hitler is…

635
01:15:20.640 --> 01:15:26.370
David Bau: I don't know, I didn't know what his stance was on this issue. Interestingly, it triggers this kind of…

636
01:15:26.630 --> 01:15:32.490
David Bau: very extremist, both religious extremists and kind of nationalist extremist.

637
01:15:32.860 --> 01:15:35.880
David Bau: Kind of, features.

638
01:15:36.200 --> 01:15:40.670
David Bau: And I kind of want to understand whether this is a kind of American topic.

639
01:15:41.100 --> 01:15:42.219
David Bau: Then…

640
01:15:42.670 --> 01:15:45.940
David Bau: feminism issues. These are all social.

641
01:15:46.170 --> 01:15:47.080
David Bau: kind of…

642
01:15:47.230 --> 01:15:56.740
David Bau: dimensions, but there is no economic dimension here, and this means the economic and social dimensions are separable from each other.

643
01:15:57.570 --> 01:16:02.699
David Bau: And there's pro-life, bundle here.

644
01:16:03.150 --> 01:16:07.189
David Bau: This means the model understands, kind of,

645
01:16:07.680 --> 01:16:18.969
David Bau: different aspects of, kind of, ideology, whether it is conservative or liberal, there are some responsible features. What technology did you use to make this graph? The

646
01:16:19.350 --> 01:16:20.490
David Bau: Thanks, I just started.

647
01:16:21.320 --> 01:16:27.240
David Bau: All visual cycles… And what algorithm in Neuronpedia? Which one did they… what was this? It was the,

648
01:16:27.530 --> 01:16:33.230
David Bau: There are many other country, the… that I used…

649
01:16:36.050 --> 01:16:45.510
David Bau: This is the transcoder circuits? This is the transcoder, so this is not Michael Hannah's circuit? Yeah, that's… I was… This is Michael Hannah's thing. Okay, okay, I had only one option, so…

650
01:16:45.680 --> 01:16:48.160
David Bau: Okay, yeah, yeah, yeah, we should… Yes.

651
01:16:48.630 --> 01:16:56.200
David Bau: I'm not familiar… I've never used it. I've heard of it, I've talked a lot about it with people, but I've never used it myself, so that's why I'm asking. So this is,

652
01:16:56.430 --> 01:16:59.510
David Bau: the steering experiment.

653
01:16:59.650 --> 01:17:02.470
David Bau: Because in the previous,

654
01:17:02.890 --> 01:17:16.390
David Bau: In the graph, there is, I said, pro-life, and it also understands abortion and guns as related topics, and bundles them together.

655
01:17:16.760 --> 01:17:18.249
David Bau: And I kind of…

656
01:17:18.450 --> 01:17:25.870
David Bau: experimented on a couple of different features, and I was suspicious of it, but…

657
01:17:26.180 --> 01:17:33.930
David Bau: Interestingly, abortion and… or guns feature, is kind of responsible for stance.

658
01:17:34.250 --> 01:17:44.069
David Bau: But domain features are not kind of responsible for stance, as a stance identifier.

659
01:17:44.400 --> 01:17:47.760
David Bau: And it changed the leftist model

660
01:17:47.950 --> 01:17:51.759
David Bau: basically into a conservative model. I say this is…

661
01:17:52.290 --> 01:17:58.500
David Bau: Just one, just one feature change. Just here, that one feature. Yeah, just one feature change that I do.

662
01:17:58.620 --> 01:18:00.650
David Bau: Interesting. And this is the kind of…

663
01:18:00.950 --> 01:18:07.430
David Bau: I need to dig into that feature, maybe it triggers… Additional features, but…

664
01:18:07.600 --> 01:18:15.359
David Bau: It basically, applies to all social, kind of, conservatism or liberalism, because it bundles abortion and guns.

665
01:18:15.620 --> 01:18:24.829
David Bau: I have some suspicions on that. Maybe if I ask about LGBTQ question, this feature may turn…

666
01:18:25.060 --> 01:18:26.709
David Bau: Remember, all the way I can.

667
01:18:26.920 --> 01:18:27.940
David Bau: Also, vaginal.

668
01:18:29.760 --> 01:18:38.140
David Bau: And when we look at the abortion graph, this is kind of more structured, but still, they have social issues, they go with gay rights.

669
01:18:38.680 --> 01:18:54.359
David Bau: Different from Qwen, we have sensitive-issue features. Understand this is a sensitive issue. In the Qwen model, there was… I didn't see any kind of features titled as sensitive.

670
01:18:54.590 --> 01:19:00.050
David Bau: But Gemma labeled this as sensitive issues. It triggers immigration compared to guns.

671
01:19:00.240 --> 01:19:05.980
David Bau: And still, we have LGBTQ. This is the…

672
01:19:06.210 --> 01:19:14.530
David Bau: kind of a shared feature, but immigration, instead of the gun, Gemma 2 kind of triggered the immigration topics.

673
01:19:15.650 --> 01:19:25.249
David Bau: And morality. In the Qwen model, we don't see… I didn't see any kind of religious and moral kind of,

674
01:19:25.930 --> 01:19:32.200
David Bau: features, but Gemma 2 apparently has some moral kind of features.

675
01:19:32.430 --> 01:19:35.529
David Bau: To understand, kind of, ideology and

676
01:19:38.230 --> 01:19:43.730
David Bau: And LGBTQ is the one responsible for… stance.

677
01:19:43.950 --> 01:19:58.829
David Bau: as it can be labeled as a stance identifier compared to all others. I experimented on a couple of things. Some features turn the model into neutral, but not kind of strongly. LGBTQ, just one feature.

678
01:19:59.100 --> 01:20:01.790
David Bau: little, kind of, steering.

679
01:20:03.230 --> 01:20:07.750
David Bau: Minus 1 turns the model into a conservative model.
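
The single-feature steering described here is, mechanically, adding a scaled SAE decoder direction into the residual stream. A minimal sketch with synthetic tensors; the function name, shapes, and tensors below are hypothetical stand-ins, not the presenter's actual code:

```python
import numpy as np

def steer_with_feature(resid, decoder_dir, coeff):
    """Shift residual-stream activations along one SAE feature direction.

    resid:       (batch, seq, d_model) activations at the hooked layer
    decoder_dir: (d_model,) decoder column for the chosen feature
    coeff:       steering coefficient (e.g. -1 to push the stance the other way)
    """
    return resid + coeff * decoder_dir  # broadcasts over batch and seq

# Toy demonstration with random tensors; no real model or SAE involved.
rng = np.random.default_rng(0)
resid = rng.standard_normal((1, 4, 16))     # pretend residual stream
feature_dir = rng.standard_normal(16)
feature_dir /= np.linalg.norm(feature_dir)  # unit-norm feature direction

steered = steer_with_feature(resid, feature_dir, coeff=-1.0)
```

In a real experiment this edit would be applied inside a forward hook at the layer the SAE was trained on, with generation continuing from the modified activations.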

680
01:20:12.450 --> 01:20:30.419
David Bau: And this also tells us that social dimensions are kind of correlated to each other, and then, there is one kind of responsible… there can be one responsible kind of feature, turning all answers…

681
01:20:32.900 --> 01:20:35.739
David Bau: From, kind of, liberal to conservative.

682
01:20:37.330 --> 01:20:41.449
David Bau: So when it comes to taxation, this is an economic topic.

683
01:20:42.320 --> 01:20:47.989
David Bau: Qwen is also, still kind of scattered around, because it triggers…

684
01:20:48.140 --> 01:20:54.009
David Bau: Medical disease terms and legal terms, regulations, tariffs.

685
01:20:54.350 --> 01:21:01.760
David Bau: And interestingly, to understand this is kind of forward relations. Oh, sorry. All right, do you have anything else you want to comment on?

686
01:21:03.120 --> 01:21:05.779
David Bau: You know, I can quickly… just show one more thing.

687
01:21:05.980 --> 01:21:25.489
David Bau: Power is responsible for stance in Qwen, and when it comes to Gemma, Gemma triggers tons of things, and political theory and ideology is kind of responsible for… this is a general kind of feature. It turns the model, ideology, and…

688
01:21:25.600 --> 01:21:36.700
David Bau: Qwen is more separable than Gemma, and Qwen has explicit bundling, but Gemma 2 has implicit bundling between issues. LGBTQ is common, and model provenance

689
01:21:36.800 --> 01:21:51.789
David Bau: shapes not just what models believe, but how they recognize. Domain markers are not stance identifiers. Cool. Nice. So, any suggestions from people, if people are interested in what's going on there? Any suggestions?

690
01:22:02.280 --> 01:22:10.350
David Bau: Our team focuses on sycophancy, and then kind of domain knowledge experiments, and then tried to understand

691
01:22:10.660 --> 01:22:13.500
David Bau: The model provenance effects at the same time.

692
01:22:18.940 --> 01:22:20.550
David Bau: Want to hear my suggestions?

693
01:22:21.220 --> 01:22:25.209
David Bau: I always have opinions. Why do I have so many opinions? You guys, I'm sure you guys have opinions, too.

694
01:22:27.750 --> 01:22:28.779
David Bau: Any opinions?

695
01:22:28.880 --> 01:22:29.640
David Bau: Suggestion.

696
01:22:30.410 --> 01:22:34.139
David Bau: I guess just, trying to get back up to speed, so…

697
01:22:34.430 --> 01:22:42.870
David Bau: I guess you're studying sycophancy of models with respect to, like, politically divisive issues? Is that…

698
01:22:43.550 --> 01:22:45.890
David Bau: Yeah. But then the political issues.

699
01:22:48.440 --> 01:22:55.369
David Bau: Yeah, the framing that we went in was, like, what… how can we understand the political leaning of an LLM?

700
01:22:55.570 --> 01:23:01.539
David Bau: But then, we were realizing that that's sort of entangled with, like, is that even coherent?

701
01:23:01.710 --> 01:23:05.789
David Bau: But before we can dig into fully understanding, like.

702
01:23:06.880 --> 01:23:21.980
David Bau: the multi-dimensional sort of ideology that it might have. We have to see, like, it's in response to something else, like, it's in response to the user, it's in response to the context, and so that's why we're focusing on sycophancy as, like, the starting spot.

703
01:23:23.010 --> 01:23:25.170
David Bau: Simultaneously looking at. Right.

704
01:23:33.020 --> 01:23:39.700
David Bau: So I… so I… I think… so, like, I… I… so my… my worry… about the topic.

705
01:23:39.890 --> 01:23:42.690
David Bau: is that…

706
01:23:43.050 --> 01:23:56.480
David Bau: Oh, what was the joke that you made when you said, like, it was last week, or a couple weeks ago, when you said, oh, you know, I'm, something something liberal, whatever, and then you said, oh, it should say liberal, but it says, oh, I'm,

707
01:23:56.640 --> 01:24:01.050
David Bau: I'm… I'm a stereotype. 100% bad, right?

708
01:24:01.140 --> 01:24:10.550
David Bau: So my, my, my main fear about your work is that a lot of this stuff is so discussed and so much in the news that, you know, you, like.

709
01:24:10.590 --> 01:24:21.470
David Bau: like, there's… like, it's like the snow tracks that we have outside, right? You know, I worry that you're gonna fall into people's common thinking, and you're gonna be like, oh, you know, there's sycophancy, and…

710
01:24:21.470 --> 01:24:33.410
David Bau: And certain stereotypes line up, and the model, like, reinforces the stereotypes, it's just the same message, and we've seen this paper a lot. It's like another one. And then you'll be like, oh, we'll find the mechanisms of stereotypes, but, like, okay, so…

711
01:24:33.480 --> 01:24:41.860
David Bau: So how do you avoid this, right? And so, so one… one thing that'd be… that could spice it up would be interesting.

712
01:24:42.150 --> 01:24:49.540
David Bau: Might be… to try to develop datasets or ways of probing

713
01:24:49.970 --> 01:24:53.199
David Bau: For behavior that people don't think

714
01:24:53.780 --> 01:24:56.069
David Bau: Is related to politics at all.

715
01:24:56.180 --> 01:24:57.340
David Bau: Does that make sense?

716
01:24:57.570 --> 01:25:08.030
David Bau: Like, I don't know. Like, like the answer to, like, what seemingly objective questions, or maybe judgment questions that don't seem to have anything to do with politics, like…

717
01:25:08.370 --> 01:25:12.179
David Bau: You know, what kind of shoes should I buy, or something like that? Who knows, right?

718
01:25:12.320 --> 01:25:16.450
David Bau: But, like, things that totally… Like…

719
01:25:16.600 --> 01:25:21.939
David Bau: Shouldn't be related to your stance on these, you know, controversial ethical issues.

720
01:25:22.240 --> 01:25:27.680
David Bau: But it'd be interesting to see if there are, like, strong

721
01:25:28.110 --> 01:25:42.639
David Bau: And it's easy for models to predict the same thing, and then you could go in and say, why is that? Why is it that the models predict that, yes. Seeing that, like, sycophancy is a general thing that's, like, not just…

722
01:25:43.220 --> 01:25:58.359
David Bau: specific to… Well, so you had this opposite thing, right, which was, hey, I wonder if there's, like, hidden cues for sycophancy. Like, if I say that I wear a certain kind of shoe, and I like to read a certain kind of book.

723
01:25:58.490 --> 01:26:04.310
David Bau: Is that… is that gonna tip the model off? Like, will it know, you know, that I'm a closet liberal?

724
01:26:04.540 --> 01:26:05.380
David Bau: Right?

725
01:26:05.580 --> 01:26:11.520
David Bau: And I'm sort of asking, well, how far can you push it? And can you push it in the opposite direction?

726
01:26:11.770 --> 01:26:18.570
David Bau: But, like, you know, how much… how much of the spectrum of behavior

727
01:26:18.810 --> 01:26:25.319
David Bau: is determined by these very, very strong latents, where you say, oh, you find this one latent thing, it completely flips

728
01:26:25.520 --> 01:26:37.629
David Bau: the behavior of the model. What flips? You know, is it, you know, is it, besides these political beliefs, is it a whole pattern of behavior, including things that people wouldn't expect right now?

729
01:26:37.960 --> 01:26:44.149
David Bau: And, and the different LLMs agree, and is there a mechanism for why that is?

730
01:26:44.400 --> 01:26:53.399
David Bau: What is the… the concern is by focusing just on… So, I think the most interesting thing

731
01:26:53.650 --> 01:27:05.230
David Bau: the, for, from my point of view, is… your idea… Presenting…

732
01:27:05.590 --> 01:27:09.650
David Bau: Signals that are hidden, that are, like, less obvious.

733
01:27:09.940 --> 01:27:18.009
David Bau: You know, when you say, hey, oh, I'm from Atlanta, or I like to eat this kind of food, or something like this, right? And say, oh, and the model, like, pigeonholes you.

734
01:27:18.120 --> 01:27:22.570
David Bau: It's I-N-W-U-Y. Beautiful, great. I think that's interesting.

735
01:27:22.840 --> 01:27:30.640
David Bau: And so I'm… my suggestion, as you're developing your data sets and so on, is, like, how far…

736
01:27:31.280 --> 01:27:45.550
David Bau: Can you push that? Right, because, like, we're starting with demographic categories. You're starting with these demographic categories, and instead of, like, just continuing to circle around the demographic categories, can you amplify your data sets to, like, look at more of these unexpected things? Can you get all the way to…

737
01:27:45.670 --> 01:27:47.469
David Bau: Like, I don't know.

738
01:27:48.110 --> 01:27:48.870
David Bau: like…

739
01:27:49.100 --> 01:27:57.190
David Bau: you're a mathematician. Do you believe in the axiom of choice? Right? This is not a political decision, right? But it's also not…

740
01:27:57.500 --> 01:27:58.589
David Bau: One way or the other.

741
01:27:59.120 --> 01:28:08.240
David Bau: What if you're conservative? Do you believe in the axiom of choice? What if you're liberal? Do you believe in it? I don't know. Maybe it changes it. Yeah. Right?

742
01:28:08.290 --> 01:28:22.409
David Bau: I think the hard thing is, like, I think we're focusing on dem… like, we started with the demographics because there's ground truth data about, like, the correlations between these things, and so, like, I would worry, you know, maybe there's some, you know, these…

743
01:28:22.430 --> 01:28:41.489
David Bau: these LLMs have gobbled up so much data, like, maybe they know some pattern that I don't know about, and, like, it's really, like, accurately reporting it. Exactly. Yeah, so you can ground yourself in things where there's no data. But we could be, like, very obscure. You could, like, go and try to generalize that, and then you could also say.

744
01:28:41.580 --> 01:28:53.429
David Bau: it's… you could say, oh, it has this complete misconception. Obviously, people are not like this. But, you know, we see the mechanism for why the model has this misconception, or something like that.

745
01:28:53.710 --> 01:29:07.970
David Bau: Yeah, one thing I was also thinking about is, like, is there a hierarchy of, like, preferences where it's like, you know, say you describe, you know, you know, I support gun rights, I'm white, I'm high school educated, and I'm a flamin' liberal, you know, like, like, like, there's…

746
01:29:08.000 --> 01:29:25.390
David Bau: Contradiction here. Like, when does it, choose to go with person self-reported versus be like, you can't possibly be, you know? But I agree that, like, pushing it in a weirder direction makes sense. Yeah, yeah, yeah. That's, that's my main…

747
01:29:25.900 --> 01:29:28.120
David Bau: Does that make sense? It definitely makes sense.

748
01:29:29.330 --> 01:29:34.390
David Bau: Anyway, so that's my opinion. Okay. Oh, okay, we gotta keep on going. We gotta keep on going. What's the next team? E-Mam.

749
01:29:35.780 --> 01:29:36.620
David Bau: Okay.

750
01:30:04.260 --> 01:30:06.609
David Bau: Go, go, go, great, go! Okay, so,

751
01:30:06.780 --> 01:30:23.290
David Bau: This week, we actually focused more on engineering our benchmarks, such that, like, we have a good collection of datasets, and a good collection of the inferences as well. So, mainly we engineered our way through

752
01:30:23.420 --> 01:30:25.440
David Bau: Building that, codebase.

753
01:30:25.850 --> 01:30:30.189
David Bau: And we focused on two things, egocentric and,

754
01:30:30.390 --> 01:30:41.099
David Bau: allocentric, directions. So egocentric means, if I'm the agent, am I moving, up, down, left or right, or front to back, left or right? Something like that.
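
The egocentric/allocentric distinction can be made concrete in a few lines: egocentric commands are relative to the agent's heading, and mapping them onto allocentric grid coordinates requires tracking that heading. A toy sketch; the headings, move names, and row/column convention here are hypothetical, not the team's actual encoding:

```python
# Compass headings and the allocentric (row, col) displacement of one step.
HEADINGS = ["N", "E", "S", "W"]
STEP = {"N": (-1, 0), "E": (0, 1), "S": (1, 0), "W": (0, -1)}

def apply_egocentric(pos, heading, move):
    """Translate an egocentric command into allocentric grid coordinates.

    move: 'forward' keeps the heading; 'left'/'right' rotate, then step.
    Returns the new (row, col) position and the new heading.
    """
    i = HEADINGS.index(heading)
    if move == "left":
        heading = HEADINGS[(i - 1) % 4]
    elif move == "right":
        heading = HEADINGS[(i + 1) % 4]
    dr, dc = STEP[heading]
    return (pos[0] + dr, pos[1] + dc), heading

pos, heading = (2, 2), "N"
pos, heading = apply_egocentric(pos, heading, "forward")  # -> (1, 2), "N"
pos, heading = apply_egocentric(pos, heading, "right")    # -> (1, 3), "E"
```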

755
01:30:41.100 --> 01:30:51.270
David Bau: But allocentric means, if I'm in this, location, and if I want to go towards the door, which door am I talking about, and should I, move towards

756
01:30:51.270 --> 01:31:04.780
David Bau: or should I move towards the clock? Something like that. So, it means that now the model needs to have that special representation within it, such that it knows that clock is over there, 1 is here, and

757
01:31:04.780 --> 01:31:12.959
David Bau: Why is this challenging? As a language model, you are not given an image, you are given a string of some characters.

758
01:31:13.090 --> 01:31:18.879
David Bau: Like this one. So… You need to interpret, what kind of,

759
01:31:19.170 --> 01:31:38.060
David Bau: map representation do you have internally, such that you can navigate through the maze while also avoiding all these obstacles on the way. So, we don't have a lot of results this time, but we are working on the inference and how to speed that up.

760
01:31:38.610 --> 01:31:44.750
David Bau: So… what we… Part of doing is, you can, check out,

761
01:31:44.970 --> 01:31:49.100
David Bau: Will you make the screen big because it's really small fonts? Just press, press present.

762
01:31:50.940 --> 01:31:55.349
David Bau: What is… there's also… is there a photo with the document? I see a document. Okay, cool.

763
01:31:56.260 --> 01:31:59.630
David Bau: I think they should be fine. Yeah. So…

764
01:32:00.470 --> 01:32:18.530
David Bau: In… as the data, we are collecting all these attributes, like what type of grid is it? Grid world basically is a 2D grid, and we are also expanding into some 3D views and… we are also thinking of other geometries, but these are the baselines that we are working on.

765
01:32:18.650 --> 01:32:28.870
David Bau: Size, we can have, either 5 cross 5, 7 cross 7, or 5 cross 7 as well, which is not shown on the top image, but in the bottom one, you can see the…

766
01:32:29.120 --> 01:32:30.689
David Bau: Grid size is basically a string.

767
01:32:31.790 --> 01:32:45.689
David Bau: So yes, we are storing, all these data. This is, basically, we call it, the ground truth labels, rather than, generations. So…

768
01:32:46.170 --> 01:32:55.979
David Bau: We use basic computer science algorithms like BFS and DFS to search for the shortest path, and this is the storage of that ground truth.
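
The ground-truth shortest paths described here can be computed with a plain BFS over the character grid. A self-contained sketch; the '#'-for-wall encoding and the toy maze are assumptions, not necessarily the benchmark's actual format:

```python
from collections import deque

def bfs_shortest_path(grid, start, goal):
    """Breadth-first search over a 2D character grid.

    '#' cells are obstacles; moves are up/down/left/right.
    Returns the list of (row, col) cells on a shortest path, or None.
    """
    rows, cols = len(grid), len(grid[0])
    frontier = deque([(start, [start])])
    seen = {start}
    while frontier:
        (r, c), path = frontier.popleft()
        if (r, c) == goal:
            return path
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] != '#' and (nr, nc) not in seen):
                seen.add((nr, nc))
                frontier.append(((nr, nc), path + [(nr, nc)]))
    return None  # goal unreachable

# 5x5 toy maze: '.' free, '#' wall.
maze = [
    ".....",
    ".###.",
    ".....",
    ".###.",
    ".....",
]
path = bfs_shortest_path(maze, (0, 0), (4, 4))
print(len(path) - 1)  # number of steps on the shortest path -> 8
```

BFS on an unweighted grid is guaranteed to return a shortest path, which is what makes it suitable for generating ground-truth labels.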

769
01:32:57.560 --> 01:33:01.320
David Bau: In the inference, we collect the same things.

770
01:33:01.520 --> 01:33:03.420
David Bau: That we did for,

771
01:33:03.950 --> 01:33:09.239
David Bau: The data collection in terms of ground truth, and, we do the same thing for,

772
01:33:10.560 --> 01:33:20.129
David Bau: our inference as well. But this time we collect a few more details, like, what temperature did we set in that particular dataset, and,

773
01:33:20.340 --> 01:33:28.100
David Bau: What were the probabilities, what was the probability of the next token generation? Things like that, so that, we can benchmark it with.
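
Storing temperature and next-token probabilities alongside each generation might look like the following; the field names and JSON-lines layout are hypothetical, since the benchmark's real schema isn't shown here:

```python
import json
import math

def log_inference(example_id, prompt, generation, temperature, token_logprobs):
    """Serialize one model call with the metadata the benchmark stores."""
    record = {
        "example_id": example_id,
        "prompt": prompt,
        "generation": generation,
        "temperature": temperature,
        # Per-token probabilities recovered from log-probs.
        "token_probs": [math.exp(lp) for lp in token_logprobs],
    }
    return json.dumps(record)  # one JSON line per inference call

line = log_inference(
    "maze_0001",
    "How far are you from the statue?",
    "3 steps",
    temperature=0.0,
    token_logprobs=[-0.1, -0.05, -0.2],
)
```

Keeping the log-probs (or the recovered probabilities) per token is what later lets you benchmark calibration, not just accuracy.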

774
01:33:28.920 --> 01:33:30.789
David Bau: Hey, let's introducing the problem.

775
01:33:31.000 --> 01:33:33.620
David Bau: the questions we asked before. Oh, yeah.

776
01:33:37.590 --> 01:33:43.960
David Bau: So, the questions that we're asking are, like, how far are you from the statue? So, for example, if,

777
01:33:44.360 --> 01:33:50.570
David Bau: The agent's here, if you can look at it, the agent's here, and the statue is here, how far are you from the statue?

778
01:33:50.950 --> 01:34:07.899
David Bau: And how many steps do you need to perform in total to move to the goal, or something like that? And what do you find? Are the models really good at it, or really bad at it? Proprietary models are good at it. Llama, the smaller one, is not. The bigger Llama is good, without reasoning.

779
01:34:08.450 --> 01:34:10.159
David Bau: How's that? Oh, how good.

780
01:34:10.610 --> 01:34:19.099
David Bau: They are giving correct results, we have still not collected the data, so right now, as I speak… Have you guys ever played NetHack?

781
01:34:19.670 --> 01:34:23.879
David Bau: Sorry? Have you ever played NetHack? Have you ever heard of this… this game?

782
01:34:24.020 --> 01:34:24.869
David Bau: the desk.

783
01:34:25.010 --> 01:34:27.479
David Bau: Anybody here ever played NetHack?

784
01:34:28.330 --> 01:34:31.109
David Bau: Oh, I'm so old.

785
01:34:36.570 --> 01:34:43.140
David Bau: So… Back in the day.

786
01:34:43.430 --> 01:34:48.040
David Bau: when… Teletypes… you know what a teletype is?

787
01:34:51.300 --> 01:34:57.629
David Bau: Back in the day, when, you… everybody had mainframe computers, and they didn't have, like, PCs.

788
01:34:58.070 --> 01:35:02.249
David Bau: It used to be the way you used a computer is you would hook up a typewriter.

789
01:35:02.660 --> 01:35:07.649
David Bau: To, like, a serial… Or… that went down to their basement.

790
01:35:07.780 --> 01:35:11.720
David Bau: So, connect to the big computer, it would run your typewriter.

791
01:35:11.910 --> 01:35:18.499
David Bau: And then the computer would listen to your type… things that you typed, and then it would type things to answer you.

792
01:35:18.690 --> 01:35:21.350
David Bau: You could call up a report, and we'd print it out.

793
01:35:21.670 --> 01:35:24.790
David Bau: You get it on paper? That was a… that's called a teletype.

794
01:35:25.050 --> 01:35:32.870
David Bau: And then people switched to video terminals. Anybody heard of a VT100?

795
01:35:33.500 --> 01:35:35.149
David Bau: Oh my gosh, do you know.

796
01:35:36.080 --> 01:35:41.879
David Bau: Okay, so they… because all the computer codes that, like, when you have a terminal emulator, it's a VT100 emulator.

797
01:35:42.170 --> 01:35:43.110
David Bau: on this topic.

798
01:35:44.060 --> 01:35:50.080
David Bau: Anyway, so these terminals came up, these video terminals, it's very high-tech, and then the first thing that happened is people were like, oh, you can make video games!

799
01:35:51.000 --> 01:35:54.130
David Bau: And then one of the first video games to show up was this game called Hack.

800
01:35:54.490 --> 01:36:05.629
David Bau: And then there's a new version of it called NetHack that came out a couple years later, and it's still available even today; you can download NetHack to your Mac if you want to play it. And it's this little terminal game that looks just like

801
01:36:06.090 --> 01:36:12.930
David Bau: the maps that you're making, like this. Right? You, like, move around a little… grid world.

802
01:36:13.550 --> 01:36:17.410
David Bau: With your little dude. Your little dude is like a little at sign.

803
01:36:17.510 --> 01:36:24.079
David Bau: You, like, move it up, and move it to the right, like, get some treasures, and fight some monsters.

804
01:36:24.450 --> 01:36:33.860
David Bau: No. It's like Pac-Man. No, but because there's, like, magic potions you can drink, and then there's swords you can pick up, and it's a very rich game, much richer than Pac-Man.

805
01:36:34.310 --> 01:36:35.090
David Bau: Right?

806
01:36:35.270 --> 01:36:37.709
David Bau: And… And the thing is.

807
01:36:38.130 --> 01:36:46.700
David Bau: There was a training environment, around NetHack. It's a very sophisticated game, and they trained a lot of the models

808
01:36:47.170 --> 01:36:52.270
David Bau: on how to read an ASCII board; it's like a language model's task, the ASCII text.

809
01:36:52.500 --> 01:36:55.579
David Bau: Right? And how to decide on what the next move is.

810
01:36:55.720 --> 01:36:57.130
David Bau: How to win the game.

811
01:36:57.390 --> 01:37:05.600
David Bau: And I would… Be surprised if none of the big companies included winning NetHack.

812
01:37:05.780 --> 01:37:14.160
David Bau: They probably include winning chess in the training data, but then there's this nice NetHack training environment, so they probably include winning NetHack.

813
01:37:14.410 --> 01:37:19.680
David Bau: In the reinforcement learning training environment, because it's kind of a simulation

814
01:37:19.790 --> 01:37:21.610
David Bau: Of a lot of human education.

815
01:37:21.950 --> 01:37:31.270
David Bau: And the navigation and things like that. So when you say strong models, like, big models are, like, really good at this, probably because they were trained on it. They were probably trained on NetHack.

816
01:37:31.420 --> 01:37:32.210
David Bau: Right?

817
01:37:32.620 --> 01:37:39.950
David Bau: And so, so, and then when I look at… so you should… Look up NetHack.

818
01:37:41.630 --> 01:37:44.769
David Bau: A NetHack board is a lot more challenging

819
01:37:44.880 --> 01:37:59.439
David Bau: than the boards that you were trying, and you could see if any of the models could actually tell you distance in a NetHack board, or navigate a NetHack board. They're, like, they're a lot trickier than this. Does that make sense? Yeah, yeah.

820
01:37:59.920 --> 01:38:02.609
David Bau: Anyway, yeah. And also, long story, sorry.

821
01:38:02.770 --> 01:38:07.609
David Bau: Yeah, also add on to Kimmy's, structure.

822
01:38:08.710 --> 01:38:26.620
David Bau: All right. Any other thing you want to suggest? Yeah, just one more thing to add on. So, we have to very carefully design the prompt to the model, so the model can describe how far it is, and how close they are to some, like, landmarks.

823
01:38:26.620 --> 01:38:30.759
David Bau: So before that, we didn't, put this, like, coordinates

824
01:38:30.760 --> 01:38:49.910
David Bau: introduce this coordinate system to the model, because we wanted to see, like, how the model would choose to describe this. But if we didn't include that, the model, like, didn't answer quite well. They didn't know how to describe this concept of far and close. So we have to include this.

825
01:38:49.910 --> 01:38:54.059
David Bau: And then the model can output some reasonable response to it.

826
01:38:54.180 --> 01:38:55.969
David Bau: That's one of the ablations.

827
01:38:57.840 --> 01:39:01.980
David Bau: So, given that more complicated settings?

828
01:39:02.900 --> 01:39:12.000
David Bau: So here's a question for you. So I don't know if you're interested in this. So this is, I mean, this is setting off… okay, so you should also talk to Gabriel, because he just submitted for ICML a grid world.

829
01:39:12.070 --> 01:39:31.410
David Bau: paper, right? Did I… do we mention that to you before? I think he mentioned it to you, that he's working on it. So, where… where he's probing into the models, doing… is it inter… like, all the… all my colleagues, it's all interpretability work, right? So it's, like, probing into models. So you should do… do… ask him about it for two reasons. One is.

830
01:39:32.840 --> 01:39:40.619
David Bau: to get ideas, and the other is to make sure that your paper is less… you don't want to make a paper that's less interesting than his. You don't want to make a tiny subset of what he's doing.
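
"Probing into the models," as mentioned here, usually means fitting a simple linear classifier on internal activations to test whether some property is linearly decodable. A minimal sketch on synthetic data; no real model activations are involved, and every name below is a stand-in:

```python
import numpy as np

def fit_linear_probe(acts, labels, lr=0.1, steps=500):
    """Logistic-regression probe: predict a binary property from activations."""
    w = np.zeros(acts.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(acts @ w + b)))   # sigmoid of the logits
        w -= lr * (acts.T @ (p - labels)) / len(labels)
        b -= lr * np.mean(p - labels)
    return w, b

# Synthetic "activations": two classes separated along one direction,
# standing in for e.g. activations on two kinds of grid positions.
rng = np.random.default_rng(0)
offset = np.array([2.0] + [0.0] * 7)
pos = rng.standard_normal((100, 8)) + offset
neg = rng.standard_normal((100, 8)) - offset
acts = np.vstack([pos, neg])
labels = np.concatenate([np.ones(100), np.zeros(100)])

w, b = fit_linear_probe(acts, labels)
preds = (acts @ w + b) > 0
accuracy = (preds == labels).mean()
```

High probe accuracy is evidence the property is represented linearly at that layer; the interesting interpretability questions start when you ask where and how that representation is used.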

831
01:39:40.750 --> 01:39:58.349
David Bau: Which is the main challenge with what you've got here. And so I think that you've got an issue of, like, how do you make sure that you do something that hasn't been done 500 times before, right? Like, robotics people have been studying grid world for about 50 years, right? And so how do you make sure that you don't, like.

832
01:39:58.410 --> 01:40:05.980
David Bau: write another grid world paper, but it's, like, just watered down, and never gets through peer review. Does that make sense? I think that's the main…

833
01:40:06.250 --> 01:40:18.770
David Bau: question that is challenging for me, so I… so, yes, any suggestions? Yeah, this is a… this is just a good question. Like, when I look at the grid, like, I think its correspondence to, like, the way I experience space feels intuitive, but, like.

834
01:40:18.780 --> 01:40:28.820
David Bau: How do you know that it's intuitive in that sense for the model? Like, what's the correspondence between this and then, like, the, like, physical, real-world settings that you all motivated the project with at the beginning?

835
01:40:28.900 --> 01:40:34.440
David Bau: Like, is this… yeah, how do you link this to how the model thinks about how real-life physics works?

836
01:40:35.350 --> 01:40:42.959
David Bau: We are using Neuronpedia to figure out which neurons activate the most, and,

837
01:40:43.380 --> 01:40:48.089
David Bau: Which particular portion of attention or any other mechanism inside it

838
01:40:48.200 --> 01:40:51.459
David Bau: Is kind of responsible for,

839
01:40:51.740 --> 01:41:08.859
David Bau: working through these problems the most. I think we showed a few results in last week's presentation as well, on… I don't think the question is really trying to get an answer. I think the question is more of a question to ask if you want to do your research in this direction. Yeah, we're going in this direction.

840
01:41:08.960 --> 01:41:18.529
David Bau: Right? Like, if you were to do this, but you put real city names on your grid. You said Berlin is over here, Paris is over here, you know.

841
01:41:18.650 --> 01:41:34.010
David Bau: you know, Rome is over here, Moscow's over here, right? Would that make it easier for your model? Because it's like, oh, now I know where things are, because I know, like, things in the real world. What if you put them in the wrong order? What if you mixed up

842
01:41:34.110 --> 01:41:53.769
David Bau: like, real cities, then would it be in conflict with itself? Does it know these two things are the same thing? I think that that's kind of the spirit, right? Does that make sense? Yeah, yeah, yeah, yeah. Yeah, yeah, yeah, yeah, yeah, right? Like, so, like, does it… does a model think about this abstract

843
01:41:54.470 --> 01:41:57.970
David Bau: toy world, the same way it thinks about

844
01:41:58.100 --> 01:42:00.680
David Bau: The real world, or is that totally unrelated?

845
01:42:00.940 --> 01:42:12.100
David Bau: I think it's a very, you know, you know, like, that's, like, nobody's written that paper. Like, the robotics people wouldn't touch that one. Does that make sense? Thank you.

846
01:42:12.610 --> 01:42:13.320
David Bau: Yeah.

847
01:42:15.610 --> 01:42:17.540
David Bau: How are we doing out of time? Are we out of time?

848
01:42:17.990 --> 01:42:19.600
David Bau: Did everybody get some advice?

849
01:42:19.960 --> 01:42:23.319
David Bau: Okay, I think everybody got some advice. Thank you very much.

850
01:42:23.450 --> 01:42:24.809
David Bau: We talked to all the teams.

851
01:42:25.500 --> 01:42:26.889
David Bau: And we'll see you on Tuesday.

852
01:42:27.700 --> 01:42:29.669
David Bau: Alright, stay safe from the snow.

853
01:42:36.190 --> 01:42:45.259
David Bau: I don't know. NetHack.

854
01:42:45.500 --> 01:42:54.730
David Bau: It's a fun game. You install it, and you might get addicted to it. It's pretty good. I haven't played it for like that.

855
01:42:54.950 --> 01:42:57.660
David Bau: Many of the adjac.

856
01:42:58.070 --> 01:42:59.310
David Bau: I mean, that happened.

857
01:42:59.520 --> 01:43:06.879
David Bau: That's it, this one, this one. Oh, okay. Yeah, you see? All these funny little… Oh, wow, things. Alright, this is more fun.

858
01:43:07.040 --> 01:43:28.440
David Bau: Yeah, yeah, yeah. So I'll see it. And then you can also look for the NetHack Learning Environment, or I don't know what they call it, but they have their own… No, it's all, like, an open source one. Oh, the open source. Oh, so Gabriel is your student, or… Yeah, he's a postdoc in my lab. Oh, okay, oh, sorry, so you can ask him about that. Thank you, thank you. He'll be there.

859
01:43:28.470 --> 01:43:30.069
David Bau: Yeah, that's go back to Sugaroo.

860
01:43:37.240 --> 01:43:38.780
David Bau: But I know I can do more.

861
01:43:39.830 --> 01:43:40.640
David Bau: Praise.

862
01:43:41.370 --> 01:43:43.109
David Bau: Did you get any sleep last night?

863
01:43:43.300 --> 01:43:53.889
David Bau: I don't know what happened to Nikhil. Nikhil, like, he didn't even tell me exactly. No, but it's alright. Yeah, it's okay.

864
01:43:54.380 --> 01:43:57.020
David Bau: It's fast enough, for sure.

865
01:43:57.220 --> 01:43:58.230
David Bau: But,

866
01:43:58.370 --> 01:44:04.970
David Bau: But I logged in in the morning and did sort of final edits on them to make it work a bit, like any of the other works for the paper.

867
01:44:05.410 --> 01:44:09.080
David Bau: And, but no, let's see…

