NYU NLP/Text-as-Data Speaker: Tatsunori Hashimoto, 10/22 - Shared screen with speaker view
If you have questions, please raise your hand and we will unmute you.
How about simply using (x, z) as input to the language model, where z is a topic embedding, instead of x only?
(Actually similar to Cho’s question)
Would truncating losses decrease the density assigned to diverse *but still faithful* tokens/phrases?
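For context on the question above: loss truncation during training typically means dropping (or zeroing) the highest-loss fraction of examples in each batch, so the model is not forced to fit noisy or unfaithful references. A minimal sketch, assuming a hypothetical `drop_frac` hyperparameter and plain per-example losses (not the speaker's actual implementation):

```python
import numpy as np

def truncate_losses(losses, drop_frac=0.1):
    """Sketch of loss truncation: zero out the top drop_frac
    fraction of per-example losses so those examples contribute
    no gradient. drop_frac is a hypothetical hyperparameter."""
    losses = np.asarray(losses, dtype=float)
    k = int(np.ceil(drop_frac * len(losses)))
    if k == 0:
        return losses
    # k-th largest loss; everything at or above it is dropped
    cutoff = np.partition(losses, -k)[-k]
    return np.where(losses >= cutoff, 0.0, losses)

# With drop_frac=0.25 on four examples, the single largest loss is zeroed.
print(truncate_losses([1.0, 2.0, 3.0, 4.0], drop_frac=0.25))
```

The question then is whether this cutoff also removes density from rare-but-correct continuations, since those can have high loss for reasons other than being unfaithful.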