Reading Notes: The YT Trick Traces Back to TWSS

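YT is slang that I have discussed before, so I will not repeat it here. There is no need to dig into it; I am only using it as a pretext to introduce a recent reading note.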

Quote

For those who are too polite to know this type of humor, let me explain. When speaking in a non-sexual context, we sometimes say things that are not funny, but which would be funny if the same words were uttered in a sexual context. A listener may detect the double meaning, and respond to your words with, “that’s what she said”, thus putting the remark into a sexual context, and creating a joke. Here’s an example:

Man 1: (looking at deli sandwiches) Wow, they’re much bigger than I expected.
Man 2: That’s what she said!

Quoted from http://us.textanalyticsnews.com/fc_fcbi1lz/lz.aspx?p1=05555212S3562&CC=&p=1&cID=0&cValue=1

I just finished reading the academic paper on this research, done by researchers at the University of Washington.

It is heavily research-oriented and academic, and practitioners in industry need not concern themselves with it at all.

It is eye-catching and certainly has academic value, since no one had worked on this so-called TWSS (That's What She Said) problem before.

It aims to use machine learning to identify a subset of puns that may or may not carry sarcasm about a brand. For the most part, though, it covers only a very small subset of data associated with adult jokes.

First of all, puns are the last thing that should be brought to the table as a target for automatic processing in a real-life system, not only because they are statistically rare but also because they are complex and often depend on cultural context. There are countless tasks that are far more widespread and far more tractable for automatic processing. Spending resources on such a problem in industry is neither wise nor effective.

This is one of those cases again: technology reporters like to cover stories like this because they capture people's attention and imagination.

Some research is twisted or exaggerated out of context to sound like the next big thing in real-life technology.

If this work is meant for real applications, the authors should show benchmarks on a large real-life corpus: not the benchmark reported in the paper on a select corpus from a particular source, but one on social media at large. The first questions to answer are how much TWSS there is in social media, how relevant it is to brands when it does occur, and how the classification would be used in applications. The publication answers none of these, so it is not worth the time to look into it further.
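To make the point about select corpora concrete, here is a minimal sketch of how a classifier's precision depends on how common the target class is. The recall, false positive rate, and prevalence values are hypothetical numbers chosen for illustration; none of them come from the paper or from any measured corpus, and the function name is my own.

```python
def precision_at_prevalence(recall: float,
                            false_positive_rate: float,
                            prevalence: float) -> float:
    """Expected precision when positives make up `prevalence` of the data.

    Uses Bayes' rule: TP share = recall * prevalence,
    FP share = false_positive_rate * (1 - prevalence).
    """
    true_pos = recall * prevalence
    false_pos = false_positive_rate * (1.0 - prevalence)
    if true_pos + false_pos == 0.0:
        return 0.0  # nothing is flagged, so precision is defined as 0 here
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Hypothetical classifier quality, NOT figures from the paper.
    recall, fpr = 0.8, 0.01
    # Hypothetical prevalences: a balanced select corpus vs. rarer settings.
    for prevalence in (0.5, 0.01, 0.0001):
        p = precision_at_prevalence(recall, fpr, prevalence)
        print(f"prevalence={prevalence:<7} precision={p:.3f}")
```

With recall and false positive rate held fixed, precision drops as the class gets rarer, which is why a benchmark on a balanced, hand-picked corpus says little about performance on social media at large.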

It is eye-catching. That's all.

RE: Subject: What can jokes teach us about NLP?
Can your text analytics algorithm tell the difference between a joke and a serious statement?

Reference:
http://www.aclweb.org/anthology-new/P/P11/P11-2016.pdf

Permalink for this article: http://blog.sciencenet.cn/blog-362400-617371.html

Author: liwei999
