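YT is slang; I have discussed it before and will not repeat myself here. There is no need to dig into it. I am only using it as a pretext to introduce a recent reading note.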
Quote
For those who are too polite to know this type of humor, let me explain. When speaking in a non-sexual context, we sometimes say things that are not funny, but which would be funny if the same words were uttered in a sexual context. A listener may detect the double meaning, and respond to your words with, “that’s what she said”, thus putting the remark into a sexual context, and creating a joke. Here’s an example:
Man 1: (looking at deli sandwiches) Wow, they’re much bigger than I expected.
Man 2: That’s what she said!
Quoted from http://us.textanalyticsnews.com/fc_fcbi1lz/lz.aspx?p1=05555212S3562&CC=&p=1&cID=0&cValue=1
I just finished reading the academic paper on this research, done by researchers at the University of Washington.
It is very research-oriented and academic, and practitioners in industry need not concern themselves with it at all.
It is eye-catching and certainly has academic value, since no one has worked on this so-called TWSS (That's What She Said) problem before.
It aims to use machine learning to identify and classify a subset of puns that may or may not carry sarcasm about a brand. For the most part, though, it covers only a very small subset of data associated with adult jokes.
First of all, puns are the last thing that should be brought to the table as a target for automatic processing in a real-life system, not only because they are statistically rare but also because they are complex and often depend on cultural context. There are countless tasks that are both far more widespread and far more tractable for automatic processing. Spending industry resources on this problem is neither wise nor effective.
This is another one of those cases: technology reporters like to cover stories like this because they capture people's attention and imagination.
Some research gets twisted or exaggerated out of context to sound like the next big thing in real-life technology.
If this were meant for real applications, the authors should show benchmarks on a large real-life corpus: not the benchmark reported in the paper on a select corpus from a particular source, but one from social media at large. The first questions to answer are how much TWSS there is in social media, how relevant it is to brands when it does occur, and lastly how the classification would be used in applications. The publication answers none of these, so it is not worth the time to look into it further.
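To make the benchmark point concrete, here is a minimal sketch of why a score on a select corpus says little about social media at large. It applies Bayes' rule to show how precision depends on how often the target class actually occurs. The function name and every number in it (recall, false positive rate, prevalence values) are my own illustrative assumptions, not figures from the paper.

```python
def precision_at_prevalence(recall: float,
                            false_positive_rate: float,
                            prevalence: float) -> float:
    """Expected precision of a binary classifier at a given class prevalence.

    precision = TP / (TP + FP), where
      TP = recall * prevalence
      FP = false_positive_rate * (1 - prevalence)
    """
    true_pos = recall * prevalence
    false_pos = false_positive_rate * (1.0 - prevalence)
    if true_pos + false_pos == 0:
        return 0.0
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Hypothetical classifier quality, held fixed across corpora.
    recall = 0.8
    false_positive_rate = 0.01

    # Hypothetical prevalences: a balanced select corpus versus
    # increasingly rare occurrence in an open stream of posts.
    for prevalence in (0.5, 0.01, 0.0001):
        p = precision_at_prevalence(recall, false_positive_rate, prevalence)
        print(f"prevalence={prevalence:<7} precision={p:.4f}")
```

With the classifier held fixed, precision drops sharply as the phenomenon gets rarer, which is why the first question has to be how much TWSS there really is in social media.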
It is eye-catching. That's all.
RE: Subject: What can jokes teach us about NLP?
Can your text analytics algorithm tell the difference between a joke and a serious statement?
References:
http://www.aclweb.org/anthology-new/P/P11/P11-2016.pdf
http://blog.sciencenet.cn/blog-362400-617371.html