There were two new trends that I found particularly compelling this year:
- Noisy Genre
This pretty clunky term covers genres of language that are not well-formed. As far as I can tell, this covers everything other than newswire, broadcast news, and read speech. This is what I would call "language in the wild" or, in a snarkier mood, "language" (sans modifier). For the purposes of HLT-NAACL, it covers Twitter messages, email, forum comments, and ... speech recognition output. It's this kind of language that got me into NLP and why I ended up working on speech, so I'm pretty excited that it's receiving more attention from the NLP community at large.
- Mechanical Turk for language tasks
Like the excitement over Wikipedia a few years ago, NLP folks have fallen in love with Amazon's Mechanical Turk. Mechanical Turk was used for speech transcription, sentence compression, paraphrasing, and quite a lot more; there was even a workshop day dedicated solely to this topic. I didn't go to it, but I'll catch up on the papers in the next week or so. This work is very cool, particularly when it comes to automatically detecting and dealing with outlier annotations. The resource and corpora development uses of Mechanical Turk are obvious and valuable. It's in the development of "high confidence" or "gold standard" resources that I think this work has an opportunity to intersect very nicely with work on ensemble techniques and classifier combination/fusion. If each Turker is considered to be an annotator, the task of identifying a gold standard corpus is identical to generating a high-confidence prediction from an ensemble.
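To make that ensemble analogy concrete, here's a minimal sketch in Python (my own toy, not any particular paper's method): plain majority voting over Turker labels, plus an iterative pass that reweights each worker by agreement with the running consensus -- a crude cousin of classifier-fusion and Dawid-Skene style aggregation. All worker names, items, and labels are invented for illustration.

```python
# Toy sketch: treat each Turker as one annotator in an ensemble and derive
# a "gold standard" label per item. Majority vote first, then an iterative
# pass that reweights workers by agreement with the consensus.
# All names and data here are invented for illustration.
from collections import Counter, defaultdict

def majority_vote(annotations):
    """annotations: {item: {worker: label}} -> {item: consensus label}."""
    return {item: Counter(votes.values()).most_common(1)[0][0]
            for item, votes in annotations.items()}

def weighted_vote(annotations, rounds=3):
    """Reweight each worker by agreement with the running consensus,
    then re-vote; a crude stand-in for Dawid-Skene style aggregation."""
    consensus = majority_vote(annotations)
    for _ in range(rounds):
        hits, total = defaultdict(int), defaultdict(int)
        for item, votes in annotations.items():
            for worker, label in votes.items():
                total[worker] += 1
                hits[worker] += int(label == consensus[item])
        reliability = {w: hits[w] / total[w] for w in total}
        consensus = {}
        for item, votes in annotations.items():
            scores = defaultdict(float)
            for worker, label in votes.items():
                scores[label] += reliability[worker]
            consensus[item] = max(scores, key=scores.get)
    return consensus

if __name__ == "__main__":
    toy = {  # three sentences, three workers, sentiment labels
        "s1": {"w1": "POS", "w2": "POS", "w3": "NEG"},
        "s2": {"w1": "NEG", "w2": "NEG", "w3": "NEG"},
        "s3": {"w1": "POS", "w2": "NEG", "w3": "NEG"},
    }
    print(weighted_vote(toy))  # {'s1': 'POS', 's2': 'NEG', 's3': 'NEG'}
```

The reweighting step is where the outlier-detection work plugs in: a worker who disagrees with everyone gets a small ballot, without anyone having to hand-label a calibration set.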
A couple of specific highlights of papers I liked this year:
- “cba to check the spelling”: Investigating Parser Performance on Discussion Forum Posts
Jennifer Foster. This might be the first time I've fully agreed with a best paper award. This paper looked at parsing outrageously sloppy forum comments, which are rife with spelling errors, grammatical errors, and weird exclamations (lol). The paper is a really nice example of the difficulty that "noisy genres" of text pose to traditional models (i.e., those trained on WSJ text). The error analysis is clear, and the paper proposes some nice solutions for bridging this gap by adding noise to the WSJ training data. Also, bonus points for subtly including
- Cheap, Fast and Good Enough: Automatic Speech Recognition with Non-Expert Transcription
Scott Novotney and Chris Callison-Burch. A nice example of using Mechanical Turk to generate training data for a speech recognizer. High-quality transcription of speech is pretty expensive and critically important to speech recognizer performance. Novotney and Callison-Burch found that Turkers are able to transcribe speech fairly well, and at a fraction of the cost. This paper includes a really nice evaluation of Turker performance and some interesting approaches to ranking Turkers.
- The Simple Truth about Dependency and Phrase Structure Representations: An Opinion Piece
Owen Rambow. This paper was probably my favorite in terms of sheer joy; it was a breath of fresh air. The argument Rambow lays out is that dependency and phrase structure representations of syntax are meaningless in isolation; they are simply alternate representations of identical syntactic phenomena. Linguists love to fight over a "correct" representation of syntax. This paper takes the position that the distinction between the representations is merely one of preference, not substance -- fighting over the correct representation of a phenomenon is a distraction from understanding the phenomenon itself. Full disclosure: I've known Owen for years, and like him personally as well as his work.
- Type-Based MCMC
Percy Liang, Michael I. Jordan and Dan Klein. Over the last few years, I've been boning up on MCMC methods. I haven't applied them to my own work yet, but it's really only a matter of time. This work does a nice job of pointing out a limitation of token-based MCMC -- specifically, that sampling on a token-by-token basis can make it overly difficult to escape local modes. Some of this difficulty can be overcome by sampling at the level of types -- that is, jointly resampling the assignments of all tokens that share a type across the whole data set, as opposed to moving one token at a time (see the toy sketch below). This makes intuitive sense and was empirically well motivated.
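To see why the type-level block move helps, here's a toy sketch (my own construction, not the models from the paper): assume the tokens of one word type are exchangeable, so the target distribution over their binary assignments z_1..z_S depends only on the count k of tokens assigned 1. Token-by-token Gibbs changes k by at most one per flip and has to crawl through a low-probability valley between the modes; a type-level move resamples k for the whole type in a single step.

```python
# Toy contrast of token-based vs. type-based sampling on an exchangeable
# target: p(z) depends only on k = sum(z) over the S tokens of one word type.
# The target below is invented so that k near 0 and k near S are both modes.
import math
import random

S = 20  # number of tokens sharing this word type

def log_f(k):
    # Unnormalized log-weight of any configuration with count k.
    # Chosen so the modes at k=0 and k=S beat the binomial entropy at k=S/2.
    return (k - S / 2) ** 2 / 5.0

def token_gibbs_step(z):
    # Standard token-based move: resample one z_i given all the others.
    i = random.randrange(S)
    k_rest = sum(z) - z[i]
    w0, w1 = math.exp(log_f(k_rest)), math.exp(log_f(k_rest + 1))
    z[i] = 1 if random.random() < w1 / (w0 + w1) else 0

def type_step(z):
    # Type-based move: resample the count k for the whole type exactly,
    # since p(k) is proportional to C(S, k) * f(k), then scatter the
    # k ones uniformly over the tokens of the type.
    logw = [math.lgamma(S + 1) - math.lgamma(k + 1) - math.lgamma(S - k + 1)
            + log_f(k) for k in range(S + 1)]
    m = max(logw)
    weights = [math.exp(lw - m) for lw in logw]
    k = random.choices(range(S + 1), weights=weights)[0]
    ones = set(random.sample(range(S), k))
    for i in range(S):
        z[i] = 1 if i in ones else 0

z = [0] * S
for _ in range(1000):
    token_gibbs_step(z)  # slow: k moves by at most 1 per flip
for _ in range(1000):
    type_step(z)         # fast: jumps between k ~ 0 and k ~ S directly
```

The point is just the shape of the move: because tokens of a type are exchangeable, the joint update over all of them collapses to a one-dimensional draw over counts, which is cheap and hops across the valley that single-token flips have to crawl through.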
As a side note, I'd like to thank all you wonderful machine learning folks who have been doing a remarkable amount of unsupervised structured learning that I should have been paying better attention to over the last few years. Now I've got to hit the books.
1 comment:
This is an excellent recap of NAACL-HLT. I am really glad that I came across it while doing a search for noisy genre at NAACL. It really helped round out my general impression of the conference.
Besides the two trends that you mentioned, I noticed that there was quite a buzz about Twitter too. People seemed to be doing all kinds of stuff with it: detecting new events, detecting controversies, annotating conversations with dialog tags, named entities, personalized annotation tags based on a user's interests and concerns...
BTW, I also really appreciated your frank observation "people who let me in can't be that great". I feel the same way! Perhaps a new-PhD phenomenon :)