Author Archives: aria42
Condensr: Automatic Yelp Snippet Summarization using Natural Language Processing
In an earlier post, I hinted at a Yelp review summarization system demo. Well, here it is. Some background: Christy Sauper, Regina Barzilay, and I recently presented Incorporating Content Structure into Text Analysis Applications at the Empirical Methods of Natural … Continue reading
Posted in computer science, machine learning, nlp, reviews
1 Comment
Classification with MIRA in Clojure
A few people from my last post asked for an accessible explanation of the margin infused relaxation algorithm (MIRA) and confidence-weighted learning (CW) classification algorithms I discussed. I don’t think I can easily explain CW, but I think MIRA, or … Continue reading
Posted in clojure, computer science, machine learning
4 Comments
The Relation between MIRA and CW Learning
Note: This post won’t make sense unless you’re steeped in recent machine learning. There’s a good chance that if you are, you already know this. During a machine learning reading group with Mike Collins, Jenny Finkel, Alexander Rush, and me … Continue reading
Posted in computer science, machine learning, nlp
4 Comments
Extracting Useful Review Snippets from Yelp Using Natural Language Processing
Recently, Christy Sauper, Regina Barzilay, and I published a paper, Incorporating Content Structure into Text Analysis Applications, about how to use content structure in a document to improve accuracy on information extraction tasks. One of the datasets we worked with … Continue reading
Posted in computer science, discourse, information extraction, nlp
Leave a comment
Clojure Unsupervised Part-Of-Speech Tagger Explained
Last week, I posted a 300-line Clojure script that implements some recent work I’ve published in unsupervised part-of-speech tagging. In this post, I’m going to describe more fully how the model works and how the implementation works. This post assumes that you have some basic background in probability and that you know some Clojure. The post is massive, so feel free to skip any section that feels too remedial; I’ve put superfluous details in footnotes or marked paragraphs.
Continue reading
Posted in clojure, computer science, nlp
10 Comments
State-Of-The-Art Unsupervised Part-Of-Speech Tagging in 300 lines of Clojure (from Scratch)
Recently, Yoong-Keok Lee, Regina Barzilay, and I published a paper on unsupervised part-of-speech tagging, i.e., how do we learn syntactic categories of words from raw text? This model is actually pretty simple relative to other published papers and actually … Continue reading
Posted in computer science
4 Comments
Computer and Computational Science
There’s a divide I’ve noticed amongst people lumped into a “computer science” department. Compactly, I think there are computer scientists and computational scientists; the knowledge base of these groups is rapidly diverging and CS departments should do a better job … Continue reading