Category Archives: computer science

Condensr: Automatic Yelp Snippet Summarization using Natural Language Processing

In an earlier post, I hinted at a Yelp review summarization system demo. Well, here it is. Some background: Christy Sauper, Regina Barzilay, and I recently presented Incorporating Content Structure into Text Analysis Applications at the Empirical Methods of Natural … Continue reading

Posted in computer science, machine learning, nlp, reviews | 1 Comment

Classification with MIRA in Clojure

A few people from my last post asked for an accessible explanation of the margin infused relaxation algorithm (MIRA) and confidence-weighted learning (CW) classification algorithms I discussed. I don’t think I can easily explain CW, but I think MIRA, or … Continue reading

Posted in clojure, computer science, machine learning | 4 Comments

The Relation between MIRA and CW Learning

Note: This post won’t make sense unless you’re steeped in recent machine learning. There’s a good chance that if you are, you already know this. During a machine learning reading group with Mike Collins, Jenny Finkel, Alexander Rush, and me … Continue reading

Posted in computer science, machine learning, nlp | 4 Comments

Extracting Useful Review Snippets from Yelp Using Natural Language Processing

Recently, Christy Sauper, Regina Barzilay, and I published a paper, Incorporating Content Structure into Text Analysis Applications, about how to use content structure in a document to improve accuracy on information extraction tasks. One of the datasets we worked with … Continue reading

Posted in computer science, discourse, information extraction, nlp | Leave a comment

Clojure Unsupervised Part-Of-Speech Tagger Explained

Last week, I posted a 300-line Clojure script which implements some recent work I’ve published in unsupervised part-of-speech tagging. In this post, I’m going to describe more fully how the model works and also how the implementation works. This post assumes that you have some basic background in probability and that you know some Clojure. The post is massive, so feel free to skip sections if something feels too remedial; I’ve put superfluous details in footnotes or marked paragraphs.
Continue reading

Posted in clojure, computer science, nlp | 10 Comments

State-Of-The-Art Unsupervised Part-Of-Speech Tagging in 300 lines of Clojure (from Scratch)

Recently, Yoong-Keok Lee, Regina Barzilay, and I published a paper on unsupervised part-of-speech tagging, i.e., how do we learn syntactic categories of words from raw text? This model is actually pretty simple relative to other published papers and actually … Continue reading

Posted in computer science | 4 Comments

Computer and Computational Science

There’s a divide I’ve noticed amongst people lumped into a “computer science” department. Compactly, I think there are computer scientists and computational scientists; the knowledge base of these groups is rapidly diverging and CS departments should do a better job … Continue reading

Posted in computer science | 3 Comments