Interdisciplinary Seminar Series

Title: Stream Characterization from Content

Speaker: Allen Gorin, Director, Human Language Technology Research, U.S. Department of Defense, Fort Meade, MD

Date: Monday, May 3, 2010 12:00 - 1:00 pm

Location: DIMACS Center, CoRE Bldg, Room 431, Rutgers University, Busch Campus, Piscataway, NJ


Coping with information overload is a major challenge of the 21st century. Huge volumes and varieties of multilingual data must be processed to extract salient information. We have previously reported on research into automatically characterizing large volumes of streaming content. However, information includes both content and associated metadata, which humans deal with as a gestalt but computer systems often treat separately. Attributed random graphs provide a useful mechanism for jointly modeling content and context. This talk describes such methods, with experimental proof-of-concept results on the Switchboard and Enron corpora. This research is in collaboration with Priebe and Grothendieck.

Slides: Stream Characterization from Content