Computer Science Seminar Series
Trends in Language and Information Processing
September 10, 3:00pm
Weir Hall, Room 235
Dr. Vasile Rus, Assistant Professor
Systems Testing Research Fellow of the FedEx Institute of Technology
Department of Computer Science and Institute for Intelligent Systems
The University of Memphis
Abstract:
The explosion of the World Wide Web over the last decades has put huge amounts of information at our
fingertips. We are suddenly faced with an information overload problem: too much information rather
than too little. To deal with the information overload problem there is an acute need for
tools that efficiently mine vast repositories of information and deliver what the user is looking for.
Most of the information (>90%) on the web is textual or a mixture of text with other media (images,
tables, charts, etc.). The unstructured nature of natural language (compared to structured information
in databases) poses particular challenges to extracting information from documents written in natural
language.
In the first part of the talk I will broadly present several solutions to extracting information from
unstructured collections of documents. In particular, I will define Information Retrieval, Information
Extraction and Question Answering as different ways of obtaining information from large repositories of
textual documents.
The second part of the talk will present a solution to Textual Entailment. Textual Entailment is the
task of deciding whether one fragment of text, called the Hypothesis (H), can be entailed (logically
inferred) from another fragment of text, called the Text (T). To solve the entailment challenge we
adopted a graph-based approach in three stages. First, both the Text and Hypothesis are mapped onto
lexico-syntactic graphs by mapping core concepts onto nodes and syntactic dependencies among concepts
onto edges. Second, a graph subsumption operation is performed between graph-H and graph-T and a score
which quantifies the degree of subsumption is computed. Third, we decide based on the computed score
whether T entails H. We conclude by presenting results on a standard data set for textual entailment and
future work.
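
The three stages described above can be illustrated with a minimal Python sketch. It is not the speaker's implementation: the DepGraph class, the hand-written dependency triples, the equal weighting of node and edge coverage, and the 0.7 threshold are all assumptions made for illustration, and a real system would build the graphs with a syntactic parser.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DepGraph:
    """Toy lexico-syntactic graph: concepts as nodes, dependencies as labeled edges."""
    nodes: frozenset
    edges: frozenset  # (head, relation, dependent) triples


def make_graph(triples):
    """Stage 1 (stand-in): build a graph from hand-written dependency triples."""
    edges = frozenset(triples)
    nodes = frozenset(n for head, _, dep in edges for n in (head, dep))
    return DepGraph(nodes, edges)


def subsumption_score(graph_t, graph_h, node_weight=0.5):
    """Stage 2: how much of graph-H is covered by graph-T, as a value in [0, 1].

    The weighted mix of node and edge coverage is an assumed scoring scheme.
    """
    if not graph_h.nodes:
        return 1.0  # an empty hypothesis is trivially covered
    node_cov = len(graph_h.nodes & graph_t.nodes) / len(graph_h.nodes)
    if graph_h.edges:
        edge_cov = len(graph_h.edges & graph_t.edges) / len(graph_h.edges)
    else:
        edge_cov = 1.0
    return node_weight * node_cov + (1 - node_weight) * edge_cov


def entails(graph_t, graph_h, threshold=0.7):
    """Stage 3: decide entailment by comparing the score to an assumed threshold."""
    return subsumption_score(graph_t, graph_h) >= threshold


# T: "John bought a red car"
text_graph = make_graph([
    ("buy", "subj", "john"),
    ("buy", "obj", "car"),
    ("car", "mod", "red"),
])
# H1: "John bought a car"   H2: "Mary sold a car"
hyp_covered = make_graph([("buy", "subj", "john"), ("buy", "obj", "car")])
hyp_not_covered = make_graph([("sell", "subj", "mary"), ("sell", "obj", "car")])

print(subsumption_score(text_graph, hyp_covered), entails(text_graph, hyp_covered))
print(subsumption_score(text_graph, hyp_not_covered), entails(text_graph, hyp_not_covered))
```

In this toy example every node and edge of the first hypothesis appears in the text graph, so it is judged as entailed, while the second hypothesis shares only the node "car" and falls below the threshold. Exact matching of words and relation labels is a simplification; handling synonyms or negation would need additional resources that this sketch leaves out.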
Biography:
Dr. Vasile Rus received his Bachelor's degree in Computer Science from the Technical University of
Cluj-Napoca in June 1997 with a Diploma Thesis entitled Distributed and Collaborative Configuration
Management, completed while at LSR Laboratory, INPG, Grenoble, France. Dr. Rus earned his Master of
Science in Computer Science and Doctor of Philosophy in Computer Science degrees from Southern Methodist
University in Dallas, Texas in May 1999 and May 2002, respectively. His research interests include
intelligent systems, software engineering, artificial intelligence, natural language processing,
knowledge representation based on natural language and syntax-based semantics with applications to
autotutoring, textual entailment, question answering and other applications where semantics is a must.
His professional interests are mainly focused on systems software, software systems analysis and design,
and configuration management. Dr. Rus has served as a Programme Committee member for many AI and
Computational Linguistics conferences and as a reviewer for as many journals, and is currently co-chair
of the NSF-sponsored Workshop on the Question Generation Shared Task and Evaluation Challenge. As a
faculty member of the Institute for Intelligent Systems, Dr. Rus is working on a number of projects in
the institute, including AutoTutor, MetaTutor, Coh-Metrix, W-Pal, and Authoring Tools, to name just a
few. Dr. Rus teaches Natural Language Processing, Information Retrieval and Web Search, and classes
related to Operating Systems and Data Structures.