Department of Computer and Information Science

 

Computer Science Seminar Series

Trends in Language and Information Processing


September 10, 3:00pm

Weir Hall, Room 235

Dr. Vasile Rus,
Assistant Professor, Systems Testing Research Fellow of the FedEx Institute of Technology
Department of Computer Science and Institute for Intelligent Systems
The University of Memphis


Abstract:

The explosion of the World Wide Web over the past decades has put huge amounts of information at our fingertips. We are suddenly faced with an information overload problem: too much information rather than too little. To deal with this problem there is an acute need for tools that efficiently mine vast repositories of information and deliver what the user is looking for. Most of the information on the web (more than 90%) is textual or a mixture of text with other media (images, tables, charts, etc.). The unstructured nature of natural language, compared to the structured information in databases, poses particular challenges for extracting information from documents written in natural language. In the first part of the talk I will broadly present several solutions for extracting information from unstructured collections of documents. In particular, I will define Information Retrieval, Information Extraction and Question Answering as different ways of obtaining information from large repositories of textual documents.

The second part of the talk will present a solution to Textual Entailment. Textual Entailment is the task of deciding whether one fragment of text, called the Hypothesis (H), can be entailed (logically inferred) from another fragment of text, called the Text (T). To solve the entailment challenge we adopted a graph-based approach in three stages. First, both the Text and the Hypothesis are mapped onto lexico-syntactic graphs, with core concepts mapped onto nodes and syntactic dependencies among concepts mapped onto edges. Second, a graph subsumption operation is performed between graph-H and graph-T, and a score that quantifies the degree of subsumption is computed. Third, we decide, based on the computed score, whether T entails H. The talk concludes with results on a standard data set for textual entailment and a discussion of future work.
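The three stages described in the abstract can be illustrated with a small Python sketch. This is not the speaker's system: the graphs are built by hand rather than by a parser, and the scoring formula (an equally weighted mix of matched Hypothesis nodes and matched Hypothesis edges), the relation labels, the function names and the 0.75 threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Graph:
    """Toy lexico-syntactic graph: concepts as nodes, dependencies as edges."""
    nodes: frozenset  # concept lemmas, e.g. "cat"
    edges: frozenset  # (head, relation, dependent) triples


def make_graph(edges):
    """Stage 1 (simplified): build a graph from hand-written dependency triples.

    A real system would obtain these triples from a syntactic parser.
    """
    edges = frozenset(edges)
    nodes = frozenset(n for head, _, dep in edges for n in (head, dep))
    return Graph(nodes, edges)


def subsumption_score(text, hyp, node_weight=0.5):
    """Stage 2 (assumed formula): fraction of H's nodes and edges found in T."""
    if not hyp.nodes:
        return 1.0  # an empty hypothesis is trivially subsumed
    node_score = len(hyp.nodes & text.nodes) / len(hyp.nodes)
    edge_score = len(hyp.edges & text.edges) / len(hyp.edges) if hyp.edges else 1.0
    return node_weight * node_score + (1 - node_weight) * edge_score


def entails(text, hyp, threshold=0.75):
    """Stage 3: decide entailment by comparing the score to an assumed threshold."""
    return subsumption_score(text, hyp) >= threshold


# T: "The cat chased a mouse in the garden."
T = make_graph([("chase", "subj", "cat"),
                ("chase", "obj", "mouse"),
                ("chase", "in", "garden")])
# H1: "The cat chased a mouse."   H2: "The dog chased a mouse."
H1 = make_graph([("chase", "subj", "cat"), ("chase", "obj", "mouse")])
H2 = make_graph([("chase", "subj", "dog"), ("chase", "obj", "mouse")])

print(entails(T, H1))  # every node and edge of H1 appears in T
print(entails(T, H2))  # "dog" and its subject edge are missing from T
```

In this toy setup H1 is fully contained in T and is judged entailed, while H2 matches only part of its nodes and edges and falls below the threshold. Exact matching of lemmas and relations is the simplest possible choice; a practical system would need to handle synonyms, negation and looser structural matches.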

Biography:

Dr. Vasile Rus received his Bachelor's degree in Computer Science from the Technical University of Cluj-Napoca in June 1997 with a Diploma Thesis entitled "Distributed and Collaborative Configuration Management", completed while at the LSR Laboratory, INPG, Grenoble, France. Dr. Rus earned his Master of Science and Doctor of Philosophy degrees in Computer Science from Southern Methodist University in Dallas, Texas, in May 1999 and May 2002, respectively. His research interests include intelligent systems, software engineering, artificial intelligence, natural language processing, and knowledge representation based on natural language and syntax-based semantics, with applications to autotutoring, textual entailment, question answering and other areas where semantics is a must. His professional interests focus mainly on systems software, software systems analysis and design, and configuration management. Dr. Rus has served as a Programme Committee member for many AI and Computational Linguistics conferences and as a reviewer for many journals, and is currently co-chair of the NSF-sponsored Workshop on the Question Generation Shared Task and Evaluation Challenge. As a faculty member of the Institute for Intelligent Systems, Dr. Rus is working on a number of the institute's projects, including AutoTutor, MetaTutor, Coh-Metrix, W-Pal, and Authoring Tools, to name just a few. Dr. Rus teaches Natural Language Processing, Information Retrieval and Web Search, and classes related to Operating Systems and Data Structures.

