CS-ECE Colloquium: Building knowledge bases for natural language understanding
Intelligent systems capable of understanding natural language have many potential applications, from healthcare to business to law. One way to formulate natural language understanding is as the task of mapping natural language text to its meaning representation: entities and relations anchored in the world. Knowledge bases (KBs) can facilitate natural language understanding by mapping words to their meaning representations, for example nouns to entities and verbs to relations. State-of-the-art efforts such as NELL, Freebase, and YAGO have been successful at constructing such knowledge bases, which contain beliefs about real-world entities and relations, by leveraging the redundancy of millions of documents to detect language patterns. The accumulated knowledge has been used to improve the ability of intelligent systems to make inferences.

In multilingual and multimodal settings, knowledge bases present a virtuous cycle of learning: more beliefs, with higher confidence, can be extracted by processing data in additional languages or modalities; in turn, because the entities and relations in a KB exist in the world regardless of the language or modality used to express them, KBs can act as an interlingua that relates corpora in different languages and modalities through shared entities and relations. This is especially useful for low-resource languages, where there are few, if any, aligned bilingual texts to support effective natural language understanding.
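To make the interlingua idea concrete, the following is a minimal, purely illustrative sketch (hypothetical names and values, not drawn from NELL, Freebase, or YAGO) of a KB as a set of subject-relation-object beliefs whose entities are linked to surface forms in multiple languages:

```python
# Illustrative sketch only: a KB belief is a (subject, relation, object)
# triple over canonical entities; each entity carries surface forms in
# several languages, so text in any language can be anchored to it.
from dataclasses import dataclass, field

@dataclass
class Entity:
    canonical_id: str
    surface_forms: dict = field(default_factory=dict)  # language code -> surface form

@dataclass
class Belief:
    subject: str       # canonical entity id
    relation: str      # e.g. "city_located_in_country" (hypothetical relation name)
    obj: str           # canonical entity id
    confidence: float  # confidence aggregated over documents, languages, modalities

barcelona = Entity("E:barcelona", {"en": "Barcelona", "es": "Barcelona"})
spain = Entity("E:spain", {"en": "Spain", "es": "España"})

kb = [Belief("E:barcelona", "city_located_in_country", "E:spain", confidence=0.97)]

def ground(mention, lang, entities):
    """Map a textual mention in a given language to a canonical entity id."""
    for e in entities:
        if e.surface_forms.get(lang) == mention:
            return e.canonical_id
    return None

# The same entity is reachable from English or Spanish text, so the KB
# relates the two corpora without any aligned bilingual sentences.
print(ground("España", "es", [barcelona, spain]))  # -> "E:spain"
```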