AITopics | Information Extraction

Collaborating Authors

Information Extraction

News Overviews Instructional Materials AI-Alerts Classics

Pattern Matching and Discourse Processing in Information Extraction from Japanese Text

Journal of Artificial Intelligence ResearchAug-1-1994

Information extraction is the task of automaticallypicking up information of interest from an unconstrained text. Informationof interest is usually extracted in two steps. First, sentence level processing locates relevant pieces of information scatteredthroughout the text; second, discourse processing merges coreferential information to generate the output. In the first step, pieces of information are locally identified without recognizing any relationships among them. A key word search or simple patternsearch can achieve this purpose. The second step requires deeperknowledge in order to understand relationships among separately identified pieces of information. Previous information extraction systems focused on the first step, partly because they were not required to link up each piece of information with other pieces. To link the extracted pieces of information and map them onto a structuredoutput format, complex discourse processing is essential. This paperreports on a Japanese information extraction system that merges information using a pattern matcher and discourse processor. Evaluationresults show a high level of system performance which approaches human performance.

information extraction, pattern matching and discourse processing, tanabe pharmaceutical, (3 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.53

AI Access Foundation

10119

Journal of Artificial Intelligence Research

Country: Asia > Japan (0.08)

Industry: Health & Medicine (0.37)

Technology:

Information Technology > Data Science > Data Mining > Text Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.91)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.91)

Add feedback

Automatically constructing a dictionary for information extraction tasks

Riloff, E.

ClassicsFeb-1-1993

Knowledge-based natural language processing systems have achieved good success with certain tasks but they are often criticized because they depend on a domain-specific dictionary that requires a great deal of manual knowledge engineering. This knowledge engineering bottleneck makes knowledge-based NLP systems impractical for real-world applications because they cannot be easily scaled up or ported to new domains. In response to this problem, we developed a system called AutoSlog that automatically builds a domain-specific dictionary of concepts for extracting information from text. Using AutoSlog, we constructed a dictionary for the domain of terrorist event descriptions in only 5 person-hours. We then compared the AutoSlog dictionary with a handcrafted dictionary that was built by two highly skilled graduate students and required approximately 1500 person-hours of effort. We evaluated the two dictionaries using two blind test sets of 100 texts each. Overall, the AutoSlog dictionary achieved 98% of the performance of the handcrafted dictionary. On the first test set, the Auto-Slog dictionary obtained 96.3% of the performance of the handcrafted dictionary. On the second test set, the overall scores were virtually indistinguishable with the AutoSlog dictionary achieving 99.7% of the performance of the handcrafted dictionary.

artificial intelligence, expert system, natural language, (17 more...)

Classics

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
South America (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(2 more...)

Industry:

Law Enforcement & Public Safety > Terrorism (0.72)
Government (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.95)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.68)

Add feedback