
Investigating PRECISE

by Nils Everling

Institution: KTH Royal Institute of Technology
Year: 2015
Keywords: Natural Sciences; Computer and Information Science; Computer Science; Naturvetenskap; Data- och informationsvetenskap; Datavetenskap (datalogi)
Record ID: 1370436
Full text PDF: http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-166574


A natural language interface to a database (NLIDB) lets a user query a database in natural language. PRECISE (Popescu et al., 2003) is a formal model for a portable SQL NLIDB that interprets a question by pairing sentence tokens with database attributes and values using a maximum-flow solution. PRECISE is said to be sound and complete for a large class of semantically tractable questions. We implemented PRECISE and deployed it on Geoquery, a database of geographical facts. PRECISE never returned a single, incorrect query, giving it the highest possible precision value. However, out of the 448 questions given, PRECISE was only able to produce SQL queries for 162, giving it a recall value of 0.361. A considerable number of sentences gave rise to multiple interpretations, in which case PRECISE produced no query. Moreover, by design, PRECISE could not produce queries for sentences that did not contain a WH-token ({"what", "where", "when", "who", "which"}). Our implementation of PRECISE required some manual configuration to achieve its best recall on Geoquery. While the results are tied to our implementation, they give an indication of the size of the semantically tractable class as well as the portability of PRECISE.
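
To illustrate the token-pairing step mentioned above, the following is a minimal sketch, not the thesis implementation. It assumes a hypothetical lexicon that maps each sentence token to the database elements it could denote, and it uses augmenting-path bipartite matching, which is equivalent to maximum flow on a unit-capacity graph. The token strings, element names, and the functions `max_matching` and `is_complete` are illustrative assumptions; the graph PRECISE actually constructs may differ.

```python
from typing import Dict, List, Set


def max_matching(candidates: Dict[str, List[str]]) -> Dict[str, str]:
    """Assign each token to at most one database element, and each element
    to at most one token, maximising the number of assigned tokens."""
    owner: Dict[str, str] = {}  # element -> token currently holding it

    def try_assign(token: str, seen: Set[str]) -> bool:
        # Look for an augmenting path that starts at this token.
        for element in candidates[token]:
            if element in seen:
                continue
            seen.add(element)
            # Take a free element, or re-route the token that holds it.
            if element not in owner or try_assign(owner[element], seen):
                owner[element] = token
                return True
        return False

    for token in candidates:
        try_assign(token, set())
    return {token: element for element, token in owner.items()}


def is_complete(candidates: Dict[str, List[str]]) -> bool:
    """True if every token can be paired with its own database element."""
    return len(max_matching(candidates)) == len(candidates)


# Hypothetical lexicon entries; not taken from the Geoquery schema.
lexicon = {
    "rivers": ["relation:river", "attribute:river_name"],
    "run through": ["attribute:traverse"],
    "texas": ["value:texas"],
}
pairing = max_matching(lexicon)
```

In this sketch a question is only a candidate for translation when `is_complete` holds. Checking whether more than one complete pairing exists, the situation in which the abstract reports that no query is produced, is deliberately left out.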