Appendix A: Gnarli Walkthrough

User input is highlighted in bold and comments are in italics. Several debugging modes are used:  TRACE prints out the rules Gnarli selects to parse the input question, DEBUG prints out the preprocessed question and the final SQL query, and PRONOUN lists antecedents and the pronouns with which they are tagged.

Movie Database

Welcome to Gnarli!

Current syntax file is movie.syn: Test Movie Database.

> tell me about "tom hanks"
29      Tom Hanks      1956-07-09     Concord, California, USA

> when was he born?
1956-07-09

> what war movies came out after then and before 1965?
Lawrence of Arabia
Ciociara, La
Bridge on the River Kwai, The

> how many oscars did "casablanca" win?
2

> which oscars did it win?
BEST-PICTURE
BEST-DIRECTOR

> who won best director the most?
John Ford      4
William Wyler  3
Frank Capra    3
Steven Spielberg       2
Oliver Stone   2

> where was he born?
ERROR: Cannot figure out what you mean by 'he'!

> where were THEY born?
Cape Elizabeth, Maine, USA     John Ford
Mulhouse, France       William Wyler
Palermo, Sicily, Italy Frank Capra
Cincinnati, Ohio, USA  Steven Spielberg
New York, New York, USA        Oliver Stone

> how many actors from "italy" won oscars?
7

> what action movies longer than 150 minutes won an oscar since 1990
Saving Private Ryan
Titanic
Braveheart

> how long are they?
170     Saving Private Ryan
194     Titanic
177     Braveheart

> what are the 4 longest dramas to win an oscar?
Dances with Wolves     224
Gone with the Wind     222
Ben-Hur 212
Giant   201

> who directed them?
Kevin Costner  Dances with Wolves
Victor Fleming Gone with the Wind
William Wyler  Ben-Hur
George Stevens Giant

> debug on
DEBUG set to ON
> what director from "california" directed the most sci-fi flicks?
PREPROCESS: who from "california" directed the most science fiction movies?
QUERY: SELECT DISTINCT Person.name, count(*) as cnt FROM Movie, Director, Person WHERE Person.pob like "%california%" AND Movie.genre like "%S%" AND Director.movie_id = Movie.id AND Director.director_id = Person.id GROUP BY Person.id order by cnt desc limit 5;
ANSWER:
George Lucas   2
Michael Bay    1

> what rated r movies with "love" in the title won best picture?
PREPROCESS: which rated r movies with "love" in the title won best picture?
QUERY: SELECT DISTINCT Movie.name FROM Oscar, Movie WHERE Movie.rating like "r" AND Movie.name like "%love%" AND Oscar.type = "BEST-PICTURE" AND Oscar.movie_id = Movie.id;
ANSWER:
Shakespeare in Love

> debug off
DEBUG set to OFF
> show me what dramas that have some ridiculously obscure piece of information and came out between 1970 and 1973
WARNING: Ignored a large amount of text: 's that have some ridiculously obscure piece of information and '
French Connection, The
Godfather, The
Klute

> what PG-rated musicals starring "liza minnelli" came out then
Cabaret

> pronoun on
PRONOUN set to ON
> tell me what actors from "new york, new york" were born after 1979, and the exact birthdate?
pronoun: "select" => 'they'
pronoun: "select" => 'he'
pronoun: "new york, new york" => 'there'
pronoun: in 1979 => 'then'
pronoun: in select => 'then'
Macaulay Culkin 1980-08-26

> pronoun off
PRONOUN set to OFF
> what movies came out then
By 'then' do you mean
1. in 1979
2. in 1980-08-26
? 1
Kramer vs. Kramer
Norma Rae

Thank you for using Gnarli!

 

Course Catalog Database

Welcome to Gnarli!

Current syntax file is classes.syn: Test Course Catalog Database.

> what "chemistry" classes involving "research" start after 11?
Introduction to Research-Junior Year
Experimental Physical Chemistry

> who teaches them?
James E. Porter Introduction to Research-Junior Year
Bretislav Messener    Experimental Physical Chemistry

> what "econ" profs teach the most classes
Caroline M. Roxby      7
Alberto F. Mussina     6
Richard E. Braves       4
Christopher L. Foote   4
Jerry R. Green 4

> what "af-am" classes about "religion" meet in the fall?
Afro-Atlantic Religions

> what department has the most lecturers?
6       Anthropology
4       Afro-American Studies
4       Economics
3       Chemistry and Chemical Biology
3       Computer Science

> is "Jones" teaching "operating systems"?
Yes

> which "phil" classes have the earliest exams
The Pre-Socratics: Proseminar
Socrates: Proseminar
Existentialism: Proseminar
Philosophy of Science
Philosophy and Literature

> trace on
TRACE set to ON
> what graduate "physics" classes meet on friday and are taught by full profs?
which
graduate
"([^"]*)" class
class|meet
DAY:(.)
full
professor
WARNING: Ignored a large amount of text: ' and are taught by '
General Theory of Relativity
Electrodynamics I
Advanced Quantum Mechanics I
Advanced Quantum Mechanics II
Statistical Physics
Non-Relativistic Quantum Electrodynamics

> what departments have the least professors on leave
which department
least
professor
on leave
1       Computer Science
2       Applied Mathematics
3       Afro-American Studies
3       Engineering Sciences
4       Chemistry and Chemical Biology

> trace off
TRACE set to OFF
> how many profs do those departments have
21      Computer Science
10      Applied Mathematics
15      Afro-American Studies
31      Engineering Sciences
28      Chemistry and Chemical Biology

> debug on
DEBUG set to ON
> what undergraduate "cs" classes meet at 10-11:30 in the fall with the final exam before January 23?
PREPROCESS: which undergraduate "computer science" class meet at 10:00:00 and end at 11:30:00 in the fall with the exam before DATE:20010123?
QUERY: SELECT class_name FROM exams, classes WHERE class_number between 100 and 200 AND classes.department_id = 106 AND start = "10:00:00" AND end = "11:30:00" AND (semester = 'B' OR semester = 'F' OR semester = 'Y') AND (fall_date < 20010123) AND (semester = "B" OR semester = "F" OR semester = "Y") AND classes.examgrp1 = exams.grp;
ANSWER:
Introduction to Formal Systems and Computation

> what profs have the most classes on thursdays?
PREPROCESS: who have the most class on DAY:Hs?
QUERY: SELECT DISTINCT prof_name, count(*) as cnt FROM classes, professors WHERE days like "%H%" AND (classes.prof1_id = professors.prof1_id OR classes.prof2_id = professors.prof1_id) GROUP BY professors.prof1_id order by cnt desc limit 5;
ANSWER:
Andrew Anderson  6
Garrett Manley     4
Garth Chamberlain       3
Dan W. Jurgenson      3
Robert S. Brandstein  3

> what and when are the latest "engineering" classes
PREPROCESS: which and when are the latest "engineering" class
QUERY: SELECT days, start, end, semester, class_name FROM classes WHERE start IS NOT NULL AND classes.department_id = 108 order by start desc limit 5;
ANSWER:
MW      16:30:00       18:00:00       S       Introduction to Operations Research
H       16:00:00       18:00:00       B       Engineering Design Projects
H       16:00:00       18:00:00       B       Engineering Design Projects
MW      16:00:00       17:30:00       F       Estimation and Control of Dynamic Systems
TH      16:00:00       17:30:00       F       Special Topics in Biomedical Engineering: Orthoped

> debug off
DEBUG set to OFF
> how many visiting associate professors teach a "math" class
2

> tell me about "cs 50"
106     50      Introduction to Computer Science I    4949    10614          F       MWF        10:00:00       11:00:00       3       Introduction to the intellectual enterprises of computer science. Algorithms: their design, specification, and analysis. Software development: problem decomposition, abstraction, data structures, implementation, debugging, testing... Computer Science 101

> what introductory "math" classes meet on monday and wednesday from 9-10?
Introduction to Linear Algebra and Multivariable Calculus

> what department has the most assistant profs
15      Mathematics
14      Economics
6       Biological Sciences
5       Anthropology
5       Philosophy

Thank you for using Gnarli!

Housing Database

Welcome to Gnarli!

 

Current syntax file is housing.syn: Test Housing Database.

 

> what apartments in "boston" have a pool, are available in less than 6

months, and cost under $2500?

1650 Beacon Hill Classic Brownstone

2230    Saint Germain

 

> trace on

TRACE set to ON

> how many units with more than 4 brs are in cities with at most 100000

people?

how many

apartment

over (\d+) bedroom

city

at most (\d+) population

10

 

> trace off

TRACE set to OFF

> does "cheshire" have any furnished units with an elevator and a doorman?

No

 

> pronoun on

PRONOUN set to ON

> how many apts does it have

pronoun: "cheshire" => 'there'

pronoun: "cheshire" => 'it'

6

 

> pronoun off

PRONOUN set to OFF

> what houses under $600 per bedroom can be rented before February 2002?

1050  525.00      Acton, Boxborough

1187  395.67      Arbor Court

3330  555.00      Longwood Towers

1550  516.67      Cambridge Rentals

1550  516.67      Boston Rentals

 

> where are they?

Boxborough  Acton, Boxborough  Acton, Boxborough

Marlborough 37 Hosmer St.  Arbor Court

Brookline  20 Chapel St.  Longwood Towers

Cambridge  65 Court St. Cambridge Rentals

Boston  65 Court St.  Boston Rentals

 

> what percent of families have children in the 5 biggest cities?

25.6454 Boston

31.6271 Worcester

37.8838 Springfield

37.3901 Lowell

18.9041 Cambridge

 

> What is their population?

589141 Boston

172648 Worcester

152082 Springfield

105167 Lowell

 

> debug on

DEBUG set to ON

> Tell me the names and addresses of apts between $3000 and $4000 with at

least 1.5 bathrooms and a lease of over 8 months starting after 6/2002

PREPROCESS: Tell me the names and addresses of apartments between $3000

and $4000 with at least 1.5 bath and a lease of over 8 months starting

after DATE:20020601

QUERY: SELECT cities.name,listings.address, listings.name FROM cities,

listings WHERE listings.price >= 3000 and listings.price <= 4000 AND

listings.bath >= 1.5 AND DATE_ADD(start_date, interval 8 month) <=

end_date AND start_date > 20020601 AND listings.city_id = cities.city_id;

ANSWER:

Cambridge 12 Museum Way  Museum Towers

Boston  7 Harcourt St.  Garrison Square

 

Thank you for using Gnarli!

 

Appendix B: Sample Error Output and Failures

Welcome to Gnarli!

Current syntax file is movie.syn: Test Movie Database.

> trace on
TRACE set to ON
> debug on
DEBUG set to ON

Gnarli (incorrectly) interprets the following question as "Who has starred in movies that have collectively won the most oscars?"

> who has starred in the most oscar winning movies?
PREPROCESS: who has was in the most oscar winning movies?
who
most
oscar
movies
WARNING: Ignored a large amount of text: ' has was in the '
QUERY: SELECT DISTINCT Person.name, count(*) as cnt FROM Movie, Oscar, Actor, Person WHERE Oscar.movie_id = Movie.id AND Actor.actor_id = Person.id and Actor.movie_id = Movie.id GROUP BY Person.id order by cnt desc limit 5;
ANSWER:
Jack Nicholson 10
Clark Gable    9
Dustin Hoffman 9
Diane Keaton   8
Robert Duvall  8

Gnarli ANDs these criteria together instead of attempting a same-table join

> which movies won oscars for best actor and best actress?
PREPROCESS: which movies won oscars for best actor and best actress?
which
movies
oscar
best actor
best actress
QUERY: SELECT DISTINCT Movie.name FROM Oscar, Movie WHERE Oscar.type = "BEST-ACTOR" AND Oscar.type = "BEST-ACTRESS" AND Oscar.movie_id = Movie.id;
ANSWER:

Gnarli correctly answers the following question...

> which movies starred "hanks" and "sinise"?
PREPROCESS: which movies starred "hanks" and "sinise"?
which
movies
starred "([^"]*)" and "([^"]*)"
QUERY: SELECT DISTINCT Movie.name FROM Movie, Actor as Act1, Actor as Act2 WHERE Act1.actor_id = 29 and Act2.actor_id = 31 and Act1.movie_id = Act2.movie_id and Movie.id = Act1.movie_id;
ANSWER:
Forrest Gump
Apollo 13

... but because the syntax file did not include certain preprocessor rules, it could not answer the following equivalent question:

> have "hanks" and "sinise" been in a movie together?
PREPROCESS: boolean has "hanks" and "sinise" was in a movie together?
boolean
"([^"]*)"
"([^"]*)" was in
QUERY: SELECT count(*) FROM Actor, Person WHERE Person.id = 29 AND Actor.actor_id = 31 AND Actor.actor_id = Person.id;
ANSWER:
No

Gnarli reports an ambiguous substitution.

> what movies was "michael" in?
PREPROCESS: which movies was "michael" in?
which
movies
(was|were) "([^"]*)" in
WARNING: Ambiguous: more than one 'michael' in database; using Michael Anderson (I)
QUERY: SELECT DISTINCT Movie.name FROM Movie, Actor WHERE Actor.actor_id = 1214 AND Actor.movie_id = Movie.id;
ANSWER:

Gnarli complains that it does not know what to do.

> "tom hanks" is great
PREPROCESS: "tom hanks"
"([^"]*)"
QUERY: SELECT  FROM Person WHERE Person.id = 29;
ERROR: Could not determine what information to retrieve!

Thank you for using Gnarli!