select all from class Movie
select all class Movie
(the same, just shorter) List all objects in the class movie.
Note the system is not case sensitive when referring to
database related terms, such as class names or tags.
results
select p from p in class Person
A locator-variable p is defined to loop over the key-values in class Person.
The query works exactly like the (1), but this query can be extended easily,
as p can be referred to for tag-dereferencing, column-wise sorting etc..
results
select m, m->Director from m in class Movie
The from-clause defines an iterator variable m that loops over all key-values of class movie.
The select-clause defines two select-fields - m is just the object value (shown as it's name)
and m->Director is the object value, which is stored with the tag Directory
in object in the class Movie.
results
select m, m->Director, m->Director->Real_name from m in class Movie
Basically the same as (3), but the result of m->Director has been dereferenced to
get the Real_name tag value for the particular object from the Person super-class.
results
select a, a->Real_name from m in class Movie, a in m->Based_on->Author
The from-clause defines two iterators - m loops over the Movie class
and a which takes the value of the Person-object, that is stored with the
Based_on-object (in class Book) for movies that are based on novels or plays.
The select-clause defines two result columns which report the Author
objects for such books. The second column shows how that object can be dereferenced further
to gain access to tags inside the Author objects.
Note that is isn't important where the dereferencing is done (select or from-clause).
The above query is the same as
select m->Based_on->Author, m->Based_on->Author->Real_name from m in class Movie
which performs the same without the need for an extra variable a, but duplicates some query elements.
results
select p, p->Relations[2] from p in class Person
This query makes use of tag-positioning.
All values fields have a position relating to their tag. So far we've dereferenced
tags to retrieve their values, e.g. m->Based_on. This notation is in fact translated
to m->Based_on[1] to say that we're interested in the data element which is one to the right
of the tag. A look at the models shows a different arrangement
for the Relations data in the Person class, where the real data is positioned
2 levels to the right of the Relations tag.
The bracketed number 2 means to skip the first level of data (some people have a Spouse
or other tag in the database) and only report the value to the right of that tag.
results
select p, p->Relations[Spouse] from p in class Person
This query positioned the tag-value by name rather than number. The level2 tag notation at Relations is resolved only at the Spouse tag in this case. This query is equivalent to :
select p, p->Spouse from p in class Person
which shows that the user does not have to specify the exact path
from the root of the object. The name for a tag has to be unique within
a class, so there is always only one possible path from the root
to a named tag in the object.
results
select m, d from m in class Movie, d in m->Director where exists d
The where-clause puts a condition on the preceding
from-clause. The AQL keyword exists is a boolean construct,
which returns TRUE if a value exists, and FALSE if no value can be found.
The result table has no blank fields in the
second column as all those movies without a value for d (i.e. m->Director)
have been omitted.
results
select m->Cast from m in class Movie where m = "True Romance"
The query iterates the variable m over the Movie class, but only
activates the select-clause (which reports the cast of a movie)
if the value of the Movie iterator is "True Romance".
Although m takes object-typed value, we can compare them by their
text-value (i.e. the object's name). The objects name has to be quoted for that matter.
The same results could be achieved by looping over all persons, and only selecting those
that have starred in "True Romance".
select p from p in class Person where p->Stars_in = "True Romance"
This shows an interesting aspect of AQL - It is possible to get the same result using totally different queries.
Other ways of obtaining the same results will be shown later.
results
select m, year from m in class Movie, year in m->Release[Date] where year > `1990
This condition compares two value of DateType. A date like 1990 would easily
be confused with an integer, and therefore literal date-values have to be preceded
with the reverse tick `.
The where-clause also has the effect of restricting the listing of m-values
to those where a release-date is stored in the database, which
implies the where exists ->year condition.
The iterator year could also be defined as m->Date, but the
above construct was chosen for readability. See example (7).
results
select m from m in class Movie where not exists m->Release[Date]
The not keyword negates the value of the exists m->Release[Date]
condition.
results
select p, yeardiff(p->Date_of_Birth, today) from p in class Person
Calculations with values of type Int or Float are easily written
using the standard operators +, -, / and *. For DateType values this
is not so easy. A set of functions is provided to calculate the
difference between two dates at various levels.
This query extract the year portion from two dates and subtracts those
to compute the age. The result of those date-functions is an integer.
The AQL keyword today will be rendered to
`1999-12-06
(or whatever the current date is) during evaluation.
The word today is therefore a valid date-literal value.
results
select p, yeardiff(p->Date_of_Birth, today) from p in class Person where yeardiff(p->Date_of_Birth, today) > 55
Same as (12), but restricts the list to people with ages greater than 55.
Note that only values for p can be included for which the Born tag exists. See example (10),
where a similar behavior occurred.
results
Possible date-functions are :-
These functions will return a sensible result of the date/time value specifies the appropriate portion that the function requests.
No sensible result will be returned for monthdiff (m->Release[Date], `1996). The date-function monthdiff can only produce a good result if both dates contain month-details. Some movies in fact have release dates with month information, but the date literal '1996 does not. The result of evaluating monthdiff will therefore always produce NULL.
All date-functions return an integer value, which is positive if left argument is the smaller (i.e. earlier) date.
select p, count(select p->Stars_in) from p in class Person
The table-function count acts on the sub-query select p->Stars_in
and returns an integer value for all objects where the Stars_in-tag exists. It simply counts
the number of times the Stars_in-tag is contained in each object p.
Note,
that count (p->Stars_in) is not valid. Table-functions
only operate over valid table-expressions. And the minimal valid table
expression to list all values of the Stars_in tag needs the
keyword select.
results
select m from m in class Movie where count(select m->Cast) > 3
This time the table-function is evaluated in the where-clause, which
compares it's integer result with the number 3.
results
select avg(select yeardiff(p->Date_of_Birth, now) from p in class Actor)
The avg table function acts on the return-value of another table expression.
The inner select-query returns a list of ages, over which the average value
is computed.
Such complex queries should be constructed inside-out. First try out the inner
query and see that it returns the desired result. Then wrap that result
up in the final table-function.
results
select max(select m->Date from m in class Movie)
Similar to (16), max acts on a list of dates.
results
Possible table-functions are :-
All table functions will work on either accumulated object attributes (14),(15) or on the table values of whole select-queries(16),(17). All such tables expression have to produce a table with only one column, and in most cases they have to be of a specific value-type for the table-function to work.
$x := count(select all class movie)
Scalar variable have a perl-like '$' syntax and can assume just one value.
This value is usually the return value of a table-function expression.
A variable assignment is treated as a whole query expression in itself.
results
The output will be a table with one column and one row containing the single
return value that the variable $x will assume.
The query is therefore equivalent to
select count(select all class movie)
$x := avg(select m->Rating from m in class Movie);
select m, r from m in class Movie, r in m->Rating where r > $x
The first query assigns a scalar variable to be the average rating of movie (ignoring
movie for which no such value is stored).
The second query then uses this value to list all movies for which the rating value is higher
that that.
This demonstrates how queries can be concatenated using the semi-colon ';'. Usually
one or more queries will calculate variables, whereas the last query is a table expression
that uses those variables to generate the output.
results
The same query could be composed without a scalar variable, but the difference
in execution time is quite startling :
select m, r from m in class Movie,
r in m->Rating
where r > avg(select m1->Rating from m1 in class Movie)
The first query would only run avg once (to get the value for $x)
and finish in a time linearly related to the number of movies.
The second query will execute the nested query inside the avg function in every
iteration cycle of m from the outer query.
The execution time is therefore quadratically related to the number of movies.
This is an example how queries can optimized by arranging it in a clever way.
@tmp := select first(select m->Date)
from m in class Movie, a in m->Cast
where a[1] = "James Bond" order by :1;
select min(@tmp):1
This query is structured into two parts. The first query lists the release dates of all films
with the character named "James Bond". The nature of the model for the Release tag means
that multiple release dates can be stored. The table-functions first ensures we're only listing
one release date per movie. The results are sorted and assigned to the table variable @tmp.
The second part of the query uses the table function min to extract the smallest value
from the first column of the given table. This will be the earliest release date of any bond film.
results
@characters := select c,
count(select m1 from m1 in class Movie where m1->Cast[2] = c),
m, first(select m->Release[Date])
from m in class Movie, c in m->Cast[2];
select character::row:1, movie::row:3, date::row:4
from row in @characters
where row:2 > 1
order by :character, :date
The table-variable feature is used again to separate the work into two parts. The first query
does all the work, whereas the second query, just selects specific fields and does the sorting.
We will learn later how this can be written all in one query.
The second query demonstrates a row-variable. The iterator row is declared as a cursor
on the table @characters. In the select-clause we extract fields 1, 3 and 4 from that row
whereas the where-clause uses field 2 for the comparison.
The select-clause also renames the select-fields, which makes the query more readable.
Those aliased names ('character', 'movie' and 'date') can then be used for the sort-clause
to refer to select-fields by their name.
results
The use of table-variables sometimes stems from the way people compose
queries by incrementally building them up from less complex queries.
Their use is not always necessary though. Here is the same example,
but without table-variables :
select c, m, mdate
from m in class Movie, c in m->Cast[2],
num in count(select m1 from m1 in class Movie where m1->Cast[2] = c)
where num > 1,
mdate in first(select m->Release[Date])
order by :c, :mdate
See example (29) for more on splitting queries into two parts and table-variables.
The database contains a particular dataset, which is useful for testing
and learning. It contains the last 20 James Bond movies. Some have little detail
stored about them, but the one feature that identifies them is the character name
"James Bond", which is stored with the particular Bond-actor.
The following examples will demonstrate a range of features which get
progressively more complicated.
select m from m in class Movie where m->Cast[2] = "James Bond"
In all Bond-movies there is a character called "James Bond".
The character name is the data-element to the right of the actors name under the Cast-tag
in the Movie-object.
results
select m, m->Cast from m in class Movie where m->Cast[2] = "James Bond"
The where-clause is important : it applies to the iterator m,
and will be evaluated once per movie. So the list of possible
values for m is restricted to Bond-movies.
The select-clause will then evaluate m->Cast again for all
of those movies, and therefore report all cast-members of those movies,
not just the ones playing "James Bond".
This may not always be what you want, so compare with (24).
results
select m, a from m in class Movie, a in m->Cast where a[1] = "James Bond"
The subtle difference to (23) is important. The where-clause now applies
directly to the iterator a (for "actor"). It will only be true if
the given actor a is playing "James Bond", and because the
select-clause reports the value of a, only the Bond-actors
are shown.
Note, how we get the value for the character name in this query.
The iterator m->Cast s automatically rewritten as
m->Cast[1] (the first data-element to the right is reported by default).
Therefore we only have to step one more position further to the right to get
to the second field, which is the character name.
results
If the query was written as
select m, m->Cast from m in class Movie, a in m->Cast where a[1] = "James Bond"
the result would be the same as (23), because the extra declaration of a is just used in the where-clause and is basically the same as referring to m->Cast[2] directly.
select m, a, a->Stars_in from m in class Movie, a in m->Cast where a[1] = "James Bond"
Basically the same as (24), but with an extra select field which lists all movies by the actor a.
results
select a, count(select m1 from m1 in class Movie where m1->Cast = a)
from m in class Movie, a in m->Cast where a[1] = "James Bond"
This query makes use of a nested sub-query. For every bond actor as determined by the outer query,
the nested query loops over all movies again finding the ones which where the cast includes the actor
a from the outer query.
results
select a, count(select m1 from m1 in class Movie,
a1 in m1->Cast where a1 = a and a1[1] != "James Bond")
from m in class Movie,
a in m->Cast where a[1] = "James Bond"
The inner is similar to the one in (26), but the where-clause is more elaborate
to count only the films where the current actor a (from the outer scope)
played someone that wasn't called "James Bond".
results
select a, otherm, otherc
from m in class Movie,
a in m->Cast
where a[1] = "James Bond",
otherm in a->Stars_in, a1 in otherm->Cast
where a = a1,
otherc in a1[1]
where otherc != "James Bond"
We don't need a nested query here to count the number of movies for each Bond-actor.
The first where-clause restricts the value of a to only those actors who have played
"James Bond". The iterator otherm then find all their movies (which at this stage
still includes all their bond movies as well).
The last clause then gets their character names from all those movies and excludes "James Bond".
This will exclude all actors who only played James Bond.
results
All of the "James Bond" examples have been expressed so far in just one query. The query in example (28) grows to considerable complexity. The section on table-variables showed how some queries are better split up into two or more queries. A final query can then combine the results as shown in example (22).
@a := select m->Cast from m in class Movie
where m->Cast[2] = "James Bond";
@b := select c from m in class Movie,
c in m->Cast,
otherm in c->Stars_in where c[1] = "James Bond";
@a diff @b
The table @a lists all actors from movies where a character was called "James Bond".
The second table @b will list all those actors who actually
played the "James Bond" character. That leaves us to report the difference of the two
table using the diff table-function.
The use of table-variables can be avoided because the diff keyword can be placed
between any two expressions which return a table value, i.e. two select-from-where queries :
select m->Cast from m in class Movie
where m->Cast[2] = "James Bond"
diff
select c from m in class Movie,
c in m->Cast,
otherm in c->Stars_in where c[1] = "James Bond"
The trick of the two queries which are combines by diff, leave no ambiguity. Spot the difference in this query, where everything is expressed as just one select-from-where query :
select othera
from m in class Movie
where m->Cast[2] = "James Bond",
othera in m->Cast
where othera[1] != "James Bond"
At first sight it seems to do the right thing. We select only the movies m where a character is called "James Bond". The variable othera then lists all actors again for only those movies and we check that we only report those who didn't play "James Bond". So how is it different ?
The line "Sophie Marceau" is missing from the result. Closer inspection of the data reveals that no character name is stored for her role in "The world is not enough". The second where-clause on othera therefore has no value for othera[1] to be used in the comparison. The comparison will evaluate to NULL and no result is shown for "Sophie Marceau".
See the section on NULL values.