Java代写|INFR11199 Coursework Assignment Part 1


In this assignment, you will implement the minimization procedure for conjunctive queries
and a lightweight database system for evaluating conjunctive queries called Minibase.
The goals of this assignment are threefold:

1. to understand the minimization procedure for conjunctive queries,
2. to teach you how to translate conjunctive queries into relational algebra query plans,
3. to learn the iterator model for relational operator evaluation and build na¨ıve im
plementations of the most common operators (e.g., selection, projection, join).

The assignment consists of three tasks:

Task 1 (30%): Implementation of the minimization procedure for conjunctive queries.

Task 2 (45%): Implementation of the iterator model and common RA operators.

Task 3 (15%): Implementation of aggregation operators (SUM, AVG, MEDIAN).

The remaining 10% of your mark is for code style and comments; more details later on.

You will use the minimization procedure only in Task1 but not in Tasks 2 & 3; this is to
ensure that any errors made in Task 1 are not propagated into the other tasks.

You should consider the set semantics for each task, meaning that every relation
consists of distinct tuples and the input and output of each operator contain no duplicates.

You will start from a skeleton project consisting of two main classes:

• CQMinimizer, which should contain your code for Task 1
• Minibase, which should contain your code for Tasks 2 & 3.

Both classes define the expected command line interface. You are free to modify these
classes but must preserve their command line interface as it is essential for marking.

The skeleton project also comes with a parser for our query language, so you do not
have to write your own (unless you want to). The main classes show how to parse query
strings into Java objects.

You are free to use any text editor or IDE to complete the project. We will use Maven
to compile your project. We recommend setting up a local development environment by
installing Java 8 or later and using an IDE such as IntelliJ or Eclipse. To import the
project into IntelliJ or Eclipse, make sure that you import as a Maven project (select the
pom.xml file when importing).

In Task 1, you will implement the minimization algorithm for conjunctive queries dis
cussed in class. You will build a program that reads a query from the specified input file,
minimizes the query, and writes a minimized query in the specified output file.

Task 1 considers the language of conjunctive queries as presented in class, with queries
expressed using the rule-based form:

where R1; : : : Rm are relation names, v is a tuple of variables, t1; : : : ; tm are tuples of
terms, and each variable from v appears in the body. A term can be a positive integer
constant (e.g., 42), a string constant enclosed in single quotation marks (e.g., ’ADBS’),
or a variable. A valid variable name consists of lowercase letters, e.g., a; x; abc are valid
variable names; note that this is different from the notation used in class where a; b; c
denote constants. A valid relation name consists of uppercase letters, and the relation
name in the head does not appear in the body. In our query language, ’ADBS’ is a string
constant while ADBS (without quotation marks) is a valid relation name.

Examples of valid conjunctive queries are:

Q(x) :- R(x; ’This is a test string’)
Q(x; z) :- R(4; z); S(x; y); T(12; ’x’)
Q() :- R(y; z); R(z; x)


Examples of invalid conjunctive queries in our language are:

Q(y) :- R(x; ’test’) { y does not appear in body
Q() :- { empty body
Q(x; 3) :- R(4; z); S(x; y) { constant in head
Q(y) :- R(y); S() { empty relational atom in body

You can assume that only valid, correctly-typed conjunctive query will be provided as
input. For instance, Q(x) :- R(x; 4); R(y; ’4’) is wrongly-typed as the second attribute of
R does not have a unique type. Similarly, Q(x) :- R(x; y); R(x; y; z) is also wrongly-typed
as the arity of R is not unique. Such queries will never appear as input.

For simplicity, you can assume that every relational atom appearing in the body or
the head has no repeated variables. For instance, the atom R(x; x) is invalid according
to this restriction, while R(x; xx) and R(3; 3) are valid; the query Q(x; x) :- R(x) is also
invalid. But the same variable can appear in different relational atoms in the body.

You do not need to perform any type or naming checks on the input query.