Scala代写 | COMP3000 Assignment Two Syntax Analysis
这个作业是用Scala写一个语法解析器
COMP3000 Programming Languages 2020
Assignment Two
Syntax Analysis
Overview
This assignment asks you to develop a lexical analyser, parser
and tree builder for a simple functional programming language
called FunLang. We will build on these components in the third
assignment to complete a full implementation of this language.
Building this implementation will give you insight into the way
that programming language implementations work in general,
as well as specific experience with how functional language
programs are written, how they are compiled, and how they are
executed.
This kind of task often arises in programming situations other
than language implementation. For example, many applications
have configuration files that are written in simple languages.
The application must be able to read these files reliably and
understand their structure, just as a compiler must read
program files and understand them.
FunLang
FunLang is a language that contains elements from mainstream
functional languages such as Haskell, ML and Scala; it uses a
Scala-like syntax. The necessary information about FunLang will
be provided in this document.
The description here is a brief overview of the FunLang
language. Aspects such as checking the validity of names or
types and translating the program into an executable form are
beyond the scope of syntax analysis and hence are not
considered in this assignment.
The basic unit of a FunLang program is the expression; there are
no statements. In fact, a program is just a single expression. For
example, here is a simple program that returns the result of a
simple arithmetic operation:
2 + 3 * 4
When this program is run it will print the value of the
expression: the number 14.
Block expressions are used to build programs out of smaller
expressions. A block expression is a pair of curly braces
containing one or more definitions, followed by a single
expression. The idea is that the definitions can give names to
values and functions. The value of a block expression is given by
its final expression, which can use the defined names. For
example, here is a program consisting of a block expression
that uses two values:
{
val a = 5
val b = a + 1
a * b
}
This program will print the result of multiplying a by b, so 30 will
be printed. (The name a can be used in the definition
of b since b is defined later, but that is a name analysis issue, so
we don’t need to worry about it here.) There are no assignment
statements, so the value bound to a particular occurrence of a
name cannot be changed.
Definitions can also define functions. For example, here is a
program that defines a value, a function and calls the function
passing the value as a parameter:
{
val x = 100
def inc (a : Int) = a + 1
inc (x)
}
This program will print 101.
All of these programs have just one expression at the top level.
In fact, that’s the definition of a program in FunLang: a single
expression. Expression forms are interchangeable as long as
they have the correct type. E.g., anywhere we can put a number
can also take a block or some other kind of expression that
evaluates to a number. For example, here is an artificial
program that uses blocks nested inside an arithmetic operation.
{
val a = 3
a + 1
} *
{
val b = 10
b – 1
}
This program will print 36 since it is multiplying 4 times 9.
We’ve seen a few different forms of expression: numbers,
addition expressions, multiplication expressions and function
call expressions. There are also other arithmetic operations,
Boolean values, Boolean literals, relational and logical
operators, and conditional expressions. The complete syntax of
FunLang is given below.
Finally, FunLang comments are as in Java: either beginning with
two slashes and continuing to the end of the line, or
surrounded by /* and */.
val z = 3 // This is a comment
z + 7 /* So is this */
The syntax of FunLang
To guide your implementation, here is a context-free grammar
for the FunLang language on which you can base your parser.
First, the syntax of programs and expressions:
program : exp.
exp : app
| block
| cond
| exp “==” exp
| exp “<” exp | exp “+” exp | exp “-” exp | exp “*” exp | exp “/” exp | “false” | “true” | idnuse | integer | “(” exp “)”. app : idnuse “(” exp “)”. block : “{” definitions exp “}”. cond : “if” “(” exp “)” “then” exp “else” exp. Now the syntax of definitions that can occur in blocks: definitions : defngroup+. defngroup : fundefn+ | valdefn. fundefn : “def” idndef “(” arg “)” “=” exp. arg : idndef “:” tipe. valdefn : “val” idndef “=” exp. Functions that are defined adjacent to each other are grouped together and they can call each other. The node class FunGroup is used for this purpose and each function is represented by a Fun node. Values are not grouped (or equivalently, each value group consists of a single value) and are represented by Val nodes. Finally, the syntax of types: tipe : “Bool” | “Int” | tipe “=>” tipe
| “(” tipe “)”.
We use the word “tipe” instead of “type” since the latter is a
Scala keyword which prevents us from using it as the name of a
parser in our code. A function type is specified using the
arrow => and describes the type of a function that takes a value
of the type on the left of the arrow and returns a value of the
type on the right of the arrow. (Type analysis issues are also out
of scope for this assignment.)
The grammar above is not immediately suitable for encoding as
a parser. The exp and tipe non-terminals are ambiguous since
they make no allowance for precedence and associativity of the
operators. You should rewrite the grammar productions to
implement the following precedence and associativity rules:
• The following expression constructs have precedence as
shown from lowest to highest with constructs on the same
line having the same precedence:
1. conditional expressions
2. equal and less than
3. addition and subtraction
4. multiplication and division
5. all other kinds of expression
• All binary expression operators are left associative, except
for the relational operators (equality and less than) which
are not associative.
• The function type (=>) is right associative.
The parser skeleton you are given already handles the lexical
issues such as parsing integers, identifiers (both defining
occurrences idndef and applied occurrences idnuse) and
comments.
What you have to do
You have to write, document and test a Scala syntax analyser
including tree builder for FunLang.
You are strongly advised not to try to solve the whole
assignment in one go. It is best to write code to handle the
parsing and tree construction for some simple constructs first
and then build up to the full language.
Your code must use the Scala parsing library as discussed in
lectures and practicals. You should use the expression language
syntax analyser and tree builder from the mixed classes as a
guide for your implementation.
The associated assignment code bundle contains a skeleton sbt
project for the assignment. The modules are very similar to
those used in the practical exercises for Week 5 onwards. The
skeleton contains the modules you will need. Some of the
parsing and tree construction is given to you as an illustration;
you must provide the rest (look for FIXME in the code).
As well as lexing and parsing the input, your program should
construct a suitable source program tree to represent the
parsed result. See FunLangTree.scala in the skeleton for the full
definition and description of the tree structures that you must
use. Do not modify the tree classes, just create instances in your
parser code.
As an example of the desired tree structure, here is the tree that
should be produced from the first program above:
PlusExp (IntExp (2), StarExp (IntExp (3), IntExp (4)))
Notice that the higher precedence of multiplication over
addition has been taken into account in this tree.
As a more complex example, here is the tree for the inc function
program above:
BlockExp (
List (
Val (IdnDef (“x”), IntExp (100)),
FunGroup (
List (
Fun (
IdnDef (“inc”),
Arg (IdnDef (“a”), IntType ()),
PlusExp (IdnUse (“a”), IntExp (1)))))),
AppExp (IdnUse (“inc”), IdnUse (“x”)))
The tree contains a single block with two definitions: one for “x”
and one for “inc”. The function definition contains three
children: one for the name, one for the argument name and
type, and one for the body expression. Finally, the block has the
function call as its value expression.
Running the syntax analyser and testing it
The skeleton for this assignment is designed to be run from
within sbt. For example, to compile your project and run it on
the file test/simple.fun you use the command
run test/simple.fun