Course List: http://www.c-jump.com/bcc/
The purpose of this assignment is to experiment with recursive method calls in Java. Your task is to download Desktop Calculator v3 from CIS-257 sample code web page, then make it compile and run. You will finish the program by adding the code to compute arithmetic expressions and print the results.
Lexer deals with assembly of words or symbols into a meaningful statement of the input language.
Syntax refers to issues regarding the grammar of a statement.
Semantics refers to issues regarding the meaning of a statement. Semantics is defined as "the meaning of a symbol or a sequence of symbols."
The Lexer Java class implements Recursive-Descent Parser for Arithmetic Expressions.
Let's start by introducing a few definitions already supported by the sample program.
Java class Lexer deals with understanding of symbols in arithmetic expressions. A few examples of arithmetic expressions with proper syntax are:
2 - 3 / 22 1 - 2 - 3 5 (1) ( 1 + 2 ) * 6
Syntax refers to issues regarding the grammar of the arithmetic expression. Another example demonstrates bad syntax:
10 - 2 +
It is an error caused by missing number at the end of the user input.
The rules to formulate valid expressions and weed out inputs with errors are implemented in the Lexer class.
The functionality described here is already implemented by the sample program.
Before Lexer class decides how to interpret the input, the input string is tokenized. That is, an expression such as
( 1 + 2 ) * 6
is split into an array of tokens representing
numbers
arithmetic operators, and
parentheses specifying sub-expressions: (expression)
Parentheses allow grouping of sub-expressions such as ( 1 + 2 ). Each sub-expression inside the parentheses must be evaluated prior to the rest of the expression.
The TokenStream class splits user input into an array of Java strings. Each String contains a single token of input. Expression ( 1 + 2 ) * 6 generates an array of tokens like this:
( 1 + 2 ) * 6
The TokenStream class provides operations to navigate the sequence of tokens as follows:
method begin() makes first token current
method next() advances to the next token
method isEnd() tells whether the end of the token array is reached.
Other methods allow to extract and analyze token data:
getString() returns the value of the current token
isString() compares current token to a specific string
isNumber() returns true if the token is a number, false otherwise
The TokenStream class also handles white space: any extra spaces entered by the user are ignored by the TokenStream methods.
The functionality described here is already implemented by the sample program.
Semantics refers to issues regarding the logical meaning of the input symbols. Semantics deal with making decisions to store or discard information while processing a sequence of tokens. The Lexer class understands semantics of expression
( 1 + 2 ) * 6
as a combination of multiplication and addition involving three numerical values. The structure of parsed expression is internally represented by an abstract syntax tree:
* / \ + 6 / \ 1 2
Such abstract syntax tree (AST for short) effectively represents the logical structure of arithmetic expression (1+2)*6. The meaning is:
Multiply the left-hand-side value (+) by the right-hand-side value 6
Prior to multiplication, obtain the left-hand-side value as a sum of 1 and 2
Print the result at the end
The Lexer class implements recursive-descent parser for an arithmetic expression. The Lexer constructor stores user input in the TokenStream object.
Lexer's method startRule() is the top-level entry into the parser:
the content of input is tokenized
the expressionRule() is invoked
the result is checked for an error
If there is no error, the current version of the program prints the AST structure on computer screen.
The functionality described here is already implemented by the sample program.
To understand the input, the Lexer class uses the following grammar rules:
Expression: Term Term "+" Term Term "-" Term Term: Primary Primary "*" Primary Primary "/" Primary Primary "%" Primary // remainder operator -- not implemented Primary: Number "-" Primary "+" Primary "(" Expression ")" Name "=" Expression // variable assignment -- not implemented
What is the meaning of grammar rules Expression, Term, and Primary?
First, we say that arithmetic Expression starts with the Term and ends with a Term.
Second, we define Term to begin as a Primary, and end as a Primary.
The Primary can be
a number
a sub-expression enclosed in parentheses
a Primary preceeded by an unary operator for positive and negative values.
In the future, Primary can also accommodate grammar for assignment of values to variables.
Although the above grammar rules are highly abstract (not written in Java), the corresponding methods in the Lexer class are expressionRule(), termRule(), and primaryRule().
The grammar defines a syntax of input. Anything that fails to pass the rules is an error. An arithmetic expression has to pass all of the rules before it can be successfully evaluated. For instance, according to the grammar, expression
3 + 2
is parsed in stages:
symbol 3 is a Term, which is a Primary defined as a Number.
symbol + is a binary plus operator "+" recognized by the Expression rule.
symbol 2 is an Expression, because it's a Term which is a Primary defined as a Number.
Thus, expression 3 + 2 satisfies the grammar and can be successfully parsed. Conversly, the expression
3 - 5 /
does not pass, because 5 / does not evaluate to a Term which requires that the division operator "/" must have two operands. Since right-hand-side operand is missing, the Lexer reports an error for this expression.
The functionality described here is already implemented by the sample program.
Why do we have as many as three rules in our grammar? There are a couple of reasons. First, because we would like to observe common precedence rules for arithmetic operators:
Unary plus and minus are computed before any other part of the expression. This is handled by Primary.
Grouping by parentheses ( ) guarantees that a sub-expression inside the parentheses is computed before the rest of the expression. Handled by Primary.
An assignment of a value to a variable takes place after the expression on the right-hand-side of the = operator is computed. Handled by Primary.
Multiplication and division is computed before addition and subtraction. Handled by Term.
Addition and subtraction have lowest precedence. Handled by Expression.
Consider the following two expressions:
1 - 2 - 3 1 - (2 - 3)
The first yields negative 4. The second is positive 2. The results are computed according to the AST structures generated by the parser:
1 - 2 - 3 1 - (2 - 3) - - / \ / \ - 3 1 - / \ / \ 1 2 2 3
The difference between these two expressions is driven by the operator associativity. Binary minus has left-to-right associativity. Left-to-right means that without parentheses 1 - 2 is calculated first, then 3 is subtracted. Natuarally, parentheses change the order of computation. Parentheses allow the user to override the conventional associativity and precedence of arithmetic operators.
The Lexer class handles operator precedence and associativity according to the implementation of the grammar rules. The program already generates correct ASTs for both cases as shown.
The current demo version of the desktop calculator displays the structure of the AST on the console screen after parsing the input.
Instead of printing the AST, change the program to evaluate the expression and show its result. Add support for the binary "%" remainder (modulo) operator. Add support for variables and the assignment operator "=" (see detailed instructions below.)
Helpful ideas:
The AST is a tree made up of AstNode objects. The print() method in AstNode is an example of a recursive method call.
Add your own method evaluate() to the AstNode class. The method should return a double which you can display as a result.
The evaluate() method should examine the type of the node. For example,
If it's a number, convert the value to a double and return the result.
For a node that represents an unary operator, call evaluate() method on the right-hand-side node and return the result. For the node that represents unary minus, negate the result prior to returning it. For example,
if ( type == '-' && leftNode == null ) {
return -rightNode.evaluate();
}
Note any unary operator has its left-hand-side node set to null. Use this knowledge to distinguish between the unary and the binary types of plus and minus operators.
For a node that represents a binary operator, call evaluate() methods on both sides of the node. Combine the results according to the binary operator of the current node. For example,
if ( type == '+' ) { return leftNode.evaluate() + rightNode.evaluate(); } else if ( type == '-' ) { return leftNode.evaluate() - rightNode.evaluate(); } else if ... // and so on...
Be careful with the division operator. What happens when you divide by zero? Are you willing to rely on the default behavior, warn the user, or both?
Implement the assignment operator for variables. For example, the user wants to create and use variables which persist in the calculator's memory:
x = 1 y = 2 z = x + y
The result of the last expression is 3. Additionally, the user can continue using and updating the variables x, y, and z in newly entered expressions.
Optional: once created, how could you get rid of an existing variable? Notice how the main() method handles a quit command. You could add additional commands to the program as necessary. For example, a special command which tells the calculator to remove all or one specific variable can be very helpful.
Open AstNode class in source editor. Add new static attribute specifying that the node represents a variable. Programming languages often refer to the variable names and other names as identifiers:
public static final int IDENTIFIER = 'I';
To store values of the variables, we need a symbol table. For example, we can add the symbol table to the AstNode class:
public class AstNode { //... static public HashMap<String, Double> symbolTable = new HashMap<>();
Note that the symbol table is declared static to persist between subsequent expressions entered by the user. For example, the result of assigning x a value is that x can be used later in other expressions:
> x = 2 + 5 // assign value to x > x // use x (this simply prints 7 as the result)
Since lexer now has to use the symbol table, we can add it as an attribute to the Lexer class:
public class Lexer { //... HashMap<String, Double> symbolTable; //...
and use Lexer constructor to take the reference to the symbol table:
// modified Lexer constructor: public Lexer(StringBuilder text, HashMap<String, Double> symbolTable) { tokenStream = new TokenStream(text); this.symbolTable = symbolTable; }//Lexer
The main() method needs an update to call the Lexer constructor with two parameters:
public static void main(String[] args) { //... Lexer lexer = new Lexer( input, AstNode.symbolTable );
Add new method named isIdentifier() to the TokenStream class:
public boolean isIdentifier() { if ( isEnd() ) { return false; } return Character.isAlphabetic( tokens[ tokenIdx ].charAt( 0 ) ); }//isString
This method should make it easy to detect variables in the stream of tokens.
The proposed grammar rules suggest that the variable assignment should be handled at the primaryRule() level. For example,
private AstNode primaryRule() {
if (tokenStream.isEnd()) {
return new AstNode(
AstNode.ERROR,
"Error: unexpected end of input"
);
}
// New code to handle variables and variable assignment:
if (tokenStream.isIdentifier()) {
AstNode primary = new AstNode(AstNode.IDENTIFIER, tokenStream.getString());
tokenStream.next();
if (tokenStream.isEnd()) {
// nothing else left in the input stream
return primary;
}
if ( tokenStream.isString("=") ) {
AstNode fork = new AstNode(
tokenStream.getString().charAt(0), // =
tokenStream.getString()
);
tokenStream.next();
if (tokenStream.isEnd()) {
return new AstNode(
AstNode.ERROR,
"Error: bad ("+fork.value+") operator syntax"
);
}
fork.leftNode = primary;
fork.rightNode = expressionRule();
return fork;
}
return primary;
}
//...
At this point it should be possible to build the project and do a little bit of testing. For example, I am able to produce the printout like this:
> x = 2 + ( 5 - 3 ) = .x .+ ..2 ..- ...5 ...3 Success!
Finally, we can handle new node types in AstNode evaluate() method as follows:
public double evaluate() { switch ( type ) { case IDENTIFIER: if ( symbolTable.containsKey( value ) ) { return symbolTable.get( value ); } System.out.println("\n Undefined variable "+value ); break; case '=': { double rightValue = rightNode.evaluate(); symbolTable.put( leftNode.value, rightValue ); return rightValue; } //...
Upload your Java source files. For simplicity, before submitting, ZIP your src folder containing the Java source code and submit it as a single ZIP archive.
Do not send your project, .class, .jar files, or any other artifacts on your system. I only need your Java source to grade this assignment.