Java – How to get the line number in the @init action of an ANTLR3 tree parser

In ANTLR version 3, how can I obtain the line number in the @init action of a high-level tree parser rule?

For example, in the @init action below, I want to push the line number along with the sentence text:

sentence
    @init { m_nodeVisitor.pushScriptContext( new MyScriptContext( $sentence.text )); }
    : assignCommand 
    | actionCommand;
    finally {
        m_nodeVisitor.popScriptContext();
    }

I need to push the context before the actions associated with the symbols in the rule are executed.
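The push-before, pop-in-finally pattern described above can be sketched in plain Java, independent of ANTLR. The ScriptContext and NodeVisitor classes here are hypothetical stand-ins for MyScriptContext and the visitor in the question, and the line number is simply passed in as a parameter:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class ScriptContextDemo {
    /** Hypothetical context record: the sentence text plus its line number. */
    static final class ScriptContext {
        final String text;
        final int line;

        ScriptContext(String text, int line) {
            this.text = text;
            this.line = line;
        }
    }

    /** Hypothetical visitor that keeps a stack of active contexts. */
    static final class NodeVisitor {
        private final Deque<ScriptContext> contexts = new ArrayDeque<>();

        void pushScriptContext(ScriptContext c) { contexts.push(c); }
        void popScriptContext() { contexts.pop(); }
        ScriptContext current() { return contexts.peek(); }
        int depth() { return contexts.size(); }
    }

    /** Mirrors the rule: push first (like @init), run the body, pop in finally. */
    static String visitSentence(NodeVisitor v, String text, int line) {
        v.pushScriptContext(new ScriptContext(text, line));
        try {
            // Anything executed here can already see the pushed context.
            ScriptContext c = v.current();
            return c.line + ": " + c.text;
        } finally {
            // Runs even if the body throws, so the stack stays balanced.
            v.popScriptContext();
        }
    }

    public static void main(String[] args) {
        NodeVisitor v = new NodeVisitor();
        System.out.println(visitSentence(v, "ABC = 123", 4));
        System.out.println("depth after visit: " + v.depth());
    }
}
```

The only missing piece in the real tree grammar is where the line value comes from, which is what the question is about.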

Some things that don't work:

- Using $sentence.line: it is undefined, even though $sentence.text is defined.
- Moving the push into the rule's actions. Placed before the rule, no token in the rule is available yet. Placed after the rule, the action runs after the actions associated with the rule's symbols.
- Using this expression in the @init action, which compiles but returns 0: getTreeNodeStream().getTreeAdaptor().getToken($sentence.start).getLine(). Edit: actually, this does work if $sentence.start is either a real token or an imaginary token with a reference; see Bart Kiers' answer below.

It seems that I can easily get the matched text and the first matched token in the @init action, so there should be a simple way to get the line number.

Solution

You can look ahead 1 step in the token/tree stream of a tree grammar with CommonTree ahead = (CommonTree)input.LT(1); and you can place this in the @init section.

Every CommonTree (the default Tree implementation in ANTLR) has a getToken() method that returns the Token associated with the tree. And each Token has a getLine() method which, not surprisingly, returns the line number of the token.

Therefore, if you do the following:

sentence
@init {
  CommonTree ahead = (CommonTree)input.LT(1);
  int line = ahead.getToken().getLine();
  System.out.println("line=" + line);
}
  :  assignCommand 
  |  actionCommand
  ;

You should see line numbers being printed, though not always the correct ones, because this won't go as planned in all cases. Let me demonstrate using a simple example grammar:

grammar ASTDemo;

options { 
  output=AST;
}

tokens {
  ROOT;
  ACTION;
}

parse
  :  sentence+ EOF -> ^(ROOT sentence+)
  ;

sentence
  :  assignCommand 
  |  actionCommand
  ;

assignCommand
  :  ID ASSIGN NUMBER -> ^(ASSIGN ID NUMBER)
  ;

actionCommand
  :  action ID -> ^(ACTION action ID)
  ;

action
  :  START
  |  STOP
  ;

ASSIGN : '=';
START  : 'start';
STOP   : 'stop';
ID     : ('a'..'z' | 'A'..'Z')+;
NUMBER : '0'..'9'+;
SPACE  : (' ' | '\t' | '\r' | '\n')+ {skip();};

The tree grammar is as follows:

tree grammar ASTDemoWalker;

options {
  output=AST;
  tokenVocab=ASTDemo;
  ASTLabelType=CommonTree;
}


walk
  :  ^(ROOT sentence+)
  ;

sentence
@init {
  CommonTree ahead = (CommonTree)input.LT(1);
  int line = ahead.getToken().getLine();
  System.out.println("line=" + line);
}
  :  assignCommand 
  |  actionCommand
  ;

assignCommand
  :  ^(ASSIGN ID NUMBER)
  ;

actionCommand
  :  ^(ACTION action ID)
  ;

action
  :  START
  |  STOP
  ;

If you run the following test class:

import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;

public class Main {
  public static void main(String[] args) throws Exception {
    String src = "\n\n\nABC = 123\n\nstart ABC";
    ASTDemoLexer lexer = new ASTDemoLexer(new ANTLRStringStream(src));
    ASTDemoParser parser = new ASTDemoParser(new CommonTokenStream(lexer));
    CommonTree root = (CommonTree)parser.parse().getTree();
    ASTDemoWalker walker = new ASTDemoWalker(new CommonTreeNodeStream(root));
    walker.walk();
  }
}

You will see the following printed out:

line=4
line=0

As you can see, "ABC = 123" produced the expected output (line 4), but "start ABC" did not (line 0). This is because the root of the actionCommand rule is an ACTION token, which is never defined in the lexer, only in the tokens {...} block. Because it does not really exist in the input, it gets line 0 by default. If you want to change the line number, you need to provide a "reference" token as a parameter to this so-called imaginary ACTION token, from which it copies its attributes.

So, if you change the actionCommand rule in the combined grammar to:

actionCommand
  :  ref=action ID -> ^(ACTION[$ref.start] action ID)
  ;

the line number will be as expected (line 6).
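As a quick sanity check on the expected values 4 and 6, independent of ANTLR, the following plain-Java sketch counts newlines in the test input. It assumes 1-based line numbering, which is how the lexer numbers lines; the lineOf helper is a hypothetical name introduced only for this illustration:

```java
public class LineOf {
    /** Returns the 1-based line of the first occurrence of needle in src, or -1 if absent. */
    static int lineOf(String src, String needle) {
        int idx = src.indexOf(needle);
        if (idx < 0) {
            return -1;
        }
        int line = 1;
        // Every newline before the match moves the match one line down.
        for (int i = 0; i < idx; i++) {
            if (src.charAt(i) == '\n') {
                line++;
            }
        }
        return line;
    }

    public static void main(String[] args) {
        String src = "\n\n\nABC = 123\n\nstart ABC";
        System.out.println("line=" + lineOf(src, "ABC = 123")); // line=4
        System.out.println("line=" + lineOf(src, "start"));     // line=6
    }
}
```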

Note that every parser rule has a start and a stop attribute, which refer to the first and last token of the rule, respectively. If action were a lexer rule (say FOO), you could omit the .start:

actionCommand
  :  ref=FOO ID -> ^(ACTION[$ref] FOO ID)
  ;

Now the ACTION token has copied all attributes from whatever $ref points to, except the type of the token, which is of course int ACTION. But this also means that it copied the text attribute, so in my example, the AST created by ref=action ID -> ^(ACTION[$ref.start] action ID) could look like this:

[text=START,type=ACTION]
                  /         \
                 /           \
                /             \
   [text=START,type=START]  [text=ABC,type=ID]

Of course, it is a proper AST because the node types are unique, but it makes debugging confusing because ACTION and START share the same text attribute.

You can copy all attributes to the imaginary token except text and type by providing a second string parameter, like this:

actionCommand
  :  ref=action ID -> ^(ACTION[$ref.start,"Action"] action ID)
  ;

If you run the same test class again now, you will see the following:

line=4
line=6

If you check the generated tree, it will look like this:

[type=ROOT,text='ROOT']
  [type=ASSIGN,text='=']
    [type=ID,text='ABC']
    [type=NUMBER,text='123']
  [type=ACTION,text='Action']
    [type=START,text='start']
    [type=ID,text='ABC']
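The three ways of creating the imaginary ACTION token discussed above can be modeled in plain Java. This is only a sketch of the copying behavior described in the answer: the Tok class and the imaginary helpers are simplified, hypothetical stand-ins, not the ANTLR runtime classes, and the type codes are made up.

```java
public class ImaginaryTokenDemo {
    static final int START = 1;  // hypothetical type codes
    static final int ACTION = 2;

    /** Simplified stand-in for a token; NOT the ANTLR runtime class. */
    static final class Tok {
        int type;
        String text;
        int line; // never assigned when the token has no input position, so it stays 0

        Tok(int type, String text) {
            this.type = type;
            this.text = text;
        }

        /** Copy constructor: duplicates every attribute of the reference token. */
        Tok(Tok ref) {
            this.type = ref.type;
            this.text = ref.text;
            this.line = ref.line;
        }

        @Override
        public String toString() {
            return "[type=" + type + ",text='" + text + "',line=" + line + "]";
        }
    }

    /** Models plain ACTION: no reference token, so there is no position to copy. */
    static Tok imaginary(int type, String text) {
        return new Tok(type, text);
    }

    /** Models ACTION[$ref.start]: copy everything, then override only the type. */
    static Tok imaginary(int type, Tok ref) {
        Tok t = new Tok(ref);
        t.type = type;
        return t;
    }

    /** Models ACTION[$ref.start,"Action"]: additionally override the text. */
    static Tok imaginary(int type, Tok ref, String text) {
        Tok t = imaginary(type, ref);
        t.text = text;
        return t;
    }

    public static void main(String[] args) {
        Tok start = new Tok(START, "start");
        start.line = 6; // the position a lexer would have recorded for this token

        System.out.println(imaginary(ACTION, "ACTION"));        // line stays 0
        System.out.println(imaginary(ACTION, start));           // line 6, text copied
        System.out.println(imaginary(ACTION, start, "Action")); // line 6, own text
    }
}
```

In this model only the third form gives a node with both the right line and a text that differs from its START child, which matches the tree dump above.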