Class: EBNF::Rule

Inherits:

Object

Defined in: lib/ebnf/rule.rb

Overview

Represent individual parsed rules

Constant Summary

BNF_OPS =

Operations which are flattened to separate rules in to_bnf.

%w{
  alt diff not opt plus rept seq star
}.map(&:to_sym).freeze

TERM_OPS =

%w{
  hex istr range
}.map(&:to_sym).freeze

OP_ARGN =

The number of arguments expected per operator; nil if unspecified.

{
  alt: nil,
  diff: 2,
  hex: 1,
  istr: 1,
  not: 1,
  opt: 1,
  plus: 1,
  range: 1,
  rept: 3,
  seq: nil,
  star: 1
}
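
The table above can be used to check an expression's operand count. The following is a minimal sketch, assuming a hypothetical helper `arity_ok?` that is not part of EBNF::Rule and that works on a local copy of the table:

```ruby
# Hypothetical helper, not part of EBNF::Rule: checks an expression's
# operand count against a local copy of the OP_ARGN table shown above.
def arity_ok?(expr)
  op_argn = {
    alt: nil, diff: 2, hex: 1, istr: 1, not: 1, opt: 1,
    plus: 1, range: 1, rept: 3, seq: nil, star: 1
  }
  op, *operands = expr
  return false unless op_argn.key?(op) # unknown operator

  expected = op_argn[op]
  # nil means the count is unspecified, so any number of operands passes
  expected.nil? || expected == operands.length
end

p arity_ok?([:diff, :CHAR, "]"]) # two operands, as the table requires
p arity_ok?([:opt, :a, :b])      # one operand too many
p arity_ok?([:seq])              # unspecified count, zero operands allowed
```

Note that for rept the three counted operands are the minimum, the maximum, and the repeated expression.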

Instance Attribute Summary

#cleanup ⇒ Object

Determines preparation and cleanup rules for reconstituting EBNF ? * + from BNF.
#comp ⇒ Rule

A comprehension is a sequence which contains all elements but the first of the original rule.
#expr ⇒ Array

Rule expression.
#first ⇒ Array<Rule> readonly

Terminals that immediately precede this rule.
#follow ⇒ Array<Rule> readonly

Terminals that immediately follow this rule.
#id ⇒ String

ID of rule.
#kind ⇒ :rule, ...

Kind of rule.
#orig ⇒ String

Original EBNF.
#start ⇒ Boolean

Indicates that this is a starting rule.
#sym ⇒ Symbol

Symbol of rule.

Class Method Summary

.from_sxp(sxp) ⇒ Rule

Return a rule from its SXP representation.

Instance Method Summary

#<=>(other) ⇒ Object

Rules compare using their ids.
#==(other) ⇒ Boolean

Two rules are equal if they have the same #sym, #kind and #expr.
#add_first(terminals) ⇒ Integer

Add terminals as preceding this rule.
#add_follow(terminals) ⇒ Integer

Add terminal as following this rule.
#alt? ⇒ Boolean

Is this rule of the form (alt …)?
#build(expr, kind: nil, cleanup: nil, **options) ⇒ Object

Build a new rule, creating a symbol and numbering from the current rule. Symbol and number creation is handled by the top-most rule in such a chain.
#eql?(other) ⇒ Boolean

Two rules are equivalent if they have the same #expr.
#first_includes_eps? ⇒ Boolean

Do the firsts of this rule include the empty string?
#for_sxp ⇒ Array

Return representation for building S-Expressions.
#initialize(sym, id, expr, kind: nil, ebnf: nil, first: nil, follow: nil, start: nil, top_rule: nil, cleanup: nil) ⇒ Rule constructor

A new instance of Rule.
#inspect ⇒ Object
#non_terminals(ast, expr = @expr) ⇒ Array<Rule>

Return the non-terminals for this rule.
#pass? ⇒ Boolean

Is this a pass?
#rule? ⇒ Boolean

Is this a rule?
#seq? ⇒ Boolean

Is this rule of the form (seq …)?
#starts_with?(sym) ⇒ Array<Symbol, String>

Does this rule start with sym? It does if expr is that sym, expr starts with alt and contains that sym, or expr starts with seq and the next element is that sym.
#symbols(expr = @expr) ⇒ Array<Rule>

Return the symbols used in the rule.
#terminal? ⇒ Boolean

Is this a terminal?
#terminals(ast, expr = @expr) ⇒ Array<Rule>

Return the terminals for this rule.
#to_bnf ⇒ Array<Rule>

Transform EBNF rule to BNF rules.
#to_peg ⇒ Array<Rule>

Transform EBNF rule for PEG.
#to_regexp ⇒ Regexp

For :hex or :range, create a regular expression.
#to_ruby ⇒ String

Return a Ruby representation of this rule.
#to_sxp(**options) ⇒ String (also: #to_s)

Return SXP representation of this rule.
#to_ttl ⇒ String

Serializes this rule to Turtle.
#translate_codepoints(str) ⇒ Object

Utility function to translate code points of the form ‘#xN’ into Ruby Unicode characters.
#valid?(ast) ⇒ Boolean

Validate the rule, with respect to an AST.
#validate!(ast, expr = @expr) ⇒ Object

Validate the rule, with respect to an AST.

Constructor Details

#initialize(sym, id, expr, kind: nil, ebnf: nil, first: nil, follow: nil, start: nil, top_rule: nil, cleanup: nil) ⇒ `Rule`

Returns a new instance of Rule.

Parameters:

sym (Symbol, nil) —

nil is allowed only for @pass or @terminals
id (String, nil)
expr (Array) —
The expression is an internal representation of an S-Expression with one of the following operators:
- alt – A list of alternative rules, which are attempted in order. It terminates with the first matching rule, or is terminated as unmatched, if no such rule is found.
- diff – matches any string that matches A but does not match B.
- hex – A single character represented using the hexadecimal notation #xnn.
- istr – A string which matches in a case-insensitive manner, so that (istr "fOo") will match either of the strings "foo", "FOO" or any other combination.
- opt – An optional rule or terminal. It either results in the matching rule or returns nil.
- plus – A sequence of one or more of the matching rule. If there is no such rule, it is terminated as unmatched; otherwise, the result is an array containing all matched input.
- range – A range of characters, possibly repeated, of the form (range "a-z"). May also use hexadecimal notation.
- rept m n – A sequence of at least m and at most n of the matching rule. It will always return an array.
- seq – A sequence of rules or terminals. If any (other than opt or star) do not parse, the rule is terminated as unmatched.
- star – A sequence of zero or more of the matching rule. It will always return an array.
kind (:rule, :terminal, :terminals, :pass) (defaults to: nil) —

(nil)
ebnf (String) (defaults to: nil) —

(nil) When parsing, records the EBNF string used to create the rule.
first (Array) (defaults to: nil) —

(nil) Recorded set of terminals that can precede this rule (LL(1))
follow (Array) (defaults to: nil) —

(nil) Recorded set of terminals that can follow this rule (LL(1))
start (Boolean) (defaults to: nil) —

(nil) Is this the starting rule for the grammar?
top_rule (Rule) (defaults to: nil) —

(nil) The top-most rule. All expressed rules are top-rules, derived rules have the original rule as their top-rule.
cleanup (Boolean) (defaults to: nil) —

(nil) Records information useful for cleaning up converted :plus and :star expansions (LL(1)).

Raises:

(ArgumentError)

# File 'lib/ebnf/rule.rb', line 108

def initialize(sym, id, expr, kind: nil, ebnf: nil, first: nil, follow: nil, start: nil, top_rule: nil, cleanup: nil)
  @sym, @id = sym, id
  @expr = expr.is_a?(Array) ? expr : [:seq, expr].compact
  @ebnf, @kind, @first, @follow, @start, @cleanup, @top_rule = ebnf, kind, first, follow, start, cleanup, top_rule
  @top_rule ||= self
  @kind ||= case
  when sym.to_s == sym.to_s.upcase then :terminal
  when !BNF_OPS.include?(@expr.first) then :terminal
  else :rule
  end

  # Allow @pass and @terminals to not be named
  @sym ||= :_pass if @kind == :pass
  @sym ||= :_terminals if @kind == :terminals

  raise ArgumentError, "Rule sym must be a symbol, was #{@sym.inspect}" unless @sym.is_a?(Symbol)
  raise ArgumentError, "Rule id must be a string or nil, was #{@id.inspect}" unless (@id || "").is_a?(String)
  raise ArgumentError, "Rule kind must be one of :rule, :terminal, :terminals, or :pass, was #{@kind.inspect}" unless
    @kind.is_a?(Symbol) && %w(rule terminal terminals pass).map(&:to_sym).include?(@kind)

  case @expr.first
  when :alt
    raise ArgumentError, "#{@expr.first} operation must have at least one operand, had #{@expr.length - 1}" unless @expr.length > 1
  when :diff
    raise ArgumentError, "#{@expr.first} operation must have exactly two operands, had #{@expr.length - 1}" unless @expr.length == 3
  when :hex, :istr, :not, :opt, :plus, :range, :star
    raise ArgumentError, "#{@expr.first} operation must have exactly one operand, had #{@expr.length - 1}" unless @expr.length == 2
  when :rept
    raise ArgumentError, "#{@expr.first} operation must have exactly three, had #{@expr.length - 1}" unless @expr.length == 4
    raise ArgumentError, "#{@expr.first} operation must an non-negative integer minimum, was #{@expr[1]}" unless
      @expr[1].is_a?(Integer) && @expr[1] >= 0
    raise ArgumentError, "#{@expr.first} operation must an non-negative integer maximum or '*', was #{@expr[2]}" unless
      @expr[2] == '*' || @expr[2].is_a?(Integer) && @expr[2] >= 0
  when :seq
    # It's legal to have a zero-length sequence
  else
    raise ArgumentError, "Rule expression must be an array using a known operator, was #{@expr.first}"
  end
end
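
When no kind is given, the constructor infers one from the symbol and the operator. The following is a minimal sketch of that inference in isolation, assuming a hypothetical helper `infer_kind` (not part of EBNF::Rule) and a local copy of BNF_OPS:

```ruby
# Hypothetical helper mirroring the `@kind ||= case ...` branch of the
# constructor listing above; it is not part of EBNF::Rule.
def infer_kind(sym, expr)
  bnf_ops = %i[alt diff not opt plus rept seq star] # local copy of BNF_OPS
  name = sym.to_s
  if name == name.upcase
    :terminal # all-uppercase names are treated as terminals
  elsif !bnf_ops.include?(expr.first)
    :terminal # operators outside BNF_OPS (hex, istr, range) are terminal-only
  else
    :rule
  end
end

p infer_kind(:R_CHAR, [:diff, :CHAR, "]"])
p infer_kind(:ebnf, [:star, :declaration])
p infer_kind(:hexval, [:hex, "#x20"])
```

The first branch looks only at the name, so an uppercase symbol becomes a terminal regardless of its operator.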

Instance Attribute Details

#cleanup ⇒ `Object`

Determines preparation and cleanup rules for reconstituting EBNF ? * + from BNF



# File 'lib/ebnf/rule.rb', line 77

def cleanup
  @cleanup
end

#comp ⇒ `Rule`

A comprehension is a sequence which contains all elements but the first of the original rule.

Returns:

(Rule)



# File 'lib/ebnf/rule.rb', line 44

def comp
  @comp
end

#expr ⇒ `Array`

Rule expression

Returns:

(Array)



# File 'lib/ebnf/rule.rb', line 54

def expr
  @expr
end

#first ⇒ `Array<Rule>` (readonly)

Terminals that immediately precede this rule

Returns:

(Array<Rule>)



# File 'lib/ebnf/rule.rb', line 64

def first
  @first
end

#follow ⇒ `Array<Rule>` (readonly)

Terminals that immediately follow this rule

Returns:

(Array<Rule>)



# File 'lib/ebnf/rule.rb', line 69

def follow
  @follow
end

#id ⇒ `String`

ID of rule

Returns:

(String)



# File 'lib/ebnf/rule.rb', line 39

def id
  @id
end

#kind ⇒ `:rule`, ...

Kind of rule

Returns:

(:rule, :terminal, :terminals, or :pass)



# File 'lib/ebnf/rule.rb', line 49

def kind
  @kind
end

#orig ⇒ `String`

Original EBNF

Returns:

(String)



# File 'lib/ebnf/rule.rb', line 59

def orig
  @orig
end

#start ⇒ `Boolean`

Indicates that this is a starting rule

Returns:

(Boolean)



# File 'lib/ebnf/rule.rb', line 74

def start
  @start
end

#sym ⇒ `Symbol`

Symbol of rule

Returns:

(Symbol)



# File 'lib/ebnf/rule.rb', line 35

def sym
  @sym
end

Class Method Details

.from_sxp(sxp) ⇒ `Rule`

Return a rule from its SXP representation:

Also may have (first ...), (follow ...), or (start #t).

Examples:

inputs

(pass _pass (plus (range "#x20\\t\\r\\n")))
(rule ebnf "1" (star (alt declaration rule)))
(terminal R_CHAR "19" (diff CHAR (alt "]" "-")))

Parameters:

sxp (String, Array)

Returns:

(Rule)

# File 'lib/ebnf/rule.rb', line 160

def self.from_sxp(sxp)
  if sxp.is_a?(String)
    sxp = SXP.parse(sxp)
  end
  expr = sxp.detect {|e| e.is_a?(Array) && ![:first, :follow, :start].include?(e.first.to_sym)}
  first = sxp.detect {|e| e.is_a?(Array) && e.first.to_sym == :first}
  first = first[1..-1] if first
  follow = sxp.detect {|e| e.is_a?(Array) && e.first.to_sym == :follow}
  follow = follow[1..-1] if follow
  cleanup = sxp.detect {|e| e.is_a?(Array) && e.first.to_sym == :cleanup}
  cleanup = cleanup[1..-1] if cleanup
  start = sxp.any? {|e| e.is_a?(Array) && e.first.to_sym == :start}
  sym = sxp[1] if sxp[1].is_a?(Symbol)
  id = sxp[2] if sxp[2].is_a?(String)
  self.new(sym, id, expr, kind: sxp.first, first: first, follow: follow, cleanup: cleanup, start: start)
end
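
The listing above picks a parsed S-Expression apart by position and by leading symbol. The following is a minimal sketch of that decomposition on a plain nested array, assuming the input has already been parsed into symbols, strings, and arrays. The helper `split_sxp` is hypothetical, and unlike the listing it also skips a (cleanup ...) entry when looking for the expression:

```ruby
# Hypothetical helper, not part of EBNF::Rule: splits an already-parsed
# S-Expression (nested arrays) into the pieces the constructor expects.
def split_sxp(sxp)
  annotations = %i[first follow start cleanup]
  # The expression is the first array that is not an annotation
  expr   = sxp.detect { |e| e.is_a?(Array) && !annotations.include?(e.first) }
  first  = sxp.detect { |e| e.is_a?(Array) && e.first == :first }
  follow = sxp.detect { |e| e.is_a?(Array) && e.first == :follow }
  {
    kind:   sxp[0],
    sym:    (sxp[1] if sxp[1].is_a?(Symbol)),
    id:     (sxp[2] if sxp[2].is_a?(String)),
    first:  first && first[1..-1],   # drop the leading :first marker
    follow: follow && follow[1..-1], # drop the leading :follow marker
    start:  sxp.any? { |e| e.is_a?(Array) && e.first == :start },
    expr:   expr
  }
end

parts = split_sxp([:rule, :ebnf, "1", [:start, true], [:first, :_eps],
                   [:star, [:alt, :declaration, :rule]]])
p parts
```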

Instance Method Details

#<=>(other) ⇒ `Object`

Rules compare using their ids

# File 'lib/ebnf/rule.rb', line 434

def <=>(other)
  if id && other.id
    if id == other.id
      id.to_s <=> other.id.to_s
    else
      id.to_f <=> other.id.to_f
    end
  else
    sym.to_s <=> other.sym.to_s
  end
end
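
Because ids are strings such as "2" or "10", the listing compares them numerically through to_f rather than lexically. The following is a minimal sketch of the same ordering on plain hashes, assuming a hypothetical helper `compare_rules` that is not part of EBNF::Rule:

```ruby
# Hypothetical stand-in for Rule#<=>, operating on hashes with :sym and :id.
def compare_rules(a, b)
  if a[:id] && b[:id]
    # Identical ids compare equal; otherwise compare numerically
    a[:id] == b[:id] ? 0 : (a[:id].to_f <=> b[:id].to_f)
  else
    # Without ids on both sides, fall back to the symbol names
    a[:sym].to_s <=> b[:sym].to_s
  end
end

rules = [{ sym: :c, id: "10" }, { sym: :b, id: "2" }, { sym: :a, id: "2.1" }]
p rules.sort { |x, y| compare_rules(x, y) }.map { |r| r[:id] }
p rules.map { |r| r[:id] }.sort # lexical order, for contrast
```

One consequence of using to_f is that ids such as "2.1" and "2.10" convert to the same number and therefore compare as equal.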

#==(other) ⇒ `Boolean`

Two rules are equal if they have the same #sym, #kind and #expr.

Parameters:

other (Rule)

Returns:

(Boolean)

# File 'lib/ebnf/rule.rb', line 418

def ==(other)
  other.is_a?(Rule) &&
  sym   == other.sym &&
  kind  == other.kind &&
  expr  == other.expr
end

#add_first(terminals) ⇒ `Integer`

Add terminals as preceding this rule.

Parameters:

terminals (Array<Rule, Symbol, String>)

Returns:

(Integer) —

the number of terminals added

# File 'lib/ebnf/rule.rb', line 657

def add_first(terminals)
  @first ||= []
  terminals = terminals.map {|t| t.is_a?(Rule) ? t.sym : t} - @first
  @first += terminals
  terminals.length
end

#add_follow(terminals) ⇒ `Integer`

Add terminal as following this rule. Don’t add _eps as a follow

Parameters:

terminals (Array<Rule, Symbol, String>)

Returns:

(Integer) —

the number of terminals added

# File 'lib/ebnf/rule.rb', line 668

def add_follow(terminals)
  # Remove terminals already in follows, and empty string
  terminals = terminals.map {|t| t.is_a?(Rule) ? t.sym : t} - (@follow || []) - [:_eps]
  unless terminals.empty?
    @follow ||= []
    @follow += terminals
  end
  terminals.length
end
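
Both #add_first and #add_follow use array difference to keep only terminals that are not yet recorded, and they return how many were added. The following is a minimal sketch of that accumulation on plain arrays, assuming a hypothetical helper `add_terminals` that is not part of EBNF::Rule:

```ruby
# Hypothetical helper: returns the updated set and the number of additions.
# `skip` lists entries that must never be recorded (e.g. :_eps for follows).
def add_terminals(existing, incoming, skip: [])
  fresh = incoming - existing - skip
  [existing + fresh, fresh.length]
end

follow, added = add_terminals([:a], [:a, :b, :_eps], skip: [:_eps])
p follow
p added
```

A caller iterating to a fixed point, as LL(1) first/follow computation typically does, can stop once a full pass reports zero additions.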

#alt? ⇒ `Boolean`

Is this rule of the form (alt …)?

Returns:

(Boolean)



# File 'lib/ebnf/rule.rb', line 399

def alt?
  expr.is_a?(Array) && expr.first == :alt
end

#build(expr, kind: nil, cleanup: nil, **options) ⇒ `Object`

Build a new rule, creating a symbol and numbering from the current rule. Symbol and number creation is handled by the top-most rule in such a chain.

Parameters:

expr (Array)
kind (Symbol) (defaults to: nil) —

(nil)
cleanup (Hash{Symbol => Symbol}) (defaults to: nil) —

(nil)
options (Hash{Symbol => Object})

# File 'lib/ebnf/rule.rb', line 184

def build(expr, kind: nil, cleanup: nil, **options)
  new_sym, new_id = @top_rule.send(:make_sym_id)
  self.class.new(new_sym, new_id, expr,
                 kind: kind,
                 ebnf: @ebnf,
                 top_rule: @top_rule,
                 cleanup: cleanup,
                 **options)
end

#eql?(other) ⇒ `Boolean`

Two rules are equivalent if they have the same #expr.

Parameters:

other (Rule)

Returns:

(Boolean)



# File 'lib/ebnf/rule.rb', line 429

def eql?(other)
  expr == other.expr
end

#first_includes_eps? ⇒ `Boolean`

Do the firsts of this rule include the empty string?

Returns:

(Boolean)



# File 'lib/ebnf/rule.rb', line 649

def first_includes_eps?
  @first && @first.include?(:_eps)
end

#for_sxp ⇒ `Array`

Return representation for building S-Expressions.

Returns:

(Array)

# File 'lib/ebnf/rule.rb', line 197

def for_sxp
  elements = [kind, sym]
  elements << id if id
  elements << [:start, true] if start
  elements << first.sort_by(&:to_s).unshift(:first) if first
  elements << follow.sort_by(&:to_s).unshift(:follow) if follow
  elements << [:cleanup, cleanup] if cleanup
  elements << expr
  elements
end

#inspect ⇒ `Object`

# File 'lib/ebnf/rule.rb', line 408

def inspect
  "#<EBNF::Rule:#{object_id} " +
  {sym: sym, id: id, kind: kind, expr: expr}.inspect +
  ">"
end

#non_terminals(ast, expr = @expr) ⇒ `Array<Rule>`

Note:

this is used for LL(1) transformation, so rule types are limited

Return the non-terminals for this rule.

alt => this is every non-terminal.
diff => this is every non-terminal.
hex => nil
istr => nil
not => this is the last expression, if any.
opt => this is the last expression, if any.
plus => this is the last expression, if any.
range => nil
rept => this is the last expression, if any.
seq => this is the first expression in the sequence, if any.
star => this is the last expression, if any.

Parameters:

ast (Array<Rule>) —

The set of rules, used to turn symbols into rules
expr (Array<Symbol,String,Array>) (defaults to: @expr) —

(@expr) The expression to check, defaults to the rule expression. Typically, if the expression is recursive, the embedded expression is called recursively.

Returns:

(Array<Rule>)

# File 'lib/ebnf/rule.rb', line 473

def non_terminals(ast, expr = @expr)
  ([:alt, :diff].include?(expr.first) ? expr[1..-1] : expr[1,1]).map do |sym|
    case sym
    when Symbol
      r = ast.detect {|r| r.sym == sym}
      r if r && r.rule?
    when Array
      non_terminals(ast, sym)
    else
      nil
    end
  end.flatten.compact.uniq
end
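
The first line of the listing decides which operands are inspected: every operand for alt and diff, and only the first operand for any other operator. The following is a minimal sketch of just that selection, assuming a hypothetical helper `examined_operands` that is not part of EBNF::Rule:

```ruby
# Hypothetical helper isolating the operand selection used by the listing.
def examined_operands(expr)
  %i[alt diff].include?(expr.first) ? expr[1..-1] : expr[1, 1]
end

p examined_operands([:alt, :a, :b, "c"]) # all operands
p examined_operands([:seq, :a, :b])      # only the first operand
p examined_operands([:seq])              # empty sequence, nothing to inspect
p examined_operands([:rept, 1, 3, :a])   # the first operand is the minimum count
```

As the listing is written, the operand inspected for rept is the integer minimum rather than the repeated expression, so it contributes nothing to the result.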

#pass? ⇒ `Boolean`

Is this a pass?

Returns:

(Boolean)



# File 'lib/ebnf/rule.rb', line 388

def pass?
  kind == :pass
end

#rule? ⇒ `Boolean`

Is this a rule?

Returns:

(Boolean)



# File 'lib/ebnf/rule.rb', line 394

def rule?
  kind == :rule
end

#seq? ⇒ `Boolean`

Is this rule of the form (seq …)?

Returns:

(Boolean)



# File 'lib/ebnf/rule.rb', line 404

def seq?
  expr.is_a?(Array) && expr.first == :seq
end

#starts_with?(sym) ⇒ `Array<Symbol, String>`

Does this rule start with sym? It does if expr is that sym, expr starts with alt and contains that sym, or expr starts with seq and the next element is that sym.

Parameters:

sym (Symbol, class) —

Symbol matching any start element, or if it is String, any start element which is a String

Returns:

(Array<Symbol, String>) —

list containing the matching symbol (singular), or the strings which are start symbols, or nil if there are none

# File 'lib/ebnf/rule.rb', line 550

def starts_with?(sym)
  if seq? && sym === (v = expr.fetch(1, nil))
    [v]
  elsif alt? && expr.any? {|e| sym === e}
    expr.select {|e| sym === e}
  else
    nil
  end
end
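
Because the listing matches with ===, the argument can be either a specific symbol or a class such as String. The following is a minimal sketch of the same test on plain expression arrays, assuming a hypothetical helper `starts_with` that is not part of EBNF::Rule:

```ruby
# Hypothetical stand-in for Rule#starts_with?, taking the expression directly.
def starts_with(expr, pattern)
  if expr.first == :seq && pattern === (v = expr.fetch(1, nil))
    [v] # a sequence starts with its first operand only
  elsif expr.first == :alt && expr.any? { |e| pattern === e }
    expr.select { |e| pattern === e } # any alternative may start the rule
  end
end

p starts_with([:seq, "a", :b], String)
p starts_with([:alt, :a, "x", "y"], String)
p starts_with([:alt, :a, :b], :b)
p starts_with([:seq, :a, :b], :b)
```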

#symbols(expr = @expr) ⇒ `Array<Rule>`

Return the symbols used in the rule.

Parameters:

expr (Array<Symbol,String,Array>) (defaults to: @expr) —

(@expr) The expression to check, defaults to the rule expression. Typically, if the expression is recursive, the embedded expression is called recursively.

Returns:

(Array<Rule>)

# File 'lib/ebnf/rule.rb', line 528

def symbols(expr = @expr)
  expr[1..-1].map do |sym|
    case sym
    when Symbol
      sym
    when Array
      symbols(sym)
    end
  end.flatten.compact.uniq
end

#terminal? ⇒ `Boolean`

Is this a terminal?

Returns:

(Boolean)



# File 'lib/ebnf/rule.rb', line 382

def terminal?
  kind == :terminal
end

#terminals(ast, expr = @expr) ⇒ `Array<Rule>`

Note:

this is used for LL(1) transformation, so rule types are limited

Return the terminals for this rule.

alt => this is every terminal.
diff => this is every terminal.
hex => nil
istr => nil
not => this is the last expression, if any.
opt => this is the last expression, if any.
plus => this is the last expression, if any.
range => nil
rept => this is the last expression, if any.
seq => this is the first expression in the sequence, if any.
star => this is the last expression, if any.

Parameters:

ast (Array<Rule>) —

The set of rules, used to turn symbols into rules
expr (Array<Symbol,String,Array>) (defaults to: @expr) —

(@expr) The expression to check, defaults to the rule expression. Typically, if the expression is recursive, the embedded expression is called recursively.

Returns:

(Array<Rule>)

# File 'lib/ebnf/rule.rb', line 508

def terminals(ast, expr = @expr)
  ([:alt, :diff].include?(expr.first) ? expr[1..-1] : expr[1,1]).map do |sym|
    case sym
    when Symbol
      r = ast.detect {|r| r.sym == sym}
      r if r && r.terminal?
    when String
      sym
    when Array
      terminals(ast, sym)
    end
  end.flatten.compact.uniq
end

#to_bnf ⇒ `Array<Rule>`

Transform EBNF rule to BNF rules:

Transform (rule a "n" (op1 (op2))) into two rules:

(rule a "n" (op1 _a_1))
(rule _a_1 "n.1" (op2))

Transform (rule a (opt b)) into (rule a (alt _empty b))
Transform (rule a (star b)) into (rule a (alt _empty (seq b a)))
Transform (rule a (plus b)) into (rule a (seq b (star b)))

Transformation includes information used to re-construct the non-transformed AST representation.

Returns:

(Array<Rule>)

# File 'lib/ebnf/rule.rb', line 258

def to_bnf
  return [self] unless rule?
  new_rules = []

  # Look for rules containing recursive definition and rewrite to multiple rules. If `expr` contains elements which are in array form, where the first element of that array is a symbol, create a new rule for it.
  if expr.any? {|e| e.is_a?(Array) && (BNF_OPS + TERM_OPS).include?(e.first)}
    #   * Transform (a [n] rule (op1 (op2))) into two rules:
    #     (a.1 [n.1] rule (op1 a.2))
    #     (a.2 [n.2] rule (op2))
    # duplicate ourselves for rewriting
    this = dup
    new_rules << this

    expr.each_with_index do |e, index|
      next unless e.is_a?(Array) && e.first.is_a?(Symbol)
      new_rule = build(e)
      this.expr[index] = new_rule.sym
      new_rules << new_rule
    end

    # Return new rules after recursively applying #to_bnf
    new_rules = new_rules.map {|r| r.to_bnf}.flatten
  elsif expr.first == :opt
    this = dup
    #   * Transform (rule a (opt b)) into (rule a (alt _empty b))
    this.expr = [:alt, :_empty, expr.last]
    this.cleanup = :opt
    new_rules = this.to_bnf
  elsif expr.first == :star
    #   * Transform (rule a (star b)) into (rule a (alt _empty (seq b a)))
    this = dup
    this.cleanup = :star
    new_rule = this.build([:seq, expr.last, this.sym], cleanup: :merge)
    this.expr = [:alt, :_empty, new_rule.sym]
    new_rules = [this] + new_rule.to_bnf
  elsif expr.first == :plus
    #   * Transform (rule a (plus b)) into (rule a (seq b (star b)))
    this = dup
    this.cleanup = :plus
    this.expr = [:seq, expr.last, [:star, expr.last]]
    new_rules = this.to_bnf
  elsif [:alt, :seq].include?(expr.first)
    # Otherwise, no further transformation necessary
    new_rules << self
  elsif [:diff, :hex, :range].include?(expr.first)
    # This rules are fine, they just need to be terminals
    raise "Encountered #{expr.first.inspect}, which is a #{self.kind}, not :terminal" unless self.terminal?
    new_rules << self
  else
    # Some case we didn't think of
    raise "Error trying to transform #{expr.inspect} to BNF"
  end
  
  return new_rules
end
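
The opt, star, and plus branches above each perform one rewrite step. The following is a minimal sketch of those single steps on plain arrays, assuming a hypothetical helper `rewrite_once` that is not part of EBNF::Rule. It takes the name for the generated rule as an argument instead of calling #build, and it does not recurse the way #to_bnf does:

```ruby
# Hypothetical helper: one rewrite step, returned as { rule_symbol => expr }.
# `fresh_sym` stands in for the symbol that #build would generate.
def rewrite_once(sym, expr, fresh_sym)
  case expr.first
  when :opt
    { sym => [:alt, :_empty, expr.last] }
  when :star
    # a ::= _empty | fresh ; fresh ::= b a
    { sym => [:alt, :_empty, fresh_sym], fresh_sym => [:seq, expr.last, sym] }
  when :plus
    # The nested star is flattened by a later pass in the real method
    { sym => [:seq, expr.last, [:star, expr.last]] }
  else
    { sym => expr }
  end
end

p rewrite_once(:a, [:opt, :b], :_a_1)
p rewrite_once(:a, [:star, :b], :_a_1)
p rewrite_once(:a, [:plus, :b], :_a_1)
```

Note that in the star case the listing moves the (seq b a) part into a generated rule, so the rewritten rule refers to it by symbol.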

#to_peg ⇒ `Array<Rule>`

Transform EBNF rule for PEG:

Transform (rule a "n" (op1 ... (op2 y) ...z)) into two rules:

(rule a "n" (op1 ... _a_1 ... z))
(rule _a_1 "n.1" (op2 y))

Transform (rule a "n" (diff op1 op2)) into two rules:

(rule a "n" (seq _a_1 op1))
(rule _a_1 "n.1" (not op2))

Returns:

(Array<Rule>)

# File 'lib/ebnf/rule.rb', line 327

def to_peg
  new_rules = []

  # Look for rules containing sub-sequences
  if expr.any? {|e| e.is_a?(Array) && e.first.is_a?(Symbol)}
    # duplicate ourselves for rewriting
    this = dup
    new_rules << this

    expr.each_with_index do |e, index|
      next unless e.is_a?(Array) && e.first.is_a?(Symbol)
      new_rule = build(e)
      this.expr[index] = new_rule.sym
      new_rules << new_rule
    end

    # Return new rules after recursively applying #to_peg
    new_rules = new_rules.map {|r| r.to_peg}.flatten
  elsif expr.first == :diff && !terminal?
    this = dup
    new_rule = build([:not, expr[2]])
    this.expr = [:seq, new_rule.sym, expr[1]]
    new_rules << this
    new_rules << new_rule
  elsif [:hex, :istr, :range].include?(expr.first)
    # This rules are fine, they just need to be terminals
    raise "Encountered #{expr.first.inspect}, which is a #{self.kind}, not :terminal" unless self.terminal?
    new_rules << self
  else
    new_rules << self
  end
  
  return new_rules.map {|r| r.extend(EBNF::PEG::Rule)}
end
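
For a non-terminal diff, the listing builds a negative lookahead on the second operand and then matches the first. The following is a minimal sketch of that one step on plain arrays, assuming a hypothetical helper `peg_diff` that is not part of EBNF::Rule and that takes the generated rule name as an argument:

```ruby
# Hypothetical helper: rewrites (diff a b) into "not b, then a".
# `fresh_sym` stands in for the symbol that #build would generate.
def peg_diff(sym, expr, fresh_sym)
  {
    sym       => [:seq, fresh_sym, expr[1]], # lookahead first, then match a
    fresh_sym => [:not, expr[2]]             # succeeds only if b does not match
  }
end

p peg_diff(:a, [:diff, :op1, :op2], :_a_1)
```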

#to_regexp ⇒ `Regexp`

For :hex or :range, create a regular expression.

Returns:

(Regexp)

# File 'lib/ebnf/rule.rb', line 366

def to_regexp
  case expr.first
  when :hex
    Regexp.new(Regexp.escape(translate_codepoints(expr[1])))
  when :istr
    /#{expr.last}/ui
  when :range
    Regexp.new("[#{escape_regexp_character_range(translate_codepoints(expr[1]))}]")
  else
    raise "Can't turn #{expr.inspect} into a regexp"
  end
end

#to_ruby ⇒ `String`

Return a Ruby representation of this rule

Returns:

(String)



# File 'lib/ebnf/rule.rb', line 239

def to_ruby
  "EBNF::Rule.new(#{sym.inspect}, #{id.inspect}, #{expr.inspect}#{', kind: ' + kind.inspect unless kind == :rule})"
end

#to_sxp(**options) ⇒ `String` Also known as: to_s

Return SXP representation of this rule

Returns:

(String)



# File 'lib/ebnf/rule.rb', line 211

def to_sxp(**options)
  for_sxp.to_sxp(**options)
end

#to_ttl ⇒ `String`

Serializes this rule to Turtle.

Returns:

(String)

# File 'lib/ebnf/rule.rb', line 220

def to_ttl
  @ebnf.debug("to_ttl") {inspect} if @ebnf
  statements = [%{:#{sym} rdfs:label "#{sym}";}]
  if orig
    comment = orig.to_s.strip.
      gsub(/"""/, '\"\"\"').
      gsub("\\", "\\\\").
      sub(/^\"/, '\"').
      sub(/\"$/m, '\"')
    statements << %{  rdfs:comment #{comment.inspect};}
  end
  statements << %{  dc:identifier "#{id}";} if id
  
  statements += ttl_expr(expr, terminal? ? "re" : "g", 1, false)
  "\n" + statements.join("\n")
end

#translate_codepoints(str) ⇒ `Object`

Utility function to translate code points of the form ‘#xN’ into Ruby Unicode characters



# File 'lib/ebnf/rule.rb', line 448

def translate_codepoints(str)
  str.gsub(/#x\h+/) {|c| c[2..-1].scanf("%x").first.chr(Encoding::UTF_8)}
end
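
The translation can be tried in isolation. The following is a minimal sketch, assuming a hypothetical helper `translate_codepoints_sketch` that swaps the listing's scanf call for String#hex so that no extra library is needed. It also shows the translated text feeding a regular expression in the way #to_regexp does for :hex, with a simplified character class for :range that skips the escaping step:

```ruby
# Hypothetical variant of translate_codepoints using String#hex, not scanf.
def translate_codepoints_sketch(str)
  str.gsub(/#x\h+/) { |c| c[2..-1].hex.chr(Encoding::UTF_8) }
end

range  = translate_codepoints_sketch("#x41-#x5A") # becomes "A-Z"
letter = Regexp.new("[#{range}]")                 # simplified :range case
dot    = Regexp.new(Regexp.escape(translate_codepoints_sketch("#x2E"))) # :hex case

p range
p letter.match?("Q")
p dot.match?("a")
```

Because the pattern consumes every hexadecimal digit that follows "#x", a code point directly followed by a letter in a-f or A-F is read as a longer number. For example, "#x20b" is taken as the single code point U+020B rather than a space followed by "b".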

#valid?(ast) ⇒ `Boolean`

Validate the rule, with respect to an AST.

Uses #validate! and catches SyntaxError

Parameters:

ast (Array<Rule>) —

The set of rules, used to turn symbols into rules

Returns:

(Boolean)

# File 'lib/ebnf/rule.rb', line 639

def valid?(ast)
  validate!(ast)
  true
rescue SyntaxError
  false
end
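
The predicate wraps the raising validator and converts the exception into a Boolean. The following is a minimal sketch of that pattern, assuming hypothetical helpers `check!` and `ok?` that are not part of EBNF::Rule. For brevity it only verifies that every referenced symbol is in a list of known names, and it inspects all operands rather than the subset the real validator examines:

```ruby
# Hypothetical validator: raises SyntaxError for an unknown symbol.
def check!(expr, known)
  expr[1..-1].each do |e|
    case e
    when Symbol
      raise SyntaxError, "No rule found for #{e}" unless known.include?(e)
    when Array
      check!(e, known) # nested expressions are validated recursively
    end
  end
end

# Hypothetical predicate: true when check! raises nothing.
def ok?(expr, known)
  check!(expr, known)
  true
rescue SyntaxError
  false
end

p ok?([:seq, :a, [:opt, :b]], [:a, :b])
p ok?([:seq, :a, [:opt, :c]], [:a, :b])
```

In Ruby, SyntaxError descends from ScriptError rather than StandardError, so a bare rescue clause would not catch it. The class has to be named explicitly, as it is in the listing.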

#validate!(ast, expr = @expr) ⇒ `Object`

Validate the rule, with respect to an AST.

Parameters:

ast (Array<Rule>) —

The set of rules, used to turn symbols into rules
expr (Array<Symbol,String,Array>) (defaults to: @expr) —

(@expr) The expression to check, defaults to the rule expression. Typically, if the expression is recursive, the embedded expression is called recursively.

Raises:

(SyntaxError)

# File 'lib/ebnf/rule.rb', line 569

def validate!(ast, expr = @expr)
  op = expr.first
  raise SyntaxError, "Unknown operator: #{op}" unless OP_ARGN.key?(op)
  raise SyntaxError, "Argument count missmatch on operator #{op}, had #{expr.length - 1} expected #{OP_ARGN[op]}" if
    OP_ARGN[op] && OP_ARGN[op] != expr.length - 1

  # rept operator needs min and max
  if op == :alt
    raise SyntaxError, "alt operation must have at least one operand, had #{expr.length - 1}" unless expr.length > 1
  elsif op == :rept
    raise SyntaxError, "rept operation must an non-negative integer minimum, was #{expr[1]}" unless
      expr[1].is_a?(Integer) && expr[1] >= 0
    raise SyntaxError, "rept operation must an non-negative integer maximum or '*', was #{expr[2]}" unless
      expr[2] == '*' || expr[2].is_a?(Integer) && expr[2] >= 0
  end

  case op
  when :hex
    raise SyntaxError, "Hex operand must be of form '#xN+': #{sym}" unless expr.last.match?(/^#x\h+$/)
  when :range
    str = expr.last.dup
    str = str[1..-1] if str.start_with?('^')
    str = str[0..-2] if str.end_with?('-')  # Allowed at end of range
    scanner = StringScanner.new(str)
    hex = rchar = in_range = false
    while !scanner.eos?
      begin
        if scanner.scan(Terminals::HEX)
          raise SyntaxError if in_range && rchar
          rchar = in_range = false
          hex = true
        elsif scanner.scan(Terminals::R_CHAR)
          raise SyntaxError if in_range && hex
          hex = in_range = false
          rchar = true
        else
          raise(SyntaxError, "Range contains illegal components at offset #{scanner.pos}: was #{expr.last}")
        end

        if scanner.scan(/\-/)
          raise SyntaxError if in_range
          in_range = true
        end
      rescue SyntaxError
        raise(SyntaxError, "Range contains illegal components at offset #{scanner.pos}: was #{expr.last}")
      end
    end
  else
    ([:alt, :diff].include?(expr.first) ? expr[1..-1] : expr[1,1]).each do |sym|
      case sym
      when Symbol
        r = ast.detect {|r| r.sym == sym}
        raise SyntaxError, "No rule found for #{sym}" unless r
      when Array
        validate!(ast, sym)
      when String
        raise SyntaxError, "String must be of the form CHAR*" unless sym.match?(/^#{Terminals::CHAR}*$/)
      end
    end
  end
end
