Class: EBNF::LL1::Scanner

Inherits:
StringScanner
  • Object
Defined in:
lib/ebnf/ll1/scanner.rb

Overview

Extends StringScanner with file operations and line counting.

  • Reloads the scanner buffer as required until EOF.

  • Loads input up to a high-water mark and reloads when the remaining size falls to a low-water mark.

FIXME: Only implements the subset required by the Lexer for now.
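
The high-water/low-water idea described above can be sketched with only the standard library. This is a minimal illustration, not the library's implementation: the class name WatermarkScanner, the private refill helper, and the tiny thresholds are all assumptions chosen so the reload is easy to observe.

```ruby
require 'strscan'
require 'stringio'

# Hypothetical sketch of a scanner that refills itself from an IO.
class WatermarkScanner < StringScanner
  def initialize(io, high_water: 8, low_water: 4)
    @io = io
    @high_water = high_water
    @low_water = low_water
    super(+"") # start with an empty, mutable buffer
    refill
  end

  def scan(pattern)
    refill
    super
  end

  def eos?
    refill
    super
  end

  private

  # Once fewer than low_water bytes remain, read up to high_water and then
  # finish the current line so that a token is not cut in half.
  def refill
    return if rest_size >= @low_water || @io.eof?
    # read(n) returns a binary string, so it is relabelled as UTF-8 here
    chunk = @io.read(@high_water - rest_size).force_encoding(Encoding::UTF_8)
    chunk << @io.gets unless @io.eof?
    self << chunk
  end
end

scanner = WatermarkScanner.new(StringIO.new("one two\nthree four\nfive six\n"))
words = []
until scanner.eos?
  scanner.scan(/\s+/)
  word = scanner.scan(/\w+/)
  words << word if word
end
p words # prints ["one", "two", "three", "four", "five", "six"]
```

The caller never sees the reloads: every word is returned intact even though the buffer holds only part of the input at any time.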

Constant Summary

HIGH_WATER = 512 * 1024
  Hopefully large enough to deal with long multi-line comments.

LOW_WATER = 4 * 1024

Constructor Details

#initialize(input, **options) ⇒ Scanner

Creates a scanner from a String or an IO.

Parameters:

  • input (String, IO, #read)
  • options (Hash{Symbol => Object}): a customizable set of options; the constructor stores the Integer keys :high_water and :low_water



# File 'lib/ebnf/ll1/scanner.rb', line 34

def initialize(input, **options)
  @options = options.merge(high_water: HIGH_WATER, low_water: LOW_WATER)

  @previous_lineno = @lineno = 1
  @input = input.is_a?(String) ? encode_utf8(input) : input
  super(input.is_a?(String) ? @input : "")
  feed_me
  self
end
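
One detail of the constructor is worth checking when passing options: Hash#merge gives precedence to its argument, so as the listing is written, the HIGH_WATER and LOW_WATER constants appear to replace any :high_water or :low_water supplied by the caller. The sketch below demonstrates only the Hash#merge behaviour; the names defaults and user_options are illustrative and not part of the library.

```ruby
HIGH_WATER = 512 * 1024
LOW_WATER  = 4 * 1024

defaults     = { high_water: HIGH_WATER, low_water: LOW_WATER }
user_options = { high_water: 1024 } # hypothetical caller-supplied value

# Same order as in the constructor: the argument (defaults) wins.
as_written = user_options.merge(defaults)

# Reversed order: the caller's value wins over the default.
defaults_first = defaults.merge(user_options)

puts as_written[:high_water]     # prints 524288
puts defaults_first[:high_water] # prints 1024
```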

Instance Attribute Details

#input ⇒ String, ... (readonly)

Returns:

  • (String, IO, StringIO)


# File 'lib/ebnf/ll1/scanner.rb', line 18

def input
  @input
end

#lineno ⇒ Integer

The current line number (one-based).

Returns:

  • (Integer)


# File 'lib/ebnf/ll1/scanner.rb', line 24

def lineno
  @lineno
end

Instance Method Details

#ensure_buffer_full ⇒ Object

Ensures that the input buffer is filled to the high-water mark, or to the end of the file if that comes first. Useful when matching tokens that may be longer than the low-water mark.



# File 'lib/ebnf/ll1/scanner.rb', line 46

def ensure_buffer_full
  # Read up to high-water mark ensuring we're at an end of line
  if @input.respond_to?(:eof?) && !@input.eof?
    diff = @options[:high_water] - rest_size
    string = encode_utf8(@input.read(diff))
    string << encode_utf8(@input.gets) unless @input.eof?
    self << string if string
  end
end
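
To see why a full buffer matters for long tokens, consider a multi-line comment that is only partly loaded. The following sketch uses a plain StringScanner and a StringIO rather than this class, and the eight-byte initial read is an arbitrary assumption made to force the problem.

```ruby
require 'strscan'
require 'stringio'

io = StringIO.new("/* a long\ncomment */ rest")
comment = %r{/\*.*?\*/}m

# Only the first 8 bytes are buffered: "/* a lon"
scanner = StringScanner.new(io.read(8))
first_try = scanner.scan(comment) # no match, the closing */ is not loaded yet

# Top the buffer up with the remaining input and try again.
scanner << io.read
second_try = scanner.scan(comment)

p first_try  # prints nil
p second_try # prints "/* a long\ncomment */"
```

Calling ensure_buffer_full before attempting such a match plays the role of the explicit top-up in this sketch.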

#eos? ⇒ Boolean

Returns true if the scan pointer is at the end of the string.

Returns:

  • (Boolean)


# File 'lib/ebnf/ll1/scanner.rb', line 60

def eos?
  feed_me
  super
end

#rest ⇒ String

Returns the “rest” of the buffered input (i.e. everything after the scan pointer). If there is no more data (eos? returns true), it returns “”.

Returns:

  • (String)


# File 'lib/ebnf/ll1/scanner.rb', line 70

def rest
  feed_me
  encode_utf8 super
end

#scan(pattern) ⇒ String

Tries to match with pattern at the current position.

If there is a match, the scanner advances the “scan pointer” and returns the matched string. Otherwise, the scanner returns nil.

If the scanner begins with the multi-line start expression

Examples:

s = StringScanner.new('test string')
p s.scan(/\w+/)   # -> "test"
p s.scan(/\w+/)   # -> nil
p s.scan(/\s+/)   # -> " "
p s.scan(/\w+/)   # -> "string"
p s.scan(/./)     # -> nil

Parameters:

  • pattern (Regexp)

Returns:

  • (String)


# File 'lib/ebnf/ll1/scanner.rb', line 92

def scan(pattern)
  feed_me
  @previous_lineno = @lineno
  if matched = encode_utf8(super)
    @lineno += matched.count("\n")
  end
  matched
end
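
The line-counting part of this override can be tried in isolation. The sketch below is a reduced, hypothetical class (LineCountingScanner) that keeps only the newline bookkeeping and leaves out the buffer reloading and UTF-8 re-encoding done by the real method.

```ruby
require 'strscan'

# Hypothetical reduction of the override: counts newlines in each match.
class LineCountingScanner < StringScanner
  attr_reader :lineno

  def initialize(string)
    @previous_lineno = @lineno = 1
    super
  end

  def scan(pattern)
    @previous_lineno = @lineno
    matched = super
    @lineno += matched.count("\n") if matched
    matched
  end
end

s = LineCountingScanner.new("a\nb\n\nc")
s.scan(/a\n/)
after_first = s.lineno  # one newline consumed
s.scan(/b\n\n/)
after_second = s.lineno # two more newlines consumed
s.scan(/x/)             # no match, so the line number is unchanged
after_miss = s.lineno

p [after_first, after_second, after_miss] # prints [2, 4, 4]
```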

#scan_until(pattern) ⇒ String

Scans the string until the pattern is matched. Returns the substring up to and including the end of the match, advancing the scan pointer to that location. If there is no match, nil is returned.

Examples:

s = StringScanner.new("Fri Dec 12 1975 14:39")
s.scan_until(/1/)        # -> "Fri Dec 1"
s.pre_match              # -> "Fri Dec "
s.scan_until(/XYZ/)      # -> nil

Parameters:

  • pattern (Regexp)

Returns:

  • (String)


# File 'lib/ebnf/ll1/scanner.rb', line 112

def scan_until(pattern)
  feed_me
  @previous_lineno = @lineno
  if matched = encode_utf8(super)
    @lineno += matched.count("\n")
  end
  matched
end

#skip(pattern) ⇒ Object

Attempts to skip over the given pattern beginning at the scan pointer. If it matches, the scan pointer is advanced to the end of the match.

Similar to scan, but without returning the matched string. As implemented here, the method always returns nil, whether or not the pattern matched; it does not return the length of the match as StringScanner#skip does.

Parameters:

  • pattern (Regexp)


# File 'lib/ebnf/ll1/scanner.rb', line 128

def skip(pattern)
  scan(pattern)
  nil
end
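
Because the method body discards the result of scan and returns nil, a caller cannot use the return value to tell whether anything was skipped; comparing positions is one way around that. The sketch below contrasts the standard library's skip with a hypothetical subclass (NilSkipScanner) that copies the body shown above.

```ruby
require 'strscan'

# Hypothetical subclass mirroring the override's body.
class NilSkipScanner < StringScanner
  def skip(pattern)
    scan(pattern)
    nil
  end
end

plain = StringScanner.new("   x")
quiet = NilSkipScanner.new("   x")

plain_result = plain.skip(/\s+/) # standard library: length of the match
quiet_result = quiet.skip(/\s+/) # override: always nil

p plain_result # prints 3
p quiet_result # prints nil
p quiet.pos    # prints 3, the pointer still advanced
```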

#skip_until(pattern) ⇒ Object

Looks ahead to match pattern and advances the scan pointer to the end of the match. Returns the number of characters advanced (the length of the matched string), or nil if no match was found.

Similar to scan_until, but without returning the intervening string.

Parameters:

  • pattern (Regexp)


# File 'lib/ebnf/ll1/scanner.rb', line 140

def skip_until(pattern)
  (matched = scan_until(pattern)) && matched.length
end
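
Since the return value is the length of the matched string, it counts characters, which differs from the byte offset once multi-byte text is involved. A small sketch with a plain StringScanner (the sample text is an arbitrary assumption) shows the difference:

```ruby
require 'strscan'

scanner = StringScanner.new("héllo wörld")
matched = scanner.scan_until(/ /) # "héllo " including the space

chars = matched.length   # what matched.length in the method body yields
bytes = matched.bytesize # "é" occupies two bytes in UTF-8

p chars       # prints 6
p bytes       # prints 7
p scanner.pos # prints 7, since pos is a byte offset
```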

#terminate ⇒ Object

Sets the scan pointer to the end of the string and clears matching data.



# File 'lib/ebnf/ll1/scanner.rb', line 153

def terminate
  feed_me
  super
end

#unscan ⇒ Object

Sets the scan pointer to the previous position. Only one previous position is remembered, and it changes with each scanning operation.



# File 'lib/ebnf/ll1/scanner.rb', line 146

def unscan
  @lineno = @previous_lineno
  super
end