Module: EBNF::Unescape
- Included in: LL1::Lexer, PEG::Rule
- Defined in: lib/ebnf/unescape.rb
Overview
Unescape strings
Constant Summary
- ESCAPE_CHARS =
{
  '\\t'   => "\t",  # \u0009 (tab)
  '\\n'   => "\n",  # \u000A (line feed)
  '\\r'   => "\r",  # \u000D (carriage return)
  '\\b'   => "\b",  # \u0008 (backspace)
  '\\f'   => "\f",  # \u000C (form feed)
  '\\"'   => '"',   # \u0022 (quotation mark, double quote mark)
  "\\'"   => '\'',  # \u0027 (apostrophe-quote, single quote mark)
  '\\\\'  => '\\'   # \u005C (backslash)
}.freeze
- ESCAPE_CHAR4 =
\uXXXX
/\\u(?:[0-9A-Fa-f]{4,4})/u.freeze
- ESCAPE_CHAR8 =
\UXXXXXXXX
/\\U(?:[0-9A-Fa-f]{8,8})/u.freeze
- ECHAR =
More liberal unescaping
/\\./u.freeze
- UCHAR =
/#{ESCAPE_CHAR4}|#{ESCAPE_CHAR8}/n.freeze
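To make the difference between UCHAR and ECHAR concrete, here is a minimal sketch that scans a sample string with local copies of the patterns. The copies omit the /u and /n encoding flags used by the real constants, and the names `sample`, `ucode_matches`, and `echar_matches` are illustrative only.

```ruby
# Local copies of the documented patterns (encoding flags omitted for this sketch).
escape_char4 = /\\u(?:[0-9A-Fa-f]{4})/
escape_char8 = /\\U(?:[0-9A-Fa-f]{8})/
uchar = /#{escape_char4}|#{escape_char8}/
echar = /\\./

# Single quotes keep the backslashes literal, so these are escape sequences, not characters.
sample = 'tab\t, e-acute \u00E9, emoji \U0001F600'

# UCHAR only matches complete codepoint escapes.
ucode_matches = sample.scan(uchar)

# ECHAR matches a backslash followed by any single character.
echar_matches = sample.scan(echar)

puts ucode_matches.inspect
puts echar_matches.inspect
```

Because ECHAR matches any backslash pair, it relies on a later lookup to decide which pairs are actually replaced.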
Class Method Summary
- .unescape(string) ⇒ String
  Perform string and codepoint unescaping if defined for this terminal.
- .unescape_codepoints(string) ⇒ String
  Returns a copy of the given input string with all \uXXXX and \UXXXXXXXX Unicode codepoint escape sequences replaced with their unescaped UTF-8 character counterparts.
- .unescape_string(input) ⇒ String
  Returns a copy of the given input string with all string escape sequences (e.g. \n and \t) replaced with their unescaped UTF-8 character counterparts.
Class Method Details
.unescape(string) ⇒ String
Perform string and codepoint unescaping if defined for this terminal
# File 'lib/ebnf/unescape.rb', line 58
def unescape(string)
  unescape_string(unescape_codepoints(string))
end
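The order of composition matters: codepoint escapes are decoded first, and the result is then passed through string unescaping. The following is a simplified, self-contained sketch of that pipeline. `UnescapeSketch` is a hypothetical stand-in written for this example, not the library module, and it skips the binary-encoding handling of the real implementation.

```ruby
# Hypothetical stand-in for illustration; not the EBNF::Unescape module itself.
module UnescapeSketch
  ESCAPE_CHARS = {
    '\\t' => "\t", '\\n' => "\n", '\\r' => "\r", '\\b' => "\b",
    '\\f' => "\f", '\\"' => '"', "\\'" => '\'', '\\\\' => '\\'
  }.freeze
  UCHAR = /\\u[0-9A-Fa-f]{4}|\\U[0-9A-Fa-f]{8}/
  ECHAR = /\\./

  # Replace \uXXXX and \UXXXXXXXX with the character for that codepoint.
  def self.unescape_codepoints(string)
    string.gsub(UCHAR) { |c| [c[2..-1].hex].pack('U*') }
  end

  # Replace known backslash escapes; leave unknown ones untouched.
  def self.unescape_string(input)
    input.gsub(ECHAR) { |escaped| ESCAPE_CHARS[escaped] || escaped }
  end

  # Codepoints first, then string escapes, mirroring the documented order.
  def self.unescape(string)
    unescape_string(unescape_codepoints(string))
  end
end

plain  = UnescapeSketch.unescape('caf\u00E9\tok')
# \u005C decodes to a backslash, which then combines with the following "n".
staged = UnescapeSketch.unescape('\u005Cn')
puts plain.inspect
puts staged.inspect
```

One consequence of this ordering, at least in the sketch, is that a backslash produced by a codepoint escape can form a new string escape with the character after it.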
.unescape_codepoints(string) ⇒ String
Returns a copy of the given input string with all \uXXXX and \UXXXXXXXX Unicode codepoint escape sequences replaced with their unescaped UTF-8 character counterparts.
# File 'lib/ebnf/unescape.rb', line 27
def unescape_codepoints(string)
  string = string.dup
  string.force_encoding(Encoding::ASCII_8BIT) if string.respond_to?(:force_encoding)

  # Decode \uXXXX and \UXXXXXXXX code points:
  string = string.gsub(UCHAR) do |c|
    s = [(c[2..-1]).hex].pack('U*')
    s.respond_to?(:force_encoding) ? s.force_encoding(Encoding::ASCII_8BIT) : s
  end

  string.force_encoding(Encoding::UTF_8) if string.respond_to?(:force_encoding)
  string
end
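The core of the block above is the conversion of one matched escape into a character. A small sketch of that single step, using an arbitrary sample escape and illustrative variable names:

```ruby
# One matched escape, as the gsub block would receive it (backslash kept literal).
escape = '\U0001F600'

# Drop the leading backslash and the u/U marker, keeping only the hex digits.
hex_digits = escape[2..-1]

# String#hex parses the digits as a hexadecimal integer.
codepoint = hex_digits.hex

# The 'U' pack directive encodes an integer codepoint as a UTF-8 character.
char = [codepoint].pack('U*')

puts hex_digits
puts codepoint
puts char.bytesize
```

The same slice works for the four-digit form, since both escapes carry a two-character prefix.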
.unescape_string(input) ⇒ String
Returns a copy of the given input string with all string escape sequences (e.g. \n and \t) replaced with their unescaped UTF-8 character counterparts.
# File 'lib/ebnf/unescape.rb', line 50
def unescape_string(input)
  input.gsub(ECHAR) {|escaped| ESCAPE_CHARS[escaped] || escaped}
end
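The `ESCAPE_CHARS[escaped] || escaped` fallback means that unrecognized escapes pass through unchanged. The sketch below shows this with a three-entry subset of the table. The lambda `unescape_subset` and the sample inputs are assumptions made for illustration.

```ruby
# A subset of the escape table, enough to show the lookup-with-fallback behavior.
escape_chars = { '\\t' => "\t", '\\n' => "\n", '\\\\' => '\\' }.freeze

# Same shape as unescape_string: look up each backslash pair, else keep it.
unescape_subset = ->(input) { input.gsub(/\\./) { |e| escape_chars[e] || e } }

known     = unescape_subset.call('a\tb')      # \t is in the table
unknown   = unescape_subset.call('a\qb')      # \q is not, so it is kept as-is
backslash = unescape_subset.call('a\\\\nb')   # escaped backslash followed by "n"

puts known.inspect
puts unknown.inspect
puts backslash.inspect
```

In the last case the two backslashes are consumed together as one match, so the following "n" is not treated as part of a line-feed escape.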