The WhiteSpaceAnalyzer recognizes tokens as maximal runs of non-whitespace characters. Implemented in Ruby, the WhiteSpaceAnalyzer would look like:
  class WhiteSpaceAnalyzer
    def initialize(lower = true)
      @lower = lower
    end

    def token_stream(field, str)
      WhiteSpaceTokenizer.new(str, @lower)
    end
  end
As you can see, it makes use of the WhiteSpaceTokenizer.
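The WhiteSpaceTokenizer itself isn't shown here. The following is a minimal pure-Ruby sketch of how such a tokenizer might behave; the real tokenizer in Ferret is implemented in C, so this class is only an illustrative stand-in:

```ruby
# Hypothetical pure-Ruby sketch of a WhiteSpaceTokenizer: splits the input
# on runs of whitespace and optionally downcases each token.
class WhiteSpaceTokenizer
  def initialize(str, lower = true)
    @tokens = str.split(/\s+/).reject(&:empty?)
    @tokens.map!(&:downcase) if lower
    @pos = 0
  end

  # Returns the next token's text, or nil when the stream is exhausted.
  def next
    tok = @tokens[@pos]
    @pos += 1
    tok
  end
end

ts = WhiteSpaceTokenizer.new("One  TWO\tthree")
tokens = []
while (t = ts.next)
  tokens << t
end
tokens # => ["one", "two", "three"]
```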
Create a new WhiteSpaceAnalyzer, which downcases tokens by default but can optionally leave the case as is. Lowercasing is done based on the current locale.
lower::  set to false if you don't want the field's tokens to be downcased
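The effect of the lower flag can be sketched with a small stand-in function; whitespace_tokens is a hypothetical helper written for illustration, not part of the Ferret API:

```ruby
# Hypothetical helper mimicking the lower flag's effect on tokenization:
# split on whitespace, then downcase only when lower is true (the default).
def whitespace_tokens(str, lower = true)
  toks = str.split
  lower ? toks.map(&:downcase) : toks
end

whitespace_tokens("Foo BAR")         # => ["foo", "bar"]
whitespace_tokens("Foo BAR", false)  # => ["Foo", "BAR"]
```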
  static VALUE
  frb_white_space_analyzer_init(int argc, VALUE *argv, VALUE self)
  {
      Analyzer *a;
      GET_LOWER(false);
  #ifndef POSH_OS_WIN32
      if (!frb_locale) frb_locale = setlocale(LC_CTYPE, "");
  #endif
      a = mb_whitespace_analyzer_new(lower);
      Frt_Wrap_Struct(self, NULL, &frb_analyzer_free, a);
      object_add(a, self);
      return self;
  }