Ferret::Analysis::AsciiWhiteSpaceAnalyzer

Summary

The AsciiWhiteSpaceAnalyzer recognizes tokens as maximal strings of non-whitespace characters. If implemented in Ruby, the AsciiWhiteSpaceAnalyzer would look like this:

  class AsciiWhiteSpaceAnalyzer
    def initialize(lower = false)
      @lower = lower
    end

    def token_stream(field, str)
      if @lower
        return AsciiLowerCaseFilter.new(AsciiWhiteSpaceTokenizer.new(str))
      else
        return AsciiWhiteSpaceTokenizer.new(str)
      end
    end
  end

As you can see, it makes use of the AsciiWhiteSpaceTokenizer. Use WhiteSpaceAnalyzer instead if you need to recognize multibyte encodings such as “UTF-8”.
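To make the tokenization rule concrete, here is a minimal pure-Ruby sketch of the behavior described above. The helper name `ascii_whitespace_tokens` is hypothetical and is not part of Ferret. It only approximates the analyzer's output: it returns plain strings rather than a token stream with offsets.

```ruby
# Hypothetical stand-in for illustration only; this is NOT Ferret's API.
# Splits on ASCII whitespace and optionally lowercases ASCII letters.
def ascii_whitespace_tokens(str, lower = false)
  # In Ruby regexps, \S matches any character that is not ASCII whitespace,
  # so each match is a maximal run of non-whitespace characters.
  tokens = str.scan(/\S+/)
  # tr('A-Z', 'a-z') touches ASCII letters only.
  lower ? tokens.map { |t| t.tr('A-Z', 'a-z') } : tokens
end

p ascii_whitespace_tokens("Hello,  World!\tFoo-Bar")
p ascii_whitespace_tokens("Hello,  World!\tFoo-Bar", true)
```

Note that punctuation stays attached to the tokens, since only whitespace separates them.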

Public Class Methods

new(lower = false) → analyzer

Create a new AsciiWhiteSpaceAnalyzer which leaves the case of tokens as is by default but can optionally downcase them. Lowercasing will only be done to ASCII characters.

lower

set to true if you want the field’s tokens to be downcased
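The ASCII-only restriction on lowercasing matters for text that contains accented or other non-ASCII letters. The sketch below illustrates the idea in plain Ruby. The helper `ascii_downcase` is a hypothetical name, not a Ferret method, and it assumes UTF-8 input strings.

```ruby
# Hypothetical helper for illustration only; this is NOT Ferret's API.
# Lowercases ASCII letters and leaves every other character unchanged.
def ascii_downcase(token)
  token.tr('A-Z', 'a-z')
end

# "\u00C9" is LATIN CAPITAL LETTER E WITH ACUTE, which is not an ASCII letter,
# so it keeps its case while the rest of the token is lowercased.
p ascii_downcase("\u00C9COLE")
p ascii_downcase("Paris")
```

If tokens like these need to be fully lowercased, that is a sign the non-ASCII WhiteSpaceAnalyzer is the better fit.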

static VALUE
frb_a_white_space_analyzer_init(int argc, VALUE *argv, VALUE self)
{
    Analyzer *a;
    GET_LOWER(false);
    a = whitespace_analyzer_new(lower);
    Frt_Wrap_Struct(self, NULL, &frb_analyzer_free, a);
    object_add(a, self);
    return self;
}

Generated with the Darkfish Rdoc Generator 1.1.6.