Ferret::Analysis::AsciiStandardAnalyzer

Summary

The AsciiStandardAnalyzer is the most advanced of the available ASCII analyzers. If it were implemented in Ruby, it would look like this:

  class AsciiStandardAnalyzer
    def initialize(stop_words = FULL_ENGLISH_STOP_WORDS, lower = true)
      @lower = lower
      @stop_words = stop_words
    end

    def token_stream(field, str)
      ts = AsciiStandardTokenizer.new(str)
      ts = AsciiLowerCaseFilter.new(ts) if @lower
      ts = StopFilter.new(ts, @stop_words)
      ts = HyphenFilter.new(ts)
    end
  end

As you can see, it makes use of the AsciiStandardTokenizer, and you can supply your own list of stop-words if you wish. Note that this tokenizer won’t recognize non-ASCII characters, so you should use the StandardAnalyzer if you want to analyze multi-byte data such as UTF-8.
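The filter chain described above can be sketched in plain Ruby without Ferret installed. This is a simplified stand-in, not Ferret's implementation: `SketchAsciiAnalyzer`, `SKETCH_STOP_WORDS`, and the `tokens` method are hypothetical names, the tokenizer is reduced to a single ASCII regular expression, and hyphen handling is omitted.

```ruby
# Simplified stand-in for the tokenizer -> lowercase -> stop-word chain.
# All names here are hypothetical; this is not Ferret's own code.
SKETCH_STOP_WORDS = %w[a an and the of to is].freeze # short illustrative list

class SketchAsciiAnalyzer
  def initialize(stop_words = SKETCH_STOP_WORDS, lower = true)
    @stop_words = stop_words
    @lower = lower
  end

  # Returns an array of token strings instead of a token stream.
  def tokens(str)
    ts = str.scan(/[A-Za-z0-9]+/)             # ASCII-only tokenizing
    ts = ts.map(&:downcase) if @lower         # optional lowercasing
    ts.reject { |t| @stop_words.include?(t) } # drop stop-words
  end
end

analyzer = SketchAsciiAnalyzer.new
p analyzer.tokens("The Quick fox and the DOG")
p analyzer.tokens("café au lait")
```

In this sketch the non-ASCII character in the second string falls outside the tokenizer's character class and cuts the first token short, which illustrates why an ASCII-only analyzer is a poor fit for UTF-8 text.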

Public Class Methods

new(stop_words = FULL_ENGLISH_STOP_WORDS, lower = true) → analyzer

Create a new AsciiStandardAnalyzer which downcases tokens by default but can optionally leave case as is. Lowercasing will be done based on the current locale. You can also set the list of stop-words to be used by the StopFilter.

stop_words

list of stop-words to pass to the StopFilter

lower

set to false if you don’t want the field’s tokens to be downcased

static VALUE
frb_a_standard_analyzer_init(int argc, VALUE *argv, VALUE self)
{
    bool lower;
    VALUE rlower, rstop_words;
    Analyzer *a;
    rb_scan_args(argc, argv, "02", &rstop_words, &rlower);
    lower = ((rlower == Qnil) ? true : RTEST(rlower));
    if (rstop_words != Qnil) {
        char **stop_words = get_stopwords(rstop_words);
        a = standard_analyzer_new_with_words((const char **)stop_words, lower);
        free(stop_words);
    } else {
        a = standard_analyzer_new(lower);
    }
    Frt_Wrap_Struct(self, NULL, &frb_analyzer_free, a);
    object_add(a, self);
    return self;
}
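The argument handling in the C source above can be mirrored in Ruby to make the defaults explicit. This is a minimal sketch: `resolve_analyzer_args` is a hypothetical helper, not part of Ferret, and it only reports which branch the C code would take rather than building an analyzer.

```ruby
# Hypothetical helper mirroring the argument handling of
# frb_a_standard_analyzer_init; it does not construct a real analyzer.
def resolve_analyzer_args(stop_words = nil, lower = nil)
  # (rlower == Qnil) ? true : RTEST(rlower)
  # RTEST treats everything except nil and false as true, like !! in Ruby.
  lower = lower.nil? ? true : !!lower

  # A non-nil stop_words selects the "with_words" constructor in the C code.
  { custom_stop_words: !stop_words.nil?, lower: lower }
end

p resolve_analyzer_args
p resolve_analyzer_args(%w[the and], false)
```

Both arguments are optional and positional, with the stop-word list first, so passing only `false` would be read as the stop-word list rather than as the `lower` flag.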

