Class CJKBigramAwareLengthFilterFactory

java.lang.Object
org.apache.lucene.analysis.util.AbstractAnalysisFactory
org.apache.lucene.analysis.util.TokenFilterFactory
org.apache.tika.eval.core.tokens.CJKBigramAwareLengthFilterFactory

public class CJKBigramAwareLengthFilterFactory extends org.apache.lucene.analysis.util.TokenFilterFactory
Creates a very narrowly focused TokenFilter that filters tokens by length unless the CJKBigramFilter has typed them as <DOUBLE> or <SINGLE>; tokens carrying either type pass through regardless of length.

This class is intended to be used when generating "common tokens" files.
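
The acceptance rule described above can be sketched in plain Java without any Lucene dependency. This is only an illustration, not the class's actual implementation: the `Token` holder, the `accept` and `filter` helpers, and the `min`/`max` bounds are hypothetical names introduced here. The `<DOUBLE>` and `<SINGLE>` strings are the type labels named in the description.

```java
import java.util.ArrayList;
import java.util.List;

public class CjkAwareLengthSketch {
    // Type labels mentioned in the class description for CJK bigrams and unigrams.
    static final String DOUBLE_TYPE = "<DOUBLE>";
    static final String SINGLE_TYPE = "<SINGLE>";

    /** Hypothetical stand-in for a token's term text plus its type attribute. */
    static final class Token {
        final String term;
        final String type;

        Token(String term, String type) {
            this.term = term;
            this.type = type;
        }
    }

    /** Keeps CJK-typed tokens unconditionally; applies the length bounds to everything else. */
    static boolean accept(Token token, int min, int max) {
        if (DOUBLE_TYPE.equals(token.type) || SINGLE_TYPE.equals(token.type)) {
            return true;
        }
        int length = token.term.length();
        return length >= min && length <= max;
    }

    /** Returns the terms of all accepted tokens, in their original order. */
    static List<String> filter(List<Token> tokens, int min, int max) {
        List<String> kept = new ArrayList<>();
        for (Token token : tokens) {
            if (accept(token, min, max)) {
                kept.add(token.term);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Token> tokens = new ArrayList<>();
        tokens.add(new Token("\u6771\u4EAC", DOUBLE_TYPE)); // two-character CJK bigram
        tokens.add(new Token("\u4EAC", SINGLE_TYPE));       // one-character CJK unigram
        tokens.add(new Token("the", "<ALPHANUM>"));         // shorter than min
        tokens.add(new Token("search", "<ALPHANUM>"));      // within bounds
        System.out.println(filter(tokens, 4, 20));
    }
}
```

With the assumed bounds of 4 and 20, the short CJK tokens survive while the three-letter alphanumeric token is dropped, which is the behavior a "common tokens" list needs so that CJK bigrams are not discarded for being short.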

  • Field Summary

    Fields inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory

    LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
  • Constructor Summary

    Constructors
    Constructor
    Description
    CJKBigramAwareLengthFilterFactory(Map<String,String> args)
     
  • Method Summary

    Modifier and Type
    Method
    Description
    org.apache.lucene.analysis.TokenStream
    create(org.apache.lucene.analysis.TokenStream tokenStream)
     

    Methods inherited from class org.apache.lucene.analysis.util.TokenFilterFactory

    availableTokenFilters, findSPIName, forName, lookupClass, normalize, reloadTokenFilters

    Methods inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory

    get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • CJKBigramAwareLengthFilterFactory

      public CJKBigramAwareLengthFilterFactory(Map<String,String> args)
  • Method Details

    • create

      public org.apache.lucene.analysis.TokenStream create(org.apache.lucene.analysis.TokenStream tokenStream)
      Specified by:
      create in class org.apache.lucene.analysis.util.TokenFilterFactory