public class CachingNaiveBayesClassifier extends SimpleNaiveBayesClassifier
http://en.wikipedia.org/wiki/Naive_Bayes_classifier
This is NOT an online classifier.
Fields inherited from class SimpleNaiveBayesClassifier: analyzer, classFieldName, indexReader, indexSearcher, query, textFieldNames
Constructor and Description |
---|
CachingNaiveBayesClassifier(IndexReader indexReader, Analyzer analyzer, Query query, String classFieldName, String... textFieldNames) Creates a new NaiveBayes classifier with internal caching. |
Modifier and Type | Method and Description |
---|---|
protected List<ClassificationResult<BytesRef>> | assignClassNormalizedList(String inputDocument) Calculates probabilities for all classes for a given input text. |
void | reInitCache(int minTermOccurrenceInCache, boolean justCachedTerms) Builds the frame of the cache. |
Methods inherited from class SimpleNaiveBayesClassifier: assignClass, countDocsWithClass, getClasses, getClasses, normClassificationResults, tokenize
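As a rough illustration of what the two methods summarized above do conceptually, here is a minimal plain-Java sketch. It assumes whitespace tokenization and add-one smoothing. The class `ToyCachedNaiveBayes` and its methods `train`, `pruneCache`, and `normalizedProbabilities` are hypothetical names made up for this example. The sketch does not use Lucene and is not how CachingNaiveBayesClassifier is implemented. It keeps a term -> class -> count cache, drops terms below a minimum occurrence (in the spirit of minTermOccurrenceInCache), and returns one probability per class, normalized to sum to 1.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy naive Bayes with a term-count cache. Hypothetical sketch, NOT Lucene code. */
class ToyCachedNaiveBayes {
    // term -> (class label -> occurrences of the term in that class's training text)
    private final Map<String, Map<String, Integer>> termCache = new HashMap<>();
    // class label -> number of training documents with that label
    private final Map<String, Integer> docsPerClass = new HashMap<>();
    // class label -> total number of tokens seen for that label
    private final Map<String, Integer> tokensPerClass = new HashMap<>();
    private int totalDocs = 0;

    /** Adds one labeled training document. */
    void train(String label, String text) {
        docsPerClass.merge(label, 1, Integer::sum);
        totalDocs++;
        for (String term : tokenize(text)) {
            termCache.computeIfAbsent(term, t -> new HashMap<>()).merge(label, 1, Integer::sum);
            tokensPerClass.merge(label, 1, Integer::sum);
        }
    }

    /** Drops cached terms seen fewer than minTermOccurrence times; returns the new cache size. */
    int pruneCache(int minTermOccurrence) {
        termCache.entrySet().removeIf(e ->
            e.getValue().values().stream().mapToInt(Integer::intValue).sum() < minTermOccurrence);
        return termCache.size();
    }

    /** Returns an estimate of P(class | text) for every known class; values sum to 1. */
    Map<String, Double> normalizedProbabilities(String text) {
        List<String> terms = tokenize(text);
        int vocabulary = termCache.size();
        Map<String, Double> result = new HashMap<>();
        double maxLog = Double.NEGATIVE_INFINITY;
        for (String label : docsPerClass.keySet()) {
            // Log prior: share of training documents carrying this label.
            double logScore = Math.log((double) docsPerClass.get(label) / totalDocs);
            int classTokens = tokensPerClass.getOrDefault(label, 0);
            for (String term : terms) {
                Map<String, Integer> perClass =
                    termCache.getOrDefault(term, Collections.<String, Integer>emptyMap());
                int count = perClass.getOrDefault(label, 0);
                // Add-one smoothing; the extra 1 in the denominator reserves mass for uncached terms.
                logScore += Math.log((count + 1.0) / (classTokens + vocabulary + 1.0));
            }
            result.put(label, logScore);
            maxLog = Math.max(maxLog, logScore);
        }
        // Exponentiate relative to the largest log score, then divide by the sum.
        double sum = 0.0;
        for (Map.Entry<String, Double> e : result.entrySet()) {
            e.setValue(Math.exp(e.getValue() - maxLog));
            sum += e.getValue();
        }
        for (Map.Entry<String, Double> e : result.entrySet()) {
            e.setValue(e.getValue() / sum);
        }
        return result;
    }

    /** Lowercases the text and splits it on whitespace. */
    private static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }
}
```

In this toy version, pruning trades accuracy on rare terms for a smaller map: a pruned term is simply scored as unseen. The real class documents its own behavior through the minTermOccurrenceInCache and justCachedTerms parameters of reInitCache.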
public CachingNaiveBayesClassifier(IndexReader indexReader, Analyzer analyzer, Query query, String classFieldName, String... textFieldNames)
Creates a new NaiveBayes classifier with internal caching. For lower memory usage, call reInitCache().
Parameters:
indexReader - the reader on the index to be used for classification
analyzer - an Analyzer used to analyze unseen text
query - a Query used to optionally filter the docs used for training the classifier, or null if all the indexed docs should be used
classFieldName - the name of the field used as the output for the classifier
textFieldNames - the names of the fields used as the inputs for the classifier

protected List<ClassificationResult<BytesRef>> assignClassNormalizedList(String inputDocument) throws IOException
Description copied from class: SimpleNaiveBayesClassifier
Calculate probabilities for all classes for a given input text
Overrides:
assignClassNormalizedList in class SimpleNaiveBayesClassifier
Parameters:
inputDocument - the input text as a String
Returns:
a List of ClassificationResult, one for each existing class
Throws:
IOException - if assigning probabilities fails

public void reInitCache(int minTermOccurrenceInCache, boolean justCachedTerms) throws IOException
Parameters:
minTermOccurrenceInCache - a higher value yields a smaller cache
justCachedTerms - the switch to fully exclude low-occurrence docs
Throws:
IOException - if there is a low-level I/O error

Copyright © 2000-2017 Apache Software Foundation. All Rights Reserved.