Class UserDictionary
java.lang.Object
    org.apache.lucene.analysis.ja.dict.UserDictionary
All Implemented Interfaces:
Dictionary
public final class UserDictionary extends Object implements Dictionary
Class for building a User Dictionary. This class allows for custom segmentation of phrases.
-
-
Field Summary
Fields
Modifier and Type      Field
static int             LEFT_ID
static int             RIGHT_ID
static int             WORD_COST
-
Fields inherited from interface org.apache.lucene.analysis.ja.dict.Dictionary
INTERNAL_SEPARATOR
-
-
Method Summary
Modifier and Type      Method                                                          Description
String                 getBaseForm(int wordId, char[] surface, int off, int len)       Get base form of word
TokenInfoFST           getFST()
String                 getInflectionForm(int wordId)                                   Get inflection form of tokens
String                 getInflectionType(int wordId)                                   Get inflection type of tokens
int                    getLeftId(int wordId)                                           Get left id of specified word
String                 getPartOfSpeech(int wordId)                                     Get Part-Of-Speech of tokens
String                 getPronunciation(int wordId, char[] surface, int off, int len)  Get pronunciation of tokens
String                 getReading(int wordId, char[] surface, int off, int len)        Get reading of tokens
int                    getRightId(int wordId)                                          Get right id of specified word
int                    getWordCost(int wordId)                                         Get word cost of specified word
int[][]                lookup(char[] chars, int off, int len)                          Lookup words in text
int[]                  lookupSegmentation(int phraseID)
static UserDictionary  open(Reader reader)
-
-
-
Field Detail
-
WORD_COST
public static final int WORD_COST
- See Also:
- Constant Field Values
-
LEFT_ID
public static final int LEFT_ID
- See Also:
- Constant Field Values
-
RIGHT_ID
public static final int RIGHT_ID
- See Also:
- Constant Field Values
-
-
Method Detail
-
open
public static UserDictionary open(Reader reader) throws IOException
- Throws:
IOException
-
lookup
public int[][] lookup(char[] chars, int off, int len) throws IOException
Lookup words in text
- Parameters:
chars - text
off - offset into text
len - length of text
- Returns:
array of {wordId, position, length}
- Throws:
IOException
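The return format of lookup, an array of {wordId, position, length} triples, can be unpacked as in the minimal sketch below. The helper class LookupResultDecoder and its method surfaces are hypothetical names introduced for illustration and are not part of Lucene. The sample triples are written by hand rather than produced by a real dictionary, the text is a romanized stand-in for Japanese input, and the sketch assumes that position is counted from off rather than from the start of the array.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative helper, not part of Lucene: decodes {wordId, position, length} triples. */
final class LookupResultDecoder {

    /**
     * Returns the surface text covered by each triple.
     * Assumption: position is counted from off, the start of the looked-up span.
     */
    static List<String> surfaces(int[][] results, char[] chars, int off) {
        List<String> out = new ArrayList<>();
        for (int[] entry : results) {
            int position = entry[1]; // entry[0] is the wordId, unused here
            int length = entry[2];
            out.add(new String(chars, off + position, length));
        }
        return out;
    }

    public static void main(String[] args) {
        // Romanized stand-in text; real input would be Japanese.
        char[] text = "kansaikokusaikuukou".toCharArray();
        // Hypothetical triples, written by hand rather than returned by lookup.
        int[][] sample = { {100, 0, 6}, {101, 6, 7}, {102, 13, 6} };
        System.out.println(surfaces(sample, text, 0)); // prints [kansai, kokusai, kuukou]
    }
}
```

In real use the int[][] would come from lookup(chars, off, len) on an opened UserDictionary, and each wordId could then be passed to accessors such as getReading or getPartOfSpeech.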
-
getFST
public TokenInfoFST getFST()
-
lookupSegmentation
public int[] lookupSegmentation(int phraseID)
-
getLeftId
public int getLeftId(int wordId)
Description copied from interface: Dictionary
Get left id of specified word
- Specified by:
getLeftId in interface Dictionary
- Returns:
left id
-
getRightId
public int getRightId(int wordId)
Description copied from interface: Dictionary
Get right id of specified word
- Specified by:
getRightId in interface Dictionary
- Returns:
right id
-
getWordCost
public int getWordCost(int wordId)
Description copied from interface: Dictionary
Get word cost of specified word
- Specified by:
getWordCost in interface Dictionary
- Returns:
word's cost
-
getReading
public String getReading(int wordId, char[] surface, int off, int len)
Description copied from interface: Dictionary
Get reading of tokens
- Specified by:
getReading in interface Dictionary
- Parameters:
wordId - word ID of token
- Returns:
Reading of the token
-
getPartOfSpeech
public String getPartOfSpeech(int wordId)
Description copied from interface: Dictionary
Get Part-Of-Speech of tokens
- Specified by:
getPartOfSpeech in interface Dictionary
- Parameters:
wordId - word ID of token
- Returns:
Part-Of-Speech of the token
-
getBaseForm
public String getBaseForm(int wordId, char[] surface, int off, int len)
Description copied from interface: Dictionary
Get base form of word
- Specified by:
getBaseForm in interface Dictionary
- Parameters:
wordId - word ID of token
- Returns:
Base form (only different for inflected words, otherwise null)
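Because getBaseForm is documented to return null for words that are not inflected, callers usually need a fallback. The sketch below shows one way to handle that by falling back to the surface text. The class BaseFormFallback and the method baseFormOrSurface are hypothetical names, not Lucene API, and the sample input is a romanized stand-in.

```java
/** Illustrative helper, not part of Lucene. */
final class BaseFormFallback {

    /** Returns baseForm when it is non-null, otherwise the surface slice [off, off + len). */
    static String baseFormOrSurface(String baseForm, char[] surface, int off, int len) {
        return baseForm != null ? baseForm : new String(surface, off, len);
    }

    public static void main(String[] args) {
        char[] surface = "kuukou".toCharArray();
        // A null base form stands in for what getBaseForm may return.
        System.out.println(baseFormOrSurface(null, surface, 0, surface.length)); // prints kuukou
    }
}
```

With a real dictionary, the first argument would be the value returned by getBaseForm(wordId, surface, off, len).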
-
getPronunciation
public String getPronunciation(int wordId, char[] surface, int off, int len)
Description copied from interface: Dictionary
Get pronunciation of tokens
- Specified by:
getPronunciation in interface Dictionary
- Parameters:
wordId - word ID of token
- Returns:
Pronunciation of the token
-
getInflectionType
public String getInflectionType(int wordId)
Description copied from interface: Dictionary
Get inflection type of tokens
- Specified by:
getInflectionType in interface Dictionary
- Parameters:
wordId - word ID of token
- Returns:
inflection type, or null
-
getInflectionForm
public String getInflectionForm(int wordId)
Description copied from interface: Dictionary
Get inflection form of tokens
- Specified by:
getInflectionForm in interface Dictionary
- Parameters:
wordId - word ID of token
- Returns:
inflection form, or null
-
-