Type Parameters:
Input - the type of the input data to be processed.

public interface Extractor<Input>

Defines the signature of a generic Extractor.
Nested Class Summary | |
---|---|
static interface | Extractor.BlindExtractor: This interface specializes an Extractor to handle a URI as its input format. |
static interface | Extractor.ContentExtractor: This interface specializes an Extractor to handle an InputStream as its input format. |
static interface | Extractor.TagSoupDOMExtractor: This interface specializes an Extractor to handle a Document as its input format. |
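The nested interfaces above each fix the Input type parameter to one concrete input type. The following is a minimal sketch of that pattern only, not the library's own declarations: SimpleExtractor, Blind, Content and SpecializationDemo are hypothetical names, and the single-argument run method is a simplification of the real signature.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;

// Hypothetical, simplified stand-in for the documented interface:
// only the Input type parameter is modelled here.
interface SimpleExtractor<Input> {
    String run(Input in) throws IOException;

    // Fixes Input to URI, in the spirit of Extractor.BlindExtractor.
    interface Blind extends SimpleExtractor<URI> {}

    // Fixes Input to InputStream, in the spirit of Extractor.ContentExtractor.
    interface Content extends SimpleExtractor<InputStream> {}
}

class SpecializationDemo {
    // Runs a URI-based extractor that reports the host of the URI.
    static String describeUri(URI uri) {
        SimpleExtractor.Blind blind = in -> "host=" + in.getHost();
        try {
            return blind.run(uri);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Runs a stream-based extractor that counts the bytes it reads.
    static String countBytes(byte[] data) {
        SimpleExtractor.Content content = in -> {
            int n = 0;
            while (in.read() != -1) {
                n++;
            }
            return "bytes=" + n;
        };
        try {
            return content.run(new ByteArrayInputStream(data));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describeUri(URI.create("http://example.org/page"))); // host=example.org
        System.out.println(countBytes(new byte[] {1, 2, 3}));                   // bytes=3
    }
}
```

Because each specialization fixes the type parameter, the compiler rejects an attempt to pass an InputStream to a URI-based extractor, which is the main benefit of declaring the input type generically.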
Method Summary | |
---|---|
ExtractorDescription | getDescription(): Returns the ExtractorDescription of this extractor. |
void | run(ExtractionParameters extractionParameters, ExtractionContext context, Input in, ExtractionResult out): Executes the extractor. |
Method Detail |
---|
run

void run(ExtractionParameters extractionParameters, ExtractionContext context, Input in, ExtractionResult out) throws IOException, ExtractionException

Parameters:
extractionParameters - the parameters to be applied during the extraction.
context - the document context.
in - the extractor input data.
out - the collector for the extracted data.
Throws:
IOException - on error while reading from the input stream.
ExtractionException - on other errors, such as parse errors.

getDescription

ExtractorDescription getDescription()

Returns:
the ExtractorDescription of this extractor.
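
As an illustration of the two methods described above, here is a minimal sketch of an implementation whose input type is InputStream. Every supporting type in it (ExtractionParameters, ExtractionContext, ExtractionResult, ExtractionException, ExtractorDescription) is an empty or simplified stand-in declared locally so that the example compiles on its own. Their members, such as collect, documentUri and name, are assumptions and not the library's real API. LineExtractor and ExtractorDemo are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the library types named in the signature.
class ExtractionParameters {}

class ExtractionContext {
    final String documentUri; // assumed field, for illustration only
    ExtractionContext(String documentUri) { this.documentUri = documentUri; }
}

class ExtractionResult {
    final List<String> items = new ArrayList<>();
    void collect(String item) { items.add(item); } // assumed collector method
}

class ExtractionException extends Exception {
    ExtractionException(String message) { super(message); }
}

interface ExtractorDescription {
    String name(); // assumed accessor
}

// Mirrors the two documented method signatures.
interface Extractor<Input> {
    void run(ExtractionParameters extractionParameters, ExtractionContext context,
             Input in, ExtractionResult out) throws IOException, ExtractionException;

    ExtractorDescription getDescription();
}

// Toy extractor: collects every non-blank line of the input stream.
class LineExtractor implements Extractor<InputStream> {
    @Override
    public void run(ExtractionParameters extractionParameters, ExtractionContext context,
                    InputStream in, ExtractionResult out)
            throws IOException, ExtractionException {
        if (in == null) {
            throw new ExtractionException("No input for " + context.documentUri);
        }
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        String line;
        while ((line = reader.readLine()) != null) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                out.collect(trimmed);
            }
        }
    }

    @Override
    public ExtractorDescription getDescription() {
        return () -> "line-extractor";
    }
}

class ExtractorDemo {
    // Runs the toy extractor over a string and returns the collected lines.
    static List<String> extract(String text) {
        ExtractionResult out = new ExtractionResult();
        try {
            new LineExtractor().run(
                    new ExtractionParameters(),
                    new ExtractionContext("http://example.org/doc"),
                    new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)),
                    out);
        } catch (IOException | ExtractionException e) {
            throw new IllegalStateException(e);
        }
        return out.items;
    }

    public static void main(String[] args) {
        System.out.println(extract("alpha\n\nbeta\n"));                      // [alpha, beta]
        System.out.println(new LineExtractor().getDescription().name());     // line-extractor
    }
}
```

The sketch follows the documented contract: results go to the out collector rather than a return value, read failures surface as IOException, and other failures surface as ExtractionException.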