public class GTSConnector extends BaseOutputConnector
Modifier and Type | Class and Description |
---|---|
protected static class | GTSConnector.ReaderListener - Reader listener object that extracts the app name |
Modifier and Type | Field and Description |
---|---|
static String | _rcsid |
protected static int | DT_COMPOUND_DOC |
protected static int | DT_MSEXCEL |
protected static int | DT_MSOUTLOOK |
protected static int | DT_MSPOWERPOINT |
protected static int | DT_MSWORD |
protected static int | DT_PDF |
protected static int | DT_TEXT |
protected static int | DT_UNKNOWN |
protected static int | DT_ZERO |
static String | INGEST_ACTIVITY - Ingestion activity |
protected static String[] | ingestableMimeTypeArray |
protected static Map | ingestableMimeTypeMap |
protected HttpPoster | poster - Local data |
static String | REMOVE_ACTIVITY - Document removal activity |
Fields inherited from class BaseConnector: currentContext, params
Fields inherited from interface IPipelineConnector: DOCUMENTSTATUS_ACCEPTED, DOCUMENTSTATUS_REJECTED
Constructor and Description |
---|
GTSConnector() - Constructor. |
Modifier and Type | Method and Description |
---|---|
int | addOrReplaceDocumentWithException(String documentURI, VersionContext pipelineDescription, RepositoryDocument document, String authorityNameString, IOutputAddActivity activities) - Add (or replace) a document in the output data store using the connector. |
String | check() - Test the connection. |
boolean | checkDocumentIndexable(VersionContext outputDescription, File localFile, IOutputCheckActivity activities) - Pre-determine whether a document (passed here as a File object) is indexable by this connector. |
boolean | checkMimeTypeIndexable(VersionContext outputDescription, String mimeType, IOutputCheckActivity activities) - Detect if a mime type is indexable or not. |
void | connect(ConfigParams configParameters) - Connect. |
void | disconnect() - Close the connection. |
protected static int | fingerprint(File file) - Fingerprint a file! Pass in the name of the (local) temporary file that we should be looking at. |
String[] | getActivitiesList() - Return the list of activities that this connector supports. |
protected static String | getAppName(File documentPath) - Get a binary document's APPNAME field, or return null if the document does not seem to be an OLE compound document. |
String | getFormCheckJavascriptMethodName(int connectionSequenceNumber) - Obtain the name of the form check javascript method to call. |
String | getFormPresaveCheckJavascriptMethodName(int connectionSequenceNumber) - Obtain the name of the form presave check javascript method to call. |
VersionContext | getPipelineDescription(Specification spec) - Get an output version string, given an output specification. |
protected void | getSession() - Set up a session. |
protected static String | hexprint(byte x) |
protected static boolean | isStrange(byte x) - Check if character is not typical ASCII. |
protected static boolean | isText(byte[] beginChunk, int chunkLength) - Test to see if a document is text or not. |
protected static boolean | isWhiteSpace(byte x) - Check if a byte is a whitespace character. |
protected static char | nibbleprint(int x) |
void | outputConfigurationBody(IThreadContext threadContext, IHTTPOutput out, Locale locale, ConfigParams parameters, String tabName) - Output the configuration body section. |
void | outputConfigurationHeader(IThreadContext threadContext, IHTTPOutput out, Locale locale, ConfigParams parameters, List<String> tabsArray) - Output the configuration header section. |
void | outputSpecificationBody(IHTTPOutput out, Locale locale, Specification os, int connectionSequenceNumber, int actualSequenceNumber, String tabName) - Output the specification body section. |
void | outputSpecificationHeader(IHTTPOutput out, Locale locale, Specification os, int connectionSequenceNumber, List<String> tabsArray) - Output the specification header section. |
String | processConfigurationPost(IThreadContext threadContext, IPostParameters variableContext, Locale locale, ConfigParams parameters) - Process a configuration post. |
String | processSpecificationPost(IPostParameters variableContext, Locale locale, Specification os, int connectionSequenceNumber) - Process a specification post. |
protected static int | recognizeApp(String appName) - Translate a string application name to one of the kinds of documents we care about. |
void | removeDocument(String documentURI, String outputDescription, IOutputRemoveActivity activities) - Remove a document using the connector. |
void | viewConfiguration(IThreadContext threadContext, IHTTPOutput out, Locale locale, ConfigParams parameters) - View configuration. |
void | viewSpecification(IHTTPOutput out, Locale locale, Specification os, int connectionSequenceNumber) - View specification. |
Methods inherited from class BaseOutputConnector: addOrReplaceDocument, checkDateIndexable, checkDocumentIndexable, checkDocumentIndexable, checkLengthIndexable, checkLengthIndexable, checkMimeTypeIndexable, checkMimeTypeIndexable, checkURLIndexable, checkURLIndexable, getOutputDescription, noteAllRecordsRemoved, noteJobComplete, outputSpecificationBody, outputSpecificationBody, outputSpecificationHeader, outputSpecificationHeader, outputSpecificationHeader, processSpecificationPost, processSpecificationPost, requestInfo, viewSpecification, viewSpecification
Methods inherited from class BaseConnector: clearThreadContext, deinstall, getConfiguration, install, isConnected, outputConfigurationBody, outputConfigurationHeader, outputConfigurationHeader, pack, packFixedList, packList, packList, poll, processConfigurationPost, setThreadContext, unpack, unpackFixedList, unpackList, viewConfiguration
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface IConnector: clearThreadContext, deinstall, getConfiguration, install, isConnected, poll, setThreadContext
public static final String _rcsid
public static final String INGEST_ACTIVITY
public static final String REMOVE_ACTIVITY
protected static final int DT_UNKNOWN
protected static final int DT_COMPOUND_DOC
protected static final int DT_MSWORD
protected static final int DT_MSEXCEL
protected static final int DT_MSPOWERPOINT
protected static final int DT_MSOUTLOOK
protected static final int DT_TEXT
protected static final int DT_ZERO
protected static final int DT_PDF
protected HttpPoster poster
protected static final String[] ingestableMimeTypeArray
protected static final Map ingestableMimeTypeMap
public String[] getActivitiesList()
getActivitiesList
in interface IOutputConnector
getActivitiesList
in class BaseOutputConnector
public void connect(ConfigParams configParameters)
connect
in interface IConnector
connect
in class BaseConnector
configParameters - is the set of configuration parameters, which in this case describe the target appliance, basic auth configuration, etc. (This formerly came out of the ini file.)
public void disconnect() throws ManifoldCFException
disconnect
in interface IConnector
disconnect
in class BaseConnector
ManifoldCFException
protected void getSession() throws ManifoldCFException
ManifoldCFException
public String check() throws ManifoldCFException
check
in interface IConnector
check
in class BaseConnector
ManifoldCFException
public boolean checkMimeTypeIndexable(VersionContext outputDescription, String mimeType, IOutputCheckActivity activities) throws ManifoldCFException, ServiceInterruption
checkMimeTypeIndexable
in interface IPipelineConnector
checkMimeTypeIndexable
in class BaseOutputConnector
mimeType - is the mime type of the document.
ManifoldCFException
ServiceInterruption
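The ingestableMimeTypeArray and ingestableMimeTypeMap fields suggest that MIME type checks are backed by a lookup table. The following is a minimal standalone sketch of that idea, not the connector's code: the class name MimeTypeCheckSketch, the listed MIME types, the case-insensitive comparison, and the stripping of parameters such as "; charset=utf-8" are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class MimeTypeCheckSketch {
  // Hypothetical list; the real contents of ingestableMimeTypeArray are not shown on this page.
  static final String[] INGESTABLE = {
    "application/pdf", "application/msword", "text/plain", "text/html"
  };

  // Built once from the array so each check is a single hash lookup.
  static final Map<String, String> INGESTABLE_MAP = new HashMap<>();
  static {
    for (String type : INGESTABLE) {
      INGESTABLE_MAP.put(type, type);
    }
  }

  // Returns true if the bare MIME type (parameters removed) is in the table.
  static boolean isIndexable(String mimeType) {
    if (mimeType == null) {
      return false;
    }
    int semi = mimeType.indexOf(';');
    String bare = (semi >= 0 ? mimeType.substring(0, semi) : mimeType).trim();
    return INGESTABLE_MAP.containsKey(bare.toLowerCase(Locale.ROOT));
  }

  public static void main(String[] args) {
    System.out.println(isIndexable("Application/PDF"));
    System.out.println(isIndexable("image/png"));
  }
}
```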
public boolean checkDocumentIndexable(VersionContext outputDescription, File localFile, IOutputCheckActivity activities) throws ManifoldCFException, ServiceInterruption
checkDocumentIndexable
in interface IPipelineConnector
checkDocumentIndexable
in class BaseOutputConnector
localFile - is the local file to check.
ManifoldCFException
ServiceInterruption
public VersionContext getPipelineDescription(Specification spec) throws ManifoldCFException, ServiceInterruption
getPipelineDescription
in interface IPipelineConnector
getPipelineDescription
in class BaseOutputConnector
spec - is the current output specification for the job that is doing the crawling.
ManifoldCFException
ServiceInterruption
public int addOrReplaceDocumentWithException(String documentURI, VersionContext pipelineDescription, RepositoryDocument document, String authorityNameString, IOutputAddActivity activities) throws ManifoldCFException, ServiceInterruption, IOException
addOrReplaceDocumentWithException
in interface IPipelineConnector
addOrReplaceDocumentWithException
in class BaseOutputConnector
documentURI - is the URI of the document. The URI is presumed to be the unique identifier which the output data store will use to process and serve the document. This URI is constructed by the repository connector which fetches the document, and is thus universal across all output connectors.
pipelineDescription - includes the description string that was constructed for this document by the getOutputDescription() method.
document - is the document data to be processed (handed to the output data store).
authorityNameString - is the name of the authority responsible for authorizing any access tokens passed in with the repository document. May be null.
activities - is the handle to an object that the implementer of a pipeline connector may use to perform operations, such as logging processing activity, or sending a modified document to the next stage in the pipeline.
IOException - only if there's a stream error reading the document data.
ManifoldCFException
ServiceInterruption
public void removeDocument(String documentURI, String outputDescription, IOutputRemoveActivity activities) throws ManifoldCFException, ServiceInterruption
removeDocument
in interface IOutputConnector
removeDocument
in class BaseOutputConnector
documentURI - is the URI of the document. The URI is presumed to be the unique identifier which the output data store will use to process and serve the document. This URI is constructed by the repository connector which fetches the document, and is thus universal across all output connectors.
outputDescription - is the last description string that was constructed for this document by the getOutputDescription() method above.
activities - is the handle to an object that the implementer of an output connector may use to perform operations, such as logging processing activity.
ManifoldCFException
ServiceInterruption
public void outputConfigurationHeader(IThreadContext threadContext, IHTTPOutput out, Locale locale, ConfigParams parameters, List<String> tabsArray) throws ManifoldCFException, IOException
outputConfigurationHeader
in interface IConnector
outputConfigurationHeader
in class BaseConnector
threadContext - is the local thread context.
out - is the output to which any HTML should be sent.
parameters - are the configuration parameters, as they currently exist, for this connection being configured.
tabsArray - is an array of tab names. Add to this array any tab names that are specific to the connector.
ManifoldCFException
IOException
public void outputConfigurationBody(IThreadContext threadContext, IHTTPOutput out, Locale locale, ConfigParams parameters, String tabName) throws ManifoldCFException, IOException
outputConfigurationBody
in interface IConnector
outputConfigurationBody
in class BaseConnector
threadContext - is the local thread context.
out - is the output to which any HTML should be sent.
parameters - are the configuration parameters, as they currently exist, for this connection being configured.
tabName - is the current tab name.
ManifoldCFException
IOException
public String processConfigurationPost(IThreadContext threadContext, IPostParameters variableContext, Locale locale, ConfigParams parameters) throws ManifoldCFException
processConfigurationPost
in interface IConnector
processConfigurationPost
in class BaseConnector
threadContext - is the local thread context.
variableContext - is the set of variables available from the post, including binary file post information.
parameters - are the configuration parameters, as they currently exist, for this connection being configured.
ManifoldCFException
public void viewConfiguration(IThreadContext threadContext, IHTTPOutput out, Locale locale, ConfigParams parameters) throws ManifoldCFException, IOException
viewConfiguration
in interface IConnector
viewConfiguration
in class BaseConnector
threadContext - is the local thread context.
out - is the output to which any HTML should be sent.
parameters - are the configuration parameters, as they currently exist, for this connection being configured.
ManifoldCFException
IOException
public String getFormCheckJavascriptMethodName(int connectionSequenceNumber)
getFormCheckJavascriptMethodName
in interface IPipelineConnector
getFormCheckJavascriptMethodName
in class BaseOutputConnector
connectionSequenceNumber - is the unique number of this connection within the job.
public String getFormPresaveCheckJavascriptMethodName(int connectionSequenceNumber)
getFormPresaveCheckJavascriptMethodName
in interface IPipelineConnector
getFormPresaveCheckJavascriptMethodName
in class BaseOutputConnector
connectionSequenceNumber - is the unique number of this connection within the job.
public void outputSpecificationHeader(IHTTPOutput out, Locale locale, Specification os, int connectionSequenceNumber, List<String> tabsArray) throws ManifoldCFException, IOException
outputSpecificationHeader
in interface IPipelineConnector
outputSpecificationHeader
in class BaseOutputConnector
out - is the output to which any HTML should be sent.
locale - is the preferred locale of the output.
os - is the current pipeline specification for this connection.
connectionSequenceNumber - is the unique number of this connection within the job.
tabsArray - is an array of tab names. Add to this array any tab names that are specific to the connector.
ManifoldCFException
IOException
public void outputSpecificationBody(IHTTPOutput out, Locale locale, Specification os, int connectionSequenceNumber, int actualSequenceNumber, String tabName) throws ManifoldCFException, IOException
outputSpecificationBody
in interface IPipelineConnector
outputSpecificationBody
in class BaseOutputConnector
out - is the output to which any HTML should be sent.
locale - is the preferred locale of the output.
os - is the current pipeline specification for this job.
connectionSequenceNumber - is the unique number of this connection within the job.
actualSequenceNumber - is the connection within the job that has currently been selected.
tabName - is the current tab name.
ManifoldCFException
IOException
public String processSpecificationPost(IPostParameters variableContext, Locale locale, Specification os, int connectionSequenceNumber) throws ManifoldCFException
processSpecificationPost
in interface IPipelineConnector
processSpecificationPost
in class BaseOutputConnector
variableContext - contains the post data, including binary file-upload information.
locale - is the preferred locale of the output.
os - is the current pipeline specification for this job.
connectionSequenceNumber - is the unique number of this connection within the job.
ManifoldCFException
public void viewSpecification(IHTTPOutput out, Locale locale, Specification os, int connectionSequenceNumber) throws ManifoldCFException, IOException
viewSpecification
in interface IPipelineConnector
viewSpecification
in class BaseOutputConnector
out - is the output to which any HTML should be sent.
locale - is the preferred locale of the output.
connectionSequenceNumber - is the unique number of this connection within the job.
os - is the current pipeline specification for this job.
ManifoldCFException
IOException
protected static int fingerprint(File file) throws ManifoldCFException
ManifoldCFException
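Fingerprinting a file usually means inspecting its leading bytes. The sketch below illustrates one common way to do that and is not taken from the connector's source: it recognizes an empty file, the OLE compound document signature (D0 CF 11 E0 A1 B1 1A E1) and the "%PDF" header. The class name FingerprintSketch, the classify helper, and the numeric values given to the DT_ constants are hypothetical, and this version throws IOException rather than ManifoldCFException.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

final class FingerprintSketch {
  // Hypothetical values; the connector's actual constant values are not shown on this page.
  static final int DT_UNKNOWN = -1;
  static final int DT_COMPOUND_DOC = 0;
  static final int DT_ZERO = 1;
  static final int DT_PDF = 2;

  // Signature of an OLE compound document.
  static final byte[] OLE_MAGIC = {
    (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0, (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
  };
  // Every PDF file starts with "%PDF".
  static final byte[] PDF_MAGIC = {'%', 'P', 'D', 'F'};

  static boolean startsWith(byte[] data, int length, byte[] prefix) {
    if (length < prefix.length) {
      return false;
    }
    for (int i = 0; i < prefix.length; i++) {
      if (data[i] != prefix[i]) {
        return false;
      }
    }
    return true;
  }

  // Classify the first `length` bytes of a file.
  static int classify(byte[] head, int length) {
    if (length == 0) {
      return DT_ZERO;
    }
    if (startsWith(head, length, OLE_MAGIC)) {
      return DT_COMPOUND_DOC;
    }
    if (startsWith(head, length, PDF_MAGIC)) {
      return DT_PDF;
    }
    return DT_UNKNOWN;
  }

  // Read up to 8 leading bytes of the file and classify them.
  static int fingerprint(File file) throws IOException {
    byte[] head = new byte[8];
    int total = 0;
    try (InputStream in = new FileInputStream(file)) {
      while (total < head.length) {
        int n = in.read(head, total, head.length - total);
        if (n < 0) {
          break;
        }
        total += n;
      }
    }
    return classify(head, total);
  }

  public static void main(String[] args) throws IOException {
    for (String path : args) {
      System.out.println(path + " -> " + fingerprint(new File(path)));
    }
  }
}
```

A result of DT_COMPOUND_DOC only says that the container is OLE; telling Word from Excel or PowerPoint needs a further look inside the document, which is presumably what getAppName and recognizeApp are for.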
protected static String getAppName(File documentPath) throws ManifoldCFException
ManifoldCFException
protected static int recognizeApp(String appName)
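As an illustration of translating an application name into a document kind, here is a minimal sketch. It assumes a case-insensitive substring match on the application name. The matching rules, the class name RecognizeAppSketch, and the numeric constant values are hypothetical; the connector's actual rules are not documented on this page.

```java
import java.util.Locale;

final class RecognizeAppSketch {
  // Hypothetical values standing in for the connector's DT_ constants.
  static final int DT_UNKNOWN = -1;
  static final int DT_MSWORD = 1;
  static final int DT_MSEXCEL = 2;
  static final int DT_MSPOWERPOINT = 3;
  static final int DT_MSOUTLOOK = 4;

  // Map an application name (for example, one read from an OLE document) to a document kind.
  static int recognizeApp(String appName) {
    if (appName == null) {
      return DT_UNKNOWN;
    }
    String name = appName.toLowerCase(Locale.ROOT);
    if (name.contains("word")) {
      return DT_MSWORD;
    }
    if (name.contains("excel")) {
      return DT_MSEXCEL;
    }
    if (name.contains("powerpoint")) {
      return DT_MSPOWERPOINT;
    }
    if (name.contains("outlook")) {
      return DT_MSOUTLOOK;
    }
    return DT_UNKNOWN;
  }

  public static void main(String[] args) {
    System.out.println(recognizeApp("Microsoft Office Word"));
    System.out.println(recognizeApp("Some Other Editor"));
  }
}
```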
protected static boolean isText(byte[] beginChunk, int chunkLength)
protected static boolean isStrange(byte x)
protected static boolean isWhiteSpace(byte x)
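The three helpers above fit a familiar text-sniffing pattern: count the bytes in the leading chunk that do not look like ordinary ASCII and treat the document as text if there are few of them. The sketch below shows that pattern under stated assumptions. The set of whitespace bytes, the definition of "strange", the 10% threshold, the treatment of an empty chunk, and the class name TextSniffSketch are all illustrative choices, not the connector's documented behavior.

```java
final class TextSniffSketch {
  // Assumed whitespace set: space, tab, line feed, carriage return, form feed.
  static boolean isWhiteSpace(byte x) {
    return x == ' ' || x == '\t' || x == '\n' || x == '\r' || x == 0x0C;
  }

  // Assumed definition of "not typical ASCII": control characters other than
  // whitespace, DEL, and any byte with the high bit set. Java bytes are signed,
  // so bytes 0x80..0xFF are negative and also satisfy x < 0x20.
  static boolean isStrange(byte x) {
    return (x < 0x20 || x == 0x7F) && !isWhiteSpace(x);
  }

  // Treat the chunk as text if at most 10% of its bytes are strange (assumed threshold).
  static boolean isText(byte[] beginChunk, int chunkLength) {
    if (chunkLength == 0) {
      return false; // assumption: an empty chunk is not classified as text
    }
    int strange = 0;
    for (int i = 0; i < chunkLength; i++) {
      if (isStrange(beginChunk[i])) {
        strange++;
      }
    }
    return strange * 10 <= chunkLength;
  }

  public static void main(String[] args) {
    byte[] sample = {'p', 'l', 'a', 'i', 'n', ' ', 't', 'e', 'x', 't', '\n'};
    System.out.println(isText(sample, sample.length));
  }
}
```

A heuristic like this misclassifies text in multi-byte encodings such as UTF-16, so it is best used as a fallback after signature checks.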
protected static String hexprint(byte x)
protected static char nibbleprint(int x)
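Neither hexprint nor nibbleprint carries a description here. Judging by the names and signatures, they likely render a byte as two hexadecimal digits, one nibble at a time. A minimal sketch of that reading follows; the use of uppercase digits and the class name HexPrintSketch are assumptions.

```java
final class HexPrintSketch {
  // Render the low four bits of x as one hexadecimal digit.
  static char nibbleprint(int x) {
    return "0123456789ABCDEF".charAt(x & 0x0F);
  }

  // Render a byte as two hexadecimal digits, high nibble first.
  // The sign extension from x >> 4 is harmless because nibbleprint masks to four bits.
  static String hexprint(byte x) {
    return new String(new char[] {nibbleprint(x >> 4), nibbleprint(x)});
  }

  public static void main(String[] args) {
    System.out.println(hexprint((byte) 0xAB)); // prints "AB"
  }
}
```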