Modifier and Type | Field and Description |
---|---|
protected int | checkingRobots: This will be set to nonzero if the robots structure is currently in use. |
protected String | hostName: Host name. |
protected long | invalidTime: Timestamp. |
protected boolean | isValid: This flag describes whether or not the host record is valid yet. |
protected int | port: Port. |
protected String | protocol: Protocol. |
protected boolean | readingRobots: This will be set to "true" if the robots.txt for this host is in the process of being read. |
protected ArrayList | records: This is the list of robots records for the host, or null if no robots.txt was found. |
Constructor and Description |
---|
Robots.Host(String protocol, int port, String hostName): Constructor. |
Modifier and Type | Method and Description |
---|---|
boolean | canBeFlushed(long currentTime): Check if the current record can be flushed. |
boolean | isFetchAllowed(IThreadContext threadContext, String throttleGroupName, long currentTime, String pathString, String userAgent, String from, String proxyHost, int proxyPort, String proxyAuthDomain, String proxyAuthUsername, String proxyAuthPassword, IProcessActivity activities, int connectionLimit): Check a given path string against this host's robots file. |
protected void | makeValid(IThreadContext threadContext, String throttleGroupName, long currentTime, String userAgent, String from, String proxyHost, int proxyPort, String proxyAuthDomain, String proxyAuthUsername, String proxyAuthPassword, String hostName, IProcessActivity activities, int connectionLimit): Initialize the record. |
protected void | parseRobotsTxt(BufferedReader r, String hostName, IVersionActivity activities): Parse the robots.txt file using a reader. |
protected String protocol
protected int port
protected String hostName
protected long invalidTime
protected boolean isValid
protected ArrayList records
protected boolean readingRobots
protected int checkingRobots
public boolean isFetchAllowed(IThreadContext threadContext, String throttleGroupName, long currentTime, String pathString, String userAgent, String from, String proxyHost, int proxyPort, String proxyAuthDomain, String proxyAuthUsername, String proxyAuthPassword, IProcessActivity activities, int connectionLimit) throws ServiceInterruption, ManifoldCFException
Parameters:
currentTime - the current time in milliseconds since epoch.
pathString - the path string to check.
Throws:
ServiceInterruption
ManifoldCFException
public boolean canBeFlushed(long currentTime)
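The page does not state the rule canBeFlushed applies. The following is a minimal sketch of one plausible rule, assuming that a record may be flushed only when it is not in use (checkingRobots is zero and readingRobots is false) and currentTime has reached invalidTime. HostRecordSketch is a hypothetical stand-in class, not part of ManifoldCF, and the comparison against invalidTime is an assumption.

```java
/** Hypothetical stand-in for a cached host record; field names mirror the field summary. */
class HostRecordSketch {
    int checkingRobots = 0;        // nonzero while the robots structure is in use
    boolean readingRobots = false; // true while robots.txt is being read
    long invalidTime = 0L;         // assumed: time (ms since epoch) after which the record is stale

    /** Assumed rule: flushable only when idle and past invalidTime. */
    synchronized boolean canBeFlushed(long currentTime) {
        if (checkingRobots > 0 || readingRobots) {
            return false; // still in use, so keep the record
        }
        return currentTime >= invalidTime;
    }

    public static void main(String[] args) {
        HostRecordSketch host = new HostRecordSketch();
        host.invalidTime = 1000L;
        System.out.println(host.canBeFlushed(500L));  // false: not yet expired
        System.out.println(host.canBeFlushed(1500L)); // true: idle and expired
        host.checkingRobots = 1;
        System.out.println(host.canBeFlushed(1500L)); // false: in use
    }
}
```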
protected void makeValid(IThreadContext threadContext, String throttleGroupName, long currentTime, String userAgent, String from, String proxyHost, int proxyPort, String proxyAuthDomain, String proxyAuthUsername, String proxyAuthPassword, String hostName, IProcessActivity activities, int connectionLimit) throws ServiceInterruption, ManifoldCFException
protected void parseRobotsTxt(BufferedReader r, String hostName, IVersionActivity activities) throws IOException, ManifoldCFException
Throws:
IOException
ManifoldCFException
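To illustrate what parsing a robots.txt file from a BufferedReader and checking a path against the resulting records can look like, here is a simplified, self-contained sketch. It is not the ManifoldCF implementation: RobotsSketch, RobotsRecord, parse, parseString, and isAllowed are hypothetical names, the longest-matching-prefix rule is borrowed from the general Robots Exclusion Protocol, and treating a null record list as "everything allowed" is an assumption based on the records field description.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical, simplified robots.txt handling; not the ManifoldCF code. */
class RobotsSketch {
    /** One record: the user agents it applies to and its path-prefix rules. */
    static final class RobotsRecord {
        final List<String> agents = new ArrayList<>();
        final List<String> disallows = new ArrayList<>();
        final List<String> allows = new ArrayList<>();
    }

    /** Reads "User-agent", "Disallow", and "Allow" lines; other lines are ignored. */
    static List<RobotsRecord> parse(BufferedReader r) throws IOException {
        List<RobotsRecord> records = new ArrayList<>();
        RobotsRecord current = null;
        boolean sawRule = false; // true once the current record has a rule line
        String line;
        while ((line = r.readLine()) != null) {
            int hash = line.indexOf('#');
            if (hash >= 0) line = line.substring(0, hash); // strip comments
            int colon = line.indexOf(':');
            if (colon < 0) continue;
            String key = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (key.equals("user-agent")) {
                // A User-agent line after rule lines starts a new record.
                if (current == null || sawRule) {
                    current = new RobotsRecord();
                    records.add(current);
                    sawRule = false;
                }
                current.agents.add(value.toLowerCase(Locale.ROOT));
            } else if (current != null && key.equals("disallow")) {
                sawRule = true;
                if (!value.isEmpty()) current.disallows.add(value);
            } else if (current != null && key.equals("allow")) {
                sawRule = true;
                if (!value.isEmpty()) current.allows.add(value);
            }
        }
        return records;
    }

    /** Convenience wrapper for in-memory text. */
    static List<RobotsRecord> parseString(String text) {
        try {
            return parse(new BufferedReader(new StringReader(text)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Longest matching prefix wins; a tie goes to Allow. */
    static boolean isAllowed(List<RobotsRecord> records, String userAgent, String path) {
        if (records == null) return true; // assumed: no robots.txt means no restrictions
        String ua = userAgent.toLowerCase(Locale.ROOT);
        RobotsRecord specific = null;
        RobotsRecord wildcard = null;
        for (RobotsRecord rec : records) {
            for (String agent : rec.agents) {
                if (agent.equals("*")) {
                    if (wildcard == null) wildcard = rec;
                } else if (!agent.isEmpty() && ua.contains(agent) && specific == null) {
                    specific = rec;
                }
            }
        }
        RobotsRecord rec = (specific != null) ? specific : wildcard;
        if (rec == null) return true;
        int bestAllow = -1;
        int bestDisallow = -1;
        for (String p : rec.allows) {
            if (path.startsWith(p)) bestAllow = Math.max(bestAllow, p.length());
        }
        for (String p : rec.disallows) {
            if (path.startsWith(p)) bestDisallow = Math.max(bestDisallow, p.length());
        }
        return bestAllow >= bestDisallow;
    }

    public static void main(String[] args) {
        List<RobotsRecord> recs = parseString(
            "User-agent: *\nDisallow: /private/\nAllow: /private/public/\n");
        System.out.println(isAllowed(recs, "MyCrawler", "/index.html"));        // true
        System.out.println(isAllowed(recs, "MyCrawler", "/private/x"));         // false
        System.out.println(isAllowed(recs, "MyCrawler", "/private/public/x"));  // true
    }
}
```

The sketch omits everything the real method signatures imply beyond parsing: fetching robots.txt over the network, proxy settings, throttling, and activity logging.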