STRHTTPCRL (Start HTTP Crawl)

Note: To use this command, you must have the 5722-DG1 (HTTP Server) product installed.

 

Purpose

The Start HTTP Crawl (STRHTTPCRL) command starts a crawling session.

Restriction: You must have *IOSYSCFG special authority to use this command.

 

Required Parameters

OPTION
Specifies the function to perform.

The possible values are:

*CRTDOCL
Specifies that a document list is created.
*UPDDOCL
Specifies that a document list is updated.
DOCLIST
Specifies the document list file to store the list of path names of each document found.
OBJECTS
Specifies the set of URL and options objects to use to create or update a document list. Specify this parameter, instead of individual values, to create or update a document list.
URLOBJ
Specifies a URL object name.
OPTOBJ
Specifies an options object name.
DOCDIR
Specifies the document directory where the downloaded documents are stored.
LANG
Specifies the language associated with the URL.

The possible values are:

*ARABIC
*BALTIC
*CENTEUROPE
*CYRILLIC
*ESTONIAN
*GREEK
*HEBREW
*JAPANESE
*KOREAN
*SIMPCHINESE
*TRADCHINESE
*THAI
*TURKISH
*WESTERN
URLLIST
Specifies the URL, URL filter, and crawl depth. Used only for OPTION (*CRTURLOBJ) or OPTION (*UPDURLOBJ).
URL
Specifies the URL to start the crawl.
URLFTR
Specifies the URL filter or domain name restriction.
MAXDEPTH
Specifies the maximum crawl depth. The top level is the URL and is known as level 0. Links on level 0 go to level 1 pages. Links on level 1 pages go to level 2 pages, and so on.
MAXSIZE
Specifies the maximum size of a file to download. Only files that are within this limit are downloaded.
MAXSTGSIZE
Specifies the maximum storage size for downloaded files.
MAXTHD
Specifies the maximum number of threads.
MAXRUNTIME
Specifies the maximum run time, that is, the maximum amount of time in hours and minutes that the crawl should run. Any file that is being downloaded when the time expires is downloaded completely.
LOGFILE
Specifies the path and name of the crawl activity log file. This file contains the URLs that are found and downloaded, as well as any exceptions (such as an empty file) that are found during the crawl.
CLRLOG
Specifies whether the log file should be cleared before writing to it.
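
The crawl levels described under MAXDEPTH, together with the MAXSIZE limit, can be illustrated with a small sketch. This is not the crawler's actual implementation; it is a minimal breadth-first model in Python over a hypothetical in-memory site. The SITE table, the crawl function, and the example.com URLs are all assumptions made for illustration, as is the choice not to follow links from a page that exceeds the size limit.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (size in bytes, linked URLs).
# None of these names come from the STRHTTPCRL command itself.
SITE = {
    "http://example.com/": (1200, ["http://example.com/a", "http://example.com/b"]),
    "http://example.com/a": (800, ["http://example.com/c"]),
    "http://example.com/b": (50_000, ["http://example.com/d"]),
    "http://example.com/c": (300, []),
    "http://example.com/d": (400, []),
}


def crawl(start_url, max_depth, max_size):
    """Return {url: level} for each page the model would download.

    The starting URL is level 0; links on a level-N page lead to
    level-(N + 1) pages, mirroring the MAXDEPTH description.
    """
    downloaded = {}
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, level = queue.popleft()
        size, links = SITE[url]
        if size > max_size:
            # Over the size limit: not downloaded. Assumption: its links
            # are not followed either, since the page was never fetched.
            continue
        downloaded[url] = level
        if level == max_depth:
            # Deepest allowed level reached: do not follow links further.
            continue
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append((link, level + 1))
    return downloaded
```

In this model, crawl("http://example.com/", 1, 10_000) downloads only the starting page (level 0) and page a (level 1): page b exceeds the size limit, and page c sits on level 2, beyond the depth limit.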

 

Optional Parameters

PRXSVR
Specifies the proxy server for HTTP.
PRXPORT
Specifies the proxy server port for HTTP.
SECPRXSVR
Specifies the proxy server for HTTPS.
SECPRXPORT
Specifies the proxy server port for HTTPS.

Error messages for STRHTTPCRL

*ESCAPE messages

HTP160C
Request to create or append to a document list failed. Reason &1.
HTP166E
Request to print the status of a document list failed. Reason &1.