STRHTTPCRL (Start HTTP Crawl)
Note: To use this command, you must have the 5722-DG1 (HTTP Server) product installed.
Purpose
The Start HTTP Crawl (STRHTTPCRL) command starts a crawling session.
Restriction: You must have *IOSYSCFG special authority to use this command.
Required Parameters
- OPTION
- Specifies the function to perform.
The possible values are:
- *CRTDOCL
- Specifies that a document list is created.
- *UPDDOCL
- Specifies that a document list is updated.
- DOCLIST
- Specifies the document list file to store the list of path names of each document found.
- OBJECTS
- Specifies the set of URL and options objects to use to create or update a document list. Specify this parameter instead of supplying individual values.
- URLOBJ
- Specifies a URL object name.
- OPTOBJ
- Specifies an options object name.
- DOCDIR
- Specifies the document directory where the downloaded documents are stored.
- LANG
- Specifies the language associated with the URL.
The possible values are:
- *ARABIC
- *BALTIC
- *CENTEUROPE
- *CYRILLIC
- *ESTONIAN
- *GREEK
- *HEBREW
- *JAPANESE
- *KOREAN
- *SIMPCHINESE
- *TRADCHINESE
- *THAI
- *TURKISH
- *WESTERN
- URLLIST
- Specifies the URL, URL filter, and crawl depth. Used only for OPTION (*CRTURLOBJ) or OPTION (*UPDURLOBJ).
- URL
- Specifies the URL to start the crawl.
- URLFTR
- Specifies the URL filter or domain name restriction.
- MAXDEPTH
- Specifies the maximum crawl depth. The top level is the URL and is known as level 0. Links on level 0 go to level 1 pages. Links on level 1 pages go to level 2 pages, and so on.
- MAXSIZE
- Specifies the maximum size of a file to download. Only files within this limit are downloaded.
- MAXSTGSIZE
- Specifies the maximum storage size for downloaded files.
- MAXTHD
- Specifies the maximum number of threads.
- MAXRUNTIME
- Specifies the maximum amount of time, in hours and minutes, that the crawl should run. Any file that is being downloaded when the time expires is downloaded completely.
- LOGFILE
- Specifies the path and name of the crawl activity log file. This file contains the URLs that are found and downloaded, as well as any exceptions (such as an empty file) that are found during the crawl.
- CLRLOG
- Specifies whether the log file should be cleared before writing to it.
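The level numbering described for MAXDEPTH can be illustrated with a small sketch. This is not the crawler's implementation. It is a minimal Python model, assuming a breadth-first traversal over a hypothetical in-memory link table (LINKS) and assuming the URL filter behaves as a simple substring match. The function name crawl_levels and all URLs are made up for illustration.

```python
from collections import deque

# Hypothetical link table standing in for real pages: page -> links on it.
LINKS = {
    "http://example.com/": ["http://example.com/a", "http://example.com/b"],
    "http://example.com/a": ["http://example.com/a1", "http://other.example.org/x"],
    "http://example.com/b": [],
    "http://example.com/a1": ["http://example.com/a2"],
}


def crawl_levels(start_url, max_depth, url_filter=""):
    """Return {url: level} for every page reachable within max_depth.

    The starting URL is level 0; links on a level-n page are level n+1.
    url_filter is modeled as a substring match (an assumption).
    """
    levels = {start_url: 0}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        level = levels[url]
        if level == max_depth:
            continue  # links on this page would exceed the depth limit
        for link in LINKS.get(url, []):
            if link in levels:
                continue  # already assigned a level
            if url_filter and url_filter not in link:
                continue  # outside the filter, skip it
            levels[link] = level + 1
            queue.append(link)
    return levels


result = crawl_levels("http://example.com/", 2, "example.com")
for page, level in sorted(result.items(), key=lambda item: item[1]):
    print(level, page)
```

With a maximum depth of 2, the page a2 (which would be level 3) is never visited, and the page on other.example.org is rejected by the filter.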
Optional Parameters
- PRXSVR
- Specifies the proxy server for HTTP.
- PRXPORT
- Specifies the proxy server port for HTTP.
- SECPRXSVR
- Specifies the proxy server for HTTPS.
- SECPRXPORT
- Specifies the proxy server port for HTTPS.
Error messages for STRHTTPCRL
*ESCAPE messages
- HTP160C
- Request to create or append to a document list failed. Reason &1.
- HTP166E
- Request to print the status of a document list failed. Reason &1.