Rob's search thingy: crawler options

-a
Generate abstract file.
-c
Crawl websites instead of indexing local files.
This also enables charset conversion.
-b
Broken server: Do a combined HEAD and GET.
Some broken servers respond to a HEAD request as if it were a GET; others return a 403 or 500 error.
By default, the software does a HEAD request and, if the content-type turns out to be text, HTML or PDF, follows up with a GET.
With the '-b' option, the software does a GET right away and aborts the download if the content-type isn't text, HTML or PDF. These aborted downloads are reported as: 'Operation was aborted by an application callback'.
The file may be downloaded completely anyway, especially a small file fetched over a fast link; the abort works best when downloading large files over slow links.
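The '-b' behaviour hinges on a content-type check made inside a download callback. A minimal sketch of that check, in Python (the function name and the exact MIME-type list are assumptions for illustration, not the tool's actual code):

```python
def should_abort(content_type: str) -> bool:
    """Return True if a '-b' style download should be aborted.

    Keeps anything textual (text/plain, text/html, ...) and PDF;
    aborts everything else (images, archives, binaries, ...).
    """
    # Strip parameters such as "; charset=utf-8" and normalize case.
    mime = content_type.split(";")[0].strip().lower()
    if mime.startswith("text/"):
        return False
    if mime == "application/pdf":
        return False
    return True
```

In a callback-driven HTTP client, returning an abort signal from such a check typically surfaces as the 'aborted by an application callback' error mentioned above.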
-d
Enable debug.
-e
Enable external link report.
-f List_Of_Sites
List of files to be indexed. With '-c', a list of websites to be crawled.
-l
Allow more than 64k words.
-p
Index PDF files.
pdftotext needs to be installed for this.
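pdftotext (from poppler-utils) writes extracted text to stdout when '-' is given as the output file. A hypothetical helper showing how an indexer might shell out to it (the helper names are illustrative, not part of this tool):

```python
import shutil
import subprocess


def pdftotext_cmd(pdf_path: str) -> list:
    # "pdftotext FILE -" sends the extracted text to stdout.
    return ["pdftotext", pdf_path, "-"]


def extract_pdf_text(pdf_path: str) -> str:
    """Run pdftotext on pdf_path and return the extracted text."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext is not installed")
    result = subprocess.run(
        pdftotext_cmd(pdf_path),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```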
-r
Re-use the old wordlist, updating it to become the new wordlist.
-s
Print word stats.
Warning: Long list!
-t
Text output.
Can be used for debugging.
-u
Index non-ASCII.
This assumes UTF-8.
Note: Without this option the indexer will treat all non-ASCII as word delimiters.
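The difference can be illustrated with two tokenizers: one that treats every non-ASCII character as a word delimiter (the default) and one that keeps UTF-8 word characters together (the '-u' behaviour). The function names are illustrative, and this is only a sketch of the two modes:

```python
import re


def tokenize_ascii(text: str) -> list:
    # Default mode: anything outside ASCII letters/digits delimits words.
    return [t for t in re.split(r"[^A-Za-z0-9]+", text) if t]


def tokenize_utf8(text: str) -> list:
    # '-u' mode: Unicode word characters stay inside one token.
    return re.findall(r"\w+", text)
```

For input like "café menu", the ASCII tokenizer splits at the accented character, while the UTF-8 tokenizer keeps the word intact.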
-v
Print version and exit.
-w Wait_Time
Wait time between documents, in seconds (millisecond resolution).
Default: 1 s.