What are optional fields?

Page Revision: 2008/04/02 23:51

Optional fields are not what you might think. Mark a field as optional only when you HAVE to. The checkbox looks innocuous, but you should hardly ever check it: each optional field increases the extraction time by a factor of 2, so three optional fields make an extraction about eight times slower.

You should only mark a field as optional if its tags are similar to those of other fields and you sometimes want to act as if the field is not there at all. For example, say that we have three fields to extract (name, city, and state) from the following HTML:


We want Craig Junior for the name field, so we choose a start tag that comes just before the name and the end tag to be </TD>. The start tag for city is <BR> and the end tag is <BR>. The same tags are used for state.

However, sometimes the name is not there:


So, <BR>Phoenix<BR>AZ<BR> gets extracted for name, and city and state are left blank, since their start tags are not found after the </TD>.
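
The failure above can be reproduced with a short Python sketch. It is not Web Scraper's actual code: the extract function, the sample HTML, and the <TD> start tag for name are assumptions made for illustration. The sketch also assumes that fields are matched in order and that a field's end tag is left in place so it can serve as the next field's start tag, which is what lets city and state share <BR>.

```python
# Hypothetical field definitions: (field name, start tag, end tag).
FIELDS = [
    ("name", "<TD>", "</TD>"),
    ("city", "<BR>", "<BR>"),
    ("state", "<BR>", "<BR>"),
]

# Hypothetical sample rows, with and without a name.
WITH_NAME = "<TD>Craig Junior</TD><TD><BR>Phoenix<BR>AZ<BR></TD>"
WITHOUT_NAME = "<TD><BR>Phoenix<BR>AZ<BR></TD>"


def extract(html, fields):
    """Extract each field in order, scanning forward through the HTML."""
    result = {}
    pos = 0
    for field, start_tag, end_tag in fields:
        start = html.find(start_tag, pos)
        if start == -1:
            result[field] = ""  # start tag not found: field left blank
            continue
        value_start = start + len(start_tag)
        end = html.find(end_tag, value_start)
        if end == -1:
            result[field] = ""  # end tag not found: field left blank
            continue
        result[field] = html[value_start:end]
        # Assumption: the end tag is not consumed, so the next field
        # may reuse it as its own start tag.
        pos = end
    return result


if __name__ == "__main__":
    print(extract(WITH_NAME, FIELDS))
    print(extract(WITHOUT_NAME, FIELDS))
```

With the second sample, name swallows everything up to the </TD>, and the search for city and state starts too late to find a <BR>.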

However, if you make the name field optional, Web Scraper will perform an extraction with the name field included and get the results above, with two out of three fields blank. It will also try an extraction without the name field and see that only one of the three fields is left blank. Thus it uses the results from the second extraction, leaving you with the maximum number of non-blank fields.
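
As a rough illustration of this behavior, here is a minimal Python sketch. It is an assumption about how such a search could work, not Web Scraper's implementation: the function names, the sample HTML, and the <TD> start tag for name are all hypothetical. It tries every combination of included and skipped optional fields and keeps the result with the most non-blank fields, which is also why each optional field doubles the work.

```python
from itertools import combinations

# Hypothetical field definitions: (field name, start tag, end tag).
FIELDS = [
    ("name", "<TD>", "</TD>"),
    ("city", "<BR>", "<BR>"),
    ("state", "<BR>", "<BR>"),
]

# Hypothetical sample row in which the name is missing.
WITHOUT_NAME = "<TD><BR>Phoenix<BR>AZ<BR></TD>"


def extract(html, fields):
    """Extract each field in order; a field whose tags are missing is blank."""
    result = {}
    pos = 0
    for field, start_tag, end_tag in fields:
        start = html.find(start_tag, pos)
        end = -1
        if start != -1:
            value_start = start + len(start_tag)
            end = html.find(end_tag, value_start)
        if end == -1:
            result[field] = ""
            continue
        result[field] = html[value_start:end]
        pos = end  # assumption: the end tag may double as the next start tag
    return result


def extract_with_optional(html, fields, optional):
    """Try every subset of skipped optional fields; keep the fullest result.

    Returns (best_result, number_of_extractions_attempted).
    """
    best, best_filled, attempts = None, -1, 0
    for k in range(len(optional) + 1):
        for skipped in combinations(optional, k):
            attempts += 1
            active = [f for f in fields if f[0] not in skipped]
            result = extract(html, active)
            for field in skipped:
                result[field] = ""  # a skipped field is reported as blank
            filled = sum(1 for value in result.values() if value)
            if filled > best_filled:  # ties keep the earlier attempt
                best, best_filled = result, filled
    return best, attempts


if __name__ == "__main__":
    result, attempts = extract_with_optional(WITHOUT_NAME, FIELDS, ["name"])
    print(result, attempts)
```

In this sketch one optional field costs 2 extractions and two optional fields cost 4, matching the factor of 2 per optional field.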
