
SearchMonkey Guide

Step 3: Data Extraction

The third step of creating a Page data service is to specify your page extraction rules. Given a particular page structure, you must specify an XSLT stylesheet to extract the desired data and represent it as DataRSS. This step is the heart of your data service.
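
As a rough illustration of what such a stylesheet might look like, here is a minimal sketch. It assumes the page arrives as well-formed XML with no default namespace, that the title sits in an h1 element with class "title", and that the main image is an img element with class "photo". Those selectors, the smid parameter, the adjunct wrapper, and the rel:Card, rel:Photo, and dc:title names are illustrative assumptions rather than values taken from this guide, so check them against the DataRSS reference before relying on them.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical parameter: an identifier for this data service. -->
  <xsl:param name="smid" select="'example'"/>

  <xsl:template match="/">
    <!-- Assumed DataRSS wrapper element; verify against the DataRSS reference. -->
    <adjunct id="smid:{$smid}" version="1.0">
      <item rel="rel:Card">
        <!-- Literal value: the text of the first matching heading,
             with surrounding whitespace collapsed. -->
        <meta property="dc:title">
          <xsl:value-of select="normalize-space(//h1[@class='title'])"/>
        </meta>
        <!-- Resource value: the URL of the first matching image in
             document order; nothing is emitted if the page has none. -->
        <xsl:for-each select="(//img[@class='photo'])[1]">
          <item rel="rel:Photo" resource="{@src}"/>
        </xsl:for-each>
      </item>
    </adjunct>
  </xsl:template>

</xsl:stylesheet>
```

Wrapping the image lookup in xsl:for-each keeps the output free of an empty resource attribute when a test URL has no matching image, which should make the results easier to read across several test URLs.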

Figure 2.4. Data Extraction Screen

Note

If there are any problems with the extraction code, the Preview Pane displays a bulleted list of warnings and errors.

At the bottom of the screen is the Preview Pane, which displays the results of your data service for one of your test URLs.

Figure 2.5. Preview Pane: Data Extraction Screen


If all is well, the Preview Pane displays an HTML bulleted list of name/value pairs representing the DataRSS structure for that URL. <item> rel attributes and <meta> property attributes appear as regular text, while literal <meta> values and the values of resource attributes appear in bold. If the extraction code has problems, the pane instead lists the resulting warnings and errors.

The Preview Pane contains several controls for cycling through your test URLs and determining your data inputs and outputs: