SearchMonkey Guide

The Data Class

When you access the Data Mapping screen, SearchMonkey automatically creates an instance of the Data class. The Data class represents the DataRSS XML returned by all the services you selected on the Data Service Selection screen.

[Note] Note

Always try to use Data::get() before resorting to Data::xpath() or Data::xpathString(). Data::get() is easier to use, and it tends to be more robust if the underlying XML structure happens to change. You should only use the XPath functions if you have a problem that requires the full power of the DOM.

The Data class provides three static PHP helper functions for extracting data:

public static function get()

string get ( string $expression ) 

The function Data::get() retrieves a string value from Data's XML using a simplified expression format.

In general, you do not need to construct Data::get() expressions yourself. For each element and attribute of interest in your DataRSS feed, SearchMonkey provides a button in the web GUI that generates the necessary expression for you.

Figure C.1. PHP Insert Buttons

Data::get() works well for retrieving most DataRSS elements and attributes. However, Data::get() fails and returns FALSE if any property, rel, or id attribute value in your expression contains a ".", a "/", or a ":" other than the single ":" that separates a prefix from a name (as in "dc:title"). For example, trying to retrieve the vcard data in:

<adjunct id='smid:a:b:c'>
 <item rel='rel:Card'>
  <meta property='vcard:a/b/c/d'>Data</meta>
 </item>
</adjunct>

would fail. For these sorts of situations, use Data::xpath() or Data::xpathString() instead.
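To illustrate why an XPath query copes with such values, here is a minimal sketch that uses PHP's standard DOMDocument and DOMXPath classes rather than Data::xpath() itself, whose signature is not covered in this section. The variable names are invented for the example, and the sample XML is the fragment shown above.

```php
<?php
// The fragment that Data::get() cannot address.
$xml = <<<XML
<adjunct id='smid:a:b:c'>
 <item rel='rel:Card'>
  <meta property='vcard:a/b/c/d'>Data</meta>
 </item>
</adjunct>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// In XPath, attribute values are quoted string literals, so the extra ":" and "/"
// characters inside them are not mistaken for path delimiters.
$query = '/adjunct[@id="smid:a:b:c"]/item[@rel="rel:Card"]/meta[@property="vcard:a/b/c/d"]';
$nodes = $xpath->query($query);

// Fall back to false when nothing matches, mirroring the convention of Data::get().
$value = ($nodes !== false && $nodes->length > 0) ? $nodes->item(0)->nodeValue : false;
```

The price of this flexibility is that the query spells out the element structure explicitly, so it is more likely to need changes if the layout of the feed changes.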

Parameters

  • expression — Specifies a path to an element or attribute within a DataRSS feed. A Data::get() expression uses a forward slash ("/") as a delimiter, and relies on the fact that in a DataRSS feed, you can retrieve any element simply by specifying a trail of rel and property attribute values. Given that a typical <adjunct> has the structure:

    <adjunct id="id" version="version">
     <item rel="rel" resource="resource">
      <meta property="property" datatype="datatype">Data</meta>
      <item rel="rel" resource="resource"> ... </item>
     </item>
     ...
    </adjunct>

    you can walk the path to a given element value by starting with the id of the desired <adjunct>, followed by the rel of the desired <item>, followed by the property of the desired <meta>, and so on. For example, in a DataRSS feed that contains:

    <adjunctcontainer>
      <adjunct id="yahoo:index">
        ...
      </adjunct>
      <adjunct id="smid:aa3L2" version="1.0">
        <item rel="rel:Resource">
          <meta property="dc:title">Smoked Salmon and Egg Salad</meta>
          <meta property="dc:publisher">http://www.allrecipes.com</meta>
          ...
        </item>
        ...
      </adjunct>
    </adjunctcontainer>

    a path of "smid:aa3L2/rel:Resource/dc:title" would retrieve the string value, "Smoked Salmon and Egg Salad".

    To retrieve an attribute value, append the attribute name, prefixed with an @ sign, as the last step of the path. For example, a path of "yahoo:index/rel:Posting/rel:Link/@resource" retrieves the value of the resource attribute:

    <adjunctcontainer>
      <adjunct id="yahoo:index">
        ...
        <item rel="rel:Posting">
          ... 
          <item rel="rel:Link" resource="http://foo.bar.com" />
        </item>
      </adjunct>
    </adjunctcontainer>

    returning a value of "http://foo.bar.com".
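The path-walking rule described above can be sketched outside SearchMonkey with PHP's standard DOM classes. The function sketch_get() below is a hypothetical stand-in written for this illustration; it is not SearchMonkey's implementation of Data::get(), and it assumes a feed without XML namespaces, like the samples above. It translates each step of the expression into an XPath test on the rel or property attribute.

```php
<?php
// Hypothetical stand-in for Data::get(); for illustration only.
function sketch_get(DOMDocument $doc, string $expression)
{
    $steps = explode('/', $expression);
    // Require an adjunct id plus at least one more step, with no empty steps
    // and no double quotes (which would break the generated XPath literals).
    if (count($steps) < 2 || in_array('', $steps, true) || strpos($expression, '"') !== false) {
        return false;
    }

    // First step: the id of the desired <adjunct>.
    $query = '//adjunct[@id="' . array_shift($steps) . '"]';

    // A trailing "@name" step selects an attribute of the last matched element.
    $attribute = null;
    if ($steps[count($steps) - 1][0] === '@') {
        $attribute = substr(array_pop($steps), 1);
    }

    // Each remaining step matches an <item> by rel or a <meta> by property.
    foreach ($steps as $step) {
        $query .= '/*[(self::item and @rel="' . $step . '")'
                . ' or (self::meta and @property="' . $step . '")]';
    }
    if ($attribute !== null) {
        $query .= '/@' . $attribute;
    }

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return false;
    }
    return $nodes->item(0)->nodeValue;
}

// Sample feed combining the two fragments above, with the elided parts left out.
$feed = <<<XML
<adjunctcontainer>
  <adjunct id="yahoo:index">
    <item rel="rel:Posting">
      <item rel="rel:Link" resource="http://foo.bar.com" />
    </item>
  </adjunct>
  <adjunct id="smid:aa3L2" version="1.0">
    <item rel="rel:Resource">
      <meta property="dc:title">Smoked Salmon and Egg Salad</meta>
      <meta property="dc:publisher">http://www.allrecipes.com</meta>
    </item>
  </adjunct>
</adjunctcontainer>
XML;

$doc = new DOMDocument();
$doc->loadXML($feed);

$title = sketch_get($doc, 'smid:aa3L2/rel:Resource/dc:title');
$link  = sketch_get($doc, 'yahoo:index/rel:Posting/rel:Link/@resource');
```

Within a SearchMonkey application, the equivalent calls would simply be Data::get() with the same path strings.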

Returns

A string representing the nodeValue of the specified DataRSS element or attribute. The function returns FALSE if:

  • the provided expression is invalid

  • the provided expression does not match an attribute or element in the source DataRSS

  • a property, rel, or id attribute value in the expression contains a ".", a "/", or a ":" other than the single ":" that separates a prefix from a name (as in "dc:title")
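Because failure is signalled by FALSE rather than by an exception, callers need to test the result explicitly. The helper below, value_or_default(), is a hypothetical name introduced for this sketch. It shows one way to do so with a strict comparison, which matters in PHP because a legitimate string such as "0" is also falsy under a loose check.

```php
<?php
// Hypothetical helper: $result is whatever Data::get() returned.
function value_or_default($result, string $default): string
{
    // Strict comparison distinguishes a real FALSE from falsy strings
    // such as "0" or "", which a loose test like !$result would reject.
    return ($result === false) ? $default : $result;
}

// Inside SearchMonkey this might be used as:
//   $title = value_or_default(Data::get('smid:aa3L2/rel:Resource/dc:title'), 'Untitled');
```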