Enterprise IT Consultant Views on Technologies and Trends

Mar 25 2011   2:50AM GMT

Advanced Mashup Scripting in EMML – Part I

Sasirekha R

EMML Advanced Mashup Techniques – Part I

Mashups are expected to be built directly by business users, without requiring programming skills, and EMML, being a tag language, is easy to learn and use. Quite a few useful mashups can be created using only the simple statements and commands of EMML.

As mashup usage accelerates, the tendency to create more powerful and complex mashups will follow, and these call for more advanced techniques.

In this blog, I have tried to give the crux of some of these advanced techniques.

Web clipping for direct use of HTML

Using the web clipping (or screen scraping) technique, you can get the entire HTML from any URL as a service response. The EMML Reference Runtime Engine converts the HTML retrieved by the <directinvoke> statement to XHTML in the http://www.w3.org/1999/xhtml namespace. You can then filter, combine or transform this XHTML to create your own output.

Example of <directinvoke> resulting in web clipping:

<operation name="queryGoogle">
  <output name="result" type="document">
    <res:queries/>
  </output>
  <directinvoke outputvariable="$searchresult"
     endpoint="http://www.google.com/search?q=EMML"/>
  <foreach variable="$query" items="$searchresult//xhtml:a[@class='l']">
    <appendresult outputvariable="$result">
      <res:itemlink>{$query/@href}</res:itemlink>
    </appendresult>
  </foreach>
</operation>

The itemlink list in the XML result is as follows:

<itemlink href="http://en.wikipedia.org/wiki/EMML/"/>
<itemlink href="http://www.openmashup.org/omadocs/v1.0/index.html"/>
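To make the XPath step concrete outside the EMML engine, here is a minimal Java sketch that applies the same kind of expression (xhtml:a elements with class 'l') to a small XHTML string using the standard javax.xml.xpath API. The class name WebClipSketch, the SAMPLE markup and the example.com URLs are illustrative assumptions, not part of EMML; the point is only that the xhtml prefix must be bound to the http://www.w3.org/1999/xhtml namespace for the expression to match.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class WebClipSketch {
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    // Hypothetical stand-in for a clipped page that is already XHTML.
    static final String SAMPLE =
        "<html xmlns='http://www.w3.org/1999/xhtml'><body>"
      + "<a class='l' href='http://example.com/one'>One</a>"
      + "<a class='nav' href='http://example.com/skip'>Skip</a>"
      + "<a class='l' href='http://example.com/two'>Two</a>"
      + "</body></html>";

    // Returns the href of every xhtml:a element whose class is 'l'.
    static List<String> extractLinks(String xhtml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // required for namespace-qualified XPath
            Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            // Bind the "xhtml" prefix used in the expression to the XHTML namespace.
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    return "xhtml".equals(prefix) ? XHTML_NS : XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String uri) { return null; }
                public Iterator<String> getPrefixes(String uri) { return null; }
            });

            NodeList hrefs = (NodeList) xpath.evaluate(
                "//xhtml:a[@class='l']/@href", doc, XPathConstants.NODESET);
            List<String> links = new ArrayList<>();
            for (int i = 0; i < hrefs.getLength(); i++) {
                links.add(hrefs.item(i).getNodeValue());
            }
            return links;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        extractLinks(SAMPLE).forEach(System.out::println);
    }
}
```

Without the namespace binding, an unprefixed //a would match nothing in a namespace-aware parse, which is why the EMML example also uses the xhtml: prefix.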

Normalizing Data for Effective Joins, Grouping or Filtering

Mashups, as the name suggests, are expected to combine results from different services that were not designed to work with each other. Though the data obtained from these services is similar, it is rarely in identical form. To join, group or filter such results effectively, the data must first be normalized to a single representation. Creating a custom XPath function is a convenient way to normalize data.

The following example shows joining mortgage rates from two web sites: one uses the ARM (adjustable-rate mortgage) term and the second uses custom terms:

Create the custom XPath function, which normalizes the custom terminology, as a Java class MyFinanceFunctions that extends org.oma.emml.client.EMMLUserFunction.

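package com.mycompany.mashups;
import java.util.HashSet;
import java.util.Set;
import org.oma.emml.client.EMMLUserFunction;

public class MyFinanceFunctions extends EMMLUserFunction {
  // Custom product names that should be treated as a 5-Year ARM
  static Set<String> mortgageAliases = new HashSet<String>();
  static { mortgageAliases.add("5/1 Orange Mortgage"); }
  // Maps a custom product name to the common term; other names pass through
  public static String mortgage(String data) {
    if (mortgageAliases.contains(data))
      return "5-Year ARM";
    return data;
  }
}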

Compile this class, adding web-apps-home/emml/WEB-INF/lib/emml.jar to the classpath. Deploy the compiled class to web-apps-home/emml/WEB-INF/classes for the EMML Engine that hosts the mashups using this function.

Add a namespace for the class as an xmlns attribute to the <mashup> tag that uses this function.

<mashup xmlns:finance="java:com.mycompany.mashups.MyFinanceFunctions"
        name="MortgageComparisons">

Use the custom function in the XPath expressions in the <join> statement (or where data needs to be normalized).

<output name="result" type="document"/>
  <join outputvariable="$result"
    joincondition="$feed1/feed/finance:mortgage(Product) =
    $feed2/feed/finance:mortgage(Product)"/>
  <display message="result = " expr="$result"/>

….
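The effect of normalizing before joining can be sketched in plain Java, independent of the EMML engine. In this minimal sketch the mortgage function mirrors the alias mapping above, while the class name NormalizedJoinSketch, the two sample feeds and their rate values are made-up assumptions for illustration only. Each feed is modelled as a map from product name to rate, and the join key is the normalized product name.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class NormalizedJoinSketch {
    // Custom names that should be treated as the common "5-Year ARM" term.
    static final Set<String> MORTGAGE_ALIASES = new HashSet<>();
    // Hypothetical feeds: product name -> rate (values are invented).
    static final Map<String, String> FEED1 = new LinkedHashMap<>();
    static final Map<String, String> FEED2 = new LinkedHashMap<>();
    static {
        MORTGAGE_ALIASES.add("5/1 Orange Mortgage");
        FEED1.put("5-Year ARM", "3.25%");
        FEED1.put("30-Year Fixed", "4.80%");
        FEED2.put("5/1 Orange Mortgage", "3.10%");
        FEED2.put("15-Year Fixed", "4.10%");
    }

    // Same idea as the custom XPath function: map aliases to one term.
    static String mortgage(String product) {
        return MORTGAGE_ALIASES.contains(product) ? "5-Year ARM" : product;
    }

    // Inner join of the two feeds on the normalized product name.
    static List<String> join(Map<String, String> feed1, Map<String, String> feed2) {
        Map<String, String> secondByKey = new HashMap<>();
        for (Map.Entry<String, String> e : feed2.entrySet()) {
            secondByKey.put(mortgage(e.getKey()), e.getValue());
        }
        List<String> joined = new ArrayList<>();
        for (Map.Entry<String, String> e : feed1.entrySet()) {
            String key = mortgage(e.getKey());
            String other = secondByKey.get(key);
            if (other != null) {
                joined.add(key + ": " + e.getValue() + " vs " + other);
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        join(FEED1, FEED2).forEach(System.out::println);
    }
}
```

Joining on the raw product names would match nothing here, because the two sample feeds never use the same string for the same product; applying the normalizing function to both sides of the comparison is what lets the rows line up.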

Removing Duplicates With Filtering

To remove duplicates in a mashup, first merge, join or group the results. If needed, sort the combined results on the key field that determines uniqueness, so that duplicates are contiguous. Then use <filter> with a filtering expression that compares the key value of the preceding or following item to determine whether the current item is unique. The filter expression can use the axis feature in XPath to refer to preceding or following items.

Following is a simple <filter> statement:

<filter inputvariable="$a" outputvariable="$a" filterexpr="/rss/channel/item[contains(title,'Java')]"/>

In addition to the default XPath axis (the child axis), you can refer to previous nodes (preceding / preceding-sibling), following nodes (following / following-sibling), the parent node, ancestor nodes, descendant nodes and others. You can also use wildcards like following::* or following::node() to identify all following nodes of any name.

The following example checks the title of each item (after merging results from two RSS services) to remove duplicates:

  <!-- merge the results -->
  <merge inputvariables="$feed1, $feed2" outputvariable="$result"/>
  <!-- filter for unique items based on title -->
  <filter inputvariable="$result" outputvariable="$result"
    filterexpr="/rss/channel/item[not(preceding::title = ./title)]" />
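The behaviour of this filter expression can be checked with the standard Java XPath API. The sketch below is an illustration only: the class name DedupeSketch and the SAMPLE feed are assumptions, and the sample stands in for an already merged result. It evaluates the same expression and keeps the first item for each title.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DedupeSketch {
    // Hypothetical merged feed containing one duplicate title.
    static final String SAMPLE =
        "<rss><channel>"
      + "<item><title>Java tips</title></item>"
      + "<item><title>EMML intro</title></item>"
      + "<item><title>Java tips</title></item>"
      + "</channel></rss>";

    // Returns titles of items that have no earlier title with the same text.
    static List<String> uniqueTitles(String rss) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(rss)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            // Same expression as the <filter> statement in the post.
            NodeList items = (NodeList) xpath.evaluate(
                "/rss/channel/item[not(preceding::title = ./title)]",
                doc, XPathConstants.NODESET);

            List<String> titles = new ArrayList<>();
            for (int i = 0; i < items.getLength(); i++) {
                // Evaluate "title" relative to each surviving item.
                titles.add(xpath.evaluate("title", items.item(i)));
            }
            return titles;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        uniqueTitles(SAMPLE).forEach(System.out::println);
    }
}
```

Because the preceding axis covers every earlier title element in the document, not just the adjacent item, this particular expression drops later duplicates even when the items are not sorted. Note that it would also drop an item whose title equals an earlier channel-level title, if the feed has one.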

I shall cover some more techniques in the second part of this blog. The Advanced Mashup Techniques are detailed in http://www.openmashup.org/omadocs/v1.0/emml/advMashupIntro.html.
