Start Data Plumbing With Yahoo Pipes

As APIs and RSS feeds become more commonplace, we look for ways to mash them together. Even if you are a programmer, it can be a tedious process to write scripts that download and parse XML files.

Yahoo Pipes solves the problem by providing an easy, graphical way to interact with data. No command line or shell scripts necessary!

A Pipe is a collection of data sources to which we can apply operators, such as filtering, merging, and sorting. The output of a Pipe is usually an RSS feed, though Yahoo provides a number of other ways to get results, such as email or mobile phone text message.

Pipes are useful in that they can merge multiple sources into one, or take a firehose of information and filter out only the stuff that's important to you. They can also do much more, but we're going to keep things simple for this entree into data plumbing. So lower those jeans and let's get crackin'!

This article is a wiki. If you think you know more about plumbing than we do, don't be a wise guy. Shut your pie hole, log in and add it yourself.

What You'll Need

  • A Yahoo account (you can get one here)
  • Basic knowledge of RSS
  • Plumber's tape -- just kidding!

Your First Pipe: Filtering Data

Head on over to Yahoo Pipes and make sure you sign in.

For your first pipe, let's imagine you really like to read the Monkeybites blog. Unfortunately, you don't like to read anything we have to say about Google. Let's make a Pipe that takes the Monkeybites feed and excludes any items with Google in the title.

You can see the completed pipe here, but it will be easier to understand if you start from scratch. Go ahead and click "Create a pipe" and you'll be taken to a beautiful blank canvas.

Creating a pipe.

On the left, you'll see a menu with sources, user inputs, operators, etc. These are the pieces available to put together your pipes. We need to get data from somewhere, so let's click and drag a "Fetch Feed" source into the canvas.

Fetching the Feed.

Along with the Fetch Feed box, we now also have another box. The results of a pipe flow from the data source to this Pipe Output box. The stuff in between is where the magic happens.

In the Fetch Feed box, type the URL of the Monkeybites feed: http://www.webmonkey.com/rss/blog
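If you're curious what the Fetch Feed box is sparing you, here is a rough Python sketch of the "parse XML" chore mentioned at the top of this article. It is only an approximation of the idea, not how Pipes works internally. The `SAMPLE_RSS` text, its two items, and the `fetch_feed` name are all made up for illustration; a real script would download the feed body from the URL first.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the downloaded feed body (not real Monkeybites posts).
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Monkeybites</title>
    <item>
      <title>Google Releases a New Tool</title>
      <link>http://example.com/1</link>
      <pubDate>Tue, 05 Aug 2008 10:00:00 +0000</pubDate>
    </item>
    <item>
      <title>Five CSS Tricks</title>
      <link>http://example.com/2</link>
      <pubDate>Mon, 04 Aug 2008 09:00:00 +0000</pubDate>
    </item>
  </channel>
</rss>"""


def fetch_feed(rss_text):
    """Turn RSS text into a list of plain dicts, one per <item>."""
    root = ET.fromstring(rss_text)
    items = []
    for node in root.iter("item"):
        items.append({
            "title": node.findtext("title", default=""),
            "link": node.findtext("link", default=""),
            "pubDate": node.findtext("pubDate", default=""),
        })
    return items


items = fetch_feed(SAMPLE_RSS)
for item in items:
    print(item["title"])
```

Each dict plays the role of one feed item flowing down the pipe, with `item["title"]` standing in for the `item.title` field you'll meet in the Filter box.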

Next, go to the Operators menu, then drag a Filter operator onto the canvas between our other two boxes. Now it's time for some plumbing. Click and hold the circle at the bottom of the Fetch Feed box, then drag the mouse to the circle at the top of the Filter box. You should see a blue tail connecting the two boxes, which "pipes" the data from the feed to the filter.

Connecting a feed to the filter

Once we have connected the source to the filter, we choose which part of the feed we want to block. In this case, let's not show posts that have Google in the title. So, select item.title from the menu on the left of the Filter box, then fill in "Google" in the text box on the right. Finally, connect the bottom of the filter box to the top of the pipe output box, like so:

Connecting the filter to the pipe.

Wipe the grease off your hands and pat yourself on the back. You just completed your first pipe. Click save, then "Run Pipe" and you'll see a list of results, but no mentions of Google in the headlines.

Look Ma, no Google.

If instead you only want to see the articles about Google, you can "Edit Source" and change your filter to "Permit" instead of "Block."
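For the script-minded, the Block and Permit settings boil down to something like the Python sketch below. Treat it as an illustration under assumptions: the `filter_items` function, the `mode` argument, and the sample titles are invented here, and we assume a case-insensitive "contains" match, which may not be exactly what the Filter operator does.

```python
def filter_items(items, term, mode="block"):
    """Keep or drop items whose title contains `term`.

    mode="block" drops matching items; mode="permit" keeps only them.
    Assumes a case-insensitive substring match.
    """
    needle = term.lower()

    def matches(item):
        return needle in item["title"].lower()

    if mode == "block":
        return [item for item in items if not matches(item)]
    if mode == "permit":
        return [item for item in items if matches(item)]
    raise ValueError(f"unknown mode: {mode!r}")


# Hypothetical feed items, reduced to just a title.
items = [
    {"title": "Google Releases a New Tool"},
    {"title": "Five CSS Tricks"},
    {"title": "What's New in Firefox"},
]

blocked = filter_items(items, "Google")                   # no Google
permitted = filter_items(items, "Google", mode="permit")  # only Google

print([item["title"] for item in blocked])
print([item["title"] for item in permitted])
```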

Accepting User Input

It's possible you wouldn't want to filter on the term "Google." Maybe you want to filter on whatever you feel like that day without having to edit your source every time.

Pipes has several options to take user input. Under the appropriately-named User Inputs menu, drag a Text Input to the canvas. I like to position mine to the right of the fetch feed box.

Essentially what we are creating is a variable that we can plug into other areas of our pipe. Under name, I called mine "query," then for prompt I wrote "Term to block." For a default value, I entered "Google," the same term we were filtering before.

Now drag the circle at the bottom of the Text Input box to the tiny circle to the right of the field where we typed "Google" in the first example.

Attaching a variable.

When you connect the input to the filter box, the text goes away and is replaced by "text [wired]" to show that this text will come from user input.

Save and run the pipe. You'll see the same results as before. If you change the input and click the "Run Pipe" button, now you'll see different results, filtered for whatever you typed in the box. Check out my version of the pipe.
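In code terms, wiring a Text Input into the filter is like swapping a hard-coded string for a parameter with a default. The sketch below is one loose way to picture it in Python. The `TextInput` class, its `resolve` method, and `run_pipe` are hypothetical names for this example, not anything Pipes exposes, and the sample titles are invented.

```python
from dataclasses import dataclass


@dataclass
class TextInput:
    """Mirrors the three fields we filled in on the Text Input box."""
    name: str
    prompt: str
    default: str

    def resolve(self, supplied=None):
        # Fall back to the default when the user leaves the box empty.
        return supplied if supplied else self.default


def run_pipe(items, text_input, supplied=None):
    """Block items whose title contains the user's term (case-insensitive)."""
    term = text_input.resolve(supplied).lower()
    return [item for item in items if term not in item["title"].lower()]


query = TextInput(name="query", prompt="Term to block", default="Google")

# Hypothetical feed items.
items = [
    {"title": "Google Releases a New Tool"},
    {"title": "Five CSS Tricks"},
]

print(run_pipe(items, query))                  # default: blocks "Google"
print(run_pipe(items, query, supplied="CSS"))  # user typed "CSS" instead
```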

You have now created a basic filtered pipe and made it more friendly by accepting user input. In the next section we'll merge multiple data sources, and put it all together.

Merging Data Sources

Create a new pipe so we can start fresh. Drag a couple of "Fetch Feed" boxes onto the blank canvas. We'll be merging these two data sources together.

Here is the completed pipe, but don't look now. You'll spoil the surprise!

For the first source, let's continue to use the Monkeybites feed. We aren't tired of it yet! http://www.webmonkey.com/rss/blog

Our second feed will be from the Wired culture blog, The Underwire.

Under the operators menu, drag a Union box to the canvas. You'll notice this box has a whole bunch of inputs at the top and one output at the bottom. Plumbers, connect those pipes! Drag the circles at the bottom of each feed box to one of the circles at the top of the union box. Then connect the union box to the pipe output and save.

Merging pipes together.

When you run the pipe, you'll see that we now have content from both of the blogs in one place. The only problem is that the two feeds are stacked one after another, not intermingled. We have a sorting problem.
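A quick Python sketch shows why the results come out stacked. We're assuming here that a union behaves like plain concatenation, which matches what we just saw but is our guess at the behavior, not a documented guarantee. The `union` function and the sample items are hypothetical.

```python
from itertools import chain


def union(*feeds):
    """Concatenate feeds end to end, preserving each feed's own order."""
    return list(chain.from_iterable(feeds))


# Hypothetical items standing in for the two blogs.
monkeybites = [
    {"title": "Monkeybites post A", "pubDate": "Tue, 05 Aug 2008 10:00:00 +0000"},
    {"title": "Monkeybites post B", "pubDate": "Mon, 04 Aug 2008 09:00:00 +0000"},
]
underwire = [
    {"title": "Underwire post A", "pubDate": "Tue, 05 Aug 2008 08:30:00 +0000"},
]

merged = union(monkeybites, underwire)
for item in merged:
    print(item["pubDate"], item["title"])
```

The Underwire post lands last even though it is newer than the second Monkeybites post, which is exactly the stacking described above.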

Lucky for us, Yahoo Pipes makes it easy for data plumbers to solve their ordering issues. You may even have noticed the Sort operator. That's what we'll be using. But how do we insert the sort between the union box and the pipe output?

So far all we have done is connect pipes. Now we need to disconnect. Click on the circle at the top of the pipe output box. A scissors icon will appear above the box. Click it, and the connection disappears.

Cutting the connection

Now we can drag the sort operator to the canvas. Connect the union output to the sort input. The sort options box will populate with feed fields. We want to sort by the date, which is item.pubDate in our feed. Let's sort in descending order, with the newest stuff showing at the top of the feed, because that's how RSS feeds are usually ordered.

Connect the sort box to the pipe output and, as quickly as you can say unclogged toilet, we've solved the sorting issue. Save and run the pipe, and the two feeds are ordered by date.
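Here is roughly what that sort amounts to in Python, as a sketch only. RSS dates are strings like "Tue, 05 Aug 2008 10:00:00 +0000", so they need parsing before they can be compared; sorting the raw text would put "Mon" ahead of "Tue" regardless of the actual date. The `sort_by_pubdate` name and the sample items are made up, and we assume every item carries a well-formed pubDate with a time zone offset.

```python
from email.utils import parsedate_to_datetime


def sort_by_pubdate(items, descending=True):
    """Order items by their parsed pubDate, newest first by default."""
    return sorted(
        items,
        key=lambda item: parsedate_to_datetime(item["pubDate"]),
        reverse=descending,
    )


# Hypothetical merged items, still stacked feed after feed.
merged = [
    {"title": "Monkeybites post A", "pubDate": "Tue, 05 Aug 2008 10:00:00 +0000"},
    {"title": "Monkeybites post B", "pubDate": "Mon, 04 Aug 2008 09:00:00 +0000"},
    {"title": "Underwire post A", "pubDate": "Tue, 05 Aug 2008 08:30:00 +0000"},
]

ordered = sort_by_pubdate(merged)
for item in ordered:
    print(item["pubDate"], item["title"])
```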

Put it All Together: Filtering, Merging and Sorting, Oh My!

I know what you're saying now: what if we wanted to merge two feeds *and* filter out stories about Google? Ever the savvy plumber, you're one step ahead.

Starting from our merged and sorted pipe in the previous section, we want to add a filter that allows user input.

Drag a filter between the union box and the sort box and connect it to each of them. Then plop a text input box onto the canvas, give it some settings like we did earlier, and connect it to the filter's keyword input.

Here's the finished pipe, which should look something like this:

The finished pipe.
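If it helps to see the plumbing order written down, the finished pipe reads like the short Python function below: union first, then filter, then sort. This is a sketch under the same assumptions as before (concatenating union, case-insensitive title match, well-formed pubDate values). The `run_pipe` name, the `query` parameter, and the sample items are hypothetical.

```python
from email.utils import parsedate_to_datetime


def run_pipe(feeds, query="Google"):
    """Union -> Filter (block `query` in titles) -> Sort (newest first)."""
    merged = [item for feed in feeds for item in feed]
    needle = query.lower()
    kept = [item for item in merged if needle not in item["title"].lower()]
    return sorted(
        kept,
        key=lambda item: parsedate_to_datetime(item["pubDate"]),
        reverse=True,
    )


# Hypothetical items standing in for the two blogs.
monkeybites = [
    {"title": "Google Releases a New Tool", "pubDate": "Tue, 05 Aug 2008 10:00:00 +0000"},
    {"title": "Five CSS Tricks", "pubDate": "Mon, 04 Aug 2008 09:00:00 +0000"},
]
underwire = [
    {"title": "Comic Book Roundup", "pubDate": "Tue, 05 Aug 2008 08:30:00 +0000"},
]

result = run_pipe([monkeybites, underwire])
for item in result:
    print(item["pubDate"], item["title"])
```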

Ideas for your next Pipe

Yahoo lets you browse published Pipes, so you can view the source of any of them to figure out how it works. This can be a great way to learn more.

Here are some Pipes to check out, which may give you some inspiration to write your own:

  • YouTunes Top 100 uses an iTunes feed of popular music and searches YouTube for the corresponding music videos.
  • Photos near Wineries searches Yahoo Local for wineries, then queries Flickr for pictures taken near those locations.
  • Aggregated News Alerts lets you create a persistent search for a keyword across many different sites, weeding out duplicates.
  • Mini Lifestream takes your user information from various online tools to aggregate everything in one feed, your lifestream.

This page was last modified 22:22, 5 August 2008.