Build a Microblog with Django

Thus far in our introductory Django tutorial, we’ve installed the open-source Django framework, set up a blog and beefed it up by adding some extras like semantic content tags, some handy template tags and a list of our bookmarks from delicious.com. If you haven’t been following along, now would be a good time to go back to Lesson 1 and catch up.

However, what we’ve created is not much different from what one could do with WordPress or another out-of-the-box blogging tool. That’s OK for a learning project. But now that we’re getting close to being experts, we are going to explore some territory beyond what we can do with pre-built tools.

Let’s build something a little more advanced. Let’s build a microblog.

Contents

  1. Let’s get ready to Tumble
  2. Laying the groundwork
  3. How it works
  4. Listen for the Signals
  5. Writing the URLs and views
  6. Where do we go from here?
  7. RSS feeds
  8. Conclusion

Let’s get ready to Tumble

These days, many of us follow our friends on services like FriendFeed or Facebook, where the concept of a “blog post” is much looser, smaller and faster.

Whether you want to call it a newsfeed, lifestream, microblog, tumblelog or a dozen other clever portmanteaus or neologisms, the concept is the same: a site to log snippets of your life and its discoveries in one place.

FriendFeed is a favorite. The service combines tweets from Twitter, photos from Flickr, links from Delicious, updates from Facebook and other sundry data and displays them all on one page. The page itself also has an RSS feed, giving you (or any of your many fans) a way to follow your every move.

Let’s build a version of FriendFeed using Django. We’ll build a page displaying both our links and blog posts together in a single, time-based “river.” As a bonus, we’ll add in some RSS sugar so our friends can follow along without having to visit the site.

Laying the groundwork

Before we get started, let’s think for a bit about what needs to be done. We will post data using two separate models: the blog Entry model and the Link model. We need to query them both at the same time and then sort the feed into reverse chronological order.

The key to making it work is creating a way to normalize our data. The database tables for these two models are significantly different. Instead of querying both models, wouldn’t it be better if we just queried against one table?
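To make the idea concrete, here is a minimal sketch in plain Python with no Django involved. The dictionaries are hypothetical stand-ins for rows from the two tables, and build_index is an illustrative name, not part of the tutorial code. It shows how two differently shaped records can be reduced to one sortable list of (date, kind, id) rows:

```python
from datetime import datetime

# Hypothetical stand-ins for rows from the two tables.
# Note that the date lives in a differently named field on each.
entries = [
    {"id": 1, "title": "Hello world", "pub_date": datetime(2009, 1, 5, 9, 0)},
    {"id": 2, "title": "Second post", "pub_date": datetime(2009, 1, 7, 12, 30)},
]
links = [
    {"id": 1, "url": "http://example.com/", "date": datetime(2009, 1, 6, 18, 15)},
]


def build_index(entries, links):
    """Return one list of (pub_date, kind, object_id) rows, newest first."""
    rows = [(e["pub_date"], "entry", e["id"]) for e in entries]
    rows += [(l["date"], "link", l["id"]) for l in links]
    return sorted(rows, key=lambda row: row[0], reverse=True)


index = build_index(entries, links)
for pub_date, kind, object_id in index:
    print(pub_date.isoformat(), kind, object_id)
```

The single list is all a “river” page needs to sort on. The details of each item can be fetched later from whichever table it belongs to.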

Let’s do it.

How it works

There’s one thing all models will have in common: a publication date. What we’re going to do is create a kind of meta model which stores the date as well as what kind of data it is (e.g. blog entry or link).

Django ships with two tools which actually make it very easy to build our meta model. The first is the content types framework. As the docs explain, “instances of ContentType represent and store information about the models installed in your project, and new instances of ContentType are automatically created whenever new models are installed.”

Huh? Basically, what the Django Project is telling us is that just by referencing the content type, we will always know what type of content we’re storing. Storing the content type is half the battle.

The other half of the battle will be won by using a built-in field known as a GenericForeignKey. You may have noticed a ForeignKey field in the Django docs. ForeignKey is similar to what we’re going to use. Where ForeignKey refers to a single model, our GenericForeignKey can refer to any model by referencing the content_type instead.

If that sounds confusing, don’t worry. It’ll make more sense when you see it in action.
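Before we get to the real thing, here is a rough sketch of the concept in plain Python. This is not how Django implements it. TABLES and GenericRef are made-up names for illustration. The point is only that a table name plus an id is enough to find any object:

```python
# Hypothetical in-memory "tables", keyed by model name and then by primary key.
TABLES = {
    "entry": {1: {"title": "Hello world"}},
    "link": {1: {"url": "http://example.com/"}},
}


class GenericRef:
    """Points at any row by storing which table it lives in plus its id."""

    def __init__(self, content_type, object_id):
        self.content_type = content_type
        self.object_id = object_id

    @property
    def content_object(self):
        # Resolve the reference only when asked, using both stored values.
        return TABLES[self.content_type][self.object_id]


refs = [GenericRef("entry", 1), GenericRef("link", 1)]
for ref in refs:
    print(ref.content_type, ref.content_object)
```

Both references share the same object_id, yet each resolves to a different object because the content type tells us where to look.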

Let’s start writing some code. I’m going to store our meta app in a folder named “tumblelog” so create the folder and create a new models.py file inside it (don’t forget the __init__.py file as well, Python’s never-ending “gotcha”). Open up models.py in your text editor and paste in this code:

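from django.db import models
from django.contrib.contenttypes import generic
from django.contrib.contenttypes.models import ContentType

class TumbleItem(models.Model):
    # The model this item points at (Entry, Link, ...)
    content_type = models.ForeignKey(ContentType)
    # Primary key of the object within that model's table
    object_id = models.PositiveIntegerField()
    # Normalized publication date, used for sorting
    pub_date = models.DateTimeField()
    # Combines the two fields above into a reference to the actual object
    content_object = generic.GenericForeignKey('content_type', 'object_id')

    class Meta:
        ordering = ('-pub_date',)

    def __unicode__(self):
        return self.content_type.name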

If you’re familiar with our previous tutorials (or Django in general), by now this should look somewhat familiar. We have a regular ForeignKey to the ContentType as we discussed above. Then we have an id field, which we need to pass to the GenericForeignKey field (along with content_type). Finally we add a datetime field for easy sorting. If you look at the GenericForeignKey documentation you’ll notice this is pretty much the same code used to demonstrate GenericForeignKeys. The only difference is we’ve added a date field.

Now we have a basic model, but we certainly don’t want to update it by hand every time we post something, especially since our delicious.com import script runs automatically.

Well, it turns out there’s a very powerful feature baked into Django which can handle the task for us.

Django includes an internal “dispatcher” which allows objects to listen for signals from other objects. In our case, our tumblelog app is going to “listen” to our Entry and Link models. Every time a new Entry or Link is saved, those models will send out a signal. When the tumblelog app gets the signal it will automatically update itself.
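If the idea of a dispatcher is new to you, the following toy version in plain Python may help. It is only a sketch of the pattern and not Django’s implementation. The Signal class, the on_save receiver and the empty Entry and Comment classes are all invented for this illustration:

```python
class Signal:
    """A tiny stand-in for a signal: receivers register, senders notify."""

    def __init__(self):
        self._receivers = []

    def connect(self, receiver, sender=None):
        self._receivers.append((receiver, sender))

    def send(self, sender, **kwargs):
        for receiver, wanted in self._receivers:
            # A receiver registered with sender=None hears every sender.
            if wanted is None or wanted is sender:
                receiver(sender=sender, **kwargs)


class Entry:
    pass


class Comment:
    pass


post_save = Signal()  # toy signal, named after the one we are about to use
heard = []


def on_save(sender, instance, **kwargs):
    heard.append((sender.__name__, instance))


post_save.connect(on_save, sender=Entry)
post_save.send(sender=Entry, instance="first entry")
post_save.send(sender=Comment, instance="ignored")
print(heard)
```

The receiver only hears from the sender it registered for, which is the same behavior we will rely on when we listen to Entry and Link.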

Sweet. How do we do it?

Listen for the Signals

The first thing we need to do is import the signals framework. Add this to the top of your models.py file:

from django.db.models import signals
from django.contrib.contenttypes.models import ContentType
from django.dispatch import dispatcher
from blog.models import Entry
from links.models import Link

OK, now move to the bottom of the file and add these lines:

for modelname in [Entry, Link]:
    dispatcher.connect(create_tumble_item, signal=signals.post_save, sender=modelname)

In Django 1.0 and later, the signals API changed. If you’re on one of those versions, change the imports to:

from django.db.models import signals
from django.contrib.contenttypes.models import ContentType
from django.db.models.signals import post_save
from blog.models import Entry
from links.models import Link

and change the dispatcher lines to:

post_save.connect(create_tumble_item, sender=Entry)
post_save.connect(create_tumble_item, sender=Link)

So what’s it do? Well, it tells the dispatcher to listen for the post_save signal from our Entry and Link models, and whenever it gets the signal, it will fire off our create_tumble_item function.

But wait, we don’t have a create_tumble_item function do we?

No, so we’d better write one. Just above the dispatcher lines, add this code:

def create_tumble_item(sender, instance, signal, *args, **kwargs):
    if 'created' in kwargs:
        if kwargs['created']:
            create = True
            ctype = ContentType.objects.get_for_model(instance)
            if ctype.name == 'link':
                pub_date = instance.date
            else:
                pub_date = instance.pub_date
            if create:
                t = TumbleItem.objects.get_or_create(content_type=ctype, object_id=instance.id, pub_date=pub_date)

What’s this function doing? For the most part it’s just taking the data sent by the dispatcher and using it to create a new TumbleItem.

The first line looks to see if a variable, created, has been passed by the dispatcher. The post_save signal is sent every time an object is saved, so it’s possible we’ve just updated an existing item rather than creating a new one. In that case, we don’t want to create a new TumbleItem, so the function checks to make sure this is, in fact, a new object.

If the dispatcher has passed the variable created, and it’s true, then we set our own create flag and get the content_type of the passed instance. This way, we know whether it’s a Link or an Entry.

If it’s a Link we need to find the datetime info. When we built our Link model, we called the field date, so we set our normalizing pub_date field equal to the instance.date field.

If the instance is a blog Entry, we set pub_date to the value of the instance’s pub_date field, since it shares the name we used in our Entry model.

Then we create a new TumbleItem using the built-in get_or_create method, passing in the content_type, id and pub_date values for their respective fields.
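If you’d like to see that logic run without a database, here is a simplified sketch in plain Python. FakeLink, FakeEntry and the tumble_items list are hypothetical stand-ins for the real models and table, and the membership check only imitates what get_or_create does for us:

```python
from datetime import datetime

tumble_items = []  # stand-in for the TumbleItem table


class FakeLink:
    def __init__(self, id, date):
        self.id, self.date = id, date


class FakeEntry:
    def __init__(self, id, pub_date):
        self.id, self.pub_date = id, pub_date


def create_tumble_item(sender, instance, **kwargs):
    # Skip plain updates: only a first save should add a row.
    if not kwargs.get("created"):
        return
    # Link stores its timestamp in `date`; Entry uses `pub_date`.
    pub_date = instance.date if sender is FakeLink else instance.pub_date
    row = (sender.__name__, instance.id, pub_date)
    if row not in tumble_items:  # imitate get_or_create: never duplicate
        tumble_items.append(row)


link = FakeLink(1, datetime(2009, 1, 6))
entry = FakeEntry(1, datetime(2009, 1, 7))
create_tumble_item(FakeLink, link, created=True)
create_tumble_item(FakeEntry, entry, created=True)
create_tumble_item(FakeEntry, entry, created=False)  # an edit, so ignored
print(tumble_items)
```

Three calls produce only two rows, because the third one represents an edit to an existing entry rather than a new one.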

And there you have it. Any time you create a blog entry or the script adds a new link, our new TumbleItem model will automatically update.

There’s a slight problem though. What about the data we’ve already got in there?

Well to make sure it gets added, we’re going to have to comment out the following lines:

    if 'created' in kwargs:
        if kwargs['created']:

Comment them out and save the file. Then fire up the Django shell (python manage.py shell) in your terminal and enter this code one line at a time:

>>> from blog.models import Entry
>>> from links.models import Link
>>> entries = Entry.objects.all()
>>> for entry in entries:
...     entry.save()
...
>>> links = Link.objects.all()
>>> for link in links:
...     link.save()
...
>>>

What did it do? We just called the save method on each object, and the dispatcher passed its signal along to the TumbleItem model, which then updated itself. Because we commented out the lines that check to see if an item is new, the function ran regardless of the instance’s created variable.

Now uncomment those lines and save the file.

Before we move on I should point out post_save isn’t the only signal Django can send. There are in fact a whole bunch of useful signals like request_started, pre_save and many more. Check out the Django wiki for more details.


Writing the URLs and views

Now we need to add a tumblelog URL and pull out the data. Open up your project level urls.py file and add this line:

(r'^tumblelog/', 'tumblelog.views.tumbler'),

Now create a new views.py file inside your tumblelog folder and open it up to paste in this code:

from tumblelog.models import TumbleItem
from django.shortcuts import render_to_response

def tumbler(request):
    context = {
        'tumble_item_list': TumbleItem.objects.all().order_by('-pub_date')
    }

    return render_to_response('tumblelog/list.html', context)

What’s going on here? Well, first we grab the tumblelog items, and then we use a Django shortcut function to pass the list on to a template named list.html.

Let’s create the template. Add a new folder “tumblelog” inside your templates folder and create the list.html file. The template should extend the base template from the earlier lesson (put your {% extends %} tag at the very top) and fill in the blocks we created there:

{% block title %}My Tumblelog{% endblock %}
{% block primary %}
{% for object in tumble_item_list %}
    {% ifequal object.content_type.name 'link' %}
        <a href="{{ object.content_object.url }}" title="{{ object.content_object.title }}">{{ object.content_object.title }}</a> {{ object.content_object.description }} Posted on {{ object.pub_date|date:"D d M Y" }}
    {% endifequal %}

    {% ifequal object.content_type.name 'entry' %}
        <h2>{{ object.content_object.title }}</h2>

        <p>{{ object.pub_date }}</p>
        {{ object.content_object.body_html|truncatewords_html:"20"|safe }}
        <p><a href="{{ object.content_object.get_absolute_url }}">more</a></p>
    {% endifequal %}
{% endfor %}
{% endblock %}

If you start up the development server and head to http://127.0.0.1:8000/tumblelog/ you should see your blog entries and links ordered by date.

Where do we go from here?

We’re looking pretty good up until we get to the template stage. The if statements aren’t so bad when we’re only sorting two content types, but if you start pulling in tons of different types of data you’ll quickly end up with spaghetti code.

A far better idea would be to write a template tag which takes the content type as an argument, uses it to pick a template file, renders it and passes the result back as an HTML string.

We’ll leave it up to you as an exercise (hint: read up on Django’s render_to_string method).
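To give you a feel for the shape of the solution without spoiling the exercise, here is a plain Python sketch of the dispatch idea. It uses the standard library’s string.Template rather than Django templates, and SNIPPETS and render_item are hypothetical names. In a real tag, each snippet would be a template file:

```python
from string import Template

# Hypothetical per-type snippets; in a real project these would be template files.
SNIPPETS = {
    "link": Template('<a href="$url">$title</a>'),
    "entry": Template("<h2>$title</h2>"),
}


def render_item(content_type, data):
    """Pick the snippet for this content type and render it to a string."""
    snippet = SNIPPETS.get(content_type)
    if snippet is None:
        return ""  # unknown types render nothing rather than raising
    return snippet.substitute(data)


print(render_item("link", {"url": "http://example.com/", "title": "Example"}))
print(render_item("entry", {"title": "Hello world"}))
```

Adding a new content type then means adding one more snippet instead of one more if block in the list template.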

The other thing you might be wondering about is pagination. If you save dozens of links a day and blog like a madman, this page is going to get really large really fast. Django ships with some built-in pagination tools (see the docs) and there’s also a slick django-pagination app available from Eric Florenzano. Eric even has a very nice screencast on how to set things up.
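Whichever tool you pick, the underlying idea is simple slicing. Here is a minimal sketch in plain Python. The paginate function is an illustrative helper and not Django’s pagination API:

```python
def paginate(items, page, per_page=10):
    """Return (items_on_page, total_pages) for a 1-based page number."""
    total_pages = max(1, -(-len(items) // per_page))  # ceiling division
    page = min(max(page, 1), total_pages)  # clamp out-of-range requests
    start = (page - 1) * per_page
    return items[start:start + per_page], total_pages


river = list(range(1, 26))  # 25 hypothetical tumblelog items
page_items, total = paginate(river, page=3)
print(page_items, total)
```

Clamping the page number is a design choice made for this sketch. A real site might prefer to return a 404 for a page that does not exist.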


RSS feeds

We’re almost done. Let’s hook up some RSS feeds for our new site.

It turns out, Django ships with a very nice syndication framework. All we need to do is turn it on and hook it up.

Let’s start by going into our tumblelog models.py file and setting up the framework. Paste in this code at the bottom of the file:

from django.contrib.syndication.feeds import Feed

class LatestItems(Feed):
    title = "My Tumblelog: Links"
    link = "/tumblelog/"
    description = "Latest Items posted to mysite.com"
    description_template = 'feeds/description.html'

    def items(self):
        # Newest ten items across all content types
        return TumbleItem.objects.all().order_by('-pub_date')[:10]

Here we’ve imported the syndication class Feed and then defined our own feed class to fill in some basic data. The last step is returning the items using a normal objects query.

Next we need to create the URLs. There are several places you could do this; I generally opt for the main project urls.py file. Open it up and paste this code just below the other import statements:

from tumblelog.models import LatestItems

feeds = {
    'tumblelog': LatestItems
}

Now add this line to your urlpatterns:

(r'^feeds/(?P<url>.*)/$', 'django.contrib.syndication.views.feed', {'feed_dict': feeds}),

The last step is creating the description template in a new folder we’ll label “feeds.” Just plug in the same code from the tumblelog template, but delete the for loop tags since we’re only passing a single object this time.

And there you have it. Fire up the development server and point it to http://127.0.0.1:8000/feeds/tumblelog/. If you want to add separate feeds for links or just blog entries, all you need to do is repeat the process, creating new feed classes for each.
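If you’re curious what a feed boils down to, here is a rough sketch that builds a bare-bones RSS 2.0 document with Python’s standard library. This is not what the syndication framework does internally. build_rss and the example.com URLs are made up for illustration, and a real feed carries more fields than this:

```python
import xml.etree.ElementTree as ET


def build_rss(title, link, description, items):
    """Build a minimal RSS 2.0 document from (title, link) pairs."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for item_title, item_link in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = item_title
        ET.SubElement(item, "link").text = item_link
    return ET.tostring(rss, encoding="unicode")


xml = build_rss(
    "My Tumblelog",
    "http://example.com/tumblelog/",
    "Latest items",
    [("Hello world", "http://example.com/blog/hello-world/")],
)
print(xml)
```

The title, link and description on our LatestItems class fill the same channel-level slots, and each object returned by items() becomes one item element.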

Conclusion

Whew! It’s been a long journey and we wrote a lot of code, but hopefully you learned a good bit about how Django works. You not only have a blog/linklog/tumblelog, but hopefully some ideas of how to approach other projects.

Speaking of projects, although we used one, there’s really no need to organize your code this way. Sometimes projects come in handy, but other times they turn your PYTHONPATH into a snarled mess. For some other ideas on how to organize your code, check out James Bennett’s article on the subject.

If you ever have questions about Django, jump on the #Django IRC channel, where there are loads of friendly and helpful developers who can point you in the right direction, and be sure to join the mailing list. Happy coding!