Create Automated Backups in Google Docs Using the GData API
As you're probably aware, Google Documents offers free online storage for a variety of common files. You can upload Microsoft Office documents, spreadsheets, text files and presentations, then have read/write access to them from anywhere. And even if you don't actually use Google Documents for editing or creating documents, it can serve as a handy backup for your desktop files.
Google hosts a set of APIs called GData that provide a simple way of reading and writing data on the web. The GData APIs are deliberately minimal, but they can be used to publish data to any Google web service that supports the GData format. Luckily, Google Docs is one of those services.
We're going to take Google's GData API for Google Docs and use it to automatically upload our local office docs into the online service, all without opening a browser window or clicking on any buttons.
If you want to go a little further, you can create your own web application that creates secure, timed backups of your work.
We'll start out by taking a look at the specific GData APIs that allow you to upload files from your local machine and store them in Google Documents, then attach those to a script to automate everything.
This article is a wiki. Got extra advice about using the Documents List Data API? Log in and add it.
Install the GData API
Because most of what we're going to do is shell-based, we'll be using the Python GData library. Here's Google's Python GData client.
If you're not a Python fan, there are a number of other client libraries available for interacting with Google Docs, including JavaScript, PHP, .NET, Java and Objective-C. Go ahead and pick from the list.
To get started, download the Python GData Client Library. Follow the instructions for installing the library as well as its dependencies (in this case, ElementTree, which is only necessary if you aren't running Python 2.5 or later).
Now, just to make sure you've got everything set up correctly, fire up a terminal window, start Python and try importing the modules we need:
>>> import gdata.docs
>>> import gdata.docs.service
Assuming those imports worked, you're ready to start working with the API.
Get Started
The first thing we need to get out of the way is what kinds of documents we can upload. There's a handy module-level constant we can import to get a complete list:
>>> from gdata.docs.service import SUPPORTED_FILETYPES
>>> SUPPORTED_FILETYPES
Running that command will reveal that these are our supported upload options:
- RTF: application/rtf
- PPT: application/vnd.ms-powerpoint
- DOC: application/msword
- HTM: text/html
- ODS: application/x-vnd.oasis.opendocument.spreadsheet
- ODT: application/vnd.oasis.opendocument.text
- TXT: text/plain
- PPS: application/vnd.ms-powerpoint
- HTML: text/html
- TAB: text/tab-separated-values
- SXW: application/vnd.sun.xml.writer
- TSV: text/tab-separated-values
- CSV: text/csv
- XLS: application/vnd.ms-excel
That's definitely not everything you might want to upload, but between the Microsoft Office formats and good old plain text files, you should be able to back up at least the majority of your files.
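Since the upload call needs a content type for each file, a small lookup helper can save some typing. The following is only a sketch: the table is copied from the list above, and the names CONTENT_TYPES and content_type_for are made up for this example rather than part of the GData library.

```python
import os

# Extension-to-MIME-type table copied from the list of supported uploads.
# CONTENT_TYPES and content_type_for are hypothetical names for this sketch.
CONTENT_TYPES = {
    'RTF': 'application/rtf',
    'PPT': 'application/vnd.ms-powerpoint',
    'DOC': 'application/msword',
    'HTM': 'text/html',
    'ODS': 'application/x-vnd.oasis.opendocument.spreadsheet',
    'ODT': 'application/vnd.oasis.opendocument.text',
    'TXT': 'text/plain',
    'PPS': 'application/vnd.ms-powerpoint',
    'HTML': 'text/html',
    'TAB': 'text/tab-separated-values',
    'SXW': 'application/vnd.sun.xml.writer',
    'TSV': 'text/tab-separated-values',
    'CSV': 'text/csv',
    'XLS': 'application/vnd.ms-excel',
}


def content_type_for(file_path):
    """Return the MIME type for file_path, or None if it isn't supported."""
    # Take the extension without the dot and match it case-insensitively.
    extension = os.path.splitext(file_path)[1].lstrip('.').upper()
    return CONTENT_TYPES.get(extension)
```

A file with an extension that isn't in the table comes back as None, so a backup script can simply skip it.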
Now let's take a look at authenticating with the GData API.
Authenticate with GData
Create a new file named gdata_uploader.py and save it somewhere on your Python Path. Now open it in your favorite text editor. Paste in this code:
from gdata.docs import service

def create_client():
    client = service.DocsService()
    client.email = 'yourname@gmail.com'
    client.password = 'password'
    client.ProgrammaticLogin()
    return client
All we've done here is create a wrapper function for easy logins. Now, any time we want to log in, we simply call create_client. To make your code a bit more robust, you can pull out those hardcoded email and password attributes and define them elsewhere.
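One way to keep the login out of the source file is to read it from environment variables. This is a minimal sketch, not the only option: the function name load_credentials and the variable names GDATA_EMAIL and GDATA_PASSWORD are assumptions chosen for this example.

```python
import os


def load_credentials(environ=os.environ):
    """Read the login from two environment variables.

    GDATA_EMAIL and GDATA_PASSWORD are hypothetical names picked for
    this sketch; use whatever fits your setup.
    """
    email = environ.get('GDATA_EMAIL')
    password = environ.get('GDATA_PASSWORD')
    if not email or not password:
        raise ValueError('Set GDATA_EMAIL and GDATA_PASSWORD before running')
    return email, password

# Inside create_client(), the two hardcoded assignments could then become:
#     client.email, client.password = load_credentials()
```

Failing early with a clear message is handy when the script runs unattended, because a missing variable shows up immediately rather than as a failed login.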
Upload a document
Now we need to add a function that will actually upload a document. Just below the code we created above, paste in this function:
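def upload_file(file_path, content_type, title=None):
    import gdata
    # Wrap the local file so the client can send it as the upload body.
    ms = gdata.MediaSource(file_path=file_path, content_type=content_type)
    client = create_client()
    entry = client.UploadDocument(ms, title)
    # Print the URL where the uploaded document can be viewed.
    print('Link: %s' % entry.GetAlternateLink().href)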
Now let's play with this stuff in the shell:
>>> import gdata_uploader
>>> gdata_uploader.upload_file('path/to/file.txt', 'text/plain', 'Testing gData File Upload')
Link: http://docs.google.com/Doc?id=<random string of numbers>
>>>
Note that our upload_file function takes an optional "title" parameter. If you import Python's datetime module and include the date in the title string, it's easy to make incremental backups such as myfile-082908.txt, myfile-083008.txt and so on.
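A dated title in that style can be built with a few lines. This is a sketch under the assumption that the month-day-year pattern shown above is what you want; dated_title is a hypothetical helper name, not part of the GData library.

```python
import datetime
import os


def dated_title(file_path, day=None):
    """Build a title such as 'myfile-082908.txt' from a path and a date."""
    if day is None:
        day = datetime.date.today()
    # Split 'path/to/myfile.txt' into the base name and its extension.
    name, extension = os.path.splitext(os.path.basename(file_path))
    # %m%d%y gives a two-digit month, day and year, e.g. 082908.
    return '%s-%s%s' % (name, day.strftime('%m%d%y'), extension)
```

The result can be passed straight in as the third argument of upload_file.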
Where to go from here
To automate our backup process, you could call the upload_file function from a cron job. For instance, I use:
0 21 * * * python path/to/backup_docs.py 2>&1
In this case, backup_docs.py is just a three-line file that imports our functions from gdata_uploader.py, uses Python's os module to grab a list of the files I want backed up, and calls the upload_file function on each one.
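The original backup_docs.py isn't shown here, so the following is only a guess at what a slightly longer version might look like. The function name backup_directory and its parameters are assumptions for this sketch; the upload argument is any callable that takes the same arguments as upload_file.

```python
import os


def backup_directory(directory, content_types, upload):
    """Call upload(path, content_type, title) for each supported file.

    content_types maps lowercase extensions (such as '.txt') to MIME types.
    upload is any callable with the same arguments as upload_file.
    Returns the list of paths that were handed to upload.
    """
    uploaded = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        extension = os.path.splitext(name)[1].lower()
        # Skip subdirectories and anything with an unsupported extension.
        if os.path.isfile(path) and extension in content_types:
            upload(path, content_types[extension], name)
            uploaded.append(path)
    return uploaded
```

In backup_docs.py you would import upload_file from gdata_uploader and call something like backup_directory('path/to/docs', {'.txt': 'text/plain'}, upload_file). Passing the upload function in as an argument also makes it easy to try the script with a stand-in function before sending anything to Google.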
While the automated script is a nice extra backup, unfortunately the GData Documents API is somewhat limited. For instance, it would be nice if we could automatically move our document to a specific folder, but that currently isn't possible.
There are some read functions available, though. Have a look through the official docs, and if you come up with a cool way to use the API, be sure to add it to this page.
