Blogs Media Lab Development

Honeybees and Confetti Drops: Having Fun with Web Design

We’re a serious bunch at the Walker Art Center, except when we aren’t. Cat breaks have made their way into Art News from Elsewhere, and we’ve tucked in a few Easter eggs for fans of these hidden amusements. Our new site includes a confetti drop that appears when you click on Parties & Special Events in the calendar. And for those who find their way to a place that they shouldn’t, there’s a custom 404 page. God forbid there’s a server crash, we’ll send you to a page featuring Charles Ray’s Unpainted Sculpture.

Last week Eric added accumulating bees to the Lifelike exhibition page. The longer you stay on the page, the larger the swarm.


For those of you hoping to attract a few bees of your own, here’s Eric’s script.

Continuous Deployment with Fabric

We have been using Fabric to deploy changes to walkerart.org. Fabric is a Python library that enables a string of commands to be run on multiple servers. Though similar things could be done with shell scripts, we enjoy staying in one language as much as possible. In Fabric, command strings are composed and sent to remote servers over an SSH connection. Our Fabric scripts have evolved alongside the project, following one rule of thumb: “If you know you are going to be doing something more than twice, script it!”

With Fabric we can tailor our deployments precisely. We deploy often with one of two commands:
[cci]fab production simple_deploy[/cci] or [cci]fab production deploy[/cci].
[cci]simple_deploy[/cci] simply pulls new code from the repo and restarts the web server.
[cci]deploy[/cci] does many things, each of which can be executed independently, and is explained below.
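
For a sense of scale, [cci]simple_deploy[/cci] amounts to just a couple of Fabric operations. A minimal sketch (assuming service helpers like the ones shown later in this post):

[cc lang="python"]
from fabric.api import cd, env, run

def simple_deploy():
    # pull the latest code on the remote server...
    with cd(env.project):
        run('git pull')
    # ...and restart the web server
    gunicorn('restart')
[/cc]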

The scripts we run go both ways: code goes up to the server, and data comes back to the workstation. We have [cci]fab sync_with_production[/cci], which pulls the database and images. The images arrive locally in a directory specified by an environment variable, or in a default directory. Conventional naming schemes, such as for the database name, keep settings consistent across systems. Except for some development settings, our workstation environments are identical to the production environment, which means we can replicate a bug or feature locally, immediately.
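
A sketch of the shape of such a task, assuming a Postgres database and rsync-able media (the helper names and paths here are illustrative, not our actual fabfile):

[cc lang="python"]
import os
from fabric.api import env, get, local, run

def sync_with_production():
    # dump the production database and pull the dump down to the workstation
    run('pg_dump %s > /tmp/%s.sql' % (env.db_name, env.db_name))
    get('/tmp/%s.sql' % env.db_name, '/tmp/')
    local('psql %s < /tmp/%s.sql' % (env.db_name, env.db_name))

    # rsync the uploaded images into a local directory, overridable by env var
    image_dir = os.environ.get('WALKER_IMAGE_DIR', 'media/images')
    local('rsync -az %s:%s/media/images/ %s' % (env.hosts[0], env.project, image_dir))
[/cc]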

We have been collecting all of the commands we normally run on the servers into our fabfile, and we group them by calling tasks from other tasks. Our deployment consists of 12 tasks, so one can deploy to the production or staging server with a single command:
[cci]fab production deploy[/cci].

This makes it incredibly simple to put code written on developer workstations into production in as safe and secure a way as we can manage. Here is our deployment in Fabric:

[cc lang="python"]
def deploy():
    with cd(env.project):
        run('git pull')
        get_settings()
        install_requirements()
        production_templates()
        celerybeat('stop')
        celeryd('stop')
        synccompress()
        migrate()
        gunicorn('restart')
        celeryd('start')
        celerybeat('start')
[/cc]
First, the “with” block puts us into the right directory on the remote server (and, through our environment setup, into Python’s virtual environment). From there, “git pull” fetches the new code, which contains the settings files, and “get_settings” copies any new settings into place. The task called “install_requirements” calls on pip to validate our virtual environment’s packages against the requirements file. All third-party packages are locked to a version so we aren’t surprised by new “features” that have adverse effects. We use celery to harvest data from other sites, so we stop and restart the workers to make sure they are running with fresh config files. The task “synccompress” does our compressing of CSS and JS, “migrate” alters the database per our migration files, and gunicorn is the program that is running django.
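
Most of those tasks are thin wrappers around the shell commands we would otherwise type by hand. As a rough sketch (the init-script paths are assumptions, not our actual setup), the service helpers might look like this:

[cc lang="python"]
def celeryd(command):
    # pass 'start', 'stop', or 'restart' straight through to the init script;
    # depending on permissions this may need sudo() instead of run()
    run('/etc/init.d/celeryd %s' % command)

def celerybeat(command):
    run('/etc/init.d/celerybeat %s' % command)

def gunicorn(command):
    run('/etc/init.d/gunicorn %s' % command)
[/cc]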

It takes about 60 seconds for a new version of the website to get into production. From there it takes 0-10 minutes for the memcached values to expire before the public changes are visible. We are deploying continuously so watch closely for updates!

Optimizing page load time

We launched the new walkerart.org late on December 1, and it’s been a great ride. The month leading up to launch (and especially the final week, starting Thanksgiving Day, when I was physically moving servers and virtualizing old machines) was incredibly intense and really brought out the best in our awesome team. I would be remiss if I didn’t start this post by thanking Eric & Chris for their long hours and commitment to the site, Robin for guiding when needed and deflecting everything else so we could do what we do, and Andrew and Emmet for whispering into Eric’s ear and steering the front-end towards the visual delight we ended up with. And obviously Paul and everyone writing for the site, because without content it’s all just bling and glitz.

Gushy thanks out of the way, the launch gave us a chance to notice the site was a little … slow. Ok, a lot, depending on your device and connection, etc. Not the universally fast experience we were hoping for. The previous Walker site packed all the overhead into the page rendering, so with the HTML cached the rest would load in under a second, easy. The new site is heavy even if the HTML is cached. Just plain old heavy: custom fonts, tons of images popping and rotating, javascript widgets willy-nilly, third-party API calls…

Here’s the dirty truth of the homepage when we kicked it out the door December 1:

12/1: 2.6 MB over 164 requests. Load times are pretty subjective depending on a lot of things, but we had good evidence of the page taking at least 4+ seconds from click to being usable — and MUCH longer in some cases. Everyone was clearly willing to cut us some slack with a shiny new site, but once the honeymoon is over we need to be usable every day — and that means fast. This issue pretty quickly moved to the top of our priority list the Monday after launch, December 5.

The first thing to tackle was the size: 2.6 MB is just way too big. Eric noticed our default image scaling routine was somehow not compressing jpgs (I know, duh), so that was an easy first step and made a huge difference in download size.

12/5: 1.9 MB.

On the 6th we discovered (again, duh) lossless jpeg and png compression and immediately applied it to all the static assets on the site, but not yet to the dynamically-generated versions. Down to 1.8 MB. We also set up a fake Content Delivery Network (CDN) to help “parallelize” our image downloads. Modern browsers allow six simultaneous connections to a single domain, so by hosting all our images at www.walkerart.org we were essentially trying to send all our content through one tiny straw. Chris was able to modify our image generator code to spread requests across three new domains: cdn0.walkerart.org, cdn1, etc. This lacks the geographic reach and fat pipes of a real CDN, but it does give the end user a few more straws to suck content through.

Requests per Domain

www.walkerart.org 26
cdn1.walkerart.org 24
cdn0.walkerart.org 24
cdn2.walkerart.org 21
f.fontdeck.com 4
other 7
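
The one requirement when sharding like this is that a given image must always map to the same domain; if the domain were chosen at random per request, browsers would keep re-downloading “new” copies of images they already had cached. A sketch of the idea (not our actual generator code):

[python]
import zlib

CDN_HOSTS = ['cdn0.walkerart.org', 'cdn1.walkerart.org', 'cdn2.walkerart.org']

def cdn_url(path):
    # hash the path so the same image always resolves to the same domain,
    # keeping it cacheable while spreading requests across hosts
    host = CDN_HOSTS[zlib.crc32(path) % len(CDN_HOSTS)]
    return 'http://%s/%s' % (host, path.lstrip('/'))
[/python]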

By the 8th we were ready to push out global image optimization and blow away the cache of too-big images we’d generated. I’m kind of astounded I’d never done this on previous sites, considering what an easy change it was and what a difference it made. We’re using jpegoptim and optipng, and it’s fantastic: probably 30% lossless saving on already compressed jpegs and pngs. No-brainer.

12/8: 1.4 MB, almost half of what we launched with.
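
The batch pass itself is easy to script. Something along these lines, assuming jpegoptim and optipng are installed and on the PATH (a sketch, not our actual job):

[python]
import os, subprocess

def optimize(root):
    # walk the static tree and losslessly recompress every jpg and png in place
    for dirpath, dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if name.lower().endswith(('.jpg', '.jpeg')):
                # --strip-all drops metadata; without a max-quality flag this stays lossless
                subprocess.call(['jpegoptim', '--strip-all', path])
            elif name.lower().endswith('.png'):
                subprocess.call(['optipng', '-o2', path])
[/python]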

Next we needed to reduce the number of requests. We pushed into the second weekend with a big effort to optimize the Javascript and CSS. Earlier attempts using minify had blown up and were abandoned. Eric and Chris really stepped up to find a load order that worked and a safe way to combine and compress files without corrupting the contents. Most of the work was done Friday, but we opted to wait for Monday to push it out.

Meanwhile, I spent the weekend pulling work from the client’s browser back to the server, where we could cache it site-wide. This doesn’t really impact bytes transferred, but it does remove a remote API call, which could take anywhere from a fraction of a second (unnoticeable) to several seconds in a worst-case scenario (un-usable). This primarily meant writing code to regularly call and cache all of our various Twitter feeds and the main weather widget. These are now served in the cached HTML and add negligible load time, instead of 200+ ms on average. It all adds up!
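
The pattern is simple: fetch on the server, cache site-wide, and render from the cache. A sketch of the Twitter half, assuming Django’s cache framework (the URL, cache key, and timeout are illustrative, not our production code):

[python]
import json, urllib2

from django.core.cache import cache

TWITTER_URL = 'http://api.twitter.com/1/statuses/user_timeline.json?screen_name=walkerartcenter'

def recent_tweets():
    # serve tweets from the site-wide cache; the remote fetch happens at most
    # once per timeout rather than once per visitor
    tweets = cache.get('recent_tweets')
    if tweets is None:
        tweets = json.load(urllib2.urlopen(TWITTER_URL, timeout=5))
        cache.set('recent_tweets', tweets, 300)  # refresh at most every 5 minutes
    return tweets
[/python]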

CSS Sprite for Header and Footer nav images (it has a transparent background, so it’s supposed to look like that):

So Monday, 12/12, we pushed out our first big changes to address the number of requests. Eric had combined most of the static pngs into a CSS sprite, the javascript and CSS were reduced to fewer files, and the third-party APIs were no longer called in the browser. Really getting there, now.

12/12: 1.37 MB, and 125 requests

Happily (as I was writing this) Eric just pushed out the last (for now) CSS sprite, giving us these final numbers:

12/13: 1.37 MB, and 110 requests! (down to 53% and 67% of our launch numbers, respectively)

This isn’t over, but it’s gotten to the point of markedly diminishing returns. We’re fast enough to be pretty competitive and no longer embarrassing on an iPad, but there are a few more things to pick off around the edges. We’re heavier and slower than most of our museum peers, but lighter and faster than a lot of similar news sites. Which one are we? Depends which stats I want to compare. :)

We used the following tools to help diagnose and prioritize page loading time issues:
http://tools.pingdom.com/fpt/
https://developers.google.com/pagespeed/
http://developer.yahoo.com/yslow/

Building the 50/50 Voting App

50/50 Voting App

For our upcoming exhibition 50/50: Audience and Experts Curate the Paper Collection, we’re trying something a bit different. As you can probably tell from the title, we’re allowing our audience to help us curate a show. The idea is that our chief curator, Darsie Alexander, will curate 50% of the show, and the audience will select from a group of 180 different print works for the other half.

As with most things presented to New Media, the question was posed: “How best do we do this?” The exhibition is being hung in the same room as Benches and Binoculars, so the obvious answer was to use the kiosk already there as the voting platform for the show. With this in mind I started to think of different ways to present the voting app itself.

My initial idea was to do a “4-up” design: display four artworks and ask people to choose their favorite. The idea was that this would make people confirm a choice in comparison to others. If you see some of what you’re selecting against, it can be easier to know whether you want specific works in the show or not. But it also has the same effect in reverse: if there are two artworks you really like, it can be just as hard to choose only one. The other limitation? After coming up with the 4-up idea, we also decided to add iPhones into the mix as a possible voting platform (as well as iPads and general web browsers). The images on the iPhone’s screen were much too small to make decent comparisons on.

Nate suggested instead using a “hot or not” style voting system: one work at a time that you vote yes or no on. This had the small drawback of not letting you compare a work against others, but it allowed us to avoid the “analysis paralysis” of the 4-up model. It also worked much better on mobile devices.

The second big decision we faced was “what do we show?” I had assumed in the beginning that we’d show label copy for every work, like we do just about everywhere, but it was suggested early on that we do no such thing. We didn’t want to influence voters by putting a title or artist on every piece. With works by Chuck Close and Andy Warhol mixed into the print selections, it’s much too easy to see their name and vote for them simply because of their name. We wanted people to vote on what work they wanted to see, not what artist they wanted to see.

Both of these decisions proved pivotal in the popularity of the voting app. They made it streamlined and simple: with 180 works to go through, choices need to be quick and easy, and that makes it much easier to get through the entire set. The results screen after voting on each artwork shows the current percentage of no to yes votes. This is a bit of a psychological pull: you as a user know what you think of this artwork, but what do others think about it? The only way to find out is to vote.

50/50 Voting App Results Screen

Because of this, the voting app has been a success far beyond what we ever thought it would be. I thought if we got 5,000-10,000 votes we would be doing pretty well. Halfway through the voting process, we have well over 100,000 votes. We’ve had over 1,500 users voting on the artworks. We’ve collected over 500 email addresses from people who want to know the winners when all the voting is tallied. We never expected anything this good, and we have several weeks of voting yet to come.

One interesting outcome of all of these votes has been the balance of yes votes to no votes across the works. Since the works are presented randomly (well, pseudo-randomly for each user), one might expect that half the works would have more yes than no votes, and vice versa. But that’s not turned out to be the case: about 80% of the works have more no votes than yes votes. Why is this?

There are various theories. Perhaps people are more selective if they know something will be on view in public. Perhaps people in general are just overly negative. Or perhaps people really don’t like any of our artwork!

But one of the more interesting theories of why this is goes back to the language we decided to use. Originally we were going to use the actual words “Yes” and “No” to answer the question “Would you like to see this artwork on view?”. This later got changed to “Definitely” and “Maybe Not”. Notice how the affirmative answer has much more weight behind it: “Yes, most definitely!”, whereas the negative option leaves you a bit of wiggle room “Eh, maybe not”. It’s this differentiation between being sure of a decision and perhaps not so sure that may have contributed to people saying no more often than yes.

Which begs the question: what if it was changed? What if the options instead were “Definitely Not” and “Sure”? Now the definitive answer is on the negative, and the positive answer has more room to slush around (“Hell no!” vs “Ahh sure, why not?!”). It would be interesting to see what the results would have been with this simple change in language. Maybe next time. This round, we’re going to keep our metrics the same throughout to keep things consistent.

The voting for 50/50 runs until Sept 15. If you’d like to participate, you still have time!

Creating a community calendar using Google Apps and WordPress

For Walker Open Field, we wanted a way to collect community submitted events and display them on our site. We have our own calendar and we discussed whether adding the events to our internal Calendar CMS was the best way, or if using an outside calendar solution was the direction to go. In the end, we decided to do both, using Google Calendar for community events and our own calendar CMS for Walker-programmed events.

The Open Field website is based on the lovely design work of Andrea Hyde, and the site is built using WordPress, which we use for this blog and a few other portions of our website. WordPress is relatively easy to template, so it makes for quick development, and it has a load of useful plug-ins and built-in features that saved us a lot of time. Here’s how we put it together:

Collecting Events
To accept event ideas from community members, we used the WordPress Cforms II plugin, which makes it very easy to build otherwise complex forms and process them. You can create workflows with the submissions, but we simply have Cforms submit the form to us over email. A real person (Shaylie) reviews each of the event submissions and adds the details to…

Google Calendar
We use Google’s Calendar app to hold the calendar of community events. When Shaylie gets the email about a new event, she reviews it, follows up on any conflicts or issues, and then manually adds it to Google Calendar. We toyed with the idea of using the Calendar API to create a form that would allow users to create events directly in the calendar, but decided against it for two reasons. First, it seemed overly complicated for what we thought would amount to fewer than 100 events. Second, we would still have to review every submission, and it would be just as cumbersome to do it after the fact as beforehand.

We also use Google Calendar to process our own internal calendar feed. The Walker Calendar can spit out data as XML and ICAL. Our proprietary XML format can be rather complex, but the ICAL format is widely understood, and Google Calendar can import it as a subscription.

Getting data out of Google Calendar
We now have two calendars in Google Calendar: Walker Events and Community Events. Google provides several ways to get data out of Google Calendar; the one we use is the ATOM format with a collection of Google-namespaced elements. The calendar API is quite robust, but there are a few things worth noting:

  • You must ask for the full feed to get all the date info
  • Make sure you set the time zone, both on the feed request and on the events when you link to them (using the ctz parameter)
  • Asking for only futureevents and singleevents (as parameters) makes life easier, since you don’t have to figure out the repeating-event logic yourself, which is complicated

This is our feed for Open Field Community Events.

Calendar data into WordPress
Since version 2.8, WordPress has included the most excellent SimplePie RSS/ATOM parsing library. As the name would have you believe, it is pretty simple to use. To pull the data out of the Google Calendar items with SimplePie, you extend the SimplePie_Item class with some extra methods to get at that gd:when data.

Combining two feeds in SimplePie is not hard. By default, SimplePie will sort items by the modified or published date, which in the Google Calendar API is the date the event was edited, not when it happens. Instead, we want to sort by the event’s start time (the gd:when data). There are probably a few ways to do this, but the way I set it up was to simply loop through all the items, put them into an array with the start timestamp as the key, and then sort that array by key. Here’s the code:

[php]
<?php

//include the rss/atom feed classes
include_once(ABSPATH . WPINC . '/feed.php');
require_once(ABSPATH . WPINC . '/class-feed.php');

// Get a SimplePie feed object from the specified feed sources.
$calFeedsRss = array(
    #walker events
    'http://www.google.com/calendar/feeds/95g83qt1nat5k31oqd898bj2ako1phq1%40import.calendar.google.com/public/full?orderby=starttime&ctz=America/Chicago&sortorder=a&max-results=1000&futureevents=true',

    #public events
    'http://www.google.com/calendar/feeds/walkerart.org_cptqutet6ou4odcg6n2mvk4f44%40group.calendar.google.com/public/full?orderby=starttime&ctz=America/Chicago&sortorder=a&max-results=1000&futureevents=true&singleevents=true'
);

$feed = new SimplePie();
$feed->set_item_class('SimplePie_Item_Extras');
$feed->set_feed_url($calFeedsRss);
$feed->set_cache_class('WP_Feed_Cache');
$feed->set_file_class('WP_SimplePie_File');
$feed->set_cache_duration(apply_filters('wp_feed_cache_transient_lifetime', 3600)); //cache things for an hour
$feed->init();
$feed->handle_content_type();

if ($feed->error())
    printf('There was an error while connecting to the feed server, please try again!');

$cals = array(); // avoids a ksort() error if no events come back
$count = 0; // hack: we count each loop and use it as a little offset on the sort value
foreach ($feed->get_items() as $item) {

    if (strtolower($item->get_title()) != 'walker open field') {
        $eventType = 'walker';
        $related = $item->get_links('related');
        $related = $related[0];
        if (strpos($related, 'walkerart.org') === false) {
            $related = $item->get_link();
            // if it's a google calendar event, make sure we set the time zone
            $related .= "&ctz=America%2FChicago";
            $eventType = 'community';
        }

        #we offset the actual starttime a little bit in case two events have the
        #same start time; they would otherwise overwrite each other in the array
        $sortVal = $item->get_gcal_starttime('U') + $count;
        $myData = array(
            'title' => $item->get_title(),
            'starttime' => $item->get_gcal_starttime('U'),
            'endtime' => $item->get_gcal_endtime('U'),
            'link' => $related,
            'eventType' => $eventType,
            'text' => $item->get_content(),
            'date' => $item->get_gcal_starttime('U')
        );
        $cals[$sortVal] = $myData;
    }
    $count++;
}
//sort the array by keys
ksort($cals);
// $cals now contains all the event info we'll need

?>
[/php]

Once this is done, you can simply take the $cals array and loop through it in your theme as needed. Parsing the Google Calendar feeds is not an inexpensive operation, so you may wish to use the Transients API in WordPress to cache the result.

Caveats and Issues
Overall, this approach has worked well for us. We did run into some issues where the Google Calendar ATOM feed would show events that had been deleted; making sure to set the futureevents and singleevents parameters fixed this. We also ran into some issues using singleevents, so we ended up manually creating occurrences for events that would otherwise have had a repeating structure.

Simple iTunes U stats aggregation with python and xlrd

Like many institutions, we put numbers for our various online presences in our annual report and other presentations: YouTube views, Twitter followers, Facebook fans, etc. For most services, this is very easy to do: log in, go to the stats page, write the big number down. We also want to include the iTunes U numbers, but for iTunes U there is no centralized stats reporting outside of the Microsoft Excel file Apple sends iTunes U administrators every week. Tabulating the stats by hand would be time-consuming and error-prone, so I wrote a short python script to automate it. Here it is:

[python]
#!/usr/bin/env python
# encoding: utf-8
"""
itunesUstats.py
Created by Justin Heideman on 2010-05-19.
"""

import sys, os, glob, xlrd

def main():
    # change this to the path where your stats are
    path = 'itunes_U_stats/all/'
    totalDownloads = 0

    for infile in glob.glob(os.path.join(path, '*.xls')):
        # open the file
        wb = xlrd.open_workbook(infile)

        # get the most recent day's tracks
        sh = wb.sheet_by_index(1)

        # get the downloads from that day
        downloads = sh.col_values(1)

        # first entry is u'Count', which we don't want
        downloads.pop(0)

        # sum it up
        totalDownloads += sum(downloads)

        # show a little progress
        print sum(downloads)

    # done, output results
    print "----------------------------------------------------"
    print "Total downloads: %d" % totalDownloads

if __name__ == '__main__':
    main()
[/python]

This script uses the excellent xlrd python module to read the Excel files (simple xlrd tutorial here), which is roughly 27.314 times easier than trying to use an Excel macro to do the same thing. To use it, simply change the path variable at the top of main() to a directory containing all your iTunes stats files, and run the script from the command line. You’ll get output like this:

929.0
732.0
779.0
854.0
1000.0
987.0
765.0
812.0
1275.0
1333.0
1114.0
1581.0
1278.0
1568.0
1854.0
2102.0
1108.0
1078.0
----------------------------------------------------
Total downloads: 21149

Do note that the script is tied to the Excel format Apple has been using, so if they change it, this will break. Apple explains the iTunes report fields here. This script totals “all the iTunes U tracks users downloaded that week through the DownloadTrack, DownloadTracks, and SubscriptionEnclosure actions”.

User testing using paper prototypes

A few years ago I was trying to explain the concept of “fail early, fail often” to someone, and failing.  (see what I did there?  ;-)  They didn’t understand why you just wouldn’t take longer to build it right the first time.

Now that we’re deep in the process of redesigning our website, I am starting to see the real danger in that sort of thinking.  Despite all our best intentions, we’ve fallen into a trap of thrashing back and forth around certain ideas – unable to agree, unwilling to move forward until we “solve it”, and essentially stuck in the same cycle illustrated in this cartoon.

Click for the whole cartoon (scroll down a bit)

To try to help break the recent impasse on site navigation, we’re doing some simple user testing using paper prototypes of several ideas.  These are meant to be rough sketches to essentially pass/fail the “do they get it?” test, but they’re also giving us a ton of valuable little hints into how people see and understand both our website and our navigation.

An example of some paper prototypes for the navigation. (Don't worry, it's just a rough idea and one of many!)

Our basic process so far is to ask people (non-staff) for first impressions of the top nav: does it make sense?  Do they think they know what they’ll get under each button?  Then we show the flyouts and see if it’s what they expected.  Anything missing?  Anything that doesn’t meet their expectations?  Finally we ask a few targeted “task” questions, like “where would you look if you wanted information about a work of art you saw in the galleries?”

Even this simple round of testing has revealed some clearly wrong assumptions on our part.  By fixing these things now (failing early) and iterating quickly, we can do more prototypes and get more feedback (failing often).  I’ll try to post updates as we proceed.

PS — Anyone else doing paper prototypes like this?  I think we all know we’re “supposed” to do quick user testing, but honestly this is the first time in years we’ve actually done something like it.

Setting up smartphone emulators for testing mobile websites

While developing the Walker’s mobile site, I needed to test the site in a number of browsers to ensure compatibility. If you thought testing a regular website was a pain, mobile is an order of magnitude worse.

Our mobile site is designed to work on modern smartphones. If you’re using a 4 year old Nokia phone with a 120×160 screen, our site does not and will not work for you. If you want to test on older/less-smart phones, PPK has a quick overview post that has some pointers. Even so, getting the current smartphone OS running is no piece of cake. So this post will outline how to get iPhone, Android, WebOS, and, ugh, BlackBerry running in emulation. Note: I left out Windows Mobile, as does 99% of the smartphone buying public.

Let’s knock off some low hanging fruit: iPhone

Getting the iPhone to run in emulation is very easy. First, you have to have a Mac; if you’re a web developer, you’re probably working on one. You need to get the iPhone developer tools, which means registering for a free Apple Developer account and agreeing to their lengthy and draconian agreement. Once that’s done, you can slurp down the humongous 2.3 GB download and install it. Once installed, you’ll have a nice folder named Developer at the root of your drive; navigate inside it and look for iPhone Simulator.app. That’s your boy, so launch it and, hooray! You can now test your sites in Mobile Safari.

iPhone Simulator in Apple Developer Tools

The iPhone Simulator is by far the easiest to work with, since it’s a nice pre-packaged app, just like any other. And it is a simulator, not an emulator. The difference being: a simulator just looks and acts like an iPhone but actually runs native code on your machine, while an emulator emulates a different processor, running the phone’s whole OS inside the emulator. The iPhone Simulator runs an x86 version of Safari and just links against the mobile frameworks, compiled for x86, on your local machine. A real, actual iPhone has all the same frameworks, but compiled to ARM code on the phone.

Walker Art Center on the iPhone

Android

In typical Google fashion, Android is a bit more confusing, but also more powerful. There are roughly three different flavors of Android out in the wild: 1.5, 2.0, and 2.1. The browser is slightly different in each, but for most simple sites this should be relatively unimportant.

To get the Android Emulator running, download the Android SDK for your platform. I’m on a Mac, so that’s what I focus on here. You’ll need up-to-date Java, but if you’re on a Mac, this isn’t a problem. Bonus points to Google for being the only one not to require you to sign up as a developer to get the SDK. Once you have the file, unpack it and pull up the Terminal. Navigate to the directory, then look inside the tools directory. You need to launch the “android” executable:

Very tricky: Launch the android executable.

This will launch the Android SDK and Android AVD Manager:

Android SDK and AVD Manager

The first thing you’ll probably want to do is go to Installed Packages and hit Update All…, just to get everything up-to-date. With that done, move back to Virtual Devices and we’re going to create a New virtual device:

Set up new Android Virtual Device

Name it whatever you want; I’d suggest using Android 2.1 as your target. Give it a file size of around 200 MB (you don’t need much if you aren’t going to install any apps) and leave everything else at the defaults. Once it’s created, you can simply hit Start, wait for it to boot, and you’re now running Android:

Android Emulator Running

Palm WebOS

Palm is suffering as a company right now and, depending on the rumors, is about to be bought by Lenovo, HTC, Microsoft, or Google. Pretty much everyone agrees that WebOS is really cool, so it’s definitely worth testing your mobile site on. WebOS, like the iPhone and Android, uses WebKit as its browser, so things here are not going to be unexpected. The primary difference is the available fonts.

Running the WebOS emulator is very easy, at least on the Mac. First, you need to download and install a copy of VirtualBox, and second, you download and install the Palm SDK. Both are linked from this page.

Installing VirtualBox is dead easy, and works just like any other OS X .pkg install process:

Then download and install the Palm WebOS SDK:

When you’re done, look in your /Applications folder for an app named Palm Emulator:

When you launch the emulator, you’ll be asked to choose a screen size (corresponding to either the Pre or the Pixi) and then it will start VirtualBox. It’s a bit more of a cumbersome startup process than the iPhone Simulator, but about on par with Android.

WebOS emulator starting up. It fires up VirtualBox in the background.

WebOS running.

BlackBerry

BlackBerry is the hairiest of all the smartphones in this post. Unless you know the Research In Motion ecosystem, and I don’t, it seems that there are about 300 different versions of BlackBerry and no easy way to know which version you should test on. From what I can tell, the browser is basically the same on all the more recent phones, so picking one phone and using that should be fairly safe. RIM is working on BlackBerry 6, which is purported to include a WebKit-based browser, addressing the sadness their browser causes in web developers everywhere.

The first thing you’re going to need to simulate a BlackBerry is a Windows machine. I use VMware Fusion on my Mac and have several instances of XP, so this is not a problem. The emulator is incredibly slow and clunky, so you’ll want a fairly fast machine or a virtual machine with the RAM and CPU settings cranked up.

There are three basic parts you’ll need to install to get the BlackBerry emulator running: Java EE 5, the BlackBerry Smartphone Simulator, and the BlackBerry Email and MDS Services Simulator. Let’s start with Java. You need Java Enterprise Edition 5, and you can get that on Sun/Oracle’s Java EE page. I’ve had Java EE 5 and 6 on my Windows machine for quite some time, so I’m not actually sure which version BlackBerry requires, but it’s one of them, and they’re both free. Get it, install it, and add one more hunk of junk to your system tray.

Now you need the emulators themselves. To get one, head over to the RIM emulator page and pick a device. I went with the 9630 since it seems fairly popular and it was at the top of the list of devices to choose from. I’d grab the latest OS for a generic carrier. You will have to register for a no-cost RIM developer account to download the file.

While you’re there, you’ll also want to grab the MDS (aka Mobile Data Service) emulator. This is what enables the phone to actually talk to the internet. To grab this, click on the “view all BlackBerry Smartphone Simulator downloads” link, and then choose the first item from the list, “BlackBerry Email and MDS Services Simulator Package”. Click through and grab the latest version.

Once the download completes, copy the .EXEs to Windows and run them. You’ll walk through the standard Windows install process, and when you’re done, you’ll be left with some new menu items. Let’s start the MDS up first, since we’d like a net connection. Here’s where you should find it:

I like to take screenshots of Windows to show how crazy bad it is.

And this is what it looks like starting up:

MDS running. It's a java app.

Now let’s start up the phone emulator itself:

BlackBerry 9630 Emulator

For me, it takes quite a while to start the phone, about a minute. I started off with a smaller VM instance and it was 5+ minutes to launch, so be warned. After it starts, you’ll be left with a screen like this:

You can’t use the mouse to navigate on the screen, which is crazy counter-intuitive for anyone who has used the other three phones mentioned in this post. Instead, you click on the buttons on screen or use your keyboard to navigate. Welcome to 2005. To get to the browser, hit the hangup button, then arrow over to the globe and hit enter. You can hit the little re-wrap/undo button to get to the URL field once the browser launches. Here’s what our site looks like:

A glimpse inside a blog spammer’s tools

We get a fair amount of spam on the Walker Blogs: Defensio has blocked 49108 spam messages since it started counting. Even with a 99.07% accuracy rate and a captcha, spam gets through our filters. Over the weekend, I noticed a couple spam comments come through that I thought were interesting. Here’s an example:

{Amazing|Amazing Dude|Wow dude|Thanks dude|Thankyou|Wow man|Wow}, {that is|this is|that’s} {extremely|very|really} {good|nice|helpful} {info|information}, {thanks|cheers|much appreciated|appreciated|thankyou}.

The geeky types among us will immediately recognize that as some sort of spam template language. Pick a word or phrase in each section, and you have a nearly limitless selection of spam phrases.
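
Expanding a template like that takes only a few lines of code, which gives you a sense of how cheaply this variety is produced. A quick illustration (my own sketch, not the spammer’s actual tool):

[python]
import random, re

def spin(template):
    # repeatedly replace the innermost {a|b|c} group with a random choice
    pattern = re.compile(r'\{([^{}]*)\}')
    while True:
        template, n = pattern.subn(lambda m: random.choice(m.group(1).split('|')), template)
        if n == 0:
            return template

print spin("{Amazing|Wow dude|Thankyou}, {that is|this is} {extremely|very|really} {good|nice} {info|information}, {thanks|cheers}.")
[/python]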

The template language in itself isn’t all that interesting, but what I found very interesting is that the link the spammer left goes to this account on del.icio.us:

The pages and pages of tagged links look to be the library of links that our spammer is using to spam blogs. The comments they’ve left on delicious look to be alternate text to be used as comments on posts. My guess is they’re using the del.icio.us tags to match keywords or tags on blog posts. I guess it’s not surprising to see spammers using web 2.0 services for doing their filthy work.

Building the Walker’s mobile site, part 2 — google analytics without javascript

As I mentioned in my last post on our mobile site, one of the key features for our site was making sure that we don’t use any javascript unless absolutely necessary. If you use Google Analytics (GA) as your stats package, this poses a problem, since the supported way to run GA is via a chunk of javascript at the bottom of every page. And to make matters worse, the ga.js file is not gzipped, so you’re loading 9K which would otherwise be about 4K, on a platform where every byte counts. By contrast, if you could just serve the tracking gif, it would be 47 bytes, with no javascript that might not run on B-grade or below devices.

A few weeks ago, Google announced support for analytics inside mobile apps and some cursory support for mobile sites:

Google Analytics now tracks mobile websites and mobile apps so you can better measure your mobile marketing efforts. If you’re optimizing content for mobile users and have created a mobile website, Google Analytics can track traffic to your mobile website from all web-enabled devices, whether or not the device runs JavaScript. This is made possible by adding a server side code snippet to your mobile website which will become available to all accounts in the coming weeks (download snippet instructions). We will be supporting PHP, Perl, JSP and ASPX sites in this release. Of course, you can still track visits to your regular website coming from high-end, Javascript enabled phones.

And that is the extent of the documentation you will find anywhere on Google on how to run analytics without javascript. The code included is handy if you happen to run one of their supported platforms, but the Walker’s mobile site runs on the python side of AppEngine, so their code doesn’t do us much good. Thankfully, since they provide the source, we can, without too much trouble, translate the php or perl into python and make it AppEngine friendly.

How it works

Regular Google Analytics works by serving some javascript and a small 1px x 1px gif file to your site from Google. The gif lets Google learn many things from the HTTP request your browser makes, such as your browser, OS, where you came from, your rough geo location, etc. The javascript lets them learn all kinds of nifty things about your screen, flash versions, events that fire, etc. And Google tracks you through a site by setting some cookies on that gif they serve you.

To use GA without javascript, we can still do most of that, by generating our own gif file and passing some information back to Google from our server. That is, we generate a gif, assign and track our own cookie, gather that information as you move through the site, and then pass it back to Google with an HTTP request carrying the appropriate query strings, which they then compile and treat as regular old analytics.

The Code

To make this work in AppEngine, we create a URL in our webapp that we’ll serve the gif from. I’m using “/ga/”:

[python]
def main():
    application = webapp.WSGIApplication(
        [('/', home.MainHandler),
         # edited out extra lines here
         ('/ga/', ga.GaHandler),
        ],
        debug=False)
    wsgiref.handlers.CGIHandler().run(application)
[/python]

And here’s the big handler for /ga/. I based it mostly off the php and some of the perl:

[code lang=”python” collapse=”true”]
from google.appengine.ext import webapp
from google.appengine.api import urlfetch
import re, hashlib, random, time, datetime, cgi, urllib, uuid

# google analytics stuff
VERSION = "4.4sh"
COOKIE_NAME = "__utmmobile"

# The path the cookie will be available to, edit this to use a different cookie path.
COOKIE_PATH = "/"

# Two years in seconds.
COOKIE_USER_PERSISTENCE = 63072000

GIF_DATA = [
chr(0x47), chr(0x49), chr(0x46), chr(0x38), chr(0x39), chr(0x61),
chr(0x01), chr(0x00), chr(0x01), chr(0x00), chr(0x80), chr(0xff),
chr(0x00), chr(0xff), chr(0xff), chr(0xff), chr(0x00), chr(0x00),
chr(0x00), chr(0x2c), chr(0x00), chr(0x00), chr(0x00), chr(0x00),
chr(0x01), chr(0x00), chr(0x01), chr(0x00), chr(0x00), chr(0x02),
chr(0x02), chr(0x44), chr(0x01), chr(0x00), chr(0x3b)
]

class GaHandler(webapp.RequestHandler):
def getIP(self,remoteAddress):
if remoteAddress == ” or remoteAddress == None:
return ”

#Capture the first three octects of the IP address and replace the forth
#with 0, e.g. 124.455.3.123 becomes 124.455.3.0
res = re.findall(r’\d+\.\d+\.\d+\.’, remoteAddress)
if res:
return res[0] + "0"
else:
return ""

def getVisitorId(self, guid, account, userAgent, cookie):
#If there is a value in the cookie, don’t change it.
if type(cookie).__name__ != ‘NoneType': # or len(cookie)!=0:
return cookie

message = ""

if type(guid).__name__ != ‘NoneType': # or len(guid)!=0:
#Create the visitor id using the guid.
message = guid + account
else:
#otherwise this is a new user, create a new random id.
message = userAgent + uuid.uuid1(self.getRandomNumber()).__str__()

m = hashlib.md5()
m.update(message)
md5String = m.hexdigest()

return str("0x" + md5String[0:16])

def getRandomNumber(self):
return random.randrange(0, 0x7fffffff)

def sendRequestToGoogleAnalytics(self,utmUrl):
”’
Make a tracking request to Google Analytics from this server.
Copies the headers from the original request to the new one.
If request containg utmdebug parameter, exceptions encountered
communicating with Google Analytics are thown.
”’
headers = {
"user_agent": self.request.headers.get(‘user_agent’),
"Accepts-Language": self.request.headers.get(‘http_accept_language’),
}
if len(self.request.get("utmdebug"))!=0:
data = urlfetch.fetch(utmUrl, headers=headers)
else:
try:
data = urlfetch.fetch(utmUrl, headers=headers)
except:
pass

def get(self):
”’
Track a page view, updates all the cookies and campaign tracker,
makes a server side request to Google Analytics and writes the transparent
gif byte data to the response.
”’
timeStamp = time.time()

domainName = self.request.headers.get(‘host’)
domainName = domainName.partition(‘:’)[0]

if len(domainName) == 0:
domainName = "m.walkerart.org";

#Get the referrer from the utmr parameter, this is the referrer to the
#page that contains the tracking pixel, not the referrer for tracking
#pixel.
documentReferer = self.request.get("utmr")

if len(documentReferer) == 0 or documentReferer != "0":
documentReferer = "-"
else:
documentReferer = urllib.unquote_plus(documentReferer)

documentPath = self.request.get("utmp")
if len(documentPath)==0:
documentPath = ""
else:
documentPath = urllib.unquote_plus(documentPath)

account = self.request.get("utmac")
userAgent = self.request.headers.get("user_agent")
if len(userAgent)==0:
userAgent = ""

#Try and get visitor cookie from the request.
cookie = self.request.cookies.get(COOKIE_NAME)

visitorId = str(self.getVisitorId(self.request.headers.get("HTTP_X_DCMGUID"), account, userAgent, cookie))

#Always try and add the cookie to the response.
d = datetime.datetime.fromtimestamp(timeStamp + COOKIE_USER_PERSISTENCE)
expireDate = d.strftime(‘%a,%d-%b-%Y %H:%M:%S GMT’)

self.response.headers.add_header(‘Set-Cookie’, COOKIE_NAME+’=’+visitorId +'; path=’+COOKIE_PATH+'; expires=’+expireDate+';’ )
utmGifLocation = "http://www.google-analytics.com/__utm.gif"

myIP = self.getIP(self.request.remote_addr)

#Construct the gif hit url.
utmUrl = utmGifLocation + "?" + "utmwv=" + VERSION + \
"&utmn=" + str(self.getRandomNumber()) + \
"&utmhn=" + urllib.pathname2url(domainName) + \
"&utmr=" + urllib.pathname2url(documentReferer) + \
"&utmp=" + urllib.pathname2url(documentPath) + \
"&utmac=" + account + \
"&utmcc=__utma%3D999.999.999.999.999.1%3B" + \
"&utmvid=" + str(visitorId) + \
"&utmip=" + str(myIP)

# we dont send requests when we’re developing
if domainName != ‘localhost':
self.sendRequestToGoogleAnalytics(utmUrl)

#If the debug parameter is on, add a header to the response that contains
#the url that was used to contact Google Analytics.
if len(self.request.get("utmdebug")) != 0:
self.response.headers.add_header("X-GA-MOBILE-URL" , utmUrl)

#Finally write the gif data to the response.
self.response.headers.add_header(‘Content-Type’, ‘image/gif’ )
self.response.headers.add_header(‘Cache-Control’, ‘private, no-cache, no-cache=Set-Cookie, proxy-revalidate’ )
self.response.headers.add_header(‘Pragma’, ‘no-cache’ )
self.response.headers.add_header(‘Expires’, ‘Wed, 17 Sep 1975 21:32:10 GMT’ )
self.response.out.write(”.join(GIF_DATA))

[/code]

So now we know what to do with requests to /ga/ when we get them; we just need to make the proper requests to that URL in the first place, which means generating the URL the visitor’s browser will request. With normal django, we would be able to use a context processor to automatically insert it into the page’s template values. But since AppEngine doesn’t use that, we have our own helper functions to do it, some of which I showed in my last post. Here are the updated helper functions, with the googleAnalyticsGetImageUrl function included:

[code lang="python"]
import random, urllib

import settings

def googleAnalyticsGetImageUrl(request):
    url = ""
    url += '/ga/' + "?"
    url += "utmac=" + settings.GA_ACCOUNT
    url += "&utmn=" + str(random.randrange(0, 0x7fffffff))

    referer = request.referrer
    query = urllib.urlencode(request.GET)  # $_SERVER["QUERY_STRING"];
    path = request.path  # $_SERVER["REQUEST_URI"];

    if referer is None or len(referer) == 0:
        referer = "-"

    url += "&utmr=" + urllib.pathname2url(referer)

    if len(path) != 0:
        url += "&utmp=" + urllib.pathname2url(path)

    url += "&guid=ON"

    return {'gaImgUrl': url}

def getTemplateValues(request):
    myDict = {}
    myDict.update(ua_test(request))  # ua_test is one of the helpers from the last post
    myDict.update(googleAnalyticsGetImageUrl(request))
    return myDict
[/code]

Assuming we use getTemplateValues to set up our initial template_values dict, we should have a variable named ‘gaImgUrl’ in our page. To use it, all we need to do is put this at the bottom of every page on the site:

[code lang=”html”]
<img src="{{ gaImgUrl }}" alt="analytics" />
[/code]

My settings file contains the GA_ACCOUNT variable, but replaces the standard UA-XXXXXX-X account prefix with MO-XXXXXX-X. I’m assuming the MO- tells Google that it’s a mobile site, so it accepts the proxied requests.

One thing to keep in mind with this technique is that you cannot cache your rendered templates wholesale. The image URL you serve will necessarily have a different query string every time, and if you cached it, you would ruin your analytics. Instead, you should cache nearly everything from your view functions except the gaImgUrl variable.
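
One way to square that with caching is to cache the rendered page with a placeholder where the tracking URL goes, then splice the real URL in per request. A sketch using AppEngine’s memcache (the handler name and placeholder token are inventions of this example, not our actual code):

[python]
from google.appengine.api import memcache
from google.appengine.ext.webapp import template

def render_home(request):
    # cache the expensive rendering once, with a stand-in for the tracking URL
    html = memcache.get('home_page_html')
    if html is None:
        values = getTemplateValues(request)
        values['gaImgUrl'] = '%%GA_IMG_URL%%'  # placeholder, replaced below
        html = template.render('home.html', values)
        memcache.set('home_page_html', html, 3600)

    # splice the per-request tracking URL into the cached markup
    ga_url = googleAnalyticsGetImageUrl(request)['gaImgUrl']
    return html.replace('%%GA_IMG_URL%%', ga_url)
[/python]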
