New Media Initiatives

Part of: blogs.walkerart.org

 
by Nate Solas at 2:49 pm 2009-07-13

We’re in the process of retiring our last production server running NT and ColdFusion (whew!), which means we needed to get a few old projects ported to our newer Linux machines.  The main site, http://aen.walkerart.org/, is marginally database-driven: that is, it pulls random links and projects from a database to make the pages different each time you load them.  The admin at the time was nice enough to include MDB dump files from the Microsoft Access(!) project database, and the free mdbtools software was able to extract the schema and generate import scripts.  Most of this worked as-is, but I had to tweak the schema by hand.

After the database was ported to MySQL, it was time to convert the ColdFusion to PHP.  (Note: the pages still say .cfm so we don’t break links or search engines – it’s running PHP on the server.)  Luckily the scripts weren’t doing anything terribly complicated, mostly just selects and loops with some “randomness” thrown in.  I added a quick database-abstraction file to handle connections and errors and sanitize input, and things were up and running quickly.
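A minimal sketch of what such an abstraction layer might look like.  It is written in Python with sqlite3 rather than the PHP and MySQL the site actually runs on, and the table and column names (links, title, url, category) and the function names are assumptions for illustration, not the real schema.  The point is that connections, error handling and input handling live in one place, and that user input goes in as a bound parameter rather than being pasted into the SQL string.

```python
import sqlite3


def connect(path=":memory:"):
    """Open a connection; rows come back addressable by column name."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row
    return conn


def query(conn, sql, params=()):
    """Run one statement with bound parameters and wrap database errors."""
    try:
        return conn.execute(sql, params).fetchall()
    except sqlite3.Error as exc:
        raise RuntimeError(f"query failed: {exc}") from exc


def random_links(conn, category, limit=3):
    """Pick a few random links so the page differs on each load.

    The 'links' table and its columns are hypothetical.
    """
    return query(
        conn,
        "SELECT title, url FROM links WHERE category = ? "
        "ORDER BY RANDOM() LIMIT ?",
        (category, limit),
    )
```

Because the category is bound with a placeholder, a value like net' OR '1'='1 is treated as an odd category name that matches nothing, not as SQL.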

… sort of.  The site is essentially a repository of links to other projects, and was launched in February 2000.  As you might imagine there’s been some serious link rot, and I’m at a bit of a loss on how to approach a solution.  Steve Dietz, former New Media curator here at the Walker, has an article discussing this very issue here (ironically mentioning another Walker-commissioned project that’s suffered link rot.  Hmm.).

One strategy Dietz suggests is to update the links by hand as the net evolves.  This seems resource-heavy, even if a link-validating bot could automate the checking — someone would have to curate new links and update the database.  I’m not sure we can make that happen.
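For what it’s worth, the checking half of that bot is small.  Here is a rough sketch in Python using only the standard library; the function names and the ok/broken/unreachable labels are my own assumptions, and it says nothing about the harder curation step.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status for a HEAD request, or None if unreachable."""
    try:
        req = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (404, 500, ...).
        return exc.code
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, refused connection, timeout, or malformed URL.
        return None


def classify(status):
    """Map a status code to a coarse label for a link report."""
    if status is None:
        return "unreachable"
    if 200 <= status < 400:
        return "ok"
    if 400 <= status < 600:
        return "broken"
    return "unknown"
```

A nightly job could loop over the link table and call classify(fetch_status(url)) for each row.  Note that a link labelled "ok" is only known to answer; whether it still shows the original project is something a person has to judge.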

It also occurred to me to build a proxy using the Wayback Machine to try to give the user a view of the internet in early 2000.  There’s no API for pulling pages, but archive.org allows you to build a URL to get the copy of a page closest to a specific date, so it seems possible.  But this is tricky for other reasons – what if the site actually still exists?  Should we go to the live copy or the copy from 2000?  Do we need to pull the header on the URL and only go to archive.org if it returns an error, anywhere from a 404 to a 500?  And what if the domain is now owned by a squatter who returns a 200 page of ads?  Also, archive.org respects robots.txt, so a few of our links have apparently never been archived and are gone forever.  Rough.
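The URL-building part can be sketched in a few lines of Python.  This assumes the Wayback Machine’s web.archive.org/web/TIMESTAMP/URL form, which redirects to the snapshot nearest the timestamp; the function names, the default date of 1 February 2000, and the "prefer the live copy unless it errors" policy are assumptions for illustration, not a decision I’ve actually made.

```python
from datetime import date

WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(url, when=date(2000, 2, 1)):
    """Build a Wayback Machine URL for the snapshot closest to 'when'."""
    timestamp = when.strftime("%Y%m%d") + "000000"  # YYYYMMDDhhmmss
    return f"{WAYBACK_PREFIX}/{timestamp}/{url}"


def pick_target(url, status, when=date(2000, 2, 1)):
    """Send the visitor to the live page if it answered, else to the archive.

    'status' is the HTTP status from a prior header check, or None if the
    host could not be reached at all.
    """
    if status is not None and 200 <= status < 400:
        return url
    return wayback_url(url, when)
```

This policy does nothing about the squatter case: a parked domain that returns 200 still counts as live here.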

In the end, the easy part was porting the code to a new language and server – it works pretty much exactly like it did before, broken links and all.  The hard part is figuring out what to do with the rest of the web…  I do think I’ll try to build that archive.org proxy someday, but for now the fact that it’s running on stable hardware is good enough.

Thoughts?  Anyone already built that proxy and want to share?

 
by Nate Solas at 1:07 pm 2009-06-22

New Media has a number of development servers located in-house where we get stuff done before releasing it out into the wild.  Until last week these were protected by an aging OpenBSD firewall running pf (Packet Filter), and all was well until the motherboard failed midweek.  Not having a spare on hand, I was scrambling for a solution.

Linksys wireless router

Being familiar with the dd-wrt project, I was pretty sure I could build a firewall out of a Linksys router.  We went with the WRT54GL, currently as cheap as $50 on Amazon.  (We bought local so we’d have it sooner, and it was a bit more).

The first step after flashing the firmware with the latest dd-wrt build (v24-sp2) was to take off the antennas and turn off the radio.  The last thing I want is for the firewall to be broadcasting an SSID and allowing wireless associations.  This actually requires a startup script on the router, with a line to remove the wireless module so it won’t try to reenable itself:

wl radio off
wl down
rmmod wl

Good start.  Next I needed to bridge the WAN port with the LAN ports, which ended up being a struggle until I found the easy options in the dd-wrt GUI.  First, set the LAN to use a static IP and make sure you can connect via another machine to configure it.  You’ll also need to enable SSH access and remote configuration – but be sure to lock this down once the firewall is running!

Once you have the LAN configured, you need to set the WAN connection type to “disabled”.  This will give you a checkbox to bridge the LAN and WAN:  “Assign WAN port to switch”.  Lastly, under Advanced Routing set the Operating Mode to “Router” so it stops trying to do NAT.  Apply these settings, and you’ll basically have an expensive dumb switch – all traffic shows up on every port, and there’s no logic at all.  We’re halfway there.

Being unfamiliar with iptables (we use OpenBSD and pf for firewalls around here), I was under the impression that iptables rules would work in a bridging environment.  This is not the case: bridged packets don’t reach iptables at all!  The best I could do was block everything (manual restart needed), or otherwise blow up the configuration (manual restart needed) as I tried to mess with the bridge.  This was an incredibly frustrating learning curve as everything I could find made it sound like this was the way to configure a firewall in Linux, but it just wasn’t working.

Note to keep you sane: don’t do any of this testing in the startup scripts or you’ll brick your router, guaranteed.  Do it all from the command line with a known-good startup.  That way it’s a simple (but annoying) power cycle to get things back up.

The trick, it turns out, is a kernel module called ebtables.  Luckily, this is included in the dd-wrt build, but it’s not turned on by default!  Add this to your startup script:

insmod ebtables
insmod ebtable_filter
insmod ebt_ip.o

And, ta-da, all your iptables rules will start impacting packets!  Now it’s just a matter of configuring the firewall rules.  We’re using something like this:  (vlan0 represents the LAN ports, and vlan1 is the WAN port)

# drop everything by default:
iptables -P FORWARD DROP
# clear the old rules:
iptables -F FORWARD
# forward stuff that's established already
iptables -A FORWARD -m state --state ESTABLISHED,RELATED -j ACCEPT
# let connections out:
iptables -A FORWARD -i vlan0 -m state --state NEW -j ACCEPT

# firewall access rules
iptables -F INPUT
# WAC ips can get to fw:
iptables -A INPUT -p tcp -d 1.2.3.4 -s 4.3.2.1/24 -j ACCEPT
# drop everything else!
iptables -A INPUT -p tcp -d 1.2.3.4 -j DROP

# ... snipped all the actual access rules and packet flood protection ...

The only trick here is the last few lines which limit access to the firewall machine itself.  We can’t use the FORWARD rules since these packets are destined for the internal hardware and not forwarded, but we do need to limit access via the INPUT chain.  In this example the firewall has IP 1.2.3.4 and the network I want to access it from has 4.3.2.x.  That way I can leave the firewall’s remote access turned on and limit it to our network.  (because there’s no terminal access you can’t make it a truly transparent bridge or you’d never be able to change the config!)
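To make the matching concrete, here is a small Python model of those two INPUT rules.  It is only an illustration of first-match evaluation and of what 4.3.2.1/24 covers, not how iptables is implemented; it ignores the TCP-only restriction, and the "POLICY" return value is my own shorthand for "no rule matched, fall through to the chain’s default policy".

```python
import ipaddress

FIREWALL_IP = ipaddress.ip_address("1.2.3.4")
# A /24 source with host bits set still means the whole 4.3.2.0/24 block.
ALLOWED_NET = ipaddress.ip_network("4.3.2.1/24", strict=False)


def input_verdict(source_ip, dest_ip):
    """Walk the two INPUT rules in order; the first match decides."""
    src = ipaddress.ip_address(source_ip)
    dst = ipaddress.ip_address(dest_ip)
    if dst == FIREWALL_IP and src in ALLOWED_NET:
        return "ACCEPT"   # rule 1: our network may reach the firewall
    if dst == FIREWALL_IP:
        return "DROP"     # rule 2: nobody else may
    return "POLICY"       # neither rule matched


print(ALLOWED_NET)                              # 4.3.2.0/24
print(input_verdict("4.3.2.77", "1.2.3.4"))     # ACCEPT
print(input_verdict("8.8.8.8", "1.2.3.4"))      # DROP
```

The order matters: swap the two rules and the DROP would match first, locking everyone out of the firewall.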

I admit I’m a bit nervous posting some of this in case there’s a glaring security hole, but it seems good to me.  Anyone see anything they’d like to warn me about before we get hacked?

And there you have it!  For the cost of a cheap router and some time (not much, since you can just follow these notes!) you have a full-featured bridging firewall running on dedicated hardware.  With a little extra work it would be easy to get VPN running and much more…  I’m hoping for years of service from this little guy!

(Hat tip to another DIY firewall solution that I’d really like to try someday.)

 
by Justin Heideman at 6:03 pm 2009-02-05

If you’ve tuned in to the live streaming events the Walker Channel has carried in the past, you have been forced to use Real Player to watch. Real was great back in the day when the Walker Channel was launched, but in 2009 it is a little dated. Flash streaming is much more convenient, and the VP6 codec Flash offers is quite good.

For tonight’s The Art of the Book panel discussion, we will be using ustream.tv to stream the event, rather than Real Player. No fancy plugins or separate applications required. It is also free, and doesn’t require us to run our own Real Media server. It will also help us decrease the turn-around on getting a recorded event into the Walker Channel, iTunes U and YouTube. None of this is rocket surgery, of course. Other places, like The UpTake, have been using free streaming services very effectively; we’re just a little late getting on the bandwagon.

We’re treating tonight’s lecture as a test of ustream, and we will be working out any kinks. We’ve done some testing already, but haven’t used it in a live setting where more than a handful of people have been watching.

If you’re watching and run into any problems, let us know. Shoot me an e-mail (click on my name to get the address), hit us on twitter, post here, or join the chatroom on ustream.

 
by Nate Solas at 12:39 pm 2008-11-22

It had been a slow build, but an incident a few weeks ago made it finally clear: the Walker website was becoming a victim of its own success.  A post on the Teens site contained a picture of the Joker for the then-upcoming Batman movie, and as Halloween approached we found ourselves on the front page of Google image search as people began looking for costume ideas.  The exponential traffic was crippling our web server.

The biggest problem was simply that Apache is heavy.  It’s resource-intensive, especially when you are running several modules as we were – PHP, proxy, cache, etc.  The teens site is especially difficult since it runs as a combination of a blog (PHP on Apache 2) and .wac pages (mod_perl & Axkit).  Every hit to the Joker post – even if the page was cached – would tie up a number of Apache processes as it served the style sheets, images, and javascript to support the page.  We were reaching our MaxClients setting but unable to raise it without running out of memory for our other more intensive servers (mod_perl and postgres, I’m looking at you…).

As this diagram shows, it’s nothing but Apache servers, and it just wasn’t scaling to meet our current demand.

The approach was two-fold: some quick auditing and re-writing of the worst offending .wac pages’ SQL to speed up the slow pages, and yet another web server in front of everything.  It was a no-brainer to pick Lighttpd, or “Lighty”.  It’s written to do one thing – serve static content – and do it extremely fast.  Fortunately it can also proxy requests, so it was a pretty simple matter to reassign some ports and write a few rules to route all requests through Lighty.
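The routing idea itself is simple enough to sketch.  This is Python, not Lighty configuration, and the extension list, backend names and port numbers are all hypothetical; it only shows the decision the front server makes for each request: serve static files itself, and hand everything else to the heavier server that can actually build the page.

```python
import posixpath

# Hypothetical values for illustration; not our real ports or file types.
STATIC_EXTENSIONS = {".css", ".js", ".png", ".jpg", ".gif", ".ico"}
BACKENDS = {
    "mod_perl": ("127.0.0.1", 8081),  # .wac pages
    "apache_php": ("127.0.0.1", 8082),  # blog pages
}


def route(request_path):
    """Decide who handles a request: the front server or a backend."""
    path = request_path.split("?", 1)[0]          # ignore the query string
    ext = posixpath.splitext(path)[1].lower()
    if ext in STATIC_EXTENSIONS:
        return ("static", None)                   # never touches Apache
    if ext == ".wac":
        return ("proxy", BACKENDS["mod_perl"])
    return ("proxy", BACKENDS["apache_php"])
```

With a split like this, a hit to a popular post costs the backend one request for the page, while the style sheets, images and javascript around it are served by the light front end.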

The end result is astonishing.  Our server hums along happily under even the most intense traffic we can throw at it (the email blast for the British Television Advertising Awards) and doesn’t even start to complain.  Moving the bulk of the requests to the extremely fast and resource-light server meant we could devote more resources to quickly processing the slow pages (mod_perl).  Between the SQL tuning and the extra resources, the bulk of these pages are now served between 2 and 10(!!!) times faster!

The lesson here, for anyone with an Apache server creaking and groaning under increased traffic, is to stop waiting and install Lighty.  If your site is PHP-based, you can run PHP via FastCGI from Lighty and do away with Apache altogether.  You can also use Lighty to stream (and “scrub”!) flv and mp4 video files.  (I’m using both of these techniques for the new ArtsConnectEd.)

The only caveat: be careful as you look for examples on the web.  Remarkably, it seems there are many confused webmasters who expect to see a performance boost by putting Lighty behind their struggling Apache.  This will not help at all, and in fact will probably make things worse.  Lighty has to be first in the chain to take the load off Apache.

Enjoy the speed!  I know our server is enjoying the breathing room!

 
by Justin Heideman at 12:31 pm 2007-09-28

David Zicarelli of Cycling ‘74 has posted some initial notes on what’s new and different in Max 5. He writes:

For current users, I would describe Max 5 as analogous to Apple’s transition from Mac OS 9 and Mac OS X. At some point, Apple decided that the technological foundation of their operating system was unsustainable, and required a completely new approach. We came to the same conclusion about many aspects of Max, and especially about the graphical interface — often the most complicated and difficult system within any large application software project.

Max was based on the way the Macintosh worked in 1987. Since then, a lot of things have changed about graphical interfaces, file systems, and pretty much everything else. As a result, the assumptions of 1987 were simply too deeply embedded to keep Max going for another 20 years with the same internal codebase. This became increasingly apparent in recent years, as we seemed be doing nothing but patching Max to keep it working with the latest hardware and software.

I’ve wanted to make Max better, but recently most of my work has been the drudgery of making it operate on OS X, or on Windows, or on Intel processors. While I’ve been doing this, I’ve also been accumulating ideas for what I would do once I got over all this kind of work.

Well, that day did come when we finally finished the Intel port of the OS X version, although it took about a year longer than I thought it would. Once I was able to clear that off my desk, I began organizing my thoughts about what Max requires to survive another 20 years.

He mentions the possibility of Max being available on Linux, changes to the way some objects work, and updates to the GUI. With respect to the GUI, my friend Paul puts it best:

I’m excited to see what their new interface changes will entail. I kind of liked it’s raw ugliness, same way I really liked Slashdot’s 1993 ugliness before they shined it up. Its like a really ugly comfortable couch that you know you should get rid of eventually.

(Obviously, Max is for visual thinkers)

 
by eric ishii eckhardt at 6:23 pm 2006-12-18

I just saw Botanicalls at the ITP Winter Show. It is a cell phone information system that connects people and plants. A person can call a plant on their phone and get information about the species of plant and check if the plant needs watering. On the other hand a plant that needs watering or more sun can call a person up and ask for help. When the plant gets successfully watered it calls again to say thanks.

Botanicalls

 
by Nate Solas at 3:42 pm 2006-12-06

Today I’ve got two good tools for web developers.

Lately I’ve had to write a number of regular expressions for the upcoming mnartists.org calendar – most in Java, and a few in Javascript. In theory a regexp is a regexp no matter the language, but in practice that’s rarely the case. Between these subtle differences and the maddening wait for compiling or reloading a page, it’s clear some sort of live testing environment is useful:

  • Javascript tester – allows replacement testing as well
  • Java tester – really nice in that it gives accurate feedback on your regexp errors and even helps you format the matching text as a java String

If you’re a developer messing with Java or Javascript regular expressions, IMHO it’s worth bookmarking those two pages.

Here’s a Java one – looks complicated, actually pretty straightforward. Anyone care to take a stab at what it does? :)

line = line.replaceAll("\\[([bi])\\]([^\\[]*)\\[/\\1\\]", "<$1>$2</$1>");

(or can you do it better? I get by, but I know my regexps are sometimes clunky at best…)
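If you want to check your answer without a Java compile cycle, here is the same pattern in Python (the function name is just for illustration).  The differences are exactly the kind of thing the testers above help with: Python’s raw strings avoid the doubled backslashes, and the replacement uses \1 where Java uses $1.

```python
import re

# [b]...[/b] or [i]...[/i], where the closing tag must match the opening one
# (\1) and the body contains no further "[" characters.
TAG_PATTERN = re.compile(r"\[([bi])\]([^\[]*)\[/\1\]")


def bbcode_to_html(line):
    """Turn simple [b]/[i] markup into <b>/<i> HTML tags."""
    return TAG_PATTERN.sub(r"<\1>\2</\1>", line)


print(bbcode_to_html("[b]bold[/b] and [i]italic[/i]"))
# <b>bold</b> and <i>italic</i>
```

Because the body excludes "[", mismatched pairs such as [b]oops[/i] are left alone, and nested tags are only converted from the inside out, one level per pass.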

 
