Open data on the cheap
Jimmy Leach, Head of Digital Diplomacy at the Foreign & Commonwealth Office, blogged earlier about a quiet little project his team have rolled out, using extended RSS 2.0 feeds to provide access to the FCO’s travel advice data.
As he says, the key thing for publicly-funded organisations is to get the information out there, which is why the corporate platform, and even corporate social media channels, are just the beginning. RSS feeds are a tried-and-tested technique for ensuring content can reach a wider audience:
Anyone can follow the latest alerts and changes using our travel advice RSS feeds in a standard reader like Google Reader or Netvibes. But you can consider this a call to developers to use our feeds as they want, to make our data useful, to add relevant information, to create visualisations, mobile apps or map-based viewers, incorporating extra machine-readable data about locations, contact details, reviews, ticket booking, all sorts of information and services that you wouldn’t expect a government department to provide, and, hopefully, pulled together in clever and innovative ways that you wouldn’t expect from a civil servant.
The existing travel advice feed on the site contains the alerts, the advice, the news and the embassy details all rolled into one. It does the job. It’s useful and relevant but it is also blunt and we know there are ways we can do it better.
We have started the ball rolling by creating some test feeds containing additional custom elements, so that each element in the feed is generated from a single field from our database of travel advice.
Inspired by Matthew Somerville’s use of iUI to fake an iPhone look and feel to his Train Times application, I’ve put together a little demo of how the FCO’s new feeds might be repurposed, with a little app optimised for iPhones which takes the latest alerts, visualises them on a map, and enables you to get the phone number and opening times for an embassy if you need it. It’s a fairly silly little proof of concept, but hopefully it shows that RSS feeds don’t just have to live in newsreaders. And it’s what Andrea DiMaio has decreed.
The bigger point here is about open data and cost. Most enterprise CMSes can generate RSS feeds, and it’s a technology that almost all developers and webbies feel comfortable with. So without the cost and complexity of building and maintaining a full API to their database, a corporate public sector organisation has been able to support reuse in a quick and simple way. Jimmy has asked for thoughts from developers and others on how the feeds might be cleaned up and made more useful, so do give him your ideas.
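To make the reuse idea concrete, here is a minimal sketch of reading an extended RSS 2.0 feed in Python using only the standard library. The namespace URI, the `ta:` prefix and the element names (`ta:alert`, `ta:embassyPhone`) are invented for illustration; the FCO’s actual custom elements are not reproduced here, so swap in whatever the real feed uses.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: hypothetical namespace and element names, for illustration only.
NS = {"ta": "http://example.org/travel-advice"}

SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0" xmlns:ta="http://example.org/travel-advice">
  <channel>
    <title>Travel advice (sample)</title>
    <item>
      <title>Exampleland</title>
      <link>http://example.org/exampleland</link>
      <ta:alert>Avoid all but essential travel to the north.</ta:alert>
      <ta:embassyPhone>+00 0000 000000</ta:embassyPhone>
    </item>
    <item>
      <title>Samplestan</title>
      <link>http://example.org/samplestan</link>
    </item>
  </channel>
</rss>"""


def parse_items(feed_xml):
    """Return one dict per <item>; custom fields are None when absent."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            # Standard RSS 2.0 elements live in no namespace.
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            # Custom elements are looked up through the prefix map.
            "alert": item.findtext("ta:alert", default=None, namespaces=NS),
            "embassy_phone": item.findtext(
                "ta:embassyPhone", default=None, namespaces=NS
            ),
        })
    return items


if __name__ == "__main__":
    for entry in parse_items(SAMPLE_FEED):
        print(entry["title"], "-", entry["alert"] or "no current alert")
```

Because each custom element maps to a single field, a consumer like this can pick out just the alert or just the embassy phone number without scraping a combined description.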
Filed under Development, Government, Technical | Comments (18)

Newsroom: the backstory
Cast your mind back, if you will, to chilly February, amid the growing crescendo/death spiral of pre-election communications. Neil and his team were finishing off the new corporate website, having shunned friends and family for weekends on end. A member of the senior management team came bounding back from a cross-government meeting where they had been shown this, and, in a nutshell, they wanted one too.
The brief was helpfully loose: make it easier for the media to access the information they needed via a simple link at the bottom of a press notice, without generating a load of extra work for Press Officers. From the Digital team’s perspective, we wanted to increase visibility of our YouTube and Flickr content for media, ensuring that these channels get promoted in every news release. Oh, and the kicker: make something technology-independent that could survive the imminent move from WordPress to SiteCore, without incurring external costs. So we set out to develop something based largely in client-side technologies (i.e. Javascript and CSS) which usefully aggregated corporate announcements, multimedia output and press office contacts for mainstream media and bloggers in a single place – frankly, more of a technical and design challenge than a strategic one, but a fun one nonetheless.
There were half a dozen or so information sources to play with*:
- Press Releases, ministerial speeches (RSS feed)
- Tweets from corporate accounts (RSS feeds)
- Videos on YouTube (RSS feed with multimedia enclosures)
- Flickr photos (API)
- Podcasts on SoundCloud (added by the team later, again, RSS feed)
- Contact details for Press Officers & key facts on policies (static text)
- Email alerts for media to sign up to via GovDelivery
*We also had a plan to add a couple of extras which were built but not yet used. Case studies published elsewhere online were to be tagged using a corporate Delicious account and imported into the newsroom using the RSS feed for the tag. Urgent statements or rebuttals put out by a Press Officer out of hours sometimes aren’t issued as Press Notices in the normal way, so we set up a private Tumblr site to which these could be emailed, which could be embedded or imported into the Newsroom, again via RSS.
The primary tool in our arsenal was the wondrous Feed2JS, which takes an RSS feed and gives you a snippet of Javascript to embed which will render it for you in HTML. It’s free and awesome (and you can even self-host it if you want). This little tool single-handedly renders the majority of the Newsroom content, with the code snippet tweaked slightly so that the <noscript> alternative lets the site degrade fairly gracefully in browsers without Javascript.
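For readers curious about what a feed-to-HTML step involves, here is a rough Python sketch of the same idea: take an RSS feed and turn its items into an HTML list. This is not Feed2JS’s own code or output format; the function name `render_feed_as_html`, the `feed` class name and the sample feed are all made up for illustration.

```python
import html
import xml.etree.ElementTree as ET

# ASSUMPTION: a made-up sample feed, standing in for a press release feed.
SAMPLE = """<rss version="2.0"><channel><title>News</title>
<item><title>Skills &amp; growth speech</title><link>http://example.org/a</link></item>
<item><title>Second notice</title><link>http://example.org/b</link></item>
</channel></rss>"""


def render_feed_as_html(feed_xml, max_items=5):
    """Render up to max_items feed items as an HTML unordered list."""
    root = ET.fromstring(feed_xml)
    lines = ['<ul class="feed">']
    for item in list(root.iter("item"))[:max_items]:
        # Escape feed text so it cannot inject markup into the host page.
        title = html.escape(item.findtext("title", default="(untitled)"))
        link = html.escape(item.findtext("link", default="#"), quote=True)
        lines.append(f'  <li><a href="{link}">{title}</a></li>')
    lines.append("</ul>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_feed_as_html(SAMPLE, max_items=2))
```

Rendering on the server like this would also cover browsers without Javascript, at the cost of losing the paste-in-a-snippet simplicity that made the client-side approach portable between CMSes.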
I also developed a couple of code snippets to render the content of a Flickr account or set as an RSS, HTML or Javascript snippet, and likewise with YouTube – feel free to grab the code from those links if that kind of thing is of use to you.
- Version 0.1 was a good proof of concept, built in an empty page template on our old WordPress site. But there was too much to take in for a notoriously lazy audience.
- Version 0.2 was an improvement, splitting the content into more manageable chunks with a natty Apple-style navigation bar and some concertina sections done in Javascript – but it still felt hard to differentiate the content types on the page.
- Version 0.3 was almost there, introducing some nice little icons for the different content types, using CSS to help visually distinguish the lists, and losing the unnecessary mission statement with some DOM-rewriting to save valuable pixels for this audience. And then we moved to SiteCore and purdah struck, so…
- …Version 1.0, which you can now see in all its glory transferred the code into a new CMS and migrated across a stylesheet. The team added SoundCloud podcasts using its RSS feed, in the same way as the other media types.
Early feedback on the prototype from journalists was positive, the Press Office got a nice-looking tool which required literally zero additional work beyond emailing over their contact list, and Neil got one of his much-loved quick wins – and within SiteCore too. Props for this one to Rhys and Ian in the BIS Digital Communications team.
Photo credit: Victoria Peckham
Filed under Design, Development, Technical | Comments (9)

Public Appointments by RSS
In the words of Directgov:
A public appointment is an appointment to the board of a public body or to a government committee. Around 18,500 men and women hold a public appointment.
The public bodies involved are quite important, including health trusts, museum boards and regulators, some demanding specialist skills in law or social work, but many requiring general common sense and broad experience. So it’s important that the people who fill these posts are of the right calibre and reflect the diversity of our society.
The Cabinet Office has recently revamped its Public Appointments system, and you can now sign up to sophisticated email alerts about public appointments vacancies you might be interested in. As a publisher of vacancies, the central system also has an excellent API, enabling you to extract data feeds from the vacancy database to republish on your own site. There’s even some RDFa in the output should you wish to use that to mark-up the vacancy descriptions.
I’ve just created and added a dead simple RSS feed for the BIS-related public appointments to our homepage. But anyone can grab the code and set it up to generate their own feed, or indeed re-publish the vacancy data far and wide in any format compliant with its licence, in order to help spread the word about the interesting and varied positions available.
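As an illustration of how little is involved in a “dead simple” feed, here is a hedged Python sketch that turns a list of vacancy records into RSS 2.0. It does not call the Cabinet Office API and is not the code mentioned above: the record fields (`title`, `url`, `body`, `closing_date`), the function name and the sample vacancy are all assumptions.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: invented sample record; real data would come from the
# public appointments API, whose response format is not shown here.
SAMPLE_VACANCIES = [
    {
        "title": "Board member, Example Council",
        "url": "http://example.org/vacancies/1",
        "body": "Non-executive role, two days a month.",
        "closing_date": "2010-07-01",
    },
]


def build_rss(channel_title, channel_link, vacancies):
    """Build an RSS 2.0 document (as a string) from vacancy dicts."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires title, link and description on the channel.
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = (
        "Current public appointment vacancies"
    )
    for vacancy in vacancies:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = vacancy["title"]
        ET.SubElement(item, "link").text = vacancy["url"]
        ET.SubElement(item, "description").text = (
            f"{vacancy['body']} (closes {vacancy['closing_date']})"
        )
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    print(build_rss("Public appointments", "http://example.org/", SAMPLE_VACANCIES))
```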
Hurrah for open data and APIs, and above all, hurrah to the Cabinet Office for building one in this case. Thanks chaps.
Filed under Development, Government, Technical | Comments Off

Adding RDFa to a consultation
Recently, I’ve been involved in a project to ensure our consultations support RDFa markup, to make them indexable and reusable by third parties, including Directgov. Without duplicating the quite accessible and useful COI guidance, I thought I’d summarise here the process involved from the perspective of implementing the standard with minimal prior knowledge of the whys and wherefores.
Why bother?
As of Jan 1st 2010, it’s now a mandatory requirement for government sites. But more importantly than that, it’s a Jolly Good Idea to provide a low-maintenance way of enabling other systems and services to grab a list of consultations from your site, and identify the important metadata about them, including the closing date and how to respond. Short term, it will make services like TellThemWhatYouThink and Directgov more useful, but in terms of the bigger picture, it will expose the opportunity to get involved with policymaking to a wider audience, and reduce the hassle for those who are already part of our regular stakeholder group (by making possible new services such as auto email alerts, RSS feeds, cross-government updates and so on).
What’s involved?
RDFa offers a simple way to add meaningful information to existing web pages, which can be extracted easily by software (as opposed to hit-and-miss ‘scraping’ of regular web pages). As a lay person, I’d say there are three key principles which I can articulate:
- Be unobtrusive and minimalistic: taking this approach lets you add extra items to pages which aren’t seen by regular browsing visitors, but which are accessible to software robots looking for them. It’s also not ‘an extra thing’ to maintain and serve like an RSS feed, so reduces risk, in theory.
- Offer clean data: through being consistent in how data about the consultation is described, the idea is that RDFa helps to extract very clean information about the consultation – for example, an unambiguous closing date, a response email address, an exact postcode, all in formats which can then be used in other ways (plotted on a map, listed on a calendar, turned into a mailform on a website etc)
- Extend existing conventions: the most complicated aspect of implementing this particular specification is that the authors have gone out of their way to find existing wheels rather than reinvent their own. So they use Dublin Core metadata to describe authors and organisations; vCard to describe response contact information; plus nods to DBPedia and FOAF (Friend Of A Friend) to support these major semantic web initiatives. Only for the gaps where specific consultation information needs to be marked up is there a new standard introduced, using the namespace (prefix) argot.
In a nutshell, the process involves tweaking the template for your consultation pages, adding extra metadata elements and attributes. This is only as easy or hard as your CMS makes it. It’s important that it’s right though – even a few ‘broken bits’ could render the page useless to a software robot trying to extract data from it.
How to do it
Read the COI guidance (and give it to your developer), which is the most comprehensive guide, with useful illustrated examples. There’s also a worked up HTML page showing how this works, and of course you’re welcome to look at ours (which I *think* are right, based on feedback from the gurus).
As an example (but again, you should read the official guidance) I found I needed to work through the following:
- ensure we have a single page per consultation
- amend the DOCTYPE, if you’re using something like the standard XHTML strict/transitional version. It needs to tell requesters of the page that it contains RDFa
- add some attributes to the <html> element, highlighting the namespaces (vocabularies) you’re referencing in the document
- add Dublin Core metadata elements/attributes to your page <head> element if they’re not there already
- ensure we have a wrapper <div> around the consultation information which again references the namespaces (vocabularies) you’re using. This also identifies the name of the organisation publishing the document
- add some Dublin Core metadata attributes as <spans> within this <div> identifying this as a consultation
- add some Dublin Core attributes to key bits of the HTML, such as the consultation title, start date, closing date and description, marking these as such – and in the case of dates, ensuring there’s a machine-readable data format value in the attribute. Also add a unique identifier – a reference number – to each consultation (not something we’d done routinely before)
- ensure the contact details for responses are carefully structured using vCard format, with separate ‘Full Name’, ‘Street Address’, ‘Locality’ and ‘Post Code’ elements, suitably marked-up with attributes. Since vCard doesn’t cover the specific case of a consultation with an email reply address, for example, these elements are marked up with the new argot: namespace attributes
- add Dublin Core-based attributes describing the file attachments – the consultation document itself, and any related ones such as appendices or Impact Assessments
UPDATE: in retrospect, it was foolish to attempt a blog post about code without some code examples. I’ve tried and failed to find a half-decent code syntax highlighter plugin for WordPress, but the following couple of screenshots hopefully illustrate the before and after situations for the contact information part of a consultation:
Before, plain HTML:
After, with RDFa added (and marked up more semantically as a list item within the consultation metadata)
What help is available?
I worked from the examples given in the COI guidance and the pioneers in this at the Ministry of Justice. The COI Digigov team are your allies in helping to implement this, and should be able to answer queries and/or direct you to sources of further implementation advice and support.
In terms of online tools, you can see whether your RDFa is visible to suitably-equipped applications using Mark Birbeck’s tool or bookmarklet, if you prefer (and he should know; he invented RDFa).
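As a rough idea of what a “software robot” sees, here is a small Python sketch that pulls `property` values out of RDFa-style markup. It is a deliberately naive stand-in, not a conforming RDFa parser and not a substitute for the tools above. The sample markup is invented: `dc:title` and `dc:identifier` are Dublin Core terms, but `ex:closingDate` and its namespace are hypothetical placeholders, not the real argot vocabulary.

```python
from html.parser import HTMLParser

# ASSUMPTION: invented sample markup; the ex: namespace is a placeholder.
SAMPLE = """
<div xmlns:dc="http://purl.org/dc/terms/"
     xmlns:ex="http://example.org/consultation#">
  <h1 property="dc:title">Consultation on example policy</h1>
  <span property="dc:identifier">EX-2010-001</span>
  <span property="ex:closingDate" content="2010-06-30">30 June 2010</span>
</div>
"""


class PropertyCollector(HTMLParser):
    """Collect (property, value) pairs from RDFa-style attributes.

    Simplification: the value is the content attribute if present,
    otherwise the first non-blank text inside the element.
    """

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._open = None  # property still waiting for its text value

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("property")
        if prop is None:
            return
        if "content" in attrs:
            # Machine-readable value wins over the human-readable text.
            self.pairs.append((prop, attrs["content"]))
        else:
            self._open = prop

    def handle_data(self, data):
        if self._open is not None and data.strip():
            self.pairs.append((self._open, data.strip()))
            self._open = None

    def handle_endtag(self, tag):
        # Give up on a property whose element closed without any text.
        self._open = None


def extract_properties(markup):
    collector = PropertyCollector()
    collector.feed(markup)
    return collector.pairs


if __name__ == "__main__":
    for prop, value in extract_properties(SAMPLE):
        print(prop, "=", value)
```

The closing date shows why the machine-readable attribute matters: visitors read “30 June 2010”, while software gets an unambiguous `2010-06-30`.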
Good luck!
P.S. If you Know About This Stuff and feel I’m giving duff advice here, please drop me a line in the comments or via the contact form and I’ll correct. Thanks.
Filed under Development, Government, Technical, Uncategorized | Comments (10)

Unleashing a Government response
A quick one – today at work we’re launching ‘Unleashing Aspiration’: the Government’s response to the review of access to the professions, which was led by Rt Hon Alan Milburn MP and reported last year.
The digital brief was, on the face of it, not massively exciting – it’s a long document, covering 88 recommendations, with a small but informed audience of policy, media and stakeholder visitors – many of whom will go through the whole document in detail almost however we publish it.
But this kind of document does set an interesting challenge for online presentation – it’s really as close as policy documents get to a faceted classification in information design terms, with responses to each recommendation organised by theme, by audience affected, and by the Departments who are leading on each – and with lots of embedded links to other initiatives. The policy team, though tight on resource, are interested in following the comment and discussion around each of the recommendations.
So it’s also a natural fit for WordPress, where the Themes are defined as WordPress categories, and we use WordPress tags to indicate audience and lead department. Commenting is built-in, as is the facility for tag and category descriptions, which provide a space for useful ‘virtual chapter’ overviews. By offering the ability to cut the document up in so many ways, it provides a variety of accessible entry points for different audiences, which is promising raw material for digital engagement outreach, for example to student communities or the third sector.
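The “cut the document up in so many ways” idea can be sketched as a small faceted index. This is plain Python rather than WordPress code, and the records, theme names and department names below are placeholders, not content from the actual response.

```python
from collections import defaultdict

# ASSUMPTION: placeholder records; the real document covers 88 recommendations.
RESPONSES = [
    {"id": 1, "theme": "Theme A", "audiences": ["students"],
     "departments": ["Dept X", "Dept Y"]},
    {"id": 2, "theme": "Theme B", "audiences": ["students", "employers"],
     "departments": ["Dept X"]},
    {"id": 3, "theme": "Theme A", "audiences": ["schools"],
     "departments": ["Dept Y"]},
]


def build_facets(responses):
    """Index response ids by theme (one per response, like a category)
    and by audience and department (many per response, like tags)."""
    facets = {
        "theme": defaultdict(list),
        "audience": defaultdict(list),
        "department": defaultdict(list),
    }
    for response in responses:
        facets["theme"][response["theme"]].append(response["id"])
        for audience in response["audiences"]:
            facets["audience"][audience].append(response["id"])
        for department in response["departments"]:
            facets["department"][department].append(response["id"])
    return facets


if __name__ == "__main__":
    facets = build_facets(RESPONSES)
    print("For students:", facets["audience"]["students"])
    print("Led by Dept X:", facets["department"]["Dept X"])
```

Each facet is an entry point into the same set of responses, which is roughly what category and tag archive pages provide out of the box.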
It’s not going to win any design awards – it’s intentionally quite neutral and clean with just some simple colour-coding – but I think it’s an unusual and potentially helpful approach to enable readers to get into a document of this kind through different routes. It’s also been a good training exercise for the team – props to Alistair Reid for getting his head around the anatomy of WordPress in barely a week, and doing rather more cut-and-paste than is strictly healthy.
Filed under Government, Technical, WordPress | Comments (5)






