Some brief thoughts on borrowing content

One of the great things about WordPress as a publishing platform is the way it deals with incoming links, comments and trackbacks. Linking is the currency of the web, and WordPress gives you the maximum possible intelligence on who is linking to you, where traffic is coming from and who might be citing your posts.

I noticed a new trackback this morning citing the Street Museum interview I did a while ago, and as normal followed it to source to see what the person had to say. It turned out that what they had to say was rather familiar: a complete copy and paste of the original post, word for word and image for image.

My first response was one of irritation, and I posted asking the Twitterverse what they thought. Their answers ranged from “hunt them down…” to “don’t worry about it” but I think some of the more subtle responses are worth reflecting on:

1) This blog is CC licensed. I also bang on about free content being better content. Thus, from a purely legal perspective, I’m absolutely allowing people to distribute, copy, display, so my irritation is unfounded;

2) The devil is in the detail. This isn’t a spam blog but a hand-curated one. I’ve had entire blocks of content “borrowed” before, and yes, this irritates the **** out of me. This is different.

3) As James Clay pointed out, “he is using the blog as an online store of stuff he finds”, and that’s absolutely right – it is a different mode of use, albeit one quite hard to quantify.

4) In a perfect world, the person copying the post would have got in contact via email, just out of courtesy. But they didn’t, and ultimately, I’m not going to lose any sleep over that.

5) The notion of “attribution” is important and subtle here. I use a LOT of CC images when I do presentations, and have worried about *how* to attribute effectively – a link on the image? a table of links at the end of the presentation (my preferred method to date)? an email? In this particular case, it is obvious that the article was written by me and borrowed from this blog – the signposting is pretty clear.

The long and short: have it, it’s only fair :)

The paywall experiment

Shortly the Times will begin its Great Paywall Experiment, locking out all but paid (£1 a day, £2 a week) subscribers.

It is very easy to laugh at Murdoch for taking this approach, but actually it’s a pretty good thing that someone has the balls/stupidity/temerity/whatever to do it. Many people – me included – have spent a lot of time over the past few years debating where the value is now that the volume of free stuff has increased a millionfold. Like many of my peers, I’ve long been convinced of the value of scale rather than scarcity, but it’s been a long and often hard sell to many who still believe in traditional models, and we have all been scrabbling about looking for solid financial evidence to support one or other argument.

The Times approach will be a pretty effective litmus test of applying a traditional business model to a non-traditional environment. It will also be an interesting case study in what social media – and, in fact, the web – actually means when it comes to virality and the Long Tail. Not only does the locking out prevent Google from spidering content, it also means that bloggers, Twitter users – in fact anyone using the web to comment – simply won’t be able to. Any links pointing to content in the Times site will end up at the sign-up page: essentially a pretty effective total blackout of in-linking. And if you believe what most of us have been saying since the dawn of the web – that linking is the lifeblood of this environment – then it is hard to understand how this model is going to work.

On Radio 4 this morning we heard James Harding, editor of the Times, talking about the move. Some of what he said needs applauding: he talked about how in this online world we should appreciate the value of journalism, for example. This wonderful post by Kevin Kelly talks about the need for quality editorial control and journalism; certainly, the more STUFF there is shouting at us, the more we’re going to appreciate the people who can write well or help us sift our way through the quantity.

On the other hand, he also came out with some pretty startling stuff about “shop windows” (the analogy: window shoppers are the “shallow” reader, “real” shoppers are the ones who will pay) and how the Times is “doing exactly what we do with the printed paper”. Looking at the new designs for the site, the immediate response is “Hey, a newspaper. On a screen”. The visual analogy is so strong, you can’t help but feel that someone just scanned the paper version and uploaded it. And if we’ve learnt anything over the last few years, it is that online content is more than just a hyperlinked book…

Over more than a decade of working with content-rich organisations, I’ve pretty much come to the conclusion that traditional models can’t be mindlessly shoe-horned into this new paradigm where a copy-paste-send makes us all content pirates. Charging for access to content requires a lot of thought: it only works in specific, carefully honed environments where the value of content displaces cost in ways that are more subtle than ever before. Nowadays it is about location, mobility, usability, time to market, update frequency and so many other factors. I can’t help but feel that The Times is just leapfrogging all of this subtlety on the whims of a wrinkled old CEO with a bee in his bonnet. Either way, it’s going to be an interesting experiment.

Here’s hoping the results are shared more widely than the content…

Managing and growing a cultural heritage web presence

I’m absolutely delighted (and only slightly scared) to announce that I’ve been commissioned to write a book for Facet Publishing.

Ever since I started working with museums online, I’ve felt that there is a need for strategic advice to help managers of cultural heritage web presences. There are of course hundreds of thousands of resources if you’ve got technical questions, but not many places where you can ask things like “how should I build my web team and structure my budget?” or “how do I write a strategy or business plan?”.

Facet approached me in July asking whether I’d be interested in authoring something for them, and this seemed like the ideal opportunity to try and answer some of these questions.

My (draft) synopsis is as follows:

This book will provide a guide for anyone looking to build or maintain a cultural heritage web presence. It will aim to cater both to those who are single-handedly trying to keep their site running on limited budget and time as well as those who have big teams, large budgets and time to spend.

As well as describing the strategic approaches which are required to develop a successful online presence, the book will contain data and case studies on current practice from large and small cultural heritage institutions. This research will help give the reader an insight into how these institutions manage their websites as well as providing hints and tips on best practice. It will have an accompanying web presence which will provide template downloads and other up-to-date information including links and white papers.

As you’ll see, I have no intention of trying to do this all by myself – over the coming year I’m going to be on the phone to many of you (hide now!) asking how you do what you do, and compiling this into what I hope will be a useful guide.

If you have any ideas about what I should include, or the questions I should be asking – please do get in touch either via this blog or on Twitter at @m1ke_ellis!

Pushing MRD out from under the geek rock

The week before last (30th June – 1st July 2009), I was at the JISC Digital Content Conference having been asked to take part in one of their parallel sessions.

I thought I’d use the session to talk about something I’m increasingly interested in – the shifting of the message about machine readable data (think APIs, RSS, OpenSearch, Microformats, LinkedData, etc) from the world of geek to the world of non-geek.

My slides are here:

[slideshare id=1714963&doc=dontthinkwebsitesthinkdatafinal-090713100859-phpapp02]

Here’s where I’m at: I think that MRD (That’s Machine Readable Data – I couldn’t seem to find a better term..) is probably about as important as it gets. It underpins an entire approach to content which is flexible, powerful and open. It embodies notions of freely moving data, it encourages innovation and visualisation. It is also not nearly as hard as it appears – or doesn’t have to be.
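To give a flavour of the “not nearly as hard as it appears” point, here is a minimal sketch using only the Python standard library. The record, its field names and the `example.org` URL are all made up for illustration; the idea is simply that once content lives as structured data, offering it as JSON or as an RSS-style item is a few lines each.

```python
import json
import xml.etree.ElementTree as ET

# A hypothetical collection record; field names and URL are illustrative only.
RECORD = {
    "title": "Bronze brooch",
    "link": "http://example.org/objects/1",
    "description": "Roman, 2nd century",
}


def to_json(record):
    """Serialise the record as a JSON string."""
    return json.dumps(record, sort_keys=True)


def to_rss_item(record):
    """Serialise the same record as an RSS-style <item> element."""
    item = ET.Element("item")
    for field in ("title", "link", "description"):
        ET.SubElement(item, field).text = record[field]
    return ET.tostring(item, encoding="unicode")


print(to_json(RECORD))
print(to_rss_item(RECORD))
```

The same dictionary feeds both outputs, which is really the whole argument: one source of content, several machine readable routes out.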

In the world of the geek (that’s a world I dip into long enough to see the potential before heading back out here into the sun), the proponents of MRD are many and passionate. Find me a Web2.0 application without an API (or one “on the development road-map”) and I’ll find you a pretty unusual company.

These people don’t need preaching at. They’re there, lined up, building apps for Twitter (to the tune of 10x the traffic which visits the site itself), developing a huge array of services and visualisations, graphs, maps, inputs and outputs.

The problem isn’t the geeks. The problem is that MRD needs to move beyond the realm of the geek and into the realm of the content owner, the budget holder, the strategist, for these technologies to become truly embedded. We need to have copyright holders and funders lined up at the start of the project, prepared for the fact that our content will be delivered through multiple access routes, across unspecified timespans and to unknown devices. We need our specifications to be focused on re-purposing, not on single-point delivery. We need solution providers delivering software with web APIs built in. We need to be prepared for a world in which no-one visits our websites any more, instead picking, choosing and mixing our content from externally syndicated channels.

In short, we now need the relevant people evangelising about the MRD approach.

Geeks have done this well so far, but now they need help. Try searching on “ROI for API’s” (or any combination thereof) and you’ll find almost nothing – very little evidence outlining how much APIs cost to implement or what cost savings you are likely to see from them, little on how they reduce content development time, and few guidelines on how to deal with syndicated content copyright issues.

Partly, this knowledge gap is because many of the technologies we’re talking about are still quite young. But a lot of the problem is about the communication of technology, the divided worlds that Nick Poole (Collections Trust) speaks about. This was the core of my presentation: ten reasons why MRD is important, from the perspective of a non-geek (links go to relevant slides and examples in the slide deck):

  1. Content is still king
  2. Re-use is not just good, it’s essential
  3. “Wouldn’t it be great if…”: Life is easier when everyone can get at your data
  4. Content development is cheaper
  5. Things get more visual
  6. Take content to users, not users to content (“If you build it, they probably won’t come”)
  7. It doesn’t have to be hard
  8. You can’t hide your content
  9. “We” really is bigger and better than “me”
  10. Traffic

All this is, is a starter for ten. Bigger, better and more informed people than me probably have another hundred reasons why MRD is a good idea. I think this knowledge may be there – we just need to surface and collect it so that more (of the right) people can benefit from these approaches.

Scraping, scripting, hacking

I just finished my talk at Mashed Library 2009 – an event for librarians wanting to mash and mix their data. My talk was almost definitely a bit overwhelming, judging by the backchannel, so I thought I’d bang out a quick blog post to try and help those I managed to confuse.

My talk was entitled “Scraping, Scripting and Hacking your way to API-less data”, and intended to give a high-level overview of some of the techniques that can be used to “get at data” on the web when the “nice” options of feeds and APIs aren’t available to you.

The context of the talk was this: almost everything we’re talking about with regard to mashups, visualisations and so on relies on data being available to us. In the cutting edge of Web2 apps, everything has got an API, a feed, a developer community. In the world of museums, libraries and government, this just isn’t the case. Data is usually held on-page as html (xhtml if we’re lucky), and programmatic access is nowhere to be found. If we want to use that data, we need to find other ways to get at it.
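As a rough illustration of what “getting at” on-page data can look like, here is a small sketch using only Python’s standard library `html.parser`. The sample page, the `record` markup and the `title` class are invented for the example; a real page will need its own rules, and this is not the specific technique from the slides, just one simple way of doing it.

```python
from html.parser import HTMLParser


class TitleScraper(HTMLParser):
    """Collect the text of every <h2 class="title"> element on a page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "h2" and ("class", "title") in attrs:
            self._in_title = True
            self.titles.append("")

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.titles[-1] += data


def scrape_titles(html):
    """Return the stripped text of each matching heading in the HTML string."""
    parser = TitleScraper()
    parser.feed(html)
    parser.close()
    return [title.strip() for title in parser.titles]


# A made-up page standing in for a collections site with no API or feed.
SAMPLE_PAGE = """
<html><body>
  <div class="record"><h2 class="title">Bronze brooch</h2><p>Roman, 2nd century</p></div>
  <div class="record"><h2 class="title">Clay pipe</h2><p>London, c. 1700</p></div>
</body></html>
"""

print(scrape_titles(SAMPLE_PAGE))
```

The obvious weakness is that the scraper breaks as soon as the page markup changes, which is exactly why a feed or API is the “nice” option when one exists.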

My slides are here:

[slideshare id=1690990&doc=scrapingscriptinghacking-090707060418-phpapp02]

A few people asked that I provide the URLs I mentioned together with a bit of context. Many of the slides above have links to examples, but here’s a simple list for those who’d prefer that:

Phew. Now I can see why it was slightly overwhelming :)

For the webs2, please follow the crowd

The last talk I gave – in December 2008 – was at Online Information and titled “What does Web2.0 DO for us?”.

Here are the slides (my third slide deck to get “homepaged” on slideshare…yay…):

[slideshare id=812457&doc=whatdoesweb2doforusmikeellisv12-1228296734998366-8&w=425]

This one was attempting to focus on Web2.0 in the Enterprise. Frankly, “The Enterprise” is a subject which fills me with fear, dread and trepidation, but the movement of Web2.0 into that space is probably inevitable as sales teams around the world spot another opportunity and sell it out to cash-rich bods wanting to “be innovative” in the name of their behemoth of a company. It’ll be interesting to watch.

The talk was popular, which I’m pleased about. Online Information is a funny old conference – the halls are stacked with basically the same company replicated about 200 times: reasonably bad CMS systems with reasonably bad sales people trying to sell to a reasonably badly informed market of people. I sound over-rude, but I have to be honest – I last went in about 2003 and absolutely nothing has changed. Which can’t be good in the tech field, right?

My slides were supposed to be about one thing (why the social web is important in “The Enterprise”, and why “The Enterprise” should take it seriously) – in the end, I actually focused on why “web2” is important to people rather than as a “thing” in abstract. I see the connecting of people with other people as reason for believing in the social web as a sound platform upon which to build any content. I believe this engagement is key to bringing (heritage) content to the foreground; furthermore, I think that even though web2.0 has been hyped to death, we should continue to believe in what “the social web” means. Mainly, we should believe this because the social web is about people and connections and as such has enormous importance to us as social, connected animals.

One of the problems with talking about “Web2.0” is that the phrase carries an implicit weight with it: as soon as there is a count attached, you’re naturally looking for the current one to expire – for “Web2” to be replaced by “Web3” and shortly after that, “Web4”. Useful though “Web2.0” is as a phrase, I’m with the commentators now who suggest we talk about “the web”, or – my preference – “the social web”. Not because it is any less important, but because it is more so.

Incidentally, earlier today I was researching some stuff for a keynote I’m due to give in The Hague later in February (more details soon…) and used Google Trends to check on the phrase “web2.0”. It’s interesting to note that it reached its peak during Q4 2007, and has since dropped off in popularity:


Web2.0 on Google Trends

You’ll see immediately that this follows the Gartner Hype Curve prediction (or at least the beginning of it) – it’ll be interesting to watch in the coming months and years how the curve settles into a dampened “plateau of productivity”. (I’d also be interested if anyone can figure out why there is a gap between 2004 when O’Reilly first mentioned the phrase and mid-2005…)

For the graph junkies, here’s the same period for the phrase “social web”:


"Social Web" on Google Trends

So. That’s the hype. Maybe now we can get on with producing some astonishing, user-focused content..

Where the F have you been?

It’s been a long while (possibly the biggest gap since the launch of this blog..) since my last post – over a month.

This is unprecedented for me, and I’ve had four or five emails (thanks!) asking me why. I’ve always dodged around with an answer, not because I was trying to avoid some horrific truth but because until the last couple of days I simply haven’t had the brain time to devote to the reasons.

The first part of the answer to “Mike, where the F have you been?” is this: I’ve been busy keeping balls in the air: another presentation (What does Web 2.0 DO for us?) which I delivered to a roomful at Online Information 2008 on 4th Dec…the beginning stages of writing a module for the new Digital Heritage MA/MSc at Leicester University – an opportunity which I’m hugely excited about, and not a little bit scared too…continuing work on three side-projects, none of which I can talk about just yet…development and writing for a corporate blog for internal comms…a desktop notification app…not to mention the hectic craziness of helping look after a 2-boy young family. Etcetefuckinra.

All of which is terribly boring, TBH, because if there’s one thing we all know about each other it is this: we’re all much too busy. In fact a corporate stat somewhere a while ago said that everyone believes themselves to be busier than 90% of everyone else. This is, of course, also true for me.

This leads to the second part of the answer: I’ve felt for a long time that the landscape of blogging has been changing considerably, particularly with lifestreaming now a part of our daily diet. I’ve blogged about noise on various occasions, and I’ve also noticed a huge shift in my own reading habits – a shift which has an obvious effect on my writing habits, too. I’m less interested in “blog post as news”, instead preferring longer, deeper, better written pieces like the beautifully-crafted Business Requirements Are Bullshit. I’m me – you’re you – but the important thing for me is that I write in a way which complements the medium and as much as possible brings some kind of value to those of you who have given up some of your valuable time to read what I have to say.

This brings me neatly on to the third part which was summed up in a conversation with Brian Kelly and Paul Walk over a post-work pint recently: why the F do we all blog, anyway? We were talking at the time about Paul’s much-commented post on blog awards. Paul is similar to me – and different to Brian – in that he blogs as a hobby and not as a job. Paul runs his blog under his own name; Brian runs his (albeit not “officially”) under “UKWebFocus”. Brian has a series of blog policies and sticks closely to his particular topics; Paul could write about his washing powder if he so chose. I’ve always been clear (both to my readers and employers) that this isn’t a “work blog” – but it isn’t a “personal” one, either.

I started Electronic Museum as a way of reflecting on technology in the museum space. More than a year on and I’m interested in innovation, in technology ubiquity, in sharing data, in real people, in the value of attention data, in the user as focus. All of these call back to what makes museums unique, in my opinion, and it is in these arenas that I personally feel the battles for online content will be (or are being) fought and won. The point is it isn’t just a conversation about museums any more. And really, it never has been, in this always-on, radically-connected crazy internetwebthing we spend so much time staring at and talking about.

Much as I’ve carved a niche here with museum professionals who seem to value what I have to say, I’m also fascinated by the irony that nowadays it isn’t niche professionals that we need any more. Curators (museum and otherwise) – IMO – aren’t anything at all without the vision to see that what they know needs communicating in new, challenging ways; ways that may well undermine their professionalism purely because the social network they engage with has dug up someone who knows better than them. Content owners need to start to understand that value simply can’t be measured by “visits” when many people are out there having experiences with their content and not within the walled garden of their site. Technologists have got to stop hiding behind PEBCAC and start engaging with the people that are currently alienated by technology.

So what – exactly – am I saying?

I guess it is this: you’ll notice a shift over the coming weeks and months as I write about more of the things I’m doing outside of the museum space: my dabblings with the Arduino, for instance, the various other projects I’m continuously working on, a secretish partnership I’ll be able to talk about in January, and so on. I hope I won’t break the niche I’ve created – I hope that if you are a “museum professional” then you’ll continue to hang out here – I think what I have to say will be interesting, or at least mildly entertaining, whoever you are.

If you love something, set it free

Last week, I had the privilege of being asked to be one of the keynote speakers at a conference in Amsterdam called Kom je ook?. This translates as “Heritage Upgrade” and describes itself as “a symposium for cultural heritage institutions, theatres and museums”.

I was particularly excited about this one: firstly, my partner keynoters were Nina Simon (Museum Two) and Shelley Bernstein (Community Manager at the Brooklyn Museum) – both very well known and very well respected museum and social web people. Second (if I’m allowed to generalise): “I like the Dutch” – I like their attitude to new media, to innovation and to culture in general; and third – it looked like fun.

Nina talked about “The Participatory Museum” – in particular she focussed on an oft-forgotten point: the web isn’t social technology per se; it is just a particularly good tool for making social technology happen. The fact that the online medium allows you to track, access, publish and distribute is a good reason for using the web BUT the fact that this happens to populate one space shouldn’t limit your thinking to that space, and shouldn’t alter the fact that this is always, always about people and the ways in which they come together. The changing focus of the museum, moving from being a content provider to being a platform provider, also rang true with me in so many ways. Nina rounded off with “ten tips for social technology” (slide 12 and onwards).

Shelley gave another excellent talk on the incredible work she is doing at the Brooklyn Museum. She and I shared a session on Web2 at Museums and the Web 2007, and once again it is the genuine enthusiasm and authenticity which permeates everything she does which really comes across. This isn’t “web2 for web2’s sake” – this is genuine, pithy, risky, real content from enthused audiences who really want to take part in the life of the museum.

My session was on setting your data and content free:

[slideshare id=768086&doc=mikeellisifyoulovesomethingsetitfreefinal-1227110930707512-9&w=425]

Hopefully the slides speak for themselves, but in a nutshell my argument is that although we’ve focussed heavily on the social aspects of Web2.0 from a user perspective, it is the stuff going on under the hood which really pushes the social web into new and exciting territory. It is the data sharing, the mashing, the APIs and the feeds which are at the heart of this new generation of web tools. We can resist the notion of free data by pretending that people use the web (and our sites) in a linear, controlled way, but the reality is we have fickle and intelligent users who will get to our content any which way. Given this, we can either push back against freer content by pretending we can lock it down, or – as I advocate – do what we can to give users access to it.