This is the second of a pair of blog posts about the third Open Data Camp, which was held in Bristol on 14 and 15 May 2016. See Open Data Camp Hat-Trick (Day 1) for the first post. Observant readers may notice there are nearly eight months between the two. Sorry about that!
I’d responded to a tweet from Simon Perry (@simonperry) on 14 May 2016 with a “#WLTM folk interested in text as data: extracting entities from unstructured text etc.” So my second day didn’t start with idle meandering and chatting.
Instead I had a great conversation with @jargonautical (Lucy Knight) about the potential of ‘text as data’. This is an area of particular interest to me. Many organisations claim to ‘do information management well’ when in fact what they do is bury information inside documents and then do document management (and usually even that pretty badly). So you can’t find all the times a particular project has been discussed in Investment Committee meetings because the minutes and attachments are all in folders named after the month of the meeting with no metadata about the content. I’d like to make it easy to parse the text, extract entities (project names, job titles, cash amounts) and then map connections in multiple different ways. Turned out that’s the sort of thing Lucy would like to do too, so maybe it’s something worth pursuing. Imagine being able to do that with all the corporate PDF documents currently released as open data.
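To make the idea a little more concrete, here is a minimal sketch of what "extract entities and map connections" might look like at its very simplest. Everything in it is an assumption for illustration: the sample minutes are invented, the naming convention "Project <Name>" is hypothetical, and plain regular expressions stand in for the proper entity-extraction tooling a real attempt would need.

```python
import re
from collections import defaultdict

# Invented sample minutes, keyed by meeting month (the folder names in the
# scenario above). Real input would be text pulled out of the documents.
MINUTES = {
    "2016-03": "Project Kestrel was approved with a budget of £1.2m.",
    "2016-04": "The Finance Director queried the Project Kestrel overspend "
               "of £250,000. Project Heron was deferred.",
}

# Assumption: projects are written as "Project <Capitalised name>".
PROJECT_RE = re.compile(r"\bProject\s+[A-Z][a-z]+\b")
# Cash amounts such as £250,000 or £1.2m (optional k/m/bn suffix).
CASH_RE = re.compile(r"£\d[\d,]*(?:\.\d+)?(?:\s?(?:k|m|bn))?\b")


def extract_entities(text):
    """Return the project names and cash amounts mentioned in one document."""
    return {
        "projects": sorted(set(PROJECT_RE.findall(text))),
        "amounts": CASH_RE.findall(text),
    }


def index_projects(minutes):
    """Map each project name to the meetings in which it was discussed."""
    index = defaultdict(list)
    for meeting, text in minutes.items():
        for project in extract_entities(text)["projects"]:
            index[project].append(meeting)
    return dict(index)


if __name__ == "__main__":
    for project, meetings in index_projects(MINUTES).items():
        print(project, "->", ", ".join(meetings))
```

Even this toy version answers the "which meetings discussed this project?" question that folder-by-month filing can't. Job titles and messier naming would need something more capable than regular expressions.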
When the pitching started I stopped sitting on my hands and suggested a session on Linked Data stories as a follow-up to one of the discussions on Saturday. I wanted to try to draw out some stories that would illustrate the value and potential of Linked Data. To be honest I was in search of the compelling ‘elevator pitch’, but had been warned that it probably didn’t exist yet. I think we had a good discussion and there are some useful notes in the session record too (thanks to whoever scribed them). There was a neat little tale about how having Linked Data on the relationship between postcodes and the geographical areas used by statisticians (Lower Layer Super Output Areas, or LSOAs) allowed someone to create a tool for DCLG: a user uploads a list of up to 10,000 postcodes and gets back the associated Index of Multiple Deprivation data. This is significantly less work than downloading two CSV files, of around 1.5M and 20k rows, and then matching them.
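For comparison, the do-it-yourself CSV matching that the tool spares people might look roughly like the sketch below. This is not how the DCLG tool works. The column names (`postcode`, `lsoa_code`, `imd_decile`), the postcodes, the area codes and the decile values are all made up for illustration, and the real files would be read from disk rather than from inline strings.

```python
import csv
import io

# Invented stand-ins for the two downloads: a postcode-to-LSOA lookup
# (the ~1.5M row file) and an LSOA-to-IMD table (the ~20k row file).
POSTCODE_CSV = "postcode,lsoa_code\nZZ1 1AA,LSOA_A\nZZ1 1AB,LSOA_A\nZZ2 9ZZ,LSOA_B\n"
IMD_CSV = "lsoa_code,imd_decile\nLSOA_A,3\nLSOA_B,8\n"


def normalise_postcode(postcode):
    """Upper-case and strip all whitespace so 'zz1 1aa' matches 'ZZ11AA'."""
    return "".join(postcode.upper().split())


def load_lookup(csv_file, key_field, value_field, key_transform=str.strip):
    """Read a CSV into a dict of key_field -> value_field."""
    reader = csv.DictReader(csv_file)
    return {key_transform(row[key_field]): row[value_field].strip() for row in reader}


def imd_for_postcodes(postcodes, postcode_to_lsoa, lsoa_to_imd):
    """Two-step join: postcode -> LSOA -> IMD value (None if no match)."""
    results = {}
    for postcode in postcodes:
        lsoa = postcode_to_lsoa.get(normalise_postcode(postcode))
        results[postcode] = lsoa_to_imd.get(lsoa) if lsoa else None
    return results


postcode_to_lsoa = load_lookup(
    io.StringIO(POSTCODE_CSV), "postcode", "lsoa_code", normalise_postcode
)
lsoa_to_imd = load_lookup(io.StringIO(IMD_CSV), "lsoa_code", "imd_decile")

if __name__ == "__main__":
    print(imd_for_postcodes(["zz1 1aa", "ZZ29ZZ", "AB1 2CD"],
                            postcode_to_lsoa, lsoa_to_imd))
```

It isn't much code, but every user has to find the right two files, work out which columns join, and cope with postcode formatting quirks. That is the work the Linked Data version does once, for everybody.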
There were also some interesting, if slightly off-topic, insights from the health domain about using trusted third parties to match sensitive data-sets and return correlated data without the identifiers that had been used. And we had some fairly deep philosophical comments: “The problem isn’t with the technology, it’s with the process of talking about things.”
There were some suggested useful actions at the end of the session:
- Encourage those with expertise or authority in a domain to create and publish Linked Data vocabularies;
- Don’t seek too hard for ‘the standard’ approach; try instead to focus on the dominant common practice;
- Find ways to tell ‘simple stories’ about how Linked Data can make a difference.
The two sessions after lunch were, for me, a little different to the usual unconference fare. I spent the first one in a ‘skills swap’ breakout, picking up some tips from @jargonautical on the basics of how to use APIs. And the final session of the day was a ‘plenary goldfish bowl conversation’, inspired by a pitch from @owenboswarva, asking “Where have all the user groups gone?” This took the demise of the Open Data User Group, the Public Sector Transparency Board and other similar bodies as the backdrop and opened up a discussion on how we might improve advocacy for open data. I have to be honest and say the format didn’t really work for me: not as coherent as a speaker-led talk with questions; but not as open and fluid as a typical unconference discussion.
So that was ODCamp3. Bristol was a great place to spend the weekend and the Watershed was a smashing venue. There’s more information available if you want to find the session grid, links to photographs and 15 or so other blogs. I’d like to say a big thank you to all the organisers, with a special mention for Jen of Networked Planet, who put a tremendous amount of energy into the event. And now it’s time to look forward to number four.