Gov 2.0 Expo: Live Blogging #2

Session Title, "Mission Possible: Putting Government Linked Open Data on the Web"
Sandro Hawke (W3C), John L. Sheridan (Information Policy and Services Directorate of the UK's National Archives).

Sandro kicked it off with a geeky Q&A that made me feel right at home, polling the audience on what they already knew: who programs, who understands HTTP response codes, who understands XML, JSON, etc.

What is Linked Data?
  1. Take the concept of a spreadsheet or database and make it work over the web
    1. Web Identifiers (URIs)
    2. Publish information as good old fashioned web pages
    3. Triples (subject, property, value) statements
      1. e.g. Massachusetts -> nickname -> "Bay State"
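
To make the triple idea concrete for myself, here is a minimal sketch of how the Massachusetts statement could be held in code. This is my own illustration, not something shown in the session: the `Triple` class, the `values_of` helper, the bare property names, and the extra "capital" statement are all assumptions. Real linked data would use full URIs for properties too.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One (subject, property, value) statement. Hypothetical helper type."""
    subject: str
    prop: str
    value: str


# Subject named by a URI; property names are shortened for readability (assumption).
MA = "http://dbpedia.org/resource/Massachusetts"

triples = [
    Triple(MA, "nickname", "Bay State"),  # the example from the session
    Triple(MA, "capital", "Boston"),      # extra statement added for illustration
]


def values_of(statements, subject, prop):
    """Return every value recorded for a given subject and property."""
    return [t.value for t in statements if t.subject == subject and t.prop == prop]


print(values_of(triples, MA, "nickname"))
```

Each row stands alone as a complete statement, which is roughly why a pile of triples behaves like a database with no fixed columns.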

Benefits of Linked Data?
  1. Enables web-scale data publishing 
  2. Everything is a resource
  3. Everything can be annotated
  4. Easy to extend
  5. Easy to merge
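
The "easy to extend" and "easy to merge" claims can be sketched in a few lines, assuming each data set is just a set of (subject, property, value) tuples. The two publishers, the property names, and the "reviewedBy" annotation below are hypothetical, made up only to show the mechanics.

```python
# Both publishers use the same URI for the same thing, which is what makes merging work.
MA = "http://dbpedia.org/resource/Massachusetts"

# Hypothetical data set from publisher A.
publisher_a = {
    (MA, "nickname", "Bay State"),
}

# Hypothetical data set from publisher B; it overlaps with A on one statement.
publisher_b = {
    (MA, "nickname", "Bay State"),
    (MA, "capital", "Boston"),
}

# Merging is a plain set union: duplicate statements collapse automatically.
merged = publisher_a | publisher_b

# Extending needs no schema change: a new property is just another triple.
merged.add((MA, "reviewedBy", "http://example.org/agency/demo"))

for statement in sorted(merged):
    print(statement)
```

Nothing here has to agree on a table layout in advance; the shared identifier does the work a join key would do in a database.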

How do you deal with data of varying provenance or legal status?

Example used for Triples Discussion: http://dbpedia.org/page/Massachusetts


What are URIs?
  1. Very much like URLs with a few differences and with lots of history and conflict.
  2. Information Resource -> a file on your hard drive
  3. Anything you can imagine can be a resource
  4. URIs name arbitrary resources.
  5. http://en.wikipedia.org/wiki/URI
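
One way to see the "like URLs with a few differences" point is to pull a couple of identifiers apart with Python's standard `urllib.parse`. This is a rough sketch of my own: the `describe` helper is hypothetical, and the `urn:isbn` identifier is only an illustrative example of a URI that names something without saying where to fetch it.

```python
from urllib.parse import urlparse


def describe(uri):
    """Split a URI into the parts that hint at whether it is also a locator."""
    parts = urlparse(uri)
    return {"scheme": parts.scheme, "host": parts.netloc, "path": parts.path}


uris = [
    "http://dbpedia.org/page/Massachusetts",  # the page used in the session
    "urn:isbn:0451450523",                    # illustrative name-only identifier
]

for uri in uris:
    print(uri, "->", describe(uri))
```

The first identifier carries a host to contact, so it doubles as a URL; the second has a scheme and a name but no host, so it only identifies.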

A very academic session about data standardization and uniformity for sharing.  Uber geeky. I enjoyed it greatly but need time to synthesize it for a better explanation.  I jotted down some privacy risks of pushing out data like this...


Privacy Risks of Linked Data?

  1. The definition of PII itself speaks to linked data: "Information which can be used to distinguish or trace an individual's identity, such as their name, social security number, biometric records, etc. alone, or when combined with other personal or identifying information which is linked or linkable to a specific individual, such as date and place of birth, mother's maiden name, etc." OMB M-07-16
  2. The subject, property, value triple allows linking a person to values, indicators, statuses, etc.  For example: Person, MedicalCondition, <<Condition>> 
  3. Look at the Massachusetts example above and imagine Massachusetts were instead an individual.  Triple data published about this individual could be aggregated by anyone who knew the vocabulary (subject and property).
  4. Third parties being able to add to or tweak linked data sets through an API, which would allow manipulation of data on individuals.
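
The aggregation risk in item 3 is easy to sketch. Assume three unrelated publishers each release a harmless-looking triple about the same person URI; everything below (the person URI, the sources, the property names, the placeholder values, and the `build_profile` helper) is hypothetical and invented for illustration only.

```python
from collections import defaultdict

# Hypothetical identifier for an individual; no real person or data set is implied.
PERSON = "http://example.org/person/123"

# Three hypothetical sources, each publishing one triple about the same subject.
source_a = [(PERSON, "birthPlace", "Springfield")]
source_b = [(PERSON, "birthDate", "1970-01-01")]
source_c = [(PERSON, "medicalCondition", "<<Condition>>")]


def build_profile(subject, *sources):
    """Collect every property/value pair published about one subject."""
    profile = defaultdict(list)
    for source in sources:
        for s, prop, value in source:
            if s == subject:
                profile[prop].append(value)
    return dict(profile)


print(build_profile(PERSON, source_a, source_b, source_c))
```

No single source here looks like a full record, but the shared subject URI lets anyone who knows the vocabulary assemble one, which is the "linked or linkable" language in the PII definition.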

Linked Data Links (har har):
  1. Tim Berners-Lee -- http://www.w3.org/DesignIssues/LinkedData.html
  2. Scholarly articles -- http://scholar.google.com/scholar?q=Linked+Data+Publication&hl=en&as_sdt=0&as_vis=1&oi=scholart (Good job, PSU.edu being the first link!)
  3. How to publish linked data on the web: http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/