Semantic Web Standards and the Variety "V" of Big Data
Bob DuCharme
Director of Digital Media Solutions
TopQuadrant
 


 

Wednesday, August 20, 2014
10:15 AM - 11:00 AM

Level:  Technical - Introductory


A popular definition of Big Data applications is based on "the three Vs": the ability to work with the increasing Volume, Velocity, and Variety of data being created today. A September 2013 Gartner report said that most organizations find the handling of Variety to be a bigger challenge than either Volume or Velocity.

As NoSQL fans argue about the value of either using or forgetting about schemas with their data, the W3C's RDF Schema specification (and its optional superset OWL) gives us a standardized, well-implemented technique for handling the variety of Big Data as well as the perfect compromise between using and not using schemas: schemas that describe only the parts of a dataset that we're interested in. This brings multiple benefits that will be covered in this talk:

  • The ability to focus on the most useful parts of potentially useful datasets with no need to account for their entire structure.
  • How, in addition to describing the structure of potentially useful data in different datasets, RDFS and OWL let us describe relationships between values from different datasets that make it easier to relate and integrate them.
  • How the use of schemas as descriptive metadata (as opposed to prescriptive rules for the data to comply with) makes it easier to work with data that wasn't originally designed for your application.
  • The ability of the W3C SPARQL query language to query across combinations of disparate datasets and small, add-on integration schemas to find new patterns and business intelligence in the aggregated data.
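
The ideas in the list above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: it uses a hand-rolled set of triples rather than a real RDF library or SPARQL engine, and every dataset, property name, and prefix in it (a:surname, b:lastName, ex:familyName, and so on) is hypothetical. It shows a small add-on schema that describes only the properties of interest, one RDFS inference rule (rdfs:subPropertyOf), and a single query that then works across both datasets.

```python
# Minimal sketch: triples are (subject, predicate, object) tuples.
# All identifiers below are made up for illustration.
SUBPROP = "rdfs:subPropertyOf"

# Two datasets that were not designed to work together.
dataset_a = {
    ("a:emp1", "a:surname", "Smith"),
    ("a:emp1", "a:badgeColor", "blue"),  # not described by the schema; left alone
}
dataset_b = {
    ("b:p7", "b:lastName", "Jones"),
    ("b:p7", "b:shoeSize", "10"),        # not described by the schema; left alone
}

# Small add-on integration schema: it describes only the parts we care about.
integration_schema = {
    ("a:surname", SUBPROP, "ex:familyName"),
    ("b:lastName", SUBPROP, "ex:familyName"),
}


def infer_subproperties(triples):
    """Apply the RDFS subproperty rule until nothing new is added:
    if (p subPropertyOf q) and (s p o), then (s q o)."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        pairs = [(s, o) for (s, p, o) in inferred if p == SUBPROP]
        for sub, sup in pairs:
            for s, p, o in list(inferred):
                if p == sub and (s, sup, o) not in inferred:
                    inferred.add((s, sup, o))
                    changed = True
    return inferred


def query(triples, predicate):
    """Return sorted (subject, object) pairs for one predicate."""
    return sorted((s, o) for (s, p, o) in triples if p == predicate)


merged = infer_subproperties(dataset_a | dataset_b | integration_schema)
family_names = query(merged, "ex:familyName")
print(family_names)
```

In a real deployment the schema would be expressed in RDFS or OWL and the query written in SPARQL; the point of the sketch is only that the schema is descriptive metadata added alongside the data, and that the undescribed properties remain in the merged data untouched.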


Bob DuCharme is the Director of Digital Media Solutions at TopQuadrant, the leading provider of software and solutions for modeling, developing, and deploying semantic web applications. He has been writing and speaking on RDF-related technology since 2002, and co-chaired the 2008 Linked Data Planet Conference in New York City. Earlier in his career he worked on development and on data and systems architecture at Moody's Investors Service and LexisNexis. Bob received his BA in Religion from Columbia University and his Master's in Computer Science from New York University.


   