Thursday, August 21, 2014
02:00 PM - 02:45 PM
Level: Technical - Intermediate
Technologies such as Hadoop have addressed the "Volume" problem of Big Data, and technologies such as Spark have recently addressed the "Velocity" problem, but the "Variety" problem remains largely unaddressed: managing data models still requires a lot of manual "data wrangling".
These manual processes do not scale well. Not only is the variety of data increasing, but the rate of change in the data definitions is increasing as well. We can’t keep up. NoSQL data repositories can handle storage, but we need effective models of the data to fully utilize it.
This talk will present tools and a methodology to manage Big Data Models in a rapidly changing world. It covers:
- Creating Semantic Metadata Models of Big Data Resources
- Graphical UI Tools for Big Data Models
- Tools to synchronize Big Data Models and Application Code
- Using NoSQL Databases, such as Amazon DynamoDB, with Big Data Models
- Using Big Data Models with Hadoop, Storm, Spark, Giraph, and Inference
- Using Big Data Models with Machine Learning to generate Predictive Models
- Developer Collaboration/Coordination processes using Big Data Models and Git
- Managing change – Big Data Models with rapidly changing Data Resources
Techniques will be demonstrated with a sample application.
Marc Hadfield founded Vital AI to create software systems that understand and harness data. Mr. Hadfield’s career has focused on large-scale data analysis in Financial Services, Life Science, and Enterprise Publishing. He has been principally focused on the interplay of Semantic Data, Machine Learning, Graph Analytics, and Natural Language Processing as a foundation for data-driven applications. As CTO of Alitora Systems, Mr. Hadfield worked with the Gladstone Institute and the Gates Foundation to apply these techniques to drug discovery; as CTO of Inform Technologies, he applied them to content recommendation for news and video publishers; and as a consultant to Bloomberg, he applied them to communication regulatory compliance. Using this experience, Mr. Hadfield designed the Vital AI software components and processes to do the heavy lifting of data-driven applications, freeing up developer and data science resources for deeper data analysis.