Web Science Summer School - Second keynote speech from Professor Jim Hendler
Professor Jim Hendler will give the second keynote of the week on Tuesday morning in the Health Sciences building. He is the Director of the Institute for Data Exploration and Applications and the Tetherless World Professor of Computer, Web and Cognitive Sciences at Rensselaer Polytechnic Institute. He also serves as a Director of the UK’s charitable Web Science Trust. He has authored over 200 technical papers in the areas of Semantic Web, artificial intelligence, agent-based computing and high-performance processing.

The abstract for his speech is below:

“Big Data” usually refers to the very large datasets generated by scientists, to the many petabytes of data held by companies like Facebook and Google, and to analyzing real-time data assets like the stream of Twitter messages emerging from events around the world. Key areas of interest include technologies to manage much larger datasets, technologies for the visualization and analysis of databases, cloud-based data management and data mining algorithms.

Recently, however, we have begun to see the emergence of another, equally compelling data challenge: that of the “Broad data” that emerges from millions and millions of raw datasets available on the World Wide Web. For broad data, the new challenges that emerge include Web-scale data search and discovery, rapid and potentially ad hoc integration of datasets, visualization and analysis of only partially modeled datasets, and issues relating to the policies for data use, reuse and combination.

In this talk, we present the broad data challenge and discuss potential starting points for solutions, including those arising from research in the Semantic Web area. We illustrate these approaches using data from a “meta-catalog” of over 1,000,000 open datasets that have been collected from about two hundred governments around the world.

Blogging from WSI Intern and DigiChamp Alex Hovden.