Data Science with Human in the Loop: Harnessing User Semantics at Scale

Lora Aroyo

VU University Amsterdam

ABSTRACT: Software systems are becoming ever more intelligent and more useful, but the way we interact with these machines too often reveals that they don't actually understand people. Knowledge Representation and the Semantic Web focus on the scientific challenges involved in providing human knowledge in machine-readable form. However, we observe that various types of human knowledge cannot yet be captured by machines, especially when dealing with wide ranges of real-world tasks and contexts. The key scientific challenge is to provide an approach to capturing human knowledge in a way that is scalable and adequate for real-world needs. Human Computation has begun to scientifically study how human intelligence at scale can be used to methodologically improve machine-based knowledge and data management. My research focuses on understanding human computation in order to improve how machine-based systems acquire, capture and harness human knowledge and thus become even more intelligent. In this talk I will show how the CrowdTruth framework (http://crowdtruth.org) facilitates data collection, processing and analytics of human computation knowledge.

- http://controcurator.org/

BIO: Prof. dr. Lora Aroyo is a full professor of Computer Science at Vrije Universiteit Amsterdam, where she leads the Web & Media Group. Her research focuses on semantic technologies for modeling user and context for personalized access to online multimedia collections, e.g. cultural heritage collections, multimedia archives and interactive TV. She has been prominently involved in national and international Digital Humanities initiatives. She was a scientific coordinator of the NoTube project, dealing with the integration of Web and TV data with the help of semantics, and of a number of nationally funded projects, such as CHIP and Agora, dealing with modeling events and event narratives. Lora is actively involved in the Semantic Web community as a program chair for the European and the International Semantic Web Conferences. She is also actively involved in the Personalization and User Modeling community as vice-president of User Modeling Inc. She is a three-time holder of IBM Faculty Awards for her work on Crowd Truth: Crowdsourcing for ground truth data collection for adapting the IBM Watson system to the medical domain.

Web: http://lora-aroyo.org, Twitter: @laroyo, SlideShare: http://www.slideshare.net/laroyo