Online Information Quality

26 - 29 March 2018

Venue: Snellius


Description

In the “Onlife” era, the distinction between online and offline has faded away. This means that our online actions have a direct impact on our offline actions, and vice versa. Nevertheless, we, as humans, are often unable to evaluate the consequences of the flow of online information, as the speed, persistence, heterogeneity and volume of online information are inherently different from those in the offline world in which we have evolved.

Fake news is a straightforward example of how dependent our offline world is on typical online phenomena that human agents may be unable to understand or control. This calls for the development of tools and methods that enable a more informed, augmented access to online information. While the veracity of online information is one of the more urgent challenges we may have to tackle, many other informational qualities (like neutrality and completeness) can benefit from being quantified and presented to users who access online information. This idea of semantically enhancing or annotating online information can be extended to include specific low-level metadata, like timestamps and author identities, to help users correctly interpret the information they observe. By presenting quality assessments and meta-information to online users, we can increase their awareness and lead them to a more conscious consumption (and creation) of online information. Tools for assessing the quality of Web information have already been proposed, but such tools have to be accompanied by an effort to spread awareness about the quality of online information and to increase the incentives that users have to use them.

Several key players in the internet and technology industry are currently proposing solutions to the rise of fake news by combining automated methods with human contributions. In doing so, they only address a limited range of the actions that can be taken to improve our access to online information and our ability to evaluate its multiple qualities. A document containing factual information may, for instance, still be incomplete or lack readability. A potentially even more problematic issue is the “framing” of information in online environments, for even factual raw data can be framed in ways that make certain conclusions, outlooks or worldviews more likely than others, and thus influence the decision-making processes of the consumers of these data. These framing processes are far from being comparable to fake news, but their presence and influence are crucial in determining how issues are defined, understood and acted upon by users of information. As such, the lack of transparency caused by framing quickly becomes a problem that needs to be addressed when we study and try to improve information quality in online environments.

Therefore, it is important to identify the quality aspects (or dimensions) that are most likely to affect the lives of those who consume online information, to develop methods to quantify these qualities, and to understand how different actors evaluate these qualities. In the case of fake news, quality can be assessed by authorities through proper validation methodologies, leading to a binary evaluation of information quality (“fake” vs. “real”), but in most other cases the evaluations will be more gradual and potentially also more subjective. Preliminary studies on this topic have been proposed, and will provide a starting basis for the workshop.

The practice of source criticism offers an inspiring starting point for the assessment and augmentation of online information, but will have to be extended and adapted to the range of non-traditional sources of information that are present online (e.g., blog posts).

Aim

This workshop aims to bring together scholars from diverse disciplines in order to identify a set of challenges related to Online Information Quality that are specifically relevant to them. These disciplines comprise Computer Science, Digital Humanities, Media Studies, Management, and Philosophy. Moreover, by leveraging this multidisciplinary setting, the workshop will also provide preliminary solutions to such challenges, and identify cross-disciplinary research developments that will be undertaken to address them. In particular, challenges and solutions will cover the following items:

Which quality. Several quality dimensions can be considered when assessing online information. Accuracy, precision, complexity, completeness, neutrality and transparency are examples of such qualities. Is it possible to define a minimum list of fundamental qualities? How is such a boundary defined? How are such dimensions related to each other?

Which uses and purposes. Online actors rarely share and search in a vacuum, but instead share and/or look for information for a specific reason. We may look for information to answer a specific question, and perhaps act on that answer, but also to broaden our horizons, stay up to date, or simply for diversion. Similarly, information can be shared to inform or help others, but also to convince them of a particular viewpoint, or often even to influence their behaviour. This inevitably influences how information is valued, both from the perspective of the producer/disseminator and from the perspective of the consumer. For example, for historians, the veracity of the provenance of the information encountered may be more important than the veracity of the content itself. It may be more important to know that a given document was authentically authored by Julius Caesar than to know that its content is accurate. Conversely, for Computer Scientists, the veracity of provenance may be as important as the veracity of the information itself: knowing that a given hotel review was really produced by a given author on a given website may be as important as knowing that its content is truthful.

How to assess quality. What are the roles of NLP and of machine learning when assessing the quality of online information? How is it possible to combine these automated methods with human-in-the-loop methods (e.g., niche/crowdsourcing, social network analysis)?

Societal and economic impact. From restaurant reviews to political discussions, online information has a huge societal and economic impact. On the one hand, it is important to quantify this impact. On the other hand, it is important to understand the value of the various possible assessments, so as to be able to prioritize those assessments whose value is higher. Given the computational burden of such assessments, this prioritization would make it possible to spend computational effort where it matters most.

From this vantage point, two relevant areas of investigation might be:

- Quality of information related to business (or organizational/strategic) decisions. Digital tools and the flow of information online contribute to the emergence of issues that need to be faced by organizations (both for-profit and not-for-profit). In turn, by determining the strategy-making agenda of these organizations, such issues have a big impact on society and on the economy, which these organizations influence through their behavior. An example might be the waves of interest in, and centrality assigned to, sustainability issues by firms whenever crises happen and become a “hot topic” in online discussions.

- The use/abuse/misuse of high/low quality information in making policy decisions. The weakening of traditional political forces and structures, and the emergence of bottom-up social movements that either indirectly influence or directly govern the policy-making process in an increasing number of countries, pose the problem of how political decisions are made. In particular, policy analysts are called upon to consider the risks of drifting into populism or sheer bad policy due to an ill-informed decision-making process influenced by “bad information and framing”.

Processability. If one of the goals of assessing information quality is to increase user awareness when accessing online information, it is important to understand what the processability limits of such users are when accessing this augmented information. In fact, even supposing we can estimate a large number of qualities, such estimates need to be processed by users in order to be effective.

Workshop files

Scientific organizers:

Lora Aroyo

Kaspar Beelen

Davide Ceolin

Vladi Finotto

Patrick Allo, Vrije Universiteit Brussel
