Automated Workflow Composition in the Life Sciences

9 - 13 March 2020

Venue: Lorentz Center@Oort


In the age of computational science, researchers in the life sciences – just as in other domains – regularly need to compose several individual software tools into pipelines or workflows that perform the specific data analyses their research requires. For over 20 years now, dedicated scientific workflow management systems have been supporting scientists in this task, and they continue to gain popularity. In fact, recent years have seen significant progress in the functional annotation of bioinformatics software tools, as well as in their virtualization, containerization and assembly into workflows that execute these processes automatically.

 

The idea of semantics-based automated workflow composition has been around at least since the rise of the Semantic Web in the early 2000s. It aims to simplify working with scientific workflows further and to free life science researchers from having to deal with the technicalities of software composition. This would not only save valuable research time, but also reduce errors, allow benchmarking of data analysis pipelines and enable new scientific findings by discovering workflows that researchers would not have thought of themselves. However, despite its obvious potential and appeal, the need for optimizing data analysis workflows, and the efforts of different research groups working on the topic, automated workflow composition has not yet arrived in the daily practice of life science researchers.

 

The reasons for this are manifold. Some are more practical (for example, the lack of automatic composition tools in the commonly used software frameworks), others are of a more fundamental nature (such as questions on specification languages, composition algorithms, formal semantics and workflow representations). On one important aspect, namely the semantic annotation of tools on a large scale, the life science community has made significant progress in recent years: the EDAM ontology provides a controlled vocabulary of bioinformatics operations, data types and formats, and the bio.tools registry has become a large collection of bioinformatics tools that are semantically annotated with terms from the EDAM ontology. As demonstrated in a recent Bioinformatics publication (https://academic.oup.com/bioinformatics/article/35/4/656/5060940), this forms a solid basis for performing automated workflow composition in the life sciences domain. Nevertheless, there is still a long way to go before it is used in daily scientific practice.

 

This workshop will bring together researchers and practitioners who have been working on different aspects of automated workflow composition in the life sciences. These include life science researchers, tool providers, infrastructure developers, ontologists, algorithmics researchers and many more. They do not normally come together as a group at regular scientific events, so a Lorentz workshop devoted to this topic provides a unique opportunity to join forces and significantly advance the field together.

 

Towards this goal, the workshop aims at:

bringing the participants from different backgrounds to a common workable level of knowledge on automated workflow composition through a series of presentations on the relevant topics by experts in the field,

letting the participants apply the presented concepts and techniques to a selection of real workflow scenarios from the life sciences to challenge their usability in practice, and

discussing and evaluating the outcomes of these activities to develop a common perspective on future directions in the field of automated workflow composition.


    March 9

    Monday (Workflows in the Life Sciences)

    until 11:00 Arrival, Coffee

    11:00-11:15 Welcome by the Lorentz Center

    11:15-12:00 Workshop goals and structure (workshop organizers), problem definition and possible solutions, brief introductions.

    12:00-13:00 Lunch break

    13:00-14:00 Opening Keynote “Workflow Wanders and Wonders” (Prof. Carole Goble, University of Manchester)

     

    14:00-16:30 Concrete workflow examples from different domains

    14:00-14:20 Genomics (Leon Mei, LUMC)

    14:20-14:40 Proteomics (Veit Schwämmle, SDU, Denmark)

    14:40-15:00 Proteogenomics (Tim Griffin, University of Minnesota)

    15:00-15:30 Coffee break

    15:30-15:50 Metabolomics (Aswin Verhoeven, LUMC)

    15:50-16:10 Metaomics (Pratik Jagtap, University of Minnesota)

    16:10-16:30 Scientometrics and text mining (Magnus Palmblad, LUMC)

     

    16:30-17:00 Outlook on Tuesday-Thursday

     

    17:00- Wine & cheese reception combined with poster session (attendees, in particular early-stage researchers, will be invited to bring relevant posters to stimulate discussion and interaction. Posters will be up all week.)

     

    March 10

    Tuesday (Semantics, Ontologies and Functional Tool Annotations)

    09:00-09:45 Keynote Presentation on the principles of semantics and ontologies, including examples of biomedical ontologies (Prof. Robert Stevens, University of Manchester)

    09:45-10:00 Discussion on the presentation

     

    10:00-10:30 Break

     

    10:30-11:00 Presentation: “EDAM, bio.tools and other important projects for software description in the European life sciences community” (Matúš Kalaš and Hervé Ménager)

     

    11:00-11:30 Presentation: “Tool function description in practice” (Hans Ienasescu and Jon Ison)

     

    11:30-12:00 Discussion on the presentations

     

    12:00-14:00 Lunch break

     

    14:00-16:00 Breakout sessions (in thematic groups) to work on semantic annotations of the tools (EDAM + bio.tools) needed in the workflow scenarios defined on Monday. What is there and what is needed in EDAM or bio.tools? 

     

    16:00-16:30 Break (coffee available)

     

    16:30-17:30 Reports from breakout sessions: presentation of the developed annotations, summary of problems and particular challenges

     

    18:00-late Pizza and curatathon (optional)

     

    March 11

    Wednesday (Automated Workflow Composition: Specification and Algorithms)

    09:00-09:20 Presentation: “Tool prediction in Galaxy” (Alireza Khanteymoori, University of Freiburg)

    09:20-09:40 Presentation on intelligent workflow instance generation and selection with the WINGS framework (Prof. Paul Groth, University of Amsterdam)

    09:40-10:00 Presentation: “The Automated Pipeline Explorer (APE)” (Anna-Lena Lamprecht, Utrecht University)

    10:00-10:20 Presentation: “Semantic Data Federation with SADI, HYDRA and a Valet” (Prof. Chris Baker, University of New Brunswick)

     

    10:20-11:00 Break

     

    11:00-12:00 Panel discussion of commonalities and differences between the approaches

     

    12:00-14:00 Lunch break

     

    14:00-16:00 Breakout sessions (in thematic groups): How would these approaches work out in the different domains, on the different workflow examples?

     

    16:00-16:30 Break

     

    16:30-17:30 Reports from breakout sessions: summary of insights, problems and particular challenges 

     

    17:30-late Workshop dinner at Het Prentenkabinet (see directions in slides)

     

    March 12

    Thursday (Comparison/Ranking/Selection/Benchmarking of Workflows)

    09:00-09:15 Short introduction to comparison/ranking/selection problems (Anna-Lena Lamprecht, Utrecht University) 

    09:15-11:30 Breakout sessions: What can you say without executing the workflow? (design time decisions, implementation)

    11:30-12:00 Reports from the breakout sessions

    12:00-14:00 Lunch break

    14:00-15:00 Presentation: Tool and workflow benchmarking (Salvador Capella-Gutierrez, Barcelona Supercomputing Center)

    15:00-16:00 Breakout sessions: How to assess the results of the workflows? (What can you say about workflows by executing them?)

    16:00-16:30 Break

    16:30-17:30 Reports from breakout sessions

    18:00-late Pizza and hackathon (optional)

     

    March 13

    Friday

    09:00-10:00 Plenary discussion

     

    10:00-10:30 Break

     

    10:30-12:00 Wrap-up and reviewing of session reports

     

    12:00-13:00 Lunch

     

    13:00-14:00 Inspirational Closing Keynote: “Making Workflows FAIR with Nanopublications” (Tobias Kuhn)

     

    14:00- Farewell (workshop organizers)

     


    Jon Ison, Technical University of Denmark  

    Magnus Palmblad, Leiden UMC  

    Anna-Lena Lamprecht, Utrecht University  

    Veit Schwämmle, University of Southern Denmark  



Niels Bohrweg 1 & 2

2333 CA Leiden

The Netherlands

+31 71 527 5400