Commit 6d759871 authored by Jerome Mariette

No commit message
parent 84b77cd3
......@@ -28,7 +28,7 @@ Biologists produce large data sets and are in demand of rich and simple
web portals in which they can upload and analyse their files. Providing such
tools requires masking the complexity induced by the underlying High Performance
Computing (HPC) environment. The connection between interface and computing
infrastructure is usually specific to each portal. With jflow, we introduce
infrastructure is usually specific to each portal. With Jflow, we introduce
a Workflow Management System (WMS), composed of jQuery plugins which can easily be
embedded in any web application and a Python library providing all the
features required to set up, run and monitor workflows.
......@@ -54,7 +54,7 @@ organize them into workflows.
Generic WMS, such as Galaxy (Goecks et al., 2010), provide a user friendly
graphical interface that eases workflow creation and execution, hiding the
complexity of tool installation and tuning. Today, it is probably the most used
WMS due to its intuitivness and large software package collection.
WMS due to its intuitiveness and large software package collection.
Unfortunately, using such environments does not enable an easy integration within
already existing web interfaces.
......@@ -70,7 +70,7 @@ now a common need. Specialized web portals such as MG-RAST (Meyer et al., 2008)
or MetaVir (Roux et al., 2011) provide multiple services and analysis tools in
an integrated manner for specific experiments or data types. These
applications usually hide the processing steps in the back office and implement
their own way to manage job executions. Using jflow, developers of such tools
their own way to manage job executions. Using Jflow, developers of such tools
could easily integrate WMS features in their applications.
......@@ -92,31 +92,31 @@ providing user oriented views.
\end{itemize}
The plugins give access to multiple communication methods and events and
communicate with the server side by requesting the jflow REST API running under
communicate with the server side by requesting the Jflow REST API running under
a cherrypy (http://www.cherrypy.org/) web server. The provided server uses the
JSONP communication technique enabling cross-domain requests.
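As an illustration of the JSONP technique mentioned above, the following minimal Python sketch wraps a JSON payload in a JavaScript callback call. It is not taken from the Jflow sources: the function name \textit{to\_jsonp}, the callback-name check and the example payload are assumptions made for this example only.

```python
import json
import re

# Accept only plain JavaScript identifiers (optionally dotted) as callback names,
# so that the response cannot be used to inject arbitrary script.
_CALLBACK_RE = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*(\.[A-Za-z_$][A-Za-z0-9_$]*)*")


def to_jsonp(payload, callback):
    """Return a JSONP response body: the payload serialised as JSON and
    passed as the single argument of the client-supplied callback."""
    if not _CALLBACK_RE.fullmatch(callback):
        raise ValueError("unsafe JSONP callback name: %r" % callback)
    return "%s(%s);" % (callback, json.dumps(payload))


# Hypothetical payload describing one workflow.
body = to_jsonp({"id": 1, "name": "demo"}, "handleWorkflow")
```

The client loads such a response through a script tag, which is what allows the request to cross domains; the browser then executes the callback with the decoded data.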
To be available from the different jQuery plugins, the workflows have to be
implemented using the jflow API which includes classes and functions to build
components and workflows. A jflow component is in charge of a command line
implemented using the Jflow API which includes classes and functions to build
components and workflows. A Jflow component is in charge of a command line
execution. A workflow chains several components. Adding a component to the
system requires to write a Python \textit{Component} subclass. In jflow,
system requires writing a Python \textit{Component} subclass. In Jflow,
different solutions are available to ease component creation. To wrap a single
command line, the developer can give a position or a flag for each parameter.
Jflow also embeds an XML parser which allows it to run genuine Mobyle (Neron et
al., 2009) components. Finally, to allow developers to integrate components
from other WMS, jflow provides a class skeleton, in which only the parsing step
has to be implemented. In the same way, a jflow workflow is built from a
from other WMS, Jflow provides a class skeleton, in which only the parsing step
has to be implemented. In the same way, a Jflow workflow is built from a
\textit{Workflow} subclass. Components are added as variables and chained
by linking outputs to inputs.
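The component and workflow model described above can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the Jflow API: the \textit{Component} and \textit{Workflow} classes below are stand-ins written for this example, and the tools \textit{mytrim} and \textit{myalign}, their flags and the file names are hypothetical.

```python
import shlex


class Component:
    """Stand-in base class: one component wraps one command line."""

    executable = None  # set by subclasses

    def __init__(self, **values):
        self.values = values

    def parameters(self):
        """Return (name, flag) pairs in command-line order.
        A flag of None means the value is passed by position."""
        raise NotImplementedError

    def command_line(self):
        args = [self.executable]
        for name, flag in self.parameters():
            if flag is not None:
                args.append(flag)
            args.append(str(self.values[name]))
        return " ".join(shlex.quote(arg) for arg in args)


class Trim(Component):
    executable = "mytrim"  # hypothetical tool

    def parameters(self):
        return [("min_length", "--min-length"), ("reads", None), ("output", "-o")]


class Align(Component):
    executable = "myalign"  # hypothetical tool

    def parameters(self):
        return [("reference", "--ref"), ("reads", None)]


class Workflow:
    """Stand-in workflow: an ordered chain of components."""

    def __init__(self):
        self.components = []

    def add(self, component):
        self.components.append(component)
        return component

    def commands(self):
        return [component.command_line() for component in self.components]


wf = Workflow()
trim = wf.add(Trim(reads="sample.fastq", output="trimmed.fastq", min_length=30))
# Chaining: the output of the first component is the input of the second.
align = wf.add(Align(reference="genome.fa", reads=trim.values["output"]))
```

In this sketch each parameter is declared either with a flag or by position, mirroring the single command line wrapping option described in the text; job submission and monitoring are deliberately left out.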
To define the parameters presented to the final user, jflow gives access to
To define the parameters presented to the final user, Jflow gives access to
different class methods. Each parameter has at least a name, a user help text
and a data type. For type parameters such as files or directories, it is
possible to set required file format, size limitation and location. Jflow
handles server side files with regular expressions, but also URL files and
client side files, in which case, it automatically uploads them. Before running
the workflow, jflow checks data type compliance for every parameter. To manage
the workflow, Jflow checks data type compliance for every parameter. To manage
job submission, status checking and error handling, it relies on Makeflow
(Albrecht et al., 2012) and weaver (Bui et al., 2012). It benefits from the
error recovery feature and the support of most distributed resource management
......@@ -139,13 +139,13 @@ Workflows are listed thanks to the \textit{availablewf} plugin built within a
NG6 modal box. It requests the server to get the workflows implemented by the
developer. A \textit{select.availablewf} event thrown by the
\textit{availablewf} plugin is caught to generate the parameter
form using the \textit{wfform} plugin. Considering the parameter type, jflow
form using the \textit{wfform} plugin. Considering the parameter type, Jflow
adapts its display. For example, a date is displayed as a calendar, whereas a
boolean is represented by a checkbox.
As it is dedicated to biological data, NG6 inputs are often experimental samples,
composed of read files and metadata such as name, tissue and developmental stage.
To help providing such parameters, jflow enables to use structured data inputs.
To help provide such parameters, Jflow supports structured data inputs.
Such parameter sets are represented within the \textit{wfform} plugin in a
spreadsheet, allowing to copy and paste multiple lines. Iterating over a set of
samples is thus as easy as filling a spreadsheet.
......@@ -156,7 +156,7 @@ samples is thus as easy as filling a spreadsheet.
\caption{\textbf{Jflow integration:} (a) A piece of the NG6 HTML source code in
which an empty div is positioned to build the \textit{activewf} plugin and a
modal box for the \textit{wfstatus} plugin. (b) The jQuery code used to
build jflow plugins and manage user action. When the \textit{select.activewf}
build Jflow plugins and manage user action. When the \textit{select.activewf}
event is thrown from \textit{activewf-div}, a function is called with two
parameters: \textit{event} and \textit{workflow}. The last parameter stores
all the workflow's information, such as its name and its id, used in this
......@@ -174,6 +174,12 @@ presented on Figure~\ref{fig::jflow_example}. This view shows the workflow's
execution graph where a component is represented by a node and an input / output
link by an edge.
NG6 was first implemented using the Ergatis (Orvis et al., 2010) WMS which, like
Galaxy, comes with its own user interface. Users therefore had to set up, run and
monitor workflows from Ergatis, and to browse sequencing runs and analyses from
NG6. With Jflow, all actions are available from the same interface, which is a
real gain for the user. The new version of the environment has been in
production since 2013. Jflow has been used to process xxx sequencing runs on a
5,000-core HPC cluster.
\section{Conclusion}
......@@ -220,8 +226,13 @@ Clouds, and Grids. {\it SWEET at ACM SIGMOD}, {\bf20}.
\bibitem[Bui, 2012]{Bui} Bui P, (2012) Compiler Toolchain For Data Intensive
Scientific Workflows. {\it Ph.D. Thesis, University of Notre Dame}.
\bibitem[Lopes {\it et~al}., 2010]{Lopes} Lopes C, et al. (2010) Metavir: a web
server dedicated to virome analysis, {\it Bioinformatics}, {\bf 26}, 2347-2348.
\bibitem[Lopes {\it et~al}., 2010]{Lopes} Lopes C, et al. (2010) Cytoscape Web:
an interactive web-based network browser, {\it Bioinformatics}, {\bf 26},
2347-2348.
\bibitem[Orvis {\it et~al}., 2010]{Orvis} Orvis J, et al. (2010) Ergatis: a web
interface and scalable software system for bioinformatics workflows, {\it
Bioinformatics}, {\bf 26}, 1488-1492.
\end{thebibliography}
\end{document}