Commit 734b64ec authored by Jerome Mariette
parent a1fceaf0
\documentclass{bioinfo}
\copyrightyear{2014}
\pubyear{2014}
\begin{document}
\firstpage{1}
\title[short Title]{Jflow: A fully scalable JavaScript workflow management system}
\author[Mariette \textit{et~al}]{J\'{e}r\^{o}me Mariette\,$^{1,*}$,
Fr\'{e}d\'{e}ric Escudi\'{e}\,$^1$, Philippe Bardou\,$^2$, Christophe
Klopp\,$^1$}
\address{$^{1}$Plate-forme bio-informatique Genotoul, INRA, Biom\'{e}trie et
Intelligence Artificielle, BP 52627, 31326 Castanet-Tolosan Cedex, France.\\
$^{2}$Plate-forme SIGENAE, INRA, G\'{e}n\'{e}tique Cellulaire, BP 52627, 31326
Castanet-Tolosan Cedex, France.}
\history{Received on XXXXX; revised on XXXXX; accepted on XXXXX}
\editor{Associate Editor: XXXXXXX}
\maketitle
\begin{abstract}
\section{Summary:}
Building rich web environments that help scientists analyse their data is a common
trend in bioinformatics. These applications are often either specialized web portals or
generic workflow management systems (WMS). The first class provides multiple services and
analysis tools in an integrated interface for a specific experiment or data type, and quite often
hides the processing steps in the back office. The second class, for example Galaxy,
focuses mainly on workflow creation: it provides a rather poor end-user interface but enables
users to combine tools and data sources as desired. We introduce jflow, a fully
scalable WMS that can easily be embedded in any web site, bringing all WMS features and benefits
to the host project.
\section{Availability:}
The software package is available under the GNU General Public License (GPL) at
http://bioinfo.genotoul.fr/jflow.
\section{Contact:}
\href{mailto:support.genopole@toulouse.inra.fr}{support.genopole@toulouse.inra.fr}
\end{abstract}
\section{Introduction}
Workflow management systems (WMS) are software packages that manage and execute
computational pipelines. Such systems are now widely used in bioinformatics because
they enable researchers to analyse the large amounts of data generated by high-throughput
platforms. Some WMS, such as Galaxy (Giardine et al., 2005), let users process their data with a
collection of local tools through web forms. Galaxy is probably the most widely used of these systems because it
is built around a comprehensive web interface designed for tool and database integration. BioMOBY
(Wilkinson et al., 2002) and Taverna (Oinn et al., 2004) differ from other WMS
because they organize and integrate multiple web service providers. The main limitations of such WMS are
network performance, service availability and I/O compatibility between providers. All these WMS have
their own user interface and can hardly be used as components of an existing web project. As a result,
providing access to a new tool commonly means implementing a dedicated web portal that wraps the execution
of the given software package. We describe jflow, which includes, on the one hand, the core of the WMS, able to execute
workflows defined by a set of components, and, on the other hand, a fully scalable
jquery (http://jquery.com/) plugin that queries the jflow REST API.
\begin{methods}
\section{Implementation}
Jflow core is based upon the Makeflow (Albrecht et al., 2012) workflow
engine and Weaver (Bui, 2012), its Python API, which is embedded within jflow. Relying on Makeflow allows
jflow to run on different batch systems such as Condor, SGE or Work Queue, or on a single multicore machine.
\subsection{Adding new components and workflows}
Adding a new component to the system requires writing a Python class that inherits from the Component
class and overrides the process method, which wraps the new tool. The class provides access to several
concepts, such as map/reduce or multimap, that define how the command line pattern
should be applied to the input data. Multiple formats are available as input and output so that
the workflow creator is forced to combine components in a compatible way.
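
As an illustration only, the following Python sketch shows the general shape of such a component. The \texttt{Component} base class, its \texttt{map} helper, the \texttt{commands} list and the \texttt{LineCount} subclass are hypothetical stand-ins written for this example and are not jflow's actual API. The map concept is emulated by generating one command line per input file instead of submitting jobs to a workflow engine.

```python
class Component:
    """Hypothetical stand-in for a WMS component base class (not jflow's API)."""

    def __init__(self, input_files):
        self.input_files = list(input_files)
        self.commands = []  # command lines are collected here, not executed

    def map(self, pattern, inputs, outputs):
        """Apply the command line pattern once per (input, output) pair."""
        for src, dst in zip(inputs, outputs):
            self.commands.append(pattern.format(input=src, output=dst))

    def process(self):
        raise NotImplementedError("subclasses wrap their tool here")


class LineCount(Component):
    """Example component wrapping `wc -l` for each input file."""

    def process(self):
        outputs = [name + ".count" for name in self.input_files]
        self.map("wc -l {input} > {output}", self.input_files, outputs)
        return outputs


component = LineCount(["a.fastq", "b.fastq"])
result = component.process()
print(component.commands)
```

In this sketch the subclass only declares the command line pattern and its inputs and outputs; how the resulting commands are scheduled is left to the base class, which is the separation the text describes.
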
In the same way, writing a workflow consists of inheriting from the Workflow
class and overriding the process method, in which the workflow is defined as a
succession of components. In addition, a property file must be created to define the workflow parameters.
It gathers all the information about each parameter, including its data type, which is checked by the provided interfaces.
For example, a date parameter is displayed as a calendar in the graphical interface, and compliance with the dd/mm/yyyy
format is checked in command line mode.
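
The dd/mm/yyyy check mentioned above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration of the idea, not jflow's own validation code; the function names \texttt{parse\_date} and \texttt{is\_valid\_date} are assumptions made for this example.

```python
from datetime import datetime


def parse_date(value):
    """Return a date if value follows dd/mm/yyyy, otherwise raise ValueError."""
    return datetime.strptime(value, "%d/%m/%Y").date()


def is_valid_date(value):
    """Tell whether value is a real calendar date written as dd/mm/yyyy."""
    try:
        parse_date(value)
        return True
    except ValueError:
        return False


for candidate in ("31/12/2014", "2014-12-31", "31/02/2014"):
    print(candidate, is_valid_date(candidate))
```

Relying on \texttt{strptime} rejects both a wrong layout and an impossible calendar date such as 31 February, which a purely pattern-based check would accept.
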
\subsection{A jquery plugin}
All information about a workflow is accessible from both the jflow command
line interface and its REST API. Users can thus list the available workflows and
their states, run them and monitor them, either from the command line or
from a website that integrates the jflow plugin.
To do so, jflow offers four modules to retrieve workflow information. As a
jquery plugin, these modules are fully scalable and provide multiple methods and events to
ease jflow integration. As an example, a ``click'' on a specific workflow triggers an event
that can be listened to and used to build the workflow form wherever the web site designer wants it
to be shown.
\end{methods}
\section{Conclusion}
jflow (http://bioinfo.genotoul.fr/jflow) is a simple and efficient solution for
embedding workflow management system features within any web application.
\begin{thebibliography}{}
\bibitem[Giardine {\it et~al}., 2005]{Giardine} Giardine B, et al. (2005)
Galaxy: a platform for interactive large-scale genome analysis. {\it Genome Res.},
{\bf 15}, 1451-1455.
\bibitem[Wilkinson {\it et~al}., 2002]{Wilkinson} Wilkinson MD, Links M. (2002)
BioMOBY: an open source biological web services proposal. {\it Brief Bioinform},
{\bf 3}, 331-341.
\bibitem[Oinn {\it et~al}., 2004]{Oinn} Oinn T, et al. (2004) Taverna: a tool
for the composition and enactment of bioinformatics workflows. {\it Bioinformatics},
{\bf 20}, 3045-3054.
\bibitem[Albrecht {\it et~al}., 2012]{Albrecht} Albrecht M, Donnelly P,
Bui P, Thain D. (2012) Makeflow: A Portable Abstraction for Data
Intensive Computing on Clusters, Clouds, and Grids. {\it SWEET at ACM SIGMOD}, {\bf 20}.
\bibitem[Bui, 2012]{Bui} Bui P. (2012) Compiler Toolchain For Data Intensive Scientific
Workflows. {\it Ph.D. Thesis, University of Notre Dame}.
\end{thebibliography}
\end{document}