metagenopolis / meteor, merge request !1
Fix bug while reading fastq.gz files generated by recent samtools
Merged. Florian Plaza-Onate requested to merge fix_samtools_gzip into master 1 year ago.
1 commit (bf0578d9), 1 file changed
meteor-pipeline/meteor.rb  +74 −97
@@ -767,40 +767,6 @@ class MeteorSession
 #------------------------------------------------------------------------------
-# count fastq reads
-def CountReadFastqFile(aInputFile)
-  aLineCount = aReadCount = aBaseCount = 0
-  fd = nil
-  in_fq = nil
-  # open input fastq in read mode
-  begin
-    fd = File.open(aInputFile)
-    # fastq may be gziped
-    in_fq = aInputFile =~ /gz$/ ? Zlib::GzipReader.new(fd) : fd
-  rescue
-    STDERR.puts "Error: cannot open #{aInputFile}!"
-    exit 1
-  end
-  # read input fastq line by line
-  in_fq.each do |line|
-    aLineCount += 1
-    aModulo = aLineCount % 4
-    #if line =~ /^@/
-    if aModulo == 1 # read id line
-      aReadCount += 1
-    else
-      aBaseCount += line.size - 1 if aModulo == 0 # size - 1 because of line ending character
-    end
-  end
-  in_fq.close if not in_fq.closed?
-  return [aReadCount, aBaseCount]
-end
 #------------------------------------------------------------------------------
 def CountReadAndReIndexCsfastaQualFiles(aInputCsFastaFile, aInputQualFile, aOutputCsFastaFile, aOutputCsQualFile)
   aReadCount = aBaseCount = 0
@@ -857,70 +823,81 @@ class MeteorSession
 #------------------------------------------------------------------------------
 def CountReadAndReIndexFastqFile(aInputFile, aOutputFile)
-  aInputLineCount = aReadCount = aBaseCount = aCpt = 0
-  adn = qual = prev = seqId = nil
-  fd = nil
-  in_fq = nil
-  # open input fastq in read mode
-  begin
-    fd = File.open(aInputFile)
-    # fastq may be gzipped, xz or bz2
-    in_fq = aInputFile =~ /gz$/ ? Zlib::GzipReader.new(fd) : fd
-  rescue
-    STDERR.puts "Error: cannot open #{aInputFile}!"
-    exit 1
-  end
-  # open output fastq in write mode
-  File.open(aOutputFile, "w") do |out_fq|
-    # read input fastq line by line
-    in_fq.each do |line|
-      line.chomp!
-      # line begins with @seqid
-      if (line =~ /^@(.+)$/)
-        ## check if the previous read is valid (quality exists and same size as adn)
-        aQualSize = (qual.nil?) ? 0 : qual.size
-        if (aQualSize > 0)
-          if ((not adn.nil?) and aQualSize == adn.size)
-            aReadCount += 1
-            aBaseCount += aQualSize
-            # then write this indexed read
-            out_fq.print "@#{aReadCount}\n#{adn}\n+\n#{qual}\n"
-            adn = nil
-          end
-        end
-        seqId = $1
-        ## NB: quality line might start with @
-      end
-      aCpt += 1
-      if (line == "+" or line == "+#{seqId}")
-        qual = nil
-        aCpt = 3
-        # previous line was adn
-        adn = prev
-      end
-      if aCpt == 4
-        qual = line
-      end
-      prev = line
-    end
-    fd.close if not fd.closed?
-    #in_fq.close if not in_fq.closed?
-    # evaluate last read
-    aQualSize = (qual.nil?) ? 0 : qual.size
-    if (aQualSize > 0)
-      if ((not adn.nil?) and aQualSize == adn.size)
-        aReadCount += 1
-        aBaseCount += aQualSize
-        out_fq.print "@#{aReadCount}\n#{adn}\n+\n#{qual}\n"
-      end
-    end
-  end
-  return [aReadCount, aBaseCount]
+  aInputLineCount = aReadCount = aBaseCount = aCpt = 0
+  adn = qual = prev = seqId = nil
+  fd = nil
+  in_fq = nil
+  # open input fastq in read mode
+  begin
+    fd = File.open(aInputFile)
+  rescue
+    STDERR.puts "Error: cannot open #{aInputFile}!"
+    exit 1
+  end
+  # fastq may be gzipped
+  is_gzip = aInputFile =~ /gz$/
+  # open output fastq in write mode
+  File.open(aOutputFile, "w") do |out_fq|
+    while not fd.eof
+      in_fq = is_gzip ? Zlib::GzipReader.new(fd) : fd
+      # read input fastq line by line
+      in_fq.each do |line|
+        line.chomp!
+        # line begins with @seqid
+        if (line =~ /^@(.+)$/)
+          ## check if the previous read is valid (quality exists and same size as adn)
+          aQualSize = (qual.nil?) ? 0 : qual.size
+          if (aQualSize > 0)
+            if ((not adn.nil?) and aQualSize == adn.size)
+              aReadCount += 1
+              aBaseCount += aQualSize
+              # then write this indexed read
+              out_fq.print "@#{aReadCount}\n#{adn}\n+\n#{qual}\n"
+              adn = nil
+            end
+          end
+          seqId = $1
+          ## NB: quality line might start with @
+        end
+        aCpt += 1
+        if (line == "+" or line == "+#{seqId}")
+          qual = nil
+          aCpt = 3
+          # previous line was adn
+          adn = prev
+        end
+        if aCpt == 4
+          qual = line
+        end
+        prev = line
+      end
+      # fix bug of gzip file concatenating multiple streams
+      if not fd.eof
+        unused = in_fq.unused
+        in_fq.finish
+        fd.pos -= unused ? unused.length : 0
+      end
+    end
+    fd.close if not fd.closed?
+    # evaluate last read
+    aQualSize = (qual.nil?) ? 0 : qual.size
+    if (aQualSize > 0)
+      if ((not adn.nil?) and aQualSize == adn.size)
+        aReadCount += 1
+        aBaseCount += aQualSize
+        out_fq.print "@#{aReadCount}\n#{adn}\n+\n#{qual}\n"
+      end
+    end
+  end
+  return [aReadCount, aBaseCount]
 end
 #------------------------------------------------------------------------------
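The "gzip file concatenating multiple streams" case handled in this change can be reproduced in isolation. The following is a minimal sketch, not code from meteor.rb: the helper name each_line_in_all_members and the two sample records are made up for illustration, and it assumes the samtools output is a sequence of gzip members written back to back. A single Zlib::GzipReader stops at the end of the first member, so the sketch rewinds the underlying IO by the bytes the reader buffered past that member (GzipReader#unused) and opens a new reader on the rest.

```ruby
require "zlib"
require "stringio"

# Hypothetical helper (not part of meteor.rb): yields each line of every
# gzip member found in io, one member after another.
def each_line_in_all_members(io)
  return enum_for(:each_line_in_all_members, io) unless block_given?

  until io.eof?
    gz = Zlib::GzipReader.new(io)
    gz.each_line { |line| yield line.chomp }
    unused = gz.unused  # bytes buffered past the end of this member, or nil
    gz.finish           # finish the reader but keep the underlying io open
    io.pos -= unused.length if unused
  end
end

# Two FASTQ records, each compressed as its own gzip member, then concatenated.
record1 = "@r1\nACGT\n+\nIIII\n"
record2 = "@r2\nTTGA\n+\nIIII\n"
payload = Zlib.gzip(record1) + Zlib.gzip(record2)

# A single reader only sees the first member.
first_only = Zlib::GzipReader.new(StringIO.new(payload)).read

# Re-opening a reader per member recovers both records.
all_lines = each_line_in_all_members(StringIO.new(payload)).to_a
```

Here first_only holds only the first record, while all_lines holds the eight lines of both records. The merged code applies the same unused / finish / pos adjustment inside its while loop, sharing one File handle across readers.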