Question: Life cycle of a fastq file
2
edrezen (720), France, wrote 3.9 years ago:

Hello,

I am curious about the life cycle of NGS files (such as fastq files) once they have been generated. I suppose the answer depends strongly on the biological field, but are there some general trends?

My questions are:

  • What is the first tool you use to process a brand-new fastq file?
  • When your data analysis is finished, what is the last thing you do with the file? Store it somewhere forever? Get rid of it?

Thank you.

next-gen fastq • 1.2k views
modified 3.9 years ago by reza.jabal (300) • written 3.9 years ago by edrezen (720)
6
Devon Ryan (88k), Freiburg, Germany, wrote 3.9 years ago:

The first thing I use is either FastQC or a trimmer, generally a trimmer (I typically run FastQC on the trimmed file). I'm increasingly becoming a fan of simply deleting fastq files after mapping. BamHash (I have a fork with an extra tool that I find useful) is really handy in this regard, since it lets you verify that the fastq files can be reconstructed from the alignment files.
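To make the "can the fastq be reconstructed" check concrete, here is a minimal Python sketch of the general idea: compute a digest over all reads that does not depend on read order, then compare the digest of the original fastq with the digest of the reads extracted from the alignment file. This is an illustration under stated assumptions, not BamHash's actual implementation; the function names `parse_fastq` and `read_set_digest` are hypothetical, and the choice of SHA-256 truncated to 64 bits is arbitrary.

```python
import hashlib
import io
from typing import Iterable, Iterator, TextIO, Tuple

Read = Tuple[str, str, str]  # (read ID, sequence, quality string)


def parse_fastq(handle: TextIO) -> Iterator[Read]:
    """Yield (read ID, sequence, quality) from a 4-line-per-record fastq stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # tolerate stray blank lines between records
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        # Keep only the read ID: drop the leading '@' and any description text.
        read_id = header[1:].split()[0]
        yield read_id, seq, qual


def read_set_digest(reads: Iterable[Read]) -> Tuple[int, int]:
    """Return (digest, read count). Summing per-read hashes modulo 2**64
    makes the digest independent of the order of the reads."""
    total = 0
    count = 0
    for read_id, seq, qual in reads:
        h = hashlib.sha256(f"{read_id}\t{seq}\t{qual}".encode()).digest()
        total = (total + int.from_bytes(h[:8], "big")) % (1 << 64)
        count += 1
    return total, count


if __name__ == "__main__":
    # Hypothetical toy data: the same two reads in a different order.
    original = "@r1\nACGT\n+\nIIII\n@r2\nGGCC\n+\nFFFF\n"
    shuffled = "@r2\nGGCC\n+\nFFFF\n@r1\nACGT\n+\nIIII\n"
    same = (read_set_digest(parse_fastq(io.StringIO(original)))
            == read_set_digest(parse_fastq(io.StringIO(shuffled))))
    print("digests match:", same)
```

When the second set of reads comes from a BAM file, reads mapped to the reverse strand are stored reverse-complemented (with the quality string reversed), so they need to be restored to their original orientation before hashing. The comparison also only makes sense if the aligner kept every read, including unmapped ones, and did not hard-clip or otherwise alter them.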

written 3.9 years ago by Devon Ryan (88k)

Oh thanks, I didn't know about BamHash.

written 3.9 years ago by edrezen (720)

I wouldn't have either were it not for Twitter :)

written 3.9 years ago by Devon Ryan (88k)
3
reza.jabal (300), United Kingdom, wrote 3.9 years ago:

This paper thoroughly answers your questions:

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4179624/

modified 3.9 years ago • written 3.9 years ago by reza.jabal (300)
1

Thanks for the link, it seems interesting.

Note that I am also curious about fields other than exome sequencing. For instance, people who work in metagenomics may also have interesting insights.

written 3.9 years ago by edrezen (720)
1
Nicolas Rosewick (7.3k), Belgium, Brussels, wrote 3.9 years ago:

It really depends on the library type (RNA-Seq, ChIP-Seq, DNA-Seq, ...), but you could use a quality control tool such as FastQC to check your fastq files.
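For a rough idea of what a first sanity check on a fastq file looks at, here is a small Python sketch that reports the read count, mean read length and mean base quality. It is only an illustration and in no way a replacement for FastQC; the function names `open_fastq` and `summarize_fastq` are hypothetical, and it assumes 4-line records with Phred+33 quality encoding.

```python
import gzip
import io
from typing import Dict, TextIO


def open_fastq(path: str) -> TextIO:
    """Open a plain or gzipped fastq file in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def summarize_fastq(handle: TextIO, phred_offset: int = 33) -> Dict[str, float]:
    """Compute basic statistics, assuming strict 4-line fastq records."""
    n_reads = 0
    n_bases = 0
    qual_sum = 0
    for i, line in enumerate(handle):
        text = line.rstrip("\n")
        if i % 4 == 1:  # sequence line
            n_reads += 1
            n_bases += len(text)
        elif i % 4 == 3:  # quality line
            qual_sum += sum(ord(c) - phred_offset for c in text)
    return {
        "reads": n_reads,
        "mean_length": n_bases / n_reads if n_reads else 0.0,
        "mean_quality": qual_sum / n_bases if n_bases else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical toy input; for a real file use open_fastq("reads.fastq.gz").
    demo = "@r1\nACGT\n+\nIIII\n@r2\nGG\n+\n!!\n"
    print(summarize_fastq(io.StringIO(demo)))
```

Whatever the library type, a quick summary like this mostly helps catch truncated or corrupted files; per-position quality, adapter content and duplication levels are what a dedicated QC tool adds on top.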

written 3.9 years ago by Nicolas Rosewick (7.3k)
1
Madelaine Gogol (5.0k), Kansas City, wrote 3.9 years ago:

+1 for FastQC.

As far as the life cycle goes, our institute has a systematically organized file system holding many of the old gzipped fastq files. Occasionally, people want to revisit the old data, and it's one way to ensure you're really starting from the raw data. I'm not sure how long this will continue, but it's what we're doing for now.

written 3.9 years ago by Madelaine Gogol (5.0k)