Question: Parsing a FASTQ File
7 months ago, mrsmith wrote:

I am relatively new to the field, and I could desperately use some help.

I am trying to process a FASTQ file using DADA2, but I would really like to separate the forward and reverse reads for each sample out of a very large FASTQ file. The file was initially a large FASTA file, and I have already trimmed it to remove the primers and barcodes using qiime1, and I still have the mapping file. I then converted the file from FASTA to FASTQ using qiime1, but I'm really at a loss as to what I should do next.


I do not understand either. How can a file originally have been a fasta file, and then a fastq file? Where do the quality encodings come from? But if you simply have a paired-end fastq file with both reads in the same file (this is called interleaved) and want to deinterleave it into two separate files, here are some inspirations.

written 7 months ago by ATpoint
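
To illustrate the deinterleaving mentioned in the comment above, here is a minimal Python sketch. It is not taken from the thread. It assumes strict 4-line FASTQ records with each forward read immediately followed by its mate, and the function names `read_fastq_records` and `deinterleave` are made up for this example.

```python
from itertools import islice


def read_fastq_records(handle):
    """Yield FASTQ records as lists of 4 lines (header, sequence, '+', quality)."""
    while True:
        record = list(islice(handle, 4))
        if not record:
            return
        if len(record) != 4 or not record[0].startswith("@"):
            raise ValueError("input does not look like 4-line FASTQ")
        yield record


def deinterleave(interleaved_path, r1_path, r2_path):
    """Write alternating records to r1_path and r2_path; return the pair count."""
    pairs = 0
    with open(interleaved_path) as src, \
            open(r1_path, "w") as r1, open(r2_path, "w") as r2:
        records = read_fastq_records(src)
        for forward in records:
            # Assumption: the mate of each forward read is the very next record.
            reverse = next(records, None)
            if reverse is None:
                raise ValueError("odd number of records: file is not interleaved")
            r1.writelines(forward)
            r2.writelines(reverse)
            pairs += 1
    return pairs
```

A call such as `deinterleave("interleaved.fastq", "R1.fastq", "R2.fastq")` (placeholder file names) would write the two files. The sketch does not check that read names match within a pair, so it is only safe on a file that really is interleaved.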

I am sorry but the question is not clear to me. What do you want to achieve?

And are you talking about demultiplexing?

written 7 months ago by Nitin Narwade

Qiime1 has a script, split_sequence_file_on_sample_ids.py, which separates fastq or fasta files that were demultiplexed using split_libraries.py into one file per sample. But this will not separate forward reads from reverse reads if your forward and reverse reads are all in one file.

written 7 months ago by mastal511
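
For readers who want to see what per-sample splitting amounts to, here is a rough Python sketch. It is not the Qiime1 script and not its implementation. It assumes 4-line FASTQ records whose headers start with `@<SampleID>_<number>`, which is the label style of demultiplexed Qiime1 output. The function names are hypothetical.

```python
import os
from itertools import islice


def sample_id_from_header(header):
    """Return the sample ID from a header such as '@SampleA_12 extra text'."""
    label = header[1:].split()[0]     # drop '@' and anything after whitespace
    return label.rsplit("_", 1)[0]    # drop the trailing '_<number>'


def split_fastq_by_sample(fastq_path, out_dir):
    """Append each 4-line record to <out_dir>/<sample_id>.fastq; return counts."""
    os.makedirs(out_dir, exist_ok=True)
    counts = {}
    with open(fastq_path) as src:
        while True:
            record = list(islice(src, 4))
            if not record:
                break
            sample = sample_id_from_header(record[0])
            # Reopening per record keeps the sketch simple and avoids holding
            # one open handle per sample; it is slow for very large inputs.
            with open(os.path.join(out_dir, sample + ".fastq"), "a") as out:
                out.writelines(record)
            counts[sample] = counts.get(sample, 0) + 1
    return counts
```

Because the output files are opened in append mode, the output directory should be empty before a run. As the comment above says, this kind of split separates samples only; forward and reverse reads that share a file stay together.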
7 months ago, swbarnes2 (United States) wrote:

Converting a fastq to a fasta results in a total loss of the quality scores. You are going to need the original quality scores to call variants.

So stop playing around with fastas, and get the original fastqs. The originals will also have read1 and read2 separate.
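
A small Python sketch of why the round trip cannot be undone. It is illustrative only: the record tuples, the function names, and the constant placeholder quality `"I"` are assumptions for this example, and the scores are decoded as Phred+33.

```python
def phred33(quality):
    """Decode a Phred+33 quality string into integer scores."""
    return [ord(c) - 33 for c in quality]


def fastq_to_fasta(record):
    """Convert a (header, sequence, quality) record to (header, sequence)."""
    header, sequence, _quality = record   # the quality string is discarded here
    return header, sequence


def fasta_to_fastq(record, placeholder="I"):
    """Rebuild a FASTQ record from FASTA using a constant, made-up quality."""
    header, sequence = record
    return header, sequence, placeholder * len(sequence)


original = ("read1", "ACGT", "I#5!")
round_trip = fasta_to_fastq(fastq_to_fasta(original))

print(phred33(original[2]))     # [40, 2, 20, 0]
print(phred33(round_trip[2]))   # [40, 40, 40, 40]
```

The per-base scores of the original record are gone after the conversion, and any quality string attached afterwards carries no information about the actual sequencing errors.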

7 months ago, Dattatray Mongad (National Centre for Cell Science, Pune) wrote:

Some points to clarify first:

  1. If you have single FASTQ files then your data is not paired-end.
  2. You are talking about separating reads. Do you mean demultiplexing, i.e. separating the reads of each sample? DADA2 assumes that you have demultiplexed FASTQ files.
  3. DADA2 needs raw FASTQ files to detect variants.

For more information, please refer to the DADA2 tutorial.


If you have single FASTQ files then your data is not paired-end.

Interleaved FQ files do indeed exist. See my comment above.

written 7 months ago by ATpoint