Question: Parsing a FASTQ File
mrsmith wrote (12 months ago):

I am relatively new to the field, and I could desperately use some help.

I am trying to process a FASTQ file using DADA2, but I would really like to separate the forward and reverse reads for each sample out of one very large FASTQ file. The file was initially a large FASTA file. I have already trimmed it to remove the primers and barcodes using QIIME 1, and I still have the mapping file. I then converted the file from FASTA to FASTQ using QIIME 1, but I'm really at a loss as to what I should do next.


I do not understand either. How can a file originally have been a FASTA file and then become a FASTQ file? Where do the quality scores come from? But if you simply have a paired-end FASTQ file with both reads in the same file (that is called interleaved) and you want to deinterleave it into two separate files, here are some inspirations.
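As a rough illustration of what deinterleaving means, here is a minimal Python sketch. It assumes every record is exactly four lines and that the reads strictly alternate (read 1, read 2, read 1, ...). The function name `deinterleave` is made up for this example and is not part of DADA2 or QIIME.

```python
from itertools import islice


def deinterleave(lines):
    """Split interleaved FASTQ lines into (read1_lines, read2_lines).

    Assumption: each record is exactly 4 lines and records strictly
    alternate between read 1 and read 2.
    """
    it = iter(lines)
    read1, read2 = [], []
    n_records = 0
    while True:
        record = list(islice(it, 4))  # take the next 4-line record
        if not record:
            break
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        # Even-numbered records go to read 1, odd-numbered to read 2.
        (read1 if n_records % 2 == 0 else read2).extend(record)
        n_records += 1
    if n_records % 2 != 0:
        raise ValueError("odd number of records; not cleanly interleaved")
    return read1, read2
```

To use it on a real file, pass an open file handle as `lines` and write the two returned lists to separate output files (for example with `writelines`). For a very large file you would probably want to stream to the two outputs instead of holding everything in memory.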

Reply written 12 months ago by ATpoint

I am sorry but the question is not clear to me. What do you want to achieve?

And are you talking about demultiplexing?

Reply written 12 months ago by Nitin Narwade

QIIME 1 has a script, split_sequence_file_on_sample_ids.py, which splits FASTQ or FASTA files that were demultiplexed with split_libraries.py into separate files, one per sample. However, it will not separate forward reads from reverse reads if your forward and reverse reads are all in one file.
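To show the idea of splitting by sample, here is a small Python sketch. It is not the QIIME script itself. It assumes the sequence labels look like `SampleID_N` followed by a space and the original header, and that each FASTQ record is exactly four lines; the function names are invented for this example.

```python
from collections import defaultdict


def sample_id_from_header(header):
    """Extract the sample ID from a header line.

    Assumption: the label is 'SampleID_N', so the sample ID is everything
    before the last underscore of the first whitespace-separated token.
    """
    label = header[1:].split()[0]
    sample_id, _, _ = label.rpartition("_")
    return sample_id or label  # fall back to the whole label if no "_"


def group_fastq_by_sample(lines):
    """Group 4-line FASTQ records by sample ID; returns {sample_id: [record, ...]}."""
    records = [line.rstrip("\n") for line in lines]
    groups = defaultdict(list)
    for i in range(0, len(records) - 3, 4):
        record = records[i:i + 4]
        groups[sample_id_from_header(record[0])].append(record)
    return dict(groups)
```

Each group could then be written to its own file. As noted above, this kind of split says nothing about which records are forward and which are reverse.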

Reply written 12 months ago by mastal511
swbarnes2 (United States) wrote (12 months ago):

Converting a fastq to a fasta results in a total loss of the quality scores. You are going to need the original quality scores to call variants.

So stop playing around with fastas, and get the original fastqs. The originals will also have read1 and read2 separate.

Dattatray Mongad (National Centre for Cell Science, Pune) wrote (12 months ago):

Some points to clarify first:

  1. If you have single FASTQ files then your data is not paired-end.
  2. You are talking about separating reads. Do you mean demultiplexing, i.e. separating the reads of each sample? DADA2 assumes that you have demultiplexed FASTQ files.
  3. DADA2 needs raw FASTQ files to detect variants.

For more information, please refer to the DADA2 tutorial.


If you have single FASTQ files then your data is not paired-end.

Interleaved FQ files do indeed exist. See my comment above.
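One quick way to check whether a single FASTQ file is interleaved is to compare the names of consecutive records. The Python sketch below assumes four-line records and that mates share the same first header token, optionally with a trailing `/1` or `/2`. Both function names are made up for this example, and a `True` result is only a hint, not proof.

```python
def read_name(header):
    """Return the read name: first header token, minus '@' and any /1 or /2 suffix."""
    tokens = header[1:].split()
    name = tokens[0] if tokens else ""
    if name.endswith("/1") or name.endswith("/2"):
        name = name[:-2]
    return name


def looks_interleaved(lines):
    """Heuristic: True if every consecutive pair of records shares a read name.

    Assumption: each record is exactly 4 lines, so headers sit on every
    4th line starting from the first.
    """
    headers = [line.rstrip("\n") for line in lines][0::4]
    if not headers or len(headers) % 2 != 0:
        return False
    pairs = zip(headers[0::2], headers[1::2])
    return all(read_name(a) == read_name(b) for a, b in pairs)
```

Running this on the first few thousand lines of the file is usually enough to see whether mates sit next to each other.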

Reply written 12 months ago by ATpoint