Question: Issues With Fastq File
7.2 years ago by
United States
upendrakumar.devisetty370 wrote:

Hi, I have issues with several of my FASTQ files, and I did not notice the problem until I tried to upload the files to NCBI SRA. Several of the files contain corrupted reads. Sometimes the sequence of a read is not the same length as its quality string, and other times a read runs into the header of the next read, and so on. See below for examples of both kinds.

$> zcat RIL_2_UN_Rep5.fq.gz | grep -C 4 'HS3:229:C12DBACXX:1:2306:16681:128211'
@HS3:229:C12DBACXX:1:2306:15588:128229 1:N:0:
ACTTAGATGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCT
+
FFHHHHHJIJJJJJJIJJFIJJJJJJJJIJJJJJIJJFGGGFHH
@HS3:229:C12DBACXX:1:2306:16681:128211 1:N:0:
ACTTAGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATC
+
FFHHHHHJGGIIIGIJIJJFGIJJFJIIJJ@HS3:229:C12DBACXX:2:1101:2723:2202 1:N:0:
ACTTAGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATC


$> zcat RIL_7_UN_Rep5.fq.gz | awk 'length($0) > 45'
@HS3:229:C12DBACXX:1:2308:3094:111639 1:N@HS3:229:C12DBACXX:2:1101:2675:2212 1:N:0:
@HS3:229:C12DBACXX:4:2308:2028:396@HS3:229:C12DBACXX:5:1101:2144:2218 1:N:0:
DDFADBFGGIIII@HS3:229:C12DBACXX:6:1101:2492:2152 1:N:0:

Is there a way to remove these reads or trim them? How do I deal with these problematic files?

Thanks in advance for your help.

Upendra

reads fastq • 2.0k views
7.2 years ago by
Gabriel R.2.7k
Danmarks Tekniske Universitet
Gabriel R.2.7k wrote:

Yes, you can trim them if you want, but if I were you, I would first settle a more pressing issue: why did that happen? Go through the steps in the pipeline and figure out what went wrong, because this is not expected. Plus, if, say, a sequence has 30 bp and the quality string has 40 values, how do you know how the two fit together? Maybe the sequence corresponds to the first 30 quality values, or maybe to the last 30; you just do not know.

Also, for sanity's sake, ditch fastq and use bam.
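
If you do end up discarding the damaged records rather than guessing how sequence and quality line up, something like the following Python sketch might be a starting point. It is not a tested tool, and it rests on several assumptions that are not from your post: reads contain only A, C, G, T or N; qualities are Phred+33 ASCII; each record is exactly four lines; and a header with a second `@` in it is a merged header. The function names (`is_valid_record`, `clean_fastq`, `clean_fastq_gz`) are made up for illustration. It slides over the file one line at a time, so after a merged line it resynchronises on the next intact record instead of staying out of frame.

```python
import gzip
import re

# Assumptions: bases are A/C/G/T/N only; qualities are printable Phred+33 ASCII.
SEQ_RE = re.compile(r"[ACGTN]+")
QUAL_RE = re.compile(r"[!-~]+")


def is_valid_record(header, seq, plus, qual):
    """Return True if four consecutive lines look like one intact FASTQ record."""
    return (
        header.startswith("@")
        and "@" not in header[1:]  # assumption: a second '@' means a merged header
        and SEQ_RE.fullmatch(seq) is not None
        and plus.startswith("+")
        and QUAL_RE.fullmatch(qual) is not None
        and len(seq) == len(qual)
    )


def clean_fastq(lines, out):
    """Copy intact records from `lines` to `out`; return (records_kept, lines_dropped)."""
    window = []
    kept = dropped = 0
    for line in lines:
        window.append(line.rstrip("\r\n"))
        if len(window) < 4:
            continue
        if is_valid_record(*window):
            out.write("\n".join(window) + "\n")
            kept += 1
            window = []
        else:
            # Slide forward by one line to resynchronise on the next intact record.
            window.pop(0)
            dropped += 1
    dropped += len(window)  # trailing lines that never formed a full record
    return kept, dropped


def clean_fastq_gz(in_path, out_path):
    """Gzip-to-gzip wrapper around clean_fastq."""
    with gzip.open(in_path, "rt") as src, gzip.open(out_path, "wt") as dst:
        return clean_fastq(src, dst)
```

You would call it as `clean_fastq_gz("RIL_2_UN_Rep5.fq.gz", "RIL_2_UN_Rep5.clean.fq.gz")` (the output name is just an example) and look at the returned counts; a large number of dropped lines is a hint about how widespread the damage is. If the data are paired-end, the same reads would also have to be removed from the mate file to keep the pairs in sync.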


Thanks for the suggestion. It's a lot of work, but I guess I have to do it to check what went wrong...

Reply written 7.2 years ago by upendrakumar.devisetty