Question: How to use a fastq file from paired library layout
thustar wrote (2.8 years ago):

Hi biostars!

I want to assemble reads from a real dataset such as http://sra.dnanexus.com/studies/ERP000108/runs. However, I am confused by the description of accession ERR011087. It says "library layout: paired", and I do not know what "paired" means here.

The first 24 lines of ERR011087.fastq are:

@ERR011087.1 I330_1_FC30JM6AAXX:4:1:0:199 length=88
TTCANATATGGAAAAACAGGGAGCGGAAATCACGTTACTTGCGTATCATCGGAAAAGGCAGGCTGTCCATGCTCCAACCGGTTAATGA
+ERR011087.1 I330_1_FC30JM6AAXX:4:1:0:199 length=88
IIII"9I;III<+<-45CI13;-=93+046/0<1:-06>4.2+4:I86III0.863;GA@7I:5./2$62110='0(2(0$+++&+(
@ERR011087.2 I330_1_FC30JM6AAXX:4:1:0:242 length=88
ACAANCTTCTCAATCTCGGTCTTTTTCTTGGGGAACTCCTTGGTAATAGAACTTGGAACACAGTCCTTGGATGAATACCGTTCTTTTG
+ERR011087.2 I330_1_FC30JM6AAXX:4:1:0:242 length=88
@?;+"IIIIIIF+FII@9<16I<<bd+b6+4>1&&4%-08)/$$+III4.I@III3CIE:,@+04>8799H015./21/@/51791
@ERR011087.3 I330_1_FC30JM6AAXX:4:1:0:394 length=88
ATCANTTTCACTCAAACCATTAATAACATCTACCTGGTTCTTCAGGCTTCGATTCGTTTAAGGGTGATCAAGAGGCAATCATCAGAAA
+ERR011087.3 I330_1_FC30JM6AAXX:4:1:0:394 length=88
[quality string garbled in the source page]
@ERR011087.4 I330_1_FC30JM6AAXX:4:1:0:438 length=88
ACCANCAATATCGGTAACAGTACCCGTCTTGGAACCCTTAACCTGAAGATTGATGGCTTTGGCAGCTTTGGCAACTGGCGTTGCTTTG
+ERR011087.4 I330_1_FC30JM6AAXX:4:1:0:438 length=88
[quality string garbled in the source page]
@ERR011087.5 I330_1_FC30JM6AAXX:4:1:0:740 length=88
ACTGNTCTTTGGCATGGCTCATGAGCATTCCCATCTTGTTTGTCAGCCAGATAGGTGCCAACAACCACCGTCTTGAAGTTTCTACCAT
+ERR011087.5 I330_1_FC30JM6AAXX:4:1:0:740 length=88
[quality string garbled in the source page]
@ERR011087.6 I330_1_FC30JM6AAXX:4:1:0:753 length=88
ATGANCGCTATGCATGATGATACGACTGTTTTTGTCGCGCGCCTCAGCGTGTGCACCTTTACGCCCAGATATGACGCGACAGCGTTGG
+ERR011087.6 I330_1_FC30JM6AAXX:4:1:0:753 length=88
IIII"IIII3I=I6I=5I18I+;:+4959A&0>&,++(&(-(,90IIIAB;IA;IDIIIF;@G56:+9=?034,0+210'+204+&

I do not see anything that suggests any kind of pairing in the file. Does this file contain only single-end reads, or paired-end reads? If it contains paired-end reads, how can I figure out which two reads belong to the same pair?

Thanks.

Jenez (Sweden) wrote (2.8 years ago):

To answer your first question:

'Library layout: Paired' simply means that it is a paired-end sequencing run. If you look up ERR011087 at ENA (which in my opinion is simpler to use for raw read downloads), this is what you find: http://www.ebi.ac.uk/ena/data/view/ERR011087&display=html. You can see that two fastq files are listed there.

I'm guessing that you retrieved yours through NCBI SRA with fastq-dump? You have to be careful here and explicitly specify that you want both reads of each pair extracted, as I believe the default is not to do so.
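
Once you have two fastq files for the run, the mates are matched by position and by name: record N of the first file pairs with record N of the second. A minimal Python sketch of that check is below. The file names ERR011087_1.fastq and ERR011087_2.fastq, the helper names, and the toy sequences are assumptions for illustration, and the sketch assumes plain four-line FASTQ records without line wrapping.

```python
import io
from itertools import zip_longest


def read_names(handle):
    """Yield the read name (first token after '@') of each 4-line FASTQ record."""
    for i, line in enumerate(handle):
        if i % 4 == 0:  # header lines sit at positions 0, 4, 8, ...
            if not line.startswith("@"):
                raise ValueError(f"line {i + 1} is not a FASTQ header: {line!r}")
            yield line[1:].split()[0]


def strip_mate_suffix(name):
    """Drop a trailing /1 or /2 so that both mates compare equal."""
    return name[:-2] if name.endswith(("/1", "/2")) else name


def files_are_paired(handle_1, handle_2):
    """Return True if both files list the same read names in the same order."""
    for a, b in zip_longest(read_names(handle_1), read_names(handle_2)):
        if a is None or b is None:
            return False  # one file holds more reads than the other
        if strip_mate_suffix(a) != strip_mate_suffix(b):
            return False
    return True


# Toy data (made-up sequences); real use would open the two files instead,
# e.g. open("ERR011087_1.fastq") and open("ERR011087_2.fastq") (assumed names).
mate_1 = "@ERR011087.1/1\nACGT\n+\nIIII\n@ERR011087.2/1\nTTGA\n+\nIIII\n"
mate_2 = "@ERR011087.1/2\nGGCA\n+\nIIII\n@ERR011087.2/2\nCATG\n+\nIIII\n"
print(files_are_paired(io.StringIO(mate_1), io.StringIO(mate_2)))
```

The header position is taken from the line index rather than from a leading '@', because a quality line can also start with '@'.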


Yes, I used fastq-dump to convert the SRA file to a fastq file.

Do you mean that if I run fastq-dump with its default options, the output will be wrong?

(reply by thustar, 2.8 years ago)

It depends on how you ran fastq-dump. For paired-end Illumina data you would want to use the --split-files option to get the two paired-end files.

(reply by genomax, 2.8 years ago)
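
To illustrate what splitting does, here is a rough Python sketch that cuts one concatenated record into its two mates. As far as I know, fastq-dump without a split option writes both mates of a spot as a single sequence, which would be consistent with length=88 in the file shown in the question. The per-mate read length is not stated in this thread, so it is passed in as a parameter; the function name and the toy record are made up for illustration. Re-running fastq-dump with --split-files remains the safer route, since it takes the mate boundaries from the SRA file itself.

```python
def split_record(header, seq, qual, read_len):
    """Split one concatenated FASTQ record into two mates of read_len bases each."""
    if len(seq) != len(qual):
        raise ValueError("sequence and quality lengths differ")
    if len(seq) != 2 * read_len:
        raise ValueError(f"expected {2 * read_len} bases, got {len(seq)}")
    name = header[1:].split()[0]  # read name without '@' and description
    mate_1 = (f"@{name}/1", seq[:read_len], "+", qual[:read_len])
    mate_2 = (f"@{name}/2", seq[read_len:], "+", qual[read_len:])
    return mate_1, mate_2


# Toy record: 8 bases standing in for a concatenated pair of two 4-base mates.
first, second = split_record("@ERR011087.1 toy length=8", "ACGTTTGA", "IIIIHHHH", 4)
print("\n".join(first))
print("\n".join(second))
```

The sketch assumes both mates have the same length and refuses records that do not match, so it fails loudly instead of cutting a read in the wrong place.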
qingxiangg wrote (2.8 years ago):

Simple answer to your question,

ERR011088 is the pair file of ERR011087

ERR011090 is the pair file of ERR011089

Check the design description: paired-end files share the same sample, e.g. "Solexa sequencing of MetaHit individual MH0001 random pair end library" for both ERR011088 and ERR011087.


Oh, really? I do not agree with you.

Could you please show me the link to the description? I did not notice that.

(reply by thustar, 2.8 years ago)