Can adapters be present when all reads are 100 b?
9.0 years ago
Siva ▴ 20

Hello guys,

I have two files, read1.fastq and read2.fastq. After mapping them with bwa (each read is 100 b), I checked the bamtools stats and got this:

***********************************************
Stats for BAM file(s):
***********************************************

Total reads:       820130832
Mapped reads:      812921517    (99.121%)
Forward strand:    418786804    (51.0634%)
Reverse strand:    401344028    (48.9366%)
Failed QC:         0    (0%)
Duplicates:        0    (0%)
Paired-end reads:  820130832    (100%)
'Proper-pairs':    342545373    (41.7672%)
Both pairs mapped: 787752117    (96.052%)
Read 1:            431818871
Read 2:            388311961
Singletons:        25169400     (3.06895%)
Average insert size (absolute value): 309.402
Median insert size (absolute value): 241

So I suppose my Illumina HiSeq reads have already been cleaned and the files contain no adapters. Am I right?

Regards,
Siva

illumina hiseq reads adaptors • 2.9k views

Oh, and the mapping is good over the 100 bases, so I really suppose there are no adapters...

HWI-ST1194:55:C11P5ACXX:4:2316:21329:100923     83      ScaffXRQ8f0000271       13416   55      101M    =       13340   -177    TAATTGTGTTAGGAAAATCTTTAGGTGGAGTGAAGATTTTAAGGGGCAAAAGCTACATTTGATGGCTGGCGAGTTGATATCGAAACTTGAGAACAAAGTAA    @A:5A>DDC@DCC>CCC>>>;;B@;;=3HE;FGGGG>@>C=2;8)IGIHCD<B?*ACGGBBG>HF<F<EHDHHECBFAGEJHHGGBC@FHBHFFFFFFCCC   NM:i:2  AS:i:91 XS:i:81 RG:Z:dnaseq0
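
One way to put a number on "the mapping is good over the whole read" is to look at the CIGAR field (`101M` in the record above) and count soft-clipped bases, since aligners that soft-clip tend to clip adapter read-through as `S` operations. The sketch below is only illustrative: the function names are made up for this example, and it assumes the aligner was run in a mode that soft-clips, which the thread does not state.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clipped_bases(cigar):
    """Return the number of soft-clipped (S) bases in a CIGAR string."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op == "S")


def read_length_from_cigar(cigar):
    """Return the read length implied by a CIGAR string."""
    # Operations that consume read bases: M, I, S, = and X.
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "MIS=X")


# CIGAR of the record shown above: every base is aligned, nothing is clipped.
print(soft_clipped_bases("101M"), read_length_from_cigar("101M"))

# Hypothetical CIGAR for a read whose last 21 bases did not align.
print(soft_clipped_bases("80M21S"), read_length_from_cigar("80M21S"))
```

A single clean record does not say much about 820 million reads, so a check like this would need to be run over a large sample of the BAM file.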
9.0 years ago
PoGibas 5.1k

Before every analysis I run FastQC (per base sequence content) to visually inspect if there are any adapter sequences in my raw fastq.


After running FastQC, I can see that "Adapter content" is green and "Per base sequence content" is orange, which I don't think is a problem. However, "Kmer content" is red, and the most frequent k-mers are at the beginning of the reads, at positions 1 to 15.

Could these be adapters?

Sequence,Count,PValue,Obs/Exp Max,Max Obs/Exp Position
AGAGCAC,71845,0.0,10.088463,9
TGCCGTC,54690,0.0,9.652958,48-49
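
A quick way to follow up on enriched k-mers like these is to test whether they occur inside known adapter sequences, on either strand. The sketch below is a rough illustration, not a replacement for a dedicated trimming tool: the two adapter sequences are the commonly published TruSeq read-through sequences and are an assumption (the thread does not say which library kit was used), and `adapter_hits` is a name invented for this example. A 7-mer is short enough to occur in a genome by chance, so a hit is only suggestive.

```python
# Assumption: TruSeq-style read-through adapter sequences.
# These are not taken from the thread; swap in the sequences for your own kit.
ADAPTERS = {
    "TruSeq read 1": "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA",
    "TruSeq read 2": "AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def adapter_hits(kmer, adapters=ADAPTERS):
    """Return the names of adapters containing the k-mer on either strand."""
    hits = []
    for name, adapter in adapters.items():
        if kmer in adapter or reverse_complement(kmer) in adapter:
            hits.append(name)
    return hits


# The two k-mers reported by FastQC above.
for kmer in ("AGAGCAC", "TGCCGTC"):
    print(kmer, adapter_hits(kmer))
```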

Can you post the per base sequence content image?


This is OK. When there are adapters at the beginning of your reads, you usually see ~100% sequence content for a single base at those positions (in the image below, the first 5 bases are clearly adapter). To get a better idea of your reads, you can use --nogroup in FastQC.

< image not found >
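
The idea behind the per base sequence content plot can be sketched in a few lines: for each read position, compute the fraction of each base across all reads, then flag positions where one base dominates. This is a minimal illustration on toy reads, not FastQC's own implementation; the function names and the 0.9 threshold are arbitrary choices made for this example.

```python
from collections import Counter


def per_base_content(reads):
    """Return, for each position, the fraction of A, C, G and T across reads."""
    length = min(len(read) for read in reads)  # ignore overhang of longer reads
    content = []
    for i in range(length):
        counts = Counter(read[i] for read in reads)
        content.append({base: counts[base] / len(reads) for base in "ACGT"})
    return content


def fixed_positions(reads, threshold=0.9):
    """Return 0-based positions where a single base reaches the threshold."""
    return [
        i
        for i, fractions in enumerate(per_base_content(reads))
        if max(fractions.values()) >= threshold
    ]


# Toy reads: the first four bases are identical in every read (adapter-like),
# the remaining bases vary from read to read.
toy_reads = ["ACGTAGGA", "ACGTCTTC", "ACGTGACA", "ACGTTCAG"]
print(fixed_positions(toy_reads))  # [0, 1, 2, 3]
```

On real data the reads would come from the FASTQ file, and a subsample of a few hundred thousand reads is usually enough to see a fixed prefix.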


Great! Thanks Pgibas :)


If this helped to answer your question, you can accept it :-)

9.0 years ago
Siva ▴ 20

Hmm, actually I had seen this post: Finding Adapters For Illumina Reads

and that's why I hadn't tried FastQC: I thought it would only report "adapter contamination" and not the default adapters used when making libraries.

I'll try FastQC, thanks Pgibas.
