Three reads with the same name in the BAM file
8.1 years ago
64zqor • 0

Hi all,

I am working with a paired-end BAM file and getting many warnings like this:

WARNING: Could not find pair for HWI-ST430:177:2:1:4979:15503#0
WARNING: Could not find pair for HWI-ST430:177:2:1:5127:13427#0
WARNING: Could not find pair for HWI-ST430:177:2:1:6521:21452#0

I checked the reads named in the warnings and found that each of them has three records with the same name in the BAM file. For example:

 **HWI-ST430:177:2:1:4979:15503#0** 65  chr32   26100696    60  79M21S  chr5    36697147    0   ACTTTGCAATTTAAGTTTTACTTACTTTTTAACTAATATACATGCCTAAAATTTACAAAAACAATAATAAAAACAACAGAACACTGGAAACATTTTTAAA    >;=<>=<<=======<====;===;=======<=>>>>>><=>>==>>>>=>>>>==>?>=<<==>?>>>?>?==><=?>><=<>>>?>?=>??>?===>    BD:Z:FFHFCIKKIHG@EEEHF??DGGEDGGE???DEEGGEFFFFGDHHHHGGE??FF?DGDG???EDGFGFGGF@@@FEHFEIEGFEEIJJIHBHGLJDD@EF@   MD:Z:79 PG:Z:MarkDuplicates RG:Z:Basenji    BI:Z:FFIECHGIHFEAFEEHEAAFFHDFFHDAAAFEEIHFGGHGGGHHGHHHFBBGFBGGGHBBBFGHGGFGGFBBBGHIGHJGHGHFKJJJJEIKLJGHBGFB   NM:i:0  AS:i:79 XS:i:19

 **HWI-ST430:177:2:1:4979:15503#0** 129 chr5    36697147    60  72M28S  chr32   26100696    0   ATTTGCCCCTGGGCTATTTTTTTCCTNCCATGTAAGATTCCGTTTTAAAAATGTTTCCAGTGTTCTGTTGTTTTTATTATTGTTTTTGTAAATTTTAGGC    ===<=<<<<====<=>========<<!<<<=><<=>>>>>=5=>>>>>>>>>>=>>>==>=>=>>>>=?>=>>>>>>>>=?>=>>>?>>>??>??>;<=>    SA:Z:chr32,26100739,-,36M64S,60,0;  BD:Z:FFG@JKKFFHIIEHIGFF?????EGGEEEGHHEGEEDGFEGEGF??DE???FHEF?EGGHIFFGFEIFGGFG@@@EGGEGGGFHAAAHGJHBJJDDEHHI   MD:Z:26T37T7    PG:Z:MarkDuplicates RG:Z:Basenji    BI:Z:FFFBHHHFFHGGDGHGGEAAAAADFGEEEIHHGHFFFGFEGHHFBBGFBBBGHGFBEGIIIFGFEFHGFHHGCCCHIGHIGHHGDDDIIKIFKJGHGHGH   NM:i:2  AS:i:65 XS:i:21

 **HWI-ST430:177:2:1:4979:15503#0** 401 chr32   26100739    60  36M64H  =   26100696    -79 GCCTAAAATTTACAAAAACAATAATAAAAACAACAG    ===<=>>=>>===>===<=>===========>;===    SA:Z:chr5,36697147,+,72M28S,60,2;   BD:Z:IHHE??FF?EGEF???FEFFFDFGE@@AHHIJFIFF   MD:Z:36 PG:Z:MarkDuplicates RG:Z:Basenji    BI:Z:HGHGBBFFAEGFFAAAEFFEGFEGFABBFGHGGHFF   NM:i:0  AS:i:36 XS:i:22

The BAM file contains HiSeq reads aligned to the reference genome with bwa. Picard was used to remove duplicates, and base realignment was done with GATK.

My questions are: 1. Why are there three reads with the same name that seem to have no relation to each other? 2. Maybe the first two are treated as mates and the third as a single read. If so, can I just ignore it?

Could anyone help me? Many thanks for your help!

sequencing next-gen alignment genome Assembly • 2.9k views

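Your 3 reads have flags 65, 129 and 401. The first and second are "paired, first in pair" and "paired, second in pair". The third (401 = 1 + 16 + 128 + 256) is an additional, non-primary alignment of the second read: it has the 0x100 bit set, and its SA tag points back to the chr5 alignment, so it is the other part of a split (chimeric) alignment. Strictly speaking, a "supplementary alignment" would carry 0x800; bwa mem marks split hits as secondary (0x100) instead when run with -M, which is probably what happened here. Either way, this is not an error.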
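To see which bits are set in each flag, a small decoder helps. This is a minimal sketch in plain Python; the bit values come from the SAM specification, while the short names in `FLAG_BITS` and the function name `decode_flag` are just labels chosen here for illustration.

```python
# SAM flag bits as defined in the SAM specification.
# The short names are illustrative labels, not official identifiers.
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM flag, lowest bit first."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


for flag in (65, 129, 401):
    print(flag, decode_flag(flag))
# 65 ['paired', 'first_in_pair']
# 129 ['paired', 'second_in_pair']
# 401 ['paired', 'reverse', 'second_in_pair', 'secondary']
```

A tool that looks for exactly one mate per read name can trip over the extra 401 record unless non-primary alignments (0x100 or 0x800) are skipped first.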


Hello Alphabet!

It appears that your post has been cross-posted to another site: http://stackoverflow.com/questions/36274708

This is typically not recommended as it runs the risk of annoying people in both communities.

8.1 years ago

The warnings can mean that your paired fastq files are not synchronized.

That is, you have paired reads, and one of the reads cannot find its mate in the same file (if you have a single combined file with both reads) or in the corresponding mate file.

This can happen after stringent trimming of the sequences with some trimmer programs. If you use cutadapt, trimmomatic or BBDuk on both files of a pair, they manage the trimming process to avoid this.
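One way to test for this is to walk both fastq files in parallel and compare read names. The sketch below assumes well-formed 4-line fastq records and that mates share a name apart from an optional `/1` or `/2` suffix; the function names and any file paths are placeholders, not part of any tool mentioned above.

```python
import gzip
from itertools import zip_longest


def normalize(header):
    """Strip the leading '@', any comment after whitespace, and a /1 or /2 suffix."""
    name = header.strip().split()[0]
    if name.startswith("@"):
        name = name[1:]
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def read_names(path):
    """Yield normalized read names from a fastq file (plain text or .gz)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 0:  # header line of each 4-line record
                yield normalize(line)


def first_mismatch(names1, names2):
    """Return (record_index, name1, name2) for the first out-of-sync pair, or None."""
    for idx, (a, b) in enumerate(zip_longest(names1, names2)):
        if a != b:
            return idx, a, b
    return None
```

With two hypothetical files, `first_mismatch(read_names("sample_R1.fastq.gz"), read_names("sample_R2.fastq.gz"))` returns `None` when the files are in sync, and otherwise the index and names of the first record where they diverge (a name of `None` means one file ended early).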
