Question: Regarding filter primary alignment
0
gravatar for DL
2.3 years ago by
DL20
India
DL20 wrote:

Hi, Can anyone tell me that how to filter primary alignment from the bam file. i have paired end data and mapped the data on genome using bwa. Now, i want to filter reads that have primary alignment from bam file because i want to identify SNPs and its important that i should remove secondary align reads from the bam file. i searched for the answer but do not get satisfactory answer. So please anyone can tell me that how to get reads from bam.

Thanks & Regards Deepika

sequencing snp next-gen genome • 2.7k views
ADD COMMENTlink modified 2.3 years ago by WouterDeCoster41k • written 2.3 years ago by DL20

Now, i want to filter reads that have primary alignment from bam file because i want to identify SNPs and its important that i should remove secondary align reads from the bam file

As far as I know variant callers will take this into account so you don't have to worry about this.

ADD REPLYlink written 2.3 years ago by WouterDeCoster41k

Thank you for your response. Would you please tell me which software do you use for variant calling. ??

ADD REPLYlink written 2.3 years ago by DL20

We use GATK and samtools.

ADD REPLYlink written 2.3 years ago by WouterDeCoster41k

Thank you !!!

In samtools, people used this command : samtools faidx NC_012967.1.fasta

samtools view -b -S -o SRR030257.bam SRR030257.sam

samtools sort SRR030257.bam -o SRR030257.sorted.bam

samtools index SRR030257.sorted.bam

samtools mpileup -u -f NC_012967.1.fasta SRR030257.sorted.bam > SRR030257.bcf

bcftools call -v -c SRR030257.bcf > SRR030257.vcf

but i have question related to this, you said variant caller will take this into account. So, should i use sam file as its as that i got from mapping. i should not filter those reads that map multiple positions on genome. i mean to say that pick the best aligned reads from sam/bam file. i am so confused because i read so many papers and search for answer in forum but still do not get clear vision on that. If you give me advise something then i will be grateful to you.

Thanks

ADD REPLYlink modified 2.3 years ago • written 2.3 years ago by DL20

I suggest you to follow the best practices from gatk.

ADD REPLYlink written 2.3 years ago by WouterDeCoster41k
1
gravatar for Carlo Yague
2.3 years ago by
Carlo Yague4.7k
Belgium
Carlo Yague4.7k wrote:

use samtools view and the flag 256 (not primary alignment) to extract primary/secondary alignment :

samtools view -F 256 input.bam # only primary alignments
samtools view -f 256 input.bam  # only secondary alignments
ADD COMMENTlink modified 2.3 years ago • written 2.3 years ago by Carlo Yague4.7k

Thank you so much for your quick response. I ran the both command. In the the second command, i did not get any output is it possible ??

I want to ask a question related to first command. it extract best reads from bam files on the basis of position. Suppose several reads mapped to one location so it picked best one or one read mapped to several location so it picked the best position. I am new one in this kind of work so hope you do not mind if i ask some questions to you. i pasted some part of output and you can see first read mapped to different location of same chromosome.

@PG     ID:bwa  PN:bwa  VN:0.7.12-r1039 CL:bwa mem -t 20 rose-genome.fasta /work/jclotault/cutadapt/DE-R1-filter.fastq.gz /work/jclotault/cutadapt/DE-R2-filter.fastq.gz
H1:1:H5GCTBCXY:1:1107:1178:2120 83      Chr05   1282000 40      151M    =       1281229 -922    GAGTAACTTCATCAAATCTTACATCCCTGGAGACAATGAGTTTCCTAGTAGCAAGATTGTAGCATTTATACCCTTTTTGAGTAGATGAAT
ATCCTAGGAAAACACATTTATCAGCTCTAGGATCAAGTTTATCACGTTGATGAGCTTGAAT   FGFEHIHHIHIIIHIFHHHHH@@1EFIIHHGHIIHHHIIIHHHEHHHGC<EIHEF?HGFFCEEIIIIHHFIIIHGFHHHHIIIHHIHIHIIIIHEHFIIHGCD1EHEHHIGHHIIHGIIHGH
FHECEHDIHHHEIIIHIIIHHEEH@@BAD   NM:i:1  MD:Z:58T92      AS:i:146        XS:i:146        XA:Z:Chr00,-20998106,151M,1;Chr05,-82297399,151M,1;Chr02,-68189779,151M,2;Chr07,+53332990,151M,2;
H1:1:H5GCTBCXY:1:1107:1178:2120 163     Chr05   1281229 60      151M    =       1282000 922     GGTAAAGAGGCCATGCATGATTAACTGCAACTGATAACAAAACTCTAACTGTGTTCATTTTAGCAACAGGAGCAAAAGTTTCCTTGTAAT
CTACCCCATAAGTTTGAGTAAAACCACGAGCCACTAAACGAGCTTTATGTCTCTCCACAGT   <<DD0F<11CEHHICHHHH@HHHIGHH1GEGHIIIIIIIGIIHF@HHIHHHIIHIIHHIIIHIGFHHIIGHHIHHGHHHHHIIHHGHHHHEHIHEHHGHIID<FH?HHCFHHHHCHEHCHHI
HHIIHIEDCCDHGEEH@FE?CFHHHHCHG   NM:i:1  MD:Z:68T82      AS:i:146        XS:i:128        XA:Z:Chr02,+68189035,151M,5;Chr05,+82296655,151M,5;Chr00,+20997367,5S143M3S,5;
H1:1:H5GCTBCXY:1:1107:1377:2167 83      Chr07   5387480 13      151M    =       5386363 -1268   GGGTTGTGGGTGGCGGGGCTTGAGTGACAGGTCGGTGAGGCGAGAGAGAGTGAAAGCCCGAGCTGCGGCCGCCATGGAACAACCACGAGA
GACTTGTGGCCTCGTGTGGTGCCTCTCGAATCTTGGGGGCTCGATTCTTGAGCTTCGGCTT   .HEA.E=EHHIHHDEEGHE@FC0FHHGCCIHCIHHEH=IHIHGCHFIHGGHHHH<HCCHHCCHHIHHHIIIHHIIHHHHIIHEHHHE@IHGIIIGIIIIIHIIIHHHHHIIIHIIIIIIHII
HIIIHIHIIIIHGHIIIIIHIHGIDDADD   NM:i:2  MD:Z:0T140A9    AS:i:145        XS:i:145
H1:1:H5GCTBCXY:1:1107:1377:2167 163     Chr07   5386363 13      151M    =       5387480 1268    ACTCAATTAGAAGCTACTGGATAAAGTTCATAACTTTGAAACTGGGGATGGACATATCACAAGTTCGGCACCAATACTAAATTATACAAG
TGCAATGTGTGGTAGTCCAAGATCAGGGGTTTACATAAGGATGCCACTAAAAAGTAGCAGT   D?D@DHHI@H@HE?GHHIHFHIIFF1D1CGHHIIIIFE@HHICGHHGHHDCECHIIHIGHHI?HECH/CCCCE@1F?@F@C1<CGEHIIEG@1CCHHHCHED@D@DH1CHHG@G@ED1C0/0
<<1<CF@@G@?E1D1D@GEHI0DC11F1<   NM:i:1  MD:Z:149C1      AS:i:149        XS:i:146
H1:1:H5GCTBCXY:1:1107:1504:2099 83      Chr00   13715595        0       151M    =       13714992        -754    CTTTCCTTGTTTAGCTCGGTTTCTCCTAGCCGAAGTGAGAAAACGTGCTAAAGTTGACTTTTCATTTCCATGCT
TCCATCATTTCCTTTTCTTACAGCTCGCATAATTCTCCATAGTAGGCTTTATTTAGCCTCTAAATAATATATTTCGA   EIHHIHEHIIIHHDIHIHHHIIIIHIIIIIIIIHIIIIIIHIHIIHGHHDHIIIHIHIIIIIIIIIIHIIHIIHIIIIIIIIIIIIIIIIIIIIIIIIIHIIIIII
IIIHIIIIIIIIIIIIIIIIIHIIHHHHIIIIIIIIIIIIDBDDD   NM:i:0  MD:Z:151        AS:i:151        XS:i:151
H1:1:H5GCTBCXY:1:1107:1504:2099 163     Chr00   13714992        0       151M    =       13715595        754     AGGTGTTTATATAGGGAAGAAAAAGAAGAGTGAAATGATGAGTGGAAGAAAAATAATGAAAGTAGATCTAAGTT
CACTTGTAAAATATGGAAAAGATAGAGAAAAGATGAAATGAAAGCAAAGCATGAAGGTGCAGCAACATGGAAGTGGT   DB@<<<C<F@G<CE1GHH<DCH@DC1D<D1DCF1DH@GG@HHCGH@GCGECGHHHHIIIIHEHHC@<CFCCHGHFHEH@G@FGEHGHEE@GFHGHHHE?HI@F1@F
GHIIEECF?@CHCGHDGFHIIHHIIIIIIIIHIHHHHH?GHH?H<   NM:i:0  MD:Z:151        AS:i:151        XS:i:151
H1:1:H5GCTBCXY:1:1107:1662:2155 83      Chr07   12664169        0       151M    =       12663271        -1049   TTAAATCAGATGGATAGTTTTTGAGCTAATGCTTTACATAAAGCGGTTTTTCTAGTCCCTGGAGTTCCATGTAA
AAGAACAATGCTGCAAAGTCATAAAACAAATCACATAGGTAATTAGAATTTCTTAGCAAAAAAAGATGAGAAATGAA   IHIIIIIIHHCHHECEHHEEHHHIHGHIHHFF1HEHF@CECHCC<0GC<CHHG@E<0C1IIHEHF@C1HC@HHEHHGFCHEHF@@HEHHHC?EHIHIIHIHCEHHI
IHHGG11HF@CEFC@HHEHEEHCF/HHHHFHHIHHFCEHF@0@D0   NM:i:0  MD:Z:151        AS:i:151        XS:i:151
H1:1:H5GCTBCXY:1:1107:1662:2155 163     Chr07   12663271        0       151M    =       12664169        1049    CATCTGTCGGCCTCATTCACCACTTTATACAATTTGGTGAATGTAACCAAAGGTGAGGGACAACACAGTATTGA
TGGAGGTCTTCCAAAGCAAATAAAGGATAAGAAATAACATGAATTATAGCTTGACAATGCCATCCTCAACTCTCAGT   0DD0@C<C@CCC@CCFHHIHC@1C1F?E<1<<<FHHHIIHH1FCGHE1CC?<<<1<?<E@@EHH1EHC<CC1<D1D1C1<01DG1D1<<<11CEE?CGE1CCGF1D
CC1<D1<111C1<D1<DGG?1C1<11111<<1<1<FHCF1DG@H<   NM:i:4  MD:Z:65T23C5G47A7       AS:i:131        XS:i:131
ADD REPLYlink modified 2.3 years ago by WouterDeCoster41k • written 2.3 years ago by DL20
1

I added markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button. When you compose or edit a post that button is in your toolbar, see image below:

101010 Button

ADD REPLYlink written 2.3 years ago by WouterDeCoster41k

Thank you so much for your quick response. I ran the both command. In the the second command, i did not get any output is it possible ??

Yes it is possible to not have any secondary alignment. You can check the number of secondary alignments with :

samtools flagstat input.bam

one read mapped to several location so it picked the best position.

this one is correct

ADD REPLYlink modified 2.3 years ago • written 2.3 years ago by Carlo Yague4.7k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1337 users visited in the last hour