Entering edit mode
9.3 years ago
wangyang703092
▴
120
I have been generating consensus sequences from a Bowtie2-aligned BAM file using the following script:
samtools mpileup -d 100000 -Q 0 -q 0 -f Reference.fasta input.bam
In the pileup file, the consensus sequence has a gap where I can clearly see reads that map to the reference. In the pileup file,there is a gap between 42449 and 42469
Supercontig_1.1 42441 T 10 ,$..,,...,, EIDIH@G6DI
Supercontig_1.1 42442 A 9 .$.,,...,, DEIHBI6AI
Supercontig_1.1 42443 C 8 .,,...,,$ EIH>I5GE
Supercontig_1.1 42444 A 7 .$,$,..., EEH@EAG
Supercontig_1.1 42445 T 5 ,$..., EADEH
Supercontig_1.1 42446 G 4 ..., AEEH
Supercontig_1.1 42447 G 4 .$.$., BBAH
Supercontig_1.1 42448 C 2 ., <G
Supercontig_1.1 42449 T 2 .$,$ 8E
Supercontig_1.1 42469 C 2 ^",^", EA
Supercontig_1.1 42470 T 3 ,,^". >?E
Supercontig_1.1 42471 G 4 ,,.^", FBFB
Supercontig_1.1 42472 G 5 ,,.,^". E=FAE
In the sam file,there exists alignments: (part of them):
N57638:1:64U4YAAXX:1:5:19405:8167 355 Supercontig_1.1 42376 2 76M = 42580 280 CACTGGACGAAGGATACGCGTCACTGACAGTACCTTGCCACCGAAGAATTGAGTTCACGGACGACTACATGGCTGA IIIIIIHIIHHHIIDIIHHFIIDIIIIBIIFIDIHEHIIIHIIHFFHHGHIFBDGEFEIE@CEBEFHDGEDGBEFH AS:i:-10 XS:i:-10 XN:i:0 XM:i:2 XO:i:0 XG:i:0 NM:i:2 MD:Z:49C25C0 YS:i:-30 YT:Z:CP
N57638:1:64U4YAAXX:4:16:18415:14633 355 Supercontig_1.1 42376 6 76M = 42672 372 CACTGGACGAAGGAAACGCGTCACTGACAGTACCTTGCCACCGAAGAATCGAGTTCACGGACGACTACATGGCTGA IIIFIIIIGIEIIIIHIDIIIIDIIIHDHIFHIHIBIHIIIEIDIHHHHEIGHGIGDED<BEC8>ECEBB,@A?@D AS:i:-11 XS:i:-11 XN:i:0 XM:i:2 XO:i:0 XG:i:0 NM:i:2 MD:Z:14T60C0 YS:i:-16 YT:Z:CP
N57638:1:64U4YAAXX:2:19:4746:19217 16 Supercontig_1.1 42377 1 60M * 0 0 ACTGGACGAAGGATACGCGTCACTGACAGTACCTTGCCACCGAAGAATTGAGTTCACGGA A-C;C98;52B;2?67/.72:-4;7<:4429@=?@>1=9=;=?=@77844=<=BB??B?B AS:i:-3 XS:i:-3 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:48C11 YT:Z:UU
N57638:1:64U4YAAXX:3:78:11945:7264 339 Supercontig_1.1 42378 12 76M = 42087 -367 CTGGACGAAGGAAACGCGTCACTGACAGTACCTTGCCACCGAAGAATCGAGTTCACGGACGACTACATGGCTGACA 5;7?7<E>>@BCC>EEB@<B<<BDABBGFAECAAAAFB@FGHHGHHFGGEGGBG@GGGGGG@HHHHHHHHHHHHDH AS:i:-15 XS:i:-15 XN:i:0 XM:i:3 XO:i:0 XG:i:0 NM:i:3 MD:Z:12T60C0A1 YS:i:0 YT:Z:CP
N57638:1:64U4YAAXX:3:46:17100:16602 419 Supercontig_1.1 42379 7 76M = 42666 342 TGGACGAAGGATACGCGTCACTGACAGTACCTTGCCACCGAAGAATTGAGTTCACGGACGACTACATGGCTGACAA IIIGHIHIIIIIIIIIEGIIIIIHGI=IIIBIIIIBIIIIGIGIFIIHDDEFGIIIIFIIHGFIIIGDDHE@DEHH AS:i:-16 XS:i:-16 XN:i:0 XM:i:3 XO:i:0 XG:i:0 NM:i:3 MD:Z:46C25C0A2 YS:i:-12 YT:Z:CP
N57638:1:64U4YAAXX:3:82:4520:20210 355 Supercontig_1.1 42379 0 76M = 42599 296 TGGACGANGGAAACGCGTCACTGACAGTACCTTGCCACCNAAGAATCGAGTTCACGGACGACTACATGGCTGACAA IIHIIFC#ECCEEEEHIIIIIHIFIIIEIIIIIIIDDDD#>>???@EEEE@@DBEFBEDB@ADEADDDAAABB@B5 AS:i:-17 XS:i:-17 XN:i:0 XM:i:5 XO:i:0 XG:i:0 NM:i:5 MD:Z:7A3T27G32C0A2 YS:i:-33 YT:Z:CP
N57638:1:64U4YAAXX:4:101:2199:9982 355 Supercontig_1.1 42379 7 76M = 42571 268 TGGACGAAGGATACGCGTCACTGACAGTACCTTGCCACCGAAGAATTGAGTTCACGGACGACTACATGGCTGACAA IIIIIIIIIIIIIIHIIIIIIIIHIIIIGIIIIIIIIIIIHIIIIIIIDIGIIIIHIIIIHHIIHHHHGIIEIGHB AS:i:-17 XS:i:-17 XN:i:0 XM:i:3 XO:i:0 XG:i:0 NM:i:3 MD:Z:46C25C0A2 YS:i:-22 YT:Z:CP
Does anyone know why this happened, thanks!
Thanks,but why you said the alignments will always be ignored, aren't the reads mapped to the region between 42449 and 42469? I am new to mpileup.
All but one of those are secondary alignments. Given that the main use of mpileup is in variant calling, secondary alignments are ignored by default (note the
--ff
options defaults).Sorry Devon, what are secondary alignments, and where is the
--ff
option (in bowtie2?)--ff
is a samtools mpileup option. A secondary alignment is defined in the SAM specification.Thanks for your patience,I would like to use the secondary alignments to call SNP, how can I set the mpileup options.When I read the manual, I haven't find the
--f
optionHi, I found that
--ff
is a filter option, but it can only accept INT, I don't know how to set that, do you have any idea?At least the version I have also allows a (presumably comma separated) string, but the integer representation of what I would presume you want is 1540 (UNMAP 4, QCFAIL 512, DUP 1024). That would then ignore unmapped reads and those marked as duplicates or as having failed quality control.
I did what you told me, my command is
samtools mpileup --ff 1540 -f aaa.fasta aln.bam
, but it has no difference with the result without--ff
option. So I download the lastest version samtools-1.1 and bcftools-1.1, when using-ff 1540
orstring[UNMAP,QCFAIL,DUP]
, it does work! Firstly, I used bcftools view to do the consensus calling, it went wrong, then I found bcftools call is the former bcftools view, the question is that what is the equal command tobcftools view -cg
inbcftools call
Glad that fixed things for you.
Do you know how to set the bcftools call options because I'm not familiar to it. I used
bcftools view -cg
before, and now I usebcftools call -c
,do you think it's correct?bcftools call -c
would seem correct.Thank you so much for your help