Entering edit mode
2.8 years ago
VIJITH KUMAR
•
0
I am trying to convert my sorted .bam file to .bcf, using the following command:
samtools mpileup -f hg19.fa sequence1.sorted.bam > X.bcf
But the result was:
1 samples in 1 input files
chr15 5062469 N 0 * *
chr15 5062470 N 0 * *
chr15 5062471 N 0 * *
chr15 5062472 N 0 * *
chr15 5062473 N 0 * *
chr15 5062474 N 0 * *
chr15 5062475 N 0 * *
chr15 5062476 N 0 * *
chr15 5062477 N 0 * *
as a .bcf file.
I ran the following command:
samtools view -H sequence1.sorted.bam
to view .bam listings, and later:
grep ">" hg19.fa
to view reference genome data.
The following, respectively, shows the output.
@HD VN:1.6 SO:coordinate
@SQ SN:chr1 LN:249250621
@SQ SN:chr10 LN:135534747
@SQ SN:chr11 LN:135006516
@SQ SN:chr12 LN:133851895
@SQ SN:chr13 LN:115169878
@SQ SN:chr14 LN:107349540
@SQ SN:chr15 LN:102531392
@SQ SN:chr16 LN:90354753
@SQ SN:chr17 LN:81195210
@SQ SN:chr17_ctg5_hap1 LN:1680828
@SQ SN:chr18 LN:78077248
@SQ SN:chr19 LN:59128983
@SQ SN:chr2 LN:243199373
@SQ SN:chr20 LN:63025520
@SQ SN:chr21 LN:48129895
@SQ SN:chr22 LN:51304566
@SQ SN:chr3 LN:198022430
@SQ SN:chr4 LN:191154276
@SQ SN:chr4_ctg9_hap1 LN:590426
@SQ SN:chr5 LN:180915260
@SQ SN:chr6 LN:171115067
@SQ SN:chr6_apd_hap1 LN:4622290
@SQ SN:chr6_cox_hap2 LN:4795371
@SQ SN:chr6_dbb_hap3 LN:4610396
@SQ SN:chr6_mann_hap4 LN:4683263
@SQ SN:chr6_mcf_hap5 LN:4833398
@SQ SN:chr6_qbl_hap6 LN:4611984
@SQ SN:chr6_ssto_hap7 LN:4928567
@SQ SN:chr7 LN:159138663
@SQ SN:chr8 LN:146364022
@SQ SN:chr9 LN:141213431
@SQ SN:chrM LN:16571
@SQ SN:chrUn_gl000211 LN:166566
@SQ SN:chrUn_gl000212 LN:186858
@SQ SN:chrUn_gl000213 LN:164239
@SQ SN:chrUn_gl000214 LN:137718
@SQ SN:chrUn_gl000215 LN:172545
@SQ SN:chrUn_gl000216 LN:172294
@SQ SN:chrUn_gl000217 LN:172149
@SQ SN:chrUn_gl000218 LN:161147
@SQ SN:chrUn_gl000219 LN:179198
@SQ SN:chrUn_gl000220 LN:161802
@SQ SN:chrUn_gl000221 LN:155397
@SQ SN:chrUn_gl000222 LN:186861
@SQ SN:chrUn_gl000223 LN:180455
@SQ SN:chrUn_gl000224 LN:179693
@SQ SN:chrUn_gl000225 LN:211173
@SQ SN:chrUn_gl000226 LN:15008
@SQ SN:chrUn_gl000227 LN:128374
@SQ SN:chrUn_gl000228 LN:129120
@SQ SN:chrUn_gl000229 LN:19913
@SQ SN:chrUn_gl000230 LN:43691
@SQ SN:chrUn_gl000231 LN:27386
@SQ SN:chrUn_gl000232 LN:40652
@SQ SN:chrUn_gl000233 LN:45941
@SQ SN:chrUn_gl000234 LN:40531
@SQ SN:chrUn_gl000235 LN:34474
@SQ SN:chrUn_gl000236 LN:41934
@SQ SN:chrUn_gl000237 LN:45867
@SQ SN:chrUn_gl000238 LN:39939
@SQ SN:chrUn_gl000239 LN:33824
@SQ SN:chrUn_gl000240 LN:41933
@SQ SN:chrUn_gl000241 LN:42152
@SQ SN:chrUn_gl000242 LN:43523
@SQ SN:chrUn_gl000243 LN:43341
@SQ SN:chrUn_gl000244 LN:39929
@SQ SN:chrUn_gl000245 LN:36651
@SQ SN:chrUn_gl000246 LN:38154
@SQ SN:chrUn_gl000247 LN:36422
@SQ SN:chrUn_gl000248 LN:39786
@SQ SN:chrUn_gl000249 LN:38502
@SQ SN:chrX LN:155270560
@SQ SN:chrY LN:59373566
@PG ID:bwa PN:bwa VN:0.7.17-r1188 CL:bwa samse hg19bwaidx sequence1.bwa SRR9598696.fastq
@PG ID:samtools PN:samtools PP:bwa VN:1.12 CL:samtools view -bS sequence1_se.sam
@PG ID:samtools.1 PN:samtools PP:samtools VN:1.12 CL:samtools sort -O bam -o sequence1.sorted.bam -T temp sequence1.bam
@PG ID:samtools.2 PN:samtools PP:samtools.1 VN:1.12 CL:samtools view -H sequence1.sorted.bam
and
>chr1
>chr10
>chr11
>chr12
>chr13
>chr14
>chr15
>chr16
>chr17
>chr17_ctg5_hap1
>chr18
>chr19
>chr2
>chr20
>chr21
>chr22
>chr3
>chr4
>chr4_ctg9_hap1
>chr5
>chr6
>chr6_apd_hap1
>chr6_cox_hap2
>chr6_dbb_hap3
>chr6_mann_hap4
>chr6_mcf_hap5
>chr6_qbl_hap6
>chr6_ssto_hap7
>chr7
>chr8
>chr9
>chrM
>chrUn_gl000211
>chrUn_gl000212
>chrUn_gl000213
>chrUn_gl000214
>chrUn_gl000215
>chrUn_gl000216
>chrUn_gl000217
>chrUn_gl000218
>chrUn_gl000219
>chrUn_gl000220
>chrUn_gl000221
>chrUn_gl000222
>chrUn_gl000223
>chrUn_gl000224
>chrUn_gl000225
>chrUn_gl000226
>chrUn_gl000227
>chrUn_gl000228
>chrUn_gl000229
>chrUn_gl000230
>chrUn_gl000231
>chrUn_gl000232
>chrUn_gl000233
>chrUn_gl000234
>chrUn_gl000235
>chrUn_gl000236
>chrUn_gl000237
>chrUn_gl000238
>chrUn_gl000239
>chrUn_gl000240
>chrUn_gl000241
>chrUn_gl000242
>chrUn_gl000243
>chrUn_gl000244
>chrUn_gl000245
>chrUn_gl000246
>chrUn_gl000247
>chrUn_gl000248
>chrUn_gl000249
>chrX
>chrY
So, why does it fail to generate the .bcf file?
Thank you kevin for the response. I am quite a novice in Bioinformatics; can you tell why the reference genome, "hg19.fa" needs to have .fa files like
chr4_ctg9_hap1
,chrUn_gl000215
>chrUn_gl000216
,>chrUn_gl000217
,>chrUn_gl000218
,>chrUn_gl000219
etc. instead of just the ChrN.fa(N=1 -22,x,y,M)Hi, those are not primary contigs (for now, let's just think of a 'contig' as being a chromosome); thus, you can ignore them. The primary contigs will be named:
For more information on the other contigs, please follow the information from here: Additional Data In Human Genome (Hg18 / Hg19) Assembly ?
Hi kevin,
I tried variant calling using the code you suggested. But it failed with an error in the 'mpileup'. the result has been attached.
later, I tried variant calling using 'samtools mpileup', using the following code.
It generated a .vcf file.
Is this a valid result? Did I follow your code correctly?
If this command ran successfully without error, then, yes, the VCF should be valid:
My command should also have run, however, note that you will not have the BAM files that I specify in my code, i.e., you need to replace this:
...with this:
Can you try my command again, please? If necessary, please run it all on a single line without the forward slashes (
/
)Hi Kevin,
I have correctly modified your command, and that resulted in a .bcf file. But I don't think it is a valid .bcf file. I have attached both the "modified command" and the snap of the resulting .bcf file. Kindly, have a look; I hope the command has run. Could it be due to issues lurking in the reference .fa file?
Hi, the
-Ob
parameter instruct BCFtools to output in binary VCF format. This can help to save disk space and improve random access of records in the BCF.What you need to do to view it is:
Please take a look at the output options for
bcftools mpileup
by simply running: