Closed:Trouble with samtools mpileup on my exome bam files
9.6 years ago
ivivek_ngs ★ 5.2k

Hi All,

I am having trouble using the samtools mpileup command on my BAM files. First, I ran the command below on one of them to inspect its header:

samtools view -H tumor.realigned.recal.bam

The command seems to work fine, but I am unsure about the last few lines of the header, which are:

@PG    ID:GATK IndelRealigner    VN:2.3-4-g57ea19f    CL:knownAlleles=[] targetIntervals=/scratch/GT/vdas/pietro/exome_seq/results/T_S7999/T_S7999_marked.sorted.bam.intervals LODThresholdForCleaning=5.0 consensusDeterminationModel=USE_READS entropyThreshold=0.15 maxReadsInMemory=150000 maxIsizeForMovement=3000 maxPositionalMoveAllowed=200 maxConsensuses=30 maxReadsForConsensuses=120 maxReadsForRealignment=20000 noOriginalAlignmentTags=false nWayOut=null generate_nWayOut_md5s=false check_early=false noPGTag=false keepPGTags=false indelsFileForDebugging=null statisticsFileForDebugging=null SNPsFileForDebugging=null
@PG    ID:MarkDuplicates    PN:MarkDuplicates    VN:1.84(1332)    CL:net.sf.picard.sam.MarkDuplicates INPUT=[/scratch/GT/vdas/pietro/exome_seq/results/T_S7999/T_S7999.sorted.bam] OUTPUT=/scratch/GT/vdas/pietro/exome_seq/results/T_S7999/T_S7999_marked.sorted.bam METRICS_FILE=metricN.log REMOVE_DUPLICATES=false ASSUME_SORTED=true VALIDATION_STRINGENCY=LENIENT    PROGRAM_RECORD_ID=MarkDuplicates PROGRAM_GROUP_NAME=MarkDuplicates MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP=50000 MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=8000 SORTING_COLLECTION_SIZE_RATIO=0.25 READ_NAME_REGEX=[a-zA-Z0-9]+:[0-9]:([0-9]+):([0-9]+):([0-9]+).* OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
@PG    ID:bwa    PN:bwa    VN:0.5.9-r16

Are these header lines spurious for the BAM file?

I am trying to calculate the B-allele frequencies for my BAM files with the script from this link. I created the Perl script and then ran the same command with samtools mpileup to create the BED file with the B-allele frequencies, but no BED file is produced. At first I thought mpileup itself was failing, but I checked and mpileup on its own works fine. Now I am wondering whether the problem is that my BAM file is not in the correct order. Can anyone share some insight on this? I also got errors from the Perl script mentioned in that link:

<mpileup> Set max per-file depth to 8000
Use of uninitialized value $start in addition (+) at /scratch/GT/softwares/code/mpileup2baf.pl line 16, <> line 1.
Use of uninitialized value $num_reads in numeric lt (<) at /scratch/GT/softwares/code/mpileup2baf.pl line 20, <> line 1.
Use of uninitialized value $start in addition (+) at /scratch/GT/softwares/code/mpileup2baf.pl line 16, <> line 2.
Use of uninitialized value $num_reads in numeric lt (<) at /scratch/GT/softwares/code/mpileup2baf.pl line 20, <> line 2.

This makes me think there is a problem with my BAM file. Can anyone point out where I am going wrong? The BAM files are sorted, marked for duplicates, realigned around indels, and then recalibrated using the GATK tools. This is exome data. I would appreciate it if someone could suggest where this is failing and how to correct it.
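
One way to narrow this down is to check whether the lines reaching the Perl script actually have the columns it needs. The warnings about `$start` and `$num_reads` would be consistent with lines that have fewer fields than expected, though that is only a guess, since how mpileup2baf.pl parses its input is not shown here. Below is a minimal Python sketch that assumes the standard single-BAM text pileup layout (chromosome, 1-based position, reference base, depth, read bases, base qualities, separated by tabs). The names `PILEUP_FIELDS` and `check_pileup_line` are hypothetical and exist only for this illustration.

```python
# Columns of a single-BAM text pileup line (assumed layout, tab-separated).
PILEUP_FIELDS = ("chrom", "pos", "ref", "depth", "bases", "quals")


def check_pileup_line(line):
    """Return (record, problem).

    record is a dict of the six leading fields, or None if the line
    does not look like a pileup line; problem is None or a message.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < len(PILEUP_FIELDS):
        return None, (
            f"expected {len(PILEUP_FIELDS)} tab-separated fields, "
            f"got {len(fields)}"
        )
    # Extra trailing columns (if any) are ignored by zip().
    record = dict(zip(PILEUP_FIELDS, fields))
    if not record["pos"].isdigit() or not record["depth"].isdigit():
        return None, "position or depth is not a non-negative integer"
    record["pos"] = int(record["pos"])
    record["depth"] = int(record["depth"])
    return record, None


if __name__ == "__main__":
    # Hypothetical example lines, not taken from the real data.
    good = "chr1\t12345\tA\t3\t.,G\tIII"
    bad = "chr1 12345 A 3 .,G III"  # spaces instead of tabs
    print(check_pileup_line(good))
    print(check_pileup_line(bad))
```

Running the first few lines of the mpileup output through a check like this, before piping them into the Perl script, would show whether the input format or the BAM file itself is the more likely cause.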

alignment SNP sequencing