Hi All,
I am having some trouble using the samtools mpileup command on my BAM files. I ran the command below on one of them:
samtools view -H tumor.realigned.recal.bam
The command seems to work fine, except for the last few lines of the header, which look like this:
@PG ID:GATK IndelRealigner VN:2.3-4-g57ea19f CL:knownAlleles=[] targetIntervals=/scratch/GT/vdas/pietro/exome_seq/results/T_S7999/T_S7999_marked.sorted.bam.intervals LODThresholdForCleaning=5.0 consensusDeterminationModel=USE_READS entropyThreshold=0.15 maxReadsInMemory=150000 maxIsizeForMovement=3000 maxPositionalMoveAllowed=200 maxConsensuses=30 maxReadsForConsensuses=120 maxReadsForRealignment=20000 noOriginalAlignmentTags=false nWayOut=null generate_nWayOut_md5s=false check_early=false noPGTag=false keepPGTags=false indelsFileForDebugging=null statisticsFileForDebugging=null SNPsFileForDebugging=null
@PG ID:MarkDuplicates PN:MarkDuplicates VN:1.84(1332) CL:net.sf.picard.sam.MarkDuplicates INPUT=[/scratch/GT/vdas/pietro/exome_seq/results/T_S7999/T_S7999.sorted.bam] OUTPUT=/scratch/GT/vdas/pietro/exome_seq/results/T_S7999/T_S7999_marked.sorted.bam METRICS_FILE=metricN.log REMOVE_DUPLICATES=false ASSUME_SORTED=true VALIDATION_STRINGENCY=LENIENT PROGRAM_RECORD_ID=MarkDuplicates PROGRAM_GROUP_NAME=MarkDuplicates MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP=50000 MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=8000 SORTING_COLLECTION_SIZE_RATIO=0.25 READ_NAME_REGEX=[a-zA-Z0-9]+:[0-9]:([0-9]+):([0-9]+):([0-9]+).* OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
@PG ID:bwa PN:bwa VN:0.5.9-r16
Are these lines spurious for the BAM? I am trying to calculate the B-allele frequencies for my BAM files with the script from this link. I created the Perl script and then ran the same command as given there, using samtools mpileup, to create the BED file with the B-allele frequencies, but no BED file is created. At first I thought the error was in mpileup, but I checked and mpileup on its own works fine. Now I am wondering whether my BAM file might not be in the correct order. Can anyone share some insight into this? I also got an error from the Perl script mentioned in that link. This was the error:
<mpileup> Set max per-file depth to 8000
Use of uninitialized value $start in addition (+) at /scratch/GT/softwares/code/mpileup2baf.pl line 16, <> line 1.
Use of uninitialized value $num_reads in numeric lt (<) at /scratch/GT/softwares/code/mpileup2baf.pl line 20, <> line 1.
Use of uninitialized value $start in addition (+) at /scratch/GT/softwares/code/mpileup2baf.pl line 16, <> line 2.
Use of uninitialized value $num_reads in numeric lt (<) at /scratch/GT/softwares/code/mpileup2baf.pl line 20, <> line 2.
I take this to mean there is some problem with my BAM file. Can anyone point out where I am going wrong? The BAM files are sorted, marked for duplicates, realigned around indels, and then recalibrated using the GATK tools. I am using this exome data for my analysis. I would appreciate it if someone could suggest where I am failing and how to correct it.
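
Two quick checks that might help narrow this down, given here only as a minimal sketch. The helper names sort_order and column_counts are hypothetical (they are not part of samtools or of mpileup2baf.pl). The first assumes the header has an @HD line with an SO: tag. The second assumes single-sample mpileup output, which normally has six tab-separated columns (chromosome, position, reference base, depth, read bases, base qualities). If the lines reaching the Perl script have fewer columns than it expects, that could explain the "uninitialized value" warnings, but this is a guess since the script itself is not shown here.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting a BAM header and mpileup output.

# Read a SAM/BAM header on stdin and print the sort order from the @HD line.
# Prints "unknown" if there is no @HD line or it has no SO: tag.
sort_order() {
  awk -F'\t' '
    $1 == "@HD" {
      for (i = 2; i <= NF; i++)
        if ($i ~ /^SO:/) { print substr($i, 4); found = 1; exit }
    }
    END { if (!found) print "unknown" }
  '
}

# Print the number of tab-separated fields on each of the first N lines
# of stdin (N defaults to 5).
column_counts() {
  head -n "${1:-5}" | awk -F'\t' '{ print NF }'
}

# Example usage (ref.fa is a placeholder for the reference FASTA):
#   samtools view -H tumor.realigned.recal.bam | sort_order
#   samtools mpileup -f ref.fa tumor.realigned.recal.bam | column_counts
```

A result of "coordinate" from the first check would suggest the sort order is not the issue. A column count other than 6 from the second would point at the input the script receives and not at the BAM itself.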