5.5 years ago
priskt
I have been converting a BAM file from hg19 to hg38 using CrossMap. This was the command:
python CrossMap.py bam -a hg19ToHg38.over.chain.gz SAMPLE.bam SAMPLE.hg38
Would you please help me interpret this output of CrossMap:
Total alignments:53531694
    QC failed: 0
    R1 unique, R2 unique (UU) 0
    R1 unique, R2 unmapp (UN): 0
    R1 unique, R2 multiple (UM): 0
    R1 multiple, R2 multiple (MM): 0
    R1 multiple, R2 unique (MU): 0
    R1 multiple, R2 unmapped (MN): 0
    R1 unmap, R2 unmap (NN): 53531694
    R1 unmap, R2 unique (NU): 0
    R1 unmap, R2 multiple (NM): 0
If all the reads have been unmapped, does this mean that the reads did not align with hg38? Is this a valid bam conversion that I can carry on with to do downstream analysis?
It is unlikely that none of your reads map to hg38, if they did map to hg19. So something technical went wrong. Could you check if the chromosome identifiers match in your bam and in your chain file?
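One way to run that check is sketched below. It is only an illustration: the function names are made up for this post, and it assumes you have the BAM header as text lines (for example from `samtools view -H SAMPLE.bam`) and the decompressed lines of the chain file. In UCSC chain files the third field of each `chain` line is the contig name in the assembly you are converting from.

```python
def bam_header_contigs(header_lines):
    """Collect the SN: values from the @SQ lines of a SAM/BAM header."""
    names = set()
    for line in header_lines:
        if line.startswith("@SQ"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def chain_source_contigs(chain_lines):
    """Collect source-assembly contig names from 'chain' header lines."""
    names = set()
    for line in chain_lines:
        parts = line.split()
        # chain score tName tSize tStrand tStart tEnd qName ...
        if parts and parts[0] == "chain":
            names.add(parts[2])
    return names


def shared_contigs(header_lines, chain_lines):
    """Contig names present in both the BAM header and the chain file."""
    return bam_header_contigs(header_lines) & chain_source_contigs(chain_lines)


if __name__ == "__main__":
    # Toy data for illustration only: BAM uses "1", chain uses "chr1".
    header = ["@HD\tVN:1.6\n", "@SQ\tSN:1\tLN:249250621\n"]
    chain = ["chain 1000 chr1 249250621 + 10000 20000 chr1 248956422 + 10000 20000 2\n"]
    print(sorted(shared_contigs(header, chain)))
```

If the intersection is empty, the BAM and the chain file use different naming schemes, and CrossMap cannot look up any alignment in the chain file.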
Thank you so much WouterDeCoster, you were spot on. The chain file has chromosome identifiers as "chr1" while the BAM file uses "1". I reran the analysis with GRCh37_to_GRCh38.chain.gz and it worked; the results are perfect. However, I need the final conversion in hg38 format (with "chr" identifiers), because all the other BAM and VCF files that I have to work on are in this format. I looked for a chain file that converts from GRCh37 to hg38 and couldn't find one. Would you suggest that I convert from GRCh37 to hg19, then from hg19 to hg38 (these have readily available chain files), or that I convert from BAM to FASTQ and then align against hg38?
Just change the chromosome names in the BAM file (samtools reheader).
Thank you Devon Ryan, and for that I have found good guides here: A: Bam File: Change Chromosome Notation and Edit every instance of a chromosome name in BAM file?
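For anyone landing here later, the renaming step could look roughly like the sketch below. It is a minimal illustration, not the method from the linked guides: the helper names are invented, and it assumes the header uses Ensembl-style names ("1" to "22", "X", "Y", "MT") for the primary chromosomes. Any other contig (unplaced scaffolds, decoys) is left unchanged and would need its own mapping.

```python
# Primary chromosomes that only need a "chr" prefix (assumption: Ensembl-style input).
PRIMARY = {str(i) for i in range(1, 23)} | {"X", "Y"}


def to_ucsc_name(name):
    """Map an Ensembl-style contig name to its UCSC-style equivalent."""
    if name == "MT":
        return "chrM"
    if name in PRIMARY:
        return "chr" + name
    return name  # unknown contigs are passed through untouched


def rename_header(header_text):
    """Rewrite the SN: field of every @SQ line in a SAM header."""
    out = []
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            fields = line.split("\t")
            fields = [
                "SN:" + to_ucsc_name(f[3:]) if f.startswith("SN:") else f
                for f in fields
            ]
            line = "\t".join(fields)
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Toy header for illustration only.
    demo = "@HD\tVN:1.6\n@SQ\tSN:1\tLN:248956422\n@SQ\tSN:MT\tLN:16569\n"
    print(rename_header(demo), end="")
```

In practice you would dump the header with `samtools view -H in.bam > header.sam`, pass its contents through `rename_header`, write the result to a new file, and apply it with `samtools reheader new_header.sam in.bam > out.bam` (the file names here are placeholders). Reheadering only changes the names, not the coordinates, so it fits when the BAM is already in GRCh38 coordinates.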