6.0 years ago
seta
Hi all,
I would like to examine and visualize a BAM file (from whole genome sequencing) at some HLA alleles; could you please suggest an appropriate tool for this?
Thanks
You may have to first do a focused analysis using your data, and then visualise the results that you generate. Doing a standard alignment (i.e., to any of the reference genomes) to HLA will not provide an accurate picture of the HLA alleles in your samples.
Take a look at, e.g., Polysolver, HLAscan, xHLA, and OptiType.
Kevin
Thanks, Kevin. HLA typing was done on whole genome sequencing data using commercial software (SeqNext-HLA); however, for some samples, several HLA alleles were reported that differ mostly in the 2nd and 3rd fields, which is why I want to check these alleles in the BAM file. For example, I have the HLA-A*11:121, HLA-A*11:162, and HLA-A*11:26 alleles, which differ in the 2nd field. To check them in the BAM file with IGV, I need to know the variant in the 2nd field and its genomic coordinate, right? Could you please help me work out how to obtain this information?
Thank you
I see. So, which files does SeqNext-HLA produce: the BAMs? I wonder whether you could download FASTA sequences from the IMGT database and use those as references in IGV. That is what I would try. You should be able to download your exact HLA FASTA sequences from IMGT.
The software took FASTQ as input and created just a PDF report; it may produce a BAM file during the process but does not return it as output. Regarding the FASTA HLA sequence, do you mean the genomic sequence? Which allele should be considered the reference, given my example above? And how can I obtain the genomic position of the variant? Also, I found this paper that used IGV for alignment of HLA alleles, but I'm confused about what I should do. Could you please share your thoughts on it?
Thanks
If you can repeat what has been performed in that publication, then that should be sufficient. In fact, I just looked at the publication and they also used IMGT.
IMGT stores a FASTA sequence for many hundreds of HLA alleles, and these are as specific as (I believe) your HLA-A*11:121, HLA-A*11:162, etc. There are different ways of doing the experiment and I have only worked on this in a cancer setting whereby WGS tumour DNA was aligned to a merged FASTA of all IMGT HLA allele sequences, which then allowed us to identify the HLA of the tumour (in many cases, it was different from the matched normal due to somatic mutation).
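As a rough illustration of building a small reference from an IMGT-style multi-FASTA, the sketch below keeps only the records for a few candidate alleles. It assumes each header looks like `>ACCESSION ALLELE_NAME LENGTH bp` (allele name in the second whitespace-separated token); please check that against the file you actually download. The function names and the toy records are hypothetical and not real IMGT entries.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def allele_name(header: str) -> str:
    """Return the allele name, ASSUMING it is the second token of the header."""
    parts = header.split()
    if not parts:
        return ""
    return parts[1] if len(parts) > 1 else parts[0]


def select_alleles(lines: Iterable[str], wanted: List[str]) -> Dict[str, str]:
    """Keep records whose name equals a wanted allele or extends it by more fields."""
    selected: Dict[str, str] = {}
    for header, seq in read_fasta(lines):
        name = allele_name(header)
        if any(name == w or name.startswith(w + ":") for w in wanted):
            selected[name] = seq
    return selected


def to_fasta(records: Dict[str, str], width: int = 60) -> str:
    """Format records as FASTA text with fixed-width sequence lines."""
    out: List[str] = []
    for name, seq in records.items():
        out.append(">" + name)
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"


# Toy input (made-up accessions and sequences, for illustration only).
toy = [
    ">TOY1 A*11:121 12 bp", "ACGTACGTACGT",
    ">TOY2 A*11:26 12 bp", "ACGTACGAACGT",
    ">TOY3 A*11:261 4 bp", "ACGT",
    ">TOY4 B*07:02:01 4 bp", "TTTT",
]
subset = select_alleles(toy, ["A*11:121", "A*11:26"])
print(to_fasta(subset))
```

Matching on the full name or on the name followed by `:` is deliberate, so that asking for `A*11:26` does not also pull in `A*11:261`.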
Hi Kevin. So, you mean using specific HLA sequences, say HLA-A*11:121 and HLA-A*11:162, as the reference in IGV and the BAM file as input to resolve the issue. Sorry for the further questions, as this is my first experience with this. Please tell me whether you mean the coding (not genomic) HLA sequence, and whether the BAM file should be the one generated by mapping the FASTQ reads to hg38 (that is the only BAM file I have). Also, please let me know which position I should look at to find out which allele, for example HLA-A*11:121, HLA-A*11:162, or HLA-A*11:26, is correct.
Many thanks for your help
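On the question of which positions to look at, one possible approach is to compare the candidate allele sequences column by column and list where they disagree. The sketch below assumes the sequences have already been aligned to the same length (for example, taken from a multiple alignment, with `-` for gaps); it does not perform any alignment itself. The sequences shown are invented toy strings, not the real alleles, and the reported positions are relative to the alignment rather than hg38 coordinates.

```python
from typing import Dict, List, Tuple


def differing_positions(aligned: Dict[str, str]) -> List[Tuple[int, Dict[str, str]]]:
    """Return (1-based position, {allele: base}) for every column that differs.

    ASSUMPTION: all sequences are pre-aligned and have identical length.
    """
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be pre-aligned to one common length")
    length = lengths.pop()
    diffs: List[Tuple[int, Dict[str, str]]] = []
    for i in range(length):
        column = {name: seq[i] for name, seq in aligned.items()}
        if len(set(column.values())) > 1:
            diffs.append((i + 1, column))
    return diffs


# Toy aligned sequences (made up for illustration only).
toy_alignment = {
    "A*11:121": "ACGTAC",
    "A*11:162": "ACGTTC",
    "A*11:26": "ACCTAC",
}
for pos, column in differing_positions(toy_alignment):
    print(pos, column)
```

If the reads are aligned to one of these allele sequences, the listed positions are the ones worth inspecting in IGV against that same sequence.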
Hey, it is basically about following the logic in this program (I worked briefly with the developer): https://bitbucket.org/mcgranahanlab/lohhla/src/master/
To help, the methodology employed by this program can be found in the Methods of the related publication. Start from the section entitled 'LOHHLA (Loss Of Heterozygosity in Human Leukocyte Antigen) algorithm'.
When broken down, the methods are really quite simple.
I am not sure about the BAMs that you have currently used, i.e., I don't know to which genome the reads were initially aligned.
Thanks for your feedback. Unfortunately, I didn't get the point, as I cannot find any way to achieve my purpose (BAM visualization of HLA alleles) with the LOHHLA tool. My BAM files were generated by aligning FASTQ reads to hg38.
I brought IMGT and the LOHHLA pipeline to your attention in case you had wanted to start again from the FASTQ stage, and obviously I was not implying that you should follow the pipeline 100% - it was merely to give you ideas.
As you appear to want to stick with the output of SeqNext-HLA, may I suggest that you familiarise yourself more with how that pipeline proceeds, and also the contents of the BAMs that it produces. Then, taking everything mentioned in this thread into account, you may be able to do what you want.
Many thanks, Kevin. Unfortunately, the analysis with SeqNext-HLA was done by another person, not me. The final results, just a list of alleles for each individual, were given to me for final reporting; however, there are some cases with several alleles that I want to check manually in IGV to find which one is correct. Anyway, I downloaded the HLA sequence from IMGT and used it as the reference in IGV. I also extracted the HLA region from a BAM file (created by aligning the FASTQ reads to hg38) and tried to load it into IGV, but it gave me an error that the names in the BAM file do not match the reference, which is to be expected. So, the extracted reads should be remapped to the HLA sequence (region) in order to visualize them in IGV, is that right? I checked several HLA calling tools, but it seems that they return only the final allele, not the corresponding BAM file. Could you please let me know of any HLA calling tool that also returns the corresponding BAM file as output, so that I can load it into IGV?
Please let me know if you have any ideas on this issue.
Thank you very much
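The remapping step asked about above could be sketched roughly as follows. This is only an outline under several assumptions: `samtools` and `bwa` are installed, the hg38 BAM is indexed and paired-end, and the region string is an approximate hg38 location for HLA-A that should be verified against your annotation. All file names are hypothetical. The script only builds and prints the commands; `run_all` is provided but not called.

```python
import shlex
import subprocess
from typing import List


def realign_commands(bam: str, region: str, hla_fasta: str, prefix: str) -> List[List[str]]:
    """Build commands to extract reads from a region and realign them to an HLA FASTA."""
    region_bam = f"{prefix}.region.bam"
    name_bam = f"{prefix}.namesorted.bam"
    r1, r2 = f"{prefix}_R1.fastq", f"{prefix}_R2.fastq"
    sam = f"{prefix}.hla.sam"
    out_bam = f"{prefix}.hla.sorted.bam"
    return [
        # Extract reads overlapping the region (the input BAM must be indexed).
        ["samtools", "view", "-b", "-o", region_bam, bam, region],
        # Group mates together before converting back to FASTQ.
        ["samtools", "sort", "-n", "-o", name_bam, region_bam],
        # Write paired reads; singletons and other reads are discarded here.
        ["samtools", "fastq", "-1", r1, "-2", r2,
         "-0", "/dev/null", "-s", "/dev/null", "-n", name_bam],
        # Index the HLA allele FASTA and align the reads to it.
        ["bwa", "index", hla_fasta],
        ["bwa", "mem", "-o", sam, hla_fasta, r1, r2],
        # Coordinate-sort and index so the result can be opened in IGV.
        ["samtools", "sort", "-o", out_bam, sam],
        ["samtools", "index", out_bam],
    ]


def run_all(commands: List[List[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


commands = realign_commands(
    bam="sample.hg38.bam",            # hypothetical input file
    region="chr6:29941260-29945884",  # ASSUMED approximate HLA-A locus on hg38
    hla_fasta="hla_subset.fasta",     # hypothetical FASTA of candidate alleles
    prefix="sample",
)
for cmd in commands:
    print(shlex.join(cmd))
```

In IGV, the same HLA FASTA would then be loaded as the genome before opening the sorted BAM, so that the sequence names match. Note that extracting by region can miss reads whose mates were placed elsewhere or left unmapped in the hg38 alignment, so the pile-up may be incomplete.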