6.6 years ago
bioinfo8
Hi,
I have multiple paired-end BAM files (aligned to a reference) and a BED file of multiple genes (chr, start, end). I would like to visualize the alignments. I know this can be done in IGV, but I want the alignment in Clustal format for each gene, so that I can see where the forward and reverse reads are located and how reads from different samples map to the reference.
Any guidance would be appreciated.
Thanks.
Export a smaller BAM of the regions you are interested in using samtools view (with a region), then convert it to FASTA using samtools fasta (this should be possible using pipes). Follow this up with a regular MSA against the reference gene sequence.
That is what I did, and I had MSA in mind:
But the resulting fasta file looks like this (including forward and reverse reads):
I want to get the reads in sequential order in one file so that I can run an MSA with the reference gene. I am stuck here. Also, I don't know whether I should make two separate files for forward and reverse reads. How should I proceed?
What exactly does that mean?
In the order of their alignment to the reference. I should also know whether there are overlaps, and remove those regions to get the final sequence.
I think the reads should be extracted in order (as long as your BAM is sorted). You can easily check that.
That makes it sound like you are not looking for reads but a consensus. That would be a different operation (BAM format file to FASTA alignment file).
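To check the ordering and strand questions raised above, here is a minimal sketch that reads SAM text (for example the output of samtools view on the region BAM) and reports position, strand and mate for each read, then tests whether positions are in coordinate order. The function names `describe_reads` and `is_coordinate_sorted` are made up for illustration; only the standard SAM columns and FLAG bits are assumed.

```python
def describe_reads(sam_lines):
    """Yield (name, chrom, pos, strand, mate) for each mapped read in SAM text."""
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag, chrom, pos = fields[0], int(fields[1]), fields[2], int(fields[3])
        if flag & 0x4:  # read is unmapped
            continue
        strand = "-" if flag & 0x10 else "+"  # 0x10: aligned to the reverse strand
        if flag & 0x40:                       # 0x40: first read of the pair
            mate = "1"
        elif flag & 0x80:                     # 0x80: second read of the pair
            mate = "2"
        else:
            mate = "?"
        yield name, chrom, pos, strand, mate


def is_coordinate_sorted(reads):
    """True if positions never decrease within each chromosome block."""
    seen_chroms = set()
    last_chrom, last_pos = None, -1
    for _, chrom, pos, _, _ in reads:
        if chrom != last_chrom:
            if chrom in seen_chroms:  # chromosome reappears: not sorted
                return False
            seen_chroms.add(chrom)
            last_chrom = chrom
        elif pos < last_pos:
            return False
        last_pos = pos
    return True


if __name__ == "__main__":
    # Two toy records (hypothetical data): a properly paired forward/reverse pair.
    sam = [
        "@HD\tVN:1.6\tSO:coordinate",
        "r1\t99\tchr1\t101\t60\t4M\t=\t150\t53\tACGT\tIIII",
        "r1\t147\tchr1\t150\t60\t4M\t=\t101\t-53\tTTGA\tIIII",
    ]
    reads = list(describe_reads(sam))
    for read in reads:
        print(*read, sep="\t")
    print("coordinate sorted:", is_coordinate_sorted(reads))
```

One thing to keep in mind when comparing this with the FASTA: in the BAM, the sequence of a reverse-strand read is stored in reference orientation, whereas samtools fasta writes reads back in their original orientation, which is part of why forward and reverse reads look mixed in the FASTA output.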
That gives output almost the same as the FASTA, so it is not helpful. :(
Just take the fasta file you obtained above and run an MSA with clustal (or any other tool you like) and see what you get. Do not forget to include the reference in fasta format.
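To repeat this per gene, the extract-then-convert step suggested earlier in the thread can be driven from the BED file. The sketch below only builds the command lines and does not run them; the helper names and the file names (sample.bam, ref.fa) are placeholders, and it assumes the BAM is indexed so that samtools view accepts a region. Note that BED coordinates are 0-based and half-open while samtools regions are 1-based and inclusive, so the start needs +1.

```python
import shlex


def bed_to_regions(bed_lines):
    """Convert BED rows (0-based, half-open) to samtools regions (1-based, inclusive)."""
    regions = []
    for line in bed_lines:
        line = line.strip()
        # Skip blanks, comments and BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.append(f"{chrom}:{int(start) + 1}-{int(end)}")
    return regions


def fasta_pipeline(bam, reference, region, out_fasta):
    """Build a shell pipeline: reference slice first, then reads overlapping the region."""
    q = shlex.quote
    return (
        f"{{ samtools faidx {q(reference)} {q(region)}; "
        f"samtools view -b {q(bam)} {q(region)} | samtools fasta -; }} "
        f"> {q(out_fasta)}"
    )


if __name__ == "__main__":
    # Hypothetical BED rows; in practice read them from the BED file.
    bed = ["chr1\t100\t200\tgeneA", "chr2\t0\t50\tgeneB"]
    for region in bed_to_regions(bed):
        out = region.replace(":", "_") + ".fa"
        print(fasta_pipeline("sample.bam", "ref.fa", region, out))
```

Each printed line is one shell command that writes a per-gene FASTA with the reference slice as the first record, ready to hand to Clustal or another MSA tool.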
Yes, I did. It showed the alignments, but the file is quite big (there are many reads, >1000). I realized that I need a consensus sequence from all the reads for that region, which I can then align with the reference gene. How do I get a consensus sequence from all the reads?
See this: How to get the consensus sequence from a BAM alignment and follow the post linked in @Istvan's answer.
Also see if this shortens the process: https://support.bioconductor.org/p/64748/#64755
Thanks, but it's not that helpful.
Because you changed your requirements.
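For the consensus step discussed above, the idea can be illustrated with a simple per-position majority vote. This is only a toy sketch, not what the linked posts or any particular tool does: it assumes each read is given as (1-based leftmost position, sequence in reference orientation) with no indels or clipping, and it ignores base qualities. The function name `simple_consensus` and the `min_depth` parameter are made up here. For real data a dedicated caller is the safer route (recent samtools releases include a consensus subcommand, for example).

```python
from collections import Counter


def simple_consensus(reads, ref_start, ref_end, min_depth=1):
    """Majority-vote consensus over ref_start..ref_end (1-based, inclusive).

    reads: iterable of (pos, seq) with pos the 1-based leftmost aligned
    position and seq already in reference orientation, without indels.
    Positions with fewer than min_depth covering bases are reported as N.
    """
    length = ref_end - ref_start + 1
    columns = [Counter() for _ in range(length)]
    for pos, seq in reads:
        for offset, base in enumerate(seq):
            idx = pos + offset - ref_start
            if 0 <= idx < length:  # ignore bases outside the region
                columns[idx][base.upper()] += 1
    consensus = []
    for column in columns:
        depth = sum(column.values())
        if depth == 0 or depth < min_depth:
            consensus.append("N")
        else:
            consensus.append(column.most_common(1)[0][0])
    return "".join(consensus)


if __name__ == "__main__":
    # Hypothetical reads covering positions 1-6 of an 8 bp region.
    reads = [(1, "ACGT"), (3, "GTTA"), (3, "CTTA")]
    print(simple_consensus(reads, 1, 8))
```

The resulting single sequence per region can then be aligned against the reference gene instead of aligning every read.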