How to pull out protein sequences from BAM files containing genomic alignments

9.3 years ago

julia92796 • 0

Hi,

I'm trying to pull out protein sequences from BAM files containing genome alignments. What is the best way to do this? Right now, I have a BAM file containing the Neanderthal reads aligned to the human genome and a FASTA file containing the human reference sequence. I wasn't sure whether the next step would be to generate the Neanderthal consensus sequence using SAMtools. If so, where do I go from there? If not, what should I do next?

SAMtools genome alignment sequence gene • 2.1k views

updated 3 months ago by RD ▴ 30 • written 9.3 years ago by julia92796 • 0

Have you seen this thread: Looking for neanderthal genomes to download

What are you looking to get at the end?

9.3 years ago by GenoMax 154k

Thanks, this thread is helpful. I have a list of human proteins, as well as the genes coding for these proteins, and I'm trying to find homologs of these proteins in other hominin species. I suspect that the next step in my process is generating the consensus sequence for the Neanderthal genome, but after that, I'm not sure how to proceed.
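
One possible way forward after the consensus step, sketched under assumptions rather than taken from this thread: if the Neanderthal consensus is written in human reference coordinates (for example with `samtools consensus` or `bcftools consensus`, without applying indels), the coding exons of each human gene can be cut out of the consensus using the human annotation, spliced together, and translated. In the Python sketch below, the function names, the toy sequence, and the exon coordinates are all hypothetical; real coordinates would come from a human GTF/BED annotation, and uncalled bases (`N`) are translated as `X`.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Reverse-complement a DNA string (N stays N)."""
    return seq.upper().translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def spliced_cds(chrom_seq, exons, strand):
    """Join coding exons cut from a chromosome-length consensus sequence.

    exons: list of (start, end) pairs, 0-based half-open, given on the
    forward strand (hypothetical input, e.g. parsed from an annotation).
    """
    cds = "".join(chrom_seq[start:end] for start, end in sorted(exons)).upper()
    return reverse_complement(cds) if strand == "-" else cds


def translate(cds):
    """Translate a CDS up to the first stop codon; unknown codons become X."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        amino_acid = CODON_TABLE.get(cds[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Toy example: two coding exons separated by a 6-bp "intron".
toy_consensus = "CC" + "ATGGCC" + "GTAAGT" + "AAATGA" + "CC"
toy_exons = [(2, 8), (14, 20)]
print(translate(spliced_cds(toy_consensus, toy_exons, "+")))  # prints MAK
```

Because the coordinates are shared with the human reference, the same exon list applied to the human FASTA gives the matching human protein for comparison.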

9.3 years ago by julia92796 • 0

Hey @Julia92796, were you able to figure it out? I'm curious how you built the consensus sequences and aligned them to the human proteins. In my attempt I see a lot of insertions, and the MSA seems to be failing in the domain regions.
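
For what it's worth, here is a minimal sketch of one way to sidestep the MSA, assuming both proteins were translated from the same exon coordinates on a reference-anchored consensus (an assumption, not something confirmed in this thread). In that case the two sequences should have the same length and can be compared position by position; a length mismatch would point to a premature stop or a frameshift that needs a closer look. The function name and the example sequences are hypothetical.

```python
def protein_differences(human, other):
    """List (position, human_residue, other_residue) for mismatching positions.

    Positions are 1-based. Residues called X in `other` (translated from
    uncalled bases) are skipped, since they carry no information.
    """
    if len(human) != len(other):
        raise ValueError(
            "lengths differ (%d vs %d): possible premature stop or frameshift"
            % (len(human), len(other))
        )
    return [
        (i + 1, h, o)
        for i, (h, o) in enumerate(zip(human, other))
        if h != o and o != "X"
    ]


# Toy example with made-up sequences.
print(protein_differences("MAKT", "MVKX"))  # prints [(2, 'A', 'V')]
```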

3 months ago by RD ▴ 30
