Question: Is there a way I can upload a reference sequence to Clustal Omega to get aligned protein sequences, or a different way of getting the sequences?
vellryba (0) wrote:

Hello.

My aim is to find correlated mutations within a single read pair. For example, I need to know whether the sequence with ID X, which has a mutation at position, let's say, 800, also has a mutation at position 1100. I managed to get BAM and SAM files containing only the reads that span the regions I am interested in. I have the FASTA sequences, and I used Translator X to translate them into protein FASTA.

I know what I was expecting to get back, but when I loaded these into Clustal Omega to get an alignment, it didn't work that well. There are gaps, and some sequences were just badly translated. I looked at the badly translated sequences in the FASTA file I got from Translator X, and they are already badly translated there. When I looked at the nucleotide FASTA, the sequences are fine. Is there a way I can feed my reference sequence into an alignment tool so that the protein sequences are translated and aligned correctly?

Does anybody have any experience with this type of analysis?

sequencing alignment • 503 views
written 16 months ago by vellryba (0)

I don't fully understand your question.

If you have a reference sequence and your reads completely cover the region you are interested in, why is there a need to look at protein translations?

written 16 months ago by genomax (75k)

Hi, I know there is a mutation present (sometimes) in some of the reads. I also know that there is a mutation (again, sometimes) a bit further down the genome. I want to see whether that second mutation is only present when the first one is present. In other words, these mutations are hierarchical. I have the SAM and BAM files, which only contain the reads that span both regions.

Now I just want to somehow count either nucleotide (or protein) variants in those reads. Something like this:

1st position A, 2nd position C - 1200
1st position A, 2nd position T - 800

etc.

I am just not sure how to go about it.

modified 16 months ago by RamRS (25k) • written 16 months ago by vellryba (0)

Use bam-readcount to get this information.

written 16 months ago by genomax (75k)

Hi,

this only gives me a count at each position. I need to see whether the two positions are correlated, like this:

position 800 + position 1000: count
AT 1000
CT 800
AG 600

etc.

modified 16 months ago by RamRS (25k) • written 16 months ago by vellryba (0)
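A pairwise table like the one described in this comment could be built by walking the SAM records directly and grouping the calls by read name, so that both mates of a pair contribute to the same row. The following is only a minimal sketch under several assumptions: the input is plain-text SAM (for example, the output of `samtools view`), positions are 1-based reference coordinates, both mates share the same read name, and if both mates cover the same position the first one seen wins. `base_at` and `joint_counts` are hypothetical helper names, not functions from any library.

```python
import re
from collections import Counter, defaultdict

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def base_at(pos, start, cigar, seq):
    """Return the read base aligned to 1-based reference position `pos`.

    Returns "-" if the position is deleted/skipped in this read and
    None if the alignment does not cover the position at all.
    """
    ref = start  # 1-based reference coordinate of the next aligned base
    idx = 0      # 0-based index into the read sequence
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in "M=X":            # consumes both reference and read
            if ref <= pos < ref + length:
                return seq[idx + pos - ref]
            ref += length
            idx += length
        elif op in "DN":           # consumes reference only
            if ref <= pos < ref + length:
                return "-"
            ref += length
        elif op in "IS":           # consumes read only
            idx += length
        # H and P consume neither reference nor read
    return None


def joint_counts(sam_lines, pos1, pos2):
    """Count base combinations at (pos1, pos2) per read pair."""
    calls = defaultdict(dict)      # read name -> {position: base}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue               # skip blank lines and SAM header lines
        f = line.rstrip("\n").split("\t")
        name, flag, start, cigar, seq = f[0], int(f[1]), int(f[3]), f[5], f[9]
        # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800) records
        if flag & 0x904 or cigar == "*":
            continue
        for pos in (pos1, pos2):
            base = base_at(pos, start, cigar, seq)
            if base is not None:
                calls[name].setdefault(pos, base)
    # Keep only read pairs that were observed at both positions
    return Counter(c[pos1] + c[pos2] for c in calls.values()
                   if pos1 in c and pos2 in c)
```

With a SAM file opened as `fh`, calling `joint_counts(fh, 800, 1000)` would return a counter keyed by two-letter combinations such as `AT` or `CT`. Base qualities and mapping qualities are ignored here, so low-quality calls would need to be filtered separately.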

Sorry to bother you, but do you have any other suggestions? This one won't work for the reasons below.

written 16 months ago by vellryba (0)

You can probably do LD/Correlation analysis using PLINK (not my area of strength). This is only a pointer for you to consider.

written 16 months ago by genomax (75k)
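If per-read-pair combination counts for the two positions are already available, the association between the two sites can also be estimated directly from them. The sketch below uses the textbook linkage disequilibrium statistics D and r² and is not PLINK's implementation. It assumes that the counts are keyed by two-letter strings (base at the first position, then base at the second), that the reference bases are supplied by the user, and that any non-reference base counts as "mutated". `pairwise_association` is a hypothetical name.

```python
def pairwise_association(counts, ref1, ref2):
    """Summarise co-occurrence of non-reference bases at two positions.

    counts: mapping of two-letter strings (base1 + base2) to read-pair counts
    ref1, ref2: reference bases at the first and second position
    """
    n = sum(counts.values())
    if n == 0:
        raise ValueError("no read pairs to analyse")

    # Read pairs carrying a non-reference base at each site, and at both sites
    mut1 = sum(c for k, c in counts.items() if k[0] != ref1)
    mut2 = sum(c for k, c in counts.items() if k[1] != ref2)
    both = sum(c for k, c in counts.items() if k[0] != ref1 and k[1] != ref2)

    p1, p2, p12 = mut1 / n, mut2 / n, both / n
    d = p12 - p1 * p2                          # D = P(both) - P(first) * P(second)
    denom = p1 * (1 - p1) * p2 * (1 - p2)
    r2 = d * d / denom if denom > 0 else float("nan")

    # Fraction of pairs with the second mutation that also carry the first;
    # a value of 1.0 is what "second only occurs with first" would look like
    frac = both / mut2 if mut2 else float("nan")

    return {"n": n, "D": d, "r2": r2, "second_with_first": frac}
```

For the question asked in this thread, `second_with_first` is probably the most directly useful number. Sequencing errors are not modelled, so small deviations from 1.0 should be interpreted with care.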

Do you specifically want to find reads which contain multiple mutations, or are you just interested in co-localised mutations?

written 16 months ago by Joe (15k)

Hi, I need to know that the mutations came from a single read pair. There are particular regions I have in mind.

modified 16 months ago • written 16 months ago by vellryba (0)

If the pair of reads you are looking at only flanks the regions of interest, then the reads represent a fragment that spans the region. Unless you have reads that go through the region of interest, you have no way of confirming that a particular mutation is present in those fragments.

You will need to use Sanger sequencing on the original sample to confirm that the mutation exists.

modified 16 months ago • written 16 months ago by genomax (75k)