Question: SSR marker discovery from RNA-seq data when a reference genome is available
seta (Sweden) wrote, 2.2 years ago:

Hi all,

I have Illumina reads (36 bp, single-end) from mRNA sequencing of human samples under two different conditions. I want to find simple sequence repeat (SSR) markers that are likely to be informative between the experimental conditions. Since the genome is available, I mapped the reads to the reference genome (global alignment) and extracted the consensus sequence from the BAM file. I set the insertion/deletion cost to its lowest level during mapping; please advise if other mapping options would be useful for this purpose. However, the consensus sequence was full of Ns, indicating that no reads mapped to those regions. Could you please let me know whether I should call SSRs on this consensus sequence, or whether you have alternative suggestions for SSR discovery in RNA-seq data when a reference genome is available?

Thanks

Tags: rna-seq, alignment, ssr, genome
modified 2.2 years ago • written 2.2 years ago by seta
WouterDeCoster (Belgium) wrote, 2.2 years ago:

Can't you identify SSR sequences directly from the reference genome, and afterwards find those which are actually expressed? I don't see the point of aligning, calling a consensus and then looking for repetitive sequences. As I said before in a very similar thread you started: I doubt that RNA-seq is the best technology for studying SSRs.
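To make the first step concrete, here is a minimal sketch of scanning a reference sequence for perfect SSRs with a regular expression and writing them out as BED lines. The function names (`find_ssrs`, `to_bed`) and the thresholds (unit length 1 to 6 bp, at least 5 repeats, at least 12 bp in total) are assumptions chosen for illustration, not settings taken from this thread; a dedicated SSR finder would also handle imperfect and compound repeats.

```python
import re


def find_ssrs(seq, max_unit=6, min_repeats=5, min_length=12):
    """Return (start, end, unit, n_repeats) for perfect tandem repeats.

    Coordinates are 0-based and half-open, as in BED.
    The thresholds are illustrative assumptions only.
    """
    # Lazy unit match so the shortest repeating unit is preferred,
    # followed by at least (min_repeats - 1) further copies of it.
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        length = m.end() - m.start()
        if length >= min_length:
            unit = m.group(1)
            hits.append((m.start(), m.end(), unit, length // len(unit)))
    return hits


def to_bed(chrom, hits):
    """Format hits as tab-separated BED lines with a (unit)n name column."""
    return ["%s\t%d\t%d\t(%s)%d" % (chrom, s, e, u, n) for s, e, u, n in hits]


if __name__ == "__main__":
    # Toy sequence standing in for one reference chromosome.
    toy = "GGG" + "AC" * 6 + "TTGCA"
    for line in to_bed("chr1", find_ssrs(toy)):
        print(line)
```

In practice you would run this per chromosome over the reference FASTA and concatenate the BED lines into one file of SSR locations.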

written 2.2 years ago by WouterDeCoster

Thanks, friend. The samples come from healthy and diseased humans; unfortunately, we do not have the genome sequence of the diseased group. Although RNA-seq data may not be optimal for SSR discovery, it is the only kind of data available here. Any suggestions?

Assuming the SSRs are identified from the reference genome, how do I find which SSRs are actually expressed?

modified 2.2 years ago • written 2.2 years ago by seta

You would need to write a program.

written 2.2 years ago by theobroma22

Sorry, could you please explain more?

modified 2.2 years ago • written 2.2 years ago by seta

That's not very helpful.

written 2.2 years ago by WouterDeCoster

WouterDeCoster, as you suggested, assuming the SSRs are identified from the reference genome, how do I find which SSRs are actually expressed?

written 2.2 years ago by seta

Once you have generated a BED file of SSR locations, you can use bedtools to find the intervals in which reads are present in the alignment.
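As a rough illustration of what that overlap step computes, here is a small pure-Python sketch that counts aligned reads overlapping each SSR interval and keeps the intervals with at least one read. The helper names (`parse_bed`, `count_overlaps`, `expressed`) and the `min_reads` cutoff are hypothetical, and read coordinates are assumed to be already extracted from the alignment as (chrom, start, end) tuples. On real data bedtools does this far more efficiently (for example `bedtools intersect` with its `-c` count option); the sketch only shows the logic.

```python
from collections import defaultdict


def parse_bed(lines):
    """Parse BED lines into (chrom, start, end) tuples (0-based, half-open)."""
    intervals = []
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines and BED header lines
        chrom, start, end = line.rstrip("\n").split("\t")[:3]
        intervals.append((chrom, int(start), int(end)))
    return intervals


def count_overlaps(ssrs, reads):
    """Count reads overlapping each SSR interval by at least one base.

    Naive O(n * m) scan per chromosome; fine for a sketch, too slow
    for a full RNA-seq alignment.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in reads:
        by_chrom[chrom].append((start, end))
    counts = {}
    for chrom, s, e in ssrs:
        counts[(chrom, s, e)] = sum(
            1 for rs, r_end in by_chrom[chrom] if rs < e and r_end > s
        )
    return counts


def expressed(ssrs, reads, min_reads=1):
    """Return SSR intervals covered by at least min_reads reads."""
    counts = count_overlaps(ssrs, reads)
    return [iv for iv in ssrs if counts[iv] >= min_reads]


if __name__ == "__main__":
    ssrs = parse_bed(["chr1\t3\t15\t(AC)6", "chr1\t100\t120\t(T)20"])
    reads = [("chr1", 0, 36), ("chr1", 10, 46), ("chr1", 120, 156)]
    print(expressed(ssrs, reads))
```

With 36 bp reads, keep in mind that a read merely touching an SSR says the locus is transcribed; comparing repeat lengths between conditions would need reads that span the whole repeat plus some flanking sequence.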

written 2.2 years ago by WouterDeCoster