Question

SSR marker discovery from RNA-seq data when a reference genome is available

7.1 years ago

seta ★ 1.9k

Hi all,

I have Illumina reads (36 bp, single-end) from mRNA sequencing of human samples in two different conditions. I want to find simple sequence repeat (SSR) markers that may differ between the experimental conditions. Since a reference genome is available, I mapped the reads to it (global alignment) and extracted the consensus sequence from the BAM file. I set the insertion/deletion cost to the lowest level during mapping; please advise if another mapping option would be more useful for this purpose. However, the consensus sequence was full of Ns, indicating that no reads mapped to those regions. Could you please let me know whether I should call SSRs on this consensus sequence, or whether you have alternative suggestions for SSR discovery in RNA-seq data when a reference genome is available?

Thanks

SSR RNA-seq genome alignment • 1.8k views

ADD COMMENT • link 7.1 years ago by seta ★ 1.9k

score 0 · Answer 1 · 2017-03-14

7.1 years ago

WouterDeCoster 47k

Can't you identify SSR sequences directly from the reference genome, and then check which of them are actually expressed? I don't see the point of aligning, calling a consensus, and then looking for repetitive sequences. As I said in a very similar thread you started, I doubt that RNA-seq is the best technology for studying SSRs.
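To make the first step concrete, here is a minimal Python sketch of scanning a reference sequence for perfect SSRs and reporting them as BED-style records. It is only an illustration: the `MIN_REPEATS` thresholds, the function names `find_ssrs` and `to_bed`, and the `(motif)count` naming are assumptions of this sketch, not something given in the thread, and a dedicated SSR finder would normally handle a whole genome better.

```python
import re

# Assumed thresholds (not from the thread): motif length -> minimum tandem copies.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(chrom, seq):
    """Return (chrom, start, end, motif, copies) for perfect SSRs in seq.

    Coordinates are 0-based, half-open, as in BED.
    """
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # ([ACGT]{k}) captures one motif; \1{min_rep-1,} demands the extra copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ATAT"); the shorter motif length reports those.
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            copies = (m.end() - m.start()) // k
            hits.append((chrom, m.start(), m.end(), motif, copies))
    return sorted(hits, key=lambda h: (h[1], h[2]))


def to_bed(hits):
    """Format hits as tab-separated BED lines with a '(motif)copies' name."""
    return "\n".join("%s\t%d\t%d\t(%s)%d" % h for h in hits)
```

Calling `find_ssrs` once per chromosome sequence and writing `to_bed(...)` to a file would give the SSR intervals to compare against the alignments afterwards.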

ADD COMMENT • link 7.1 years ago by WouterDeCoster 47k

Thanks, friend. The samples come from healthy and diseased humans; unfortunately, we do not have the genome sequence of the diseased group. Although RNA-seq data may not be optimal for SSR discovery, it is the only kind of data available here. Any suggestions?

Assuming the SSRs are identified from the reference genome, how do I find which SSRs are actually expressed?

ADD REPLY • link 7.1 years ago by seta ★ 1.9k

You would need to write a program.

ADD REPLY • link 7.1 years ago by theobroma22 ★ 1.2k

Sorry, could you please explain more?

ADD REPLY • link 7.1 years ago by seta ★ 1.9k

That's not very helpful.

ADD REPLY • link 7.1 years ago by WouterDeCoster 47k

WouterDeCoster, as you suggested, assuming the SSRs are identified from the reference genome, how do I find which of them are actually expressed?

ADD REPLY • link 7.1 years ago by seta ★ 1.9k

Once you have generated a BED file of SSR locations, you can use bedtools to find the intervals in which reads are present in the alignment.
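For intuition about what that overlap step computes, here is a small pure-Python sketch that counts, for each SSR interval, how many aligned reads overlap it. It is not bedtools itself: the function name `count_overlaps` and the plain `(chrom, start, end)` tuples are assumptions of this sketch, and each read is treated as a single contiguous interval, which ignores spliced alignments.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def count_overlaps(ssr_intervals, read_intervals):
    """Count reads overlapping each SSR interval.

    Both inputs are iterables of (chrom, start, end) with 0-based,
    half-open coordinates as in BED. Returns {ssr_interval: read_count}.
    """
    starts = defaultdict(list)
    ends = defaultdict(list)
    for chrom, start, end in read_intervals:
        starts[chrom].append(start)
        ends[chrom].append(end)
    for chrom in starts:
        starts[chrom].sort()
        ends[chrom].sort()

    counts = {}
    for chrom, s, e in ssr_intervals:
        # Reads that begin before the SSR ends ...
        begin_before_end = bisect_left(starts[chrom], e)
        # ... minus reads that already finished at or before the SSR start.
        finished_before_start = bisect_right(ends[chrom], s)
        counts[(chrom, s, e)] = begin_before_end - finished_before_start
    return counts
```

SSRs with a count of zero have no read support in the sample, so they would not be considered expressed under this simple criterion; what minimum count to require is a separate choice.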

ADD REPLY • link 7.1 years ago by WouterDeCoster 47k