Question: Quantification of repeats expression
valerie40 wrote:

Hi guys,

I want to quantify repeat expression in my RNA-seq data. I've obtained BAM files using TopHat, and now I need a GTF file for repeats to calculate the counts. Where can I download it?

Thanks!

rna-seq repeats ngs • 476 views
modified 15 months ago by Constantine200 • written 15 months ago by valerie40

Remember to get a GTF file that matches your genome. If your genome came from Ensembl, then you need to get the GTF from Ensembl as well. Otherwise the chromosome identifiers may not match (for example, UCSC uses "chr1" where Ensembl uses "1").

BTW: Repeat tracks are under the "Variation and Repeats" group in the UCSC Table Browser.

modified 15 months ago • written 15 months ago by genomax46k
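
A minimal Python sketch of the identifier check described above: it compares the sequence names in column 1 of a GTF file with the `SN:` names in a SAM/BAM header (for instance the text printed by `samtools view -H`). The function names are made up for illustration and are not part of any tool.

```python
def gtf_seqnames(gtf_lines):
    """Collect the sequence names found in column 1 of GTF lines."""
    names = set()
    for line in gtf_lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split("\t", 1)[0])
    return names


def sam_header_seqnames(header_lines):
    """Collect reference names from the @SQ lines of a SAM header."""
    names = set()
    for line in header_lines:
        if line.startswith("@SQ"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def unmatched_seqnames(gtf_lines, header_lines):
    """Return GTF sequence names that are absent from the alignment header."""
    return gtf_seqnames(gtf_lines) - sam_header_seqnames(header_lines)
```

An empty result suggests the annotation and the alignments use the same naming scheme. A result such as `{"1", "2", "X"}` would point to an Ensembl-style GTF being used with UCSC-style alignments.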
Constantine200 wrote:

Go to UCSC

https://genome.ucsc.edu/

and under Tools > Table Browser

Choose your genome and track (ideally RefSeq genes), and select "Output format: GTF"

modified 15 months ago • written 15 months ago by Constantine200

Thank you! I need repeats, so why RefSeq genes? Is it correct to choose 'Variation and Repeats' as the group and 'RepeatMasker' as the track?

written 15 months ago by valerie40

Yes. See my comment above.

written 15 months ago by genomax46k
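
For anyone who works from a downloaded RepeatMasker table dump rather than the Table Browser's GTF output, here is a rough Python sketch of the conversion. It assumes the 17-column UCSC `rmsk` table layout (bin, swScore, milliDiv, milliDel, milliIns, genoName, genoStart, genoEnd, genoLeft, strand, repName, repClass, repFamily, ...). The `class_id` and `family_id` attribute names are invented here for illustration, and the exact records the Table Browser emits may differ.

```python
def rmsk_row_to_gtf(row):
    """Convert one tab-separated rmsk table row into a GTF line.

    Assumes the UCSC rmsk column order; start coordinates in that table
    are 0-based, while GTF uses 1-based inclusive coordinates.
    """
    fields = row.rstrip("\n").split("\t")
    sw_score = fields[1]
    chrom = fields[5]
    start = int(fields[6]) + 1  # 0-based -> 1-based
    end = int(fields[7])        # half-open end equals 1-based inclusive end
    strand = fields[9]
    rep_name, rep_class, rep_family = fields[10], fields[11], fields[12]
    # class_id / family_id are hypothetical attribute names.
    attributes = (
        f'gene_id "{rep_name}"; transcript_id "{rep_name}"; '
        f'class_id "{rep_class}"; family_id "{rep_family}";'
    )
    return "\t".join(
        [chrom, "rmsk", "exon", str(start), str(end),
         sw_score, strand, ".", attributes]
    )
```

Using the repeat name as `gene_id` means a counting tool will aggregate reads per repeat name rather than per individual genomic copy, which may or may not be what you want.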

Thank you! I used the mouse mm10 genome for TopHat and will use mm10 here as well.

written 15 months ago by valerie40

Did the genome come from UCSC or Ensembl or someplace else? Also keep in mind the "multi-hits" setting for TopHat. Since you are interested in repeats that setting may affect your results significantly.

modified 15 months ago • written 15 months ago by genomax46k
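
To get a feel for how much the multi-hit setting (TopHat's `-g/--max-multihits` option) matters for a given sample, one option is to tabulate the `NH:i` tag that TopHat writes on each alignment record. The sketch below is only an illustration: it works on SAM text lines, for example the output of `samtools view`, and the function name is made up.

```python
from collections import Counter


def nh_distribution(sam_lines):
    """Count alignment records by their NH:i value (number of reported hits).

    Records without an NH tag are counted under the key None.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        nh = None
        # Optional tags start at column 12 of a SAM record.
        for tag in fields[11:]:
            if tag.startswith("NH:i:"):
                nh = int(tag[5:])
                break
        counts[nh] += 1
    return counts
```

Note that this counts records, not reads: a read reported at four locations contributes four records with `NH:i:4`. As far as I know, htseq-count in its default mode does not assign multiply aligned reads to any feature, so a large share of records with NH greater than 1 would mean many repeat-derived reads are left uncounted.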

Actually, I downloaded an archive with the genome, Bowtie2 indexes and other files from here: ftp://ussd-ftp.illumina.com/Mus_musculus/UCSC/mm10/ So it is UCSC, as far as I understand.

written 15 months ago by valerie40

That is correct. So you are fine with getting the repeats GTF from UCSC.

written 15 months ago by genomax46k

Thank you for your help!

written 15 months ago by valerie40

Hi Valerie,

I have a similar project, working on SSR repeats. Could you please tell me what your workflow is?

modified 15 months ago • written 15 months ago by seta960

Hi Seta,

I simply use tophat2 to map the reads to the reference genome. Then I sort the reads using samtools and use htseq-count to obtain counts from the BAM file. At this stage I needed the GTF file we discussed here. Then you can apply any normalization to the counts; I prefer DESeq. Let me know if you still have questions.

written 15 months ago by valerie40
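
The workflow above can be sketched as a small Python helper that only builds the three command lines and does not run anything. Several things here are assumptions rather than details from the post: single-end reads, an unstranded library (`-s no`), name-sorted input for htseq-count, a samtools 1.x style `sort -o` call, and a repeats GTF whose records use the `exon` feature type with a `gene_id` attribute. The file naming scheme is hypothetical.

```python
import shlex


def build_pipeline(sample, bowtie2_index, fastq, repeats_gtf, max_multihits=20):
    """Return the tophat2, samtools and htseq-count commands as argument lists."""
    out_dir = f"{sample}_tophat"
    accepted = f"{out_dir}/accepted_hits.bam"  # TopHat's alignment output file
    sorted_bam = f"{sample}.namesorted.bam"    # hypothetical file name
    return [
        # 1. Map reads; -g caps the number of reported hits per read.
        ["tophat2", "-g", str(max_multihits), "-o", out_dir,
         bowtie2_index, fastq],
        # 2. Sort alignments by read name.
        ["samtools", "sort", "-n", "-o", sorted_bam, accepted],
        # 3. Count reads per repeat; htseq-count prints the table to stdout.
        ["htseq-count", "-f", "bam", "-r", "name", "-s", "no",
         "-t", "exon", "-i", "gene_id", sorted_bam, repeats_gtf],
    ]


def as_shell(commands):
    """Render the argument lists as shell-quoted command strings."""
    return [shlex.join(cmd) for cmd in commands]
```

Printing `as_shell(build_pipeline(...))` gives the commands for review before running them by hand or through `subprocess.run`. The resulting count tables can then go into DESeq or another normalization method.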

Hi friend, thank you very much for your explanation. Since you mentioned "repeats" in the title of your question, I thought you had a specific method for surveying these regions. Now I see that you follow the standard approach.

modified 15 months ago • written 15 months ago by seta960