Quantification of repeat expression
7.2 years ago
valerie ▴ 100

Hi guys,

I want to quantify repeat expression in my RNA-seq data. I've obtained BAM files using TopHat, and now I need a GTF file of repeats to calculate the counts. Where can I download one?

Thanks!

ngs RNA-Seq repeats • 2.3k views

Remember to get a GTF file that matches your genome. If your genome came from Ensembl, then you need to get the GTF from Ensembl as well. Otherwise the chromosome identifiers may not match.

BTW: Repeat tracks are under the "Variation and Repeats" group in the UCSC Table Browser.
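A quick way to catch a naming mismatch before counting is to compare the sequence names in the GTF with those of the reference. The snippet below is only a minimal sketch: the function names and the toy data are made up for illustration, and in practice the reference names would come from your BAM header or FASTA index.

```python
def gtf_seqnames(gtf_lines):
    """Collect sequence names (column 1) from GTF lines, skipping comments."""
    names = set()
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split("\t", 1)[0])
    return names


def missing_from_reference(gtf_lines, reference_names):
    """Return GTF sequence names that do not occur in the reference."""
    return sorted(gtf_seqnames(gtf_lines) - set(reference_names))


# Hypothetical toy data: a UCSC-style reference and an Ensembl-style GTF.
ucsc_reference = ["chr1", "chr2", "chrM"]
ensembl_style_gtf = [
    "# toy annotation",
    '1\ttoy\texon\t100\t200\t.\t+\t.\tgene_id "repA"; transcript_id "repA";',
    'MT\ttoy\texon\t300\t400\t.\t-\t.\tgene_id "repB"; transcript_id "repB";',
]
ucsc_style_gtf = [
    'chr1\ttoy\texon\t100\t200\t.\t+\t.\tgene_id "repA"; transcript_id "repA";',
]

print(missing_from_reference(ensembl_style_gtf, ucsc_reference))
print(missing_from_reference(ucsc_style_gtf, ucsc_reference))
```

If the first list is not empty, features on those sequences cannot be matched to any alignment and will silently get zero counts.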

7.2 years ago
Constantine ▴ 290

Go to UCSC

https://genome.ucsc.edu/

and under Tools > Table Browser

Choose your genome and track (ideally RefSeq genes), and select "Output format: GTF"


Thank you! I need repeats, so why RefSeq genes? Is it correct to choose 'Variation and Repeats' as the group and 'RepeatMasker' as the track?


Yes. See my comment above.


Thank you! I used the mouse mm10 genome for TopHat and will use mm10 here again.


Did the genome come from UCSC or Ensembl or someplace else? Also keep in mind the "multi-hits" setting for TopHat. Since you are interested in repeats that setting may affect your results significantly.
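To get a feel for how much the multi-hits setting matters, you can count how many alignment records are unique versus multi-mapped. In TopHat the setting is `-g/--max-multihits` (default 20). The sketch below assumes the alignments are available as SAM text (e.g. from `samtools view`) and that the aligner wrote the `NH` tag, which TopHat normally does. The function name and toy records are made up for illustration. Note that it counts alignment records, not reads, so a read with three hits contributes three records.

```python
def count_by_multiplicity(sam_lines):
    """Classify mapped alignment records as unique or multi-mapped via the NH tag."""
    counts = {"unique": 0, "multi": 0, "no_nh_tag": 0}
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete SAM record
            continue
        if int(fields[1]) & 0x4:  # FLAG bit 0x4: read is unmapped
            continue
        nh = None
        for tag in fields[11:]:  # optional fields start at column 12
            if tag.startswith("NH:i:"):
                nh = int(tag[5:])
                break
        if nh is None:
            counts["no_nh_tag"] += 1
        elif nh == 1:
            counts["unique"] += 1
        else:
            counts["multi"] += 1
    return counts


# Hypothetical toy records: one unique read, one read reported at two loci.
toy_sam = [
    "@HD\tVN:1.0\tSO:coordinate",
    "r1\t0\tchr1\t100\t50\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:1",
    "r2\t0\tchr1\t500\t1\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:2",
    "r2\t256\tchr2\t900\t1\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:2",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]

print(count_by_multiplicity(toy_sam))
```

This matters for repeats because, as far as I know, htseq-count with default settings does not assign reads with `NH` greater than 1 to any feature and reports them as `alignment_not_unique`, so a large "multi" fraction means many repeat-derived reads are dropped from the counts.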


Actually, I downloaded an archive with the genome, Bowtie2 indexes and other files from here: ftp://ussd-ftp.illumina.com/Mus_musculus/UCSC/mm10/ So it is UCSC, as far as I understand.


That is correct. So you are fine with getting the repeats GTF from UCSC.


Thank you for your help!


Hi Valerie,

I have a similar project, working on SSR repeats. Could you please tell me what your workflow is?


Hi Seta,

I simply use tophat2 to map the reads to the reference genome. Then I sort the alignments using samtools and use htseq-count to obtain counts from the BAM file. At this stage I needed the GTF file we discussed here. Then you can apply any normalization to the counts; I prefer DESeq. Let me know if you still have questions.
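For anyone who wants the steps spelled out, here is a rough sketch of the three commands, assembled in Python so they can be printed and checked before running. Everything specific is an assumption for illustration: the file names, the index prefix, single-end unstranded reads (`-s no`), the samtools 1.3+ `sort -o` syntax, and a repeats GTF that uses `exon` features with a `gene_id` attribute (check column 3 and the attributes of your own file).

```python
import shlex


def build_commands(index_prefix, fastq, gtf, out_dir="tophat_out", max_multihits=20):
    """Return the mapping, sorting and counting commands as argument lists."""
    hits = f"{out_dir}/accepted_hits.bam"  # TopHat's alignment output file
    sorted_bam = f"{out_dir}/accepted_hits.namesorted.bam"  # hypothetical name
    return [
        # 1. Map reads; -g caps the number of alignments reported per read.
        ["tophat2", "-o", out_dir, "-g", str(max_multihits), index_prefix, fastq],
        # 2. Sort alignments by read name.
        ["samtools", "sort", "-n", "-o", sorted_bam, hits],
        # 3. Count alignments per repeat; the counts table goes to stdout.
        ["htseq-count", "-f", "bam", "-r", "name", "-s", "no",
         "-t", "exon", "-i", "gene_id", sorted_bam, gtf],
    ]


# Hypothetical inputs; replace with your own paths.
commands = build_commands("mm10/genome", "sample.fastq", "mm10_rmsk.gtf")
for cmd in commands:
    print(shlex.join(cmd))
```

Since htseq-count writes the counts to standard output, redirect the last command to a file when you run it. The resulting table can then be loaded into DESeq for normalization.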


Hi friend, thank you very much for your explanation. Since you mentioned "repeats" in the title of your question, I thought you had a specific method for surveying these regions. Now I see that you follow the standard approach.
