Question: Create Interval List for Picard CollectRNASeqmetrics
1
4.0 years ago by
Harshal50
London, UK
Harshal50 wrote:

Hi,

How can I create a ribosomal RNA interval list for use with Picard CollectRnaSeqMetrics?
I have mapped reads to the Drosophila melanogaster Ensembl genome using the Ensembl GTF, and I need to find the percentage of reads mapping to ribosomal RNA. Is there a way to create an interval list from a GTF file?

Thanks !!

rna-seq next-gen • 6.0k views
modified 3.1 years ago by lhaiyan320 • written 4.0 years ago by Harshal50

Maybe this can help: https://pythonhosted.org/pybedtools/intervals.html

written 4.0 years ago by Nandini710
5
4.0 years ago by
Dan D6.6k
Tennessee
Dan D6.6k wrote:

The documentation for the IntervalList format is somewhat hard to find. From the linked page:

Represents a list of intervals against a reference sequence that can be written to and read from a file. The file format is relatively simple and reflects the SAM alignment format to a degree. A SAM style header must be present in the file which lists the sequence records against which the intervals are described. After the header the file then contains records one per line in text format with the following values tab-separated: Sequence name, Start position (1-based), End position (1-based, end inclusive), Strand (either + or -), Interval name (an, ideally unique, name for the interval).

So the first thing you need to do is get the header from your SAM/BAM file:

samtools view -H [your.bam] > intervalList.txt

If your GTF file is standard and we assume that it contains only ribosomal intervals, then we need the first, fourth, fifth, seventh, and ninth fields from the file. We can append them onto our text file which contains the header:

cut -s -f 1,4,5,7,9 [your.gtf] >> intervalList.txt

This is a very basic approach and you'll probably want to modify it somewhat for your specific needs, but hopefully it's a good start.

written 4.0 years ago by Dan D6.6k

Thanks Deedee !! It worked ! 

written 4.0 years ago by Harshal50
0
3.9 years ago by
Kamil1.8k
Boston
Kamil1.8k wrote:

You can see my ribosomal intervals file and a simple script I used to create it here:

https://gist.github.com/slowkow/b11c28796508f03cdf4b

 

Related questions:

Ribosomal Intervals For Collectrnaseqmetrics

written 3.9 years ago by Kamil1.8k
0
3.1 years ago by
lhaiyan320
United States
lhaiyan320 wrote:

Hi, Kamil:

Thanks for the post. For the rRNA.interval file, you use fields 1, 4, 5, 7 and 9 of the GTF file. Could you also tell me how to get the genes.interval and exons.interval files? I want gene, exon, and rRNA intervals for both human and mouse. I downloaded the GTF files from Ensembl. Thanks very much.

 

HY

written 3.1 years ago by lhaiyan320

You'll have to modify line 43 of the script to say "exon" or "gene" instead of "transcript".

ADD REPLYlink written 3.1 years ago by Kamil1.8k