How to create a custom GTF annotation file?
4.7 years ago
John ▴ 270

Hi

I am using RSEM (with bowtie2) for alignment and then gene-level counting. I am using the RefSeq annotation (GFF3) and the genomic.fna reference FASTA file from NCBI. RSEM can convert the GFF3 file to GTF.

How can I subset the GTF (or GFF3) file by gene name? I want to extract the annotation (GTF) for a particular gene and extract that gene's sequence from the reference FASTA file. Then I want to perform the alignment against it.

The main goal is to reduce run time by avoiding alignment against the whole genome.

Thanks in anticipation.

RNA-Seq R genome • 3.2k views

This could potentially force some reads to be aligned to your gene, which would have normally aligned somewhere else.


That's what happened. There are more reads than I expected.


You should not do that! Aligning to only your genes will bias the analysis, as your RNA-Seq experiment reflects the entire transcriptome, not just your gene.


Yes, just switch to a pseudo-aligner if you want to increase the speed. That's sufficient for gene expression.


Can't you just grep for the gene name of interest and redirect the output to a file? All the lines relevant to that gene should have the ID, and this would select and place all lines with the given gene id into a single file.
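One caveat with a plain grep is that a short gene name can also match longer ones (e.g. ABC1 would also pull in ABC10). A minimal Python sketch of the same idea with an exact match on the attribute column is below. It assumes the gene is stored under a `gene_id`, `gene_name` or `gene` attribute (check what your RSEM-converted GTF actually uses); the function names and the file names in the usage comment are hypothetical, not part of RSEM.

```python
import re


def filter_gtf_by_gene(lines, gene, keys=("gene_id", "gene_name", "gene")):
    """Yield GTF lines whose attribute column names `gene` exactly.

    `keys` lists the attribute names to check; which one holds the gene
    name in your file is an assumption you should verify.
    """
    # The closing quote in the pattern makes the match exact,
    # so ABC1 does not also select ABC10.
    pattern = re.compile(
        r'(?:^|;\s*)(?:%s) "%s"'
        % ("|".join(re.escape(k) for k in keys), re.escape(gene))
    )
    for line in lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        # A GTF record has 9 tab-separated columns; attributes are the last.
        if len(fields) == 9 and pattern.search(fields[8]):
            yield line


def gene_span(gtf_lines):
    """Return (seqid, start, end), 1-based inclusive, covering the lines.

    Useful for deciding which region of the reference FASTA to cut out.
    """
    rows = [line.split("\t") for line in gtf_lines]
    if not rows:
        return None
    starts = [int(r[3]) for r in rows]
    ends = [int(r[4]) for r in rows]
    return rows[0][0], min(starts), max(ends)


def write_gene_gtf(gtf_in, gtf_out, gene):
    """Write only the records for `gene` from gtf_in to gtf_out."""
    with open(gtf_in) as src, open(gtf_out, "w") as dst:
        dst.writelines(filter_gtf_by_gene(src, gene))


# Hypothetical usage:
# write_gene_gtf("annotation.gtf", "my_gene.gtf", "ABC1")
```

Note that if you also cut the gene sequence out of the FASTA, the coordinates in the subset GTF would need to be shifted to match the new, shorter sequence.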


If you are just interested in gene expression, you could speed up your analysis by using a pseudo-aligner like salmon, which is much faster than "real" aligner programs.

Or, if you really need nucleotide-precise alignment, then I would use STAR, which is a little faster and has a higher fidelity.

Edit: I moved it into the comments, but I addressed the issue of running time, since the overall question was how to speed up the alignment process.


Could you rewrite this answer to address OP's question about GTF files?
