Question: How to create custom gtf annotation file?
11 months ago by
John210
United States
John210 wrote:

Hi

I am using RSEM (with bowtie2) for alignment and then gene counting. I am using the RefSeq annotation (gff3) and the genomic.fna reference FASTA file from NCBI. RSEM can convert the gff3 file to a GTF file.

How can I subset the GTF file (or gff3 file) by gene name? I want to extract the annotation (GTF) for a particular gene and extract that gene's sequence from the reference FASTA file. Then I want to perform the alignment against it.

The main goal is to reduce run time by not aligning against the whole genome.

Thanks in anticipation.

rna-seq R genome • 787 views
modified 11 months ago by caggtaagtat1.1k • written 11 months ago by John210

This could force some reads that would normally have aligned elsewhere in the genome to align to your gene instead.

written 11 months ago by caggtaagtat1.1k

That's what happened. There are more reads than I expected.

written 11 months ago by John210

You should not do that! Aligning only to your genes will bias the analysis, because your RNA-seq experiment reflects the entire transcriptome, not just your gene.

written 11 months ago by kristoffer.vittingseerup3.3k

Yes, just switch to a pseudo-aligner if you want to increase the speed. That is sufficient for gene expression.

written 11 months ago by caggtaagtat1.1k

Can't you just grep for the gene name of interest and redirect the output to a file? Every line relevant to that gene should carry its ID, so this would collect all lines with the given gene ID into a single file.
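
One caveat with a plain grep is that it also matches genes whose names merely contain the query string (searching for TP53 would also return TP53BP1 lines). A minimal Python sketch of an exact-match version of the same idea follows. It assumes the standard 9-column, tab-separated GTF layout with attributes written as `key "value";` pairs in the last column; the attribute key `gene_id` is an assumption about the converted file (you may need `gene_name` instead), and the function names are made up for illustration.

```python
import re


def filter_gtf_lines(lines, gene, key="gene_id"):
    """Yield GTF lines whose attribute `key` equals `gene` exactly."""
    # Match e.g. gene_id "TP53" at the start of the attribute column
    # or directly after a ';' separator.
    pattern = re.compile(r'(?:^|;)\s*' + re.escape(key) + r'\s+"([^"]*)"')
    for line in lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GTF record
        match = pattern.search(fields[8])
        if match and match.group(1) == gene:
            yield line


def write_gene_gtf(in_path, out_path, gene, key="gene_id"):
    """Copy the matching records from in_path to out_path; return how many."""
    count = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in filter_gtf_lines(src, gene, key):
            dst.write(line)
            count += 1
    return count
```

For example, `write_gene_gtf("annotation.gtf", "my_gene.gtf", "TP53")` would write only the TP53 records to a new file (both file names and the gene are placeholders). Note that this only subsets the annotation; it does nothing about the read-misassignment concern raised above.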

written 11 months ago by seidel7.1k

If you are only interested in gene expression, you could speed up your analysis by using a pseudo-aligner such as salmon, which is much faster than "real" alignment programs.

Or, if you really need nucleotide-precise alignment, then I would use STAR, which is a little faster and has higher fidelity.

Edit: I moved this into the comments. I addressed the issue of running time because the overall question was how to speed up the alignment process.

modified 11 months ago • written 11 months ago by caggtaagtat1.1k

Could you rewrite this answer to address the OP's question about GTF files?

written 11 months ago by russhh5.4k