Question: Extract sequences from a FASTA file using a gtf file
9 months ago, sdbaney (0) wrote:

Hi, I have a de novo assembled FASTA file that I used with Cuffdiff. I now have a sorted gtf file in which I retained only the transcripts that were significantly differentially expressed.

Is there a program that I can use to:

  1. pull only the transcripts listed in the gtf file from the FASTA file
  2. find the longest ORFs

so that I can run some manual BLAST searches to validate my semi-automatic BLAST.

Is there an easy way to do this? Perhaps I just need to run a search function in the FASTA file using the transcript IDs listed in my sorted gtf file.
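
The ID-lookup idea in the question can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes the FASTA header names (the first word after `>`) are the same strings as the `transcript_id` values in column 9 of the gtf file, which may not hold for every assembly. The function names and the toy data are hypothetical.

```python
import re

def transcript_ids_from_gtf(gtf_lines):
    """Collect transcript_id values from the attribute column (column 9) of GTF lines."""
    ids = set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match:
            ids.add(match.group(1))
    return ids

def filter_fasta(fasta_lines, wanted_ids):
    """Yield (id, sequence) for FASTA records whose first header word is in wanted_ids."""
    header, chunks = None, []
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header in wanted_ids:
                yield header, "".join(chunks)
            words = line[1:].split()
            header = words[0] if words else ""
            chunks = []
        elif header is not None:
            chunks.append(line.strip())
    if header in wanted_ids:  # flush the last record
        yield header, "".join(chunks)

# Toy in-memory data (hypothetical); real use would pass open file handles instead.
gtf = ['contig1\tCufflinks\texon\t1\t9\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n']
fasta = [">T1 len=9\n", "ATGAAATAA\n", ">T2\n", "CCCC\n"]
print(list(filter_fasta(fasta, transcript_ids_from_gtf(gtf))))
# prints [('T1', 'ATGAAATAA')]
```

If the FASTA names correspond to column 1 of the gtf file instead (the sequence name), the same filter works once the ID set is built from that column.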

rna-seq • 631 views

9 months ago, WouterDeCoster (40k, Belgium) wrote:

The first part looks like a job for bedtools getfasta.
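
For readers who have not used it, a call to `bedtools getfasta` might look like the sketch below, wrapped in Python so it can be scripted. The flags `-fi` (input FASTA), `-bed` (BED/GFF/GTF intervals), `-fo` (output file) and `-s` (respect strand) are standard getfasta options; the file names and the helper function are hypothetical. Note that getfasta extracts one sequence per interval line, using the name in column 1 and the start/end coordinates, so a gtf file with several exon lines per transcript would yield several output records per transcript.

```python
import shlex

def getfasta_command(fasta_path, gtf_path, out_path, stranded=False):
    """Build the argument list for a bedtools getfasta call (does not run it)."""
    cmd = ["bedtools", "getfasta",
           "-fi", fasta_path,   # input FASTA (the assembly)
           "-bed", gtf_path,    # intervals; getfasta accepts BED, GFF or GTF here
           "-fo", out_path]     # output FASTA
    if stranded:
        cmd.append("-s")        # reverse-complement intervals on the minus strand
    return cmd

# Hypothetical file names, for illustration only.
cmd = getfasta_command("assembly.fa", "significant.gtf", "significant.fa")
print(shlex.join(cmd))
# prints bedtools getfasta -fi assembly.fa -bed significant.gtf -fo significant.fa

# To actually run it (requires bedtools on the PATH):
# import subprocess; subprocess.run(cmd, check=True)
```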


Duh duh duh duh duh duh doo doo doo da! BEDTools!

written 9 months ago by swbarnes2 (6.2k)

Most bioinformaticians to be replaced by BEDTools

written 9 months ago by WouterDeCoster (40k)

Love the comments xD

written 9 months ago by Carlo Yague (4.6k)

Dun dun DUNNNNNNNNN!

written 9 months ago by swbarnes2 (6.2k)

9 months ago, Carlo Yague (4.6k, Belgium) wrote:

  1. Pull only the transcripts listed in the gtf file from the FASTA file

Use bedtools getfasta.

  2. Find the longest ORFs

See this related thread for some options.
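
As one possible illustration of the second step (not necessarily what the linked thread recommends), a longest-ORF search can be sketched in plain Python. The assumptions here are loud ones: an ORF is taken to run from an ATG to the first in-frame stop codon (TAA, TAG, TGA), both strands are scanned in all three frames, and ORFs without a stop codon are ignored. Dedicated tools handle partial ORFs and alternative genetic codes, which this sketch does not.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def longest_orf_one_strand(seq):
    """Longest ATG-to-stop ORF (stop codon included) in the three forward frames."""
    seq = seq.upper()
    best = ""
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first start codon opens an ORF
            elif start is not None and codon in STOPS:
                orf = seq[start:i + 3]         # close the ORF at the stop codon
                if len(orf) > len(best):
                    best = orf
                start = None
    return best

def longest_orf(seq):
    """Longest complete ORF on either strand; empty string if none is found."""
    fwd = longest_orf_one_strand(seq)
    rev = longest_orf_one_strand(reverse_complement(seq))
    return fwd if len(fwd) >= len(rev) else rev

print(longest_orf("CCATGAAATGATT"))
# prints ATGAAATGA
```

The returned nucleotide sequence can be pasted into a BLAST search directly, or translated first if a protein search is preferred.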
