Question: Extract sequences from a FASTA file using a gtf file
sdbaney0 wrote:

Hi, I have a de novo assembled FASTA file that I used with Cuffdiff. I now have a sorted GTF file in which I retained only the transcripts that were significantly differentially expressed.

Is there a program that I can use to:

  1. pull only the transcripts listed in the gtf file from the FASTA file
  2. find the longest ORFs

so that I can run some manual BLAST searches to validate my semi-automatic BLAST.

Is there an easy way to do this? Perhaps I just need to run a search function in the FASTA file using the transcript IDs listed in my sorted gtf file.

rna-seq • 986 views
modified 16 months ago by Carlo Yague4.9k • written 16 months ago by sdbaney0
WouterDeCoster43k wrote:

The first part looks like a job for bedtools getfasta.
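
Where bedtools is not at hand, the ID lookup described in the question can also be sketched in plain Python. This is only a minimal sketch under one loud assumption: that the first word of each FASTA header is identical to the `transcript_id` attribute in the gtf file. The function names `gtf_transcript_ids` and `filter_fasta` are made up for illustration and are not part of any library.

```python
import re


def gtf_transcript_ids(gtf_lines):
    """Collect transcript_id values from the attribute column (9th) of GTF lines."""
    ids = set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GTF feature line
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match:
            ids.add(match.group(1))
    return ids


def _record_id(header):
    """Return the first whitespace-delimited word of a FASTA header ('' if empty)."""
    words = header.split()
    return words[0] if words else ""


def filter_fasta(fasta_lines, wanted_ids):
    """Yield (header, sequence) for records whose ID is in wanted_ids.

    ASSUMPTION: the record ID (first word of the header) equals the GTF
    transcript_id. Adjust the matching if your headers differ.
    """
    header, chunks = None, []
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None and _record_id(header) in wanted_ids:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    # emit the final record, which has no following '>' line to trigger it
    if header is not None and _record_id(header) in wanted_ids:
        yield header, "".join(chunks)
```

Both functions accept any iterable of lines, so open file handles can be passed directly, for example `filter_fasta(open("assembly.fa"), gtf_transcript_ids(open("significant.gtf")))` (both file names are placeholders). If the contig name sits in the first gtf column rather than in `transcript_id`, the matching would need to use that column instead.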

written 16 months ago by WouterDeCoster43k

Duh duh duh duh duh duh doo doo doo da! BEDTools!

written 16 months ago by swbarnes27.5k

Most bioinformaticians to be replaced by BEDTools

written 16 months ago by WouterDeCoster43k

Love the comments xD

written 16 months ago by Carlo Yague4.9k

Dun dun DUNNNNNNNNN!

written 16 months ago by swbarnes27.5k
Carlo Yague4.9k wrote:
  1. Pull only the transcripts listed in the gtf file from the FASTA file

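Use bedtools getfasta: pass the assembly FASTA with -fi and the gtf file with -bed. The sequence names in the first column of the gtf must match the FASTA headers.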

  2. Find the longest ORFs

See this related thread for some options.
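
For a quick look before picking a dedicated tool, a longest-ORF search can be sketched in a few lines of Python. This is a generic sketch, not the method from the linked thread. It assumes an ORF runs from an ATG to the first in-frame stop codon (TAA, TAG or TGA), scans all six reading frames, and ignores ORFs that run off the end of the sequence without a stop. The function names are illustrative only.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_orf_one_strand(seq):
    """Longest ATG...stop ORF over the three forward frames ('' if none)."""
    seq = seq.upper()
    best = ""
    for frame in range(3):
        start = None  # index of the first ATG of the ORF currently open
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                orf = seq[start:i + 3]  # include the stop codon
                if len(orf) > len(best):
                    best = orf
                start = None  # close this ORF and look for the next ATG
    return best


def longest_orf(seq):
    """Longest ORF across all six frames; ties go to the forward strand."""
    fwd = longest_orf_one_strand(seq)
    rev = longest_orf_one_strand(reverse_complement(seq))
    return fwd if len(fwd) >= len(rev) else rev
```

De novo transcripts are often incomplete at either end, so a production tool would usually also report partial ORFs. This sketch deliberately does not.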

written 16 months ago by Carlo Yague4.9k