Question

Transcriptome Analysis with only a fasta file

9.4 years ago

coreyhowe99 ▴ 30

I'm very inexperienced with bioinformatics, and I have a transcriptome that I am hoping to analyze. There are three things I would like to do: find the expression level of each gene, BLAST specific genes against the transcriptome, and BLAST every gene in the transcriptome online to see whether it shares homology with any known genes. There are currently no reference transcriptomes or genomes for the organism, and the only file I have is a single FASTA file. I have downloaded the local BLAST executables for searching specific sequences against the transcriptome, but I'm having trouble with the expression analysis and with BLASTing every gene in the transcriptome. I have been looking into analysis software online and found the Galaxy site, but it looks like there is not much I can do there with only a FASTA file. Most of the programs seem to require a FASTQ, SAM/BAM, or GFF/GTF file, so I am not sure whether or how I can do any analysis with only a FASTA file.

Any ideas about what software I can use and what analyses I can do with this FASTA file? Any advice for this process?

rna-seq • 4.7k views

updated 2.1 years ago by Ram 43k • written 9.4 years ago by coreyhowe99 ▴ 30

Ram · Answer 1 · 2014-12-20

Can you give an example sequence from your file? Most likely the FASTA file contains assembled transcripts made from the raw data. There is certainly a lot you can do with 'only' a FASTA file, e.g. annotating the sequences using Blast2GO, which is easy enough for a beginner to use. But you should definitely get hold of the raw data; then you can also map the reads back to the transcripts and do some quantification, e.g. using Galaxy. My general advice for someone new to a field is to follow the established standard by following the methods of published papers. You might as well replicate exactly what others have done before. Here is an example: the transcriptome of the zooplankton Calanus finmarchicus. That way you can also determine what your data and effort are worth in terms of publication.

Summary:

get raw data
annotate transcripts using BLAST
try to replicate methods of a similar paper on your data
adapt methods (only) if necessary
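
For the BLAST step, and for searching specific genes against the transcriptome with the local BLAST+ executables, the commands could look roughly like the sketch below. It only builds the `makeblastdb` and `blastn` command lines and runs them if the tools are on the PATH. The file names (`transcriptome.fasta`, `genes_of_interest.fasta`, `hits.tsv`), the database name, and the E-value cutoff of 1e-5 are assumptions for illustration, not values from this thread.

```python
import shlex
import shutil
import subprocess


def makeblastdb_cmd(fasta, db_name):
    """Command that indexes the transcript FASTA as a nucleotide BLAST database."""
    return ["makeblastdb", "-in", fasta, "-dbtype", "nucl", "-out", db_name]


def blastn_cmd(query, db_name, out_tsv, evalue="1e-5"):
    """Command that searches nucleotide queries against the database.

    "-outfmt 6" requests tab-separated output, one line per hit.
    The E-value cutoff is an assumed starting point, not a recommendation.
    """
    return ["blastn", "-query", query, "-db", db_name,
            "-evalue", evalue, "-outfmt", "6", "-out", out_tsv]


def run_if_installed(cmd):
    """Run the command if its executable is on the PATH; otherwise print it."""
    if shutil.which(cmd[0]) is None:
        print("not installed, would run:", shlex.join(cmd))
        return False
    subprocess.run(cmd, check=True)
    return True
```

With the hypothetical file names above, one would call `run_if_installed(makeblastdb_cmd("transcriptome.fasta", "transcriptome_db"))` once, then `run_if_installed(blastn_cmd("genes_of_interest.fasta", "transcriptome_db", "hits.tsv"))` for each query file. If the genes of interest are protein sequences, `tblastn` accepts the same options and would be the tool to swap in.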

Ram · Answer 2 · 2014-12-20

9.4 years ago

Antonio R. Franco ★ 5.1k

I would say that you haven't provided enough information.

How much data do you have?

How did you get your data?

When you say you don't have FASTQ, do you mean you don't have quality scores for your reads?

Do you have data coming from one condition only?

updated 2.1 years ago by Ram 43k • written 9.4 years ago by Antonio R. Franco ★ 5.1k

I have a file of ~41,000 sequences (35 MB) that was sent to me by a lab in Europe. An example sequence from the file is in my reply above. And yes, the data is from one condition.
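
Figures like the sequence count above can be checked directly from the file, along with basic length statistics that show whether the sequences look like assembled transcripts. The following is a minimal sketch in plain Python; the function names are made up for illustration, and N50 is computed here as the length of the sequence at which the running total, taken from longest to shortest, first reaches half of all bases.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize(lengths):
    """Return sequence count, total bases, longest sequence and N50."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running, n50 = 0, 0
    for length in ordered:
        running += length
        if running * 2 >= total:
            n50 = length
            break
    return {
        "sequences": len(ordered),
        "total_bases": total,
        "longest": ordered[0] if ordered else 0,
        "n50": n50,
    }


def summarize_fasta(path):
    """Summarize the FASTA file at the given path."""
    with open(path) as handle:
        return summarize([len(seq) for _, seq in read_fasta(handle)])
```

Calling `summarize_fasta("transcriptome.fasta")` (a hypothetical file name) returns a small dictionary with the four values, which is the kind of information the questions above are asking for.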

updated 2.1 years ago by Ram 43k • written 9.4 years ago by coreyhowe99 ▴ 30