Question: Aligning reads to genome together with transcriptome
4.6 years ago by
EVR570
Earth
EVR570 wrote:

Hi,

I am new to genome alignment. I have mRNA reads, a de novo transcriptome, and a genome. I would like to map the reads to the genome together with the transcriptome, and later find the location of certain transcripts in the genome. For example, I would like to know the location of transcript A in the genome and the number of reads mapped to transcript A at that location.

How can this be done? Kindly guide me.

modified 4.6 years ago by iraun3.8k • written 4.6 years ago by EVR570

Which species are you studying? If you have the genome, you might also have the associated gene annotation. In that case you could align the reads against the genome and count the reads for each transcript using, for example, featureCounts.

written 4.6 years ago by Nicolas Rosewick9.0k
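
The counting step suggested above boils down to asking, for each aligned read, which annotated feature it overlaps. Here is a minimal sketch of that idea in plain Python. It is not how featureCounts is implemented, and the feature names, coordinates, and reads are all made up for illustration. A real tool also has to deal with strand, spliced reads, and reads that overlap several features.

```python
# Hypothetical annotation: (feature_id, scaffold, start, end), using
# 1-based inclusive coordinates as in GFF3. All values are invented.
features = [
    ("transcript_A", "scaffold2353", 100, 500),
    ("transcript_B", "scaffold2353", 800, 1200),
]

# Hypothetical aligned reads: (scaffold, start, end) of each alignment.
reads = [
    ("scaffold2353", 120, 170),
    ("scaffold2353", 480, 530),  # overlaps the end of transcript_A
    ("scaffold2353", 600, 650),  # falls between the two features
    ("scaffold2353", 900, 950),
    ("scaffold3667", 900, 950),  # same coordinates, different scaffold
]


def count_reads_per_feature(features, reads):
    """Count reads that overlap each feature by at least one base."""
    counts = {feature_id: 0 for feature_id, _, _, _ in features}
    for read_scaffold, read_start, read_end in reads:
        for feature_id, scaffold, start, end in features:
            same_scaffold = read_scaffold == scaffold
            # Two closed intervals overlap when each starts before the other ends.
            overlaps = read_start <= end and read_end >= start
            if same_scaffold and overlaps:
                counts[feature_id] += 1
    return counts


counts = count_reads_per_feature(features, reads)
for feature_id, n in counts.items():
    print(feature_id, n)
```

The nested loop is fine for a toy example. With millions of reads you would rely on an existing counting tool or an interval index instead.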
4.6 years ago by
michael.ante3.6k
Austria/Vienna
michael.ante3.6k wrote:

Hi Tom,

To start, I'd like to refer you to the TopHat2/Cufflinks protocol paper. Most aligners, such as TopHat2, STAR, BBMap, and HISAT, take your mRNA-Seq reads and align them to the genome, taking exon-exon junctions into account.

Afterwards, you need software like Cufflinks, StringTie, or Mix2 to estimate the transcript abundances.

Cheers,

Michael

modified 2.0 years ago by RamRS30k • written 4.6 years ago by michael.ante3.6k
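
To make "estimate the transcript abundances" a bit more concrete: one commonly reported unit is TPM (transcripts per million), which divides each transcript's read count by its length and then rescales so that all values sum to one million. The sketch below shows only that normalization. The transcript names, counts, and lengths are invented, and the tools named above do considerably more, such as assigning reads that are compatible with several isoforms.

```python
def counts_to_tpm(counts, lengths):
    """Convert per-transcript read counts to TPM, given lengths in bases."""
    # Reads per base: corrects for longer transcripts attracting more reads.
    rates = {name: counts[name] / lengths[name] for name in counts}
    total = sum(rates.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    # Rescale so that the values across all transcripts sum to one million.
    return {name: rate / total * 1_000_000 for name, rate in rates.items()}


# Hypothetical example values.
counts = {"transcript_A": 200, "transcript_B": 100, "transcript_C": 0}
lengths = {"transcript_A": 2000, "transcript_B": 500, "transcript_C": 1000}

tpm = counts_to_tpm(counts, lengths)
for name, value in tpm.items():
    print(name, round(value, 1))
```

Note that transcript_B ends up with the higher TPM despite having fewer reads, because it is much shorter than transcript_A.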
4.6 years ago by
iraun3.8k
Norway
iraun3.8k wrote:

As @NicoBxI has pointed out, the most straightforward way is to check whether an annotation file (GTF, GFF3, ...) is available and published for your genome. If so, the file will contain the coordinates of the transcripts, and you can map the reads against the genome and quantify the number of reads associated with each transcript using featureCounts with the annotation file. If the genome has not been annotated yet, I would try to build an annotation file from the transcripts in the transcriptome. Using PASA, for example, you can get a GTF/GFF3 file by giving the genome and the transcriptome as input.

modified 2.0 years ago by RamRS30k • written 4.6 years ago by iraun3.8k
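
Once an annotation file exists, looking up where a transcript sits in the genome is a matter of reading the nine tab-separated GFF3 columns. Here is a minimal sketch, assuming a feature is identified by the `ID` key in the attributes column. The GFF3 content and the IDs (`gene1`, `mrna1`, ...) are invented for illustration. For a real file you would pass `open(path)` instead of the in-memory text.

```python
import io

# Hypothetical GFF3 content, invented for illustration.
GFF3_TEXT = """##gff-version 3
scaffold2353\texample\tgene\t100\t900\t.\t+\t.\tID=gene1
scaffold2353\texample\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1
scaffold2353\texample\texon\t100\t400\t.\t+\t.\tID=exon1;Parent=mrna1
scaffold2353\texample\texon\t600\t900\t.\t+\t.\tID=exon2;Parent=mrna1
"""


def parse_attributes(field):
    """Turn 'ID=x;Parent=y' into {'ID': 'x', 'Parent': 'y'}."""
    attributes = {}
    for pair in field.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attributes[key] = value
    return attributes


def find_feature(handle, feature_id):
    """Return (scaffold, type, start, end, strand) for the given ID, or None."""
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments, and blank lines
        columns = line.rstrip("\n").split("\t")
        if len(columns) != 9:
            continue  # not a well-formed feature line
        scaffold, _source, feature_type, start, end, _score, strand, _phase, attrs = columns
        if parse_attributes(attrs).get("ID") == feature_id:
            return scaffold, feature_type, int(start), int(end), strand
    return None


location = find_feature(io.StringIO(GFF3_TEXT), "mrna1")
print(location)
```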

Hi,

I am working on a non-model organism that has a genome and an associated gff3 file. But the de novo assembled transcriptome, obtained from Trinity, has different transcript names. For example, the genome gff3 file has scaffold2353, scaffold3667, etc., and my transcriptome has headers like mm_tr_v3_1789, mm_tr_v3_198, etc.

Also, the word "transcript" does not appear anywhere in the gff3 file for my genome. In this situation, how can I map this transcriptome and find the location of a particular transcript (say mm_tr_v3_1789) in the genome?

Please guide me. Thanks in advance.

modified 2.0 years ago by RamRS30k • written 4.6 years ago by EVR570
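
On the missing "transcript" keyword: many GFF3 files label transcripts as `mRNA` (or another type) in the third column, so a quick first step is to list which feature types the file actually contains. The sketch below does that for a made-up GFF3 snippet; the content is purely hypothetical. Names such as scaffold2353 look like sequence IDs from the first column rather than transcript names, so they would not be expected to match the Trinity headers.

```python
import io
from collections import Counter

# Hypothetical GFF3 content, invented for illustration.
GFF3_TEXT = """##gff-version 3
scaffold2353\texample\tgene\t100\t900\t.\t+\t.\tID=gene1
scaffold2353\texample\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1
scaffold2353\texample\texon\t100\t400\t.\t+\t.\tID=exon1;Parent=mrna1
scaffold2353\texample\texon\t600\t900\t.\t+\t.\tID=exon2;Parent=mrna1
"""


def count_feature_types(handle):
    """Tally the values found in column 3 (feature type) of a GFF3 file."""
    type_counts = Counter()
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments, and blank lines
        columns = line.rstrip("\n").split("\t")
        if len(columns) == 9:
            type_counts[columns[2]] += 1
    return type_counts


type_counts = count_feature_types(io.StringIO(GFF3_TEXT))
for feature_type, n in type_counts.most_common():
    print(feature_type, n)
```

Knowing the available types tells you what to count against. featureCounts, for instance, lets you choose the feature type with `-t` and the grouping attribute with `-g`, so a file without "transcript" records can still be usable.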

Hi EVR,

Which approach did you finally use for your analysis? It would be really good if you could share it.

Thanks

written 3.2 years ago by bioinfo1730