PASA(Program to Assemble Spliced Alignment) annotation questions
1
2
Entering edit mode
9.1 years ago

Hello Everyone,

We have recently generated two de-novo transcriptomics assembly for two different but related species. These new transcripts seem quite good on the basis of quality measurements, completeness and alignment with the previously sequenced genome and annotation. We were able to pick up novel genes and previously unannotated transcripts. In order to pick the alternate spliced transcripts we are currently trying running PASA (Program to Assemble Spliced Alignment) annotation.

After PASA annotation, when I made a comparison between trinity transcripts, already existing annotation and PASA annotation in IGB. I find out a case where trinity transcripts fully supported by previously curated annotation as well as the RNASeq data, but no PASA annotation. PASA annotation only shows one fragment, rest of the parts not even present in the valid and failed .gff3 file generated by PASA [Figure]. This leads to few questions - for which I don't have any answer. Here are my questions:

  1. Why there is this difference in the PASA annotation- when there is already manual curation and RNAseq depth evidences are present? And which one is correct?
  2. PASA annotation using blat and gmap for the alignment of transcripts to the genome. And we have also used blat to align the transcript to the genome. Then why there is different in two blat alignment?

Data used for Transcriptomics assembly = 100 bp Paired end reads, non-strand specific

Attached figure description:-

  • Dark Blue colour = New assembled transcript annotation [this is the annotation generated by aligning assembled transcripts with the genome using blat].
  • Orange = existing curated annotation.
  • Red = PASA annotation
  • Blue colours = shows valid alignment annotation for blat and gmap respectively.
  • Read Depth = Green for control and Dark red = Knock-out

Figure: < image not found >

I am very much looking forward for the reply. Any suggestions/view would be very helpful.

Many Thanks
Reema,

PS = I also posting the same post in the seq-answer.

Assembly IGB alignment RNA-Seq PASA • 6.8k views
ADD COMMENT
2
Entering edit mode
9.1 years ago
Nowlan Freese ▴ 860

Hi Reema,

Our senior software developer was able to look over your .gff3 file. The issue has to do with how IGB currently displays .gff3 data. It's something we should be able to fix with a small patch.

The .bed file does display correctly, so if you have a choice in the output, go with the .bed file for now.

Thank you for helping us identify this issue!

Nowlan

ADD COMMENT
3
Entering edit mode

Ok, The problem solved. It seems like the problem associated with .gff3 file. PASA annotation generates three annotation file .gff3,.gtf and .bed. Instead of .gff3 file if I upload .gtf file in IGB then it shows the full transcript annotation. However, there is still some transcripts bits missed in .gtf. But .bed format works great on both .gtf and .gff3.

ADD REPLY
0
Entering edit mode

Hi Reema,

I troubleshoot issues for IGB, and I would like to know more about what's happening with your files.

In order for me to get to the bottom of this, can you please provide the .gff3 and the .bed file?

Thanks!

Nowlan

ADD REPLY
0
Entering edit mode

Hello Nowlan

Could you please send me the ID where I can send you these files?

Best,
Reema,

ADD REPLY
0
Entering edit mode

Hello Reema,

My University email is nfreese@uncc.edu, and I also have a gmail account: nowlan.freese@gmail.com. Either should work, if the files are large they might need to be sent to my gmail account via Google Drive.

Thanks!

Nowlan

ADD REPLY

Login before adding your answer.

Traffic: 3327 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6