Question: RNA-Seq - Alignment GFF3/CDS against cluster (realised with CD-HIT)
vperlo wrote, 15 months ago:

Hi, I would like to align the GFF3/CDS annotation of a reference genome against long reads (HQ isoforms from PacBio SMRT Analysis) and visualise the result. I have already used CD-HIT to cluster the isoforms (HQ Isoforms.fa -> HQ Isoforms.clstr). Could you give me some advice on how to do this (possibly software, and a command line with an example)? Thank you.

Head of each file:

• GFF3: head of the gff3 (/90days/uqvperlo/Spontaneum_Genome/Sspon.v20171023.gff3)

Chr1A maker gene 39107 43192 . + . ID=Sspon.001A0000000;Name=Sspon.001A0000000
Chr1A maker mRNA 39107 43192 . + . ID=Sspon.001A0000000-mRNA-1;Parent=Sspon.001A0000000;Name=Sspon.001A0000000-mRNA-1;_AED=0.05;_eAED=0.11;_QI=0|0|0|1|0.77|0.7|10|0|569
Chr1A maker exon 42527 42760 . + . ID=Sspon.001A0000000-mRNA-1:8;Parent=Sspon.001A0000000-mRNA-1
Chr1A maker exon 42788 42920 . + . ID=Sspon.001A0000000-mRNA-1:9;Parent=Sspon.001A0000000-mRNA-1
Chr1A maker exon 43027 43192 . + . ID=Sspon.001A0000000-mRNA-1:10;Parent=Sspon.001A0000000-mRNA-1
Chr1A maker CDS 39107 39412 . + 0 ID=Sspon.001A0000000-mRNA-1:cds;Parent=Sspon.001A0000000-mRNA-1
Chr1A maker CDS 41046 41217 . + 0 ID=Sspon.001A0000000-mRNA-1:cds;Parent=Sspon.001A0000000-mRNA-1
Chr1A maker CDS 41317 41452 . + 2 ID=Sspon.001A0000000-mRNA-1:cds;Parent=Sspon.001A0000000-mRNA-1
Chr1A maker CDS 41551 41764 . + 1 ID=Sspon.001A0000000-mRNA-1:cds;Parent=Sspon.001A0000000-mRNA-1
Chr1A maker CDS 41925 42068 . + 0 ID=Sspon.001A0000000-mRNA-1:cds;Parent=Sspon.001A0000000-mRNA-1
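For anyone who wants to pull the CDS coordinates out of a GFF3 like the one above before comparing them with the isoforms, here is a minimal Python sketch. The function name `cds_intervals` is hypothetical. It assumes the standard nine-column GFF3 layout with a `Parent=` attribute on each CDS row, and it accepts tab- or space-separated columns because the excerpt as pasted has lost its tabs.

```python
from collections import defaultdict


def cds_intervals(gff3_lines):
    """Group CDS (start, end) intervals by the ID of their parent mRNA."""
    by_parent = defaultdict(list)
    for line in gff3_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and GFF3 comments/directives
        # Real GFF3 is tab-delimited; fall back to whitespace for pasted text.
        fields = line.split("\t") if "\t" in line else line.split(None, 8)
        if len(fields) < 9 or fields[2] != "CDS":
            continue
        # Column 9 holds semicolon-separated key=value attributes.
        attrs = dict(kv.split("=", 1) for kv in fields[8].split(";") if "=" in kv)
        by_parent[attrs.get("Parent", "")].append((int(fields[3]), int(fields[4])))
    return {parent: sorted(ivs) for parent, ivs in by_parent.items()}


if __name__ == "__main__":
    # Two CDS rows and one exon row taken from the excerpt in the post.
    sample = [
        "Chr1A maker exon 42527 42760 . + . ID=Sspon.001A0000000-mRNA-1:8;Parent=Sspon.001A0000000-mRNA-1",
        "Chr1A maker CDS 39107 39412 . + 0 ID=Sspon.001A0000000-mRNA-1:cds;Parent=Sspon.001A0000000-mRNA-1",
        "Chr1A maker CDS 41046 41217 . + 0 ID=Sspon.001A0000000-mRNA-1:cds;Parent=Sspon.001A0000000-mRNA-1",
    ]
    for parent, intervals in cds_intervals(sample).items():
        print(parent, intervals)
```

On the real file, the function could be fed an open file handle instead of a list, since it only iterates over lines.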

• CDS: head of the CDS file (Sspon.v20171023.cds.fasta)

Sspon.001A0000000 ATGGGCCTCTCGGGCGCGACGATCTCCGCGCCGCTCGGCTGCCGTGGGCTGCCGCGCGGGGCGGTCGGCGGAGGTAAGGCGCGGAAGGCGGAGGCGGAGAGGTGGCGGCGGGCGGGGTGGAGCCGACGCCTGGGCGGACCGATGGTGAGGTGCGTGGCGACCGAGAAGCACGACGAGGCGGCGGCGGCGGTCGGCGTGGAGTTCGCGGACGAGGAGGACTACCGCAAGGGGGGCGGCGGCGAGCTGCTTTACGTGCAAATGCAGGCCACCAAGCCCATGGAAAGCCAGTCCAAGATCGCTTCCAAGCTGTTGCCCATATCTGATGAAAATACGGTACTTGATTTGGTTATCATTGGTTGTGGTCCTGCTGGTCTTTCTCTAGCTTCAGAGTCAGCTAAGAAAGGTCTCACCGTCGGTCTCATTGGGCCTGACCTTCCATTCACAAATAACTATGGTGTGTGGGAGGATGAGTTTAAAGATCTTGGTCTAGAGAGTTGTATCGAGCACGTCTGGAAGGATACTATTGTCTACCTAGACAATAACAAGCCCATACTGATTGGCCGTTCTTATGGCAGGGTGCACCGTGACTTGCTCCATGAGGAGCTGCTGAGAAGATGCTATGAAGCTGGCGTGACATACCTGAACTCCAAAGTGGACAAGATCATAGAATCTCCAGATGGACACAGAGTAGTCTGTTGTGATAAGGGTCGTGAGATAATTTGCAGGCTTGCCATTGTTGCTTCGGGGGCAGCATCTGGTAGGCTTCTAGAGTATGAGGTTGGGGGTCCCCGTGTTTGCGTGCAGACTGCATACGGAGTAGAAGTTGAGGTGGAAAATAGTCCATATGATCCCAGCTTAATGGTTTTCATGGACTACAGAGATTGTTTCAAAGAGGAATTTTCACACACTGAACAAGAAAATCCCACTTTCCTGTATGCTATGCCCATGTCATCCACACGAGTTTTCTTTGAGGAAACATGCTTAGCTTCTAAAGATGCTATGTCATTGGATCTACTTAAGAAGAGGCTGATGTATCGGTTAAATATGATGGGAGTTCGTATCCTGAAAGTTTACGAGGAGGAATGGTCCTACATTCCTGTTGGAGGGTCCTTACCAAACACAGATCAGAAAAATCTTGCATTTGGTGCTGCAGCAAGTATGGTGCATCCTGCAACAGGGTACTCAGTGGTCAGATCTTTGTCTGAGGCTCCAAGATATGCTTCTGTGATATCCGAAATCTTAAGAAACCGAGTTCCTGCAGAATATTTGCTTGGAAATTCTCAAAATTACAGTCCATCAATGCTTGGTAAGCATTCTTGTGGTATTTATTCAACTTGGTTGTACTATTACACAACTCCAGGACAAGAACAGTTTTACTGTGTAAAGTCCATACAAGTTAGCAAAACTTTATCATGGAGAACACTATGGCCTCAAGAAAGGAAACGCCAGCGATCCTTCTTCCTTTTCGGATTAGCGTTGATAATCCAACTGAATAATGAAGGCATACAAACATTCTTTGAAGCCTTTTTCAGGGTGCCGAAATGGATGTGGCGGGGATTTCTTGGCTCAACCCTTTCATCCGTCGATCTCATACTATTTTCATTCTACATGTTTGCCATAGCTCCAAATCAATTGCGAATGAACCTCGTCAGACATCTCCTCTCTGACCCGACTGGATCAACCATGATCAAGACCTACCTGACCTTATAA Sspon.001A0000010 
ATGCAAGGAGGCCCACTGAGCCCGGATGAGTACCGGGTGGCGTCACCGCCGGCGCTGCTGCACCAGCCGGCGTCCCTCATCGTCGTGGCCATCGACCGGGACCGGAACAGCCAACTGGCCGTTAAGTGGGTCATGGACCACCTCCTCTCCGGCGCCTCTCAGATCGTCCTCCTCCACGTGGCCGCCCATTACCCCACCAACCATGGGTTCGCCATGGCCGAGATGACGCAGGGCGCGCTGGAGGCTGAAATGAAGGAGATCTTTGTCCCCTACAGAGGATTCTTCAACCGGAATGGGGTAAATGTGGAAGTCTCCGAGGTAGTGTTGGAAGAGGCAGACGTGTCCAAGGCCATTCTAGGTTACATCACTGCAAACAAGATCCAGAGCATTGCGCTCGGCGGAGCCAGCAGAAATGCATTCACCAAGAAATTCAAGAACGCGGACGTGCCATCGACCCTGATGAAGTGCGCGCCG

• Cluster of HQ Isoform (OUTPUT_CLUSTER_HQ) or HQ not clustered (all_quivered_hq.100_30_0.99.fasta (/all_hq.seq_VPO2.fa)) HEAD all_hq.seq_VPO2.fa • aaaaaaaaaatgaagataataaactgcggattctttctttctcttccattcttacgtttccatattaaAGTGTAGTTTTTTTACTTAAATTTAATAATATTAATCTAATATGCCCATTGGTGTTCCAAAAGTACCTTACCGGATTCCCGGAGATGAAGAAGCGACTTGGGTTGACTTATACAATGTTATGTATCGAGAAAGGACACTTTTTTTAGGTCAAGAGATTCGTTGCGAGATCACGAATCATATTACAGGTCTGATGGTATATCTCAGTATAGAAGATGGAAATAGCGATATTTTTTTGTTTATAAACTCCCCTGGCGGGTGGCTAATCTCAGGAATGGCGATTTTTGATACGATGCAAACGGTGACACCTGATATATATACAATATGCCTCGGAATAGCCGCGTCCATGGCCTCCTTCATTCTGCTTGGAGGAGAACCCACCAAGCGTATAGCATTCCCTCACGCTAGGATTATGCTTCACCAACCTGCTAGTGCTTATTATCGGGCAAGGACACCAGAATTTTTACTAGAAGTGGAAGAGTTACACAAAGTTCGCGAAATGATCACAAGGGTTTATGCACTAAGAACAGGCAAGCCTTTTTGGGTTGTATCCGAAGACATGGAAAGGGATGTTTTTATGTCAGCAGACGAAGCCAAAGCTTATGGACTTGTCGATATTGTAGGGGATGAAATGATTGACGAGCACTGCGATACTGATCCAGTGTGGTTTCCAGAAATGTTTAAGGATTGGTAGTGGGTGGATTCCTTGTAAATTTATTTCAAACCTAAATTCGGGTTTTATGCTATAGAAATAGAAATACTTCATGATGATGAATCATCAGGTTAAGATCGATCTAAACCAATCCATTTTATATATACATACGACATGCCAACGGTTAAACAACTTATTAGAAATGCAAGACAGCCAATACGAAATGCTAGAAAATCGGCCGCGCTTAAGGGATGTCCTCAGCGTCGAGGAACATGTGCTAGGGTGTATACTATCAACCCCAAAAAACCCAACTCTGCCTTACGTAAAGTTGCCAGAGTACGATTAACCTCTGGATTTGAAATCACTGCTTATATACCTGGTATTGGCCATAATTTACAAGAACATTCTGTAGTATTAGTAAGAGGAGGAAGGGTTAAGGATTTACCCGGTGTGAGATATCGCATTATTCGAGGAACCCTAGATGCTGTCGCAGTAAAGAATCGTCAACAAGGGCGTTCTAGTGCGTTGTAGATTCTTATCCAAGACTTGTATCATTTGATGATGCCATGTGAATCGCTAGAAACATGTGAAGTGTATGGCTAACCCAATAACGAAAGTTTCGTAAGGGGACTGGAGCAGGCTACCATGAGACAAAAGATCTTCTTTCTAAAGAGATTCGATTCGGAACTCTTATATGTCCAAGGTCAATATGGAAATTCTTTCAGAGGTTTTCCCTTACTTTGTCCGTGTCAACAAACAATTCGAAATACCTCGACTTTTTCAGAACAGGTCCGAGTCAAATAGCAATGATTCGAAGCACTTCTTTTTCCATTACACTATTTCGGAAACCTAAGGACTCGATCGTATGGATATGGAAAATACAGGATTTCCGGTCCTAGCGGGAAAAGGAGGGAAACGGATACTCAATTTAAAGTGAGTAAACAGAATTCCATACTCGATCTCATAGATCCCTATAGAATTCTGTGGAAAGCCGTATTCGATGAAAGTCGTATGTACGGCTTGGAGGGAGATCTTTCctatctttcgagatccaccctacaatatggggccaaaaag • 
aaaaaaaggggggggacaggaggcaagatcaaatatctatggggcacccttacttcactttTCTTTTTTTGCATTTCTCAGTAAAAAGAGAGCAGTGCAGTATAGAATTTTTTTTCACTACTTCCGGTGGATAAGGAAAGGCATACATATCATACGTGGATTAGTATAATTTAGGATACTACTCTATTTTTATGATTATACCACCCTCATCTTAACTTCCTATTCGACTCTTATGAAATCGAGTACCGGTATTTTACCGGAATCCATACATAAACCGTATTCGTTTATTGTTAGAGAATGAATTTCATCTAATTAGTTACTTTTTGGGGGTTAAACCGCACCCCTTCCATTTTCTAGTTCCAACTTTGTTTGTGTATGTTCAATGAATAATTGGAAAGAATCCACCTCATTCTCACTCCATTTGCCGACTCCTTTTTTTATGGATTTGCTCTCATCTTTAGACCCCAAAAAGCGTATATTGATAGTAACCTAATAAAATAAAAACCAAGCAAACGCTCTTTTTCCTAAATTAAAATATGGGATCCTTTTTATTTAAGATCCTCTTTTCTCCATCTCATTTCTACTTCTACTGCTTATTGGGATGTCTATGTGACCCATAGAAAGTTGGTCATATAATACATACATAATTGTATGTATAACTATAAGAAAAAGGAAGGGAAATTGGATAAGAAAAgaaagattttgggttatacatatag HEAD OUTPUT_CLUSTER_HQ

c10/f3p32/3350 isoform=c10;full_length_coverage=3;non_full_length_coverage=32;isoform_length=3350 aactcgcagcagtagagcAGCACGAGCAACACGCCGCGCCGCTCCAACCATCTCAGCTTCGCGCTTCCCGCGCCCCGCCGCCGCGCCCGCCATGGCGTCCGAGCGGCACCACTCCATCGACGCGCAGCTCCGTGCCCTGGCCCCAGGCAAGGTCTCCGAGGAGCTCATCCAGTACGACGCCCTGCTCGCGACCGTTTCCTCGACATCCTCCAGGACCTCCATGGCCCTAGCCTTCGCGAATTTGTCCAGGAGTGCTACGAGGTGTCGGCCGATTACGAGGGCAAGAAGGACACGTCGAAGCTGGGCGAGCTGGGCACCAAGCTCACGGGGCTGGCGCCCGCCGACGCCATCCTGGTGGCGAGCTCCATCCTGCACATGCTCAACCTGGCCAACCTGGCCGAGGAAGTGGAGCTGGCGCACCGCCGCCGGAACAGCAAGCTCAAGCACGGGGACTTCTCCGACGAGGGCTCCGCCACCACCGAGTCGGACATCGAGGAGACGCTCAAGCGCCTCGTGTCGCTGGGCAAGACCCCCGAGGAGGTGTTCGAGGCGCTCAAGAACCAGAGCGTCGACCTCGTCTTCACCGCGCACCCCACGCAGTCCGCCAGGAGGTCGCTCCTGCAGAAAAACGCCAGGATCCGGAATTGTCTGACGCAGCTGAGTACCAAGGACGTCACGGTCGAGGACAAGAAGGAGCTCGACGAGGCTCTGCAGAGAGAGATCCAAGCAGCTTTCAGAACTGATGAGATCCGGAGAGCACAACCCACTCCACAGGATGAAATGCGCTATGGGATGAGCTACATCCATGAAACTGTATGGAAGGGTGTGCCTAAGTTTTTGCGCCGTGTGGATACAGCCCTGAAGAATATCGGCATCAATCAGCGCCTTCCCTACAATGTTCCTCTCATTAAGTTCTGTTCTTGGATGGGTGGTGACCGTGATGGAAATCCAAGAGTTACTCCGGAGGTGACAAGAGATGTATGCTTGCTGTCCAGAATGATGGCTGCAAACTTGTACATCGATCAGGTCGAAGACCTGATGTTTGAGCTCTCTATGTGGCGCTGCAATGATGAACTTCGTGCTCGAGCCGAAGAAGTCCAGAGTACTCCAGCTTCAAAGAAAGTTACCAAGTATTACATAGAATTCTGGAAGCAAATTCCTCCAAACGAGCCCTACCGGGTGATACTTGGTGCTGTAAGGGACAAGTTATACAACACACGCGAGCGTGCACGCCATCTGCTGGCAACTGGATTTTCTGAAATTTCTGTGGACTCGGTATTTACCAATATCGAAGAGTTCCTTGAGCCCCTTGAGCTATGCTACAAATCCCTGTGTGACTGCGGCGACAAGGCCATCGCGGACGGGAGCCTCCTGGACCTCCTGCGCCAGGTGTTCACGTTCGGGCTCTCCCTGGTGAAGTTGGACATCCGTCAGGAGTCGGAGCGGCACACCGACGTGATCGACGCCATCACCACGTACCTTGGCATCGGGTCGTACCGCTCGTGGCCCGAGGACAAGCGGATGGAGTGGCTGGTGTCGGAGCTGAAAGGCAAGCGGCCGCTGCTGCCCCCGGACCTTCCCATGACCGAGGAGATCGCCGACGTCATCGGGGCGATGCACGTCCTCGCGGAGCTCCCGTCCGACAGCTTCGGCCCCTACATCATCTCCATGTGCACAGCCCCCTCCGACGTGCTCGCCGTGGAGCTCCTGCAGCGCGAGTGTGGCATTCGCCAGACGCTGCCCGTGGTGCCGCTGTTCGAGAGGCTGGCCGACCTGCAGGCGGCGCCCGCGTCCGTGGAGCGGCTCTTCTCCACTGACTGGTACTTCGACCACATCAAGGGCAAGCAGCAGGTGATGGTCGGGTACTCCGACTCCGGCAAGGACGCCGGCCGCCTGTCCGCG
GCGTGGCAGCTGTACGTGGCGCAGGAGGAGATGGCCAAGGTGGCCAAGAAATACGGCGTGAAGCTGACCTTGTTCCACGGGCGCGGCGGCACCGTGGGCAGGGGTGGCGGGCCGACGCACCTGGCCATCCTGTCCCAGCCGCCGGACACCATCAACGGGTCAATCCGCGTGACGGTGCAGGGCGAGGTCATCGAGTTCATGTTCGGGGAGGATCACCTGTGCTTCCAGTCTCTGCAGCGCTTCACGGCCGCCACGCTGGAGCACGGCATGCACCCGCCGGTGTCTCCCAAGCCCGAGTGGCGCAAGCTCATGGAGGAGATGGCAGTCGTGGCCACGGAGGAGTACCGCTCCGTCGTCGTCAAGGAGCCGAGATTCGTCGAGTACTTCAGATCGGCTACCCCTGAGACTGAGTACGGGAAGATGAACATCGGCAGCCGGCCAGCCAAGAGGAAGCCGGGCGGCGGCATCACCACCCTGCGCGCCATCCCCTGGATCTTCTCGTGGACCCAGACGAGGTTCCACCTCCCCGTGTGGCTGGGAGTCGGCGCCGCCTTCAAGTGGGCCATCGACAAGGACATCAAGAACTTCCAGAAGCTCAAAGAGATGTACAACGAGTGGCCATTCTTCAGGGTCACCCTGGACCTGCTGGAGATGGTTTTCGCCAAGGGAGATCCTGGCATTGCCGGCTTGTATGACTTGCTGCTTGTCGCCGACGATCTCAAGCCCTTTGGGAAGCAGCTCAGGGACAAATACGTGGAGACAGAGAAGCTTCTCCTACAGATCGCTGGGCACAAGGATATTCTTGAAGGCGATCCTTACCTGAAGCAGGGGCTGCGGCTACGCAATCCCTACATCACCACCCTGAACGTGTTGCAGGCCTACACGCTGAAGCGGATAAGGGATCCGTGCTTCAAGGTGACGCCGCAGCCGCCGCTGTCCAAGGAGTTCGCCGACGAGAACAAGCCCGCCGGACTGGTGAAGCTGAACCCGGCGAGCGAGTACCCGCCCGGGCTGGAAGACACGCTCATCCTCACCATGAAAGGTATCGCCGCCGGCATGCAGAACACCGGCTAGGCCGCTTCCCTTCACTCACCTGCAGAGTACTGCACGGCAATAATAATCAGCTTCCGGATGGTGTCGTTTTGTCAGTTTTGGATGGAAATGCTGAAAACTGACACCTTCTGTTTTCACTATGTTTATGTTTATGTAATTTCCTCGGCTTTGGCCTCTTTATATTTTCACTCTTGTTGTGAAGTCCAAGTGGAAAAATCTTGGCATCTTAAACATATTGTAATAATGAACATCATACAATCTACAAATTTACTATTTTGTATTAATCTATCTGGCAGGGAAAATGTCACTTTATATCCCAGCCCATTGGATGGACTTTTTTACCATGATgctagttcaaccatcctcttttgattgtgctaaacaatttctgaaat c46/f4p4/1615 isoform=c46;full_length_coverage=4;non_full_length_coverage=4;isoform_length=1615 
tgcgctgcggccgggtcggatctgagacgagacgagcccccctcccctcaaccggaacttgttACCATCCCATCCCACTCCCCACCGGATCTCGTCGGACTCGGATCCGCCCGACCACCCCGCGCCGCCGCCGCAGCAGATCAGAGAAGATGGCCGCAGTTGACACCTTCCTCTTCACCTCGGAGTCTGTGAACGAGGGACACCCTGACAAGCTCTGCGACCAGGTCTCAGATGCCGTTCTTGATGCTTGCCTTGCTGAGGACCCTGACAGCAAGGTTGCTTGCGAGACCTGCACCAAGACCAACATGGTCATGGTCTTTGGTGAGATCACCACCAAGGCCAATGTCGACTACGAGAAGATCGTCAGGGAGACCTGCCGCAACATTGGTTTCGTGTCAGCCGATGTTGGGCTTGACGCTGACCACTGCAAGGTGCTTGTGAACATTGAGCAGCAGTCTCCTGATATTGCTCAGGGTGTGCATGGCCACTTCACCAAGCGCCCCGAGGAGATTGGAGCTGGTGACCAGGGACACATGTTTGGGTATGCGACTGATGAGACCCCTGAGCTGATGCCTCTCAGCCATGTCCTTGCCACCAAGCTTGGTGCTCGCCTCACTGAGGTCCGCAAGAATGGAACCTGCCCCTGGCTCAGGCCTGATGGGAAGACCCAGGTGACAGTTGAGTACCGCAATGAGGGTGGTGCCATGGTCCCCATTCGGGTCCACACTGTCCTCATCTCTACCCAGCACGACGAGACAGTCACCAATGATGAGATTGCTGCTGACCTGAAGGAGCATGTCATCAAGCCTGTCATCCCTGAGCAGTACCTTGACGAGAAGACCATCTTCCACCTTAACCCATCTGGTCGCTTTGTCATTGGTGGACCTCACGGTGATGCTGGTCTTACTGGCCGGAAGATCATCATTGACACCTATGGTGGCTGGGGAGCCCATGGTGGTGGTGCTTTCTCTGGCAAGGACCCAACCAAGGTCGACCGCAGTGGAGCCTACGTCGCAAGGCAGGCTGCCAAGAGCATTGTCGCCAACGGCCTTGCTCGCCGCGCCATCGTCCAGGTCTCCTACGCCATCGGTGTGCCCGAGCCTCTCTCCGTGTTTGTCGACACGTACGGCACTGGCACGATCCCCGACAAGGAGATCCTCAAGATTGTGAAGGAGAACTTTGACTTCAGGCCTGGCATGATCATCATCAACCTTGACCTCAAGAAAGGCGGCAATGGGCGCTACCTCAAGACGGCGGCCTACGGACACTTTGGAAGGGACGACCCTGACTTCACCTGGGAGGTGGTGAAGCCCCTCAAGGCGGAGAAGCCTTCTGCCTAAGGCGCCCTTTTTCAGAAGAAGCTTTTGGTCTGCTGCGCTTATCATGTTTTATTATGGCTTCTATATGTTGTGATTCTTGATCTGCCCTTGCTTATCATTTGTATTTGTATCGTCCTAATAAGTGGTACTTTGTGAGGGTCTTACTGTGTCTGCTTAATTACCTAGAGGATTATTTCTGGTTTTGCTGCTTATGTAATGCTTAAAACAATGAAAGAAGCTACAGGCTACAGCTACTTTGAGaagtaatgggacttcgtgcattttggttatatatt

• CD-HIT clusters: head of output_all.clstr

>Cluster 0
0 8239nt, >c28005/f1p3/8239... *
1 2760nt, >Sspon.003C0007900... at +/96.34%
>Cluster 1
0 7687nt, >c20466/f1p1/7687... *
>Cluster 2
0 7576nt, >c24192/f1p3/7576... *
1 1938nt, >Sspon.001B0026690... at +/99.38%
>Cluster 3
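Since the cluster file above already mixes PacBio isoforms with reference CDS entries, one possible first step is to list the clusters that contain both. The sketch below is a minimal Python illustration, not a full pipeline. The names `parse_clstr` and `mixed_clusters` are hypothetical, and it assumes the usual CD-HIT `.clstr` layout (a `>Cluster N` header followed by member lines) and that reference CDS IDs start with `Sspon.`, as in the excerpt.

```python
import re

# Member line, e.g. "1  2760nt, >Sspon.003C0007900... at +/96.34%"
MEMBER_RE = re.compile(r"^\d+\s+(\d+)nt, >(.+?)\.\.\. (.*)$")


def parse_clstr(lines):
    """Return {cluster_id: [(name, length, is_representative), ...]}."""
    clusters = {}
    current = None
    for line in lines:
        line = line.strip()
        if line.startswith(">Cluster"):
            current = int(line.split()[1])
            clusters[current] = []
        elif line and current is not None:
            match = MEMBER_RE.match(line)
            if match:
                length, name, status = match.groups()
                # CD-HIT marks the representative sequence with "*".
                clusters[current].append((name, int(length), status == "*"))
    return clusters


def mixed_clusters(clusters, ref_prefix="Sspon."):
    """Keep clusters with at least one isoform and one reference CDS.

    ref_prefix is an assumption based on the IDs shown in the post.
    """
    mixed = {}
    for cluster_id, members in clusters.items():
        names = [name for name, _, _ in members]
        refs = [n for n in names if n.startswith(ref_prefix)]
        isoforms = [n for n in names if not n.startswith(ref_prefix)]
        if refs and isoforms:
            mixed[cluster_id] = (isoforms, refs)
    return mixed


if __name__ == "__main__":
    sample = [
        ">Cluster 0",
        "0\t8239nt, >c28005/f1p3/8239... *",
        "1\t2760nt, >Sspon.003C0007900... at +/96.34%",
        ">Cluster 1",
        "0\t7687nt, >c20466/f1p1/7687... *",
    ]
    for cluster_id, (isoforms, refs) in mixed_clusters(parse_clstr(sample)).items():
        print(cluster_id, isoforms, refs)
```

The resulting isoform/CDS pairs could then be passed to whatever aligner and viewer are chosen for the actual alignment and visualisation.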

Tags: rna-seq

Can you assist by tidying your post somewhat? - thanks in advance.

Comment written 15 months ago by Kevin Blighe