Hi, I would like to align long reads (HQ isoforms from PacBio SMRT Analysis) to a reference genome and visualise them together with the genome's GFF3/CDS annotation. I have already clustered the isoforms with CD-HIT (HQ Isoforms.fa -> HQ Isoforms.clstr). Could you give me some advice on how to do this (suitable software and, if possible, example command lines)? Thank you.
Head CDS:
• GFF3 Head of the gff3 (/90days/uqvperlo/Spontaneum_Genome/Sspon.v20171023.gff3)
Chr1A maker gene 39107 43192 . + . ID=Sspon.001A0000000;Name=Sspon.001A0000000
Chr1A maker mRNA 39107 43192 . + . ID=Sspon.001A0000000-mRNA-1;Parent=Sspon.001A0000000;Name=Sspon.001A0000000-mRNA-1;_AED=0.05;_eAED=0.11;_QI=0|0|0|1|0.77|0.7|10|0|569
Chr1A maker exon 42527 42760 . + . ID=Sspon.001A0000000-mRNA-1:8;Parent=Sspon.001A0000000-mRNA-1
Chr1A maker exon 42788 42920 . + . ID=Sspon.001A0000000-mRNA-1:9;Parent=Sspon.001A0000000-mRNA-1
Chr1A maker exon 43027 43192 . + . ID=Sspon.001A0000000-mRNA-1:10;Parent=Sspon.001A0000000-mRNA-1
Chr1A maker CDS 39107 39412 . + 0 ID=Sspon.001A0000000-mRNA-1:cds;Parent=Sspon.001A0000000-mRNA-1
Chr1A maker CDS 41046 41217 . + 0 ID=Sspon.001A0000000-mRNA-1:cds;Parent=Sspon.001A0000000-mRNA-1
Chr1A maker CDS 41317 41452 . + 2 ID=Sspon.001A0000000-mRNA-1:cds;Parent=Sspon.001A0000000-mRNA-1
Chr1A maker CDS 41551 41764 . + 1 ID=Sspon.001A0000000-mRNA-1:cds;Parent=Sspon.001A0000000-mRNA-1
Chr1A maker CDS 41925 42068 . + 0 ID=Sspon.001A0000000-mRNA-1:cds;Parent=Sspon.001A0000000-mRNA-1
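To make the structure of these GFF3 records concrete, here is a minimal Python sketch that groups the CDS segments by their `Parent` mRNA and sums their lengths. It assumes the real file is tab-delimited with nine columns, as the GFF3 format requires. The records in this post are shown with spaces, so the sample below is re-joined with tabs. The function names `parse_attributes` and `cds_length_by_parent` are made up for this illustration and do not come from any library.

```python
from collections import defaultdict

# Three of the CDS records from the head above (whitespace-separated as pasted).
SAMPLE = """\
Chr1A maker mRNA 39107 43192 . + . ID=Sspon.001A0000000-mRNA-1;Parent=Sspon.001A0000000
Chr1A maker CDS 39107 39412 . + 0 ID=Sspon.001A0000000-mRNA-1:cds;Parent=Sspon.001A0000000-mRNA-1
Chr1A maker CDS 41046 41217 . + 0 ID=Sspon.001A0000000-mRNA-1:cds;Parent=Sspon.001A0000000-mRNA-1
Chr1A maker CDS 41317 41452 . + 2 ID=Sspon.001A0000000-mRNA-1:cds;Parent=Sspon.001A0000000-mRNA-1
"""


def parse_attributes(field):
    """Turn 'ID=x;Parent=y' into {'ID': 'x', 'Parent': 'y'}."""
    attrs = {}
    for item in field.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key] = value
    return attrs


def cds_length_by_parent(lines):
    """Sum CDS segment lengths per Parent ID (GFF3 coordinates are 1-based, inclusive)."""
    totals = defaultdict(int)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments/directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue  # keep only well-formed CDS records
        start, end = int(cols[3]), int(cols[4])
        parent = parse_attributes(cols[8]).get("Parent", "")
        totals[parent] += end - start + 1
    return dict(totals)


# Re-create tab-delimited lines from the pasted sample (assumption: no spaces inside fields).
tab_lines = ["\t".join(rec.split()) for rec in SAMPLE.splitlines()]
print(cds_length_by_parent(tab_lines))
```

For a real file, the same function can be fed an open file handle, e.g. `cds_length_by_parent(open("Sspon.v20171023.gff3"))`.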
• CDS Head of CDS (Sspon.v20171023.cds.fasta)
Sspon.001A0000000 ATGGGCCTCTCGGGCGCGACGATCTCCGCGCCGCTCGGCTGCCGTGGGCTGCCGCGCGGGGCGGTCGGCGGAGGTAAGGCGCGGAAGGCGGAGGCGGAGAGGTGGCGGCGGGCGGGGTGGAGCCGACGCCTGGGCGGACCGATGGTGAGGTGCGTGGCGACCGAGAAGCACGACGAGGCGGCGGCGGCGGTCGGCGTGGAGTTCGCGGACGAGGAGGACTACCGCAAGGGGGGCGGCGGCGAGCTGCTTTACGTGCAAATGCAGGCCACCAAGCCCATGGAAAGCCAGTCCAAGATCGCTTCCAAGCTGTTGCCCATATCTGATGAAAATACGGTACTTGATTTGGTTATCATTGGTTGTGGTCCTGCTGGTCTTTCTCTAGCTTCAGAGTCAGCTAAGAAAGGTCTCACCGTCGGTCTCATTGGGCCTGACCTTCCATTCACAAATAACTATGGTGTGTGGGAGGATGAGTTTAAAGATCTTGGTCTAGAGAGTTGTATCGAGCACGTCTGGAAGGATACTATTGTCTACCTAGACAATAACAAGCCCATACTGATTGGCCGTTCTTATGGCAGGGTGCACCGTGACTTGCTCCATGAGGAGCTGCTGAGAAGATGCTATGAAGCTGGCGTGACATACCTGAACTCCAAAGTGGACAAGATCATAGAATCTCCAGATGGACACAGAGTAGTCTGTTGTGATAAGGGTCGTGAGATAATTTGCAGGCTTGCCATTGTTGCTTCGGGGGCAGCATCTGGTAGGCTTCTAGAGTATGAGGTTGGGGGTCCCCGTGTTTGCGTGCAGACTGCATACGGAGTAGAAGTTGAGGTGGAAAATAGTCCATATGATCCCAGCTTAATGGTTTTCATGGACTACAGAGATTGTTTCAAAGAGGAATTTTCACACACTGAACAAGAAAATCCCACTTTCCTGTATGCTATGCCCATGTCATCCACACGAGTTTTCTTTGAGGAAACATGCTTAGCTTCTAAAGATGCTATGTCATTGGATCTACTTAAGAAGAGGCTGATGTATCGGTTAAATATGATGGGAGTTCGTATCCTGAAAGTTTACGAGGAGGAATGGTCCTACATTCCTGTTGGAGGGTCCTTACCAAACACAGATCAGAAAAATCTTGCATTTGGTGCTGCAGCAAGTATGGTGCATCCTGCAACAGGGTACTCAGTGGTCAGATCTTTGTCTGAGGCTCCAAGATATGCTTCTGTGATATCCGAAATCTTAAGAAACCGAGTTCCTGCAGAATATTTGCTTGGAAATTCTCAAAATTACAGTCCATCAATGCTTGGTAAGCATTCTTGTGGTATTTATTCAACTTGGTTGTACTATTACACAACTCCAGGACAAGAACAGTTTTACTGTGTAAAGTCCATACAAGTTAGCAAAACTTTATCATGGAGAACACTATGGCCTCAAGAAAGGAAACGCCAGCGATCCTTCTTCCTTTTCGGATTAGCGTTGATAATCCAACTGAATAATGAAGGCATACAAACATTCTTTGAAGCCTTTTTCAGGGTGCCGAAATGGATGTGGCGGGGATTTCTTGGCTCAACCCTTTCATCCGTCGATCTCATACTATTTTCATTCTACATGTTTGCCATAGCTCCAAATCAATTGCGAATGAACCTCGTCAGACATCTCCTCTCTGACCCGACTGGATCAACCATGATCAAGACCTACCTGACCTTATAA Sspon.001A0000010 
ATGCAAGGAGGCCCACTGAGCCCGGATGAGTACCGGGTGGCGTCACCGCCGGCGCTGCTGCACCAGCCGGCGTCCCTCATCGTCGTGGCCATCGACCGGGACCGGAACAGCCAACTGGCCGTTAAGTGGGTCATGGACCACCTCCTCTCCGGCGCCTCTCAGATCGTCCTCCTCCACGTGGCCGCCCATTACCCCACCAACCATGGGTTCGCCATGGCCGAGATGACGCAGGGCGCGCTGGAGGCTGAAATGAAGGAGATCTTTGTCCCCTACAGAGGATTCTTCAACCGGAATGGGGTAAATGTGGAAGTCTCCGAGGTAGTGTTGGAAGAGGCAGACGTGTCCAAGGCCATTCTAGGTTACATCACTGCAAACAAGATCCAGAGCATTGCGCTCGGCGGAGCCAGCAGAAATGCATTCACCAAGAAATTCAAGAACGCGGACGTGCCATCGACCCTGATGAAGTGCGCGCCG
• Cluster of HQ Isoform (OUTPUT_CLUSTER_HQ) or HQ not clustered (all_quivered_hq.100_30_0.99.fasta (/all_hq.seq_VPO2.fa)) HEAD all_hq.seq_VPO2.fa • aaaaaaaaaatgaagataataaactgcggattctttctttctcttccattcttacgtttccatattaaAGTGTAGTTTTTTTACTTAAATTTAATAATATTAATCTAATATGCCCATTGGTGTTCCAAAAGTACCTTACCGGATTCCCGGAGATGAAGAAGCGACTTGGGTTGACTTATACAATGTTATGTATCGAGAAAGGACACTTTTTTTAGGTCAAGAGATTCGTTGCGAGATCACGAATCATATTACAGGTCTGATGGTATATCTCAGTATAGAAGATGGAAATAGCGATATTTTTTTGTTTATAAACTCCCCTGGCGGGTGGCTAATCTCAGGAATGGCGATTTTTGATACGATGCAAACGGTGACACCTGATATATATACAATATGCCTCGGAATAGCCGCGTCCATGGCCTCCTTCATTCTGCTTGGAGGAGAACCCACCAAGCGTATAGCATTCCCTCACGCTAGGATTATGCTTCACCAACCTGCTAGTGCTTATTATCGGGCAAGGACACCAGAATTTTTACTAGAAGTGGAAGAGTTACACAAAGTTCGCGAAATGATCACAAGGGTTTATGCACTAAGAACAGGCAAGCCTTTTTGGGTTGTATCCGAAGACATGGAAAGGGATGTTTTTATGTCAGCAGACGAAGCCAAAGCTTATGGACTTGTCGATATTGTAGGGGATGAAATGATTGACGAGCACTGCGATACTGATCCAGTGTGGTTTCCAGAAATGTTTAAGGATTGGTAGTGGGTGGATTCCTTGTAAATTTATTTCAAACCTAAATTCGGGTTTTATGCTATAGAAATAGAAATACTTCATGATGATGAATCATCAGGTTAAGATCGATCTAAACCAATCCATTTTATATATACATACGACATGCCAACGGTTAAACAACTTATTAGAAATGCAAGACAGCCAATACGAAATGCTAGAAAATCGGCCGCGCTTAAGGGATGTCCTCAGCGTCGAGGAACATGTGCTAGGGTGTATACTATCAACCCCAAAAAACCCAACTCTGCCTTACGTAAAGTTGCCAGAGTACGATTAACCTCTGGATTTGAAATCACTGCTTATATACCTGGTATTGGCCATAATTTACAAGAACATTCTGTAGTATTAGTAAGAGGAGGAAGGGTTAAGGATTTACCCGGTGTGAGATATCGCATTATTCGAGGAACCCTAGATGCTGTCGCAGTAAAGAATCGTCAACAAGGGCGTTCTAGTGCGTTGTAGATTCTTATCCAAGACTTGTATCATTTGATGATGCCATGTGAATCGCTAGAAACATGTGAAGTGTATGGCTAACCCAATAACGAAAGTTTCGTAAGGGGACTGGAGCAGGCTACCATGAGACAAAAGATCTTCTTTCTAAAGAGATTCGATTCGGAACTCTTATATGTCCAAGGTCAATATGGAAATTCTTTCAGAGGTTTTCCCTTACTTTGTCCGTGTCAACAAACAATTCGAAATACCTCGACTTTTTCAGAACAGGTCCGAGTCAAATAGCAATGATTCGAAGCACTTCTTTTTCCATTACACTATTTCGGAAACCTAAGGACTCGATCGTATGGATATGGAAAATACAGGATTTCCGGTCCTAGCGGGAAAAGGAGGGAAACGGATACTCAATTTAAAGTGAGTAAACAGAATTCCATACTCGATCTCATAGATCCCTATAGAATTCTGTGGAAAGCCGTATTCGATGAAAGTCGTATGTACGGCTTGGAGGGAGATCTTTCctatctttcgagatccaccctacaatatggggccaaaaag • 
aaaaaaaggggggggacaggaggcaagatcaaatatctatggggcacccttacttcactttTCTTTTTTTGCATTTCTCAGTAAAAAGAGAGCAGTGCAGTATAGAATTTTTTTTCACTACTTCCGGTGGATAAGGAAAGGCATACATATCATACGTGGATTAGTATAATTTAGGATACTACTCTATTTTTATGATTATACCACCCTCATCTTAACTTCCTATTCGACTCTTATGAAATCGAGTACCGGTATTTTACCGGAATCCATACATAAACCGTATTCGTTTATTGTTAGAGAATGAATTTCATCTAATTAGTTACTTTTTGGGGGTTAAACCGCACCCCTTCCATTTTCTAGTTCCAACTTTGTTTGTGTATGTTCAATGAATAATTGGAAAGAATCCACCTCATTCTCACTCCATTTGCCGACTCCTTTTTTTATGGATTTGCTCTCATCTTTAGACCCCAAAAAGCGTATATTGATAGTAACCTAATAAAATAAAAACCAAGCAAACGCTCTTTTTCCTAAATTAAAATATGGGATCCTTTTTATTTAAGATCCTCTTTTCTCCATCTCATTTCTACTTCTACTGCTTATTGGGATGTCTATGTGACCCATAGAAAGTTGGTCATATAATACATACATAATTGTATGTATAACTATAAGAAAAAGGAAGGGAAATTGGATAAGAAAAgaaagattttgggttatacatatag HEAD OUTPUT_CLUSTER_HQ
c10/f3p32/3350 isoform=c10;full_length_coverage=3;non_full_length_coverage=32;isoform_length=3350 aactcgcagcagtagagcAGCACGAGCAACACGCCGCGCCGCTCCAACCATCTCAGCTTCGCGCTTCCCGCGCCCCGCCGCCGCGCCCGCCATGGCGTCCGAGCGGCACCACTCCATCGACGCGCAGCTCCGTGCCCTGGCCCCAGGCAAGGTCTCCGAGGAGCTCATCCAGTACGACGCCCTGCTCGCGACCGTTTCCTCGACATCCTCCAGGACCTCCATGGCCCTAGCCTTCGCGAATTTGTCCAGGAGTGCTACGAGGTGTCGGCCGATTACGAGGGCAAGAAGGACACGTCGAAGCTGGGCGAGCTGGGCACCAAGCTCACGGGGCTGGCGCCCGCCGACGCCATCCTGGTGGCGAGCTCCATCCTGCACATGCTCAACCTGGCCAACCTGGCCGAGGAAGTGGAGCTGGCGCACCGCCGCCGGAACAGCAAGCTCAAGCACGGGGACTTCTCCGACGAGGGCTCCGCCACCACCGAGTCGGACATCGAGGAGACGCTCAAGCGCCTCGTGTCGCTGGGCAAGACCCCCGAGGAGGTGTTCGAGGCGCTCAAGAACCAGAGCGTCGACCTCGTCTTCACCGCGCACCCCACGCAGTCCGCCAGGAGGTCGCTCCTGCAGAAAAACGCCAGGATCCGGAATTGTCTGACGCAGCTGAGTACCAAGGACGTCACGGTCGAGGACAAGAAGGAGCTCGACGAGGCTCTGCAGAGAGAGATCCAAGCAGCTTTCAGAACTGATGAGATCCGGAGAGCACAACCCACTCCACAGGATGAAATGCGCTATGGGATGAGCTACATCCATGAAACTGTATGGAAGGGTGTGCCTAAGTTTTTGCGCCGTGTGGATACAGCCCTGAAGAATATCGGCATCAATCAGCGCCTTCCCTACAATGTTCCTCTCATTAAGTTCTGTTCTTGGATGGGTGGTGACCGTGATGGAAATCCAAGAGTTACTCCGGAGGTGACAAGAGATGTATGCTTGCTGTCCAGAATGATGGCTGCAAACTTGTACATCGATCAGGTCGAAGACCTGATGTTTGAGCTCTCTATGTGGCGCTGCAATGATGAACTTCGTGCTCGAGCCGAAGAAGTCCAGAGTACTCCAGCTTCAAAGAAAGTTACCAAGTATTACATAGAATTCTGGAAGCAAATTCCTCCAAACGAGCCCTACCGGGTGATACTTGGTGCTGTAAGGGACAAGTTATACAACACACGCGAGCGTGCACGCCATCTGCTGGCAACTGGATTTTCTGAAATTTCTGTGGACTCGGTATTTACCAATATCGAAGAGTTCCTTGAGCCCCTTGAGCTATGCTACAAATCCCTGTGTGACTGCGGCGACAAGGCCATCGCGGACGGGAGCCTCCTGGACCTCCTGCGCCAGGTGTTCACGTTCGGGCTCTCCCTGGTGAAGTTGGACATCCGTCAGGAGTCGGAGCGGCACACCGACGTGATCGACGCCATCACCACGTACCTTGGCATCGGGTCGTACCGCTCGTGGCCCGAGGACAAGCGGATGGAGTGGCTGGTGTCGGAGCTGAAAGGCAAGCGGCCGCTGCTGCCCCCGGACCTTCCCATGACCGAGGAGATCGCCGACGTCATCGGGGCGATGCACGTCCTCGCGGAGCTCCCGTCCGACAGCTTCGGCCCCTACATCATCTCCATGTGCACAGCCCCCTCCGACGTGCTCGCCGTGGAGCTCCTGCAGCGCGAGTGTGGCATTCGCCAGACGCTGCCCGTGGTGCCGCTGTTCGAGAGGCTGGCCGACCTGCAGGCGGCGCCCGCGTCCGTGGAGCGGCTCTTCTCCACTGACTGGTACTTCGACCACATCAAGGGCAAGCAGCAGGTGATGGTCGGGTACTCCGACTCCGGCAAGGACGCCGGCCGCCTGTCCGCG
GCGTGGCAGCTGTACGTGGCGCAGGAGGAGATGGCCAAGGTGGCCAAGAAATACGGCGTGAAGCTGACCTTGTTCCACGGGCGCGGCGGCACCGTGGGCAGGGGTGGCGGGCCGACGCACCTGGCCATCCTGTCCCAGCCGCCGGACACCATCAACGGGTCAATCCGCGTGACGGTGCAGGGCGAGGTCATCGAGTTCATGTTCGGGGAGGATCACCTGTGCTTCCAGTCTCTGCAGCGCTTCACGGCCGCCACGCTGGAGCACGGCATGCACCCGCCGGTGTCTCCCAAGCCCGAGTGGCGCAAGCTCATGGAGGAGATGGCAGTCGTGGCCACGGAGGAGTACCGCTCCGTCGTCGTCAAGGAGCCGAGATTCGTCGAGTACTTCAGATCGGCTACCCCTGAGACTGAGTACGGGAAGATGAACATCGGCAGCCGGCCAGCCAAGAGGAAGCCGGGCGGCGGCATCACCACCCTGCGCGCCATCCCCTGGATCTTCTCGTGGACCCAGACGAGGTTCCACCTCCCCGTGTGGCTGGGAGTCGGCGCCGCCTTCAAGTGGGCCATCGACAAGGACATCAAGAACTTCCAGAAGCTCAAAGAGATGTACAACGAGTGGCCATTCTTCAGGGTCACCCTGGACCTGCTGGAGATGGTTTTCGCCAAGGGAGATCCTGGCATTGCCGGCTTGTATGACTTGCTGCTTGTCGCCGACGATCTCAAGCCCTTTGGGAAGCAGCTCAGGGACAAATACGTGGAGACAGAGAAGCTTCTCCTACAGATCGCTGGGCACAAGGATATTCTTGAAGGCGATCCTTACCTGAAGCAGGGGCTGCGGCTACGCAATCCCTACATCACCACCCTGAACGTGTTGCAGGCCTACACGCTGAAGCGGATAAGGGATCCGTGCTTCAAGGTGACGCCGCAGCCGCCGCTGTCCAAGGAGTTCGCCGACGAGAACAAGCCCGCCGGACTGGTGAAGCTGAACCCGGCGAGCGAGTACCCGCCCGGGCTGGAAGACACGCTCATCCTCACCATGAAAGGTATCGCCGCCGGCATGCAGAACACCGGCTAGGCCGCTTCCCTTCACTCACCTGCAGAGTACTGCACGGCAATAATAATCAGCTTCCGGATGGTGTCGTTTTGTCAGTTTTGGATGGAAATGCTGAAAACTGACACCTTCTGTTTTCACTATGTTTATGTTTATGTAATTTCCTCGGCTTTGGCCTCTTTATATTTTCACTCTTGTTGTGAAGTCCAAGTGGAAAAATCTTGGCATCTTAAACATATTGTAATAATGAACATCATACAATCTACAAATTTACTATTTTGTATTAATCTATCTGGCAGGGAAAATGTCACTTTATATCCCAGCCCATTGGATGGACTTTTTTACCATGATgctagttcaaccatcctcttttgattgtgctaaacaatttctgaaat c46/f4p4/1615 isoform=c46;full_length_coverage=4;non_full_length_coverage=4;isoform_length=1615 
tgcgctgcggccgggtcggatctgagacgagacgagcccccctcccctcaaccggaacttgttACCATCCCATCCCACTCCCCACCGGATCTCGTCGGACTCGGATCCGCCCGACCACCCCGCGCCGCCGCCGCAGCAGATCAGAGAAGATGGCCGCAGTTGACACCTTCCTCTTCACCTCGGAGTCTGTGAACGAGGGACACCCTGACAAGCTCTGCGACCAGGTCTCAGATGCCGTTCTTGATGCTTGCCTTGCTGAGGACCCTGACAGCAAGGTTGCTTGCGAGACCTGCACCAAGACCAACATGGTCATGGTCTTTGGTGAGATCACCACCAAGGCCAATGTCGACTACGAGAAGATCGTCAGGGAGACCTGCCGCAACATTGGTTTCGTGTCAGCCGATGTTGGGCTTGACGCTGACCACTGCAAGGTGCTTGTGAACATTGAGCAGCAGTCTCCTGATATTGCTCAGGGTGTGCATGGCCACTTCACCAAGCGCCCCGAGGAGATTGGAGCTGGTGACCAGGGACACATGTTTGGGTATGCGACTGATGAGACCCCTGAGCTGATGCCTCTCAGCCATGTCCTTGCCACCAAGCTTGGTGCTCGCCTCACTGAGGTCCGCAAGAATGGAACCTGCCCCTGGCTCAGGCCTGATGGGAAGACCCAGGTGACAGTTGAGTACCGCAATGAGGGTGGTGCCATGGTCCCCATTCGGGTCCACACTGTCCTCATCTCTACCCAGCACGACGAGACAGTCACCAATGATGAGATTGCTGCTGACCTGAAGGAGCATGTCATCAAGCCTGTCATCCCTGAGCAGTACCTTGACGAGAAGACCATCTTCCACCTTAACCCATCTGGTCGCTTTGTCATTGGTGGACCTCACGGTGATGCTGGTCTTACTGGCCGGAAGATCATCATTGACACCTATGGTGGCTGGGGAGCCCATGGTGGTGGTGCTTTCTCTGGCAAGGACCCAACCAAGGTCGACCGCAGTGGAGCCTACGTCGCAAGGCAGGCTGCCAAGAGCATTGTCGCCAACGGCCTTGCTCGCCGCGCCATCGTCCAGGTCTCCTACGCCATCGGTGTGCCCGAGCCTCTCTCCGTGTTTGTCGACACGTACGGCACTGGCACGATCCCCGACAAGGAGATCCTCAAGATTGTGAAGGAGAACTTTGACTTCAGGCCTGGCATGATCATCATCAACCTTGACCTCAAGAAAGGCGGCAATGGGCGCTACCTCAAGACGGCGGCCTACGGACACTTTGGAAGGGACGACCCTGACTTCACCTGGGAGGTGGTGAAGCCCCTCAAGGCGGAGAAGCCTTCTGCCTAAGGCGCCCTTTTTCAGAAGAAGCTTTTGGTCTGCTGCGCTTATCATGTTTTATTATGGCTTCTATATGTTGTGATTCTTGATCTGCCCTTGCTTATCATTTGTATTTGTATCGTCCTAATAAGTGGTACTTTGTGAGGGTCTTACTGTGTCTGCTTAATTACCTAGAGGATTATTTCTGGTTTTGCTGCTTATGTAATGCTTAAAACAATGAAAGAAGCTACAGGCTACAGCTACTTTGAGaagtaatgggacttcgtgcattttggttatatatt
Head OUTPUT_ALL.CLSTR head output_all.clstr
Cluster 0
0 8239nt, >c28005/f1p3/8239... *
1 2760nt, >Sspon.003C0007900... at +/96.34%
Cluster 1
0 7687nt, >c20466/f1p1/7687... *
Cluster 2
0 7576nt, >c24192/f1p3/7576... *
1 1938nt, >Sspon.001B0026690... at +/99.38%
Cluster 3
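To show how this cluster listing can be read programmatically (for example to see which Sspon CDS entries were grouped with which isoforms), here is a minimal Python sketch. It assumes the layout shown above: a `Cluster N` line (CD-HIT normally writes it as `>Cluster N`, so both forms are accepted), followed by member lines whose last field is `*` for the representative sequence or `at ...` for the other members. The name `parse_clstr` is made up for this illustration.

```python
import re

# One member line, e.g. "1 2760nt, >Sspon.003C0007900... at +/96.34%"
MEMBER_RE = re.compile(r"^(\d+)\s+(\d+)(?:nt|aa),\s+>(.+?)\.\.\.\s+(\*|at\s+.+)$")


def parse_clstr(lines):
    """Return {cluster_number: [member dicts]} from CD-HIT .clstr lines."""
    clusters = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        header = line.lstrip(">")
        if header.startswith("Cluster "):
            current = int(header.split()[1])
            clusters[current] = []
            continue
        match = MEMBER_RE.match(line)
        if match and current is not None:
            _, length, name, tail = match.groups()
            clusters[current].append({
                "name": name,
                "length": int(length),
                "representative": tail == "*",
            })
    return clusters


# The first clusters from the head shown in this post.
SAMPLE = """\
Cluster 0
0 8239nt, >c28005/f1p3/8239... *
1 2760nt, >Sspon.003C0007900... at +/96.34%
Cluster 1
0 7687nt, >c20466/f1p1/7687... *
"""

clusters = parse_clstr(SAMPLE.splitlines())
for number, members in clusters.items():
    print(number, [m["name"] for m in members])
```

Clusters that contain both a `c.../f.../...` isoform and an `Sspon...` entry are the ones where an isoform and an annotated CDS were grouped together.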
Can you assist by tidying your post somewhat? - thanks in advance.