I have some queries before running Braker in ETP mode, where I will provide both RNA Seq and protein data. I am looking forward to your valuable suggestions.
SRR1, SRR2, SRR3, and SRR4 represent old male flower, old female flower, young male flower, and young female flower, respectively. I already have SRA Tools installed and will provide the SRA IDs of these RNA Seq datasets. I have downloaded "viridiplantae.fa" from OrthoDB and would like to combine it with proteins from select species closely related to my plant.
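To make the protein step concrete, here is a minimal sketch of how I intend to merge the files. It assumes plain-text (uncompressed) FASTA input; the function name `combine_fasta` and the file names `related_species.fa` and `proteins_combined.fa` are placeholders of mine, not anything Braker prescribes. It does not remove duplicate sequences.

```python
def combine_fasta(inputs, output):
    """Concatenate protein FASTA files into one file.

    Blank lines are dropped and every line is newline-terminated, so a
    file that lacks a trailing newline cannot fuse its last sequence
    line with the next file's header. Returns the number of records
    (header lines) written.
    """
    n_records = 0
    with open(output, "w") as out:
        for path in inputs:
            with open(path) as handle:
                for line in handle:
                    line = line.rstrip("\r\n")
                    if not line:
                        continue  # skip blank lines
                    if line.startswith(">"):
                        n_records += 1
                    out.write(line + "\n")
    return n_records


# Intended usage (file names other than viridiplantae.fa are hypothetical):
# combine_fasta(["viridiplantae.fa", "related_species.fa"], "proteins_combined.fa")
```

The combined file would then be the single protein input I give to Braker.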
Query 1: The Braker protocol states that coding sequence prediction quality improves if Braker trains UTR parameters for AUGUSTUS, requiring stranded RNA Seq alignment. Is UTR training really important? If so, how can I verify if the RNA Seq libraries use a stranded protocol?
Query 2: In GeneMark-ETP mode, Braker uses StringTie2 for assembly, which requires aligned reads with XS tags. Since I have HISAT2 installed as an optional Braker dependency, I should run it with the --dta option to include XS tags.
What is recommended here?
a. Run HISAT2 with the --dta option, generate BAM files, and provide these to Braker?
b. Run Braker with unaligned RNA Seq data (will Braker use --dta by default)?
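To show what I mean by the two options, here is a rough sketch that only builds and prints the command lines; it runs nothing. It assumes paired-end reads, and the file names (`genome_index`, `genome.fa`, `proteins_combined.fa`, the `*_1.fastq`/`*_2.fastq` files) and the helper functions are placeholders of mine. The HISAT2 output would still need to be converted to sorted BAM (for example with samtools) before being passed to Braker. Whether Braker adds --dta itself in option b is exactly what I am asking, so the sketch does not assume it.

```python
import shlex


def hisat2_cmd(index, reads_1, reads_2, out_sam, threads=8):
    """Option a: HISAT2 alignment with --dta for paired-end reads."""
    return [
        "hisat2", "--dta",
        "-p", str(threads),
        "-x", index,
        "-1", reads_1,
        "-2", reads_2,
        "-S", out_sam,
    ]


def braker_cmd(genome, proteins, bams=None, sra_ids=None):
    """Braker call with either pre-aligned BAM files or SRA IDs."""
    cmd = ["braker.pl", f"--genome={genome}", f"--prot_seq={proteins}"]
    if bams:
        cmd.append("--bam=" + ",".join(bams))
    if sra_ids:
        cmd.append("--rnaseq_sets_ids=" + ",".join(sra_ids))
    return cmd


SAMPLES = ["SRR1", "SRR2", "SRR3", "SRR4"]

# Option a: align each library myself, then hand the BAM files to Braker.
for s in SAMPLES:
    print(shlex.join(hisat2_cmd("genome_index", f"{s}_1.fastq", f"{s}_2.fastq", f"{s}.sam")))
print(shlex.join(braker_cmd("genome.fa", "proteins_combined.fa",
                            bams=[f"{s}.bam" for s in SAMPLES])))

# Option b: give Braker the SRA IDs and let it handle download and alignment.
print(shlex.join(braker_cmd("genome.fa", "proteins_combined.fa", sra_ids=SAMPLES)))
```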
========================================================================