Hello and thank you for your time and consideration,
I've been using htseq-count for gene counting. It has worked before using a different genome. However, currently, it is failing to count genes. Instead it creates gene count files full of 0 values. Is it because the fasta header and first column of the gff file are not the same (e.g., Sc0000000.g1.t1 vs. Sc0000000)? Please find examples of the fasta alignment file, gff file, and the code I ran below. Thank you again for your help and time.
Fasta alignment file:
'>Sc0000000.g1.t1
atggcggctttagctggcatacgagttcttgagtttgctggattagctccttgcccttttgcaggaatgatattggcag
attttggagcagttgtaacccgcatagatagagtgaaaacgctgtctttcatcggcgtagatccacttgcgcgaggcaaaaaatctatttgcattgat
ctaaagaatcctcgagggaaggagatcgtacgaaagctgtgcctgaagtcggatgtgttgatagagccgtttcgaccaggggtgatggaaacactcgg
cctaggacccgaaatattattaaaggataatccaagactgatttatgctcgattaacaggttatggacagacaggaacacactcccagagggcaggcc
atgacatcaattacattgcgctgagtggattattatcaaagctaggccgcaaggatgaaaaaccaacacctcccataaatctgttagcagactttgct
gggggaggtttgacctgcgtgcttggagtattgttggccttgtttgagagaagtaaatctggaaaaggccaagttgttgacgtttccatggtcgaagg
atcagcatatatcggctcatttatttataaaacactgagtggtcttgaatttgcatggccaaaccccaaggagcgtggcaccaacctattggataccg
gggcgccattttatgatacatacaaaacatcggatggtaaatttatggctgttggtgcacttgaacccaagttttataaacaatttctgaaagggttg
gatttggaagggaaacttccacaccaggcagacttatatcagtggcccaaagcaaaggaagagatagcaagcatattctcaacaaagactcaatcaga
ttggtgcaagatctttgagaatcttgatgcttgcgtggaaccagtgcttaccacagacgaagcacctcttcatcctcataaccaagagagaaatgcat
tttctttcagtcatggggcttatgaacccacccctgcaccaaaactttcaagaactcctgcagactgcctgccaaaacctctccctgggacaggagaa
cacacaatggaaattttacaacatactgggtattcacaatctgatattcaagaacttattgatgagggtgttgttgaatgcccaaacatgaagtcctc
actttga'
GFF file:
Sc0000000 AUGUSTUS transcript 4615 9517 . + . transcript_id "g1.t1"; gene_id "g1";
Sc0000000 AUGUSTUS CDS 4615 4948 . + 0 transcript_id "g1.t1"; gene_id "g1";
Sc0000000 AUGUSTUS CDS 5690 5916 1 + 2 transcript_id "g1.t1"; gene_id "g1";
Sc0000000 AUGUSTUS CDS 8068 8266 1 + 0 transcript_id "g1.t1"; gene_id "g1";
Sc0000000 AUGUSTUS CDS 9114 9517 1 + 2 transcript_id "g1.t1"; gene_id "g1";
Code I ran:
htseq-count -f auto -i gene_id --type CDS -m intersection-nonempty -n 12 6B.bam ../reference/Mcap_reference_new.gff3 '>' ../genecounts/{.}.counts.txt
Thank you!