Extracting all the genes without introns for a species
18 months ago
GR ▴ 400

Hi All,

I was wondering if there is a quick way to extract all the genes without introns for a species from the gff file?

Thanks, RT

introns • 758 views

Take a look at the AGAT toolkit. It should include a tool that does this. Documentation is available at: https://agat.readthedocs.io/en/latest/?badge=latest


What do you mean by a gene without introns? The mRNA (spliced transcript)? What do you want to achieve when there are isoforms? Extract each isoform independently, or create a chimera by merging all possible isoforms into one single feature?

18 months ago

You're looking for transcripts having count(exon)==1, i.e. transcripts whose ID appears on exactly one exon line. So it's something like:

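# Print the IDs of transcripts that have exactly one exon (GENCODE human v42).
wget -O - -q "https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_42/gencode.v42.annotation.gff3.gz" |  # stream the annotation to stdout
gunzip -c |               # decompress
awk '($3=="exon")' |      # keep exon features only
cut -f 9 |                # attributes column
tr ";" "\n" |             # one attribute per line
grep '^transcript_id=' |  # keep the transcript ID of each exon
cut -f2 -d '=' |          # strip the key, leaving the ID
sort |
uniq -u                   # IDs seen exactly once = single-exon transcripts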
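
The pipeline above works at the transcript level. If the goal is a list of genes, one possible reading of "gene without introns" is a gene for which every annotated transcript has a single exon. The sketch below follows that reading. It assumes GENCODE-style attributes (`gene_id=` and `transcript_id=` in column 9 of each exon line); the function name `intronless_genes` is hypothetical, and other GFF3 sources may need the attribute keys adjusted.

```shell
# Hypothetical helper: reads GFF3 on stdin and prints the IDs of genes
# whose transcripts all have exactly one exon.
# Assumes each exon line carries gene_id= and transcript_id= attributes.
intronless_genes() {
  awk -F '\t' '
    $3 == "exon" {
      gene = ""; tx = ""
      n = split($9, attrs, ";")
      for (i = 1; i <= n; i++) {
        if (attrs[i] ~ /^gene_id=/)       gene = substr(attrs[i], 9)
        if (attrs[i] ~ /^transcript_id=/) tx   = substr(attrs[i], 15)
      }
      # Count exons per transcript and remember the parent gene.
      if (gene != "" && tx != "") { exons[tx]++; gene_of[tx] = gene }
    }
    END {
      # Flag genes that have at least one multi-exon transcript.
      for (tx in exons) {
        seen[gene_of[tx]] = 1
        if (exons[tx] > 1) multi[gene_of[tx]] = 1
      }
      # Report the genes that were never flagged.
      for (g in seen) if (!(g in multi)) print g
    }
  ' | sort
}
```

It could be used as `gunzip -c gencode.v42.annotation.gff3.gz | intronless_genes`. A gene with one single-exon isoform and one spliced isoform is excluded under this definition; relaxing that would mean reporting a gene as soon as any of its transcripts has one exon.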