How to extract de novo transcripts from RNA long-read alignments?
3.5 years ago
sacha ★ 2.4k

Hi,

I have long reads from transcript sequencing (one amplicon sequenced with PacBio) that I mapped to a gene locus using minimap2. I would like to extract the transcript structures (in GTF format?) along with their abundance. For example, in the screenshot below, you can see an alignment showing two kinds of transcripts, one of them with an intron retention. I would like to get the structure of those transcripts and the number of reads supporting each.

PacBio alignment

RNA pacbio • 1.0k views
3.5 years ago
Juke34 8.8k

If your minimap2 output is in BAM format, you can use agat_convert_minimap2_bam2gff.pl from AGAT to convert the alignments to GFF, and then extract the sequences with agat_sp_extract_sequences.pl. You can also convert the GFF to GTF with agat_convert_sp_gff2gtf.pl.
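To make concrete what "transcript structure" means at the alignment level, here is a minimal Python sketch of how exon blocks and the intron chain can be derived from a SAM POS and CIGAR string. This is not how AGAT does it internally; the function names `cigar_to_exons` and `intron_chain` and the example coordinates are made up for illustration, and the sketch assumes spliced alignments where introns are reported as `N` operations.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_to_exons(pos, cigar):
    """Return 1-based inclusive (start, end) exon blocks for one alignment.

    pos is the 1-based leftmost mapped position (the SAM POS field).
    """
    exons = []
    start = pos
    cursor = pos  # next reference position to be consumed
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op in "MD=X":
            # These operations consume reference bases inside an exon.
            cursor += length
        elif op == "N":
            # A skipped region (intron) closes the current exon block.
            exons.append((start, cursor - 1))
            cursor += length
            start = cursor
        # I, S, H and P do not consume the reference, so they are ignored.
    exons.append((start, cursor - 1))
    return exons


def intron_chain(exons):
    """Return the introns between consecutive exon blocks as a tuple."""
    return tuple(
        (left_end + 1, right_start - 1)
        for (_, left_end), (right_start, _) in zip(exons, exons[1:])
    )


# Hypothetical reads: one spliced, one retaining the intron.
spliced = cigar_to_exons(100, "50M200N50M")
retained = cigar_to_exons(100, "300M")
print(spliced, intron_chain(spliced))
print(retained, intron_chain(retained))
```

Reads with the same intron chain can be treated as the same isoform even if their ends differ by a few bases. With noisy long reads the junction coordinates may shift slightly, so some tolerance may be needed in practice.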


From a GFF file, can I remove duplicates using AGAT? And can I get the count of each item?


agat_convert_minimap2_bam2gff.pl will not remove duplicates, but the other scripts (those with _sp_ in their name) remove duplicates automatically when parsing the file. If you want a closer look at the removed duplicates, you can run agat_convert_sp_gxf2gxf.pl, which generates a log file.
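For the counting part of the question, one option outside AGAT is to tally identical features in the GFF before any de-duplication. Below is a minimal sketch, assuming a standard 9-column tab-separated GFF; the function name `count_features` and the example records are hypothetical. It counts individual features (same sequence, type, coordinates, and strand), not whole transcripts. Grouping into full transcript structures would additionally require following the ID/Parent attributes.

```python
from collections import Counter


def count_features(gff_lines):
    """Count features sharing seqid, type, start, end and strand."""
    counts = Counter()
    for line in gff_lines:
        # Skip blank lines and comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete GFF record
        seqid, _source, ftype, start, end, _score, strand = cols[:7]
        counts[(seqid, ftype, int(start), int(end), strand)] += 1
    return counts


# Hypothetical records: two reads share the first exon.
example = [
    "##gff-version 3",
    "chr1\tminimap2\texon\t100\t149\t.\t+\t.\tID=e1;Parent=read1",
    "chr1\tminimap2\texon\t100\t149\t.\t+\t.\tID=e2;Parent=read2",
    "chr1\tminimap2\texon\t100\t399\t.\t+\t.\tID=e3;Parent=read3",
]
for key, n in count_features(example).most_common():
    print(n, key)
```

To run it on a file, pass the open file handle directly, for example `count_features(open("alignments.gff"))`, where the file name is a placeholder.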
