Question: Generating GTF files with replaced features
5 days ago, abiuma.arasu wrote:

I need to extract the annotation rows whose feature type (column 3) is transcript from the original tab-delimited GTF, and change that feature type from transcript to exon. The command below runs without any error, but it does not replace transcript with exon. Can anyone tell me what mistake I am making here?

awk 'BEGIN{FS="\t"; OFS="\t"} $3 == "transcript"{ print; $3="exon"; $9 = gensub("(transcript_id\\s\"{0,1})([^;\"]+)(\"{0,1});", "\\1\\2_premrna\\3;", "g", $9); print; next}{print}'  refdata-cellranger-GRCh38-1.2.0/genes/genes.gtf > GRCh38-1.2.0.premrna.gtf
modified 5 days ago by ATpoint • written 5 days ago by abiuma.arasu

It looks like 10x has changed the awk command in their example and, as you found out, the new version does not seem to work.

Can you try this (it is what was there before):

awk 'BEGIN{FS="\t"; OFS="\t"} $3 == "transcript"{ $3="exon"; print}'  genes.gtf > GRCh38-1.2.0.premrna.gtf

If this works, can you email 10x support and let them know that their current example does not seem to work? Please update this thread when you hear back from them.
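As a rough illustration of what the shorter command does, here is a minimal sketch run on a three-row toy GTF. The records (chr1, toy, G1, T1) and the file names are made up for the example and are not taken from the 10x reference. The point it shows is that only rows matching the pattern `$3 == "transcript"` reach `print`, so gene rows and the original exon rows are not written to the output.

```shell
# Build a three-row, tab-delimited toy GTF (hypothetical records).
tmpdir=$(mktemp -d)
{
  printf 'chr1\ttoy\tgene\t100\t900\t.\t+\t.\tgene_id "G1";\n'
  printf 'chr1\ttoy\ttranscript\t100\t900\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n'
  printf 'chr1\ttoy\texon\t100\t300\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n'
} > "$tmpdir/toy.gtf"

# Same awk program as above: rows whose third column is "transcript"
# get that column rewritten to "exon" and are printed; all other rows
# match no pattern and are silently skipped.
awk 'BEGIN{FS="\t"; OFS="\t"} $3 == "transcript"{ $3="exon"; print}' \
  "$tmpdir/toy.gtf" > "$tmpdir/toy.premrna.gtf"

cat "$tmpdir/toy.premrna.gtf"
```

On this toy input the output holds a single row: the former transcript record, spanning 100 to 900, with exon in column 3.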

modified 4 days ago • written 4 days ago by genomax

Thanks! The command partially worked. The feature type was replaced from transcript to exon, but the file size has decreased. The original file was 923313 KB and the file obtained after running the command is 52000 KB. The original file has 1780460 lines, while the output file has 118158 lines.

modified 4 days ago • written 4 days ago by abiuma.arasu
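A smaller output is what the filtering command would be expected to produce, since it writes only the rows whose third column is transcript and drops every other feature type. One way to check whether that accounts for the whole difference is to compare the number of transcript rows in the input with the number of rows in the output. The sketch below does this on a hypothetical five-row toy file; the records, file names and variable names are assumptions for illustration only.

```shell
# Toy input with two transcripts plus gene and exon rows (hypothetical).
tmpdir=$(mktemp -d)
gtf_in="$tmpdir/toy.gtf"
gtf_out="$tmpdir/toy.premrna.gtf"
{
  printf 'chr1\ttoy\tgene\t100\t900\t.\t+\t.\tgene_id "G1";\n'
  printf 'chr1\ttoy\ttranscript\t100\t900\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n'
  printf 'chr1\ttoy\texon\t100\t300\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n'
  printf 'chr1\ttoy\ttranscript\t100\t700\t.\t+\t.\tgene_id "G1"; transcript_id "T2";\n'
  printf 'chr1\ttoy\texon\t400\t700\t.\t+\t.\tgene_id "G1"; transcript_id "T2";\n'
} > "$gtf_in"

# The filtering command from the thread, applied to the toy file.
awk 'BEGIN{FS="\t"; OFS="\t"} $3 == "transcript"{ $3="exon"; print}' \
  "$gtf_in" > "$gtf_out"

# Count transcript rows in the input and all rows in the output.
n_transcripts=$(awk -F'\t' '$3 == "transcript"{n++} END{print n+0}' "$gtf_in")
n_output=$(awk 'END{print NR}' "$gtf_out")
echo "transcript rows in input: $n_transcripts"
echo "rows in output:           $n_output"

# Per-feature-type breakdown of the input, to see which rows were left out.
cut -f3 "$gtf_in" | sort | uniq -c
```

To run the same check on real data, point `gtf_in` at genes.gtf and `gtf_out` at the generated file and skip the toy `printf` block. If the transcript count for genes.gtf comes out at 118158, the drop from 1780460 lines is fully explained by the filter rather than by a failure of the command.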