How to import a gtf file in R so that it appears in tabular format as if it were a data frame?
9 months ago
Alex • 0

Hi all!

How can I import a GTF file into R so that it appears in tabular format, as if it were a data frame?

This is the file name:

 GCF_023701775.1_HaSCD2_genomic.gtf

This is the file head:

#gtf-version 2.2
#!genome-build HaSCD2
#!genome-build-accession NCBI_Assembly:GCF_023701775.1
#!annotation-source NCBI RefSeq Helicoverpa armigera Annotation Release 101
NC_064776.1 Gnomon  gene    4347    6338    .   -   .   gene_id "LOC126054536"; transcript_id ""; db_xref "GeneID:126054536"; description "uncharacterized LOC126054536"; gbkey "Gene"; gene "LOC126054536"; gene_biotype "protein_coding";

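It's just a tab-separated file. You can use read.delim(filename, sep="\t", header=FALSE, skip=n), where n is the number of header lines; header=FALSE is needed because a GTF has no column-name row, so otherwise the first feature line would be consumed as the header. Is that really what you want?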
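For reference, a minimal base-R sketch of the read.delim approach. It assumes the nine standard GTF columns; the column names in `gtf_cols` are a common convention rather than something stored in the file, and the temporary file only stands in for the real GTF (with a shortened attribute string). It uses `comment.char = "#"` instead of counting header lines, which is an assumption that no data line contains a `#` (a `#` inside a description attribute would truncate that line).

```r
# Stand-in for the real file: two header comments and one feature line
# (attribute string shortened from the example in the question).
gtf_lines <- c(
  "#gtf-version 2.2",
  "#!genome-build HaSCD2",
  paste(
    "NC_064776.1", "Gnomon", "gene", "4347", "6338", ".", "-", ".",
    'gene_id "LOC126054536"; transcript_id ""; gene_biotype "protein_coding";',
    sep = "\t"
  )
)
gtf_path <- tempfile(fileext = ".gtf")
writeLines(gtf_lines, gtf_path)

# Conventional names for the nine GTF columns (not read from the file).
gtf_cols <- c("seqname", "source", "feature", "start", "end",
              "score", "strand", "frame", "attribute")

gtf_df <- read.delim(
  gtf_path,
  header = FALSE,        # a GTF has no column-name row
  comment.char = "#",    # skip the "#..." header lines
  quote = "",            # keep the double quotes in the attribute column as-is
  col.names = gtf_cols,
  stringsAsFactors = FALSE
)

str(gtf_df)
```

The ninth column is still a single string per row at this point, which is what the reply below is about.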


Actually, it is tab-separated, but the ninth (attribute) column holds nested key-value pairs separated by semicolons, so the rtracklayer route should be preferred since it takes care of splitting those columns-in-columns.
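To illustrate what "columns-in-columns" means, here is a rough base-R sketch that splits one attribute string (copied from the question) into named values. `parse_gtf_attributes()` is a hypothetical helper written for this example, not part of rtracklayer, and it assumes that no value contains a semicolon; repeated keys are simply kept as repeated names.

```r
# Attribute string from the feature line in the question.
attr_string <- paste0(
  'gene_id "LOC126054536"; transcript_id ""; db_xref "GeneID:126054536"; ',
  'description "uncharacterized LOC126054536"; gbkey "Gene"; ',
  'gene "LOC126054536"; gene_biotype "protein_coding";'
)

# Hypothetical helper: split one attribute string into a named character vector.
# Assumes values never contain ";".
parse_gtf_attributes <- function(x) {
  pairs <- trimws(strsplit(x, ";", fixed = TRUE)[[1]])
  pairs <- pairs[nzchar(pairs)]                         # drop empty trailing piece
  keys <- sub("^(\\S+)\\s+.*$", "\\1", pairs, perl = TRUE)
  values <- sub("^\\S+\\s+", "", pairs, perl = TRUE)    # remove key and separator
  values <- gsub('^"|"$', "", values)                   # strip surrounding quotes
  setNames(values, keys)
}

attrs <- parse_gtf_attributes(attr_string)
print(attrs)
```

rtracklayer::import() does this kind of splitting for every row and exposes the attributes as separate metadata columns, which is why it is usually the more convenient route.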


Thus my question to the OP..."Is that really what you want?" as it wasn't clear to me what they were trying to accomplish with a data frame. But yes, I figured the real solution would involve txdb or rtracklayer, which are both great for this.

9 months ago
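 library(rtracklayer)

 # import() returns a GRanges object; attributes become metadata columns
 GTF <- rtracklayer::import('your.gtf')
 # convert the GRanges object to a plain data frame
 GTF <- as.data.frame(GTF)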

Although I would opt for a TxDb if you are working in R:

 library(GenomicFeatures)
 # Note: in recent Bioconductor releases makeTxDbFromGFF() is provided by the txdbmaker package
 txdb <- makeTxDbFromGFF('your.gtf')
9 months ago
acvill ▴ 340

If you find that the functions from the rtracklayer or GenomicFeatures packages don't work for your annotations, you can try my read_gtf() function:

Transform a GTF file into a data frame in R
