repeatmasker plain text into gtf
4.1 years ago
priya120195 ▴ 20

This is my plain-text file:

 cat filtered_for_repeatsgtf.txt | head

239     chr1    67108752    67108881    +   RLTR17B_Mm      LTR ERVK
314     chr1    3145673     3145796     -   RMER16A3        LTR ERVK
3620    chr1    5242237     5242959     -   RMER13B         LTR ERVK
1530    chr1    7339880     7340133     -   MYSERV6-int     LTR ERVK
2842    chr1    9436682     9437312     +   RLTR1D2_MM      LTR ERV1
1317    chr1    28311234    28311561    -   MTD             LTR ERVL-MaLR
4789    chr1    29359731    29360380    -   MERVL_2A-int    LTR ERVL
2845    chr1    34602700    34603167    -   RLTR10          LTR ERVK
4419    chr1    45088448    45089377    -   RLTR13D6        LTR ERVK
287     chr1    60817355    60817487    +   LTR33           LTR ERVL

I want to convert this into Ensembl GTF format, i.e.:

 cat Mus_musculus.GRCm38.99.withchr.gtf|head 
chr1    havana  gene    3073253 3074322 .   +   .   gene_id ENSMUSG00000102693; gene_version 1; gene_name 4933401J01Rik; gene_source havana; gene_biotype TEC;
chr1    havana  transcript  3073253 3074322 .   +   .   gene_id ENSMUSG00000102693; gene_version 1; transcript_id ENSMUST00000193812; transcript_version 1; gene_name 4933401J01Rik; gene_source havana; gene_biotype TEC; transcript_name 4933401J01Rik-201; transcript_source havana; transcript_biotype TEC; tag basic; transcript_support_level NA;

Is there any tool or script to do this?

sequencing next-gen alignment • 1.0k views
is there any tool or script to do this?

awk
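
To make the awk suggestion concrete, here is a minimal sketch rather than a definitive converter. It assumes the eight whitespace-separated columns are score, chromosome, start, end, strand, repeat name, class and family (the post does not label them), that the coordinates are already 1-based as GTF expects, and that each repeat copy should be written as an `exon` feature. The function name `rmsk_to_gtf`, the `RepeatMasker` source label, the per-copy `transcript_id`, and the `family_id`/`class_id` attribute keys are illustrative choices, not part of any standard.

```shell
# Sketch: convert an 8-column repeat table to GTF-style lines.
# ASSUMED column layout (not stated in the original post):
#   1=score 2=chrom 3=start 4=end 5=strand 6=repName 7=repClass 8=repFamily
rmsk_to_gtf() {
  awk 'BEGIN {
    OFS = "\t"
    offset = 0   # ASSUMPTION: starts are 1-based; use 1 if they are 0-based
  }
  NF >= 8 {
    start = $3 + offset
    # Hypothetical per-copy identifier so every repeat instance is unique
    id = $6 "_" $2 "_" start "_" $4
    attr = sprintf("gene_id \"%s\"; transcript_id \"%s\"; family_id \"%s\"; class_id \"%s\";", $6, id, $8, $7)
    # GTF columns: seqname source feature start end score strand frame attributes
    print $2, "RepeatMasker", "exon", start, $4, $1, $5, ".", attr
  }' "$@"
}

# Usage (file names taken from the question):
#   rmsk_to_gtf filtered_for_repeatsgtf.txt > filtered_for_repeats.gtf
```

Rows with fewer than eight fields are skipped silently. If the table came from a source with 0-based starts, change `offset` to 1, and adjust the feature type or attribute keys to whatever the downstream counting tool expects.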

4.1 years ago
Dave Carlson ★ 1.7k

Note that RepeatMasker comes with a utility script that converts its default *.out file format to GFF3. You can find it at:

/path/to/RepeatMasker/util/rmOutToGFF3.pl

If you specifically need GTF format, you can convert using awk as Pierre suggested or using an existing tool (e.g., see here).
