2.5 years ago
TheCatalyst
Hi. I have obtained a tabular file with the coordinates of all interspersed repeats from the UCSC Genome Browser, but I'm specifically looking for LTR retrotransposons.
I am not sure which format you got, but you should try to get a GFF file if possible. Try the following command first (replace <your_file> with the name of the file you downloaded):
grep -i "ltr" <your_file> | head
and see what comes out. This assumes you are on Linux/Mac; otherwise just open the file in a text editor, if possible.
What is the advantage of a GFF file, and in which context? Most of the useful tools work with BED files, most importantly bedtools. IGV, the UCSC browser, etc. can all load BED files.
That is also possible. I am not sure what OP has anyway; I think GFF might contain more annotation information to filter the data on. I am guessing OP has this file: http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.out.gz
That is the original RepeatMasker output and can be parsed even though it is a whitespace-padded format, but I wanted to make sure it is really that file, and I would like to know the purpose of the investigation.
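If it is that file, a rough Python sketch for pulling the LTR records out of the RepeatMasker .out format and turning them into BED-like tuples could look like the following. The column positions (query sequence in column 5, begin/end in columns 6 and 7, strand in column 9, repeat name in column 10, class/family in column 11) and the 1-based coordinates are assumptions about the usual .out layout, and the function names are made up for illustration, so check a few lines of your own file first.

```python
import gzip


def ltr_records(lines):
    """Yield BED-like tuples for rows whose class/family starts with 'LTR'.

    Assumes the usual RepeatMasker .out column order; not verified
    against any particular RepeatMasker version.
    """
    for line in lines:
        fields = line.split()  # the format is padded with runs of spaces
        # Header and blank lines do not start with an integer SW score.
        if len(fields) < 15 or not fields[0].isdigit():
            continue
        if not fields[10].startswith("LTR"):
            continue
        chrom = fields[4]
        start = int(fields[5]) - 1  # assumed 1-based in .out, 0-based in BED
        end = int(fields[6])
        strand = "-" if fields[8] == "C" else "+"  # 'C' marks the complement
        yield (chrom, start, end, fields[9], fields[0], strand)


def write_ltr_bed(out_gz_path, bed_path):
    """Read a gzipped .out file and write the LTR rows as a 6-column BED."""
    with gzip.open(out_gz_path, "rt") as src, open(bed_path, "w") as dst:
        for record in ltr_records(src):
            dst.write("\t".join(str(value) for value in record) + "\n")


# Example call (hypothetical output name):
# write_ltr_bed("hg38.fa.out.gz", "hg38_ltr.bed")
```

The resulting BED file can then be used with bedtools or loaded into IGV, as mentioned above. Note that the prefix match also keeps uncertain classes such as "LTR?" if they occur in your file.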
Just to update, and thanks. I needed to follow this path: UCSC Genome Browser > Table Browser > group: Repeats > track: RepeatMasker > table: rmsk > filter: repName > LTR column
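For anyone who downloads the whole rmsk table instead of filtering in the browser, the same selection can be sketched in Python. This assumes a tab-separated export whose first line is a header starting with "#" and containing the columns genoName, genoStart, genoEnd, strand, repName, repClass and repFamily, and that the LTR elements are the rows with repClass equal to "LTR"; the function name is made up for illustration, so compare against the header of your own export.

```python
import csv


def ltr_rows(lines):
    """Yield BED-like tuples for rmsk rows whose repClass is 'LTR'.

    `lines` is any iterable of text lines (an open file works), with the
    header on the first line. Column names are assumptions about the
    Table Browser export.
    """
    lines = iter(lines)
    # Strip the leading '#' from the header so the names are usable as keys.
    header = next(lines).lstrip("#").rstrip("\r\n").split("\t")
    reader = csv.DictReader(
        lines, fieldnames=header, delimiter="\t", quoting=csv.QUOTE_NONE
    )
    for row in reader:
        if row["repClass"] != "LTR":
            continue
        yield (
            row["genoName"],
            int(row["genoStart"]),  # UCSC table starts are already 0-based
            int(row["genoEnd"]),
            row["repName"],
            row["repFamily"],
            row["strand"],
        )


# Example call (hypothetical file name):
# with open("rmsk.tsv") as handle:
#     for record in ltr_rows(handle):
#         print(*record, sep="\t")
```

Filtering on repClass rather than on a text match for "ltr" avoids picking up unrelated rows that merely contain those letters.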
Ok, so is your request solved by this?
Yes. Thanks.