Repetitive Regions In Hg18
9.8 years ago
Vikas Bansal ★ 2.4k

Dear all,

I want to download a BED-format file of the repetitive regions in hg18. Example of the format:

chr  start  stop

I looked in the UCSC browser but did not find it. My aim is this: I have a file like

chr  start   stop

and I want to filter out all regions that fall within repetitive regions (that is why I want to download the file).

Thanks and Best regards,

Vikas

repeats ucsc hg genome • 3.4k views
9.8 years ago

Go to 'Tables' and select the species and assembly you want. Then select the group 'Variation and Repeat' and choose the track you want to download. As output format, pick 'bed', and that's it!


Thanks for your reply. I did the same thing, but I am confused about which track to select. There are so many repeat options. If I select RepeatMasker, will it give me the positions of all interspersed repeats (as mentioned on the RepeatMasker website)?


In the Genome Browser: if you click on the gray bar on the left side of the track, you are taken to the controls for that track. There is a link 'View table schema', and there you will find the table name of the track. The regular repeat track is 'rmsk', as far as I know.


Perfect. I am trying to download it, but the download keeps stopping after some time with a timeout. Is there an FTP site from which I can download this with 'wget' on Linux?


As Pierre already mentioned, you can download everything from the http://hgdownload.cse.ucsc.edu/goldenPath/hg18/database/ folder. You can also try changing the output format to 'all fields...' and then parse out the fields you need for your BED file. That worked for me. Sometimes the Table Browser has problems converting the files to other output formats, and then it dies.
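
A minimal Python sketch of the "parse out the fields you need" step. It assumes the saved text starts with a tab-separated header row prefixed by '#', and that the coordinate columns are named genoName, genoStart and genoEnd as in the rmsk table; both are assumptions to check against 'View table schema' for your track. The function name all_fields_to_bed is only illustrative.

```python
def all_fields_to_bed(text, chrom_col="genoName", start_col="genoStart", end_col="genoEnd"):
    """Turn tab-separated 'all fields' text into a list of BED3 lines.

    Assumes the first line is a header such as '#bin<TAB>swScore<TAB>...'.
    Raises ValueError if one of the named columns is missing from the header.
    """
    lines = text.splitlines()
    if not lines:
        return []
    # Drop the leading '#' so the first column name can be matched as well.
    header = lines[0].lstrip("#").split("\t")
    wanted = [header.index(name) for name in (chrom_col, start_col, end_col)]
    bed = []
    for line in lines[1:]:
        # Skip blank lines and any further comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) <= max(wanted):
            continue  # skip truncated rows, e.g. from an interrupted download
        bed.append("\t".join(fields[i] for i in wanted))
    return bed
```

Writing the returned lines to a file, one per line, gives the chr/start/stop file asked for in the question. For tables with different column names, pass them via the keyword arguments.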


Yes, I tried this link. I looked for a whole-genome rmskRM327 file, but they do not have one; I noticed that it is provided for each chromosome separately.
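
If the table only comes as one file per chromosome, the pieces can be combined after download. Below is a rough Python sketch that concatenates such dumps into a single BED3 file. It assumes headerless, tab-separated files with chromosome, start and end in columns 5, 6 and 7 (0-based), which is my understanding of the rmsk layout; verify this against the table's .sql file in the same folder. The function name tables_to_bed and the file names in the usage note are hypothetical.

```python
import gzip


def tables_to_bed(paths, out_path, chrom_col=5, start_col=6, end_col=7):
    """Concatenate headerless tab-separated table dumps into one BED3 file.

    Files ending in '.gz' are read through gzip; others are read as plain text.
    Column indices are 0-based and are an assumption about the table layout.
    Returns the number of BED lines written.
    """
    written = 0
    last_needed = max(chrom_col, start_col, end_col)
    with open(out_path, "w") as out:
        for path in sorted(paths):
            opener = gzip.open if path.endswith(".gz") else open
            with opener(path, "rt") as handle:
                for line in handle:
                    fields = line.rstrip("\n").split("\t")
                    if len(fields) <= last_needed:
                        continue  # skip blank or truncated rows
                    out.write(
                        f"{fields[chrom_col]}\t{fields[start_col]}\t{fields[end_col]}\n"
                    )
                    written += 1
    return written
```

For example, after downloading the per-chromosome files into the current directory, something like tables_to_bed(glob.glob("chr*_rmsk.txt.gz"), "hg18_rmsk.bed") would build one file (the glob pattern is a guess at the naming scheme and needs import glob).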

9.8 years ago

Download a track of repeats, for example: http://hgdownload.cse.ucsc.edu/goldenPath/hg18/database/simpleRepeat.txt.gz

Then use BEDTools to filter your data against this file.

9.8 years ago

If you wish to filter out these regions then why not just use the repeat masked genome?

e.g. PreMaskedGenomes from RepeatMasker


BEDtools: intersectBed -a yourFile.bed -b repeats.bed -v > yourFile.withoutReadsOverlappingWithRepeats.bed
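
For anyone without BEDTools at hand, the idea behind the -v option (keep only the entries of the first file that overlap nothing in the second) can be sketched in plain Python. This is an illustration of the logic, not a replacement for intersectBed: it assumes both inputs are already parsed into (chrom, start, end) tuples with 0-based, half-open coordinates as in BED, and the function names are made up for this example.

```python
from bisect import bisect_right
from collections import defaultdict


def merge_by_chrom(intervals):
    """Merge overlapping or touching intervals per chromosome.

    Returns {chrom: (starts, ends)} with sorted, disjoint blocks.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))
    merged = {}
    for chrom, spans in by_chrom.items():
        spans.sort()
        blocks = []
        for start, end in spans:
            if blocks and start <= blocks[-1][1]:
                blocks[-1][1] = max(blocks[-1][1], end)
            else:
                blocks.append([start, end])
        merged[chrom] = ([b[0] for b in blocks], [b[1] for b in blocks])
    return merged


def without_overlaps(regions, repeats):
    """Return the regions that share no base with any repeat interval."""
    merged = merge_by_chrom(repeats)
    kept = []
    for chrom, start, end in regions:
        starts, ends = merged.get(chrom, ([], []))
        # Last merged block that starts before the region ends. Because the
        # blocks are disjoint and sorted, it is the only one that can overlap.
        i = bisect_right(starts, end - 1) - 1
        if i >= 0 and ends[i] > start:
            continue  # at least one base overlaps a repeat: drop the region
        kept.append((chrom, start, end))
    return kept
```

Regions that merely touch a repeat (one ends exactly where the other starts) are kept, since half-open intervals that abut share no base.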


I would say it depends on the format of the data Vikas wants to overlap with the repeats. If it is also in BED format, intersectBed with the -v parameter is much faster than working with masked sequences.


Hi. The thing is, I have already done everything (mapping, analysis, etc.). Now I have my output file and I just want to exclude the positions that fall within repeat regions. Any advice?


+1 for BEDTools. Galaxy also has a range of intersect options.
