How to retrieve only blocks that contain sequences for ALL species from a maf file?
2.3 years ago

Hi, I'm relatively new to bioinformatics, so my computing terminology is clunky and simple explanations would be very much appreciated!

I have MAF files containing alignments of 16 species. Many blocks only contain sequences for a subset of the 16 species, so I want to filter each file to keep only the blocks that have data for all species. I found bx-python on GitHub (https://github.com/bxlab/bx-python); its script maf_limit_to_species.py lets you select a subset of species in a MAF and returns any block that contains one or more of them. What I want instead is only the blocks that contain all of them.

I've tried to upload my MAF files to Galaxy (http://galaxy.southgreen.fr/galaxy), but I can't connect to their HPC to upload my data, so I'm stuck there.

I've also tried to get mafTools (https://github.com/dentearl/mafTools) to work, but I've only ever used Python so I can't work out how to run it, and from the description I think it would only do the same thing as the bx-python script.

If anyone knows of a way to filter my data as described above, I would very much appreciate it!

Many thanks.

maf python mutation annotation format • 530 views
11 months ago
Zhenpeng Yu ▴ 10

You can try MafFilter
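
If you would rather stay in Python, the filter described in the question (keep a block only if every species has a sequence line in it) can also be sketched with the standard library alone. This is a minimal sketch, not the bx-python or MafFilter implementation. It assumes the usual MAF layout: each block starts with an "a" line, sequence lines start with "s", blocks are separated by blank lines, and source names follow the "species.chromosome" convention. The function names and the three-species demo data are made up for illustration; with real data you would pass all 16 species names.

```python
import io


def species_of(src):
    # Assumption: sources are named "species.chromosome", e.g. "hg19.chr1".
    return src.split(".", 1)[0]


def iter_maf_blocks(handle):
    """Yield each alignment block as a list of lines, starting with its 'a' line."""
    block = []
    for line in handle:
        fields = line.split()
        if not fields:
            # A blank line ends the current block.
            if block:
                yield block
            block = []
        elif fields[0] == "a":
            # A new 'a' line also ends any block still open.
            if block:
                yield block
            block = [line.rstrip("\n")]
        elif block and not line.startswith("#"):
            block.append(line.rstrip("\n"))
    if block:
        yield block


def block_species(block):
    """Return the set of species that have an 's' (sequence) line in the block."""
    return {species_of(f[1]) for f in (l.split() for l in block) if f[0] == "s"}


def filter_complete_blocks(in_handle, out_handle, required):
    """Write only blocks containing every required species; return how many were kept."""
    required = set(required)
    kept = 0
    out_handle.write("##maf version=1\n")
    for block in iter_maf_blocks(in_handle):
        if required <= block_species(block):
            out_handle.write("\n".join(block) + "\n\n")
            kept += 1
    return kept


# Made-up demo: the first block has all three species, the second lacks canFam3.
demo = """##maf version=1
a score=10
s hg19.chr1 0 4 + 100 ACGT
s mm10.chr2 5 4 + 200 ACGT
s canFam3.chr3 7 4 - 300 ACGT

a score=20
s hg19.chr1 10 4 + 100 TTTT
s mm10.chr2 20 4 + 200 TTTT
"""

out = io.StringIO()
kept = filter_complete_blocks(io.StringIO(demo), out, ["hg19", "mm10", "canFam3"])
print(kept)
```

For real files, open the input and output with `open(...)` and pass those handles instead of the `io.StringIO` objects. Only "s" lines count as data here, so a species that appears solely on an "e" (empty) line is treated as missing from the block. Header lines from the input are not copied; a fresh `##maf version=1` line is written instead.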
