Question: Extract gff of a particular chromosome
1
3.2 years ago by
mhasa00650
United States
mhasa00650 wrote:

I have a gff file that contains all the information from Chromosome 1 - Chromosome 14. But I need gff information on individual chromosome basis. For example, I am performing some experiment on Chromosome 11 and trying to visualize my result on IGV. When I load the gff file in IGV it is showing gene information of all the chromosome. How can I get the gff of only Chromosome 11?

igv chromosome gff3 • 2.4k views
ADD COMMENTlink modified 11 months ago by ahmedferoz2010 • written 3.2 years ago by mhasa00650

Why are you bothered by IGV showing you all chromosomes? You just need to double-click on the chromosome you are interested in to select and zoom to just that chromosome (or use the drop-down menu).

ADD REPLYlink written 3.2 years ago by genomax71k

Hello, I was trying to grep chromosome X from a gff file. I have an output but it is empty. I am a beginner in this. Please help me.

ADD REPLYlink written 11 months ago by ahmedferoz2010

Ok, you are not supposed to ask questions in other threads but before you refresh even more old ones, please show the output of cut -f1 your.gff | sort -k1,1 | grep -v '^#' | uniq -c and post the command you used.

ADD REPLYlink modified 11 months ago • written 11 months ago by ATpoint23k

Thanks a lot. Sorry, I was not aware of that. I am still not getting a file with chromosome X only. I used grep chrX myfile.gff > chrx.gff

ADD REPLYlink written 11 months ago by ahmedferoz2010

@ahmedferoz20 I deleted your comment because you added it as an answer instead of using Add Reply.

What is the output of cut -f1 your.gff | sort -k1,1 | grep -v '^#' | uniq -c

ADD REPLYlink modified 11 months ago • written 11 months ago by ATpoint23k

Yep, OK, I am doing it now.

1. I used grep chrX myfile.gff > chrx.gff to extract chrX.

2. The output of the command you gave me is:

## species https://www.ncbi.nlm.nih.gov/Taxonomoy/Browser/www.tax.cgi?id=70

I hope I followed your suggestion correctly.

ADD REPLYlink modified 11 months ago by genomax71k • written 11 months ago by ahmedferoz2010
1

What? A link to NCBI is certainly not the output. head -n 20 your.gff will also do.

ADD REPLYlink modified 11 months ago • written 11 months ago by ATpoint23k

@ATpoint, I admire your patience.

ADD REPLYlink written 11 months ago by Carambakaracho1.6k

We were all inexperienced at some point. I assume the problem is that the chromosomes are labelled 1, 2, 3 ... X rather than chr1, chr2, chr3 ... chrX. That is why I asked for cut -f1 your.gff | sort -k1,1 | grep -v '^#' | uniq -c, which lists the unique chromosome names together with how many lines each one has.
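To illustrate the naming point, here is a minimal sketch on a made-up four-line GFF. The file names demo.gff and chrX_demo.gff and their contents are assumptions for illustration, not the poster's data. It lists the names found in column 1 and shows why a search for chrX comes back empty when the chromosome is labelled X.

```shell
# Build a tiny, hypothetical GFF whose chromosomes are named 1 and X (no "chr" prefix).
printf '##gff-version 3\n' > demo.gff
printf '1\tsrc\tgene\t100\t200\t.\t+\t.\tID=g1\n' >> demo.gff
printf 'X\tsrc\tgene\t300\t400\t.\t-\t.\tID=g2\n' >> demo.gff
printf 'X\tsrc\tgene\t500\t600\t.\t+\t.\tID=g3\n' >> demo.gff

# Count how often each sequence name occurs in column 1, skipping comment lines.
grep -v '^#' demo.gff | cut -f1 | sort | uniq -c

# The wrong label matches nothing. grep exits non-zero in that case, hence "|| true".
grep -c '^chrX' demo.gff || true

# The label actually present in column 1 selects the two X records.
awk -F'\t' '$1 == "X"' demo.gff > chrX_demo.gff
```

Once the real names are known, the same awk test can be pointed at whatever label the file actually uses.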

ADD REPLYlink written 11 months ago by ATpoint23k

Thanks a lot. I got it. However, my next challenge is to create a FASTA file from that chromosome X file. It has to contain the 1000 bp upstream of each gene.

ADD REPLYlink written 11 months ago by ahmedferoz2010
5
3.2 years ago by
genomax71k
United States
genomax71k wrote:

Assuming your chromosomes are named chrNN, the following should extract chr11:

grep chr11 your_file.gff > chr11.gff

If chr11 is in the first column and you want only those lines, then do:

grep '^chr11' your_file.gff > chr11.gff
ADD COMMENTlink modified 3.2 years ago • written 3.2 years ago by genomax71k
3

Small correction: sometimes when you grep 'chr1' you also end up getting 'chr11', 'chr12', etc. Adding '-w' solves the problem.

grep -w chr11 your_file.gff > chr11.gff
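As a rough illustration of the difference, the sketch below uses an invented three-line file (w_demo.gff and its contents are assumptions). It also shows one caveat: -w matches the word anywhere on the line, so a mention of chr11 in the attributes column is still picked up.

```shell
# Hypothetical input: one gene each on chr1 and chr11, plus a chr2 record
# whose attribute column happens to mention chr11.
printf 'chr1\tsrc\tgene\t1\t50\t.\t+\t.\tID=a\n' > w_demo.gff
printf 'chr11\tsrc\tgene\t1\t50\t.\t+\t.\tID=b\n' >> w_demo.gff
printf 'chr2\tsrc\tgene\t1\t50\t.\t+\t.\tID=c;Note=similar to chr11\n' >> w_demo.gff

grep -c 'chr1' w_demo.gff     # 3: substring match hits chr1, chr11 and the Note
grep -cw 'chr1' w_demo.gff    # 1: whole-word match keeps only the chr1 record
grep -cw 'chr11' w_demo.gff   # 2: -w still matches the Note in column 9

# Comparing column 1 exactly avoids both problems; prints 1.
awk -F'\t' '$1 == "chr11" {n++} END {print n+0}' w_demo.gff
```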

ADD REPLYlink written 3.2 years ago by EagleEye6.4k
1

With awk for exact matches, preserving the header:

awk '$1 ~ /^#/ {print $0;next} {if ($1 == "chr11") print}' your_file.gff
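The same idea can be wrapped so the chromosome name is passed in rather than hard-coded. This is only a sketch: extract_chrom is a hypothetical helper name, and awk_demo.gff is a made-up input used to exercise it.

```shell
# extract_chrom NAME FILE: hypothetical helper, not part of any existing tool.
# Prints comment/header lines plus every record whose first column equals NAME.
extract_chrom() {
    awk -F'\t' -v chrom="$1" '/^#/ || $1 == chrom' "$2"
}

# Tiny invented input: a header line and one gene each on chr1 and chr11.
printf '##gff-version 3\n' > awk_demo.gff
printf 'chr1\tsrc\tgene\t1\t50\t.\t+\t.\tID=a\n' >> awk_demo.gff
printf 'chr11\tsrc\tgene\t1\t50\t.\t+\t.\tID=b\n' >> awk_demo.gff

# Write the header and the chr11 record to a new file, then show it.
extract_chrom chr11 awk_demo.gff > chr11_demo.gff
cat chr11_demo.gff
```

Passing the name with -v keeps the comparison an exact string match, so chr1 never pulls in chr11.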
ADD REPLYlink modified 11 months ago • written 11 months ago by ATpoint23k