Question: Extract gff of a particular chromosome
mhasa006 (United States) wrote 2.7 years ago:

I have a GFF file that contains annotations for Chromosome 1 through Chromosome 14, but I need the annotations for individual chromosomes. For example, I am running an experiment on Chromosome 11 and want to visualize my results in IGV. When I load the GFF file in IGV, it shows gene information for all the chromosomes. How can I get a GFF containing only Chromosome 11?

igv chromosome gff3 • 1.9k views
ADD COMMENTlink modified 4 months ago by ahmedferoz2010 • written 2.7 years ago by mhasa00630

Why are you bothered by IGV showing you all chromosomes? You just need to double-click on the chromosome you are interested in to select and zoom to just that chromosome (or use the drop-down menu).

ADD REPLYlink written 2.7 years ago by genomax63k
genomax (United States) wrote 2.7 years ago:

Assuming your chromosomes are named chrNN, the following should extract chr11 (it keeps every line that contains "chr11" anywhere):

grep chr11 your_file.gff > chr11.gff

If chr11 is in the first column and you want only those lines, anchor the pattern to the start of the line:

grep '^chr11' your_file.gff > chr11.gff
ADD COMMENTlink modified 2.7 years ago • written 2.7 years ago by genomax63k

Small correction: sometimes when you grep 'chr1' you also get 'chr11', 'chr12', etc. Adding '-w' (match whole words only) solves the problem.

grep -w chr11 your_file.gff > chr11.gff

ADD REPLYlink written 2.7 years ago by EagleEye6.2k
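For anyone following along, here is a minimal sketch of the difference between a plain grep and grep -w. The file name and the records in it are made up for illustration, so treat the counts as a property of this toy file only. It also shows one caveat: -w still matches the name when it appears as a whole word in another column, such as the attributes.

```shell
# Made-up, tab-separated records; the file name and contents are hypothetical.
demo=$(mktemp -d)/demo.gff
{
  printf '##gff-version 3\n'
  printf 'chr1\tsrc\tgene\t100\t200\t.\t+\t.\tID=g1\n'
  printf 'chr11\tsrc\tgene\t300\t400\t.\t-\t.\tID=g2\n'
  printf 'chr12\tsrc\tgene\t500\t600\t.\t+\t.\tID=g3\n'
  printf 'chr2\tsrc\tgene\t700\t800\t.\t+\t.\tID=g4;Note=dup-of-chr1\n'
} > "$demo"

# Substring match: hits chr1, chr11, chr12 and the Note on the chr2 line.
plain_count=$(grep -c 'chr1' "$demo")

# Whole-word match: chr11 and chr12 drop out, but the Note on chr2 still hits.
word_count=$(grep -c -w 'chr1' "$demo")

echo "plain grep: $plain_count lines, grep -w: $word_count lines"
```

If the name can show up outside the first column, comparing the first field exactly (for example with awk) is the safer route.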

With awk for exact matches, preserving the header:

awk '$1 ~ /^#/ {print $0;next} {if ($1 == "chr11") print}' your_file.gff
ADD REPLYlink modified 4 months ago • written 4 months ago by ATpoint14k
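A small sketch of the exact-match idea on a toy file, in case it helps. The records, the file names, and the chr11_random sequence name are assumptions made up for this example. Setting the field separator to a tab is my own addition, on the assumption that the GFF is tab-delimited as the format requires.

```shell
# Made-up, tab-separated records; file names and contents are hypothetical.
workdir=$(mktemp -d)
in_gff="$workdir/sample.gff"
out_gff="$workdir/chr11_only.gff"
{
  printf '##gff-version 3\n'
  printf 'chr1\tsrc\tgene\t100\t200\t.\t+\t.\tID=g1;Note=dup-of-chr11\n'
  printf 'chr11\tsrc\tgene\t300\t400\t.\t-\t.\tID=g2\n'
  printf 'chr11_random\tsrc\tgene\t500\t600\t.\t+\t.\tID=g3\n'
} > "$in_gff"

# Keep header/comment lines, then keep only records whose first
# tab-separated field is exactly "chr11".
awk -F'\t' '/^#/ {print; next} $1 == "chr11"' "$in_gff" > "$out_gff"

cat "$out_gff"
```

In this toy file the chr1 record (which only mentions chr11 in its attributes) and the chr11_random record are both left out, because only the first column is compared and the comparison is exact.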
ahmedferoz20 wrote 4 months ago:

Hello, I was trying to grep chromosome X from a GFF file. I get an output file, but it is empty. I am a beginner at this. Please help me.

ADD COMMENTlink written 4 months ago by ahmedferoz2010

OK, you are not supposed to ask questions in other people's threads, but before you revive even more old ones, please show the output of cut -f1 your.gff | sort -k1,1 | grep -v '^#' | uniq -c and post the command you used.

ADD REPLYlink modified 4 months ago • written 4 months ago by ATpoint14k

Thanks a lot. Sorry, I was not aware of that. I am still not getting a file with only chromosome X. I used grep chrX myfile.gff > chrx.gff

ADD REPLYlink written 4 months ago by ahmedferoz2010

@ahmedferoz20 I deleted your comment because you added it as an answer instead of using Add Reply.

What is the output of cut -f1 your.gff | sort -k1,1 | grep -v '^#' | uniq -c

ADD REPLYlink modified 4 months ago • written 4 months ago by ATpoint14k

Yes, OK, I am doing it now.

1. I used grep chrX myfile.gff > chrx.gff to extract chrX.

2. The output of the command you gave me is

## species https://www.ncbi.nlm.nih.gov/Taxonomoy/Browser/www.tax.cgi?id=70

I hope I followed your suggestion correctly.

ADD REPLYlink modified 4 months ago by genomax63k • written 4 months ago by ahmedferoz2010

What? A link to NCBI is surely not the whole output. The output of head -n 20 your.gff will also do.

ADD REPLYlink modified 4 months ago • written 4 months ago by ATpoint14k

@ATpoint, I admire your patience.

ADD REPLYlink written 4 months ago by Carambakaracho930

We were all inexperienced at some point. I assume the problem is that the chromosomes are labelled 1,2,3...X rather than chr1,chr2,chr3...chrX. That is why I asked for the output of cut -f1 your.gff | sort -k1,1 | grep -v '^#' | uniq -c, because it lists the unique chromosome names.

ADD REPLYlink written 4 months ago by ATpoint14k
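To illustrate the naming mismatch described above, here is a minimal sketch on a made-up file whose sequences are named 1 and X rather than chr1 and chrX. The file name and records are hypothetical. Listing the names first shows what to search for; a grep for chrX then finds nothing, while an exact match on X does.

```shell
# Made-up, tab-separated records without the "chr" prefix (hypothetical).
names_gff=$(mktemp -d)/names.gff
{
  printf '##gff-version 3\n'
  printf '1\tsrc\tgene\t100\t200\t.\t+\t.\tID=g1\n'
  printf '1\tsrc\tmRNA\t100\t200\t.\t+\t.\tID=t1\n'
  printf 'X\tsrc\tgene\t300\t400\t.\t-\t.\tID=g2\n'
} > "$names_gff"

# List each sequence name in column 1 with its number of records.
grep -v '^#' "$names_gff" | cut -f1 | sort | uniq -c

# The name used in the failing command is not in the file at all.
# grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
chrx_hits=$(grep -c '^chrX' "$names_gff" || true)

# Matching the name exactly as it appears in column 1 finds the record.
x_hits=$(awk -F'\t' '$1 == "X" {n++} END {print n + 0}' "$names_gff")

echo "chrX hits: $chrx_hits, X hits: $x_hits"
```

The same check on a real file tells you whether to search for chrX, X, or an accession-style name before running the extraction.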

Thanks a lot. I got it. However, my next challenge is to create a FASTA file from that chromosome X file. It has to contain the 1000 bp upstream of each gene.

ADD REPLYlink written 4 months ago by ahmedferoz2010