Question: TopHat parameter "-G/--GTF <GTF/GFF3 file>"
5.1 years ago by
Y Tb150
USA
Y Tb150 wrote:

I am going to run TopHat to map my RNA-seq reads to the human genome. One of the options is to provide TopHat with an annotation GTF file via "-G/--GTF <GTF/GFF3 file>", and I found the following note in the TopHat manual:

 

Please note that the values in the first column of the provided GTF/GFF file (column which indicates the chromosome or contig on which the feature is located), must match the name of the reference sequence in the Bowtie index you are using with TopHat. You can get a list of the sequence names in a Bowtie index by typing:

 

bowtie-inspect --names your_index


So before using a known annotation file with this option please make sure that the 1st column in the annotation file uses the exact same chromosome/contig names (case sensitive) as shown by the bowtie-inspect command above.

So I checked the first column in my GTF, but my question is how to use bowtie-inspect to check the index (I mean, which file should I pass to it for that check?)

 

rna-seq next-gen • 7.0k views
modified 5.1 years ago by Ashutosh Pandey • written 5.1 years ago by Y Tb
5.1 years ago by
Philadelphia
Ashutosh Pandey11k wrote:

The first step of RNA-seq alignment with TopHat is aligning the reads with Bowtie. To use Bowtie, the genome first has to be indexed with "bowtie-build". To use the "-G" option, the chromosome names in the reference file that was used to create the Bowtie index must match the chromosome names in the GTF file. For example, if your reference FASTA file names a chromosome "chr1" and your GTF file uses only "1" for chr1, then it will throw an error. You can add the prefix "chr" to the chromosome names in your GTF file to make the two files compatible with each other.

modified 5.1 years ago • written 5.1 years ago by Ashutosh Pandey
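The renaming suggested in the answer above could be sketched roughly as follows. This is a minimal illustration, not a tested pipeline step: the function names, the optional `rename` mapping, and the file paths are hypothetical. It assumes a tab-separated GTF whose first column holds the chromosome name. UCSC-style references usually call the mitochondrial genome "chrM" while Ensembl uses "MT", so a blanket "chr" prefix may not be enough for that one sequence.

```python
def add_chr_prefix(gtf_line, rename=None):
    """Return a GTF line whose first column carries a "chr" prefix.

    `rename` is an optional, hypothetical mapping for names that need more
    than a prefix, e.g. {"MT": "chrM"}.
    """
    # Blank lines and header/comment lines (starting with "#") pass through.
    if not gtf_line.strip() or gtf_line.startswith("#"):
        return gtf_line
    fields = gtf_line.split("\t")
    name = fields[0]
    if rename and name in rename:
        fields[0] = rename[name]
    elif not name.startswith("chr"):
        fields[0] = "chr" + name
    return "\t".join(fields)


def convert_gtf(in_path, out_path, rename=None):
    """Write a copy of the GTF at in_path with renamed chromosomes."""
    # The original file is never modified; output goes to a new file.
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(add_chr_prefix(line, rename))
```

A call such as `convert_gtf("genes.gtf", "genes.chr.gtf", rename={"MT": "chrM"})` (both file names are placeholders) would leave the downloaded annotation untouched and produce a renamed copy. The conversion is only needed when the index uses "chr"-style names and the GTF does not.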

Hi Pandey, thanks for helping me, but I need some more help on this point. I did the following steps:

1- I downloaded the human annotation GTF file from the Ensembl website, checked the first column in the file, and found that the chromosomes are named 1, 2, ..., X, Y, not chr1, chr2, etc.

2- I downloaded the index files from the Bowtie website (I mean I didn't create the index myself).

My question is how to check the index to make sure the chromosomes have the same names (1, 2, ...).

 

 

modified 5.1 years ago • written 5.1 years ago by Y Tb

You need to use bowtie-inspect, which comes with Bowtie. See the link below for how to use it:

http://bowtie-bio.sourceforge.net/manual.shtml#the-bowtie-inspect-index-inspector

Basically, run bowtie-inspect from the command line and give it the location of the index plus the prefix (basename) that is common to all the index files.

 

 

written 5.1 years ago by Ashutosh Pandey
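To make the "common prefix" idea concrete, here is a small sketch that derives the index basename from any one of the index files and builds the inspect command. The helper names are hypothetical. It assumes the usual file naming of Bowtie 1 (`.1.ebwt`, `.rev.1.ebwt`, ...) and Bowtie 2 (`.1.bt2`, `.rev.1.bt2`, ...) indexes, and it only constructs the command rather than running it.

```python
import re

# Trailing part of Bowtie 1 (.ebwt/.ebwtl) and Bowtie 2 (.bt2/.bt2l) index
# file names, e.g. ".1.bt2" or ".rev.2.ebwt".
_INDEX_SUFFIX = re.compile(r"(\.rev)?\.[1-4]\.(ebwt|ebwtl|bt2|bt2l)$")


def index_basename(index_file):
    """Strip the per-file suffix so only the shared basename remains."""
    return _INDEX_SUFFIX.sub("", index_file)


def inspect_command(index_file, tool="bowtie2-inspect"):
    """Build the argument list that prints the sequence names of an index.

    Use tool="bowtie-inspect" for a Bowtie 1 index.
    """
    return [tool, "--names", index_basename(index_file)]
```

For example, `inspect_command("genome.1.bt2")` yields `["bowtie2-inspect", "--names", "genome"]`. If the tool is on the PATH, that list can be passed to `subprocess.run(..., capture_output=True, text=True)` to collect the names.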

I used the following command to check the sequence names in the index, and I got the names without "chr", as shown below:

ab@patrex:/disk2//Bowtie2Index> bowtie2-inspect -n genome

10
11
12
13
14
15
16
17
18
19
1
20
21
22
2
3
4
5
6
7
8
9
MT
X
Y

I also checked the annotation file and found that its names are without "chr" too; the only difference is the order of the chromosomes, as shown below:

1
10
11
12
13
14
15
16
17
18
19
2
20
21
22
3
4
5
6
7
8
9
MT
X
Y

 

My question here is whether the difference in order will cause any problem (I mean, is it okay to use both of them even though the order differs?)

written 5.1 years ago by Y Tb

Just try it and check. The worst that can happen is that it throws an error saying the order doesn't match; otherwise it may work regardless of the order of the chromosomes in those two files. Don't be scared that you will mess something up. Always keep a backup of your files.

modified 5.1 years ago • written 5.1 years ago by Ashutosh Pandey
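A quick way to confirm that the two name lists in this thread differ only in order is to compare them as sets. The sketch below is an illustration with hypothetical function names. It assumes the index names come from `bowtie2-inspect --names` (one per line) and that only the first whitespace-delimited token of each name matters. It checks name agreement only and says nothing about how TopHat itself treats ordering.

```python
def gtf_seqnames(lines):
    """Collect the distinct first-column values from GTF lines."""
    names = set()
    for line in lines:
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split("\t", 1)[0])
    return names


def compare_names(index_names, gtf_names):
    """Return (only_in_index, only_in_gtf), each as a sorted list.

    Assumption: an index name may carry a description after whitespace,
    so only its first token is compared.
    """
    index_set = {n.split()[0] for n in index_names if n.strip()}
    gtf_set = set(gtf_names)
    return sorted(index_set - gtf_set), sorted(gtf_set - index_set)
```

Two empty lists mean every name appears on both sides, whatever the order. A non-empty result shows exactly which names are missing from the index or from the annotation.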