TopHat parameter -G/--GTF <GTF/GFF3 file>
10.0 years ago
Y Tb ▴ 230

I am going to run TopHat to map my RNA-seq reads to the human genome. One of the options is to provide TopHat with an annotation GTF file via -G/--GTF <GTF/GFF3 file>, and I found the following note in the TopHat manual:

Please note that the values in the first column of the provided GTF/GFF file (column which indicates the chromosome or contig on which the feature is located), must match the name of the reference sequence in the Bowtie index you are using with TopHat. You can get a list of the sequence names in a Bowtie index by typing:

bowtie-inspect --names your_index

So before using a known annotation file with this option please make sure that the 1st column in the annotation file uses the exact same chromosome/contig names (case sensitive) as shown by the bowtie-inspect command above.

So I checked the first column of my GTF file, but my question is how to use bowtie-inspect to check the index (that is, which file should I pass to it?).

RNA-Seq next-gen • 9.6k views
10.0 years ago

The first step of RNA-seq alignment with TopHat is aligning the reads with Bowtie. To use Bowtie, the genome first has to be indexed with bowtie-build. To use the -G option, the chromosome names in the reference FASTA file that was used to build the Bowtie index must match the chromosome names in the GTF file. For example, if your reference FASTA file names a chromosome "chr1" and your GTF file uses just "1" for the same chromosome, then TopHat will throw an error. You can add the prefix "chr" to the chromosome names in your GTF file to make the two files compatible.
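Adding the prefix can be done with awk. This is only a minimal sketch on a made-up two-line GTF (the file contents and gene IDs are hypothetical). It assumes the index uses UCSC-style names, where the mitochondrial sequence is called chrM rather than chrMT, so MT is renamed separately. Check the names in your own index before relying on that.

```shell
# Toy GTF (hypothetical content) with Ensembl-style names in column 1
gtf=$(mktemp)
printf '#!genome-build GRCh37\n' > "$gtf"
printf '1\tensembl\texon\t100\t200\t.\t+\t.\tgene_id "g1";\n' >> "$gtf"
printf 'MT\tensembl\texon\t300\t400\t.\t+\t.\tgene_id "g2";\n' >> "$gtf"

# Keep header/comment lines as they are; prefix column 1 on every other line.
# Assumption: the index uses UCSC names, where the mitochondrion is chrM.
awk 'BEGIN { FS = OFS = "\t" }
     /^#/  { print; next }
     { $1 = ($1 == "MT") ? "chrM" : "chr" $1; print }' "$gtf" > "$gtf.chr"

# Show the distinct names now present in column 1
renamed=$(grep -v '^#' "$gtf.chr" | cut -f1 | LC_ALL=C sort -u)
echo "$renamed"
```

The original file is left untouched and the renamed copy is written next to it, so it is easy to go back if the names still do not match.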


Hi Pandey, thanks for helping me, but I need some more help on this point. I did the following steps:

  1. I downloaded the human annotation GTF file from the Ensembl website, checked the first column, and found that the chromosomes are named 1, 2, ..., X, Y, not chr1, chr2, etc.

  2. I downloaded the index from the Bowtie website (I mean I didn't build the index myself).

My question is how to check the index to make sure the chromosomes have the same names: 1, 2, ...


You need to use bowtie-inspect, which comes with Bowtie. See this link for how to use it.

Basically, run bowtie-inspect from the command line and give it the location of the index plus the basename, that is, the prefix that is common to all the index files.
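To illustrate what "the prefix common to all the index files" means, here is a rough sketch. It creates empty placeholder files with the usual Bowtie 2 index extensions in a temporary directory (the directory and the name "genome" are made up for the example), works out the basename, and prints the command you would then run on a real index. It does not run bowtie2-inspect itself, because the placeholders are not a real index.

```shell
# Hypothetical index directory holding empty stand-ins for Bowtie 2 index files:
# genome.1.bt2 ... genome.4.bt2, genome.rev.1.bt2, genome.rev.2.bt2
index_dir=$(mktemp -d)
for part in 1 2 3 4 rev.1 rev.2; do
    : > "$index_dir/genome.$part.bt2"
done

# The basename is the index path with the ".1.bt2" extension stripped
# (for a Bowtie 1 index the extension would be ".1.ebwt" instead).
first=$(ls "$index_dir"/*.1.bt2 | grep -v '\.rev\.1\.bt2$' | head -n 1)
index_base=${first%.1.bt2}
echo "index basename: $(basename "$index_base")"

# On a real index, this is the command that lists the sequence names
echo "bowtie2-inspect -n $index_base"
```

So you never pass a single .bt2 or .ebwt file to the inspect command, only the path up to the common prefix.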


I used the following command to list the sequence names in the index, and I got names without chr, as shown below:

ab@patrex:/disk2//Bowtie2Index> bowtie2-inspect -n genome

10
11
12
13
14
15
16
17
18
19
1
20
21
22
2
3
4
5
6
7
8
9
MT
X
Y

I also checked the annotation file and found that its names are also without chr; the only difference is the order of the chromosomes, as shown below:

1
10
11
12
13
14
15
16
17
18
19
2
20
21
22
3
4
5
6
7
8
9
MT
X
Y

My question is whether the difference in order causes any problem (I mean, is it okay to use both of them with different orders?).


Just try it and check. The worst it can do is throw an error saying that the order doesn't match, or it may work regardless of the order of the chromosomes in the two files. Don't be scared that you will mess something up. Always keep a backup of your files.
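Before running TopHat you can at least confirm that the two lists contain the same names once order is ignored. The sketch below uses the two lists posted above, written into temporary files; it only compares the names as sets and says nothing about whether TopHat itself cares about the order. On real data the lists would presumably come from `bowtie2-inspect -n genome` and from column 1 of the GTF (the file name annotation.gtf in the comment is a placeholder).

```shell
# Names as posted above: index order first, then GTF order
index_names=$(mktemp)
gtf_names=$(mktemp)
printf '%s\n' 10 11 12 13 14 15 16 17 18 19 1 20 21 22 2 3 4 5 6 7 8 9 MT X Y > "$index_names"
printf '%s\n' 1 10 11 12 13 14 15 16 17 18 19 2 20 21 22 3 4 5 6 7 8 9 MT X Y > "$gtf_names"
# On real data, something like:
#   bowtie2-inspect -n genome > "$index_names"
#   grep -v '^#' annotation.gtf | cut -f1 > "$gtf_names"

# Sort both lists the same way so that only the content is compared
LC_ALL=C sort -u "$index_names" > "$index_names.sorted"
LC_ALL=C sort -u "$gtf_names" > "$gtf_names.sorted"

# comm -23: lines only in the first file; comm -13: lines only in the second
only_in_index=$(LC_ALL=C comm -23 "$index_names.sorted" "$gtf_names.sorted")
only_in_gtf=$(LC_ALL=C comm -13 "$index_names.sorted" "$gtf_names.sorted")

echo "names only in index: ${only_in_index:-<none>}"
echo "names only in GTF:   ${only_in_gtf:-<none>}"
```

If both lines report no names, the two files use exactly the same chromosome names and differ only in order.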
