Question: Identify genes in a tophat aligned bam/sam file
0
gravatar for gchaves
3.0 years ago by
gchaves0
United States
gchaves0 wrote:

Hello all. I have sam/bam files with reads mapped to human genome from a RNA-seq experiment. I already tried bedtools bedtools multicov [OPTIONS] -bams aln.1.bam aln.2.bam ... aln.n.bam -bed <bed/gff/vcf>, but it gave me the error 'Could not find indexes.'. Also, if the bedtools work, what shall be the output and how can I get a list with counts of reads mapping to genes? Any help is more than appreciated. Thank you.

blast rna-seq alignment gene • 1.3k views
ADD COMMENTlink modified 3.0 years ago • written 3.0 years ago by gchaves0

Do you have the bai files that contain the index for these bam files in the same directory ? One issue could be that bedtools requires ".bam.bai" instead of ".bai" or vice-versa. For your second question, I would use RSeQC (http://rseqc.sourceforge.net/) or HTSeq (http://www-huber.embl.de/users/anders/HTSeq/doc/count.html) to generate gene counts. I am not against bedtools but I haven't ever used it for counts or RPKM generation. 

 

ADD REPLYlink modified 3.0 years ago • written 3.0 years ago by Ashutosh Pandey11k

Thank you very much. No. I dont have bai files and I dont know how I can generate them.

ADD REPLYlink written 3.0 years ago by gchaves0

Download samtools (http://www.htslib.org/doc/samtools-1.2.html) and run "samtools index aln.1.bam" . I have updated my older comment.

ADD REPLYlink modified 3.0 years ago • written 3.0 years ago by Ashutosh Pandey11k

Ok. I now have the accepted_hits.bam.bai file.

ADD REPLYlink written 3.0 years ago by gchaves0

Good. Now run your original command again. You need not to mention the .bai files in your command. They will be automatically detected. 

ADD REPLYlink written 3.0 years ago by Ashutosh Pandey11k

Thank you!!

I am run it as bedtools multicov -bams accepted_hits.bam -bed genes.bed

It gave me Error: The requested file (genes.bed) could not be opened. Error message: (No such file or directory). Exiting!

I appreciate your help very much!!

ADD REPLYlink written 3.0 years ago by gchaves0

You may have a permission issue. Not sure though. Try "chmod 777 genes.bed" and rerun the original command. 

ADD REPLYlink modified 3.0 years ago • written 3.0 years ago by Ashutosh Pandey11k

I just followed someone else's script. I dont have this genes.bed file though. Do you know what it is? Thank you!!

ADD REPLYlink written 3.0 years ago by gchaves0

You need to provide a genes.bed file for your organism of interest. Search biostars for posts that tell you how to download bed files from UCSC genome browser for your organism of interest. You can also provide gff/gtf file to Bedtools. So Ensembl Biomart is another place to where you can look for gtf file. Do some reading about different bedtools and different file formats (bed, gtf, vcf) required to run these tools. Also make sure that the chromosome names in your bed file is compatible with chromosome names in the bed or gtf file. You can run "samtools view -H aln.1.bam" to get the names of chromosomes in the reference genome that was used to align the reads.

ADD REPLYlink written 3.0 years ago by Ashutosh Pandey11k

User lacks the file, I reckon. One gets Permission denied error for the lack of permission rights plus 755 is better then 777 (it just scares me with fully open settings).

ADD REPLYlink written 3.0 years ago by Sukhdeep Singh9.3k

My bad. It was hard for me to believe that "genes.bed" is not a file but the string that was part of the command he was trying to run. So I just guessed about things and of course I was wrong :-).  

ADD REPLYlink modified 3.0 years ago • written 3.0 years ago by Ashutosh Pandey11k

I downloaded a bed file from the UCSC genome browser for the known genes of the human genome. It was sent to my galaxy account and I uploaded it to the server I am using. Unfortunately when I run this:

bedtools multicov [OPTIONS] -bams aln.1.bam aln.2.bam ... aln.n.bam -bed <bed gff="" vcf="">

the output does not seem to me to be what I would call a gene count.

ADD REPLYlink written 3.0 years ago by gchaves0

Please type the exact command you used with the file names. 

ADD REPLYlink written 3.0 years ago by Ashutosh Pandey11k
0
gravatar for gchaves
3.0 years ago by
gchaves0
United States
gchaves0 wrote:

I typed 

bedtools multicov -bams my_bam_file.bam -bed human_genome_bed_file.bed

ADD COMMENTlink written 3.0 years ago by gchaves0

Just a humble advice. Please be more elaborative when you reply. You said teh output does not seem what you would call a gene count. Tell us what you got as the output. First question 1) Were you able to run the bedtools successfully without any error 2) If yes, did it produce any outout file 3) if yes, show us first few lines of the output. You can got to the bedtools page and check what kind of output will be produced byt his function. 

ADD REPLYlink written 3.0 years ago by Ashutosh Pandey11k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 685 users visited in the last hour