Question: Samtools view command, Exec format error while trying to view a human chromosome region
ricfoz (30), National School of Anthropology and History, Mexico City, Mexico, wrote 20 months ago:

Hello everyone, I am new to bioinformatics, and I am trying to retrieve a .fasta file of a gene located on human chr6 from a published Neandertal sequence. The file I am starting with is a .bam file containing only the chr6 sequencing data of one organism.

I have searched and found that samtools is a useful tool for this, and I have already installed it successfully. I tried to view the file with the following syntax:

samtools view filename.bam chr6:posxx-posxx+1

I got an error saying I should index the file, so I did, running the required sort command before indexing, as follows:

samtools sort filename.bam -o filename.sorted
samtools index filename.sorted

After sorting I got a single file with the .sorted extension, to which I then applied the index command. That gave me a filename.sorted.bai file plus two other tmp.xxxx.bam files. I assumed filename.sorted.bai was the indexed file, since it was the result of my last command.

After this, I am trying to view the desired gene region to check whether it is present in the sequencing data, before applying mpileup to the files and retrieving my desired .fasta file from the sequence.

According to the first error message I got, an indexed file should be viewable with the view command, so I ran the following command:

samtools view filename.sorted.bai chr6:xx-xx+1

and it gives me the following error:

[E::hts_hopen] Failed to open file filename.sorted.bai
[E::hts_open_format] Failed to open file filename.sorted.bai
samtools view: failed to open "filename.sorted.bai" for reading: Exec format error

I tried applying the same syntax to other files, such as the temporary files, and got the following error:

[main_samview] random alignment retrieval only works for indexed BAM or CRAM files

... I have gone back and sorted and indexed the same original filename.bam file again, asking for the output in .bam format at the end rather than the .bai that is supposed to be the default output format when indexing. Nothing has worked.

In the end I tried viewing the whole file instead of asking for a region, and it worked nicely with a filename.bam file, which was not indexed, just sorted, and also with a filename.sorted.bam.temp.xxxx.bam file, using the following commands:

samtools view filename.bam 
samtools view filename.sorted.bam.temp.xxxx.bam

Both commands retrieved a huge number of reads, with headers and everything, but when I narrow the command by giving a region it does not work!

Perhaps I am missing something with the REGION syntax? Do I have to process the filename.bai file in some way before trying to view a specific region? I would appreciate any help! I am using samtools version 1.6.

I apologize for the length of my post, but I tried to be as explicit as possible about the errors and the pipeline I have followed, for the sake of clarity.

modified 20 months ago by Hussain Ather (940) • written 20 months ago by ricfoz (30)

Which version of samtools are you using?

Have you tried samtools view filename.sorted.bam chr6:xx-xx? You do not use the .bai file. You just need to have it available in the same directory.

modified 20 months ago • written 20 months ago by genomax (69k)

I am using samtools-1.6

I can't use the syntax you suggest: I have just the original .bam file, then a .sorted file, and finally a .sorted.bai file, plus two temp files with the extensions .bam.tmp.0000.bam and .bam.tmp.0001.bam

I haven't got any "sorted.bam" file ... anyway, are you suggesting that I use the file I got from the "samtools sort filename.bam" command?

written 20 months ago by ricfoz (30)

That name is just a placeholder. If you were able to sort your bam file successfully, then use whatever file name you have for that sorted file.

Since you still have the .tmp files around, you were NOT able to complete the sorting of the bam file successfully, so you would need to repeat it. The .tmp files should be deleted automatically once sorting succeeds. I don't think the argument order is critical, but if you repeat, try: samtools sort -o file_sorted.bam original.filename.bam
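
A quick way to check for leftovers before re-running is a small shell helper like the one below. This is only a sketch: the function name leftover_sort_tmp is made up for illustration, and the glob is an assumption based on the .bam.tmp.0000.bam names reported above, so adjust it if your temp files are named differently.

```shell
# leftover_sort_tmp DIR
# Print any files in DIR that look like samtools sort temp files
# (pattern assumed from the names reported in this thread: *.tmp.NNNN.bam).
leftover_sort_tmp() {
    dir=${1:-.}
    for f in "$dir"/*.tmp.[0-9]*.bam; do
        # If nothing matches, the glob stays literal and -e is false.
        [ -e "$f" ] && printf '%s\n' "$f"
    done
    return 0
}
```

If leftover_sort_tmp . prints anything, the previous sort most likely did not finish and should be repeated before indexing.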

modified 20 months ago • written 20 months ago by genomax (69k)

Thanks a lot, it is useful to know that I should use the sorted file instead of the .bai one ... still, I think I should be able to run that syntax only after indexing and having the .bai file in my folder, right?

I had run the pipeline a couple of times, so now I have deleted everything but the original .bam file and am repeating the steps. I hope it works this time!

Oh, and ... those temp files appeared after the index command, not sort, but I get your idea.

written 20 months ago by ricfoz (30)

You need to first sort original bam. Let that complete (no .tmp files should remain). Then index the sorted file. Finally do: samtools view filename.sorted.bam chr6:xx-xx That is the correct order.
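
Put together, the three steps might look like the sketch below. The function name sort_index_view and all file names are placeholders for illustration only; the samtools subcommands and the -o option are the same ones already used in this thread.

```shell
# sort_index_view ORIGINAL.bam SORTED.bam REGION
# Sketch of the order described above; every name here is a placeholder.
sort_index_view() {
    in_bam=$1
    sorted_bam=$2
    region=$3

    # 1. Coordinate-sort; -o names the output explicitly so it ends in .bam.
    samtools sort -o "$sorted_bam" "$in_bam" || return 1

    # 2. Index the sorted BAM; this writes "$sorted_bam.bai" next to it.
    samtools index "$sorted_bam" || return 1

    # 3. Query the sorted BAM itself, never the .bai; samtools picks up
    #    the index from the same directory on its own.
    samtools view "$sorted_bam" "$region"
}
```

For example: sort_index_view filename.bam filename.sorted.bam chr6:xx-xx (with real coordinates in place of xx). Each step stops the function if it fails, so the view step never runs on a half-sorted file.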

modified 20 months ago • written 20 months ago by genomax (69k)

As I told shussainather below:

Thanks, I tried with the sorted file, and this time something different happened. Now it returns:

[main_samview] region "chr6:xx-xx+1" specifies an unknown reference name. Continue anyway.

written 20 months ago by ricfoz (30)
Hussain Ather (940), National Institutes of Health, Bethesda, MD, wrote 20 months ago:

Try running samtools view on the sorted BAM file, not on the .bai file: samtools view filename.sorted.bam chr6:xx-xx+1

written 20 months ago by Hussain Ather (940)

Thanks, I tried with the sorted file, and this time something different happened. Now it returns:

[main_samview] region "chr6:xx-xx+1" specifies an unknown reference name. Continue anyway.

written 20 months ago by ricfoz (30)

That means your chromosomes are labelled 1,2,3 not chr1,chr2,chr3.

Use samtools view in.bam 6:xx-xx+1
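
To see which names the file actually uses rather than guessing, you can read them from the @SQ lines of the BAM header (samtools view -H prints the header only). A minimal sketch, where list_ref_names and ref_in_header are made-up helper names for illustration:

```shell
# list_ref_names: read a SAM/BAM text header on stdin and print the
# reference (chromosome) names, one per line, taken from @SQ SN: fields.
list_ref_names() {
    awk -F '\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SN:/) print substr($i, 4)
    }'
}

# ref_in_header NAME: succeed if NAME is exactly one of the header names.
ref_in_header() {
    list_ref_names | grep -Fxq -- "$1"
}
```

Usage: samtools view -H in.bam | list_ref_names shows every name you can put before the colon in a region. Likewise, samtools view -H in.bam | ref_in_header chr6 || echo "no chr6 here, try 6" tells you whether the chr prefix is present.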

written 20 months ago by ATpoint (19k)