Question: HTSeq error with bam file input
akatrib (UCLA) wrote, 2.4 years ago:

I keep getting error messages when submitting HTSeq jobs, with both:
(a) sorted (by chromosomal coordinates) and
(b) unsorted bam files

I made sure to specify the order in the HTSeq options: "-r pos" for both the sorted and the unsorted file. I'm re-running samtools now to sort by name, but I'm not sure that's necessary (since you can specify position in the order option).

Here is the error message I get when specifying the unsorted bam file:
Error occured when processing SAM input (record #1794713 in file /xx/xx_Aligned.out.bam
I am not sure why it's referring to a SAM input. Am I better off converting to SAM? Is there anything else I could be missing?

And here is the error message I get when specifying the sorted bam file:
Error occured when processing GFF file (line 1611183 of file /xx/GTF/Homo_sapiens.GRCh37.65.gtf)

I'm sure this is a common problem and that there are many posts about this. I am certainly looking into it but in the meantime was hoping to get any feedback if you're familiar with this. Clearly I'm a newbie to this kind of analysis.

Thank you so much!

Tags: error rna-seq bam count htseq
modified 2.4 years ago by sergio.arredondo.alonso • written 2.4 years ago by akatrib

Was that the GTF file that you aligned against? How did you do the alignment?

written 2.4 years ago by andrew.j.skelton73
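
A quick way to see what HTSeq is tripping over in the GTF is to print the reported line and count its tab-separated fields (a GTF line should have nine). A minimal sketch using only standard tools; `gtf_line` is a hypothetical helper name, and the path and line number in the usage comment are the ones quoted in the error message in the question:

```shell
# gtf_line FILE N: print line N of FILE, then its tab-separated field count.
# Hypothetical helper, not part of HTSeq.
gtf_line() {
    sed -n "${2}p" "$1"
    awk -F '\t' -v n="$2" 'NR == n { print "fields: " NF; exit }' "$1"
}

# Usage (path and line number taken from the error message):
# gtf_line /xx/GTF/Homo_sapiens.GRCh37.65.gtf 1611183
```

If the field count is not nine, or the line is cut off, the file itself is probably the problem rather than the BAM input.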

Andrew,

Thanks for the quick response. Yes. I tried both an existing GTF file (previously used for a cuffdiff analysis, where it worked well) and a newly downloaded one. For the alignment, I used STAR with unsorted BAM output and then used samtools to create the position-sorted BAM files.

written 2.4 years ago by akatrib

Can you amend your post with the commands you've run?

written 2.4 years ago by andrew.j.skelton73

~/software/python-2.7/bin/htseq-count -i gene_id $mapping_result/xx_Aligned.out.sorted.bam \
/xx/GTF/Homo_sapiens.Ensembl.GRCh37.72.gtf -r pos -s yes -f bam -q
Since I sorted by chromosomal coordinates, I thought specifying "-r pos" would be sufficient. Not sure why I would get the GTF processing error though (and I'm using the same gtf file as before). I will try to re-sort by name and use the "-r name" option to see if that magically helps resolve this.

written 2.4 years ago by akatrib
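
For reference, the name-sorted variant mentioned in the comment above might look like the following. This is a sketch only: it assumes a samtools version that accepts `sort -n -o OUT IN` (older releases take an output prefix instead), and `run`, `count_by_name` and `DRY_RUN` are made-up names for illustration. The htseq-count options are the ones from the posted command, with `-r name` swapped in for `-r pos`.

```shell
# run CMD...: execute CMD, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# count_by_name BAM GTF: name-sort BAM, then count reads with -r name.
# Counts go to stdout; redirect them to a file as needed.
count_by_name() {
    sorted="${1%.bam}.namesorted.bam"
    run samtools sort -n -o "$sorted" "$1" &&
    run htseq-count -f bam -r name -s yes -i gene_id -q "$sorted" "$2"
}

# Usage (paths as in the thread; preview the commands first):
# DRY_RUN=1 count_by_name "$mapping_result/xx_Aligned.out.bam" /xx/GTF/Homo_sapiens.Ensembl.GRCh37.72.gtf
```

Running with `DRY_RUN=1` first prints the two commands without touching any files, which makes it easy to check the paths before submitting a job.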

Did you run the command with -f bam? Because sam is the default type.

written 2.4 years ago by WouterDeCoster

Hi! Yes I did. I added the options I used above.

written 2.4 years ago by akatrib
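
Since `-f bam` is already set, it may help to look at the alignment record named in the error. A rough sketch, assuming HTSeq's record number counts alignment records from the start of the file (header lines excluded); `nth_record` is a hypothetical helper that reads SAM text on stdin, for example from `samtools view`:

```shell
# nth_record N: print the Nth alignment record from SAM text on stdin,
# skipping header lines (those starting with "@").
nth_record() {
    awk -v n="$1" '!/^@/ { if (++i == n) { print; exit } }'
}

# Usage (file and record number taken from the error message):
# samtools view /xx/xx_Aligned.out.bam | nth_record 1794713
```

Comparing that record with its neighbours (flags, mate fields, reference name) can hint at whether the file is truncated or the record is simply unusual.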
Answer:

Try upgrading the "pysam" library; it could be a quick fix.

written 2.4 years ago by sergio.arredondo.alonso

Hi Sergio,

I'm running the latest version of pysam (0.9.0).

written 2.4 years ago by akatrib
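
When several Python installations are around, it can be worth confirming which pysam the interpreter behind htseq-count actually sees. A small sketch; `pysam_version` is a made-up helper, and the interpreter path in the usage line is an assumption based on where htseq-count lives in the posted command:

```shell
# pysam_version [PYTHON]: report the pysam version seen by PYTHON
# (defaults to "python" on PATH).
pysam_version() {
    py=${1:-python}
    "$py" -c 'import pysam; print(pysam.__version__)' 2>/dev/null ||
        echo "pysam not importable with $py"
}

# Usage (assumed interpreter path, next to the htseq-count script):
# pysam_version ~/software/python-2.7/bin/python
```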