Dear all, I am facing an issue with the samtools tview command. When I scroll through the alignment in the terminal, all the bases appear as "N"s. At first I thought it had something to do with the BAM file or the reference FASTA file. But then I saw that when I increase or decrease the size of the terminal window and then run samtools tview, the position where these "N"s begin also changes. Does anybody else have this problem? Are there any suggestions?
This will happen if you forget to provide the fasta file as the last argument, i.e. if you call it like this:
samtools tview your-bam-file
or, if you called tview correctly, but the fasta file does not exist:
samtools tview your-bam-file non-existent-ref-fasta
In both cases samtools tview will still show you the alignment/pileup, however, all reference bases will be Ns. In other words, make sure you use the reference fasta as last argument and make sure it exists. No need to worry about indexing your fasta file. If the index is missing, it will be computed automatically.
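If you want to rule out both cases before launching the viewer, a small guard like the one below may help. It is only a sketch: ref_ok is a hypothetical helper name, and in.bam and ref.fasta are placeholder file names.

```shell
# Hypothetical helper: succeed only if a reference FASTA was given
# and the file exists and is non-empty.
ref_ok() {
    ref=$1
    if [ -z "$ref" ]; then
        echo "no reference FASTA given; tview would show N for every reference base" >&2
        return 1
    fi
    if [ ! -s "$ref" ]; then
        echo "reference FASTA '$ref' is missing or empty" >&2
        return 1
    fi
    return 0
}

# Usage (placeholder file names):
# ref_ok ref.fasta && samtools tview in.bam ref.fasta
```

The check only looks at the file itself; it does not verify that the sequence names match the BAM header.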
If things still fail, something really strange is going on.
I would go ahead and reindex your fasta file. I have solved the "N" problem that way. Check a position in the alignment in a region that you know doesn't contain "N"s in the reference. Use the 'g' key to quickly move positions.
I just ran into this issue. My problem was that the reference Fasta file contained a different name for the chromosome than the BAM file indicated. To fix it, I changed the Fasta header to match the one in the BAM file.
In my case,
samtools view -H in.bam | grep '@SQ' showed the name I should use (after the
Don't forget to rerun samtools faidx ref.fasta after changing the Fasta file!
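To spot such a name mismatch without reading both files by eye, something along these lines could work. It is a rough sketch, not a tested tool: names_missing_from_fasta is a hypothetical function name, header.sam stands for a file holding the output of samtools view -H in.bam, and the FASTA sequence name is assumed to be the header text after ">" up to the first whitespace.

```shell
# Print every sequence name listed in the BAM header (@SQ lines, SN: field)
# that has no record with the same name in the reference FASTA.
# $1: file holding the output of `samtools view -H in.bam`
# $2: reference FASTA
names_missing_from_fasta() {
    awk -F'\t' '
        # First file (the FASTA): remember each record name.
        FILENAME == ARGV[1] {
            if ($0 ~ /^>/) {
                split(substr($0, 2), parts, /[ \t]/)
                fa[parts[1]] = 1
            }
            next
        }
        # Second file (the SAM header): report @SQ names absent from the FASTA.
        $1 == "@SQ" {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^SN:/) {
                    name = substr($i, 4)
                    if (!(name in fa)) print name
                }
            }
        }
    ' "$2" "$1"
}

# Usage (placeholder file names):
# samtools view -H in.bam > header.sam
# names_missing_from_fasta header.sam ref.fasta
```

If this prints anything, those are the names the BAM file expects but the FASTA file does not provide under the same spelling.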
I think the OP had a different issue; I encountered the same and it is not a major problem.
1) Some (but not all) of the reference bases (known to be different from N) are shown as N.
2) The position at which the bases start appearing as N varies when the display size varies.
3) (At least for me) the problem arises only when no reads align to that portion of the reference; I have never encountered it where reads are aligned. This became apparent in cases where the beginning of the reference had no reads but showed the correct nucleotides, followed by some Ns, and then, after 300 bp, reads started aligning and the reference again showed the four nucleotides correctly.
4) Finally, I noticed that the beginning of the reference is shown correctly up to what fits in the display. Scroll one base to the left and that will be an N.
I do not know if this is a bug or intended behaviour (maybe we don't care to see the whole reference when no reads align to it), but it is not a major problem, since it only affects regions to which no reads align.
I also have this problem. I tested a small simulated genome and used wgsim to generate a paired-end reads file, and samtools tview worked on that, but it didn't work when I changed to HG19. Can samtools tview only handle small genomes?