samtools idxstats unplaced contigs
3.0 years ago
LacquerHed ▴ 30

I am trying to figure out what the last line of the samtools idxstats output means.

Here are the last few lines:

GL456368.1  20208   266 0
JH584292.1  14945   8   0
JH584295.1  1976    31  0
*   0   0   33800462

Is this an additional unmapped region? I tried to extract these reads into a BAM file but could not, as I am not sure of the proper notation for samtools view:

samtools view -b /Users/possorted_genome.bam *  >  asterisk.bam

Basically, I am looking for a randomly integrated human transgene in a mouse snRNA-seq assembly.

Also, could a transgene potentially integrate within an unplaced contig like the ones above? I was able to find the mouse version of the gene by BLASTing against a database built from the reads mapped to chromosome 13, but I can't find the human one even though I know it's there.

Thanks.

contig STAR samtools • 1.3k views

If you are looking for a genomic insertion site then it may be better to follow the protocol described in this answer: Identification of the sequence insertion site in the genome

3.0 years ago
ATpoint 82k

The * line counts the unmapped reads that are not assigned to any reference sequence. If you want to align against the transgene, why not add its sequence as an extra "chromosome" to the reference genome?
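To make the idxstats columns concrete, here is a minimal sketch that tallies them with awk. The four columns are reference name, sequence length, mapped reads, and unmapped reads. The rows are copied from the question; the here-document is only a stand-in for piping in the real `samtools idxstats` output, and the variable names are made up for illustration.

```shell
# Columns of `samtools idxstats`: name, length, mapped, unmapped.
# The rows below are the ones quoted in the question; in practice, pipe in
# the output of `samtools idxstats possorted_genome.bam` instead.
summary=$(awk '
    # The "*" row holds unmapped reads that have no reference assigned.
    $1 == "*" { unplaced = $4; next }
    # Every other row is one contig: sum its mapped and unmapped counts.
    { mapped += $3; placed_unmapped += $4 }
    END {
        printf "mapped=%d placed_unmapped=%d unplaced=%d\n",
               mapped, placed_unmapped, unplaced
    }
' <<'EOF'
GL456368.1  20208   266 0
JH584292.1  14945   8   0
JH584295.1  1976    31  0
*   0   0   33800462
EOF
)
echo "$summary"
# prints: mapped=305 placed_unmapped=0 unplaced=33800462
```

The fourth column on a contig row is only non-zero when an unmapped read has been placed at the position of its mapped mate, which is why it is 0 for the three contigs shown.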


Is it possible to just write all unmapped reads to a BAM file, convert that to FASTA, and BLAST it for the transgene? When trying with samtools view I couldn't figure out the right notation. I'm guessing the transgene reads should be there, and this seems less involved than augmenting the assembly. Thanks!


To get unmapped reads from a BAM file you can use:

samtools view -f 4 file.bam

You can then BLAST the sequences.
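The two steps above might be sketched as follows. This is only an outline under assumptions: the function names and all file names are placeholders, the E-value cutoff is an arbitrary example, and it assumes samtools and BLAST+ (`makeblastdb`, `blastn`) are on the PATH. Note also that the unquoted `*` in the original command is expanded by the shell into the file names of the current directory before samtools ever sees it; filtering on the unmapped flag (0x4) avoids the problem.

```shell
# Write every read with the "unmapped" flag (0x4) set to a FASTA file.
# Usage: unmapped_to_fasta <in.bam> <out.fa>
unmapped_to_fasta() {
    samtools fasta -f 4 "$1" > "$2"
}

# BLAST the reads against a nucleotide database built from the transgene.
# "transgene_db" is a placeholder database name; 1e-5 is an example cutoff.
# Usage: blast_reads_vs_transgene <transgene.fa> <reads.fa> <hits.tsv>
blast_reads_vs_transgene() {
    makeblastdb -in "$1" -dbtype nucl -out transgene_db
    blastn -query "$2" -db transgene_db -outfmt 6 -evalue 1e-5 -out "$3"
}

# Example calls (file names are placeholders):
#   unmapped_to_fasta possorted_genome.bam unmapped.fa
#   blast_reads_vs_transgene transgene.fa unmapped.fa hits.tsv
```

With tens of millions of unmapped reads the BLAST step can take a while, so it may be worth trying it on a subset first.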

A more proper way would be to add your transgene to your reference, as suggested by ATpoint, since some reads could map to both the mouse genome and the transgene (possibly with some mismatches).
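Adding the transgene to the reference could look roughly like this. It is a sketch, not a full pipeline: the function name and file names are hypothetical, and it only covers building the combined FASTA.

```shell
# Append the transgene as one extra sequence to the genome FASTA.
# `awk 1` copies each line unchanged and guarantees a trailing newline, so
# the transgene header cannot get glued onto the last genome line.
# Usage: add_transgene <genome.fa> <transgene.fa> <combined.fa>
add_transgene() {
    awk 1 "$1" "$2" > "$3"
    # List the sequence names so the new record can be checked by eye.
    grep '^>' "$3"
}

# Example call (file names are placeholders):
#   add_transgene mouse_genome.fa transgene.fa combined.fa
```

The aligner index then has to be rebuilt from the combined FASTA and the reads realigned with whatever tool produced the original BAM. If the downstream counting relies on an annotation file, the transgene would presumably need an entry there as well.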
