User: jkbonfield

Reputation: 350
Status: Trusted
Location:
Last seen: 3 weeks ago
Joined: 2 years, 7 months ago
Email: j*********@gmail.com

Posts by jkbonfield

29 results • page 1 of 3
1 vote • 1 answer • 79 views
Answer: A: What is the difference of output samtools depth and samtools view -c on location
... Samtools depth is using the mpileup algorithm to find overlapping data, along with all the nuances that involves. That means filtering by flags (unmapped data, secondary reads, duplicates, QC failure), limits of maximum depth, possibly some other things like removal of overlapping templates (I can ...
written 22 days ago by jkbonfield (350)
1 vote • 1 answer • 95 views
Comment: C: file size after sorting the BAM file using samtools
... The answer about sizes has already been given so I won't repeat it. However in answer to part 2, we locally use Biobambam's bamseqchksum tool to validate that a file operation hasn't lost data in the process, or that it's lost only the bits we know will be lost. For example it can compute checksum ...
written 4 weeks ago by jkbonfield (350)
0 votes • 0 answers • 123 views
Comment: C: Collapsing BAM based on seq and positions
... This sounds much like the ReducedReads format from early GATK versions. Ultimately it was retired because it wasn't sufficient to capture all the important information, but it may still be available if you can find an old enough GATK (2.8?). ...
written 7 weeks ago by jkbonfield (350)
0 votes • 4 answers • 181 views
Comment: C: BAM files compression
... I did the maths on how long it takes to recover AWS CPU costs (based on a spot price some arbitrary time ago) in the reduction of AWS standard S3 disk charges for a BAM to CRAM conversion. At that point it happened to be around 1 day! Obviously longer for cheaper storage tiers. I didn't do the r ...
written 8 weeks ago by jkbonfield (350) • updated 8 weeks ago by RamRS (26k)
2 votes • 4 answers • 181 views
Answer: C: BAM files compression
... CRAM generation is actually faster than BAM generation in samtools, at least at the default compression levels. CRAM decoding is slower than BAM though unless you're I/O bound, in which case CRAM will be faster due to being smaller. See https://github.com/samtools/www.htslib.org/pull/23/commits/6a ...
written 8 weeks ago by jkbonfield (350)
5 votes • 1 answer • 147 views
Answer: A: What is the difference between mpileup samtools and bcftools?
... `Bcftools mpileup` should be used instead of `samtools mpileup` for variant calling. That is, the VCF / BCF output mode of mpileup is better in bcftools. `Samtools mpileup` however has two different formats with the default always being a simple columnar format showing chr, pos, reference, depth, ...
written 9 weeks ago by jkbonfield (350)
0 votes • 1 answer • 152 views
Comment: C: Aligning, Sorting and Converting to bam at the same command - possible?
... If you think you'll be doing markdup at some point then you may also want to add a "samtools fixmate -m" in there after the bowtie command as this way it doesn't require an additional sort later on. Also when piping it's often best to pipe uncompressed BAM. Some samtools commands have a "-u" optio ...
written 10 weeks ago by jkbonfield (350)
3 votes • 2 answers • 273 views
Answer: A: How does samtools markdup works?
... Samtools markdup is written to match Picard 2.10.3 (also Biobambam's bamstreamingmarkduplicates) so if you can find documentation on those then it should also apply to Samtools. It may seem like a complex dance to have both name and position sorted requirements, but this is perhaps due to a traditi ...
written 11 weeks ago by jkbonfield (350)
0 votes • 1 answer • 148 views
Comment: C: Extract all BAM reads that intersect a given region using the BAI index
... Not an answer as I haven't implemented this myself so don't know all the ins and outs. However basically the BAI index maps ranges (produced as "bins") to file offsets. Given the R-Tree isn't binary, a single bin may have multiple start/stop points for data within it, hence the linear index too. ...
written 3 months ago by jkbonfield (350)
1 vote • 1 answer • 185 views
Answer: A: Size of Output BAM file bigger than the SUM Size of Input files after merging wi
... I am assuming both files are chromosome / position sorted first, meaning the output will be too. Are the two BAM files from different technologies? Compression tools such as gzip (the same algorithm is used exclusively in BAM) benefit from having similar looking data aggregated together. This is ...
written 3 months ago by jkbonfield (350)

Latest awards to jkbonfield

Scholar 8 weeks ago, created an answer that has been accepted. For A: How does samtools markdup works?
Appreciated 8 weeks ago, created a post with more than 5 votes. For A: What is the difference between mpileup samtools and bcftools?
Scholar 8 weeks ago, created an answer that has been accepted. For A: How does samtools markdup works?
Teacher 9 weeks ago, created an answer with at least 3 up-votes. For A: Recovering bam files after unknow deletion in the storage
Scholar 11 weeks ago, created an answer that has been accepted. For A: How does samtools markdup works?
Teacher 11 weeks ago, created an answer with at least 3 up-votes. For A: Recovering bam files after unknow deletion in the storage
Appreciated 8 months ago, created a post with more than 5 votes. For A: Is it possible to directly convert fastq to CRAM ?
Good Answer 8 months ago, created an answer that was upvoted at least 5 times. For A: Is it possible to directly convert fastq to CRAM ?
Teacher 8 months ago, created an answer with at least 3 up-votes. For A: Recovering bam files after unknow deletion in the storage
Teacher 8 months ago, created an answer with at least 3 up-votes. For A: what should a SAM/BAM record contain when there are no quality scores
Teacher 9 months ago, created an answer with at least 3 up-votes. For A: Recovering bam files after unknow deletion in the storage
