User: Nick Stoler

Reputation:
60
Status:
Trusted
Location:
Penn State
Website:
http://nstoler.com/
Last seen:
9 months, 4 weeks ago
Joined:
4 years, 7 months ago
Email:
n******@psu.edu

Posts by Nick Stoler

0
votes
0
answers
408
views
Comment: C: Nonconforming FASTG files produced by SPAdes
... So I wanted to provide some more information I've discovered about what's going on here. The old, SPAdes 3.1.1 colon-delimited format does seem to be what I suggested above, where the first edge is the name of the sequence, and the following edges are its neighbors. Also, the Bandage visualization ...
written 9 months ago by Nick Stoler (60)
0
votes
0
answers
408
views
Nonconforming FASTG files produced by SPAdes
... I'm working with the [SPAdes][1] assembler (version 3.1.1), and trying to use its FASTG graph output. Looking at the contigs.fastg file it produces, I find header lines I can't understand, even after carefully reading the [FASTG specification][2]. The spec says headers are supposed to be in this fo ...
assembly written 11 months ago by Nick Stoler (60)
0
votes
2
answers
2.0k
views
Comment: C: Python multiple sequence alignment
... That's an interesting question about whether it needs to be an MSA. I don't think it can be mapping, since the point is to be reference-free. Basically, I want to build a consensus out of each group of a handful of reads (it's duplex sequencing), independent of a reference. ...
written 2.4 years ago by Nick Stoler (60)
0
votes
2
answers
2.0k
views
Python multiple sequence alignment
... I'm trying to find a fast implementation of a multiple sequence alignment algorithm that I can use from Python. The requirements aren't big. I'll only be aligning a handful (like a dozen) 300bp reads which should be very similar to each other (they come from the same molecule). But I have to do tha ...
alignment written 2.4 years ago by Nick Stoler60 • updated 2.4 years ago by wpwupingwp110
0
votes
2
answers
1.1k
views
Comment: C: Vcf: How To Indicate Reference Allele Not Found?
... I noted that, though I wasn't sure how definitively that says "no REF alleles." So then what is the appropriate format for storing this information? It seems like quite an Achilles' heel to not be able to list the alleles in your sample unambiguously. (I could use a hack like an INFO column tag or s ...
written 3.9 years ago by Nick Stoler (60)
2
votes
2
answers
1.1k
views
Vcf: How To Indicate Reference Allele Not Found?
... In VCF, the ALT column is supposed to be where you show what variants you found. Or, as implied in the spec and all the example files I've seen, the ALT column shows all the non-reference variants you found. But if I'm sequencing a sample and want to use VCF to store its variant calls, I'd like to be ...
vcf variant-calling written 3.9 years ago by Nick Stoler60 • updated 3.7 years ago by Adam940
0
votes
2
answers
4.7k
views
Comment: C: Extracting Reads Containing A Specific Variant From A Bam File
... Thanks! So it looks like that can get the information I need. It's a custom script like I've been writing (except certainly much nicer and faster), instead of an existing tool, so I'll take that as an answer to whether I'm reinventing the wheel. ...
written 3.9 years ago by Nick Stoler (60)
0
votes
2
answers
4.7k
views
Comment: C: Extracting Reads Containing A Specific Variant From A Bam File
... That's good to know about fillmd's -e option, but unfortunately it seems it doesn't show indels. Also, soft-clipped bases appear just like SNVs. ...
written 3.9 years ago by Nick Stoler (60)
5
votes
2
answers
4.7k
views
Extracting Reads Containing A Specific Variant From A Bam File
... Perhaps my Google-fu is failing me, because this doesn't seem like an uncommon need, but I can't find any answer here. Say I have a SAM/BAM file, and a variant at a known location. I'd like to extract all the alignments from the BAM file which contain that variant. That is, I'd like the ALT, not th ...
cigar format sam bam written 3.9 years ago by Nick Stoler60 • updated 7 days ago by Biostar ♦♦ 20
0
votes
1
answer
1.4k
views
Comment: C: How Likely Are Two Sequences To Have The Same Snps By Chance?
... Sorry, my thinking actually led me to an answer approximating 0.001^2 for short sequence lengths. To be exact, the probability of a coincidental shared SNP I got is 1-(1-0.001^2)^n where n = sequence length. Is it actually 0.001, regardless of length? ...
written 4.5 years ago by Nick Stoler60
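
For anyone who wants to plug numbers into the expression quoted in that comment, here is a minimal sketch. It assumes, as the comment seems to, that each site independently carries a SNP with probability 0.001 in each sequence, so a coincidental shared SNP at a given site has probability 0.001^2. The function name and the example lengths are hypothetical, chosen only for illustration.

```python
def shared_snp_probability(n, p=0.001):
    """Evaluate 1 - (1 - p^2)^n from the comment above.

    n: sequence length (number of sites)
    p: assumed per-site SNP probability in each sequence
    Sites are assumed to be independent of each other.
    """
    per_site = p ** 2               # both sequences carry a SNP at the same site
    return 1 - (1 - per_site) ** n  # at least one such site among n


if __name__ == "__main__":
    # Illustrative lengths only; none of these come from the original post.
    for length in (100, 1000, 10000):
        print(length, shared_snp_probability(length))
```

Under these assumptions the result grows with n, and for short sequences it stays close to n * 0.001^2.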

Latest awards to Nick Stoler

Popular Question 22 months ago, created a question with more than 1,000 views. For How Likely Are Two Sequences To Have The Same Snps By Chance?
Popular Question 22 months ago, created a question with more than 1,000 views. For Why Is Gzip Compression Of Fasta Less Efficient Than 2Bit?
Popular Question 3.4 years ago, created a question with more than 1,000 views. For Extracting Reads Containing A Specific Variant From A Bam File
Popular Question 3.4 years ago, created a question with more than 1,000 views. For Why Is Gzip Compression Of Fasta Less Efficient Than 2Bit?
Supporter 3.6 years ago, voted at least 25 times.
