User: Dave Carlson

Reputation: 510
Status: Trusted
Location: Stony Brook University, NY
Last seen: 1 day, 6 hours ago
Joined: 5 years, 1 month ago
Email: d*******************@gmail.com

Posts by Dave Carlson

67 results • page 1 of 7
0 votes • 3 answers • 114 views
Comment: C: Blastn, need help to increase speed
... Diamond only works for protein alignment (i.e., blastp or blastx). This is a nucleotide alignment question. ...
written 3 days ago by Dave Carlson (510)
1 vote • 0 answers • 59 views
Comment: C: Very strange duplicate output with stand alone blast
... Are you limiting your blast search to a single HSP with `-max_hsps 1`? If not, that may be what's causing you to get multiple results from the same sequence. Depending on what your ultimate goal for the analysis is, you may actually want to keep multiple HSPs per matching sequence. ...
written 10 days ago by Dave Carlson (510)
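
For readers who have already run the search without `-max_hsps 1`, a rough post-hoc equivalent is to keep only the best-scoring HSP for each query/subject pair. The sketch below is an illustration rather than part of the original comment. It assumes default tabular output (`-outfmt 6`), where bitscore is the 12th column. The function name `best_hsp_per_pair` is hypothetical, and the result is only roughly analogous to what BLAST itself would report with `-max_hsps 1`.

```python
def best_hsp_per_pair(lines):
    """Keep the highest-bitscore HSP for each (query, subject) pair.

    Assumes default BLAST tabular columns (-outfmt 6):
    qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore
    """
    best = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        key = (fields[0], fields[1])   # (qseqid, sseqid)
        bitscore = float(fields[11])   # 12th column in default outfmt 6
        if key not in best or bitscore > best[key][0]:
            best[key] = (bitscore, line)
    # One line per query/subject pair, in order of first appearance
    return [line for _, line in best.values()]
```

It can be applied to an open results file, for example `best_hsp_per_pair(open("results.tsv"))`, where `results.tsv` is a placeholder file name.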
0 votes • 1 answer • 115 views
Comment: C: repeat masker output looks weird: 0% for all of repeat squences
... It looks like you're using the repeats from the Dfam database. Is your genome from a species that is present in Dfam (or closely related to one)? If not, perhaps the evolutionary distance between your species and the species in Dfam is making it hard to accurately identify your TEs? When I run Repe ...
written 12 days ago by Dave Carlson (510)
2 votes • 2 answers • 131 views
Answer: A: Pacific Bio Long Reads vs Illumina Short Reads
... Results for assembling a highly repetitive 1 Gb plant genome with ~100x coverage PE Illumina data: 300 Mb assembly (1/3 of the genome) Results for assembling the same genome with ~100x Sequel 1 PacBio reads: 950 Mb genome (~90% of the genome) These days, I wouldn't even attempt genome assembly wit ...
written 13 days ago by Dave Carlson (510)
1 vote • 3 answers • 111 views
Answer: A: Is there a site that provides genome size sort?
... In addition to the previously mentioned Plant C-values Database, there is the [Animal Genome Size Database][1] [1]: http://www.genomesize.com/ ...
written 19 days ago by Dave Carlson (510)
0 votes • 5 answers • 133 views
Answer: A: Split multifasta-file into multiple files according to a pattern in header
... [Here's][1] a Python 3 solution using Biopython with regular expressions: #!/usr/bin/env python import argparse import re import sys from Bio import SeqIO parser = argparse.ArgumentParser(description="Search for pattern in fasta header and return sequences whose head ...
written 21 days ago by Dave Carlson (510)
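
The excerpt above is cut off, so the linked script is not reproduced here. As a stand-in, the following is a minimal sketch of the general idea (grouping FASTA records by a regular expression matched against the header) using only the standard library instead of Biopython. The function names, the `unmatched` group, and the output file naming are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def group_fasta_by_pattern(lines, pattern):
    """Group FASTA records by the text that `pattern` matches in each header.

    Records whose header does not match go into the "unmatched" group.
    """
    regex = re.compile(pattern)
    groups = defaultdict(list)
    key = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            match = regex.search(line[1:])
            key = match.group(0) if match else "unmatched"
        if key is not None and line:
            groups[key].append(line)  # header and sequence lines stay together
    return dict(groups)


def write_groups(groups, prefix="split"):
    """Write one FASTA file per group (assumes keys are safe in file names)."""
    for key, records in groups.items():
        with open(f"{prefix}_{key}.fasta", "w") as handle:
            handle.write("\n".join(records) + "\n")
```

For example, `write_groups(group_fasta_by_pattern(open("input.fasta"), r"sp\d+"))` would write one file per distinct `sp<number>` tag found in the headers. Here `input.fasta` and the pattern are placeholders.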
1 vote • 1 answer • 149 views
Comment: C: bcftools variant calling
... I don't know of a better paper, I'm afraid. I believe that Li 2011 describes the algorithms used by samtools/bcftools for calculating genotype likelihoods and calling variants. There is also a --multiallelic calling model implemented in more recent versions of bcftools, which is briefly described [ ...
written 22 days ago by Dave Carlson (510)
3 votes • 1 answer • 149 views
Answer: A: bcftools variant calling
... You probably want to check out [Li 2011][1]: **A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data** Abstract >Motivation: Most existing methods for DNA sequence analysis > rely on accurate seque ...
written 24 days ago by Dave Carlson (510)
2 votes • 1 answer • 172 views
Answer: A: Help with parallelization and loop cutadapt
... Hi Russel, I believe something like the following will work for your cutadapt command: parallel --verbose --link 'cutadapt -a AGATCGGAAGAG -A AGATCGGAAGAG -o {1.}.trimmed.fastq -p {2.}.trimmed.fastq {1} {2} > {1.}_cutadapt.txt' ::: *R1_001.fastq ::: *R2_001.fastq In the case of the filenam ...
written 9 weeks ago by Dave Carlson (510)
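
To make the pairing in the command above easier to follow: in GNU parallel, `--link` pairs the two input lists one-to-one, and `{1.}` is the first input with its file extension removed. The sketch below mimics that expansion in Python so the resulting command lines can be inspected before anything is run. It is an illustration only. The function name is hypothetical, sorting the inputs stands in for the shell's glob ordering, and, unlike `--link`, `zip` simply stops at the shorter list.

```python
import os


def paired_cutadapt_commands(r1_files, r2_files, adapter="AGATCGGAAGAG"):
    """Build one cutadapt command per R1/R2 pair, mirroring `parallel --link`."""
    commands = []
    # Sorting stands in for shell glob order; zip pairs the lists one-to-one
    for r1, r2 in zip(sorted(r1_files), sorted(r2_files)):
        r1_base = os.path.splitext(r1)[0]  # same effect as {1.}
        r2_base = os.path.splitext(r2)[0]  # same effect as {2.}
        commands.append(
            f"cutadapt -a {adapter} -A {adapter} "
            f"-o {r1_base}.trimmed.fastq -p {r2_base}.trimmed.fastq "
            f"{r1} {r2} > {r1_base}_cutadapt.txt"
        )
    return commands
```

Printing the returned strings gives a quick check that each R1 file is matched with the intended R2 file.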
1 vote • 1 answer • 193 views
Comment: C: How does DEseq calculate p value when n=1?
... I realize that this is not a direct answer to your question, but you should not expect to have any reliable stats from DESeq if you don't have biological replicates. This has been discussed extensively. See for example: https://www.biostars.org/p/398152/ Also the [latest DESeq2 vignette][1] gives ...
written 11 weeks ago by Dave Carlson (510)

Latest awards to Dave Carlson

Teacher 21 days ago, created an answer with at least 3 up-votes. For A: Assembler for only nanopore data
Scholar 22 days ago, created an answer that has been accepted. For A: Genome duplication assessment
Voter 15 months ago, voted more than 100 times.
Teacher 16 months ago, created an answer with at least 3 up-votes. For A: Up-to-date Online RNA Sequence Analysis Training/Courses/Papers?
Teacher 17 months ago, created an answer with at least 3 up-votes. For A: Up-to-date Online RNA Sequence Analysis Training/Courses/Papers?
Scholar 17 months ago, created an answer that has been accepted. For A: Genome duplication assessment
Popular Question 22 months ago, created a question with more than 1,000 views. For codeml jobs taking much longer on a server than on an imac
Teacher 22 months ago, created an answer with at least 3 up-votes. For A: Up-to-date Online RNA Sequence Analysis Training/Courses/Papers?
Supporter 22 months ago, voted at least 25 times.
Popular Question 2.5 years ago, created a question with more than 1,000 views. For RepeatModeler finishes without creating output files
Teacher 3.5 years ago, created an answer with at least 3 up-votes. For A: Up-to-date Online RNA Sequence Analysis Training/Courses/Papers?
