User: floris.barthel

Reputation:
40
Status:
New User
Location:
Last seen:
6 months, 1 week ago
Joined:
3 years, 9 months ago
Email:
f*************@gmail.com

Posts by floris.barthel

2
votes
3
answers
487
views
Answer: A: Storing Variant data from VCF
... I have been using a PostgreSQL database for storing sample metadata and variant calls. This works well for datasets with tens of millions of rows and range intersections are blazing fast using Postgres's `int4range` type. I'm starting to notice serious performance degradation (not scaling linearly ...
written 6 months ago by floris.barthel (40)
0
votes
1
answer
1.6k
views
Comment: C: Error parsing SAM header. @RG line missing SM tag
... Any ideas anyone? https://www.biostars.org/u/30/ ? Using `samtools reheader` + `sed` to change the `SM` tag for multiple readgroups at once (something that I don't think is possible with `AddOrReplaceReadGroups`) is a very elegant solution but if GATK complains about an invalid file it's not a viabl ...
written 11 months ago by floris.barthel (40)
0
votes
1
answer
1.6k
views
Comment: C: Error parsing SAM header. @RG line missing SM tag
... I edited several (multi-readgroup) BAM file SM tags this way. After reindexing them, they seem to work fine, ie. I can open them in IGV without errors. However, when I use Picard `ValidateSamFile` I get the following error: ``` ## HISTOGRAM java.lang.String Error Type Count ERROR:INVALID_I ...
written 11 months ago by floris.barthel (40)
0
votes
0
answers
2.5k
views
Comment: C: CNV from FFPE genomes
... Hi Richard, I am running into similar issues right now and wonder how you dealt with this problem? Do you have any insights to share? ...
written 11 months ago by floris.barthel (40)
0
votes
0
answers
706
views
Comment: C: Multiple biopsy issue in MutSig and GISTIC tools
... I wonder what you ended up doing here? Would multiple related samples violate GISTIC assumptions? ...
written 11 months ago by floris.barthel (40)
0
votes
1
answer
749
views
Comment: C: Samtools/bcftools force-calling using an input set of variants
... BTW, I don't clearly understand the difference between `bcftools mpileup -T file.bed` and `bcftools mpileup -R file.bed`. For a small test case, `-R` seems to be significantly faster. ...
written 11 months ago by floris.barthel (40)
1
vote
1
answer
749
views
Comment: C: Samtools/bcftools force-calling using an input set of variants
... Thanks! This does the trick. I'm adding the `-A` parameter to `bcftools call` to include alts even if not present in the genotypes. Also, interestingly, the `-v` (verbose) flag breaks `bcftools call` and I had to remove it to get anything to output. Not sure why. Unfortunately I ran into this issue: ...
written 11 months ago by floris.barthel (40)
7
votes
1
answer
749
views
Samtools/bcftools force-calling using an input set of variants
... I have an input set of variants in VCF format (`input.vcf.gz`), which includes indels and SNPs. Multi-allelic variants are split across multiple lines. Using `bcftools view input.vcf.gz | vcf2bed | bedtools merge` I then created a bed file `targets.bed`. I would like to call **only** this predeterm ...
variant calling vcf samtools bcftools written 11 months ago by floris.barthel (40)
1
vote
1
answer
979
views
Encode RNA-seq expression matrix
... Hi, I'm looking for an Encode RNA-seq expression matrix. I could download all the bam files and run htseq-count but I wonder if there's an easier way. I found [these][1] on the Encode site however they are based on GRCh38 and I'm looking for hg19 data. Also, the download speed for this file is goi ...
encode rna-seq written 2.6 years ago by floris.barthel (40) • updated 21 months ago by Zhilong Jia (1.5k)
0
votes
0
answers
895
views
Best-guessing insert size and stddev in MISO using paired-end data versus running MISO single ended
... I am conducting a MISO analysis for a single gene on a large number of paired-end RNA-seq BAMs that are spread out over several external HDs (twenty 4 TB drives). Computing insert sizes for these is extremely tedious and slow. However, running MISO itself (in single-end mode or paired end mode with best ...
splicing miso rna-seq written 3.2 years ago by floris.barthel (40)

Latest awards to floris.barthel

Popular Question 10 months ago, created a question with more than 1,000 views. For R package for sequence homology
Popular Question 11 months ago, created a question with more than 1,000 views. For R package for sequence homology
