User: Rok

Reputation: 190
Status: New User
Location: Trondheim, Norway
Website: http://mocnik.me/
Last seen: 8 years ago
Joined: 9 years, 9 months ago
Email: r*********@gmail.com

Posts by Rok

4 votes • 1 answer • 8.4k views
Answer: A: What Is The Expected Size Of A Whole Genome Vcf And Bcf?
... Under the assumption that each line will be similar to this one: chr1 249250621 . A A 22 PASS 0/0, each line uses at most 45 bytes. Multiplied by the length of the human genome, this gives a VCF file with a maximum size of around 125GB. The size of the header is not included in the calculation since ...
written 8.0 years ago by Rok (190)
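The arithmetic behind this estimate can be sketched in a few lines of Python. The genome length of 3.1 billion bases is an assumption (the snippet does not say which value was used), and the function and constant names are made up for illustration.

```python
# Back-of-the-envelope size of an uncompressed VCF with one line per base.
BYTES_PER_LINE = 45            # upper bound per line, taken from the answer
GENOME_LENGTH = 3_100_000_000  # ASSUMED approximate human genome length (bases)


def estimate_vcf_bytes(n_sites: int, bytes_per_line: int = BYTES_PER_LINE) -> int:
    """Return the estimated size of the VCF body in bytes, ignoring the header."""
    return n_sites * bytes_per_line


size = estimate_vcf_bytes(GENOME_LENGTH)
# Report both decimal gigabytes (10^9) and binary gibibytes (2^30).
print(f"{size / 1e9:.1f} GB, {size / 2**30:.1f} GiB")
```

Under these assumptions the result lands in the same range as the figure quoted in the answer; the exact number depends on the genome length used and on whether "GB" means 10^9 or 2^30 bytes.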
0 votes • 1 answer • 8.4k views
Comment: C: What Is The Expected Size Of A Whole Genome Vcf And Bcf?
... Do I understand this correctly? Does every base in the reference genome need to be a line in a VCF file? ...
written 8.0 years ago by Rok (190)
2 votes • 2 answers • 3.0k views
Answer: A: Bam Files And Reads Quality Check
... This depends on how you want to do the filtering. If you have a list of reads you want to include or exclude, you can use FilterSamReads from Picard tools. It also works if you want to include/exclude aligned reads. The size of the output is tripled because you are storing data back in uncompresse ...
written 8.0 years ago by Rok (190)
0 votes • 2 answers • 4.4k views
Answer: A: Creating Hg19 Reference Index
... If you have your chromosomes in separate FASTA files, you should merge everything into one FASTA file (whole genome). Using CreateSequenceDictionary from Picard Tools, you create a dictionary for this whole genome. This stores the order of chromosomes in the whole genome file. When you do map ...
written 8.1 years ago by Rok (190)
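The merge step described here could look roughly like the following Python sketch. The function name and the file names in the usage comment are hypothetical; the only point it illustrates is that the per-chromosome files are concatenated in a fixed, explicit order, since that order is what the sequence dictionary later records.

```python
def merge_fasta(chrom_files, merged_path):
    """Concatenate per-chromosome FASTA files, in the order given, into one file."""
    with open(merged_path, "w") as out:
        for path in chrom_files:
            with open(path) as src:
                for line in src:
                    # Make sure a file without a trailing newline does not
                    # run into the next file's header line.
                    out.write(line if line.endswith("\n") else line + "\n")


# Hypothetical usage; the file names are placeholders:
# merge_fasta(["chr1.fa", "chr2.fa", "chrX.fa"], "genome.fa")
```

The sketch streams line by line, so it does not need to hold a whole chromosome in memory.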
2 votes • 2 answers • 4.3k views
Answer: A: Preprocessing The Bam For Gatk
... Using Picard is a good idea; the only problem is that it seems you need to create some temporary files, since it does not support Unix pipes. Another possibility is to write a script to do it, but that is going to be more complex than a one-liner in awk. It also depends on how you want to sor ...
written 8.4 years ago by Rok (190)
3 votes • 1 answer • 2.0k views
Answer: A: Get Vcf File For Heparanase Gene (Hpse)
... For VCF files I like to use the Broad Institute resource bundles: you have all the data in the same place, organized by the reference genome you prefer. ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/1.2/ After you download the version of dbSNP you like, you should find the location of HPSE inside ...
written 8.4 years ago by Rok (190) • updated 9 months ago by RamRS (27k)
0 votes • 6 answers • 4.0k views
Comment: C: Parse Fastq File - Pad Reads With N'S
... I thought the awk version would need less time than the Python version, but on 2 million lines they needed the same time. ...
written 8.4 years ago by Rok (190)
3 votes • 6 answers • 4.0k views
Answer: A: Parse Fastq File - Pad Reads With N'S
... GNU Awk version: awk 'NR%4==2 {s=""; t=""; for(i=length($0); i<100; i++){s=s "N"; t=t "Q"}; print s $0}; NR%4==0 {print t $0}; NR%2==1 {print}' file.fastq You can define different padding characters for the sequence line and the quality score line (N and Q above). GNU Sed version (do not use, really slow): sed -e ...
written 8.4 years ago by Rok (190)
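Since the thread compares the awk one-liner with a Python version, here is a hedged Python sketch of the same padding idea. It is not the Python version from the thread, which is not shown here. It assumes strict 4-line FASTQ records, pads on the left as the awk command does, and uses N for the sequence and Q for the quality line; the function names and the `target_len` parameter are illustrative.

```python
def pad_fastq_record(header, seq, plus, qual,
                     target_len=100, seq_pad="N", qual_pad="Q"):
    """Left-pad sequence and quality to target_len; longer reads stay unchanged."""
    n = max(0, target_len - len(seq))
    return header, seq_pad * n + seq, plus, qual_pad * n + qual


def pad_fastq(lines, target_len=100):
    """Yield padded lines from an iterable of FASTQ lines (4 lines per record)."""
    it = (line.rstrip("\n") for line in lines)
    # zip over the same iterator four times groups lines into records;
    # an incomplete trailing record is silently dropped.
    for record in zip(it, it, it, it):
        yield from pad_fastq_record(*record, target_len=target_len)


example = ["@read1", "ACGT", "+", "IIII"]
print("\n".join(pad_fastq(example, target_len=6)))
# prints:
# @read1
# NNACGT
# +
# QQIIII
```

The pad length is computed per record, so reads of different lengths are each padded to the same target length.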
0 votes • 0 answers • 1.9k views
Snp130-Masked Fasta Files And Alignment
... Hello! I have some ChIP-Seq reads that need aligning. Because I want to eliminate bias towards reference alleles, I wanted to try aligning against a SNP-masked genome. I have downloaded SNP-masked sequence files from the UCSC database (http://hgdownload.cse.ucsc.edu/goldenPath/hg18/snp130Mask/), and hav ...
dbsnp bowtie • written 8.8 years ago by Rok (190)
21 votes • 8 answers • 7.1k views • 11 follow
Dataset With Snps Linked To Phenotype
... Hello! I'm looking for possible topics for my master's thesis in data mining. Because I'm also interested in bioinformatics, I was thinking about doing a GWAS study. The problem is that I'm not very familiar with the databases that are available to bioinformaticians on the web. My questi ...
gwas snp dataset • written 9.5 years ago by Rok (190) • updated 9.5 years ago by Ali (0)

Latest awards to Rok

No awards yet. Soon to come :-)
