Moderator: Jorge Amigo

Reputation: 11,160
Status: Trusted
Location: Santiago de Compostela, Spain
Website: https://www.researchga...
Scholar ID: Google Scholar Page
Last seen: 2 hours ago
Joined: 9 years, 5 months ago
Email: a****@yahoo.com

Scrutinizing genomic human variation by dealing with high throughput genotyping and next generation sequencing results, among many other things.

Bioinformatician @ Genomic Medicine Group

Hospital Clínico Universitario, Santiago de Compostela, Spain

Posts by Jorge Amigo

765 results • page 1 of 77
0 votes • 2 answers • 181 views
Comment: C: Filter unique SNPs (rows) in VCF/text file
... sorting such a big file just to be able to use the uniq function wouldn't be the most efficient way to do it, and a simple `sort -u` wouldn't address the problem described in the question, since the lines that need to be merged/skipped do vary. if sorting were considered (and it would be really ...
written 3 days ago by Jorge Amigo (11k)
0 votes • 2 answers • 181 views
Comment: C: Filter unique SNPs (rows) in VCF/text file
... this is the same as a uniq function on entire lines, and it doesn't work with your example, considering that position qualities vary from line to line. you need to group (index) only the columns you expect to be repeated, as explained in the example I suggested previously. ...
written 3 days ago by Jorge Amigo (11k)
3 votes • 2 answers • 103 views
Answer: A: Can I split a bed file into 1000bp bins and add up my read numbers?
... I don't know of a direct method to accomplish what you need, but have a look at this tentative proposal:
awk 'BEGIN{FS=OFS="\t"}{print $1, 0, $2}' human_hg19.fa.fai \
| bedtools makewindows -b - -w 1000 \
| bedtools map -a - -b input.bedgraph -c 4 -o sum \
| grep -P "\d$"
what it does is: ...
written 7 days ago by Jorge Amigo (11k)
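The windowing-and-summing idea in the preview above can be sketched in plain Python. This is only an illustration, not the author's method: the function name `sum_by_window` is hypothetical, the input is assumed to be tab-delimited bedGraph-style lines (chrom, start, end, value), and an interval spanning several windows is assumed to add its value to each window it overlaps, which is how an overlap-based sum would behave.

```python
from collections import defaultdict


def sum_by_window(bedgraph_lines, window=1000):
    """Sum the 4th column per fixed-size window, keyed by (chrom, window_start).

    Hypothetical helper: assumes 0-based, half-open intervals and that an
    interval contributes its full value to every window it overlaps.
    """
    totals = defaultdict(float)
    for line in bedgraph_lines:
        if not line.strip():
            continue  # skip blank lines
        chrom, start, end, value = line.rstrip("\n").split("\t")[:4]
        start, end, value = int(start), int(end), float(value)
        first = start // window        # index of the first overlapped window
        last = (end - 1) // window     # index of the last overlapped window
        for index in range(first, last + 1):
            totals[(chrom, index * window)] += value
    return dict(totals)
```

Only windows that overlap at least one interval appear in the result, which roughly corresponds to the final `grep` step keeping lines that end in a digit.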
2 votes • 2 answers • 79 views
Comment: C: awk gff column change
... not caring about the total number of columns:
awk '{n=split($1,a,"_"); $4+=a[n]; $5+=a[n]; print}' input.txt
not caring about the total number of columns, and assuming that both input and output are tab-delimited:
awk 'BEGIN{FS=OFS="\t"}{n=split($1,a,"_"); $4+=a[n]; $5+=a[n]; print}' input.txt ...
written 7 days ago by Jorge Amigo (11k)
1 vote • 2 answers • 181 views
Answer: A: filtering unique snps in vcf
... I don't fully understand why you want to do this, but if what you need is just to remove lines considered duplicates according to certain columns, then this perl code should work:
perl -lane ' $index = join "\t", @F[0,7..$#F]; unless ( exists $found{$index} ) { print; $found{$index ...
written 7 days ago by Jorge Amigo (11k)
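Since the perl one-liner above is cut off, here is a minimal Python sketch of the same general idea: keep the first line seen for each key built from selected columns. The name `dedupe_by_key` is hypothetical, and the default key (first column plus the eighth column onward, on whitespace-split fields) simply mirrors the `@F[0,7..$#F]` slice visible in the preview. What the original does after the truncation point is an assumption.

```python
def dedupe_by_key(lines, key_of=lambda fields: (fields[0], *fields[7:])):
    """Return lines whose key has not been seen before, in input order.

    Hypothetical helper: key_of receives the whitespace-split fields of a
    line and returns a hashable key; the first line per key is kept.
    """
    seen = set()
    kept = []
    for line in lines:
        key = key_of(line.split())
        if key not in seen:
            seen.add(key)      # remember the key so later matches are skipped
            kept.append(line)
    return kept
```

Because the key ignores the columns in between, two lines that differ only in those columns count as duplicates, which is the behaviour a whole-line `uniq` cannot provide.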
1 vote • 2 answers • 79 views
Answer: A: awk gff column change
... if the value you need to add to columns 4 and 5 always follows a "_" character in the first column, then this perl code should work:
perl -lane 'if (/^\S+_(\d+)/) { $F[3] += $1; $F[4] += $1; print join "\t", @F}' input.txt ...
written 7 days ago by Jorge Amigo (11k)
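For readers less familiar with perl, the same column shift can be sketched in Python. The function name `shift_coordinates` is hypothetical, and the sketch assumes tab-delimited input whose first column ends in `_<number>`. Unlike the perl version, which skips lines that do not match, this sketch raises an error on such lines.

```python
def shift_coordinates(line):
    """Add the number after the last "_" in column 1 to columns 4 and 5.

    Hypothetical helper: assumes tab-delimited fields and an integer
    suffix after the final underscore of the first column.
    """
    fields = line.rstrip("\n").split("\t")
    offset = int(fields[0].rsplit("_", 1)[1])   # text after the last "_"
    fields[3] = str(int(fields[3]) + offset)    # column 4: start
    fields[4] = str(int(fields[4]) + offset)    # column 5: end
    return "\t".join(fields)
```

Splitting on tabs only keeps any spaces inside later columns (for example an attributes column) untouched.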
1 vote • 5 answers • 20k views
Comment: C: How To Split Multiple Samples In Vcf File Generated By Gatk?
... the key is the `-c1` option, which filters out variant lines containing fewer than 1 non-reference allele, so it's meant to be used to retrieve each sample's private variants and not to list reference homozygotes: http://www.htslib.org/doc/bcftools.html#view if you wish to have the latter too in ...
written 5 weeks ago by Jorge Amigo (11k)
1 vote • 3 answers • 3.4k views
Comment: C: Finding a specific gene in a fastq file without mapping the reads?
... all mapped reads from r.sam would be that gene's reads.
# total reads
samtools view -c r.sam
# mapped reads
samtools view -cF4 r.sam
if you get a non-zero value from the -cF4 option, then those reads would be that gene's. if you want to be sure that those reads are that gene's, and t ...
written 8 weeks ago by Jorge Amigo (11k)
1 vote • 10 answers • 47k views
Comment: C: Multiline Fasta To Single Line Fasta
... thanks for pointing it out. I've corrected my previous answer and thoroughly tested the new one. ...
written 8 months ago by Jorge Amigo (11k)
1 vote • 3 answers • 11k views
Comment: C: how to remove multiallelic from VCF
... note that the merging process when normalizing can be quite intensive on files as large as gnomAD's, so if you are interested in particular regions, do not forget to indicate this in your normalization command. I've edited my answer to reflect this. ...
written 9 months ago by Jorge Amigo (11k)

Latest awards to Jorge Amigo

Appreciated 3 days ago, created a post with more than 5 votes. For A: Order Of Gatk Commands
Teacher 6 days ago, created an answer with at least 3 up-votes. For A: Which Version Of Gatk Do People Use
Scholar 6 days ago, created an answer that has been accepted. For A: Filtration of bam file but with header
Scholar 18 days ago, created an answer that has been accepted. For A: Filtration of bam file but with header
Scholar 5 weeks ago, created an answer that has been accepted. For A: Filtration of bam file but with header
Popular Question 4 months ago, created a question with more than 1,000 views. For Which Programs Are You Relying On For Solid Data Analysis?
Epic Question 4 months ago, created a question with more than 10,000 views. For LinkedIn PubMed Importer
Appreciated 5 months ago, created a post with more than 5 votes. For A: Order Of Gatk Commands
Good Answer 6 months ago, created an answer that was upvoted at least 5 times. For A: Is It Ok To Use One End Of A Set Of Paired-End Reads As A Set Of Single Reads?
Appreciated 6 months ago, created a post with more than 5 votes. For A: Order Of Gatk Commands
Appreciated 7 months ago, created a post with more than 5 votes. For A: Order Of Gatk Commands
Teacher 8 months ago, created an answer with at least 3 up-votes. For A: Which Version Of Gatk Do People Use
Good Answer 9 months ago, created an answer that was upvoted at least 5 times. For A: Is It Ok To Use One End Of A Set Of Paired-End Reads As A Set Of Single Reads?
Scholar 9 months ago, created an answer that has been accepted. For A: Filtration of bam file but with header
Appreciated 9 months ago, created a post with more than 5 votes. For A: Order Of Gatk Commands
Epic Question 11 months ago, created a question with more than 10,000 views. For How To Split A .Vcf.Gz File
Teacher 11 months ago, created an answer with at least 3 up-votes. For A: Which Version Of Gatk Do People Use
Great Question 12 months ago, created a question with more than 5,000 views. For Which Programs Are You Relying On For Solid Data Analysis?
Good Answer 12 months ago, created an answer that was upvoted at least 5 times. For A: Is It Ok To Use One End Of A Set Of Paired-End Reads As A Set Of Single Reads?
Good Answer 12 months ago, created an answer that was upvoted at least 5 times. For A: Is It Ok To Use One End Of A Set Of Paired-End Reads As A Set Of Single Reads?
Teacher 12 months ago, created an answer with at least 3 up-votes. For A: Which Version Of Gatk Do People Use
Good Answer 13 months ago, created an answer that was upvoted at least 5 times. For A: Is It Ok To Use One End Of A Set Of Paired-End Reads As A Set Of Single Reads?
Appreciated 13 months ago, created a post with more than 5 votes. For A: Order Of Gatk Commands
Commentator 14 months ago, created a comment with at least 3 up-votes. For C: How To Analyse Snp Data From Different Sources?
Scholar 15 months ago, created an answer that has been accepted. For A: Filtration of bam file but with header
