User: sacha

sacha1.5k
Reputation:
1,530
Status:
Trusted
Location:
France
Website:
http://dridk.me/
Twitter:
dridk
Last seen:
1 week ago
Joined:
3 years, 5 months ago
Email:
s****@labsquare.org

doctor in molecular genetics
Rennes Hospital - genomics laboratory

Posts by sacha

252 results • page 1 of 26
1
vote
3
answers
176
views
Comment: C: Dna Pattern Matching In Python for Analysis
... Does the previous code give the expected result? If you have a large dataset, you can avoid creating the matrix. This is not efficient, but it will work: patterns = [] seq1 = str(next(SeqIO.parse("test.fa", "fasta")).seq) for i in range(len(seq1)): pattern = list() for ...
written 7 days ago by sacha1.5k
6
votes
3
answers
176
views
Answer: A: Dna Pattern Matching In Python for Analysis
... Something like this? #### Input FASTA file: test.fa >A AAAAAAAAA >B AAAAACAAG >C GAAAACGAA >D AAAAAAAAT #### Python counter from Bio import SeqIO import pandas as pd from collections import Counter df = pd.DataFrame() for r ...
written 7 days ago by sacha1.5k
0
votes
1
answer
126
views
Comment: C: edit headers of fasta files
... It was a mistake. I fixed it. Sorry ...
written 14 days ago by sacha1.5k
0
votes
1
answer
126
views
Answer: A: edit headers of fasta files
... I use [seqkit][1] for FASTA manipulation. Try selecting and replacing the FASTA headers with seqkit, using the grep and replace commands with a regular expression and capture groups. Something like this: seqkit grep -nr -p "WP_\d+\.\d" test.fa|seqkit replace -p ".+(WP_\d+)\.(\d).+" -r '$1[0-9].$2' output: ...
written 14 days ago by sacha1.5k
0
votes
0
answers
127
views
Comment: A: Google Dataset Search
... Awesome !! ![enter image description here][1] [1]: https://i.imgur.com/7WjqubK.png ...
written 14 days ago by sacha1.5k
1
vote
5
answers
185
views
Answer: A: how to count variants par sample per chromosome in a vcf file?
... Split your VCF file by sample and count how many times each chromosome appears in each file. FILE=yourfile.vcf for sample in `bcftools query -l $FILE` do bcftools view -c1 -H -s $sample -o ${sample}.vcf $FILE cat ${sample}.vcf |cut -f1|uniq -c > ${sample}.count done ...
written 14 days ago by sacha1.5k
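A rough Python equivalent of the per-sample, per-chromosome count, as a minimal sketch only. It assumes an uncompressed, tab-separated VCF whose FORMAT column contains `GT`, and it counts a record for a sample when that sample carries at least one non-reference allele (roughly what `bcftools view -c1 -s` keeps). The function name `count_variants` is illustrative and not part of the original answer.

```python
from collections import Counter, defaultdict


def count_variants(vcf_lines):
    """Return {sample: Counter({chrom: n})} over records where the sample has a non-reference allele."""
    counts = defaultdict(Counter)
    samples = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        chrom = fields[0]
        # Assumption: GT is present in FORMAT; otherwise index() raises ValueError.
        gt_index = fields[8].split(":").index("GT")
        for sample, call in zip(samples, fields[9:]):
            genotype = call.split(":")[gt_index]
            alleles = genotype.replace("|", "/").split("/")
            # "0" is the reference allele and "." is a missing call.
            if any(allele not in ("0", ".") for allele in alleles):
                counts[sample][chrom] += 1
    return counts
```

To use it on a file, pass an open handle, for example `count_variants(open("yourfile.vcf"))`. Unlike `uniq -c`, this does not require the records to be sorted by chromosome.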
0
votes
1
answer
163
views
Comment: C: Substractive Genomics Analysis
... So, just remove the header (with awk, for instance) and apply my previous command line. cat test.txt |awk 'BEGIN{keep=0}{if ($0 ~ "^>"){keep=1} if (keep == 1) print($0)}'|grep -P -B5 'Identities = \d+/\d+\s\(([7-9]\d|100)%' ...
written 15 days ago by sacha1.5k • updated 9 days ago by Sej Modha3.6k
1
vote
1
answer
200
views
Answer: A: Fastest way to process fasta file?
... Try a [memory map][1]. It maps your file into virtual memory, so you can read data from it without running out of memory. For example: - [http://docs.seqan.de/seqan/1.4.2-dox/?p=MMapString][2] - [https://docs.python.org/3/library/mmap.html][3] - [https://pypi.org/project/pyfasta/][4] [1]: ...
written 19 days ago by sacha1.5k
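A minimal sketch of the memory-map idea using Python's standard `mmap` module. It only counts FASTA records by looking for `>` at the start of a line; the function name `count_fasta_records` is illustrative, and a real parser would need more than this.

```python
import mmap


def count_fasta_records(path):
    """Count FASTA headers by scanning a memory-mapped file for '>' at line starts."""
    count = 0
    with open(path, "rb") as handle:
        # Length 0 maps the whole file; the OS pages data in on demand.
        # Note: an empty file cannot be mapped and raises ValueError.
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            if mm[:1] == b">":
                count += 1  # header on the very first line
            pos = mm.find(b"\n>")
            while pos != -1:
                count += 1
                pos = mm.find(b"\n>", pos + 1)
    return count
```

The file is never loaded into a Python string as a whole, so the same code should work on files larger than available RAM, within the limits of the address space.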
4
votes
1
answer
163
views
Answer: A: Substractive Genomics Analysis
... Use a regular expression with grep to select lines such as `Identities = 228/327 (70%)` and print the 5 lines before each match (`-B 5`). 70% or more can be expressed as: `([7-9]\d|100)` `cat your_file.txt |grep -P -B5 'Identities = \d+/\d+\s\(([7-9]\d|100)%'` ...
written 19 days ago by sacha1.5k
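The same filter can be sketched in Python with the identical pattern. This assumes BLAST-style text output containing lines such as `Identities = 228/327 (70%)`; the function name `lines_with_context` is illustrative and not from the original answer.

```python
import re

# Same pattern as the grep command: a percentage of 70-99, or exactly 100.
IDENTITY_RE = re.compile(r"Identities = \d+/\d+\s\(([7-9]\d|100)%")


def lines_with_context(text, before=5):
    """Return each matching line together with up to `before` preceding lines."""
    lines = text.splitlines()
    blocks = []
    for i, line in enumerate(lines):
        if IDENTITY_RE.search(line):
            # Slice from `before` lines back (clamped at 0) through the match itself.
            blocks.append(lines[max(0, i - before): i + 1])
    return blocks
```

Unlike `grep -B5`, this returns one block per match and does not merge overlapping context.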
0
votes
2
answers
171
views
Answer: A: Add string to list of protein sequences in fasta file with different lengh
... from pyfaidx import Fasta, Faidx # Create faidx of fasta file faidx = Faidx('test.fa') # Get max len from faidx max_seq_name = max(faidx.index.keys(), key=(lambda k: faidx.index[k]["lenc"])) max_len = faidx.index[max_seq_name]["lenc"] # Loop ov ...
written 19 days ago by sacha1.5k

Latest awards to sacha

Teacher 7 days ago, created an answer with at least 3 up-votes. For A: awk filed with different separator
Scholar 7 days ago, created an answer that has been accepted. For A: bcftools extract data
Popular Question 9 days ago, created a question with more than 1,000 views. For looking for 16S RNA sequence consensus
Scholar 12 days ago, created an answer that has been accepted. For A: bcftools extract data
Voter 14 days ago, voted more than 100 times.
Popular Question 19 days ago, created a question with more than 1,000 views. For looking for 16S RNA sequence consensus
Scholar 19 days ago, created an answer that has been accepted. For A: bcftools extract data
Teacher 19 days ago, created an answer with at least 3 up-votes. For A: awk filed with different separator
Appreciated 19 days ago, created a post with more than 5 votes. For Big Browser : a new genom browser in development
Teacher 21 days ago, created an answer with at least 3 up-votes. For A: awk filed with different separator
Appreciated 10 weeks ago, created a post with more than 5 votes. For Big Browser : a new genom browser in development
Popular Question 10 weeks ago, created a question with more than 1,000 views. For looking for 16S RNA sequence consensus
Teacher 10 weeks ago, created an answer with at least 3 up-votes. For A: awk filed with different separator
Teacher 3 months ago, created an answer with at least 3 up-votes. For A: awk filed with different separator
Popular Question 5 months ago, created a question with more than 1,000 views. For looking for 16S RNA sequence consensus
Great Question 5 months ago, created a question with more than 5,000 views. For Stop to make GUI with Java .... Use Qt 5 !!
Popular Question 5 months ago, created a question with more than 1,000 views. For looking for 16S RNA sequence consensus
Popular Question 5 months ago, created a question with more than 1,000 views. For Somatic allele frequency from TCGA in non-coding DNA
Popular Question 5 months ago, created a question with more than 1,000 views. For Count repeat sequence
Popular Question 5 months ago, created a question with more than 1,000 views. For Big Browser : a new genom browser in development
Scholar 5 months ago, created an answer that has been accepted. For A: bcftools extract data
Popular Question 5 months ago, created a question with more than 1,000 views. For Count repeat sequence
Teacher 6 months ago, created an answer with at least 3 up-votes. For A: awk filed with different separator
Scholar 6 months ago, created an answer that has been accepted. For A: bcftools extract data
Popular Question 6 months ago, created a question with more than 1,000 views. For Big Browser : a new genom browser in development
