Moderator: Matt Shirley

Matt Shirley (7.2k)
Reputation: 7,250
Status: Trusted
Location: Cambridge, MA
Website: http://mattshirley.com/
Twitter: mdshw5
Scholar ID: Google Scholar Page
Last seen: an hour ago
Joined: 6 years, 1 month ago
Email: m*****@gmail.com

Posts by Matt Shirley

643 results • page 1 of 65
0 votes • 2 answers • 147 views
Answer: A: Downloading from nr data base.
... Just use a tool that indexes and parses the FASTA format, and allows for a case-insensitive regular expression match:
    $ pip install pyfaidx
    $ faidx nr.fasta --regex "(?i)streptomyces"
...
written 11 days ago by Matt Shirley (7.2k)
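The idea behind the `--regex "(?i)streptomyces"` call can be sketched with the Python standard library alone. This is only an illustration and not how `faidx` is implemented: `filter_fasta` and the sample records are hypothetical names made up for the example, and the input is read sequentially rather than through an index.

```python
import re


def filter_fasta(lines, pattern):
    """Yield (header, sequence) pairs whose header matches `pattern`.

    Hypothetical helper for illustration; not part of pyfaidx.
    """
    regex = re.compile(pattern)
    header, seq = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new record starts: emit the previous one if it matched.
            if header is not None and regex.search(header):
                yield header, "".join(seq)
            header, seq = line[1:], []
        elif header is not None:
            seq.append(line)
    # Emit the final record if it matched.
    if header is not None and regex.search(header):
        yield header, "".join(seq)


# Made-up records; "(?i)" makes the match case-insensitive.
sample = [
    ">sp1 Streptomyces coelicolor protein",
    "MKT",
    "AAV",
    ">sp2 Escherichia coli protein",
    "MSS",
    ">sp3 STREPTOMYCES griseus protein",
    "MLL",
]
hits = list(filter_fasta(sample, r"(?i)streptomyces"))
```

Here `hits` holds the first and third records; the second is skipped because its header never mentions the genus.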
1 vote • 10 answers • 29k views
Comment: C: How To Extract A Sequence From A Big (6Gb) Multifasta File ?
...
    pip install pyfaidx
then
    faidx --regex "^((?!>1;).)*$" input.fa > output.fa
or
    faidx --invert-match --regex "^>1;.*$" input.fa > output.fa
The first example uses negative lookaheads, which may be more difficult to reason about, while the second example depends on the `- ...
written 5 weeks ago by Matt Shirley (7.2k)
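To see why the two commands above select the same records, the two patterns can be compared directly with Python's `re` module. This is a minimal sketch on made-up header lines (the `>N;size=...` strings are assumptions for illustration), and it only shows the patterns agreeing when `>1;` can appear solely at the start of a line.

```python
import re

# Hypothetical header lines, used only for this comparison.
headers = [">1;size=10", ">2;size=7", ">11;size=3", ">3;size=1"]

# Approach 1: negative lookahead. Matches a line only if ">1;" occurs nowhere in it.
lookahead = re.compile(r"^((?!>1;).)*$")
kept_lookahead = [h for h in headers if lookahead.match(h)]

# Approach 2: match the unwanted lines, then invert the selection.
unwanted = re.compile(r"^>1;.*$")
kept_inverted = [h for h in headers if not unwanted.match(h)]
```

Both lists drop `>1;size=10` and keep `>11;size=3`, since `>11;` does not contain the substring `>1;`. The inverted form says what to exclude in plain terms, which is usually easier to review.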
2 votes • 2 answers • 199 views
Comment: C: Modifying Fasta file header
... Apparently Biopython uses the strict definition (if FASTA has any) of the ID as everything before the first space (see https://www.biostars.org/p/18987/). To get the whole header you want `SeqRecord.description`, not `SeqRecord.id` ...
written 10 weeks ago by Matt Shirley (7.2k)
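The ID-versus-description convention described in this comment can be shown without Biopython installed. The sketch below is a stdlib stand-in, not Biopython's parser: `split_header` is a hypothetical helper, and the header text is made up.

```python
def split_header(header_line):
    """Return (id, description) for one FASTA header line.

    Hypothetical helper: the id is everything before the first
    whitespace, the description is the whole line without the ">".
    """
    description = header_line[1:] if header_line.startswith(">") else header_line
    description = description.strip()
    parts = description.split(None, 1)
    seq_id = parts[0] if parts else ""
    return seq_id, description


seq_id, description = split_header(">gene_1 hypothetical protein [E. coli]")
```

With this input, `seq_id` is `gene_1` while `description` keeps the full header text, which is the distinction the comment draws between `SeqRecord.id` and `SeqRecord.description`.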
1 vote • 2 answers • 199 views
Comment: C: Modifying Fasta file header
... Most methods that access FASTA entries using the offsets stored in a *.fai file will truncate the header name at the first whitespace. However, Bio.SeqIO does not use this scheme. Both samtools and pyfaidx do, but there's a method in pyfaidx: `FastaRecord.long_name` will recover the entire header nam ...
written 10 weeks ago by Matt Shirley (7.2k)
0 votes • 2 answers • 199 views
Comment: C: Modifying Fasta file header
... It might be helpful to know why you want to modify your headers in this fashion and what some of your other headers look like. ...
written 10 weeks ago by Matt Shirley (7.2k)
1 vote • 2 answers • 158 views
Comment: C: Adding Fasta unique identifiers
... awk '/^>/ {printf(">%d %s\n",++N,substr($0,2));next;} {print;}' input.fa > output.fa ...
written 10 weeks ago by Matt Shirley (7.2k)
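For anyone more comfortable in Python than awk, the one-liner above can be read as the following sketch. `number_headers` is a hypothetical name; it follows the same logic as the awk program by prefixing each header with a running counter and passing sequence lines through untouched.

```python
def number_headers(lines):
    """Prefix each FASTA header with a running integer, like the awk one-liner.

    Hypothetical helper for illustration.
    """
    count = 0
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            count += 1
            # Same as awk's printf(">%d %s\n", ++N, substr($0, 2))
            yield f">{count} {line[1:]}"
        else:
            yield line


# Made-up input records.
renumbered = list(number_headers([">a", "ACGT", ">b", "GG"]))
```

To run it on a real file, the same generator could be fed an open file handle and its output written line by line.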
0 votes • 3 answers • 173 views
Answer: A: Delete fasta sequence with a pattern "unassigned peptidases"
...
    $ pip install pyfaidx
    $ faidx sequences.fa --regex '.*unassigned peptidases.*' --invert-match > no_peptidases.fa
You can find more usage for `faidx` here: https://github.com/mdshw5/pyfaidx#faidx ...
written 10 weeks ago by Matt Shirley (7.2k)
2 votes • 3 answers • 269 views
Answer: A: Parsing FASTA file using class in Python
... If you want a fasta file to act like a sequence dictionary, just use [pyfaidx](https://github.com/mdshw5/pyfaidx):
    import pyfaidx
    fa = pyfaidx.Fasta("sample.fa")
    for key in fa.keys():
        print(key)      # sequence name
        print(fa[key])  # sequence object
You'll be using an efficient method t ...
written 11 weeks ago by Matt Shirley (7.2k)
5 votes • 1 answer • 371 views
Comment: C: Can Biostars use question template like Github issue/PR?
... I really like this idea, though care needs to be taken not to punish users if they can't clearly describe the problem, and make sure "I haven't tried anything" is sometimes appropriate when we're learning new subject matter. ...
written 12 weeks ago by Matt Shirley (7.2k)
2 votes • 8 answers • 528 views
Answer: A: fasta seq header
...
    $ pip install pyfaidx
    $ faidx -e "lambda x: x.split('|')[0]" genes.fa
    >gene_1
    ATGCGTCGACGTCGTACGGGTTTT
    CGTACGGGTTATGCGTCGACGTC
    GTACGGGTTTT
    ...
...
written 12 weeks ago by Matt Shirley (7.2k)
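The effect of passing a header function such as `lambda x: x.split('|')[0]` can be sketched in plain Python. This is an illustration under assumptions, not pyfaidx's code: `rewrite_headers` is a hypothetical helper, the input header is made up, and the function is assumed to receive the header text without the leading `>`.

```python
def rewrite_headers(lines, header_function):
    """Apply `header_function` to each FASTA header, leaving sequence lines as-is.

    Hypothetical helper; assumes the function gets the header without ">".
    """
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            yield ">" + header_function(line[1:])
        else:
            yield line


# Made-up record with a pipe-delimited header.
records = [">gene_1|chr1|plus", "ATGCGTCGACGTCGTACGGGTTTT", "GTACGGGTTTT"]
trimmed = list(rewrite_headers(records, lambda x: x.split("|")[0]))
```

Only the header changes: everything from the first `|` onward is dropped, and the sequence lines are passed through unchanged.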

Latest awards to Matt Shirley

Scholar 4 weeks ago, created an answer that has been accepted. For A: How to use pygr? worldbase doesn't return anything
Popular Question 5 weeks ago, created a question with more than 1,000 views. For Comments Left Inappropriately As Answers To A Question
Teacher 6 weeks ago, created an answer with at least 3 up-votes. For A: What Does 2X250Bp Buy Us?
Teacher 10 weeks ago, created an answer with at least 3 up-votes. For A: What Does 2X250Bp Buy Us?
Popular Question 11 weeks ago, created a question with more than 1,000 views. For Comments Left Inappropriately As Answers To A Question
Teacher 11 weeks ago, created an answer with at least 3 up-votes. For A: What Does 2X250Bp Buy Us?
Appreciated 12 weeks ago, created a post with more than 5 votes. For A: Ways To Detect Bias In Dna Sampling For Genomic Sequencing
Commentator 12 weeks ago, created a comment with at least 3 up-votes. For C: What Does 2X250Bp Buy Us?
Popular Question 4 months ago, created a question with more than 1,000 views. For Troubling Trends In Scientific Software Use
Good Answer 4 months ago, created an answer that was upvoted at least 5 times. For A: How Can I Do Principal Components Analysis ?
Scholar 4 months ago, created an answer that has been accepted. For A: How to use pygr? worldbase doesn't return anything
Appreciated 5 months ago, created a post with more than 5 votes. For A: Ways To Detect Bias In Dna Sampling For Genomic Sequencing
Teacher 5 months ago, created an answer with at least 3 up-votes. For A: What Does 2X250Bp Buy Us?
Teacher 6 months ago, created an answer with at least 3 up-votes. For A: What Does 2X250Bp Buy Us?
Commentator 6 months ago, created a comment with at least 3 up-votes. For C: What Does 2X250Bp Buy Us?
Appreciated 7 months ago, created a post with more than 5 votes. For A: Ways To Detect Bias In Dna Sampling For Genomic Sequencing
Teacher 7 months ago, created an answer with at least 3 up-votes. For A: What Does 2X250Bp Buy Us?
Good Answer 8 months ago, created an answer that was upvoted at least 5 times. For A: Generate Vcf.Gz File And Its Index File Vcf.Gz.Tbi
Teacher 8 months ago, created an answer with at least 3 up-votes. For A: What Does 2X250Bp Buy Us?
Teacher 9 months ago, created an answer with at least 3 up-votes. For A: What Does 2X250Bp Buy Us?
Scholar 10 months ago, created an answer that has been accepted. For A: How to use pygr? worldbase doesn't return anything
Teacher 11 months ago, created an answer with at least 3 up-votes. For A: What Does 2X250Bp Buy Us?
Appreciated 12 months ago, created a post with more than 5 votes. For A: Ways To Detect Bias In Dna Sampling For Genomic Sequencing
Teacher 12 months ago, created an answer with at least 3 up-votes. For A: How To Select Only One Human Genome Build (Hg19) From The Encode Project's Data
Good Answer 12 months ago, created an answer that was upvoted at least 5 times. For A: Not having root access sucks; installing software without root privileges
