Moderator: Matt Shirley

Matt Shirley (8.2k)
Reputation: 8,170
Status: Trusted
Location: Cambridge, MA
Website: http://mattshirley.com/
Twitter: mdshw5
Scholar ID: Google Scholar Page
Last seen: 51 minutes ago
Joined: 6 years, 10 months ago
Email: m*****@gmail.com

Posts by Matt Shirley

695 results • page 1 of 70
0 votes • 5 answers • 213 views
Answer: A: Rename FASTA files according to FASTA file header
... You can use `pyfaidx` for this:

    pip install pyfaidx
    faidx -x input.fasta

See [here](https://github.com/mdshw5/pyfaidx/blob/master/README.rst#faidx) for detailed usage. ...
written 13 days ago by Matt Shirley (8.2k)
0 votes • 3 answers • 1.6k views
Comment: C: Making bed files from fasta
... I want to point out that this feature didn't work as intended until `pyfaidx` v0.5.2, when someone pointed out that the coordinates weren't 0-based half-open as expected. This has now been fixed: https://github.com/mdshw5/pyfaidx/releases/tag/v0.5.2 ...
written 27 days ago by Matt Shirley (8.2k)
1 vote • 4 answers • 394 views
Comment: C: A very short gene with very high TPM
... You're correct. The TPM feature length scaling is taking a few reads and assigning most of the TPM space to them. ...
written 27 days ago by Matt Shirley (8.2k)
1 vote • 4 answers • 394 views
Comment: C: A very short gene with very high TPM
... I should have been clearer. The issue I solve by using TMM scaling is that, because TPM is a unity measure (the sum of all values must be 1e6), highly expressed transcripts/genes in a single sample will **cause all other genes to appear downregulated**, when in fact they are not. TMM normalizatio ...
written 28 days ago by Matt Shirley (8.2k)
4 votes • 4 answers • 394 views
Answer: A: A very short gene with very high TPM
... Your observation of very short transcripts receiving a disproportionate amount of the TPM is consistent with my observations. It's clear why this is the case. TPM is roughly: `(feature counts / feature length) / (sum of [all feature counts / feature lengths]) * 1e6`, and so for features with ...
written 29 days ago by Matt Shirley (8.2k)
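
To make the rough TPM formula in the excerpt above concrete, here is a minimal Python sketch. The function name `tpm` and the toy counts and lengths are made up for illustration and are not taken from the original answer; the code only shows how dividing by a short feature length can hand a handful of reads a large share of the 1e6 total.

```python
def tpm(counts, lengths):
    """Rough TPM: length-scaled counts, rescaled so that they sum to 1e6."""
    # Reads per base for each feature (feature counts / feature length).
    rates = [count / length for count, length in zip(counts, lengths)]
    total = sum(rates)
    # Each feature's share of the total rate, scaled to one million.
    return [rate / total * 1e6 for rate in rates]


# Hypothetical toy data: one 100 bp feature with only 10 reads,
# next to two long features with far more reads.
counts = [10, 1000, 2000]
lengths = [100, 20000, 50000]
values = tpm(counts, lengths)

for count, length, value in zip(counts, lengths, values):
    print(f"{count:>5} reads, {length:>6} bp -> TPM {value:,.0f}")
```

In this toy case the 100 bp feature has the fewest reads but the highest reads-per-base rate, so it ends up with the largest TPM of the three.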
4 votes • 2 answers • 250 views
Answer: A: How to extract specific genes from a fasta file
... There are many ways to get sequences from a FASTA file. Some (like the Kent utility) are more efficient than others (reading all sequences into memory and writing them back out). I made a Python library that's designed to be efficient (creates and operates on the same index .fai files that `samtools ...
written 5 weeks ago by Matt Shirley (8.2k)
0 votes • 3 answers • 1.4k views
Answer: A: Pysam is giving - ValueError: reference_id -1 out of range 0<=tid<65334. Any su
... Depending on your use case you might try [simplesam](https://github.com/mdshw5/simplesam). It's a much simpler API than pysam. Your code would become:

    fh = simplesam.Reader(open(self.alnfile, 'rb'))
    for aln in fh:
        print(aln.rname)

You could also filter out unmapped reads by checkin ...
written 5 weeks ago by Matt Shirley (8.2k)
0 votes • 4 answers • 190 views
Comment: C: Binning fasta sequences by size?
... See also https://www.biostars.org/p/150865/#163423 ...
written 6 weeks ago by Matt Shirley (8.2k)
1 vote • 4 answers • 190 views
Answer: A: Binning fasta sequences by size?
... The `faidx` utility in [`pyfaidx`](https://github.com/mdshw5/pyfaidx#faidx) has this capability built-in:

    $ pip install pyfaidx
    $ faidx --size-range 1,199 file.fa > smalls.fa
    $ faidx --size-range 200,200000000 file.fa > bigs.fa

An advantage of this approach is that you read the s ...
written 6 weeks ago by Matt Shirley (8.2k)
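
As a rough illustration of why an index makes size binning cheap, the sketch below picks sequence names by length using only the first two tab-separated columns of a `.fai` index (name and length), without reading any sequence data. It is not how `faidx --size-range` is implemented; the helper `bin_by_size`, the inclusive bounds, and the three index lines are assumptions made for this example.

```python
import io


def bin_by_size(fai_lines, min_len, max_len):
    """Return names of sequences whose length lies in [min_len, max_len]."""
    names = []
    for line in fai_lines:
        if not line.strip():
            continue  # skip blank lines
        # .fai columns: name, length, offset, linebases, linewidth
        fields = line.rstrip("\n").split("\t")
        name, length = fields[0], int(fields[1])
        if min_len <= length <= max_len:
            names.append(name)
    return names


# Hypothetical index contents standing in for a real file.fa.fai
fai = io.StringIO(
    "seq1\t150\t6\t60\t61\n"
    "seq2\t5000\t165\t60\t61\n"
    "seq3\t199\t5255\t60\t61\n"
)
lines = fai.readlines()

smalls = bin_by_size(lines, 1, 199)
bigs = bin_by_size(lines, 200, 200000000)
print("smalls:", smalls)
print("bigs:", bigs)
```

The selected names could then be used to fetch only the matching records from the indexed FASTA file.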
0 votes • 6 answers • 6.3k views
Comment: C: The Longest Chromosome > Sizeof(Int32)
... You'll never need more than 64kB of memory. ...
written 6 weeks ago by Matt Shirley (8.2k)

Latest awards to Matt Shirley

Popular Question 24 days ago, created a question with more than 1,000 views. For Masking low-complexity regions using pyfaidx MutableFastaRecord
Scholar 28 days ago, created an answer that has been accepted. For A: How to use pygr? worldbase doesn't return anything
Teacher 29 days ago, created an answer with at least 3 up-votes. For A: What Does 2X250Bp Buy Us?
Teacher 5 weeks ago, created an answer with at least 3 up-votes. For A: What Does 2X250Bp Buy Us?
Appreciated 7 weeks ago, created a post with more than 5 votes. For A: Ways To Detect Bias In Dna Sampling For Genomic Sequencing
Appreciated 8 weeks ago, created a post with more than 5 votes. For A: Ways To Detect Bias In Dna Sampling For Genomic Sequencing
Good Answer 10 weeks ago, created an answer that was upvoted at least 5 times. For A: How Can I Do Principal Components Analysis ?
Commentator 3 months ago, created a comment with at least 3 up-votes. For C: What Does 2X250Bp Buy Us?
Appreciated 4 months ago, created a post with more than 5 votes. For A: Ways To Detect Bias In Dna Sampling For Genomic Sequencing
Appreciated 4 months ago, created a post with more than 5 votes. For A: Ways To Detect Bias In Dna Sampling For Genomic Sequencing
Teacher 4 months ago, created an answer with at least 3 up-votes. For A: What Does 2X250Bp Buy Us?
Teacher 5 months ago, created an answer with at least 3 up-votes. For A: What Does 2X250Bp Buy Us?
Popular Question 5 months ago, created a question with more than 1,000 views. For On the utility of publishing a tool paper
Scholar 6 months ago, created an answer that has been accepted. For A: How to use pygr? worldbase doesn't return anything
Teacher 6 months ago, created an answer with at least 3 up-votes. For A: What Does 2X250Bp Buy Us?
Scholar 6 months ago, created an answer that has been accepted. For A: How to use pygr? worldbase doesn't return anything
Scholar 6 months ago, created an answer that has been accepted. For A: How to use pygr? worldbase doesn't return anything
Teacher 6 months ago, created an answer with at least 3 up-votes. For A: What Does 2X250Bp Buy Us?
Scholar 7 months ago, created an answer that has been accepted. For A: How to use pygr? worldbase doesn't return anything
Teacher 7 months ago, created an answer with at least 3 up-votes. For A: What Does 2X250Bp Buy Us?
Commentator 7 months ago, created a comment with at least 3 up-votes. For C: What Does 2X250Bp Buy Us?
Scholar 7 months ago, created an answer that has been accepted. For A: How to use pygr? worldbase doesn't return anything
Appreciated 7 months ago, created a post with more than 5 votes. For A: Ways To Detect Bias In Dna Sampling For Genomic Sequencing
Popular Question 8 months ago, created a question with more than 1,000 views. For On the utility of publishing a tool paper
Good Answer 8 months ago, created an answer that was upvoted at least 5 times. For A: How Can I Do Principal Components Analysis ?
