User: mforthman

Reputation: 30
Status: New User
Location:
Last seen: 3 weeks, 3 days ago
Joined: 3 years, 6 months ago
Email: m********@ufl.edu

Posts by mforthman

53 results • page 1 of 6
0 votes • 1 answer • 142 views
Comment: C: concat fasta sequences based on some common header string
... actually, I need to make it 'append' (>>) to the output file or else it just overwrites the output file with the last set of exons it concatenated. ...
written 8 weeks ago by mforthman (30)
0 votes • 1 answer • 142 views
Comment: C: concat fasta sequences based on some common header string
... Thanks Pierre for the recommended commands. I tried to use it, adding a command to direct output to a new file, but it resulted in an empty file. Without that included, it prints sequences it concatenates to the screen (it does not print sequences that do not get concatenated with anything else). Pe ...
written 8 weeks ago by mforthman (30)
3 votes • 1 answer • 142 views
concat fasta sequences based on some common header string
... I have a single fasta file with multiple exons from various genes. What I would like to do is use the GeneID in each header to find those exon sequences that have the same GeneID and concatenate them into a single sequence. Along with this, I would like to alter the sequence header so that it includ ...
Tags: fasta, sequence, dna • written 8 weeks ago by mforthman (30) • updated 8 weeks ago by Pierre Lindenbaum (124k)
0 votes • 1 answer • 117 views
Comment: C: generating 'maximal' cds sequence from multiple isoforms of a gene
... This was great! After some time trying to sort the file, I think I got bedtools to do exactly what I wanted with the .gff file. Next step is to figure out how I can use the bedtools output to extract the "new" exons (what I'm more interested in for downstream analyses) from the corresponding NCBI ge ...
written 8 weeks ago by mforthman (30)
2 votes • 1 answer • 117 views
generating 'maximal' cds sequence from multiple isoforms of a gene
... For a given gene, I would like to take the exons or CDS coordinates for all isoforms and 'merge' them to create a reference transcript that represents the 'maximal' number of coding regions. I have seen some previous posts that address this issue, but they only apply to Swiss-Prot, Entrez, Ensembl, ...
Tags: gene, cds, gff • written 8 weeks ago by mforthman (30) • updated 8 weeks ago by Brice Sarver (3.2k)
0 votes • 0 answers • 669 views
prinseq error: Use of uninitialized value $qual in scalar chomp
... Running fastq-generated SRA files from NCBI through prinseq-lite. The program generates an error: Use of uninitialized value $qual in scalar chomp at /apps/prinseq/0.20.4/bin/prinseq-lite.pl line 2583, line 26875674. It still continues on until it has finished, but clearly there is somethin ...
Tags: software error, prinseq, sequence • written 19 months ago by mforthman (30)
1 vote • 2 answers • 568 views
using NCBI SRA data for prinseq
... I have been trying to download SRA data from NCBI and putting it in fastq format using fastq-dump. A colleague and I have been trying to figure out why the resulting fastq files are causing some errors when inputted into prinseq-lite. My collaborator has been using this fastq-dump command: fas ...
Tags: software error, sra, fastq, sequence, prinseq • written 19 months ago by mforthman (30) • updated 19 months ago by swbarnes2 (7.0k)
0 votes • 4 answers • 1.2k views
Comment: C: search text in one file and then replace with text from another file
... I've still been tinkering with the script and feel as though I might be getting closer to the solution: #!/usr/bin/env python import sys import re original_fn = sys.argv[1] company_fn = sys.argv[2] pattern = '(uce.+$|ENSOFAS.+$|[AB]_[0-9]+$)' map = {} ...
written 2.1 years ago by mforthman (30)
1 vote • 4 answers • 1.2k views
Comment: C: search text in one file and then replace with text from another file
... You are correct, that was my mistake. Thanks for catching that! And I appreciate you taking a further look at this later. ...
written 2.1 years ago by mforthman (30)
0 votes • 4 answers • 1.2k views
Comment: C: search text in one file and then replace with text from another file
... Correct, I kept the original question simple in original post. In reality, the company.fasta file had many differently formatted headers. I used the script you had provided to process just the `uce` headers, which worked. I already had the `ENSOFAS` headers. This leaves the others to be formatted. T ...
written 2.1 years ago by mforthman (30)

Latest awards to mforthman

Popular Question 11 weeks ago, created a question with more than 1,000 views. For Post-processing of reciprocal best hits output
Popular Question 7 months ago, created a question with more than 1,000 views. For Post-processing of reciprocal best hits output
Popular Question 19 months ago, created a question with more than 1,000 views. For Several exonerate options; which to choose?
Popular Question 19 months ago, created a question with more than 1,000 views. For FrameDP installation issue
Popular Question 19 months ago, created a question with more than 1,000 views. For How to extract exon sequences from annotated genome
Popular Question 19 months ago, created a question with more than 1,000 views. For search text in one file and then replace with text from another file
Popular Question 3.3 years ago, created a question with more than 1,000 views. For How to extract exon sequences from annotated genome
Popular Question 3.3 years ago, created a question with more than 1,000 views. For Reciprocal Best Hits across more than 2 species
