Question

Biopython or Bioperl help

0

9.7 years ago

iamtuttu5 ▴ 40

Hello all,

I am new to computational biology and NGS. As part of my work I need code that calculates chromosome sizes from genome FASTA files. The program should take a genome FASTA file as input, store the sequence lengths in an array, and then print the size of each chromosome. Could anyone please help me with this?

Thank you for your help :)

next-gen RNA-Seq biopython sequence • 3.7k views

Updated 2.3 years ago by Ram 45k • written 9.7 years ago by iamtuttu5 ▴ 40

5

My best advice is to try it yourself and ask if you get stuck. Writing code is the only way to learn. The BioPerl HOWTOs are a good place to start and there are complete examples for doing things like getting sequence stats.

Reply 9.7 years ago by SES 8.6k

2

Sounds like homework. Try it yourself. As SES suggested, check http://www.bioperl.org/wiki/HOWTO:SeqIO for some help. Since you are just starting out and will want to double-check your results, use 'faSize' from the Kent utils at https://github.com/ENCODE-DCC/kentUtils/tree/master/bin/linux.x86_64

Reply 9.7 years ago by Alternative ▴ 290

0

Thank you all :)

Reply 9.7 years ago by iamtuttu5 ▴ 40

Ram · Answer 1 · 2015-11-05

2

9.7 years ago

Jon ▴ 360

Here is a Biopython script that takes a FASTA file and prints each record's ID and sequence length. Run it like this: python script.py file.fasta

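import sys
from Bio import SeqIO

# Open the FASTA file given as the first command-line argument
with open(sys.argv[1]) as handle:
    # SeqIO.parse yields one SeqRecord per FASTA entry
    for rec in SeqIO.parse(handle, 'fasta'):
        # Print the record ID and its sequence length, tab-separated
        print("%s\t%d" % (rec.id, len(rec.seq)))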

Updated 5.6 years ago by Ram 45k • written 9.7 years ago by Jon ▴ 360

Ram · Answer 2 · 2015-11-05

1

9.7 years ago

venu 7.1k

In this Perl script there is a line my $sequence_length = length($sequence); , so just add a line that prints that value and your work will be done. You can find other scripts for nucleotide sequence analysis in that repository as well.
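For comparison, the same idea (accumulate each sequence's length, then print it) can be sketched in plain Python with no external modules. This is only a minimal sketch, not the linked script: the function name fasta_lengths is made up for illustration, and it assumes a well-formed FASTA file where each record starts with a ">" header line and the ID is the first word of that header. It stores the sizes in a list, as the question asks, before printing them.

```python
import sys


def fasta_lengths(lines):
    """Return a list of (record_id, sequence_length) pairs in file order.

    `lines` is any iterable of text lines, e.g. an open file handle.
    (Hypothetical helper, named here only for illustration.)
    """
    sizes = []
    name, length = None, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one
            if name is not None:
                sizes.append((name, length))
            fields = line[1:].split()
            name = fields[0] if fields else ""  # ID = first word after '>'
            length = 0
        elif name is not None:
            # Sequence lines may be wrapped, so add up every line
            length += len(line)
    # Do not forget the last record in the file
    if name is not None:
        sizes.append((name, length))
    return sizes


def main(argv):
    if len(argv) != 2:
        print("usage: python script.py file.fasta", file=sys.stderr)
        return
    with open(argv[1]) as handle:
        for name, length in fasta_lengths(handle):
            print(f"{name}\t{length}")


if __name__ == "__main__":
    main(sys.argv)
```

Because the lengths are summed line by line, the whole chromosome never has to be held in memory, which helps with large genomes. The output should match what Biopython or faSize report for the same file, so either can be used as a cross-check.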

Updated 5.6 years ago by Ram 45k • written 9.7 years ago by venu 7.1k