bedtools - getfasta: How to report both 'chrom:start-stop and' 'name' of sequence?
4
0
Entering edit mode
6.0 years ago
bi_Scholar ▴ 10

AFAIK bedtools only allows for '>chrom:start-stop' (default) or '>name' (-name)

Is it possible to have both, i.e. for a .bed feature

chr1    100 110 name    .   +

i want to have an output like

>name:chr:100-110(+)
NNNNNNNNNN
bedtools • 4.9k views
ADD COMMENT
3
Entering edit mode
6.0 years ago

Does name come from bed or from fasta header or fasta file name? From OP, it seems it comes from bed. If so you can create a fifth column with desired string.

$ awk -v OFS="\t" '{$5=$4":"$1":"$2"-"$3}1' test.bed | bedtools getfasta -fi test.fa -bed - -name
>Name:chr1:5-10
AAACC
ADD COMMENT
1
Entering edit mode
6.0 years ago

Hello,

you could first use the -name option. And than change the id with a little python script using Biopython like this:

from Bio import SeqIO

bed_file = "./regions.bed"
original_file = "./original.fasta"
renamed_file = "./renamed.fasta"

regions = dict()
records = []

with open(bed_file, "r") as bed:
    for line in bed:
        data = line.strip().split("\t")
        regions[data[3]] = data

for record in SeqIO.parse(original_file, 'fasta'):
    record.id = "{3}:{0}:{1}-{2}({5})".format(*regions[record.id])
    record.description = record.id
    records.append(record)

SeqIO.write(records, renamed_file, 'fasta')

fin swimmer

EDIT: Code correction. The original has overwritten the renamed.fasta in every iteration.

ADD COMMENT
1
Entering edit mode
6.0 years ago
Tm ★ 1.1k

Alternative to bedtools, you can use a subseq command of seqkit, which help fetching particular sequence region and also provides all the information in the header.

ADD COMMENT
1
Entering edit mode
6.0 years ago

You could use samtools faidx with bed2faidxsta.pl:

$ bed2faidxsta.pl < in.bed > out.fa

If you want the header to have a custom format, just edit line 67:

my $header = ">".join(":",($chr, $start, $stop, $id, $score, $strand));

You could move $id to the front of the list, for instance.

ADD COMMENT

Login before adding your answer.

Traffic: 1899 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6