Rename FASTA file headers for use in MIRA
6.4 years ago
mn1154 • 0

Hi all,

I was wondering if anyone could help me with renaming the header lines on my FASTA files?

I am using a simulated paired-end metagenome. The data currently look like this:

>r1.1 |SOURCES={KEY=bf97e692...,bw,559392-559472}|ERRORS={}|SOURCE_1="CP007128.1 Gemmatimonadetes bacterium KBS708, complete genome" (bf97e6923cd410b05af0dc7641aa6e2651e19392)
GTCGCTGCAGGGGCGCGACTCGGCGCGCGTGCGCGACTCGGCGCGCGTGCGCGACTTCGCGCTCT
ACGGCGAGACGACGG

>r1.2 |SOURCES={KEY=bf97e692...,fw,558357-558437}|ERRORS={}|SOURCE_1="CP007128.1 Gemmatimonadetes bacterium KBS708, complete genome" (bf97e6923cd410b05af0dc7641aa6e2651e19392)
GAGGGCGGCTTCCACCCCGGCACCGGCCTGGCCGCCGATCGCCTCGTCGGCATGACGAAGCTCGC
CGGCGAGTGCCGTAC

>r2.1 |SOURCES={KEY=bf97e692...,bw,4893168-4893248}|ERRORS={}|SOURCE_1="CP007128.1 Gemmatimonadetes bacterium KBS708, complete genome" (bf97e6923cd410b05af0dc7641aa6e2651e19392)
TGGAACAGCTCGTCGCGGGCTTCCTCGTAGGGCGTCGGGGTCGCGACAGCATCCCGTCGTCCGCG
GTTGTTATTGCCGTG

>r2.2 |SOURCES={KEY=bf97e692...,fw,4892115-4892195}|ERRORS={76:T}|SOURCE_1="CP007128.1 Gemmatimonadetes bacterium KBS708, complete genome" (bf97e6923cd410b05af0dc7641aa6e2651e19392)
ACTAGATTGACGACGAA*


Here >r1.1 and >r1.2 are a pair of paired-end reads. To run MIRA, I need to rename the header lines into a more conventional format, i.e. >r1/1 and >r1/2.

Does anyone know how I can do this across my whole FASTA file, so that the reads are numbered 1/1, 1/2, 2/1, 2/2, 3/1, 3/2 and so on?

Maisie

sequencing next-gen Assembly • 1.2k views

I guess you can do this easily with sed. Do you want to delete the remaining parts? e.g.

SOURCES={KEY=bf97e692...,fw,558357-558437}|ERRORS={}|SOURCE_1="CP007128.1 Gemmatimonadetes bacterium KBS708, complete genome" (bf97e6923cd410b05af0dc7641aa6e2651e19392)
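If the answer is yes, something along these lines might work. It is only a sketch: sample.fa and trimmed.fa are made-up file names, the sample headers are shortened copies of the ones in the question, and it assumes every header starts with >name.1 or >name.2 followed by the description.

```shell
# Hypothetical two-read sample in the format from the question (headers shortened).
cat > sample.fa <<'EOF'
>r1.1 |SOURCES={KEY=bf97e692...,bw,559392-559472}|ERRORS={}
GTCGCTGCAGG
>r1.2 |SOURCES={KEY=bf97e692...,fw,558357-558437}|ERRORS={}
GAGGGCGGCTT
EOF

# On header lines only: keep ">name", turn ".1"/".2" into "/1"/"/2",
# and drop everything after the read number.
sed -E '/^>/ s|^(>[^.[:space:]]+)\.([12]).*|\1/\2|' sample.fa > trimmed.fa

cat trimmed.fa
```

The /^>/ address limits the substitution to header lines, so the sequence lines pass through unchanged.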

6.4 years ago
venu 7.0k
sed 's/\./\//' foo.fa > out.fa


You could have searched this forum for similar threads first. For now, here is the answer, but next time please show us what you have tried so that we can help fix the bugs in your own solution.
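A slightly more defensive variant of the one-liner, as a sketch: the /^>/ address restricts the substitution to header lines, so a sequence line that happens to contain a dot is left alone. The sample file below is made up for illustration (the header is shortened from the question, and the dotted sequence line is hypothetical); foo.fa and out.fa are the same placeholder names as in the command above.

```shell
# Hypothetical sample: one header from the question (shortened) and a
# made-up sequence line containing a "." to show it is left untouched.
cat > foo.fa <<'EOF'
>r2.2 |SOURCES={KEY=bf97e692...,fw,4892115-4892195}|ERRORS={76:T}
ACGT.ACGT
EOF

# Replace only the first "." on lines that start with ">".
sed '/^>/ s/\./\//' foo.fa > out.fa

# Quick look at the renamed headers.
grep '^>' out.fa
```

This relies on the first dot in each header being the one in the read name, which holds for the headers shown in the question.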