Question: adding a sequence to a genome
Assa Yeroslaviz (1.4k), Munich, wrote 8 weeks ago:

I would like to know whether there are tools or methods for inserting a specific sequence into a FASTA file at a specific position.

The goal is to modify a genome by adding a given sequence at a defined location.

thanks

Assa

modified 6 weeks ago • written 8 weeks ago by Assa Yeroslaviz (1.4k)

I have not tested it myself, but seqkit mutate (https://bioinf.shenwei.me/seqkit/usage/#mutate) seems to do that.

written 8 weeks ago by microfuge (1.8k)

Yes, it supports this, and you can choose which chromosomes/sequences to insert into.

written 6 weeks ago by shenwei356 (5.3k)
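
For readers who want to see what the operation itself involves, here is a minimal plain-Python sketch of inserting bases after a 1-based position in one chosen record. It is independent of seqkit and is not its implementation; the function names (read_fasta, insert_after, write_fasta) and the toy records are made up for illustration, and it assumes the sequence ID is the first word of the header line.

```python
def read_fasta(lines):
    """Parse FASTA text lines into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def insert_after(records, seq_id, position, insert):
    """Insert `insert` after the 1-based `position` of the record whose ID
    (first whitespace-delimited word of the header) equals `seq_id`.
    position=0 places the insert in front of the first base."""
    result = []
    for header, seq in records:
        words = header.split()
        if words and words[0] == seq_id:
            if not 0 <= position <= len(seq):
                raise ValueError("position outside sequence")
            seq = seq[:position] + insert + seq[position:]
        result.append((header, seq))
    return result


def write_fasta(records, width=60):
    """Format records as FASTA text, wrapping sequence lines at `width`."""
    out = []
    for header, seq in records:
        out.append(">" + header)
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"


# Toy input: insert TTT after base 4 of chr1, leave chr2 untouched.
example = [">chr1 demo", "AAAACCCC", ">chr2", "GGGG"]
modified = insert_after(read_fasta(example), "chr1", 4, "TTT")
print(write_fasta(modified))
```

This reads the whole genome into memory, which is fine for small or medium genomes; for anything large, a dedicated tool is the safer choice.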
Bastien Hervé (4.7k), Karolinska Institutet, Sweden, wrote 8 weeks ago:

I wrote this script for one of my projects some years ago. It lets you look at the bases at specific positions and also modify your genome, using a second FASTA file that describes the changes.

python lookmod_genome.py -m modify -f file.fasta -c changes_file.fasta

The changes_file.fasta should contain all the changes you want to make, using four keywords (insertion, deletion, add and remove), and should look like this:

>insertion:chr1:25:26
GCTAGCTAGC
>deletion:chr4:40:50
>add:your_chromosome_name
GTCGATCGTCATGGTT
>remove:your_chromosome_name

I have used it quite a lot, but it is self-made, so check the result with the look mode.

Use Python 2, or modify the script to make it Python 3 compatible.

modified 6 weeks ago • written 8 weeks ago by Bastien Hervé (4.7k)
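
To make the changes-file layout from the answer above concrete, here is a small sketch that parses such headers into records. It is not the code from lookmod_genome.py. The Change tuple, the parse_changes function and the field meanings are assumptions read off the example: insertion and deletion are taken to carry name:start:end, while add and remove carry only a name. How the script interprets the two coordinates of an insertion is not stated in the thread, so the sketch only parses them.

```python
from collections import namedtuple

# Hypothetical record type; the real script's internal structures are unknown.
Change = namedtuple("Change", "kind name start end sequence")

KEYWORDS = {"insertion", "deletion", "add", "remove"}


def parse_changes(lines):
    """Turn the lines of a changes file into a list of Change tuples."""
    changes = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split(":")
            kind = fields[0]
            if kind not in KEYWORDS:
                raise ValueError("unknown keyword: " + kind)
            if kind in ("insertion", "deletion"):
                # Assumed layout: keyword:chromosome:start:end
                if len(fields) != 4:
                    raise ValueError("expected keyword:name:start:end: " + line)
                name, start, end = fields[1], int(fields[2]), int(fields[3])
            else:
                # Assumed layout: keyword:chromosome_name
                if len(fields) < 2:
                    raise ValueError("expected keyword:name: " + line)
                name, start, end = ":".join(fields[1:]), None, None
            changes.append(Change(kind, name, start, end, ""))
        elif changes:
            # Sequence lines belong to the most recent header.
            last = changes[-1]
            changes[-1] = last._replace(sequence=last.sequence + line)
        else:
            raise ValueError("sequence line before any header")
    return changes


example = [
    ">insertion:chr1:25:26",
    "GCTAGCTAGC",
    ">deletion:chr4:40:50",
    ">add:your_chromosome_name",
    "GTCGATCGTCATGGTT",
    ">remove:your_chromosome_name",
]
for change in parse_changes(example):
    print(change)
```

Running a changes file through a check like this before modifying a genome catches misspelled keywords and non-numeric coordinates early.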

Thanks for the script. It does not seem to work for me, though.

I have tried all three options for lookmod_genome.py (see below). Only the look option works. I used a test FASTA file as input.

$ python lookmod_genome.py -m look -p GL988041:10:20 -g test -r test/test.fa

-----------------------------------------
Mode : look
Fasta file : test/test.fa
Position : GL988041:10:20
Surroundings length : 10
-----------------------------------------

----------------------
------   LOOK   ------
----------------------
-> GAAAAAAAAA**GCCGTGCCGT


$ python lookmod_genome.py -m modify -g test -r test/test.fa -c test/insertion.fa 

-----------------------------------------
Mode : modify
Fasta file : test/test.fa
Construction file : test/insertion.fa
Output file rename: test.fa
-----------------------------------------

Traceback (most recent call last):
  File "lookmod_genome.py", line 499, in <module>
    main(sys.argv[1:])
  File "lookmod_genome.py", line 474, in main
    for key, value in added_dict.iteritems():
AttributeError: 'dict' object has no attribute 'iteritems'


$ python lookmod_genome.py -m modify -g test -r test/test.fa -c test/insertion.fa -i test/output.txt

-----------------------------------------
Mode : modify
Fasta file : test/test.fa
Construction file : test/insertion.fa
Output file rename: test.fa
Output information File : test/output.txt
-----------------------------------------

Traceback (most recent call last):
  File "lookmod_genome.py", line 499, in <module>
    main(sys.argv[1:])
  File "lookmod_genome.py", line 474, in main
    for key, value in added_dict.iteritems():
AttributeError: 'dict' object has no attribute 'iteritems'

$ head test/test.fa 
>GL988041 dna:supercontig supercontig:CTHT_3.0:GL988041:1:6909506:1 REF
GAAAAAAAAAAAAAAAAGAGCCGTGCCGTAGCCCAGTTTTGAACTCTGAAGCCAGATCAG
ACGCGGGATGCAGGAGACCTGGGTGCGGGAGGTGCGGCAGCTGGCCCAAACGGTGCTGCA
GACCTTTTTGCATTGAAGCCCATTTTCACATCCTCTTTTCGTTCTTCCTCCGTCTCCTTC

Do the chromosome names need to have a certain structure? Are spaces or symbols allowed in the names? I can't figure out which dict it is looking for.

written 6 weeks ago by Assa Yeroslaviz (1.4k)
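
On the question of names with spaces: by common FASTA convention, many tools treat the first whitespace-delimited word of the header as the sequence ID and the rest as a free-text description. Whether lookmod_genome.py follows this convention is not confirmed in the thread. The sketch below only shows the convention itself, and fasta_id is a made-up helper name.

```python
def fasta_id(header_line):
    """Return the first whitespace-delimited word after '>' in a FASTA header."""
    words = header_line.lstrip(">").split()
    return words[0] if words else ""


header = ">GL988041 dna:supercontig supercontig:CTHT_3.0:GL988041:1:6909506:1 REF"
print(fasta_id(header))
```

Under this convention the ID of the test record is the part before the first space, which matches the name passed to -p in the look call above.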

Can you show the head of the test/insertion.fa file, please?

written 6 weeks ago by Bastien Hervé (4.7k)

I think I have solved it. My python command was calling python3; when I explicitly call the script with python2, the other two options work as well.

In Python 3, dict.iteritems() was removed; dict.items() is used instead.

If possible, could you update the script to work with Python 3?

thanks

written 6 weeks ago by Assa Yeroslaviz (1.4k)
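
A minimal sketch of the change discussed above, for anyone patching a local copy: replacing iteritems() with items() on the loop from the traceback runs on both Python 2 and Python 3. The contents of added_dict here are a made-up stand-in, since the real data structure is not shown in the thread, and other parts of the script may need further changes that this sketch does not cover.

```python
# Hypothetical stand-in for the script's added_dict (real contents unknown).
added_dict = {"your_chromosome_name": "GTCGATCGTCATGGTT"}

# Python 2 only; raises AttributeError on Python 3:
#     for key, value in added_dict.iteritems():

# Works on both Python 2 and Python 3.
collected = []
for key, value in added_dict.items():
    collected.append((key, len(value)))
    print("{0}\t{1}".format(key, len(value)))
```

On Python 2, items() builds a full list instead of a lazy iterator, which costs some memory for very large dictionaries but does not change the result.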