Finding differences between 2 fasta files
1
0
Entering edit mode
6.1 years ago
Jautis ▴ 320

Hello, I have a two fasta files that represent large sections of a chromosome and I am wondering if there's a way to tell what bases differ between the two. 

 

For example, a subsection could look like this:

Fasta 1:

>chr1 ATACGTGATCGAT.........AATCGT

Fasta 2:

>1 ATCCGTGATCGAT.........AATCGT

And I would want an output that says base 3 is different between the two. 

 

Any ideas or advice would be great!

fasta differences vcf • 2.4k views
ADD COMMENT
0
Entering edit mode
6.1 years ago

I wrote a blastn2snp: https://github.com/lindenb/jvarkit/wiki/BlastnToSnp

#query              hit                                                                                              hit-index  hsp-index  query-POS  hit-POS    STRAND  REF(hit)  ALT(query)  blast.align_length  blast.hit.var  blast.query.var  blast.mid.var
No definition line  Homo sapiens chromosome 6, alternate assembly CHM1_1.1                                           1          9          21         74567818   -       T         A           18                  A              T                .
No definition line  Homo sapiens chromosome 6, alternate assembly HuRef                                              2          9          21         71600901   -       T         A           18                  A              T                .
No definition line  Homo sapiens chromosome 6, GRCh37.p13 Primary Assembly                                           3          9          21         74401398   -       T         A           18                  A              T                .
No definition line  Homo sapiens chromosome 5, alternate assembly CHM1_1.1                                           4          1          7          107821121  -       A         G           28                  T              C                .
No definition line  Homo sapiens chromosome 5, alternate assembly CHM1_1.1                                           4          9          16         14262358   +       G         C           18                  G              C                .
No definition line  Homo sapiens chromosome 5, alternate assembly CHM1_1.1                                           4          13         8          132662461  -       T         C           18                  A              G                .
No definition line  Homo sapiens chromosome 5, alternate assembly CHM1_1.1                                           4          20         14         170329095  -       G         C           18                  C              G                .
No definition line  Homo sapiens chromosome 5, alternate assembly HuRef                                              5          1          7          103561224  -       A         G           28                  T              C                .
No definition line  Homo sapiens chromosome 5, alternate assembly HuRef                                              5          9          16         14234054   +       G         C           18                  G              C                .
No definition line  Homo sapiens chromosome 5, alternate assembly HuRef                                              5          13         8          128416747  -       T         C           18                  A              G                .
No definition line  Homo sapiens chromosome 5, alternate assembly HuRef                                              5          19         14         165993804  -       G         C           18                  C              G                .
No definition line  Homo sapiens chromosome 5, GRCh37.p13 Primary Assembly                                           6          1          7          108388040  -       A         G           28                  T              C                .
No definition line  Homo sapiens chromosome 5, GRCh37.p13 Primary Assembly                                           6          9          16         14262514   +       G         C           18                  G              C                .
No definition line  Homo sapiens chromosome 5, GRCh37.p13 Primary Assembly                                           6          13         8          133231874  -       T         C           18                  A              G                .
No definition line  Homo sapiens chromosome 5, GRCh37.p13 Primary Assembly                                           6          19         14         170896537  -       G         C           18                  C              G                .
No definition line  Homo sapiens chromosome 19, alternate assembly CHM1_1.1                                          8          13         18         54878835   -       A         C           18                  T              G                .
No definition line  Homo sapiens chromosome 19, alternate assembly CHM1_1.1                                          8          14         18         54999463   +       T         G           18                  T              G                .
No definition line  Homo sapiens chromosome 19, alternate assembly CHM1_1.1                                          8          15         21         55131801   -       G         A           18                  C              T                .
No definition line  Homo sapiens chromosome 19, alternate assembly HuRef                                             9          14         18         51207279   -       A         C           18                  T              G                .
No definition line  Homo sapiens chromosome 19, alternate assembly HuRef                                             9          15         18         51329261   +       T         G           18                  T              G                .
No definition line  Homo sapiens chromosome 19, alternate assembly HuRef                                             9          16         21         51461318   -       G         A           18                  C              T                .

 

ADD COMMENT

Login before adding your answer.

Traffic: 1538 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6