Help with mitochondrial sequences from GenBank!!!
8 weeks ago

Hi guys,

I'm in desperate need of help downloading mtDNA sequences from GenBank. It's so confusing... Up to now I was only using the ENA website, where it is possible to download a .tsv file containing all the FTP links.

Now I have a bunch of sequences (e.g. MF991448) that you can easily find on GenBank, but they seem extremely hard to get hold of in order to run some analyses on them.

Please, if you know something, I would be glad to hear from you!

ftp genbank mitochondrial • 309 views
"... but seems extremely hard to get in order to conduct some analyses on them."

What is "hard to get"?

Pardon, I mean the .bam file associated with the sequences.

8 weeks ago
GenoMax 104k

You are not going to find a reverse association from sequences in GenBank (which is a nucleotide database) to high-throughput sequencing datasets. If you are looking for mitochondrial genome datasets in SRA, you can use a tool such as sra-explorer to find such data (see "sra-explorer : find SRA and FastQ download URLs in a couple of clicks"). You are not likely to find BAM files for most datasets there; the data will primarily be in FASTQ format, so you will need to align the reads and create the BAM files yourself.
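If all you need is the GenBank nucleotide record itself (the assembled sequence, not the reads or a BAM), here is a minimal Python sketch using the NCBI E-utilities efetch endpoint. The function names `efetch_url` and `fetch_record` are made up for illustration, and the accession is just the one from the question.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities efetch endpoint.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession, rettype="fasta"):
    """Build an efetch URL for one record in the nucleotide (nuccore) database."""
    query = urlencode({
        "db": "nuccore",
        "id": accession,
        "rettype": rettype,  # "fasta" for sequence only, "gb" for the full flat file
        "retmode": "text",
    })
    return f"{EFETCH}?{query}"


def fetch_record(accession, rettype="fasta"):
    """Download the record as text. Needs network access."""
    with urlopen(efetch_url(accession, rettype)) as response:
        return response.read().decode("utf-8")


# Example (needs network access):
# print(fetch_record("MF991448")[:200])
```

This only returns what was deposited in GenBank; it does not lead you to any raw sequencing data behind the record.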

Thanks a lot, but I couldn't find the exact sequence I gave as an example, which is a human mitochondrial DNA sequence.

People select a reference of their choice when they do alignments. Generally, alignments are not submitted to public databases such as NCBI/ENA; only the original sequence data is.

What is it that you want/need to do? Perhaps you are taking the wrong approach and we can suggest alternatives.

Yeah, you might be right. So, I will explain a bit more.

You can simply put those links in a file (one link per line) and then use wget -i file_with_links.
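If wget is not available, roughly the same thing can be done in Python. This is only a sketch: `parse_links`, `local_name` and `download_all` are hypothetical helper names, and prepending `ftp://` to links that have no scheme is an assumption based on how ENA usually lists its paths.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve


def parse_links(text):
    """Return one URL per non-empty line of the links file content."""
    links = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Assumption: a link without a scheme is an ENA-style FTP path.
        if "://" not in line:
            line = "ftp://" + line
        links.append(line)
    return links


def local_name(url):
    """Use the last path component of the URL as the local file name."""
    return Path(urlparse(url).path).name


def download_all(links_file, out_dir="."):
    """Download every link listed in links_file into out_dir. Needs network access."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for url in parse_links(Path(links_file).read_text()):
        urlretrieve(url, out / local_name(url))


# Example (needs network access):
# download_all("file_with_links")
```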

I know that already; in fact, I did it for the mtDNA file I got from the ENA website in the pic above.

Unfortunately, I don't know how to get those FTP links from GenBank. It seems impossible to find them anywhere...

Why do you need to get them from GenBank? The GenBank/ENA/DDBJ databases are in sync as far as submitted sequence data goes. The links you show above have ERR* numbers, meaning the data was originally submitted to EBI/ENA.

If you must have the links from NCBI, then see this for an example: https://trace.ncbi.nlm.nih.gov/Traces/sra/?run=ERR2181263. If you look at the Data access tab you will see links for NCBI (as well as EBI/ENA). All SRA data is now hosted in the cloud.
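For the EBI/ENA side, the FASTQ links for a run can also be pulled programmatically from the ENA Portal API file report. A hedged sketch: `filereport_url` and `fastq_links` are illustrative names, and the code assumes the `fastq_ftp` column holds semicolon-separated paths without a scheme, which is how ENA reports usually look.

```python
import csv
import io
from urllib.parse import urlencode
from urllib.request import urlopen

# ENA Portal API file report endpoint.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(accession):
    """Build a file report URL asking for the FASTQ FTP paths of a run."""
    query = urlencode({
        "accession": accession,
        "result": "read_run",
        "fields": "run_accession,fastq_ftp",
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


def fastq_links(tsv_text):
    """Extract FASTQ URLs from the TSV text of a file report."""
    links = []
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        # Assumption: paths are ';'-separated and lack the ftp:// prefix.
        for path in (row.get("fastq_ftp") or "").split(";"):
            if path:
                links.append("ftp://" + path)
    return links


# Example (needs network access):
# with urlopen(filereport_url("ERR2181263")) as response:
#     for link in fastq_links(response.read().decode("utf-8")):
#         print(link)
```

The printed links can go straight into a file (one per line) for wget -i.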

Thanks a lot! I think the website that you linked should do.

I actually didn't know that all the DBs are now in the cloud... I'm new to the field and I guess nobody where I work knew that either.

I typed the sequence I was looking for into the ENA website and it worked as well. Thanks again.

Matteo