Help with mitochondrial sequences from GenBank!!!
2.9 years ago
Matteo Ungaro ▴ 100

Hi guys,

I'm in desperate need of help downloading mtDNA sequences from GenBank. It's so confusing... Up to now I have only used the ENA website, where it is possible to download a .tsv file containing all the FTP links.

Now I have a bunch of sequences (e.g. MF991448) that are easy to find on GenBank but seem extremely hard to get in order to run some analyses on them.

Please, if you know something, I would be glad to hear from you!

Thanks in advance, Matteo

ftp genbank mitochondrial • 1.3k views

"...GenBank but seems extremely hard to get in order to conduct some analyses on them."

What is "hard to get"?


Pardon, I mean the .bam file associated with the sequences.

2.9 years ago
GenoMax 141k

You are not going to find a reverse association from sequences in GenBank (which is a nucleotide database) to high-throughput sequencing datasets. If you are looking for mitochondrial genome datasets in SRA, you can use a tool such as sra-explorer to find such data (see "sra-explorer: find SRA and FastQ download URLs in a couple of clicks"). You are not likely to find BAM files for most datasets there; data will primarily be in FASTQ format. You will need to align the data and create the BAM files yourself.
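
As a rough illustration of the "align the data and create the BAM files yourself" step, here is a minimal sketch that only builds the command lines; it does not run them. It assumes bwa and samtools are installed, which is one common choice and not the only one. The file names (rCRS.fasta, sample_1.fastq.gz, sample_2.fastq.gz, sample.bam) are hypothetical placeholders.

```python
import shlex


def build_alignment_commands(reference_fasta, fastq_files, out_bam, threads=4):
    """Return shell commands that index a reference, align reads, and sort/index the BAM.

    Assumption: bwa and samtools are the chosen tools; other aligners work too.
    """
    ref = shlex.quote(reference_fasta)
    reads = " ".join(shlex.quote(name) for name in fastq_files)
    bam = shlex.quote(out_bam)
    return [
        # Build the bwa index for the chosen mitochondrial reference.
        f"bwa index {ref}",
        # Align the reads and pipe the SAM output straight into a sorted BAM.
        f"bwa mem -t {threads} {ref} {reads} | samtools sort -o {bam} -",
        # Index the sorted BAM so downstream tools can access it by position.
        f"samtools index {bam}",
    ]


# Hypothetical file names, for illustration only.
commands = build_alignment_commands(
    "rCRS.fasta", ["sample_1.fastq.gz", "sample_2.fastq.gz"], "sample.bam"
)
for command in commands:
    print(command)
```

The printed lines can be pasted into a shell or saved as a script once the reference and FASTQ files are in place.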


Thanks a lot, but I couldn't find the exact sequence I gave as an example, which is a human mitochondrial DNA.


People select a reference of their choice when they do alignments. Generally, alignments are not submitted to public databases such as NCBI/ENA; only the original sequence data is.

What is it that you want/need to do? Perhaps you are taking the wrong approach and we can suggest alternatives.
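
If what is needed is the GenBank record itself (the assembled sequence, not an alignment), a minimal sketch using the NCBI E-utilities efetch endpoint could look like this. The helper names efetch_url and fetch_record are made up for illustration, and nothing is downloaded until fetch_record is actually called.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession, rettype="fasta"):
    """Build an efetch URL for a nucleotide record (rettype 'fasta' or 'gb')."""
    query = urlencode(
        {"db": "nuccore", "id": accession, "rettype": rettype, "retmode": "text"}
    )
    return f"{EFETCH}?{query}"


def fetch_record(accession, rettype="fasta"):
    """Download the record as text. Requires network access."""
    with urlopen(efetch_url(accession, rettype)) as response:
        return response.read().decode("utf-8")


# Only prints the URL; call fetch_record("MF991448") to download the sequence.
print(efetch_url("MF991448"))
```

The printed URL can also be passed to wget directly.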


Yeah, you might be right. So, I will explain a bit more.

I need to download the .bam files for GenBank accession codes of mtDNA sequences. The link for downloading (with the wget command in Linux) should look something like the ones in the picture below.

[Image: mt sequences from ENA website]


You can simply put those links in a file (one link per line) and then use wget -i file_with_links.
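
For building that file from an ENA .tsv report, a minimal sketch follows. It assumes the report has link columns named submitted_ftp and/or fastq_ftp, with several paths separated by semicolons and no ftp:// prefix; check the header of your own file. The host in the sample data is a made-up placeholder.

```python
import csv
import io


def extract_ftp_links(tsv_text, columns=("submitted_ftp", "fastq_ftp")):
    """Collect download links from the given columns of an ENA-style TSV report."""
    links = []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        for column in columns:
            # A cell may be missing, empty, or hold several ';'-separated paths.
            for path in (row.get(column) or "").split(";"):
                path = path.strip()
                if not path:
                    continue
                if "://" not in path:
                    path = "ftp://" + path  # assumed: paths lack a scheme
                links.append(path)
    return links


def write_links_file(links, filename="file_with_links"):
    """Write one link per line, the layout that wget -i expects."""
    with open(filename, "w") as handle:
        handle.writelines(link + "\n" for link in links)


# Made-up sample report, for illustration only.
sample = (
    "run_accession\tsubmitted_ftp\n"
    "RUN1\tftp.example.org/a.bam;ftp.example.org/a.bam.bai\n"
    "RUN2\t\n"
)
print(extract_ftp_links(sample))
```

Calling write_links_file(extract_ftp_links(text)) on the real report text then gives a file that can be used with wget -i file_with_links.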


I know that already; in fact, I did it for the mtDNA file I got from the ENA website in the picture above.

Unfortunately, I don't know how to get those FTP links from GenBank; it seems impossible to find them anywhere...


Why do you need to get them from GenBank? GenBank/ENA/DDBJ databases are in sync as far as submitted sequence data goes. The links you show above have ERR* numbers, meaning the data was originally submitted to EBI/ENA.

If you must have the links from NCBI, see this for an example: https://trace.ncbi.nlm.nih.gov/Traces/sra/?run=ERR2181263 If you look at the Data access tab, you will see links for NCBI (as well as EBI/ENA). All SRA data is now hosted in the cloud.
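
On the EBI/ENA side, the same kind of link list can be requested per run accession. The sketch below builds a file report URL for the ENA Portal API. The field names (run_accession, fastq_ftp, submitted_ftp) are assumed to match the columns of the ENA .tsv download, and the helper name filereport_url is made up for illustration.

```python
from urllib.parse import urlencode

ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(accession, fields=("run_accession", "fastq_ftp", "submitted_ftp")):
    """Build a URL that returns a TSV file report for a run, study, or sample accession."""
    query = urlencode(
        {
            "accession": accession,
            "result": "read_run",
            "fields": ",".join(fields),
            "format": "tsv",
        },
        safe=",",  # keep the commas in the field list readable
    )
    return f"{ENA_FILEREPORT}?{query}"


# Prints the URL only; fetch it with wget or a browser to get the TSV.
print(filereport_url("ERR2181263"))
```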


Thanks a lot! I think the website that you linked should do.

I actually didn't know that all DBs are now in the cloud... I'm new to the field, and I guess nobody where I work knew that either.

I typed the sequence I was looking for into the ENA website and it worked as well. Thanks again.

Matteo
