5.2 years ago
netipandey.87
Hello. Can anyone tell me how to maintain the consistency of the initial (raw) data downloaded from the NCBI site? The bacterial genus I am working on has assemblies at all levels (complete genome, scaffold, contig), so how do I ensure the consistency of this raw data before further processing? Can you point me to a paper on this? I have gone through a lot of papers, but none of them mention how to maintain the consistency of raw data. Regards.
What exactly do you mean by that?
In general, raw data should be immutable: you don't make any changes to the contents (see below). Any derived analysis you do should be stored in new files/containers. If you are referring to tracking the data, one thing to consider is relabeling the files so that their names make intuitive sense, e.g. rename sequence files to the form accession.fa from sequence.fasta (the name they generally get if you download via a browser from NCBI).
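One way to put the advice above into practice is sketched below in Python. This is only a rough illustration, not an established NCBI workflow: the function names (archive_raw_download, verify_raw), the raw_data directory layout and the checksums.sha256 manifest file are all assumptions made for the example. It copies a browser download to accession.fa, records a SHA-256 checksum, and marks the copy read-only so it cannot be edited by accident.

```python
import hashlib
import os
import shutil
import stat
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_raw_download(downloaded, accession, raw_dir):
    """Copy a download to raw_dir/<accession>.fa and make it read-only.

    Hypothetical helper: the naming scheme and the manifest file name
    are assumptions for this sketch.
    """
    raw_dir = Path(raw_dir)
    raw_dir.mkdir(parents=True, exist_ok=True)
    target = raw_dir / f"{accession}.fa"
    if target.exists():
        # Never overwrite raw data that has already been archived.
        raise FileExistsError(f"{target} already exists")
    shutil.copy2(downloaded, target)
    checksum = sha256_of(target)
    # Append to a manifest in the "<digest>  <filename>" layout.
    with open(raw_dir / "checksums.sha256", "a") as manifest:
        manifest.write(f"{checksum}  {target.name}\n")
    # Remove write permission so the raw copy stays unchanged.
    os.chmod(target, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    return target, checksum


def verify_raw(raw_dir):
    """Return the names of files whose current checksum differs from the manifest."""
    raw_dir = Path(raw_dir)
    changed = []
    with open(raw_dir / "checksums.sha256") as manifest:
        for line in manifest:
            recorded, name = line.rstrip("\n").split("  ", 1)
            if sha256_of(raw_dir / name) != recorded:
                changed.append(name)
    return changed
```

Calling archive_raw_download("sequence.fasta", "<accession>", "raw_data") once per download and running verify_raw("raw_data") before each analysis would flag any raw file that has changed since it was archived. Derived results would then go into a separate directory.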