I am currently submitting sequencing data (raw reads) for multiple projects to SRA. For the first submission, I followed this process:
- Create a BioProject with my personal NCBI account.
- Create a group and invite members (PI, wet-lab student).
- Create and submit the BioSamples and the SRA submission.
This went fine, but there are some issues:
- I am now the single project admin (I can't change that on the page).
- The BioSamples belong to the group, but the SRA submission and the BioProject do not (no idea why).
- It seems that one can't create multiple groups (one per submission project, since different people are involved in each).
Ideally, I would like to have the PI as the first point of contact, and add myself and others as admins as needed. This is mainly to avoid the bus factor, but also because, if I leave the group, the PI should ultimately be responsible for all submissions.
Questions:
- What is the best way to do it? Create an umbrella account for the group? Proceed as before and ask the SRA staff to change the group permissions manually?
- How do you do it in your group? I am particularly interested in the opinions of (embedded) bioinformaticians and core facility members.
GEO has some guidelines for this situation, but I don't think they are entirely applicable to SRA/BioProjects.
Have you considered emailing SRA support to see what their formal policy is? Others have probably come across this before you. You may get some suggestions here, but if they are not directly applicable you may have to readjust your strategy later. It may take a business day or more, but NCBI staff respond to all tickets.
I am in contact with the staff at the moment because I can't seem to set permissions in the submission group, and I will ask about that as well. Nonetheless, if this subject warrants an entry in GEO's FAQs, I am surprised it is not common enough to warrant a mention in the SRA/BioProject documentation as well. As for the general strategy I am asking about, GEO suggests three different ways of going about it. It would be useful to know which one is most commonly used and why.
Got it. Until it becomes common practice to "release" data to investigators via SRA, I would think #2 is the best option for a core. This way you are in control of the process through the end of submission, at which point control and long-term responsibility are transferred to the actual PI. From that point of view, option #3 does not seem appropriate for a core. #1 may be OK, but then you are doing the additional work of creating accounts for other users (not sure what happens if they already have an account).
What are you doing about metadata? Sometimes that is the critical piece of information that makes or breaks an SRA record in terms of being usable in the future.
Yes, #2 also seemed to me the most reasonable and efficient option until I stumbled across the issue of not being able to change permissions. This is probably temporary.
Not sure if this answers your question, but all the metadata is added in the forms we submit when creating (i) the BioProject, (ii) the BioSample, and finally (iii) the SRA submission itself. I am trying to add as much information as possible when preparing these, and to follow the guidelines as closely as possible. Is there anything else we can do, or anything we are missing? I actually quite like the BioProject/BioSample/SRA structure.
That sounds great. Having this information match what actually ends up in the publication can save someone a lot of time/frustration.
Yes, I have been there. Looking for total (rRNA-depleted) RNA-seq datasets was a nightmare a few years ago. I might be wrong, but it also looks like the submission forms have improved over the years, and there are now many very specific options for library type, for example.