Hello helpers, I just got raw reads from 10x Genomics and have started trimming and assembly. I'm new to this technology and have some basic questions below. I want to use Trim Galore for trimming first, and I need to clarify these points so that I can choose the proper options.
1. I only have one library. In the FASTQ file, what does the first line mean: @K00162:247:HNJVCBBXX:5:1101:1357:1314 1:N:0:NCTCGTTT ? How can I tell if the sequences are from different samples? I guess if they are all from the same species, I don't need to demultiplex?
2. Are the adapter sequences used by 10x Genomics the same as Illumina's?
3. Does 10x Genomics use ASCII+33 (Phred+33) quality scores?
Thanks a lot for your help! Nunu
You may want to use one of the 10x Genomics software pipelines to do the analysis.
What kind of data is this? Here is an example of pipelines for single cell RNAseq data. You can find others at that same site.
Answers for your questions:
1 - Samples may already have been demultiplexed. Did you get a folder per sample with multiple files in it? 10x uses 4 Illumina indexes per sample. They also output index reads in a separate file.
2 - Yes.
3 - Yes.
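To make answers 1 and 3 concrete: the header in the question follows the standard Illumina layout (instrument, run number, flowcell ID, lane, tile, x, y, then read number, filter flag, control number, index sequence), and Phred+33 means each quality character's ASCII code minus 33 is the score. Here is a minimal Python sketch of both. The names `FastqHeader`, `parse_illumina_header` and `phred33_scores` are made up for illustration and are not part of any 10x or Illumina tool.

```python
from typing import List, NamedTuple


class FastqHeader(NamedTuple):
    instrument: str
    run_number: int
    flowcell_id: str
    lane: int
    tile: int
    x_pos: int
    y_pos: int
    read_number: int      # 1 or 2 for paired-end data
    is_filtered: bool     # True if the read FAILED the chastity filter ("Y")
    control_number: int
    index_sequence: str   # sample index read, e.g. NCTCGTTT


def parse_illumina_header(line: str) -> FastqHeader:
    """Split a standard Illumina FASTQ header line into named fields."""
    if not line.startswith("@"):
        raise ValueError("FASTQ header lines start with '@'")
    left, right = line[1:].strip().split(" ", 1)
    instrument, run, flowcell, lane, tile, x_pos, y_pos = left.split(":")
    read_number, filtered, control, index = right.split(":")
    return FastqHeader(
        instrument, int(run), flowcell, int(lane), int(tile),
        int(x_pos), int(y_pos), int(read_number),
        filtered == "Y", int(control), index,
    )


def phred33_scores(quality_line: str) -> List[int]:
    """Convert a Phred+33 quality string into integer quality scores."""
    return [ord(char) - 33 for char in quality_line.strip()]


header = parse_illumina_header(
    "@K00162:247:HNJVCBBXX:5:1101:1357:1314 1:N:0:NCTCGTTT"
)
print(header.lane, header.read_number, header.index_sequence)
print(phred33_scores("I#"))
```

The last field of the header is the index read for that cluster, so different samples on the same lane would show up as different index sequences there.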
Edit: As pointed out by @Igor, I overlooked the assembly mention in the original post. Apologies for that. @Igor links to the 10x assembler in a comment below.
Note: If you do not use the 10x assembler then you may lose information about linked reads etc., which is one of the reasons why you do 10x in the first place.
supernova also produces diploid assemblies.
The original post mentioned assembly, so maybe this is about De Novo Assembly.
Either way, 10x Genomics files are organized in a unique way, so following generic tutorials based on standard Illumina data will be confusing.
Hi, thanks a lot for the information. Yes, I'm doing de novo assembly. I have checked this website, but it seems that they only recommend Supernova for diploid species, and mine is not diploid. So I have to try other assembly pipelines. The target genome is around 1 Gb. For now, I want to use Trim Galore for trimming, followed by ABySS. Any comments? Thanks
I would try Supernova first to see what you get. That would serve as a good control.
If you use Trim Galore and ABySS, you discard the linked-read barcode information that gives 10x data its long-range context, which is the benefit of 10x Genomics technology over standard short-read sequencing.
Thanks a lot! My data is genomic data from a protozoan, so I can't follow single cell RNA-seq pipelines. I only have one folder, since it was sequenced a long time ago by another technician. I used the following to check the index file
So what does this have to do with my trimming and assembly?
Looks like only the first 4 are valid 10x indexes, corresponding to code SI-GA-C12.
Thanks, do you know what I should do with these indexes? Also, do I need to figure out the ploidy of my target species first in order to choose an assembly software?
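For anyone who wants to repeat this kind of index check, a tally of the index sequences can be sketched in Python as below. This is not the command used earlier in the thread. It assumes the index sequence is the last colon-separated field of each header line, and the function names and the file name in the usage comment are hypothetical.

```python
import gzip
from collections import Counter
from itertools import islice
from typing import Iterable


def count_index_sequences(lines: Iterable[str]) -> Counter:
    """Tally the index sequence found at the end of each FASTQ header line."""
    counts: Counter = Counter()
    # A FASTQ record is 4 lines long, so every 4th line (starting at 0) is a header.
    for header in islice(lines, 0, None, 4):
        counts[header.rstrip().rsplit(":", 1)[-1]] += 1
    return counts


def count_index_sequences_in_file(path: str) -> Counter:
    """Same tally, reading from a plain or gzip-compressed FASTQ file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return count_index_sequences(handle)


# Usage (hypothetical file name):
# for index, n in count_index_sequences_in_file("reads_R1.fastq.gz").most_common(10):
#     print(index, n)
```

With a single sample, the valid 10x indexes should dominate the counts, and the remaining low-count sequences are typically index reads with sequencing errors.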
I would still suggest trying supernova first to see what you get. If you don't want to do that then we are in uncharted territory. You could potentially treat the entire dataset as a normal paired-end dataset and then use an assembler of your choice. Even though there are multiple indexes you just have one sample.
Thanks. I will use Supernova. Do I need to do trimming first? In my folder there are 3 files: one index file and a pair of paired-end read files. I only have one sample, so I don't need to care about the index file and can just proceed with the paired-end reads. Is this correct? After checking the Supernova documentation, it looks like I will just need to use supernova run and supernova mkoutput? Thanks a lot.
10x provides extensive documentation on how to run Supernova: https://support.10xgenomics.com/de-novo-assembly/software/pipelines/latest/using/fastq-input