Question: Sample names of trimmed 16S file from HMP
11 days ago, vasanth.chandrasekhar wrote:

I downloaded a trimmed 16S file for a single patient from the HMP DACC. The body site was stool and the study was 16S_PP1. I then picked OTUs from the file with QIIME using:

pick_closed_reference_otus.py -i ./SRS075963.fsa  -o $PWD/closed_reference_otu

I get a biom file with 12467 sample names, which is not what I expected. Since I'm looking at one patient, I was expecting to see a single sample name.

When I look at the sequence file I got from HMP, these are the names on the sequences:

>GKLCT6U01ADS6O_cs_nbp_rc 
>GKLCT6U01ERG3G_cs_nbp_rc 
>GKLCT6U01DHALW_cs_nbp_rc
>GKLCT6U01DV7XB_cs_nbp_rc
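
As a quick check, a minimal Python sketch can tally how many distinct names would result if the sample name were taken as the text before the first underscore of each header. That splitting rule is an assumption made for illustration, not confirmed QIIME behavior, and `header_prefixes` is a hypothetical helper name.

```python
from collections import Counter


def header_prefixes(fasta_lines, delim="_"):
    """Count the text before the first delimiter in each FASTA header line."""
    counts = Counter()
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # skip sequence lines
        fields = line[1:].split()
        if not fields:
            continue  # skip empty headers
        seq_id = fields[0]  # drop any description after whitespace
        # Assumption: the candidate sample name is everything before the first delimiter.
        counts[seq_id.split(delim, 1)[0]] += 1
    return counts


headers = [
    ">GKLCT6U01ADS6O_cs_nbp_rc ",
    ">GKLCT6U01ERG3G_cs_nbp_rc ",
    ">GKLCT6U01DHALW_cs_nbp_rc",
    ">GKLCT6U01DV7XB_cs_nbp_rc",
]
prefixes = header_prefixes(headers)
print(len(prefixes))  # number of distinct candidate sample names
```

Under that assumption, every header shown here yields its own name, which would be consistent with seeing roughly one sample name per read.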

The first 9 characters, "GKLCT6U01", and the last 10 characters, "_cs_nbp_rc", are conserved across all the names. The only difference between them is the five characters before the first underscore, e.g. "ADS6O" and "ERG3G". This leads me to believe that these five characters are sequence IDs and that the names need an underscore after "GKLCT6U01", so the renamed labels would be:

>GKLCT6U01_ADS6O_cs_nbp_rc 
>GKLCT6U01_ERG3G_cs_nbp_rc 
>GKLCT6U01_DHALW_cs_nbp_rc
>GKLCT6U01_DV7XB_cs_nbp_rc
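
The relabeling described above could be sketched in Python roughly as follows. This is only an illustration of the proposed edit, not a confirmed fix: the fixed prefix length of 9, the function names, and the output file name are all assumptions.

```python
def insert_sample_delimiter(header, prefix_len=9, delim="_"):
    """Insert `delim` after the first `prefix_len` characters of a FASTA header ID."""
    if not header.startswith(">"):
        return header  # not a header line, leave untouched
    seq_id = header[1:]
    if len(seq_id.strip()) <= prefix_len:
        return header  # too short to split, leave untouched
    # Assumption: the first `prefix_len` characters are the shared sample prefix.
    return ">" + seq_id[:prefix_len] + delim + seq_id[prefix_len:]


def relabel_fasta(in_path, out_path, prefix_len=9):
    """Copy a FASTA file, rewriting only the header lines."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.startswith(">"):
                line = insert_sample_delimiter(line, prefix_len)
            dst.write(line)


print(insert_sample_delimiter(">GKLCT6U01ADS6O_cs_nbp_rc"))
```

Writing to a new file, for example `relabel_fasta("SRS075963.fsa", "SRS075963.relabeled.fsa")` (the output name is hypothetical), keeps the original download unchanged.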

I want to know if this is a correct assumption and whether editing the file this way is appropriate.

Thanks for helping
