4.3 years ago
annaA
Hello,
I am working with a dataset that contains several categorical variables in the metadata (sample_data). I want to merge some replicates, and in later steps I may need to merge other samples too. To do so I am using the merge_samples function as follows, where f_data is the original phyloseq object and "sample" is the sample ID (replicates share the same ID):
test <- merge_samples(x=f_data, group="sample")
The metadata before merging looks like this:
Sample Data: [237 samples by 15 sample variables]:
      group  sample  ring_no    species          sex     rearing_nest  family_membership  sampling   index_1  index_2
G1    G1     FD001   White247   Bengalese_Finch  Male    VS13          Father             Fostering  S504     N708
G10   G10    FD026   Grey1304   Zebra_Finch      Male    KE05          Father             Fostering  S507     N712
G100  G100   FD250   Grey1302   Zebra_Finch      Male    KE03          Father             Day_10     S503     N706
G101  G101   FD252   Grey1302   Zebra_Finch      Male    KE03          Father             Day_100    S504     N712
G102  G102   FD256   Grey1322   Zebra_Finch      Female  KE03          Mother             Day_35     S507     N710
G103  G103   FD270   Silver179  Bengalese_Finch  Female  VS13          Mother             Day_10     S508     N705
and after merging like this:
Sample Data: [234 samples by 16 sample variables]:
       group  sample  ring_no  species  sex  rearing_nest  family_membership  sampling  index_1  index_2  breading_no
FD001  NA     NA      NA       NA       NA   NA            NA                 NA        NA       NA       1.0
FD002  NA     NA      NA       NA       NA   NA            NA                 NA        NA       NA       1.0
FD003  NA     NA      NA       NA       NA   NA            NA                 NA        NA       NA       1.0
FD004  NA     NA      NA       NA       NA   NA            NA                 NA        NA       NA       1.0
FD005  NA     NA      NA       NA       NA   NA            NA                 NA        NA       NA       1.0
FD006  NA     NA      NA       NA       NA   NA            NA                 NA        NA       NA       1.0
FD007  NA     NA      NA       NA       NA   NA            NA                 NA        NA       NA       1.0
FD008  NA     NA      NA       NA       NA   NA            NA                 NA        NA       NA       1.0
FD009  NA     NA      NA       NA       NA   NA            NA                 NA        NA       NA       1.0
FD026  NA     NA      NA       NA       NA   NA            NA                 NA        NA       NA       1.0
FD027  NA     NA      NA       NA       NA   NA            NA                 NA        NA       NA       1.0
FD028  NA     NA      NA       NA       NA   NA            NA                 NA        NA       NA       1.0
FD029  NA     NA      NA       NA       NA   NA            NA                 NA        NA       NA       1.0
I would appreciate any help in solving this problem.
A.
Hi,
Sorry, can you edit your post? I can't read the table because it is not formatted properly.
Just to confirm, in the metadata file provided to phyloseq, do you have a column named sample? Can you run str(sample_data(f_data)) and post the result here?
This may be related to the fact that the sample variable in your sample_data(f_data) is not a factor. If that is the case, you can pass a factor to the group argument of merge_samples() instead.
This should work. Let me know if it worked.
António
Hey, thanks for your reply.
So the output of str(sample_data(f_data)) is the following
I ran the code you suggested, but I still get this warning message and the NAs in the metadata
Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized.
Sure! Next time I'll do it right.
So, from what I see in the metadata table before merging your phyloseq object, the variable sample only holds distinct sample names.
What do you want to merge? I believe it is different replicates, right? If so, you need one factor variable in your metadata file that has the same value for all replicates that belong to the same sample. It is not clear to me from your example that you have such a variable. Just to be sure.
In this case, you have a phyloseq object by Sample and you want to merge by Group, i.e., merge sample replicates per control versus test condition. Are you sure that this information is recorded accordingly in your metadata?
If it is, I cannot understand the error.
António
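For anyone checking the same thing, here is a minimal base R sketch of how one might confirm that a grouping column really repeats across replicates. The data frame meta is a made-up stand-in for data.frame(sample_data(f_data)); its values are invented for illustration and are not the poster's data.

```r
# Toy stand-in for data.frame(sample_data(f_data)); values are invented.
meta <- data.frame(
  sample = c("FD001", "FD002", "FD002", "FD002", "FD003"),
  row.names = c("G1", "G2", "G3", "G4", "G5"),
  stringsAsFactors = FALSE
)

# Count how many rows (replicates) carry each sample ID.
replicate_counts <- table(meta$sample)

# IDs that occur more than once are the ones a merge would collapse.
replicated_ids <- names(replicate_counts)[replicate_counts > 1]

# Number of samples to expect after merging by this column.
n_after_merge <- length(replicate_counts)

print(replicated_ids)
print(n_after_merge)
```

If every ID occurred only once, merging by that column would leave the number of samples unchanged.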
In my data there are 3 samples with the same sample name (i.e. the same value of "sample"), let's say FDR0033. So I thought that merging them by "sample" would work. But from what you are saying, this is not correct. So I need to add a new variable "x" in which I give the same name to these 3 samples only (?) and then merge the object by "x"? Am I right? Sorry if this is a really stupid question; it is the first time I am dealing with this kind of analysis.
A.
So, in that case it seems that you're doing everything right.
I just tested with the GlobalPatterns data that comes with phyloseq and it works fine.
As you see, it works just fine. Which phyloseq version are you using?
António
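For readers who want to reproduce a test of this kind, a minimal sketch follows. It assumes the Bioconductor package phyloseq is installed and merges by the SampleType column of GlobalPatterns; that choice of grouping variable is an assumption, since the code used in the post is not shown. The last line is one possible way to restore the grouping labels and is not necessarily what anyone in the thread did.

```r
# Assumes the Bioconductor package phyloseq is installed.
library(phyloseq)

# Example dataset shipped with phyloseq.
data(GlobalPatterns)

# Merge all samples that share a SampleType; OTU abundances are summed per group.
# Coercion warnings for text metadata are silenced here for readability.
merged <- suppressWarnings(merge_samples(GlobalPatterns, group = "SampleType"))

# Compare the number of samples before and after merging.
print(nsamples(GlobalPatterns))
print(nsamples(merged))

# The merged samples are named after the levels of the grouping variable,
# so the grouping column can be restored from the new sample names.
sample_data(merged)$SampleType <- sample_names(merged)
print(sample_names(merged))
```

Other categorical columns in the merged sample_data may still come back as numbers or NA; only the grouping column is restored here.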
Yeah, but do you see that some of the variables are changed? For example, the variable "primer" in the new object is converted to numeric values.
I am using version 1.32.0.
A.
Yes, I understand what you're saying. But in this case it makes sense. I mean, if you have different primers for different samples and you merge the samples, the primer information is useless; you cannot merge it.
My version is 1.30.0. Can you try to downgrade?
António
Dear annaA, did you solve this problem? I have the exact same issue that I'm trying to fix now, which is that NAs appear in sample variables after merge_samples. Thanks a lot for any kind of help in this regard.
I noticed this started happening after updating R to version 4.0 or higher. I believe it has something to do with the base R change of no longer automatically importing strings as factors.
Converting the columns to factor variables gets me about halfway there - they are no longer NAs, but they remain encoded as integers. The old workaround to then reassign the factor labels is no longer working for reasons I can't quite figure out.
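A small base R sketch may help illustrate the behaviour described above. It rests on the assumption that merge_samples coerces each metadata column to numeric before averaging it per group, which is consistent with the NAs and integer codes reported in this thread but is not confirmed here. The toy data frame and its values are invented.

```r
# Toy metadata; column names mimic the thread, values are made up.
meta <- data.frame(
  sample  = c("FD001", "FD001", "FD002"),
  species = c("Zebra_Finch", "Zebra_Finch", "Bengalese_Finch"),
  stringsAsFactors = FALSE  # the default since R 4.0.0
)

# Character columns cannot be turned into numbers: every value becomes NA.
as_char <- suppressWarnings(as.numeric(meta$species))

# Factor columns can, but only as their internal integer codes.
species_f <- factor(meta$species)
as_fac <- as.numeric(species_f)

# Averaging the codes per sample ID, as a numeric merge would do.
codes_by_sample <- tapply(as_fac, meta$sample, mean)

# Map the averaged codes back to labels. This is only meaningful when all
# replicates of a sample share the same value, so the mean is a whole number.
labels_by_sample <- levels(species_f)[codes_by_sample]

print(as_char)
print(codes_by_sample)
print(labels_by_sample)
```

When replicates of one sample differ in a categorical variable, the averaged code no longer corresponds to any single level, so relabelling in this way would be misleading.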