Load the packages and the example data.

    library("phyloseq"); packageVersion("phyloseq") # '1.22.3'
    library("ggplot2")  # plotting
    library("plyr")     # provides llply, used in the ordination loop
    data(GlobalPatterns)

See the ggplot2 online documentation for further help.

We want to filter low-occurrence, poorly-represented OTUs from this data, because they are essentially noise variables for the purposes of this tutorial. In practice, you should probably perform and clearly document well-justified preprocessing steps, which are supported in the phyloseq package with examples and details in a dedicated preprocessing tutorial.

In this case preprocessing is especially useful for showing graphically the high-level patterns in the data, as well as for creating examples that compute in a short amount of time. Your reasoning and decisions in preprocessing are extremely important, and up to you. I am using several different methods of preprocessing here, for illustration and because the extent of data reduction is useful for my purposes. However, I make no assertion that these are the "optimum" approach(es) for your data and research goals. Rather, I highly recommend that you think hard about any preprocessing that you do, document it completely, and only commit to including it in your final analysis pipeline if you can defend the choices and have checked that they are robust.

To quickly demonstrate and compare the results of different ordination methods, I will first filter and preprocess the OTUs, producing a reduced dataset, GP1. I want to include some phylogenetic tree-based ordinations, which can be slow to calculate. Since the goal of this exercise is to demonstrate the plot_ordination capability, and not necessarily to reveal any new knowledge about the Global Patterns dataset, the emphasis in this preprocessing will be on limiting the number of OTUs, not on protecting intrinsic patterns in the data.

Remove OTUs that do not appear more than 5 times in more than half the samples.

    GP = GlobalPatterns
    wh0 = genefilter_sample(GP, filterfun_sample(function(x) x > 5), A=0.5*nsamples(GP))
    # Keep only the OTUs flagged TRUE by the filter
    GP1 = prune_taxa(wh0, GP)

Scale the counts in every sample to the same total (1E6).

    GP1 = transform_sample_counts(GP1, function(x) 1E6 * x/sum(x))

Keep only the five most abundant phyla.

    phylum.sum = tapply(taxa_sums(GP1), tax_table(GP1)[, "Phylum"], sum, na.rm=TRUE)
    top5phyla = names(sort(phylum.sum, TRUE))[1:5]
    GP1 = prune_taxa((tax_table(GP1)[, "Phylum"] %in% top5phyla), GP1)

That still leaves 204 OTUs in the dataset, GP1.

We will want to investigate a major prior among the samples, which is that some are human-associated microbiomes, and some are not. Define a human-associated versus non-human categorical variable:

    human = get_variable(GP1, "SampleType") %in% c("Feces", "Mock", "Skin", "Tongue")

In this section I loop through different method parameter options to the plot_ordination function, store the plot results in a list, and then plot these results in a combined graphic using ggplot2.

    # Assumed values: the definitions of dist and ord_meths are not shown in the text
    dist = "bray"
    ord_meths = c("DCA", "CCA", "RDA", "DPCoA", "NMDS", "MDS", "PCoA")
    plist = llply(as.list(ord_meths), function(i, physeq, dist){
        # Assumed body: ordinate with method i, then plot the samples
        ordi = ordinate(physeq, method=i, distance=dist)
        plot_ordination(physeq, ordi, "samples", color="SampleType")
    }, GP1, dist)
    names(plist) <- ord_meths

The NMDS run prints progress messages to the console, including `Square root transformation` and `*** Solution reached`.
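The two numeric preprocessing steps described above (the occurrence filter and the rescaling of each sample to a common depth) can be illustrated on a toy matrix in base R, without phyloseq. This is only a sketch under stated assumptions: the matrix `otu` and its counts are made up, the names `min_samples`, `keep`, `filtered` and `scaled` are hypothetical, and the use of `>=` for the "half the samples" threshold is my assumption about how the `A` argument is applied, not something the text states.

```r
# Toy OTU table (made-up counts): rows are OTUs, columns are samples.
otu <- matrix(
  c(10, 12,  0,  8,
     0,  1,  0,  6,
     7,  9, 11,  6,
     2,  0,  3,  1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("OTU", 1:4), paste0("S", 1:4))
)

# For each OTU, count the samples in which it has more than 5 reads,
# and keep the OTUs that pass in at least half of the samples.
# (The >= comparison is an assumption for this sketch.)
min_samples <- 0.5 * ncol(otu)
keep <- rowSums(otu > 5) >= min_samples
filtered <- otu[keep, , drop = FALSE]

# Rescale every sample (column) so that its counts sum to 1e6.
scaled <- apply(filtered, 2, function(x) 1e6 * x / sum(x))

print(keep)
print(colSums(scaled))
```

Filtering before rescaling mirrors the order used in the tutorial: the per-sample totals are computed only over the OTUs that survive the filter, so each retained sample still sums to 1e6.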