Course Home Page – STOR 881 Object Oriented Data Analysis – Spring 2022

Instructor:    J. S. Marron

Email:   marron@unc.edu

Office:   352 Hanes Hall

SyllabusSTOR881Fall2021

 

Course Notes

  1.   STOR881-01-11-2022.pptx:  Organizational Matters, OODA Book, What is OODA?, Taste of OODA Examples (including Spanish Male Mortality, Amplitude – Phase, Shapes, Sounds, Faces), 3 Major Phases of OODA.
  2.   STOR881-01-13-2022.pptx:  Visualization, Scatterplot Matrix Views, Principal Component Analysis (PCA), Object Space – Trait Space, Scree Plots.
  3.   STOR881-01-18-2022.pptx:  Define Modes of Variation, Probability Distributions as Data Objects, PCA Toy & Real Examples, Limitations of PCA: Apple, Banana, Pear.
  4.   STOR881-01-20-2022.pptx:  Limitations of PCA: NCI-60 Data, OODA Terminology, Caution about DWD, Inference using DiProPerm, Marginal Distribution Plots.
  5.   STOR881-01-25-2022.pptx:  Marginal Distribution Plot Analysis of Drug Discovery Data, Normalization and Correlation PCA, Transformations, Melanoma Data.
  6.   STOR881-01-27-2022.pptx:

 

References

Aitchison, J. (1982) The statistical analysis of compositional data. Journal of the Royal Statistical Society: Series B (Methodological), 44, 139-160 (cited 1/18/22)

Anderson, T. W., & Darling, D. A. (1952) Asymptotic theory of certain "goodness of fit" criteria based on stochastic processes. The Annals of Mathematical Statistics, 193-212 (cited 1/27/22)

Bai, Z. D., & Saranadasa, H. (1996) Effect of high dimension: by an example of a two sample problem. Statistica Sinica, 6(2), 311-329 (cited 1/18/22)

Benito, M., Parker, J., Du, Q., Wu, J., Xiang, D., Perou, C. M., & Marron, J. S. (2004) Adjustment of systematic microarray data biases. Bioinformatics, 20(1), 105-114 (cited 1/11/22)

Borland, D., & Taylor, R. M. (2007). Rainbow color map (still) considered harmful. IEEE Computer Graphics and Applications, 27(2), 14-17 (cited 1/27/22)

Borysov, P., Hannig, J., Marron, J. S., Muratov, E., Fourches, D., & Tropsha, A. (2016). Activity prediction and identification of mis‐annotated chemical compounds using extreme descriptors. Journal of Chemometrics, 30(3), 99-108 (cited 1/25/22)

Box, G. E., & Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society. Series B (Methodological), 211-252 (cited 1/25/22)

Bullitt, E., & Aylward, S. R. (2002). Volume rendering of segmented image objects. IEEE Transactions on Medical Imaging, 21(8), 998-1002. (cited 1/11/22)

Cai, T., Liu, W., & Xia, Y. (2014) Two‐sample test of high dimensional means under dependence. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 76(2), 349-372 (cited 1/18/22)

Cates, J., Fletcher, P. T., Styner, M., Shenton, M., & Whitaker, R. (2007, July). Shape modeling and analysis with entropy-based particle systems. In Biennial International Conference on Information Processing in Medical Imaging (pp. 333-345). Springer, Berlin, Heidelberg (cited 1/20/22)

Cattell, R. B. (1966) The scree test for the number of factors. Multivariate Behavioral Research, 1(2), 245-276 (cited 1/13/22)

Chaney, E. L., Pizer, S., Joshi, S., Broadhurst, R., Fletcher, T., Gash, G., … & Tracton, G. (2004). Automatic male pelvis segmentation from CT images via statistically trained multi-object deformable m-rep models. International Journal of Radiation Oncology, Biology, Physics, 60(1), S153-S154. (cited 1/11/22)

Chen, S. X., & Qin, Y. L. (2010) A two-sample test for high-dimensional data with applications to gene-set testing. The Annals of Statistics, 808-835  (cited 1/18/22)

DeLong, E. R., DeLong, D. M., & Clarke-Pearson, D. L. (1988). Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics, 44, 837-845 (cited 1/27/22)

Feng, Q., Hannig, J., & Marron, J. S. (2016). A note on automatic data transformation. Stat, 5(1), 82-87  (cited 1/25/2019, 1/27/22)

Gaydos, T. L., Heckman, N. E., Kirkpatrick, M., Stinchcombe, J. R., Schmitt, J., Kingsolver, J., & Marron, J. S. (2013). Visualizing genetic constraints. The Annals of Applied Statistics, 7(2), 860-882 (cited 1/27/2022)

Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics, Wiley (cited 1/27/22)

Hotelling, H. (1933) Analysis of a Complex of Statistical Variables Into Principal Components. Journal of Educational Psychology, 24, 417-441 (cited 1/13/22)

Hron, K., Menafoglio, A., Templ, M., Hrůzová, K. & Filzmoser, P. (2016) Simplicial principal component analysis for density functions in Bayes spaces. Computational Statistics & Data Analysis, 94, 330-350  (cited 1/18/22)

Inselberg, A. (1985) The Plane with Parallel Coordinates, Visual Computer 1: 69–91 (cited 1/18/22)

Inselberg, A. (2009) Parallel Coordinates: VISUAL Multidimensional Geometry and its Applications. Springer, New York (cited 1/18/22)

Izem, R., & Kingsolver, J. G. (2005). Variation in continuous reaction norms: quantifying directions of biological interest. The American Naturalist, 166(2), 277-289 (cited 1/27/22)

Izem, R., & Marron, J. S. (2007). Analysis of nonlinear modes of variation for functional data. Electronic Journal of Statistics, 1, 641-676 (cited 1/27/22)

Kimes, P. K., Cabanski, C. R., Wilkerson, M. D., Zhao, N., Johnson, A. R., Perou, C. M., Makowski, L., Marron, J. S. & Hayes, D. N. (2014) SigFuge: single gene clustering of RNA-seq reveals differential isoform usage among cancer samples, Nucleic Acids Research (2014): gku521 (cited 1/18/22)

Kingsolver, J. G., Heckman, N., Zhang, J., Carter, P. A., Knies, J. L., Stinchcombe, J. R., & Meyer, K. (2015). Genetic variation, simplicity, and evolutionary constraints for function-valued traits. The American Naturalist, 185(6), E166-E181 (cited 1/27/22)

Liu, X., Parker, J., Fan, C., Perou, C. M., & Marron, J. S. (2009). Visualization of cross-platform microarray normalization. Batch Effects and Noise in Microarray Experiments: Sources and Solutions. Wiley, New York, 167-181 (cited 1/20/22)

Maggiora, G. M. (2006). On outliers and activity cliffs: why QSAR often disappoints (cited 1/25/22)

Marron, J. S. & Alonso, A. M. (2014) Overview of object oriented data analysis, Biometrical Journal, 56, 732-753 (cited 1/11/22)

Marron, J. S. & Dryden, I. L. (2021) Object Oriented Data Analysis, CRC Press (cited 1/11/22)

Miedema, J., Marron, J. S., Niethammer, M., Borland, D., Woosley, J., Coposky, J. & Thomas, N. E. (2012) Image and statistical analysis of melanocytic histology. Histopathology, 61(3), 436-444 (cited 1/25/22)

Menafoglio, A., Grasso, M., Secchi, P. & Colosimo, B.M. (2018) Profile monitoring of probability density functions via simplicial functional PCA with application to image data, Technometrics, 60, 497-510  (cited 1/18/22)

Parzen, E. (2004) Quantile probability and statistical data modeling, Statistical Science, 19, 652-662. (cited 1/18/22)

Patrangenaru, V., & Ellingson, L. (2019). Nonparametric statistics on manifolds and their applications to object data analysis. CRC Press. (cited 1/20/22)

Pearson, K. (1901) On Lines and Planes of Closest Fit to Systems of Points in Space, Philosophical Magazine, 2, 559-572 (cited 1/13/22)

Pigoli, D., Hadjipantelis, P. Z., Coleman, J. S. & Aston, J. A. D. (2015) The analysis of acoustic phonetic data: exploring differences in the spoken Romance languages. arXiv preprint arXiv:1507.07587 (cited 1/11/22)

Ramsay, J. O. & Silverman, B. W. (2002) Applied Functional Data Analysis, Springer, N.Y. ISBN 0-387-95414-7 (cited 1/11/22)

Ramsay, J. O. & Silverman, B. W. (2005) Functional Data Analysis, 2nd Edition, Springer, N.Y. ISBN 0-387-40080-X (cited 1/11/22)

Ramsay, J. O. (2005) Functional Data Analysis Web Site, http://psych.mcgill.ca/misc/fda/  (cited 1/11/22)

Srivastava, M. S., Katayama, S., & Kano, Y. (2013) A two sample test in high dimensional data. Journal of Multivariate Analysis, 114, 349-358 (cited 1/18/22)

Stephens, M. A. (1974). EDF statistics for goodness of fit and some comparisons. Journal of the American Statistical Association, 69(347), 730-737 (cited 1/27/22)

Tukey, J. W. (1977) Exploratory data analysis, Pearson, N.Y. ISBN 978-0201076165 (cited 1/11/22)

Venables, W.N. & Ripley, B.D. (2013) Modern applied statistics with S-PLUS. Springer Science & Business Media (cited 1/20/22)

Wang, H. & Marron, J. S. (2007) Object oriented data analysis: sets of trees, Annals of Statistics, 35, 1849-1873 (cited 1/11/22)

Wei, S., Lee, C., Wichers, L., & Marron, J. S. (2015) Direction-projection-permutation for high dimensional hypothesis tests. Journal of Computational and Graphical Statistics (cited 8/29/19, 1/18/22)

Wilkinson, L., & Friendly, M. (2009). The history of the cluster heat map. The American Statistician, 63(2), 179-184 (cited 1/27/22)

Zhang, J., Heckman, N., Cubranic, D., Kingsolver, J. G., Gaydos, T., & Marron, J. S. (2014). Prinsimp. R Journal, 6(2) (cited 1/27/22)