Course Home Page – STOR 881 Object Oriented Data Analysis – Fall 2019

Instructor:    J. S. Marron


Office:   352 Hanes Hall


Course Notes

  1.   STOR881-08-20-2019.pptx:  Organizational Matters, OODA Book, What is OODA?, Taste of OODA Examples (including Spanish Male Mortality), 3 Major Phases of OODA, Visualization, Scatterplot Matrix Views
  2.   STOR881-08-22-2019.pptx: Principal Component Analysis (PCA), Object Space – Feature Space, Scree Plots, Define Modes of Variation, PCA Toy Examples
  3.   STOR881-08-27-2019.pptx:  PCA Toy Examples, RNAseq Data, Cancer Gene Expression (PCA, DWD, Loadings), Caution About DWD
  4.   STOR881-08-29-2019.pptx:  Inference Using DiProPerm, Probability Distributions as Data Objects, OODA Terminology, Matlab Software
  5.   09-03-2019:  Class Cancelled, Marron Sick
  6.   09-05-2019:  Class Cancelled, Marron Sick
  7.   STOR881-09-10-2019.pptx:  Marginal Distribution Plots, Drug Discovery Data, Correlation PCA
  8.   STOR881-09-12-2019.pptx:  Correlation PCA, General Transformations, Melanoma Data, Automatic Shifted Log Transformation, ROC curves
  9.   STOR881-09-17-2019.pptx:  Heatmap Views, Other Directions for Projection, Distance Methods (Fréchet Mean, Multidimensional Scaling), Landmark Based Shape Analysis, Equivalence Relations
  10.    STOR881-09-19-2019.pptx:  Landmark Based Shape Analysis, Shape Representations, Bladder-Prostate-Rectum Data, Skeletal Representations, Manifold Feature Spaces – Benjamin Leinwand: Longitudinal Brain Scans, Keerthi Anand:  Blind Source Separation
  11.   STOR881-09-24-2019.pptx:  Manifold Feature Spaces, PCA for s-reps, Principal Nested Spheres – Kevin O’Connor: Distributions as Data Objects, Jonghwan Yoo: JIVE on Methylation and Images
  12.   STOR881-09-26-2019.pptx:  Backwards PCA, Nonnegative Matrix Factorization, Manifold Learning, PCA Basics, History, Computation – Benjamin Langworthy: PCA for Multivar. Survival
  13.   STOR881-10-01-2019.pptx:  PCA as Optimization, Redistribution of Energy – Siyao Liu: Clustering Single Cell RNAseq Data
  14.   STOR881-10-03-2019.pptx:  PCA Data Representation, PCA Simulation, PCA Graphical Representation, Classification - Discrimination, Fisher Linear Discrimination – Kentaro Hoffman: Time Frequency Analysis
  15.   STOR881-10-08-2019.pptx:  FLD Likelihood Approach, Gaussian Likelihood Ratio, HDLSS Discrimination, Maximal Data Piling – Yifeng Shi: Dealing with Histological Images Using Set-structured Method
  16.   STOR881-10-10-2019.pptx: Maximal Data Piling, Kernel Embedding – Chalmer Tomlinson: Longitudinal Autism, Samuel Rosin: Horseshoe Effect in Microbiome Data
  17.   STOR881-10-15-2019.pptx:  Kernel Embedding, Kernel PCA, t-SNE Visualization, Support Vector Machines – Alexander Murph: Visual Disturbances, Haodong Wang: Comparing t-SNE and UMAP for visualizing scRNA-seq data
  18.   October 17, 2019:   Fall Break, No Class
  19.   STOR881-10-22-2019.pptx:  Support Vector Machines, Distance Weighted Discrimination, Batch and Source Adjustment, Breast Cancer Data – Bryce Rowland: NMF for Hi-C Data, Thomas Keefe: Haar Wavelet Bases, Feng Cheng: How to normalize: a challenge in accelerating Magnetic Resonance Fingerprinting
  20.   STOR881-10-24-2019.pptx:  Distance Weighted Discrimination, NCI 60 Data, SVM & DWD Tuning, Outliers and Robust Methods – Seoyoon Cho: Relationship Between Mullen Score and Nutritions
  21.   STOR881-10-29-2019.pptx:  Spherical PCA, GWAS Data, VL1PCA, Start HDLSS Asymptotics – Richard Sizelove:  Integrative Analysis for Brain Functional Networks, Sumit Kar: Community structure in biological networks
  22.   STOR881-10-31-2019.pptx:  HDLSS Paradoxes and Explanations, Zero Covariance is Not Independence – Jose Sanchez: Stability of filtrations in Topological Data Analysis, Yonggang Sha: Statistical analysis methods for Multiple Myeloma Microarray data
  23.   STOR881-11-05-2019.pptx:  Gaussian Scale Mixture and Zero Covariance, Mixing Conditions, HDLSS Analysis of PCA, Sparsity and DiProPerm – Yue Pan: Data Visualization for single cell RNA-seq data, Andrew Hamilton: Time Series Prediction
  24.   STOR881-11-07-2019.pptx: HDLSS Analysis of DWD Batch Adjustment, Radial DWD, Random Matrix Theory – David Bang: Catboost: Handling Large Categorical Variables, Carson Mosso: Manifold Learning, Katelyn Heath: Using ViSR Ultrasound breast data to diagnose malignancy in patients
  25.   STOR881-11-12-2019.pptx: Random Matrix Theory, High d Kernel Methods are Linear, PCA to Find Clusters, Smoothing – Dhruv Patel: Connections between Persistent Homology and Curvature, Hang Su: Characterization of Collaborative Cross Genomes
  26.   STOR881-11-14-2019.pptx:  Density Estimation, Scatterplot Smoothing, Inference Using SiZer – Wei Gu: A Heuristic Approach to Portfolio Optimization with Cardinality Constraints, Nicole Kramer: Calling DNA Loops in Hi-C Data
  27.   STOR881-11-19-2019.pptx:  Finish SiZer, Q-Q & PP Plots – Wongkyung Jang: OODA in Human-Computer Interaction, Bohan Li: Introduction to Word Segmentation, Andrew Marron: Kidney Function Methylation Data
  28.   STOR881-11-21-2019.pptx:  K-Means Clustering, SWISS, Hierarchical Clustering – Ram Basak:  FDA on Health Outcomes, Siqi Xiang: Analysis of Knee Osteoarthritis Data: Auto transformation and BET, Nicolas Wolczynski: Urban Sound Classification, Mingyi Wang: Symbolic Data principal component analysis
  29.   STOR881-11-26-2019.pptx:  Hierarchical Clustering, SigClust – Pavlos Zoubouloglou: Geodesic PCA in the Wasserstein Space, Taylor Petty: Forensic DNA Testing
  30.   November 28, 2019:    Thanksgiving, No Class
  31.   STOR881-12-03-2019.pptx:  SigClust, JIVE



Link to Marron’s Matlab Software (.zip file; expand it into 4 directories and add those to the Matlab path)

LungCancer2011.m, for Analysis of 2011 RNAseq Lung Cancer Data

counts, for 2011 RNAseq Lung Cancer Data

exonsMarron, for 2011 RNAseq Lung Cancer Data

Single .zip file with above 3, plus generated graphics