Algebraic Statistics

Information

Main Conference: SIAM AG19 July 9-13, 2019 (Tue-Sat).

Location: University of Bern, Bern, Switzerland.

Our session: The talks will be Tuesday and Wednesday (Jul 09-10) at Unitobler, F-121.

Algebraic statistics studies statistical models through the lens of algebra, geometry, and combinatorics. From model selection to inference, this interdisciplinary field has seen applications in a wide range of statistical procedures. This session will focus broadly on new developments in algebraic statistics, both on the theoretical side and the applied side.

Organizers

Jose Israel Rodriguez, University of Wisconsin-Madison.

Elizabeth Gross, University of Hawaii at Manoa.

Schedule:

Tuesday, 09/Jul/2019: 10:00am - 12:00pm. MS172, part 1: Algebraic statistics: Sonja Petrovic, Thomas Kahle, Seth Sullivant, Marc Harkonen.

Wednesday, 10/Jul/2019: 10:00am - 12:00pm. MS172, part 2: Algebraic statistics: Kathlen Kohn, Serkan Hosten, Elina Robeva.

Abstracts:

Thomas Kahle, OvGU Magdeburg
Oriented Gaussoids

An oriented gaussoid is a combinatorial structure that captures the possible signs of correlations among Gaussian random variables. We introduce this concept and present approaches to the classification and construction of oriented gaussoids, drawing parallels to oriented matroids, which capture the possible signs of dependencies in linear algebra.

Sonja Petrovic, IIT
Testing model fit for networks: algebraic statistics of mixture models and beyond

We consider statistical models for relational data that can be represented as a network. The nodes in the network are individuals, organizations, proteins, neurons, or brain regions, while edges, directed or undirected, represent specific types of relationships between the nodes, such as personal or organizational affinities or other social/financial relationships, or physical or functional links such as co-activation of brain regions. One of the key open problems in this area is testing whether a proposed statistical model fits the data at hand. Algebraic statistics is known to provide theoretically reliable tools for testing model fit for a class of models that are log-linear exponential families; let's call these log-linear ERGMs. In this talk, we will discuss how the machinery can be extended to mixtures of log-linear ERGMs and other general linear exponential-family models that need not be log-linear, and what hurdles need to be overcome for this set of tools to be generalizable, scalable, and practical.

Elina Robeva, MIT
Nested Determinantal Constraints in Linear Structural Equation Models

Directed graphical models specify noisy functional relationships among a collection of random variables. In the Gaussian case, each such model corresponds to a semi-algebraic set of positive definite covariance matrices. The set is given via parametrization, and much work has gone into obtaining an implicit description in terms of polynomial (in-)equalities. Implicit descriptions shed light on problems such as parameter identification, model equivalence, and constraint-based statistical inference. For models given by directed acyclic graphs, which represent settings where all relevant variables are observed, there is a complete theory: All conditional independence relations can be found via graphical d-separation and are sufficient for an implicit description. The situation is far more complicated, however, when some of the variables are hidden. We consider models associated to mixed graphs that capture the effects of hidden variables through correlated error terms. The notion of trek separation explains when the covariance matrix in such a model has submatrices of low rank and generalizes d-separation. However, in many cases, such as the infamous Verma graph, the polynomials defining the graphical model are not determinantal, and hence cannot be explained by d-separation or trek separation. We show that these constraints often correspond to the vanishing of nested determinants and can be graphically explained by a notion of restricted trek separation.

Serkan Hosten, SFSU
The stratification of the maximum likelihood degree for toric varieties

The lattice points of a lattice polytope give rise to a family of toric varieties when we allow complex coefficients in the monomial parametrization of the "usual" toric variety associated to the polytope. The maximum likelihood degree (ML degree) of any member of this family is at most the normalized volume of the polytope. The set of coefficient vectors associated to ML degrees smaller than the volume is parametrized by Gelfand-Kapranov-Zelevinsky's principal A-determinant. Not much is known about how the ML degree changes as one moves in the parameter space. We will discuss what we know starting with toric surfaces.

Seth Sullivant, NCSU
Ideals of Gaussian Graphical Models

Gaussian graphical models are semialgebraic subsets of the cone of positive definite matrices. We will report on recent results trying to characterize the vanishing ideals of these models, in particular situations where they are generated by determinantal constraints.

Ha Khanh Nguyen, The Ohio State University
(Cancelled) Geometry of Exponential Graph Models

When given network data, we can either compute descriptive statistics (degree distribution, diameter, clustering coefficient, etc.) or we can find a model that explains the data. Modeling allows us to test hypotheses about edge formation, understand the uncertainty associated with the observed outcomes, and conduct inferences about whether the network substructures are more commonly observed than by chance. Modeling is also used for simulation and assessment of local effects. Exponential random graph models (ERGMs) are families of distributions defined by a set of network statistics and, thus, give rise to interesting graph theoretic questions. Our research focuses on the ERGMs where the edge, 2-path, and triangle counts are the sufficient statistics. These models are useful for modeling networks with a transitivity effect such as social networks. One of the most popular research questions for statisticians is goodness-of-fit testing: how well does the model "fit" the data? This is a difficult question for ERGMs, and one way to answer it is to understand the reference set. Given an observed network G, the reference set of G is the set of simple graphs with the same edge, 2-path, and triangle counts as G. In algebraic geometry, it is called the fiber of G, and its elements are the 0-1 points on an algebraic variety, which we refer to as the reference variety. The goal of this paper is to understand the reference variety through the lens of algebraic geometry.
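To make the definition of the reference set concrete, the following is a minimal brute-force sketch in Python. It is not taken from the talk. The function names `sufficient_statistics` and `reference_set` are hypothetical. Two assumptions are made here: nodes are labeled 0..n-1, and a "2-path" means a pair of edges sharing a node. Exhaustive enumeration is only feasible for very small graphs.

```python
from itertools import combinations

def sufficient_statistics(n, edges):
    """Return (edges, 2-paths, triangles) of a simple graph on nodes 0..n-1.

    Assumption: a 2-path is an unordered pair of edges sharing one node,
    so a node of degree d contributes d*(d-1)/2 two-paths.
    """
    edge_set = {frozenset(e) for e in edges}
    degree = [0] * n
    for e in edge_set:
        for v in e:
            degree[v] += 1
    two_paths = sum(d * (d - 1) // 2 for d in degree)
    triangles = sum(
        1
        for a, b, c in combinations(range(n), 3)
        if {frozenset((a, b)), frozenset((a, c)), frozenset((b, c))} <= edge_set
    )
    return len(edge_set), two_paths, triangles

def reference_set(n, edges):
    """All labeled simple graphs on n nodes with the same statistics as `edges`."""
    target = sufficient_statistics(n, edges)
    all_pairs = list(combinations(range(n), 2))
    # Only graphs with the same number of edges can match, so enumerate those.
    return [
        g for g in combinations(all_pairs, target[0])
        if sufficient_statistics(n, g) == target
    ]

# Example: the path 0 - 1 - 2 - 3 on four nodes.
path = [(0, 1), (1, 2), (2, 3)]
stats = sufficient_statistics(4, path)
fiber = reference_set(4, path)
```

For the path on four nodes, the reference set found this way consists of the labeled paths on those four nodes, since the only other 3-edge graphs (a triangle plus an isolated node, or a star) have different 2-path counts.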

Marc Harkonen, Georgia Tech
Combinatorial matrix theory in structural equation models

Many operations on matrices can be viewed from a combinatorial point of view by considering graphs associated to the matrix. For example, the determinant and inverse of a matrix can be computed from the linear subgraphs and 1-connections of the Coates digraph associated to the matrix. This combinatorial approach also naturally takes advantage of the sparsity structure of the matrix, which makes it ideal for applications in linear structural equation models. Another advantage of these combinatorial methods is that they are often agnostic as to whether the mixed graph contains cycles. As an example, we obtain a symbolic representation of the entries of the covariance matrix as a finite sum. In general, this sum is similar to the well-known trek rule, but each half of the trek is a 1-connection instead of a path. This method of computing the covariance matrix can be easily implemented in computer algebra systems, and scales extremely well when the mixed graph has few cycles.
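For orientation, here is a minimal Python sketch of the classical trek rule in the acyclic case, where each half of a trek is a directed path. It does not implement the Coates digraph or 1-connection method from the talk. The function names and the dictionary encoding of edge coefficients are assumptions made for illustration, and the recursion would not terminate on a graph with directed cycles.

```python
from fractions import Fraction

def path_weight_sums(n, lam):
    """P[i][j] = sum over directed paths from i to j of the product of edge coefficients.

    `lam` maps a directed edge (i, j) to its coefficient (hypothetical encoding).
    The graph is assumed to be acyclic; the empty path contributes P[i][i] = 1.
    """
    P = [[Fraction(0)] * n for _ in range(n)]

    def walk(start, node, weight):
        P[start][node] += weight
        for (a, b), coeff in lam.items():
            if a == node:
                walk(start, b, weight * coeff)

    for s in range(n):
        walk(s, s, Fraction(1))
    return P

def covariance_by_trek_rule(n, lam, omega):
    """sigma[i][j] = sum over tops k, l of P[k][i] * omega[k][l] * P[l][j].

    `omega` is the error covariance matrix; a nonzero off-diagonal entry
    plays the role of a bidirected edge in the mixed graph.
    """
    P = path_weight_sums(n, lam)
    return [
        [
            sum(P[k][i] * omega[k][l] * P[l][j] for k in range(n) for l in range(n))
            for j in range(n)
        ]
        for i in range(n)
    ]

# Example: the chain 0 -> 1 -> 2 with coefficients 2 and 3 and unit error variances.
lam = {(0, 1): 2, (1, 2): 3}
omega = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
sigma = covariance_by_trek_rule(3, lam, omega)
```

In the chain example, X2 = 6*e0 + 3*e1 + e2, so its variance is 36 + 9 + 1 = 46, which is the value the trek sum produces for the entry sigma[2][2].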

Kathlen Kohn, University of Oslo
Moment Varieties of Measures on Polytopes

This talk brings many areas together: discrete geometry, statistics, algebraic geometry, invariant theory, geometric modeling, symbolic and numerical computations. We study the algebraic relations among moments of uniform probability distributions on polytopes. This is already a non-trivial matter for quadrangles in the plane. In fact, we need to combine invariant theory of the affine group with numerical algebraic geometry to compute the first relevant relations. Moreover, the numerator of the generating function of all moments of a fixed polytope is the adjoint of the polytope, which is known from geometric modeling. We prove the conjecture that the adjoint is the unique polynomial of minimal degree which vanishes on the non-faces of a simple polytope. This talk is based on joint work with Kristian Ranestad, Boris Shapiro and Bernd Sturmfels.

Important information for speakers

1. The rules of SIAM do not allow a speaker to present multiple talks at this conference. Here is a list of other proposed special sessions.

2. Talks will be 19-21 minutes plus 4 minutes for questions. (At 17 minutes, a signal will be given if desired.)

3. Titles and abstracts will be asked for at a later date.

4. The minisymposium number is to be determined.

5. Housing information is to be determined.

6. Students can apply for travel support here.

Extra Links

SIAM AG Activity Group.