{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Lab 8: Graphical Models" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The goal of this lab session is to code two methods to estimate the structure of undirected gaussian graphical models and compare them.\n", "\n", "You have to send the filled notebook named **\"L8_familyname1_familyname2.ipynb\"** (groups of 2) by email to aml.centralesupelec.2019@gmail.com before December 12, 2019 at 23:59 and put **\"AML-L8\"** as subject. \n", "\n", "We begin with the standard imports:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import seaborn as sns\n", "%matplotlib inline\n", "sns.set_context('poster')\n", "sns.set_color_codes()\n", "plot_kwds = {'alpha' : 0.25, 's' : 80, 'linewidths':0}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Graphical Models" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A graphical model is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. The variables are represented by nodes and the relations between them are represented by edges.\n", "\n", "### GLasso\n", "\n", "Graphical Lasso is the name of the optimization problem that estimates the precision matrix of a multivariate gaussian and its name comes from the direct link with graphical models and the regularization term. \n", "\n", "Fill in the following class that implements the GLasso algorithm optimized by ADMM:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "class my_GLasso():\n", " \n", " def __init__(self, alpha, mu, max_iter = 60):\n", " '''\n", " Parameters:\n", " alpha : float\n", " Penalization parameter selected.\n", " mu: float>0\n", "\n", " Attributes:\n", " \n", " covariance_ : numpy.ndarray, shape (n_features, n_features)\n", " Estimated covariance matrix.\n", " precision_ : numpy.ndarray, shape (n_features, n_features)\n", " Estimated precision matrix (inverse covariance).\n", " '''\n", " self.covariance_ = None\n", " self.precision_ = None\n", " self.alpha = alpha\n", " self.mu = mu\n", " \n", " def fit(self, X):\n", " \"\"\" Fits the GraphicalLasso model to X.\n", " \n", " Parameters:\n", " -----------\n", " X: (n, p) np.array\n", " Data matrix\n", " \n", " Returns:\n", " -----\n", " self\n", " \"\"\" \n", " # TODO:\n", " # initialize Y, Z\n", " # until convergence:\n", " # Update precision\n", " # Update Y\n", " # Update Z\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Nodewise Regression\n", "\n", "Fill in the following class that implements the nodewise regression algorithm to estimate a graphical model structure. You can use `LassoCV` for the regressions. Bonus (not graded): Implement your own cross-validation lasso." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from sklearn.linear_model import LassoCV\n", "\n", "class my_nodewise_regression():\n", " \n", " def __init__(self, rule):\n", " '''\n", " Parameters:\n", " \n", " rule: {\"OR\", \"AND\"}\n", " \n", " Attributes:\n", " \n", " covariance_structure_ : numpy.ndarray, shape (n_features, n_features)\n", " Estimated covariance matrix. \n", " '''\n", " self.graph_structure_ = None\n", " self.rule = rule\n", " \n", " def fit(self, X):\n", " \"\"\" Fit the model to X.\n", " \n", " Parameters:\n", " -----------\n", " X: (n, p) np.array\n", " Data matrix\n", " \n", " Returns:\n", " -----\n", " self\n", " \"\"\" \n", " # TODO:\n", " # Estimate the precision structure\n", " # Symetrize with OR or AND rule" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Generate an easy-to-check (non-trivial, p<=6) example and plot the 4 (real, GLasso, AND, OR) graphs. You can use `networkx` to plot the resulting graph." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# TODO" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Simulations\n", "\n", "Compare the two graph estimators for the following model with $p = 300$ and $n = 40, 80, 320$:\n", "\n", "- An AR(1)-Block model. In this model the *covariance* matrix is block-diagonal with equalsized AR(1)-blocks of the form $(\\Sigma_{Block})_{i, j} = 0.9^{|i−j|}$, take $30 \\times 30$ blocks.\n", "\n", "Report accuracy and F1 score for the edge estimation.\n", "\n", "For GLasso estimation, use cross-validation k-fold with loglikelihood loss to select the $\\lambda$ penalization parameter.\n", "\n", "Measure the estimation error of the GLasso matrix result." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# TODO" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.1" } }, "nbformat": 4, "nbformat_minor": 2 }