BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Collegio Carlo Alberto - ECPv6.3.7//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:Collegio Carlo Alberto
X-ORIGINAL-URL:https://www.carloalberto.org
X-WR-CALDESC:Events for Collegio Carlo Alberto
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:Europe/Rome
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:20190331T010000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:20191027T010000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:20200329T010000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:20201025T010000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:20210328T010000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:20211031T010000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:20220327T010000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:20221030T010000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20191018T120000
DTEND;TZID=Europe/Rome:20191018T130000
DTSTAMP:20240425T084842
CREATED:20190924T145500Z
LAST-MODIFIED:20190924T150408Z
UID:10005027-1571400000-1571403600@www.carloalberto.org
SUMMARY:Mark Podolskij (AARHUS University)
DESCRIPTION:“Optimal estimation of certain random quantities associated with Levy processes” \nAbstract: In this talk we present new ideas on optimality of statistical estimates of certain random quantities of stochastic processes\, such as supremum or local times. Despite the existing results on estimation of such objects through high frequency observations\, the question of optimality is rarely discussed. We will demonstrate some optimal estimation methods for the supremum and local times of the Brownian motion in the $L^2$ and $L^1$ sense. In the second part of the talk we will investigate how the main ideas can be extended towards the class of Lévy processes and continuous diffusion models.
URL:https://www.carloalberto.org/event/mark-podolskij-aarhus-university/
CATEGORIES:Seminars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20191129T120000
DTEND;TZID=Europe/Rome:20191129T130000
DTSTAMP:20240425T084842
CREATED:20191011T120946Z
LAST-MODIFIED:20191122T114549Z
UID:10004030-1575028800-1575032400@www.carloalberto.org
SUMMARY:Robin Ryder (Université Paris-Dauphine)
DESCRIPTION:“A Bayesian non-parametric methodology for inferring grammar complexity” \nBased on a set of strings from a language\, we wish to infer the complexity of the underlying grammar. To this end\, we develop a methodology to choose between two classes of formal grammars in the Chomsky hierarchy: simple regular grammars and more complex context-free grammars. To do so\, we introduce a probabilistic context-free grammar model in the form of a Hierarchical Dirichlet Process over rules expressed in Greibach Normal Form. In comparison to other representations\, this has the advantage of nesting the regular class within the context-free class. We consider model comparison both by exploiting this nesting\, and with Bayes’ factors. The model is fit using a Sequential Monte Carlo method\, implemented in the Birch probabilistic programming language. We apply this methodology to data collected from primates\, for which the complexity of the grammar is a key question.
URL:https://www.carloalberto.org/event/novembre-robin-ryder-paris-dauphine/
CATEGORIES:Seminars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20191218T110000
DTEND;TZID=Europe/Rome:20191218T120000
DTSTAMP:20240425T084842
CREATED:20191011T122849Z
LAST-MODIFIED:20191213T110040Z
UID:10004032-1576666800-1576670400@www.carloalberto.org
SUMMARY:Daniel Kowal (Rice University)
DESCRIPTION:“Scalable Bayesian Inference and Summarization for Functional Data” \nModern scientific monitoring systems\, such as wearable and implantable devices\, commonly record data over a continuous domain at high resolutions. These functional data are high-dimensional\, strongly correlated\, and usually measured concurrently with other variables of interest. Bayesians models for functional data are particularly appealing: they accommodate multiple dependence structures\, handle missing or irregularly-spaced data\, and provide regularization via shrinkage priors. However\, these models are often complex\, computationally intensive\, and difficult to interpret. This talk will focus on two fundamental challenges for Bayesian functional data analysis: (1) constructing sufficiently flexible and scalable functional regression models and (2) extracting interpretable posterior summaries. The proposed modeling framework is nonparametric and uses an unknown functional basis to learn prominent functional features\, which are associated with scalar predictors within a regression model. A customized projection-based Gibbs sampler provides posterior inference with linear time complexity in the number of predictors\, which is empirically faster than existing frequentist and Bayesian alternatives. Using the posterior distribution\, a decision theoretic approach for Bayesian variable selection is developed\, which identifies a subset of covariates that retains nearly the predictive accuracy of the full model. The methodology is applied to actigraphy data to investigate the association between intraday physical activity and responses to a sleep questionnaire.
URL:https://www.carloalberto.org/event/dicembre-daniel-kowal-rice-university/
CATEGORIES:Seminars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20191218T120000
DTEND;TZID=Europe/Rome:20191218T130000
DTSTAMP:20240425T084842
CREATED:20191011T123144Z
LAST-MODIFIED:20191213T110114Z
UID:10004034-1576670400-1576674000@www.carloalberto.org
SUMMARY:Stefano Peluchetti (Cogent Labs Tokyo)
DESCRIPTION:“Deep neural networks and stochastic processes” \nWe review deep neural networks and their training\, and then focus on the connection between neural networks and stochastic processes. Deep neural networks at initialization correspond to prior models in function space\, and under appropriate assumptions they converge to Gaussian processes in the limit of infinite width. While this connection holds “a priori”\, it is possible to derive similar results for “a posteriori” training via gradient descent. In this talk we discuss these and related results\, and we introduce a recent line of research which generalizes established results to non-Gaussian initializations and stochastic processes arising in the limit of infinite depth.
URL:https://www.carloalberto.org/event/stefano-peluchetti-cogent-labs-tokyo/
CATEGORIES:Seminars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20200508T120000
DTEND;TZID=Europe/Rome:20200508T133000
DTSTAMP:20240425T084842
CREATED:20200421T145449Z
LAST-MODIFIED:20200909T102227Z
UID:10004096-1588939200-1588944600@www.carloalberto.org
SUMMARY:Francesco Sanna Passino\, Imperial College London
DESCRIPTION:“Bayesian estimation of the latent dimension and communities in stochastic blockmodels” \nAbstract: Spectral embedding of adjacency or Laplacian matrices of undirected graphs is a common technique for representing a network in a lower dimensional latent space\, with optimal theoretical guarantees. The embedding can be used to estimate the community structure of the network\, with strong consistency results in the stochastic blockmodel framework. One of the main practical limitations of standard algorithms for community detection from spectral embeddings is that the number of communities and the latent dimension of the embedding must be specified in advance. In this talk\, a novel Bayesian model for simultaneous and automatic selection of the appropriate dimension of the latent space and the number of blocks is proposed. Extensions to directed and bipartite graphs are discussed. The model is tested on simulated and real world network data\, showing promising performance for recovering latent community structure. Joint work with Professor Nick Heard (Imperial College London).
URL:https://www.carloalberto.org/event/francesco-sanna-passino-imperial-college-london/
CATEGORIES:Seminars in Statistics,Webinars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20200515T120000
DTEND;TZID=Europe/Rome:20200515T133000
DTSTAMP:20240425T084842
CREATED:20200506T083146Z
LAST-MODIFIED:20200909T102100Z
UID:10005111-1589544000-1589549400@www.carloalberto.org
SUMMARY:Ismael Castillo\, Sorbonne Université Paris
DESCRIPTION:“Multiscale analysis of Bayesian CART” \nAbstract: This work affords new insights about Bayesian CART in the context of structured wavelet shrinkage. We show that practically used Bayesian CART priors lead to adaptive rate-minimax posterior concentration in the supremum norm in Gaussian white noise\, performing optimally up to a logarithmic factor. To further explore the benefits of structured shrinkage\, we propose the g-prior for trees\, which departs from the typical wavelet product priors by harnessing correlation induced by the tree topology. Building on supremum norm adaptation\, an adaptive nonparametric Bernstein–von Mises theorem for Bayesian CART is derived using multiscale techniques. For the fundamental goal of uncertainty quantification\, we construct adaptive confidence bands with uniform coverage for the regression function under self-similarity. (Joint work with Veronika Rockova).
URL:https://www.carloalberto.org/event/webinar-ismael-castillo-sorbonne-universite-paris/
CATEGORIES:Seminars in Statistics,Webinars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20200522T120000
DTEND;TZID=Europe/Rome:20200522T133000
DTSTAMP:20240425T084842
CREATED:20200516T100657Z
LAST-MODIFIED:20200909T102033Z
UID:10005121-1590148800-1590154200@www.carloalberto.org
SUMMARY:Pierre Jacob\, Harvard University
DESCRIPTION:“Unbiased Markov chain Monte Carlo with couplings” \nAbstract: \nVarious tasks in statistics involve numerical integration\, for which Markov chain Monte Carlo (MCMC) methods are state-of-the-art. MCMC methods yield estimators that converge to integrals of interest in the limit of the number of iterations. This iterative asymptotic justification is not ideal; first\, it stands at odds with current trends in computing hardware\, with increasingly parallel architectures; secondly\, the choice of  “burn-in” or “warm-up” is arduous. This talk will describe recently proposed estimators that are unbiased for the expectations of interest while having a finite computing cost and a finite variance. They can thus be generated independently in parallel and averaged over. The method also provides practical upper bounds on the distance (e.g. total variation) between the marginal distribution of the chain at a finite step and its invariant distribution. The key idea is to generate “faithful” couplings of Markov chains\, whereby pairs of chains coalesce after a random number of iterations. This talk will provide an overview of this line of research. (joint work with John O’Leary\, Yves Atchadé) \nThe main reference has just appeared here: https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/rssb.12336
URL:https://www.carloalberto.org/event/webinar-pierre-jacob-harvard-university/
CATEGORIES:Seminars in Statistics,Webinars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20200529T150000
DTEND;TZID=Europe/Rome:20200529T163000
DTSTAMP:20240425T084842
CREATED:20200506T084044Z
LAST-MODIFIED:20200909T102008Z
UID:10005112-1590764400-1590769800@www.carloalberto.org
SUMMARY:Tommaso Rigon\, Duke University
DESCRIPTION:“A generalized Bayes framework for probabilistic clustering” \nAbstract: Clustering methods such as k-means and its variants are standard tools for finding groups in the data. However\, despite their huge popularity\, the underlying uncertainty can not be easily quantified. On the other hand\, mixture models represent a well-established inferential tool for probabilistic clustering\, but they are characterized by severe computational bottlenecks and may have unreliable solutions in presence of misspecifications. Instead\, we rely on a generalized Bayes framework for probabilistic clustering based on Gibbs posteriors. Broadly speaking\, in such a setting the log-likelihood is replaced by an arbitrary loss function and this arguably leads to much richer families of clustering methods. Our contribution is two-fold: first\, we describe a clustering pipeline for efficiently finding groups and then quantifying the associated uncertainty. Second\, we discuss two broad classes of loss functions which have advantages in terms of analytic tractability and interpretability. Specifically\, we consider losses based on Bregman divergences and pairwise dissimilarities and we show they can be interpreted as profile and composite log-likelihoods\, respectively. Full Bayesian inference is conducted via Gibbs sampling but efficient deterministic algorithms are available for point estimation.  As an important byproduct of our work\, we show that several existing clustering approaches can be interpreted as generalized Bayesian estimators under specific loss functions. Hence\, our methodology can be also used to formally quantify the uncertainty in widely used clustering approaches. Joint work with Amy Herring (Duke University) and David Dunson (Duke University).
URL:https://www.carloalberto.org/event/webinar-tommaso-rigon-duke-university/
CATEGORIES:Seminars in Statistics,Webinars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20201023T120000
DTEND;TZID=Europe/Rome:20201023T130000
DTSTAMP:20240425T084842
CREATED:20201015T135716Z
LAST-MODIFIED:20201016T072713Z
UID:10005151-1603454400-1603458000@www.carloalberto.org
SUMMARY:Minwoo Chae (Pohang University of Science and Technology\, South Korea) (webinar)
DESCRIPTION:“Posterior asymptotics in Wasserstein metrics on the real line” \nJoint initiative with MIDAS Complex Data Modeling Research Network https://midas.mat.uc.cl/network/ \nAbstract: We use the class of Wasserstein metrics to study asymptotic properties of posterior distributions. The first goal is to provide sufficient conditions for posterior consistency. In addition to the well-known Kullback-Leibler condition on the prior\, the true distribution and most probability measures in the support of the prior are required to possess moments up to an order which is determined by the order of the Wasserstein metric. We further investigate convergence rates of the posterior distributions for which we need stronger moment conditions. The required tail conditions are sharp in the sense that the posterior distribution may be inconsistent or contract slowly to the true distribution without these conditions. We apply the results to density estimation with a Dirichlet process mixture prior.
URL:https://www.carloalberto.org/event/minwoo-chae-pohang-university-of-science-and-technology-south-korea-webinar/
CATEGORIES:Seminars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20201120T170000
DTEND;TZID=Europe/Rome:20201120T183000
DTSTAMP:20240425T084842
CREATED:20201026T124648Z
LAST-MODIFIED:20201026T124648Z
UID:10004144-1605891600-1605897000@www.carloalberto.org
SUMMARY:Didong Li (Princeton University\, USA) (webinar)
DESCRIPTION:“Learning & Exploiting Low-Dimensional Structure in High-Dimensional Data” \nJoint initiative with MIDAS Complex Data Modeling Research Network https://midas.mat.uc.cl/network/ \nAbstract: Data lying in a high-dimensional ambient space are commonly thought to have a much lower intrinsic dimension. In particular\, the data may be concentrated near a lower dimensional subspace or manifold. There is an immense literature focused on approximating the unknown subspace and the unknown density\, and exploiting such approximations in clustering\, data compression\, and building of predictive models. Most of the literature relies on approximating subspaces and densities using a locally linear\, and potentially multi-scale\, dictionary with Gaussian kernels. In this talk\, we propose a simple and general alternative\, which instead uses pieces of spheres\, or spherelets\, to locally approximate the unknown subspace. I will also introduce a curved kernel called the the Fisher–Gaussian (FG) kernel which outperforms multivariate Gaussians in many cases. Theory is developed showing that spherelets can produce lower covering numbers and mean square errors for many manifolds\, as well as the posterior consistency of the Dirichlet process mixture of FG kernels. Results relative to state-of-the-art competitors show gains in ability to accurately approximate the subspace and the density with fewer components and parameters. Time permitting\, I will also present some applications of spherelets\, including classification\, geodesic distance estimation and clustering.
URL:https://www.carloalberto.org/event/didong-li-princeton-university-usa-webinar/
CATEGORIES:Seminars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20201217T170000
DTEND;TZID=Europe/Rome:20201217T183000
DTSTAMP:20240425T084842
CREATED:20201116T114146Z
LAST-MODIFIED:20201116T114427Z
UID:10004152-1608224400-1608229800@www.carloalberto.org
SUMMARY:David Rossell (Universitat Pompeu Fabra\, Barcelona\, Spain) (webinar)
DESCRIPTION:“Approximate Laplace approximation” \nJoint initiative with MIDAS Complex Data Modeling Research Network https://midas.mat.uc.cl/network/ \nAbstract: Bayesian model selection requires an integration exercise in order to assign posterior model probabilities to each candidate model. The computation becomes cumbersome when the integral has no closed-form\, particularly when the sample size is large\, or the number of models is large. We present a simple yet powerful idea based on the Laplace approximation (LA) to an integral. LA uses a quadratic Taylor expansion at the mode of the integrand and is typically quite accurate\, but requires cumbersome likelihood evaluations (for large n) an optimization (for large p). We propose the approximate Laplace approximation (ALA)\, which uses an Taylor expansion at the null parameter value. ALA brings very significant speed-ups by avoiding optimizations altogether\, and evaluating likelihoods via sufficient statistics. ALA is an approximate inference method equipped with strong model selection properties in the family of non-linear GLMs\, attaining comparable rates to exact computation. When (inevitably) the model is misspecified the ALA rates can actually be faster than for exact computation\, depending on the type of misspecification. We show examples in non-linear Gaussian regression with non-local priors\, for which no closed-form integral exists\, as well as non-linear logistic\, Poisson and survival regression.
URL:https://www.carloalberto.org/event/david-rossell-universitat-pompeu-fabra-barcelona-spain/
CATEGORIES:Seminars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20210226T170000
DTEND;TZID=Europe/Rome:20210226T183000
DTSTAMP:20240425T084842
CREATED:20210211T130019Z
LAST-MODIFIED:20210217T074701Z
UID:10004194-1614358800-1614364200@www.carloalberto.org
SUMMARY:Yun Wei (Samsi and Duke University\, USA) (webinar)
DESCRIPTION:“Obtaining faster convergence rates in finite mixture models by taking repeated measures” \nJoint initiative with MIDAS Complex Data Modeling Research Network https://midas.mat.uc.cl/network/ \nAbstract: It is known that some finite mixture models suffer from slow rates for estimating the component parameters. Examples are mixtures of the weakly identifiable families in the sense of [Ho and Nguyen 2016]. To obtain faster parameter convergence rates\, we propose to collect more samples from each mixture component\, hence each data is a vector of samples from the same mixture component. Such a model is known in the literature as a finite mixture model of repeated measures\, which has been applied in psychological study and topic modeling. This model also belongs to the mixture of product distributions\, with the special structure that the product distributions in each mixture component are also identical. In this setup\, each data consists of conditionally independent and identically distributed samples and thus is an exchangeable sequence.\nWe show that by taking repeated measures (collecting more samples from each mixture component)\, a finite mixture model that is not originally identifiable becomes identifiable. Moreover\, the posterior contraction rates for the parameter estimation are also obtained\, demonstrating that repeated measures are beneficial for estimating the component parameters. Our results hold for general probability families including all regular exponential families and can also be applied to hierarchical models. The key tool to develop the results is by establishing an inverse inequality to upper bound a suitable distance between mixing measures by the total variational distance between the corresponding mixture densities.\nBased on joint work with Xuanlong Nguyen.
URL:https://www.carloalberto.org/event/yun-wei-samsi-and-duke-university-usa-webinar/
CATEGORIES:Seminars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20210319T170000
DTEND;TZID=Europe/Rome:20210319T183000
DTSTAMP:20240425T084842
CREATED:20210211T130308Z
LAST-MODIFIED:20210309T082652Z
UID:10004196-1616173200-1616178600@www.carloalberto.org
SUMMARY:Andrea Ottolini (Standford University\, USA) (webinar)
DESCRIPTION:“Gibbs sampling in the analysis of priors for almost exchangeable data” \nJoint initiative with MIDAS Complex Data Modeling Research Network https://midas.mat.uc.cl/network/ \nAbstract: Consider a population of N individuals divided into d subgroups (e.g.\, d=4 and people are divided by sex and smoking habits). A sequence of 0-1 valued experiments on the population with outcomes X_1\,…\, X_n is called partially exchangeable if the only relevant information in the data is the number of 1’s in each category. de Finetti’s representation result guarantees that the distribution of the X’s (for n<<N) is in bijection with measures pi^{(n)} on [0\,1]^d. Natural problems — such as understanding which of the subdivisions are truly meaningful and what is the effect of our initial belief pi^{(0)} — can be addressed if one knows how to sample efficiently from these measures. In the absence of further information about the experiments\, de Finetti suggests starting with a class of priors pi^{(0)}_{C\,A} indexed by a d-dimensional network (measuring the relative differences among categories) and a positive parameter A (measuring the belief in complete exchangeability). in this talk I will analyze the performance of Gibbs sampling in approximating these measures in the almost exchangeable regime (A>>1). It will be shown that A^2 steps are necessary and sufficient to mix in a certain Wasserstein distance\, with constants depending on few spectral parameters of the network C. This is based on joint work with Gerencsér.
URL:https://www.carloalberto.org/event/andrea-ottolini-standford-university-usa-webinar/
CATEGORIES:Seminars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20210409T120000
DTEND;TZID=Europe/Rome:20210409T133000
DTSTAMP:20240425T084842
CREATED:20210216T085746Z
LAST-MODIFIED:20210329T124338Z
UID:10004200-1617969600-1617975000@www.carloalberto.org
SUMMARY:Gilles Stupfler (ENSAI Rennes and CREST\, France) (webinar)
DESCRIPTION:“Asymmetric least squares techniques for extreme risk estimation” \nAbstract: Financial and actuarial risk assessment is typically based on the computation of a single quantile (or Value-at-Risk). One drawback of quantiles is that they only take into account the frequency of an extreme event\, and in particular do not give an idea of what the typical magnitude of such an event would be. Another issue is that they do not induce a coherent risk measure\, which is a serious concern in actuarial and financial applications. In this talk\, I will explain how\, starting from the formulation of a quantile as the solution of an optimisation problem\, one may come up with two alternative families of risk measures\, called expectiles and extremiles. I will give a broad overview of their properties\, as well as of their estimation at extreme levels in heavy-tailed models\, and explain why they constitute sensible alternatives for risk assessment using some real data applications. This is based on joint work with Abdelaati Daouia\, Irène Gijbels\, Stéphane Girard and Antoine Usseglio-Carleve.
URL:https://www.carloalberto.org/event/gilles-stupfler-ensai-rennes-and-crest-france-webinar/
CATEGORIES:Seminars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20210430T120000
DTEND;TZID=Europe/Rome:20210430T133000
DTSTAMP:20240425T084842
CREATED:20210211T130452Z
LAST-MODIFIED:20210422T155555Z
UID:10004198-1619784000-1619789400@www.carloalberto.org
SUMMARY:Matthias Loffler (ETH Zurich) (webinar)
DESCRIPTION:“Optimality of Spectral Clustering in the Gaussian Mixture Model” \nAbstract: Spectral clustering is one of the most popular algorithms to group high dimensional data. It is easy to implement and computationally efficient. Despite its popularity and successful applications\, its theoretical properties have not been fully understood. We show that spectral clustering is minimax optimal in the Gaussian Mixture Model with isotropic covariance matrix\, when the number of clusters is fixed and the signal-to-noise ratio is large enough. Spectral gap conditions are widely assumed in the literature to analyze spectral clustering. On the contrary\, we show that these conditions are not needed.
URL:https://www.carloalberto.org/event/matthias-loffler-eth-zurich-webinar/
CATEGORIES:Seminars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20210521T120000
DTEND;TZID=Europe/Rome:20210521T133000
DTSTAMP:20240425T084842
CREATED:20210510T122623Z
LAST-MODIFIED:20210510T122915Z
UID:10005234-1621598400-1621603800@www.carloalberto.org
SUMMARY:Paul Jenkins (University of Warwick) (webinar)
DESCRIPTION:“Asymptotic genealogies of interacting particle systems” \nAbstract: Interacting particle systems are a broad class of stochastic models for phenomena arising in physics\, engineering\, biology\, and finance. A prominent class of such models can be expressed as a sequential Monte Carlo algorithm in which the aim is to construct an empirical approximation to a sequence of measures. The approximation is constructed by evolving a discrete-time\, weighted population of particles\, alternating between a Markov update and a resampling step. Resampling gives rise to a notion of a genealogy in which duplicated particles are regarded as offspring of their parents. In this talk I discuss how to characterise the genealogy underlying this evolving particle system. More precisely\, under certain conditions we can show that the genealogy converges (as the number of particles grows) to Kingman’s coalescent\, a stochastic tree-valued process widely studied in population genetics. This makes explicit the analogy between sequential Monte Carlo and an evolving biological population. This is joint work with Suzie Brown\, Adam Johansen\, Jere Koskela\, and Dario Spanò.
URL:https://www.carloalberto.org/event/paul-jenkins-university-of-warwick-webinar/
CATEGORIES:Seminars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20211015T120000
DTEND;TZID=Europe/Rome:20211015T130000
DTSTAMP:20240425T084842
CREATED:20211004T131431Z
LAST-MODIFIED:20211006T104704Z
UID:10005278-1634299200-1634302800@www.carloalberto.org
SUMMARY:Bas Kleijn (University of Amsterdam)
DESCRIPTION:“Confidence sets in a sparse stochastic block model with two communities of unknown sizes” \nAbstract: In a sparse stochastic block model with two communities of unequal sizes we derive two posterior concentration inequalities\, that imply (1) posterior (almost-)exact recovery of the community structure under sparsity bounds comparable to well-known sharp bounds in the planted bi-section model; (2) a construction of confidence sets for the community assignment from credible sets\, with finite graph sizes. The latter enables exact frequentist uncertain quantification with Bayesian credible sets at non-asymptotic graph sizes\, where posteriors can be simulated well. There turns out to be no proportionality between credible and confidence levels: for given edge probabilities and a desired confidence level\, there exists a critical graph size where the required credible level drops sharply from close to one to close to zero. At such graph sizes the frequentist decides to include not most of the posterior support for the construction of his confidence set\, but only a small subset of community assignments containing the highest amounts of posterior probability (like the maximum-a-posteriori estimator). It is argued that for the proposed construction of confidence sets\, a form of early stopping applies to MCMC sampling of the posterior\, which would enable the computation of confidence sets at\nlarger graph sizes.\nLink: https://arxiv.org/abs/2108.07078
URL:https://www.carloalberto.org/event/bas-kleijn-university-of-amsterdam-2/
CATEGORIES:Seminars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20211119T120000
DTEND;TZID=Europe/Rome:20211119T130000
DTSTAMP:20240425T084842
CREATED:20211005T083716Z
LAST-MODIFIED:20211006T104946Z
UID:10004264-1637323200-1637326800@www.carloalberto.org
SUMMARY:Lorenzo Rosasco (Università di Genova & MIT)
DESCRIPTION:“Interpolation and learning with scale dependent kernels” \nAbstract: We study the learning properties of nonparametric ridge-less least squares. In particular\, we consider the common case of estimators defined by scale dependent (Matern) kernels\, and focus on the role scale and smoothness. These estimators interpolate the data and the scale can be shown to control their stability to noise and sampling. Larger scales\, corresponding to smoother functions\, improve stability with respect to sampling. However\, smaller scales\, corresponding to more complex functions\, improve stability to noise. We will discuss to which extent these results can explain the learning curves observed for large overparameterized models. Our analysis combines\, probabilistic results with analytic techniques from interpolation theory.
URL:https://www.carloalberto.org/event/lorenzo-rosasco-universita-di-genova-mit/
CATEGORIES:Seminars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20211126T120000
DTEND;TZID=Europe/Rome:20211126T130000
DTSTAMP:20240425T084842
CREATED:20211028T084607Z
LAST-MODIFIED:20211118T152427Z
UID:10004272-1637928000-1637931600@www.carloalberto.org
SUMMARY:Gonzalo Mena (University of Oxford)
DESCRIPTION:“On the unreasonable effectiveness of Sinkhorn algorithm for learning permutations and entropic optimal transport” \nAbstract: Sinkhorn’s algorithm realizes the solution of entropy-regularized linear programs on certain matrix polytopes. In the past years\, the interest in this algorithm has grown considerably because of its usefulness as a tool for the modeling of permutations\, and because of its fundamental role in the solution of an entropic optimal transport problem\, also called the Schrödinger bridge. In this talk\, I will give an overview of my work in relation to these two areas.\nFirst\, regarding entropic optimal transport\, I will argue that this tool is valuable for deriving sensible statistical procedures. Indeed\, we show that it enjoys a substantially better sample complexity compared to optimal transport\, which suffers from the curse of dimensionality. Also\, in the more applied setup of model-based clustering we show that it can be used as an alternative to the log-likelihood\, since it has fewer bad local optima. Based on this observation\, we develop a new algorithm\, Sinkhorn-EM\, in which we only modify the E-step to solve an Entropic Optimal Transport problem. Our algorithm is shown to attain better practical performance.\nSecond\, regarding permutations\, I will describe some successful applications in Deep Learning\, and in neuroscience\, for the inference of neural identities in C.elegans worms. \nLinks\nhttps://arxiv.org/abs/1802.08665\nhttps://arxiv.org/abs/1905.11882\nhttps://arxiv.org/abs/2006.16548
URL:https://www.carloalberto.org/event/gonzalo-mena-university-of-oxford/
CATEGORIES:Seminars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20211217T120000
DTEND;TZID=Europe/Rome:20211217T130000
DTSTAMP:20240425T084842
CREATED:20211028T084737Z
LAST-MODIFIED:20211129T145342Z
UID:10004273-1639742400-1639746000@www.carloalberto.org
SUMMARY:Lorenzo Masoero (Amazon)
DESCRIPTION:“Improved prediction and optimal sequencing strategies for genomic variant discovery via Bayesian nonparametrics” \nAbstract: Despite the advent of Big Data\, data-gathering in many domains can still be an expensive process that necessitates careful planning when operating under a fixed\, limited budget. For instance\, sequencing new genomic data is a complex procedure that requires careful tuning: researchers can spend resources to sequence a greater number of genomes (quantity)\, or spend resources to sequence genomes with increased accuracy (quality). In this talk\, I consider the common setting in which scientists have already conducted a pilot study to reveal variants in a genome and are contemplating a follow-up study. Spending additional resources has the potential to reveal new variations in the genome\, and thereby new genetic insights. Therefore\, practitioners are interested in (i) predicting how many new discoveries they will make under different experimental design choices. In turn\, they can leverage these predictions to optimally allocate available resources in the design of a future experiment\, e.g. (ii) to maximize the number of future discoveries or (iii) to optimize the usefulness of a future experiment for the task at hand\, e.g. the power of an associated statistical test.\nI discuss novel methodologies to solve the problems mentioned above. Our approach relies on a Bayesian nonparametric formulation that facilitates (i) prediction for the number of new variants in the follow-up study based on the pilot study. We show empirically that\, when experimental conditions are kept constant between the pilot and follow-up\, our method’s prediction is competitive with the best existing methods. Unlike current methods\, though\, our new method allows practitioners to change experimental conditions between the pilot and the follow-up. We demonstrate how this distinction allows our method to be used for more realistic predictions and for optimal allocation of a fixed budget between quality and quantity. In particular\, we first show how\, under a fixed budget\, my predictions can be used to maximize (ii) the number of new genomic variants discovered in a follow-up study. Last\, we show how our framework can guide practitioners in other experimental design problems\, and specifically how to achieve (iii) the highest possible power in statistical tests in the context of rare variants association studies.
URL:https://www.carloalberto.org/event/lorenzo-masoero-amazon/
CATEGORIES:Seminars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20220218T120000
DTEND;TZID=Europe/Rome:20220218T130000
DTSTAMP:20240425T084842
CREATED:20220125T095256Z
LAST-MODIFIED:20220211T183406Z
UID:10005302-1645185600-1645189200@www.carloalberto.org
SUMMARY:Graham Cormode (University of Warwick and Facebook) (webinar)
DESCRIPTION:“New Lower and Upper Bounds for Quantile Summary Algorithms” \nAbstract: Finding the median\, or more generally quantiles\, is a core problem in data analysis. The question has been heavily studied in streaming and related models of computation\, for over four decades. In this talk I will present some recent advances:\n– Lower bounds for approximating quantiles in the deterministic comparison model\, for additive error\, which show that the best known algorithm is in fact optimal\n– Upper bounds for relative error epsilon-approximations of quantiles\, which improves over previous results and exceed the best known lower bounds by only an O(log(1/e)3/2) factor.\nThis covers joint work with Pavel Vesely\, Justin Thaler\, Edo Liberty and Zohar Karnin.
URL:https://www.carloalberto.org/event/graham-cormode-university-of-warwick/
CATEGORIES:Seminars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20220318T110000
DTEND;TZID=Europe/Rome:20220318T120000
DTSTAMP:20240425T084842
CREATED:20220210T100336Z
LAST-MODIFIED:20230619T095840Z
UID:10005333-1647601200-1647604800@www.carloalberto.org
SUMMARY:Peter Tankov (CREST\, ENSEAE\, IP Paris)
DESCRIPTION:“Mean-field games: the linear programming approach”\n  \nAbstract: We will review the linear programming approach to mean-field games\, developed by the author in a series of papers. Mean-field games are games with a large number of identical small agents and symmetric interactions\, which have found many applications in finance and economics. The linear programming approach is a relaxation technique\, which allows to prove existence results under weak assumptions\, and lends itself well to numerical implementation. We will present an application of the approach to a model of energy markets and discuss the « linear programming fictitious play » algorithm for approximating the equilibrium.
URL:https://www.carloalberto.org/event/peter-tankov-crest-enseae-ip-paris/
CATEGORIES:Seminars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20220318T120000
DTEND;TZID=Europe/Rome:20220318T130000
DTSTAMP:20240425T084842
CREATED:20220125T095405Z
LAST-MODIFIED:20220315T144908Z
UID:10005303-1647604800-1647608400@www.carloalberto.org
SUMMARY:Cristina Butucea (CREST\, ENSEAE\, IP Paris)
DESCRIPTION:“Nonlinear functionals estimation under local differential privacy”\n  \nAbstract: The concept of differential privacy provides a rigorous formalism to randomize data and quantify the amount of privacy. We consider i.i.d. individuals with outcomes distributed according to the common probability distribution P. This original data is further randomized using a privacy mechanism into observations that the statistician is allowed to use in order to recover information about the distribution P. We will consider local differential privacy where each sample from the original data is privatized on the user’s local machine before its release.\nWe build privatized samples and nonparametric estimation methods of nonlinear functionals of the probability density and prove their optimality. We show that for the estimation of a quadratic functional\, interactive procedures that use previously released private data are faster than the non interactive ones. In both cases we show how to produce privacy mechanisms and estimators adaptive to (free of) the smoothness of the underlying density. We extend these results to non smooth functionals of the density. \nThis is based on joint work with A. Rohde\, L. Steinberger and Y. Issartel.
URL:https://www.carloalberto.org/event/cristina-butucea-crest-enseae-ip-paris/
CATEGORIES:Seminars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20220401T120000
DTEND;TZID=Europe/Rome:20220401T130000
DTSTAMP:20240425T084842
CREATED:20220125T095459Z
LAST-MODIFIED:20220316T154802Z
UID:10005304-1648814400-1648818000@www.carloalberto.org
SUMMARY:Alexandra Carpentier (University of Potsdam)
DESCRIPTION:“Optimal ranking in crowd-sourcing”\n  \nAbstract: Consider a crowd sourcing problem where we have n experts and d tasks. The average ability of each expert for each task is stored in an unknown matrix M\, which is only observed in noise and incompletely. We make no (semi) parametric assumptions\, but assume that both experts and tasks can be perfectly ranked: so that if an expert is better than another\, she performs on average better on all tasks than the other – and that the same holds for the tasks. This implies that if the matrix M is permuted so that the experts and tasks are perfectly ranked\, then the permuted matrix M is bi-isotonic.\nWe focus on the problem of recovering the optimal ranking of the experts in l_2 norm\, when the questions are perfectly ranked. We provide a minimax-optimal and computationally feasible method for this problem\, based on hierarchical clustering\, PCA\, and exchange of informations among the clusters. We prove in particular – in the case where d > n – that the problem of estimating the expert ranking is significantly easier than the problem of estimating the matrix M.\nThis talk is based on joint work with Emmanuel Pilliat and Nicolas Verzelen.
URL:https://www.carloalberto.org/event/alexandra-carpentier-university-of-potsdam/
CATEGORIES:Seminars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20220422T120000
DTEND;TZID=Europe/Rome:20220422T130000
DTSTAMP:20240425T084842
CREATED:20220125T095558Z
LAST-MODIFIED:20220404T073632Z
UID:10005305-1650628800-1650632400@www.carloalberto.org
SUMMARY:Johannes Schmidt-Hieber (University of Twente)
DESCRIPTION:“A theoretical comparison of deep learning and Bayes with deep Gaussian process priors”\n  \nAbstract: Deep neural networks have received a lot of attention recently and considerable progress has been made to build an underlying mathematical foundation. In a first part of the talk\, we summarize some statistical convergence results. Deep Gaussian process priors can be viewed as continuous analogues of Bayesian neural networks and this raises the question whether there is a closer link with deep learning. In the second part of the talk\, we show that the posterior for a suitable deep Gaussian process prior can achieve fast posterior contraction rates and discuss the connection with deep learning. This is joint work with Gianluca Finocchio.
URL:https://www.carloalberto.org/event/johannes-schmidt-hieber-university-of-twente/
CATEGORIES:Seminars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20220518T120000
DTEND;TZID=Europe/Rome:20220518T130000
DTSTAMP:20240425T084842
CREATED:20220125T095809Z
LAST-MODIFIED:20220426T161458Z
UID:10005306-1652875200-1652878800@www.carloalberto.org
SUMMARY:Larry Goldstein (University of Southern California)
DESCRIPTION:“Relaxing Gaussian assumptions in high dimensional statistical procedures”\n  \nAbstract: The assumption that high dimensional data is Gaussian is pervasive in many statistical procedures\, due not only to its tail decay\, but also to the level of analytic tractability this special distribution provides. We explore the relaxation of the Gaussian assumption in Single Index models and Shrinkage estimation using two tools that originate in Stein’s method: Stein kernels\, and the zero bias transform. Taking this approach leads to measures of discrepancy from the Gaussian that arise naturally from the nature of the procedures considered\, and result in performance bounds in contexts not restricted to the Gaussian. The resulting bounds are tight in the sense that they include an additional term that reﬂects the cost of deviation from the Gaussian\, and vanish for the Gaussian\, thus recovering this particular special case.\nJoint work with: Xiaohan Wei\, Max Fathi\, Gesine Reinert\, and Adrien Samaurd
URL:https://www.carloalberto.org/event/larry-goldstein-university-of-southern-california/
CATEGORIES:Seminars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20220527T120000
DTEND;TZID=Europe/Rome:20220527T130000
DTSTAMP:20240425T084842
CREATED:20211207T111323Z
LAST-MODIFIED:20220520T090004Z
UID:10005289-1653652800-1653656400@www.carloalberto.org
SUMMARY:Alexander Cox (University of Bath)
DESCRIPTION:“Binary branching processes with Moran type interactions”\n  \nAbstract. In this talk\, we will discuss the large population limit of a binary branching particle system with Moran type interactions: we introduce a new model where particles evolve\, reproduce and die independently and\, with a probability that may depend on the configuration of the whole system\, the death of a particle may trigger the reproduction of another particle\, while a branching event may trigger the death of another one. We study the occupation measure of the new model\, explicitly relating it to the Feynman-Kac semigroup of the underlying Markov evolution and quantifying the L2 distance between their normalisations. This model extends the fixed size Moran-type interacting particle system discussed in Del Moral\, Burdzy et al\, and Villemonais\, and we will indeed show that our model outperforms the latter when used to approximate a birth and death process. We discuss several other applications of our model including the neutron transport equation\, Sequential Monte Carlo methods\, and population size dynamics. \n  \nJoint work with Emma Horton and Denis Villemonais.
URL:https://www.carloalberto.org/event/alexander-cox-university-of-bath/
CATEGORIES:Seminars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20221021T120000
DTEND;TZID=Europe/Rome:20221021T130000
DTSTAMP:20240425T084842
CREATED:20221014T104724Z
LAST-MODIFIED:20230511T095458Z
UID:10005436-1666353600-1666357200@www.carloalberto.org
SUMMARY:Marie Kratz (ESSEC Business School\, CREAR risk research center)
DESCRIPTION:“Pro-Cyclicality Beyond Business Cycles: The Case of Traditional Risk Measurements”\n\nAbstract There is an accepted idea that risk measurements are pro-cyclical: in times of crisis\, they overestimate the future risk\, while they underestimate it in quiet times. We lay down a simple and efficient methodology to evaluate the amount of pro-cyclicality in the way financial institutions measure risk. To do so\, we introduce a new indicator based on the Sample Quantile Process (a dynamic extension of Value-at-Risk)\, conditioned on realized volatility. Using this framework\, we prove that pro-cyclicality is inherent in risk measure estimates based on historical data. We identify two main factors explaining this pro-cyclical behavior: the clustering and return-to-the-mean of volatility\, as it could have been anticipated but not yet quantified\, and\, more surprisingly\, the very way risk is measured\, even in a world with constant volatility\, though the empirical magnitude of the mean-reversion is greater than what would be observed in that special case. We develop CLTs and FCLT’s for functionals of quantile and dispersion estimators to support theoretically those empirical findings. This is a joint work with Dr. Marcel Bräutigam and Dr. Michel Dacorogna.
URL:https://www.carloalberto.org/event/marie-kratz-essec-business-school/
CATEGORIES:Seminars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20221111T120000
DTEND;TZID=Europe/Rome:20221111T130000
DTSTAMP:20240425T084842
CREATED:20221017T091154Z
LAST-MODIFIED:20221104T131217Z
UID:10005438-1668168000-1668171600@www.carloalberto.org
SUMMARY:Massimiliano Pontil (Italian Institute of Technology and University College London)
DESCRIPTION:Learning Dynamical Systems via Koopman Operator Regression in Reproducing Kernel Hilbert Spaces\n\nAbstract In the last couple of years the Koopman operator framework attracted a lot of attention due to growing use of data-driven dynamical systems in science and engineering. Relevant results span fields of functional analysis\, numerical algorithms\, as well as machine learning. In this talk we will focus on discrete dynamical systems modelled by Markov chains and their study through the corresponding Koopman operator. While many data-driven algorithms to reconstruct such operators are well known\, their relationship with statistical learning is largely unexplored. We will introduce a framework to learn the Koopman operator from a finite data sample of the dynamical system. We will limit to the important case when the Markov chain admits an invariant distribution and\, relying on the theory of reproducing kernel Hilbert spaces (RKHS)\, introduce a notion of risk from which different estimators naturally arise. Special attention will be given to the estimation of the Koopman spectral properties\, and in particular its modal decomposition (KMD) that is of paramount importance in practice. With this framework we will analyse\, theoretically and empirically\, some existing estimators\, e.g. (extended) dynamic mode decomposition (DMD)\, and introduce a novel reduced-rank operator regression (RRR) estimator. We will further present learning bounds for the proposed estimator\, holding for both data drawn i.i.d. from the invariant distribution and for non i.i.d. data gathered from the trajectories of the dynamical system\, the latter in terms of mixing coefficients.\n(Joint work with V. Kostic\, P. Novelli\, A. Maurer\, C. Ciliberto\, L. Rosasco)
URL:https://www.carloalberto.org/event/massimiliano-pontil-italian-institute-of-technology-and-university-college-london/
CATEGORIES:Seminars in Statistics
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Rome:20221118T110000
DTEND;TZID=Europe/Rome:20221118T120000
DTSTAMP:20240425T084842
CREATED:20221107T114430Z
LAST-MODIFIED:20221107T142350Z
UID:10005451-1668769200-1668772800@www.carloalberto.org
SUMMARY:Vinayak Rao (Purdue University USA)
DESCRIPTION:Differential Privacy and Bayesian Computation: Two Vignettes\n\nAbstract Differential privacy (DP) protects privacy by introducing additional randomness into a recorded dataset. It comes with strong theoretical guarantees\, and has become a state-of-the-art framework for privacy protection. In this talk\, we consider two complementary challenges raised by DP. In the first part\, we recognize that implementing DP mechanisms require sampling algorithms like MCMC or rejection sampling. In these instances\, the algorithm runtime itself can leak privacy\, so that practical implementations fail to maintain the original theoretical guarantees. To address this\, we propose modifications to rejection and adaptive rejection sampling algorithms\, with varying assumptions\, to protect against timing attacks. In the second part\, we focus on a more traditional statistics problem related to differential privacy: given access to only the privatized data\, how to perform valid statistical inference on parameters underlying the confidential data. Here\, the likelihood function of the privatized data requires integrating over the large space of confidential databases and is typically intractable\, resulting\, in Bayesian settings\, in a posterior distribution that is doubly intractable. We propose a generic MCMC framework which is applicable to a wide range of statistical models and privacy mechanisms. Our MCMC algorithm is a simple wrapper that extends MCMC algorithms for the unobserved confidential data to settings where the data is privatized. Our approach translates privacy guarantees of the DP mechanism into mixing properties of the MCMC algorithm\, while maintaining the same order of computational cost as the algorithm for non privatized data. We illustrate the efficacy and applicability of both our ideas on several examples.
URL:https://www.carloalberto.org/event/vinayak-rao-purdue-university-usa/
CATEGORIES:Seminars in Statistics
END:VEVENT
END:VCALENDAR