Plenary Speakers

Vipin Kumar
Regents Professor and William Norris Endowed Chair
Computer Science and Engineering
University of Minnesota
USA 
Big Data in Climate and Earth Sciences: Challenges and Opportunities for Data Science
The climate and earth sciences have recently undergone a rapid transformation from a data-poor to a data-rich environment. In particular, massive amounts of data about the Earth and its environment are now continuously being generated by a large number of Earth-observing satellites as well as by physics-based earth system models running on large-scale computational platforms. These massive and information-rich datasets offer huge potential for understanding how the Earth's climate and ecosystem have been changing and how they are being impacted by human actions. This talk will discuss the challenges involved in analyzing these massive data sets, as well as the opportunities they present for advancing both machine learning and the science of climate change, in the context of monitoring the state of tropical forests and surface water on a global scale.
You can access the slides here.
More info about the speaker here.

Robert D. Nowak
Nosbusch Professor in Engineering
University of Wisconsin–Madison
USA 
Active Machine Learning: From Theory to Practice
Machine learning has advanced considerably in recent years, but mostly in well-defined domains using huge amounts of human-labeled training data. Machines can recognize objects in images and translate text, but they must be trained with more images and text than a person can see in nearly a lifetime. Generating the necessary training data sets can require an enormous human effort. Active machine learning tackles this issue by designing learning algorithms that automatically and adaptively select the most informative data for labeling, so that human time is not wasted on irrelevant or trivial examples. This lecture will cover the theory, methods, and applications of active machine learning.
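To make the idea concrete, here is a toy illustration of adaptive label selection (my own sketch, not an example from the lecture): labeling points along a line that are 0 left of a hidden threshold and 1 to its right. Passively labeling the whole pool costs one query per point, while actively querying the most uncertain remaining point — binary search — needs only logarithmically many labels.

```python
# Toy pool-based active learning on a hypothetical 1-D threshold task.
# Labels are 0 left of a hidden threshold and 1 at or right of it.
# Actively querying the midpoint of the remaining uncertainty region
# finds the decision boundary with O(log n) labels instead of O(n).

def oracle(x, threshold=0.62):
    """Simulated human labeler: 1 if x is at or right of the threshold."""
    return 1 if x >= threshold else 0

def active_threshold_search(pool, label_fn):
    """Binary-search the sorted pool, querying only maximally
    informative points; returns (estimated boundary, labels used)."""
    pool = sorted(pool)
    lo, hi, queries = 0, len(pool) - 1, 0
    while lo < hi:
        mid = (lo + hi) // 2          # most uncertain remaining point
        queries += 1
        if label_fn(pool[mid]) == 1:
            hi = mid                  # boundary is at or left of mid
        else:
            lo = mid + 1              # boundary is right of mid
    return pool[lo], queries

pool = [i / 1000 for i in range(1000)]
boundary, used = active_threshold_search(pool, oracle)
print(boundary, used)  # finds the boundary with ~10 labels, not 1000
```

Real active learners replace the midpoint rule with uncertainty or disagreement measures over a model class, but the economics are the same: each query is chosen to shrink the version space as fast as possible.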
You can access the slides here.
More info about the speaker here.

Bin Yu
Chancellor's Professor
Statistics, Electrical Engineering and Computer Science
University of California, Berkeley
USA 
PCS Workflow, Interpretable Machine Learning, and DeepTune
In this talk, I'd like to discuss the intertwined importance and connections of three principles of data science: predictability, computability and stability (PCS), and the PCS workflow that is built on these three principles. I will also define interpretable machine learning (iML) through the PDR desiderata (Predictive accuracy, Descriptive accuracy and Relevancy) and discuss stability as a minimum requirement for interpretability. The principles and iML desiderata, PCS and PDR, will be demonstrated in the context of a collaborative project in neuroscience, DeepTune, for interpretable data results and testable hypothesis generation. If time allows, I will present proposed PCS inference, which includes perturbation intervals and PCS hypothesis testing. PCS inference uses prediction screening and takes into account both data and model perturbations. Last but not least, PCS documentation is proposed based on R Markdown, IPython, or Jupyter Notebook, with publicly available, reproducible code and narratives to back up the human choices made throughout an analysis. (The PCS workflow and documentation are demonstrated in a genomics case study available on Zenodo.)
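As one hypothetical illustration of the stability principle (my own sketch, not code from the talk): perturb the data by bootstrap resampling, refit a simple least-squares slope, and report a perturbation interval. A data result is treated as trustworthy only if it varies little across such perturbations.

```python
# A toy PCS-style stability check on synthetic data: the "data result"
# is a regression slope, and the perturbation interval summarizes how
# much it moves under bootstrap perturbations of the data.
import random

def slope(pairs):
    """Ordinary least-squares slope of y on x."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    num = sum((x - mx) * (y - my) for x, y in pairs)
    den = sum((x - mx) ** 2 for x, _ in pairs)
    return num / den

random.seed(0)
data = [(x, 2.0 * x + random.gauss(0, 0.5)) for x in range(30)]

estimates = []
for _ in range(200):                        # data perturbations
    resample = [random.choice(data) for _ in data]
    estimates.append(slope(resample))
estimates.sort()
interval = (estimates[4], estimates[194])   # central ~95% of perturbed fits
print(interval)
```

A narrow interval around the fitted slope indicates the result is stable to data perturbation; model perturbations (swapping in alternative reasonable models) would be checked analogously.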
You can access the slides here.
More info about the speaker here.

Rong Jin
Principal Engineer
Alibaba Group
China 
Optimization in Alibaba: Beyond Convexity
In this talk, I will discuss recent developments in large-scale optimization that go beyond the conventional wisdom of convex optimization. I will specifically address three challenging problems that have found applications in many Alibaba businesses. In the first application, we study the problem of optimizing truncated loss functions, which are of particular importance when learning from heavy-tailed distributions. We show that, despite its nonconvexity, under appropriate conditions a variant of gradient descent can efficiently find the global optimum. In the second application, we study the problem of how to find a local optimum in nonconvex optimization. We show that with the introduction of appropriate random perturbations, we can find a local optimum at the rate of O(1/gamma^3), where gamma defines the suboptimality, which significantly improves on the results of existing studies. In the last application, we consider optimizing a continuous function over a discrete space comprised of a huge number of data points. Special instances of this problem include approximate nearest neighbor search and learning a quantized neural network. The most intriguing result from our study is that this optimization problem becomes relatively easier when the size of the discrete space is sufficiently large. We provide both theoretical analysis and empirical studies.
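The role of random perturbation in the second application can be illustrated on a toy problem (my own example, not Alibaba's algorithm): f(x, y) = x⁴/4 − x²/2 + y² has a saddle point at the origin, where plain gradient descent started exactly there never moves, while a small random kick lets it slide down to a local minimum at x = ±1.

```python
# Toy demonstration that random perturbation helps gradient descent
# escape saddle points in nonconvex optimization.
import random

def grad(p):
    """Gradient of f(x, y) = x**4/4 - x**2/2 + y**2."""
    x, y = p
    return (x**3 - x, 2 * y)

def gd(p, steps=500, lr=0.05, perturb=0.0, rng=None):
    for _ in range(steps):
        gx, gy = grad(p)
        p = (p[0] - lr * gx, p[1] - lr * gy)
        if perturb and rng:           # small random kick each step
            p = (p[0] + rng.gauss(0, perturb),
                 p[1] + rng.gauss(0, perturb))
    return p

stuck = gd((0.0, 0.0))                           # plain GD: stays at saddle
escaped = gd((0.0, 0.0), perturb=1e-3, rng=random.Random(1))
print(stuck, escaped)  # escaped lands near a local minimum, |x| close to 1
```

The theoretical results in the talk quantify how fast such perturbed first-order methods reach an approximate local optimum; this sketch only shows the qualitative phenomenon.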
You can access the slides here.
More info about the speaker here.

Xiao-Li Meng
Whipple V. N. Jones Professor of Statistics
Harvard University
USA 
Is it a Computing Algorithm or a Statistical Procedure: Can you tell or should you care?
The line between computing algorithms and statistical procedures is becoming increasingly blurred, as practitioners are now typically given a black box that turns data into an “answer”. Is such a black box a computing algorithm or a statistical procedure? Does it matter that we know which is which? This talk reports my contemplations of such questions, which originated in my taking part in a project investigating the self-consistency principle introduced by Efron (1967). We will start with a simple regression problem to illustrate a self-consistency method, and the audience will be invited to contemplate whether it is a magical computing algorithm or a powerful statistical procedure. We will then discuss how such contemplations have played critical roles in developing the self-consistency principle into a “Likelihood-Free EM algorithm” for semi-/non-parametric estimation with incomplete data and under an arbitrary loss function, capable of addressing wavelet denoising with irregularly spaced data as well as variable selection via LASSO-type methods with incomplete data. Throughout the talk, the audience will also be invited to consider a wide-open problem: how to formulate, in general, the tradeoff between statistical efficiency and computational efficiency? (This talk is based on joint work with Thomas Lee and Zhan Li.)
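The flavor of a self-consistency iteration can be sketched on a toy regression with missing responses (my own hypothetical illustration, not the talk's method): repeatedly impute each missing y with the current fit's prediction, then refit on the completed data. The fixed point is "self-consistent" — the fit reproduces its own imputations — and here it coincides with the observed-data least-squares fit.

```python
# Toy self-consistency iteration for regression with missing responses:
# impute with the current fit, refit, repeat until the fit is a fixed
# point of its own imputation step.

def ols(pairs):
    """Least-squares (intercept, slope) of y on x."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    b = (sum((x - mx) * (y - my) for x, y in pairs)
         / sum((x - mx) ** 2 for x, _ in pairs))
    return my - b * mx, b

observed = [(0, 1.1), (1, 2.9), (2, 5.2), (3, 6.8)]   # (x, y) pairs seen
missing_x = [4, 5, 6]                                  # y never observed

a, b = 0.0, 0.0                                        # initial fit
for _ in range(1000):
    completed = observed + [(x, a + b * x) for x in missing_x]
    a, b = ols(completed)                              # refit, repeat

print(round(a, 3), round(b, 3))  # prints 1.09 1.94
```

Is the loop a mere computing trick for solving the observed-data normal equations, or a statistical procedure in its own right? That ambiguity is exactly the question the talk poses.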
You can access the slides here.
More info about the speaker here.

Jennifer Neville
Professor of Computer Science and Statistics
Purdue University
USA 
Towards Relational AI – The good, the bad, and the ugly of learning over networks
In the last 20 years, there has been a great deal of research on machine learning methods for graphs, networks, and other types of relational data. By moving beyond the independence assumptions of more traditional ML methods, relational models are now able to successfully exploit the additional information that is often observed in relationships among entities. Specifically, network models are able to use relational information to improve predictions about user interests, behavior, and interactions, particularly when individual data is sparse. The tradeoff, however, is that the heterogeneity, partial observability, and interdependence of large-scale network data can make it difficult to develop efficient and unbiased methods, due to several algorithmic and statistical challenges. In this talk, I will discuss these issues while surveying several general approaches to relational learning in large-scale social and information networks. In addition, reflecting on the movement toward pervasive use of these models in personalized online systems, I will discuss potential implications for privacy, the polarization of communities, and the spread of misinformation.
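One of the simplest relational ideas — using neighbors when a node's own data is sparse — can be sketched as label propagation on a toy friendship graph (my own illustration, not one of the speaker's methods): labeled nodes keep their labels, and each unlabeled node repeatedly takes the average of its neighbors' current scores.

```python
# Toy label propagation on a hypothetical friendship graph: node "a"
# is known to be interested (1.0), node "f" is known not to be (0.0),
# and the remaining nodes inherit scores from their network neighbors.

edges = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e"], "e": ["d", "f"], "f": ["e"],
}
known = {"a": 1.0, "f": 0.0}          # observed labels (1 = interested)

scores = {v: known.get(v, 0.5) for v in edges}
for _ in range(100):                  # iterate to (approximate) convergence
    for v in edges:
        if v not in known:            # only unlabeled nodes update
            nbrs = edges[v]
            scores[v] = sum(scores[u] for u in nbrs) / len(nbrs)

print({v: round(s, 2) for v, s in sorted(scores.items())})
```

Nodes near "a" end up with high scores and nodes near "f" with low ones — relational information substitutes for missing individual data. The talk's challenges (heterogeneity, partial observability, interdependence) are precisely what make scaling such ideas to real networks hard.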
You can access the slides here.
More info about the speaker here.