Archive for the ‘Posted by Greg Ver Steeg’ Category

I’m excited to share a student paper that was just accepted to ICML. Neural networks are capable of memorizing training labels, but if this happens they will generalize poorly when applied to test data. Where is that information about memorized labels stored? Well, it has to be stored in the neural network weights somewhere. You […]

ICML and MixHop


ICML 2019 is coming up soon, and I plan to be there (except I’m missing Tuesday). I want to briefly tout the excellent work of a fantastic student who joined our lab, Sami Abu-El-Haija. If you’ve kept up with developments on learning with graphs, you may be aware of graph convolutional networks, which combine the […]

Southern information theorists after the civil war realized that although they could no longer exclude former slaves from the polls, they could exclude people based on other criteria like, say, education, and that these criteria happen to be highly correlated with formerly being a slave who was not allowed education. Republican information theorists continue to exploit […]

It’s officially been a year since my last blog post. There have been so many exciting new things going on that it’s been hard to take time out for some nice big-picture blog posts. Here are a few areas that I have the best of intentions of getting to. Fair representation learning using information theory […]

Consider a little science experiment we’ve all done, to find out if a switch controls a light. How many data points does it usually take to convince you? Not many! Even if you didn’t do a randomized trial yourself and only observed somebody else manipulating the switch, you’d figure it out pretty quickly. This type of […]

The Grue language doesn’t have words for “blue” or “green”. Instead Grue speakers have the following concepts:
grue: green during the day and blue at night
bleen: blue during the day and green at night
(This example is adapted from the original grue thought experiment.) To us, these concepts seem needlessly complicated. However, to a […]

The work with Shirley Pepke on using CorEx to find patterns in gene expression data is finally published in BMC Medical Genomics. Shirley wrote a blog post about it as well. She will present this work at the Harvard Precision Medicine conference and we’ll both present at Berkeley’s Data Edge conference. The code we used for the paper […]

Edit: Also check out the story by the Washington Post and on […]. Shirley is a collaborator of mine who works on using gene expression data to get a better understanding of ovarian cancer. She has a remarkable personal story that is featured in a podcast about our work together. I laughed, I cried, I can’t recommend […]

Here’s one way to solve a problem. (1) Visualize what a good solution would look like. (2) Quantify what makes that solution “good”. (3) Search over all potential solutions for one that optimizes the goodness. I like working on this whole pipeline, but I have come to the realization that I have been spending too much […]

Here are the slides from my talk yesterday at ICML. The information sieve is introduced in this paper. But in this followup paper, we make it really practical and demonstrate the connections to “common information”. The code is on github for the discrete and continuous versions.