In this paper, to support the promising empirical results of CBE, we extend the previous theoretical framework to address the optimal condition on the number of bits, showing that CBE requires the same number of bits to approximate the angle up to epsilon-distortion under mild assumptions. We also provide numerical experiments that support our theoretical results.
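As a minimal illustration of the setting, the sketch below (names, sizes, and parameters are our own, not from the paper) builds a circulant binary embedding via the FFT and estimates the angle between two vectors from the Hamming distance of their binary codes:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 256

# Hypothetical CBE sketch: project with a circulant matrix (computed
# via FFT as a circular convolution) and keep only the signs.
r = rng.standard_normal(d)          # first column of the circulant matrix
signs = rng.choice([-1.0, 1.0], d)  # random sign flips (Rademacher)

def cbe(x):
    # circulant(r) @ (signs * x), computed in O(d log d) via FFT
    proj = np.fft.ifft(np.fft.fft(r) * np.fft.fft(signs * x)).real
    return np.sign(proj)

x = rng.standard_normal(d)
y = x + 0.1 * rng.standard_normal(d)  # small perturbation: small true angle

true_angle = np.arccos(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))
hamming = np.mean(cbe(x) != cbe(y))   # fraction of differing bits
est_angle = np.pi * hamming           # each bit flips with prob. angle/pi
```

Each row of the circulant matrix is a permutation of a Gaussian vector, so each bit behaves marginally like a random-hyperplane hash, which is why the Hamming distance scaled by pi estimates the angle.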
Yoonho Hwang, Mooyeol Baek, Saehoon Kim, Bohyung Han, Hee-Kap Ahn
We propose an effective filtering algorithm that eliminates nearest neighbor candidates using their distance lower bounds in nonlinear embedded spaces constructed by product quantized translations. Experiments on several large-scale benchmark datasets show that our framework achieves state-of-the-art performance compared with existing exact nearest neighbor search algorithms.
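The filtering idea can be sketched with a triangle-inequality lower bound on distances to quantized points. The product quantization setup below (random untrained codebooks, toy sizes) is a hypothetical stand-in for a trained quantizer, not the paper's exact construction:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 2000, 8
data = rng.standard_normal((n, d))
query = rng.standard_normal(d)

# Toy PQ: 2 subspaces, 16 codewords each (real systems train with k-means).
m, k = 2, 16
sub = d // m
codebooks = rng.standard_normal((m, k, sub))

# Encode each point by its nearest codeword per subspace.
codes = np.stack([
    np.argmin(((data[:, j*sub:(j+1)*sub, None] -
                codebooks[j].T[None]) ** 2).sum(1), axis=1)
    for j in range(m)
], axis=1)

recon = np.concatenate([codebooks[j][codes[:, j]] for j in range(m)], axis=1)
qerr = np.linalg.norm(data - recon, axis=1)   # quantization error ||x - c(x)||
dq = np.linalg.norm(recon - query, axis=1)    # ||q - c(x)||

# Triangle inequality: ||q - x|| >= ||q - c(x)|| - ||x - c(x)||
lower = np.maximum(dq - qerr, 0.0)

# Scan candidates in ascending lower-bound order, pruning any candidate
# whose bound already exceeds the best exact distance found so far.
best, best_dist, exact_evals = -1, np.inf, 0
for i in np.argsort(lower):
    if lower[i] >= best_dist:
        break  # every remaining candidate is pruned
    exact_evals += 1
    dist = np.linalg.norm(data[i] - query)
    if dist < best_dist:
        best, best_dist = i, dist

true_nn = int(np.argmin(np.linalg.norm(data - query, axis=1)))
```

Because a candidate is skipped only when its lower bound already exceeds the current best exact distance, the search remains exact.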
We propose a neural network that learns meta-features over datasets, which are used to select initial points for Bayesian hyperparameter optimization. Specifically, we retrieve the k nearest datasets to transfer prior knowledge about initial points, where similarity between datasets is computed from the learned meta-features. Experiments demonstrate that our learned meta-features are useful in optimizing several hyperparameters of deep residual networks for image classification.
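A minimal sketch of the warm-starting step, assuming the meta-features have already been learned (here they are random placeholders, and `warm_start` is a hypothetical helper, not the paper's interface):

```python
import numpy as np

rng = np.random.default_rng(2)

# Placeholder meta-features for 6 previously seen datasets (in the paper
# these come from a trained neural network).
meta = rng.standard_normal((6, 4))
# Best hyperparameter vector previously found on each past dataset.
best_hp = rng.uniform(0, 1, (6, 2))

def warm_start(new_meta, k=3):
    """Initial BO points: the best hyperparameters of the k nearest
    past datasets in learned meta-feature space."""
    dists = np.linalg.norm(meta - new_meta, axis=1)
    nearest = np.argsort(dists)[:k]
    return best_hp[nearest]

# A new dataset whose meta-features nearly match past dataset 0.
new_dataset_meta = meta[0] + 0.01 * rng.standard_normal(4)
inits = warm_start(new_dataset_meta, k=3)
```

These k points would then seed the Bayesian optimizer in place of random initialization.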
Imposing sparse + group-sparse superposition structures in high-dimensional parameter estimation is known to provide flexible regularization that is more realistic for many real-world problems. For example, such a superposition enables partially-shared support sets in multi-task learning, thereby striking the right balance between parameter overlap across tasks and task specificity.
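The two structural components can be illustrated by their proximal (shrinkage) operators: elementwise soft-thresholding for the sparse part and row-wise shrinkage for the group-sparse part. The sketch below applies them to a toy multi-task parameter matrix and is not the paper's estimator, which fits both parts jointly:

```python
import numpy as np

def soft_threshold(x, lam):
    # prox of the elementwise L1 norm
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def group_soft_threshold(x, lam):
    # prox of the row-wise L2 (group lasso) norm: shrinks whole rows
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    scale = np.maximum(1.0 - lam / np.maximum(norms, 1e-12), 0.0)
    return x * scale

# Toy multi-task parameter matrix (rows = features, columns = tasks).
W = np.array([[ 3.0,  3.1,  2.9],    # feature shared by all tasks
              [ 0.0,  2.5,  0.0],    # feature used by task 2 only
              [ 0.1, -0.1,  0.05]])  # noise-level feature

S = soft_threshold(W, 1.0)        # sparse component: entrywise support
B = group_soft_threshold(W, 1.0)  # group-sparse component: shared support
```

The group operator kills the noise row entirely, while the elementwise operator keeps the task-specific entry in row 1, mirroring the partially-shared support described above.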
Undirected graphical models, or Markov random fields (MRFs), are widely used for modeling multivariate probability distributions. Much of the work on MRFs has focused on continuous variables and nominal variables (that is, unordered categorical variables). However, data from many real-world applications involve ordered categorical variables, also known as ordinal variables; e.g., movie ratings on Netflix, which can be ordered from 1 to 5 stars.
Juyong Kim, Yookoon Park, Gunhee Kim and Sung Ju Hwang
We propose a novel deep neural network that is both lightweight and effectively structured for model parallelization. Our network, which we name SplitNet, automatically learns to split the network weights into either a set or a hierarchy of multiple groups that use disjoint sets of features, by learning the class-to-group and feature-to-group assignment matrices along with the network weights.
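Once the assignments become hard and disjoint, a layer turns block-diagonal, so each group can be evaluated independently on a separate device. A minimal sketch of that end state, with fixed hard assignments standing in for the learned soft matrices:

```python
import numpy as np

rng = np.random.default_rng(3)
features, classes, groups = 8, 6, 2

# Hard assignments (hypothetical; SplitNet learns soft assignment
# matrices that are driven toward disjoint groups during training).
feat_group = np.repeat(np.arange(groups), features // groups)
class_group = np.repeat(np.arange(groups), classes // groups)

# Mask: a class may only use features assigned to its own group.
mask = (feat_group[:, None] == class_group[None, :]).astype(float)
W = rng.standard_normal((features, classes)) * mask

# With disjoint groups, the masked layer splits into independent blocks.
x = rng.standard_normal(features)
full = x @ W
per_group = np.concatenate([
    x[feat_group == g] @ W[np.ix_(feat_group == g, class_group == g)]
    for g in range(groups)
])
```

The full matrix product and the per-group block products agree, which is what makes the split network trivially model-parallel.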
The number of parameters in a deep neural network is usually very large, which helps with its learning capacity but also hinders its scalability and practicality due to memory/time inefficiency and overfitting. To resolve this issue, we propose a sparsity regularization method that exploits both positive and negative correlations among the features to encourage the network to be sparse, and at the same time removes redundancies among the features to fully utilize the capacity of the network.
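One plausible form for such a regularizer combines a group term, which removes whole redundant features, with an exclusive term, which makes features compete within each row. The sketch below uses hypothetical names and a mixing weight `mu` of our own; it is not necessarily the paper's exact penalty:

```python
import numpy as np

def group_sparsity(W):
    # (2,1)-norm: sum of row norms; drives redundant rows to zero
    return np.linalg.norm(W, axis=1).sum()

def exclusive_sparsity(W):
    # squared (1,2)-norm: promotes competition (sparsity) within each row
    return 0.5 * (np.abs(W).sum(axis=1) ** 2).sum()

def combined_penalty(W, mu=0.5):
    # mu trades off between removing redundancy and within-row sparsity
    return mu * group_sparsity(W) + (1 - mu) * exclusive_sparsity(W)

W = np.array([[1.0, -1.0],
              [0.0,  2.0]])
```

In training, this penalty would be added to the task loss and minimized jointly with the network weights.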