Accepted papers at TMLR @TmlrPub Twitter profile

Accepted papers at TMLR

@TmlrPub

2 years

How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers Andreas Peter Steiner, Alexander Kolesnikov, Xiaohua Zhai, Ross Wightman, Jakob Uszkoreit, Lucas Beyer

How to train your ViT? Data, Augmentation, and Regularization in...

Vision Transformers (ViT) have been shown to attain highly competitive performance for a wide range of vision applications, such as image classification, object detection and semantic image...

openreview.net

2

43

305

Accepted papers at TMLR

@TmlrPub

2 years

ZerO Initialization: Initializing Neural Networks with only Zeros and Ones Jiawei Zhao, Florian Tobias Schaefer, Anima Anandkumar

ZerO Initialization: Initializing Neural Networks with only Zeros...

Deep neural networks are usually initialized with random weights, with adequately selected initial variance to ensure stable signal propagation during training. However, selecting the appropriate...

openreview.net

1

16

136

Accepted papers at TMLR

@TmlrPub

4 months

DINOv2: Learning Robust Visual Features without Supervision Maxime Oquab, Timothée Darcet, Théo Moutakanni et al. Action editor: Abhishek Kumar. #supervised #visual #features

DINOv2: Learning Robust Visual Features without Supervision

The recent breakthroughs in natural language processing for model pretraining on large quantities of data have opened the way for similar foundation models in computer vision. These models could...

openreview.net

1

18

118

Accepted papers at TMLR

@TmlrPub

2 years

A Simple Convergence Proof of Adam and Adagrad Alexandre Défossez, Leon Bottou, Francis Bach, Nicolas Usunier

A Simple Convergence Proof of Adam and Adagrad

We provide a simple proof of convergence covering both the Adam and Adagrad adaptive optimization algorithms when applied to smooth (possibly non-convex) objective functions with bounded gradients....

openreview.net

2

21

119

Accepted papers at TMLR

@TmlrPub

2 years

Greedy Bayesian Posterior Approximation with Deep Ensembles Aleksei Tiulpin, Matthew B. Blaschko

Greedy Bayesian Posterior Approximation with Deep Ensembles

Ensembles of independently trained neural networks are a state-of-the-art approach to estimate predictive uncertainty in Deep Learning, and can be interpreted as an approximation of the posterior...

openreview.net

0

13

88

Accepted papers at TMLR

@TmlrPub

2 years

Emergent Abilities of Large Language Models Jason Wei, Yi Tay, Rishi Bommasani et al.

Emergent Abilities of Large Language Models

Scaling up language models has been shown to predictably improve performance and sample efficiency on a wide range of downstream tasks. This paper instead discusses an unpredictable phenomenon that...

openreview.net

0

27

83

Accepted papers at TMLR

@TmlrPub

5 months

Modular Deep Learning Jonas Pfeiffer, Sebastian Ruder, Ivan Vulić, Edoardo Ponti. Action editor: Karthik Narasimhan. #modular #modularity #hierarchical

Modular Deep Learning

Transfer learning has recently become the dominant paradigm of machine learning. Pre-trained models fine-tuned for downstream tasks achieve better performance with fewer labelled examples....

openreview.net

0

11

80

Accepted papers at TMLR

@TmlrPub

1 year

A geometrical connection between sparse and low-rank matrices and its application to manifold lea... Lawrence K. Saul #sparse #manifold #dimensional

A geometrical connection between sparse and low-rank matrices and...

We consider when a sparse nonnegative matrix $\mathbf{S}$ can be recovered, via an elementwise nonlinearity, from a real-valued matrix $\mathbf{L}$ of significantly lower rank. Of particular...

openreview.net

1

13

79

Accepted papers at TMLR

@TmlrPub

2 years

Representation Alignment in Neural Networks Ehsan Imani, Wei Hu, Martha White

Representation Alignment in Neural Networks

It is now a standard for neural network representations to be trained on large, publicly available datasets, and used for new problems. The reasons for why neural network representations have been...

openreview.net

0

19

70

Accepted papers at TMLR

@TmlrPub

9 months

Understanding convolution on graphs via energies Francesco Di Giovanni, James Rowbottom, Benjamin Paul Chamberlain et al. Action editor: Guillaume Rabusseau. #convolutions #graphs #convolutional

Understanding convolution on graphs via energies

Graph Neural Networks (GNNs) typically operate by message-passing, where the state of a node is updated based on the information received from its neighbours. Most message-passing models act as...

openreview.net

1

10

67

Accepted papers at TMLR

@TmlrPub

2 years

The Evolution of Out-of-Distribution Robustness Throughout Fine-Tuning Anders Johan Andreassen, Yasaman Bahri, Behnam Neyshabur, Rebecca Roelofs

The Evolution of Out-of-Distribution Robustness Throughout Fine-Tuning

Although machine learning models typically experience a drop in performance on out-of-distribution data, accuracies on in- versus out-of-distribution data are widely observed to follow a single...

openreview.net

3

12

64

Accepted papers at TMLR

@TmlrPub

2 years

Structured Uncertainty in the Observation Space of Variational Autoencoders James Langley, Miguel Monteiro, Charles Jones, Nick Pawlowski, Ben Glocker

Structured Uncertainty in the Observation Space of Variational...

Variational autoencoders (VAEs) are a popular class of deep generative models with many variants and a wide range of applications. Improvements upon the standard VAE mostly focus on the modelling...

openreview.net

0

9

50

Accepted papers at TMLR

@TmlrPub

10 months

Self-Supervision is All You Need for Solving Rubik’s Cube Kyo Takano. Action editor: Marc Lanctot. #rubik #cube #deepcubea

Self-Supervision is All You Need for Solving Rubik’s Cube

Existing combinatorial search methods are often complex and require some level of expertise. This work introduces a simple and efficient deep learning method for solving combinatorial problems with...

openreview.net

0

11

47

Accepted papers at TMLR

@TmlrPub

2 years

TLDR: Twin Learning for Dimensionality Reduction Yannis Kalantidis, Carlos Eduardo Rosar Kos Lassance, Jon Almazán, Diane Larlus

TLDR: Twin Learning for Dimensionality Reduction

Dimensionality reduction methods are unsupervised approaches which learn low-dimensional spaces where some properties of the initial space, typically the notion of “neighborhood”, are preserved....

openreview.net

0

11

47

Accepted papers at TMLR

@TmlrPub

2 years

Equivariant Mesh Attention Networks Sourya Basu, Jose Gallego-Posada, Francesco Viganò, James Rowbottom, Taco Cohen

Equivariant Mesh Attention Networks

Equivariance to symmetries has proven to be a powerful inductive bias in deep learning research. Recent works on mesh processing have concentrated on various kinds of natural symmetries, including...

openreview.net

1

10

46

Accepted papers at TMLR

@TmlrPub

2 years

Finding and Fixing Spurious Patterns with Explanations Gregory Plumb, Marco Tulio Ribeiro, Ameet Talwalkar

Finding and Fixing Spurious Patterns with Explanations

Image classifiers often use spurious patterns, such as “relying on the presence of a person to detect a tennis racket,” which do not generalize. In this work, we present an end-to-end pipeline for...

openreview.net

3

11

43

Accepted papers at TMLR

@TmlrPub

2 years

A Note on "Assessing Generalization of SGD via Disagreement" Andreas Kirsch, Yarin Gal

A Note on "Assessing Generalization of SGD via Disagreement"

Several recent works find empirically that the average test error of deep neural networks can be estimated via the prediction disagreement of models, which does not require labels. In particular...

openreview.net

1

5

42

Accepted papers at TMLR

@TmlrPub

11 months

A Kernel Perspective on Behavioural Metrics for Markov Decision Processes Pablo Samuel Castro, Tyler Kastner, Prakash Panangaden, Mark Rowland. Action editor: Gergely Neu. #kernels #reinforcement #markov

A Kernel Perspective on Behavioural Metrics for Markov Decision...

We present a novel perspective on behavioural metrics for Markov decision processes via the use of positive definite kernels. We define a new metric under this lens that is provably equivalent to...

openreview.net

1

8

41

Accepted papers at TMLR

@TmlrPub

4 months

Wavelet Networks: Scale-Translation Equivariant Learning From Raw Time-Series David W. Romero, Erik J Bekkers, Jakub M. Tomczak, Mark Hoogendoorn. Action editor: Jeremias Sulam. #wavelet_networks #wavelet #transforms

Wavelet Networks: Scale-Translation Equivariant Learning From Raw...

Leveraging the symmetries inherent to specific data domains for the construction of equivariant neural networks has led to remarkable improvements in terms of data efficiency and generalization....

openreview.net

0

10

41

Accepted papers at TMLR

@TmlrPub

1 year

Better Theory for SGD in the Nonconvex World Ahmed Khaled, Peter Richtárik. Action editor: Raman Arora. #optimal #convexity #sgd

Better Theory for SGD in the Nonconvex World

Large-scale nonconvex optimization problems are ubiquitous in modern machine learning, and among practitioners interested in solving them, Stochastic Gradient Descent (SGD) reigns supreme. We...

openreview.net

0

5

40

Accepted papers at TMLR

@TmlrPub

8 months

Causal Parrots: Large Language Models May Talk Causality But Are Not Causal Matej Zečević, Moritz Willig, Devendra Singh Dhami, Kristian Kersting. Action editor: Frederic Sala. #causal #ai #inference

Causal Parrots: Large Language Models May Talk Causality But Are...

Some argue that scale is all that is needed to achieve AI, covering even causal models. We make it clear that large language models (LLMs) cannot be causal and give reasons why sometimes we might...

openreview.net

0

10

37

Accepted papers at TMLR

@TmlrPub

11 months

When Does Uncertainty Matter?: Understanding the Impact of Predictive Uncertainty in ML Assisted ... Sean McGrath, Parth Mehta, Alexandra Zytek, Isaac Lage, Himabindu Lakkaraju. Action editor: Laurent Charlin. #predictive #predicting

When Does Uncertainty Matter?: Understanding the Impact of...

As machine learning (ML) models are increasingly being employed to assist human decision makers, it becomes critical to provide these decision makers with relevant inputs which can help them decide...

openreview.net

0

11

39

Accepted papers at TMLR

@TmlrPub

1 year

Layerwise Bregman Representation Learning of Neural Networks with Applications to Knowledge Disti... Ehsan Amid, Rohan Anil, Christopher Fifty, Manfred K Warmuth. Action editor: Stephan Mandt. #normalization #layerwise #trained

Layerwise Bregman Representation Learning of Neural Networks with...

We propose a new method for layerwise representation learning of a trained neural network that conforms to the non-linearity of the layer's transfer function. In particular, we form a Bregman...

openreview.net

0

9

38

Accepted papers at TMLR

@TmlrPub

1 year

Soft Diffusion: Score Matching with General Corruptions Giannis Daras, Mauricio Delbracio, Hossein Talebi, Alex Dimakis, Peyman Milanfar. Action editor: Jonathan Scarlett. #denoising #diffusion #diffusions

Soft Diffusion: Score Matching with General Corruptions

We define a broader family of corruption processes that generalizes previously known diffusion models. To reverse these general diffusions, we propose a new objective called Soft Score Matching....

openreview.net

1

7

38

Accepted papers at TMLR

@TmlrPub

1 year

Guillotine Regularization: Why removing layers is needed to improve generalization in Self-Superv... Florian Bordes, Randall Balestriero, Quentin Garrido, Adrien Bardes, Pascal Vincent. Action editor: Jinwoo Shin. #imagenet #regulari

Guillotine Regularization: Why removing layers is needed to improve...

One unexpected technique that emerged in recent years consists in training a Deep Network (DN) with a Self-Supervised Learning (SSL) method, and using this network on downstream tasks but with its...

openreview.net

0

11

37

Accepted papers at TMLR

@TmlrPub

3 months

Towards fully covariant machine learning Soledad Villar, David W Hogg, Weichi Yao, George A Kevrekidis, Bernhard Schölkopf. Action editor: Jean Barbier. #symmetries #symmetry #invariances

Towards fully covariant machine learning

Any representation of data involves arbitrary investigator choices. Because those choices are external to the data-generating process, each choice leads to an exact symmetry, corresponding to the...

openreview.net

0

7

37

Accepted papers at TMLR

@TmlrPub

9 months

Diffusion Models for Constrained Domains Nic Fishman, Leo Klarner, Valentin De Bortoli, Emile Mathieu, Michael John Hutchinson. Action editor: Rianne van den Berg. #diffusion #denoising #riemannian

Diffusion Models for Constrained Domains

Denoising diffusion models are a novel class of generative algorithms that achieve state-of-the-art performance across a range of domains, including image generation and text-to-image tasks....

openreview.net

0

6

37

Accepted papers at TMLR

@TmlrPub

2 months

Why should autoencoders work? Matthew Kvalheim, Eduardo Sontag. Action editor: Jeffrey Pennington. #autoencoders #decoding #homeomorphic

Why should autoencoders work?

Deep neural network autoencoders are routinely used computationally for model reduction. They allow recognizing the intrinsic dimension of data that lie in a $k$-dimensional subset $K$ of an input...

openreview.net

0

7

36

Accepted papers at TMLR

@TmlrPub

1 year

VN-Transformer: Rotation-Equivariant Attention for Vector Neurons Serge Assaad, Carlton Downey, Rami Al-Rfou', Nigamaa Nayakanti, Benjamin Sapp #rotation #equivariance #neurons

VN-Transformer: Rotation-Equivariant Attention for Vector Neurons

Rotation equivariance is a desirable property in many practical applications such as motion forecasting and 3D perception, where it can offer benefits like sample efficiency, better generalization,...

openreview.net

0

7

35

Accepted papers at TMLR

@TmlrPub

7 months

Identifying latent distances with Finslerian geometry Alison Pouplin, David Eklund, Carl Henrik Ek, Søren Hauberg. Action editor: Bamdev Mishra. #geodesics #metrics #riemannian

Identifying latent distances with Finslerian geometry

Riemannian geometry provides us with powerful tools to explore the latent space of generative models while preserving the underlying structure of the data. The latent space can be equipped with...

openreview.net

0

5

35

Accepted papers at TMLR

@TmlrPub

1 year

A Variational Perspective on Generative Flow Networks Heiko Zimmermann, Fredrik Lindsten, Jan-Willem van de Meent, Christian A Naesseth. Action editor: Jakub Tomczak. #generative #flow #variational

A Variational Perspective on Generative Flow Networks

Generative flow networks (GFNs) are a class of probabilistic models for sequential sampling of composite objects, proportional to a target distribution that is defined in terms of an energy...

openreview.net

0

7

34

Accepted papers at TMLR

@TmlrPub

2 years

Diagnosing and Fixing Manifold Overfitting in Deep Generative Models Gabriel Loaiza-Ganem, Brendan Leigh Ross, Jesse C Cresswell, Anthony L. Caterini

Diagnosing and Fixing Manifold Overfitting in Deep Generative Models

Likelihood-based, or explicit, deep generative models use neural networks to construct flexible high-dimensional densities. This formulation directly contradicts the manifold hypothesis, which...

openreview.net

0

5

34

Accepted papers at TMLR

@TmlrPub

1 month

Attending to Graph Transformers Luis Müller, Mikhail Galkin, Christopher Morris, Ladislav Rampášek. Action editor: Fredrik Johansson. #graphs #graph #molecular

Attending to Graph Transformers

Recently, transformer architectures for graphs emerged as an alternative to established techniques for machine learning with graphs, such as (message-passing) graph neural networks. So far, they...

openreview.net

0

4

31

Accepted papers at TMLR

@TmlrPub

1 year

Neural Collapse: A Review on Modelling Principles and Generalization Vignesh Kothapalli. Action editor: Jeffrey Pennington. #classifier #generalization #deep

Neural Collapse: A Review on Modelling Principles and Generalization

Deep classifier neural networks enter the terminal phase of training (TPT) when training error reaches zero and tend to exhibit intriguing Neural Collapse (NC) properties. Neural collapse...

openreview.net

0

11

30

Accepted papers at TMLR

@TmlrPub

11 months

The Vendi Score: A Diversity Evaluation Metric for Machine Learning Dan Friedman, Adji Bousso Dieng. Action editor: Antonio Vergari. #diversity #diverse #similarity

The Vendi Score: A Diversity Evaluation Metric for Machine Learning

Diversity is an important criterion for many areas of machine learning (ML), including generative modeling and dataset curation. However, existing metrics for measuring diversity are often...

openreview.net

0

12

31

Accepted papers at TMLR

@TmlrPub

1 year

Bayesian Optimization with Informative Covariance Afonso Eduardo, Michael U. Gutmann. Action editor: Pierre Alquier. #optimization #bayesian #prior

Bayesian Optimization with Informative Covariance

Bayesian optimization is a methodology for global optimization of unknown and expensive objectives. It combines a surrogate Bayesian regression model with an acquisition function to decide where to...

openreview.net

0

5

31

Accepted papers at TMLR

@TmlrPub

2 years

Decoder Denoising Pretraining for Semantic Segmentation Emmanuel Asiedu Brempong, Simon Kornblith, Ting Chen et al.

Decoder Denoising Pretraining for Semantic Segmentation

Semantic segmentation labels are expensive and time consuming to acquire. Hence, pretraining is commonly used to improve the label-efficiency of segmentation models. Typically, the encoder of a...

openreview.net

1

5

31

Accepted papers at TMLR

@TmlrPub

9 months

Holistic Evaluation of Language Models Percy Liang, Rishi Bommasani, Tony Lee et al. Action editor: Karthik Narasimhan. #language #dialects #trustworthiness

Holistic Evaluation of Language Models

Language models (LMs) are becoming the foundation for almost all major language technologies, but their capabilities, limitations, and risks are not well understood. We present Holistic Evaluation...

openreview.net

1

7

31

Accepted papers at TMLR

@TmlrPub

11 months

Predicting Out-of-Domain Generalization with Neighborhood Invariance Nathan Hoyen Ng, Neha Hulkund, Kyunghyun Cho, Marzyeh Ghassemi. Action editor: Vincent Dumoulin. #classifier #classification #generalization

Predicting Out-of-Domain Generalization with Neighborhood Invariance

Developing and deploying machine learning models safely depends on the ability to characterize and compare their abilities to generalize to new environments. Although recent work has proposed a...

openreview.net

0

8

30

Accepted papers at TMLR

@TmlrPub

5 months

IndicTrans2: Towards High-Quality and Accessible Machine Translation Models for all 22 Scheduled ... Jay Gala, Pranjal A Chitale, A K Raghavan et al. Action editor: W Ronny Huang. #corpus #multilingual #corpora

IndicTrans2: Towards High-Quality and Accessible Machine...

India has a rich linguistic landscape, with languages from 4 major language families spoken by over a billion people. 22 of these languages listed in the Constitution of India (referred to as...

openreview.net

0

8

30

Accepted papers at TMLR

@TmlrPub

2 years

Auto-Lambda: Disentangling Dynamic Task Relationships Shikun Liu, Stephen James, Andrew Davison, Edward Johns

Auto-Lambda: Disentangling Dynamic Task Relationships

Understanding the structure of multiple related tasks allows for multi-task learning to improve the generalisation ability of one or all of them. However, it usually requires training each pairwise...

openreview.net

0

4

30

Accepted papers at TMLR

@TmlrPub

4 months

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback Stephen Casper, Xander Davies, Claudia Shi et al. Action editor: Marcello Restelli. #ai #reinforcement #language

Open Problems and Fundamental Limitations of Reinforcement Learning...

Reinforcement learning from human feedback (RLHF) is a technique for training AI systems to align with human goals. RLHF has emerged as the central method used to finetune state-of-the-art large...

openreview.net

0

8

28

Accepted papers at TMLR

@TmlrPub

7 months

Not All Causal Inference is the Same Matej Zečević, Devendra Singh Dhami, Kristian Kersting. Action editor: Mingming Gong. #causality #causal #inferences

Not All Causal Inference is the Same

Neurally-parameterized Structural Causal Models in the Pearlian notion to causality, referred to as NCM, were recently introduced as a step towards next-generation learning systems. However, said...

openreview.net

1

6

28

Accepted papers at TMLR

@TmlrPub

2 years

Clustering units in neural networks: upstream vs downstream information Richard D Lange, David Rolnick, Konrad Kording

Clustering units in neural networks: upstream vs downstream...

It has been hypothesized that some form of "modular" structure in artificial neural networks should be useful for learning, compositionality, and generalization. However, defining and quantifying...

openreview.net

1

2

29

Accepted papers at TMLR

@TmlrPub

1 year

Temperature check: theory and practice for training models with softmax-cross-entropy losses Atish Agarwala, Samuel Stern Schoenholz, Jeffrey Pennington, Yann Dauphin. Action editor: Ruoyu Sun. #softmax #entropy #models

Temperature check: theory and practice for training models with...

The softmax function combined with a cross-entropy loss is a principled approach to modeling probability distributions that has become ubiquitous in deep learning. The softmax function is defined...

openreview.net

0

4

29

Accepted papers at TMLR

@TmlrPub

9 months

Graph Neural Networks for Temporal Graphs: State of the Art, Open Challenges, and Opportunities Antonio Longa, Veronica Lachi, Gabriele Santin et al. Action editor: Shinichi Nakajima. #temporal #graphs #graph

Graph Neural Networks for Temporal Graphs: State of the Art, Open...

Graph Neural Networks (GNNs) have become the leading paradigm for learning on (static) graph-structured data. However, many real-world systems are dynamic in nature, since the graph and node/edge...

openreview.net

0

4

29

Accepted papers at TMLR

@TmlrPub

4 months

Variational Classification: A Probabilistic Generalization of the Softmax Classifier Shehzaad Zuzar Dhuliawala, Mrinmaya Sachan, Carl Allen. Action editor: Frederic Sala. #softmax #classifiers #classification

Variational Classification: A Probabilistic Generalization of the...

We present a latent variable model for classification that provides a novel probabilistic interpretation of neural network softmax classifiers. We derive a variational objective to train the model,...

openreview.net

0

4

28

Accepted papers at TMLR

@TmlrPub

2 years

Sparse Coding with Multi-layer Decoders using Variance Regularization Katrina Evtimova, Yann LeCun

Sparse Coding with Multi-layer Decoders using Variance Regularization

Sparse representations of images are useful in many computer vision applications. Sparse coding with an $l_1$ penalty and a learned linear dictionary requires regularization of the dictionary to...

openreview.net

0

6

28

Accepted papers at TMLR

@TmlrPub

11 months

The Robustness Limits of SoTA Vision Models to Natural Variation Mark Ibrahim, Quentin Garrido, Ari S. Morcos, Diane Bouchacourt. Action editor: Dumitru Erhan. #autoencoders #vision #supervised

The Robustness Limits of SoTA Vision Models to Natural Variation

Recent state-of-the-art vision models have introduced new architectures, learning paradigms, and larger pretraining data, leading to impressive performance on tasks such as classification. While...

openreview.net

0

6

28

Accepted papers at TMLR

@TmlrPub

1 year

Patches Are All You Need? Asher Trockman, J Zico Kolter. Action editor: David Ha. #attention #convolutional #vision

Patches Are All You Need?

Although convolutional neural networks have been the dominant architecture for computer vision for many years, Vision Transformers (ViTs) have recently shown promise as an alternative. Subsequently...

openreview.net

0

5

28

Accepted papers at TMLR

@TmlrPub

4 months

Pathologies of Predictive Diversity in Deep Ensembles Taiga Abe, E. Kelly Buchanan, Geoff Pleiss, John Patrick Cunningham. Action editor: Neil Houlsby. #diversity #ensembles #diverse

Pathologies of Predictive Diversity in Deep Ensembles

Classic results establish that encouraging predictive diversity improves performance in ensembles of low-capacity models, e.g. through bagging or boosting. Here we demonstrate that these intuitions...

openreview.net

1

3

28

Accepted papers at TMLR

@TmlrPub

1 year

Controllable Generative Modeling via Causal Reasoning Joey Bose, Ricardo Pio Monti, Aditya Grover #NewPaper #PaperPost

Controllable Generative Modeling via Causal Reasoning

Deep latent variable generative models excel at generating complex, high-dimensional data, often exhibiting impressive generalization beyond the training distribution. However, many such models in...

openreview.net

0

4

27

Accepted papers at TMLR

@TmlrPub

2 years

High Fidelity Visualization of What Your Self-Supervised Representation Knows About Florian Bordes, Randall Balestriero, Pascal Vincent

High Fidelity Visualization of What Your Self-Supervised...

Discovering what is learned by neural networks remains a challenge. In self-supervised learning, classification is the most common task used to evaluate how good a representation is. However...

openreview.net

0

4

27

Accepted papers at TMLR

@TmlrPub

2 years

Weight Expansion: A New Perspective on Dropout and Generalization Gaojie Jin, Xinping Yi, Pengfei Yang, Lijun Zhang, Sven Schewe, Xiaowei Huang

Weight Expansion: A New Perspective on Dropout and Generalization

While dropout is known to be a successful regularization technique, insights into the mechanisms that lead to this success are still lacking. We introduce the concept of weight expansion, an...

openreview.net

0

7

27

Accepted papers at TMLR

@TmlrPub

2 years

Approximating 1-Wasserstein Distance with Trees Makoto Yamada, Yuki Takezawa, Ryoma Sato, Han Bao, Zornitsa Kozareva, Sujith Ravi

Approximating 1-Wasserstein Distance with Trees

The Wasserstein distance, which measures the discrepancy between distributions, shows efficacy in various types of natural language processing and computer vision applications. One of the...

openreview.net

0

3

26

Accepted papers at TMLR

@TmlrPub

2 years

On the link between conscious function and general intelligence in humans and machines Arthur Juliani, Kai Arulkumaran, Shuntaro Sasai, Ryota Kanai

On the link between conscious function and general intelligence in...

In popular media, there is often a connection drawn between the advent of awareness in artificial agents and those same agents simultaneously achieving human or superhuman level intelligence. In...

openreview.net

0

11

26

Accepted papers at TMLR

@TmlrPub

11 months

Learning Graph Structure from Convolutional Mixtures Max Wasserman, Saurabh Sihag, Gonzalo Mateos, Alejandro Ribeiro. Action editor: Makoto Yamada. #graphs #graph #deconvolution

Learning Graph Structure from Convolutional Mixtures

Machine learning frameworks such as graph neural networks typically rely on a given, fixed graph to exploit relational inductive biases and thus effectively learn from network data. However, when...

openreview.net

0

6

26

Accepted papers at TMLR

@TmlrPub

1 year

A Unified Survey on Anomaly, Novelty, Open-Set, and Out-of-Distribution Detection: Solutions and ... Mohammadreza Salehi, Hossein Mirzaei, Dan Hendrycks et al. #NewPaper #PaperPost

A Unified Survey on Anomaly, Novelty, Open-Set, and Out...

Machine learning models often encounter samples that are diverged from the training distribution. Failure to recognize an out-of-distribution (OOD) sample, and consequently assign that sample to an...

openreview.net

0

7

25

Accepted papers at TMLR

@TmlrPub

3 months

Are you using test log-likelihood correctly? Sameer Deshpande, Soumya Ghosh, Tin D. Nguyen, Tamara Broderick. Action editor: Michael Gutmann. #likelihoods #likelihood #comparisons

Are you using test log-likelihood correctly?

Test log-likelihood is commonly used to compare different models of the same data or different approximate inference algorithms for fitting the same probabilistic model. We present simple examples...

openreview.net

0

3

24

Accepted papers at TMLR

@TmlrPub

8 months

A probabilistic Taylor expansion with Gaussian processes Toni Karvonen, Jon Cockayne, Filip Tronarp, Simo Särkkä. Action editor: Roman Garnett. #gaussian #taylor #posterior

A probabilistic Taylor expansion with Gaussian processes

We study a class of Gaussian processes for which the posterior mean, for a particular choice of data, replicates a truncated Taylor expansion of any order. The data consist of derivative...

openreview.net

0

2

25

Accepted papers at TMLR

@TmlrPub

2 years

On Uncertainty in Deep State Space Models for Model-Based Reinforcement Learning Philipp Becker, Gerhard Neumann

On Uncertainty in Deep State Space Models for Model-Based...

Improved state space models, such as Recurrent State Space Models (RSSMs), are a key factor behind recent advances in model-based reinforcement learning (RL). Yet, despite their empirical success,...

openreview.net

0

3

24

Accepted papers at TMLR

@TmlrPub

1 year

DiffuseVAE: Efficient, Controllable and High-Fidelity Generation from Low-Dimensional Latents Kushagra Pandey, Avideep Mukherjee, Piyush Rai, Abhishek Kumar #NewPaper #PaperPost

DiffuseVAE: Efficient, Controllable and High-Fidelity Generation...

Diffusion probabilistic models have been shown to generate state-of-the-art results on several competitive image synthesis benchmarks but lack a low-dimensional, interpretable latent space, and are...

openreview.net

0

3

24

Accepted papers at TMLR

@TmlrPub

2 years

On the Adversarial Robustness of Vision Transformers Rulin Shao, Zhouxing Shi, Jinfeng Yi, Pin-Yu Chen, Cho-Jui Hsieh

On the Adversarial Robustness of Vision Transformers

Following the success in advancing natural language processing and understanding, transformers are expected to bring revolutionary changes to computer vision. This work provides a comprehensive...

openreview.net

0

4

24

Accepted papers at TMLR

@TmlrPub

5 months

UnIVAL: Unified Model for Image, Video, Audio and Language Tasks Mustafa Shukor, Corentin Dancette, Alexandre Rame, Matthieu Cord. Action editor: Stefan Lee. #multimodal #models #unified

UnIVAL: Unified Model for Image, Video, Audio and Language Tasks

Large Language Models (LLMs) have made the ambitious quest for generalist agents significantly far from being a fantasy. A key hurdle for building such general models is the diversity and...

openreview.net

0

8

24

Accepted papers at TMLR

@TmlrPub

1 year

Comparative Generalization Bounds for Deep Neural Networks Tomer Galanti, Liane Galanti, Ido Ben-Shaul. Action editor: Yunhe Wang. #deep #depth #depths

In this work, we investigate the generalization capabilities of deep neural networks. We introduce a novel measure of the effective depth of neural networks, defined as the first layer at which...

8 months

Semantic Representations of Mathematical Expressions in a Continuous Vector Space Neeraj Gangwar, Nickvash Kani. Action editor: Yonatan Bisk. #embeddings #expressions #representing

Mathematical notation makes up a large portion of STEM literature, yet finding semantic representations for formulae remains a challenging problem. Because mathematical notation is precise, and its...

7 months

Improved baselines for vision-language pre-training Enrico Fini, Pietro Astolfi, Adriana Romero-Soriano, Jakob Verbeek, Michal Drozdzal. Action editor: Hanwang Zhang. #multimodal #regularization #augmentation

Contrastive learning has emerged as an efficient framework to learn multimodal representations. CLIP, a seminal work in this area, achieved impressive results by training on paired image-text data ...

11 months

Empirical Study on Optimizer Selection for Out-of-Distribution Generalization Hiroki Naganuma, Kartik Ahuja, Shiro Takagi et al. Action editor: Robert Gower. #distributional #classification #hyperparameters

Modern deep learning systems do not generalize well when the test data distribution is slightly different to the training data distribution. While much promising work has been accomplished to...

1 year

A Measure of the Complexity of Neural Representations based on Partial Information Decomposition David Alexander Ehrlich, Andreas Christian Schneider, Viola Priesemann et al. Action editor: Jean Barbier. #complexity #neurons #repres

In neural networks, task-relevant information is represented jointly by groups of neurons. However, the specific way in which this mutual information about the classification label is distributed...

1 year

UncertaINR: Uncertainty Quantification of End-to-End Implicit Neural Representations for Computed... Francisca Vasconcelos, Bobby He, Nalini M Singh, Yee Whye Teh. Action editor: Matthew Blaschko. #accuracy #deep #cnn

Implicit neural representations (INRs) have achieved impressive results for scene reconstruction and computer graphics, where their performance has primarily been assessed on reconstruction...

1 year

Generalizability of Adversarial Robustness Under Distribution Shifts Kumail Alhamoud, Hasan Abed Al Kader Hammoud, Motasem Alfarra, Bernard Ghanem. Action editor: Gang Niu. #adversarial #robustness #robust

Recent progress in empirical and certified robustness promises to deliver reliable and deployable Deep Neural Networks (DNNs). Despite that success, most existing evaluations of DNN robustness have...

11 months

The Stack: 3 TB of permissively licensed source code Denis Kocetkov, Raymond Li, Loubna Ben allal et al. Action editor: Swarat Chaudhuri. #text2code #bigcode #dataset

Large Language Models (LLMs) play an ever-increasing role in the field of Artificial Intelligence (AI)--not only for natural language processing but also for code understanding and generation. To...

9 months

Meta-Calibration: Learning of Model Calibration Using Differentiable Expected Calibration Error Ondrej Bohdal, Yongxin Yang, Timothy Hospedales. Action editor: Yingzhen Li. #calibration #prediction #optimise

Calibration of neural networks is a topical problem that is becoming more and more important as neural networks increasingly underpin real-world applications. The problem is especially noticeable...

2 years

A Generalist Agent Scott Reed, Konrad Zolna, Emilio Parisotto et al.

Inspired by progress in large-scale language modeling, we apply a similar approach towards building a single generalist agent beyond the realm of text outputs. The agent, which we refer to as Gato,...

1 year

Integrating Bayesian Network Structure into Residual Flows and Variational Autoencoders Jacobie Mouton, Rodney Stephen Kroon. Action editor: George Papamakarios. #autoencoders #generative #flow

Deep generative models have become more popular in recent years due to their scalability and representation capacity. Unlike probabilistic graphical models, they typically do not incorporate...

1 year

Contrastive Search Is What You Need For Neural Text Generation Yixuan Su, Nigel Collier. Action editor: Matthew Blaschko. #decoding #nlp #contrastive

Generating text with autoregressive language models (LMs) is of great importance to many natural language processing (NLP) applications. Previous solutions for this task often produce text that...

10 months

Catastrophic overfitting can be induced with discriminative non-robust features Guillermo Ortiz-Jimenez, Pau de Jorge, Amartya Sanyal et al. Action editor: Jakub Tomczak. #adversarial #overfitting #robust

Adversarial training (AT) is the de facto method for building robust neural networks, but it can be computationally expensive. To mitigate this, fast single-step attacks can be used, but this may...

2 years

Unsupervised Dense Information Retrieval with Contrastive Learning Gautier Izacard, Mathilde Caron, Lucas Hosseini et al.

Recently, information retrieval has seen the emergence of dense retrievers, using neural networks, as an alternative to classical sparse methods based on term-frequency. These models have obtained...

11 months

Dr-Fairness: Dynamic Data Ratio Adjustment for Fair Training on Real and Generated Data Yuji Roh, Weili Nie, De-An Huang, Steven Euijong Whang, Arash Vahdat, Anima Anandkumar. Action editor: Yingzhen Li. #sampling #fairness #unfairne

Fair visual recognition has become critical for preventing demographic disparity. A major cause of model unfairness is the imbalanced representation of different groups in training data. Recently...

2 years

Sparse MoEs meet Efficient Ensembles James Urquhart Allingham, Florian Wenzel, Zelda E Mariet et al.

Machine learning models based on the aggregated outputs of submodels, either at the activation or prediction levels, often exhibit strong performance compared to individual models. We study the...

9 days

Exponential Moving Average of Weights in Deep Learning: Dynamics and Benefits Daniel Morales-Brotons, Thijs Vogels, Hadrien Hendrikx. Action editor: Atsushi Nitanda. #weights #sgd #regularization

Weight averaging of Stochastic Gradient Descent (SGD) iterates is a popular method for training deep learning models. While it is often used as part of complex training pipelines to improve...

8 days

Semantic Positive Pairs for Enhancing Visual Representation Learning of Instance Discrimination m... Mohammad Alkhalefi, Georgios Leontidis, Mingjun Zhong. Action editor: Zhiding Yu. #supervised #embedding #imagenet

Self-supervised learning algorithms (SSL) based on instance discrimination have shown promising results, performing competitively or even outperforming supervised learning counterparts in some...

2 years

Do better ImageNet classifiers assess perceptual similarity better? Manoj Kumar, Neil Houlsby, Nal Kalchbrenner, Ekin Dogus Cubuk

Perceptual distances between images, as measured in the space of pre-trained deep features, have outperformed prior low-level, pixel-based metrics on assessing image similarity. While the...

1 year

Dual PatchNorm Manoj Kumar, Mostafa Dehghani, Neil Houlsby. Action editor: Yunhe Wang. #layernorm #layernorms #patchnorm

We propose Dual PatchNorm: two Layer Normalization layers (LayerNorms), before and after the patch embedding layer in Vision Transformers. We demonstrate that Dual PatchNorm outperforms the result...

9 months

Reinforcement Learning with Delayed, Composite, and Partially Anonymous Reward Washim Uddin Mondal, Vaneet Aggarwal. Action editor: Jiantao Jiao. #reward #rewards #optimality

We investigate an infinite-horizon average reward Markov Decision Process (MDP) with delayed, composite, and partially anonymous reward feedback. The delay and compositeness of rewards mean that...

9 months

Nonconvex-nonconcave min-max optimization on Riemannian manifolds Andi Han, Bamdev Mishra, Pratik Jawanpuria, Junbin Gao. Action editor: Zhihui Zhu. #manifolds #minimax #nonconcave

This work studies nonconvex-nonconcave min-max problems on Riemannian manifolds. We first characterize the local optimality of nonconvex-nonconcave problems on manifolds with a generalized notion...

9 months

Simulate Time-integrated Coarse-grained Molecular Dynamics with Multi-scale Graph Networks Xiang Fu, Tian Xie, Nathan J. Rebello, Bradley Olsen, Tommi S. Jaakkola. Action editor: Jasper Snoek. #polymer #polymers #dynamics

Molecular dynamics (MD) simulation is essential for various scientific domains but computationally expensive. Learning-based force fields have made significant progress in accelerating ab-initio MD...

1 year

Diffusion Models for Video Prediction and Infilling Tobias Höppe, Arash Mehrjou, Stefan Bauer, Didrik Nielsen, Andrea Dittadi

Predicting and anticipating future outcomes or reasoning about missing information in a sequence are critical skills for agents to be able to make intelligent decisions. This requires strong...

6 months

Convergence of SGD for Training Neural Networks with Sliced Wasserstein Losses Eloi Tanguy. Action editor: Anastasios Kyrillidis. #wasserstein #generative #flow

Optimal Transport has sparked vivid interest in recent years, in particular thanks to the Wasserstein distance, which provides a geometrically sensible and intuitive way of comparing probability...

1 month

On the Convergence of Adaptive Gradient Methods for Nonconvex Optimization Dongruo Zhou, Jinghui Chen, Yuan Cao, Ziyan Yang, Quanquan Gu. Action editor: Peter Richtárik. #gradient #adaptive #nonconvex

Adaptive gradient methods are workhorses in deep learning. However, the convergence guarantees of adaptive gradient methods for nonconvex optimization have not been thoroughly studied. In this...

10 months

Augmented Language Models: a Survey Grégoire Mialon, Roberto Dessi, Maria Lomeli et al. Action editor: Yujia Li. #interpreter #alms #language

This survey reviews works in which language models (LMs) are augmented with reasoning skills and the ability to use tools. The former is defined as decomposing a potentially complex task into...

1 year

OpenCon: Open-world Contrastive Learning Yiyou Sun, Yixuan Li #deeplearning #supervised #imagenet

Machine learning models deployed in the wild naturally encounter unlabeled samples from both known and novel classes. Challenges arise in learning from both the labeled and unlabeled data, in an...

2 years

On the Convergence of Shallow Neural Network Training with Randomly Masked Neurons Fangshuo Liao, Anastasios Kyrillidis

With the motive of training all the parameters of a neural network, we study why and when one can achieve this by iteratively creating, training, and combining randomly selected subnetworks. Such...

2 years

Queried Unlabeled Data Improves and Robustifies Class-Incremental Learning Tianlong Chen, Sijia Liu, Shiyu Chang, Lisa Amini, Zhangyang Wang

Class-incremental learning (CIL) suffers from the notorious dilemma between learning newly added classes and preserving previously learned class knowledge. That catastrophic forgetting issue could...

9 months

RECLIP: Resource-efficient CLIP by Training with Small Images Runze Li, Dahun Kim, Bir Bhanu, Weicheng Kuo. Action editor: Xu Tan. #training #retrieval #pretraining

We present RECLIP (Resource-efficient CLIP), a simple method that minimizes computational resource footprint for CLIP (Contrastive Language Image Pretraining). Inspired by the notion of...

5 months

Improving Native CNN Robustness with Filter Frequency Regularization Jovita Lukasik, Paul Gavrikov, Janis Keuper, Margret Keuper. Action editor: Evan Shelhamer. #adversarial #robust #robustness

Neural networks tend to overfit the training distribution and perform poorly on out-of-distribution data. A conceptually simple solution lies in adversarial training, which introduces worst-case...

1 year

Complex-Valued Autoencoders for Object Discovery Sindy Löwe, Phillip Lippe, Maja Rudolph, Max Welling #NewPaper #PaperPost

Object-centric representations form the basis of human perception, and enable us to reason about the world and to systematically generalize to new settings. Currently, most works on unsupervised...

4 months

A Survey on the Possibilities & Impossibilities of AI-generated Text Detection Soumya Suvra Ghosal, Souradip Chakraborty, Jonas Geiping, Furong Huang, Dinesh Manocha, Amrit Bedi. Action editor: Greg Durrett. #nlp #text #ai

Large Language Models (LLMs) have revolutionized the domain of natural language processing (NLP) with remarkable capabilities of generating human-like text responses. However, despite these...

1 month

Improving and generalizing flow-based generative models with minibatch optimal transport Alexander Tong, Kilian FATRAS, Nikolay Malkin et al. Action editor: Alain Oliviero Durmus. #flow #flows #generative

Continuous normalizing flows (CNFs) are an attractive generative modeling technique, but they have been held back by limitations in their simulation-based maximum likelihood training. We introduce...

10 months

Data Distillation: A Survey Noveen Sachdeva, Julian McAuley. Action editor: Bo Han. #distillation #datasets #dataset

The popularity of deep learning has led to the curation of a vast number of massive and multifarious datasets. Despite having close-to-human performance on individual tasks, training...
