1 |
8.75 |
How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks |
9, 9, 9, 8 |
|
2 |
8.33 |
Dataset Condensation with Gradient Matching |
8, 9, 8 |
|
3 |
8.25 |
Towards Nonlinear Disentanglement in Natural Data with Temporal Sparse Coding |
7, 9, 8, 9 |
|
4 |
8.25 |
Learning Flexible Visual Representations via Interactive Gameplay |
9, 8, 8, 8 |
|
5 |
8 |
Deformable DETR: Deformable Transformers for End-to-End Object Detection |
9, 8, 8, 7 |
|
6 |
8 |
Parameterization of Hypercomplex Multiplications |
8, 8, 8 |
|
7 |
8 |
Theoretical Analysis of Self-Training with Deep Networks on Unlabeled Data |
9, 7, 9, 7 |
|
8 |
8 |
Score-Based Generative Modeling through Stochastic Differential Equations |
8, 9, 7, 8 |
|
9 |
8 |
Scalable Learning and MAP Inference for Nonsymmetric Determinantal Point Processes |
9, 7, 8 |
|
10 |
8 |
Deep symbolic regression: Recovering mathematical expressions from data via risk-seeking policy gradients |
8, 7, 8, 9 |
|
11 |
8 |
On the mapping between Hopfield networks and Restricted Boltzmann Machines |
10, 7, 7 |
|
12 |
8 |
Augmenting Physical Models with Deep Networks for Complex Dynamics Forecasting |
9, 7, 8 |
|
13 |
8 |
Complex Query Answering with Neural Link Predictors |
9, 6, 8, 9 |
|
14 |
8 |
Learning a Latent Simplex in Input Sparsity Time |
7, 9, 8 |
|
15 |
8 |
What Matters for On-Policy Deep Actor-Critic Methods? A Large-Scale Study |
7, 9, 9, 7 |
|
16 |
7.75 |
Expressive Power of Invariant and Equivariant Graph Neural Networks |
8, 8, 6, 9 |
|
17 |
7.75 |
Learning Mesh-Based Simulation with Graph Networks |
9, 6, 6, 10 |
|
18 |
7.75 |
Autoregressive Entity Retrieval |
7, 8, 8, 8 |
|
19 |
7.75 |
Rethinking Architecture Selection in Differentiable NAS |
7, 10, 7, 7 |
|
20 |
7.75 |
Learning Cross-Domain Correspondence for Control with Dynamics Cycle-Consistency |
6, 8, 7, 10 |
|
21 |
7.75 |
Share or Not? Learning to Schedule Language-Specific Capacity for Multilingual Translation |
7, 9, 7, 8 |
|
22 |
7.67 |
Invariant Representations for Reinforcement Learning without Reconstruction |
7, 7, 9 |
|
23 |
7.67 |
Distributional Sliced-Wasserstein and Applications to Generative Modeling |
9, 7, 7 |
|
24 |
7.67 |
Neural Synthesis of Binaural Audio |
7, 9, 7 |
|
25 |
7.67 |
Extreme Memorization via Scale of Initialization |
7, 7, 9 |
|
26 |
7.67 |
Predicting Infectiousness for Proactive Contact Tracing |
9, 7, 7 |
|
27 |
7.67 |
Do 2D GANs know 3D shape? Unsupervised 3D Shape Reconstruction from 2D Image GANs |
8, 7, 8 |
|
28 |
7.67 |
EigenGame: PCA as a Nash Equilibrium |
8, 8, 7 |
|
29 |
7.67 |
Geometry-aware Instance-reweighted Adversarial Training |
7, 8, 8 |
|
30 |
7.6 |
DiffWave: A Versatile Diffusion Model for Audio Synthesis |
7, 7, 9, 8, 7 |
|
31 |
7.5 |
Learning-based Support Estimation in Sublinear Time |
7, 8, 8, 7 |
|
32 |
7.5 |
Recurrent Independent Mechanisms |
9, 7, 7, 7 |
|
33 |
7.5 |
Conditional Generative Modeling via Learning the Latent Space |
7, 6, 10, 7 |
|
34 |
7.5 |
The Traveling Observer Model: Multi-task Learning Through Spatial Variable Embeddings |
6, 6, 9, 9 |
|
35 |
7.5 |
Learning to Reach Goals via Iterated Supervised Learning |
7, 8, 7, 8 |
|
36 |
7.5 |
Very Deep VAEs Generalize Autoregressive Models and Can Outperform Them on Images |
7, 8, 8, 7 |
|
37 |
7.5 |
Rethinking the Role of Gradient-based Attribution Methods for Model Interpretability |
9, 9, 7, 5 |
|
38 |
7.5 |
Global Convergence of Three-layer Neural Networks in the Mean Field Regime |
9, 7, 7, 7 |
|
39 |
7.5 |
Implicit Normalizing Flows |
8, 7, 7, 8 |
|
40 |
7.5 |
Randomized Automatic Differentiation |
7, 8, 8, 7 |
|
41 |
7.5 |
End-to-end Adversarial Text-to-Speech |
7, 8, 7, 8 |
|
42 |
7.5 |
Correcting experience replay for multi-agent communication |
8, 8, 7, 7 |
|
43 |
7.5 |
Human-Level Performance in No-Press Diplomacy via Equilibrium Search |
7, 8, 7, 8 |
|
44 |
7.5 |
What are the Statistical Limits of Batch RL with Linear Function Approximation? |
8, 7, 8, 7 |
|
45 |
7.5 |
Winning the L2RPN Challenge: Power Grid Management via Semi-Markov Afterstate Actor-Critic |
7, 7, 7, 9 |
|
46 |
7.5 |
Learning with feature dependent label noise: a progressive approach |
7, 8, 7, 8 |
|
47 |
7.5 |
Gauge Equivariant Mesh CNNs: Anisotropic convolutions on geometric graphs |
9, 7, 7, 7 |
|
48 |
7.5 |
Parrot: Data-Driven Behavioral Priors for Reinforcement Learning |
9, 6, 7, 8 |
|
49 |
7.5 |
Rethinking Attention with Performers |
7, 8, 8, 7 |
|
50 |
7.5 |
Grounded Language Learning Fast and Slow |
8, 6, 8, 8 |
|
51 |
7.4 |
Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime |
6, 8, 8, 8, 7 |
|
52 |
7.4 |
Sequential Density Ratio Estimation for Simultaneous Optimization of Speed and Accuracy |
7, 9, 7, 6, 8 |
|
53 |
7.33 |
Stabilized Medical Image Attacks |
7, 7, 8 |
|
54 |
7.33 |
Rao-Blackwellizing the Straight-Through Gumbel-Softmax Gradient Estimator |
7, 7, 8 |
|
55 |
7.33 |
When Do Curricula Work? |
7, 8, 7 |
|
56 |
7.33 |
RMSprop can converge with proper hyper-parameter |
8, 8, 6 |
|
57 |
7.33 |
Image GANs meet Differentiable Rendering for Inverse Graphics and Interpretable 3D Neural Rendering |
8, 8, 6 |
|
58 |
7.33 |
Tent: Fully Test-Time Adaptation by Entropy Minimization |
7, 7, 8 |
|
59 |
7.33 |
Evolving Reinforcement Learning Algorithms |
7, 6, 9 |
|
60 |
7.33 |
Unsupervised Object Keypoint Learning using Local Spatial Predictability |
6, 7, 9 |
|
61 |
7.33 |
Contrastive Explanations for Reinforcement Learning via Embedded Self Predictions |
7, 8, 7 |
|
62 |
7.33 |
UPDeT: Universal Multi-agent RL via Policy Decoupling with Transformers |
6, 9, 7 |
|
63 |
7.25 |
SALD: Sign Agnostic Learning with Derivatives |
8, 8, 6, 7 |
|
64 |
7.25 |
SMiRL: Surprise Minimizing Reinforcement Learning in Unstable Environments |
7, 8, 7, 7 |
|
65 |
7.25 |
Orthogonalizing Convolutional Layers with the Cayley Transform |
7, 7, 7, 8 |
|
66 |
7.25 |
Self-supervised Visual Reinforcement Learning with Object-centric Representations |
5, 7, 9, 8 |
|
67 |
7.25 |
Improved Autoregressive Modeling with Distribution Smoothing |
7, 7, 7, 8 |
|
68 |
7.25 |
Meta-GMVAE: Mixture of Gaussian VAE for Unsupervised Meta-Learning |
7, 7, 8, 7 |
|
69 |
7.25 |
PlasticineLab: A Soft-Body Manipulation Benchmark with Differentiable Physics |
6, 7, 7, 9 |
|
70 |
7.25 |
More or Less: When and How to Build Neural Network Ensembles |
8, 8, 6, 7 |
|
71 |
7.25 |
Dynamics of Deep Equilibrium Linear Models |
8, 7, 7, 7 |
|
72 |
7.25 |
PMI-Masking: Principled masking of correlated spans |
8, 6, 7, 8 |
|
73 |
7.25 |
Sharpness-aware Minimization for Efficiently Improving Generalization |
7, 6, 8, 8 |
|
74 |
7.25 |
MONGOOSE: A Learnable LSH Framework for Efficient Neural Network Training |
7, 7, 7, 8 |
|
75 |
7.25 |
Go with the flow: Adaptive control for Neural ODEs |
7, 7, 8, 7 |
|
76 |
7.25 |
Learning from Protein Structure with Geometric Vector Perceptrons |
6, 6, 10, 7 |
|
77 |
7.25 |
Model Patching: Closing the Subgroup Performance Gap with Data Augmentation |
8, 7, 7, 7 |
|
78 |
7.25 |
Minimum Width for Universal Approximation |
7, 7, 7, 8 |
|
79 |
7.25 |
Benefit of deep learning with non-convex noisy gradient descent: Provable excess risk bound and superiority to kernel methods |
7, 6, 8, 8 |
|
80 |
7.25 |
Multiplicative Filter Networks |
9, 8, 6, 6 |
|
81 |
7.25 |
Unlearnable Examples: Making Personal Data Unexploitable |
7, 7, 8, 7 |
|
82 |
7.25 |
Growing Efficient Deep Networks by Structured Continuous Sparsification |
8, 7, 7, 7 |
|
83 |
7.25 |
DDPNOpt: Differential Dynamic Programming Neural Optimizer |
7, 8, 7, 7 |
|
84 |
7.25 |
Locally Free Weight Sharing for Network Width Search |
7, 8, 6, 8 |
|
85 |
7.25 |
Mutual Information State Intrinsic Control |
7, 7, 7, 8 |
|
86 |
7.25 |
Multivariate Probabilistic Time Series Forecasting via Conditioned Normalizing Flows |
7, 9, 6, 7 |
|
87 |
7.25 |
Support-set bottlenecks for video-text representation learning |
7, 9, 6, 7 |
|
88 |
7.25 |
On the Origin of Implicit Regularization in Stochastic Gradient Descent |
8, 7, 7, 7 |
|
89 |
7.25 |
Unbiased Teacher for Semi-Supervised Object Detection |
6, 9, 7, 7 |
|
90 |
7.25 |
Is Attention Better Than Matrix Decomposition? |
8, 8, 7, 6 |
|
91 |
7.25 |
Improving Adversarial Robustness via Channel-wise Activation Suppressing |
7, 8, 7, 7 |
|
92 |
7.25 |
Gradient Projection Memory for Continual Learning |
8, 8, 5, 8 |
|
93 |
7.25 |
Generalization in data-driven models of primary visual cortex |
8, 8, 6, 7 |
|
94 |
7.25 |
Long-tailed Recognition by Routing Diverse Distribution-Aware Experts |
8, 7, 7, 7 |
|
95 |
7.25 |
Long-tail learning via logit adjustment |
8, 8, 7, 6 |
|
96 |
7.25 |
Graph Convolution with Low-rank Learnable Local Filters |
8, 7, 7, 7 |
|
97 |
7.25 |
Why Are Convolutional Nets More Sample-Efficient than Fully-Connected Nets? |
8, 7, 7, 7 |
|
98 |
7.25 |
Mind the Pad – CNNs Can Develop Blind Spots |
8, 6, 8, 7 |
|
99 |
7.25 |
Self-training For Few-shot Transfer Across Extreme Task Differences |
8, 8, 6, 7 |
|
100 |
7.25 |
Coupled Oscillatory Recurrent Neural Network (coRNN): An accurate and (gradient) stable architecture for learning long time dependencies |
7, 8, 7, 7 |
|
101 |
7.25 |
Federated Learning Based on Dynamic Regularization |
7, 7, 7, 8 |
|
102 |
7.25 |
Fidelity-based Deep Adiabatic Scheduling |
8, 9, 6, 6 |
|
103 |
7.2 |
Intrinsic-Extrinsic Convolution and Pooling for Learning on 3D Protein Structures |
5, 9, 5, 8, 9 |
|
104 |
7 |
Signatory: differentiable computations of the signature and logsignature transforms, on both CPU and GPU |
6, 7, 8, 7 |
|
105 |
7 |
IsarStep: a Benchmark for High-level Mathematical Reasoning |
6, 9, 7, 6 |
|
106 |
7 |
Discovering a set of policies for the worst case reward |
8, 7, 7, 6 |
|
107 |
7 |
SenSeI: Sensitive Set Invariance for Enforcing Individual Fairness |
7, 7, 7, 7 |
|
108 |
7 |
Behavioral Cloning from Noisy Demonstrations |
8, 7, 6 |
|
109 |
7 |
How Does Mixup Help With Robustness and Generalization? |
8, 7, 7, 6 |
|
110 |
7 |
Negative Data Augmentation |
9, 7, 6, 6 |
|
111 |
7 |
Shapley explainability on the data manifold |
7, 7, 8, 6 |
|
112 |
7 |
Individually Fair Gradient Boosting |
7, 7, 7 |
|
113 |
7 |
Memory Optimization for Deep Networks |
6, 8, 7, 7 |
|
114 |
7 |
Molecule Optimization by Explainable Evolution |
8, 7, 6, 7 |
|
115 |
7 |
CaPC Learning: Confidential and Private Collaborative Learning |
7, 7, 7 |
|
116 |
7 |
Analyzing the Expressive Power of Graph Neural Networks in a Spectral Perspective |
8, 6, 6, 8 |
|
117 |
7 |
In-N-Out: Pre-Training and Self-Training using Auxiliary Information for Out-of-Distribution Robustness |
7, 7, 7 |
|
118 |
7 |
Disentangled Recurrent Wasserstein Autoencoder |
7, 7, 7 |
|
119 |
7 |
Explaining the Efficacy of Counterfactually Augmented Data |
7, 6, 7, 8 |
|
120 |
7 |
Multi-timescale Representation Learning in LSTM Language Models |
8, 7, 6, 7 |
|
121 |
7 |
Decoupling Global and Local Representations via Invertible Generative Flows |
8, 6, 7, 7 |
|
122 |
7 |
gradSim: Differentiable simulation for system identification and visuomotor control |
7, 7, 7 |
|
123 |
7 |
CPT: Efficient Deep Neural Network Training via Cyclic Precision |
7, 7, 7, 7 |
|
124 |
7 |
Understanding the role of importance weighting for deep learning |
7, 7, 7, 7 |
|
125 |
7 |
Vector-output ReLU Neural Network Problems are Copositive Programs: Convex Analysis of Two Layer Networks and Polynomial-time Algorithms |
7, 7, 7, 7 |
|
126 |
7 |
RNNLogic: Learning Logic Rules for Reasoning on Knowledge Graphs |
7, 8, 6, 7 |
|
127 |
7 |
Private Post-GAN Boosting |
8, 7, 6 |
|
128 |
7 |
Identifying nonlinear dynamical systems with multiple time scales and long-range dependencies |
8, 7, 6, 7 |
|
129 |
7 |
On Self-Supervised Image Representations for GAN Evaluation |
7, 7, 7, 7 |
|
130 |
7 |
Co-Mixup: Saliency Guided Joint Mixup with Supermodular Diversity |
7, 7, 7 |
|
131 |
7 |
Linear Mode Connectivity in Multitask and Continual Learning |
7, 7, 7 |
|
132 |
7 |
The inductive bias of ReLU networks on orthogonally separable data |
8, 5, 8, 7 |
|
133 |
7 |
Systematic generalisation with group invariant predictions |
6, 6, 8, 8 |
|
134 |
7 |
Iterated learning for emergent systematicity in VQA |
6, 7, 8 |
|
135 |
7 |
Hyperbolic Neural Networks++ |
8, 7, 6, 7 |
|
136 |
7 |
A statistical theory of cold posteriors in deep neural networks |
9, 7, 6, 6 |
|
137 |
7 |
Towards Faster and Stabilized GAN Training for High-fidelity Few-shot Image Synthesis |
7, 7, 7, 7 |
|
138 |
7 |
Bayesian Few-Shot Classification with One-vs-Each Pólya-Gamma Augmented Gaussian Processes |
7, 7, 8, 6 |
|
139 |
7 |
ARMOURED: Adversarially Robust MOdels using Unlabeled data by REgularizing Diversity |
7, 7, 7, 7 |
|
140 |
7 |
Linear Convergent Decentralized Optimization with Compression |
7, 7, 7 |
|
141 |
7 |
Unsupervised Audiovisual Synthesis via Exemplar Autoencoders |
9, 6, 6 |
|
142 |
7 |
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy |
5, 8, 7, 8 |
|
143 |
7 |
How Benign is Benign Overfitting? |
8, 7, 7, 6 |
|
144 |
7 |
Denoising Diffusion Implicit Models |
7, 8, 6 |
|
145 |
7 |
Geometry-Aware Gradient Algorithms for Neural Architecture Search |
6, 8, 7 |
|
146 |
7 |
Neural Topic Model via Optimal Transport |
6, 8, 7, 7 |
|
147 |
7 |
Zero-shot Synthesis with Group-Supervised Learning |
8, 7, 7, 6 |
|
148 |
7 |
Graph Traversal with Tensor Functionals: A Meta-Algorithm for Scalable Learning |
7, 7, 7, 7 |
|
149 |
7 |
Tomographic Auto-Encoder: Unsupervised Bayesian Recovery of Corrupted Data |
7, 7, 7, 7 |
|
150 |
7 |
Large Associative Memory Problem in Neurobiology and Machine Learning |
7, 6, 8, 7 |
|
151 |
7 |
Calibration of Neural Networks using Splines |
8, 8, 5, 7 |
|
152 |
7 |
When does preconditioning help or hurt generalization? |
8, 6, 7 |
|
153 |
7 |
A Good Image Generator Is What You Need for High-Resolution Video Synthesis |
6, 8, 8, 6 |
|
154 |
7 |
Undistillable: Making A Nasty Teacher That CANNOT teach students |
7, 7, 7, 7 |
|
155 |
7 |
A Critique of Self-Expressive Deep Subspace Clustering |
7, 7, 7, 7 |
|
156 |
7 |
Proximal Gradient Descent-Ascent: Variable Convergence under KŁ Geometry |
8, 8, 5, 7 |
|
157 |
7 |
GAN “Steerability” without optimization |
8, 6, 6, 8 |
|
158 |
7 |
Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine Translation |
9, 7, 5, 7 |
|
159 |
7 |
VAEBM: A Symbiosis between Variational Autoencoders and Energy-based Models |
7, 7, 6, 8 |
|
160 |
7 |
Neural Pruning via Growing Regularization |
7, 6, 7, 8 |
|
161 |
7 |
Graph-Based Continual Learning |
6, 7, 8, 7 |
|
162 |
7 |
DINO: A Conditional Energy-Based GAN for Domain Translation |
7, 7, 7 |
|
163 |
7 |
On the Universality of Rotation Equivariant Point Cloud Networks |
8, 6, 6, 8 |
|
164 |
7 |
Contrastive Divergence Learning is a Time Reversal Adversarial Game |
8, 7, 7, 6 |
|
165 |
7 |
Quantifying Differences in Reward Functions |
6, 7, 7, 8 |
|
166 |
7 |
Free Lunch for Few-shot Learning: Distribution Calibration |
7, 7, 7 |
|
167 |
7 |
PseudoSeg: Designing Pseudo Labels for Semantic Segmentation |
6, 8, 7 |
|
168 |
7 |
Learning to Generate 3D Shapes with Generative Cellular Automata |
6, 8, 7 |
|
169 |
7 |
Uncertainty Sets for Image Classifiers using Conformal Prediction |
7, 7, 7, 7 |
|
170 |
7 |
My Body is a Cage: the Role of Morphology in Graph-Based Incompatible Control |
7, 7, 7, 7 |
|
171 |
7 |
BUSTLE: Bottom-up program Synthesis Through Learning-guided Exploration |
8, 6, 9, 5 |
|
172 |
7 |
Global optimality of softmax policy gradient with single hidden layer neural networks in the mean-field regime |
7, 7, 7, 7 |
|
173 |
7 |
Interpretable Neural Architecture Search via Bayesian Optimisation with Weisfeiler-Lehman Kernels |
5, 7, 7, 9 |
|
174 |
7 |
Can a Fruit Fly Learn Word Embeddings? |
7, 7, 7 |
|
175 |
7 |
A Wigner-Eckart Theorem for Group Equivariant Convolution Kernels |
6, 8, 8, 6 |
|
176 |
7 |
Non-asymptotic Confidence Intervals of Off-policy Evaluation: Primal and Dual Bounds |
8, 7, 6, 7 |
|
177 |
7 |
Does enhanced shape bias improve neural network robustness to common corruptions? |
6, 7, 9, 6 |
|
178 |
7 |
Leaky Tiling Activations: A Simple Approach to Learning Sparse Representations Online |
7, 7, 7, 7 |
|
179 |
7 |
Calibration tests beyond classification |
7, 9, 5 |
|
180 |
7 |
A Distributional Approach to Controlled Text Generation |
7, 7, 7 |
|
181 |
7 |
Learning to Recombine and Resample Data For Compositional Generalization |
8, 7, 7, 6 |
|
182 |
7 |
Dataset Inference: Ownership Resolution in Machine Learning |
7, 7, 7 |
|
183 |
7 |
Fast Geometric Projections for Local Robustness Certification |
7, 8, 6, 7 |
|
184 |
7 |
Random Feature Attention |
8, 4, 8, 8 |
|
185 |
7 |
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale |
7, 7, 7, 7 |
|
186 |
7 |
For interpolating kernel machines, minimizing the norm of the ERM solution minimizes stability |
8, 6, 8, 6 |
|
187 |
7 |
Evaluation of Neural Architectures Trained with Square Loss vs Cross-Entropy in Classification Tasks |
7, 7, 6, 8 |
|
188 |
7 |
Physics-Informed Deep Learning of Incompressible Fluid Dynamics |
7, 7, 7, 7 |
|
189 |
7 |
Mathematical Reasoning via Self-supervised Skip-tree Training |
7, 7, 7, 7 |
|
190 |
7 |
Iterative Empirical Game Solving via Single Policy Best Response |
7, 7, 7, 7 |
|
191 |
7 |
Self-Supervised Policy Adaptation during Deployment |
7, 7, 7, 7 |
|
192 |
7 |
Neurally Augmented ALISTA |
5, 7, 8, 8 |
|
193 |
7 |
In Search of Lost Domain Generalization |
8, 7, 6, 7 |
|
194 |
7 |
BOIL: Towards Representation Change for Few-shot Learning |
7, 7, 7 |
|
195 |
7 |
Neural gradients are near-lognormal: improved quantized and sparse training |
8, 6, 7, 7 |
|
196 |
7 |
Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval |
6, 9, 7, 6 |
|
197 |
7 |
Meta-learning Symmetries by Reparameterization |
6, 8, 9, 5 |
|
198 |
7 |
Spatio-Temporal Graph Scattering Transform |
6, 9, 7, 6 |
|
199 |
7 |
Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels |
7, 7, 7, 7 |
|
200 |
7 |
Deep Equals Shallow for ReLU Networks in Kernel Regimes |
6, 6, 7, 9 |
|
201 |
7 |
Fast convergence of stochastic subgradient method under interpolation |
7, 8, 6, 7 |
|
202 |
7 |
Lie Algebra Convolutional Neural Networks with Automatic Symmetry Extraction |
7, 8, 6 |
|
203 |
7 |
Model-Based Visual Planning with Self-Supervised Functional Distances |
7, 7, 7, 7 |
|
204 |
7 |
A Gradient Flow Framework For Analyzing Network Pruning |
6, 6, 9, 7 |
|
205 |
7 |
BRECQ: Pushing the Limit of Post-Training Quantization by Block Reconstruction |
7, 8, 6, 7 |
|
206 |
7 |
Practical Real Time Recurrent Learning with a Sparse Approximation |
8, 7, 7, 6 |
|
207 |
7 |
Isotropy in the Contextual Embedding Space: Clusters and Manifolds |
7, 7, 7 |
|
208 |
7 |
Information-theoretic Probing Explains Reliance on Spurious Features |
6, 7, 8 |
|
209 |
7 |
Retrieval-Augmented Generation for Code Summarization via Hybrid GNN |
7, 7, 7 |
|
210 |
7 |
On the geometry of generalization and memorization in deep neural networks |
7, 7, 7, 7 |
|
211 |
7 |
Async-RED: A Provably Convergent Asynchronous Block Parallel Stochastic Method using Deep Denoising Priors |
8, 6, 7, 7 |
|
212 |
7 |
Neural ODE Processes |
7, 7, 7, 7 |
|
213 |
6.8 |
Refining Deep Generative Models via Wasserstein Gradient Flows |
6, 7, 7, 7, 7 |
|
214 |
6.8 |
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech |
5, 7, 8, 7, 7 |
|
215 |
6.8 |
The geometry of integration in text classification RNNs |
7, 7, 7, 8, 5 |
|
216 |
6.8 |
Learning to Represent Action Values as a Hypergraph on the Action Vertices |
7, 5, 8, 6, 8 |
|
217 |
6.8 |
A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks |
7, 6, 6, 8, 7 |
|
218 |
6.8 |
Regularized Inverse Reinforcement Learning |
7, 8, 6, 7, 6 |
|
219 |
6.8 |
Lifelong Learning of Compositional Structures |
6, 6, 7, 6, 9 |
|
220 |
6.8 |
DeepAveragers: Offline Reinforcement Learning By Solving Derived Non-Parametric MDPs |
6, 7, 7, 7, 7 |
|
221 |
6.75 |
Self-supervised Representation Learning with Relative Predictive Coding |
6, 6, 8, 7 |
|
222 |
6.75 |
Wandering within a world: Online contextualized few-shot learning |
7, 6, 7, 7 |
|
223 |
6.75 |
Randomized Ensembled Double Q-Learning: Learning Fast Without a Model |
7, 7, 6, 7 |
|
224 |
6.75 |
Black-Box Optimization Revisited: Improving Algorithm Selection Wizards through Massive Benchmarking |
6, 7, 5, 9 |
|
225 |
6.75 |
Tight Frame Contractions in Deep Networks |
6, 6, 7, 8 |
|
226 |
6.75 |
Adversarial score matching and improved sampling for image generation |
7, 6, 7, 7 |
|
227 |
6.75 |
IDF++: Analyzing and Improving Integer Discrete Flows for Lossless Compression |
7, 6, 7, 7 |
|
228 |
6.75 |
Ask Your Humans: Using Human Instructions to Improve Generalization in Reinforcement Learning |
7, 5, 7, 8 |
|
229 |
6.75 |
Interpreting Knowledge Graph Relation Representation from Word Embeddings |
6, 7, 7, 7 |
|
230 |
6.75 |
Generalization bounds via distillation |
6, 6, 7, 8 |
|
231 |
6.75 |
Towards Resolving the Implicit Bias of Gradient Descent for Matrix Factorization: Greedy Low-Rank Learning |
6, 7, 8, 6 |
|
232 |
6.75 |
Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability |
6, 5, 8, 8 |
|
233 |
6.75 |
Do not Let Privacy Overbill Utility: Gradient Embedding Perturbation for Private Learning |
6, 7, 9, 5 |
|
234 |
6.75 |
Amending Mistakes Post-hoc in Deep Networks by Leveraging Class Hierarchies |
8, 7, 6, 6 |
|
235 |
6.75 |
On the Critical Role of Conventions in Adaptive Human-AI Collaboration |
6, 7, 7, 7 |
|
236 |
6.75 |
Creative Sketch Generation |
6, 7, 7, 7 |
|
237 |
6.75 |
How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks? |
6, 7, 6, 8 |
|
238 |
6.75 |
Domain-Robust Visual Imitation Learning with Mutual Information Constraints |
7, 6, 7, 7 |
|
239 |
6.75 |
Predictive Uncertainty in Deep Object Detectors: Estimation and Evaluation |
6, 9, 6, 6 |
|
240 |
6.75 |
Efficient Transformers in Reinforcement Learning using Actor-Learner Distillation |
8, 7, 7, 5 |
|
241 |
6.75 |
Probabilistic Numeric Convolutional Neural Networks |
7, 7, 6, 7 |
|
242 |
6.75 |
LiftPool: Bidirectional ConvNet Pooling |
7, 5, 8, 7 |
|
243 |
6.75 |
The Intrinsic Dimension of Images and Its Impact on Learning |
6, 7, 8, 6 |
|
244 |
6.75 |
Learning Robust State Abstractions for Hidden-Parameter Block MDPs |
7, 7, 6, 7 |
|
245 |
6.75 |
Distilling Knowledge from Reader to Retriever for Question Answering |
6, 7, 7, 7 |
|
246 |
6.75 |
Intraclass clustering: an implicit learning ability that regularizes DNNs |
6, 8, 7, 6 |
|
247 |
6.75 |
Deep Representational Re-tuning using Contrastive Tension |
9, 5, 6, 7 |
|
248 |
6.75 |
Few-Shot Learning via Learning the Representation, Provably |
6, 8, 7, 6 |
|
249 |
6.75 |
Getting a CLUE: A Method for Explaining Uncertainty Estimates |
7, 7, 7, 6 |
|
250 |
6.75 |
Learnable Embedding Sizes for Recommender Systems |
6, 7, 7, 7 |
|
251 |
6.75 |
Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks |
6, 7, 7, 7 |
|
252 |
6.75 |
Sparse Quantized Spectral Clustering |
7, 6, 7, 7 |
|
253 |
6.75 |
Differentially Private Learning Needs Better Features (or Much More Data) |
7, 7, 7, 6 |
|
254 |
6.75 |
What Makes Instance Discrimination Good for Transfer Learning? |
7, 7, 5, 8 |
|
255 |
6.75 |
Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval |
5, 7, 6, 9 |
|
256 |
6.75 |
Group Equivariant Stand-Alone Self-Attention For Vision |
7, 6, 8, 6 |
|
257 |
6.75 |
H-divergence: A Decision-Theoretic Discrepancy Measure for Two Sample Tests |
7, 9, 5, 6 |
|
258 |
6.75 |
Robust early-learning: Hindering the memorization of noisy labels |
7, 7, 7, 6 |
|
259 |
6.75 |
A Temporal Kernel Approach for Deep Learning with Continuous-time Information |
6, 7, 7, 7 |
|
260 |
6.75 |
Wasserstein Embedding for Graph Learning |
6, 6, 7, 8 |
|
261 |
6.75 |
LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning |
6, 7, 8, 6 |
|
262 |
6.75 |
Efficient Reinforcement Learning in Factored MDPs with Application to Constrained RL |
7, 7, 6, 7 |
|
263 |
6.75 |
Hopper: Multi-hop Transformer for Spatiotemporal Reasoning |
6, 7, 6, 8 |
|
264 |
6.75 |
Lipschitz-Bounded Equilibrium Networks |
8, 6, 6, 7 |
|
265 |
6.75 |
Selective Classification Can Magnify Disparities Across Groups |
5, 7, 8, 7 |
|
266 |
6.75 |
Deep Neural Tangent Kernel and Laplace Kernel Have the Same RKHS |
5, 7, 7, 8 |
|
267 |
6.75 |
Efficient Generalized Spherical CNNs |
6, 6, 7, 8 |
|
268 |
6.75 |
Modeling the Second Player in Distributionally Robust Optimization |
7, 7, 6, 7 |
|
269 |
6.75 |
Balancing Constraints and Rewards with Meta-Gradient D4PG |
7, 7, 7, 6 |
|
270 |
6.75 |
When Optimizing $f$-Divergence is Robust with Label Noise |
7, 6, 7, 7 |
|
271 |
6.75 |
An Unsupervised Deep Learning Approach for Real-World Image Denoising |
6, 6, 8, 7 |
|
272 |
6.75 |
Linear Last-iterate Convergence in Constrained Saddle-point Optimization |
7, 7, 7, 6 |
|
273 |
6.75 |
Neural Thompson Sampling |
6, 7, 7, 7 |
|
274 |
6.75 |
On Position Embeddings in BERT |
6, 7, 8, 6 |
|
275 |
6.75 |
Data-Efficient Reinforcement Learning with Self-Predictive Representations |
7, 7, 7, 6 |
|
276 |
6.75 |
Long Range Arena: A Benchmark for Efficient Transformers |
6, 7, 7, 7 |
|
277 |
6.75 |
Effective Abstract Reasoning with Dual-Contrast Network |
7, 7, 8, 5 |
|
278 |
6.75 |
On Graph Neural Networks versus Graph-Augmented MLPs |
7, 5, 8, 7 |
|
279 |
6.75 |
Categorical Normalizing Flows via Continuous Transformations |
7, 7, 6, 7 |
|
280 |
6.75 |
Private Image Reconstruction from System Side Channels Using Generative Models |
7, 5, 7, 8 |
|
281 |
6.75 |
Activation-level uncertainty in deep neural networks |
6, 6, 8, 7 |
|
282 |
6.75 |
Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization |
7, 5, 7, 8 |
|
283 |
6.75 |
Variational Multi-Task Learning |
7, 7, 5, 8 |
|
284 |
6.75 |
Representation Balancing Offline Model-based Reinforcement Learning |
7, 7, 7, 6 |
|
285 |
6.75 |
Self-supervised representation learning via adaptive hard-positive mining |
7, 6, 7, 7 |
|
286 |
6.75 |
Uncertainty Estimation and Calibration with Finite-State Probabilistic RNNs |
7, 7, 6, 7 |
|
287 |
6.75 |
Optimal Regularization can Mitigate Double Descent |
7, 7, 6, 7 |
|
288 |
6.75 |
A Sharp Analysis of Model-based Reinforcement Learning with Self-Play |
8, 8, 7, 4 |
|
289 |
6.75 |
Evaluations and Methods for Explanation through Robustness Analysis |
7, 7, 6, 7 |
|
290 |
6.75 |
Heteroskedastic and Imbalanced Deep Learning with Adaptive Regularization |
5, 6, 7, 9 |
|
291 |
6.75 |
Multi-Time Attention Networks for Irregularly Sampled Time Series |
7, 6, 7, 7 |
|
292 |
6.75 |
DICE: Diversity in Deep Ensembles via Conditional Redundancy Adversarial Estimation |
6, 7, 6, 8 |
|
293 |
6.75 |
A Better Alternative to Error Feedback for Communication-Efficient Distributed Learning |
9, 7, 6, 5 |
|
294 |
6.75 |
Parameter-based Value Functions |
7, 7, 6, 7 |
|
295 |
6.75 |
Quantifying Statistical Significance of Neural Network Representation-Driven Hypotheses by Selective Inference |
6, 6, 7, 8 |
|
296 |
6.75 |
UMEC: Unified model and embedding compression for efficient recommendation systems |
6, 7, 7, 7 |
|
297 |
6.75 |
Structured Prediction as Translation between Augmented Natural Languages |
6, 8, 6, 7 |
|
298 |
6.75 |
Saliency is a Possible Red Herring When Diagnosing Poor Generalization |
6, 7, 7, 7 |
|
299 |
6.75 |
DynaTune: Dynamic Tensor Program Optimization in Deep Neural Network Compilation |
7, 6, 7, 7 |
|
300 |
6.75 |
MALI: A memory efficient and reverse accurate integrator for Neural ODEs |
7, 7, 6, 7 |
|
301 |
6.75 |
Towards Robust Neural Networks via Close-loop Control |
7, 7, 6, 7 |
|
302 |
6.75 |
The Risks of Invariant Risk Minimization |
7, 7, 7, 6 |
|
303 |
6.75 |
HW-NAS-Bench: Hardware-Aware Neural Architecture Search Benchmark |
7, 7, 6, 7 |
|
304 |
6.75 |
Learning Structural Edits via Incremental Tree Transformations |
5, 7, 7, 8 |
|
305 |
6.75 |
Active Contrastive Learning of Audio-Visual Video Representations |
7, 6, 7, 7 |
|
306 |
6.75 |
Learning Associative Inference Using Fast Weight Memory |
7, 7, 7, 6 |
|
307 |
6.75 |
Quickest change detection for multi-task problems under unknown parameters |
6, 7, 7, 7 |
|
308 |
6.75 |
Universal ASR: Unify and Improve Streaming ASR with Full-context Modeling |
7, 7, 7, 6 |
|
309 |
6.75 |
Policy-Driven Attack: Learning to Query for Hard-label Black-box Adversarial Examples |
7, 7, 6, 7 |
|
310 |
6.75 |
Systematic Analysis of Cluster Similarity Indices: How to Validate Validation Measures |
7, 6, 7, 7 |
|
311 |
6.75 |
Robust Reinforcement Learning on State Observations with Learned Optimal Adversary |
7, 7, 7, 6 |
|
312 |
6.75 |
Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multilingual Models |
8, 6, 7, 6 |
|
313 |
6.75 |
Pre-training Text-to-Text Transformers to Write and Reason with Concepts |
4, 7, 8, 8 |
|
314 |
6.75 |
Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth |
6, 8, 6, 7 |
|
315 |
6.75 |
Mind the Gap when Conditioning Amortised Inference in Sequential Latent-Variable Models |
6, 7, 7, 7 |
|
316 |
6.75 |
Representing Partial Programs with Blended Abstract Semantics |
7, 6, 7, 7 |
|
317 |
6.75 |
Training independent subnetworks for robust prediction |
8, 7, 6, 6 |
|
318 |
6.75 |
Rank the Episodes: A Simple Approach for Exploration in Procedurally-Generated Environments |
7, 7, 7, 6 |
|
319 |
6.75 |
Learning to live with Dale’s principle: ANNs with separate excitatory and inhibitory units |
6, 6, 6, 9 |
|
320 |
6.75 |
Learning Visual Representation from Human Interactions |
8, 6, 9, 4 |
|
321 |
6.75 |
Rethinking Positional Encoding in Language Pre-training |
7, 7, 7, 6 |
|
322 |
6.75 |
Regularization Matters in Policy Optimization - An Empirical Study on Continuous Control |
7, 6, 7, 7 |
|
323 |
6.75 |
MC-LSTM: Mass-conserving LSTM |
7, 7, 6, 7 |
|
324 |
6.75 |
Hierarchical Autoregressive Modeling for Neural Video Compression |
7, 7, 6, 7 |
|
325 |
6.75 |
Towards A Unified Understanding and Improving of Adversarial Transferability |
6, 10, 5, 6 |
|
326 |
6.75 |
Perceptual Adversarial Robustness: Generalizable Defenses Against Unforeseen Threat Models |
7, 7, 6, 7 |
|
327 |
6.75 |
Computational Separation Between Convolutional and Fully-Connected Networks |
5, 6, 8, 8 |
|
328 |
6.75 |
Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking |
6, 7, 7, 7 |
|
329 |
6.75 |
Learning A Minimax Optimizer: A Pilot Study |
7, 7, 7, 6 |
|
330 |
6.75 |
Learning to Set Waypoints for Audio-Visual Navigation |
7, 7, 7, 6 |
|
331 |
6.75 |
GraphCodeBERT: Pre-training Code Representations with Data Flow |
7, 7, 7, 6 |
|
332 |
6.67 |
Uncertainty in Structured Prediction |
7, 7, 6 |
|
333 |
6.67 |
Learning Energy-Based Models by Diffusion Recovery Likelihood |
7, 7, 6 |
|
334 |
6.67 |
RODE: Learning Roles to Decompose Multi-Agent Tasks |
8, 6, 6 |
|
335 |
6.67 |
Understanding and Improving Lexical Choice in Non-Autoregressive Translation |
7, 7, 6 |
|
336 |
6.67 |
Learning to Identify Physical Laws of Hamiltonian Systems via Meta-Learning |
7, 7, 6 |
|
337 |
6.67 |
Contextual Dropout: An Efficient Sample-Dependent Dropout Module |
6, 7, 7 |
|
338 |
6.67 |
Directed Acyclic Graph Neural Networks |
6, 7, 7 |
|
339 |
6.67 |
SEED: Self-supervised Distillation For Visual Representation |
7, 7, 6 |
|
340 |
6.67 |
Efficient Conformal Prediction via Cascaded Inference with Expanded Admission |
8, 6, 6 |
|
341 |
6.67 |
Learning to Make Decisions via Submodular Regularization |
7, 7, 6 |
|
342 |
6.67 |
LowKey: Leveraging Adversarial Attacks to Protect Social Media Users from Facial Recognition |
7, 6, 7 |
|
343 |
6.67 |
Hopfield Networks is All You Need |
7, 6, 7 |
|
344 |
6.67 |
Achieving Linear Speedup with Partial Worker Participation in Non-IID Federated Learning |
7, 6, 7 |
|
345 |
6.67 |
Online Adversarial Purification based on Self-supervised Learning |
6, 7, 7 |
|
346 |
6.67 |
Influence Estimation for Generative Adversarial Networks |
6, 7, 7 |
|
347 |
6.67 |
A Block Minifloat Representation for Training Deep Neural Networks |
6, 7, 7 |
|
348 |
6.67 |
Near-Optimal Linear Regression under Distribution Shift |
6, 8, 6 |
|
349 |
6.67 |
Implicit Convex Regularizers of CNN Architectures: Convex Optimization of Two- and Three-Layer Networks in Polynomial Time |
6, 7, 7 |
|
350 |
6.67 |
Representation learning for improved interpretability and classification accuracy of clinical factors from EEG |
7, 6, 7 |
|
351 |
6.67 |
Explaining by Imitating: Understanding Decisions by Interpretable Policy Learning |
7, 7, 6 |
|
352 |
6.67 |
Information Laundering for Model Privacy |
7, 6, 7 |
|
353 |
6.67 |
Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization |
7, 6, 7 |
|
354 |
6.67 |
Partitioned Learned Bloom Filters |
7, 7, 6 |
|
355 |
6.67 |
A unifying view on implicit bias in training linear neural networks |
7, 7, 6 |
|
356 |
6.67 |
SEDONA: Search for Decoupled Neural Networks toward Greedy Block-wise Learning |
6, 7, 7 |
|
357 |
6.67 |
You Only Need Adversarial Supervision for Semantic Image Synthesis |
7, 6, 7 |
|
358 |
6.67 |
Varying Coefficient Neural Network with Functional Targeted Regularization for Estimating Continuous Treatment Effects |
5, 6, 9 |
|
359 |
6.67 |
Symmetry-Aware Actor-Critic for 3D Molecular Design |
8, 6, 6 |
|
360 |
6.67 |
Sliced Kernelized Stein Discrepancy |
6, 6, 8 |
|
361 |
6.67 |
Learning Value Functions in Deep Policy Gradients using Residual Variance |
5, 7, 8 |
|
362 |
6.67 |
Towards Robustness Against Natural Language Word Substitutions |
6, 7, 7 |
|
363 |
6.67 |
Offline Model-Based Optimization via Normalized Maximum Likelihood Estimation |
8, 6, 6 |
|
364 |
6.67 |
Long Live the Lottery: The Existence of Winning Tickets in Lifelong Learning |
5, 7, 8 |
|
365 |
6.67 |
Differentiable Segmentation of Sequences |
7, 7, 6 |
|
366 |
6.67 |
Global inducing point variational posteriors for Bayesian neural networks and deep Gaussian processes |
6, 7, 7 |
|
367 |
6.67 |
Towards Practical Second Order Optimization for Deep Learning |
6, 7, 7 |
|
368 |
6.67 |
Variational inference for diffusion modulated Cox processes |
6, 7, 7 |
|
369 |
6.67 |
Progressive Skeletonization: Trimming more fat from a network at initialization |
7, 7, 6 |
|
370 |
6.67 |
Filtered Inner Product Projection for Multilingual Embedding Alignment |
6, 8, 6 |
|
371 |
6.67 |
Reweighting Augmented Samples by Minimizing the Maximal Expected Loss |
7, 7, 6 |
|
372 |
6.67 |
Improving Transformation Invariance in Contrastive Representation Learning |
7, 6, 7 |
|
373 |
6.67 |
Continual learning in recurrent neural networks |
7, 6, 7 |
|
374 |
6.67 |
Average-case Acceleration for Bilinear Games and Normal Matrices |
6, 7, 7 |
|
375 |
6.67 |
Clustering-friendly Representation Learning via Instance Discrimination and Feature Decorrelation |
7, 7, 6 |
|
376 |
6.67 |
Robust Overfitting may be mitigated by properly learned smoothening |
7, 7, 6 |
|
377 |
6.6 |
BERTology Meets Biology: Interpreting Attention in Protein Language Models |
7, 6, 7, 6, 7 |
|
378 |
6.6 |
BeBold: Exploration Beyond the Boundary of Explored Regions |
5, 4, 7, 9, 8 |
|
379 |
6.6 |
Large Scale Image Completion via Co-Modulated Generative Adversarial Networks |
6, 8, 4, 8, 7 |
|
380 |
6.6 |
A Universal Representation Transformer Layer for Few-Shot Image Classification |
6, 6, 7, 8, 6 |
|
381 |
6.6 |
Text Generation by Learning from Off-Policy Demonstrations |
7, 5, 7, 7, 7 |
|
382 |
6.6 |
Conditionally Adaptive Multi-Task Learning: Improving Transfer Learning in NLP Using Fewer Parameters & Less Data |
6, 7, 6, 6, 8 |
|
383 |
6.6 |
Learning Safe Multi-agent Control with Decentralized Neural Barrier Certificates |
7, 8, 8, 6, 4 |
|
384 |
6.6 |
Physics-aware, probabilistic model order reduction with guaranteed stability |
6, 7, 6, 7, 7 |
|
385 |
6.6 |
NBDT: Neural-Backed Decision Tree |
8, 6, 7, 6, 6 |
|
386 |
6.5 |
NeMo: Neural Mesh Models of Contrastive Features for Robust 3D Pose Estimation |
6, 7, 7, 6 |
|
387 |
6.5 |
Symmetry, Conservation Laws, and Learning Dynamics in Neural Networks |
8, 5, 6, 7 |
|
388 |
6.5 |
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis |
6, 6, 5, 9 |
|
389 |
6.5 |
A Diffusion Theory For Deep Learning Dynamics: Stochastic Gradient Descent Exponentially Favors Flat Minima |
6, 6, 7, 7 |
|
390 |
6.5 |
Continuous Wasserstein-2 Barycenter Estimation without Minimax Optimization |
6, 6, 7, 7 |
|
391 |
6.5 |
DOP: Off-Policy Multi-Agent Decomposed Policy Gradients |
7, 9, 3, 7 |
|
392 |
6.5 |
Boost then Convolve: Gradient Boosting Meets Graph Neural Networks |
6, 6, 9, 5 |
|
393 |
6.5 |
Knowledge Distillation as Semiparametric Inference |
6, 6, 8, 6 |
|
394 |
6.5 |
Neural Approximate Sufficient Statistics for Likelihood-free Inference |
6, 6, 7, 7 |
|
395 |
6.5 |
Heating up decision boundaries: isocapacitory saturation, adversarial scenarios and generalization bounds |
7, 5, 8, 6 |
|
396 |
6.5 |
A Universal Learnable Audio Frontend |
7, 7, 8, 4 |
|
397 |
6.5 |
WrapNet: Neural Net Inference with Ultra-Low-Precision Arithmetic |
7, 7, 7, 5 |
|
398 |
6.5 |
Spatially Structured Recurrent Modules |
6, 7, 7, 6 |
|
399 |
6.5 |
WaveGrad: Estimating Gradients for Waveform Generation |
6, 8, 7, 5 |
|
400 |
6.5 |
Learning Parametrised Graph Shift Operators |
7, 7, 5, 7 |
|
401 |
6.5 |
Dance Revolution: Long-Term Dance Generation with Music via Curriculum Learning |
6, 7, 6, 7 |
|
402 |
6.5 |
Bidirectional Variational Inference for Non-Autoregressive Text-to-Speech |
7, 6, 5, 8 |
|
403 |
6.5 |
Variational Auto-Encoder Architectures that Excel at Causal Inference |
7, 6, 7, 6 |
|
404 |
6.5 |
Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers |
7, 6, 6, 7 |
|
405 |
6.5 |
Byzantine-Resilient Non-Convex Stochastic Gradient Descent |
8, 7, 6, 5 |
|
406 |
6.5 |
Meta Back-Translation |
6, 7, 7, 6 |
|
407 |
6.5 |
Noise or Signal: The Role of Image Backgrounds in Object Recognition |
7, 5, 6, 8 |
|
408 |
6.5 |
Local Search Algorithms for Rank-Constrained Convex Optimization |
6, 7, 7, 6 |
|
409 |
6.5 |
Neural networks with late-phase weights |
7, 6, 7, 6 |
|
410 |
6.5 |
Topology-Aware Segmentation Using Discrete Morse Theory |
7, 8, 5, 6 |
|
411 |
6.5 |
Viewmaker Networks: Learning Views for Unsupervised Representation Learning |
7, 7, 6, 6 |
|
412 |
6.5 |
A Hypergradient Approach to Robust Regression without Correspondence |
7, 5, 8, 6 |
|
413 |
6.5 |
The role of Disentanglement in Generalisation |
5, 7, 6, 8 |
|
414 |
6.5 |
Grounding Physical Object and Event Concepts Through Dynamic Visual Reasoning |
6, 7, 7, 6 |
|
415 |
6.5 |
INT: An Inequality Benchmark for Evaluating Generalization in Theorem Proving |
7, 7, 6, 6 |
|
416 |
6.5 |
On Effective Parallelization of Monte Carlo Tree Search |
7, 7, 6, 6 |
|
417 |
6.5 |
Emergent Symbols through Binding in External Memory |
7, 7, 7, 5 |
|
418 |
6.5 |
Generalized Variational Continual Learning |
7, 7, 8, 4 |
|
419 |
6.5 |
Anatomy of Catastrophic Forgetting: Hidden Representations and Task Semantics |
7, 6, 6, 7 |
|
420 |
6.5 |
GANs Can Play Lottery Tickets Too |
6, 6, 6, 8 |
|
421 |
6.5 |
Improved Estimation of Concentration Under $\ell_p$-Norm Distance Metrics Using Half Spaces |
7, 7, 6, 6 |
|
422 |
6.5 |
Return-Based Contrastive Representation Learning for Reinforcement Learning |
6, 7, 6, 7 |
|
423 |
6.5 |
Chaos of Learning Beyond Zero-sum and Coordination via Game Decompositions |
5, 7, 7, 7 |
|
424 |
6.5 |
Learning Long-term Visual Dynamics with Region Proposal Interaction Networks |
6, 7, 6, 7 |
|
425 |
6.5 |
Combining Label Propagation and Simple Models out-performs Graph Neural Networks |
6, 6, 7, 7 |
|
426 |
6.5 |
Benchmarks for Deep Off-Policy Evaluation |
6, 6, 7, 7 |
|
427 |
6.5 |
In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label Selection Framework for Semi-Supervised Learning |
6, 5, 6, 9 |
|
428 |
6.5 |
Unsupervised Representation Learning for Time Series with Temporal Neighborhood Coding |
6, 6, 6, 8 |
|
429 |
6.5 |
Discovering Autoregressive Orderings with Variational Inference |
6, 7, 7, 6 |
|
430 |
6.5 |
A Deeper Look at the Layerwise Sparsity of Magnitude-based Pruning |
6, 8, 5, 7 |
|
431 |
6.5 |
Transformers for Modeling Physical Systems |
7, 6, 7, 6 |
|
432 |
6.5 |
Meta-learning with negative learning rates |
6, 6, 6, 8 |
|
433 |
6.5 |
What Can Phase Retrieval Tell Us About Private Distributed Learning? |
7, 7, 8, 4 |
|
434 |
6.5 |
FairFil: Contrastive Neural Debiasing Method for Pretrained Text Encoders |
7, 6, 6, 7 |
|
435 |
6.5 |
Task-Agnostic Morphology Evolution |
6, 7, 7, 6 |
|
436 |
6.5 |
Dynamic Tensor Rematerialization |
6, 6, 7, 7 |
|
437 |
6.5 |
Combining Ensembles and Data Augmentation Can Harm Your Calibration |
4, 7, 8, 7 |
|
438 |
6.5 |
Training GANs with Stronger Augmentations via Contrastive Discriminator |
7, 7, 6, 6 |
|
439 |
6.5 |
Empirical or Invariant Risk Minimization? A Sample Complexity Perspective |
7, 7, 6, 6 |
|
440 |
6.5 |
MultiModalQA: complex question answering over text, tables and images |
6, 6, 8, 6 |
|
441 |
6.5 |
On Noise Injection in Generative Adversarial Networks |
7, 7, 6, 6 |
|
442 |
6.5 |
Primal Wasserstein Imitation Learning |
6, 8, 6, 6 |
|
443 |
6.5 |
Adapting to Reward Progressivity via Spectral Reinforcement Learning |
6, 6, 7, 7 |
|
444 |
6.5 |
PC2WF: 3D Wireframe Reconstruction from Raw Point Clouds |
6, 6, 7, 7 |
|
445 |
6.5 |
Meta-Learning of Compositional Task Distributions in Humans and Machines |
6, 6, 7, 7 |
|
446 |
6.5 |
Learning Deep Features in Instrumental Variable Regression |
5, 6, 8, 7 |
|
447 |
6.5 |
Uncertainty in Gradient Boosting via Ensembles |
7, 7, 6, 6 |
|
448 |
6.5 |
Information Condensing Active Learning |
8, 6, 6, 6 |
|
449 |
6.5 |
Revisiting Locally Supervised Training of Deep Neural Networks |
7, 7, 6, 6 |
|
450 |
6.5 |
ChipNet: Budget-Aware Pruning with Heaviside Continuous Approximations |
6, 7, 7, 6 |
|
451 |
6.5 |
HeteroFL: Computation and Communication Efficient Federated Learning for Heterogeneous Clients |
6, 6, 7, 7 |
|
452 |
6.5 |
Overfitting for Fun and Profit: Instance-Adaptive Data Compression |
6, 7, 7, 6 |
|
453 |
6.5 |
Factorizing Declarative and Procedural Knowledge in Structured, Dynamical Environments |
5, 6, 8, 7 |
|
454 |
6.5 |
Meta Attention Networks: Meta-Learning Attention to Modulate Information Between Recurrent Independent Mechanisms |
7, 7, 7, 5 |
|
455 |
6.5 |
Contextual Transformation Networks for Online Continual Learning |
7, 6, 7, 6 |
|
456 |
6.5 |
Spatial Dependency Networks: Neural Layers for Improved Generative Image Modeling |
6, 7, 6, 7 |
|
457 |
6.5 |
Improving Learning to Branch via Reinforcement Learning |
8, 7, 7, 4 |
|
458 |
6.5 |
Mastering Atari with Discrete World Models |
4, 10, 7, 5 |
|
459 |
6.5 |
CopulaGNN: Towards Integrating Representational and Correlational Roles of Graphs in Graph Neural Networks |
7, 7, 7, 5 |
|
460 |
6.5 |
Learning continuous-time PDEs from sparse data with graph neural networks |
7, 6, 6, 7 |
|
461 |
6.5 |
Meta-Learning in Reproducing Kernel Hilbert Space |
7, 5, 7, 7 |
|
462 |
6.5 |
Improving VAEs' Robustness to Adversarial Attack |
7, 6, 6, 7 |
|
463 |
6.5 |
Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning |
5, 7, 8, 6 |
|
464 |
6.5 |
Graph Coarsening with Neural Networks |
7, 7, 6, 6 |
|
465 |
6.5 |
Asymmetric self-play for automatic goal discovery in robotic manipulation |
6, 7, 7, 6 |
|
466 |
6.5 |
Open Question Answering over Tables and Text |
6, 7, 7, 6 |
|
467 |
6.5 |
Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning |
7, 7, 6, 6 |
|
468 |
6.5 |
Conservative Safety Critics for Exploration |
6, 7, 7, 6 |
|
469 |
6.5 |
Adaptive Universal Generalized PageRank Graph Neural Network |
4, 7, 9, 6 |
|
470 |
6.5 |
Learning Neural Event Functions for Ordinary Differential Equations |
7, 7, 6, 6 |
|
471 |
6.5 |
Language-Agnostic Representation Learning of Source Code from Structure and Context |
7, 7, 6, 6 |
|
472 |
6.5 |
Generalized Stochastic Backpropagation |
5, 5, 6, 10 |
|
473 |
6.5 |
Sparsifying Networks via Subdifferential Inclusion |
5, 5, 9, 7 |
|
474 |
6.5 |
Knowledge distillation via softmax regression representation learning |
7, 7, 6, 6 |
|
475 |
6.5 |
Enjoy Your Editing: Controllable GANs for Image Editing via Latent Space Navigation |
8, 6, 6, 6 |
|
476 |
6.5 |
TropEx: An Algorithm for Extracting Linear Terms in Deep Neural Networks |
6, 6, 8, 6 |
|
477 |
6.5 |
Deep Networks and the Multiple Manifold Problem |
8, 5, 7, 6 |
|
478 |
6.5 |
Revisiting Dynamic Convolution via Matrix Decomposition |
7, 6, 6, 7 |
|
479 |
6.5 |
A Trainable Optimal Transport Embedding for Feature Aggregation |
6, 7, 6, 7 |
|
480 |
6.5 |
On Statistical Bias In Active Learning: How and When to Fix It |
8, 7, 4, 7 |
|
481 |
6.5 |
Convex Potential Flows: Universal Probability Distributions with Optimal Transport and Convex Optimization |
8, 5, 7, 6 |
|
482 |
6.5 |
MoVie: Revisiting Modulated Convolutions for Visual Counting and Beyond |
6, 7, 7, 6 |
|
483 |
6.5 |
MoPro: Webly Supervised Learning with Momentum Prototypes |
6, 7, 6, 7 |
|
484 |
6.5 |
Scalable Bayesian Inverse Reinforcement Learning by Auto-Encoding Reward |
6, 7, 6, 7 |
|
485 |
6.5 |
Interactive Weak Supervision: Learning Useful Heuristics for Data Labeling |
8, 6, 6, 6 |
|
486 |
6.5 |
New Bounds For Distributed Mean Estimation and Variance Reduction |
6, 6, 7, 7 |
|
487 |
6.5 |
Batch Reinforcement Learning Through Continuation Method |
4, 6, 9, 7 |
|
488 |
6.5 |
ColdExpand: Semi-Supervised Graph Learning in Cold Start |
5, 9, 6, 6 |
|
489 |
6.5 |
Fourier Neural Operator for Parametric Partial Differential Equations |
7, 6, 8, 5 |
|
490 |
6.5 |
Empirical Analysis of Unlabeled Entity Problem in Named Entity Recognition |
8, 5, 6, 7 |
|
491 |
6.5 |
Training BatchNorm and Only BatchNorm: On the Expressive Power of Random Features in CNNs |
8, 6, 6, 6 |
|
492 |
6.5 |
Collective Robustness Certificates |
5, 7, 6, 8 |
|
493 |
6.5 |
Tilted Empirical Risk Minimization |
6, 6, 6, 8 |
|
494 |
6.5 |
Efficient Certified Defenses Against Patch Attacks on Image Classifiers |
6, 7, 7, 6 |
|
495 |
6.5 |
On the Universality of the Double Descent Peak in Ridgeless Regression |
7, 7, 6, 6 |
|
496 |
6.5 |
Set Prediction without Imposing Structure as Conditional Density Estimation |
6, 6, 7, 7 |
|
497 |
6.5 |
Pruning Neural Networks at Initialization: Why Are We Missing the Mark? |
6, 7, 4, 9 |
|
498 |
6.5 |
Removing Undesirable Feature Contributions Using Out-of-Distribution Data |
7, 6, 7, 6 |
|
499 |
6.5 |
Scaling the Convex Barrier with Active Sets |
5, 8, 7, 7, 6, 6 |
|
500 |
6.5 |
Deep Repulsive Clustering of Ordered Data Based on Order-Identity Decomposition |
7, 6, 6, 7 |
|
501 |
6.5 |
Is Label Smoothing Truly Incompatible with Knowledge Distillation: An Empirical Study |
6, 6, 6, 8 |
|
502 |
6.5 |
Learning Task-General Representations with Generative Neuro-Symbolic Modeling |
6, 6, 7, 7 |
|
503 |
6.5 |
Efficient Continual Learning with Modular Networks and Task-Driven Priors |
7, 6, 6, 7 |
|
504 |
6.5 |
Towards Understanding and Improving Dropout in Game Theory |
7, 7, 7, 5 |
|
505 |
6.5 |
Learning with AMIGo: Adversarially Motivated Intrinsic Goals |
7, 6, 6, 7 |
|
506 |
6.5 |
BiPointNet: Binary Neural Network for Point Clouds |
4, 8, 7, 7 |
|
507 |
6.5 |
Rapid Task-Solving in Novel Environments |
8, 7, 7, 4 |
|
508 |
6.5 |
VEM-GCN: Topology Optimization with Variational EM for Graph Convolutional Networks |
6, 6, 6, 8 |
|
509 |
6.5 |
Fully Unsupervised Diversity Denoising with Convolutional Variational Autoencoders |
6, 7, 7, 6 |
|
510 |
6.5 |
A Discriminative Gaussian Mixture Model with Sparsity |
6, 7, 5, 8 |
|
511 |
6.4 |
Temporally-Extended ε-Greedy Exploration |
8, 5, 8, 5, 6 |
|
512 |
6.4 |
LambdaNetworks: Modeling long-range Interactions without Attention |
8, 6, 6, 6, 6 |
|
513 |
6.4 |
Provable Benefits of Representation Learning in Linear Bandits |
7, 5, 7, 6, 7 |
|
514 |
6.4 |
Model-based micro-data reinforcement learning: what are the crucial model properties and which model to choose? |
6, 5, 7, 7, 7 |
|
515 |
6.4 |
Risk-Averse Offline Reinforcement Learning |
7, 6, 5, 8, 6 |
|
516 |
6.33 |
Multi-resolution modeling of a discrete stochastic process identifies causes of cancer |
7, 6, 6 |
|
517 |
6.33 |
Bypassing the Ambient Dimension: Private SGD with Gradient Subspace Identification |
6, 6, 7 |
|
518 |
6.33 |
Net-DNF: Effective Deep Modeling of Tabular Data |
6, 7, 6 |
|
519 |
6.33 |
MeshMVS: Multi-view Stereo Guided Mesh Reconstruction |
4, 6, 9 |
|
520 |
6.33 |
Degree-Quant: Quantization-Aware Training for Graph Neural Networks |
6, 7, 6 |
|
521 |
6.33 |
Gradient Origin Networks |
5, 7, 7 |
|
522 |
6.33 |
Trusted Multi-View Classification |
7, 4, 8 |
|
523 |
6.33 |
HyperGrid Transformers: Towards A Single Model for Multiple Tasks |
7, 6, 6 |
|
524 |
6.33 |
Wasserstein-2 Generative Networks |
6, 8, 5 |
|
525 |
6.33 |
Generating Adversarial Computer Programs using Optimized Obfuscations |
6, 7, 6 |
|
526 |
6.33 |
Understanding the effects of data parallelism and sparsity on neural network training |
7, 5, 7 |
|
527 |
6.33 |
A Learning Theoretic Perspective on Local Explainability |
5, 7, 7 |
|
528 |
6.33 |
On Learning Universal Representations Across Languages |
7, 5, 7 |
|
529 |
6.33 |
Multi-Class Uncertainty Calibration via Mutual Information Maximization-based Binning |
7, 7, 5 |
|
530 |
6.33 |
On the Effectiveness of Weight-Encoded Neural Implicit 3D Shapes |
7, 4, 8 |
|
531 |
6.33 |
Conformation-Guided Molecular Representation with Hamiltonian Neural Networks |
5, 7, 7 |
|
532 |
6.33 |
The Importance of Pessimism in Fixed-Dataset Policy Optimization |
7, 6, 6 |
|
533 |
6.33 |
Optimal Conversion of Conventional Artificial Neural Networks to Spiking Neural Networks |
5, 7, 7 |
|
534 |
6.33 |
No MCMC for me: Amortized sampling for fast and stable training of energy-based models |
7, 8, 4 |
|
535 |
6.33 |
Efficient Wasserstein Natural Gradients for Reinforcement Learning |
5, 8, 6 |
|
536 |
6.33 |
Direction Matters: On the Implicit Regularization Effect of Stochastic Gradient Descent with Moderate Learning Rate |
6, 6, 7 |
|
537 |
6.33 |
WaNet - Imperceptible Warping-based Backdoor Attack |
6, 6, 7 |
|
538 |
6.33 |
BREEDS: Benchmarks for Subpopulation Shift |
6, 7, 6 |
|
539 |
6.33 |
Federated Learning via Posterior Averaging: A New Perspective and Practical Algorithms |
6, 6, 7 |
|
540 |
6.33 |
XT2: Training an X-to-Text Typing Interface with Online Learning from Implicit Feedback |
4, 8, 7 |
|
541 |
6.33 |
The Recurrent Neural Tangent Kernel |
6, 7, 6 |
|
542 |
6.33 |
Information Theoretic Regularization for Learning Global Features by Sequential VAE |
6, 7, 6 |
|
543 |
6.33 |
FedMix: Approximation of Mixup under Mean Augmented Federated Learning |
6, 6, 7 |
|
544 |
6.33 |
Nonvacuous Loss Bounds with Fast Rates for Neural Networks via Conditional Information Measures |
6, 6, 7 |
|
545 |
6.33 |
Transferable Unsupervised Robust Representation Learning |
7, 5, 7 |
|
546 |
6.33 |
Sparse encoding for more-interpretable feature-selecting representations in probabilistic matrix factorization |
7, 6, 6 |
|
547 |
6.33 |
Learning from Demonstration with Weakly Supervised Disentanglement |
7, 7, 5 |
|
548 |
6.33 |
Selectivity considered harmful: evaluating the causal impact of class selectivity in DNNs |
7, 6, 6 |
|
549 |
6.33 |
Learning Reasoning Paths over Semantic Graphs for Video-grounded Dialogues |
6, 6, 7 |
|
550 |
6.33 |
Explainable Deep One-Class Classification |
4, 8, 7 |
|
551 |
6.33 |
Characterizing signal propagation to close the performance gap in unnormalized ResNets |
5, 7, 7 |
|
552 |
6.33 |
Shapley Explanation Networks |
6, 7, 6 |
|
553 |
6.33 |
Learning to Sample with Local and Global Contexts in Experience Replay Buffer |
7, 6, 6 |
|
554 |
6.33 |
Neural Network Extrapolations with G-invariances from a Single Environment |
5, 7, 7 |
|
555 |
6.33 |
Implicit Gradient Regularization |
6, 6, 7 |
|
556 |
6.33 |
Simple Augmentation Goes a Long Way: ADRL for DNN Quantization |
6, 6, 7 |
|
557 |
6.33 |
Decoy-enhanced Saliency Maps |
6, 6, 7 |
|
558 |
6.33 |
Improving relational regularized autoencoders with spherical sliced fused Gromov Wasserstein |
6, 6, 7 |
|
559 |
6.33 |
Learning with Instance-Dependent Label Noise: A Sample Sieve Approach |
6, 5, 8 |
|
560 |
6.33 |
Robust Pruning at Initialization |
6, 6, 7 |
|
561 |
6.33 |
PSTNet: Point Spatio-Temporal Convolution on Point Cloud Sequences |
7, 5, 7 |
|
562 |
6.33 |
Differentiable Trust Region Layers for Deep Reinforcement Learning |
6, 6, 7 |
|
563 |
6.33 |
PDE-Driven Spatiotemporal Disentanglement |
7, 5, 7 |
|
564 |
6.33 |
Genetic Soft Updates for Policy Evolution in Deep Reinforcement Learning |
7, 6, 6 |
|
565 |
6.33 |
Economic Hyperparameter Optimization with Blended Search Strategy |
6, 6, 7 |
|
566 |
6.33 |
Provable More Data Hurt in High Dimensional Least Squares Estimator |
6, 6, 7 |
|
567 |
6.33 |
Boosting Certified Robustness of Deep Networks via a Compositional Architecture |
6, 7, 6 |
|
568 |
6.25 |
Watch-And-Help: A Challenge for Social Perception and Human-AI Collaboration |
6, 6, 7, 6 |
|
569 |
6.25 |
Adaptive Federated Optimization |
7, 6, 6, 6 |
|
570 |
6.25 |
The Unreasonable Effectiveness of Patches in Deep Convolutional Kernels Methods |
7, 6, 6, 6 |
|
571 |
6.25 |
Accelerating Convergence of Replica Exchange Stochastic Gradient MCMC via Variance Reduction |
5, 7, 7, 6 |
|
572 |
6.25 |
Cross-model Back-translated Distillation for Unsupervised Machine Translation |
6, 7, 7, 5 |
|
573 |
6.25 |
Provable Rich Observation Reinforcement Learning with Combinatorial Latent States |
7, 6, 5, 7 |
|
574 |
6.25 |
The act of remembering: A study in partially observable reinforcement learning |
5, 6, 7, 7 |
|
575 |
6.25 |
Density Constrained Reinforcement Learning |
6, 5, 7, 7 |
|
576 |
6.25 |
DARTS-: Robustly Stepping out of Performance Collapse Without Indicators |
6, 6, 8, 5 |
|
577 |
6.25 |
GAN2GAN: Generative Noise Learning for Blind Denoising with Single Noisy Images |
7, 7, 4, 7 |
|
578 |
6.25 |
On the Dynamics of Training Attention Models |
4, 7, 6, 8 |
|
579 |
6.25 |
Secure Federated Learning of User Verification Models |
8, 2, 8, 7 |
|
580 |
6.25 |
Modelling Hierarchical Structure between Dialogue Policy and Natural Language Generator with Option Framework for Task-oriented Dialogue System |
7, 6, 6, 6 |
|
581 |
6.25 |
Generalized Multimodal ELBO |
6, 6, 6, 7 |
|
582 |
6.25 |
Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching |
5, 7, 6, 7 |
|
583 |
6.25 |
Vulnerability-Aware Poisoning Mechanism for Online RL with Unknown Dynamics |
6, 6, 7, 6 |
|
584 |
6.25 |
Efficient Sampling for Generative Adversarial Networks with Coupling Markov Chains |
8, 5, 5, 7 |
|
585 |
6.25 |
AdaSpeech: Adaptive Text to Speech for Custom Voice |
4, 8, 6, 7 |
|
586 |
6.25 |
Multiscale Score Matching for Out-of-Distribution Detection |
5, 9, 5, 6 |
|
587 |
6.25 |
Beyond Categorical Label Representations for Image Classification |
7, 7, 7, 4 |
|
588 |
6.25 |
Integrating Categorical Semantics into Unsupervised Domain Translation |
7, 7, 4, 7 |
|
589 |
6.25 |
Revisiting Point Cloud Classification with a Simple and Effective Baseline |
4, 7, 7, 7 |
|
590 |
6.25 |
Contrastive Syn-to-Real Generalization |
6, 6, 6, 7 |
|
591 |
6.25 |
Network Pruning That Matters: A Case Study on Retraining Variants |
5, 8, 6, 6 |
|
592 |
6.25 |
Teaching with Commentaries |
6, 7, 7, 5 |
|
593 |
6.25 |
Early Stopping in Deep Networks: Double Descent and How to Eliminate it |
8, 6, 4, 7 |
|
594 |
6.25 |
Learning the Pareto Front with Hypernetworks |
6, 6, 7, 6 |
|
595 |
6.25 |
BSQ: Exploring Bit-Level Sparsity for Mixed-Precision Neural Network Quantization |
7, 6, 6, 6 |
|
596 |
6.25 |
Adversarially-Trained Deep Nets Transfer Better |
6, 6, 6, 7 |
|
597 |
6.25 |
CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and Transfer Learning |
7, 8, 4, 6 |
|
598 |
6.25 |
Partial Rejection Control for Robust Variational Inference in Sequential Latent Variable Models |
7, 6, 7, 5 |
|
599 |
6.25 |
Self-supervised Learning from a Multi-view Perspective |
6, 7, 6, 6 |
|
600 |
6.25 |
Cross-Attentional Audio-Visual Fusion for Weakly-Supervised Action Localization |
6, 6, 6, 7 |
|
601 |
6.25 |
AdaFuse: Adaptive Temporal Fusion Network for Efficient Action Recognition |
7, 7, 5, 6 |
|
602 |
6.25 |
On the Curse Of Memory in Recurrent Neural Networks: Approximation and Optimization Analysis |
6, 3, 8, 8 |
|
603 |
6.25 |
Understanding Mental Representations Of Objects Through Verbs Applied To Them |
7, 7, 6, 5 |
|
604 |
6.25 |
GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing |
7, 6, 5, 7 |
|
605 |
6.25 |
How Multipurpose Are Language Models? |
6, 8, 5, 6 |
|
606 |
6.25 |
Multi-Level Local SGD: Distributed SGD for Heterogeneous Hierarchical Networks |
6, 6, 6, 7 |
|
607 |
6.25 |
Taking Notes on the Fly Helps Language Pre-Training |
6, 6, 6, 7 |
|
608 |
6.25 |
Towards Machine Ethics with Language Models |
6, 6, 7, 6 |
|
609 |
6.25 |
Efficient Inference of Nonparametric Interaction in Spiking-neuron Networks |
6, 6, 7, 6 |
|
610 |
6.25 |
Learning and Evaluating Representations for Deep One-Class Classification |
5, 7, 7, 6 |
|
611 |
6.25 |
Efficient Empowerment Estimation for Unsupervised Stabilization |
7, 6, 7, 5 |
|
612 |
6.25 |
Theoretical bounds on estimation error for meta-learning |
5, 6, 7, 7 |
|
613 |
6.25 |
Fooling a Complete Neural Network Verifier |
6, 7, 6, 6 |
|
614 |
6.25 |
Neural representation and generation for RNA secondary structures |
6, 7, 6, 6 |
|
615 |
6.25 |
Deep Jump Q-Evaluation for Offline Policy Evaluation in Continuous Action Space |
5, 6, 6, 8 |
|
616 |
6.25 |
Improving Zero-Shot Voice Style Transfer via Disentangled Representation Learning |
7, 6, 6, 6 |
|
617 |
6.25 |
Generative Time-series Modeling with Fourier Flows |
7, 6, 7, 5 |
|
618 |
6.25 |
Counterfactual Generative Networks |
8, 7, 5, 5 |
|
619 |
6.25 |
Teaching Temporal Logics to Neural Networks |
5, 7, 7, 6 |
|
620 |
6.25 |
Better Fine-Tuning by Reducing Representational Collapse |
6, 6, 7, 6 |
|
621 |
6.25 |
Colorization Transformer |
5, 7, 6, 7 |
|
622 |
6.25 |
DeLighT: Deep and Light-weight Transformer |
6, 7, 6, 6 |
|
623 |
6.25 |
Acting in Delayed Environments with Non-Stationary Markov Policies |
5, 6, 6, 8 |
|
624 |
6.25 |
Disambiguating Symbolic Expressions in Informal Documents |
8, 6, 4, 7 |
|
625 |
6.25 |
Ringing ReLUs: Harmonic Distortion Analysis of Nonlinear Feedforward Networks |
8, 4, 5, 8 |
|
626 |
6.25 |
Are Neural Nets Modular? Inspecting Functional Modularity Through Differentiable Weight Masks |
5, 6, 6, 8 |
|
627 |
6.25 |
SSD: A Unified Framework for Self-Supervised Outlier Detection |
6, 6, 6, 7 |
|
628 |
6.25 |
Class Normalization for Zero-Shot Learning |
3, 7, 8, 7 |
|
629 |
6.25 |
Compositional Video Synthesis with Action Graphs |
7, 5, 6, 7 |
|
630 |
6.25 |
Learning with Plasticity Rules: Generalization and Robustness |
4, 7, 7, 7 |
|
631 |
6.25 |
ResNet After All: Neural ODEs and Their Numerical Solution |
5, 7, 7, 6 |
|
632 |
6.25 |
Adversarial Masking: Towards Understanding Robustness Trade-off for Generalization |
7, 7, 6, 5 |
|
633 |
6.25 |
On Proximal Policy Optimization’s Heavy-Tailed Gradients |
5, 5, 7, 8 |
|
634 |
6.25 |
Neural Potts Model |
6, 6, 7, 6 |
|
635 |
6.25 |
Bag of Tricks for Adversarial Training |
6, 7, 7, 5 |
|
636 |
6.25 |
Unity of Opposites: SelfNorm and CrossNorm for Model Robustness |
6, 7, 7, 5 |
|
637 |
6.25 |
On the Impossibility of Global Convergence in Multi-Loss Optimization |
4, 6, 7, 8 |
|
638 |
6.25 |
Understanding the failure modes of out-of-distribution generalization |
5, 6, 8, 6 |
|
639 |
6.25 |
Effective and Efficient Vote Attack on Capsule Networks |
6, 8, 5, 6 |
|
640 |
6.25 |
A PAC-Bayesian Approach to Generalization Bounds for Graph Neural Networks |
5, 7, 7, 6 |
|
641 |
6.25 |
CTRLsum: Towards Generic Controllable Text Summarization |
7, 5, 7, 6 |
|
642 |
6.25 |
Adaptive Extra-Gradient Methods for Min-Max Optimization and Games |
5, 6, 7, 7 |
|
643 |
6.25 |
HALMA: Humanlike Abstraction Learning Meets Affordance in Rapid Problem Solving |
7, 6, 5, 7 |
|
644 |
6.25 |
HalentNet: Multimodal Trajectory Forecasting with Hallucinative Intents |
6, 6, 5, 8 |
|
645 |
6.25 |
Generative Language-Grounded Policy in Vision-and-Language Navigation with Bayes' Rule |
8, 8, 4, 5 |
|
646 |
6.25 |
Reducing the Computational Cost of Deep Generative Models with Binary Neural Networks |
7, 4, 6, 8 |
|
647 |
6.25 |
Personalized Federated Learning with First Order Model Optimization |
6, 6, 6, 7 |
|
648 |
6.25 |
AutoLRS: Automatic Learning-Rate Schedule by Bayesian Optimization on the Fly |
5, 6, 7, 7 |
|
649 |
6.25 |
Scalable Transfer Learning with Expert Models |
6, 7, 7, 5 |
|
650 |
6.25 |
XLVIN: eXecuted Latent Value Iteration Nets |
6, 6, 6, 7 |
|
651 |
6.25 |
Distance-Based Regularisation of Deep Networks for Fine-Tuning |
7, 5, 6, 7 |
|
652 |
6.25 |
Using latent space regression to analyze and leverage compositionality in GANs |
5, 8, 5, 7 |
|
653 |
6.25 |
Deep Partition Aggregation: Provable Defenses against General Poisoning Attacks |
4, 8, 6, 7 |
|
654 |
6.25 |
Noise against noise: stochastic label noise helps combat inherent label noise |
7, 7, 5, 6 |
|
655 |
6.25 |
Nonseparable Symplectic Neural Networks |
7, 6, 6, 6 |
|
656 |
6.25 |
Learning to Generate Questions by Recovering Answer-containing Sentences |
7, 6, 5, 7 |
|
657 |
6.25 |
Bayesian Context Aggregation for Neural Processes |
6, 6, 7, 6 |
|
658 |
6.25 |
SAFENet: A Secure, Accurate and Fast Neural Network Inference |
6, 7, 7, 5 |
|
659 |
6.25 |
Convex Regularization behind Neural Reconstruction |
4, 6, 9, 6 |
|
660 |
6.25 |
ANOCE: Analysis of Causal Effects with Multiple Mediators via Constrained Structural Learning |
5, 6, 8, 6 |
|
661 |
6.25 |
Prototypical Contrastive Learning of Unsupervised Representations |
7, 5, 6, 7 |
|
662 |
6.25 |
Shape Matters: Understanding the Implicit Bias of the Noise Covariance |
6, 6, 6, 7 |
|
663 |
6.25 |
AdaGCN: Adaboosting Graph Convolutional Networks into Deep Models |
7, 7, 6, 5 |
|
664 |
6.25 |
Universal approximation power of deep residual neural networks via nonlinear control theory |
7, 6, 6, 6 |
|
665 |
6.25 |
Deep Neural Network Fingerprinting by Conferrable Adversarial Examples |
6, 7, 6, 6 |
|
666 |
6.25 |
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding |
9, 7, 5, 4 |
|
667 |
6.25 |
Learning Better Structured Representations Using Low-rank Adaptive Label Smoothing |
6, 6, 6, 7 |
|
668 |
6.25 |
What Should Not Be Contrastive in Contrastive Learning |
4, 8, 6, 7 |
|
669 |
6.25 |
Align-RUDDER: Learning From Few Demonstrations by Reward Redistribution |
7, 7, 6, 5 |
|
670 |
6.25 |
MODALS: Modality-agnostic Automated Data Augmentation in the Latent Space |
7, 6, 6, 6 |
|
671 |
6.25 |
Monotonic Kronecker-Factored Lattice |
6, 6, 7, 6 |
|
672 |
6.25 |
Neural Spatio-Temporal Point Processes |
6, 5, 7, 7 |
|
673 |
6.25 |
A Unified Bayesian Framework for Discriminative and Generative Continual Learning |
8, 4, 6, 7 |
|
674 |
6.25 |
Unsupervised Meta-Learning through Latent-Space Interpolation in Generative Models |
7, 6, 6, 6 |
|
675 |
6.25 |
Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning |
5, 7, 7, 6 |
|
676 |
6.25 |
A Design Space Study for LISTA and Beyond |
8, 6, 7, 4 |
|
677 |
6.25 |
DeBERTa: Decoding-enhanced BERT with Disentangled Attention |
6, 6, 7, 6 |
|
678 |
6.25 |
Non-greedy Gradient-based Hyperparameter Optimization Over Long Horizons |
6, 5, 7, 7 |
|
679 |
6.25 |
Parameter Efficient Multimodal Transformers for Video Representation Learning |
6, 6, 8, 5 |
|
680 |
6.25 |
Contrastive Learning with Hard Negative Samples |
6, 5, 7, 7 |
|
681 |
6.25 |
Fair Mixup: Fairness via Interpolation |
5, 6, 7, 7 |
|
682 |
6.25 |
Deciphering and Optimizing Multi-Task Learning: a Random Matrix Approach |
7, 5, 7, 6 |
|
683 |
6.25 |
Revisiting Few-sample BERT Fine-tuning |
6, 6, 6, 7 |
|
684 |
6.25 |
MARS: Markov Molecular Sampling for Multi-objective Drug Discovery |
8, 6, 7, 4 |
|
685 |
6.25 |
Variational State-Space Models for Localisation and Dense 3D Mapping in 6 DoF |
7, 6, 6, 6 |
|
686 |
6.25 |
Latent Convergent Cross Mapping |
6, 6, 7, 6 |
|
687 |
6.25 |
HyperDynamics: Generating Expert Dynamics Models by Observation |
6, 6, 6, 7 |
|
688 |
6.25 |
Learning “What-if” Explanations for Sequential Decision-Making |
5, 6, 7, 7 |
|
689 |
6.25 |
CoCon: A Self-Supervised Approach for Controlled Text Generation |
4, 6, 7, 8 |
|
690 |
6.25 |
SketchEmbedNet: Learning Novel Concepts by Imitating Drawings |
9, 4, 6, 6 |
|
691 |
6.25 |
Embedding a random graph via GNN: mean-field inference theory and RL applications to NP-Hard multi-robot/machine scheduling |
7, 5, 6, 7 |
|
692 |
6.25 |
Influence Functions in Deep Learning Are Fragile |
7, 6, 6, 6 |
|
693 |
6.25 |
PABI: A Unified PAC-Bayesian Informativeness Measure for Incidental Supervision Signals |
5, 7, 8, 5 |
|
694 |
6.25 |
Learning perturbation sets for robust machine learning |
8, 6, 6, 5 |
|
695 |
6.25 |
Lipschitz Recurrent Neural Networks |
8, 5, 6, 6 |
|
696 |
6.25 |
Does injecting linguistic structure into language models lead to better alignment with brain recordings? |
5, 7, 7, 6 |
|
697 |
6.25 |
Tradeoffs in Data Augmentation: An Empirical Study |
6, 8, 6, 5 |
|
698 |
6.25 |
Robust and Generalizable Visual Representation Learning via Random Convolutions |
6, 7, 6, 6 |
|
699 |
6.25 |
Physics Informed Deep Kernel Learning |
8, 5, 5, 7 |
|
700 |
6.25 |
Learning Hyperbolic Representations of Topological Features |
6, 6, 6, 7 |
|
701 |
6.25 |
DC3: A learning method for optimization with hard constraints |
6, 4, 8, 7 |
|
702 |
6.25 |
ERMAS: Learning Policies Robust to Reality Gaps in Multi-Agent Simulations |
6, 6, 6, 7 |
|
703 |
6.25 |
Prioritized Level Replay |
7, 5, 7, 6 |
|
704 |
6.25 |
Transient Non-stationarity and Generalisation in Deep Reinforcement Learning |
5, 5, 7, 8 |
|
705 |
6.25 |
ForceNet: A Graph Neural Network for Large-Scale Quantum Chemistry Simulation |
7, 5, 6, 7 |
|
706 |
6.25 |
Variational Invariant Learning for Bayesian Domain Generalization |
6, 6, 5, 8 |
|
707 |
6.25 |
Stochastic Security: Adversarial Defense Using Long-Run Dynamics of Energy-Based Models |
4, 5, 9, 7 |
|
708 |
6.25 |
Exemplary natural images explain CNN activations better than synthetic feature visualizations |
7, 7, 5, 6 |
|
709 |
6.25 |
Anytime Sampling for Autoregressive Models via Ordered Autoencoding |
6, 6, 6, 7 |
|
710 |
6.25 |
MiCE: Mixture of Contrastive Experts for Unsupervised Image Clustering |
5, 6, 8, 6 |
|
711 |
6.25 |
Estimating informativeness of samples with Smooth Unique Information |
7, 6, 6, 6 |
|
712 |
6.25 |
Gradient Descent-Ascent Provably Converges to Strict Local Minmax Equilibria with a Finite Timescale Separation |
6, 7, 6, 6 |
|
713 |
6.25 |
NCP-VAE: Variational Autoencoders with Noise Contrastive Priors |
7, 5, 8, 5 |
|
714 |
6.25 |
Drop-Bottleneck: Learning Discrete Compressed Representation for Noise-Robust Exploration |
6, 6, 7, 6 |
|
715 |
6.2 |
Universal Weakly Supervised Segmentation by Pixel-to-Segment Contrastive Learning |
7, 5, 7, 6, 6 |
|
716 |
6.2 |
Auction Learning as a Two-Player Game |
7, 6, 6, 6, 6 |
|
717 |
6.2 |
IEPT: Instance-Level and Episode-Level Pretext Tasks for Few-Shot Learning |
5, 7, 6, 8, 5 |
|
718 |
6.2 |
Adaptive and Generative Zero-Shot Learning |
6, 7, 6, 7, 5 |
|
719 |
6.2 |
Faster Binary Embeddings for Preserving Euclidean Distances |
5, 7, 6, 7, 6 |
|
720 |
6.2 |
SCoRe: Pre-Training for Context Representation in Conversational Semantic Parsing |
4, 6, 7, 7, 7 |
|
721 |
6.2 |
Why resampling outperforms reweighting for correcting sampling bias |
7, 6, 6, 5, 7 |
|
722 |
6.2 |
Deep Networks from the Principle of Rate Reduction |
4, 6, 6, 9, 6 |
|
723 |
6.2 |
Evaluating the Disentanglement of Deep Generative Models through Manifold Topology |
5, 6, 7, 8, 5 |
|
724 |
6 |
Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning |
7, 7, 5, 5 |
|
725 |
6 |
A law of robustness for two-layers neural networks |
7, 7, 5, 5 |
|
726 |
6 |
Adding Recurrence to Pretrained Transformers |
7, 7, 4 |
|
727 |
6 |
Grounding Language to Entities for Generalization in Reinforcement Learning |
6, 5, 6, 7, 6 |
|
728 |
6 |
Mirostat: A Neural Text Decoding Algorithm that Directly Controls Perplexity |
6, 6, 6 |
|
729 |
6 |
EqCo: Equivalent Rules for Self-supervised Contrastive Learning |
5, 6, 5, 8 |
|
730 |
6 |
Learning a unified label space |
6, 7, 4, 7 |
|
731 |
6 |
Self-Supervised Learning of Compressed Video Representations |
6, 6, 6 |
|
732 |
6 |
Making Coherence Out of Nothing At All: Measuring Evolution of Gradient Alignment |
6, 8, 5, 5 |
|
733 |
6 |
Neural CDEs for Long Time Series via the Log-ODE Method |
5, 7, 6 |
|
734 |
6 |
Exploring the Uncertainty Properties of Neural Networks’ Implicit Priors in the Infinite-Width Limit |
5, 6, 7, 6 |
|
735 |
6 |
Learning Subgoal Representations with Slow Dynamics |
4, 7, 6, 7 |
|
736 |
6 |
Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective |
4, 6, 8, 6 |
|
737 |
6 |
Graph Representation Learning for Multi-Task Settings: a Meta-Learning Approach |
6, 5, 7 |
|
738 |
6 |
R-GAP: Recursive Gradient Attack on Privacy |
5, 6, 7 |
|
739 |
6 |
Neural Rankers are hitherto Outperformed by Gradient Boosted Decision Trees |
6, 2, 8, 8 |
|
740 |
6 |
CorrAttack: Black-box Adversarial Attack with Structured Search |
6, 6, 6, 6 |
|
741 |
6 |
Multi-Agent Collaboration via Reward Attribution Decomposition |
6, 7, 6, 5 |
|
742 |
6 |
DrNAS: Dirichlet Neural Architecture Search |
6, 7, 6, 5 |
|
743 |
6 |
Blind Pareto Fairness and Subgroup Robustness |
6, 6, 6 |
|
744 |
6 |
Max-sliced Bures Distance for Interpreting Discrepancies |
7, 6, 5 |
|
745 |
6 |
A Panda? No, It’s a Sloth: Slowdown Attacks on Adaptive Multi-Exit Neural Network Inference |
7, 6, 3, 8 |
|
746 |
6 |
Learning advanced mathematical computations from examples |
8, 7, 3, 6 |
|
747 |
6 |
Generating Furry Cars: Disentangling Object Shape and Appearance across Multiple Domains |
7, 7, 5, 5 |
|
748 |
6 |
Bayesian Online Meta-Learning |
6, 6, 5, 7 |
|
749 |
6 |
Single-Photon Image Classification |
8, 3, 6, 7 |
|
750 |
6 |
Domain Generalization with MixStyle |
7, 4, 7 |
|
751 |
6 |
Defective Convolutional Networks |
6, 6, 6 |
|
752 |
6 |
Sample weighting as an explanation for mode collapse in generative adversarial networks |
6, 6, 6, 6 |
|
753 |
6 |
Just How Toxic is Data Poisoning? A Benchmark for Backdoor and Data Poisoning Attacks |
4, 5, 7, 8 |
|
754 |
6 |
Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks |
6, 6, 6, 6 |
|
755 |
6 |
SOLAR: Sparse Orthogonal Learned and Random Embeddings |
3, 7, 7, 7 |
|
756 |
6 |
Protecting DNNs from Theft using an Ensemble of Diverse Models |
6, 5, 7, 6 |
|
757 |
6 |
Unified Principles For Multi-Source Transfer Learning Under Label Shifts |
4, 7, 6, 7 |
|
758 |
6 |
Non-Local Graph Neural Networks |
7, 7, 4, 6 |
|
759 |
6 |
Graph Learning via Spectral Densification |
5, 5, 8, 6 |
|
760 |
6 |
Learning to interpret trajectories |
6, 6, 6, 6 |
|
761 |
6 |
Simple Spectral Graph Convolution |
5, 6, 6, 7 |
|
762 |
6 |
Trajectory Prediction using Equivariant Continuous Convolution |
5, 7, 6, 6 |
|
763 |
6 |
Bowtie Networks: Generative Modeling for Joint Few-Shot Recognition and Novel-View Synthesis |
7, 5, 6, 6 |
|
764 |
6 |
Enforcing robust control guarantees within neural network policies |
6, 6, 6, 6 |
|
765 |
6 |
FLAG: Adversarial Data Augmentation for Graph Neural Networks |
6, 7, 5, 6 |
|
766 |
6 |
A Simple and General Graph Neural Network with Stochastic Message Passing |
8, 6, 7, 3 |
|
767 |
6 |
On Data-Augmentation and Consistency-Based Semi-Supervised Learning |
6, 6, 6 |
|
768 |
6 |
Neural Partial Differential Equations |
6, 6, 7, 5 |
|
769 |
6 |
Neural Delay Differential Equations |
7, 6, 5, 6 |
|
770 |
6 |
Initialization and Regularization of Factorized Neural Layers |
6, 6, 6, 6 |
|
771 |
6 |
Diverse Video Generation using a Gaussian Process Trigger |
6, 6, 6 |
|
772 |
6 |
Control-Aware Representations for Model-based Reinforcement Learning |
6, 6, 6 |
|
773 |
6 |
Auto Seg-Loss: Searching Metric Surrogates for Semantic Segmentation |
7, 5, 5, 7 |
|
774 |
6 |
Combining Physics and Machine Learning for Network Flow Estimation |
7, 6, 4, 7 |
|
775 |
6 |
Open-world Semi-supervised Learning |
6, 6, 6, 6 |
|
776 |
6 |
Rethinking Soft Labels for Knowledge Distillation: A Bias–Variance Tradeoff Perspective |
7, 4, 7, 6 |
|
777 |
6 |
How to Find Your Friendly Neighborhood: Graph Attention Design with Self-Supervision |
4, 8, 5, 7 |
|
778 |
6 |
What they do when in doubt: a study of inductive biases in seq2seq learners |
4, 7, 7, 6 |
|
779 |
6 |
Concept Learners for Generalizable Few-Shot Learning |
6, 5, 6, 7 |
|
780 |
6 |
Disentangling 3D Prototypical Networks for Few-Shot Concept Learning |
7, 5, 6, 6 |
|
781 |
6 |
On Fast Adversarial Robustness Adaptation in Model-Agnostic Meta-Learning |
6, 6, 6, 6 |
|
782 |
6 |
Improve Object Detection with Feature-based Knowledge Distillation: Towards Accurate and Efficient Detectors |
7, 6, 5 |
|
783 |
6 |
Segmenting Natural Language Sentences via Lexical Unit Analysis |
6, 5, 7 |
|
784 |
6 |
Equivariant Normalizing Flows for Point Processes and Sets |
5, 6, 5, 8 |
|
785 |
6 |
VA-RED$^2$: Video Adaptive Redundancy Reduction |
6, 6, 6 |
|
786 |
6 |
Structural Landmarking and Interaction Modelling: on Resolution Dilemmas in Graph Classification |
6, 6, 6, 6 |
|
787 |
6 |
PolyRetro: Few-shot Polymer Retrosynthesis via Domain Adaptation |
6, 6, 7, 5 |
|
788 |
6 |
Remembering for the Right Reasons: Explanations Reduce Catastrophic Forgetting |
6, 6, 6, 6 |
|
789 |
6 |
Auxiliary Learning by Implicit Differentiation |
6, 5, 6, 7 |
|
790 |
6 |
Exploiting Safe Spots in Neural Networks for Preemptive Robustness and Out-of-Distribution Detection |
6, 5, 6, 7 |
|
791 |
6 |
Usable Information and Evolution of Optimal Representations During Training |
7, 3, 7, 7 |
|
792 |
6 |
On the Effect of Consensus in Decentralized Deep Learning |
4, 7, 6, 7 |
|
793 |
6 |
Entropic gradient descent algorithms and wide flat minima |
6, 6, 7, 5 |
|
794 |
6 |
OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning |
6, 7, 5 |
|
795 |
6 |
Variational Dynamic Mixtures |
7, 7, 4 |
|
796 |
6 |
Estimation of Number of Communities in Assortative Sparse Networks |
5, 7, 6, 6 |
|
797 |
6 |
Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning |
6, 5, 7, 6 |
|
798 |
6 |
Automatic Data Augmentation for Generalization in Reinforcement Learning |
7, 4, 7, 6 |
|
799 |
6 |
Self-supervised Graph-level Representation Learning with Local and Global Structure |
5, 6, 8, 5 |
|
800 |
6 |
Deep Continuous Networks |
6, 7, 5 |
|
801 |
6 |
Adaptive Risk Minimization: A Meta-Learning Approach for Tackling Group Shift |
6, 7, 5 |
|
802 |
6 |
TAM: Temporal Adaptive Module for Video Recognition |
8, 4, 6 |
|
803 |
6 |
Large-width functional asymptotics for deep Gaussian neural networks |
7, 4, 7, 6 |
|
804 |
6 |
Detecting Misclassification Errors in Neural Networks with a Gaussian Process Model |
6, 6, 6, 6 |
|
805 |
6 |
What Do Deep Nets Learn? Class-wise Patterns Revealed in the Input Space |
7, 6, 4, 7 |
|
806 |
6 |
Meta-Learning Bayesian Neural Network Priors Based on PAC-Bayesian Theory |
6, 7, 7, 4 |
|
807 |
6 |
Neural Learning of One-of-Many Solutions for Combinatorial Problems in Structured Output Spaces |
8, 6, 5, 5 |
|
808 |
6 |
Hybrid-Regressive Neural Machine Translation |
6, 7, 5 |
|
809 |
6 |
Learning Curves for Analysis of Deep Networks |
4, 7, 7, 6 |
|
810 |
6 |
Contrastive estimation reveals topic posterior information to linear models |
6, 7, 6, 5 |
|
811 |
6 |
Multi-modal Self-Supervision from Generalized Data Transformations |
7, 4, 7, 6 |
|
812 |
6 |
Task-Agnostic and Adaptive-Size BERT Compression |
5, 6, 7, 6 |
|
813 |
6 |
The Unbalanced Gromov Wasserstein Distance: Conic Formulation and Relaxation |
6, 7, 5, 6 |
|
814 |
6 |
Byzantine-Robust Learning on Heterogeneous Datasets via Resampling |
5, 7, 6 |
|
815 |
6 |
On Dyadic Fairness: Exploring and Mitigating Bias in Graph Connections |
7, 7, 5, 5 |
|
816 |
6 |
Isometric Transformation Invariant and Equivariant Graph Convolutional Networks |
6, 7, 5 |
|
817 |
6 |
Model-Based Offline Planning |
8, 4, 5, 7 |
|
818 |
6 |
Skill Transfer via Partially Amortized Hierarchical Planning |
6, 7, 5, 6 |
|
819 |
6 |
CT-Net: Channel Tensorization Network for Video Classification |
5, 5, 7, 7 |
|
820 |
6 |
Learning Causal Semantic Representation for Out-of-Distribution Prediction |
6, 7, 5 |
|
821 |
6 |
Parametric UMAP: learning embeddings with deep neural networks for representation and semi-supervised learning |
4, 4, 7, 9 |
|
822 |
6 |
Autoencoder Image Interpolation by Shaping the Latent Space |
5, 6, 7, 6 |
|
823 |
6 |
Learning What To Do by Simulating the Past |
7, 5, 7, 5 |
|
824 |
6 |
CcGAN: Continuous Conditional Generative Adversarial Networks for Image Generation |
6, 7, 5, 6 |
|
825 |
6 |
The Surprising Power of Graph Neural Networks with Random Node Initialization |
7, 7, 5, 5 |
|
826 |
6 |
Revisiting Hierarchical Approach for Persistent Long-Term Video Prediction |
5, 6, 7, 6 |
|
827 |
6 |
Accurate Learning of Graph Representations with Graph Multiset Pooling |
7, 4, 6, 7 |
|
828 |
6 |
VTNet: Visual Transformer Network for Object Goal Navigation |
6, 6, 6, 6 |
|
829 |
6 |
Predicting Classification Accuracy when Adding New Unobserved Classes |
6, 6, 6 |
|
830 |
6 |
Streamlining EM into Auto-Encoder Networks |
7, 6, 6, 5 |
|
831 |
6 |
Selfish Sparse RNN Training |
7, 6, 7, 4 |
|
832 |
6 |
Deep Single Image Manipulation |
6, 5, 7 |
|
833 |
6 |
The Lipschitz Constant of Self-Attention |
5, 5, 7, 7 |
|
834 |
6 |
Intention Propagation for Multi-agent Reinforcement Learning |
5, 6, 7, 6 |
|
835 |
6 |
Optimism in Reinforcement Learning with Generalized Linear Function Approximation |
5, 6, 7, 6 |
|
836 |
6 |
Mixed-Features Vectors and Subspace Splitting |
6, 6, 6 |
|
837 |
6 |
Sharper Generalization Bounds for Learning with Gradient-dominated Objective Functions |
6, 7, 6, 5 |
|
838 |
6 |
Offline Meta Learning of Exploration |
6, 6, 5, 7 |
|
839 |
6 |
On the Decision Boundaries of Neural Networks. A Tropical Geometry Perspective |
7, 6, 5, 6 |
|
840 |
6 |
Statistical inference for individual fairness |
6, 6, 6 |
|
841 |
6 |
TopoTER: Unsupervised Learning of Topology Transformation Equivariant Representations |
6, 6, 7, 5 |
|
842 |
6 |
Causal Screening to Interpret Graph Neural Networks |
7, 5, 7, 5 |
|
843 |
6 |
Interpretable Models for Granger Causality Using Self-explaining Neural Networks |
6, 8, 4, 6 |
|
844 |
6 |
Self-Supervised Video Representation Learning with Constrained Spatiotemporal Jigsaw |
6, 6, 5, 7 |
|
845 |
6 |
The Advantage Regret-Matching Actor-Critic |
6, 6, 6 |
|
846 |
6 |
Density estimation on low-dimensional manifolds: an inflation-deflation approach |
6, 5, 6, 7 |
|
847 |
6 |
Characterizing Lookahead Dynamics of Smooth Games |
4, 4, 9, 7 |
|
848 |
6 |
Closing the Generalization Gap in One-Shot Object Detection |
5, 6, 6, 7 |
|
849 |
6 |
To Understand Representation of Layer-aware Sequence Encoders as Multi-order-graph |
6, 6, 6 |
|
850 |
6 |
Learning Neural Generative Dynamics for Molecular Conformation Generation |
6, 6, 6 |
|
851 |
6 |
A Text GAN for Language Generation with Non-Autoregressive Generator |
6, 6, 6 |
|
852 |
6 |
Neural networks behave as hash encoders: An empirical study |
5, 6, 7, 6 |
|
853 |
6 |
Learning Manifold Patch-Based Representations of Man-Made Shapes |
4, 6, 7, 7 |
|
854 |
6 |
Selecting Treatment Effects Models for Domain Adaptation Using Causal Knowledge |
8, 6, 6, 4 |
|
855 |
6 |
Global Attention Improves Graph Networks Generalization |
6, 6, 7, 5 |
|
856 |
6 |
Multi-Prize Lottery Ticket Hypothesis: Finding Generalizable and Efficient Binary Subnetworks in a Randomly Weighted Neural Network |
6, 7, 7, 4 |
|
857 |
6 |
Importance-based Multimodal Autoencoder |
6, 6, 5, 7 |
|
858 |
6 |
Neural Jump Ordinary Differential Equation |
7, 7, 4, 6 |
|
859 |
6 |
A Siamese Neural Network for Behavioral Biometrics Authentication |
9, 4, 5 |
|
860 |
6 |
Uncertainty Weighted Offline Reinforcement Learning |
4, 6, 7, 8, 5 |
|
861 |
6 |
Overparameterisation and worst-case generalisation: friend or foe? |
6, 5, 7 |
|
862 |
6 |
i-Mix: A Strategy for Regularizing Contrastive Representation Learning |
3, 7, 7, 7 |
|
863 |
6 |
A Rigorous Evaluation of Real-World Distribution Shifts |
7, 4, 5, 8 |
|
864 |
6 |
On the Predictability of Pruning Across Scales |
6, 6, 6, 6 |
|
865 |
6 |
Semi-Supervised Learning of Multi-Object 3D Scene Representations |
6, 6, 6 |
|
866 |
6 |
Relating by Contrasting: A Data-efficient Framework for Multimodal Generative Models |
6, 7, 5, 6 |
|
867 |
6 |
Probing BERT in Hyperbolic Spaces |
6, 7, 5, 6 |
|
868 |
6 |
Scaling Symbolic Methods using Gradients for Neural Model Explanation |
7, 5, 7, 5 |
|
869 |
6 |
Deep Q Learning from Dynamic Demonstration with Behavioral Cloning |
5, 6, 6, 7 |
|
870 |
6 |
Exploring single-path Architecture Search ranking correlations |
5, 5, 9, 5 |
|
871 |
6 |
Luring of transferable adversarial perturbations in the black-box paradigm |
5, 5, 6, 8 |
|
872 |
6 |
Disentangling style and content for low resource video domain adaptation: a case study on keystroke inference attacks |
7, 5, 5, 7 |
|
873 |
6 |
Fast Differentially Private-SGD via JL Projections |
7, 4, 7 |
|
874 |
6 |
Targeted Attack against Deep Neural Networks via Flipping Limited Weight Bits |
6, 7, 6, 5 |
|
875 |
6 |
FedBN: Federated Learning on Non-IID Features via Local Batch Normalization |
5, 8, 7, 4 |
|
876 |
6 |
Abstracting Influence Paths for Explaining (Contextualization of) BERT Models |
6, 6, 6, 6 |
|
877 |
6 |
AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights |
6, 6, 5, 7 |
|
878 |
6 |
Reintroducing Straight-Through Estimators as Principled Methods for Stochastic Binary Networks |
5, 5, 7, 7 |
|
879 |
6 |
Learning Accurate Entropy Model with Global Reference for Image Compression |
5, 7, 6, 6 |
|
880 |
6 |
Shape-Texture Debiased Neural Network Training |
7, 7, 4, 6 |
|
881 |
6 |
Data-driven Learning of Geometric Scattering Networks |
6, 6, 8, 4 |
|
882 |
6 |
Representation Learning via Invariant Causal Mechanisms |
5, 7, 6, 6 |
|
883 |
6 |
PAC Confidence Predictions for Deep Neural Network Classifiers |
5, 7, 6 |
|
884 |
6 |
IOT: Instance-wise Layer Reordering for Transformer Structures |
5, 7, 7, 5 |
|
885 |
6 |
BRAC+: Going Deeper with Behavior Regularized Offline Reinforcement Learning |
7, 7, 5, 5 |
|
886 |
6 |
Cubic Spline Smoothing Compensation for Irregularly Sampled Sequences |
7, 5, 5, 7 |
|
887 |
6 |
Optimistic Exploration with Backward Bootstrapped Bonus for Deep Reinforcement Learning |
4, 6, 7, 6, 7 |
|
888 |
6 |
MELR: Meta-Learning via Modeling Episode-Level Relationships for Few-Shot Learning |
7, 6, 6, 5 |
|
889 |
6 |
Isometric Propagation Network for Generalized Zero-shot Learning |
7, 7, 6, 4 |
|
890 |
6 |
Continual Prototype Evolution: Learning Online from Non-Stationary Data Streams |
3, 7, 8 |
|
891 |
6 |
Learning a Latent Search Space for Routing Problems using Variational Autoencoders |
6, 6, 7, 5 |
|
892 |
6 |
Policy Learning Using Weak Supervision |
6, 6, 6, 6 |
|
893 |
6 |
Loss Function Discovery for Object Detection via Convergence-Simulation Driven Search |
5, 6, 6, 7 |
|
894 |
6 |
InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective |
4, 8, 6 |
|
895 |
6 |
Simplifying Models with Unlabeled Output Data |
6, 6, 6 |
|
896 |
6 |
Zero-Cost Proxies for Lightweight NAS |
6, 7, 5, 6 |
|
897 |
6 |
Fusion 360 Gallery: A Dataset and Environment for Programmatic CAD Reconstruction |
4, 8, 5, 7 |
|
898 |
6 |
How much progress have we made in neural network training? A New Evaluation Protocol for Benchmarking Optimizers |
5, 6, 7, 6 |
|
899 |
6 |
The Benefit of Distraction: Denoising Remote Vitals Measurements Using Inverse Attention |
9, 5, 4 |
|
900 |
6 |
Optimization Planning for 3D ConvNets |
7, 6, 6, 5 |
|
901 |
6 |
Into the Wild with AudioScope: Unsupervised Audio-Visual Separation of On-Screen Sounds |
5, 6, 7, 6 |
|
902 |
6 |
Anchor & Transform: Learning Sparse Embeddings for Large Vocabularies |
5, 6, 7 |
|
903 |
6 |
Accounting for Unobserved Confounding in Domain Generalization |
3, 9, 5, 7 |
|
904 |
6 |
Greedy-GQ with Variance Reduction: Finite-time Analysis and Improved Complexity |
5, 8, 6, 3, 8 |
|
905 |
6 |
Adversarially Guided Actor-Critic |
7, 6, 5 |
|
906 |
6 |
Taming GANs with Lookahead-Minmax |
7, 4, 6, 7 |
|
907 |
6 |
Policy Optimization in Zero-Sum Markov Games: Fictitious Self-Play Provably Attains Nash Equilibria |
5, 8, 5, 6 |
|
908 |
6 |
A Representational Model of Grid Cells' Path Integration Based on Matrix Lie Algebras |
6, 5, 8, 5 |
|
909 |
6 |
Blending MPC & Value Function Approximation for Efficient Reinforcement Learning |
7, 5, 6, 6 |
|
910 |
6 |
ARMCMC: Online Model Parameters Density Estimation in Bayesian Paradigm |
7, 5, 6 |
|
911 |
6 |
Deep Kernel Processes |
6, 5, 6, 7 |
|
912 |
6 |
Bridging the Gap: Providing Post-Hoc Symbolic Explanations for Sequential Decision-Making Problems with Inscrutable Representations |
5, 6, 7, 5, 7 |
|
913 |
6 |
Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modelling |
6, 6, 6, 6 |
|
914 |
6 |
Enabling Binary Neural Network Training on the Edge |
5, 6, 5, 8 |
|
915 |
6 |
MixKD: Towards Efficient Distillation of Large-scale Language Models |
6, 6, 7, 5 |
|
916 |
6 |
Learning Contextualized Knowledge Graph Structures for Commonsense Reasoning |
5, 6, 7 |
|
917 |
6 |
Global Node Attentions via Adaptive Spectral Filters |
7, 7, 4 |
|
918 |
6 |
Recall Loss for Imbalanced Image Classification and Semantic Segmentation |
7, 6, 6, 5 |
|
919 |
6 |
Multi-Level Generative Models for Partial Label Learning with Non-random Label Noise |
5, 6, 7 |
|
920 |
6 |
Warpspeed Computation of Optimal Transport, Graph Distances, and Embedding Alignment |
6, 6, 7, 5 |
|
921 |
6 |
Discovering Diverse Multi-Agent Strategic Behavior via Reward Randomization |
6, 5, 7, 6 |
|
922 |
6 |
Property Controllable Variational Autoencoder via Invertible Mutual Dependence |
6, 6, 6, 6 |
|
923 |
6 |
Regularization Cocktails |
6, 6, 6, 6 |
|
924 |
6 |
Planning from Pixels using Inverse Dynamics Models |
6, 6, 6, 6 |
|
925 |
6 |
Semi-supervised Keypoint Localization |
5, 6, 7, 6 |
|
926 |
6 |
Active Deep Probabilistic Subsampling |
6, 6, 6 |
|
927 |
6 |
Towards Finding Longer Proofs |
4, 6, 8 |
|
928 |
6 |
On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines |
4, 8, 6, 6 |
|
929 |
6 |
A framework for learned sparse sketches |
5, 6, 7 |
|
930 |
6 |
Emergent Properties of Foveated Perceptual Systems |
7, 7, 3, 7 |
|
931 |
6 |
Imitation with Neural Density Models |
5, 6, 8, 5 |
|
932 |
6 |
Individually Fair Rankings |
7, 4, 7, 6 |
|
933 |
6 |
Rethinking Embedding Coupling in Pre-trained Language Models |
7, 7, 6, 4 |
|
934 |
6 |
Addressing Some Limitations of Transformers with Feedback Memory |
7, 6, 6, 5 |
|
935 |
6 |
Saliency Grafting: Innocuous Attribution-Guided Mixup with Calibrated Label Mixing |
7, 6, 5 |
|
936 |
6 |
Learn Goal-Conditioned Policy with Intrinsic Motivation for Deep Reinforcement Learning |
5, 6, 7, 6 |
|
937 |
6 |
Capturing Label Characteristics in VAEs |
6, 7, 5, 6 |
|
938 |
6 |
Unconditional Synthesis of Complex Scenes Using a Semantic Bottleneck |
6, 4, 8, 6 |
|
939 |
6 |
Model Selection for Cross-Lingual Transfer using a Learned Scoring Function |
6, 7, 7, 4 |
|
940 |
6 |
Learning disentangled representations with the Wasserstein Autoencoder |
6, 5, 5, 8 |
|
941 |
6 |
Seq2Tens: An Efficient Representation of Sequences by Low-Rank Tensor Projections |
7, 8, 4, 5 |
|
942 |
6 |
Monte-Carlo Planning and Learning with Language Action Value Estimates |
7, 4, 6, 7 |
|
943 |
6 |
Distribution-Based Invariant Deep Networks for Learning Meta-Features |
7, 5, 6, 6 |
|
944 |
6 |
Reset-Free Lifelong Learning with Skill-Space Planning |
5, 7, 6, 6 |
|
945 |
6 |
Learning Robust Models using the Principle of Independent Causal Mechanisms |
6, 6, 6 |
|
946 |
6 |
Succinct Network Channel and Spatial Pruning via Discrete Variable QCQP |
5, 7, 5, 7 |
|
947 |
6 |
SACoD: Sensor Algorithm Co-Design Towards Efficient CNN-powered Intelligent PhlatCam |
6, 6, 6, 6 |
|
948 |
6 |
Distributionally Robust Learning for Unsupervised Domain Adaptation |
7, 5, 6 |
|
949 |
6 |
On Relating “Why?” and “Why Not?” Explanations |
8, 5, 6, 5 |
|
950 |
6 |
Implicit Acceleration of Gradient Flow in Overparameterized Linear Models |
6, 5, 7, 6 |
|
951 |
6 |
Implicit bias of gradient descent for mean squared error regression with wide neural networks |
5, 7, 7, 6, 5 |
|
952 |
6 |
Semi-Relaxed Quantization with DropBits: Training Low-Bit Neural Networks via Bitwise Regularization |
7, 6, 5 |
|
953 |
6 |
Balancing training time vs. performance with Bayesian Early Pruning |
7, 6, 6, 5 |
|
954 |
6 |
Unpacking Information Bottlenecks: Surrogate Objectives for Deep Learning |
8, 4, 6, 6 |
|
955 |
6 |
Deep Learning Is Composite Kernel Learning |
4, 8, 6, 6 |
|
956 |
6 |
CoDA: Contrast-enhanced and Diversity-promoting Data Augmentation for Natural Language Understanding |
6, 7, 5 |
|
957 |
6 |
Acoustic Neighbor Embeddings |
6, 6, 6, 6, 6 |
|
958 |
6 |
Understanding Bias in Anomaly Detection: A Semi-Supervised View with PAC Guarantees |
7, 4, 7, 6 |
|
959 |
6 |
SOAR: Second-Order Adversarial Regularization |
4, 7, 7 |
|
960 |
6 |
Linear Representation Meta-Reinforcement Learning for Instant Adaptation |
7, 6, 5 |
|
961 |
6 |
Evaluation of Similarity-based Explanations |
5, 6, 7, 6 |
|
962 |
6 |
Inductive Representation Learning in Temporal Networks via Causal Anonymous Walks |
5, 6, 6, 7 |
|
963 |
6 |
Learning Chess Blindfolded |
7, 5, 5, 7 |
|
964 |
6 |
An Efficient Protocol for Distributed Column Subset Selection in the Entrywise $\ell_p$ Norm |
5, 6, 7 |
|
965 |
6 |
Sparse Gaussian Process Variational Autoencoders |
6, 6, 6 |
|
966 |
6 |
AlgebraNets |
5, 7, 6 |
|
967 |
6 |
Provable Memorization via Deep Neural Networks using Sub-linear Parameters |
7, 6, 5 |
|
968 |
6 |
Physics-aware Spatiotemporal Modules with Auxiliary Tasks for Meta-Learning |
5, 6, 5, 6, 8 |
|
969 |
6 |
AT-GAN: An Adversarial Generative Model for Non-constrained Adversarial Examples |
6, 7, 5 |
|
970 |
6 |
Constraint-Driven Explanations of Black-Box ML Models |
6, 7, 6, 5 |
|
971 |
5.8 |
Model-based Asynchronous Hyperparameter and Neural Architecture Search |
6, 6, 6, 5, 6 |
|
972 |
5.8 |
Understanding Self-supervised Learning with Dual Deep Networks |
3, 7, 5, 8, 6 |
|
973 |
5.8 |
SaliencyMix: A Saliency Guided Data Augmentation Strategy for Better Regularization |
7, 7, 9, 3, 3 |
|
974 |
5.8 |
C-Learning: Learning to Achieve Goals via Recursive Classification |
4, 7, 5, 8, 5 |
|
975 |
5.8 |
Training with Quantization Noise for Extreme Model Compression |
5, 4, 6, 10, 4 |
|
976 |
5.8 |
Deep Data Flow Analysis |
5, 7, 4, 6, 7 |
|
977 |
5.8 |
Large Batch Simulation for Deep Reinforcement Learning |
4, 6, 5, 7, 7 |
|
978 |
5.8 |
Practical Massively Parallel Monte-Carlo Tree Search Applied to Molecular Design |
7, 5, 7, 7, 3 |
|
979 |
5.8 |
Differentiable Combinatorial Losses through Generalized Gradients of Linear Programs |
5, 8, 6, 7, 3 |
|
980 |
5.8 |
Shape-Tailored Deep Neural Networks Using PDEs for Segmentation |
6, 6, 5, 6, 6 |
|
981 |
5.8 |
Improved Gradient based Adversarial Attacks for Quantized Networks |
7, 6, 5, 5, 6 |
|
982 |
5.8 |
Estimating Lipschitz constants of monotone deep equilibrium models |
5, 5, 7, 6, 6 |
|
983 |
5.8 |
VECO: Variable Encoder-decoder Pre-training for Cross-lingual Understanding and Generation |
4, 9, 4, 7, 5 |
|
984 |
5.8 |
Single-Node Attack for Fooling Graph Neural Networks |
5, 6, 6, 6, 6 |
|
985 |
5.8 |
Learning Latent Topology for Graph Matching |
7, 8, 6, 4, 4 |
|
986 |
5.8 |
Breaking the Expressive Bottlenecks of Graph Neural Networks |
6, 6, 7, 5, 5 |
|
987 |
5.8 |
Predicting What You Already Know Helps: Provable Self-Supervised Learning |
4, 7, 6, 6, 6 |
|
988 |
5.8 |
Goal-Driven Imitation Learning from Observation by Inferring Goal Proximity |
5, 5, 7, 6, 6 |
|
989 |
5.8 |
Zero-shot Transfer Learning for Gray-box Hyper-parameter Optimization |
4, 6, 6, 7, 6 |
|
990 |
5.75 |
Deep Quotient Manifold Modeling |
8, 5, 6, 4 |
|
991 |
5.75 |
The Role of Momentum Parameters in the Optimal Convergence of Adaptive Polyak’s Heavy-ball Methods |
5, 6, 6, 6 |
|
992 |
5.75 |
Rethinking the Truly Unsupervised Image-to-Image Translation |
5, 6, 6, 6 |
|
993 |
5.75 |
not-MIWAE: Deep Generative Modelling with Missing not at Random Data |
6, 7, 6, 4 |
|
994 |
5.75 |
Extract Local Inference Chains of Deep Neural Nets |
6, 6, 6, 5 |
|
995 |
5.75 |
Explicit Connection Distillation |
5, 7, 6, 5 |
|
996 |
5.75 |
On the Capability of CNNs to Generalize to Unseen Category-Viewpoint Combinations |
6, 7, 4, 6 |
|
997 |
5.75 |
Formal Language Constrained Markov Decision Processes |
6, 5, 6, 6 |
|
998 |
5.75 |
Adaptive Multi-model Fusion Learning for Sparse-Reward Reinforcement Learning |
5, 6, 5, 7 |
|
999 |
5.75 |
Group Equivariant Generative Adversarial Networks |
6, 5, 6, 6 |
|
1000 |
5.75 |
A Distributional Perspective on Actor-Critic Framework |
6, 5, 7, 5 |
|
1001 |
5.75 |
Extracting Strong Policies for Robotics Tasks from zero-order trajectory optimizers |
6, 6, 5, 6 |
|
1002 |
5.75 |
Self-Supervised Multi-View Learning via Auto-Encoding 3D Transformations |
6, 4, 7, 6 |
|
1003 |
5.75 |
Membership Attacks on Conditional Generative Models Using Image Difficulty |
6, 6, 6, 5 |
|
1004 |
5.75 |
Sparse Linear Networks with a Fixed Butterfly Structure: Theory and Practice |
5, 7, 5, 6 |
|
1005 |
5.75 |
Bridging the Imitation Gap by Adaptive Insubordination |
5, 6, 6, 6 |
|
1006 |
5.75 |
WaveQ: Gradient-based Deep Quantization of Neural Networks through Sinusoidal Regularization |
7, 5, 7, 4 |
|
1007 |
5.75 |
Reinforcement Learning with Random Delays |
8, 6, 6, 3 |
|
1008 |
5.75 |
Non-Negative Bregman Divergence Minimization for Deep Direct Density Ratio Estimation |
6, 5, 6, 6 |
|
1009 |
5.75 |
Conditional Coverage Estimation for High-quality Prediction Intervals |
4, 7, 4, 8 |
|
1010 |
5.75 |
Contrastive Self-Supervised Learning of Global-Local Audio-Visual Representations |
5, 6, 5, 7 |
|
1011 |
5.75 |
NASOA: Towards Faster Task-oriented Online Fine-tuning |
3, 6, 7, 7 |
|
1012 |
5.75 |
RSO: A Gradient Free Sampling Based Approach For Training Deep Neural Networks |
6, 3, 6, 8 |
|
1013 |
5.75 |
A Unified Framework for Convolution-based Graph Neural Networks |
6, 5, 5, 7 |
|
1014 |
5.75 |
Gradient Flow in Sparse Neural Networks and How Lottery Tickets Win |
7, 6, 5, 5 |
|
1015 |
5.75 |
Descending through a Crowded Valley — Benchmarking Deep Learning Optimizers |
6, 4, 4, 9 |
|
1016 |
5.75 |
On The Adversarial Robustness of 3D Point Cloud Classification |
5, 7, 6, 5 |
|
1017 |
5.75 |
On the Explicit Role of Initialization on the Convergence and Generalization Properties of Overparametrized Linear Networks |
5, 3, 9, 6 |
|
1018 |
5.75 |
Synthesizer: Rethinking Self-Attention for Transformer Models |
7, 5, 4, 7 |
|
1019 |
5.75 |
Representation Learning for Sequence Data with Deep Autoencoding Predictive Components |
7, 5, 6, 5 |
|
1020 |
5.75 |
Sparse Uncertainty Representation in Deep Learning with Inducing Weights |
6, 6, 6, 5 |
|
1021 |
5.75 |
Adaptive Single-Pass Stochastic Gradient Descent in Input Sparsity Time |
6, 5, 6, 6 |
|
1022 |
5.75 |
On the role of planning in model-based deep reinforcement learning |
7, 6, 3, 7 |
|
1023 |
5.75 |
Hierarchical Reinforcement Learning by Discovering Intrinsic Options |
8, 7, 4, 4 |
|
1024 |
5.75 |
Energy-based Out-of-distribution Detection for Multi-label Classification |
7, 6, 4, 6 |
|
1025 |
5.75 |
On Linear Identifiability of Learned Representations |
6, 4, 7, 6 |
|
1026 |
5.75 |
Variable-Shot Adaptation for Incremental Meta-Learning |
6, 6, 6, 5 |
|
1027 |
5.75 |
Uncertainty in Neural Processes |
5, 5, 8, 5 |
|
1028 |
5.75 |
BayesAdapter: Being Bayesian, Inexpensively and Robustly, via Bayesian Fine-tuning |
6, 5, 6, 6 |
|
1029 |
5.75 |
Trans-Caps: Transformer Capsule Networks with Self-attention Routing |
6, 6, 7, 4 |
|
1030 |
5.75 |
DCT-SNN: Using DCT to Distribute Spatial Information over Time for Learning Low-Latency Spiking Neural Networks |
5, 6, 6, 6 |
|
1031 |
5.75 |
Towards Principled Representation Learning for Entity Alignment |
8, 5, 5, 5 |
|
1032 |
5.75 |
Non-Attentive Tacotron: Robust and controllable neural TTS synthesis including unsupervised duration modeling |
6, 5, 8, 4 |
|
1033 |
5.75 |
Formalizing Generalization and Robustness of Neural Networks to Weight Perturbations |
6, 7, 7, 3 |
|
1034 |
5.75 |
C-Learning: Horizon-Aware Cumulative Accessibility Estimation |
5, 6, 6, 6 |
|
1035 |
5.75 |
Uncertainty-aware Active Learning for Optimal Bayesian Classifier |
6, 7, 6, 4 |
|
1036 |
5.75 |
Pea-KD: Parameter-efficient and accurate Knowledge Distillation |
7, 5, 5, 6 |
|
1037 |
5.75 |
Improving Model Robustness with Latent Distribution Locally and Globally |
7, 5, 7, 4 |
|
1038 |
5.75 |
Emergent Road Rules In Multi-Agent Driving Environments |
6, 5, 5, 7 |
|
1039 |
5.75 |
Learning explanations that are hard to vary |
9, 2, 7, 5 |
|
1040 |
5.75 |
Learning Algebraic Representation for Abstract Spatial-Temporal Reasoning |
5, 5, 7, 6 |
|
1041 |
5.75 |
Multi-hop Attention Graph Neural Network |
5, 5, 6, 7 |
|
1042 |
5.75 |
FactoredRL: Leveraging Factored Graphs for Deep Reinforcement Learning |
6, 6, 6, 5 |
|
1043 |
5.75 |
Clairvoyance: A Pipeline Toolkit for Medical Time Series |
5, 6, 4, 8 |
|
1044 |
5.75 |
NAS-Bench-301 and the Case for Surrogate Benchmarks for Neural Architecture Search |
5, 8, 7, 3 |
|
1045 |
5.75 |
Investigating and Simplifying Masking-based Saliency Methods for Model Interpretability |
6, 4, 7, 6 |
|
1046 |
5.75 |
Non-robust Features through the Lens of Universal Perturbations |
7, 6, 5, 5 |
|
1047 |
5.75 |
Enabling counterfactual survival analysis with balanced representations |
5, 7, 4, 7 |
|
1048 |
5.75 |
Is Robustness Robust? On the interaction between augmentations and corruptions |
7, 6, 5, 5 |
|
1049 |
5.75 |
Deep Partial Updating |
6, 5, 6, 6 |
|
1050 |
5.75 |
Regression Prior Networks |
6, 5, 6, 6 |
|
1051 |
5.75 |
AR-ELBO: Preventing Posterior Collapse Induced by Oversmoothing in Gaussian VAE |
7, 6, 4, 6 |
|
1052 |
5.75 |
Context-Agnostic Learning Using Synthetic Data |
7, 5, 5, 6 |
|
1053 |
5.75 |
Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Sparse Neural Networks |
6, 5, 5, 7 |
|
1054 |
5.75 |
Rethinking Convolution: Towards an Optimal Efficiency |
5, 6, 6, 6 |
|
1055 |
5.75 |
A Primal Approach to Constrained Policy Optimization: Global Optimality and Finite-Time Analysis |
5, 6, 5, 7 |
|
1056 |
5.75 |
Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning |
6, 7, 5, 5 |
|
1057 |
5.75 |
RRL: A Scalable Classifier for Interpretable Rule-Based Representation Learning |
5, 7, 5, 6 |
|
1058 |
5.75 |
Conditional Negative Sampling for Contrastive Learning of Visual Representations |
6, 7, 5, 5 |
|
1059 |
5.75 |
Understanding Over-parameterization in Generative Adversarial Networks |
6, 7, 6, 4 |
|
1060 |
5.75 |
Learning Continuous-Time Dynamics by Stochastic Differential Networks |
7, 4, 7, 5 |
|
1061 |
5.75 |
Learning One-hidden-layer Neural Networks on Gaussian Mixture Models with Guaranteed Generalizability |
6, 6, 7, 4 |
|
1062 |
5.75 |
Meta-Reinforcement Learning With Informed Policy Regularization |
6, 5, 6, 6 |
|
1063 |
5.75 |
Repurposing Pretrained Models for Robust Out-of-domain Few-Shot Learning |
5, 5, 6, 7 |
|
1064 |
5.75 |
Data augmentation as stochastic optimization |
5, 6, 5, 7 |
|
1065 |
5.75 |
CO2: Consistent Contrast for Unsupervised Visual Representation Learning |
6, 4, 7, 6 |
|
1066 |
5.75 |
Transfer Learning of Graph Neural Networks with Ego-graph Information Maximization |
7, 6, 6, 4 |
|
1067 |
5.75 |
Multimodal Attention for Layout Synthesis in Diverse Domains |
7, 6, 5, 5 |
|
1068 |
5.75 |
Learned Threshold Pruning |
4, 9, 4, 6 |
|
1069 |
5.75 |
Reverse engineering learned optimizers reveals known and novel mechanisms |
5, 5, 5, 8 |
|
1070 |
5.75 |
On the Transfer of Disentangled Representations in Realistic Settings |
5, 2, 7, 9 |
|
1071 |
5.75 |
Learning not to learn: Nature versus nurture in silico |
7, 6, 5, 5 |
|
1072 |
5.75 |
Parameter-Efficient Transfer Learning with Diff Pruning |
4, 5, 6, 8 |
|
1073 |
5.75 |
Pre-Training by Completing Point Clouds |
5, 4, 7, 7 |
|
1074 |
5.75 |
Exploring Zero-Shot Emergent Communication in Embodied Multi-Agent Populations |
6, 6, 5, 6 |
|
1075 |
5.75 |
Learning Latent Landmarks for Generalizable Planning |
5, 5, 7, 6 |
|
1076 |
5.75 |
FedBE: Making Bayesian Model Ensemble Applicable to Federated Learning |
6, 7, 5, 5 |
|
1077 |
5.75 |
Design-Bench: Benchmarks for Data-Driven Offline Model-Based Optimization |
5, 7, 5, 6 |
|
1078 |
5.75 |
Spectrally Similar Graph Pooling |
7, 4, 7, 5 |
|
1079 |
5.75 |
Parametric Copula-GP model for analyzing multidimensional neuronal and behavioral relationships |
6, 5, 5, 7 |
|
1080 |
5.75 |
Decoupling Representation Learning from Reinforcement Learning |
6, 5, 5, 7 |
|
1081 |
5.75 |
DialoGraph: Incorporating Interpretable Strategy-Graph Networks into Negotiation Dialogues |
6, 6, 6, 5 |
|
1082 |
5.75 |
PolarNet: Learning to Optimize Polar Keypoints for Keypoint Based Object Detection |
6, 8, 3, 6 |
|
1083 |
5.75 |
Practical Marginalized Importance Sampling with the Successor Representation |
5, 6, 6, 6 |
|
1084 |
5.75 |
RMIX: Risk-Sensitive Multi-Agent Reinforcement Learning |
4, 7, 6, 6 |
|
1085 |
5.75 |
Machine Reading Comprehension with Enhanced Linguistic Verifiers |
7, 5, 5, 6 |
|
1086 |
5.75 |
Sample-Efficient Automated Deep Reinforcement Learning |
6, 5, 7, 5 |
|
1087 |
5.75 |
Accelerating Safe Reinforcement Learning with Constraint-mismatched Policies |
7, 5, 6, 5 |
|
1088 |
5.75 |
CONTEMPLATING REAL-WORLD OBJECT RECOGNITION |
6, 5, 6, 6 |
|
1089 |
5.75 |
Multi-Agent Trust Region Learning |
6, 5, 8, 4 |
|
1090 |
5.75 |
Quantile Regularization: Towards Implicit Calibration of Regression Models |
6, 6, 5, 6 |
|
1091 |
5.75 |
Isometric Autoencoders |
7, 6, 4, 6 |
|
1092 |
5.75 |
Relational Learning with Variational Bayes |
5, 6, 6, 6 |
|
1093 |
5.75 |
Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch |
6, 6, 5, 6 |
|
1094 |
5.75 |
Single Layers of Attention Suffice to Predict Protein Contacts |
5, 6, 5, 7 |
|
1095 |
5.75 |
Direct Evolutionary Optimization of Variational Autoencoders with Binary Latents |
5, 6, 6, 6 |
|
1096 |
5.75 |
Robust Learning for Congestion-Aware Routing |
5, 3, 7, 8 |
|
1097 |
5.75 |
Energy-based View of Retrosynthesis |
8, 5, 5, 5 |
|
1098 |
5.75 |
Effective Regularization Through Loss-Function Metalearning |
3, 8, 5, 7 |
|
1099 |
5.75 |
Fast Training of Contrastive Learning with Intermediate Contrastive Loss |
5, 6, 6, 6 |
|
1100 |
5.75 |
Understanding and Mitigating Accuracy Disparity in Regression |
6, 7, 6, 4 |
|
1101 |
5.75 |
Noise-Robust Contrastive Learning |
6, 6, 6, 5 |
|
1102 |
5.75 |
Predictive Coding Approximates Backprop along Arbitrary Computation Graphs |
7, 6, 6, 4 |
|
1103 |
5.75 |
Provably robust classification of adversarial examples with detection |
5, 7, 6, 5 |
|
1104 |
5.75 |
Fine-grained Synthesis of Unrestricted Adversarial Examples |
4, 6, 6, 7 |
|
1105 |
5.75 |
A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning |
6, 6, 6, 5 |
|
1106 |
5.75 |
Syntactic representations in the human brain: beyond effort-based metrics |
5, 4, 8, 6 |
|
1107 |
5.75 |
Activation Relaxation: A Local Dynamical Approximation to Backpropagation in the Brain |
4, 8, 7, 4 |
|
1108 |
5.75 |
Transformers are Deep Infinite-Dimensional Non-Mercer Binary Kernel Machines |
6, 4, 7, 6 |
|
1109 |
5.75 |
Privacy Preserving Recalibration under Domain Shift |
6, 5, 7, 5 |
|
1110 |
5.75 |
QTRAN++: Improved Value Transformation for Cooperative Multi-Agent Reinforcement Learning |
6, 7, 6, 4 |
|
1111 |
5.75 |
FairBatch: Batch Selection for Model Fairness |
6, 6, 7, 4 |
|
1112 |
5.75 |
A Reduction Approach to Constrained Reinforcement Learning |
5, 5, 7, 6 |
|
1113 |
5.75 |
Adaptive Procedural Task Generation for Hard-Exploration Problems |
6, 7, 4, 6 |
|
1114 |
5.75 |
Center-wise Local Image Mixture For Contrastive Representation Learning |
5, 6, 6, 6 |
|
1115 |
5.75 |
Transformer protein language models are unsupervised structure learners |
5, 6, 7, 5 |
|
1116 |
5.75 |
FILTRA: Rethinking Steerable CNN by Filter Transform |
6, 6, 5, 6 |
|
1117 |
5.75 |
Decentralized SGD with Asynchronous, Local and Quantized Updates |
7, 5, 6, 5 |
|
1118 |
5.75 |
Improving Abstractive Dialogue Summarization with Conversational Structure and Factual Knowledge |
6, 6, 6, 5 |
|
1119 |
5.75 |
Measuring Visual Generalization in Continuous Control from Pixels |
6, 5, 6, 6 |
|
1120 |
5.75 |
PIVEN: A Deep Neural Network for Prediction Intervals with Specific Value Prediction |
6, 7, 4, 6 |
|
1121 |
5.75 |
Representational aspects of depth and conditioning in normalizing flows |
3, 7, 7, 6 |
|
1122 |
5.75 |
QPLEX: Duplex Dueling Multi-Agent Q-Learning |
7, 6, 6, 4 |
|
1123 |
5.75 |
Efficient Estimators for Heavy-Tailed Machine Learning |
6, 6, 5, 6 |
|
1124 |
5.75 |
Wiring Up Vision: Minimizing Supervised Synaptic Updates Needed to Produce a Primate Ventral Stream |
6, 3, 8, 6 |
|
1125 |
5.75 |
K-Adapter: Infusing Knowledge into Pre-Trained Models with Adapters |
6, 4, 7, 6 |
|
1126 |
5.75 |
Learned ISTA with Error-based Thresholding for Adaptive Sparse Coding |
7, 6, 6, 4 |
|
1127 |
5.75 |
MetaNorm: Learning to Normalize Few-Shot Batches Across Domains |
6, 6, 7, 4 |
|
1128 |
5.75 |
Stochastic Canonical Correlation Analysis: A Riemannian Approach |
6, 4, 6, 7 |
|
1129 |
5.75 |
Novelty Detection via Robust Variational Autoencoding |
8, 5, 6, 4 |
|
1130 |
5.75 |
Data Instance Prior for Transfer Learning in GANs |
4, 6, 7, 6 |
|
1131 |
5.75 |
Rewriting by Generating: Learn Heuristics for Large-scale Vehicle Routing Problems |
7, 4, 6, 6 |
|
1132 |
5.75 |
Variational Structured Attention Networks for Dense Pixel-Wise Prediction |
5, 6, 6, 6 |
|
1133 |
5.75 |
Cluster & Tune: Enhance BERT Performance in Low Resource Text Classification |
3, 8, 6, 6 |
|
1134 |
5.75 |
Robustness against Relational Adversary |
4, 6, 7, 6 |
|
1135 |
5.75 |
Enhancing Certified Robustness of Smoothed Classifiers via Weighted Model Ensembling |
6, 6, 6, 5 |
|
1136 |
5.75 |
Revealing the Structure of Deep Neural Networks via Convex Duality |
6, 6, 3, 8 |
|
1137 |
5.75 |
Globally Injective ReLU networks |
5, 8, 5, 5 |
|
1138 |
5.75 |
Deep Graph Neural Networks with Shallow Subgraph Samplers |
6, 7, 5, 5 |
|
1139 |
5.75 |
Non-iterative Parallel Text Generation via Glancing Transformer |
6, 7, 5, 5 |
|
1140 |
5.75 |
Plan-Based Asymptotically Equivalent Reward Shaping |
6, 7, 7, 3 |
|
1141 |
5.75 |
SkipW: Resource adaptable RNN with strict upper computational limit |
6, 5, 6, 6 |
|
1142 |
5.75 |
Graph Edit Networks |
3, 6, 7, 7 |
|
1143 |
5.75 |
Quickly Finding a Benign Region via Heavy Ball Momentum in Non-Convex Optimization |
6, 4, 7, 6 |
|
1144 |
5.75 |
AUXILIARY TASK UPDATE DECOMPOSITION: THE GOOD, THE BAD AND THE NEUTRAL |
6, 5, 6, 6 |
|
1145 |
5.75 |
Neurosymbolic Deep Generative Models for Sequence Data with Relational Constraints |
6, 6, 7, 4 |
|
1146 |
5.75 |
Learning Self-Similarity in Space and Time as a Generalized Motion for Action Recognition |
6, 6, 6, 5 |
|
1147 |
5.75 |
The Heavy-Tail Phenomenon in SGD |
7, 5, 6, 5 |
|
1148 |
5.75 |
Learning to Deceive Knowledge Graph Augmented Models via Targeted Perturbation |
6, 7, 4, 6 |
|
1149 |
5.75 |
Bounded Myopic Adversaries for Deep Reinforcement Learning Agents |
6, 6, 6, 5 |
|
1150 |
5.75 |
Learning to Generate Noise for Multi-Attack Robustness |
6, 5, 6, 6 |
|
1151 |
5.75 |
Robust Learning of Fixed-Structure Bayesian Networks in Nearly-Linear Time |
7, 4, 5, 7 |
|
1152 |
5.75 |
Kanerva++: Extending the Kanerva Machine With Differentiable, Locally Block Allocated Latent Memory |
6, 4, 6, 7 |
|
1153 |
5.75 |
Learning Online Data Association |
7, 6, 6, 4 |
|
1154 |
5.75 |
A Bayesian-Symbolic Approach to Learning and Reasoning for Intuitive Physics |
5, 6, 6, 6 |
|
1155 |
5.75 |
Hippocampal representations emerge when training recurrent neural networks on a memory dependent maze navigation task |
7, 5, 7, 4 |
|
1156 |
5.75 |
Dataset Meta-Learning from Kernel-Ridge Regression |
6, 6, 7, 4 |
|
1157 |
5.75 |
Ask Question with Double Hints: Visual Question Generation with Answer-awareness and Region-reference |
6, 6, 5, 6 |
|
1158 |
5.75 |
Catastrophic Fisher Explosion: Early Phase Fisher Matrix Impacts Generalization |
5, 6, 6, 6 |
|
1159 |
5.75 |
ME-MOMENTUM: EXTRACTING HARD CONFIDENT EXAMPLES FROM NOISILY LABELED DATA |
8, 4, 7, 4 |
|
1160 |
5.75 |
Uncertainty Prediction for Deep Sequential Regression Using Meta Models |
5, 6, 5, 7 |
|
1161 |
5.75 |
Adaptive Gradient Methods Converge Faster with Over-Parameterization (and you can do a line-search) |
7, 6, 5, 5 |
|
1162 |
5.75 |
Model-Based Reinforcement Learning via Latent-Space Collocation |
4, 6, 6, 7 |
|
1163 |
5.75 |
Sim2SG: Sim-to-Real Scene Graph Generation for Transfer Learning |
5, 6, 7, 5 |
|
1164 |
5.75 |
CPR: Classifier-Projection Regularization for Continual Learning |
6, 4, 6, 7 |
|
1165 |
5.75 |
Variational Information Bottleneck for Effective Low-Resource Fine-Tuning |
7, 8, 4, 4 |
|
1166 |
5.75 |
Constellation Nets for Few-Shot Learning |
6, 6, 6, 5 |
|
1167 |
5.75 |
Whitening for Self-Supervised Representation Learning |
5, 5, 6, 7 |
|
1168 |
5.75 |
Unsupervised Video Decomposition using Spatio-temporal Iterative Inference |
6, 7, 6, 4 |
|
1169 |
5.75 |
Contrastive Learning with Stronger Augmentations |
4, 7, 6, 6 |
|
1170 |
5.75 |
Cross-Probe BERT for Efficient and Effective Cross-Modal Search |
6, 5, 6, 6 |
|
1171 |
5.75 |
Fourier Representations for Black-Box Optimization over Categorical Variables |
6, 6, 6, 5 |
|
1172 |
5.75 |
Self-supervised Adversarial Robustness for the Low-label, High-data Regime |
4, 6, 6, 7 |
|
1173 |
5.67 |
Watching the World Go By: Representation Learning from Unlabeled Videos |
5, 8, 4 |
|
1174 |
5.67 |
DECENTRALIZED ATTRIBUTION OF GENERATIVE MODELS |
6, 5, 6 |
|
1175 |
5.67 |
Coping with Label Shift via Distributionally Robust Optimisation |
7, 4, 6 |
|
1176 |
5.67 |
Generalized Energy Based Models |
6, 5, 6 |
|
1177 |
5.67 |
Meta Adversarial Training |
5, 6, 6 |
|
1178 |
5.67 |
A Technical and Normative Investigation of Social Bias Amplification |
5, 5, 7 |
|
1179 |
5.67 |
Deconstructing the Regularization of BatchNorm |
7, 6, 4 |
|
1180 |
5.67 |
Augmented Sliced Wasserstein Distances |
6, 7, 4 |
|
1181 |
5.67 |
Projected Latent Markov Chain Monte Carlo: Conditional Sampling of Normalizing Flows |
4, 7, 6 |
|
1182 |
5.67 |
Learning Representation in Colour Conversion |
7, 6, 4 |
|
1183 |
5.67 |
Not All Memories are Created Equal: Learning to Expire |
6, 6, 5 |
|
1184 |
5.67 |
Generative Adversarial User Privacy in Lossy Single-Server Information Retrieval |
5, 6, 6 |
|
1185 |
5.67 |
Group-Connected Multilayer Perceptron Networks |
7, 5, 5 |
|
1186 |
5.67 |
Meta-Learning with Implicit Processes |
6, 6, 5 |
|
1187 |
5.67 |
Continuous Transfer Learning |
6, 5, 6 |
|
1188 |
5.67 |
Fair Empirical Risk Minimization via Exponential Rényi Mutual Information |
5, 5, 7 |
|
1189 |
5.67 |
BUTLER: Building Understanding in TextWorld via Language for Embodied Reasoning |
7, 6, 4 |
|
1190 |
5.67 |
Generating Plannable Lifted Action Models for Visually Generated Logical Predicates |
6, 5, 6 |
|
1191 |
5.67 |
Stego Networks: Information Hiding on Deep Neural Networks |
7, 7, 3 |
|
1192 |
5.67 |
Learning Stochastic Behaviour from Aggregate Data |
5, 8, 4 |
|
1193 |
5.67 |
Reservoir Transformers |
5, 7, 5 |
|
1194 |
5.67 |
Universal Approximation Theorem for Equivariant Maps by Group CNNs |
5, 5, 7 |
|
1195 |
5.67 |
Uniform-Precision Neural Network Quantization via Neural Channel Expansion |
6, 6, 5 |
|
1196 |
5.67 |
Ego-Centric Spatial Memory Networks |
6, 7, 4 |
|
1197 |
5.67 |
A Point Cloud Generative Model Based on Nonequilibrium Thermodynamics |
6, 4, 7 |
|
1198 |
5.67 |
MQTransformer: Multi-Horizon Forecasts with Context Dependent and Feedback-Aware Attention |
6, 6, 5 |
|
1199 |
5.67 |
Learning to Search for Fast Maximum Common Subgraph Detection |
7, 5, 5 |
|
1200 |
5.67 |
Disentangled Representations from Non-Disentangled Models |
7, 6, 4 |
|
1201 |
5.67 |
SpreadsheetCoder: Formula Prediction from Semi-structured Context |
3, 7, 7 |
|
1202 |
5.67 |
Daylight: Assessing Generalization Skills of Deep Reinforcement Learning Agents |
5, 6, 6 |
|
1203 |
5.67 |
Meta-learning Transferable Representations with a Single Target Domain |
5, 6, 6 |
|
1204 |
5.67 |
Simple and Effective VAE Training with Calibrated Decoders |
6, 5, 6 |
|
1205 |
5.67 |
Similarity Search for Efficient Active Learning and Search of Rare Concepts |
5, 4, 8 |
|
1206 |
5.67 |
Lossless Compression of Structured Convolutional Models via Lifting |
6, 6, 5 |
|
1207 |
5.67 |
A Framework For Differentiable Discovery Of Graph Algorithms |
6, 4, 7 |
|
1208 |
5.67 |
Offline policy selection under Uncertainty |
6, 6, 5 |
|
1209 |
5.67 |
Goal-Auxiliary Actor-Critic for 6D Robotic Grasping with Point Clouds |
5, 5, 7 |
|
1210 |
5.67 |
Explicit Pareto Front Optimization for Constrained Reinforcement Learning |
4, 7, 6 |
|
1211 |
5.67 |
Fixing Asymptotic Uncertainty of Bayesian Neural Networks with Infinite ReLU Features |
7, 5, 5 |
|
1212 |
5.67 |
CoCo: Controllable Counterfactuals for Evaluating Dialogue State Trackers |
7, 4, 6 |
|
1213 |
5.67 |
Learning Deep Latent Variable Models via Amortized Langevin Dynamics |
6, 5, 6 |
|
1214 |
5.67 |
Cut-and-Paste Neural Rendering |
6, 6, 5 |
|
1215 |
5.67 |
Classify and Generate Reciprocally: Simultaneous Positive-Unlabelled Learning and Conditional Generation with Extra Data |
6, 5, 6 |
|
1216 |
5.67 |
Multiscale Invertible Generative Networks for High-Dimensional Bayesian Inference |
6, 6, 5 |
|
1217 |
5.67 |
Discriminative Representation Loss (DRL): A More Efficient Approach than Gradient Re-Projection in Continual Learning |
5, 6, 6 |
|
1218 |
5.67 |
ACT: Asymptotic Conditional Transport |
5, 6, 6 |
|
1219 |
5.67 |
Understanding and Leveraging Causal Relations in Deep Reinforcement Learning |
6, 6, 5 |
|
1220 |
5.67 |
Multi-Task Learning by a Top-Down Control Network |
7, 5, 5 |
|
1221 |
5.67 |
Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization |
5, 5, 7 |
|
1222 |
5.67 |
Discrete Graph Structure Learning for Forecasting Multiple Time Series |
4, 7, 6 |
|
1223 |
5.67 |
CURI: A Benchmark for Productive Concept Learning Under Uncertainty |
6, 6, 5 |
|
1224 |
5.67 |
Asynchronous Advantage Actor Critic: Non-asymptotic Analysis and Linear Speedup |
6, 6, 5 |
|
1225 |
5.67 |
A Near-Optimal Recipe for Debiasing Trained Machine Learning Models |
7, 6, 4 |
|
1226 |
5.6 |
Prediction and generalisation over directed actions by grid cells |
4, 7, 5, 7, 5 |
|
1227 |
5.6 |
GG-GAN: A Geometric Graph Generative Adversarial Network |
5, 5, 6, 5, 7 |
|
1228 |
5.6 |
Accelerating DNN Training through Selective Localized Learning |
6, 4, 5, 6, 7 |
|
1229 |
5.6 |
On the Bottleneck of Graph Neural Networks and its Practical Implications |
4, 8, 5, 5, 6 |
|
1230 |
5.6 |
NAS-Bench-ASR: Reproducible Neural Architecture Search for Speech Recognition |
5, 7, 6, 6, 4 |
|
1231 |
5.6 |
Rethinking Sampling in 3D Point Cloud Generative Adversarial Networks |
5, 6, 4, 7, 6 |
|
1232 |
5.6 |
Transfer among Agents: An Efficient Multiagent Transfer Learning Framework |
6, 6, 4, 6, 6 |
|
1233 |
5.6 |
Distributed Associative Memory Network with Association Reinforcing Loss |
5, 5, 6, 8, 4 |
|
1234 |
5.6 |
Cut out the annotator, keep the cutout: better segmentation with weak supervision |
6, 5, 7, 6, 4 |
|
1235 |
5.6 |
Learning to Reason in Large Theories without Imitation |
4, 6, 6, 6, 6 |
|
1236 |
5.6 |
Which Mutual-Information Representation Learning Objectives are Sufficient for Control? |
6, 7, 5, 5, 5 |
|
1237 |
5.6 |
Representational correlates of hierarchical phrase structure in deep language models |
6, 5, 5, 6, 6 |
|
1238 |
5.5 |
Learning Task Decomposition with Order-Memory Policy Network |
6, 6, 4, 6 |
|
1239 |
5.5 |
DEMI: Discriminative Estimator of Mutual Information |
7, 4, 6, 5 |
|
1240 |
5.5 |
Safety Verification of Model Based Reinforcement Learning Controllers |
5, 7, 7, 3 |
|
1241 |
5.5 |
Recursive Neighborhood Pooling for Graph Representation Learning |
4, 6, 6, 6 |
|
1242 |
5.5 |
Attacking Few-Shot Classifiers with Adversarial Support Sets |
6, 6, 4, 6 |
|
1243 |
5.5 |
Constrained Reinforcement Learning With Learned Constraints |
7, 5, 6, 4 |
|
1244 |
5.5 |
Adversarial Attacks on Binary Image Recognition Systems |
7, 5, 5, 5 |
|
1245 |
5.5 |
Action and Perception as Divergence Minimization |
6, 6, 3, 7 |
|
1246 |
5.5 |
Federated Continual Learning with Weighted Inter-client Transfer |
5, 6, 7, 4 |
|
1247 |
5.5 |
Synthetic Petri Dish: A Novel Surrogate Model for Rapid Architecture Search |
6, 6, 6, 4 |
|
1248 |
5.5 |
Parallel Training of Deep Networks with Local Updates |
4, 9, 6, 3 |
|
1249 |
5.5 |
Federated Semi-Supervised Learning with Inter-Client Consistency & Disjoint Learning |
6, 6, 4, 6 |
|
1250 |
5.5 |
Reusing Preprocessing Data as Auxiliary Supervision in Conversational Analysis |
6, 6, 5, 5 |
|
1251 |
5.5 |
Learning from others' mistakes: Avoiding dataset biases without modeling them |
6, 7, 7, 2 |
|
1252 |
5.5 |
Fast and Complete: Enabling Complete Neural Network Verification with Rapid and Massively Parallel Incomplete Verifiers |
7, 5, 5, 5 |
|
1253 |
5.5 |
Constructing Multiple High-Quality Deep Neural Networks: A TRUST-TECH Based Approach |
5, 5, 6, 6 |
|
1254 |
5.5 |
Client Selection in Federated Learning: Convergence Analysis and Power-of-Choice Selection Strategies |
6, 6, 6, 4 |
|
1255 |
5.5 |
Status-Quo Policy Gradient in Multi-agent Reinforcement Learning |
7, 6, 4, 5 |
|
1256 |
5.5 |
Non-Markovian Predictive Coding For Planning In Latent Space |
5, 6, 6, 5 |
|
1257 |
5.5 |
Robust Temporal Ensembling |
6, 5, 5, 6 |
|
1258 |
5.5 |
Federated Generalized Bayesian Learning via Distributed Stein Variational Gradient Descent |
5, 5, 6, 6 |
|
1259 |
5.5 |
Sharing Less is More: Lifelong Learning in Deep Networks with Selective Layer Transfer |
6, 3, 6, 7 |
|
1260 |
5.5 |
Variance Based Sample Weighting for Supervised Learning |
6, 6, 3, 7 |
|
1261 |
5.5 |
Pretrain Knowledge-Aware Language Models |
7, 4, 6, 5 |
|
1262 |
5.5 |
Accurately Solving Physical Systems with Graph Learning |
4, 6, 6, 6 |
|
1263 |
5.5 |
Towards a Reliable and Robust Dialogue System for Medical Automatic Diagnosis |
6, 6, 4, 6 |
|
1264 |
5.5 |
Near-Optimal Glimpse Sequences for Training Hard Attention Neural Networks |
7, 6, 5, 4 |
|
1265 |
5.5 |
Causal Inference Q-Network: Toward Resilient Reinforcement Learning |
7, 4, 7, 4 |
|
1266 |
5.5 |
Unsupervised Discovery of 3D Physical Objects |
5, 6, 6, 5 |
|
1267 |
5.5 |
Drift Detection in Episodic Data: Detect When Your Agent Starts Faltering |
5, 6, 6, 5 |
|
1268 |
5.5 |
Non-convex Optimization via Adaptive Stochastic Search for End-to-end Learning and Control |
6, 6, 6, 4 |
|
1269 |
5.5 |
Interpretable Sequence Classification Via Prototype Trajectory |
5, 6, 7, 4 |
|
1270 |
5.5 |
Fast MNAS: Uncertainty-aware Neural Architecture Search with Lifelong Learning |
6, 6, 5, 5 |
|
1271 |
5.5 |
Filter pre-pruning for improved fine-tuning of quantized deep neural networks |
5, 6, 6, 5 |
|
1272 |
5.5 |
Outlier Robust Optimal Transport |
4, 6, 5, 7 |
|
1273 |
5.5 |
Improving Generalizability of Protein Sequence Models via Data Augmentations |
9, 3, 4, 6 |
|
1274 |
5.5 |
BROS: A Pre-trained Language Model for Understanding Texts in Document |
6, 5, 5, 6 |
|
1275 |
5.5 |
Mixture Representation Learning with Coupled Autoencoding Agents |
6, 5, 5, 6 |
|
1276 |
5.5 |
Generative Scene Graph Networks |
6, 6, 4, 6 |
|
1277 |
5.5 |
Stochastic Subset Selection for Efficient Training and Inference of Neural Networks |
4, 6, 6, 6 |
|
1278 |
5.5 |
Distributional Generalization: A New Kind of Generalization |
5, 6, 4, 7 |
|
1279 |
5.5 |
BAFFLE: TOWARDS RESOLVING FEDERATED LEARNING’S DILEMMA - THWARTING BACKDOOR AND INFERENCE ATTACKS |
6, 6, 4, 6 |
|
1280 |
5.5 |
Sufficient and Disentangled Representation Learning |
4, 7, 6, 5 |
|
1281 |
5.5 |
A General Framework for Unsupervised Anomaly Detection |
5, 5, 7, 5 |
|
1282 |
5.5 |
Robust Reinforcement Learning using Adversarial Populations |
5, 4, 7, 6 |
|
1283 |
5.5 |
Progressively Stacking 2.0: A multi-stage layerwise training method for BERT training speedup |
6, 5, 5, 6 |
|
1284 |
5.5 |
Learning Two-Time-Scale Representations For Large Scale Recommendations |
6, 7, 6, 3 |
|
1285 |
5.5 |
Optimistic Policy Optimization with General Function Approximations |
4, 5, 6, 7 |
|
1286 |
5.5 |
Monotonic Robust Policy Optimization with Model Discrepancy |
4, 5, 6, 7 |
|
1287 |
5.5 |
Brain-like approaches to unsupervised learning of hidden representations - a comparative study |
5, 4, 7, 6 |
|
1288 |
5.5 |
Optimal Neural Program Synthesis from Multimodal Specifications |
4, 7, 5, 6 |
|
1289 |
5.5 |
Triple-Search: Differentiable Joint-Search of Networks, Precision, and Accelerators |
6, 5, 5, 6 |
|
1290 |
5.5 |
Safe Reinforcement Learning with Natural Language Constraints |
7, 5, 5, 5 |
|
1291 |
5.5 |
Robust Learning Rate Selection for Stochastic Optimization via Splitting Diagnostic |
7, 7, 5, 3 |
|
1292 |
5.5 |
Local Information Opponent Modelling Using Variational Autoencoders |
6, 3, 7, 6 |
|
1293 |
5.5 |
Dual-Tree Wavelet Packet CNNs for Image Classification |
6, 8, 4, 4 |
|
1294 |
5.5 |
How to Avoid Being Eaten by a Grue: Structured Exploration Strategies for Textual Worlds |
5, 7, 4, 6 |
|
1295 |
5.5 |
Spherical Motion Dynamics: Learning Dynamics of Neural Network with Normalization, Weight Decay, and SGD |
6, 5, 7, 4 |
|
1296 |
5.5 |
Optimizing Transformers with Approximate Computing for Faster, Smaller and more Accurate NLP Models |
6, 5, 7, 4 |
|
1297 |
5.5 |
Concentric Spherical GNN for 3D Representation Learning |
5, 5, 6, 6 |
|
1298 |
5.5 |
Debiasing Concept Bottleneck Models with Instrumental Variables |
4, 5, 7, 6 |
|
1299 |
5.5 |
Trojans and Adversarial Examples: A Lethal Combination |
5, 7, 4, 6 |
|
1300 |
5.5 |
Patch-level Neighborhood Interpolation: A General and Effective Graph-based Regularization Strategy |
5, 6, 5, 6 |
|
1301 |
5.5 |
CROSS-SUPERVISED OBJECT DETECTION |
6, 4, 6, 6 |
|
1302 |
5.5 |
Weak NAS Predictor Is All You Need |
6, 6, 6, 4 |
|
1303 |
5.5 |
EXPLORING VULNERABILITIES OF BERT-BASED APIS |
6, 4, 6, 6 |
|
1304 |
5.5 |
Uniform Priors for Data-Efficient Transfer |
6, 5, 6, 5 |
|
1305 |
5.5 |
Contextual Image Parsing via Panoptic Segment Sorting |
5, 5, 6, 6 |
|
1306 |
5.5 |
Robust Loss Functions for Complementary Labels Learning |
7, 7, 5, 3 |
|
1307 |
5.5 |
On Nondeterminism and Instability in Neural Network Optimization |
5, 6, 6, 5 |
|
1308 |
5.5 |
Globetrotter: Unsupervised Multilingual Translation from Visual Alignment |
7, 5, 5, 5 |
|
1309 |
5.5 |
Inductive Collaborative Filtering via Relation Graph Learning |
6, 4, 6, 6 |
|
1310 |
5.5 |
XLA: A Robust Unsupervised Data Augmentation Framework for Cross-Lingual NLP |
5, 6, 6, 5 |
|
1311 |
5.5 |
TextTN: Probabilistic Encoding of Language on Tensor Network |
6, 4, 7, 5 |
|
1312 |
5.5 |
Decoupling Exploration and Exploitation for Meta-Reinforcement Learning without Sacrifices |
5, 4, 6, 7 |
|
1313 |
5.5 |
Disentangled Generative Causal Representation Learning |
5, 6, 6, 5 |
|
1314 |
5.5 |
Minimal Geometry-Distortion Constraint for Unsupervised Image-to-Image Translation |
7, 4, 7, 4 |
|
1315 |
5.5 |
Iterative Graph Self-Distillation |
5, 6, 5, 6 |
|
1316 |
5.5 |
Inductive Bias of Gradient Descent for Exponentially Weight Normalized Smooth Homogeneous Neural Nets |
4, 4, 7, 7 |
|
1317 |
5.5 |
Meta-Active Learning in Probabilistically-Safe Optimization |
5, 6, 5, 6 |
|
1318 |
5.5 |
EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL |
6, 6, 6, 4 |
|
1319 |
5.5 |
Group Equivariant Conditional Neural Processes |
6, 4, 7, 5 |
|
1320 |
5.5 |
BASGD: Buffered Asynchronous SGD for Byzantine Learning |
7, 6, 4, 5 |
|
1321 |
5.5 |
On the Inductive Bias of a CNN for Distributions with Orthogonal Patterns |
5, 6, 5, 6 |
|
1322 |
5.5 |
Jumpy Recurrent Neural Networks |
5, 7, 5, 5 |
|
1323 |
5.5 |
L2E: Learning to Exploit Your Opponent |
6, 4, 6, 6 |
|
1324 |
5.5 |
Understanding, Analyzing, and Optimizing the Complexity of Deep Models |
5, 8, 5, 4 |
|
1325 |
5.5 |
Contextual Knowledge Distillation for Transformer Compression |
6, 5, 5, 6 |
|
1326 |
5.5 |
Truthful Self-Play |
4, 5, 8, 5 |
|
1327 |
5.5 |
Unsupervised Domain Adaptation via Minimized Joint Error |
5, 6, 7, 4 |
|
1328 |
5.5 |
Prototypical Representation Learning for Relation Extraction |
4, 6, 7, 5 |
|
1329 |
5.5 |
A priori guarantees of finite-time convergence for Deep Neural Networks |
7, 7, 4, 4 |
|
1330 |
5.5 |
A Geometric Analysis of Deep Generative Image Models and Its Applications |
5, 6, 6, 5 |
|
1331 |
5.5 |
SoGCN: Second-Order Graph Convolutional Networks |
7, 5, 5, 5 |
|
1332 |
5.5 |
Weakly Supervised Neuro-Symbolic Module Networks for Numerical Reasoning |
5, 7, 4, 6 |
|
1333 |
5.5 |
RG-Flow: A hierarchical and explainable flow model based on renormalization group and sparse prior |
6, 6, 5, 5 |
|
1334 |
5.5 |
Active Feature Acquisition with Generative Surrogate Models |
7, 5, 4, 6 |
|
1335 |
5.5 |
Laplacian Eigenspaces, Horocycles and Neuron Models on Hyperbolic Spaces |
5, 5, 8, 4 |
|
1336 |
5.5 |
Approximate Probabilistic Inference with Composed Flows |
6, 5, 7, 4 |
|
1337 |
5.5 |
On the Importance of Sampling in Training GCNs: Convergence Analysis and Variance Reduction |
7, 7, 4, 4 |
|
1338 |
5.5 |
Incremental few-shot learning via vector quantization in deep embedded space |
5, 6, 6, 5 |
|
1339 |
5.5 |
How Important is the Train-Validation Split in Meta-Learning? |
6, 6, 5, 5 |
|
1340 |
5.5 |
Provable Acceleration of Neural Net Training via Polyak’s Momentum |
6, 4, 7, 5 |
|
1341 |
5.5 |
Offline Meta-Reinforcement Learning with Advantage Weighting |
5, 5, 6, 6 |
|
1342 |
5.5 |
Deep Ensemble Kernel Learning |
3, 5, 8, 6 |
|
1343 |
5.5 |
Towards Robust Graph Neural Networks against Label Noise |
7, 4, 5, 6 |
|
1344 |
5.5 |
Semantic-Guided Representation Enhancement for Self-supervised Monocular Trained Depth Estimation |
5, 7, 6, 4 |
|
1345 |
5.5 |
The Compact Support Neural Network |
6, 6, 5, 5 |
|
1346 |
5.5 |
NeurWIN: Neural Whittle Index Network for Restless Bandits via Deep RL |
4, 7, 7, 4 |
|
1347 |
5.5 |
Exploiting Playbacks in Unsupervised Domain Adaptation for 3D Object Detection |
4, 6, 6, 6 |
|
1348 |
5.5 |
Individuality in the hive - Learning to embed lifetime social behaviour of honey bees |
5, 6, 5, 6 |
|
1349 |
5.5 |
Learning Efficient Planning-based Rewards for Imitation Learning |
5, 5, 6, 6 |
|
1350 |
5.5 |
Finding Physical Adversarial Examples for Autonomous Driving with Fast and Differentiable Image Compositing |
5, 5, 6, 6 |
|
1351 |
5.5 |
Deep Coherent Exploration For Continuous Control |
7, 4, 7, 4 |
|
1352 |
5.5 |
Tailoring: encoding inductive biases by optimizing unsupervised objectives at prediction time |
6, 4, 5, 7 |
|
1353 |
5.5 |
Learning Contextual Perturbation Budgets for Training Robust Neural Networks |
5, 6, 6, 5 |
|
1354 |
5.5 |
D2p-fed: Differentially Private Federated Learning with Efficient Communication |
5, 6, 7, 4 |
|
1355 |
5.5 |
On Low Rank Directed Acyclic Graphs and Causal Structure Learning |
5, 6, 5, 6 |
|
1356 |
5.5 |
Streaming Probabilistic Deep Tensor Factorization |
5, 6, 5, 6 |
|
1357 |
5.5 |
Dynamic of Stochastic Gradient Descent with State-dependent Noise |
5, 6, 6, 5 |
|
1358 |
5.5 |
Compute- and Memory-Efficient Reinforcement Learning with Latent Experience Replay |
6, 5, 4, 7 |
|
1359 |
5.5 |
Box-To-Box Transformation for Modeling Joint Hierarchies |
8, 6, 4, 4 |
|
1360 |
5.5 |
Whitening and second order optimization both destroy information about the dataset, and can make generalization impossible |
4, 4, 7, 7 |
|
1361 |
5.5 |
Efficient Long-Range Convolutions for Point Clouds |
5, 5, 6, 6 |
|
1362 |
5.5 |
Towards Understanding Fast Adversarial Training |
5, 5, 7, 5 |
|
1363 |
5.5 |
Online Learning under Adversarial Corruptions |
5, 5, 7, 5 |
|
1364 |
5.5 |
Learning Energy-Based Generative Models via Coarse-to-Fine Expanding and Sampling |
6, 4, 5, 7 |
|
1365 |
5.5 |
Memory-Efficient Semi-Supervised Continual Learning: The World is its Own Replay Buffer |
5, 6, 7, 4 |
|
1366 |
5.5 |
Learning Consistent Deep Generative Models from Sparse Data via Prediction Constraints |
5, 6, 5, 6 |
|
1367 |
5.5 |
Distributed Adversarial Training to Robustify Deep Neural Networks at Scale |
5, 5, 8, 4 |
|
1368 |
5.5 |
What’s in the Box? Exploring the Inner Life of Neural Networks with Robust Rules |
5, 6, 3, 8 |
|
1369 |
5.5 |
Slot Machines: Discovering Winning Combinations of Random Weights in Neural Networks |
6, 5, 4, 7 |
|
1370 |
5.5 |
Consistency and Monotonicity Regularization for Neural Knowledge Tracing |
5, 6, 7, 4 |
|
1371 |
5.5 |
Efficient Architecture Search for Continual Learning |
6, 4, 6, 6 |
|
1372 |
5.5 |
Online Testing of Subgroup Treatment Effects Based on Value Difference |
7, 5, 3, 7 |
|
1373 |
5.5 |
Modifying Memories in Transformer Models |
6, 6, 5, 5 |
|
1374 |
5.5 |
Optimizing Loss Functions Through Multivariate Taylor Polynomial Parameterization |
6, 6, 5, 5 |
|
1375 |
5.5 |
Disentangling Representations of Text by Masking Transformers |
5, 6, 6, 5 |
|
1376 |
5.5 |
Amortized Causal Discovery: Learning to Infer Causal Graphs from Time-Series Data |
5, 6, 6, 5 |
|
1377 |
5.5 |
Adversarial Environment Generation for Learning to Navigate the Web |
6, 5, 4, 7 |
|
1378 |
5.5 |
GRF: Learning a General Radiance Field for 3D Scene Representation and Rendering |
7, 6, 5, 4 |
|
1379 |
5.5 |
Precondition Layer and Its Use for GANs |
6, 5, 4, 7 |
|
1380 |
5.5 |
Expressive Yet Tractable Bayesian Deep Learning via Subnetwork Inference |
6, 6, 5, 5 |
|
1381 |
5.5 |
Distributional Reinforcement Learning for Risk-Sensitive Policies |
5, 5, 5, 7 |
|
1382 |
5.5 |
Truly Deterministic Policy Optimization |
5, 6, 6, 5 |
|
1383 |
5.5 |
Divide-and-Conquer Monte Carlo Tree Search |
5, 4, 5, 8 |
|
1384 |
5.5 |
The Bootstrap Framework: Generalization Through the Lens of Online Optimization |
5, 4, 6, 7 |
|
1385 |
5.5 |
Average Reward Reinforcement Learning with Monotonic Policy Improvement |
6, 6, 4, 6 |
|
1386 |
5.5 |
Robust Curriculum Learning: from clean label detection to noisy label self-correction |
5, 6, 5, 6 |
|
1387 |
5.5 |
Masked Label Prediction: Unified Message Passing Model for Semi-Supervised Classification |
5, 4, 6, 7 |
|
1388 |
5.5 |
High-Capacity Expert Binary Networks |
7, 5, 6, 4 |
|
1389 |
5.5 |
Double Generative Adversarial Networks for Conditional Independence Testing |
5, 5, 6, 6 |
|
1390 |
5.5 |
Improving Few-Shot Visual Classification with Unlabelled Examples |
6, 6, 5, 5 |
|
1391 |
5.5 |
A Coach-Player Framework for Dynamic Team Composition |
5, 4, 6, 7 |
|
1392 |
5.5 |
On Dynamic Noise Influence in Differential Private Learning |
7, 5, 4, 6 |
|
1393 |
5.5 |
Nearest Neighbor Machine Translation |
4, 8, 4, 6 |
|
1394 |
5.5 |
Unsupervised Learning of Global Factors in Deep Generative Models |
6, 5, 5, 6 |
|
1395 |
5.5 |
Early Stopping by Gradient Disparity |
5, 5, 5, 7 |
|
1396 |
5.5 |
Amortized Conditional Normalized Maximum Likelihood |
5, 6, 6, 5 |
|
1397 |
5.5 |
Efficient Reinforcement Learning in Resource Allocation Problems Through Permutation Invariant Multi-task Learning |
5, 5, 5, 7 |
|
1398 |
5.5 |
Offline Adaptive Policy Leaning in Real-World Sequential Recommendation Systems |
7, 7, 4, 4 |
|
1399 |
5.5 |
Neural Dynamical Systems: Balancing Structure and Flexibility in Physical Prediction |
4, 8, 5, 5 |
|
1400 |
5.5 |
Universal Sentence Representations Learning with Conditional Masked Language Model |
6, 7, 4, 5 |
|
1401 |
5.5 |
Hamiltonian Q-Learning: Leveraging Importance-sampling for Data Efficient RL |
5, 6, 5, 6 |
|
1402 |
5.5 |
Target Training: Tricking Adversarial Attacks to Fail |
5, 5, 7, 5 |
|
1403 |
5.5 |
D3C: Reducing the Price of Anarchy in Multi-Agent Learning |
7, 6, 6, 3 |
|
1404 |
5.5 |
Beyond GNNs: A Sample Efficient Architecture for Graph Problems |
5, 8, 5, 4 |
|
1405 |
5.5 |
Towards Adversarial Robustness of Bayesian Neural Network through Hierarchical Variational Inference |
6, 5, 6, 5 |
|
1406 |
5.5 |
Mapping the Timescale Organization of Neural Language Models |
7, 6, 6, 3 |
|
1407 |
5.5 |
Convex Regularization in Monte-Carlo Tree Search |
4, 8, 5, 5 |
|
1408 |
5.5 |
LEARNED HARDWARE/SOFTWARE CO-DESIGN OF NEURAL ACCELERATORS |
7, 5, 4, 6 |
|
1409 |
5.5 |
Federated Learning’s Blessing: FedAvg has Linear Speedup |
6, 5, 6, 5 |
|
1410 |
5.5 |
Balancing Robustness and Sensitivity using Feature Contrastive Learning |
5, 6, 6, 5 |
|
1411 |
5.5 |
Multinomial Variational Autoencoders can recover Principal Components |
4, 6, 7, 5 |
|
1412 |
5.5 |
Near-Optimal Regret Bounds for Model-Free RL in Non-Stationary Episodic MDPs |
7, 4, 4, 7 |
|
1413 |
5.5 |
Generalizing Graph Convolutional Networks |
6, 5, 5, 6 |
|
1414 |
5.5 |
TEAC: Intergrating Trust Region and Max Entropy Actor Critic for Continuous Control |
5, 5, 5, 7 |
|
1415 |
5.5 |
Deep Reinforcement Learning For Wireless Scheduling with Multiclass Services |
5, 7, 7, 3 |
|
1416 |
5.5 |
Do Deeper Convolutional Networks Perform Better? |
6, 6, 5, 5 |
|
1417 |
5.5 |
Learn what you can’t learn: Regularized Ensembles for Transductive out-of-distribution detection |
5, 3, 6, 8 |
|
1418 |
5.5 |
Robustness to Pruning Predicts Generalization in Deep Neural Networks |
5, 5, 7, 5 |
|
1419 |
5.5 |
Generative Fairness Teaching |
6, 5, 5, 6 |
|
1420 |
5.5 |
Don’t stack layers in graph neural networks, wire them randomly |
5, 8, 5, 4 |
|
1421 |
5.5 |
Mitigating Mode Collapse by Sidestepping Catastrophic Forgetting |
5, 4, 7, 6 |
|
1422 |
5.5 |
Self-supervised and Supervised Joint Training for Resource-rich Machine Translation |
5, 5, 5, 7 |
|
1423 |
5.5 |
Reinforcement Learning for Control with Probabilistic Stability Guarantee |
5, 5, 6, 6 |
|
1424 |
5.5 |
How to compare adversarial robustness of classifiers from a global perspective |
6, 5, 5, 6 |
|
1425 |
5.4 |
SLAPS: Self-Supervision Improves Structure Learning for Graph Neural Networks |
5, 7, 5, 5, 5 |
|
1426 |
5.4 |
Interpretability Through Invertibility: A Deep Convolutional Network With Ideal Counterfactuals And Isosurfaces |
6, 6, 5, 5, 5 |
|
1427 |
5.4 |
MISSO: Minimization by Incremental Stochastic Surrogate Optimization for Large Scale Nonconvex and Nonsmooth Problems |
3, 6, 7, 5, 6 |
|
1428 |
5.4 |
Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming |
4, 4, 6, 6, 7 |
|
1429 |
5.4 |
Data augmentation for deep learning based accelerated MRI reconstruction |
6, 6, 6, 5, 4 |
|
1430 |
5.4 |
Benefits of Assistance over Reward Learning |
5, 6, 7, 4, 5 |
|
1431 |
5.4 |
Addressing the Topological Defects of Disentanglement |
6, 6, 3, 7, 5 |
|
1432 |
5.4 |
Optimization Variance: Exploring Generalization Properties of DNNs |
5, 5, 7, 5, 5 |
|
1433 |
5.4 |
Learning to Solve Nonlinear Partial Differential Equation Systems To Accelerate MOSFET Simulation |
7, 5, 6, 5, 4 |
|
1434 |
5.4 |
SyncTwin: Transparent Treatment Effect Estimation under Temporal Confounding |
3, 4, 9, 4, 7 |
|
1435 |
5.4 |
Learning Safe Policies with Cost-sensitive Advantage Estimation |
5, 4, 6, 7, 5 |
|
1436 |
5.4 |
Learning to Share in Multi-Agent Reinforcement Learning |
3, 8, 8, 4, 4 |
|
1437 |
5.4 |
Channel-Directed Gradients for Optimization of Convolutional Neural Networks |
6, 5, 6, 4, 6 |
|
1438 |
5.4 |
Attainability and Optimality: The Equalized-Odds Fairness Revisited |
5, 5, 6, 5, 6 |
|
1439 |
5.4 |
Acceleration in Hyperbolic and Spherical Spaces |
5, 5, 7, 4, 6 |
|
1440 |
5.33 |
Adversarial Training using Contrastive Divergence |
5, 6, 5 |
|
1441 |
5.33 |
Towards Defending Multiple Adversarial Perturbations via Gated Batch Normalization |
5, 5, 6 |
|
1442 |
5.33 |
Towards Noise-resistant Object Detection with Noisy Annotations |
6, 5, 5 |
|
1443 |
5.33 |
PODS: Policy Optimization via Differentiable Simulation |
6, 4, 6 |
|
1444 |
5.33 |
Sobolev Training for the Neural Network Solutions of PDEs |
7, 5, 4 |
|
1445 |
5.33 |
Towards Impartial Multi-task Learning |
7, 5, 4 |
|
1446 |
5.33 |
Dimension reduction as an optimization problem over a set of generalized functions |
4, 7, 5 |
|
1447 |
5.33 |
Reflective Decoding: Unsupervised Paraphrasing and Abductive Reasoning |
5, 6, 5 |
|
1448 |
5.33 |
Toward Trainability of Quantum Neural Networks |
5, 5, 6 |
|
1449 |
5.33 |
ABS: Automatic Bit Sharing for Model Compression |
6, 4, 6 |
|
1450 |
5.33 |
Generalisation Guarantees For Continual Learning With Orthogonal Gradient Descent |
5, 6, 5 |
|
1451 |
5.33 |
Analyzing and Improving Generative Adversarial Training for Generative Modeling and Out-of-Distribution Detection |
7, 4, 5 |
|
1452 |
5.33 |
RECONNAISSANCE FOR REINFORCEMENT LEARNING WITH SAFETY CONSTRAINTS |
7, 5, 4 |
|
1453 |
5.33 |
Bayesian Meta-Learning for Few-Shot 3D Shape Completion |
5, 4, 7 |
|
1454 |
5.33 |
Information-Theoretic Odometry Learning |
5, 5, 6 |
|
1455 |
5.33 |
Deep Positive Unlabeled Learning with a Sequential Bias |
5, 5, 6 |
|
1456 |
5.33 |
Self-Supervised Time Series Representation Learning by Inter-Intra Relational Reasoning |
6, 5, 5 |
|
1457 |
5.33 |
Matrix Shuffle-Exchange Networks for Hard 2D Tasks |
4, 4, 8 |
|
1458 |
5.33 |
Using Synthetic Data to Improve the Long-range Forecasting of Time Series Data |
6, 5, 5 |
|
1459 |
5.33 |
Beyond COVID-19 Diagnosis: Prognosis with Hierarchical Graph Representation Learning |
6, 4, 6 |
|
1460 |
5.33 |
On Disentangled Representations Learned From Correlated Data |
3, 7, 6 |
|
1461 |
5.33 |
On the Universal Approximability and Complexity Bounds of Deep Learning in Hybrid Quantum-Classical Computing |
6, 6, 4 |
|
1462 |
5.33 |
There is no trade-off: enforcing fairness can improve accuracy |
6, 6, 4 |
|
1463 |
5.33 |
On the Consistency Loss for Leveraging Augmented Data to Learn Robust and Invariant Representations |
6, 4, 6 |
|
1464 |
5.33 |
Can one hear the shape of a neural network?: Snooping the GPU via Magnetic Side Channel |
5, 7, 4 |
|
1465 |
5.33 |
Improving Calibration for Long-Tailed Recognition |
6, 4, 6 |
|
1466 |
5.33 |
Source-free Domain Adaptation via Distributional Alignment by Matching Batch Normalization Statistics |
6, 4, 6 |
|
1467 |
5.33 |
Geometry of Program Synthesis |
4, 5, 7 |
|
1468 |
5.33 |
Learning Image Labels On-the-fly for Training Robust Classification Models |
4, 7, 5 |
|
1469 |
5.33 |
Generative Learning With Euler Particle Transport |
6, 5, 5 |
|
1470 |
5.33 |
Controllable Pareto Multi-Task Learning |
5, 7, 4 |
|
1471 |
5.33 |
Active Learning in CNNs via Expected Improvement Maximization |
6, 6, 4 |
|
1472 |
5.33 |
Discovering Parametric Activation Functions |
5, 5, 6 |
|
1473 |
5.33 |
Learning to Solve Multi-Robot Task Allocation with a Covariant-Attention based Neural Architecture |
7, 5, 4 |
|
1474 |
5.33 |
Learning the Connections in Direct Feedback Alignment |
6, 5, 5 |
|
1475 |
5.33 |
Contrastive Code Representation Learning |
4, 6, 6 |
|
1476 |
5.33 |
Explainability for fair machine learning |
5, 6, 5 |
|
1477 |
5.33 |
Dynamic Backdoor Attacks Against Deep Neural Networks |
5, 6, 5 |
|
1478 |
5.33 |
Effective Distributed Learning with Random Features: Improved Bounds and Algorithms |
4, 6, 6 |
|
1479 |
5.33 |
On Learning Read-once DNFs With Neural Networks |
4, 7, 5 |
|
1480 |
5.33 |
Perceptual Deep Neural Networks: Adversarial Robustness Through Input Recreation |
5, 5, 6 |
|
1481 |
5.33 |
Modal Uncertainty Estimation via Discrete Latent Representations |
5, 6, 5 |
|
1482 |
5.33 |
Ricci-GNN: Defending Against Structural Attacks Through a Geometric Approach |
5, 5, 6 |
|
1483 |
5.33 |
Prior Preference Learning From Experts: Designing A Reward with Active Inference |
6, 5, 5 |
|
1484 |
5.33 |
A Provably Convergent and Practical Algorithm for Min-Max Optimization with Applications to GANs |
4, 6, 6 |
|
1485 |
5.33 |
Orthogonal Subspace Decomposition: A New Perspective of Learning Discriminative Features for Face Clustering |
4, 7, 5 |
|
1486 |
5.33 |
When Are Neural Pruning Approximation Bounds Useful? |
5, 6, 5 |
|
1487 |
5.33 |
Deep Learning meets Projective Clustering |
5, 4, 7 |
|
1488 |
5.33 |
Overcoming barriers to the training of effective learned optimizers |
5, 4, 7 |
|
1489 |
5.33 |
Learning-Augmented Sketches for Hessians |
6, 6, 4 |
|
1490 |
5.33 |
Exploring Balanced Feature Spaces for Representation Learning |
6, 5, 5 |
|
1491 |
5.33 |
Adversarial representation learning for synthetic replacement of private attributes |
7, 4, 5 |
|
1492 |
5.33 |
MVP: Multivariate polynomials for conditional generation |
5, 5, 6 |
|
1493 |
5.33 |
Active Tuning |
5, 3, 8 |
|
1494 |
5.33 |
Fast Partial Fourier Transform |
6, 5, 5 |
|
1495 |
5.33 |
Learning to generate Wasserstein barycenters |
6, 7, 3 |
|
1496 |
5.33 |
Learning Disentangled Representations for Image Translation |
6, 6, 4 |
|
1497 |
5.33 |
A REINFORCEMENT LEARNING FRAMEWORK FOR TIME DEPENDENT CAUSAL EFFECTS EVALUATION IN A/B TESTING |
5, 5, 6 |
|
1498 |
5.33 |
Transferable Recognition-Aware Image Processing |
5, 5, 6 |
|
1499 |
5.33 |
Spectral Synthesis for Satellite-to-Satellite Translation |
5, 6, 5 |
|
1500 |
5.33 |
Quantifying Task Complexity Through Generalized Information Measures |
6, 5, 5 |
|
1501 |
5.33 |
Guided Exploration with Proximal Policy Optimization using a Single Demonstration |
6, 4, 6 |
|
1502 |
5.33 |
Improved Communication Lower Bounds for Distributed Optimisation |
5, 5, 6 |
|
1503 |
5.33 |
Adaptive Self-training for Neural Sequence Labeling with Few Labels |
4, 5, 7 |
|
1504 |
5.33 |
On Single-environment Extrapolations in Graph Classification and Regression Tasks |
3, 8, 5 |
|
1505 |
5.33 |
Learning a Transferable Scheduling Policy for Various Vehicle Routing Problems based on Graph-centric Representation Learning |
5, 6, 5 |
|
1506 |
5.33 |
Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation |
6, 6, 4 |
|
1507 |
5.33 |
Learning Visual Representations for Transfer Learning by Suppressing Texture |
7, 4, 5 |
|
1508 |
5.33 |
Rethinking Compressed Convolution Neural Network from a Statistical Perspective |
6, 5, 5 |
|
1509 |
5.33 |
Pointwise Binary Classification with Pairwise Confidence Comparisons |
4, 7, 5 |
|
1510 |
5.33 |
Deformable Capsules for Object Detection |
4, 6, 6 |
|
1511 |
5.33 |
CoLES: Contrastive learning for event sequences with self-supervision |
6, 5, 5 |
|
1512 |
5.33 |
On the Inversion of Deep Generative Models |
6, 3, 7 |
|
1513 |
5.33 |
Higher-order Structure Prediction in Evolving Graph Simplicial Complexes |
4, 6, 6 |
|
1514 |
5.33 |
Unsupervised Active Pre-Training for Reinforcement Learning |
5, 6, 5 |
|
1515 |
5.33 |
Decomposing Mutual Information for Representation Learning |
6, 5, 5 |
|
1516 |
5.33 |
News-Driven Stock Prediction Using Noisy Equity State Representation |
6, 5, 5 |
|
1517 |
5.33 |
Multi-Agent Imitation Learning with Copulas |
7, 5, 4 |
|
1518 |
5.33 |
Stability analysis of SGD through the normalized loss function |
6, 6, 4 |
|
1519 |
5.33 |
BasisNet: Two-stage Model Synthesis for Efficient Inference |
7, 3, 6 |
|
1520 |
5.33 |
Text as Neural Operator: Image Manipulation by Text Instruction |
4, 6, 6 |
|
1521 |
5.25 |
Distributed Momentum for Byzantine-resilient Stochastic Gradient Descent |
4, 7, 4, 6 |
|
1522 |
5.25 |
Non-decreasing Quantile Function Network with Efficient Exploration for Distributional Reinforcement Learning |
6, 4, 5, 6 |
|
1523 |
5.25 |
DyHCN: Dynamic Hypergraph Convolutional Networks |
5, 6, 6, 4 |
|
1524 |
5.25 |
Is deeper better? It depends on locality of relevant features |
4, 4, 6, 7 |
|
1525 |
5.25 |
Once Quantized for All: Progressively Searching for Quantized Efficient Models |
6, 5, 6, 4 |
|
1526 |
5.25 |
Iterated graph neural network system |
6, 6, 4, 5 |
|
1527 |
5.25 |
Factoring out Prior Knowledge from Low-Dimensional Embeddings |
5, 5, 6, 5 |
|
1528 |
5.25 |
TextSETTR: Label-Free Text Style Extraction and Tunable Targeted Restyling |
5, 6, 5, 5 |
|
1529 |
5.25 |
On Size Generalization in Graph Neural Networks |
5, 4, 7, 5 |
|
1530 |
5.25 |
Federated Averaging as Expectation Maximization |
7, 4, 5, 5 |
|
1531 |
5.25 |
CaLFADS: latent factor analysis of dynamical systems in calcium imaging data |
5, 7, 5, 4 |
|
1532 |
5.25 |
Variational Intrinsic Control Revisited |
6, 5, 4, 6 |
|
1533 |
5.25 |
Meta-Model-Based Meta-Policy Optimization |
6, 5, 5, 5 |
|
1534 |
5.25 |
Improving Sequence Generative Adversarial Networks with Feature Statistics Alignment |
5, 6, 6, 4 |
|
1535 |
5.25 |
Cross-State Self-Constraint for Feature Generalization in Deep Reinforcement Learning |
5, 5, 6, 5 |
|
1536 |
5.25 |
Regularized Mutual Information Neural Estimation |
3, 6, 7, 5 |
|
1537 |
5.25 |
FMix: Enhancing Mixed Sample Data Augmentation |
5, 6, 4, 6 |
|
1538 |
5.25 |
Weakly Supervised Scene Graph Grounding |
5, 7, 4, 5 |
|
1539 |
5.25 |
TransNAS-Bench-101: Improving Transferrability and Generalizability of Cross-Task Neural Architecture Search |
5, 5, 5, 6 |
|
1540 |
5.25 |
REPAINT: Knowledge Transfer in Deep Actor-Critic Reinforcement Learning |
6, 4, 7, 4 |
|
1541 |
5.25 |
Learning to Noise: Application-Agnostic Data Sharing with Local Differential Privacy |
6, 3, 6, 6 |
|
1542 |
5.25 |
Reviving Autoencoder Pretraining |
5, 9, 3, 4 |
|
1543 |
5.25 |
Black-Box Adversarial Attacks on Graph Neural Networks as An Influence Maximization Problem |
6, 5, 5, 5 |
|
1544 |
5.25 |
Bi-tuning of Pre-trained Representations |
8, 5, 4, 4 |
|
1545 |
5.25 |
Composite Adversarial Training for Multiple Adversarial Perturbations and Beyond |
5, 6, 5, 5 |
|
1546 |
5.25 |
Dynamic Graph: Learning Instance-aware Connectivity for Neural Networks |
3, 6, 6, 6 |
|
1547 |
5.25 |
Creating Synthetic Datasets via Evolution for Neural Program Synthesis |
3, 6, 6, 6 |
|
1548 |
5.25 |
Ranking Cost: One-Stage Circuit Routing by Directly Optimizing Global Objective Function |
5, 5, 6, 5 |
|
1549 |
5.25 |
Better Optimization can Reduce Sample Complexity: Active Semi-Supervised Learning via Convergence Rate Control |
5, 6, 5, 5 |
|
1550 |
5.25 |
Differentiable Dynamic Quantization with Mixed Precision and Adaptive Resolution |
5, 6, 4, 6 |
|
1551 |
5.25 |
DECSTR: Learning Goal-Directed Abstract Behaviors using Pre-Verbal Spatial Predicates in Intrinsically Motivated Agents |
4, 5, 5, 7 |
|
1552 |
5.25 |
HyperSAGE: Generalizing Inductive Representation Learning on Hypergraphs |
6, 5, 4, 6 |
|
1553 |
5.25 |
Domain-Free Adversarial Splitting for Domain Generalization |
5, 5, 6, 5 |
|
1554 |
5.25 |
Graph Joint Attention Networks |
4, 5, 7, 5 |
|
1555 |
5.25 |
Informative Outlier Matters: Robustifying Out-of-distribution Detection Using Outlier Mining |
7, 7, 4, 3 |
|
1556 |
5.25 |
Multi-View Disentangled Representation |
5, 5, 5, 6 |
|
1557 |
5.25 |
Learning Private Representations with Focal Entropy |
6, 6, 4, 5 |
|
1558 |
5.25 |
Waste not, Want not: All-Alive Pruning for Extremely Sparse Networks |
4, 7, 5, 5 |
|
1559 |
5.25 |
Smooth Adversarial Training |
4, 7, 4, 6 |
|
1560 |
5.25 |
Demon: Momentum Decay for Improved Neural Network Training |
5, 6, 5, 5 |
|
1561 |
5.25 |
PettingZoo: Gym for Multi-Agent Reinforcement Learning |
3, 6, 5, 7 |
|
1562 |
5.25 |
Predicting the impact of dataset composition on model performance |
4, 5, 7, 5 |
|
1563 |
5.25 |
Learnable Uncertainty under Laplace Approximations |
7, 6, 4, 4 |
|
1564 |
5.25 |
Efficient Exploration for Model-based Reinforcement Learning with Continuous States and Actions |
5, 5, 5, 6 |
|
1565 |
5.25 |
Sample efficient Quality Diversity for neural continuous control |
6, 3, 6, 6 |
|
1566 |
5.25 |
Reducing Class Collapse in Metric Learning with Easy Positive Sampling |
6, 6, 5, 4 |
|
1567 |
5.25 |
Central Server Free Federated Learning over Single-sided Trust Social Networks |
4, 8, 5, 4 |
|
1568 |
5.25 |
Debiased Graph Neural Networks with Agnostic Label Selection Bias |
4, 5, 4, 8 |
|
1569 |
5.25 |
Self-supervised Bayesian Deep Learning for Image Denoising |
3, 6, 6, 6 |
|
1570 |
5.25 |
Symmetric Wasserstein Autoencoders |
6, 5, 5, 5 |
|
1571 |
5.25 |
Neighborhood-Aware Neural Architecture Search |
6, 5, 6, 4 |
|
1572 |
5.25 |
Learning Monotonic Alignments with Source-Aware GMM Attention |
5, 5, 6, 5 |
|
1573 |
5.25 |
Score-based Causal Discovery from Heterogeneous Data |
7, 3, 5, 6 |
|
1574 |
5.25 |
Unsupervised Cross-lingual Representation Learning for Speech Recognition |
5, 6, 4, 6 |
|
1575 |
5.25 |
Real-time Uncertainty Decomposition for Online Learning Control |
5, 6, 7, 3 |
|
1576 |
5.25 |
To be Robust or to be Fair: Towards Fairness in Adversarial Training |
5, 6, 5, 5 |
|
1577 |
5.25 |
Automated Concatenation of Embeddings for Structured Prediction |
6, 6, 4, 5 |
|
1578 |
5.25 |
Energy-Based Models for Continual Learning |
6, 5, 6, 4 |
|
1579 |
5.25 |
The Emergence of Individuality in Multi-Agent Reinforcement Learning |
6, 4, 5, 6 |
|
1580 |
5.25 |
Tracking the progress of Language Models by extracting their underlying Knowledge Graphs |
6, 6, 5, 4 |
|
1581 |
5.25 |
Learning Hyperbolic Representations for Unsupervised 3D Segmentation |
4, 7, 7, 3 |
|
1582 |
5.25 |
Gradient Based Memory Editing for Task-Free Continual Learning |
5, 7, 3, 6 |
|
1583 |
5.25 |
Adaptive Discretization for Continuous Control using Particle Filtering Policy Network |
4, 5, 5, 7 |
|
1584 |
5.25 |
Voting-based Approaches For Differentially Private Federated Learning |
6, 4, 5, 6 |
|
1585 |
5.25 |
Iterative Amortized Policy Optimization |
5, 5, 5, 6 |
|
1586 |
5.25 |
It Is Likely That Your Loss Should be a Likelihood |
4, 5, 6, 6 |
|
1587 |
5.25 |
Semantic Inference Network for Few-shot Streaming Label Learning |
4, 5, 4, 8 |
|
1588 |
5.25 |
A Lazy Approach to Long-Horizon Gradient-Based Meta-Learning |
4, 5, 7, 5 |
|
1589 |
5.25 |
For self-supervised learning, Rationality implies generalization, provably |
7, 7, 4, 3 |
|
1590 |
5.25 |
Calibrated Adversarial Refinement for Stochastic Semantic Segmentation |
4, 5, 6, 6 |
|
1591 |
5.25 |
Directional graph networks |
4, 5, 7, 5 |
|
1592 |
5.25 |
What can we learn from gradients? |
7, 6, 4, 4 |
|
1593 |
5.25 |
Factorized linear discriminant analysis for phenotype-guided representation learning of neuronal gene expression data |
5, 5, 6, 5 |
|
1594 |
5.25 |
A Mixture of Variational Autoencoders for Deep Clustering |
5, 5, 5, 6 |
|
1595 |
5.25 |
DISE: Dynamic Integrator Selection to Minimize Forward Pass Time in Neural ODEs |
6, 6, 4, 5 |
|
1596 |
5.25 |
ARELU: ATTENTION-BASED RECTIFIED LINEAR UNIT |
6, 5, 3, 7 |
|
1597 |
5.25 |
Transformer-QL: A Step Towards Making Transformer Network Quadratically Large |
7, 4, 5, 5 |
|
1598 |
5.25 |
Towards Understanding Linear Value Decomposition in Cooperative Multi-Agent Q-Learning |
5, 5, 6, 5 |
|
1599 |
5.25 |
Adaptive Personalized Federated Learning |
3, 7, 5, 6 |
|
1600 |
5.25 |
Adversarial Deep Metric Learning |
4, 5, 6, 6 |
|
1601 |
5.25 |
Multi-Head Attention: Collaborate Instead of Concatenate |
5, 5, 5, 6 |
|
1602 |
5.25 |
Secure Byzantine-Robust Machine Learning |
6, 5, 7, 3 |
|
1603 |
5.25 |
EnTranNAS: Towards Closing the Gap between the Architectures in Search and Evaluation |
7, 6, 4, 4 |
|
1604 |
5.25 |
Neighbor2Seq: Deep Learning on Massive Graphs by Transforming Neighbors to Sequences |
7, 4, 5, 5 |
|
1605 |
5.25 |
Incorporating Symmetry into Deep Dynamics Models for Improved Generalization |
4, 6, 4, 7 |
|
1606 |
5.25 |
Revisiting Loss Modelling for Unstructured Pruning |
6, 3, 5, 7 |
|
1607 |
5.25 |
Information Lattice Learning |
4, 4, 7, 6 |
|
1608 |
5.25 |
Differentiable Weighted Finite-State Transducers |
6, 5, 4, 6 |
|
1609 |
5.25 |
Neural Architecture Search of SPD Manifold Networks |
7, 4, 4, 6 |
|
1610 |
5.25 |
Differentiable Spatial Planning using Transformers |
4, 4, 7, 6 |
|
1611 |
5.25 |
SALR: Sharpness-aware Learning Rates for Improved Generalization |
5, 4, 6, 6 |
|
1612 |
5.25 |
Linking average- and worst-case perturbation robustness via class selectivity and dimensionality |
5, 6, 4, 6 |
|
1613 |
5.25 |
Out-of-Distribution Generalization via Risk Extrapolation (REx) |
4, 6, 5, 6 |
|
1614 |
5.25 |
Stable Weight Decay Regularization |
5, 6, 5, 5 |
|
1615 |
5.25 |
Multiple Descent: Design Your Own Generalization Curve |
6, 6, 4, 5 |
|
1616 |
5.25 |
Signed Graph Diffusion Network |
7, 4, 6, 4 |
|
1617 |
5.25 |
IF-Defense: 3D Adversarial Point Cloud Defense via Implicit Function based Restoration |
5, 6, 6, 4 |
|
1618 |
5.25 |
GraphSAD: Learning Graph Representations with Structure-Attribute Disentanglement |
4, 8, 6, 3 |
|
1619 |
5.25 |
Neural Point Process for Forecasting Spatiotemporal Events |
8, 5, 4, 4 |
|
1620 |
5.25 |
Cooperating RPN’s Improve Few-Shot Object Detection |
3, 6, 7, 5 |
|
1621 |
5.25 |
PareCO: Pareto-aware Channel Optimization for Slimmable Neural Networks |
4, 5, 6, 6 |
|
1622 |
5.25 |
Post-Training Weighted Quantization of Neural Networks for Language Models |
4, 6, 6, 5 |
|
1623 |
5.25 |
Rethinking Parameter Counting: Effective Dimensionality Revisited |
5, 4, 6, 6 |
|
1624 |
5.25 |
MLR-SNet: Transferable LR Schedules for Heterogeneous Tasks |
5, 4, 6, 6 |
|
1625 |
5.25 |
On the Estimation Bias in Double Q-Learning |
6, 3, 6, 6 |
|
1626 |
5.25 |
Language Controls More Than Top-Down Attention: Modulating Bottom-Up Visual Processing with Referring Expressions |
5, 4, 10, 2 |
|
1627 |
5.25 |
Temporal Difference Uncertainties as a Signal for Exploration |
6, 3, 7, 5 |
|
1628 |
5.25 |
SVMax: A Feature Embedding Regularizer |
4, 6, 6, 5 |
|
1629 |
5.25 |
Time-varying Graph Representation Learning via Higher-Order Skip-Gram with Negative Sampling |
7, 4, 5, 5 |
|
1630 |
5.25 |
Benchmarking Unsupervised Object Representations for Video Sequences |
7, 5, 4, 5 |
|
1631 |
5.25 |
Rewriter-Evaluator Framework for Neural Machine Translation |
7, 6, 4, 4 |
|
1632 |
5.25 |
DOTS: Decoupling Operation and Topology in Differentiable Architecture Search |
6, 6, 4, 5 |
|
1633 |
5.25 |
Deep Clustering and Representation Learning that Preserves Geometric Structures |
4, 7, 6, 4 |
|
1634 |
5.25 |
Deep Learning with Data Privacy via Residual Perturbation |
5, 6, 4, 6 |
|
1635 |
5.25 |
Coverage as a Principle for Discovering Transferable Behavior in Reinforcement Learning |
4, 4, 5, 8 |
|
1636 |
5.25 |
Invertible Manifold Learning for Dimension Reduction |
5, 4, 8, 4 |
|
1637 |
5.25 |
Latent Causal Invariant Model |
6, 4, 6, 5 |
|
1638 |
5.25 |
Beyond Trivial Counterfactual Generations with Diverse Valuable Explanations |
6, 7, 4, 4 |
|
1639 |
5.25 |
S2SD: Simultaneous Similarity-based Self-Distillation for Deep Metric Learning |
4, 6, 7, 4 |
|
1640 |
5.25 |
MISIM: A Novel Code Similarity System |
5, 7, 5, 4 |
|
1641 |
5.25 |
Unsupervised Task Clustering for Multi-Task Reinforcement Learning |
5, 5, 5, 6 |
|
1642 |
5.25 |
Latent Programmer: Discrete Latent Codes for Program Synthesis |
7, 7, 4, 3 |
|
1643 |
5.25 |
Diversity Actor-Critic: Sample-Aware Entropy Regularization for Sample-Efficient Exploration |
5, 5, 6, 5 |
|
1644 |
5.25 |
Contrastive Learning with Adversarial Perturbations for Conditional Text Generation |
4, 6, 5, 6 |
|
1645 |
5.25 |
Point Cloud Instance Segmentation using Probabilistic Embeddings |
4, 7, 5, 5 |
|
1646 |
5.25 |
Faster Training of Word Embeddings |
7, 4, 5, 5 |
|
1647 |
5.25 |
Reducing Implicit Bias in Latent Domain Learning |
6, 5, 4, 6 |
|
1648 |
5.25 |
Efficient randomized smoothing by denoising with learned score function |
6, 3, 6, 6 |
|
1649 |
5.25 |
A Half-Space Stochastic Projected Gradient Method for Group Sparsity Regularization |
6, 5, 5, 5 |
|
1650 |
5.25 |
Federated Learning With Quantized Global Model Updates |
5, 5, 5, 6 |
|
1651 |
5.25 |
Detecting Hallucinated Content in Conditional Neural Sequence Generation |
5, 6, 5, 5 |
|
1652 |
5.25 |
Provably Faster Algorithms for Bilevel Optimization and Applications to Meta-Learning |
7, 6, 5, 3 |
|
1653 |
5.25 |
Adversarial Problems for Generative Networks |
4, 6, 4, 7 |
|
1654 |
5.25 |
Learning representations from temporally smooth data |
6, 5, 4, 6 |
|
1655 |
5.25 |
FAST GRAPH ATTENTION NETWORKS USING EFFECTIVE RESISTANCE BASED GRAPH SPARSIFICATION |
5, 6, 4, 6 |
|
1656 |
5.25 |
Solving Compositional Reinforcement Learning Problems via Task Reduction |
7, 6, 5, 3 |
|
1657 |
5.25 |
Feature Integration and Group Transformers for Action Proposal Generation |
5, 5, 6, 5 |
|
1658 |
5.25 |
Learning to Plan Optimistically: Uncertainty-Guided Deep Exploration via Latent Model Ensembles |
5, 4, 6, 6 |
|
1659 |
5.25 |
Efficient Differentiable Neural Architecture Search with Model Parallelism |
5, 5, 5, 6 |
|
1660 |
5.25 |
JAKET: Joint Pre-training of Knowledge Graph and Language Understanding |
5, 6, 5, 5 |
|
1661 |
5.25 |
Boundary Effects in CNNs: Feature or Bug? |
3, 8, 7, 3 |
|
1662 |
5.25 |
ALT-MAS: A Data-Efficient Framework for Active Testing of Machine Learning Algorithms |
8, 4, 6, 3 |
|
1663 |
5.25 |
On Episodes, Prototypical Networks, and Few-Shot Learning |
4, 7, 5, 5 |
|
1664 |
5.25 |
Hyperparameter Transfer Across Developer Adjustments |
5, 6, 5, 5 |
|
1665 |
5.25 |
Enhanced First and Zeroth Order Variance Reduced Algorithms for Min-Max Optimization |
6, 5, 6, 4 |
|
1666 |
5.25 |
Block Skim Transformer for Efficient Question Answering |
4, 6, 6, 5 |
|
1667 |
5.25 |
Almost Tight L0-norm Certified Robustness of Top-k Predictions against Adversarial Perturbations |
5, 5, 5, 6 |
|
1668 |
5.25 |
Random Coordinate Langevin Monte Carlo |
4, 4, 7, 6 |
|
1669 |
5.25 |
Experience Replay with Likelihood-free Importance Weights |
6, 5, 7, 3 |
|
1670 |
5.25 |
DiP Benchmark Tests: Evaluation Benchmarks for Discourse Phenomena in MT |
6, 7, 4, 4 |
|
1671 |
5.25 |
Disentangling Adversarial Robustness in Directions of the Data Manifold |
6, 4, 5, 6 |
|
1672 |
5.25 |
Straight to the Gradient: Learning to Use Novel Tokens for Neural Text Generation |
6, 4, 5, 6 |
|
1673 |
5.25 |
Model-Targeted Poisoning Attacks with Provable Convergence |
5, 6, 7, 3 |
|
1674 |
5.25 |
CompOFA – Compound Once-For-All Networks for Faster Multi-Platform Deployment |
4, 5, 6, 6 |
|
1675 |
5.25 |
Communication in Multi-Agent Reinforcement Learning: Intention Sharing |
5, 6, 4, 6 |
|
1676 |
5.25 |
A-FMI: Learning Attributions from Deep Networks via Feature Map Importance |
6, 6, 3, 6 |
|
1677 |
5.25 |
Optimal Transport Graph Neural Networks |
4, 5, 5, 7 |
|
1678 |
5.25 |
Exploring representation learning for flexible few-shot tasks |
8, 4, 5, 4 |
|
1679 |
5.25 |
Efficient Robust Training via Backward Smoothing |
5, 5, 5, 6 |
|
1680 |
5.25 |
On the Robustness of Sentiment Analysis for Stock Price Forecasting |
4, 5, 7, 5 |
|
1681 |
5.25 |
Graph Deformer Network |
5, 7, 4, 5 |
|
1682 |
5.25 |
Environment Predictive Coding for Embodied Agents |
6, 6, 4, 5 |
|
1683 |
5.25 |
Learning Flexible Classifiers with Shot-CONditional Episodic (SCONE) Training |
5, 6, 6, 4 |
|
1684 |
5.25 |
Evidence against implicitly recurrent computations in residual neural networks |
5, 5, 5, 6 |
|
1685 |
5.25 |
Contextual HyperNetworks for Novel Feature Adaptation |
5, 5, 5, 6 |
|
1686 |
5.25 |
Natural Compression for Distributed Deep Learning |
6, 5, 5, 5 |
|
1687 |
5.25 |
Out-of-distribution Prediction with Invariant Risk Minimization: The Limitation and An Effective Fix |
4, 7, 6, 4 |
|
1688 |
5.25 |
Motif-Driven Contrastive Learning of Graph Representations |
6, 5, 5, 5 |
|
1689 |
5.25 |
Counterfactual Thinking for Long-tailed Information Extraction |
5, 7, 6, 3 |
|
1690 |
5.25 |
Should Ensemble Members Be Calibrated? |
4, 6, 6, 5 |
|
1691 |
5.25 |
VECoDeR - Variational Embeddings for Community Detection and Node Representation |
5, 5, 6, 5 |
|
1692 |
5.25 |
Weighted Bellman Backups for Improved Signal-to-Noise in Q-Updates |
3, 8, 5, 5 |
|
1693 |
5.25 |
Shape or Texture: Disentangling Discriminative Features in CNNs |
7, 6, 4, 4 |
|
1694 |
5.25 |
Double Q-learning: New Analysis and Sharper Finite-time Bound |
5, 6, 4, 6 |
|
1695 |
5.25 |
ProGAE: A Geometric Autoencoder-based Generative Model for Disentangling Protein Dynamics |
4, 5, 7, 5 |
|
1696 |
5.25 |
Defining Benchmarks for Continual Few-Shot Learning |
4, 6, 6, 5 |
|
1697 |
5.25 |
A Neural Network MCMC sampler that maximizes Proposal Entropy |
3, 6, 6, 6 |
|
1698 |
5.2 |
Scheduled Restart Momentum for Accelerated Stochastic Gradient Descent |
5, 6, 5, 4, 6 |
|
1699 |
5.2 |
Semi-supervised Domain Adaptation with Prototypical Alignment and Consistency Learning |
5, 5, 6, 6, 4 |
|
1700 |
5.2 |
Graph Permutation Selection for Decoding of Error Correction Codes using Self-Attention |
6, 4, 5, 5, 6 |
|
1701 |
5.2 |
Forward Prediction for Physical Reasoning |
5, 6, 5, 5, 5 |
|
1702 |
5.2 |
Weighted Line Graph Convolutional Networks |
5, 6, 4, 6, 5 |
|
1703 |
5.2 |
EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets |
3, 5, 7, 6, 5 |
|
1704 |
5.2 |
ChePAN: Constrained Black-Box Uncertainty Modelling with Quantile Regression |
7, 7, 6, 4, 2 |
|
1705 |
5.2 |
Improving Self-supervised Pre-training via a Fully-Explored Masked Language Model |
6, 5, 6, 4, 5 |
|
1706 |
5.2 |
GeDi: Generative Discriminator Guided Sequence Generation |
5, 6, 4, 5, 6 |
|
1707 |
5.2 |
Explainable Subgraph Reasoning for Forecasting on Temporal Knowledge Graphs |
7, 6, 6, 1, 6 |
|
1708 |
5.2 |
Identifying Informative Latent Variables Learned by GIN via Mutual Information |
6, 4, 5, 6, 5 |
|
1709 |
5.2 |
Differentiate Everything with a Reversible Domain-Specific Language |
5, 6, 5, 4, 6 |
|
1710 |
5.17 |
Embedding Transfer via Smooth Contrastive Loss |
5, 5, 5, 6, 6, 4 |
|
1711 |
5 |
On the Latent Space of Flow-based Models |
5, 5, 4, 6, 5 |
|
1712 |
5 |
Convergent Adaptive Gradient Methods in Decentralized Optimization |
3, 4, 8, 7, 3 |
|
1713 |
5 |
Mitigating bias in calibration error estimation |
5, 7, 4, 4 |
|
1714 |
5 |
Coordinated Multi-Agent Exploration Using Shared Goals |
5, 5, 6, 4 |
|
1715 |
5 |
Video Prediction with Variational Temporal Hierarchies |
6, 4, 5, 5 |
|
1716 |
5 |
Rapid Neural Architecture Search by Learning to Generate Graphs from Datasets |
6, 4, 5 |
|
1717 |
5 |
TRACE: Tensorizing and Generalizing Supernets from Neural Architecture Search |
5, 6, 4, 5 |
|
1718 |
5 |
Bidirectional Self-Normalizing Neural Networks |
6, 4, 6, 4 |
|
1719 |
5 |
WAFFLe: Weight Anonymized Factorization for Federated Learning |
6, 4, 5 |
|
1720 |
5 |
Zero-shot Fairness with Invisible Demographics |
5, 6, 5, 4 |
|
1721 |
5 |
Category Disentangled Context: Turning Category-irrelevant Features Into Treasures |
5, 6, 5, 4 |
|
1722 |
5 |
Improving Calibration through the Relationship with Adversarial Robustness |
6, 2, 5, 7 |
|
1723 |
5 |
Predictive Attention Transformer: Improving Transformer with Attention Map Prediction |
6, 6, 6, 2 |
|
1724 |
5 |
Ranking Neural Checkpoints |
5, 5, 4, 6 |
|
1725 |
5 |
Unsupervised Progressive Learning and the STAM Architecture |
5, 2, 7, 6, 5 |
|
1726 |
5 |
GINN: Fast GPU-TEE Based Integrity for Neural Network Training |
7, 6, 4, 3 |
|
1727 |
5 |
Fantastic Four: Differentiable and Efficient Bounds on Singular Values of Convolution Layers |
4, 3, 5, 8 |
|
1728 |
5 |
Learning a Max-Margin Classifier for Cross-Domain Sentiment Analysis |
5, 5, 5, 5 |
|
1729 |
5 |
iPTR: Learning a representation for interactive program translation retrieval |
4, 5, 6 |
|
1730 |
5 |
Exploring Routing Strategies for Multilingual Mixture-of-Experts Models |
5, 4, 6 |
|
1731 |
5 |
Deep Curvature Suite |
6, 4, 7, 3 |
|
1732 |
5 |
Later Span Adaptation for Language Understanding |
6, 4, 4, 6 |
|
1733 |
5 |
Are all outliers alike? On Understanding the Diversity of Outliers for Detecting OODs |
5, 5, 6, 4 |
|
1734 |
5 |
Gradient-based training of Gaussian Mixture Models for High-Dimensional Streaming Data |
5, 5, 5, 5, 5 |
|
1735 |
5 |
Estimating Example Difficulty using Variance of Gradients |
6, 6, 6, 4, 3 |
|
1736 |
5 |
Semi-supervised learning by selective training with pseudo labels via confidence estimation |
5, 5, 6, 4 |
|
1737 |
5 |
Continual Memory: Can We Reason After Long-Term Memorization? |
4, 5, 6 |
|
1738 |
5 |
Model-Based Robust Deep Learning: Generalizing to Natural, Out-of-Distribution Data |
5, 5, 5, 5 |
|
1739 |
5 |
Leveraged Weighted Loss For Partial Label Learning |
6, 3, 7, 4 |
|
1740 |
5 |
Quantifying and Learning Disentangled Representations with Limited Supervision |
6, 5, 4, 5 |
|
1741 |
5 |
Imbalanced Gradients: A New Cause of Overestimated Adversarial Robustness |
5, 6, 4, 5 |
|
1742 |
5 |
Robustness via Probabilistic Cross-Task Ensembles |
5, 3, 9, 3 |
|
1743 |
5 |
WeMix: How to Better Utilize Data Augmentation |
4, 7, 5, 4 |
|
1744 |
5 |
Private Split Inference of Deep Networks |
5, 5, 5 |
|
1745 |
5 |
Provably More Efficient Q-Learning in the One-Sided-Feedback/Full-Feedback Settings |
5, 6, 4, 5 |
|
1746 |
5 |
A Multi-Modal and Multitask Benchmark in the Clinical Domain |
5, 5, 5 |
|
1747 |
5 |
Wasserstein Distributionally Robust Optimization: A Three-Player Game Framework |
5, 5, 6, 5, 4 |
|
1748 |
5 |
Graph Structural Aggregation for Explainable Learning |
7, 3, 4, 6 |
|
1749 |
5 |
Contrastive Learning of Medical Visual Representations from Paired Images and Text |
5, 6, 4 |
|
1750 |
5 |
Learning to Generate Videos Using Neural Uncertainty Priors |
4, 5, 5, 6 |
|
1751 |
5 |
InstantEmbedding: Efficient Local Node Representations |
6, 4, 6, 4 |
|
1752 |
5 |
The Quenching-Activation Behavior of the Gradient Descent Dynamics for Two-layer Neural Network Models |
5, 5, 5, 5 |
|
1753 |
5 |
A Deeper Look at Discounting Mismatch in Actor-Critic Algorithms |
6, 4, 4, 6 |
|
1754 |
5 |
Self-Organizing Intelligent Matter: A blueprint for an AI generating algorithm |
8, 5, 4, 3 |
|
1755 |
5 |
Contrastive Video Textures |
5, 4, 6 |
|
1756 |
5 |
Cross-Node Federated Graph Neural Network for Spatio-Temporal Data Modeling |
6, 3, 6, 5 |
|
1757 |
5 |
Human Perception-based Evaluation Criterion for Ultra-high Resolution Cell Membrane Segmentation |
7, 6, 3, 4 |
|
1758 |
5 |
Essentials for Class Incremental Learning |
4, 7, 5, 4 |
|
1759 |
5 |
Transformers with Competitive Ensembles of Independent Mechanisms |
4, 7, 5, 4 |
|
1760 |
5 |
Knowledge Distillation based Ensemble Learning for Neural Machine Translation |
6, 4, 4, 6 |
|
1761 |
5 |
Revisiting the Stability of Stochastic Gradient Descent: A Tightness Analysis |
4, 4, 7, 5 |
|
1762 |
5 |
Speeding up Deep Learning Training by Sharing Weights and Then Unsharing |
6, 4, 5, 5 |
|
1763 |
5 |
Everybody’s Talkin': Let Me Talk as You Want |
5, 6, 5, 4 |
|
1764 |
5 |
Neural Lyapunov Model Predictive Control |
5, 3, 7 |
|
1765 |
5 |
A Maximum Mutual Information Framework for Multi-Agent Reinforcement Learning |
6, 6, 5, 3 |
|
1766 |
5 |
Does Adversarial Transferability Indicate Knowledge Transferability? |
5, 5, 5, 5 |
|
1767 |
5 |
Policy Gradient with Expected Quadratic Utility Maximization: A New Mean-Variance Approach in Reinforcement Learning |
6, 5, 4 |
|
1768 |
5 |
Learning to Generate the Unknowns for Open-set Domain Adaptation |
5, 5, 5 |
|
1769 |
5 |
Hybrid Discriminative-Generative Training via Contrastive Learning |
6, 6, 5, 3 |
|
1770 |
5 |
The Logical Options Framework |
4, 6, 6, 4 |
|
1771 |
5 |
K-PLUG: KNOWLEDGE-INJECTED PRE-TRAINED LANGUAGE MODEL FOR NATURAL LANGUAGE UNDERSTANDING AND GENERATION |
5, 4, 5, 6 |
|
1772 |
5 |
Reinforcement Learning with Latent Flow |
4, 6, 3, 7 |
|
1773 |
5 |
A Flexible Framework for Discovering Novel Categories with Contrastive Learning |
5, 6, 4, 5, 5 |
|
1774 |
5 |
Adam$^+$: A Stochastic Method with Adaptive Variance Reduction |
5, 6, 5, 4 |
|
1775 |
5 |
SIM-GAN: Adversarial Calibration of Multi-Agent Market Simulators |
5, 7, 3 |
|
1776 |
5 |
Rethinking the Trigger of Backdoor Attack |
5, 5, 5 |
|
1777 |
5 |
AN ONLINE SEQUENTIAL TEST FOR QUALITATIVE TREATMENT EFFECTS |
4, 3, 7, 6 |
|
1778 |
5 |
TaskSet: A Dataset of Optimization Tasks |
5, 5, 7, 3 |
|
1779 |
5 |
Learning Deeply Shared Filter Bases for Efficient ConvNets |
4, 6, 5, 5 |
|
1780 |
5 |
Demystifying Learning of Unsupervised Neural Machine Translation |
5, 4, 6, 5 |
|
1781 |
5 |
Discriminative Cross-Modal Data Augmentation for Medical Imaging Applications |
6, 5, 4, 5 |
|
1782 |
5 |
CIGMO: Learning categorical invariant deep generative models from grouped data |
4, 7, 5, 4 |
|
1783 |
5 |
PanRep: Universal node embeddings for heterogeneous graphs |
4, 6, 5, 5 |
|
1784 |
5 |
Learning Discrete Adaptive Receptive Fields for Graph Convolutional Networks |
5, 5, 5, 5 |
|
1785 |
5 |
Towards Learning to Remember in Meta Learning of Sequential Domains |
4, 5, 6, 5 |
|
1786 |
5 |
Neighbor Class Consistency on Unsupervised Domain Adaptation |
5, 5, 6, 4 |
|
1787 |
5 |
Understanding Classifiers with Generative Models |
5, 6, 4, 5 |
|
1788 |
5 |
Provable Robustness by Geometric Regularization of ReLU Networks |
5, 6, 4 |
|
1789 |
5 |
Evaluating Robustness of Predictive Uncertainty Estimation: Are Dirichlet-based Models Reliable? |
6, 2, 7, 5 |
|
1790 |
5 |
Encoded Prior Sliced Wasserstein AutoEncoder for learning latent manifold representations |
5, 5, 5 |
|
1791 |
5 |
Bayesian Learning to Optimize: Quantifying the Optimizer Uncertainty |
5, 6, 4 |
|
1792 |
5 |
Adversarial Privacy Preservation in MRI Scans of the Brain |
3, 6, 3, 6, 7 |
|
1793 |
5 |
Learning Aggregation Functions |
6, 3, 6, 5 |
|
1794 |
5 |
ProxylessKD: Direct Knowledge Distillation with inherited classifier for face Recognition |
6, 4, 5 |
|
1795 |
5 |
Length-Adaptive Transformer: Train Once with Length Drop, Use Anytime with Search |
6, 4, 5, 5 |
|
1796 |
5 |
Gradient Descent Ascent for Min-Max Problems on Riemannian Manifold |
7, 4, 4, 5 |
|
1797 |
5 |
Preventing Value Function Collapse in Ensemble Q-Learning by Maximizing Representation Diversity |
6, 5, 5, 4 |
|
1798 |
5 |
Neural spatio-temporal reasoning with object-centric self-supervised learning |
6, 4, 5, 5 |
|
1799 |
5 |
NNGeometry: Easy and Fast Fisher Information Matrices and Neural Tangent Kernels in PyTorch |
4, 7, 4, 5 |
|
1800 |
5 |
Ordering-Based Causal Discovery with Reinforcement Learning |
5, 5, 5, 5 |
|
1801 |
5 |
Model-centric data manifold: the data through the eyes of the model |
5, 4, 6, 5 |
|
1802 |
5 |
Fold2Seq: A Joint Sequence(1D)-Fold(3D) Embedding-based Generative Model for Protein Design |
5, 3, 7, 5 |
|
1803 |
5 |
Zero-Shot Learning with Common Sense Knowledge Graphs |
4, 4, 7 |
|
1804 |
5 |
SSW-GAN: Scalable Stage-wise Training of Video GANs |
7, 3, 6, 3, 6 |
|
1805 |
5 |
Differentiable Graph Optimization for Neural Architecture Search |
4, 6, 5 |
|
1806 |
5 |
Action Guidance: Getting the Best of Sparse Rewards and Shaped Rewards for Real-time Strategy Games |
4, 6, 4, 6 |
|
1807 |
5 |
Cortico-cerebellar networks as decoupled neural interfaces |
7, 5, 3 |
|
1808 |
5 |
Mixup Training as the Complexity Reduction |
6, 4, 6, 4 |
|
1809 |
5 |
Combining Imitation and Reinforcement Learning with Free Energy Principle |
5, 5, 6, 4 |
|
1810 |
5 |
ON NEURAL NETWORK GENERALIZATION VIA PROMOTING WITHIN-LAYER ACTIVATION DIVERSITY |
6, 6, 5, 3 |
|
1811 |
5 |
On the Landscape of Sparse Linear Networks |
5, 4, 7, 4 |
|
1812 |
5 |
Consistent Instance Classification for Unsupervised Representation Learning |
5, 5, 5 |
|
1813 |
5 |
Neural Cellular Automata Manifold |
4, 4, 7, 5 |
|
1814 |
5 |
AWAC: Accelerating Online Reinforcement Learning with Offline Datasets |
4, 6, 6, 3, 6 |
|
1815 |
5 |
Big GANs Are Watching You: Towards Unsupervised Object Segmentation with Off-the-Shelf Generative Models |
4, 5, 6, 5 |
|
1816 |
5 |
AutoHAS: Efficient Hyperparameter and Architecture Search |
4, 6, 5, 5 |
|
1817 |
5 |
ATOM3D: Tasks On Molecules in Three Dimensions |
5, 6, 4 |
|
1818 |
5 |
Tight Second-Order Certificates for Randomized Smoothing |
5, 4, 6 |
|
1819 |
5 |
Gradient penalty from a maximum margin perspective |
6, 5, 4, 5 |
|
1820 |
5 |
How to Train Your Super-Net: An Analysis of Training Heuristics in Weight-Sharing NAS |
5, 5, 5, 5 |
|
1821 |
5 |
Attention-driven Robotic Manipulation |
4, 4, 7 |
|
1822 |
5 |
Analogical Reasoning for Visually Grounded Compositional Generalization |
7, 5, 3 |
|
1823 |
5 |
Continual learning using hash-routed convolutional neural networks |
4, 6, 4, 6 |
|
1824 |
5 |
Topic-aware Contextualized Transformers |
7, 4, 4 |
|
1825 |
5 |
Fast Predictive Uncertainty for Classification with Bayesian Deep Networks |
5, 5, 6, 4 |
|
1826 |
5 |
Asynchronous Modeling: A Dual-phase Perspective for Long-Tailed Recognition |
3, 6, 5, 6 |
|
1827 |
5 |
Learning Representations by Contrasting Clusters While Bootstrapping Instances |
5, 6, 4 |
|
1828 |
5 |
The shape and simplicity biases of adversarially robust ImageNet-trained CNNs |
3, 5, 6, 6 |
|
1829 |
5 |
Explore with Dynamic Map: Graph Structured Reinforcement Learning |
5, 6, 5, 4 |
|
1830 |
5 |
Universal Value Density Estimation for Imitation Learning and Goal-Conditioned Reinforcement Learning |
6, 4, 5, 5 |
|
1831 |
5 |
Action Concept Grounding Network for Semantically-Consistent Video Generation |
5, 5, 5 |
|
1832 |
5 |
Efficient Competitive Self-Play Policy Optimization |
5, 3, 5, 7 |
|
1833 |
5 |
Learning Binary Trees via Sparse Relaxation |
6, 3, 7, 4 |
|
1834 |
5 |
Interpretable Super-Resolution via a Learned Time-Series Representation |
4, 6, 4, 6 |
|
1835 |
5 |
GraphLog: A Benchmark for Measuring Logical Generalization in Graph Neural Networks |
5, 6, 4, 5 |
|
1836 |
5 |
HyperReal: Complex-Valued Layer Functions For Complex-Valued Scaling Invariance |
5, 5, 5 |
|
1837 |
5 |
Visualizing High-Dimensional Trajectories on the Loss-Landscape of ANNs |
5, 5, 4, 6 |
|
1838 |
5 |
Fundamental Limits and Tradeoffs in Invariant Representation Learning |
5, 5, 5 |
|
1839 |
5 |
Integrating linguistic knowledge into DNNs: Application to online grooming detection |
5, 6, 4 |
|
1840 |
5 |
Increasing the Coverage and Balance of Robustness Benchmarks by Using Non-Overlapping Corruptions |
5, 6, 5, 4 |
|
1841 |
5 |
Novel Policy Seeking with Constrained Optimization |
4, 6, 4, 6 |
|
1842 |
5 |
Improving Random-Sampling Neural Architecture Search by Evolving the Proxy Search Space |
5, 5, 4, 6 |
|
1843 |
5 |
PANDA - Adapting Pretrained Features for Anomaly Detection |
4, 5, 4, 7 |
|
1844 |
5 |
The Bures Metric for Taming Mode Collapse in Generative Adversarial Networks |
5, 6, 6, 3 |
|
1845 |
5 |
Solving Min-Max Optimization with Hidden Structure via Gradient Descent Ascent |
5, 5, 6, 4 |
|
1846 |
5 |
Weakly-Supervised Amodal Instance Segmentation with Compositional Priors |
5, 6, 5, 5, 4 |
|
1847 |
5 |
Sparse matrix products for neural network compression |
7, 5, 4, 4 |
|
1848 |
5 |
Asynchronous Edge Learning using Cloned Knowledge Distillation |
4, 3, 8 |
|
1849 |
5 |
A Strong On-Policy Competitor To PPO |
5, 5, 5 |
|
1850 |
5 |
Improving the Unsupervised Disentangled Representation Learning with VAE Ensemble |
7, 5, 3 |
|
1851 |
5 |
Function Contrastive Learning of Transferable Representations |
5, 5, 5, 5 |
|
1852 |
5 |
Improving Machine Translation by Searching Skip Connections Efficiently |
6, 3, 7, 4 |
|
1853 |
5 |
Misclassification Detection via Class Augmentation |
3, 5, 7, 5 |
|
1854 |
5 |
Uniform Manifold Approximation with Two-phase Optimization |
4, 5, 5, 6 |
|
1855 |
5 |
LLBoost: Last Layer Perturbation to Boost Pre-trained Neural Networks |
4, 6, 5 |
|
1856 |
5 |
Robust Meta-learning with Noise via Eigen-Reptile |
6, 5, 4, 5 |
|
1857 |
5 |
GOLD-NAS: Gradual, One-Level, Differentiable |
6, 5, 4, 5 |
|
1858 |
5 |
Counterfactual Self-Training |
5, 6, 4 |
|
1859 |
5 |
A General Family of Stochastic Proximal Gradient Methods for Deep Learning |
5, 6, 5, 4 |
|
1860 |
5 |
Dynamic Feature Selection for Efficient and Interpretable Human Activity Recognition |
9, 4, 3, 4 |
|
1861 |
5 |
Optimizing Information Bottleneck in Reinforcement Learning: A Stein Variational Approach |
5, 5, 4, 6 |
|
1862 |
5 |
Gradient-based tuning of Hamiltonian Monte Carlo hyperparameters |
5, 6, 4, 5 |
|
1863 |
5 |
Oblivious Sketching-based Central Path Method for Solving Linear Programming Problems |
7, 4, 5, 4 |
|
1864 |
5 |
AggMask: Exploring locally aggregated learning of mask representations for instance segmentation |
6, 4, 6, 4 |
|
1865 |
5 |
IALE: Imitating Active Learner Ensembles |
5, 6, 4 |
|
1866 |
5 |
Deep Learning Solution of the Eigenvalue Problem for Differential Operators |
9, 4, 4, 3 |
|
1867 |
5 |
Do Transformers Understand Polynomial Simplification? |
4, 4, 6, 6 |
|
1868 |
5 |
Rethinking Uncertainty in Deep Learning: Whether and How it Improves Robustness |
5, 5, 6, 4 |
|
1869 |
5 |
Connection- and Node-Sparse Deep Learning: Statistical Guarantees |
6, 4, 5 |
|
1870 |
5 |
ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution |
6, 5, 4, 5 |
|
1871 |
5 |
D4RL: Datasets for Deep Data-Driven Reinforcement Learning |
6, 6, 6, 2 |
|
1872 |
5 |
Correcting Momentum in Temporal Difference Learning |
4, 6, 6, 4 |
|
1873 |
5 |
Out-of-Distribution Generalization Analysis via Influence Function |
7, 4, 4, 5 |
|
1874 |
5 |
Transferring Inductive Biases through Knowledge Distillation |
5, 3, 7, 5 |
|
1875 |
5 |
Attention Based Joint Learning for Supervised Premature Ventricular Contraction Differentiation with Unsupervised Abnormal Beat Segmentation |
5, 6, 5, 4 |
|
1876 |
5 |
BDS-GCN: Efficient Full-Graph Training of Graph Convolutional Nets with Partition-Parallelism and Boundary Sampling |
6, 6, 4, 4 |
|
1877 |
5 |
Counterfactual Fairness through Data Preprocessing |
4, 5, 6 |
|
1878 |
5 |
Learned Belief Search: Efficiently Improving Policies in Partially Observable Settings |
5, 5, 5, 5, 5 |
|
1879 |
5 |
Pareto-Frontier-aware Neural Architecture Search |
5, 5, 4, 6 |
|
1880 |
5 |
First-Order Optimization Algorithms via Discretization of Finite-Time Convergent Flows |
4, 6, 4, 6 |
|
1881 |
5 |
Semantically-Adaptive Upsampling for Layout-to-Image Translation |
4, 6, 5, 5 |
|
1882 |
5 |
Temporal and Object Quantification Nets |
6, 3, 6 |
|
1883 |
5 |
Approximation Algorithms for Sparse Principal Component Analysis |
4, 5, 4, 7 |
|
1884 |
5 |
Self-Activating Neural Ensembles for Continual Reinforcement Learning |
6, 4, 5, 5 |
|
1885 |
5 |
Causal Probabilistic Spatio-temporal Fusion Transformers in Two-sided Ride-Hailing Markets |
6, 6, 6, 2 |
|
1886 |
5 |
Co-complexity: An Extended Perspective on Generalization Error |
4, 7, 5, 4 |
|
1887 |
5 |
Efficiently Troubleshooting Image Segmentation Models with Human-In-The-Loop |
4, 3, 8 |
|
1888 |
5 |
Deep $k$-NN Label Smoothing Improves Reproducibility of Neural Network Predictions |
5, 5, 7, 3 |
|
1889 |
5 |
Evaluating representations by the complexity of learning low-loss predictors |
4, 4, 7 |
|
1890 |
5 |
Local Clustering Graph Neural Networks |
5, 6, 5, 4 |
|
1891 |
5 |
All-You-Can-Fit 8-Bit Flexible Floating-Point Format for Accurate and Memory-Efficient Inference of Deep Neural Networks |
6, 7, 3, 4 |
|
1892 |
5 |
Dynamically Stable Infinite-Width Limits of Neural Classifiers |
7, 5, 5, 3 |
|
1893 |
5 |
D2RL: Deep Dense Architectures in Reinforcement Learning |
4, 8, 4, 4 |
|
1894 |
5 |
Distantly Supervised Relation Extraction in Federated Settings |
5, 4, 6, 5, 5 |
|
1895 |
5 |
Wasserstein Distributional Normalization |
4, 4, 6, 6, 5 |
|
1896 |
5 |
Model Compression via Hyper-Structure Network |
5, 5, 4, 6 |
|
1897 |
5 |
Secure Network Release with Link Privacy |
6, 5, 3, 6 |
|
1898 |
5 |
Are wider nets better given the same number of parameters? |
6, 5, 4 |
|
1899 |
5 |
MixSize: Training Convnets With Mixed Image Sizes for Improved Accuracy, Speed and Scale Resiliency |
5, 5, 5, 5 |
|
1900 |
5 |
Rethinking Content and Style: Exploring Bias for Unsupervised Disentanglement |
4, 4, 7 |
|
1901 |
5 |
Interpretable Relational Representations for Food Ingredient Recommendation Systems |
5, 7, 5, 3 |
|
1902 |
5 |
Graph Information Bottleneck for Subgraph Recognition |
2, 8, 3, 7 |
|
1903 |
5 |
Perturbation Type Categorization for Multiple $\ell_p$ Bounded Adversarial Robustness |
4, 6, 6, 4 |
|
1904 |
5 |
Disentangled cyclic reconstruction for domain adaptation |
4, 6, 5 |
|
1905 |
5 |
Continual Invariant Risk Minimization |
6, 6, 5, 3 |
|
1906 |
5 |
Estimating Treatment Effects via Orthogonal Regularization |
5, 3, 5, 7 |
|
1907 |
5 |
A Unified Paths Perspective for Pruning at Initialization |
6, 6, 4, 4 |
|
1908 |
5 |
CLOPS: Continual Learning of Physiological Signals |
4, 3, 7, 6 |
|
1909 |
5 |
Good for Misconceived Reasons: Revisiting Neural Multimodal Machine Translation |
4, 5, 5, 6 |
|
1910 |
5 |
On the Marginal Regret Bound Minimization of Adaptive Methods |
3, 5, 4, 5, 8 |
|
1911 |
5 |
Semi-supervised regression with skewed data via adversarially forcing the distribution of predicted values |
5, 5, 4, 6 |
|
1912 |
5 |
What About Taking Policy as Input of Value Function: Policy-extended Value Function Approximator |
3, 5, 5, 7 |
|
1913 |
5 |
A Unified View on Graph Neural Networks as Graph Signal Denoising |
6, 3, 6, 3, 7 |
|
1914 |
5 |
Measuring and mitigating interference in reinforcement learning |
5, 4, 6, 5 |
|
1915 |
5 |
CLOCS: Contrastive Learning of Cardiac Signals Across Space, Time, and Patients |
5, 7, 4, 4 |
|
1916 |
5 |
Improved Denoising Diffusion Probabilistic Models |
5, 5, 5, 5 |
|
1917 |
5 |
Targeted VAE: Structured Inference and Targeted Learning for Causal Parameter Estimation |
5, 6, 3, 6 |
|
1918 |
5 |
Improving Neural Network Accuracy and Calibration Under Distributional Shift with Prior Augmented Data |
6, 3, 5, 6 |
|
1919 |
5 |
Towards Multi-Sense Cross-Lingual Alignment of Contextual Embeddings |
6, 4, 5, 5 |
|
1920 |
5 |
PHEW: Paths with Higher Edge-Weights give “winning tickets” without training data |
5, 5, 3, 5, 7 |
|
1921 |
5 |
A Unifying Perspective on Neighbor Embeddings along the Attraction-Repulsion Spectrum |
6, 4, 5, 5 |
|
1922 |
5 |
GSdyn: Learning training dynamics via online Gaussian optimization with gradient states |
6, 6, 5, 3 |
|
1923 |
5 |
SEMI: Self-supervised Exploration via Multisensory Incongruity |
5, 4, 4, 7 |
|
1924 |
5 |
Second-Moment Loss: A Novel Regression Objective for Improved Uncertainties |
6, 4, 5 |
|
1925 |
5 |
Generative Adversarial Neural Architecture Search with Importance Sampling |
6, 5, 5, 4 |
|
1926 |
5 |
On Dropout, Overfitting, and Interaction Effects in Deep Neural Networks |
4, 7, 4 |
|
1927 |
5 |
Localized Meta-Learning: A PAC-Bayes Analysis for Meta-Learning Beyond Global Prior |
4, 6, 5, 5 |
|
1928 |
5 |
BiGCN: A Bi-directional Low-Pass Filtering Graph Neural Network |
5, 5, 6, 4 |
|
1929 |
5 |
NAHAS: Neural Architecture and Hardware Accelerator Search |
5, 5, 4, 6 |
|
1930 |
5 |
Least Probable Disagreement Region for Active Learning |
4, 7, 4, 5 |
|
1931 |
5 |
F^2ed-Learning: Good Fences Make Good Neighbors |
5, 6, 5, 4 |
|
1932 |
5 |
Guarantees for Tuning the Step Size using a Learning-to-Learn Approach |
4, 4, 4, 8 |
|
1933 |
5 |
Collaborative Normalization for Unsupervised Domain Adaptation |
5, 6, 4 |
|
1934 |
5 |
One Vertex Attack on Graph Neural Networks-based Spatiotemporal Forecasting |
4, 8, 4, 4 |
|
1935 |
5 |
A Simple Unified Information Regularization Framework for Multi-Source Domain Adaptation |
4, 5, 7, 4 |
|
1936 |
5 |
An Open Review of OpenReview: A Critical Analysis of the Machine Learning Conference Review Process |
5, 6, 3, 6 |
|
1937 |
5 |
Decentralized Deterministic Multi-Agent Reinforcement Learning |
5, 5, 6, 4, 5 |
|
1938 |
5 |
Improving Sampling Accuracy of Stochastic Gradient MCMC Methods via Non-uniform Subsampling of Gradients |
5, 4, 6 |
|
1939 |
5 |
Self-Reflective Variational Autoencoder |
5, 3, 7 |
|
1940 |
5 |
Bridging Graph Network to Lifelong Learning with Feature Interaction |
5, 5, 6, 4 |
|
1941 |
5 |
Quantum Deformed Neural Networks |
6, 4, 4, 5, 6 |
|
1942 |
5 |
Deepening Hidden Representations from Pre-trained Language Models |
6, 5, 4 |
|
1943 |
5 |
MixCon: Adjusting the Separability of Data Representations for Harder Data Recovery |
5, 5, 5 |
|
1944 |
5 |
Meta-Reinforcement Learning Robust to Distributional Shift via Model Identification and Experience Relabeling |
5, 4, 5, 6 |
|
1945 |
5 |
On Trade-offs of Image Prediction in Visual Model-Based Reinforcement Learning |
7, 6, 3, 4 |
|
1946 |
5 |
Searching towards Class-Aware Generators for Conditional Generative Adversarial Networks |
5, 5, 5, 5, 5 |
|
1947 |
5 |
Entropic Risk-Sensitive Reinforcement Learning: A Meta Regret Framework with Function Approximation |
5, 4, 5, 6 |
|
1948 |
5 |
Auto-view contrastive learning for few-shot image recognition |
4, 4, 7, 5 |
|
1949 |
5 |
Temporal Difference Networks for Action Recognition |
4, 6, 5 |
|
1950 |
5 |
Predicting the Outputs of Finite Networks Trained with Noisy Gradients |
5, 5, 6, 4 |
|
1951 |
5 |
Training Federated GANs with Theoretical Guarantees: A Universal Aggregation Approach |
3, 6, 5, 6 |
|
1952 |
5 |
Ensembles of Generative Adversarial Networks for Disconnected Data |
4, 7, 5, 4 |
|
1953 |
5 |
Temperature check: theory and practice for training models with softmax-cross-entropy losses |
6, 5, 6, 3 |
|
1954 |
5 |
CorDial: Coarse-to-fine Abstractive Dialogue Summarization with Controllable Granularity |
6, 5, 5, 4 |
|
1955 |
5 |
Prior-guided Bayesian Optimization |
3, 8, 4, 4, 6 |
|
1956 |
5 |
Multi-Source Unsupervised Hyperparameter Optimization |
3, 6, 6, 5 |
|
1957 |
5 |
Enforcing Predictive Invariance across Structured Biomedical Domains |
5, 5, 4, 6 |
|
1958 |
5 |
PLM: Partial Label Masking for Imbalanced Multi-label Classification |
5, 6, 4 |
|
1959 |
5 |
Can Students Outperform Teachers in Knowledge Distillation based Model Compression? |
5, 3, 6, 6 |
|
1960 |
5 |
Differentiable Approximations for Multi-resource Spatial Coverage Problems |
4, 6, 4, 6 |
|
1961 |
5 |
Learning to Learn with Smooth Regularization |
6, 5, 5, 4 |
|
1962 |
5 |
AriEL: Volume Coding for Sentence Generation Comparisons |
6, 7, 5, 4, 3 |
|
1963 |
5 |
R-MONet: Region-Based Unsupervised Scene Decomposition and Representation via Consistency of Object Representations |
3, 6, 6 |
|
1964 |
5 |
Towards Robust and Efficient Contrastive Textual Representation Learning |
5, 3, 6, 6 |
|
1965 |
5 |
MetaPhys: Unsupervised Few-Shot Adaptation for Non-Contact Physiological Measurement |
6, 5, 4 |
|
1966 |
5 |
Uncovering the impact of learning rate for global magnitude pruning |
5, 4, 7, 4 |
|
1967 |
5 |
Mixture of Step Returns in Bootstrapped DQN |
5, 7, 4, 4, 5 |
|
1968 |
5 |
Boosting One-Point Derivative-Free Online Optimization via Residual Feedback |
4, 4, 8, 4 |
|
1969 |
5 |
Beyond Prioritized Replay: Sampling States in Model-Based RL via Simulated Priorities |
6, 4, 5 |
|
1970 |
5 |
Graph Autoencoders with Deconvolutional Networks |
3, 5, 6, 6 |
|
1971 |
5 |
Adapt-and-Adjust: Overcoming the Long-tail Problem of Multilingual Speech Recognition |
6, 5, 5, 4, 5 |
|
1972 |
5 |
Unsupervised Word Alignment via Cross-Lingual Contrastive Learning |
6, 4, 5, 5 |
|
1973 |
5 |
Playing Nondeterministic Games through Planning with a Learned Model |
3, 4, 6, 5, 7 |
|
1974 |
5 |
On the Certified Robustness for Ensemble Models and Beyond |
6, 5, 4, 5 |
|
1975 |
5 |
OpenCoS: Contrastive Semi-supervised Learning for Handling Open-set Unlabeled Data |
7, 4, 5, 4 |
|
1976 |
5 |
LAYER SPARSITY IN NEURAL NETWORKS |
5, 5, 6, 4 |
|
1977 |
5 |
Neural Architecture Search without Training |
5, 5, 4, 6 |
|
1978 |
4.8 |
Better Together: Resnet-50 accuracy with $13\times$ fewer parameters and at $3\times$ speed |
4, 5, 5, 4, 6 |
|
1979 |
4.8 |
AMBERT: A Pre-trained Language Model with Multi-Grained Tokenization |
5, 4, 7, 3, 5 |
|
1980 |
4.8 |
Extrapolatable Relational Reasoning With Comparators in Low-Dimensional Manifolds |
6, 5, 4, 5, 4 |
|
1981 |
4.8 |
PAC-Bayesian Randomized Value Function with Informative Prior |
5, 4, 5, 3, 7 |
|
1982 |
4.8 |
Fairness guarantee in analysis of incomplete data |
5, 4, 5, 4, 6 |
|
1983 |
4.8 |
Prepare for the Worst: Generalizing across Domain Shifts with Adversarial Batch Normalization |
5, 3, 6, 5, 5 |
|
1984 |
4.75 |
Self-Supervised Variational Auto-Encoders |
6, 5, 4, 4 |
|
1985 |
4.75 |
ALFA: Adversarial Feature Augmentation for Enhanced Image Recognition |
6, 4, 4, 5 |
|
1986 |
4.75 |
Are Graph Convolutional Networks Fully Exploiting the Graph Structure? |
4, 5, 6, 4 |
|
1987 |
4.75 |
Mutual Calibration between Explicit and Implicit Deep Generative Models |
5, 6, 3, 5 |
|
1988 |
4.75 |
Effective Training of Sparse Neural Networks under Global Sparsity Constraint |
5, 5, 5, 4 |
|
1989 |
4.75 |
Intragroup sparsity for efficient inference |
4, 5, 4, 6 |
|
1990 |
4.75 |
Learning a Non-Redundant Collection of Classifiers |
6, 5, 4, 4 |
|
1991 |
4.75 |
Grey-box Extraction of Natural Language Models |
5, 7, 3, 4 |
|
1992 |
4.75 |
Wat zei je? Detecting Out-of-Distribution Translations with Variational Transformers |
6, 5, 5, 3 |
|
1993 |
4.75 |
Unifying Regularisation Methods for Continual Learning |
6, 5, 3, 5 |
|
1994 |
4.75 |
On Alignment in Deep Linear Neural Networks |
4, 7, 4, 4 |
|
1995 |
4.75 |
Few-Shot Bayesian Optimization with Deep Kernel Surrogates |
6, 4, 4, 5 |
|
1996 |
4.75 |
f-Domain-Adversarial Learning: Theory and Algorithms for Unsupervised Domain Adaptation with Neural Networks |
5, 5, 4, 5 |
|
1997 |
4.75 |
“Hey, that’s not an ODE”: Faster ODE Adjoints with 12 Lines of Code |
5, 4, 5, 5 |
|
1998 |
4.75 |
Multimodal Variational Autoencoders for Semi-Supervised Learning: In Defense of Product-of-Experts |
6, 4, 4, 5 |
|
1999 |
4.75 |
Adaptive norms for deep learning with regularized Newton methods |
4, 5, 4, 6 |
|
2000 |
4.75 |
Fully Convolutional Approach for Simulating Wave Dynamics |
3, 7, 4, 5 |
|
2001 |
4.75 |
Impact-driven Exploration with Contrastive Unsupervised Representations |
4, 4, 4, 7 |
|
2002 |
4.75 |
Certified Watermarks for Neural Networks |
6, 4, 4, 5 |
|
2003 |
4.75 |
Logit As Auxiliary Weak-supervision for More Reliable and Accurate Prediction |
4, 7, 5, 3 |
|
2004 |
4.75 |
Neural Subgraph Matching |
6, 3, 5, 5 |
|
2005 |
4.75 |
Polynomial Graph Convolutional Networks |
4, 5, 5, 5 |
|
2006 |
4.75 |
Scalable Graph Neural Networks for Heterogeneous Graphs |
5, 5, 3, 6 |
|
2007 |
4.75 |
Neural Ensemble Search for Uncertainty Estimation and Dataset Shift |
5, 4, 4, 6 |
|
2008 |
4.75 |
An Attention Free Transformer |
4, 6, 5, 4 |
|
2009 |
4.75 |
Adaptive Stacked Graph Filter |
5, 5, 5, 4 |
|
2010 |
4.75 |
VilNMN: A Neural Module Network approach to Video-Grounded Language Tasks |
5, 4, 5, 5 |
|
2011 |
4.75 |
A Probabilistic Model for Discriminative and Neuro-Symbolic Semi-Supervised Learning |
3, 4, 5, 7 |
|
2012 |
4.75 |
Understanding Adversarial Attacks on Autoencoders |
7, 3, 5, 4 |
|
2013 |
4.75 |
Towards Understanding Label Smoothing |
6, 6, 1, 6 |
|
2014 |
4.75 |
StructFormer: Joint Unsupervised Induction of Dependency and Constituency Structure from Masked Language Modeling |
5, 6, 4, 4 |
|
2015 |
4.75 |
Motion Forecasting with Unlikelihood Training |
6, 4, 5, 4 |
|
2016 |
4.75 |
Data-efficient Hindsight Off-policy Option Learning |
5, 3, 6, 5 |
|
2017 |
4.75 |
Information distance for neural network functions |
6, 4, 4, 5 |
|
2018 |
4.75 |
Convergence Analysis of Homotopy-SGD for Non-Convex Optimization |
5, 5, 4, 5 |
|
2019 |
4.75 |
Generating unseen complex scenes: are we there yet? |
4, 4, 5, 6 |
|
2020 |
4.75 |
Improving Local Effectiveness for Global Robustness Training |
5, 5, 5, 4 |
|
2021 |
4.75 |
Token-Level Contrast for Video and Language Alignment |
5, 6, 4, 4 |
|
2022 |
4.75 |
Neural Disjunctive Normal Form: Vertically Integrating Logic With Deep Learning For Classification |
4, 4, 5, 6 |
|
2023 |
4.75 |
Connecting Sphere Manifolds Hierarchically for Regularization |
3, 6, 5, 5 |
|
2024 |
4.75 |
Test-Time Adaptation and Adversarial Robustness |
7, 3, 4, 5 |
|
2025 |
4.75 |
Uncertainty Calibration Error: A New Metric for Multi-Class Classification |
4, 6, 4, 5 |
|
2026 |
4.75 |
Parametric Density Estimation with Uncertainty using Deep Ensembles |
5, 5, 4, 5 |
|
2027 |
4.75 |
Model-Free Counterfactual Credit Assignment |
3, 6, 5, 5 |
|
2028 |
4.75 |
Analysing the Update step in Graph Neural Networks via Sparsification |
6, 4, 5, 4 |
|
2029 |
4.75 |
Depth Completion using Plane-Residual Representation |
5, 5, 4, 5 |
|
2030 |
4.75 |
Dynamically locating multiple speakers based on the time-frequency domain |
4, 6, 5, 4 |
|
2031 |
4.75 |
Exchanging Lessons Between Algorithmic Fairness and Domain Generalization |
4, 6, 5, 4 |
|
2032 |
4.75 |
Deep Active Learning for Object Detection with Mixture Density Networks |
3, 6, 5, 5 |
|
2033 |
4.75 |
Fuzzy c-Means Clustering for Persistence Diagrams |
4, 3, 6, 6 |
|
2034 |
4.75 |
NeuralLog: a Neural Logic Language |
3, 5, 6, 5 |
|
2035 |
4.75 |
Learning from multiscale wavelet superpixels using GNN with spatially heterogeneous pooling |
7, 5, 2, 5 |
|
2036 |
4.75 |
Class Imbalance in Few-Shot Learning |
5, 4, 5, 5 |
|
2037 |
4.75 |
N-Bref : A High-fidelity Decompiler Exploiting Programming Structures |
3, 7, 5, 4 |
|
2038 |
4.75 |
Sandwich Batch Normalization |
5, 6, 5, 3 |
|
2039 |
4.75 |
Scalable Transformers for Neural Machine Translation |
6, 5, 4, 4 |
|
2040 |
4.75 |
Information Transfer in Multi-Task Learning |
4, 4, 5, 6 |
|
2041 |
4.75 |
Batch Normalization Increases Adversarial Vulnerability: Disentangling Usefulness and Robustness of Model Features |
6, 5, 4, 4 |
|
2042 |
4.75 |
Failure Modes of Variational Autoencoders and Their Effects on Downstream Tasks |
5, 5, 5, 4 |
|
2043 |
4.75 |
Causal Curiosity: RL Agents Discovering Self-supervised Experiments for Causal Representation Learning |
3, 5, 6, 5 |
|
2044 |
4.75 |
Learning Spatiotemporal Features via Video and Text Pair Discrimination |
4, 5, 4, 6 |
|
2045 |
4.75 |
High-Likelihood Area Matters — Rewarding Near-Correct Predictions Under Imbalanced Distributions |
4, 5, 5, 5 |
|
2046 |
4.75 |
Exploiting Verified Neural Networks via Floating Point Numerical Error |
4, 4, 8, 3 |
|
2047 |
4.75 |
Dropout’s Dream Land: Generalization from Learned Simulators to Reality |
3, 6, 4, 6 |
|
2048 |
4.75 |
Weights Having Stable Signs Are Important: Finding Primary Subnetworks and Kernels to Compress Binary Weight Networks |
5, 5, 3, 6 |
|
2049 |
4.75 |
On the Role of Pre-training for Meta Few-Shot Learning |
7, 4, 5, 3 |
|
2050 |
4.75 |
Communication-Efficient Sampling for Distributed Training of Graph Convolutional Networks |
5, 6, 4, 4 |
|
2051 |
4.75 |
Poisoned classifiers are not only backdoored, they are fundamentally broken |
7, 5, 5, 2 |
|
2052 |
4.75 |
Learning and Generalization in Univariate Overparameterized Normalizing Flows |
6, 4, 4, 5 |
|
2053 |
4.75 |
Searching for Convolutions and a More Ambitious NAS |
5, 5, 5, 4 |
|
2054 |
4.75 |
GANMEX: Class-Targeted One-vs-One Attributions using GAN-based Model Explainability |
5, 5, 5, 4 |
|
2055 |
4.75 |
Unifying Graph Convolutional Neural Networks and Label Propagation |
5, 3, 5, 6 |
|
2056 |
4.75 |
Dual Contradistinctive Generative Autoencoder |
5, 6, 5, 3 |
|
2057 |
4.75 |
Diffeomorphic Spatial Transformer Networks |
5, 6, 3, 5 |
|
2058 |
4.75 |
Few-shot Adaptation of Generative Adversarial Networks |
4, 7, 3, 5 |
|
2059 |
4.75 |
Intelligent Matrix Exponentiation |
5, 5, 5, 4 |
|
2060 |
4.75 |
A Simple Sparse Denoising Layer for Robust Deep Learning |
5, 4, 5, 5 |
|
2061 |
4.75 |
Layer-wise Adversarial Defense: An ODE Perspective |
4, 5, 5, 5 |
|
2062 |
4.75 |
SGD on Neural Networks learns Robust Features before Non-Robust |
5, 4, 5, 5 |
|
2063 |
4.75 |
A frequency domain analysis of gradient-based adversarial examples |
7, 5, 4, 3 |
|
2064 |
4.75 |
Backdoor Attacks to Graph Neural Networks |
4, 5, 5, 5 |
|
2065 |
4.75 |
Learning to Actively Learn: A Robust Approach |
7, 4, 3, 5 |
|
2066 |
4.75 |
ReaPER: Improving Sample Efficiency in Model-Based Latent Imagination |
4, 5, 6, 4 |
|
2067 |
4.75 |
Visual Imitation with Reinforcement Learning using Recurrent Siamese Networks |
6, 5, 4, 4 |
|
2068 |
4.75 |
GraphCGAN: Convolutional Graph Neural Network with Generative Adversarial Networks |
4, 5, 5, 5 |
|
2069 |
4.75 |
Robust Ensembles of Neural Networks using Itô Processes |
7, 6, 5, 1 |
|
2070 |
4.75 |
Unsupervised Hierarchical Concept Learning |
5, 6, 4, 4 |
|
2071 |
4.75 |
Wasserstein diffusion on graphs with missing attributes |
4, 3, 5, 7 |
|
2072 |
4.75 |
Dissecting Hessian: Understanding Common Structure of Hessian in Neural Networks |
4, 4, 7, 4 |
|
2073 |
4.75 |
DEEP ADAPTIVE SEMANTIC LOGIC (DASL): COMPILING DECLARATIVE KNOWLEDGE INTO DEEP NEURAL NETWORKS |
5, 3, 6, 5 |
|
2074 |
4.75 |
Explore the Potential of CNN Low Bit Training |
5, 4, 4, 6 |
|
2075 |
4.75 |
Efficient Model Performance Estimation via Feature Histories |
5, 4, 6, 4 |
|
2076 |
4.75 |
Adversarial Feature Desensitization |
4, 5, 6, 4 |
|
2077 |
4.75 |
Ensemble-based Adversarial Defense Using Diversified Distance Mapping |
5, 5, 5, 4 |
|
2078 |
4.75 |
Generating Landmark Navigation Instructions from Maps as a Graph-to-Text Problem |
5, 6, 5, 3 |
|
2079 |
4.75 |
Alpha Net: Adaptation with Composition in Classifier Space |
4, 4, 8, 3 |
|
2080 |
4.75 |
Delay-Tolerant Local SGD for Efficient Distributed Training |
5, 5, 5, 4 |
|
2081 |
4.75 |
Learn Robust Features via Orthogonal Multi-Path |
4, 5, 5, 5 |
|
2082 |
4.75 |
Practical Locally Private Federated Learning with Communication Efficiency |
5, 3, 6, 5 |
|
2083 |
4.75 |
Dependency Structure Discovery from Interventions |
4, 5, 6, 4 |
|
2084 |
4.75 |
PURE: An Uncertainty-aware Recommendation Framework for Maximizing Expected Posterior Utility of Platform |
6, 4, 4, 5 |
|
2085 |
4.75 |
Exploiting structured data for learning contagious diseases under incomplete testing |
7, 5, 4, 3 |
|
2086 |
4.75 |
Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition |
3, 5, 5, 6 |
|
2087 |
4.75 |
A Simple and Effective Baseline for Out-of-Distribution Detection using Abstention |
6, 4, 5, 4 |
|
2088 |
4.75 |
Sparta: Spatially Attentive and Adversarially Robust Activations |
5, 4, 4, 6 |
|
2089 |
4.75 |
Systematic Evaluation of Causal Discovery in Visual Model Based Reinforcement Learning |
5, 4, 4, 6 |
|
2090 |
4.75 |
Improved Techniques for Model Inversion Attacks |
6, 5, 4, 4 |
|
2091 |
4.75 |
Regioned Episodic Reinforcement Learning |
4, 5, 5, 5 |
|
2092 |
4.75 |
Graph Adversarial Networks: Protecting Information against Adversarial Attacks |
5, 5, 4, 5 |
|
2093 |
4.75 |
SBEVNet: End-to-End Deep Stereo Layout Estimation |
3, 5, 6, 5 |
|
2094 |
4.75 |
Fast and Differentiable Matrix Inverse and Its Extension to SVD |
5, 6, 3, 5 |
|
2095 |
4.75 |
Batch Normalization Embeddings for Deep Domain Generalization |
4, 5, 4, 6 |
|
2096 |
4.75 |
AFINets: Attentive Feature Integration Networks for Image Classification |
6, 4, 3, 6 |
|
2097 |
4.75 |
Differentiable Optimization of Generalized Nondecomposable Functions using Linear Programs |
5, 5, 6, 3 |
|
2098 |
4.75 |
Towards Data Distillation for End-to-end Spoken Conversational Question Answering |
5, 5, 5, 4 |
|
2099 |
4.75 |
Self-supervised Temporal Learning |
5, 4, 6, 4 |
|
2100 |
4.75 |
Class Balancing GAN with a Classifier in the Loop |
5, 5, 5, 4 |
|
2101 |
4.75 |
Stabilizing DARTS with Amended Gradient Estimation on Architectural Parameters |
4, 5, 4, 6 |
|
2102 |
4.75 |
Paired Examples as Indirect Supervision in Latent Decision Models |
6, 4, 5, 4 |
|
2103 |
4.75 |
TRIP: Refining Image-to-Image Translation via Rival Preferences |
5, 6, 4, 4 |
|
2104 |
4.75 |
Meta Gradient Boosting Neural Networks |
4, 5, 6, 4 |
|
2105 |
4.75 |
Why is Attention Not So Interpretable? |
4, 3, 7, 5 |
|
2106 |
4.75 |
A Truly Constant-time Distribution-aware Negative Sampling |
4, 3, 7, 5 |
|
2107 |
4.75 |
SHOT IN THE DARK: FEW-SHOT LEARNING WITH NO BASE-CLASS LABELS |
4, 4, 5, 6 |
|
2108 |
4.75 |
Data Augmentation for Meta-Learning |
5, 5, 6, 3 |
|
2109 |
4.75 |
Probabilistic Mixture-of-Experts for Efficient Deep Reinforcement Learning |
6, 3, 6, 4 |
|
2110 |
4.75 |
Incremental Learning on Growing Graphs |
3, 7, 5, 4 |
|
2111 |
4.75 |
DAG-GPs: Learning Directed Acyclic Graph Structure For Multi-Output Gaussian Processes |
5, 5, 5, 4 |
|
2112 |
4.75 |
Deep Q-Learning with Low Switching Cost |
4, 5, 5, 5 |
|
2113 |
4.75 |
A Unified Spectral Sparsification Framework for Directed Graphs |
7, 4, 5, 3 |
|
2114 |
4.75 |
Median DC for Sign Recovery: Privacy can be Achieved by Deterministic Algorithms |
4, 7, 4, 4 |
|
2115 |
4.75 |
Robust Memory Augmentation by Constrained Latent Imagination |
5, 4, 7, 3 |
|
2116 |
4.75 |
Differential-Critic GAN: Generating What You Want by a Cue of Preferences |
5, 5, 5, 4 |
|
2117 |
4.75 |
Diversity Augmented Conditional Generative Adversarial Network for Enhanced Multimodal Image-to-Image Translation |
5, 5, 4, 5 |
|
2118 |
4.75 |
Data-aware Low-Rank Compression for Large NLP Models |
3, 5, 5, 6 |
|
2119 |
4.75 |
Hidden Incentives for Auto-Induced Distributional Shift |
4, 6, 5, 4 |
|
2120 |
4.75 |
Log representation as an interface for log processing applications |
7, 4, 5, 3 |
|
2121 |
4.75 |
Towards certifying $\ell_\infty$ robustness using Neural networks with $\ell_\infty$-dist Neurons |
5, 4, 6, 4 |
|
2122 |
4.75 |
OT-LLP: Optimal Transport for Learning from Label Proportions |
4, 5, 5, 5 |
|
2123 |
4.75 |
Robust Federated Learning for Neural Networks |
4, 6, 5, 4 |
|
2124 |
4.75 |
Learning to Use Future Information in Simultaneous Translation |
5, 4, 5, 5 |
|
2125 |
4.75 |
DeeperGCN: Training Deeper GCNs with Generalized Aggregation Functions |
5, 4, 4, 6 |
|
2126 |
4.75 |
AutoBayes: Automated Bayesian Graph Exploration for Nuisance-Robust Inference |
5, 5, 5, 4 |
|
2127 |
4.75 |
Uncertainty Quantification for Bayesian Optimization |
5, 4, 5, 5 |
|
2128 |
4.75 |
DiffAutoML: Differentiable Joint Optimization for Efficient End-to-End Automated Machine Learning |
6, 4, 4, 5 |
|
2129 |
4.75 |
Relevance Attack on Detectors |
6, 4, 5, 4 |
|
2130 |
4.75 |
Practical Phase Retrieval: Low-Photon Holography with Untrained Priors |
3, 4, 7, 5 |
|
2131 |
4.75 |
MDP Playground: Controlling Dimensions of Hardness in Reinforcement Learning |
6, 4, 5, 4 |
|
2132 |
4.75 |
Slice, Dice, and Optimize: Measuring the Dimension of Neural Network Class Manifolds |
6, 4, 4, 5 |
|
2133 |
4.75 |
Dream and Search to Control: Latent Space Planning for Continuous Control |
4, 6, 4, 5 |
|
2134 |
4.75 |
Inner Ensemble Networks: Average Ensemble as an Effective Regularizer |
3, 7, 5, 4 |
|
2135 |
4.75 |
Time Series Counterfactual Inference with Hidden Confounders |
5, 5, 4, 5 |
|
2136 |
4.75 |
A StyleMap-Based Generator for Real-Time Image Projection and Local Editing |
5, 5, 6, 3 |
|
2137 |
4.75 |
Testing Robustness Against Unforeseen Adversaries |
5, 5, 5, 4 |
|
2138 |
4.75 |
Cluster-Former: Clustering-based Sparse Transformer for Question Answering |
6, 2, 5, 6 |
|
2139 |
4.75 |
SHADOWCAST: Controllable Graph Generation with Explainability |
4, 5, 5, 5 |
|
2140 |
4.75 |
You Only Sample (Almost) Once: Linear Cost Self-Attention Via Bernoulli Sampling |
5, 6, 6, 2 |
|
2141 |
4.75 |
Adaptive Hierarchical Hyper-gradient Descent |
5, 4, 5, 5 |
|
2142 |
4.75 |
Latent Space Semi-Supervised Time Series Data Clustering |
4, 5, 6, 4 |
|
2143 |
4.75 |
Generalizing Complex/Hyper-complex Convolutions to Vector Map Convolutions |
6, 4, 4, 5 |
|
2144 |
4.75 |
FSPN: A New Class of Probabilistic Graphical Model |
4, 7, 5, 3 |
|
2145 |
4.75 |
Improved Contrastive Divergence Training of Energy Based Models |
5, 5, 5, 4 |
|
2146 |
4.75 |
Symmetry Control Neural Networks |
4, 5, 5, 5 |
|
2147 |
4.75 |
Semantic Segmentation Based Unsupervised Domain Adaptation via Pseudo-Label Fusion |
5, 4, 4, 6 |
|
2148 |
4.75 |
Safety Aware Reinforcement Learning (SARL) |
3, 6, 6, 4 |
|
2149 |
4.75 |
Semi-supervised counterfactual explanations |
5, 6, 4, 4 |
|
2150 |
4.75 |
Meta-Learned Confidence for Transductive Few-shot Learning |
5, 5, 5, 4 |
|
2151 |
4.75 |
Towards Understanding the Cause of Error in Few-Shot Learning |
6, 5, 4, 4 |
|
2152 |
4.75 |
Pretrain-to-Finetune Adversarial Training via Sample-wise Randomized Smoothing |
4, 5, 6, 4 |
|
2153 |
4.75 |
Bayesian Metric Learning for Robust Training of Deep Models under Noisy Labels |
5, 4, 3, 7 |
|
2154 |
4.75 |
Training Neural Networks with Property-Preserving Parameter Perturbations |
5, 6, 6, 2 |
|
2155 |
4.75 |
DO-GAN: A Double Oracle Framework for Generative Adversarial Networks |
3, 6, 4, 6 |
|
2156 |
4.75 |
Practical Evaluation of Out-of-Distribution Detection Methods for Image Classification |
4, 3, 8, 4 |
|
2157 |
4.75 |
Small Input Noise is Enough to Defend Against Query-based Black-box Attacks |
7, 3, 6, 3 |
|
2158 |
4.75 |
Resurrecting Submodularity for Neural Text Generation |
6, 4, 6, 3 |
|
2159 |
4.75 |
It’s Hard for Neural Networks to Learn the Game of Life |
5, 3, 5, 6 |
|
2160 |
4.75 |
Learning to Observe with Reinforcement Learning |
4, 5, 6, 4 |
|
2161 |
4.75 |
Domain-slot Relationship Modeling using a Pre-trained Language Encoder for Multi-Domain Dialogue State Tracking |
5, 3, 7, 4 |
|
2162 |
4.75 |
GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training |
5, 6, 4, 4 |
|
2163 |
4.75 |
How to Motivate Your Dragon: Teaching Goal-Driven Agents to Speak and Act in Fantasy Worlds |
4, 4, 4, 7 |
|
2164 |
4.75 |
Certified robustness against physically-realizable patch attack via randomized cropping |
5, 5, 4, 5 |
|
2165 |
4.75 |
Reinforcement Learning with Bayesian Classifiers: Efficient Skill Learning from Outcome Examples |
5, 4, 5, 5 |
|
2166 |
4.75 |
Joint Descent: Training and Tuning Simultaneously |
4, 4, 6, 5 |
|
2167 |
4.75 |
Mime: Mimicking Centralized Stochastic Algorithms in Federated Learning |
4, 6, 5, 4 |
|
2168 |
4.75 |
UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning |
5, 6, 3, 5 |
|
2169 |
4.75 |
Human-interpretable model explainability on high-dimensional data |
5, 3, 7, 4 |
|
2170 |
4.75 |
Normalizing Flows for Calibration and Recalibration |
3, 4, 5, 7 |
|
2171 |
4.75 |
One-class Classification Robust to Geometric Transformation |
4, 5, 6, 4 |
|
2172 |
4.75 |
Practical Order Attack in Deep Ranking |
5, 5, 6, 3 |
|
2173 |
4.75 |
Deep Convolution for Irregularly Sampled Temporal Point Clouds |
5, 4, 5, 5 |
|
2174 |
4.67 |
Regression from Upper One-side Labeled Data |
5, 4, 5 |
|
2175 |
4.67 |
Semi-Supervised Speech-Language Joint Pre-Training for Spoken Language Understanding |
5, 5, 4 |
|
2176 |
4.67 |
PCPs: Patient Cardiac Prototypes |
5, 7, 2 |
|
2177 |
4.67 |
Empirical Studies on the Convergence of Feature Spaces in Deep Learning |
6, 5, 3 |
|
2178 |
4.67 |
Optimizing Over All Sequences of Orthogonal Polynomials |
4, 4, 6 |
|
2179 |
4.67 |
Understanding Knowledge Distillation |
4, 6, 4 |
|
2180 |
4.67 |
The Skill-Action Architecture: Learning Abstract Action Embeddings for Reinforcement Learning |
5, 4, 5 |
|
2181 |
4.67 |
Semantic Hashing with Locality Sensitive Embeddings |
4, 6, 4 |
|
2182 |
4.67 |
Rapid Neural Pruning for Novel Datasets with Set-based Task-Adaptive Meta-Pruning |
5, 5, 4 |
|
2183 |
4.67 |
Image Animation with Refined Masking |
5, 4, 5 |
|
2184 |
4.67 |
Parameterized Pseudo-Differential Operators for Graph Convolutional Neural Networks |
5, 5, 4 |
|
2185 |
4.67 |
FedMes: Speeding Up Federated Learning with Multiple Edge Servers |
5, 5, 4 |
|
2186 |
4.67 |
String Theory: Parsed Categoric Encodings with Automunge |
4, 4, 6 |
|
2187 |
4.67 |
Defuse: Debugging Classifiers Through Distilling Unrestricted Adversarial Examples |
4, 6, 4 |
|
2188 |
4.67 |
Neighbourhood Distillation: On the benefits of non end-to-end distillation |
5, 4, 5 |
|
2189 |
4.67 |
Implicit Regularization of SGD via Thermophoresis |
4, 7, 3 |
|
2190 |
4.67 |
MCM-aware Twin-least-square GAN for Hyperspectral Anomaly Detection |
5, 5, 4 |
|
2191 |
4.67 |
A spherical analysis of Adam with Batch Normalization |
5, 4, 5 |
|
2192 |
4.67 |
Catching the Long Tail in Deep Neural Networks |
5, 4, 5 |
|
2193 |
4.67 |
Detection Booster Training: A detection booster training method for improving the accuracy of classifiers. |
4, 6, 4 |
|
2194 |
4.67 |
Density-Based Object Detection: Learning Bounding Boxes without Ground Truth Assignment |
7, 4, 3 |
|
2195 |
4.67 |
Differentially Private Generative Models Through Optimal Transport |
6, 4, 4 |
|
2196 |
4.67 |
Loss Landscape Matters: Training Certifiably Robust Models with Favorable Loss Landscape |
7, 3, 4 |
|
2197 |
4.67 |
Learning Intrinsic Symbolic Rewards in Reinforcement Learning |
5, 4, 5 |
|
2198 |
4.67 |
An information-theoretic framework for learning models of instance-independent label noise |
4, 5, 5 |
|
2199 |
4.67 |
On Sparse Critical Paths of Neural Response |
4, 6, 4 |
|
2200 |
4.67 |
Network-Agnostic Knowledge Transfer from Latent Dataset for Medical Image Segmentation |
7, 4, 3 |
|
2201 |
4.67 |
Revisiting the Train Loss: an Efficient Performance Estimator for Neural Architecture Search |
6, 5, 3 |
|
2202 |
4.67 |
Orthogonal Over-Parameterized Training |
6, 5, 3 |
|
2203 |
4.67 |
DIET-SNN: A Low-Latency Spiking Neural Network with Direct Input Encoding & Leakage and Threshold Optimization |
5, 3, 6 |
|
2204 |
4.67 |
What Preserves the Emergence of Language? |
6, 5, 3 |
|
2205 |
4.67 |
A Probabilistic Approach to Constrained Deep Clustering |
5, 5, 4 |
|
2206 |
4.67 |
EEC: Learning to Encode and Regenerate Images for Continual Learning |
4, 6, 4 |
|
2207 |
4.67 |
Azimuthal Rotational Equivariance in Spherical CNNs |
3, 6, 5 |
|
2208 |
4.67 |
The Scattering Compositional Learner: Discovering Objects, Attributes, Relationships in Analogical Reasoning |
5, 4, 5 |
|
2209 |
4.67 |
LONG-TAIL ZERO AND FEW-SHOT LEARNING VIA CONTRASTIVE PRETRAINING ON AND FOR SMALL DATA |
5, 4, 5 |
|
2210 |
4.67 |
Meta-Semi: A Meta-learning Approach for Semi-supervised Learning |
5, 4, 5 |
|
2211 |
4.67 |
Graph Neural Network Acceleration via Matrix Dimension Reduction |
4, 5, 5 |
|
2212 |
4.67 |
Neural Random Projection: From the Initial Task To the Input Similarity Problem |
3, 4, 7 |
|
2213 |
4.67 |
SkillBERT: “Skilling” the BERT to classify skills! |
4, 4, 6 |
|
2214 |
4.67 |
Decoupled Greedy Learning of Graph Neural Networks |
4, 6, 4 |
|
2215 |
4.67 |
Contextual Graph Reasoning Networks |
5, 4, 5 |
|
2216 |
4.67 |
Ablation Path Saliency |
6, 4, 4 |
|
2217 |
4.67 |
A Deep Graph Neural Networks Architecture Design: From Global Pyramid-like Shrinkage Skeleton to Local Link Rewiring |
5, 4, 5 |
|
2218 |
4.67 |
Characterizing Structural Regularities of Labeled Data in Overparameterized Models |
4, 5, 5 |
|
2219 |
4.67 |
Subformer: A Parameter Reduced Transformer |
4, 4, 6 |
|
2220 |
4.67 |
Mem2Mem: Learning to Summarize Long Texts with Memory Compression and Transfer |
5, 4, 5 |
|
2221 |
4.67 |
Multi-agent Deep FBSDE Representation For Large Scale Stochastic Differential Games |
5, 4, 5 |
|
2222 |
4.67 |
THE EFFICACY OF L1 REGULARIZATION IN NEURAL NETWORKS |
5, 4, 5 |
|
2223 |
4.67 |
Exploring Sub-Pseudo Labels for Learning from Weakly-Labeled Web Videos |
5, 4, 5 |
|
2224 |
4.67 |
Variance Reduction in Hierarchical Variational Autoencoders |
6, 4, 4 |
|
2225 |
4.67 |
Neural Nonnegative CP Decomposition for Hierarchical Tensor Analysis |
4, 6, 4 |
|
2226 |
4.67 |
Consensus Clustering with Unsupervised Representation Learning |
4, 5, 5 |
|
2227 |
4.67 |
Learning Irreducible Representations of Noncommutative Lie Groups |
5, 5, 4 |
|
2228 |
4.67 |
Pareto Adversarial Robustness: Balancing Spatial Robustness and Sensitivity-based Robustness |
6, 3, 5 |
|
2229 |
4.67 |
Neurally Guided Genetic Programming for Turing Complete Programming by Example |
5, 5, 4 |
|
2230 |
4.67 |
On the Reproducibility of Neural Network Predictions |
5, 5, 4 |
|
2231 |
4.67 |
Hard Masking for Explaining Graph Neural Networks |
5, 4, 5 |
|
2232 |
4.67 |
AUTOSAMPLING: SEARCH FOR EFFECTIVE DATA SAMPLING SCHEDULES |
5, 6, 3 |
|
2233 |
4.67 |
Network Reusability Analysis for Multi-Joint Robot Reinforcement Learning |
5, 4, 5 |
|
2234 |
4.67 |
CANVASEMB: Learning Layout Representation with Large-scale Pre-training for Graphic Design |
5, 5, 4 |
|
2235 |
4.67 |
Scaling Unsupervised Domain Adaptation through Optimal Collaborator Selection and Lazy Discriminator Synchronization |
2, 6, 6 |
|
2236 |
4.6 |
Random Network Distillation as a Diversity Metric for Both Image and Text Generation |
4, 6, 4, 5, 4 |
|
2237 |
4.6 |
Certified Robustness of Nearest Neighbors against Data Poisoning Attacks |
4, 5, 6, 5, 3 |
|
2238 |
4.6 |
Multi-level Graph Matching Networks for Deep and Robust Graph Similarity Learning |
5, 4, 4, 5, 5 |
|
2239 |
4.6 |
The Negative Pretraining Effect in Sequential Deep Learning and Three Ways to Fix It |
4, 4, 6, 4, 5 |
|
2240 |
4.6 |
Cross-Domain Few-Shot Learning by Representation Fusion |
4, 6, 4, 5, 4 |
|
2241 |
4.6 |
Lightweight Long-Range Generative Adversarial Networks |
5, 4, 6, 5, 3 |
|
2242 |
4.6 |
GL-Disen: Global-Local disentanglement for unsupervised learning of graph-level representations |
5, 3, 4, 6, 5 |
|
2243 |
4.6 |
Adaptive Learning Rates for Multi-Agent Reinforcement Learning |
5, 5, 4, 4, 5 |
|
2244 |
4.6 |
Searching for Robustness: Loss Learning for Noisy Classification Tasks |
5, 4, 5, 5, 4 |
|
2245 |
4.6 |
Maximum Reward Formulation In Reinforcement Learning |
5, 3, 5, 6, 4 |
|
2246 |
4.6 |
Joint State-Action Embedding for Efficient Reinforcement Learning |
6, 3, 4, 5, 5 |
|
2247 |
4.6 |
Robust Offline Reinforcement Learning from Low-Quality Data |
2, 6, 4, 6, 5 |
|
2248 |
4.6 |
Hyperrealistic neural decoding: Reconstruction of face stimuli from fMRI measurements via the GAN latent space |
2, 5, 7, 5, 4 |
|
2249 |
4.6 |
Class2Simi: A New Perspective on Learning with Label Noise |
3, 3, 6, 6, 5 |
|
2250 |
4.6 |
No Spurious Local Minima: on the Optimization Landscapes of Wide and Deep Neural Networks |
6, 4, 4, 5, 4 |
|
2251 |
4.6 |
Adaptive Gradient Method with Resilience and Momentum |
5, 5, 4, 4, 5 |
|
2252 |
4.5 |
Model information as an analysis tool in deep learning |
4, 4, 6, 4 |
|
2253 |
4.5 |
ADD-Defense: Towards Defending Widespread Adversarial Examples via Perturbation-Invariant Representation |
6, 3, 2, 7 |
|
2254 |
4.5 |
Interpretable Reinforcement Learning With Neural Symbolic Logic |
4, 5, 4, 5 |
|
2255 |
4.5 |
Revisiting Parameter Sharing in Multi-Agent Deep Reinforcement Learning |
7, 5, 3, 3 |
|
2256 |
4.5 |
Keep the Gradients Flowing: Using Gradient Flow to study Sparse Network Optimization |
5, 5, 3, 5 |
|
2257 |
4.5 |
Attention-Based Clustering: Learning a Kernel from Context |
5, 4, 4, 5 |
|
2258 |
4.5 |
SHAPE DEFENSE |
6, 5, 4, 3 |
|
2259 |
4.5 |
Non-Inherent Feature Compatible Learning |
2, 6, 5, 5 |
|
2260 |
4.5 |
Apollo: An Adaptive Parameter-wised Diagonal Quasi-Newton Method for Nonconvex Stochastic Optimization |
4, 4, 5, 5 |
|
2261 |
4.5 |
Leveraging Class Hierarchies with Metric-Guided Prototype Learning |
4, 4, 6, 4 |
|
2262 |
4.5 |
Representation and Bias in Multilingual NLP: Insights from Controlled Experiments on Conditional Language Modeling |
3, 4, 5, 6 |
|
2263 |
4.5 |
GN-Transformer: Fusing AST and Source Code information in Graph Networks |
5, 5, 5, 3 |
|
2264 |
4.5 |
Better sampling in explanation methods can prevent dieselgate-like deception |
7, 4, 3, 4 |
|
2265 |
4.5 |
Dynamic Graph Representation Learning with Fourier Temporal State Embedding |
5, 4, 4, 5 |
|
2266 |
4.5 |
Uncertainty for deep image classifiers on out of distribution data. |
2, 6, 4, 6 |
|
2267 |
4.5 |
SemVLP: Vision-Language Pre-training by Aligning Semantics at Multiple Levels |
4, 5, 4, 5 |
|
2268 |
4.5 |
Gradient descent temporal difference-difference learning |
5, 5, 5, 3 |
|
2269 |
4.5 |
Invariant Batch Normalization for Multi-source Domain Generalization |
5, 5, 4, 4 |
|
2270 |
4.5 |
Information Theoretic Meta Learning with Gaussian Processes |
4, 4, 5, 5 |
|
2271 |
4.5 |
Explicit Learning Topology for Differentiable Neural Architecture Search |
5, 5, 4, 4 |
|
2272 |
4.5 |
Teleport Graph Convolutional Networks |
5, 3, 5, 5 |
|
2273 |
4.5 |
With False Friends Like These, Who Can Have Self-Knowledge? |
7, 4, 3, 4 |
|
2274 |
4.5 |
Model-Free Energy Distance for Pruning DNNs |
5, 3, 5, 5 |
|
2275 |
4.5 |
Contrast to Divide: self-supervised pre-training for learning with noisy labels |
5, 5, 4, 4 |
|
2276 |
4.5 |
Continual Learning Without Knowing Task Identities: Rethinking Occam’s Razor |
5, 5, 5, 3 |
|
2277 |
4.5 |
Learning from Demonstrations with Energy based Generative Adversarial Imitation Learning |
4, 5, 4, 5 |
|
2278 |
4.5 |
Training Data Generating Networks: Linking 3D Shapes and Few-Shot Classification |
6, 4, 3, 5 |
|
2279 |
4.5 |
Neural SDEs Made Easy: SDEs are Infinite-Dimensional GANs |
3, 6, 5, 4 |
|
2280 |
4.5 |
Robust Constrained Reinforcement Learning for Continuous Control with Model Misspecification |
5, 5, 4, 4 |
|
2281 |
4.5 |
Mathematical Word Problem Generation from Commonsense Knowledge Graph and Equations |
5, 5, 3, 5 |
|
2282 |
4.5 |
Max-Affine Spline Insights Into Deep Generative Networks |
4, 4, 8, 2 |
|
2283 |
4.5 |
Response Modeling of Hyper-Parameters for Deep Convolution Neural Network |
5, 4, 4, 5 |
|
2284 |
4.5 |
Spatially Decomposed Hinge Adversarial Loss by Local Gradient Amplifier |
3, 5, 3, 7 |
|
2285 |
4.5 |
Multi-view Arbitrary Style Transfer |
5, 3, 4, 6 |
|
2286 |
4.5 |
Diverse Exploration via InfoMax Options |
4, 5, 4, 5 |
|
2287 |
4.5 |
Continual learning with neural activation importance |
6, 4, 4, 4 |
|
2288 |
4.5 |
Improved knowledge distillation by utilizing backward pass knowledge in neural networks |
6, 5, 4, 3 |
|
2289 |
4.5 |
Bayesian neural network parameters provide insights into the earthquake rupture physics. |
4, 4, 4, 6 |
|
2290 |
4.5 |
Single Pair Cross-Modality Super Resolution |
3, 4, 5, 6 |
|
2291 |
4.5 |
Network Architecture Search for Domain Adaptation |
6, 4, 4, 4 |
|
2292 |
4.5 |
Redefining Self-Normalization Property |
4, 5, 5, 4 |
|
2293 |
4.5 |
Improved Uncertainty Post-Calibration via Rank Preserving Transforms |
4, 2, 7, 5 |
|
2294 |
4.5 |
Deep Gated Canonical Correlation Analysis |
5, 5, 4, 4 |
|
2295 |
4.5 |
Recurrent Exploration Networks for Recommender Systems |
5, 4, 4, 5 |
|
2296 |
4.5 |
Task Calibration for Distributional Uncertainty in Few-Shot Classification |
5, 4, 4, 5 |
|
2297 |
4.5 |
Two steps at a time — taking GAN training in stride with Tseng’s method |
4, 4, 4, 6 |
|
2298 |
4.5 |
Noisy Agents: Self-supervised Exploration by Predicting Auditory Events |
2, 6, 4, 6, 5, 4 |
|
2299 |
4.5 |
Untangle: Critiquing Disentangled Recommendations |
5, 4, 4, 5 |
|
2300 |
4.5 |
ImCLR: Implicit Contrastive Learning for Image Classification |
5, 4, 5, 4 |
|
2301 |
4.5 |
Q-Value Weighted Regression: Reinforcement Learning with Limited Data |
4, 3, 6, 5 |
|
2302 |
4.5 |
ProSelfLC: Progressive Self Label Correction for Training Robust Deep Neural Networks |
5, 5, 4, 4 |
|
2303 |
4.5 |
Global Self-Attention Networks |
4, 5, 4, 5 |
|
2304 |
4.5 |
Thinking Like Transformers |
6, 3, 5, 4 |
|
2305 |
4.5 |
DJMix: Unsupervised Task-agnostic Augmentation for Improving Robustness |
4, 5, 5, 4 |
|
2306 |
4.5 |
Demystifying Loss Functions for Classification |
4, 6, 3, 5 |
|
2307 |
4.5 |
Cross-Modal Domain Adaptation for Reinforcement Learning |
4, 5, 4, 5 |
|
2308 |
4.5 |
Improving Graph Neural Network Expressivity via Subgraph Isomorphism Counting |
5, 6, 3, 4 |
|
2309 |
4.5 |
Finding Patient Zero: Learning Contagion Source with Graph Neural Networks |
3, 5, 3, 7 |
|
2310 |
4.5 |
RankingMatch: Delving into Semi-Supervised Learning with Consistency Regularization and Ranking Loss |
4, 5, 3, 6 |
|
2311 |
4.5 |
Benchmarking Bias Mitigation Algorithms in Representation Learning through Fairness Metrics |
4, 5, 5, 4 |
|
2312 |
4.5 |
Dataset Curation Beyond Accuracy |
4, 4, 6, 4 |
|
2313 |
4.5 |
AdaLead: A simple and robust adaptive greedy search algorithm for sequence design |
6, 5, 4, 3 |
|
2314 |
4.5 |
One Reflection Suffice |
4, 6, 4, 4 |
|
2315 |
4.5 |
Online Learning of Graph Neural Networks: When Can Data Be Permanently Deleted |
3, 5, 5, 5 |
|
2316 |
4.5 |
Certifying Robustness of Graph Laplacian Based Semi-Supervised Learning |
5, 4, 4, 5 |
|
2317 |
4.5 |
Learning to Infer Run-Time Invariants from Source code |
3, 5, 5, 5 |
|
2318 |
4.5 |
Deep Goal-Oriented Clustering |
6, 5, 4, 3 |
|
2319 |
4.5 |
Visual Question Answering From Another Perspective: CLEVR Mental Rotation Tests |
4, 4, 4, 6 |
|
2320 |
4.5 |
Frequency Decomposition in Neural Processes |
6, 5, 4, 3 |
|
2321 |
4.5 |
Hybrid and Non-Uniform DNN quantization methods using Retro Synthesis data for efficient inference |
4, 4, 6, 4 |
|
2322 |
4.5 |
Decentralized Knowledge Graph Representation Learning |
5, 4, 5, 4 |
|
2323 |
4.5 |
The simpler the better: vanilla sgd revisited |
4, 5, 6, 3 |
|
2324 |
4.5 |
Democratizing Evaluation of Deep Model Interpretability through Consensus |
6, 4, 5, 3 |
|
2325 |
4.5 |
InvertGAN: Reducing mode collapse with multi-dimensional Gaussian Inversion |
3, 4, 5, 6 |
|
2326 |
4.5 |
Optimal allocation of data across training tasks in meta-learning |
4, 4, 4, 6 |
|
2327 |
4.5 |
Powers of layers for image-to-image translation |
5, 5, 5, 3 |
|
2328 |
4.5 |
Approximating Pareto Frontier through Bayesian-optimization-directed Robust Multi-objective Reinforcement Learning |
3, 5, 5, 5 |
|
2329 |
4.5 |
Intriguing class-wise properties of adversarial training |
6, 4, 4, 4 |
|
2330 |
4.5 |
Increasing-Margin Adversarial (IMA) training to Improve Adversarial Robustness of Neural Networks |
4, 4, 6, 4 |
|
2331 |
4.5 |
The Impact of the Mini-batch Size on the Dynamics of SGD: Variance and Beyond |
5, 6, 4, 3 |
|
2332 |
4.5 |
Learning Task-Relevant Features via Contrastive Input Morphing |
4, 4, 5, 5 |
|
2333 |
4.5 |
Gated Relational Graph Attention Networks |
7, 4, 5, 2 |
|
2334 |
4.5 |
Meta-Continual Learning Via Dynamic Programming |
4, 4, 6, 4 |
|
2335 |
4.5 |
Learning Movement Strategies for Moving Target Defense |
5, 5, 4, 4 |
|
2336 |
4.5 |
Differentiable Learning of Graph-like Logical Rules from Knowledge Graphs |
3, 6, 4, 5 |
|
2337 |
4.5 |
CAFENet: Class-Agnostic Few-Shot Edge Detection Network |
4, 4, 6, 4 |
|
2338 |
4.5 |
Supervision Accelerates Pre-training in Contrastive Semi-Supervised Learning of Visual Representations |
6, 4, 4, 4 |
|
2339 |
4.5 |
Symmetry-Augmented Representation for Time Series |
6, 4, 4, 4 |
|
2340 |
4.5 |
GLUECode: A Benchmark for Source Code Machine Learning Models |
4, 6, 4, 4 |
|
2341 |
4.5 |
Suppressing Outlier Reconstruction in Autoencoders for Out-of-Distribution Detection |
4, 5, 5, 4 |
|
2342 |
4.5 |
On Representing (Anti)Symmetric Functions |
4, 6, 4, 4 |
|
2343 |
4.5 |
Self-supervised Disentangled Representation Learning |
5, 5, 4, 4 |
|
2344 |
4.5 |
Neural Bayes: A Generic Parameterization Method for Unsupervised Learning |
5, 5, 4, 4 |
|
2345 |
4.5 |
Natural World Distribution via Adaptive Confusion Energy Regularization |
5, 4, 5, 4 |
|
2346 |
4.5 |
What’s new? Summarizing Contributions in Scientific Literature |
5, 4, 4, 5 |
|
2347 |
4.5 |
Architecture Agnostic Neural Networks |
4, 5, 4, 5 |
|
2348 |
4.5 |
AutoCleansing: Unbiased Estimation of Deep Learning with Mislabeled Data |
5, 6, 4, 3 |
|
2349 |
4.5 |
3D Scene Compression through Entropy Penalized Neural Representation Functions |
4, 4, 5, 5 |
|
2350 |
4.5 |
Quantifying Exposure Bias for Open-ended Language Generation |
3, 6, 6, 3 |
|
2351 |
4.5 |
Federated Learning of a Mixture of Global and Local Models |
4, 4, 4, 6 |
|
2352 |
4.5 |
The Unreasonable Effectiveness of the Class-reversed Sampling in Tail Sample Memorization |
6, 5, 2, 5 |
|
2353 |
4.5 |
Signal Coding and Reconstruction using Spike Trains |
3, 5, 7, 3 |
|
2354 |
4.5 |
Addressing Distribution Shift in Online Reinforcement Learning with Offline Datasets |
3, 5, 4, 6 |
|
2355 |
4.5 |
Provable Fictitious Play for General Mean-Field Games |
5, 3, 5, 5 |
|
2356 |
4.5 |
Enhancing Visual Representations for Efficient Object Recognition during Online Distillation |
4, 5, 5, 4 |
|
2357 |
4.5 |
Learning Axioms to Compute Verifiable Symbolic Expression Equivalence Proofs Using Graph-to-Sequence Networks |
3, 6, 5, 4 |
|
2358 |
4.5 |
CAT-SAC: Soft Actor-Critic with Curiosity-Aware Entropy Temperature |
4, 4, 4, 6 |
|
2359 |
4.5 |
Dissecting graph measures performance for node clustering in LFR parameter space |
4, 3, 5, 6 |
|
2360 |
4.5 |
The impacts of known and unknown demonstrator irrationality on reward inference |
4, 4, 5, 5 |
|
2361 |
4.5 |
Intervention Generative Adversarial Nets |
7, 2, 6, 3 |
|
2362 |
4.5 |
Revisiting Prioritized Experience Replay: A Value Perspective |
6, 3, 5, 4 |
|
2363 |
4.5 |
Efficient Graph Neural Architecture Search |
5, 5, 3, 5 |
|
2364 |
4.5 |
ScheduleNet: Learn to Solve MinMax mTSP Using Reinforcement Learning with Delayed Reward |
5, 4, 4, 5 |
|
2365 |
4.5 |
Improving Mutual Information based Feature Selection by Boosting Unique Relevance |
2, 8, 4, 4 |
|
2366 |
4.5 |
PhraseTransformer: Self-Attention using Local Context for Semantic Parsing |
5, 3, 7, 3 |
|
2367 |
4.5 |
Improving robustness of softmax cross-entropy loss via inference information |
5, 4, 4, 5 |
|
2368 |
4.5 |
Improving Hierarchical Adversarial Robustness of Deep Neural Networks |
5, 4, 4, 5 |
|
2369 |
4.5 |
Learning to Explore with Pleasure |
5, 5, 4, 4 |
|
2370 |
4.5 |
Wide-minima Density Hypothesis and the Explore-Exploit Learning Rate Schedule |
6, 5, 4, 3 |
|
2371 |
4.5 |
Lyapunov Barrier Policy Optimization |
4, 6, 4, 4 |
|
2372 |
4.5 |
Quantitative Understanding of VAE as a Non-linearly Scaled Isometric Embedding |
4, 5, 5, 4 |
|
2373 |
4.5 |
AUBER: Automated BERT Regularization |
5, 4, 4, 5 |
|
2374 |
4.5 |
Bi-Real Net V2: Rethinking Non-linearity for 1-bit CNNs and Going Beyond |
3, 6, 5, 4 |
|
2375 |
4.5 |
Learning Robust Models by Countering Spurious Correlations |
4, 6, 5, 3 |
|
2376 |
4.5 |
Memformer: The Memory-Augmented Transformer |
3, 4, 5, 6 |
|
2377 |
4.5 |
Probabilistic Meta-Learning for Bayesian Optimization |
5, 5, 4, 4 |
|
2378 |
4.5 |
Which Model to Transfer? Finding the Needle in the Growing Haystack |
4, 4, 6, 4 |
|
2379 |
4.5 |
Generalized Universal Approximation for Certified Networks |
4, 5, 4, 5 |
|
2380 |
4.5 |
Outlier Preserving Distribution Mapping Autoencoders |
6, 5, 4, 3 |
|
2381 |
4.5 |
Manifold Regularization for Locally Stable Deep Neural Networks |
5, 4, 4, 5 |
|
2382 |
4.5 |
Distributed Training of Graph Convolutional Networks using Subgraph Approximation |
5, 4, 4, 5 |
|
2383 |
4.5 |
CDT: Cascading Decision Trees for Explainable Reinforcement Learning |
5, 5, 4, 4 |
|
2384 |
4.5 |
Language-Mediated, Object-Centric Representation Learning |
4, 5, 5, 4 |
|
2385 |
4.5 |
Structural Knowledge Distillation |
5, 4, 5, 4 |
|
2386 |
4.5 |
Can We Use Gradient Norm as a Measure of Generalization Error for Model Selection in Practice? |
4, 4, 4, 6 |
|
2387 |
4.5 |
Self-Labeling of Fully Mediating Representations by Graph Alignment |
4, 5, 5, 4 |
|
2388 |
4.5 |
Neural Bootstrapper |
5, 3, 5, 5 |
|
2389 |
4.5 |
Driving through the Lens: Improving Generalization of Learning-based Steering using Simulated Adversarial Examples |
4, 4, 4, 6 |
|
2390 |
4.5 |
Out-of-Distribution Classification and Clustering |
4, 5, 4, 5 |
|
2391 |
4.5 |
PGPS: Coupling Policy Gradient with Population-based Search |
5, 3, 5, 5 |
|
2392 |
4.5 |
Memory Augmented Design of Graph Neural Networks |
3, 5, 5, 5 |
|
2393 |
4.5 |
About contrastive unsupervised representation learning for classification and its convergence |
5, 4, 3, 6 |
|
2394 |
4.5 |
SoCal: Selective Oracle Questioning for Consistency-based Active Learning of Physiological Signals |
5, 5, 4, 4 |
|
2395 |
4.5 |
Learning Active Learning in the Batch-Mode Setup with Ensembles of Active Learning Agents |
4, 3, 7, 4 |
|
2396 |
4.5 |
Redesigning the Classification Layer by Randomizing the Class Representation Vectors |
4, 5, 4, 5 |
|
2397 |
4.5 |
Recurrently Controlling a Recurrent Network with Recurrent Networks Controlled by More Recurrent Networks |
5, 6, 3, 4 |
|
2398 |
4.5 |
Low Complexity Approximate Bayesian Logistic Regression for Sparse Online Learning |
4, 4, 4, 6 |
|
2399 |
4.5 |
Hard Attention Control By Mutual Information Maximization |
4, 4, 4, 6 |
|
2400 |
4.5 |
Adaptive Gradient Methods Can Be Provably Faster than SGD with Random Shuffling |
3, 7, 4, 4 |
|
2401 |
4.5 |
Learning the Step-size Policy for the Limited-Memory Broyden-Fletcher-Goldfarb-Shanno Algorithm |
5, 4, 5, 4 |
|
2402 |
4.5 |
Interactive Visualization for Debugging RL |
6, 3, 4, 5 |
|
2403 |
4.5 |
Putting Theory to Work: From Learning Bounds to Meta-Learning Algorithms |
4, 4, 5, 5 |
|
2404 |
4.4 |
Adversarial Meta-Learning |
3, 4, 4, 6, 5 |
|
2405 |
4.4 |
Is Retriever Merely an Approximator of Reader? |
3, 5, 4, 8, 2 |
|
2406 |
4.4 |
Deep Learning Requires Explicit Regularization for Reliable Predictive Probability |
5, 3, 5, 4, 5 |
|
2407 |
4.4 |
Manifold-aware Training: Increase Adversarial Robustness with Feature Clustering |
5, 1, 7, 4, 5 |
|
2408 |
4.4 |
Non-Asymptotic PAC-Bayes Bounds on Generalisation Error |
5, 4, 5, 4, 4 |
|
2409 |
4.4 |
Robust Multi-Agent Reinforcement Learning Driven by Correlated Equilibrium |
4, 6, 3, 4, 5 |
|
2410 |
4.4 |
MQES: Max-Q Entropy Search for Efficient Exploration in Continuous Reinforcement Learning |
4, 6, 5, 3, 4 |
|
2411 |
4.4 |
Structure and randomness in planning and reinforcement learning |
3, 4, 6, 3, 6 |
|
2412 |
4.4 |
SEQUENCE-LEVEL FEATURES: HOW GRU AND LSTM CELLS CAPTURE N-GRAMS |
4, 3, 5, 6, 4 |
|
2413 |
4.4 |
Chameleon: Learning Model Initializations Across Tasks With Different Schemas |
3, 3, 4, 6, 6 |
|
2414 |
4.33 |
Artificial GAN Fingerprints: Rooting Deepfake Attribution in Training Data |
6, 3, 4 |
|
2415 |
4.33 |
Sequence Metric Learning as Synchronization of Recurrent Neural Networks |
6, 4, 3 |
|
2416 |
4.33 |
Refine and Imitate: Reducing Repetition and Inconsistency in Dialogue Generation via Reinforcement Learning and Human Demonstration |
4, 6, 3 |
|
2417 |
4.33 |
Training-Free Uncertainty Estimation for Dense Regression: Sensitivity as a Surrogate |
4, 3, 6 |
|
2418 |
4.33 |
Anomaly detection in dynamical systems from measured time series |
4, 5, 4 |
|
2419 |
4.33 |
ResPerfNet: Deep Residual Learning for Regressional Performance Modeling of Deep Neural Networks |
5, 4, 4 |
|
2420 |
4.33 |
Factored Action Spaces in Deep Reinforcement Learning |
5, 3, 5 |
|
2421 |
4.33 |
Adaptive Dataset Sampling by Deep Policy Gradient |
5, 3, 5 |
|
2422 |
4.33 |
AC-VAE: Learning Semantic Representation with VAE for Adaptive Clustering |
5, 3, 5 |
|
2423 |
4.33 |
A new framework for tensor PCA based on trace invariants |
5, 5, 3 |
|
2424 |
4.33 |
Distribution Based MIL Pooling Filters are Superior to Point Estimate Based Counterparts |
5, 4, 4 |
|
2425 |
4.33 |
Modeling Human Development: Effects of Blurred Vision on Category Learning in CNNs |
5, 4, 4 |
|
2426 |
4.33 |
Unbiased learning with State-Conditioned Rewards in Adversarial Imitation Learning |
5, 4, 4 |
|
2427 |
4.33 |
Learning Predictive Communication by Imagination in Networked System Control |
5, 4, 4 |
|
2428 |
4.33 |
Importance and Coherence: Methods for Evaluating Modularity in Neural Networks |
4, 4, 5 |
|
2429 |
4.33 |
Subspace Clustering via Robust Self-Supervised Convolutional Neural Network |
5, 3, 5 |
|
2430 |
4.33 |
Differentiable End-to-End Program Executor for Sample and Computationally Efficient VQA |
5, 5, 3 |
|
2431 |
4.33 |
Faster Federated Learning with Decaying Number of Local SGD Steps |
5, 4, 4 |
|
2432 |
4.33 |
Augmentation-Interpolative AutoEncoders for Unsupervised Few-Shot Image Generation |
5, 4, 4 |
|
2433 |
4.33 |
not-so-big-GAN: Generating High-Fidelity Images on Small Compute with Wavelet-based Super-Resolution |
2, 6, 5 |
|
2434 |
4.33 |
Invariant Causal Representation Learning |
4, 4, 5 |
|
2435 |
4.33 |
Generating Unobserved Alternatives: A Case Study through Super-Resolution and Decompression |
4, 5, 4 |
|
2436 |
4.33 |
No Feature Is An Island: Adaptive Collaborations Between Features Improve Adversarial Robustness |
4, 5, 4 |
|
2437 |
4.33 |
FOC OSOD: Focus on Classification One-Shot Object Detection |
4, 5, 4 |
|
2438 |
4.33 |
Hypersphere Face Uncertainty Learning |
4, 3, 6 |
|
2439 |
4.33 |
Novelty Detection with Rotated Contrastive Predictive Coding |
6, 3, 4 |
|
2440 |
4.33 |
Online Limited Memory Neural-Linear Bandits |
3, 5, 5 |
|
2441 |
4.33 |
R-LAtte: Attention Module for Visual Control via Reinforcement Learning |
5, 4, 4 |
|
2442 |
4.33 |
A New Variant of Stochastic Heavy ball Optimization Method for Deep Learning |
4, 3, 6 |
|
2443 |
4.33 |
A Chaos Theory Approach to Understand Neural Network Optimization |
4, 5, 4 |
|
2444 |
4.33 |
Solving NP-Hard Problems on Graphs with Extended AlphaGo Zero |
4, 5, 4 |
|
2445 |
4.33 |
Visible and Invisible: Causal Variable Learning and its Application in a Cancer Study |
7, 3, 3 |
|
2446 |
4.33 |
Aspect-based Sentiment Classification via Reinforcement Learning |
3, 5, 5 |
|
2447 |
4.33 |
Learning Blood Oxygen from Respiration Signals |
4, 6, 3 |
|
2448 |
4.33 |
AUL is a better optimization metric in PU learning |
5, 5, 3 |
|
2449 |
4.33 |
Convolutional Neural Networks are not invariant to translation, but they can learn to be |
4, 4, 5 |
|
2450 |
4.33 |
Feature-Robust Optimal Transport for High-Dimensional Data |
6, 4, 3 |
|
2451 |
4.33 |
Variational saliency maps for explaining model’s behavior |
4, 5, 4 |
|
2452 |
4.33 |
Local SGD Meets Asynchrony |
4, 4, 5 |
|
2453 |
4.33 |
Adversarial Data Generation of Multi-category Marked Temporal Point Processes with Sparse, Incomplete, and Small Training Samples |
5, 5, 3 |
|
2454 |
4.33 |
Fast 3D Acoustic Scattering via Discrete Laplacian Based Implicit Function Encoders |
3, 4, 6 |
|
2455 |
4.33 |
Episodic Memory for Learning Subjective-Timescale Models |
5, 4, 4 |
|
2456 |
4.33 |
Approximate Birkhoff-von-Neumann decomposition: a differentiable approach |
5, 4, 4 |
|
2457 |
4.33 |
On the Dynamic Regret of Online Multiple Mirror Descent |
4, 5, 4 |
|
2458 |
4.33 |
Enabling Efficient On-Device Self-supervised Contrastive Learning by Data Selection |
4, 5, 4 |
|
2459 |
4.33 |
SAD: Saliency Adversarial Defense without Adversarial Training |
4, 4, 5 |
|
2460 |
4.33 |
Quantifying Uncertainty in Deep Spatiotemporal Forecasting |
4, 5, 4 |
|
2461 |
4.33 |
Additive Poisson Process: Learning Intensity of Higher-Order Interaction in Stochastic Processes |
3, 4, 6 |
|
2462 |
4.33 |
Flatness is a False Friend |
3, 6, 4 |
|
2463 |
4.25 |
Why Does Decentralized Training Outperform Synchronous Training In The Large Batch Setting? |
6, 3, 3, 5 |
|
2464 |
4.25 |
Communication-Computation Efficient Secure Aggregation for Federated Learning |
4, 3, 6, 4 |
|
2465 |
4.25 |
Neural Text Classification by Jointly Learning to Cluster and Align |
3, 5, 5, 4 |
|
2466 |
4.25 |
NETWORK ROBUSTNESS TO PCA PERTURBATIONS |
4, 3, 3, 7 |
|
2467 |
4.25 |
Knapsack Pruning with Inner Distillation |
4, 5, 4, 4 |
|
2468 |
4.25 |
Learning Lagrangian Fluid Dynamics with Graph Neural Networks |
4, 5, 4, 4 |
|
2469 |
4.25 |
Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in First-person Simulated 3D Environments |
6, 4, 4, 3 |
|
2470 |
4.25 |
Language Models are Open Knowledge Graphs |
5, 4, 4, 4 |
|
2471 |
4.25 |
MCMC-Interactive Variational Inference |
5, 4, 4, 4 |
|
2472 |
4.25 |
Discrete Word Embedding for Logical Natural Language Understanding |
3, 4, 5, 5 |
|
2473 |
4.25 |
Iterative Image Inpainting with Structural Similarity Mask for Anomaly Detection |
5, 6, 2, 4 |
|
2474 |
4.25 |
Three Dimensional Reconstruction of Botanical Trees with Simulatable Geometry |
3, 6, 4, 4 |
|
2475 |
4.25 |
Assisting the Adversary to Improve GAN Training |
6, 3, 4, 4 |
|
2476 |
4.25 |
Visual Explanation using Attention Mechanism in Actor-Critic-based Deep Reinforcement Learning |
3, 5, 5, 4 |
|
2477 |
4.25 |
Factor Normalization for Deep Neural Network Models |
4, 4, 4, 5 |
|
2478 |
4.25 |
DarKnight: A Data Privacy Scheme for Training and Inference of Deep Neural Networks |
4, 3, 5, 5 |
|
2479 |
4.25 |
Fast Binarized Neural Network Training with Partial Pre-training |
4, 5, 4, 4 |
|
2480 |
4.25 |
Convergence Proof for Actor-Critic Methods Applied to PPO and RUDDER |
4, 4, 4, 5 |
|
2481 |
4.25 |
What are effective labels for augmented data? Improving robustness with AutoLabel |
4, 4, 5, 4 |
|
2482 |
4.25 |
Run Away From your Teacher: a New Self-Supervised Approach Solving the Puzzle of BYOL |
6, 3, 3, 5 |
|
2483 |
4.25 |
Geometry matters: Exploring language examples at the decision boundary |
5, 4, 3, 5 |
|
2484 |
4.25 |
Einstein VI: General and Integrated Stein Variational Inference in NumPyro |
5, 5, 4, 3 |
|
2485 |
4.25 |
Adaptive Tree Wasserstein Minimization for Hierarchical Generative Modeling |
4, 5, 4, 4 |
|
2486 |
4.25 |
A Simple Framework for Uncertainty in Contrastive Learning |
5, 5, 3, 4 |
|
2487 |
4.25 |
Neural Partial Differential Equations with Functional Convolution |
4, 4, 5, 4 |
|
2488 |
4.25 |
Unsupervised Simultaneous Depth-from-defocus and Depth-from-focus |
6, 3, 4, 4 |
|
2489 |
4.25 |
Generalizing Tree Models for Improving Prediction Accuracy |
3, 6, 4, 4 |
|
2490 |
4.25 |
Rethinking the Pruning Criteria for Convolutional Neural Network |
5, 3, 5, 4 |
|
2491 |
4.25 |
Fewmatch: Dynamic Prototype Refinement for Semi-Supervised Few-Shot Learning |
5, 3, 5, 4 |
|
2492 |
4.25 |
FixNorm: Dissecting Weight Decay for Training Deep Neural Networks |
4, 4, 5, 4 |
|
2493 |
4.25 |
Bypassing the Random Input Mixing in Mixup |
4, 4, 4, 5 |
|
2494 |
4.25 |
Deep Learning is Singular, and That’s Good |
5, 4, 4, 4 |
|
2495 |
4.25 |
Example-Driven Intent Prediction with Observers |
4, 5, 3, 5 |
|
2496 |
4.25 |
On the Power of Abstention and Data-Driven Decision Making for Adversarial Robustness |
4, 4, 6, 3 |
|
2497 |
4.25 |
Derivative Manipulation for General Example Weighting |
5, 3, 5, 4 |
|
2498 |
4.25 |
To Learn Effective Features: Understanding the Task-Specific Adaptation of MAML |
3, 5, 4, 5 |
|
2499 |
4.25 |
Sself: Robust Federated Learning against Stragglers and Adversaries |
4, 4, 5, 4 |
|
2500 |
4.25 |
Why Convolutional Networks Learn Oriented Bandpass Filters: Theory and Empirical Support |
3, 5, 3, 6 |
|
2501 |
4.25 |
VortexNet: Learning Complex Dynamic Systems with Physics-Embedded Networks |
4, 4, 4, 5 |
|
2502 |
4.25 |
Robust Imitation via Decision-Time Planning |
4, 4, 6, 3 |
|
2503 |
4.25 |
Revisiting BFloat16 Training |
3, 5, 6, 3 |
|
2504 |
4.25 |
ChemistryQA: A Complex Question Answering Dataset from Chemistry |
4, 5, 3, 5 |
|
2505 |
4.25 |
Reinforcement Learning for Flexibility Design Problems |
4, 5, 4, 4 |
|
2506 |
4.25 |
Joint Learning of Full-structure Noise in Hierarchical Bayesian Regression Models |
4, 4, 4, 5 |
|
2507 |
4.25 |
Online Continual Learning Under Domain Shift |
4, 3, 5, 5 |
|
2508 |
4.25 |
Fast Estimation for Privacy and Utility in Differentially Private Machine Learning |
4, 5, 3, 5 |
|
2509 |
4.25 |
Maximum Entropy competes with Maximum Likelihood |
4, 4, 3, 6 |
|
2510 |
4.25 |
Learning Invariant Representations and Risks for Semi-supervised Domain Adaptation |
4, 4, 5, 4 |
|
2511 |
4.25 |
RetCL: A Selection-based Approach for Retrosynthesis via Contrastive Learning |
5, 4, 4, 4 |
|
2512 |
4.25 |
Compositional Models: Multi-Task Learning and Knowledge Transfer with Modular Networks |
4, 4, 5, 4 |
|
2513 |
4.25 |
Re-examining Routing Networks for Multi-task Learning |
5, 6, 3, 3 |
|
2514 |
4.25 |
Identifying Treatment Effects under Unobserved Confounding by Causal Representation Learning |
3, 6, 4, 4 |
|
2515 |
4.25 |
Improving the accuracy of neural networks in analog computing-in-memory systems by a generalized quantization method |
4, 5, 3, 5 |
|
2516 |
4.25 |
Feedforward Legendre Memory Unit |
4, 5, 4, 4 |
|
2517 |
4.25 |
Exploring Transferability of Perturbations in Deep Reinforcement Learning |
4, 6, 3, 4 |
|
2518 |
4.25 |
A Chain Graph Interpretation of Real-World Neural Networks |
6, 4, 4, 3 |
|
2519 |
4.25 |
Mirror Sample Based Distribution Alignment for Unsupervised Domain Adaption |
5, 4, 4, 4 |
|
2520 |
4.25 |
Joint Perception and Control as Inference with an Object-based Implementation |
4, 4, 5, 4 |
|
2521 |
4.25 |
Hokey Pokey Causal Discovery: Using Deep Learning Model Errors to Learn Causal Structure |
4, 5, 4, 4 |
|
2522 |
4.25 |
A Communication Efficient Federated Kernel $k$-Means |
6, 1, 5, 5 |
|
2523 |
4.25 |
Selective Sensing: A Data-driven Nonuniform Subsampling Approach for Computation-free On-Sensor Data Dimensionality Reduction |
4, 4, 5, 4 |
|
2524 |
4.25 |
GENERATIVE MODEL-ENHANCED HUMAN MOTION PREDICTION |
5, 5, 4, 3 |
|
2525 |
4.25 |
Multi-Representation Ensemble in Few-Shot Learning |
4, 4, 5, 4 |
|
2526 |
4.25 |
Empirical Sufficiency Featuring Reward Delay Calibration |
4, 4, 5, 4 |
|
2527 |
4.25 |
One Network Fits All? Modular versus Monolithic Task Formulations in Neural Networks |
5, 4, 3, 5 |
|
2528 |
4.25 |
Minimum Description Length Recurrent Neural Networks |
4, 6, 4, 3 |
|
2529 |
4.25 |
Conditional Generative Modeling for De Novo Hierarchical Multi-Label Functional Protein Design |
3, 7, 4, 3 |
|
2530 |
4.25 |
Mobile Construction Benchmark |
4, 4, 4, 5 |
|
2531 |
4.25 |
Towards Good Practices in Self-Supervised Representation Learning |
5, 4, 4, 4 |
|
2532 |
4.25 |
Model-based Navigation in Environments with Novel Layouts Using Abstract $2$-D Maps |
3, 4, 4, 6 |
|
2533 |
4.25 |
HiFiSinger: Towards High-Fidelity Neural Singing Voice Synthesis |
5, 6, 3, 3 |
|
2534 |
4.25 |
Multi-agent Policy Optimization with Approximatively Synchronous Advantage Estimation |
4, 3, 5, 5 |
|
2535 |
4.25 |
Generalized Gumbel-Softmax Gradient Estimator for Generic Discrete Random Variables |
4, 5, 4, 4 |
|
2536 |
4.25 |
Motion Representations for Articulated Animation |
4, 4, 4, 5 |
|
2537 |
4.25 |
On the Stability of Multi-branch Network |
5, 3, 5, 4 |
|
2538 |
4.25 |
An Empirical Exploration of Open-Set Recognition via Lightweight Statistical Pipelines |
4, 3, 3, 7 |
|
2539 |
4.25 |
XMixup: Efficient Transfer Learning with Auxiliary Samples by Cross-Domain Mixup |
4, 4, 5, 4 |
|
2540 |
4.25 |
Are all negatives created equal in contrastive instance discrimination? |
5, 5, 2, 5 |
|
2541 |
4.25 |
Learning What Not to Model: Gaussian Process Regression with Negative Constraints |
5, 3, 6, 3 |
|
2542 |
4.25 |
Neuro-algorithmic Policies for Discrete Planning |
4, 3, 3, 7 |
|
2543 |
4.25 |
Towards Robustness against Unsuspicious Adversarial Examples |
4, 3, 6, 4 |
|
2544 |
4.25 |
Beyond the Pixels: Exploring the Effects of Bit-Level Network and File Corruptions on Video Model Robustness |
4, 6, 3, 4 |
|
2545 |
4.25 |
Maximum Categorical Cross Entropy (MCCE): A noise-robust alternative loss function to mitigate racial bias in Convolutional Neural Networks (CNNs) by reducing overfitting |
5, 4, 5, 3 |
|
2546 |
4.25 |
Analyzing Attention Mechanisms through Lens of Sample Complexity and Loss Landscape |
5, 4, 3, 5 |
|
2547 |
4.25 |
Learning without Forgetting: Task Aware Multitask Learning for Multi-Modality Tasks |
5, 4, 4, 4 |
|
2548 |
4.25 |
Skinning a Parameterization of Three-Dimensional Space for Neural Network Cloth |
3, 6, 4, 4 |
|
2549 |
4.25 |
ROMUL: Scale Adaptative Population Based Training |
6, 3, 4, 4 |
|
2550 |
4.25 |
A Surgery of the Neural Architecture Evaluators |
5, 4, 5, 3 |
|
2551 |
4.25 |
Deep Manifold Computing and Visualization Using Elastic Locally Isometric Smoothness |
5, 5, 3, 4 |
|
2552 |
4.25 |
A spectral perspective on GCNs |
4, 3, 4, 6 |
|
2553 |
4.25 |
STRATA: Building Robustness with a Simple Method for Generating Black-box Adversarial Attacks for Models of Code |
4, 5, 4, 4 |
|
2554 |
4.25 |
On the Geometry of Deep Bayesian Active Learning |
5, 3, 4, 5 |
|
2555 |
4.25 |
Fair Differential Privacy Can Mitigate the Disparate Impact on Model Accuracy |
5, 4, 4, 4 |
|
2556 |
4.25 |
Neural Network Surgery: Combining Training with Topology Optimization |
4, 5, 4, 4 |
|
2557 |
4.25 |
Heterogeneous Model Transfer between Different Neural Networks |
5, 5, 3, 4 |
|
2558 |
4.25 |
Domain Adaptation via Anomaly Detection |
4, 4, 5, 4 |
|
2559 |
4.25 |
Sparse Binary Neural Networks |
3, 4, 5, 5 |
|
2560 |
4.25 |
Regularization Shortcomings for Continual Learning |
3, 5, 5, 4 |
|
2561 |
4.25 |
FGNAS: FPGA-Aware Graph Neural Architecture Search |
3, 4, 5, 5 |
|
2562 |
4.25 |
Alpha-DAG: a reinforcement learning based algorithm to learn Directed Acyclic Graphs |
4, 4, 5, 4 |
|
2563 |
4.25 |
Clearing the Path for Truly Semantic Representation Learning |
4, 3, 5, 5 |
|
2564 |
4.25 |
On the Neural Tangent Kernel of Equilibrium Models |
4, 3, 6, 4 |
|
2565 |
4.25 |
Dense Global Context Aware RCNN for Object Detection |
4, 5, 5, 3 |
|
2566 |
4.25 |
Adversarial Boot Camp: label free certified robustness in one epoch |
3, 7, 3, 4 |
|
2567 |
4.25 |
TwinDNN: A Tale of Two Deep Neural Networks |
4, 5, 4, 4 |
|
2568 |
4.25 |
Out-of-Distribution Generalization with Maximal Invariant Predictor |
4, 5, 3, 5 |
|
2569 |
4.25 |
Knowledge Distillation By Sparse Representation Matching |
4, 5, 5, 3 |
|
2570 |
4.25 |
Hierarchical Binding in Convolutional Neural Networks Confers Adversarial Robustness |
5, 5, 3, 4 |
|
2571 |
4.25 |
Distribution Embedding Network for Meta-Learning with Variable-Length Input |
4, 4, 4, 5 |
|
2572 |
4.25 |
Grounded Compositional Generalization with Environment Interactions |
4, 5, 5, 3 |
|
2573 |
4.25 |
Compressing gradients in distributed SGD by exploiting their temporal correlation |
5, 2, 4, 6 |
|
2574 |
4.25 |
Conditional Networks |
4, 4, 6, 3 |
|
2575 |
4.25 |
On Batch-size Selection for Stochastic Training for Graph Neural Networks |
4, 4, 5, 4 |
|
2576 |
4.25 |
DeepLTRS: A Deep Latent Recommender System based on User Ratings and Reviews |
4, 3, 5, 5 |
|
2577 |
4.25 |
Imagine That! Leveraging Emergent Affordances for 3D Tool Synthesis |
4, 4, 4, 5 |
|
2578 |
4.25 |
Expectigrad: Fast Stochastic Optimization with Robust Convergence Properties |
5, 4, 3, 5 |
|
2579 |
4.25 |
Model-Agnostic Round-Optimal Federated Learning via Knowledge Transfer |
5, 4, 4, 4 |
|
2580 |
4.25 |
Can Kernel Transfer Operators Help Flow based Generative Models? |
5, 5, 5, 2 |
|
2581 |
4.25 |
The Foes of Neural Network’s Data Efficiency Among Unnecessary Input Dimensions |
4, 5, 5, 3 |
|
2582 |
4.25 |
Achieving Explainability in a Visual Hard Attention Model through Content Prediction |
4, 4, 5, 4 |
|
2583 |
4.25 |
Federated Mixture of Experts |
4, 4, 4, 5 |
|
2584 |
4.25 |
Multi-EPL: Accurate Multi-source Domain Adaptation |
5, 4, 4, 4 |
|
2585 |
4.25 |
Evaluating Online Continual Learning with CALM |
3, 4, 4, 6 |
|
2586 |
4.25 |
Convolutional Complex Knowledge Graph Embeddings |
5, 4, 4, 4 |
|
2587 |
4.25 |
Adaptive Optimizers with Sparse Group Lasso |
5, 4, 5, 3 |
|
2588 |
4.25 |
TOMA: Topological Map Abstraction for Reinforcement Learning |
5, 3, 5, 4 |
|
2589 |
4.25 |
On the Effectiveness of Deep Ensembles for Small Data Tasks |
5, 4, 5, 3 |
|
2590 |
4.25 |
Linear Convergence and Implicit Regularization of Generalized Mirror Descent with Time-Dependent Mirrors |
3, 5, 4, 5 |
|
2591 |
4.25 |
The Effectiveness of Memory Replay in Large Scale Continual Learning |
5, 5, 3, 4 |
|
2592 |
4.25 |
A Closer Look at Codistillation for Distributed Training |
5, 4, 4, 4 |
|
2593 |
4.25 |
Error Controlled Actor-Critic Method to Reinforcement Learning |
6, 3, 3, 5 |
|
2594 |
4.25 |
Deep Ecological Inference |
3, 4, 7, 3 |
|
2595 |
4.25 |
Improving Zero-Shot Neural Architecture Search with Parameters Scoring |
5, 4, 5, 3 |
|
2596 |
4.25 |
Efficiently labelling sequences using semi-supervised active learning |
5, 5, 3, 4 |
|
2597 |
4.25 |
Transferred Discrepancy: Quantifying the Difference Between Representations |
4, 5, 5, 3 |
|
2598 |
4.25 |
Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms |
6, 3, 4, 4 |
|
2599 |
4.25 |
End-to-end Quantized Training via Log-Barrier Extensions |
3, 6, 5, 3 |
|
2600 |
4.25 |
Dual Averaging is Surprisingly Effective for Deep Learning Optimization |
6, 3, 4, 4 |
|
2601 |
4.25 |
Connection-Adaptive Meta-Learning |
3, 4, 5, 5 |
|
2602 |
4.25 |
Weak and Strong Gradient Directions: Explaining Memorization, Generalization, and Hardness of Examples at Scale |
4, 4, 4, 5 |
|
2603 |
4.25 |
Hidden Markov models are recurrent neural networks: A disease progression modeling application |
4, 3, 5, 5 |
|
2604 |
4.25 |
The 3TConv: An Intrinsic Approach to Explainable 3D CNNs |
6, 3, 3, 5 |
|
2605 |
4.25 |
Graph-Based Neural Network Models with Multiple Self-Supervised Auxiliary Tasks |
5, 4, 4, 4 |
|
2606 |
4.25 |
Leveraging affinity cycle consistency to isolate factors of variation in learned representations |
4, 4, 3, 6 |
|
2607 |
4.25 |
Noisy Differentiable Architecture Search |
5, 5, 5, 2 |
|
2608 |
4.25 |
DHOG: Deep Hierarchical Object Grouping |
4, 3, 6, 4 |
|
2609 |
4.25 |
Neural Time-Dependent Partial Differential Equation |
5, 4, 5, 3 |
|
2610 |
4.2 |
Fine-Tuning Offline Reinforcement Learning with Model-Based Policy Optimization |
4, 5, 4, 5, 3 |
|
2611 |
4.2 |
Understanding How Over-Parametrization Leads to Acceleration: A case of learning a single teacher neuron |
5, 5, 4, 4, 3 |
|
2612 |
4 |
Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning |
4, 3, 3, 6 |
|
2613 |
4 |
Synthesising Realistic Calcium Imaging Data of Neuronal Populations Using GAN |
4, 5, 3 |
|
2614 |
4 |
Rethinking Graph Neural Networks for Graph Coloring |
2, 6, 5, 3 |
|
2615 |
4 |
Variational Deterministic Uncertainty Quantification |
2, 5, 5, 4 |
|
2616 |
4 |
Federated Learning with Decoupled Probabilistic-Weighted Gradient Aggregation |
4, 3, 6, 3 |
|
2617 |
4 |
Play to Grade: Grading Interactive Coding Games as Classifying Markov Decision Process |
5, 3, 4 |
|
2618 |
4 |
Differentially Private Synthetic Data: Applied Evaluations and Enhancements |
4, 4, 4 |
|
2619 |
4 |
Effective Subspace Indexing via Interpolation on Stiefel and Grassmann manifolds |
4, 3, 4, 5 |
|
2620 |
4 |
Exploring Target Driven Image Classification |
4, 4, 5, 2, 5 |
|
2621 |
4 |
Experimental Design for Overparameterized Learning with Application to Single Shot Deep Active Learning |
4, 4, 3, 5 |
|
2622 |
4 |
Adaptive N-step Bootstrapping with Off-policy Data |
3, 4, 4, 5 |
|
2623 |
4 |
NASLib: A Modular and Flexible Neural Architecture Search Library |
5, 4, 4, 3 |
|
2624 |
4 |
Pair-based Self-Distillation for Semi-supervised Domain Adaptation |
3, 5, 4 |
|
2625 |
4 |
Semi-Supervised Audio Representation Learning for Modeling Beehive Strengths |
5, 3, 4 |
|
2626 |
4 |
MoCo-Pretraining Improves Representations and Transferability of Chest X-ray Models |
6, 5, 2, 3 |
|
2627 |
4 |
AttackDist: Characterizing Zero-day Adversarial Samples by Counter Attack |
5, 5, 3, 3 |
|
2628 |
4 |
RoeNets: Predicting Discontinuity of Hyperbolic Systems from Continuous Data |
3, 5, 4 |
|
2629 |
4 |
Unsupervised Learning of Slow Features for Data Efficient Regression |
3, 4, 4, 5 |
|
2630 |
4 |
Class-Weighted Evaluation Metrics for Imbalanced Data Classification |
4, 3, 3, 6 |
|
2631 |
4 |
cross-modal knowledge enhancement mechanism for few-shot learning |
3, 5, 4, 4 |
|
2632 |
4 |
LATENT OPTIMIZATION VARIATIONAL AUTOENCODER FOR CONDITIONAL MOLECULAR GENERATION |
4, 3, 5, 4 |
|
2633 |
4 |
NOSE Augment: Fast and Effective Data Augmentation Without Searching |
4, 3, 5 |
|
2634 |
4 |
Unsupervised Class-Incremental Learning through Confusion |
6, 4, 3, 3 |
|
2635 |
4 |
A new accelerated gradient method inspired by continuous-time perspective |
4, 4, 4, 4 |
|
2636 |
4 |
Trust, but verify: model-based exploration in sparse reward environments |
4, 6, 4, 2 |
|
2637 |
4 |
Difference-in-Differences: Bridging Normalization and Disentanglement in PG-GAN |
4, 3, 5 |
|
2638 |
4 |
Efficiently Disentangle Causal Representations |
4, 5, 3 |
|
2639 |
4 |
Learning Collision-free Latent Space for Bayesian Optimization |
4, 4, 3, 5 |
|
2640 |
4 |
OFFER PERSONALIZATION USING TEMPORAL CONVOLUTION NETWORK AND OPTIMIZATION |
5, 3, 4 |
|
2641 |
4 |
Differentiable Programming for Piecewise Polynomial Functions |
3, 5, 4, 4 |
|
2642 |
4 |
Out-of-Core Training for Extremely Large-Scale Neural Networks with Adaptive Window-Based Scheduling |
4, 4, 4, 4 |
|
2643 |
4 |
Momentum Contrastive Autoencoder |
5, 3, 4, 4 |
|
2644 |
4 |
Vision at A Glance: Interplay between Fine and Coarse Information Processing Pathways |
6, 3, 3 |
|
2645 |
4 |
Shuffle to Learn: Self-supervised learning from permutations via differentiable ranking |
4, 4, 4 |
|
2646 |
4 |
Deep Evolutionary Learning for Molecular Design |
4, 4, 4, 4 |
|
2647 |
4 |
One Size Doesn’t Fit All: Adaptive Label Smoothing |
4, 4, 4, 4 |
|
2648 |
4 |
Transforming Recurrent Neural Networks with Attention and Fixed-point Equations |
5, 4, 4, 3 |
|
2649 |
4 |
MOFA: Modular Factorial Design for Hyperparameter Optimization |
5, 3, 4, 4 |
|
2650 |
4 |
Nonconvex Continual Learning with Episodic Memory |
5, 4, 3, 4 |
|
2651 |
4 |
Provable Robust Learning under Agnostic Corrupted Supervision |
4, 4, 5, 3 |
|
2652 |
4 |
Deep Retrieval: An End-to-End Structure Model for Large-Scale Recommendations |
4, 5, 3, 4 |
|
2653 |
4 |
BAAAN: Backdoor Attacks Against Auto-encoder and GAN-Based Machine Learning Models |
4, 5, 3, 4 |
|
2654 |
4 |
Recurrent Neural Network Architecture based on Dynamic Systems Theory for Data Driven Modelling of Complex Physical Systems |
3, 4, 6, 3 |
|
2655 |
4 |
BaSIL: Learning Incrementally using a Bayesian Memory-Based Streaming Approach |
3, 7, 3, 3 |
|
2656 |
4 |
Distantly supervised end-to-end medical entity extraction from electronic health records with human-level quality |
3, 4, 4, 5 |
|
2657 |
4 |
Complex neural networks have no spurious local minima |
4, 4, 4 |
|
2658 |
4 |
Robust Learning via Golden Symmetric Loss of (un)Trusted Labels |
4, 4, 5, 3 |
|
2659 |
4 |
Disentanglement, Visualization and Analysis of Complex Features in DNNs |
3, 6, 3, 4 |
|
2660 |
4 |
RETHINKING LOCAL LOW RANK MATRIX DETECTION: A MULTIPLE-FILTER BASED NEURAL NETWORK FRAMEWORK |
3, 4, 5 |
|
2661 |
4 |
Learning from deep model via exploring local targets |
5, 3, 4, 4 |
|
2662 |
4 |
VideoGen: Generative Modeling of Videos using VQ-VAE and Transformers |
4, 4, 4, 4 |
|
2663 |
4 |
The large learning rate phase of deep learning |
5, 4, 3 |
|
2664 |
4 |
LEARNING BILATERAL CLIPPING PARAMETRIC ACTIVATION FUNCTION FOR LOW-BIT NEURAL NETWORKS |
5, 4, 3, 4 |
|
2665 |
4 |
TraDE: A Simple Self-Attention-Based Density Estimator |
5, 4, 3 |
|
2666 |
4 |
Faster and Smarter AutoAugment: Augmentation Policy Search Based on Dynamic Data-Clustering |
5, 4, 3, 4 |
|
2667 |
4 |
Rotograd: Dynamic Gradient Homogenization for Multitask Learning |
4, 4, 4 |
|
2668 |
4 |
Transferable Feature Learning on Graphs Across Visual Domains |
5, 4, 3, 4 |
|
2669 |
4 |
Measuring Progress in Deep Reinforcement Learning Sample Efficiency |
5, 2, 5, 4 |
|
2670 |
4 |
Learning to Disentangle Textual Representations and Attributes via Mutual Information |
4, 4, 4 |
|
2671 |
4 |
Symbol-Shift Equivariant Neural Networks |
5, 3, 4 |
|
2672 |
4 |
Leveraging the Variance of Return Sequences for Exploration Policy |
5, 5, 4, 2 |
|
2673 |
4 |
Autonomous Learning of Object-Centric Abstractions for High-Level Planning |
3, 4, 5, 4 |
|
2674 |
4 |
AdaS: Adaptive Scheduling of Stochastic Gradients |
5, 4, 4, 3 |
|
2675 |
4 |
Sample Balancing for Improving Generalization under Distribution Shifts |
6, 3, 3, 4 |
|
2676 |
4 |
Cross-Modal Retrieval Augmentation for Multi-Modal Classification |
3, 4, 5 |
|
2677 |
4 |
LayoutTransformer: Relation-Aware Scene Layout Generation |
4, 4, 4, 4 |
|
2678 |
4 |
An empirical study of a pruning mechanism |
4, 4, 4, 4 |
|
2679 |
4 |
Learning Disconnected Manifolds: Avoiding The No Gan’s Land by Latent Rejection |
4, 4, 4 |
|
2680 |
4 |
Uncertainty-Based Adaptive Learning for Reading Comprehension |
5, 4, 3, 4 |
|
2681 |
4 |
On the Importance of Looking at the Manifold |
4, 3, 5, 4 |
|
2682 |
4 |
Graph-Graph Similarity Network |
2, 5, 4, 5 |
|
2683 |
4 |
Erasure for Advancing: Dynamic Self-Supervised Learning for Commonsense Reasoning |
4, 3, 5, 4 |
|
2684 |
4 |
Regret Bounds and Reinforcement Learning Exploration of EXP-based Algorithms |
4, 4, 4 |
|
2685 |
4 |
Crowd-sourced Phrase-Based Tokenization for Low-Resourced Neural Machine Translation: The case of Fon Language |
4, 3, 5 |
|
2686 |
4 |
Non-Linear Rewards For Successor Features |
4, 4, 4, 4 |
|
2687 |
4 |
Legendre Deep Neural Network (LDNN) and its application for approximation of nonlinear Volterra–Fredholm–Hammerstein integral equations |
5, 3, 4 |
|
2688 |
4 |
CNN Based Analysis of the Luria’s Alternating Series Test for Parkinson’s Disease Diagnostics |
5, 5, 2, 4 |
|
2689 |
4 |
Dynamic Probabilistic Pruning: Training sparse networks based on stochastic and dynamic masking |
5, 4, 5, 2 |
|
2690 |
4 |
PriorityCut: Occlusion-aware Regularization for Image Animation |
5, 4, 5, 2 |
|
2691 |
4 |
BURT: BERT-inspired Universal Representation from Learning Meaningful Segment |
6, 3, 3, 4, 4 |
|
2692 |
4 |
On the Discovery of Feature Importance Distribution: An Overlooked Area |
3, 5, 4 |
|
2693 |
4 |
On the use of linguistic similarities to improve Neural Machine Translation for African Languages |
4, 4, 5, 3 |
|
2694 |
4 |
End-to-End on-device Federated Learning: A case study |
4, 2, 4, 6 |
|
2695 |
4 |
A Transformer-based Framework for Multivariate Time Series Representation Learning |
4, 4, 4, 4 |
|
2696 |
4 |
Data Transfer Approaches to Improve Seq-to-Seq Retrosynthesis |
4, 4, 4, 4 |
|
2697 |
4 |
ADIS-GAN: Affine Disentangled GAN |
3, 4, 5 |
|
2698 |
4 |
Abductive Knowledge Induction from Raw Data |
4, 4, 3, 5 |
|
2699 |
4 |
UserBERT: Self-supervised User Representation Learning |
4, 3, 4, 5 |
|
2700 |
4 |
Prior Knowledge Representation for Self-Attention Networks |
4, 5, 3 |
|
2701 |
4 |
Frequency-aware Interface Dynamics with Generative Adversarial Networks |
5, 3, 4 |
|
2702 |
4 |
Inverse Problems, Deep Learning, and Symmetry Breaking |
3, 4, 5, 4 |
|
2703 |
4 |
Learning to Recover from Failures using Memory |
4, 4, 4, 4 |
|
2704 |
4 |
Learning Semantic Similarities for Prototypical Classifiers |
4, 4, 4, 4 |
|
2705 |
4 |
Unsupervised Disentanglement Learning by intervention |
2, 5, 5 |
|
2706 |
4 |
Optimizing Quantized Neural Networks with Natural Gradient |
5, 3, 3, 5 |
|
2707 |
4 |
Explicit homography estimation improves contrastive self-supervised learning |
4, 4, 4, 4 |
|
2708 |
4 |
FORK: A FORward-looKing Actor for Model-Free Reinforcement Learning |
3, 5, 3, 5 |
|
2709 |
4 |
Analysis of Alignment Phenomenon in Simple Teacher-student Networks with Finite Width |
4, 4, 5, 3 |
|
2710 |
4 |
Multi-scale Network Architecture Search for Object Detection |
3, 4, 4, 5 |
|
2711 |
4 |
GenAD: General Representations of Multivariate Time Series for Anomaly Detection |
4, 5, 3 |
|
2712 |
4 |
Identifying Coarse-grained Independent Causal Mechanisms with Self-supervision |
5, 2, 5 |
|
2713 |
4 |
Ballroom Dance Movement Recognition Using a Smart Watch and Representation Learning |
4, 4, 4 |
|
2714 |
4 |
Overinterpretation reveals image classification model pathologies |
6, 3, 2, 5 |
|
2715 |
4 |
Efficient Neural Machine Translation with Prior Word Alignment |
3, 5, 4 |
|
2716 |
4 |
Graph Convolutional Value Decomposition in Multi-Agent Reinforcement Learning |
4, 3, 4, 5 |
|
2717 |
4 |
Contrasting distinct structured views to learn sentence embeddings |
4, 3, 5 |
|
2718 |
4 |
Learning to Represent Programs with Heterogeneous Graphs |
4, 5, 5, 2 |
|
2719 |
4 |
Recovering Geometric Information with Learned Texture Perturbations |
4, 3, 5, 4 |
|
2720 |
4 |
EMPIRICAL UPPER BOUND IN OBJECT DETECTION |
4, 3, 5, 4 |
|
2721 |
4 |
The Importance of Importance Sampling for Deep Budgeted Training |
5, 3, 4, 4 |
|
2722 |
4 |
BalaGAN: Image Translation Between Imbalanced Domains via Cross-Modal Transfer |
4, 5, 3, 4 |
|
2723 |
4 |
Inhibition-augmented ConvNets |
5, 3, 4, 4 |
|
2724 |
4 |
A Large-scale Study on Training Sample Memorization in Generative Modeling |
5, 3, 4 |
|
2725 |
4 |
Learn2Weight: Weights Transfer Defense against Similar-domain Adversarial Attacks |
4, 5, 3 |
|
2726 |
4 |
Discrete Predictive Representation for Long-horizon Planning |
4, 4, 4, 4 |
|
2727 |
4 |
Hellinger Distance Constrained Regression |
5, 4, 3, 4 |
|
2728 |
4 |
A first look into the carbon footprint of federated learning |
4, 6, 3, 3 |
|
2729 |
4 |
AdaDGS: An adaptive black-box optimization method with a nonlocal directional Gaussian smoothing gradient |
4, 4, 3, 5 |
|
2730 |
4 |
Defending against black-box adversarial attacks with gradient-free trained sign activation neural networks |
3, 5, 4 |
|
2731 |
4 |
Toward Synergism in Macro Action Ensembles |
4, 4, 4, 4 |
|
2732 |
4 |
Disentangling Action Sequences: Discovering Correlated Samples |
3, 4, 6, 5, 2 |
|
2733 |
4 |
Hard-label Manifolds: Unexpected advantages of query efficiency for finding on-manifold adversarial examples |
5, 3, 4 |
|
2734 |
4 |
DynamicVAE: Decoupling Reconstruction Error and Disentangled Representation Learning |
4, 4, 4, 4 |
|
2735 |
4 |
Improving Tail Label Prediction for Extreme Multi-label Learning |
4, 5, 3 |
|
2736 |
4 |
FTSO: Effective NAS via First Topology Second Operator |
3, 5, 4 |
|
2737 |
4 |
QuatRE: Relation-Aware Quaternions for Knowledge Graph Embeddings |
5, 5, 2, 4 |
|
2738 |
4 |
An Examination of Preference-based Reinforcement Learning for Treatment Recommendation |
4, 4, 4 |
|
2739 |
4 |
Optimizing Large-Scale Hyperparameters via Automated Learning Algorithm |
5, 4, 4, 3 |
|
2740 |
4 |
Adversarial and Natural Perturbations for General Robustness |
4, 4, 4 |
|
2741 |
4 |
Intrinsically Guided Exploration in Meta Reinforcement Learning |
4, 4, 4, 4 |
|
2742 |
3.8 |
More Side Information, Better Pruning: Shared-Label Classification as a Case Study |
3, 4, 2, 6, 4 |
|
2743 |
3.8 |
Towards Powerful Graph Neural Networks: Diversity Matters |
3, 4, 4, 4, 4 |
|
2744 |
3.8 |
Memory Representation in Transformer |
4, 3, 4, 5, 3 |
|
2745 |
3.8 |
Domain Adaptation with Morphologic Segmentation |
4, 5, 3, 3, 4 |
|
2746 |
3.8 |
Graph View-Consistent Learning Network |
5, 4, 4, 3, 3 |
|
2747 |
3.8 |
An Euler-based GAN for time series |
5, 3, 5, 3, 3 |
|
2748 |
3.8 |
TOWARDS NATURAL ROBUSTNESS AGAINST ADVERSARIAL EXAMPLES |
3, 3, 3, 5, 5 |
|
2749 |
3.8 |
Exploiting Weight Redundancy in CNNs: Beyond Pruning and Quantization |
3, 5, 4, 4, 3 |
|
2750 |
3.8 |
Cost-efficient SVRG with Arbitrary Sampling |
3, 4, 4, 4, 4 |
|
2751 |
3.75 |
ROGA: Random Over-sampling Based on Genetic Algorithm |
4, 3, 5, 3 |
|
2752 |
3.75 |
Playing Atari with Capsule Networks: A systematic comparison of CNN and CapsNets-based agents. |
4, 4, 5, 2 |
|
2753 |
3.75 |
Efficient Learning of Less Biased Models with Transfer Learning |
5, 3, 4, 3 |
|
2754 |
3.75 |
On Flat Minima, Large Margins and Generalizability |
3, 4, 4, 4 |
|
2755 |
3.75 |
Towards Robust Textual Representations with Disentangled Contrastive Learning |
4, 3, 5, 3 |
|
2756 |
3.75 |
Multi-Faceted Trust Based Recommendation System |
4, 4, 4, 3 |
|
2757 |
3.75 |
Toward Understanding Supervised Representation Learning with RKHS and GAN |
3, 5, 3, 4 |
|
2758 |
3.75 |
Unified analytic forms for Convolutional Neural Networks and Wavelet Filter Banks |
4, 2, 5, 4 |
|
2759 |
3.75 |
Transformers satisfy |
4, 3, 4, 4 |
|
2760 |
3.75 |
Privacy-preserving Learning via Deep Net Pruning |
2, 4, 5, 4 |
|
2761 |
3.75 |
Accurate Word Representations with Universal Visual Guidance |
3, 4, 4, 4 |
|
2762 |
3.75 |
Greedy Multi-Step Off-Policy Reinforcement Learning |
5, 4, 4, 2 |
|
2763 |
3.75 |
Improved generalization by noise enhancement |
4, 4, 3, 4 |
|
2764 |
3.75 |
Quantum and Translation Embedding for Knowledge Graph Completion |
4, 4, 3, 4 |
|
2765 |
3.75 |
Spatial Frequency Bias in Convolutional Generative Adversarial Networks |
5, 3, 4, 3 |
|
2766 |
3.75 |
Cross-lingual Transfer Learning for Pre-trained Contextualized Language Models |
4, 4, 3, 4 |
|
2767 |
3.75 |
Multilayer Dense Connections for Hierarchical Concept Classification |
2, 5, 5, 3 |
|
2768 |
3.75 |
Domain Knowledge in Exploration Noise in AlphaZero |
4, 4, 4, 3 |
|
2769 |
3.75 |
Smooth Activations and Reproducibility in Deep Networks |
2, 4, 5, 4 |
|
2770 |
3.75 |
Unsupervised Discovery of Interpretable Latent Manipulations in Language VAEs |
4, 5, 3, 3 |
|
2771 |
3.75 |
Guiding Neural Network Initialization via Marginal Likelihood Maximization |
3, 4, 4, 4 |
|
2772 |
3.75 |
Task-similarity Aware Meta-learning through Nonparametric Kernel Regression |
4, 4, 4, 3 |
|
2773 |
3.75 |
Graph Pooling by Edge Cut |
3, 3, 5, 4 |
|
2774 |
3.75 |
Generating universal language adversarial examples by understanding and enhancing the transferability across neural models |
3, 5, 4, 3 |
|
2775 |
3.75 |
Empirical Frequentist Coverage of Deep Learning Uncertainty Quantification Procedures |
4, 4, 4, 3 |
|
2776 |
3.75 |
Sequential Normalization: an improvement over Ghost Normalization |
4, 4, 4, 3 |
|
2777 |
3.75 |
Learning to Dynamically Select Between Reward Shaping Signals |
4, 4, 2, 5 |
|
2778 |
3.75 |
Perfect density models cannot guarantee anomaly detection |
3, 4, 4, 4 |
|
2779 |
3.75 |
RNA Alternative Splicing Prediction with Discrete Compositional Energy Network |
4, 4, 4, 3 |
|
2780 |
3.75 |
Empirically Verifying Hypotheses Using Reinforcement Learning |
4, 5, 3, 3 |
|
2781 |
3.75 |
LINGUINE: LearnIng to pruNe on subGraph convolUtIon NEtworks |
5, 4, 3, 3 |
|
2782 |
3.75 |
A Spectral Perspective of Neural Networks Robustness to Label Noise |
3, 4, 3, 5 |
|
2783 |
3.75 |
Detecting Adversarial Examples by Additional Evidence from Noise Domain |
4, 4, 3, 4 |
|
2784 |
3.75 |
PERIL: Probabilistic Embeddings for hybrid Meta-Reinforcement and Imitation Learning |
4, 4, 3, 4 |
|
2785 |
3.75 |
A Gradient-based Kernel Approach for Efficient Network Architecture Search |
4, 4, 3, 4 |
|
2786 |
3.75 |
The Card Shuffling Hypotheses: Building a Time and Memory Efficient Graph Convolutional Network |
4, 3, 4, 4 |
|
2787 |
3.75 |
Adaptive Automotive Radar data Acquisition |
4, 4, 3, 4 |
|
2788 |
3.75 |
Stochastic Normalized Gradient Descent with Momentum for Large Batch Training |
3, 4, 4, 4 |
|
2789 |
3.75 |
Deep Reinforcement Learning for Optimal Stopping with Application in Financial Engineering |
5, 4, 4, 2 |
|
2790 |
3.75 |
Introducing Sample Robustness |
5, 4, 2, 4 |
|
2791 |
3.75 |
AETree: Areal Spatial Data Generation |
5, 5, 2, 3 |
|
2792 |
3.75 |
On the cost of homogeneous network building blocks and parameter sharing |
4, 3, 4, 4 |
|
2793 |
3.75 |
Hybrid Quantum-Classical Stochastic Networks with Boltzmann Layers |
3, 5, 4, 3 |
|
2794 |
3.75 |
Succinct Explanations with Cascading Decision Trees |
3, 5, 3, 4 |
|
2795 |
3.75 |
Evaluating Agents Without Rewards |
3, 4, 4, 4 |
|
2796 |
3.75 |
Search Data Structure Learning |
4, 4, 4, 3 |
|
2797 |
3.75 |
EMTL: A Generative Domain Adaptation Approach |
4, 3, 5, 3 |
|
2798 |
3.75 |
Conditioning Trick for Training Stable GANs |
3, 5, 3, 4 |
|
2799 |
3.75 |
Bayesian Neural Networks with Variance Propagation for Uncertainty Evaluation |
4, 3, 4, 4 |
|
2800 |
3.75 |
Federated learning using mixture of experts |
6, 3, 3, 3 |
|
2801 |
3.75 |
Cross-Attention Guided Network for Visual Tracking |
3, 3, 5, 4 |
|
2802 |
3.75 |
Self-Supervised Continuous Control without Policy Gradient |
4, 4, 4, 3 |
|
2803 |
3.75 |
Revisiting Graph Neural Networks for Link Prediction |
3, 4, 5, 3 |
|
2804 |
3.75 |
HYPE-C: Evaluating Image Completion Models Through Standardized Crowdsourcing |
4, 3, 4, 4 |
|
2805 |
3.75 |
Neural Networks Preserve Invertibility Across Iterations: A Possible Source of Implicit Data Augmentation |
5, 4, 2, 4 |
|
2806 |
3.75 |
Using MMD GANs to correct physics models and improve Bayesian parameter estimation |
4, 4, 3, 4 |
|
2807 |
3.75 |
Few-Round Learning for Federated Learning |
3, 4, 5, 3 |
|
2808 |
3.75 |
Max-Affine Spline Insights Into Deep Network Pruning |
4, 4, 5, 2 |
|
2809 |
3.75 |
A straightforward line search approach on the expected empirical loss for stochastic deep learning problems |
3, 4, 4, 4 |
|
2810 |
3.75 |
An Empirical Study of the Expressiveness of Graph Kernels and Graph Neural Networks |
4, 3, 4, 4 |
|
2811 |
3.75 |
Learning Graph Normalization for Graph Neural Networks |
4, 4, 3, 4 |
|
2812 |
3.75 |
Stochastic Optimization with Non-stationary Noise: The Power of Moment Estimation |
3, 4, 5, 3 |
|
2813 |
3.75 |
Deep Ensembles for Low-Data Transfer Learning |
4, 3, 3, 5 |
|
2814 |
3.75 |
Decorrelated Double Q-learning |
5, 3, 3, 4 |
|
2815 |
3.75 |
Constraining Latent Space to Improve Deep Self-Supervised e-Commerce Products Embeddings for Downstream Tasks |
5, 3, 4, 3 |
|
2816 |
3.75 |
Adaptive Learning Rates with Maximum Variation Averaging |
4, 4, 4, 3 |
|
2817 |
3.75 |
Asymptotic Optimality of Self-Representative Low-Rank Approximation and Its Applications |
4, 4, 4, 3 |
|
2818 |
3.75 |
Learned residual Gerchberg-Saxton network for computer generated holography |
3, 4, 5, 3 |
|
2819 |
3.75 |
On the Benefits of Early Fusion in Multimodal Representation Learning |
4, 4, 3, 4 |
|
2820 |
3.75 |
A General Computational Framework to Measure the Expressiveness of Complex Networks using a Tight Upper Bound of Linear Regions |
4, 4, 4, 3 |
|
2821 |
3.75 |
Generative Auto-Encoder: Non-adversarial Controllable Synthesis with Disentangled Exploration |
3, 5, 3, 4 |
|
2822 |
3.75 |
Model agnostic meta-learning on trees |
3, 4, 5, 3 |
|
2823 |
3.75 |
Temporal Attention Modules for Memory-Augmented Neural Networks |
5, 4, 3, 3 |
|
2824 |
3.75 |
Modelling Drug-Target Binding Affinity using a BERT based Graph Neural network |
3, 4, 4, 4 |
|
2825 |
3.75 |
Representation Quality Of Neural Networks Links To Adversarial Attacks and Defences |
4, 3, 4, 4 |
|
2826 |
3.75 |
Dynamic Relational Inference in Multi-Agent Trajectories |
4, 5, 4, 2 |
|
2827 |
3.75 |
Fighting Filterbubbles with Adversarial BERT-Training for News-Recommendation |
5, 4, 3, 3 |
|
2828 |
3.75 |
Highway-Connection Classifier Networks for Plastic yet Stable Continual Learning |
4, 3, 4, 4 |
|
2829 |
3.75 |
Predicting Video with VQVAE |
4, 4, 3, 4 |
|
2830 |
3.75 |
MASP: Model-Agnostic Sample Propagation for Few-shot learning |
3, 5, 4, 3 |
|
2831 |
3.75 |
CAFE: Catastrophic Data Leakage in Federated Learning |
4, 3, 4, 4 |
|
2832 |
3.75 |
FASG: Feature Aggregation Self-training GCN for Semi-supervised Node Classification |
4, 4, 4, 3 |
|
2833 |
3.67 |
Offline Policy Optimization with Variance Regularization |
4, 4, 3 |
|
2834 |
3.67 |
Bractivate: Dendritic Branching in Segmentation Neural Architecture Search |
4, 4, 3 |
|
2835 |
3.67 |
Unsupervised Word Translation Pairing using Refinement based Point Set Registration |
3, 4, 4 |
|
2836 |
3.67 |
A self-explanatory method for the black problem on discrimination part of CNN |
5, 3, 3 |
|
2837 |
3.67 |
Ruminating Word Representations with Random Noise Masking |
4, 4, 3 |
|
2838 |
3.67 |
Temperature Regret Matching for Imperfect-Information Games |
6, 2, 3 |
|
2839 |
3.67 |
Don’t Trigger Me! A Triggerless Backdoor Attack Against Deep Neural Networks |
3, 3, 5 |
|
2840 |
3.67 |
DACT-BERT: Increasing the efficiency and interpretability of BERT by using adaptive computation time. |
3, 5, 3 |
|
2841 |
3.67 |
AE-SMOTE: A Multi-Modal Minority Oversampling Framework |
3, 4, 4 |
|
2842 |
3.67 |
Meta-k: Towards Unsupervised Prediction of Number of Clusters |
4, 4, 3 |
|
2843 |
3.67 |
NODE-SELECT: A FLEXIBLE GRAPH NEURAL NETWORK BASED ON REALISTIC PROPAGATION SCHEME |
4, 3, 4 |
|
2844 |
3.67 |
TimeAutoML: Autonomous Representation Learning for Multivariate Irregularly Sampled Time Series |
4, 3, 4 |
|
2845 |
3.67 |
Addressing Extrapolation Error in Deep Offline Reinforcement Learning |
4, 4, 3 |
|
2846 |
3.67 |
CoNES: Convex Natural Evolutionary Strategies |
3, 2, 6 |
|
2847 |
3.67 |
Towards Generalized Artificial Intelligence by Assessment Aggregation with Applications to Standard and Extreme Classifications |
6, 3, 2 |
|
2848 |
3.67 |
Batch Inverse-Variance Weighting: Deep Heteroscedastic Regression using Privileged Information |
3, 4, 4 |
|
2849 |
3.67 |
An Adversarial Attack via Feature Contributive Regions |
3, 5, 3 |
|
2850 |
3.67 |
Don’t be picky, all students in the right family can learn from good teachers |
5, 3, 3 |
|
2851 |
3.67 |
$\alpha$VIL: Learning to Leverage Auxiliary Tasks for Multitask Learning |
4, 4, 3 |
|
2852 |
3.67 |
Pseudo Label-Guided Multi Task Learning for Scene Understanding |
3, 4, 4 |
|
2853 |
3.67 |
On the relationship between topology and gradient propagation in deep networks |
2, 6, 3 |
|
2854 |
3.67 |
Automatic Music Production Using Generative Adversarial Networks |
2, 4, 5 |
|
2855 |
3.67 |
Single Image Depth Estimation Based on Spectral Consistency and Predicted View |
3, 4, 4 |
|
2856 |
3.67 |
Evaluating Gender Bias in Natural Language Inference |
4, 4, 3 |
|
2857 |
3.67 |
Frequency Regularized Deep Convolutional Dictionary Learning and Application to Blind Denoising |
4, 3, 4 |
|
2858 |
3.67 |
Optimal Designs of Gaussian Processes with Budgets for Hyperparameter Optimization |
4, 4, 3 |
|
2859 |
3.67 |
Boltzman Tuning of Generative Models |
4, 3, 4 |
|
2860 |
3.6 |
Real-Time AutoML |
4, 4, 2, 4, 4 |
|
2861 |
3.5 |
Stochastic Proximal Point Algorithm for Large-scale Nonconvex Optimization: Convergence, Implementation, and Application to Neural Networks |
4, 3, 3, 4 |
|
2862 |
3.5 |
Learning to Control on the Fly |
3, 4, 4, 3 |
|
2863 |
3.5 |
CLARE-GAN: GENERATION OF CLASS-SPECIFIC TIME SERIES |
3, 4, 4, 3 |
|
2864 |
3.5 |
Information-theoretic Vocabularization via Optimal Transport |
4, 4, 3, 3 |
|
2865 |
3.5 |
Embedding semantic relationships in hidden representations via label smoothing |
5, 3, 2, 4 |
|
2866 |
3.5 |
Zero-Shot Recognition through Image-Guided Semantic Classification |
3, 4, 3, 4 |
|
2867 |
3.5 |
Measuring GAN Training in Real Time |
2, 4, 5, 3 |
|
2868 |
3.5 |
Polar Embedding |
4, 4, 3, 3 |
|
2869 |
3.5 |
Generalization and Stability of GANs: A theory and promise from data augmentation |
3, 4, 3, 4 |
|
2870 |
3.5 |
Deep Ensembles with Hierarchical Diversity Pruning |
3, 3, 4, 4 |
|
2871 |
3.5 |
Deep Reinforcement Learning With Adaptive Combined Critics |
3, 5, 3, 3 |
|
2872 |
3.5 |
Collaborative Filtering with Smooth Reconstruction of the Preference Function |
4, 3, 4, 3 |
|
2873 |
3.5 |
Prediction of Enzyme Specificity using Protein Graph Convolutional Neural Networks |
3, 4, 4, 3 |
|
2874 |
3.5 |
Efficient estimates of optimal transport via low-dimensional embeddings |
4, 4, 2, 4 |
|
2875 |
3.5 |
A Real-time Contribution Measurement Method for Participants in Federated Learning |
3, 4, 3, 4 |
|
2876 |
3.5 |
Hindsight Curriculum Generation Based Multi-Goal Experience Replay |
3, 4, 4, 3 |
|
2877 |
3.5 |
Deep Denoising for Scientific Discovery: A Case Study in Electron Microscopy |
5, 3, 4, 2 |
|
2878 |
3.5 |
An Algorithm for Out-Of-Distribution Attack to Neural Network Encoder |
4, 3, 4, 3 |
|
2879 |
3.5 |
Machine Learning Algorithms for Data Labeling: An Empirical Evaluation |
3, 4, 4, 3 |
|
2880 |
3.5 |
Semi-Supervised Learning via Clustering Representation Space |
4, 4, 2, 4 |
|
2881 |
3.5 |
EM-RBR: a reinforced framework for knowledge graph completion from reasoning perspective |
3, 4, 4, 3 |
|
2882 |
3.5 |
Unsupervised Anomaly Detection by Robust Collaborative Autoencoders |
4, 4, 3, 3 |
|
2883 |
3.5 |
Adaptive Spatial-Temporal Inception Graph Convolutional Networks for Multi-step Spatial-Temporal Network Data Forecasting |
5, 3, 3, 3 |
|
2884 |
3.5 |
Probabilistic Multimodal Representation Learning |
4, 4, 3, 3 |
|
2885 |
3.5 |
Syntactic Relevance XLNet Word Embedding Generation in Low-Resource Machine Translation |
3, 3, 5, 3 |
|
2886 |
3.5 |
On the Importance of Distraction-Robust Representations for Robot Learning |
3, 3, 4, 4 |
|
2887 |
3.5 |
Solving Non-Stationary Bandit Problems with an RNN and an Energy Minimization Loss |
5, 3, 4, 2 |
|
2888 |
3.5 |
Learning to communicate through imagination with model-based deep multi-agent reinforcement learning |
3, 4, 4, 3 |
|
2889 |
3.5 |
A Simple Approach To Define Curricula For Training Neural Networks |
3, 4, 3, 4 |
|
2890 |
3.5 |
Bigeminal Priors Variational Auto-encoder |
3, 4, 3, 4 |
|
2891 |
3.5 |
MVP-BERT: Redesigning Vocabularies for Chinese BERT and Multi-Vocab Pretraining |
4, 5, 2, 3 |
|
2892 |
3.5 |
Translation Memory Guided Neural Machine Translation |
4, 4, 2, 4 |
|
2893 |
3.5 |
A Robust Fuel Optimization Strategy For Hybrid Electric Vehicles: A Deep Reinforcement Learning Based Continuous Time Design Approach |
2, 4, 5, 3 |
|
2894 |
3.5 |
Analysing Features Learned Using Unsupervised Models on Program Embeddings |
3, 4, 2, 5 |
|
2895 |
3.5 |
Mitigating Deep Double Descent by Concatenating Inputs |
5, 3, 2, 4 |
|
2896 |
3.33 |
Sparse Coding-inspired GAN for Weakly Supervised Hyperspectral Anomaly Detection |
3, 3, 4 |
|
2897 |
3.33 |
Adversarial Attacks on Machine Learning Systems for High-Frequency Trading |
4, 3, 3 |
|
2898 |
3.33 |
EpidemiOptim: A Toolbox for the Optimization of Control Policies in Epidemiological Models |
3, 4, 3 |
|
2899 |
3.33 |
An Automated Domain Understanding Technique for Knowledge Graph Generation |
3, 4, 3 |
|
2900 |
3.33 |
Sensory Resilience based on Synesthesia |
5, 2, 3 |
|
2901 |
3.33 |
DROPS: Deep Retrieval of Physiological Signals via Attribute-specific Clinical Prototypes |
4, 4, 2 |
|
2902 |
3.33 |
A Benchmark for Voice-Face Cross-Modal Matching and Retrieval |
4, 3, 3 |
|
2903 |
3.33 |
Self-Pretraining for Small Datasets by Exploiting Patch Information |
4, 2, 4 |
|
2904 |
3.25 |
Flow Neural Network and Flow-Structured Data Representation |
2, 4, 4, 3 |
|
2905 |
3.25 |
Simple deductive reasoning tests and data sets for exposing limitation of today’s deep neural networks |
3, 4, 3, 3 |
|
2906 |
3.25 |
Matrix Data Deep Decoder - Geometric Learning for Structured Data Completion |
3, 4, 3, 3 |
|
2907 |
3.25 |
Hierarchical Probabilistic Model for Blind Source Separation via Legendre Transformation |
4, 4, 2, 3 |
|
2908 |
3.25 |
Necessary and Sufficient Conditions for Compositional Representations |
3, 3, 4, 3 |
|
2909 |
3.25 |
MULTI-SPAN QUESTION ANSWERING USING SPAN-IMAGE NETWORK |
3, 1, 4, 5 |
|
2910 |
3.25 |
Continual Lifelong Causal Effect Inference with Real World Evidence |
4, 4, 3, 2 |
|
2911 |
3.25 |
Indirect Supervision to Mitigate Perturbations |
3, 4, 4, 2 |
|
2912 |
3.25 |
Explainable Reinforcement Learning Through Goal-Based Explanations |
3, 4, 3, 3 |
|
2913 |
3.25 |
Hierarchical Meta Reinforcement Learning for Multi-Task Environments |
3, 4, 3, 3 |
|
2914 |
3.25 |
Recycling sub-optimial Hyperparameter Optimization models to generate efficient Ensemble Deep Learning |
3, 4, 3, 3 |
|
2915 |
3.25 |
Dual Adversarial Training for Unsupervised Domain Adaptation |
5, 3, 2, 3 |
|
2916 |
3.25 |
A Simple and General Strategy for Referential Problem in Low-Resource Neural Machine Translation |
4, 3, 4, 2 |
|
2917 |
3.25 |
USING OBJECT-FOCUSED IMAGES AS AN IMAGE AUGMENTATION TECHNIQUE TO IMPROVE THE ACCURACY OF IMAGE-CLASSIFICATION MODELS WHEN VERY LIMITED DATA SETS ARE AVAILABLE |
3, 5, 2, 3 |
|
2918 |
3.25 |
Success-Rate Targeted Reinforcement Learning by Disorientation Penalty |
4, 4, 3, 2 |
|
2919 |
3.25 |
Switching-Aligned-Words Data Augmentation for Neural Machine Translation |
2, 3, 4, 4 |
|
2920 |
3.25 |
Certified Distributional Robustness via Smoothed Classifiers |
6, 3, 2, 2 |
|
2921 |
3.25 |
MSFM: Multi-Scale Fusion Module for Object Detection |
3, 3, 4, 3 |
|
2922 |
3.25 |
Dual Graph Complementary Network |
4, 2, 4, 3 |
|
2923 |
3.25 |
Gradient Descent Resists Compositionality |
5, 1, 4, 3 |
|
2924 |
3.2 |
VideoFlow: A Framework for Building Visual Analysis Pipelines |
3, 3, 4, 3, 3 |
|
2925 |
3.2 |
QRGAN: Quantile Regression Generative Adversarial Networks |
2, 3, 5, 4, 2 |
|
2926 |
3.2 |
Interpretable Meta-Reinforcement Learning with Actor-Critic Method |
3, 2, 4, 3, 4 |
|
2927 |
3 |
Image Modeling with Deep Convolutional Gaussian Mixture Models |
3, 4, 3, 2 |
|
2928 |
3 |
ZCal: Machine learning methods for calibrating radio interferometric data |
3, 2, 4 |
|
2929 |
3 |
Meta Auxiliary Labels with Constituent-based Transformer for Aspect-based Sentiment Analysis |
2, 3, 4 |
|
2930 |
3 |
Proper Measure for Adversarial Robustness |
3, 3, 3, 3 |
|
2931 |
3 |
Computing Preimages of Deep Neural Networks with Applications to Safety |
3, 4, 3, 2 |
|
2932 |
3 |
Anti-Distillation: Improving Reproducibility of Deep Networks |
3, 3, 3, 3 |
|
2933 |
3 |
Accurate and fast detection of copy number variations from short-read whole-genome sequencing with deep convolutional neural network |
5, 2, 2, 3 |
|
2934 |
3 |
DQSGD: DYNAMIC QUANTIZED STOCHASTIC GRADIENT DESCENT FOR COMMUNICATION-EFFICIENT DISTRIBUTED LEARNING |
2, 4, 4, 2 |
|
2935 |
3 |
Monotonic neural network: combining deep learning with domain knowledge for chiller plants energy optimization |
4, 3, 2, 3 |
|
2936 |
3 |
Gradient flow encoding with distance optimization adaptive step size |
4, 3, 2, 3 |
|
2937 |
3 |
Generative modeling with one recursive network |
2, 2, 4, 4 |
|
2938 |
3 |
GenQu: A Hybrid Framework for Learning Classical Data in Quantum States |
4, 2, 3, 3 |
|
2939 |
3 |
Neural Pooling for Graph Neural Networks |
3, 4, 2, 3 |
|
2940 |
3 |
Reinforcement Learning Based Asymmetrical DNN Modularization for Optimal Loading |
3, 2, 4, 3 |
|
2941 |
3 |
A Theory of Self-Supervised Framework for Few-Shot Learning |
3, 4, 2, 2, 4 |
|
2942 |
3 |
Robust Multi-view Representation Learning |
3, 3, 3, 3 |
|
2943 |
3 |
WordsWorth Scores for Attacking CNNs and LSTMs for Text Classification |
2, 3, 4 |
|
2944 |
3 |
Implicit Regularization Effects of Unbiased Random Label Noises with SGD |
2, 4, 3, 3 |
|
2945 |
3 |
Deep Learning Proteins using a Triplet-BERT network |
3, 3, 3, 3 |
|
2946 |
3 |
Transferability of Compositionality |
2, 3, 4, 3 |
|
2947 |
3 |
Structure Controllable Text Generation |
5, 2, 2, 3 |
|
2948 |
3 |
FSV: Learning to Factorize Soft Value Function for Cooperative Multi-Agent Reinforcement Learning |
3, 2, 4, 2, 4 |
|
2949 |
3 |
BBRefinement: an universal scheme to improve precision of box object detectors |
4, 2, 4, 2 |
|
2950 |
3 |
Identifying the Sources of Uncertainty in Object Classification |
3, 3, 3 |
|
2951 |
2.8 |
A 3D Convolutional Neural Network for Predicting Wildfire Profiles |
3, 3, 3, 3, 2 |
|
2952 |
2.8 |
Stochastic Inverse Reinforcement Learning |
3, 3, 4, 2, 2 |
|
2953 |
2.75 |
A Stochastic Gradient Langevin Dynamics Algorithm For Noise Intrinsic Federated Learning |
3, 3, 3, 2 |
|
2954 |
2.67 |
Using Deep Reinforcement Learning to Train and Evaluate Instructional Sequencing Policies for an Intelligent Tutoring System |
2, 4, 2 |
|
2955 |
2.6 |
Reducing the number of neurons of Deep ReLU Networks based on the current theory of Regularization |
2, 3, 4, 2, 2 |
|
2956 |
2.5 |
Guiding Representation Learning in Deep Generative Models with Policy Gradients |
1, 4, 3, 2 |
|
2957 |
2.5 |
FLAGNet: Feature Label based Automatic Generation Network for symbolic music |
3, 2, 3, 2 |
|
2958 |
2.5 |
What to Prune and What Not to Prune at Initialization |
2, 1, 4, 3 |
|
2959 |
2.5 |
A Numbers Game: Numeric Encoding Options with Automunge |
2, 3, 3, 2 |
|
2960 |
2.5 |
Multi-Task Multicriteria Hyperparameter Optimization |
2, 3, 2, 3 |
|
2961 |
2.33 |
SEMANTIC APPROACH TO AGENT ROUTING USING A HYBRID ATTRIBUTE-BASED RECOMMENDER SYSTEM |
3, 2, 2 |
|
2962 |
2.25 |
Graph Embedding via Topology and Functional Analysis |
2, 3, 2, 2 |
|
2963 |
2.25 |
KETG: A Knowledge Enhanced Text Generation Framework |
2, 2, 2, 3 |
|
2964 |
2.25 |
Consensus Driven Learning |
1, 3, 2, 3 |
|
2965 |
2 |
Towards Counteracting Adversarial Perturbations to Resist Adversarial Examples |
1, 2, 2, 3 |
|
2966 |
2 |
A generalized probability kernel on discrete distributions and its application in two-sample test |
1, 2, 3, 2 |
|