Downloads 2024
Number of events: 4619
- $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$
- $\boldsymbol{\mu}\mathbf{P^2}$: Effective Sharpness Aware Minimization Requires Layerwise Perturbation Scaling
- $C^2M^3$: Cycle-Consistent Multi-Model Merging
- $E^3$: Exploring Embodied Emotion Through A Large-Scale Egocentric Video Dataset
- $\epsilon$-Softmax: Approximating One-Hot Vectors for Mitigating Label Noise
- $\nabla^2$DFT: A Universal Quantum Chemistry Dataset of Drug-Like Molecules and a Benchmark for Neural Network Potentials
- $SE(3)$ Equivariant Ray Embeddings for Implicit Multi-View Depth Estimation
- $\text{Di}^2\text{Pose}$: Discrete Diffusion Model for Occluded 3D Human Pose Estimation
- $\text{ID}^3$: Identity-Preserving-yet-Diversified Diffusion Models for Synthetic Face Recognition
- $\textit{Bifröst}$: 3D-Aware Image Compositing with Language Instructions
- $\textit{NeuroPath}$: A Neural Pathway Transformer for Joining the Dots of Human Connectomes
- $\textit{Read-ME}$: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design
- $\textit{Trans-LoRA}$: towards data-free Transferable Parameter Efficient Finetuning
- $\texttt{ConflictBank}$: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLMs
- $\texttt{dattri}$: A Library for Efficient Data Attribution
- $\texttt{dopanim}$: A Dataset of Doppelganger Animals with Noisy Annotations from Multiple Humans
- $\texttt{Model-GLUE}$: Democratized LLM Scaling for A Large Model Zoo in the Wild
- $\texttt{pfl-research}$: simulation framework for accelerating research in Private Federated Learning
- 2D-OOB: Attributing Data Contribution Through Joint Valuation Framework
- 2DQuant: Low-bit Post-Training Quantization for Image Super-Resolution
- 2nd Workshop on Touch Processing: From Data to Knowledge
- 3DCoMPaT200: Language Grounded Large-Scale 3D Vision Dataset for Compositional Recognition
- 3D Equivariant Pose Regression via Direct Wigner-D Harmonics Prediction
- 3DET-Mamba: Causal Sequence Modelling for End-to-End 3D Object Detection
- 3D Focusing-and-Matching Network for Multi-Instance Point Cloud Registration
- 3D Gaussian Rendering Can Be Sparser: Efficient Rendering via Learned Fragment Pruning
- 3D Gaussian Splatting as Markov Chain Monte Carlo
- 3DGS-Enhancer: Enhancing Unbounded 3D Gaussian Splatting with View-consistent 2D Diffusion Priors
- 3D Structure Prediction of Atomic Systems with Flow-based Direct Preference Optimization
- 3-in-1: 2D Rotary Adaptation for Efficient Finetuning, Efficient Batching and Composability
- 3rd Workshop on New Frontiers in Adversarial Machine Learning (AdvML-Frontiers)
- 4+3 Phases of Compute-Optimal Neural Scaling Laws
- 4-bit Shampoo for Memory-Efficient Network Training
- 4DBInfer: A 4D Benchmarking Toolbox for Graph-Centric Predictive Modeling on RDBs
- 4D Gaussian Splatting in the Wild with Uncertainty-Aware Regularization
- 4Diffusion: Multi-view Video Diffusion Model for 4D Generation
- 4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities
- 4Real: Towards Photorealistic 4D Scene Generation via Video Diffusion Models
- 5th Workshop on Self-Supervised Learning: Theory and Practice
- A2PO: Towards Effective Offline Reinforcement Learning from an Advantage-aware Perspective
- A Bayesian Approach for Personalized Federated Learning in Heterogeneous Settings
- A Bayesian Approach to Data Point Selection
- ABCFair: an Adaptable Benchmark approach for Comparing Fairness Methods
- Abductive Reasoning in Logical Credal Networks
- A Benchmark Dataset for Event-Guided Human Pose Estimation and Tracking in Extreme Conditions
- A benchmark for prediction of transcriptomic responses to chemical perturbations across cell types
- A Benchmark Suite for Evaluating Neural Mutual Information Estimators on Unstructured Datasets
- A Benchmark Suite for Systematically Evaluating Reasoning Shortcuts
- A Best-of-both-worlds Algorithm for Bandits with Delayed Feedback with Robustness to Excessive Delays
- A Boosting-Type Convergence Result for AdaBoost.MH with Factorized Multi-Class Classifiers
- Abrupt Learning in Transformers: A Case Study on Matrix Completion
- Absorb & Escape: Overcoming Single Model Limitations in Generating Heterogeneous Genomic Sequences
- Abstracted Shapes as Tokens - A Generalizable and Interpretable Model for Time-series Classification
- Abstract Reward Processes: Leveraging State Abstraction for Consistent Off-Policy Evaluation
- A Canonicalization Perspective on Invariant and Equivariant Learning
- A Careful Examination of Large Language Model Performance on Grade School Arithmetic
- A Cat Is A Cat (Not A Dog!): Unraveling Information Mix-ups in Text-to-Image Encoders through Causal Analysis and Embedding Optimization
- Accelerated Regularized Learning in Finite N-Person Games
- Accelerating Augmentation Invariance Pretraining
- Accelerating Blockwise Parallel Language Models with Draft Refinement
- Accelerating Diffusion Models with Parallel Sampling: Inference at Sub-Linear Time Complexity
- Accelerating ERM for data-driven algorithm design using output-sensitive techniques
- Accelerating Greedy Coordinate Gradient and General Prompt Optimization via Probe Sampling
- Accelerating Matroid Optimization through Fast Imprecise Oracles
- Accelerating Nash Equilibrium Convergence in Monte Carlo Settings Through Counterfactual Value Based Fictitious Play
- Accelerating Non-Maximum Suppression: A Graph Theory Perspective
- Accelerating Pre-training of Multimodal LLMs via Chain-of-Sight
- Accelerating Relative Entropy Coding with Space Partitioning
- Accelerating Transformers with Spectrum-Preserving Token Merging
- Acceleration Exists! Optimization Problems When Oracle Can Only Compare Objective Function Values
- Accuracy is Not All You Need
- Accurate and Steady Inertial Pose Estimation through Sequence Structure Learning and Modulation
- ACES: Generating a Diversity of Challenging Programming Puzzles with Autotelic Generative Models
- ACFun: Abstract-Concrete Fusion Facial Stylization
- Achievable distributional robustness when the robust risk is only partially identified
- Achievable Fairness on Your Data With Utility Guarantees
- Achieving $\tilde{O}(1/\epsilon)$ Sample Complexity for Constrained Markov Decision Process
- Achieving Constant Regret in Linear Markov Decision Processes
- Achieving Domain-Independent Certified Robustness via Knowledge Continuity
- Achieving Linear Convergence with Parameter-Free Algorithms in Decentralized Optimization
- Achieving Near-Optimal Convergence for Distributed Minimax Optimization with Adaptive Stepsizes
- Achieving Optimal Clustering in Gaussian Mixture Models with Anisotropic Covariance Structures
- Achieving Tractable Minimax Optimal Regret in Average Reward MDPs
- A Closer Look at AUROC and AUPRC under Class Imbalance
- A Closer Look at the CLS Token for Cross-Domain Few-Shot Learning
- A Combinatorial Algorithm for the Semi-Discrete Optimal Transport Problem
- A Compositional Atlas for Algebraic Circuits
- A Comprehensive Analysis on the Learning Curve in Kernel Ridge Regression
- A Concept-Based Explainability Framework for Large Multimodal Models
- A Consistency-Aware Spot-Guided Transformer for Versatile and Hierarchical Point Cloud Registration
- A Continuous-time Stochastic Gradient Descent Method for Continuous Data
- Acoustic Volume Rendering for Neural Impulse Response Fields
- A Critical Evaluation of AI Feedback for Aligning Large Language Models
- A Cross-Domain Benchmark for Active Learning
- ActAnywhere: Subject-Aware Video Background Generation
- ActFusion: a Unified Diffusion Model for Action Segmentation and Anticipation
- ActionAtlas: A VideoQA Benchmark for Fine-grained Action Recognition
- Action Gaps and Advantages in Continuous-Time Distributional Reinforcement Learning
- Action Imitation in Common Action Space for Customized Action Image Synthesis
- Activating Self-Attention for Multi-Scene Absolute Pose Regression
- Activation Map Compression through Tensor Decomposition for Deep Learning
- Active, anytime-valid risk controlling prediction sets
- Active Classification with Few Queries under Misspecification
- Active design of two-photon holographic stimulation for identifying neural population dynamics
- Active Learning for Derivative-Based Global Sensitivity Analysis with Gaussian Processes
- Active Learning of General Halfspaces: Label Queries vs Membership Queries
- Active Perception for Grasp Detection via Neural Graspness Field
- Active preference learning for ordering items in- and out-of-sample
- Active Sequential Posterior Estimation for Sample-Efficient Simulation-Based Inference
- Active Set Ordering
- ActSort: An active-learning accelerated cell sorting algorithm for large-scale calcium imaging datasets
- AdaFlow: Imitation Learning with Variance-Adaptive Flow-Based Policies
- Adam on Local Time: Addressing Nonstationarity in RL with Relative Adam Timesteps
- Ada-MSHyper: Adaptive Multi-Scale Hypergraph Transformer for Time Series Forecasting
- Adam with model exponential moving average is effective for nonconvex optimization
- AdanCA: Neural Cellular Automata As Adaptors For More Robust Vision Transformer
- AdaNeg: Adaptive Negative Proxy Guided OOD Detection with Vision-Language Models
- AdaNovo: Towards Robust $\textit{De Novo}$ Peptide Sequencing in Proteomics against Data Biases
- AdaPKC: PeakConv with Adaptive Peak Receptive Field for Radar Semantic Segmentation
- Adaptable Logical Control for Large Language Models
- Adapting Diffusion Models for Improved Prompt Compliance and Controllable Image Synthesis
- Adapting to Unknown Low-Dimensional Structures in Score-Based Diffusion Models
- Adaptive $Q$-Aid for Conditional Supervised Learning in Offline Reinforcement Learning
- Adaptive and Optimal Second-order Optimistic Methods for Minimax Optimization
- Adaptive Depth Networks with Skippable Sub-Paths
- Adaptive Domain Learning for Cross-domain Image Denoising
- Adaptive Experimentation When You Can't Experiment
- Adaptive Exploration for Data-Efficient General Value Function Evaluations
- Adaptive Foundation Models: Evolving AI for Personalized and Efficient Learning
- Adaptive Image Quality Assessment via Teaching Large Multimodal Model to Compare
- Adaptive Important Region Selection with Reinforced Hierarchical Search for Dense Object Detection
- AdaptiveISP: Learning an Adaptive Image Signal Processor for Object Detection
- Adaptive Labeling for Efficient Out-of-distribution Model Evaluation
- Adaptive Layer Sparsity for Large Language Models via Activation Correlation Assessment
- Adaptive Passive-Aggressive Framework for Online Regression with Side Information
- Adaptive Preference Scaling for Reinforcement Learning with Human Feedback
- Adaptive Proximal Gradient Method for Convex Optimization
- Adaptive Randomized Smoothing: Certified Adversarial Robustness for Multi-Step Defences
- Adaptive Sampling for Efficient Softmax Approximation
- Adaptive Variance Reduction for Stochastic Optimization under Weaker Assumptions
- Adaptive Visual Scene Understanding: Incremental Scene Graph Generation
- AdaSociety: An Adaptive Environment with Social Structures for Multi-Agent Decision-Making
- A Data-Centric Perspective on Evaluating Machine Learning Models for Tabular Data
- Ad Auctions for LLMs via Retrieval Augmented Generation
- Addressing Asynchronicity in Clinical Multimodal Fusion via Individualized Chest X-ray Generation
- Addressing Bias in Online Selection with Limited Budget of Comparisons
- Addressing Hidden Confounding with Heterogeneous Observational Datasets for Recommendation
- Addressing Spatial-Temporal Heterogeneity: General Mixed Time Series Analysis via Latent Continuity Recovery and Alignment
- Addressing Spectral Bias of Deep Neural Networks by Multi-Grade Deep Learning
- A Decision-Language Model (DLM) for Dynamic Restless Multi-Armed Bandit Tasks in Public Health
- A distributional simplicity bias in the learning dynamics of transformers
- AdjointDEIS: Efficient Gradients for Diffusion Models
- Adjust Pearson's $r$ to Measure Arbitrary Monotone Dependence
- ADOPT: Modified Adam Can Converge with Any $\beta_2$ with the Optimal Rate
- AdvAD: Exploring Non-Parametric Diffusion for Imperceptible Adversarial Attacks
- Advancing Cross-domain Discriminability in Continual Learning of Vision-Language Models
- Advancing Data Selection for Foundation Models: From Heuristics to Principled Methods
- Advancing Fine-Grained Classification by Structure and Subject Preserving Augmentation
- Advancing Open-Set Domain Generalization Using Evidential Bi-Level Hardest Domain Scheduler
- Advancing Spiking Neural Networks for Sequential Modeling with Central Pattern Generators
- Advancing Tool-Augmented Large Language Models: Integrating Insights from Errors in Inference Trees
- Advancing Training Efficiency of Deep Spiking Neural Networks through Rate-based Backpropagation
- Advancing Video Anomaly Detection: A Concise Review and a New Dataset
- Advection Augmented Convolutional Neural Networks
- Adversarial Environment Design via Regret-Guided Diffusion Models
- Adversarially Robust Decision Transformer
- Adversarially Robust Dense-Sparse Tradeoffs via Heavy-Hitters
- Adversarially Robust Multi-task Representation Learning
- Adversarially Trained Weighted Actor-Critic for Safe Offline Reinforcement Learning
- Adversarial Moment-Matching Distillation of Large Language Models
- Adversarial Representation Engineering: A General Model Editing Framework for Large Language Models
- Adversarial Schrödinger Bridge Matching
- AED: Adaptable Error Detection for Few-shot Imitation Policy
- A Fast Convoluted Story: Scaling Probabilistic Inference for Integer Arithmetics
- AFBench: A Large-scale Benchmark for Airfoil Design
- A-FedPD: Aligning Dual-Drift is All Federated Primal-Dual Learning Needs
- A Flexible, Equivariant Framework for Subgraph GNNs via Graph Products and Graph Coarsening
- A Foundation Model for Zero-shot Logical Query Reasoning
- A Framework for Bilevel Optimization on Riemannian Manifolds
- A Full-duplex Speech Dialogue Scheme Based On Large Language Model
- A Functional Extension of Semi-Structured Networks
- A generalized neural tangent kernel for surrogate gradient learning
- A General Protocol to Probe Large Vision Models for 3D Physical Understanding
- A Generative Model of Symmetry Transformations
- AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents
- AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents
- Agent Planning with World Knowledge Model
- AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases
- A Geometric View of Data Complexity: Efficient Local Intrinsic Dimension Estimation with Diffusion Models
- Aggregate-and-Adapt Natural Language Prompts for Downstream Generalization of CLIP
- Aggregating Quantitative Relative Judgments: From Social Choice to Ranking Prediction
- AGILE: A Novel Reinforcement Learning Framework of LLM Agents
- A Global Depth-Range-Free Multi-View Stereo Transformer Network with Pose Embedding
- A Globally Optimal Portfolio for m-Sparse Sharpe Ratio Maximization
- A Gradient Accumulation Method for Dense Retriever under Memory Constraint
- AHA: Human-Assisted Out-of-Distribution Generalization and Detection
- A hierarchical decomposition for explaining ML performance discrepancies
- A Hitchhikers Guide to Fine-Grained Face Forgery Detection Using Common Sense Reasoning
- A Huber Loss Minimization Approach to Mean Estimation under User-level Differential Privacy
- AI4Mat-2024: NeurIPS 2024 Workshop on AI for Accelerated Materials Design
- AID: Attention Interpolation of Text-to-Image Diffusion
- AI for New Drug Modalities
- AIM-FM: Advancements In Medical Foundation Models: Explainability, Robustness, Security, and Beyond
- AirSketch: Generative Motion to Sketch
- A Kernel Perspective on Distillation-based Collaborative Learning
- A Label is Worth A Thousand Images in Dataset Distillation
- A Large-Scale Human-Centric Benchmark for Referring Expression Comprehension in the LMM Era
- A Layer-Wise Natural Gradient Optimizer for Training Deep Neural Networks
- AlchemistCoder: Harmonizing and Eliciting Code Capability by Hindsight Tuning on Multi-source Data
- Algebraic Positional Encodings
- Algorithmic Capabilities of Random Transformers
- Algorithmic Collective Action in Recommender Systems: Promoting Songs by Reordering Playlists
- Algorithmic Fairness through the lens of Metrics and Evaluation
- Algorithmic progress in language models
- ALI-Agent: Assessing LLMs' Alignment with Human Values via Agent-based Evaluation
- Alias-Free Mamba Neural Operator
- Aligner: Efficient Alignment by Learning to Correct
- Aligner-Encoders: Self-Attention Transformers Can Be Self-Transducers
- Aligning Audio-Visual Joint Representations with an Agentic Workflow
- Aligning Diffusion Behaviors with Q-functions for Efficient Continuous Control
- Aligning Diffusion Models by Optimizing Human Utility
- Aligning Embeddings and Geometric Random Graphs: Informational Results and Computational Approaches for the Procrustes-Wasserstein Problem
- Aligning Individual and Collective Objectives in Multi-Agent Cooperation
- Aligning Large Language Models with Representation Editing: A Control Perspective
- Aligning LLM Agents by Learning Latent Preference from User Edits
- Aligning Model Properties via Conformal Risk Control
- Aligning Target-Aware Molecule Diffusion Models with Exact Energy Optimization
- Aligning to Thousands of Preferences via System Message Generalization
- Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms
- Alignment at Pre-training! Towards Native Alignment for Arabic LLMs
- Alignment for Honesty
- AllClear: A Comprehensive Dataset and Benchmark for Cloud Removal in Satellite Imagery
- Alleviate Anchor-Shift: Explore Blind Spots with Cross-View Reconstruction for Incomplete Multi-View Clustering
- Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models and Time-Dependent Layer Normalization
- Alleviating Hallucinations in Large Vision-Language Models through Hallucination-Induced Optimization
- All-in-One Image Coding for Joint Human-Machine Vision with Multi-Path Aggregation
- Almost Free: Self-concordance in Natural Exponential Families and an Application to Bandits
- Almost-Linear RNNs Yield Highly Interpretable Symbolic Codes in Dynamical Systems Reconstruction
- Almost Minimax Optimal Best Arm Identification in Piecewise Stationary Linear Bandits
- Almost Surely Asymptotically Constant Graph Neural Networks
- A Local Method for Satisfying Interventional Fairness with Partially Known Causal Graphs
- AlphaMath Almost Zero: Process Supervision without Process
- AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models
- AlphaTablets: A Generic Plane Representation for 3D Planar Reconstruction from Monocular Videos
- ALPINE: Unveiling The Planning Capability of Autoregressive Learning in Language Models
- ALPS: Improved Optimization for Highly Sparse One-Shot Pruning for Large Language Models
- AlterMOMA: Fusion Redundancy Pruning for Camera-LiDAR Fusion Models with Alternative Modality Masking
- AMAGO-2: Breaking the Multi-Task Barrier in Meta-Reinforcement Learning with Transformers
- A Match Made in Silicon: The Co-Evolution of Systems and AI
- AMBROSIA: A Benchmark for Parsing Ambiguous Questions into Database Queries
- A Metalearned Neural Circuit for Nonparametric Bayesian Inference
- A Method for Evaluating Hyperparameter Sensitivity in Reinforcement Learning
- Amnesia as a Catalyst for Enhancing Black Box Pixel Attacks in Image Classification and Object Detection
- A Modular Conditional Diffusion Framework for Image Reconstruction
- AmoebaLLM: Constructing Any-Shape Large Language Models for Efficient and Instant Deployment
- AMOR: A Recipe for Building Adaptable Modular Knowledge Agents Through Process Feedback
- Amortized Active Causal Induction with Deep Reinforcement Learning
- Amortized Bayesian Experimental Design for Decision-Making
- Amortized Eigendecomposition for Neural Networks
- Amortized Fourier Neural Operators
- Amortized Planning with Large-Scale Transformers: A Case Study on Chess
- Amortizing intractable inference in diffusion models for vision, language, and control
- A Motion-aware Spatio-temporal Graph for Video Salient Object Ranking
- A multi-UAV dataset for multi-object tracking and re-identification of wild antelopes
- An Accelerated Algorithm for Stochastic Bilevel Optimization under Unbounded Smoothness
- An Accelerated Gradient Method for Convex Smooth Simple Bilevel Optimization
- An Adaptive Approach for Infinitely Many-armed Bandits under Generalized Rotting Constraints
- ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models
- Analysing Multi-Task Regression via Random Matrix Theory with Application to Time Series Forecasting
- Analysing the Generalisation and Reliability of Steering Vectors
- Analysis of Corrected Graph Convolutions
- Analytically deriving Partial Information Decomposition for affine systems of stable and convolution-closed distributions
- Analyzing & Reducing the Need for Learning Rate Warmup in GPT Training
- An Analysis of Elo Rating Systems via Markov Chains
- An Analysis of Robustness of Non-Lipschitz Networks
- An Analysis of Tokenization: Transformers under Markov Data
- An Analytical Study of Utility Functions in Multi-Objective Reinforcement Learning
- An Autoencoder-Like Nonnegative Matrix Co-Factorization for Improved Student Cognitive Modeling
- A Nearly Optimal and Low-Switching Algorithm for Reinforcement Learning with General Function Approximation
- A Near-optimal Algorithm for Learning Margin Halfspaces with Massart Noise
- An effective framework for estimating individualized treatment rules
- An Efficient High-dimensional Gradient Estimator for Stochastic Differential Equations
- An Efficient Memory Module for Graph Few-Shot Class-Incremental Learning
- An Efficient Recipe for Long Context Extension via Middle-Focused Positional Encoding
- An End-To-End Graph Attention Network Hashing for Cross-Modal Retrieval
- An engine not a camera: Measuring performative power of online search
- An Equivalence Between Static and Dynamic Regret Minimization
- A Neural Network Approach for Efficiently Answering Most Probable Explanation Queries in Probabilistic Models
- A New Multi-Source Light Detection Benchmark and Semi-Supervised Focal Light Detection
- A New Neural Kernel Regime: The Inductive Bias of Multi-Task Learning
- An exactly solvable model for emergence and scaling laws in the multitask sparse parity problem
- An Expectation-Maximization Algorithm for Training Clean Diffusion Models from Corrupted Observations
- An eye for an ear: zero-shot audio description leveraging an image captioner with audio-visual token distribution matching
- An Image is Worth 32 Tokens for Reconstruction and Generation
- Animal-Bench: Benchmarking Multimodal Video Models for Animal-centric Video Understanding
- Animate3D: Animating Any 3D Model with Multi-view Video Diffusion
- An Improved Empirical Fisher Approximation for Natural Gradient Descent
- An In-depth Investigation of Sparse Rate Reduction in Transformer-like Models
- An Information Theoretic Perspective on Conformal Prediction
- Annealed Multiple Choice Learning: Overcoming limitations of Winner-takes-all with annealing
- An Offline Adaptation Framework for Constrained Multi-Objective Reinforcement Learning
- A Non-parametric Direct Learning Approach to Heterogeneous Treatment Effect Estimation under Unmeasured Confounding
- A Novel Benchmark for Decision-Making in Uncertain and Competitive Games
- A Novel Unified Architecture for Low-Shot Counting by Detection and Segmentation
- ANT: Adaptive Noise Schedule for Time Series Diffusion Models
- Antigen-Specific Antibody Design via Direct Energy-based Preference Optimization
- Any2Graph: Deep End-To-End Supervised Graph Prediction With An Optimal Transport Loss
- Any2Policy: Learning Visuomotor Policy with Any-Modality
- AnyFit: Controllable Virtual Try-on for Any Combination of Attire Across Any Scenario
- AP-Adapter: Improving Generalization of Automatic Prompts on Unseen Text-to-Image Diffusion Models
- A Pairwise Pseudo-likelihood Approach for Matrix Completion with Informative Missingness
- Apathetic or Empathetic? Evaluating LLMs' Emotional Alignments with Humans
- APDDv2: Aesthetics of Paintings and Drawings Dataset with Artist Labeled Scores and Comments
- APEBench: A Benchmark for Autoregressive Neural Emulators of PDEs
- A Phase Transition between Positional and Semantic Learning in a Solvable Model of Dot-Product Attention
- A PID Controller Approach for Adaptive Probability-dependent Gradient Decay in Model Calibration
- APIGen: Automated PIpeline for Generating Verifiable and Diverse Function-Calling Datasets
- A Polar coordinate system represents syntax in large language models
- Applying Guidance in a Limited Interval Improves Sample and Distribution Quality in Diffusion Models
- Approaching Human-Level Forecasting with Language Models
- Approximated Orthogonal Projection Unit: Stabilizing Regression Network Training Using Natural Gradient
- Approximately Equivariant Neural Processes
- Approximately Pareto-optimal Solutions for Bi-Objective k-Clustering
- Approximating mutual information of high-dimensional variables using learned representations
- Approximating the Top Eigenvector in Random Order Streams
- Approximation-Aware Bayesian Optimization
- Approximation Rate of the Transformer Architecture for Sequence Modeling
- A Practitioner's Guide to Real-World Continual Multimodal Pretraining
- A Primal-Dual-Assisted Penalty Approach to Bilevel Optimization with Coupled Constraints
- A probability contrastive learning framework for 3D molecular representation learning
- A Prompt-Based Knowledge Graph Foundation Model for Universal In-Context Reasoning
- A provable control of sensitivity of neural networks through a direct parameterization of the overall bi-Lipschitzness
- Arboretum: A Large Multimodal Dataset Enabling AI for Biodiversity
- ARC: A Generalist Graph Anomaly Detector with In-Context Learning
- Archaeoscape: Bringing Aerial Laser Scanning Archaeology to the Deep Learning Era
- Architect: Generating Vivid and Interactive 3D Scenes with Hierarchical 2D Inpainting
- Arctique: An artificial histopathological dataset unifying realism and controllability for uncertainty quantification
- A Recipe for Charge Density Prediction
- Are Graph Neural Networks Optimal Approximation Algorithms?
- Are High-Degree Representations Really Unnecessary in Equivariant Graph Neural Networks?
- Are Language Models Actually Useful for Time Series Forecasting?
- Are Large Language Models Good Statisticians?
- Are Large-scale Soft Labels Necessary for Large-scale Dataset Distillation?
- Are More LLM Calls All You Need? Towards the Scaling Properties of Compound AI Systems
- Are Multiple Instance Learning Algorithms Learnable for Instances?
- Are nuclear masks all you need for improved out-of-domain generalisation? A closer look at cancer classification in histopathology
- Are Self-Attentions Effective for Time Series Forecasting?
- A Retrospective on the Robot Air Hockey Challenge: Benchmarking Robust, Reliable, and Safe Learning Techniques for Real-world Robotics
- Are Uncertainty Quantification Capabilities of Evidential Deep Learning a Mirage?
- Are We on the Right Way for Evaluating Large Vision-Language Models?
- Are Your Models Still Fair? Fairness Attacks on Graph Neural Networks via Node Injections
- ArkVale: Efficient Generative LLM Inference with Recallable Key-Value Eviction
- A robust inlier identification algorithm for point cloud registration via $\mathbf{\ell_0}$-minimization
- AROMA: Preserving Spatial Structure for Latent PDE Modeling with Local Neural Fields
- AR-Pro: Counterfactual Explanations for Anomaly Repair with Formal Properties
- ART: Automatic Red-teaming for Text-to-Image Models to Protect Benign Users
- Artemis: Towards Referential Understanding in Complex Videos
- Articulate your NeRF: Unsupervised articulated object modeling via conditional view synthesis
- Artificial Generational Intelligence: Cultural Accumulation in Reinforcement Learning
- A SARS-CoV-2 Interaction Dataset and VHH Sequence Corpus for Antibody Language Models
- A scalable generative model for dynamical system reconstruction from neuroimaging data
- AsCAN: Asymmetric Convolution-Attention Networks for Efficient Recognition and Generation
- A Separation in Heavy-Tailed Sampling: Gaussian vs. Stable Oracles for Proximal Samplers
- AsEP: Benchmarking Deep Learning Methods for Antibody-specific Epitope Prediction
- A Siamese Transformer with Hierarchical Refinement for Lane Detection
- A Simple and Adaptive Learning Rate for FTRL in Online Learning with Minimax Regret of $\Theta(T^{2/3})$ and its Application to Best-of-Both-Worlds
- A Simple and Optimal Approach for Universal Online Learning with Gradient Variations
- A Simple Framework for Generalization in Visual RL under Dynamic Scene Perturbations
- A Simple Image Segmentation Framework via In-Context Examples
- A Simple Remedy for Dataset Bias via Self-Influence: A Mislabeled Sample Perspective
- A Simple yet Scalable Granger Causal Structural Learning Approach for Topological Event Sequences
- A Simple yet Universal Framework for Depth Completion
- A Simulation Benchmark for Autonomous Racing with Large-Scale Human Data
- A Single-Step, Sharpness-Aware Minimization is All You Need to Achieve Efficient and Accurate Sparse Training
- Ask, Attend, Attack: An Effective Decision-Based Black-Box Targeted Attack for Image-to-Text Models
- A Sober Look at the Robustness of CLIPs to Spurious Features
- Assemblage: Automatic Binary Dataset Construction for Machine Learning
- Assembly Fuzzy Representation on Hypergraph for Open-Set 3D Object Retrieval
- Association of Objects May Engender Stereotypes: Mitigating Association-Engendered Stereotypes in Text-to-Image Generation
- Association Pattern-aware Fusion for Biological Entity Relationship Prediction
- Assouad, Fano, and Le Cam with Interaction: A Unifying Lower Bound Framework and Characterization for Bandit Learnability
- A StrongREJECT for Empty Jailbreaks
- A Structure-Aware Framework for Learning Device Placements on Computation Graphs
- A Study of Plasticity Loss in On-Policy Deep Reinforcement Learning
- A Surprisingly Simple Approach to Generalized Few-Shot Semantic Segmentation
- A survey and benchmark of high-dimensional Bayesian optimization of discrete sequences
- A Swiss Army Knife for Heterogeneous Federated Learning: Flexible Coupling via Trace Norm
- Asymptotics of Alpha-Divergence Variational Inference Algorithms with Exponential Families
- AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising
- Asynchronous Perception Machine for Efficient Test Time Training
- A Synthetic Dataset for Personal Attribute Inference
- A Taxonomy of Challenges to Curating Fair Datasets
- A teacher-teacher framework for clinical language representation learning
- A Textbook Remedy for Domain Shifts: Knowledge Priors for Medical Image Analysis
- A theoretical case-study of Scalable Oversight in Hierarchical Reinforcement Learning
- A theoretical design of concept sets: improving the predictability of concept bottleneck models
- A Theoretical Perspective for Speculative Decoding Algorithm
- A Theoretical Understanding of Self-Correction through In-context Alignment
- A Theory of Optimistically Universal Online Learnability for General Concept Classes
- Atlas3D: Physically Constrained Self-Supporting Text-to-3D for Simulation and Fabrication
- A Topology-aware Graph Coarsening Framework for Continual Graph Learning
- A Tractable Inference Perspective of Offline RL
- Attack-Aware Noise Calibration for Differential Privacy
- Attack-Resilient Image Watermarking Using Stable Diffusion
- Attention boosted Individualized Regression
- Attention Temperature Matters in ViT-Based Cross-Domain Few-Shot Learning
- AttnDreamBooth: Towards Text-Aligned Personalized Text-to-Image Generation
- Attractor Memory for Long-Term Time Series Forecasting: A Chaos Perspective
- Attributing Model Behavior at Scale (ATTRIB)
- A two-scale Complexity Measure for Deep Learning Models
- AUC Maximization under Positive Distribution Shift
- AUCSeg: AUC-oriented Pixel-level Long-tail Semantic Segmentation
- Audio Imagination: NeurIPS 2024 Workshop AI-Driven Speech, Music, and Sound Generation
- AudioMarkBench: Benchmarking Robustness of Audio Watermarking
- Auditing Local Explanations is Hard
- Auditing Privacy Mechanisms via Label Inference Attacks
- A Unified Confidence Sequence for Generalized Linear Models, with Applications to Bandits
- A Unified Debiasing Approach for Vision-Language Models across Modalities and Tasks
- A Unified Framework for 3D Scene Understanding
- A Unified Principle of Pessimism for Offline Reinforcement Learning under Model Mismatch
- A Unified Recipe for Deriving (Time-Uniform) PAC-Bayes Bounds
- A Unifying Normative Framework of Decision Confidence
- A Unifying Post-Processing Framework for Multi-Objective Learn-to-Defer Problems
- A Universal Growth Rate for Learning with Smooth Surrogate Losses
- Autobidder's Dilemma: Why More Sophisticated Autobidders Lead to Worse Auction Efficiency
- Autoformalize Mathematical Statements by Symbolic Equivalence and Semantic Consistency
- AutoGuide: Automated Generation and Selection of Context-Aware Guidelines for Large Language Model Agents
- AutoManual: Generating Instruction Manuals by LLM Agents via Interactive Environmental Learning
- Automated Efficient Estimation using Monte Carlo Efficient Influence Functions
- Automated Label Unification for Multi-Dataset Semantic Segmentation with GNNs
- Automated Multi-level Preference for MLLMs
- Automated Multi-Task Learning for Joint Disease Prediction on Electronic Health Records
- Automatically Learning Hybrid Digital Twins of Dynamical Systems
- Automatic Outlier Rectification via Optimal Transport
- Automating Data Annotation under Strategic Human Agents: Risks and Potential Solutions
- Automating Dataset Updates Towards Reliable and Timely Evaluation of Large Language Models
- AutoMix: Automatically Mixing Language Models
- Autonomous Agents for Collaborative Task under Information Asymmetry
- Autonomous Driving with Spiking Neural Networks
- AutoPSV: Automated Process-Supervised Verifier
- Autoregressive Image Diffusion: Generation of Image Sequence and Application in MRI
- Autoregressive Image Generation without Vector Quantization
- Autoregressive Policy Optimization for Constrained Allocation Tasks
- AutoSurvey: Large Language Models Can Automatically Write Surveys
- AutoTimes: Autoregressive Time Series Forecasters via Large Language Models
- AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning
- AV-Cloud: Spatial Audio Rendering Through Audio-Visual Cloud Splatting
- Average gradient outer product as a mechanism for deep neural collapse
- AverNet: All-in-one Video Restoration for Time-varying Unknown Degradations
- A Versatile Diffusion Transformer with Mixture of Noise Levels for Audiovisual Generation
- A versatile informative diffusion model for single-cell ATAC-seq data generation and analysis
- AV-GS: Learning Material and Geometry Aware Priors for Novel View Acoustic Synthesis
- Avoiding Undesired Future with Minimal Cost in Non-Stationary Environments
- A Walsh Hadamard Derived Linear Vector Symbolic Architecture
- AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation
- Axioms for AI Alignment from Human Feedback
- B$\oplus$LD: Boolean Logic Deep Learning
- BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack
- BackdoorAlign: Mitigating Fine-tuning based Jailbreak Attack with Backdoor Enhanced Safety Alignment
- BackTime: Backdoor Attacks on Multivariate Time Series Forecasting
- Back to the Continuous Attractor
- BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models
- Bag of Tricks: Benchmarking of Jailbreak Attacks on LLMs
- BAKU: An Efficient Transformer for Multi-Task Policy Learning
- Balancing Context Length and Mixing Times for Reinforcement Learning at Scale
- BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts
- Banded Square Root Matrix Factorization for Differentially Private Model Training
- BAN: Detecting Backdoors Activated by Neuron Noise
- Bandit-Feedback Online Multiclass Classification: Variants and Tradeoffs
- Bandits with Abstention under Expert Advice
- Bandits with Preference Feedback: A Stackelberg Game Perspective
- Bandits with Ranking Feedback
- Barely Random Algorithms and Collective Metrical Task Systems
- B-ary Tree Push-Pull Method is Provably Efficient for Distributed Learning on Heterogeneous Data
- Base of RoPE Bounds Context Length
- Batched Energy-Entropy acquisition for Bayesian Optimization
- Bayesian Adaptive Calibration and Optimal Design
- Bayesian Decision-making and Uncertainty: from probabilistic and spatiotemporal modeling to sequential experiment design
- Bayesian Domain Adaptation with Gaussian Mixture Domain-Indexing
- Bayesian-guided Label Mapping for Visual Reprogramming
- Bayesian Identification of the Hamiltonian Inductive Bias in Dynamical Systems
- Bayesian Nonparametrics Meets Data-Driven Distributionally Robust Optimization
- Bayesian Online Natural Gradient (BONG)
- Bayesian Optimisation with Unknown Hyperparameters: Regret Bounds Logarithmically Closer to Optimal
- Bayesian Optimization of Functions over Node Subsets in Graphs
- Bayesian Strategic Classification
- Bayes-optimal learning of an extensive-width neural network from quadratically many samples
- B-cosification: Transforming Deep Neural Networks to be Inherently Interpretable
- BEACON: Benchmark for Comprehensive RNA Tasks and Language Models
- BeanCounter: A low-toxicity, large-scale, and open dataset of business-oriented text
- Beating Adversarial Low-Rank MDPs with Unknown Transition and Bandit Feedback
- BECAUSE: Bilinear Causal Representation for Generalizable Offline Model-based Reinforcement Learning
- Be Confident in What You Know: Bayesian Parameter Efficient Fine-Tuning of Vision Foundation Models
- BehaviorGPT: Smart Agent Simulation for Autonomous Driving with Next-Patch Prediction
- Belief-State Query Policies for User-Aligned Planning under Partial Observability
- Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs
- BELM: Bidirectional Explicit Linear Multi-step Sampler for Exact Inversion in Diffusion Models
- Bench2Drive: Towards Multi-Ability Benchmarking of Closed-Loop End-To-End Autonomous Driving
- Benchmarking Complex Instruction-Following with Multiple Constraints Composition
- Benchmarking Counterfactual Image Generation
- Benchmarking Estimators for Natural Experiments: A Novel Dataset and a Doubly Robust Algorithm
- Benchmarking Generative Models on Computational Thinking Tests in Elementary Visual Programming
- Benchmarking LLMs via Uncertainty Quantification
- Benchmarking Out-of-Distribution Generalization and Adaptation for Electromyography
- Benchmarking Out-of-Distribution Generalization Capabilities of DNN-based Encoding Models for the Ventral Visual Cortex
- Benchmarking Structural Inference Methods for Interacting Dynamical Systems with Synthetic Data
- Benchmarking the Attribution Quality of Vision Models
- Benchmarking Trustworthiness of Multimodal Large Language Models: A Comprehensive Study
- Benchmarking Uncertainty Disentanglement: Specialized Uncertainties for Specialized Tasks
- Benchmark Repositories for Better Benchmarking
- BenchMARL: Benchmarking Multi-Agent Reinforcement Learning
- BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays
- BendVLM: Test-Time Debiasing of Vision-Language Embeddings
- Benign overfitting in leaky ReLU networks with moderate input dimension
- BertaQA: How Much Do Language Models Know About Local Culture?
- BERTs are Generative In-Context Learners
- BetterBench: Assessing AI Benchmarks, Uncovering Issues, and Establishing Best Practices
- Better by default: Strong pre-tuned MLPs and boosted trees on tabular data
- BetterDepth: Plug-and-Play Diffusion Refiner for Zero-Shot Monocular Depth Estimation
- Beware of Road Markings: A New Adversarial Patch Attack to Monocular Depth Estimation
- Beyond Accuracy: Ensuring Correct Predictions With Correct Rationales
- Beyond Accuracy: Tracking more like Human via Visual Search
- Beyond Aesthetics: Cultural Competence in Text-to-Image Models
- Beyond Concept Bottleneck Models: How to Make Black Boxes Intervenable?
- Beyond Decoding: Meta-Generation Algorithms for Large Language Models
- Beyond Efficiency: Molecular Data Pruning for Enhanced Generalization
- Beyond Euclidean: Dual-Space Representation Learning for Weakly Supervised Video Violence Detection
- Beyond Optimism: Exploration With Partially Observable Rewards
- Beyond Primal-Dual Methods in Bandits with Stochastic and Adversarial Constraints
- Beyond Prompts: Dynamic Conversational Benchmarking of Large Language Models
- Beyond Redundancy: Information-aware Unsupervised Multiplex Graph Structure Learning
- Beyond Single Stationary Policies: Meta-Task Players as Naturally Superior Collaborators
- Beyond Slow Signs in High-fidelity Model Extraction
- Beyond task diversity: provable representation transfer for sequential multitask linear bandits
- Beyond the Doors of Perception: Vision Transformers Represent Relations Between Objects
- Bias Amplification in Language Model Evolution: An Iterated Learning Perspective
- Bias Detection via Signaling
- Bias in Motion: Theoretical Insights into the Dynamics of Bias in SGD Training
- Bidirectional Recurrence for Cardiac Motion Tracking with Gaussian Process Latent Coding
- BiDM: Pushing the Limit of Quantization for Diffusion Models
- Bigger, Regularized, Optimistic: scaling for compute and sample efficient continuous control
- Bileve: Securing Text Provenance in Large Language Models Against Spoofing with Bi-level Signature
- Binarized Diffusion Model for Image Super-Resolution
- Binary Search with Distributional Predictions
- Binding in hippocampal-entorhinal circuits enables compositionality in cognitive maps
- Binocular-Guided 3D Gaussian Splatting with View Consistency for Sparse View Synthesis
- Biologically-Inspired Learning Model for Instructed Vision
- Biomedical Visual Instruction Tuning with Clinician Preference Alignment
- BIOSCAN-5M: A Multimodal Dataset for Insect Biodiversity
- BiScope: AI-generated Text Detection by Checking Memorization of Preceding Tokens
- Bisimulation Metrics are Optimal Transport Distances, and Can be Computed Efficiently
- bit2bit: 1-bit quanta video reconstruction via self-supervised photon prediction
- BitDelta: Your Fine-Tune May Only Be Worth One Bit
- BitsFusion: 1.99 bits Weight Quantization of Diffusion Model
- BiVLC: Extending Vision-Language Compositionality Evaluation with Text-to-Image Retrieval
- Black-Box Forgetting
- BLAST: Block-Level Adaptive Structured Matrices for Efficient Deep Neural Network Inference
- BLEnD: A Benchmark for LLMs on Everyday Knowledge in Diverse Cultures and Languages
- Blind Image Restoration via Fast Diffusion Inversion
- BLoB: Bayesian Low-Rank Adaptation by Backpropagation for Large Language Models
- Block Sparse Bayesian Learning: A Diversified Scheme
- Block Transformer: Global-to-Local Language Modeling for Fast Inference
- BLURD: Benchmarking and Learning using a Unified Rendering and Diffusion Model
- B'MOJO: Hybrid State Space Realizations of Foundation Models with Eidetic and Fading Memory
- BMRS: Bayesian Model Reduction for Structured Pruning
- BoNBoN Alignment for Large Language Models and the Sweetness of Best-of-n Sampling
- BoostAdapter: Improving Vision-Language Test-Time Adaptation via Regional Bootstrapping
- Boosted Conformal Prediction Intervals
- Boosting Alignment for Post-Unlearning Text-to-Image Generative Models
- Boosting Generalization in Parametric PDE Neural Solvers through Adaptive Conditioning
- Boosting Graph Pooling with Persistent Homology
- Boosting Sample Efficiency and Generalization in Multi-agent Reinforcement Learning via Equivariance
- Boosting Semi-Supervised Scene Text Recognition via Viewing and Summarizing
- Boosting Text-to-Video Generative Model with MLLMs Feedback
- Boosting the Potential of Large Language Models with an Intelligent Information Assistant
- Boosting the Transferability of Adversarial Attack on Vision Transformer with Adaptive Token Tuning
- Boosting Transferability and Discriminability for Time Series Domain Adaptation
- Boosting Vision-Language Models with Transduction
- Boosting Weakly Supervised Referring Image Segmentation via Progressive Comprehension
- Bootstrapping Top-down Information for Self-modulating Slot Attention
- Boundary Decomposition for Nadir Objective Vector Estimation
- Boundary Matters: A Bi-Level Active Finetuning Method
- Bounds for the smallest eigenvalue of the NTK for arbitrary spherical data of arbitrary dimension
- BPQP: A Differentiable Convex Optimization Framework for Efficient End-to-End Learning
- BrainBits: How Much of the Brain are Generative Reconstruction Methods Using?
- Brain-JEPA: Brain Dynamics Foundation Model with Gradient Positioning and Spatiotemporal Masking
- Brain Treebank: Large-scale intracranial recordings from naturalistic language stimuli
- Breaking Determinism: Fuzzy Modeling of Sequential Recommendation Using Discrete State Space Diffusion Model
- Breaking Long-Tailed Learning Bottlenecks: A Controllable Paradigm with Hypernetwork-Generated Diverse Experts
- Breaking Semantic Artifacts for Generalized AI-generated Image Detection
- Breaking the curse of dimensionality in structured density estimation
- Breaking the False Sense of Security in Backdoor Defense through Re-Activation Attack
- BricksRL: A Platform for Democratizing Robotics and Reinforcement Learning Research and Education with LEGO
- Bridge-IF: Learning Inverse Protein Folding with Markov Bridges
- Bridge the Modality and Capability Gaps in Vision-Language Model Selection
- Bridge the Points: Graph-based Few-shot Segment Anything Semantically
- Bridging Gaps: Federated Multi-View Clustering in Heterogeneous Hybrid Views
- Bridging Geometric States via Geometric Diffusion Bridge
- Bridging Model-Based Optimization and Generative Modeling via Conservative Fine-Tuning of Diffusion Models
- Bridging Multicalibration and Out-of-distribution Generalization Beyond Covariate Shift
- Bridging OOD Detection and Generalization: A Graph-Theoretic View
- Bridging semantics and pragmatics in information-theoretic emergent communication
- Bridging the Divide: Reconsidering Softmax and Linear Attention
- Bridging The Gap between Low-rank and Orthogonal Adaptation via Householder Reflection Adaptation
- Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
- Building a stable classifier with the inflated argmax
- Building on Efficient Foundations: Effective Training of LLMs with Structured Feedforward Layers
- Building Timeseries Dataset: Empowering Large-Scale Building Analytics
- Byzantine Robustness and Partial Participation Can Be Achieved at Once: Just Clip Gradient Differences
- CableInspect-AD: An Expert-Annotated Anomaly Detection Dataset
- CALANet: Cheap All-Layer Aggregation for Human Activity Recognition
- Cal-DPO: Calibrated Direct Preference Optimization for Language Model Alignment
- CALE: Continuous Arcade Learning Environment
- Calibrated Self-Rewarding Vision Language Models
- Calibrating Reasoning in Language Models with Internal Consistency
- CALVIN: Improved Contextual Video Captioning via Instruction Tuning
- Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs
- Can an AI Agent Safely Run a Government? Existence of Probably Approximately Aligned Policies
- Can Graph Learning Improve Planning in LLM-based Agents?
- Can Graph Neural Networks Expose Training Data Properties? An Efficient Risk Assessment Approach
- Can Language Models Learn to Skip Steps?
- Can Language Models Perform Robust Reasoning in Chain-of-thought Prompting with Noisy Rationales?
- Can Large Language Model Agents Simulate Human Trust Behavior?
- Can Large Language Models Analyze Graphs like Professionals? A Benchmark and Dataset
- Can large language models explore in-context?
- Can Learned Optimization Make Reinforcement Learning Less Difficult?
- Can LLMs Implicitly Learn Numeric Parameter Constraints in Data Science APIs?
- Can LLMs Learn by Teaching for Better Reasoning? A Preliminary Study
- Can LLMs Solve Molecule Puzzles? A Multimodal Benchmark for Molecular Structure Elucidation
- Can Models Learn Skill Composition from Examples?
- Can neural operators always be continuously discretized?
- Can Simple Averaging Defeat Modern Watermarks?
- Can Transformers Smell Like Humans?
- Can We Leave Deepfake Data Behind in Training Deepfake Detector?
- CaptainCook4D: A Dataset for Understanding Errors in Procedural Activities
- Capturing the denoising effect of PCA via compression ratio
- Cardinality-Aware Set Prediction and Top-$k$ Classification
- CARE: a Benchmark Suite for the Classification and Retrieval of Enzymes
- CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models
- Carrot and Stick: Eliciting Comparison Data and Beyond
- Cascade of phase transitions in the training of energy-based models
- Cascade Speculative Drafting for Even Faster LLM Inference
- CA-SSLR: Condition-Aware Self-Supervised Learning Representation for Generalized Speech Processing
- CAT3D: Create Anything in 3D with Multi-View Diffusion Models
- Catastrophic Goodhart: regularizing RLHF with KL divergence does not mitigate heavy-tailed reward misspecification
- CAT: Coordinating Anatomical-Textual Prompts for Multi-Organ and Tumor Segmentation
- Categorical Flow Matching on Statistical Manifolds
- Causal Bandits for Linear Structural Equation Models
- CausalChaos! Dataset for Comprehensive Causal Action Question Answering Over Longer Causal Chains Grounded in Dynamic Visual Scenes
- Causal Context Adjustment Loss for Learned Image Compression
- Causal Contrastive Learning for Counterfactual Regression Over Time
- Causal Deciphering and Inpainting in Spatio-Temporal Dynamics via Diffusion Model
- Causal Dependence Plots
- CausalDiff: Causality-Inspired Disentanglement via Diffusion Model for Adversarial Defense
- Causal Discovery from Event Sequences by Local Cause-Effect Attribution
- Causal discovery with endogenous context variables
- Causal Effect Identification in a Sub-Population with Latent Variables
- Causal Imitation for Markov Decision Processes: a Partial Identification Approach
- Causal Inference in the Closed-Loop: Marginal Structural Models for Sequential Excursion Effects
- Causality and Large Models
- Causality for Large Language Models
- Causal language modeling can elicit search and reasoning capabilities on logic puzzles
- Causal-learn: Causal Discovery in Python
- CausalStock: Deep End-to-end Causal Discovery for News-driven Multi-stock Movement Prediction
- Causal Temporal Representation Learning with Nonstationary Sparse Transition
- Causal vs. Anticausal merging of predictors
- Cell ontology guided transcriptome foundation model
- CemiFace: Center-based Semi-hard Synthetic Face Generation for Face Recognition
- CE-NAS: An End-to-End Carbon-Efficient Neural Architecture Search Framework
- Certified Adversarial Robustness via Randomized $\alpha$-Smoothing for Regression Models
- Certified Machine Unlearning via Noisy Stochastic Gradient Descent
- Certified Robustness for Deep Equilibrium Models via Serialized Random Smoothing
- C-GAIL: Stabilizing Generative Adversarial Imitation Learning with Control Theory
- Chain of Agents: Large Language Models Collaborating on Long-Context Tasks
- Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs
- Chain of Thoughtlessness? An Analysis of CoT in Planning
- Chain-of-Thought Reasoning Without Prompting
- Chain-of-Thought Unfaithfulness as Disguised Accuracy
- Challenges of Generating Structurally Diverse Graphs
- Changing the Training Data Distribution to Reduce Simplicity Bias Improves In-distribution Generalization
- ChaosBench: A Multi-Channel, Physics-Based Benchmark for Subseasonal-to-Seasonal Climate Prediction
- CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs
- CHASE: Learning Convex Hull Adaptive Shift for Skeleton-based Multi-Entity Action Recognition
- ChatCam: Empowering Camera Control through Conversational AI
- ChatQA: Surpassing GPT-4 on Conversational QA and RAG
- Chat-Scene: Bridging 3D Scene and Large Language Models with Object Identifiers
- ChatTracker: Enhancing Visual Tracking Performance via Chatting with Multimodal Large Language Model
- Cherry on Top: Parameter Heterogeneity and Quantization in Large Language Models
- Chimera: Effectively Modeling Multivariate Time Series with 2-Dimensional State Space Models
- ChronoEpilogi: Scalable Time Series Selection with Multiple Solutions
- ChronoMagic: A Benchmark for Metamorphic Evaluation of Time-lapse Text-to-Video Generation
- CIFD: Controlled Information Flow to Enhance Knowledge Distillation
- CigTime: Corrective Instruction Generation Through Inverse Motion Editing
- CiteME: Can Language Models Accurately Cite Scientific Claims?
- CLAP4CLIP: Continual Learning with Probabilistic Finetuning for Vision-Language Models
- ClashEval: Quantifying the tug-of-war between an LLM’s internal prior and external evidence
- Class Distribution Shifts in Zero-Shot Learning: Learning Robust Representations
- Classic GNNs are Strong Baselines: Reassessing GNNs for Node Classification
- Classification Diffusion Models: Revitalizing Density Ratio Estimation
- Classification Done Right for Vision-Language Pre-Training
- Classifier Clustering and Feature Alignment for Federated Learning under Distributed Concept Drift
- Classifier-guided Gradient Modulation for Enhanced Multimodal Learning
- ClavaDDPM: Multi-relational Data Synthesis with Cluster-guided Diffusion Models
- CLAVE: An Adaptive Framework for Evaluating Values of LLM Generated Responses
- CleanDiffuser: An Easy-to-use Modularized Library for Diffusion Models in Decision Making
- ClevrSkills: Compositional Language And Visual Understanding in Robotics
- CLIPAway: Harmonizing focused embeddings for removing objects via diffusion models
- CLIPCEIL: Domain Generalization through CLIP via Channel rEfinement and Image-text aLignment
- CLIP in Mirror: Disentangling text from visual images through reflection
- CLIPLoss and Norm-Based Data Selection Methods for Multimodal Contrastive Learning
- Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation
- Cloud Object Detector Adaptation by Integrating Different Source Knowledge
- CLUES: Collaborative Private-domain High-quality Data Selection for LLMs via Training Dynamics
- Clustering in Causal Attention Masking
- Clustering then Propagation: Select Better Anchors for Knowledge Graph Embedding
- Clustering with Non-adaptive Subset Queries
- Cluster-Learngene: Inheriting Adaptive Clusters for Vision Transformers
- Cluster-wise Graph Transformer with Dual-granularity Kernelized Attention
- CNCA: Toward Customizable and Natural Generation of Adversarial Camouflage for Vehicle Detectors
- Coarse-to-Fine Concept Bottleneck Models
- CoBo: Collaborative Learning via Bilevel Optimization
- CODA: A Correlation-Oriented Disentanglement and Augmentation Modeling Scheme for Better Resisting Subpopulation Shifts
- Codec Avatar Studio: Paired Human Captures for Complete, Driveable, and Generalizable Avatars
- CODE: Contrasting Self-generated Description to Combat Hallucination in Large Multi-modal Models
- Coded Computing for Resilient Distributed Computing: A Learning-Theoretic Framework
- Code Repair with LLMs gives an Exploration-Exploitation Tradeoff
- CodeRosetta: Pushing the Boundaries of Unsupervised Code Translation for Parallel Programming
- Coevolving with the Other You: Fine-Tuning LLM with Sequential Cooperative Multi-Agent Reinforcement Learning
- CoFie: Learning Compact Neural Surface Representations with Coordinate Fields
- CogVLM: Visual Expert for Pretrained Language Models
- Coherence-free Entrywise Estimation of Eigenvectors in Low-rank Signal-plus-noise Matrix Models
- Coherent 3D Scene Diffusion From a Single RGB Image
- CoIN: A Benchmark of Continual Instruction Tuning for Multimodel Large Language Models
- COLD: Causal reasOning in cLosed Daily activities
- ColJailBreak: Collaborative Generation and Editing for Jailbreaking Text-to-Image Deep Generation
- Collaboration! Towards Robust Neural Methods for Routing Problems
- Collaborative Cognitive Diagnosis with Disentangled Representation Learning for Learner Modeling
- Collaborative Refining for Learning from Inaccurate Labels
- Collaborative Video Diffusion: Consistent Multi-video Generation with Camera Control
- CoLLaM: A Comprehensive Chinese Legal Benchmark for Evaluating Large Language Models
- CoLoR-Filter: Conditional Loss Reduction Filtering for Targeted Language Model Pre-training
- Color-Oriented Redundancy Reduction in Dataset Distillation
- CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching
- ComBack: A Versatile Dataset for Enhancing Compiler Backend Development Efficiency
- Combining Observational Data and Language for Species Range Estimation
- Combining Statistical Depth and Fermat Distance for Uncertainty Quantification
- CoMERA: Computing- and Memory-Efficient Training via Rank-Adaptive Tensor Optimization
- CoMix: A Comprehensive Benchmark for Multi-Task Comic Understanding
- Communication Bounds for the Distributed Experts Problem
- Communication Efficient Distributed Training with Distributed Lion
- Communication-Efficient Federated Group Distributionally Robust Optimization
- Community Detection Guarantees using Embeddings Learned by Node2Vec
- Compact Language Models via Pruning and Knowledge Distillation
- Compact Proofs of Model Performance via Mechanistic Interpretability
- CompBench: A Comparative Reasoning Benchmark for Multimodal LLMs
- Complete Graphical Criterion for Sequential Covariate Adjustment in Causal Inference
- Compositional 3D-aware Video Generation with LLM Director
- Compositional Automata Embeddings for Goal-Conditioned Reinforcement Learning
- Compositional Generalization Across Distributional Shifts with Sparse Tree Operations
- Compositional Learning: Perspectives, Methods, and Paths Forward
- Compositional PAC-Bayes: Generalization of GNNs with persistence and beyond
- Comprehensive Framework for Curating Speech Datasets and Evaluating ASR Systems: A Case Study for the Polish Language
- Compressing Large Language Models using Low Rank and Low Precision Decomposition
- Computational Aspects of Bayesian Persuasion under Approximate Best Response
- Computation-Aware Gaussian Processes: Model Selection And Linear-Time Inference
- Computerized Adaptive Testing via Collaborative Ranking
- Computing the Bias of Constant-step Stochastic Approximation with Markovian Noise
- Con4m: Context-aware Consistency Learning Framework for Segmented Time Series Classification
- Concentrate Attention: Towards Domain-Generalizable Prompt Optimization for Language Models
- ConceptFactory: Facilitate 3D Object Knowledge Annotation with Object Conceptualization
- ConceptMix: A Compositional Image Generation Benchmark with Controllable Difficulty
- Conditional Controllable Image Fusion
- Conditional Density Estimation with Histogram Trees
- Conditional Generative Models are Sufficient to Sample from Any Causal Effect Estimand
- Conditional Outcome Equivalence: A Quantile Alternative to CATE
- Conditional Synthesis of 3D Molecules with Time Correction Sampler
- Conditioning non-linear and infinite-dimensional diffusion processes
- CondTSF: One-line Plugin of Dataset Condensation for Time Series Forecasting
- Confidence Calibration of Classifiers with Many Classes
- Confidence Regulation Neurons in Language Models
- Confident Natural Policy Gradient for Local Planning in $q_\pi$-realizable Constrained MDPs
- Conformal Alignment: Knowing When to Trust Foundation Models with Guarantees
- Conformal Classification with Equalized Coverage for Adaptively Selected Groups
- Conformal Inverse Optimization
- Conformalized Credal Set Predictors
- Conformalized Multiple Testing after Data-dependent Selection
- Conformalized Time Series with Semantic Features
- Conformal Prediction for Class-wise Coverage via Augmented Label Rank Calibration
- Confusion-Resistant Federated Learning via Diffusion-Based Data Harmonization on Non-IID Data
- Conjugate Bayesian Two-step Change Point Detection for Hawkes Process
- Conjugated Semantic Pool Improves OOD Detection with Pre-trained Vision-Language Models
- ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs
- Connecting Joint-Embedding Predictive Architecture with Contrastive Self-supervised Learning
- Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data
- Connectivity-Driven Pseudo-Labeling Makes Stronger Cross-Domain Segmenters
- Connectivity Shapes Implicit Regularization in Matrix Factorization Models for Matrix Completion
- Consensus Learning with Deep Sets for Essential Matrix Estimation
- Consistency Diffusion Bridge Models
- Consistency Models for Scalable and Fast Simulation-Based Inference
- Consistency of Neural Causal Partial Identification
- Consistency Purification: Effective and Efficient Diffusion Purification towards Certified Robustness
- Constant Acceleration Flow
- ConStat: Performance-Based Contamination Detection in Large Language Models
- Constrained Adaptive Attack: Effective Adversarial Attack Against Deep Neural Networks for Tabular Data
- Constrained Binary Decision Making
- Constrained Diffusion Models via Dual Training
- Constrained Diffusion with Trust Sampling
- Constrained Human-AI Cooperation: An Inclusive Embodied Social Intelligence Challenge
- Constrained Latent Action Policies for Model-Based Offline Reinforcement Learning
- Constrained Sampling with Primal-Dual Langevin Monte Carlo
- Constrained Synthesis with Projected Diffusion Models
- Constructing Semantics-Aware Adversarial Examples with Probabilistic Perspective
- Construction and Application of Materials Knowledge Graph in Multidisciplinary Materials Science via Large Language Model
- ContactField: Implicit Field Representation for Multi-Person Interaction Geometry
- Context and Geometry Aware Voxel Transformer for Semantic Scene Completion
- Context-Aware Testing: A New Paradigm for Model Testing with Large Language Models
- ContextCite: Attributing Model Generation to Context
- ContextGS: Compact 3D Gaussian Splatting with Anchor Level Context Model
- Contextual Active Model Selection
- Contextual Bilevel Reinforcement Learning for Incentive Alignment
- Contextual Decision-Making with Knapsacks Beyond the Worst Case
- Contextual Linear Optimization with Bandit Feedback
- Contextual Multinomial Logit Bandits with General Value Functions
- Continual Audio-Visual Sound Separation
- Continual Counting with Gradual Privacy Expiration
- Continual Learning in the Frequency Domain
- Continual Learning with Global Alignment
- Continual learning with the neural tangent ensemble
- Continuous Contrastive Learning for Long-Tailed Semi-Supervised Recognition
- Continuous Heatmap Regression for Pose Estimation via Implicit Neural Representation
- Continuously Learning, Adapting, and Improving: A Dual-Process Approach to Autonomous Driving
- Continuous Partitioning for Graph-Based Semi-Supervised Learning
- Continuous Product Graph Neural Networks
- Continuous Spatiotemporal Events Decoupling through Spike-based Bayesian Computation
- Continuous Temporal Domain Generalization
- Contracting with a Learning Agent
- CONTRAST: Continual Multi-source Adaptation to Dynamic Distributions
- Contrasting with Symile: Simple Model-Agnostic Representation Learning for Unlimited Modalities
- Contrastive dimension reduction: when and how?
- Contrastive-Equivariant Self-Supervised Learning Improves Alignment with Primate Visual Area IT
- Contrastive losses as generalized models of global epistasis
- Controlled maximal variability along with reliable performance in recurrent neural networks
- Controlling Continuous Relaxation for Combinatorial Optimization
- Controlling Counterfactual Harm in Decision Support Systems Based on Prediction Sets
- Controlling Multiple Errors Simultaneously with a PAC-Bayes Bound
- ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models
- ControlSynth Neural ODEs: Modeling Dynamical Systems with Guaranteed Convergence
- ConvBench: A Multi-Turn Conversation Evaluation Benchmark with Hierarchical Ablation Capability for Large Vision-Language Models
- Convergence Analysis of Split Federated Learning on Heterogeneous Data
- Convergence of $\text{log}(1/\epsilon)$ for Gradient-Based Algorithms in Zero-Sum Games without the Condition Number: A Smoothed Analysis
- Convergence of No-Swap-Regret Dynamics in Self-Play
- Convolutional Differentiable Logic Gate Networks
- Convolutions and More as Einsum: A Tensor Network Perspective with Advances for Second-Order Methods
- Co-occurrence is not Factual Association in Language Models
- CooHOI: Learning Cooperative Human-Object Interaction with Manipulated Object Dynamics
- Cooperate or Collapse: Emergence of Sustainable Cooperation in a Society of LLM Agents
- Cooperation, Competition, and Maliciousness: LLM-Stakeholders Interactive Negotiation
- Cooperative Hardware-Prompt Learning for Snapshot Compressive Imaging
- Copycats: the many lives of a publicly available medical imaging dataset
- CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for Task-Aware Parameter-Efficient Fine-tuning
- Corruption-Robust Linear Bandits: Minimax Optimality and Gap-Dependent Misspecification
- CosAE: Learnable Fourier Series for Image Restoration
- COSMIC: Compress Satellite Image Efficiently via Diffusion Compensation
- Cost-aware Bayesian Optimization via the Pandora's Box Gittins Index
- Cost-efficient Knowledge-based Question Answering with Large Language Models
- CoSW: Conditional Sample Weighting for Smoke Segmentation with Label Noise
- CoSy: Evaluating Textual Explanations of Neurons
- Counter-Current Learning: A Biologically Plausible Dual Network Approach for Deep Learning
- Counterfactual Fairness by Combining Factual and Counterfactual Predictions
- CountGD: Multi-Modal Open-World Counting
- Coupled Mamba: Enhanced Multimodal Fusion with Coupled State Space Model
- Covariate Shift Corrected Conditional Randomization Test
- COVE: Unleashing the Diffusion Feature Correspondence for Consistent Video Editing
- CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations
- cPAPERS: A Dataset of Situated and Multimodal Interactive Conversations in Scientific Papers
- Cracking the Code of Juxtaposition: Can AI Models Understand the Humorous Contradictions
- Crafting Interpretable Embeddings for Language Neuroscience by Asking LLMs Questions
- CRAG - Comprehensive RAG Benchmark
- CRAYM: Neural Field Optimization via Camera RAY Matching
- Credal Deep Ensembles for Uncertainty Quantification
- Credal Learning Theory
- Credit Attribution and Stable Compression
- Critically Assessing the State of the Art in Neural Network Verification
- CriticEval: Evaluating Large-scale Language Model as Critic
- Croissant: A Metadata Format for ML-Ready Datasets
- CRONOS: Enhancing Deep Learning with Scalable GPU Accelerated Convex Neural Networks
- Cross-Care: Assessing the Healthcare Implications of Pre-training Data on Language Model Bias
- Cross-Device Collaborative Test-Time Adaptation
- Cross-disciplinary insights into alignment in humans and machines
- Cross-Modality Perturbation Synergy Attack for Person Re-identification
- Cross-modal Representation Flattening for Multi-modal Domain Generalization
- Cross-model Control: Improving Multiple Large Language Models in One-time Training
- Cross-Scale Self-Supervised Blind Image Deblurring via Implicit Neural Representation
- Cross-video Identity Correlating for Person Re-identification Pre-training
- CRT-Fusion: Camera, Radar, Temporal Fusion Using Motion Information for 3D Object Detection
- CryoBench: Datasets and Benchmarks for Heterogeneous Cryo-EM Reconstruction
- CryoGEM: Physics-Informed Generative Cryo-Electron Microscopy
- CryoSPIN: Improving Ab-Initio Cryo-EM Reconstruction with Semi-Amortized Pose Inference
- Cryptographic Hardness of Score Estimation
- CSPG: Crossing Sparse Proximity Graphs for Approximate Nearest Neighbor Search
- CTIBench: A Benchmark for Evaluating LLMs in Cyber Threat Intelligence
- Ctrl-X: Controlling Structure and Appearance for Text-To-Image Generation Without Guidance
- CultureLLM: Incorporating Cultural Differences into Large Language Models
- CulturePark: Boosting Cross-cultural Understanding in Large Language Models
- CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts
- CURE4Rec: A Benchmark for Recommendation Unlearning with Deeper Influence
- Curriculum Fine-tuning of Vision Foundation Model for Medical Image Classification Under Label Noise
- Curvature Clues: Decoding Deep Learning Privacy with Input Loss Curvature
- Customized Multiple Clustering via Multi-Modal Subspace Proxy Learning
- Customized Subgraph Selection and Encoding for Drug-drug Interaction Prediction
- Customizing Language Models with Instance-wise LoRA for Sequential Recommendation
- CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark
- CV-VAE: A Compatible Video VAE for Latent Generative Video Models
- CycleNet: Enhancing Time Series Forecasting through Modeling Periodic Patterns
- CYCLO: Cyclic Graph Transformer Approach to Multi-Object Relationship Modeling in Aerial Videos
- D2R2: Diffusion-based Representation with Random Distance Matching for Tabular Few-shot Learning
- D3S3: Data-driven and Differentiable Simulations, Surrogates, and Solvers
- DA-Ada: Learning Domain-Aware Adapter for Domain Adaptive Object Detection
- DACO: Towards Application-Driven and Comprehensive Data Analysis via Code Generation
- DAGER: Exact Gradient Inversion for Large Language Models
- DAPE: Data-Adaptive Positional Encoding for Length Extrapolation
- DapperFL: Domain Adaptive Federated Learning with Model Fusion Pruning for Edge Devices
- DARG: Dynamic Evaluation of Large Language Models via Adaptive Reasoning Graph
- DarkSAM: Fooling Segment Anything Model to Segment Nothing
- DARNet: Dual Attention Refinement Network with Spatiotemporal Construction for Auditory Attention Detection
- DART-Eval: A Comprehensive DNA Language Model Evaluation Benchmark on Regulatory DNA
- DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving
- DASH: Warm-Starting Neural Network Training in Stationary Settings without Loss of Plasticity
- Data Acquisition via Experimental Design for Data Markets
- Data Attribution for Text-to-Image Models by Unlearning Synthesized Images
- Data Augmentation with Diffusion for Open-Set Semi-Supervised Learning
- DataComp-LM: In search of the next generation of training sets for language models
- Data curation via joint example selection further accelerates multimodal learning
- Data Distribution Valuation
- Data-Driven Discovery of Dynamical Systems in Pharmacology using Large Language Models
- Data-Efficient Learning with Neural Programs
- Data-Efficient Operator Learning via Unsupervised Pretraining and In-Context Learning
- Data-faithful Feature Attribution: Mitigating Unobservable Confounders via Instrumental Variables
- Data Free Backdoor Attacks
- Data Mixture Inference Attack: BPE Tokenizers Reveal Training Data Compositions
- Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition
- Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum
- DataStealing: Steal Data from Diffusion Models in Federated Learning with Multiple Trojans
- Data subsampling for Poisson regression with pth-root-link
- DAT: Improving Adversarial Robustness via Generative Amplitude Mix-up in Frequency Domain
- DCDepth: Progressive Monocular Depth Estimation in Discrete Cosine Domain
- DC-Gaussian: Improving 3D Gaussian Splatting for Reflective Dash Cam Videos
- D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models
- DDGS-CT: Direction-Disentangled Gaussian Splatting for Realistic Volume Rendering
- DDK: Distilling Domain Knowledge for Efficient Large Language Models
- DDN: Dual-domain Dynamic Normalization for Non-stationary Time Series Forecasting
- DDR: Exploiting Deep Degradation Response as Flexible Image Descriptor
- Dealing with Synthetic Data Contamination in Online Continual Learning
- DeBaRA: Denoising-Based 3D Room Arrangement Generation
- Debiasing Synthetic Data Generated by Deep Generative Models
- Decentralized Noncooperative Games with Coupled Decision-Dependent Distributions
- Decision-Focused Learning with Directional Gradients
- Decision-Making Behavior Evaluation Framework for LLMs under Uncertain Context
- Decision Mamba: A Multi-Grained State Space Model with Self-Evolution Regularization for Offline RL
- Decision Mamba: Reinforcement Learning via Hybrid Selective Sequence Modeling
- DECO-Bench: Unified Benchmark for Decoupled Task-Agnostic Synthetic Data Release
- Decoding-Time Language Model Alignment with Multiple Objectives
- Decomposable Transformer Point Processes
- Decompose, Analyze and Rethink: Solving Intricate Problems with Human-like Reasoning Cycle
- Decomposed Prompt Decision Transformer for Efficient Unseen Task Generalization
- Decomposing and Interpreting Image Representations via Text in ViTs Beyond CLIP
- Decoupled Kullback-Leibler Divergence Loss
- Decoupling Semantic Similarity from Spatial Alignment for Neural Networks
- DECRL: A Deep Evolutionary Clustering Jointed Temporal Knowledge Graph Representation Learning Approach
- Deep Bayesian Active Learning for Preference Modeling in Large Language Models
- Deep Correlated Prompting for Visual Recognition with Missing Modalities
- DeepDRK: Deep Dependency Regularized Knockoff for Feature Selection
- Deep Equilibrium Algorithmic Reasoning
- Deep Graph Mating
- Deep Graph Neural Networks via Posteriori-Sampling-based Node-Adaptative Residual Module
- Deep Homomorphism Networks
- DeepITE: Designing Variational Graph Autoencoders for Intervention Target Estimation
- DeepLag: Discovering Deep Lagrangian Dynamics for Intuitive Fluid Prediction
- Deep Learning for Computing Convergence Rates of Markov Chains
- Deep Learning in Medical Image Registration: Magic or Mirage?
- Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond
- Deep linear networks for regression are implicitly regularized towards flat minima
- Deep Policy Gradient Methods Without Batch Updates, Target Networks, or Replay Buffers
- DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs
- Deep Submodular Peripteral Networks
- Deep Support Vectors
- DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution
- Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Models
- DeformableTST: Transformer for Time Series Forecasting without Over-reliance on Patching
- DEFT: Efficient Fine-tuning of Diffusion Models by Learning the Generalised $h$-transform
- DeiSAM: Segment Anything with Deictic Prompting
- Déjà Vu Memorization in Vision–Language Models
- DEL: Discrete Element Learner for Learning 3D Particle Dynamics with Neural Rendering
- Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for Large Language Models
- DeltaDEQ: Exploiting Heterogeneous Convergence for Accelerating Deep Equilibrium Iterations
- DeltaDock: A Unified Framework for Accurate, Efficient, and Physically Reliable Molecular Docking
- Delving into the Reversal Curse: How Far Can Large Language Models Generalize?
- DeMo: Decoupling Motion Forecasting into Directional Intentions and Dynamic States
- Demystify Mamba in Vision: A Linear Attention Perspective
- Dendritic Integration Inspired Artificial Neural Networks Capture Data Correlation
- DeNetDM: Debiasing by Network Depth Modulation
- DenoiseRep: Denoising Model for Representation Learning
- Denoising Diffusion Path: Attribution Noise Reduction with An Auxiliary Diffusion Model
- Dense Associative Memory Through the Lens of Random Features
- Dense Connector for MLLMs
- DenseFormer: Enhancing Information Flow in Transformers via Depth Weighted Averaging
- DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception
- Density-based User Representation using Gaussian Process Regression for Multi-interest Personalized Retrieval
- DePLM: Denoising Protein Language Models for Property Optimization
- DEPrune: Depth-wise Separable Convolution Pruning for Maximizing GPU Parallelism
- Depth Anything V2
- Depth Anywhere: Enhancing 360 Monocular Depth Estimation via Perspective Distillation and Unlabeled Data Augmentation
- Derandomizing Multi-Distribution Learning
- Derivative-enhanced Deep Operator Network
- Derivatives of Stochastic Gradient Descent in parametric optimization
- Designing Cell-Type-Specific Promoter Sequences Using Conservative Model-Based Optimization
- Designs for Enabling Collaboration in Human-Machine Teaming via Interactive and Explainable Systems
- DeSparsify: Adversarial Attack Against Token Sparsification Mechanisms
- DETAIL: Task DEmonsTration Attribution for Interpretable In-context Learning
- DetectEval: Benchmarking LLM-Generated Text Detection in Real-World Scenarios
- Detecting and Measuring Confounding Using Causal Mechanism Shifts
- Detecting Brittle Decisions for Free: Leveraging Margin Consistency in Deep Robust Classifiers
- Detecting Bugs with Substantial Monetary Consequences by LLM and Rule-based Reasoning
- DeTeCtive: Detecting AI-generated Text via Multi-Level Contrastive Learning
- Deterministic Policies for Constrained Reinforcement Learning in Polynomial Time
- Deterministic Uncertainty Propagation for Improved Model-Based Offline Reinforcement Learning
- DeTikZify: Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ
- DeTrack: In-model Latent Denoising Learning for Visual Object Tracking
- DevBench: A multimodal developmental benchmark for language learning
- DEX: Data Channel Extension for Efficient CNN Inference on Tiny AI Accelerators
- DF40: Toward Next-Generation Deepfake Detection
- DFA-GNN: Forward Learning of Graph Neural Networks by Direct Feedback Alignment
- DG-SLAM: Robust Dynamic Gaussian Splatting SLAM with Hybrid Pose Optimization
- DHA: Learning Decoupled-Head Attention from Transformer Checkpoints via Adaptive Heads Fusion
- DiffAug: A Diffuse-and-Denoise Augmentation for Training Robust Classifiers
- DiffCut: Catalyzing Zero-Shot Semantic Segmentation with Diffusion Features and Recursive Normalized Cut
- Diffeomorphic interpolation for efficient persistence-based topological optimization
- Diff-eRank: A Novel Rank-Based Metric for Evaluating Large Language Models
- Differentiable Modal Synthesis for Physical Modeling of Planar String Sound and Motion Simulation
- Differentiable Quantum Computing for Large-scale Linear Control
- Differentiable Structure Learning with Partial Orders
- Differentiable Task Graph Learning: Procedural Activity Representation and Online Mistake Detection from Egocentric Videos
- Differentially Private Equivalence Testing for Continuous Distributions and Applications
- Differentially Private Graph Diffusion with Applications in Personalized PageRanks
- Differentially Private Optimization with Sparse Gradients
- Differentially Private Reinforcement Learning with Self-Play
- Differentially Private Set Representations
- Differentially Private Stochastic Gradient Descent with Fixed-Size Minibatches: Tighter RDP Guarantees with or without Replacement
- Differential Privacy in Scalable General Kernel Learning via $K$-means Nyström Random Features
- DiffGS: Functional Gaussian Splatting Diffusion
- DiffHammer: Rethinking the Robustness of Diffusion-Based Adversarial Purification
- DiffLight: A Partial Rewards Conditioned Diffusion Model for Traffic Signal Control with Missing Data
- DiffNorm: Self-Supervised Normalization for Non-autoregressive Speech-to-speech Translation
- DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion
- DiffPhyCon: A Generative Approach to Control Complex Physical Systems
- DiffPO: A causal diffusion model for learning distributions of potential outcomes
- DiffSF: Diffusion Models for Scene Flow Estimation
- DiffTORI: Differentiable Trajectory Optimization for Deep Reinforcement and Imitation Learning
- DiffuBox: Refining 3D Object Detection with Point Diffusion
- DiffuLT: Diffusion for Long-tail Recognition Without External Knowledge
- DiffuPac: Contextual Mimicry in Adversarial Packets Generation via Diffusion Model
- DiffuserLite: Towards Real-time Diffusion Planning
- Diffusing Differentiable Representations
- Diffusion4D: Fast Spatial-temporal Consistent 4D generation via Video Diffusion Models
- Diffusion Actor-Critic with Entropy Regulator
- Diffusion-based Curriculum Reinforcement Learning
- Diffusion-based Layer-wise Semantic Reconstruction for Unsupervised Out-of-Distribution Detection
- Diffusion-based Reinforcement Learning via Q-weighted Variational Policy Optimization
- DiffusionBlend: Learning 3D Image Prior through Position-aware Diffusion Score Blending for 3D Computed Tomography Reconstruction
- Diffusion-DICE: In-Sample Diffusion Guidance for Offline Reinforcement Learning
- DiffusionFake: Enhancing Generalization in Deepfake Detection via Guided Stable Diffusion
- Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion
- Diffusion for World Modeling: Visual Details Matter in Atari
- Diffusion Imitation from Observation
- Diffusion-Inspired Truncated Sampler for Text-Video Retrieval
- Diffusion Models are Certifiably Robust Classifiers
- Diffusion Models With Learned Adaptive Noise
- Diffusion Model with Cross Attention as an Inductive Bias for Disentanglement
- Diffusion of Thought: Chain-of-Thought Reasoning in Diffusion Language Models
- DiffusionPDE: Generative PDE-Solving under Partial Observation
- Diffusion PID: Interpreting Diffusion via Partial Information Decomposition
- Diffusion Policies Creating a Trust Region for Offline Reinforcement Learning
- Diffusion Policy Attacker: Crafting Adversarial Attacks for Diffusion-based Policies
- Diffusion Priors for Variational Likelihood Estimation and Image Denoising
- Diffusion-Reward Adversarial Imitation Learning
- Diffusion Spectral Representation for Reinforcement Learning
- Diffusion Tuning: Transferring Diffusion Models via Chain of Forgetting
- Diffusion Twigs with Loop Guidance for Conditional Graph Generation
- DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning
- DiGRAF: Diffeomorphic Graph-Adaptive Activation Function
- DI-MaskDINO: A Joint Object Detection and Instance Segmentation Model
- Dimension-free deterministic equivalents and scaling laws for random feature regression
- Dimension-free Private Mean Estimation for Anisotropic Distributions
- DiMSUM: Diffusion Mamba - A Scalable and Unified Spatial-Frequency Method for Image Generation
- DINTR: Tracking via Diffusion-based Interpolation
- DiPEx: Dispersing Prompt Expansion for Class-Agnostic Object Detection
- DiP-GO: A Diffusion Pruner via Few-step Gradient Optimization
- Direct3D: Scalable Image-to-3D Generation via 3D Latent Diffusion Transformer
- Direct Consistency Optimization for Robust Customization of Text-to-Image Diffusion models
- DiReCT: Diagnostic Reasoning for Clinical Notes via Large Language Models
- Directional Smoothness and Gradient Methods: Convergence and Adaptivity
- Director3D: Real-world Camera Trajectory and 3D Scene Generation from Text
- Direct Preference-Based Evolutionary Multi-Objective Optimization with Dueling Bandits
- Direct Unlearning Optimization for Robust and Safe Text-to-Image Models
- DisCEdit: Model Editing by Identifying Discriminative Components
- DisC-GS: Discontinuity-aware Gaussian Splatting
- Discovering Creative Behaviors through DUPLEX: Diverse Universal Features for Policy Exploration
- Discovering plasticity rules that organize and maintain neural circuits
- Discovering Preference Optimization Algorithms with and for Large Language Models
- Discovering Sparsity Allocation for Layer-wise Pruning of Large Language Models
- Discovery of the Hidden World with Large Language Models
- DiscoveryWorld: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents
- Discrete Dictionary-based Decomposition Layer for Structured Representation Learning
- Discrete Flow Matching
- Discretely beyond $1/e$: Guided Combinatorial Algorithms for Submodular Maximization
- Discrete Modeling via Boundary Conditional Diffusion Processes
- Discrete-state Continuous-time Diffusion for Graph Generation
- DisenGCD: A Meta Multigraph-assisted Disentangled Graph Learning Framework for Cognitive Diagnosis
- Disentangled Representation Learning in Non-Markovian Causal Systems
- Disentangled Style Domain for Implicit $z$-Watermark Towards Copyright Protection
- Disentangled Unsupervised Skill Discovery for Efficient Hierarchical Reinforcement Learning
- Disentangling and mitigating the impact of task similarity for continual learning
- Disentangling Interpretable Factors with Supervised Independent Subspace Principal Component Analysis
- Disentangling Linear Quadratic Control with Untrusted ML Predictions
- Disentangling the Roles of Distinct Cell Classes with Cell-Type Dynamical Systems
- Dispelling the Mirage of Progress in Offline MARL through Standardised Baselines and Evaluation
- DISP-LLM: Dimension-Independent Structural Pruning for Large Language Models
- Dissect Black Box: Interpreting for Rule-Based Explanations in Unsupervised Anomaly Detection
- Dissecting Query-Key Interaction in Vision Transformers
- Dissecting the Failure of Invariant Learning on Graphs
- Dissecting the Interplay of Attention Paths in a Statistical Mechanics Theory of Transformers
- DistillNeRF: Perceiving 3D Scenes from Single-Glance Images by Distilling Neural Fields and Foundation Model Features
- Distributed Least Squares in Small Space via Sketching and Bias Reduction
- Distributed-Order Fractional Graph Operating Network
- Distributed Sparse Regression via Penalization
- Distributionally Robust Performative Prediction
- Distributionally Robust Reinforcement Learning with Interactive Data Collection: Fundamental Hardness and Near-Optimal Algorithms
- Distributional Preference Alignment of LLMs via Optimal Transport
- Distributional regression: CRPS-error bounds for model fitting, model selection and convex aggregation
- Distributional Reinforcement Learning with Regularized Wasserstein Loss
- Distributional Successor Features Enable Zero-Shot Policy Optimization
- Distribution-Aware Data Expansion with Diffusion Models
- Distribution Guidance Network for Weakly Supervised Point Cloud Semantic Segmentation
- Distribution Learning with Valid Outputs Beyond the Worst-Case
- DistrictNet: Decision-aware learning for geographical districting
- DiTFastAttn: Attention Compression for Diffusion Transformer Models
- Divergences between Language Models and Human Brains
- Diversify, Contextualize, and Adapt: Efficient Entropy Modeling for Neural Image Codec
- Diversity-Driven Synthesis: Enhancing Dataset Distillation through Directed Weight Adjustment
- Diversity Is Not All You Need: Training A Robust Cooperative Agent Needs Specialist Partners
- Divide-and-Conquer Meets Consensus: Unleashing the Power of Functions in Code Generation
- Divide-and-Conquer Posterior Sampling for Denoising Diffusion priors
- Divide-and-Conquer Predictive Coding: a structured Bayesian inference algorithm
- DivSafe: Evaluating the Generalization of LLM Safety Training Across Diverse Tasks and Prompt Types
- DLAD: Improving Logits-based Detector without Logits from Black-box LLMs
- D-LLM: A Token Adaptive Computing Resource Allocation Strategy for Large Language Models
- DMC-VB: A Benchmark for Representation Learning for Control with Visual Distractors
- DMesh: A Differentiable Mesh Representation
- D-MiSo: Editing Dynamic 3D Scenes using Multi-Gaussians Soup
- DMNet: Self-comparison Driven Model for Subject-independent Seizure Detection
- DMPlug: A Plug-in Method for Solving Inverse Problems with Diffusion Models
- DN-4DGS: Denoised Deformable Network with Temporal-Spatial Aggregation for Dynamic Scene Rendering
- Do causal predictors generalize better to new domains?
- Do Counterfactually Fair Image Classifiers Satisfy Group Fairness? -- A Theoretical and Empirical Study
- Does Egalitarian Fairness Lead to Instability? The Fairness Bounds in Stable Federated Learning Under Altruistic Behaviors
- Does Reasoning Emerge? Examining the Probabilities of Causation in Large Language Models
- Does Video-Text Pretraining Help Open-Vocabulary Online Action Detection?
- Does Worst-Performing Agent Lead the Pack? Analyzing Agent Dynamics in Unified Distributed SGD
- DOFEN: Deep Oblivious Forest ENsemble
- Do Finetti: On Causal Effects for Exchangeable Data
- DoFIT: Domain-aware Federated Instruction Tuning with Alleviated Catastrophic Forgetting
- DOGS: Distributed-Oriented Gaussian Splatting for Large-Scale 3D Reconstruction Via Gaussian Consensus
- Doing Experiments and Revising Rules with Natural Language and Probabilistic Reasoning
- Do LLMs Build World Representations? Probing Through the Lens of State Abstraction
- Do LLMs dream of elephants (when told not to)? Latent concept association and associative memory in transformers
- Domain Adaptation for Large-Vocabulary Object Detectors
- DomainGallery: Few-shot Domain-driven Image Generation by Attribute-centric Finetuning
- Do Multimodal Foundation Models Understand Enterprise Workflows? A Benchmark for Business Process Management Tasks
- Don't Compress Gradients in Random Reshuffling: Compress Gradient Differences
- Don't Look Twice: Faster Video Transformers with Run-Length Tokenization
- Doob's Lagrangian: A Sample-Efficient Variational Approach to Transition Path Sampling
- DOPPLER: Differentially Private Optimizers with Low-pass Filter for Privacy Noise Reduction
- Do's and Don'ts: Learning Desirable Skills with Instruction Videos
- Double-Ended Synthesis Planning with Goal-Constrained Bidirectional Search
- Doubly Hierarchical Geometric Representations for Strand-based Human Hairstyle Generation
- Doubly Mild Generalization for Offline Reinforcement Learning
- DPIC: Decoupling Prompt and Intrinsic Characteristics for LLM Generated Text Detection
- DRACO: A Denoising-Reconstruction Autoencoder for Cryo-EM
- Drago: Primal-Dual Coupled Variance Reduction for Faster Distributionally Robust Optimization
- DreamCatcher: A Wearer-aware Multi-modal Sleep Event Dataset Based on Earables in Non-restrictive Environments
- DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation
- DreamMesh4D: Video-to-4D Generation with Sparse-Controlled Gaussian-Mesh Hybrid Representation
- DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos
- DreamSteerer: Enhancing Source Image Conditioned Editability using Personalized Diffusion Models
- Drift-Resilient TabPFN: In-Context Learning Temporal Distribution Shifts on Tabular Data
- DRIP: Unleashing Diffusion Priors for Joint Foreground and Alpha Prediction in Image Matting
- DrivAerNet++: A Large-Scale Multimodal Car Dataset with Computational Fluid Dynamics Simulations and Deep Learning Benchmarks
- DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model
- Drones Help Drones: A Collaborative Framework for Multi-Drone Object Trajectory Prediction and Beyond
- DropBP: Accelerating Fine-Tuning of Large Language Models by Dropping Backward Propagation
- DropEdge not Foolproof: Effective Augmentation Method for Signed Graph Neural Networks
- DTGB: A Comprehensive Benchmark for Dynamic Text-Attributed Graphs
- Dual Cone Gradient Descent for Training Physics-Informed Neural Networks
- Dual Critic Reinforcement Learning under Partial Observability
- Dual Defense: Enhancing Privacy and Mitigating Poisoning Attacks in Federated Learning
- Dual-Diffusion for Binocular 3D Human Pose Estimation
- Dual Encoder GAN Inversion for High-Fidelity 3D Head Reconstruction from Single Images
- Dual-frame Fluid Motion Estimation with Test-time Optimization and Zero-divergence Loss
- Dual Lagrangian Learning for Conic Optimization
- Dual-Personalizing Adapter for Federated Foundation Models
- Dual-Perspective Activation: Efficient Channel Denoising via Joint Forward-Backward Criterion for Artificial Neural Networks
- Dual Prototype Evolving for Test-Time Generalization of Vision-Language Models
- Dual Risk Minimization: Towards Next-Level Robustness in Fine-tuning Zero-Shot Models
- Dueling over Dessert, Mastering the Art of Repeated Cake Cutting
- Du-IN: Discrete units-guided mask modeling for decoding speech from Intracranial Neural signals
- DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs
- DU-Shapley: A Shapley Value Proxy for Efficient Dataset Valuation
- Dynamic 3D Gaussian Fields for Urban Areas
- Dynamic Conditional Optimal Transport through Simulation-Free Flows
- Dynamic Model Predictive Shielding for Provably Safe Reinforcement Learning
- Dynamic Neural Regeneration: Enhancing Deep Learning Generalization on Small Datasets
- Dynamic Rescaling for Training GNNs
- Dynamic Service Fee Pricing under Strategic Behavior: Actions as Instruments and Phase Transition
- Dynamics of Supervised and Reinforcement Learning in the Non-Linear Perceptron
- Dynamic Sparsity in Machine Learning: Routing Information through Neural Pathways
- Dynamic Subgroup Identification in Covariate-adjusted Response-adaptive Randomization Experiments
- Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation
- DynaMITE-RL: A Dynamic Model for Improved Temporal Meta-Reinforcement Learning
- DynaMo: In-Domain Dynamics Pretraining for Visuo-Motor Control
- E2E-MFD: Towards End-to-End Synchronous Multimodal Fusion Detection
- E2ENet: Dynamic Sparse Feature Fusion for Accurate and Efficient 3D Medical Image Segmentation
- EAGLE: Efficient Adaptive Geometry-based Learning in Cross-view Understanding
- EAI: Emotional Decision-Making of LLMs in Strategic Games and Ethical Dilemmas
- EASI: Evolutionary Adversarial Simulator Identification for Sim-to-Real Transfer
- Easy2Hard-Bench: Standardized Difficulty Labels for Profiling LLM Performance and Generalization
- Easy Regional Contrastive Learning of Expressive Fashion Representations
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision
- ECLipsE: Efficient Compositional Lipschitz Constant Estimation for Deep Neural Networks
- ECMamba: Consolidating Selective State Space Model with Retinex Guidance for Efficient Multiple Exposure Correction
- e-COP: Episodic Constrained Optimization of Policies
- Edit Distance Robust Watermarks via Indexing Pseudorandom Codes
- EDT: An Efficient Diffusion Transformer Framework Inspired by Human-like Sketching
- EEG2Video: Towards Decoding Dynamic Visual Perception from EEG Signals
- EEGPT: Pretrained Transformer for Universal and Reliable Representation of EEG Signals
- EEVR: A Virtual Reality-Based Emotion Dataset Featuring Paired Physiological Signals and Textual Descriptions
- Effective Exploration Based on the Structural Information Principles
- Effective Rank Analysis and Regularization for Enhanced 3D Gaussian Splatting
- EffiBench: Benchmarking the Efficiency of Automatically Generated Code
- Efficiency for Free: Ideal Data Are Transportable Representations
- Efficiency of the First-Price Auction in the Autobidding World
- Efficient $\Phi$-Regret Minimization with Low-Degree Swap Deviations in Extensive-Form Games
- Efficient Adaptation of Pre-trained Vision Transformer via Householder Transformation
- Efficient Adversarial Training in LLMs with Continuous Attacks
- Efficient and Private Marginal Reconstruction with Local Non-Negativity
- Efficient and Sharp Off-Policy Evaluation in Robust Markov Decision Processes
- Efficient Availability Attacks against Supervised and Contrastive Learning Simultaneously
- EfficientCAPER: An End-to-End Framework for Fast and Robust Category-Level Articulated Object Pose Estimation
- Efficient Centroid-Linkage Clustering
- Efficient Combinatorial Optimization via Heat Diffusion
- Efficient Contextual LLM Cascades through Budget-Constrained Policy Learning
- Efficient Convex Algorithms for Universal Kernel Learning
- Efficient Discrepancy Testing for Learning with Distribution Shift
- Efficient Federated Learning against Heterogeneous and Non-stationary Client Unavailability
- Efficient Graph Matching for Correlated Stochastic Block Models
- Efficient Large Multi-modal Models via Visual Context Compression
- Efficient Leverage Score Sampling for Tensor Train Decomposition
- Efficient Lifelong Model Evaluation in an Era of Rapid Progress
- Efficient LLM Jailbreak via Adaptive Dense-to-sparse Constrained Optimization
- Efficient LLM Scheduling by Learning to Rank
- Efficiently Learning Significant Fourier Feature Pairs for Statistical Independence Testing
- Efficient Minimum Bayes Risk Decoding using Low-Rank Matrix Completion Algorithms
- Efficient multi-prompt evaluation of LLMs
- Efficient Multi-task LLM Quantization and Serving for Multiple LoRA Adapters
- Efficient Multi-task Reinforcement Learning with Cross-Task Policy Guidance
- Efficient Policy Evaluation Across Multiple Different Experimental Datasets
- Efficient Prompt Optimization Through the Lens of Best Arm Identification
- Efficient Recurrent Off-Policy RL Requires a Context-Encoder-Specific Learning Rate
- Efficient Reinforcement Learning by Discovering Neural Pathways
- Efficient Sign-Based Optimization: Accelerating Convergence via Variance Reduction
- Efficient Sketches for Training Data Attribution and Studying the Loss Landscape
- Efficient Streaming Algorithms for Graphlet Sampling
- Efficient Temporal Action Segmentation via Boundary-aware Query Voting
- EffiLearner: Enhancing Efficiency of Generated Code via Self-Optimization
- EgoChoir: Capturing 3D Human-Object Interaction Regions from Egocentric Views
- EGODE: An Event-attended Graph ODE Framework for Modeling Rigid Dynamics
- EGonc: Energy-based Open-Set Node Classification with substitute Unknowns
- EgoSim: An Egocentric Multi-view Simulator for Body-worn Cameras during Human Motion
- EGSST: Event-based Graph Spatiotemporal Sensitive Transformer for Object Detection
- EHRCon: Dataset for Checking Consistency between Unstructured Notes and Structured Tables in Electronic Health Records
- EHRNoteQA: An LLM Benchmark for Real-World Clinical Practice Using Discharge Summaries
- EigenVI: score-based variational inference with orthogonal function expansions
- einspace: Searching for Neural Architectures from Fundamental Operations
- Einsum Benchmark: Enabling the Development of Next-Generation Tensor Execution Engines
- ElasTST: Towards Robust Varied-Horizon Forecasting with Elastic Time-Series Transformer
- Elliptical Attention
- Elo Uncovered: Robustness and Best Practices in Language Model Evaluation
- ELSA: Exploiting Layer-wise N:M Sparsity for Vision Transformer Acceleration
- Elucidating the Design Space of Dataset Condensation
- Embedding-Aligned Language Models
- Embedding Dimension of Contrastive Learning and $k$-Nearest Neighbors
- Embedding Trajectory for Out-of-Distribution Detection in Mathematical Reasoning
- Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making
- EM Distillation for One-step Diffusion Models
- Emergence of heavy tails in homogenized stochastic gradient descent
- Emergence of Hidden Capabilities: Exploring Learning Dynamics in Concept Space
- emg2pose: A Large and Diverse Benchmark for Surface Electromyographic Hand Pose Estimation
- emg2qwerty: A Large Dataset with Baselines for Touch Typing using Surface Electromyography
- E-Motion: Future Motion Simulation via Event Sequence Diffusion
- Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning
- Empowering Active Learning for 3D Molecular Graphs with Geometric Graph Isomorphism
- Empowering and Assessing the Utility of Large Language Models in Crop Science
- Empowering Visible-Infrared Person Re-Identification with Large Foundation Models
- EMR-Merging: Tuning-Free High-Performance Model Merging
- EMVP: Embracing Visual Foundation Model for Visual Place Recognition with Centroid-Free Probing
- Enabling Adaptive Agent Training in Open-Ended Simulators by Targeting Diversity
- ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis
- End-To-End Causal Effect Estimation from Unstructured Natural Language Data
- End-to-end Learnable Clustering for Intent Learning in Recommendation
- End-to-End Ontology Learning with Large Language Models
- End-to-End Video Semantic Segmentation in Adverse Weather using Fusion Blocks and Temporal-Spatial Teacher-Student Learning
- Energy-based Epistemic Uncertainty for Graph Neural Networks
- Energy-based Hopfield Boosting for Out-of-Distribution Detection
- Energy-Based Modelling for Discrete and Mixed Data via Heat Equations on Structured Spaces
- Energy-Guided Continuous Entropic Barycenter Estimation for General Costs
- Enhancing Chess Reinforcement Learning with Graph Representation
- Enhancing Consistency-Based Image Generation via Adversarialy-Trained Classification and Energy-Based Discrimination
- Enhancing Diversity in Bayesian Deep Learning via Hyperspherical Energy Minimization of CKA
- Enhancing Domain Adaptation through Prompt Gradient Alignment
- Enhancing Efficiency of Safe Reinforcement Learning via Sample Manipulation
- Enhancing Feature Diversity Boosts Channel-Adaptive Vision Transformers
- Enhancing Graph Transformers with Hierarchical Distance Structural Encoding
- Enhancing In-Context Learning Performance with just SVD-Based Weight Pruning: A Theoretical Perspective
- Enhancing Large Language Models through Adaptive Tokenizers
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension
- Enhancing LLM Reasoning via Vision-Augmented Prompting
- Enhancing LLM’s Cognition via Structurization
- Enhancing Motion in Text-to-Video Generation with Decomposed Encoding and Conditioning
- Enhancing Multiple Dimensions of Trustworthiness in LLMs via Sparse Activation Control
- Enhancing Preference-based Linear Bandits via Human Response Time
- Enhancing Protein Mutation Effect Prediction through a Retrieval-Augmented Framework
- Enhancing Reasoning Capabilities of LLMs via Principled Synthetic Logic Corpus
- Enhancing Robustness in Deep Reinforcement Learning: A Lyapunov Exponent Approach
- Enhancing Robustness of Graph Neural Networks on Social Media with Explainable Inverse Reinforcement Learning
- Enhancing Robustness of Last Layer Two-Stage Fair Model Corrections
- Enhancing Semi-Supervised Learning via Representative and Diverse Sample Selection
- Enhancing vision-language models for medical imaging: bridging the 3D gap with innovative slice selection
- Enhancing Zero-Shot Vision Models by Label-Free Prompt Distribution Learning and Bias Correcting
- EnOF-SNN: Training Accurate Spiking Neural Networks via Enhancing the Output Feature
- Enriching Disentanglement: From Logical Definitions to Quantitative Metrics
- Ensemble Learning for Heterogeneous Large Language Models with Deep Parallel Collaboration
- Ensemble sampling for linear bandits: small ensembles suffice
- EnsIR: An Ensemble Algorithm for Image Restoration via Gaussian Mixture Models
- Entity Alignment with Noisy Annotations from Large Language Models
- Entropy-regularized Diffusion Policy with Q-Ensembles for Offline Reinforcement Learning
- Entropy testing and its application to testing Bayesian networks
- Entrywise error bounds for low-rank approximations of kernel matrices
- EpiCare: A Reinforcement Learning Benchmark for Dynamic Treatment Regimes
- EPIC: Effective Prompting for Imbalanced-Class Data Synthesis in Tabular Data Classification via Large Language Models
- Epipolar-Free 3D Gaussian Splatting for Generalizable Novel View Synthesis
- Episodic Future Thinking Mechanism for Multi-agent Reinforcement Learning
- Equivariant Blurring Diffusion for Hierarchical Molecular Conformer Generation
- Equivariant Machine Learning on Graphs with Nonlinear Spectral Filters
- Equivariant Neural Diffusion for Molecule Generation
- Equivariant spatio-hemispherical networks for diffusion MRI deconvolution
- Era3D: High-Resolution Multiview Diffusion using Efficient Row-wise Attention
- Erasing Undesirable Concepts in Diffusion Models with Adversarial Preservation
- ERBench: An Entity-Relationship based Automatically Verifiable Hallucination Benchmark for Large Language Models
- Error Analysis of Spherically Constrained Least Squares Reformulation in Solving the Stackelberg Prediction Game
- Error Correction Output Codes for Robust Neural Networks against Weight-errors: A Neural Tangent Kernel Point of View
- ESPACE: Dimensionality Reduction of Activations for Model Compression
- Estimating Ego-Body Pose from Doubly Sparse Egocentric Video Data
- Estimating Epistemic and Aleatoric Uncertainty with a Single Model
- Estimating Generalization Performance Along the Trajectory of Proximal SGD in Robust Regression
- Estimating Heterogeneous Treatment Effects by Combining Weak Instruments and Observational Data
- Estimating the Hallucination Rate of Generative AI
- E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding
- ET-Flow: Equivariant Flow-Matching for Molecular Conformer Generation
- ETO: Efficient Transformer-based Local Feature Matching by Organizing Multiple Homography Hypotheses
- Euclidean distance compression via deep random features
- Evaluate calibration of language models with folktexts
- Evaluate then Cooperate: Shapley-based View Cooperation Enhancement for Multi-view Clustering
- Evaluating alignment between humans and neural network representations in image-based learning tasks
- Evaluating Copyright Takedown Methods for Language Models
- Evaluating Evaluations: Examining Best Practices for Measuring Broader Impacts of Generative AI
- Evaluating Large Language Models - Principles, Approaches, and Applications
- Evaluating Large Vision-and-Language Models on Children's Mathematical Olympiads
- Evaluating Numerical Reasoning in Text-to-Image Models
- Evaluating the design space of diffusion-based generative models
- Evaluating the World Model Implicit in a Generative Model
- Evaluation of Text-to-Video Generation Models: A Dynamics Perspective
- Even Sparser Graph Transformers
- Event-3DGS: Event-based 3D Reconstruction Using 3D Gaussian Splatting
- Everyday Object Meets Vision-and-Language Navigation Agent via Backdoor
- Evidence of Learned Look-Ahead in a Chess-Playing Neural Network
- Evidential Mixture Machines: Deciphering Multi-Label Correlations for Active Learning Sensitivity
- Evidential Stochastic Differential Equations for Time-Aware Sequential Recommendation
- EvoCodeBench: An Evolving Code Generation Benchmark with Domain-Specific Evaluations
- EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models
- Exact Gradients for Stochastic Spiking Neural Networks Driven by Rough Signals
- Exactly Minimax-Optimal Locally Differentially Private Sampling
- Exact, Tractable Gauss-Newton Optimization in Deep Reversible Architectures Reveal Poor Generalization
- Excluding the Irrelevant: Focusing Reinforcement Learning through Continuous Action Masking
- Exclusively Penalized Q-learning for Offline Reinforcement Learning
- Exocentric-to-Egocentric Video Generation
- Exogenous Matching: Learning Good Proposals for Tractable Counterfactual Estimation
- Expanding Sparse Tuning for Low Memory Usage
- Expectation Alignment: Handling Reward Misspecification in the Presence of Expectation Mismatch
- Expected Probabilistic Hierarchies
- Expectile Regularization for Fast and Accurate Training of Neural Optimal Transport
- Expecting The Unexpected: Towards Broad Out-Of-Distribution Detection
- Experimental Design and Analysis for AI Researchers
- Expert-level protocol translation for self-driving labs
- Explaining Datasets in Words: Statistical Models with Natural Language Parameters
- 'Explaining RL Decisions with Trajectories': A Reproducibility Study
- Explanations that reveal all through the definition of encoding
- Explicit Eigenvalue Regularization Improves Sharpness-Aware Minimization
- Exploitation of a Latent Mechanism in Graph Contrastive Learning: Representation Scattering
- Exploiting Activation Sparsity with Dense to Dynamic-k Mixture-of-Experts Conversion
- Exploiting Descriptive Completeness Prior for Cross Modal Hashing with Incomplete Labels
- Exploiting LLM Quantization
- Exploiting Representation Curvature for Boundary Detection in Time Series
- Exploiting the Replay Memory Before Exploring the Environment: Enhancing Reinforcement Learning Through Empirical MDP Iteration
- Exploration by Learning Diverse Skills through Successor State Representations
- Exploration, Exploitation, and Engagement in Multi-Armed Bandits with Abandonment
- Exploratory Retrieval-Augmented Planning For Continual Embodied Instruction Following
- Exploring Adversarial Robustness of Deep State Space Models
- Exploring and Exploiting the Asymmetric Valley of Deep Neural Networks
- Exploring Behavior-Relevant and Disentangled Neural Dynamics with Generative Diffusion Models
- Exploring Consistency in Graph Representations: from Graph Kernels to Graph Neural Networks
- Exploring Context Window of Large Language Models via Decomposed Positional Vectors
- Exploring DCN-like architecture for fast image generation with arbitrary resolution
- Exploring Fixed Point in Image Editing: Theoretical Support and Convergence Optimization
- Exploring Jacobian Inexactness in Second-Order Methods for Variational Inequalities: Lower Bounds, Optimal Algorithms and Quasi-Newton Approximations
- Exploring Low-Dimensional Subspace in Diffusion Models for Controllable Image Editing
- Exploring Molecular Pretraining Model at Scale
- Exploring Structured Semantic Priors Underlying Diffusion Score for Test-time Adaptation
- Exploring the Edges of Latent State Clusters for Goal-Conditioned Reinforcement Learning
- Exploring the Precise Dynamics of Single-Layer GAN Models: Leveraging Multi-Feature Discriminators for High-Dimensional Subspace Learning
- Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models
- Exploring the trade-off between deep-learning and explainable models for brain-machine interfaces
- Exploring Token Pruning in Vision State Space Models
- eXponential FAmily Dynamical Systems (XFADS): Large-scale nonlinear Gaussian state-space modeling
- Exponential Quantum Communication Advantage in Distributed Inference and Learning
- Expressive Gaussian Human Avatars from Monocular RGB Video
- Extending Multi-modal Contrastive Representations
- Extending Video Masked Autoencoders to 128 frames
- Extensive-Form Game Solving via Blackwell Approachability on Treeplexes
- Externally Valid Policy Evaluation from Randomized Trials Using Additional Observational Data
- Extracting Training Data from Molecular Pre-trained Models
- Ex Uno Pluria: Insights on Ensembling in Low Precision Number Systems
- Eye-gaze Guided Multi-modal Alignment for Medical Representation Learning
- EyeGraph: Modularity-aware Spatio Temporal Graph Clustering for Continuous Event-based Eye Tracking
- EZ-HOI: VLM Adaptation via Guided Prompt Learning for Zero-Shot HOI Detection
- Face2QR: A Unified Framework for Aesthetic, Face-Preserving, and Scannable QR Code Generation
- Facilitating Multimodal Classification via Dynamically Learning Modality Gap
- FACT or Fiction: Can Truthful Mechanisms Eliminate Federated Free Riding?
- Factorized Diffusion Architectures for Unsupervised Image Generation and Segmentation
- FactorizePhys: Matrix Factorization for Multidimensional Attention in Remote Physiological Sensing
- FactorSim: Generative Simulation via Factorized Representation
- Fair Allocation in Dynamic Mechanism Design
- Fair and Welfare-Efficient Constrained Multi-Matchings under Uncertainty
- Fair Bilevel Neural Network (FairBiNN): On Balancing fairness and accuracy via Stackelberg Equilibrium
- Fair GLASSO: Estimating Fair Graphical Models with Unbiased Statistical Behavior
- FairJob: A Real-World Dataset for Fairness in Online Systems
- Fair Kernel K-Means: from Single Kernel to Multiple Kernel
- FairMedFM: Fairness Benchmarking for Medical Imaging Foundation Models
- Fairness and Efficiency in Online Class Matching
- Fairness-Aware Estimation of Graphical Models
- Fairness-Aware Meta-Learning via Nash Bargaining
- Fairness in Social Influence Maximization via Optimal Transport
- Fairness without Harm: An Influence-Guided Active Sampling Approach
- Fair Online Bilateral Trade
- FairQueue: Rethinking Prompt Learning for Fair Text-to-Image Generation
- Fair Secretaries with Unfair Predictions
- Fair Wasserstein Coresets
- FairWire: Fair Graph Generation
- FashionR2R: Texture-preserving Rendered-to-Real Image Translation with Diffusion Models
- FasMe: Fast and Sample-efficient Meta Estimator for Precision Matrix Learning in Small Sample Settings
- FAST: A Dual-tier Few-Shot Learning Paradigm for Whole Slide Image Classification
- Fast and Memory-Efficient Video Diffusion Using Streamlined Inference
- Fast Best-of-N Decoding via Speculative Rejection
- Fast Channel Simulation via Error-Correcting Codes
- FastDrag: Manipulate Anything in One Step
- Fast Encoder-Based 3D from Casual Videos via Point Track Processing
- Faster Accelerated First-order Methods for Convex Optimization with Strongly Convex Function Constraints
- Faster Algorithms for User-Level Private Stochastic Convex Optimization
- Faster Differentially Private Top-$k$ Selection: A Joint Exponential Mechanism with Pruning
- Faster Diffusion: Rethinking the Role of the Encoder for Diffusion Model Inference
- FasterDiT: Towards Faster Diffusion Transformers Training without Architecture Modification
- Faster Local Solvers for Graph Diffusion Equations
- Faster Neighborhood Attention: Reducing the O(n^2) Cost of Self Attention at the Threadblock Level
- Faster Repeated Evasion Attacks in Tree Ensembles
- Fast Graph Sharpness-Aware Minimization for Enhancing and Accelerating Few-Shot Node Classification
- Fast Iterative Hard Thresholding Methods with Pruning Gradient Computations
- Fast Last-Iterate Convergence of Learning in Games Requires Forgetful Algorithms
- FASTopic: Pretrained Transformer is a Fast, Adaptive, Stable, and Transferable Topic Model
- Fast Proxy Experiment Design for Causal Effect Identification
- Fast Rates for Bandit PAC Multiclass Classification
- Fast Rates in Stochastic Online Convex Optimization by Exploiting the Curvature of Feasible Sets
- Fast samplers for Inverse Problems in Iterative Refinement models
- Fast Sampling via Discrete Non-Markov Diffusion Models with Predetermined Transition Time
- FastSurvival: Hidden Computational Blessings in Training Cox Proportional Hazards Models
- Fast T2T: Optimization Consistency Speeds Up Diffusion-Based Training-to-Testing Solving for Combinatorial Optimization
- Fast TRAC: A Parameter-Free Optimizer for Lifelong Reinforcement Learning
- Fast Tree-Field Integrators: From Low Displacement Rank to Topological Transformers
- Fast yet Safe: Early-Exiting with Risk Control
- Fearless Stochasticity in Expectation Propagation
- Feature-Level Adversarial Attacks and Ranking Disruption for Visible-Infrared Person Re-identification
- FedAvP: Augment Local Data via Shared Policy in Federated Learning
- Federated Behavioural Planes: Explaining the Evolution of Client Behaviour in Federated Learning
- Federated Black-Box Adaptation for Semantic Segmentation
- Federated Ensemble-Directed Offline Reinforcement Learning
- Federated Fine-tuning of Large Language Models under Heterogeneous Tasks and Client Resources
- Federated Graph Learning for Cross-Domain Recommendation
- Federated Learning from Vision-Language Foundation Models: Theoretical Analysis and Method
- Federated Learning over Connected Modes
- Federated Learning under Periodic Client Participation and Heterogeneous Data: A New Communication-Efficient Algorithm and Analysis
- Federated Model Heterogeneous Matryoshka Representation Learning
- Federated Natural Policy Gradient and Actor Critic Methods for Multi-task Reinforcement Learning
- Federated Online Prediction from Experts with Differential Privacy: Separations and Regret Speed-ups
- Federated Transformer: Multi-Party Vertical Federated Learning on Practical Fuzzily Linked Data
- FedGMark: Certifiably Robust Watermarking for Federated Graph Learning
- FedGMKD: An Efficient Prototype Federated Learning Framework through Knowledge Distillation and Discrepancy-Aware Aggregation
- FedGTST: Boosting Global Transferability of Federated Models via Statistics Tuning
- FedLLM-Bench: Realistic Benchmarks for Federated Learning of Large Language Models
- FedLPA: One-shot Federated Learning with Layer-Wise Posterior Aggregation
- FEDMEKI: A Benchmark for Scaling Medical Foundation Models via Federated Knowledge Injection
- FedNE: Surrogate-Assisted Federated Neighbor Embedding for Dimensionality Reduction
- FedSSP: Federated Graph Learning with Spectral Knowledge and Personalized Preference
- Feedback control guides credit assignment in recurrent neural networks
- FEEL-SNN: Robust Spiking Neural Networks with Frequency Encoding and Evolutionary Leak Factor
- Feint Behaviors and Strategies: Formalization, Implementation and Evaluation
- FERERO: A Flexible Framework for Preference-Guided Multi-Objective Learning
- Ferrari: Federated Feature Unlearning via Optimizing Feature Sensitivity
- Fetch and Forge: Efficient Dataset Condensation for Object Detection
- Few-Shot Adversarial Prompt Learning on Vision-Language Models
- Few-shot Algorithms for Consistent Neural Decoding (FALCON) Benchmark
- Few-Shot Diffusion Models Escape the Curse of Dimensionality
- Few-Shot Task Learning through Inverse Generative Modeling
- FewViewGS: Gaussian Splatting with Few View Matching and Multi-stage Training
- FFAM: Feature Factorization Activation Map for Explanation of 3D Detectors
- FIARSE: Model-Heterogeneous Federated Learning via Importance-Aware Submodel Extraction
- FIDE: Frequency-Inflated Conditional Diffusion Model for Extreme-Aware Time Series Generation
- FIFO-Diffusion: Generating Infinite Videos from Text without Training
- Fight Back Against Jailbreaking via Prompt Adversarial Tuning
- FilterNet: Harnessing Frequency Filters for Time Series Forecasting
- FINALLY: fast and universal speech enhancement with studio-like quality
- FinBen: An Holistic Financial Benchmark for Large Language Models
- FinCon: A Synthesized LLM Multi-Agent System with Conceptual Verbal Reinforcement for Enhanced Financial Decision Making
- FindingEmo: An Image Dataset for Emotion Recognition in the Wild
- Finding good policies in average-reward Markov Decision Processes without prior knowledge
- Finding NeMo: Localizing Neurons Responsible For Memorization in Diffusion Models
- Finding Transformer Circuits With Edge Pruning
- FineCLIP: Self-distilled Region-based CLIP for Better Fine-grained Understanding
- Fine-grained Analysis of In-context Linear Estimation: Data, Architecture, and Beyond
- Fine-grained Control of Generative Data Augmentation in IoT Sensing
- Fine-Grained Dynamic Framework for Bias-Variance Joint Optimization on Data Missing Not at Random
- Fine-grained Image-to-LiDAR Contrastive Distillation with Visual Foundation Models
- FineStyle: Fine-grained Controllable Style Personalization for Text-to-image Models
- Fine-Tuning in Modern Machine Learning: Principles and Scalability
- Fine-Tuning is Fine, if Calibrated
- Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning
- Fine Tuning Out-of-Vocabulary Item Recommendation with User Sequence Imagination
- Fine-Tuning Personalization in Federated Learning to Mitigate Adversarial Clients
- FIRE: A Dataset for Feedback Integration and Refinement Evaluation of Multimodal Models
- First-Explore, then Exploit: Meta-Learning to Solve Hard Exploration-Exploitation Trade-Offs
- First-Order Methods for Linearly Constrained Bilevel Optimization
- First-Order Minimax Bilevel Optimization
- Fisher Flow Matching for Generative Modeling over Discrete Data
- Fit for our purpose, not yours: Benchmark for a low-resource, Indigenous language
- FiVA: Fine-grained Visual Attribute Dataset for Text-to-Image Diffusion Models
- Fixed Confidence Best Arm Identification in the Bayesian Setting
- Fixed points of nonnegative neural networks
- (FL)$^2$: Overcoming Few Labels in Federated Semi-Supervised Learning
- FLAME: Factuality-Aware Alignment for Large Language Models
- FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision
- Flatten Anything: Unsupervised Neural Surface Parameterization
- Flaws can be Applause: Unleashing Potential of Segmenting Ambiguous Objects in SAM
- FlexCap: Describe Anything in Images in Controllable Detail
- Flexible Context-Driven Sensory Processing in Dynamical Vision Models
- Flexible mapping of abstract domains by grid cells via self-supervised extraction and projection of generalized velocity signals
- Flexible task abstractions emerge in linear networks with fast and bounded units
- Flex-MoE: Modeling Arbitrary Modality Combination via the Flexible Mixture-of-Experts
- FlexMol: A Flexible Toolkit for Benchmarking Molecular Relational Learning
- FlexPlanner: Flexible 3D Floorplanning via Deep Reinforcement Learning in Hybrid Action Space with Multi-Modality Representation
- FlexSBDD: Structure-Based Drug Design with Flexible Protein Modeling
- Flipped Classroom: Aligning Teacher Attention with Student in Generalized Category Discovery
- Flipping-based Policy for Chance-Constrained Markov Decision Processes
- FLoRA: Federated Fine-Tuning Large Language Models with Heterogeneous Low-Rank Adaptations
- FlowLLM: Flow Matching for Material Generation with Large Language Models as Base Distributions
- Flow Matching for Generative Modeling
- Flow Priors for Linear Inverse Problems via Iterative Corrupted Trajectory Matching
- Flow Snapshot Neurons in Action: Deep Neural Networks Generalize to Biological Motion Perception
- FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner
- FM-Delta: Lossless Compression for Storing Massive Fine-tuned Foundation Models
- fMRI predictors based on language models of increasing complexity recover brain left lateralization
- FNP: Fourier Neural Processes for Arbitrary-Resolution Data Assimilation
- F-OAL: Forward-only Online Analytic Learning with Fast Training and Low Memory Footprint in Class Incremental Learning
- Focus On What Matters: Separated Models For Visual-Based RL Generalization
- FOOGD: Federated Collaboration for Both Out-of-distribution Generalization and Detection
- Forgetting, Ignorance or Myopia: Revisiting Key Challenges in Online Continual Learning
- Foundation Inference Models for Markov Jump Processes
- Foundation Model Interventions
- Foundation Models for Science: Progress, Opportunities, and Challenges
- Foundations of Multivariate Distributional Reinforcement Learning
- Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding
- FouRA: Fourier Low-Rank Adaptation
- Fourier Amplitude and Correlation Loss: Beyond Using L2 Loss for Skillful Precipitation Nowcasting
- Fourier-enhanced Implicit Neural Fusion Network for Multispectral and Hyperspectral Image Fusion
- Fourier Neural Operator with Learned Deformations for PDEs on General Geometries
- Fractal Patterns May Illuminate the Success of Next-Token Prediction
- FreeLong: Training-Free Long Video Generation with SpectralBlend Temporal Attention
- Free Lunch in Pathology Foundation Model: Task-specific Model Adaptation with Concept-Guided Feature Enhancement
- Free-Rider and Conflict Aware Collaboration Formation for Cross-Silo Federated Learning
- FreeSplat: Generalizable 3D Gaussian Splatting Towards Free View Synthesis of Indoor Scenes
- FreqBlender: Enhancing DeepFake Detection by Blending Frequency Knowledge
- FreqMark: Invisible Image Watermarking via Frequency Based Optimization in Latent Space
- Frequency Adaptive Normalization For Non-stationary Time Series Forecasting
- Frequency-aware Generative Models for Multivariate Time Series Imputation
- Freya PAGE: First Optimal Time Complexity for Large-Scale Nonconvex Finite-Sum Optimization with Heterogeneous Asynchronous Computations
- Frieren: Efficient Video-to-Audio Generation Network with Rectified Flow Matching
- From an Image to a Scene: Learning to Imagine the World from a Million 360° Videos
- From Biased to Unbiased Dynamics: An Infinitesimal Generator Approach
- From Causal to Concept-Based Representation Learning
- From Chaos to Clarity: 3DGS in the Dark
- From Dictionary to Tensor: A Scalable Multi-View Subspace Clustering Framework with Triple Information Enhancement
- From Diffusion Models to Schrödinger Bridges
- From Instance Training to Instruction Learning: Task Adapters Generation from Instructions
- From Linear to Linearizable Optimization: A Novel Framework with Applications to Stationary and Non-stationary DR-submodular Optimization
- From News to Forecast: Integrating Event Analysis in LLM-Based Time Series Forecasting with Reflection
- From Seeing to Doing: Ascending the Ladder of Visual Intelligence
- From Similarity to Superiority: Channel Clustering for Time Series Forecasting
- From Text to Trajectory: Exploring Complex Constraint Representation and Decomposition in Safe Reinforcement Learning
- From Transparent to Opaque: Rethinking Neural Implicit Surfaces with $\alpha$-NeuS
- From Trojan Horses to Castle Walls: Unveiling Bilateral Data Poisoning Effects in Diffusion Models
- From Unstructured Data to In-Context Learning: Exploring What Tasks Can Be Learned and When
- Frozen-DETR: Enhancing DETR with Image Understanding from Frozen Foundation Models
- Frustratingly Easy Test-Time Adaptation of Vision-Language Models
- FSP-Laplace: Function-Space Priors for the Laplace Approximation in Bayesian Deep Learning
- FT-AED: Benchmark Dataset for Early Freeway Traffic Anomalous Event Detection
- FUGAL: Feature-fortified Unrestricted Graph Alignment
- FUG: Feature-Universal Graph Contrastive Pre-training for Graphs with Diverse Node Features
- Full-Atom Peptide Design with Geometric Latent Diffusion
- Full-Distance Evasion of Pedestrian Detectors in the Physical World
- Fully Explicit Dynamic Gaussian Splatting
- Fully Unconstrained Online Learning
- Functional Bilevel Optimization for Machine Learning
- Functional Gradient Flows for Constrained Sampling
- Functionally Constrained Algorithm Solves Convex Simple Bilevel Problem
- Fundamental Convergence Analysis of Sharpness-Aware Minimization
- Fundamental Limits of Prompt Compression: A Rate-Distortion Framework for Black-Box Language Models
- FuseAnyPart: Diffusion-Driven Facial Parts Swapping via Multiple Reference Images
- FUSE: Fast Unified Simulation and Estimation for PDEs
- FuseFL: One-Shot Federated Learning through the Lens of Causality with Progressive Model Fusion
- FuseMoE: Mixture-of-Experts Transformers for Fleximodal Fusion
- FUSU: A Multi-temporal-source Land Use Change Segmentation Dataset for Fine-grained Urban Semantic Understanding
- FVEL: Interactive Formal Verification Environment with Large Language Models via Theorem Proving
- G2D: From Global to Dense Radiography Representation Learning via Vision-Language Pre-training
- G3: An Effective and Adaptive Framework for Worldwide Geolocalization Using Large Multi-Modality Models
- GACL: Exemplar-Free Generalized Analytic Continual Learning
- GAIA: Rethinking Action Quality Assessment for AI-Generated Videos
- GAMap: Zero-Shot Object Goal Navigation with Multi-Scale Geometric-Affordance Guidance
- Game-Traversal-Benchmark: Evaluating Planning Abilities Of Large Language Models Via Traversing 2D Game Maps
- GarmentLab: A Unified Simulation and Benchmark for Garment Manipulation
- Gated Inference Network: Inference and Learning State-Space Models
- Gated Slot Attention for Efficient Linear-Time Sequence Modeling
- Gaussian Approximation and Multiplier Bootstrap for Polyak-Ruppert Averaged Linear Stochastic Approximation with Applications to TD Learning
- GaussianCube: A Structured and Explicit Radiance Representation for 3D Generative Modeling
- GaussianCut: Interactive segmentation via graph cut for 3D Gaussian Splatting
- Gaussian Graph Network: Learning Efficient and Generalizable Gaussian Representations from Multi-view Images
- GaussianMarker: Uncertainty-Aware Copyright Protection of 3D Gaussian Splatting
- Gaussian Process Bandits for Top-k Recommendations
- GAVEL: Generating Games via Evolution and Language Models
- GC-Bench: An Open and Unified Benchmark for Graph Condensation
- GDeR: Safeguarding Efficiency, Balancing, and Robustness via Prototypical Graph Pruning
- GenAI Arena: An Open Evaluation Platform for Generative Models
- GenAI for Health: Potential, Trust and Policy Compliance
- GenArtist: Multimodal LLM as an Agent for Unified Image Generation and Editing
- Gene-Gene Relationship Modeling Based on Genetic Evidence for Single-Cell RNA-Seq Data Imputation
- General Articulated Objects Manipulation in Real Images via Part-Aware Diffusion Process
- General bounds on the quality of Bayesian coresets
- General Detection-based Text Line Recognition
- Generalizable and Animatable Gaussian Head Avatar
- Generalizable Implicit Motion Modeling for Video Frame Interpolation
- Generalizable Person Re-identification via Balancing Alignment and Uniformity
- Generalizablity of Memorization Neural Network
- Generalization Analysis for Label-Specific Representation Learning
- Generalization Bound and Learning Methods for Data-Driven Projections in Linear Programming
- Generalization Bounds via Conditional $f$-Information
- Generalization Error Bounds for Two-stage Recommender Systems with Tree Structure
- Generalization of Hamiltonian algorithms
- Generalized Eigenvalue Problems with Generative Priors
- Generalized Fast Exact Conformalization
- Generalized Linear Bandits with Limited Adaptivity
- Generalized Protein Pocket Generation with Prior-Informed Flow Matching
- Generalized Tensor Decomposition for Understanding Multi-Output Regression under Combinatorial Shifts
- Generalize or Detect? Towards Robust Semantic Segmentation Under Multiple Distribution Shifts
- Generalizing CNNs to graphs with learnable neighborhood quantization
- Generalizing Consistency Policy to Visual RL with Prioritized Proximal Experience Regularization
- Generalizing Weather Forecast to Fine-grained Temporal Scales via Physics-AI Hybrid Modeling
- Generated and Pseudo Content guided Prototype Refinement for Few-shot Point Cloud Segmentation
- Generate Universal Adversarial Perturbations for Few-Shot Learning
- Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search
- Generating compositional scenes via Text-to-image RGBA Instance Generation
- Generating Highly Designable Proteins with Geometric Algebra Flow Matching
- Generating Origin-Destination Matrices in Neural Spatial Interaction Models
- Generating Programmatic Solutions: Algorithms and Applications of Programmatic Reinforcement Learning and Code Generation
- Generative Adversarial Model-Based Optimization via Source Critic Regularization
- Generative AI and Creativity: A dialogue between machine learning researchers and creative professionals
- Generative Forests
- Generative Fractional Diffusion Models
- Generative Hierarchical Materials Search
- Generative Modeling of Molecular Dynamics Trajectories
- Generative Modelling of Structurally Constrained Graphs
- Generative Retrieval Meets Multi-Graded Relevance
- Generative Semi-supervised Graph Anomaly Detection
- Generic Unsupervised Optimization for a Latent Variable Model With Exponential Family Observables
- Genetic-guided GFlowNets for Sample Efficient Molecular Optimization
- GENOT: Entropic (Gromov) Wasserstein Flow Matching with Applications to Single-Cell Genomics
- GenRec: Unifying Video Generation and Recognition with Diffusion Models
- GenRL: Multimodal-foundation world models for generalization in embodied agents
- GenWarp: Single Image to Novel Views with Semantic-Preserving Generative Warping
- Geodesic Optimization for Predictive Shift Adaptation on EEG data
- GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation
- Geometric Analysis of Nonlinear Manifold Clustering
- Geometric-Averaged Preference Optimization for Soft Preference Labels
- Geometric Exploitation for Indoor Panoramic Semantic Segmentation
- Geometric Trajectory Diffusion Models
- Geometry Awakening: Cross-Geometry Learning Exhibits Superiority over Individual Structures
- Geometry-aware training of factorized layers in tensor Tucker format
- Geometry Cloak: Preventing TGS-based 3D Reconstruction from Copyrighted Images
- Geometry of naturalistic object representations in recurrent neural network models of working memory
- GeoNLF: Geometry guided Pose-Free Neural LiDAR Fields
- GeoPlant: Spatial Plant Species Prediction Dataset
- GeSS: Benchmarking Geometric Deep Learning under Scientific Applications with Distribution Shifts
- Get rich quick: exact solutions reveal how unbalanced initializations promote rapid feature learning
- Get Rid of Isolation: A Continuous Multi-task Spatio-Temporal Learning Framework
- Getting More Juice Out of the SFT Data: Reward Learning from Human Demonstration Improves SFT for LLM Alignment
- GFlowNet Assisted Biological Sequence Editing
- GFT: Graph Foundation Model with Transferable Tree Vocabulary
- GIC: Gaussian-Informed Continuum for Physical Property Identification and Simulation
- GITA: Graph to Visual and Textual Integration for Vision-Language Graph Reasoning
- GLBench: A Comprehensive Benchmark for Graph with Large Language Models
- Gliding over the Pareto Front with Uniform Designs
- GLinSAT: The General Linear Satisfiability Neural Network Layer By Accelerated Gradient Descent
- GL-NeRF: Gauss-Laguerre Quadrature Enables Training-Free NeRF Acceleration
- Global Convergence in Training Large-Scale Transformers
- Global Distortions from Local Rewards: Neural Coding Strategies in Path-Integrating Neural Systems
- Global Lyapunov functions: a long-standing open problem in mathematics, with symbolic transformers
- Globally Convergent Variational Inference
- Globally Q-linear Gauss-Newton Method for Overparameterized Non-convex Matrix Sensing
- Global Rewards in Restless Multi-Armed Bandits
- GlotCC: An Open Broad-Coverage CommonCrawl Corpus and Pipeline for Minority Languages
- GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI
- GO4Align: Group Optimization for Multi-Task Alignment
- Goal-Conditioned On-Policy Reinforcement Learning
- Goal Conditioned Reinforcement Learning for Photo Finishing Tuning
- Goal Reduction with Loop-Removal Accelerates RL and Models Human Brain Activity in Goal-Directed Learning
- Going Beyond Heuristics by Imposing Policy Improvement as a Constraint
- GOMAA-Geo: GOal Modality Agnostic Active Geo-localization
- GoMatching: A Simple Baseline for Video Text Spotting via Long and Short Term Matching
- Gorilla: Large Language Model Connected with Massive APIs
- Gradient-based Discrete Sampling with Automatic Cyclical Scheduling
- Gradient Cuff: Detecting Jailbreak Attacks on Large Language Models by Exploring Refusal Loss Landscapes
- Gradient-free Decoder Inversion in Latent Diffusion Models
- Gradient-Free Methods for Nonconvex Nonsmooth Stochastic Compositional Optimization
- Gradient Guidance for Diffusion Models: An Optimization Perspective
- Gradient Methods for Online DR-Submodular Maximization with Stochastic Long-Term Constraints
- Gradient Rewiring for Editable Graph Neural Network Training
- Gradients of Functions of Large Matrices
- Gradient-Variation Online Learning under Generalized Smoothness
- Gradual Domain Adaptation via Manifold-Constrained Distributionally Robust Optimization
- Grammar-Aligned Decoding
- GRANOLA: Adaptive Normalization for Graph Neural Networks
- Graph-based Uncertainty Metrics for Long-form Language Model Generations
- Graph-based Unsupervised Disentangled Representation Learning via Multimodal Large Language Models
- Graph Classification via Reference Distribution Learning: Theory and Practice
- Graph Coarsening with Message-Passing Guarantees
- Graphcode: Learning from multiparameter persistent homology using graph neural networks
- Graph Convolutions Enrich the Self-Attention in Transformers!
- GraphCroc: Cross-Correlation Autoencoder for Graph Structural Reconstruction
- Graph Diffusion Policy Optimization
- Graph Diffusion Transformers for Multi-Conditional Molecular Generation
- Graph Edit Distance with General Costs Using Neural Set Divergence
- Graph-enhanced Optimizers for Structure-aware Recommendation Embedding Evolution
- Graph Learning for Numeric Planning
- GraphMETRO: Mitigating Complex Graph Distribution Shifts via Mixture of Aligned Experts
- GraphMorph: Tubular Structure Extraction by Morphing Predicted Graphs
- Graph Neural Flows for Unveiling Systemic Interactions Among Irregularly Sampled Time Series
- Graph Neural Networks and Arithmetic Circuits
- Graph neural networks and non-commuting operators
- Graph Neural Networks Do Not Always Oversmooth
- Graph Neural Networks Need Cluster-Normalize-Activate Modules
- Graph Structure Inference with BAM: Neural Dependency Processing via Bilinear Attention
- GraphTrail: Translating GNN Predictions into Human-Interpretable Logical Rules
- GraphVis: Boosting LLMs with Visual Knowledge Graph Integration
- Grasp as You Say: Language-guided Dexterous Grasp Generation
- Great Minds Think Alike: The Universal Convergence Trend of Input Salience
- GREAT Score: Global Robustness Evaluation of Adversarial Perturbation using Generative Models
- GREATS: Online Selection of High-Quality Data for LLM Training in Every Iteration
- G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering
- Grid4D: 4D Decomposed Hash Encoding for High-fidelity Dynamic Gaussian Splatting
- Grokking of Implicit Reasoning in Transformers: A Mechanistic Journey to the Edge of Generalization
- Grounded Answers for Multi-agent Decision-making Problem through Generative World Model
- Grounding Multimodal Large Language Models in Actions
- GrounDiT: Grounding Diffusion Transformers via Noisy Patch Transplantation
- Group and Shuffle: Efficient Structured Orthogonal Parametrization
- Group Robust Preference Optimization in Reward-free RLHF
- Group-wise oracle-efficient algorithms for online multi-group learning
- GS-Blur: A 3D Scene-Based Dataset for Realistic Image Deblurring
- GSDF: 3DGS Meets SDF for Improved Neural Rendering and Reconstruction
- GSGAN: Adversarial Learning for Hierarchical Generation of 3D Gaussian Splats
- GS-Hider: Hiding Messages into 3D Gaussian Splatting
- GTA: A Benchmark for General Tool Agents
- GTA: Generative Trajectory Augmentation with Guidance for Offline Reinforcement Learning
- GTBench: Uncovering the Strategic Reasoning Capabilities of LLMs via Game-Theoretic Evaluations
- GTSinger: A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks
- GuardT2I: Defending Text-to-Image Models from Adversarial Prompts
- Guided Trajectory Generation with Diffusion Models for Offline Model-based Optimization
- GUIDE: Real-Time Human-Shaped Agents
- Guiding a Diffusion Model with a Bad Version of Itself
- Guiding Neural Collapse: Optimising Towards the Nearest Simplex Equiangular Tight Frame
- GVKF: Gaussian Voxel Kernel Functions for Highly Efficient Surface Reconstruction in Open Scenes
- GV-Rep: A Large-Scale Dataset for Genetic Variant Representation Learning
- HairDiffusion: Vivid Multi-Colored Hair Editing via Latent Diffusion
- HairFastGAN: Realistic and Robust Hair Transfer with a Fast Encoder-Based Approach
- Hallo3D: Multi-Modal Hallucination Detection and Mitigation for Consistent 3D Content Generation
- HaloScope: Harnessing Unlabeled LLM Generations for Hallucination Detection
- Hamba: Single-view 3D Hand Reconstruction with Graph-guided Bi-Scanning Mamba
- Hamiltonian Monte Carlo Inference of Marginalized Linear Mixed-Effects Models
- Hamiltonian Monte Carlo on ReLU Neural Networks is Inefficient
- Hamiltonian Score Matching and Generative Flows
- Handling Learnwares from Heterogeneous Feature Spaces with Explicit Label Exploitation
- Happy: A Debiased Learning Framework for Continual Generalized Category Discovery
- HardCore Generation: Generating Hard UNSAT Problems for Data Augmentation
- Hardness of Learning Neural Networks under the Manifold Hypothesis
- HARMONIC: Harnessing LLMs for Tabular Data Synthesis and Privacy Protection
- Harmonizing Stochasticity and Determinism: Scene-responsive Diverse Human Motion Prediction
- Harmonizing Visual Text Comprehension and Generation
- Harmony4D: A Video Dataset for In-The-Wild Close Human Interactions
- Harnessing Multiple Correlated Networks for Exact Community Recovery
- Harnessing small projectors and multiple views for efficient vision pretraining
- HAWK: Learning to Understand Open-World Video Anomalies
- HC-GAE: The Hierarchical Cluster-based Graph Auto-Encoder for Graph Representation Learning
- HDR-GS: Efficient High Dynamic Range Novel View Synthesis at 1000x Speed via Gaussian Splatting
- HEALNet: Multimodal Fusion for Heterogeneous Biomedical Data
- Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models
- HelpSteer 2: Open-source dataset for training top-performing reward models
- HEMM: Holistic Evaluation of Multimodal Foundation Models
- HENASY: Learning to Assemble Scene-Entities for Interpretable Egocentric Video-Language Model
- HEPrune: Fast Private Training of Deep Neural Networks With Encrypted Data Pruning
- HEST-1k: A Dataset For Spatial Transcriptomics and Histology Image Analysis
- Heterogeneity-Guided Client Sampling: Towards Fast and Efficient Non-IID Federated Learning
- HGDL: Heterogeneous Graph Label Distribution Learning
- HHD-GP: Incorporating Helmholtz-Hodge Decomposition into Gaussian Processes for Learning Dynamical Systems
- HiCo: Hierarchical Controllable Diffusion Model for Layout-to-image Generation
- HiCoM: Hierarchical Coherent Motion for Dynamic Streamable Scenes with 3D Gaussian Splatting
- Hidden in Plain Sight: Evaluating Abstract Shape Recognition in Vision-Language Models
- Hierarchical and Density-based Causal Clustering
- Hierarchical Federated Learning with Multi-Timescale Gradient Correction
- Hierarchical Hybrid Sliced Wasserstein: A Scalable Metric for Heterogeneous Joint Distributions
- Hierarchical Object-Aware Dual-Level Contrastive Learning for Domain Generalized Stereo Matching
- Hierarchical Programmatic Option Framework
- Hierarchical Selective Classification
- Hierarchical Uncertainty Exploration via Feedforward Posterior Trees
- Hierarchical Visual Feature Aggregation for OCR-Free Document Understanding
- High-dimensional (Group) Adversarial Training in Linear Regression
- Higher-Order Causal Message Passing for Experimentation with Complex Interference
- Higher-Rank Irreducible Cartesian Tensors for Equivariant Message Passing
- High-probability complexity bounds for stochastic non-convex minimax optimization
- High Rank Path Development: an approach to learning the filtration of stochastic processes
- High-Resolution Image Harmonization with Adaptive-Interval Color Transformation
- Hints-In-Browser: Benchmarking Language Models for Programming Feedback Generation
- HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models
- Historical Test-time Prompt Tuning for Vision Foundation Models
- HLM-Cite: Hybrid Language Model Workflow for Text-based Scientific Citation Prediction
- HOI-Swap: Swapping Objects in Videos with Hand-Object Interaction Awareness
- Hollowed Net for On-Device Personalization of Text-to-Image Diffusion Models
- Homology Consistency Constrained Efficient Tuning for Vision-Language Models
- HonestLLM: Toward an Honest and Helpful Large Language Model
- Honor Among Bandits: No-Regret Learning for Online Fair Division
- HOPE: Shape Matching Via Aligning Different K-hop Neighbourhoods
- HORSE: Hierarchical Representation for Large-Scale Neural Subset Selection
- HourVideo: 1-Hour Video-Language Understanding
- How Control Information Influences Multilingual Text Image Generation and Editing?
- How Diffusion Models Learn to Factorize and Compose
- How does Architecture Influence the Base Capabilities of Pre-trained Language Models? A Case Study Based on FFN-Wider and MoE Transformers
- How Does Black-Box Impact the Learning Guarantee of Stochastic Compositional Optimization?
- How does Gradient Descent Learn Features --- A Local Analysis for Regularized Two-Layer Neural Networks
- How does Inverse RL Scale to Large State Spaces? A Provably Efficient Approach
- How Does Message Passing Improve Collaborative Filtering?
- How does PDE order affect the convergence of PINNs?
- How Does Variance Shape the Regret in Contextual Bandits?
- How Do Large Language Models Acquire Factual Knowledge During Pretraining?
- How do Large Language Models Handle Multilingualism?
- How Far Can Transformers Reason? The Globality Barrier and Inductive Scratchpad
- How JEPA Avoids Noisy Features: The Implicit Bias of Deep Linear Self Distillation Networks
- How many classifiers do we need?
- How Molecules Impact Cells: Unlocking Contrastive PhenoMolecular Retrieval
- How Sparse Can We Prune A Deep Network: A Fundamental Limit Perspective
- How to Boost Any Loss Function
- How to Continually Adapt Text-to-Image Diffusion Models for Flexible Customization?
- How to optimize what matters most?
- How to Solve Contextual Goal-Oriented Problems with Offline Datasets?
- How to Use Diffusion Priors under Sparse Views?
- How Transformers Utilize Multi-Head Attention in In-Context Learning? A Case Study on Sparse Linear Regression
- Human-3Diffusion: Realistic Avatar Creation via Explicit 3D Consistent Diffusion Models
- Human-Aware Vision-and-Language Navigation: Bridging Simulation to Reality with Dynamic Human Interactions
- Human Expertise in Algorithmic Prediction
- Human-level shape inferences: A benchmark for evaluating the 3D understanding of vision models
- Human-Object Interaction Detection Collaborated with Large Relation-driven Diffusion Models
- Humanoid Locomotion as Next Token Prediction
- HumanSplat: Generalizable Single-Image Human Gaussian Splatting with Structure Priors
- HumanVid: Demystifying Training Data for Camera-controllable Human Image Animation
- HumanVLA: Towards Vision-Language Directed Object Rearrangement by Physical Humanoid
- Humor in AI: Massive Scale Crowd-Sourced Preferences and Benchmarks for Cartoon Captioning
- HuRef: HUman-REadable Fingerprint for Large Language Models
- HW-GPT-Bench: Hardware-Aware Architecture Benchmark for Language Models
- Hybrid Generative AI for De Novo Design of Co-Crystals with Enhanced Tabletability
- Hybrid Mamba for Few-Shot Segmentation
- Hybrid Reinforcement Learning Breaks Sample Size Barriers In Linear MDPs
- Hybrid Top-Down Global Causal Discovery with Local Search for Linear and Nonlinear Additive Noise Models
- Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers
- HYDRA-FL: Hybrid Knowledge Distillation for Robust and Accurate Federated Learning
- HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning
- HYDRA: Model Factorization Framework for Black-Box LLM Personalization
- HydraViT: Stacking Heads for a Scalable ViT
- Hyperbolic Embeddings of Supervised Models
- HyperLogic: Enhancing Diversity and Accuracy in Rule Learning with HyperNets
- Hyper-opinion Evidential Deep Learning for Out-of-Distribution Detection
- HyperPrism: An Adaptive Non-linear Aggregation Framework for Distributed Machine Learning over Non-IID Data and Time-varying Communication Links
- Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis
- Hypothesis Testing the Circuit Hypothesis in LLMs
- HYSYNTH: Context-Free LLM Approximation for Guiding Program Synthesis
- I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing
- IaC-Eval: A code generation benchmark for Infrastructure-as-Code programs
- Identifiability Analysis of Linear ODE Systems with Hidden Confounders
- Identifiability Guarantees for Causal Disentanglement from Purely Observational Data
- Identifiable Object-Centric Representation Learning via Probabilistic Slot Attention
- Identifiable Shared Component Analysis of Unpaired Multimodal Mixtures
- Identification and Estimation of the Bi-Directional MR with Some Invalid Instruments
- Identification of Analytic Nonlinear Dynamical Systems with Non-asymptotic Guarantees
- Identifying and Solving Conditional Image Leakage in Image-to-Video Diffusion Model
- Identifying Causal Effects Under Functional Dependencies
- Identifying Equivalent Training Dynamics
- Identifying Functionally Important Features with End-to-End Sparse Dictionary Learning
- Identifying General Mechanism Shifts in Linear Causal Representations
- Identifying Latent State-Transition Processes for Individualized Reinforcement Learning
- Identifying Selections for Unsupervised Subtask Discovery
- Identifying Spatio-Temporal Drivers of Extreme Events
- Identify Then Recommend: Towards Unsupervised Group Recommendation
- Identity Decoupling for Multi-Subject Personalization of Text-to-Image Models
- IDGen: Item Discrimination Induced Prompt Generation for LLM Evaluation
- Idiographic Personality Gaussian Process for Psychological Assessment
- I Don't Know: Explicit Modeling of Uncertainty with an [IDK] Token
- ID-to-3D: Expressive ID-guided 3D Heads via Score Distillation Sampling
- IF-Font: Ideographic Description Sequence-Following Font Generation
- If You Want to Be Robust, Be Wary of Initialization
- II-Bench: An Image Implication Understanding Benchmark for Multimodal Large Language Models
- IKEA Manuals at Work: 4D Grounding of Assembly Instructions on Internet Videos
- IllumiNeRF: 3D Relighting Without Inverse Rendering
- Image2Struct: A Benchmark for Evaluating Vision-Language Models in Extracting Structured Information from Images
- Image-aware Evaluation of Generated Medical Reports
- Image Copy Detection for Diffusion Models
- ImageNet3D: Towards General-Purpose Object-Level 3D Understanding
- ImageNet++: A Large-Scale Benchmark of Data Curation Strategies
- Image Reconstruction Via Autoencoding Sequential Deep Image Prior
- Images that Sound: Composing Images and Sounds on a Single Canvas
- Image Textualization: An Automatic Framework for Generating Rich and Detailed Image Descriptions
- Image Understanding Makes for A Good Tokenizer for Image Generation
- IMAGPose: A Unified Conditional Framework for Pose-Guided Person Generation
- IMDL-BenCo: A Comprehensive Benchmark and Codebase for Image Manipulation Detection & Localization
- Imitating Language via Scalable Inverse Reinforcement Learning
- Immiscible Diffusion: Accelerating Diffusion Training with Noise Assignment
- ImOV3D: Learning Open Vocabulary Point Clouds 3D Object Detection from Only 2D Images
- IMPACT: A Large-scale Integrated Multimodal Patent Analysis and Creation Dataset for Design Patents
- Implicit Bias of Mirror Flow on Separable Data
- Implicit Curriculum in Procgen Made Explicit
- Implicitly Guided Design with PropEn: Match your Data to Follow the Gradient
- Implicit Multimodal Alignment: On the Generalization of Frozen LLMs to Multimodal Inputs
- Implicit Optimization Bias of Next-token Prediction in Linear Models
- Implicit Regularization of Decentralized Gradient Descent for Sparse Regression
- Implicit Regularization of Sharpness-Aware Minimization for Scale-Invariant Problems
- Implicit Regularization Paths of Weighted Neural Representations
- Implicit Zoo: A Large-Scale Dataset of Neural Implicit Functions for 2D Images and 3D Scenes
- Imprecise Label Learning: A Unified Framework for Learning with Various Imprecise Label Configurations
- Improved Algorithms for Contextual Dynamic Pricing
- Improved Analysis for Bandit Learning in Matching Markets
- Improved Bayes Regret Bounds for Multi-Task Hierarchical Bayesian Bandit Algorithms
- Improved Distribution Matching Distillation for Fast Image Synthesis
- Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses
- Improved Generation of Adversarial Examples Against Safety-aligned LLMs
- Improved Guarantees for Fully Dynamic $k$-Center Clustering with Outliers in General Metric Spaces
- Improved learning rates in multi-unit uniform price auctions
- Improved off-policy training of diffusion samplers
- Improved Particle Approximation Error for Mean Field Neural Networks
- Improved Regret for Bandit Convex Optimization with Delayed Feedback
- Improved Regret of Linear Ensemble Sampling
- Improved Sample Complexity Bounds for Diffusion Model Training
- Improved Sample Complexity for Multiclass PAC Learning
- Improving Adaptivity via Over-Parameterization in Sequence Models
- Improving Adversarial Robust Fairness via Anti-Bias Soft Label Distillation
- Improving Alignment and Robustness with Circuit Breakers
- Improving Context-Aware Preference Modeling for Language Models
- Improving Decision Sparsity
- Improving Deep Learning Optimization through Constrained Parameter Regularization
- Improving Deep Reinforcement Learning by Reducing the Chain Effect of Value and Policy Churn
- Improving Environment Novelty Quantification for Effective Unsupervised Environment Design
- Improving Equivariant Model Training via Constraint Relaxation
- Improving Generalization and Convergence by Enhancing Implicit Regularization
- Improving Generalization in Federated Learning with Model-Data Mutual Information Regularization: A Posterior Inference Approach
- Improving Generalization of Dynamic Graph Learning via Environment Prompt
- Improving Gloss-free Sign Language Translation by Reducing Representation Density
- Improving Linear System Solvers for Hyperparameter Optimisation in Iterative Gaussian Processes
- Improving Neural Network Surface Processing with Principal Curvatures
- Improving Neural ODE Training with Temporal Adaptive Batch Normalization
- Improving Robustness of 3D Point Cloud Recognition from a Fourier Perspective
- Improving robustness to corruptions with multiplicative weight perturbations
- Improving self-training under distribution shifts via anchored confidence with theoretical guarantees
- Improving Sparse Decomposition of Language Model Activations with Gated Sparse Autoencoders
- Improving Subgroup Robustness via Data Selection
- Improving Temporal Link Prediction via Temporal Walk Matrix Projection
- Improving the Learning Capability of Small-size Image Restoration Network by Deep Fourier Shifting
- Improving the Training of Rectified Flows
- Improving the Worst-Case Bidirectional Communication Complexity for Nonconvex Distributed Optimization under Function Similarity
- Improving Viewpoint-Independent Object-Centric Representations through Active Viewpoint Selection
- Improving Visual Prompt Tuning by Gaussian Neighborhood Minimization for Long-Tailed Visual Recognition
- In-and-Out: Algorithmic Diffusion for Sampling Convex Bodies
- Incentivizing Quality Text Generation via Statistical Contracts
- IncomeSCM: From tabular data set to time-series simulator and causal estimation benchmark
- In-Context Learning of a Linear Transformer Block: Benefits of the MLP Component and One-Step GD Initialization
- In-Context Learning State Vector with Inner and Momentum Optimization
- In-Context Learning with Representations: Contextual Generalization of Trained Transformers
- In-Context Learning with Transformers: Softmax Attention Adapts to Function Lipschitzness
- In-Context Symmetries: Self-Supervised Learning through Contextual World Models
- Incorporating Surrogate Gradient Norm to Improve Offline Optimization Techniques
- Incorporating Test-Time Optimization into Training with Dual Networks for Human Mesh Recovery
- Incremental Learning of Retrievable Skills For Efficient Continual Task Adaptation
- INDICT: Code Generation with Internal Dialogues of Critiques for Both Security and Helpfulness
- IndicVoices-R: Unlocking a Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS
- Indoor Air Quality Dataset with Activities of Daily Living in Low to Middle-income Communities
- Induced Model Matching: Restricted Models Help Train Full-Featured Models
- Inductive biases of multi-task learning and finetuning: multiple regimes of feature reuse
- Inevitable Trade-off between Watermark Strength and Speculative Sampling Efficiency for Language Models
- Inexact Augmented Lagrangian Methods for Conic Optimization: Quadratic Growth and Linear Convergence
- Inference of Neural Dynamics Using Switching Recurrent Neural Networks
- Inference on the Change Point under a High Dimensional Covariance Shift
- Inference via Interpolation: Contrastive Representations Provably Enable Planning and Inference
- Infer Induced Sentiment of Comment Response to Video: A New Task, Dataset and Baseline
- Inferring Neural Signed Distance Functions by Overfitting on Single Noisy Point Clouds through Finetuning Data-Driven based Priors
- Inferring stochastic low-rank recurrent neural networks from neural data
- InfiBench: Evaluating the Question-Answering Capabilities of Code Large Language Models
- Infinite-Dimensional Feature Interaction
- Infinite Limits of Multi-head Transformer Dynamics
- Inflationary Flows: Calibrated Bayesian Inference with Diffusion-Based Models
- InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory
- Information Re-Organization Improves Reasoning in Large Language Models
- Information-theoretic Generalization Analysis for Expected Calibration Error
- Information-theoretic Limits of Online Classification with Noisy Labels
- InfoRM: Mitigating Reward Hacking in RLHF via Information-Theoretic Reward Modeling
- Infusing Self-Consistency into Density Functional Theory Hamiltonian Prediction via Deep Equilibrium Models
- Infusing Synthetic Data with Real-World Patterns for Zero-Shot Material State Segmentation
- Initialization is Critical to Whether Transformers Fit Composite Functions by Reasoning or Memorizing
- Initializing Services in Interactive ML Systems for Diverse Users
- Initializing Variable-sized Vision Transformers from Learngene with Learnable Transformation
- Injecting Undetectable Backdoors in Obfuscated Neural Networks and Language Models
- In-N-Out: Lifting 2D Diffusion Prior for 3D Object Removal via Tuning-Free Latents Alignment
- In Pursuit of Causal Label Correlations for Multi-label Image Recognition
- Input-to-State Stable Coupled Oscillator Networks for Closed-form Model-based Control in Latent Space
- INQUIRE: A Natural World Text-to-Image Retrieval Benchmark
- Instance-adaptive Zero-shot Chain-of-Thought Prompting
- Instance-Optimal Private Density Estimation in the Wasserstein Distance
- Instance-Specific Asymmetric Sensitivity in Differential Privacy
- InstructG2I: Synthesizing Images from Multimodal Attributed Graphs
- Instruction Embedding: Latent Representations of Instructions Towards Task Identification
- Instruction-Guided Visual Masking
- Instruction Tuning Large Language Models to Understand Electronic Health Records
- Instruction Tuning With Loss Over Instructions
- Instructor-inspired Machine Learning for Robust Molecular Property Prediction
- Integrating Deep Metric Learning with Coreset for Active Learning in 3D Segmentation
- Integrating GNN and Neural ODEs for Estimating Non-Reciprocal Two-Body Interactions in Mixed-Species Collective Motion
- Integrating Suboptimal Human Knowledge with Hierarchical Reinforcement Learning for Large-Scale Multiagent Systems
- Interaction-Force Transport Gradient Flows
- Interactive Deep Clustering via Value Mining
- InterControl: Zero-shot Human Interaction Generation by Controlling Every Joint
- InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction
- Interfacing Foundation Models' Embeddings
- International Workshop on Federated Foundation Models in Conjunction with NeurIPS 2024 (FL@FM-NeurIPS'24)
- InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD
- InterpBench: Semi-Synthetic Transformers for Evaluating Mechanistic Interpretability Techniques
- Interpolating Item and User Fairness in Multi-Sided Recommendations
- Interpretable AI: Past, Present and Future
- Interpretable Concept-Based Memory Reasoning
- Interpretable Concept Bottlenecks to Align Reinforcement Learning Agents
- Interpretable Generalized Additive Models for Datasets with Missing Values
- Interpretable Image Classification with Adaptive Prototype-based Vision Transformers
- Interpretable Lightweight Transformer via Unrolling of Learned Graph Smoothness Priors
- Interpretable Mesomorphic Networks for Tabular Data
- Interpreting and Analysing CLIP's Zero-Shot Image Classification via Mutual Knowledge
- Interpreting CLIP with Sparse Linear Concept Embeddings (SpLiCE)
- Interpreting Learned Feedback Patterns in Large Language Models
- Interpreting the Weight Space of Customized Diffusion Models
- Interpret Your Decision: Logical Reasoning Regularization for Generalization in Visual Classification
- Interventional Causal Discovery in a Mixture of DAGs
- Interventionally Consistent Surrogates for Complex Simulation Models
- Intervention and Conditioning in Causal Bayesian Networks
- In-Trajectory Inverse Reinforcement Learning: Learn Incrementally From An Ongoing Trajectory
- IntraMix: Intra-Class Mixup Generation for Accurate Labels and Neighbors
- Intrinsically Motivated Open-ended Learning (IMOL)
- Intrinsic Robustness of Prophet Inequality to Strategic Reward Signaling
- Intrinsic Self-Supervision for Data Quality Audits
- Introducing Spectral Attention for Long-Range Dependency in Time Series Forecasting
- Introspective Planning: Aligning Robots' Uncertainty with Inherent Task Ambiguity
- Intruding with Words: Towards Understanding Graph Injection Attacks at the Text Level
- Invariant subspaces and PCA in nearly matrix multiplication time
- Invariant Tokenization of Crystalline Materials for Language Model Enabled Generation
- Inverse Factorized Soft Q-Learning for Cooperative Multi-agent Imitation Learning
- Inverse M-Kernels for Linear Universal Approximators of Non-Negative Functions
- Inversion-based Latent Bayesian Optimization
- InversionView: A General-Purpose Method for Reading Information from Neural Activations
- Invertible Consistency Distillation for Text-Guided Image Editing in Around 7 Steps
- Invisible Image Watermarks Are Provably Removable Using Generative AI
- IODA: Instance-Guided One-shot Domain Adaptation for Super-Resolution
- IPM-LSTM: A Learning-Based Interior Point Method for Solving Nonlinear Programs
- IPO: Interpretable Prompt Optimization for Vision-Language Models
- IQA-EVAL: Automatic Evaluation of Human-Model Interactive Question Answering
- IRCAN: Mitigating Knowledge Conflicts in LLM Generation via Identifying and Reweighting Context-Aware Neurons
- IR-CM: The Fast and Universal Image Restoration Method Based on Consistency Model
- Is A Picture Worth A Thousand Words? Delving Into Spatial Reasoning for Vision Language Models
- Is Behavior Cloning All You Need? Understanding Horizon in Imitation Learning
- Is Cross-validation the Gold Standard to Estimate Out-of-sample Model Performance?
- Is Function Similarity Over-Engineered? Building a Benchmark
- Is Knowledge Power? On the (Im)possibility of Learning from Strategic Interactions
- Is Mamba Compatible with Trajectory Optimization in Offline Reinforcement Learning?
- Is Multiple Object Tracking a Matter of Specialization?
- Is O(log N) practical? Near-Equivalence Between Delay Robustness and Bounded Regret in Bandits and RL
- Is One GPU Enough? Pushing Image Generation at Higher-Resolutions with Foundation Models.
- Is Programming by Example solved by LLMs?
- Is Score Matching Suitable for Estimating Point Processes?
- Is the MMI Criterion Necessary for Interpretability? Degenerating Non-causal Features to Plain Noise for Self-Rationalization
- Is Value Learning Really the Main Bottleneck in Offline RL?
- Is Your HD Map Constructor Reliable under Sensor Corruptions?
- Is Your LiDAR Placement Optimized for 3D Scene Understanding?
- Iteration Head: A Mechanistic Study of Chain-of-Thought
- Iteratively Refined Behavior Regularization for Offline Reinforcement Learning
- Iteratively Refined Early Interaction Alignment for Subgraph Matching based Graph Retrieval
- Iterative Methods via Locally Evolving Set Process
- Iterative Reasoning Preference Optimization
- iVideoGPT: Interactive VideoGPTs are Scalable World Models
- IWBVT: Instance Weighting-based Bias-Variance Trade-off for Crowdsourcing
- JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models
- Jailbreaking Large Language Models Against Moderation Guardrails via Cipher Characters
- JaxMARL: Multi-Agent RL Environments and Algorithms in JAX
- JiuZhang3.0: Efficiently Improving Mathematical Reasoning by Training Small Data Synthesis Models
- Job-SDF: A Multi-Granularity Dataset for Job Skill Demand Forecasting and Benchmarking
- John Ellipsoids via Lazy Updates
- Jointly Modeling Inter- & Intra-Modality Dependencies for Multi-modal Learning
- JourneyBench: A Challenging One-Stop Vision-Language Understanding Benchmark of Generated Images
- Just Add \$100 More: Augmenting Pseudo-LiDAR Point Cloud for Resolving Class-imbalance Problem
- Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling
- Kaleidoscope: Learnable Masks for Heterogeneous Multi-agent Reinforcement Learning
- KALM: Knowledgeable Agents by Offline Reinforcement Learning from Large Language Model Rollouts
- Kangaroo: Lossless Self-Speculative Decoding for Accelerating LLMs via Double Early Exiting
- Keeping LLMs Aligned After Fine-tuning: The Crucial Role of Prompt Templates
- Kermut: Composite kernel regression for protein variant effects
- Kernel-Based Function Approximation for Average Reward Reinforcement Learning: An Optimist No-Regret Algorithm
- Kernel Language Entropy: Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities
- Kernel PCA for Out-of-Distribution Detection
- Key-Grid: Unsupervised 3D Keypoints Detection using Grid Heatmap Features
- KFNN: K-Free Nearest Neighbor For Crowdsourcing
- KG-FIT: Knowledge Graph Fine-Tuning Upon Open-World Knowledge
- kGym: A Platform and Dataset to Benchmark Large Language Models on Linux Kernel Crash Resolution
- KnowGPT: Knowledge Graph based Prompting for Large Language Models
- Knowledge Circuits in Pretrained Transformers
- Knowledge Composition using Task Vectors with Learned Anisotropic Scaling
- Knowledge-Empowered Dynamic Graph Network for Irregularly Sampled Medical Time Series
- Knowledge Graph Completion by Intermediate Variables Regularization
- KOALA: Empirical Lessons Toward Memory-Efficient and Fast Diffusion Models for Text-to-Image Synthesis
- KptLLM: Unveiling the Power of Large Language Model for Keypoint Comprehension
- Kraken: Inherently Parallel Transformers For Efficient Multi-Device Inference
- Kronecker-Factored Approximate Curvature for Physics-Informed Neural Networks
- Kuro Siwo: 33 billion $m^2$ under the water. A global multi-temporal satellite dataset for rapid flood mapping
- KV Cache is 1 Bit Per Channel: Efficient Large Language Model Inference with Coupled Quantization
- KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
- L4GM: Large 4D Gaussian Reconstruction Model
- Label Alignment Regularization for Distribution Shift
- Label Delay in Online Continual Learning
- Label Noise: Ignorance Is Bliss
- LACIE: Listener-Aware Finetuning for Calibration in Large Language Models
- LaKD: Length-agnostic Knowledge Distillation for Trajectory Prediction with Any Length Observations
- LAM3D: Large Image-Point Clouds Alignment Model for 3D Reconstruction from Single Image
- Lambda: Learning Matchable Prior For Entity Alignment with Unlabeled Dangling Cases
- Langevin Unlearning: A New Perspective of Noisy Gradient Descent for Machine Unlearning
- Language-Driven Interactive Traffic Trajectory Generation
- Language Gamification
- Language Generation in the Limit
- Language Grounded Multi-agent Reinforcement Learning with Human-interpretable Communication
- Language Model as Visual Explainer
- Language Models as Hierarchy Encoders
- Language Models as Zero-shot Lossless Gradient Compressors: Towards General Neural Parameter Prior Models
- Language Without Borders: A Dataset and Benchmark for Code-Switching Lip Reading
- Large Language Models as Urban Residents: An LLM Agent Framework for Personal Mobility Generation
- Large Language Models' Expert-level Global History Knowledge Benchmark (HiST-LLM)
- Large Language Models-guided Dynamic Adaptation for Temporal Knowledge Graph Reasoning
- Large Language Models Must Be Taught to Know What They Don’t Know
- Large Language Models Play StarCraft II: Benchmarks and A Chain of Summarization Approach
- Large Language Model Unlearning
- Large Language Model Unlearning via Embedding-Corrupted Prompts
- Large language model validity via enhanced conformal prediction methods
- Large Pre-trained time series models for cross-domain Time series analysis tasks
- Large Scale Transfer Learning for Tabular Data via Language Modeling
- Large Spatial Model: End-to-end Unposed Images to Semantic 3D
- Large Stepsize Gradient Descent for Non-Homogeneous Two-Layer Networks: Margin Improvement and Fast Optimization
- LaSCal: Label-Shift Calibration without target labels
- LaSe-E2V: Towards Language-guided Semantic-aware Event-to-Video Reconstruction
- Last-Iterate Convergence for Generalized Frank-Wolfe in Monotone Variational Inequalities
- Last-Iterate Global Convergence of Policy Gradients for Constrained Reinforcement Learning
- Latent Diffusion for Neural Spiking Data
- Latent Functional Maps: a spectral framework for representation alignment
- Latent Intrinsics Emerge from Training to Relight
- Latent Learning Progress Drives Autonomous Goal Selection in Human Reinforcement Learning
- Latent Neural Operator for Solving Forward and Inverse PDE Problems
- Latent Paraphrasing: Perturbation on Layers Improves Knowledge Injection in Language Models
- Latent Plan Transformer for Trajectory Abstraction: Planning as Latent Space Inference
- Latent Representation Matters: Human-like Sketches in One-shot Drawing Tasks
- LAVIB: A Large-scale Video Interpolation Benchmark
- Layer-Adaptive State Pruning for Deep State Space Models
- LCGen: Mining in Low-Certainty Generation for View-consistent Text-to-3D
- LCM: Locally Constrained Compact Point Cloud Model for Masked Point Modeling
- Lean Workbook: A large-scale Lean problem set formalized from natural language math problems
- Learnability Matters: Active Learning for Video Captioning
- Learnability of high-dimensional targets by two-parameter models and gradient flow
- Learning 1D Causal Visual Representation with De-focus Attention Networks
- Learning 3D Equivariant Implicit Function with Patch-Level Pose-Invariant Representation
- Learning 3D Garment Animation from Trajectories of A Piece of Cloth
- Learning Action and Reasoning-Centric Image Editing from Videos and Simulation
- Learning an Actionable Discrete Diffusion Policy via Large-Scale Actionless Video Pre-Training
- Learning and Transferring Sparse Contextual Bigrams with Linear Transformers
- Learning a Single Neuron Robustly to Distributional Shifts and Adversarial Label Noise
- Learning-Augmented Algorithms for the Bahncard Problem
- Learning-Augmented Algorithms with Explicit Predictors
- Learning-Augmented Approximation Algorithms for Maximum Cut and Related Problems
- Learning-Augmented Dynamic Submodular Maximization
- Learning-Augmented Priority Queues
- Learning Better Representations From Less Data For Propositional Satisfiability
- Learning Bregman Divergences with Application to Robustness
- Learning Commonality, Divergence and Variety for Unsupervised Visible-Infrared Person Re-identification
- Learning Complete Protein Representation by Dynamically Coupling of Sequence and Structure
- Learning Cooperative Trajectory Representations for Motion Forecasting
- Learning Cortico-Muscular Dependence through Orthonormal Decomposition of Density Ratios
- Learning Cut Generating Functions for Integer Programming
- Learning De-Biased Representations for Remote-Sensing Imagery
- Learning diffusion at lightspeed
- Learning Diffusion Priors from Observations by Expectation Maximization
- Learning Discrete Concepts in Latent Hierarchical Models
- Learning Discrete Latent Variable Structures with Tensor Rank Conditions
- Learning Disentangled Representations for Perceptual Point Cloud Quality Assessment via Mutual Information Minimization
- Learning Distinguishable Trajectory Representation with Contrastive Loss
- Learning Distributions on Manifolds with Free-Form Flows
- Learning diverse causally emergent representations from time series data
- Learning Elastic Costs to Shape Monge Displacements
- Learning Equilibria in Adversarial Team Markov Games: A Nonconvex-Hidden-Concave Min-Max Optimization Problem
- Learning for Interaction and Interaction for Learning
- Learning Formal Mathematics From Intrinsic Motivation
- Learning Frequency-Adapted Vision Foundation Model for Domain Generalized Semantic Segmentation
- Learning from higher-order correlations, efficiently: hypothesis tests, random features, and neural networks
- Learning from Highly Sparse Spatio-temporal Data
- Learning from Noisy Labels via Conditional Distributionally Robust Optimization
- Learning from Offline Foundation Features with Tensor Augmentations
- Learning from Pattern Completion: Self-supervised Controllable Generation
- Learning from Snapshots of Discrete and Continuous Data Streams
- Learning from Teaching Regularization: Generalizable Correlations Should be Easy to Imitate
- Learning from Uncertain Data: From Possible Worlds to Possible Models
- Learning Generalized Linear Programming Value Functions
- Learning General Parameterized Policies for Infinite Horizon Average Reward Constrained MDPs via Primal-Dual Policy Gradient Algorithm
- Learning Goal-Conditioned Representations for Language Reward Models
- Learning Group Actions on Latent Representations
- Learning Human-like Representations to Enable Learning Human Values
- Learning Identifiable Factorized Causal Representations of Cellular Responses
- Learning Image Priors Through Patch-Based Diffusion Models for Solving Inverse Problems
- Learning Infinitesimal Generators of Continuous Symmetries from Data
- Learning in Markov Games with Adaptive Adversaries: Policy Regret, Fundamental Barriers, and Efficient Algorithms
- Learning Interaction-aware 3D Gaussian Splatting for One-shot Hand Avatars
- Learning Linear Causal Representations from General Environments: Identifiability and Intrinsic Ambiguity
- Learning Low-Rank Feature for Thorax Disease Classification
- Learning Macroscopic Dynamics from Partial Microscopic Observations
- Learning Mixtures of Unknown Causal Interventions
- Learning Multimodal Behaviors from Scratch with Diffusion Policy Gradient
- Learning Neural Contracting Dynamics: Extended Linearization and Global Guarantees
- Learning Noisy Halfspaces with a Margin: Massart is No Harder than Random
- Learning on Large Graphs using Intersecting Communities
- Learning Optimal Lattice Vector Quantizers for End-to-end Neural Image Compression
- Learning Optimal Tax Design in Nonatomic Congestion Games
- Learning Partitions from Context
- Learning Place Cell Representations and Context-Dependent Remapping
- Learning Plaintext-Ciphertext Cryptographic Problems via ANF-based SAT Instance Representation
- Learning predictable and robust neural representations by straightening image sequences
- Learning Representations for Hierarchies with Minimal Support
- Learning rigid-body simulators over implicit shapes for large-scale scenes and vision
- Learning Segmentation from Point Trajectories
- Learning Social Welfare Functions
- Learning Spatially-Aware Language and Audio Embeddings
- Learning Structure-Aware Representations of Dependent Types
- Learning Structured Representations with Hyperbolic Embeddings
- Learning Successor Features the Simple Way
- Learning Superconductivity from Ordered and Disordered Material Structures
- Learning symmetries via weight-sharing with doubly stochastic tensors
- Learning the Expected Core of Strictly Convex Stochastic Cooperative Games
- Learning the Infinitesimal Generator of Stochastic Diffusion Processes
- Learning the Latent Causal Structure for Modeling Label Noise
- Learning the Optimal Policy for Balancing Short-Term and Long-Term Rewards
- Learning to Assist Humans without Inferring Rewards
- Learning to Balance Altruism and Self-interest Based on Empathy in Mixed-Motive Games
- Learning to be Smooth: An End-to-End Differentiable Particle Smoother
- Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching
- Learning to compute Gröbner bases
- Learning to Cooperate with Humans using Generative Agents
- Learning to Decouple the Lights for 3D Face Texture Modeling
- Learning to Discuss Strategically: A Case Study on One Night Ultimate Werewolf
- Learning to Edit Visual Programs with Self-Supervision
- Learning to Embed Distributions via Maximum Kernel Entropy
- Learning to grok: Emergence of in-context learning and skill composition in modular arithmetic tasks
- Learning to Handle Complex Constraints for Vehicle Routing Problems
- Learning to Merge Tokens via Decoupled Embedding for Efficient Vision Transformers
- Learning to Mitigate Externalities: the Coase Theorem with Hindsight Rationality
- Learning to Predict Structural Vibrations
- Learning to Price Homogeneous Data
- Learning to Reason Iteratively and Parallelly for Complex Visual Reasoning Scenarios
- Learning to Reason via Program Generation, Emulation, and Search
- Learning to Shape In-distribution Feature Space for Out-of-distribution Detection
- Learning to Solve Quadratic Unconstrained Binary Optimization in a Classification Way
- Learning to Understand: Identifying Interactions via the Möbius Transform
- Learning Transferable Features for Implicit Neural Representations
- Learning Truncated Causal History Model for Video Restoration
- Learning Versatile Skills with Curriculum Masking
- Learning via Surrogate PAC-Bayes
- Learning Where to Edit Vision Transformers
- Learning with Fitzpatrick Losses
- Learning World Models for Unconstrained Goal Navigation
- Learn more, but bother less: parameter efficient continual learning
- Learn To be Efficient: Build Structured Sparsity in Large Language Models
- Least Squares Regression Can Exhibit Under-Parameterized Double Descent
- LeDex: Training LLMs to Better Self-Debug and Explain Code
- Length Optimization in Conformal Prediction
- LESS: Label-Efficient and Single-Stage Referring 3D Segmentation
- Leveraging an ECG Beat Diffusion Model for Morphological Reconstruction from Indirect Signals
- Leveraging Catastrophic Forgetting to Develop Safe Diffusion Models against Malicious Finetuning
- Leveraging Contrastive Learning for Enhanced Node Representations in Tokenized Graph Transformers
- Leveraging Drift to Improve Sample Complexity of Variance Exploding Diffusion Models
- Leveraging Environment Interaction for Automated PDDL Translation and Planning with Large Language Models
- Leveraging Hallucinations to Reduce Manual Prompt Dependency in Promptable Segmentation
- Leveraging partial stragglers within gradient coding
- Leveraging Separated World Model for Exploration in Visually Distracted Environments
- Leveraging Tumor Heterogeneity: Heterogeneous Graph Representation Learning for Cancer Survival Prediction in Whole Slide Images
- Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning
- Lever LM: Configuring In-Context Sequence to Lever Large Vision Language Models
- Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding
- LFME: A Simple Framework for Learning from Multiple Experts in Domain Generalization
- LG-CAV: Train Any Concept Activation Vector with Language Guidance
- LG-VQ: Language-Guided Codebook Learning
- LibAMM: Empirical Insights into Approximate Computing for Accelerating Matrix Multiplication
- LibMOON: A Gradient-based MultiObjective OptimizatioN Library in PyTorch
- LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS
- Lighting Every Darkness with 3DGS: Fast Training and Real-Time Rendering for HDR View Synthesis
- Light Unbalanced Optimal Transport
- Lightweight Frequency Masker for Cross-Domain Few-Shot Semantic Segmentation
- Limits of Transformer Language Models on Learning to Compose Algorithms
- Linear Causal Bandits: Unknown Graph and Soft Interventions
- Linear Causal Representation Learning from Unknown Multi-node Interventions
- Linearly Decomposing and Recomposing Vision Transformers for Diverse-Scale Models
- Linear Regression using Heterogeneous Data Batches
- Linear Time Approximation Algorithm for Column Subset Selection with Local Search
- Linear Transformers are Versatile In-Context Learners
- Linear Uncertainty Quantification of Graphical Model Inference
- LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low Resource and Extinct Languages
- Linguistic Collapse: Neural Collapse in (Large) Language Models
- Linking In-context Learning in Transformers to Human Episodic Memory
- LinNet: Linear Network for Efficient Point Cloud Representation Learning
- LION: Linear Group RNN for 3D Object Detection in Point Clouds
- Lips Are Lying: Spotting the Temporal Inconsistency between Audio and Visual in Lip-Syncing DeepFakes
- LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning
- Lisa: Lazy Safety Alignment for Large Language Models against Harmful Fine-tuning Attack
- Listenable Maps for Zero-Shot Audio Classifiers
- LiteVAE: Lightweight and Efficient Variational Autoencoders for Latent Diffusion Models
- LiT: Unifying LiDAR "Languages" with LiDAR Translator
- LIVE: Learnable In-Context Vector for Visual Question Answering
- LiveScene: Language Embedding Interactive Radiance Fields for Physical Scene Control and Rendering
- LLaMo: Large Language Model-based Molecular Graph Assistant
- LLaNA: Large Language and NeRF Assistant
- LLM-AutoDA: Large Language Model-Driven Automatic Data Augmentation for Long-tailed Problems
- LLM-based Skill Diffusion for Zero-shot Policy Adaptation
- LLMCBench: Benchmarking Large Language Model Compression for Efficient Deployment
- LLM-Check: Investigating Detection of Hallucinations in Large Language Models
- LLM Circuit Analyses Are Consistent Across Training and Scale
- LLM Dataset Inference: Did you train on my dataset?
- LLMDFA: Analyzing Dataflow in Code with Large Language Models
- LLM-ESR: Large Language Models Enhancement for Long-tailed Sequential Recommendation
- LLM Evaluators Recognize and Favor Their Own Generations
- LLM Processes: Numerical Predictive Distributions Conditioned on Natural Language
- LLMs as Zero-shot Graph Learners: Alignment of GNN Representations with LLM Token Embeddings
- LLMs Can Evolve Continually on Modality for $\mathbb{X}$-Modal Reasoning
- LM-HT SNN: Enhancing the Performance of SNN to ANN Counterpart through Learnable Multi-hierarchical Threshold Model
- Local and Adaptive Mirror Descents in Extensive-Form Games
- Local Anti-Concentration Class: Logarithmic Regret for Greedy Linear Contextual Bandit
- Local Curvature Smoothing with Stein's Identity for Efficient Score Matching
- Localized Adaptive Risk Control
- Localized Zeroth-Order Prompt Optimization
- Localize, Understand, Collaborate: Semantic-Aware Dragging via Intention Reasoner
- Localizing Memorization in SSL Vision Encoders
- Local Linearity: the Key for No-regret Reinforcement Learning in Continuous MDPs
- Locally Private and Robust Multi-Armed Bandits
- Local Superior Soups: A Catalyst for Model Merging in Cross-Silo Federated Learning
- Local to Global: Learning Dynamics and Effect of Initialization for Transformers
- Locating What You Need: Towards Adapting Diffusion Models to OOD Concepts In-the-Wild
- LocCa: Visual Pretraining with Location-aware Captioners
- LoCo: Learning 3D Location-Consistent Image Features with a Memory-Efficient Ranking Loss
- LoD-Loc: Aerial Visual Localization using LoD 3D Map with Neural Wireframe Alignment
- LoFiT: Localized Fine-tuning on LLM Representations
- Logarithmic Smoothing for Pessimistic Off-Policy Evaluation, Selection and Learning
- Log-concave Sampling from a Convex Body with a Barrier: a Robust and Unified Dikin Walk
- Logical characterizations of recurrent graph neural networks with reals and floats
- LogiCity: Advancing Neuro-Symbolic AI with Abstract Urban Simulation
- Loki: Low-rank Keys for Efficient Sparse Attention
- Long-form factuality in large language models
- Long-Horizon Planning for Multi-Agent Robots in Partially Observable Environments
- Long-range Brain Graph Transformer
- Long-Range Feedback Spiking Network Captures Dynamic and Static Representations of the Visual Cortex under Movie Stimuli
- Long-range Meta-path Search on Large-scale Heterogeneous Graphs
- Long-tailed Object Detection Pretraining: Dynamic Rebalancing Contrastive Learning with Dual Reconstruction
- Long-Tailed Out-of-Distribution Detection via Normalized Outlier Distribution Adaptation
- LongVideoBench: A Benchmark for Long-context Interleaved Video-Language Understanding
- Lookback Prophet Inequalities
- LookHere: Vision Transformers with Directed Attention Generalize and Extrapolate
- Look, Listen, and Answer: Overcoming Biases for Audio-Visual Question Answering
- Looks Too Good To Be True: An Information-Theoretic Analysis of Hallucinations in Generative Restoration Models
- LoQT: Low-Rank Adapters for Quantized Pretraining
- LoRA-GA: Low-Rank Adaptation with Gradient Approximation
- LoRANN: Low-Rank Matrix Factorization for Approximate Nearest Neighbor Search
- Lorentz-Equivariant Geometric Algebra Transformers for High-Energy Physics
- Loss Landscape Characterization of Neural Networks without Over-Parametrization
- LoTLIP: Improving Language-Image Pre-training for Long Text Understanding
- LOVA3: Learning to Visual Question Answering, Asking and Assessment
- Low Degree Hardness for Broadcasting on Trees
- Lower Bounds and Optimal Algorithms for Non-Smooth Convex Decentralized Optimization over Time-Varying Networks
- Lower Bounds of Uniform Stability in Gradient-Based Bilevel Algorithms for Hyperparameter Optimization
- Low Precision Local Training is Enough for Federated Learning
- Low-Rank Optimal Transport through Factor Relaxation with Latent Coupling
- LP-3DGS: Learning to Prune 3D Gaussian Splatting
- LRM-Zero: Training Large Reconstruction Models with Synthesized Data
- LSH-MoE: Communication-efficient MoE Training via Locality-Sensitive Hashing
- LT-Defense: Searching-free Backdoor Defense via Exploiting the Long-tailed Effect
- L-TTA: Lightweight Test-Time Adaptation Using a Versatile Stem Layer
- LucidAction: A Hierarchical and Multi-model Dataset for Comprehensive Action Quality Assessment
- Lumen: Unleashing Versatile Vision-Centric Capabilities of Large Multimodal Models
- Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT
- LuSh-NeRF: Lighting up and Sharpening NeRFs for Low-light Scenes
- LVD-2M: A Long-take Video Dataset with Temporally Dense Captions
- M$^3$GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation
- M3LEO: A Multi-Modal, Multi-Label Earth Observation Dataset Integrating Interferometric SAR and RGB Data
- MAC Advice for facility location mechanism design
- Machine Learning for Systems
- Machine Learning in Structural Biology
- MACM: Utilizing a Multi-Agent System for Condition Mining in Solving Complex Mathematical Problems
- MADiff: Offline Multi-agent Learning with Diffusion Models
- MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution
- MAGNET: Improving the Multilingual Fairness of Language Models with Adaptive Gradient-Based Tokenization
- Magnet: We Never Know How Text-to-Image Diffusion Models Work, Until We Learn How Vision-Language Models Function
- MagR: Weight Magnitude Reduction for Enhancing Post-Training Quantization
- Maia-2: A Unified Model for Human-AI Alignment in Chess
- Make-An-Agent: A Generalizable Policy Network Generator with Behavior-Prompted Diffusion
- Make Continual Learning Stronger via C-Flat
- Make-it-Real: Unleashing Large Multimodal Model for Painting 3D Objects with Realistic Materials
- Make Your LLM Fully Utilize the Context
- Making Offline RL Online: Collaborative World Models for Offline Visual Reinforcement Learning
- MALT Powers Up Adversarial Attacks
- MambaAD: Exploring State Space Models for Multi-class Unsupervised Anomaly Detection
- MambaLLIE: Implicit Retinex-Aware Low Light Enhancement with Global-then-Local State Space
- MambaLRP: Explaining Selective State Space Sequence Models
- MambaSCI: Efficient Mamba-UNet for Quad-Bayer Patterned Video Snapshot Compressive Imaging
- MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models
- MambaTree: Tree Topology is All You Need in State Space Model
- MAmmoTH2: Scaling Instructions from the Web
- ManiPose: Manifold-Constrained Multi-Hypothesis 3D Human Pose Estimation
- MaNo: Exploiting Matrix Norm for Unsupervised Accuracy Estimation Under Distribution Shifts
- MAN TruckScenes: A multimodal dataset for autonomous trucking in diverse conditions
- Many-Shot In-Context Learning
- Many-shot Jailbreaking
- Map It Anywhere: Empowering BEV Map Prediction using Large-scale Public Datasets
- Marginal Causal Flows for Validation and Inference
- Markov Equivalence and Consistency in Differentiable Structure Learning
- Markovian Flow Matching: Accelerating MCMC with Continuous Normalizing Flows
- MARPLE: A Benchmark for Long-Horizon Inference
- Marrying Causal Representation Learning with Dynamical Systems for Science
- Mars: Situated Inductive Reasoning in an Open-World Environment
- MARVEL: Multidimensional Abstraction and Reasoning through Visual Evaluation and Learning
- Masked Hard-Attention Transformers Recognize Exactly the Star-Free Languages
- Masked Pre-training Enables Universal Zero-shot Denoiser
- MaskFactory: Towards High-quality Synthetic Data Generation for Dichotomous Image Segmentation
- MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models
- MassSpecGym: A benchmark for the discovery and identification of molecules
- Matching the Statistical Query Lower Bound for $k$-Sparse Parity Problems with Sign Stochastic Gradient Descent
- MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models
- MatFormer: Nested Transformer for Elastic Inference
- MATH-AI: The 4th Workshop on Mathematical Reasoning and AI
- Mathematics of Modern Machine Learning (M3L)
- MathPile: A Billion-Token-Scale Pretraining Corpus for Math
- Matrix Denoising with Doubly Heteroscedastic Noise: Fundamental Limits and Optimal Spectral Methods
- MatrixNet: Learning over symmetry groups using learned group representations
- Matryoshka Query Transformer for Large Vision-Language Models
- MaVEn: An Effective Multi-granularity Hybrid Visual Encoding Framework for Multimodal Large Language Model
- Maximizing utility in multi-agent environments by anticipating the behavior of other learners
- Maximum Entropy Inverse Reinforcement Learning of Diffusion Models with Energy-Based Models
- Maximum Entropy Reinforcement Learning via Energy-Based Normalizing Flow
- MC-DiT: Contextual Enhancement via Clean-to-Clean Reconstruction for Masked Diffusion Models
- MDAgents: An Adaptive Collaboration of LLMs for Medical Decision-Making
- Mean-Field Analysis for Learning Subspace-Sparse Polynomials with Gaussian Input
- Mean-Field Langevin Dynamics for Signed Measures via a Bilevel Approach
- Meaningful Learning: Enhancing Abstract Reasoning in Large Language Models via Generic Fact Guidance
- Measuring Dejavu Memorization Efficiently
- Measuring Goal-Directedness
- Measuring Multimodal Mathematical Reasoning with MATH-Vision Dataset
- Measuring Mutual Policy Divergence for Multi-Agent Sequential Exploration
- Measuring Per-Unit Interpretability at Scale Without Humans
- Measuring Progress in Dictionary Learning for Language Model Interpretability with Board Game Models
- MECD: Unlocking Multi-Event Causal Discovery in Video Reasoning
- Mechanism design augmented with output advice
- MedCalc-Bench: Evaluating Large Language Models for Medical Calculations
- Medformer: A Multi-Granularity Patching Transformer for Medical Time-Series Classification
- MediQ: Question-Asking LLMs and a Benchmark for Reliable Interactive Clinical Reasoning
- MedJourney: Benchmark and Evaluation of Large Language Models over Patient Clinical Journey
- Med-Real2Sim: Non-Invasive Medical Digital Twins using Physics-Informed Self-Supervised Learning
- MedSafetyBench: Evaluating and Improving the Medical Safety of Large Language Models
- Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
- MeLLoC: Lossless Compression with High-order Mechanism Learning
- Melting Pot Contest: Charting the Future of Generalized Cooperative Intelligence
- Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt Calibration
- Membership Inference Attacks against Large Vision-Language Models
- Membership Inference on Text-to-Image Diffusion Models via Conditional Likelihood Discrepancy
- MeMo: Meaningful, Modular Controllers via Noise Injection
- Memorize What Matters: Emergent Scene Decomposition from Multitraverse
- Memory-Efficient Gradient Unrolling for Large-Scale Bi-level Optimization
- Memory-Efficient LLM Training with Online Subspace Descent
- MemoryFormer: Minimize Transformer Computation by Removing Fully-Connected Layers
- MemVLT: Vision-Language Tracking with Adaptive Memory-based Prompts
- Me, Myself, and AI: The Situational Awareness Dataset (SAD) for LLMs
- MEQA: A Benchmark for Multi-hop Event-centric Question Answering with Explanations
- Mercury: A Code Efficiency Benchmark for Code Large Language Models
- Mesa-Extrapolation: A Weave Position Encoding Method for Enhanced Extrapolation in LLMs
- MeshFormer: High-Quality Mesh Generation with 3D-Guided Reconstruction Model
- MeshXL: Neural Coordinate Field for Generative 3D Foundation Models
- Meta 3D AssetGen: Text-to-Mesh Generation with High-Quality Geometry, Texture, and PBR Materials
- MetaAligner: Towards Generalizable Multi-Objective Alignment of Language Models
- Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving
- Meta-Controller: Few-Shot Imitation of Unseen Embodiments and Tasks in Continuous Control
- MetaCURL: Non-stationary Concave Utility Reinforcement Learning
- Meta-Diffu$B$: A Contextualized Sequence-to-Sequence Text Diffusion Model with Meta-Exploration
- Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement
- Meta-Exploiting Frequency Prior for Cross-Domain Few-Shot Learning
- MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map
- Meta-Learning Universal Priors Using Non-Injective Change of Variables
- Meta-Reinforcement Learning with Universal Policy Adaptation: Provable Near-Optimality under All-task Optimum Comparator
- MetaUAS: Universal Anomaly Segmentation with One-Prompt Meta-Learning
- Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models
- Metric Flow Matching for Smooth Interpolations on the Data Manifold
- Metric from Human: Zero-shot Monocular Metric Depth Estimation via Test-time Adaptation
- Metric Space Magnitude for Evaluating the Diversity of Latent Representations
- Metric Transforms and Low Rank Representations of Kernels for Fast Attention
- Metrizing Weak Convergence with Maximum Mean Discrepancies
- MGF: Mixed Gaussian Flow for Diverse Trajectory Prediction
- MG-Net: Learn to Customize QAOA with Circuit Depth Awareness
- MicroAdam: Accurate Adaptive Optimization with Low Space Overhead and Provable Convergence
- Microstructures and Accuracy of Graph Recall by Large Language Models
- MIDGArD: Modular Interpretable Diffusion over Graphs for Articulated Designs
- MILP-StuDio: MILP Instance Generation via Block Structure Decomposition
- Mimicking To Dominate: Imitation Learning Strategies for Success in Multiagent Games
- MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes
- MindMerger: Efficiently Boosting LLM Reasoning in non-English Languages
- Mind's Eye of LLMs: Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models
- Mind the Gap: A Causal Perspective on Bias Amplification in Prediction & Decision-Making
- Mind the Gap Between Prototypes and Images in Cross-domain Finetuning
- Mind the Graph When Balancing Data for Fairness or Robustness
- MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention
- MiniCache: KV Cache Compression in Depth Dimension for Large Language Models
- Minimax Optimal and Computationally Efficient Algorithms for Distributionally Robust Offline Reinforcement Learning
- Minimizing UCB: a Better Local Search Strategy in Local Bayesian Optimization
- Minimum Entropy Coupling with Bottleneck
- Mining and Transferring Feature-Geometry Coherence for Unsupervised Point Cloud Registration
- Mini-Sequence Transformers: Optimizing Intermediate Memory for Long Sequences Training
- MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens
- MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions
- Mirror and Preconditioned Gradient Descent in Wasserstein Space
- MiSO: Optimizing brain stimulation to create neural activity states
- Mission Impossible: A Statistical Perspective on Jailbreaking LLMs
- Mitigating Backdoor Attack by Injecting Proactive Defensive Backdoor
- Mitigating Biases in Blackbox Feature Extractors for Image Classification Tasks
- Mitigating Covariate Shift in Behavioral Cloning via Robust Stationary Distribution Correction
- Mitigating Object Hallucination via Concentric Causal Attention
- Mitigating Partial Observability in Decision Processes via the Lambda Discrepancy
- Mitigating Reward Overoptimization via Lightweight Uncertainty Estimation
- Mitigating Spurious Correlations via Disagreement Probability
- Mixed Dynamics In Linear Networks: Unifying the Lazy and Active Regimes
- MixEval: Deriving Wisdom of the Crowd from LLM Benchmark Mixtures
- Mixture of Adversarial LoRAs: Boosting Robust Generalization in Meta-Tuning
- Mixture of Demonstrations for In-Context Learning
- Mixture of Experts Meets Prompt-Based Continual Learning
- Mixture of In-Context Experts Enhance LLMs' Long Context Awareness
- Mixture of Link Predictors on Graphs
- Mixture of Nested Experts: Adaptive Processing of Visual Tokens
- Mixture of neural fields for heterogeneous reconstruction in cryo-EM
- Mixture of Scales: Memory-Efficient Token-Adaptive Binarization for Large Language Models
- Mixture of Tokens: Continuous MoE through Cross-Example Aggregation
- Mixtures of Experts for Audio-Visual Learning
- MKGL: Mastery of a Three-Word Language
- MLLMGuard: A Multi-dimensional Safety Evaluation Suite for Multimodal Large Language Models
- ML with New Compute Paradigms
- MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding
- MmCows: A Multimodal Dataset for Dairy Cattle Monitoring
- MMDU: A Multi-Turn Multi-Image Dialog Understanding Benchmark and Instruction-Tuning Dataset for LVLMs
- MMLONGBENCH-DOC: Benchmarking Long-context Document Understanding with Visualizations
- MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark
- MMM-RS: A Multi-modal, Multi-GSD, Multi-scene Remote Sensing Dataset and Benchmark for Text-to-Image Generation
- MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations
- MMSite: A Multi-modal Framework for the Identification of Active Sites in Proteins
- MM-WLAuslan: Multi-View Multi-Modal Word-Level Australian Sign Language Recognition Dataset
- Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration
- Mobility-LLM: Learning Visiting Intentions and Travel Preference from Human Mobility Data with Large Language Models
- MO-DDN: A Coarse-to-Fine Attribute-based Exploration Agent for Multi-Object Demand-driven Navigation
- Model-based Diffusion for Trajectory Optimization
- Model Based Inference of Synaptic Plasticity Rules
- Model-Based Transfer Learning for Contextual Reinforcement Learning
- Model Collapse Demystified: The Case of Regression
- Model Decides How to Tokenize: Adaptive DNA Sequence Tokenization with MxDNA
- Model-free Low-Rank Reinforcement Learning via Leveraged Entry-wise Matrix Estimation
- Model Fusion through Bayesian Optimization in Language Model Fine-Tuning
- Modeling Latent Neural Dynamics with Gaussian Process Switching Linear Dynamical Systems
- Model LEGO: Creating Models Like Disassembling and Assembling Building Blocks
- Model Reconstruction Using Counterfactual Explanations: A Perspective From Polytope Theory
- Model Sensitivity Aware Continual Learning
- MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks
- MoEUT: Mixture-of-Experts Universal Transformers
- MoGenTS: Motion Generation based on Spatial-Temporal Joint Modeling
- MoGU: A Framework for Enhancing Safety of LLMs While Preserving Their Usability
- Molecule Design by Latent Prompt Transformer
- Molecule Generation with Fragment Retrieval Augmentation
- MoLE: Enhancing Human-centric Text-to-image Diffusion via Mixture of Low-rank Experts
- MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models
- MomentumSMoE: Integrating Momentum into Sparse Mixture of Experts
- MoMu-Diffusion: On Learning Long-Term Motion-Music Synchronization and Correspondence
- MonkeySee: Space-time-resolved reconstructions of natural images from macaque multi-unit activity
- Monoculture in Matching Markets
- MonoMAE: Enhancing Monocular 3D Detection through Depth-Aware Masked Autoencoders
- Monomial Matrix Group Equivariant Neural Functional Networks
- Monte Carlo Tree Search based Space Transfer for Black Box Optimization
- Most Influential Subset Selection: Challenges, Promises, and Beyond
- MOTE-NAS: Multi-Objective Training-based Estimate for Efficient Neural Architecture Search
- MoTE: Reconciling Generalization with Specialization for Visual-Language to Video Knowledge Transfer
- MOTI$\mathcal{V}\mathcal{E}$: A Drug-Target Interaction Graph For Inductive Link Prediction
- Motif-oriented influence maximization for viral marketing in large-scale social networks
- MotionBooth: Motion-Aware Customized Text-to-Video Generation
- Motion Consistency Model: Accelerating Video Diffusion with Disentangled Motion-Appearance Distillation
- MotionCraft: Physics-Based Zero-Shot Video Generation
- Motion Forecasting in Continuous Driving
- Motion Graph Unleashed: A Novel Approach to Video Prediction
- MotionGS: Exploring Explicit Motion Guidance for Deformable 3D Gaussian Splatting
- MotionTTT: 2D Test-Time-Training Motion Estimation for 3D Motion Corrected MRI
- MoVA: Adapting Mixture of Vision Experts to Multimodal Context
- Moving Off-the-Grid: Scene-Grounded Video Representations
- MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs
- MSA Generation with Seqs2Seqs Pretraining: Advancing Protein Structure Predictions
- MSAGPT: Neural Prompting Protein Structure Prediction via MSA Generative Pre-Training
- MSPE: Multi-Scale Patch Embedding Prompts Vision Transformers to Any Resolution
- MTGS: A Novel Framework for Multi-Person Temporal Gaze Following and Social Gaze Prediction
- Muharaf: Manuscripts of Handwritten Arabic Dataset for Cursive Text Recognition
- Multi-Agent Coordination via Multi-Level Communication
- Multi-Agent Domain Calibration with a Handful of Offline Data
- Multi-Agent Imitation Learning: Value is Easy, Regret is Hard
- Multi-Chain Graphs of Graphs: A New Paradigm in Blockchain Dataset
- Multiclass Transductive Online Learning
- Multidimensional Fractional Programming for Normalized Cuts
- Multi-Group Proportional Representation in Retrieval
- Multi-Head Mixture-of-Experts
- Multi-hypotheses Conditioned Point Cloud Diffusion for 3D Human Reconstruction from Occluded Images
- Multi-Instance Partial-Label Learning with Margin Adjustment
- Multi-Label Learning with Stronger Consistency Guarantees
- Multi-Label Open Set Recognition
- Multi-language Diversity Benefits Autoformalization
- Multilinear Mixture of Experts: Scalable Expert Specialization through Factorization
- Multilingual Diversity Improves Vision-Language Representations
- Multi-LLM Debate: Framework, Principals, and Interventions
- Multimodal Algorithmic Reasoning Workshop
- Multimodal Large Language Models Make Text-to-Image Generative Models Align Better
- Multi-modal Situated Reasoning in 3D Scenes
- Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning
- Multi-modal Transfer Learning between Biological Foundation Models
- Multi-model Ensemble Conformal Prediction in Dynamic Environments
- Multi-Object 3D Grounding with Dynamic Modules and Language-Informed Spatial Attention
- Multi-Object Hallucination in Vision Language Models
- MultiOOD: Scaling Out-of-Distribution Detection for Multiple Modalities
- MultiOrg: A Multi-rater Organoid-detection Dataset
- Multiple Physics Pretraining for Spatiotemporal Surrogate Models
- MultiPull: Detailing Signed Distance Functions by Pulling Multi-Level Queries at Multi-Step
- Multi-Reward Best Policy Identification
- Multi-scale Consistency for Robust 3D Registration via Hierarchical Sinkhorn Tree
- Multi-Scale Representation Learning for Protein Fitness Prediction
- Multi-Scale VMamba: Hierarchy in Hierarchy Visual State Space Model
- Multistable Shape from Shading Emerges from Patch Diffusion
- Multi-Stage Predict+Optimize for (Mixed Integer) Linear Programs
- Multistep Distillation of Diffusion Models via Moment Matching
- Multi-times Monte Carlo Rendering for Inter-reflection Reconstruction
- Multi-turn Reinforcement Learning with Preference Human Feedback
- Multivariate Probabilistic Time Series Forecasting with Correlated Errors
- Multivariate Stochastic Dominance via Optimal Transport and Applications to Models Benchmarking
- Multi-view Masked Contrastive Representation Learning for Endoscopic Video Analysis
- Multiview Scene Graph
- Multi-Winner Reconfiguration
- Muscles in Time: Learning to Understand Human Motion In-Depth by Simulating Muscle Activations
- MutaPLM: Protein Language Modeling for Mutation Explanation and Engineering
- Multi-Armed Bandits with Network Interference
- Mutual Information Estimation via $f$-Divergence and Data Derangements
- Mutual Information Estimation via Normalizing Flows
- MUVERA: Multi-Vector Retrieval via Fixed Dimensional Encoding
- MV2Cyl: Reconstructing 3D Extrusion Cylinders from Multi-View Images
- MVGamba: Unify 3D Content Generation as State Space Sequence Modeling
- MVInpainter: Learning Multi-View Consistent Inpainting to Bridge 2D and 3D Editing
- MVSDet: Multi-View Indoor 3D Object Detection via Efficient Plane Sweeps
- MVSplat360: Feed-Forward 360 Scene Synthesis from Sparse Views
- N-agent Ad Hoc Teamwork
- NanoBaseLib: A Multi-Task Benchmark Dataset for Nanopore Sequencing
- NaRCan: Natural Refined Canonical Image with Integration of Diffusion Prior for Video Editing
- NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples
- Natural Counterfactuals With Necessary Backtracking
- Nature-Inspired Local Propagation
- Navigable Graphs for High-Dimensional Nearest Neighbor Search: Constructions and Limits
- Navigating Chemical Space with Latent Flows
- Navigating Extremes: Dynamic Sparsity in Large Output Spaces
- Navigating the Effect of Parametrization for Dimensionality Reduction
- Navigating the Maze of Explainable AI: A Systematic Approach to Evaluating Methods and Metrics
- Navigating the Safety Landscape: Measuring Risks in Finetuning Large Language Models
- NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking
- Nearest Neighbor Speculative Decoding for LLM Generation and Attribution
- Nearly Minimax Optimal Regret for Multinomial Logistic Bandit
- Nearly Minimax Optimal Submodular Maximization with Bandit Feedback
- Nearly Optimal Approximation of Matrix Functions by the Lanczos Method
- Nearly Tight Black-Box Auditing of Differentially Private Machine Learning
- Near-Minimax-Optimal Distributional Reinforcement Learning with a Generative Model
- Near-Optimal Distributed Minimax Optimization under the Second-Order Similarity
- Near-Optimal Distributionally Robust Reinforcement Learning with General $L_p$ Norms
- Near-Optimal Dynamic Regret for Adversarial Linear Mixture MDPs
- Near-Optimality of Contrastive Divergence Algorithms
- Near-Optimal Streaming Heavy-Tailed Statistical Estimation with Clipped SGD
- Needle In A Multimodal Haystack
- Neglected Hessian component explains mysteries in sharpness regularization
- NeoRL: Efficient Exploration for Nonepisodic RL
- Nesterov acceleration despite very noisy gradients
- NetworkGym: Reinforcement Learning Environments for Multi-Access Traffic Management in Network Simulation
- Neuc-MDS: Non-Euclidean Multidimensional Scaling Through Bilinear Forms
- NeuMA: Neural Material Adaptor for Visual Grounding of Intrinsic Dynamics
- Neur2BiLO: Neural Bilevel Optimization
- Neural Assets: 3D-Aware Multi-Object Scene Synthesis with Image Diffusion Models
- Neural Characteristic Activation Analysis and Geometric Parameterization for ReLU Networks
- NeuralClothSim: Neural Deformation Fields Meet the Thin Shell Theory
- Neural Collapse Inspired Feature Alignment for Out-of-Distribution Generalization
- Neural Collapse To Multiple Centers For Imbalanced Data
- Neural collapse vs. low-rank bias: Is deep neural collapse really optimal?
- Neural Combinatorial Optimization for Robust Routing Problem with Uncertain Travel Times
- Neural Concept Binder
- Neural Conditional Probability for Uncertainty Quantification
- Neural Cover Selection for Image Steganography
- Neural decoding from stereotactic EEG: accounting for electrode variability across subjects
- Neural Embeddings Rank: Aligning 3D latent dynamics with movements
- Neural Experts: Mixture of Experts for Implicit Neural Representations
- Neural Flow Diffusion Models: Learnable Forward Process for Improved Diffusion Modelling
- NeuralFluid: Neural Fluidic System Design and Control with Differentiable Simulation
- NeuralFuse: Learning to Recover the Accuracy of Access-Limited Neural Network Inference in Low-Voltage Regimes
- Neural Gaffer: Relighting Any Object via Diffusion
- Neural Isometries: Taming Transformations for Equivariant ML
- Neural Krylov Iteration for Accelerating Linear System Solving
- Neural Localizer Fields for Continuous 3D Human Pose and Shape Estimation
- Neural Model Checking
- Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit
- Neural Network Reparametrization for Accelerated Optimization in Molecular Simulations
- Neural P$^3$M: A Long-Range Interaction Modeling Enhancer for Geometric GNNs
- Neural Persistence Dynamics
- Neural Pfaffians: Solving Many Many-Electron Schrödinger Equations
- NeuralPlane: An Efficiently Parallelizable Platform for Fixed-wing Aircraft Control with Reinforcement Learning
- Neural Pose Representation Learning for Generating and Transferring Non-Rigid Object Poses
- Neural Residual Diffusion Models for Deep Scalable Vision Generation
- Neural Signed Distance Function Inference through Splatting 3D Gaussians Pulled on Zero-Level Set
- NeuralSolver: Learning Algorithms For Consistent and Efficient Extrapolation Across General Tasks
- NeuralSteiner: Learning Steiner Tree for Overflow-avoiding Global Routing in Chip Design
- NeurIPS 2024 Workshop: Machine Learning and the Physical Sciences
- NeurIPS'24 Workshop on Causal Representation Learning
- NeuroAI: Fusing Neuroscience and AI for Intelligent Solutions
- NeuroBOLT: Resting-state EEG-to-fMRI Synthesis with Multi-dimensional Feature Mapping
- NeuroClips: Towards High-fidelity and Smooth fMRI-to-Video Reconstruction
- NeuRodin: A Two-stage Framework for High-Fidelity Neural Surface Reconstruction
- NeuroGauss4D-PCI: 4D Neural Fields and Gaussian Deformation Fields for Point Cloud Interpolation
- Neuronal Competition Groups with Supervised STDP for Spike-Based Classification
- Neuro-Symbolic Data Generation for Math Reasoning
- Neuro-Vision to Language: Enhancing Brain Recording-based Visual Reconstruction and Language Interaction
- Newswire: A Large-Scale Structured Database of a Century of Historical News
- NewTerm: Benchmarking Real-Time New Terms for Large Language Models with Annual Updates
- Newton Informed Neural Operator for Computing Multiple Solutions of Nonlinear Partial Differential Equations
- Newton Losses: Using Curvature Information for Learning with Differentiable Algorithms
- Nimbus: Secure and Efficient Two-Party Inference for Transformers
- NN4SysBench: Characterizing Neural Network Verification for Computer Systems
- Noether's Razor: Learning Conserved Quantities
- No Filter: Cultural and Socioeconomic Diversity in Contrastive Vision-Language Models
- No Free Delivery Service: Epistemic limits of passive data collection in complex social systems
- No Free Lunch in LLM Watermarking: Trade-offs in Watermarking Design Choices
- No Free Lunch Theorem and Black-Box Complexity Analysis for Adversarial Optimisation
- Noise-Aware Differentially Private Regression via Meta-Learning
- Noise Contrastive Alignment of Language Models with Explicit Rewards
- NoiseGPT: Label Noise Detection and Rectification through Probability Curvature
- Noisy Dual Mirror Descent: A Near Optimal Algorithm for Jointly-DP Convex Resource Allocation
- NoisyGL: A Comprehensive Benchmark for Graph Neural Networks under Label Noise
- Noisy Label Learning with Instance-Dependent Outliers: Identifiability via Crowd Wisdom
- Noisy Ostracods: A Fine-Grained, Imbalanced Real-World Dataset for Benchmarking Robust Machine Learning and Label Correction Methods
- NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention
- Non-asymptotic Analysis of Biased Adaptive Stochastic Approximation
- Non-asymptotic Approximation Error Bounds of Parameterized Quantum Circuits
- Non-asymptotic Convergence of Training Transformers for Next-token Prediction
- Non-asymptotic Global Convergence Analysis of BFGS with the Armijo-Wolfe Line Search
- Non-Asymptotic Uncertainty Quantification in High-Dimensional Learning
- Nonconvex Federated Learning on Compact Smooth Submanifolds With Heterogeneous Data
- Non-convolutional graph neural networks
- Non-Euclidean Mixture Model for Social Network Embedding
- Non-geodesically-convex optimization in the Wasserstein space
- Nonlinear dynamics of localization in neural receptive fields
- Nonlocal Attention Operator: Materializing Hidden Knowledge Towards Interpretable Physics Discovery
- Nonparametric Classification on Low Dimensional Manifolds using Overparameterized Convolutional Residual Networks
- Non-parametric classification via expand-and-sparsify representation
- Nonparametric Evaluation of Noisy ICA Solutions
- Nonparametric Instrumental Variable Regression through Stochastic Approximate Gradients
- Nonparametric Regression for 3D Point Cloud Learning
- Non-Stationary Learning of Neural Networks with Automatic Soft Parameter Reset
- Nonstationary Sparse Spectral Permanental Process
- No-Regret Bandit Exploration based on Soft Tree Ensemble Model
- No-Regret Learning for Fair Multi-Agent Social Welfare Optimization
- No-regret Learning in Harmonic Games: Extrapolation in the Face of Conflicting Interests
- No-Regret M${}^{\natural}$-Concave Function Maximization: Stochastic Bandit Algorithms and NP-Hardness of Adversarial Full-Information Setting
- No Regrets: Investigating and Improving Regret Approximations for Curriculum Discovery
- No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO
- Normal-GS: 3D Gaussian Splatting with Normal-Involved Rendering
- Normalization and effective learning rates in reinforcement learning
- Normalization Layer Per-Example Gradients are Sufficient to Predict Gradient Noise Scale in Transformers
- Norms for Managing Datasets: A Systematic Review of NeurIPS Datasets
- Not All Diffusion Model Activations Have Been Evaluated as Discriminative Features
- Not All Tokens Are What You Need for Pretraining
- Not Just Object, But State: Compositional Incremental Learning without Forgetting
- No Train, all Gain: Self-Supervised Gradients Improve Deep Frozen Representations
- Not so griddy: Internal representations of RNNs path integrating more than one agent
- Novel Object Synthesis via Adaptive Text-Image Harmony
- NovoBench: Benchmarking Deep Learning-based *De Novo* Sequencing Methods in Proteomics
- No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance
- Nuclear Fusion Diamond Polishing Dataset
- Nuclear Norm Regularization for Deep Learning
- Numerically Stable Sparse Gaussian Processes via Minimum Separation using Cover Trees
- NVRC: Neural Video Representation Compression
- NYU CTF Dataset: A Scalable Open-Source Benchmark Dataset for Evaluating LLMs in Offensive Security
- OAM-TCD: A globally diverse dataset of high-resolution tree cover maps
- OASIS: Conditional Distribution Shaping for Offline Safe Reinforcement Learning
- Object segmentation from common fate: Motion energy processing enables human-like zero-shot generalization to random dot stimuli
- Observational Scaling Laws and the Predictability of Language Model Performance
- OccamLLM: Fast and Exact Language Model Arithmetic in a Single Step
- OccFusion: Rendering Occluded Humans with Generative Diffusion Priors
- Occupancy-based Policy Gradient: Estimation, Convergence, and Optimality
- Octopus: A Multi-modal LLM with Parallel Recognition and Sequential Understanding
- OctreeOcc: Efficient and Multi-Granularity Occupancy Prediction Using Octree Queries
- ODGEN: Domain-specific Object Detection Data Generation with Diffusion Models
- ODGS: 3D Scene Reconstruction from Omnidirectional Images with 3D Gaussian Splattings
- ODRL: A Benchmark for Off-Dynamics Reinforcement Learning
- Off-Dynamics Reinforcement Learning via Domain Adaptation and Reward Augmented Imitation
- Offline Behavior Distillation
- Offline Multitask Representation Learning for Reinforcement Learning
- Offline Oracle-Efficient Learning for Contextual MDPs via Layerwise Exploration-Exploitation Tradeoff
- Offline Reinforcement Learning with OOD State Correction and OOD Action Suppression
- Off-policy estimation with adaptively collected data: the power of online learning
- Off-Policy Selection for Initiating Human-Centric Experimental Design
- Off to new Shores: A Dataset & Benchmark for (near-)coastal Flood Inundation Forecasting
- Oja's Algorithm for Streaming Sparse PCA
- OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI
- OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding
- Omnigrasp: Simulated Humanoid Grasping on Diverse Objects
- OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents
- OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation
- On $f$-Divergence Principled Domain Adaptation: An Improved Framework
- On Affine Homotopy between Language Encoders
- On Causal Discovery in the Presence of Deterministic Relations
- Once Read is Enough: Domain-specific Pretraining-free Language Models with Cluster-guided Sparse Experts for Long-tail Domain Knowledge
- On conditional diffusion models for PDE simulations
- On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions
- On Differentially Private Subspace Estimation in a Distribution-Free Setting
- On Differentially Private U Statistics
- On Divergence Measures for Training GFlowNets
- OneActor: Consistent Subject Generation via Cluster-Conditioned Guidance
- OneBit: Towards Extremely Low-bit Large Language Models
- One for All: Multi-Domain Joint Training for Point Cloud Based 3D Object Detection
- One-Layer Transformer Provably Learns One-Nearest Neighbor In Context
- OneRef: Unified One-tower Expression Grounding and Segmentation with Mask Referring Modeling
- One Sample Fits All: Approximating All Probabilistic Values Simultaneously and Efficiently
- One-shot Federated Learning via Synthetic Distiller-Distillate Communication
- One-Shot Safety Alignment for Large Language Models via Optimal Dualization
- One-Step Diffusion Distillation through Score Implicit Matching
- One-Step Effective Diffusion Network for Real-World Image Super-Resolution
- One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos
- One-to-Multiple: A Progressive Style Transfer Unsupervised Domain-Adaptive Framework for Kidney Tumor Segmentation
- One-to-Normal: Anomaly Personalization for Few-shot Anomaly Detection
- On Feature Learning in Structured State Space Models
- On Giant's Shoulders: Effortless Weak to Strong by Dynamic Logits Fusion
- On improved Conditioning Mechanisms and Pre-training Strategies for Diffusion Models
- On Learning Multi-Modal Forgery Representation for Diffusion Generated Video Detection
- Online Adaptation of Language Models with a Memory of Amortized Contexts
- Online Bayesian Persuasion Without a Clue
- Online Budgeted Matching with General Bids
- Online Classification with Predictions
- Online Composite Optimization Between Stochastic and Adversarial Environments
- Online Consistency of the Nearest Neighbor Rule
- Online Control in Population Dynamics
- Online Control with Adversarial Disturbance for Continuous-time Linear Systems
- Online Convex Optimisation: The Optimal Switching Regret for all Segmentations Simultaneously
- Online Estimation via Offline Estimation: An Information-Theoretic Framework
- Online Feature Updates Improve Online (Generalized) Label Shift Adaptation
- Online Iterative Reinforcement Learning from Human Feedback with General Preference Model
- Online Learning of Delayed Choices
- Online Learning with Sublinear Best-Action Queries
- Online Non-convex Learning in Dynamic Environments
- Online Posterior Sampling with a Diffusion Prior
- Online Relational Inference for Evolving Multi-agent Interacting Systems
- OnlineTAS: An Online Baseline for Temporal Action Segmentation
- Online Weighted Paging with Unknown Weights
- Only Strict Saddles in the Energy Landscape of Predictive Coding Networks?
- On Mesa-Optimization in Autoregressively Trained Transformers: Emergence and Capability
- On Neural Networks as Infinite Tree-Structured Probabilistic Graphical Models
- On provable privacy vulnerabilities of graph representations
- On-Road Object Importance Estimation: A New Dataset and A Model with Multi-Fold Top-Down Guidance
- On Sampling Strategies for Spectral Model Sharding
- On scalable oversight with weak LLMs judging strong LLMs
- On Socially Fair Low-Rank Approximation and Column Subset Selection
- On Softmax Direct Preference Optimization for Recommendation
- On Sparse Canonical Correlation Analysis
- On Statistical Rates and Provably Efficient Criteria of Latent Diffusion Transformers (DiTs)
- On the Ability of Developers' Training Data Preservation of Learnware
- On the Adversarial Robustness of Benjamini Hochberg
- On the Benefits of Public Representations for Private Transfer Learning under Distribution Shift
- On the cohesion and separability of average-link for hierarchical agglomerative clustering
- On the Comparison between Multi-modal and Single-modal Contrastive Learning
- On the Complexity of Identification in Linear Structural Causal Models
- On the Complexity of Learning Sparse Functions with Statistical and Gradient Queries
- On the Complexity of Teaching a Family of Linear Behavior Cloning Learners
- On the Computational Complexity of Private High-dimensional Model Selection
- On the Computational Landscape of Replicable Learning
- On the Convergence of Loss and Uncertainty-based Active Learning Algorithms
- On the Curses of Future and History in Future-dependent Value Functions for Off-policy Evaluation
- On the Effects of Data Scale on Computer Control Agents
- On the Efficiency of ERM in Feature Learning
- On the Expressive Power of Tree-Structured Probabilistic Circuits
- On the Expressivity and Sample Complexity of Node-Individualized Graph Neural Networks
- On the Identifiability of Hybrid Deep Generative Models: Meta-Learning as a Solution
- On the Identifiability of Poisson Branching Structural Causal Model Using Probability Generating Function
- On the Impact of Feature Heterophily on Link Prediction with Graph Neural Networks
- On the Impacts of the Random Initialization in the Neural Tangent Kernel Theory
- On the Inductive Bias of Stacking Towards Improving Reasoning
- On the Limitations of Fractal Dimension as a Measure of Generalization
- On the Minimax Regret for Contextual Linear Bandits and Multi-Armed Bandits with Expert Advice
- On the Necessity of Collaboration for Online Model Selection with Decentralized Data
- On the Noise Robustness of In-Context Learning for Text Generation
- On the Optimality of Dilated Entropy and Lower Bounds for Online Learning in Extensive-Form Games
- On the Optimal Time Complexities in Decentralized Stochastic Asynchronous Optimization
- On the Parameter Identifiability of Partially Observed Linear Causal Models
- On the Power of Decision Trees in Auto-Regressive Language Modeling
- On the Power of Small-size Graph Neural Networks for Linear Programming
- On the Reproducibility of: "Learning Perturbations to Explain Time Series Predictions"
- On the Robustness of Spectral Algorithms for Semirandom Stochastic Block Models
- On the Role of Attention Masks and LayerNorm in Transformers
- On the Role of Information Structure in Reinforcement Learning for Partially-Observable Sequential Teams and Games
- On the Saturation Effects of Spectral Algorithms in Large Dimensions
- On the Scalability of Certified Adversarial Robustness with Generated Data
- On the Scalability of GNNs for Molecular Graphs
- On the Sparsity of the Strong Lottery Ticket Hypothesis
- On the Stability and Generalization of Meta-Learning
- On the Surprising Effectiveness of Attention Transfer for Vision Transformers
- On the Target-kernel Alignment: a Unified Analysis with Kernel Complexity
- On the Use of Anchoring for Training Vision Models
- On the Worst Prompt Performance of Large Language Models
- On Tractable $\Phi$-Equilibria in Non-Concave Games
- On Weak Regret Analysis for Dueling Bandits
- OPEL: Optimal Transport Guided ProcedurE Learning
- Open-Book Neural Algorithmic Reasoning
- OpenCDA-Loop: A Closed-loop Benchmarking Platform for End-to-end Evaluation of Cooperative Perception
- OpenDebateEvidence: A Massive-Scale Argument Mining and Summarization Dataset
- OpenDlign: Open-World Point Cloud Understanding with Depth-Aligned Images
- OpenGaussian: Towards Point-Level 3D Gaussian-based Open Vocabulary Understanding
- Opening the Language Model Pipeline: A Tutorial on Data Preparation, Model Training, and Adaptation
- Open LLMs are Necessary for Current Private Adaptations and Outperform their Closed Alternatives
- OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset
- OpenSatMap: A Fine-grained High-resolution Satellite Dataset for Large-scale Map Construction
- Open-Vocabulary Object Detection via Language Hierarchy
- OPERA: Automatic Offline Policy Evaluation with Re-weighted Aggregates of Multiple Estimators
- Operator World Models for Reinforcement Learning
- Opponent Modeling based on Subgoal Inference
- Opponent Modeling with In-context Search
- OptEx: Expediting First-Order Optimization with Approximately Parallelized Iterations
- Optical Diffusion Models for Image Generation
- Optimal ablation for interpretability
- Optimal Aggregation of Prediction Intervals under Unsupervised Domain Shift
- Optimal Algorithms for Augmented Testing of Discrete Distributions
- Optimal Algorithms for Learning Partitions with Faulty Oracles
- Optimal Algorithms for Online Convex Optimization with Adversarial Constraints
- Optimal and Approximate Adaptive Stochastic Quantization
- Optimal Batched Best Arm Identification
- Optimal Classification under Performative Distribution Shift
- Optimal Clustering with Bandit Feedback
- Optimal deep learning of holomorphic operators between Banach spaces
- Optimal Design for Human Preference Elicitation
- Optimal Flow Matching: Learning Straight Trajectories in Just One Step
- Optimal Hypothesis Selection in (Almost) Linear Time
- Optimal Multiclass U-Calibration Error and Beyond
- Optimal Multi-Fidelity Best-Arm Identification
- Optimal Parallelization of Boosting
- Optimal Private and Communication Constraint Distributed Goodness-of-Fit Testing for Discrete Distributions in the Large Sample Regime
- Optimal Rates for Vector-Valued Spectral Regularization Learning Algorithms
- Optimal Scalarizations for Sublinear Hypervolume Regret
- Optimal-state Dynamics Estimation for Physics-based Human Motion Capture from Videos
- Optimal Top-Two Method for Best Arm Identification and Fluid Analysis
- Optimal Transport-based Labor-free Text Prompt Modeling for Sketch Re-identification
- Optimistic Critic Reconstruction and Constrained Fine-Tuning for General Offline-to-Online RL
- Optimistic Verifiable Training by Controlling Hardware Nondeterminism
- Optimization Algorithm Design via Electric Circuits
- Optimization-based Causal Estimation from Heterogeneous Environments
- Optimization Can Learn Johnson Lindenstrauss Embeddings
- Optimization for ML Workshop
- Optimized Feature Generation for Tabular Data via LLMs with Decision Tree Reasoning
- Optimizing Automatic Differentiation with Deep Reinforcement Learning
- Optimizing over Multiple Distributions under Generalized Quasar-Convexity Condition
- Optimizing the coalition gain in Online Auctions with Greedy Structured Bandits
- Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks
- OPUS: Occupancy Prediction Using a Sparse Set
- Oracle-Efficient Differentially Private Learning with Public Data
- Oracle-Efficient Reinforcement Learning for Max Value Ensembles
- Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling
- Ordered Momentum for Asynchronous SGD
- Order-Independence Without Fine Tuning
- Ordering-Based Causal Discovery for Linear and Nonlinear Relations
- OSLO: One-Shot Label-Only Membership Inference Attacks
- OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
- OT4P: Unlocking Effective Orthogonal Group Path for Permutation Relaxation
- OTTER: Effortless Label Distribution Adaptation of Zero-shot Models
- Outlier-Robust Distributionally Robust Optimization via Unbalanced Optimal Transport
- Out-of-Distribution Detection with a Single Unconditional Diffusion Model
- Out-Of-Distribution Detection with Diversification (Provably)
- Out-of-Distribution Generalization: Shortcuts, Spuriousness, and Stability
- Overcoming Brittleness in Pareto-Optimal Learning Augmented Algorithms
- Overcoming Common Flaws in the Evaluation of Selective Classification Systems
- Overcoming the Sim-to-Real Gap: Leveraging Simulation to Learn to Explore for Real-World RL
- Overfitting Behaviour of Gaussian Kernel Ridgeless Regression: Varying Bandwidth or Dimensionality
- Over-parameterized Student Model via Tensor Decomposition Boosted Knowledge Distillation
- OVT-B: A New Large-Scale Benchmark for Open-Vocabulary Multi-Object Tracking
- OwMatch: Conditional Self-Labeling with Consistency for Open-world Semi-Supervised Learning
- OW-VISCapTor: Abstractors for Open-World Video Instance Segmentation and Captioning
- OxonFair: A Flexible Toolkit for Algorithmic Fairness
- P$^2$C$^2$Net: PDE-Preserved Coarse Correction Network for efficient prediction of spatiotemporal dynamics
- PAC-Bayes-Chernoff bounds for unbounded losses
- PACE: marrying the generalization of PArameter-efficient fine-tuning with Consistency rEgularization
- PACE: Pacing Operator Learning to Accurate Optical Field Simulation for Complicated Photonic Devices
- PaCE: Parsimonious Concept Engineering for Large Language Models
- PaDeLLM-NER: Parallel Decoding in Large Language Models for Named Entity Recognition
- PageRank Bandits for Link Prediction
- PaGoDA: Progressive Growing of a One-Step Generator from a Low-Resolution Diffusion Teacher
- Paloma: A Benchmark for Evaluating Language Model Fit
- Panacea: Pareto Alignment via Preference Adaptation for LLMs
- Pandora's Box: Towards Building Universal Attackers against Real-World Large Vision-Language Models
- PANORAMIA: Privacy Auditing of Machine Learning Models without Retraining
- Paralinguistics-Aware Speech-Empowered Large Language Models for Natural Conversation
- Parallel Backpropagation for Shared-Feature Visualization
- ParallelEdits: Efficient Multi-Aspect Text-Driven Image Editing with Attention Grouping
- Parallelizing Linear Transformers with the Delta Rule over Sequence Length
- Parallelizing Model-based Reinforcement Learning Over the Sequence Length
- Parameter Competition Balancing for Model Merging
- Parameter Disparities Dissection for Backdoor Defense in Heterogeneous Federated Learning
- Parameter Efficient Adaptation for Image Restoration with Heterogeneous Mixture-of-Experts
- Parameter-free Clipped Gradient Descent Meets Polyak
- Parameter-Inverted Image Pyramid Networks
- Parameterized Approximation Schemes for Fair-Range Clustering
- Parameter Symmetry and Noise Equilibrium of Stochastic Gradient Descent
- Parametric model reduction of mean-field and stochastic systems via higher-order action matching
- Pard: Permutation-Invariant Autoregressive Diffusion for Graph Generation
- Parseval Regularization for Continual Reinforcement Learning
- Parsimony or Capability? Decomposition Delivers Both in Long-term Time Series Forecasting
- Partially Observable Cost-Aware Active-Learning with Large Language Models
- Partial observation can induce mechanistic mismatches in data-constrained models of neural dynamics
- Partial Structure Discovery is Sufficient for No-regret Learning in Causal Bandits
- Partial Transportability for Domain Generalization
- Particle Semi-Implicit Variational Inference
- Paths to Equilibrium in Games
- pcaGAN: Improving Posterior-Sampling cGANs via Principal Component Regularization
- PCoTTA: Continual Test-Time Adaptation for Multi-Task Point Cloud Understanding
- PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders
- PEACE: A Dataset of Pharmaceutical Care for Cancer Pain Analgesia Evaluation and Medication Decision
- PEAC: Unsupervised Pre-training for Cross-Embodiment Reinforcement Learning
- Pearls from Pebbles: Improved Confidence Functions for Auto-labeling
- Pedestrian-Centric 3D Pre-collision Pose and Shape Estimation from Dashcam Perspective
- Pedestrian Trajectory Prediction with Missing Data: Datasets, Imputation, and Benchmarking
- PediatricsGPT: Large Language Models as Chinese Medical Assistants for Pediatric Applications
- Penalty-based Methods for Simple Bilevel Optimization under Hölderian Error Bounds
- Perceiving Longer Sequences With Bi-Directional Cross-Attention Transformers
- Perception of Knowledge Boundary for Large Language Models through Semi-open-ended Question Answering
- Perceptual Fairness in Image Restoration
- PeRFlow: Piecewise Rectified Flow as Universal Plug-and-Play Accelerator
- Performative Control for Linear Dynamical Systems
- PERIA: Perceive, Reason, Imagine, Act via Holistic Language and Vision Planning for Manipulation
- Peri-midFormer: Periodic Pyramid Transformer for Time Series Analysis
- Periodic agent-state based Q-learning for POMDPs
- Perplexity-aware Correction for Robust Alignment with Noisy Preferences
- Persistence Homology Distillation for Semi-supervised Continual Learning
- Persistent Homology for High-dimensional Data Based on Spectral Methods
- Persistent Test-time Adaptation in Recurring Testing Scenarios
- Personalized Adapter for Large Meteorology Model on Devices: Towards Weather Foundation Models
- Personalized Federated Learning via Feature Distribution Adaptation
- Personalized Federated Learning with Mixture of Models for Adaptive Prediction and Model Fine-Tuning
- Personalized Instance-based Navigation Toward User-Specific Objects in Realistic Environments
- Personalized Steering of Large Language Models: Versatile Steering Vectors Through Bi-directional Preference Optimization
- Personalizing Reinforcement Learning from Human Feedback with Variational Preference Learning
- PersonalSum: A User-Subjective Guided Personalized Summarization Dataset for Large Language Models
- PertEval: Unveiling Real Knowledge Capacity of LLMs with Knowledge-Invariant Perturbations
- Pessimistic Backward Policy for GFlowNets
- pFedClub: Controllable Heterogeneous Model Aggregation for Personalized Federated Learning
- PGN: The RNN's New Successor is Effective for Long-Range Time Series Forecasting
- Phased Consistency Models
- PhoCoLens: Photorealistic and Consistent Reconstruction in Lensless Imaging
- PhyloGen: Language Model-Enhanced Phylogenetic Inference via Graph Structure Generation
- PhyRecon: Physically Plausible Neural Scene Reconstruction
- Physical Consistency Bridges Heterogeneous Data in Molecular Multi-Task Learning
- Physically Compatible 3D Object Modeling from a Single Image
- Physics-Constrained Comprehensive Optical Neural Networks
- Physics-informed Neural Networks for Functional Differential Equations: Cylindrical Approximation and Its Convergence Guarantees
- Physics-Informed Regularization for Domain-Agnostic Dynamical System Modeling
- Physics-Informed Variational State-Space Gaussian Processes
- Physics-Regularized Multi-Modal Image Assimilation for Brain Tumor Localization
- Piecewise deterministic generative models
- Piecewise-Stationary Bandits with Knapsacks
- PINNacle: A Comprehensive Benchmark of Physics-Informed Neural Networks for Solving PDEs
- Pin-Tuning: Parameter-Efficient In-Context Tuning for Few-Shot Molecular Property Prediction
- Pipeline Parallelism with Controllable Memory
- PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models
- PIVOT-R: Primitive-Driven Waypoint-Aware World Model for Robotic Manipulation
- Plan-on-Graph: Self-Correcting Adaptive Planning of Large Language Model on Knowledge Graphs
- Plant-and-Steal: Truthful Fair Allocations via Predictions
- PLIP: Language-Image Pre-training for Person Representation Learning
- Pluralistic Alignment Workshop
- PointAD: Comprehending 3D Anomalies from Points and Pixels for Zero-shot 3D Anomaly Detection
- Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning
- PointMamba: A Simple State Space Model for Point Cloud Analysis
- Point-PRC: A Prompt Learning Based Regulation Framework for Generalizable Point Cloud Analysis
- Poisson Variational Autoencoder
- Policy Aggregation
- Policy Improvement using Language Feedback Models
- Policy Learning from Tutorial Books via Understanding, Rehearsing and Introspecting
- Policy Mirror Descent with Lookahead
- Policy Optimization for Robust Average Reward MDPs
- Policy-shaped prediction: avoiding distractions in model-based reinforcement learning
- Polyhedral Complex Derivation from Piecewise Trilinear Networks
- Polynomial-Time Computation of Exact $\Phi$-Equilibria in Polyhedral Games
- Poseidon: Efficient Foundation Models for PDEs
- Position Coupling: Improving Length Generalization of Arithmetic Transformers Using Task Structure
- Post-Hoc Reversal: Are We Selecting Models Prematurely?
- Posture-Informed Muscular Force Learning for Robust Hand Pressure Estimation
- PowerGraph: A power grid benchmark dataset for graph neural networks
- PowerPM: Foundation Model for Power Systems
- PPLNs: Parametric Piecewise Linear Networks for Event-Based Temporal Modeling and Beyond
- Practical $0.385$-Approximation for Submodular Maximization Subject to a Cardinality Constraint
- Practical Bayesian Algorithm Execution via Posterior Sampling
- Practical Shuffle Coding
- Precipitation Downscaling with Spatiotemporal Video Diffusion
- Precise asymptotics of reweighted least-squares algorithms for linear diagonal networks
- Predicting Future Actions of Reinforcement Learning Agents
- Predicting Ground State Properties: Constant Sample Complexity and Deep Learning Algorithms
- Predicting Label Distribution from Ternary Labels
- Predicting the Performance of Foundation Models via Agreement-on-the-Line
- Prediction-Powered Ranking of Large Language Models
- Prediction with Action: Visual Policy Learning via Joint Denoising Process
- Predictive Attractor Models
- Predictor-Corrector Enhanced Transformers with Exponential Moving Average Coefficient Learning
- Preference Alignment with Flow Matching
- Preference-based Pure Exploration
- Preference Learning Algorithms Do Not Learn Preference Rankings
- Preference Learning of Latent Decision Utilities with a Human-like Model of Preferential Choice
- Preferential Normalizing Flows
- PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference
- Prejudice and Volatility: A Statistical Framework for Measuring Social Discrimination in Large Language Models
- Pre-trained Gaussian Processes for Bayesian Optimization
- Pre-trained Large Language Models Use Fourier Features to Compute Addition
- Pre-Trained Multi-Goal Transformers with Prompt Optimization for Efficient Online Adaptation
- Pretrained Optimization Model for Zero-Shot Black Box Optimization
- Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control
- Pretrained Transformer Efficiently Learns Low-Dimensional Target Functions In-Context
- Pretraining Codomain Attention Neural Operators for Solving Multiphysics PDEs
- Pre-training Differentially Private Models with Limited Public Data
- Pretraining with Random Noise for Fast and Robust Learning without Weight Transport
- Preventing Dimensional Collapse in Self-Supervised Learning via Orthogonality Regularization
- Preventing Model Collapse in Deep Canonical Correlation Analysis by Noise Regularization
- Pricing and Competition for Generative AI
- Principled Bayesian Optimization in Collaboration with Human Experts
- Principled Probabilistic Imaging using Diffusion Models as Plug-and-Play Priors
- Prior-itizing Privacy: A Bayesian Approach to Setting the Privacy Budget in Differential Privacy
- Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs
- Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models
- PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action
- PrivacyML: Meaningful Privacy-Preserving Machine Learning and How To Evaluate AI Privacy
- Privacy without Noisy Gradients: Slicing Mechanism for Generative Model Training
- Private Algorithms for Stochastic Saddle Points and Variational Inequalities: Beyond Euclidean Geometry
- Private and Personalized Frequency Estimation in a Federated Setting
- Private Attribute Inference from Images with Vision-Language Models
- Private Edge Density Estimation for Random Graphs: Optimal, Efficient and Robust
- Private Geometric Median
- Private Online Learning via Lazy Algorithms
- Private Stochastic Convex Optimization with Heavy Tails: Near-Optimality from Simple Reductions
- PrivAuditor: Benchmarking Data Protection Vulnerabilities in LLM Adaptation Techniques
- PrivCirNet: Efficient Private Inference via Block Circulant Transformation
- Probabilistic Conformal Distillation for Enhancing Missing Modality Robustness
- Probabilistic Decomposed Linear Dynamical Systems for Robust Discovery of Latent Neural Dynamics
- Probabilistic Federated Prompt-Tuning with Non-IID and Imbalanced Data
- Probabilistic Graph Rewiring via Virtual Nodes
- Probabilistic size-and-shape functional mixed models
- Probabilistic Weather Forecasting with Hierarchical Graph Neural Networks
- Probabilistic Emulation of a Global Climate Model with Spherical DYffusion
- Probing Social Bias in Labor Market Text Generation by ChatGPT: A Masked Language Model Approach
- Probing the Decision Boundaries of In-context Learning in Large Language Models
- ProbTS: Benchmarking Point and Distributional Forecasting across Diverse Prediction Horizons
- Procedure-Aware Surgical Video-language Pretraining with Hierarchical Knowledge Augmentation
- PRODuctive bandits: Importance Weighting No More
- ProEdit: Simple Progression is All You Need for High-Quality 3D Scene Editing
- ProG: A Graph Prompt Learning Benchmark
- ProgressGym: Alignment with a Millennium of Moral Progress
- Progressive Entropic Optimal Transport Solvers
- Progressive Exploration-Conformal Learning for Sparsely Annotated Object Detection in Aerial Images
- Promoting Fairness Among Dynamic Agents in Online-Matching Markets under Known Stationary Arrival Distributions
- Prompt-Agnostic Adversarial Perturbation for Customized Diffusion Models
- PromptFix: You Prompt and We Fix the Photo
- Prompt Optimization with EASE? Efficient Ordering-aware Automated Selection of Exemplars
- Prompt Tuning Strikes Back: Customizing Foundation Models with Low-Rank Prompt Adaptation
- Propensity Score Alignment of Unpaired Multimodal Data
- Proportional Fairness in Clustering: A Social Choice Perspective
- Proportional Fairness in Non-Centroid Clustering
- Prospective Learning: Learning for a Dynamic Future
- Prospective Representation Learning for Non-Exemplar Class-Incremental Learning
- PROSPECT PTMs: Rich Labeled Tandem Mass Spectrometry Dataset of Modified Peptides for Machine Learning in Proteomics
- ProSST: Protein Language Modeling with Quantized Structure and Disentangled Attention
- Protected Test-Time Adaptation via Online Entropy Matching: A Betting Approach
- Protecting Your LLMs with Information Bottleneck
- Protein-Nucleic Acid Complex Modeling with Frame Averaging Transformer
- ProtGO: Function-Guided Protein Modeling for Unified Representation Learning
- Prototypical Hash Encoding for On-the-Fly Fine-Grained Category Discovery
- ProTransformer: Robustify Transformers via Plug-and-Play Paradigm
- Provable Acceleration of Nesterov's Accelerated Gradient for Asymmetric Matrix Factorization and Linear Neural Networks
- Provable and Efficient Dataset Distillation for Kernel Ridge Regression
- Provable Benefit of Cutout and CutMix for Feature Learning
- Provable Benefits of Complex Parameterizations for Structured State Space Models
- Provable Editing of Deep Neural Networks using Parametric Linear Relaxation
- Provable Partially Observable Reinforcement Learning with Privileged Information
- Provable Posterior Sampling with Denoising Oracles via Tilted Transport
- Provable Tempered Overfitting of Minimal Nets and Typical Nets
- Provably and Practically Efficient Adversarial Imitation Learning with General Function Approximation
- Provably Efficient Interactive-Grounded Learning with Personalized Reward
- Provably Efficient Reinforcement Learning with Multinomial Logit Function Approximation
- Provably Faster Algorithms for Bilevel Optimization via Without-Replacement Sampling
- Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer
- Provably Optimal Memory Capacity for Modern Hopfield Models: Transformer-Compatible Dense Associative Memories as Spherical Codes
- Provably Robust Score-Based Diffusion Posterior Sampling for Plug-and-Play Image Reconstruction
- Provably Safe Neural Network Controllers via Differential Dynamic Logic
- Provably Transformers Harness Multi-Concept Word Semantics for Efficient In-Context Learning
- Proving Olympiad Algebraic Inequalities without Human Demonstrations
- Proving Theorems Recursively
- ProvNeRF: Modeling per Point Provenance in NeRFs as a Stochastic Field
- Proximal Causal Inference With Text Data
- ProxyFusion: Face Feature Aggregation Through Sparse Experts
- Prune and Repaint: Content-Aware Image Retargeting for any Ratio
- Pruning neural network models for gene regulatory dynamics using data and domain knowledge
- Pseudo-Private Data Guided Model Inversion Attacks
- Pseudo-Siamese Blind-spot Transformers for Self-Supervised Real-World Denoising
- PSL: Rethinking and Improving Softmax Loss from Pairwise Perspective for Recommendation
- PTQ4DiT: Post-training Quantization for Diffusion Transformers
- Public-data Assisted Private Stochastic Optimization: Power and Limitations
- PuLID: Pure and Lightning ID Customization via Contrastive Alignment
- PureGen: Universal Data Purification for Train-Time Poison Defense via Generative Model Dynamics
- Pure Message Passing Can Estimate Common Neighbor for Link Prediction
- PURE: Prompt Evolution with Graph ODE for Out-of-distribution Fluid Dynamics Modeling
- PutnamBench: Evaluating Neural Theorem-Provers on the Putnam Mathematical Competition
- Putting Gale & Shapley to Work: Guaranteeing Stability Through Learning
- PUZZLES: A Benchmark for Neural Algorithmic Reasoning
- PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression
- QBB: Quantization with Binary Bases for LLMs
- Q-Distribution guided Q-learning for offline reinforcement learning: Uncertainty penalized Q-value via consistency model
- QGFN: Controllable Greediness with Action Values
- QGym: Scalable Simulation and Benchmarking of Queuing Network Controllers
- QKFormer: Hierarchical Spiking Transformer using Q-K Attention
- QTIP: Quantization with Trellises and Incoherence Processing
- QT-ViT: Improving Linear Attention in ViT with Quadratic Taylor Expansion
- QuadMamba: Learning Quadtree-based Selective Scan for Visual State Space Model
- Quadratic Quantum Variational Monte Carlo
- Qualitative Mechanism Independence
- Quality-Improved and Property-Preserved Polarimetric Imaging via Complementarily Fusing
- QuanTA: Efficient High-Rank Fine-Tuning of LLMs with Quantum-Informed Tensor Adaptation
- Quantifying Aleatoric Uncertainty of the Treatment Effect: A Novel Orthogonal Learner
- Quantifying and Optimizing Global Faithfulness in Persona-driven Role-playing
- Quantifying the Bitter Lesson: How Safety Benchmarks Measure Capabilities Instead of Safety
- Quantifying the Gain in Weak-to-Strong Generalization
- Quantitative Convergences of Lie Group Momentum Optimizers
- Quantum algorithm for large-scale market equilibrium computation
- Quantum Algorithms for Non-smooth Non-convex Optimization
- Quantum Deep Equilibrium Models
- QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs
- Quasi-Bayes meets Vines
- QUEEN: QUantized Efficient ENcoding for Streaming Free-viewpoint Videos
- Query-Based Adversarial Prompt Generation
- Query-Efficient Correlation Clustering with Noisy Oracle
- Questioning the Survey Responses of Large Language Models
- QUEST: Quadruple Multimodal Contrastive Learning with Constraints and Self-Penalization
- QUEST: Quality-Aware Metropolis-Hastings Sampling for Machine Translation
- QueST: Self-Supervised Skill Abstractions for Learning Continuous Control
- Queueing Matching Bandits with Preference Feedback
- QVAE-Mole: The Quantum VAE with Spherical Latent Variable Learning for 3-D Molecule Generation
- Q-VLM: Post-training Quantization for Large Vision-Language Models
- QWO: Speeding Up Permutation-Based Causal Discovery in LiGAMs
- R$^2$-Gaussian: Rectifying Radiative Gaussian Splatting for Tomographic Reconstruction
- RadarOcc: Robust 3D Occupancy Prediction with 4D Imaging Radar
- Rad-NeRF: Ray-decoupled Training of Neural Radiance Field
- RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation
- RAGraph: A General Retrieval-Augmented Graph Learning Framework
- Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts
- RAMP: Boosting Adversarial Robustness Against Multiple $l_p$ Perturbations for Universal Robustness
- RandNet-Parareal: a time-parallel PDE solver using Random Neural Networks
- Random Cycle Coding: Lossless Compression of Cluster Assignments via Bits-Back Coding
- Random Function Descent
- Randomized algorithms and PAC bounds for inverse reinforcement learning in continuous spaces
- Randomized Exploration for Reinforcement Learning with Multinomial Logistic Function Approximation
- Randomized Exploration in Cooperative Multi-Agent Reinforcement Learning
- Randomized Sparse Matrix Compression for Large-Scale Constrained Optimization in Cancer Radiotherapy
- Randomized Strategic Facility Location with Predictions
- Randomized Truthful Auctions with Learning Agents
- Random Representations Outperform Online Continually Learned Representations
- RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs
- RankUp: Boosting Semi-Supervised Regression with an Auxiliary Ranking Classifier
- RA-PbRL: Provably Efficient Risk-Aware Preference-Based Reinforcement Learning
- Rapid Plug-in Defenders
- RashomonGB: Analyzing the Rashomon Effect and Mitigating Predictive Multiplicity in Gradient Boosting
- RaVL: Discovering and Mitigating Spurious Correlations in Fine-Tuned Vision-Language Models
- RAW: A Robust and Agile Plug-and-Play Watermark Framework for AI-Generated Images with Provable Guarantees
- RCDN: Towards Robust Camera-Insensitivity Collaborative Perception via Dynamic Feature-based 3D Neural Modeling
- RClicks: Realistic Click Simulation for Benchmarking Interactive Segmentation
- Reactzyme: A Benchmark for Enzyme-Reaction Prediction
- RealCompo: Balancing Realism and Compositionality Improves Text-to-Image Diffusion Models
- Realizable $H$-Consistent and Bayes-Consistent Loss Functions for Learning to Defer
- RealMAN: A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization
- realSEUDO for real-time calcium imaging analysis
- Real-time Core-Periphery Guided ViT with Smart Data Layout Selection on Mobile Devices
- Real-Time Recurrent Learning using Trace Units in Reinforcement Learning
- Real-Time Selection Under General Constraints via Predictive Inference
- Real-time Stereo-based 3D Object Detection for Streaming Perception
- Real-world Image Dehazing with Coherence-based Pseudo Labeling and Cooperative Unfolding Network
- Reasoning Multi-Agent Behavioral Topology for Interactive Autonomous Driving
- Reasons and Solutions for the Decline in Model Performance after Editing
- Re-assembling the past: The RePAIR dataset and benchmark for real world 2D and 3D puzzle solving
- Reawakening knowledge: Anticipatory recovery from catastrophic interference via structured training
- REBEL: Reinforcement Learning via Regressing Relative Rewards
- REBORN: Reinforcement-Learned Boundary Segmentation with Iterative Training for Unsupervised ASR
- Reciprocal Learning
- Reciprocal Reward Influence Encourages Cooperation From Self-Interested Agents
- Recognize Any Regions
- Reconstruct and Match: Out-of-Distribution Robustness via Topological Homogeneity
- Reconstructing the Image Stitching Pipeline: Integrating Fusion and Rectangling into a Unified Inpainting Model
- Reconstruction Attacks on Machine Unlearning: Simple Models are Vulnerable
- Reconstruction of Manipulated Garment with Guided Deformation Prior
- Recovering Complete Actions for Cross-dataset Skeleton Action Recognition
- RectifID: Personalizing Rectified Flow with Anchored Classifier Guidance
- [Re] CUDA: Curriculum of Data Augmentation for Long-tailed Recognition
- Recurrent Complex-Weighted Autoencoders for Unsupervised Object Discovery
- Recurrent neural network dynamical systems for biological vision
- Recurrent neural networks: vanishing and exploding gradients are not the end of the story
- Recurrent Reinforcement Learning with Memoroids
- Recursive Introspection: Teaching Language Model Agents How to Self-Improve
- Recursive PAC-Bayes: A Frequentist Approach to Sequential Prior Updates with No Information Loss
- RedCode: Multi-dimensional Safety Benchmark for Code Agents
- RedPajama: an Open Dataset for Training Large Language Models
- Red Teaming GenAI: What Can We Learn from Adversaries?
- Reducing Transformer Key-Value Cache Size with Cross-Layer Attention
- REDUCR: Robust Data Downsampling using Class Priority Reweighting
- ReEvo: Large Language Models as Hyper-Heuristics with Reflective Evolution
- RefDrop: Controllable Consistency in Image or Video Generation via Reference Feature Guidance
- Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models
- Referencing Where to Focus: Improving Visual Grounding with Referential Query
- Referring Human Pose and Mask Estimation In the Wild
- ReFIR: Grounding Large Restoration Models with Retrieval Augmentation
- ReF-LDM: A Latent Diffusion Model for Reference-based Face Image Restoration
- Reflective Multi-Agent Collaboration based on Large Language Models
- ReFT: Representation Finetuning for Language Models
- Refusal in Language Models Is Mediated by a Single Direction
- RegExplainer: Generating Explanations for Graph Neural Networks in Regression Tasks
- [Re] GNNInterpreter: A probabilistic generative model-level explanation for Graph Neural Networks
- Regression under demographic parity constraints via unlabeled post-processing
- Regret Minimization in Stackelberg Games with Side Information
- ReGS: Reference-based Controllable Scene Stylization with Gaussian Splatting
- Regularized Adaptive Momentum Dual Averaging with an Efficient Inexact Subproblem Solver for Training Structured Neural Network
- Regularized Conditional Diffusion Model for Multi-Task Preference Alignment
- Regularized Q-Learning
- Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs
- Regulatable ML: Towards Bridging the Gaps between Machine Learning Research and Regulations
- Reimagining Mutual Information for Enhanced Defense against Data Leakage in Collaborative Inference
- Reinforced Cross-Domain Knowledge Distillation on Time Series Data
- Reinforcement Learning Gradients as Vitamin for Online Finetuning Decision Transformers
- Reinforcement Learning Guided Semi-Supervised Learning
- Reinforcement Learning Policy as Macro Regulator Rather than Macro Placer
- Reinforcement Learning Under Latent Dynamics: Toward Statistical and Algorithmic Modularity
- Reinforcement Learning with Adaptive Regularization for Safe Control of Critical Systems
- Reinforcement Learning with Euclidean Data Augmentation for State-Based Continuous Control
- Reinforcement Learning with Lookahead Information
- Reinforcement Learning with LTL and $\omega$-Regular Objectives via Optimality-Preserving Translation to Average Rewards
- Reinforcing LLM Agents via Policy Optimization with Action Decomposition
- Rejection via Learning Density Ratios
- Relating Hopfield Networks to Episodic Control
- Relational Concept Bottleneck Models
- Relational Verification Leaps Forward with RABBit
- Relationship Prompt Learning is Enough for Open-Vocabulary Semantic Segmentation
- RelBench: A Benchmark for Deep Learning on Relational Databases
- Reliable Learning of Halfspaces under Gaussian Marginals
- ReLIZO: Sample Reusable Linear Interpolation-based Zeroth-order Optimization
- ReMAP: Neural Model Reprogramming with Network Inversion and Retrieval-Augmented Mapping for Adaptive Motion Forecasting
- ReMI: A Dataset for Reasoning with Multiple Images
- Remix-DiT: Mixing Diffusion Transformers for Multi-Expert Denoising
- ReMoDetect: Reward Models Recognize Aligned LLM's Generations
- Remove that Square Root: A New Efficient Scale-Invariant Version of AdaGrad
- ReNO: Enhancing One-step Text-to-Image Models through Reward-based Noise Optimization
- Renovating Names in Open-Vocabulary Segmentation Benchmarks
- [Re] On the Reproducibility of Post-Hoc Concept Bottleneck Models
- Reparameterization invariance in approximate Bayesian inference
- Reparameterized Multi-Resolution Convolutions for Long Sequence Modelling
- ReplaceAnything3D: Text-Guided Object Replacement in 3D Scenes with Compositional Scene Representations
- Replay-and-Forget-Free Graph Class-Incremental Learning: A Task Profiling and Prompting Approach
- Replicability in Learning: Geometric Partitions and KKM-Sperner Lemma
- Replicable Uniformity Testing
- RepLiQA: A Question-Answering Dataset for Benchmarking LLMs on Unseen Reference Content
- Representation Noising: A Defence Mechanism Against Harmful Finetuning
- Reproducibility of predictive networks for mouse visual cortex
- Reproducibility Study: Equal Improvability: A New Fairness Notion Considering the Long-Term Impact
- Reproducibility study of FairAC
- Reproducibility Study of "ITI-GEN: Inclusive Text-to-Image Generation"
- Reproducibility Study Of Learning Fair Graph Representations Via Automated Data Augmentations
- Reproducibility study of "LICO: Explainable Models with Language-Image Consistency"
- Reproducibility study of "Robust Fair Clustering: A Novel Fairness Attack and Defense Framework"
- Reproducibility Study of "Robust Fair Clustering: A Novel Fairness Attack and Defense Framework"
- Reproducibility Study on Adversarial Attacks Against Robust Transformer Trackers
- Reprogramming Pretrained Target-Specific Diffusion Models for Dual-Target Drug Design
- Repurposing Language Models into Embedding Models: Finding the Compute-Optimal Recipe
- Reranking Laws for Language Generation: A Communication-Theoretic Perspective
- [Re] Reproducibility Study of "Explaining Temporal Graph Models Through an Explorer-Navigator Framework"
- ResAD: A Simple Framework for Class Generalizable Anomaly Detection
- Resfusion: Denoising Diffusion Probabilistic Models for Image Restoration Based on Prior Residual Noise
- Reshuffling Resampling Splits Can Improve Generalization of Hyperparameter Optimization
- Resolving Discrepancies in Compute-Optimal Scaling of Language Models
- Resource-Aware Federated Self-Supervised Learning with Global Class Representations
- ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search
- RestoreAgent: Autonomous Image Restoration Agent via Multimodal Large Language Models
- Rethinking 3D Convolution in $\ell_p$-norm Space
- Rethinking Decoders for Transformer-based Semantic Segmentation: Compression is All You Need
- Rethinking Deep Thinking: Stable Learning of Algorithms using Lipschitz Constraints
- Rethinking Exploration in Reinforcement Learning with Effective Metric-Based Exploration Bonus
- Rethinking Fourier Transform from A Basis Functions Perspective for Long-term Time Series Forecasting
- Rethinking Human Evaluation Protocol for Text-to-Video Models: Enhancing Reliability, Reproducibility, and Practicality
- Rethinking Imbalance in Image Super-Resolution for Efficient Inference
- Rethinking Inverse Reinforcement Learning: from Data Alignment to Task Alignment
- Rethinking LLM Memorization through the Lens of Adversarial Compression
- Rethinking Memory and Communication Costs for Efficient Data Parallel Training of Large Language Models
- Rethinking Misalignment in Vision-Language Model Adaptation from a Causal Perspective
- Rethinking Model-based, Policy-based, and Value-based Reinforcement Learning via the Lens of Representation Complexity
- Rethinking No-reference Image Exposure Assessment from Holism to Pixel: Models, Datasets and Benchmarks
- Rethinking Optimal Transport in Offline Reinforcement Learning
- Rethinking Out-of-Distribution Detection on Imbalanced Data Distribution
- Rethinking Parity Check Enhanced Symmetry-Preserving Ansatz
- Rethinking Reconstruction-based Graph-Level Anomaly Detection: Limitations and a Simple Remedy
- Rethinking Score Distillation as a Bridge Between Image Distributions
- Rethinking the Capacity of Graph Neural Networks for Branching Strategy
- Rethinking the Diffusion Models for Missing Data Imputation: A Gradient Flow Perspective
- Rethinking the Evaluation of Out-of-Distribution Detection: A Sorites Paradox
- Rethinking the Membrane Dynamics and Optimization Objectives of Spiking Neural Networks
- Rethinking the Power of Timestamps for Robust Time Series Forecasting: A Global-Local Fusion Perspective
- Rethinking The Training And Evaluation of Rich-Context Layout-to-Image Generation
- Rethinking Transformer for Long Contextual Histopathology Whole Slide Image Analysis
- Rethinking Weight Decay for Robust Fine-Tuning of Foundation Models
- Retrieval-Augmented Diffusion Models for Time Series Forecasting
- Retrieval & Fine-Tuning for In-Context Tabular Models
- Retrieval-Retro: Retrieval-based Inorganic Retrosynthesis with Expert Knowledge
- RETR: Multi-View Radar Detection Transformer for Indoor Perception
- Retrospective for the Dynamic Sensorium Competition for predicting large-scale mouse primary visual cortex activity from videos
- Return of Unconditional Generation: A Self-supervised Representation Generation Method
- Revealing Distribution Discrepancy by Sampling Transfer in Unlabeled Data
- Reverse Transition Kernel: A Flexible Framework to Accelerate Diffusion Inference
- Reversing the Forget-Retain Objectives: An Efficient LLM Unlearning Framework from Logit Difference
- ReVideo: Remake a Video with Motion and Content Control
- Revisiting Adversarial Patches for Designing Camera-Agnostic Attacks against Person Detection
- Revisiting, Benchmarking and Understanding Unsupervised Graph Domain Adaptation
- Revisiting Differentially Private ReLU Regression
- Revisiting Ensembling in One-Shot Federated Learning
- Revisiting Few-Shot Object Detection with Vision-Language Models
- Revisiting K-mer Profile for Effective and Scalable Genome Representation Learning
- Revisiting motion information for RGB-Event tracking with MOT philosophy
- Revisiting Score Propagation in Graph Out-of-Distribution Detection
- Revisiting Self-Supervised Heterogeneous Graph Learning from Spectral Clustering Perspective
- Revisiting the Integration of Convolution and Attention for Vision Backbone
- Revive Re-weighting in Imbalanced Learning by Density Ratio Estimation
- Reward Machines for Deep RL in Noisy and Uncertain Environments
- ReXTime: A Benchmark Suite for Reasoning-Across-Time in Videos
- RFLPA: A Robust Federated Learning Framework against Poisoning Attacks with Secure Aggregation
- RGFN: Synthesizable Molecular Generation Using GFlowNets
- RGMDT: Return-Gap-Minimizing Decision Tree Extraction in Non-Euclidean Metric Space
- RG-SAN: Rule-Guided Spatial Awareness Network for End-to-End 3D Referring Expression Segmentation
- Richelieu: Self-Evolving LLM-Based Agents for AI Diplomacy
- Right this way: Can VLMs Guide Us to See More to Answer Questions?
- Risk-Averse Fine-tuning of Large Language Models
- Risk-sensitive control as inference with Rényi divergence
- RLE: A Unified Perspective of Data Augmentation for Cross-Spectral Re-Identification
- RL-GPT: Integrating Reinforcement Learning and Code-as-policy
- RL in Latent MDPs is Tractable: Online Guarantees via Off-Policy Evaluation
- RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold
- RMLR: Extending Multinomial Logistic Regression into General Geometries
- Road Network Representation Learning with the Third Law of Geography
- ROBIN: Robust and Invisible Watermarks for Diffusion Models with Adversarial Optimization
- RobIR: Robust Inverse Rendering for High-Illumination Scenes
- RoboMamba: Efficient Vision-Language-Action Model for Robotic Reasoning and Manipulation
- Robot Policy Learning with Temporal Optimal Transport Reward
- Robust and Faster Zeroth-Order Minimax Optimization: Complexity and Applications
- Robust Conformal Prediction Using Privileged Information
- Robust Contrastive Multi-view Clustering against Dual Noisy Correspondence
- Robust Fine-tuning of Zero-shot Models via Variance Reduction
- Robust Gaussian Processes via Relevance Pursuit
- Robust Graph Neural Networks via Unbiased Aggregation
- Robust group and simultaneous inferences for high-dimensional single index model
- Robustly overfitting latents for flexible neural image compression
- Robust Mixture Learning when Outliers Overwhelm Small Groups
- Robust Neural Contextual Bandit against Adversarial Corruptions
- Robust Offline Active Learning on Graphs
- Robust Prompt Optimization for Defending Language Models Against Jailbreaking Attacks
- Robust Reinforcement Learning from Corrupted Human Feedback
- Robust Reinforcement Learning with General Utility
- Robust Sleep Staging over Incomplete Multimodal Physiological Signals via Contrastive Imagination
- Robust Sparse Regression with Non-Isotropic Designs
- ROIDICE: Offline Return on Investment Maximization for Efficient Decision Making
- RoleAgent: Building, Interacting, and Benchmarking High-quality Role-Playing Agents from Scripts
- RoME: A Robust Mixed-Effects Bandit Algorithm for Optimizing Mobile Health Interventions
- RoPINN: Region Optimized Physics-Informed Neural Networks
- Rough Transformers: Lightweight Continuous-Time Sequence Modelling with Path Signatures
- RouterDC: Query-Based Router by Dual Contrastive Learning for Assembling Large Language Models
- RSA: Resolving Scale Ambiguities in Monocular Depth Estimators through Language Descriptions
- RTify: Aligning Deep Neural Networks with Human Behavioral Decisions
- Rule Based Rewards for Language Model Safety
- Rule Extrapolation in Language Modeling: A Study of Compositional Generalization on OOD Prompts
- RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models
- S$^{2}$FT: Efficient, Scalable and Generalizable LLM Fine-tuning by Structured Sparsity
- S2HPruner: Soft-to-Hard Distillation Bridges the Discretization Gap in Pruning
- SA3DIP: Segment Any 3D Instance with Potential 3D Priors
- Safe and Efficient: A Primal-Dual Method for Offline Convex CMDPs under Partial Data Coverage
- Safe and Sparse Newton Method for Entropic-Regularized Optimal Transport
- Safe Exploitative Play with Untrusted Type Beliefs
- Safe Generative AI
- Safe LoRA: The Silver Lining of Reducing Safety Risks when Finetuning Large Language Models
- SAFE: Slow and Fast Parameter-Efficient Tuning for Continual Learning with Pre-Trained Models
- SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset
- Safe Time-Varying Optimization based on Gaussian Processes with Spatio-Temporal Kernel
- Safety through feedback in Constrained RL
- SafeWorld: Geo-Diverse Safety Alignment
- Saliency-driven Experience Replay for Continual Learning
- Samba: Severity-aware Recurrent Modeling for Cross-domain Medical Image Grading
- SAM-Guided Masked Token Prediction for 3D Scene Understanding
- SAMPa: Sharpness-aware Minimization Parallelized
- SampDetox: Black-box Backdoor Defense via Perturbation-based Sample Detoxification
- Sample and Computationally Efficient Robust Learning of Gaussian Single-Index Models
- Sample Complexity of Algorithm Selection Using Neural Networks and Its Applications to Branch-and-Cut
- Sample Complexity of Interventional Causal Representation Learning
- Sample Complexity of Posted Pricing for a Single Item
- Sample Complexity Reduction via Policy Difference Estimation in Tabular Reinforcement Learning
- Sample-Efficient Agnostic Boosting
- Sample Efficient Bayesian Learning of Causal Graphs from Interventions
- Sample-efficient Bayesian Optimisation Using Known Invariances
- Sample-Efficient Constrained Reinforcement Learning with General Parameterization
- Sample-Efficient Geometry Reconstruction from Euclidean Distances using Non-Convex Optimization
- Sample-Efficient Private Learning of Mixtures of Gaussians
- Sample Selection via Contrastive Fragmentation for Noisy Label Regression
- Sandbox for the Blackbox: How LLMs Learn Structured Data?
- SAND: Smooth imputation of sparse and noisy functional data with Transformer networks
- SARAD: Spatial Association-Aware Anomaly Detection and Diagnosis for Multivariate Time Series
- SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection
- Satformer: Accurate and Robust Traffic Data Estimation for Satellite Networks
- SaulLM-54B & SaulLM-141B: Scaling Up Domain Adaptation for the Legal Domain
- SCAFFLSA: Taming Heterogeneity in Federated Linear Stochastic Approximation and TD Learning
- Scalable and Effective Arithmetic Tree Generation for Adder and Multiplier Designs
- Scalable Bayesian Optimization via Focalized Sparse Gaussian Processes
- Scalable Constrained Policy Optimization for Safe Multi-agent Reinforcement Learning
- Scalable DBSCAN with Random Projections
- Scalable DP-SGD: Shuffling vs. Poisson Subsampling
- Scalable Early Childhood Reading Performance Prediction
- Scalable Kernel Inverse Optimization
- Scalable Neural Network Verification with Branch-and-bound Inferred Cutting Planes
- Scalable Optimization in the Modular Norm
- Scale Equivariant Graph Metanetworks
- Scale-invariant Optimal Sampling for Rare-events Data and Sparse Models
- ScaleKD: Strong Vision Transformers Could Be Excellent Teachers
- Scaling Continuous Latent Variable Models as Probabilistic Integral Circuits
- Scaling Law for Time Series Forecasting
- Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations
- Scaling laws for learning with real and surrogate data
- Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms
- Scaling Laws in Linear Regression: Compute, Parameters, and Data
- Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies
- Scaling Proprioceptive-Visual Learning with Heterogeneous Pre-trained Transformers
- Scaling Retrieval-Based Language Models with a Trillion-Token Datastore
- Scaling Sign Language Translation
- Scaling the Codebook Size of VQ-GAN to 100,000 with a Utilization Rate of 99%
- Scaling transformer neural networks for skillful and reliable medium-range weather forecasting
- Scaling White-Box Transformers for Vision
- Scanning Trojaned Models Using Out-of-Distribution Samples
- SCaR: Refining Skill Chaining for Long-Horizon Robotic Manipulation via Dual Regularization
- SceneCraft: Layout-Guided 3D Scene Generation
- SceneDiffuser: Efficient and Controllable Driving Simulation Initialization and Rollout
- Scene Graph Disentanglement and Composition for Generalizable Complex Image Generation
- Scene Graph Generation with Role-Playing Large Language Models
- Schedule Your Edit: A Simple yet Effective Diffusion Noise Schedule for Image Editing
- Schrödinger Bridge Flow for Unpaired Data Translation
- Schur Nets: exploiting local structure for equivariance in higher order graph neural networks
- SciCode: A Research Coding Benchmark Curated by Scientists
- Scientific Methods for Understanding Neural Networks
- SciFIBench: Benchmarking Large Multimodal Models for Scientific Figure Interpretation
- SciInstruct: a Self-Reflective Instruction Annotated Dataset for Training Scientific Language Models
- Score-based 3D molecule generation with neural fields
- Score-based generative models are provably robust: an uncertainty quantification perspective
- Score Distillation via Reparametrized DDIM
- Score-Optimal Diffusion Schedules
- SCOREQ: Speech Quality Assessment with Contrastive Regression
- Scribbles for All: Benchmarking Scribble Supervised Segmentation Across Datasets
- SCRREAM: SCan, Register, REnder And Map: A Framework for Annotating Accurate and Dense 3D Indoor Scenes with a Benchmark
- SCube: Instant Large-Scale Scene Reconstruction using VoxSplats
- SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words
- SDformer: Similarity-driven Discrete Transformer For Time Series Generation
- SDP4Bit: Toward 4-bit Communication Quantization in Sharded Data Parallelism for LLM Training
- SE(3)-bi-equivariant Transformers for Point Cloud Assembly
- SeafloorGenAI: A Large-scale Vision-Language Dataset for Seafloor Geological Survey
- Search for Efficient Large Language Models
- Searching for Efficient Linear Layers over a Continuous Space of Structured Matrices
- SearchLVLMs: A Plug-and-Play Framework for Augmenting Large Vision-Language Models by Searching Up-to-Date Internet Knowledge
- SEA: State-Exchange Attention for High-Fidelity Physics Based Transformers
- Second-order forward-mode optimization of recurrent neural networks for neuroscience
- Secret Collusion among AI Agents: Multi-Agent Deception via Steganography
- SeeA*: Efficient Exploration-Enhanced A* Search by Selective Sampling
- SeeClear: Semantic Distillation Enhances Pixel Condensation for Video Super-Resolution
- Seeing Beyond the Crop: Using Language Priors for Out-of-Bounding Box Keypoint Prediction
- Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment
- Seek Commonality but Preserve Differences: Dissected Dynamics Modeling for Multi-modal Visual RL
- SEEV: Synthesis with Efficient Exact Verification for ReLU Neural Barrier Functions
- Segment Any Change
- Segment Anything without Supervision
- Segmenting Watermarked Texts From Language Models
- Segment, Shuffle, and Stitch: A Simple Layer for Improving Time-Series Representations
- SegVol: Universal and Interactive Volumetric Medical Image Segmentation
- SEL-BALD: Deep Bayesian Active Learning for Selective Labeling with Instance Rejection
- SelectIT: Selective Instruction Tuning for LLMs via Uncertainty-Aware Self-Reflection
- Selective Attention: Enhancing Transformer through Principled Context Control
- Selective Explanations
- Selective Generation for Controllable Language Models
- Self-Calibrated Tuning of Vision-Language Models for Out-of-Distribution Detection
- Self-Calibrating Conformal Prediction
- SelfCodeAlign: Self-Alignment for Code Generation
- Self-Consuming Generative Models with Curated Data Provably Optimize Human Preferences
- SELF-DISCOVER: Large Language Models Self-Compose Reasoning Structures
- Self-Distilled Depth Refinement with Noisy Poisson Fusion
- Self-Guided Masked Autoencoder
- Self-Guiding Exploration for Combinatorial Problems
- Self-Healing Machine Learning: A Framework for Autonomous Adaptation in Real-World Environments
- Self-Labeling the Job Shop Scheduling Problem
- Self-Play Fine-tuning of Diffusion Models for Text-to-image Generation
- Self-playing Adversarial Language Game Enhances LLM Reasoning
- Self-Refining Diffusion Samplers: Enabling Parallelization via Parareal Iterations
- Self-Retrieval: End-to-End Information Retrieval with One Large Language Model
- Self-Supervised Adversarial Training via Diverse Augmented Queries and Self-Supervised Double Perturbation
- Self-Supervised Alignment with Mutual Information: Learning to Follow Principles without Preference Labels
- Self-supervised Transformation Learning for Equivariant Representations
- Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models
- SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data
- Semantic Density: Uncertainty Quantification for Large Language Models through Confidence Measurement in Semantic Space
- Semantic Feature Learning for Universal Unsupervised Cross-Domain Retrieval
- Semantic Routing via Autoregressive Modeling
- Semantics and Spatiality of Emergent Communication
- SemCoder: Training Code Language Models with Comprehensive Semantics Reasoning
- SemFlow: Binding Semantic Segmentation and Image Synthesis via Rectified Flow
- Semidefinite Relaxations of the Gromov-Wasserstein Distance
- Semi-Discrete Optimal Transport: Nearly Minimax Estimation With Stochastic Gradient Descent and Adaptive Entropic Regularization
- Semi-Open 3D Object Retrieval via Hierarchical Equilibrium on Hypergraph
- Semi-Random Matrix Completion via Flow-Based Adaptive Reweighting
- Semi-supervised Knowledge Transfer Across Multi-omic Single-cell Data
- Semi-supervised Multi-label Learning with Balanced Binary Angular Margin Loss
- Semi-Supervised Sparse Gaussian Classification: Provable Benefits of Unlabeled Data
- Semi-Truths: A Large-Scale Dataset for Testing Robustness of AI-Generated Image Detectors
- Separate and Reconstruct: Asymmetric Encoder-Decoder for Speech Separation
- Separation and Bias of Deep Equilibrium Models on Expressivity and Learning Dynamics
- Separations in the Representational Capabilities of Transformers and Recurrent Architectures
- Sequence-Augmented SE(3)-Flow Matching For Conditional Protein Generation
- SequentialAttention++ for Block Sparsification: Differentiable Pruning Meets Combinatorial Optimization
- Sequential Decision Making with Expert Demonstrations under Unobserved Heterogeneity
- Sequential Harmful Shift Detection Without Labels
- Sequential Probability Assignment with Contexts: Minimax Regret, Contextual Shtarkov Sums, and Contextual Normalized Maximum Likelihood
- Sequential Signal Mixing Aggregation for Message Passing Graph Neural Networks
- Sequoia: Scalable and Robust Speculative Decoding
- SeTAR: Out-of-Distribution Detection with Selective Low-Rank Approximation
- Set-based Neural Network Encoding Without Weight Tying
- SETBENCH: Assessing the Analytical and Semantic Robustness of Language Models
- SfPUEL: Shape from Polarization under Unknown Environment Light
- SF-V: Single Forward Video Generation Model
- SGD vs GD: Rank Deficiency in Linear Networks
- SGLang: Efficient Execution of Structured Language Model Programs
- SG-Nav: Online 3D Scene Graph Prompting for LLM-based Zero-shot Object Navigation
- Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models
- Shadowheart SGD: Distributed Asynchronous SGD with Optimal Time Complexity Under Arbitrary Computation and Communication Heterogeneity
- Shape analysis for time series
- Shaping the distribution of neural responses with interneurons in a recurrent circuit model
- shapiq: Shapley Interactions for Machine Learning
- Shared Autonomy with IDA: Interventional Diffusion Assistance
- ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
- Sharing Key Semantics in Transformer Makes Efficient Image Restoration
- Sharpness-Aware Minimization Activates the Interactive Teaching's Understanding and Optimization
- Sharpness-diversity tradeoff: improving flat ensembles with SharpBalance
- Shaving Weights with Occam's Razor: Bayesian Sparsification for Neural Networks using the Marginal Likelihood
- SHDocs: A dataset, benchmark, and method to efficiently generate high-quality, real-world specular highlight data with near-perfect alignment
- SHED: Shapley-Based Automated Dataset Refinement for Instruction Fine-Tuning
- ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization
- SHMT: Self-supervised Hierarchical Makeup Transfer via Latent Diffusion Models
- ShopBench: A Massive Multi-Task Online Shopping Benchmark for Large Language Models
- Should We Really Edit Language Models? On the Evaluation of Edited Language Models
- ShowMaker: Creating High-Fidelity 2D Human Video via Fine-Grained Diffusion Modeling
- Shuffling Gradient-Based Methods for Nonconvex-Concave Minimax Optimization
- Sigmoid Gating is More Sample Efficient than Softmax Gating in Mixture of Experts
- SILENCE: Protecting privacy in offloaded speech understanding on resource-constrained devices
- Sim2Real-Fire: A Multi-modal Simulation Dataset for Forecast and Backtracking of Real-world Forest Fire
- SimGen: Simulator-conditioned Driving Scene Generation
- Similarity-Navigated Conformal Prediction for Graph Neural Networks
- Simple and Effective Masked Diffusion Language Models
- Simple and Fast Distillation of Diffusion Models
- Simplified and Generalized Masked Diffusion for Discrete Data
- Simplifying Constraint Inference with Inverse Reinforcement Learning
- Simplifying Latent Dynamics with Softly State-Invariant World Models
- SimPO: Simple Preference Optimization with a Reference-Free Reward
- Simulation-Free Training of Neural ODEs on Paired Data
- SimVG: A Simple Framework for Visual Grounding with Decoupled Multi-modal Fusion
- Single Image Reflection Separation via Dual-Stream Interactive Transformers
- Single Image Unlearning: Efficient Machine Unlearning in Multimodal Large Language Models
- Single-Loop Stochastic Algorithms for Difference of Max-Structured Weakly Convex Functions
- SIRIUS: Contextual Sparsity with Correction for Efficient LLMs
- Sketched Lanczos uncertainty score: a low-memory summary of the Fisher information
- Sketching for Distributed Deep Learning: A Sharper Analysis
- Sketchy Moment Matching: Toward Fast and Provable Data Selection for Finetuning
- SkiLD: Unsupervised Skill Discovery Guided by Factor Interactions
- Skill-aware Mutual Information Optimisation for Zero-shot Generalisation in Reinforcement Learning
- Skinned Motion Retargeting with Dense Geometric Interaction Perception
- SkipPredict: When to Invest in Predictions for Scheduling
- Slack-Free Spiking Neural Network Formulation for Hypergraph Minimum Vertex Cover
- SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Models
- SleeperNets: Universal Backdoor Poisoning Attacks Against Reinforcement Learning Agents
- SLICE-100K: A Multimodal Dataset for Extrusion-based 3D Printing
- Slicing Vision Transformer for Flexible Inference
- Slight Corruption in Pre-training Data Makes Better Diffusion Models
- SlimGPT: Layer-wise Structured Pruning for Large Language Models
- SlimSAM: 0.1% Data Makes Segment Anything Slim
- SLIM: Style-Linguistics Mismatch Model for Generalized Audio Deepfake Detection
- Slot State Space Models
- Slot-VLM: Object-Event Slots for Video-Language Modeling
- SLowcalSGD: Slow Query Points Improve Local-SGD for Stochastic Convex Optimization
- SlowFocus: Enhancing Fine-grained Temporal Understanding in Video LLM
- SLTrain: a sparse plus low rank approach for parameter and memory efficient pretraining
- SM3-Text-to-Query: Synthetic Multi-Model Medical Text-to-Query Benchmark
- Small coresets via negative dependence: DPPs, linear statistics, and concentration
- Small steps no more: Global convergence of stochastic gradient bandits for arbitrary learning rates
- SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large Language Models by Summarizing Training Trajectories of Small Models
- SMART: Scalable Multi-agent Real-time Motion Generation via Next-token Prediction
- SMART: Towards Pre-trained Missing-Aware Model for Patient Health Status Prediction
- Sm: enhanced localization in Multiple Instance Learning for medical imaging classification
- Smoke and Mirrors in Causal Downstream Tasks
- S-MolSearch: 3D Semi-supervised Contrastive Learning for Bioactive Molecule Search
- Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention
- Smoothed Online Classification can be Harder than Batch Classification
- Smoothie: Label Free Language Model Routing
- SnapKV: LLM Knows What You are Looking for Before Generation
- SocialGPT: Prompting LLMs for Social Relation Reasoning via Greedy Segment Optimization
- Socially Responsible Language Modelling Research (SoLaR)
- SocraticLM: Exploring Socratic Personalized Teaching with Large Language Models
- Soft ascent-descent as a stable and flexible alternative to flooding
- Soft-Label Integration for Robust Toxicity Classification
- Soft Prompt Threats: Attacking Safety Alignment and Unlearning in Open-Source LLMs through the Embedding Space
- SOFTS: Efficient Multivariate Time Series Forecasting with Series-Core Fusion
- Soft Superpixel Neighborhood Attention
- Soft Tensor Product Representations for Fully Continuous, Compositional Visual Representations
- SOI: Scaling Down Computational Complexity by Estimating Partial States of the Model
- SolarCube: An Integrative Benchmark Dataset Harnessing Satellite and In-situ Observations for Large-scale Solar Energy Forecasting
- Solving Inverse Problems via Diffusion Optimal Control
- Solving Minimum-Cost Reach Avoid using Reinforcement Learning
- Solving Sparse & High-Dimensional-Output Regression via Compression
- Solving Zero-Sum Markov Games with Continuous State via Spectral Dynamic Embedding
- SongCreator: Lyrics-based Universal Song Generation
- Source Code Foundation Models are Transferable Binary Analysis Knowledge Bases
- Sourcerer: Sample-based Maximum Entropy Source Distribution Estimation
- SpaceByte: Towards Deleting Tokenization from Large Language Modeling
- Space-Time Continuous PDE Forecasting using Equivariant Neural Fields
- SpaFL: Communication-Efficient Federated Learning With Sparse Models And Low Computational Overhead
- Span-Based Optimal Sample Complexity for Weakly Communicating and General Average Reward MDPs
- SPARKLE: A Unified Single-Loop Primal-Dual Framework for Decentralized Bilevel Optimization
- Sparse Bayesian Generative Modeling for Compressive Sensing
- Sparse High Rank Adapters
- SparseLLM: Towards Global Pruning of Pre-trained Language Models
- Sparse maximal update parameterization: A holistic approach to sparse training dynamics
- Sparse-view Pose Estimation and Reconstruction via Analysis by Generative Synthesis
- Sparsity-Agnostic Linear Bandits with Adaptive Adversaries
- SpatialPIN: Enhancing Spatial Reasoning Capabilities of Vision-Language Models through Prompting and Interacting 3D Priors
- SpatialRGPT: Grounded Spatial Reasoning in Vision-Language Models
- Spatio-Spectral Graph Neural Networks
- Spatio-Temporal Interactive Learning for Efficient Image Reconstruction of Spiking Cameras
- Speaking Your Language: Spatial Relationships in Interpretable Emergent Communication
- SpeAr: A Spectral Approach for Zero-Shot Node Classification
- SPEAR: Exact Gradient Inversion of Batches in Federated Learning
- SpecExec: Massively Parallel Speculative Decoding For Interactive LLM Inference on Consumer Devices
- Spec-Gaussian: Anisotropic View-Dependent Appearance for 3D Gaussian Splatting
- Spectral Adapter: Fine-Tuning in Spectral Space
- Spectral Editing of Activations for Large Language Model Alignment
- Spectral Graph Pruning Against Over-Squashing and Over-Smoothing
- Spectral Learning of Shared Dynamics Between Generalized-Linear Processes
- Spectral-Risk Safe Reinforcement Learning with Convergence Guarantees
- Speculative Decoding with CTC-based Draft Model for LLM Inference Acceleration
- Speculative Monte-Carlo Tree Search
- SpeechAlign: Aligning Speech Generation to Human Preferences
- SpeechForensics: Audio-Visual Speech Representation Learning for Face Forgery Detection
- SpeedLoader: An I/O efficient scheme for heterogeneous and distributed LLM operation
- SpelsNet: Surface Primitive Elements Segmentation by B-Rep Graph Structure Supervision
- SpGesture: Source-Free Domain-adaptive sEMG-based Gesture Recognition with Jaccard Attentive Spiking Neural Network
- Spherical Frustum Sparse Convolution Network for LiDAR Point Cloud Semantic Segmentation
- Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?
- Spike-based Neuromorphic Model for Sound Source Localization
- SpikedAttention: Training-Free and Fully Spike-Driven Transformer-to-SNN Conversion with Winner-Oriented Spike Shift for Softmax Operation
- SpikeReveal: Unlocking Temporal Sequences from Real Blurry Inputs with Spike Streams
- Spiking Graph Neural Network on Riemannian Manifolds
- Spiking Neural Network as Adaptive Event Stream Slicer
- Spiking Token Mixer: A event-driven friendly Former structure for spiking neural networks
- Spiking Transformer with Experts Mixture
- SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers
- Splatter a Video: Video Gaussian Representation for Versatile Processing
- SplitNeRF: Split Sum Approximation Neural Field for Joint Geometry, Illumination, and Material Estimation
- SPO: Sequential Monte Carlo Policy Optimisation
- SpreadsheetBench: Towards Challenging Real World Spreadsheet Manipulation
- SPRINQL: Sub-optimal Demonstrations driven Offline Imitation Learning
- SR-CACO-2: A Dataset for Confocal Fluorescence Microscopy Image Super-Resolution
- SRFUND: A Multi-Granularity Hierarchical Structure Reconstruction Benchmark in Form Understanding
- SS1: Accelerating Inference with Fast and Expressive Sketch Structured Transform
- SS3DM: Benchmarking Street-View Surface Reconstruction with a Synthetic 3D Mesh Dataset
- SSA-Seg: Semantic and Spatial Adaptive Pixel-level Classifier for Semantic Segmentation
- SSDiff: Spatial-spectral Integrated Diffusion Model for Remote Sensing Pansharpening
- SSDM: Scalable Speech Dysfluency Modeling
- S-SOS: Stochastic Sum-Of-Squares for Parametric Polynomial Optimization
- S-STE: Continuous Pruning Function for Efficient 2:4 Sparse Pre-training
- ST$_k$: A Scalable Module for Solving Top-k Problems
- Stability and Generalizability in SDE Diffusion Models with Measure-Preserving Dynamics
- Stability and Generalization of Adversarial Training for Shallow Neural Networks with Smooth Activation
- Stability and Generalization of Asynchronous SGD: Sharper Bounds Beyond Lipschitz and Smoothness
- Stabilized Proximal-Point Methods for Federated Optimization
- Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective
- Stabilizing Linear Passive-Aggressive Online Learning with Weighted Reservoir Sampling
- Stabilizing Zero-Shot Prediction: A Novel Antidote to Forgetting in Continual Vision-Language Tasks
- Stable Minima Cannot Overfit in Univariate ReLU Networks: Generalization by Large Step Sizes
- Stable-Pose: Leveraging Transformers for Pose-Guided Text-to-Image Generation
- StackEval: Benchmarking LLMs in Coding Assistance
- Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training
- Star-Agents: Automatic Data Optimization with LLM Agents for Instruction Tuning
- STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases
- START: A Generalized State Space Model with Saliency-Driven Token-Aware Transformation
- State Chrono Representation for Enhancing Generalization in Reinforcement Learning
- State-free Reinforcement Learning
- State Space Models on Temporal Graphs: A First-Principles Study
- Statistical and Geometrical properties of the Kernel Kullback-Leibler divergence
- Statistical-Computational Trade-offs for Density Estimation
- Statistical Efficiency of Distributional Temporal Difference Learning
- Statistical Estimation in the Spiked Tensor Model via the Quantum Approximate Optimization Algorithm
- Statistical Frontiers in LLMs and Foundation Models
- Statistical Inference for Fairness Auditing
- Statistical Multicriteria Benchmarking via the GSD-Front
- Stealth edits to large language models
- StepbaQ: Stepping backward as Correction for Quantized Diffusion Models
- Stepping Forward on the Last Mile
- Stepping on the Edge: Curvature Aware Learning Rate Tuners
- Stepwise Alignment for Constrained Language Model Policy Optimization
- STimage-1K4M: A histopathology image-gene expression dataset for spatial transcriptomics
- STL: Still Tricky Logic (for System Validation, Even When Showing Your Work)
- Stochastic Amortization: A Unified Approach to Accelerate Feature and Data Attribution
- Stochastic Concept Bottleneck Models
- Stochastic contextual bandits with graph feedback: from independence number to MAS number
- Stochastic Extragradient with Flip-Flop Shuffling & Anchoring: Provable Improvements
- Stochastic Kernel Regularisation Improves Generalisation in Deep Kernel Machines
- Stochastic Newton Proximal Extragradient Method
- Stochastic Optimal Control and Estimation with Multiplicative and Internal Noise
- Stochastic Optimal Control for Diffusion Bridges in Function Spaces
- Stochastic Optimal Control Matching
- Stochastic Optimization Algorithms for Instrumental Variable Regression with Streaming Data
- Stochastic Optimization Schemes for Performative Prediction with Nonconvex Loss
- Stochastic Taylor Derivative Estimator: Efficient amortization for arbitrary differential operators
- Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity
- STONE: A Submodular Optimization Framework for Active 3D Object Detection
- Stopping Bayesian Optimization with Probabilistic Regret Bounds
- StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation
- Strategic Linear Contextual Bandits
- Strategic Littlestone Dimension: Improved Bounds on Online Strategic Classification
- Strategic Multi-Armed Bandit Problems Under Debt-Free Reporting
- StrategyLLM: Large Language Models as Strategy Generators, Executors, Optimizers, and Evaluators for Problem Solving
- Stratified Prediction-Powered Inference for Effective Hybrid Evaluation of Language Models
- StreamBench: Towards Benchmarking Continuous Improvement of Language Agents
- StreamFlow: Streamlined Multi-Frame Optical Flow Estimation for Video Sequences
- Streaming Bayes GFlowNets
- Streaming Detection of Queried Event Start
- StreamingDialogue: Prolonged Dialogue Learning via Long Context Compression with Minimal Losses
- Streaming Long Video Understanding with Large Language Models
- Stress-Testing Capability Elicitation With Password-Locked Models
- Stress-Testing Long-Context Language Models with Lifelong ICL and Task Haystack
- Stronger Than You Think: Benchmarking Weak Supervision on Realistic Tasks
- Structural Inference of Dynamical Systems with Conjoined State Space Models
- Structure Consistent Gaussian Splatting with Matching Prior for Few-shot Novel View Synthesis
- Structured flexibility in recurrent neural networks via neuromodulation
- Structured Learning of Compositional Sequential Interventions
- Structured Matrix Basis for Multivariate Time Series Forecasting with Interpretable Dynamics
- Structured Multi-Track Accompaniment Arrangement via Style Prior Modelling
- Structured Unrestricted-Rank Matrices for Parameter Efficient Finetuning
- Studying How to Efficiently and Effectively Guide Models with Explanations - A Reproducibility Study
- Style Adaptation and Uncertainty Estimation for Multi-Source Blended-Target Domain Adaptation
- Stylebreeder: Exploring and Democratizing Artistic Styles through Text-to-Image Models
- Stylus: Automatic Adapter Selection for Diffusion Models
- SubgDiff: A Subgraph Diffusion Model to Improve Molecular Representation Learning
- Subject-driven Text-to-Image Generation via Preference-based Reinforcement Learning
- SubjECTive-QA: A dataset for the subjective evaluation of answers in Earnings Call Transcripts (ECTs)
- Sub-optimal Experts mitigate Ambiguity in Inverse Reinforcement Learning
- Subsurface Scattering for Gaussian Splatting
- Subwords as Skills: Tokenization for Sparse-Reward Reinforcement Learning
- SUGARCREPE++ Dataset: Vision-Language Model Sensitivity to Semantic and Lexical Alterations
- Suitable is the Best: Task-Oriented Knowledge Fusion in Vulnerability Detection
- Super Consistency of Neural Network Landscapes and Learning Rate Transfer
- SuperDeepFool: a new fast and accurate minimal adversarial attack
- Superposed Decoding: Multiple Generations from a Single Autoregressive Inference Pass
- Supervised Kernel Thinning
- SuperVLAD: Compact and Robust Image Descriptors for Visual Place Recognition
- Suppress Content Shift: Better Diffusion Features via Off-the-Shelf Generation Techniques
- Supra-Laplacian Encoding for Transformer on Dynamic Graphs
- SureMap: Simultaneous mean estimation for single-task and multi-task disaggregated evaluation
- Surge Phenomenon in Optimal Learning Rate and Batch Size Scaling
- SurgicAI: A Fine-grained Platform for Data Collection and Benchmarking in Surgical Policy Learning
- SustainDC: Benchmarking for Sustainable Data Center Control
- SVFT: Parameter-Efficient Fine-Tuning with Singular Vectors
- SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering
- Swift Sampler: Efficient Learning of Sampler by 10 Parameters
- SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention
- SWT-Bench: Testing and Validating Real-World Bug-Fixes with Code Agents
- Symbolic Regression with a Learned Concept Library
- SymILO: A Symmetry-Aware Learning Framework for Integer Linear Optimization
- Symmetric Linear Bandits with Hidden Symmetry
- Symmetries in Overparametrized Neural Networks: A Mean Field View
- Symmetry and Geometry in Neural Representations
- Symmetry Discovery Beyond Affine Transformations
- Symmetry-Informed Governing Equation Discovery
- Synatra: Turning Indirect Knowledge into Direct Demonstrations for Digital Agents at Scale
- SyncTweedies: A General Generative Framework Based on Synchronized Diffusions
- SyncVIS: Synchronized Video Instance Segmentation
- Synergistic Dual Spatial-aware Generation of Image-to-text and Text-to-image
- SynRS3D: A Synthetic Dataset for Global 3D Semantic Understanding from Monocular Remote Sensing Imagery
- Synthesize, Partition, then Adapt: Eliciting Diverse Samples from Foundation Models
- Synthetic Programming Elicitation for Text-to-Code in Very Low-Resource Programming and Formal Languages
- System-2 Reasoning at Scale
- T2VSafetyBench: Evaluating the Safety of Text-to-Video Generative Models
- T2Vs Meet VLMs: A Scalable Multimodal Dataset for Visual Harmfulness Recognition
- T2V-Turbo: Breaking the Quality Bottleneck of Video Consistency Model with Mixed Reward Feedback
- TabEBM: A Tabular Data Augmentation Method with Distinct Class-Specific Energy-Based Models
- TableRAG: Million-Token Table Understanding with Language Models
- Table Representation Learning Workshop (TRL)
- TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy
- TabularBench: Benchmarking Adversarial Robustness for Tabular Deep Learning in Real-world Use-cases
- Tackling Climate Change with Machine Learning
- Tackling Uncertain Correspondences for Multi-Modal Entity Alignment
- TACT: Advancing Complex Aggregative Reasoning with Information Extraction Tools
- Tactile DreamFusion: Exploiting Tactile Sensing for 3D Generation
- TAIA: Large Language Models are Out-of-Distribution Data Learners
- Take A Shortcut Back: Mitigating the Gradient Vanishing for Training Spiking Neural Networks
- Talking Heads: Understanding Inter-Layer Communication in Transformer Language Models
- TALoS: Enhancing Semantic Scene Completion via Test-time Adaptation on the Line of Sight
- Taming Cross-Domain Representation Variance in Federated Prototype Learning with Heterogeneous Data Domains
- Taming "data-hungry" reinforcement learning? Stability in continuous state-action spaces
- Taming Diffusion Prior for Image Super-Resolution with Domain Shift SDEs
- Taming Generative Diffusion Prior for Universal Blind Image Restoration
- Taming Heavy-Tailed Losses in Adversarial Bandits and the Best-of-Both-Worlds Setting
- Taming the Long Tail in Human Mobility Prediction
- Tangent Space Causal Inference: Leveraging Vector Fields for Causal Discovery in Dynamical Systems
- TAPTRv2: Attention-based Position Update Improves Tracking Any Point
- TAPVid-3D: A Benchmark for Tracking Any Point in 3D
- Targeted Sequential Indirect Experiment Design
- Target-Guided Adversarial Point Cloud Transformer Towards Recognition Against Real-world Corruptions
- TARP-VP: Towards Evaluation of Transferred Adversarial Robustness and Privacy on Label Mapping Visual Prompting Models
- TARSS-Net: Temporal-Aware Radar Semantic Segmentation Network
- Task-Agnostic Machine-Learning-Assisted Inference
- TaskBench: Benchmarking Large Language Models for Task Automation
- Task Confusion and Catastrophic Forgetting in Class-Incremental Learning: A Mathematical Framework for Discriminative and Generative Modelings
- Task Me Anything
- Task-oriented Time Series Imputation Evaluation via Generalized Representers
- Task-recency bias strikes back: Adapting covariances in Exemplar-Free Class Incremental Learning
- Teach Better or Show Smarter? On Instructions and Exemplars in Automatic Prompt Optimization
- Team-Fictitious Play for Reaching Team-Nash Equilibrium in Multi-team Games
- TEG-DB: A Comprehensive Dataset and Benchmark of Textual-Edge Graphs
- Tell What You Hear From What You See - Video to Audio Generation Through Text
- Template-free Articulated Gaussian Splatting for Real-time Reposable Dynamic View Synthesis
- Temporal-Difference Learning Using Distributed Error Signals
- Temporal Graph Neural Tangent Kernel with Graphon-Guaranteed
- Temporally Consistent Atmospheric Turbulence Mitigation with Neural Representations
- Temporal Sentence Grounding with Relevance Feedback in Videos
- Tensor-Based Synchronization and the Low-Rankness of the Block Trifocal Tensor
- Terra: A Multimodal Spatio-Temporal Dataset Spanning the Earth
- Testably Learning Polynomial Threshold Functions
- Testing Calibration in Nearly-Linear Time
- Testing Semantic Importance via Betting
- Test-Time Adaptation Induces Stronger Accuracy and Agreement-on-the-Line
- Test-time Adaptation in Non-stationary Environments via Adaptive Representation Alignment
- Test-Time Dynamic Image Fusion
- Test Where Decisions Matter: Importance-driven Testing for Deep Reinforcement Learning
- Tetrahedron Splatting for 3D Generation
- Text2CAD: Generating Sequential CAD Designs from Beginner-to-Expert Level Text Prompts
- Text2NKG: Fine-Grained N-ary Relation Extraction for N-ary relational Knowledge Graph Construction
- Text-Aware Diffusion for Policy Learning
- TextCtrl: Diffusion-based Scene Text Editing with Prior Guidance Control
- Text-DiFuse: An Interactive Multi-Modal Image Fusion Framework based on Text-modulated Diffusion Model
- Text-Guided Attention is All You Need for Zero-Shot Robustness in Vision-Language Models
- Text-Infused Attention and Foreground-Aware Modeling for Zero-Shot Temporal Action Detection
- Text-space Graph Foundation Models: Comprehensive Benchmarks and New Insights
- Text to Blind Motion
- Textual Training for the Hassle-Free Removal of Unwanted Visual Data: Case Studies on OOD and Hateful Image Detection
- TFGDA: Exploring Topology and Feature Alignment in Semi-supervised Graph Domain Adaptation through Robust Clustering
- TFG: Unified Training-Free Guidance for Diffusion Models
- TFS-NeRF: Template-Free NeRF for Semantic 3D Reconstruction of Dynamic Scene
- TGB 2.0: A Benchmark for Learning on Temporal Knowledge Graphs and Heterogeneous Graphs
- The ALCHEmist: Automated Labeling 500x CHEaper than LLM Data Annotators
- The Art of Saying No: Contextual Noncompliance in Language Models
- The Bayesian sampling in a canonical recurrent circuit with a diversity of inhibitory interneurons
- The Benefits of Balance: From Information Projections to Variance Reduction
- The Best of Both Worlds: On the Dilemma of Out-of-distribution Detection
- The Challenges of the Nonlinear Regime for Physics-Informed Neural Networks
- The Closeness of In-Context Learning and Weight Shifting for Softmax Regression
- The Collusion of Memory and Nonlinearity in Stochastic Approximation With Constant Stepsize
- The Dormant Neuron Phenomenon in Multi-Agent Reinforcement Learning Value Factorization
- The Edge-of-Reach Problem in Offline Model-Based Reinforcement Learning
- The Elephant in the Room: Towards A Reliable Time-Series Anomaly Detection Benchmark
- The Empirical Impact of Neural Parameter Symmetries, or Lack Thereof
- The Evolution of Statistical Induction Heads: In-Context Learning Markov Chains
- The Expressive Capacity of State Space Models: A Formal Language Perspective
- The Factorization Curse: Which Tokens You Predict Underlie the Reversal Curse and More
- The Fairness-Quality Tradeoff in Clustering
- The Feature Speed Formula: a flexible approach to scale hyper-parameters of deep neural networks
- The Fine-Grained Complexity of Gradient Computation for Training Large Language Models
- The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
- The First Workshop on Large Foundation Models for Educational Assessment
- The Fourth Workshop on Efficient Natural Language and Speech Processing (ENLSP-IV): Highlighting New Architectures for Future Foundation Models
- The Fragility of Fairness: Causal Sensitivity Analysis for Fair Machine Learning
- The GAN is dead; long live the GAN! A Modern GAN Baseline
- The Golem vs. Stone Soup: Understanding How Children Learn Can Help Us Understand And Improve AI
- The Group Robustness is in the Details: Revisiting Finetuning under Spurious Correlations
- The High Line: Exact Risk and Learning Rate Curves of Stochastic Adaptive Learning Rate Algorithms
- The Impact of Geometric Complexity on Neural Collapse in Transfer Learning
- The Impact of Initialization on LoRA Finetuning Dynamics
- The Implicit Bias of Adam on Separable Data
- The Implicit Bias of Gradient Descent on Separable Multiclass Data
- The Implicit Bias of Gradient Descent toward Collaboration between Layers: A Dynamic Analysis of Multilayer Perceptrons
- The Implicit Bias of Heterogeneity towards Invariance: A Study of Multi-Environment Matrix Sensing
- The Importance of Being Scalable: Improving the Speed and Accuracy of Neural Network Interatomic Potentials Across Chemical Domains
- The Importance of Online Data: Understanding Preference Fine-tuning via Coverage
- The iNaturalist Sounds Dataset
- The Intelligible and Effective Graph Neural Additive Network
- The Iterative Optimal Brain Surgeon: Faster Sparse Recovery by Leveraging Second-Order Information
- The Ladder in Chaos: Improving Policy Learning by Harnessing the Parameter Evolving Path in A Low-dimensional Space
- The Limits of Differential Privacy in Online Learning
- The Limits of Transfer Reinforcement Learning with Latent Low-rank Structure
- The Mamba in the Llama: Distilling and Accelerating Hybrid Models
- The Many Faces of Optimal Weak-to-Strong Learning
- The Map Equation Goes Neural: Mapping Network Flows with Graph Neural Networks
- The Minimax Rate of HSIC Estimation for Translation-Invariant Kernels
- The motion planning neural circuit in goal-directed navigation as Lie group operator search
- The Multimodal Universe: Enabling Large-Scale Machine Learning with 100TBs of Astronomical Scientific Data
- Theoretical Analysis of Weak-to-Strong Generalization
- Theoretical and Empirical Insights into the Origins of Degree Bias in Graph Neural Networks
- Theoretical Characterisation of the Gauss Newton Conditioning in Neural Networks
- Theoretical Foundations of Deep Selective State-Space Models
- Theoretical guarantees in KL for Diffusion Flow Matching
- Theoretical Investigations and Practical Enhancements on Tail Task Risk Minimization in Meta Learning
- The Poisson Midpoint Method for Langevin Dynamics: Provably Efficient Discretization for Diffusion Models
- The Power of Extrapolation in Federated Learning
- The Power of Hard Attention Transformers on Data Sequences: A formal language theoretic perspective
- The Power of Resets in Online Reinforcement Learning
- The Prevalence of Neural Collapse in Neural Multivariate Regression
- The Price of Implicit Bias in Adversarially Robust Generalization
- The PRISM Alignment Dataset: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models
- There is No Silver Bullet: Benchmarking Methods in Predictive Combinatorial Optimization
- The Reliability of OKRidge Method in Solving Sparse Ridge Regression Problems
- The Representation Landscape of Few-Shot Learning and Fine-Tuning in Large Language Models
- The Road Less Scheduled
- The Sample-Communication Complexity Trade-off in Federated Q-Learning
- The Sample Complexity of Gradient Descent in Stochastic Convex Optimization
- The Scandinavian Embedding Benchmarks: Comprehensive Assessment of Multilingual and Monolingual Text Embedding
- The Secretary Problem with Predicted Additive Gap
- The Selective $G$-Bispectrum and its Inversion: Applications to $G$-Invariant Networks
- The Space Complexity of Approximating Logistic Loss
- The Star Geometry of Critic-Based Regularizer Learning
- The State of Data Curation at NeurIPS: An Assessment of Dataset Development Practices in the Datasets and Benchmarks Track
- The Surprising Effectiveness of SP Voting with Partial Preferences
- The surprising efficiency of temporal difference learning for rare event prediction
- The Surprising Ineffectiveness of Pre-Trained Visual Representations for Model-Based Reinforcement Learning
- The tree autoencoder model, with application to hierarchical data visualization
- The Unmet Promise of Synthetic Training Images: Using Retrieved Real Images Performs Better
- The Value of Reward Lookahead in Reinforcement Learning
- The Well: a Large-Scale Collection of Diverse Physics Simulations for Machine Learning
- Thinking Forward: Memory-Efficient Federated Finetuning of Language Models
- This Too Shall Pass: Removing Stale Observations in Dynamic Bayesian Optimization
- Thompson Sampling For Combinatorial Bandits: Polynomial Regret and Mismatched Sampling Paradox
- Thought of Search: Planning with Language Models Through The Lens of Efficiency
- Through the Looking-Glass: Tracing Shifts in AI Data Consent across the Web
- Tight Bounds for Learning RUMs from Small Slates
- Tighter Convergence Bounds for Shuffled SGD via Primal-Dual Perspective
- Tight Rates for Bandit Control Beyond Quadratics
- Time-Constrained Robust MDPs
- Time-FFM: Towards LM-Empowered Federated Foundation Model for Time Series Forecasting
- Time Makes Space: Emergence of Place Fields in Networks Encoding Temporally Continuous Sensory Experiences
- Time-MMD: A New Multi-Domain Multimodal Dataset for Time Series Analysis
- Time-Reversal Provides Unsupervised Feedback to LLMs
- Time Series in the Age of Large Models
- Time-Varying LoRA: Towards Effective Cross-Domain Fine-Tuning of Diffusion Models
- TimeXer: Empowering Transformers for Time Series Forecasting with Exogenous Variables
- TinyLUT: Tiny Look-Up Table for Efficient Image Restoration at the Edge
- Tiny Time Mixers (TTMs): Fast Pre-trained Models for Enhanced Zero/Few-Shot Forecasting of Multivariate Time Series
- TinyTTA: Efficient Test-time Adaptation via Early-exit Ensembles on Edge Devices
- To Believe or Not to Believe Your LLM: Iterative Prompting for Estimating Epistemic Uncertainty
- To Err Like Human: Affective Bias-Inspired Measures for Visual Emotion Recognition Evaluation
- Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis
- To Learn or Not to Learn, That is the Question — A Feature-Task Dual Learning Model of Perceptual Learning
- Tolerant Algorithms for Learning with Arbitrary Covariate Shift
- TOPA: Extending Large Language Models for Video Understanding via Text-Only Pre-Alignment
- Topic-Conversation Relevance (TCR) Dataset and Benchmarks
- TopoFR: A Closer Look at Topology Alignment on Face Recognition
- Topological Generalization Bounds for Discrete-Time Stochastic Optimization Algorithms
- Topological Hidden Markov Models
- Topological obstruction to the training of shallow ReLU neural networks
- TopoLogic: An Interpretable Pipeline for Lane Topology Reasoning on Driving Scenes
- TorchOpt: An Efficient Library for Differentiable Optimization
- TorchSpatial: A Location Encoding Framework and Benchmark for Spatial Representation Learning
- Touchstone Benchmark: Are We on the Right Way for Evaluating AI Algorithms for Medical Segmentation?
- Toward Approaches to Scalability in 3D Human Pose Estimation
- Toward a Stable, Fair, and Comprehensive Evaluation of Object Hallucination in Large Vision-Language Models
- Toward a Well-Calibrated Discrimination via Survival Outcome-Aware Contrastive Learning
- Toward Conditional Distribution Calibration in Survival Prediction
- Toward Dynamic Non-Line-of-Sight Imaging with Mamba Enforced Temporal Consistency
- Toward Efficient Inference for Mixture of Experts
- Toward Global Convergence of Gradient EM for Over-Parameterized Gaussian Mixture Models
- Toward Industrial Artificial Intelligence
- Toward Real Ultra Image Segmentation: Leveraging Surrounding Context to Cultivate General Segmentation Model
- Toward Robust Incomplete Multimodal Sentiment Analysis via Hierarchical Representation Learning
- Towards Accurate and Fair Cognitive Diagnosis via Monotonic Data Augmentation
- Towards an Information Theoretic Framework of Context-Based Offline Meta-Reinforcement Learning
- Towards a Scalable Reference-Free Evaluation of Generative Models
- Towards a Theoretical Understanding of the 'Reversal Curse' via Training Dynamics
- Towards a theory of how the structure of language is acquired by deep neural networks
- Towards a "Universal Translator" for Neural Dynamics at Single-Cell, Single-Spike Resolution
- Towards Calibrated Robust Fine-Tuning of Vision-Language Models
- Towards Combating Frequency Simplicity-biased Learning for Domain Generalization
- Towards Comprehensive Detection of Chinese Harmful Memes: Dataset and Detector
- Towards Croppable Implicit Neural Representations
- Towards Diverse Device Heterogeneous Federated Learning via Task Arithmetic Knowledge Integration
- Towards Dynamic Message Passing on Graphs
- Towards Editing Time Series
- Towards Effective Planning Strategies for Dynamic Opinion Networks
- Towards Efficient and Optimal Covariance-Adaptive Algorithms for Combinatorial Semi-Bandits
- Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing
- Toward Semantic Gaze Target Detection
- Towards Estimating Bounds on the Effect of Policies under Unobserved Confounding
- Towards Exact Gradient-based Training on Analog In-memory Computing
- Towards Explainable Evaluation Metrics for Machine Translation
- Towards Flexible 3D Perception: Object-Centric Occupancy Completion Augments 3D Object Detection
- Towards Flexible Visual Relationship Segmentation
- Towards General Loop Invariant Generation: A Benchmark of Programs with Memory Manipulation
- Towards Global Optimal Visual In-Context Learning Prompt Selection
- Towards Harmless Rawlsian Fairness Regardless of Demographic Prior
- Towards Heterogeneous Long-tailed Learning: Benchmarking, Metrics, and Toolbox
- Towards Human-AI Complementarity with Prediction Sets
- Towards Learning Group-Equivariant Features for Domain Adaptive 3D Detection
- Towards Multi-dimensional Explanation Alignment for Medical Classification
- Towards Multi-Domain Learning for Generalizable Video Anomaly Detection
- Towards Neuron Attributions in Multi-Modal Large Language Models
- Towards Next-Generation Logic Synthesis: A Scalable Neural Circuit Generation Framework
- Towards Next-Level Post-Training Quantization of Hyper-Scale Transformers
- Towards Open Respiratory Acoustic Foundation Models: Pretraining and Benchmarking
- Towards Open-Vocabulary Semantic Segmentation Without Semantic Labels
- Towards Principled Graph Transformers
- Towards Reliable Model Selection for Unsupervised Domain Adaptation: An Empirical Study and A Certified Baseline
- Towards Robust Multimodal Sentiment Analysis with Incomplete Data
- Towards Safe Concept Transfer of Multi-Modal Diffusion via Causal Representation Editing
- Towards Safe & Trustworthy Agents
- Towards Scalable and Stable Parallelization of Nonlinear RNNs
- Towards Stable Representations for Protein Interface Prediction
- Towards the Dynamics of a DNN Learning Symbolic Interactions
- Towards the Transferability of Rewards Recovered via Regularized Inverse Reinforcement Learning
- Towards training digitally-tied analog blocks via hybrid gradient computation
- Towards Understanding Evolving Patterns in Sequential Data
- Towards Understanding Extrapolation: a Causal Lens
- Towards Understanding How Transformers Learn In-context Through a Representation Learning Lens
- Towards Understanding the Working Mechanism of Text-to-Image Diffusion Model
- Towards Unified Multimodal Editing with Enhanced Knowledge Collaboration
- Towards Universal Mesh Movement Networks
- Towards Unsupervised Model Selection for Domain Adaptive Object Detection
- Towards Visual Text Design Transfer Across Languages
- Toxicity Detection for Free
- TPC: Test-time Procrustes Calibration for Diffusion-based Human Image Animation
- TPR: Topology-Preserving Reservoirs for Generalized Zero-Shot Learning
- Trace is the Next AutoDiff: Generative Optimization with Rich Feedback, Execution Traces, and LLMs
- Tracing Hyperparameter Dependencies for Model Parsing via Learnable Graph Pooling Network
- TrackIME: Enhanced Video Point Tracking via Instance Motion Estimation
- TrAct: Making First-layer Pre-Activations Trainable
- Trade-Offs of Diagonal Fisher Information Matrix Estimators
- Trading off Consistency and Dimensionality of Convex Surrogates for Multiclass Classification
- Trading Place for Space: Increasing Location Resolution Reduces Contextual Capacity in Hippocampal Codes
- Train-Attention: Meta-Learning Where to Focus in Continual Knowledge Learning
- Training an Open-Vocabulary Monocular 3D Detection Model without 3D Data
- Training Binary Neural Networks via Gaussian Variational Inference and Low-Rank Semidefinite Programming
- Training Compute-Optimal Protein Language Models
- Training Data Attribution via Approximate Unrolling
- Training Dynamics of Transformers to Recognize Word Co-occurrence via Gradient Flow Analysis
- Training for Stable Explanation for Free
- Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy
- Training-Free Open-Ended Object Detection and Segmentation via Attention as Prompts
- TrajCLIP: Pedestrian trajectory prediction method using contrastive learning and idempotent networks
- Trajectory Data Suffices for Statistically Efficient Learning in Offline RL with Linear $q^\pi$-Realizability and Concentrability
- Trajectory Diffusion for ObjectGoal Navigation
- Trajectory Flow Matching with Applications to Clinical Time Series Modelling
- TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration
- Transcendence: Generative Models Can Outperform The Experts That Train Them
- Transcoders find interpretable LLM feature circuits
- Transductive Active Learning: Theory and Applications
- Transductive Learning is Compact
- Transferability Bound Theory: Exploring Relationship between Adversarial Transferability and Flatness
- Transferable Adversarial Attacks on SAM and Its Downstream Models
- Transferable Boltzmann Generators
- Transfer Learning for Diffusion Models
- Transfer Learning for Latent Variable Network Models
- Transfer Learning with Informative Priors: Simple Baselines Better than Previously Reported
- Transfer Q-star: Principled Decoding for LLM Alignment
- Transferring disentangled representations: bridging the gap between synthetic and real images
- Transformation-Invariant Learning and Theoretical Guarantees for OOD Generalization
- Transformer Doctor: Diagnosing and Treating Vision Transformers
- Transformers are Minimax Optimal Nonparametric In-Context Learners
- Transformers as Game Players: Provable In-context Game-playing Capabilities of Pre-trained Models
- Transformers Can Do Arithmetic with the Right Embeddings
- Transformers Learn to Achieve Second-Order Convergence Rates for In-Context Linear Regression
- Transformers need glasses! Information over-squashing in language tasks
- Transformers on Markov data: Constant depth suffices
- Transformers Represent Belief State Geometry in their Residual Stream
- Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models
- Transforming Vision Transformer: Towards Efficient Multi-Task Asynchronous Learner
- Transition Constrained Bayesian Optimization via Markov Decision Processes
- TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation
- Trap-MID: Trapdoor-based Defense against Model Inversion Attacks
- Treatment of Statistical Estimation Problems in Randomized Smoothing for Adversarial Robustness
- Treeffuser: probabilistic prediction via conditional diffusions with gradient-boosted trees
- Tree of Attacks: Jailbreaking Black-Box LLMs Automatically
- TreeVI: Reparameterizable Tree-structured Variational Inference for Instance-level Correlation Capturing
- Tri-Level Navigator: LLM-Empowered Tri-Level Learning for Time Series OOD Generalization
- TripletCLIP: Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives
- Truncated Variance Reduced Value Iteration
- Truthful High Dimensional Sparse Linear Regression
- Truthfulness of Calibration Measures
- Truth is Universal: Robust Detection of Lies in LLMs
- TSDS: Data Selection for Task-Specific Model Finetuning
- TSGM: A Flexible Framework for Generative Modeling of Synthetic Time Series
- TuneTables: Context Optimization for Scalable Prior-Data Fitted Networks
- TurboHopp: Accelerated Molecule Scaffold Hopping with Consistency Models
- Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging
- Two-way Deconfounder for Off-policy Evaluation in Causal Reinforcement Learning
- Typicalness-Aware Learning for Failure Detection
- UAV3D: A Large-scale 3D Perception Benchmark for Unmanned Aerial Vehicles
- UDA: A Benchmark Suite for Retrieval Augmented Generation in Real-World Document Analysis
- UDC: A Unified Neural Divide-and-Conquer Framework for Large-Scale Combinatorial Optimization Problems
- U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers
- UDON: Universal Dynamic Online distillatioN for generic image representations
- UDPM: Upsampling Diffusion Probabilistic Models
- UGC: Universal Graph Coarsening
- UKnow: A Unified Knowledge Protocol with Multimodal Knowledge Graph Datasets for Reasoning and Vision-Language Pre-Training
- UltraEdit: Instruction-based Fine-Grained Image Editing at Scale
- Ultrafast classical phylogenetic method beats large protein language models on variant effect prediction
- UltraMedical: Building Specialized Generalists in Biomedicine
- UltraPixel: Advancing Ultra High-Resolution Image Synthesis to New Peaks
- UMB: Understanding Model Behavior for Open-World Object Detection
- UMFC: Unsupervised Multi-Domain Feature Calibration for Vision-Language Models
- Uncertainty-aware Fine-tuning of Segmentation Foundation Models
- Uncertainty-based Offline Variational Bayesian Reinforcement Learning for Robustness under Diverse Data Corruptions
- Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in LLMs
- Unchosen Experts Can Contribute Too: Unleashing MoE Models’ Power by Self-Contrast
- Unconditional stability of a recurrent neural circuit implementing divisive normalization
- Uncovering, Explaining, and Mitigating the Superficial Safety of Backdoor Defense
- Uncovering Safety Risks of Large Language Models through Concept Activation Vector
- Uncovering the Redundancy in Graph Self-supervised Learning Models
- Understanding and Improving Adversarial Collaborative Filtering for Robust Recommendation
- Understanding and Improving Training-free Loss-based Diffusion Guidance
- Understanding and Minimising Outlier Features in Transformer Training
- Understanding Bias in Large-Scale Visual Datasets
- Understanding Emergent Abilities of Language Models from the Loss Perspective
- Understanding Generalizability of Diffusion Models Requires Rethinking the Hidden Gaussian Structure
- Understanding Hallucinations in Diffusion Models through Mode Interpolation
- Understanding Information Storage and Transfer in Multi-Modal Large Language Models
- Understanding Linear Probing then Fine-tuning Language Models from NTK Perspective
- Understanding Model Selection for Learning in Strategic Environments
- Understanding Multi-Granularity for Open-Vocabulary Part Segmentation
- Understanding Representation of Deep Equilibrium Models from Neural Collapse Perspective
- Understanding Scaling Laws with Statistical and Approximation Theory for Transformer Neural Networks on Intrinsically Low-dimensional Data
- Understanding the Differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks
- Understanding the Expressive Power and Mechanisms of Transformer for Sequence Modeling
- Understanding the Expressivity and Trainability of Fourier Neural Operator: A Mean-Field Perspective
- Understanding the Gains from Repeated Self-Distillation
- Understanding the Limits of Vision Language Models Through the Lens of the Binding Problem
- Understanding the Role of Equivariance in Self-supervised Learning
- Understanding the Transferability of Representations via Task-Relatedness
- Understanding Transformer Reasoning Capabilities via Graph Algorithms
- Understanding Transformers via N-Gram Statistics
- Understanding Visual Feature Reliance through the Lens of Complexity
- Unelicitable Backdoors via Cryptographic Transformer Circuits
- UniAR: A Unified model for predicting human Attention and Responses on visual content
- UniAudio 1.5: Large Language Model-Driven Audio Codec is A Few-Shot Audio Task Learner
- UniBench: Visual Reasoning Requires Rethinking Vision-Language Beyond Scaling
- UniBias: Unveiling and Mitigating LLM Bias through Internal Attention and FFN Manipulation
- UniDSeg: Unified Cross-Domain 3D Semantic Segmentation via Visual Foundation Models Prior
- Unified Covariate Adjustment for Causal Inference
- Unified Domain Generalization and Adaptation for Multi-View 3D Object Detection
- Unified Generative and Discriminative Training for Multi-modal Large Language Models
- Unified Gradient-Based Machine Unlearning with Remain Geometry Enhancement
- Unified Graph Augmentations for Generalized Contrastive Learning on Graphs
- Unified Guidance for Geometry-Conditioned Molecular Generation
- Unified Insights: Harnessing Multi-modal Data for Phenotype Imputation via View Decoupling
- Unified Lexical Representation for Interpretable Visual-Language Alignment
- Unified Mechanism-Specific Amplification by Subsampling and Group Privacy Amplification
- Unified Speech Recognition: A Single Model for Auditory, Visual, and Audiovisual Inputs
- UniFL: Improve Latent Diffusion Model via Unified Feedback Learning
- Uniform Last-Iterate Guarantee for Bandits and Reinforcement Learning
- Unifying Generation and Prediction on Graphs with Latent Graph Diffusion
- Unifying Homophily and Heterophily for Spectral Graph Neural Networks via Triple Filter Ensembles
- UniGAD: Unifying Multi-level Graph Anomaly Detection
- UniIF: Unified Molecule Inverse Folding
- Uni-Med: A Unified Medical Generalist Foundation Model For Multi-Task Learning Via Connector-MoE
- UniMTS: Unified Pre-training for Motion Time Series
- UNION: Unsupervised 3D Object Detection using Object Appearance-based Pseudo-Classes
- Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image
- UniReps: Unifying Representations in Neural Models
- UniSDF: Unifying Neural Representations for High-Fidelity 3D Reconstruction of Complex Scenes with Reflections
- Unitary Convolutions for Learning on Graphs and Groups
- United We Stand, Divided We Fall: Fingerprinting Deep Neural Networks via Adversarial Trajectories
- UniTox: Leveraging LLMs to Curate a Unified Dataset of Drug-Induced Toxicity from FDA Labels
- UniTS: A Unified Multi-Task Time Series Model
- UNIT: Unifying Image and Text Recognition in One Vision Encoder
- Unity by Diversity: Improved Representation Learning for Multimodal VAEs
- Universal Exact Compression of Differentially Private Mechanisms
- Universal In-Context Approximation By Prompting Fully Recurrent Models
- Universality in Transfer Learning for Linear Models
- Universality of AdaGrad Stepsizes for Stochastic Optimization: Inexact Oracle, Acceleration and Variance Reduction
- Universal Neural Functionals
- Universal Online Convex Optimization with $1$ Projection per Round
- Universal Physics Transformers: A Framework For Efficiently Scaling Neural Operators
- Universal Rates for Active Learning
- Universal Rates of Empirical Risk Minimization
- Universal Sample Coding
- Unlearnable 3D Point Clouds: Class-wise Transformation Is All You Need
- UnlearnCanvas: Stylized Image Dataset for Enhanced Machine Unlearning Evaluation in Diffusion Models
- Unleashing Multispectral Video's Potential in Semantic Segmentation: A Semi-supervised Viewpoint and New UAV-View Benchmark
- Unleashing Region Understanding in Intermediate Layers for MLLM-based Referring Expression Generation
- Unleashing the Denoising Capability of Diffusion Prior for Solving Inverse Problems
- Unleashing the Potential of the Diffusion Model in Few-shot Semantic Segmentation
- Unlocking the Capabilities of Masked Generative Models for Image Synthesis via Self-Guidance
- Unlocking the Capabilities of Thought: A Reasoning Boundary Framework to Quantify and Optimize Chain-of-Thought
- Unlocking the Potential of Global Human Expertise
- Unlocking Tokens as Data Points for Generalization Bounds on Larger Language Models
- Unlock the Intermittent Control Ability of Model Free Reinforcement Learning
- Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback
- Unraveling Molecular Structure: A Multimodal Spectroscopic Dataset for Chemistry
- Unraveling the Gradient Descent Dynamics of Transformers
- Unravelling in Collaborative Learning
- Unrolled denoising networks provably learn to perform optimal Bayesian inference
- Unscrambling disease progression at scale: fast inference of event permutations with optimal transport
- UnSeg: One Universal Unlearnable Example Generator is Enough against All Image Segmentation
- Unsupervised Anomaly Detection Algorithms on Real-world Data: How Many Do We Need?
- Unsupervised Anomaly Detection in The Presence of Missing Values
- Unsupervised Discovery of Formulas for Mathematical Constants
- Unsupervised Hierarchy-Agnostic Segmentation: Parsing Semantic Image Structure
- Unsupervised Homography Estimation on Multimodal Image Pair via Alternating Optimization
- Unsupervised Modality Adaptation with Text-to-Image Diffusion Models for Semantic Segmentation
- Unsupervised Object Detection with Theoretical Guarantees
- Untrained Neural Nets for Snapshot Compressive Imaging: Theory and Algorithms
- Unveil Benign Overfitting for Transformer in Vision: Training Dynamics, Convergence, and Generalization
- Unveiling and Mitigating Backdoor Vulnerabilities based on Unlearning Weight Changes and Backdoor Activeness
- Unveiling Causal Reasoning in Large Language Models: Reality or Mirage?
- Unveiling Encoder-Free Vision-Language Models
- Unveiling Induction Heads: Provable Training Dynamics and Feature Learning in Transformers
- Unveiling LoRA Intrinsic Ranks via Salience Analysis
- Unveiling the Bias Impact on Symmetric Moral Consistency of Large Language Models
- Unveiling the Hidden: Online Vectorized HD Map Construction with Clip-Level Token Interaction and Propagation
- Unveiling the Hidden Structure of Self-Attention via Kernel Principal Component Analysis
- Unveiling The Matthew Effect Across Channels: Assessing Layer Width Sufficiency via Weight Norm Variance
- Unveiling the Potential of Robustness in Selecting Conditional Average Treatment Effect Estimators
- Unveiling the Tapestry of Consistency in Large Vision-Language Models
- Unveiling User Satisfaction and Creator Productivity Trade-Offs in Recommendation Platforms
- Upping the Game: How 2D U-Net Skip Connections Flip 3D Segmentation
- UPS: Unified Projection Sharing for Lightweight Single-Image Super-resolution and Beyond
- UQE: A Query Engine for Unstructured Databases
- UQ-Guided Hyperparameter Optimization for Iterative Learners
- UrbanDataLayer: A Unified Data Pipeline for Urban Science
- UrbanKGent: A Unified Large Language Model Agent Framework for Urban Knowledge Graph Construction
- USCILab3D: A Large-scale, Long-term, Semantically Annotated Outdoor Dataset
- User-Creator Feature Polarization in Recommender Systems with Dual Influence
- User-item fairness tradeoffs in recommendations
- Using Noise to Infer Aspects of Simplicity Without Learning
- Using Surrogates in Covariate-adjusted Response-adaptive Randomization Experiments with Delayed Outcomes
- Using Time-Aware Graph Neural Networks to Predict Temporal Centralities in Dynamic Graphs
- Using Unity to Help Solve Reinforcement Learning
- Utilizing Human Behavior Modeling to Manipulate Explanations in AI-Assisted Decision Making: The Good, the Bad, and the Scary
- Utilizing Image Transforms and Diffusion Models for Generative Modeling of Short and Long Time Series
- UV-free Texture Generation with Denoising and Geodesic Heat Diffusion
- Vaccine: Perturbation-aware Alignment for Large Language Models against Harmful Fine-tuning Attack
- Validating Climate Models with Spherical Convolutional Wasserstein Distance
- Value-Based Deep Multi-Agent Reinforcement Learning with Dynamic Sparse Training
- Value Imprint: A Technique for Auditing the Human Values Embedded in RLHF Datasets
- Variance estimation in compound decision theory under boundedness
- Variational Delayed Policy Optimization
- Variational Distillation of Diffusion Policies into Mixture of Experts
- Variational Flow Matching for Graph Generation
- Variational Multi-scale Representation for Estimating Uncertainty in 3D Gaussian Splatting
- Variation Spaces for Multi-Output Neural Networks: Insights on Multi-Task Learning and Network Compression
- VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time
- VastTrack: Vast Category Visual Object Tracking
- VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks
- VCR-GauS: View Consistent Depth-Normal Regularizer for Gaussian Surface Reconstruction
- Vector Quantization Prompting for Continual Learning
- VeLoRA: Memory Efficient Training using Rank-1 Sub-Token Projections
- Verifiably Robust Conformal Prediction
- VERIFIED: A Video Corpus Moment Retrieval Benchmark for Fine-Grained Video Understanding
- Verified Code Transpilation with LLMs
- Verified Safe Reinforcement Learning for Neural Network Dynamic Models
- VeXKD: The Versatile Integration of Cross-Modal Fusion and Knowledge Distillation for 3D Perception
- VFIMamba: Video Frame Interpolation with State Space Models
- VHELM: A Holistic Evaluation of Vision Language Models
- Video Diffusion Models are Training-free Motion Interpreter and Controller
- VideoGUI: A Benchmark for GUI Automation from Instructional Videos
- VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths Vision Computation
- VideoTetris: Towards Compositional Text-to-Video Generation
- Video Token Merging for Long Video Understanding
- VidMan: Exploiting Implicit Dynamics from Video Diffusion Model for Effective Robot Manipulation
- VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models
- Vidu4D: Single Generated Video to High-Fidelity 4D Reconstruction with Dynamic Gaussian Surfels
- ViLCo-Bench: VIdeo Language COntinual learning Benchmark
- Virtual Scanning: Unsupervised Non-line-of-sight Imaging from Irregularly Undersampled Transients
- VISA: Variational Inference with Sequential Sample-Average Approximations
- Vision Foundation Model Enables Generalizable Object Pose Estimation
- Vision-Language Models are Strong Noisy Label Detectors
- Vision-Language Navigation with Energy-Based Policy
- VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks
- Vision Mamba Mender
- Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning
- Vision Transformer Neural Architecture Search for Out-of-Distribution Generalization: Benchmark and Insights
- VisMin: Visual Minimal-Change Understanding
- Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability
- Visual Anchors Are Strong Information Aggregators For Multimodal Large Language Model
- Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction
- Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning
- Visual Data Diagnosis and Debiasing with Concept Graphs
- Visual Decoding and Reconstruction via EEG Embeddings with Guided Diffusion
- Visual Fourier Prompt Tuning
- Visual Perception by Large Language Model’s Weights
- Visual Pinwheel Centers Act as Geometric Saliency Detector
- Visual Prompt Tuning in Null Space for Continual Learning
- Visual Riddles: a Commonsense and World Knowledge Challenge for Large Vision and Language Models
- Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models
- Vitron: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing
- Vivid-ZOO: Multi-View Video Generation with Diffusion Model
- VLG-CBM: Training Concept Bottleneck Models with Vision-Language Guidance
- VLKEB: A Large Vision-Language Model Knowledge Editing Benchmark
- VLM4Bio: A Benchmark Dataset to Evaluate Pretrained Vision-Language Models for Trait Discovery from Biological Images
- VLM Agents Generate Their Own Memories: Distilling Experience into Embodied Programs of Thought
- VLMimic: Vision Language Models are Visual Imitation Learner for Fine-grained Actions
- VMamba: Visual State Space Model
- Vocal Call Locator Benchmark (VCL'24) for localizing rodent vocalizations from multi-channel audio
- Voila-A: Aligning Vision-Language Models with User's Gaze Attention
- Voxel Mamba: Group-Free State Space Models for Point Cloud based 3D Object Detection
- Voxel Proposal Network via Multi-Frame Knowledge Distillation for Semantic Scene Completion
- V-PETL Bench: A Unified Visual Parameter-Efficient Transfer Learning Benchmark
- VQ-Map: Bird's-Eye-View Map Layout Estimation in Tokenized Discrete Space via Vector Quantization
- Vript: A Video Is Worth Thousands of Words
- VRSBench: A Versatile Vision-Language Benchmark Dataset for Remote Sensing Image Understanding
- WAGLE: Strategic Weight Attribution for Effective and Modular Unlearning in Large Language Models
- Warm-starting Push-Relabel
- Warm-up Free Policy Optimization: Improved Regret in Linear Markov Decision Processes
- Warped Diffusion: Solving Video Inverse Problems with Image Diffusion Models
- Wasserstein convergence of Cech persistence diagrams for samplings of submanifolds
- Wasserstein Distance Rivals Kullback-Leibler Divergence for Knowledge Distillation
- Wasserstein Distributionally Robust Optimization through the Lens of Structural Causal Models and Individual Fairness
- Wasserstein Gradient Boosting: A Framework for Distribution-Valued Supervised Learning
- Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents
- Watermarking for Large Language Models
- Watermarking Makes Language Models Radioactive
- WaterMax: breaking the LLM watermark detectability-robustness-quality trade-off
- WATT: Weight Average Test Time Adaptation of CLIP
- WaveAttack: Asymmetric Frequency Obfuscation-based Backdoor Attacks Against Deep Neural Networks
- Weak-eval-Strong: Evaluating and Eliciting Lateral Thinking of LLMs with Situation Puzzles
- Weak Supervision Performance Evaluation via Partial Identification
- Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models
- Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs
- Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach
- WebUOT-1M: Advancing Deep Underwater Object Tracking with A Million-Scale Benchmark
- Weight decay induces low-rank attention layers
- Weight Diffusion for Future: Learn to Generalize in Non-Stationary Environments
- Weight for Robustness: A Comprehensive Approach towards Optimal Fault-Tolerant Asynchronous ML
- WeiPer: OOD Detection using Weight Perturbations of Class Projections
- Weisfeiler and Leman Go Loopy: A New Hierarchy for Graph Representational Learning
- WelQrate: Defining the Gold Standard in Small Molecule Drug Discovery Benchmarking
- WenMind: A Comprehensive Benchmark for Evaluating Large Language Models in Chinese Classical Literature and Language Arts
- WFCRL: A Multi-Agent Reinforcement Learning Benchmark for Wind Farm Control
- What does guidance do? A fine-grained analysis in a simple setting
- What do Graph Neural Networks learn? Insights from Tropical Geometry
- What Factors Affect Multi-Modal In-Context Learning? An In-Depth Exploration
- What If the Input is Expanded in OOD Detection?
- What Is Missing For Graph Homophily? Disentangling Graph Homophily For Graph Neural Networks
- What is my quantum computer good for? Quantum capability learning with physics-aware neural networks
- What Makes and Breaks Safety Fine-tuning? A Mechanistic Study
- What Makes CLIP More Robust to Long-Tailed Pre-Training Data? A Controlled Study for Transferable Insights
- What Makes Partial-Label Learning Algorithms Effective?
- What makes unlearning hard and what to do about it
- What Matters in Graph Class Incremental Learning? An Information Preservation Perspective
- What matters when building vision-language models?
- What Rotary Position Embedding Can Tell Us: Identifying Query and Key Weights Corresponding to Basic Syntactic or High-level Semantic Information
- What to Say and When to Say it: Live Fitness Coaching as a Testbed for Situated Interaction
- What type of inference is planning?
- What Variables Affect Out-of-Distribution Generalization in Pretrained Models?
- When are dynamical systems learned from time series data statistically accurate?
- When does perceptual alignment benefit vision representations?
- When is an Embedding Model More Promising than Another?
- When Is Inductive Inference Possible?
- When is Multicalibration Post-Processing Necessary?
- When LLM Meets DRL: Advancing Jailbreaking Efficiency via DRL-guided Search
- When LLMs Meet Cunning Texts: A Fallacy Understanding Benchmark for Large Language Models
- When to Act and When to Ask: Policy Learning With Deferral Under Hidden Confounding
- When to Sense and Control? A Time-adaptive Approach for Continuous-Time RL
- When Your AIs Deceive You: Challenges of Partial Observability in Reinforcement Learning from Human Feedback
- Where does In-context Learning Happen in Large Language Models?
- Where Do Large Learning Rates Lead Us?
- Where's Waldo: Diffusion Features For Personalized Segmentation and Retrieval
- WhodunitBench: Evaluating Large Multimodal Agents via Murder Mystery Games
- Who Evaluates the Evaluations? Objectively Scoring Text-to-Image Prompt Coherence Metrics with T2IScoreScore (TS2)
- Who's asking? User personas and the mechanics of latent misalignment
- Who’s Gaming the System? A Causally-Motivated Approach for Detecting Strategic Adaptation
- Why are Visually-Grounded Language Models Bad at Image Classification?
- Why Do We Need Weight Decay in Modern Deep Learning?
- Why Go Full? Elevating Federated Learning Through Partial Network Updates
- Why the Metric Backbone Preserves Community Structure
- Why Transformers Need Adam: A Hessian Perspective
- Why Warmup the Learning Rate? Underlying Mechanisms and Improvements
- Wide Two-Layer Networks can Learn from Adversarial Perturbations
- WikiContradict: A Benchmark for Evaluating LLMs on Real-World Knowledge Conflicts from Wikipedia
- WikiDBs: A Large-Scale Corpus Of Relational Databases From Wikidata
- WikiDO: Evaluating Out-of-Distribution Generalization of Vision-Language Models in Cross-Modal Retrieval
- WildGaussians: 3D Gaussian Splatting In the Wild
- Wild-GS: Real-Time Novel View Synthesis from Unconstrained Photo Collections
- WildGuard: Open One-stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs
- WildPPG: A Real-World PPG Dataset of Long Continuous Recordings
- WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models
- WildVision: Evaluating Vision-Language Models in the Wild with Human Preferences
- WindsorML - High-Fidelity Computational Fluid Dynamics Dataset For Automotive Aerodynamics
- Wings: Learning Multimodal LLMs without Text-only Forgetting
- WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models
- WizardArena: Post-training Large Language Models via Simulated Offline Chatbot Arena
- WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks
- Workshop on Behavioral Machine Learning
- Workshop on Machine Learning and Compression
- Workshop on Open-World Agents: Synergizing Reasoning and Decision-Making in Open-World Environments (OWA-2024)
- Workshop on Responsibly Building Next Generation of Multimodal Foundation Models
- Workshop on Scalable Continual Learning for Lifelong Foundation Models
- Workshop on Video-Language Models
- WorldCoder, a Model-Based LLM Agent: Building World Models by Writing Code and Interacting with the Environment
- Wormhole Loss for Partial Shape Matching
- Worst-Case Offline Reinforcement Learning with Arbitrary Data Support
- Would I Lie To You? Inference Time Alignment of Language Models using Direct Preference Heads
- XLand-MiniGrid: Scalable Meta-Reinforcement Learning Environments in JAX
- xLSTM: Extended Long Short-Term Memory
- XMask3D: Cross-modal Mask Reasoning for Open Vocabulary 3D Semantic Segmentation
- xMIL: Insightful Explanations for Multiple Instance Learning in Histopathology
- xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token
- X-Ray: A Sequential 3D Representation For Generation
- Yo'LLaVA: Your Personalized Language and Vision Assistant
- YOLOv10: Real-Time End-to-End Object Detection
- You Don’t Need Domain-Specific Data Augmentations When Scaling Self-Supervised Learning
- YouDream: Generating Anatomically Controllable Consistent Text-to-3D Animals
- You Only Cache Once: Decoder-Decoder Architectures for Language Models
- You Only Look Around: Learning Illumination-Invariant Feature for Low-light Object Detection
- Your contrastive learning problem is secretly a distribution alignment problem
- Your Diffusion Model is Secretly a Noise Classifier and Benefits from Contrastive Training
- ZeroMark: Towards Dataset Ownership Verification without Disclosing Watermark
- Zero-Shot Event-Intensity Asymmetric Stereo via Visual Prompting from Image Domain
- Zero-shot Generalizable Incremental Learning for Vision-Language Object Detection
- Zero-shot Image Editing with Reference Imitation
- Zero-Shot Reinforcement Learning from Low Quality Data
- Zero-Shot Scene Reconstruction from Single Images with Deep Prior Assembly
- Zero-Shot Tokenizer Transfer
- Zero-Shot Transfer of Neural ODEs
- Zeroth-Order Sampling Methods for Non-Log-Concave Distributions: Alleviating Metastability by Denoising Diffusion
- Zero-to-Hero: Enhancing Zero-Shot Novel View Synthesis via Attention Map Filtering
- ZipCache: Accurate and Efficient KV Cache Quantization with Salient Token Identification
- Zipfian Whitening
- Zipper: Addressing Degeneracy in Algorithm-Agnostic Inference
- ZOPP: A Framework of Zero-shot Offboard Panoptic Perception for Autonomous Driving
- ZSC-Eval: An Evaluation Toolkit and Benchmark for Multi-agent Zero-shot Coordination
- μBench: A Microscopy Benchmark for Vision-Language Understanding