Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Enzo Tartaglione

The Simpler The Better: An Entropy-Based Importance Metric To Reduce Neural Networks' Depth

Apr 27, 2024
Victor Quétu, Zhu Liao, Enzo Tartaglione

While deep neural networks are highly effective at solving complex tasks, large pre-trained models are commonly employed even to solve consistently simpler downstream tasks, which do not necessarily require a large model's complexity. Motivated by the awareness of the ever-growing AI environmental impact, we propose an efficiency strategy that leverages prior knowledge transferred by large models. Simple but effective, we propose a method relying on an Entropy-bASed Importance mEtRic (EASIER) to reduce the depth of over-parametrized deep neural networks, which alleviates their computational burden. We assess the effectiveness of our method on traditional image classification setups. The source code will be publicly released upon acceptance of the article.

* arXiv admin note: text overlap with arXiv:2404.16890

Via

Access Paper or Ask Questions

Domain Adaptation for Learned Image Compression with Supervised Adapters

Apr 24, 2024
Alberto Presta, Gabriele Spadaro, Enzo Tartaglione, Attilio Fiandrotti, Marco Grangetto

In Learned Image Compression (LIC), a model is trained at encoding and decoding images sampled from a source domain, often outperforming traditional codecs on natural images; yet its performance may be far from optimal on images sampled from different domains. In this work, we tackle the problem of adapting a pre-trained model to multiple target domains by plugging into the decoder an adapter module for each of them, including the source one. Each adapter improves the decoder performance on a specific domain, without the model forgetting about the images seen at training time. A gate network computes the weights to optimally blend the contributions from the adapters when the bitstream is decoded. We experimentally validate our method over two state-of-the-art pre-trained models, observing improved rate-distortion efficiency on the target domains without penalties on the source domain. Furthermore, the gate's ability to find similarities with the learned target domains enables better encoding efficiency also for images outside them.

* 10 pages, published to Data compression conference 2024 (DCC2024)

Via

Access Paper or Ask Questions

NEPENTHE: Entropy-Based Pruning as a Neural Network Depth's Reducer

Apr 24, 2024
Zhu Liao, Victor Quétu, Van-Tam Nguyen, Enzo Tartaglione

While deep neural networks are highly effective at solving complex tasks, their computational demands can hinder their usefulness in real-time applications and with limited-resources systems. Besides, for many tasks it is known that these models are over-parametrized: neoteric works have broadly focused on reducing the width of these networks, rather than their depth. In this paper, we aim to reduce the depth of over-parametrized deep neural networks: we propose an eNtropy-basEd Pruning as a nEural Network depTH's rEducer (NEPENTHE) to alleviate deep neural networks' computational burden. Based on our theoretical finding, NEPENTHE focuses on un-structurally pruning connections in layers with low entropy to remove them entirely. We validate our approach on popular architectures such as MobileNet and Swin-T, showing that when encountering an over-parametrization regime, it can effectively linearize some layers (hence reducing the model's depth) with little to no performance loss. The code will be publicly available upon acceptance of the article.

Via

Access Paper or Ask Questions

Debiasing surgeon: fantastic weights and how to find them

Mar 21, 2024
Rémi Nahon, Ivan Luiz De Moura Matos, Van-Tam Nguyen, Enzo Tartaglione

Figure 1 for Debiasing surgeon: fantastic weights and how to find them

Figure 2 for Debiasing surgeon: fantastic weights and how to find them

Figure 3 for Debiasing surgeon: fantastic weights and how to find them

Figure 4 for Debiasing surgeon: fantastic weights and how to find them

Nowadays an ever-growing concerning phenomenon, the emergence of algorithmic biases that can lead to unfair models, emerges. Several debiasing approaches have been proposed in the realm of deep learning, employing more or less sophisticated approaches to discourage these models from massively employing these biases. However, a question emerges: is this extra complexity really necessary? Is a vanilla-trained model already embodying some ``unbiased sub-networks'' that can be used in isolation and propose a solution without relying on the algorithmic biases? In this work, we show that such a sub-network typically exists, and can be extracted from a vanilla-trained model without requiring additional training. We further validate that such specific architecture is incapable of learning a specific bias, suggesting that there are possible architectural countermeasures to the problem of biases in deep neural networks.

Via

Access Paper or Ask Questions

Find the Lady: Permutation and Re-Synchronization of Deep Neural Networks

Dec 19, 2023
Carl De Sousa Trias, Mihai Petru Mitrea, Attilio Fiandrotti, Marco Cagnazzo, Sumanta Chaudhuri, Enzo Tartaglione

Deep neural networks are characterized by multiple symmetrical, equi-loss solutions that are redundant. Thus, the order of neurons in a layer and feature maps can be given arbitrary permutations, without affecting (or minimally affecting) their output. If we shuffle these neurons, or if we apply to them some perturbations (like fine-tuning) can we put them back in the original order i.e. re-synchronize? Is there a possible corruption threat? Answering these questions is important for applications like neural network white-box watermarking for ownership tracking and integrity verification. We advance a method to re-synchronize the order of permuted neurons. Our method is also effective if neurons are further altered by parameter pruning, quantization, and fine-tuning, showing robustness to integrity attacks. Additionally, we provide theoretical and practical evidence for the usual means to corrupt the integrity of the model, resulting in a solution to counter it. We test our approach on popular computer vision datasets and models, and we illustrate the threat and our countermeasure on a popular white-box watermarking method.

Via

Access Paper or Ask Questions

SCoTTi: Save Computation at Training Time with an adaptive framework

Dec 19, 2023
Ziyu Lin, Enzo Tartaglione, Van-Tam Nguyen

On-device training is an emerging approach in machine learning where models are trained on edge devices, aiming to enhance privacy protection and real-time performance. However, edge devices typically possess restricted computational power and resources, making it challenging to perform computationally intensive model training tasks. Consequently, reducing resource consumption during training has become a pressing concern in this field. To this end, we propose SCoTTi (Save Computation at Training Time), an adaptive framework that addresses the aforementioned challenge. It leverages an optimizable threshold parameter to effectively reduce the number of neuron updates during training which corresponds to a decrease in memory and computation footprint. Our proposed approach demonstrates superior performance compared to the state-of-the-art methods regarding computational resource savings on various commonly employed benchmarks and popular architectures, including ResNets, MobileNet, and Swin-T.

Via

Access Paper or Ask Questions

Enhanced EEG-Based Mental State Classification : A novel approach to eliminate data leakage and improve training optimization for Machine Learning

Dec 14, 2023
Maxime Girard, Rémi Nahon, Enzo Tartaglione, Van-Tam Nguyen

Figure 1 for Enhanced EEG-Based Mental State Classification : A novel approach to eliminate data leakage and improve training optimization for Machine Learning

Figure 2 for Enhanced EEG-Based Mental State Classification : A novel approach to eliminate data leakage and improve training optimization for Machine Learning

Figure 3 for Enhanced EEG-Based Mental State Classification : A novel approach to eliminate data leakage and improve training optimization for Machine Learning

Figure 4 for Enhanced EEG-Based Mental State Classification : A novel approach to eliminate data leakage and improve training optimization for Machine Learning

In this paper, we explore prior research and introduce a new methodology for classifying mental state levels based on EEG signals utilizing machine learning (ML). Our method proposes an optimized training method by introducing a validation set and a refined standardization process to rectify data leakage shortcomings observed in preceding studies. Furthermore, we establish novel benchmark figures for various models, including random forest and deep neural networks.

* 5 pages, 2 figures, 1 table

Via

Access Paper or Ask Questions

Weighted Ensemble Models Are Strong Continual Learners

Dec 14, 2023
Imad Eddine Marouf, Subhankar Roy, Enzo Tartaglione, Stéphane Lathuilière

In this work, we study the problem of continual learning (CL) where the goal is to learn a model on a sequence of tasks, such that the data from the previous tasks becomes unavailable while learning on the current task data. CL is essentially a balancing act between being able to learn on the new task (i.e., plasticity) and maintaining the performance on the previously learned concepts (i.e., stability). With an aim to address the stability-plasticity trade-off, we propose to perform weight-ensembling of the model parameters of the previous and current task. This weight-ensembled model, which we call Continual Model Averaging (or CoMA), attains high accuracy on the current task by leveraging plasticity, while not deviating too far from the previous weight configuration, ensuring stability. We also propose an improved variant of CoMA, named Continual Fisher-weighted Model Averaging (or CoFiMA), that selectively weighs each parameter in the weight ensemble by leveraging the Fisher information of the weights of the model. Both the variants are conceptually simple, easy to implement, and effective in attaining state-of-the-art performance on several standard CL benchmarks.

* Code: https://github.com/IemProg/CoFiMA

Via

Access Paper or Ask Questions

Towards On-device Learning on the Edge: Ways to Select Neurons to Update under a Budget Constraint

Dec 08, 2023
Aël Quélennec, Enzo Tartaglione, Pavlo Mozharovskyi, Van-Tam Nguyen

In the realm of efficient on-device learning under extreme memory and computation constraints, a significant gap in successful approaches persists. Although considerable effort has been devoted to efficient inference, the main obstacle to efficient learning is the prohibitive cost of backpropagation. The resources required to compute gradients and update network parameters often exceed the limits of tightly constrained memory budgets. This paper challenges conventional wisdom and proposes a series of experiments that reveal the existence of superior sub-networks. Furthermore, we hint at the potential for substantial gains through a dynamic neuron selection strategy when fine-tuning a target task. Our efforts extend to the adaptation of a recent dynamic neuron selection strategy pioneered by Bragagnolo et al. (NEq), revealing its effectiveness in the most stringent scenarios. Our experiments demonstrate, in the average case, the superiority of a NEq-inspired approach over a random selection. This observation prompts a compelling avenue for further exploration in the area, highlighting the opportunity to design a new class of algorithms designed to facilitate parameter update selection. Our findings usher in a new era of possibilities in the field of on-device learning under extreme constraints and encourage the pursuit of innovative strategies for efficient, resource-friendly model fine-tuning.

* 8 pages, 4 figures, 2 tables, WACV2024 - SCIoT workshop

Via

Access Paper or Ask Questions

Mini but Mighty: Finetuning ViTs with Mini Adapters

Nov 07, 2023
Imad Eddine Marouf, Enzo Tartaglione, Stéphane Lathuilière

Vision Transformers (ViTs) have become one of the dominant architectures in computer vision, and pre-trained ViT models are commonly adapted to new tasks via fine-tuning. Recent works proposed several parameter-efficient transfer learning methods, such as adapters, to avoid the prohibitive training and storage cost of finetuning. In this work, we observe that adapters perform poorly when the dimension of adapters is small, and we propose MiMi, a training framework that addresses this issue. We start with large adapters which can reach high performance, and iteratively reduce their size. To enable automatic estimation of the hidden dimension of every adapter, we also introduce a new scoring function, specifically designed for adapters, that compares the neuron importance across layers. Our method outperforms existing methods in finding the best trade-off between accuracy and trained parameters across the three dataset benchmarks DomainNet, VTAB, and Multi-task, for a total of 29 datasets.

* WACV2024

Via

Access Paper or Ask Questions