Estimator Meets Equilibrium Perspective: A Rectified Straight Through Estimator for Binary Neural Networks Training
The pioneering work BinaryConnect uses the Straight Through Estimator (STE) to mimic the gradients of the sign function, but this also causes a crucial inconsistency problem.
GitHub Link
The GitHub link is https://github.com/dravenalg/reste
Introduction
This GitHub repository presents the implementation of the Rectified Straight Through Estimator (ReSTE) for training Binary Neural Networks (BNNs). ReSTE addresses the inconsistency problem in BNN training by balancing the estimating error against the gradient stability. It introduces indicators to quantify this equilibrium and proposes a power-function-based estimator, ReSTE, which balances these factors better than other estimators. The method is evaluated on the CIFAR-10 and ImageNet datasets, demonstrating superior performance without requiring additional modules or losses. The repository provides implementation details and instructions for running the method on these datasets. The paper is accepted at ICCV 2023 and provides insights into this novel approach to training BNNs.
Content
Official implementation of ReSTE. | Paper | Personal Homepage.

Xiao-Ming Wu, Dian Zheng, Zu-Hao Liu, Wei-Shi Zheng*. If you have any questions, feel free to contact me at [email protected].

Binary Neural Networks (BNNs) have attracted great research enthusiasm in recent years due to their strong performance in neural network compression. The pioneering work BinaryConnect proposes using the Straight Through Estimator (STE) to mimic the gradients of the sign function in BNN training, but this also causes a crucial inconsistency problem due to the difference between the forward and backward processes. Most previous methods design different estimators in place of STE to mitigate the inconsistency problem. However, they ignore the fact that reducing the estimating error concomitantly decreases gradient stability, which makes the gradients highly divergent, harms model training, and increases the risk of gradient vanishing and gradient exploding. To take gradient stability fully into consideration, we present a new perspective on BNN training, regarding it as an equilibrium between the estimating error and the gradient stability. From this viewpoint, we first design two indicators to quantitatively demonstrate the equilibrium phenomenon. In addition, to balance the estimating error and the gradient stability well, we revise the original straight through estimator and propose a power-function-based estimator, the Rectified Straight Through Estimator (ReSTE for short). Compared to other estimators, ReSTE is rational and capable of flexibly balancing the estimating error against the gradient stability. Extensive experiments on the CIFAR-10 and ImageNet datasets show that ReSTE achieves excellent performance and surpasses state-of-the-art methods without any auxiliary modules or losses.
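The forward/backward inconsistency mentioned above can be illustrated with a minimal, dependency-free sketch of the STE baseline. The function names and the clipping threshold of 1 are illustrative choices in the spirit of BinaryConnect, not code taken from the repository:

```python
def sign(x):
    """Forward binarization used in BNNs: maps a real weight to +1 or -1."""
    return 1.0 if x >= 0 else -1.0

def ste_grad(x, clip=1.0):
    """STE backward pass: pretends sign() is the identity, passing the
    upstream gradient through unchanged inside the clipping range and
    zeroing it outside (the common BinaryConnect-style clipping)."""
    return 1.0 if abs(x) <= clip else 0.0

# The inconsistency: the forward pass uses sign(x), whose true derivative
# is 0 almost everywhere, yet the backward pass substitutes a gradient of 1.
x = 0.3
print(sign(x))      # forward output: 1.0
print(ste_grad(x))  # surrogate gradient: 1.0, although the true derivative is 0
```

Estimators that hug the sign function more closely shrink this mismatch, but, as the paper argues, they do so at the cost of less stable gradients.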
The core idea of our method is a new perspective on BNN training, regarding it as an equilibrium between the estimating error and the gradient stability. We design two indicators to quantify these two sides: the estimating error is the difference between the sign function and the estimator, and the gradient stability is the divergence of the gradients of all parameters within one iteration's update (see the paper for the exact formulas). From this viewpoint, we propose the Rectified Straight Through Estimator (ReSTE for short), which is rational and capable of flexibly balancing the estimating error against the gradient stability. The equivalent forward and backward processes of ReSTE are visualized in the paper.

NOTE: The package versions are not strictly required and can be flexibly adjusted based on your CUDA setup. If you use our code or models in your research, please cite our paper.
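To make the power-function idea concrete, here is a small sketch of what such a rectified estimator could look like. The specific form sign(x)·|x|^(1/o) with surrogate gradient (1/o)·|x|^(1/o−1), the parameter name `o`, and the clipping threshold `t` are assumptions based on the description above, not code copied from the repository:

```python
def reste_forward(x):
    """The forward pass is still the hard sign function."""
    return 1.0 if x >= 0 else -1.0

def reste_grad(x, o=3.0, t=1.5):
    """Sketch of a power-function backward rule: the surrogate gradient
    (1/o) * |x|^(1/o - 1), clipped at a threshold t so that values of x
    near zero do not blow the gradient up.  With o = 1 this reduces to
    the plain STE gradient of 1; larger o tracks the sign function more
    closely (smaller estimating error) but yields more divergent, less
    stable gradients -- the equilibrium the paper studies."""
    g = (1.0 / o) * abs(x) ** (1.0 / o - 1.0)
    return min(g, t)

print(reste_grad(1.0, o=1.0))  # o = 1 recovers the STE gradient: 1.0
```

Sweeping `o` in a sketch like this is one way to see the trade-off: the surrogate gradient grows ever steeper near zero as `o` increases, which is exactly where the clipping threshold becomes necessary.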