Resonant Learner – A Smarter Early Stop for Deep Models


Intelligent Early Stopping for Deep Learning


Stop training exactly when your model converges, not epochs later. Save 25-47% compute while maintaining or improving quality.


Resonant Convergence Analysis (RCA) is an intelligent early stopping system that analyzes oscillation patterns in validation loss to detect true convergence. Unlike simple patience-based methods, RCA uses resonance metrics (β, ω) to distinguish meaningful plateaus from temporary stagnation.

  • 🎯 Intelligent Detection: Analyzes loss oscillations, not just raw values
  • ⚡ 25-47% Compute Savings: Stop training epochs earlier
  • 🎓 Quality Preserved: Automatically loads best model checkpoint
  • 🔧 Adaptive LR: Built-in learning rate reduction
  • 📊 Production Validated: Real data from NVIDIA L40S GPU

Community Edition (This Repository)

Free and open source - Production-validated RCA callback for manual training loops.

✅ ResonantCallback for early stopping
✅ Full β/ω resonance analysis
✅ Adaptive learning rate reduction
✅ Best model checkpointing
✅ Validated on 4+ datasets

Perfect for:

  • Manual PyTorch training loops
  • Research and experimentation
  • Learning how RCA works
  • Production deployments

Professional Edition

AutoCoach + SmartTeach + RCA Ultimate - Zero-config training with automatic optimization.

🎯 AutoCoach: Auto-detects model/task, selects optimal hyperparameters
🧠 SmartTeach: Gradient modulation for smoother convergence
🌊 RCA Ultimate: Enhanced early stopping with multi-metric analysis
⚙️ Ultimate Trainer: Integrated training loop with 3-hook API
📊 Advanced Analytics: TensorBoard integration, detailed metrics

Additional Features:

  • Zero-config training (detects BERT/CNN/transformer automatically)
  • SmartTeach gradient feedback for faster convergence
  • Architecture-specific presets (BERT, CNN, ResNet, ViT)
  • Enhanced multi-metric stopping criteria
  • Professional support and updates

Contact for Pro Edition:

Damjan Žakelj
Email: [email protected]

The Pro Edition includes extensive examples for:

  • NanoGPT with RCA
  • BERT fine-tuning with SmartTeach
  • Vision transformers (TIMM)
  • Custom architectures

🚀 Quick Start (Community Edition)

```bash
pip install -U pip setuptools wheel
pip install torch torchvision
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
pip install tqdm numpy pandas matplotlib timm transformers datasets
pip install -e .
pytest -q
python verify_installation.py
```
```python
from resonant_learner import ResonantCallback

# Initialize RCA
rca = ResonantCallback(
    checkpoint_dir='./checkpoints',
    patience_steps=3,
    min_delta=0.01,
    verbose=True
)

# Training loop
for epoch in range(max_epochs):
    train_loss = train_epoch(model, train_loader, optimizer)
    val_loss = validate(model, val_loader)

    # RCA callback
    rca(val_loss=val_loss, model=model, optimizer=optimizer, epoch=epoch)

    if rca.should_stop():
        print("Early stopping triggered!")
        break
```

That's it! RCA handles the rest: LR reduction, checkpointing, and early stopping.
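The loop above assumes user-supplied `train_epoch` and `validate` helpers. Here is a minimal sketch of what they might look like for a standard classifier; the bodies below are our illustration, not part of `resonant_learner`:

```python
import torch
import torch.nn.functional as F

DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'

def train_epoch(model, train_loader, optimizer, device=DEVICE):
    """One pass over the training set; returns mean training loss."""
    model.train()
    total, n = 0.0, 0
    for x, y in train_loader:
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        optimizer.step()
        total += loss.item() * x.size(0)
        n += x.size(0)
    return total / n

@torch.no_grad()
def validate(model, val_loader, device=DEVICE):
    """Mean validation loss; this is the value RCA analyzes each epoch."""
    model.eval()
    total, n = 0.0, 0
    for x, y in val_loader:
        x, y = x.to(device), y.to(device)
        total += F.cross_entropy(model(x), y).item() * x.size(0)
        n += x.size(0)
    return total / n
```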


Real Production Results (NVIDIA L40S GPU)

RCA Production Validation

Figure 1: Production performance dashboard showing epoch reduction, compute savings, accuracy preservation, and efficiency improvements across 4 datasets.

Compute Savings Across Datasets

| Dataset | Baseline | RCA | Saved | Accuracy Delta |
|---|---|---|---|---|
| BERT SST2 | 10 epochs | 7 epochs | 30% | -0.11% ✅ |
| MNIST | 30 epochs | 18 epochs | 40% | +0.12% ✅ |
| CIFAR-10 | 60 epochs | 45 epochs | 25% | +1.35% ✅ |
| Fashion-MNIST | 30 epochs | 16 epochs | 47% | -0.67% ✅ |

✅ Quality maintained or improved
✅ Average 36% compute reduction
✅ Production validated on NVIDIA L40S


RCA tracks two key metrics during training, which together regulate training stability:

  • β (Resonance Amplitude): Controls the strength of adaptive feedback — higher values yield smoother convergence, lower values allow exploratory oscillations.
  • ω (Resonance Frequency): Governs the oscillatory phase of learning. Empirically, models stabilize near a universal resonance regime.

Parameter ranges and fine-tuning strategies are part of the PRO implementation.
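The exact β/ω computation ships with the library, but the intuition is easy to sketch: treat recent validation losses as a signal and measure how stable it has become. The snippet below is a toy, hand-rolled stability score of our own construction, not RCA's actual formula:

```python
import numpy as np

def stability_beta(val_losses, window=5):
    """Toy β-like score in [0, 1]: 1.0 = flat plateau, 0.0 = large swings.

    Compares recent oscillation amplitude to the loss scale. A real
    resonance analysis would also extract a frequency (ω), e.g. from
    an FFT of the detrended loss curve.
    """
    recent = np.asarray(val_losses[-window:], dtype=float)
    if len(recent) < 2:
        return 0.0
    amplitude = recent.max() - recent.min()   # size of recent swings
    scale = abs(recent.mean()) + 1e-12        # typical loss magnitude
    return float(np.clip(1.0 - amplitude / scale, 0.0, 1.0))

# A flat plateau scores near 1.0, matching the β > 0.70 stop rule below.
print(stability_beta([0.241, 0.240, 0.239, 0.240, 0.239]))  # ≈ 0.99
print(stability_beta([0.60, 0.45, 0.38, 0.30, 0.26]))       # well below 0.70
```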

MNIST Deep Dive

Figure 2: RCA metrics evolution during MNIST training - showing validation loss, beta, omega, and learning rate adaptation. RCA automatically reduces LR twice before stopping at epoch 18.

RCA v5 fixes a critical bug in plateau detection:

```python
# v4: MISSED β=0.70-0.75 plateaus ❌
if state == "plateau" and beta > 0.75:
    stop()

# v5: CATCHES ALL plateaus ✅
if state == "plateau" and beta > 0.70:
    stop()
```

Impact: BERT training now stops correctly at β=0.72 (epoch 7 instead of continuing), saving 30% compute!

BERT Production

Figure 3: BERT SST2 fine-tuning comparison - RCA stops at epoch 7 when β=0.72, saving 30% compute while maintaining 92.55% accuracy.


When to Use RCA

Perfect for:

  • Long training runs (>10 epochs)
  • Expensive models (transformers, large CNNs)
  • Hyperparameter search (auto-stop bad runs)
  • Cloud compute (save $$$ on GPU time)

Not recommended for:

  • Very short training (<5 epochs)
  • When you need exact epoch control
  • Research needing full training curves

Test Environment

  • Hardware: NVIDIA L40S GPU (44.4GB VRAM)
  • Software: PyTorch 2.9.0 + CUDA 12.8
  • Platform: RunPod cloud compute
  • Reproducibility: Fixed seed (42), deterministic ops (see the sketch below)
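For reference, a seeding routine along these lines reproduces the "fixed seed, deterministic ops" setup; this is our own minimal sketch, and the repository's scripts may differ:

```python
import random

import numpy as np
import torch

def set_deterministic(seed: int = 42) -> None:
    """Pin all RNGs and force deterministic kernels."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.benchmark = False
    # Some CUDA ops additionally need CUBLAS_WORKSPACE_CONFIG=":4096:8"
    # set in the environment for this call to succeed.
    torch.use_deterministic_algorithms(True)

set_deterministic(42)
```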
Datasets Tested

  1. MNIST - 60K digit images (handwritten digits)
  2. Fashion-MNIST - 60K clothing images (10 classes)
  3. CIFAR-10 - 50K natural images (10 classes)
  4. BERT SST2 - 67K sentiment samples (binary classification)

Reproduce the results:

```bash
# CIFAR-10
python examples/cifar10_rca.py --baseline --epochs 60
python examples/cifar10_rca.py --epochs 60

# Fashion-MNIST
python examples/fashion_mnist_rca.py --baseline --epochs 30
python examples/fashion_mnist_rca.py --epochs 30

# MNIST
python examples/mnist_rca.py --baseline --epochs 30
python examples/mnist_rca.py --epochs 30

# BERT SST2
python examples/hf_bert_glue.py --baseline --task sst2 --epochs 10
python examples/hf_bert_glue.py --task sst2 --epochs 10
```

All results: See full scientific report →


⚙️ Configuration

```python
rca = ResonantCallback(
    checkpoint_dir='./checkpoints',  # Where to save best models
    patience_steps=3,                # Epochs to wait before LR reduction
    min_delta=0.01,                  # Min improvement (1%)
    ema_alpha=0.3,                   # EMA smoothing factor
    max_lr_reductions=2,             # Max LR reductions before stop
    lr_reduction_factor=0.5,         # Reduce LR by 50%
    min_lr=1e-6,                     # Minimum LR threshold
    verbose=True                     # Print RCA analysis
)
```

Recommended Settings by Dataset

Easy datasets (MNIST, Fashion-MNIST):

`patience_steps=3, min_delta=0.01`

Medium datasets (CIFAR-10, CIFAR-100):

`patience_steps=4, min_delta=0.005`

Hard datasets (ImageNet, large NLP):

`patience_steps=5, min_delta=0.005`

Fast fine-tuning (BERT, pre-trained models):

`patience_steps=2, min_delta=0.005`
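If you switch datasets often, it can help to keep these recommendations in one small preset table. This is a convenience sketch of our own, not a library feature:

```python
from resonant_learner import ResonantCallback

# Hypothetical preset table mirroring the recommendations above.
RCA_PRESETS = {
    'easy':      dict(patience_steps=3, min_delta=0.01),   # MNIST, Fashion-MNIST
    'medium':    dict(patience_steps=4, min_delta=0.005),  # CIFAR-10/100
    'hard':      dict(patience_steps=5, min_delta=0.005),  # ImageNet, large NLP
    'fine_tune': dict(patience_steps=2, min_delta=0.005),  # BERT, pre-trained models
}

rca = ResonantCallback(checkpoint_dir='./checkpoints', verbose=True,
                       **RCA_PRESETS['medium'])
```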

Examples

Handwritten Digits (MNIST)

```bash
python examples/mnist_rca.py --epochs 30
```

Result: Stops at epoch 18, saving 40% compute, 99.20% accuracy

Natural Images (CIFAR-10)

```bash
python examples/cifar10_rca.py --epochs 60
```

Result: Stops at epoch 45, saving 25% compute, 85.34% accuracy (better than baseline!)

BERT Fine-Tuning (SST2)

```bash
python examples/hf_bert_glue.py --task sst2 --epochs 10
```

Result: Stops at epoch 7, saving 30% compute, 92.55% accuracy

More examples →


🎓 Understanding the Output

```text
📊 RCA (Epoch 7): No improvement (waiting 2/2)
   β=0.72, ω=2.1, confidence=0.66, state=plateau
🛑 RCA: Early stopping triggered!
   Reason: Stable plateau detected (β=0.72, no improvement for 2 epochs)
   Best model saved at epoch 1 (val_loss=0.236579)
```

What this means:

  • β=0.72: High resonance = stable plateau (>0.70 threshold)
  • patience 2/2: Exceeded patience without improvement
  • state=plateau: Loss not improving significantly
  • Action: Stop training, load best model from epoch 1

🔧 Troubleshooting

"RCA doesn't stop training"

Possible causes:

  • β not reaching >0.70 (print RCA metrics to check)
  • patience_steps too high
  • min_delta too strict (try lowering to 0.005)

"RCA stops too early"

Possible causes:

  • min_delta too lenient (try 0.01)
  • patience_steps too low (try 4-5)

"Best checkpoint not found"

Possible causes:

  • checkpoint_dir doesn't exist or not writable
  • Disk space full

Solution: RCA will use final model weights as fallback


📊 Comparison with Other Methods

| Method | Compute Savings | Quality | Adaptive LR | Resonance Metrics |
|---|---|---|---|---|
| RCA | ✅ 25-47% | ✅ Preserved | ✅ Yes | ✅ β, ω |
| Early Stopping (patience) | ⚠️ 10-30% | ✅ Preserved | ❌ No | ❌ None |
| ReduceLROnPlateau | ❌ 0% | ✅ N/A | ✅ Yes | ❌ None |
| Fixed schedule | ❌ 0% | ⚠️ May degrade | ⚠️ Pre-set | ❌ None |
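For contrast, the "Early Stopping (patience)" row corresponds to the classic counter-based stopper sketched below; it only counts non-improving epochs and never looks at the shape of the loss oscillations:

```python
class PatienceEarlyStopping:
    """Classic patience-based early stopping (the baseline RCA improves on)."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float('inf')
        self.bad_epochs = 0

    def step(self, val_loss):
        """Returns True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0       # improvement resets the counter
        else:
            self.bad_epochs += 1      # raw count, blind to oscillation structure
        return self.bad_epochs >= self.patience
```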

🌊 How It Works

RCA is based on log-periodic resonance analysis, inspired by complex systems theory. During training, validation loss exhibits oscillations. As the model converges, these oscillations:

  1. Stabilize (β increases toward 1.0)
  2. Reduce amplitude (smaller loss changes)
  3. Approach a characteristic frequency (ω ≈ 6.0)

When all three conditions align, training has reached optimal convergence.
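For intuition, a common log-periodic form from the complex-systems literature is L(t) = L_inf + A * t^(-alpha) * (1 + B * cos(ω * ln t + φ)). Whether RCA uses exactly this parameterization is not stated here; the sketch below simply shows how such a curve produces shrinking oscillations as training proceeds:

```python
import numpy as np

def log_periodic_loss(t, l_inf=0.2, a=1.0, alpha=0.7, b=0.3, omega=6.0, phi=0.0):
    """Illustrative log-periodic loss curve:
    L(t) = l_inf + a * t**(-alpha) * (1 + b * cos(omega * ln(t) + phi))."""
    t = np.asarray(t, dtype=float)
    return l_inf + a * t**(-alpha) * (1.0 + b * np.cos(omega * np.log(t) + phi))

epochs = np.arange(1, 31)
losses = log_periodic_loss(epochs)

# The oscillatory envelope a * b * t**(-alpha) decays toward zero, which is
# the "reduced amplitude" convergence signal described above.
early = losses[:5].max() - losses[:5].min()
late = losses[-5:].max() - losses[-5:].min()
print(f"loss range, epochs 1-5:   {early:.4f}")
print(f"loss range, epochs 26-30: {late:.4f}")  # much smaller near convergence
```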



Roadmap

  • v5: Plateau threshold alignment (β=0.70)
  • Production validation on real data
  • Comprehensive documentation
  • TensorBoard integration
  • Weights & Biases integration
  • Multi-GPU support
  • AutoCoach zero-config training
  • SmartTeach gradient modulation
  • RCA Ultimate multi-metric analysis
  • Architecture-specific presets
  • Distributed training support
  • Professional dashboard

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

Areas we need help with:

  • 🧪 Testing on more datasets
  • 📚 Documentation improvements
  • 🔧 Integration with popular frameworks
  • 🐛 Bug reports and fixes

MIT License - see LICENSE for details



If you use RCA in your research, please cite:

```bibtex
@software{rca2025,
  title={Resonant Convergence Analysis: Intelligent Early Stopping for Deep Learning},
  author={Žakelj, Damjan},
  year={2025},
  version={5.0},
  url={https://github.com/...}
}
```

Community Edition Support

  • Issues: Open a GitHub issue
  • Discussions: GitHub Discussions
  • Documentation: See docs folder

Professional Edition Inquiries

  • Email: [email protected]
  • Subject: RCA Professional Edition
  • Include: Your use case and requirements

If RCA saved you compute time and money, give us a star! ⭐

Questions? Open an issue or discussion.
Success story? We'd love to hear it!
Need Pro Edition? Contact Damjan directly.


Status: ✅ Production Ready
Version: v5 (Plateau Threshold Fix)
Validation: NVIDIA L40S GPU + PyTorch 2.9.0
License: MIT (Community Edition)

"Stop training when your model converges, not epochs later." 🌊✨
