STAC: Spiking Transformer Augmenting Cognition for Conversational AI

Overview

STAC (Spiking Transformer Augmenting Cognition) is a research framework with two distinct approaches:

  • STAC V1: Complete end-to-end training pipeline with learnable AdEx neurons (see stac-v1/)
  • STAC V2: Experimental conversion framework that transforms pretrained transformer LLMs (DistilGPT-2, SmolLM2-1.7B-Instruct) into Spiking Neural Networks (SNNs) for potential energy savings while retaining multi-turn conversational ability in simulation

Important: This repository currently runs software-level SNN simulations only. No metrics have been collected on physical neuromorphic hardware yet. Energy savings figures are theoretical projections based on spike-count analysis, not measured hardware data.
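To make that projection concrete, here is a minimal sketch of spike-count-based energy estimation in PyTorch. The hook counts nonzero activations as a crude spike proxy; ENERGY_PER_SOP_J, the layer filter, and the per-spike cost are illustrative assumptions, not values from this repository.

import torch
import torch.nn as nn

# Illustrative assumption, NOT a measured figure: energy per synaptic
# operation on a hypothetical neuromorphic target (order of picojoules).
ENERGY_PER_SOP_J = 20e-12

def attach_spike_counters(model: nn.Module) -> dict:
    """Register hooks that count nonzero outputs as a crude spike proxy."""
    counts = {}

    def make_hook(name):
        def hook(module, inputs, output):
            if isinstance(output, torch.Tensor):
                counts[name] = counts.get(name, 0) + int(output.count_nonzero())
        return hook

    for name, module in model.named_modules():
        if isinstance(module, nn.ReLU):  # substitute the model's spiking layer types
            module.register_forward_hook(make_hook(name))
    return counts

# After a forward pass:
#   projected_energy_j = sum(counts.values()) * ENERGY_PER_SOP_J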

Key Features

  • Proof-of-concept ANN-SNN conversion using SpikingJelly
  • Multi-turn context retention via a Temporal Spike Processor (see the sketch after this list)
  • Extensive software tests for position IDs, KV-cache handling, and spike-rate sanity
  • Hardware power profiling -- planned, not implemented
  • Full operator coverage & optimisation -- work in progress
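A minimal sketch of the multi-turn idea referenced above, assuming a Hugging Face-style causal LM interface (input_ids, past_key_values, logits); the actual TemporalSpikeProcessor in smollm2_converter.py may work differently:

import torch

def run_turn(snn_model, input_ids, past_key_values=None, timesteps=8):
    """Rate-coded readout sketch: average logits over T timesteps while
    carrying the KV-cache across conversation turns. Per-timestep neuron
    state resets (e.g., SpikingJelly's reset_net) are elided."""
    logits_sum = None
    for _ in range(timesteps):
        out = snn_model(input_ids=input_ids,
                        past_key_values=past_key_values,
                        use_cache=True)
        logits_sum = out.logits if logits_sum is None else logits_sum + out.logits
    # The cache from the final timestep now covers this turn's tokens.
    return logits_sum / timesteps, out.past_key_values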

Quick Start

# 1. Install dependencies
pip install -r requirements.txt

# 2. Convert DistilGPT-2 to SNN (fast)
python run_conversion.py --model_name distilgpt2 --timesteps 8 --simplified

# 3. Test multi-turn conversation
python snn_multi_turn_conversation_test.py --mode snn --turns 3 --timesteps 8

# 4. Run comprehensive validation
python test_conversational_snn.py --model_name distilgpt2 --test_all --timesteps 8

Core Components

STAC V2 (Current)

Component                              Purpose
smollm2_converter.py                   Specialized converter with TemporalSpikeProcessor
convert.py                             Generic ANN-SNN conversion pipeline
run_conversion.py                      Main CLI entry point for conversions
spikingjelly_compat.py                 Cross-version compatibility layer
test_conversational_snn.py             Comprehensive test suite (1K+ lines)
snn_multi_turn_conversation_test.py    Simple conversation smoke test

STAC V1 (Original Research)

Component                    Purpose
stac-v1/stacv1.ipynb         Complete end-to-end training pipeline with learnable AdEx neurons
stac-v1/README.md            V1 documentation and research contributions
stac_v1/ + run_stac_v1.py    Repo-native runnable V1 pipeline demonstrating hybrid fine-tuning (frozen GPT-2 + trained spiking/memory head; sketched below)
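The hybrid fine-tuning pattern from the last row, as a sketch: freeze the pretrained GPT-2 backbone and train only a head. The Linear head here is a stand-in for V1's spiking/memory head, which this sketch does not reproduce.

import torch
from transformers import GPT2Model

backbone = GPT2Model.from_pretrained("gpt2")
for param in backbone.parameters():
    param.requires_grad = False  # backbone stays frozen

# Stand-in for the trained spiking/memory head.
head = torch.nn.Linear(backbone.config.hidden_size, backbone.config.vocab_size)
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-4)

# Training step sketch:
#   hidden = backbone(input_ids).last_hidden_state
#   logits = head(hidden)
#   loss.backward(); optimizer.step()  # updates only the head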

Implementation Status

STAC V2 (Current)

Completed (prototype level)

  • Core conversion flow (GELU→ReLU substitution, quantization, ann2snn; see the sketch after this list)
  • Temporal dynamics & KV-cache handling in PyTorch
  • Spike-count telemetry hooks and unit tests
  • Loihi export gating (requires EXPORT_LOIHI=1 and lava.lib.dl.slayer; otherwise remains simulation-only and Loihi tests are skipped)
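A plain-PyTorch sketch of the GELU→ReLU substitution named in the first item; note that Hugging Face GPT-2 uses its own activation class rather than nn.GELU, so the repository's converter has to match the actual module types:

import torch.nn as nn

def gelu_to_relu(module: nn.Module) -> None:
    """Recursively swap GELU activations for ReLU, the usual prerequisite
    for rate-based ANN-SNN conversion."""
    for name, child in module.named_children():
        if isinstance(child, nn.GELU):
            setattr(module, name, nn.ReLU())
        else:
            gelu_to_relu(child)

# After the swap (plus brief re-calibration), a converter such as
# SpikingJelly's ann2snn can map ReLU rates onto spike rates.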

Pending / In Progress

  • Hardware benchmarking on Loihi-2 / Akida
  • Expanded operator support (e.g., rotary embeddings, flash-attention variants)
  • Integration with SCANUE multi-agent alignment layer
  • Robust CLI/UX and documentation polish

STAC V1 (Complete)

Completed (research prototype)

  • End-to-end training pipeline with learnable AdEx neurons
  • Hyperdimensional Memory Module (HEMM) integration
  • Surrogate gradient training on WikiText-2 (sketched after this list)
  • L1 spike regularization for energy efficiency
  • Comprehensive validation suite
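Two of the items above compress into a short sketch: a Heaviside spike with a sigmoid-derivative surrogate gradient, plus the L1 spike penalty. The threshold and surrogate scale are illustrative, not V1's hyperparameters.

import torch

class SurrogateSpike(torch.autograd.Function):
    """Heaviside spike forward; smooth sigmoid derivative backward."""
    @staticmethod
    def forward(ctx, v, threshold=1.0):
        ctx.save_for_backward(v)
        ctx.threshold = threshold
        return (v >= threshold).float()

    @staticmethod
    def backward(ctx, grad_output):
        (v,) = ctx.saved_tensors
        sig = torch.sigmoid(4.0 * (v - ctx.threshold))
        return grad_output * 4.0 * sig * (1.0 - sig), None

# L1 spike regularization: penalize mean firing to reward sparsity.
#   loss = task_loss + lambda_reg * spikes.abs().mean()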

Documentation

  • STAC V2 (Current)
  • STAC V1 (Original Research) -- see stac-v1/README.md

Testing & Validation

The repository includes extensive testing for multi-turn conversational correctness:

# Test specific components
python test_conversational_snn.py --model_name distilgpt2 --test_position_boundaries
python test_conversational_snn.py --model_name distilgpt2 --test_attention_mask
python test_conversational_snn.py --model_name distilgpt2 --test_multi_turn
python test_conversational_snn.py --model_name distilgpt2 --test_energy

# Run all tests
python test_conversational_snn.py --model_name distilgpt2 --test_all

License

This project is licensed under the MIT License - see the LICENSE file for details.
