Independent Science + Technology

Category: self-speculative

Meta LayerSkip Llama3.2 1B: Achieving Fast LLM Inference with Self-Speculative Decoding Locally
