oLLM is an innovative Python library that enables researchers, hobbyists, and developers to run generative AI models with large context windows – up to … tokens – on standard NVIDIA GPUs with as little as 8 GB of VRAM, by using smart SSD ...