SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
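Several projects listed here implement the 4-bit formats named above, such as NF4. As a minimal sketch, not tied to any one toolkit on this page, here is NF4 weight-only loading via Hugging Face Transformers' `BitsAndBytesConfig`; the model id is just an example:

```python
# Minimal sketch: load a model with 4-bit NF4 weight quantization via
# Hugging Face Transformers + bitsandbytes. Assumes
# `pip install transformers accelerate bitsandbytes` and a CUDA GPU;
# "facebook/opt-125m" is only an example checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights in 4 bits
    bnb_4bit_quant_type="nf4",              # NormalFloat4 data type
    bnb_4bit_compute_dtype=torch.bfloat16,  # dequantize to bf16 for matmuls
)

model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")

inputs = tokenizer("Low-bit quantization lets", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```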
Production-ready LLM model compression/quantization toolkit with hardware-accelerated inference support for both CPU and GPU via HF, vLLM, and SGLang.
Advanced quantization algorithm for LLMs and VLMs, with support for CPU, Intel GPU, CUDA, and HPU. Seamlessly integrated with Torchao, Transformers, and vLLM.
🦖 X—LLM: Cutting-Edge & Easy LLM Fine-Tuning
Run any Large Language Model behind a unified API
🪶 Lightweight OpenAI drop-in replacement for Kubernetes
Private self-improvement coaching with open-source LLMs
ChatSakura: an open-source multilingual conversational large language model.
This repository is for profiling, extracting, visualizing, and reusing generative AI weights, with the goals of building more accurate AI models and auditing/scanning weights at rest to identify knowledge domains for risk.
A.L.I.C.E. (Artificial Labile Intelligence Cybernated Existence): a REST API for an AI companion, intended as a building block for more complex systems.
Conversational AI model for open-domain dialogue.
Optimized Qwen2.5-3B using GPTQ, reducing model size from 5.75 GB to 1.93 GB and improving inference speed. Ideal for efficient edge AI deployments.
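For reference, a GPTQ run like the one described above can be reproduced with the Transformers GPTQ integration (backed by optimum and auto-gptq). A minimal sketch, assuming those packages are installed and using the public Qwen2.5-3B checkpoint; the output directory name is just an example:

```python
# Minimal sketch: 4-bit GPTQ quantization of Qwen2.5-3B via the
# Transformers integration. Assumes
# `pip install transformers optimum auto-gptq` and a CUDA GPU;
# calibration uses the built-in "c4" dataset option.
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "Qwen/Qwen2.5-3B"
tokenizer = AutoTokenizer.from_pretrained(model_id)

gptq_config = GPTQConfig(
    bits=4,          # 4-bit weights (~4x smaller than fp16)
    group_size=128,  # one scale/zero-point per 128-weight group
    dataset="c4",    # calibration data for the GPTQ solver
    tokenizer=tokenizer,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=gptq_config,  # quantizes layer by layer on load
)

# Persist the quantized weights for later inference or publishing.
model.save_pretrained("qwen2.5-3b-gptq-4bit")
tokenizer.save_pretrained("qwen2.5-3b-gptq-4bit")
```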
Effortlessly quantize, benchmark, and publish Hugging Face models, with cross-platform CPU/GPU support. Reduces model size by up to 75% while maintaining performance.