
Releases: huggingface/optimum-intel

v1.15.2: Patch release

22 Feb 17:20

v1.15.1: Patch release

21 Feb 15:29

  • Relax dependency on accelerate and datasets in OVQuantizer by @eaidova in #547
  • Disable compilation before applying 4-bit weight compression by @AlexKoff88 in #569
  • Update Transformers dependency requirements by @echarlaix in #571

v1.15.0: OpenVINO Tokenizers, quantization configuration

19 Feb 17:53

Export a loaded model instance directly with export_from_model:

from diffusers import StableDiffusionPipeline
from optimum.exporters.openvino import export_from_model

model_id = "runwayml/stable-diffusion-v1-5"
model = StableDiffusionPipeline.from_pretrained(model_id)

export_from_model(model, output="ov_model", task="stable-diffusion")
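
This release also adds a quantization configuration API. A minimal sketch of 4-bit weight-only quantization applied at load time, assuming the OVWeightQuantizationConfig class and the quantization_config argument of from_pretrained:

from optimum.intel import OVModelForCausalLM, OVWeightQuantizationConfig

# Assumed API: OVWeightQuantizationConfig bundles weight-compression settings
quantization_config = OVWeightQuantizationConfig(bits=4)
model = OVModelForCausalLM.from_pretrained("gpt2", export=True, quantization_config=quantization_config)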

v1.14.0: IPEX models

31 Jan 17:15

IPEX models

from optimum.intel import IPEXModelForCausalLM
from transformers import AutoTokenizer, pipeline

model_id = "Intel/q8_starcoder"
# Load an 8-bit quantized StarCoder checkpoint optimized with IPEX (Intel Extension for PyTorch)
model = IPEXModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
results = pipe("He's a dreadful magician and")
print(results)

Fixes

  • Fix position_ids initialization for first inference of stateful models by @eaidova in #532
  • Relax requirements to have registered normalized config for decoder models by @eaidova in #537

v1.13.0: 4-bit quantization, stateful models, Whisper

25 Jan 16:48

OpenVINO

Weight-only 4-bit quantization

optimum-cli export openvino --model gpt2 --weight-format int4_sym_g128 ov_model
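
The compressed model can then be reloaded for inference; a minimal sketch, assuming the tokenizer was exported alongside the model:

from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

# Load the int4-compressed OpenVINO model produced by the CLI command above
model = OVModelForCausalLM.from_pretrained("ov_model")
tokenizer = AutoTokenizer.from_pretrained("ov_model")
inputs = tokenizer("Hello, my name is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))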

Stateful
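
Decoder models can now be exported in a stateful form, where the past key/value cache is kept inside the model as OpenVINO internal state instead of being passed around as explicit inputs and outputs. A minimal sketch, assuming stateful export is applied by default when supported and that the loaded model exposes a stateful attribute (both assumptions):

from optimum.intel import OVModelForCausalLM

# Assumption: export=True produces a stateful model by default when supported
model = OVModelForCausalLM.from_pretrained("gpt2", export=True)
print(model.stateful)  # assumed attribute; True when the KV-cache lives inside the model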

New architectures

Whisper

  • Add support for export and inference for whisper models by @eaidova in #470
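
A minimal inference sketch, assuming the OVModelForSpeechSeq2Seq class and a 16 kHz mono audio array already loaded as waveform:

from optimum.intel import OVModelForSpeechSeq2Seq
from transformers import AutoProcessor

model_id = "openai/whisper-tiny"
# export=True converts the checkpoint to OpenVINO at load time
model = OVModelForSpeechSeq2Seq.from_pretrained(model_id, export=True)
processor = AutoProcessor.from_pretrained(model_id)

inputs = processor(waveform, sampling_rate=16000, return_tensors="pt")  # `waveform` assumed loaded
predicted_ids = model.generate(inputs.input_features)
print(processor.batch_decode(predicted_ids, skip_special_tokens=True))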

v1.12.4: Patch release

22 Jan 14:08

v1.12.3: Patch release

04 Jan 17:25

v1.12.2: Patch release

14 Dec 19:48

v1.12.1: Patch release

08 Nov 09:31

v1.12.0: Weight-only quantization, LCM, Pix2Struct, GPTBigCode

07 Nov 16:02

OpenVINO

Export CLI

optimum-cli export openvino --model gpt2 ov_model
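
The same export can be done from Python; a minimal sketch using the export=True shortcut and saving the converted model:

from optimum.intel import OVModelForCausalLM

# Convert the checkpoint to OpenVINO IR on the fly, then save it for reuse
model = OVModelForCausalLM.from_pretrained("gpt2", export=True)
model.save_pretrained("ov_model")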

New architectures

LCMs

  • Enable Latent Consistency models OpenVINO export and inference by @echarlaix in #463

from optimum.intel import OVLatentConsistencyModelPipeline

pipe = OVLatentConsistencyModelPipeline.from_pretrained("SimianLuo/LCM_Dreamshaper_v7", export=True)
prompt = "sailing ship in storm by Leonardo da Vinci"
images = pipe(prompt=prompt, num_inference_steps=4, guidance_scale=8.0).images

Pix2Struct

  • Add support for export and inference for pix2struct models by @eaidova in #450
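
A minimal inference sketch, assuming the OVModelForPix2Struct class and a document image already loaded as img (for example with PIL):

from optimum.intel import OVModelForPix2Struct
from transformers import Pix2StructProcessor

model_id = "google/pix2struct-docvqa-base"
model = OVModelForPix2Struct.from_pretrained(model_id, export=True)
processor = Pix2StructProcessor.from_pretrained(model_id)

# `img` is assumed to be a PIL image of a document
question = "What is the due date?"
inputs = processor(images=img, text=question, return_tensors="pt")
predictions = model.generate(**inputs)
print(processor.decode(predictions[0], skip_special_tokens=True))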

GPTBigCode

  • Add support for export and inference for GPTBigCode models by @echarlaix in #459
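
GPTBigCode checkpoints go through the same causal-LM path; a minimal sketch, assuming bigcode/gpt_bigcode-santacoder as the checkpoint:

from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer, pipeline

model_id = "bigcode/gpt_bigcode-santacoder"  # assumed GPTBigCode checkpoint
model = OVModelForCausalLM.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(pipe("def fibonacci(n):")[0]["generated_text"])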

Changes and bugfixes

  • Enable 8-bit weight quantization when loading a model:

from optimum.intel import OVModelForCausalLM

model = OVModelForCausalLM.from_pretrained(model_id, load_in_8bit=True)

  • Create default attention mask when needed but not provided by @eaidova in #457
  • Do not automatically cache models when exporting a model in a temporary directory by @helena-intel in #462

Neural Compressor

Full Changelog: https://github.com/huggingface/optimum-intel/commits/v1.12.0