Describe the bug
For requests to the API endpoints (I have only tested chat/completions), auto_max_new_tokens is somehow always True, even when the request explicitly sets it to False. The check in generate_reply_HF always takes the True branch:

if state['auto_max_new_tokens']:  # always True, even though the request sent False

The API does not support a max_new_tokens parameter, so this greatly affects how the model operates via the API.
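For context on why this matters: when auto_max_new_tokens is treated as True, the server recomputes the token budget from the remaining context instead of honoring the requested value. A tiny self-contained sketch of that behaviour (my own paraphrase, not the project's exact code; all numbers below are hypothetical, chosen only to show the shape of the effect):

def effective_max_new_tokens(requested, auto_max_new_tokens, truncation_length, prompt_tokens):
    # With the flag on, the budget becomes "fill the rest of the context window"
    # and the value the client asked for is ignored.
    if auto_max_new_tokens:
        return truncation_length - prompt_tokens
    return requested

print(effective_max_new_tokens(512, False, 8192, 164))  # -> 512, what the client expects
print(effective_max_new_tokens(512, True, 8192, 164))   # -> 8028, what actually happens

This would be consistent with the max_new_tokens value of 8028 that shows up in the generate_reply_HF parameters below, even though the request sent auto_max_new_tokens: False.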
Parameters received by generate_reply_HF:
generate_reply_HF {'max_new_tokens': 8028, 'temperature': 1, 'temperature_last': False, 'dynamic_temperature': False, 'dynatemp_low': 1, 'dynatemp_high': 1, 'dynatemp_exponent': 1, 'smoothing_factor': 0, 'smoothing_curve': 1, 'top_p': 1, 'min_p': 0.05, 'top_k': 0, 'repetition_penalty': 1, 'presence_penalty': 0, 'frequency_penalty': 0, 'repetition_penalty_range': 1024, 'typical_p': 1, 'tfs': 1, 'top_a': 0, 'guidance_scale': 1, 'penalty_alpha': 0, 'mirostat_mode': 0, 'mirostat_tau': 5, 'mirostat_eta': 0.1, 'do_sample': True, 'encoder_repetition_penalty': 1, 'no_repeat_ngram_size': 0, 'dry_multiplier': 0, 'dry_base': 1.75, 'dry_allowed_length': 2, 'dry_sequence_breakers': '"\\n", ":", "\\"", "*"', 'sampler_priority': ['temperature', 'dynamic_temperature', 'quadratic_sampling', 'top_k', 'top_p', 'typical_p', 'epsilon_cutoff', 'eta_cutoff', 'tfs', 'top_a', 'min_p', 'mirostat'], 'use_cache': True, 'inputs': ..., 'eos_token_id': [1], 'stopping_criteria': [<modules.callbacks._StopEverythingStoppingCriteria object at 0x000001B5B9F4E110>, <modules.callbacks.Stream object at 0x000001B5B9F4E590>], 'logits_processor': []} <bos><start_of_turn>user
Sending params:
{'preset': 'min_p', 'min_p': 0.05, 'dynamic_temperature': False, 'dynatemp_low': 1, 'dynatemp_high': 1, 'dynatemp_exponent': 1, 'smoothing_factor': 0, 'smoothing_curve': 1, 'top_k': 0, 'repetition_penalty': 1, 'repetition_penalty_range': 1024, 'typical_p': 1, 'tfs': 1, 'top_a': 0, 'epsilon_cutoff': 0, 'eta_cutoff': 0, 'guidance_scale': 1, 'negative_prompt': '', 'penalty_alpha': 0, 'mirostat_mode': 0, 'mirostat_tau': 5, 'mirostat_eta': 0.1, 'temperature_last': False, 'do_sample': True, 'seed': -1, 'encoder_repetition_penalty': 1, 'no_repeat_ngram_size': 0, 'dry_multiplier': 0, 'dry_base': 1.75, 'dry_allowed_length': 2, 'dry_sequence_break': '\\n, :, ", *', 'truncation_length': 0, 'max_tokens_second': 0, 'prompt_lookup_num_toknes': 0, 'custom_token_bans': '', 'sampler_priority': ['temperature', 'dynamic_temperature', 'quadratic_sampling', 'top_k', 'top_p', 'typical_p', 'epsilon_cutoff', 'eta_cutofftfs', 'top_a', 'min_p', 'mirostat'], 'auto_max_new_tokens': False, 'ban_eos_token': False, 'add_bos_token': True, 'skip_special_tokens': True, 'grammar_string': '', 'model': '', 'prompt': '', 'best_of': 1, 'echo': False, 'frequency_penalty': 0, 'logit_bias': {}, 'logprobs': None, 'max_tokens': 0, 'n': 1, 'presence_penalty': 0, 'stop': [']'], 'stream': False, 'suffix': ']', 'temperature': 1, 'top_p': 1, 'messages': [{}], 'mode': 'chat-instruct', 'character': 'Alice', 'user_name': 'Bob', 'user_bio': "I'm Bob. 18 years old", 'chat_template_str': "{%- for message in messages %}\n {%- if message['role'] == 'system' -%}\n {%- if message['content'] -%}\n {{- message['content'] + '\n\n' -}}\n {%- endif -%}\n {%- if user_bio -%}\n {{- user_bio + '\n\n' -}}\n {%- endif -%}\n {%- else -%}\n {%- if message['role'] == 'user' -%}\n {{- name1 + ': ' + message['content'] + '\n'-}}\n {%- else -%}\n {{- name2 + ': ' + message['content'] + '\n' -}}\n {%- endif -%}\n {%- endif -%}\n{%- endfor -%}", 'chat_instruct_command': None, 'continue_': False}
Is there an existing issue for this?
Reproduction
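A rough sketch of the kind of request that triggers it (an illustration rather than my exact client code; it assumes a local server with the OpenAI-compatible API enabled on the default port 5000, and the payload is a trimmed-down version of the "Sending params" dump above):

import requests

payload = {
    "mode": "chat-instruct",
    "character": "Alice",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 200,             # hypothetical value for illustration
    "auto_max_new_tokens": False,  # explicitly disabled, but generate_reply_HF still sees True
    "stream": False,
}

response = requests.post(
    "http://127.0.0.1:5000/v1/chat/completions",
    json=payload,
    timeout=120,
)
print(response.json()["choices"][0]["message"]["content"])

Whatever value auto_max_new_tokens is given here, the generation still runs with the auto-computed max_new_tokens, as shown in the dump above.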
Screenshot
No response
Logs
System Info
Win 11 x64