API Bug - Incorrect parameter transfer #6393

Open · 1 task done
kupertdev opened this issue Sep 22, 2024 · 0 comments
Labels
bug Something isn't working

@kupertdev

Describe the bug

For requests to API endpoints (I only tested chat/completions), auto_max_new_tokens somehow always evaluates to True, even when the client sends False:

if state['auto_max_new_tokens']: # always True inside generate_reply_HF

As a result, the max_new_tokens parameter sent through the API has no effect. This greatly affects the operation of the model via the API.
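To make the consequence concrete, here is a minimal sketch (an assumption, not the project's actual implementation; it supposes the auto mode fills the remaining context window) of why a truthy auto_max_new_tokens makes the client's cap irrelevant:

def effective_max_new_tokens(state: dict, prompt_tokens: int) -> int:
    # Hedged sketch, NOT the real code: models the branch quoted above.
    if state['auto_max_new_tokens']:
        # Fill whatever context remains instead of honoring the client's cap.
        return state['truncation_length'] - prompt_tokens
    return state['max_new_tokens']

# With the flag stuck at True, the client's cap never wins. The numbers below
# are illustrative, though 8192 - 164 = 8028 happens to match the value in the
# log, which would be consistent with an ~8K context being filled automatically.
state = {'auto_max_new_tokens': True, 'truncation_length': 8192, 'max_new_tokens': 32}
print(effective_max_new_tokens(state, prompt_tokens=164))  # 8028, not 32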

Parameters received by generate_reply_HF:

generate_reply_HF {'max_new_tokens': 8028, 'temperature': 1, 'temperature_last': False, 'dynamic_temperature': False, 'dynatemp_low': 1, 'dynatemp_high': 1, 'dynatemp_exponent': 1, 'smoothing_factor': 0, 'smoothing_curve': 1, 'top_p': 1, 'min_p': 0.05, 'top_k': 0, 'repetition_penalty': 1, 'presence_penalty': 0, 'frequency_penalty': 0, 'repetition_penalty_range': 1024, 'typical_p': 1, 'tfs': 1, 'top_a': 0, 'guidance_scale': 1, 'penalty_alpha': 0, 'mirostat_mode': 0, 'mirostat_tau': 5, 'mirostat_eta': 0.1, 'do_sample': True, 'encoder_repetition_penalty': 1, 'no_repeat_ngram_size': 0, 'dry_multiplier': 0, 'dry_base': 1.75, 'dry_allowed_length': 2, 'dry_sequence_breakers': '"\\n", ":", "\\"", "*"', 'sampler_priority': ['temperature', 'dynamic_temperature', 'quadratic_sampling', 'top_k', 'top_p', 'typical_p', 'epsilon_cutoff', 'eta_cutoff', 'tfs', 'top_a', 'min_p', 'mirostat'], 'use_cache': True, 'inputs': ..., 'eos_token_id': [1], 'stopping_criteria': [<modules.callbacks._StopEverythingStoppingCriteria object at 0x000001B5B9F4E110>, <modules.callbacks.Stream object at 0x000001B5B9F4E590>], 'logits_processor': []} <bos><start_of_turn>user

Parameters sent by the client:
{'preset': 'min_p', 'min_p': 0.05, 'dynamic_temperature': False, 'dynatemp_low': 1, 'dynatemp_high': 1, 'dynatemp_exponent': 1, 'smoothing_factor': 0, 'smoothing_curve': 1, 'top_k': 0, 'repetition_penalty': 1, 'repetition_penalty_range': 1024, 'typical_p': 1, 'tfs': 1, 'top_a': 0, 'epsilon_cutoff': 0, 'eta_cutoff': 0, 'guidance_scale': 1, 'negative_prompt': '', 'penalty_alpha': 0, 'mirostat_mode': 0, 'mirostat_tau': 5, 'mirostat_eta': 0.1, 'temperature_last': False, 'do_sample': True, 'seed': -1, 'encoder_repetition_penalty': 1, 'no_repeat_ngram_size': 0, 'dry_multiplier': 0, 'dry_base': 1.75, 'dry_allowed_length': 2, 'dry_sequence_break': '\\n, :, ", *', 'truncation_length': 0, 'max_tokens_second': 0, 'prompt_lookup_num_toknes': 0, 'custom_token_bans': '', 'sampler_priority': ['temperature', 'dynamic_temperature', 'quadratic_sampling', 'top_k', 'top_p', 'typical_p', 'epsilon_cutoff', 'eta_cutofftfs', 'top_a', 'min_p', 'mirostat'], 'auto_max_new_tokens': False, 'ban_eos_token': False, 'add_bos_token': True, 'skip_special_tokens': True, 'grammar_string': '', 'model': '', 'prompt': '', 'best_of': 1, 'echo': False, 'frequency_penalty': 0, 'logit_bias': {}, 'logprobs': None, 'max_tokens': 0, 'n': 1, 'presence_penalty': 0, 'stop': [']'], 'stream': False, 'suffix': ']', 'temperature': 1, 'top_p': 1, 'messages': [{}], 'mode': 'chat-instruct', 'character': 'Alice', 'user_name': 'Bob', 'user_bio': "I'm Bob. 18 years old", 'chat_template_str': "{%- for message in messages %}\n {%- if message['role'] == 'system' -%}\n {%- if message['content'] -%}\n {{- message['content'] + '\n\n' -}}\n {%- endif -%}\n {%- if user_bio -%}\n {{- user_bio + '\n\n' -}}\n {%- endif -%}\n {%- else -%}\n {%- if message['role'] == 'user' -%}\n {{- name1 + ': ' + message['content'] + '\n'-}}\n {%- else -%}\n {{- name2 + ': ' + message['content'] + '\n' -}}\n {%- endif -%}\n {%- endif -%}\n{%- endfor -%}", 'chat_instruct_command': None, 'continue_': False}
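Comparing the two dumps: the client explicitly sends 'auto_max_new_tokens': False (and 'max_tokens': 0), yet the server-side branch still fires. Purely as illustration, and without confirming the actual cause in this codebase, here are two common ways an explicit JSON false survives decoding but gets clobbered afterwards:

import json

body = json.loads('{"auto_max_new_tokens": false}')
print(body['auto_max_new_tokens'])  # False -- JSON decoding itself is fine

# Pitfall 1: stringly-typed handling; bool() of any non-empty string is True.
print(bool("false"))  # True

# Pitfall 2: merging defaults in the wrong order clobbers the client's value.
defaults = {'auto_max_new_tokens': True}  # hypothetical default, illustration only
state = {**body, **defaults}  # defaults applied last always win
print(state['auto_max_new_tokens'])  # True

It may also be worth checking whether a zero or absent max_tokens is treated as a cue to enable the auto mode; that is speculation, but it would match what the two dumps show.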

Is there an existing issue for this?

  • I have searched the existing issues

Reproduction

  1. Send a request to chat/completions with auto_max_new_tokens set to False (a repro sketch follows these steps)
  2. Watch the console (CMD) output
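A minimal, self-contained repro script, assuming the OpenAI-compatible API is listening on the default 127.0.0.1:5000 (adjust the URL and payload for your setup):

import requests  # pip install requests

resp = requests.post(
    "http://127.0.0.1:5000/v1/chat/completions",  # assumed default address
    json={
        "mode": "chat-instruct",
        "messages": [{"role": "user", "content": "Hello"}],
        "max_tokens": 32,              # explicit small cap
        "auto_max_new_tokens": False,  # explicitly disabled
    },
    timeout=120,
)
print(resp.json())

# Expected: generation capped at 32 new tokens.
# Observed: the server console prints a generate_reply_HF state in which the
# auto_max_new_tokens branch is taken anyway (see Logs below).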

Screenshot

No response

Logs

(Same generate_reply_HF dump as shown under "Parameters received by generate_reply_HF" above.)

System Info

Win 11 x64
kupertdev added the bug (Something isn't working) label Sep 22, 2024