Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] check-performance.sh failed with ValueError: invalid literal for int() #1108

Closed
fredoh9 opened this issue Oct 2, 2023 · 11 comments
Closed
Assignees
Labels
P2 Critical bugs or normal features type:bug Something doesn't work as expected

Comments

@fredoh9
Copy link
Collaborator

fredoh9 commented Oct 2, 2023

Describe the bug

check-performance.sh failed with ValueError: invalid literal for int()

2023-10-02 14:41:20 UTC [REMOTE_INFO] Checking SOF component performance
2023-10-02 14:41:20 UTC [REMOTE_COMMAND] sof_perf_analyzer.py --skip-to-first-trace --kmsg=/home/ubuntu/sof-test/logs/check-performance/2023-10-02-14:40:46-4847/dmesg.txt --out2csv /home/ubuntu/sof-test/logs/check-performance/2023-10-02-14:40:46-4847/sof_perf.txt /home/ubuntu/sof-test/logs/check-performance/2023-10-02-14:40:46-4847/mtrace.txt
Traceback (most recent call last):
  File "/home/ubuntu/sof-test/tools/sof_perf_analyzer.py", line 323, in 
    main()
  File "/home/ubuntu/sof-test/tools/sof_perf_analyzer.py", line 316, in main
    process_kmsg_file()
  File "/home/ubuntu/sof-test/tools/sof_perf_analyzer.py", line 222, in process_kmsg_file
    widget_id = int('0x' + widget_id[-6:], 16)
ValueError: invalid literal for int() with base 16: '0x cmd 1'
2023-10-02 14:41:20 UTC [REMOTE_ERROR] SOF component performace analysis tool exit with error
2023-10-02 14:41:20 UTC [REMOTE_INFO] Test Result: FAIL!

Most recent failure link is
planresultdetail/32602?model=MTLP_SDW_AIOC-ace1_0-ipc4&testcase=check-performance

To Reproduce
Test with MTLP_SDW_AIOC
TPLG=/lib/firmware/intel/sof-ace-tplg/sof-mtl-rt711-l0-rt1316-l23-rt714-l1.tplg MODEL=MTLP_SDW_AIOC-ace1_0-ipc4 SOF_TEST_INTERVAL=5 ~/sof-test/test-case/check-performance.sh

@fredoh9 fredoh9 added the P2 Critical bugs or normal features label Oct 2, 2023
@fredoh9 fredoh9 added the type:bug Something doesn't work as expected label Oct 2, 2023
@marc-hb
Copy link
Collaborator

marc-hb commented Oct 3, 2023

From: @aiChaoSONG
Date: Monday, 25 September 2023 at 19:03
Subject: RE: Daily report summary -WW39.1

The module name parsing from the kernel message is based on pattern like this:

[  298.066895] kernel: snd_sof:sof_ipc4_widget_setup: sof-audio-pci-intel-mtl 0000:00:1f.3: Create widget mixout.21.1 instance 0 - pipe 21 - core 0
[  298.066907] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-mtl 0000:00:1f.3: ipc tx      : 0x40000003|0x1000a: MOD_INIT_INSTANCE [data size: 40]

A message that containing Creating widget, and a following message that shows the MOD_INIT_INSTANCE IPC message content. In mtrace log, we only have module id(0x3, 4 hex bits lsb) and instance id(0x0, two hex bits msb), that is 0x000003, by matching 0x40000003, we get the component name mixout.2.1.

The assumption is true in most cases, but sometimes may be wrong. I could try to fix it.

@marc-hb
Copy link
Collaborator

marc-hb commented Oct 3, 2023

The assumption is true in most cases, but sometimes may be wrong

When exactly can it be wrong? Is it because of a (non-performance) bug? If yes then the test should still fail; just with a better error message.

@btian1
Copy link
Contributor

btian1 commented Oct 27, 2023

@fredoh9 , could you attach an dmesg.txt file for further debug this isssue? Chao shared some ideas to me, I can continue work on this issue.

@fredoh9
Copy link
Collaborator Author

fredoh9 commented Oct 30, 2023

@btian1 here you are,
dmesg-performance.txt
This is downloaded from one of latest daily test result.

link: planresultdetail/33757?model=MTLP_SDW_AIOC-ace1_0-ipc4&testcase=check-performance

btian1 added a commit to btian1/sof-test that referenced this issue Nov 6, 2023
When there is no widget ID information in creat widget related logs,
try to search and find the widget ID in bind related logs.

fixes: thesofproject#1108

Signed-off-by: Baofeng Tian <[email protected]>
btian1 added a commit to btian1/sof-test that referenced this issue Nov 7, 2023
When there is no widget ID information in creat widget related logs,
try to search and find the widget ID in bind related logs.

fixes: thesofproject#1108

Signed-off-by: Baofeng Tian <[email protected]>
@btian1
Copy link
Contributor

btian1 commented Nov 14, 2023

@fredoh9 did you observe this issue on other platform? I can reproduce this with MTLP_SDW_AIOC, not sure other platforms, @plbossart may want to have a check from kernel side, and see whether can have a fix from kernel side.

@marc-hb
Copy link
Collaborator

marc-hb commented Nov 14, 2023

I can reproduce this with MTLP_SDW_AIOC, not sure other platforms

If you can easily reproduce this logging issue in at least one configuration then that is great news. Focus the debug and analysis on that configuration for now. Don't submit any fix before understanding and explaining the problem.

@fredoh9
Copy link
Collaborator Author

fredoh9 commented Nov 14, 2023

I saw the error only with MTLP_SDW_AIOC

@fredoh9
Copy link
Collaborator Author

fredoh9 commented Nov 15, 2023

I was wrong, found in MTLP_RVP_HDA also

2023-11-15 16:13:37 UTC [REMOTE_INFO] Checking SOF component performance
2023-11-15 16:13:37 UTC [REMOTE_COMMAND] sof_perf_analyzer.py --skip-to-first-trace --kmsg=/home/ubuntu/sof-test/logs/check-performance/2023-11-15-16:13:03-23579/dmesg.txt --out2html /home/ubuntu/sof-test/logs/check-performance/2023-11-15-16:13:03-23579/sof_perf.html /home/ubuntu/sof-test/logs/check-performance/2023-11-15-16:13:03-23579/mtrace.txt
Traceback (most recent call last):
  File "/home/ubuntu/sof-test/tools/sof_perf_analyzer.py", line 295, in 
    main()
  File "/home/ubuntu/sof-test/tools/sof_perf_analyzer.py", line 290, in main
    process_kmsg_file()
  File "/home/ubuntu/sof-test/tools/sof_perf_analyzer.py", line 207, in process_kmsg_file
    widget_id = int('0x' + widget_id[-6:], 16)
ValueError: invalid literal for int() with base 16: '0xtion 1'

dmesg-mtl-hda.txt

Intel internal link:
planresultdetail/34694?model=MTLP_RVP_HDA&testcase=check-performance

@fredoh9
Copy link
Collaborator Author

fredoh9 commented Nov 15, 2023

For HDA, only two failure was reported for a month. Except today failure, last failure was 2023-10-17 17:27:57.

btian1 added a commit to btian1/sof-test that referenced this issue Nov 21, 2023
This is a follow up with kernel PR: thesofproject/linux#4709
With above PR, pipeline and module creation are using separate
print for information output, hence, script need change as well
to catch up with kernel logs.

fixes: thesofproject#1108

Signed-off-by: Baofeng Tian <[email protected]>
btian1 added a commit to btian1/sof-test that referenced this issue Nov 21, 2023
This is a follow up with kernel PR: thesofproject/linux#4709
With above PR, pipeline and module creation are using separate
print for information output, hence, script need change as well
to catch up with kernel logs.

fixes: thesofproject#1108

Signed-off-by: Baofeng Tian <[email protected]>
btian1 added a commit to btian1/sof-test that referenced this issue Nov 22, 2023
In existing code, we parse component name from the line
that contains 'Create widget', and component ID from the
next line, which normally is the ipc tx message:

Create widget host-copier.0.playback instance 0 - pipe 1 - core 0
ipc tx      : 0x40000004|0x15: MOD_INIT_INSTANCE [data size: 84]

However, due to the concurrent nature of linux kernel, the
assumption that `Create widget` message and `ipc tx` message
are always adjacent is wrong. An exception will be raised
if the next line is not the 'ipc tx' message.

The first step to fix the issue is to make a kernel PR to print
all the information we need in a single 'Create widget' line,
thus to avoid the requirement for multiple lines of kernel message,
see thesofproject/linux#4709.

And next,this patch will parse the new 'Create widget' line to
get component name and ID.

fixes: thesofproject#1108

Signed-off-by: Baofeng Tian <[email protected]>
btian1 added a commit to btian1/sof-test that referenced this issue Nov 22, 2023
In existing code, we parse component name from the line
that contains 'Create widget', and component ID from the
next line, which normally is the ipc tx message:

Create widget host-copier.0.playback instance 0 - pipe 1 - core 0
ipc tx      : 0x40000004|0x15: MOD_INIT_INSTANCE [data size: 84]

However, due to the concurrent nature of linux kernel, the
assumption that `Create widget` message and `ipc tx` message
are always adjacent is wrong. An exception will be raised
if the next line is not the 'ipc tx' message.

The first step to fix the issue is to make a kernel PR to print
all the information we need in a single 'Create widget' line,
thus to avoid the requirement for multiple lines of kernel message,
see thesofproject/linux#4709.

And next,this patch will parse the new 'Create widget' line to
get component name and ID.

fixes: thesofproject#1108

Signed-off-by: Baofeng Tian <[email protected]>
Signed-off-by: Chao Song <[email protected]>
btian1 added a commit to btian1/sof-test that referenced this issue Nov 22, 2023
In existing code, we parse component name from the line
that contains 'Create widget', and component ID from the
next line, which normally is the ipc tx message:

Create widget host-copier.0.playback instance 0 - pipe 1 - core 0
ipc tx      : 0x40000004|0x15: MOD_INIT_INSTANCE [data size: 84]

However, due to the concurrent nature of linux kernel, the
assumption that `Create widget` message and `ipc tx` message
are always adjacent is wrong. An exception will be raised
if the next line is not the 'ipc tx' message.

The first step to fix the issue is to make a kernel PR to print
all the information we need in a single 'Create widget' line,
thus to avoid the requirement for multiple lines of kernel message,
see thesofproject/linux#4709.

And next,this patch will parse the new 'Create widget' line to
get component name and ID.

Fixes: thesofproject#1108

Signed-off-by: Baofeng Tian <[email protected]>
Signed-off-by: Chao Song <[email protected]>
@marc-hb
Copy link
Collaborator

marc-hb commented Dec 28, 2023

@btian1
Copy link
Contributor

btian1 commented Dec 28, 2023

Thanks, Marc, and with one more PR to kernel:
thesofproject/linux#4709
#1122

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P2 Critical bugs or normal features type:bug Something doesn't work as expected
Projects
None yet
Development

No branches or pull requests

4 participants