Update "Performance Adaptation" use case #207

29 changes: 19 additions & 10 deletions index.bs
@@ -313,7 +313,7 @@ noise suppression using Recurrent Neural Network such as [[RNNoise]] for
suppressing background dynamic noise like baby cry or dog barking to improve
audio experiences in video conferences.

### Detecting fake video ### {#usecase-detecting-fake-video}
### Detecting Fake Video ### {#usecase-detecting-fake-video}

A user is exposed to realistic fake videos generated by ‘deepfake’ on the web.
The fake video can swap the speaker’s face into the president’s face to incite
@@ -355,15 +355,24 @@ fully-connected layers with it.

### Performance Adaptation ### {#usecase-perf-adapt}

A web application developer is concerned about the performance of her DNN model on
mobile devices. She has confirmed that it may run too slowly on mobile devices
that do not have GPU acceleration. To address this issue, her web application
refers to the WebNN API to confirm whether acceleration is available, so
that the application can display a warning for devices without acceleration.

After several weeks, she has developed a tiny DNN model that can even run on a
CPU. In order to accommodate CPU execution, she modifies the application
so that it loads the tiny model on CPU-only devices.
A web application developer is concerned about the performance of her DNN model.
The model needs to run both on a mobile device with a low-power CPU and on a
laptop with a powerful CPU, a GPU, and a dedicated AI accelerator.

She has confirmed that the model may run too slowly on the mobile device, which does
not have GPU acceleration. To address this issue, her web application uses the
WebNN API to check whether acceleration is available, so that the application
can display a warning on devices without acceleration. After several weeks, she
has developed a tiny DNN model that can even run on a CPU. To accommodate CPU
execution, she modifies the application so that it loads the tiny model on
CPU-only devices.
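A non-normative sketch of this fallback pattern is shown below. It assumes a `navigator.ml.createContext()` entry point that accepts a `deviceType` option, plus hypothetical application helpers `loadModel()` and `showWarning()`; the names are illustrative only and not tied to the exact API shape defined in this specification.

```js
// Illustrative sketch only: the 'deviceType' option and the loadModel() /
// showWarning() helpers are assumptions for this example, not normative WebNN API.
async function selectModel() {
  try {
    // Ask for a GPU-backed context; this is expected to fail on devices
    // without GPU acceleration.
    const context = await navigator.ml.createContext({ deviceType: 'gpu' });
    return { context, graph: await loadModel(context, 'full-model.bin') };
  } catch (e) {
    // No acceleration available: warn the user and fall back to the tiny
    // model that is small enough to run on the CPU.
    showWarning('No GPU acceleration detected; using the smaller model.');
    const context = await navigator.ml.createContext({ deviceType: 'cpu' });
    return { context, graph: await loadModel(context, 'tiny-model.bin') };
  }
}
```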

When executing the DNN model on the laptop, which has a more powerful CPU, a GPU
and a dedicated AI accelerator, she wants to use the execution device that
minimizes inference time. To do so, she runs the model on each execution device
and measures the inference time of each test run. This information helps her
release a web application that provides the best possible user experience on
the available hardware.
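A rough sketch of such a measurement loop follows. Again, the `deviceType` values, the `createContext()` option, and the `loadModel()` and `runInference()` helpers are assumptions made for illustration; a real application would also warm up each device and average several runs rather than timing a single inference.

```js
// Illustrative sketch only: device names, options and helpers are assumed.
async function pickFastestDevice(testInput, deviceTypes = ['cpu', 'gpu', 'npu']) {
  let best = null;
  for (const deviceType of deviceTypes) {
    try {
      const context = await navigator.ml.createContext({ deviceType });
      const graph = await loadModel(context, 'full-model.bin');
      const start = performance.now();
      await runInference(context, graph, testInput); // single timed test run
      const elapsed = performance.now() - start;
      if (best === null || elapsed < best.elapsed) {
        best = { deviceType, elapsed };
      }
    } catch (e) {
      // This device type is not available on the current hardware; skip it.
    }
  }
  return best; // e.g. { deviceType: 'gpu', elapsed: 12.3 }
}
```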

### Operation Level Execution ### {#usecase-op-level-exec}
