Commit ba475a7

Setup package (#1)
* Add tests
* Add intel neural compressor examples
* Rename file
* Update transformers version for INC
1 parent 567d474 commit ba475a7

38 files changed: +7804 -1 lines changed

.github/PULL_REQUEST_TEMPLATE.md

+22
@@ -0,0 +1,22 @@
+# What does this PR do?
+
+<!--
+Congratulations! You've made it this far! You're not quite done yet though.
+
+Once merged, your PR is going to appear in the release notes with the title you set, so make sure it's a great title that fully reflects the extent of your awesome contribution.
+
+Then, please replace this with a description of the change and which issue is fixed (if applicable). Please also include relevant motivation and context. List any dependencies (if any) that are required for this change.
+
+Once you're done, someone will review your PR shortly (see the section "Who can review?" below to tag some potential reviewers). They may suggest changes to make the code even better. If no one reviewed your PR after a week has passed, don't hesitate to post a new comment @-mentioning the same persons---sometimes notifications get lost.
+-->
+
+<!-- Remove if not applicable -->
+
+Fixes # (issue)
+
+
+## Before submitting
+- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
+- [ ] Did you make sure to update the documentation with your changes?
+- [ ] Did you write any new necessary tests?
+
+54
@@ -0,0 +1,54 @@
+name: check_code_quality
+
+on:
+  push:
+    branches: [ main ]
+    paths:
+      - "optimum/**.py"
+      - "tests/**.py"
+      - "examples/**.py"
+
+  pull_request:
+    branches: [ main ]
+    paths:
+      - "optimum/**.py"
+      - "tests/**.py"
+      - "examples/**.py"
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
+  cancel-in-progress: true
+
+jobs:
+  build:
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: [3.8]
+        os: [ubuntu-20.04]
+
+    runs-on: ${{ matrix.os }}
+    steps:
+      - uses: actions/checkout@v2
+      - name: Setup Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v2
+        with:
+          python-version: ${{ matrix.python-version }}
+      - name: Create and start a virtual environment
+        run: |
+          python -m venv venv
+          source venv/bin/activate
+      - name: Install dependencies
+        run: |
+          source venv/bin/activate
+          pip install --upgrade pip
+          pip install isort
+          pip install black
+      - name: Check style with black
+        run: |
+          source venv/bin/activate
+          black --check .
+      - name: Check style with isort
+        run: |
+          source venv/bin/activate
+          isort --check .
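
The `concurrency` block in the workflow above keys each run on a group name, and `cancel-in-progress: true` cancels an older run when a newer one joins the same group. As a rough illustration only, the sketch below reproduces how the expression `${{ github.workflow }}-${{ github.head_ref || github.run_id }}` resolves. The function name and sample values are hypothetical; the one assumption about GitHub Actions is that `github.head_ref` is empty outside pull request events, so `||` falls back to the run ID.

```python
def concurrency_group(workflow: str, head_ref: str, run_id: int) -> str:
    """Mimic `${{ github.workflow }}-${{ github.head_ref || github.run_id }}`."""
    # An empty head_ref is falsy, so `or` falls through to the run ID,
    # mirroring the `||` operator in the workflow expression.
    return f"{workflow}-{head_ref or run_id}"


# Hypothetical values: one pull request run and two push runs.
pr_group = concurrency_group("check_code_quality", "my-feature-branch", 101)
push_group_a = concurrency_group("check_code_quality", "", 102)
push_group_b = concurrency_group("check_code_quality", "", 103)

print(pr_group)
print(push_group_a, push_group_b)
```

Under this reading, successive runs for the same pull request branch share a group, so the older run is cancelled, while each push to `main` gets its own group and runs to completion.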

.github/workflows/test_general.yml

+37
@@ -0,0 +1,37 @@
+# This workflow will install Python dependencies, run tests and lint with a variety of Python versions
+# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions
+name: Neural Compressor / Python - Test
+
+on:
+  push:
+    branches: [ main ]
+  pull_request:
+    branches: [ main ]
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
+  cancel-in-progress: true
+
+jobs:
+  build:
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: [3.8, 3.9]
+        os: [ubuntu-18.04]
+
+    runs-on: ${{ matrix.os }}
+    steps:
+      - uses: actions/checkout@v2
+      - name: Setup Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v2
+        with:
+          python-version: ${{ matrix.python-version }}
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install .[tests]
+          pip install torch==1.9.1
+      - name: Test with Pytest
+        run: |
+          pytest tests/

examples/config/prune.yml

+37
@@ -0,0 +1,37 @@
+#
+# Copyright (c) 2021 Intel Corporation
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+version: 1.0
+
+model:
+  name: bert_prune
+  framework: pytorch
+
+pruning:
+  approach:
+    weight_compression:
+      initial_sparsity: 0.0
+      target_sparsity: 0.1 # targeted sparsity of 10%
+      start_epoch: 0
+      end_epoch: 1
+      pruners:
+        - !Pruner
+            prune_type: basic_magnitude
+tuning:
+  accuracy_criterion:
+    relative: 0.1 # only verifying workflow, accuracy loss percentage: 10%
+  exit_policy:
+    timeout: 0 # tuning timeout (seconds)
+  random_seed: 9527 # random seed
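
To make `prune_type: basic_magnitude` with `target_sparsity: 0.1` more concrete, here is a minimal sketch of magnitude pruning on a flat list of weights: the 10% of weights with the smallest absolute value are set to zero. This is an illustration of the general technique, not Intel Neural Compressor's implementation. The function name and the sample weights are hypothetical, and the ramp from `initial_sparsity` to `target_sparsity` between `start_epoch` and `end_epoch` is ignored here.

```python
def magnitude_prune(weights, target_sparsity):
    """Zero out the smallest-magnitude weights until `target_sparsity` is reached."""
    n_prune = int(len(weights) * target_sparsity)
    # Indices of the n_prune weights with the smallest absolute value.
    smallest = sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:n_prune]
    pruned = list(weights)  # leave the input untouched
    for i in smallest:
        pruned[i] = 0.0
    return pruned


# Hypothetical weights; with 10 values and 10% sparsity, exactly one is zeroed.
weights = [0.5, -0.01, 0.3, -0.8, 0.02, 0.9, -0.4, 0.7, 0.05, -0.6]
pruned = magnitude_prune(weights, target_sparsity=0.1)
sparsity = pruned.count(0.0) / len(pruned)

print(pruned)
print(sparsity)
```

Real pruners typically operate per layer on tensors and keep a mask so that pruned weights stay at zero during further training; the list-based version above only shows the selection rule.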

examples/config/quantization.yml

+33
@@ -0,0 +1,33 @@
+#
+# Copyright (c) 2021 Intel Corporation
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+version: 1.0
+
+model: # mandatory.
+  name: bert
+  framework: pytorch # mandatory. possible values are pytorch and pytorch_fx.
+
+device: cpu
+
+quantization: # optional.
+  approach: post_training_dynamic_quant
+
+tuning:
+  accuracy_criterion:
+    relative: 0.03 # optional. default value is relative, other value is absolute. this example allows relative accuracy loss: 3%.
+  exit_policy:
+    timeout: 0 # optional. tuning timeout (seconds). default value is 0 which means early stop. combine with max_trials field to decide when to exit.
+    max_trials: 30
+  random_seed: 9527 # optional. random seed for deterministic tuning.
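
The `accuracy_criterion` comment above distinguishes a relative tolerance from an absolute one. The sketch below shows one plausible reading of the two modes. It assumes a metric where higher is better, and the function name, argument names, and sample numbers are all hypothetical; it is not taken from Intel Neural Compressor's code.

```python
def meets_accuracy_criterion(baseline, candidate, tolerance, mode="relative"):
    """Return True if the accuracy drop from `baseline` to `candidate` is tolerated."""
    drop = baseline - candidate
    if mode == "relative":
        # The drop is measured as a fraction of the baseline accuracy.
        return drop <= tolerance * baseline
    if mode == "absolute":
        # The drop is measured in the metric's own units.
        return drop <= tolerance
    raise ValueError(f"unknown mode: {mode!r}")


# Hypothetical numbers: a 0.90 baseline with the 3% relative tolerance above.
small_drop_ok = meets_accuracy_criterion(0.90, 0.88, 0.03)
large_drop_ok = meets_accuracy_criterion(0.90, 0.86, 0.03)
absolute_ok = meets_accuracy_criterion(80.0, 78.5, 2, mode="absolute")

print(small_drop_ok, large_drop_ok, absolute_ok)
```

With a 0.90 baseline, a 3% relative tolerance accepts any candidate at or above roughly 0.873, so the same `relative` value becomes stricter in absolute terms as the baseline gets lower.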

examples/language-modeling/README.md

+82
@@ -0,0 +1,82 @@
+<!---
+Copyright 2020 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# Language modeling training
+
+The scripts [`run_clm.py`](https://github.com/huggingface/optimum/blob/main/examples/language-modeling/run_clm.py)
+and [`run_mlm.py`](https://github.com/huggingface/optimum/blob/main/examples/language-modeling/run_mlm.py)
+allow us to apply different quantization approaches (dynamic quantization, static quantization and quantization-aware training) as well as pruning
+using the [Intel Neural Compressor (INC)](https://github.com/intel/neural-compressor) library for language modeling tasks.
+
+
+GPT and GPT-2 are trained or fine-tuned using a causal language modeling (CLM) loss. ALBERT, BERT, DistilBERT and
+RoBERTa are trained or fine-tuned using a masked language modeling (MLM) loss. More information about the differences
+between those objectives can be found in our [model summary](https://huggingface.co/transformers/model_summary.html).
+
+
+### GPT-2/GPT and causal language modeling
+
+The following example fine-tunes GPT-Neo on WikiText-2, first applying magnitude pruning and then quantization-aware training.
+We're using the raw WikiText-2 (no tokens were replaced before the tokenization). The loss here is that of causal language modeling (CLM).
+
+```bash
+python run_clm.py \
+    --model_name_or_path EleutherAI/gpt-neo-125M \
+    --dataset_name wikitext \
+    --dataset_config_name wikitext-2-raw-v1 \
+    --quantize \
+    --quantization_approach aware_training \
+    --prune \
+    --target_sparsity 0.02 \
+    --perf_tol 0.5 \
+    --do_train \
+    --do_eval \
+    --verify_loading \
+    --output_dir /tmp/clm_output
+```
+
+### RoBERTa/BERT/DistilBERT and masked language modeling
+
+The following example fine-tunes BERT on WikiText-2 while applying quantization-aware training and magnitude pruning. We're using the raw
+WikiText-2. The loss is different because BERT and RoBERTa are bidirectional models; we therefore use the same loss
+that was used during their pre-training: the masked language modeling (MLM) loss.
+
+```bash
+python run_mlm.py \
+    --model_name_or_path bert-base-uncased \
+    --dataset_name wikitext \
+    --dataset_config_name wikitext-2-raw-v1 \
+    --quantize \
+    --quantization_approach aware_training \
+    --prune \
+    --target_sparsity 0.1 \
+    --perf_tol 0.5 \
+    --do_train \
+    --do_eval \
+    --verify_loading \
+    --output_dir /tmp/mlm_output
+```
+
+To apply dynamic quantization, static quantization or quantization-aware training, set `quantization_approach` to
+`dynamic`, `static` or `aware_training`, respectively.
+
+The configuration files describing the model quantization and pruning objectives can be
+specified using `quantization_config` and `pruning_config`, respectively. If not specified, the default
+[quantization](https://github.com/huggingface/optimum/blob/main/examples/config/quantization.yml)
+and [pruning](https://github.com/huggingface/optimum/blob/main/examples/config/prune.yml)
+config files will be used.
+
+The flag `--verify_loading` can be passed to verify that the resulting quantized model can be loaded correctly.
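
The README above names three `quantization_approach` values, while the default quantization config uses the INC-style string `post_training_dynamic_quant`. As a hedged sketch of how the two vocabularies might line up, the mapping below pairs each command-line value with the `approach` string used in Intel Neural Compressor 1.x YAML configs. Only the `dynamic` pairing is confirmed by the config in this commit; the other two entries, the dictionary, and the helper function are assumptions for illustration and may not match what `run_clm.py` and `run_mlm.py` actually do.

```python
# Assumed mapping from `--quantization_approach` values (from the README)
# to the `approach` strings used in INC 1.x YAML configs.
INC_APPROACH = {
    "dynamic": "post_training_dynamic_quant",
    "static": "post_training_static_quant",
    "aware_training": "quant_aware_training",
}


def resolve_approach(cli_value: str) -> str:
    """Translate a command-line value into a config `approach` string (hypothetical helper)."""
    try:
        return INC_APPROACH[cli_value]
    except KeyError:
        supported = ", ".join(sorted(INC_APPROACH))
        raise ValueError(
            f"unsupported quantization approach {cli_value!r}; expected one of: {supported}"
        ) from None


print(resolve_approach("aware_training"))
```

Failing early on an unknown value keeps a typo on the command line from silently falling back to the default approach in the config file.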
@@ -0,0 +1,33 @@
+#
+# Copyright (c) 2021 Intel Corporation
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+version: 1.0
+
+model: # mandatory.
+  name: bert
+  framework: pytorch # mandatory. possible values are pytorch and pytorch_fx.
+
+device: cpu
+
+quantization: # optional.
+  approach: post_training_dynamic_quant
+
+tuning:
+  accuracy_criterion:
+    absolute: 2 # optional. default value is relative, other value is absolute. this example allows absolute accuracy loss of 2.
+  exit_policy:
+    timeout: 0 # optional. tuning timeout (seconds). default value is 0 which means early stop. combine with max_trials field to decide when to exit.
+    max_trials: 30
+  random_seed: 9527 # optional.
@@ -0,0 +1,5 @@
+accelerate
+torch >= 1.9
+datasets >= 1.8.0
+sentencepiece != 0.1.92
+protobuf
