
Commit 1fa758f (deploy: a5e5f5f; 1 parent: b33f383)
750 files changed (+1760, -871 lines)
@@ -0,0 +1,8 @@
:orphan:

:py:mod:`neural_compressor.torch.algorithms.smooth_quant`
=========================================================

.. py:module:: neural_compressor.torch.algorithms.smooth_quant
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,34 @@
1+
:orphan:
2+
3+
:py:mod:`neural_compressor.torch.algorithms.smooth_quant.smooth_quant`
4+
======================================================================
5+
6+
.. py:module:: neural_compressor.torch.algorithms.smooth_quant.smooth_quant
7+
8+
9+
Module Contents
10+
---------------
11+
12+
13+
Functions
14+
~~~~~~~~~
15+
16+
.. autoapisummary::
17+
18+
neural_compressor.torch.algorithms.smooth_quant.smooth_quant.smooth_quantize
19+
20+
21+
22+
.. py:function:: smooth_quantize(model, tune_cfg, run_fn, example_inputs, inplace=True)
23+
24+
Execute the quantize process on the specified model.
25+
26+
:param model: a float model to be quantized.
27+
:param tune_cfg: quantization config for ops.
28+
:param run_fn: a calibration function for calibrating the model.
29+
:param example_inputs: used to trace torch model.
30+
:param inplace: whether to carry out model transformations in-place.
31+
32+
:returns: A quantized model.
33+
34+
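The entry point above implements the SmoothQuant idea: per-channel scales migrate activation outliers into the weights without changing the layer output `y = x @ W`. The following is a minimal pure-Python sketch of that equivalence; the helper names are illustrative only, not part of neural_compressor's API.

```python
# Illustrative SmoothQuant transform: dividing each input channel by a
# scale and multiplying the matching weight row by the same scale
# preserves the matmul output. Names here are hypothetical.

def matvec(w, x):
    """y_j = sum_i x_i * w[i][j], weight stored as rows = in-channels."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]

def apply_smoothing(x, w, scales):
    """Migrate quantization difficulty from activations to weights, per channel."""
    x_s = [xi / s for xi, s in zip(x, scales)]                      # smoothed activations
    w_s = [[wij * s for wij in row] for row, s in zip(w, scales)]   # scaled weights
    return x_s, w_s

x = [8.0, 0.5]                    # one outlier-heavy input channel
w = [[0.25, 0.5], [0.5, 0.25]]    # 2 in-channels x 2 out-channels
x_s, w_s = apply_smoothing(x, w, [4.0, 1.0])

assert matvec(w, x) == matvec(w_s, x_s)   # output is unchanged
```

The values are exact binary fractions, so the equality holds without floating-point tolerance; with arbitrary data a tolerance comparison would be appropriate.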
@@ -0,0 +1,165 @@
:orphan:

:py:mod:`neural_compressor.torch.algorithms.smooth_quant.utility`
=================================================================

.. py:module:: neural_compressor.torch.algorithms.smooth_quant.utility


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   neural_compressor.torch.algorithms.smooth_quant.utility.TorchSmoothQuant
   neural_compressor.torch.algorithms.smooth_quant.utility.CpuInfo


Functions
~~~~~~~~~

.. autoapisummary::

   neural_compressor.torch.algorithms.smooth_quant.utility.generate_activation_observer
   neural_compressor.torch.algorithms.smooth_quant.utility.check_cfg_and_qconfig
   neural_compressor.torch.algorithms.smooth_quant.utility.get_quantizable_ops_recursively
   neural_compressor.torch.algorithms.smooth_quant.utility.get_module
   neural_compressor.torch.algorithms.smooth_quant.utility.set_module
   neural_compressor.torch.algorithms.smooth_quant.utility.update_sq_scale
   neural_compressor.torch.algorithms.smooth_quant.utility.reshape_scale_as_weight
   neural_compressor.torch.algorithms.smooth_quant.utility.reshape_in_channel_to_last
   neural_compressor.torch.algorithms.smooth_quant.utility.reshape_scale_as_input
   neural_compressor.torch.algorithms.smooth_quant.utility.register_autotune

.. py:function:: generate_activation_observer(scheme, algorithm, smooth_quant=False, smooth_quant_enable=False)

   Helper method to generate an activation observer.

   :param scheme: Quantization scheme to be used.
   :type scheme: str
   :param algorithm: The algorithm used to compute the quantization parameters.
   :type algorithm: str

   :returns: An observer.

.. py:function:: check_cfg_and_qconfig(tune_cfg, cfgs, op_infos_from_cfgs, output_tensor_ids_op_name, smooth_quant=False)

   Check configs and quantization configs.

   :param tune_cfg: dictionary of quantization configuration.
   :type tune_cfg: dict
   :param cfgs: the input configs.
   :type cfgs: dict
   :param op_infos_from_cfgs: op infos from configs.
   :type op_infos_from_cfgs: dict
   :param output_tensor_ids_op_name: dictionary of output tensor op names.
   :type output_tensor_ids_op_name: dict

   :returns: cfgs (dict).

.. py:function:: get_quantizable_ops_recursively(model, example_inputs)

   Get all quantizable ops from the model.

   :param model: input model
   :type model: object
   :param example_inputs: used to trace the torch model.
   :type example_inputs: dict|list|tuple|torch.Tensor

   :returns: quantizable_ops (list): list of tuples of op_name and op_type;
             cfgs (dict): dict of configuration.

.. py:function:: get_module(model, key)

   Get a module from the model by key name.

   :param model: original model
   :type model: torch.nn.Module
   :param key: name of the module to retrieve
   :type key: str

.. py:function:: set_module(model, key, new_module)

   Set a new module into the model by key name.

   :param model: original model
   :type model: torch.nn.Module
   :param key: name of the module to be replaced
   :type key: str
   :param new_module: new module to be inserted
   :type new_module: torch.nn.Module

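Helpers like `get_module`/`set_module` are commonly implemented by walking a dotted key such as `"block.linear"` with `getattr`. Below is a hedged sketch of that pattern, shown on plain namespace objects rather than torch modules; it is not necessarily the library's exact code.

```python
from types import SimpleNamespace

def get_module(model, key):
    """Fetch a submodule by dotted name, e.g. 'block.linear'."""
    obj = model
    for name in key.split("."):
        obj = getattr(obj, name)
    return obj

def set_module(model, key, new_module):
    """Replace the submodule named by a dotted key with new_module."""
    parent_key, _, child = key.rpartition(".")
    parent = get_module(model, parent_key) if parent_key else model
    setattr(parent, child, new_module)

# Usage on a stand-in "model" built from namespaces:
model = SimpleNamespace(block=SimpleNamespace(linear="fp32_linear"))
assert get_module(model, "block.linear") == "fp32_linear"
set_module(model, "block.linear", "int8_linear")
assert get_module(model, "block.linear") == "int8_linear"
```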
.. py:function:: update_sq_scale(ipex_config_path, smoothquant_scale_info)

   Update ipex_config.json with the smoothquant scale info generated by our algorithm.

   :param ipex_config_path: path to the temporary ipex_config.json file.
   :type ipex_config_path: str
   :param smoothquant_scale_info: a dict containing smoothquant scale info.
   :type smoothquant_scale_info: dict

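`update_sq_scale` patches a JSON config on disk. The exact ipex_config.json schema is not shown in this page, so the following is only a sketch of the read-merge-write pattern with a made-up flat layout; function and key names are hypothetical.

```python
import json
import os
import tempfile

def update_config(config_path, scale_info):
    """Read a JSON config, merge in new scale entries, and write it back."""
    with open(config_path) as f:
        cfg = json.load(f)
    # "smoothquant_scales" is an assumed key, used here purely for illustration.
    cfg.setdefault("smoothquant_scales", {}).update(scale_info)
    with open(config_path, "w") as f:
        json.dump(cfg, f, indent=2)

# Usage with a throwaway file standing in for ipex_config.json:
path = os.path.join(tempfile.mkdtemp(), "ipex_config.json")
with open(path, "w") as f:
    json.dump({"q_op_infos": {}}, f)

update_config(path, {"linear1": 0.5})
with open(path) as f:
    cfg_after = json.load(f)
assert cfg_after["smoothquant_scales"]["linear1"] == 0.5
```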
.. py:function:: reshape_scale_as_weight(layer, scale)

   Reshape the scale to match the weight's input-channel dimension
   (the output-channel dimension for depthwise layers).

   :param layer: torch module
   :param scale: original scale
   :return: reshaped scale.

.. py:function:: reshape_in_channel_to_last(layer_name, model)

   Move the input channel to the last dimension.

   :param layer_name: layer name
   :param model: model containing the layer
   :return: The reshaped weight.

.. py:function:: reshape_scale_as_input(layer, scale)

   Reshape the scale to match the input feature's channel dimension.

   :param layer: torch module
   :param scale: original scale
   :return: reshaped scale.

.. py:function:: register_autotune(name)

   Class decorator to register a smoothquant auto-tune subclass.

   :return: the registered class.

.. py:class:: TorchSmoothQuant(model, dataloader=None, example_inputs=None, q_func=None, traced_model=None, scale_sharing=True, record_max_info=False)

   Fake input-channel quantization. For more details, please refer to:

   [1] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

   [2] SPIQ: Data-Free Per-Channel Static Input Quantization

   Currently, we only handle layers whose smooth scale can be absorbed; support for other layers will be added later.

   We only support inplace mode, which means the model weights will be changed; you can call the recover function to restore the weights if needed.

.. py:class:: CpuInfo

   Get CPU info.
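Because `TorchSmoothQuant` only supports in-place transformation, recovery implies snapshotting the original weights before smoothing and copying them back on demand. A pure-Python sketch of that record/recover idea follows; the class and method names are illustrative, not the library's API.

```python
import copy

class InplaceSmoother:
    """Scale weights in place, but keep a snapshot so they can be restored."""

    def __init__(self, weights):
        self.weights = weights                   # mutated in place
        self._saved = copy.deepcopy(weights)     # pristine copy for recovery

    def transform(self, scales):
        """Multiply each weight row by its per-channel smoothing scale."""
        for row, s in zip(self.weights, scales):
            for j in range(len(row)):
                row[j] *= s

    def recover(self):
        """Restore the original weights from the snapshot."""
        for row, saved in zip(self.weights, self._saved):
            row[:] = saved

w = [[1.0, 2.0], [3.0, 4.0]]
sm = InplaceSmoother(w)
sm.transform([2.0, 0.5])
assert w == [[2.0, 4.0], [1.5, 2.0]]   # changed in place
sm.recover()
assert w == [[1.0, 2.0], [3.0, 4.0]]   # original weights restored
```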

latest/_sources/autoapi/neural_compressor/torch/algorithms/static_quant/utility/index.rst.txt (+35 -35)
@@ -14,8 +14,8 @@ Classes

 .. autoapisummary::

-   neural_compressor.torch.algorithms.static_quant.utility.TransformerBasedModelBlockPatternDetector
    neural_compressor.torch.algorithms.static_quant.utility.Statistics
+   neural_compressor.torch.algorithms.static_quant.utility.TransformerBasedModelBlockPatternDetector

@@ -24,17 +24,46 @@ Functions

 .. autoapisummary::

+   neural_compressor.torch.algorithms.static_quant.utility.get_quantizable_ops_recursively
+   neural_compressor.torch.algorithms.static_quant.utility.simple_inference
+   neural_compressor.torch.algorithms.static_quant.utility.dump_model_op_stats
    neural_compressor.torch.algorithms.static_quant.utility.get_depth
    neural_compressor.torch.algorithms.static_quant.utility.get_dict_at_depth
    neural_compressor.torch.algorithms.static_quant.utility.get_element_under_depth
    neural_compressor.torch.algorithms.static_quant.utility.paser_cfgs
    neural_compressor.torch.algorithms.static_quant.utility.get_quantizable_ops_from_cfgs
-   neural_compressor.torch.algorithms.static_quant.utility.simple_inference
-   neural_compressor.torch.algorithms.static_quant.utility.dump_model_op_stats
-   neural_compressor.torch.algorithms.static_quant.utility.get_quantizable_ops_recursively


+.. py:function:: get_quantizable_ops_recursively(model, example_inputs)
+
+   Get all quantizable ops from the model.
+
+   :param model: input model
+   :type model: object
+   :param example_inputs: used to trace the torch model.
+   :type example_inputs: dict|list|tuple|torch.Tensor
+
+   :returns: quantizable_ops (list): list of tuples of op_name and op_type;
+             cfgs (dict): dict of configuration.
+
+
+.. py:function:: simple_inference(q_model, example_inputs, iterations=1)
+
+   This function is used for ipex warm-up inference.
+
+
+.. py:function:: dump_model_op_stats(tune_cfg)
+
+   Dump the quantizable ops of the model for the user.
+
+   :param tune_cfg: quantization config
+   :type tune_cfg: dict
+
+   :returns: None
+
 .. py:function:: get_depth(d) -> int

    Query the depth of the dict.
@@ -78,33 +107,10 @@ Functions
    :returns: cfgs (dict).


-.. py:function:: simple_inference(q_model, example_inputs, iterations=1)
-
-   The function is used for ipex warm-up inference.
-
-
-.. py:function:: dump_model_op_stats(tune_cfg)
-
-   This is a function to dump quantizable ops of model to user.
-
-   :param tune_cfg: quantization config
-   :type tune_cfg: dict
-
-   :returns: None
-
-
-.. py:function:: get_quantizable_ops_recursively(model, example_inputs)
-
-   Get all quantizable ops from model.
-
-   :param model: input model
-   :type model: object
-   :param example_inputs: used to trace torch model.
-   :type example_inputs: dict|list|tuple|torch.Tensor
-
-   :returns: list of tuples of op_name and op_type.
-             cfgs (dict): dict of configuration
-   :rtype: quantizable_ops (list)
+.. py:class:: Statistics(data, header, field_names, output_handle=logger.info)
+
+   The statistics printer.


 .. py:class:: TransformerBasedModelBlockPatternDetector(model: torch.nn.Module, pattern_lst: List[List[Union[str, int]]] = BLOCK_PATTERNS)

@@ -113,9 +119,3 @@
    Detect the attention block and FFN block in transformer-based model.


-.. py:class:: Statistics(data, header, field_names, output_handle=logger.info)
-
-   The statistics printer.
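The `Statistics` class above is documented only as "the statistics printer". The sketch below shows what such a fixed-width table printer typically looks like; the constructor shape (`data`, `field_names`, `output_handle`) is borrowed from the signature above, while the function name and formatting details are guessed, not the library's implementation.

```python
def print_statistics(data, field_names, output_handle=print):
    """Render rows as a fixed-width table and emit each line via output_handle."""
    rows = [[str(v) for v in row] for row in data]
    # Column width = widest cell in the column, including the header.
    widths = [max(len(name), *(len(r[i]) for r in rows)) if rows else len(name)
              for i, name in enumerate(field_names)]
    output_handle(" | ".join(n.ljust(w) for n, w in zip(field_names, widths)))
    output_handle("-+-".join("-" * w for w in widths))
    for r in rows:
        output_handle(" | ".join(c.ljust(w) for c, w in zip(r, widths)))

# Usage: collect the rendered lines instead of printing them.
lines = []
print_statistics([["Linear", 12], ["Conv2d", 3]], ["Op type", "Count"],
                 output_handle=lines.append)
assert lines[0].startswith("Op type")
assert "Linear" in lines[2]
```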

latest/autoapi/block_mask/index.html (+1 -1)

@@ -107,7 +107,7 @@
     Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
     <a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
     provided by <a href="https://readthedocs.org">Read the Docs</a>.
-    <jinja2.runtime.BlockReference object at 0x7fd3a8228eb0>
+    <jinja2.runtime.BlockReference object at 0x7f38ecfa9a50>
     <p></p><div><a href='https://www.intel.com/content/www/us/en/privacy/intel-cookie-notice.html' data-cookie-notice='true'>Cookies</a> <a href='https://www.intel.com/content/www/us/en/privacy/intel-privacy-notice.html'>| Privacy</a></div>

latest/autoapi/neural_compressor/adaptor/adaptor/index.html (+1 -1)

@@ -146,7 +146,7 @@ <h3>Functions<a class="headerlink" href="#functions" title="Permalink to this he
     Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
     <a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
     provided by <a href="https://readthedocs.org">Read the Docs</a>.
-    <jinja2.runtime.BlockReference object at 0x7fd3a7726380>
+    <jinja2.runtime.BlockReference object at 0x7f38e9e51720>
     <p></p><div><a href='https://www.intel.com/content/www/us/en/privacy/intel-cookie-notice.html' data-cookie-notice='true'>Cookies</a> <a href='https://www.intel.com/content/www/us/en/privacy/intel-privacy-notice.html'>| Privacy</a></div>

latest/autoapi/neural_compressor/adaptor/index.html (+1 -1)

@@ -217,7 +217,7 @@ <h2>Package Contents<a class="headerlink" href="#package-contents" title="Permal
     Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
     <a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
     provided by <a href="https://readthedocs.org">Read the Docs</a>.
-    <jinja2.runtime.BlockReference object at 0x7fd3a76d36a0>
+    <jinja2.runtime.BlockReference object at 0x7f38ecbfce80>
     <p></p><div><a href='https://www.intel.com/content/www/us/en/privacy/intel-cookie-notice.html' data-cookie-notice='true'>Cookies</a> <a href='https://www.intel.com/content/www/us/en/privacy/intel-privacy-notice.html'>| Privacy</a></div>

latest/autoapi/neural_compressor/adaptor/keras/index.html (+1 -1)

@@ -125,7 +125,7 @@ <h3>Classes<a class="headerlink" href="#classes" title="Permalink to this headin
     Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
     <a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
     provided by <a href="https://readthedocs.org">Read the Docs</a>.
-    <jinja2.runtime.BlockReference object at 0x7fd3a76d21a0>
+    <jinja2.runtime.BlockReference object at 0x7f38e9a9cfd0>
     <p></p><div><a href='https://www.intel.com/content/www/us/en/privacy/intel-cookie-notice.html' data-cookie-notice='true'>Cookies</a> <a href='https://www.intel.com/content/www/us/en/privacy/intel-privacy-notice.html'>| Privacy</a></div>

latest/autoapi/neural_compressor/adaptor/keras_utils/conv2d/index.html (+1 -1)

@@ -106,7 +106,7 @@
     Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
     <a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
     provided by <a href="https://readthedocs.org">Read the Docs</a>.
-    <jinja2.runtime.BlockReference object at 0x7fd3a76d3ac0>
+    <jinja2.runtime.BlockReference object at 0x7f38e9a9e950>
     <p></p><div><a href='https://www.intel.com/content/www/us/en/privacy/intel-cookie-notice.html' data-cookie-notice='true'>Cookies</a> <a href='https://www.intel.com/content/www/us/en/privacy/intel-privacy-notice.html'>| Privacy</a></div>

latest/autoapi/neural_compressor/adaptor/keras_utils/dense/index.html (+1 -1)

@@ -106,7 +106,7 @@
     Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
     <a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
     provided by <a href="https://readthedocs.org">Read the Docs</a>.
-    <jinja2.runtime.BlockReference object at 0x7fd3a76d0340>
+    <jinja2.runtime.BlockReference object at 0x7f38e9e534c0>
     <p></p><div><a href='https://www.intel.com/content/www/us/en/privacy/intel-cookie-notice.html' data-cookie-notice='true'>Cookies</a> <a href='https://www.intel.com/content/www/us/en/privacy/intel-privacy-notice.html'>| Privacy</a></div>

latest/autoapi/neural_compressor/adaptor/keras_utils/depthwise_conv2d/index.html (+1 -1)

@@ -106,7 +106,7 @@
     Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
     <a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
     provided by <a href="https://readthedocs.org">Read the Docs</a>.
-    <jinja2.runtime.BlockReference object at 0x7fd3a73d07f0>
+    <jinja2.runtime.BlockReference object at 0x7f38e9c04640>
     <p></p><div><a href='https://www.intel.com/content/www/us/en/privacy/intel-cookie-notice.html' data-cookie-notice='true'>Cookies</a> <a href='https://www.intel.com/content/www/us/en/privacy/intel-privacy-notice.html'>| Privacy</a></div>

latest/autoapi/neural_compressor/adaptor/keras_utils/index.html (+1 -1)

@@ -106,7 +106,7 @@
     Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
     <a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
     provided by <a href="https://readthedocs.org">Read the Docs</a>.
-    <jinja2.runtime.BlockReference object at 0x7fd3a76d3f70>
+    <jinja2.runtime.BlockReference object at 0x7f38e9e51870>
     <p></p><div><a href='https://www.intel.com/content/www/us/en/privacy/intel-cookie-notice.html' data-cookie-notice='true'>Cookies</a> <a href='https://www.intel.com/content/www/us/en/privacy/intel-privacy-notice.html'>| Privacy</a></div>

latest/autoapi/neural_compressor/adaptor/keras_utils/pool2d/index.html (+1 -1)

@@ -106,7 +106,7 @@
     Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
     <a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
     provided by <a href="https://readthedocs.org">Read the Docs</a>.
-    <jinja2.runtime.BlockReference object at 0x7fd3a76d0cd0>
+    <jinja2.runtime.BlockReference object at 0x7f38e9c07430>
     <p></p><div><a href='https://www.intel.com/content/www/us/en/privacy/intel-cookie-notice.html' data-cookie-notice='true'>Cookies</a> <a href='https://www.intel.com/content/www/us/en/privacy/intel-privacy-notice.html'>| Privacy</a></div>
0 commit comments