Commit c3e3a83

zheng-da authored and piiswrong committed
Refactor operators and add MKLDNN (apache#9677)
* Remove MKL code. * Integrate MKLDNN. Update MXNet for MKLDNN. Enable MKLDNN Relu. Fix a compilation error. Change Makefile for MKLDNN. Remove infer storage in convolution. Update MXNet for MKLDNN. Support MKLDNN storage type in python. Update activation. Add MKLDNN base classes. Implement MKLDNN fully connected. Add MKLDNN convolution. Update MKLDNN interface in NDArray. MKLDNN convolution handle CreateMKLDNNData failure. Add another GetMKLDNNData in NDArray. Have mkldnn to define the data format. Create output MKLDNN memory explicitly for FC. Fix a bug in NDArray. Fix a bug in GetWeightDesc. Convert data layout if necessary in FC. remove unnecessary print in MKLDNN convolution. Add MKLDNN deconvolution. Add MKLDNNStream to manage primitives and memories. Use MKLDNNStream to register memory in NDArray. Use MKLDNNStream to manage resources in operators. Handle kAddTo in MKLDNN operators. Fix a bug in deconvolution. Fix bugs in NDArray. Revert "Fix bugs in NDArray." This reverts commit f5624a4. Fix a bug in NDArray. Fix a bug in NDArray. Reorder MKLDNN memory to default format in SetTBlob. Disable MKLDNN correctly. Fix a bug in activation. Reshape of NDArray supports MKLDNN. Fix a memory ref bug in NDArray. Reshape NDArray in MKLDNN FullyConnected. Fix data format conversion. Create MKLDNN NDArray in python. Support Slice for MKLDNN NDArray. Reduce the overhead of summing the result to the output array. Avoid unnecessary memory copy in NDArray. Fix a bug in data reordering. Fix a bug in NDArray. Don't hard code MKLDNN type. Support dilation in MKLDNN convolution. Fix a bug in sum results. Rewrite GetMKLDNNData. Add prepare_mkldnn.sh Enable MKLDNN activation. Fix a bug on FullyConnected. Handle 3 dims for MKLDNN NDArray. Fix a bug in MKLDNN FC. Support MKLDNN storage in KV store. Fix a bug in executor for non-default NDArray. Fix a link error in cast_storage.cc. Remove unnecessary function def Fall back to def storage if the type isn't supported by MKLDNN. 
Use NDArray for MKLDNN in python. Reshape output of MKLDNN convolution. Fix a bug in NDArray. Support more operations in MKLDNN NDArray. Fix a bug in deconvolution. Fix bugs in MKLDNN deconvolution. We still need to compute bias correctly. Have elemwise binary ops to fall to default for MKLDNN. Limit the cases that MKLDNN operations are called. Force the layout of mkldnn::memory from NDArray. Add MKLDNN softmax. Fix output storage type of MKLDNN softmax. Add MKLDNN sum. Fix a bug in elemwise sum. Fix a bug in MKLDNN softmax. Fix a bug in imperative. Clean up dispatch modes. Remove redundant code. MKLDNN Pooling Op integration MKLDNN Pooling Op integration add missing file fix mkldnn pooling op workspace issue handle workspace in MKLDNN pooling correctly. Use a non-MKLDNN op for testing. Allow to share arguments and their gradients between executors. Avoid using MKLDNN pooling when it's not supported. Support MKLDNN properly. Choose MKLDNN softmax more carefully. Fix a bug in MKLDNN pooling. Fall back if MKLDNN pooling isn't supported. Fix a bug in Slice of NDArray. Use int32 for workspace memory. Exclude MKLDNN act with tanh. Have two Reshape functions in NDArray. Copy data for NDArray with diff shapes. Add MKLDNN copy. Add MKLDNN version of elemwise_add. Add MKLDNN version of Flatten. add mkldnn surport for concat simplify MKLDNN Flatten. Enalbe MKLDNN deconvolution with bias. Fix a bug in CuDNN deconvolution. avoid using MKLDNNStorage when it's not defined. Remove ./cudnn_lrn-inl.h Fix for make lint. add mkldnn surport for concat fix the coding style for pr of mkldnn concat Only add input data for MKLDNN concat backward Remove unnecessary TODO. remove unnecessary __repr__ in MKLNDArray. better condition check for readability. Use macro when including mkldnn.hpp. Revert "Use CoreOpRunner for refactored Ops." This reverts commit a28586f. Fix a bug in test core. Limit MKLDNN ops being used. 
Fix complains from "make pylint" Move ContainStorage to common/utils.h Limit MKLDNN concat being used. Add license. Fix amalgamation Fix compilation error in mkldnn_ops-inl.h Fix a bug in deconvolution. Fix a bug in pooling. MKLDNN ops allocates temp mem. Fix a bug in pooling. Allocate align memory from temp space. Have parameter gradients stored in the default storage. Handle all cases in CopyFrom. Ensure NDArray returns memory with right memory descriptors. use auto to define memory in the operator. Use raw pointer for mkldnn memory. Move more code to mkldnn_base.cc Fix a compilation error. Address review comments. fix a bug in activation backward. Miss a macro in mkldnn_base.cc Fix a bug in data iterator in examples. Avoid memory allocation in ReshapeMKLDNN. Avoid memory allocation in storage cast. Fix a bug in cast storage. Handle sliced MKLDNN NDArray. Use memcpy if NDArray uses default format. Revert "Limit MKLDNN ops being used." This reverts commit 75e2ae5. Enable mkldnn act backward has the same input layout. Fix a bug in mkldnn activation. Use MKLDNN sum in more cases. Improve perf of reorder. Avoid memory reorder in conv and deconv. Avoid unnecessary storage cast in fallback path. Revert "Use MKLDNN sum in more cases." This reverts commit 7a21ebc. Handle sliced ndarray in more cases. Fix a complain from make lint. Update Jenkins to test MKLDNN. debug compiling mkldnn. Use MKLDNN sum in more cases. Add mkldnn as a submodule. Compile with mkldnn in 3rdparty. Fix some coding styles. write the path to mkldnn lib in libmxnet.so. use rpath with $ORIGIN. Pack all lib files in Jenkins. pack and unpack mxnet with MKLDNN. Update Jenkinsfile Update Jenkinsfile Add mkldnn batch normalization Fix bugs in BN. Avoid memory allocation in MKLDNNCopy. only use MKLDNN BatchNorm for special cases. MKLDNN BatchNorm doesn't work well on the default layout. Add MKL-DNN based LRN Code Style Changes Fix a bug in BN. Fix a bug in LRN. Handle non-default storage in memory plan. 
Fix coding style. Fix a compilation error without mkldnn. Fix some coding styles for batch norm Improve forward of convolution. Add openmp and simd support to BN operator Retrieve MKLDNN Conv primitive based on signature. Retrieve Act primitive based on its signature. Fix a bug in pooling. Diable some MKLDNN activation and pooling. Cast MKLDNN storage with diff data type. Check if it's a view of NDArray. Reshaped and sliced arrays share the same chunks. Implement caching MKLDNN Act correctly. Fix a bug in check_consistency. Fix a potential bug when destroying NDArray. Fix bugs when allocating mem in NDArray. Fix coding style. Add micro when using mkldnn in ndarray. Fix a compilation error. Fix a bug in concat. Remove MKLDNNStorage. handle diff layouts in CopyFromToDnsImpl. Fallback correctly. Force weight grad to use default layout. Reorder weight arrays in (de)conv for faster inference. Avoid caching TBlob from NDArray. This commit may add some overhead of managing NDArray for each fallback. Fix a bug in Flatten. handle ndarray with def layout in mkldnn BN correctly. Align to page when mkldnn is enabled. Use default mem alloc for mkldnn. Reuse NDArrays. Support WriteInplace for sum. fix complains from "make lint". Avoid reallocation in NDArray. Handle weight arrays with special MKLDNN layouts. Remove unnecessary GetWeights. Fix compilation error without MKLDNN. Fix a bug in (de)conv for weight arrays. Fix a minor bug in MKLDNN conv. Fix a bug in MKLDNNOpSignature. Reimplement fallback for MKLDNN ops. Fix a bug in FallbackExecutor. Add params in hashcode. Invalidate data in outputs to accelerate. Fix a minor bug. Update mkldnn_base-inl.h Add primitive caching for Pooling forward computation Add hashcode in pooling parameters. Support NDArray copy with types unsupported by MKLDNN. Avoid using MKLDNN concat for negative dimension. Fix make lint complain. Disable mkldnn avg pooling for now. Fix a compile warning. Fix compile error when MKLDNN is disabled. 
OP primitive cache: use memory as signature for MKLDNN storage type Remove MKLDNN array in python. Disable Clang tests in Jenkins. Use mklml dockers to test mkldnn. Update MKLDNN repo to zhengda's mkldnn repo. Update MKLDNN repo to ashok's. Fix a bug in fallback. Change avg pooling algorithm to pooling_avg_include_padding Fix a code style in mkldnn pooling. Temp fix a bug in FC. Revert "Disable Clang tests in Jenkins." This reverts commit b4efa8f. Rebase and Refactor deconv (apache#20) * rebase to Da,Zheng refactor branch Jan.14, add signature for mkldnn Deconv and modify classMKLDNNDeconvForward * fix make lint complains A simple way of caching BN inference. cache BN forward for both training and inference. Fix some minor problems in BN. Fix a bug in caching BN. force to build with avx2 in Jenkins. Remove the remaining MKLDNNStorageType Some minor updates in NDArray. a lot of updates to address comments. minor changes. * Use NNVM interface. Use NNVM interface for upsampling. Use NNVM interface for convolution. Use NNVM interface for deconvolution. Use NNVM interface for FullyConnected. Move NNVM interface to batch norm. Use NNVM interface for depthwise convolution. Use NNVM interface for softmax activation. Use NNVM interface for pooling. use NNVM interface for dropout. Use NNVM interface for activation. Use NNVM interface for CuDNN batch norm. Use NNVM interface for CuDNN pooling. Use NNVM interface for CuDNN softmax activation. Use NNVM interface for CuDNN activation. Use NNVM interface for CuDNN convolution. Use NNVM interface for CuDNN deconvolution. Move concat to nn/ Use NNVM interface for concat. Fix headers in concat. Move lrn to nn/. Use NNVM interface for LRN. Fix a compilation error in convolution. Fix a compilation error in activation. Fix coding style. Fix coding style for make lint. use enums in batch norm. Use CoreOpRunner for refactored Ops. Make FullyConnected stateless. Make upsampling stateless. Make pooling stateless. Make batchnorm stateless. 
Make SoftmaxActivation stateless. Fix a code style problem. pass amalgamation test for batch norm. pass amalgamation test for dropout. Get convolution ops from a function. Fix compilation errors for GPU. Fix thread local in diff platforms. Avoid using thread_local for non-CuDNN conv/deconv. Remove TODO in deconv. Fix a bug in batch norm. Fix a bug in fully connected. Don't set #inputs for backward convolution. Revert "Make pooling stateless." * revert modification in test_executor. * Fix a bug in FlattenStorageType. * Remove BN debug. * Remove remaining MXNET_USE_MKL2017 * Remove unused code in pooling. * Fixing bugs in gtests. * Fix lint errors. * a lot of minor updates to address comments. * Fix coding style in MKLDNN Pooling (apache#22) * revert the code change in the previous code refactor. * Fix a bug in pooling. * LRN coding style changes (apache#21) * LRN coding style change * Add const for local variables * Add req for LRN forward * rebase code * align API interface * revert modification in test_executor. * cast storage with MKLDNN properly. * Minor updates to address comments. * some minor updates. * Switch to the master branch of MKLDNN. * Minor updates to address comments. * Update activation.cc * Fix a bug in convert NDArray. * Add gluon model zoo tests. * Update GPU tests on model zoo. * Avoid using mobilenet for GPU tests with gluon models. mobilenet can't pass the test even without MKLDNN. * Update GPU tests on gluon. * change cmake to compile MKLDNN. * update cmake for MKLDNN. * Implement align myself. * Switch to intel/mkl-dnn. * Fix errors in align unittest. * Add unit test for LRN. * fix a compilation error. * use storage_type_assign to determine storage type. * avoid global pooling in mkldnn. There is a bug in global pooling in mkldnn. * compare all MKLDNN ops with native impls. add MXNET_MKLDNN_DEBUG to control the test. * Fix a bug in testing correctness. * print the name of buggy operator. * undo some modifications. 
* Fix a bug on reshaped array. * avoid testing outputs with NullOp. * turn on MKLDNN tests in Jenkins. * print each operator in MKLDNN tests. * rename test_gluon_model_zoo.py * Create hashcode for operator parameters properly. * Add USE_MKL2017 back. * Print warning messages. * move batchnorm tests to nnvm interface. * Delete batchnorm v1 tests. * Get inputs and outputs in batchnorm tests. * disable batchnorm tests for now. * Fix GPU tests on gluon model zoo. * Fix lint complains in tests. * Remove simd from openmp instructions in BatchNorm (apache#24) * Remove warnings. * Fix MKLDNN 1st compile failure issue (apache#23) * Fix compilation errors. * Remove ARCH_OPT in Jenkins. * Revert "avoid global pooling in mkldnn." This reverts commit f6efd34. * Move to the latest MKLDNN. This fixes the bug in global pooling. * WIP unit tests (apache#25) * WIP unit tests * some backward items initialized * Make more C++ unit tests work for batch norm (apache#28) * WIP unit tests * some backward items initialized * some backward items initialized * some backward items initialized * first unit test working * Working on types * backward types working for fp16 on first unit test * backward types working for fp16 on first unit test * backward types working for fp16 on first unit test * . * . * some tests working * fix input data * hangle gpu<->cpu for setting values * gpu working * gpu working * CAccessAsCPU class * Fix varying type in AccessAsCPU * starting to add channel axis tests * TestChannelAxisSimple * TestChannelAxisSimple * run bidirectional * run bidirectional * run bidirectional * CLEANUP * CLEANUP * .. * noaxis * .. * lint * revert * revert * Fix lint complains. * Fix a minor problem in Makefile. * fix GPU pooling. * Disable modelzoo inference tests. * update accuracy checks for MKLDNN. * Fix MKLDNN pooling for global pooling. * Fix Jenkins. * Fix a bug in Jenkins. * Fix Jenkins
1 parent d03182f commit c3e3a83

121 files changed, +10758 -9120 lines changed

.gitmodules

+4
@@ -22,3 +22,7 @@
 [submodule "3rdparty/googletest"]
 path = 3rdparty/googletest
 url = https://github.com/google/googletest.git
+[submodule "3rdparty/mkldnn"]
+path = 3rdparty/mkldnn
+url = https://github.com/intel/mkl-dnn.git
+branch = master

3rdparty/mkldnn

Submodule mkldnn added at 283c4a8

CMakeLists.txt

+27 -15
@@ -17,8 +17,8 @@ mxnet_option(USE_OPENMP "Build with Openmp support" ON)
 mxnet_option(USE_CUDNN "Build with cudnn support" ON) # one could set CUDNN_ROOT for search path
 mxnet_option(USE_LAPACK "Build with lapack support" ON IF NOT MSVC)
 mxnet_option(USE_MKL_IF_AVAILABLE "Use MKL if found" ON)
-mxnet_option(USE_MKLML_MKL "Use MKLML variant of MKL (if MKL found)" ON IF USE_MKL_IF_AVAILABLE AND UNIX AND (NOT APPLE))
-mxnet_option(USE_MKL_EXPERIMENTAL "Use experimental MKL (if MKL enabled and found)" OFF)
+mxnet_option(USE_MKLML_MKL "Use MKLDNN variant of MKL (if MKL found)" ON IF USE_MKL_IF_AVAILABLE AND UNIX AND (NOT APPLE))
+mxnet_option(USE_MKLDNN "Use MKLDNN variant of MKL (if MKL found)" ON IF USE_MKL_IF_AVAILABLE AND UNIX AND (NOT APPLE))
 mxnet_option(USE_OPERATOR_TUNING "Enable auto-tuning of operators" ON IF NOT MSVC)
 mxnet_option(USE_GPERFTOOLS "Build with GPerfTools support (if found)" ON)
 mxnet_option(USE_JEMALLOC "Build with Jemalloc support" ON)
@@ -148,14 +148,18 @@ if(USE_VTUNE)
 endif()

 if(USE_MKL_IF_AVAILABLE)
-  if(USE_MKL_EXPERIMENTAL AND NOT USE_MKLML_MKL)
-    message(ERROR " USE_MKL_EXPERIMENTAL can only be used when USE_MKL_EXPERIMENTAL is enabled")
-  endif()
+  if(USE_MKLDNN)
+    add_subdirectory(3rdparty/mkldnn)
+    include_directories(3rdparty/mkldnn/include)
+    list(APPEND mxnet_LINKER_LIBS mkldnn)
+    set(MKL_FOUND TRUE)
+  else()
   find_package(MKL)
+  endif()
   if(MKL_FOUND)
     include_directories(${MKL_INCLUDE_DIR})
     include_directories(${CMAKE_CURRENT_SOURCE_DIR}/src/operator/mkl)
-    add_definitions(-DMXNET_USE_MKL2017=1)
+    add_definitions(-DMXNET_USE_MKLDNN=1)
     add_definitions(-DUSE_MKL=1)
     add_definitions(-DCUB_MKL=1)
     list(APPEND mxnet_LINKER_LIBS ${MKL_LIBRARIES})
@@ -164,11 +168,6 @@ if(USE_MKL_IF_AVAILABLE)
     endif()
     # If using MKL, use the Intel OMP libraries
     list(APPEND mxnet_LINKER_LIBS iomp5)
-    if(USE_MKL_EXPERIMENTAL)
-      add_definitions(-DMKL_EXPERIMENTAL=1)
-    else()
-      add_definitions(-DMKL_EXPERIMENTAL=0)
-    endif()
   else()
     message(STATUS " MKL not found")
   endif()
@@ -248,9 +247,8 @@ if(NOT MSVC AND NOT APPLE)
   set(BEGIN_WHOLE_ARCHIVE -Wl,--whole-archive)
   set(END_WHOLE_ARCHIVE -Wl,--no-whole-archive)
 elseif(CMAKE_CXX_COMPILER_ID MATCHES "Clang")
-  set(BEGIN_WHOLE_ARCHIVE -Wl,-force_load) # force_load loads all symbols of the next library
-  #set(BEGIN_WHOLE_ARCHIVE -Wl,-all_load) # loads all symbols from all libraries
-  #set(END_WHOLE_ARCHIVE -Wl,-noall_load)
+  # using regular Clang or AppleClang
+  set(BEGIN_WHOLE_ARCHIVE -Wl,-force_load)
 endif()

 if(UNIX)
@@ -319,6 +317,9 @@ if(USE_OPENMP)
   add_subdirectory(${CMAKE_CURRENT_SOURCE_DIR}/3rdparty/openmp)
   list(REMOVE_ITEM mxnet_LINKER_LIBS iomp5)
   list(APPEND mxnet_LINKER_LIBS omp)
+  if(UNIX)
+    list(APPEND mxnet_LINKER_LIBS pthread)
+  endif()
   set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${OpenMP_C_FLAGS}")
   set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${OpenMP_CXX_FLAGS}")
 else()
@@ -610,7 +611,18 @@ endif()

 if(USE_OPENCV)
   add_executable(im2rec "tools/im2rec.cc")
-  target_link_libraries(im2rec ${BEGIN_WHOLE_ARCHIVE} mxnet ${END_WHOLE_ARCHIVE} ${mxnet_LINKER_LIBS} ${OpenCV_LIBS} dmlc)
+  if(MSVC)
+    target_link_libraries(im2rec mxnet)
+  else()
+    target_link_libraries(im2rec ${BEGIN_WHOLE_ARCHIVE} mxnet_static ${END_WHOLE_ARCHIVE})
+  endif()
+  target_link_libraries(im2rec
+    ${mxnet_LINKER_LIBS}
+    ${OpenCV_LIBS}
+    dmlc
+    ${nnvm_LINKER_LIBS}
+    ${pslite_LINKER_LIBS}
+    )
 endif()

 target_link_libraries(mxnet PUBLIC dmlc)
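
Reviewer note: the `if(USE_MKL_IF_AVAILABLE)` hunk in CMakeLists.txt changes how `MKL_FOUND` is decided, which is easy to miss in diff form. The Python sketch below restates that control flow only. It is not part of the commit; `resolve_mkl` and its `mkl_installed` parameter are hypothetical names, with `mkl_installed` standing in for whatever `find_package(MKL)` would report. Include directories and `${MKL_LIBRARIES}` are left out for brevity.

```python
def resolve_mkl(use_mkl_if_available, use_mkldnn, mkl_installed):
    """Return (defines, libs) following the new if(USE_MKL_IF_AVAILABLE) branch.

    mkl_installed is a hypothetical stand-in for the result of find_package(MKL).
    """
    defines, libs = [], []
    if not use_mkl_if_available:
        return defines, libs

    if use_mkldnn:
        # The bundled 3rdparty/mkldnn submodule is added and linked, and
        # MKL_FOUND is set to TRUE without probing the system.
        libs.append("mkldnn")
        mkl_found = True
    else:
        mkl_found = mkl_installed

    if mkl_found:
        # As the hunk is written, these definitions are added on either path.
        defines += ["MXNET_USE_MKLDNN=1", "USE_MKL=1", "CUB_MKL=1"]
    return defines, libs


if __name__ == "__main__":
    print(resolve_mkl(True, True, False))   # MKLDNN requested, no system MKL
    print(resolve_mkl(True, False, True))   # MKLDNN off, system MKL present
```

Read this way, `-DMXNET_USE_MKLDNN=1` is defined whenever `MKL_FOUND` is true, including the case where `USE_MKLDNN` is off and a system MKL is found. Whether that is intended is not stated in the commit.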

Jenkinsfile

+31 -30
@@ -24,6 +24,7 @@
 mx_lib = 'lib/libmxnet.so, lib/libmxnet.a, dmlc-core/libdmlc.a, nnvm/lib/libnnvm.a'
 // mxnet cmake libraries, in cmake builds we do not produce a libnvvm static library by default.
 mx_cmake_lib = 'build/libmxnet.so, build/libmxnet.a, build/dmlc-core/libdmlc.a, build/tests/mxnet_unit_tests, build/3rdparty/openmp/runtime/src/libomp.so'
+mx_mkldnn_lib = 'lib/libmxnet.so, lib/libmxnet.a, lib/libiomp5.so, lib/libmklml_gnu.so, lib/libmkldnn.so, lib/libmkldnn.so.0, lib/libmklml_intel.so, dmlc-core/libdmlc.a, nnvm/lib/libnnvm.a'
 // command to start a docker container
 docker_run = 'tests/ci_build/ci_build.sh'
 // timeout in minutes
@@ -160,18 +161,18 @@ def python3_gpu_ut(docker_type) {
 }

 // Python 2
-def python2_mklml_ut(docker_type) {
+def python2_mkldnn_ut(docker_type) {
   timeout(time: max_time, unit: 'MINUTES') {
     sh "${docker_run} ${docker_type} find . -name '*.pyc' -type f -delete"
-    sh "${docker_run} ${docker_type} PYTHONPATH=./python/ nosetests-2.7 --with-timer --verbose tests/python/cpu"
+    sh "${docker_run} ${docker_type} PYTHONPATH=./python/ MXNET_MKLDNN_DEBUG=1 nosetests-2.7 --with-timer --verbose tests/python/cpu"
   }
 }

 // Python 3
-def python3_mklml_ut(docker_type) {
+def python3_mkldnn_ut(docker_type) {
   timeout(time: max_time, unit: 'MINUTES') {
     sh "${docker_run} ${docker_type} find . -name '*.pyc' -type f -delete"
-    sh "${docker_run} ${docker_type} PYTHONPATH=./python/ nosetests-3.4 --with-timer --verbose tests/python/cpu"
+    sh "${docker_run} ${docker_type} PYTHONPATH=./python/ MXNET_MKLDNN_DEBUG=1 nosetests-3.4 --with-timer --verbose tests/python/cpu"
   }
 }

@@ -242,21 +243,20 @@ try {
     }
   }
 },
-'CPU: MKLML': {
+'CPU: MKLDNN': {
   node('mxnetlinux-cpu') {
-    ws('workspace/build-mklml-cpu') {
+    ws('workspace/build-mkldnn-cpu') {
       init_git()
       def flag = """ \
         DEV=1 \
         USE_PROFILER=1 \
         USE_CPP_PACKAGE=1 \
         USE_BLAS=openblas \
-        USE_MKL2017=1 \
-        USE_MKL2017_EXPERIMENTAL=1 \
+        USE_MKLDNN=1 \
         -j\$(nproc)
         """
       make("cpu_mklml", flag)
-      pack_lib('mklml_cpu')
+      pack_lib('mkldnn_cpu', mx_mkldnn_lib)
     }
   }
 },
@@ -267,6 +267,8 @@ try {
 def defines = """ \
   -DUSE_CUDA=1 \
   -DUSE_CUDNN=1 \
+  -DUSE_MKLML_MKL=0 \
+  -DUSE_MKLDNN=0 \
   -DCMAKE_BUILD_TYPE=Release \
   """
 def flag = "-v"
@@ -275,24 +277,23 @@ try {
     }
   }
 },
-'GPU: MKLML': {
+'GPU: MKLDNN': {
   node('mxnetlinux-cpu') {
-    ws('workspace/build-mklml-gpu') {
+    ws('workspace/build-mkldnn-gpu') {
       init_git()
       def flag = """ \
         DEV=1 \
         USE_PROFILER=1 \
         USE_CPP_PACKAGE=1 \
         USE_BLAS=openblas \
-        USE_MKL2017=1 \
-        USE_MKL2017_EXPERIMENTAL=1 \
+        USE_MKLDNN=1 \
         USE_CUDA=1 \
         USE_CUDA_PATH=/usr/local/cuda \
         USE_CUDNN=1 \
         -j\$(nproc)
         """
       make("build_cuda", flag)
-      pack_lib('mklml_gpu')
+      pack_lib('mkldnn_gpu', mx_mkldnn_lib)
     }
   }
 },
@@ -439,43 +440,43 @@ try {
     }
   }
 },
-'Python2: MKLML-CPU': {
+'Python2: MKLDNN-CPU': {
   node('mxnetlinux-cpu') {
-    ws('workspace/ut-python2-mklml-cpu') {
+    ws('workspace/ut-python2-mkldnn-cpu') {
       init_git()
-      unpack_lib('mklml_cpu')
+      unpack_lib('mkldnn_cpu', mx_mkldnn_lib)
       python2_ut('cpu_mklml')
-      python2_mklml_ut('cpu_mklml')
+      python2_mkldnn_ut('cpu_mklml')
     }
   }
 },
-'Python2: MKLML-GPU': {
+'Python2: MKLDNN-GPU': {
   node('mxnetlinux-gpu') {
-    ws('workspace/ut-python2-mklml-gpu') {
+    ws('workspace/ut-python2-mkldnn-gpu') {
       init_git()
-      unpack_lib('mklml_gpu')
+      unpack_lib('mkldnn_gpu', mx_mkldnn_lib)
       python2_gpu_ut('gpu_mklml')
-      python2_mklml_ut('gpu_mklml')
+      python2_mkldnn_ut('gpu_mklml')
     }
   }
 },
-'Python3: MKLML-CPU': {
+'Python3: MKLDNN-CPU': {
   node('mxnetlinux-cpu') {
-    ws('workspace/ut-python3-mklml-cpu') {
+    ws('workspace/ut-python3-mkldnn-cpu') {
       init_git()
-      unpack_lib('mklml_cpu')
+      unpack_lib('mkldnn_cpu', mx_mkldnn_lib)
       python3_ut('cpu_mklml')
-      python3_mklml_ut('cpu_mklml')
+      python3_mkldnn_ut('cpu_mklml')
     }
   }
 },
-'Python3: MKLML-GPU': {
+'Python3: MKLDNN-GPU': {
   node('mxnetlinux-gpu') {
-    ws('workspace/ut-python3-mklml-gpu') {
+    ws('workspace/ut-python3-mkldnn-gpu') {
       init_git()
-      unpack_lib('mklml_gpu')
+      unpack_lib('mkldnn_gpu', mx_mkldnn_lib)
       python3_gpu_ut('gpu_mklml')
-      python3_mklml_ut('gpu_mklml')
+      python3_mkldnn_ut('gpu_mklml')
     }
   }
 },

Makefile

+23 -22
@@ -60,10 +60,17 @@ endif
 include $(config)

 ifeq ($(USE_MKL2017), 1)
-# must run ./prepare_mkl before including mshadow.mk
-  RETURN_STRING := $(shell ./prepare_mkl.sh $(MKLML_ROOT))
-  MKLROOT := $(firstword $(RETURN_STRING))
-  export USE_MKLML = $(lastword $(RETURN_STRING))
+$(warning "USE_MKL2017 is deprecated. We will switch to USE_MKLDNN.")
+  USE_MKLDNN=1
+endif
+
+ifeq ($(USE_MKLDNN), 1)
+  RETURN_STRING := $(shell ./prepare_mkldnn.sh $(MKLDNN_ROOT))
+  LAST_WORD_INDEX := $(words $(RETURN_STRING))
+  # fetch the 2nd last word as MKLDNNROOT
+  MKLDNNROOT := $(word $(shell echo $$(($(LAST_WORD_INDEX) - 1))),$(RETURN_STRING))
+  MKLROOT := $(lastword $(RETURN_STRING))
+  export USE_MKLML = 1
 endif

 include mshadow/make/mshadow.mk
@@ -131,23 +138,16 @@ ifeq ($(USE_NNPACK), 1)
   LDFLAGS += -lnnpack
 endif

-ifeq ($(USE_MKL2017), 1)
-  CFLAGS += -DMXNET_USE_MKL2017=1
+ifeq ($(USE_MKLDNN), 1)
+  CFLAGS += -DMXNET_USE_MKLDNN=1
   CFLAGS += -DUSE_MKL=1
-  CFLAGS += -I$(ROOTDIR)/src/operator/mkl/
-  CFLAGS += -I$(MKLML_ROOT)/include
-  LDFLAGS += -L$(MKLML_ROOT)/lib
-  ifeq ($(USE_MKL2017_EXPERIMENTAL), 1)
-    CFLAGS += -DMKL_EXPERIMENTAL=1
-  else
-    CFLAGS += -DMKL_EXPERIMENTAL=0
-  endif
-  ifeq ($(UNAME_S), Darwin)
-    LDFLAGS += -lmklml
-  else
-    LDFLAGS += -Wl,--as-needed -lmklml_intel -lmklml_gnu
+  CFLAGS += -I$(ROOTDIR)/src/operator/nn/mkldnn/
+  ifneq ($(MKLDNNROOT), $(MKLROOT))
+    CFLAGS += -I$(MKLROOT)/include
+    LDFLAGS += -L$(MKLROOT)/lib
   endif
-  LDFLAGS += -liomp5
+  CFLAGS += -I$(MKLDNNROOT)/include
+  LDFLAGS += -L$(MKLDNNROOT)/lib -lmkldnn -Wl,-rpath,'$${ORIGIN}'
 endif

 ifeq ($(USE_OPERATOR_TUNING), 1)
@@ -161,7 +161,7 @@ endif
 # - for Ubuntu, installing atlas will not automatically install the atlas provided lapack library
 # silently switching lapack off instead of letting the build fail because of backward compatibility
 ifeq ($(USE_LAPACK), 1)
-  ifeq ($(USE_BLAS),$(filter $(USE_BLAS),blas openblas atlas))
+  ifeq ($(USE_BLAS),$(filter $(USE_BLAS),blas openblas atlas mkl))
     ifeq (,$(wildcard /lib/liblapack.a))
       ifeq (,$(wildcard /usr/lib/liblapack.a))
         ifeq (,$(wildcard /usr/lib64/liblapack.a))
@@ -179,7 +179,7 @@ ifeq ($(USE_LAPACK), 1)
   ifneq ($(USE_LAPACK_PATH), )
     LDFLAGS += -L$(USE_LAPACK_PATH)
   endif
-  ifeq ($(USE_BLAS),$(filter $(USE_BLAS),blas openblas atlas))
+  ifeq ($(USE_BLAS),$(filter $(USE_BLAS),blas openblas atlas mkl))
     LDFLAGS += -llapack
   endif
   CFLAGS += -DMXNET_USE_LAPACK
@@ -569,7 +569,8 @@ clean: cyclean $(EXTRA_PACKAGES_CLEAN)
 else
 clean: cyclean testclean $(EXTRA_PACKAGES_CLEAN)
 	$(RM) -r build lib bin *~ */*~ */*/*~ */*/*/*~ R-package/NAMESPACE R-package/man R-package/R/mxnet_generated.R \
-		R-package/inst R-package/src/image_recordio.h R-package/src/*.o R-package/src/*.so mxnet_*.tar.gz
+		R-package/inst R-package/src/image_recordio.h R-package/src/*.o R-package/src/*.so mxnet_*.tar.gz \
+		3rdparty/mkldnn/install/*
 	cd $(DMLC_CORE); $(MAKE) clean; cd -
 	cd $(PS_PATH); $(MAKE) clean; cd -
 	cd $(NNVM_PATH); $(MAKE) clean; cd -
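
Reviewer note: the Makefile now takes two paths from the output of `./prepare_mkldnn.sh`. The second-to-last word becomes `MKLDNNROOT` and the last word becomes `MKLROOT`, and the MKL include and library paths are only added when the two differ. The Python sketch below mirrors that logic so it can be checked in isolation. It is an illustration, not code from the commit: the function names and the sample paths are hypothetical, and the script's actual output format is assumed only to be whitespace-separated words.

```python
def split_roots(return_string):
    """Mimic $(word N-1,...) and $(lastword ...) on whitespace-separated output."""
    words = return_string.split()
    if len(words) < 2:
        # GNU make's $(word 0,...) is an error, so reject this case too.
        raise ValueError("expected at least two words in the script output")
    return words[-2], words[-1]  # (MKLDNNROOT, MKLROOT)


def mkldnn_flags(root_dir, mkldnn_root, mkl_root):
    """Assemble CFLAGS and LDFLAGS in the order of the ifeq ($(USE_MKLDNN), 1) block."""
    cflags = [
        "-DMXNET_USE_MKLDNN=1",
        "-DUSE_MKL=1",
        f"-I{root_dir}/src/operator/nn/mkldnn/",
    ]
    ldflags = []
    if mkldnn_root != mkl_root:
        # MKL lives in a separate prefix, so its paths are added as well.
        cflags.append(f"-I{mkl_root}/include")
        ldflags.append(f"-L{mkl_root}/lib")
    cflags.append(f"-I{mkldnn_root}/include")
    # '$${ORIGIN}' in the Makefile reaches the linker as the literal ${ORIGIN}.
    ldflags += [f"-L{mkldnn_root}/lib", "-lmkldnn", "-Wl,-rpath,${ORIGIN}"]
    return cflags, ldflags


if __name__ == "__main__":
    # Hypothetical script output: any leading words are ignored.
    mkldnn_root, mkl_root = split_roots("some log text /opt/mkldnn /opt/mklml")
    print(mkldnn_flags("/src/mxnet", mkldnn_root, mkl_root))
```

The `${ORIGIN}` rpath lets `libmxnet.so` locate `libmkldnn.so` in its own directory at load time, which matches the Jenkinsfile packing the MKLDNN libraries under `lib/`.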

amalgamation/mxnet_predict0.cc

+1 -1
@@ -66,7 +66,7 @@
 #include "src/operator/operator_util.cc"
 #include "src/operator/nn/activation.cc"
 #include "src/operator/nn/batch_norm.cc"
-#include "src/operator/concat.cc"
+#include "src/operator/nn/concat.cc"
 #include "src/operator/nn/convolution.cc"
 #include "src/operator/nn/deconvolution.cc"
 #include "src/operator/nn/dropout.cc"

cmake/ChooseBlas.cmake

+2 -2
@@ -23,7 +23,7 @@ if(USE_MKL_IF_AVAILABLE)
   find_package(MKL)
 endif()
 if(MKL_FOUND)
-  if(USE_MKLML_MKL)
+  if(USE_MKLDNN)
     set(BLAS "open")
   else()
     set(BLAS "MKL")
@@ -55,4 +55,4 @@ elseif(BLAS STREQUAL "apple")
   list(APPEND mshadow_LINKER_LIBS ${Accelerate_LIBRARIES})
   add_definitions(-DMSHADOW_USE_MKL=0)
   add_definitions(-DMSHADOW_USE_CBLAS=1)
-endif()
+endif()
