Colcon build failure of TensorRT using python3.8 and tensorrt-10.1.0.27 #7944

Closed
3 tasks done
annb3 opened this issue Jul 10, 2024 · 18 comments
Labels
component:perception Advanced sensor data processing and environment understanding. (auto-assigned)

Comments

@annb3

annb3 commented Jul 10, 2024

Checklist

  • I've read the contribution guidelines.
  • I've searched other issues and no duplicate issues were found.
  • I'm convinced that this is not my fault but a bug.

Description

Running colcon build on the Autoware workspace (ROS 2 Humble) inside the Docker environment fails with errors in the lidar_transfusion package.

Expected behavior

--

Actual behavior

Running colcon build on the Autoware workspace (ROS 2 Humble) inside the Docker environment gives me these errors:

/home/user/autoware/src/universe/autoware.universe/perception/lidar_transfusion/lib/transfusion_trt.cpp: In member function ‘bool lidar_transfusion::TransfusionTRT::preprocess(const PointCloud2&, const tf2_ros::Buffer&)’:
/home/user/autoware/src/universe/autoware.universe/perception/lidar_transfusion/lib/transfusion_trt.cpp:158:30: error: ‘class nvinfer1::IExecutionContext’ has no member named ‘setTensorAddress’
  158 |   network_trt_ptr_->context->setTensorAddress(
      |                              ^~~~~~~~~~~~~~~~
/home/user/autoware/src/universe/autoware.universe/perception/lidar_transfusion/lib/transfusion_trt.cpp:160:30: error: ‘class nvinfer1::IExecutionContext’ has no member named ‘setInputShape’; did you mean ‘setInputShapeBinding’?
  160 |   network_trt_ptr_->context->setInputShape(
      |                              ^~~~~~~~~~~~~
      |                              setInputShapeBinding
/home/user/autoware/src/universe/autoware.universe/perception/lidar_transfusion/lib/transfusion_trt.cpp:165:30: error: ‘class nvinfer1::IExecutionContext’ has no member named ‘setTensorAddress’
  165 |   network_trt_ptr_->context->setTensorAddress(
      |                              ^~~~~~~~~~~~~~~~
/home/user/autoware/src/universe/autoware.universe/perception/lidar_transfusion/lib/transfusion_trt.cpp:167:30: error: ‘class nvinfer1::IExecutionContext’ has no member named ‘setInputShape’; did you mean ‘setInputShapeBinding’?
  167 |   network_trt_ptr_->context->setInputShape(
      |                              ^~~~~~~~~~~~~
      |                              setInputShapeBinding
/home/user/autoware/src/universe/autoware.universe/perception/lidar_transfusion/lib/transfusion_trt.cpp:170:30: error: ‘class nvinfer1::IExecutionContext’ has no member named ‘setTensorAddress’
  170 |   network_trt_ptr_->context->setTensorAddress(
      |                              ^~~~~~~~~~~~~~~~
/home/user/autoware/src/universe/autoware.universe/perception/lidar_transfusion/lib/transfusion_trt.cpp:172:30: error: ‘class nvinfer1::IExecutionContext’ has no member named ‘setInputShape’; did you mean ‘setInputShapeBinding’?
  172 |   network_trt_ptr_->context->setInputShape(
      |                              ^~~~~~~~~~~~~
      |                              setInputShapeBinding
/home/user/autoware/src/universe/autoware.universe/perception/lidar_transfusion/lib/transfusion_trt.cpp:176:30: error: ‘class nvinfer1::IExecutionContext’ has no member named ‘setTensorAddress’
  176 |   network_trt_ptr_->context->setTensorAddress(
      |                              ^~~~~~~~~~~~~~~~
/home/user/autoware/src/universe/autoware.universe/perception/lidar_transfusion/lib/transfusion_trt.cpp:178:30: error: ‘class nvinfer1::IExecutionContext’ has no member named ‘setTensorAddress’
  178 |   network_trt_ptr_->context->setTensorAddress(
      |                              ^~~~~~~~~~~~~~~~
/home/user/autoware/src/universe/autoware.universe/perception/lidar_transfusion/lib/transfusion_trt.cpp:180:30: error: ‘class nvinfer1::IExecutionContext’ has no member named ‘setTensorAddress’
  180 |   network_trt_ptr_->context->setTensorAddress(
      |                              ^~~~~~~~~~~~~~~~
/home/user/autoware/src/universe/autoware.universe/perception/lidar_transfusion/lib/transfusion_trt.cpp: In member function ‘bool lidar_transfusion::TransfusionTRT::inference()’:
/home/user/autoware/src/universe/autoware.universe/perception/lidar_transfusion/lib/transfusion_trt.cpp:187:44: error: ‘class nvinfer1::IExecutionContext’ has no member named ‘enqueueV3’; did you mean ‘enqueueV2’?
  187 |   auto status = network_trt_ptr_->context->enqueueV3(stream_);
      |                                            ^~~~~~~~~
      |                                            enqueueV2
gmake[2]: *** [CMakeFiles/transfusion_lib.dir/build.make:160: CMakeFiles/transfusion_lib.dir/lib/transfusion_trt.cpp.o] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:143: CMakeFiles/transfusion_lib.dir/all] Error 2
gmake: *** [Makefile:146: all] Error 2
---
Failed   <<< lidar_transfusion [1min 0s, exited with code 2]
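
The missing members (setTensorAddress, setInputShape, enqueueV3), together with the compiler's suggestions (setInputShapeBinding, enqueueV2), hint that the TensorRT headers seen by the compiler are older than the API the package was written against. One way to check which headers are installed is to read the version macros from NvInferVersion.h. This is a minimal sketch and not something taken from the thread: the function name trt_header_version is hypothetical, the header path in the usage comment is an assumption for a Debian-style x86_64 install, and the parsing assumes the macros are defined as plain integer literals.

```shell
# Print MAJOR.MINOR.PATCH from the contents of NvInferVersion.h read on stdin.
# Assumes lines of the form "#define NV_TENSORRT_MAJOR 8 ..." with a literal number.
trt_header_version() {
  awk '
    $1 == "#define" && $2 == "NV_TENSORRT_MAJOR" { major = $3 }
    $1 == "#define" && $2 == "NV_TENSORRT_MINOR" { minor = $3 }
    $1 == "#define" && $2 == "NV_TENSORRT_PATCH" { patch = $3 }
    END { if (major != "") print major "." minor "." patch }
  '
}

# Usage (the path is an assumption; adjust it to where the header actually lives):
#   trt_header_version < /usr/include/x86_64-linux-gnu/NvInferVersion.h
```

Running it in the same shell that runs colcon build shows the version the compiler will see, which may differ from the version installed on the host.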

Steps to reproduce

--

Versions

docker environment

TensorRT Version:
10.1.0.27

NVIDIA GPU:
NVIDIA RTX A2000 Laptop GPU

NVIDIA Driver Version:
555.42.06

CUDA Version:
12.5

CUDNN Version:
cuda_12.2.r12.2

Operating System:
Linux - Ubuntu 20.04 LTS

Python Version:
3.8

Possible causes

No response

Additional context

No response

@amadeuszsz amadeuszsz self-assigned this Jul 10, 2024
@amadeuszsz amadeuszsz added the component:perception Advanced sensor data processing and environment understanding. (auto-assigned) label Jul 10, 2024
@amadeuszsz
Contributor

amadeuszsz commented Jul 10, 2024

Hi @annb3
Thank you for your report.
Could you please show me the output of dpkg -l | grep nvinfer? Please make sure you run this command in the same terminal window where you build your workspace.
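
For reference, the version can also be pulled out of that output mechanically. A minimal sketch, assuming the runtime package is named libnvinfer followed by its major version (for example libnvinfer8) and that the version column looks like 8.4.2-1+cuda11.6; the function name nvinfer_version is hypothetical:

```shell
# Read `dpkg -l` output on stdin and print the upstream version of the first
# installed libnvinfer runtime package (e.g. "8.4.2" from "8.4.2-1+cuda11.6").
nvinfer_version() {
  awk '$1 == "ii" && ($2 ~ /^libnvinfer[0-9]+$/ || $2 ~ /^libnvinfer[0-9]+:/) {
         sub(/-.*/, "", $3)   # drop the Debian revision and CUDA suffix
         print $3
         exit
       }'
}

# Usage, in the terminal where the workspace is built:
#   dpkg -l | nvinfer_version
```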

@annb3
Author

annb3 commented Jul 10, 2024

Screenshot from 2024-07-10 17-20-04
It's here :) sorry for the delay @amadeuszsz

@amadeuszsz
Contributor

@annb3
You are trying to build the workspace outside of a Docker container. To use Docker, please follow the Docker installation instructions.
If you would like to use Autoware without Docker images, please follow the source installation instructions. In that case, not only is your TensorRT outdated, but you will also need to upgrade to Ubuntu 22.04.

@annb3
Author

annb3 commented Jul 11, 2024

No, it is inside :) This is my Docker environment. I already followed https://autowarefoundation.github.io/autoware-documentation/main/installation/autoware/docker-installation/. Everything had worked for years until last month. I don't know exactly why.

@amadeuszsz
Contributor

No, it is inside :) This is my Docker environment. I already followed https://autowarefoundation.github.io/autoware-documentation/main/installation/autoware/docker-installation/. Everything had worked for years until last month. I don't know exactly why.

@annb3
Ok, then:

  1. To make sure, please run docker container ps in the same terminal where you build the workspace, and show the output.
  2. If an error occurs, run docker container ps in a new terminal window (outside Docker) and show the output.
  3. Which command do you use to run the Docker image? Autoware recently dropped rocker support and now uses new Docker images. Please update your image and show the output of docker images | grep autoware.
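
As a complementary check to step 1, a shell can be tested for whether it is running inside a container at all. This is a heuristic sketch, not something suggested in the thread: it relies on the /.dockerenv marker file that Docker normally creates in a container's root, and the function name inside_docker is hypothetical. The optional argument exists only so the check can be tried against another root directory.

```shell
# Succeeds when the given root directory (default: /) contains Docker's
# marker file. This is a heuristic, not a guarantee.
inside_docker() {
  [ -e "${1:-}/.dockerenv" ]
}

if inside_docker; then
  echo "this shell appears to run inside a Docker container"
else
  echo "this shell appears to run on the host"
fi
```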

@annb3
Author

annb3 commented Jul 15, 2024

@amadeuszsz here are the outputs you requested :) Thanks for the support.
1.
Screenshot from 2024-07-15 10-19-23
2.
Screenshot from 2024-07-15 10-19-32
3. rocker --nvidia --x11 --network host --devices /dev/vid* --user --volume $HOME/techdemo-autoware_b/autoware --volume $HOME/autoware_map --volume $HOME/autoware_data --volume /dev/shm/:/dev/shm -- ghcr.io/autowarefoundation/autoware-universe:latest-cuda
Screenshot from 2024-07-15 10-21-46

@amadeuszsz
Contributor

@annb3
You are updating autoware.universe while still using an old Autoware Docker image. If you want to use recent changes, you have to update both the Autoware repository and the Docker image. Please update your Autoware repository and go through the Docker installation tutorial.
FYI, the problem is that your current image contains TensorRT 8.4.2, while lidar_transfusion uses TensorRT 8.6.1, which is already included in the current Autoware Docker image.
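
The mismatch described here comes down to a dotted-version comparison. A minimal sketch, assuming GNU sort with the -V option is available (it is part of coreutils on Ubuntu); version_ge is a hypothetical helper, and 8.6.1 is used as the threshold only because it is the version named in this comment:

```shell
# Succeeds when version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="8.6.1"   # version lidar_transfusion is said to use in this thread
installed="8.4.2"  # version reported for the old image

if version_ge "$installed" "$required"; then
  echo "TensorRT $installed is new enough"
else
  echo "TensorRT $installed is older than $required; update the image"
fi
```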

@vividf
Contributor

vividf commented Jul 23, 2024

@annb3
May I kindly inquire if @amadeuszsz's suggestion resolved the issue?
Thanks.

@annb3
Author

annb3 commented Jul 23, 2024

@vividf
Unfortunately not.
Upgrading the Docker container gives me lots of problems and errors. I also have problems downloading the new one from scratch.
So it would probably solve the issue, but I am unable to try it and I don't know exactly why. Is the new Docker environment perhaps only for 22.04 LTS? Or am I missing something?

@annb3
Author

annb3 commented Jul 23, 2024

@vividf
It seems that the new installation has completed, but apparently nothing has changed: still the same problem.
Screenshot from 2024-07-23 13-55-09

@amadeuszsz
Contributor

@annb3
What is your problem after the image update? Please follow this tutorial and share the error here, along with the command that causes it.

@amadeuszsz
Contributor

amadeuszsz commented Jul 23, 2024

@amadeuszsz Exactly the same as before, nothing changes during the building: colcon build --symlink-install --cmake-args -DCMAKE_BUILD_TYPE=Release Screenshot from 2024-07-23 16-04-00

@annb3
What command did you use to run the Docker container? From now on you need to use ./docker/run.sh --devel. Then you can check the TensorRT version as before and run Autoware using the prebuilt workspace. Once you have confirmed that, you can proceed to building from source.

@annb3
Author

annb3 commented Jul 23, 2024

@amadeuszsz
Sorry, I posted the output from the wrong terminal, which is why I deleted the comment.
Give me some more time and I'll give you the correct feedback :)

@annb3
Author

annb3 commented Jul 23, 2024

@amadeuszsz Exactly the same as before, nothing changes during the building: colcon build --symlink-install --cmake-args -DCMAKE_BUILD_TYPE=Release Screenshot from 2024-07-23 16-04-00

@annb3 What command did you use to run the Docker container? From now on you need to use ./docker/run.sh --devel. Then you can check the TensorRT version as before and run Autoware using the prebuilt workspace. Once you have confirmed that, you can proceed to building from source.

Screenshot from 2024-07-23 17-00-56
I have this problem launching ./docker/run.sh --devel

@amadeuszsz
Contributor

amadeuszsz commented Jul 24, 2024

Screenshot from 2024-07-23 17-00-56 I have this problem launching ./docker/run.sh --devel

@annb3

Please check the command provided in the instructions linked earlier for setting up the development environment (the setup-dev-env.sh script), or pull the Docker image directly via docker pull ghcr.io/autowarefoundation/autoware:latest-devel-cuda
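
The direct pull can be wrapped so that it only runs when the image is missing locally. A minimal sketch, assuming the docker CLI is on the PATH; image_present and ensure_image are hypothetical helper names, and only the image tag comes from this thread:

```shell
# Succeeds when the image is already present in the local Docker image store.
image_present() {
  docker image inspect "$1" >/dev/null 2>&1
}

# Pull the image only when it is missing, and report what happened.
ensure_image() {
  if image_present "$1"; then
    echo "present: $1"
  else
    docker pull "$1" && echo "pulled: $1"
  fi
}

# Usage:
#   ensure_image ghcr.io/autowarefoundation/autoware:latest-devel-cuda
```

A "present" result only says that some image with that tag exists locally; it does not show whether that copy is up to date.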

@annb3
Author

annb3 commented Jul 30, 2024

with setup-dev-env.sh I get:
fatal: [localhost]: FAILED! => {"changed": false, "msg": "Only Ubuntu 22.04 is supported for this branch. Please refer to https://autowarefoundation.github.io/autoware-documentation/main/installation/autoware/source-installation/."}

Alternatively, I have already run docker pull ghcr.io/autowarefoundation/autoware:latest-devel-cuda. Honestly, I don't know.
Can I try something else @amadeuszsz ?

@amadeuszsz
Contributor

@annb3

with setup-dev-env.sh I get: fatal: [localhost]: FAILED!
Did you follow the instructions I linked for you? Please share the exact command you use to trigger the environment setup.

Alternatively, I have already run docker pull ghcr.io/autowarefoundation/autoware:latest-devel-cuda. Honestly, I don't know.

This command was a shortcut for you. If you are not sure whether the pull succeeded, you can always compare against the sample output in the Docker documentation.

Can I try something else @amadeuszsz ?

I don't think so. As you have already tried, you can:

  • Build from source. Since you use Ubuntu 20.04, this is not possible on the current branch. You could switch to the galactic branch, but please keep in mind that you would miss out on new Autoware features. Upgrading to Ubuntu 22.04 would help here.
  • Use Docker containers. In that case you need to run the container as described in the tutorial. You are almost there!

If anything in the documentation is not clear to you, please let us know. We want to make our instructions as easy as possible for the community to follow!

@annb3
Author

annb3 commented Aug 6, 2024

As I was unable to do anything more, I solved the issue by upgrading to 22.04 LTS and building from source.

Thanks a lot for your support @amadeuszsz !

@annb3 annb3 closed this as completed Aug 6, 2024