Skip to content

Commit e5c94b2

Browse files
knzo25mojomexmanatopre-commit-ci[bot]amadeuszsz
authored
feat(autoware_cuda_pointcloud_preprocessor): a cuda-accelerated pointcloud preprocessor (autowarefoundation#9454)
* feat: moved the cuda pointcloud preprocessor and organized from a personal repository Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: fixed incorrect links Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: fixed dead links pt2 Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: fixed spelling errors Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: json schema fixes Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: removed comments and filled the fields Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * fix: fixed the adapter for the case when the number of points in the pointcloud changes after the first iteration Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * feat: used the cuda host allocators for aster host to device copies Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * Update sensing/autoware_cuda_pointcloud_preprocessor/docs/cuda-pointcloud-preprocessor.md Co-authored-by: Max Schmeller <6088931+mojomex@users.noreply.github.com> * Update sensing/autoware_cuda_pointcloud_preprocessor/src/cuda_pointcloud_preprocessor/cuda_pointcloud_preprocessor.cu Co-authored-by: Manato Hirabayashi <3022416+manato@users.noreply.github.com> * Update sensing/autoware_cuda_pointcloud_preprocessor/src/cuda_pointcloud_preprocessor/cuda_pointcloud_preprocessor.cu Co-authored-by: Manato Hirabayashi <3022416+manato@users.noreply.github.com> * style(pre-commit): autofix * Update sensing/autoware_cuda_pointcloud_preprocessor/docs/cuda-pointcloud-preprocessor.md Co-authored-by: Max Schmeller <6088931+mojomex@users.noreply.github.com> * Update sensing/autoware_cuda_pointcloud_preprocessor/README.md Co-authored-by: Max Schmeller <6088931+mojomex@users.noreply.github.com> * Update sensing/autoware_cuda_pointcloud_preprocessor/README.md Co-authored-by: Max Schmeller <6088931+mojomex@users.noreply.github.com> * Update sensing/autoware_cuda_pointcloud_preprocessor/src/cuda_pointcloud_preprocessor/cuda_pointcloud_preprocessor.cu Co-authored-by: Max Schmeller <6088931+mojomex@users.noreply.github.com> * style(pre-commit): autofix * Update sensing/autoware_cuda_pointcloud_preprocessor/src/cuda_pointcloud_preprocessor/cuda_pointcloud_preprocessor.cu Co-authored-by: Manato Hirabayashi <3022416+manato@users.noreply.github.com> * style(pre-commit): autofix * Update sensing/autoware_cuda_pointcloud_preprocessor/src/cuda_pointcloud_preprocessor/cuda_pointcloud_preprocessor.cu Co-authored-by: Manato Hirabayashi <3022416+manato@users.noreply.github.com> * style(pre-commit): autofix * chore: fixed code compilation to reflect Hirabayashi-san's memory pool proposal Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * feat: generalized the number of crop boxes. For two at least, the new approach is actually faster Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: updated config, schema, and handled the null case in a specialized way Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * feat: moving the pointcloud organization into gpu Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * feat: reimplemented the organized pointcloud adapter in cuda. the only bottleneck is the H->D copy Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: removed redundant ternay operator Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: added a temporary memory check. the check will be unified in a later PR Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: refactored the structure to avoid large files Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: updated the copyright year Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * fix: fixed a bug in the undistortion kernel setup. validated it comparing it with the baseline Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: removed unused packages Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: removed mentions of the removed adapter Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: fixed missing autoware prefix Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * fix: missing assignment in else branch Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: added cuda/nvcc debug flags on debug builds Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: refactored parameters for the undistortion settings Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: removed unused headers Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: changed default crop box to no filtering at all Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * feat: added missing restrict keyword Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: spells Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: removed default destructor Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: ocd activated (spelling) Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: fixed the schema Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: improved readibility Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: added dummy crop box Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: added new repositories to ansible Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: CI/CD Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: more CI/CD Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: mode CI/CD. some linters are conflicting Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * style(pre-commit): autofix * chore: ignoring uncrustify Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: ignoring more uncrustify Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: missed one more uncrustify exception Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> * chore: added meta dep Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> --------- Signed-off-by: Kenzo Lobos-Tsunekawa <kenzo.lobos@tier4.jp> Co-authored-by: Max Schmeller <6088931+mojomex@users.noreply.github.com> Co-authored-by: Manato Hirabayashi <3022416+manato@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Amadeusz Szymko <amadeusz.szymko.2@tier4.jp>
1 parent ea448a9 commit e5c94b2

23 files changed

+2391
-0
lines changed

build_depends_humble.repos

+8
Original file line numberDiff line numberDiff line change
@@ -46,3 +46,11 @@ repositories:
4646
type: git
4747
url: https://github.com/tier4/glog.git
4848
version: v0.6.0_t4-ros
49+
universe/external/cuda_blackboard:
50+
type: git
51+
url: https://github.com/autowarefoundation/cuda_blackboard.git
52+
version: v0.1.1
53+
universe/external/negotiated:
54+
type: git
55+
url: https://github.com/osrf/negotiated.git
56+
version: eac198b55dcd052af5988f0f174902913c5f20e7
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,97 @@
1+
cmake_minimum_required(VERSION 3.14)
2+
project(autoware_cuda_pointcloud_preprocessor)
3+
4+
find_package(ament_cmake_auto REQUIRED)
5+
find_package(CUDA)
6+
7+
if(NOT ${CUDA_FOUND})
8+
message(WARNING "cuda was not found, so the autoware_cuda_pointcloud_preprocessor package will not be built.")
9+
return()
10+
elseif(CMAKE_BUILD_TYPE STREQUAL "Debug")
11+
set(CMAKE_CUDA_FLAGS ${CMAKE_CUDA_FLAGS} "-g -G")
12+
set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS} "-g -G")
13+
endif()
14+
15+
ament_auto_find_build_dependencies()
16+
17+
# Default to C++17
18+
if(NOT CMAKE_CXX_STANDARD)
19+
set(CMAKE_CXX_STANDARD 17)
20+
endif()
21+
22+
if(CMAKE_COMPILER_IS_GNUCXX OR CMAKE_CXX_COMPILER_ID MATCHES "Clang")
23+
add_compile_options(-Wall -Wextra -Wpedantic -Wunused-function)
24+
endif()
25+
26+
if(BUILD_TESTING)
27+
find_package(ament_lint_auto REQUIRED)
28+
ament_lint_auto_find_test_dependencies()
29+
endif()
30+
31+
include_directories(
32+
include
33+
SYSTEM
34+
${CUDA_INCLUDE_DIRS}
35+
)
36+
37+
list(APPEND CUDA_NVCC_FLAGS "--expt-relaxed-constexpr -diag-suppress 20012") # cSpell: ignore expt
38+
39+
cuda_add_library(cuda_pointcloud_preprocessor_lib SHARED
40+
src/cuda_pointcloud_preprocessor/cuda_pointcloud_preprocessor.cu
41+
src/cuda_pointcloud_preprocessor/common_kernels.cu
42+
src/cuda_pointcloud_preprocessor/organize_kernels.cu
43+
src/cuda_pointcloud_preprocessor/outlier_kernels.cu
44+
src/cuda_pointcloud_preprocessor/undistort_kernels.cu
45+
)
46+
47+
target_link_libraries(cuda_pointcloud_preprocessor_lib
48+
${autoware_pointcloud_preprocessor_TARGETS}
49+
)
50+
51+
target_include_directories(cuda_pointcloud_preprocessor_lib SYSTEM PRIVATE
52+
${autoware_pointcloud_preprocessor_INCLUDE_DIRS}
53+
${autoware_point_types_INCLUDE_DIRS}
54+
${cuda_blackboard_INCLUDE_DIRS}
55+
${diagnostic_msgs_INCLUDE_DIRS}
56+
${geometry_msgs_INCLUDE_DIRS}
57+
${rclcpp_INCLUDE_DIRS}
58+
${rclcpp_components_INCLUDE_DIRS}
59+
${rcl_interfaces_INCLUDE_DIRS}
60+
${sensor_msgs_INCLUDE_DIRS}
61+
${tf2_INCLUDE_DIRS}
62+
${tf2_msgs_INCLUDE_DIRS}
63+
)
64+
65+
66+
# Targets
67+
ament_auto_add_library(cuda_pointcloud_preprocessor SHARED
68+
src/cuda_pointcloud_preprocessor/cuda_pointcloud_preprocessor_node.cpp
69+
)
70+
71+
target_link_libraries(cuda_pointcloud_preprocessor
72+
${CUDA_LIBRARIES}
73+
cuda_pointcloud_preprocessor_lib
74+
)
75+
76+
rclcpp_components_register_node(cuda_pointcloud_preprocessor
77+
PLUGIN "autoware::cuda_pointcloud_preprocessor::CudaPointcloudPreprocessorNode"
78+
EXECUTABLE cuda_pointcloud_preprocessor_node
79+
)
80+
81+
install(DIRECTORY launch config
82+
DESTINATION share/${PROJECT_NAME}
83+
)
84+
85+
ament_auto_package()
86+
87+
# Set ROS_DISTRO macros
88+
set(ROS_DISTRO $ENV{ROS_DISTRO})
89+
if(${ROS_DISTRO} STREQUAL "rolling")
90+
add_compile_definitions(ROS_DISTRO_ROLLING)
91+
elseif(${ROS_DISTRO} STREQUAL "foxy")
92+
add_compile_definitions(ROS_DISTRO_FOXY)
93+
elseif(${ROS_DISTRO} STREQUAL "galactic")
94+
add_compile_definitions(ROS_DISTRO_GALACTIC)
95+
elseif(${ROS_DISTRO} STREQUAL "humble")
96+
add_compile_definitions(ROS_DISTRO_HUMBLE)
97+
endif()
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,19 @@
1+
# autoware_cuda_pointcloud_preprocessor
2+
3+
## Purpose
4+
5+
The pointcloud preprocessing implemented in `autoware_pointcloud_preprocessor` has been thoroughly tested in autoware. However, the latency it introduces does not scale well with modern LiDAR devices due to the high number of points they introduce.
6+
7+
To alleviate this issue, this package reimplements most of the pipeline presented in `autoware_pointcloud_preprocessor` leveraging the use of GPGPUs. In particular, this package makes use of CUDA to provide accelerated versions of the already established implementations, while also maintaining compatibility with normal ROS nodes/topics. <!-- cSpell: ignore GPGPUs >
8+
9+
## Inner-workings / Algorithms
10+
11+
A detailed description of each filter's algorithm is available in the following links.
12+
13+
| Filter Name | Description | Detail |
14+
| ---------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------- |
15+
| cuda_pointcloud_preprocessor | Implements the cropping, distortion correction, and outlier filtering (ring-based) of the `autoware_pointcloud_preprocessor`'s CPU versions. | [link](docs/cuda-pointcloud-preprocessor.md) |
16+
17+
## (Optional) Future extensions / Unimplemented parts
18+
19+
The subsample filters implemented in `autoware_pointcloud_preprocessor` will have similar counterparts in this package.
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,16 @@
1+
/**:
2+
ros__parameters:
3+
4+
base_frame: base_link
5+
use_imu: true
6+
use_3d_distortion_correction: false
7+
distance_ratio: 1.03
8+
object_length_threshold: 0.1
9+
num_points_threshold: 4
10+
11+
crop_box.min_x: [0.0]
12+
crop_box.min_y: [0.0]
13+
crop_box.min_z: [0.0]
14+
crop_box.max_x: [0.0]
15+
crop_box.max_y: [0.0]
16+
crop_box.max_z: [0.0]
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,45 @@
1+
# cuda_pointcloud_preprocessor
2+
3+
## Purpose
4+
5+
This node implements all standard pointcloud preprocessing algorithms applied to a single LiDAR's pointcloud in CUDA.
6+
In particular, this node implements:
7+
8+
- box cropping (ego-vehicle and ego-vehicle's mirrors)
9+
- distortion correction
10+
- ring-based outlier filtering
11+
12+
## Inner-workings / Algorithms
13+
14+
As this node reimplements the functionalities of the CPU-version algorithms, please have a look at the documentations of [crop-box](../../autoware_pointcloud_preprocessor/docs/crop-box-filter.md), [distortion correction](../../autoware_pointcloud_preprocessor/docs/distortion-corrector.md), and [ring-based outlier filter](../../autoware_pointcloud_preprocessor/docs/ring-outlier-filter.md) for more information about these algorithms.
15+
16+
In addition to the individual algorithms previously mentioned, this node uses the `cuda_blackboard`, a cuda transport layer that enables a zero-copy mechanism between GPU and GPU memory for both input and output.
17+
18+
## Inputs / Outputs
19+
20+
### Input
21+
22+
| Name | Type | Description |
23+
| ------------------------- | ------------------------------------------------ | ----------------------------------------- |
24+
| `~/input/pointcloud` | `sensor_msgs::msg::PointCloud2` | Input pointcloud's topic. |
25+
| `~/input/pointcloud/cuda` | `negotiated_interfaces/msg/NegotiatedTopicsInfo` | Input pointcloud's type negotiation topic |
26+
| `~/input/twist` | `geometry_msgs::msg::TwistWithCovarianceStamped` | Topic of the twist information. |
27+
| `~/input/imu` | `sensor_msgs::msg::Imu` | Topic of the IMU data. |
28+
29+
### Output
30+
31+
| Name | Type | Description |
32+
| -------------------------- | ------------------------------------------------ | ---------------------------------------- |
33+
| `~/output/pointcloud` | `sensor_msgs::msg::PointCloud2` | Processed pointcloud's topic |
34+
| `~/output/pointcloud/cuda` | `negotiated_interfaces/msg/NegotiatedTopicsInfo` | Processed pointcloud's negotiation topic |
35+
36+
## Parameters
37+
38+
### Core Parameters
39+
40+
{{ json_to_markdown("sensing/autoware_cuda_pointcloud_preprocessor/schema/cuda_pointcloud_preprocessor.schema.schema.json") }}
41+
42+
## Assumptions / Known limits
43+
44+
- The CUDA implementations, while following the original CPU ones, will not offer the same numerical results, and small approximations were needed to maximize GPU usage.
45+
- This node expects that the input pointcloud follows the `autoware::point_types::PointXYZIRCAEDT` layout and the output pointcloud will use the `autoware::point_types::PointXYZIRC` layout defined in the `autoware_point_types` package.
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,45 @@
1+
// Copyright 2025 TIER IV, Inc.
2+
//
3+
// Licensed under the Apache License, Version 2.0 (the "License");
4+
// you may not use this file except in compliance with the License.
5+
// You may obtain a copy of the License at
6+
//
7+
//    http://www.apache.org/licenses/LICENSE-2.0
8+
//
9+
// Unless required by applicable law or agreed to in writing, software
10+
// distributed under the License is distributed on an "AS IS" BASIS,
11+
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12+
// See the License for the specific language governing permissions and
13+
// limitations under the License.
14+
15+
#ifndef AUTOWARE__CUDA_POINTCLOUD_PREPROCESSOR__COMMON_KERNELS_HPP_
16+
#define AUTOWARE__CUDA_POINTCLOUD_PREPROCESSOR__COMMON_KERNELS_HPP_
17+
18+
#include "autoware/cuda_pointcloud_preprocessor/point_types.hpp"
19+
#include "autoware/cuda_pointcloud_preprocessor/types.hpp"
20+
21+
#include <cuda_runtime.h>
22+
23+
namespace autoware::cuda_pointcloud_preprocessor
24+
{
25+
void transformPointsLaunch(
26+
const InputPointType * input_points, InputPointType * output_points, int num_points,
27+
TransformStruct transform, int threads_per_block, int blocks_per_grid, cudaStream_t & stream);
28+
29+
void cropBoxLaunch(
30+
InputPointType * d_points, std::uint32_t * output_mask, int num_points,
31+
const CropBoxParameters * crop_box_parameters_ptr, int num_crop_boxes, int threads_per_block,
32+
int blocks_per_grid, cudaStream_t & stream);
33+
34+
void combineMasksLaunch(
35+
const std::uint32_t * mask1, const std::uint32_t * mask2, int num_points,
36+
std::uint32_t * output_mask, int threads_per_block, int blocks_per_grid, cudaStream_t & stream);
37+
38+
void extractPointsLaunch(
39+
InputPointType * input_points, std::uint32_t * masks, std::uint32_t * indices, int num_points,
40+
OutputPointType * output_points, int threads_per_block, int blocks_per_grid,
41+
cudaStream_t & stream);
42+
43+
} // namespace autoware::cuda_pointcloud_preprocessor
44+
45+
#endif // AUTOWARE__CUDA_POINTCLOUD_PREPROCESSOR__COMMON_KERNELS_HPP_
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,109 @@
1+
// Copyright 2025 TIER IV, Inc.
2+
//
3+
// Licensed under the Apache License, Version 2.0 (the "License");
4+
// you may not use this file except in compliance with the License.
5+
// You may obtain a copy of the License at
6+
//
7+
//    http://www.apache.org/licenses/LICENSE-2.0
8+
//
9+
// Unless required by applicable law or agreed to in writing, software
10+
// distributed under the License is distributed on an "AS IS" BASIS,
11+
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12+
// See the License for the specific language governing permissions and
13+
// limitations under the License.
14+
15+
#ifndef AUTOWARE__CUDA_POINTCLOUD_PREPROCESSOR__CUDA_POINTCLOUD_PREPROCESSOR_HPP_
16+
#define AUTOWARE__CUDA_POINTCLOUD_PREPROCESSOR__CUDA_POINTCLOUD_PREPROCESSOR_HPP_
17+
18+
#include "autoware/cuda_pointcloud_preprocessor/point_types.hpp"
19+
#include "autoware/cuda_pointcloud_preprocessor/types.hpp"
20+
21+
#include <Eigen/Core>
22+
#include <Eigen/Geometry>
23+
#include <cuda_blackboard/cuda_pointcloud2.hpp>
24+
25+
#include <geometry_msgs/msg/transform_stamped.hpp>
26+
#include <geometry_msgs/msg/twist_with_covariance_stamped.hpp>
27+
#include <geometry_msgs/msg/vector3_stamped.hpp>
28+
#include <sensor_msgs/msg/point_cloud2.hpp>
29+
30+
#include <thrust/device_vector.h>
31+
32+
#include <cstdint>
33+
#include <deque>
34+
#include <memory>
35+
#include <vector>
36+
37+
namespace autoware::cuda_pointcloud_preprocessor
38+
{
39+
40+
class CudaPointcloudPreprocessor
41+
{
42+
public:
43+
enum class UndistortionType { Invalid, Undistortion2D, Undistortion3D };
44+
45+
CudaPointcloudPreprocessor();
46+
~CudaPointcloudPreprocessor() = default;
47+
48+
void setCropBoxParameters(const std::vector<CropBoxParameters> & crop_box_parameters);
49+
void setRingOutlierFilterParameters(const RingOutlierFilterParameters & ring_outlier_parameters);
50+
void setUndistortionType(const UndistortionType & undistortion_type);
51+
52+
void preallocateOutput();
53+
54+
std::unique_ptr<cuda_blackboard::CudaPointCloud2> process(
55+
const sensor_msgs::msg::PointCloud2::ConstSharedPtr input_pointcloud_msg_ptr,
56+
const geometry_msgs::msg::TransformStamped & transform_msg,
57+
const std::deque<geometry_msgs::msg::TwistWithCovarianceStamped> & twist_queue,
58+
const std::deque<geometry_msgs::msg::Vector3Stamped> & angular_velocity_queue,
59+
const std::uint32_t first_point_rel_stamp_nsec);
60+
61+
private:
62+
void organizePointcloud();
63+
64+
CropBoxParameters self_crop_box_parameters_{};
65+
CropBoxParameters mirror_crop_box_parameters_{};
66+
RingOutlierFilterParameters ring_outlier_parameters_{};
67+
UndistortionType undistortion_type_{UndistortionType::Invalid};
68+
69+
int num_rings_{};
70+
int max_points_per_ring_{};
71+
std::size_t num_raw_points_{};
72+
std::size_t num_organized_points_{};
73+
74+
std::vector<sensor_msgs::msg::PointField> point_fields_{};
75+
std::unique_ptr<cuda_blackboard::CudaPointCloud2> output_pointcloud_ptr_{};
76+
77+
cudaStream_t stream_;
78+
int max_blocks_per_grid_{};
79+
const int threads_per_block_{256};
80+
cudaMemPool_t device_memory_pool_;
81+
82+
// Organizing buffers
83+
thrust::device_vector<InputPointType> device_input_points_;
84+
thrust::device_vector<InputPointType> device_organized_points_;
85+
thrust::device_vector<std::int32_t> device_ring_index_;
86+
thrust::device_vector<std::uint32_t> device_indexes_tensor_;
87+
thrust::device_vector<std::uint32_t> device_sorted_indexes_tensor_;
88+
thrust::device_vector<std::int32_t> device_segment_offsets_;
89+
thrust::device_vector<std::int32_t> device_max_ring_;
90+
thrust::device_vector<std::int32_t> device_max_points_per_ring_;
91+
92+
thrust::device_vector<std::uint8_t> device_sort_workspace_;
93+
std::size_t sort_workspace_bytes_{0};
94+
95+
// Pointcloud preprocessing buffers
96+
thrust::device_vector<InputPointType> device_transformed_points_{};
97+
thrust::device_vector<OutputPointType> device_output_points_{};
98+
thrust::device_vector<std::uint32_t> device_crop_mask_{};
99+
thrust::device_vector<std::uint32_t> device_ring_outlier_mask_{};
100+
thrust::device_vector<std::uint32_t> device_indices_{};
101+
thrust::device_vector<TwistStruct2D> device_twist_2d_structs_{};
102+
thrust::device_vector<TwistStruct3D> device_twist_3d_structs_{};
103+
thrust::device_vector<CropBoxParameters> host_crop_box_structs_{};
104+
thrust::device_vector<CropBoxParameters> device_crop_box_structs_{};
105+
};
106+
107+
} // namespace autoware::cuda_pointcloud_preprocessor
108+
109+
#endif // AUTOWARE__CUDA_POINTCLOUD_PREPROCESSOR__CUDA_POINTCLOUD_PREPROCESSOR_HPP_

0 commit comments

Comments
 (0)