Context
Currently, the EventTracer family, including ETDumpGen, logs all delegation intermediate outputs during execution. This comprehensive logging can lead to significant storage and performance overhead, especially for large, deep models such as LLaMA. Users often have limited memory and care about only a few specific outputs, which makes logging every intermediate output unnecessary and inefficient.
Target
Goal
The goal of this update is to introduce a mechanism that allows users to specify which outputs should be logged based on their names and delegate_debug_index. By applying filters, users can reduce the amount of logged data, focusing only on the outputs they care about. This will help in managing storage requirements and improving performance.
Specifically:
Universal Application: Every EventTracer should benefit from this update.
Filtering should be a fundamental function for every EventTracer, not just ETDumpGen or other specific implementations.
The filtering API should work on every ET-compatible device; it must not depend on any library beyond those that ET core supports.
Customizable Filtering: Allow users to configure filters to meet specific requirements, enabling selective logging of intermediate outputs based on their names and delegate_debug_index.
Backward Compatibility: Ensure no breaking changes are introduced.
With the filter left at its default argument, the new pipeline should behave exactly like the current pipeline without filters.
Low Overhead: Minimize the overhead of switching between different filters.
The filtering should be configurable and occur during runtime to minimize export and lowering overhead.
Non-Goal
The following items, although reasonable, are not the goals of this project:
Filtering debug data blobs other than delegation intermediate output.
Filtering debug data blobs based on attributes other than data’s name and delegate_debug_index, like their size, content, dtype, etc.
Having a concrete and uniform filter implementation for all use cases.
Proposals
Filter Interface Design
We propose a new class called EventTracerFilterBase to serve as the base class for all filters, containing essential functions for filtering in the EventTracer class. Users should extend this class to create their own filters.
class EventTracerFilterBase {
 public:
  // Returns true if the given name and delegate_debug_index match the filter,
  // and false otherwise. Returns an error if anything goes wrong.
  virtual Result<bool> filter(
      char* name,
      DebugHandle delegate_debug_index);
  virtual ~EventTracerFilterBase();
};
Benefits of this interface include:
No extra dependencies are needed.
Users have full control over filtering logic and any intermediate variables, if needed.
It allows any sort of filtering algorithm the user wants to implement.
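As an illustration of how a user might extend this interface, here is a minimal sketch. The `Result` template and `DebugHandle` alias below are simplified stand-ins so the snippet compiles on its own (the real types live in ET core), `filter()` is made pure virtual for brevity, and `PrefixOrIndexFilter` is a hypothetical filter invented for this example, not part of the proposal.

```cpp
#include <cstdint>
#include <cstring>

// Stand-ins (assumptions) for the ExecuTorch types used by the proposal.
using DebugHandle = uint32_t;

template <typename T>
class Result {
 public:
  Result(T value) : value_(value), ok_(true) {}
  static Result error() {
    Result r(T{});
    r.ok_ = false;
    return r;
  }
  bool ok() const { return ok_; }
  T get() const { return value_; }

 private:
  T value_;
  bool ok_;
};

// The proposed base class; filter() is pure virtual in this sketch.
class EventTracerFilterBase {
 public:
  virtual Result<bool> filter(char* name, DebugHandle delegate_debug_index) = 0;
  virtual ~EventTracerFilterBase() = default;
};

// Hypothetical user filter: accept an output if its name starts with a fixed
// prefix, or if its delegate_debug_index equals one chosen index.
class PrefixOrIndexFilter : public EventTracerFilterBase {
 public:
  PrefixOrIndexFilter(const char* prefix, DebugHandle index)
      : prefix_(prefix), index_(index) {}

  Result<bool> filter(char* name, DebugHandle delegate_debug_index) override {
    if (prefix_ == nullptr) {
      // A filter built without a prefix is treated as misconfigured.
      return Result<bool>::error();
    }
    if (name == nullptr) {
      // No name supplied: decide on the index alone.
      return delegate_debug_index == index_;
    }
    const bool name_match =
        std::strncmp(name, prefix_, std::strlen(prefix_)) == 0;
    return name_match || delegate_debug_index == index_;
  }

 private:
  const char* prefix_;
  DebugHandle index_;
};
```

The filter holds only two plain members and uses nothing beyond the C standard library, which is in the spirit of the "no extra dependencies" benefit listed above.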
EventTracer Update
We propose two main updates to EventTracer:
Introduce a new API, set_intermediate_output_filter, that accepts a pointer to a user-implemented EventTracerFilterBase, which the tracer then uses to filter intermediate outputs.
class EventTracer {
 public:
  // New API to set the intermediate output filter.
  void set_intermediate_output_filter(EventTracerFilterBase* event_tracer_filter);
};
Modify the definition of log_intermediate_output_delegate: change the return type from void to Result<bool> to indicate whether the output was logged, or whether an error occurred during logging.
// True indicates the output was logged.
// False indicates the output was not logged due to filtering.
// An error result indicates any error that occurred during the process.
template <typename T>
Result<bool> EventTracer::log_intermediate_output_delegate_helper(
    const char* name,
    DebugHandle delegate_debug_index,
    const T& output);
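To make the intended control flow concrete, the sketch below shows one way a tracer could consult the filter before logging. It is a toy under stated assumptions: `Result` and `DebugHandle` are simplified stand-ins, `ToyEventTracer` and `MinIndexFilter` are hypothetical names, and the "log" is just a vector of names rather than a real ETDump buffer. A null filter means "log everything", which is how the sketch models the backward-compatibility goal.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using DebugHandle = uint32_t;  // stand-in (assumption)

template <typename T>
class Result {  // simplified stand-in for ET's Result<T>
 public:
  Result(T value) : value_(value), ok_(true) {}
  static Result error() {
    Result r(T{});
    r.ok_ = false;
    return r;
  }
  bool ok() const { return ok_; }
  T get() const { return value_; }

 private:
  T value_;
  bool ok_;
};

class EventTracerFilterBase {
 public:
  virtual Result<bool> filter(char* name, DebugHandle delegate_debug_index) = 0;
  virtual ~EventTracerFilterBase() = default;
};

// Hypothetical filter: keep outputs whose index is at least `min_index`.
class MinIndexFilter : public EventTracerFilterBase {
 public:
  explicit MinIndexFilter(DebugHandle min_index) : min_index_(min_index) {}
  Result<bool> filter(char*, DebugHandle delegate_debug_index) override {
    return delegate_debug_index >= min_index_;
  }

 private:
  DebugHandle min_index_;
};

// Toy tracer: stores a non-owning filter pointer and checks it per output.
class ToyEventTracer {
 public:
  void set_intermediate_output_filter(EventTracerFilterBase* filter) {
    filter_ = filter;  // swapping filters at runtime is a pointer assignment
  }

  template <typename T>
  Result<bool> log_intermediate_output_delegate_helper(
      const char* name, DebugHandle delegate_debug_index, const T& output) {
    if (filter_ != nullptr) {
      // The proposed filter() takes char*, so constness is cast away here.
      Result<bool> keep =
          filter_->filter(const_cast<char*>(name), delegate_debug_index);
      if (!keep.ok()) return keep;    // propagate the filter's error
      if (!keep.get()) return false;  // filtered out, nothing logged
    }
    (void)output;  // a real tracer would serialize the tensor/value here
    logged_.push_back(name != nullptr ? name : "");
    return true;
  }

  std::size_t num_logged() const { return logged_.size(); }

 private:
  EventTracerFilterBase* filter_ = nullptr;  // default: no filtering
  std::vector<std::string> logged_;
};
```

Because the tracer only holds a pointer, changing or clearing the filter between runs requires no re-export or re-lowering of the model, which is the property the Low Overhead goal asks for.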
ETDumpGen Support
As a first-class EventTracer supported by ExecuTorch, ETDumpGen will be updated to support the proposed filtering mechanism.
We propose a concrete implementation of EventTracerFilterBase, called ETDumpFilter, that lives alongside ETDumpGen. It uses regex for name filtering and a range-based approach for debug handle filtering:
class ETDumpFilter : public EventTracerFilterBase {
 public:
  ETDumpFilter() = default;
  // Add a regex pattern for name filtering.
  Result<bool> add_regex(char* pattern);
  // Reset the range for delegate_debug_index filtering.
  Result<bool> set_range(size_t start, size_t end);
  // Returns true if the given name or delegate_debug_index matches the filter,
  // and false otherwise. Returns an error if anything goes wrong.
  Result<bool> filter(char* name, DebugHandle delegate_debug_index) override;
  ~ETDumpFilter() = default;
};
We have chosen to use regex strings as the filter format of the first example for several reasons:
Flexibility: Regex allows for arbitrary name filtering, enabling customized intermediate output filtering.
Familiarity: Regex is widely known and used, making it accessible for most developers.
Compatibility: Most systems and programming environments support regex, ensuring broad applicability and ease of integration.
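We have chosen to use a range-based approach for debug handle filtering for these reasons:
Efficiency: Range-based filtering is cheap. Checking whether an index falls inside a range takes two comparisons no matter how large the indices are, and sorted ranges also lend themselves to binary search.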
Simplicity: Specifying a start and end index is a straightforward way to define a range, making it easy for users to understand and implement.
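The combination of regex and range filtering can be sketched as follows. Several things here are assumptions rather than part of the proposal: `std::regex` is used purely as a stand-in regex engine, the range is treated as half-open `[start, end)`, an output is accepted if either the name or the index matches, and `Result`, `DebugHandle`, and the class name `ToyDumpFilter` are simplified or hypothetical. The actual ETDumpFilter may differ on all of these points.

```cpp
#include <cstddef>
#include <cstdint>
#include <regex>
#include <vector>

using DebugHandle = uint32_t;  // stand-in (assumption)

template <typename T>
class Result {  // simplified stand-in for ET's Result<T>
 public:
  Result(T value) : value_(value), ok_(true) {}
  static Result error() {
    Result r(T{});
    r.ok_ = false;
    return r;
  }
  bool ok() const { return ok_; }
  T get() const { return value_; }

 private:
  T value_;
  bool ok_;
};

class EventTracerFilterBase {
 public:
  virtual Result<bool> filter(char* name, DebugHandle delegate_debug_index) = 0;
  virtual ~EventTracerFilterBase() = default;
};

// Hypothetical regex-plus-range filter mirroring the ETDumpFilter interface.
class ToyDumpFilter : public EventTracerFilterBase {
 public:
  // Compile and store a pattern; an invalid pattern yields an error result.
  Result<bool> add_regex(char* pattern) {
    if (pattern == nullptr) return Result<bool>::error();
    try {
      patterns_.emplace_back(pattern);
    } catch (const std::regex_error&) {
      return Result<bool>::error();
    }
    return true;
  }

  // Replace the accepted index range with [start, end) (assumed half-open).
  Result<bool> set_range(std::size_t start, std::size_t end) {
    if (start > end) return Result<bool>::error();
    start_ = start;
    end_ = end;
    has_range_ = true;
    return true;
  }

  // Accept if any pattern matches the name, or the index is inside the range.
  Result<bool> filter(char* name, DebugHandle delegate_debug_index) override {
    const char* n = name;
    if (n != nullptr) {
      for (const std::regex& pattern : patterns_) {
        if (std::regex_search(n, pattern)) return true;
      }
    }
    if (has_range_ && delegate_debug_index >= start_ &&
        delegate_debug_index < end_) {
      return true;
    }
    return false;
  }

 private:
  std::vector<std::regex> patterns_;
  std::size_t start_ = 0;
  std::size_t end_ = 0;
  bool has_range_ = false;
};
```

A caller would typically register one or more patterns such as `^layer[0-9]+\.attn`, optionally set a range, and then hand the filter to the tracer. The range check is the two-comparison test mentioned under Efficiency.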
Label: module: devtools (issues related to developer tools and code under devtools/)
@tarun292 @YIWENX14 @byjlw @iseeyuan