File QnnHtpProfile.h

Parent directory (include/QNN/HTP)

QNN HTP Profile component API.

Definition (include/QNN/HTP/QnnHtpProfile.h)

Detailed Description

Requires the HTP backend to be initialized. It should be used together with the QnnProfile API, and provides HTP backend-specific definitions for the QnnProfile data structures.

Defines

QNN_HTP_PROFILE_EVENTTYPE_CONTEXT_LOAD_BIN_HOST_RPC_TIME_MICROSEC 1002

QnnProfile_EventType_t definition to get profile information that corresponds to the remote procedure call on the ARM processor when the client invokes QnnContext_createFromBinary. The value returned is the time in microseconds.

Note

Context load binary host RPC time may be available on both the QNN_PROFILE_LEVEL_BASIC and QNN_PROFILE_LEVEL_DETAILED levels.

QNN_HTP_PROFILE_EVENTTYPE_CONTEXT_LOAD_BIN_HTP_RPC_TIME_MICROSEC 1003

QnnProfile_EventType_t definition to get profile information that corresponds to the remote procedure call on the HTP processor when the client invokes QnnContext_createFromBinary. The value returned is the time in microseconds.

Note

Context load binary HTP RPC time may be available on both the QNN_PROFILE_LEVEL_BASIC and QNN_PROFILE_LEVEL_DETAILED levels.

QNN_HTP_PROFILE_EVENTTYPE_CONTEXT_LOAD_BIN_ACCEL_TIME_MICROSEC 1004

QnnProfile_EventType_t definition to get profile information that corresponds to the time taken to create the context on the accelerator when the client invokes QnnContext_createFromBinary. The value returned is the time in microseconds.

Note

Context load binary accelerator time may be available on both the QNN_PROFILE_LEVEL_BASIC and QNN_PROFILE_LEVEL_DETAILED levels.
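The three context-load events above can be gathered into one record for reporting. The following is a minimal sketch, assuming the (event type, value) pairs have already been read through the QnnProfile API; the ContextLoadTimes struct and recordContextLoadEvent helper are hypothetical names introduced for illustration, and the macros are redefined locally with the values listed above so the sketch compiles without the SDK headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values copied from the defines documented above. */
#define QNN_HTP_PROFILE_EVENTTYPE_CONTEXT_LOAD_BIN_HOST_RPC_TIME_MICROSEC 1002
#define QNN_HTP_PROFILE_EVENTTYPE_CONTEXT_LOAD_BIN_HTP_RPC_TIME_MICROSEC 1003
#define QNN_HTP_PROFILE_EVENTTYPE_CONTEXT_LOAD_BIN_ACCEL_TIME_MICROSEC 1004

/* Hypothetical container for illustration; not part of the QNN API. */
typedef struct {
  uint64_t hostRpcUs; /* RPC time reported for the ARM (host) processor */
  uint64_t htpRpcUs;  /* RPC time reported for the HTP processor */
  uint64_t accelUs;   /* context creation time on the accelerator */
} ContextLoadTimes;

/* Stores value in the field that matches eventType.
 * Returns 1 if eventType is one of the three context-load events, else 0. */
static int recordContextLoadEvent(ContextLoadTimes *times, uint32_t eventType,
                                  uint64_t value) {
  assert(times != NULL);
  switch (eventType) {
    case QNN_HTP_PROFILE_EVENTTYPE_CONTEXT_LOAD_BIN_HOST_RPC_TIME_MICROSEC:
      times->hostRpcUs = value;
      return 1;
    case QNN_HTP_PROFILE_EVENTTYPE_CONTEXT_LOAD_BIN_HTP_RPC_TIME_MICROSEC:
      times->htpRpcUs = value;
      return 1;
    case QNN_HTP_PROFILE_EVENTTYPE_CONTEXT_LOAD_BIN_ACCEL_TIME_MICROSEC:
      times->accelUs = value;
      return 1;
    default:
      return 0; /* not a context-load event */
  }
}
```

Events of any other type are left for the caller to handle, so the same loop can feed several such records.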

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_HOST_RPC_TIME_MICROSEC 2001

QnnProfile_EventType_t definition to get profile information that corresponds to the remote procedure call on the ARM processor when the client invokes QnnGraph_finalize. The value returned is the time in microseconds.

Note

Graph finalize host RPC time may be available on both the QNN_PROFILE_LEVEL_BASIC and QNN_PROFILE_LEVEL_DETAILED levels.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_HTP_RPC_TIME_MICROSEC 2002

QnnProfile_EventType_t definition to get profile information that corresponds to the remote procedure call on the HTP processor when the client invokes QnnGraph_finalize. The value returned is the time in microseconds.

Note

Graph finalize HTP RPC time may be available on both the QNN_PROFILE_LEVEL_BASIC and QNN_PROFILE_LEVEL_DETAILED levels.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_ACCEL_TIME_MICROSEC 2003

QnnProfile_EventType_t definition to get profile information that corresponds to the time taken to finalize the graph on the accelerator when the client invokes QnnGraph_finalize. The value returned is the time in microseconds.

Note

Graph finalize accelerator time may be available on both the QNN_PROFILE_LEVEL_BASIC and QNN_PROFILE_LEVEL_DETAILED levels.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_PERF_ESTIMATE 2004

QnnProfile_EventType_t definition to get profile information that corresponds to the performance estimates for the graph when the client invokes QnnGraph_finalize. This is a placeholder event that prints only the heading, with no value or unit.

Note

HTP performance estimates may be available on both the QNN_PROFILE_LEVEL_BASIC and QNN_PROFILE_LEVEL_DETAILED levels.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_PERF_ESTIMATE_MODE 2005

QnnProfile_EventType_t definition to get the perf mode at which the performance estimates are collected during QnnGraph_finalize. The value returned is the perf mode as a string, with no unit.

Note

Perf mode may be available on both the QNN_PROFILE_LEVEL_BASIC and QNN_PROFILE_LEVEL_DETAILED levels.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_PERF_ESTIMATE_SIM_EXEC_CYCLES 2006

QnnProfile_EventType_t definition to get profile information that corresponds to the simulated execution cycles during QnnGraph_finalize. The value returned is the number of cycles.

Note

Simulated execution cycles may be available on both the QNN_PROFILE_LEVEL_BASIC and QNN_PROFILE_LEVEL_DETAILED levels.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_PERF_ESTIMATE_SIM_EXEC_LOWER_CYCLES 2007

QnnProfile_EventType_t definition to get profile information that corresponds to a lower estimate of the simulated execution cycles during QnnGraph_finalize. The value returned is the number of cycles.

Note

The lower estimate of simulated execution cycles may be available on both the QNN_PROFILE_LEVEL_BASIC and QNN_PROFILE_LEVEL_DETAILED levels.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_PERF_ESTIMATE_SIM_EXEC_UPPER_CYCLES 2008

QnnProfile_EventType_t definition to get profile information that corresponds to an upper estimate of the simulated execution cycles during QnnGraph_finalize. The value returned is the number of cycles.

Note

The upper estimate of simulated execution cycles may be available on both the QNN_PROFILE_LEVEL_BASIC and QNN_PROFILE_LEVEL_DETAILED levels.
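The three simulated-cycle events (2006, 2007, 2008) are naturally read together. The sketch below is illustrative only: the SimExecEstimate struct and both helpers are hypothetical, the check that the main estimate lies between the lower and upper estimates is an assumption rather than a documented guarantee, and the clock frequency is a caller-supplied parameter because this file does not state one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the three simulated-cycle events. */
typedef struct {
  uint64_t cycles;      /* ..._PERF_ESTIMATE_SIM_EXEC_CYCLES (2006) */
  uint64_t lowerCycles; /* ..._PERF_ESTIMATE_SIM_EXEC_LOWER_CYCLES (2007) */
  uint64_t upperCycles; /* ..._PERF_ESTIMATE_SIM_EXEC_UPPER_CYCLES (2008) */
} SimExecEstimate;

/* Returns 1 when lowerCycles <= cycles <= upperCycles, else 0.
 * Assumption: the main estimate is expected to fall inside the bounds. */
static int estimateIsBracketed(const SimExecEstimate *e) {
  assert(e != NULL);
  return e->lowerCycles <= e->cycles && e->cycles <= e->upperCycles;
}

/* Converts a cycle count to microseconds for a clock given in MHz.
 * cycles / (MHz * 1e6) seconds equals cycles / MHz microseconds. */
static double cyclesToMicroseconds(uint64_t cycles, double clockMhz) {
  assert(clockMhz > 0.0);
  return (double)cycles / clockMhz;
}
```

For example, 1,500,000 simulated cycles at an assumed 1000 MHz clock correspond to 1500 microseconds.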

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_PERF_ESTIMATE_BANDWIDTH_STATS 2009

QnnProfile_EventType_t definition to get profile information that corresponds to the DDR information for each HTP during QnnGraph_finalize. This is a placeholder event that prints only the heading, with no value or unit.

Note

DDR information for each HTP may be available on both the QNN_PROFILE_LEVEL_BASIC and QNN_PROFILE_LEVEL_DETAILED levels.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_PERF_ESTIMATE_BANDWIDTH_STATS_HTP_ID 2010

QnnProfile_EventType_t definition to get profile information that corresponds to the HTP ID on chip during QnnGraph_finalize. The value returned is the HTP ID, with no unit.

Note

HTP IDs may be available on both the QNN_PROFILE_LEVEL_BASIC and QNN_PROFILE_LEVEL_DETAILED levels.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_PERF_ESTIMATE_INPUT_FILL 2011

QnnProfile_EventType_t definition to get profile information that corresponds to the graph-defined inputs, or the total reads (in bytes) from DDR for graph-input-related tensors (weights, biases, activations) that do not have predecessors. The value returned is the number of blocks in bytes.

Note

Graph-defined inputs for each HTP may be available on both the QNN_PROFILE_LEVEL_BASIC and QNN_PROFILE_LEVEL_DETAILED levels.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_PERF_ESTIMATE_INTERMEDIATE_FILL 2012

QnnProfile_EventType_t definition to get profile information that corresponds to the total reads (in bytes) from DDR for compiler-generated fill operators that have predecessors and successors and originate on the same HTP. The value returned is the number of blocks in bytes.

Note

Intermediate fill information for each HTP may be available on both the QNN_PROFILE_LEVEL_BASIC and QNN_PROFILE_LEVEL_DETAILED levels.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_PERF_ESTIMATE_INTERMEDIATE_SPILL 2013

QnnProfile_EventType_t definition to get profile information that corresponds to the total writes (in bytes) to DDR for compiler-generated spill operators that have predecessors and successors and originate on the same HTP. The value returned is the number of blocks in bytes.

Note

Intermediate spill information for each HTP may be available on both the QNN_PROFILE_LEVEL_BASIC and QNN_PROFILE_LEVEL_DETAILED levels.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_PERF_ESTIMATE_INTER_HTP_FILL 2014

QnnProfile_EventType_t definition to get profile information that corresponds to the total reads (in bytes) from DDR for fills that were generated by a different HTP core and do not have a predecessor but have a successor. The value returned is the number of blocks in bytes.

Note

Inter-HTP fill information for each HTP may be available on both the QNN_PROFILE_LEVEL_BASIC and QNN_PROFILE_LEVEL_DETAILED levels.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_PERF_ESTIMATE_INTER_HTP_SPILL 2015

QnnProfile_EventType_t definition to get profile information that corresponds to the total writes (in bytes) to DDR for fills that were generated by a different HTP core and do not have a successor but have a predecessor. The value returned is the number of blocks in bytes.

Note

Inter-HTP spill information for each HTP may be available on both the QNN_PROFILE_LEVEL_BASIC and QNN_PROFILE_LEVEL_DETAILED levels.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_PERF_ESTIMATE_OUTPUT_SPILL 2016

QnnProfile_EventType_t definition to get profile information that corresponds to the total writes (in bytes) to DDR for graph-output-related tensors that do not have successors. The value returned is the number of blocks in bytes.

Note

Graph-output-related tensor information for each HTP may be available on both the QNN_PROFILE_LEVEL_BASIC and QNN_PROFILE_LEVEL_DETAILED levels.
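The six bandwidth events (2011 to 2016) split into reads (fills) and writes (spills), so a per-HTP total can be accumulated as events arrive. This is a minimal sketch under that reading of the descriptions above; the DdrTraffic struct and accumulateDdrEvent helper are hypothetical, and the macros are redefined locally with the values listed in this file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values copied from the defines documented above. */
#define QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_PERF_ESTIMATE_INPUT_FILL 2011
#define QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_PERF_ESTIMATE_INTERMEDIATE_FILL 2012
#define QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_PERF_ESTIMATE_INTERMEDIATE_SPILL 2013
#define QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_PERF_ESTIMATE_INTER_HTP_FILL 2014
#define QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_PERF_ESTIMATE_INTER_HTP_SPILL 2015
#define QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_PERF_ESTIMATE_OUTPUT_SPILL 2016

/* Hypothetical per-HTP accumulator; not part of the QNN API. */
typedef struct {
  uint64_t readBytes;  /* sum of all fill events (reads from DDR) */
  uint64_t writeBytes; /* sum of all spill events (writes to DDR) */
} DdrTraffic;

/* Adds bytes to the read or write total depending on eventType.
 * Returns 1 if eventType is one of the six bandwidth events, else 0. */
static int accumulateDdrEvent(DdrTraffic *traffic, uint32_t eventType,
                              uint64_t bytes) {
  assert(traffic != NULL);
  switch (eventType) {
    case QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_PERF_ESTIMATE_INPUT_FILL:
    case QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_PERF_ESTIMATE_INTERMEDIATE_FILL:
    case QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_PERF_ESTIMATE_INTER_HTP_FILL:
      traffic->readBytes += bytes;
      return 1;
    case QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_PERF_ESTIMATE_INTERMEDIATE_SPILL:
    case QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_PERF_ESTIMATE_INTER_HTP_SPILL:
    case QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_PERF_ESTIMATE_OUTPUT_SPILL:
      traffic->writeBytes += bytes;
      return 1;
    default:
      return 0; /* not a bandwidth event */
  }
}
```

Keeping one DdrTraffic per HTP ID (event 2010) would give the per-core totals that the bandwidth heading groups together.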

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_PERF_ESTIMATE_MISSING_COST_OPS 2017

QnnProfile_EventType_t definition to get profile information that corresponds to the total number of ops that have no cost associated with them when the graph performance estimates are computed. The value returned is the number of such ops, with no unit.

Note

The number of missing cost ops may be available on both the QNN_PROFILE_LEVEL_BASIC and QNN_PROFILE_LEVEL_DETAILED levels.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_FINALIZE_PERF_ESTIMATE_MISSING_COST_OPID 2018

QnnProfile_EventType_t definition to get profile information that corresponds to the op IDs of the ops that have no cost associated with them when the graph performance estimates are computed. The value returned is the op name along with the op ID (in decimal format) of each op that has no cost associated with it.

Note

Op names and op IDs of missing cost ops are available only on the QNN_PROFILE_LEVEL_DETAILED level.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_EXECUTE_HOST_RPC_TIME_MICROSEC 3001

QnnProfile_EventType_t definition to get profile information that corresponds to the remote procedure call on the ARM processor when the client invokes QnnGraph_execute or QnnGraph_executeAsync. The value returned is the time in microseconds.

Note

Graph execute host RPC time may be available on both the QNN_PROFILE_LEVEL_BASIC and QNN_PROFILE_LEVEL_DETAILED levels.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_EXECUTE_HTP_RPC_TIME_MICROSEC 3002

QnnProfile_EventType_t definition to get profile information that corresponds to the remote procedure call on the HTP processor when the client invokes QnnGraph_execute or QnnGraph_executeAsync. The value returned is the time in microseconds.

Note

Graph execute HTP RPC time may be available on both the QNN_PROFILE_LEVEL_BASIC and QNN_PROFILE_LEVEL_DETAILED levels.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_EXECUTE_ACCEL_TIME_CYCLE 3003

QnnProfile_EventType_t definition to get profile information that corresponds to executing the graph on the accelerator when the client invokes QnnGraph_execute or QnnGraph_executeAsync. The value returned is the number of processor cycles taken.

Note

Graph execute accelerator time in cycles may be available only on the QNN_PROFILE_LEVEL_DETAILED level.

Note

When QNN_PROFILE_LEVEL_DETAILED is used, this event can have multiple sub-events of type QNN_PROFILE_EVENTTYPE_NODE. There will be a sub-event for each node that was added to the graph.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_EXECUTE_ACCEL_TIME_MICROSEC 3004

QnnProfile_EventType_t definition to get profile information that corresponds to executing the graph on the accelerator when the client invokes QnnGraph_execute or QnnGraph_executeAsync. The value covers execution including wait/resource acquisition time on the accelerator, where applicable in multi-threaded scenarios. The value returned is the time taken in microseconds.

Note

Graph execute accelerator time may be available on both the QNN_PROFILE_LEVEL_BASIC and QNN_PROFILE_LEVEL_DETAILED levels.

Note

When QNN_PROFILE_LEVEL_DETAILED is used, this event can have multiple sub-events of type QNN_PROFILE_EVENTTYPE_NODE with unit QNN_PROFILE_EVENTUNIT_MICROSEC. There will be a sub-event for each node that was added to the graph.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_EXECUTE_MISC_ACCEL_TIME_MICROSEC 3005

QnnProfile_EventType_t definition to get profile information that corresponds to the time taken for miscellaneous work, i.e. time that cannot be attributed to a node but is still needed to execute the graph on the accelerator. This occurs when the client invokes QnnGraph_execute or QnnGraph_executeAsync. The value returned is the time taken in microseconds.

Note

Graph execute misc accelerator time is available only on the QNN_PROFILE_LEVEL_DETAILED level.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_EXECUTE_YIELD_INSTANCE_RELEASE_TIME 3006

QnnProfile_EventType_t definition to get profile information that corresponds to the time taken for a graph yield instance to release all its resources to the other graph. The value returned is the time taken in microseconds.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_EXECUTE_YIELD_INSTANCE_WAIT_TIME 3007

QnnProfile_EventType_t definition to get profile information that corresponds to the time a graph spends waiting for a higher-priority graph to finish execution. The value returned is the time taken in microseconds.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_EXECUTE_YIELD_INSTANCE_RESTORE_TIME 3008

QnnProfile_EventType_t definition to get profile information that corresponds to the time a graph spends re-acquiring resources and restoring VTCM. The value returned is the time taken in microseconds.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_EXECUTE_YIELD_COUNT 3009

QnnProfile_EventType_t definition to get profile information that corresponds to the number of times that a yield occurred during execution.
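The four yield events (3006 to 3009) can be combined into a rough measure of what yielding costs a graph. The sketch below assumes the three times are totals for one execute call and that dividing by the yield count gives a meaningful average; neither is stated in this file. The YieldStats struct and both helpers are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the yield events of one execute call. */
typedef struct {
  uint64_t releaseUs;  /* ..._YIELD_INSTANCE_RELEASE_TIME (3006) */
  uint64_t waitUs;     /* ..._YIELD_INSTANCE_WAIT_TIME (3007) */
  uint64_t restoreUs;  /* ..._YIELD_INSTANCE_RESTORE_TIME (3008) */
  uint64_t yieldCount; /* ..._YIELD_COUNT (3009) */
} YieldStats;

/* Total time spent releasing, waiting and restoring, in microseconds. */
static uint64_t yieldOverheadUs(const YieldStats *stats) {
  assert(stats != NULL);
  return stats->releaseUs + stats->waitUs + stats->restoreUs;
}

/* Average overhead per yield in microseconds; 0.0 when no yield occurred. */
static double averageYieldOverheadUs(const YieldStats *stats) {
  assert(stats != NULL);
  if (stats->yieldCount == 0) {
    return 0.0; /* avoid dividing by zero when the graph never yielded */
  }
  return (double)yieldOverheadUs(stats) / (double)stats->yieldCount;
}
```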

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_EXECUTE_VTCM_ACQUIRE_TIME 3010

QnnProfile_EventType_t definition for the time a graph waits to acquire VTCM. This should be constant unless another graph needs to yield. The value returned is the time taken in microseconds.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_EXECUTE_RESOURCE_POWER_UP_TIME 3011

QnnProfile_EventType_t definition for the time a graph waits to acquire HMX and HVX and to power them on. The value returned is the time taken in microseconds.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_EXECUTE_ACCEL_EXCL_WAIT_TIME_MICROSEC 3012

QnnProfile_EventType_t definition to get profile information that corresponds to executing the graph on the accelerator when the client invokes QnnGraph_execute or QnnGraph_executeAsync. The value covers execution excluding wait/resource acquisition time on the accelerator, where applicable in multi-threaded scenarios. The value returned is the time taken in microseconds.

Note

Graph execute accelerator time may be available on both the QNN_PROFILE_LEVEL_BASIC and QNN_PROFILE_LEVEL_DETAILED levels.

Note

When QNN_PROFILE_LEVEL_DETAILED is used, this event can have multiple sub-events of type QNN_PROFILE_EVENTTYPE_NODE with unit QNN_PROFILE_EVENTUNIT_MICROSEC. There will be a sub-event for each node that was added to the graph.
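Because event 3004 includes wait/resource acquisition time and event 3012 excludes it, their difference gives an estimate of the time a graph spent waiting on the accelerator. This is a sketch under the assumption that both values come from the same execute call; the AccelExecTimes struct and accelWaitUs helper are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pair of values taken from one execute call. */
typedef struct {
  uint64_t inclWaitUs; /* ..._GRAPH_EXECUTE_ACCEL_TIME_MICROSEC (3004) */
  uint64_t exclWaitUs; /* ..._GRAPH_EXECUTE_ACCEL_EXCL_WAIT_TIME_MICROSEC (3012) */
} AccelExecTimes;

/* Estimated wait/resource acquisition time in microseconds.
 * Clamps to 0 if the inclusive time is smaller than the exclusive time,
 * which would indicate mismatched or noisy inputs. */
static uint64_t accelWaitUs(const AccelExecTimes *times) {
  assert(times != NULL);
  if (times->inclWaitUs < times->exclWaitUs) {
    return 0;
  }
  return times->inclWaitUs - times->exclWaitUs;
}
```

A wait estimate that grows as more graphs run concurrently would point at contention for accelerator resources rather than at the graph itself.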

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_DEINIT_HOST_RPC_TIME_MICROSEC 4001

QnnProfile_EventType_t definition to get profile information that corresponds to the remote procedure call on the ARM processor when the client invokes QnnContext_free, which in turn deinitializes the graph. The value returned is the time in microseconds.

Note

Graph deinit host RPC time may be available on both the QNN_PROFILE_LEVEL_BASIC and QNN_PROFILE_LEVEL_DETAILED levels.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_DEINIT_HTP_RPC_TIME_MICROSEC 4002

QnnProfile_EventType_t definition to get profile information that corresponds to the remote procedure call on the HTP processor when the client invokes QnnContext_free, which in turn deinitializes the graph. The value returned is the time in microseconds.

Note

Graph deinit HTP RPC time may be available on both the QNN_PROFILE_LEVEL_BASIC and QNN_PROFILE_LEVEL_DETAILED levels.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_DEINIT_ACCEL_TIME_MICROSEC 4003

QnnProfile_EventType_t definition to get profile information that corresponds to the time taken to deinitialize the graph on the accelerator when the client invokes QnnContext_free. The value returned is the time in microseconds.

Note

Graph deinit accelerator time may be available on both the QNN_PROFILE_LEVEL_BASIC and QNN_PROFILE_LEVEL_DETAILED levels.

QNN_HTP_PROFILE_EVENTTYPE_NODE_WAIT 5001

QnnProfile_EventType_t definition to get data related to execution of an operation. This value represents the amount of time an op spends waiting for execution on the main thread since the last op on the main thread due to scheduling and can be interpreted appropriately in conjunction with the unit.

Note

Node wait information is available on the QNN_HTP_PROFILE_LEVEL_LINTING level.

QNN_HTP_PROFILE_EVENTTYPE_NODE_OVERLAP 5002

QnnProfile_EventType_t definition to get data related to execution of an operation. This value represents the amount of time at least one background op is running during the execution of an op on the main thread and can be interpreted appropriately in conjunction with the unit.

Note

Node overlap information is available on the QNN_HTP_PROFILE_LEVEL_LINTING level.

QNN_HTP_PROFILE_EVENTTYPE_NODE_WAIT_OVERLAP 5003

QnnProfile_EventType_t definition to get data related to execution of an operation. This value represents the amount of time at least one background op that is not being waited upon to finish is running during the wait period of an op on the main thread and can be interpreted appropriately in conjunction with the unit.

Note

Node wait overlap information is available on the QNN_HTP_PROFILE_LEVEL_LINTING level.

QNN_HTP_PROFILE_EVENTTYPE_NODE_RESOURCEMASK 5004

QnnProfile_EventType_t definition to get data related to execution of an operation. This value represents a bitmask denoting the resources an op uses.

Note

Node-specific information is available on the QNN_HTP_PROFILE_LEVEL_LINTING level.
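The resource mask reported by QNN_HTP_PROFILE_EVENTTYPE_NODE_RESOURCEMASK is a bitmask, so it can be inspected with ordinary bit operations. This file does not say which bit stands for which resource, so the sketch below deliberately avoids naming bits: it only counts how many resources an op uses and tests a caller-chosen bit position. Both helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Counts the set bits in mask, i.e. how many distinct resources are flagged.
 * Each loop iteration clears the lowest set bit. */
static unsigned countResourcesUsed(uint64_t mask) {
  unsigned count = 0;
  while (mask != 0) {
    mask &= mask - 1;
    ++count;
  }
  return count;
}

/* Returns 1 if the given bit position is set in mask, else 0.
 * The meaning of each position must come from the backend documentation. */
static int usesResourceBit(uint64_t mask, unsigned bit) {
  assert(bit < 64);
  return (int)((mask >> bit) & 1u);
}
```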

QNN_HTP_PROFILE_EVENTTYPE_NODE_CRITICAL_BG_OP_ID 5005

QnnProfile_EventType_t definition to get data related to execution of an operation. This value represents the ID of an op running in parallel to an op running on the main thread or on HMX.

Note

Node-specific information is available on the QNN_HTP_PROFILE_LEVEL_LINTING level.

QNN_HTP_PROFILE_EVENTTYPE_NODE_WAIT_BG_OP_ID 5006

QnnProfile_EventType_t definition to get data related to execution of an operation. This value represents the ID of an op running on threads other than the main or the HMX thread when the main and the HMX threads are not executing any op.

Note

Node-specific information is available on the QNN_HTP_PROFILE_LEVEL_LINTING level.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_EXECUTE_CRITICAL_ACCEL_TIME_CYCLE 6001

QnnProfile_EventType_t definition to get profile information that corresponds to executing the graph's critical path on the accelerator when the client invokes QnnGraph_execute or QnnGraph_executeAsync. The value returned is the number of processor cycles taken.

Note

Graph execute critical-path accelerator time may be available only on the QNN_HTP_PROFILE_LEVEL_LINTING level.

Note

When QNN_HTP_PROFILE_LEVEL_LINTING is used, this event can have multiple sub-events of type QNN_PROFILE_EVENTTYPE_NODE. There will be a sub-event for each node that was added to the graph.

QNN_HTP_PROFILE_LEVEL_LINTING 7001

Linting QnnProfile_Level_t definition that allows collecting in-depth performance metrics for each op in the graph including main thread execution time and time spent on parallel background ops.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_NUMBER_OF_HVX_THREADS 8001

QnnProfile_EventType_t definition to get the number of HVX threads configured by a graph. Different graphs can have different values.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_APPLY_BINARY_SECTION_QNN 9001

QnnProfile_EventType_t definition to get profile information that corresponds to applying a binary section for updatable tensors when the client invokes QnnContext_ApplyBinarySection. It refers to the total time the entire API call takes. The value returned is the time taken in microseconds.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_APPLY_BINARY_SECTION_RPC 9002

QnnProfile_EventType_t definition to get profile information that corresponds to applying a binary section for updatable tensors when the client invokes QnnContext_ApplyBinarySection. It refers to the time of callTransport. The value returned is the time taken in microseconds.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_APPLY_BINARY_SECTION_QNN_ACC 9003

QnnProfile_EventType_t definition to get profile information that corresponds to applying a binary section for updatable tensors when the client invokes QnnContext_ApplyBinarySection. It refers to the remote procedure call on the HTP processor. The value returned is the time taken in microseconds.

QNN_HTP_PROFILE_EVENTTYPE_GRAPH_APPLY_BINARY_SECTION_ACC 9004

QnnProfile_EventType_t definition to get profile information that corresponds to applying a binary section for updatable tensors when the client invokes QnnContext_ApplyBinarySection. It refers to the Hexnn call. The value returned is the time taken in microseconds.
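The event type values in this file fall into thousands-based groups (1002 to 1004 for context load, 2001 to 2018 for graph finalize, and so on), which makes it possible to sort events by phase with integer division. The sketch below is inferred from the listed values only and is not a documented contract; the HtpEventGroup enum and classifyHtpEvent helper are hypothetical. Note that 7001 is a QnnProfile_Level_t value (QNN_HTP_PROFILE_LEVEL_LINTING), not an event type, so it is reported as unknown.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical grouping inferred from the values listed in this file. */
typedef enum {
  HTP_EVENT_GROUP_UNKNOWN = 0,
  HTP_EVENT_GROUP_CONTEXT_LOAD,         /* 1xxx */
  HTP_EVENT_GROUP_GRAPH_FINALIZE,       /* 2xxx */
  HTP_EVENT_GROUP_GRAPH_EXECUTE,        /* 3xxx */
  HTP_EVENT_GROUP_GRAPH_DEINIT,         /* 4xxx */
  HTP_EVENT_GROUP_NODE_LINTING,         /* 5xxx */
  HTP_EVENT_GROUP_CRITICAL_PATH,        /* 6xxx */
  HTP_EVENT_GROUP_GRAPH_CONFIG,         /* 8xxx */
  HTP_EVENT_GROUP_APPLY_BINARY_SECTION  /* 9xxx */
} HtpEventGroup;

/* Maps an event type value to its group using the thousands digit. */
static HtpEventGroup classifyHtpEvent(uint32_t eventType) {
  switch (eventType / 1000u) {
    case 1: return HTP_EVENT_GROUP_CONTEXT_LOAD;
    case 2: return HTP_EVENT_GROUP_GRAPH_FINALIZE;
    case 3: return HTP_EVENT_GROUP_GRAPH_EXECUTE;
    case 4: return HTP_EVENT_GROUP_GRAPH_DEINIT;
    case 5: return HTP_EVENT_GROUP_NODE_LINTING;
    case 6: return HTP_EVENT_GROUP_CRITICAL_PATH;
    case 8: return HTP_EVENT_GROUP_GRAPH_CONFIG;
    case 9: return HTP_EVENT_GROUP_APPLY_BINARY_SECTION;
    default: return HTP_EVENT_GROUP_UNKNOWN; /* includes 7xxx (profile level) */
  }
}

/* Spot checks against values listed in this file. */
static void checkClassifier(void) {
  assert(classifyHtpEvent(3004) == HTP_EVENT_GROUP_GRAPH_EXECUTE);
  assert(classifyHtpEvent(7001) == HTP_EVENT_GROUP_UNKNOWN);
}
```

A switch on the exact macro values is the safer choice when only specific events matter, since a range check would also accept values this file does not define.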