Android Build Tools
275
Android/android-ndk-r27d/simpleperf/ChangeLog
Normal file
@@ -0,0 +1,275 @@
build 11421629 (Feb 8, 2024)
  inject command:
    Support converting LBR profiles to AutoFDO format.
  list command:
    Use event_table.json to configure raw events supported on different cpu models, and read
    /proc/cpuinfo to detect cpu models on the device. As a result, we can precisely report raw
    events supported on different cpu cores.
  record/stat command:
    Support monitoring different events on different cores using --cpu.
    Use thousands separators when reporting sample counts.
  record command:
    Add --delay to delay recording samples.
    Add --record-timestamp and --record-cycles to record timestamps and cpu-cycles with ETM data.
    Add --cycle-threshold for ETM cycle count packets.
    Support reading symbols from DEX files in memory.
    Support starting simpleperf at early-init when boot profiling, as described in android_platform_profiling.md.
  stat command:
    Add --tp-filter and --kprobe to improve tracepoint event counting.
  app_profiler.py:
    Add the app versioncode to the record file.
    Add --launch to start an app with its package name.
  report scripts:
    Add --cpu to filter samples based on CPUs.
    Add ipc.py to capture instructions per cycle of the system.
    Support profiles generated by the report-sample command.
    Add sample_filter.py to split large record files.
  gecko_profile_generator.py:
    Add --percpu-samples to show samples grouped by CPUs.
  report_html.py:
    Support parsing kernel disassembly in scripts.
    Speed up generating disassembly.
    Sort functions by name in the flamegraph.

build 10661963 (Aug 16, 2023)
  report-sample command: Remove small stack gaps to get a smoother view in Stack Chart.
  Use prebuilt libsimpleperf_readelf (provided by Android clang prebuilts) for reading ELF files.
  report_html.py: Speed up disassembling many functions in a binary.

build 10306210 (Jun 12, 2023)
  record cmd: Add --decode-etm to decode ETM data while recording. This saves the space used to
    store raw ETM data.
    Store lost/cut record info in the recording file.
    Report lost samples in kernel space and user space.
  inject cmd: Accept missing aux data.
    Add build id in AutoFDO output.
  gecko_profile_generator.py: Color off-cpu frames blue and JIT app cache frames green.

build 9796343 (March 22, 2023)
  Fix dozens of security bugs detected by fuzzers.
  record cmd:
    Increase the default user buffer size from 64M to 256M for devices having >= 4G memory. This is
    to reduce lost samples and incomplete callchains.
    Add --user-buffer-size to adjust the user buffer size.
  record/stat cmd:
    Support using a process name regex to select processes via the -p option.
    Raise the file descriptors limit to better support "stat --per-thread -a".
  report:
    Support demangling Rust symbols.
    Suppress read-symbol warnings for non-ELF files.
    Improve deobfuscating Java symbols (ndk issue 1836).
  In report scripts, add --aggregate-threads to merge samples for selected threads.
  In binary_cache_builder.py, support searching for binaries having different names from those
  recorded in perf.data. Also fix a bug supporting native libs embedded in apk.
  In gecko_profile_generator.py, remove small stack gaps to get a smoother view in Stack Chart.
  doc:
    Update doc on the broken DWARF call graph issue.
    Add doc for trying the latest simpleperf builds and scripts.

build 9042912 (Sep 8, 2022)
  Fix adhoc codesign for darwin binaries.
  Release protobuf files in the proto directory.
  stat cmd: Update to work with CPU cores having different numbers of PMU counters.
  doc: Update collect_etm_data_for_autofdo.md.

build 8355685 (March 25, 2022)
  1. Add doc for getting a boot-time profile; add doc view_the_profile.md.
  2. Share report lib options between scripts.

build 8121221 (Jan 26, 2022)
  1. On Android >= 13, allow an app to profile itself even after device reboot.
     The permission expiration time can be set by --days in api_profiler.py.
  2. Improve --trace-offcpu in record cmd to support multiple report modes: pure on-cpu samples,
     pure off-cpu samples, or both on-cpu and off-cpu samples.
  3. Add --add-counter in record cmd to add additional event counts in samples.
  4. Add --filter-file in report cmds and scripts to filter samples based on timestamps.
  5. On Android >= 13, add boot-record cmd to record boot-time profiles on userdebug/eng devices.
  6. For AutoFDO/ETM support, support multiple input files in inject cmd to speed up processing and
     combine output.

build 7848450
  record cmd:
    Add support for new perf ETE files.
  report cmd:
    Extend --percent-limit to report entries.
  scripts:
    Add gecko_profile_generator.py, which generates reports for Firefox Profiler.
    Add stackcollapse.py, which generates reports in Folded Stacks format.
    Improve support of ProGuard mapping files.
    Improve logging.
  pprof_proto_generator.py:
    Add thread, threadpool, pid, tid labels.
    Set units of common events.
    Add comments.
  app_profiler.py: Kill the app process for the current user.
  report_sample.py: Add --header and --comm options.
  doc:
    Add introduction slide.
    Use auto-generated tables of contents.
  test: Fix flaky tests.

build 7649958
  Build arm simpleperf for armv7-neon instead of armv8.
  Support JIT method names with signatures.
  Doc improvements.

build 7549687
  Use multithreading to speed up line annotation.
  Add doc/debug_dwarf_unwinding.md.
  Add doc/collect_etm_data_for_autofdo.md.
  Move to file2 feature section.

build 7414587
  Drop python2 support; scripts are tested on python3.8 and python3.9.
  Refactor debug unwinding:
    1. Add --keep-failed-unwinding-result and --keep-failed-unwinding-stack options
       in record cmd, to generate additional records for failed unwinding cases.
    2. Refactor debug-unwind cmd and debug_unwind_reporter.py to report failed
       unwinding cases.
  Support recording and converting kernel ETM data in record cmd and inject cmd.
  Support using a proguard mapping file for reporting.
  Support the vmlinux file when building binary_cache.
  Support showing disassembly of the vmlinux file in report_html.py. Use multithreading
  to speed up disassembling.
  Add app_type, android_sdk_version and android_build_type to meta_info of the recording file.

ndk r23

build 7173446
  Add the visualization tool purgatorio.
  Switch to llvm-objdump and llvm-readelf.

build 7119240
  Reduce prepare recording time.
  Add --kprobe option in record cmd.
  Add --cpu option in report cmd.
  Add -i option in dump cmd.
  Add --exclude-perf option in inject cmd.
  Add merge cmd to merge recording files recorded in the same environment using
  the same event types.
  Add monitor cmd to record and report events in real time.
  Fix a few bugs in the symbolization of the kernel and kernel modules.
  Support parsing kernel ETM data in inject cmd.
  Add --show-execution-type option in report-sample cmd.
  Don't hide ART JNI methods in report_lib and report-sample cmd.

build 6859468
  Add --csv option in report cmd.
  Add --sort option in stat cmd.
  Add --tp-filter option to filter tracepoint events in record cmd.
  Add --addr-filter to filter ETM recording in record cmd.
  Fix finding symbols from kernel modules.
  Better ART JIT support (dump JIT symfiles to a single file instead of multiple
  temporary files).
  Support generic JIT symbols from a symbol map file. See doc/jit_symbols.md.

ndk r22

build 6401870
  Support multiple record files in pprof_proto_generator.py.
  In stat cmd, add --per-thread and --per-core options to report per thread and per core.
  In record cmd, add --exclude-perf option to exclude simpleperf samples in system wide
  recording.
  In inject cmd, support decoding CoreSight ETM data to branch list data in protobuf format.
  Fix and add doc for app_api, which can control simpleperf recording in app code.
  Support pmu event types:
    list supported pmu events via `simpleperf list pmu`.
    record/stat pmu events via options like -e armv8_pmuv3/cpu_cycles/.
  Switch to llvm-objdump.
  Add doc for line and disassembly annotation in README.md.
  Add doc for profiling a profileable release app on Android >= Q.
  Remove dependency on libncurses.

ndk r21

  In record cmd, support recording CoreSight ETM data (via the -e cs-etm option).
  Add inject cmd to decode CoreSight ETM data.
  Add doc for downloading unstripped libraries on device.
  Fix scripts for using unstripped libraries without build ids for reporting.
  Switch to llvm-symbolizer.
  Add app_api and api_profiler.py, which can control simpleperf recording in app code.
  Fix pprof_proto_generator.py to support line and disassembly annotation via pprof.

ndk r20

  Skipped.

ndk r19

  Fix report-sample command on Windows.

ndk r18

  Improve support of profiling JITed/interpreted Java code on Android >= P:
    1) Support JITed/interpreted Java code in system wide recording.
    2) Support dex files extracted to memory.
    3) Fix some bugs and improve inefficient code.
  Improve record command:
    1) Add a user space buffer and a high priority record reading thread to reduce the sample
       loss rate.
    2) Record the full process name instead of only the last 16 bytes.
  Improve report_html.py:
    1) Generate flamegraphs in JavaScript code instead of using inferno, thus
       reducing the time used to generate and load reports.
    2) Use bootstrap 4 to format the UI.
    3) Use a progress bar to show the progress of loading contents.
    4) Add --binary_filter option to only annotate selected binaries.
  Export tracing data in simpleperf_report_lib.py.
  Test python scripts with both python2 and python3.
  Add documentation for using simpleperf in Android platform profiling.

ndk r17

(release)
  Use the new Android unwinder, which can unwind for archs different from the build arch.
  Support profiling interpreted and JITed Java code on Android >= P.
  Refactor app_profiler.py: improve the option interface, simplify profiling from launch,
  and improve native lib downloading.
  Fix ndk issues 638, 644, 499, 493.
  Add debug-unwind cmd and a script to debug unwinding.
  Update documentation, including how to use wrap.sh to profile a released apk.

(beta 1)
  Add report_html.py, reporting profiling results in an html interface.
  Improve inferno.
  Refactor documentation.
  Provide more complete dwarf based call graphs.

ndk r16

  Add inferno, a flamegraph generator.
  Add --trace-offcpu option in simpleperf record command and app_profiler.py to trace off-cpu time.
  Add --app option in simpleperf record command to remove the need to use run-as.
  Add --profile_from_launch option in app_profiler.py to start recording from Activity launch time.
  Configure scripts from command lines; remove config files.
  Wrap simpleperf report command with report.py, in which GUI mode is enabled with the --gui option.
  Add release tests for scripts.

ndk r15

  Add three Android Studio project examples, showing how to build optimized native libs containing
  debug info and how to fully compile an app on Android O.
  Add symbol info in perf.data by default; no need to add --dump-symbols in simpleperf record command.
  Report brief call-graph in simpleperf report command.
  Support raw cpu pmu events.

ndk r14

  Add app_profiler.py to help record profiling data.
  Add annotate.py to annotate source code.
  Add simpleperf_report_lib.py interface to support extracting samples from perf.data.
  Release simpleperf binaries on host to support reporting on host.

ndk r13

  Release simpleperf binaries on device.
  Support recording and reporting stack-frame-based callgraphs and dwarf-based callgraphs.
  Add simpleperf_report.py to show callgraphs in a GUI.

0
Android/android-ndk-r27d/simpleperf/__init__.py
Normal file
486
Android/android-ndk-r27d/simpleperf/annotate.py
Normal file
@@ -0,0 +1,486 @@
|
||||
#!/usr/bin/env python3
|
||||
#
|
||||
# Copyright (C) 2016 The Android Open Source Project
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
"""annotate.py: annotate source files based on perf.data.
|
||||
"""
|
||||
|
||||
import logging
|
||||
import os
|
||||
import os.path
|
||||
import shutil
|
||||
from texttable import Texttable
|
||||
from typing import Dict, Union
|
||||
|
||||
from simpleperf_report_lib import GetReportLib
|
||||
from simpleperf_utils import (
|
||||
Addr2Nearestline, BaseArgumentParser, BinaryFinder, extant_dir, flatten_arg_list, is_windows,
|
||||
log_exit, ReadElf, SourceFileSearcher)
|
||||
|
||||
|
||||
class SourceLine(object):
|
||||
def __init__(self, file_id, function, line):
|
||||
self.file = file_id
|
||||
self.function = function
|
||||
self.line = line
|
||||
|
||||
@property
|
||||
def file_key(self):
|
||||
return self.file
|
||||
|
||||
@property
|
||||
def function_key(self):
|
||||
return (self.file, self.function)
|
||||
|
||||
@property
|
||||
def line_key(self):
|
||||
return (self.file, self.line)
|
||||
|
||||
|
||||
class Addr2Line(object):
|
||||
"""collect information of how to map [dso_name, vaddr] to [source_file:line].
|
||||
"""
|
||||
|
||||
def __init__(self, ndk_path, binary_cache_path, source_dirs):
|
||||
binary_finder = BinaryFinder(binary_cache_path, ReadElf(ndk_path))
|
||||
self.addr2line = Addr2Nearestline(ndk_path, binary_finder, True)
|
||||
self.source_searcher = SourceFileSearcher(source_dirs)
|
||||
|
||||
def add_addr(self, dso_path: str, build_id: str, func_addr: int, addr: int):
|
||||
self.addr2line.add_addr(dso_path, build_id, func_addr, addr)
|
||||
|
||||
def convert_addrs_to_lines(self):
|
||||
self.addr2line.convert_addrs_to_lines(jobs=os.cpu_count())
|
||||
|
||||
def get_sources(self, dso_path, addr):
|
||||
dso = self.addr2line.get_dso(dso_path)
|
||||
if not dso:
|
||||
return []
|
||||
source = self.addr2line.get_addr_source(dso, addr)
|
||||
if not source:
|
||||
return []
|
||||
result = []
|
||||
for (source_file, source_line, function_name) in source:
|
||||
source_file_path = self.source_searcher.get_real_path(source_file)
|
||||
if not source_file_path:
|
||||
source_file_path = source_file
|
||||
result.append(SourceLine(source_file_path, function_name, source_line))
|
||||
return result
|
||||
|
||||
|
||||
class Period(object):
|
||||
"""event count information. It can be used to represent event count
|
||||
of a line, a function, a source file, or a binary. It contains two
|
||||
parts: period and acc_period.
|
||||
When used for a line, period is the event count occurred when running
|
||||
that line, acc_period is the accumulated event count occurred when
|
||||
running that line and functions called by that line. Same thing applies
|
||||
when it is used for a function, a source file, or a binary.
|
||||
"""
|
||||
|
||||
def __init__(self, period=0, acc_period=0):
|
||||
self.period = period
|
||||
self.acc_period = acc_period
|
||||
|
||||
def __iadd__(self, other):
|
||||
self.period += other.period
|
||||
self.acc_period += other.acc_period
|
||||
return self
|
||||
|
||||
|
||||
class DsoPeriod(object):
|
||||
"""Period for each shared library"""
|
||||
|
||||
def __init__(self, dso_name):
|
||||
self.dso_name = dso_name
|
||||
self.period = Period()
|
||||
|
||||
def add_period(self, period):
|
||||
self.period += period
|
||||
|
||||
|
||||
class FilePeriod(object):
|
||||
"""Period for each source file"""
|
||||
|
||||
def __init__(self, file_id):
|
||||
self.file = file_id
|
||||
self.period = Period()
|
||||
# Period for each line in the file.
|
||||
self.line_dict = {}
|
||||
# Period for each function in the source file.
|
||||
self.function_dict = {}
|
||||
|
||||
def add_period(self, period):
|
||||
self.period += period
|
||||
|
||||
def add_line_period(self, line, period):
|
||||
a = self.line_dict.get(line)
|
||||
if a is None:
|
||||
self.line_dict[line] = a = Period()
|
||||
a += period
|
||||
|
||||
def add_function_period(self, function_name, function_start_line, period):
|
||||
a = self.function_dict.get(function_name)
|
||||
if not a:
|
||||
if function_start_line is None:
|
||||
function_start_line = -1
|
||||
self.function_dict[function_name] = a = [function_start_line, Period()]
|
||||
a[1] += period
|
||||
|
||||
|
||||
class SourceFileAnnotator(object):
|
||||
"""group code for annotating source files"""
|
||||
|
||||
def __init__(self, config):
|
||||
# check config variables
|
||||
config_names = ['perf_data_list', 'source_dirs', 'dso_filters', 'ndk_path']
|
||||
for name in config_names:
|
||||
if name not in config:
|
||||
log_exit('config [%s] is missing' % name)
|
||||
symfs_dir = 'binary_cache'
|
||||
if not os.path.isdir(symfs_dir):
|
||||
symfs_dir = None
|
||||
kallsyms = 'binary_cache/kallsyms'
|
||||
if not os.path.isfile(kallsyms):
|
||||
kallsyms = None
|
||||
|
||||
# init member variables
|
||||
self.config = config
|
||||
self.symfs_dir = symfs_dir
|
||||
self.kallsyms = kallsyms
|
||||
self.dso_filter = set(config['dso_filters']) if config.get('dso_filters') else None
|
||||
|
||||
config['annotate_dest_dir'] = 'annotated_files'
|
||||
output_dir = config['annotate_dest_dir']
|
||||
if os.path.isdir(output_dir):
|
||||
shutil.rmtree(output_dir)
|
||||
os.makedirs(output_dir)
|
||||
|
||||
self.addr2line = Addr2Line(self.config['ndk_path'], symfs_dir, config.get('source_dirs'))
|
||||
self.period = 0
|
||||
self.dso_periods = {}
|
||||
self.file_periods = {}
|
||||
|
||||
def annotate(self):
|
||||
self._collect_addrs()
|
||||
self._convert_addrs_to_lines()
|
||||
self._generate_periods()
|
||||
self._write_summary()
|
||||
self._annotate_files()
|
||||
|
||||
def _collect_addrs(self):
|
||||
"""Read perf.data, collect all addresses we need to convert to
|
||||
source file:line.
|
||||
"""
|
||||
for perf_data in self.config['perf_data_list']:
|
||||
lib = GetReportLib(perf_data)
|
||||
if self.symfs_dir:
|
||||
lib.SetSymfs(self.symfs_dir)
|
||||
if self.kallsyms:
|
||||
lib.SetKallsymsFile(self.kallsyms)
|
||||
lib.SetReportOptions(self.config['report_lib_options'])
|
||||
while True:
|
||||
sample = lib.GetNextSample()
|
||||
if sample is None:
|
||||
lib.Close()
|
||||
break
|
||||
symbols = []
|
||||
symbols.append(lib.GetSymbolOfCurrentSample())
|
||||
callchain = lib.GetCallChainOfCurrentSample()
|
||||
for i in range(callchain.nr):
|
||||
symbols.append(callchain.entries[i].symbol)
|
||||
for symbol in symbols:
|
||||
if self._filter_symbol(symbol):
|
||||
build_id = lib.GetBuildIdForPath(symbol.dso_name)
|
||||
self.addr2line.add_addr(symbol.dso_name, build_id, symbol.symbol_addr,
|
||||
symbol.vaddr_in_file)
|
||||
self.addr2line.add_addr(symbol.dso_name, build_id, symbol.symbol_addr,
|
||||
symbol.symbol_addr)
|
||||
|
||||
def _filter_symbol(self, symbol):
|
||||
if not self.dso_filter or symbol.dso_name in self.dso_filter:
|
||||
return True
|
||||
return False
|
||||
|
||||
def _convert_addrs_to_lines(self):
|
||||
self.addr2line.convert_addrs_to_lines()
|
||||
|
||||
def _generate_periods(self):
|
||||
"""read perf.data, collect Period for all types:
|
||||
binaries, source files, functions, lines.
|
||||
"""
|
||||
for perf_data in self.config['perf_data_list']:
|
||||
lib = GetReportLib(perf_data)
|
||||
if self.symfs_dir:
|
||||
lib.SetSymfs(self.symfs_dir)
|
||||
if self.kallsyms:
|
||||
lib.SetKallsymsFile(self.kallsyms)
|
||||
lib.SetReportOptions(self.config['report_lib_options'])
|
||||
while True:
|
||||
sample = lib.GetNextSample()
|
||||
if sample is None:
|
||||
lib.Close()
|
||||
break
|
||||
self._generate_periods_for_sample(lib, sample)
|
||||
|
||||
def _generate_periods_for_sample(self, lib, sample):
|
||||
symbols = []
|
||||
symbols.append(lib.GetSymbolOfCurrentSample())
|
||||
callchain = lib.GetCallChainOfCurrentSample()
|
||||
for i in range(callchain.nr):
|
||||
symbols.append(callchain.entries[i].symbol)
|
||||
# Each sample has a callchain, but its period is only used once
|
||||
# to add period for each function/source_line/source_file/binary.
|
||||
# For example, if more than one entry in the callchain hits a
|
||||
# function, the event count of that function is only increased once.
|
||||
# Otherwise, we may get periods > 100%.
|
||||
is_sample_used = False
|
||||
used_dso_dict = {}
|
||||
used_file_dict = {}
|
||||
used_function_dict = {}
|
||||
used_line_dict = {}
|
||||
period = Period(sample.period, sample.period)
|
||||
for j, symbol in enumerate(symbols):
|
||||
if j == 1:
|
||||
period = Period(0, sample.period)
|
||||
if not self._filter_symbol(symbol):
|
||||
continue
|
||||
is_sample_used = True
|
||||
# Add period to dso.
|
||||
self._add_dso_period(symbol.dso_name, period, used_dso_dict)
|
||||
# Add period to source file.
|
||||
sources = self.addr2line.get_sources(symbol.dso_name, symbol.vaddr_in_file)
|
||||
for source in sources:
|
||||
if source.file:
|
||||
self._add_file_period(source, period, used_file_dict)
|
||||
# Add period to line.
|
||||
if source.line:
|
||||
self._add_line_period(source, period, used_line_dict)
|
||||
# Add period to function.
|
||||
sources = self.addr2line.get_sources(symbol.dso_name, symbol.symbol_addr)
|
||||
for source in sources:
|
||||
if source.file:
|
||||
self._add_file_period(source, period, used_file_dict)
|
||||
if source.function:
|
||||
self._add_function_period(source, period, used_function_dict)
|
||||
|
||||
if is_sample_used:
|
||||
self.period += sample.period
|
||||
|
||||
def _add_dso_period(self, dso_name: str, period: Period, used_dso_dict: Dict[str, bool]):
|
||||
if dso_name not in used_dso_dict:
|
||||
used_dso_dict[dso_name] = True
|
||||
dso_period = self.dso_periods.get(dso_name)
|
||||
if dso_period is None:
|
||||
dso_period = self.dso_periods[dso_name] = DsoPeriod(dso_name)
|
||||
dso_period.add_period(period)
|
||||
|
||||
def _add_file_period(self, source, period, used_file_dict):
|
||||
if source.file_key not in used_file_dict:
|
||||
used_file_dict[source.file_key] = True
|
||||
file_period = self.file_periods.get(source.file)
|
||||
if file_period is None:
|
||||
file_period = self.file_periods[source.file] = FilePeriod(source.file)
|
||||
file_period.add_period(period)
|
||||
|
||||
def _add_line_period(self, source, period, used_line_dict):
|
||||
if source.line_key not in used_line_dict:
|
||||
used_line_dict[source.line_key] = True
|
||||
file_period = self.file_periods[source.file]
|
||||
file_period.add_line_period(source.line, period)
|
||||
|
||||
def _add_function_period(self, source, period, used_function_dict):
|
||||
if source.function_key not in used_function_dict:
|
||||
used_function_dict[source.function_key] = True
|
||||
file_period = self.file_periods[source.file]
|
||||
file_period.add_function_period(source.function, source.line, period)
|
||||
|
||||
def _write_summary(self):
|
||||
summary = os.path.join(self.config['annotate_dest_dir'], 'summary')
|
||||
with open(summary, 'w') as f:
|
||||
f.write('total period: %d\n\n' % self.period)
|
||||
self._write_dso_summary(f)
|
||||
self._write_file_summary(f)
|
||||
|
||||
file_periods = sorted(self.file_periods.values(),
|
||||
key=lambda x: x.period.acc_period, reverse=True)
|
||||
for file_period in file_periods:
|
||||
self._write_function_line_summary(f, file_period)
|
||||
|
||||
def _write_dso_summary(self, summary_fh):
|
||||
dso_periods = sorted(self.dso_periods.values(),
|
||||
key=lambda x: x.period.acc_period, reverse=True)
|
||||
table = Texttable(max_width=self.config['summary_width'])
|
||||
table.set_cols_align(['l', 'l', 'l'])
|
||||
table.add_row(['Total', 'Self', 'DSO'])
|
||||
for dso_period in dso_periods:
|
||||
total_str = self._get_period_str(dso_period.period.acc_period)
|
||||
self_str = self._get_period_str(dso_period.period.period)
|
||||
table.add_row([total_str, self_str, dso_period.dso_name])
|
||||
print(table.draw(), file=summary_fh)
|
||||
print(file=summary_fh)
|
||||
|
||||
def _write_file_summary(self, summary_fh):
|
||||
file_periods = sorted(self.file_periods.values(),
|
||||
key=lambda x: x.period.acc_period, reverse=True)
|
||||
table = Texttable(max_width=self.config['summary_width'])
|
||||
table.set_cols_align(['l', 'l', 'l'])
|
||||
table.add_row(['Total', 'Self', 'Source File'])
|
||||
for file_period in file_periods:
|
||||
total_str = self._get_period_str(file_period.period.acc_period)
|
||||
self_str = self._get_period_str(file_period.period.period)
|
||||
table.add_row([total_str, self_str, file_period.file])
|
||||
print(table.draw(), file=summary_fh)
|
||||
print(file=summary_fh)
|
||||
|
||||
def _write_function_line_summary(self, summary_fh, file_period: FilePeriod):
|
||||
table = Texttable(max_width=self.config['summary_width'])
|
||||
table.set_cols_align(['l', 'l', 'l'])
|
||||
table.add_row(['Total', 'Self', 'Function/Line in ' + file_period.file])
|
||||
values = []
|
||||
for func_name in file_period.function_dict.keys():
|
||||
func_start_line, period = file_period.function_dict[func_name]
|
||||
values.append((func_name, func_start_line, period))
|
||||
values.sort(key=lambda x: x[2].acc_period, reverse=True)
|
||||
for func_name, func_start_line, period in values:
|
||||
total_str = self._get_period_str(period.acc_period)
|
||||
self_str = self._get_period_str(period.period)
|
||||
name = func_name + ' (line %d)' % func_start_line
|
||||
table.add_row([total_str, self_str, name])
|
||||
for line in sorted(file_period.line_dict.keys()):
|
||||
period = file_period.line_dict[line]
|
||||
total_str = self._get_period_str(period.acc_period)
|
||||
self_str = self._get_period_str(period.period)
|
||||
name = 'line %d' % line
|
||||
table.add_row([total_str, self_str, name])
|
||||
|
||||
print(table.draw(), file=summary_fh)
|
||||
print(file=summary_fh)
|
||||
|
||||
def _get_period_str(self, period: Union[Period, int]) -> str:
|
||||
if isinstance(period, Period):
|
||||
return 'Total %s, Self %s' % (
|
||||
self._get_period_str(period.acc_period),
|
||||
self._get_period_str(period.period))
|
||||
if self.config['raw_period'] or self.period == 0:
|
||||
return str(period)
|
||||
return '%.2f%%' % (100.0 * period / self.period)
|
||||
|
||||
def _annotate_files(self):
|
||||
"""Annotate Source files: add acc_period/period for each source file.
|
||||
1. Annotate java source files, which have $JAVA_SRC_ROOT prefix.
|
||||
2. Annotate c++ source files.
|
||||
"""
|
||||
dest_dir = self.config['annotate_dest_dir']
|
||||
for key in self.file_periods:
|
||||
from_path = key
|
||||
if not os.path.isfile(from_path):
|
||||
logging.warning("can't find source file for path %s" % from_path)
|
||||
continue
|
||||
if from_path.startswith('/'):
|
||||
to_path = os.path.join(dest_dir, from_path[1:])
|
||||
elif is_windows() and ':\\' in from_path:
|
||||
to_path = os.path.join(dest_dir, from_path.replace(':\\', os.sep))
|
||||
else:
|
||||
to_path = os.path.join(dest_dir, from_path)
|
||||
is_java = from_path.endswith('.java')
|
||||
self._annotate_file(from_path, to_path, self.file_periods[key], is_java)
|
||||
|
||||
def _annotate_file(self, from_path, to_path, file_period, is_java):
|
||||
"""Annotate a source file.
|
||||
|
||||
Annotate a source file in three steps:
|
||||
1. In the first line, show periods of this file.
|
||||
2. For each function, show periods of this function.
|
||||
3. For each line not hitting the same line as functions, show
|
||||
line periods.
|
||||
"""
|
||||
logging.info('annotate file %s' % from_path)
|
||||
with open(from_path, 'r') as rf:
|
||||
lines = rf.readlines()
|
||||
|
||||
annotates = {}
|
||||
for line in file_period.line_dict.keys():
|
||||
annotates[line] = self._get_period_str(file_period.line_dict[line])
|
||||
for func_name in file_period.function_dict.keys():
|
||||
func_start_line, period = file_period.function_dict[func_name]
|
||||
if func_start_line == -1:
|
||||
continue
|
||||
line = func_start_line - 1 if is_java else func_start_line
|
||||
annotates[line] = '[func] ' + self._get_period_str(period)
|
||||
annotates[1] = '[file] ' + self._get_period_str(file_period.period)
|
||||
|
||||
max_annotate_cols = 0
|
||||
for key in annotates:
|
||||
max_annotate_cols = max(max_annotate_cols, len(annotates[key]))
|
||||
|
||||
empty_annotate = ' ' * (max_annotate_cols + 6)
|
||||
|
||||
dirname = os.path.dirname(to_path)
|
||||
if not os.path.isdir(dirname):
|
||||
os.makedirs(dirname)
|
||||
with open(to_path, 'w') as wf:
|
||||
for line in range(1, len(lines) + 1):
|
||||
annotate = annotates.get(line)
|
||||
if annotate is None:
|
||||
if not lines[line-1].strip():
|
||||
annotate = ''
|
||||
else:
|
||||
annotate = empty_annotate
|
||||
else:
|
||||
annotate = '/* ' + annotate + (
|
||||
' ' * (max_annotate_cols - len(annotate))) + ' */'
|
||||
wf.write(annotate)
|
||||
wf.write(lines[line-1])
|
||||
|
||||
|
||||
def main():
|
||||
parser = BaseArgumentParser(description="""
|
||||
Annotate source files based on profiling data. It reads line information from binary_cache
|
||||
generated by app_profiler.py or binary_cache_builder.py, and generates annotated source
files in the annotated_files directory.""")
|
||||
parser.add_argument('-i', '--perf_data_list', nargs='+', action='append', help="""
|
||||
The paths of profiling data. Default is perf.data.""")
|
||||
parser.add_argument('-s', '--source_dirs', type=extant_dir, nargs='+', action='append', help="""
|
||||
Directories to find source files.""")
|
||||
parser.add_argument('--ndk_path', type=extant_dir, help='Set the path of a ndk release.')
|
||||
parser.add_argument('--raw-period', action='store_true',
|
||||
help='show raw period instead of percentage')
|
||||
parser.add_argument('--summary-width', type=int, default=80, help='max width of summary file')
|
||||
sample_filter_group = parser.add_argument_group('Sample filter options')
|
||||
sample_filter_group.add_argument('--dso', nargs='+', action='append', help="""
|
||||
Use samples only in selected binaries.""")
|
||||
parser.add_report_lib_options(sample_filter_group=sample_filter_group)
|
||||
|
||||
args = parser.parse_args()
|
||||
config = {}
|
||||
config['perf_data_list'] = flatten_arg_list(args.perf_data_list)
|
||||
if not config['perf_data_list']:
|
||||
config['perf_data_list'].append('perf.data')
|
||||
config['source_dirs'] = flatten_arg_list(args.source_dirs)
|
||||
config['dso_filters'] = flatten_arg_list(args.dso)
|
||||
config['ndk_path'] = args.ndk_path
|
||||
config['raw_period'] = args.raw_period
|
||||
config['summary_width'] = args.summary_width
|
||||
config['report_lib_options'] = args.report_lib_options
|
||||
|
||||
annotator = SourceFileAnnotator(config)
|
||||
annotator.annotate()
|
||||
logging.info('annotation finished successfully; please check the results in annotated_files/.')
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
|
||||
131
Android/android-ndk-r27d/simpleperf/api_profiler.py
Normal file
@@ -0,0 +1,131 @@
|
||||
#!/usr/bin/env python3
|
||||
#
|
||||
# Copyright (C) 2019 The Android Open Source Project
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
"""
|
||||
This script is part of controlling simpleperf recording in user code. It is used to prepare
the profiling environment (upload simpleperf to the device and enable profiling) before
recording, and to collect recording data on the host after recording.

Controlling simpleperf recording is done in the steps below:
|
||||
1. Add simpleperf Java API/C++ API to the app's source code. And call the API in user code.
|
||||
2. Run `api_profiler.py prepare` to prepare profiling environment.
|
||||
3. Run the app one or more times to generate recording data.
|
||||
4. Run `api_profiler.py collect` to collect recording data on host.
|
||||
"""
|
||||
|
||||
from argparse import Namespace
|
||||
import logging
|
||||
import os
|
||||
import os.path
|
||||
import shutil
|
||||
import zipfile
|
||||
|
||||
from simpleperf_utils import (AdbHelper, BaseArgumentParser,
|
||||
get_target_binary_path, log_exit, remove)
|
||||
|
||||
|
||||
class ApiProfiler:
|
||||
def __init__(self, args: Namespace):
|
||||
self.args = args
|
||||
self.adb = AdbHelper()
|
||||
|
||||
def prepare_recording(self):
|
||||
self.enable_profiling_on_device()
|
||||
self.upload_simpleperf_to_device()
|
||||
self.run_simpleperf_prepare_cmd()
|
||||
|
||||
def enable_profiling_on_device(self):
|
||||
android_version = self.adb.get_android_version()
|
||||
if android_version >= 10:
|
||||
self.adb.set_property('debug.perf_event_max_sample_rate',
|
||||
str(self.args.max_sample_rate))
|
||||
self.adb.set_property('debug.perf_cpu_time_max_percent', str(self.args.max_cpu_percent))
|
||||
self.adb.set_property('debug.perf_event_mlock_kb', str(self.args.max_memory_in_kb))
|
||||
self.adb.set_property('security.perf_harden', '0')
|
||||
|
||||
def upload_simpleperf_to_device(self):
|
||||
device_arch = self.adb.get_device_arch()
|
||||
simpleperf_binary = get_target_binary_path(device_arch, 'simpleperf')
|
||||
self.adb.check_run(['push', simpleperf_binary, '/data/local/tmp'])
|
||||
self.adb.check_run(['shell', 'chmod', 'a+x', '/data/local/tmp/simpleperf'])
|
||||
|
||||
def run_simpleperf_prepare_cmd(self):
|
||||
cmd_args = ['shell', '/data/local/tmp/simpleperf', 'api-prepare', '--app', self.args.app]
|
||||
if self.args.days:
|
||||
cmd_args += ['--days', str(self.args.days)]
|
||||
self.adb.check_run(cmd_args)
|
||||
|
||||
def collect_data(self):
|
||||
if not os.path.isdir(self.args.out_dir):
|
||||
os.makedirs(self.args.out_dir)
|
||||
self.download_recording_data()
|
||||
self.unzip_recording_data()
|
||||
|
||||
def download_recording_data(self):
|
||||
""" download recording data to simpleperf_data.zip."""
|
||||
self.upload_simpleperf_to_device()
|
||||
self.adb.check_run(['shell', '/data/local/tmp/simpleperf', 'api-collect',
|
||||
'--app', self.args.app, '-o', '/data/local/tmp/simpleperf_data.zip'])
|
||||
self.adb.check_run(['pull', '/data/local/tmp/simpleperf_data.zip', self.args.out_dir])
|
||||
self.adb.check_run(['shell', 'rm', '-rf', '/data/local/tmp/simpleperf_data'])
|
||||
|
||||
def unzip_recording_data(self):
|
||||
zip_file_path = os.path.join(self.args.out_dir, 'simpleperf_data.zip')
|
||||
with zipfile.ZipFile(zip_file_path, 'r') as zip_fh:
|
||||
names = zip_fh.namelist()
|
||||
logging.info('There are %d recording data files.' % len(names))
|
||||
for name in names:
|
||||
logging.info('recording file: %s' % os.path.join(self.args.out_dir, name))
|
||||
zip_fh.extract(name, self.args.out_dir)
|
||||
remove(zip_file_path)
|
||||
|
||||
|
||||
def main():
|
||||
parser = BaseArgumentParser(description=__doc__)
|
||||
subparsers = parser.add_subparsers(title='actions', dest='command')
|
||||
|
||||
prepare_parser = subparsers.add_parser('prepare', help='Prepare recording on device.')
|
||||
prepare_parser.add_argument('-p', '--app', required=True, help="""
|
||||
The app package name of the app profiled.""")
|
||||
prepare_parser.add_argument('-d', '--days', type=int, help="""
|
||||
By default, the recording permission is reset after device reboot.
|
||||
But on Android >= 13, we can use --days to set how long we want the
|
||||
permission to persist. It can last after device reboot.
|
||||
""")
|
||||
prepare_parser.add_argument('--max-sample-rate', type=int, default=100000, help="""
|
||||
Set max sample rate (only on Android >= Q).""")
|
||||
prepare_parser.add_argument('--max-cpu-percent', type=int, default=25, help="""
|
||||
Set max cpu percent for recording (only on Android >= Q).""")
|
||||
prepare_parser.add_argument('--max-memory-in-kb', type=int,
|
||||
default=(1024 + 1) * 4 * 8, help="""
|
||||
Set max kernel buffer size for recording (only on Android >= Q).
|
||||
""")
|
||||
|
||||
collect_parser = subparsers.add_parser('collect', help='Collect recording data.')
|
||||
collect_parser.add_argument('-p', '--app', required=True, help="""
|
||||
The app package name of the app profiled.""")
|
||||
collect_parser.add_argument('-o', '--out-dir', default='simpleperf_data', help="""
|
||||
The directory to store recording data.""")
|
||||
args = parser.parse_args()
|
||||
|
||||
if args.command == 'prepare':
|
||||
ApiProfiler(args).prepare_recording()
|
||||
elif args.command == 'collect':
|
||||
ApiProfiler(args).collect_data()
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
|
||||
530
Android/android-ndk-r27d/simpleperf/app_api/cpp/simpleperf.cpp
Normal file
@@ -0,0 +1,530 @@
|
||||
/*
|
||||
* Copyright (C) 2019 The Android Open Source Project
|
||||
*
|
||||
* Licensed under the Apache License, Version 2.0 (the "License");
|
||||
* you may not use this file except in compliance with the License.
|
||||
* You may obtain a copy of the License at
|
||||
*
|
||||
* http://www.apache.org/licenses/LICENSE-2.0
|
||||
*
|
||||
* Unless required by applicable law or agreed to in writing, software
|
||||
* distributed under the License is distributed on an "AS IS" BASIS,
|
||||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
* See the License for the specific language governing permissions and
|
||||
* limitations under the License.
|
||||
*/
|
||||
|
||||
#include "simpleperf.h"
|
||||
|
||||
#include <limits.h>
|
||||
#include <stdarg.h>
|
||||
#include <stdlib.h>
|
||||
#include <string.h>
|
||||
#include <sys/socket.h>
|
||||
#include <sys/stat.h>
|
||||
#include <sys/wait.h>
|
||||
#include <time.h>
|
||||
#include <unistd.h>
|
||||
|
||||
#include <android/log.h>
|
||||
#include <mutex>
|
||||
#include <sstream>
|
||||
|
||||
namespace simpleperf {
|
||||
|
||||
constexpr int AID_USER_OFFSET = 100000;
|
||||
|
||||
enum RecordCmd {
|
||||
CMD_PAUSE_RECORDING = 1,
|
||||
CMD_RESUME_RECORDING,
|
||||
};
|
||||
|
||||
class RecordOptionsImpl {
|
||||
public:
|
||||
std::string output_filename;
|
||||
std::string event = "cpu-cycles";
|
||||
size_t freq = 4000;
|
||||
double duration_in_second = 0.0;
|
||||
std::vector<pid_t> threads;
|
||||
bool dwarf_callgraph = false;
|
||||
bool fp_callgraph = false;
|
||||
bool trace_offcpu = false;
|
||||
};
|
||||
|
||||
RecordOptions::RecordOptions() : impl_(new RecordOptionsImpl) {}
|
||||
|
||||
RecordOptions::~RecordOptions() {
|
||||
delete impl_;
|
||||
}
|
||||
|
||||
RecordOptions& RecordOptions::SetOutputFilename(const std::string& filename) {
|
||||
impl_->output_filename = filename;
|
||||
return *this;
|
||||
}
|
||||
|
||||
RecordOptions& RecordOptions::SetEvent(const std::string& event) {
|
||||
impl_->event = event;
|
||||
return *this;
|
||||
}
|
||||
|
||||
RecordOptions& RecordOptions::SetSampleFrequency(size_t freq) {
|
||||
impl_->freq = freq;
|
||||
return *this;
|
||||
}
|
||||
|
||||
RecordOptions& RecordOptions::SetDuration(double duration_in_second) {
|
||||
impl_->duration_in_second = duration_in_second;
|
||||
return *this;
|
||||
}
|
||||
|
||||
RecordOptions& RecordOptions::SetSampleThreads(const std::vector<pid_t>& threads) {
|
||||
impl_->threads = threads;
|
||||
return *this;
|
||||
}
|
||||
|
||||
RecordOptions& RecordOptions::RecordDwarfCallGraph() {
|
||||
impl_->dwarf_callgraph = true;
|
||||
impl_->fp_callgraph = false;
|
||||
return *this;
|
||||
}
|
||||
|
||||
RecordOptions& RecordOptions::RecordFramePointerCallGraph() {
|
||||
impl_->fp_callgraph = true;
|
||||
impl_->dwarf_callgraph = false;
|
||||
return *this;
|
||||
}
|
||||
|
||||
RecordOptions& RecordOptions::TraceOffCpu() {
|
||||
impl_->trace_offcpu = true;
|
||||
return *this;
|
||||
}
|
||||
|
||||
static std::string GetDefaultOutputFilename() {
|
||||
time_t t = time(nullptr);
|
||||
struct tm tm;
|
||||
if (localtime_r(&t, &tm) != &tm) {
|
||||
return "perf.data";
|
||||
}
|
||||
char* buf = nullptr;
|
||||
asprintf(&buf, "perf-%02d-%02d-%02d-%02d-%02d.data", tm.tm_mon + 1, tm.tm_mday, tm.tm_hour,
|
||||
tm.tm_min, tm.tm_sec);
|
||||
std::string result = buf;
|
||||
free(buf);
|
||||
return result;
|
||||
}
|
||||
|
||||
std::vector<std::string> RecordOptions::ToRecordArgs() const {
|
||||
std::vector<std::string> args;
|
||||
std::string output_filename = impl_->output_filename;
|
||||
if (output_filename.empty()) {
|
||||
output_filename = GetDefaultOutputFilename();
|
||||
}
|
||||
args.insert(args.end(), {"-o", output_filename});
|
||||
args.insert(args.end(), {"-e", impl_->event});
|
||||
args.insert(args.end(), {"-f", std::to_string(impl_->freq)});
|
||||
if (impl_->duration_in_second != 0.0) {
|
||||
args.insert(args.end(), {"--duration", std::to_string(impl_->duration_in_second)});
|
||||
}
|
||||
if (impl_->threads.empty()) {
|
||||
args.insert(args.end(), {"-p", std::to_string(getpid())});
|
||||
} else {
|
||||
std::ostringstream os;
|
||||
os << *(impl_->threads.begin());
|
||||
for (auto it = std::next(impl_->threads.begin()); it != impl_->threads.end(); ++it) {
|
||||
os << "," << *it;
|
||||
}
|
||||
args.insert(args.end(), {"-t", os.str()});
|
||||
}
|
||||
if (impl_->dwarf_callgraph) {
|
||||
args.push_back("-g");
|
||||
} else if (impl_->fp_callgraph) {
|
||||
args.insert(args.end(), {"--call-graph", "fp"});
|
||||
}
|
||||
if (impl_->trace_offcpu) {
|
||||
args.push_back("--trace-offcpu");
|
||||
}
|
||||
return args;
|
||||
}
|
||||
|
||||
static void Abort(const char* fmt, ...) {
|
||||
va_list vl;
|
||||
va_start(vl, fmt);
|
||||
__android_log_vprint(ANDROID_LOG_FATAL, "simpleperf", fmt, vl);
|
||||
va_end(vl);
|
||||
abort();
|
||||
}
|
||||
|
||||
class ProfileSessionImpl {
|
||||
public:
|
||||
ProfileSessionImpl(const std::string& app_data_dir)
|
||||
: app_data_dir_(app_data_dir), simpleperf_data_dir_(app_data_dir + "/simpleperf_data") {}
|
||||
~ProfileSessionImpl();
|
||||
void StartRecording(const std::vector<std::string>& args);
|
||||
void PauseRecording();
|
||||
void ResumeRecording();
|
||||
void StopRecording();
|
||||
|
||||
private:
|
||||
std::string FindSimpleperf();
|
||||
std::string FindSimpleperfInTempDir();
|
||||
void CheckIfPerfEnabled();
|
||||
std::string GetProperty(const std::string& name);
|
||||
void CreateSimpleperfDataDir();
|
||||
void CreateSimpleperfProcess(const std::string& simpleperf_path,
|
||||
const std::vector<std::string>& record_args);
|
||||
void SendCmd(const std::string& cmd);
|
||||
std::string ReadReply();
|
||||
|
||||
enum State {
|
||||
NOT_YET_STARTED,
|
||||
STARTED,
|
||||
PAUSED,
|
||||
STOPPED,
|
||||
};
|
||||
|
||||
const std::string app_data_dir_;
|
||||
const std::string simpleperf_data_dir_;
|
||||
std::mutex lock_; // Protect all members below.
|
||||
State state_ = NOT_YET_STARTED;
|
||||
pid_t simpleperf_pid_ = -1;
|
||||
int control_fd_ = -1;
|
||||
int reply_fd_ = -1;
|
||||
bool trace_offcpu_ = false;
|
||||
};
|
||||
|
||||
ProfileSessionImpl::~ProfileSessionImpl() {
|
||||
if (control_fd_ != -1) {
|
||||
close(control_fd_);
|
||||
}
|
||||
if (reply_fd_ != -1) {
|
||||
close(reply_fd_);
|
||||
}
|
||||
}
|
||||
|
||||
void ProfileSessionImpl::StartRecording(const std::vector<std::string>& args) {
|
||||
std::lock_guard<std::mutex> guard(lock_);
|
||||
if (state_ != NOT_YET_STARTED) {
|
||||
Abort("startRecording: session in wrong state %d", state_);
|
||||
}
|
||||
for (const auto& arg : args) {
|
||||
if (arg == "--trace-offcpu") {
|
||||
trace_offcpu_ = true;
|
||||
}
|
||||
}
|
||||
std::string simpleperf_path = FindSimpleperf();
|
||||
CheckIfPerfEnabled();
|
||||
CreateSimpleperfDataDir();
|
||||
CreateSimpleperfProcess(simpleperf_path, args);
|
||||
state_ = STARTED;
|
||||
}
|
||||
|
||||
void ProfileSessionImpl::PauseRecording() {
|
||||
std::lock_guard<std::mutex> guard(lock_);
|
||||
if (state_ != STARTED) {
|
||||
Abort("pauseRecording: session in wrong state %d", state_);
|
||||
}
|
||||
if (trace_offcpu_) {
|
||||
Abort("--trace-offcpu doesn't work well with pause/resume recording");
|
||||
}
|
||||
SendCmd("pause");
|
||||
state_ = PAUSED;
|
||||
}
|
||||
|
||||
void ProfileSessionImpl::ResumeRecording() {
|
||||
std::lock_guard<std::mutex> guard(lock_);
|
||||
if (state_ != PAUSED) {
|
||||
Abort("resumeRecording: session in wrong state %d", state_);
|
||||
}
|
||||
SendCmd("resume");
|
||||
state_ = STARTED;
|
||||
}
|
||||
|
||||
void ProfileSessionImpl::StopRecording() {
|
||||
std::lock_guard<std::mutex> guard(lock_);
|
||||
if (state_ != STARTED && state_ != PAUSED) {
|
||||
Abort("stopRecording: session in wrong state %d", state_);
|
||||
}
|
||||
// Send SIGINT to simpleperf to stop recording.
|
||||
if (kill(simpleperf_pid_, SIGINT) == -1) {
|
||||
Abort("failed to stop simpleperf: %s", strerror(errno));
|
||||
}
|
||||
int status;
|
||||
pid_t result = TEMP_FAILURE_RETRY(waitpid(simpleperf_pid_, &status, 0));
|
||||
if (result == -1) {
|
||||
Abort("failed to call waitpid: %s", strerror(errno));
|
||||
}
|
||||
if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) {
|
||||
Abort("simpleperf exited with error, status = 0x%x", status);
|
||||
}
|
||||
state_ = STOPPED;
|
||||
}
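// Control-protocol note (summarizing the code below, not an official spec): the parent
// writes newline-terminated commands ("pause\n", "resume\n") to the simpleperf child's
// stdin, and reads one-line replies from its stdout -- "ok" after each command, plus an
// initial "started" line once recording has begun.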
|
||||
|
||||
void ProfileSessionImpl::SendCmd(const std::string& cmd) {
|
||||
std::string data = cmd + "\n";
|
||||
if (TEMP_FAILURE_RETRY(write(control_fd_, &data[0], data.size())) !=
|
||||
static_cast<ssize_t>(data.size())) {
|
||||
Abort("failed to send cmd to simpleperf: %s", strerror(errno));
|
||||
}
|
||||
if (ReadReply() != "ok") {
|
||||
Abort("failed to run cmd in simpleperf: %s", cmd.c_str());
|
||||
}
|
||||
}
|
||||
|
||||
static bool IsExecutableFile(const std::string& path) {
|
||||
struct stat st;
|
||||
if (stat(path.c_str(), &st) == 0) {
|
||||
if (S_ISREG(st.st_mode) && (st.st_mode & S_IXUSR)) {
|
||||
return true;
|
||||
}
|
||||
}
|
||||
return false;
|
||||
}
|
||||
|
||||
static std::string ReadFile(FILE* fp) {
|
||||
std::string s;
|
||||
if (fp == nullptr) {
|
||||
return s;
|
||||
}
|
||||
char buf[200];
|
||||
while (true) {
|
||||
ssize_t n = fread(buf, 1, sizeof(buf), fp);
|
||||
if (n <= 0) {
|
||||
break;
|
||||
}
|
||||
s.insert(s.end(), buf, buf + n);
|
||||
}
|
||||
fclose(fp);
|
||||
return s;
|
||||
}
|
||||
|
||||
static bool RunCmd(std::vector<const char*> args, std::string* stdout) {
|
||||
int stdout_fd[2];
|
||||
if (pipe(stdout_fd) != 0) {
|
||||
return false;
|
||||
}
|
||||
args.push_back(nullptr);
|
||||
// Fork handlers (like gsl_library_close) may hang in a multi-thread environment.
|
||||
// So we use vfork instead of fork to avoid calling them.
|
||||
int pid = vfork();
|
||||
if (pid == -1) {
|
||||
return false;
|
||||
}
|
||||
if (pid == 0) {
|
||||
// child process
|
||||
close(stdout_fd[0]);
|
||||
dup2(stdout_fd[1], 1);
|
||||
close(stdout_fd[1]);
|
||||
execvp(const_cast<char*>(args[0]), const_cast<char**>(args.data()));
|
||||
_exit(1);
|
||||
}
|
||||
// parent process
|
||||
close(stdout_fd[1]);
|
||||
int status;
|
||||
pid_t result = TEMP_FAILURE_RETRY(waitpid(pid, &status, 0));
|
||||
if (result == -1) {
|
||||
Abort("failed to call waitpid: %s", strerror(errno));
|
||||
}
|
||||
if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) {
|
||||
return false;
|
||||
}
|
||||
if (stdout == nullptr) {
|
||||
close(stdout_fd[0]);
|
||||
} else {
|
||||
*stdout = ReadFile(fdopen(stdout_fd[0], "r"));
|
||||
}
|
||||
return true;
|
||||
}
|
||||
|
||||
std::string ProfileSessionImpl::FindSimpleperf() {
|
||||
// 1. Try /data/local/tmp/simpleperf first. Probably it's newer than /system/bin/simpleperf.
|
||||
std::string simpleperf_path = FindSimpleperfInTempDir();
|
||||
if (!simpleperf_path.empty()) {
|
||||
return simpleperf_path;
|
||||
}
|
||||
// 2. Try /system/bin/simpleperf, which is available on Android >= Q.
|
||||
simpleperf_path = "/system/bin/simpleperf";
|
||||
if (IsExecutableFile(simpleperf_path)) {
|
||||
return simpleperf_path;
|
||||
}
|
||||
Abort("can't find simpleperf on device. Please run api_profiler.py.");
|
||||
return "";
|
||||
}
|
||||
|
||||
std::string ProfileSessionImpl::FindSimpleperfInTempDir() {
|
||||
const std::string path = "/data/local/tmp/simpleperf";
|
||||
if (!IsExecutableFile(path)) {
|
||||
return "";
|
||||
}
|
||||
// Copy it to app_dir to execute it.
|
||||
const std::string to_path = app_data_dir_ + "/simpleperf";
|
||||
if (!RunCmd({"/system/bin/cp", path.c_str(), to_path.c_str()}, nullptr)) {
|
||||
return "";
|
||||
}
|
||||
// For apps with target sdk >= 29, executing app data file isn't allowed.
|
||||
// For android R, app context isn't allowed to use perf_event_open.
|
||||
// So test executing downloaded simpleperf.
|
||||
std::string s;
|
||||
if (!RunCmd({to_path.c_str(), "list", "sw"}, &s)) {
|
||||
return "";
|
||||
}
|
||||
if (s.find("cpu-clock") == std::string::npos) {
|
||||
return "";
|
||||
}
|
||||
return to_path;
|
||||
}
|
||||
|
||||
void ProfileSessionImpl::CheckIfPerfEnabled() {
|
||||
if (GetProperty("persist.simpleperf.profile_app_uid") == std::to_string(getuid())) {
|
||||
std::string time_str = GetProperty("persist.simpleperf.profile_app_expiration_time");
|
||||
if (!time_str.empty()) {
|
||||
errno = 0;
|
||||
uint64_t expiration_time = strtoull(time_str.data(), nullptr, 10);
|
||||
if (errno == 0 && expiration_time > time(nullptr)) {
|
||||
return;
|
||||
}
|
||||
}
|
||||
}
|
||||
if (GetProperty("security.perf_harden") == "1") {
|
||||
Abort("Recording app isn't enabled on the device. Please run api_profiler.py.");
|
||||
}
|
||||
}
|
||||
|
||||
std::string ProfileSessionImpl::GetProperty(const std::string& name) {
|
||||
std::string s;
|
||||
if (!RunCmd({"/system/bin/getprop", name.c_str()}, &s)) {
|
||||
return "";
|
||||
}
|
||||
return s;
|
||||
}
|
||||
|
||||
void ProfileSessionImpl::CreateSimpleperfDataDir() {
|
||||
struct stat st;
|
||||
if (stat(simpleperf_data_dir_.c_str(), &st) == 0 && S_ISDIR(st.st_mode)) {
|
||||
return;
|
||||
}
|
||||
if (mkdir(simpleperf_data_dir_.c_str(), 0700) == -1) {
|
||||
Abort("failed to create simpleperf data dir %s: %s", simpleperf_data_dir_.c_str(),
|
||||
strerror(errno));
|
||||
}
|
||||
}
|
||||
|
||||
void ProfileSessionImpl::CreateSimpleperfProcess(const std::string& simpleperf_path,
|
||||
const std::vector<std::string>& record_args) {
|
||||
// 1. Create control/reply pipes.
|
||||
int control_fd[2];
|
||||
int reply_fd[2];
|
||||
if (pipe(control_fd) != 0 || pipe(reply_fd) != 0) {
|
||||
Abort("failed to call pipe: %s", strerror(errno));
|
||||
}
|
||||
|
||||
// 2. Prepare simpleperf arguments.
|
||||
std::vector<std::string> args;
|
||||
args.emplace_back(simpleperf_path);
|
||||
args.emplace_back("record");
|
||||
args.emplace_back("--log-to-android-buffer");
|
||||
args.insert(args.end(), {"--log", "debug"});
|
||||
args.emplace_back("--stdio-controls-profiling");
|
||||
args.emplace_back("--in-app");
|
||||
args.insert(args.end(), {"--tracepoint-events", "/data/local/tmp/tracepoint_events"});
|
||||
args.insert(args.end(), record_args.begin(), record_args.end());
|
||||
char* argv[args.size() + 1];
|
||||
for (size_t i = 0; i < args.size(); ++i) {
|
||||
argv[i] = &args[i][0];
|
||||
}
|
||||
argv[args.size()] = nullptr;
|
||||
|
||||
// 3. Start simpleperf process.
|
||||
// Fork handlers (like gsl_library_close) may hang in a multi-thread environment.
|
||||
// So we use vfork instead of fork to avoid calling them.
|
||||
int pid = vfork();
|
||||
if (pid == -1) {
|
||||
Abort("failed to fork: %s", strerror(errno));
|
||||
}
|
||||
if (pid == 0) {
|
||||
// child process
|
||||
close(control_fd[1]);
|
||||
dup2(control_fd[0], 0);  // simpleperf reads control cmds from fd 0.
|
||||
close(control_fd[0]);
|
||||
close(reply_fd[0]);
|
||||
dup2(reply_fd[1], 1); // simpleperf writes reply to fd 1.
|
||||
close(reply_fd[1]);
|
||||
chdir(simpleperf_data_dir_.c_str());
|
||||
execvp(argv[0], argv);
|
||||
Abort("failed to call exec: %s", strerror(errno));
|
||||
}
|
||||
// parent process
|
||||
close(control_fd[0]);
|
||||
control_fd_ = control_fd[1];
|
||||
close(reply_fd[1]);
|
||||
reply_fd_ = reply_fd[0];
|
||||
simpleperf_pid_ = pid;
|
||||
|
||||
// 4. Wait until simpleperf starts recording.
|
||||
std::string start_flag = ReadReply();
|
||||
if (start_flag != "started") {
|
||||
Abort("failed to receive simpleperf start flag");
|
||||
}
|
||||
}
|
||||
|
||||
std::string ProfileSessionImpl::ReadReply() {
|
||||
std::string s;
|
||||
while (true) {
|
||||
char c;
|
||||
ssize_t result = TEMP_FAILURE_RETRY(read(reply_fd_, &c, 1));
|
||||
if (result <= 0 || c == '\n') {
|
||||
break;
|
||||
}
|
||||
s.push_back(c);
|
||||
}
|
||||
return s;
|
||||
}
|
||||
|
||||
ProfileSession::ProfileSession() {
|
||||
FILE* fp = fopen("/proc/self/cmdline", "r");
|
||||
if (fp == nullptr) {
|
||||
Abort("failed to open /proc/self/cmdline: %s", strerror(errno));
|
||||
}
|
||||
std::string s = ReadFile(fp);
|
||||
for (int i = 0; i < s.size(); i++) {
|
||||
if (s[i] == '\0') {
|
||||
s = s.substr(0, i);
|
||||
break;
|
||||
}
|
||||
}
|
||||
std::string app_data_dir = "/data/data/" + s;
|
||||
int uid = getuid();
|
||||
if (uid >= AID_USER_OFFSET) {
|
||||
int user_id = uid / AID_USER_OFFSET;
|
||||
app_data_dir = "/data/user/" + std::to_string(user_id) + "/" + s;
|
||||
}
|
||||
impl_ = new ProfileSessionImpl(app_data_dir);
|
||||
}
|
||||
|
||||
ProfileSession::ProfileSession(const std::string& app_data_dir)
|
||||
: impl_(new ProfileSessionImpl(app_data_dir)) {}
|
||||
|
||||
ProfileSession::~ProfileSession() {
|
||||
delete impl_;
|
||||
}
|
||||
|
||||
void ProfileSession::StartRecording(const RecordOptions& options) {
|
||||
StartRecording(options.ToRecordArgs());
|
||||
}
|
||||
|
||||
void ProfileSession::StartRecording(const std::vector<std::string>& record_args) {
|
||||
impl_->StartRecording(record_args);
|
||||
}
|
||||
|
||||
void ProfileSession::PauseRecording() {
|
||||
impl_->PauseRecording();
|
||||
}
|
||||
|
||||
void ProfileSession::ResumeRecording() {
|
||||
impl_->ResumeRecording();
|
||||
}
|
||||
|
||||
void ProfileSession::StopRecording() {
|
||||
impl_->StopRecording();
|
||||
}
|
||||
|
||||
} // namespace simpleperf
|
||||
161
Android/android-ndk-r27d/simpleperf/app_api/cpp/simpleperf.h
Normal file
@@ -0,0 +1,161 @@
|
||||
/*
|
||||
* Copyright (C) 2019 The Android Open Source Project
|
||||
*
|
||||
* Licensed under the Apache License, Version 2.0 (the "License");
|
||||
* you may not use this file except in compliance with the License.
|
||||
* You may obtain a copy of the License at
|
||||
*
|
||||
* http://www.apache.org/licenses/LICENSE-2.0
|
||||
*
|
||||
* Unless required by applicable law or agreed to in writing, software
|
||||
* distributed under the License is distributed on an "AS IS" BASIS,
|
||||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
* See the License for the specific language governing permissions and
|
||||
* limitations under the License.
|
||||
*/
|
||||
|
||||
#pragma once
|
||||
#include <sys/types.h>
|
||||
#include <unistd.h>
|
||||
|
||||
#include <string>
|
||||
#include <vector>
|
||||
|
||||
// A C++ API used to control simpleperf recording.
|
||||
namespace simpleperf {
|
||||
|
||||
/**
|
||||
* RecordOptions sets record options used by ProfileSession. The options are
|
||||
* converted to a string list in ToRecordArgs(), which is then passed to
|
||||
* `simpleperf record` cmd. Run `simpleperf record -h` or
|
||||
* `run_simpleperf_on_device.py record -h` for help messages.
|
||||
*
|
||||
* Example:
|
||||
* RecordOptions options;
|
||||
* options.SetDuration(3).RecordDwarfCallGraph().SetOutputFilename("perf.data");
|
||||
* ProfileSession session;
|
||||
* session.StartRecording(options);
|
||||
*/
|
||||
class RecordOptionsImpl;
|
||||
class RecordOptions {
|
||||
public:
|
||||
RecordOptions();
|
||||
~RecordOptions();
|
||||
/**
|
||||
* Set output filename. Default is perf-<month>-<day>-<hour>-<minute>-<second>.data.
|
||||
* The file will be generated under simpleperf_data/.
|
||||
*/
|
||||
RecordOptions& SetOutputFilename(const std::string& filename);
|
||||
|
||||
/**
|
||||
* Set event to record. Default is cpu-cycles. See `simpleperf list` for all available events.
|
||||
*/
|
||||
RecordOptions& SetEvent(const std::string& event);
|
||||
|
||||
/**
|
||||
* Set how many samples to generate per second while running. Default is 4000.
|
||||
*/
|
||||
RecordOptions& SetSampleFrequency(size_t freq);
|
||||
|
||||
/**
|
||||
* Set record duration. The record stops after `duration_in_second` seconds. By default,
|
||||
* record stops only when StopRecording() is called.
|
||||
*/
|
||||
RecordOptions& SetDuration(double duration_in_second);
|
||||
|
||||
/**
|
||||
* Record some threads in the app process. By default, record all threads in the process.
|
||||
*/
|
||||
RecordOptions& SetSampleThreads(const std::vector<pid_t>& threads);
|
||||
|
||||
/**
|
||||
* Record dwarf based call graph. It is needed to get Java callstacks.
|
||||
*/
|
||||
RecordOptions& RecordDwarfCallGraph();
|
||||
|
||||
/**
|
||||
* Record frame pointer based call graph. It is suitable to get C++ callstacks on 64bit devices.
|
||||
*/
|
||||
RecordOptions& RecordFramePointerCallGraph();
|
||||
|
||||
/**
|
||||
* Trace context switch info to show where threads spend time off cpu.
|
||||
*/
|
||||
RecordOptions& TraceOffCpu();
|
||||
|
||||
/**
|
||||
* Translate record options into arguments for `simpleperf record` cmd.
|
||||
*/
|
||||
std::vector<std::string> ToRecordArgs() const;
|
||||
|
||||
private:
|
||||
RecordOptionsImpl* impl_;
|
||||
};
|
||||
|
||||
/**
|
||||
* ProfileSession uses `simpleperf record` cmd to generate a recording file.
|
||||
* It allows users to start recording with some options, pause/resume recording
|
||||
* to profile only the code of interest, and stop recording.
|
||||
*
|
||||
* Example:
|
||||
* RecordOptions options;
|
||||
* options.RecordDwarfCallGraph();
|
||||
* ProfileSession session;
|
||||
* session.StartRecording(options);
|
||||
* sleep(1);
|
||||
* session.PauseRecording();
|
||||
* sleep(1);
|
||||
* session.ResumeRecording();
|
||||
* sleep(1);
|
||||
* session.StopRecording();
|
||||
*
|
||||
* It aborts when an error happens. To read error messages of the simpleperf record
|
||||
* process, filter logcat with `simpleperf`.
|
||||
*/
|
||||
class ProfileSessionImpl;
|
||||
class ProfileSession {
|
||||
public:
|
||||
/**
|
||||
* @param appDataDir the same as android.content.Context.getDataDir().
|
||||
* ProfileSession stores profiling data in appDataDir/simpleperf_data/.
|
||||
*/
|
||||
ProfileSession(const std::string& app_data_dir);
|
||||
|
||||
/**
|
||||
* ProfileSession assumes appDataDir as /data/data/app_package_name.
|
||||
*/
|
||||
ProfileSession();
|
||||
~ProfileSession();
|
||||
|
||||
/**
|
||||
* Start recording.
|
||||
* @param options RecordOptions
|
||||
*/
|
||||
void StartRecording(const RecordOptions& options);
|
||||
|
||||
/**
|
||||
* Start recording.
|
||||
* @param args arguments for `simpleperf record` cmd.
|
||||
*/
|
||||
void StartRecording(const std::vector<std::string>& record_args);
|
||||
|
||||
/**
|
||||
* Pause recording. No samples are generated in paused state.
|
||||
*/
|
||||
void PauseRecording();
|
||||
|
||||
/**
|
||||
* Resume a paused session.
|
||||
*/
|
||||
void ResumeRecording();
|
||||
|
||||
/**
|
||||
* Stop recording and generate a recording file under appDataDir/simpleperf_data/.
|
||||
*/
|
||||
void StopRecording();
|
||||
|
||||
private:
|
||||
ProfileSessionImpl* impl_;
|
||||
};
|
||||
|
||||
} // namespace simpleperf
|
||||
@ -0,0 +1,383 @@
|
||||
/*
|
||||
* Copyright (C) 2019 The Android Open Source Project
|
||||
*
|
||||
* Licensed under the Apache License, Version 2.0 (the "License");
|
||||
* you may not use this file except in compliance with the License.
|
||||
* You may obtain a copy of the License at
|
||||
*
|
||||
* http://www.apache.org/licenses/LICENSE-2.0
|
||||
*
|
||||
* Unless required by applicable law or agreed to in writing, software
|
||||
* distributed under the License is distributed on an "AS IS" BASIS,
|
||||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
* See the License for the specific language governing permissions and
|
||||
* limitations under the License.
|
||||
*/
|
||||
|
||||
package com.android.simpleperf;
|
||||
|
||||
import android.os.Build;
|
||||
import android.system.Os;
|
||||
import android.system.OsConstants;
|
||||
|
||||
import android.support.annotation.NonNull;
|
||||
import android.support.annotation.Nullable;
|
||||
import android.support.annotation.RequiresApi;
|
||||
|
||||
import java.io.BufferedReader;
|
||||
import java.io.File;
|
||||
import java.io.FileInputStream;
|
||||
import java.io.IOException;
|
||||
import java.io.InputStream;
|
||||
import java.io.InputStreamReader;
|
||||
import java.util.ArrayList;
|
||||
import java.util.List;
|
||||
import java.util.stream.Collectors;
|
||||
|
||||
/**
|
||||
* <p>
|
||||
* This class uses `simpleperf record` cmd to generate a recording file.
|
||||
* It allows users to start recording with some options, pause/resume recording
|
||||
* to profile only the code of interest, and stop recording.
|
||||
* </p>
|
||||
*
|
||||
* <p>
|
||||
* Example:
|
||||
* RecordOptions options = new RecordOptions();
|
||||
* options.recordDwarfCallGraph();
|
||||
* ProfileSession session = new ProfileSession();
|
||||
* session.startRecording(options);
|
||||
* Thread.sleep(1000);
|
||||
* session.pauseRecording();
|
||||
* Thread.sleep(1000);
|
||||
* session.resumeRecording();
|
||||
* Thread.sleep(1000);
|
||||
* session.stopRecording();
|
||||
* </p>
|
||||
*
|
||||
* <p>
|
||||
* It throws an Error when an error happens. To read error messages of the simpleperf record
|
||||
* process, filter logcat with `simpleperf`.
|
||||
* </p>
|
||||
*/
|
||||
@RequiresApi(28)
|
||||
public class ProfileSession {
|
||||
private static final String SIMPLEPERF_PATH_IN_IMAGE = "/system/bin/simpleperf";
|
||||
|
||||
enum State {
|
||||
NOT_YET_STARTED,
|
||||
STARTED,
|
||||
PAUSED,
|
||||
STOPPED,
|
||||
}
|
||||
|
||||
private State mState = State.NOT_YET_STARTED;
|
||||
private final String mAppDataDir;
|
||||
private String mSimpleperfPath;
|
||||
private final String mSimpleperfDataDir;
|
||||
private Process mSimpleperfProcess;
|
||||
private boolean mTraceOffCpu = false;
|
||||
|
||||
/**
|
||||
* @param appDataDir the same as android.content.Context.getDataDir().
|
||||
* ProfileSession stores profiling data in appDataDir/simpleperf_data/.
|
||||
*/
|
||||
public ProfileSession(@NonNull String appDataDir) {
|
||||
mAppDataDir = appDataDir;
|
||||
mSimpleperfDataDir = appDataDir + "/simpleperf_data";
|
||||
}
|
||||
|
||||
/**
|
||||
* ProfileSession assumes appDataDir as /data/data/app_package_name.
|
||||
*/
|
||||
public ProfileSession() {
|
||||
String packageName;
|
||||
try {
|
||||
String s = readInputStream(new FileInputStream("/proc/self/cmdline"));
|
||||
for (int i = 0; i < s.length(); i++) {
|
||||
if (s.charAt(i) == '\0') {
|
||||
s = s.substring(0, i);
|
||||
break;
|
||||
}
|
||||
}
|
||||
packageName = s;
|
||||
} catch (IOException e) {
|
||||
throw new Error("failed to find packageName: " + e.getMessage());
|
||||
}
|
||||
if (packageName.isEmpty()) {
|
||||
throw new Error("failed to find packageName");
|
||||
}
|
||||
final int AID_USER_OFFSET = 100000;
|
||||
int uid = Os.getuid();
|
||||
if (uid >= AID_USER_OFFSET) {
|
||||
int user_id = uid / AID_USER_OFFSET;
|
||||
mAppDataDir = "/data/user/" + user_id + "/" + packageName;
|
||||
} else {
|
||||
mAppDataDir = "/data/data/" + packageName;
|
||||
}
|
||||
mSimpleperfDataDir = mAppDataDir + "/simpleperf_data";
|
||||
}
|
||||
|
||||
/**
|
||||
* Start recording.
|
||||
* @param options RecordOptions
|
||||
*/
|
||||
public void startRecording(@NonNull RecordOptions options) {
|
||||
startRecording(options.toRecordArgs());
|
||||
}
|
||||
|
||||
/**
|
||||
* Start recording.
|
||||
* @param args arguments for `simpleperf record` cmd.
|
||||
*/
|
||||
public synchronized void startRecording(@NonNull List<String> args) {
|
||||
if (mState != State.NOT_YET_STARTED) {
|
||||
throw new IllegalStateException("startRecording: session in wrong state " + mState);
|
||||
}
|
||||
for (String arg : args) {
|
||||
if (arg.equals("--trace-offcpu")) {
|
||||
mTraceOffCpu = true;
|
||||
}
|
||||
}
|
||||
mSimpleperfPath = findSimpleperf();
|
||||
checkIfPerfEnabled();
|
||||
createSimpleperfDataDir();
|
||||
createSimpleperfProcess(mSimpleperfPath, args);
|
||||
mState = State.STARTED;
|
||||
}
|
||||
|
||||
/**
|
||||
* Pause recording. No samples are generated in paused state.
|
||||
*/
|
||||
public synchronized void pauseRecording() {
|
||||
if (mState != State.STARTED) {
|
||||
throw new IllegalStateException("pauseRecording: session in wrong state " + mState);
|
||||
}
|
||||
if (mTraceOffCpu) {
|
||||
throw new AssertionError(
|
||||
"--trace-offcpu option doesn't work well with pause/resume recording");
|
||||
}
|
||||
sendCmd("pause");
|
||||
mState = State.PAUSED;
|
||||
}
|
||||
|
||||
/**
|
||||
* Resume a paused session.
|
||||
*/
|
||||
public synchronized void resumeRecording() {
|
||||
if (mState != State.PAUSED) {
|
||||
throw new IllegalStateException("resumeRecording: session in wrong state " + mState);
|
||||
}
|
||||
sendCmd("resume");
|
||||
mState = State.STARTED;
|
||||
}
|
||||
|
||||
/**
|
||||
* Stop recording and generate a recording file under appDataDir/simpleperf_data/.
|
||||
*/
|
||||
public synchronized void stopRecording() {
|
||||
if (mState != State.STARTED && mState != State.PAUSED) {
|
||||
throw new IllegalStateException("stopRecording: session in wrong state " + mState);
|
||||
}
|
||||
if (Build.VERSION.SDK_INT == Build.VERSION_CODES.P + 1
|
||||
&& mSimpleperfPath.equals(SIMPLEPERF_PATH_IN_IMAGE)) {
|
||||
// The simpleperf shipped on Android Q contains a bug, which may make it abort if
|
||||
// calling simpleperfProcess.destroy().
|
||||
destroySimpleperfProcessWithoutClosingStdin();
|
||||
} else {
|
||||
mSimpleperfProcess.destroy();
|
||||
}
|
||||
try {
|
||||
int exitCode = mSimpleperfProcess.waitFor();
|
||||
if (exitCode != 0) {
|
||||
throw new AssertionError("simpleperf exited with error: " + exitCode);
|
||||
}
|
||||
} catch (InterruptedException e) {
|
||||
}
|
||||
mSimpleperfProcess = null;
|
||||
mState = State.STOPPED;
|
||||
}
|
||||
|
||||
private void destroySimpleperfProcessWithoutClosingStdin() {
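// Parse the pid out of Process.toString() and signal it directly, so the
// process's stdin isn't closed first (which Process.destroy() may do).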
|
||||
// In format "Process[pid=? ..."
|
||||
String s = mSimpleperfProcess.toString();
|
||||
final String prefix = "Process[pid=";
|
||||
if (s.startsWith(prefix)) {
|
||||
int startIndex = prefix.length();
|
||||
int endIndex = s.indexOf(',');
|
||||
if (endIndex > startIndex) {
|
||||
int pid = Integer.parseInt(s.substring(startIndex, endIndex).trim());
|
||||
android.os.Process.sendSignal(pid, OsConstants.SIGTERM);
|
||||
return;
|
||||
}
|
||||
}
|
||||
mSimpleperfProcess.destroy();
|
||||
}
|
||||
|
||||
private String readInputStream(InputStream in) {
|
||||
BufferedReader reader = new BufferedReader(new InputStreamReader(in));
|
||||
String result = reader.lines().collect(Collectors.joining("\n"));
|
||||
try {
|
||||
reader.close();
|
||||
} catch (IOException e) {
|
||||
}
|
||||
return result;
|
||||
}
|
||||
|
||||
private String findSimpleperf() {
|
||||
// 1. Try /data/local/tmp/simpleperf. Probably it's newer than /system/bin/simpleperf.
|
||||
String simpleperfPath = findSimpleperfInTempDir();
|
||||
if (simpleperfPath != null) {
|
||||
return simpleperfPath;
|
||||
}
|
||||
// 2. Try /system/bin/simpleperf, which is available on Android >= Q.
|
||||
simpleperfPath = SIMPLEPERF_PATH_IN_IMAGE;
|
||||
if (isExecutableFile(simpleperfPath)) {
|
||||
return simpleperfPath;
|
||||
}
|
||||
throw new Error("can't find simpleperf on device. Please run api_profiler.py.");
|
||||
}
|
||||
|
||||
private boolean isExecutableFile(@NonNull String path) {
|
||||
File file = new File(path);
|
||||
return file.canExecute();
|
||||
}
|
||||
|
||||
@Nullable
|
||||
private String findSimpleperfInTempDir() {
|
||||
String path = "/data/local/tmp/simpleperf";
|
||||
File file = new File(path);
|
||||
if (!file.isFile()) {
|
||||
return null;
|
||||
}
|
||||
// Copy it to app dir to execute it.
|
||||
String toPath = mAppDataDir + "/simpleperf";
|
||||
try {
|
||||
Process process = new ProcessBuilder()
|
||||
.command("cp", path, toPath).start();
|
||||
process.waitFor();
|
||||
} catch (Exception e) {
|
||||
return null;
|
||||
}
|
||||
if (!isExecutableFile(toPath)) {
|
||||
return null;
|
||||
}
|
||||
// For apps with target sdk >= 29, executing app data file isn't allowed.
|
||||
// For android R, app context isn't allowed to use perf_event_open.
|
||||
// So test executing downloaded simpleperf.
|
||||
try {
|
||||
Process process = new ProcessBuilder().command(toPath, "list", "sw").start();
|
||||
process.waitFor();
|
||||
String data = readInputStream(process.getInputStream());
|
||||
if (!data.contains("cpu-clock")) {
|
||||
return null;
|
||||
}
|
||||
} catch (Exception e) {
|
||||
return null;
|
||||
}
|
||||
return toPath;
|
||||
}
|
||||
|
||||
private void checkIfPerfEnabled() {
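// Recording is allowed when the profiling grant for this uid (set via
// api_profiler.py) hasn't expired yet, or when security.perf_harden is disabled.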
|
||||
if (getProperty("persist.simpleperf.profile_app_uid").equals("" + Os.getuid())) {
|
||||
String timeStr = getProperty("persist.simpleperf.profile_app_expiration_time");
|
||||
if (!timeStr.isEmpty()) {
|
||||
try {
|
||||
long expirationTime = Long.parseLong(timeStr);
|
||||
if (expirationTime > System.currentTimeMillis() / 1000) {
|
||||
return;
|
||||
}
|
||||
} catch (NumberFormatException e) {
|
||||
}
|
||||
}
|
||||
}
|
||||
if (getProperty("security.perf_harden") == "1") {
|
||||
throw new Error("Recording app isn't enabled on the device."
|
||||
+ " Please run api_profiler.py.");
|
||||
}
|
||||
}
|
||||
|
||||
private String getProperty(String name) {
|
||||
String value;
|
||||
Process process;
|
||||
try {
|
||||
process = new ProcessBuilder()
|
||||
.command("/system/bin/getprop", name).start();
|
||||
} catch (IOException e) {
|
||||
return "";
|
||||
}
|
||||
try {
|
||||
process.waitFor();
|
||||
} catch (InterruptedException e) {
|
||||
}
|
||||
return readInputStream(process.getInputStream());
|
||||
}
|
||||
|
||||
private void createSimpleperfDataDir() {
|
||||
File file = new File(mSimpleperfDataDir);
|
||||
if (!file.isDirectory()) {
|
||||
file.mkdir();
|
||||
}
|
||||
}
|
||||
|
||||
private void createSimpleperfProcess(String simpleperfPath, List<String> recordArgs) {
|
||||
// 1. Prepare simpleperf arguments.
|
||||
ArrayList<String> args = new ArrayList<>();
|
||||
args.add(simpleperfPath);
|
||||
args.add("record");
|
||||
args.add("--log-to-android-buffer");
|
||||
args.add("--log");
|
||||
args.add("debug");
|
||||
args.add("--stdio-controls-profiling");
|
||||
args.add("--in-app");
|
||||
args.add("--tracepoint-events");
|
||||
args.add("/data/local/tmp/tracepoint_events");
|
||||
args.addAll(recordArgs);
|
||||
|
||||
// 2. Create the simpleperf process.
|
||||
ProcessBuilder pb = new ProcessBuilder(args).directory(new File(mSimpleperfDataDir));
|
||||
try {
|
||||
mSimpleperfProcess = pb.start();
|
||||
} catch (IOException e) {
|
||||
throw new Error("failed to create simpleperf process: " + e.getMessage());
|
||||
}
|
||||
|
||||
// 3. Wait until simpleperf starts recording.
|
||||
String startFlag = readReply();
|
||||
if (!startFlag.equals("started")) {
|
||||
throw new Error("failed to receive simpleperf start flag");
|
||||
}
|
||||
}
|
||||
|
||||
private void sendCmd(@NonNull String cmd) {
|
||||
cmd += "\n";
|
||||
try {
|
||||
mSimpleperfProcess.getOutputStream().write(cmd.getBytes());
|
||||
mSimpleperfProcess.getOutputStream().flush();
|
||||
} catch (IOException e) {
|
||||
throw new Error("failed to send cmd to simpleperf: " + e.getMessage());
|
||||
}
|
||||
if (!readReply().equals("ok")) {
|
||||
throw new Error("failed to run cmd in simpleperf: " + cmd);
|
||||
}
|
||||
}
|
||||
|
||||
@NonNull
|
||||
private String readReply() {
|
||||
// Read one byte at a time to stop at line break or EOF. BufferedReader will try to read
|
||||
// more than is available and block us, so don't use it.
|
||||
String s = "";
|
||||
while (true) {
|
||||
int c = -1;
|
||||
try {
|
||||
c = mSimpleperfProcess.getInputStream().read();
|
||||
} catch (IOException e) {
|
||||
}
|
||||
if (c == -1 || c == '\n') {
|
||||
break;
|
||||
}
|
||||
s += (char) c;
|
||||
}
|
||||
return s;
|
||||
}
|
||||
}
|
||||
@ -0,0 +1,196 @@
|
||||
/*
|
||||
* Copyright (C) 2019 The Android Open Source Project
|
||||
*
|
||||
* Licensed under the Apache License, Version 2.0 (the "License");
|
||||
* you may not use this file except in compliance with the License.
|
||||
* You may obtain a copy of the License at
|
||||
*
|
||||
* http://www.apache.org/licenses/LICENSE-2.0
|
||||
*
|
||||
* Unless required by applicable law or agreed to in writing, software
|
||||
* distributed under the License is distributed on an "AS IS" BASIS,
|
||||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
* See the License for the specific language governing permissions and
|
||||
* limitations under the License.
|
||||
*/
|
||||
|
||||
package com.android.simpleperf;
|
||||
|
||||
import android.system.Os;
|
||||
|
||||
import android.support.annotation.NonNull;
|
||||
import android.support.annotation.Nullable;
|
||||
import android.support.annotation.RequiresApi;
|
||||
|
||||
import java.time.LocalDateTime;
|
||||
import java.time.format.DateTimeFormatter;
|
||||
import java.util.ArrayList;
|
||||
import java.util.List;
|
||||
|
||||
/**
|
||||
* <p>
|
||||
* This class sets record options used by ProfileSession. The options are
|
||||
* converted to a string list in toRecordArgs(), which is then passed to
|
||||
* `simpleperf record` cmd. Run `simpleperf record -h` or
|
||||
* `run_simpleperf_on_device.py record -h` for help messages.
|
||||
* </p>
|
||||
*
|
||||
* <p>
|
||||
* Example:
|
||||
* RecordOptions options = new RecordOptions();
|
||||
* options.setDuration(3).recordDwarfCallGraph().setOutputFilename("perf.data");
|
||||
* ProfileSession session = new ProfileSession();
|
||||
* session.startRecording(options);
|
||||
* </p>
|
||||
*/
|
||||
@RequiresApi(28)
|
||||
public class RecordOptions {
|
||||
|
||||
/**
|
||||
* Set output filename. Default is perf-<month>-<day>-<hour>-<minute>-<second>.data.
|
||||
* The file will be generated under simpleperf_data/.
|
||||
*/
|
||||
@NonNull
|
||||
public RecordOptions setOutputFilename(@NonNull String filename) {
|
||||
mOutputFilename = filename;
|
||||
return this;
|
||||
}
|
||||
|
||||
/**
|
||||
* Set event to record. Default is cpu-cycles. See `simpleperf list` for all available events.
|
||||
*/
|
||||
@NonNull
|
||||
public RecordOptions setEvent(@NonNull String event) {
|
||||
mEvent = event;
|
||||
return this;
|
||||
}
|
||||
|
||||
/**
|
||||
* Set how many samples to generate per second while running. Default is 4000.
|
||||
*/
|
||||
@NonNull
|
||||
public RecordOptions setSampleFrequency(int freq) {
|
||||
mFreq = freq;
|
||||
return this;
|
||||
}
|
||||
|
||||
/**
|
||||
* Set record duration. The record stops after `durationInSecond` seconds. By default,
|
||||
* record stops only when stopRecording() is called.
|
||||
*/
|
||||
@NonNull
|
||||
public RecordOptions setDuration(double durationInSecond) {
|
||||
mDurationInSeconds = durationInSecond;
|
||||
return this;
|
||||
}
|
||||
|
||||
/**
|
||||
* Record some threads in the app process. By default, record all threads in the process.
|
||||
*/
|
||||
@NonNull
|
||||
public RecordOptions setSampleThreads(@NonNull List<Integer> threads) {
|
||||
mThreads.addAll(threads);
|
||||
return this;
|
||||
}
|
||||
|
||||
/**
|
||||
* Record dwarf based call graph. It is needed to get Java callstacks.
|
||||
*/
|
||||
@NonNull
|
||||
public RecordOptions recordDwarfCallGraph() {
|
||||
mDwarfCallGraph = true;
|
||||
mFpCallGraph = false;
|
||||
return this;
|
||||
}
|
||||
|
||||
/**
|
||||
* Record frame pointer based call graph. It is suitable to get C++ callstacks on 64bit devices.
|
||||
*/
|
||||
@NonNull
|
||||
public RecordOptions recordFramePointerCallGraph() {
|
||||
mFpCallGraph = true;
|
||||
mDwarfCallGraph = false;
|
||||
return this;
|
||||
}
|
||||
|
||||
/**
|
||||
* Trace context switch info to show where threads spend time off cpu.
|
||||
*/
|
||||
@NonNull
|
||||
public RecordOptions traceOffCpu() {
|
||||
mTraceOffCpu = true;
|
||||
return this;
|
||||
}
|
||||
|
||||
/**
|
||||
* Translate record options into arguments for `simpleperf record` cmd.
|
||||
*/
|
||||
@NonNull
|
||||
public List<String> toRecordArgs() {
|
||||
ArrayList<String> args = new ArrayList<>();
|
||||
|
||||
String filename = mOutputFilename;
|
||||
if (filename == null) {
|
||||
filename = getDefaultOutputFilename();
|
||||
}
|
||||
args.add("-o");
|
||||
args.add(filename);
|
||||
args.add("-e");
|
||||
args.add(mEvent);
|
||||
args.add("-f");
|
||||
args.add(String.valueOf(mFreq));
|
||||
if (mDurationInSeconds != 0.0) {
|
||||
args.add("--duration");
|
||||
args.add(String.valueOf(mDurationInSeconds));
|
||||
}
|
||||
if (mThreads.isEmpty()) {
|
||||
args.add("-p");
|
||||
args.add(String.valueOf(Os.getpid()));
|
||||
} else {
|
||||
String s = "";
|
||||
for (int i = 0; i < mThreads.size(); i++) {
|
||||
if (i > 0) {
|
||||
s += ",";
|
||||
}
|
||||
s += mThreads.get(i).toString();
|
||||
}
|
||||
args.add("-t");
|
||||
args.add(s);
|
||||
}
|
||||
if (mDwarfCallGraph) {
|
||||
args.add("-g");
|
||||
} else if (mFpCallGraph) {
|
||||
args.add("--call-graph");
|
||||
args.add("fp");
|
||||
}
|
||||
if (mTraceOffCpu) {
|
||||
args.add("--trace-offcpu");
|
||||
}
|
||||
return args;
|
||||
}
|
||||
|
||||
private String getDefaultOutputFilename() {
|
||||
LocalDateTime time = LocalDateTime.now();
|
||||
DateTimeFormatter formatter = DateTimeFormatter.ofPattern("'perf'-MM-dd-HH-mm-ss'.data'");
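// e.g. "perf-02-08-15-04-05.data"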
|
||||
return time.format(formatter);
|
||||
}
|
||||
|
||||
@Nullable
|
||||
private String mOutputFilename;
|
||||
|
||||
@NonNull
|
||||
private String mEvent = "cpu-cycles";
|
||||
|
||||
private int mFreq = 4000;
|
||||
|
||||
private double mDurationInSeconds = 0.0;
|
||||
|
||||
@NonNull
|
||||
private ArrayList<Integer> mThreads = new ArrayList<>();
|
||||
|
||||
private boolean mDwarfCallGraph = false;
|
||||
|
||||
private boolean mFpCallGraph = false;
|
||||
|
||||
private boolean mTraceOffCpu = false;
|
||||
}
|
||||
547
Android/android-ndk-r27d/simpleperf/app_profiler.py
Normal file
@ -0,0 +1,547 @@
|
||||
#!/usr/bin/env python3
|
||||
#
|
||||
# Copyright (C) 2016 The Android Open Source Project
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
"""app_profiler.py: Record cpu profiling data of an android app or native program.
|
||||
|
||||
It downloads simpleperf on device, uses it to collect profiling data on the selected app,
|
||||
and pulls profiling data and related binaries on host.
|
||||
"""
|
||||
|
||||
import logging
|
||||
import os
|
||||
import os.path
|
||||
import re
|
||||
import subprocess
|
||||
import sys
|
||||
import time
|
||||
from typing import Optional
|
||||
|
||||
from simpleperf_utils import (
|
||||
AdbHelper, BaseArgumentParser, bytes_to_str, extant_dir, get_script_dir, get_target_binary_path,
|
||||
log_exit, ReadElf, remove, str_to_bytes)
|
||||
|
||||
NATIVE_LIBS_DIR_ON_DEVICE = '/data/local/tmp/native_libs/'
|
||||
|
||||
SHELL_PS_UID_PATTERN = re.compile(r'USER.*\nu(\d+)_.*')
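# Captures the Android user id from `ps -o USER` output, where app processes are
# listed with names like "u0_a123" (user 0, app id 123).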
|
||||
|
||||
|
||||
class HostElfEntry(object):
|
||||
"""Represent a native lib on host in NativeLibDownloader."""
|
||||
|
||||
def __init__(self, path, name, score):
|
||||
self.path = path
|
||||
self.name = name
|
||||
self.score = score
|
||||
|
||||
def __repr__(self):
|
||||
return self.__str__()
|
||||
|
||||
def __str__(self):
|
||||
return '[path: %s, name %s, score %s]' % (self.path, self.name, self.score)
|
||||
|
||||
|
||||
class NativeLibDownloader(object):
|
||||
"""Download native libs on device.
|
||||
|
||||
1. Collect info of all native libs in the native_lib_dir on host.
|
||||
2. Check the available native libs in /data/local/tmp/native_libs on device.
|
||||
3. Sync native libs on device.
|
||||
"""
|
||||
|
||||
def __init__(self, ndk_path, device_arch, adb):
|
||||
self.adb = adb
|
||||
self.readelf = ReadElf(ndk_path)
|
||||
self.device_arch = device_arch
|
||||
self.need_archs = self._get_need_archs()
|
||||
self.host_build_id_map = {} # Map from build_id to HostElfEntry.
|
||||
self.device_build_id_map = {} # Map from build_id to relative_path on device.
|
||||
# Map from filename to HostElfEntry for elf files without build id.
|
||||
self.no_build_id_file_map = {}
|
||||
self.name_count_map = {} # Used to give a unique name for each library.
|
||||
self.dir_on_device = NATIVE_LIBS_DIR_ON_DEVICE
|
||||
self.build_id_list_file = 'build_id_list'
|
||||
|
||||
def _get_need_archs(self):
|
||||
"""Return the archs of binaries needed on device."""
|
||||
if self.device_arch == 'arm64':
|
||||
return ['arm', 'arm64']
|
||||
if self.device_arch == 'arm':
|
||||
return ['arm']
|
||||
if self.device_arch == 'x86_64':
|
||||
return ['x86', 'x86_64']
|
||||
if self.device_arch == 'x86':
|
||||
return ['x86']
|
||||
return []
|
||||
|
||||
def collect_native_libs_on_host(self, native_lib_dir):
|
||||
self.host_build_id_map.clear()
|
||||
for root, _, files in os.walk(native_lib_dir):
|
||||
for name in files:
|
||||
if not name.endswith('.so'):
|
||||
continue
|
||||
self.add_native_lib_on_host(os.path.join(root, name), name)
|
||||
|
||||
def add_native_lib_on_host(self, path, name):
|
||||
arch = self.readelf.get_arch(path)
|
||||
if arch not in self.need_archs:
|
||||
return
|
||||
sections = self.readelf.get_sections(path)
|
||||
score = 0
|
||||
if '.debug_info' in sections:
|
||||
score = 3
|
||||
elif '.gnu_debugdata' in sections:
|
||||
score = 2
|
||||
elif '.symtab' in sections:
|
||||
score = 1
|
||||
build_id = self.readelf.get_build_id(path)
|
||||
if build_id:
|
||||
entry = self.host_build_id_map.get(build_id)
|
||||
if entry:
|
||||
if entry.score < score:
|
||||
entry.path = path
|
||||
entry.score = score
|
||||
else:
|
||||
repeat_count = self.name_count_map.get(name, 0)
|
||||
self.name_count_map[name] = repeat_count + 1
|
||||
unique_name = name if repeat_count == 0 else name + '_' + str(repeat_count)
|
||||
self.host_build_id_map[build_id] = HostElfEntry(path, unique_name, score)
|
||||
else:
|
||||
entry = self.no_build_id_file_map.get(name)
|
||||
if entry:
|
||||
if entry.score < score:
|
||||
entry.path = path
|
||||
entry.score = score
|
||||
else:
|
||||
self.no_build_id_file_map[name] = HostElfEntry(path, name, score)
|
||||
|
||||
def collect_native_libs_on_device(self):
|
||||
self.device_build_id_map.clear()
|
||||
self.adb.check_run(['shell', 'mkdir', '-p', self.dir_on_device])
|
||||
if os.path.exists(self.build_id_list_file):
|
||||
os.remove(self.build_id_list_file)
|
||||
result, output = self.adb.run_and_return_output(['shell', 'ls', self.dir_on_device])
|
||||
if not result:
|
||||
return
|
||||
file_set = set(output.strip().split())
|
||||
if self.build_id_list_file not in file_set:
|
||||
return
|
||||
self.adb.run(['pull', self.dir_on_device + self.build_id_list_file])
|
||||
if os.path.exists(self.build_id_list_file):
|
||||
with open(self.build_id_list_file, 'rb') as fh:
|
||||
for line in fh.readlines():
|
||||
line = bytes_to_str(line).strip()
|
||||
items = line.split('=')
|
||||
if len(items) == 2:
|
||||
build_id, filename = items
|
||||
if filename in file_set:
|
||||
self.device_build_id_map[build_id] = filename
|
||||
remove(self.build_id_list_file)
|
||||
|
||||
def sync_native_libs_on_device(self):
|
||||
# Push missing native libs on device.
|
||||
for build_id in self.host_build_id_map:
|
||||
if build_id not in self.device_build_id_map:
|
||||
entry = self.host_build_id_map[build_id]
|
||||
self.adb.check_run(['push', entry.path, self.dir_on_device + entry.name])
|
||||
# Remove native libs that don't exist on host.
|
||||
for build_id in self.device_build_id_map:
|
||||
if build_id not in self.host_build_id_map:
|
||||
name = self.device_build_id_map[build_id]
|
||||
self.adb.run(['shell', 'rm', self.dir_on_device + name])
|
||||
# Push new build_id_list on device.
|
||||
with open(self.build_id_list_file, 'wb') as fh:
|
||||
for build_id in self.host_build_id_map:
|
||||
s = str_to_bytes('%s=%s\n' % (build_id, self.host_build_id_map[build_id].name))
|
||||
fh.write(s)
|
||||
self.adb.check_run(['push', self.build_id_list_file,
|
||||
self.dir_on_device + self.build_id_list_file])
|
||||
os.remove(self.build_id_list_file)
|
||||
|
||||
# Push elf files without build id on device.
|
||||
for entry in self.no_build_id_file_map.values():
|
||||
target = self.dir_on_device + entry.name
|
||||
|
||||
# Skip download if we have a file with the same name and size on device.
|
||||
result, output = self.adb.run_and_return_output(['shell', 'ls', '-l', target])
|
||||
if result:
|
||||
items = output.split()
|
||||
if len(items) > 5:
|
||||
try:
|
||||
file_size = int(items[4])
|
||||
except ValueError:
|
||||
file_size = 0
|
||||
if file_size == os.path.getsize(entry.path):
|
||||
continue
|
||||
self.adb.check_run(['push', entry.path, target])
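# A minimal sketch of driving NativeLibDownloader by hand; `ndk_path` and
# `native_lib_dir` are hypothetical placeholders, and ProfilerBase.download_libs()
# below performs the same steps during a normal run.
def _sync_native_libs_example(ndk_path, native_lib_dir):
    adb = AdbHelper(enable_switch_to_root=True)
    downloader = NativeLibDownloader(ndk_path, adb.get_device_arch(), adb)
    downloader.collect_native_libs_on_host(native_lib_dir)
    downloader.collect_native_libs_on_device()
    downloader.sync_native_libs_on_device()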
|
||||
|
||||
|
||||
class ProfilerBase(object):
|
||||
"""Base class of all Profilers."""
|
||||
|
||||
def __init__(self, args):
|
||||
self.args = args
|
||||
self.adb = AdbHelper(enable_switch_to_root=not args.disable_adb_root)
|
||||
if not self.adb.is_device_available():
|
||||
log_exit('No Android device is connected via ADB.')
|
||||
self.is_root_device = self.adb.switch_to_root()
|
||||
self.android_version = self.adb.get_android_version()
|
||||
if self.android_version < 7:
|
||||
log_exit("""app_profiler.py isn't supported on Android < N, please switch to use
|
||||
simpleperf binary directly.""")
|
||||
self.device_arch = self.adb.get_device_arch()
|
||||
self.record_subproc = None
|
||||
|
||||
def profile(self):
|
||||
logging.info('prepare profiling')
|
||||
self.prepare()
|
||||
logging.info('start profiling')
|
||||
self.start()
|
||||
self.wait_profiling()
|
||||
logging.info('collect profiling data')
|
||||
self.collect_profiling_data()
|
||||
logging.info('profiling is finished.')
|
||||
|
||||
def prepare(self):
|
||||
"""Prepare recording. """
|
||||
self.download_simpleperf()
|
||||
if self.args.native_lib_dir:
|
||||
self.download_libs()
|
||||
|
||||
def download_simpleperf(self):
|
||||
simpleperf_binary = get_target_binary_path(self.device_arch, 'simpleperf')
|
||||
self.adb.check_run(['push', simpleperf_binary, '/data/local/tmp'])
|
||||
self.adb.check_run(['shell', 'chmod', 'a+x', '/data/local/tmp/simpleperf'])
|
||||
|
||||
def download_libs(self):
|
||||
downloader = NativeLibDownloader(self.args.ndk_path, self.device_arch, self.adb)
|
||||
downloader.collect_native_libs_on_host(self.args.native_lib_dir)
|
||||
downloader.collect_native_libs_on_device()
|
||||
downloader.sync_native_libs_on_device()
|
||||
|
||||
def start(self):
|
||||
raise NotImplementedError
|
||||
|
||||
def start_profiling(self, target_args):
|
||||
"""Start simpleperf record process on device."""
|
||||
args = ['/data/local/tmp/simpleperf', 'record', '-o', '/data/local/tmp/perf.data',
|
||||
self.args.record_options]
|
||||
if self.adb.run(['shell', 'ls', NATIVE_LIBS_DIR_ON_DEVICE]):
|
||||
args += ['--symfs', NATIVE_LIBS_DIR_ON_DEVICE]
|
||||
args += ['--log', self.args.log]
|
||||
args += target_args
|
||||
adb_args = [self.adb.adb_path, 'shell'] + args
|
||||
logging.info('run adb cmd: %s' % adb_args)
|
||||
self.record_subproc = subprocess.Popen(adb_args)
|
||||
|
||||
def wait_profiling(self):
|
||||
"""Wait until profiling finishes, or stop profiling when user presses Ctrl-C."""
|
||||
returncode = None
|
||||
try:
|
||||
returncode = self.record_subproc.wait()
|
||||
except KeyboardInterrupt:
|
||||
self.stop_profiling()
|
||||
self.record_subproc = None
|
||||
# Don't check the return value of record_subproc, because it also
|
||||
# receives Ctrl-C and always returns non-zero.
|
||||
returncode = 0
|
||||
logging.debug('profiling result [%s]' % (returncode == 0))
|
||||
if returncode != 0:
|
||||
log_exit('Failed to record profiling data.')
|
||||
|
||||
def stop_profiling(self):
|
||||
"""Stop profiling by sending SIGINT to simpleperf, and wait until it exits
|
||||
to make sure perf.data is completely generated."""
|
||||
has_killed = False
|
||||
while True:
|
||||
(result, _) = self.adb.run_and_return_output(['shell', 'pidof', 'simpleperf'])
|
||||
if not result:
|
||||
break
|
||||
if not has_killed:
|
||||
has_killed = True
|
||||
self.adb.run_and_return_output(['shell', 'pkill', '-l', '2', 'simpleperf'])
|
||||
time.sleep(1)
|
||||
|
||||
def collect_profiling_data(self):
|
||||
self.adb.check_run_and_return_output(['pull', '/data/local/tmp/perf.data',
|
||||
self.args.perf_data_path])
|
||||
if not self.args.skip_collect_binaries:
|
||||
binary_cache_args = [sys.executable,
|
||||
os.path.join(get_script_dir(), 'binary_cache_builder.py')]
|
||||
binary_cache_args += ['-i', self.args.perf_data_path, '--log', self.args.log]
|
||||
if self.args.native_lib_dir:
|
||||
binary_cache_args += ['-lib', self.args.native_lib_dir]
|
||||
if self.args.disable_adb_root:
|
||||
binary_cache_args += ['--disable_adb_root']
|
||||
if self.args.ndk_path:
|
||||
binary_cache_args += ['--ndk_path', self.args.ndk_path]
|
||||
subprocess.check_call(binary_cache_args)
|
||||
|
||||
|
||||
class AppProfiler(ProfilerBase):
|
||||
"""Profile an Android app."""
|
||||
|
||||
def prepare(self):
|
||||
super(AppProfiler, self).prepare()
|
||||
self.app_versioncode = self.get_app_versioncode()
|
||||
if self.args.compile_java_code:
|
||||
self.compile_java_code()
|
||||
|
||||
def get_app_versioncode(self) -> Optional[str]:
|
||||
result, output = self.adb.run_and_return_output(
|
||||
['shell', 'pm', 'list', 'packages', '--show-versioncode'])
|
||||
if not result:
|
||||
return None
|
||||
prefix = f'package:{self.args.app} '
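# Matching lines look like "package:<package_name> versionCode:<code>".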
|
||||
for line in output.splitlines():
|
||||
if line.startswith(prefix):
|
||||
pos = line.find('versionCode:')
|
||||
if pos != -1:
|
||||
return line[pos + len('versionCode:'):].strip()
|
||||
return None
|
||||
|
||||
def compile_java_code(self):
|
||||
self.kill_app_process()
|
||||
# Fully compile Java code on Android >= N.
|
||||
self.adb.set_property('debug.generate-debug-info', 'true')
|
||||
self.adb.check_run(['shell', 'cmd', 'package', 'compile', '-f', '-m', 'speed',
|
||||
self.args.app])
|
||||
|
||||
def kill_app_process(self):
|
||||
if self.find_app_process():
|
||||
self.adb.check_run(['shell', 'am', 'force-stop', self.args.app])
|
||||
count = 0
|
||||
while True:
|
||||
time.sleep(1)
|
||||
pid = self.find_app_process()
|
||||
if not pid:
|
||||
break
|
||||
count += 1
|
||||
if count >= 5:
|
||||
logging.info('unable to kill %s, skipping...' % self.args.app)
|
||||
break
|
||||
# When testing on Android N, `am force-stop` sometimes can't kill
|
||||
# com.example.simpleperf.simpleperfexampleofkotlin. So use kill when this happens.
|
||||
if count >= 3:
|
||||
self.run_in_app_dir(['kill', '-9', str(pid)])
|
||||
|
||||
def find_app_process(self):
|
||||
result, pidof_output = self.adb.run_and_return_output(
|
||||
['shell', 'pidof', self.args.app])
|
||||
if not result:
|
||||
return None
|
||||
result, current_user = self.adb.run_and_return_output(
|
||||
['shell', 'am', 'get-current-user'])
|
||||
if not result:
|
||||
return None
|
||||
pids = pidof_output.split()
|
||||
for pid in pids:
|
||||
result, ps_output = self.adb.run_and_return_output(
|
||||
['shell', 'ps', '-p', pid, '-o', 'USER'])
|
||||
if not result:
|
||||
return None
|
||||
uid = SHELL_PS_UID_PATTERN.search(ps_output).group(1)
|
||||
if uid == current_user.strip():
|
||||
return int(pid)
|
||||
return None
|
||||
|
||||
def run_in_app_dir(self, args):
|
||||
if self.is_root_device:
|
||||
adb_args = ['shell', 'cd /data/data/' + self.args.app + ' && ' + (' '.join(args))]
|
||||
else:
|
||||
adb_args = ['shell', 'run-as', self.args.app] + args
|
||||
return self.adb.run_and_return_output(adb_args)
|
||||
|
||||
def start(self):
|
||||
if self.args.launch or self.args.activity or self.args.test:
|
||||
self.kill_app_process()
|
||||
args = ['--app', self.args.app]
|
||||
if self.app_versioncode:
|
||||
args += ['--add-meta-info', f'app_versioncode={self.app_versioncode}']
|
||||
self.start_profiling(args)
|
||||
if self.args.launch:
|
||||
self.start_app()
|
||||
if self.args.activity:
|
||||
self.start_activity()
|
||||
elif self.args.test:
|
||||
self.start_test()
|
||||
# else: no need to start an activity or test.
|
||||
|
||||
def start_app(self):
|
||||
result = self.adb.run(['shell', 'monkey', '-p', self.args.app, '1'])
|
||||
if not result:
|
||||
self.record_subproc.terminate()
|
||||
log_exit(f"Can't start {self.args.app}")
|
||||
|
||||
def start_activity(self):
|
||||
activity = self.args.app + '/' + self.args.activity
|
||||
result = self.adb.run(['shell', 'am', 'start', '-n', activity])
|
||||
if not result:
|
||||
self.record_subproc.terminate()
|
||||
log_exit("Can't start activity %s" % activity)
|
||||
|
||||
def start_test(self):
|
||||
runner = self.args.app + '/androidx.test.runner.AndroidJUnitRunner'
|
||||
result = self.adb.run(['shell', 'am', 'instrument', '-e', 'class',
|
||||
self.args.test, runner])
|
||||
if not result:
|
||||
self.record_subproc.terminate()
|
||||
log_exit("Can't start instrumentation test %s" % self.args.test)
|
||||
|
||||
|
||||
class NativeProgramProfiler(ProfilerBase):
|
||||
"""Profile a native program."""
|
||||
|
||||
def start(self):
|
||||
logging.info('Waiting for native process %s' % self.args.native_program)
|
||||
while True:
|
||||
(result, pid) = self.adb.run_and_return_output(['shell', 'pidof',
|
||||
self.args.native_program])
|
||||
if not result:
|
||||
# Wait for 1 millisecond.
|
||||
time.sleep(0.001)
|
||||
else:
|
||||
self.start_profiling(['-p', str(int(pid))])
|
||||
break
|
||||
|
||||
|
||||
class NativeCommandProfiler(ProfilerBase):
|
||||
"""Profile running a native command."""
|
||||
|
||||
def start(self):
|
||||
self.start_profiling([self.args.cmd])
|
||||
|
||||
|
||||
class NativeProcessProfiler(ProfilerBase):
|
||||
"""Profile processes given their pids."""
|
||||
|
||||
def start(self):
|
||||
self.start_profiling(['-p', ','.join(self.args.pid)])
|
||||
|
||||
|
||||
class NativeThreadProfiler(ProfilerBase):
|
||||
"""Profile threads given their tids."""
|
||||
|
||||
def start(self):
|
||||
self.start_profiling(['-t', ','.join(self.args.tid)])
|
||||
|
||||
|
||||
class SystemWideProfiler(ProfilerBase):
|
||||
"""Profile system wide."""
|
||||
|
||||
def start(self):
|
||||
self.start_profiling(['-a'])
|
||||
|
||||
|
||||
def main():
|
||||
parser = BaseArgumentParser(description=__doc__)
|
||||
|
||||
target_group = parser.add_argument_group(title='Select profiling target'
|
||||
).add_mutually_exclusive_group(required=True)
|
||||
target_group.add_argument('-p', '--app', help="""Profile an Android app, given the package name.
|
||||
Like `-p com.example.android.myapp`.""")
|
||||
|
||||
target_group.add_argument('-np', '--native_program', help="""Profile a native program running on
|
||||
the Android device. Like `-np surfaceflinger`.""")
|
||||
|
||||
target_group.add_argument('-cmd', help="""Profile running a command on the Android device.
|
||||
Like `-cmd "pm -l"`.""")
|
||||
|
||||
target_group.add_argument('--pid', nargs='+', help="""Profile native processes running on device
|
||||
given their process ids.""")
|
||||
|
||||
target_group.add_argument('--tid', nargs='+', help="""Profile native threads running on device
|
||||
given their thread ids.""")
|
||||
|
||||
target_group.add_argument('--system_wide', action='store_true', help="""Profile system wide.""")
|
||||
|
||||
app_target_group = parser.add_argument_group(title='Extra options for profiling an app')
|
||||
app_target_group.add_argument('--compile_java_code', action='store_true', help="""Used with -p.
|
||||
On Android N and Android O, we need to compile Java code into
|
||||
native instructions to profile Java code. Android O also needs
|
||||
wrap.sh in the apk to use the native instructions.""")
|
||||
|
||||
app_start_group = app_target_group.add_mutually_exclusive_group()
|
||||
app_start_group.add_argument('--launch', action='store_true', help="""Used with -p. Profile the
|
||||
launch time of an Android app. The app will be started or
|
||||
restarted.""")
|
||||
app_start_group.add_argument('-a', '--activity', help="""Used with -p. Profile the launch time
|
||||
of an activity in an Android app. The app will be started or
|
||||
restarted to run the activity. Like `-a .MainActivity`.""")
|
||||
|
||||
app_start_group.add_argument('-t', '--test', help="""Used with -p. Profile the launch time of an
|
||||
instrumentation test in an Android app. The app will be started or
|
||||
restarted to run the instrumentation test. Like
|
||||
`-t test_class_name`.""")
|
||||
|
||||
record_group = parser.add_argument_group('Select recording options')
|
||||
record_group.add_argument('-r', '--record_options',
|
||||
default='-e task-clock:u -f 1000 -g --duration 10', help="""Set
|
||||
recording options for `simpleperf record` command. Use
|
||||
`run_simpleperf_on_device.py record -h` to see all accepted options.
|
||||
Default is "-e task-clock:u -f 1000 -g --duration 10".""")
|
||||
|
||||
record_group.add_argument('-lib', '--native_lib_dir', type=extant_dir,
|
||||
help="""When profiling an Android app containing native libraries,
|
||||
the native libraries are usually stripped and lack the symbols
|
||||
and debug information needed to provide a good profiling result. By
|
||||
using -lib, you tell app_profiler.py the path storing
|
||||
unstripped native libraries, and app_profiler.py will search
|
||||
all shared libraries with suffix .so in the directory. Then
|
||||
the native libraries will be downloaded on device and
|
||||
collected in build_cache.""")
|
||||
|
||||
record_group.add_argument('-o', '--perf_data_path', default='perf.data',
|
||||
help='The path to store profiling data. Default is perf.data.')
|
||||
|
||||
record_group.add_argument('-nb', '--skip_collect_binaries', action='store_true',
|
||||
help="""By default we collect binaries used in profiling data from
|
||||
device to binary_cache directory. It can be used to annotate
|
||||
source code and disassembly. This option skips it.""")
|
||||
|
||||
other_group = parser.add_argument_group('Other options')
|
||||
other_group.add_argument('--ndk_path', type=extant_dir,
|
||||
help="""Set the path of a ndk release. app_profiler.py needs some
|
||||
tools in ndk, like readelf.""")
|
||||
|
||||
other_group.add_argument('--disable_adb_root', action='store_true',
|
||||
help="""Force adb to run in non root mode. By default, app_profiler.py
|
||||
will try to switch to root mode to be able to profile released
|
||||
Android apps.""")
|
||||
|
||||
def check_args(args):
|
||||
if (not args.app) and (args.compile_java_code or args.activity or args.test):
|
||||
log_exit('--compile_java_code, -a, -t can only be used when profiling an Android app.')
|
||||
|
||||
args = parser.parse_args()
|
||||
check_args(args)
|
||||
if args.app:
|
||||
profiler = AppProfiler(args)
|
||||
elif args.native_program:
|
||||
profiler = NativeProgramProfiler(args)
|
||||
elif args.cmd:
|
||||
profiler = NativeCommandProfiler(args)
|
||||
elif args.pid:
|
||||
profiler = NativeProcessProfiler(args)
|
||||
elif args.tid:
|
||||
profiler = NativeThreadProfiler(args)
|
||||
elif args.system_wide:
|
||||
profiler = SystemWideProfiler(args)
|
||||
profiler.profile()
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
|
||||
BIN
Android/android-ndk-r27d/simpleperf/bin/android/arm/simpleperf
Normal file
BIN
Android/android-ndk-r27d/simpleperf/bin/android/arm64/simpleperf
Normal file
BIN
Android/android-ndk-r27d/simpleperf/bin/android/x86/simpleperf
Normal file
351
Android/android-ndk-r27d/simpleperf/binary_cache_builder.py
Normal file
@ -0,0 +1,351 @@
|
||||
#!/usr/bin/env python3
|
||||
#
|
||||
# Copyright (C) 2016 The Android Open Source Project
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
"""binary_cache_builder.py: read perf.data, collect binaries needed by
|
||||
it, and put them in binary_cache.
|
||||
"""
|
||||
|
||||
from collections import defaultdict
|
||||
import logging
|
||||
import os
|
||||
import os.path
|
||||
from pathlib import Path
|
||||
import shutil
|
||||
import sys
|
||||
from typing import Dict, List, Optional, Tuple, Union
|
||||
|
||||
from simpleperf_report_lib import ReportLib
|
||||
from simpleperf_utils import (
|
||||
AdbHelper, BaseArgumentParser, extant_dir, extant_file, flatten_arg_list,
|
||||
ReadElf, str_to_bytes)
|
||||
|
||||
|
||||
def is_jit_symfile(dso_name):
|
||||
return dso_name.split('/')[-1].startswith('TemporaryFile')
|
||||
|
||||
|
||||
class BinaryCache:
|
||||
def __init__(self, binary_dir: Path):
|
||||
self.binary_dir = binary_dir
|
||||
|
||||
def get_path_in_cache(self, device_path: str, build_id: str) -> Path:
|
||||
""" Given a binary path in perf.data, return its corresponding path in the cache.
|
||||
"""
|
||||
if build_id:
|
||||
filename = device_path.split('/')[-1]
|
||||
# Add build id to make the filename unique.
|
||||
return self.binary_dir / build_id[2:] / filename
|
||||
|
||||
# For an elf file without a build id, we can only follow its path on device. Otherwise,
|
||||
# simpleperf can't find it. However, we don't prefer this approach, because:
|
||||
# 1) It doesn't work for native libs loaded directly from apk
|
||||
# (android:extractNativeLibs="false").
|
||||
# 2) It may exceed path limit on windows.
|
||||
if device_path.startswith('/'):
|
||||
device_path = device_path[1:]
|
||||
device_path = device_path.replace('/', os.sep)
|
||||
return Path(os.path.join(self.binary_dir, device_path))
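# For illustration (hypothetical values), and assuming build ids are reported
# with a two-character "0x" prefix that is stripped above:
#   get_path_in_cache('/system/lib64/libfoo.so', '0xdeadbeef')
#     -> binary_cache/deadbeef/libfoo.so
#   get_path_in_cache('/vendor/lib64/libbar.so', '')
#     -> binary_cache/vendor/lib64/libbar.so  (mirrors the device path)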
|
||||
|
||||
|
||||
class BinarySource:
|
||||
""" Source to find debug binaries. """
|
||||
|
||||
def __init__(self, readelf: ReadElf):
|
||||
self.readelf = readelf
|
||||
|
||||
def collect_binaries(self, binaries: Dict[str, str], binary_cache: BinaryCache):
|
||||
""" pull binaries needed in perf.data to binary_cache.
|
||||
binaries: maps from binary path to its build_id in perf.data.
|
||||
"""
|
||||
raise Exception('not implemented')
|
||||
|
||||
def read_build_id(self, path: Path):
|
||||
return self.readelf.get_build_id(path)
|
||||
|
||||
|
||||
class BinarySourceFromDevice(BinarySource):
|
||||
""" Pull binaries from device. """
|
||||
|
||||
def __init__(self, readelf: ReadElf, disable_adb_root: bool):
|
||||
super().__init__(readelf)
|
||||
self.adb = AdbHelper(enable_switch_to_root=not disable_adb_root)
|
||||
|
||||
def collect_binaries(self, binaries: Dict[str, str], binary_cache: BinaryCache):
|
||||
if not self.adb.is_device_available():
|
||||
return
|
||||
for path, build_id in binaries.items():
|
||||
self.collect_binary(path, build_id, binary_cache)
|
||||
self.pull_kernel_symbols(binary_cache.binary_dir / 'kallsyms')
|
||||
|
||||
def collect_binary(self, path: str, build_id: str, binary_cache: BinaryCache):
|
||||
if not path.startswith('/') or path == "//anon" or path.startswith("/dev/"):
|
||||
# [kernel.kallsyms], unknown, or something we can't find a binary for.
|
||||
return
|
||||
binary_cache_file = binary_cache.get_path_in_cache(path, build_id)
|
||||
self.check_and_pull_binary(path, build_id, binary_cache_file)
|
||||
|
||||
def check_and_pull_binary(self, path: str, expected_build_id: str, binary_cache_file: Path):
|
||||
"""If the binary_cache_file exists and has the expected_build_id, there
|
||||
is no need to pull the binary from device. Otherwise, pull it.
|
||||
"""
|
||||
if binary_cache_file.is_file() and (
|
||||
not expected_build_id or expected_build_id == self.read_build_id(binary_cache_file)
|
||||
):
|
||||
logging.info('use current file in binary_cache: %s', binary_cache_file)
|
||||
else:
|
||||
logging.info('pull file to binary_cache: %s to %s', path, binary_cache_file)
|
||||
target_dir = binary_cache_file.parent
|
||||
try:
|
||||
os.makedirs(target_dir, exist_ok=True)
|
||||
if binary_cache_file.is_file():
|
||||
binary_cache_file.unlink()
|
||||
success = self.pull_file_from_device(path, binary_cache_file)
|
||||
except FileNotFoundError:
|
||||
# It happens on windows when the filename or extension is too long.
|
||||
success = False
|
||||
if not success:
|
||||
logging.warning('failed to pull %s from device', path)
|
||||
|
||||
def pull_file_from_device(self, device_path: str, host_path: Path) -> bool:
|
||||
if self.adb.run(['pull', device_path, str(host_path)]):
|
||||
return True
|
||||
# On non-root devices, we can't pull /data/app/XXX/base.odex directly.
|
||||
# Instead, we can first copy the file to /data/local/tmp, then pull it.
|
||||
filename = device_path[device_path.rfind('/')+1:]
|
||||
if (self.adb.run(['shell', 'cp', device_path, '/data/local/tmp']) and
|
||||
self.adb.run(['pull', '/data/local/tmp/' + filename, host_path])):
|
||||
self.adb.run(['shell', 'rm', '/data/local/tmp/' + filename])
|
||||
return True
|
||||
return False
|
||||
|
||||
def pull_kernel_symbols(self, file_path: Path):
|
||||
if file_path.is_file():
|
||||
file_path.unlink()
|
||||
if self.adb.switch_to_root():
|
||||
self.adb.run(['shell', 'echo', '0', '>/proc/sys/kernel/kptr_restrict'])
|
||||
self.adb.run(['pull', '/proc/kallsyms', file_path])
|
||||
|
||||
|
||||
class BinarySourceFromLibDirs(BinarySource):
|
||||
""" Collect binaries from lib dirs. """
|
||||
|
||||
def __init__(self, readelf: ReadElf, lib_dirs: List[Path]):
|
||||
super().__init__(readelf)
|
||||
self.lib_dirs = lib_dirs
|
||||
self.filename_map = None
|
||||
self.build_id_map = None
|
||||
self.binary_cache = None
|
||||
|
||||
def collect_binaries(self, binaries: Dict[str, str], binary_cache: BinaryCache):
|
||||
self.create_filename_map(binaries)
|
||||
self.create_build_id_map(binaries)
|
||||
self.binary_cache = binary_cache
|
||||
|
||||
# Search all files in lib_dirs, and copy matching files to build_cache.
|
||||
for lib_dir in self.lib_dirs:
|
||||
if self.is_platform_symbols_dir(lib_dir):
|
||||
self.search_platform_symbols_dir(lib_dir)
|
||||
else:
|
||||
self.search_dir(lib_dir)
|
||||
|
||||
def create_filename_map(self, binaries: Dict[str, str]):
|
||||
""" Create a map mapping from filename to binaries having the name. """
|
||||
self.filename_map = defaultdict(list)
|
||||
for path, build_id in binaries.items():
|
||||
index = path.rfind('/')
|
||||
filename = path[index + 1:]
|
||||
self.filename_map[filename].append((path, build_id))
|
||||
|
||||
def create_build_id_map(self, binaries: Dict[str, str]):
|
||||
""" Create a map mapping from build id to binary path. """
|
||||
self.build_id_map = {}
|
||||
for path, build_id in binaries.items():
|
||||
if build_id:
|
||||
self.build_id_map[build_id] = path
|
||||
|
||||
def is_platform_symbols_dir(self, lib_dir: Path):
|
||||
""" Check if lib_dir points to $ANDROID_PRODUCT_OUT/symbols. """
|
||||
subdir_names = [p.name for p in lib_dir.iterdir()]
|
||||
return lib_dir.name == 'symbols' and 'system' in subdir_names
|
||||
|
||||
def search_platform_symbols_dir(self, lib_dir: Path):
|
||||
""" Platform symbols dir contains too many binaries. Reading build ids for
|
||||
all of them takes a long time. So we only read build ids for binaries
|
||||
whose names exist in filename_map.
|
||||
"""
|
||||
for root, _, files in os.walk(lib_dir):
|
||||
for filename in files:
|
||||
binaries = self.filename_map.get(filename)
|
||||
if not binaries:
|
||||
continue
|
||||
file_path = Path(os.path.join(root, filename))
|
||||
build_id = self.read_build_id(file_path)
|
||||
for path, expected_build_id in binaries:
|
||||
if expected_build_id == build_id:
|
||||
self.copy_to_binary_cache(file_path, build_id, path)
|
||||
|
||||
def search_dir(self, lib_dir: Path):
|
||||
""" For a normal lib dir, it's unlikely to contain many binaries. So we can read
|
||||
build ids for all binaries in it. But users may give debug binaries with a name
|
||||
different from the one recorded in perf.data. So we should only rely on build id
|
||||
if it is available.
|
||||
"""
|
||||
for root, _, files in os.walk(lib_dir):
|
||||
for filename in files:
|
||||
file_path = Path(os.path.join(root, filename))
|
||||
build_id = self.read_build_id(file_path)
|
||||
if build_id:
|
||||
# For elf file with build id, use build id to match.
|
||||
device_path = self.build_id_map.get(build_id)
|
||||
if device_path:
|
||||
self.copy_to_binary_cache(file_path, build_id, device_path)
|
||||
elif self.readelf.is_elf_file(file_path):
|
||||
# For elf file without build id, use filename to match.
|
||||
for path, expected_build_id in self.filename_map.get(filename, []):
|
||||
if not expected_build_id:
|
||||
self.copy_to_binary_cache(file_path, '', path)
|
||||
break
|
||||
|
||||
def copy_to_binary_cache(
|
||||
self, from_path: Path, expected_build_id: str, device_path: str):
|
||||
to_path = self.binary_cache.get_path_in_cache(device_path, expected_build_id)
|
||||
if not self.need_to_copy(from_path, to_path, expected_build_id):
|
||||
# The existing file in binary_cache can provide more information, so no need to copy.
|
||||
return
|
||||
to_dir = to_path.parent
|
||||
if not to_dir.is_dir():
|
||||
os.makedirs(to_dir)
|
||||
logging.info('copy to binary_cache: %s to %s', from_path, to_path)
|
||||
shutil.copy(from_path, to_path)
|
||||
|
||||
def need_to_copy(self, from_path: Path, to_path: Path, expected_build_id: str):
|
||||
if not to_path.is_file() or self.read_build_id(to_path) != expected_build_id:
|
||||
return True
|
||||
return self.get_file_stripped_level(from_path) < self.get_file_stripped_level(to_path)
|
||||
|
||||
def get_file_stripped_level(self, path: Path) -> int:
|
||||
"""Return stripped level of an ELF file. Larger value means more stripped."""
|
||||
sections = self.readelf.get_sections(path)
|
||||
if '.debug_line' in sections:
|
||||
return 0
|
||||
if '.symtab' in sections:
|
||||
return 1
|
||||
return 2
|
||||
|
||||
|
||||
class BinaryCacheBuilder:
|
||||
"""Collect all binaries needed by perf.data in binary_cache."""
|
||||
|
||||
def __init__(self, ndk_path: Optional[str], disable_adb_root: bool):
|
||||
self.readelf = ReadElf(ndk_path)
|
||||
self.device_source = BinarySourceFromDevice(self.readelf, disable_adb_root)
|
||||
self.binary_cache_dir = Path('binary_cache')
|
||||
self.binary_cache = BinaryCache(self.binary_cache_dir)
|
||||
self.binaries = {}
|
||||
|
||||
def build_binary_cache(self, perf_data_path: str, symfs_dirs: List[Union[Path, str]]) -> bool:
|
||||
self.binary_cache_dir.mkdir(exist_ok=True)
|
||||
self.collect_used_binaries(perf_data_path)
|
||||
if not self.copy_binaries_from_symfs_dirs(symfs_dirs):
|
||||
return False
|
||||
self.pull_binaries_from_device()
|
||||
self.create_build_id_list()
|
||||
return True
|
||||
|
||||
def collect_used_binaries(self, perf_data_path):
|
||||
"""read perf.data, collect all used binaries and their build id(if available)."""
|
||||
# A dict mapping from binary name to build_id
|
||||
binaries = {}
|
||||
lib = ReportLib()
|
||||
lib.SetRecordFile(perf_data_path)
|
||||
lib.SetLogSeverity('error')
|
||||
while True:
|
||||
sample = lib.GetNextSample()
|
||||
if sample is None:
|
||||
lib.Close()
|
||||
break
|
||||
symbols = [lib.GetSymbolOfCurrentSample()]
|
||||
callchain = lib.GetCallChainOfCurrentSample()
|
||||
for i in range(callchain.nr):
|
||||
symbols.append(callchain.entries[i].symbol)
|
||||
|
||||
for symbol in symbols:
|
||||
dso_name = symbol.dso_name
|
||||
if dso_name not in binaries:
|
||||
if is_jit_symfile(dso_name):
|
||||
continue
|
||||
name = 'vmlinux' if dso_name == '[kernel.kallsyms]' else dso_name
|
||||
binaries[name] = lib.GetBuildIdForPath(dso_name)
|
||||
self.binaries = binaries
|
||||
|
||||
def copy_binaries_from_symfs_dirs(self, symfs_dirs: List[Union[str, Path]]) -> bool:
|
||||
if symfs_dirs:
|
||||
lib_dirs: List[Path] = []
|
||||
for symfs_dir in symfs_dirs:
|
||||
if isinstance(symfs_dir, str):
|
||||
symfs_dir = Path(symfs_dir)
|
||||
if not symfs_dir.is_dir():
|
||||
logging.error("can't find dir %s", symfs_dir)
|
||||
return False
|
||||
lib_dirs.append(symfs_dir)
|
||||
lib_dir_source = BinarySourceFromLibDirs(self.readelf, lib_dirs)
|
||||
lib_dir_source.collect_binaries(self.binaries, self.binary_cache)
|
||||
return True
|
||||
|
||||
def pull_binaries_from_device(self):
|
||||
self.device_source.collect_binaries(self.binaries, self.binary_cache)
|
||||
|
||||
def create_build_id_list(self):
|
||||
""" Create build_id_list. So report scripts can find a binary by its build_id instead of
|
||||
path.
|
||||
"""
|
||||
build_id_list_path = self.binary_cache_dir / 'build_id_list'
|
||||
# Write in binary mode to avoid "\r\n" problem on windows, which can confuse simpleperf.
|
||||
with open(build_id_list_path, 'wb') as fh:
|
||||
for root, _, files in os.walk(self.binary_cache_dir):
|
||||
for filename in files:
|
||||
path = Path(os.path.join(root, filename))
|
||||
build_id = self.readelf.get_build_id(path)
|
||||
if build_id:
|
||||
relative_path = path.relative_to(self.binary_cache_dir)
|
||||
line = f'{build_id}={relative_path}\n'
|
||||
fh.write(str_to_bytes(line))
|
||||
|
||||
def find_path_in_cache(self, device_path: str) -> Optional[Path]:
|
||||
build_id = self.binaries.get(device_path)
|
||||
return self.binary_cache.get_path_in_cache(device_path, build_id)
|
||||
|
||||
|
||||
def main() -> bool:
|
||||
parser = BaseArgumentParser(description="""
|
||||
Pull binaries needed by perf.data from device to binary_cache directory.""")
|
||||
parser.add_argument('-i', '--perf_data_path', default='perf.data', type=extant_file, help="""
|
||||
The path of profiling data.""")
|
||||
parser.add_argument('-lib', '--native_lib_dir', type=extant_dir, nargs='+', help="""
|
||||
Path to find debug version of native shared libraries used in the app.""", action='append')
|
||||
parser.add_argument('--disable_adb_root', action='store_true', help="""
|
||||
Force adb to run in non root mode.""")
|
||||
parser.add_argument('--ndk_path', nargs=1, help='Find tools in the ndk path.')
|
||||
args = parser.parse_args()
|
||||
ndk_path = None if not args.ndk_path else args.ndk_path[0]
|
||||
builder = BinaryCacheBuilder(ndk_path, args.disable_adb_root)
|
||||
symfs_dirs = flatten_arg_list(args.native_lib_dir)
|
||||
return builder.build_binary_cache(args.perf_data_path, symfs_dirs)
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
sys.exit(0 if main() else 1)
|
||||
271
Android/android-ndk-r27d/simpleperf/debug_unwind_reporter.py
Normal file
@ -0,0 +1,271 @@
|
||||
#!/usr/bin/env python3
|
||||
#
|
||||
# Copyright (C) 2017 The Android Open Source Project
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
"""debug_unwind_reporter.py: report failed dwarf unwinding cases generated by debug-unwind cmd.
|
||||
|
||||
Below is an example using debug_unwind_reporter.py:
|
||||
1. Record with "-g --keep-failed-unwinding-debug-info" option on device.
|
||||
$ simpleperf record -g --keep-failed-unwinding-debug-info --app com.google.sample.tunnel \\
|
||||
--duration 10
|
||||
The generated perf.data can be used for normal reporting. But it also contains stack data
|
||||
and binaries for debugging failed unwinding cases.
|
||||
|
||||
2. Generate report with debug-unwind cmd.
|
||||
$ simpleperf debug-unwind -i perf.data --generate-report -o report.txt
|
||||
The report contains details for each failed unwinding case. It is usually too long to
|
||||
parse manually. That's why we need debug_unwind_reporter.py.
|
||||
|
||||
3. Use debug_unwind_reporter.py to parse the report.
|
||||
  $ ./debug_unwind_reporter.py -i report.txt --summary
|
||||
  $ ./debug_unwind_reporter.py -i report.txt --include-error-code 1
|
||||
...
|
||||
"""
|
||||
|
||||
import argparse
|
||||
from collections import Counter, defaultdict
|
||||
from simpleperf_utils import BaseArgumentParser
|
||||
from texttable import Texttable
|
||||
from typing import Dict, Iterator, List
|
||||
|
||||
|
||||
class CallChainNode:
|
||||
def __init__(self):
|
||||
self.dso = ''
|
||||
self.symbol = ''
|
||||
|
||||
|
||||
class Sample:
|
||||
""" A failed unwinding case """
|
||||
|
||||
def __init__(self, raw_lines: List[str]):
|
||||
self.raw_lines = raw_lines
|
||||
self.sample_time = 0
|
||||
self.error_code = 0
|
||||
self.callchain: List[CallChainNode] = []
|
||||
self.parse()
|
||||
|
||||
def parse(self):
|
||||
for line in self.raw_lines:
|
||||
key, value = line.split(': ', 1)
|
||||
if key == 'sample_time':
|
||||
self.sample_time = int(value)
|
||||
elif key == 'unwinding_error_code':
|
||||
self.error_code = int(value)
|
||||
elif key.startswith('dso'):
|
||||
callchain_id = int(key.rsplit('_', 1)[1])
|
||||
self._get_callchain_node(callchain_id).dso = value
|
||||
elif key.startswith('symbol'):
|
||||
callchain_id = int(key.rsplit('_', 1)[1])
|
||||
self._get_callchain_node(callchain_id).symbol = value
|
||||
|
||||
def _get_callchain_node(self, callchain_id: int) -> CallChainNode:
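        # The report keys (dso_N/symbol_N) are 1-based and appear in order, so a new node is
        # appended exactly when callchain_id points one past the end of the current list.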
|
||||
callchain_id -= 1
|
||||
if callchain_id == len(self.callchain):
|
||||
self.callchain.append(CallChainNode())
|
||||
return self.callchain[callchain_id]
|
||||
|
||||
|
||||
class SampleFilter:
|
||||
def match(self, sample: Sample) -> bool:
|
||||
raise Exception('unimplemented')
|
||||
|
||||
|
||||
class CompleteCallChainFilter(SampleFilter):
|
||||
def match(self, sample: Sample) -> bool:
|
||||
for node in sample.callchain:
|
||||
if node.dso.endswith('libc.so') and (node.symbol in ('__libc_init', '__start_thread')):
|
||||
return True
|
||||
return False
|
||||
|
||||
|
||||
class ErrorCodeFilter(SampleFilter):
|
||||
def __init__(self, error_code: List[int]):
|
||||
self.error_code = set(error_code)
|
||||
|
||||
def match(self, sample: Sample) -> bool:
|
||||
return sample.error_code in self.error_code
|
||||
|
||||
|
||||
class EndDsoFilter(SampleFilter):
|
||||
def __init__(self, end_dso: List[str]):
|
||||
self.end_dso = set(end_dso)
|
||||
|
||||
def match(self, sample: Sample) -> bool:
|
||||
return sample.callchain[-1].dso in self.end_dso
|
||||
|
||||
|
||||
class EndSymbolFilter(SampleFilter):
|
||||
def __init__(self, end_symbol: List[str]):
|
||||
self.end_symbol = set(end_symbol)
|
||||
|
||||
def match(self, sample: Sample) -> bool:
|
||||
return sample.callchain[-1].symbol in self.end_symbol
|
||||
|
||||
|
||||
class SampleTimeFilter(SampleFilter):
|
||||
def __init__(self, sample_time: List[int]):
|
||||
self.sample_time = set(sample_time)
|
||||
|
||||
def match(self, sample: Sample) -> bool:
|
||||
return sample.sample_time in self.sample_time
|
||||
|
||||
|
||||
class ReportInput:
|
||||
def __init__(self):
|
||||
self.exclude_filters: List[SampleFilter] = []
|
||||
self.include_filters: List[SampleFilter] = []
|
||||
|
||||
def set_filters(self, args: argparse.Namespace):
|
||||
if not args.show_callchain_fixed_by_joiner:
|
||||
self.exclude_filters.append(CompleteCallChainFilter())
|
||||
if args.exclude_error_code:
|
||||
self.exclude_filters.append(ErrorCodeFilter(args.exclude_error_code))
|
||||
if args.exclude_end_dso:
|
||||
self.exclude_filters.append(EndDsoFilter(args.exclude_end_dso))
|
||||
if args.exclude_end_symbol:
|
||||
self.exclude_filters.append(EndSymbolFilter(args.exclude_end_symbol))
|
||||
if args.exclude_sample_time:
|
||||
self.exclude_filters.append(SampleTimeFilter(args.exclude_sample_time))
|
||||
|
||||
if args.include_error_code:
|
||||
self.include_filters.append(ErrorCodeFilter(args.include_error_code))
|
||||
if args.include_end_dso:
|
||||
self.include_filters.append(EndDsoFilter(args.include_end_dso))
|
||||
if args.include_end_symbol:
|
||||
self.include_filters.append(EndSymbolFilter(args.include_end_symbol))
|
||||
if args.include_sample_time:
|
||||
self.include_filters.append(SampleTimeFilter(args.include_sample_time))
|
||||
|
||||
def get_samples(self, input_file: str) -> Iterator[Sample]:
|
||||
sample_lines: List[str] = []
|
||||
in_sample = False
|
||||
with open(input_file, 'r') as fh:
|
||||
for line in fh.readlines():
|
||||
line = line.rstrip()
|
||||
if line.startswith('sample_time:'):
|
||||
in_sample = True
|
||||
elif not line:
|
||||
if in_sample:
|
||||
in_sample = False
|
||||
sample = Sample(sample_lines)
|
||||
sample_lines = []
|
||||
if self.filter_sample(sample):
|
||||
yield sample
|
||||
if in_sample:
|
||||
sample_lines.append(line)
|
||||
|
||||
def filter_sample(self, sample: Sample) -> bool:
|
||||
""" Return true if the input sample passes filters. """
|
||||
for exclude_filter in self.exclude_filters:
|
||||
if exclude_filter.match(sample):
|
||||
return False
|
||||
for include_filter in self.include_filters:
|
||||
if not include_filter.match(sample):
|
||||
return False
|
||||
return True
|
||||
|
||||
|
||||
class ReportOutput:
|
||||
def report(self, sample: Sample):
|
||||
pass
|
||||
|
||||
def end_report(self):
|
||||
pass
|
||||
|
||||
|
||||
class ReportOutputDetails(ReportOutput):
|
||||
def report(self, sample: Sample):
|
||||
for line in sample.raw_lines:
|
||||
print(line)
|
||||
print()
|
||||
|
||||
|
||||
class ReportOutputSummary(ReportOutput):
|
||||
def __init__(self):
|
||||
self.error_code_counter = Counter()
|
||||
self.symbol_counters: Dict[int, Counter] = defaultdict(Counter)
|
||||
|
||||
def report(self, sample: Sample):
|
||||
symbol_key = (sample.callchain[-1].dso, sample.callchain[-1].symbol)
|
||||
self.symbol_counters[sample.error_code][symbol_key] += 1
|
||||
self.error_code_counter[sample.error_code] += 1
|
||||
|
||||
def end_report(self):
|
||||
self.draw_error_code_table()
|
||||
self.draw_symbol_table()
|
||||
|
||||
def draw_error_code_table(self):
|
||||
table = Texttable()
|
||||
table.set_cols_align(['l', 'c'])
|
||||
table.add_row(['Count', 'Error Code'])
|
||||
for error_code, count in self.error_code_counter.most_common():
|
||||
table.add_row([count, error_code])
|
||||
print(table.draw())
|
||||
|
||||
def draw_symbol_table(self):
|
||||
table = Texttable()
|
||||
table.set_cols_align(['l', 'c', 'l', 'l'])
|
||||
table.add_row(['Count', 'Error Code', 'Dso', 'Symbol'])
|
||||
for error_code, _ in self.error_code_counter.most_common():
|
||||
symbol_counter = self.symbol_counters[error_code]
|
||||
for symbol_key, count in symbol_counter.most_common():
|
||||
dso, symbol = symbol_key
|
||||
table.add_row([count, error_code, dso, symbol])
|
||||
print(table.draw())
|
||||
|
||||
|
||||
def get_args() -> argparse.Namespace:
|
||||
parser = BaseArgumentParser(description=__doc__)
|
||||
parser.add_argument('-i', '--input-file', required=True,
|
||||
help='report file generated by debug-unwind cmd')
|
||||
parser.add_argument(
|
||||
'--show-callchain-fixed-by-joiner', action='store_true',
|
||||
help="""By default, we don't show failed unwinding cases fixed by callchain joiner.
|
||||
Use this option to show them.""")
|
||||
parser.add_argument('--summary', action='store_true',
|
||||
help='show summary instead of case details')
|
||||
parser.add_argument('--exclude-error-code', metavar='error_code', type=int, nargs='+',
|
||||
help='exclude cases with selected error code')
|
||||
parser.add_argument('--exclude-end-dso', metavar='dso', nargs='+',
|
||||
help='exclude cases ending at selected binary')
|
||||
parser.add_argument('--exclude-end-symbol', metavar='symbol', nargs='+',
|
||||
help='exclude cases ending at selected symbol')
|
||||
parser.add_argument('--exclude-sample-time', metavar='time', type=int,
|
||||
nargs='+', help='exclude cases with selected sample time')
|
||||
parser.add_argument('--include-error-code', metavar='error_code', type=int,
|
||||
nargs='+', help='include cases with selected error code')
|
||||
parser.add_argument('--include-end-dso', metavar='dso', nargs='+',
|
||||
help='include cases ending at selected binary')
|
||||
parser.add_argument('--include-end-symbol', metavar='symbol', nargs='+',
|
||||
help='include cases ending at selected symbol')
|
||||
parser.add_argument('--include-sample-time', metavar='time', type=int,
|
||||
nargs='+', help='include cases with selected sample time')
|
||||
return parser.parse_args()
|
||||
|
||||
|
||||
def main():
|
||||
args = get_args()
|
||||
report_input = ReportInput()
|
||||
report_input.set_filters(args)
|
||||
report_output = ReportOutputSummary() if args.summary else ReportOutputDetails()
|
||||
for sample in report_input.get_samples(args.input_file):
|
||||
report_output.report(sample)
|
||||
report_output.end_report()
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
|
||||
333
Android/android-ndk-r27d/simpleperf/doc/README.md
Normal file
@ -0,0 +1,333 @@
|
||||
# Simpleperf
|
||||
|
||||
Android Studio includes a graphical front end to Simpleperf, documented in
|
||||
[Inspect CPU activity with CPU Profiler](https://developer.android.com/studio/profile/cpu-profiler).
|
||||
Most users will prefer to use that instead of using Simpleperf directly.
|
||||
|
||||
Simpleperf is a native CPU profiling tool for Android. It can be used to profile
|
||||
both Android applications and native processes running on Android. It can
|
||||
profile both Java and C++ code on Android. The simpleperf executable can run on Android >=L,
|
||||
and Python scripts can be used on Android >= N.
|
||||
|
||||
Simpleperf is part of the Android Open Source Project.
|
||||
The source code is [here](https://android.googlesource.com/platform/system/extras/+/main/simpleperf/).
|
||||
The latest document is [here](https://android.googlesource.com/platform/system/extras/+/main/simpleperf/doc/README.md).
|
||||
|
||||
[TOC]
|
||||
|
||||
## Introduction
|
||||
|
||||
An introduction slide deck is [here](./introduction.pdf).
|
||||
|
||||
Simpleperf contains two parts: the simpleperf executable and Python scripts.
|
||||
|
||||
The simpleperf executable works similarly to linux-tools-perf, but has some specific features for
|
||||
the Android profiling environment:
|
||||
|
||||
1. It collects more info in profiling data. Since the common workflow is "record on the device, and
|
||||
report on the host", simpleperf not only collects samples in profiling data, but also collects
|
||||
needed symbols, device info and recording time.
|
||||
|
||||
2. It delivers new features for recording.
|
||||
1) When recording dwarf based call graph, simpleperf unwinds the stack before writing a sample
|
||||
to file. This is to save storage space on the device.
|
||||
2) Support tracing both on CPU time and off CPU time with --trace-offcpu option.
|
||||
3) Support recording callgraphs of JITed and interpreted Java code on Android >= P.
|
||||
|
||||
3. It relates closely to the Android platform.
|
||||
1) Is aware of Android environment, like using system properties to enable profiling, using
|
||||
run-as to profile in application's context.
|
||||
2) Supports reading symbols and debug information from the .gnu_debugdata section, because
|
||||
system libraries are built with .gnu_debugdata section starting from Android O.
|
||||
3) Supports profiling shared libraries embedded in apk files.
|
||||
4) It uses the standard Android stack unwinder, so its results are consistent with all other
|
||||
Android tools.
|
||||
|
||||
4. It builds executables and shared libraries for different usages.
|
||||
1) Builds static executables on the device. Since static executables don't rely on any library,
|
||||
simpleperf executables can be pushed on any Android device and used to record profiling data.
|
||||
2) Builds executables on different hosts: Linux, Mac and Windows. These executables can be used
|
||||
to report on hosts.
|
||||
3) Builds report shared libraries on different hosts. The report library is used by different
|
||||
Python scripts to parse profiling data.
|
||||
|
||||
Detailed documentation for the simpleperf executable is [here](#executable-commands-reference).
|
||||
|
||||
Python scripts are split into three parts according to their functions:
|
||||
|
||||
1. Scripts used for recording, like app_profiler.py, run_simpleperf_without_usb_connection.py.
|
||||
|
||||
2. Scripts used for reporting, like report.py, report_html.py, inferno.
|
||||
|
||||
3. Scripts used for parsing profiling data, like simpleperf_report_lib.py.
|
||||
|
||||
The python scripts are tested on Python >= 3.9. Older versions may not be supported.
|
||||
Detailed documentation for the Python scripts is [here](#scripts-reference).
|
||||
|
||||
|
||||
## Tools in simpleperf
|
||||
|
||||
The simpleperf executables and Python scripts are located in simpleperf/ in ndk releases, and in
|
||||
system/extras/simpleperf/scripts/ in AOSP. Their functions are listed below.
|
||||
|
||||
bin/: contains executables and shared libraries.
|
||||
|
||||
bin/android/${arch}/simpleperf: static simpleperf executables used on the device.
|
||||
|
||||
bin/${host}/${arch}/simpleperf: simpleperf executables used on the host, only supports reporting.
|
||||
|
||||
bin/${host}/${arch}/libsimpleperf_report.${so/dylib/dll}: report shared libraries used on the host.
|
||||
|
||||
*.py, inferno, purgatorio: Python scripts used for recording and reporting. Details are in [scripts_reference.md](scripts_reference.md).
|
||||
|
||||
|
||||
## Android application profiling
|
||||
|
||||
See [android_application_profiling.md](./android_application_profiling.md).
|
||||
|
||||
|
||||
## Android platform profiling
|
||||
|
||||
See [android_platform_profiling.md](./android_platform_profiling.md).
|
||||
|
||||
|
||||
## Executable commands reference
|
||||
|
||||
See [executable_commands_reference.md](./executable_commands_reference.md).
|
||||
|
||||
|
||||
## Scripts reference
|
||||
|
||||
See [scripts_reference.md](./scripts_reference.md).
|
||||
|
||||
## View the profile
|
||||
|
||||
See [view_the_profile.md](./view_the_profile.md).
|
||||
|
||||
## Answers to common issues
|
||||
|
||||
### Support on different Android versions
|
||||
|
||||
On Android < N, the kernel may be too old (< 3.18) to support features like recording DWARF
|
||||
based call graphs.
|
||||
On Android M - O, we can only profile C++ code and fully compiled Java code.
|
||||
On Android >= P, the ART interpreter supports DWARF based unwinding. So we can profile Java code.
|
||||
On Android >= Q, we can use the simpleperf shipped on device to profile released Android apps, with
|
||||
`<profileable android:shell="true" />`.
|
||||
|
||||
|
||||
### Comparing DWARF based and stack frame based call graphs
|
||||
|
||||
Simpleperf supports two ways of recording call stacks with samples. One is DWARF based call graph,
|
||||
the other is stack frame based call graph. Below is their comparison:
|
||||
|
||||
Recording DWARF based call graph:
|
||||
1. Needs support of debug information in binaries.
|
||||
2. Generally behaves well on both ARM and ARM64, for both Java code and C++ code.
|
||||
3. Can only unwind 64K stack for each sample. So it isn't always possible to unwind to the bottom.
|
||||
However, this is alleviated in simpleperf, as explained in the next section.
|
||||
4. Takes more CPU time than stack frame based call graphs. So it has higher overhead, and can't
|
||||
sample at very high frequency (usually <= 4000 Hz).
|
||||
|
||||
Recording stack frame based call graph:
|
||||
1. Needs support of stack frame registers.
|
||||
2. Doesn't work well on ARM, because ARM is short of registers, and ARM and THUMB code have
   different stack frame registers. So the kernel can't unwind a user stack containing both ARM and
   THUMB code.
|
||||
3. Also doesn't work well on Java code, because the ART compiler doesn't reserve stack frame
   registers, and it can't get frames for interpreted Java code.
|
||||
4. Works well when profiling native programs on ARM64. One example is profiling surfaceflinger. It
   usually shows a complete flamegraph when it works well.
|
||||
5. Takes much less CPU time than DWARF based call graphs. So the sample frequency can be 10000 Hz or
|
||||
higher.
|
||||
|
||||
So if you need to profile code on ARM or profile Java code, DWARF based call graphs are better. If
you need to profile C++ code on ARM64, stack frame based call graphs may be better. In any case, you
can first try DWARF based call graphs, which are also the default when `-g` is used, because they
always produce reasonable results. If that doesn't work well enough, then try stack frame based call
graphs instead.
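
For reference, the two modes differ only in the record option used; a minimal sketch (the process
selection and duration are illustrative):

```sh
# DWARF based call graphs (also the default when -g is used):
$ simpleperf record -g -p <pid> --duration 10
# Stack frame based call graphs:
$ simpleperf record --call-graph fp -p <pid> --duration 10
```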
|
||||
|
||||
|
||||
### Fix broken DWARF based call graph
|
||||
|
||||
A DWARF-based call graph is generated by unwinding thread stacks. When a sample is recorded, the
kernel dumps up to 64 kilobytes of stack data. By unwinding the stack based on DWARF information,
|
||||
we can get a call stack.
|
||||
|
||||
Two reasons may cause a broken call stack:
|
||||
1. The kernel can only dump up to 64 kilobytes of stack data for each sample, but a thread can have
|
||||
a much larger stack. In this case, we can't unwind to the thread start point.
|
||||
|
||||
2. We need binaries containing DWARF call frame information to unwind stack frames. The binary
|
||||
should have one of the following sections: .eh_frame, .debug_frame, .ARM.exidx or .gnu_debugdata.
|
||||
|
||||
To mitigate these problems:
|
||||
|
||||
|
||||
For the missing stack data problem:
|
||||
1. To alleviate it, simpleperf joins callchains (call stacks) after recording. If two callchains of
|
||||
a thread have an entry containing the same ip and sp address, then simpleperf tries to join them
|
||||
to make the callchains longer. So we can get more complete callchains by recording longer and
|
||||
joining more samples. This doesn't guarantee complete call graphs, but it usually works well.
|
||||
|
||||
2. Simpleperf stores samples in a buffer before unwinding them. If the buffer is low on free space,
|
||||
simpleperf may decide to truncate stack data for a sample to 1K. Hopefully, this can be recovered
|
||||
by callchain joiner. But when a high percentage of samples are truncated, many callchains can be
|
||||
broken. We can tell if many samples are truncated in the record command output, like:
|
||||
|
||||
```sh
|
||||
$ simpleperf record ...
|
||||
simpleperf I cmd_record.cpp:809] Samples recorded: 105584 (cut 86291). Samples lost: 6501.
|
||||
|
||||
$ simpleperf record ...
|
||||
simpleperf I cmd_record.cpp:894] Samples recorded: 7,365 (1,857 with truncated stacks).
|
||||
```
|
||||
|
||||
There are two ways to avoid truncating samples. One is increasing the buffer size, like
|
||||
`--user-buffer-size 1G`. But `--user-buffer-size` is only available in the latest simpleperf. If that
|
||||
option isn't available, we can use `--no-cut-samples` to disable truncating samples.
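
For example, a sketch of both workarounds (the buffer size, process selection and duration are
illustrative):

```sh
# On recent simpleperf, enlarge the userspace buffer:
$ simpleperf record -g --user-buffer-size 1G -p <pid> --duration 10
# On older simpleperf without --user-buffer-size, disable truncating samples:
$ simpleperf record -g --no-cut-samples -p <pid> --duration 10
```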
|
||||
|
||||
For the missing DWARF call frame info problem:
|
||||
1. Most C++ code generates binaries containing call frame info, in .eh_frame or .ARM.exidx sections.
|
||||
These sections are not stripped, and are usually enough for stack unwinding.
|
||||
|
||||
2. For C code and a small percentage of C++ code that the compiler is sure will not generate
|
||||
exceptions, the call frame info is generated in .debug_frame section. .debug_frame section is
|
||||
usually stripped with other debug sections. One way to fix it is to download unstripped binaries
|
||||
on device, as [here](#fix-broken-callchain-stopped-at-c-functions).
|
||||
|
||||
3. The compiler doesn't generate unwind instructions for function prologues and epilogues, because
   they only operate on stack frames and will not generate exceptions. But profiling may hit these
   instructions and fail to unwind them. This usually doesn't matter in a flame graph. But in a
   time based Stack Chart (like in Android Studio and Firefox Profiler), this causes stack gaps once
   in a while. We can remove stack gaps via `--remove-gaps`, which is already enabled by default.
|
||||
|
||||
|
||||
### Fix broken callchain stopped at C functions
|
||||
|
||||
When using dwarf based call graphs, simpleperf generates callchains during recording to save space.
|
||||
The debug information needed to unwind C functions is in .debug_frame section, which is usually
|
||||
stripped in native libraries in apks. To fix this, we can download unstripped versions of the native
libraries to the device, and ask simpleperf to use them when recording.
|
||||
|
||||
To use simpleperf directly:
|
||||
|
||||
```sh
|
||||
# create native_libs dir on device, and push unstripped libs in it (nested dirs are not supported).
|
||||
$ adb shell mkdir /data/local/tmp/native_libs
|
||||
$ adb push <unstripped_dir>/*.so /data/local/tmp/native_libs
|
||||
# run simpleperf record with --symfs option.
|
||||
$ adb shell simpleperf record xxx --symfs /data/local/tmp/native_libs
|
||||
```
|
||||
|
||||
To use app_profiler.py:
|
||||
|
||||
```sh
|
||||
$ ./app_profiler.py -lib <unstripped_dir>
|
||||
```
|
||||
|
||||
|
||||
### How to solve missing symbols in report?
|
||||
|
||||
The simpleperf record command collects symbols on device in perf.data. But if the native libraries
|
||||
you use on device are stripped, this will result in a lot of unknown symbols in the report. A
|
||||
solution is to build binary_cache on host.
|
||||
|
||||
```sh
|
||||
# Collect binaries needed by perf.data in binary_cache/.
|
||||
$ ./binary_cache_builder.py -lib NATIVE_LIB_DIR,...
|
||||
```
|
||||
|
||||
The NATIVE_LIB_DIRs passed in the -lib option are the directories containing unstripped native
|
||||
libraries on host. After running it, the native libraries containing symbol tables are collected
|
||||
in binary_cache/ for use when reporting.
|
||||
|
||||
```sh
|
||||
$ ./report.py --symfs binary_cache
|
||||
|
||||
# report_html.py searches binary_cache/ automatically, so you don't need to
|
||||
# pass it any argument.
|
||||
$ ./report_html.py
|
||||
```
|
||||
|
||||
|
||||
### Show annotated source code and disassembly
|
||||
|
||||
To show hot places at source code and instruction level, we need to show source code and
|
||||
disassembly with event count annotation. Simpleperf supports showing annotated source code and
|
||||
disassembly for C++ code and fully compiled Java code. Simpleperf supports two ways to do it:
|
||||
|
||||
1. Through report_html.py:
|
||||
1) Generate perf.data and pull it on host.
|
||||
2) Generate binary_cache, containing elf files with debug information. Use -lib option to add
|
||||
libs with debug info. Do it with
|
||||
`binary_cache_builder.py -i perf.data -lib <dir_of_lib_with_debug_info>`.
|
||||
3) Use report_html.py to generate report.html with annotated source code and disassembly,
|
||||
as described [here](https://android.googlesource.com/platform/system/extras/+/main/simpleperf/doc/scripts_reference.md#report_html_py).
|
||||
|
||||
2. Through pprof.
|
||||
1) Generate perf.data and binary_cache as above.
|
||||
  2) Use pprof_proto_generator.py to generate a pprof proto file: `./pprof_proto_generator.py`.
|
||||
3) Use pprof to report a function with annotated source code, as described [here](https://android.googlesource.com/platform/system/extras/+/main/simpleperf/doc/scripts_reference.md#pprof_proto_generator_py).
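
For example, the two paths above boil down to roughly the following commands (the directory names
are illustrative, and the pprof viewer invocation assumes the standard pprof tool and the default
output name pprof.profile):

```sh
# Path 1: report_html.py
$ ./binary_cache_builder.py -i perf.data -lib <dir_of_lib_with_debug_info>
$ ./report_html.py --add_source_code --source_dirs <source_dir> --add_disassembly

# Path 2: pprof
$ ./pprof_proto_generator.py -i perf.data
$ pprof -http=:8080 pprof.profile
```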
|
||||
|
||||
|
||||
### Reduce lost samples and samples with truncated stack
|
||||
|
||||
When using `simpleperf record`, we may see lost samples or samples with truncated stack data. Before
|
||||
saving samples to a file, simpleperf uses two buffers to cache samples in memory. One is a kernel
|
||||
buffer, the other is a userspace buffer. The kernel puts samples into the kernel buffer. Simpleperf
|
||||
moves samples from the kernel buffer to the userspace buffer before processing them. If a buffer
|
||||
overflows, we lose samples or get samples with truncated stack data. Below is an example.
|
||||
|
||||
```sh
|
||||
$ simpleperf record -a --duration 1 -g --user-buffer-size 100k
|
||||
simpleperf I cmd_record.cpp:799] Recorded for 1.00814 seconds. Start post processing.
|
||||
simpleperf I cmd_record.cpp:894] Samples recorded: 79 (16 with truncated stacks).
|
||||
Samples lost: 2,129 (kernelspace: 18, userspace: 2,111).
|
||||
simpleperf W cmd_record.cpp:911] Lost 18.5567% of samples in kernel space, consider increasing
|
||||
kernel buffer size(-m), or decreasing sample frequency(-f), or
|
||||
increasing sample period(-c).
|
||||
simpleperf W cmd_record.cpp:928] Lost/Truncated 97.1233% of samples in user space, consider
|
||||
increasing userspace buffer size(--user-buffer-size), or
|
||||
decreasing sample frequency(-f), or increasing sample period(-c).
|
||||
```
|
||||
|
||||
In the above example, we get 79 samples, 16 of which have truncated stack data. We lose 18
samples in the kernel buffer, and 2,111 samples in the userspace buffer.
|
||||
|
||||
To reduce lost samples in the kernel buffer, we can increase kernel buffer size via `-m`. To reduce
|
||||
lost samples in the userspace buffer, or reduce samples with truncated stack data, we can increase
|
||||
userspace buffer size via `--user-buffer-size`.
|
||||
|
||||
We can also reduce samples generated in a fixed time period, like reducing sample frequency using
|
||||
`-f`, reducing the number of monitored threads, or not monitoring multiple perf events at the same time.
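
Putting these together, a sketch of a record command tuned to reduce losses (all values are
illustrative):

```sh
# Larger kernel buffer (-m, in pages), larger userspace buffer, and a lower sample frequency.
$ simpleperf record -a --duration 1 -g -f 2000 -m 1024 --user-buffer-size 1G
```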
|
||||
|
||||
|
||||
## Bugs and contribution
|
||||
|
||||
Bugs and feature requests can be submitted at https://github.com/android/ndk/issues.
|
||||
Patches can be uploaded to android-review.googlesource.com as [here](https://source.android.com/setup/contribute/),
|
||||
or sent to email addresses listed [here](https://android.googlesource.com/platform/system/extras/+/main/simpleperf/OWNERS).
|
||||
|
||||
If you want to compile the simpleperf C++ source code, follow the steps below:
|
||||
1. Download AOSP main branch as [here](https://source.android.com/setup/build/requirements).
|
||||
2. Build simpleperf.
|
||||
```sh
|
||||
$ . build/envsetup.sh
|
||||
$ lunch aosp_arm64-trunk_staging-userdebug
|
||||
$ mmma system/extras/simpleperf -j30
|
||||
```
|
||||
|
||||
If built successfully, out/target/product/generic_arm64/system/bin/simpleperf is for ARM64, and
|
||||
out/target/product/generic_arm64/system/bin/simpleperf32 is for ARM.
|
||||
|
||||
The source code of simpleperf python scripts is in [system/extras/simpleperf/scripts](https://android.googlesource.com/platform/system/extras/+/main/simpleperf/scripts/).
|
||||
Most scripts rely on simpleperf binaries to work. To update binaries for scripts (using linux
|
||||
x86_64 host and android arm64 target as an example):
|
||||
```sh
|
||||
$ cp out/host/linux-x86/lib64/libsimpleperf_report.so system/extras/simpleperf/scripts/bin/linux/x86_64/libsimpleperf_report.so
|
||||
$ cp out/target/product/generic_arm64/system/bin/simpleperf_ndk64 system/extras/simpleperf/scripts/bin/android/arm64/simpleperf
|
||||
```
|
||||
|
||||
Then you can try the latest simpleperf scripts and binaries in system/extras/simpleperf/scripts.
|
||||
@ -0,0 +1,313 @@
|
||||
# Android application profiling
|
||||
|
||||
This section shows how to profile an Android application.
|
||||
Some examples are [Here](https://android.googlesource.com/platform/system/extras/+/main/simpleperf/demo/README.md).
|
||||
|
||||
Profiling an Android application involves three steps:
|
||||
1. Prepare an Android application.
|
||||
2. Record profiling data.
|
||||
3. Report profiling data.
|
||||
|
||||
[TOC]
|
||||
|
||||
## Prepare an Android application
|
||||
|
||||
Based on the profiling situation, we may need to customize the build script to generate an apk file
|
||||
specifically for profiling. Below are some suggestions.
|
||||
|
||||
1. If you want to profile a debug build of an application:
|
||||
|
||||
For the debug build type, Android Studio sets android:debuggable="true" in AndroidManifest.xml,
|
||||
enables JNI checks and may not optimize C/C++ code. It can be profiled by simpleperf without any
|
||||
change.
|
||||
|
||||
2. If you want to profile a release build of an application:
|
||||
|
||||
For the release build type, Android Studio sets android:debuggable="false" in AndroidManifest.xml,
disables JNI checks and optimizes C/C++ code. However, security restrictions mean that only apps
with android:debuggable set to true can be profiled. So simpleperf can only profile a release
|
||||
build under these three circumstances:
|
||||
If you are on a rooted device, you can profile any app.
|
||||
|
||||
If you are on Android >= Q, you can add the profileableFromShell flag in AndroidManifest.xml; this makes
a released app profileable by preinstalled profiling tools. In this case, simpleperf downloaded by
adb will invoke the simpleperf preinstalled in the system image to profile the app.
|
||||
|
||||
```
|
||||
<manifest ...>
|
||||
<application ...>
|
||||
<profileable android:shell="true" />
|
||||
</application>
|
||||
</manifest>
|
||||
```
|
||||
|
||||
If you are on Android >= O, we can use [wrap.sh](https://developer.android.com/ndk/guides/wrap-script.html)
|
||||
to profile a release build:
|
||||
Step 1: Add android:debuggable="true" in AndroidManifest.xml to enable profiling.
|
||||
```
|
||||
<manifest ...>
|
||||
    <application android:debuggable="true" ...>
|
||||
```
|
||||
|
||||
Step 2: Add wrap.sh in lib/`arch` directories. wrap.sh runs the app without passing any debug flags
|
||||
to ART, so the app runs as a release app. Adding wrap.sh can be done with the script below in
app/build.gradle.
|
||||
```
|
||||
android {
|
||||
buildTypes {
|
||||
release {
|
||||
sourceSets {
|
||||
release {
|
||||
resources {
|
||||
srcDir {
|
||||
"wrap_sh_lib_dir"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
task createWrapShLibDir {
|
||||
for (String abi : ["armeabi-v7a", "arm64-v8a", "x86", "x86_64"]) {
|
||||
def dir = new File("app/wrap_sh_lib_dir/lib/" + abi)
|
||||
dir.mkdirs()
|
||||
def wrapFile = new File(dir, "wrap.sh")
|
||||
wrapFile.withWriter { writer ->
|
||||
writer.write('#!/system/bin/sh\n\$@\n')
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
3. If you want to profile C/C++ code:
|
||||
|
||||
Android Studio strips the symbol table and debug info of native libraries in the apk. So the profiling
|
||||
results may contain unknown symbols or broken callgraphs. To fix this, we can pass app_profiler.py
|
||||
a directory containing unstripped native libraries via the -lib option. Usually the directory can
|
||||
be the path of your Android Studio project.
|
||||
|
||||
|
||||
4. If you want to profile Java code:
|
||||
|
||||
On Android >= P, simpleperf supports profiling Java code, no matter whether it is executed by
|
||||
the interpreter, or JITed, or compiled into native instructions. So you don't need to do anything.
|
||||
|
||||
On Android O, simpleperf supports profiling Java code which is compiled into native instructions,
|
||||
and it also needs wrap.sh to use the compiled Java code. To compile Java code, we can pass
|
||||
app_profiler.py the --compile_java_code option.
|
||||
|
||||
On Android N, simpleperf supports profiling Java code that is compiled into native instructions.
|
||||
To compile Java code, we can pass app_profiler.py the --compile_java_code option.
|
||||
|
||||
On Android <= M, simpleperf doesn't support profiling Java code.
|
||||
|
||||
|
||||
Below I use application [SimpleperfExampleCpp](https://android.googlesource.com/platform/system/extras/+/main/simpleperf/demo/SimpleperfExampleCpp).
|
||||
It builds an app-debug.apk for profiling.
|
||||
|
||||
```sh
|
||||
$ git clone https://android.googlesource.com/platform/system/extras
|
||||
$ cd extras/simpleperf/demo
|
||||
# Open SimpleperfExampleCpp project with Android studio, and build this project
|
||||
# successfully, otherwise the `./gradlew` command below will fail.
|
||||
$ cd SimpleperfExampleCpp
|
||||
|
||||
# On windows, use "gradlew" instead.
|
||||
$ ./gradlew clean assemble
|
||||
$ adb install -r app/build/outputs/apk/debug/app-debug.apk
|
||||
```
|
||||
|
||||
## Record and report profiling data
|
||||
|
||||
We can use [app_profiler.py](scripts_reference.md#app_profilerpy) to profile Android applications.
|
||||
|
||||
```sh
|
||||
# Cd to the directory of simpleperf scripts. Record perf.data.
|
||||
# -p option selects the profiled app using its package name.
|
||||
# --compile_java_code option compiles Java code into native instructions, which isn't needed on
|
||||
# Android >= P.
|
||||
# -a option selects the Activity to profile.
|
||||
# -lib option gives the directory to find debug native libraries.
|
||||
$ ./app_profiler.py -p simpleperf.example.cpp -a .MixActivity -lib path_of_SimpleperfExampleCpp
|
||||
```
|
||||
|
||||
This will collect profiling data in perf.data in the current directory, and related native
|
||||
binaries in binary_cache/.
|
||||
|
||||
Normally we need to use the app when profiling, otherwise we may record no samples. But in this
|
||||
case, the MixActivity starts a busy thread. So we don't need to use the app while profiling.
|
||||
|
||||
```sh
|
||||
# Report perf.data in stdio interface.
|
||||
$ ./report.py
|
||||
Cmdline: /data/data/simpleperf.example.cpp/simpleperf record ...
|
||||
Arch: arm64
|
||||
Event: task-clock:u (type 1, config 1)
|
||||
Samples: 10023
|
||||
Event count: 10023000000
|
||||
|
||||
Overhead Command Pid Tid Shared Object Symbol
|
||||
27.04% BusyThread 5703 5729 /system/lib64/libart.so art::JniMethodStart(art::Thread*)
|
||||
25.87% BusyThread 5703 5729 /system/lib64/libc.so long StrToI<long, ...
|
||||
...
|
||||
```
|
||||
|
||||
[report.py](scripts_reference.md#reportpy) reports profiling data in stdio interface. If there
|
||||
are a lot of unknown symbols in the report, check [here](README.md#how-to-solve-missing-symbols-in-report).
|
||||
|
||||
```sh
|
||||
# Report perf.data in html interface.
|
||||
$ ./report_html.py
|
||||
|
||||
# Add source code and disassembly. Change the path of source_dirs if it is not correct.
|
||||
$ ./report_html.py --add_source_code --source_dirs path_of_SimpleperfExampleCpp \
|
||||
--add_disassembly
|
||||
```
|
||||
|
||||
[report_html.py](scripts_reference.md#report_htmlpy) generates report in report.html, and pops up
|
||||
a browser tab to show it.
|
||||
|
||||
## Record and report call graph
|
||||
|
||||
We can record and report [call graphs](executable_commands_reference.md#record-call-graphs) as below.
|
||||
|
||||
```sh
|
||||
# Record dwarf based call graphs: add "-g" in the -r option.
|
||||
$ ./app_profiler.py -p simpleperf.example.cpp \
|
||||
-r "-e task-clock:u -f 1000 --duration 10 -g" -lib path_of_SimpleperfExampleCpp
|
||||
|
||||
# Record stack frame based call graphs: add "--call-graph fp" in the -r option.
|
||||
$ ./app_profiler.py -p simpleperf.example.cpp \
|
||||
-r "-e task-clock:u -f 1000 --duration 10 --call-graph fp" \
|
||||
-lib path_of_SimpleperfExampleCpp
|
||||
|
||||
# Report call graphs in stdio interface.
|
||||
$ ./report.py -g
|
||||
|
||||
# Report call graphs in python Tk interface.
|
||||
$ ./report.py -g --gui
|
||||
|
||||
# Report call graphs in html interface.
|
||||
$ ./report_html.py
|
||||
|
||||
# Report call graphs in flamegraphs.
|
||||
# On Windows, use inferno.bat instead of ./inferno.sh.
|
||||
$ ./inferno.sh -sc
|
||||
```
|
||||
|
||||
## Report in html interface
|
||||
|
||||
We can use [report_html.py](scripts_reference.md#report_htmlpy) to show profiling results in a web browser.
|
||||
report_html.py integrates chart statistics, sample table, flamegraphs, source code annotation
|
||||
and disassembly annotation. It is the recommended way to show reports.
|
||||
|
||||
```sh
|
||||
$ ./report_html.py
|
||||
```
|
||||
|
||||
## Show flamegraph
|
||||
|
||||
To show flamegraphs, we need to first record call graphs. Flamegraphs are shown by
|
||||
report_html.py in the "Flamegraph" tab.
|
||||
We can also use [inferno](scripts_reference.md#inferno) to show flamegraphs directly.
|
||||
|
||||
```sh
|
||||
# On Windows, use inferno.bat instead of ./inferno.sh.
|
||||
$ ./inferno.sh -sc
|
||||
```
|
||||
|
||||
We can also build flamegraphs using https://github.com/brendangregg/FlameGraph.
|
||||
Please make sure you have perl installed.
|
||||
|
||||
```sh
|
||||
$ git clone https://github.com/brendangregg/FlameGraph.git
|
||||
$ ./report_sample.py --symfs binary_cache >out.perf
|
||||
$ FlameGraph/stackcollapse-perf.pl out.perf >out.folded
|
||||
$ FlameGraph/flamegraph.pl out.folded >a.svg
|
||||
```
|
||||
|
||||
## Report in Android Studio
|
||||
|
||||
The simpleperf report-sample command can convert perf.data into the protobuf format accepted by the
Android Studio CPU profiler. The conversion can be done either on device or on host. If you have
more symbol info on host, then prefer doing it on host with the --symdir option.
|
||||
|
||||
```sh
|
||||
$ simpleperf report-sample --protobuf --show-callchain -i perf.data -o perf.trace
|
||||
# Then open perf.trace in Android Studio to show it.
|
||||
```
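
When converting on host with more symbols available, a sketch might look like this (using
binary_cache as the symbol directory is an assumption; use whatever directory holds your
unstripped binaries):

```sh
$ simpleperf report-sample --protobuf --show-callchain -i perf.data -o perf.trace \
    --symdir binary_cache
```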
|
||||
|
||||
## Deobfuscate Java symbols
|
||||
|
||||
Java symbols may be obfuscated by ProGuard. To restore the original symbols in a report, we can
|
||||
pass a Proguard mapping file to the report scripts or report-sample command via
|
||||
`--proguard-mapping-file`.
|
||||
|
||||
```sh
|
||||
$ ./report_html.py --proguard-mapping-file proguard_mapping_file.txt
|
||||
```
|
||||
|
||||
## Record both on CPU time and off CPU time
|
||||
|
||||
We can [record both on CPU time and off CPU time](executable_commands_reference.md#record-both-on-cpu-time-and-off-cpu-time).
|
||||
|
||||
First check if trace-offcpu feature is supported on the device.
|
||||
|
||||
```sh
|
||||
$ ./run_simpleperf_on_device.py list --show-features
|
||||
dwarf-based-call-graph
|
||||
trace-offcpu
|
||||
```
|
||||
|
||||
If trace-offcpu is supported, it will be shown in the feature list. Then we can try it.
|
||||
|
||||
```sh
|
||||
$ ./app_profiler.py -p simpleperf.example.cpp -a .SleepActivity \
|
||||
-r "-g -e task-clock:u -f 1000 --duration 10 --trace-offcpu" \
|
||||
-lib path_of_SimpleperfExampleCpp
|
||||
$ ./report_html.py --add_disassembly --add_source_code \
|
||||
--source_dirs path_of_SimpleperfExampleCpp
|
||||
```
|
||||
|
||||
## Profile from launch
|
||||
|
||||
We can [profile from launch of an application](scripts_reference.md#profile-from-launch-of-an-application).
|
||||
|
||||
```sh
|
||||
# Start simpleperf recording, then start the Activity to profile.
|
||||
$ ./app_profiler.py -p simpleperf.example.cpp -a .MainActivity
|
||||
|
||||
# We can also start the Activity on the device manually.
|
||||
# 1. Make sure the application isn't running or one of the recent apps.
|
||||
# 2. Start simpleperf recording.
|
||||
$ ./app_profiler.py -p simpleperf.example.cpp
|
||||
# 3. Start the app manually on the device.
|
||||
```
|
||||
|
||||
## Control recording in application code
|
||||
|
||||
Simpleperf supports controlling recording from application code. Below is the workflow:
|
||||
|
||||
1. Run `api_profiler.py prepare -p <package_name>` to allow an app to record itself using
   simpleperf. By default, the permission is reset after device reboot, so we need to run the
   script every time the device reboots. But on Android >= 13, we can use the `--days` option to
   set how long we want the permission to last.
|
||||
|
||||
2. Link simpleperf app_api code in the application. The app needs to be debuggable or
|
||||
profileableFromShell as described [here](#prepare-an-android-application). Then the app can
|
||||
use the api to start/pause/resume/stop recording. To start recording, the app_api forks a child
|
||||
process running simpleperf, and uses pipe files to send commands to the child process. After
|
||||
recording, a profiling data file is generated.
|
||||
|
||||
3. Run `api_profiler.py collect -p <package_name>` to collect profiling data files to host.
|
||||
|
||||
Examples are CppApi and JavaApi in [demo](https://android.googlesource.com/platform/system/extras/+/main/simpleperf/demo).
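
The host-side part of this workflow is roughly the following sketch (the package name and the
`--days` value are illustrative):

```sh
# Step 1: allow the app to record itself (on Android >= 13, keep the permission for 30 days).
$ ./api_profiler.py prepare -p com.example.myapp --days 30
# ... run the app; it starts/stops recording through the app_api ...
# Step 3: collect the generated profiling data files to host.
$ ./api_profiler.py collect -p com.example.myapp
```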
|
||||
|
||||
|
||||
## Parse profiling data manually
|
||||
|
||||
We can also write python scripts to parse profiling data manually, by using
|
||||
[simpleperf_report_lib.py](scripts_reference.md#simpleperf_report_libpy). Examples are report_sample.py,
|
||||
report_html.py.
|
||||
@ -0,0 +1,109 @@
|
||||
# Android platform profiling
|
||||
|
||||
[TOC]
|
||||
|
||||
## General Tips
|
||||
|
||||
Here are some tips for Android platform developers, who build and flash system images on rooted
|
||||
devices:
|
||||
1. After running `adb root`, simpleperf can be used to profile any process or system wide.
|
||||
2. It is recommended to use the latest simpleperf available in AOSP main, if you are not working
|
||||
on the current main branch. Scripts are in `system/extras/simpleperf/scripts`, binaries are in
|
||||
`system/extras/simpleperf/scripts/bin/android`.
|
||||
3. It is recommended to use `app_profiler.py` for recording, and `report_html.py` for reporting.
|
||||
Below is an example.
|
||||
|
||||
```sh
|
||||
# Record surfaceflinger process for 10 seconds with dwarf based call graph. More examples are in
|
||||
# scripts reference in the doc.
|
||||
$ ./app_profiler.py -np surfaceflinger -r "-g --duration 10"
|
||||
|
||||
# Generate html report.
|
||||
$ ./report_html.py
|
||||
```
|
||||
|
||||
4. Since Android >= O has symbols for system libraries on device, we don't need to use unstripped
|
||||
binaries in `$ANDROID_PRODUCT_OUT/symbols` to report call graphs. However, they are needed to add
|
||||
source code and disassembly (with line numbers) in the report. Below is an example.
|
||||
|
||||
```sh
|
||||
# Do the recording with app_profiler.py or simpleperf on device; it generates perf.data on host.
|
||||
$ ./app_profiler.py -np surfaceflinger -r "--call-graph fp --duration 10"
|
||||
|
||||
# Collect unstripped binaries from $ANDROID_PRODUCT_OUT/symbols to binary_cache/.
|
||||
$ ./binary_cache_builder.py -lib $ANDROID_PRODUCT_OUT/symbols
|
||||
|
||||
# Report source code and disassembly. Disassembling all binaries is slow, so it's better to add
|
||||
# --binary_filter option to only disassemble selected binaries.
|
||||
$ ./report_html.py --add_source_code --source_dirs $ANDROID_BUILD_TOP --add_disassembly \
|
||||
--binary_filter surfaceflinger.so
|
||||
```
|
||||
|
||||
## Start simpleperf from system_server process
|
||||
|
||||
Sometimes we want to profile a process/system-wide when a special situation happens. In this case,
|
||||
we can add code starting simpleperf at the point where the situation is detected.
|
||||
|
||||
1. Disable SELinux by `adb shell setenforce 0`, because SELinux only allows simpleperf to run
   in shell or in debuggable/profileable apps.
|
||||
|
||||
2. Add the code below at the point where the special situation is detected.
|
||||
|
||||
```java
|
||||
try {
|
||||
// for capability check
|
||||
Os.prctl(OsConstants.PR_CAP_AMBIENT, OsConstants.PR_CAP_AMBIENT_RAISE,
|
||||
OsConstants.CAP_SYS_PTRACE, 0, 0);
|
||||
// Write to /data instead of /data/local/tmp. Because /data can be written by system user.
|
||||
Runtime.getRuntime().exec("/system/bin/simpleperf record -g -p " + String.valueOf(Process.myPid())
|
||||
+ " -o /data/perf.data --duration 30 --log-to-android-buffer --log verbose");
|
||||
} catch (Exception e) {
|
||||
Slog.e(TAG, "error while running simpleperf");
|
||||
e.printStackTrace();
|
||||
}
|
||||
```
|
||||
|
||||
## Hardware PMU counter limit
|
||||
|
||||
When monitoring instruction and cache related perf events (in hw/cache/raw/pmu category of list cmd),
|
||||
these events are mapped to PMU counters on each cpu core. But each core only has a limited number
|
||||
of PMU counters. If the number of events > the number of PMU counters, then the counters are multiplexed
|
||||
among events, which probably isn't what we want. We can use `simpleperf stat --print-hw-counter` to
|
||||
show hardware counters (per core) available on the device.
|
||||
|
||||
On Pixel devices, the number of PMU counters on each core is usually 7, of which 4 are used
|
||||
by the kernel to monitor memory latency. So only 3 counters are available. It's fine to monitor up
|
||||
to 3 PMU events at the same time. To monitor more than 3 events, the `--use-devfreq-counters` option
|
||||
can be used to borrow from the counters used by the kernel.
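
For example (the event names are illustrative):

```sh
# Show hardware counters available per core.
$ simpleperf stat --print-hw-counter
# With 3 free counters, monitoring up to 3 PMU events at a time avoids multiplexing.
$ simpleperf stat -e cache-misses,cache-references,branch-misses -a --duration 1
# To monitor more events, borrow the counters used by the kernel.
$ simpleperf stat -e cache-misses,cache-references,branch-misses,instructions -a --duration 1 \
    --use-devfreq-counters
```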
|
||||
|
||||
## Get boot-time profile
|
||||
|
||||
On userdebug/eng devices, we can get boot-time profile via simpleperf.
|
||||
|
||||
Step 1. Customize the configuration if needed. By default, simpleperf tracks all processes
|
||||
except for itself, starts at `early-init`, and stops when `sys.boot_completed` is set.
|
||||
You can customize it by changing the trigger or command line flags in
|
||||
`system/extras/simpleperf/simpleperf.rc`.
|
||||
|
||||
Step 2. Add `androidboot.simpleperf.boot_record=1` to the kernel command line.
|
||||
For example, on Pixel devices, you can do
|
||||
```
|
||||
$ fastboot oem cmdline add androidboot.simpleperf.boot_record=1
|
||||
```
|
||||
|
||||
Step 3. Reboot the device. When booting, init finds that the kernel command line flag is set,
|
||||
so it forks a background process to run simpleperf to record boot-time profile.
|
||||
init starts simpleperf at `early-init` stage, which is very soon after second-stage init starts.
|
||||
|
||||
Step 4. After boot, the boot-time profile is stored in /tmp/boot_perf.data. Then we can pull
|
||||
the profile to host to report.
|
||||
|
||||
```
|
||||
$ adb shell ls /tmp/boot_perf.data
|
||||
/tmp/boot_perf.data
|
||||
```
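
Pulling and reporting can then look like this (report_html.py is just one of the report scripts
that accepts the file):

```sh
$ adb pull /tmp/boot_perf.data
$ ./report_html.py -i boot_perf.data
```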
|
||||
|
||||
The following is a boot-time profile example. From the timestamps, the first sample is generated at about
4.5s after booting.
|
||||
|
||||

|
||||
BIN
Android/android-ndk-r27d/simpleperf/doc/bottleneck.png
Normal file
|
@ -0,0 +1,268 @@
|
||||
# Collect ETM data for AutoFDO
|
||||
|
||||
[TOC]
|
||||
|
||||
## Introduction
|
||||
|
||||
ETM is a hardware feature available on arm64 devices. It collects the instruction stream running on
|
||||
each CPU. ARM uses ETM as an alternative to LBR (last branch record) on x86.
|
||||
Simpleperf supports collecting ETM data, and converting it to input files for AutoFDO, which can
|
||||
then be used for PGO (profile-guided optimization) during compilation.
|
||||
|
||||
On ARMv8, ETM is considered an external debug interface (unless the ARMv8.4 Self-hosted Trace
extension is implemented). So it needs to be enabled explicitly in the bootloader, and isn't
|
||||
available on user devices. For Pixel devices, it's available on EVT and DVT devices on Pixel 4,
|
||||
Pixel 4a (5G) and Pixel 5. To test if it's available on other devices, you can follow commands in
|
||||
this doc and see if you can record any ETM data.
|
||||
|
||||
## Examples
|
||||
|
||||
Below are examples of collecting ETM data for AutoFDO. There are two steps: first recording ETM data,
then converting the ETM data to AutoFDO input files.
|
||||
|
||||
Record ETM data:
|
||||
|
||||
```sh
|
||||
# preparation: we need to be root to record ETM data
|
||||
$ adb root
|
||||
$ adb shell
|
||||
redfin:/ \# cd data/local/tmp
|
||||
redfin:/data/local/tmp \#
|
||||
|
||||
# Do a system-wide collection; it writes output to perf.data.
# If you only want ETM data for the kernel, use `-e cs-etm:k`.
# If you only want ETM data for userspace, use `-e cs-etm:u`.
|
||||
redfin:/data/local/tmp \# simpleperf record -e cs-etm --duration 3 -a
|
||||
|
||||
# To reduce file size and time converting to AutoFDO input files, we recommend converting ETM data
|
||||
# into an intermediate branch-list format.
|
||||
redfin:/data/local/tmp \# simpleperf inject --output branch-list -o branch_list.data
|
||||
```
|
||||
|
||||
Converting ETM data to AutoFDO input files requires reading the binaries.
So userspace libraries can be converted on device. For the kernel, it needs
to be converted on host, with vmlinux and kernel modules available.
|
||||
|
||||
Convert ETM data for userspace libraries:
|
||||
|
||||
```sh
|
||||
# Injecting ETM data on device. It writes output to perf_inject.data.
|
||||
# perf_inject.data is a text file, containing branch counts for each library.
|
||||
redfin:/data/local/tmp \# simpleperf inject -i branch_list.data
|
||||
```
|
||||
|
||||
Convert ETM data for kernel:
|
||||
|
||||
```sh
|
||||
# pull ETM data to host.
|
||||
host $ adb pull /data/local/tmp/branch_list.data
|
||||
# download vmlinux and kernel modules to <binary_dir>
|
||||
# host simpleperf is in <aosp-top>/system/extras/simpleperf/scripts/bin/linux/x86_64/simpleperf,
|
||||
# or you can build simpleperf by `mmma system/extras/simpleperf`.
|
||||
host $ simpleperf inject --symdir <binary_dir> -i branch_list.data
|
||||
```
|
||||
|
||||
The generated perf_inject.data may contain branch info for multiple binaries. But AutoFDO only
|
||||
accepts one at a time. So we need to split perf_inject.data.
|
||||
The format of perf_inject.data is below:
|
||||
|
||||
```perf_inject.data format
|
||||
|
||||
executed range with count info for binary1
|
||||
branch with count info for binary1
|
||||
// name for binary1
|
||||
|
||||
executed range with count info for binary2
|
||||
branch with count info for binary2
|
||||
// name for binary2
|
||||
|
||||
...
|
||||
```
|
||||
|
||||
We need to split perf_inject.data, and make sure one file only contains info for one binary.
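
One way to do the split is with a small awk sketch keyed on the `// binary_name` line that ends
each section (the output file names are illustrative; blank separator lines are dropped):

```sh
$ awk 'BEGIN { n = 1 }
       NF == 0 { next }                               # skip blank separator lines
       { print > ("perf_inject_part" n ".data") }     # write to the current part file
       /^\/\// { n++ }                                # a "// binary" line ends a section
      ' perf_inject.data
```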
|
||||
|
||||
Then we can use [AutoFDO](https://github.com/google/autofdo) to create profiles. AutoFDO only works
for binaries having an executable segment as their first loadable segment. But binaries built for
Android may not follow this rule. The simpleperf inject command knows how to work around this problem.
But there is a check in AutoFDO forcing binaries to start with an executable segment. We need to
disable the check in AutoFDO, by commenting out L127-L136 in
|
||||
https://github.com/google/autofdo/commit/188db2834ce74762ed17108ca344916994640708#diff-2d132ecbb5e4f13e0da65419f6d1759dd27d6b696786dd7096c0c34d499b1710R127-R136.
|
||||
Then we can use `create_llvm_prof` in AutoFDO to create profiles used by clang.
|
||||
|
||||
```sh
|
||||
# perf_inject_binary1.data is split from perf_inject.data, and only contains branch info for binary1.
|
||||
host $ autofdo/create_llvm_prof -profile perf_inject_binary1.data -profiler text -binary path_of_binary1 -out a.prof -format binary
|
||||
|
||||
# perf_inject_kernel.data is split from perf_inject.data, and only contains branch info for [kernel.kallsyms].
|
||||
host $ autofdo/create_llvm_prof -profile perf_inject_kernel.data -profiler text -binary vmlinux -out a.prof -format binary
|
||||
```
|
||||
|
||||
Then we can use a.prof for PGO during compilation, via `-fprofile-sample-use=a.prof`.
|
||||
[Here](https://clang.llvm.org/docs/UsersManual.html#using-sampling-profilers) are more details.
|
||||
|
||||
### A complete example: etm_test_loop.cpp
|
||||
|
||||
`etm_test_loop.cpp` is an example to show the complete process.
|
||||
The source code is in [etm_test_loop.cpp](https://android.googlesource.com/platform/system/extras/+/main/simpleperf/runtest/etm_test_loop.cpp).
|
||||
The build script is in [Android.bp](https://android.googlesource.com/platform/system/extras/+/main/simpleperf/runtest/Android.bp).
|
||||
It builds an executable called `etm_test_loop`, which runs on device.
|
||||
|
||||
Step 1: Build `etm_test_loop` binary.
|
||||
|
||||
```sh
|
||||
(host) <AOSP>$ . build/envsetup.sh
|
||||
(host) <AOSP>$ lunch aosp_arm64-trunk_staging-userdebug
|
||||
(host) <AOSP>$ make etm_test_loop
|
||||
```
|
||||
|
||||
Step 2: Run `etm_test_loop` on device, and collect ETM data for its running.
|
||||
|
||||
```sh
|
||||
(host) <AOSP>$ adb push out/target/product/generic_arm64/system/bin/etm_test_loop /data/local/tmp
|
||||
(host) <AOSP>$ adb root
|
||||
(host) <AOSP>$ adb shell
|
||||
(device) / # cd /data/local/tmp
|
||||
(device) /data/local/tmp # chmod a+x etm_test_loop
|
||||
(device) /data/local/tmp # simpleperf record -e cs-etm:u ./etm_test_loop
|
||||
simpleperf I cmd_record.cpp:729] Recorded for 0.0370068 seconds. Start post processing.
|
||||
simpleperf I cmd_record.cpp:799] Aux data traced: 1689136
|
||||
(device) /data/local/tmp # simpleperf inject -i perf.data --output branch-list -o branch_list.data
|
||||
simpleperf W dso.cpp:557] failed to read min virtual address of [vdso]: File not found
|
||||
(device) /data/local/tmp # exit
|
||||
(host) <AOSP>$ adb pull /data/local/tmp/branch_list.data
|
||||
```
|
||||
|
||||
Step 3: Convert ETM data to AutoFDO data.
|
||||
|
||||
```sh
|
||||
# Build simpleperf tool on host.
|
||||
(host) <AOSP>$ make simpleperf_ndk
|
||||
(host) <AOSP>$ simpleperf_ndk64 inject -i branch_list.data -o perf_inject_etm_test_loop.data --symdir out/target/product/generic_arm64/symbols/system/bin
|
||||
simpleperf W cmd_inject.cpp:505] failed to build instr ranges for binary [vdso]: File not found
|
||||
(host) <AOSP>$ cat perf_inject_etm_test_loop.data
|
||||
13
|
||||
1000-1010:1
|
||||
1014-1050:1
|
||||
...
|
||||
112c->0:1
|
||||
// /data/local/tmp/etm_test_loop
|
||||
|
||||
(host) <AOSP>$ create_llvm_prof -profile perf_inject_etm_test_loop.data -profiler text -binary out/target/product/generic_arm64/symbols/system/bin/etm_test_loop -out etm_test_loop.afdo -format binary
|
||||
(host) <AOSP>$ ls -lh etm_test_loop.afdo
|
||||
-rw-r--r-- 1 user group 241 Aug 29 16:04 etm_test_loop.afdo
|
||||
```
|
||||
|
||||
Step 4: Use AutoFDO data to build optimized binary.
|
||||
|
||||
```sh
|
||||
(host) <AOSP>$ mkdir toolchain/pgo-profiles/sampling/
|
||||
(host) <AOSP>$ cp etm_test_loop.afdo toolchain/pgo-profiles/sampling/
|
||||
(host) <AOSP>$ vi toolchain/pgo-profiles/sampling/Android.bp
|
||||
# edit Android.bp to add a fdo_profile module
|
||||
# soong_namespace {}
|
||||
#
|
||||
# fdo_profile {
|
||||
# name: "etm_test_loop_afdo",
|
||||
# profile: ["etm_test_loop.afdo"],
|
||||
# }
|
||||
```
|
||||
|
||||
`soong_namespace` is added to support fdo_profile modules with the same name.
|
||||
|
||||
In a product config mk file, update `PRODUCT_AFDO_PROFILES` with
|
||||
|
||||
```make
|
||||
PRODUCT_AFDO_PROFILES += etm_test_loop://toolchain/pgo-profiles/sampling:etm_test_loop_afdo
|
||||
```
|
||||
|
||||
```sh
|
||||
(host) <AOSP>$ vi system/extras/simpleperf/runtest/Android.bp
|
||||
# edit Android.bp to enable afdo for etm_test_loop.
|
||||
# cc_binary {
|
||||
# name: "etm_test_loop",
|
||||
# srcs: ["etm_test_loop.cpp"],
|
||||
# afdo: true,
|
||||
# }
|
||||
(host) <AOSP>$ make etm_test_loop
|
||||
```
|
||||
|
||||
If we compare the disassembly of `out/target/product/generic_arm64/symbols/system/bin/etm_test_loop`
before and after optimizing with AutoFDO data, we can see different branching preferences.
|
||||
|
||||
|
||||
## Collect ETM data with a daemon
|
||||
|
||||
Android also has a daemon collecting ETM data periodically. It only runs on userdebug and eng
|
||||
devices. The source code is in https://android.googlesource.com/platform/system/extras/+/main/profcollectd/.
|
||||
|
||||
## Support ETM in the kernel
|
||||
|
||||
To let simpleperf use the ETM function, we need to enable the Coresight driver in the kernel, which
lives in `<linux_kernel>/drivers/hwtracing/coresight`.

The Coresight driver can be enabled by the kernel configs below:
|
||||
|
||||
```config
|
||||
CONFIG_CORESIGHT=y
|
||||
CONFIG_CORESIGHT_LINK_AND_SINK_TMC=y
|
||||
CONFIG_CORESIGHT_SOURCE_ETM4X=y
|
||||
```
|
||||
|
||||
On kernel 5.10+, we recommend building the Coresight driver as kernel modules, because that works
with the GKI kernel.
|
||||
|
||||
```config
|
||||
CONFIG_CORESIGHT=m
|
||||
CONFIG_CORESIGHT_LINK_AND_SINK_TMC=m
|
||||
CONFIG_CORESIGHT_SOURCE_ETM4X=m
|
||||
```
|
||||
|
||||
Android common kernel 5.10+ should have all the Coresight patches needed to collect ETM data.
Android common kernel 5.4 is missing two patches, but by adding the patches in
https://android-review.googlesource.com/q/topic:test_etm_on_hikey960_5.4, we can collect ETM data
on hikey960 with a 5.4 kernel.
For Android common kernel 4.14 and 4.19, we have backported all necessary Coresight patches.
|
||||
|
||||
Besides the Coresight driver, we also need to add Coresight devices in the device tree. An example
is in https://github.com/torvalds/linux/blob/master/arch/arm64/boot/dts/arm/juno-base.dtsi. There
should be a path flowing ETM data from the ETM device through funnels, ETF and replicators, all the
way to the ETR, which writes ETM data to system memory.
|
||||
|
||||
One optional flag in the ETM device tree is "arm,coresight-loses-context-with-cpu". It saves ETM
registers when a CPU enters a low power state. It may be needed to avoid the
"coresight_disclaim_device_unlocked" warning when doing system wide collection.
|
||||
|
||||
One optional flag in the ETR device tree is "arm,scatter-gather". Simpleperf requests 4M of system
memory for the ETR to store ETM data. Without an IOMMU, the memory needs to be contiguous. If the
kernel can't fulfill the request, simpleperf will report an out-of-memory error. Fortunately, we can
use the "arm,scatter-gather" flag to let the ETR run in scatter-gather mode, which uses
non-contiguous memory.
|
||||
|
||||
|
||||
### A possible problem: trace_id mismatch
|
||||
|
||||
Each CPU has an ETM device, which has a unique trace_id assigned by the kernel.
The formula is: `trace_id = 0x10 + cpu * 2`, as in https://github.com/torvalds/linux/blob/master/include/linux/coresight-pmu.h#L37.
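For example, with this default formula, CPU 0 uses trace_id 0x10, CPU 1 uses 0x12, and CPU 2 uses 0x14.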
|
||||
If the formula is modified by local patches, the simpleperf inject command can't parse ETM data
properly and is likely to give empty output.
|
||||
|
||||
|
||||
## Enable ETM in the bootloader
|
||||
|
||||
Unless the ARMv8.4 Self-hosted Trace extension is implemented, ETM is considered an external debug
interface. It may be disabled by a fuse (like JTAG). So we need to check whether ETM is disabled,
and whether the bootloader provides a way to re-enable it.
|
||||
|
||||
We can tell if ETM is disabled by checking its TRCAUTHSTATUS register, which is exposed in sysfs,
like /sys/bus/coresight/devices/coresight-etm0/mgmt/trcauthstatus. To re-enable ETM, we need to
enable non-secure non-invasive debug on the ARM CPU. The method depends on the chip vendor (SoC).
|
||||
|
||||
|
||||
## Related docs
|
||||
|
||||
* [Arm Architecture Reference Manual Armv8, D3 AArch64 Self-hosted Trace](https://developer.arm.com/documentation/ddi0487/latest)
|
||||
* [ARM ETM Architecture Specification](https://developer.arm.com/documentation/ihi0064/latest/)
|
||||
* [ARM CoreSight Architecture Specification](https://developer.arm.com/documentation/ihi0029/latest)
|
||||
* [CoreSight Components Technical Reference Manual](https://developer.arm.com/documentation/ddi0314/h/)
|
||||
* [CoreSight Trace Memory Controller Technical Reference Manual](https://developer.arm.com/documentation/ddi0461/b/)
|
||||
* [OpenCSD library for decoding ETM data](https://github.com/Linaro/OpenCSD)
|
||||
* [AutoFDO tool for converting profile data](https://github.com/google/autofdo)
|
||||
@ -0,0 +1,79 @@
|
||||
# Debug dwarf unwinding
|
||||
|
||||
Dwarf unwinding is the default way of getting call graphs in simpleperf. In this process,
|
||||
simpleperf asks the kernel to add stack and register data to each sample. Then it uses
|
||||
[libunwindstack](https://cs.android.com/android/platform/superproject/+/main:system/unwinding/libunwindstack/)
|
||||
to unwind the call stack. libunwindstack uses dwarf sections (like .debug_frame or .eh_frame) in
|
||||
elf files to know how to unwind the stack.
|
||||
|
||||
By default, `simpleperf record` unwinds a sample before saving it to disk, to reduce space consumed
|
||||
by stack data. But this behavior makes it harder to reproduce unwinding problems. So we added the
debug-unwind command to help debug and profile dwarf unwinding. Below are two use cases.
|
||||
|
||||
[TOC]
|
||||
|
||||
## Debug failed unwinding cases
|
||||
|
||||
Unwinding a sample can fail for different reasons: not enough stack or register data, unknown
thread maps, no dwarf info, bugs in code, etc. To fix them, we need to get error details and be
able to reproduce them. The simpleperf record cmd has two options for this:
|
||||
`--keep-failed-unwinding-result` keeps error code for failed unwinding samples. It's lightweight
|
||||
and gives us a brief idea why unwinding stops.
|
||||
`--keep-failed-unwinding-debug-info` keeps stack and register data for failed unwinding samples. It
|
||||
can be used to reproduce the unwinding process given proper elf files. Below is an example.
|
||||
|
||||
```sh
|
||||
# Run record cmd and keep failed unwinding debug info.
|
||||
$ simpleperf64 record --app com.example.android.displayingbitmaps -g --duration 10 \
|
||||
--keep-failed-unwinding-debug-info
|
||||
...
|
||||
simpleperf I cmd_record.cpp:762] Samples recorded: 22026. Samples lost: 0.
|
||||
|
||||
# Generate a text report containing failed unwinding cases.
|
||||
$ simpleperf debug-unwind --generate-report -o report.txt
|
||||
|
||||
# Pull report.txt on host and show it using debug_unwind_reporter.py.
|
||||
# Show summary.
|
||||
$ debug_unwind_reporter.py -i report.txt --summary
|
||||
# Show summary of samples failed at a symbol.
|
||||
$ debug_unwind_reporter.py -i report.txt --summary --include-end-symbol SocketInputStream_socketRead0
|
||||
# Show details of samples failed at a symbol.
|
||||
$ debug_unwind_reporter.py -i report.txt --include-end-symbol SocketInputStream_socketRead0
|
||||
|
||||
# Reproduce unwinding a failed case.
|
||||
$ simpleperf debug-unwind --unwind-sample --sample-time 256666343213301
|
||||
|
||||
# Generate a test file containing a failed case and elf files for debugging it.
|
||||
$ simpleperf debug-unwind --generate-test-file --sample-time 256666343213301 --keep-binaries-in-test-file \
|
||||
/apex/com.android.runtime/lib64/bionic/libc.so,/apex/com.android.art/lib64/libopenjdk.so -o test.data
|
||||
```
|
||||
|
||||
## Profile unwinding process
|
||||
|
||||
We can also record samples without unwinding them. Then we can use debug-unwind cmd to unwind the
|
||||
samples after recording. Below is an example.
|
||||
|
||||
```sh
|
||||
# Record samples without unwinding them.
|
||||
$ simpleperf record --app com.example.android.displayingbitmaps -g --duration 10 \
|
||||
--no-unwind
|
||||
...
|
||||
simpleperf I cmd_record.cpp:762] Samples recorded: 9923. Samples lost: 0.
|
||||
|
||||
# Use debug-unwind cmd to unwind samples.
|
||||
$ simpleperf debug-unwind --unwind-sample
|
||||
```
|
||||
|
||||
We can profile the unwinding process to find hot functions to improve.
|
||||
|
||||
```sh
|
||||
# Profile debug-unwind cmd.
|
||||
$ simpleperf record -g -o perf_unwind.data simpleperf debug-unwind --unwind-sample --skip-sample-print
|
||||
|
||||
# Then pull perf_unwind.data and report it.
|
||||
$ report_html.py -i perf_unwind.data
|
||||
|
||||
# We can also add source code annotation in report.html.
|
||||
$ binary_cache_builder.py -i perf_unwind.data -lib <path to aosp-main>/out/target/product/<device-name>/symbols/system
|
||||
$ report_html.py -i perf_unwind.data --add_source_code --source_dirs <path to aosp-main>/system/
|
||||
```
|
||||
@ -0,0 +1,696 @@
|
||||
# Executable commands reference
|
||||
|
||||
[TOC]
|
||||
|
||||
## How simpleperf works
|
||||
|
||||
Modern CPUs have a hardware component called the performance monitoring unit (PMU). The PMU has
|
||||
several hardware counters, counting events like how many cpu cycles have happened, how many
|
||||
instructions have executed, or how many cache misses have happened.
|
||||
|
||||
The Linux kernel wraps these hardware counters into hardware perf events. In addition, the Linux
|
||||
kernel also provides hardware independent software events and tracepoint events. The Linux kernel
|
||||
exposes all events to userspace via the perf_event_open system call, which is used by simpleperf.
|
||||
|
||||
Simpleperf has three main commands: stat, record and report.
|
||||
|
||||
The stat command gives a summary of how many events have happened in the profiled processes in a
|
||||
time period. Here’s how it works:
|
||||
1. Given user options, simpleperf enables profiling by making a system call to the kernel.
|
||||
2. The kernel enables counters while the profiled processes are running.
|
||||
3. After profiling, simpleperf reads counters from the kernel, and reports a counter summary.
|
||||
|
||||
The record command records samples of the profiled processes in a time period. Here’s how it works:
|
||||
1. Given user options, simpleperf enables profiling by making a system call to the kernel.
|
||||
2. Simpleperf creates mapped buffers between simpleperf and the kernel.
|
||||
3. The kernel enables counters while the profiled processes are running.
|
||||
4. Each time a given number of events happen, the kernel dumps a sample to the mapped buffers.
|
||||
5. Simpleperf reads samples from the mapped buffers and stores profiling data in a file called
|
||||
perf.data.
|
||||
|
||||
The report command reads perf.data and any shared libraries used by the profiled processes,
|
||||
and outputs a report showing where the time was spent.
|
||||
|
||||
## Commands
|
||||
|
||||
Simpleperf supports several commands, listed below:
|
||||
|
||||
```
|
||||
The debug-unwind command: debug/test dwarf based offline unwinding, used for debugging simpleperf.
|
||||
The dump command: dumps content in perf.data, used for debugging simpleperf.
|
||||
The help command: prints help information for other commands.
|
||||
The kmem command: collects kernel memory allocation information (will be replaced by Python scripts).
|
||||
The list command: lists all event types supported on the Android device.
|
||||
The record command: profiles processes and stores profiling data in perf.data.
|
||||
The report command: reports profiling data in perf.data.
|
||||
The report-sample command: reports each sample in perf.data, used for supporting integration of
|
||||
simpleperf in Android Studio.
|
||||
The stat command: profiles processes and prints counter summary.
|
||||
|
||||
```
|
||||
|
||||
Each command supports different options, which can be seen through the help message.
|
||||
|
||||
```sh
|
||||
# List all commands.
|
||||
$ simpleperf --help
|
||||
|
||||
# Print help message for record command.
|
||||
$ simpleperf record --help
|
||||
```
|
||||
|
||||
The sections below describe the most frequently used commands: list, stat, record and report.
|
||||
|
||||
## The list command
|
||||
|
||||
The list command lists all events available on the device. Different devices may support different
|
||||
events because they have different hardware and kernels.
|
||||
|
||||
```sh
|
||||
$ simpleperf list
|
||||
List of hw-cache events:
|
||||
branch-loads
|
||||
...
|
||||
List of hardware events:
|
||||
cpu-cycles
|
||||
instructions
|
||||
...
|
||||
List of software events:
|
||||
cpu-clock
|
||||
task-clock
|
||||
...
|
||||
```
|
||||
|
||||
On ARM/ARM64, the list command also shows a list of raw events; they are the events supported by
the ARM PMU on the device. The kernel has wrapped some of them into hardware events and hw-cache
events. For example, raw-cpu-cycles is wrapped into cpu-cycles, and raw-instruction-retired is
wrapped into instructions. The raw events are provided in case we want to use some events supported
on the device that are unfortunately not wrapped by the kernel.
|
||||
|
||||
## The stat command
|
||||
|
||||
The stat command is used to get event counter values of the profiled processes. By passing options,
|
||||
we can select which events to use, which processes/threads to monitor, how long to monitor and the
|
||||
print interval.
|
||||
|
||||
```sh
|
||||
# Stat using default events (cpu-cycles,instructions,...), and monitor process 7394 for 10 seconds.
|
||||
$ simpleperf stat -p 7394 --duration 10
|
||||
Performance counter statistics:
|
||||
|
||||
# count event_name # count / runtime
|
||||
16,513,564 cpu-cycles # 1.612904 GHz
|
||||
4,564,133 stalled-cycles-frontend # 341.490 M/sec
|
||||
6,520,383 stalled-cycles-backend # 591.666 M/sec
|
||||
4,900,403 instructions # 612.859 M/sec
|
||||
47,821 branch-misses # 6.085 M/sec
|
||||
25.274251(ms) task-clock # 0.002520 cpus used
|
||||
4 context-switches # 158.264 /sec
|
||||
466 page-faults # 18.438 K/sec
|
||||
|
||||
Total test time: 10.027923 seconds.
|
||||
```
|
||||
|
||||
### Select events to stat
|
||||
|
||||
We can select which events to use via -e.
|
||||
|
||||
```sh
|
||||
# Stat event cpu-cycles.
|
||||
$ simpleperf stat -e cpu-cycles -p 11904 --duration 10
|
||||
|
||||
# Stat event cache-references and cache-misses.
|
||||
$ simpleperf stat -e cache-references,cache-misses -p 11904 --duration 10
|
||||
```
|
||||
|
||||
When running the stat command, if the number of hardware events is larger than the number of
|
||||
hardware counters available in the PMU, the kernel shares hardware counters between events, so each
|
||||
event is only monitored for part of the total time. As a result, the number of events shown is
|
||||
smaller than the number of events that actually happened. The following is an example.
|
||||
|
||||
```sh
|
||||
# Stat using event cache-references, cache-references:u,....
|
||||
$ simpleperf stat -p 7394 -e cache-references,cache-references:u,cache-references:k \
|
||||
-e cache-misses,cache-misses:u,cache-misses:k,instructions --duration 1
|
||||
Performance counter statistics:
|
||||
|
||||
# count event_name # count / runtime
|
||||
490,713 cache-references # 151.682 M/sec
|
||||
899,652 cache-references:u # 130.152 M/sec
|
||||
855,218 cache-references:k # 111.356 M/sec
|
||||
61,602 cache-misses # 7.710 M/sec
|
||||
33,282 cache-misses:u # 5.050 M/sec
|
||||
11,662 cache-misses:k # 4.478 M/sec
|
||||
0 instructions #
|
||||
|
||||
Total test time: 1.000867 seconds.
|
||||
simpleperf W cmd_stat.cpp:946] It seems the number of hardware events are more than the number of
|
||||
available CPU PMU hardware counters. That will trigger hardware counter
|
||||
multiplexing. As a result, events are not counted all the time processes
|
||||
running, and event counts are smaller than what really happens.
|
||||
Use --print-hw-counter to show available hardware counters.
|
||||
```
|
||||
|
||||
In the example above, we monitor 7 events. Each event is only monitored for part of the total time.
That's why the number of cache-references is smaller than the number of cache-references:u
(cache-references only in userspace) and cache-references:k (cache-references only in kernel), and
the number of instructions is zero. After printing the result, simpleperf checks whether the CPUs
have enough hardware counters to count the hardware events at the same time. If not, it prints a
warning.
|
||||
|
||||
To avoid hardware counter multiplexing, we can use `simpleperf stat --print-hw-counter` to show
|
||||
available counters on each CPU. Then don't monitor more hardware events than counters available.
|
||||
|
||||
```sh
|
||||
$ simpleperf stat --print-hw-counter
|
||||
There are 2 CPU PMU hardware counters available on cpu 0.
|
||||
There are 2 CPU PMU hardware counters available on cpu 1.
|
||||
There are 2 CPU PMU hardware counters available on cpu 2.
|
||||
There are 2 CPU PMU hardware counters available on cpu 3.
|
||||
There are 2 CPU PMU hardware counters available on cpu 4.
|
||||
There are 2 CPU PMU hardware counters available on cpu 5.
|
||||
There are 2 CPU PMU hardware counters available on cpu 6.
|
||||
There are 2 CPU PMU hardware counters available on cpu 7.
|
||||
```
|
||||
|
||||
When counter multiplexing happens, there is no guarantee of which events will be monitored at
|
||||
which time. If we want to ensure some events are always monitored at the same time, we can use
|
||||
`--group`.
|
||||
|
||||
```sh
|
||||
# Stat using event cache-references, cache-references:u,....
|
||||
$ simpleperf stat -p 7964 --group cache-references,cache-misses \
|
||||
--group cache-references:u,cache-misses:u --group cache-references:k,cache-misses:k \
|
||||
--duration 1
|
||||
Performance counter statistics:
|
||||
|
||||
# count event_name # count / runtime
|
||||
2,088,463 cache-references # 181.360 M/sec
|
||||
47,871 cache-misses # 2.292164% miss rate
|
||||
1,277,600 cache-references:u # 136.419 M/sec
|
||||
25,977 cache-misses:u # 2.033265% miss rate
|
||||
326,305 cache-references:k # 74.724 M/sec
|
||||
13,596 cache-misses:k # 4.166654% miss rate
|
||||
|
||||
Total test time: 1.029729 seconds.
|
||||
simpleperf W cmd_stat.cpp:946] It seems the number of hardware events are more than the number of
|
||||
...
|
||||
```
|
||||
|
||||
### Select target to stat
|
||||
|
||||
We can select which processes or threads to monitor via -p or -t. Monitoring a
|
||||
process is the same as monitoring all threads in the process. Simpleperf can also fork a child
|
||||
process to run the new command and then monitor the child process.
|
||||
|
||||
```sh
|
||||
# Stat process 11904 and 11905.
|
||||
$ simpleperf stat -p 11904,11905 --duration 10
|
||||
|
||||
# Stat processes with name containing "chrome".
|
||||
$ simpleperf stat -p chrome --duration 10
|
||||
# Stat processes with name containing part matching regex "chrome:(privileged|sandboxed)".
|
||||
$ simpleperf stat -p "chrome:(privileged|sandboxed)" --duration 10
|
||||
|
||||
# Stat thread 11904 and 11905.
|
||||
$ simpleperf stat -t 11904,11905 --duration 10
|
||||
|
||||
# Start a child process running `ls`, and stat it.
|
||||
$ simpleperf stat ls
|
||||
|
||||
# Stat the process of an Android application. On non-root devices, this only works for debuggable
|
||||
# or profileable from shell apps.
|
||||
$ simpleperf stat --app simpleperf.example.cpp --duration 10
|
||||
|
||||
# Stat only selected thread 11904 in an app.
|
||||
$ simpleperf stat --app simpleperf.example.cpp -t 11904 --duration 10
|
||||
|
||||
# Stat system wide using -a.
|
||||
$ simpleperf stat -a --duration 10
|
||||
```
|
||||
|
||||
### Decide how long to stat
|
||||
|
||||
When monitoring existing threads, we can use --duration to decide how long to monitor. When
|
||||
monitoring a child process running a new command, simpleperf monitors until the child process ends.
|
||||
In this case, we can use Ctrl-C to stop monitoring at any time.
|
||||
|
||||
```sh
|
||||
# Stat process 11904 for 10 seconds.
|
||||
$ simpleperf stat -p 11904 --duration 10
|
||||
|
||||
# Stat until the child process running `ls` finishes.
|
||||
$ simpleperf stat ls
|
||||
|
||||
# Stop monitoring using Ctrl-C.
|
||||
$ simpleperf stat -p 11904 --duration 10
|
||||
^C
|
||||
```
|
||||
|
||||
If you want to write a script to control how long to monitor, you can send one of SIGINT, SIGTERM,
|
||||
SIGHUP signals to simpleperf to stop monitoring.
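For example, a small wrapper script could look like this. It is a minimal sketch, assuming
simpleperf is on PATH where the script runs; the pid 11904 and the 5-second duration are example
values:

```python
import signal
import subprocess
import time

# Start `simpleperf stat` without --duration, wait, then stop it with SIGINT.
# simpleperf prints the counter summary when it receives the signal.
proc = subprocess.Popen(["simpleperf", "stat", "-p", "11904"])
time.sleep(5)
proc.send_signal(signal.SIGINT)
proc.wait()
```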
|
||||
|
||||
### Decide the print interval
|
||||
|
||||
When monitoring perf counters, we can also use --interval to decide the print interval.
|
||||
|
||||
```sh
|
||||
# Print stat for process 11904 every 300ms.
|
||||
$ simpleperf stat -p 11904 --duration 10 --interval 300
|
||||
|
||||
# Print system wide stat at interval of 300ms for 10 seconds. Note that system wide profiling needs
|
||||
# root privilege.
|
||||
$ su 0 simpleperf stat -a --duration 10 --interval 300
|
||||
```
|
||||
|
||||
### Display counters in systrace
|
||||
|
||||
Simpleperf can also work with systrace to dump counters in the collected trace. Below is an example
|
||||
to do a system wide stat.
|
||||
|
||||
```sh
|
||||
# Capture instructions (kernel only) and cache misses with interval of 300 milliseconds for 15
|
||||
# seconds.
|
||||
$ su 0 simpleperf stat -e instructions:k,cache-misses -a --interval 300 --duration 15
|
||||
# On host launch systrace to collect trace for 10 seconds.
|
||||
(HOST)$ external/chromium-trace/systrace.py --time=10 -o new.html sched gfx view
|
||||
# Open the collected new.html in browser and perf counters will be shown up.
|
||||
```
|
||||
|
||||
### Show event count per thread
|
||||
|
||||
By default, the stat cmd outputs an event count sum for all monitored targets. But when the
`--per-thread` option is used, the stat cmd outputs an event count for each thread in the monitored
targets. It can be used to find busy threads in a process or system wide. With the `--per-thread`
option, the stat cmd opens a perf_event_file for each existing thread. If a monitored thread creates
new threads, the event count for the new threads is added to the monitored thread by default, or
omitted if the `--no-inherit` option is also used.
|
||||
|
||||
```sh
|
||||
# Print event counts for each thread in process 11904. Event counts for threads created after
|
||||
# stat cmd will be added to threads creating them.
|
||||
$ simpleperf stat --per-thread -p 11904 --duration 1
|
||||
|
||||
# Print event counts for all threads running in the system every 1s. Threads not running will not
|
||||
# be reported.
|
||||
$ su 0 simpleperf stat --per-thread -a --interval 1000 --interval-only-values
|
||||
|
||||
# Print event counts for all threads running in the system every 1s. Event counts for threads
|
||||
# created after stat cmd will be omitted.
|
||||
$ su 0 simpleperf stat --per-thread -a --interval 1000 --interval-only-values --no-inherit
|
||||
```
|
||||
|
||||
### Show event count per core
|
||||
|
||||
By default, the stat cmd outputs an event count sum for all monitored cpu cores. But when the
`--per-core` option is used, the stat cmd outputs an event count for each core. It can be used to
see how events are distributed on different cores.
When running stat non-system wide with the `--per-core` option, simpleperf creates a perf event for
each monitored thread on each core. When a thread is in running state, perf events on all cores are
enabled, but only the perf event on the core running the thread is in running state. So the
percentage comment shows runtime_on_a_core / runtime_on_all_cores. Note that the percentage is
still affected by hardware counter multiplexing. Check the simpleperf log output for ways to
distinguish it.
|
||||
|
||||
```sh
|
||||
# Print event counts for each cpu running threads in process 11904.
|
||||
# A percentage shows runtime_on_a_cpu / runtime_on_all_cpus.
|
||||
$ simpleperf stat -e cpu-cycles --per-core -p 1057 --duration 3
|
||||
Performance counter statistics:
|
||||
|
||||
# cpu count event_name # count / runtime
|
||||
0 1,667,660 cpu-cycles # 1.571565 GHz
|
||||
1 3,850,440 cpu-cycles # 1.736958 GHz
|
||||
2 2,463,792 cpu-cycles # 1.701367 GHz
|
||||
3 2,350,528 cpu-cycles # 1.700841 GHz
|
||||
5 7,919,520 cpu-cycles # 2.377081 GHz
|
||||
6 105,622,673 cpu-cycles # 2.381331 GHz
|
||||
|
||||
Total test time: 3.002703 seconds.
|
||||
|
||||
# Print event counts for each cpu system wide.
|
||||
$ su 0 simpleperf stat --per-core -a --duration 1
|
||||
|
||||
# Print cpu-cycle event counts for each cpu for each thread running in the system.
|
||||
$ su 0 simpleperf stat -e cpu-cycles -a --per-thread --per-core --duration 1
|
||||
```
|
||||
|
||||
### Monitor different events on different cores
|
||||
|
||||
Android devices usually have big and little cores. Different cores may support different events.
|
||||
Therefore, we may want to monitor different events on different cores. We can do this using
|
||||
the `--cpu` option. The `--cpu` option selects the cores on which to monitor events. A `--cpu`
|
||||
option affects all the following events until meeting another `--cpu` option. The first `--cpu`
|
||||
option also affects all events before it. Following are some examples:
|
||||
|
||||
```sh
|
||||
# By default, cpu-cycles and instructions are monitored on all cpus.
|
||||
$ su 0 simpleperf stat -e cpu-cycles,instructions -a --duration 1 --per-core
|
||||
|
||||
# Use one `--cpu` option to monitor cpu-cycles and instructions only on cpu 0-3,8.
|
||||
$ su 0 simpleperf stat -e cpu-cycles --cpu 0-3,8 -e instructions -a --duration 1 --per-core
|
||||
|
||||
# Use two `--cpu` options to monitor raw-l3d-cache-refill-rd on cpu 0-3, and raw-l3d-cache-refill on
|
||||
# cpu 4-8.
|
||||
$ su 0 simpleperf stat --cpu 0-3 -e raw-l3d-cache-refill-rd --cpu 4-8 -e raw-l3d-cache-refill \
|
||||
-a --duration 1 --per-core
|
||||
```
|
||||
|
||||
## The record command
|
||||
|
||||
The record command is used to dump samples of the profiled processes. Each sample can contain
information like the time at which the sample was generated, the number of events since the last
sample, the program counter of a thread, and the call chain of a thread.
|
||||
|
||||
By passing options, we can select which events to use, which processes/threads to monitor,
|
||||
what frequency to dump samples, how long to monitor, and where to store samples.
|
||||
|
||||
```sh
|
||||
# Record on process 7394 for 10 seconds, using default event (cpu-cycles), using default sample
|
||||
# frequency (4000 samples per second), writing records to perf.data.
|
||||
$ simpleperf record -p 7394 --duration 10
|
||||
simpleperf I cmd_record.cpp:316] Samples recorded: 21430. Samples lost: 0.
|
||||
```
|
||||
|
||||
### Select events to record
|
||||
|
||||
By default, the cpu-cycles event is used to evaluate consumed cpu cycles. But we can also use other
|
||||
events via -e.
|
||||
|
||||
```sh
|
||||
# Record using event instructions.
|
||||
$ simpleperf record -e instructions -p 11904 --duration 10
|
||||
|
||||
# Record using task-clock, which shows the passed CPU time in nanoseconds.
|
||||
$ simpleperf record -e task-clock -p 11904 --duration 10
|
||||
```
|
||||
|
||||
### Select target to record
|
||||
|
||||
The way to select target in record command is similar to that in the stat command.
|
||||
|
||||
```sh
|
||||
# Record process 11904 and 11905.
|
||||
$ simpleperf record -p 11904,11905 --duration 10
|
||||
|
||||
# Record processes with name containing "chrome".
|
||||
$ simpleperf record -p chrome --duration 10
|
||||
# Record processes with name containing part matching regex "chrome:(privileged|sandboxed)".
|
||||
$ simpleperf record -p "chrome:(privileged|sandboxed)" --duration 10
|
||||
|
||||
# Record thread 11904 and 11905.
|
||||
$ simpleperf record -t 11904,11905 --duration 10
|
||||
|
||||
# Record a child process running `ls`.
|
||||
$ simpleperf record ls
|
||||
|
||||
# Record the process of an Android application. On non-root devices, this only works for debuggable
|
||||
# or profileable from shell apps.
|
||||
$ simpleperf record --app simpleperf.example.cpp --duration 10
|
||||
|
||||
# Record only selected thread 11904 in an app.
|
||||
$ simpleperf record --app simpleperf.example.cpp -t 11904 --duration 10
|
||||
|
||||
# Record system wide.
|
||||
$ simpleperf record -a --duration 10
|
||||
```
|
||||
|
||||
### Set the frequency to record
|
||||
|
||||
We can set the frequency to dump records via -f or -c. For example, -f 4000 means
|
||||
dumping approximately 4000 records every second when the monitored thread runs. If a monitored
|
||||
thread runs 0.2s in one second (it can be preempted or blocked in other times), simpleperf dumps
|
||||
about 4000 * 0.2 / 1.0 = 800 records every second. Another way is using -c. For example, -c 10000
|
||||
means dumping one record whenever 10000 events happen.
|
||||
|
||||
```sh
|
||||
# Record with sample frequency 1000: sample 1000 times every second running.
|
||||
$ simpleperf record -f 1000 -p 11904,11905 --duration 10
|
||||
|
||||
# Record with sample period 100000: sample 1 time every 100000 events.
|
||||
$ simpleperf record -c 100000 -t 11904,11905 --duration 10
|
||||
```
|
||||
|
||||
To avoid taking too much time generating samples, kernel >= 3.10 sets the max percent of cpu time
|
||||
used for generating samples (default is 25%), and decreases the max allowed sample frequency when
|
||||
hitting that limit. Simpleperf uses --cpu-percent option to adjust it, but it needs either root
|
||||
privilege or to be on Android >= Q.
|
||||
|
||||
```sh
|
||||
# Record with sample frequency 1000, with max allowed cpu percent set to 50%.
$ simpleperf record -f 1000 -p 11904,11905 --duration 10 --cpu-percent 50
|
||||
```
|
||||
|
||||
### Decide how long to record
|
||||
|
||||
The way to decide how long to monitor in record command is similar to that in the stat command.
|
||||
|
||||
```sh
|
||||
# Record process 11904 for 10 seconds.
|
||||
$ simpleperf record -p 11904 --duration 10
|
||||
|
||||
# Record until the child process running `ls` finishes.
|
||||
$ simpleperf record ls
|
||||
|
||||
# Stop monitoring using Ctrl-C.
|
||||
$ simpleperf record -p 11904 --duration 10
|
||||
^C
|
||||
```
|
||||
|
||||
If you want to write a script to control how long to monitor, you can send one of SIGINT, SIGTERM,
|
||||
SIGHUP signals to simpleperf to stop monitoring.
|
||||
|
||||
### Set the path to store profiling data
|
||||
|
||||
By default, simpleperf stores profiling data in perf.data in the current directory. But the path
|
||||
can be changed using -o.
|
||||
|
||||
```sh
|
||||
# Write records to data/perf2.data.
|
||||
$ simpleperf record -p 11904 -o data/perf2.data --duration 10
|
||||
```
|
||||
|
||||
### Record call graphs
|
||||
|
||||
A call graph is a tree showing function call relations. Below is an example.
|
||||
|
||||
```
|
||||
main() {
    FunctionOne();
    FunctionTwo();
}
FunctionOne() {
    FunctionTwo();
    FunctionThree();
}
a call graph:
    main-> FunctionOne
       |      |
       |      |-> FunctionTwo
       |      |-> FunctionThree
       |
       |-> FunctionTwo
|
||||
```
|
||||
|
||||
A call graph shows how a function calls other functions, and a reversed call graph shows how
|
||||
a function is called by other functions. To show a call graph, we need to first record it, then
|
||||
report it.
|
||||
|
||||
There are two ways to record a call graph: one is recording a dwarf based call graph, the other is
recording a stack frame based call graph. Recording dwarf based call graphs requires debug
information in native binaries, while recording stack frame based call graphs requires stack frame
registers.
|
||||
|
||||
```sh
|
||||
# Record a dwarf based call graph
|
||||
$ simpleperf record -p 11904 -g --duration 10
|
||||
|
||||
# Record a stack frame based call graph
|
||||
$ simpleperf record -p 11904 --call-graph fp --duration 10
|
||||
```
|
||||
|
||||
[Here](README.md#suggestions-about-recording-call-graphs) are some suggestions about recording call graphs.
|
||||
|
||||
### Record both on CPU time and off CPU time
|
||||
|
||||
Simpleperf is a CPU profiler, which generates samples for a thread only when it is running on a
|
||||
CPU. But sometimes we want to know where the thread time is spent off-cpu (like preempted by other
|
||||
threads, blocked in IO or waiting for some events). To support this, simpleperf added a
|
||||
--trace-offcpu option to the record command. When --trace-offcpu is used, simpleperf does the
|
||||
following things:
|
||||
|
||||
1) Only the cpu-clock/task-clock event is allowed to be used with --trace-offcpu. This lets
simpleperf generate on-cpu samples for the cpu-clock event.
|
||||
2) Simpleperf also monitors sched:sched_switch event, which will generate a sched_switch sample
|
||||
each time the monitored thread is scheduled off cpu.
|
||||
3) Simpleperf also records context switch records. So it knows when the thread is scheduled back on
|
||||
a cpu.
|
||||
|
||||
The samples and context switch records collected by simpleperf for a thread are shown below:
|
||||
|
||||

|
||||
|
||||
Here we have two types of samples:
|
||||
1) on-cpu samples generated for cpu-clock event. The period value in each sample means how many
|
||||
nanoseconds are spent on cpu (for the callchain of this sample).
|
||||
2) off-cpu (sched_switch) samples generated for sched:sched_switch event. The period value is
|
||||
calculated as **Timestamp of the next switch on record** minus **Timestamp of the current sample**
|
||||
by simpleperf. So the period value in each sample means how many nanoseconds are spent off cpu
|
||||
(for the callchain of this sample).
|
||||
|
||||
**note**: In reality, switch on records and samples may be lost. To mitigate the loss of accuracy,
we calculate the period of an off-cpu sample as **Timestamp of the next switch on record or sample**
minus **Timestamp of the current sample**.
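For example, if an off-cpu sample is generated at 1,000,000 ns and the next switch on record or
sample for that thread is at 4,000,000 ns, the off-cpu sample gets a period of 3,000,000 ns, i.e.
about 3 ms spent off cpu for that callchain.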
|
||||
|
||||
When reporting via python scripts, simpleperf_report_lib.py provides SetTraceOffCpuMode() method
|
||||
to control how to report the samples:
|
||||
1) on-cpu mode: only report on-cpu samples.
|
||||
2) off-cpu mode: only report off-cpu samples.
|
||||
3) on-off-cpu mode: report both on-cpu and off-cpu samples, which can be split by event name.
|
||||
4) mixed-on-off-cpu mode: report on-cpu and off-cpu samples under the same event name.
|
||||
|
||||
If not set, mixed-on-off-cpu mode will be used to report.
|
||||
|
||||
When using report_html.py, inferno and report_sample.py, the report mode can be set by
|
||||
--trace-offcpu option.
|
||||
|
||||
Below are some examples recording and reporting trace offcpu profiles.
|
||||
|
||||
```sh
|
||||
# Check if --trace-offcpu is supported by the kernel (should be available on kernel >= 4.2).
|
||||
$ simpleperf list --show-features
|
||||
trace-offcpu
|
||||
...
|
||||
|
||||
# Record with --trace-offcpu.
|
||||
$ simpleperf record -g -p 11904 --duration 10 --trace-offcpu -e cpu-clock
|
||||
|
||||
# Record system wide with --trace-offcpu.
|
||||
$ simpleperf record -a -g --duration 3 --trace-offcpu -e cpu-clock
|
||||
|
||||
# Record with --trace-offcpu using app_profiler.py.
|
||||
$ ./app_profiler.py -p com.google.samples.apps.sunflower \
|
||||
-r "-g -e cpu-clock:u --duration 10 --trace-offcpu"
|
||||
|
||||
# Report on-cpu samples.
|
||||
$ ./report_html.py --trace-offcpu on-cpu
|
||||
# Report off-cpu samples.
|
||||
$ ./report_html.py --trace-offcpu off-cpu
|
||||
# Report on-cpu and off-cpu samples under different event names.
|
||||
$ ./report_html.py --trace-offcpu on-off-cpu
|
||||
# Report on-cpu and off-cpu samples under the same event name.
|
||||
$ ./report_html.py --trace-offcpu mixed-on-off-cpu
|
||||
```
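If you process the profile with simpleperf_report_lib.py directly, the report mode can be set in
your script. Below is a minimal sketch using the ReportLib API, run from the simpleperf scripts
directory; the perf.data path and the printed fields are only for illustration:

```python
from simpleperf_report_lib import ReportLib

lib = ReportLib()
lib.SetRecordFile("perf.data")
# One of: "on-cpu", "off-cpu", "on-off-cpu", "mixed-on-off-cpu" (the default).
lib.SetTraceOffCpuMode("mixed-on-off-cpu")
while True:
    sample = lib.GetNextSample()
    if sample is None:
        break
    event = lib.GetEventOfCurrentSample()
    symbol = lib.GetSymbolOfCurrentSample()
    # For cpu-clock and sched:sched_switch samples, the period is in nanoseconds.
    print(event.name, symbol.symbol_name, sample.period)
lib.Close()
```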
|
||||
|
||||
## The report command
|
||||
|
||||
The report command is used to report profiling data generated by the record command. The report
contains a table of sample entries. Each sample entry is a row in the report. The report command
groups samples belonging to the same process, thread, library and function into the same sample
entry, then sorts the sample entries based on the event count each sample entry has.
|
||||
|
||||
By passing options, we can decide how to filter out uninteresting samples, how to group samples
|
||||
into sample entries, and where to find profiling data and binaries.
|
||||
|
||||
Below is an example. Records are grouped into 4 sample entries; each entry is a row. There are
several columns; each column shows a piece of information belonging to a sample entry. The first
column is Overhead, which shows the percentage of the total events that fall inside the current
sample entry. As the perf event is cpu-cycles, the overhead is the percentage of CPU cycles used in
each function.
|
||||
|
||||
```sh
|
||||
# Reports perf.data, using only records sampled in libsudo-game-jni.so, grouping records using
|
||||
# thread name(comm), process id(pid), thread id(tid), function name(symbol), and showing sample
|
||||
# count for each row.
|
||||
$ simpleperf report --dsos /data/app/com.example.sudogame-2/lib/arm64/libsudo-game-jni.so \
|
||||
--sort comm,pid,tid,symbol -n
|
||||
Cmdline: /data/data/com.example.sudogame/simpleperf record -p 7394 --duration 10
|
||||
Arch: arm64
|
||||
Event: cpu-cycles (type 0, config 0)
|
||||
Samples: 28235
|
||||
Event count: 546356211
|
||||
|
||||
Overhead Sample Command Pid Tid Symbol
|
||||
59.25% 16680 sudogame 7394 7394 checkValid(Board const&, int, int)
|
||||
20.42% 5620 sudogame 7394 7394 canFindSolution_r(Board&, int, int)
|
||||
13.82% 4088 sudogame 7394 7394 randomBlock_r(Board&, int, int, int, int, int)
|
||||
6.24% 1756 sudogame 7394 7394 @plt
|
||||
```
|
||||
|
||||
### Set the path to read profiling data
|
||||
|
||||
By default, the report command reads profiling data from perf.data in the current directory.
|
||||
But the path can be changed using -i.
|
||||
|
||||
```sh
|
||||
$ simpleperf report -i data/perf2.data
|
||||
```
|
||||
|
||||
### Set the path to find binaries
|
||||
|
||||
To report function symbols, simpleperf needs to read the executable binaries used by the monitored
processes to get symbol tables and debug information. By default, the paths are those of the
executable binaries used by the monitored processes while recording. However, these binaries may not
exist when reporting, or may not contain symbol tables and debug information. So we can use --symfs
to redirect the paths.
|
||||
|
||||
```sh
|
||||
# In this case, when simpleperf wants to read executable binary /A/b, it reads file in /A/b.
|
||||
$ simpleperf report
|
||||
|
||||
# In this case, when simpleperf wants to read executable binary /A/b, it prefers file in
|
||||
# /debug_dir/A/b to file in /A/b.
|
||||
$ simpleperf report --symfs /debug_dir
|
||||
|
||||
# Read symbols for system libraries built locally. Note that this is not needed since Android O,
|
||||
# which ships symbols for system libraries on device.
|
||||
$ simpleperf report --symfs $ANDROID_PRODUCT_OUT/symbols
|
||||
```
|
||||
|
||||
### Filter samples
|
||||
|
||||
When reporting, often not all records are of interest. The report command supports four
filters to select samples of interest.
|
||||
|
||||
```sh
|
||||
# Report records in threads having name sudogame.
|
||||
$ simpleperf report --comms sudogame
|
||||
|
||||
# Report records in process 7394 or 7395
|
||||
$ simpleperf report --pids 7394,7395
|
||||
|
||||
# Report records in thread 7394 or 7395.
|
||||
$ simpleperf report --tids 7394,7395
|
||||
|
||||
# Report records in libsudo-game-jni.so.
|
||||
$ simpleperf report --dsos /data/app/com.example.sudogame-2/lib/arm64/libsudo-game-jni.so
|
||||
```
|
||||
|
||||
### Group samples into sample entries
|
||||
|
||||
The report command uses --sort to decide how to group sample entries.
|
||||
|
||||
```sh
|
||||
# Group records based on their process id: records having the same process id are in the same
|
||||
# sample entry.
|
||||
$ simpleperf report --sort pid
|
||||
|
||||
# Group records based on their thread id and thread comm: records having the same thread id and
|
||||
# thread name are in the same sample entry.
|
||||
$ simpleperf report --sort tid,comm
|
||||
|
||||
# Group records based on their binary and function: records in the same binary and function are in
|
||||
# the same sample entry.
|
||||
$ simpleperf report --sort dso,symbol
|
||||
|
||||
# Default option: --sort comm,pid,tid,dso,symbol. Group records in the same thread, and belong to
|
||||
# the same function in the same binary.
|
||||
$ simpleperf report
|
||||
```
|
||||
|
||||
### Report call graphs
|
||||
|
||||
To report a call graph, please make sure the profiling data is recorded with call graphs,
|
||||
as [here](#record-call-graphs).
|
||||
|
||||
```
|
||||
$ simpleperf report -g
|
||||
```
|
||||
109
Android/android-ndk-r27d/simpleperf/doc/inferno.md
Normal file
@ -0,0 +1,109 @@
|
||||
# Inferno
|
||||
|
||||

|
||||
|
||||
[TOC]
|
||||
|
||||
## Description
|
||||
|
||||
Inferno is a flamegraph generator for native (C/C++) Android apps. It was
|
||||
originally written to profile and improve surfaceflinger (the Android compositor) performance, but
it can be used for any native Android application. You can see a sample report generated with
Inferno [here](./report.html). Reports are self-contained HTML, so they can be exchanged easily.
|
||||
|
||||
Notice there is no concept of time in a flame graph since all callstacks are merged together. As a
result, the width of a flamegraph represents 100% of the number of samples, and the height is
related to the number of functions on the stack when sampling occurred.
|
||||
|
||||
|
||||

|
||||
|
||||
In the flamegraph featured above, you can see the main thread of SurfaceFlinger.
It is immediately apparent that most of the CPU time is spent processing messages in
`android::SurfaceFlinger::onMessageReceived`. The most expensive task is asking for the screen to
be refreshed, as `android::DisplayDevice::prepare` shows in orange. This graphical division helps
to see which part of the program is costly and where a developer's effort to improve performance
should go.
|
||||
|
||||
## Example of bottleneck
|
||||
|
||||
A flamegraph gives you an instant view of the CPU cycle cost centers, but it can also be used to
find specific offenders. To find them, look for plateaus. It is easier to see with an example:
|
||||
|
||||

|
||||
|
||||
In the previous flamegraph, two
|
||||
plateaus (due to `android::BufferQueueCore::validateConsistencyLocked`)
|
||||
are immediately apparent.
|
||||
|
||||
## How it works
|
||||
|
||||
Inferno relies on simpleperf to record the callstack of a native application
thousands of times per second. Simpleperf takes care of unwinding the stack,
either using frame pointers (recommended) or dwarf. At the end of the recording,
`simpleperf` also symbolizes all IPs automatically. The records are aggregated and
dumped to a file, `perf.data`. This file is pulled from the Android device
and processed on the host by Inferno. The callstacks are merged together to
visualize in which part of an app the CPU cycles are spent.
|
||||
|
||||
## How to use it
|
||||
|
||||
Open a terminal and from the `simpleperf/scripts` directory type:
|
||||
```
|
||||
./inferno.sh (on Linux/Mac)
|
||||
inferno.bat (on Windows)
|
||||
```
|
||||
|
||||
Inferno will collect data, process it, and automatically open your web browser
to display the HTML report.
|
||||
|
||||
## Parameters
|
||||
|
||||
You can select how long to sample for, the color of the nodes, and many other
things. Use `-h` to get a list of all supported parameters.
|
||||
|
||||
```
|
||||
./inferno.sh -h
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Messy flame graph
|
||||
|
||||
A healthy flame graph features a single call site at its base (see [here](./report.html)).
If you don't see a unique call site like `_start` or `_start_thread` at the base
from which all flames originate, something went wrong: stack unwinding may have
failed to reach the root callsite. These incomplete
callstacks are impossible to merge properly. By default, Inferno asks
`simpleperf` to unwind the stack via the kernel and frame pointers. Try to
perform unwinding with dwarf via `-du`; you can further tune this setting.
|
||||
|
||||
|
||||
### No flames
|
||||
|
||||
If you see no flames at all, or a mess of 1-level flames without a common base,
this may be because you compiled without frame pointers. Make sure there is no
`-fomit-frame-pointer` in your build config. Alternatively, ask simpleperf to
collect data with dwarf unwinding via `-du`.
|
||||
|
||||
|
||||
|
||||
### High percentage of lost samples
|
||||
|
||||
If simpleperf reports a lot of lost samples, it is probably because you are
unwinding with `dwarf`. Dwarf unwinding involves copying the stack before it is
processed. Try to use frame pointer unwinding, which can be done by the kernel
and is much faster.

The cost of frame pointers is negligible on arm64 but considerable
on the 32-bit arm arch (due to register pressure). Use a 64-bit build for better
profiling.
|
||||
|
||||
### run-as: package not debuggable
|
||||
|
||||
If you cannot run as root, make sure the app is debuggable; otherwise simpleperf
will not be able to profile it.
|
||||
56
Android/android-ndk-r27d/simpleperf/doc/jit_symbols.md
Normal file
@ -0,0 +1,56 @@
|
||||
# JIT symbols
|
||||
|
||||
[TOC]
|
||||
|
||||
## Java JIT symbols
|
||||
|
||||
On Android >= P, simpleperf supports profiling Java code, no matter whether it is executed by
|
||||
the interpreter, or JITed, or compiled into native instructions. So you don't need to do anything.
|
||||
|
||||
For details on Android O and N, see
|
||||
[android_application_profiling.md](./android_application_profiling.md#prepare-an-android-application).
|
||||
|
||||
## Generic JIT symbols
|
||||
|
||||
Simpleperf supports picking up symbols from per-pid symbol map files, somewhat similar to what the
Linux kernel perf tool does. Applications should create those files at specific locations.
|
||||
|
||||
### Symbol map file location for application
|
||||
|
||||
An application should create symbol map files in its data directory.
|
||||
|
||||
For example, process `123` of application `foo.bar.baz` should create
|
||||
`/data/data/foo.bar.baz/perf-123.map`.
|
||||
|
||||
### Symbol map file location for standalone program
|
||||
|
||||
Standalone programs should create symbol map files in `/data/local/tmp`.
|
||||
|
||||
For example, standalone program process `123` should create `/data/local/tmp/perf-123.map`.
|
||||
|
||||
### Symbol map file format
|
||||
|
||||
The symbol map file is a text file.

Every line describes a new symbol. The line format is:
|
||||
```
|
||||
<symbol-absolute-address> <symbol-size> <symbol-name>
|
||||
```
|
||||
|
||||
For example:
|
||||
```
|
||||
0x10000000 0x16 jit_symbol_one
|
||||
0x20000000 0x332 jit_symbol_two
|
||||
0x20002004 0x8 jit_symbol_three
|
||||
```
|
||||
|
||||
All characters after the symbol size and until the end of the line are parsed as the symbol name,
|
||||
with leading and trailing spaces removed. This means spaces are allowed in symbol names themselves.
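For example, a standalone program's JIT runtime could write such a file like this. It is a minimal
sketch with made-up symbols, following the location and format described above:

```python
import os

# Hypothetical JIT symbols: (absolute address, size, name).
symbols = [
    (0x10000000, 0x16, "jit_symbol_one"),
    (0x20000000, 0x332, "jit_symbol_two"),
    (0x20002004, 0x8, "jit_symbol_three"),
]

# A standalone program writes /data/local/tmp/perf-<pid>.map.
path = "/data/local/tmp/perf-%d.map" % os.getpid()
with open(path, "w") as f:
    for addr, size, name in symbols:
        f.write("0x%x 0x%x %s\n" % (addr, size, name))
```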
|
||||
|
||||
### Known issues
|
||||
|
||||
The current implementation gets confused if memory pages where JIT symbols reside are reused by
mapping a file either before or after.
|
||||
|
||||
For example, if memory pages were first used by `dlopen("libfoo.so")`, then freed by `dlclose`,
|
||||
then allocated for JIT symbols - simpleperf will report symbols from `libfoo.so` instead.
|
||||
|
After Width: | Height: | Size: 178 KiB |
|
After Width: | Height: | Size: 84 KiB |
|
After Width: | Height: | Size: 141 KiB |
|
After Width: | Height: | Size: 14 KiB |
|
After Width: | Height: | Size: 10 KiB |
|
After Width: | Height: | Size: 88 KiB |
|
After Width: | Height: | Size: 449 KiB |
|
After Width: | Height: | Size: 328 KiB |
BIN
Android/android-ndk-r27d/simpleperf/doc/pictures/flamescope.png
Normal file
|
After Width: | Height: | Size: 69 KiB |
|
After Width: | Height: | Size: 28 KiB |
|
After Width: | Height: | Size: 275 KiB |
|
After Width: | Height: | Size: 148 KiB |
BIN
Android/android-ndk-r27d/simpleperf/doc/pictures/report_html.png
Normal file
|
After Width: | Height: | Size: 285 KiB |
89
Android/android-ndk-r27d/simpleperf/doc/sample_filter.md
Normal file
@ -0,0 +1,89 @@
|
||||
# Sample Filter
|
||||
|
||||
Sometimes we want to report samples only for selected processes, threads, libraries, or time
|
||||
ranges. To filter samples, we can pass filter options to the report commands or scripts.
|
||||
|
||||
|
||||
## filter file format
|
||||
|
||||
To filter samples based on time ranges, simpleperf accepts a filter file when reporting. The filter
|
||||
file is in text format, containing a list of lines. Each line is a filter command. The filter file
|
||||
can be generated by `sample_filter.py`, and passed to report scripts via `--filter-file`.
|
||||
|
||||
```
|
||||
filter_command1 command_args
|
||||
filter_command2 command_args
|
||||
...
|
||||
```
|
||||
|
||||
### clock command
|
||||
|
||||
```
|
||||
CLOCK <clock_name>
|
||||
```
|
||||
|
||||
Set the clock used to generate timestamps in the filter file. Supported clocks are: `monotonic`,
`realtime`. By default, it is monotonic. The clock here should be the same as the clock used in the
profile data, which is set by `--clockid` in the simpleperf record command.
|
||||
|
||||
### global time filter commands
|
||||
|
||||
```
|
||||
GLOBAL_BEGIN <begin_timestamp>
|
||||
GLOBAL_END <end_timestamp>
|
||||
```
|
||||
|
||||
The nearest pair of GLOBAL_BEGIN and GLOBAL_END commands makes a time range. When these commands
|
||||
are used, only samples in the time ranges are reported. Timestamps are 64-bit integers in
|
||||
nanoseconds.
|
||||
|
||||
```
|
||||
GLOBAL_BEGIN 1000
|
||||
GLOBAL_END 2000
|
||||
GLOBAL_BEGIN 3000
|
||||
GLOBAL_END 4000
|
||||
```
|
||||
|
||||
For the example above, samples in time ranges [1000, 2000) and [3000, 4000) are reported.
|
||||
|
||||
### process time filter commands
|
||||
|
||||
```
|
||||
PROCESS_BEGIN <pid> <begin_timestamp>
|
||||
PROCESS_END <pid> <end_timestamp>
|
||||
```
|
||||
|
||||
The nearest pair of PROCESS_BEGIN and PROCESS_END commands for the same process makes a time
|
||||
range. When these commands are used, each process has a list of time ranges, and only samples
|
||||
in the time ranges are reported.
|
||||
|
||||
```
|
||||
PROCESS_BEGIN 1 1000
|
||||
PROCESS_BEGIN 2 2000
|
||||
PROCESS_END 1 3000
|
||||
PROCESS_END 2 4000
|
||||
```
|
||||
|
||||
For the example above, process 1 samples in time range [1000, 3000) and process 2 samples in time
|
||||
range [2000, 4000) are reported.
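For instance, a script could emit such process time filters directly. Below is a minimal sketch
with made-up pids and timestamps (`sample_filter.py` is the usual way to generate this file):

```python
# pid -> (begin_timestamp, end_timestamp), in nanoseconds of the profile's clock.
ranges = {1: (1000, 3000), 2: (2000, 4000)}

with open("filter_file.txt", "w") as f:
    f.write("CLOCK monotonic\n")
    for pid, (begin, end) in ranges.items():
        f.write("PROCESS_BEGIN %d %d\n" % (pid, begin))
        f.write("PROCESS_END %d %d\n" % (pid, end))
```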
|
||||
|
||||
### thread time filter commands
|
||||
|
||||
```
|
||||
THREAD_BEGIN <tid> <begin_timestamp>
|
||||
THREAD_END <tid> <end_timestamp>
|
||||
```
|
||||
|
||||
The nearest pair of THREAD_BEGIN and THREAD_END commands for the same thread makes a time
|
||||
range. When these commands are used, each thread has a list of time ranges, and only samples in the
|
||||
time ranges are reported.
|
||||
|
||||
```
|
||||
THREAD_BEGIN 1 1000
|
||||
THREAD_BEGIN 2 2000
|
||||
THREAD_END 1 3000
|
||||
THREAD_END 2 4000
|
||||
```
|
||||
|
||||
For the example above, thread 1 samples in time range [1000, 3000) and thread 2 samples in time
|
||||
range [2000, 4000) are reported.
|
||||
357
Android/android-ndk-r27d/simpleperf/doc/scripts_reference.md
Normal file
@ -0,0 +1,357 @@
|
||||
# Scripts reference
|
||||
|
||||
[TOC]
|
||||
|
||||
## Record a profile
|
||||
|
||||
### app_profiler.py
|
||||
|
||||
`app_profiler.py` is used to record profiling data for Android applications and native executables.
|
||||
|
||||
```sh
|
||||
# Record an Android application.
|
||||
$ ./app_profiler.py -p simpleperf.example.cpp
|
||||
|
||||
# Record an Android application with Java code compiled into native instructions.
|
||||
$ ./app_profiler.py -p simpleperf.example.cpp --compile_java_code
|
||||
|
||||
# Record the launch of an Activity of an Android application.
|
||||
$ ./app_profiler.py -p simpleperf.example.cpp -a .SleepActivity
|
||||
|
||||
# Record a native process.
|
||||
$ ./app_profiler.py -np surfaceflinger
|
||||
|
||||
# Record a native process given its pid.
|
||||
$ ./app_profiler.py --pid 11324
|
||||
|
||||
# Record a command.
|
||||
$ ./app_profiler.py -cmd \
|
||||
"dex2oat --dex-file=/data/local/tmp/app-debug.apk --oat-file=/data/local/tmp/a.oat"
|
||||
|
||||
# Record an Android application, and use -r to send custom options to the record command.
|
||||
$ ./app_profiler.py -p simpleperf.example.cpp \
|
||||
-r "-e cpu-clock -g --duration 30"
|
||||
|
||||
# Record both on CPU time and off CPU time.
|
||||
$ ./app_profiler.py -p simpleperf.example.cpp \
|
||||
-r "-e task-clock -g -f 1000 --duration 10 --trace-offcpu"
|
||||
|
||||
# Save profiling data in a custom file (like perf_custom.data) instead of perf.data.
|
||||
$ ./app_profiler.py -p simpleperf.example.cpp -o perf_custom.data
|
||||
```
|
||||
|
||||
### Profile from launch of an application
|
||||
|
||||
Sometimes we want to profile the launch-time of an application. To support this, we added `--app` in
|
||||
the record command. The `--app` option sets the package name of the Android application to profile.
|
||||
If the app is not already running, the record command will poll for the app process in a loop with
|
||||
an interval of 1ms. So to profile the launch of an application, we can first start the record
|
||||
command with `--app`, then start the app. Below is an example.
|
||||
|
||||
```sh
|
||||
$ ./run_simpleperf_on_device.py record --app simpleperf.example.cpp \
|
||||
-g --duration 1 -o /data/local/tmp/perf.data
|
||||
# Start the app manually or using the `am` command.
|
||||
```
|
||||
|
||||
To make it convenient to use, `app_profiler.py` supports using the `-a` option to start an Activity
|
||||
after recording has started.
|
||||
|
||||
```sh
|
||||
$ ./app_profiler.py -p simpleperf.example.cpp -a .MainActivity
|
||||
```
|
||||
|
||||
### api_profiler.py
|
||||
|
||||
`api_profiler.py` is used to control recording in application code. It does preparation work
|
||||
before recording, and collects profiling data files after recording.
|
||||
|
||||
[Here](./android_application_profiling.md#control-recording-in-application-code) are the details.
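
A typical flow looks roughly like this (a sketch based on the linked document; treat the exact
flags as assumptions):

```sh
# Prepare the device for recording controlled by application code.
$ ./api_profiler.py prepare

# Run the app, which starts and stops recording through the simpleperf app API.

# Collect the recorded data files from the app's directory into ./simpleperf_data.
$ ./api_profiler.py collect -p simpleperf.example.cpp -o simpleperf_data
```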
|
||||
|
||||
### run_simpleperf_without_usb_connection.py
|
||||
|
||||
`run_simpleperf_without_usb_connection.py` records profiling data while the USB cable isn't
connected. `api_profiler.py` may be more suitable, since it also doesn't need a USB cable while
recording. Below is an example.
|
||||
|
||||
```sh
|
||||
$ ./run_simpleperf_without_usb_connection.py start -p simpleperf.example.cpp
|
||||
# After the command finishes successfully, unplug the USB cable, run the
|
||||
# SimpleperfExampleCpp app. After a few seconds, plug in the USB cable.
|
||||
$ ./run_simpleperf_without_usb_connection.py stop
|
||||
# It may take a while to stop recording. After that, the profiling data is collected in perf.data
|
||||
# on host.
|
||||
```
|
||||
|
||||
### binary_cache_builder.py
|
||||
|
||||
The `binary_cache` directory holds the binaries needed by a profiling data file. The
|
||||
binaries are expected to be unstripped, having debug information and symbol tables. The
|
||||
`binary_cache` directory is used by report scripts to read symbols of binaries. It is also used by
|
||||
`report_html.py` to generate annotated source code and disassembly.
|
||||
|
||||
By default, `app_profiler.py` builds the binary_cache directory after recording. But we can also
|
||||
build `binary_cache` for existing profiling data files using `binary_cache_builder.py`. This is
useful when you record profiling data using `simpleperf record` directly, for example to do
system-wide profiling or to record without the USB cable connected.
|
||||
|
||||
`binary_cache_builder.py` can either pull binaries from an Android device, or find binaries in
|
||||
directories on the host (via `-lib`).
|
||||
|
||||
```sh
|
||||
# Generate binary_cache for perf.data, by pulling binaries from the device.
|
||||
$ ./binary_cache_builder.py
|
||||
|
||||
# Generate binary_cache, by pulling binaries from the device and finding binaries in
|
||||
# SimpleperfExampleCpp.
|
||||
$ ./binary_cache_builder.py -lib path_of_SimpleperfExampleCpp
|
||||
```
|
||||
|
||||
### run_simpleperf_on_device.py
|
||||
|
||||
This script pushes the `simpleperf` executable to the device, and runs a simpleperf command on the
device. It is more convenient than running adb commands manually.
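
For example, to record on the device directly (mirroring the earlier record invocation in this
document):

```sh
$ ./run_simpleperf_on_device.py record --app simpleperf.example.cpp \
    -g --duration 10 -o /data/local/tmp/perf.data
```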
|
||||
|
||||
## Viewing the profile
|
||||
|
||||
Scripts in this section are for viewing the profile or converting profile data into formats used by
|
||||
external UIs. For recommended UIs, see [view_the_profile.md](view_the_profile.md).
|
||||
|
||||
### report.py
|
||||
|
||||
`report.py` is a wrapper of the `report` command on the host. It accepts all options of the `report`
|
||||
command.
|
||||
|
||||
```sh
|
||||
# Report call graph
|
||||
$ ./report.py -g
|
||||
|
||||
# Report call graph in a GUI window implemented by Python Tk.
|
||||
$ ./report.py -g --gui
|
||||
```
|
||||
|
||||
### report_html.py
|
||||
|
||||
`report_html.py` generates `report.html` based on the profiling data. The generated `report.html`
shows the profiling result without depending on other files, so it can be viewed in a local browser
or passed to other machines. Depending on which command-line options are used, the content of the
|
||||
`report.html` can include: chart statistics, sample table, flamegraphs, annotated source code for
|
||||
each function, annotated disassembly for each function.
|
||||
|
||||
```sh
|
||||
# Generate chart statistics, sample table and flamegraphs, based on perf.data.
|
||||
$ ./report_html.py
|
||||
|
||||
# Add source code.
|
||||
$ ./report_html.py --add_source_code --source_dirs path_of_SimpleperfExampleCpp
|
||||
|
||||
# Add disassembly.
|
||||
$ ./report_html.py --add_disassembly
|
||||
|
||||
# Adding disassembly for all binaries can cost a lot of time. So we can choose to only add
|
||||
# disassembly for selected binaries.
|
||||
$ ./report_html.py --add_disassembly --binary_filter libgame.so
|
||||
|
||||
# report_html.py accepts more than one recording data file.
|
||||
$ ./report_html.py -i perf1.data perf2.data
|
||||
```
|
||||
|
||||
Below is an example of generating html profiling results for SimpleperfExampleCpp.
|
||||
|
||||
```sh
|
||||
$ ./app_profiler.py -p simpleperf.example.cpp
|
||||
$ ./report_html.py --add_source_code --source_dirs path_of_SimpleperfExampleCpp \
|
||||
--add_disassembly
|
||||
```
|
||||
|
||||
After opening the generated [`report.html`](./report_html.html) in a browser, there are several tabs:
|
||||
|
||||
The first tab is "Chart Statistics". You can click the pie chart to show the time consumed by each
|
||||
process, thread, library and function.
|
||||
|
||||
The second tab is "Sample Table". It shows the time taken by each function. By clicking one row in
|
||||
the table, we can jump to a new tab called "Function".
|
||||
|
||||
The third tab is "Flamegraph". It shows the graphs generated by [`inferno`](./inferno.md).
|
||||
|
||||
The fourth tab is "Function". It only appears when users click a row in the "Sample Table" tab.
|
||||
It shows information of a function, including:
|
||||
|
||||
1. A flamegraph showing functions called by that function.
|
||||
2. A flamegraph showing functions calling that function.
|
||||
3. Annotated source code of that function. It only appears when there are source code files for
|
||||
that function.
|
||||
4. Annotated disassembly of that function. It only appears when there are binaries containing that
|
||||
function.
|
||||
|
||||
### inferno
|
||||
|
||||
[`inferno`](./inferno.md) is a tool used to generate flamegraphs in an HTML file.
|
||||
|
||||
```sh
|
||||
# Generate flamegraph based on perf.data.
|
||||
# On Windows, use inferno.bat instead of ./inferno.sh.
|
||||
$ ./inferno.sh -sc --record_file perf.data
|
||||
|
||||
# Record a native program and generate flamegraph.
|
||||
$ ./inferno.sh -np surfaceflinger
|
||||
```
|
||||
|
||||
### purgatorio
|
||||
|
||||
[`purgatorio`](../scripts/purgatorio/README.md) is a visualization tool to show samples in time order.
|
||||
|
||||
### pprof_proto_generator.py
|
||||
|
||||
It converts a profiling data file into `pprof.proto`, a format used by [pprof](https://github.com/google/pprof).
|
||||
|
||||
```sh
|
||||
# Convert perf.data in the current directory to pprof.proto format.
|
||||
$ ./pprof_proto_generator.py
|
||||
# Show report in pdf format.
|
||||
$ pprof -pdf pprof.profile
|
||||
|
||||
# Show report in html format. To show disassembly, add --tools option like:
|
||||
# --tools=objdump:<ndk_path>/toolchains/llvm/prebuilt/linux-x86_64/aarch64-linux-android/bin
|
||||
# To show annotated source or disassembly, select `top` in the view menu, click a function and
|
||||
# select `source` or `disassemble` in the view menu.
|
||||
$ pprof -http=:8080 pprof.profile
|
||||
```
|
||||
|
||||
### gecko_profile_generator.py
|
||||
|
||||
Converts `perf.data` to [Gecko Profile
|
||||
Format](https://github.com/firefox-devtools/profiler/blob/main/docs-developer/gecko-profile-format.md),
|
||||
the format read by https://profiler.firefox.com/.
|
||||
|
||||
Firefox Profiler is a powerful general-purpose profiler UI which runs locally in
|
||||
any browser (not just Firefox), with:
|
||||
|
||||
- Per-thread tracks
|
||||
- Flamegraphs
|
||||
- Search, focus for specific stacks
|
||||
- A time series view for seeing your samples in timestamp order
|
||||
- Filtering by thread and duration
|
||||
|
||||
Usage:
|
||||
|
||||
```
|
||||
# Record a profile of your application
|
||||
$ ./app_profiler.py -p simpleperf.example.cpp
|
||||
|
||||
# Convert and gzip.
|
||||
$ ./gecko_profile_generator.py -i perf.data | gzip > gecko-profile.json.gz
|
||||
```
|
||||
|
||||
Then open `gecko-profile.json.gz` in https://profiler.firefox.com/.
|
||||
|
||||
### report_sample.py
|
||||
|
||||
`report_sample.py` converts a profiling data file into the `perf script` text format output by
|
||||
`linux-perf-tool`.
|
||||
|
||||
This format can be imported into:
|
||||
|
||||
- [FlameGraph](https://github.com/brendangregg/FlameGraph)
|
||||
- [Flamescope](https://github.com/Netflix/flamescope)
|
||||
- [Firefox
|
||||
Profiler](https://github.com/firefox-devtools/profiler/blob/main/docs-user/guide-perf-profiling.md),
|
||||
but prefer using `gecko_profile_generator.py`.
|
||||
- [Speedscope](https://github.com/jlfwong/speedscope/wiki/Importing-from-perf-(linux))
|
||||
|
||||
```sh
|
||||
# Record a profile to perf.data
|
||||
$ ./app_profiler.py <args>
|
||||
|
||||
# Convert perf.data in the current directory to a format used by FlameGraph.
|
||||
$ ./report_sample.py --symfs binary_cache >out.perf
|
||||
|
||||
$ git clone https://github.com/brendangregg/FlameGraph.git
|
||||
$ FlameGraph/stackcollapse-perf.pl out.perf >out.folded
|
||||
$ FlameGraph/flamegraph.pl out.folded >a.svg
|
||||
```
|
||||
|
||||
### stackcollapse.py
|
||||
|
||||
`stackcollapse.py` converts a profiling data file (`perf.data`) to [Brendan
|
||||
Gregg's "Folded Stacks"
|
||||
format](https://queue.acm.org/detail.cfm?id=2927301#:~:text=The%20folded%20stack%2Dtrace%20format,trace%2C%20followed%20by%20a%20semicolon).
|
||||
|
||||
Folded Stacks are lines of semicolon-delimited stack frames, root to leaf,
|
||||
followed by a count of events sampled in that stack, e.g.:
|
||||
|
||||
```
|
||||
BusyThread;__start_thread;__pthread_start(void*);java.lang.Thread.run 17889729
|
||||
```
|
||||
|
||||
All similar stacks are aggregated and sample timestamps are unused.
|
||||
|
||||
Folded Stacks format is readable by:
|
||||
|
||||
- The [FlameGraph](https://github.com/brendangregg/FlameGraph) toolkit
|
||||
- [Inferno](https://github.com/jonhoo/inferno) (Rust port of FlameGraph)
|
||||
- [Speedscope](https://speedscope.app/)
|
||||
|
||||
Example:
|
||||
|
||||
```sh
|
||||
# Record a profile to perf.data
|
||||
$ ./app_profiler.py <args>
|
||||
|
||||
# Convert to Folded Stacks format
|
||||
$ ./stackcollapse.py --kernel --jit | gzip > profile.folded.gz
|
||||
|
||||
# Visualise with FlameGraph with Java Stacks and nanosecond times
|
||||
$ git clone https://github.com/brendangregg/FlameGraph.git
|
||||
$ gunzip -c profile.folded.gz \
|
||||
| FlameGraph/flamegraph.pl --color=java --countname=ns \
|
||||
> profile.svg
|
||||
```
|
||||
|
||||
## simpleperf_report_lib.py
|
||||
|
||||
`simpleperf_report_lib.py` is a Python library used to parse profiling data files generated by the
|
||||
record command. Internally, it uses libsimpleperf_report.so to do the work. Generally, for each
|
||||
profiling data file, we create an instance of ReportLib, pass it the file path (via SetRecordFile).
|
||||
Then we can read all samples through GetNextSample(). For each sample, we can read its event info
|
||||
(via GetEventOfCurrentSample), symbol info (via GetSymbolOfCurrentSample) and call chain info
|
||||
(via GetCallChainOfCurrentSample). We can also get some global information, like record options
|
||||
(via GetRecordCmd), the arch of the device (via GetArch) and meta strings (via MetaInfo).
|
||||
|
||||
Examples of using `simpleperf_report_lib.py` are in `report_sample.py`, `report_html.py`,
|
||||
`pprof_proto_generator.py` and `inferno/inferno.py`.
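
Below is a minimal sketch of iterating over samples with the library (assuming a `perf.data` file in
the current directory; attribute names follow the bundled report scripts):

```python
from simpleperf_report_lib import ReportLib

lib = ReportLib()
lib.SetRecordFile('perf.data')            # path of the profiling data file
print('record cmd:', lib.GetRecordCmd())  # options used by `simpleperf record`
print('arch:', lib.GetArch())             # arch of the profiled device
while True:
    sample = lib.GetNextSample()
    if sample is None:
        break
    event = lib.GetEventOfCurrentSample()
    symbol = lib.GetSymbolOfCurrentSample()
    callchain = lib.GetCallChainOfCurrentSample()
    print(sample.tid, event.name, symbol.symbol_name, callchain.nr)
lib.Close()
```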
|
||||
|
||||
## ipc.py
|
||||
`ipc.py` captures the instructions per cycle (IPC) of the system during a specified duration.
|
||||
|
||||
Example:
|
||||
```sh
|
||||
./ipc.py
|
||||
./ipc.py 2 20 # Set interval to 2 secs and total duration to 20 secs
|
||||
./ipc.py -p 284 -C 4 # Only profile the PID 284 while running on core 4
|
||||
./ipc.py -c 'sleep 5' # Only profile the command to run
|
||||
```
|
||||
|
||||
The results look like:
|
||||
```
|
||||
K_CYCLES K_INSTR IPC
|
||||
36840 14138 0.38
|
||||
70701 27743 0.39
|
||||
104562 41350 0.40
|
||||
138264 54916 0.40
|
||||
```
|
||||
|
||||
## sample_filter.py
|
||||
|
||||
`sample_filter.py` generates sample filter files as documented in [sample_filter.md](https://android.googlesource.com/platform/system/extras/+/refs/heads/main/simpleperf/doc/sample_filter.md).
|
||||
A filter file can be passed in `--filter-file` when running report scripts.
|
||||
|
||||
For example, it can be used to split a large recording file into several report files.
|
||||
|
||||
```sh
|
||||
$ sample_filter.py -i perf.data --split-time-range 2 -o sample_filter
|
||||
$ gecko_profile_generator.py -i perf.data --filter-file sample_filter_part1 \
|
||||
| gzip >profile-part1.json.gz
|
||||
$ gecko_profile_generator.py -i perf.data --filter-file sample_filter_part2 \
|
||||
| gzip >profile-part2.json.gz
|
||||
```
|
||||
|
After Width: | Height: | Size: 11 KiB |
352
Android/android-ndk-r27d/simpleperf/doc/view_the_profile.md
Normal file
@ -0,0 +1,352 @@
|
||||
# View the profile
|
||||
|
||||
[TOC]
|
||||
|
||||
## Introduction
|
||||
|
||||
After using `simpleperf record` or `app_profiler.py`, we get a profile data file. The file contains
|
||||
a list of samples. Each sample has a timestamp, a thread id, a callstack, events (like cpu-cycles
|
||||
or cpu-clock) used in this sample, etc. We have many choices for viewing the profile. We can show
|
||||
samples in chronological order, or show aggregated flamegraphs. We can show reports in text format,
|
||||
or in some interactive UIs.
|
||||
|
||||
Below shows some recommended UIs to view the profile. Google developers can find more examples in
|
||||
[go/gmm-profiling](go/gmm-profiling?polyglot=linux-workstation#viewing-the-profile).
|
||||
|
||||
|
||||
## Continuous PProf UI (great flamegraph UI, but only available internally)
|
||||
|
||||
[PProf](https://github.com/google/pprof) is a mature profiling technology used extensively on
|
||||
Google servers, with a powerful flamegraph UI, with strong drilldown, search, pivot, profile diff,
|
||||
and graph visualisation.
|
||||
|
||||

|
||||
|
||||
We can use `pprof_proto_generator.py` to convert profiles into pprof.profile protobufs for use in
|
||||
pprof.
|
||||
|
||||
```
|
||||
# Output all threads, broken down by threadpool.
|
||||
./pprof_proto_generator.py
|
||||
|
||||
# Use proguard mapping.
|
||||
./pprof_proto_generator.py --proguard-mapping-file proguard.map
|
||||
|
||||
# Just the main (UI) thread (query by thread name):
|
||||
./pprof_proto_generator.py --comm com.example.android.displayingbitmaps
|
||||
```
|
||||
|
||||
This will print some debug logs like "Failed to read symbols"; this is usually OK, unless those
symbols are hotspots.
|
||||
|
||||
The continuous pprof server has a file upload size limit of 50MB. To get around this limit, compress
|
||||
the profile before uploading:
|
||||
|
||||
```
|
||||
gzip pprof.profile
|
||||
```
|
||||
|
||||
After compressing, you can upload the `pprof.profile.gz` file to either http://pprof/ or
|
||||
http://pprofng/. Both websites have an 'Upload' tab for this purpose. Alternatively, you can use
|
||||
the following `pprof` command to upload the compressed profile:
|
||||
|
||||
```
|
||||
# Upload all threads in profile, grouped by threadpool.
|
||||
# This is usually a good default, combining threads with similar names.
|
||||
pprof --flame --tagroot threadpool pprof.profile.gz
|
||||
|
||||
# Upload all threads in profile, grouped by individual thread name.
|
||||
pprof --flame --tagroot thread pprof.profile.gz
|
||||
|
||||
# Upload all threads in profile, without grouping by thread.
|
||||
pprof --flame pprof.profile.gz
|
||||
This will output a URL, example: https://pprof.corp.google.com/?id=589a60852306144c880e36429e10b166
|
||||
```
|
||||
|
||||
## Firefox Profiler (great chronological UI)
|
||||
|
||||
We can view Android profiles using Firefox Profiler: https://profiler.firefox.com/. This does not
|
||||
require Firefox installation -- Firefox Profiler is just a website, you can open it in any browser.
|
||||
There is also an internal Google-Hosted Firefox Profiler, at go/profiler or go/firefox-profiler.
|
||||
|
||||

|
||||
|
||||
Firefox Profiler has a great chronological view, as it doesn't pre-aggregate similar stack traces
|
||||
like pprof does.
|
||||
|
||||
We can use `gecko_profile_generator.py` to convert raw perf.data files into a Firefox Profile, with
|
||||
Proguard deobfuscation.
|
||||
|
||||
```
|
||||
# Create Gecko Profile
|
||||
./gecko_profile_generator.py | gzip > gecko_profile.json.gz
|
||||
|
||||
# Create Gecko Profile using Proguard map
|
||||
./gecko_profile_generator.py --proguard-mapping-file proguard.map | gzip > gecko_profile.json.gz
|
||||
```
|
||||
|
||||
Then drag-and-drop gecko_profile.json.gz into https://profiler.firefox.com/.
|
||||
|
||||
Firefox Profiler supports:
|
||||
|
||||
1. Aggregated Flamegraphs
|
||||
2. Chronological Stackcharts
|
||||
|
||||
And allows filtering by:
|
||||
|
||||
1. Individual threads
|
||||
2. Multiple threads (Ctrl+Click thread names to select many)
|
||||
3. Timeline period
|
||||
4. Stack frame text search
|
||||
|
||||
## FlameScope (great jank-finding UI)
|
||||
|
||||
[Netflix's FlameScope](https://github.com/Netflix/flamescope) is a rough, proof-of-concept UI that
|
||||
lets you spot repeating patterns of work by laying out the profile as a subsecond heatmap.
|
||||
|
||||
Below, each vertical stripe is one second, and each cell is 10ms. Redder cells have more samples.
|
||||
See https://www.brendangregg.com/blog/2018-11-08/flamescope-pattern-recognition.html for how to
|
||||
spot patterns.
|
||||
|
||||
This is an example of a 60s DisplayBitmaps app Startup Profile.
|
||||
|
||||

|
||||
|
||||
You can see:
|
||||
|
||||
- The thick red vertical line on the left is startup.
- The long white vertical sections on the left show the app is mostly idle, waiting for commands
  from instrumented tests.
- The periodic red blocks that follow show the app is busy handling commands from instrumented
  tests.
|
||||
|
||||
Click the start and end cells of a duration:
|
||||
|
||||

|
||||
|
||||
To see a flamegraph for that duration:
|
||||
|
||||

|
||||
|
||||
Install and run Flamescope:
|
||||
|
||||
```
|
||||
git clone https://github.com/Netflix/flamescope ~/flamescope
|
||||
cd ~/flamescope
|
||||
pip install -r requirements.txt
|
||||
npm install
|
||||
npm run webpack
|
||||
python3 run.py
|
||||
```
|
||||
|
||||
Then open FlameScope in-browser: http://localhost:5000/.
|
||||
|
||||
FlameScope can read gzipped perf script format profiles. Convert simpleperf perf.data to this
|
||||
format with `report_sample.py`, and place it in Flamescope's examples directory:
|
||||
|
||||
```
|
||||
# Create `Linux perf script` format profile.
|
||||
report_sample.py | gzip > ~/flamescope/examples/my_simpleperf_profile.gz
|
||||
|
||||
# Create `Linux perf script` format profile using Proguard map.
|
||||
report_sample.py \
|
||||
--proguard-mapping-file proguard.map \
|
||||
| gzip > ~/flamescope/examples/my_simpleperf_profile.gz
|
||||
```
|
||||
|
||||
Open the profile "as Linux Perf", and click start and end sections to get a flamegraph of that
|
||||
timespan.
|
||||
|
||||
To investigate UI Thread Jank, filter to UI thread samples only:
|
||||
|
||||
```
|
||||
report_sample.py \
|
||||
--comm com.example.android.displayingbitmaps \ # UI Thread
|
||||
| gzip > ~/flamescope/examples/uithread.gz
|
||||
```
|
||||
|
||||
Once you've identified the timespan of interest, consider also zooming into that section with
|
||||
Firefox Profiler, which has a more powerful flamegraph viewer.
|
||||
|
||||
## Differential FlameGraph
|
||||
|
||||
See Brendan Gregg's [Differential Flame Graphs](https://www.brendangregg.com/blog/2014-11-09/differential-flame-graphs.html) blog.
|
||||
|
||||
Use Simpleperf's `stackcollapse.py` to convert perf.data to Folded Stacks format for the FlameGraph
|
||||
toolkit.
|
||||
|
||||
Consider diffing both directions: After minus Before, and Before minus After.
|
||||
|
||||
If you've recorded before and after your optimisation as perf_before.data and perf_after.data, and
|
||||
you're only interested in the UI thread:
|
||||
|
||||
```
|
||||
# Generate before and after folded stacks from perf.data files
|
||||
./stackcollapse.py --kernel --jit -i perf_before.data \
|
||||
--proguard-mapping-file proguard_before.map \
|
||||
--comm com.example.android.displayingbitmaps \
|
||||
> perf_before.folded
|
||||
./stackcollapse.py --kernel --jit -i perf_after.data \
|
||||
--proguard-mapping-file proguard_after.map \
|
||||
--comm com.example.android.displayingbitmaps \
|
||||
> perf_after.folded
|
||||
|
||||
# Generate diff reports
|
||||
FlameGraph/difffolded.pl -n perf_before.folded perf_after.folded \
|
||||
| FlameGraph/flamegraph.pl > diff1.svg
|
||||
FlameGraph/difffolded.pl -n --negate perf_after.folded perf_before.folded \
|
||||
| FlameGraph/flamegraph.pl > diff2.svg
|
||||
```
|
||||
|
||||
## Android Studio Profiler
|
||||
|
||||
Android Studio Profiler supports recording and reporting profiles of app processes. It supports
|
||||
several recording methods, including one that uses simpleperf as the backend. You can use Android Studio
|
||||
Profiler for both recording and reporting.
|
||||
|
||||
In Android Studio:

1. Open View -> Tool Windows -> Profiler
2. Click + -> Your Device -> Profileable Processes -> Your App
|
||||
|
||||

|
||||
|
||||
Click into "CPU" Chart
|
||||
|
||||
Choose Callstack Sample Recording. Even if you're using Java, this provides better observability
into ART, malloc, and the kernel.
|
||||
|
||||

|
||||
|
||||
Click Record, run your test on the device, then Stop when you're done.
|
||||
|
||||
Click on a thread track, and "Flame Chart" to see a chronological chart on the left, and an
|
||||
aggregated flamechart on the right:
|
||||
|
||||

|
||||
|
||||
If you want more flexibility in recording options, or want to add a proguard mapping file, you can
|
||||
record using simpleperf, and report using Android Studio Profiler.
|
||||
|
||||
We can use `simpleperf report-sample` to convert perf.data to trace files for Android Studio
|
||||
Profiler.
|
||||
|
||||
```
|
||||
# Convert perf.data to perf.trace for Android Studio Profiler.
|
||||
# If on Mac/Windows, use simpleperf host executable for those platforms instead.
|
||||
bin/linux/x86_64/simpleperf report-sample --show-callchain --protobuf -i perf.data -o perf.trace
|
||||
|
||||
# Convert perf.data to perf.trace using proguard mapping file.
|
||||
bin/linux/x86_64/simpleperf report-sample --show-callchain --protobuf -i perf.data -o perf.trace \
|
||||
--proguard-mapping-file proguard.map
|
||||
```
|
||||
|
||||
In Android Studio: Open File -> Open -> Select perf.trace
|
||||
|
||||

|
||||
|
||||
|
||||
## Simpleperf HTML Report
|
||||
|
||||
Simpleperf can generate its own HTML Profile, which is able to show Android-specific information
|
||||
and separate flamegraphs for all threads, with a much rougher flamegraph UI.
|
||||
|
||||

|
||||
|
||||
This UI is fairly rough; we recommend using the Continuous PProf UI or Firefox Profiler instead. But
|
||||
it's useful for a quick look at your data.
|
||||
|
||||
Each of the following commands takes ./perf.data as input and outputs ./report.html.
|
||||
|
||||
```
|
||||
# Make an HTML report.
|
||||
./report_html.py
|
||||
|
||||
# Make an HTML report with Proguard mapping.
|
||||
./report_html.py --proguard-mapping-file proguard.map
|
||||
```
|
||||
|
||||
This will print some debug logs like "Failed to read symbols"; this is usually OK, unless those
symbols are hotspots.
|
||||
|
||||
See also [report_html.py's README](scripts_reference.md#report_htmlpy) and `report_html.py -h`.
|
||||
|
||||
|
||||
## PProf Interactive Command Line
|
||||
|
||||
Unlike Continuous PProf UI, [PProf](https://github.com/google/pprof) command line is publicly
|
||||
available, and allows drilldown, pivoting and filtering.
|
||||
|
||||
The session below demonstrates filtering to stack frames containing processBitmap.
|
||||
|
||||
```
|
||||
$ pprof pprof.profile
|
||||
(pprof) show=processBitmap
|
||||
(pprof) top
|
||||
Active filters:
|
||||
show=processBitmap
|
||||
Showing nodes accounting for 2.45s, 11.44% of 21.46s total
|
||||
flat flat% sum% cum cum%
|
||||
2.45s 11.44% 11.44% 2.45s 11.44% com.example.android.displayingbitmaps.util.ImageFetcher.processBitmap
|
||||
```
|
||||
|
||||
And then showing the tags of those frames, to tell what threads they are running on:
|
||||
|
||||
```
|
||||
(pprof) tags
|
||||
pid: Total 2.5s
|
||||
2.5s ( 100%): 31112
|
||||
|
||||
thread: Total 2.5s
|
||||
1.4s (57.21%): AsyncTask #3
|
||||
1.1s (42.79%): AsyncTask #4
|
||||
|
||||
threadpool: Total 2.5s
|
||||
2.5s ( 100%): AsyncTask #%d
|
||||
|
||||
tid: Total 2.5s
|
||||
1.4s (57.21%): 31174
|
||||
1.1s (42.79%): 31175
|
||||
```
|
||||
|
||||
Contrast with another method:
|
||||
|
||||
```
|
||||
(pprof) show=addBitmapToCache
|
||||
(pprof) top
|
||||
Active filters:
|
||||
show=addBitmapToCache
|
||||
Showing nodes accounting for 1.05s, 4.88% of 21.46s total
|
||||
flat flat% sum% cum cum%
|
||||
1.05s 4.88% 4.88% 1.05s 4.88% com.example.android.displayingbitmaps.util.ImageCache.addBitmapToCache
|
||||
```
|
||||
|
||||
For more information, see the [pprof README](https://github.com/google/pprof/blob/main/doc/README.md#interactive-terminal-use).
|
||||
|
||||
|
||||
## Simpleperf Report Command Line
|
||||
|
||||
The simpleperf report command reports profiles in text format.
|
||||
|
||||

|
||||
|
||||
You can call `simpleperf report` directly or call it via `report.py`.
|
||||
|
||||
```
|
||||
# Report symbols in table format.
|
||||
$ ./report.py --children
|
||||
|
||||
# Report call graph.
|
||||
$ bin/linux/x86_64/simpleperf report -g -i perf.data
|
||||
```
|
||||
|
||||
See also [report command's README](executable_commands_reference.md#The-report-command) and
|
||||
`report.py -h`.
|
||||
|
||||
|
||||
## Custom Report Interface
|
||||
|
||||
If the UIs above can't fulfill your needs, you can use `simpleperf_report_lib.py` to parse
|
||||
perf.data, extract sample information, and feed it to any views you like.
|
||||
|
||||
See [simpleperf_report_lib.py's README](scripts_reference.md#simpleperf_report_libpy) for more
|
||||
details.
|
||||
548
Android/android-ndk-r27d/simpleperf/gecko_profile_generator.py
Normal file
@ -0,0 +1,548 @@
|
||||
#!/usr/bin/env python3
|
||||
#
|
||||
# Copyright (C) 2021 The Android Open Source Project
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
"""gecko_profile_generator.py: converts perf.data to Gecko Profile Format,
|
||||
which can be read by https://profiler.firefox.com/.
|
||||
|
||||
Example:
|
||||
./app_profiler.py
|
||||
./gecko_profile_generator.py | gzip > gecko-profile.json.gz
|
||||
|
||||
Then open gecko-profile.json.gz in https://profiler.firefox.com/
|
||||
"""
|
||||
|
||||
from collections import Counter
|
||||
from dataclasses import dataclass, field
|
||||
import json
|
||||
import logging
|
||||
import sys
|
||||
from typing import List, Dict, Optional, NamedTuple, Tuple
|
||||
|
||||
from simpleperf_report_lib import GetReportLib
|
||||
from simpleperf_utils import BaseArgumentParser, ReportLibOptions
|
||||
|
||||
|
||||
StringID = int
|
||||
StackID = int
|
||||
FrameID = int
|
||||
CategoryID = int
|
||||
Milliseconds = float
|
||||
GeckoProfile = Dict
|
||||
|
||||
|
||||
# https://github.com/firefox-devtools/profiler/blob/53970305b51b9b472e26d7457fee1d66cd4e2737/src/types/gecko-profile.js#L156
|
||||
class Frame(NamedTuple):
|
||||
string_id: StringID
|
||||
relevantForJS: bool
|
||||
innerWindowID: int
|
||||
implementation: None
|
||||
optimizations: None
|
||||
line: None
|
||||
column: None
|
||||
category: CategoryID
|
||||
subcategory: int
|
||||
|
||||
|
||||
# https://github.com/firefox-devtools/profiler/blob/53970305b51b9b472e26d7457fee1d66cd4e2737/src/types/gecko-profile.js#L216
|
||||
class Stack(NamedTuple):
|
||||
prefix_id: Optional[StackID]
|
||||
frame_id: FrameID
|
||||
category_id: CategoryID
|
||||
|
||||
|
||||
# https://github.com/firefox-devtools/profiler/blob/53970305b51b9b472e26d7457fee1d66cd4e2737/src/types/gecko-profile.js#L90
|
||||
class Sample(NamedTuple):
|
||||
stack_id: Optional[StackID]
|
||||
time_ms: Milliseconds
|
||||
responsiveness: int
|
||||
complete_stack: bool
|
||||
|
||||
def to_json(self):
|
||||
return [self.stack_id, self.time_ms, self.responsiveness]
|
||||
|
||||
|
||||
# Schema: https://github.com/firefox-devtools/profiler/blob/53970305b51b9b472e26d7457fee1d66cd4e2737/src/types/profile.js#L425
|
||||
# Colors must be defined in:
|
||||
# https://github.com/firefox-devtools/profiler/blob/50124adbfa488adba6e2674a8f2618cf34b59cd2/res/css/categories.css
|
||||
CATEGORIES = [
|
||||
{
|
||||
"name": 'User',
|
||||
# Follow Brendan Gregg's Flamegraph convention: yellow for userland
|
||||
# https://github.com/brendangregg/FlameGraph/blob/810687f180f3c4929b5d965f54817a5218c9d89b/flamegraph.pl#L419
|
||||
"color": 'yellow',
|
||||
"subcategories": ['Other']
|
||||
},
|
||||
{
|
||||
"name": 'Kernel',
|
||||
# Follow Brendan Gregg's Flamegraph convention: orange for kernel
|
||||
# https://github.com/brendangregg/FlameGraph/blob/810687f180f3c4929b5d965f54817a5218c9d89b/flamegraph.pl#L417
|
||||
"color": 'orange',
|
||||
"subcategories": ['Other']
|
||||
},
|
||||
{
|
||||
"name": 'Native',
|
||||
# Follow Brendan Gregg's Flamegraph convention: yellow for userland
|
||||
# https://github.com/brendangregg/FlameGraph/blob/810687f180f3c4929b5d965f54817a5218c9d89b/flamegraph.pl#L419
|
||||
"color": 'yellow',
|
||||
"subcategories": ['Other']
|
||||
},
|
||||
{
|
||||
"name": 'DEX',
|
||||
# Follow Brendan Gregg's Flamegraph convention: green for Java/JIT
|
||||
# https://github.com/brendangregg/FlameGraph/blob/810687f180f3c4929b5d965f54817a5218c9d89b/flamegraph.pl#L411
|
||||
"color": 'green',
|
||||
"subcategories": ['Other']
|
||||
},
|
||||
{
|
||||
"name": 'OAT',
|
||||
# Follow Brendan Gregg's Flamegraph convention: green for Java/JIT
|
||||
# https://github.com/brendangregg/FlameGraph/blob/810687f180f3c4929b5d965f54817a5218c9d89b/flamegraph.pl#L411
|
||||
"color": 'green',
|
||||
"subcategories": ['Other']
|
||||
},
|
||||
{
|
||||
"name": 'Off-CPU',
|
||||
# Follow Brendan Gregg's Flamegraph convention: blue for off-CPU
|
||||
# https://github.com/brendangregg/FlameGraph/blob/810687f180f3c4929b5d965f54817a5218c9d89b/flamegraph.pl#L470
|
||||
"color": 'blue',
|
||||
"subcategories": ['Other']
|
||||
},
|
||||
# Not used by this exporter yet, but some Firefox Profiler code assumes
|
||||
# there is an 'Other' category by searching for a category with
|
||||
# color=grey, so include this.
|
||||
{
|
||||
"name": 'Other',
|
||||
"color": 'grey',
|
||||
"subcategories": ['Other']
|
||||
},
|
||||
{
|
||||
"name": 'JIT',
|
||||
# Follow Brendan Gregg's Flamegraph convention: green for Java/JIT
|
||||
# https://github.com/brendangregg/FlameGraph/blob/810687f180f3c4929b5d965f54817a5218c9d89b/flamegraph.pl#L411
|
||||
"color": 'green',
|
||||
"subcategories": ['Other']
|
||||
},
|
||||
]
|
||||
|
||||
|
||||
def is_complete_stack(stack: List[str]) -> bool:
|
||||
""" Check if the callstack is complete. The stack starts from root. """
|
||||
for entry in stack:
|
||||
if ('__libc_init' in entry) or ('__start_thread' in entry):
|
||||
return True
|
||||
return False
|
||||
|
||||
|
||||
@dataclass
|
||||
class Thread:
|
||||
"""A builder for a profile of a single thread.
|
||||
|
||||
Attributes:
|
||||
comm: Thread command-line (name).
|
||||
pid: process ID of containing process.
|
||||
tid: thread ID.
|
||||
samples: Timeline of profile samples.
|
||||
frameTable: interned stack frame ID -> stack frame.
|
||||
stringTable: interned string ID -> string.
|
||||
stringMap: interned string -> string ID.
|
||||
stackTable: interned stack ID -> stack.
|
||||
stackMap: (stack prefix ID, leaf stack frame ID) -> interned Stack ID.
|
||||
frameMap: Stack Frame string -> interned Frame ID.
|
||||
"""
|
||||
comm: str
|
||||
pid: int
|
||||
tid: int
|
||||
samples: List[Sample] = field(default_factory=list)
|
||||
frameTable: List[Frame] = field(default_factory=list)
|
||||
stringTable: List[str] = field(default_factory=list)
|
||||
# TODO: this is redundant with frameTable, could we remove this?
|
||||
stringMap: Dict[str, int] = field(default_factory=dict)
|
||||
stackTable: List[Stack] = field(default_factory=list)
|
||||
stackMap: Dict[Tuple[Optional[int], int], int] = field(default_factory=dict)
|
||||
frameMap: Dict[str, int] = field(default_factory=dict)
|
||||
|
||||
def _intern_stack(self, frame_id: int, prefix_id: Optional[int]) -> int:
|
||||
"""Gets a matching stack, or saves the new stack. Returns a Stack ID."""
|
||||
key = (prefix_id, frame_id)
|
||||
stack_id = self.stackMap.get(key)
|
||||
if stack_id is not None:
|
||||
return stack_id
|
||||
stack_id = len(self.stackTable)
|
||||
self.stackTable.append(Stack(prefix_id=prefix_id,
|
||||
frame_id=frame_id,
|
||||
category_id=0))
|
||||
self.stackMap[key] = stack_id
|
||||
return stack_id
|
||||
|
||||
def _intern_string(self, string: str) -> int:
|
||||
"""Gets a matching string, or saves the new string. Returns a String ID."""
|
||||
string_id = self.stringMap.get(string)
|
||||
if string_id is not None:
|
||||
return string_id
|
||||
string_id = len(self.stringTable)
|
||||
self.stringTable.append(string)
|
||||
self.stringMap[string] = string_id
|
||||
return string_id
|
||||
|
||||
def _intern_frame(self, frame_str: str) -> int:
|
||||
"""Gets a matching stack frame, or saves the new frame. Returns a Frame ID."""
|
||||
frame_id = self.frameMap.get(frame_str)
|
||||
if frame_id is not None:
|
||||
return frame_id
|
||||
frame_id = len(self.frameTable)
|
||||
self.frameMap[frame_str] = frame_id
|
||||
string_id = self._intern_string(frame_str)
|
||||
|
||||
category = 0
|
||||
# Heuristic: kernel code contains "kallsyms" as the library name.
|
||||
if "kallsyms" in frame_str or ".ko" in frame_str:
|
||||
category = 1
|
||||
# Heuristic: empirically, off-CPU profiles mostly measure off-CPU
|
||||
# time accounted to the linux kernel __schedule function, which
|
||||
# handles blocking. This only works if we have kernel symbol
|
||||
# (kallsyms) access though. __schedule defined here:
|
||||
# https://cs.android.com/android/kernel/superproject/+/common-android-mainline:common/kernel/sched/core.c;l=6593;drc=0c99414a07ddaa18d8eb4be90b551d2687cbde2f
|
||||
if frame_str.startswith("__schedule "):
|
||||
category = 5
|
||||
elif ".so" in frame_str:
|
||||
category = 2
|
||||
elif ".vdex" in frame_str:
|
||||
category = 3
|
||||
elif ".oat" in frame_str:
|
||||
category = 4
|
||||
# "[JIT app cache]" is returned for JIT code here:
|
||||
# https://cs.android.com/android/platform/superproject/+/master:system/extras/simpleperf/dso.cpp;l=551;drc=4d8137f55782cc1e8cc93e4694ba3a7159d9a2bc
|
||||
elif "[JIT app cache]" in frame_str:
|
||||
category = 7
|
||||
|
||||
self.frameTable.append(Frame(
|
||||
string_id=string_id,
|
||||
relevantForJS=False,
|
||||
innerWindowID=0,
|
||||
implementation=None,
|
||||
optimizations=None,
|
||||
line=None,
|
||||
column=None,
|
||||
category=category,
|
||||
subcategory=0,
|
||||
))
|
||||
return frame_id
|
||||
|
||||
def add_sample(self, comm: str, stack: List[str], time_ms: Milliseconds) -> None:
|
||||
"""Add a timestamped stack trace sample to the thread builder.
|
||||
|
||||
Args:
|
||||
comm: command-line (name) of the thread at this sample
|
||||
stack: sampled stack frames. Root first, leaf last.
|
||||
time_ms: timestamp of sample in milliseconds
|
||||
"""
|
||||
# Unix threads often don't set their name immediately upon creation.
|
||||
# Use the last name
|
||||
if self.comm != comm:
|
||||
self.comm = comm
|
||||
|
||||
prefix_stack_id = None
|
||||
for frame in stack:
|
||||
frame_id = self._intern_frame(frame)
|
||||
prefix_stack_id = self._intern_stack(frame_id, prefix_stack_id)
|
||||
|
||||
self.samples.append(Sample(stack_id=prefix_stack_id,
|
||||
time_ms=time_ms,
|
||||
responsiveness=0,
|
||||
complete_stack=is_complete_stack(stack)))
|
||||
|
||||
def sort_samples(self) -> None:
|
||||
""" The samples aren't guaranteed to be in order. Sort them by time. """
|
||||
self.samples.sort(key=lambda s: s.time_ms)
|
||||
|
||||
def remove_stack_gaps(self, max_remove_gap_length: int, gap_distr: Dict[int, int]) -> None:
|
||||
""" Ideally all callstacks are complete. But some may be broken for different reasons.
|
||||
To create a smooth view in "Stack Chart", remove small gaps of broken callstacks.
|
||||
|
||||
Args:
|
||||
max_remove_gap_length: the max length of continuous broken-stack samples to remove
|
||||
"""
|
||||
if max_remove_gap_length == 0:
|
||||
return
|
||||
i = 0
|
||||
remove_flags = [False] * len(self.samples)
|
||||
while i < len(self.samples):
|
||||
if self.samples[i].complete_stack:
|
||||
i += 1
|
||||
continue
|
||||
n = 1
|
||||
while (i + n < len(self.samples)) and (not self.samples[i + n].complete_stack):
|
||||
n += 1
|
||||
gap_distr[n] += 1
|
||||
if n <= max_remove_gap_length:
|
||||
for j in range(i, i + n):
|
||||
remove_flags[j] = True
|
||||
i += n
|
||||
if True in remove_flags:
|
||||
old_samples = self.samples
|
||||
self.samples = [s for s, remove in zip(old_samples, remove_flags) if not remove]
|
||||
|
||||
def to_json_dict(self) -> Dict:
|
||||
"""Converts this Thread to GeckoThread JSON format."""
|
||||
|
||||
# Gecko profile format is row-oriented data as List[List],
|
||||
# And a schema for interpreting each index.
|
||||
# Schema:
|
||||
# https://github.com/firefox-devtools/profiler/blob/main/docs-developer/gecko-profile-format.md
|
||||
# https://github.com/firefox-devtools/profiler/blob/53970305b51b9b472e26d7457fee1d66cd4e2737/src/types/gecko-profile.js#L230
|
||||
return {
|
||||
"tid": self.tid,
|
||||
"pid": self.pid,
|
||||
"name": self.comm,
|
||||
# https://github.com/firefox-devtools/profiler/blob/53970305b51b9b472e26d7457fee1d66cd4e2737/src/types/gecko-profile.js#L51
|
||||
"markers": {
|
||||
"schema": {
|
||||
"name": 0,
|
||||
"startTime": 1,
|
||||
"endTime": 2,
|
||||
"phase": 3,
|
||||
"category": 4,
|
||||
"data": 5,
|
||||
},
|
||||
"data": [],
|
||||
},
|
||||
# https://github.com/firefox-devtools/profiler/blob/53970305b51b9b472e26d7457fee1d66cd4e2737/src/types/gecko-profile.js#L90
|
||||
"samples": {
|
||||
"schema": {
|
||||
"stack": 0,
|
||||
"time": 1,
|
||||
"responsiveness": 2,
|
||||
},
|
||||
"data": [s.to_json() for s in self.samples],
|
||||
},
|
||||
# https://github.com/firefox-devtools/profiler/blob/53970305b51b9b472e26d7457fee1d66cd4e2737/src/types/gecko-profile.js#L156
|
||||
"frameTable": {
|
||||
"schema": {
|
||||
"location": 0,
|
||||
"relevantForJS": 1,
|
||||
"innerWindowID": 2,
|
||||
"implementation": 3,
|
||||
"optimizations": 4,
|
||||
"line": 5,
|
||||
"column": 6,
|
||||
"category": 7,
|
||||
"subcategory": 8,
|
||||
},
|
||||
"data": self.frameTable,
|
||||
},
|
||||
# https://github.com/firefox-devtools/profiler/blob/53970305b51b9b472e26d7457fee1d66cd4e2737/src/types/gecko-profile.js#L216
|
||||
"stackTable": {
|
||||
"schema": {
|
||||
"prefix": 0,
|
||||
"frame": 1,
|
||||
"category": 2,
|
||||
},
|
||||
"data": self.stackTable,
|
||||
},
|
||||
"stringTable": self.stringTable,
|
||||
"registerTime": 0,
|
||||
"unregisterTime": None,
|
||||
"processType": "default",
|
||||
}
|
||||
|
||||
|
||||
def remove_stack_gaps(max_remove_gap_length: int, thread_map: Dict[int, Thread]) -> None:
|
||||
""" Remove stack gaps for each thread, and print status. """
|
||||
if max_remove_gap_length == 0:
|
||||
return
|
||||
total_sample_count = 0
|
||||
remove_sample_count = 0
|
||||
gap_distr = Counter()
|
||||
for tid in list(thread_map.keys()):
|
||||
thread = thread_map[tid]
|
||||
old_n = len(thread.samples)
|
||||
thread.remove_stack_gaps(max_remove_gap_length, gap_distr)
|
||||
new_n = len(thread.samples)
|
||||
total_sample_count += old_n
|
||||
remove_sample_count += old_n - new_n
|
||||
if new_n == 0:
|
||||
del thread_map[tid]
|
||||
if total_sample_count != 0:
|
||||
logging.info('Remove stack gaps with length <= %d. %d (%.2f%%) samples are removed.',
|
||||
max_remove_gap_length, remove_sample_count,
|
||||
remove_sample_count / total_sample_count * 100
|
||||
)
|
||||
logging.debug('Stack gap length distribution among samples (gap_length: count): %s',
|
||||
gap_distr)
|
||||
|
||||
|
||||
def _gecko_profile(
|
||||
record_file: str,
|
||||
symfs_dir: Optional[str],
|
||||
kallsyms_file: Optional[str],
|
||||
report_lib_options: ReportLibOptions,
|
||||
max_remove_gap_length: int,
|
||||
percpu_samples: bool) -> GeckoProfile:
|
||||
"""convert a simpleperf profile to gecko format"""
|
||||
lib = GetReportLib(record_file)
|
||||
|
||||
lib.ShowIpForUnknownSymbol()
|
||||
if symfs_dir is not None:
|
||||
lib.SetSymfs(symfs_dir)
|
||||
if kallsyms_file is not None:
|
||||
lib.SetKallsymsFile(kallsyms_file)
|
||||
if percpu_samples:
|
||||
# Grouping samples by cpus doesn't support off cpu samples.
|
||||
if lib.GetSupportedTraceOffCpuModes():
|
||||
report_lib_options.trace_offcpu = 'on-cpu'
|
||||
lib.SetReportOptions(report_lib_options)
|
||||
|
||||
arch = lib.GetArch()
|
||||
meta_info = lib.MetaInfo()
|
||||
record_cmd = lib.GetRecordCmd()
|
||||
|
||||
# Map from tid to Thread
|
||||
thread_map: Dict[int, Thread] = {}
|
||||
# Map from pid to process name
|
||||
process_names: Dict[int, str] = {}
|
||||
|
||||
while True:
|
||||
sample = lib.GetNextSample()
|
||||
if sample is None:
|
||||
lib.Close()
|
||||
break
|
||||
symbol = lib.GetSymbolOfCurrentSample()
|
||||
callchain = lib.GetCallChainOfCurrentSample()
|
||||
sample_time_ms = sample.time / 1000000
|
||||
|
||||
stack = ['%s (in %s)' % (symbol.symbol_name, symbol.dso_name)]
|
||||
for i in range(callchain.nr):
|
||||
entry = callchain.entries[i]
|
||||
stack.append('%s (in %s)' % (entry.symbol.symbol_name, entry.symbol.dso_name))
|
||||
# We want root first, leaf last.
|
||||
stack.reverse()
|
||||
|
||||
if percpu_samples:
|
||||
if sample.tid == sample.pid:
|
||||
process_names[sample.pid] = sample.thread_comm
|
||||
process_name = process_names.get(sample.pid)
|
||||
stack = [
|
||||
'%s tid %d (in %s pid %d)' %
|
||||
(sample.thread_comm, sample.tid, process_name, sample.pid)] + stack
|
||||
thread = thread_map.get(sample.cpu)
|
||||
if thread is None:
|
||||
thread = Thread(comm=f'Cpu {sample.cpu}', pid=sample.cpu, tid=sample.cpu)
|
||||
thread_map[sample.cpu] = thread
|
||||
thread.add_sample(
|
||||
comm=f'Cpu {sample.cpu}',
|
||||
stack=stack,
|
||||
time_ms=sample_time_ms)
|
||||
else:
|
||||
# add thread sample
|
||||
thread = thread_map.get(sample.tid)
|
||||
if thread is None:
|
||||
thread = Thread(comm=sample.thread_comm, pid=sample.pid, tid=sample.tid)
|
||||
thread_map[sample.tid] = thread
|
||||
thread.add_sample(
|
||||
comm=sample.thread_comm,
|
||||
stack=stack,
|
||||
# We are being a bit fast and loose here with time here. simpleperf
|
||||
# uses CLOCK_MONOTONIC by default, which doesn't use the normal unix
|
||||
# epoch, but rather some arbitrary time. In practice, this doesn't
|
||||
# matter, the Firefox Profiler normalises all the timestamps to begin at
|
||||
# the minimum time. Consider fixing this in future, if needed, by
|
||||
# setting `simpleperf record --clockid realtime`.
|
||||
time_ms=sample_time_ms)
|
||||
|
||||
for thread in thread_map.values():
|
||||
thread.sort_samples()
|
||||
|
||||
remove_stack_gaps(max_remove_gap_length, thread_map)
|
||||
|
||||
threads = [thread.to_json_dict() for thread in thread_map.values()]
|
||||
|
||||
profile_timestamp = meta_info.get('timestamp')
|
||||
end_time_ms = (int(profile_timestamp) * 1000) if profile_timestamp else 0
|
||||
|
||||
# Schema: https://github.com/firefox-devtools/profiler/blob/53970305b51b9b472e26d7457fee1d66cd4e2737/src/types/gecko-profile.js#L305
|
||||
gecko_profile_meta = {
|
||||
"interval": 1,
|
||||
"processType": 0,
|
||||
"product": record_cmd,
|
||||
"device": meta_info.get("product_props"),
|
||||
"platform": meta_info.get("android_build_fingerprint"),
|
||||
"stackwalk": 1,
|
||||
"debug": 0,
|
||||
"gcpoison": 0,
|
||||
"asyncstack": 1,
|
||||
# The profile timestamp is actually the end time, not the start time.
|
||||
# This is close enough for our purposes; I mostly just want to know which
|
||||
# day the profile was taken! Consider fixing this in future, if needed,
|
||||
# by setting `simpleperf record --clockid realtime` and taking the minimum
|
||||
# sample time.
|
||||
"startTime": end_time_ms,
|
||||
"shutdownTime": None,
|
||||
"version": 24,
|
||||
"presymbolicated": True,
|
||||
"categories": CATEGORIES,
|
||||
"markerSchema": [],
|
||||
"abi": arch,
|
||||
"oscpu": meta_info.get("android_build_fingerprint"),
|
||||
"appBuildID": meta_info.get("app_versioncode"),
|
||||
}
|
||||
|
||||
# Schema:
|
||||
# https://github.com/firefox-devtools/profiler/blob/53970305b51b9b472e26d7457fee1d66cd4e2737/src/types/gecko-profile.js#L377
|
||||
# https://github.com/firefox-devtools/profiler/blob/main/docs-developer/gecko-profile-format.md
|
||||
return {
|
||||
"meta": gecko_profile_meta,
|
||||
"libs": [],
|
||||
"threads": threads,
|
||||
"processes": [],
|
||||
"pausedRanges": [],
|
||||
}
|
||||
|
||||
|
||||
def main() -> None:
|
||||
parser = BaseArgumentParser(description=__doc__)
|
||||
parser.add_argument('--symfs',
|
||||
help='Set the path to find binaries with symbols and debug info.')
|
||||
parser.add_argument('--kallsyms', help='Set the path to find kernel symbols.')
|
||||
parser.add_argument('-i', '--record_file', nargs='?', default='perf.data',
|
||||
help='Default is perf.data.')
|
||||
parser.add_argument('--remove-gaps', metavar='MAX_GAP_LENGTH', dest='max_remove_gap_length',
|
||||
type=int, default=3, help="""
|
||||
Ideally all callstacks are complete. But some may be broken for different
|
||||
reasons. To create a smooth view in "Stack Chart", remove small gaps of
|
||||
broken callstacks. MAX_GAP_LENGTH is the max length of continuous
|
||||
broken-stack samples we want to remove.
|
||||
"""
|
||||
)
|
||||
parser.add_argument(
|
||||
'--percpu-samples', action='store_true',
|
||||
help='show samples based on cpus instead of threads')
|
||||
parser.add_report_lib_options()
|
||||
args = parser.parse_args()
|
||||
profile = _gecko_profile(
|
||||
record_file=args.record_file,
|
||||
symfs_dir=args.symfs,
|
||||
kallsyms_file=args.kallsyms,
|
||||
report_lib_options=args.report_lib_options,
|
||||
max_remove_gap_length=args.max_remove_gap_length,
|
||||
percpu_samples=args.percpu_samples,
|
||||
)
|
||||
|
||||
json.dump(profile, sys.stdout, sort_keys=True)
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
|
||||
2
Android/android-ndk-r27d/simpleperf/inferno.bat
Normal file
@ -0,0 +1,2 @@
|
||||
set SCRIPTPATH=%~dp0
|
||||
python %SCRIPTPATH%inferno\inferno.py %*
|
||||
137
Android/android-ndk-r27d/simpleperf/inferno/data_types.py
Normal file
@ -0,0 +1,137 @@
|
||||
#
|
||||
# Copyright (C) 2016 The Android Open Source Project
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
|
||||
class CallSite(object):
|
||||
|
||||
def __init__(self, method, dso):
|
||||
self.method = method
|
||||
self.dso = dso
|
||||
|
||||
|
||||
class Thread(object):
|
||||
|
||||
def __init__(self, tid, pid):
|
||||
self.tid = tid
|
||||
self.pid = pid
|
||||
self.name = ""
|
||||
self.samples = []
|
||||
self.flamegraph = FlameGraphCallSite("root", "", 0)
|
||||
self.num_samples = 0
|
||||
self.num_events = 0
|
||||
|
||||
def add_callchain(self, callchain, symbol, sample):
|
||||
self.name = sample.thread_comm
|
||||
self.num_samples += 1
|
||||
self.num_events += sample.period
|
||||
chain = []
|
||||
for j in range(callchain.nr):
|
||||
entry = callchain.entries[callchain.nr - j - 1]
|
||||
if entry.ip == 0:
|
||||
continue
|
||||
chain.append(CallSite(entry.symbol.symbol_name, entry.symbol.dso_name))
|
||||
|
||||
chain.append(CallSite(symbol.symbol_name, symbol.dso_name))
|
||||
self.flamegraph.add_callchain(chain, sample.period)
|
||||
|
||||
|
||||
class Process(object):
|
||||
|
||||
def __init__(self, name, pid):
|
||||
self.name = name
|
||||
self.pid = pid
|
||||
self.threads = {}
|
||||
self.cmd = ""
|
||||
self.props = {}
|
||||
# num_samples is the count of samples recorded in the profiling file.
|
||||
self.num_samples = 0
|
||||
# num_events is the count of events contained in all samples. Each sample contains a
# count of events that happened since the last sample. If we use the cpu-cycles event,
# the count shows how many cpu-cycles have happened during recording.
|
||||
self.num_events = 0
|
||||
|
||||
def get_thread(self, tid, pid):
|
||||
thread = self.threads.get(tid)
|
||||
if thread is None:
|
||||
thread = self.threads[tid] = Thread(tid, pid)
|
||||
return thread
|
||||
|
||||
def add_sample(self, sample, symbol, callchain):
|
||||
thread = self.get_thread(sample.tid, sample.pid)
|
||||
thread.add_callchain(callchain, symbol, sample)
|
||||
self.num_samples += 1
|
||||
# sample.period is the count of events happened since last sample.
|
||||
self.num_events += sample.period
|
||||
|
||||
|
||||
class FlameGraphCallSite(object):
|
||||
|
||||
callsite_counter = 0
|
||||
@classmethod
|
||||
def _get_next_callsite_id(cls):
|
||||
cls.callsite_counter += 1
|
||||
return cls.callsite_counter
|
||||
|
||||
def __init__(self, method, dso, callsite_id):
|
||||
# map from (dso, method) to FlameGraphCallSite. Used to speed up add_callchain().
|
||||
self.child_dict = {}
|
||||
self.children = []
|
||||
self.method = method
|
||||
self.dso = dso
|
||||
self.num_events = 0
|
||||
self.offset = 0  # Offset allows positioning nodes in different branches.
|
||||
self.id = callsite_id
|
||||
|
||||
def weight(self):
|
||||
return float(self.num_events)
|
||||
|
||||
def add_callchain(self, chain, num_events):
|
||||
self.num_events += num_events
|
||||
current = self
|
||||
for callsite in chain:
|
||||
current = current.get_child(callsite)
|
||||
current.num_events += num_events
|
||||
|
||||
def get_child(self, callsite):
|
||||
key = (callsite.dso, callsite.method)
|
||||
child = self.child_dict.get(key)
|
||||
if child is None:
|
||||
child = self.child_dict[key] = FlameGraphCallSite(callsite.method, callsite.dso,
|
||||
self._get_next_callsite_id())
|
||||
return child
|
||||
|
||||
def trim_callchain(self, min_num_events, max_depth, depth=0):
|
||||
""" Remove call sites with num_events < min_num_events in the subtree.
|
||||
Remaining children are collected in a list.
|
||||
"""
|
||||
if depth <= max_depth:
|
||||
for key in self.child_dict:
|
||||
child = self.child_dict[key]
|
||||
if child.num_events >= min_num_events:
|
||||
child.trim_callchain(min_num_events, max_depth, depth + 1)
|
||||
self.children.append(child)
|
||||
# Release child_dict since it will not be used.
|
||||
self.child_dict = None
|
||||
|
||||
def get_max_depth(self):
|
||||
return max([c.get_max_depth() for c in self.children]) + 1 if self.children else 1
|
||||
|
||||
def generate_offset(self, start_offset):
|
||||
self.offset = start_offset
|
||||
child_offset = start_offset
|
||||
for child in self.children:
|
||||
child_offset = child.generate_offset(child_offset)
|
||||
return self.offset + self.num_events
|
||||
1
Android/android-ndk-r27d/simpleperf/inferno/inferno.b64
Normal file
369
Android/android-ndk-r27d/simpleperf/inferno/inferno.py
Normal file
@ -0,0 +1,369 @@
|
||||
#!/usr/bin/env python3
|
||||
#
|
||||
# Copyright (C) 2016 The Android Open Source Project
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
"""
|
||||
Inferno is a tool to generate flamegraphs for android programs. It was originally written
|
||||
to profile surfaceflinger (the Android compositor), but it can be used for other C++ programs.
|
||||
It uses simpleperf to collect data. Programs have to be compiled with frame pointers which
|
||||
excludes ART based programs for the time being.
|
||||
|
||||
Here is how it works:
|
||||
|
||||
1/ Data collection is started via simpleperf and pulled locally as "perf.data".
2/ The raw format is parsed and callstacks are merged to form a flamegraph data structure.
3/ The data structure is used to generate an SVG embedded into an HTML page.
4/ JavaScript is injected to allow flamegraph navigation, search, and coloring.
|
||||
|
||||
"""
|
||||
|
||||
import argparse
|
||||
import datetime
|
||||
import logging
|
||||
import os
|
||||
import subprocess
|
||||
import sys
|
||||
|
||||
# fmt: off
|
||||
# pylint: disable=wrong-import-position
|
||||
SCRIPTS_PATH = os.path.dirname(os.path.dirname(os.path.realpath(__file__)))
|
||||
sys.path.append(SCRIPTS_PATH)
|
||||
from simpleperf_report_lib import ReportLib
|
||||
from simpleperf_utils import (log_exit, log_fatal, AdbHelper, open_report_in_browser,
|
||||
BaseArgumentParser)
|
||||
|
||||
from data_types import Process
|
||||
from svg_renderer import get_proper_scaled_time_string, render_svg
|
||||
# fmt: on
|
||||
|
||||
|
||||
def collect_data(args):
|
||||
""" Run app_profiler.py to generate record file. """
|
||||
app_profiler_args = [sys.executable, os.path.join(SCRIPTS_PATH, "app_profiler.py"), "-nb"]
|
||||
if args.app:
|
||||
app_profiler_args += ["-p", args.app]
|
||||
elif args.native_program:
|
||||
app_profiler_args += ["-np", args.native_program]
|
||||
elif args.pid != -1:
|
||||
app_profiler_args += ['--pid', str(args.pid)]
|
||||
elif args.system_wide:
|
||||
app_profiler_args += ['--system_wide']
|
||||
else:
|
||||
log_exit("Please set profiling target with -p, -np, --pid or --system_wide option.")
|
||||
if args.compile_java_code:
|
||||
app_profiler_args.append("--compile_java_code")
|
||||
if args.disable_adb_root:
|
||||
app_profiler_args.append("--disable_adb_root")
|
||||
record_arg_str = ""
|
||||
if args.dwarf_unwinding:
|
||||
record_arg_str += "-g "
|
||||
else:
|
||||
record_arg_str += "--call-graph fp "
|
||||
if args.events:
|
||||
tokens = args.events.split()
|
||||
if len(tokens) == 2:
|
||||
num_events = tokens[0]
|
||||
event_name = tokens[1]
|
||||
record_arg_str += "-c %s -e %s " % (num_events, event_name)
|
||||
else:
|
||||
log_exit("Event format string of -e option cann't be recognized.")
|
||||
logging.info("Using event sampling (-c %s -e %s)." % (num_events, event_name))
|
||||
else:
|
||||
record_arg_str += "-f %d " % args.sample_frequency
|
||||
logging.info("Using frequency sampling (-f %d)." % args.sample_frequency)
|
||||
record_arg_str += "--duration %d " % args.capture_duration
|
||||
app_profiler_args += ["-r", record_arg_str]
|
||||
returncode = subprocess.call(app_profiler_args)
|
||||
return returncode == 0
|
||||
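# A minimal sketch of the record argument string that collect_data() above builds for
# a frequency-sampling run; 6000 Hz and 10 seconds mirror this script's defaults and
# are otherwise just illustrative.
def _example_record_arg_str(sample_frequency=6000, capture_duration=10, dwarf_unwinding=False):
    record_arg_str = "-g " if dwarf_unwinding else "--call-graph fp "
    record_arg_str += "-f %d " % sample_frequency
    record_arg_str += "--duration %d " % capture_duration
    # e.g. "--call-graph fp -f 6000 --duration 10 ", passed to app_profiler.py via -r.
    return record_arg_str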
|
||||
|
||||
def parse_samples(process, args, sample_filter_fn):
|
||||
"""Read samples from record file.
|
||||
process: Process object
|
||||
args: arguments
|
||||
sample_filter_fn: if not None, it is used to modify and filter samples.
It returns False for samples that should be filtered out.
|
||||
"""
|
||||
|
||||
record_file = args.record_file
|
||||
symfs_dir = args.symfs
|
||||
kallsyms_file = args.kallsyms
|
||||
|
||||
lib = ReportLib()
|
||||
|
||||
lib.ShowIpForUnknownSymbol()
|
||||
if symfs_dir:
|
||||
lib.SetSymfs(symfs_dir)
|
||||
if record_file:
|
||||
lib.SetRecordFile(record_file)
|
||||
if kallsyms_file:
|
||||
lib.SetKallsymsFile(kallsyms_file)
|
||||
lib.SetReportOptions(args.report_lib_options)
|
||||
process.cmd = lib.GetRecordCmd()
|
||||
product_props = lib.MetaInfo().get("product_props")
|
||||
if product_props:
|
||||
manufacturer, model, name = product_props.split(':')
|
||||
process.props['ro.product.manufacturer'] = manufacturer
|
||||
process.props['ro.product.model'] = model
|
||||
process.props['ro.product.name'] = name
|
||||
if lib.MetaInfo().get('trace_offcpu') == 'true':
|
||||
process.props['trace_offcpu'] = True
|
||||
if args.one_flamegraph:
|
||||
log_exit("It doesn't make sense to report with --one-flamegraph for perf.data " +
|
||||
"recorded with --trace-offcpu.""")
|
||||
else:
|
||||
process.props['trace_offcpu'] = False
|
||||
|
||||
while True:
|
||||
sample = lib.GetNextSample()
|
||||
if sample is None:
|
||||
lib.Close()
|
||||
break
|
||||
symbol = lib.GetSymbolOfCurrentSample()
|
||||
callchain = lib.GetCallChainOfCurrentSample()
|
||||
if sample_filter_fn and not sample_filter_fn(sample, symbol, callchain):
|
||||
continue
|
||||
process.add_sample(sample, symbol, callchain)
|
||||
|
||||
if process.pid == 0:
|
||||
main_threads = [thread for thread in process.threads.values() if thread.tid == thread.pid]
|
||||
if main_threads:
|
||||
process.name = main_threads[0].name
|
||||
process.pid = main_threads[0].pid
|
||||
|
||||
for thread in process.threads.values():
|
||||
min_event_count = thread.num_events * args.min_callchain_percentage * 0.01
|
||||
thread.flamegraph.trim_callchain(min_event_count, args.max_callchain_depth)
|
||||
|
||||
logging.info("Parsed %s callchains." % process.num_samples)
|
||||
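# A minimal sketch of a custom sample_filter_fn for parse_samples() above: it keeps
# only samples from one thread. The tid in the usage note is hypothetical; the
# contract (return False to drop a sample) matches the parse_samples() docstring.
def _example_tid_filter(tid_to_keep):
    def filter_fn(sample, _symbol, _callchain):
        return sample.tid == tid_to_keep
    return filter_fn
# Usage: parse_samples(process, args, _example_tid_filter(1234))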
|
||||
|
||||
def get_local_asset_content(local_path):
|
||||
"""
|
||||
Retrieves local package text content
|
||||
:param local_path: str, filename of local asset
|
||||
:return: str, the content of local_path
|
||||
"""
|
||||
with open(os.path.join(os.path.dirname(__file__), local_path), 'r') as f:
|
||||
return f.read()
|
||||
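# get_local_asset_content() resolves paths relative to this file; output_report()
# below uses it for the embedded logo ("inferno.b64") and the navigation script
# ("script.js"). A minimal illustration:
def _example_load_assets():
    return get_local_asset_content("inferno.b64"), get_local_asset_content("script.js")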
|
||||
|
||||
def output_report(process, args):
|
||||
"""
|
||||
Generates an HTML report representing the result of simpleperf sampling as a flamegraph.
:param process: Process object
:param args: command-line arguments
:return: str, "file://" URL of the generated report
"""
|
||||
f = open(args.report_path, 'w')
|
||||
filepath = os.path.realpath(f.name)
|
||||
if not args.embedded_flamegraph:
|
||||
f.write("<html><body>")
|
||||
f.write("<div id='flamegraph_id' style='font-family: Monospace; %s'>" % (
|
||||
"display: none;" if args.embedded_flamegraph else ""))
|
||||
f.write("""<style type="text/css"> .s { stroke:black; stroke-width:0.5; cursor:pointer;}
|
||||
</style>""")
|
||||
f.write('<style type="text/css"> .t:hover { cursor:pointer; } </style>')
|
||||
f.write('<img height="180" alt = "Embedded Image" src ="data')
|
||||
f.write(get_local_asset_content("inferno.b64"))
|
||||
f.write('"/>')
|
||||
process_entry = ("Process : %s (%d)<br/>" % (process.name, process.pid)) if process.pid else ""
|
||||
thread_entry = '' if args.one_flamegraph else ('Threads: %d<br/>' % len(process.threads))
|
||||
if process.props['trace_offcpu']:
|
||||
event_entry = 'Total time: %s<br/>' % get_proper_scaled_time_string(process.num_events)
|
||||
else:
|
||||
event_entry = 'Event count: %s<br/>' % ("{:,}".format(process.num_events))
|
||||
# TODO: collect capture duration info from perf.data.
|
||||
duration_entry = ("Duration: %s seconds<br/>" % args.capture_duration
|
||||
) if args.capture_duration else ""
|
||||
f.write("""<div style='display:inline-block;'>
|
||||
<font size='8'>
|
||||
Inferno Flamegraph Report%s</font><br/><br/>
|
||||
%s
|
||||
Date : %s<br/>
|
||||
%s
|
||||
Samples : %d<br/>
|
||||
%s
|
||||
%s""" % ((': ' + args.title) if args.title else '',
|
||||
process_entry,
|
||||
datetime.datetime.now().strftime("%Y-%m-%d (%A) %H:%M:%S"),
|
||||
thread_entry,
|
||||
process.num_samples,
|
||||
event_entry,
|
||||
duration_entry))
|
||||
if 'ro.product.model' in process.props:
|
||||
f.write(
|
||||
"Machine : %s (%s) by %s<br/>" %
|
||||
(process.props["ro.product.model"],
|
||||
process.props["ro.product.name"],
|
||||
process.props["ro.product.manufacturer"]))
|
||||
if process.cmd:
|
||||
f.write("Capture : %s<br/><br/>" % process.cmd)
|
||||
f.write("</div>")
|
||||
f.write("""<br/><br/>
|
||||
<div>Navigate with WASD, zoom in with SPACE, zoom out with BACKSPACE.</div>""")
|
||||
f.write("<script>%s</script>" % get_local_asset_content("script.js"))
|
||||
if not args.embedded_flamegraph:
|
||||
f.write("<script>document.addEventListener('DOMContentLoaded', flamegraphInit);</script>")
|
||||
|
||||
# Sort threads by their event counts, in descending order.
|
||||
for thread in sorted(process.threads.values(), key=lambda x: x.num_events, reverse=True):
|
||||
thread_name = 'One flamegraph' if args.one_flamegraph else ('Thread %d (%s)' %
|
||||
(thread.tid, thread.name))
|
||||
f.write("<br/><br/><b>%s (%d samples):</b><br/>\n\n\n\n" %
|
||||
(thread_name, thread.num_samples))
|
||||
render_svg(process, thread.flamegraph, f, args.color)
|
||||
|
||||
f.write("</div>")
|
||||
if not args.embedded_flamegraph:
|
||||
f.write("</body></html")
|
||||
f.close()
|
||||
return "file://" + filepath
|
||||
|
||||
|
||||
def generate_threads_offsets(process):
|
||||
for thread in process.threads.values():
|
||||
thread.flamegraph.generate_offset(0)
|
||||
|
||||
|
||||
def collect_machine_info(process):
|
||||
adb = AdbHelper()
|
||||
process.props = {}
|
||||
process.props['ro.product.model'] = adb.get_property('ro.product.model')
|
||||
process.props['ro.product.name'] = adb.get_property('ro.product.name')
|
||||
process.props['ro.product.manufacturer'] = adb.get_property('ro.product.manufacturer')
|
||||
|
||||
|
||||
def main():
|
||||
# Allow deep callchain with length >1000.
|
||||
sys.setrecursionlimit(1500)
|
||||
parser = BaseArgumentParser(description="""Report samples in perf.data. Default option
|
||||
is: "-np surfaceflinger -f 6000 -t 10".""")
|
||||
record_group = parser.add_argument_group('Record options')
|
||||
record_group.add_argument('-du', '--dwarf_unwinding', action='store_true', help="""Perform
|
||||
unwinding using dwarf instead of fp.""")
|
||||
record_group.add_argument('-e', '--events', default="", help="""Sample based on event
|
||||
occurrences instead of frequency. Format expected is
"event_counts event_name", e.g. "10000 cpu-cycles". A few examples
|
||||
of event_name: cpu-cycles, cache-references, cache-misses,
|
||||
branch-instructions, branch-misses""")
|
||||
record_group.add_argument('-f', '--sample_frequency', type=int, default=6000, help="""Sample
|
||||
frequency""")
|
||||
record_group.add_argument('--compile_java_code', action='store_true',
|
||||
help="""On Android N and Android O, we need to compile Java code
|
||||
into native instructions to profile Java code. Android O
|
||||
also needs wrap.sh in the apk to use the native
|
||||
instructions.""")
|
||||
record_group.add_argument('-np', '--native_program', default="surfaceflinger", help="""Profile
|
||||
a native program. The program should be running on the device.
|
||||
Like -np surfaceflinger.""")
|
||||
record_group.add_argument('-p', '--app', help="""Profile an Android app, given the package
|
||||
name. Like -p com.example.android.myapp.""")
|
||||
record_group.add_argument('--pid', type=int, default=-1, help="""Profile a native program
|
||||
with given pid, the pid should exist on the device.""")
|
||||
record_group.add_argument('--record_file', default='perf.data', help='Default is perf.data.')
|
||||
record_group.add_argument('-sc', '--skip_collection', action='store_true', help="""Skip data
|
||||
collection""")
|
||||
record_group.add_argument('--system_wide', action='store_true', help='Profile system wide.')
|
||||
record_group.add_argument('-t', '--capture_duration', type=int, default=10, help="""Capture
|
||||
duration in seconds.""")
|
||||
|
||||
report_group = parser.add_argument_group('Report options')
|
||||
report_group.add_argument('-c', '--color', default='hot', choices=['hot', 'dso', 'legacy'],
|
||||
help="""Color theme: hot=percentage of samples, dso=callsite DSO
|
||||
name, legacy=Brendan Gregg style""")
|
||||
report_group.add_argument('--embedded_flamegraph', action='store_true', help="""Generate
|
||||
embedded flamegraph.""")
|
||||
report_group.add_argument('--kallsyms', help='Set the path to find kernel symbols.')
|
||||
report_group.add_argument('--min_callchain_percentage', default=0.01, type=float, help="""
|
||||
Set min percentage of callchains shown in the report.
|
||||
It is used to limit nodes shown in the flamegraph. For example,
|
||||
when set to 0.01, only callchains taking >= 0.01%% of the event
|
||||
count of the owner thread are collected in the report.""")
|
||||
report_group.add_argument('--max_callchain_depth', default=1000000000, type=int, help="""
|
||||
Set maximum depth of callchains shown in the report. It is used
|
||||
to limit the nodes shown in the flamegraph and avoid processing
|
||||
limits. For example, when set to 10, callstacks will be cut after
|
||||
the tenth frame.""")
|
||||
report_group.add_argument('--no_browser', action='store_true', help="""Don't open report
|
||||
in browser.""")
|
||||
report_group.add_argument('-o', '--report_path', default='report.html', help="""Set report
|
||||
path.""")
|
||||
report_group.add_argument('--one-flamegraph', action='store_true', help="""Generate one
|
||||
flamegraph instead of one for each thread.""")
|
||||
report_group.add_argument('--symfs', help="""Set the path to find binaries with symbols and
|
||||
debug info.""")
|
||||
report_group.add_argument('--title', help='Show a title in the report.')
|
||||
parser.add_report_lib_options(
|
||||
report_group, sample_filter_group=report_group, sample_filter_with_pid_shortcut=False)
|
||||
|
||||
debug_group = parser.add_argument_group('Debug options')
|
||||
debug_group.add_argument('--disable_adb_root', action='store_true', help="""Force adb to run
|
||||
in non root mode.""")
|
||||
args = parser.parse_args()
|
||||
process = Process("", 0)
|
||||
|
||||
if not args.skip_collection:
|
||||
if args.pid != -1:
|
||||
process.pid = args.pid
|
||||
args.native_program = ''
|
||||
if args.system_wide:
|
||||
process.pid = -1
|
||||
args.native_program = ''
|
||||
|
||||
if args.system_wide:
|
||||
process.name = 'system_wide'
|
||||
else:
|
||||
process.name = args.app or args.native_program or ('Process %d' % args.pid)
|
||||
logging.info("Starting data collection stage for '%s'." % process.name)
|
||||
if not collect_data(args):
|
||||
log_exit("Unable to collect data.")
|
||||
if process.pid == 0:
|
||||
result, output = AdbHelper().run_and_return_output(['shell', 'pidof', process.name])
|
||||
if result:
|
||||
try:
|
||||
process.pid = int(output)
|
||||
except ValueError:
|
||||
process.pid = 0
|
||||
collect_machine_info(process)
|
||||
else:
|
||||
args.capture_duration = 0
|
||||
|
||||
sample_filter_fn = None
|
||||
if args.one_flamegraph:
|
||||
def filter_fn(sample, _symbol, _callchain):
|
||||
sample.pid = sample.tid = process.pid
|
||||
return True
|
||||
sample_filter_fn = filter_fn
|
||||
if not args.title:
|
||||
args.title = ''
|
||||
args.title += '(One Flamegraph)'
|
||||
|
||||
try:
|
||||
parse_samples(process, args, sample_filter_fn)
|
||||
generate_threads_offsets(process)
|
||||
report_path = output_report(process, args)
|
||||
if not args.no_browser:
|
||||
open_report_in_browser(report_path)
|
||||
except RuntimeError as r:
|
||||
if 'maximum recursion depth' in str(r):
|
||||
log_fatal("Recursion limit exceeded (%s), try --max_callchain_depth." % r)
|
||||
raise r
|
||||
|
||||
logging.info("Flamegraph generated at '%s'." % report_path)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
274
Android/android-ndk-r27d/simpleperf/inferno/script.js
Normal file
@ -0,0 +1,274 @@
|
||||
/*
|
||||
* Copyright (C) 2017 The Android Open Source Project
|
||||
*
|
||||
* Licensed under the Apache License, Version 2.0 (the "License");
|
||||
* you may not use this file except in compliance with the License.
|
||||
* You may obtain a copy of the License at
|
||||
*
|
||||
* http://www.apache.org/licenses/LICENSE-2.0
|
||||
*
|
||||
* Unless required by applicable law or agreed to in writing, software
|
||||
* distributed under the License is distributed on an "AS IS" BASIS,
|
||||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
* See the License for the specific language governing permissions and
|
||||
* limitations under the License.
|
||||
*/
|
||||
'use strict';
|
||||
|
||||
function flamegraphInit() {
|
||||
let flamegraph = document.getElementById('flamegraph_id');
|
||||
let svgs = flamegraph.getElementsByTagName('svg');
|
||||
for (let i = 0; i < svgs.length; ++i) {
|
||||
createZoomHistoryStack(svgs[i]);
|
||||
adjust_text_size(svgs[i]);
|
||||
}
|
||||
|
||||
function throttle(callback) {
|
||||
let running = false;
|
||||
return function() {
|
||||
if (!running) {
|
||||
running = true;
|
||||
window.requestAnimationFrame(function () {
|
||||
callback();
|
||||
running = false;
|
||||
});
|
||||
}
|
||||
};
|
||||
}
|
||||
window.addEventListener('resize', throttle(function() {
|
||||
let flamegraph = document.getElementById('flamegraph_id');
|
||||
let svgs = flamegraph.getElementsByTagName('svg');
|
||||
for (let i = 0; i < svgs.length; ++i) {
|
||||
adjust_text_size(svgs[i]);
|
||||
}
|
||||
}));
|
||||
}
|
||||
|
||||
// Create a stack and add the root svg element to it.
|
||||
function createZoomHistoryStack(svgElement) {
|
||||
svgElement.zoomStack = [svgElement.getElementById(svgElement.attributes['rootid'].value)];
|
||||
}
|
||||
|
||||
function adjust_node_text_size(x, svgWidth) {
|
||||
let title = x.getElementsByTagName('title')[0];
|
||||
let text = x.getElementsByTagName('text')[0];
|
||||
let rect = x.getElementsByTagName('rect')[0];
|
||||
|
||||
let width = parseFloat(rect.attributes['width'].value) * svgWidth * 0.01;
|
||||
|
||||
// Don't even bother trying to find a best fit. The area is too small.
|
||||
if (width < 28) {
|
||||
text.textContent = '';
|
||||
return;
|
||||
}
|
||||
// Remove dso and #samples which are here only for mouseover purposes.
|
||||
let methodName = title.textContent.split(' | ')[0];
|
||||
|
||||
let numCharacters;
|
||||
for (numCharacters = methodName.length; numCharacters > 4; numCharacters--) {
|
||||
// Avoid reflow by using hard-coded estimate instead of
|
||||
// text.getSubStringLength(0, numCharacters).
|
||||
if (numCharacters * 7.5 <= width) {
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
if (numCharacters == methodName.length) {
|
||||
text.textContent = methodName;
|
||||
return;
|
||||
}
|
||||
|
||||
text.textContent = methodName.substring(0, numCharacters-2) + '..';
|
||||
}
|
||||
|
||||
function adjust_text_size(svgElement) {
|
||||
let svgWidth = window.innerWidth;
|
||||
let x = svgElement.getElementsByTagName('g');
|
||||
for (let i = 0; i < x.length; i++) {
|
||||
adjust_node_text_size(x[i], svgWidth);
|
||||
}
|
||||
}
|
||||
|
||||
function zoom(e) {
|
||||
let svgElement = e.ownerSVGElement;
|
||||
let zoomStack = svgElement.zoomStack;
|
||||
zoomStack.push(e);
|
||||
displaySVGElement(svgElement);
|
||||
select(e);
|
||||
|
||||
// Show zoom out button.
|
||||
svgElement.getElementById('zoom_rect').style.display = 'block';
|
||||
svgElement.getElementById('zoom_text').style.display = 'block';
|
||||
}
|
||||
|
||||
function displaySVGElement(svgElement) {
|
||||
let zoomStack = svgElement.zoomStack;
|
||||
let e = zoomStack[zoomStack.length - 1];
|
||||
let clicked_rect = e.getElementsByTagName('rect')[0];
|
||||
let clicked_origin_x;
|
||||
let clicked_origin_y = clicked_rect.attributes['oy'].value;
|
||||
let clicked_origin_width;
|
||||
|
||||
if (zoomStack.length == 1) {
|
||||
// Show all nodes when zoomStack only contains the root node.
|
||||
// This is needed to show flamegraph containing more than one node at the root level.
|
||||
clicked_origin_x = 0;
|
||||
clicked_origin_width = 100;
|
||||
} else {
|
||||
clicked_origin_x = clicked_rect.attributes['ox'].value;
|
||||
clicked_origin_width = clicked_rect.attributes['owidth'].value;
|
||||
}
|
||||
|
||||
|
||||
let svgBox = svgElement.getBoundingClientRect();
|
||||
let svgBoxHeight = svgBox.height;
|
||||
let svgBoxWidth = 100;
|
||||
let scaleFactor = svgBoxWidth / clicked_origin_width;
|
||||
|
||||
let callsites = svgElement.getElementsByTagName('g');
|
||||
for (let i = 0; i < callsites.length; i++) {
|
||||
let text = callsites[i].getElementsByTagName('text')[0];
|
||||
let rect = callsites[i].getElementsByTagName('rect')[0];
|
||||
|
||||
let rect_o_x = parseFloat(rect.attributes['ox'].value);
|
||||
let rect_o_y = parseFloat(rect.attributes['oy'].value);
|
||||
|
||||
// Avoid multiple forced reflow by hiding nodes.
|
||||
if (rect_o_y > clicked_origin_y) {
|
||||
rect.style.display = 'none';
|
||||
text.style.display = 'none';
|
||||
continue;
|
||||
}
|
||||
rect.style.display = 'block';
|
||||
text.style.display = 'block';
|
||||
|
||||
let newrec_x = rect.attributes['x'].value = (rect_o_x - clicked_origin_x) * scaleFactor +
|
||||
'%';
|
||||
let newrec_y = rect.attributes['y'].value = rect_o_y + (svgBoxHeight - clicked_origin_y
|
||||
- 17 - 2);
|
||||
|
||||
text.attributes['y'].value = newrec_y + 12;
|
||||
text.attributes['x'].value = newrec_x;
|
||||
|
||||
rect.attributes['width'].value = (rect.attributes['owidth'].value * scaleFactor) + '%';
|
||||
}
|
||||
|
||||
adjust_text_size(svgElement);
|
||||
}
|
||||
|
||||
function unzoom(e) {
|
||||
let svgOwner = e.ownerSVGElement;
|
||||
let stack = svgOwner.zoomStack;
|
||||
|
||||
// Unhighlight whatever was selected.
|
||||
if (selected) {
|
||||
selected.classList.remove('s');
|
||||
}
|
||||
|
||||
// Stack management: Never remove the last element which is the flamegraph root.
|
||||
if (stack.length > 1) {
|
||||
let previouslySelected = stack.pop();
|
||||
select(previouslySelected);
|
||||
}
|
||||
|
||||
// Hide zoom out button.
|
||||
if (stack.length == 1) {
|
||||
svgOwner.getElementById('zoom_rect').style.display = 'none';
|
||||
svgOwner.getElementById('zoom_text').style.display = 'none';
|
||||
}
|
||||
|
||||
displaySVGElement(svgOwner);
|
||||
}
|
||||
|
||||
function search(e) {
|
||||
let term = prompt('Search for:', '');
|
||||
let callsites = e.ownerSVGElement.getElementsByTagName('g');
|
||||
|
||||
if (!term) {
|
||||
for (let i = 0; i < callsites.length; i++) {
|
||||
let rect = callsites[i].getElementsByTagName('rect')[0];
|
||||
rect.attributes['fill'].value = rect.attributes['ofill'].value;
|
||||
}
|
||||
return;
|
||||
}
|
||||
|
||||
for (let i = 0; i < callsites.length; i++) {
|
||||
let title = callsites[i].getElementsByTagName('title')[0];
|
||||
let rect = callsites[i].getElementsByTagName('rect')[0];
|
||||
if (title.textContent.indexOf(term) != -1) {
|
||||
rect.attributes['fill'].value = 'rgb(230,100,230)';
|
||||
} else {
|
||||
rect.attributes['fill'].value = rect.attributes['ofill'].value;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
let selected;
|
||||
document.addEventListener('keydown', (e) => {
|
||||
if (!selected) {
|
||||
return false;
|
||||
}
|
||||
|
||||
let nav = selected.attributes['nav'].value.split(',');
|
||||
let navigation_index;
|
||||
switch (e.keyCode) {
|
||||
// case 38: // ARROW UP
case 87: navigation_index = 0; break; // W

// case 37: // ARROW LEFT
case 65: navigation_index = 1; break; // A

// case 40: // ARROW DOWN
case 83: navigation_index = 2; break; // S

// case 39: // ARROW RIGHT
case 68: navigation_index = 3; break; // D
|
||||
|
||||
case 32: zoom(selected); return false; // SPACE
|
||||
|
||||
case 8: // BACKSPACE
|
||||
unzoom(selected); return false;
|
||||
default: return true;
|
||||
}
|
||||
|
||||
if (nav[navigation_index] == '0') {
|
||||
return false;
|
||||
}
|
||||
|
||||
let target_element = selected.ownerSVGElement.getElementById(nav[navigation_index]);
|
||||
select(target_element);
|
||||
return false;
|
||||
});
|
||||
|
||||
function select(e) {
|
||||
if (selected) {
|
||||
selected.classList.remove('s');
|
||||
}
|
||||
selected = e;
|
||||
selected.classList.add('s');
|
||||
|
||||
// Update info bar
|
||||
let titleElement = selected.getElementsByTagName('title')[0];
|
||||
let text = titleElement.textContent;
|
||||
|
||||
// Parse title
|
||||
let method_and_info = text.split(' | ');
|
||||
let methodName = method_and_info[0];
|
||||
let info = method_and_info[1];
|
||||
|
||||
// Parse info
|
||||
// '/system/lib64/libhwbinder.so (4 events: 0.28%)'
|
||||
let regexp = /(.*) \((.*)\)/g;
|
||||
let match = regexp.exec(info);
|
||||
if (match.length > 2) {
|
||||
let percentage = match[2];
|
||||
// Write percentage
|
||||
let percentageTextElement = selected.ownerSVGElement.getElementById('percent_text');
|
||||
percentageTextElement.textContent = percentage;
|
||||
// console.log("'" + percentage + "'")
|
||||
}
|
||||
|
||||
// Set fields
|
||||
let barTextElement = selected.ownerSVGElement.getElementById('info_text');
|
||||
barTextElement.textContent = methodName;
|
||||
}
|
||||
204
Android/android-ndk-r27d/simpleperf/inferno/svg_renderer.py
Normal file
@ -0,0 +1,204 @@
|
||||
#
|
||||
# Copyright (C) 2016 The Android Open Source Project
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import sys
|
||||
|
||||
SVG_NODE_HEIGHT = 17
|
||||
FONT_SIZE = 12
|
||||
|
||||
UNZOOM_NODE_ORIGIN_X = 10
|
||||
UNZOOM_NODE_WIDTH = 80
|
||||
INFO_NODE_ORIGIN_X = 120
|
||||
INFO_NODE_WIDTH = 800
|
||||
PERCENT_NODE_ORIGIN_X = 930
|
||||
PERCENT_NODE_WIDTH = 250
|
||||
SEARCH_NODE_ORIGIN_X = 1190
|
||||
SEARCH_NODE_WIDTH = 80
|
||||
RECT_TEXT_PADDING = 10
|
||||
|
||||
|
||||
def hash_to_float(string):
|
||||
return hash(string) / float(sys.maxsize)
|
||||
|
||||
|
||||
def get_legacy_color(method):
    # Reverse the string itself; hashing a reversed() iterator would hash the
    # iterator object rather than the characters, giving colors unrelated to the name.
    r = 175 + int(50 * hash_to_float(method[::-1]))
    g = 60 + int(180 * hash_to_float(method))
    b = 60 + int(55 * hash_to_float(method[::-1]))
    return (r, g, b)


def get_dso_color(method):
    r = 170 + int(80 * hash_to_float(method[::-1]))
    g = 180 + int(70 * hash_to_float(method))
    b = 170 + int(80 * hash_to_float(method[::-1]))
    return (r, g, b)
|
||||
|
||||
|
||||
def get_heat_color(callsite, total_weight):
|
||||
r = 245 + 10 * (1 - callsite.weight() / total_weight)
|
||||
g = 110 + 105 * (1 - callsite.weight() / total_weight)
|
||||
b = 100
|
||||
return (r, g, b)
|
||||
|
||||
def get_proper_scaled_time_string(value):
    if value >= 1e9:
        return '%.3f s' % (value / 1e9)
    if value >= 1e6:
        return '%.3f ms' % (value / 1e6)
    if value >= 1e3:
        return '%.3f us' % (value / 1e3)
    return '%.0f ns' % value
|
||||
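# A tiny illustration of get_proper_scaled_time_string(): when profiling with
# --trace-offcpu the sample weights are treated as nanoseconds, so they are scaled to
# the largest readable unit. The values are arbitrary examples.
def _example_scaled_time_strings():
    assert get_proper_scaled_time_string(500) == '500 ns'
    assert get_proper_scaled_time_string(2500) == '2.500 us'
    assert get_proper_scaled_time_string(3.2e6) == '3.200 ms'
    assert get_proper_scaled_time_string(1.5e9) == '1.500 s'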
|
||||
def create_svg_node(process, callsite, depth, f, total_weight, height, color_scheme, nav):
|
||||
x = float(callsite.offset) / total_weight * 100
|
||||
y = height - (depth + 1) * SVG_NODE_HEIGHT
|
||||
width = callsite.weight() / total_weight * 100
|
||||
|
||||
method = callsite.method.replace(">", "&gt;").replace("<", "&lt;")
|
||||
if width <= 0:
|
||||
return
|
||||
|
||||
if color_scheme == "dso":
|
||||
r, g, b = get_dso_color(callsite.dso)
|
||||
elif color_scheme == "legacy":
|
||||
r, g, b = get_legacy_color(method)
|
||||
else:
|
||||
r, g, b = get_heat_color(callsite, total_weight)
|
||||
|
||||
r_border, g_border, b_border = [max(0, color - 50) for color in [r, g, b]]
|
||||
|
||||
if process.props['trace_offcpu']:
|
||||
weight_str = get_proper_scaled_time_string(callsite.weight())
|
||||
else:
|
||||
weight_str = "{:,}".format(int(callsite.weight())) + ' events'
|
||||
|
||||
f.write(
|
||||
"""<g id=%d class="n" onclick="zoom(this);" onmouseenter="select(this);" nav="%s">
|
||||
<title>%s | %s (%s: %3.2f%%)</title>
|
||||
<rect x="%f%%" y="%f" ox="%f" oy="%f" width="%f%%" owidth="%f" height="15.0"
|
||||
ofill="rgb(%d,%d,%d)" fill="rgb(%d,%d,%d)" style="stroke:rgb(%d,%d,%d)"/>
|
||||
<text x="%f%%" y="%f" font-size="%d" font-family="Monospace"></text>
|
||||
</g>""" %
|
||||
(callsite.id,
|
||||
','.join(str(x) for x in nav),
|
||||
method,
|
||||
callsite.dso,
|
||||
weight_str,
|
||||
callsite.weight() / total_weight * 100,
|
||||
x,
|
||||
y,
|
||||
x,
|
||||
y,
|
||||
width,
|
||||
width,
|
||||
r,
|
||||
g,
|
||||
b,
|
||||
r,
|
||||
g,
|
||||
b,
|
||||
r_border,
|
||||
g_border,
|
||||
b_border,
|
||||
x,
|
||||
y + 12,
|
||||
FONT_SIZE))
|
||||
|
||||
|
||||
def render_svg_nodes(process, flamegraph, depth, f, total_weight, height, color_scheme):
|
||||
for i, child in enumerate(flamegraph.children):
|
||||
# Prebuild navigation targets for WASD keyboard navigation.
|
||||
|
||||
if i == 0:
|
||||
left_index = 0
|
||||
else:
|
||||
left_index = flamegraph.children[i - 1].id
|
||||
|
||||
if i == len(flamegraph.children) - 1:
|
||||
right_index = 0
|
||||
else:
|
||||
right_index = flamegraph.children[i + 1].id
|
||||
|
||||
up_index = max(child.children, key=lambda x: x.weight()).id if child.children else 0
|
||||
|
||||
# up, left, down, right
|
||||
nav = [up_index, left_index, flamegraph.id, right_index]
|
||||
|
||||
create_svg_node(process, child, depth, f, total_weight, height, color_scheme, nav)
|
||||
# Recurse down
|
||||
render_svg_nodes(process, child, depth + 1, f, total_weight, height, color_scheme)
|
||||
|
||||
|
||||
def render_search_node(f):
|
||||
f.write(
|
||||
"""<rect id="search_rect" style="stroke:rgb(0,0,0);" onclick="search(this);" class="t"
|
||||
rx="10" ry="10" x="%d" y="10" width="%d" height="30" fill="rgb(255,255,255)""/>
|
||||
<text id="search_text" class="t" x="%d" y="30" onclick="search(this);">Search</text>
|
||||
""" % (SEARCH_NODE_ORIGIN_X, SEARCH_NODE_WIDTH, SEARCH_NODE_ORIGIN_X + RECT_TEXT_PADDING))
|
||||
|
||||
|
||||
def render_unzoom_node(f):
|
||||
f.write(
|
||||
"""<rect id="zoom_rect" style="display:none;stroke:rgb(0,0,0);" class="t"
|
||||
onclick="unzoom(this);" rx="10" ry="10" x="%d" y="10" width="%d" height="30"
|
||||
fill="rgb(255,255,255)"/>
|
||||
<text id="zoom_text" style="display:none;" class="t" x="%d" y="30"
|
||||
onclick="unzoom(this);">Zoom out</text>
|
||||
""" % (UNZOOM_NODE_ORIGIN_X, UNZOOM_NODE_WIDTH, UNZOOM_NODE_ORIGIN_X + RECT_TEXT_PADDING))
|
||||
|
||||
|
||||
def render_info_node(f):
|
||||
f.write(
|
||||
"""<clipPath id="info_clip_path"> <rect id="info_rect" style="stroke:rgb(0,0,0);"
|
||||
rx="10" ry="10" x="%d" y="10" width="%d" height="30" fill="rgb(255,255,255)"/>
|
||||
</clipPath>
|
||||
<rect id="info_rect" style="stroke:rgb(0,0,0);"
|
||||
rx="10" ry="10" x="%d" y="10" width="%d" height="30" fill="rgb(255,255,255)"/>
|
||||
<text clip-path="url(#info_clip_path)" id="info_text" x="%d" y="30"></text>
|
||||
""" % (INFO_NODE_ORIGIN_X, INFO_NODE_WIDTH, INFO_NODE_ORIGIN_X, INFO_NODE_WIDTH,
|
||||
INFO_NODE_ORIGIN_X + RECT_TEXT_PADDING))
|
||||
|
||||
|
||||
def render_percent_node(f):
|
||||
f.write(
|
||||
"""<rect id="percent_rect" style="stroke:rgb(0,0,0);"
|
||||
rx="10" ry="10" x="%d" y="10" width="%d" height="30" fill="rgb(255,255,255)"/>
|
||||
<text id="percent_text" text-anchor="end" x="%d" y="30">100.00%%</text>
|
||||
""" % (PERCENT_NODE_ORIGIN_X, PERCENT_NODE_WIDTH,
|
||||
PERCENT_NODE_ORIGIN_X + PERCENT_NODE_WIDTH - RECT_TEXT_PADDING))
|
||||
|
||||
|
||||
def render_svg(process, flamegraph, f, color_scheme):
|
||||
height = (flamegraph.get_max_depth() + 2) * SVG_NODE_HEIGHT
|
||||
f.write("""<div class="flamegraph_block" style="width:100%%; height:%dpx;">
|
||||
""" % height)
|
||||
f.write("""<svg xmlns="http://www.w3.org/2000/svg"
|
||||
xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1"
|
||||
width="100%%" height="100%%" style="border: 1px solid black;"
|
||||
rootid="%d">
|
||||
""" % (flamegraph.children[0].id))
|
||||
f.write("""<defs > <linearGradient id="background_gradiant" y1="0" y2="1" x1="0" x2="0" >
|
||||
<stop stop-color="#eeeeee" offset="5%" /> <stop stop-color="#efefb1" offset="90%" />
|
||||
</linearGradient> </defs>""")
|
||||
f.write("""<rect x="0.0" y="0" width="100%" height="100%" fill="url(#background_gradiant)" />
|
||||
""")
|
||||
render_svg_nodes(process, flamegraph, 0, f, flamegraph.weight(), height, color_scheme)
|
||||
render_search_node(f)
|
||||
render_unzoom_node(f)
|
||||
render_info_node(f)
|
||||
render_percent_node(f)
|
||||
f.write("</svg></div><br/>\n\n")
|
||||
137
Android/android-ndk-r27d/simpleperf/ipc.py
Normal file
@ -0,0 +1,137 @@
|
||||
#!/usr/bin/env python3
|
||||
#
|
||||
# Copyright (C) 2023 The Android Open Source Project
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
"""ipc.py: Capture the Instructions per Cycle (IPC) of the system during a
|
||||
specified duration.
|
||||
|
||||
Example:
|
||||
./ipc.py
|
||||
./ipc.py 2 20 # Set interval to 2 secs and total duration to 20 secs
|
||||
./ipc.py -p 284 -C 4 # Only profile the PID 284 while running on core 4
|
||||
./ipc.py -c 'sleep 5' # Only profile the command to run
|
||||
|
||||
Result looks like:
|
||||
K_CYCLES K_INSTR IPC
|
||||
36840 14138 0.38
|
||||
70701 27743 0.39
|
||||
104562 41350 0.40
|
||||
138264 54916 0.40
|
||||
"""
|
||||
|
||||
import io
|
||||
import logging
|
||||
import subprocess
|
||||
import sys
|
||||
import time
|
||||
|
||||
from simpleperf_utils import (
|
||||
AdbHelper, BaseArgumentParser, get_target_binary_path, log_exit)
|
||||
|
||||
def start_profiling(adb, args, target_args):
|
||||
"""Start simpleperf process on device."""
|
||||
shell_args = ['simpleperf', 'stat', '-e', 'cpu-cycles',
|
||||
'-e', 'instructions', '--interval', str(args.interval * 1000),
|
||||
'--duration', str(args.duration)]
|
||||
shell_args += target_args
|
||||
adb_args = [adb.adb_path, 'shell'] + shell_args
|
||||
logging.info('run adb cmd: %s' % adb_args)
|
||||
return subprocess.Popen(adb_args, stdout=subprocess.PIPE)
|
||||
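# With the defaults defined below (interval=1s, duration=10s, system-wide), the command
# built by start_profiling() is roughly "adb shell simpleperf stat -e cpu-cycles
# -e instructions --interval 1000 --duration 10 -a". A minimal sketch of that list:
def _example_stat_cmd(adb_path='adb', interval=1, duration=10):
    return [adb_path, 'shell', 'simpleperf', 'stat', '-e', 'cpu-cycles',
            '-e', 'instructions', '--interval', str(interval * 1000),
            '--duration', str(duration), '-a']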
|
||||
def capture_stats(adb, args, stat_subproc):
|
||||
"""Capture IPC profiling stats or stop profiling when user presses Ctrl-C."""
|
||||
try:
|
||||
print("%-10s %-10s %5s" % ("K_CYCLES", "K_INSTR", "IPC"))
|
||||
cpu_cycles = 0
|
||||
for line in io.TextIOWrapper(stat_subproc.stdout, encoding="utf-8"):
|
||||
if 'cpu-cycles' in line:
|
||||
if args.cpu is None:
|
||||
cpu_cycles = int(line.split()[0].replace(",", ""))
|
||||
continue
|
||||
columns = line.split()
|
||||
if args.cpu == int(columns[0]):
|
||||
cpu_cycles = int(columns[1].replace(",", ""))
|
||||
elif 'instructions' in line:
|
||||
if cpu_cycles == 0: cpu_cycles = 1 # PMCs are broken, or no events
|
||||
ins = -1
|
||||
columns = line.split()
|
||||
if args.cpu is None:
|
||||
ins = int(columns[0].replace(",", ""))
|
||||
elif args.cpu == int(columns[0]):
|
||||
ins = int(columns[1].replace(",", ""))
|
||||
if ins >= 0:
|
||||
print("%-10d %-10d %5.2f" %
|
||||
(cpu_cycles / 1000, ins / 1000, ins / cpu_cycles))
|
||||
|
||||
except KeyboardInterrupt:
|
||||
stop_profiling(adb)
|
||||
stat_subproc = None
|
||||
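# A minimal sketch of the arithmetic behind each row printed by capture_stats():
# kilo-cycles, kilo-instructions and instructions per cycle. The counter values are
# made up but match the sample output in the module docstring.
def _example_ipc_row(cpu_cycles=36840000, instructions=14138000):
    return "%-10d %-10d %5.2f" % (cpu_cycles / 1000, instructions / 1000,
                                  instructions / cpu_cycles)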
|
||||
def stop_profiling(adb):
|
||||
"""Stop profiling by sending SIGINT to simpleperf and wait until it exits."""
|
||||
has_killed = False
|
||||
while True:
|
||||
(result, _) = adb.run_and_return_output(['shell', 'pidof', 'simpleperf'])
|
||||
if not result:
|
||||
break
|
||||
if not has_killed:
|
||||
has_killed = True
|
||||
adb.run_and_return_output(['shell', 'pkill', '-l', '2', 'simpleperf'])
|
||||
time.sleep(1)
|
||||
|
||||
def capture_ipc(args):
|
||||
# Initialize adb and verify device
|
||||
adb = AdbHelper(enable_switch_to_root=True)
|
||||
if not adb.is_device_available():
|
||||
log_exit('No Android device is connected via ADB.')
|
||||
is_root_device = adb.switch_to_root()
|
||||
device_arch = adb.get_device_arch()
|
||||
|
||||
if args.pid:
|
||||
(result, _) = adb.run_and_return_output(['shell', 'ls', '/proc/%s' % args.pid])
|
||||
if not result:
|
||||
log_exit("Pid '%s' does not exist" % args.pid)
|
||||
|
||||
target_args = []
|
||||
if args.cpu is not None:
|
||||
target_args += ['--per-core']
|
||||
if args.pid:
|
||||
target_args += ['-p', args.pid]
|
||||
elif args.command:
|
||||
target_args += [args.command]
|
||||
else:
|
||||
target_args += ['-a']
|
||||
|
||||
stat_subproc = start_profiling(adb, args, target_args)
|
||||
capture_stats(adb, args, stat_subproc)
|
||||
|
||||
def main():
|
||||
parser = BaseArgumentParser(description=__doc__)
|
||||
parser.add_argument('-C', '--cpu', type=int, help='Capture IPC only for this CPU core')
|
||||
process_group = parser.add_mutually_exclusive_group()
|
||||
process_group.add_argument('-p', '--pid', help='Capture IPC only for this PID')
|
||||
process_group.add_argument('-c', '--command', help='Capture IPC only for this command')
|
||||
parser.add_argument('interval', nargs='?', default=1, type=int, help='sampling interval in seconds')
|
||||
parser.add_argument('duration', nargs='?', default=10, type=int, help='sampling duration in seconds')
|
||||
|
||||
args = parser.parse_args()
|
||||
if args.interval > args.duration:
|
||||
log_exit("interval cannot be greater than duration")
|
||||
|
||||
capture_ipc(args)
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
|
||||
659
Android/android-ndk-r27d/simpleperf/pprof_proto_generator.py
Normal file
@ -0,0 +1,659 @@
|
||||
#!/usr/bin/env python3
|
||||
#
|
||||
# Copyright (C) 2017 The Android Open Source Project
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
"""pprof_proto_generator.py: read perf.data, generate pprof.profile, which can be
|
||||
used by pprof.
|
||||
|
||||
Example:
|
||||
./app_profiler.py
|
||||
./pprof_proto_generator.py
|
||||
pprof -text pprof.profile
|
||||
"""
|
||||
|
||||
import logging
|
||||
import os
|
||||
import os.path
|
||||
import re
|
||||
import sys
|
||||
|
||||
from simpleperf_report_lib import GetReportLib
|
||||
from simpleperf_utils import (Addr2Nearestline, BaseArgumentParser, BinaryFinder, extant_dir,
|
||||
flatten_arg_list, log_exit, ReadElf, ToolFinder)
|
||||
try:
|
||||
import profile_pb2
|
||||
except ImportError as e:
|
||||
log_exit(f'{e}\nprotobuf package is missing or too old. Please install it like ' +
|
||||
'`pip install protobuf==4.21`.')
|
||||
|
||||
|
||||
# Some units of common event names
|
||||
EVENT_UNITS = {
|
||||
'cpu-clock': 'nanoseconds',
|
||||
'cpu-cycles': 'cpu-cycles',
|
||||
'instructions': 'instructions',
|
||||
'task-clock': 'nanoseconds',
|
||||
}
|
||||
|
||||
|
||||
def load_pprof_profile(filename):
|
||||
profile = profile_pb2.Profile()
|
||||
with open(filename, "rb") as f:
|
||||
profile.ParseFromString(f.read())
|
||||
return profile
|
||||
|
||||
|
||||
def store_pprof_profile(filename, profile):
|
||||
with open(filename, 'wb') as f:
|
||||
f.write(profile.SerializeToString())
|
||||
|
||||
|
||||
class PprofProfilePrinter(object):
|
||||
|
||||
def __init__(self, profile):
|
||||
self.profile = profile
|
||||
self.string_table = profile.string_table
|
||||
|
||||
def show(self):
|
||||
p = self.profile
|
||||
sub_space = ' '
|
||||
print('Profile {')
|
||||
print('%d sample_types' % len(p.sample_type))
|
||||
for i in range(len(p.sample_type)):
|
||||
print('sample_type[%d] = ' % i, end='')
|
||||
self.show_value_type(p.sample_type[i])
|
||||
print('%d samples' % len(p.sample))
|
||||
for i in range(len(p.sample)):
|
||||
print('sample[%d]:' % i)
|
||||
self.show_sample(p.sample[i], sub_space)
|
||||
print('%d mappings' % len(p.mapping))
|
||||
for i in range(len(p.mapping)):
|
||||
print('mapping[%d]:' % i)
|
||||
self.show_mapping(p.mapping[i], sub_space)
|
||||
print('%d locations' % len(p.location))
|
||||
for i in range(len(p.location)):
|
||||
print('location[%d]:' % i)
|
||||
self.show_location(p.location[i], sub_space)
|
||||
for i in range(len(p.function)):
|
||||
print('function[%d]:' % i)
|
||||
self.show_function(p.function[i], sub_space)
|
||||
print('%d strings' % len(p.string_table))
|
||||
for i in range(len(p.string_table)):
|
||||
print('string[%d]: %s' % (i, p.string_table[i]))
|
||||
print('drop_frames: %s' % self.string(p.drop_frames))
|
||||
print('keep_frames: %s' % self.string(p.keep_frames))
|
||||
print('time_nanos: %u' % p.time_nanos)
|
||||
print('duration_nanos: %u' % p.duration_nanos)
|
||||
print('period_type: ', end='')
|
||||
self.show_value_type(p.period_type)
|
||||
print('period: %u' % p.period)
|
||||
for i in range(len(p.comment)):
|
||||
print('comment[%d] = %s' % (i, self.string(p.comment[i])))
|
||||
print('default_sample_type: %d' % p.default_sample_type)
|
||||
print('} // Profile')
|
||||
print()
|
||||
|
||||
def show_value_type(self, value_type, space=''):
|
||||
print('%sValueType(typeID=%d, unitID=%d, type=%s, unit=%s)' %
|
||||
(space, value_type.type, value_type.unit,
|
||||
self.string(value_type.type), self.string(value_type.unit)))
|
||||
|
||||
def show_sample(self, sample, space=''):
|
||||
sub_space = space + ' '
|
||||
for i in range(len(sample.location_id)):
|
||||
print('%slocation_id[%d]: id %d' % (space, i, sample.location_id[i]))
|
||||
self.show_location_id(sample.location_id[i], sub_space)
|
||||
for i in range(len(sample.value)):
|
||||
print('%svalue[%d] = %d' % (space, i, sample.value[i]))
|
||||
for i in range(len(sample.label)):
|
||||
print('%slabel[%d] = %s:%s' % (space, i, self.string(sample.label[i].key),
|
||||
self.string(sample.label[i].str)))
|
||||
|
||||
def show_location_id(self, location_id, space=''):
|
||||
location = self.profile.location[location_id - 1]
|
||||
self.show_location(location, space)
|
||||
|
||||
def show_location(self, location, space=''):
|
||||
sub_space = space + ' '
|
||||
print('%sid: %d' % (space, location.id))
|
||||
print('%smapping_id: %d' % (space, location.mapping_id))
|
||||
self.show_mapping_id(location.mapping_id, sub_space)
|
||||
print('%saddress: %x' % (space, location.address))
|
||||
for i in range(len(location.line)):
|
||||
print('%sline[%d]:' % (space, i))
|
||||
self.show_line(location.line[i], sub_space)
|
||||
|
||||
def show_mapping_id(self, mapping_id, space=''):
|
||||
mapping = self.profile.mapping[mapping_id - 1]
|
||||
self.show_mapping(mapping, space)
|
||||
|
||||
def show_mapping(self, mapping, space=''):
|
||||
print('%sid: %d' % (space, mapping.id))
|
||||
print('%smemory_start: %x' % (space, mapping.memory_start))
|
||||
print('%smemory_limit: %x' % (space, mapping.memory_limit))
|
||||
print('%sfile_offset: %x' % (space, mapping.file_offset))
|
||||
print('%sfilename: %s(%d)' % (space, self.string(mapping.filename),
|
||||
mapping.filename))
|
||||
print('%sbuild_id: %s(%d)' % (space, self.string(mapping.build_id),
|
||||
mapping.build_id))
|
||||
print('%shas_functions: %s' % (space, mapping.has_functions))
|
||||
print('%shas_filenames: %s' % (space, mapping.has_filenames))
|
||||
print('%shas_line_numbers: %s' % (space, mapping.has_line_numbers))
|
||||
print('%shas_inline_frames: %s' % (space, mapping.has_inline_frames))
|
||||
|
||||
def show_line(self, line, space=''):
|
||||
sub_space = space + ' '
|
||||
print('%sfunction_id: %d' % (space, line.function_id))
|
||||
self.show_function_id(line.function_id, sub_space)
|
||||
print('%sline: %d' % (space, line.line))
|
||||
|
||||
def show_function_id(self, function_id, space=''):
|
||||
function = self.profile.function[function_id - 1]
|
||||
self.show_function(function, space)
|
||||
|
||||
def show_function(self, function, space=''):
|
||||
print('%sid: %d' % (space, function.id))
|
||||
print('%sname: %s' % (space, self.string(function.name)))
|
||||
print('%ssystem_name: %s' % (space, self.string(function.system_name)))
|
||||
print('%sfilename: %s' % (space, self.string(function.filename)))
|
||||
print('%sstart_line: %d' % (space, function.start_line))
|
||||
|
||||
def string(self, string_id):
|
||||
return self.string_table[string_id]
|
||||
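# A minimal sketch tying together the helpers above: load a pprof profile from disk,
# dump it as text with PprofProfilePrinter, then write it back out. The default file
# name "pprof.profile" matches the one used in the module docstring.
def _example_show_profile(path='pprof.profile'):
    profile = load_pprof_profile(path)
    PprofProfilePrinter(profile).show()
    store_pprof_profile(path, profile)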
|
||||
|
||||
class Label(object):
|
||||
def __init__(self, key_id: int, str_id: int):
|
||||
# See profile.Label.key
|
||||
self.key_id = key_id
|
||||
# See profile.Label.str
|
||||
self.str_id = str_id
|
||||
|
||||
|
||||
class Sample(object):
|
||||
|
||||
def __init__(self):
|
||||
self.location_ids = []
|
||||
self.values = {}
|
||||
self.labels = []
|
||||
|
||||
def add_location_id(self, location_id):
|
||||
self.location_ids.append(location_id)
|
||||
|
||||
def add_value(self, sample_type_id, value):
|
||||
self.values[sample_type_id] = self.values.get(sample_type_id, 0) + value
|
||||
|
||||
def add_values(self, values):
|
||||
for sample_type_id, value in values.items():
|
||||
self.add_value(sample_type_id, value)
|
||||
|
||||
@property
|
||||
def key(self):
|
||||
return tuple(self.location_ids)
|
||||
|
||||
|
||||
class Location(object):
|
||||
|
||||
def __init__(self, mapping_id, address, vaddr_in_dso):
|
||||
self.id = -1 # unset
|
||||
self.mapping_id = mapping_id
|
||||
self.address = address
|
||||
self.vaddr_in_dso = vaddr_in_dso
|
||||
self.lines = []
|
||||
|
||||
@property
|
||||
def key(self):
|
||||
return (self.mapping_id, self.address)
|
||||
|
||||
|
||||
class Line(object):
|
||||
|
||||
def __init__(self):
|
||||
self.function_id = 0
|
||||
self.line = 0
|
||||
|
||||
|
||||
class Mapping(object):
|
||||
|
||||
def __init__(self, start, end, pgoff, filename_id, build_id_id):
|
||||
self.id = -1 # unset
|
||||
self.memory_start = start
|
||||
self.memory_limit = end
|
||||
self.file_offset = pgoff
|
||||
self.filename_id = filename_id
|
||||
self.build_id_id = build_id_id
|
||||
|
||||
@property
|
||||
def key(self):
|
||||
return (
|
||||
self.memory_start,
|
||||
self.memory_limit,
|
||||
self.file_offset,
|
||||
self.filename_id,
|
||||
self.build_id_id)
|
||||
|
||||
|
||||
class Function(object):
|
||||
|
||||
def __init__(self, name_id, dso_name_id, vaddr_in_dso):
|
||||
self.id = -1 # unset
|
||||
self.name_id = name_id
|
||||
self.dso_name_id = dso_name_id
|
||||
self.vaddr_in_dso = vaddr_in_dso
|
||||
self.source_filename_id = 0
|
||||
self.start_line = 0
|
||||
|
||||
@property
|
||||
def key(self):
|
||||
return (self.name_id, self.dso_name_id)
|
||||
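# A minimal sketch of the interning pattern that PprofProfileGenerator below applies
# to Mapping/Location/Function objects: deduplicate by the .key property and hand out
# ids starting from 1 (0 is kept for "unknown"/unset).
def _example_intern(obj, obj_map, obj_list):
    existing = obj_map.get(obj.key)
    if existing:
        return existing.id
    obj.id = len(obj_list) + 1
    obj_list.append(obj)
    obj_map[obj.key] = obj
    return obj.id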
|
||||
|
||||
# pylint: disable=no-member
|
||||
class PprofProfileGenerator(object):
|
||||
|
||||
def __init__(self, config):
|
||||
self.config = config
|
||||
self.lib = None
|
||||
|
||||
config['binary_cache_dir'] = 'binary_cache'
|
||||
if not os.path.isdir(config['binary_cache_dir']):
|
||||
config['binary_cache_dir'] = None
|
||||
self.dso_filter = set(config['dso_filters']) if config.get('dso_filters') else None
|
||||
self.max_chain_length = config['max_chain_length']
|
||||
self.profile = profile_pb2.Profile()
|
||||
self.profile.string_table.append('')
|
||||
self.string_table = {}
|
||||
self.sample_types = {}
|
||||
self.sample_map = {}
|
||||
self.sample_list = []
|
||||
self.location_map = {}
|
||||
self.location_list = []
|
||||
self.mapping_map = {}
|
||||
self.mapping_list = []
|
||||
self.function_map = {}
|
||||
self.function_list = []
|
||||
|
||||
# Map from dso_name in perf.data to (binary path, build_id).
|
||||
self.binary_map = {}
|
||||
self.read_elf = ReadElf(self.config['ndk_path'])
|
||||
self.binary_finder = BinaryFinder(config['binary_cache_dir'], self.read_elf)
|
||||
|
||||
def load_record_file(self, record_file):
|
||||
self.lib = GetReportLib(record_file)
|
||||
|
||||
if self.config['binary_cache_dir']:
|
||||
self.lib.SetSymfs(self.config['binary_cache_dir'])
|
||||
kallsyms = os.path.join(self.config['binary_cache_dir'], 'kallsyms')
|
||||
if os.path.isfile(kallsyms):
|
||||
self.lib.SetKallsymsFile(kallsyms)
|
||||
|
||||
if self.config.get('show_art_frames'):
|
||||
self.lib.ShowArtFrames()
|
||||
self.lib.SetReportOptions(self.config['report_lib_options'])
|
||||
|
||||
comments = [
|
||||
"Simpleperf Record Command:\n" + self.lib.GetRecordCmd(),
|
||||
"Converted to pprof with:\n" + " ".join(sys.argv),
|
||||
"Architecture:\n" + self.lib.GetArch(),
|
||||
]
|
||||
meta_info = self.lib.MetaInfo()
|
||||
if "app_versioncode" in meta_info:
|
||||
comments.append("App Version Code:\n" + meta_info["app_versioncode"])
|
||||
for comment in comments:
|
||||
self.profile.comment.append(self.get_string_id(comment))
|
||||
if "timestamp" in meta_info:
|
||||
self.profile.time_nanos = int(meta_info["timestamp"]) * 1000 * 1000 * 1000
|
||||
|
||||
numbers_re = re.compile(r"\d+")
|
||||
|
||||
# Process all samples in perf.data, aggregate samples.
|
||||
while True:
|
||||
report_sample = self.lib.GetNextSample()
|
||||
if report_sample is None:
|
||||
self.lib.Close()
|
||||
self.lib = None
|
||||
break
|
||||
event = self.lib.GetEventOfCurrentSample()
|
||||
symbol = self.lib.GetSymbolOfCurrentSample()
|
||||
callchain = self.lib.GetCallChainOfCurrentSample()
|
||||
|
||||
sample_type_id = self.get_sample_type_id(event.name)
|
||||
sample = Sample()
|
||||
sample.add_value(sample_type_id, 1)
|
||||
sample.add_value(sample_type_id + 1, report_sample.period)
|
||||
sample.labels.append(Label(
|
||||
self.get_string_id("thread"),
|
||||
self.get_string_id(report_sample.thread_comm)))
|
||||
# Heuristic: threadpools doing similar work are often named as
|
||||
# name-1, name-2, name-3. Combine threadpools into one label
|
||||
# "name-%d" if they only differ by a number.
|
||||
sample.labels.append(Label(
|
||||
self.get_string_id("threadpool"),
|
||||
self.get_string_id(
|
||||
numbers_re.sub("%d", report_sample.thread_comm))))
|
||||
sample.labels.append(Label(
|
||||
self.get_string_id("pid"),
|
||||
self.get_string_id(str(report_sample.pid))))
|
||||
sample.labels.append(Label(
|
||||
self.get_string_id("tid"),
|
||||
self.get_string_id(str(report_sample.tid))))
|
||||
if self._filter_symbol(symbol):
|
||||
location_id = self.get_location_id(report_sample.ip, symbol)
|
||||
sample.add_location_id(location_id)
|
||||
for i in range(max(0, callchain.nr - self.max_chain_length), callchain.nr):
|
||||
entry = callchain.entries[i]
|
||||
if self._filter_symbol(symbol):
|
||||
location_id = self.get_location_id(entry.ip, entry.symbol)
|
||||
sample.add_location_id(location_id)
|
||||
if sample.location_ids:
|
||||
self.add_sample(sample)
|
||||
|
||||
def gen(self, jobs: int):
|
||||
# 1. Generate line info for locations and functions.
|
||||
self.gen_source_lines(jobs)
|
||||
|
||||
# 2. Produce samples/locations/functions in profile.
|
||||
for sample in self.sample_list:
|
||||
self.gen_profile_sample(sample)
|
||||
for mapping in self.mapping_list:
|
||||
self.gen_profile_mapping(mapping)
|
||||
for location in self.location_list:
|
||||
self.gen_profile_location(location)
|
||||
for function in self.function_list:
|
||||
self.gen_profile_function(function)
|
||||
|
||||
return self.profile
|
||||
|
||||
def _filter_symbol(self, symbol):
|
||||
if not self.dso_filter or symbol.dso_name in self.dso_filter:
|
||||
return True
|
||||
return False
|
||||
|
||||
def get_string_id(self, str_value):
|
||||
if not str_value:
|
||||
return 0
|
||||
str_id = self.string_table.get(str_value)
|
||||
if str_id is not None:
|
||||
return str_id
|
||||
str_id = len(self.string_table) + 1
|
||||
self.string_table[str_value] = str_id
|
||||
self.profile.string_table.append(str_value)
|
||||
return str_id
|
||||
|
||||
def get_string(self, str_id):
|
||||
return self.profile.string_table[str_id]
|
||||
|
||||
def get_sample_type_id(self, name):
|
||||
sample_type_id = self.sample_types.get(name)
|
||||
if sample_type_id is not None:
|
||||
return sample_type_id
|
||||
sample_type_id = len(self.profile.sample_type)
|
||||
sample_type = self.profile.sample_type.add()
|
||||
sample_type.type = self.get_string_id(name + '_samples')
|
||||
sample_type.unit = self.get_string_id('samples')
|
||||
sample_type = self.profile.sample_type.add()
|
||||
sample_type.type = self.get_string_id(name)
|
||||
units = EVENT_UNITS.get(name, 'count')
|
||||
sample_type.unit = self.get_string_id(units)
|
||||
self.sample_types[name] = sample_type_id
|
||||
return sample_type_id
|
||||
|
||||
def get_location_id(self, ip, symbol):
|
||||
binary_path, build_id = self.get_binary(symbol.dso_name)
|
||||
mapping_id = self.get_mapping_id(symbol.mapping[0], binary_path, build_id)
|
||||
location = Location(mapping_id, ip, symbol.vaddr_in_file)
|
||||
function_id = self.get_function_id(symbol.symbol_name, binary_path, symbol.symbol_addr)
|
||||
if function_id:
|
||||
# Add Line only when it has a valid function id, see http://b/36988814.
|
||||
# Default line info only contains the function name
|
||||
line = Line()
|
||||
line.function_id = function_id
|
||||
location.lines.append(line)
|
||||
|
||||
exist_location = self.location_map.get(location.key)
|
||||
if exist_location:
|
||||
return exist_location.id
|
||||
# location_id starts from 1
|
||||
location.id = len(self.location_list) + 1
|
||||
self.location_list.append(location)
|
||||
self.location_map[location.key] = location
|
||||
return location.id
|
||||
|
||||
def get_mapping_id(self, report_mapping, filename, build_id):
|
||||
filename_id = self.get_string_id(filename)
|
||||
build_id_id = self.get_string_id(build_id)
|
||||
mapping = Mapping(report_mapping.start, report_mapping.end,
|
||||
report_mapping.pgoff, filename_id, build_id_id)
|
||||
exist_mapping = self.mapping_map.get(mapping.key)
|
||||
if exist_mapping:
|
||||
return exist_mapping.id
|
||||
# mapping_id starts from 1
|
||||
mapping.id = len(self.mapping_list) + 1
|
||||
self.mapping_list.append(mapping)
|
||||
self.mapping_map[mapping.key] = mapping
|
||||
return mapping.id
|
||||
|
||||
def get_binary(self, dso_name):
|
||||
""" Return (binary_path, build_id) for a given dso_name. """
|
||||
value = self.binary_map.get(dso_name)
|
||||
if value:
|
||||
return value
|
||||
|
||||
binary_path = dso_name
|
||||
build_id = self.lib.GetBuildIdForPath(dso_name)
|
||||
# Try elf_path in binary cache.
|
||||
elf_path = self.binary_finder.find_binary(dso_name, build_id)
|
||||
if elf_path:
|
||||
binary_path = str(elf_path)
|
||||
|
||||
# The build ids in perf.data are padded to 20 bytes, but pprof needs them without padding.
|
||||
build_id = ReadElf.unpad_build_id(build_id)
|
||||
self.binary_map[dso_name] = (binary_path, build_id)
|
||||
return (binary_path, build_id)
|
||||
|
||||
def get_mapping(self, mapping_id):
|
||||
return self.mapping_list[mapping_id - 1] if mapping_id > 0 else None
|
||||
|
||||
def get_function_id(self, name, dso_name, vaddr_in_file):
|
||||
if name == 'unknown':
|
||||
return 0
|
||||
function = Function(self.get_string_id(name), self.get_string_id(dso_name), vaddr_in_file)
|
||||
exist_function = self.function_map.get(function.key)
|
||||
if exist_function:
|
||||
return exist_function.id
|
||||
# function_id starts from 1
|
||||
function.id = len(self.function_list) + 1
|
||||
self.function_list.append(function)
|
||||
self.function_map[function.key] = function
|
||||
return function.id
|
||||
|
||||
def get_function(self, function_id):
|
||||
return self.function_list[function_id - 1] if function_id > 0 else None
|
||||
|
||||
def add_sample(self, sample):
|
||||
exist_sample = self.sample_map.get(sample.key)
|
||||
if exist_sample:
|
||||
exist_sample.add_values(sample.values)
|
||||
else:
|
||||
self.sample_list.append(sample)
|
||||
self.sample_map[sample.key] = sample
|
||||
|
||||
def gen_source_lines(self, jobs: int):
|
||||
# 1. Create Addr2line instance
|
||||
if not self.config.get('binary_cache_dir'):
|
||||
logging.info("Can't generate line information because binary_cache is missing.")
|
||||
return
|
||||
if not ToolFinder.find_tool_path('llvm-symbolizer', self.config['ndk_path']):
|
||||
logging.info("Can't generate line information because can't find llvm-symbolizer.")
|
||||
return
|
||||
# We have changed dso names to paths in binary_cache in self.get_binary(). So no need to
|
||||
# pass binary_cache_dir to BinaryFinder.
|
||||
binary_finder = BinaryFinder(None, self.read_elf)
|
||||
addr2line = Addr2Nearestline(self.config['ndk_path'], binary_finder, True)
|
||||
|
||||
# 2. Put all needed addresses to it.
|
||||
for location in self.location_list:
|
||||
mapping = self.get_mapping(location.mapping_id)
|
||||
dso_name = self.get_string(mapping.filename_id)
|
||||
if location.lines:
|
||||
function = self.get_function(location.lines[0].function_id)
|
||||
addr2line.add_addr(dso_name, None, function.vaddr_in_dso, location.vaddr_in_dso)
|
||||
for function in self.function_list:
|
||||
dso_name = self.get_string(function.dso_name_id)
|
||||
addr2line.add_addr(dso_name, None, function.vaddr_in_dso, function.vaddr_in_dso)
|
||||
|
||||
# 3. Generate source lines.
|
||||
addr2line.convert_addrs_to_lines(jobs)
|
||||
|
||||
# 4. Annotate locations and functions.
|
||||
for location in self.location_list:
|
||||
if not location.lines:
|
||||
continue
|
||||
mapping = self.get_mapping(location.mapping_id)
|
||||
dso_name = self.get_string(mapping.filename_id)
|
||||
dso = addr2line.get_dso(dso_name)
|
||||
if not dso:
|
||||
continue
|
||||
sources = addr2line.get_addr_source(dso, location.vaddr_in_dso)
|
||||
if not sources:
|
||||
continue
|
||||
for i, source in enumerate(sources):
|
||||
source_file, source_line, function_name = source
|
||||
if i == 0:
|
||||
# Don't override original function name from report library, which is more
|
||||
# accurate when proguard mapping file is given.
|
||||
function_id = location.lines[0].function_id
|
||||
# Clear default line info.
|
||||
location.lines.clear()
|
||||
else:
|
||||
function_id = self.get_function_id(function_name, dso_name, 0)
|
||||
if function_id == 0:
|
||||
continue
|
||||
location.lines.append(self.add_line(source_file, source_line, function_id))
|
||||
|
||||
for function in self.function_list:
|
||||
dso_name = self.get_string(function.dso_name_id)
|
||||
if function.vaddr_in_dso:
|
||||
dso = addr2line.get_dso(dso_name)
|
||||
if not dso:
|
||||
continue
|
||||
sources = addr2line.get_addr_source(dso, function.vaddr_in_dso)
|
||||
if sources:
|
||||
source_file, source_line, _ = sources[0]
|
||||
function.source_filename_id = self.get_string_id(source_file)
|
||||
function.start_line = source_line
|
||||
|
||||
def add_line(self, source_file, source_line, function_id):
|
||||
line = Line()
|
||||
function = self.get_function(function_id)
|
||||
function.source_filename_id = self.get_string_id(source_file)
|
||||
line.function_id = function_id
|
||||
line.line = source_line
|
||||
return line
|
||||
|
||||
def gen_profile_sample(self, sample):
|
||||
profile_sample = self.profile.sample.add()
|
||||
profile_sample.location_id.extend(sample.location_ids)
|
||||
sample_type_count = len(self.sample_types) * 2
|
||||
values = [0] * sample_type_count
|
||||
for sample_type_id in sample.values:
|
||||
values[sample_type_id] = sample.values[sample_type_id]
|
||||
profile_sample.value.extend(values)
|
||||
|
||||
for l in sample.labels:
|
||||
label = profile_sample.label.add()
|
||||
label.key = l.key_id
|
||||
label.str = l.str_id
|
||||
|
||||
def gen_profile_mapping(self, mapping):
|
||||
profile_mapping = self.profile.mapping.add()
|
||||
profile_mapping.id = mapping.id
|
||||
profile_mapping.memory_start = mapping.memory_start
|
||||
profile_mapping.memory_limit = mapping.memory_limit
|
||||
profile_mapping.file_offset = mapping.file_offset
|
||||
profile_mapping.filename = mapping.filename_id
|
||||
profile_mapping.build_id = mapping.build_id_id
|
||||
profile_mapping.has_filenames = True
|
||||
profile_mapping.has_functions = True
|
||||
if self.config.get('binary_cache_dir'):
|
||||
profile_mapping.has_line_numbers = True
|
||||
profile_mapping.has_inline_frames = True
|
||||
else:
|
||||
profile_mapping.has_line_numbers = False
|
||||
profile_mapping.has_inline_frames = False
|
||||
|
||||
def gen_profile_location(self, location):
|
||||
profile_location = self.profile.location.add()
|
||||
profile_location.id = location.id
|
||||
profile_location.mapping_id = location.mapping_id
|
||||
profile_location.address = location.address
|
||||
for i in range(len(location.lines)):
|
||||
line = profile_location.line.add()
|
||||
line.function_id = location.lines[i].function_id
|
||||
line.line = location.lines[i].line
|
||||
|
||||
def gen_profile_function(self, function):
|
||||
profile_function = self.profile.function.add()
|
||||
profile_function.id = function.id
|
||||
profile_function.name = function.name_id
|
||||
profile_function.system_name = function.name_id
|
||||
profile_function.filename = function.source_filename_id
|
||||
profile_function.start_line = function.start_line
|
||||
|
||||
|
||||
def main():
|
||||
parser = BaseArgumentParser(description='Generate pprof profile data in pprof.profile.')
|
||||
parser.add_argument('--show', nargs='?', action='append', help='print existing pprof.profile.')
|
||||
parser.add_argument('-i', '--record_file', nargs='+', default=['perf.data'], help="""
|
||||
Set profiling data file to report. Default is perf.data""")
|
||||
parser.add_argument('-o', '--output_file', default='pprof.profile', help="""
|
||||
The path of generated pprof profile data.""")
|
||||
parser.add_argument('--max_chain_length', type=int, default=1000000000, help="""
|
||||
Maximum depth of samples to be converted.""") # Large value as an infinity stand-in.
|
||||
parser.add_argument('--ndk_path', type=extant_dir, help='Set the path of a ndk release.')
|
||||
parser.add_argument(
|
||||
'-j', '--jobs', type=int, default=os.cpu_count(),
|
||||
help='Use multithreading to speed up source code annotation.')
|
||||
sample_filter_group = parser.add_argument_group('Sample filter options')
|
||||
sample_filter_group.add_argument('--dso', nargs='+', action='append', help="""
|
||||
Use samples only in selected binaries.""")
|
||||
parser.add_report_lib_options(sample_filter_group=sample_filter_group)
|
||||
|
||||
args = parser.parse_args()
|
||||
if args.show:
|
||||
show_file = args.show[0] if args.show[0] else 'pprof.profile'
|
||||
profile = load_pprof_profile(show_file)
|
||||
printer = PprofProfilePrinter(profile)
|
||||
printer.show()
|
||||
return
|
||||
|
||||
config = {}
|
||||
config['output_file'] = args.output_file
|
||||
config['dso_filters'] = flatten_arg_list(args.dso)
|
||||
config['ndk_path'] = args.ndk_path
|
||||
config['max_chain_length'] = args.max_chain_length
|
||||
config['report_lib_options'] = args.report_lib_options
|
||||
generator = PprofProfileGenerator(config)
|
||||
for record_file in args.record_file:
|
||||
generator.load_record_file(record_file)
|
||||
profile = generator.gen(args.jobs)
|
||||
store_pprof_profile(config['output_file'], profile)
|
||||
logging.info("Report is generated at '%s' successfully." % config['output_file'])
|
||||
logging.info('Before uploading to the continuous PProf UI, use gzip to compress the file.')
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
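
The script above stores a perftools.profiles.Profile message in pprof.profile. As a quick sanity check, the profile_pb2 module listed next in this dump can load that file and print its sample types and counts. This is only a sketch: dump_profile is an illustrative helper name, and it assumes the file is uncompressed, as suggested by the hint to gzip it before uploading.

    # Minimal sketch: inspect a generated pprof.profile with the bundled profile_pb2
    # module (listed below). Assumes the file was written uncompressed by
    # pprof_proto_generator.py, i.e. before it is gzipped for upload.
    import profile_pb2

    def dump_profile(path='pprof.profile'):
        profile = profile_pb2.Profile()
        with open(path, 'rb') as f:
            profile.ParseFromString(f.read())
        strings = profile.string_table
        # Each sample_type entry indexes into the string table for its type/unit names.
        for value_type in profile.sample_type:
            print('sample type:', strings[value_type.type], strings[value_type.unit])
        print('samples:', len(profile.sample),
              'locations:', len(profile.location),
              'functions:', len(profile.function))
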
|
||||
40
Android/android-ndk-r27d/simpleperf/profile_pb2.py
Normal file
@ -0,0 +1,40 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
# Generated by the protocol buffer compiler. DO NOT EDIT!
|
||||
# source: profile.proto
|
||||
"""Generated protocol buffer code."""
|
||||
from google.protobuf.internal import builder as _builder
|
||||
from google.protobuf import descriptor as _descriptor
|
||||
from google.protobuf import descriptor_pool as _descriptor_pool
|
||||
from google.protobuf import symbol_database as _symbol_database
|
||||
# @@protoc_insertion_point(imports)
|
||||
|
||||
_sym_db = _symbol_database.Default()
|
||||
|
||||
|
||||
|
||||
|
||||
DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\rprofile.proto\x12\x12perftools.profiles\"\xd5\x03\n\x07Profile\x12\x32\n\x0bsample_type\x18\x01 \x03(\x0b\x32\x1d.perftools.profiles.ValueType\x12*\n\x06sample\x18\x02 \x03(\x0b\x32\x1a.perftools.profiles.Sample\x12,\n\x07mapping\x18\x03 \x03(\x0b\x32\x1b.perftools.profiles.Mapping\x12.\n\x08location\x18\x04 \x03(\x0b\x32\x1c.perftools.profiles.Location\x12.\n\x08\x66unction\x18\x05 \x03(\x0b\x32\x1c.perftools.profiles.Function\x12\x14\n\x0cstring_table\x18\x06 \x03(\t\x12\x13\n\x0b\x64rop_frames\x18\x07 \x01(\x03\x12\x13\n\x0bkeep_frames\x18\x08 \x01(\x03\x12\x12\n\ntime_nanos\x18\t \x01(\x03\x12\x16\n\x0e\x64uration_nanos\x18\n \x01(\x03\x12\x32\n\x0bperiod_type\x18\x0b \x01(\x0b\x32\x1d.perftools.profiles.ValueType\x12\x0e\n\x06period\x18\x0c \x01(\x03\x12\x0f\n\x07\x63omment\x18\r \x03(\x03\x12\x1b\n\x13\x64\x65\x66\x61ult_sample_type\x18\x0e \x01(\x03\"\'\n\tValueType\x12\x0c\n\x04type\x18\x01 \x01(\x03\x12\x0c\n\x04unit\x18\x02 \x01(\x03\"V\n\x06Sample\x12\x13\n\x0blocation_id\x18\x01 \x03(\x04\x12\r\n\x05value\x18\x02 \x03(\x03\x12(\n\x05label\x18\x03 \x03(\x0b\x32\x19.perftools.profiles.Label\"@\n\x05Label\x12\x0b\n\x03key\x18\x01 \x01(\x03\x12\x0b\n\x03str\x18\x02 \x01(\x03\x12\x0b\n\x03num\x18\x03 \x01(\x03\x12\x10\n\x08num_unit\x18\x04 \x01(\x03\"\xdd\x01\n\x07Mapping\x12\n\n\x02id\x18\x01 \x01(\x04\x12\x14\n\x0cmemory_start\x18\x02 \x01(\x04\x12\x14\n\x0cmemory_limit\x18\x03 \x01(\x04\x12\x13\n\x0b\x66ile_offset\x18\x04 \x01(\x04\x12\x10\n\x08\x66ilename\x18\x05 \x01(\x03\x12\x10\n\x08\x62uild_id\x18\x06 \x01(\x03\x12\x15\n\rhas_functions\x18\x07 \x01(\x08\x12\x15\n\rhas_filenames\x18\x08 \x01(\x08\x12\x18\n\x10has_line_numbers\x18\t \x01(\x08\x12\x19\n\x11has_inline_frames\x18\n \x01(\x08\"v\n\x08Location\x12\n\n\x02id\x18\x01 \x01(\x04\x12\x12\n\nmapping_id\x18\x02 \x01(\x04\x12\x0f\n\x07\x61\x64\x64ress\x18\x03 \x01(\x04\x12&\n\x04line\x18\x04 \x03(\x0b\x32\x18.perftools.profiles.Line\x12\x11\n\tis_folded\x18\x05 \x01(\x08\")\n\x04Line\x12\x13\n\x0b\x66unction_id\x18\x01 \x01(\x04\x12\x0c\n\x04line\x18\x02 \x01(\x03\"_\n\x08\x46unction\x12\n\n\x02id\x18\x01 \x01(\x04\x12\x0c\n\x04name\x18\x02 \x01(\x03\x12\x13\n\x0bsystem_name\x18\x03 \x01(\x03\x12\x10\n\x08\x66ilename\x18\x04 \x01(\x03\x12\x12\n\nstart_line\x18\x05 \x01(\x03\x42-\n\x1d\x63om.google.perftools.profilesB\x0cProfileProtob\x06proto3')
|
||||
|
||||
_builder.BuildMessageAndEnumDescriptors(DESCRIPTOR, globals())
|
||||
_builder.BuildTopDescriptorsAndMessages(DESCRIPTOR, 'profile_pb2', globals())
|
||||
if _descriptor._USE_C_DESCRIPTORS == False:
|
||||
|
||||
DESCRIPTOR._options = None
|
||||
DESCRIPTOR._serialized_options = b'\n\035com.google.perftools.profilesB\014ProfileProto'
|
||||
_PROFILE._serialized_start=38
|
||||
_PROFILE._serialized_end=507
|
||||
_VALUETYPE._serialized_start=509
|
||||
_VALUETYPE._serialized_end=548
|
||||
_SAMPLE._serialized_start=550
|
||||
_SAMPLE._serialized_end=636
|
||||
_LABEL._serialized_start=638
|
||||
_LABEL._serialized_end=702
|
||||
_MAPPING._serialized_start=705
|
||||
_MAPPING._serialized_end=926
|
||||
_LOCATION._serialized_start=928
|
||||
_LOCATION._serialized_end=1046
|
||||
_LINE._serialized_start=1048
|
||||
_LINE._serialized_end=1089
|
||||
_FUNCTION._serialized_start=1091
|
||||
_FUNCTION._serialized_end=1186
|
||||
# @@protoc_insertion_point(module_scope)
|
||||
96
Android/android-ndk-r27d/simpleperf/proto/branch_list.proto
Normal file
@ -0,0 +1,96 @@
|
||||
/*
|
||||
* Copyright (C) 2020 The Android Open Source Project
|
||||
*
|
||||
* Licensed under the Apache License, Version 2.0 (the "License");
|
||||
* you may not use this file except in compliance with the License.
|
||||
* You may obtain a copy of the License at
|
||||
*
|
||||
* http://www.apache.org/licenses/LICENSE-2.0
|
||||
*
|
||||
* Unless required by applicable law or agreed to in writing, software
|
||||
* distributed under the License is distributed on an "AS IS" BASIS,
|
||||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
* See the License for the specific language governing permissions and
|
||||
* limitations under the License.
|
||||
*/
|
||||
|
||||
// The branch list file format is generated by the inject command. It contains
|
||||
// a single BranchList message.
|
||||
|
||||
syntax = "proto3";
|
||||
|
||||
package simpleperf.proto;
|
||||
|
||||
message BranchList {
|
||||
// Used to identify format in generated proto files.
|
||||
// Should always be "simpleperf:EtmBranchList".
|
||||
string magic = 1;
|
||||
repeated ETMBinary etm_data = 2;
|
||||
LBRData lbr_data = 3;
|
||||
}
|
||||
|
||||
message ETMBinary {
|
||||
string path = 1;
|
||||
string build_id = 2;
|
||||
|
||||
message Address {
|
||||
// vaddr in binary, instr addr before the first branch
|
||||
uint64 addr = 1;
|
||||
|
||||
message Branch {
|
||||
// Each bit represents a branch: 0 for not branch, 1 for branch.
|
||||
// Bit 0 comes first, bit 7 comes last.
|
||||
bytes branch = 1;
|
||||
uint32 branch_size = 2;
|
||||
uint64 count = 3;
|
||||
}
|
||||
|
||||
repeated Branch branches = 2;
|
||||
}
|
||||
|
||||
repeated Address addrs = 3;
|
||||
|
||||
enum BinaryType {
|
||||
ELF_FILE = 0;
|
||||
KERNEL = 1;
|
||||
KERNEL_MODULE = 2;
|
||||
}
|
||||
BinaryType type = 4;
|
||||
|
||||
message KernelBinaryInfo {
|
||||
// kernel_start_addr is used to convert kernel ip address to vaddr in vmlinux.
|
||||
// If it is zero, the Address in KERNEL binary has been converted to vaddr. Otherwise,
|
||||
// the Address in KERNEL binary is still an ip address, and needs to be converted later.
|
||||
uint64 kernel_start_addr = 1;
|
||||
}
|
||||
|
||||
KernelBinaryInfo kernel_info = 5;
|
||||
}
|
||||
|
||||
message LBRData {
|
||||
repeated Sample samples = 1;
|
||||
repeated Binary binaries = 2;
|
||||
|
||||
message Sample {
|
||||
// If binary_id >= 1, it refers to LBRData.binaries[binary_id - 1]. Otherwise, it's invalid.
|
||||
uint32 binary_id = 1;
|
||||
uint64 vaddr_in_file = 2;
|
||||
repeated Branch branches = 3;
|
||||
|
||||
message Branch {
|
||||
// If from_binary_id >= 1, it refers to LBRData.binaries[from_binary_id - 1]. Otherwise, it's
|
||||
// invalid.
|
||||
uint32 from_binary_id = 1;
|
||||
// If to_binary_id >= 1, it refers to LBRData.binaries[to_binary_id - 1]. Otherwise, it's
|
||||
// invalid.
|
||||
uint32 to_binary_id = 2;
|
||||
uint64 from_vaddr_in_file = 3;
|
||||
uint64 to_vaddr_in_file = 4;
|
||||
}
|
||||
}
|
||||
|
||||
message Binary {
|
||||
string path = 1;
|
||||
string build_id = 2;
|
||||
}
|
||||
}
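
The Branch message above packs branch outcomes into a byte array, with bit 0 of each byte coming first and 1 for branch, 0 for not branch; branch_size gives the number of valid bits. A minimal decoding sketch (not part of simpleperf; decode_branch_bits is a hypothetical helper name) is shown here.

    # Minimal sketch: expand an ETMBinary.Address.Branch bitmap into a list of
    # booleans, following the comment in the proto above (bit 0 comes first,
    # bit 7 last within each byte; 1 means branch, 0 means not branch).
    def decode_branch_bits(branch_bytes: bytes, branch_size: int) -> list:
        bits = []
        for i in range(branch_size):
            byte = branch_bytes[i // 8]
            bits.append(bool((byte >> (i % 8)) & 1))
        return bits
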
|
||||
@ -0,0 +1,165 @@
|
||||
/*
|
||||
* Copyright (C) 2022 The Android Open Source Project
|
||||
*
|
||||
* Licensed under the Apache License, Version 2.0 (the "License");
|
||||
* you may not use this file except in compliance with the License.
|
||||
* You may obtain a copy of the License at
|
||||
*
|
||||
* http://www.apache.org/licenses/LICENSE-2.0
|
||||
*
|
||||
* Unless required by applicable law or agreed to in writing, software
|
||||
* distributed under the License is distributed on an "AS IS" BASIS,
|
||||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
* See the License for the specific language governing permissions and
|
||||
* limitations under the License.
|
||||
*/
|
||||
|
||||
// The file format generated by the report-sample command is as below:
|
||||
// char magic[10] = "SIMPLEPERF";
|
||||
// LittleEndian16(version) = 1;
|
||||
// LittleEndian32(record_size_0)
|
||||
// message Record(record_0) (having record_size_0 bytes)
|
||||
// LittleEndian32(record_size_1)
|
||||
// message Record(record_1) (having record_size_1 bytes)
|
||||
// ...
|
||||
// LittleEndian32(record_size_N)
|
||||
// message Record(record_N) (having record_size_N bytes)
|
||||
// LittleEndian32(0)
|
||||
|
||||
syntax = "proto2";
|
||||
option optimize_for = LITE_RUNTIME;
|
||||
package simpleperf_report_proto;
|
||||
option java_package = "com.android.tools.profiler.proto";
|
||||
option java_outer_classname = "SimpleperfReport";
|
||||
|
||||
message Sample {
|
||||
// Monotonic clock time in nanoseconds. On kernel < 4.1, it's perf clock instead.
|
||||
optional uint64 time = 1;
|
||||
optional int32 thread_id = 2;
|
||||
|
||||
message CallChainEntry {
|
||||
// virtual address of the instruction in elf file
|
||||
optional uint64 vaddr_in_file = 1;
|
||||
|
||||
// index of the elf file containing the instruction
|
||||
optional uint32 file_id = 2;
|
||||
|
||||
// symbol_id refers to the name of the function containing the instruction.
|
||||
// If the function name is found, it is a valid index in the symbol table
|
||||
// of File with 'id' field being file_id, otherwise it is -1.
|
||||
optional int32 symbol_id = 3;
|
||||
|
||||
enum ExecutionType {
|
||||
// methods belonging to native libraries, AOT compiled JVM code and ART methods not used near
|
||||
// JVM methods
|
||||
NATIVE_METHOD = 0;
|
||||
INTERPRETED_JVM_METHOD = 1;
|
||||
JIT_JVM_METHOD = 2;
|
||||
// ART methods used near JVM methods. It's shown only when --show-art-frames is used.
|
||||
ART_METHOD = 3;
|
||||
}
|
||||
optional ExecutionType execution_type = 4 [default = NATIVE_METHOD];
|
||||
}
|
||||
|
||||
repeated CallChainEntry callchain = 3;
|
||||
|
||||
// Simpleperf generates one sample whenever a specified amount of events happen
|
||||
// while running a monitored thread. So each sample belongs to one event type.
|
||||
// Event type can be cpu-cycles, cpu-clock, sched:sched_switch or other types.
|
||||
// By using '-e' option, we can ask simpleperf to record samples for one or more
|
||||
// event types.
|
||||
// Each event type generates samples independently. But recording more event types
|
||||
// will cost more cpu time generating samples, which may affect the monitored threads
|
||||
// and sample lost rate.
|
||||
// event_count field shows the count of the events (belong to the sample's event type)
|
||||
// that have happened since last sample (belong to the sample's event type) for the
|
||||
// same thread. However, if there are lost samples between current sample and previous
|
||||
// sample, the event_count is the count of events from the last lost sample.
|
||||
optional uint64 event_count = 4;
|
||||
|
||||
// An index in meta_info.event_type, shows which event type current sample belongs to.
|
||||
optional uint32 event_type_id = 5;
|
||||
|
||||
message UnwindingResult {
|
||||
// error code provided by libunwindstack, in
|
||||
// https://cs.android.com/android/platform/superproject/+/master:system/unwinding/libunwindstack/include/unwindstack/Error.h
|
||||
optional uint32 raw_error_code = 1;
|
||||
// error addr provided by libunwindstack
|
||||
optional uint64 error_addr = 2;
|
||||
|
||||
// error code interpreted by simpleperf
|
||||
enum ErrorCode {
|
||||
ERROR_NONE = 0; // No error
|
||||
ERROR_UNKNOWN = 1; // Error not interpreted by simpleperf, see raw_error_code
|
||||
ERROR_NOT_ENOUGH_STACK = 2; // Simpleperf doesn't record enough stack data
|
||||
ERROR_MEMORY_INVALID = 3; // Memory read failed
|
||||
ERROR_UNWIND_INFO = 4; // No debug info in binary to support unwinding
|
||||
ERROR_INVALID_MAP = 5; // Unwind in an invalid map
|
||||
ERROR_MAX_FRAME_EXCEEDED = 6; // Stopped at MAX_UNWINDING_FRAMES, which is 512.
|
||||
ERROR_REPEATED_FRAME = 7; // The last frame has the same pc/sp as the next.
|
||||
ERROR_INVALID_ELF = 8; // Unwind in an invalid elf file
|
||||
}
|
||||
optional ErrorCode error_code = 3;
|
||||
}
|
||||
|
||||
// Unwinding result is provided for samples without a complete callchain, when recorded with
|
||||
// --keep-failed-unwinding-result or --keep-failed-unwinding-debug-info.
|
||||
optional UnwindingResult unwinding_result = 6;
|
||||
}
|
||||
|
||||
message LostSituation {
|
||||
optional uint64 sample_count = 1;
|
||||
optional uint64 lost_count = 2;
|
||||
}
|
||||
|
||||
message File {
|
||||
// unique id for each file, starting from 0 and increasing by 1 each time.
|
||||
optional uint32 id = 1;
|
||||
|
||||
// file path, like /system/lib/libc.so.
|
||||
optional string path = 2;
|
||||
|
||||
// symbol table of the file.
|
||||
repeated string symbol = 3;
|
||||
|
||||
// mangled symbol table of the file.
|
||||
repeated string mangled_symbol = 4;
|
||||
}
|
||||
|
||||
message Thread {
|
||||
optional uint32 thread_id = 1;
|
||||
optional uint32 process_id = 2;
|
||||
optional string thread_name = 3;
|
||||
}
|
||||
|
||||
message MetaInfo {
|
||||
repeated string event_type = 1;
|
||||
optional string app_package_name = 2;
|
||||
optional string app_type = 3; // debuggable, profileable or non_profileable
|
||||
optional string android_sdk_version = 4;
|
||||
optional string android_build_type = 5; // user, userdebug or eng
|
||||
|
||||
// True if the profile is recorded with --trace-offcpu option.
|
||||
optional bool trace_offcpu = 6;
|
||||
}
|
||||
|
||||
// Thread context switch info. It is available when MetaInfo.trace_offcpu = true.
|
||||
message ContextSwitch {
|
||||
// If true, the thread is scheduled on cpu, otherwise it is scheduled off cpu.
|
||||
optional bool switch_on = 1;
|
||||
|
||||
// Monotonic clock time in nanoseconds. On kernel < 4.1, it's perf clock instead.
|
||||
optional uint64 time = 2;
|
||||
optional uint32 thread_id = 3;
|
||||
}
|
||||
|
||||
message Record {
|
||||
oneof record_data {
|
||||
Sample sample = 1;
|
||||
LostSituation lost = 2;
|
||||
File file = 3;
|
||||
Thread thread = 4;
|
||||
MetaInfo meta_info = 5;
|
||||
ContextSwitch context_switch = 6;
|
||||
}
|
||||
}
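
The header comment of this proto describes a simple length-prefixed framing: a 10-byte magic, a little-endian 16-bit version equal to 1, then size-prefixed Record messages terminated by a zero size. A minimal reader sketch follows; report_sample_pb2 is an assumed name for a module generated from this proto with protoc, and read_records is a hypothetical helper.

    # Minimal sketch of reading the framing described in the header comment above:
    # char magic[10], LittleEndian16(version) == 1, then repeated
    # LittleEndian32(record_size) + Record bytes, terminated by a zero size.
    # report_sample_pb2 is assumed to be generated from this proto with protoc.
    import struct
    import report_sample_pb2

    def read_records(path):
        with open(path, 'rb') as f:
            if f.read(10) != b'SIMPLEPERF':
                raise ValueError('not a report-sample file')
            (version,) = struct.unpack('<H', f.read(2))
            if version != 1:
                raise ValueError('unsupported version %d' % version)
            while True:
                (size,) = struct.unpack('<I', f.read(4))
                if size == 0:
                    break
                record = report_sample_pb2.Record()
                record.ParseFromString(f.read(size))
                yield record

Each yielded Record carries one of the messages defined above (Sample, LostSituation, File, Thread, MetaInfo or ContextSwitch) in its record_data oneof.
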
|
||||
60
Android/android-ndk-r27d/simpleperf/proto/record_file.proto
Normal file
@ -0,0 +1,60 @@
|
||||
/*
|
||||
* Copyright (C) 2021 The Android Open Source Project
|
||||
*
|
||||
* Licensed under the Apache License, Version 2.0 (the "License");
|
||||
* you may not use this file except in compliance with the License.
|
||||
* You may obtain a copy of the License at
|
||||
*
|
||||
* http://www.apache.org/licenses/LICENSE-2.0
|
||||
*
|
||||
* Unless required by applicable law or agreed to in writing, software
|
||||
* distributed under the License is distributed on an "AS IS" BASIS,
|
||||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
* See the License for the specific language governing permissions and
|
||||
* limitations under the License.
|
||||
*/
|
||||
|
||||
// message types used in perf.data.
|
||||
|
||||
syntax = "proto3";
|
||||
|
||||
package simpleperf.proto;
|
||||
|
||||
|
||||
message DebugUnwindFeature {
|
||||
message File {
|
||||
string path = 1;
|
||||
uint64 size = 2;
|
||||
}
|
||||
|
||||
repeated File file = 1;
|
||||
}
|
||||
|
||||
message FileFeature {
|
||||
string path = 1;
|
||||
uint32 type = 2;
|
||||
uint64 min_vaddr = 3;
|
||||
|
||||
message Symbol {
|
||||
uint64 vaddr = 1;
|
||||
uint32 len = 2;
|
||||
string name = 3;
|
||||
}
|
||||
repeated Symbol symbol = 4;
|
||||
|
||||
message DexFile {
|
||||
repeated uint64 dex_file_offset = 1;
|
||||
}
|
||||
message ElfFile {
|
||||
uint64 file_offset_of_min_vaddr = 1;
|
||||
}
|
||||
message KernelModule {
|
||||
uint64 memory_offset_of_min_vaddr = 1;
|
||||
}
|
||||
|
||||
oneof type_specific_msg {
|
||||
DexFile dex_file = 5; // Only when type = DSO_DEX_FILE
|
||||
ElfFile elf_file = 6; // Only when type = DSO_ELF_FILE
|
||||
KernelModule kernel_module = 7; // Only when type = DSO_KERNEL_MODULE
|
||||
}
|
||||
}
|
||||
69
Android/android-ndk-r27d/simpleperf/purgatorio/README.md
Normal file
@ -0,0 +1,69 @@
|
||||
# Purgatorio
|
||||
|
||||
[link on wikipedia](https://en.wikipedia.org/wiki/Purgatorio)
|
||||
|
||||

|
||||
|
||||
Purgatorio is a visualization tool for simpleperf traces. It's based on [libsimpleperf](https://source.corp.google.com/android/system/extras/simpleperf/;l=1?q=simpleperf&sq=package:%5Eandroid$),
|
||||
[Bokeh](https://bokeh.org/) and [D3 flamegraphs](https://github.com/spiermar/d3-flame-graph).
|
||||
|
||||
The main difference from [Inferno](https://source.corp.google.com/android/system/extras/simpleperf/scripts/inferno/;l=1) is that Purgatorio focuses on visualizing system-wide traces (the ones recorded with the `-a` argument) on a time-organized sequence, and allows the user to interact with the graph by zooming, hovering on samples, and visualizing a flame graph for a chosen subset of samples (restricted by time interval, set of threads, or any other criterion).
|
||||
|
||||
## Obtaining the sources
|
||||
|
||||
git clone sso://user/balejs/purgatorio
|
||||
|
||||
## Getting ready
|
||||
|
||||
**NOTE**: In theory it should work on most OSes, but Purgatorio has been tested on gLinux only. Any feedback, recommendations and patches to get it to work elsewhere are welcome (balejs@).
|
||||
|
||||
Purgatorio tends to be self-contained, but Bokeh and some of its dependencies aren't shipped with the default python libraries, so they need to be installed with pip3. Assuming python3 is already installed, Purgatorio hopefuls should follow these steps:
|
||||
|
||||
$ sudo apt-get install python3-pip
|
||||
$ pip3 install jinja2 bokeh pandas
|
||||
|
||||
Run `python3 purgatorio.py -h` for a list of command-line arguments.
|
||||
|
||||
## Example
|
||||
|
||||
One can trace a Camera warm launch with:
|
||||
|
||||
$ adb shell simpleperf record --trace-offcpu --call-graph fp -o /data/local/camera_warm_launch.data -a
|
||||
[launch camera here, then press ctrl + c]
|
||||
$ adb pull /data/local/camera_warm_launch.data
|
||||
|
||||
And then run:
|
||||
|
||||
python3 purgatorio.py camera_warm_launch.data
|
||||
|
||||
If you get lots of "Failed to read symbols" messages, and backtraces in the diagram don't show the symbols you're interested in, you might want to try [building a symbols cache](https://chromium.googlesource.com/android_ndk/+/refs/heads/master/simpleperf/doc/README.md#how-to-solve-missing-symbols-in-report) for the trace, then run purgatorio again with:
|
||||
|
||||
python3 purgatorio.py camera_warm_launch.data -u [symbols cache]
|
||||
|
||||
# Purgatorio interface
|
||||
The Purgatorio User Interface is divided into three areas:
|
||||
|
||||
## Main Graph
|
||||
It's the area to the left, including process names and color-coded dots grouped by process. It's used to navigate through the trace and identify samples of interest. By hovering on a sample (or set of samples), their callstacks will be visualized over the graph. When selecting a set of samples, their aggregated data will be visualized in the other sections of the UI. Multiple sections of the graph can be aggregated by holding down the [ctrl] key during selection.
|
||||
|
||||
The toolbox to the right can be used to configure interactions with the graph:
|
||||
|
||||

|
||||
|
||||
## Flame graph
|
||||
The flame graph is located in the upper right portion. Once samples are selected in the main graph, the flame graph will show an interactive visualization of their aggregated callstacks. In this case the selection included mostly samples for com.google.android.GoogleCamera.
|
||||
|
||||

|
||||
|
||||
It's possible to select a given stack entry to zoom in on it and look at entries deeper in the call stack.
|
||||
|
||||

|
||||
|
||||
When studying system issues it's often useful to visualize an inverted callstack. This can be done by clicking the related check box. The graph below is the same as the first flame graph above, but with the call stack inverted. In this case, the inverted visualization directly points at [possible issues with io](http://b/158783580#comment12).
|
||||
|
||||

|
||||
|
||||
## Sample table
|
||||
It's located in the lower right and counts samples by thread (for direct flame graphs) or symbol (for inverted flame graphs). Table columns can be sorted by clicking on their respective headers, and selecting specific rows filters the contents of the flame graph to the selected threads or symbols. Multiple rows can be selected at the same time.
|
||||
|
||||

|
||||
|
After Width: | Height: | Size: 20 KiB |
|
After Width: | Height: | Size: 72 KiB |
|
After Width: | Height: | Size: 51 KiB |
BIN
Android/android-ndk-r27d/simpleperf/purgatorio/images/table.png
Normal file
|
After Width: | Height: | Size: 68 KiB |
|
After Width: | Height: | Size: 60 KiB |
|
After Width: | Height: | Size: 186 KiB |
306
Android/android-ndk-r27d/simpleperf/purgatorio/purgatorio.py
Normal file
@ -0,0 +1,306 @@
|
||||
#!/usr/bin/env python3
|
||||
#
|
||||
# Copyright (C) 2021 The Android Open Source Project
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
import argparse
|
||||
import bisect
|
||||
import jinja2
|
||||
import io
|
||||
import math
|
||||
import os
|
||||
import pandas as pd
|
||||
from pathlib import Path
|
||||
import re
|
||||
import sys
|
||||
|
||||
from bokeh.embed import components
|
||||
from bokeh.io import output_file, show
|
||||
from bokeh.layouts import layout, Spacer
|
||||
from bokeh.models import ColumnDataSource, CustomJS, WheelZoomTool, HoverTool, FuncTickFormatter
|
||||
from bokeh.models.widgets import DataTable, DateFormatter, TableColumn
|
||||
from bokeh.models.ranges import FactorRange
|
||||
from bokeh.palettes import Category20b
|
||||
from bokeh.plotting import figure
|
||||
from bokeh.resources import INLINE
|
||||
from bokeh.transform import jitter
|
||||
from bokeh.util.browser import view
|
||||
from functools import cmp_to_key
|
||||
|
||||
# fmt: off
|
||||
simpleperf_path = Path(__file__).absolute().parents[1]
|
||||
sys.path.insert(0, str(simpleperf_path))
|
||||
import simpleperf_report_lib as sp
|
||||
from simpleperf_utils import BaseArgumentParser
|
||||
# fmt: on
|
||||
|
||||
|
||||
def create_graph(args, source, data_range):
|
||||
graph = figure(
|
||||
sizing_mode='stretch_both', x_range=data_range,
|
||||
tools=['pan', 'wheel_zoom', 'ywheel_zoom', 'xwheel_zoom', 'reset', 'tap', 'box_select'],
|
||||
active_drag='box_select', active_scroll='wheel_zoom',
|
||||
tooltips=[('thread', '@thread'),
|
||||
('callchain', '@callchain{safe}')],
|
||||
title=args.title, name='graph')
|
||||
|
||||
# a crude way to avoid process name cluttering at some zoom levels.
|
||||
# TODO: remove processes from the ticker based on the number of samples currently visualized.
|
||||
# The process with most samples visualized should always be visible on the ticker
|
||||
graph.xaxis.formatter = FuncTickFormatter(args={'range': data_range, 'graph': graph}, code="""
|
||||
var pixels_per_entry = graph.inner_height / (range.end - range.start) // Do not round end and start here
|
||||
var entries_to_skip = Math.ceil(12 / pixels_per_entry) // kind of 12 px per entry
|
||||
var desc = tick.split(/:| /)
|
||||
// desc[0] == desc[1] for main threads
|
||||
var keep = (desc[0] == desc[1]) &&
|
||||
!(desc[2].includes('unknown') ||
|
||||
desc[2].includes('Binder') ||
|
||||
desc[2].includes('kworker'))
|
||||
|
||||
if (pixels_per_entry < 8 && !keep) {
|
||||
//if (index + Math.round(range.start)) % entries_to_skip != 0) {
|
||||
return ""
|
||||
}
|
||||
|
||||
return tick """)
|
||||
|
||||
graph.xaxis.major_label_orientation = math.pi/6
|
||||
|
||||
graph.circle(y='time',
|
||||
x='thread',
|
||||
source=source,
|
||||
color='color',
|
||||
alpha=0.3,
|
||||
selection_fill_color='White',
|
||||
selection_line_color='Black',
|
||||
selection_line_width=0.5,
|
||||
selection_alpha=1.0)
|
||||
|
||||
graph.y_range.range_padding = 0
|
||||
graph.xgrid.grid_line_color = None
|
||||
return graph
|
||||
|
||||
|
||||
def create_table(graph):
|
||||
# Empty dataframe, will be filled up in js land
|
||||
empty_data = {'thread': [], 'count': []}
|
||||
table_source = ColumnDataSource(pd.DataFrame(
|
||||
empty_data, columns=['thread', 'count'], index=None))
|
||||
graph_source = graph.renderers[0].data_source
|
||||
|
||||
columns = [
|
||||
TableColumn(field='thread', title='Thread'),
|
||||
TableColumn(field='count', title='Count')
|
||||
]
|
||||
|
||||
# start with a small table size (stretch doesn't reduce from the preferred size)
|
||||
table = DataTable(
|
||||
width=100,
|
||||
height=100,
|
||||
sizing_mode='stretch_both',
|
||||
source=table_source,
|
||||
columns=columns,
|
||||
index_position=None,
|
||||
name='table')
|
||||
|
||||
graph_selection_cb = CustomJS(code='update_selections()')
|
||||
|
||||
graph_source.selected.js_on_change('indices', graph_selection_cb)
|
||||
table_source.selected.js_on_change('indices', CustomJS(args={}, code='update_flamegraph()'))
|
||||
|
||||
return table
|
||||
|
||||
|
||||
def generate_template(template_file='index.html.jinja2'):
|
||||
loader = jinja2.FileSystemLoader(
|
||||
searchpath=os.path.dirname(os.path.realpath(__file__)) + '/templates/')
|
||||
|
||||
env = jinja2.Environment(loader=loader)
|
||||
return env.get_template(template_file)
|
||||
|
||||
|
||||
def generate_html(args, components_dict, title):
|
||||
resources = INLINE.render()
|
||||
script, div = components(components_dict)
|
||||
return generate_template().render(
|
||||
resources=resources, plot_script=script, plot_div=div, title=title)
|
||||
|
||||
|
||||
class ThreadDescriptor:
|
||||
def __init__(self, pid, tid, name):
|
||||
self.name = name
|
||||
self.tid = tid
|
||||
self.pid = pid
|
||||
|
||||
def __lt__(self, other):
|
||||
return self.pid < other.pid or (self.pid == other.pid and self.tid < other.tid)
|
||||
|
||||
def __gt__(self, other):
|
||||
return self.pid > other.pid or (self.pid == other.pid and self.tid > other.tid)
|
||||
|
||||
def __eq__(self, other):
|
||||
return self.pid == other.pid and self.tid == other.tid and self.name == other.name
|
||||
|
||||
def __str__(self):
|
||||
return str(self.pid) + ':' + str(self.tid) + ' ' + self.name
|
||||
|
||||
|
||||
def generate_datasource(args):
|
||||
lib = sp.ReportLib()
|
||||
lib.ShowIpForUnknownSymbol()
|
||||
|
||||
if args.usyms:
|
||||
lib.SetSymfs(args.usyms)
|
||||
|
||||
if args.input_file:
|
||||
lib.SetRecordFile(args.input_file)
|
||||
|
||||
if args.ksyms:
|
||||
lib.SetKallsymsFile(args.ksyms)
|
||||
|
||||
lib.SetReportOptions(args.report_lib_options)
|
||||
|
||||
product = lib.MetaInfo().get('product_props')
|
||||
|
||||
if product:
|
||||
manufacturer, model, name = product.split(':')
|
||||
|
||||
start_time = -1
|
||||
end_time = -1
|
||||
|
||||
times = []
|
||||
threads = []
|
||||
thread_descs = []
|
||||
callchains = []
|
||||
|
||||
while True:
|
||||
sample = lib.GetNextSample()
|
||||
|
||||
if sample is None:
|
||||
lib.Close()
|
||||
break
|
||||
|
||||
symbol = lib.GetSymbolOfCurrentSample()
|
||||
callchain = lib.GetCallChainOfCurrentSample()
|
||||
|
||||
if start_time == -1:
|
||||
start_time = sample.time
|
||||
|
||||
sample_time = (sample.time - start_time) / 1e6 # convert to ms
|
||||
|
||||
times.append(sample_time)
|
||||
|
||||
if sample_time > end_time:
|
||||
end_time = sample_time
|
||||
|
||||
thread_desc = ThreadDescriptor(sample.pid, sample.tid, sample.thread_comm)
|
||||
|
||||
threads.append(str(thread_desc))
|
||||
|
||||
if thread_desc not in thread_descs:
|
||||
bisect.insort(thread_descs, thread_desc)
|
||||
|
||||
callchain_str = ''
|
||||
|
||||
for i in range(callchain.nr):
|
||||
symbol = callchain.entries[i].symbol # SymbolStruct
|
||||
entry_line = ''
|
||||
|
||||
if args.include_dso_names:
|
||||
entry_line += symbol._dso_name.decode('utf-8') + ':'
|
||||
|
||||
entry_line += symbol._symbol_name.decode('utf-8')
|
||||
|
||||
if args.include_symbols_addr:
|
||||
entry_line += ':' + hex(symbol.symbol_addr)
|
||||
|
||||
if i < callchain.nr - 1:
|
||||
callchain_str += entry_line + '<br>'
|
||||
|
||||
callchains.append(callchain_str)
|
||||
|
||||
# define colors per-process
|
||||
palette = Category20b[20]
|
||||
color_map = {}
|
||||
|
||||
last_pid = -1
|
||||
palette_index = 0
|
||||
|
||||
for thread_desc in thread_descs:
|
||||
if thread_desc.pid != last_pid:
|
||||
last_pid = thread_desc.pid
|
||||
palette_index += 1
|
||||
palette_index %= len(palette)
|
||||
|
||||
color_map[str(thread_desc.pid)] = palette[palette_index]
|
||||
|
||||
colors = []
|
||||
for sample_thread in threads:
|
||||
pid = str(sample_thread.split(':')[0])
|
||||
colors.append(color_map[pid])
|
||||
|
||||
threads_range = [str(thread_desc) for thread_desc in thread_descs]
|
||||
data_range = FactorRange(factors=threads_range, bounds='auto')
|
||||
|
||||
data = {'time': times,
|
||||
'thread': threads,
|
||||
'callchain': callchains,
|
||||
'color': colors}
|
||||
|
||||
source = ColumnDataSource(data)
|
||||
|
||||
return source, data_range
|
||||
|
||||
|
||||
def main():
|
||||
parser = BaseArgumentParser()
|
||||
parser.add_argument('-i', '--input_file', type=str, required=True, help='input file')
|
||||
parser.add_argument('--title', '-t', type=str, help='document title')
|
||||
parser.add_argument('--ksyms', '-k', type=str, help='path to kernel symbols (kallsyms)')
|
||||
parser.add_argument('--usyms', '-u', type=str, help='path to tree with user space symbols')
|
||||
parser.add_argument('--output', '-o', type=str, help='output file')
|
||||
parser.add_argument('--dont_open', '-d', action='store_true', help='Don\'t open output file')
|
||||
parser.add_argument('--include_dso_names', '-n', action='store_true',
|
||||
help='Include dso names in backtraces')
|
||||
parser.add_argument('--include_symbols_addr', '-s', action='store_true',
|
||||
help='Include addresses of symbols in backtraces')
|
||||
parser.add_report_lib_options(default_show_art_frames=True)
|
||||
|
||||
args = parser.parse_args()
|
||||
|
||||
# TODO test hierarchical ranges too
|
||||
source, data_range = generate_datasource(args)
|
||||
|
||||
graph = create_graph(args, source, data_range)
|
||||
table = create_table(graph)
|
||||
|
||||
output_filename = args.output
|
||||
|
||||
if not output_filename:
|
||||
output_filename = os.path.splitext(os.path.basename(args.input_file))[0] + '.html'
|
||||
|
||||
title = os.path.splitext(os.path.basename(output_filename))[0]
|
||||
|
||||
html = generate_html(args, {'graph': graph, 'table': table}, title)
|
||||
|
||||
with io.open(output_filename, mode='w', encoding='utf-8') as fout:
|
||||
fout.write(html)
|
||||
|
||||
if not args.dont_open:
|
||||
view(output_filename)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
@ -0,0 +1,66 @@
|
||||
<!DOCTYPE html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<title>{{ title }}</title>
|
||||
<script
|
||||
src="https://code.jquery.com/jquery-3.5.1.min.js"
|
||||
integrity="sha256-9/aliU8dGd2tb6OSsuzixeV4y/faTqgFtohetphbbj0="
|
||||
crossorigin="anonymous"></script>
|
||||
|
||||
<meta charset="utf-8">
|
||||
|
||||
{{ resources }}
|
||||
|
||||
<style>
|
||||
{% include 'styles.css' %}
|
||||
</style>
|
||||
|
||||
<link rel="stylesheet" type="text/css" href="https://cdn.jsdelivr.net/npm/d3-flame-graph@4.0.6/dist/d3-flamegraph.css">
|
||||
|
||||
<script
|
||||
src="https://code.jquery.com/ui/1.12.1/jquery-ui.min.js"
|
||||
integrity="sha256-VazP97ZCwtekAsvgPBSUwPFKdrwD3unUfSGVYrahUqU="
|
||||
crossorigin="anonymous"></script>
|
||||
<script type="text/javascript" src="https://d3js.org/d3.v4.min.js"></script>
|
||||
<script type="text/javascript" src="https://cdn.jsdelivr.net/npm/d3-flame-graph@4.0.6/dist/d3-flamegraph.min.js"></script>
|
||||
|
||||
</head>
|
||||
<body>
|
||||
<div id="help_dialog" class="dialog">
|
||||
<div class="dialog_area">
|
||||
<span class="dialog_close">×</span>
|
||||
<p> <b>Main plot (upper left):</b> pan with click+mouse movement, zoom in/out with the mouse
|
||||
wheel, hover on sample clusters to see backtraces. Select samples with the rectangular
|
||||
selection tool or by clicking on them. Hold shift while selecting to add samples, or ctrl+shift to
|
||||
remove them from the selection. Different tools can be enabled/disabled from
|
||||
the toolbox.</p>
|
||||
<p><b>Flame graph (upper right):</b> click on specific items to zoom in.</p>
|
||||
<p><b>Sample table (lower right):</b> select processes to filter in the Flame graph.</p>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="top_right">
|
||||
<button id="help_button" class="help" text-align="right">HELP</button>
|
||||
</div>
|
||||
<div class="left"> {{ plot_div.graph }} </div>
|
||||
<div class="middle_right">
|
||||
<div id="flame"/>
|
||||
</div>
|
||||
|
||||
<div class="bottom_right">
|
||||
<div style="display: flex; justify-content: space-around">
|
||||
<div>
|
||||
<label for="regex">Filter by regex:</label>
|
||||
<input type="text" id="regex" oninput="update_selections()"/>
|
||||
</div>
|
||||
<div>
|
||||
Invert callstack <input type="checkbox" id="inverted_checkbox" onclick="update_selections()">
|
||||
</div>
|
||||
</div>
|
||||
{{ plot_div.table }}
|
||||
</div>
|
||||
|
||||
<script>{% include 'main.js' %}</script>
|
||||
{{ plot_script }}
|
||||
</body>
|
||||
</html>
|
||||
245
Android/android-ndk-r27d/simpleperf/purgatorio/templates/main.js
Normal file
@ -0,0 +1,245 @@
|
||||
/*
|
||||
* Copyright (C) 2021 The Android Open Source Project
|
||||
*
|
||||
* Licensed under the Apache License, Version 2.0 (the "License");
|
||||
* you may not use this file except in compliance with the License.
|
||||
* You may obtain a copy of the License at
|
||||
*
|
||||
* http://www.apache.org/licenses/LICENSE-2.0
|
||||
*
|
||||
* Unless required by applicable law or agreed to in writing, software
|
||||
* distributed under the License is distributed on an "AS IS" BASIS,
|
||||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
* See the License for the specific language governing permissions and
|
||||
* limitations under the License.
|
||||
*/
|
||||
|
||||
function generateHash (name) {
|
||||
// Return a vector (0.0->1.0) that is a hash of the input string.
|
||||
// The hash is computed to favor early characters over later ones, so
|
||||
// that strings with similar starts have similar vectors. Only the first
|
||||
// 6 characters are considered.
|
||||
const MAX_CHAR = 6
|
||||
|
||||
var hash = 0
|
||||
var maxHash = 0
|
||||
var weight = 1
|
||||
var mod = 10
|
||||
|
||||
if (name) {
|
||||
for (var i = 0; i < name.length; i++) {
|
||||
if (i > MAX_CHAR) { break }
|
||||
hash += weight * (name.charCodeAt(i) % mod)
|
||||
maxHash += weight * (mod - 1)
|
||||
weight *= 0.70
|
||||
}
|
||||
if (maxHash > 0) { hash = hash / maxHash }
|
||||
}
|
||||
return hash
|
||||
}
|
||||
|
||||
function offCpuColorMapper (d) {
|
||||
if (d.highlight) return '#E600E6'
|
||||
|
||||
let name = d.data.n || d.data.name
|
||||
let vector = 0
|
||||
const nameArr = name.split('`')
|
||||
|
||||
if (nameArr.length > 1) {
|
||||
name = nameArr[nameArr.length - 1] // drop module name if present
|
||||
}
|
||||
name = name.split('(')[0] // drop extra info
|
||||
vector = generateHash(name)
|
||||
|
||||
const r = 0 + Math.round(55 * (1 - vector))
|
||||
const g = 0 + Math.round(230 * (1 - vector))
|
||||
const b = 200 + Math.round(55 * vector)
|
||||
|
||||
return 'rgb(' + r + ',' + g + ',' + b + ')'
|
||||
}
|
||||
|
||||
var flame = flamegraph()
|
||||
.cellHeight(18)
|
||||
.width(window.innerWidth * 3 / 10 - 20) // 30% width
|
||||
.transitionDuration(750)
|
||||
.minFrameSize(5)
|
||||
.transitionEase(d3.easeCubic)
|
||||
.inverted(false)
|
||||
.sort(true)
|
||||
.title("")
|
||||
//.differential(false)
|
||||
//.elided(false)
|
||||
.selfValue(false)
|
||||
.setColorMapper(offCpuColorMapper);
|
||||
|
||||
|
||||
function update_table() {
|
||||
let inverted = document.getElementById("inverted_checkbox").checked
|
||||
let regex
|
||||
let graph_source = Bokeh.documents[0].get_model_by_name('graph').renderers[0].data_source
|
||||
let table_source = Bokeh.documents[0].get_model_by_name('table').source
|
||||
|
||||
let graph_selection = graph_source.selected.indices
|
||||
let threads = graph_source.data.thread
|
||||
let callchains = graph_source.data.callchain
|
||||
|
||||
let selection_len = graph_selection.length;
|
||||
|
||||
if (document.getElementById("regex").value) {
|
||||
regex = new RegExp(document.getElementById("regex").value)
|
||||
}
|
||||
|
||||
table_source.data.thread = []
|
||||
table_source.data.count = []
|
||||
table_source.data.index = []
|
||||
|
||||
for (let i = 0; i < selection_len; i ++) {
|
||||
let entry = "<no callchain>"
|
||||
|
||||
if (regex !== undefined && !regex.test(callchains[graph_selection[i]])) {
|
||||
continue;
|
||||
}
|
||||
|
||||
if (inverted) {
|
||||
let callchain = callchains[graph_selection[i]].split("<br>")
|
||||
|
||||
for (let e = 0; e < callchain.length; e ++) {
|
||||
if (callchain[e] != "") { // last entry is apparently always an empty string
|
||||
entry = callchain[e]
|
||||
break
|
||||
}
|
||||
}
|
||||
} else {
|
||||
entry = threads[graph_selection[i]]
|
||||
}
|
||||
|
||||
let pos = table_source.data.thread.indexOf(entry)
|
||||
|
||||
if(pos == -1) {
|
||||
table_source.data.thread.push(entry)
|
||||
table_source.data.count.push(1)
|
||||
table_source.data.index.push(table_source.data.thread.length)
|
||||
} else {
|
||||
table_source.data.count[pos] ++
|
||||
}
|
||||
}
|
||||
|
||||
table_source.selected.indices = []
|
||||
table_source.change.emit()
|
||||
}
|
||||
|
||||
|
||||
function should_insert_callchain(callchain, items, filter_index, inverted) {
|
||||
for (t = 0; t < filter_index.length; t ++) {
|
||||
if (callchain[0] === items[filter_index[t]]) {
|
||||
return true
|
||||
}
|
||||
}
|
||||
|
||||
if (filter_index.length > 0) {
|
||||
return false
|
||||
}
|
||||
|
||||
return true
|
||||
}
|
||||
|
||||
|
||||
function insert_callchain(root, callchain, inverted) {
|
||||
let root_pos = -1
|
||||
let node = root
|
||||
|
||||
node.value ++
|
||||
|
||||
for (let e = 0; e < callchain.length; e ++) {
|
||||
let entry = callchain[e].replace(/^\s+|\s+$/g, '')
|
||||
let entry_pos = -1
|
||||
|
||||
for (let j = 0; j < node.children.length; j ++) {
|
||||
if (node.children[j].name == entry) {
|
||||
entry_pos = j
|
||||
break
|
||||
}
|
||||
}
|
||||
|
||||
if (entry_pos == -1) {
|
||||
node.children.push({name: entry, value:0, children:[]})
|
||||
entry_pos = node.children.length - 1
|
||||
}
|
||||
|
||||
node = node.children[entry_pos]
|
||||
node.value ++
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
function update_flamegraph() {
|
||||
let inverted = document.getElementById("inverted_checkbox").checked
|
||||
let root = {name: inverted ? "samples" : "processes", value: 0, children: []}
|
||||
|
||||
let graph_source = Bokeh.documents[0].get_model_by_name('graph').renderers[0].data_source
|
||||
let graph_selection = graph_source.selected.indices
|
||||
let callchains = graph_source.data.callchain
|
||||
let graph_threads = graph_source.data.thread
|
||||
|
||||
let table_source = Bokeh.documents[0].get_model_by_name('table').source
|
||||
let table_selection = table_source.selected.indices
|
||||
let table_threads = table_source.data.thread
|
||||
let regex
|
||||
|
||||
if (document.getElementById("regex").value) {
|
||||
regex = new RegExp(document.getElementById("regex").value)
|
||||
}
|
||||
|
||||
for (let i = 0; i < graph_selection.length; i ++) {
|
||||
let thread = graph_threads[graph_selection[i]]
|
||||
let callchain = callchains[graph_selection[i]].split("<br>")
|
||||
callchain = callchain.filter(function(e){return e != ""})
|
||||
|
||||
if (regex !== undefined && !regex.test(callchains[graph_selection[i]])) {
|
||||
continue;
|
||||
}
|
||||
|
||||
if (callchain.length == 0) {
|
||||
callchain.push("<no callchain>")
|
||||
}
|
||||
|
||||
callchain.push(thread)
|
||||
|
||||
if (!inverted){
|
||||
callchain = callchain.reverse()
|
||||
}
|
||||
|
||||
if (should_insert_callchain(callchain, table_threads, table_selection)) {
|
||||
insert_callchain(root, callchain)
|
||||
}
|
||||
}
|
||||
|
||||
if (root.children.length == 1) {
|
||||
root = root.children[0]
|
||||
}
|
||||
|
||||
d3.select("#flame")
|
||||
.datum(root)
|
||||
.call(flame)
|
||||
}
|
||||
|
||||
var help_dialog = document.getElementById("help_dialog");
|
||||
|
||||
document.getElementById("help_button").onclick = function() {
|
||||
help_dialog.style.display = "block";
|
||||
}
|
||||
|
||||
window.onclick = function(event) {
|
||||
if (event.target == help_dialog) {
|
||||
help_dialog.style.display = "none";
|
||||
}
|
||||
}
|
||||
|
||||
document.getElementsByClassName("dialog_close")[0].onclick = function() {
|
||||
help_dialog.style.display = "none";
|
||||
}
|
||||
|
||||
function update_selections() {
|
||||
update_flamegraph()
|
||||
update_table()
|
||||
}
|
||||
@ -0,0 +1,133 @@
|
||||
/*
|
||||
* Copyright (C) 2021 The Android Open Source Project
|
||||
*
|
||||
* Licensed under the Apache License, Version 2.0 (the "License");
|
||||
* you may not use this file except in compliance with the License.
|
||||
* You may obtain a copy of the License at
|
||||
*
|
||||
* http://www.apache.org/licenses/LICENSE-2.0
|
||||
*
|
||||
* Unless required by applicable law or agreed to in writing, software
|
||||
* distributed under the License is distributed on an "AS IS" BASIS,
|
||||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
* See the License for the specific language governing permissions and
|
||||
* limitations under the License.
|
||||
*/
|
||||
|
||||
body {
|
||||
font-family: sans-serif;
|
||||
}
|
||||
|
||||
::-webkit-scrollbar {
|
||||
width: 1em;
|
||||
}
|
||||
|
||||
::-webkit-scrollbar-track {
|
||||
box-shadow: inset 0 0 0.1em white;
|
||||
border-radius: 0.5em;
|
||||
}
|
||||
|
||||
::-webkit-scrollbar-thumb {
|
||||
background: lightgrey;
|
||||
border-radius: 0.5em;
|
||||
}
|
||||
|
||||
div.left {
|
||||
position:fixed;
|
||||
top: 0;
|
||||
left: 0;
|
||||
width: 70%;
|
||||
height: 100%;
|
||||
}
|
||||
|
||||
div.top_right {
|
||||
position: fixed;
|
||||
top: 0;
|
||||
right: 0;
|
||||
text-align: right;
|
||||
width: 30%;
|
||||
height: 1em;
|
||||
}
|
||||
|
||||
div.middle_right {
|
||||
position:fixed;
|
||||
top: 2%;
|
||||
right: 0;
|
||||
width: 30%;
|
||||
height: 78%;
|
||||
overflow-y: scroll;
|
||||
}
|
||||
|
||||
div.bottom_right {
|
||||
position: fixed;
|
||||
width: 30%;
|
||||
height: 20%;
|
||||
bottom: 0;
|
||||
right: 0;
|
||||
}
|
||||
|
||||
button {
|
||||
border: none;
|
||||
outline: none;
|
||||
padding: 0;
|
||||
background: white;
|
||||
font-size: 0.5em;
|
||||
}
|
||||
|
||||
button.help:before
|
||||
{
|
||||
content: '?';
|
||||
display: inline-block;
|
||||
font-weight: bold;
|
||||
text-align: center;
|
||||
font-size: 1.4em;
|
||||
width: 1.5em;
|
||||
height: 1.5em;
|
||||
line-height: 1.6em;
|
||||
border-radius: 1.2em;
|
||||
margin-right: 0.3em;
|
||||
color: GoldenRod;
|
||||
background: white;
|
||||
border: 0.1em solid GoldenRod;
|
||||
}
|
||||
|
||||
button.help:hover:before
|
||||
{
|
||||
color: white;
|
||||
background: GoldenRod;
|
||||
}
|
||||
|
||||
.dialog {
|
||||
display: none;
|
||||
position: fixed;
|
||||
z-index: 1;
|
||||
left: 0;
|
||||
top: 0;
|
||||
width: 100%;
|
||||
height: 100%;
|
||||
overflow: auto;
|
||||
background-color: rgba(0,0,0,0.4);
|
||||
}
|
||||
|
||||
.dialog_area {
|
||||
background-color: white;
|
||||
margin: 20% auto;
|
||||
border: 0.05em solid gray;
|
||||
border-radius: 0.5em;
|
||||
padding-left: 0.5em;
|
||||
padding-right: 0.5em;
|
||||
width: 50%;
|
||||
}
|
||||
|
||||
.dialog_close {
|
||||
color: darkgray;
|
||||
float: right;
|
||||
font-size: 2em;
|
||||
font-weight: bold;
|
||||
}
|
||||
|
||||
.dialog_close:focus,
|
||||
.dialog_close:hover {
|
||||
cursor: pointer;
|
||||
color: black;
|
||||
}
|
||||
344
Android/android-ndk-r27d/simpleperf/report.py
Normal file
@ -0,0 +1,344 @@
|
||||
#!/usr/bin/env python3
|
||||
#
|
||||
# Copyright (C) 2015 The Android Open Source Project
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
"""Simpleperf gui reporter: provide gui interface for simpleperf report command.
|
||||
|
||||
There are two ways to use gui reporter. One way is to pass it a report file
|
||||
generated by simpleperf report command, and reporter will display it. The
|
||||
other way is to pass it any arguments you want to use when calling
|
||||
simpleperf report command. The reporter will call `simpleperf report` to
|
||||
generate report file, and display it.
|
||||
"""
|
||||
|
||||
import logging
|
||||
import os
|
||||
import os.path
|
||||
import re
|
||||
import subprocess
|
||||
import sys
|
||||
|
||||
try:
|
||||
from tkinter import *
|
||||
from tkinter.font import Font
|
||||
from tkinter.ttk import *
|
||||
except ImportError:
|
||||
from Tkinter import *
|
||||
from tkFont import Font
|
||||
from ttk import *
|
||||
|
||||
from simpleperf_utils import *
|
||||
|
||||
PAD_X = 3
|
||||
PAD_Y = 3
|
||||
|
||||
|
||||
class CallTreeNode(object):
|
||||
|
||||
"""Representing a node in call-graph."""
|
||||
|
||||
def __init__(self, percentage, function_name):
|
||||
self.percentage = percentage
|
||||
self.call_stack = [function_name]
|
||||
self.children = []
|
||||
|
||||
def add_call(self, function_name):
|
||||
self.call_stack.append(function_name)
|
||||
|
||||
def add_child(self, node):
|
||||
self.children.append(node)
|
||||
|
||||
def __str__(self):
|
||||
strs = self.dump()
|
||||
return '\n'.join(strs)
|
||||
|
||||
def dump(self):
|
||||
strs = []
|
||||
strs.append('CallTreeNode percentage = %.2f' % self.percentage)
|
||||
for function_name in self.call_stack:
|
||||
strs.append(' %s' % function_name)
|
||||
for child in self.children:
|
||||
child_strs = child.dump()
|
||||
strs.extend([' ' + x for x in child_strs])
|
||||
return strs
|
||||
|
||||
|
||||
class ReportItem(object):
|
||||
|
||||
"""Representing one item in report, may contain a CallTree."""
|
||||
|
||||
def __init__(self, raw_line):
|
||||
self.raw_line = raw_line
|
||||
self.call_tree = None
|
||||
|
||||
def __str__(self):
|
||||
strs = []
|
||||
strs.append('ReportItem (raw_line %s)' % self.raw_line)
|
||||
if self.call_tree is not None:
|
||||
strs.append('%s' % self.call_tree)
|
||||
return '\n'.join(strs)
|
||||
|
||||
|
||||
class EventReport(object):
|
||||
|
||||
"""Representing report for one event attr."""
|
||||
|
||||
def __init__(self, common_report_context):
|
||||
self.context = common_report_context[:]
|
||||
self.title_line = None
|
||||
self.report_items = []
|
||||
|
||||
|
||||
def parse_event_reports(lines):
|
||||
# Parse common report context
|
||||
common_report_context = []
|
||||
line_id = 0
|
||||
while line_id < len(lines):
|
||||
line = lines[line_id]
|
||||
if not line or line.find('Event:') == 0:
|
||||
break
|
||||
common_report_context.append(line)
|
||||
line_id += 1
|
||||
|
||||
event_reports = []
|
||||
in_report_context = True
|
||||
cur_event_report = EventReport(common_report_context)
|
||||
cur_report_item = None
|
||||
call_tree_stack = {}
|
||||
vertical_columns = []
|
||||
last_node = None
|
||||
|
||||
has_skipped_callgraph = False
|
||||
|
||||
for line in lines[line_id:]:
|
||||
if not line:
|
||||
in_report_context = not in_report_context
|
||||
if in_report_context:
|
||||
cur_event_report = EventReport(common_report_context)
|
||||
continue
|
||||
|
||||
if in_report_context:
|
||||
cur_event_report.context.append(line)
|
||||
if line.find('Event:') == 0:
|
||||
event_reports.append(cur_event_report)
|
||||
continue
|
||||
|
||||
if cur_event_report.title_line is None:
|
||||
cur_event_report.title_line = line
|
||||
elif not line[0].isspace():
|
||||
cur_report_item = ReportItem(line)
|
||||
cur_event_report.report_items.append(cur_report_item)
|
||||
# Each report item can have different column depths.
|
||||
vertical_columns = []
|
||||
else:
|
||||
for i in range(len(line)):
|
||||
if line[i] == '|':
|
||||
if not vertical_columns or vertical_columns[-1] < i:
|
||||
vertical_columns.append(i)
|
||||
|
||||
if not line.strip('| \t'):
|
||||
continue
|
||||
if 'skipped in brief callgraph mode' in line:
|
||||
has_skipped_callgraph = True
|
||||
continue
|
||||
|
||||
if line.find('-') == -1:
|
||||
line = line.strip('| \t')
|
||||
function_name = line
|
||||
last_node.add_call(function_name)
|
||||
else:
|
||||
pos = line.find('-')
|
||||
depth = -1
|
||||
for i in range(len(vertical_columns)):
|
||||
if pos >= vertical_columns[i]:
|
||||
depth = i
|
||||
assert depth != -1
|
||||
|
||||
line = line.strip('|- \t')
|
||||
m = re.search(r'^([\d\.]+)%[-\s]+(.+)$', line)
|
||||
if m:
|
||||
percentage = float(m.group(1))
|
||||
function_name = m.group(2)
|
||||
else:
|
||||
percentage = 100.0
|
||||
function_name = line
|
||||
|
||||
node = CallTreeNode(percentage, function_name)
|
||||
if depth == 0:
|
||||
cur_report_item.call_tree = node
|
||||
else:
|
||||
call_tree_stack[depth - 1].add_child(node)
|
||||
call_tree_stack[depth] = node
|
||||
last_node = node
|
||||
|
||||
if has_skipped_callgraph:
|
||||
logging.warning('some callgraphs are skipped in brief callgraph mode')
|
||||
|
||||
return event_reports
|
||||
|
||||
|
||||
class ReportWindow(object):
|
||||
|
||||
"""A window used to display report file."""
|
||||
|
||||
def __init__(self, main, report_context, title_line, report_items):
|
||||
frame = Frame(main)
|
||||
frame.pack(fill=BOTH, expand=1)
|
||||
|
||||
font = Font(family='courier', size=12)
|
||||
|
||||
# Report Context
|
||||
for line in report_context:
|
||||
label = Label(frame, text=line, font=font)
|
||||
label.pack(anchor=W, padx=PAD_X, pady=PAD_Y)
|
||||
|
||||
# Space
|
||||
label = Label(frame, text='', font=font)
|
||||
label.pack(anchor=W, padx=PAD_X, pady=PAD_Y)
|
||||
|
||||
# Title
|
||||
label = Label(frame, text=' ' + title_line, font=font)
|
||||
label.pack(anchor=W, padx=PAD_X, pady=PAD_Y)
|
||||
|
||||
# Report Items
|
||||
report_frame = Frame(frame)
|
||||
report_frame.pack(fill=BOTH, expand=1)
|
||||
|
||||
yscrollbar = Scrollbar(report_frame)
|
||||
yscrollbar.pack(side=RIGHT, fill=Y)
|
||||
xscrollbar = Scrollbar(report_frame, orient=HORIZONTAL)
|
||||
xscrollbar.pack(side=BOTTOM, fill=X)
|
||||
|
||||
tree = Treeview(report_frame, columns=[title_line], show='')
|
||||
tree.pack(side=LEFT, fill=BOTH, expand=1)
|
||||
tree.tag_configure('set_font', font=font)
|
||||
|
||||
tree.config(yscrollcommand=yscrollbar.set)
|
||||
yscrollbar.config(command=tree.yview)
|
||||
tree.config(xscrollcommand=xscrollbar.set)
|
||||
xscrollbar.config(command=tree.xview)
|
||||
|
||||
self.display_report_items(tree, report_items)
|
||||
|
||||
def display_report_items(self, tree, report_items):
|
||||
for report_item in report_items:
|
||||
prefix_str = '+ ' if report_item.call_tree is not None else ' '
|
||||
id = tree.insert(
|
||||
'',
|
||||
'end',
|
||||
None,
|
||||
values=[
|
||||
prefix_str +
|
||||
report_item.raw_line],
|
||||
tag='set_font')
|
||||
if report_item.call_tree is not None:
|
||||
self.display_call_tree(tree, id, report_item.call_tree, 1)
|
||||
|
||||
def display_call_tree(self, tree, parent_id, node, indent):
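# Renders one Treeview row per function in node.call_stack, chaining each row under
# the previous one; the percentage is shown on the first row, a '+ ' marker on the
# last row when the node has children, and child nodes are rendered one indent deeper.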
|
||||
id = parent_id
|
||||
indent_str = ' ' * indent
|
||||
|
||||
if node.percentage != 100.0:
|
||||
percentage_str = '%.2f%% ' % node.percentage
|
||||
else:
|
||||
percentage_str = ''
|
||||
|
||||
for i in range(len(node.call_stack)):
|
||||
s = indent_str
|
||||
s += '+ ' if node.children and i == len(node.call_stack) - 1 else ' '
|
||||
s += percentage_str if i == 0 else ' ' * len(percentage_str)
|
||||
s += node.call_stack[i]
|
||||
child_open = False if i == len(node.call_stack) - 1 and indent > 1 else True
|
||||
id = tree.insert(id, 'end', None, values=[s], open=child_open,
|
||||
tag='set_font')
|
||||
|
||||
for child in node.children:
|
||||
self.display_call_tree(tree, id, child, indent + 1)
|
||||
|
||||
|
||||
def display_report_file(report_file, self_kill_after_sec):
|
||||
fh = open(report_file, 'r')
|
||||
lines = fh.readlines()
|
||||
fh.close()
|
||||
|
||||
lines = [x.rstrip() for x in lines]
|
||||
event_reports = parse_event_reports(lines)
|
||||
|
||||
if event_reports:
|
||||
root = Tk()
|
||||
for i in range(len(event_reports)):
|
||||
report = event_reports[i]
|
||||
parent = root if i == 0 else Toplevel(root)
|
||||
ReportWindow(parent, report.context, report.title_line, report.report_items)
|
||||
if self_kill_after_sec:
|
||||
root.after(self_kill_after_sec * 1000, lambda: root.destroy())
|
||||
root.mainloop()
|
||||
|
||||
|
||||
def call_simpleperf_report(args, show_gui, self_kill_after_sec):
|
||||
simpleperf_path = get_host_binary_path('simpleperf')
|
||||
if not show_gui:
|
||||
subprocess.check_call([simpleperf_path, 'report'] + args)
|
||||
else:
|
||||
report_file = 'perf.report'
|
||||
subprocess.check_call([simpleperf_path, 'report', '--full-callgraph'] + args +
|
||||
['-o', report_file])
|
||||
display_report_file(report_file, self_kill_after_sec=self_kill_after_sec)
|
||||
|
||||
|
||||
def get_simpleperf_report_help_msg():
|
||||
simpleperf_path = get_host_binary_path('simpleperf')
|
||||
args = [simpleperf_path, 'report', '-h']
|
||||
proc = subprocess.Popen(args, stdout=subprocess.PIPE)
|
||||
(stdoutdata, _) = proc.communicate()
|
||||
stdoutdata = bytes_to_str(stdoutdata)
|
||||
return stdoutdata[stdoutdata.find('\n') + 1:]
|
||||
|
||||
|
||||
def main():
|
||||
self_kill_after_sec = 0
|
||||
args = sys.argv[1:]
|
||||
if args and args[0] == "--self-kill-for-testing":
|
||||
self_kill_after_sec = 1
|
||||
args = args[1:]
|
||||
if len(args) == 1 and os.path.isfile(args[0]):
|
||||
display_report_file(args[0], self_kill_after_sec=self_kill_after_sec)
return
|
||||
|
||||
i = 0
|
||||
args_for_report_cmd = []
|
||||
show_gui = False
|
||||
while i < len(args):
|
||||
if args[i] == '-h' or args[i] == '--help':
|
||||
print('report.py A python wrapper for simpleperf report command.')
|
||||
print('Options supported by simpleperf report command:')
|
||||
print(get_simpleperf_report_help_msg())
|
||||
print('\nOptions supported by report.py:')
|
||||
print('--gui Show report result in a gui window.')
|
||||
print('\nIt also supports showing a report generated by simpleperf report cmd:')
|
||||
print('\n python report.py report_file')
|
||||
sys.exit(0)
|
||||
elif args[i] == '--gui':
|
||||
show_gui = True
|
||||
i += 1
|
||||
else:
|
||||
args_for_report_cmd.append(args[i])
|
||||
i += 1
|
||||
|
||||
call_simpleperf_report(args_for_report_cmd, show_gui, self_kill_after_sec)
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
|
||||
1780
Android/android-ndk-r27d/simpleperf/report_html.js
Normal file
1098
Android/android-ndk-r27d/simpleperf/report_html.py
Normal file
107
Android/android-ndk-r27d/simpleperf/report_sample.py
Normal file
@ -0,0 +1,107 @@
|
||||
#!/usr/bin/env python3
|
||||
#
|
||||
# Copyright (C) 2016 The Android Open Source Project
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
"""report_sample.py: report samples in the same format as `perf script`.
|
||||
"""
|
||||
|
||||
import sys
|
||||
from simpleperf_report_lib import GetReportLib
|
||||
from simpleperf_utils import BaseArgumentParser, flatten_arg_list, ReportLibOptions
|
||||
from typing import List, Set, Optional
|
||||
|
||||
|
||||
def report_sample(
|
||||
record_file: str,
|
||||
symfs_dir: str,
|
||||
kallsyms_file: str,
|
||||
show_tracing_data: bool,
|
||||
header: bool,
|
||||
report_lib_options: ReportLibOptions):
|
||||
""" read record_file, and print each sample"""
|
||||
lib = GetReportLib(record_file)
|
||||
|
||||
lib.ShowIpForUnknownSymbol()
|
||||
if symfs_dir is not None:
|
||||
lib.SetSymfs(symfs_dir)
|
||||
if kallsyms_file is not None:
|
||||
lib.SetKallsymsFile(kallsyms_file)
|
||||
lib.SetReportOptions(report_lib_options)
|
||||
|
||||
if header:
|
||||
print("# ========")
|
||||
print("# cmdline : %s" % lib.GetRecordCmd())
|
||||
print("# arch : %s" % lib.GetArch())
|
||||
for k, v in lib.MetaInfo().items():
|
||||
print('# %s : %s' % (k, v.replace('\n', ' ')))
|
||||
print("# ========")
|
||||
print("#")
|
||||
|
||||
while True:
|
||||
sample = lib.GetNextSample()
|
||||
if sample is None:
|
||||
lib.Close()
|
||||
break
|
||||
event = lib.GetEventOfCurrentSample()
|
||||
symbol = lib.GetSymbolOfCurrentSample()
|
||||
callchain = lib.GetCallChainOfCurrentSample()
|
||||
|
||||
sec = sample.time // 1000000000
|
||||
usec = (sample.time - sec * 1000000000) // 1000
|
||||
print('%s\t%d/%d [%03d] %d.%06d: %d %s:' % (sample.thread_comm,
|
||||
sample.pid, sample.tid, sample.cpu, sec,
|
||||
usec, sample.period, event.name))
|
||||
print('\t%16x %s (%s)' % (sample.ip, symbol.symbol_name, symbol.dso_name))
|
||||
for i in range(callchain.nr):
|
||||
entry = callchain.entries[i]
|
||||
print('\t%16x %s (%s)' % (entry.ip, entry.symbol.symbol_name, entry.symbol.dso_name))
|
||||
if show_tracing_data:
|
||||
data = lib.GetTracingDataOfCurrentSample()
|
||||
if data:
|
||||
print('\ttracing data:')
|
||||
for key, value in data.items():
|
||||
print('\t\t%s : %s' % (key, value))
|
||||
print('')
|
||||
|
||||
|
||||
def main():
|
||||
parser = BaseArgumentParser(description='Report samples in perf.data.')
|
||||
parser.add_argument('--symfs',
|
||||
help='Set the path to find binaries with symbols and debug info.')
|
||||
parser.add_argument('--kallsyms', help='Set the path to find kernel symbols.')
|
||||
parser.add_argument('-i', '--record_file', nargs='?', default='perf.data',
|
||||
help='Default is perf.data.')
|
||||
parser.add_argument('--show_tracing_data', action='store_true', help='print tracing data.')
|
||||
parser.add_argument('--header', action='store_true',
|
||||
help='Show metadata header, like perf script --header')
|
||||
parser.add_argument('-o', '--output_file', default='', help="""
|
||||
The path of the generated report. Default is stdout.""")
|
||||
parser.add_report_lib_options()
|
||||
args = parser.parse_args()
|
||||
# If the output file has been set, redirect stdout.
|
||||
if args.output_file != '' and args.output_file != '-':
|
||||
sys.stdout = open(file=args.output_file, mode='w')
|
||||
report_sample(
|
||||
record_file=args.record_file,
|
||||
symfs_dir=args.symfs,
|
||||
kallsyms_file=args.kallsyms,
|
||||
show_tracing_data=args.show_tracing_data,
|
||||
header=args.header,
|
||||
report_lib_options=args.report_lib_options)
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
|
||||
46
Android/android-ndk-r27d/simpleperf/report_sample_pb2.py
Normal file
@ -0,0 +1,46 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
# Generated by the protocol buffer compiler. DO NOT EDIT!
|
||||
# source: cmd_report_sample.proto
|
||||
"""Generated protocol buffer code."""
|
||||
from google.protobuf.internal import builder as _builder
|
||||
from google.protobuf import descriptor as _descriptor
|
||||
from google.protobuf import descriptor_pool as _descriptor_pool
|
||||
from google.protobuf import symbol_database as _symbol_database
|
||||
# @@protoc_insertion_point(imports)
|
||||
|
||||
_sym_db = _symbol_database.Default()
|
||||
|
||||
|
||||
|
||||
|
||||
DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x17\x63md_report_sample.proto\x12\x17simpleperf_report_proto\"\xed\x06\n\x06Sample\x12\x0c\n\x04time\x18\x01 \x01(\x04\x12\x11\n\tthread_id\x18\x02 \x01(\x05\x12\x41\n\tcallchain\x18\x03 \x03(\x0b\x32..simpleperf_report_proto.Sample.CallChainEntry\x12\x13\n\x0b\x65vent_count\x18\x04 \x01(\x04\x12\x15\n\revent_type_id\x18\x05 \x01(\r\x12I\n\x10unwinding_result\x18\x06 \x01(\x0b\x32/.simpleperf_report_proto.Sample.UnwindingResult\x1a\x94\x02\n\x0e\x43\x61llChainEntry\x12\x15\n\rvaddr_in_file\x18\x01 \x01(\x04\x12\x0f\n\x07\x66ile_id\x18\x02 \x01(\r\x12\x11\n\tsymbol_id\x18\x03 \x01(\x05\x12\x63\n\x0e\x65xecution_type\x18\x04 \x01(\x0e\x32<.simpleperf_report_proto.Sample.CallChainEntry.ExecutionType:\rNATIVE_METHOD\"b\n\rExecutionType\x12\x11\n\rNATIVE_METHOD\x10\x00\x12\x1a\n\x16INTERPRETED_JVM_METHOD\x10\x01\x12\x12\n\x0eJIT_JVM_METHOD\x10\x02\x12\x0e\n\nART_METHOD\x10\x03\x1a\xf0\x02\n\x0fUnwindingResult\x12\x16\n\x0eraw_error_code\x18\x01 \x01(\r\x12\x12\n\nerror_addr\x18\x02 \x01(\x04\x12M\n\nerror_code\x18\x03 \x01(\x0e\x32\x39.simpleperf_report_proto.Sample.UnwindingResult.ErrorCode\"\xe1\x01\n\tErrorCode\x12\x0e\n\nERROR_NONE\x10\x00\x12\x11\n\rERROR_UNKNOWN\x10\x01\x12\x1a\n\x16\x45RROR_NOT_ENOUGH_STACK\x10\x02\x12\x18\n\x14\x45RROR_MEMORY_INVALID\x10\x03\x12\x15\n\x11\x45RROR_UNWIND_INFO\x10\x04\x12\x15\n\x11\x45RROR_INVALID_MAP\x10\x05\x12\x1c\n\x18\x45RROR_MAX_FRAME_EXCEEDED\x10\x06\x12\x18\n\x14\x45RROR_REPEATED_FRAME\x10\x07\x12\x15\n\x11\x45RROR_INVALID_ELF\x10\x08\"9\n\rLostSituation\x12\x14\n\x0csample_count\x18\x01 \x01(\x04\x12\x12\n\nlost_count\x18\x02 \x01(\x04\"H\n\x04\x46ile\x12\n\n\x02id\x18\x01 \x01(\r\x12\x0c\n\x04path\x18\x02 \x01(\t\x12\x0e\n\x06symbol\x18\x03 \x03(\t\x12\x16\n\x0emangled_symbol\x18\x04 \x03(\t\"D\n\x06Thread\x12\x11\n\tthread_id\x18\x01 \x01(\r\x12\x12\n\nprocess_id\x18\x02 \x01(\r\x12\x13\n\x0bthread_name\x18\x03 \x01(\t\"\x99\x01\n\x08MetaInfo\x12\x12\n\nevent_type\x18\x01 \x03(\t\x12\x18\n\x10\x61pp_package_name\x18\x02 \x01(\t\x12\x10\n\x08\x61pp_type\x18\x03 \x01(\t\x12\x1b\n\x13\x61ndroid_sdk_version\x18\x04 \x01(\t\x12\x1a\n\x12\x61ndroid_build_type\x18\x05 \x01(\t\x12\x14\n\x0ctrace_offcpu\x18\x06 \x01(\x08\"C\n\rContextSwitch\x12\x11\n\tswitch_on\x18\x01 \x01(\x08\x12\x0c\n\x04time\x18\x02 \x01(\x04\x12\x11\n\tthread_id\x18\x03 \x01(\r\"\xde\x02\n\x06Record\x12\x31\n\x06sample\x18\x01 \x01(\x0b\x32\x1f.simpleperf_report_proto.SampleH\x00\x12\x36\n\x04lost\x18\x02 \x01(\x0b\x32&.simpleperf_report_proto.LostSituationH\x00\x12-\n\x04\x66ile\x18\x03 \x01(\x0b\x32\x1d.simpleperf_report_proto.FileH\x00\x12\x31\n\x06thread\x18\x04 \x01(\x0b\x32\x1f.simpleperf_report_proto.ThreadH\x00\x12\x36\n\tmeta_info\x18\x05 \x01(\x0b\x32!.simpleperf_report_proto.MetaInfoH\x00\x12@\n\x0e\x63ontext_switch\x18\x06 \x01(\x0b\x32&.simpleperf_report_proto.ContextSwitchH\x00\x42\r\n\x0brecord_dataB6\n com.android.tools.profiler.protoB\x10SimpleperfReportH\x03')
|
||||
|
||||
_builder.BuildMessageAndEnumDescriptors(DESCRIPTOR, globals())
|
||||
_builder.BuildTopDescriptorsAndMessages(DESCRIPTOR, 'cmd_report_sample_pb2', globals())
|
||||
if _descriptor._USE_C_DESCRIPTORS == False:
|
||||
|
||||
DESCRIPTOR._options = None
|
||||
DESCRIPTOR._serialized_options = b'\n com.android.tools.profiler.protoB\020SimpleperfReportH\003'
|
||||
_SAMPLE._serialized_start=53
|
||||
_SAMPLE._serialized_end=930
|
||||
_SAMPLE_CALLCHAINENTRY._serialized_start=283
|
||||
_SAMPLE_CALLCHAINENTRY._serialized_end=559
|
||||
_SAMPLE_CALLCHAINENTRY_EXECUTIONTYPE._serialized_start=461
|
||||
_SAMPLE_CALLCHAINENTRY_EXECUTIONTYPE._serialized_end=559
|
||||
_SAMPLE_UNWINDINGRESULT._serialized_start=562
|
||||
_SAMPLE_UNWINDINGRESULT._serialized_end=930
|
||||
_SAMPLE_UNWINDINGRESULT_ERRORCODE._serialized_start=705
|
||||
_SAMPLE_UNWINDINGRESULT_ERRORCODE._serialized_end=930
|
||||
_LOSTSITUATION._serialized_start=932
|
||||
_LOSTSITUATION._serialized_end=989
|
||||
_FILE._serialized_start=991
|
||||
_FILE._serialized_end=1063
|
||||
_THREAD._serialized_start=1065
|
||||
_THREAD._serialized_end=1133
|
||||
_METAINFO._serialized_start=1136
|
||||
_METAINFO._serialized_end=1289
|
||||
_CONTEXTSWITCH._serialized_start=1291
|
||||
_CONTEXTSWITCH._serialized_end=1358
|
||||
_RECORD._serialized_start=1361
|
||||
_RECORD._serialized_end=1711
|
||||
# @@protoc_insertion_point(module_scope)
|
||||
39
Android/android-ndk-r27d/simpleperf/run_simpleperf_on_device.py
Normal file
@ -0,0 +1,39 @@
|
||||
#!/usr/bin/env python3
|
||||
#
|
||||
# Copyright (C) 2017 The Android Open Source Project
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
"""run_simpleperf_on_device.py:
|
||||
It downloads simpleperf to /data/local/tmp on the device, and runs it with all given arguments.
|
||||
It saves the time of manually downloading simpleperf and invoking it through `adb shell`.
|
||||
"""
|
||||
import subprocess
|
||||
import sys
|
||||
from simpleperf_utils import AdbHelper, get_target_binary_path, Log
|
||||
|
||||
|
||||
def main():
|
||||
Log.init()
|
||||
adb = AdbHelper()
|
||||
device_arch = adb.get_device_arch()
|
||||
simpleperf_binary = get_target_binary_path(device_arch, 'simpleperf')
|
||||
adb.check_run(['push', simpleperf_binary, '/data/local/tmp'])
|
||||
adb.check_run(['shell', 'chmod', 'a+x', '/data/local/tmp/simpleperf'])
|
||||
shell_cmd = 'cd /data/local/tmp && ./simpleperf ' + ' '.join(sys.argv[1:])
|
||||
sys.exit(subprocess.call([adb.adb_path, 'shell', shell_cmd]))
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
|
||||
105
Android/android-ndk-r27d/simpleperf/run_simpleperf_without_usb_connection.py
Normal file
@ -0,0 +1,105 @@
|
||||
#!/usr/bin/env python3
|
||||
#
|
||||
# Copyright (C) 2018 The Android Open Source Project
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
"""
|
||||
Support profiling without a USB connection, using the steps below:
|
||||
1. With the USB cable connected, start simpleperf recording.
|
||||
2. Unplug the USB cable and use the app you want to profile, while the
|
||||
simpleperf process keeps running and collecting samples.
|
||||
3. Replug the USB cable, stop the simpleperf recording, and pull the recording file to the host.
|
||||
|
||||
Note that recording is stopped once the app is killed. So if you restart the app
|
||||
during profiling, simpleperf only records the first run.
|
||||
"""
|
||||
|
||||
import logging
|
||||
import subprocess
|
||||
import sys
|
||||
import time
|
||||
|
||||
from simpleperf_utils import AdbHelper, BaseArgumentParser, get_target_binary_path
|
||||
|
||||
|
||||
def start_recording(args):
|
||||
adb = AdbHelper()
|
||||
device_arch = adb.get_device_arch()
|
||||
simpleperf_binary = get_target_binary_path(device_arch, 'simpleperf')
|
||||
adb.check_run(['push', simpleperf_binary, '/data/local/tmp'])
|
||||
adb.check_run(['shell', 'chmod', 'a+x', '/data/local/tmp/simpleperf'])
|
||||
adb.check_run(['shell', 'rm', '-rf', '/data/local/tmp/perf.data',
|
||||
'/data/local/tmp/simpleperf_output'])
|
||||
shell_cmd = 'cd /data/local/tmp && nohup ./simpleperf record ' + args.record_options
|
||||
if args.app:
|
||||
shell_cmd += ' --app ' + args.app
|
||||
if args.pid:
|
||||
shell_cmd += ' -p ' + args.pid
|
||||
if args.size_limit:
|
||||
shell_cmd += ' --size-limit ' + args.size_limit
|
||||
shell_cmd += ' >/data/local/tmp/simpleperf_output 2>&1'
|
||||
print('shell_cmd: %s' % shell_cmd)
|
||||
subproc = subprocess.Popen([adb.adb_path, 'shell', shell_cmd])
|
||||
# Wait 2 seconds to see if the simpleperf command fails to start.
|
||||
time.sleep(2)
|
||||
if subproc.poll() is None:
|
||||
print('Simpleperf recording has started. Please unplug the usb cable and run the app.')
|
||||
print('After that, run `%s stop` to get recording result.' % sys.argv[0])
|
||||
else:
|
||||
adb.run(['shell', 'cat', '/data/local/tmp/simpleperf_output'])
|
||||
sys.exit(subproc.returncode)
|
||||
|
||||
|
||||
def stop_recording(args):
|
||||
adb = AdbHelper()
|
||||
result = adb.run(['shell', 'pidof', 'simpleperf'])
|
||||
if not result:
|
||||
logging.warning('No simpleperf process on device. The recording has ended.')
|
||||
else:
|
||||
adb.run(['shell', 'pkill', '-l', '2', 'simpleperf'])
|
||||
print('Waiting for simpleperf process to finish...')
|
||||
while adb.run(['shell', 'pidof', 'simpleperf']):
|
||||
time.sleep(1)
|
||||
adb.run(['shell', 'cat', '/data/local/tmp/simpleperf_output'])
|
||||
adb.check_run(['pull', '/data/local/tmp/perf.data', args.perf_data_path])
|
||||
print('The recording data has been collected in %s.' % args.perf_data_path)
|
||||
|
||||
|
||||
def main():
|
||||
parser = BaseArgumentParser(description=__doc__)
|
||||
subparsers = parser.add_subparsers()
|
||||
start_parser = subparsers.add_parser('start', help='Start recording.')
|
||||
start_parser.add_argument('-r', '--record_options',
|
||||
default='-e task-clock:u -g',
|
||||
help="""Set options for `simpleperf record` command.
|
||||
Default is `-e task-clock:u -g`.""")
|
||||
start_parser.add_argument('-p', '--app', help="""Profile an Android app, given the package
|
||||
name. Like `-p com.example.android.myapp`.""")
|
||||
start_parser.add_argument('--pid', help="""Profile an Android app, given the process id.
|
||||
Like `--pid 918`.""")
|
||||
start_parser.add_argument('--size_limit', type=str,
|
||||
help="""Stop profiling when recording data reaches
|
||||
[size_limit][K|M|G] bytes. Like `--size_limit 1M`.""")
|
||||
start_parser.set_defaults(func=start_recording)
|
||||
stop_parser = subparsers.add_parser('stop', help='Stop recording.')
|
||||
stop_parser.add_argument('-o', '--perf_data_path', default='perf.data', help="""The path to
|
||||
store profiling data on host. Default is perf.data.""")
|
||||
stop_parser.set_defaults(func=stop_recording)
|
||||
args = parser.parse_args()
|
||||
args.func(args)
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
|
||||
130
Android/android-ndk-r27d/simpleperf/sample_filter.py
Normal file
@ -0,0 +1,130 @@
|
||||
#!/usr/bin/env python3
|
||||
#
|
||||
# Copyright (C) 2024 The Android Open Source Project
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
"""sample_filter.py: generate sample filter files, which can be passed in the
|
||||
--filter-file option when reporting.
|
||||
|
||||
Example:
|
||||
./sample_filter.py -i perf.data --split-time-range 2 -o sample_filter
|
||||
./gecko_profile_generator.py -i perf.data --filter-file sample_filter_part1 \
|
||||
| gzip >profile-part1.json.gz
|
||||
./gecko_profile_generator.py -i perf.data --filter-file sample_filter_part2 \
|
||||
| gzip >profile-part2.json.gz
|
||||
"""
|
||||
|
||||
import logging
|
||||
from simpleperf_report_lib import ReportLib
|
||||
from simpleperf_utils import BaseArgumentParser
|
||||
from typing import Tuple
|
||||
|
||||
|
||||
class RecordFileReader:
|
||||
def __init__(self, record_file: str):
|
||||
self.record_file = record_file
|
||||
|
||||
def get_time_range(self) -> Tuple[int, int]:
|
||||
""" Return a tuple of (min_timestamp, max_timestamp). """
|
||||
min_timestamp = 0
|
||||
max_timestamp = 0
|
||||
lib = ReportLib()
|
||||
lib.SetRecordFile(self.record_file)
|
||||
while True:
|
||||
sample = lib.GetNextSample()
|
||||
if not sample:
|
||||
break
|
||||
if not min_timestamp or sample.time < min_timestamp:
|
||||
min_timestamp = sample.time
|
||||
if not max_timestamp or sample.time > max_timestamp:
|
||||
max_timestamp = sample.time
|
||||
lib.Close()
|
||||
return (min_timestamp, max_timestamp)
|
||||
|
||||
|
||||
def show_time_range(record_file: str) -> None:
|
||||
reader = RecordFileReader(record_file)
|
||||
time_range = reader.get_time_range()
|
||||
print('time range of samples is %.3f s' % ((time_range[1] - time_range[0]) / 1e9))
|
||||
|
||||
|
||||
def filter_samples(
|
||||
record_file: str, split_time_range: int, exclude_first_seconds: int,
|
||||
exclude_last_seconds: int, output_file_prefix: str) -> None:
|
||||
reader = RecordFileReader(record_file)
|
||||
min_timestamp, max_timestamp = reader.get_time_range()
|
||||
comment = 'total time range: %d seconds' % ((max_timestamp - min_timestamp) // 1e9)
|
||||
if exclude_first_seconds:
|
||||
min_timestamp += int(exclude_first_seconds * 1e9)
|
||||
comment += ', exclude first %d seconds' % exclude_first_seconds
|
||||
if exclude_last_seconds:
|
||||
max_timestamp -= int(exclude_last_seconds * 1e9)
|
||||
comment += ', exclude last %d seconds' % exclude_last_seconds
|
||||
if min_timestamp > max_timestamp:
|
||||
logging.error('All samples are filtered out')
|
||||
return
|
||||
if not split_time_range:
|
||||
output_file = output_file_prefix
|
||||
with open(output_file, 'w') as fh:
|
||||
fh.write('// %s\n' % comment)
|
||||
fh.write('GLOBAL_BEGIN %d\n' % min_timestamp)
|
||||
fh.write('GLOBAL_END %d\n' % max_timestamp)
|
||||
print('Generate sample filter file: %s' % output_file)
|
||||
else:
|
||||
step = (max_timestamp - min_timestamp) // split_time_range
|
||||
cur_timestamp = min_timestamp
|
||||
for i in range(split_time_range):
|
||||
output_file = output_file_prefix + '_part%s' % (i + 1)
|
||||
with open(output_file, 'w') as fh:
|
||||
time_range_comment = 'current range: %d to %d seconds' % (
|
||||
(cur_timestamp - min_timestamp) // 1e9,
|
||||
(cur_timestamp + step - min_timestamp) // 1e9)
|
||||
fh.write('// %s, %s\n' % (comment, time_range_comment))
|
||||
fh.write('GLOBAL_BEGIN %d\n' % cur_timestamp)
|
||||
if i == split_time_range - 1:
|
||||
cur_timestamp = max_timestamp
|
||||
else:
|
||||
cur_timestamp += step
|
||||
fh.write('GLOBAL_END %d\n' % (cur_timestamp + 1))
|
||||
cur_timestamp += 1
|
||||
print('Generate sample filter file: %s' % output_file)
|
||||
|
||||
|
||||
def main():
|
||||
parser = BaseArgumentParser(description=__doc__)
|
||||
parser.add_argument('-i', '--record-file', nargs='?', default='perf.data',
|
||||
help='Default is perf.data.')
|
||||
parser.add_argument('--show-time-range', action='store_true', help='show time range of samples')
|
||||
parser.add_argument('--split-time-range', type=int,
|
||||
help='split time ranges of samples into several parts')
|
||||
parser.add_argument('--exclude-first-seconds', type=int,
|
||||
help='exclude samples recorded in the first seconds')
|
||||
parser.add_argument('--exclude-last-seconds', type=int,
|
||||
help='exclude samples recorded in the last seconds')
|
||||
parser.add_argument(
|
||||
'-o', '--output-file-prefix', default='sample_filter',
|
||||
help='prefix for the generated sample filter files')
|
||||
args = parser.parse_args()
|
||||
|
||||
if args.show_time_range:
|
||||
show_time_range(args.record_file)
|
||||
|
||||
if args.split_time_range or args.exclude_first_seconds or args.exclude_last_seconds:
|
||||
filter_samples(args.record_file, args.split_time_range, args.exclude_first_seconds,
|
||||
args.exclude_last_seconds, args.output_file_prefix)
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
|
||||
691
Android/android-ndk-r27d/simpleperf/simpleperf_report_lib.py
Normal file
@ -0,0 +1,691 @@
|
||||
#!/usr/bin/env python3
|
||||
#
|
||||
# Copyright (C) 2016 The Android Open Source Project
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
#
|
||||
|
||||
"""simpleperf_report_lib.py: a python wrapper of libsimpleperf_report.so.
|
||||
Used to access samples in perf.data.
|
||||
|
||||
"""
|
||||
|
||||
import collections
|
||||
from collections import namedtuple
|
||||
import ctypes as ct
|
||||
from pathlib import Path
|
||||
import struct
|
||||
from typing import Any, Dict, List, Optional, Union
|
||||
|
||||
from simpleperf_utils import (bytes_to_str, get_host_binary_path, is_windows, log_exit,
|
||||
str_to_bytes, ReportLibOptions)
|
||||
|
||||
|
||||
def _is_null(p: Optional[ct._Pointer]) -> bool:
|
||||
if p:
|
||||
return False
|
||||
return ct.cast(p, ct.c_void_p).value is None
|
||||
|
||||
|
||||
def _char_pt(s: str) -> bytes:
|
||||
return str_to_bytes(s)
|
||||
|
||||
|
||||
def _char_pt_to_str(char_pt: ct.c_char_p) -> str:
|
||||
return bytes_to_str(char_pt)
|
||||
|
||||
|
||||
def _check(cond: bool, failmsg: str):
|
||||
if not cond:
|
||||
raise RuntimeError(failmsg)
|
||||
|
||||
|
||||
class SampleStruct(ct.Structure):
|
||||
""" Instance of a sample in perf.data.
|
||||
ip: the program counter of the thread generating the sample.
|
||||
pid: process id (or thread group id) of the thread generating the sample.
|
||||
tid: thread id.
|
||||
thread_comm: thread name.
|
||||
time: time at which the sample was generated. The value is in nanoseconds.
|
||||
The clock is decided by the --clockid option in `simpleperf record`.
|
||||
in_kernel: whether the instruction is in kernel space or user space.
|
||||
cpu: the cpu generating the sample.
|
||||
period: count of events that have happened since the last sample. For example, if we use
|
||||
-e cpu-cycles, it means how many cpu-cycles have happened.
|
||||
If we use -e cpu-clock, it means how many nanoseconds have passed.
|
||||
"""
|
||||
_fields_ = [('ip', ct.c_uint64),
|
||||
('pid', ct.c_uint32),
|
||||
('tid', ct.c_uint32),
|
||||
('_thread_comm', ct.c_char_p),
|
||||
('time', ct.c_uint64),
|
||||
('_in_kernel', ct.c_uint32),
|
||||
('cpu', ct.c_uint32),
|
||||
('period', ct.c_uint64)]
|
||||
|
||||
@property
|
||||
def thread_comm(self) -> str:
|
||||
return _char_pt_to_str(self._thread_comm)
|
||||
|
||||
@property
|
||||
def in_kernel(self) -> bool:
|
||||
return bool(self._in_kernel)
|
||||
|
||||
|
||||
class TracingFieldFormatStruct(ct.Structure):
|
||||
"""Format of a tracing field.
|
||||
name: name of the field.
|
||||
offset: offset of the field in tracing data.
|
||||
elem_size: size of the element type.
|
||||
elem_count: the number of elements in this field, more than one if the field is an array.
|
||||
is_signed: whether the element type is signed or unsigned.
|
||||
is_dynamic: whether the element is a dynamic string.
|
||||
"""
|
||||
_fields_ = [('_name', ct.c_char_p),
|
||||
('offset', ct.c_uint32),
|
||||
('elem_size', ct.c_uint32),
|
||||
('elem_count', ct.c_uint32),
|
||||
('is_signed', ct.c_uint32),
|
||||
('is_dynamic', ct.c_uint32)]
|
||||
|
||||
_unpack_key_dict = {1: 'b', 2: 'h', 4: 'i', 8: 'q'}
|
||||
|
||||
@property
|
||||
def name(self) -> str:
|
||||
return _char_pt_to_str(self._name)
|
||||
|
||||
def parse_value(self, data: ct.c_char_p) -> Union[str, bytes, List[bytes]]:
|
||||
""" Parse value of a field in a tracepoint event.
|
||||
The return value depends on the type of the field, and can be an int value, a string,
|
||||
an array of int values, etc. If the type can't be parsed, return a byte array or an
|
||||
array of byte arrays.
|
||||
"""
|
||||
if self.is_dynamic:
|
||||
offset, max_len = struct.unpack('<HH', data[self.offset:self.offset + 4])
|
||||
length = 0
|
||||
while length < max_len and bytes_to_str(data[offset + length]) != '\x00':
|
||||
length += 1
|
||||
return bytes_to_str(data[offset: offset + length])
|
||||
|
||||
if self.elem_count > 1 and self.elem_size == 1:
|
||||
# Probably the field is a string.
|
||||
# Don't use self.is_signed, which has different values on x86 and arm.
|
||||
length = 0
|
||||
while length < self.elem_count and bytes_to_str(data[self.offset + length]) != '\x00':
|
||||
length += 1
|
||||
return bytes_to_str(data[self.offset: self.offset + length])
|
||||
unpack_key = self._unpack_key_dict.get(self.elem_size)
|
||||
if unpack_key:
|
||||
if not self.is_signed:
|
||||
unpack_key = unpack_key.upper()
|
||||
value = struct.unpack('%d%s' % (self.elem_count, unpack_key),
|
||||
data[self.offset:self.offset + self.elem_count * self.elem_size])
|
||||
else:
|
||||
# Since we don't know the element type, just return the bytes.
|
||||
value = []
|
||||
offset = self.offset
|
||||
for _ in range(self.elem_count):
|
||||
value.append(data[offset: offset + self.elem_size])
|
||||
offset += self.elem_size
|
||||
if self.elem_count == 1:
|
||||
value = value[0]
|
||||
return value
|
||||
|
||||
|
||||
class TracingDataFormatStruct(ct.Structure):
|
||||
"""Format of tracing data of a tracepoint event, like
|
||||
https://www.kernel.org/doc/html/latest/trace/events.html#event-formats.
|
||||
size: total size of all fields in the tracing data.
|
||||
field_count: the number of fields.
|
||||
fields: an array of fields.
|
||||
"""
|
||||
_fields_ = [('size', ct.c_uint32),
|
||||
('field_count', ct.c_uint32),
|
||||
('fields', ct.POINTER(TracingFieldFormatStruct))]
|
||||
|
||||
|
||||
class EventStruct(ct.Structure):
|
||||
"""Event type of a sample.
|
||||
name: name of the event type.
|
||||
tracing_data_format: only available when it is a tracepoint event.
|
||||
"""
|
||||
_fields_ = [('_name', ct.c_char_p),
|
||||
('tracing_data_format', TracingDataFormatStruct)]
|
||||
|
||||
@property
|
||||
def name(self) -> str:
|
||||
return _char_pt_to_str(self._name)
|
||||
|
||||
|
||||
class MappingStruct(ct.Structure):
|
||||
""" A mapping area in the monitored threads, like the content in /proc/<pid>/maps.
|
||||
start: start addr in memory.
|
||||
end: end addr in memory.
|
||||
pgoff: offset in the mapped shared library.
|
||||
"""
|
||||
_fields_ = [('start', ct.c_uint64),
|
||||
('end', ct.c_uint64),
|
||||
('pgoff', ct.c_uint64)]
|
||||
|
||||
|
||||
class SymbolStruct(ct.Structure):
|
||||
""" Symbol info of the instruction hit by a sample or a callchain entry of a sample.
|
||||
dso_name: path of the shared library containing the instruction.
|
||||
vaddr_in_file: virtual address of the instruction in the shared library.
|
||||
symbol_name: name of the function containing the instruction.
|
||||
symbol_addr: start addr of the function containing the instruction.
|
||||
symbol_len: length of the function in the shared library.
|
||||
mapping: the mapping area hit by the instruction.
|
||||
"""
|
||||
_fields_ = [('_dso_name', ct.c_char_p),
|
||||
('vaddr_in_file', ct.c_uint64),
|
||||
('_symbol_name', ct.c_char_p),
|
||||
('symbol_addr', ct.c_uint64),
|
||||
('symbol_len', ct.c_uint64),
|
||||
('mapping', ct.POINTER(MappingStruct))]
|
||||
|
||||
@property
|
||||
def dso_name(self) -> str:
|
||||
return _char_pt_to_str(self._dso_name)
|
||||
|
||||
@property
|
||||
def symbol_name(self) -> str:
|
||||
return _char_pt_to_str(self._symbol_name)
|
||||
|
||||
|
||||
class CallChainEntryStructure(ct.Structure):
|
||||
""" A callchain entry of a sample.
|
||||
ip: the address of the instruction of the callchain entry.
|
||||
symbol: symbol info of the callchain entry.
|
||||
"""
|
||||
_fields_ = [('ip', ct.c_uint64),
|
||||
('symbol', SymbolStruct)]
|
||||
|
||||
|
||||
class CallChainStructure(ct.Structure):
|
||||
""" Callchain info of a sample.
|
||||
nr: number of entries in the callchain.
|
||||
entries: a pointer to an array of CallChainEntryStructure.
|
||||
|
||||
For example, if a sample is generated when a thread is running function C
|
||||
with callchain function A -> function B -> function C.
|
||||
Then nr = 2, and entries = [function B, function A].
|
||||
"""
|
||||
_fields_ = [('nr', ct.c_uint32),
|
||||
('entries', ct.POINTER(CallChainEntryStructure))]
|
||||
|
||||
|
||||
class FeatureSectionStructure(ct.Structure):
|
||||
""" A feature section in perf.data to store information like record cmd, device arch, etc.
|
||||
data: a pointer to a buffer storing the section data.
|
||||
data_size: data size in bytes.
|
||||
"""
|
||||
_fields_ = [('data', ct.POINTER(ct.c_char)),
|
||||
('data_size', ct.c_uint32)]
|
||||
|
||||
|
||||
class ReportLibStructure(ct.Structure):
|
||||
_fields_ = []
|
||||
|
||||
|
||||
# pylint: disable=invalid-name
|
||||
class ReportLib(object):
|
||||
""" Read contents from perf.data. """
|
||||
|
||||
def __init__(self, native_lib_path: Optional[str] = None):
|
||||
if native_lib_path is None:
|
||||
native_lib_path = self._get_native_lib()
|
||||
|
||||
self._load_dependent_lib()
|
||||
self._lib = ct.CDLL(native_lib_path)
|
||||
self._CreateReportLibFunc = self._lib.CreateReportLib
|
||||
self._CreateReportLibFunc.restype = ct.POINTER(ReportLibStructure)
|
||||
self._DestroyReportLibFunc = self._lib.DestroyReportLib
|
||||
self._SetLogSeverityFunc = self._lib.SetLogSeverity
|
||||
self._SetSymfsFunc = self._lib.SetSymfs
|
||||
self._SetRecordFileFunc = self._lib.SetRecordFile
|
||||
self._SetKallsymsFileFunc = self._lib.SetKallsymsFile
|
||||
self._ShowIpForUnknownSymbolFunc = self._lib.ShowIpForUnknownSymbol
|
||||
self._ShowArtFramesFunc = self._lib.ShowArtFrames
|
||||
self._MergeJavaMethodsFunc = self._lib.MergeJavaMethods
|
||||
self._AddProguardMappingFileFunc = self._lib.AddProguardMappingFile
|
||||
self._AddProguardMappingFileFunc.restype = ct.c_bool
|
||||
self._GetSupportedTraceOffCpuModesFunc = self._lib.GetSupportedTraceOffCpuModes
|
||||
self._GetSupportedTraceOffCpuModesFunc.restype = ct.c_char_p
|
||||
self._SetTraceOffCpuModeFunc = self._lib.SetTraceOffCpuMode
|
||||
self._SetTraceOffCpuModeFunc.restype = ct.c_bool
|
||||
self._SetSampleFilterFunc = self._lib.SetSampleFilter
|
||||
self._SetSampleFilterFunc.restype = ct.c_bool
|
||||
self._AggregateThreadsFunc = self._lib.AggregateThreads
|
||||
self._AggregateThreadsFunc.restype = ct.c_bool
|
||||
self._GetNextSampleFunc = self._lib.GetNextSample
|
||||
self._GetNextSampleFunc.restype = ct.POINTER(SampleStruct)
|
||||
self._GetEventOfCurrentSampleFunc = self._lib.GetEventOfCurrentSample
|
||||
self._GetEventOfCurrentSampleFunc.restype = ct.POINTER(EventStruct)
|
||||
self._GetSymbolOfCurrentSampleFunc = self._lib.GetSymbolOfCurrentSample
|
||||
self._GetSymbolOfCurrentSampleFunc.restype = ct.POINTER(SymbolStruct)
|
||||
self._GetCallChainOfCurrentSampleFunc = self._lib.GetCallChainOfCurrentSample
|
||||
self._GetCallChainOfCurrentSampleFunc.restype = ct.POINTER(CallChainStructure)
|
||||
self._GetTracingDataOfCurrentSampleFunc = self._lib.GetTracingDataOfCurrentSample
|
||||
self._GetTracingDataOfCurrentSampleFunc.restype = ct.POINTER(ct.c_char)
|
||||
self._GetBuildIdForPathFunc = self._lib.GetBuildIdForPath
|
||||
self._GetBuildIdForPathFunc.restype = ct.c_char_p
|
||||
self._GetFeatureSection = self._lib.GetFeatureSection
|
||||
self._GetFeatureSection.restype = ct.POINTER(FeatureSectionStructure)
|
||||
self._instance = self._CreateReportLibFunc()
|
||||
assert not _is_null(self._instance)
|
||||
|
||||
self.meta_info: Optional[Dict[str, str]] = None
|
||||
self.current_sample: Optional[SampleStruct] = None
|
||||
self.record_cmd: Optional[str] = None
|
||||
|
||||
def _get_native_lib(self) -> str:
|
||||
return get_host_binary_path('libsimpleperf_report.so')
|
||||
|
||||
def _load_dependent_lib(self):
|
||||
# As the Windows DLL is built with MinGW, we need to load 'libwinpthread-1.dll'.
|
||||
if is_windows():
|
||||
self._libwinpthread = ct.CDLL(get_host_binary_path('libwinpthread-1.dll'))
|
||||
|
||||
def Close(self):
|
||||
if self._instance:
|
||||
self._DestroyReportLibFunc(self._instance)
|
||||
self._instance = None
|
||||
|
||||
def SetReportOptions(self, options: ReportLibOptions):
|
||||
""" Set report options in one call. """
|
||||
if options.proguard_mapping_files:
|
||||
for file_path in options.proguard_mapping_files:
|
||||
self.AddProguardMappingFile(file_path)
|
||||
if options.show_art_frames:
|
||||
self.ShowArtFrames(True)
|
||||
if options.trace_offcpu:
|
||||
self.SetTraceOffCpuMode(options.trace_offcpu)
|
||||
if options.sample_filters:
|
||||
self.SetSampleFilter(options.sample_filters)
|
||||
if options.aggregate_threads:
|
||||
self.AggregateThreads(options.aggregate_threads)
|
||||
|
||||
def SetLogSeverity(self, log_level: str = 'info'):
|
||||
""" Set log severity of native lib, can be verbose,debug,info,error,fatal."""
|
||||
cond: bool = self._SetLogSeverityFunc(self.getInstance(), _char_pt(log_level))
|
||||
_check(cond, 'Failed to set log level')
|
||||
|
||||
def SetSymfs(self, symfs_dir: str):
|
||||
""" Set directory used to find symbols."""
|
||||
cond: bool = self._SetSymfsFunc(self.getInstance(), _char_pt(symfs_dir))
|
||||
_check(cond, 'Failed to set symbols directory')
|
||||
|
||||
def SetRecordFile(self, record_file: str):
|
||||
""" Set the path of record file, like perf.data."""
|
||||
cond: bool = self._SetRecordFileFunc(self.getInstance(), _char_pt(record_file))
|
||||
_check(cond, 'Failed to set record file')
|
||||
|
||||
def ShowIpForUnknownSymbol(self):
|
||||
self._ShowIpForUnknownSymbolFunc(self.getInstance())
|
||||
|
||||
def ShowArtFrames(self, show: bool = True):
|
||||
""" Show frames of internal methods of the Java interpreter. """
|
||||
self._ShowArtFramesFunc(self.getInstance(), show)
|
||||
|
||||
def MergeJavaMethods(self, merge: bool = True):
|
||||
""" This option merges jitted java methods with the same name but in different jit
|
||||
symfiles. If possible, it also merges jitted methods with interpreted methods,
|
||||
by mapping jitted methods to their corresponding dex files.
|
||||
Side effects:
|
||||
It only works at method level, not instruction level.
|
||||
It makes symbol.vaddr_in_file and symbol.mapping not accurate for jitted methods.
|
||||
Java methods are merged by default.
|
||||
"""
|
||||
self._MergeJavaMethodsFunc(self.getInstance(), merge)
|
||||
|
||||
def AddProguardMappingFile(self, mapping_file: Union[str, Path]):
|
||||
""" Add proguard mapping.txt to de-obfuscate method names. """
|
||||
if not self._AddProguardMappingFileFunc(self.getInstance(), _char_pt(str(mapping_file))):
|
||||
raise ValueError(f'failed to add proguard mapping file: {mapping_file}')
|
||||
|
||||
def SetKallsymsFile(self, kallsym_file: str):
|
||||
""" Set the file path to a copy of the /proc/kallsyms file (for off device decoding) """
|
||||
cond: bool = self._SetKallsymsFileFunc(self.getInstance(), _char_pt(kallsym_file))
|
||||
_check(cond, 'Failed to set kallsyms file')
|
||||
|
||||
def GetSupportedTraceOffCpuModes(self) -> List[str]:
|
||||
""" Get trace-offcpu modes supported by the recording file. It should be called after
|
||||
SetRecordFile(). The modes are only available for profiles recorded with --trace-offcpu
|
||||
option. All possible modes are:
|
||||
on-cpu: report on-cpu samples with period representing time spent on cpu
|
||||
off-cpu: report off-cpu samples with period representing time spent off cpu
|
||||
on-off-cpu: report both on-cpu samples and off-cpu samples, which can be split
|
||||
by event name.
|
||||
mixed-on-off-cpu: report on-cpu and off-cpu samples under the same event name.
|
||||
"""
|
||||
modes_str = self._GetSupportedTraceOffCpuModesFunc(self.getInstance())
|
||||
_check(not _is_null(modes_str), 'Failed to call GetSupportedTraceOffCpuModes()')
|
||||
modes_str = _char_pt_to_str(modes_str)
|
||||
return modes_str.split(',') if modes_str else []
|
||||
|
||||
def SetTraceOffCpuMode(self, mode: str):
|
||||
""" Set trace-offcpu mode. It should be called after SetRecordFile(). The mode should be
|
||||
one of the modes returned by GetSupportedTraceOffCpuModes().
|
||||
"""
|
||||
res: bool = self._SetTraceOffCpuModeFunc(self.getInstance(), _char_pt(mode))
|
||||
_check(res, f'Failed to call SetTraceOffCpuMode({mode})')
|
||||
|
||||
def SetSampleFilter(self, filters: List[str]):
|
||||
""" Set options used to filter samples. Available options are:
|
||||
--exclude-pid pid1,pid2,... Exclude samples for selected processes.
|
||||
--exclude-tid tid1,tid2,... Exclude samples for selected threads.
|
||||
--exclude-process-name process_name_regex Exclude samples for processes with name
|
||||
containing the regular expression.
|
||||
--exclude-thread-name thread_name_regex Exclude samples for threads with name
|
||||
containing the regular expression.
|
||||
--include-pid pid1,pid2,... Include samples for selected processes.
|
||||
--include-tid tid1,tid2,... Include samples for selected threads.
|
||||
--include-process-name process_name_regex Include samples for processes with name
|
||||
containing the regular expression.
|
||||
--include-thread-name thread_name_regex Include samples for threads with name
|
||||
containing the regular expression.
|
||||
--filter-file <file> Use filter file to filter samples based on timestamps. The
|
||||
file format is in doc/sampler_filter.md.
|
||||
|
||||
The filter argument should be a concatenation of options.
|
||||
"""
|
||||
filter_array = (ct.c_char_p * len(filters))()
|
||||
filter_array[:] = [_char_pt(f) for f in filters]
|
||||
res: bool = self._SetSampleFilterFunc(self.getInstance(), filter_array, len(filters))
|
||||
_check(res, f'Failed to call SetSampleFilter({filters})')
|
||||
|
||||
def AggregateThreads(self, thread_name_regex_list: List[str]):
|
||||
""" Given a list of thread name regex, threads with names matching the same regex are merged
|
||||
into one thread. As a result, samples from different threads (like a thread pool) can be
|
||||
shown in one flamegraph.
|
||||
"""
|
||||
regex_array = (ct.c_char_p * len(thread_name_regex_list))()
|
||||
regex_array[:] = [_char_pt(f) for f in thread_name_regex_list]
|
||||
res: bool = self._AggregateThreadsFunc(
|
||||
self.getInstance(),
|
||||
regex_array, len(thread_name_regex_list))
|
||||
_check(res, f'Failed to call AggregateThreads({thread_name_regex_list})')
|
||||
|
||||
def GetNextSample(self) -> Optional[SampleStruct]:
|
||||
""" Return the next sample. If no more samples, return None. """
|
||||
psample = self._GetNextSampleFunc(self.getInstance())
|
||||
if _is_null(psample):
|
||||
self.current_sample = None
|
||||
else:
|
||||
self.current_sample = psample[0]
|
||||
return self.current_sample
|
||||
|
||||
def GetCurrentSample(self) -> Optional[SampleStruct]:
|
||||
return self.current_sample
|
||||
|
||||
def GetEventOfCurrentSample(self) -> EventStruct:
|
||||
event = self._GetEventOfCurrentSampleFunc(self.getInstance())
|
||||
assert not _is_null(event)
|
||||
return event[0]
|
||||
|
||||
def GetSymbolOfCurrentSample(self) -> SymbolStruct:
|
||||
symbol = self._GetSymbolOfCurrentSampleFunc(self.getInstance())
|
||||
assert not _is_null(symbol)
|
||||
return symbol[0]
|
||||
|
||||
def GetCallChainOfCurrentSample(self) -> CallChainStructure:
|
||||
callchain = self._GetCallChainOfCurrentSampleFunc(self.getInstance())
|
||||
assert not _is_null(callchain)
|
||||
return callchain[0]
|
||||
|
||||
def GetTracingDataOfCurrentSample(self) -> Optional[Dict[str, Any]]:
|
||||
data = self._GetTracingDataOfCurrentSampleFunc(self.getInstance())
|
||||
if _is_null(data):
|
||||
return None
|
||||
event = self.GetEventOfCurrentSample()
|
||||
result = collections.OrderedDict()
|
||||
for i in range(event.tracing_data_format.field_count):
|
||||
field = event.tracing_data_format.fields[i]
|
||||
result[field.name] = field.parse_value(data)
|
||||
return result
|
||||
|
||||
def GetBuildIdForPath(self, path: str) -> str:
|
||||
build_id = self._GetBuildIdForPathFunc(self.getInstance(), _char_pt(path))
|
||||
assert not _is_null(build_id)
|
||||
return _char_pt_to_str(build_id)
|
||||
|
||||
def GetRecordCmd(self) -> str:
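# The 'cmdline' feature section is parsed as a uint32 argument count followed by
# length-prefixed strings; the arguments are re-joined into the original
# `simpleperf record` command line, quoting arguments that contain spaces.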
|
||||
if self.record_cmd is not None:
|
||||
return self.record_cmd
|
||||
self.record_cmd = ''
|
||||
feature_data = self._GetFeatureSection(self.getInstance(), _char_pt('cmdline'))
|
||||
if not _is_null(feature_data):
|
||||
void_p = ct.cast(feature_data[0].data, ct.c_void_p)
|
||||
arg_count = ct.cast(void_p, ct.POINTER(ct.c_uint32)).contents.value
|
||||
void_p.value += 4
|
||||
args = []
|
||||
for _ in range(arg_count):
|
||||
str_len = ct.cast(void_p, ct.POINTER(ct.c_uint32)).contents.value
|
||||
void_p.value += 4
|
||||
char_p = ct.cast(void_p, ct.POINTER(ct.c_char))
|
||||
current_str = ''
|
||||
for j in range(str_len):
|
||||
c = bytes_to_str(char_p[j])
|
||||
if c != '\0':
|
||||
current_str += c
|
||||
if ' ' in current_str:
|
||||
current_str = '"' + current_str + '"'
|
||||
args.append(current_str)
|
||||
void_p.value += str_len
|
||||
self.record_cmd = ' '.join(args)
|
||||
return self.record_cmd
|
||||
|
||||
def _GetFeatureString(self, feature_name: str) -> str:
|
||||
feature_data = self._GetFeatureSection(self.getInstance(), _char_pt(feature_name))
|
||||
result = ''
|
||||
if not _is_null(feature_data):
|
||||
void_p = ct.cast(feature_data[0].data, ct.c_void_p)
|
||||
str_len = ct.cast(void_p, ct.POINTER(ct.c_uint32)).contents.value
|
||||
void_p.value += 4
|
||||
char_p = ct.cast(void_p, ct.POINTER(ct.c_char))
|
||||
for i in range(str_len):
|
||||
c = bytes_to_str(char_p[i])
|
||||
if c == '\0':
|
||||
break
|
||||
result += c
|
||||
return result
|
||||
|
||||
def GetArch(self) -> str:
|
||||
return self._GetFeatureString('arch')
|
||||
|
||||
def MetaInfo(self) -> Dict[str, str]:
|
||||
""" Return a string to string map stored in meta_info section in perf.data.
|
||||
It is used to pass some short meta information.
|
||||
"""
|
||||
if self.meta_info is None:
|
||||
self.meta_info = {}
|
||||
feature_data = self._GetFeatureSection(self.getInstance(), _char_pt('meta_info'))
|
||||
if not _is_null(feature_data):
|
||||
str_list = []
|
||||
data = feature_data[0].data
|
||||
data_size = feature_data[0].data_size
|
||||
current_str = ''
|
||||
for i in range(data_size):
|
||||
c = bytes_to_str(data[i])
|
||||
if c != '\0':
|
||||
current_str += c
|
||||
else:
|
||||
str_list.append(current_str)
|
||||
current_str = ''
|
||||
for i in range(0, len(str_list), 2):
|
||||
self.meta_info[str_list[i]] = str_list[i + 1]
|
||||
return self.meta_info
|
||||
|
||||
def getInstance(self) -> ct._Pointer:
|
||||
if self._instance is None:
|
||||
raise Exception('Instance is Closed')
|
||||
return self._instance
|
||||
|
||||
|
||||
ProtoSample = namedtuple('ProtoSample', ['ip', 'pid', 'tid',
|
||||
'thread_comm', 'time', 'in_kernel', 'cpu', 'period'])
|
||||
ProtoEvent = namedtuple('ProtoEvent', ['name', 'tracing_data_format'])
|
||||
ProtoSymbol = namedtuple(
|
||||
'ProtoSymbol',
|
||||
['dso_name', 'vaddr_in_file', 'symbol_name', 'symbol_addr', 'symbol_len', 'mapping'])
|
||||
ProtoMapping = namedtuple('ProtoMapping', ['start', 'end', 'pgoff'])
|
||||
ProtoCallChain = namedtuple('ProtoCallChain', ['nr', 'entries'])
|
||||
ProtoCallChainEntry = namedtuple('ProtoCallChainEntry', ['ip', 'symbol'])
|
||||
|
||||
|
||||
class ProtoFileReportLib:
|
||||
""" Read contents from profile in cmd_report_sample.proto format.
|
||||
It is generated by `simpleperf report-sample`.
|
||||
"""
|
||||
|
||||
@staticmethod
|
||||
def is_supported_format(record_file: str):
|
||||
with open(record_file, 'rb') as fh:
|
||||
if fh.read(10) == b'SIMPLEPERF':
|
||||
return True
|
||||
|
||||
@staticmethod
|
||||
def get_report_sample_pb2():
|
||||
try:
|
||||
import report_sample_pb2
|
||||
return report_sample_pb2
|
||||
except ImportError as e:
|
||||
log_exit(f'{e}\nprotobuf package is missing or too old. Please install it like ' +
|
||||
'`pip install protobuf==4.21`.')
|
||||
|
||||
def __init__(self):
|
||||
self.report_sample_pb2 = ProtoFileReportLib.get_report_sample_pb2()
|
||||
self.samples: List[self.report_sample_pb2.Sample] = []
|
||||
self.sample_index = -1
|
||||
self.files: List[self.report_sample_pb2.File] = []
|
||||
self.thread_map: Dict[int, self.report_sample_pb2.Thread] = {}
|
||||
self.meta_info: Optional[self.report_sample_pb2.MetaInfo] = None
|
||||
self.fake_mapping_starts = []
|
||||
|
||||
def Close(self):
|
||||
pass
|
||||
|
||||
def SetReportOptions(self, options: ReportLibOptions):
|
||||
pass
|
||||
|
||||
def SetLogSeverity(self, log_level: str = 'info'):
|
||||
pass
|
||||
|
||||
def SetSymfs(self, symfs_dir: str):
|
||||
pass
|
||||
|
||||
def SetRecordFile(self, record_file: str):
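# File layout parsed below: a 10-byte 'SIMPLEPERF' magic, a little-endian uint16
# version (must be 1), then records each prefixed by a little-endian uint32 size;
# a size of 0 marks the end of the record stream.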
|
||||
with open(record_file, 'rb') as fh:
|
||||
data = fh.read()
|
||||
_check(data[:10] == b'SIMPLEPERF', f'magic number mismatch: {data[:10]}')
|
||||
version = struct.unpack('<H', data[10:12])[0]
|
||||
_check(version == 1, f'version mismatch: {version}')
|
||||
i = 12
|
||||
while i < len(data):
|
||||
_check(i + 4 <= len(data), 'data format error')
|
||||
size = struct.unpack('<I', data[i:i + 4])[0]
|
||||
if size == 0:
|
||||
break
|
||||
i += 4
|
||||
_check(i + size <= len(data), 'data format error')
|
||||
record = self.report_sample_pb2.Record()
|
||||
record.ParseFromString(data[i: i + size])
|
||||
i += size
|
||||
if record.HasField('sample'):
|
||||
self.samples.append(record.sample)
|
||||
elif record.HasField('file'):
|
||||
self.files.append(record.file)
|
||||
elif record.HasField('thread'):
|
||||
self.thread_map[record.thread.thread_id] = record.thread
|
||||
elif record.HasField('meta_info'):
|
||||
self.meta_info = record.meta_info
|
||||
fake_mapping_start = 0
|
||||
for file in self.files:
|
||||
self.fake_mapping_starts.append(fake_mapping_start)
|
||||
fake_mapping_start += len(file.symbol) + 1
|
||||
|
||||
def ShowIpForUnknownSymbol(self):
|
||||
pass
|
||||
|
||||
def ShowArtFrames(self, show: bool = True):
|
||||
pass
|
||||
|
||||
def SetSampleFilter(self, filters: List[str]):
|
||||
raise NotImplementedError('sample filters are not implemented for report_sample profiles')
|
||||
|
||||
def GetNextSample(self) -> Optional[ProtoSample]:
|
||||
self.sample_index += 1
|
||||
return self.GetCurrentSample()
|
||||
|
||||
def GetCurrentSample(self) -> Optional[ProtoSample]:
|
||||
if self.sample_index >= len(self.samples):
|
||||
return None
|
||||
sample = self.samples[self.sample_index]
|
||||
thread = self.thread_map[sample.thread_id]
|
||||
return ProtoSample(
|
||||
ip=0, pid=thread.process_id, tid=thread.thread_id, thread_comm=thread.thread_name,
|
||||
time=sample.time, in_kernel=False, cpu=0, period=sample.event_count)
|
||||
|
||||
def GetEventOfCurrentSample(self) -> ProtoEvent:
|
||||
sample = self.samples[self.sample_index]
|
||||
event_name = self.meta_info.event_type[sample.event_type_id]
|
||||
return ProtoEvent(name=event_name, tracing_data_format=None)
|
||||
|
||||
def GetSymbolOfCurrentSample(self) -> ProtoSymbol:
|
||||
sample = self.samples[self.sample_index]
|
||||
node = sample.callchain[0]
|
||||
return self._build_symbol(node)
|
||||
|
||||
def GetCallChainOfCurrentSample(self) -> ProtoCallChain:
|
||||
entries = []
|
||||
sample = self.samples[self.sample_index]
|
||||
for node in sample.callchain[1:]:
|
||||
symbol = self._build_symbol(node)
|
||||
entries.append(ProtoCallChainEntry(ip=0, symbol=symbol))
|
||||
return ProtoCallChain(nr=len(entries), entries=entries)
|
||||
|
||||
def _build_symbol(self, node) -> ProtoSymbol:
|
||||
file = self.files[node.file_id]
|
||||
if node.symbol_id == -1:
|
||||
symbol_name = 'unknown'
|
||||
fake_symbol_addr = self.fake_mapping_starts[node.file_id] + len(file.symbol)
|
||||
fake_symbol_pgoff = 0
|
||||
else:
|
||||
symbol_name = file.symbol[node.symbol_id]
|
||||
fake_symbol_addr = self.fake_mapping_starts[node.file_id] + node.symbol_id + 1
|
||||
fake_symbol_pgoff = node.symbol_id + 1
|
||||
mapping = ProtoMapping(fake_symbol_addr, 1, fake_symbol_pgoff)
|
||||
return ProtoSymbol(dso_name=file.path, vaddr_in_file=node.vaddr_in_file,
|
||||
symbol_name=symbol_name, symbol_addr=0, symbol_len=1, mapping=[mapping])
|
||||
|
||||
def GetBuildIdForPath(self, path: str) -> str:
|
||||
return ''
|
||||
|
||||
def GetRecordCmd(self) -> str:
|
||||
return ''
|
||||
|
||||
def GetArch(self) -> str:
|
||||
return ''
|
||||
|
||||
def MetaInfo(self) -> Dict[str, str]:
|
||||
return {}
|
||||
|
||||
|
||||
def GetReportLib(record_file: str) -> Union[ReportLib, ProtoFileReportLib]:
|
||||
if ProtoFileReportLib.is_supported_format(record_file):
|
||||
lib = ProtoFileReportLib()
|
||||
else:
|
||||
lib = ReportLib()
|
||||
lib.SetRecordFile(record_file)
|
||||
return lib
|
||||
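For orientation, here is a minimal, illustrative sketch (not part of the file above) of how a report script can drive this interface. It uses only methods shown here, mirrors the loop in stackcollapse.py below, and the record file name is a placeholder:

    from simpleperf_report_lib import GetReportLib

    lib = GetReportLib('perf.data')  # a report_sample proto file also works; see is_supported_format()
    while True:
        sample = lib.GetNextSample()
        if sample is None:
            lib.Close()
            break
        event = lib.GetEventOfCurrentSample()
        symbol = lib.GetSymbolOfCurrentSample()
        # Print one line per sample: event name, leaf symbol, thread comm and tid.
        print('%s %s %s-%d' % (event.name, symbol.symbol_name, sample.thread_comm, sample.tid))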
1229
Android/android-ndk-r27d/simpleperf/simpleperf_utils.py
Normal file
136
Android/android-ndk-r27d/simpleperf/stackcollapse.py
Normal file
@ -0,0 +1,136 @@
#!/usr/bin/env python3
#
# Copyright (C) 2021 The Android Open Source Project
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

"""stackcollapse.py: convert perf.data to Brendan Gregg's "Folded Stacks" format,
which can be read by https://github.com/brendangregg/FlameGraph, and many
other tools.

Example:
  ./app_profiler.py
  ./stackcollapse.py | ~/FlameGraph/flamegraph.pl --color=java --countname=ns > flamegraph.svg
"""
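
# For reference, the folded output is one line per unique stack: the thread name
# (or "comm-pid" / "comm-pid/tid" with --pid / --tid), then the call-chain frames from
# root-most to leaf-most, joined by ';', followed by a space and the summed event count
# (sample.period). An illustrative line (the symbol names here are hypothetical):
#   RenderThread-1510/1523;__start_thread;__pthread_start;android::Surface::queueBuffer 1234567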

from collections import defaultdict
from simpleperf_report_lib import GetReportLib
from simpleperf_utils import BaseArgumentParser, flatten_arg_list, ReportLibOptions
from typing import DefaultDict, List, Optional, Set

import logging
import sys


def collapse_stacks(
        record_file: str,
        symfs_dir: str,
        kallsyms_file: str,
        event_filter: str,
        include_pid: bool,
        include_tid: bool,
        annotate_kernel: bool,
        annotate_jit: bool,
        include_addrs: bool,
        report_lib_options: ReportLibOptions):
    """read record_file, aggregate per-stack and print totals per-stack"""
    lib = GetReportLib(record_file)

    if include_addrs:
        lib.ShowIpForUnknownSymbol()
    if symfs_dir is not None:
        lib.SetSymfs(symfs_dir)
    if kallsyms_file is not None:
        lib.SetKallsymsFile(kallsyms_file)
    lib.SetReportOptions(report_lib_options)

    stacks: DefaultDict[str, int] = defaultdict(int)
    event_defaulted = False
    event_warning_shown = False
    while True:
        sample = lib.GetNextSample()
        if sample is None:
            lib.Close()
            break
        event = lib.GetEventOfCurrentSample()
        symbol = lib.GetSymbolOfCurrentSample()
        callchain = lib.GetCallChainOfCurrentSample()
        if not event_filter:
            event_filter = event.name
            event_defaulted = True
        elif event.name != event_filter:
            if event_defaulted and not event_warning_shown:
                logging.warning(
                    'Input has multiple event types. Filtering for the first event type seen: %s' %
                    event_filter)
                event_warning_shown = True
            continue

        stack = []
        for i in range(callchain.nr):
            entry = callchain.entries[i]
            func = entry.symbol.symbol_name
            # Annotate kernel frames (kallsyms or .ko modules) when requested.
            if annotate_kernel and ("kallsyms" in entry.symbol.dso_name
                                    or ".ko" in entry.symbol.dso_name):
                func += '_[k]'  # kernel
            if annotate_jit and entry.symbol.dso_name == "[JIT app cache]":
                func += '_[j]'  # jit
            stack.append(func)
        if include_tid:
            stack.append("%s-%d/%d" % (sample.thread_comm, sample.pid, sample.tid))
        elif include_pid:
            stack.append("%s-%d" % (sample.thread_comm, sample.pid))
        else:
            stack.append(sample.thread_comm)
        stack.reverse()
        stacks[";".join(stack)] += sample.period

    for k in sorted(stacks.keys()):
        print("%s %d" % (k, stacks[k]))


def main():
    parser = BaseArgumentParser(description=__doc__)
    parser.add_argument('--symfs',
                        help='Set the path to find binaries with symbols and debug info.')
    parser.add_argument('--kallsyms', help='Set the path to find kernel symbols.')
    parser.add_argument('-i', '--record_file', nargs='?', default='perf.data',
                        help='Default is perf.data.')
    parser.add_argument('--pid', action='store_true', help='Include PID with process names')
    parser.add_argument('--tid', action='store_true', help='Include TID and PID with process names')
    parser.add_argument('--kernel', action='store_true',
                        help='Annotate kernel functions with a _[k]')
    parser.add_argument('--jit', action='store_true', help='Annotate JIT functions with a _[j]')
    parser.add_argument('--addrs', action='store_true',
                        help='include raw addresses where symbols can\'t be found')
    sample_filter_group = parser.add_argument_group('Sample filter options')
    sample_filter_group.add_argument('--event-filter', nargs='?', default='',
                                     help='Event type filter e.g. "cpu-cycles" or "instructions"')
    parser.add_report_lib_options(sample_filter_group=sample_filter_group,
                                  sample_filter_with_pid_shortcut=False)
    args = parser.parse_args()
    collapse_stacks(
        record_file=args.record_file,
        symfs_dir=args.symfs,
        kallsyms_file=args.kallsyms,
        event_filter=args.event_filter,
        include_pid=args.pid,
        include_tid=args.tid,
        annotate_kernel=args.kernel,
        annotate_jit=args.jit,
        include_addrs=args.addrs,
        report_lib_options=args.report_lib_options)


if __name__ == '__main__':
    main()
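
A closing usage note (file name hypothetical): GetReportLib() above selects the backend by inspecting the record file, so -i may point either at a perf.data file or at the proto output of the report-sample command (the format parsed by ProtoFileReportLib), e.g.

  ./stackcollapse.py -i report_sample.trace | ~/FlameGraph/flamegraph.pl --countname=ns > flamegraph.svg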