Commit 2c59069

Add MultiNode TradeFed tests for --shards and with session retry

This builds upon the previous test setup and runner for TradeFed
(CTS/VTS) with the following major features and changes:
(1) Allow for "sharded" tests across multiple DUTs (speeds up test suite
    execution)
(2) Use session retry for stabilizing final failure counts
(3) Don't require DUT root access in any of the involved operations
(4) Try to be failure tolerant: Continue as long as at least one
    device is available. Re-establish lost connections if possible.
(5) Drop support for VTS (temporary solution, it was not tested at all)
(6) Add support for GTS and STS

Details related to the points above:
(1) Implemented using the LAVA MultiNode API. There should be one
    "master" job that executes the TradeFed runner and an arbitrary
    number of "workers" that host additional DUTs and make them
    accessible via adb tcpip. Continuous messaging between master and
    worker is used to recover lost devices when needed.
(3) Migrated features requiring root to non-root equivalents.
(5) No testing was done with VTS, therefore it was partly dropped for
    now.
(6) Apart from adjusted paths and command names, no changes were
    required compared to CTS.

This change would ideally be a replacement of the existing TradeFed
setup, or at least share more code with it. The remaining issues on the
way there are:
- (5): no VTS testing was done. This concerns the behavior of --retry,
  --shards, and the related shell output.
- A MultiNode environment is expected by this setup. The runner could be
  further generalized to be usable from a regular (non-MultiNode) test
  submission, so that in the basic case it would not introduce any
  overhead compared to the existing runner.

Change-Id: Idef4a5a9aac1f3cd8fc2aa1e609f544ee15ae528
Depends-On: I23f22344b9bd758d3898d4345204157cecd7d624
Depends-On: Ie011cb2cd899dac938066ca4eee7652b83ac38d4
Depends-On: I638d6a2cf44a5569569703308172e3056030783f
Depends-On: I912ab5168bbe1b4fc0a3a8112db5cbc94a812b7c
Signed-off-by: Karsten Tausche <karsten@fairphone.com>
1 parent 81ffea2 commit 2c59069
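
Point (1) of the commit message has each worker expose its DUT via adb tcpip and the master attach to those devices. The test definitions that do this are not part of the diff shown here, so the following is only a minimal sketch of the adb command lines involved. The port 5555, the helper names, and the example serial and address are assumptions for illustration, not taken from the commit.

```python
"""Hypothetical sketch: adb command lines for sharing a DUT over TCP/IP."""

# Assumption: the conventional adb TCP port; the commit does not state a port.
ADB_PORT = 5555


def worker_share_command(serial, port=ADB_PORT):
    """Command a worker could run so its locally attached DUT listens on TCP."""
    return ["adb", "-s", serial, "tcpip", str(port)]


def master_connect_command(device_ip, port=ADB_PORT):
    """Command the master could run to attach to a worker's DUT."""
    return ["adb", "connect", "{}:{}".format(device_ip, port)]


def remote_serial(device_ip, port=ADB_PORT):
    """Serial under which a TCP/IP-connected device is listed by `adb devices`."""
    return "{}:{}".format(device_ip, port)


# Example values (made up): a worker DUT with serial "abc123" at 192.0.2.10.
share_cmd = worker_share_command("abc123")
connect_cmd = master_connect_command("192.0.2.10")
serial_for_tradefed = remote_serial("192.0.2.10")
```

The functions only build argument lists, so they can be passed to subprocess.check_call() once real devices are available. With 1+n such serials visible on the master, TradeFed can distribute work across them via --shards.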

6 files changed: +1275 -0 lines changed

@@ -0,0 +1,204 @@
+job_name: MutliNode_xTS_template
+timeouts:
+  job:
+    hours: 2
+priority: medium
+visibility: public
+reboot_to_fastboot: false
+
+protocols:
+  lava-lxc:
+    master:
+      name: lxc-xts-master
+      template: debian
+      distribution: debian
+      release: stretch
+    worker:
+      name: lxc-xts-worker
+      template: debian
+      distribution: debian
+      release: stretch
+  lava-multinode:
+    # There must be one master and an arbitrary number of additional workers, so
+    # that 1+n devices will be available in the TradeFed shell.
+    roles:
+      master:
+        count: 1
+        device_type: # TODO (an Android device, e.g., nexus4)
+        timeout:
+          minutes: 30
+      worker:
+        expect_role: master
+        host_role: master
+        count: 2
+        device_type: # TODO (an Android device, e.g., nexus4)
+        timeout:
+          minutes: 30
+
+actions:
+- deploy:
+    namespace: tlxc
+    to: lxc
+    os: debian
+    role:
+    - master
+    - worker
+    timeout:
+      minutes: 10
+    packages:
+    - adb
+    - fastboot
+    - unzip
+    - wget
+    - zip
+
+- boot:
+    namespace: tlxc
+    role:
+    - master
+    - worker
+    prompts:
+    - 'root@(.*):/#'
+    timeout:
+      minutes: 5
+    method: lxc
+
+# TODO: Device-type specific deployment.
+# - deploy:
+#     namespace: droid
+#     to: fastboot
+#     role:
+#     - master
+#     - worker
+#     timeout:
+#       minutes: 30
+#     images:
+#       # TODO
+#     os: debian
+
+- boot:
+    namespace: droid
+    role:
+    - master
+    - worker
+    timeout:
+      minutes: 10
+    method: fastboot
+
+- test:
+    namespace: tlxc
+    role:
+    - master
+    - worker
+    timeout:
+      minutes: 20
+    definitions:
+    - repository: https://review.linaro.org/qa/test-definitions
+      from: git
+      path: automated/android/wait-single-boot-completed.yaml
+      name: wait-single-boot-completed
+
+- test:
+    namespace: tlxc
+    role:
+    - worker
+    timeout:
+      minutes: 5
+    definitions:
+    - repository: https://review.linaro.org/qa/test-definitions
+      from: git
+      path: automated/android/wait-single-network-connected.yaml
+      name: wait-single-network-connected
+
+- test:
+    namespace: tlxc
+    role:
+    - worker
+    timeout:
+      minutes: 15
+    definitions:
+    - repository: https://review.linaro.org/qa/test-definitions
+      from: git
+      path: automated/android/multinode/share-local-device-over-adb-tcpip.yaml
+      name: share-local-device-over-adb-tcpip
+      params:
+        TIMEOUT_SECS: "600"
+
+- test:
+    namespace: tlxc
+    role:
+    - worker
+    timeout:
+      hours: 1
+    definitions:
+    - repository: https://review.linaro.org/qa/test-definitions
+      from: git
+      path: automated/android/multinode/wait-and-keep-local-device-accessible.yaml
+      name: wait-and-keep-local-device-accessible
+      params:
+        # The sum of these timeouts must be smaller than the lava-multinode timeout for the master.
+        BOOT_TIMEOUT_SECS: "480"
+        NETWORK_TIMEOUT_SECS: "300"
+        ADB_CONNECT_TEST_TIMEOUT_SECS: "60"
+
+- test:
+    namespace: tlxc
+    role:
+    - worker
+    timeout:
+      minutes: 15
+    definitions:
+    - repository: https://review.linaro.org/qa/test-definitions
+      from: git
+      path: automated/android/multinode/wait-for-release-and-reset.yaml
+      name: wait-for-release-and-reset
+
+- test:
+    namespace: tlxc
+    role:
+    - master
+    timeout:
+      minutes: 15
+    definitions:
+    - repository: https://review.linaro.org/qa/test-definitions
+      from: git
+      path: automated/android/multinode/connect-to-remote-adb-tcpip-devices.yaml
+      name: connect-to-remote-adb-tcpip-devices
+      params:
+        ADB_CONNECT_TIMEOUT_SECS: 300
+
+- test:
+    namespace: tlxc
+    role:
+    - master
+    timeout:
+      hours: 1
+    definitions:
+    - repository: https://review.linaro.org/qa/test-definitions
+      from: git
+      path: automated/android/multinode/tradefed/tradefed-multinode.yaml
+      params:
+        TEST_PARAMS: "run cts --disable-reboot --include-filter CtsNetTestCases"
+        TEST_RETRY_PARAMS: "run cts --disable-reboot"
+        TEST_PATH: "android-cts"
+        TEST_URL: "https://dl.google.com/dl/android/cts/android-cts-7.1_r23-linux_x86-arm.zip"
+        STATE_CHECK_FREQUENCY_SECS: "300"
+        MAX_NUM_RUNS: "25"
+        RUNS_IF_UNCHANGED: "5"
+        FAILURES_PRINTED: "50"
+        # For Artifactorial:
+        # URL: ""
+        # TOKEN: ""
+      name: cts
+
+- test:
+    namespace: tlxc
+    role:
+    - master
+    timeout:
+      minutes: 10
+    definitions:
+    - repository: https://review.linaro.org/qa/test-definitions
+      from: git
+      path: automated/android/multinode/release-remote-adb-tcpip-devices.yaml
+      name: release-remote-adb-tcpip-devices
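
The MAX_NUM_RUNS and RUNS_IF_UNCHANGED parameters in the job template above are consumed by the TradeFed MultiNode runner, which is not part of this excerpt. Given the commit message's goal of stabilizing final failure counts, one plausible reading is: keep retrying until the failure count has stayed the same for RUNS_IF_UNCHANGED consecutive runs, or until MAX_NUM_RUNS runs have been executed. The sketch below implements only that assumed rule; the function name, the early exit on zero failures, and the sample failure counts are hypothetical.

```python
def count_runs_until_stable(failure_counts, max_num_runs=25, runs_if_unchanged=5):
    """Return how many runs are executed under the assumed stopping rule.

    failure_counts yields the failure count observed after each run
    (initial run first, then each retry).
    """
    runs = 0
    unchanged = 0  # consecutive runs whose failure count equals the previous one
    previous = None
    for failures in failure_counts:
        runs += 1
        if previous is not None and failures == previous:
            unchanged += 1
        else:
            unchanged = 0
        previous = failures
        if failures == 0:
            break  # assumption: nothing is left to retry
        if unchanged >= runs_if_unchanged or runs >= max_num_runs:
            break
    return runs


# Made-up sequence: failures drop to 5 on the third run and then stay there.
stable_after = count_runs_until_stable([10, 7, 5, 5, 5, 5, 5, 5, 5, 5])
# Made-up sequence that never satisfies the "unchanged" rule within the cap.
capped_at = count_runs_until_stable([3] * 100, runs_if_unchanged=50)
```

Under this reading, the defaults from the template (25 and 5) bound a flaky suite to at most 25 TradeFed sessions, and a suite whose failures have settled stops five retries after the last change.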
@@ -0,0 +1,144 @@
+"""Utilities for handling STS-specific behavior in TradeFed.
+
+The following behavior was noted at least in the 2018-09 version of STS for
+Android 7. When running STS, it manipulates the logged device fingerprint to
+show up as a 'user' build with 'release-keys', even when using the required
+setup with either 'userdebug' or 'eng' build. That behavior breaks the TradeFed
+rerun feature, as the fingerprint read from the device will not match the logged
+fingerprint of a previous run.
+
+StsUtil works around this behavior by reverting the manipulated fingerprint in
+the log file to the string reported by the device. tradefed-runner-multinode.py
+uses this module to apply STS workarounds automatically when STS is run.
+"""
+
+import os
+import shutil
+import subprocess
+import xml.etree.ElementTree as ET
+
+
+class StsUtil:
+    """Interface for STS related workarounds when automating TradeFed.
+
+    For applying StsUtil, use one instance per TradeFed STS invocation. Ideally,
+    construct it before running any tests, while the passed device is in a
+    known good state. Call fix_result_file_fingerprints() after each completed
+    run, before rerunning.
+
+    Applying StsUtil to non-STS TradeFed runs does not help, but should also not
+    affect the results in any way.
+    """
+
+    def __init__(
+        self, device_serial_or_address, logger, device_access_timeout_secs=60
+    ):
+        """Construct a StsUtil instance for a TradeFed invocation.
+
+        Args:
+            device_serial_or_address (str):
+                Serial number or network address of the device that will be used
+                to determine the reference fingerprint.
+            logger (logging.Logger):
+                Logger instance to redirect messages to.
+            device_access_timeout_secs (int):
+                Timeout in seconds for `adb` calls.
+        """
+
+        self.device_serial_or_address = device_serial_or_address
+        self.logger = logger
+        self.device_access_timeout_secs = device_access_timeout_secs
+        # Try reading the device fingerprint now. There is a better chance that
+        # the device is in a good state now than after a test run. If reading
+        # fails here, however, we can still retry in
+        # fix_result_file_fingerprints().
+        try:
+            self.device_fingerprint = self.read_device_fingerprint()
+        except subprocess.CalledProcessError:
+            self.device_fingerprint = None
+
+    def read_device_fingerprint(self):
+        """Read the fingerprint of device_serial_or_address via adb.
+
+        Returns:
+            str:
+                Fingerprint of the device.
+
+        Raises:
+            subprocess.CalledProcessError:
+                If the communication with `adb` does not lead to
+                expected results.
+        """
+
+        fingerprint = subprocess.check_output(
+            [
+                "adb",
+                "-s",
+                self.device_serial_or_address,
+                "shell",
+                "getprop",
+                "ro.build.fingerprint",
+            ],
+            universal_newlines=True,
+            timeout=self.device_access_timeout_secs,
+        ).rstrip()
+
+        self.logger.debug("Device reports fingerprint '%s'", fingerprint)
+
+        return fingerprint
+
+    def fix_result_file_fingerprints(self, result_dir):
+        """Fix STS-manipulated device fingerprints in result files.
+
+        This will replace the fingerprint in the result files with the correct
+        fingerprint as reported by the device.
+
+        Args:
+            result_dir (str):
+                Path to the result directory of the STS run to fix. This folder
+                must contain a test_result.xml and test_result_failures.html,
+                which are both present in a result folder of a completed
+                TradeFed run.
+
+        Raises:
+            subprocess.CalledProcessError:
+                If the device fingerprint could not be determined via adb.
+        """
+
+        if self.device_fingerprint is None:
+            self.device_fingerprint = self.read_device_fingerprint()
+
+        test_result_path = os.path.join(result_dir, "test_result.xml")
+        test_result_path_orig = test_result_path + ".orig"
+        shutil.move(test_result_path, test_result_path_orig)
+
+        test_result_failures_path = os.path.join(
+            result_dir, "test_result_failures.html"
+        )
+        test_result_failures_path_orig = test_result_failures_path + ".orig"
+        shutil.move(test_result_failures_path, test_result_failures_path_orig)
+
+        # Find the manipulated fingerprint in the result XML.
+        test_result_tree = ET.parse(test_result_path_orig)
+        result_build_node = test_result_tree.getroot().find("Build")
+        manipulated_fingerprint = result_build_node.get("build_fingerprint")
+
+        self.logger.debug(
+            "Reverting STS manipulated device fingerprint: '%s' -> '%s'",
+            manipulated_fingerprint,
+            self.device_fingerprint,
+        )
+
+        # Fix the fingerprint in the result file.
+        result_build_node.set("build_fingerprint", self.device_fingerprint)
+        test_result_tree.write(test_result_path)
+
+        # Fix the fingerprint in the failures overview HTML.
+        with open(
+            test_result_failures_path_orig, "r"
+        ) as test_result_failures_file:
+            test_result_failures = test_result_failures_file.read().replace(
+                manipulated_fingerprint, self.device_fingerprint
+            )
+        with open(test_result_failures_path, "w") as test_result_failures_file:
+            test_result_failures_file.write(test_result_failures)
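
The core of fix_result_file_fingerprints() is the ElementTree attribute rewrite on the Build node. The self-contained sketch below reproduces just that step on a throwaway file so it can be tried without a device. The helper name, the XML content, and both fingerprint strings are made up for illustration and are far simpler than a real test_result.xml. Unlike the method above, the sketch rewrites the file in place instead of keeping a .orig copy.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Made-up values for illustration only.
MANIPULATED = "vendor/product/device:7.1.2/ID/1:user/release-keys"
REPORTED = "vendor/product/device:7.1.2/ID/1:userdebug/test-keys"


def revert_fingerprint(xml_path, device_fingerprint):
    """Set Build/@build_fingerprint to device_fingerprint; return the old value."""
    tree = ET.parse(xml_path)
    build_node = tree.getroot().find("Build")
    old_fingerprint = build_node.get("build_fingerprint")
    build_node.set("build_fingerprint", device_fingerprint)
    tree.write(xml_path)
    return old_fingerprint


with tempfile.TemporaryDirectory() as result_dir:
    path = os.path.join(result_dir, "test_result.xml")
    # Minimal stand-in for a TradeFed result file.
    with open(path, "w") as result_file:
        result_file.write(
            '<Result><Build build_fingerprint="{}"/></Result>'.format(MANIPULATED)
        )
    replaced = revert_fingerprint(path, REPORTED)
    # Re-read the file to confirm the attribute was actually persisted.
    fixed = ET.parse(path).getroot().find("Build").get("build_fingerprint")
```

Returning the old value mirrors how the method above reuses the manipulated fingerprint for the plain string replacement in test_result_failures.html.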
