Artificial Intelligence for Automated Software Testing
1. Software Verification & Validation (SVV)
SnT, University of Luxembourg
Artificial Intelligence for Automated
Software Testing
Lionel Briand
ISSTA/ECOOP Summer School 2018
2. Objectives
• Applications of main AI techniques in test automation
• Not a short university course
• Overview (partial) with pointers for further information
• Industrial research projects
• Challenging as there is a lot of material and many techniques
involved
• Disclaimer: Inevitably biased presentation based on personal
experience
2
3. Acknowledgments
• Annibale Panichella, for introductory slides on Search-Based Software
Testing
• P. Repoussis, “Metaheuristic Algorithms: A brief introduction on the
basics you need to know” (Presentation)
• Ethem Alpaydin, “Introduction to Machine Learning”, MIT Press
• Amy Davis, “Overview of Natural Language Processing” (Presentation)
• Research projects: Shiva Nejati, Raja Ben Abdessalem, Annibale
Panichella, Sadeeq Jan, Andrea Arcuri, Dennis Appelt, Fabrizio Pastore,
Chunhui Wang … (sorry for those I forgot)
3
4. Biography
• 24 years of post-PhD research experience
• IEEE Fellow, Harlan Mills IEEE CS award
• Canada Research Chair, ERC Advanced grant
• ICSE PC co-chair in 2014
• EiC of Empirical Software Engineering (Springer) for 13 years
• Graduated 27 PhD students
• Worked with >30 industry partners (aerospace, automotive, health care, finance …)
• I like research driven by industrial problems
• H-index = 73, around 24K citations (for those interested in the “number game”)
4
5. Collaborative Research @ SnT
5
• Research in context
• Addresses actual needs
• Well-defined problem
• Long-term collaborations
• Our lab is the industry
6. SVV Dept.
6
• Established in 2012, part of the SnT centre
• Requirements Engineering, Security Analysis, Design Verification,
Automated Testing, Runtime Monitoring
• ~ 25 lab members
• Partnerships with industry
• ERC Advanced grant
7. Outline
• Introduction to software testing
• Introduction to relevant AI techniques
• Introduction to Search-Based Software Testing (SBST)
• Industrial research projects where AI was applied to testing
problems
• Lessons learned and the road ahead
7
9. Outline
• Quick overview of software testing
• The role of AI in automated software testing
• Metaheuristic search
• Machine learning
• Natural Language Processing (NLP)
9
10. Definition of Software Testing
• International Software Testing Qualifications Board:
“Software testing is a process of executing a program or
application with the intent of finding the software bugs. It can
also be stated as the process of validating and verifying that
a software program or application or product meets the
business and technical requirements that guided its design and
development.”
10
11. Software Testing Overview
11
SW Representation
(e.g., specifications)
SW Code
Derive Test cases
Execute Test cases
Compare
Expected
Results or properties
Get Test Results
Test Oracle
[Test Result == Oracle] / [Test Result != Oracle]
Automation!
12. Main Challenge
• The main challenge in testing software systems is
scalability
• Scalability: the extent to which a technique can be applied
to large or complex artifacts (e.g., input spaces, code,
models) and still provide useful, automated support with
acceptable effort, CPU time, and memory
• Effective automation is a prerequisite for scalability
12
13. Importance of Software Testing
• Software testing is the most prevalent verification and validation
technique in practice
• It represents a large percentage of software development costs,
e.g., >50% is not rare
• Testing services are a USD 9-Billion market
• The cost of software failures was estimated to be (a very minimum
of) USD 1.1 trillion in 2016
• Inadequate tools and technologies are among the most important
factors in testing costs and inefficiencies
13
14. Search-Based Software Testing
• Express test generation problem
as a search or optimization
problem
• Search for test input data with
certain properties, i.e., source
code coverage
• Non-linearity of software (if, loops,
…): complex, discontinuous, non-
linear search spaces
• Many search algorithms
(metaheuristics), from local
search to global search, e.g., Hill
Climbing, Simulated Annealing
and Genetic Algorithms
(excerpt from the cited paper)
Hill Climbing evaluates points in the search space neighbouring the current candidate solution for fitness. If a better candidate is found, Hill Climbing moves to that new point and evaluates the neighbourhood of that candidate solution. The search terminates when the neighbourhood of the current candidate solution offers no better candidate solutions; if the local optimum found is not the global optimum (Figure 3a), the search may benefit from a restart, performing a climb from a new point in the landscape (Figure 3b).
A variant of simple Hill Climbing is Simulated Annealing. Movement around the search space is similar, but moves may also be made to points of lower fitness, with the aim of escaping local optima. This happens with a probability value that is dependent on a ‘temperature’, which decreases as the search progresses (Figure 4). The lower the temperature, the less likely the chances of moving to a poorer point of the search space, until ‘freezing point’ is reached, where the algorithm behaves identically to Hill Climbing. Simulated Annealing is named after the physical process of annealing in metallurgy.
The simplest form of an optimization algorithm, and the easiest to implement, is random search. In test data generation, inputs are generated at random until the goal of the test (for example, the coverage of a particular program statement or branch) is fulfilled. Random search is very poor at finding solutions when those solutions occupy a very small part of the overall search space (Figure 2). Test data may be found faster and more reliably if the search is given some guidance; in metaheuristic searches, this guidance can be provided by a problem-specific fitness function, which scores points in the search space with respect to their suitability for solving the problem at hand.
Figure 2. Random search may fail to fulfil low-probability test goals
Figure 3. Hill Climbing follows the curve of the fitness landscape until a local optimum is found; the final position may not represent the global optimum (a), and restarts may be required (b)
Figure 4. Simulated Annealing may temporarily move to points of poorer fitness in the search space
Figure 5. Genetic Algorithms are global searches, sampling many points in the fitness landscape at once
“Search-Based Software Testing: Past, Present and Future”
Phil McMinn
Genetic Algorithm
14
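The Hill Climbing loop described above can be sketched in a few lines. A minimal sketch, assuming an illustrative one-dimensional integer landscape with a single peak at x = 42 (the fitness function and neighbourhood are invented for illustration, not taken from the slides):

```java
// Minimal Hill Climbing sketch on a one-dimensional integer landscape.
// The fitness function and neighbourhood here are illustrative only.
public class HillClimb {

    // Illustrative fitness: a single peak at x = 42.
    static int fitness(int x) {
        return -Math.abs(x - 42);
    }

    // Climb from a starting point: move to a better neighbour until stuck.
    static int climb(int start) {
        int current = start;
        while (true) {
            int best = current;
            if (fitness(current - 1) > fitness(best)) best = current - 1;
            if (fitness(current + 1) > fitness(best)) best = current + 1;
            if (best == current) return current; // no better neighbour: optimum
            current = best;
        }
    }

    public static void main(String[] args) {
        System.out.println(climb(0)); // reaches the peak at 42
    }
}
```

On this unimodal landscape the climb always reaches the global optimum; on a multimodal landscape it would stop at the nearest local optimum, which is why restarts (Figure 3b) or Simulated Annealing are used.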
15. Machine Learning and Testing
• ML supports decision making
based on data
• Test planning
• Test cost estimation
• Test case management
• Test case prioritization
• Test case design
• Test case refinement
• Test case evaluation
15
• Debugging
• Fault localization
• Bug prioritization
• Fault prediction
• “Machine Learning-based Software
Testing: Towards a Classification
Framework.” SEKE 2011
16. NLP and Testing
• Natural language is prevalent in software development
• User documentation, procedures, natural language
requirements, etc.
• Natural Language Processing (NLP)
• Can it be used to help automate testing?
• Derive test cases, including oracles
• Traceability between requirements and system test
cases (required by many standards)
16
19. Metaheuristic Search
• Stochastic optimization through search
• They efficiently explore the search space in order to find good
(near-optimal) feasible solutions
• They can address both discrete- and continuous-domain
optimization problems
• Applicable to many practical situations
• They provide no guarantee of global or local optimality
19
21. Example Problem
• Let’s consider the problem of finding the best visiting sequence (route) to
serve 14 customers.
• Traveling Salesman Problem (TSP) – Combinatorial optimization
• How many possible routes?
• (n-1)! = (15-1)! = 14! ≈ 8.7178 × 10^10 ≈ 87 billion solutions
• Exhaustive search is feasible within a day for the above problem
• But what type of algorithm would you pick with 13,508 cities and 10^49,933
feasible solutions?
• Combinatorial explosion
21
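The route count above can be checked directly with exact arithmetic. A small sketch (class name is illustrative):

```java
import java.math.BigInteger;

// Counting routes for the example: with n nodes and a fixed start, a tour
// has (n-1)! orderings. For 15 nodes (14 customers plus the start), that
// is 14! distinct routes.
public class RouteCount {
    static BigInteger factorial(int n) {
        BigInteger f = BigInteger.ONE;
        for (int i = 2; i <= n; i++) f = f.multiply(BigInteger.valueOf(i));
        return f;
    }

    public static void main(String[] args) {
        System.out.println(factorial(14)); // 87178291200, i.e. ~8.7 x 10^10
    }
}
```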
22. Example Metaheuristics
• Genetic Algorithms
• Simulated Annealing
• Tabu Search
• Ant Colony Optimization
• Particle Swarm Optimization
• Iterated Local Search
22
23. Remarks
• Metaheuristics are non-deterministic
• They usually incorporate mechanisms to avoid getting trapped in
confined areas of the search space
• They are not problem specific
• They may use some form of memory to better guide the search
• They are a relatively new field (since the ‘80s or so)
• They have become possible because we can now afford vast
amounts of computation
23
25. Genetic Algorithms (GAs)
Genetic Algorithm: population-based search algorithm
inspired by evolutionary theory
Natural selection: Individuals that best
fit the natural environment survive
Reproduction: surviving individuals
generate offsprings (next generation)
Mutation: offspring inherit
properties of their parents, with some
mutations
Iteration: generation after generation,
the new offspring fit the
environment better than their parents
From A. Panichella 25
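The selection–reproduction–mutation–iteration steps above map directly onto code. A minimal GA sketch on a toy problem (OneMax: maximise the number of 1-bits in a bitstring); the problem, parameter values, and elitism choice are illustrative, not from the slides:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

// Minimal Genetic Algorithm sketch on OneMax (illustrative toy problem).
public class OneMaxGA {
    static final Random RND = new Random(42);
    static final int LEN = 20, POP = 30, GENS = 200;

    static int fitness(boolean[] ind) {
        int f = 0;
        for (boolean b : ind) if (b) f++;
        return f;
    }

    // Selection: binary tournament, the fitter of two random individuals wins.
    static boolean[] tournament(List<boolean[]> pop) {
        boolean[] a = pop.get(RND.nextInt(pop.size()));
        boolean[] b = pop.get(RND.nextInt(pop.size()));
        return fitness(a) >= fitness(b) ? a : b;
    }

    static boolean[] best(List<boolean[]> pop) {
        return Collections.max(pop, Comparator.comparingInt(OneMaxGA::fitness));
    }

    public static boolean[] run() {
        List<boolean[]> pop = new ArrayList<>();
        for (int i = 0; i < POP; i++) {              // random initial population
            boolean[] ind = new boolean[LEN];
            for (int j = 0; j < LEN; j++) ind[j] = RND.nextBoolean();
            pop.add(ind);
        }
        for (int g = 0; g < GENS; g++) {
            List<boolean[]> next = new ArrayList<>();
            next.add(best(pop));                     // elitism: keep the best
            while (next.size() < POP) {
                boolean[] p1 = tournament(pop), p2 = tournament(pop);
                boolean[] child = new boolean[LEN];
                int cut = RND.nextInt(LEN);          // one-point crossover
                for (int j = 0; j < LEN; j++) child[j] = j < cut ? p1[j] : p2[j];
                child[RND.nextInt(LEN)] ^= true;     // mutation: flip one bit
                next.add(child);
            }
            pop = next;
        }
        return best(pop);
    }

    public static void main(String[] args) {
        System.out.println(fitness(run()));
    }
}
```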
27. Machine Learning
• Machine learning is programming computers to optimize a
performance criterion using example data or past experience.
• Learning general models from data capturing particular
examples.
• Data is increasingly cheap and abundant; knowledge is
expensive and scarce.
• Build a model that is a good and useful approximation to the
data.
27
29. Classification
• Example: Credit
scoring
• Differentiating
between low-risk and
high-risk customers
from their income and
savings
Discriminant: IF income > θ1 AND savings > θ2
THEN low-risk ELSE high-risk
From E. Alpaydin
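The discriminant above is simply a two-threshold rule. As code (the threshold values θ1, θ2 are illustrative; in practice they would be learned from the data):

```java
// The credit-scoring discriminant as code: IF income > THETA1 AND
// savings > THETA2 THEN low-risk ELSE high-risk. Thresholds are illustrative.
public class CreditRule {
    static final double THETA1 = 30_000; // income threshold (assumed value)
    static final double THETA2 = 10_000; // savings threshold (assumed value)

    static String classify(double income, double savings) {
        return (income > THETA1 && savings > THETA2) ? "low-risk" : "high-risk";
    }

    public static void main(String[] args) {
        System.out.println(classify(50_000, 20_000)); // low-risk
        System.out.println(classify(20_000, 20_000)); // high-risk
    }
}
```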
30. Natural Language Processing
• Natural Language Understanding
• Natural Language Generation
• In software testing:
• Analyze NL requirements or other forms of documentation to interpret
them and translate them into a form supporting verification and testing
• Generating specifications, test cases (inputs and oracles), and scripts from
NL requirements or other forms of documentation
• Traceability between requirements and other artifacts
30
31. Understanding Sentences
Parsing and Grammar
How is a sentence composed? (Syntactic analysis)
Lexicons
How is a word composed? (Morphological analysis)
Prefixes, suffixes, and root forms, e.g., short-ness
Ambiguity
Disambiguation: Finding the correct interpretation
31
32. A Parsing Example
Grammar:
The Sentence: The boy went home.
S → NP VP
NP → Article N | Proper
VP → Verb NP
N → home | boy | store
Proper → Betty | John
Verb → go | give | see
Article → the | an | a
From A. Davis 32
34. Syntactic Analysis Challenges
• Singular vs plural, gender
• Adjectives, adverbs …
• Handling ambiguity
• Syntactic ambiguity: “fruit flies like a banana”
• Having to parse syntactically incorrect sentences
34
35. Semantic Disambiguation
Example: “with”
Sentence Relation
I ate spaghetti with meatballs. (ingredient of spaghetti)
I ate spaghetti with salad. (side dish of spaghetti)
I ate spaghetti with abandon. (manner of eating)
I ate spaghetti with a fork. (instrument of eating)
I ate spaghetti with a friend. (accompanier of eating)
Disambiguation is probabilistic!
35
37. Outline
• Definition of Search-Based Software Engineering (SBSE)
• Definition of Search-Based Software Testing (SBST)
• SBST applied to coverage testing
• Multiple-target techniques
37
38. Definitions
• Search-Based Software Engineering (SBSE): «The application
of meta-heuristic search-based optimization techniques to
find near-optimal solutions in software engineering
problems.»
• Problem Reformulation: Reformulating typical SE problems as
optimization problems
• Search-Based Software Testing: metaheuristics have been shown
to be particularly useful for addressing many testing problems.
38
40. Why SBST?
No Exhaustive Search
No Exact Search
Meta-heuristics
Random Algorithms
Tabu Search
Hill Climbing
Ant Colony
Simulated Annealing / Particle Swarm Optimization
Genetic Algorithms
Issues:
1. Large search space
2. Complex problem (NP-Complete)
4. Often required data are (partly)
available only upon test execution
40
41. SBST Example
41
Class Triangle {
int a, b, c; //sides
int type = NOT_A_TRIANGLE;
Triangle (int a, int b, int c){…}
void checkRightAngle() {…}
void computeTriangleType() {…}
boolean isTriangle() {…}
public static void main (String args[]) {…}
}
Goal: Automatic generation of test cases using genetic algorithms in
order to achieve the maximum statement coverage
Genetic
Algorithms:
1) Solution
Representation
2) Fitness function
3) Selection
4) Reproduction
(crossover and
mutation)
42. Solution Representation
42
Class Triangle {
int a, b, c; //sides
int type = NOT_A_TRIANGLE;
Triangle (int a, int b, int c){…}
void checkRightAngle() {…}
void computeTriangleType() {…}
boolean isTriangle() {…}
public static void main (String args[]) {…}
}
The chromosome used for test case generation is the input vector (sequence
of input values used by the test case to run the program), which may be fixed
length or variable length
In our running
example there are
only three input
parameters: a, b, c
a b cX =
Fixed length
chromosome
43. Fitness Function?
class Triangle {
void computeTriangleType() {
if (a == b) {
if (b == c)
type = "EQUILATERAL";
else
type = "ISOSCELES";
} else if (a == c) {
type = "ISOSCELES";
} else {
if (b == c)
type = "ISOSCELES";
else
checkRightAngle();
}
System.out.println(type);
}
}
1
25
6 7 3
98
10
4
1
25
6 7 3
98
4
Control flow
graph
Dependency
graph
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
43
44. Fitness Function?
class Triangle {
void computeTriangleType() {
if (a == b) {
if (b == c)
type = "EQUILATERAL";
else
type = "ISOSCELES";
} else if (a == c) {
type = "ISOSCELES";
} else {
if (b == c)
type = "ISOSCELES";
else
checkRightAngle();
}
System.out.println(type);
}
}
1
25
6 7 3
98
10
4
1
25
6 7 3
98
4
Control flow
graph
Dependency
graph
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
44
45. Fitness Function?
class Triangle {
void computeTriangleType() {
if (a == b) {
if (b == c)
type = "EQUILATERAL";
else
type = "ISOSCELES";
} else if (a == c) {
type = "ISOSCELES";
} else {
if (b == c)
type = "ISOSCELES";
else
checkRightAngle();
}
System.out.println(type);
}
}
1
25
6 7 3
98
10
4
1
25
6 7 3
98
4
Control flow
graph
Dependency
graph
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
x1 = (2, 2, 2) Path(x1) = <1, 2, 3, 10>
45
46. Fitness Function?
class Triangle {
void computeTriangleType() {
if (a == b) {
if (b == c)
type = "EQUILATERAL";
else
type = "ISOSCELES";
} else if (a == c) {
type = "ISOSCELES";
} else {
if (b == c)
type = "ISOSCELES";
else
checkRightAngle();
}
System.out.println(type);
}
}
6
1
25
7 3
98
10
4
1
25
6 7 3
98
4
Control flow
graph
Dependency
graph
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
x1 = (2, 2, 2) Path(x1) = <1, 2, 3, 10>
x2 = (2, 3, 4) Path(x2) = <1, 5, 7, 9, 10>
What is the closest TC to
cover the statement 8?
46
47. Approach Level
Approach_level(P(x), t)
Given the execution trace obtained by running program P with input vector x,
the approach level is the minimum number of control nodes between an
executed statement and the coverage target t.
x1 = (2, 2, 2) Path(x1) = <1, 2, 3, 10> AL=2
x2 = (2, 3, 4) Path(x2) = <1, 5, 7, 9, 10> AL=0
1
25
7 3
98
10
46
47
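The approach level can be computed from the executed path and the chain of decision nodes controlling the target. A sketch for the running example (target: statement 8, controlled by nodes 1, 5, 7 per the control flow graph); a general implementation would derive the chain from control-dependence analysis, here it is hard-coded:

```java
import java.util.List;

// Approach level for the running example: target statement 8 is control
// dependent on the decision-node chain <1, 5, 7> (outermost first). The
// approach level is the number of these decision nodes never reached.
public class ApproachLevel {
    static final List<Integer> CHAIN = List.of(1, 5, 7);

    static int approachLevel(List<Integer> path) {
        int reached = 0;
        for (int node : CHAIN) {
            if (path.contains(node)) reached++;
            else break; // execution diverged before this decision node
        }
        return CHAIN.size() - reached;
    }

    public static void main(String[] args) {
        System.out.println(approachLevel(List.of(1, 2, 3, 10)));    // x1: AL = 2
        System.out.println(approachLevel(List.of(1, 5, 7, 9, 10))); // x2: AL = 0
    }
}
```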
48. Fitness Function?
class Triangle {
void computeTriangleType() {
if (a == b) {
if (b == c)
type = "EQUILATERAL";
else
type = "ISOSCELES";
} else if (a == c) {
type = "ISOSCELES";
} else {
if (b == c)
type = "ISOSCELES";
else
checkRightAngle();
}
System.out.println(type);
}
}
6
1
25
7 3
98
10
4
1
25
6 7 3
98
4
Control flow
graph
Dependency
graph
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
x1 = (2, 2, 2) Path(x1) = <1, 2, 3, 10> AL=2
x2 = (2, 3, 4) Path(x2) = <1, 5, 7, 9, 10> AL=0
x3 = (2, -2, 10) Path(x3) = <1, 5, 7, 9, 10> AL=0
What is the closest TC to
cover the statement 8?
48
49. Fitness Function?
class Triangle {
void computeTriangleType() {
if (a == b) {
if (b == c)
type = "EQUILATERAL";
else
type = "ISOSCELES";
} else if (a == c) {
type = "ISOSCELES";
} else {
if (b == c)
type = "ISOSCELES";
else
checkRightAngle();
}
System.out.println(type);
}
}
6
1
25
7 3
98
10
4
1
25
6 7 3
98
4
Control flow
graph
Dependency
graph
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
x1 = (2, 2, 2) Path(x1) = <1, 2, 3, 10>
x2 = (2, 3, 4) Path(x2) = <1, 5, 7, 9, 10> if (3==4)
x3 = (2, -2, 10) Path(x3) = <1, 5, 7, 9, 10> if (10==-2)
49
50. Fitness Function?
class Triangle {
void computeTriangleType() {
if (a == b) {
if (b == c)
type = "EQUILATERAL";
else
type = "ISOSCELES";
} else if (a == c) {
type = "ISOSCELES";
} else {
if (b == c)
type = "ISOSCELES";
else
checkRightAngle();
}
System.out.println(type);
}
}
6
1
25
7 3
98
10
4
1
25
6 7 3
98
4
Control flow
graph
Dependency
graph
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
x1 = (2, 2, 2) Path(x1) = <1, 2, 3, 10>
x2 = (2, 3, 4) Path(x2) = <1, 5, 7, 9, 10> abs (3-4) = 1
x3 = (2, -2, 10) Path(x3) = <1, 5, 7, 9, 10> abs (-2-10) = 12
50
51. Branch Distance
Branch_distance(P(x), t)
Given the first control node where the execution diverges from the target t, the predicate
at that node is converted to a distance (from taking the desired branch), normalised
between 0 and 1 (less important than the approach level).
This distance measures how far the test case is from taking the desired branch. For
boolean and numerical variables a, b:
51
52. Branch Distance
Branch_distance(P(x), t)
For string variables a and b, the branch distance is computed using the following rules:
where j is the position of the first differing character, i.e., a[j] != b[j] while a[i] ==
b[i] for all i < j; (a[j] - b[j]) is set to zero if a == b. Example of edit distance:
edit_dist(“abcd”, “abbb”) = 2
M. Alshraideh, L. Bottaci, “Search-
based software test data
generation for string data using
program-specific search
Operators”, STVR, 2006
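The edit distance in the example can be computed with the standard dynamic-programming algorithm. A generic Levenshtein sketch (not the program-specific search operators of the cited paper):

```java
// Edit (Levenshtein) distance: the minimum number of character insertions,
// deletions, and substitutions turning one string into the other.
public class EditDistance {
    static int editDist(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int sub = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1,   // deletion
                                            d[i][j - 1] + 1),  // insertion
                                   d[i - 1][j - 1] + sub);     // substitution
            }
        }
        return d[a.length()][b.length()];
    }

    public static void main(String[] args) {
        System.out.println(editDist("abcd", "abbb")); // 2, as in the slide
    }
}
```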
53. Branch distance rules for composite predicate
Branch Distance
Alternative normalisations of d:
BD(c) = 1 − α^(−d)
BD(c) = d / (d + β)
with α > 1 and β > 0
53
54. For statement and branch coverage, given a specific coverage target t, a widely
used fitness function (to be minimised) is:
f(x) = approach_level(P(x), t) + branch_distance(P(x),t)
Approach_level(P(x), t)
Given the execution trace obtained by running program P with input vector x, the
approach level is the minimum number of control nodes between an executed
statement and the coverage target t.
Branch_distance(P(x), t)
Given the first control node where the execution diverges from the target t, the
predicate at such node is converted to a distance (from taking the desired
branch), normalised between 0 and 1.
Fitness Function?
54
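Putting the two components together for the running example (target: statement 8, which needs a != b, a != c, and b == c), a sketch assuming K = 1 and the normalisation BD = d / (d + 1) used on the surrounding slides:

```java
// f(x) = approach_level + normalised branch_distance, instantiated for the
// Triangle example's target statement 8. Assumes K = 1 and BD = d / (d + 1).
public class CoverageFitness {
    static final int K = 1;

    // Distance when "lhs == rhs" must hold: 0 if it does, |lhs - rhs| + K otherwise.
    static double dEq(int lhs, int rhs) {
        return lhs == rhs ? 0 : Math.abs(lhs - rhs) + K;
    }

    // Distance when "lhs != rhs" must hold: 0 if it does, K otherwise.
    static double dNeq(int lhs, int rhs) {
        return lhs != rhs ? 0 : K;
    }

    static double normalise(double d) {
        return d / (d + 1);
    }

    static double fitness(int a, int b, int c) {
        if (a == b) return 2 + normalise(dNeq(a, b)); // diverged at node 1, AL = 2
        if (a == c) return 1 + normalise(dNeq(a, c)); // diverged at node 5, AL = 1
        return normalise(dEq(b, c));                  // at node 7, AL = 0
    }

    public static void main(String[] args) {
        System.out.println(fitness(2, 2, 2)); // x1: f = 2 + 0.5 = 2.5
        System.out.println(fitness(2, 3, 4)); // x2: f = 0 + 2/3, about 0.66
        System.out.println(fitness(3, 4, 4)); // covers the target: f = 0
    }
}
```

The values match the worked example on the next slides: f(x1) = 2.5 and f(x2) ≈ 0.66.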
55. Fitness Function?
class Triangle {
void computeTriangleType() {
if (a == b) {
if (b == c)
type = "EQUILATERAL";
else
type = "ISOSCELES";
} else if (a == c) {
type = "ISOSCELES";
} else {
if (b == c)
type = "ISOSCELES";
else
checkRightAngle();
}
System.out.println(type);
}
}
1
25
6 7 3
98
4
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
X1 = (2, 2, 2) Path(x1) = <1, 2, 3, 10> AL=2 f=2.5
X2 = (2, 3, 4) Path(x2) = <1, 5, 7, 9, 10> AL=0
d(a != b) = K = 1
BD (a != b) = 1/ (1+1) = 0.5
f(X1) = 2 + 0.5 = 2.5
a!=b a==b
55
56. Fitness Function?
class Triangle {
void computeTriangleType() {
if (a == b) {
if (b == c)
type = "EQUILATERAL";
else
type = "ISOSCELES";
} else if (a == c) {
type = "ISOSCELES";
} else {
if (b == c)
type = "ISOSCELES";
else
checkRightAngle();
}
System.out.println(type);
}
}
1
25
6 7 3
98
4
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
d(b == c) = abs(b-c) + K = 2
BD (b == c) = 2 / (2+1) = 0.66
f(X2) = 0 + 0.66 = 0.66
a!=b a==b
X1 = (2, 2, 2) Path(x1) = <1, 2, 3, 10> AL=2 f=2.5
X2 = (2, 3, 4) Path(x2) = <1, 5, 7, 9, 10> AL=0 f=0.66
56
57. x6 = (3,4,5) P ≈ 1/f = 1/0.66 ≈ 1.51
x2 = (2,3,4) P ≈ 1/f = 1/0.66 ≈ 1.51
x7 = (3,5,7) P ≈ 1/f = 1/0.75 ≈ 1.33
x8 = (6,8,4) P ≈ 1/f = 1/0.83 ≈ 1.20
x1 = (2,2,2) P ≈ 1/f = 1/2.50 ≈ 0.40
x5 = (2,2,3) P ≈ 1/f = 1/2.50 ≈ 0.40
x3 = (-2,3,6) P ≈ 1/f ≈ 0
x4 = (2,3,7) P ≈ 1/f ≈ 0
Roulette wheel selection
1) Assign to each test case a probability equal to 1/f (inverse of the fitness score)
Roulette Wheel Selection
58. x6 = (3,4,5) P ≈ 1/f = 1/0.66 ≈ 1.51
x2 = (2,3,4) P ≈ 1/f = 1/0.66 ≈ 1.51
x7 = (3,5,7) P ≈ 1/f = 1/0.75 ≈ 1.33
x8 = (6,8,4) P ≈ 1/f = 1/0.83 ≈ 1.20
x1 = (2,2,2) P ≈ 1/f = 1/2.50 ≈ 0.40
x5 = (2,2,3) P ≈ 1/f = 1/2.50 ≈ 0.40
x3 = (-2,3,6) P ≈ 1/f ≈ 0
x4 = (2,3,7) P ≈ 1/f ≈ 0
Roulette wheel selection
1) Assign to each test case a probability equal to 1/f (inverse of the fitness score)
2) Normalise the obtained probability
3) Each test case has a probability to be selected that is proportional to its slice in the
roulette wheel
Tot. = 6.35
Normalised probabilities: x6 = 0.23, x2 = 0.23, x7 = 0.20, x8 = 0.18, x1 = 0.06, x5 = 0.06, x3 = 0, x4 = 0
Roulette wheel slices: x6 24%, x2 24%, x7 21%, x8 19%, x1 6%, x5 6%, x3 0%, x4 0%
Roulette wheel
Roulette Wheel Selection
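The three steps above can be sketched directly (fitness values taken from the slide for x6, x2, x7, x8, x1, x5; x3 and x4, whose weights are ≈ 0, are omitted for brevity):

```java
import java.util.Random;

// Roulette wheel selection sketch: weight each test case by 1/f, normalise
// the weights into probabilities, then draw an index proportionally to its
// slice of the wheel.
public class RouletteWheel {
    static double[] probabilities(double[] fitness) {
        double[] p = new double[fitness.length];
        double total = 0;
        for (int i = 0; i < fitness.length; i++) {
            p[i] = 1.0 / fitness[i];                      // step 1: weight = 1/f
            total += p[i];
        }
        for (int i = 0; i < p.length; i++) p[i] /= total; // step 2: normalise
        return p;
    }

    // Step 3: spin the wheel -- pick an index proportionally to its slice.
    static int spin(double[] prob, Random rnd) {
        double r = rnd.nextDouble(), acc = 0;
        for (int i = 0; i < prob.length; i++) {
            acc += prob[i];
            if (r < acc) return i;
        }
        return prob.length - 1;
    }

    public static void main(String[] args) {
        double[] p = probabilities(new double[]{0.66, 0.66, 0.75, 0.83, 2.50, 2.50});
        System.out.println(Math.round(p[0] * 100) + "%"); // x6's slice: 24%
    }
}
```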
59. x6 = (3,4,5) P ≈ 0.23
x2 = (2,3,4) P ≈ 0.23
x7 = (3,5,7) P ≈ 0.20
x8 = (6,8,4) P ≈ 0.18
x1 = (2,2,2) P ≈ 0.06
x5 = (2,2,3) P ≈ 0.06
x3 = (-2,3,6) P ≈ 0
x4 = (2,3,7) P ≈ 0
Roulette wheel selection
1) Assign to each test case a probability equal to 1/f (inverse of the fitness score)
2) Normalise the obtained probability
3) Each test case has a probability to be selected that is proportional to its slice in the
roulette wheel
x6 = (3,4,5)
x2 = (2,3,4)
x6 = (3,4,5)
x1 = (2,2,2)
x8 = (6,8,4)
x2 = (2,3,4)
x7 = (3,5,7)
x7 = (3,5,7)
Roulette Wheel Selection
60. One-point crossover (probability = 0.8)
It takes two parents and cuts their chromosome strings at some randomly chosen position;
the produced substrings are then swapped to produce two new full-length chromosomes.
Parent           Mate              Cut-point   Offspring
x1 = (2,2,2)     – (not selected)  –           x1 = (2,2,2)
x2 = (2,3,4)     x5                1           x2 = (2,2,3)
x3 = (-2,3,6)    x8                2           x3 = (-2,3,4)
x4 = (2,3,7)     x6                2           x4 = (2,3,5)
x5 = (2,2,3)     x2                1           x5 = (2,3,4)
x6 = (3,4,5)     x4                2           x6 = (3,4,7)
x7 = (3,5,7)     – (not selected)  –           x7 = (3,5,7)
x8 = (6,8,4)     x3                2           x8 = (6,8,6)
Reproduction (Crossover)
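The swap itself is a few lines. A sketch reproducing the example's pairs, e.g. parents x2 = (2,3,4) and x5 = (2,2,3) with cut-point 1 yield offspring (2,2,3) and (2,3,4):

```java
import java.util.Arrays;

// One-point crossover: cut both parent chromosomes at the same position and
// swap the tails, producing two full-length offspring.
public class OnePointCrossover {
    static int[][] crossover(int[] p1, int[] p2, int cut) {
        int[] c1 = new int[p1.length], c2 = new int[p2.length];
        for (int i = 0; i < p1.length; i++) {
            c1[i] = i < cut ? p1[i] : p2[i]; // head of p1, tail of p2
            c2[i] = i < cut ? p2[i] : p1[i]; // head of p2, tail of p1
        }
        return new int[][]{c1, c2};
    }

    public static void main(String[] args) {
        int[][] kids = crossover(new int[]{2, 3, 4}, new int[]{2, 2, 3}, 1);
        System.out.println(Arrays.toString(kids[0])); // [2, 2, 3]
        System.out.println(Arrays.toString(kids[1])); // [2, 3, 4]
    }
}
```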
62. Initial Population
Mutation
Crossover
Selection
End?
YES / NO
One target approach:
1) Select one target (statement or branch) to cover
2) Run GAs until reaching the maximum search budget (max iterations) or when the target is
covered (fitness function = 0)
3) Repeat from step (1) for a new target (statement or branch)
One-Target Approach
class Triangle {
void computeTriangleType() {
if (a == b) {
if (b == c)
type = "EQUILATERAL";
else
type = "ISOSCELES";
} else if (a == c) {
type = "ISOSCELES";
} else {
if (b == c)
type = "ISOSCELES";
else
checkRightAngle();
}
System.out.println(type);
}
}
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
target
62
63. Limitations
Some coverage targets may be infeasible
Some coverage targets may be very difficult to achieve
Since a limited search budget is available for test case generation:
• Infeasible targets may consume the search budget without reaching any target
• Difficult targets may consume most of the search budget, leaving many easier
coverage targets uncovered
• The order in which targets are considered affects the final results
How can multiple targets be addressed at once?
63
64. Techniques for Multiple Targets
G. Fraser, A.Arcuri
“Whole Test Suite Generation”
IEEE Transactions on Software Engineering, 2013
A.Panichella, F. M. Kifetew, P. Tonella
“Reformulating Branch Coverage as Many-Objective
Optimization Problems”
IEEE International Conference on Software Testing, Verification, and
Validation (ICST), 2015
64
68. X-Force Threat Intelligence Index 2017 — attack types:
42% Code Injection
32% Manipulated data structures
9% Collect and analyze information
4% Indicator
3% Employ probabilistic techniques
3% Manipulate system resources
3% Subvert access control
2% Abuse existing functionality
2% Engage in deceptive…
68
https://www.ibm.com/security/xforce/
More than 40% of all
attacks were injection
attacks (e.g., SQLi)
71. Injection Attacks
71
SQL query
Name Surname …
Aria Stark …
John Snow …
… … …
Query result
SELECT *
FROM Users
WHERE (usr = ‘’ AND
psw = ‘’) OR 1=1 --
Client / Server / SQL Database
Web form
‘) OR 1=1 --
Username
Password
OK
76. Grammar-based Attack
Generation
• BNF grammar for SQLi attacks
• Random strategy: randomly selected production rules are
applied recursively until only terminals are left
• The random strategy is not efficient at finding bypassing attacks
that are difficult to discover
• Machine learning? Search?
• How to guide the search? How can ML help?
76
77. Anatomy of SQLi attacks
77
‘ OR“a”=“a”#
Bypassing Attack
<START>
<sq> <wsp> <sqliAttack> <cmt>
<boolAttack>
<opOR> <boolTrueExpr>
OR <bynaryTrue>
<dq> <ch> <dq> <opEq> <dq> <ch> <dq>
“ a ” = “ a ”
<sQuoteContext>
‘ #_
Derivation Tree
‘
_
OR”a”=“a”
#
S =
{
Attack Slices
78. Learning Attack Patterns
78
S1 S2 S3 S4 … Sn Outcome
A1 1 1 0 0 … 0 Passed
A2 0 1 0 0 … 0 Blocked
… … … … … … … …
Am 1 1 1 1 … 1 Blocked
Training Set
PassedBlocked
S4
YesNo
YesNo
YesNo
S3
S2
Decision Tree
Sn
S1
…
• Random trees
• Random forest
79. Learning Attack Patterns
79
S1 S2 S3 S4 … Sn Outcome
A1 1 1 0 0 … 0 Passed
A2 0 1 0 0 … 0 Blocked
… … … … … … … …
Am 1 1 1 1 … 1 Blocked
PassedBlocked
S4
YesNo
YesNo
YesNo
S3
S2
Sn
S1
…
Training Set Decision Tree
Attack Pattern
S2 ∧ ¬ Sn ∧ S1
80. Machine Learning
Generating Attacks via ML and
EAs
80
Evolutionary Algorithm (EA)
Iteratively refine successful attack
conditions PassedBlocked
S4
YesNo
YesNo
YesNo
S3
S2
Sn
S1
…
82. Related Work
• Automated repair of WAFs
• Automated testing targeting XML and SQL injections in web
applications
• Automated detection of malicious SQL statements
82
86. Automotive Environment
• Highly varied environments, e.g., road topology, weather, building
and pedestrians …
• Huge number of possible scenarios, e.g., determined by
trajectories of pedestrians and cars
• ADAS play an increasingly critical role
• A challenge for testing
86
87. Advanced Driver Assistance
Systems (ADAS)
Decisions are made over time based on sensor data
87
Sensors
Controller
Actuators Decision
Sensors
/Camera
Environment
ADAS
88. A General and Fundamental Shift
• Increasingly, it is easier to learn behavior from data using
machine learning than to specify and code it
• Deep learning, reinforcement learning …
• Example: Neural networks (deep learning)
• Millions of weights learned
• No explicit code, no specifications
• Verification, testing?
88
89. CPS Development Process
89
Functional modeling:
• Controllers
• Plant
• Decision
Continuous and discrete
Simulink models
Model simulation and
testing
Architecture modelling
• Structure
• Behavior
• Traceability
System engineering modeling
(SysML)
Analysis:
• Model execution and
testing
• Model-based testing
• Traceability and
change impact
analysis
• ...
(partial) Code generation
Deployed executables on
target platform
Hardware (Sensors ...)
Analog simulators
Testing (expensive)
Hardware-in-the-Loop
Stage
Software-in-the-Loop
Stage
Model-in-the-Loop Stage
90. Automotive Environment
• Highly varied environments, e.g., road topology, weather, building
and pedestrians …
• Huge number of possible scenarios, e.g., determined by
trajectories of pedestrians and cars
• ADAS play an increasingly critical role
• A challenge for testing
90
91. Our Goal
• Developing an automated testing technique
for ADAS
91
• To help engineers efficiently and
effectively explore the complex test input
space of ADAS
• To identify critical (failure-revealing) test
scenarios
• Characterization of input conditions that
lead to most critical situations, e.g.,
safety violations
92. Automated Emergency Braking
System (AEB)
92
“Brake-request”
when braking is needed
to avoid collisions
Decision making
Vision
(Camera)
Sensor
Brake
Controller
Objects’
position/speed
93. Example Critical Situation
• “AEB properly detects a pedestrian in front of the car with a
high degree of certainty and applies braking, but an accident
still happens where the car hits the pedestrian with a
relatively high speed”
93
94. Testing ADAS
94
A simulator based on
Physical/Mathematical models
On-road testing
Simulation-based (model) testing
95. Testing via Physics-based
Simulation
95
ADAS
(SUT)
Simulator (Matlab/Simulink)
Model
(Matlab/Simulink)
▪ Physical plant (vehicle / sensors / actuators)
▪ Other cars
▪ Pedestrians
▪ Environment (weather / roads / traffic signs)
Test input
Test output
time-stamped output
97. ADAS Testing Challenges
• Test input space is large, complex and multidimensional
• Explaining failures and fault localization are difficult
• Execution of physics-based simulation models is computationally
expensive
97
98. Our Approach
• We use decision tree classification models
• We use multi-objective search algorithm (NSGAII)
• Objective Functions:
• Each search iteration calls simulation to compute objective functions
• Input values required to perform the simulation:
98
1. Minimum distance between the pedestrian and the field of view
2. The car speed at the time of collision
3. The probability that the object detected is a pedestrian
Precipitation, Fogginess, Road shape, Visibility range,
Car-speed, Person-speed, Person-position, Person-orientation
99. Multiple Objectives: Pareto Front
99
Individual A Pareto
dominates individual B if
A is at least as good as B
in every objective
and better than B in at
least one objective.
Dominated by x
F1
F2
Pareto front
x
• A multi-objective optimization algorithm (e.g., NSGA II) must:
• Guide the search towards the global Pareto-Optimal front.
• Maintain solution diversity in the Pareto-Optimal front.
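The dominance relation above can be checked objective by objective. A minimal sketch, assuming minimisation objectives (as when minimising distance to the pedestrian, for instance):

```java
// Pareto dominance for minimisation objectives: A dominates B if A is no
// worse than B in every objective and strictly better in at least one.
public class Pareto {
    static boolean dominates(double[] a, double[] b) {
        boolean strictlyBetter = false;
        for (int i = 0; i < a.length; i++) {
            if (a[i] > b[i]) return false;          // worse in some objective
            if (a[i] < b[i]) strictlyBetter = true; // better in this objective
        }
        return strictlyBetter;
    }

    public static void main(String[] args) {
        System.out.println(dominates(new double[]{1, 2}, new double[]{2, 2})); // true
        System.out.println(dominates(new double[]{1, 3}, new double[]{2, 2})); // false: a trade-off
    }
}
```

Solutions that no other solution dominates form the Pareto front that NSGA-II approximates.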
100. Search-based Testing Process
100
Test input generation (NSGA II)
Evaluating test inputs
- Select best tests
- Generate new tests
(candidate)
test inputs
- Simulate every (candidate) test
- Compute fitness functions
Fitness
values
Test cases revealing worst case system behaviors
Input data ranges/dependencies + Simulator + Fitness functions
defined based on Oracles
105. Search Guided by Classification
105
Test input generation (NSGA II)
Evaluating test inputs
Build a classification tree
Select/generate tests in the fittest regions
Apply genetic operators
Input data ranges/dependencies + Simulator + Fitness functions
defined based on Oracles
(candidate)
test inputs
- Simulate every (candidate) test
- Compute fitness functions
Fitness
values
Test cases revealing worst case system behaviors +
A characterization of critical input regions
114. Use Case Specifications
Example
Precondition: The system has been initialized
Basic Flow
1. The SeatSensor SENDS the weight TO the system.
2. INCLUDE USE CASE Self Diagnosis.
3. The system VALIDATES THAT no error has been detected.
4. The system VALIDATES THAT the weight is above 20 Kg.
5. The system sets the occupancy status to adult.
6. The system SENDS the occupancy status TO AirbagControlUnit.
--written according to RUCM template--
114
115. 115
Precondition: The system has been initialized
Basic Flow
1. The SeatSensor SENDS the weight TO the system.
2. INCLUDE USE CASE Self Diagnosis.
3. The system VALIDATES THAT no error has been detected.
4. The system VALIDATES THAT the weight is above 20 Kg.
5. The system sets the occupancy status to adult.
6. The system SENDS the occupancy status TO AirbagControlUnit.
Alternative Flow
RFS 4.
1. IF the weight is above 1 Kg THEN
2. The system sets the occupancy status to child.
3. ENDIF.
4. RESUME STEP 6.
116. UseCaseStart
Input
Condition
Condition
Output
Exit
Condition
Internal
Internal
Include INCLUDE USE CASE Self Diagnosis.
IF the weight is above 1 Kg THEN
The SeatSensor SENDS the weight TO the system.
The system sets the occupancy status to adult.
The system SENDS the occupant class TO AirbagControlUnit.
The system VALIDATES THAT no error has been detected.
The system sets the occupancy status to child.
The system VALIDATES THAT the weight is above 20 Kg.
Precondition: The system has been initialized.
Model-based
Test Case Generation
driven by
coverage criteria
116
117. Domain Model:
Formalizing Conditions
Manually written OCL constraint:
“The system VALIDATES THAT no error has been detected.”
Error.allInstances()->forAll( i | i.isDetected = false)
117
118. [Figure: the same test model, with the Condition, Internal, and Exit nodes now annotated with OCL constraints]
Path condition (conjunction of the OCL constraints along the "adult" path):
System.allInstances()->forAll( s | s.initialized = true )
AND Error.allInstances()->forAll( e | e.isDetected = false)
AND System.allInstances()
->forAll( s | s.occupancyStatus = Occupancy::Adult )
Constraint Solving → Test inputs
118
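To make the constraint-solving step concrete, here is a toy sketch: it brute-forces a small input domain for a path condition equivalent to the one above. A real pipeline would hand the OCL path condition to a dedicated constraint solver; the names (`ADULT_THRESHOLD`, `solve`) and the input range are illustrative assumptions.

```python
# Toy sketch of the constraint-solving step: given the path condition
# for the "adult" scenario, find concrete test inputs that satisfy it.
# A real toolchain would use a constraint/SMT solver; here we simply
# enumerate a small input domain. All names are illustrative.

ADULT_THRESHOLD = 20  # from "the weight is above 20 Kg"

def path_condition(initialized, error_detected, weight):
    """Conjunction of the branch conditions along the 'adult' path."""
    return initialized and not error_detected and weight > ADULT_THRESHOLD

def solve(weights=range(0, 200)):
    """Return the first concrete input satisfying the path condition."""
    for w in weights:
        if path_condition(True, False, w):
            return {"initialized": True, "error_detected": False, "weight": w}
    return None

test_input = solve()
print(test_input)  # first satisfying assignment
```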
119. Automated Generation of OCL
Expressions
“The system VALIDATES THAT
no error has been detected.”
Error.allInstances()->forAll( i | i.isDetected = false)
OCLgen
119
120. Pattern behind the constraint:
Error.allInstances()->forAll( i | i.isDetected = false)
• EntityName: Error
• left-hand side (variable): isDetected
• operator: =
• right-hand side (variable/value): false
120
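As a small illustration of this pattern, a hypothetical helper (not part of any OCLgen API) can fill the four slots:

```python
# Minimal sketch of instantiating the OCL pattern shown above:
# EntityName.allInstances()->forAll( i | i.LHS <operator> RHS ).
# instantiate_pattern is a hypothetical helper, for illustration only.

def instantiate_pattern(entity, lhs, operator, rhs):
    """Fill the entity/LHS/operator/RHS slots of the forAll pattern."""
    return (f"{entity}.allInstances()"
            f"->forAll( i | i.{lhs} {operator} {rhs})")

constraint = instantiate_pattern("Error", "isDetected", "=", "false")
print(constraint)
# Error.allInstances()->forAll( i | i.isDetected = false)
```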
121. OCLgen solution
“The system sets the occupancy status to adult.”
(SRL roles: actor / affected by the verb / final state)
1. determine the role of words in a sentence
121
122. OCLgen solution
“The system sets the occupancy status to adult.”
(SRL roles: actor / affected by the verb / final state)
2. match words in the sentence with concepts in the domain model
1. determine the role of words in a sentence
122
123. OCLgen solution
“The system sets the occupancy status to adult.”
BodySense.allInstances()
->forAll( i | i.occupancyStatus = Occupancy::Adult)
(SRL roles: actor / affected by the verb / final state)
2. match words in the sentence with concepts in the domain model
3. generate the OCL constraint using a verb-specific transformation rule
1. determine the role of words in a sentence
123
124. OCLgen solution
“The system sets the occupancy status to adult.”
(SRL roles: actor / affected by the verb / final state)
1. determine the role of words in a sentence
(based on Semantic Role Labeling; lexicons describe the sets of roles typically associated with each verb)
2. match words in the sentence with concepts in the domain model
(based on string similarity)
3. generate the OCL constraint using a verb-specific transformation rule
BodySense.allInstances()
->forAll( i | i.occupancyStatus = Occupancy::Adult)
124
125. Constraints Generation Process
Execute SRL: use case sentence → text with SRL labels
Select and apply verb-specific transformation rule:
• All rules share a common algorithmic structure
• Rules differ in the SRL role labels considered
EntityName.allInstances()->forAll( i | i.LHS <Operator> RHS)
125
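The three OCLgen steps can be sketched end to end as a toy pipeline. The SRL output is hard-coded, the domain-model matching uses plain string similarity (difflib), and the domain names mirror the slides; the helper functions themselves are illustrative assumptions, not OCLgen's implementation.

```python
# Toy end-to-end sketch of the three OCLgen steps for one sentence.
# Real OCLgen relies on an SRL tool and verb lexicons; here the SRL
# output is hard-coded and matching uses plain string similarity.
import difflib

# Step 1 (assumed SRL output): roles for
# "The system sets the occupancy status to adult."
roles = {"actor": "system", "affected": "occupancy status", "state": "adult"}

DOMAIN_CLASSES = {"BodySense": ["occupancyStatus", "weight"]}  # assumed
ENUMS = {"adult": "Occupancy::Adult", "child": "Occupancy::Child"}

def best_match(phrase, candidates):
    """Step 2: match a phrase to the most similar domain-model name."""
    return max(candidates,
               key=lambda c: difflib.SequenceMatcher(
                   None, phrase.replace(" ", "").lower(), c.lower()).ratio())

def set_rule(roles):
    """Step 3: transformation rule for the verb 'set' (illustrative)."""
    entity = "BodySense"  # assumed resolution of the actor 'system'
    attr = best_match(roles["affected"], DOMAIN_CLASSES[entity])
    value = ENUMS[roles["state"]]
    return (f"{entity}.allInstances()"
            f"->forAll( i | i.{attr} = {value})")

print(set_rule(roles))
# BodySense.allInstances()->forAll( i | i.occupancyStatus = Occupancy::Adult)
```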
127. Problem and Context
• Schedulability analysis encompasses techniques that try to
predict whether (critical) tasks are schedulable, i.e., meet
their deadlines
• Stress testing runs carefully selected test cases that have
a high probability of leading to deadline misses
• Stress testing is complementary to schedulability analysis
• Testing is typically expensive, e.g., hardware in the loop
• Finding stress test cases is difficult
127
128. Finding Stress Test Cases is Hard
128
[Figure: two example schedules of jobs j0, j1, j2 on a timeline from 0 to 9, differing in the arrival time at2]
j0, j1, j2 arrive at at0, at1, at2 and must
finish before dl0, dl1, dl2
j1 can miss its deadline dl1 depending on
when at2 occurs!
129. Challenges and Solutions
• Ranges for arrival times form a very large input space
• Task interdependencies and properties constrain what
parts of the space are feasible
• Solution: We re-expressed the problem as a constraint
optimization problem and used a combination of constraint
programming (IBM CPLEX) and meta-heuristic search (GA)
129
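The search side of this combination can be sketched as a toy genetic algorithm over arrival times. The scheduler (single-core, non-preemptive, first-come-first-served), the task set, and the GA parameters below are all illustrative assumptions, not the model from the actual work.

```python
# Toy GA mutating task arrival times to maximize lateness
# (completion time minus deadline). Everything here is illustrative.
import random

# (duration, relative deadline) for jobs j0, j1, j2
TASKS = [(3, 5), (3, 5), (3, 5)]

def max_lateness(arrivals):
    """Fitness: worst lateness under FCFS non-preemptive scheduling."""
    t, worst = 0, float("-inf")
    for at, (dur, dl) in sorted(zip(arrivals, TASKS)):
        t = max(t, at) + dur          # job runs once the CPU is free
        worst = max(worst, t - (at + dl))  # positive => deadline miss
    return worst

def ga(pop_size=20, gens=50, seed=1):
    """Elitist GA: keep the fitter half, mutate it to refill the pool."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 6) for _ in TASKS] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=max_lateness, reverse=True)
        survivors = pop[: pop_size // 2]
        children = []
        for parent in survivors:
            child = parent[:]
            child[rng.randrange(len(child))] = rng.randint(0, 6)  # mutation
            children.append(child)
        pop = survivors + children
    return max(pop, key=max_lateness)

worst = ga()
print(worst, max_lateness(worst))  # arrival times provoking worst lateness
```

In the GA+CP approach this search only locates high-risk regions; constraint programming then proves the worst case within them.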
130. Constraint Optimization
Constraint Optimization Problem:
• Static Properties of Tasks → Constants
• Dynamic Properties of Tasks → Variables
• Performance Requirement → Objective Function
• OS Scheduler Behaviour → Constraints
130
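With assumed notation (arrival times a_j as decision variables, completion times C_j determined by the scheduler, relative deadlines d_j), one way to sketch this formulation is:

```latex
% Illustrative sketch, not the paper's exact model.
\begin{align*}
\max_{a_1,\dots,a_n} \quad & \max_{j}\; C_j - (a_j + d_j) \\
\text{s.t.} \quad & C_j \ \text{consistent with the OS scheduling policy}, \\
& a_j^{\min} \le a_j \le a_j^{\max} \qquad \forall j
\end{align*}
```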
131. Combining CP and GA
131
[Fig. 3 (Di Alesio et al.): Overview of GA+CP — the solutions x, y and z in the initial population of GA evolve into …]
133. Summary
• We provided a solution for generating stress test cases by combining
meta-heuristic search and constraint programming
• Meta-heuristic search (GA) identifies high risk regions in the
input space
• Constraint programming (CP) finds provably worst-case
schedules within these (limited) regions
• Achieve (nearly) GA efficiency and CP effectiveness
• Our approach can be used both for stress testing and
schedulability analysis (assumption free)
133
135. Other Industrial Projects
• Delphi: Testing and verification of CPS Simulink models
(e.g., controllers) [Matinnejad et al.]
• SES: Hardware-in-the-Loop, acceptance testing of CPS
[Shin et al.]
• IEE: Testing timing properties in embedded systems
[Wang et al.]
• Luxembourg government: Generating representative,
synthetic test data for information systems [Soltana et
al.]
135
136. Role of AI
• Metaheuristic search:
• Many test automation problems can be re-expressed as
search and optimization problems
• Machine learning:
• Automation can be better guided and more effective when
learning from data: test execution results, fault detection …
• Natural Language Processing:
• Natural language is commonly used and is an obstacle to
automated analysis, and therefore to test automation
136
137. Search-Based Solutions
• Versatile
• Helps relax assumptions compared to exact approaches
• Helps decrease modeling requirements
• Scalability, e.g., easy to parallelize
• Requires massive empirical studies
• Search is rarely sufficient by itself
137
138. Multidisciplinary Approach
• Single-technology approaches rarely work in practice
• Combine search with:
• Machine learning
• Solvers, e.g., CP, SMT
• Statistical approaches, e.g., sensitivity analysis
• System and environment modeling and simulation
138
139. The Road Ahead
• We need techniques that strike a balance among scalability,
practicality, and applicability while offering the highest
possible level of dependability guarantees
• We need more multi-disciplinary research involving AI
• In most industrial contexts, offering absolute guarantees
(correctness, safety, or security) is illusory
• The best trade-off between cost and level of guarantees is
necessarily context-dependent
• Research in this field cannot be oblivious to context (domain
…)
139
141. Some Leading Researchers
• A. Arcuri, Westerdals and U. of Luxembourg, Norway & Luxembourg
• R. Feldt, Chalmers U., Sweden
• G. Fraser, U. of Passau, Germany
• M. Harman, Facebook and UCL, UK
• T. Menzies, North Carolina State U., USA
• P. McMinn, U. Sheffield, UK
• A. Panichella, Delft U., NL
• M. Pezze, P. Tonella, U. of Lugano, Switzerland
• A. Zeller, CISPA and Saarland U., Germany
141
142. Selected SBST References
• McMinn, “Search-Based Software Testing: Past, Present and Future”, ICST 2011
• Harman et al., “Search-based software engineering: Trends, techniques and applications”, ACM
Computing Surveys, 2012
• Fraser, Arcuri, “Whole Test Suite Generation”, IEEE Transactions on Software Engineering, 2013
• A. Panichella et al., “Reformulating Branch Coverage as Many-Objective Optimization Problems”,
ICST 2015
• Ali et al., “Generating Test Data from OCL Constraints with Search Techniques”, IEEE Transactions
on Software Engineering, 2013
• Hemmati et al., “Achieving Scalable Model-based Testing through Test Case Diversity”, ACM
TOSEM, 2013
142
143. Selected ML-driven Testing
• Noorian et al., “Machine Learning-based Software Testing:
Towards a Classification Framework”, SEKE 2011
• Briand et al., “Using machine learning to refine category-partition
test specifications and test suites”, Information and Software
Technology (Elsevier), 2009
• Appelt et al., “A Machine Learning-Driven Evolutionary Approach
for Testing Web Application Firewalls”, IEEE Transactions on
Reliability, 2018
• Machine learning session at ISSTA 2018!
143
144. NLP-driven Testing
• Wang et al., “Automatic generation of system test cases from use case
specifications”, ISSTA 2015
• Wang et al., “Automated Generation of Constraints from Use Case
Specifications to Support System Testing”, ICST 2018
• Mai et al., “A Natural Language Programming Approach for
Requirements-based Security Testing”, ISSRE 2018
• Blasi et al., “Translating Code Comments to Procedure Specifications”,
ISSTA 2018
• Arnaoudova et al., “The use of text retrieval and natural language
processing in software engineering”, ICSE 2015
144
145. Selected Industrial Examples
• Matinnejad et al., “MiL Testing of Highly Configurable Continuous
Controllers: Scalable Search Using Surrogate Models”, ASE 2014
• Di Alesio et al. “Combining genetic algorithms and constraint
programming to support stress testing of task deadlines”, ACM
Transactions on Software Engineering and Methodology, 2015
• Ben Abdessalem et al., "Testing Vision-Based Control Systems Using
Learnable Evolutionary Algorithms”, ICSE 2018
• Soltana et al., “Synthetic Data Generation for Statistical
Testing”, ASE 2017.
• Shin et al., “Test case prioritization for acceptance testing of
cyber-physical systems”, ISSTA 2018
145
146. Selected Industrial Examples
• Appelt et al., “A Machine Learning-Driven Evolutionary Approach for
Testing Web Application Firewalls”, IEEE Transactions on Reliability, 2018
• Jan et al., “Automatic Generation of Tests to Exploit XML Injection
Vulnerabilities in Web Applications”, IEEE Transactions on Software
Engineering, 2018
• Wang et al., “System Testing of Timing Requirements Based on Use Cases
and Timed Automata”, ICST 2017
• Wang et al., “Automated Generation of Constraints from Use Case
Specifications to Support System Testing”, ICST 2018
146
147. .lusoftware verification & validation
VVS
Artificial Intelligence for Automated
Software Testing
Lionel Briand
ISSTA/ECOOP Summer School 2018