The document describes a keynote presentation for a new IT product called Keynote. It emphasizes that the future of IT is agile and collaborative, with a focus on iterative and customer-centric development. It introduces the concept of an agile service desk to empower every team. The presentation highlights new features for improved visibility, management, and resolution of incidents.
36. INTRODUCING EARLY ACCESS TO NEXT-GEN
Go to bit.ly/jsd-next-gen or visit the Jira Service Desk booth to learn more
55. Banc.ly backend war room
INCIDENT #23 | OPEN
Leave call
Timeline | Stakeholder Updates
11:33 AM | sent from slack
Dana Casey
The event stream data had some invalid records.
We need to fix the error handling and alerting.
11:33 AM
How about the instance? It’s not working and we need to back up the old versions.
Frances Ball
CHAT
Add responder
Support has verified customers can now access our online banking systems with no problems.
Malenia Kang · 01:47
Josie Michaels · 01:45
Josie Michaels · 01:50 · Incident resolved
16:45 (UTC +8) · Statuspage updated · Mary Smith
We have now fully restored service to all of our customers. We will continue to monitor the login services to ensure no further issues remain.
Resolved
Filter | Add entry
RESPONDER TEAMS
Banc.ly backend
Banc.ly SREs
Banc.ly frontend
PARTICIPANTS
Josie Michaels
Frances Ball
Gussie Romero
Dana Casey
Malenia Kang
Roles: Incident Commander, Communications Officer, Scribe
Assign/Update roles
60. Banc.ly site down for customers - 500 errors on /deposit/v2 API - postmortem report
Draft
Executive summary
Banc.ly site is down for customers. We’re seeing a large number of 500 errors in the CloudWatch logs due to errors on /deposit/v2 API.
> Rate limiting has prevented the flux service from receiving stream notifications.
Leadup
The hydrospanner became stuck in the Google pipeline. Despite heroic efforts to free said spanner, this led to a blockage 2 weeks ago.
Fault
The pressure due to this blockage grew until approximately 7pm on 21 Feb 2019, when there was an overflow of possums in the Google pipeline. Obviously, this led to an outage of Google logins.
Detection
This outage was first detected by New Relic. Simo Nalakorn was then alerted and acknowledged the alert at 7:21pm.
Root causes
Inappropriate spanner lubrication. The Google login pipeline should be able to withstand this kind of blockage, but its thresholds were exceeded. We ultimately performed inadequate checks of this pipeline.
Mitigation and resolution
Defragging the pipeline cleared the possums, allowing us to restart it. Login service restored at 7:51pm.
Lessons learnt
Timeline | Details
Am checking for possums in the Google tracts, as they have infested us before.
Josie Michaels · 01:34
The curse has not yet been lifted from the login. I am continuing to search.
Josie Michaels · 01:20
The defragulator is checked and is not the source of the problem. Frag lines are flowing smoothly.
Josie Michaels · 01:47
We have now fully cleared out the Login blockage. It seems that Google was full of possums again. We reset our API tokens and drained all cisterns of the pestilence but we will remain ever vigilant. The
Josie Michaels · 01:45
Mary Smith · 01:50 · Incident resolved
16:45 (UTC +8) · Statuspage updated · Mary Smith
We have now fully restored service to all of our customers. We will continue to monitor the login services to ensure no further issues remain.
Resolved
Filter | Add entry
61. Banc.ly site down for customers - 500 errors on /deposit/v2 API - postmortem report
Draft
Executive summary
Banc.ly site is down for customers. We’re seeing a large number of 500 errors in the CloudWatch logs due to errors on /deposit/v2 API.
> Rate limiting has prevented the flux service from receiving stream notifications.
Leadup
Describe the circumstances that led to this incident
Fault
Describe what failed to work as expected
Detection
Describe how the incident was detected
Root causes
Run a 5-whys analysis to understand the true causes of the incident
Mitigation and resolution
What steps did you take to resolve this incident?
Lessons learnt
What went well? What could have gone better? What else did you learn?
62. Global Reports: Alert MTTA/R reports
API usage reports
Notification reports
Alert reports
Alert MTTA/R reports
Alert analytics
User productivity analytics
Incoming call routing
Infrastructure health report
Monthly overview
On call reports
Postmortem reports
ICC Past Sessions
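MTTA/R in the report list above is commonly read as mean time to acknowledge and mean time to resolve. The arithmetic behind such a report can be sketched as follows. This is only an illustration, not how Opsgenie computes its reports: the record layout, the field names, and the sample timestamps are all made up for the example.

```python
from datetime import datetime
from statistics import mean

# Hypothetical alert records. Field names and values are assumptions
# chosen for this sketch, not data from the presentation.
ALERTS = [
    {
        "created": datetime(2019, 1, 7, 19, 0),
        "acknowledged": datetime(2019, 1, 7, 19, 21),
        "resolved": datetime(2019, 1, 7, 19, 51),
    },
    {
        "created": datetime(2019, 1, 8, 9, 0),
        "acknowledged": datetime(2019, 1, 8, 9, 5),
        "resolved": datetime(2019, 1, 8, 10, 30),
    },
]


def mean_minutes(alerts, start_field, end_field):
    """Average gap in minutes between two timestamps across all alerts."""
    return mean(
        (alert[end_field] - alert[start_field]).total_seconds() / 60
        for alert in alerts
    )


def mtta(alerts):
    """Mean time to acknowledge: creation to acknowledgement."""
    return mean_minutes(alerts, "created", "acknowledged")


def mttr(alerts):
    """Mean time to resolve: creation to resolution."""
    return mean_minutes(alerts, "created", "resolved")
```

With the sample records, the acknowledgement gaps are 21 and 5 minutes and the resolution gaps are 51 and 90 minutes, so the two means come out at 13.0 and 70.5 minutes.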
70. Hands-on Lab: Modern Incident Management with Opsgenie
Wednesday, 5:00 - 6:00 pm @ the Training Theater
Thursday, 3:45 - 4:45 pm @ the Training Theater
76. API slowness
An issue in a deploy caused a couple minutes of downtime across all status pages.
Upon discovering the issue, we rolled back immediately. This has been resolved and we
apologize for the inconvenience.
78. An issue in a deploy caused a couple minutes of downtime across all
status pages. Upon discovering the issue, we rolled back immediately.
This has been resolved and we apologize for the inconvenience.
INCIDENT IDENTIFIED
Sept 4, 16:06 CDT
SUBSCRIBE TO UPDATES
84. Status@bancly.com
[Banc.ly Status] Issues with the Banc.ly website
Issues with the Banc.ly website
Investigating
The website is still experiencing issues for approximately 15% of Banc.ly’s
consumer banking customers. Our engineers are working with our partners
to restore connectivity.
Components Affected
Time Posted: Oct 25, 11:48 PDT
87. Service desks
Human resources
We can help with new employee
onboarding and general queries.
IT Support
We can help with any questions
regarding your computer.
Give kudos
Say thanks to your colleagues,
send them a kudos here.
Welcome to the Banc.ly Help Center
Find help and services
1 Requests
Status update
Mobile app users having trouble logging in | View Status page
Status update
Mobile app users having trouble logging in | View in Statuspage
99. Create incident
CRITICAL
View | Update
Incidents
Performing maintenance on our file sync systems for the entire
weekend
BEGINNING 4 OCT 2018 (01:30 PDT)
Update
Maintenance | Templates | Incidents | Open
Apps
Your page
Upcoming
Components
Subscribers
Incidents
Public site
Banc.ly
View status page
Search
SCHEDULED
CRITICAL
View | Update
Performing maintenance on our file sync systems for the entire
weekend
10 MINS AGO (08:30 UTC)
Component group name - component name long lorem Short name
Update
IN PROGRESS
2
100. Teams / Banc.ly backend
On-call | Integrations | Services | Members | Roles | Policies | Conferences | Activity stream
Escalations
Banc.ly backend weekday
Banc.ly backend weekend
0m: On-call users in Banc.ly backend, if not acknowledged
5m: Sarah Smith, if not acknowledged
10m: Ryan Windows, if not acknowledged
15m: Evgeny Willows, if not ackno
20m: Banc
Routing rules
IF routing time is between Friday 18:00 AND Monday 06:00
THEN route alerts to Banc.ly backend weekend
ELSE for any received alert route the alert to Banc.ly backend weekday
On-call schedules
Banc.ly backend (+03:00) MSK Moscow
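The routing rule and escalation steps on this slide can be read as a small piece of logic. The sketch below is a plain-Python illustration, not Opsgenie configuration or API code. It assumes timestamps are already in the schedule's local time (the slide shows +03:00 MSK), and it omits the 20-minute escalation step because its target is cut off in the slide. The function and constant names are hypothetical.

```python
from datetime import datetime

# Weekend window from the slide: Friday 18:00 to Monday 06:00.
# datetime.weekday() numbers Monday as 0 and Friday as 4.
WEEKEND_START = (4, 18, 0)  # Friday 18:00
WEEKEND_END = (0, 6, 0)     # Monday 06:00

# Escalation steps as (minutes unacknowledged, who is notified).
# The 20-minute step is truncated in the slide, so it is left out.
ESCALATION_STEPS = [
    (0, "On-call users in Banc.ly backend"),
    (5, "Sarah Smith"),
    (10, "Ryan Windows"),
    (15, "Evgeny Willows"),
]


def in_weekend_window(ts: datetime) -> bool:
    """True from Friday 18:00 up to, but not including, Monday 06:00."""
    key = (ts.weekday(), ts.hour, ts.minute)
    # The window wraps past the end of the week, so it is the union of
    # "Friday 18:00 or later" and "earlier than Monday 06:00".
    return key >= WEEKEND_START or key < WEEKEND_END


def route_alert(ts: datetime) -> str:
    """Pick the escalation policy for an alert received at ts."""
    if in_weekend_window(ts):
        return "Banc.ly backend weekend"
    return "Banc.ly backend weekday"


def notified_so_far(minutes_unacknowledged: int) -> list:
    """Everyone who has been notified after this many minutes without an ack."""
    return [who for delay, who in ESCALATION_STEPS
            if delay <= minutes_unacknowledged]
```

For example, an alert received on a Saturday goes to the weekend policy, and after 7 minutes without acknowledgement both the on-call users and Sarah Smith have been notified.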
103. Incident #10
Banc.ly site down for customers - 500 errors on /deposit/v2 API
Mar 9, 2019 9:57 AM
ConnectWise Integration +4
Elapsed time: 4h 4m 38s
P2
Timeline | Filter | Add entry
Am checking for possums in the Google tracts, as they have infested us before.
Josie Michaels · 01:14
The curse has not yet been lifted from the login. I am continuing to search.
Liam Hens · 00:34
The defragulator is checked and is not the source of the problem. Frag lines are flowing smoothly.
Mark Kane · 01:24
We have now fully cleared out the Login blockage. It seems that Google was full of possums again. We reset our API tokens and drained all cisterns of the pestilence but we will remain ever vigilant. The next step will be to
Josie Michaels · 01:16
Mary Smith · 01:30 · Incident resolved
We have now fully restored service to all of our customers. We will continue to monitor the login services to ensure no further issues remain.
Resolved
16:45 (UTC +8) · Statuspage updated · Mary Smith
Monday 16 February 2018
Mary Smith · 23:54 · Stakeholders updated
Not having any toilet paper sucks, but we are working to resolve this
We are continuing to monitor this outage
Join command center
Open
Associated alerts | Responders | Stakeholders | Details | Service desk tickets
5 linked service desk tickets | Link ticket
BLY-1227 I can’t login to the site
BLY-1224 Banc.ly isn’t working
BLY-1219 Can’t access the site
BLY-1218 The site is down
BLY-1216 Website? More like web-shite!
WAITING FOR SUPPORT
WAITING FOR SUPPORT
PENDING RESPONSE
PENDING RESPONSE
PENDING RESPONSE
106. Saved Search
PREDEFINED
All
Private services
Public services
Sort by Service name
Services
{ } Search SaveSearch
6 services
Add service
Notification Email
Nullam ullamcorper congue lacus vel tempus
SQL Etiam aliquet Phasellus
Tardis-ops
Tardis-ops
JAVA Python SDKs
Vestibulum nec lacus sit amet libero semper sollicitudin
SDK Java
HAS INCIDENT
Morpheus
OpsGenie App Service - Frontend
In rhoncus, ipsum pulvinar laoreet viverra, orci erat posuere enim, ac consequat purus lorem in arcu
Opsgenie Frontend
Pricing and Subscription Management
In hac habitasse platea dictumst. Interdum et malesuada fames ac ante ipsum primis in faucibus
Pricing Subscription
HAS INCIDENT
Alexstrasza-ops
Pricing Subscription
Alexstrasza-ops
Schedule Tardis-ops
User and authorization related functionalities via Web UI and Rest APIs
HAS INCIDENT
User and Authorization Management
111. Components
Third-party | Active
Adroll
Adzerk
Apigee
Akana
Aircall
Atlassian
Atlassian Bitbucket
Acquia, inc
115. Incident #10
Banc.ly site down for customers - 500 errors on /deposit/v2 API
Mar 9, 2019 9:57 AM
ConnectWise Integration +4
Elapsed time: 4h 4m 38s
Associated alerts | Responders | Stakeholders | Details
Priority: P2 - High
Team: Banc.ly backend
Service: bancly-backend-api
Description: Banc.ly site is down for customers. We’re seeing a large number of 500 errors in the CloudWatch logs due to errors on /deposit/v2 API.
> Rate limiting has prevented the flux service from receiving stream notifications.
Incident response roles | + Assign role
Role | User
Incident commander
Please make sure the users added to each response role have the necessary incident management rights to take actions on the incidents.
Jira issues
Create new issue | Link existing issue
Join command center
Open
116. Jira issues
Create new issue | Link existing issue
Banc.ly backend
What needs to be fixed? Add error handling to deposit API for invalid tuple length
Create | Cancel
117. Jira issues
BBE-1227 TO DO Add error handling to deposit API for invalid tuple length
BBE-1228 TO DO Fix alerting rules to notify devs when rate limits exceeded
Create new issue | Link existing issue