Knowledge-infused AI

Knowledge-infused AI for Healthcare:
Role of Conceptual Medical Knowledge in
Improving Machine Understanding
Artificial Intelligence Institute
Manas Gaur

AI
Outline
Why do we need Knowledge Infusion?
Let Me Tell You About Your Mental Health!:
Contextualized Classification of Reddit Posts to DSM-5
Unsupervised Abstractive Summarization of Diagnostic
Mental Health Interviews
Semi-Deep Knowledge Infusion
Shallow Knowledge Infusion

[Figure: spectrum of approaches. Statistical: BERT. Statistical + Constraints: Abstractive Summarization using Integer Linear Programming (ILP). Statistical + Constraints + Knowledge: Abstractive Summarization using ILP and PHQ-9.]

Arachie, Chidubem, Manas Gaur, Sam Anzaroot, William Groves, Ke Zhang, and Alejandro Jaimes. "Unsupervised Detection of Sub-events in
Large Scale Disasters." arXiv preprint arXiv:1912.13332 (2019).
Unsupervised Detection of Sub-events in Large Scale
Disasters

[Figure: Contextual Dimension Modeling. Word2Vec (W2V) models trained on an Islamic corpus, an ideological corpus, and a hate corpus give the Religious (R), Ideological (I), and Hate (H) dimensions, which make up the contextual-dimension-based user representation. Contextual Dimension Modeling is a one-time learning process.]

Probably Approximately Correct Learning
How do you know that a training set has good domain coverage?
Robust Classifier → Low Generalization Error
Consistent Classifier → Low Training Error
Confidence: More certainty (lower δ) requires more samples.
Complexity: A more complicated hypothesis space (larger |H|) requires more samples.
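As a rough illustration of the confidence and complexity points, the textbook sample-complexity bound for a consistent learner over a finite hypothesis class, m ≥ (1/ε)(ln|H| + ln(1/δ)), can be sketched as follows. The function name and the example numbers are illustrative assumptions and do not come from the slides.

```python
import math


def pac_sample_bound(hypothesis_count: int, epsilon: float, delta: float) -> int:
    """Number of samples sufficient for a consistent learner over a finite
    hypothesis class to reach error <= epsilon with probability >= 1 - delta."""
    if hypothesis_count < 1:
        raise ValueError("hypothesis_count must be at least 1")
    if not (0 < epsilon < 1) or not (0 < delta < 1):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    bound = (math.log(hypothesis_count) + math.log(1.0 / delta)) / epsilon
    return math.ceil(bound)


# Hypothetical numbers: lowering delta or enlarging |H| raises the bound.
print(pac_sample_bound(1000, epsilon=0.1, delta=0.05))
print(pac_sample_bound(1000, epsilon=0.1, delta=0.01))
print(pac_sample_bound(10**6, epsilon=0.1, delta=0.05))
```

The bound grows only logarithmically in |H| and 1/δ but linearly in 1/ε, which is why tightening the error tolerance is the most expensive of the three.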

PAC Learning to Knowledge Infusion
Challenge:
Existing ML
Models:
Infusion:
True Data
Distribution
Hypothesis Data
Distribution

[Figure: Shallow Infusion. Diagram elements: Dataset, enrich, Tacit Knowledge, Machine Learning, and hypothesis testing or similarity-based verification.]
[Figure: Semi-Deep Infusion. Diagram elements: Dataset, Tacit Knowledge, Self-aware or External Knowledge, and similarity-based verification.]

Benefits of Infusing Knowledge
Interpretability: Rules or axioms constructed from patterns learned by a machine learning model.
Traceability: If we can validate the correctness of those rules or axioms against a ground truth, we achieve traceability.
Explainability: Interpretability + Traceability

Knowledge Infusion
Identification and Integration of Commonsense
knowledge for principled reasoning.
Identification: Finding relevant information at an
appropriate abstraction level in the Knowledge
Graph
Integration: Controlled content enrichment or
modification to reduce Impedance Mismatch in
learning
Benefit: Robustness is ensured

Patient is a known case of non-Hodgkin’s lymphoma and has undergone three cycles of chemotherapy.

Algorithmic possibilities and
limitations of AI System
Teaching Materials
● Ontology
● Knowledge Graph
● Knowledge Base
● Lexicons
Teaching Materials form a conceptual framework
of interconnecting sets of domain-focused
concepts and relationships
Remove ambiguity and sparsity.
Drug Abuse Ontology
● Concepts (315)
● Relations (31)
● Instances (814)

Teaching Materials: Commonsense Reasoning
● Web Mining, e.g. NELL, KnowItAll
● Crowdsourcing, e.g. ConceptNet, OpenMind
● Knowledge-based
○ Mathematical, e.g. Situation Calculus
○ Informal, e.g. LIWC, Scripts
○ Large-Scale, e.g. CYC, DBpedia

Knowledge Infusion in Healthcare
3 Challenges
● Abstraction: Shallow Infusion
● Contextualization: Shallow and Semi-Deep Infusion
● Personalization: Shallow, Semi-Deep, and Deep Infusion

Abstraction: Medical Entity Normalization
Post → normalized concepts
"I am sick of loss, need a way out" → depress, suicide ideation
"No way out, I am tired of my losses" → suicide ideation, depress
"Losses, Losses, I want to die" → depress, suicide attempt
Labels: Suicide, Depression

Teaching Material: Suicide Severity Lexicon
Suicide Risk Class | Number of Entities | Sample Medical Phrases
Suicide Indicator | 1472 | Severe mood disorder with psychotic feature; Severe major depression; Family history of suicide; Sedative
Suicide Ideation | 409 | Bipolar affective disorder; Borderline Personality; Depressive conduct disorder; Sexual maturation disorder
Suicide Behavior | 145 | Suicidal behavior; Intentional self-harm; Incomplete attempt; Threatening suicide
Suicide Attempt | 123 | Attempt actual suicide; Attempt physical damage; Intensive care; Second-degree burns

Suicide by Hanging [SNOMED ID: 287190007]
<child of> Suicide [SNOMED ID: 44301001]
<sibling of> Drug Overdose [SNOMED ID: 274228002]
<sibling of> Personal history of self-harm [ICD-10 ID: Z91.5]
<sibling of> Severe depressive episode with psychotic symptoms [ICD-10 ID: F32.3]

Contextualization
Post: "I dont think Ive thought about it every day of my entire life. I have for a good portion of it, however, my boyfriend may be able to determine whether I’m worth his time"
Outcome: Suicide Indication
Post: "Having a plan for my own suicide has been a long time relief for me as well. I more often than not wish I were dead. I dont think Ive thought about it every day of my entire life. I have for a good portion of it, however, my boyfriend may be able to determine whether I’m worth his time"
Outcome: Suicidal Ideation

Contextualization
Medical Knowledge Bases
Language Model
(LDA, BERT)
Content Similarity Matrix

Personalization
Personalization refers to choosing a future course of action by taking into account contextual factors such as the user’s health history, physical characteristics, environmental factors, activity, and lifestyle.
[Figure: chatbot conversations without and with contextualized personalization.]
A chatbot with contextualized (asthma) knowledge is potentially more personalized and engaging.

Let Me Tell You About Your Mental Health!:
Contextualized Classification of Reddit Posts to DSM-5
Gaur, Manas, Ugur Kursuncu, Amanuel Alambo, Amit Sheth, Raminta Daniulaityte, Krishnaprasad Thirunarayan, and Jyotishman Pathak. "Let me tell you about your mental health!: Contextualized classification of Reddit posts to DSM-5 for web-based intervention." In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, pp. 753-762. ACM, 2018.

Problem Statement:
Can data on the Web assist Mental Health Professionals in early intervention?

Motivation
People (clinician and patient)
● Patients experience social anxiety in face-to-face conversations with mental health professionals
● Patients have poor recall
● Patient behavior is poorly understood
Data
● Clinical data is time-limited.
● Twitter data is short and not categorized.
● Reddit data is long and categorized.
● Reddit's categorization (subreddits) does not align with clinicians' diagnostic categories.

Main Post
Comment
Reply
Subreddit

Challenge
➢ How can we use Reddit for psychiatric diagnosis?
○ Is it possible to map subreddits to the Diagnostic and Statistical Manual of Mental Disorders (DSM-5)?
○ If yes, can we build a learning algorithm that classifies a social media user into the appropriate DSM-5 category for a suitable diagnosis?

Background on DSM-5
The Diagnostic and Statistical Manual of Mental Disorders, 5th Edition (DSM-5), published in 2013, is often called the psychiatric bible. It is the reference for diagnosing mental illness, which affects 46.4% of the adult US population.
Redditors conversing about Alcohol Abuse or Caffeine Intoxication can be mapped to the DSM-5 category Substance-use and Addictive Disorder.
There are 21 diagnostic categories, of which 20 are specific to mental health.

Examples
Post from Bipolar Subreddit:
I know you want me to say no and that it is a part of me blah blah blah. But I can't. Honestly, not having bipolar disorder would be a huge blessing. I would be so much happier and could control my life better. I wouldn't have frantic, scattered thoughts and depression. I would be normal, happy, and less dramatic.
DSM-5 Chapter: Depressive Disorders
Post from Suicidewatch Subreddit:
Upon additional research, zolpidem (ambien) has a half-life of 2-3 hours, and so if he’s still awake, he’s either got a massive tolerance for this stuff or he’s really trolling.
DSM-5 Chapter: Suicidal Behavior/Ideation Disorders

Dataset
All conversations: 2005-2016; 550K users; 8 million conversations; 15 mental health subreddits
Main posts only: 2005-2016; 270K users (only authors of main posts); 3 million conversations (main posts only); 15 mental health subreddits

Reddit to DSM-5 Mapping
Input: <Reddit Post> <Subreddit Label>
Language models: N-grams (n=1, 2, 3), LDA, LDA over bi-grams
Medical Knowledge Bases: DSM-5 Lexicon, DAO (Drug Abuse Ontology)
Scoring: Normalized Hit Score
Output: <Reddit Post> <DSM-5 Label>

Language Modeling and Coherence
● Topics describing each subreddit are identified through:
○ Skip-gram model to generate n-grams
○ LDA over individual subreddits
○ LDA over bigrams of individual subreddits
● Relevant topics were selected by constraining on a topic coherence measure.
● We use the UCI topic coherence measure, which is based on pointwise mutual information (PMI).
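To make the coherence step concrete, here is a minimal sketch of a UCI-style score: the average PMI over all pairs of a topic's top words. It assumes document-level co-occurrence on a toy corpus instead of the sliding window over an external corpus that UCI coherence is usually computed with. The function name, the smoothing constant, and the example documents are illustrative assumptions.

```python
import math
from itertools import combinations


def uci_coherence(top_words, documents, eps=1e-12):
    """Average PMI over all pairs of a topic's top words.

    `documents` is a list of token lists; co-occurrence is counted per document.
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents that contain every word in `words`.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(top_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        if p1 == 0 or p2 == 0:
            continue  # skip words that never occur in the reference corpus
        scores.append(math.log((p12 + eps) / (p1 * p2)))
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy corpus.
docs = [
    ["sleep", "insomnia", "tired"],
    ["sleep", "insomnia"],
    ["alcohol", "relapse"],
    ["alcohol", "sober", "relapse"],
]
print(uci_coherence(["sleep", "insomnia"], docs))
print(uci_coherence(["sleep", "alcohol"], docs))
```

Topics whose top words co-occur more often than chance score higher, so a threshold on this score can be used to keep only coherent topics.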

Normalized Hit Score
We computed the Normalized Hit Score (nhs) between the LDA topics of each subreddit (S) and the DSM-5 lexicon (D) to infer the subreddit's corresponding DSM-5 category.

Mapping Example
[Figure: subreddits mapped to DSM-5 chapters.]
Subreddits: BiPolar, BiPolarReddit, BiPolarSOS, Depression, Addiction, Crippling Alcoholism, Opiates Recovery, Opiates, Self-Harm, Stop Self-Harm
DSM-5 Chapters: Depression Disorder; Substance use & Addictive Disorder

SEDO
Semantic Encoding and Decoding Optimization (SEDO) is a procedure to modulate the word embedding (vector) of a word.
[Figure: SEDO architecture. Elements: Reddit posts with DSM-5 labels; word embedding model; correlation matrix (Q) over word vectors; medical knowledge bases and domain experts; DSM-5 Lexicon and DAO; correlation matrix (P) over the DSM-5 Lexicon or DAO; cross-correlation matrix (Z) between word vectors and the DSM-5 Lexicon or DAO; SEDO optimizing P, Q, and Z; DSM-5 vocabulary matrix; modulated word embeddings; linguistic features; DSM-5 classification.]

Semantic Encoding and Decoding Optimization
We infuse background knowledge from the DSM-5-DAO lexicon into the classification process using SEDO.
We introduce SEDO as an approach for obtaining a discriminative weight matrix between the DSM-5 lexicon and the Reddit embedding space.
SEDO modulates the embedding of each word in a user's Reddit content based on the word's proximity to a DSM-5 category.
SEDO optimizes three matrices: the correlation matrix (Q) over word vectors, the correlation matrix (P) over the DSM-5 Lexicon or DAO, and the cross-correlation matrix (Z) between word vectors and the DSM-5 Lexicon or DAO.

Solvable Sylvester Equation
[Figure: R is the Reddit word embedding model (12808 words, 300-dimension embedding); D is the DSM-5-DAO lexicon (20 DSM-5 categories, 300-dimension embedding); W is the matrix relating the two.]
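The slide states only that the problem reduces to a solvable Sylvester equation; the specific matrices are not given in this text. As a generic sketch, assuming the standard form AX + XB = C, a small dense solver can be written by flattening X into a vector and solving the resulting linear system. All names and the toy matrices are illustrative assumptions, and this is not the SEDO implementation.

```python
def solve_linear(M, rhs):
    """Gauss-Jordan elimination with partial pivoting for M x = rhs."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(M, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: A and -B share an eigenvalue")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                for c in range(col, n + 1):
                    aug[r][c] -= factor * aug[col][c]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def solve_sylvester(A, B, C):
    """Solve A X + X B = C for X, where A is n x n, B is m x m, C is n x m."""
    n, m = len(A), len(B)
    size = n * m
    M = [[0.0] * size for _ in range(size)]
    for i in range(n):
        for j in range(m):
            row = i * m + j  # equation for entry (i, j) of A X + X B
            for k in range(n):
                M[row][k * m + j] += A[i][k]  # contribution of (A X)[i][j]
            for l in range(m):
                M[row][i * m + l] += B[l][j]  # contribution of (X B)[i][j]
    flat = solve_linear(M, [C[i][j] for i in range(n) for j in range(m)])
    return [flat[i * m:(i + 1) * m] for i in range(n)]


# Hypothetical 2 x 2 example.
X = solve_sylvester([[2.0, 0.0], [0.0, 3.0]],
                    [[1.0, 0.0], [0.0, 4.0]],
                    [[3.0, 12.0], [12.0, 28.0]])
print(X)
```

A unique solution exists exactly when A and -B have no eigenvalue in common. For matrices at the scale described on the slide, a dedicated routine such as the Bartels-Stewart algorithm would be the practical choice, since this flattened system grows as (n·m)².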

Encoding DSM-5 to Reddit embedding space
Decoding Reddit to DSM-5 embedding space

Domain-specific knowledge lowers False Alarm Rates.

Unsupervised Abstractive Summarization of
Diagnostic Mental Health Interviews
Gaur, Manas, Vamsi Aribandi, Ugur Kursuncu, Amanuel Alambo, Krishnaprasad Thirunarayan, Jonathan Beich and Amit Sheth. "Unsupervised Abstractive Summarization of
Diagnostic Mental Health Interviews", under review in The Web Conference 2020

Motivation
● Mental Health Professionals (MHPs) must interact with the patient and take notes at the same time, which negatively affects decision making by:
○ lowering empathy towards the patient,
○ introducing mistrust due to social stigma and therapeutic pessimism, and
○ distracting from capturing relevant information,
● thus thwarting a learned follow-up procedure.
● The proposed research infuses knowledge into an Abstractive Summarization framework (PHQxAS).
● The framework summarizes long conversations (58-60 sentences) in 7-8 sentences.

Dataset
● The Distress Analysis Interview Corpus Wizard-of-Oz
(DAIC-WoZ) interviews database consists of clinical
interviews designed to support the diagnosis of psychological
conditions such as anxiety, depression, and post-traumatic
stress disorder.
● It contains data from 189 interviews, generally 7-33 minutes
long, with an average length of 16 minutes.
● The interviews were conducted by a virtual interviewer which
is controlled by a human in another room.
● 5 out of the 189 interviews have been excluded for this
study as they have imperfections in the data collection
or transcription process.
● We further filtered the interview scripts based on
subjectivity, polarity, and entropy analysis.

Our Approach
● Identification of relevant utterances from interview transcripts using the PHQ-9 Lexicon.
● Generation of a semantic similarity score for each word to assess its relevance to mental health issues.
● We do this by retrofitting ConceptNet embeddings with the PHQ-9 Lexicon.
● Let c(wi) be the maximum cosine similarity score between a word wi in ConceptNet (vocabulary size V) and the PHQ-9 Lexicon. The Word Semantic Score (WSS) of any word wt is calculated from these scores.
[Equation: WSS(wt), shown as an image on the slide.]
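The WSS formula itself is not reproduced in this text, so the following sketch covers only the c(w) term described above: the maximum cosine similarity between a word and any lexicon term. The embedding values, the lexicon entries, and the function names are hypothetical stand-ins for the retrofitted ConceptNet vectors and the PHQ-9 Lexicon.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def max_lexicon_similarity(word, embeddings, lexicon):
    """c(w): highest cosine similarity between `word` and any lexicon term.

    Words or lexicon terms without an embedding contribute nothing.
    """
    if word not in embeddings:
        return 0.0
    return max(
        (cosine(embeddings[word], embeddings[t]) for t in lexicon if t in embeddings),
        default=0.0,
    )


# Hypothetical 2-dimensional embeddings and lexicon terms.
embeddings = {
    "hopeless": [0.9, 0.1],
    "sad": [1.0, 0.0],
    "tired": [0.6, 0.8],
    "pizza": [0.0, 1.0],
}
lexicon = ["sad", "tired"]
for w in ("hopeless", "pizza"):
    print(w, max_lexicon_similarity(w, embeddings, lexicon))
```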

Our Approach
● Improvement of generated summaries using a linguistic quality measure (LQ).
● The LQ formulation uses WSS(wt), so that more domain-relevant terms appear in summaries.
● Unification of our modifications into an Integer Linear Programming (ILP) framework, which optimizes Informativeness (I) and LQ.
● The ILP framework constructs a Word Graph with k paths (Pk) and tries to maximize I(Pk) and LQ(Pk).
● TextRank is used to measure informativeness.
● A language model is used to evaluate linguistic quality.
● To select the best path, both measures are incorporated to formulate an optimization problem.
● This optimization problem is solved through an ILP framework.
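As an illustration of the informativeness measure, here is a minimal sentence-level TextRank sketch: sentences are nodes, edges are weighted by word overlap, and scores come from a damped PageRank iteration. The slides apply TextRank to paths in a word graph, so the sentence-level setup, the similarity function, and the example sentences are simplifying assumptions.

```python
import math


def sentence_similarity(a, b):
    """Word overlap normalized by the log of each sentence's vocabulary size."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if len(wa) < 2 or len(wb) < 2:
        return 0.0  # very short sentences are skipped to avoid a zero denominator
    return len(wa & wb) / (math.log(len(wa)) + math.log(len(wb)))


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences by weighted PageRank over the similarity graph."""
    n = len(sentences)
    weights = [
        [sentence_similarity(sentences[i], sentences[j]) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


# Hypothetical utterances; the last one shares no words with the others.
utterances = [
    "i feel tired and sad every day",
    "i feel sad and tired at work",
    "i feel tired every day at work",
    "the weather was sunny yesterday",
]
for sentence, score in zip(utterances, textrank(utterances)):
    print(round(score, 3), sentence)
```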

Baselines
We compare our approach with state-of-the-art summarization techniques:
● Extractive Summarization (ES): Greedily identifies important utterances from interview scripts and produces a summary. It fails to gather context in the conversation.
● Abstractive Summarization (AS): Examines and interprets the interview scripts to generate more contextualized summaries. It fails to gather domain knowledge.
● Abstractive over Extractive (AOES): ES is efficient at filtering out non-informative sentences, which can help AS generate more coherent summaries.
● Knowledge Infused AS (KIAS) (our approach): Existing approaches do not consider domain knowledge, which is important to the end user. AS and ES tend to lose important pieces of information, as the illustrated summaries show.
Since there are no ground truth summaries for clinical diagnostic interviews, we used the interview transcripts for evaluation.

KL Divergence Based Evaluation
● Median KL divergence score for different summarization approaches and PHQxAS over 184 patient summaries.
● Median KL captures the amount of information lost in summarization and is insensitive to outlier summaries.
● As the number of topics (NTopics) increases, LDA tends to identify topics that are specific and rare. As a result, median KL tends to increase and summaries start to diverge from the conversation.
● Our approach still sets the lower bound by generating summaries close to the pruned conversations.
● The number of topics was restricted to 7 because of the length of the interviews per patient.
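A minimal sketch of this evaluation, assuming each conversation and each summary has already been reduced to a topic distribution (for example by LDA) and that the divergence is taken from the conversation to its summary. The direction of the divergence, the smoothing constant, and the toy distributions are assumptions.

```python
import math
from statistics import median


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same topics."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def median_kl(conversation_topics, summary_topics):
    """Median KL divergence across paired conversation and summary distributions."""
    scores = [kl_divergence(c, s) for c, s in zip(conversation_topics, summary_topics)]
    return median(scores)


# Hypothetical topic distributions for three patients; the third summary is an outlier.
conversations = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
summaries = [[0.5, 0.5], [0.6, 0.4], [0.01, 0.99]]
print(median_kl(conversations, summaries))
```

Using the median keeps a single badly diverging summary, such as the third one here, from dominating the reported score.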

The plot illustrates KL scores of those patient summaries where our approach marginally outperforms the state of the art, with a median KL of 0.48.
The plot illustrates KL scores of those patient summaries where our approach significantly outperformed the state-of-the-art summarization approaches, with a median KL of 0.2.

Domain Expert Based Evaluation
● Questions with Unclear Context: The questions interpreted and phrased by the summarizer are essential to an MHP, but the MHP must make some inferences to understand them.
● For example: "Participant was asked, when was the last time that happened?", where the referent of "that" is unclear.
● Questions with Clear Context: These questions are useful to an MHP as they are complete and require no inference on the part of the MHP.
● For example: "Participant was asked, did they ever suffer from PTSD?"
● Meaningful Response: We consider a response meaningful if it helps an MHP understand patient behavior, or if it matches well with the question being asked by the MHP.

Generated Summaries
https://docs.google.com/spreadsheets/d/17ax_FsLs4Xkb95g4RDWT04g631vciktqH_mwisE6A6s/edit?usp=sharing


Acknowledgement
Amit Sheth, amit@knoesis.org
Thirunarayan Krishnaprasad, tkprasad@knoesis.org
Jyotishman Pathak, jyp2001@med.cornell.edu
Uğur Kurşuncu, ugur@knoesis.org

Reddit conversations can be:
● Main Posts
● Comments
● Replies
Not all conversations are informative.
Pw is the probability of occurrence of a word w in a Reddit main post file, UWS is the set of unique words in S, and |UWS| is the total number of unique words in a subreddit S.

Horizontal Linguistic Features
Number of Definite Articles: Indicates the abstractness of the content. A higher value means personal communication.
Number of Words Per Post: Defines the descriptiveness of the content.
First Person Pronouns: Higher use of first person pronouns indicates social anxiety, distress, interpersonal problems, etc.
Number of Pronouns: Depressed users use significantly more first person singular pronouns than second or third person pronouns.
Subordinate Conjunction: Indicates a rational thought process.
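A few of these per-post counts can be sketched with a simple tokenizer. The pronoun list, the tokenization rule, and the feature names are illustrative assumptions; contractions such as "I'm" are not split, so they are not counted as pronouns here.

```python
import re

# Assumed word lists for illustration only.
FIRST_PERSON = {"i", "me", "my", "mine", "myself", "we", "us", "our", "ours", "ourselves"}
DEFINITE_ARTICLES = {"the"}


def horizontal_features(post):
    """Count words, definite articles, and first person pronouns in one post."""
    tokens = re.findall(r"[a-z']+", post.lower())
    return {
        "num_words": len(tokens),
        "num_definite_articles": sum(t in DEFINITE_ARTICLES for t in tokens),
        "num_first_person_pronouns": sum(t in FIRST_PERSON for t in tokens),
    }


print(horizontal_features("I lost my job and the house. I am tired."))
```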

Vertical Linguistic Features
Number of POS tags: Noun, Verb, and Adjective
Similarity between the posts: Detects gradual or abrupt drifting of topics.
Intra-Subreddit Similarity: Defines the similarity between the users within a subreddit.
Inter-Subreddit Similarity: Defined as the average similarity between a user in a subreddit A and all other users in other subreddits.

Fine-Grained Features
Sentiment Scores: We used the AFINN lexicon, a word list for sentiment analysis of informal text.
Emotion Scores: We used LabMT, a word list that scores the happiness of a corpus. It was developed over Twitter, Google Books, and the New York Times.
Readability Scores: We used the Flesch-Kincaid readability index to score the content of users suffering from mental illness.
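The readability score can be sketched as follows, assuming the grade-level variant of the Flesch-Kincaid formula, 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59. The syllable counter is a rough vowel-group heuristic chosen for brevity, and the function names are illustrative.

```python
import re


def count_syllables(word):
    """Rough heuristic: one syllable per group of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level of `text` (0.0 for empty input)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


print(flesch_kincaid_grade("The cat sat. The dog ran."))
print(flesch_kincaid_grade("Psychiatric evaluation necessitates comprehensive documentation."))
```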

Contextual Features with/without Modulations
● Contextual Features: These features define the context of the user content.
○ Word Embedding Model: Trained over 3 million posts from 15 subreddits using varying window sizes (2, 5, 10), varying frequency (2 and 5), and a Skip-gram with softmax configuration.
○ Linguistic Inquiry and Word Count (LIWC): Psycholinguistic words defining the mental state of a person through written samples. E.g., worried, fearful, and nervous map to Anxiety.
○ TF-IDF: Defines the importance of a word in a document (subreddit).
● Contextual features with modulation: Since the word embedding model ignores the importance of words, TF-IDF scores can help classification by strongly distinguishing important words from others.
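One way to read the modulation step is as a TF-IDF-weighted average of word vectors, so that words common to every document contribute little. This is a sketch under that assumption; the weighting scheme, the function names, and the toy embeddings are hypothetical and are not the authors' exact procedure.

```python
import math
from collections import Counter


def tfidf(documents):
    """Per-document TF-IDF weights; each document is a list of tokens."""
    n_docs = len(documents)
    doc_freq = Counter(w for doc in documents for w in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            w: (c / len(doc)) * math.log(n_docs / doc_freq[w])
            for w, c in counts.items()
        })
    return weights


def modulated_vector(doc_weights, embeddings, dim):
    """TF-IDF-weighted average of the embeddings of a document's words."""
    total = [0.0] * dim
    norm = 0.0
    for word, weight in doc_weights.items():
        if word in embeddings and weight > 0:
            total = [t + weight * x for t, x in zip(total, embeddings[word])]
            norm += weight
    return [t / norm for t in total] if norm else total


# Hypothetical documents and 2-dimensional embeddings.
docs = [["i", "feel", "anxious"], ["i", "feel", "fine"]]
embeddings = {"i": [1.0, 0.0], "feel": [1.0, 0.0], "anxious": [0.0, 1.0], "fine": [0.0, -1.0]}
for w in tfidf(docs):
    print(modulated_vector(w, embeddings, dim=2))
```

In this toy case "i" and "feel" occur in every document, so their weight is zero and each document vector is determined by its distinguishing word.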

Legend | Method
B1 | RF (Baseline)
B2 | Baseline + SMOTE
B3 | BRF - TF-IDF
R1 | BRF Contextual Features (CF)
R2 | BRF-CF with TF-IDF
R3 | BRF - LIWC Features
R4 | BRF - Twitter Word Embedding
O1 | BRF - CF (SEDO Weights generated from DSM-5 Lexicon without DAO)
O2 | BRF - CF (SEDO Weights generated from DSM-5 Lexicon with DAO without Slang Terms)
O3 | BRF - CF (SEDO Weights generated from DSM-5 Lexicon without DAO with Slang Terms)
O4 | BRF - CF (SEDO Weights generated from DSM-5 Lexicon with DAO and Slang Terms)
Model and Annotator Agreement: 84%
