Text Mining of Web-Based Medical Content

Text Mining of Web-Based Medical Content examines web mining for extracting useful information that can be used for treating and monitoring the healthcare of patients. This work provides methodological approaches to designing mapping tools that exploit data found in social media postings. Specific linguistic features of medical postings are analyzed vis-a-vis available data extraction tools for culling useful information.

eBook

$114.99 

Product Details

ISBN-13: 9781614519768
Publisher: De Gruyter
Publication date: 10/09/2014
Series: Speech Technology and Text Mining in Medicine and Health Care, #1
Sold by: Barnes & Noble
Format: eBook
Pages: 284
File size: 4 MB
Age Range: 18 Years

About the Author

Amy Neustein, Founder and CTO, Linguistic Technology Systems, Fort Lee, NJ, USA.

Table of Contents

Preface

List of authors

Part I Methods and techniques for mining biomedical literature and electronic health records

1 Application of text mining to biomedical knowledge extraction: analyzing clinical narratives and medical literature (Amy Neustein, S. Sagar Imambi, Mário Rodrigues, António Teixeira, Liliana Ferreira)

1.1 Introduction

1.2 Background

1.2.1 Clinical and biomedical text

1.2.2 Information retrieval

1.2.2.1 Information retrieval process

1.2.3 Information extraction

1.2.4 Challenges to biomedical information extraction systems

1.2.5 Applications of biomedical information extraction tools

1.3 Biomedical knowledge extraction using text mining

1.3.1 Unstructured text gathering and preprocessing

1.3.1.1 Text gathering

1.3.1.2 Text preprocessing

1.3.2 Extraction of features and semantic information

1.3.3 Analysis of annotated texts

1.3.3.1 Algorithms for text classification

1.3.3.2 Classification evaluation measures

1.3.4 Presentation

1.4 Text mining tools

1.5 Summary

Appendix "A"

References

2 Unlocking information in electronic health records using natural language processing: a case study in medication information extraction (Hua Xu, Joshua C. Denny)

2.1 Introduction to clinical natural language processing

2.2 Medication information in EHRs

2.3 Medication information extraction systems and methods

2.3.1 Relevant work

2.3.2 Summary of approaches

2.3.2.1 Rule-based methods

2.3.2.2 Machine learning-based methods

2.3.2.3 Hybrid methods

2.4 Uses of medication information extraction tools in clinical research

2.5 Challenges and future work

References

3 Online health information semantic search and exploration: reporting on two prototypes for performing information extraction on both a hospital intranet and the World Wide Web (António Teixeira, Liliana Ferreira, Mário Rodrigues)

3.1 Introduction

3.2 Background

3.3 Related work

3.3.1 Semantic search

3.3.2 Health information search and exploration

3.3.3 Information extraction for health

3.3.4 Ontology-based information extraction - OBIE

3.4 A general architecture for health search: handling both private and public content

3.5 Two semantic search systems for health

3.5.1 MedInX

3.5.1.1 MedInX ontologies

3.5.1.2 MedInX system

3.5.1.3 Representative results

3.5.2 SPHInX - Semantic search of public health information in Portuguese

3.5.2.1 System architecture

3.5.2.2 Natural language processing

3.5.2.3 Semantic extraction models

3.5.2.4 Semantic extraction and integration

3.5.2.5 Search and exploration

3.6 Conclusion

Acknowledgments

References

Part II Machine learning techniques for mining medical search queries and health-related social media posts and tweets

4 Predicting dengue incidence in Thailand from online search queries that include weather and climatic variables (Jedsada Chartree, Angel Bravo-Salgado, Tamara Jimenez, Armin R. Mikler)

4.1 Introduction

4.1.1 Dengue disease in the world

4.2 Epidemiology of dengue disease

4.2.1 Temperature change and the ecology of A. aegypti

4.3 Using online data to forecast incidence of dengue

4.3.1 Background and related work

4.3.2 Methodology for dengue cases prediction

4.3.2.1 Framework

4.3.2.2 Data sets

4.3.2.3 Predictive models

4.3.2.4 Validation

4.3.3 Prediction analysis

4.3.3.1 Multiple linear regression

4.3.3.2 Artificial neural network

4.3.3.3 Comparison of predictive models

4.3.4 Discussion

4.4 Conclusion

References

5 A study of personal health information posted online: using machine learning to validate the importance of the terms detected by MedDRA and SNOMED in revealing health information in social media (Kambiz Ghazinour, Marina Sokolova, Stan Matwin)

5.1 Introduction

5.2 Related background

5.2.1 Personal health information in social networks

5.2.2 Protection of personal health information

5.2.3 Previous work

5.3 Technology

5.3.1 Data mining

5.3.2 Machine learning

5.3.3 Information extraction

5.3.4 Natural language processing

5.4 Electronic resources of medical terminology

5.4.1 MedDRA and its use in text data mining

5.4.2 SNOMED and its use in text data mining

5.4.3 Benefits of using MedDRA and SNOMED

5.5 Empirical study

5.5.1 MySpace data

5.5.2 Data annotation

5.5.3 MedDRA results

5.5.4 SNOMED results

5.6 Risk factor of personal information

5.6.1 Introducing RFPI

5.6.2 Results from MedDRA and SNOMED

5.6.3 Challenges in detecting PHI

5.7 Learning the profile of PHI disclosure

5.7.1 Part I - Standard bag of words model

5.7.2 Part II - Special treatment for medical terms

5.8 Conclusion and future work

Acknowledgment

References

6 Twitter for health - building a social media search engine to better understand and curate laypersons' personal experiences (Hanna Suominen, Leif Hanlen, Cécile Paris)

6.1 Introduction

6.2 Background

6.2.1 Social media as a source of health information

6.2.2 Information search on social media

6.3 Proposed solutions

6.3.1 Tools for information retrieval on Twitter

6.3.1.1 Basic recipe for building a search engine

6.3.1.2 Solutions

6.3.1.3 Health concerns, availability of clean water and food, and other information for crisis management knowledge from Twitter

6.4 Background

6.5 Some solutions

6.6 Tools for combining, comparing, and correlating tweets with other sources of health information

6.7 Discussion

6.8 Related solutions

6.8.1 Maps applications for disease monitoring

6.8.2 Maps applications in crisis situations

6.8.3 Extraction systems to monitor relationships between drugs and adverse events

6.8.4 An early warning system to discover unrecognized adverse drug events

6.9 Methods for information curation

6.10 Future work

Acknowledgments

References

Part III Using speech and audio technologies for improving access to online content for the computer-illiterate and the visually impaired

7 An empirical study of user satisfaction with a health dialogue system designed for the Nigerian low-literate, computer-illiterate, and visually impaired (Olufemi Oyelami)

7.1 Introduction

7.2 Related work

7.3 Dialogue systems

7.4 Methods

7.4.1 Participants

7.4.2 Demographics of the participants

7.4.3 Data collection

7.4.4 Data analysis

7.5 Health dialogue system (HDS)

7.6 Results

7.6.1 Experiences with mobile/computing devices

7.6.2 User satisfaction and acceptability of HDS

7.7 Conclusion

Acknowledgment

References

8 DVX - the descriptive video exchange project: using crowd-based audio clips to improve online video access for the blind and the visually impaired (Keith M. Williams)

8.1 Current problems with video data

8.2 The description solution

8.2.1 What is description?

8.2.2 Description for the visually impaired

8.2.2.1 Current types

8.3 Architecture of DVX

8.3.1 The DVX server

8.3.1.1 Major data elements, attributes and actions

8.3.1.2 Current implementation

8.3.1.3 Tomcat servlet container

8.3.1.4 Applications

8.4 DVX solves description problems

8.5 DVX and video search

8.6 Conclusion

Acknowledgment

Part IV Visual data: new methods and approaches to mining radiographic image data and video metadata

9 Information extraction from medical images: evaluating a novel automatic image annotation system using semantic-based visual information retrieval (Dumitru Dan Burdescu, Liana Stanescu, Marius Brezovan)

9.1 Introduction

9.2 Background

9.3 Related work

9.4 Architecture of system

9.5 The segmentation algorithm - graph-based object detection (GBOD)

9.6 Experimental results

9.7 Conclusions

References

10 Helping patients in performing online video search: evaluating the importance of medical terminology extracted from MeSH and ICD-10 in health video title and description (Randi Karlsen, Jose Enrique Borrás Morell, Johan Gustav Bellika, Vicente Traver Salcedo)

10.1 Introduction

10.2 Data and methods

10.2.1 Obtaining video data

10.2.2 Detecting medical terms in video title and/or description

10.2.3 Medical vocabularies

10.3 Results

10.3.1 ICD-10 results

10.3.2 MeSH results

10.3.3 Terms used in video titles and descriptions

10.3.4 Occurrences of terms - when discarding the most common terms

10.4 Discussion

10.4.1 Findings

10.4.2 How ICD-10 and MeSH terms can be useful

10.4.3 Discriminating power of terms

10.4.4 The uniqueness of our study when compared to other work

10.5 Conclusion

Acknowledgments

References

Editor's biography
