‘Qualities’ not ‘Quality’ – Text Analysis Methods to Classify Consumer Health Websites

Guocai Chen; Jim Warren; Joanne Evans

Abstract

There is an increasing need to help health consumers achieve timely, differentiated access to quality online healthcare resources. This paper describes and evaluates methods for automated classification of consumer health Web content with respect to qualitative attributes relevant to the preferences of individual health consumers. The approach is illustrated by identifying breast cancer consumer web pages that take a ‘supportive’ versus a ‘medical’ perspective, with results compared against an existing manual classification employed by a breast cancer portal with personalised search preference options. Classification is based on analysis of word co-occurrences and an enhanced decision tree classifier (a decision forest). In current tests, the decision forest classifier distinguishes ‘medical’ from ‘supportive’ resources with 90% accuracy (95% confidence interval, 86-94%). These early results indicate that language use patterns can be used to automate such classification with acceptable accuracy; however, a wider range of websites and metadata attributes needs to be assessed and compared to end-user feedback. Future applications may include a tool to assist metadata coders in populating the databases of domain-specific portals such as BCKOnline, or tagging or sorting live search results by content type for health consumers.

Keywords

Text Analysis; Health Consumers

eJHI - electronic Journal of Health Informatics - ISSN 1446-4381
