There is an enormous amount of information available to clinicians and patients. Finding, sifting, and turning it into evidence is a major challenge.
In this chapter we introduce the skills that are needed to find existing evidence, read papers, assess the quality of the literature, and then summarize it in a systematic review and meta-analysis.
Evidence can be found in a wide range of sources. PubMed, the leading portal to databases such as Medline, includes more than 20 million citations to journal articles and books. Unless you use a systematic method to find your evidence, you will waste a lot of time and potentially miss a great deal of important work.
You must begin with a clearly defined clinical or research question and some idea of which information source you will use to search for evidence. Even if you define your research question thoroughly, you may have to modify your search strategy after your initial review.
If you are not familiar with how to do this then make use of the online tutorials (PubMed), or contact your local librarian.
If your search produces 2000 articles, you may need to refine it to make the results more manageable or focused; while if only 5 articles are found, you may want to consider broadening the search terms you used. It is always essential to go back to your defined research question and use the details in the question you are attempting to answer to guide your search strategy and its criteria.
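If you refine a search repeatedly, it can help to script it. The sketch below builds a PubMed query URL using NCBI's E-utilities (`esearch`) interface; the endpoint and the `db`, `term`, and `retmax` parameters are the documented E-utilities ones, but the query string itself is only an illustration. Fetching the URL with any HTTP client returns the matching PubMed IDs.

```python
from urllib.parse import urlencode

# NCBI E-utilities 'esearch' endpoint (documented public interface)
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pubmed_search_url(term: str, retmax: int = 100) -> str:
    """Build an esearch URL for a PubMed query; fetching it returns PMIDs."""
    return EUTILS + "?" + urlencode({"db": "pubmed", "term": term, "retmax": retmax})

# Illustrative query only; refine the terms against your defined research question
url = pubmed_search_url("cholangiocarcinoma AND photodynamic therapy")
print(url)
```

Adjusting `term` (or adding field tags and limits) and re-running is the programmatic equivalent of broadening or narrowing your search in the web interface.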
This is perhaps the most important adjunct to your literature review. Accurate bibliographic details, search histories, critique details, key information from papers, etc. will help you find things again quickly. Reference Manager, EndNote, and Mendeley are useful electronic systems.
A 76-year-old man has been diagnosed with inoperable intra-hepatic cholangiocarcinoma and has been advised that his prognosis is poor, 6 months at best. He asks you (his GP) whether he should have photodynamic therapy (PDT) which he has heard may prolong his survival. How do you find out whether it would be appropriate for him?
‘In a 76-year-old man who has been diagnosed with inoperable intra-hepatic cholangiocarcinoma, will PDT increase his survival compared with palliative care alone?’
In a search engine such as PubMed, type your keywords, e.g.:
The search engine will then translate these:
This produces 109 results (search performed March 2011). If you are looking for a quick answer you are likely to want a review article. You can limit your results to review articles, or you can add 'systematic review' to the search, producing 9 articles.
If you are doing a more thorough search (e.g. if you are doing a research project, or preparing a journal club) you should look at all 109 of these titles and abstracts to select those that are relevant. In a formal systematic review (see p.88) you would define inclusion and exclusion criteria for the papers.
You then read the selected papers in detail to see if they are:
Methods for assessing these are covered in detail in this chapter. In this case, a brief search identifies a recent, high-quality systematic review that concludes there is some limited evidence that PDT may improve survival,1 but this is based on 2 small randomized controlled trials (RCTs) and some observational studies of varied quality. A firm conclusion was not possible.
If you find appropriate evidence then you should apply it. In this case, current evidence is not sufficient to make a strong recommendation. However, you would discuss this with your patient, and contact the oncologist to see if there might be ongoing trials that he could enter.
As a GP you are unlikely to have another patient with this condition in the future, but you may still be interested and return to the literature in a couple of years to see if the evidence has changed.
There are numerous e-resources available to help you identify information and search for up-to-date medical evidence. Any library will be able to provide a list of the key resources that you can use, including details of which ones they have free access to.
Some databases are simply a way of organizing information on all relevant publications in the field. Medline, Embase, and PsycINFO include articles from biomedical journals and some books and other publications. Where they cover the same broad topic, e.g. Medline and Embase, the majority of their content will overlap, but with some key areas of difference. It is always advisable to search at least 2 databases to ensure that you are not missing articles.
Some databases help to do this for you. For example, PubMed is a way of accessing several databases, and it also provides links to the publishers and to full-text articles where these are available.
A citation is a reference to another article or source of information. Citation data can be useful for identifying the most influential articles, which are generally cited many times. There are many ways of accessing citation data, but the most comprehensive are the Citation Indices published in the ISI Web of Science. Citation information is also provided through Google Scholar.
Information is sometimes published first at conferences where it may appear in official conference proceedings. These can be accessed through another index, the Conference Proceedings Citation Index, available through the Web of Knowledge.
Publicly funded research is increasingly published in open-access journals to ensure that everyone benefits from the findings. For example, UK PubMed Central (UKPMC) provides free access to peer-reviewed research papers in the medical and life sciences. It includes over 2 million full-text journal articles, access to 24 million abstracts, and clinical guidelines (from the NHS).
If you are looking for answers to a specific medical query it is best to see if the literature has already been reviewed by a reputable group. Evidence-based reviews can be found in a number of places including:
When you search for articles it is essential to keep good records and be able to manage the data. Bibliographic software packages allow you to manage all the referenced evidence you find by enabling you to store it in your own personal database or library. In general, these packages are designed to assist in the following tasks:
There are many packages, including:
Critical appraisal is the process of assessing the validity of research and deciding how applicable it is to the question you are seeking to answer. This section will cover how to read a paper with these aims in mind.
In subsequent sections we describe tools available to support appraisal of papers reporting different types of study.
Were potential confounders (see p.148) measured and adjusted for?
Were any patients consulted or involved in any way in the design, management, analysis, or interpretation of the study?
Applicability is, in some ways, the hardest to judge in a rigidly scientific manner and decisions in this area may still be an art.
Ask yourself: are the problems I deal with sufficiently like those in the study to extrapolate the findings? Can I generalize from this study to my own practice?
If a doctor were to locate a paper that is scientifically faultless, he may still be left pondering questions of applicability: if the selection criteria only included patients between 70 and 80 years old, can the conclusions be used for patients aged 65 to 70? And what about the relatively fit and 'biologically young' 81-year-olds?
Can studies on urban Americans be extrapolated from, say, Birmingham, Alabama, to Birmingham, West Midlands, and are rural practices in Norway different to those in Wales?
Similarly a teacher might ask ‘Are teaching practices that have been shown to be effective in 5- to 8-year-olds also of value in pre-school age children?’ or a social worker might enquire ‘Would counselling techniques used successfully in Asian youths apply equally well to those from an Afro-Caribbean background?’.
Studies vary in design, and there are key factors to look out for when appraising each type. Specific checklists have been developed for a wide range of studies and can be useful when writing or reading papers.
These checklists are all available through the EQUATOR network website, where they are regularly reviewed and updated and new ones are added.1,2 It is therefore worth checking the website for updates. As an example we have included the checklist and flow diagram for RCTs—the CONSORT statement later in this chapter (see pp.84–86).
Other guidelines are listed in Table 4.1, check the websites for details.
Consider:
Consider:
Consider:
Table 4.1 Study appraisal guidelines
Acronym | Study design to be appraised |
---|---|
CONSORT | Randomized controlled trials (RCTs) |
MOOSE | Meta-analyses of observational studies |
PRISMA | Systematic reviews and meta-analyses |
STARD | Diagnostic test accuracy studies |
STREGA | Genetic association studies |
STROBE | Observational studies |
See Table 4.2 for the CONSORT statement, and Fig. 4.1 for flow diagram and additional recommendations.
Table 4.2 CONSORT statement*
Section/topic | Checklist item |
---|---|
Title and abstract | |
Identification as a randomized trial in the title | |
Structured summary of trial design, methods, results, and conclusions (see CONSORT for abstracts) | |
Introduction | |
Background and objectives | Scientific background and explanation of rationale |
Specific objectives or hypotheses | |
Methods | |
Trial design | Description of trial design (such as parallel, factorial) including allocation ratio |
Important changes to methods after trial commencement (such as eligibility criteria), with reasons | |
Participants | Eligibility criteria for participants |
Settings and locations where the data were collected | |
Interventions | The interventions for each group with sufficient details to allow replication, including how and when they were actually administered |
Outcomes | Completely defined pre-specified primary and secondary outcome measures, including how and when assessed |
Any changes to trial outcomes after trial commenced, with reasons | |
Sample size | How sample size was determined |
When applicable, explanation of any interim analyses and stopping guidelines | |
Randomization | |
Sequence generation | Method used to generate the random allocation sequence |
Type of randomization; details of any restriction (such as blocking and block size) | |
Allocation concealment mechanism | Mechanism used to implement the random allocation sequence (such as sequentially numbered containers), describing any steps taken to conceal the sequence until interventions were assigned |
Implementation | Who generated the random allocation sequence, who enrolled participants, and who assigned participants to interventions |
Blinding | If done, who was blinded after assignment to interventions (for example, participants, care providers, those assessing outcomes) and how |
If relevant, description of the similarity of interventions | |
Statistical methods | Statistical methods used to compare groups for primary and secondary outcomes |
Methods for additional analyses, such as subgroup analyses and adjusted analyses | |
Results | |
Participant flow (a diagram is strongly recommended) | For each group, the numbers of participants who were randomly assigned, received intended treatment, and were analysed for the primary outcome |
For each group, losses and exclusions after randomization, together with reasons | |
Recruitment | Dates defining the periods of recruitment and follow-up |
Why the trial ended or was stopped | |
Baseline data | A table showing baseline demographic and clinical characteristics for each group |
Numbers analysed | For each group, number of participants (denominator) included in each analysis and whether the analysis was by original assigned groups |
Outcomes and estimation | For each primary and secondary outcome, results for each group, and the estimated effect size and its precision (such as 95% confidence interval) |
For binary outcomes, presentation of both absolute and relative effect sizes is recommended | |
Ancillary analyses | Results of any other analyses performed, including subgroup analyses and adjusted analyses, distinguishing pre-specified from exploratory |
Harms | All important harms or unintended effects in each group (for specific guidance see CONSORT for harms) |
Discussion | |
Limitations | Trial limitations, addressing sources of potential bias, imprecision, and, if relevant, multiplicity of analyses |
Generalizability | Generalizability (external validity, applicability) of trial findings |
Interpretation | Interpretation consistent with results, balancing benefits and harms, and considering other relevant evidence |
Registration | Registration number and name of trial registry |
Protocol | Where the full trial protocol can be accessed, if available |
Funding | Sources of funding and other support (such as supply of drugs), role of funders |
*CONSORT strongly recommends reading this statement (previous pages) in conjunction with the CONSORT 2010 Explanation and Elaboration for important clarifications on all the items.
If relevant, they also recommend reading CONSORT extensions for cluster randomized trials, non-inferiority and equivalence trials, non-pharmacological treatments, herbal interventions, and pragmatic trials. Additional extensions are forthcoming: for those and for up to date references relevant to this checklist, see: http://www.consort-statement.org
Fig. 4.1 CONSORT Statement 2010 flow diagram. From Altman DG, Moher D, for the CONSORT Group. CONSORT 2010 Statement: updated guidelines for reporting parallel group randomised trials. BMJ 2010; 340:c332.
A systematic review is a method of providing a summary appraisal of existing evidence. Unlike traditional literature reviews, which are subjective and selective, this method is designed to be rigorous and reproducible, reducing bias.
A systematic review should be based on a protocol that is produced in advance and includes the following elements:
Use the checklist in PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) at http://www.prisma-statement.org. Describe the methods as in the protocol, including the details of the search. The results of the search should be presented in a standard flow-chart (Fig. 4.2). Details of the included papers should be in tables, as should key extracted data.
Fig. 4.2 PRISMA 2009 flow diagram. From Moher D, et al. Preferred Reporting Items for Systematic Reviews and Meta-Analyses: The PRISMA Statement. PLoS Med 2009: 6(6):e1000097. (Reproduced under the terms of the Creative Commons Attribution Licence.)
A meta-analysis is a statistical analysis of a collection of studies, often collected together through a systematic review. A systematic review provides a summary appraisal of the evidence in the systematically collected studies in a narrative form. In contrast, the statistical techniques in a meta-analysis enable a quantitative review to be undertaken by combining together the results from all the individual available studies to estimate an overall summary, or average, finding across studies. The studies themselves are the primary units of analysis and can include the range of various different epidemiological study designs that we discuss throughout this book.
As a result of limited study size, biases, differing definitions, or quality of exposure, disease and potential confounding data, as well as other study limitations, individual studies often have inconsistent findings. They are therefore often insufficient in themselves to definitively answer a research question or provide clear enough evidence on which to base clinical practice. A meta-analysis, however, has several advantages, including:
There are 4 steps in a meta-analysis:
1. Extraction of the main result or study effect estimate from each individual study (calculation is sometimes necessary), e.g. odds ratio, relative risk, etc., together with an estimate of the probability that the effect estimate is due to chance. Every study effect estimate is accompanied by its standard error, which describes the variability in the study estimate due to random error. Sometimes this is expressed as the variance, which is the standard error (SE) squared; alternatively, the level of certainty we have in the estimated study result may be shown through a confidence interval (see p.229).
2. Checking whether it is appropriate to calculate a pooled summary/average result across the studies. Appropriateness depends on just how different the individual studies are that you are trying to combine. Sometimes, it is not appropriate to combine the studies at all. If it is appropriate, you must decide which method to use before you begin calculation (see p.92).
3. Calculation of the summary result as a weighted average across the studies, using either a random effects or fixed effects model (see p.93). The weight usually takes into account the variance of the study effect estimate, which reflects the size of each individual study's population. This weighted average gives greater weight to the results from studies which provide more information (usually larger studies with smaller variances), as these are usually more reliable and precise, and less weight to less informative studies (often smaller ones).
4. Presentation of summary results—you will often see these as forest plots (Fig. 4.3). This is a graphical representation of the results from each study included in a meta-analysis, together with the combined meta-analysis result.
Fig. 4.3 Understanding the results of meta analysis presented in a forest plot.
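Steps 1 and 3 above can be made concrete with a small numerical sketch. The 2×2 counts below are invented for illustration (a, b are events and non-events in the treatment arm of each trial; c, d the same in the control arm), and the pooling shown is the fixed-effect, inverse-variance method.

```python
import math

# Step 1: extract a log odds ratio and its standard error from each
# study's 2x2 table. Step 3: pool with inverse-variance weights.
# All counts below are hypothetical.
tables = [(12, 38, 20, 30), (25, 75, 40, 60), (8, 42, 14, 36)]

log_ors, ses = [], []
for a, b, c, d in tables:
    log_ors.append(math.log((a * d) / (b * c)))     # log odds ratio
    ses.append(math.sqrt(1/a + 1/b + 1/c + 1/d))    # SE of the log OR

weights = [1 / se**2 for se in ses]                 # weight = 1 / variance
pooled = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))

pooled_or = math.exp(pooled)
ci = (math.exp(pooled - 1.96 * pooled_se),
      math.exp(pooled + 1.96 * pooled_se))          # 95% CI for the pooled OR
print(round(pooled_or, 2), tuple(round(v, 2) for v in ci))
```

Note how the second (largest) study carries the most weight: its variance is smallest, so it contributes most to the weighted average.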
It is not always appropriate to pool results into a statistical meta-analysis.
There are 2 elements in the decision: clinical and statistical.
Galbraith (radial) plots facilitate the examination of heterogeneity, including detection of outliers (Fig. 4.4):
The method depends on whether there is statistical heterogeneity. If no evidence for heterogeneity is found, a fixed effects model can be used to pool the effect estimates. If some heterogeneity exists, and the underlying assumption of a fixed effects model (i.e. that diverse studies are all estimating a single effect) is too simplistic, the heterogeneity can be allowed for by using an alternative approach, known as the random effects model, to pool the effect estimates.
Fig. 4.4 Interpreting the Galbraith (radial) plot.
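The choice between models can be sketched numerically (the log odds ratios and standard errors below are hypothetical): Cochran's Q and I² quantify heterogeneity, and the DerSimonian-Laird moment estimator gives the between-study variance τ² used in the random effects weights.

```python
import math

# Heterogeneity check and random-effects pooling (DerSimonian-Laird),
# sketched on hypothetical log odds ratios and standard errors.
ys  = [-0.9, 0.1, -0.5]
ses = [ 0.2, 0.2, 0.3]
ws  = [1 / se**2 for se in ses]

fixed = sum(w * y for w, y in zip(ws, ys)) / sum(ws)
q = sum(w * (y - fixed)**2 for w, y in zip(ws, ys))   # Cochran's Q
df = len(ys) - 1
i2 = max(0.0, (q - df) / q) * 100                     # I-squared (%)

# Between-study variance tau^2 (DerSimonian-Laird moment estimator)
c = sum(ws) - sum(w**2 for w in ws) / sum(ws)
tau2 = max(0.0, (q - df) / c)

# Random-effects weights add tau^2 to each study's variance, which
# flattens the weighting across studies relative to the fixed model
ws_re = [1 / (se**2 + tau2) for se in ses]
random_effects = sum(w * y for w, y in zip(ws_re, ys)) / sum(ws_re)
print(round(q, 2), round(i2, 1), round(random_effects, 3))
```

With these deliberately inconsistent studies, Q exceeds its degrees of freedom and τ² is positive, so the random effects estimate differs from the fixed one; if Q were below df, τ² would be truncated to zero and the two models would coincide.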
If specific sub-groups of studies display heterogeneity, these can be pooled separately in sub-analyses, e.g. in the earlier example in this topic on the treatment of depression, pooling the findings for young patients separately, or pooling only the studies focusing on longer-term outcomes, etc.
Publication bias refers to the greater likelihood that research with statistically significant results will be published in the peer-reviewed literature than research with null or non-significant results.
When evaluating a meta-analysis, you need to consider the following:
Fig. 4.5 Interpreting funnel plots: the lack of publication bias (a) and the presence of publication bias (b).
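Funnel-plot asymmetry is often tested formally with Egger's regression: regress each study's standardized effect (estimate/SE) on its precision (1/SE), and an intercept well away from zero suggests small-study effects such as publication bias. The effect sizes below are invented for illustration.

```python
# Egger's regression test for funnel-plot asymmetry (made-up data):
# regress the standardized effect (y/SE) on precision (1/SE).
ys  = [-0.75, -0.30, -0.55, -0.90, -0.20]
ses = [ 0.44,  0.25,  0.35,  0.60,  0.15]

x = [1 / se for se in ses]                 # precision
z = [y / se for y, se in zip(ys, ses)]     # standardized effect

# Ordinary least-squares fit of z on x
n = len(x)
mx, mz = sum(x) / n, sum(z) / n
slope = (sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z))
         / sum((xi - mx) ** 2 for xi in x))
intercept = mz - slope * mx                # Egger's bias coefficient
print(round(intercept, 2))
```

In practice the intercept would be reported with its standard error and a significance test; here the point estimate alone illustrates the idea.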
Egger M, Davey Smith G, Altman DG (eds). Systematic Reviews in Health Care: Meta-analysis in Context. London: BMJ Publishing Group; 2001.
Greenhalgh T. How to read a paper: Papers that summarise other papers (systematic reviews and meta-analyses). BMJ 1997; 315:672–75.
The Cochrane Collaboration: http://www.cochrane-net.org
* Note that meta-analysis is only as good as the source data. It is no substitute for undertaking high quality original studies and trials which can then feed in to the decision-making process.