Critical appraisal of an RCT (February 2018).
Exercise has been shown to be as effective as surgery for subacromial pain syndrome (SAPS) in a number of different studies [1,2,3]. While this is encouraging for proponents of exercise-based physiotherapy, it also raises more questions than it answers. What type of exercise? How much of it? For how long? Does it matter if it's painful or not? How and why does either treatment help, if it helps at all? This month we examine an RCT designed to try to shed some light on some of these questions by investigating whether one type of exercise (in this case non-painful eccentric training of the shoulder external rotators) produces better results than another (general exercise) in this patient group.
This is the third in a series of blogs
(the first two can be found here and here).
Again, this blog will not provide a systematic or comprehensive critical
appraisal of the chosen paper, or employ one of the many critical appraisal
tools available, but will highlight what we consider to be three important
elements to consider when trying to interpret the results of the trial and
apply them to our patients. This blog
will consider various aspects of RCT design and implementation, an overview of
which can be found here.
This month we will consider the potential sources of bias and the degree of confidence or doubt we have in the results of the following RCT:
Shoulder external rotator eccentric
training versus general shoulder exercise for subacromial pain syndrome: a
randomized controlled trial. The International Journal of Sports Physical
Therapy, 12(7), 1121-33.
This study was designed to investigate whether eccentric training of the shoulder external rotators (ETER) or general exercise (GE) produces better clinical outcomes in those with SAPS. This study defined SAPS based on the presence of at least three of the following: a positive Neer, Hawkins-Kennedy, or empty can test, painful resisted external rotation, palpable tenderness of the supraspinatus or infraspinatus insertion, or a painful arc of abduction. The primary outcome measure was the Western Ontario Rotator Cuff Index (WORC), a patient-reported outcome measure which considers physical symptoms, sports and recreation, work, lifestyle, and emotions. Secondary outcome measures were a Numerical Pain Rating Scale (NPRS) for best, worst, and average pain, isometric strength, active range of movement, Y balance test, and Global Rating of Change (GROC).
48 participants were randomised into two groups (25 in the ETER group and 23 in the GE group) and underwent a six-week exercise program including four visits to a physical therapist (the study was conducted in the USA). The ETER group performed non-painful eccentric external rotation (3 sets of 15 with a 3 second eccentric phase), resisted scapula retraction (2 sets of 10), and a cross body stretch (3 reps with 30-45 second holds). The GE group performed active flexion and abduction with no resistance (2 x 10 reps of each), and the same resisted scapula retraction and stretching exercises as the ETER group. All outcomes were measured at baseline, 3 weeks, 6 weeks, and 6 months.
The study found that ETER produced statistically significantly better results than GE at 3 weeks, 6 weeks, and 6 months according to WORC score, NPRS score, and isometric muscle strength. There were no statistically significant differences in active range of movement, Y balance, or GROC. The authors conclude that eccentric training may be efficacious in improving self-reported pain, function and strength in those with SAPS.
It is often easier to find faults when critically appraising a study, so let's start with the strengths of this RCT. It asked a clinically relevant question and used a design appropriate to answer it, and the study protocol was published before the trial began, which guards against bias. Additionally, the interventions and outcome measures were well described. This may sound simple but the quality of description of interventions and outcome measures in RCTs, especially those including therapeutic exercise, is often poor, limiting interpretation, application, and replication of results [4,5].
However, there are a few aspects of the trial which we must consider before deciding how confident we can be in the reported superiority of ETER exercises over GE. The three areas we feel are most important to consider are:
1. Differential dropouts and statistical analysis.
2. The choice of comparator.
3. Methods of randomisation and baseline differences between groups.
1. Differential dropout rates and statistical analysis.
Dropouts and the resultant incomplete
data cause investigators a number of problems when it comes to analysing and
interpreting RCT results. One of the key characteristics of RCTs that reduces their susceptibility to bias is randomisation. Random allocation of participants to different
treatment groups aims to ensure that the groups are comparable at baseline in
relation to both known and unknown factors that might influence the outcome of
the trial. This increases our confidence
that any differences in treatment effectiveness are related to the intervention
of interest rather than any baseline differences. In order to maintain this control of bias, the
random treatment assignment must be preserved through the whole trial including
in the statistical analysis. This form
of statistical analysis is known as an intention-to-treat (ITT) analysis. This analyses participants based on the group
to which they were assigned at randomisation irrespective of what treatment
they actually received or whether they completed the trial. This is considered
the most appropriate method of analysis when comparing the effectiveness of
treatments in RCTs [6].
The authors in this study did not employ an ITT approach because they were concerned that the asymmetrical dropouts between groups may cause type I error (finding a significant difference when one does not exist). Instead they analysed only participants that completed the trial, which is termed a complete case analysis. Whether asymmetrical dropouts cause error in the results of an RCT depends on why the data are missing (rather awkwardly termed ‘missingness’) and how this is handled in the analysis [7]. If those that drop out do so completely at random, then a complete case analysis is reasonable because the two groups available for analysis are still based on chance alone. However, if the data are not missing completely at random, a complete case approach analyses a non-random subset of those that entered the trial and compromises the initial randomisation process. In this trial the differential dropouts between groups (39% in the GE group dropped out compared to 12% in the ETER group) increase suspicion that the reasons for this may not have been completely random [8]. Using a complete case analysis therefore increases doubt that the differences in treatment outcome can be confidently attributed to the intervention of interest.
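To make the point about missingness more concrete, here is a minimal simulation sketch in Python. It is not a re-analysis of the trial. Everything in it is an assumption made for illustration: the improvement scores are invented (mean 10, standard deviation 15), there is no true difference between groups, and the dropout mechanism in the GE group (poor responders being more likely to leave) is hypothetical. Only the overall dropout rates (12% and 39%) are taken from the paper.

```python
import random
import statistics


def mean_diff(a, b):
    """Difference in mean improvement (first group minus second group)."""
    return statistics.mean(a) - statistics.mean(b)


def simulate(n_per_group=20000, seed=1):
    """Compare completers-only results under two hypothetical dropout mechanisms.

    Both groups are drawn from the SAME distribution, so the true
    between-group difference is zero by construction.
    """
    rng = random.Random(seed)
    eter = [rng.gauss(10.0, 15.0) for _ in range(n_per_group)]
    ge = [rng.gauss(10.0, 15.0) for _ in range(n_per_group)]

    # ETER group: 12% drop out completely at random.
    eter_completers = [x for x in eter if rng.random() >= 0.12]

    # GE group, scenario A: 39% drop out completely at random.
    ge_random = [x for x in ge if rng.random() >= 0.39]

    # GE group, scenario B (assumed mechanism): those improving less than
    # average drop out 60% of the time, the rest 18% of the time.
    # Overall this is still about 39% dropout.
    ge_related = [x for x in ge if rng.random() >= (0.60 if x < 10.0 else 0.18)]

    return {
        "all_randomised": mean_diff(eter, ge),
        "completers_random_dropout": mean_diff(eter_completers, ge_random),
        "completers_outcome_related_dropout": mean_diff(eter_completers, ge_related),
    }


if __name__ == "__main__":
    for label, diff in simulate().items():
        print(f"{label}: {diff:+.2f}")
```

Under these assumptions the completers-only difference stays close to zero when dropout is random, but moves away from zero when dropout is related to outcome, even though the treatments are identical. The direction and size of the distortion depend entirely on the assumed mechanism, which is exactly what we cannot know as readers.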
Author response:
Thank you, this is a great point and one that we
deliberated over for quite some time.
There are pros and cons with ITT and we had concerns that week 3 between
group comparisons could be inflated with the 2 subjects in the control group
reporting a slight worsening in subjective outcomes and subsequently dropping
from the trial early. This will end up
lending to your comparator group interventions discussion but generally
speaking my biggest concern was that the active range of motion exercises used
as a comparator may have slightly increased symptoms in the early phase of
treatment for some subjects in the control group and carrying over week 1 data
could falsely inflate between group differences in favor of the experimental
group. If the 6 month dropouts were the
only issue using ITT to carry over week 6 data to 6 month data would be an
easier decision but the early dropouts from the control group were a big factor
in the decision.
2. The choice of comparator.
To accurately test how effective a
treatment is it needs to be compared to something. Studies where treatment outcomes are measured
without a comparison group can show that a patient had a particular treatment
and got better, but cannot show that they got better because of the particular
treatment. Controlled studies
(both RCTs and other non-randomised controlled studies) use a control group to
demonstrate what would have happened if the participants had not had the
treatment of interest, either by doing nothing (no treatment control), making
patients think that they have had the treatment of interest but without
administering the active components (placebo control), or by comparing to
another treatment (active control). In this case an active control was
chosen. While this is reasonable because
the alternative to using eccentric exercises would be to provide an alternative
exercise-based treatment, the most appropriate comparator would be
representative of current practice (so we know whether changing to this ‘new’
or ‘different’ treatment is better than what we already do). This study uses
range of movement exercises (with one resisted exercise that was standardised
across groups) to represent general exercise.
The authors themselves identify that this may not be representative of a typical exercise program used in clinical practice. Unless this reflects our current practice, it makes it very difficult to know what these results mean.
The choice of control also introduces
doubt as to whether we can be sure that it was the type of exercise that was
the decisive factor in determining the results of this trial. Both groups performed exercises that involved
both concentric and eccentric phases.
This makes it less clear whether this was a true comparison between two
distinct types of exercise. The control group also performed lower dose and
lower resistance exercises than the ETER group.
Previous studies have suggested that exercise protocols that include resisted exercise may be more effective than those that do not [9], and that higher dose exercise may be more effective than lower dose exercise [10]. Even
if we accept the reported differences in outcomes between the two groups, can
we be confident that it was the type of exercise that caused them?
Author response:
This is another great point. I am
not confident that the comparator exercise in this trial was representative of
what a PT would do in practice. Simply
having a patient actively move the shoulder through an elevation movement
without load may not be a typical general exercise program. It's possible that the various differences between exercise programs, i.e. load, specific isolated movement, arm position etc., could be the reason for between group differences rather than the fact that the experimental group utilized an eccentric exercise.
3. Methods of randomisation and baseline
differences between groups.
As described, the benefit of
randomisation is that it theoretically balances both known and unknown factors
that could potentially influence the outcome of the trial between the groups.
This increases our confidence that any difference in outcome is due to the
intervention of interest and not some other known or unknown difference between
groups. In this trial the researchers
randomised patients by asking them to blindly place a pen on a table of random
numbers. Manual randomisation methods such as this (or the use of a coin toss, drawing lots, or shuffling cards) introduce more doubt than more robust methods like using computer generated or remotely generated random numbers because either the participant or investigator could theoretically influence the process. For example, what happened if a participant landed equidistant between 2, 3, or even 4 numbers in the table?
This raises an important point in the assessment of the risk of bias; we are
not saying that the participants or investigators did unduly influence the
randomisation process in this trial, just that the method used increases doubt
because they theoretically could have done so.
We as readers will never know for sure if the results were unduly influenced, and that is why we assess for the risk of bias rather than actual bias itself.
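For comparison, a computer generated allocation sequence might look something like the sketch below. This is not the method used in the trial, and it is only one of several reasonable approaches. It uses permuted blocks so that group sizes stay balanced; the function name, block size of 4, and seed are our own illustrative choices. In practice the sequence would be generated in advance by someone not involved in recruitment.

```python
import random


def permuted_block_sequence(n_participants, block_size=4, arms=("ETER", "GE"), seed=2018):
    """Return a reproducible allocation list built from shuffled blocks.

    Each block contains every arm equally often, so the groups can never
    differ in size by more than half a block.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")

    rng = random.Random(seed)  # fixed seed so the sequence can be audited
    sequence = []
    while len(sequence) < n_participants:
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        sequence.extend(block)
    return sequence[:n_participants]


if __name__ == "__main__":
    allocation = permuted_block_sequence(48)
    print(allocation[:8])
    print({arm: allocation.count(arm) for arm in ("ETER", "GE")})
```

Because the sequence is fixed before the first participant arrives, neither the participant nor the investigator has a decision to make at the point of allocation, which removes the ambiguity described above.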
If there was bias in the randomisation
process this would mean that there were systematic differences between the two
treatment groups. However, the fact that
there were systematic differences between the two groups does not necessarily
mean that there was bias in the randomisation process. Randomisation can only
maximise the probability that known and unknown factors are balanced between
the two groups; it cannot guarantee that this is the case. The bigger the
sample size the more likely they are to be balanced (the reasons for this were
discussed in a previous blog here). In
this study there were statistically significant differences in favour of the
ETER group in strength (ABD/ER ratio) and Y balance. There were also non-significant (but not necessarily unimportant) differences in all other baseline strength measurements, most range of movement measurements, best pain, and age (the ETER group being younger). We do not really know how, why, or even if exercise really does help patients with SAPS so we cannot know how, why, or if these baseline differences affected treatment outcomes. If it is feasible that younger, stronger patients with better range of movement and balance are more likely to benefit from exercise-based treatment, then we have to consider that it could have been differences between groups rather than differences in treatment effectiveness that caused the differences in outcomes.
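The relationship between sample size and chance baseline imbalance can be illustrated with a short simulation. This is a sketch under simplifying assumptions: a single, normally distributed baseline variable, equal group sizes, and an arbitrary definition of "imbalance" as a between-group difference in means of more than half a standard deviation. None of these choices come from the paper, and the function name is our own.

```python
import random
import statistics


def chance_of_imbalance(n_total, threshold_sd=0.5, n_trials=2000, seed=0):
    """Estimate how often random allocation leaves two equal groups differing
    by more than `threshold_sd` standard deviations on one baseline variable."""
    rng = random.Random(seed)
    half = n_total // 2
    imbalanced = 0
    for _ in range(n_trials):
        # Baseline values for one simulated trial (mean 0, SD 1).
        values = [rng.gauss(0.0, 1.0) for _ in range(n_total)]
        # Random allocation: shuffle, then split into two groups.
        rng.shuffle(values)
        diff = statistics.mean(values[:half]) - statistics.mean(values[half:])
        if abs(diff) > threshold_sd:
            imbalanced += 1
    return imbalanced / n_trials


if __name__ == "__main__":
    for n in (48, 480):
        print(n, chance_of_imbalance(n))
```

With 48 participants, roughly 8% of simulated trials end up with a difference this large on any one variable purely by chance; with 480 participants it almost never happens. When several baseline variables are measured, the chance that at least one of them is imbalanced in a small trial is higher still.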
Author response:
I also agree with this; hindsight is 20/20. If we did a similar trial again the use of computer generated randomization would be much preferred. The topic of baseline variables that could be associated with improved outcomes is very important. I would love to have collected more baseline
variables in a larger sample and run a regression off the responders to
determine the patient characteristics that are consistent with a positive
outcome. In this case we are examining
the between group mean but some participants have dramatic improvements over
others. It would be interesting to know
which patients respond best to heavy load exercises and which do not respond as
favorably.
Conclusion
This study reports that eccentric training of the external rotators of the shoulder produces significantly better results than general exercise in those with SAPS. However, as with any RCT, it is important to appraise the methods used before accepting the results. Differential dropouts between groups and the way the data were analysed might increase the risk of bias and hence decrease our confidence in the reported results, and baseline differences between groups and the choice of comparator increase doubt as to whether any differences in outcome can be specifically linked to the type of exercise performed.
Author response:
One last point about this topic is that the progression of exercise mode, load and volume dosing is critically important. The level of tissue irritability is also an important factor to help dictate exercise prescription and in clinical practice I wouldn’t arbitrarily prescribe eccentric exercises to any patient with chronic subacromial pain. Progressions in arm position, type of movement (i.e. isometric vs light isotonic vs eccentric) and load/dose increases respective of patient tolerance and baseline strength will be important to integrate into future trials.
A pragmatic design that allows the clinician to manipulate these exercise
prescription variables based on patient presentation will be important in
future studies. Thank you all for your
interest and review of this topic.
Eric Chaconas
Thanks for reading, hope it
was useful; thoughts gratefully received.
Paul Regan, Chris Littlewood, Tomas Parraguez, Brian Cho, Sijmen Hacquebord
[1] Haahr JP, Østergaard S, Dalsgaard J, Norup K, Frost P, Lausen S, Holm EA, Andersen JH (2005). Exercises versus arthroscopic decompression in patients with subacromial impingement: a randomised, controlled study in 90 cases with a one year follow up. Annals of the Rheumatic Diseases, 64(5), 760-4.
[2] Haahr JP, Andersen JH (2006). Exercises may be as efficient as subacromial decompression in patients with subacromial stage II impingement: 4–8-years’ follow-up in a prospective, randomized study. Scandinavian Journal of Rheumatology, 35(3), 224-8.
[3] Ketola S, Lehtinen JT, Arnala I (2017). Arthroscopic decompression not recommended in the treatment of rotator cuff tendinopathy: a final review of a randomised controlled trial at a minimum follow-up of ten years. The Bone and Joint Journal, 99-B(6), 799-805.
[4] Hoffmann TC, Glasziou PP, Boutron I, Milne R, Perera R, Moher D, Altman DG, Barbour V, Macdonald H, Johnston M, Lamb SE, Dixon-Woods M, McCulloch P, Wyatt JC, Chan AW, Michie S (2014). Better reporting of interventions: template for intervention description and replication (TIDieR) checklist and guide. British Medical Journal, 348:g1687.
[5] Page P, Hoogenboom B, Voight M (2017). Improving the reporting of therapeutic exercise interventions in rehabilitation research. International Journal of Sports Physical Therapy, 12(2), 297-304.
[6] Higgins JPT, Green S (editors). Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0 [updated March 2011]. The Cochrane Collaboration, 2011. Available from http://handbook.cochrane.org.
[7] Bell ML, Kenward MG, Horton NJ (2013). Differential dropout and bias in randomised controlled trials: when it matters and when it may not. British Medical Journal, 346:e8668.
[8] Moher D, Hopewell S, Schulz KF, Montori V, Gøtzsche PC, Devereaux PJ, Elbourne D, Egger M, Altman DG (2010). CONSORT 2010 explanation and elaboration: updated guidelines for reporting parallel group randomised trials. British Medical Journal, 340:c869.
[9] Littlewood C, Malliaras P, Chance-Larsen K (2015). Therapeutic exercise for rotator cuff tendinopathy: a systematic review of contextual factors and prescription parameters. International Journal of Rehabilitation Research, 38(2), 95-106.
[10] Østerås H, Torstensen TA, Østerås B (2010). High-dosage medical exercise therapy in patients with long-term subacromial shoulder pain: a randomized controlled trial. Physiotherapy Research International, 15(4), 232-42.