This is the first in a series of blogs that I hope will be a useful learning resource to help develop critical appraisal skills. We are awash with research, often with conflicting messages, and so it is important to understand which studies we can trust, which studies we should be cautious about accepting and what all this means for clinical practice. So, over the coming months, my intention is to select studies relevant to physiotherapy and write a critique of that research (apologies in advance; this is a long blog because I have taken time to explain the concepts in some detail; this won't be necessary in future blogs so I promise they will be shorter).
I thought it appropriate and timely to begin this series with a critical appraisal of some of my own work. So, this first blog will be a critical appraisal of the SELF study; a randomised controlled trial with the aim of evaluating the effectiveness of a self-managed single exercise programme in comparison to usual physiotherapy treatment. The published report of the SELF study can be accessed here; the open-access version here; the trial registration site here; the protocol here; and the pilot study here.
Some readers might be interested in the rationale and development of the self-managed single exercise programme. I won't go into detail on that in this blog, but further information is available via this published paper; the open-access version is here too.
And finally, before we begin, for those of you not too familiar with the randomised controlled trial, I wrote about the basic design and rationale in a previous blog, here. This might be a useful starting point if some of the terms used seem unfamiliar or confusing.
There are numerous resources available to help us critically appraise studies. During this blog I'm going to take advantage of one such resource; the Cochrane risk of bias tool. The Cochrane risk of bias tool asks questions in relation to sequence generation (selection bias), allocation sequence concealment (selection bias), blinding of participants and personnel (performance bias), blinding of outcome assessment (detection bias), incomplete outcome data (attrition bias), selective outcome reporting (reporting bias) and other potential sources of bias, to help us judge whether we can trust the findings of a randomised controlled trial or whether we should be cautious or even reject the findings.
Sequence generation and the SELF study
Sequence generation refers to the methods used to generate the random allocation sequence. Remember, as discussed in the previous blog, random allocation to the different groups in a trial is important to ensure that the groups are comparable to begin with. If the groups are different at the start of a trial, for example in relation to pain severity, then it is likely that they will be different at the end of the trial too and so we cannot say with any confidence that the differences are due to the treatments received. Rather the difference is more likely explained by this confounding factor(s).
Valid random allocation increases the chance that groups in trials will be comparable in relation to the factors that we know about, e.g. age, gender, pain severity, but also factors that we might not know about or whose importance we might not yet understand, e.g. diet, sleep quality, genetics. In the SELF study we used a computer-generated random allocation sequence, which is regarded as a valid method. Other valid methods include a coin toss or the throw of a die. Methods that would raise concern include allocation by date of birth, by attendance at the clinic, or by clinician preference.
So, in relation to sequence generation we can conclude that the SELF study presents a low risk of bias, i.e. the method used was trustworthy.
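To make the idea of a computer-generated sequence concrete, here is a minimal Python sketch of simple randomisation. It is illustrative only and is not presented as the procedure used in the SELF study; the function name, the seed value and the coding of the two arms as 1 and 2 are assumptions made for the example.

```python
import random


def allocation_sequence(n_participants, seed):
    """Return a list of 1s and 2s, one entry per participant.

    1 = new intervention, 2 = comparator (hypothetical coding).
    A fixed seed makes the list reproducible so it can be audited later.
    """
    rng = random.Random(seed)  # independent generator, does not touch global state
    return [rng.choice([1, 2]) for _ in range(n_participants)]


if __name__ == "__main__":
    sequence = allocation_sequence(20, seed=2024)
    print(sequence)
    print("Allocated to arm 1:", sequence.count(1))
    print("Allocated to arm 2:", sequence.count(2))
```

Simple randomisation like this can leave the groups unequal in size by chance, which is one reason trials often use blocked or stratified schemes instead.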
Allocation sequence concealment and the SELF study
Once the sequence by which patients will be randomly allocated to different treatment groups within a randomised controlled trial has been generated, we can begin to think about allocating patients. But imagine we have used computer software, for example Microsoft Excel, to generate a list of random numbers where the number one means that the patient will be allocated to receive the new intervention and the number two means that the patient will be allocated to receive the comparator treatment or usual care. You have printed this list out and it is on the desk in your clinic, so you can see that the next patient to be recruited to the study will be a number one and so allocated to the new intervention. When the next eligible patient arrives with multiple co-morbidities and low levels of motivation, and you do not believe that they will do well with the new intervention, what do you think the likely outcome will be? It is likely that the patient won't be recruited to the study. Then the next eligible patient arrives, fit and well and highly motivated; what do you think the likely outcome will be this time? There has been a fair bit of research looking at this particular issue: where the random sequence is not concealed, i.e. the next allocation is known or can be predicted, or where the allocation concealment is broken, the benefit of the new or experimental intervention is likely to be overestimated. Here is an interesting review that suggests two-thirds of conclusions in favour of one of the interventions were no longer supported if only trials with adequate allocation concealment were included. Clearly, then, allocation concealment is an important issue to consider.
In the SELF study, we used consecutively numbered sealed opaque envelopes to conceal the allocation. The patient identification number had to be written on the envelope before opening so we could then see that the patients had been allocated in sequence and without knowledge of the treatment group to which they were going to be allocated. This method is regarded as a valid method of allocation concealment although it is preferable if the allocation is undertaken away from the recruiting clinic by a third-party, for example a clinical trials unit, using a telephone or web-based system.
In my experience it is not uncommon for authors not to report their methods of allocation concealment; sometimes this will be because the allocation has not been concealed but sometimes, especially in older trials published before reporting guidelines were introduced, the detail has simply been omitted. At this point it is worth pointing out that the quality of reporting does not necessarily reflect the quality of the study. This is something I have looked at in a previous study for those of you interested. If in doubt, it is often worth contacting the authors directly to seek clarification.
The methods used to generate the random sequence and the methods used to conceal allocation are both methods to minimise selection bias. Selection bias is apparent when groups are not comparable to begin with or when inadequate methods of sequence generation and allocation concealment have been used. When we look at the baseline characteristics of the participants in the SELF study, they are comparable in relation to the data we collected. One apparent exception is that participants in the usual physiotherapy group reported a longer mean duration of symptoms. However, on closer inspection, the duration data is positively skewed, i.e. not normally distributed, and so when this data is summarised using the median value the groups are comparable: median duration of symptoms was 7 months in the self-managed single exercise group and 6 months in the usual physiotherapy group.
So, in relation to allocation concealment and more broadly in relation to selection bias we can conclude that the SELF study presents a low risk of bias.
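The point about skewed duration data can be illustrated with a short Python sketch. The numbers below are invented purely for illustration and are not data from the SELF study; the group names and the `summarise` helper are likewise hypothetical. The sketch shows how a single long-standing case can pull the mean of one group well above the other while the medians stay close.

```python
from statistics import mean, median

# Invented symptom durations in months. NOT data from the SELF study.
group_a = [3, 4, 5, 6, 6, 7, 8, 60]  # one participant with very long-standing symptoms
group_b = [3, 4, 5, 6, 7, 7, 8, 9]


def summarise(durations):
    """Return the mean and median of a list of durations."""
    return {"mean": mean(durations), "median": median(durations)}


if __name__ == "__main__":
    for name, durations in (("group A", group_a), ("group B", group_b)):
        summary = summarise(durations)
        print(f"{name}: mean = {summary['mean']:.1f} months, "
              f"median = {summary['median']:.1f} months")
```

With positively skewed data like this, the median is usually the fairer summary when judging whether groups were comparable at baseline.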
Blinding of participants and personnel and the SELF study
Blinding is where participants (patients) and study personnel, for example the clinicians, are unaware of which treatment the patient has received. In drug trials, blinding is readily achieved because pills can be developed which look the same but contain different ingredients. Hence, the patients and the clinicians will be unaware of whether the patient has taken the active drug or the placebo. In this context, you can probably appreciate that knowing which drug has been taken could affect the response or performance of the patient or clinician; 'my headache won't get better because I've only taken a placebo pill and my headache is real!' However, in many physiotherapy trials it is difficult or sometimes impossible to blind participants or personnel. In a trial of therapeutic ultrasound, this might be possible by using a de-tuned machine as the comparator but when comparing exercise to manual therapy, for example, this is not feasible because the differences between the interventions are clear to both patients and clinicians. However, the fact that blinding is not possible does not mean that knowledge of the allocation will not affect the results of the treatment, particularly if the patients or clinicians have strong preferences or expectations. For example, consider a patient with a strong preference for the new intervention under investigation who is randomly allocated to receive usual care; this patient is likely to suffer resentful demoralisation, i.e. the patient becomes demoralised due to their non-preferred allocation, which might adversely bias the reporting of the treatment results. If the patient were blinded then they would not know whether they had received the intervention or control treatment but, as mentioned, this is not always possible. (In a future blog I will consider the impact of patient preference and expectations in terms of alternative trial designs and their impact on clinical outcomes).
In the SELF study, we initially aimed to blind the participants by describing both the self-managed single exercise programme and usual physiotherapy treatment, i.e. treatment prescribed by the treating physiotherapist as they would typically do for patients presenting with this problem, as 'physiotherapy' treatments. This approach recognised that different physiotherapists treat individual patients differently and most patients arrive at the physiotherapy clinic not knowing the specific ingredients of the physiotherapy they will receive. However, the research ethics committee would not permit this and requested that a full description of both interventions be offered to the patients as part of the process of informed consent. Clearly, we could not blind the treating physiotherapists otherwise they would not know which treatment to deliver but it was apparent that the beliefs and preferences of the treating physiotherapists did have an impact; details are available through two qualitative studies that we undertook alongside the pilot trial and main trial here and here.
Hence, although the potential impact is unknown, in relation to blinding of participants and study personnel and hence potential for performance bias, we can conclude that there is potential for a high risk of bias with the SELF study.
Blinding of outcome assessment and the SELF study
Increasingly in physiotherapy we are recognising the value of patient reported outcome measures. Previously we have placed value on measuring outcomes such as range of movement, strength and tissue status, which we now recognise often don't associate well with the pain and disability that our patients complain of. In the SELF study the primary outcome was the shoulder pain and disability index (SPADI). The SPADI contains 13 items which the patient completes to detail their current level of shoulder pain and disability. Here, as with many other physiotherapy trials using patient reported outcomes, is the issue: the patient is reporting their own outcome and, as we have seen above, patients in the SELF study were not blinded. Hence, blinding of outcome assessment was also not possible.
Thus, although the potential impact is unknown, in relation to blinding of outcome assessment and hence potential for detection bias, we can conclude that there is potential for a high risk of bias with the SELF study.
Incomplete outcome data and the SELF study
Clearly, without data we cannot make a valid judgement about the effectiveness of our treatments. Hence, where outcome data is incomplete we need to exercise caution. In the SELF study only 70% of patients returned their SPADI data at 3 months, 56% at 6 months and 49% at 12 months. This is a concern: would our conclusions be different if all patients had returned their data? The answer is probably, or at the very least we could have been more confident in the conclusions drawn.
In this respect, both groups improved by what would be regarded as a clinically significant amount between baseline and three months; only the self-managed exercise group improved by a clinically significant amount between three and six months; and then the improvement between six and 12 months was not clinically significant for either group. At no time-point was there a statistically significant difference between the two groups, i.e. neither approach proved superior to the other. If we had the missing data, though, the changes or differences observed among those participants who returned their SPADI data might be accentuated or attenuated; unfortunately, we do not know.
One critical feature in relation to incomplete outcome data is whether there is a differential response between the treatment groups, i.e. is there greater drop-out in one group compared to the other? This might be an important indicator relating to both the safety and effectiveness of an intervention, i.e. the intervention might be causing harm that leads patients to drop out, so findings from a trial with differential drop-out should be treated extremely cautiously. Differential drop-out was not observed in the SELF study but bias could still be introduced if the reasons for missing outcomes differ between the groups. We do not know the reasons, so caution should be exercised.
Thus, in relation to incomplete outcome data and hence potential for attrition bias, we can conclude that there is potential for a high risk of bias with the SELF study.
Selective outcome reporting and the SELF study
Researchers typically collect lots of data when undertaking research, and therefore potentially have lots of opportunity to detect whether a treatment works. As we know in research, positive findings can be a chance occurrence and if you collect enough data it increases the likelihood of attaining a significant finding. Or, for example, one section of an outcome measure might show significant findings if taken in isolation but not when treated as a whole measure.
However, if the outcome measure or measures that are to be collected and used to evaluate whether a treatment is effective are not detailed before the trial begins then we would struggle to know whether authors have only reported a biased selection of their data. This is a significant issue in health research, with recent movements to require researchers to register their trials on open databases before the study begins and to also publish protocols. This makes it possible to check that researchers did what they said they would do. Further detail is available here. In relation to the SELF study, as detailed above, the trial was registered prior to initiation and the protocol was published. The published report refers directly to the pre-specified outcomes and hence we can conclude that there is a low risk of reporting bias with the SELF study.
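To put a rough number on the earlier point that collecting enough data increases the likelihood of a chance significant finding, here is a minimal Python sketch. It assumes every outcome is tested independently at the conventional 5% level and that the treatment truly has no effect; real trial outcomes are usually correlated, so treat the figures as a guide rather than an exact answer. The function name is hypothetical.

```python
def chance_of_false_positive(n_outcomes, alpha=0.05):
    """Probability of at least one 'significant' result arising by chance alone.

    Assumes n_outcomes independent tests, each at significance level alpha,
    with no true treatment effect on any outcome.
    """
    return 1 - (1 - alpha) ** n_outcomes


if __name__ == "__main__":
    for k in (1, 5, 10, 20):
        print(f"{k:>2} outcomes tested: "
              f"{chance_of_false_positive(k):.0%} chance of at least one false positive")
```

This is why a single pre-specified primary outcome, named in the registration and protocol, carries more weight than a significant result picked out afterwards from many measures.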
Other potential sources of bias and the SELF study
There are a number of other potential sources of bias that might affect the internal validity or trustworthiness of a randomised controlled trial. Some relate to the specific form of randomised controlled trial used, e.g. cluster, cross-over, and further detail is provided here.
From my perspective, beyond the potential sources of bias discussed above, there are no obvious other sources of bias to consider in relation to the SELF study but if you managed to read this far and have some thoughts then please do let me know through the comments section.
A final point to note: the discussion above has been focused on internal validity, i.e. trustworthiness, rather than external validity or generalisability. Clearly a valid, trustworthy trial is of limited use if it is not generalisable, i.e. cannot be applied more widely in practice. There are points to consider in relation to the external validity of the SELF study linked to its development and pragmatic methods and I will aim to shed some light on these issues in the coming weeks.
So, to conclude, based on this structured critical appraisal, it would be sensible to be cautious when interpreting the results of the SELF study. There are some clear strengths in the design of the study but, particularly because of the risk of attrition bias, caution is needed. Although the pilot study reported similar results to the SELF study at three months, which is reassuring, replication of the study with particular attention to reducing attrition is indicated, particularly because the self-managed exercise is simple and confers minimal burden on patients. Hence, if the self-managed single exercise programme proves comparable to usual physiotherapy treatment then there is pragmatic value.
In terms of future research, I am currently involved with the GRASP study which, although not a replication of the SELF study, overlaps to some degree and will provide further insight in due course.
Thanks for reading, hope it was useful; thoughts gratefully received.