Showing posts with label non-response bias. Show all posts

Thursday, April 6, 2023

The challenge of measuring shoulder arthroplasty outcomes: bias, ceiling effects, and practicality.

Each surgeon has the opportunity - indeed the responsibility - to keep track of her or his surgical outcomes for the purpose of knowing what is and what is not working in the practice. This point is discussed in detail in this link. Because many failures of arthroplasty occur more than five years after surgery, long-term followup is critical.




Three of the key elements of an effective/informative/practical outcome system are (1) capturing the highest possible percentage of patients treated, (2) being able to present the results to patients in terms that patients and surgeons understand, and (3) having a system that is validated and universally applicable so that data can be compared among centers.

#1 requires minimizing exclusion bias. Many scales, such as the Constant Score, the UCLA score, and the Shoulder Arthroplasty Smart score, require the patient to return to the office for the measurement of ranges of motion (and, in some cases, strength). In addition to risking observer bias and inter-observer variability, the requirement of returning to the office risks selectively excluding those patients living at a distance from their provider, those unwilling or unable to return, and those of limited economic means. Computer-based scoring systems, such as the PROMIS and Computer Adaptive Testing, risk selectively excluding patients without access to computers, those who are not computer literate, and those not proficient in English. The ideal system makes it easy for all patients to be included in long-term followup: it is inexpensive, quick to complete, and accessible regardless of the patient's location, computer literacy, and computer access.

#2 requires that the outcome data are presented in a way that is meaningful to the patient and surgeon. Most patients will have difficulty understanding the significance of a "score of 72" on PROMIS, Constant, UCLA or SAS, but many would understand the significance of the improvement in specific shoulder functions achieved by their surgeon for a specific condition, presented as shown below (results obtained using the Simple Shoulder Test for extended head hemiarthroplasty in the treatment of patients having cuff tear arthropathy with retained active elevation).





#3 Most of the commonly used outcome measures have been carefully validated; for example, see Is the Simple Shoulder Test a valid outcome instrument for shoulder arthroplasty?, which shows that, in spite of the fact that 15% of the patients achieved the maximal SST score, there was a near-perfect correlation between satisfaction and the final SST score, suggesting that the "ceiling effect" is likely to have little clinical significance.





What this means is that a shoulder that can perform each of the 12 SST functions (below) is an excellent and highly satisfactory shoulder.





If the ceiling effect were a concern, one could add a thirteenth question: "Can you throw a football 100 yards with the affected arm?". Very few shoulders, normal or post-arthroplasty, would hit the ceiling of 13/13 "yes" responses.


In the same vein, the authors of Validation of a machine learning–derived clinical metric to quantify outcomes after total shoulder arthroplasty and Exactech Equinoxe anatomic versus reverse total shoulder arthroplasty for primary osteoarthritis: case controlled comparisons using the machine learning–derived Shoulder Arthroplasty Smart score correctly point out that the Shoulder Arthroplasty Smart score (you can experiment with it on this link) does not have a ceiling effect. In order to achieve the ceiling of the SAS score, the shoulder needs to be measured as having 180 degrees of active forward elevation, internal rotation to T7, and 90 degrees of active external rotation with the arm at the side.


These values will be difficult to attain because they are substantially greater than those found in the general population (see Shoulder range of movement in the general population: age and gender stratified normative data using a community-based cohort): average active shoulder flexion was 160° and average active external rotation was 59°.

Another approach for those concerned about the "ceiling effect" is put forth by the authors of Quantifying success after anatomic total shoulder arthroplasty: the minimal clinically important percentage of maximal possible improvement. They expressed the amount of improvement as the percentage of maximum possible improvement (%MPI) (based on a prior study: The prognosis for improvement in comfort and function after the ream-and-run arthroplasty for glenohumeral arthritis: an analysis of 176 consecutive cases). The %MPI is calculated as (postoperative score - preoperative score)/(perfect score - preoperative score). The "ceiling" in the %MPI would only be reached if the score improved from the worst possible score to the best possible score - a rare event.
Then they determined the minimal clinically important difference (MCID) for the %MPI using the anchor method. Interestingly, their calculated MCID-%MPI values are similar for many of the commonly used scores: 33% for the SST, 32% for the ASES score, 38% for the UCLA score, 30% for the Shoulder Pain and Disability Index score, and 33% for the Shoulder Arthroplasty Smart score.
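The %MPI arithmetic can be sketched in a few lines of Python. This is only an illustration of the formula quoted above, not the authors' code: the function names, the dictionary of thresholds, and the example patient are hypothetical, and the thresholds are simply the MCID-%MPI values listed in the preceding paragraph.

```python
def percent_mpi(pre: float, post: float, perfect: float) -> float:
    """Percentage of maximal possible improvement:
    100 * (post - pre) / (perfect - pre).

    Because both numerator and denominator change sign together, the same
    formula works for scales where a lower number is better (assumption:
    the caller passes that scale's best possible score as `perfect`).
    """
    if pre == perfect:
        raise ValueError("no improvement is possible from a perfect preoperative score")
    return 100.0 * (post - pre) / (perfect - pre)


# MCID thresholds for %MPI as quoted in the text (in percent).
MCID_PERCENT_MPI = {"SST": 33, "ASES": 32, "UCLA": 38, "SPADI": 30, "SAS": 33}


def exceeds_mcid(scale: str, pre: float, post: float, perfect: float) -> bool:
    """True if the improvement reaches the quoted MCID-%MPI for this scale."""
    return percent_mpi(pre, post, perfect) >= MCID_PERCENT_MPI[scale]


# Hypothetical patient: SST improves from 3 to 9 out of a possible 12.
# Improvement is 6 of a possible 9 points, i.e. about 66.7 %MPI.
example = percent_mpi(pre=3, post=9, perfect=12)
print(f"{example:.1f}% of maximal possible improvement")
```

Expressing improvement this way means a patient who starts at 10/12 and reaches 12/12 is credited with 100 %MPI, rather than being penalized for having had only two points available to gain.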


Comment: A surgeon's choice of the optimal patient followup system needs to be made in consideration of the above factors as well as the required staff time and cost of implementation. The goal is to capture long-term data on the highest percentage of patients treated using a method that is affordable and practical for the office.

To support cutting-edge shoulder research that is leading to better care for patients with shoulder problems, click on this link.

Follow on twitter: https://twitter.com/shoulderarth
Follow on facebook: https://www.facebook.com/frederick.matsen
Follow on LinkedIn: https://www.linkedin.com/in/rick-matsen-88b1a8133/

Here are some videos that are of shoulder interest
Shoulder arthritis - what you need to know (see this link).
How to x-ray the shoulder (see this link).
The ream and run procedure (see this link).
The total shoulder arthroplasty (see this link).
The cuff tear arthropathy arthroplasty (see this link).
The reverse total shoulder arthroplasty (see this link).
The smooth and move procedure for irreparable rotator cuff tears (see this link).
Shoulder rehabilitation exercises (see this link).

Saturday, January 29, 2022

Optimizing patient followup after shoulder surgery

As pointed out by E.A. Codman almost a century ago, the goal of outcome research is to follow every patient long enough to determine whether the treatment was successful and to ask, "if not, why not?" The goal of clinical research using patient self-assessment is to capture the largest possible percentage of those potentially eligible, without the risk of non-response or selection bias that may systematically exclude certain categories of patients. Is computer adaptive testing the ideal tool in this regard?

Let's consider a recent article, Performance and responsiveness to change of PROMIS UE in patients undergoing total shoulder arthroplasty. These authors studied the Patient-Reported Outcomes Measurement Information System Upper Extremity Computer Adaptive Test (PROMIS UE CAT) with respect to its responsiveness in patients undergoing shoulder arthroplasty. They found that the PROMIS UE demonstrated excellent correlation (range: 0.68-0.84) with the standard legacy instruments (the Simple Shoulder Test, the American Shoulder and Elbow Surgeons Score, and the Oxford Shoulder Score) at all postoperative time-points.




Comment:  The authors pointed out that "the mean number of questions administered by PROMIS UE CAT was 4, compared to 11 in ASES and 12 in both SST and OSS, which translates to a lower patient response burden when filling out PROMIS surveys"; however, it is not at all clear that this "patient response burden" of an extra 34 seconds (see below) compromises the patient participation in postoperative followup. 


Instead, consider "Patients were required to have access to a working email and a computer/phone to participate in this study so socioeconomic factors and advanced age could have limited participation and confound our results." While "all patients undergoing TSA were offered enrollment in this study", only 97 of all the TSAs performed by this busy arthroplasty service between March 2019 and April 2021 could be included. 


Computer adaptive testing requires that the patient have access to, and be trained in the use of, a computer interface that they will use before surgery and after surgery at the designated followup intervals. While this may be straightforward for some patients, other patients may not have the necessary training and access.




It seems important to use a followup system that does not carry the risk of systematically excluding patients who are older, less educated, less healthy or socioeconomically disadvantaged.

For example, in a study discussed in this link, one third of the eligible patients did not provide a 12-month PROMIS response. It appears that the characteristics of the PROMIS system have the potential for excluding a substantial number of patients - possibly those with inferior results or those from less advantaged socio-economic situations (see this link).


For contrast, compare the response rate from the above study with that in What is a Successful Outcome Following Reverse Total Shoulder Arthroplasty?, a study in which the more easily accessible Simple Shoulder Test enabled 87% of the patients in the original sample to provide two-year followup. In contrast to the PROMIS, the SST can be completed anywhere and requires only a pencil or a pen.


Another article is relevant: Correlation of PROMIS Physical Function Upper Extremity Computer Adaptive Test with American Shoulder and Elbow Surgeons shoulder assessment form and Simple Shoulder Test in patients with shoulder arthritis 


The purpose of this study was to evaluate the Patient-Reported Outcomes Measurement Information System Physical Function Upper Extremity Computer Adaptive Test (PROMIS PFUE CAT) measurement tool against the already validated American Shoulder and Elbow Surgeons (ASES) shoulder assessment form and the Simple Shoulder Test (SST) in patients with shoulder arthritis.

The average times to complete the SST and PROMIS PFUE CAT were determined to be 96.9 ± 25.1 seconds and 62.6 ± 22.8 seconds, respectively. The question is whether a saving of about 34 seconds is worth the limitations of the PROMIS.


The scatter plot from this article also brings up another issue with the PROMIS: four patients who indicated that they could perform none of the 12 functions of the SST still had PROMIS scores in the same range as three patients who could perform eight of these functions. Thus, the PROMIS was unable to discriminate between a non-functioning shoulder and a reasonably functional one.

 


Another recent article is relevant: PROMIS Upper Extremity Underperforms Psychometrically Relative to American Shoulder and Elbow Surgeons Score in Patients Undergoing Primary Rotator Cuff Repair


PROMIS UE-CAT correlated to a degree with the ASES (r=0.684) and had a 4% floor effect and no ceiling effect.  While PROMIS UE-CAT initially required fewer test items for overall equivalent coverage of shoulder function assessment, final models after recursive item elimination revealed the ASES instrument to have more well-fitting items over a broader range of shoulder function.


The authors concluded that: "Until further refinements in the PROMIS UE-CAT instrument are made, it should not replace the ASES instrument in patients undergoing primary RCR."


The authors of Performance and responsiveness to change of PROMIS UE in patients undergoing total shoulder arthroplasty described the so-called "ceiling effect" they observed with the OSS (17%) and the Simple Shoulder Test (18%). How big an issue is this really? In the graph below from The prognosis for improvement in comfort and function after the ream-and-run arthroplasty for glenohumeral arthritis: an analysis of 176 consecutive cases ...
































... it can be seen that a large number of patients having the ream and run for osteoarthritis "hit the ceiling" of 12 out of 12 on the SST. This means that:


-the shoulder was comfortable at the side

-the shoulder allowed the patient to sleep comfortably

-the shoulder allowed reach to the small of the back to tuck in a shirt

-the shoulder allowed placement of the hand behind the head with the elbow straight out to the side

-the shoulder could lift a coin, a one pound weight, and an eight pound weight to the level of the top of the head without bending the elbow

-the shoulder allowed carrying 20 pounds at the side

-the shoulder allowed tossing a softball 20 yards underhand

-the shoulder allowed throwing a softball 20 yards overhand

-the shoulder allowed washing the back of the opposite shoulder

-the shoulder allowed work full time at the patient's usual job


In our view, that's a pretty high ceiling; it is remarkable that so many patients can hit it after the ream and run. Obviously one could avoid the "ceiling effect" by adding a question such as, "would your shoulder allow you to throw 100 yards?", but it seems that "yes" responses to each of the 12 existing questions indicate a comfortable and highly functional shoulder.
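For readers who want to check the ceiling rate in their own data, the calculation can be sketched as follows. This is a minimal illustration that assumes the SST is scored as the number of "yes" answers to its 12 questions; the function names and the ten-patient cohort are hypothetical and are not data from any of the studies cited here.

```python
from typing import Sequence

SST_ITEMS = 12  # the Simple Shoulder Test has 12 yes/no questions


def sst_score(answers: Sequence[bool]) -> int:
    """SST score as the count of 'yes' answers (assumed scoring rule)."""
    if len(answers) != SST_ITEMS:
        raise ValueError(f"expected {SST_ITEMS} answers, got {len(answers)}")
    return sum(bool(a) for a in answers)


def ceiling_rate(scores: Sequence[int], max_score: int = SST_ITEMS) -> float:
    """Fraction of patients whose score equals the maximum possible score."""
    if not scores:
        raise ValueError("at least one score is required")
    return sum(s == max_score for s in scores) / len(scores)


# Hypothetical cohort of ten postoperative SST scores (illustration only).
cohort = [12, 10, 12, 7, 11, 12, 9, 12, 8, 12]
print(f"ceiling rate: {ceiling_rate(cohort):.0%}")  # 5 of the 10 scores are 12
```

Reporting this fraction alongside patient satisfaction makes it easier to judge whether patients at the ceiling are in fact the ones with comfortable, highly functional shoulders.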


Performance and responsiveness to change of PROMIS UE in patients undergoing total shoulder arthroplasty points out that, while the minimal clinically important difference (MCID) has been well established for the Simple Shoulder Test (see Is the Simple Shoulder Test a valid outcome instrument for shoulder arthroplasty?) and many of the other legacy scales, "further quantification of meaningful responsiveness to change will require estimation of the minimal clinically important difference and substantial clinical benefit for PROMIS UE CAT".



Finally, Performance and responsiveness to change of PROMIS UE in patients undergoing total shoulder arthroplasty erroneously states that "all except SST have been translated and adapted to several other languages." See below:


    Simple shoulder test and Oxford Shoulder Score: Persian translation and cross-cultural validation


    Validation of the Simple Shoulder Test in a Portuguese-Brazilian Population. Is the Latent Variable Structure and Validation of the Simple Shoulder Test Stable across Cultures?







Friday, September 3, 2021

Do we know what our patients think about the surgical care we give?

Press Ganey Surveys in Patients Undergoing Upper-Extremity Surgical Procedures: Response Rate and Evidence of Nonresponse Bias

These authors point out that patient satisfaction surveys are important measures of the patient experience and these surveys can provide data for quality improvement. 


They sought to determine the response rate and the factors associated with the completion of the Press Ganey (PG) Ambulatory Surgery Survey (PGAS) in patients who underwent ambulatory upper extremity surgical procedures.


The PGAS poses questions such as those below regarding the provider of the patient's care.


They conducted a review of the orthopaedic registry at a single academic ambulatory surgical center for patients who underwent an upper-extremity surgical procedure from 2015 to 2019. The institutional Press Ganey database was queried to determine the patients who completed the PGAS postoperatively.


They calculated the response rate, and compared the baseline characteristics and patient-reported outcome measures between responders and nonresponders.


Of the 1,489 patients included, 201 (13.5%) were responders and 1,288 (86.5%) were nonresponders. 
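The percentages above follow directly from the counts reported in the study; a short check in Python (the variable and function names are illustrative only):

```python
def rate_percent(count: int, total: int) -> float:
    """Share of `total` represented by `count`, in percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


# Counts reported in the study discussed above.
total_patients = 1489
responders = 201
nonresponders = total_patients - responders  # 1,288

print(f"responders:    {rate_percent(responders, total_patients):.1f}%")     # 13.5%
print(f"nonresponders: {rate_percent(nonresponders, total_patients):.1f}%")  # 86.5%
```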


Differences existed in baseline characteristics between groups, with responders being significantly older (p = 0.004) and having significantly higher proportions of White race (p < 0.001), college education (p = 0.011), employment (p = 0.005), marriage (p = 0.006), and higher income (p < 0.001).



Responders had significantly better baseline Patient-Reported Outcomes Measurement Information System scores across multiple domains (p < 0.05), but these differences did not exceed the MCID.



The authors concluded that the PGAS response rates were low (13.5%) and that the responders may not be representative of all patients.


Comment: This study points out the perils of non-response bias in the Press-Ganey scores. It provides clear evidence that the 13.5% of patients who responded did not represent the diversity of the population of interest. 


We suggest that a similar non-response bias may be present in the application of computer-based outcome measures, such as the computer-based PROMIS. We suspect that PROMIS responders may differ from PROMIS non-responders and that the differences in demographics may be similar to those seen for the PG scores. This is important in that both the Press-Ganey and the PROMIS scores may yield results that are not representative of the total group of patients under consideration; furthermore because of the non-response bias effect the results obtained from the responders may be better than they would be for the entire group of interest. 
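The mechanism by which non-response can inflate reported results can be illustrated with a small calculation. Every number below is hypothetical and chosen only to show the arithmetic; none comes from the Press Ganey study or from any PROMIS dataset. The sketch assumes two groups of patients that differ both in their average outcome score and in their likelihood of responding.

```python
# Hypothetical strata: (share of all patients, mean outcome score, response probability).
# These values are invented for illustration only.
STRATA = {
    "more advantaged": (0.4, 80.0, 0.30),
    "less advantaged": (0.6, 65.0, 0.05),
}


def population_mean(strata: dict) -> float:
    """Mean outcome if every patient were captured."""
    return sum(share * mean for share, mean, _ in strata.values())


def overall_response_rate(strata: dict) -> float:
    """Expected fraction of all patients who respond."""
    return sum(share * p for share, _, p in strata.values())


def responder_mean(strata: dict) -> float:
    """Expected mean outcome among responders only.

    Each stratum is weighted by share * response probability, so groups
    that respond more often dominate the reported average.
    """
    weight = overall_response_rate(strata)
    return sum(share * p * mean for share, mean, p in strata.values()) / weight


print(f"response rate:        {overall_response_rate(STRATA):.0%}")
print(f"mean, all patients:   {population_mean(STRATA):.1f}")
print(f"mean, responders only:{responder_mean(STRATA):.1f}")
```

With these invented inputs, the responders' average (about 77) overstates the average for the whole group (71) even though no individual score was misreported; the gap comes entirely from who answered.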


There seem to be two lessons here: (1) in presenting results with measures such as PG and PROMIS, we must be alert to the response rate as well as to the risk of non-response bias and (2) we must seek and use measures that are as inclusive as possible, so that access, questionnaire fatigue, and technology do not form barriers to patient participation. 

