Daily News Glosses Over Real Progress in Cancer Immunotherapy
This blog addresses a news event that unfolded in the rapidly developing area of immuno-oncology just in the past few weeks — in effect, offering an addendum to the Cancer Immunotherapy Update in our current, September 2016 issue.
How much science does a business person need to know? In this industry, the more the better. Without at least an excellent layperson’s understanding of the company’s scientific underpinnings, a life science executive will remain vulnerable to the greatest of all hazards in managing the business — today’s news.
Case in point: The early-August reports of a single cancer-immunotherapy trial by Bristol-Myers Squibb, testing the hypothesis of PD-L1 as a biomarker predicting response to an anti-PD-1 drug, Opdivo (nivolumab). Within the framework of the Phase 3 “Checkmate-026” trial, the hypothesis failed, but the day BMS released the trial’s top-line results, I was amazed by the wave of negativity that engulfed the story, especially among industry reporters and opinionators. It seemed obvious their pessimistic interpretation of the trial, as proof the IO field had been hyped, fed the subsequent moral bombast in the consumer press. In a profoundly sad but poorly informed NY Times op-ed, the husband of a lung-cancer victim belittled BMS for dishonoring Opdivo clinical trial subjects with its television ads for the product — and stated flatly that the drug did not help lung-cancer patients. Actually, it does.
But it’s not a simple story. The development of new medicines is roughly progressive but never linear. Opdivo is already approved for treating non-small cell lung cancer (NSCLC). The Opdivo ad touted approved indications for advanced squamous NSCLC and non-squamous NSCLC for patients no longer responding to chemotherapy, and BMS already had strong clinical data to back what the ad claimed: “longer life.” But the subsequent biomarker-driven study was supposed to clarify and expand Opdivo’s use with a patient sub-population, not put the entire field of IO on trial.
I don’t blame the public for conflating clinical studies and TV commercials, and perhaps it is time again to question the wisdom of DTC advertising for prescription drugs in general. Yet, in this case, the moral indictment was unfounded and potentially harmful to further clinical trial recruitment in this area. Nothing in the “failed” trial undermined the findings of the previous ones with Opdivo — or of the anti-PD-1 therapeutic hypothesis. To spread the word about the proven benefits of an immunotherapeutic is not a dishonor to patients in follow-on trials, failed or otherwise.
Having defended a company’s honor in this case, I cannot defend its science. Granted, IO research and oncology research in general are moving at a much faster pace these days, and the infant field of biomarker development may sometimes lag behind the discovery of new disease mechanisms and the details within each one. Thus, by the time a clinical trial can reach completion, the science may have moved beyond the trial’s original premise. In the Opdivo biomarker trial, however, it appears BMS may have ignored early warning signs that the PD-L1 hypothesis had flaws. When news of the first breakthrough checkpoint-inhibitor trials broke several years ago, leading researchers such as Dr. Pam Sharma at MD Anderson were already reporting conflicting evidence on PD-L1 as a selective therapeutic marker and questioning its usefulness as such. An outside observer can only imagine why BMS moved ahead with such apparent confidence in the concept. Perhaps this is where the commercial and R&D sides of corporate pharma, which often communicate in ways that can be antagonistic and out of sync, intermingled to a counterproductive end.
When the initial pulse of checkpoint-inhibitor studies came out with their amazing response rates, it was natural to celebrate the sea change. The commercial people may well have supported and even advocated the use of a handy biomarker in follow-up trials. Yet about the same time, perhaps when BMS was designing the Checkmate-026 trial, there also seemed to be a growing scientific consensus that PD-L1 was it — the magic biomarker that would identify the patients most likely to benefit from anti-PD-1. A trial by Merck for its PD-1 blocker Keytruda (pembrolizumab) later appeared to back up the hypothesis. But the small trial by BMS, using a much more liberal criterion for PD-L1 positivity, did not.
Checkmate-026 essentially found no higher rates of treatment benefit for NSCLC patients qualified as PD-L1 positive. But the study used a much lower cutoff level of PD-L1 expression, 5 percent of tumor cells, than the Keytruda study’s level of 50 percent, thus expanding the potential treatment population, a.k.a. market, significantly. If anything, the poor results should have cast doubt on the chosen biomarker as a qualification tool. Instead, critics pounced on the trial data as proof against the entire class of anti-PD-1 and checkpoint inhibitor drugs: “Oh, here it comes, folks, the big let-down that always follows the hype over every promising new cancer therapy.” One person tweeted, “Pharma needs to stop rushing drugs to market for broad populations, only to come back years later with the biomarkers.”
Such a statement could easily act like anti-matter to the massive hyperbole that surrounds so much imaginative but still-unproven pharma science — a mutually destructive explosion of ignorance. Anyone with only limited knowledge of cancer immunotherapy might easily buy into the hype or the anti-hype. Ignorance of facts, and of real-world complexity, feeds poor decisions. Patients can waste time and resources chasing unreal expectations or suspicions. Industry executives can make strategic and operational decisions they greatly regret later — or overlook an opportunity altogether. But if they stay focused on where Checkmate-026 and the hundreds of other IO trials are taking the science, they may also come to understand where the science is taking the business.
Some people believe BMS may have let its greed for a broader patient population overrule its judgment in the trial design, bringing in a lot of patients who should not have qualified. Could this be a common hazard with Phase 3 trials, especially when companies smell “blockbuster”? Set the qualification bar too high and you get fewer patients; set it too low, you get more patients but poorer results, even failure. Or, more likely in this case — choose the wrong biomarker, and your qualification means nothing. Our 2016 Update, like the entire cancer-immunotherapy series, contains numerous examples of the hazards in testing IO drugs, singly or in combination with others.
In truth, the BMS trial results are consistent with the central thesis of our update: Anti-PD-1/PD-L1 therapy has become the backbone of IO, but only for a subset of patients who have generated sufficient TILs (tumor-infiltrating lymphocytes), a.k.a. CD8-T cells — either on their own or with a boost from a (not yet approved) co-stimulator.
Other emerging theories, rather than competing with the TIL thesis, may just be filling in the missing pieces of a larger picture. Studies of tumor microenvironments show tumor antigens may be more prevalent when durable responses to anti-PD-1 occur. But do the antigens merely give the TILs more targets once the treatment liberates T cells from the tumor’s immunosuppression? After all, it is tumor-infiltrating T cells that ultimately kill the cancer.
I asked Dr. Llew Keltner, the moderator of our “virtual roundtable” on cancer immunotherapy and main source for this year’s update, about the BMS trial and other continuing developments with implications for the IO space. His response follows below, underlining the overall case for why PD-L1 may not be an adequate biomarker at all:
KELTNER: It is very hard to interpret the early release of a single data point — “did not meet its final endpoint” — in the Checkmate-026 trial. We have to assume the final endpoint is the PFS [progression-free survival] “primary” endpoint noted in clinicaltrials.gov.
Although the actual calculations are beyond this discussion, the reality is that the number of applicable patients in the treatment arm of the Checkmate-026 trial (very likely less than 135), applying typical error rates, is far too small to make any statistical determination of “not meeting the final endpoint.” As in the great majority of clinical studies, error rates were not accurately taken into account in the design, and thus there is no way to interpret any result.
But the real failure of the trial is use of the wrong criterion to select patients likely to respond to nivolumab (Opdivo). Data from many studies are showing that tumors with high populations of tumor-infiltrating CD8-T cells (TILs) respond to anti-PD-1 and anti-PD-L1, regardless of PD-L1 status. Although PD-L1 expression may be a reasonable marker of a mounted tumor-cell defense against TILs, it is not likely to be as robust as looking at number, phenotype, and clonality of TILs. Roche’s early-September announcement of Phase 3 results for Tecentriq (atezolizumab), in advanced or metastatic non-small cell lung cancer, supports that hypothesis, showing “significant improvements” in patients’ survival regardless of their PD-L1 status.
Why not use TIL level as the patient selection factor? It could be ignorance and lemming-like belief in the “universality” of PD-L1 as a marker, or the opposite — knowing that only up to 40 percent of NSCLC patients will be TIL positive, thus potentially limiting use of nivolumab to a smaller population than with a 5-percent PD-L1 cutoff. But the available data do not support the latter argument; the numbers of TIL-positive patients and patients with 5 percent or greater PD-L1 expression should be similar, although the overlap might not be high. Perhaps an even more sophisticated design would be to combine Merck’s criterion of 50-percent PD-L1 expression with “or TIL-positive.” The patient group would be larger — probably not by 80 percent, but perhaps by well over 50 percent — and likely more reliable.
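Keltner’s sizing argument is inclusion-exclusion arithmetic: the fraction of patients qualifying under “50-percent PD-L1 or TIL-positive” is the sum of the two fractions minus their overlap. A minimal sketch of that arithmetic follows; every fraction below is an illustrative assumption for the sake of the example (the 40-percent TIL-positive figure echoes the passage above; the PD-L1 and overlap figures are invented), not trial data.

```python
# Back-of-envelope sizing of a combined selection criterion.
# All fractions are illustrative assumptions, not Checkmate-026 data.

def combined_fraction(p_a: float, p_b: float, overlap: float) -> float:
    """Fraction qualifying under 'A or B' by inclusion-exclusion:
    P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - overlap

til_positive = 0.40   # assumed share of NSCLC patients who are TIL-positive
pdl1_50 = 0.30        # assumed share meeting Merck's 50-percent PD-L1 cutoff
overlap = 0.20        # assumed share meeting both criteria

combined = combined_fraction(til_positive, pdl1_50, overlap)
growth = combined / pdl1_50 - 1.0   # growth vs. PD-L1-only selection
print(f"combined eligible fraction: {combined:.0%}")
print(f"increase over PD-L1-only:  {growth:.0%}")
```

With these assumed numbers, the combined group grows by roughly two-thirds over PD-L1-only selection, in line with Keltner’s “well over 50 percent, but probably not 80” estimate; the true gain hinges entirely on the overlap, which would have to be measured.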
TIL measurements can be made in ways that greatly reduce error rates. In a robust investigation, intratumoral T cells must stain for CD8. Then the CD8s must show a phenotype of high activation via secretion of multiple cytokines. Then the high-secreting CD8s must be shown to have a relatively high clonality index, a measure of their potential diversity. Expensive and thorough testing, but for an important yet small clinical trial, a great way to reduce the error rates for patient selection.
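The three-stage screen described above amounts to a sequential filter over per-sample measurements. The sketch below is purely illustrative: the record fields, thresholds, and sample values are invented for the example, and a real assay pipeline would work from flow-cytometry or sequencing output rather than three summary numbers.

```python
from dataclasses import dataclass

@dataclass
class TumorSample:
    """Toy per-sample readout; field names and thresholds are illustrative."""
    cd8_fraction: float       # share of intratumoral T cells staining CD8+
    cytokines_secreted: int   # distinct cytokines secreted at high levels
    clonality_index: float    # 0..1, higher = more clonal expansion

def is_til_positive(s: TumorSample,
                    cd8_cutoff: float = 0.10,
                    min_cytokines: int = 2,
                    min_clonality: float = 0.3) -> bool:
    """Apply the three screens in sequence: CD8 staining, then activation
    phenotype (multi-cytokine secretion), then clonality."""
    return (s.cd8_fraction >= cd8_cutoff
            and s.cytokines_secreted >= min_cytokines
            and s.clonality_index >= min_clonality)

samples = [
    TumorSample(0.25, 3, 0.45),   # passes all three screens
    TumorSample(0.05, 4, 0.50),   # fails the CD8 staining cutoff
    TumorSample(0.30, 1, 0.60),   # fails the activation-phenotype screen
]
eligible = [s for s in samples if is_til_positive(s)]
print(f"{len(eligible)} of {len(samples)} samples TIL-positive")
```

The point of requiring all three screens, as Keltner argues, is that each stage removes a different source of false positives, which is what drives the selection error rate down.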
To date, there are only anecdotal reports of differences in response between nivolumab and pembrolizumab (Keytruda), and the very questionable data from the Checkmate-026 trial provide no clarity on real potential differences between the two anti-PD-1s in solid tumors. There are stronger indications of difference in response and survival in hematological malignancies between nivolumab or pembrolizumab and pidilizumab (in development by Medivation – now Pfizer) — with pidilizumab the apparent winner, likely due to its second mechanism of action not involving binding to PD-1. Though interested investigators are beginning to think about head-to-head studies in prostate and other malignancies, it will be some time before any solid, statistically sound conclusions can be made about relative response in, and survival of, patients using various anti-PD-1 drugs. The “results” of the Opdivo Checkmate-026 study should in no way be relied on for such a comparative conclusion.
Biomarkers and biopsies in clinical trials promise to bring more predictability to treatment through patient selection, but they can also introduce their own uncertainties. You can see a lot of the devil in the details, looking at a reasonable estimation of the error rates for determination of PD-L1 biomarker positivity in Checkmate-026:
The literature (Phillips et al., 2015, and others) notes that there is at least a 10-percent error rate in operator-to-operator, and center-to-center, determination of PD-L1 positivity at the 5-percent cutoff of the Opdivo biomarker trial. In addition, there is a 16-percent or higher discordance rate from biopsy to biopsy in the same tumor samples. The actual error rate of the assay itself, or “reagent binding,” is likely to be in the normal IHC (immunohistochemistry) range of 6 to 7 percent. Thus the actual total error rate for the assays is almost certainly in excess of 30 percent. Accounting for variability in staining intensity and its relationship to scoring would increase the error rate further. Some studies (Villaruz and Socinski, 2013, and others) say the error rate for measurements of PFS (progression-free survival) is at least 25 percent; others have argued the real rates are much higher. Thus, at best, in the “026” trial the likelihood of a patient who is truly above the cutoff PD-L1 rate of 5 percent, and who truly has a tumor response, actually showing tumor response is only about 50 percent. That is not a good basis for making conclusions about the real patient responses in any trial.
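Keltner’s arithmetic compounds multiplicatively: for a trial readout to mean anything, a patient must be both correctly classified by the PD-L1 assay and correctly scored on PFS. A minimal sketch of that compounding, using the rates cited above and treating the error sources as independent (a simplifying assumption on my part):

```python
# Compound the error sources cited in the passage above,
# assuming the sources are independent (a simplifying assumption).

def joint_correct(*error_rates: float) -> float:
    """Probability that every independent step is error-free."""
    p = 1.0
    for e in error_rates:
        p *= (1.0 - e)
    return p

# PD-L1 assay components: operator/center (10%), biopsy-to-biopsy
# discordance (16%), reagent binding (midpoint of 6-7%)
assay_error = 1.0 - joint_correct(0.10, 0.16, 0.065)   # roughly 30 percent
pfs_error = 0.25                                       # PFS measurement error

p_both_correct = joint_correct(assay_error, pfs_error)
print(f"total assay error:              {assay_error:.0%}")
print(f"P(selection and PFS both right): {p_both_correct:.0%}")
```

Whether one simply sums the assay components (about 32 percent) or compounds them (about 29 percent), multiplying through the PFS error lands the joint probability in the low 50s, matching Keltner’s “only about 50 percent.”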