Pancreatic cancer evolution and heterogeneity: integrating omics and clinical data

Ashton A. Connor1 and Steven Gallinger2,3,4,5 ✉

Abstract | Pancreatic ductal adenocarcinoma (PDAC), already among the deadliest epithelial malignancies, is rising in both incidence and contribution to overall cancer deaths. Decades of research have improved our understanding of PDAC carcinogenesis, including characterizing germline predisposition, the cell of origin, precursor lesions, the sequence of genetic alterations, including simple and structural alterations, transcriptional changes and subtypes, tumour heterogeneity, metastatic progression and the tumour microenvironment. These fundamental advances inform contemporary translational efforts in primary prevention, screening and early detection, multidisciplinary management and survivorship, as prospective clinical trials begin to adopt molecular-based selection criteria to guide targeted therapies. Genomic and transcriptomic data on PDAC were also included in the international pan-cancer analysis of approximately 2,600 cancers, a milestone in cancer research that allows further insight through comparison with other tumour types. Thus, this is an ideal time to review our current knowledge of PDAC evolution and heterogeneity, gained from the study of preclinical models and patient biospecimens, and to propose a model of PDAC evolution that takes into consideration findings from varied sources, with a particular focus on the genomics of human PDAC.

1Department of Surgery, Houston Methodist Hospital, Houston, TX, USA.
2Hepatobiliary/Pancreatic Surgical Oncology Program, University Health Network, Toronto, ON, Canada.
3PanCuRx Translational Research Initiative, Ontario Institute for Cancer Research, Toronto, ON, Canada.
4Wallace McCain Centre for Pancreatic Cancer, Princess Margaret Hospital Cancer Centre, Toronto, ON, Canada.
5Ontario Pancreas Cancer Study, Mount Sinai Hospital, Toronto, ON, Canada.
✉e-mail: steven.gallinger@uhn.ca
https://doi.org/10.1038/s41568-021-00418-1
Nature Reviews | Cancer | volume 22 | March 2022 | 131 | www.nature.com/nrc

Pancreatic ductal adenocarcinoma (PDAC) is a gland-forming malignancy arising from the pancreas, a retroperitoneal organ. It has a low, albeit increasing, incidence of 13 per 100,000 people per year and a lifetime risk of less than 2%, with no sex or geographical bias. Well-proven epidemiological risk factors include smoking, pancreatitis, alcohol use, obesity and type 2 diabetes mellitus1. The predominantly advanced ages at diagnosis are approximately normally distributed and concentrated about a median of 70 years. PDAC has one of the highest death rates of any solid organ malignancy, with overall 5-year survival of less than 10%. Cure rates have increased minimally over decades despite improvements in surgical and medical management, and PDAC is currently the third most common cause of cancer-related deaths in the United States and is projected to become the second most common2.

These stark statistics have inspired international research efforts focused on all stages of the disease, including germline risk, precursor lesions, primary tumours and metastases. What were initially siloed research programmes have coalesced into multi-institutional, multinational collaborations and team-based funding enterprises. Translational efforts now inform the care of patients with PDAC at all levels, including screening trials and guidelines, interdisciplinary care, including ‘molecular’ tumour boards, and molecular inclusion criteria for trials of novel therapies.

Tumour evolution is the result of genomic instability caused by somatic mutations, chromosomal rearrangements, copy number alterations and epigenetic changes, resulting in loss of tumour suppressor genes and activation of oncogenes that disengage the affected cells from their regulatory cycles. These molecular events are favoured by natural selection, as neoplastic cells have competitive growth advantages. Understanding this progression is the unifying goal of much cancer research and informs better management strategies.

During the course of PDAC research, two models of PDAC evolution have been proposed, as depicted in Fig. 1a. The first to be proposed was a stepwise progression model based on the observation that a higher degree of precursor dysplasia is associated with higher accumulation of genetic alterations, eventually resulting in malignancy over many years, akin to that described for other epithelial cancers3,4. The more recently described punctuated evolution progression model is based on the observation that PDAC driver gene inactivations can occur simultaneously in complex chromosomal rearrangements, implying rapid tumour development and dissemination5, corresponding more with clinical presentations of relatively sudden onset of advanced disease in some patients6.

Herein, we review PDAC research in the context of these two evolutionary models, with emphasis on the sequencing of human samples. For clinicians and scientists involved in PDAC research and management, this is an important moment to reflect on the global experience to date and prioritize future avenues. For those not directly involved in this field, PDAC provides invaluable lessons for how to develop a truly collaborative, well-organized and well-funded research community that engages patients and informs their care.

In this Review, we outline the evidence to date on PDAC predisposition, the cell of origin and precursor lesions, the sequence of genetic alterations, including simple and structural alterations during carcinogenesis, resultant transcriptional changes, subtypes and tumour heterogeneity, the interaction with the tumour microenvironment, metastatic progression and therapeutic insights. However, the study of pancreatic cancer extends beyond the scope of this Review and also encompasses circulating tumour markers and circulating cells7,8 as well as other pancreatic cancer types9, such as acinar cell or neuroendocrine carcinomas, which were recently reviewed elsewhere.

Pancreatitis | Inflammation of the pancreas, the causes of which include gallstones, alcohol and germline predisposition.

Predisposition, origins and precursors

In humans, PDAC is thought to arise from ductal epithelium that undergoes neoplasia. The stepwise model posits gradual progression from dysplasia to malignancy through microscopically discernible precursor lesions, including pancreatic intraepithelial neoplasms (PanINs), intraductal papillary mucinous neoplasms and mucinous cystic neoplasms. In most individuals, this process is entirely somatic, while up to 10% of patients with PDAC have a germline predisposition to malignancy1,10,11.
A minority of this predisposition is attributable to variants in known high-penetrance genes, summarized in Table 1, including those linked to hereditary breast and ovarian cancer and Lynch syndrome, which can give rise to pancreatic tumours with particular phenotypes12. The majority of familial aggregation is either attributed to a large number of variants of individually low penetrance13 or not yet explained, but may be prognostic, predictive and inform screening strategies14,15. The Precede Consortium is an international collaboration to improve screening and prevention of PDAC in those with increased heritable risk.

Evidence that the epithelium lining pancreatic ducts is the origin of the PDAC cells, which progress through higher grades of cytological atypia, comes from morphological3,16–18, genetic19–22 and expression23–26 studies and from organoid models27,28. Precursor lesions and PDACs are seen histologically to arise contiguously from the ductal epithelium, with stepwise progression from low-grade to high-grade dysplasia. Older Sanger and newer next-generation sequencing data demonstrate that histologically normal ductal epithelium, PanINs and PDACs share similar genetic alterations. Some microscopically normal ductal epithelium and nearly all low-grade PanINs harbour KRAS mutations20,29, while high-grade lesions are more likely to contain tumour suppressor gene alterations, telomere shortening19 and passenger mutations concordant with those in the synchronous invasive cancer. Increasing PanIN dysplasia is associated with more frequent loss of cyclin-dependent kinase inhibitor 2A (CDKN2A)30 and with loss of TP53 and SMAD4 in the highest-grade lesions3,31, as depicted in Fig. 2.
Cell type can also be determined by expression of marker genes, including cytokeratin 19 (CK19; also known as KRT19), SRY-box 9 (SOX9) and GATA6 specific to pancreatic ductal cells, NKX6-1 specific to pancreatic islets and GATA4 specific to acinar cells23,24. Immunohistochemistry and RNA sequencing studies of mouse and human PanINs and PDACs show expression of ductal markers across all degrees of tumour cell purity. Exocrine acinar marker expression has been reported but interpreted as contamination25,26. Organoid models derived from both mouse and human normal ductal and neoplastic tissues all similarly express markers of ductal cells, but not of other pancreatic cell lineages27. Furthermore, mouse and human neoplastic organoids have recapitulated morphological, genetic and transcriptomic features of PanIN progression to PDAC over time, and organoids derived from human pluripotent stem cells induced towards a pancreatic cell lineage and then PDAC behave similarly to human tumours, expressing CK19 and SOX9 (ref. 28).

[Figure 1 panel labels. a: rate of somatic mutations against time, for stepwise progression and punctuated progression. b: normal pancreatic cell to pancreatic cancer cell under the punctuated, hybrid and stepwise models, marking the first, second and third driver gene mutations and chromothripsis.]

Fig. 1 | Stepwise model versus punctuated model of pancreatic ductal adenocarcinoma progression. a | The expected rates of somatic simple and structural mutations (y axis) over time (x axis) in pancreatic ductal adenocarcinoma if one is following either a stepwise progression model or a punctuated evolution progression model. The stepwise model (blue line) assumes a steady rate of mutation accumulation over time. The punctuated evolution model (red line) assumes a steady rate of mutation accumulation that is accelerated by brief periods of more rapid change, shown here as an inflexion point in the curve. However, it should be noted that the depicted initial rates and their variations over time are hypothetical. b | The hypothetical accumulation of somatic events over time as a pancreatic cell undergoes dysplasia, including structural (chromothripsis) and simple somatic mutations under a punctuated model (top), a stepwise model (bottom) or a hybrid model (middle).

Pancreatic intraepithelial neoplasms (PanINs) | Microscopic lesions, either flat or papillary, arising in the lining of the intrapancreatic ducts and composed of cuboidal or columnar cells with degrees of cytological and architectural atypia.

Intraductal papillary mucinous neoplasms | Microscopic to macroscopic lesions arising in the lining of the intrapancreatic ducts, either main or branch ducts, and composed of mucinous epithelial cells with degrees of cytological and architectural atypia.

Mucinous cystic neoplasms | Macroscopic cystic lesions arising in the pancreas, usually the body and tail, without communication with the pancreatic duct, and composed of mucinous epithelial cells with degrees of cytological and architectural atypia.

Lynch syndrome | The result of heterozygous germline deficiency in mismatch repair genes (MLH1, MSH2, MSH6 and PMS2), which causes an increased risk of certain cancers, especially colorectal, endometrial and pancreatic adenocarcinomas.

Cytological atypia | Abnormal cellular appearance, which could be the shape, colour or size of the entire cell or intracellular contents, often including large, irregularly shaped and hyperchromatic nuclei.

Early objections to PanINs as the precursor lesions were that these may be, paradoxically, intraductal extensions of the invasive tumour, as they are seen in large numbers, distantly and discontinuously from the primary tumour. These concerns were addressed in a recent study using exome sequencing of autopsy specimens that showed distant high-grade PanINs are phylogenetically related to the invasive cancer, harbouring as many base substitutions but fewer copy number alterations22. This is explained by PanINs colonizing the pancreas by so-called cancerization of the ducts, a phenomenon now reported using histology in the clinic18,32.

Alternative cells of origin, including pancreatic cancer stem cells and acinar cells, have been considered. A cancer cell population exhibiting CD44, CD24, epidermal surface antigen (ESA)33 and CD133 (also known as PROM1)34 expression by immunohistochemistry has been identified in mature PDAC with both stem-like properties and cellular plasticity, meaning that these cells have the ability to alter their differentiation33,34.
However, leucine-rich repeat-containing G protein-coupled receptor 5 gene (LGR5) expression, which characterizes adult stem cells in multiple organs, is not found in the adult pancreas in the absence of injury35. Acinar-to-ductal metaplasias are the earliest precursor in genetically engineered mouse models with endogenous KRAS overexpression23,36 and are recognized histologically18 in human samples, but these lack the genetic changes observed in PanINs37, so their role in human PDAC progression is unclear.

Precursor lesions have high prevalence, with low-grade PanINs found in more than 75% of the population and high-grade PanINs found in approximately 5% of patients in a recent autopsy series from Japan38. The prevalence of precursor lesions has not been shown to differ in patients with or without germline predisposition39. Given these statistics, the fact that PDAC has a relatively low lifetime incidence is a quandary that may be answered in future somatic mutational analyses.

Genetic alterations in primary tumours

Simple somatic mutations and driver genes. The International Cancer Genome Consortium (ICGC) currently stores more than 600 whole PDAC exomes and genomes, obtained mostly from surgical resection of primary tumours40. Analysis of point mutations and short insertions and deletions, so-called simple somatic mutations, reveals four commonly altered genes, namely KRAS in approximately 90% of PDACs, TP53 in 80%, CDKN2A in 60% and SMAD4 in 40%, followed by a series of genes altered significantly more often than expected by chance yet at low individual prevalences, including A+T-rich interactive domain-containing protein 1A (ARID1A), lysine-specific demethylase 6A (KDM6A), RING finger 43 (RNF43), transforming growth factor-β (TGFβ) receptor 2 (TGFBR2), GNAS (encoding Gαs), MAP3K21, BRAF, SWI/SNF-related matrix-associated actin-dependent regulator of chromatin subfamily A member 4 (SMARCA4), activin receptor type 2A (ACVR2A), ACVR1B and NRAS (ref. 24).
Thus, the somatically mutated genes most frequent in PanINs remain most frequent in PDAC. Each PDAC exome has on average approximately 40 simple somatic mutations24,41, while each genome bears approximately 6,000 (refs 5,42–44). This is relatively low for an adenocarcinoma40,45.

Alterations in the four main driver genes have the most proven impact on PDAC progression. Accumulated alterations in these four genes result in stepwise increases in cell proliferation in both human and mouse46 samples, as assayed by either immunohistochemistry of Ki-67 or RNA sequencing44,47. Clinically, their biallelic loss assayed by immunohistochemistry in large cohorts demonstrates their individual prognostic value48–50. Both the very high frequency of mutations in KRAS and early activation of KRAS imply that RAS oncogenic signalling is the principal driver of PDAC. The infrequent wild-type KRAS tumours typically bear alterations in other oncogenic drivers in the RAS pathway, including GNAS, BRAF and CTNNB1 (encoding β-catenin)13,44,51, or may alternatively have enrichment of mTOR pathway-activating mutations, especially in patients with predisposing germline mutations13. Genetically engineered mouse models of PDAC have shown that KRAS activation is sufficient to initiate pancreatic carcinogenesis36, and that concomitant mutations in CDKN2A (ref. 52), SMAD4 (refs 46,53) or TP53 (ref. 54) hasten cancer progression either locally in the primary tumour or in distant metastases.

Table 1 | Summary of high penetrance hereditary pancreatic cancer syndromes

Predisposition syndrome | Gene or genes | Relative risk of PDAC (ref. 1) | Associated tumour phenotype (refs 43,59)
Hereditary breast and ovarian cancer | BRCA1, BRCA2, PALB2, RAD51C | 2–6 | Homologous recombination deficiency; mutational signature 3
Lynch syndrome | MLH1, MSH2, MSH6, EPCAM | 8–9 | Microsatellite instability; mutational signatures 6, 15, 21, 26 and 44; high total mutational burden
Familial adenomatous polyposis | APC | 4–5 | NR
Peutz–Jeghers syndrome | STK11 | 130 | None
Familial atypical multiple mole melanoma syndrome | CDKN2A | 10–65 | None
Hereditary pancreatitis | PRSS1, SPINK1 | 50–70 | NR
Cystic fibrosis | CFTR | 1–6 | NR
Ataxia–telangiectasia | ATM | 5 | None
Familial pancreatic cancer | NR | 1–35 | NR

APC, adenomatous polyposis coli; CDKN2A, cyclin-dependent kinase inhibitor 2A; CFTR, cystic fibrosis transmembrane conductance regulator; EPCAM, epithelial cell adhesion molecule; MLH1, mutL homologue 1; MSH, mutS homologue; NR, not reported; PALB2, partner and localizer of BRCA2; PDAC, pancreatic ductal adenocarcinoma; PRSS1, encoding trypsin-1; SPINK1, serine protease inhibitor Kazal-type 1; STK11, serine/threonine kinase 11.

Acinar-to-ductal metaplasias | Metaplasias formed from a process that involves pancreatic acinar cells differentiating into duct-like cells.

The many genes with low-prevalence oncogenic point mutations can be aggregated into 13 canonical molecular pathways, each of which is altered in 50–100% of PDACs. These include RAS signalling, regulation of the G1/S cell cycle phase transition, TGFβ signalling, JUN amino-terminal kinase (JNK) signalling, integrin signalling, WNT–Notch signalling, Hedgehog signalling, apoptosis, DNA damage control, small GTPase-dependent signalling, invasion and homophilic cell adhesion24 and embryonic regulators of axon guidance41. PDAC can be subtyped by clustering of these pathways, but this approach has not demonstrated clinical relevance.

The greater frequency of simple somatic mutations per PDAC genome than per exome combined with the relatively high total number of available genomes allows the derivation of mutational signatures from base substitutions in 96 possible trinucleotide contexts55. Many of these signatures either have been associated with known exposures or have been shown experimentally to recapitulate effects of mutagenic processes on the genome56. In PDAC, three signatures predominate, namely age-related cytosine deaminations in most primary tumours (signature 1)57, homologous recombination deficiency (HRD) in 10% (signature 3)58 and mismatch repair deficiency (MMRD) (signatures 6, 15, 21, 26 and 44)59 in 1–2% (ref. 43).
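
As an illustration of where the figure of 96 comes from, the following minimal Python sketch enumerates the trinucleotide contexts and tallies substitutions into them. It assumes the common convention of reporting each substitution from the pyrimidine (C or T) strand, giving 6 substitution classes × 4 possible 5′ bases × 4 possible 3′ bases = 96 channels. The function names and the label format (for example `A[C>T]G`) are illustrative choices, not taken from this Review or from any particular signature-analysis package.

```python
from itertools import product

BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
# Assumed convention: substitutions are reported from the pyrimidine strand,
# so the reference base is always C or T (6 substitution classes in total).
PYRIMIDINE_SUBSTITUTIONS = {"C": "AGT", "T": "ACG"}


def trinucleotide_contexts():
    """Return the 96 context labels, e.g. 'A[C>T]G'."""
    return [
        f"{five}[{ref}>{alt}]{three}"
        for ref, alts in PYRIMIDINE_SUBSTITUTIONS.items()
        for alt in alts
        for five, three in product(BASES, repeat=2)
    ]


def context_label(five, ref, alt, three):
    """Label one substitution, reverse-complementing purine references."""
    if ref in "AG":
        # Read the same event from the opposite strand: the flanking bases
        # swap positions and every base is complemented.
        five, ref, alt, three = (
            COMPLEMENT[three],
            COMPLEMENT[ref],
            COMPLEMENT[alt],
            COMPLEMENT[five],
        )
    return f"{five}[{ref}>{alt}]{three}"


def count_contexts(mutations):
    """Tally (five, ref, alt, three) tuples into a 96-channel count dict."""
    counts = dict.fromkeys(trinucleotide_contexts(), 0)
    for five, ref, alt, three in mutations:
        counts[context_label(five, ref, alt, three)] += 1
    return counts
```

A per-tumour count vector of this kind is the usual input to signature extraction or fitting; the decomposition step itself is outside the scope of this sketch.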
These molecular signatures are also found at similar frequencies in PanINs22 and in metastases44, implying that the same processes acting early on the cancer genome are maintained throughout the cancer cell lifetime. Most patients with deleterious germline variants develop tumours bearing signatures associated with their predispositions, such as patients with Lynch syndrome, whose tumours have the MMRD signature, demonstrating the importance of recognizing hereditary syndromes and instituting genetic counselling and testing43. Recent professional society guidelines suggest germline testing of all incident PDAC cases15. Nearly all cases of PDAC with HRD have biallelic inactivation of BRCA1, BRCA2, partner and localizer of BRCA2 (PALB2) or RAD51C, and cases with MMRD have loss of MutL homologue 1 (MLH1), MutS homologue 2 (MSH2), MSH6 or postmeiotic segregation increased protein 2 (PMS2). In these, the first allelic inactivation is often a germline defect43. PDACs with HRD or MMRD arise through somatic mutations in genes other than the canonical PDAC drivers, namely FAM72C, SLIT–ROBO Rho GTPase-activating protein 2D (SRGAP2D)44 and ACVR2A, and Janus kinase 1 (JAK1)59, respectively, although the biological and translational significance of these mutations is not clear. These signatures are independent of histological or other molecular subtypes, are prognostic and possibly predictive (see later) and offer, in conjunction with structural variants, deeper insight into the order of molecular events in PDAC50,59.

Structural variants, copy number alterations and genomic instability. Microarrays and whole-genome sequencing allow resolution of structural variants, their resulting copy number changes and complex chromosomal rearrangements, which are both frequent and have been implicated in PDAC progression60,61.
Significantly recurrent copy number gains and losses affect known PDAC oncogenes, KRAS (12p12.1), GATA6 (18q11.2), MET (7q31.2), NOTCH1 (9q34.3), ERBB2 (17q12), AKT2 (19q13) and MYC (8q24.2), and tumour suppressor genes, CDKN2A (9p21.3), SMAD4 (18q21.2), TP53 (17p13.1), BRCA1 (17q21.31), ARID1A (1p36.11), PTEN (10q23.31), polybromo 1 (PBRM1) (3p21.1) and SMARCA4 (19p13.2)51,62. The distributions of structural variants, both intrachromosomal and interchromosomal, were used by Waddell et al.42 to classify PDAC into four genomic subtypes: namely, ‘stable’, with fewer than 50 variants; ‘scattered’, with fewer than 200 variants; ‘locally rearranged’, with involvement of only one or two chromosomes; and ‘unstable’, with more than 200 variants. This last group was associated with HRD signature 3 and somewhat predictive of response to platinum-based chemotherapy in the original study42.

Complex chromosomal changes associated with mitotic errors have also been identified in PDAC, including polyploidization and chromothripsis, in both PDAC-specific analyses5,42 and pan-cancer analyses40,63. The reported frequencies for chromothripsis range from 15% (ref. 42) to more than 50% (refs 5,63) and for polyploidization range up to 45% (ref. 5). In the context of tumour evolution, most simple somatic mutations were shown to precede and most copy number alterations were shown to follow polyploidization events, implying that some PDACs accrue point mutations during an initial diploid phase, followed by genomic instability5,63. In some tumours sequenced to sufficient resolution, chromothripsis was shown to cause gains and losses of multiple PDAC driver genes, including most frequently loss of SMAD4 and gain of GATA6 and KRAS, simultaneously rather than sequentially5,63. Thus, in some PDACs, observed genomic instability patterns are more consistent with a punctuated evolution progression model, which posits a short latency period between gain of invasive and metastatic properties, as depicted in Fig. 1b.

[Figure 2 panel labels. Epithelial stage: ductal epithelium, low-grade PanIN, high-grade PanIN, invasive adenocarcinoma. Genetic alterations: +/– KRAS; KRAS, telomere shortening; TP53, CDKN2A, SMAD4; structural variation, chromothripsis, polyploidization. ECM.]

Fig. 2 | Histological stages in pancreatic ductal adenocarcinoma progression. The progression from normal pancreatic ductal epithelium to low-grade then high-grade dysplastic pancreatic intraepithelial neoplasias (PanINs) to invasive adenocarcinoma, with accompanying genetic alterations, is shown. Histologically normal ductal epithelium may bear KRAS activating mutations. Low-grade PanINs bear the earliest somatic changes of KRAS oncogene activation and telomere shortening. High-grade PanINs accumulate inactivation of the cell cycle regulatory tumour suppressor genes TP53, cyclin-dependent kinase inhibitor 2A (CDKN2A) and/or SMAD4. Invasive adenocarcinomas have more structural and copy number variants, including complex processes such as chromothripsis and polyploidization. ECM, extracellular matrix.

Mutational signatures | Reproducible patterns of somatic changes in DNA, most commonly identified by considering the 96 possible single-nucleotide substitutions in a trinucleotide context (mutated base and bases immediately 5′ and 3′ to it), that are thought to arise from different mutational processes active during the course of cancer development, including endogenous DNA repair deficiencies and exogenous carcinogens (for example, UV radiation and smoking).

Biallelic inactivation | Two-step process by which both alleles of a gene are lost, often consisting of an inactivating mutation of one copy of the gene and a structural deletion of the other copy, resulting in loss of heterozygosity and loss of that gene’s function, thus commonly observed with tumour suppressor genes during carcinogenesis.

Structural variants | Alteration of a region of DNA, typically 1 kb or greater, by a number of mechanisms, including duplication, deletion, inversion and translocation.

Nevertheless, somatic mutations alone cannot sufficiently explain the biological and clinical differences in PDAC tumours. Many factors impacting gene expression, including epigenetic changes64 and non-coding mutations65,66, are not yet as well annotated, necessitating direct RNA quantification by microarrays or sequencing.

Transcriptional subtypes

Gene expression studies in PDAC initially focused on subtyping primary tumours obtained from surgical resection, and three separate, eponymous systems were published in rapid succession, namely the three-group classification (classical, quasi-mesenchymal or exocrine-like) of Collisson et al.25, the two-group classification (basal-like or classical) of Moffitt et al.26 and the four-group classification (squamous, immunogenic, pancreatic progenitor or aberrantly differentiated endocrine exocrine (ADEX)) of Bailey et al.62. Each of these was prognostic for survival in multivariate analyses of patients with resected PDAC.
These subtyping efforts culminated with a publication by The Cancer Genome Atlas (TCGA)51 directly comparing the three systems in a separate cohort of 150 patients with resected primary tumours. It found that high-purity tumours were readily classified into two groups: one corresponding to the quasi-mesenchymal group of Collisson et al., the squamous group of Bailey et al. and the basal-like group of Moffitt et al., and the other corresponding to the classical group of Collisson et al., the progenitor group of Bailey et al. and the classical group of Moffitt et al. The Bailey et al. immunogenic, Bailey et al. ADEX and Collisson et al. exocrine-like subtypes were found only in low-tumour-cellularity samples, and were thus thought to reflect contaminating gene expression from non-neoplastic cells. TCGA added support to its assignment of two groups with long non-coding RNA (lncRNA) sequencing, methylation analysis and proteomics, all of which clustered cases similarly to the two consensus mRNA subtypes. This was then validated in a subsequent meta-analysis67. In multivariate Cox regression analysis, which included T category (where T represents the size of the primary tumour), N category (where N represents the number of regional lymph nodes), tumour margin status, adjuvant therapy and histological grade, the two subtypes remained independent predictors of overall survival after surgery, indicating that they encompass biologically and clinically important information44,51. Interestingly, expression of hypoxic markers has been identified by RNA sequencing in approximately 50% of PDAC tumours and was significantly associated with basal-like cases, despite no overlap in the identifying gene sets44.

These transcriptional studies have informed PDAC growth patterns as well. Morphologically, PDAC can be classified as having more or less than 40% gland formation with high interobserver concordance and strong association with classical or basal-like gene expression, respectively68.
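
The correspondence between the three original systems and the two consensus groups reported by TCGA can be written down as a small lookup table. The Python sketch below is only a restatement of the mapping described in the text; the dictionary layout, the `to_consensus` helper and the `"low-cellularity"` return value are hypothetical names chosen for illustration and do not come from any published classifier.

```python
# Keys are (classification system, subtype label) pairs named in the text.
CONSENSUS_SUBTYPE = {
    ("Collisson", "quasi-mesenchymal"): "basal-like",
    ("Bailey", "squamous"): "basal-like",
    ("Moffitt", "basal-like"): "basal-like",
    ("Collisson", "classical"): "classical",
    ("Bailey", "pancreatic progenitor"): "classical",
    ("Moffitt", "classical"): "classical",
}

# Subtypes found only in low-tumour-cellularity samples, thought to reflect
# expression from non-neoplastic cells rather than a tumour cell phenotype.
LOW_CELLULARITY = {
    ("Bailey", "immunogenic"),
    ("Bailey", "ADEX"),
    ("Collisson", "exocrine-like"),
}


def to_consensus(system, subtype):
    """Map an original subtype call to a consensus group.

    Returns 'basal-like', 'classical' or 'low-cellularity'; raises KeyError
    for a (system, subtype) pair that is not described in the text.
    """
    key = (system, subtype)
    if key in LOW_CELLULARITY:
        return "low-cellularity"
    return CONSENSUS_SUBTYPE[key]
```

For example, `to_consensus("Bailey", "squamous")` and `to_consensus("Collisson", "quasi-mesenchymal")` both give `"basal-like"`, reflecting that the systems largely relabel the same two tumour cell programmes.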
Squamous morphology seen in more than 30% of invasive tumours was also associated with basal-like tumours by several groups62,68,69. The classical subtype has greater expression of KRAS and GATA6 (refs 25,26,70), and corresponding cell lines showed decreased colony formation following RNAi-mediated knockdown of GATA6 and KRAS, whereas basal-like cell lines did not25. Putative mechanisms accounting for GATA6 overexpression in the classical subtype include its copy number gain and high transcription both as mRNA and as a long non-coding RNA transcript, along with AS1, which may become transcriptionally active51. These observations have facilitated clinical translation, including the development of a simplified 16-gene classifier67 and a GATA6 in situ hybridization assay71 to more readily classify PDAC, even from endoscopic biopsy samples.

However, when during tumour evolution PDAC diverges into either major subtype, and by what mechanism, remain unclear. Moreover, these subtypes are prognostic only in resectable tumours, not in patients with metastatic disease. To address this, Chan-Seng-Yue et al.72 performed de novo reclassification of subtypes using a combined cohort of patients with primary and metastatic tumours that had undergone cellular enrichment by laser capture microdissection, expanding the two groups into five, namely ‘basal-like A’, ‘basal-like B’, ‘hybrid’, ‘classical A’ and ‘classical B’, as depicted in Fig. 3.
In this dataset, basal-like A and basal-like B approximately distinguished metastatic disease from localized disease, respectively, and chemotherapy resistance was observed in only the former. By single-cell sequencing of primary tumours and metastases, classical and basal-like expression signatures segregated to distinct cell populations within each individual bulk tumour72. This has been validated by multiregion sequencing of primary tumours and metastases69 and by single-cell sequencing of primary tumours73. Thus, PDAC evolves as a mixture of both expression phenotypes, with its behaviour determined by the dominant phenotype and observed plasticity72 between subtypes.

The genomic drivers of classical and basal-like phenotypes were shown to be biallelic loss of SMAD4 with GATA6 amplification and biallelic loss of TP53 and/or CDKN2A with mutant KRAS allele amplification, respectively, although no feature was completely exclusive68,72. The allelic imbalance of KRAS leading to more severe disease is supported by organoid72 and mouse74 models. Thus, while RNA sequencing of precursor lesions to determine whether these expression phenotypes are established in PanINs is lacking, we can infer that the early acquisition of asymmetric driver gene mutations sets the stage for subsequent PDAC behaviour, which is itself dynamic and also probably susceptible to the effects of chemotherapy.

While the five-subtype stratification system still requires validation, it does demonstrate the importance of including metastases in PDAC studies, as pancreatic cancer cell evolution often progresses with dissemination to new environments, such as the liver, lung and peritoneum5,44,72. It also implies that histological and molecular heterogeneity is due to both pancreatic cancer cell clonality and pancreatic cancer cell plasticity.

[Figure 3 panel labels. Rows: Collisson et al.25, Moffitt et al.26, Bailey et al.62, Chan-Seng-Yue et al.72, Kalimuthu et al.68. Subtype labels: classical (GATA6), exocrine-like, quasi-mesenchymal (S100A2); classical, basal-like; progenitor (FOXA2, FOXA3, PDX1, HNF1 and HNF4), squamous (TP63 and MYC), ADEX (MIST1, NR5A2 and RBPJL), immunogenic (CD4, CD8, CTLA4 and PD1); classical A, classical B, hybrid, basal-like A, basal-like B; <40% non-gland-forming component, >40% non-gland-forming component.]

Fig. 3 | Expression-based subtypes of pancreatic ductal adenocarcinoma. The various gene expression-based subtypes described for the epithelial component of pancreatic ductal adenocarcinoma are shown. Each row represents a subtyping schema. The overlaps approximate the relationships between these subtypes, although there is some variability. The sizes of the bars are not proportional to the frequency of each subtype. Some of the genes whose expressions are associated with each subtype are shown. ADEX, aberrantly differentiated endocrine exocrine; CTLA4, cytotoxic T lymphocyte-associated antigen 4; FOXA, forkhead box A; HNF, hepatocyte nuclear factor; MIST1, muscle, intestine and stomach expression 1 (also known as BHLHA15); NR5A2, nuclear receptor subfamily 5 group A member 2; PD1, programmed cell death 1; PDX1, pancreas/duodenum homeobox 1; RBPJL, recombining binding protein suppressor of hairless-like.

Polyploidization | The acquisition of one or more additional sets of chromosomes by a normally haploid or diploid cell, often consisting of whole-genome duplication in carcinogenesis.

Chromothripsis | A mutational process characterized by up to thousands of structural variations occurring as a single event in localized regions of one chromosome or a few chromosomes.

Metastases and tumour heterogeneity

At the time of diagnosis only 20% of PDACs are localized, with more than 30% having spread to regional lymph nodes, and more than 50% having metastasized to other solid organs, principally the liver and lungs.
Five-year survival is strongly inversely associated with stage: approximately 40% for localized disease, 10% for regional spread and 2% for distant disease. However, most research has focused on precursors and primary tumours, given the availability of these tissues. Nevertheless, patient enrolment in both rapid autopsy programmes69,75–79 and prospective ‘genomic-based’ clinical trials of stage 4 disease involving percutaneous biopsy of metastases5,43,44,72,80 has allowed the acquisition of both paired and unpaired samples of primary tumours and metastases. Understanding progression from primary tumours to metastasis requires recognition of both intratumour heterogeneity and intertumour heterogeneity, which are closely associated in PDAC76. Many gene alterations in the primary tumour are subclonal, occurring at cancer cell fractions below 1, meaning they are not present in every cancer cell, resulting in intratumour heterogeneity81. Gross sectioning of primary PDAC into numerous 3D pieces for targeted sequencing demonstrates a mosaicism within the bulk tumour of spatially distributed subclonal cell populations, each with predominant mutations of similar cancer cell fractions76. Anatomically distinct metastases are similarly characterized by mutations corresponding to these specific subclonal cell populations in the primary tumour. Parental clones in the primary tumour harbour most of the observed genetic alterations and chromosomal instability seen in metastases, upon which further mutations are superimposed, leading to subclonal evolution76. This observation has been validated by whole-genome sequencing of tumour cell-enriched bulk paired samples of primary tumours and metastases5,44. Thus, subclonal heterogeneity within the primary tumour leads to the heterogeneity between metastases, which are populated by distinct subclones.
This high mutational conservation between paired PDAC primary tumours and metastases has been delineated further with insights for understanding tumour progression44,76–78. Simple somatic variants are closely concordant in untreated primary tumours and metastases, and metastasis-specific variation is found in genes of ambiguous importance to carcinogenesis78. Driver gene inactivations and mutational signatures are maintained in untreated primary tumours and metastases, implying that each mutational process contributes in equal proportion to simple genomic variation over time. The degree of heterogeneity can be quantified for different mutational features using mathematical approaches. These have focused on simple and structural variants using the Jaccard similarity index. Among simple somatic mutations in untreated primary tumours and metastases, truncations (stop gain, splice site and frameshift) are most conserved, followed by non-truncating coding (missense and non-frameshift), non-coding (promoters and enhancers) and then silent mutations, following the expected selective pressure on each mutation class44. Among structural variants, inversions and translocations are more conserved than deletions and duplications, a validated observation with the implication that breakage–fusion–bridge cycles resulting from telomere loss are early events in PDAC oncogenesis44,77. Moreover, in pan-cancer analyses, it was shown that abnormal telomere maintenance is a common early event in tumours arising from tissues with low replicative activity, such as the pancreas40. Metastases do harbour greater genomic instability, demonstrated by increased numbers of structural variants, a greater proportion of copy number variants deviating from ploidy and greater overall polyploidy44,72. 
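As a rough illustration of how such conservation might be quantified, the sketch below computes the Jaccard similarity index (the intersection divided by the union of two sets) between the variant calls of a paired primary tumour and metastasis, both overall and per mutation class. The variant identifiers, class labels and function names are hypothetical placeholders chosen for illustration; they are not data or methods from the cited studies, which may have applied the index differently.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple


def jaccard_index(a: Iterable[Hashable], b: Iterable[Hashable]) -> float:
    """Return |A ∩ B| / |A ∪ B| for two collections of hashable items."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Convention assumed here: two empty sets are treated as identical.
        return 1.0
    return len(set_a & set_b) / len(union)


# A variant call is represented as (variant identifier, mutation class).
Variant = Tuple[str, str]


def jaccard_by_class(primary: Set[Variant], metastasis: Set[Variant]) -> Dict[str, float]:
    """Compute the Jaccard index separately for each mutation class."""
    classes = {cls for _, cls in primary | metastasis}
    return {
        cls: jaccard_index(
            {vid for vid, c in primary if c == cls},
            {vid for vid, c in metastasis if c == cls},
        )
        for cls in classes
    }


# Hypothetical variant calls for one paired primary tumour and metastasis.
primary = {
    ("var_T1", "truncating"),
    ("var_M1", "missense"),
    ("var_M2", "missense"),
    ("var_S1", "silent"),
}
metastasis = {
    ("var_T1", "truncating"),
    ("var_M1", "missense"),
    ("var_S2", "silent"),
}

overall = jaccard_index(primary, metastasis)
per_class = jaccard_by_class(primary, metastasis)
```

In this toy example the truncating class is fully shared between the two samples and the silent class is not shared at all, mirroring the direction of the ordering described above; real analyses would work from curated variant calls and would need to account for tumour purity and sequencing depth.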
In patients with three or more sequenced metastatic sites, the shared patterns of simple somatic mutations and structural variants are compatible with the Halstedian concept of sequential progression from primary tumour to lymph node to distant organs44. Transcriptional conservation between untreated paired PDAC primary tumours and metastases has also been demonstrated, although this phenomenon is less well studied. It has been recurrently observed that classical and basal-like expression subtypes are highly concordant between paired primary tumours and metastases26,44; however, the five-subtype classification of Chan-Seng-Yue et al. can be fluid72. KRAS dosage72,74,77,82 is higher in metastases than in primary tumours, especially in basal-like tumours, as shown by both copy number alterations and RNA expression, providing evidence that KRAS allelic imbalance is important for both tumour initiation and metastatic spread. However, the mechanism of KRAS allelic imbalance during tumour initiation is an activating point mutation, whereas in metastases, it is cumulatively amplifying structural rearrangements44,72. Hypoxia gene expression is also highly concordant between paired primary tumours and metastases, despite histologically different organ microenvironments, implying that regulation of hypoxia is an inherent and heritable feature of PDAC cells44. Cases of synchronous PDAC and metachronous PDAC were recently described44, perhaps arising from cancerization of the pancreatic ducts22. The recognition of these ‘new’ phenomena is likely attributable to improved imaging and histology and the longer survival of patients with resected primaries. However, most ‘second’ lesions when sequenced have been shown to be intraparenchymal metastases44, with only a minority being distinct primary tumours79. Although not yet recognized by consensus PDAC cancer staging, patients with these ‘second’ intraparenchymal tumours display the same clinical and molecular features as other patients with advanced PDAC, and thus should probably be managed as patients with metastatic disease.

Allelic imbalance
The unequal expression of two alleles of a gene, which can be caused by several mechanisms, including copy number changes or epigenetic inactivation.

Jaccard similarity index
A statistic used to measure the similarity (or inversely diversity) of two or more sample sets, calculated as the intersection divided by the union of the members of those sets.

Breakage–fusion–bridge cycles
This structural variation arises due to telomere loss at one end of a chromosome, leading to fusion at that identical site on the sister chromatid in prophase, then breakage of the chromosomes at a random site during chromatid separation in anaphase, causing uneven distribution of genetic material to daughter nuclei, and repetition of the cycle with subsequent cell divisions.

Timing of cancer progression
Remarkable insight into PDAC evolution from normal ductal epithelium to precursor lesion through to malignancy and metastasis has been achieved, and from this, a comprehensive model, such as the one depicted in Fig. 4, can be proposed. There has also been significant interest in estimating the time required to progress from PDAC precursor to invasion to metastasis, as this has obvious implications for genetically high-risk or population-based screening strategies. Early reports suggested that patients undergoing resection for chronic pancreatitis bearing multifocal PanINs then developed PDAC within 3–10 years3,83.
Models were later developed using variables including the number of neutral mutations in paired primary tumours and metastases, the pancreatic cell division rate and the mutation rate per base pair, to estimate the time from the first somatic mutation to the origin of the parental invasive cell as 10–12 years, the time from then to the origin of the index metastatic cell as 6–7 years and the time from dissemination to patient death as 2–3 years76,84. These relatively large intervals imply that there are substantial opportunities for intervention, which have not yet been realized in imaging-based screening trials15. These models of cancer progression make assumptions of uniformity across PDAC evolution and subtypes. However, our improved understanding of PDAC evolution could inform updated models. First, studies quantifying mutational conservation inform how to include simple somatic and structural variants in such models44,65. Second, PDAC cell cycle proliferation rate has been shown to be associated with the numbers of driver gene inactivations, and to increase in metastases, in contrast to the fixed cellular replication rate previously assumed across a tumour lifespan and across all tumour types44,47. Third, the mutational signature of cytosine deaminations (‘signature 1’) is associated with patient age, and hence the mutation rate per base pair likely varies with mutational processes acting on the cancer cell55. Furthermore, with the vast genotypic and phenotypic heterogeneity that has been revealed across PDACs, it seems unlikely that all PDACs progress at the same rate, and separate models may be needed to encompass the full spectrum of PDAC biology, and then to develop more rational screening strategies. A recent pan-cancer timing analysis demonstrated consistencies in carcinogenesis across tumour types, including PDAC85.
Early changes included mutations in a constrained set of driver genes, especially TP53 and KRAS, and early structural variation, especially deletions of chromosome 17. Whole-genome duplication was a commonly observed intermediate change, seen in 40% of PDACs. Subsequent late changes in mutations and mutational signatures varied over time85. Focusing on spontaneous cytosine deaminations, this analysis estimated the time to diagnosis from whole-genome doubling in a precursor lesion to be approximately 4 years for PDAC, with a range of 2.3–11 years85.

Fig. 4 | Evolutionary model of pancreatic ductal adenocarcinoma. The neoplastic progression from pancreatic ductal epithelium to distant metastasis based on integration of findings from several studies is shown. The increase in hue of the cells corresponds to a greater total mutational burden and subclonal tumour heterogeneity. Initially, a pancreatic ductal epithelial cell acquires simple somatic mutations, including RAS activation, followed by loss of cell cycle control mediated by loss of cyclin-dependent kinase inhibitor 2A (CDKN2A), SMAD4 and/or TP53, facilitating cell growth. An unknown trigger, perhaps telomere loss and breakage–fusion–bridge cycles, leads to complex mitotic errors, polyploidy in most cases and chromothripsis in some, followed by more rapid acquisition of structural and copy number variation. More rapid neoplastic cell proliferation with heterogeneous driver and pathway alterations leads to a mixture of transcriptional subtypes, and hypoxic phenotypes, KRAS allelic imbalance, invasion and dissemination. Further spread exclusively within the pancreas is possible, perhaps by cancerization of the ducts, and spread beyond the pancreas appears to follow predominantly lymphatic routes in a Halstedian, sequential fashion. Stages shown: ductal epithelium; low-grade PanIN; high-grade PanIN; index invasive cell; index metastatic cell; lymph node metastasis; distant organ metastasis. Annotations shown: RAS activation by simple mutation; telomere shortening; subclonal heterogeneity; transcriptional subtypes; TP53, CDKN2A and SMAD4 inactivations; mitotic errors, polyploidization, chromothripsis; KRAS allelic imbalance, copy number gain; Halstedian sequential progression; cancerization of ducts; intraparenchymal spread; classical, basal-like and plasticity; simple mutations predominate, higher number of founder mutations, higher functional significance; copy number variants predominate, lower number of progressor mutations, lower functional significance. PanIN, pancreatic intraepithelial neoplasia.

Synchronous PDAC
Defined as two or more neoplasms identified simultaneously or up to 6 months apart in the same patient.

Metachronous PDAC
Defined as two or more neoplasms identified more than 6 months apart in the same patient.

Neutral mutations
A DNA mutation that is independent of natural selection, such as a synonymous base substitution.

Tumour microenvironment
Our knowledge of the extrinsic microenvironment in PDAC, including the role of the dense tumour-associated stroma characteristic of PDAC, the pancreatic microbiome86–90 and the blunted antitumour immune response43,89–93, has also advanced owing to the implementation of new techniques, summarized in Box 1. The dense stroma is composed predominantly of cancer-associated fibroblasts (CAFs) and immune and endothelial cells89. The long-held assumption that this stroma enhances malignancy was challenged by the observation that two stroma-depleted mouse models, paradoxically, displayed increased PDAC aggressiveness94,95 and the fact that clinical trials of stroma-targeted agents failed96,97. Through bulk RNA sequencing, Moffitt et al.26 characterized PDAC stroma into two subtypes, ‘normal’ and ‘activated’, which were prognostic independently of the tumour subtypes and could be paired with either subtype (classical or basal-like). Further delineation of the effect the tumour microenvironment has on PDAC progression was recently demonstrated in both organoid82 and cell line98 studies. Injection of patient-derived organoids into either mouse pancreatic duct or interstitium resulted in differences in expression-based subtypes, proliferation rates, histology, and the extent of stromal reaction, demonstrating that the tumour microenvironment influences characteristics of the cancer cells through a mechanism that was shown to involve TGFβ signalling82. In a separate study, co-culture of PDAC and CAF cell lines in various ratios followed by single-cell sequencing demonstrated differences in gene expression in both cell types depending on the degree of mixture, implying crosstalk between the two, and again this was shown to be mediated by TGFβ signalling98.
While there is no evidence of bacterial or viral integration into the PDAC genome43,99, recent evidence86 has demonstrated that there is an increased quantity and diversity of bacteria in pancreatic tumours compared with normal human and mouse pancreatic tissue. Bacteria found in the PDAC microenvironment likely contribute to PDAC pathogenesis in mouse models, as PDAC growth was shown to be inhibited upon bacterial ablation with an oral antibiotic regimen and could resume upon faecal transfer from PDAC-bearing mice but not control mice86. The mechanism involves, in part, bacterial translocation from the bowel, modulation of antitumour immunity by Toll-like receptor (TLR) signalling in macrophages and alterations in the levels of T cell populations86,87. Furthermore, the composition of the PDAC microbiome has been associated with postoperative survival87 and with chemotherapy resistance owing to metabolism of gemcitabine by bacterial cytosine deaminase88. The PDAC immune response43,89–93, which results from the balance between intrinsic cancer cell features and the protumour and antitumour immune cell populations in the microenvironment, is too complex to be comprehensively reviewed here. Briefly, of the three possible drivers of cancer cell recognition by the immune system, namely expression of neoantigens, viral antigens or fetal antigens, PDAC appears to bear only the first of these100. Most studies have found that neoantigen expression is directly related to simple somatic mutational burden43,91, which itself is related to mutational signatures acting on the cancer genome43. Patients whose tumours show neoantigen molecular mimicry of microbial epitopes may have longer survival93, although it has also been shown that low neoantigen expression may also confer longer survival89. 
The number of neoantigens in primary PDAC tumours is associated with stromal expression of innate and adaptive immunity genes, including granzyme A (GZMA), perforin 1 (PRF1), cytotoxic T lymphocyte-associated antigen 4 (CTLA4), programmed cell death 1 ligand 1 (PDL1; also known as CD274) and indoleamine 2,3-dioxygenase 1 (IDO1)43. In PDAC metastases, this association may be less pronounced92. Also, some primary and metastatic PDAC tumours express a chemokine signature indicative of T cell activation92. It remains unclear exactly how antitumour immunity impacts PDAC progression in humans, although contributions from tumour-associated macrophages101 to primary PDAC growth and neutrophil-derived chemokines102 to the growth of metastases have been shown in mouse models. Overall, while our understanding of the factors extrinsic to the cancer cell is deepening, a comprehensive description and validated role of stroma in PDAC evolution remain elusive.

Translation of PDAC evolution
Insights from PDAC sequencing may provide valuable information, which could lead to the development of methods for earlier detection as well as improved therapies. The use of circulating biomarkers, including circulating tumour cells (CTCs) and circulating tumour DNA (ctDNA), to identify somatic mutations103 and methylation changes104 for the detection, classification and monitoring of cancer is an exciting area of research that remains in the investigational stage and is thus not covered here.

Box 1 | Strategies for studying pancreatic ductal adenocarcinoma stroma
• Primary and metastatic pancreatic ductal adenocarcinomas (PDACs) exhibit abundant stroma, which includes extracellular matrix, fibroblasts, endothelial cells and immune cells. Investigating the protumour and antitumour functions of this stroma may inform patient care.
• Histology of haematoxylin and eosin (H&E)-stained sections and immunohistochemistry of primary PDAC resections have demonstrated three stromal subtypes, which can be used to define prognosis: ‘mature’, with dense acellular collagen; ‘immature’, with scant collagen and high cellularity; and ‘intermediate’89.
• RNA sequencing of bulk stroma has revealed two subtypes: ‘normal’, with high expression of pancreatic stellate cell markers; and ‘activated’, with immune-promoting and tumour-promoting marker expression26.
• Single-cell sequencing of human PDAC or mouse and/or cell line models has demonstrated cancer-associated fibroblast (CAF) heterogeneity and mechanisms of adenocarcinoma stimulation involving transforming growth factor-β (TGFβ) signalling98.
• Gene-knockout mouse models of PDAC have shown that reduced stromal content is associated with more aggressive tumour growth, undifferentiated histology, and increased vascularity94,95.
• Thus, PDAC-associated stroma is heterogeneous, with various assays demonstrating both tumour-promoting and tumour-restricting functions, which have not yet been translated into clinical practice, and the nuances of which remain to be elaborated further.
