A 5-lncRNA signature predicts clinical prognosis and demonstrates a different mRNA expression in adult soft tissue sarcoma
Highlight box
Key findings
• The study found that this 5-lncRNA signature could predict prognosis well, especially in leiomyosarcoma (LMS), a subtype of adult soft tissue sarcoma (SARC).
• The protein-protein interaction (PPI) network analysis revealed that MYH11, MYLK, and CNN1 were the most important hub genes among the 18 differentially expressed messenger RNAs (DEmRNAs), all of which are essential for muscle function.
What is known and what is new?
• The molecular markers that can diagnose SARC and its subtypes are lacking. Long non-coding RNAs (lncRNAs) have been considered potential biomarkers for diagnosis and prognosis in many types of cancers. However, the relationship between lncRNA and SARC is still vague, and the lncRNA signature that can predict SARC and its effects on gene expression is also rare.
• In this study, a predictive clinical model for SARC was successfully established, showing better prediction accuracy in patients with LMS. Importantly, we identified MYH11, MYLK, and CNN1 as potential therapeutic targets for SARC.
What is the implication, and what should change now?
• We must realize the limitations of this study that warrant consideration, including the lack of additional SARC samples collected to validate the 5-lncRNA Cox regression model, and the unclear mechanisms of how the 5 lncRNAs regulate the hub genes in SARC. We are planning to explore these issues in our future studies and believe that meaningful results will be observed.
Introduction
Adult soft tissue sarcoma (SARC) is a diverse mesenchymal malignancy that originates in muscle, tendons, fat, lymph and blood vessels, and nerves. It accounts for less than 1% of all new cancers but is highly aggressive, leading to a disproportionate share of cancer-related deaths among young adults aged 20–39 years (seer.cancer.gov). There are over 100 histologic subtypes, with leiomyosarcoma (LMS), dedifferentiated liposarcoma (DDLPS), undifferentiated sarcoma (UDS), and fibromyxosarcoma (FBS) being the most common subtypes (1).
Interventional radiology plays a crucial role in diagnosing SARC, with image-guided percutaneous core needle biopsy being the most commonly used technique for this purpose (2). However, the diagnosis of SARC with radiology is nonspecific and insensitive (2). Biopsy is the gold standard for diagnosing SARC, but this method is invasive and may be harmful to the body. There is a lack of molecular markers that can diagnose SARC and its subtypes.
Recently, evidence has been discovered that the majority of human genome transcripts are non-coding RNAs, rather than protein-coding RNAs (3,4). Non-coding RNAs with more than 200 nucleotides are called long non-coding RNAs (lncRNAs), and tens of thousands of lncRNAs have been identified through modern biotechnology (5). Unlike the old notion that non-protein-coding transcripts are non-functional RNAs, lncRNA plays a significant role in various biological activities, including DNA modification, RNA transcription, pre-messenger RNA (mRNA) splicing, mRNA degradation, mRNA translation, and protein regulation (6-8). LncRNAs play various biological functions that are associated with many types of cancers, including those originating in the lung, colon, breast, stomach, and liver, among others (9-12). LncRNAs have been considered as potential biomarkers for diagnosis and prognosis.
Some studies have reported that lncRNAs are associated with SARC (13-15). Shao et al.’s research identified that the lncRNA PILRLS was significantly overexpressed and closely associated with the occurrence of retroperitoneal liposarcoma (15); O’Brien et al. showed that the lncRNA MYCNOS-01 regulates the growth of rhabdomyosarcoma (13); a study found that lncRNA MATN1-AS1 has the ability to predict the prognosis in osteosarcoma patients (16). Large numbers of lncRNAs, such as lnc-SERTAD2-3, lnc-PSMA8-1, lnc-DUXAP8, and others, are closely associated with the pathogenic process of SARC (17-19). However, such studies are rare. The relationship between lncRNA and SARC currently remains vague, and there is a need for an lncRNA signature that can predict SARC and its effects on gene expression.
Here, we used The Cancer Genome Atlas (TCGA) data, utilized the Cox proportional hazards regression model (abbreviated as Cox regression model), and identified a 5-lncRNAs signature. Specifically, by considering the risk score based on the Cox regression model, we observed distinct mRNA characteristics between the high- and low-risk groups. Additionally, we discovered that LMS samples exhibit a unique differentially expressed mRNA (DEmRNA) profile. The significant linear correlation between lncRNA and DEmRNA was also detailed in our study. We present this article in accordance with the TRIPOD reporting checklist (available at https://tcr.amegroups.com/article/view/10.21037/tcr-24-203/rc).
Methods
TCGA data retrieval and preprocessing
This study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). An SARC RNA sequencing (RNA-seq) dataset and corresponding clinical follow-up information were downloaded from the TCGA repository (https://cancergenome.nih.gov/; using R GDCRNATools package). Totally, data from 259 patients, who have both an RNA-seq dataset and clinical follow-up information, were extracted. The patients were divided into a training set (n=130, used to construct an lncRNA-related Cox regression model) and a testing set (n=129, used to validate the model). The clinical information is detailed in Table 1.
Table 1
Characteristics | Train (n=130) | Test (n=129) |
---|---|---|
Age (years), median [SD] | 61 [15] | 62 [14] |
Gender, n (%) | ||
Male | 60 (46.2) | 58 (45.0) |
Female | 70 (53.8) | 71 (55.0) |
Histological type, n (%) | ||
Leiomyosarcoma | 51 (39.2) | 53 (41.1) |
Undifferentiated sarcoma | 16 (12.3) | 18 (14.0) |
Dedifferentiated liposarcoma | 31 (23.8) | 27 (20.9) |
Fibromyxosarcoma | 14 (10.8) | 13 (10.1) |
Synovial sarcoma | 6 (4.6) | 4 (3.1) |
Malignant fibrous histiocytoma | 4 (3.1) | 8 (6.2) |
Malignant peripheral nerve sheath tumor | 6 (4.6) | 3 (2.3) |
Giant cell sarcoma | 1 (0.8) | 2 (1.6) |
Pleomorphic liposarcoma | 1 (0.8) | 1 (0.8) |
Survival status, n (%) | ||
Alive | 80 (61.5) | 80 (62.0) |
Dead | 50 (38.5) | 49 (38.0) |
Median follow-up (days) | 1,196 | 1,187 |
SARC, sarcoma; SD, standard deviation.
The lncRNA and mRNA were identified using the Genecode database (https://www.gencodegenes.org/human/release_22.html). In total, 60,483 RNAs, 14,852 lncRNAs, and 20,271 mRNAs were acquired. The selected lncRNAs and mRNAs’ raw counts were normalized and log2-transformed; the normalized lncRNA and mRNA with expression levels >0 and appearing in at least 50% of the samples were chosen as the final data.
Constructing and validating the lncRNA-related Cox regression model
All the selected lncRNAs in the training were subjected to the univariable Cox regression analysis using the R survival package. Only those with a P value <0.05 were chosen as a target lncRNA.
A robust likelihood-based survival modeling approach was utilized by the robust package in R to identify the optimal lncRNA signature (20,21). A total of 259 samples were randomly divided into a training set [comprising N*(1−t) samples] and a testing set (comprising N*t samples, where t=1/3). All lncRNAs identified through univariable Cox regression analysis in the training set were included in the subsequent analysis to obtain parameter estimates. Each lncRNA was then assessed for log likelihood using the parameter estimate in the validation set. This process was iterated 1,000 times, resulting in 1,000 log likelihood values for each lncRNA. The top-performing lncRNA was chosen based on the highest mean log likelihood. The next best lncRNA was determined through evaluation of 2-lncRNA models, selecting the one with the highest mean log likelihood. Multiple models were developed using this approach. Akaike information criteria (AICs) were employed to assess these models, with the model having the lowest AIC being chosen for the final multivariable Cox regression analysis of lncRNAs. Following this robust analysis, the potential for enhanced feasibility and reliability of clinical prognosis based on lncRNAs was significantly improved.
The lncRNAs selected from a robust analysis were applied to multivariable Cox regression analysis in the training set. The Wald test was used for the regression coefficient test. The final target lncRNA was selected based on P<0.05. Subsequently, a Cox regression model was constructed for further clinical prognosis prediction.
The risk score of each training sample was calculated. According to the median risk score, the patients were classified into high- and low-risk groups. Similarly, the risk score was calculated in the testing set and complete set, and the high- and low-risk group were identified by the median risk score from the training set. As SARC has many different pathological types, the risk score was also calculated in the LMS set (n=104), DDLPS set (n=58), and other pathological type set (n=97) [including undifferentiated sarcoma (UDS), fibromyxosarcoma (FBS), synovial sarcoma (SS), malignant fibrous histiocytoma (MFH), malignant peripheral nerve sheath tumor (MPNST), giant cell sarcoma (GCS), and pleomorphic liposarcoma (PS)], identifying the high- and low-risk groups using the same method.
In order to validate the Cox regression model’s ability to accurately predict the prognosis, the Kaplan-Meier method was applied to survival analysis in the training set, testing set, complete set, and different pathological type set using GraphPad Prism 9 (GraphPad Software, San Diego, CA, USA). The log-rank test was used for comparison (Appendix 1).
The association between the final selected lncRNAs and mRNAs
Differentially expressed messenger RNAs (DEmRNAs) in the complete set between the high- and low-risk group were identified using the R limma package with a threshold of |log2 fold change (log2FC)| >2 and adj. P<0.05. A heatmap was generated using the R pheatmap package, clustered by different risk scores and pathological types.
To investigate the potential functional characteristics of the DEmRNA, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were conducted using the Database for Annotation, Visualization, and Integrated Discovery (DAVID; https://davidbioinformatics.nih.gov/) online tool. A statistical significance threshold criterion of P<0.05 was applied.
To observe the association between the final selected lncRNAs, a Cox regression model was constructed, and for the DEmRNAs, Spearman correlation coefficient analyzed using the software SPSS 23.0 (IBM Corp., Armonk, NY, USA). From the heatmap of DEmRNAs in the complete set, we found that LMS samples were markedly different from other pathological types. Therefore, we selected LMS samples and established Spearman correlation coefficients between the finally selected lncRNAs and DEmRNAs to identify whether the Cox regression model could have a better correlation in LMS. The KEGG pathways of hub genes were calculated by Metascape (https://metascape.org/), and a protein-protein interaction (PPI) network was drawn using the Search Tool for the Retrieval of Interacting Genes/Proteins database (STRING; https://cn.string-db.org/).
Statistical analysis
SPSS 23.0 was used for statistical analysis. GraphPad Prism 9.0 software was utilized for graphical presentation. The heatmaps were created using R software (pheatmap package). One-way analysis of variance (ANOVA) was conducted to examine differences among 3 or more groups, whereas Student’s t-tests were used to compare data between 2 groups. The heatmap data was normalized using the z-score method [z=(x−µ)/σ]. The flowchart of this study is shown in Figure 1.
![Click on image to zoom](http://cdn.amegroups.cn/journals/pbpc/files/journals/3/articles/95680/public/95680-PB6-5134-R1.jpg/w300)
Results
Based on our selection criteria, 6,895 lncRNAs and 12,333 mRNAs were screened from a pool of 14,852 lncRNAs and 20,271 mRNAs. A total of 259 SARC samples with clinical data and 6,895 lncRNA expression profiles were divided into a training set (n=130) and a testing set (n=129).
Constructing the Cox regression model in the training set
With a screened threshold of P<0.05, univariable Cox regression analysis filtered 412 lncRNAs (Table S1) from a total of 6,895 lncRNAs. We listed the top 20 lncRNAs with the lowest P values in Table 2. A robust likelihood-based analysis of the 412 filtered lncRNAs identified 20 lncRNAs with the lowest AIC values, which are presented in Table 3. Additionally, 18 lncRNAs with P<0.05 were selected for multivariable Cox regression analysis. Ultimately, five lncRNAs were identified (with P<0.05, Table 4), and a Cox regression model was developed using them. The Cox regression model is as follows: risk score = (−0.5698*AC018645.2) + 0.1732*LINC02454 + 0.387*ERICD + 0.6262*DSCR9 + 0.9781*AL031770.1. The proportional hazards Assumption (PH) confirmed that the hazard ratio (HR) did not change over time (Table S2).
Table 2
LncRNA | Ensembl ID | Hazard ratio | 95% CI | P value |
---|---|---|---|---|
GCC2-AS1 | ENSG00000214184 | 1.74 | 1.24–2.45 | 0.001 |
AC112721.2 | ENSG00000222032 | 1.26 | 1.10–1.45 | 0.001 |
AL592166.1 | ENSG00000225721 | 2.06 | 1.36–3.13 | <0.001 |
DSCR9 | ENSG00000230366 | 2.20 | 1.41–3.42 | <0.001 |
AL109615.2 | ENSG00000231881 | 2.19 | 1.59–3.00 | <0.001 |
AP001626.1 | ENSG00000235023 | 1.53 | 1.22–1.93 | <0.001 |
JARID2-AS1 | ENSG00000235488 | 2.11 | 1.41–3.15 | <0.001 |
BRWD1-AS1 | ENSG00000238141 | 2.35 | 1.41–3.93 | 0.001 |
PARAL1 | ENSG00000243961 | 1.44 | 1.15–1.81 | 0.001 |
AC021127.1 | ENSG00000248719 | 2.01 | 1.33–3.05 | <0.001 |
LINC00942 | ENSG00000249628 | 1.36 | 1.16–1.59 | <0.001 |
SRD5A3-AS1 | ENSG00000249700 | 0.49 | 0.33–0.72 | <0.001 |
AC025419.1 | ENSG00000250748 | 1.26 | 1.09–1.44 | 0.001 |
AC097478.1 | ENSG00000251095 | 1.67 | 1.27–2.21 | <0.001 |
LINC01290 | ENSG00000260468 | 0.46 | 0.29–0.73 | 0.001 |
RP11-680G10.1 | ENSG00000261567 | 1.80 | 1.27–2.54 | <0.001 |
AP003071.4 | ENSG00000261625 | 0.76 | 0.65–0.90 | 0.001 |
AF001548.2 | ENSG00000263335 | 0.57 | 0.41–0.78 | <0.001 |
AC068025.1 | ENSG00000264808 | 1.94 | 1.38–2.74 | <0.001 |
LINC01163 | ENSG00000280953 | 1.38 | 1.14–1.67 | 0.001 |
lncRNA, long non-coding RNA; CI, confidence interval.
Table 3
LncRNA | Gene | nloglik | AIC |
---|---|---|---|
AC018645.2 | ENSG00000273014 | 475.74 | 953.48* |
AL109917.1 | ENSG00000231050 | 468.25 | 940.50* |
SLC16A1-AS1 | ENSG00000226419 | 461.44 | 928.88* |
AP002852.1 | ENSG00000253633 | 458.34 | 924.69* |
LINC02454 | ENSG00000256268 | 457.71 | 925.42* |
AL592166.1 | ENSG00000225721 | 456.93 | 925.87* |
LINC00891 | ENSG00000281852 | 455.58 | 925.15* |
ERICD | ENSG00000280303 | 448.82 | 913.65* |
AL136162.1 | ENSG00000271888 | 448.71 | 915.42* |
AP003071.4 | ENSG00000261625 | 444.15 | 908.30* |
AL157373.2 | ENSG00000229896 | 442.40 | 906.80* |
AL035458.2 | ENSG00000250917 | 439.59 | 903.17* |
DSCR9 | ENSG00000230366 | 438.09 | 902.17* |
AL031770.1 | ENSG00000230433 | 432.43 | 892.86* |
AL513218.1 | ENSG00000272100 | 432.04 | 894.07* |
AL135818.2 | ENSG00000260810 | 427.16 | 886.32* |
AC243829.1 | ENSG00000274767 | 425.19 | 884.39* |
RP11-321A17.6 | ENSG00000276995 | 422.57 | 881.15* |
AC005840.4 | ENSG00000276718 | 422.57 | 883.14 |
AL135818.1 | ENSG00000258875 | 420.95 | 881.90 |
*, P<0.05. lncRNA, long non-coding RNA; AIC, Akaike’s information criteria
Table 4
lncRNA | Ensembl ID | Coefficient | Hazard ratio | P value |
---|---|---|---|---|
AC018645.2 | ENSG00000273014 | −0.5698 | 0.5656 | 0.007 |
LINC02454 | ENSG00000256268 | 0.1732 | 1.1891 | 0.006 |
ERICD | ENSG00000280303 | 0.3870 | 1.4725 | 0.007 |
DSCR9 | ENSG00000230366 | 0.6262 | 1.8705 | 0.002 |
AL031770.1 | ENSG00000230433 | 0.9781 | 2.6593 | <0.001 |
Wald test P=9e−09. lncRNA, long non-coding RNA.
Validating the Cox regression model
According to the median risk score, high- and low-risk groups were identified in the training set, testing set, complete set, and different pathological type sets (including LMS set, DDLPS set, and other subtype sets).
Validating the Cox regression model in the training set
The expression levels of five lncRNAs were different between the high- and low-risk groups (Figure 2A), with only AC018645.2 being lower in the high-risk group, whereas the others were higher. In the training set, the high-risk group had a lower survival time and more deceased patients (Figure 2B). Kaplan-Meier survival analysis and log-rank test suggested that the survival time of patients with SARC was negatively associated with the and patients in the low-risk group had a prolonged survival time (P=2.04e−06) (Figure 2C). The P value of the log-rank test clearly indicated that the Cox regression model could predict the prognosis of SARC in the training set.
![Click on image to zoom](http://cdn.amegroups.cn/journals/pbpc/files/journals/3/articles/95680/public/95680-PB7-7568-R1.jpg/w300)
Validating the 5-lncRNA Cox regression model in the testing set and complete set
To further confirm that this risk score-based Cox regression model could predict the prognosis in SARC, we validated the 5-lncRNA signature in the testing set and complete set. Figure 3 displays the results of the testing set. Based on the median risk score of the training set, 68 samples were assigned to the high-risk group and 61 to the low-risk group. Although the significant difference was not as pronounced as in the training test, they exhibited a similar trend. Kaplan-Meier survival analysis and log-rank test indicated a significant difference in survival time between the high- and low-risk groups (P=9.35e−05). Figure 4 displays the results of the complete set, with 133 samples in the high-risk group and 126 samples in the low-risk group. The scatter diagrams suggested that using the Cox regression model constructed with lncRNAs could predict the prognosis in all samples. The survival times and expression levels of prognosis-related lncRNAs differed between the two groups, and Kaplan-Meier curves further supported a significantly prolonged survival time in the low-risk groups compared to the high-risk groups (P=4.00e−09).
![Click on image to zoom](http://cdn.amegroups.cn/journals/pbpc/files/journals/3/articles/95680/public/95680-PB8-6070-R1.jpg/w300)
![Click on image to zoom](http://cdn.amegroups.cn/journals/pbpc/files/journals/3/articles/95680/public/95680-PB9-9531-R1.jpg/w300)
Validating a 5-lncRNA Cox regression model in different pathological types
According to the clinical data, 9 different pathological types were distributed among the 259 samples. There were various characteristics among them, so Kaplan-Meier survival analysis and log-rank tests were conducted in the LMS set, DDLPS set, and other sets. Figure 5 suggests that the Cox regression model could predict the prognosis well in all the pathological types, but it performed better in the LMS set {P=1.19e−06, HR 6.134 [95% confidence interval (CI): 2.951–12.752]} (Figure 5A-5D). The heatmap indicated that the expression of AC018645.2 was lower in the low-risk group, whereas LINC02454, ERICD, DSCR9, and AL031770.1 showed adverse trends (Figure 5E).
![Click on image to zoom](http://cdn.amegroups.cn/journals/pbpc/files/journals/3/articles/95680/public/95680-PB10-4657-R1.jpg/w300)
LMS samples have a novel DEmRNA profile
Recent evidence shows that lncRNAs are closely related to the expression of mRNAs. Based on the risk score, we identified a high-risk group (n=133) and a low-risk group (n=126) from the complete samples. Subsequently, we compared the mRNAs of the 2 groups. In total, 44 DEmRNAs were identified. Compared to the high-risk group, only ANPEP (logFC =−2.00) and LRRC15 (logFC =−2.20) were down-regulated in the low-risk group, whereas the other 42 DEmRNAs were up-regulated, with the maximum up-regulated mRNA being MYH11 (logFC =4.11) (Table S3).
Clustered by pathology types, we discovered that LMS samples had a unique DEmRNA profile. The expression levels of DEmRNAs were different from those of other pathology types, and this tendency was more pronounced in the low-risk group (Figure 6).
![Click on image to zoom](http://cdn.amegroups.cn/journals/pbpc/files/journals/3/articles/95680/public/95680-PB11-3965-R1.jpg/w300)
A total of 18 GO terms were identified in the GO enrichment analysis of DEmRNAs, comprising 8 BP terms, 7 CC terms, and 7 MF terms (Table 5, P<0.05). Additionally, 6 KEGG pathways were enriched from the DEmRNAs (Table 6, P<0.05). The functional analysis of DEmRNAs focused on the regulation of muscle tissue and the proliferation of mesenchymal cells, which were found to be associated with SARC, particularly LMS.
Table 5
Category | Term | mRNA | Adjusted P value |
---|---|---|---|
GO BP | GO:0006936~muscle contraction | ACTG2, LMOD1, MYLK, SORBS1, MYH11, DE, TMOD1 | <0.001 |
GO:0010463~mesenchymal cell proliferation | FGF7, HAND2 | 0.01 | |
GO:0007015~actin filament organization | SORBS1, TMOD1, LMOD1 | 0.01 | |
GO:0051694~pointed-end actin filament capping | TMOD1, LMOD1 | 0.02 | |
GO:0030239~myofibril assembly | TMOD1, LMOD1 | 0.03 | |
GO:1901381~positive regulation of potassium ion transmembrane transport | KCNMB1, KCNH2 | 0.04 | |
GO:0006939~smooth muscle contraction | MYLK, MYH11 | 0.04 | |
GO:0060314~regulation of ryanodine-sensitive calcium-release channel activity | PLN, JPH2 | 0.045 | |
GO MF | GO:0003779~actin binding | CNN1, TMOD1, TAGLN, LMOD1, SYNPO2, MYLK, SORBS1 | <0.001 |
GO:0008131~primary amine oxidase activity | MAOB, AOC3 | 0.01 | |
GO:0005523~tropomyosin binding | TMOD1, LMOD1 | 0.03 | |
GO CC | GO:0030018~Z disc | PPP1R12B, JPH2, PGM5, SYNPO2 DES | <0.001 |
GO:0001725~stress fiber | MYLK, SORBS1, PGM5 | 0.007 | |
GO:0005925~focal adhesion | CNN1, SYNPO2, SORBS1, PGM5, NFASC | 0.01 | |
GO:0042383~sarcolemma | POPDC2, PGM5, DES | 0.02 | |
GO:0005865~striated muscle thin filament | TMOD1, LMOD1 | 0.03 | |
GO:0005829~cytosol | MYLK, PPP1R12B, SORBS1, MAPK10, MYH11, TMOD1, PGM5, ACTG2, LMOD1, CES1, ANPEP, PPP1R14A, AR, DES | 0.03 | |
GO:0032982~myosin filament | ACTG2, MYH11 | 0.04 |
DEmRNA, differentially expressed messenger RNA; GO, Gene Ontology; BP, biological process; MF, molecular function; CC, cellular component.
Table 6
Category | Term | mRNA | Adjusted P value |
---|---|---|---|
KEGG pathway | hsa04270: vascular smooth muscle contraction | ACTG2, ADCY5, KCNMB1, MYH11, MYLK, PPP1R12B, IRAG1, PPP1R14A, PLN | <0.001 |
KEGG pathway | hsa04024: cAMP signaling pathway | ADCY5, PLN, MAPK10, POPDC2 | <0.001 |
KEGG pathway | hsa04810: regulation of actin cytoskeleton | FGF7, MYH11, MYLK, PPP1R12B | <0.001 |
KEGG pathway | hsa04728: dopaminergic synapse | ADCY5, MAOB, MAPK10 | <0.001 |
KEGG pathway | hsa04020: calcium signaling pathway | FGF7, MYLK, PLN | 0.006 |
DEmRNA, differentially expressed messenger RNA; KEGG, Kyoto Encyclopedia of Genes and Genomes.
5-lncRNAs have a significant correlation with DEmRNAs
Linear correlation analysis was utilized in all the 259 samples. AC018645.2 was found to have a positive correlation with 42 DEmRNAs (Except ANPEP and LRRC15), whereas LINC02454 and AL031770.1 exhibited the opposite trend (Figure 7A). LINC02454 showed the strongest correlation with DEmRNA, particularly with KCNH2 (Spearman’s r =−0.628, P=9.101e−30), PPP1R12B (Spearman’s r =−0.628, P=7.300e−30), PGM5 (Spearman’s r =−0.608, P=1.487e−27), and ANPEP (Spearman’s r =0.577, P=2.222e−24) (Figure 7A). All 44 DEmRNAs had a significant correlation with risk score (|Spearman’s r| ranging from 0.195 to 0.517, including 2 positive correlations and 42 negative correlations).
![Click on image to zoom](http://cdn.amegroups.cn/journals/pbpc/files/journals/3/articles/95680/public/95680-PB12-1893-R1.jpg/w300)
As LMS samples had a special DEmRNA profile, and GO analysis strongly showed that the functions of DEmRNA were related to LMS, linear correlation analysis was also used in LMS samples. As expected, all of the DEmRNAs showed a more significant correlation with the risk score. The relationship between AC018645.2 and DEmRNA became stronger, as did that between DSCR9 and DEmRNA, but the correlation with LINC02454 reversed (Figure 7A). A total of 18 DEmRNAs showed a stronger negative correlation with the risk score (|Spearman’s r| >0.6), namely MYH11, ACTG2, MYLK, TAGLN, LMOD1, PRUNE2, CNN1, AOC3, HMCN2, JPH2, RASL12, SYNPO2, KCNMB1, PGM5, MRVI1, PPP1R12B, SORBS1, and PPP1R14A, as depicted in Figure 7B.
KEGG pathway and PPI network of 18 DEmRNAs
A total of 3 KEGG pathways, namely vascular smooth muscle contraction, cGMP-PKG signaling pathway, and regulation of actin cytoskeleton, were predicted with 18 DEmRNAs. All of these KEGG pathways are related to the function of smooth muscle (Figure 8A). Furthermore, MYH11, MYLK, and CNN1 were identified as the hub genes (having the highest degrees) in the PPI network, and many studies (10-12) have revealed their effects on the regulation of smooth muscle (Figure 8B, Table 7).
![Click on image to zoom](http://cdn.amegroups.cn/journals/pbpc/files/journals/3/articles/95680/public/95680-PB13-5459-R1.jpg/w300)
Table 7
DEmRNA | Degree | Betweenness | Closeness |
---|---|---|---|
MYH11 | 18 | 17.40 | 0.80 |
ACTG2 | 16 | 7.73 | 0.71 |
CNN1 | 16 | 14.50 | 0.75 |
MYLK | 16 | 26.23 | 0.71 |
LMOD1 | 16 | 8.73 | 0.71 |
SORBS1 | 10 | 3.00 | 0.60 |
PPP1R12B | 10 | 22.00 | 0.60 |
TAGLN | 10 | 0.00 | 0.57 |
SYNPO2 | 10 | 22.40 | 0.57 |
KCNMB1 | 8 | 0.00 | 0.52 |
DEmRNA, differentially expressed messenger RNA.
Next, we found that patients with a higher risk score have increased expression of MYH11, MYLK, and CNN1, particularly in the LMS group (Figure 9). From the above results, we might infer the reason why our 5-lncRNA signature model had a better predictive function in LMS patients: most of the DEmRNAs between patients with high-risk scores and those with low-risk scores have the function to regulate smooth muscle.
![Click on image to zoom](http://cdn.amegroups.cn/journals/pbpc/files/journals/3/articles/95680/public/95680-PB14-6785-R1.jpg/w300)
Discussion
SARC is a complex type of cancer that includes at least 100 different histologic and molecular subtypes, each with varying clinical behaviors. Surgery and chemotherapy are the primary treatments for SARC (14). Over the past decade, there has been a shift from a universal treatment approach to a more personalized treatment strategy based on histology (2). This approach aims to customize not only the type and scope of cancer surgery but also the utilization and necessity of multimodality therapy.
Precision medicine has been defined as an approach that utilizes a person’s genetics, environment, and lifestyle to help determine the optimal approach for preventing or treating diseases (22). The genetic molecular markers and genetic predictor models are usually used in precision medicine, greatly improving therapeutic effects for patients with cancer. Unfortunately, genetic molecular markers and genetic predictor models in SARC are lacking. Recently, the effects of genetic molecular markers and the functions of lncRNA signatures in prognosis have been widely discussed (4,23).
In order to improve the prognosis prediction of SARC, we adopted univariable Cox regression analysis, robust analysis, and multivariable Cox regression analysis. We constructed a Cox regression model and identified five lncRNAs: AC018645.2, LINC02454, ERICD, DSCR9, and AL031770.1. The risk score was calculated by the regression coefficient of the cox regression model, and the patients with higher risk scores were found to have shorter survival times. The Cox regression model constructed by the five lncRNAs could predict the prognosis of SARC well. Furthermore, we found that this predictive model showed better prediction in patients with LMS.
Among the five lncRNAs, only AC018645.2 was found to lower the risk of mortality and prolong the patients’ survival time. The other four lncRNAs, including LINC02454, ERICD, DSCR9, and AL031770.1, had an inverse effect. Up to now, AC018645.2 and AL031770.1 were the first reported genes with the potential to predict prognosis in cancer. Some studies have found that LINC02454 is significantly overexpressed in tumor tissues and is highly correlated with lymph node metastasis in thyroid carcinoma and laryngeal squamous cell carcinoma (24-26). Our study also found that LINC02454 was upregulated in SARC. In 2002, DSCR9 was first discovered and subsequently found to have active transcription but no protein coding ability (27). A study on DNA methylation revealed that DSCR9 exhibited a high frequency of methylation in malignant prostate cells, indicating that DSCR9 might regulate the progression of prostate cancer (28). Other studies have found that DSCR9 is closely related to the progression of pancreatic cancer, as well as pancreatic cancer and renal cell carcinoma (29-31). ERICD (also known as ERIC) was found to be regulated by E2F1 (a critical downstream target of the tumor suppressor) and modulate the cellular response to DNA damage (32). Further research found that inhibition of ERICD expression increased E2F1-mediated apoptosis, whereas increased DNA damage promoted ERICD expression levels (33). Recently, a study found that the lncRNA ERICD interacts with ARID3A through E2F1 and regulates the migration and proliferation of osteosarcoma cells (32). Our study strongly agrees with the above study and found that ERICD expression levels are higher in the high-risk group. The possible reason could be that the high-risk group has more DNA damage, and higher ERICD expression levels restrict E2F1-induced apoptosis. However, this conclusion needs further deliberation. On the one side, there is no evidence to identify if this mechanism is appropriate for SARC. On the other side, the difference in ERICD expression levels between the high- and low-risk groups is not as significant compared to the other 4 lncRNAs.
Considering the regulation of lncRNA to DNA, we found 44 DEmRNAs between the high- and low-risk groups. Only two mRNAs (ANPEP, LRRC15) were up-regulated, and 42 were down-regulated in the high-risk group compared to the low-risk group. Subsequently, the 44 DEmRNAs were clustered by pathology types, revealing that LMS samples exhibited a unique DEmRNA profile. The expression levels of the 44 DEmRNAs in the LMS samples were notably distinct from other pathology types. Further analysis through GO and KEGG enrichment revealed that the functions of the 44 DEmRNAs were primarily associated with the regulation of muscle tissue and the proliferation of mesenchymal cells, which are believed to be linked to SARC, particularly LMS. Lastly, we identified 5 lncRNAs that exhibited a strong correlation with DEmRNAs, with the correlation being more pronounced in LMS samples. We also identified 18 DEmRNAs that showed a robust correlation with the risk score in LMS samples (|Spearman’s r| >0.6). The results from the PPI network analysis highlighted that MYH11, MYLK, and CNN1 were the hub genes for SARC, particularly for the LMS subtype.
MYH11, MYLK, and CNN1 have been reported to be associated with smooth muscle disease (34-36). The results of some studies have shown that MYH11 and MYLK had high expression levels in LMS samples, which greatly validated the reliability of our results (1,10-12). In our study, the expression levels of these two genes were higher in LMS samples. Another study identified that MYLK could improve overall survival in tumor patients (35). Correspondingly, our study identified that the lower risk score group expressed lower MYLK, indicating that patients with LMS might have a better prognosis than other subtypes of SARC. Additionally, research has shown that CNN1 is important to distinguish clinically relevant molecular subtypes in LMS (36). Furthermore, MYH11, MYLK, and CNN1 have been reported to be important for other cancers. MYH11 was found to interact with DNMT3B via hypermethylation, regulating gastric cancer progression (37), and the CBFB-MYH11 fusion neoantigen enables T cell recognition and killing of acute myeloid leukemia (37). MYLK promotes hepatocellular carcinoma progression by regulating the cytoskeleton to enhance epithelial-mesenchymal transition and is considered a molecular target in bladder cancer (38,39). CNN1 affects bladder cancer progression and metabolic reprogramming by modulating the HIF-1α signaling pathway and regulates the DKK1/Wnt/β-catenin/c-myc signaling pathway by activating TIMP2 in lung squamous cell carcinoma cells (40,41). In this study, we found that patients with a higher 5-lncRNA-related risk score had higher expression of MYH11, MYLK, and CNN1, especially among LMS patient.
Besides the aforementioned three key genes, other genes such as PRUNE2, TAGLN, ACTG2, and LMOD1 are also associated with SARC (1,36,42). As far as we know, JPH2 is vital for heart diseases (43,44), and another 10 DEmRNAs, including MRVI1, PPP1R14A, PPP1R12B, SORBS1, RASL12, AOC3, PGM5, SYNPO2, and HMCN2 are reported to be related to multiple cancers, but have not been significantly studied in relation to SARC (45-49).
Conclusions
We applied a stable method to identify a 5-lncRNA signature in SARC, which predicted the prognosis well. MYH11, MYLK, and CNN1 are hub genes that regulate smooth function in patients with LMS. Our data suggest that the 5 lncRNAs and 3 hub genes may have some connection. They have significant prognostic and therapeutic implications for SARC, especially for LMS, and deserve more attention. The results revealed in this study might be useful for the precise diagnosis and therapy of SARC. However, we must realize the limitations of this study that warrant consideration, including the lack of additional SARC samples collected to validate the 5-lncRNA Cox regression model, and the unclear mechanisms of how the 5-lncRNAs regulate the hub genes in SARC. We are planning to explore these issues in our future studies and believe that meaningful results will be observed.
Acknowledgments
None.
Footnote
Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://tcr.amegroups.com/article/view/10.21037/tcr-24-203/rc
Peer Review File: Available at https://tcr.amegroups.com/article/view/10.21037/tcr-24-203/prf
Funding: This work was financially supported by
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tcr.amegroups.com/article/view/10.21037/tcr-24-203/coif). All authors report that this work was financially supported by Scientific Research Project of Hunan Provincial Department of Education (24B1084) and Guangdong Provincial Bureau of Traditional Chinese Medicine Research Project (20212149). The authors have no other conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013).
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- Cancer Genome Atlas Research Network. Comprehensive and Integrated Genomic Characterization of Adult Soft Tissue Sarcomas. Cell 2017;171:950-965.e28. [Crossref] [PubMed]
- Gamboa AC, Gronchi A, Cardona K. Soft-tissue sarcoma in adults: An update on the current state of histiotype-specific management in an era of personalized medicine. CA Cancer J Clin 2020;70:200-29. [Crossref] [PubMed]
- Oliver J, Onieva JL, Garrido-Barros M, et al. Association of Circular RNA and Long Non-Coding RNA Dysregulation with the Clinical Response to Immune Checkpoint Blockade in Cutaneous Metastatic Melanoma. Biomedicines 2022;10:2419. [Crossref] [PubMed]
- Anastasiadou E, Jacob LS, Slack FJ. Non-coding RNA networks in cancer. Nat Rev Cancer 2018;18:5-18. [Crossref] [PubMed]
- Kopp F, Mendell JT. Functional Classification and Experimental Dissection of Long Noncoding RNAs. Cell 2018;172:393-407. [Crossref] [PubMed]
- Choi SW, Kim HW, Nam JW. The small peptide world in long noncoding RNAs. Brief Bioinform 2019;20:1853-64. [Crossref] [PubMed]
- Pasieka R, Zasoński G, Raczyńska KD. Role of Long Intergenic Noncoding RNAs in Cancers with an Overview of MicroRNA Binding. Mol Diagn Ther 2023;27:29-47. [Crossref] [PubMed]
- Choudhari R, Sedano MJ, Harrison AL, et al. Long noncoding RNAs in cancer: From discovery to therapeutic targets. Adv Clin Chem 2020;95:105-47. [Crossref] [PubMed]
- Li L, Zhang X, Liu N, et al. LINC00473: A novel oncogenic long noncoding RNA in human cancers. J Cell Physiol 2021;236:4174-83. [Crossref] [PubMed]
- Xie KY, Chen SZ, Wang Y, et al. Establishment and validation of a prognostic immune-related lncRNA risk model for acute myeloid leukemia. Transl Cancer Res 2023;12:3693-702. [Crossref] [PubMed]
- Huang T, Wang YJ, Huang MT, et al. LINC00470 accelerates the proliferation and metastasis of melanoma through promoting APEX1 expression. Cell Death Dis 2021;12:410. [Crossref] [PubMed]
- Melixetian M, Pelicci PG, Lanfrancone L. Regulation of LncRNAs in Melanoma and Their Functional Roles in the Metastatic Process. Cells 2022;11:577. [Crossref] [PubMed]
- O'Brien EM, Selfe JL, Martins AS, et al. The long non-coding RNA MYCNOS-01 regulates MYCN protein levels and affects growth of MYCN-amplified rhabdomyosarcoma and neuroblastoma cells. BMC Cancer 2018;18:217. [Crossref] [PubMed]
- Kunisada T, Nakata E, Fujiwara T, et al. Soft-tissue sarcoma in adolescents and young adults. Int J Clin Oncol 2023;28:1-11. [Crossref] [PubMed]
- Shao Y, Zhang Y, Hou Y, et al. A novel long noncoding RNA PILRLS promote proliferation through TCL1A by activing MDM2 in Retroperitoneal liposarcoma. Oncotarget 2017;8:13971-8. [Crossref] [PubMed]
- Liu Y, Wang D, Ji Q, et al. LncRNA MATN1-AS1 for Prediction of Prognosis in Osteosarcoma Patients and Its Cellular Function. Mol Biotechnol 2022;64:66-74. [Crossref] [PubMed]
- Meng L, Shang H, Liu Q, et al. Lnc-PSMA8-1 activated by GEFT promotes rhabdomyosarcoma progression via upregulation of mTOR expression by sponging miR-144-3p. BMC Cancer 2024;24:79. [Crossref] [PubMed]
- Zhang Z, Liu J, Wu Y, et al. Long Noncoding RNA SERTAD2-3 Inhibits Osteosarcoma Proliferation and Migration by Competitively Binding miR-29c. Genet Test Mol Biomarkers 2020;24:67-72. [Crossref] [PubMed]
- Yang T, Guo JP, Li F, et al. Long non coding RNA DUXAP8 regulates TOP2A in the growth and metastasis of osteosarcoma via microRNA 635. Mol Med Rep 2021;24:511. [Crossref] [PubMed]
- Kendall WL, Pollock KH, Brownie C. A likelihood-based approach to capture-recapture estimation of demographic parameters under the robust design. Biometrics 1995;51:293-308.
- Yang Y, Hong C, Liang JW, et al. A likelihood-based approach to assessing frequency of pathogenicity among variants of unknown significance in susceptibility genes. Stat Med 2021;40:593-606. [Crossref] [PubMed]
- Sisodiya SM. Precision medicine and therapies of the future. Epilepsia 2021;62:S90-S105. [Crossref] [PubMed]
- Wei R, Gao F, Zeng Z, et al. Molecular profiling of gene fusions in soft tissue sarcomas by Ion AmpliSeq(TM): a study of 35 cases. Transl Cancer Res 2022;11:488-99. [Crossref] [PubMed]
- Cao Y, Li J, Du Y, et al. LINC02454 promotes thyroid carcinoma progression via upregulating HMGA2 through CREB1. FASEB J 2023;37:e23288. [Crossref] [PubMed]
- Chen B, Weng Y, Li M, et al. LINC02454-CCT complex interaction is essential for telomerase activity and cell proliferation in head and neck squamous cell carcinoma. Cancer Lett 2024;588:216734. [Crossref] [PubMed]
- Zhu Q, Zhang R, Lu F, et al. Cuproptosis-related LINC02454 as a biomarker for laryngeal squamous cell carcinoma based on a novel risk model and in vitro and in vivo analyses. J Cancer Res Clin Oncol 2023;149:15185-206. [Crossref] [PubMed]
- Takamatsu K, Maekawa K, Togashi T, et al. Identification of two novel primate-specific genes in DSCR. DNA Res 2002;9:89-97. [Crossref] [PubMed]
- Yegnasubramanian S, Wu Z, Haffner MC, et al. Chromosome-wide mapping of DNA methylation patterns in normal and malignant prostate cells reveals pervasive methylation of gene-associated and conserved intergenic sequences. BMC Genomics 2011;12:313. [Crossref] [PubMed]
- Huang H, Li X, Zhang X, et al. DSCR9/miR-21-5p axis inhibits pancreatic cancer proliferation and resistance to gemcitabine via BTG2 signaling. Acta Biochim Biophys Sin (Shanghai) 2022;54:1775-88. [Crossref] [PubMed]
- Li M, Lin C, Cai Z. Downregulation of the long noncoding RNA DSCR9 (Down syndrome critical region 9) delays breast cancer progression by modulating microRNA-504-5p-dependent G protein-coupled receptor 65. Hum Cell 2023;36:1516-34. [Crossref] [PubMed]
- Liu Y, Luo J, Zeng J, et al. Competing endogenous RNA analysis identified lncRNA DSCR9 as a novel prognostic biomarker associated with metastasis and tumor microenvironment in renal cell carcinoma. Oncol Lett 2023;26:290. [Crossref] [PubMed]
- Arman K, Saadat KASM, Igci YZ, et al. Long noncoding RNA ERICD interacts with ARID3A via E2F1 and regulates migration and proliferation of osteosarcoma cells. Cell Biol Int 2020;44:2263-74. [Crossref] [PubMed]
- Feldstein O, Nizri T, Doniger T, et al. The long non-coding RNA ERIC is regulated by E2F and modulates the cellular response to DNA damage. Mol Cancer 2013;12:131. [Crossref] [PubMed]
- Liu R, Jin JP. Calponin isoforms CNN1, CNN2 and CNN3: Regulators for actin cytoskeleton functions in smooth muscle and non-muscle cells. Gene 2016;585:143-53. [Crossref] [PubMed]
- Deaton RA, Bulut G, Serbulea V, et al. A New Autosomal Myh11-CreER(T2) Smooth Muscle Cell Lineage Tracing and Gene Knockout Mouse Model-Brief Report. Arterioscler Thromb Vasc Biol 2023;43:203-11. [Crossref] [PubMed]
- Fournier N, Fabre A. Smooth muscle motility disorder phenotypes: A systematic review of cases associated with seven pathogenic genes (ACTG2, MYH11, FLNA, MYLK, RAD21, MYL9 and LMOD1). Intractable Rare Dis Res 2022;11:113-9. [Crossref] [PubMed]
- Biernacki MA, Foster KA, Woodward KB, et al. CBFB-MYH11 fusion neoantigen enables T cell recognition and killing of acute myeloid leukemia. J Clin Invest 2020;130:5127-41. [Crossref] [PubMed]
- Jin H, Liu B, Guo X, et al. MYLK and CALD1 as molecular targets in bladder cancer. Medicine (Baltimore) 2023;102:e36302. [Crossref] [PubMed]
- Lin J, He Y, Chen L, et al. MYLK promotes hepatocellular carcinoma progression through regulating cytoskeleton to enhance epithelial-mesenchymal transition. Clin Exp Med 2018;18:523-33. [Crossref] [PubMed]
- Zhang Z, Li X, Ren S, et al. CNN1 Represses Bladder Cancer Progression and Metabolic Reprogramming by Modulating HIF-1α Signaling Pathway. Front Oncol 2022;12:859707. [Crossref] [PubMed]
- Liu W, Fu X, Li R. CNN1 regulates the DKK1/Wnt/β-catenin/c-myc signaling pathway by activating TIMP2 to inhibit the invasion, migration and EMT of lung squamous cell carcinoma cells. Exp Ther Med 2021;22:855. [Crossref] [PubMed]
- Guo X, Jo VY, Mills AM, et al. Clinically Relevant Molecular Subtypes in Leiomyosarcoma. Clin Cancer Res 2015;21:3501-11. [Crossref] [PubMed]
- Lehnart SE, Wehrens XHT. The role of junctophilin proteins in cellular function. Physiol Rev 2022;102:1211-61. [Crossref] [PubMed]
- Poulet C, Sanchez-Alonso J, Swiatlowska P, et al. Junctophilin-2 tethers T-tubules and recruits functional L-type calcium channels to lipid rafts in adult cardiomyocytes. Cardiovasc Res 2021;117:149-61. [Crossref] [PubMed]
- Zhou H, Li L, Xie W, et al. TAGLN and High-mobility Group AT-Hook 2 (HMGA2) Complex Regulates TGF-β-induced Colorectal Cancer Metastasis. Onco Targets Ther 2020;13:10489-98. [Crossref] [PubMed]
- Ma L, Wang H, Sun Y, et al. P53-induced MRVI1 mediates carcinogenesis of colorectal cancer. Scand J Gastroenterol 2020;55:824-33. [Crossref] [PubMed]
- Wang Z, Huang R, Wang H, et al. Prognostic and Immunological Role of PPP1R14A as a Pan-Cancer Analysis Candidate. Front Genet 2022;13:842975. [Crossref] [PubMed]
- OuYang C, Xie Y, Fu Q, et al. SYNPO2 suppresses hypoxia-induced proliferation and migration of colorectal cancer cells by regulating YAP-KLF5 axis. Tissue Cell 2021;73:101598. [Crossref] [PubMed]
- Duan X, Wang L, Wang Z, et al. LncRNA PGM5-AS1 Inhibits the Progression of Bladder Cancer by Regulating miR-587/SLIT3 Axis. Crit Rev Eukaryot Gene Expr 2022;32:9-22. [Crossref] [PubMed]