Background/objectives:
The transformer, introduced in 2017, has revolutionized several fields, including oncology. Its applications in cancer detection, diagnosis, prognosis prediction and treatment have shown significant potential. However, a comprehensive analysis of the global research trends and future directions in image-driven cancer diagnosis is lacking.
Methods:
We conducted a bibliometric analysis using CiteSpace and VOSviewer to explore the development of transformer applications in image-driven cancer diagnosis (TICD). A total of 2923 papers published between 2017 and 2026 were analyzed, including early access papers dated 2026. We examined publication trends, international collaboration, and citation patterns to identify research hotspots and emerging directions.
Results:
The number of publications on transformer applications in image-driven cancer diagnosis has rapidly increased, with a notable surge beginning in 2022. China and the United States are the leading contributors to the field, with high levels of international collaboration. The primary research focuses on the application of transformer-based models for image classification, segmentation, and enhancement. Their development is moving toward lightweight design, interpretability, multimodal fusion, and low annotation dependence. However, while the volume of publications is increasing, the impact (measured by citation counts) varies across countries and institutions.
Conclusions:
The field of TICD is in a robust growth phase, attracting significant attention from global researchers, particularly from China and the United States. While international collaboration is prevalent, the field faces challenges regarding the generalizability and scalability of research findings. Future research should focus on translating these promising technologies into clinical practice, ensuring that they are adaptable and applicable in diverse oncology contexts.
1 Introduction
With the rapid development of artificial intelligence (AI) technology, the integration of AI and medicine has become one of the most promising innovative fields (1, 2). Deep learning, as the dominant method in AI, has widespread applications in oncology (3). The Transformer, introduced by Google’s research team in 2017 (4), utilizes multi-head self-attention mechanisms and position encoding to enable efficient parallel computation and capture broader dependencies. It has demonstrated exceptional performance and significant potential in tumor diagnosis (5) and prediction (6).
Accurately identifying tumor-related regions in medical images is crucial for tumor staging, treatment planning, and surgical navigation, with significant clinical value. The Transformer architecture aids tumor diagnosis by enhancing medical imaging (7, 8), medical image segmentation (9–11), and multimodal classification (12–14). It also predicts tumor growth using biomarkers (15) and builds survival models to forecast prognosis (16–18), providing essential decision support for precision medicine and personalized treatment plans.
Transformers have broken through the limitations of traditional methods in tumor image tasks, achieving superior performance in diagnosis and classification (19), prognosis prediction (20, 21), and cross-center generalization (22). The development of TICD represents an important direction for medical progress. Tracking advances in this field is of great significance for promoting medical research and facilitating clinical translation.
Many scholars have studied these issues, with some systematic reviews focusing on specific areas, such as the use of transformers in brain tumor image segmentation, precise tumor diagnosis, or the diagnosis and prediction of specific cancers (e.g., breast cancer, lung cancer) (23–26). However, a macro-level bibliometric analysis of the overall research situation has not been conducted, so the development of TICD as a whole has not yet been mapped. With the advancement of large models, the Transformer architecture will no longer be confined to a single type of cancer or a single function; instead, it is likely to make significant contributions to multimodal large models and pan-tumor diagnosis. Therefore, a comprehensive review of the research directions of TICD will provide a reference for future developments. Timeliness matters here: rapidly grasping the disciplinary landscape and monitoring emerging developments is a significant challenge for researchers. Bibliometrics can objectively quantify and analyze research progress, trends, evolution, and the distribution of research efforts in a particular field, offering a comprehensive understanding of its current state. This paper aims to reveal the application status and development trends of TICD in terms of publication trends, international collaboration, and thematic evolution, with the results presented visually. This study helps scholars quickly and intuitively grasp the research landscape, track progress, and identify emerging hotspots in the field, thereby advancing the development and innovation of intelligent diagnostic imaging for cancer.
2 Materials and methods
2.1 Data source
We selected the Science Citation Index Expanded (SCI-EXPANDED) from the Web of Science Core Collection as the data source. The search query was “Topic=(“attention mechanism” OR “transformer” OR ViT OR BERT OR GPT OR “Self-attention network” OR “Multi-head attention”) AND Topic=(Tumor or Neoplasms or Oncology or Cancer) AND Topic= imag*”, with the search conducted on December 29, 2025. Since the Transformer architecture was first proposed in 2017, we set the publication period from 2017 to the present. To capture the latest developments in the field, we included early access articles. Early Access publications were identified using the standardized “Early Access” document type label provided in the Web of Science Core Collection metadata. All Early Access records were included in the main analysis to ensure a comprehensive and up-to-date bibliometric overview. The inclusion criteria were therefore: (a) document types of article, early access, proceedings paper, and review article; (b) literature published between 2017 and 2026; (c) literature published in English; and (d) topic-relevant publications. After manual screening and verification, 1264 insufficiently relevant publications were excluded. These were primarily medical image processing studies that used CNN algorithms with attention mechanisms but did not use the Transformer architecture. Two scholars from relevant fields assessed a sample of 200 documents against the inclusion criteria, yielding a Cohen’s kappa of 77.5%.
The process of literature acquisition and selection is shown in Figure 1.

Flowchart of literature screening.
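The inter-rater agreement reported for the screening step can be illustrated with a minimal sketch of Cohen's kappa, computed as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The include/exclude decisions below are hypothetical toy data, not the study's screening records, and the function name `cohens_kappa` is chosen here for illustration only.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical decisions for ten documents (1 = include, 0 = exclude).
scholar_1 = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
scholar_2 = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
kappa = cohens_kappa(scholar_1, scholar_2)
print(f"kappa = {kappa:.3f}")
```

In this toy case the raters agree on 8 of 10 documents, chance agreement is 0.5, and kappa is therefore 0.6.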
2.2 Bibliometric analysis
We used Microsoft Excel 2019 for flowcharts and statistical tables. Charticulator and SCImago Graphica 1.0.25 were employed for analyzing and visualizing international research collaborations. For bibliometric analysis of authors, journals, institutions, keywords, references, and citations, we utilized CiteSpace 6.4.R1 and VOSviewer 1.6.20. CiteSpace and VOSviewer are widely used tools for bibliometric analysis and visualization, enabling the identification of research hotspots and trends across various disciplines (27–29).
3 Results
3.1 General data
Based on the literature retrieval strategy, a total of 2923 papers published in SCI-E since 2017 were collected, including 2263 research articles and 87 review articles. These publications were authored by 13357 researchers from 3328 institutions across 97 countries/regions and were published in 818 journals.
3.2 Publication trend
From 2019 to 2026, the number of publications gradually increased, with a sharp surge in 2022, reaching a growth rate of 100%. Although the upward trend continued thereafter, the growth rate declined (Figure 2). As of the retrieval date, 2025 had the highest number of publications, accounting for 30.38% of the total. Additionally, 85 early access papers for 2026 had already been published.

Publication trend of TICD.
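The growth rate and share figures above can be reproduced with simple arithmetic. The sketch below assumes the conventional year-over-year definition, (current - previous) / previous × 100, which the text does not state explicitly. The annual counts are hypothetical placeholders, since per-year totals are not listed in the text; only the total of 2923 papers and the 30.38% share come from the paper.

```python
def growth_rates(counts_by_year):
    """Year-over-year growth in percent, keyed by the later year."""
    years = sorted(counts_by_year)
    rates = {}
    for previous_year, current_year in zip(years, years[1:]):
        previous_count = counts_by_year[previous_year]
        if previous_count == 0:
            rates[current_year] = None  # growth is undefined from a zero base
        else:
            change = counts_by_year[current_year] - previous_count
            rates[current_year] = change / previous_count * 100
    return rates


def share_of_total(part, total):
    """Percentage of the corpus represented by `part`."""
    return part / total * 100


# Hypothetical annual counts, used only to show the calculation.
toy_counts = {2020: 40, 2021: 100, 2022: 200, 2023: 300}
rates = growth_rates(toy_counts)
for year, rate in rates.items():
    print(year, f"{rate:.1f}%")

# A 30.38% share of 2923 papers corresponds to roughly 888 papers.
print(round(share_of_total(888, 2923), 2))
```

A doubling from one year to the next, as in the toy 2021 to 2022 step, is what a 100% growth rate means; later years can still grow in absolute terms while the rate falls.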
The publication trends of the top 10 publishing countries are shown in Figure 3. Most countries experienced a surge in publications in 2024 and have maintained an upward trend. However, publication growth in Australia and Germany has slowed.

Top 10 countries/regions with annual publication trends of TICD.
3.3 Analysis of research countries
Figure 4A illustrates the distribution of research efforts and international collaborations in the field of TICD. The publishing countries are primarily concentrated in East Asia and parts of the Americas, with China and the United States leading in publication volume and demonstrating frequent collaboration. Figure 4B further depicts international cooperation patterns, showing that China and the United States have the most extensive collaborations, engaging in research partnerships with multiple countries. The United Kingdom, Saudi Arabia, and India also exhibit significant international cooperation.

Publications and cooperation in different countries/regions of the world. (A) Map of the world’s countries/regions in terms of publications and collaborations in the field of TICD. (The size of each circle represents the number of articles published. The thickness of each connecting line represents the number of collaborations between countries. The color of each circle represents the intensity of cooperation.) (B) Cooperation between countries/regions.
Chinese scholars have published the highest number of papers, totaling 1494; however, the average number of citations per paper is 10.67, indicating a moderate citation impact. The United States and India follow, with 444 and 352 publications, respectively. Australia has the highest average citation count per paper at 23.8, with 92 papers cited 2190 times. The United States ranks second, with 444 publications receiving 9760 citations, resulting in an average of 21.98 citations per paper. The publication volume and citation statistics for the top 10 publishing countries are presented in Table 1.
Columns: Country, Documents, Citations, Citations/Publication.
Publications and citations of the top 10 countries.
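The per-paper averages quoted above follow directly from dividing total citations by the number of documents. The short check below uses only the figures stated in the text for the United States and Australia; the helper name `citations_per_paper` is introduced here for illustration.

```python
def citations_per_paper(documents, citations):
    """Average number of citations per published document."""
    if documents <= 0:
        raise ValueError("documents must be positive")
    return citations / documents


# (documents, citations) as reported in the text.
reported = {
    "United States": (444, 9760),
    "Australia": (92, 2190),
}

averages = {
    country: round(citations_per_paper(documents, citations), 2)
    for country, (documents, citations) in reported.items()
}
# Rank countries by average citations per paper, highest first.
ranked = sorted(averages, key=averages.get, reverse=True)
for country in ranked:
    print(country, averages[country])
```

Ranking by this ratio rather than by raw publication count is what places Australia ahead of the United States despite its much smaller output.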
3.4 Analysis of research institutions
Figure 5 illustrates the co-authorship network among institutions that have published more than eight papers, with a total of 175 institutions meeting this threshold. The top five institutions in terms of publication volume are the Chinese Academy of Sciences (92), Shanghai Jiao Tong University (64), Central South University (55), Sun Yat-sen University (55), and Fudan University (54), as shown in Table 2.

Inter-institutional collaboration status. Each node represents an academic institution, with node size indicating the number of publications. Node and link colors are mapped to the time period of publication (blue: 2009–2014, green: 2014–2019, yellow: 2019–2024). Line thickness reflects the strength of collaborative relationships between institutions.
Columns: Institution, Documents, Citations, Citations/Publication.
Publications and citations of the top 10 institutions.
From the perspective of publication quality (with a minimum threshold of five publications), Vanderbilt University (USA) has the highest average citations per paper (176.85), with 2299 citations across only 13 papers. It is followed by ShanghaiTech University (China) with an average of 97.13 citations per paper, East China Normal University (China) (59.38), University of Lübeck (Germany) (52.63), and The Hong Kong University of Science and Technology (Hong Kong, China) (50.73).
The institutional collaboration network forms 11 clusters, reflecting the strength of collaborative relationships. Domestic collaborations are frequent, though international partnerships are also significant. The vast majority of clusters include Chinese institutions, which are predominantly universities. Collaborations among Chinese institutions are frequent and extensive, and are not limited by geographical location. Some institutions have collaborated with leading research organizations in other countries, such as Harvard University (USA), University of New South Wales (Australia), and University of Lübeck (Germany). European countries engage in more extensive international collaborations. For instance, Harvard Medical School has established extensive partnerships with a large number of institutions in various countries, such as Memorial Sloan Kettering Cancer Center (USA), University of Leeds (United Kingdom), and German Cancer Research Center and RWTH Aachen University (Germany).
Different colors in Figure 5 represent the temporal evolution of institutional research on this topic. The Chinese Academy of Sciences and Northeastern University (USA) were among the earliest institutions to conduct research in this area (June 2023). During the period from April to June 2024, institutions with significant publication output included Princess Nourah Bint Abdulrahman University (Saudi Arabia), SRM Institute of Science & Technology (India), Chitkara University (India), and Igdir University (Turkey), among others. These institutions began research on this topic relatively late.
3.5 Bibliometric analysis of authors
We constructed an author co-authorship network based on authors with more than five publications (Figure 6), identifying 194 authors who met this threshold. The largest connected network comprises 135 authors, accounting for 69.59% of the total, indicating that more than two-thirds of these researchers are linked through collaboration. The collaboration network consists of 16 clusters, with the largest cluster containing 18 authors and the smallest containing 2 authors.

Visualization of co-authorship.
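The "largest connected network" statistic can be sketched as follows: build an undirected graph in which two authors are linked if they share at least one paper, then measure the largest connected component. This is a simplified illustration in plain Python, not the procedure VOSviewer uses internally; the author lists are hypothetical toy data and `min_papers` is an assumed name for the publication threshold.

```python
from collections import Counter
from itertools import combinations


def coauthor_graph(papers, min_papers=1):
    """Undirected co-authorship graph over authors with at least `min_papers` papers."""
    paper_counts = Counter(author for authors in papers for author in set(authors))
    kept = {author for author, count in paper_counts.items() if count >= min_papers}
    graph = {author: set() for author in kept}
    for authors in papers:
        # Link every pair of retained authors appearing on the same paper.
        for a, b in combinations(sorted(set(authors) & kept), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph):
    """Connected components of the graph, largest first."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # iterative depth-first search
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Hypothetical author lists, one per paper.
toy_papers = [["A", "B"], ["B", "C"], ["D", "E"], ["F"]]
graph = coauthor_graph(toy_papers)
components = connected_components(graph)
largest_share = len(components[0]) / len(graph) * 100
print(f"largest component: {len(components[0])} of {len(graph)} authors ({largest_share:.2f}%)")
```

Applied to the figures in the text, the same ratio gives 135 / 194, or about 69.59%.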
Among the most collaborative researchers, Li, Chen; Grzegorzek, Marcin; and Sun, Hongzan exhibit the highest total link strengths (50, 49, and 46, respectively). From a temporal perspective, early contributors to this field include Zhang, Yongbin; Yu, Lequn; and Yao, Jiawen, who were active as early as 2023. Researchers who have focused on this field more recently, between 2024 and 2025, include Kuang, Hulin; Zhang, Yudong; and Liu, Jun.
Most of the top 10 most prolific scholars in this area are from China. Pacal, Ishak leads in publication volume with 16 papers, followed by Shi, Jun; Gou, Fangfang; Wu, Jia; and Wang, Jing (15 papers each). In terms of research impact, Tang, Yucheng; Bian, Hao; and Wang, Yifeng have the highest average citation counts per paper, with 379, 140, and 137.4 citations, respectively.
3.6 Analysis of journals and cited journals
Among the 818 sources, 104 have published more than five papers on TICD. Table 3 lists the top 11 sources in terms of publication volume. The majority of these journals are ranked in JCR Q1. The top three journals in terms of publication volume are Biomedical Signal Processing and Control, IEEE Access, and Scientific Reports, which have published 181, 123, and 108 papers, accounting for 6.19%, 4.24%, and 3.69% of the total, respectively. Among the top 11 sources, Medical Image Analysis has the highest total citations (3016) and the highest average citations per paper (62.83), making it the most influential journal in this field.
Columns: Sources, IF.
Publications and citations of the top 11 sources.
In total, the collected papers cite references from 14336 different journals. Setting the threshold for cited journal occurrences at more than 20, we constructed a co-citation network of cited journals (Figure 7A). The top three cited sources are arXiv (9715 citations), Lecture Notes in Computational Science and Engineering (6237 citations), and Conference on Computer Vision and Pattern Recognition (CVPR) (5981 citations). The cited sources form four distinct clusters: the red cluster is primarily related to radiology research; the green cluster is mainly associated with computer-based medical imaging; the blue cluster focuses on the intersection of computer science and biomedical fields; and the yellow cluster is associated with medical physics.

(A) Co-citation relationships between journals. (B) A dual-map overlay of journals on TICD.
Using the overlay maps function in CiteSpace, we visualized the citation relationships between citing and cited journals (Figure 7B). This analysis reveals the academic disciplines where Transformer-related oncology research is published and referenced. Citing journals mainly belong to four disciplines: 1) Medicine, medical, clinical; 2) Molecular biology, immunology; 3) Mathematics, systems, mathematical; 4) Physics, materials, chemistry. Cited journals are primarily concentrated in five disciplines: 1) Psychology, education, social sciences; 2) Health, nursing, medicine; 3) Molecular biology; 4) Systems, computing, computer science; 5) Chemistry, materials, physics.
3.7 Co-occurrence analysis of keywords
We selected keywords that appeared more than five times for visualization analysis using VOSviewer (Figures 8A,C). Among the 6368 keywords, 508 met the selection criteria. The most frequently occurring keywords include deep learning (843), followed by transformer (633), classification (345), vision transformer (326), and cancer (289). The cancer types that have received the most attention from scholars are shown in Table 4, with breast cancer being the most studied (371), followed by brain cancer, skin cancer, and lung cancer.

(A) Co-occurrence analysis of keywords. (B) Temporal evolution of keywords. (C) The density map of TICD research. The size of each word, the size of each circle, and the opacity of the yellow shading are positively related to the co-occurrence frequency.
Types of cancer          Frequency of keyword
breast cancer            371
brain cancer             353
skin cancer              198
lung cancer              181
colorectal cancer        143
liver cancer             69
cervical cancer          65
gastric cancer           27
carcinoma                16
thyroid cancer           15
pancreatic cancer        11
kidney cancer            11
head and neck cancer     10
oral cancer              10
osteosarcoma             10
laryngeal cancer         7
The frequency of keywords related to different types of diseases.
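The keyword selection and co-occurrence counting described in this section can be sketched in a few lines. The example below is a simplified illustration, not VOSviewer's implementation: it counts how many records each keyword appears in, keeps keywords that reach a minimum occurrence threshold, and counts how often each retained pair appears in the same record. The records and the threshold value are hypothetical, and `keyword_cooccurrence` is a name introduced here.

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(records, min_occurrences):
    """Return (occurrence counts, co-occurrence link counts) for keyword lists."""
    # Normalise case and whitespace, and count each keyword once per record.
    normalised = [{k.strip().lower() for k in keywords} for keywords in records]
    occurrences = Counter()
    for keywords in normalised:
        occurrences.update(keywords)
    kept = {k for k, count in occurrences.items() if count >= min_occurrences}
    links = Counter()
    for keywords in normalised:
        # Each unordered pair of retained keywords in a record adds one link.
        links.update(combinations(sorted(keywords & kept), 2))
    return occurrences, links


# Hypothetical keyword lists, one per paper.
toy_records = [
    ["Deep learning", "Transformer", "Breast cancer"],
    ["deep learning", "transformer"],
    ["Transformer", "Segmentation"],
]
occurrences, links = keyword_cooccurrence(toy_records, min_occurrences=2)
print(occurrences.most_common(2))
print(dict(links))
```

Normalising case before counting matters in practice, since author keywords such as "Transformer" and "transformer" would otherwise be split across two nodes.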
Using CiteSpace, we generated a keyword evolution map, which retains eight clusters (Figure 8B). The eight clusters collectively focus on deep learning (DL) applications, especially vision transformers (ViT) and convolutional neural networks (CNNs), in medical oncology, centering on cancer diagnosis, segmentation, and prognosis via medical imaging and computational pathology.
Clusters 0, 2, 7, and 8 focus on DL-based medical image analysis for lung and breast cancer, emphasizing multi-scale feature fusion, boundary enhancement, and ViT-CNN integration to improve diagnostic accuracy via CT and mammographic images. Clusters 1, 5, and 6 specialize in breast cancer research, covering BI-RADS classification, neoadjuvant chemotherapy response assessment, and survival prediction, with innovations in ViT-CNN modules and attention mechanisms. Clusters 3 and 4 extend to computational pathology and cross-modal AI, integrating natural language processing (NLP) with digital pathology for thyroid cancer/lymphatic metastasis detection (Cluster 3) and applying large models (SAM, LLMs) to pulmonary nodule segmentation and colorectal neoplasm staging (Cluster 4). Overall, these clusters reflect a paradigm shift toward transformer-augmented DL in oncology, featuring multi-modal data integration and translational focus. ViT-CNN hybrid architectures and large foundation models are core enablers for advancing AI-driven precision diagnosis and prognosis across diverse cancers.
The evolution of keywords over time also reveals research trends. Terms such as transformer, medical image segmentation, and multiple instance learning appeared earlier, primarily between 2023 and 2024. As research progressed, scholars expanded into various application areas, with keywords like accuracy, explainable AI, medical image analysis, and foundation model becoming more prevalent after 2025.
Keywords whose occurrences are concentrated within a specific period can be considered burst terms, reflecting different stages of development within a field. Using CiteSpace, we identified the top 11 most prominent burst keywords from studies on TICD (Table 5). The most intense burst term was “attention” (strength 5.35, 2021–2023), followed by “survival analysis” (strength 5.02, 2021–2023). More recently, scholars have focused on computer-aided diagnosis and radiomics, indicating emerging research directions.
Keywords             Year    Strength    Begin    End
attention            2021    5.35        2021     2023
survival analysis    2021    5.02        2021     2023
c