Orthognathic surgery, a pivotal treatment in oral and maxillofacial surgery, aims to correct jaw deformities by adjusting the three-dimensional position of the jaws. The rise of virtual surgical planning (VSP) marks a significant milestone in the development of postoperative prediction techniques for orthognathic surgery (Knoops et al., 2018). Among its core modules, the virtual treatment objective (VTO) has demonstrated a mean error of 0.87 ± 0.31 mm between simulated and actual results at 6 months postoperatively (Zhang et al., 2023). Achieving more accurate preoperative prediction and postoperative outcome control, and thereby giving surgeons guidance on digital surgical protocols, has become a central focus for clinicians and researchers (Zhu et al., 2023).
Traditional postoperative prediction methods rely primarily on biomechanical modeling, encompassing three major technical systems: the mass–spring model (MSM), the mass–tensor model (MTM), and the finite element method (FEM). The MSM simulates soft-tissue deformation through a mass–spring network (San Vicente et al., 2009); its low computational cost made it an early research focus, but its linear simplification assumption limits the accuracy with which it predicts complex biomechanical behaviors (Litner et al., 2008). FEM, the most physically realistic of the three techniques, can accurately simulate the cascading deformation of soft tissues induced by bone movement by constructing a nonlinear material model from patient-specific CT data (Reichard et al., 2017). However, the time-consuming simulations and manual meshing required by these conventional methods limit their clinical adoption, confining them primarily to research (Kim et al., 2021).
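To make the linear simplification of the MSM concrete, the following is a minimal illustrative sketch, not the implementation used in any of the cited studies. It assumes Hookean springs with a single stiffness `k`, a simple damped relaxation toward equilibrium, and a hypothetical three-node chain in which node 0 stands in for a repositioned bone attachment; the function names and all numerical values are assumptions chosen for illustration only.

```python
import numpy as np

def spring_forces(pos, edges, rest_len, k):
    """Net linear (Hookean) spring force acting on each node."""
    forces = np.zeros_like(pos)
    for (i, j), l0 in zip(edges, rest_len):
        d = pos[j] - pos[i]
        length = np.linalg.norm(d)
        # Force on node i: toward j when stretched, away when compressed.
        f = k * (length - l0) * d / length
        forces[i] += f
        forces[j] -= f
    return forces

def relax(pos, edges, rest_len, fixed, k=1.0, step=0.1, n_steps=2000):
    """Move free nodes along the net force until they settle (quasi-static)."""
    pos = pos.copy()
    free = np.ones(len(pos), dtype=bool)
    free[np.asarray(fixed, dtype=int)] = False  # fixed nodes never move
    for _ in range(n_steps):
        f = spring_forces(pos, edges, rest_len, k)
        pos[free] += step * f[free]
    return pos

# Hypothetical chain: node 0 (bone attachment), node 1 (soft tissue),
# node 2 (distant anchor). All springs start at their rest length of 1.
nodes = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
edges = [(0, 1), (1, 2)]
rest_len = [1.0, 1.0]

# Simulated bone movement: shift node 0 by 0.5 along x, then relax node 1.
moved = nodes.copy()
moved[0, 0] = 0.5
relaxed = relax(moved, edges, rest_len, fixed=[0, 2])
# With equal stiffness, node 1 settles midway between its neighbours (x ≈ 1.25).
```

Because every spring responds linearly to its elongation, the soft-tissue node shifts in direct proportion to the bone displacement, which is the kind of simplification that the nonlinear material models used in FEM are intended to overcome.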
In recent years, deep learning techniques have demonstrated significant efficiency advantages in postoperative outcome prediction (Sankar et al., 2025; Barone et al., 2025). For instance, Ter Horst et al. (2021) designed a hybrid feature network for mandibular advancement surgery, which directly integrates a parametric characterization of mandibular movement with global geometric features from facial point clouds to predict soft-tissue deformation. Similarly, Lampen et al. (2022) implemented a graph convolutional network for fast inference of soft-tissue deformation. Furthermore, Gong et al. (2025) built a conditional generative adversarial network (CGAN) that predicts post-treatment changes in lateral appearance by learning the relationship between soft- and hard-tissue changes on lateral radiographs. While such models offer computational efficiency with reduced complexity, they often overlook individualized craniofacial relationships, use only the vertex set of the facial mesh without accounting for its intrinsic structure, disregard the biomechanical relationship between bones and soft tissues, and fail to achieve subject-specific transformations, all of which increase prediction errors for local deformation details.
In this context, joint face–skeletal prediction has progressively become a central research focus. Its emergence signifies a paradigm shift in orthognathic surgery, from ‘bone-driven’ to ‘bone–soft tissue synergy-driven’. While VSP accurately adjusts bone position, it often treats soft-tissue deformation as a passive outcome of bone movement, thereby overlooking the dynamic feedback mechanism between these two structures. Joint prediction, by explicitly modeling the spatial–mechanical interactions between the skeleton and the face, has emerged as a core strategy for bridging the gap between prediction accuracy and clinical applicability, combining data-driven efficiency with the physical constraints of biomechanics.
Early attempts in this domain include FC-Net (Ma et al., 2021), which pioneered joint input of skeletal motion and facial point clouds. Subsequent advancements, such as ACMT-Net (Fang et al., 2022), introduced dynamic attention mechanisms to explicitly quantify mechanical impacts between bone and soft tissues. More recent models like P2P-ConvGC (Bao et al., 2024) and DGCFP (Huang et al., 2025) further integrated global context and dual-space convolutions to enhance feature extraction and reduce prediction errors. These innovations collectively represent significant progress in addressing the limitations of single-modality approaches and traditional biomechanical models.
Despite this progress, existing reviews (Almarhoumi, 2024; Salazar et al., 2024) do not assess joint prediction comprehensively and instead focus predominantly on single-modality or independent prediction methods. These methods ignore the intrinsic connection between facial and internal skeletal structures, use only the vertex set of the facial mesh without considering its inherent mesh structure, or predict only facial soft tissue without predicting skeletal changes, and are thus unable to provide effective guidance for VSP. Therefore, the aim of this systematic review was to evaluate the development of artificial intelligence in the joint prediction of facial soft-tissue and skeletal changes after orthognathic surgery, and to comprehensively assess its prediction accuracy and clinical usability.