Regulating genome language models: navigating policy challenges at the intersection of AI and genetics

Acerbi A, Stubbersfield JM (2023) Large language models show human-like content biases in transmission chain experiments. Proc Natl Acad Sci 120(44):e2313790120. https://doi.org/10.1073/pnas.2313790120

Aditya H, Chawla S, Dhingra G, Rai P, Sood S, Singh T, Wase ZM, Bahga A, Madisetti VK (2024) Evaluating privacy leakage and memorization attacks on large language models (LLMs) in generative AI applications. J Softw Eng Appl 17(5):421–447. https://doi.org/10.4236/jsea.2024.175023

Aitken M, Leslie D, Ostmann F, Pratt J, Margetts H, Dorobantu C (2022) Common regulatory capacity for AI. The Alan Turing Institute. https://doi.org/10.5281/zenodo.6838926

Allen JG, Loo J, Campoverde JLL (2025) Governing intelligence: Singapore’s evolving AI governance framework. Cambridge Forum AI Law Govern 1(January):e12. https://doi.org/10.1017/cfl.2024.12

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215(3):403–410. https://doi.org/10.1016/S0022-2836(05)80360-2

Andrews C (2025) European Commission withdraws AI Liability Directive from consideration. IAPP. https://iapp.org/news/a/european-commission-withdraws-ai-liability-directive-from-consideration

Arner DW, Castellano GG, Selga EK (2022) The transnational data governance problem. Berkeley Technol Law J 37(2):623–700

Asim MN, Ibrahim MA, Zaib A, Dengel A (2025) DNA sequence analysis landscape: a comprehensive review of DNA sequence analysis task types, databases, datasets, word embedding methods, and language models. Front Med 12(April):1503229. https://doi.org/10.3389/fmed.2025.1503229

Avsec E, Blatnik A, Krajc M (2025) Secondary findings in hereditary cancer genes after germline genetic testing—systematic review of literature. Human Genet. https://doi.org/10.1007/s00439-025-02746-w

Ayoub NF, Balakrishnan K, Ayoub MS, Barrett TF, David AP, Gray ST (2024) Inherent bias in large language models: a random sampling analysis. Mayo Clin Proc Digit Health 2(2):186–191. https://doi.org/10.1016/j.mcpdig.2024.03.003

Babic B, Gerke S, Evgeniou T, Cohen IG (2021) Beware explanations from AI in health care. Science 373(6552):284–286. https://doi.org/10.1126/science.abg1834

Bailey TL, Boden M, Buske FA, Frith M, Grant CE, Clementi L, Ren J, Li WW, Noble WS (2009) MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res. https://doi.org/10.1093/nar/gkp335

Balasubramaniam N, Kauppinen M, Rannisto A, Hiekkanen K, Kujala S (2023) Transparency and explainability of AI systems: from ethical guidelines to requirements. Inf Softw Technol 159(July):107197. https://doi.org/10.1016/j.infsof.2023.107197

Barda N, Yona G, Rothblum GN, Greenland P, Leibowitz M, Balicer R, Bachmat E, Dagan N (2021) Addressing bias in prediction models by improving subpopulation calibration. J Am Med Inform Assoc 28(3):549–558. https://doi.org/10.1093/jamia/ocaa283

Barrance E, Kazim E, Trengove M, Zannone S, Koshiyama A (2022) Overview and commentary of the CDEI’s extended roadmap to an effective AI assurance ecosystem. Front Artif Intell. https://doi.org/10.3389/frai.2022.932358

Batool A, Zowghi D, Bano M (2025) AI governance: a systematic literature review. AI Ethics. https://doi.org/10.1007/s43681-024-00653-w

Battey CJ, Ralph PL, Kern AD (2020) Predicting geographic location from genetic variation with deep neural networks. eLife. https://doi.org/10.7554/eLife.54507

Benegas G, Batra SS, Song YS (2023) DNA language models are powerful predictors of genome-wide variant effects. Proc Natl Acad Sci 120(44):e2311219120. https://doi.org/10.1073/pnas.2311219120

Benegas G, Albors C, Aw AJ, Ye C, Song YS (2024) GPN-MSA: an alignment-based DNA language model for genome-wide variant effect prediction. bioRxiv. https://doi.org/10.1101/2023.10.10.561776

Bilkey GA, Burns BL, Coles EP, Bowman FL, Beilby JP, Pachter NS, Baynam G, Dawkins HJS, Nowak KJ, Weeramanthri TS (2019) Genomic testing for human health and disease across the life cycle: applications and ethical, legal, and social challenges. Front Public Health. https://doi.org/10.3389/fpubh.2019.00040

Bommasani R, Hudson DA, Adeli E, Altman R, Arora S, von Arx S, Bernstein MS et al. (2021) On the opportunities and risks of foundation models. arXiv:2108.07258. arXiv. https://doi.org/10.48550/arXiv.2108.07258

Bonomi L, Huang Y, Ohno-Machado L (2020) Privacy challenges and research opportunities for genomic data sharing. Nat Genet 52(7):646–654. https://doi.org/10.1038/s41588-020-0651-0

Boshar S, Trop E, de Almeida BP, Copoiu L, Pierrot T (2024) Are genomic language models all you need? Exploring genomic language models on protein downstream tasks. Bioinformatics 40(9):btae529. https://doi.org/10.1093/bioinformatics/btae529

Brown S, Davidovic J, Hasan A (2021) The algorithm audit: scoring the algorithms that score us. Big Data Soc 8(1):2053951720983865. https://doi.org/10.1177/2053951720983865

Buiten MC (2021) ‘Your DNA is one click away’: the GDPR and direct-to-consumer genetic testing. In: Mathis K, Tor A (eds) Consumer law and economics. Springer International Publishing, Cham, pp 205–223. https://doi.org/10.1007/978-3-030-49028-7_10

Buocz T, Pfotenhauer S, Eisenberger I (2023) Regulatory sandboxes in the AI act: reconciling innovation and safety? Law Innov Technol 15(2):357–389. https://doi.org/10.1080/17579961.2023.2245678

Cahyawijaya S, Yu T, Liu Z, Zhou X, Mak TWT, Ip YYN, Fung P (2022) SNP2Vec: scalable self-supervised pre-training for genome-wide association study. In: Demner-Fushman D, Cohen KB, Ananiadou S, Tsujii J (eds) Proceedings of the 21st workshop on biomedical language processing. Association for Computational Linguistics, Dublin, Ireland, pp 140–154. https://doi.org/10.18653/v1/2022.bionlp-1.14

Cestonaro C, Delicati A, Marcante B, Caenazzo L, Tozzo P (2023) Defining medical liability when artificial intelligence is applied on diagnostic algorithms: a systematic review. Front Med. https://doi.org/10.3389/fmed.2023.1305756

Cheng Le, Ming Hu, Hong T (2025) Profiling elements, risks, and governance of artificial intelligence: implications from DeepSeek. Int J Digit Law Govern. https://doi.org/10.1515/ijdlg-2025-0008

Cihon P, Kleinaltenkamp MJ, Schuett J, Baum SD (2021) AI certification: advancing ethical practice by reducing information asymmetries. IEEE Trans Technol Soc 2(4):200–209. https://doi.org/10.1109/TTS.2021.3077595

Consens ME, Li B, Poetsch AR, Gilbert S (2025) Genomic language models could transform medicine but not yet. Npj Digit Med 8(1):1–4. https://doi.org/10.1038/s41746-025-01603-4

Contractor D, McDuff D, Haines JK, Lee J, Hines C, Hecht B, Vincent N, Li H (2022) Behavioral use licensing for responsible AI. In: Proceedings of the 2022 ACM conference on fairness, accountability, and transparency (FAccT ’22). Association for Computing Machinery, New York, NY, USA, pp 778–788. https://doi.org/10.1145/3531146.3533143

Cohen IG, Mello MM (2018) HIPAA and protecting health information in the 21st Century. JAMA 320(3):231–232. https://doi.org/10.1001/jama.2018.5630

Corrêa NK, Galvão C, Santos JW, Pino CD, Pinto EP, Barbosa C, Massmann D, Mambrini R, Galvão L, Terem E, de Oliveira N (2023) Worldwide AI ethics: a review of 200 guidelines and recommendations for AI governance. Patterns 4:100857. https://doi.org/10.1016/j.patter.2023.100857

Cui J, Araujo DA (2024) Rethinking use-restricted open-source licenses for regulating abuse of generative models. Big Data Soc 11(1):20539517241229699. https://doi.org/10.1177/20539517241229699

da Fonseca AT, Vaz de Sequeira E, Barreto Xavier L (2024) Liability for AI driven systems. In: Sousa Antunes H, Freitas PM, Oliveira AL, Martins Pereira C, Vaz de Sequeira E, Barreto Xavier L (eds) Multidisciplinary perspectives on artificial intelligence and the law. Springer International Publishing, Cham, pp 299–317. https://doi.org/10.1007/978-3-031-41264-6_16

Dalla-Torre H, Gonzalez L, Mendoza-Revilla J, Lopez Carranza N, Grzywaczewski AH, Oteri F, Pierrot T (2025) Nucleotide transformer: building and evaluating robust foundation models for human genomics. Nat Methods 22(2):287–297. https://doi.org/10.1101/2023.01.11.523679

Das BC, Amini MH, Wu Y (2025) Security and privacy challenges of large language models: a survey. ACM Comput Surv 57(6):152:1–152:39. https://doi.org/10.1145/3712001

Demajo S, Ramis-Zaldivar JE, Muiños F, Grau ML, Andrianova M, López-Bigas N, González-Pérez A (2024) Identification of clonal hematopoiesis driver mutations through in silico saturation mutagenesis. Cancer Discov 14(9):1717–1731. https://doi.org/10.1158/2159-8290.CD-23-1416
