List of Preprints and Publications


* Co-first author
# Corresponding author


Selected preprints

  1. Forbes AN*, Xu D*, Cohen S, Pancholi P, Khurana E#. 2022
    Discovery of novel therapeutic targets in cancer using patient-specific gene regulatory networks. bioRxiv



Peer-reviewed publications



  1. Xu D*, Forbes AN*, Cohen S, Palladino A, Karadimitriou T, Khurana E#. 2023
    Recapitulation of patient-specific 3D chromatin conformation using machine learning. Cell Reports Methods 100578. (Also see the Preview)

  2. Li D, Zhan Y, Wang N, Tang F, Lee CJ, Bayshtok G, Moore AR, Wong EWP, Pachai MR, Xie Y, Sher J, Zhao JL, Khudoynazarova M, Gopalan A, Chan J, Khurana E, Shepherd P, Navone NM, Chi P, Chen Y. 2023
    ETV4 mediates dosage-dependent prostate tumor initiation and cooperates with p53 loss to generate prostate cancer. Science Advances 9(14).

  3. Martinez-Fundichely A, Dixon A, Khurana E#. 2022
    Modeling tissue-specific breakpoint proximity of structural variations from whole-genomes to identify cancer drivers. Nature Communications 13, 5640.

  4. Yan J, Chen Y, Patel AJ, Warda S, Lee CJ, Nixon BG, Wong EW, Miranda-Román MA, Yang N, Wang Y, Pachai MR, Sher J, Giff E, Tang F, Khurana E, Singer S, Liu Y, Galbo PM Jr, Maag JL, Koche RP, Zheng D, Antonescu C, Deng L, Li M, Chen Y, Chi P. 2022
    Tumor-intrinsic PRC2 inactivation drives a context-dependent immune-desert microenvironment and is sensitized by immunogenic therapeutic viruses. J Clin Invest e153437.

  5. Tang F*, Xu D*, Wang S*, Wong CK*, Martinez-Fundichely A, Lee CJ, Cohen S, Park J, Hill CE, Eng K, Bareja R, Han T, Liu EM, Palladino A, Di W, Gao D, Abida W, Beg S, Puca L, Meneses M, Stanchina ED, Berger MF, Gopalan A, Dow LE, Mosquera JM, Beltran H, Sternberg CN, Chi P, Scher HI, Sboner A, Chen Y#, Khurana E#. 2022
    Chromatin profiles classify castration-resistant prostate cancers suggesting therapeutic targets. Science. 376(6596).

  6. Aguiar-Pulido V, Wolujewicz P, Martinez-Fundichely A, Elhaik E, Thareja G, AbdelAleem A, Chalhoub N, Cuykendall T, Al-Zamer J, Lei Y, El-Bashir H, Musser J, Al-Kaabi A, Shaw G, Khurana E, Suhre K, Mason E, Elemento O, Finnell H, Ross E. 2021
    Systems biology analysis of human genomes points to key pathways conferring spina bifida risk. Proc Natl Acad Sci U S A 118(51), e2106844118.

  7. Baggiolini A, Callahan S, Montal E, Weiss J, Trieu T, Tagore M, Tischfield S, Walsh R, Suresh S, Fan Y, Campbell N, Perlee S, Saurat N, Hunter M, Simon-Vermot T, Huang T, Ma Y, Hollmann T, Tickoo S, Taylor B, Khurana E, Koche R, Studer L, White R. 2021
    Developmental chromatin programs determine oncogenic competence in melanoma. Science. 373(6559).

  8. Carrot-Zhang J, Yao X, Devarakonda S, Deshpande A, Damrauer JS, Silva TC, Wong CK, Choi HY, Felau I, Robertson AG, Castro MAA, Bao L, Rheinbay E, Liu EM, Trieu T, Haan D, Yau C, Hinoue T, Liu Y, Shapira O, Kumar K, Mungall KL, Zhang H, Lee JJ, Berger A, Gao GF, Zhitomirsky B, Liang WW, Zhou M, Moorthi S, Berger AH, Collisson EA, Zody MC, Ding L, Cherniack AD, Getz G, Elemento O, Benz CC, Stuart J, Zenklusen JC, Beroukhim R, Chang JC, Campbell JD, Hayes DN, Yang L, Laird PW, Weinstein JN, Kwiatkowski DJ, Tsao MS, Travis WD, Khurana E, Berman BP, Hoadley KA, Robine N; TCGA Research Network, Meyerson M, Govindan R, Imielinski M. 2021
    Whole-genome characterization of lung adenocarcinomas lacking the RTK/RAS/RAF pathway. Cell Rep. 108707.

  9. Liu EM, Martinez-Fundichely A, Bollapragada R, Spiewack M, Khurana E#. 2021
    CNCDatabase: a database of non-coding cancer drivers. Nucleic Acids Research 49(D1), D1094–D1101.

  10. Han T, Goswami S, Hu Y, Tang F, Zafra MP, Murphy C, Cao Z, Poirier JT, Khurana E, Elemento O, Hechtman JF, Ganesh K, Yaeger R, Dow LE. 2020
    Lineage reversion drives WNT independence in intestinal cancer. Cancer Discov CD-19-1536.

  11. Xu D, Gokcumen O, Khurana E#. 2020
    Loss-of-function tolerance of enhancers in the human genome. PLoS Genetics 16(4), e1008663.

  12. Trieu T#, Martinez-Fundichely A, Khurana E#. 2020
    A deep learning approach to predict the impact of non-coding sequence variants on 3D chromatin structure. Genome Biology 21(1), 79.

  13. Kumar S, Warrell J, Li S, McGillivray P, Meyerson W, Salichos L, Harmanci A, Martinez-Fundichely A, Chan C, Nielsen M, Lochovsky L, Zhang Y, Li X, Pedersen J, Herrmann C, Getz G, Khurana E, Gerstein M. 2020
    Passenger mutations in 2500 cancer genomes: Overall molecular functional impact and consequences. Cell 180(5), 915–927.e16.

  14. ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium. 2020
    Pan-cancer Analysis of Whole Genomes. Nature 578, 82-93.

  15. Rheinbay E, Nielsen MM, Abascal F, Wala JA, Shapira O, Tiao G, Hornshøj H, Hess JM, Juul RI, Lin Z, Feuerbach L, Sabarinathan R, Madsen T, Kim J, Mularoni L, Shuai S, Lanzós A, Herrmann C, Maruvka YE, Shen C, Amin SB, Bandopadhayay P, Bertl J, Boroevich KA, Busanovich J, Carlevaro-Fita J, Chakravarty D, Chan CWY, Craft D, Dhingra P, Diamanti K, Fonseca NA, Gonzalez-Perez A, Guo Q, Hamilton MP, Haradhvala NJ, Hong C, Isaev K, Johnson TA, Juul M, Kahles A, Kahraman A, Kim Y, Komorowski J, Kumar K, Kumar S, Lee D, Lehmann KV, Li Y, Liu EM, Lochovsky L, Park K, Pich O, Roberts ND, Saksena G, Schumacher SE, Sidiropoulos N, Sieverling L, Sinnott-Armstrong N, Stewart C, Tamborero D, Tubio JMC, Umer HM, Uusküla-Reimand L, Wadelius C, Wadi L, Yao X, Zhang CZ, Zhang J, Haber JE, Hobolth A, Imielinski M, Kellis M, Lawrence MS, von Mering C, Nakagawa H, Raphael BJ, Rubin MA, Sander C, Stein LD, Stuart JM, Tsunoda T, Wheeler DA, Johnson R, Reimand J, Gerstein M, Khurana E, Campbell PJ, López-Bigas N; PCAWG Drivers and Functional Interpretation Working Group; PCAWG Structural Variation Working Group, Weischenfeldt J, Beroukhim R, Martincorena I, Pedersen JS, Getz G; PCAWG Consortium. 2020
    Analyses of Non-Coding Somatic Drivers in 2,658 Cancer Whole Genomes. Nature 578, 102-111.

  16. Reyna MA, Haan D, Paczkowska M, Verbeke LPC, Vazquez M, Kahraman A, Pulido-Tamayo S, Barenboim J, Wadi L, Dhingra P, Shrestha R, Getz G, Lawrence MS, Pedersen JS, Rubin MA, Wheeler DA, Brunak S, Izarzugaza JMG, Khurana E, Marchal K, von Mering C, Sahinalp SC, Valencia A; PCAWG Drivers and Functional Interpretation Working Group, Reimand J, Stuart JM, Raphael BJ; PCAWG Consortium. 2020
    Pathway and network analysis of more than 2500 whole cancer genomes. Nature Communications 11, 729

  17. Li Y, Roberts ND, Wala JA, Shapira O, Schumacher SE, Kumar K, Khurana E, Waszak S, Korbel JO, Haber JE, Imielinski M; PCAWG Structural Variation Working Group, Weischenfeldt J, Beroukhim R, Campbell PJ; PCAWG Consortium. 2020
    Patterns of somatic structural variation in human cancer genomes. Nature 578, 112-121

  18. Liu EM*, Martinez-Fundichely A*, Diaz BJ, Aronson B, Cuykendall T, MacKay M, Dhingra P, Wong E, Chi P, Apostolou E, Sanjana NE, Khurana E#. 2019
    Identification of cancer drivers at CTCF insulators in 1,962 whole-genomes. Cell Systems 8, 446-455.e8

  19. Bailey M, Tokheim C, Porta-Pardo E, Sengupta S, Bertrand D, Weerasinghe A, Colaprico A, Wendl M, Kim J, Reardon B, Ng P, Jeong K, Cao S, Wang Z, Gao J, Gao Q, Wang F, Liu EM, Mularoni L, Rubio-Perez C, Nagarajan N, Cortes-Ciriano I, Zhou D, Liang W, Hess J, Yellapantula V, Tamborero D, Gonzalez-Perez A, Suphavilai C, Ko J, Khurana E, Park P, Van Allen E, Liang H, The MC3 Working Group, The Cancer Genome Atlas Research Network, Lawrence M, Godzik A, Lopez-Bigas N, Stuart J, Wheeler D, Getz G, Chen K, Lazar A, Mills G, Karchin R, Ding L. 2018
    Comprehensive Characterization of Cancer Driver Genes and Mutations. Cell 173, 371–385.e18

  20. Backenroth D, He Z, Kiryluk K, Boeva V, Petukhova L, Khurana E, Christiano A, Buxbaum J, Ionita-Laza I. 2018
    FUN-LDA: A latent Dirichlet allocation model for predicting tissue-specific functional effects of noncoding variation. American Journal of Human Genetics 102, 920–942

  21. Kim J, Geyer F, Martelotto L, Ng C, Lim R, Selenica P, Li A, Pareja F, Fusco N, Edelweiss M, Kumar R, Forbes A, Khurana E, Mariani O, Badve S, Vincent-Salomon A, Norton L, Reis-Filho J, Weigelt B. 2018
    MYBL1 rearrangements and MYB amplification in breast adenoid cystic carcinomas lacking the MYB-NFIB fusion gene. The Journal of Pathology 244, 143–150

  22. Dhingra P, Martinez-Fundichely A, Berger A, Huang F, Forbes A, Liu M, Liu D, Sboner A, Tamayo P, Rickman D#, Rubin M, Khurana E#. 2017
    Identification of novel prostate cancer drivers using RegNetDriver: A framework for integration of genetic and epigenetic alterations with tissue-specific regulatory network. Genome Biology 18, 141 (Also see RegNetDriver)

    Selected for 'Top 10 Papers Reading List' in Regulatory & Systems Genomics by RECOMB/ISCB

  23. Romanel A, Garritano S, Stringa B, Blattner M, Dalfovo D, Chakravarty D, Soong D, Cotter K, Petris G, Dhingra P, Gasperini P, Cereseto A, Elemento O, Sboner A, Khurana E, Inga A, Rubin M, Demichelis F. 2017
    Inherited determinants of early recurrent somatic mutations in prostate cancer. Nature Communications 8, 48

  24. Feigin M, Garvin T, Bailey P, Waddell N, Chang D, Kelley D, Shuai S, Gallinger S, McPherson J, Grimmond S, Khurana E, Stein L, Biankin A, Schatz M, Tuveson D. 2017
    Recurrent noncoding regulatory mutations in pancreatic ductal adenocarcinoma. Nature Genetics 49, 825–833

  25. Dhingra P, Fu Y, Gerstein M#, Khurana E#. 2017
    Using FunSeq2 for coding and noncoding variant annotation and prioritization. Current Protocols in Bioinformatics 57, 15.11.1–15.11.17

  26. Cuykendall T, Rubin M, Khurana E#. 2017
    Non-coding genetic variation in cancer. Current Opinion in Systems Biology 1, 9–15

  27. Khurana E#. 2016
    Cancer Genomics: Hard-to-reach repairs. Nature 532, 181–182

  28. Khurana E#, Fu Y, Chakravarty D, Demichelis F, Rubin M#, Gerstein M#. 2016
    Role of non-coding sequence variants in cancer. Nature Reviews Genetics 17, 93–108

  29. The Cancer Genome Atlas Research Network. 2015
    The molecular taxonomy of primary prostate cancer. Cell 163, 1011–1025.

  30. The 1000 Genomes Project Consortium. 2015
    A global reference for human genetic variation. Nature 526, 68.

  31. Lochovsky L, Zhang J, Fu Y, Khurana E, Gerstein M. 2015
    LARVA: An integrative framework for large-scale analysis of recurrent variants in noncoding annotations. Nucleic Acids Res. 43(17), 8123.

  32. Fu Y, Liu Z, Lu S, Bedford J, Mu X, Yip K, Khurana E#, Gerstein M#. 2014
    FunSeq2: A framework for prioritizing noncoding regulatory variants in cancer. Genome Biology 15, 480. (co-senior author). (Also see FunSeq with new data context)

  33. Talbert-Slagle K, Atkins KE, Yan KK, Khurana E, Gerstein M, Bradley EH, Berg D, Galvani AP, Townsend J. 2014
    Cellular Superspreaders: An Epidemiological Perspective on HIV Infection inside the Body. PLoS Pathogens 10(5), e1004092.

  34. Khurana E#. 2013
    Learning to swim in a sea of genomic data. Genome Biology 14, 315.

  35. Khurana E*, Fu Y*, Colonna V*, Mu X*, Kang HM, Lappalainen T, Sboner A, Lochovsky L, Chen J, Harmanci A, Das J, Abyzov A, Balasubramanian S, Beal K, Chakravarty D, Challis D, Chen Y, Clarke D, Clarke L, Cunningham F, Evani U, Flicek P, Fragoza R, Garrison E, Gibbs R, Gumus Z, Herrero J, Kitabayashi N, Kong Y, Lage K, Liluashvili V, Lipkin S, MacArthur D, Marth G, Muzny D, Pers T, Ritchie G, Rosenfeld J, Sisu C, Wei X, Wilson M, Xue Y, Yu F, 1000 Genomes Project Consortium, Dermitzakis E, Yu H, Rubin M, Tyler-Smith C, Gerstein M. 2013
    Integrative annotation of variants from 1092 humans: application to cancer genomics. Science 342, 84. (Also see FunSeq with new data context)

    Research Highlight in Nature and Nature Genetics

  36. Khurana E*, Fu Y*, Chen J, Gerstein M. 2013
    Interpretation of genomic variants using a unified biological network approach. PLoS Computational Biology 9(3), e1002886.

  37. The 1000 Genomes Project Consortium. 2012
    An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56-65.

  38. Habegger L, Balasubramanian S, Chen D, Khurana E, Sboner A, Harmanci A, Rozowsky J, Clarke D, Snyder M, Gerstein M. 2012
    VAT: A computational framework to functionally annotate variants in personal genomes within a cloud-computing environment. Bioinformatics 28, 2267-2269.

  39. The ENCODE Project Consortium. 2012
    An Integrated Encyclopedia of DNA Elements in the Human Genome. Nature 489, 57-74.

  40. Gerstein M*, Kundaje A*, Hariharan M*, Landt S*, Yan K*, Cheng C*, Mu X*, Khurana E*, Rozowsky J*, Alexander R*, Min R*, Alves P*, Abyzov A, Addleman N, Bhardwaj N, Boyle A, Cayting P, Charos A, Chen D, Cheng Y, Clarke D, Eastman C, Euskirchen G, Frietze S, Fu Y, Gertz J, Grubert F, Harmanci A, Jain P, Kasowski M, Lacroute P, Leng J, Lian J, Monahan H, O'Geen H, Ouyang Z, Partridge E, Patacsil D, Pauli F, Raha D, Ramirez L, Reddy T, Reed B, Shi M, Slifer T, Wang J, Wu L, Yang X, Yip K, Zilberman-Schapira G, Batzoglou S, Sidow A, Farnham P, Myers R, Weissman S, Snyder M. 2012
    Architecture of the human regulatory network derived from ENCODE data. Nature 489, 91-100.

  41. MacArthur DG, Balasubramanian S, Frankish A, Huang N, Morris J, Walter K, Jostins L, Habegger L, Pickrell JK, Montgomery SB, Albers CA, Zhang ZD, Conrad DF, Lunter G, Zheng H, Ayub Q, DePristo MA, Banks E, Hu M, Handsaker RE, Rosenfeld JA, Fromer M, Jin M, Mu XJ, Khurana E, Ye K, Kay M, Saunders GI, Suner MM, Hunt T, Barnes IH, Amid C, Carvalho-Silva DR, Bignell AH, Snow C, Yngvadottir B, Bumpstead S, Cooper DN, Xue Y, Romero IG; 1000 Genomes Project Consortium, Wang J, Li Y, Gibbs RA, McCarroll SA, Dermitzakis ET, Pritchard JK, Barrett JC, Harrow J, Hurles ME, Gerstein MB, Tyler-Smith C. 2012
    A systematic survey of loss-of-function variants in human protein-coding genes. Science 335, 823-828.

  42. The ENCODE Project Consortium. 2011
    A User's Guide to the Encyclopedia of DNA elements. PLoS Biology 9, e1001046.

  43. Mills RE, Walter K, Stewart C, Handsaker RE, Chen K, Alkan C, Abyzov A, Yoon SC, Ye K, Cheetham RK, Chinwalla A, Conrad DF, Fu Y, Grubert F, Hajirasouliha I, Hormozdiari F, Iakoucheva LM, Iqbal Z, Kang S, Kidd JM, Konkel MK, Korn J, Khurana E, Kural D, Lam HYK, Leng J, Li R, Li Y, Lin CY, Luo R, Mu X, Nemesh J, Peckham HE, Rausch T, Scally A, Shi X, Stromberg MP, Stütz AM, Urban AE, Walker JA, Wu J, Zhang Y, Zhang ZD, Batzer MA, Ding L, Marth GT, McVean G, Sebat J, Snyder M, Wang J, Ye K, Eichler EE, Gerstein MB, Hurles ME, Lee C, McCarroll SA, Korbel JO & 1000 Genomes Project. 2011.
    Mapping copy number variation by population-scale genome sequencing. Nature 470, 59-65.

  44. Lu ZJ, Yip KY, Wang G, Shou C, Hillier LW, Khurana E, Agarwal A, Auerbach R, Rozowsky J, Cheng C, Kato M, Miller DM, Slack F, Snyder M, Waterston RH, Reinke V, Gerstein M. 2011.
    Prediction and characterization of non-coding RNAs in C. elegans by integrating conservation, secondary structure and high throughput sequencing and array data. Genome Res. 21, 276-285.

  45. Khurana E#, DeVane RH, Dal Peraro M, Klein ML#. 2011.
    Computational study of drug binding to the membrane-bound tetrameric M2 peptide bundle from influenza A virus. Biochim Biophys Acta 1808(2), 530-537.

  46. Gerstein MB, Lu ZJ, Van Nostrand EL, Cheng C, Arshinoff BI, Liu T, Yip KY, Robilotto R, Rechtsteiner A, Ikegami K, Alves P, Chateigner A, Perry M, Morris M, Auerbach RK, Feng X, Leng J, Vielle A, Niu W, Rhrissorrakrai K, Agarwal A, Alexander RP, Barber G, Brdlik CM, Brennan J, Brouillet JJ, Carr A, Cheung MS, Clawson H, Contrino S, Dannenberg LO, Dernburg AF, Desai A, Dick L, Dosé AC, Du J, Egelhofer T, Ercan S, Euskirchen G, Ewing B, Feingold EA, Gassmann R, Good PJ, Green P, Gullier F, Gutwein M, Guyer MS, Habegger L, Han T, Henikoff JG, Henz SR, Hinrichs A, Holster H, Hyman T, Iniguez AL, Janette J, Jensen M, Kato M, Kent WJ, Kephart E, Khivansara V, Khurana E, Kim JK, Kolasinska-Zwierz P, Lai EC, Latorre I, Leahey A, Lewis S, Lloyd P, Lochovsky L, Lowdon RF, Lubling Y, Lyne R, Maccoss M, Mackowiak SD, Mangone M, McKay S, Mecenas D, Merrihew G, Miller DM 3rd, Muroyama A, Murray JI, Ooi SL, Pham H, Phippen T, Preston EA, Rajewsky N, Rätsch G, Rosenbaum H, Rozowsky J, Rutherford K, Ruzanov P, Sarov M, Sasidharan R, Sboner A, Scheid P, Segal E, Shin H, Shou C, Slack FJ, Slightam C, Smith R, Spencer WC, Stinson EO, Taing S, Takasaki T, Vafeados D, Voronina K, Wang G, Washington NL, Whittle CM, Wu B, Yan KK, Zeller G, Zha Z, Zhong M, Zhou X; modENCODE Consortium, Ahringer J, Strome S, Gunsalus KC, Micklem G, Liu XS, Reinke V, Kim SK, Hillier LW, Henikoff S, Piano F, Snyder M, Stein L, Lieb JD, Waterston RH. 2010.
    Integrative Analysis of the Caenorhabditis elegans Genome by the modENCODE Project. Science 330(6012), 1775-1787.

  47. 1000 Genomes Project Consortium. 2010.
    A map of human genome variation from population-scale sequencing. Nature 467(7319), 1061-1073.

  48. Khurana E, Lam HY, Cheng C, Carriero N, Cayting P, Gerstein MB. 2010.
    Segmental duplications in the human genome reveal details of pseudogene formation. Nucleic Acids Res. 38(20), 6997-7007.

  49. Holford ME, Khurana E, Cheung KH, Gerstein M. 2010.
    Using semantic web rules to reason on an ontology of pseudogenes. Bioinformatics 26(12), i71-78.

  50. Arinaminpathy Y*, Khurana E*#, Engelman DM, Gerstein MB#. 2009.
    Computational Analysis of membrane proteins: the largest class of drug targets. Drug Discov Today 14(23/24), 1130-1135.

  51. Liu YJ, Zheng D, Balasubramanian S, Carriero N, Khurana E, Robilotto R, Gerstein MB. 2009.
    Comprehensive analysis of the pseudogenes of glycolytic enzymes in vertebrates: the anomalously high number of GAPDH pseudogenes highlight a recent burst of retrotranspositional activity. BMC Genomics 10, 480-492.

  52. Lam HY, Khurana E, Fang G, Cayting P, Carriero N, Cheung KH, Gerstein MB. 2009.
    Pseudofam: the pseudogene families database. Nucleic Acids Res. 37(Database issue), D738-D743.

  53. Talbert-Slagle K, Marlatt S, Barrera FN, Khurana E, Oates JE, Gerstein M, Engelman DM, Dixon A, Dimaio D. 2009.
    Artificial transmembrane oncoproteins smaller than the bovine papillomavirus E5 protein redefine sequence requirements for activation of the platelet derived growth factor β receptor. J Virol. 83(19), 9773-9785.

  54. Khurana E#, Dal Peraro M#, DeVane RH, Vemparala S, DeGrado WF#, Klein ML. 2009.
    Molecular dynamics calculations suggest a conduction mechanism for the M2 proton channel from influenza A virus. Proc. Natl. Acad. Sci. USA 106(4), 1069-1074.

  55. Khurana E#, DeVane RH, Kohlmeyer A, Klein ML. 2008.
    Probing peptide nanotube self-assembly at a liquid-liquid interface with coarse-grained molecular dynamics. Nano Lett. 8(11), 3626-3630.

  56. Khurana E#, Nielsen SO, Ensing B, Klein ML. 2006.
    Self-assembling cyclic peptides: molecular dynamics studies of dimers in polar and nonpolar solvents. J Phys Chem B. 110(38), 18965-18972

  57. Khurana E#, Nielsen SO, Klein ML. 2006.
    Gemini surfactants at the air/water interface: a fully atomistic molecular dynamics study. J Phys Chem B. 110(44), 22136-22142.

  58. Dutta S, Singhal P, Agrawal P, Tomer R, Kritee K, Khurana E, Jayaram B. 2006.
    A physicochemical model for analyzing DNA sequences. J Chem Inf Model. 46(1), 78-85.