Yangtian Luo

Assistant Professor of Instruction

Northwestern University · Literature

Active 1995–2024

h-index 71

Citations 21.4k

Papers 692 · 369 last 5y

Funding $47.4M · 2 active

About

Yangtian Luo is an Assistant Professor of Instruction in the Department of Asian Languages and Cultures at Northwestern University. She holds a Ph.D. in Chinese Linguistics from the University of Wisconsin-Madison. Her research specializes in Chinese Linguistics, Second Language Acquisition, and Teaching Chinese as a Second Language. Prior to joining Northwestern, she served as a Lecturer of Chinese at Lawrence University, an M.A. Fellow in Chinese at the University of North Carolina-Charlotte, and an Instructor at the University of Wisconsin-Madison. These roles helped her refine her teaching skills and expand her understanding of language and cultural studies. Yangtian's commitment to teaching has been recognized with the Honorable Mention Early-career Teaching Award at Lawrence University and the Honored Instructor title at the University of Wisconsin-Madison. Her scholarly work includes publications in the International Journal of Chinese Linguistics and a study examining the impact of COVID-19 on Chinese L2 learners' motivation. Her ongoing research focuses on a multi-modal Chinese text corpus centered on Speech Respiration, aiming to understand the complexities of learning Mandarin. By investigating the correlation between the prosodic features of Mandarin Chinese and respiratory rhythms in speech activity, she seeks to advance teaching methodologies in Mandarin.

Research topics

Medicine
Computer Science
Artificial Intelligence
Information Retrieval
Natural Language Processing
Biology
Internal medicine
Genetics
Pathology
Political Science
Virology
Medical physics
Machine Learning
Surgery
Psychology
Mathematics
Computational biology
Intensive care medicine
Knowledge management
Data science
Nanotechnology
Cell biology
Immunology
Emergency medicine

Selected publications

Selection, optimization and validation of ten chronic disease polygenic risk scores for clinical implementation in diverse US populations
Nature Medicine · 2024 · 182 citations
- Computer Science
- Medicine
Polygenic risk scores (PRSs) have improved in predictive performance, but several challenges remain to be addressed before PRSs can be implemented in the clinic, including reduced predictive performance of PRSs in diverse populations, and the interpretation and communication of genetic results to both providers and patients. To address these challenges, the National Human Genome Research Institute-funded Electronic Medical Records and Genomics (eMERGE) Network has developed a framework and pipeline for return of a PRS-based genome-informed risk assessment to 25,000 diverse adults and children as part of a clinical study. From an initial list of 23 conditions, ten were selected for implementation based on PRS performance, medical actionability and potential clinical utility, including cardiometabolic diseases and cancer. Standardized metrics were considered in the selection process, with additional consideration given to strength of evidence in African and Hispanic populations. We then developed a pipeline for clinical PRS implementation (score transfer to a clinical laboratory, validation and verification of score performance), and used genetic ancestry to calibrate PRS mean and variance, utilizing genetically diverse data from 13,475 participants of the All of Us Research Program cohort to train and test model parameters. Finally, we created a framework for regulatory compliance and developed a PRS clinical report for return to providers and for inclusion in an additional genome-informed risk assessment. The initial experience from eMERGE can inform the approach needed to implement PRS-based testing in diverse clinical settings.
Comparing scientific abstracts generated by ChatGPT to real abstracts with detectors and blinded human reviewers
npj Digital Medicine · 2023 · 636 citations
- Computer Science
- Information Retrieval
- Natural Language Processing
Large language models such as ChatGPT can produce increasingly realistic text, with unknown information on the accuracy and integrity of using these models in scientific writing. We gathered fifty research abstracts from five high-impact factor medical journals and asked ChatGPT to generate research abstracts based on their titles and journals. Most generated abstracts were detected using an AI output detector, 'GPT-2 Output Detector', with % 'fake' scores (higher meaning more likely to be generated) of median [interquartile range] of 99.98% 'fake' [12.73%, 99.98%] compared with median 0.02% [IQR 0.02%, 0.09%] for the original abstracts. The AUROC of the AI output detector was 0.94. Generated abstracts scored lower than original abstracts when run through a plagiarism detector website and iThenticate (higher scores meaning more matching text found). When given a mixture of original and generated abstracts, blinded human reviewers correctly identified 68% of generated abstracts as being generated by ChatGPT, but incorrectly identified 14% of original abstracts as being generated. Reviewers indicated that it was surprisingly difficult to differentiate between the two, though abstracts they suspected were generated were vaguer and more formulaic. ChatGPT writes believable scientific abstracts, though with completely generated data. Depending on publisher-specific guidelines, AI output detectors may serve as an editorial tool to help maintain scientific standards. The boundaries of ethical and acceptable use of large language models to help scientific writing are still being discussed, and different journals and conferences are adopting varying policies.
Long-term kidney function recovery and mortality after COVID-19-associated acute kidney injury: an international multi-centre observational cohort study
EClinicalMedicine · 2022 · 79 citations
- Medicine
- Intensive care medicine
- Emergency medicine
Background: While acute kidney injury (AKI) is a common complication in COVID-19, data on post-AKI kidney function recovery and the clinical factors associated with poor kidney function recovery is lacking. Methods: A retrospective multi-centre observational cohort study comprising 12,891 hospitalized patients aged 18 years or older with a diagnosis of SARS-CoV-2 infection confirmed by polymerase chain reaction from 1 January 2020 to 10 September 2020, and with at least one serum creatinine value 1-365 days prior to admission. Mortality and serum creatinine values were obtained up to 10 September 2021. Findings: Advanced age (HR 2.77, 95%CI 2.53-3.04, p < 0.0001), severe COVID-19 (HR 2.91, 95%CI 2.03-4.17, p < 0.0001), severe AKI (KDIGO stage 3: HR 4.22, 95%CI 3.55-5.00, p < 0.0001), and ischemic heart disease (HR 1.26, 95%CI 1.14-1.39, p < 0.0001) were associated with worse mortality outcomes. AKI severity (KDIGO stage 3: HR 0.41, 95%CI 0.37-0.46, p < 0.0001) was associated with worse kidney function recovery, whereas remdesivir use (HR 1.34, 95%CI 1.17-1.54, p < 0.0001) was associated with better kidney function recovery. In a subset of patients without chronic kidney disease, advanced age (HR 1.38, 95%CI 1.20-1.58, p < 0.0001), male sex (HR 1.67, 95%CI 1.45-1.93, p < 0.0001), severe AKI (KDIGO stage 3: HR 11.68, 95%CI 9.80-13.91, p < 0.0001), and hypertension (HR 1.22, 95%CI 1.10-1.36, p = 0.0002) were associated with post-AKI kidney function impairment. Furthermore, patients with COVID-19-associated AKI had significant and persistent elevations of baseline serum creatinine 125% or more at 180 days (RR 1.49, 95%CI 1.32-1.67) and 365 days (RR 1.54, 95%CI 1.21-1.96) compared to COVID-19 patients with no AKI. Interpretation: COVID-19-associated AKI was associated with higher mortality, and severe COVID-19-associated AKI was associated with worse long-term post-AKI kidney function recovery. 
Funding: Authors are supported by various funders, with full details stated in the acknowledgement section.
Circulating ACE2-expressing extracellular vesicles block broad strains of SARS-CoV-2
Nature Communications · 2022 · 167 citations
- Virology
- Biology
- Immunology
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused the pandemic of the coronavirus induced disease 2019 (COVID-19) with evolving variants of concern. It remains urgent to identify novel approaches against broad strains of SARS-CoV-2, which infect host cells via the entry receptor angiotensin-converting enzyme 2 (ACE2). Herein, we report an increase in circulating extracellular vesicles (EVs) that express ACE2 (evACE2) in plasma of COVID-19 patients, which levels are associated with severe pathogenesis. Importantly, evACE2 isolated from human plasma or cells neutralizes SARS-CoV-2 infection by competing with cellular ACE2. Compared to vesicle-free recombinant human ACE2 (rhACE2), evACE2 shows a 135-fold higher potency in blocking the binding of the viral spike protein RBD, and a 60- to 80-fold higher efficacy in preventing infections by both pseudotyped and authentic SARS-CoV-2. Consistently, evACE2 protects the hACE2 transgenic mice from SARS-CoV-2-induced lung injury and mortality. Furthermore, evACE2 inhibits the infection of SARS-CoV-2 variants (α, β, and δ) with equal or higher potency than for the wildtype strain, supporting a broad-spectrum antiviral mechanism of evACE2 for therapeutic development to block the infection of existing and future coronaviruses that use the ACE2 receptor.
A comprehensive SARS-CoV-2–human protein–protein interactome reveals COVID-19 pathobiology and potential host therapeutic targets
Nature Biotechnology · 2022 · 196 citations
- Computational biology
- Biology
- Virology
Comparing scientific abstracts generated by ChatGPT to original abstracts using an artificial intelligence output detector, plagiarism detector, and blinded human reviewers
bioRxiv (Cold Spring Harbor Laboratory) · 2022 · 410 citations
- Computer Science
- Natural Language Processing
- Artificial Intelligence
Background: Large language models such as ChatGPT can produce increasingly realistic text, with unknown information on the accuracy and integrity of using these models in scientific writing. Methods: We gathered ten research abstracts from five high impact factor medical journals (n=50) and asked ChatGPT to generate research abstracts based on their titles and journals. We evaluated the abstracts using an artificial intelligence (AI) output detector, plagiarism detector, and had blinded human reviewers try to distinguish whether abstracts were original or generated. Results: All ChatGPT-generated abstracts were written clearly but only 8% correctly followed the specific journal’s formatting requirements. Most generated abstracts were detected using the AI output detector, with scores (higher meaning more likely to be generated) of median [interquartile range] of 99.98% [12.73, 99.98] compared with very low probability of AI-generated output in the original abstracts of 0.02% [0.02, 0.09]. The AUROC of the AI output detector was 0.94. Generated abstracts scored very high on originality using the plagiarism detector (100% [100, 100] originality). Generated abstracts had a similar patient cohort size as original abstracts, though the exact numbers were fabricated. When given a mixture of original and generated abstracts, blinded human reviewers correctly identified 68% of generated abstracts as being generated by ChatGPT, but incorrectly identified 14% of original abstracts as being generated. Reviewers indicated that it was surprisingly difficult to differentiate between the two, but that the generated abstracts were vaguer and had a formulaic feel to the writing. Conclusion: ChatGPT writes believable scientific abstracts, though with completely generated data. These are original without any plagiarism detected but are often identifiable using an AI output detector and skeptical human reviewers. Abstract evaluation for journals and medical conferences must adapt policy and practice to maintain rigorous scientific standards; we suggest inclusion of AI output detectors in the editorial process and clear disclosure if these technologies are used. The boundaries of ethical and acceptable use of large language models to help scientific writing remain to be determined.
Evolving phenotypes of non-hospitalized patients that indicate long COVID
BMC Medicine · 2021 · 151 citations
- Medicine
- Internal medicine
- Pediatrics
BACKGROUND: For some SARS-CoV-2 survivors, recovery from the acute phase of the infection has been grueling with lingering effects. Many of the symptoms characterized as the post-acute sequelae of COVID-19 (PASC) could have multiple causes or are similarly seen in non-COVID patients. Accurate identification of PASC phenotypes will be important to guide future research and help the healthcare system focus its efforts and resources on adequately controlled age- and gender-specific sequelae of a COVID-19 infection. METHODS: In this retrospective electronic health record (EHR) cohort study, we applied a computational framework for knowledge discovery from clinical data, MLHO, to identify phenotypes that positively associate with a past positive reverse transcription-polymerase chain reaction (RT-PCR) test for COVID-19. We evaluated the post-test phenotypes in two temporal windows at 3-6 and 6-9 months after the test and by age and gender. Data from longitudinal diagnosis records stored in EHRs from Mass General Brigham in the Boston Metropolitan Area was used for the analyses. Statistical analyses were performed on data from March 2020 to June 2021. Study participants included over 96 thousand patients who had tested positive or negative for COVID-19 and were not hospitalized. RESULTS: We identified 33 phenotypes among different age/gender cohorts or time windows that were positively associated with past SARS-CoV-2 infection. All identified phenotypes were newly recorded in patients' medical records 2 months or longer after a COVID-19 RT-PCR test in non-hospitalized patients regardless of the test result. 
Among these phenotypes, a new diagnosis record for anosmia and dysgeusia (OR 2.60, 95% CI [1.94-3.46]), alopecia (OR 3.09, 95% CI [2.53-3.76]), chest pain (OR 1.27, 95% CI [1.09-1.48]), chronic fatigue syndrome (OR 2.60, 95% CI [1.22-2.10]), shortness of breath (OR 1.41, 95% CI [1.22-1.64]), pneumonia (OR 1.66, 95% CI [1.28-2.16]), and type 2 diabetes mellitus (OR 1.41, 95% CI [1.22-1.64]) is one of the most significant indicators of a past COVID-19 infection. Additionally, more new phenotypes were found with increased confidence among the cohorts who were younger than 65. CONCLUSIONS: The findings of this study confirm many of the post-COVID-19 symptoms and suggest that a variety of new diagnoses, including new diabetes mellitus and neurological disorder diagnoses, are more common among those with a history of COVID-19 than those without the infection. Additionally, more than 63% of PASC phenotypes were observed in patients under 65 years of age, pointing out the importance of vaccination to minimize the risk of debilitating post-acute sequelae of COVID-19 among younger adults.
The role of machine learning in clinical research: transforming the future of evidence generation
Trials · 2021 · 277 citations
- Political Science
- Computer Science
- Medicine
BACKGROUND: Interest in the application of machine learning (ML) to the design, conduct, and analysis of clinical trials has grown, but the evidence base for such applications has not been surveyed. This manuscript reviews the proceedings of a multi-stakeholder conference to discuss the current and future state of ML for clinical research. Key areas of clinical trial methodology in which ML holds particular promise and priority areas for further investigation are presented alongside a narrative review of evidence supporting the use of ML across the clinical trial spectrum. RESULTS: Conference attendees included stakeholders, such as biomedical and ML researchers, representatives from the US Food and Drug Administration (FDA), artificial intelligence technology and data analytics companies, non-profit organizations, patient advocacy groups, and pharmaceutical companies. ML contributions to clinical research were highlighted in the pre-trial phase, cohort selection and participant management, and data collection and analysis. A particular focus was paid to the operational and philosophical barriers to ML in clinical research. Peer-reviewed evidence was noted to be lacking in several areas. CONCLUSIONS: ML holds great promise for improving the efficiency and quality of clinical research, but substantial barriers remain, the surmounting of which will require addressing significant gaps in evidence.
Zn‐MOF Encapsulated Antibacterial and Degradable Microneedles Array for Promoting Wound Healing
Advanced Healthcare Materials · 2021 · 314 citations
- Materials science
- Nanotechnology
- Biomedical engineering
An infected skin wound caused by external injury remains a serious challenge in clinical practice. Wound dressings with the properties of antibacterial activity and potent regeneration capacity are highly desirable for wound healing. In this paper, a degradable, ductile, and wound-friendly Zn-MOF encapsulated methacrylated hyaluronic acid (MeHA) microneedles (MNs) array is fabricated through the molding method for promoting wound healing. Due to the damage capability against the bacteria capsule and oxidative stress of the zinc ion released from the Zn-MOF, such MNs array presents excellent antibacterial activity, as well as considerable biocompatibility. Besides, the degradable MNs array composed of photo-crosslinked MeHA possesses the superior capabilities to continuously and steadily release the loaded active ingredients and avoid secondary damage to the wound. Moreover, the low molecular weight hyaluronic acid (HA) generated by hydrolysis of MeHA is also conducive to tissue regeneration. Benefiting from these features, it has been demonstrated that the Zn-MOF encapsulated degradable MNs array can dramatically accelerate epithelial regeneration and neovascularization. These results indicate that the combination of MOFs and degradable MNs array is of great value for promoting wound healing.

Recent grants

Modeling the Incompleteness and Biases of Health Data
NIH · $1.3M · 2020–2025
Bayesian Generative Methods for Extracting and Modeling Relations in EHR Narratives
NIH · $208k · 2017–2019
Data Portal Core
NIH · $33.7M · 2021–2026
Bayesian Generative Methods for Extracting and Modeling Relations in EHR Narratives
NIH · $182k · 2017–2020
CRITICAL: Collaborative Resource for Intensive care Translational science, Informatics, Comprehensive Analytics, and Learning
NIH · $4.9M · 2021–2026

Frequent coauthors

Chengsheng Mao
Northwestern University
79 shared
Antoine Neuraz
Centre de Recherche des Cordeliers
59 shared
Yan Gong
Zhongnan Hospital of Wuhan University
58 shared
Conghua Xie
Wuhan University
56 shared
Paul Avillach
Boston Children's Hospital
50 shared
Gabriel A. Brat
Harvard University
47 shared
Griffin M. Weber
45 shared
Chuan Hong
Duke University
42 shared

Awards & honors

Honorable Mention Early-career Teaching Award at Lawrence Un…
Honored Instructor at the University of Wisconsin-Madison
