
Mahdi Imani
Verified · Northeastern University · Electrical and Computer Engineering
Active 2009–2026
About
Mahdi Imani is an Assistant Professor in the Department of Electrical and Computer Engineering at Northeastern University College of Engineering, having joined the faculty in August 2021. His research focuses on machine learning and decision-making in complex cyber-physical systems, with particular emphasis on developing algorithms that enhance data collection, incorporate expert knowledge, and improve statistical inference in challenging environments. His work aims to advance the understanding and control of large-scale, uncertain systems through innovative Bayesian and reinforcement learning methods. Imani holds a PhD in Electrical Engineering from Texas A&M University, earned in 2019, and has received numerous awards for his research, including a $1.2 million DARPA grant for probabilistic cognitive reasoning in mixed reality systems, a $1.5 million ONR award to foster human-AI collaboration, and a $385,000 NSF grant to improve statistical inference techniques. His contributions extend to developing scalable inference methods, Bayesian optimization frameworks, and control strategies for complex networks, with applications spanning cybersecurity, biomedical modeling, and autonomous systems. He is actively involved in advancing the field through his research projects, publications, and collaborations.
Research topics
- Artificial Intelligence
- Computer Science
- Machine Learning
- Engineering
- Data Mining
- Control engineering
- Data science
- Mathematics
- Mathematical optimization
Selected publications
IEEE Transactions on Neural Networks and Learning Systems · 2026-01-01
article
Inverse reinforcement learning (IRL) seeks to infer the latent reward function and the associated optimal policy from expert demonstrations. However, most current IRL methods assume centralized access to all trajectory data, which is impractical in real-world scenarios characterized by decentralized data sources and privacy concerns. To this end, this article proposes a novel algorithm for federated maximum-likelihood IRL (F-ML-IRL) and provides a rigorous analysis of its convergence rate. The proposed F-ML-IRL leverages dual aggregation to update the shared global model and performs bi-level local updates: an upper-level learning task to optimize the parameterized reward function by maximizing the discounted likelihood of observing human expert trajectories under the current policy, and a lower-level learning task to find the optimal agent policy regarding the entropy-regularized discounted cumulative reward under the current reward function. We analyze the convergence rate of the proposed F-ML-IRL algorithm and show that the global model in F-ML-IRL converges to a stationary point for both the reward and policy parameters within finite time. That is, the log-distance between the recovered policy and the optimal policy, as well as the gradient of the likelihood objective, converges to zero. Evaluating our F-ML-IRL algorithm on high-dimensional robotic control tasks in MuJoCo, we show that it ensures convergence of the recovered reward in decentralized learning and outperforms centralized baselines due to its ability to utilize distributed data, attaining better recovered rewards than all baselines in 12 out of 20 tasks.
Multimodal Hateful Meme Detection with Graph Attention Networks and Contextual Cues
2025-06-24
article
The classification of hateful memes remains a challenging task due to their multimodal nature, where the interplay of textual and visual elements often conveys implicit or nuanced harmful content. This paper introduces a novel classification framework leveraging Graph Attention Networks (GATs) to model cross-modal relationships between visual and textual components. The proposed method integrates Visual Question Answering (VQA) and image captioning to enhance contextual understanding and refine semantic representations of multimodal data. Each meme is represented as a fully connected graph, where nodes correspond to embeddings derived from visual features, captions, and VQA responses, while GATs dynamically assign importance to these relationships. Experimental evaluation on the HarMeme dataset demonstrates the effectiveness of our approach, achieving competitive accuracy and AUROC compared to unimodal baselines (ResNet, DistilBERT) and showing promising improvements over existing multimodal models such as Contrastive Language Image Pre-training (CLIP). These results highlight the potential of the proposed GAT-based architectures for improving hateful meme detection and advancing multimodal content analysis.
Draft and Refine with Visual Experts
ArXiv.org · 2025-11-14
preprint · Open access
While recent Large Vision-Language Models (LVLMs) exhibit strong multimodal reasoning abilities, they often produce ungrounded or hallucinated responses because they rely too heavily on linguistic priors instead of visual evidence. This limitation highlights the absence of a quantitative measure of how much these models actually use visual information during reasoning. We propose Draft and Refine (DnR), an agent framework driven by a question-conditioned utilization metric. The metric quantifies the model's reliance on visual evidence by first constructing a query-conditioned relevance map to localize question-specific cues and then measuring dependence through relevance-guided probabilistic masking. Guided by this metric, the DnR agent refines its initial draft using targeted feedback from external visual experts. Each expert's output (such as boxes or masks) is rendered as visual cues on the image, and the model is re-queried to select the response that yields the largest improvement in utilization. This process strengthens visual grounding without retraining or architectural changes. Experiments across VQA and captioning benchmarks show consistent accuracy gains and reduced hallucination, demonstrating that measuring visual utilization provides a principled path toward more interpretable and evidence-driven multimodal agent systems. Code is available at https://github.com/EavnJeong/Draft-and-Refine-with-Visual-Experts.
Global Optimization on Graph-Structured Data via Gaussian Processes with Spectral Representations
ArXiv.org · 2025-11-11
preprint · Open access
Bayesian optimization (BO) is a powerful framework for optimizing expensive black-box objectives, yet extending it to graph-structured domains remains challenging due to the discrete and combinatorial nature of graphs. Existing approaches often rely on either full graph topology, which is impractical for large or partially observed graphs, or incremental exploration, which can lead to slow convergence. We introduce a scalable framework for global optimization over graphs that employs low-rank spectral representations to build Gaussian process (GP) surrogates from sparse structural observations. The method jointly infers graph structure and node representations through learnable embeddings, enabling efficient global search and principled uncertainty estimation even with limited data. We also provide theoretical analysis establishing conditions for accurate recovery of underlying graph structure under different sampling regimes. Experiments on synthetic and real-world datasets demonstrate that our approach achieves faster convergence and improved optimization performance compared to prior methods.
Probabilistic Verification of Cybersickness in Virtual Reality Through Bayesian Networks
2025-10-08 · 3 citations
article · Senior author
Cybersickness remains a major challenge in virtual and mixed reality (VR/MR), yet existing methods primarily focus on predicting its onset without offering formal guarantees regarding its occurrence or effective mitigation. As VR/MR applications expand into safety-critical domains like healthcare and defense, verifiable safety assurances become essential to protect users from adverse physiological and psychological effects. This paper introduces a probabilistic verification framework leveraging Bayesian Networks (BN) to explicitly model the interactions among system parameters, human physiological responses, and cybersickness severity. Unlike deep learning approaches that lack interpretability and formal verification capabilities, the proposed BN model explicitly captures how environmental and system-level factors (e.g., luminance, spectral entropy, and image gradient complexity via HoG features) influence physiological responses (e.g., heart rate, reaction time, eye tracking), ultimately affecting cybersickness severity. By learning the joint probability distribution of these factors, our approach provides rigorous formal guarantees on cybersickness risk under specified operational conditions. If these guarantees are not met, automated adaptive adjustments are recommended to restore safe conditions. Experimental validation involving physiological and system-level data demonstrates that Bayesian Networks provide an interpretable and efficient framework, uniquely enabling formal probabilistic verification of cybersickness risks. This capability makes the proposed approach particularly suitable for designing and deploying VR/MR systems with explicitly verified safety constraints.
Game-Theoretic Defense Policy for Network Security Against Intelligent Adversary
2025-08-17 · 4 citations
article · Senior author
The rapid evolution of IT infrastructure and networked systems has increased their susceptibility to sophisticated and intelligent cyber threats. Despite advancements in attack detection, adversaries continuously refine their strategies, exploiting vulnerabilities with growing complexity. In this paper, we model the dynamic interaction between a defender and an intelligent adversary as a two-player zero-sum game. The defender’s partial observability of the adversary and network state is represented using a partially observable Markov decision process (POMDP). We develop a recursive method to compute the posterior distribution of network compromises based on incomplete observations of network states and no access to adversarial actions. An optimal minimum mean square error (MMSE) estimator leverages this posterior for the recursive estimation of network compromises. To ensure the defender follows the Nash equilibrium, where neither player has the incentive to deviate, our automated defense policy employs the Nash strategy based on the optimal MMSE estimate of the network state. Two evaluation metrics are introduced to assess the policy’s effectiveness: expected mean square error and expected policy misalignment. Simulation results show improved defense effectiveness over static or non-strategic automated policies, demonstrating the advantages of strategic decision-making in network security.
Bayesian topology inference of regulatory networks under partial observability
Results in Control and Optimization · 2025-05-01
article · Open access · Senior author
Biological systems, such as microbial communities in metagenomics and gene regulatory networks (GRNs) in genomics, are composed of a vast number of interacting components observed through inherently noisy data. These systems play a critical role in understanding fundamental biological processes, including gene regulation, microbial interactions, and cellular dynamics. For example, microbial communities involve complex interactions between microbes, bacteria, genes, and small molecules observed through omics data, while GRNs consist of numerous interacting genes observed via various gene-expression technologies. However, reconstructing the topology of such networks poses significant challenges due to their large scale, high dimensionality, and the presence of noise. Existing inference techniques often struggle with scalability, interpretability, and overfitting, making them unsuitable for analyzing large and complex biological systems. To overcome these challenges, this paper proposes a Bayesian topology optimization framework for efficient and scalable inference of regulatory networks modeled as partially-observed Boolean dynamical systems (POBDS). The method combines the Boolean Kalman Filter (BKF), an optimal estimator for POBDS, with Bayesian optimization, which employs Gaussian Process regression and a topology-inspired kernel function to model the log-likelihood function. Numerical experiments demonstrate the superior performance of our framework. In the p53-MDM2 network, our method accurately infers topology with 8 and 16 unknown regulations, achieving higher log-likelihood with 100 and 200 evaluations, respectively. For the mammalian cell cycle network with 10 unknown regulations, the proposed method identifies the correct topology among 59,049 possibilities with lower error and faster convergence.
Deep Reinforcement Learning for Intervention of Partially Observable Regulatory Networks
2025-07-08 · 1 citation
article · Open access · Senior author
This paper presents a deep reinforcement learning framework for designing optimal intervention policies in Gene Regulatory Networks (GRNs) under partial observability. Existing methods often assume full observability of the system states, which is unrealistic in practice due to incomplete or noisy gene expression data. To address these limitations, we extend Boolean network models to include partial observability. The uncertainty in gene expression data and stochasticity in gene activities impacted by interventions are captured through the posterior distribution of states, called the belief state. We formulate the optimal intervention policy over the belief space, maximizing long-term rewards by reducing harmful gene activations while accounting for system and data uncertainties. Deep reinforcement learning, particularly deep Q-network (DQN), is developed to enable approximation of the optimal intervention policy at scale. Our analytical results demonstrate that the method converges to the optimal dynamic programming solution if the uncertainty in the gene state disappears. Numerical experiments on a melanoma gene regulatory network demonstrate the efficacy of the proposed approach, showing improved performance compared to existing methods in maintaining desirable system states and reducing the activation of cancer-related genes.
Learning Personalized Human Decision Models in Cyber Defense
IEEE Transactions on Artificial Intelligence · 2025-01-01
article
Artificial intelligence (AI) is expected to enhance cyber defense operations by providing novel capabilities that support human operators in high-stakes environments, such as decision assistance, automation, and after-action reviews. Effective AI support in cyber defense must adapt to the cognitive profiles and decision tendencies of human operators. These environments pose unique challenges: decisions are time-critical, threats evolve in an adversarial manner, and human behavior is heterogeneous and difficult to infer from limited data. We address these by inferring cognitive traits, such as risk tolerance, curiosity, and prospect, that govern individual decision-making tendencies. Our goal is not to replicate user behavior, but to understand it, using personalized models to enable risk-aware guidance and intervention. Such models support trust calibration, role assignment, and personalized training. They also enable differentiated guidance strategies for novice and expert users, all while preserving mission safety. We propose a kernel-based inverse learning framework that infers user-specific latent cognitive traits from limited behavioral data. Human decision-making is formulated as a Markov Decision Process, explicitly modeling how individual defenders perceive and respond to evolving threats. We validate our approach using a psychological survey from 108 participants in the TTCP CAGE Challenge 2 testbed. Results demonstrate that the proposed method accurately recovers individualized traits and has strong predictive accuracy, even under sparse data and behavioral variability. Unlike prior approaches, the proposed framework provides a cognitively interpretable foundation for assistive AI systems, enabling them to anticipate and adapt to diverse user decision-making styles.
Decentralized Reinforcement Learning for Asymmetric Gene Network Interventions
IEEE Transactions on Computational Biology and Bioinformatics · 2025-11-01 · 1 citation
article · Open access · Senior author
Gene regulatory networks (GRNs) regulate essential cellular functions, and their dysregulation contributes to diseases such as cancer and autoimmune disorders. Designing effective interventions is challenging due to (i) the adaptive resistance of cells to therapies and (ii) the limited knowledge of genes' states during the intervention process through gene expression data. To address these challenges, this paper develops a decentralized deep reinforcement learning framework for intervention in GRNs. The intervention process is formulated as an asymmetric two-player zero-sum game, where the history-dependent intervention policy is derived against a cell that has complete knowledge of gene states. The optimal intervention policy is expressed as a Nash equilibrium policy, and a deep policy gradient approach is developed to approximate this policy. The analytical results demonstrate that under non-aggressive cell responses, the proposed intervention policy achieves higher-than-expected gains, ensuring robustness even against the most complex adaptive cellular responses. Furthermore, if the true system state becomes fully observable, the proposed method converges to the full-state Nash equilibrium. Numerical experiments on two benchmark GRN models, p53-MDM2 and melanoma regulatory networks, validate the proposed method, demonstrating its superior adaptability under uncertainty compared to state-of-the-art intervention strategies.
Recent grants
Neurally-Inspired Integration of Communication and Cognitive Computation in Hyperspace
NSF · $360k · 2023–2026
UKRI/BBSRC-NSF/BIO: Interpretable and Noise-Robust Machine Learning for Neurophysiology
NSF · $797k · 2023–2026
III: Small: Statistical Inference through Data-Collection and Expert-Knowledge Incorporation
NSF · $384k · 2023–2026
Hyperdimensional Neural Computation for Real-Time Cognitive Learning
NSF · $300k · 2021–2025
CPS: Small: Brain-Inspired Memorization and Attention for Intelligent Sensing
NSF · $500k · 2023–2027
Frequent coauthors
- Ulisses Braga-Neto · 34 shared
- Seyede Fatemeh Ghoreishi · Northeastern University · 25 shared
- Edward R. Dougherty · Salve Regina University · 10 shared
- Mohammad Alali · Northeastern University · 9 shared
- S. H. R. Hosseini · Northeastern University · 5 shared
- Amirhossein Ravari · Northeastern University · 4 shared
- Xiaoning Qian · Texas A&M University · 4 shared
- Armita Kazeminajafabadi · Northeastern University · 4 shared
Labs
Northeastern University College of Engineering · PI
Education
- 2019 · PhD, Electrical and Computer Engineering · Texas A&M University
Awards & honors
- Best Paper Finalist Award, 20th IFAC Symposium on Systems Id…
- Outstanding Associate Editor Award, IEEE Transactions on Neu…
- Best Paper Finalist Award, IEEE American Control Conference,…
- 2022 Oracle Research Award
- Association of Former Students Distinguished Graduate Studen…