cv | Michael Doyle

Basics

Name	Michael Doyle
Label	AI Researcher & Engineer
Email	michaeldoyle1994@gmail.com
Website	https://doyled-it.com
Phone	(775) 450-6522
Summary	An AI researcher with a passion for language, open source, and applying research to important problems.

Work

2018.07 - Present
Lead AI Research Engineer

The MITRE Corporation

Led multiple ML projects. Researched, developed, tested, trained, and deployed machine learning models and applications.
- Leading a research project training, evaluating, and applying LLMs for Binary Reverse Engineering tasks
- Led research project on LLMs, IT Modernization, and code understanding
- Led research project on training object detectors for neuromorphic cameras
- Maintained internal GPU servers and services on OpenShift and Linux servers
- Developed and open sourced fire simulator to be used in RL training for wildfire mitigation
- Developed and open sourced LLM IT modernization library
- Developed agentic LLM backend prototype for government sponsor application
- Researched adversarial object detection and classification
- Researched adversarial vision for classical depth
- Researched adversarial text for Machine Translation
- Directly deployed trained models on government sponsor systems
- Engaged in government sponsor test events in the field
2016.05 - 2018.05
Firmware Engineering Intern

Space Micro Inc.

Wrote, simulated, synthesized, and implemented FPGA code in VHDL using ModelSim, Synplify Pro, Vivado, and Libero Designer. Designed PCBs in KiCAD and tested hardware.

Education

2013.09 - 2018.05
BS/BA Dual Degrees

University of San Diego

Electrical Engineering
- Minors in Mathematics and Physics
- Magna Cum Laude

Publications

2025.06.24

Can LLMs Replace Humans During Code Chunking?

ArXiv

Investigates using LLMs to modernize legacy government code by focusing on code-chunking methods to overcome input limitations, showing that LLMs can effectively partition code and generate high-quality documentation.
2025.04.23

Impact of Comments on LLM Comprehension of Legacy Code

ArXiv

Presents preliminary findings on the impact of documentation on LLM comprehension of legacy code, leveraging multiple-choice question answering (MCQA) for evaluation.
2024.11.22

Leveraging LLMs for Legacy Code Modernization: Challenges and Opportunities for LLM-Generated Documentation

ArXiv

Investigates using LLMs to generate documentation for legacy code in MUMPS and ALC, proposing a prompting strategy and evaluation rubric while highlighting the limitations of current automated metrics.
2024.06

Testing the Effect of Code Documentation on Large Language Model Code Understanding

North American Chapter of the Association for Computational Linguistics (NAACL)

Empirically analyzes how code documentation quality impacts the code generation and understanding capabilities of Large Language Models (LLMs). It reveals that incorrect documentation significantly hinders LLMs' code comprehension, while incomplete or missing documentation has no significant impact.
2023.11

Reinforcement Learning for Wildfire Mitigation in Simulated Disaster Environments

Neural Information Processing Systems (NeurIPS)

Presents a reinforcement learning approach to wildfire mitigation in simulated disaster environments and releases two software libraries, SimFire and SimHarness, to facilitate future research in this area.
2022.09

Practical Attacks on Machine Translation using Paraphrase

Association for Machine Translation in the Americas (AMTA)

Investigates the vulnerability of machine translation systems to adversarial attacks constructed with limited information. Proposes a novel attack method that generates perturbations using paraphrases, and evaluates their impact on meaning preservation and translation degradation across various language pairs and systems.
2021.04

The vulnerability of UAVs: an adversarial machine learning perspective

SPIE

Proposes a methodology to evaluate the vulnerability of unmanned aerial vehicles (UAVs) to adversarial machine learning attacks by analyzing potential attack vectors at each stage of UAV operation.

Skills

	AI & ML
	NLP
	LLMs
	RAG
	Computer Vision
	Acoustics
	RL
	Simulation
	Adversarial

	Languages
	Python
	Bash
	SQL
	LaTeX
	Vue.js
	TypeScript
	CSS
	MATLAB
	VHDL

	Libraries
	NumPy
	SciPy
	PyTorch
	TensorFlow
	FastAPI
	Typer
	LangChain
	HuggingFace
	Weaviate
	DSPy

	Technologies
	Git
	Docker
	Vector DBs
	OpenShift
	GitLab CI
	GitHub Actions

Languages

	English
	Native speaker

	Spanish
	Intermediate

	German
	Beginner

Projects

2023.07 - 2025.09
Janus LLM

A library for LLM IT modernization, using LLMs, RAG, and intelligent chunking.
2019.09 - 2024.11
SimFire

A wildfire simulator written in Python for reinforcement learning research on wildfire mitigation.
2019.09 - 2024.08
SimHarness

A reinforcement learning harness meant to be used with SimFire for wildfire mitigation research.
2024.08 - 2024.08
PFC-LLM

A CLI tool designed to classify documents with a political score according to the Left/Right, Libertarian/Authoritarian political compass, based on the Sapply test.
