Conor Kotwasinski

ckotwasinski@gmail.com · (708) 834-3108 · New York Metropolitan Area · Active TS Clearance
GitHub · LinkedIn

I am an Engineer at Systems & Technology Research (STR), where I build emulation, instrumentation, and vulnerability research tooling for embedded and real-time systems. Most of my day-to-day involves extending QEMU to model new processor architectures and SoCs, writing eBPF-based tracing frameworks, and developing fuzzers that operate at the full-system level.

I hold a combined Master's and Bachelor's in Computer Science from Northwestern University, where I founded the CTF team, served on the Blockchain Club executive board, and TA'd courses in system security, digital forensics, and deep learning.

Research & Papers

Classical and Novel Attacks on Scientific Applications
2025 · Conor Kotwasinski, Michael Polinski, Peter Dinda
Explores two attack models against scientific computing software. A fuzzing campaign with AFL++ against LAMMPS, LAGHOS, ENZO, and GROMACS finds hundreds to thousands of crash-inducing inputs per target in under a day, with 1–5% classified as likely exploitable for arbitrary code execution. Separately, introduces chaos control attacks: imperceptible manipulations to floating-point inputs, below the threshold of standard output precision, that cause unmodified simulations of chaotic dynamical systems to compute attacker-chosen results. Because no code is injected, these attacks are immune to classical mitigations like ASLR or NX. Includes MEDES, a system for searching the nudge space of arbitrary black-box scientific programs.
Limits of Service Discovery: Investigating Automatic Microsegmentation Policy Generation for Microservices
2023 · Conor Kotwasinski, Li-Kang Tan, Hongyi Charles Zhou, Yan Chen
Investigates automatic generation of L3/L4 microsegmentation policies for microservice architectures by combining static binary analysis of Docker images with service registry data. We reverse engineer Java bytecode from container images to extract inter-service call graphs via @FeignClient annotations, query Eureka for socket address mappings, and synthesize Kubernetes NetworkPolicies, all without access to source code. Evaluates the limits of Zookeeper's unstandardized Znode structure versus Eureka's uniform REST API for generalizable policy generation.
Automating Microservice Development using Large Language Models
2024 · Master's Thesis, Northwestern University
Investigates LLM-driven code generation for microservices vs. monoliths by decomposing 24 user stories into service descriptions, auto-generating implementations with Claude 3 Opus, and running auto-generated test suites. Microservices significantly outperformed monoliths (mean scores 30–43% vs. 12–33%, p<0.05 in Trial 2). The key insight is that monolithic architectures trigger "not implemented" errors 2–3× more often, a direct consequence of code length exceeding the LLM's context window. By contrast, microservices' modular structure naturally fits within context limits, allowing fault isolation to preserve partial functionality even when individual services fail. Code Llama Instruct 34B, despite a 100k token context window, generated incomplete Flask applications with inconsistencies (e.g., port mismatches), suggesting the problem is not just size but also model quality. A taxonomy of six failure modes and a Mann-Whitney U analysis quantify the architectural advantage.
The Perpetual Variance Swap
2022 · Robert Leifke, Alexandru Beloiu, Conor Kotwasinski, William Wang, Ashrith Bandla, George Fane
Derives a constant function market maker whose trading function φ(R₁, R₂) = R₁ + log(R₂) replicates a logarithmic payoff, making the expected arbitrageur profit a function of ½(σ² − K_vol²)T, the same payout structure as a traditional variance swap, but without oracles or counterparties. Shows how shorting the concave LP share yields a convex position that can gamma-hedge Uniswap V3 impermanent loss, and derives the hedge ratio explicitly from the second derivatives of both payoff curves.
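As a rough illustration of why this trading function gives a logarithmic payoff, here is a short worked derivation. It assumes a fee-free pool, treats R₁ as the numeraire reserve and R₂ as the risky-asset reserve, and writes k for the invariant, p for the marginal price of asset 2 in units of asset 1, and V(p) for the pool's value. These symbols and assumptions are mine, not taken from the paper.

```latex
\begin{align*}
\varphi(R_1, R_2) &= R_1 + \log R_2 = k \\
p &= \frac{\partial\varphi/\partial R_2}{\partial\varphi/\partial R_1} = \frac{1}{R_2}
  \quad\Longrightarrow\quad R_2 = \frac{1}{p}, \qquad R_1 = k + \log p \\
V(p) &= R_1 + p\,R_2 = k + 1 + \log p, \qquad V''(p) = -\frac{1}{p^2} < 0
\end{align*}
```

Under these assumptions the LP position is a constant plus log p, and it is concave in price. If p follows a geometric Brownian motion with drift μ and volatility σ, then E[log(p_T / p₀)] = (μ − ½σ²)T, which is one way to see where a σ²T term enters the expected payoff.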

What I Work On

The bulk of my work at STR falls into a few related areas.

Emulation & Architecture Modeling

I extend QEMU to support targets that don't have upstream support, both at the ISA and SoC level. This includes writing a full TCG translation frontend for the TMS320C6000, a VLIW DSP from Texas Instruments with 8-way parallel execution, branch delay slots, and cross-path register constraints that make it significantly harder to model than a conventional scalar architecture. I've also built a complete peripheral model for the EFM32HG (ARM Cortex-M0+), including its clock management unit with propagation to GPIO, USART, and RTC peripherals, USB device-side callbacks, and external USART-attached flash and GPS modules. The goal is full-system emulation accurate enough to run and analyze real firmware without the physical hardware (for example, running the entire SnapperGPS firmware stack in QEMU).

Instrumentation & Tracing

I modernized a significant portion of STR's QEMU instrumentation layer from C to C++23, using concepts for type constraints, ranges for composable data pipelines, and std::expected<T,E> for error handling in memory tracing paths. The rewrite achieved full API compatibility with existing tooling while reducing runtime overhead by 30%, mainly by eliminating unnecessary allocations and virtual dispatch in hot paths.

Separately, I built an eBPF-based tracing framework that uses C++ template metaprogramming to generate BPF programs at compile time with zero-cost abstractions for shared object and syscall tracking. This replaced an earlier approach based on ptrace (Linux) and Detours (Windows), cutting overhead from 12% to under 1% and per-syscall latency from 25μs to 800ns. The template approach means adding a new tracepoint is a type-safe, compile-checked operation rather than hand-writing BPF bytecode.

Fuzzing & Vulnerability Research

I developed a full-system fuzzer built on top of QEMU's TCG that works by injecting test inputs directly into guest memory and instrumenting translation blocks for coverage feedback. Each fuzzed process gets isolated coverage tracking, which is important when you're fuzzing an entire OS image and don't want kernel noise polluting your signal. This has found 15+ vulnerabilities across several targets. To deal with the throughput problem inherent to full-system emulation, I built a snapshot-and-restore pipeline using KVM migration with custom DMA/PCI ivshmem drivers on both Windows and Linux guests, which brought fuzzing throughput up by roughly 100x.

Reverse Engineering & Analysis Tooling

To address the bottleneck of manual reverse engineering on unfamiliar firmware, I architected a custom QEMU plugin that tracks process-isolated code coverage at the translation block level and exports it into Ghidra for visualization. This gives analysts an immediate heatmap of which code paths have been exercised during emulation or fuzzing, reducing the manual analysis time on new targets by roughly 70% compared to starting from a raw disassembly.

RTOS & Embedded Validation

For a timing-critical RTOS target, I reverse engineered the firmware and its board support package to understand the scheduler well enough to inject custom validation tasks that run alongside the native workload. Combined with FPGA-based external monitoring, this approach eliminated roughly 95% of the overhead compared to traditional JTAG-based validation, which matters when the system under test has hard real-time deadlines that intrusive debugging would violate.

Projects

HFT-Zero
2025

A bare-metal x86-64 kernel written in C++26 with modules, designed as a vehicle for exploring modern C++ in a freestanding environment and for understanding what it takes to build a system from nothing. The kernel boots into 64-bit long mode via Multiboot2 with a higher-half mapping through 4-level paging, and has a physical memory manager with bitmap allocation across DMA/Normal/High zones, a heap allocator, and a PIT-driven interrupt system. The project is currently in Phase 1; the next steps are a virtio-net driver and the beginnings of a network stack, which would eventually support market data parsing and order execution as the motivating use case. The real point is less about trading and more about having a concrete target that forces you to care about every microsecond from boot to packet.
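To give a sense of what bitmap allocation of physical frames looks like, here is a minimal sketch. It is not HFT-Zero's code: the class name FrameBitmap, its methods, and the one-bit-per-frame layout are hypothetical. It is written as ordinary hosted C++17 (using std::optional and assert) so it can be compiled and tested outside a kernel.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical sketch: one bit per physical frame, 1 = in use.
// Names and sizes are illustrative, not taken from HFT-Zero.
template <std::size_t FrameCount>
class FrameBitmap {
    static_assert(FrameCount % 64 == 0, "frame count must fill whole 64-bit words");

public:
    // Finds the lowest-numbered free frame, marks it used, and returns its
    // index; returns std::nullopt when every frame is taken.
    std::optional<std::size_t> allocate() {
        for (std::size_t w = 0; w < words_.size(); ++w) {
            if (words_[w] == ~std::uint64_t{0}) continue;  // word is full, skip it
            for (std::size_t b = 0; b < 64; ++b) {
                const std::uint64_t mask = std::uint64_t{1} << b;
                if ((words_[w] & mask) == 0) {
                    words_[w] |= mask;
                    return w * 64 + b;
                }
            }
        }
        return std::nullopt;
    }

    // Marks a previously allocated frame as free again.
    void release(std::size_t frame) {
        assert(frame < FrameCount && is_used(frame));
        words_[frame / 64] &= ~(std::uint64_t{1} << (frame % 64));
    }

    bool is_used(std::size_t frame) const {
        return ((words_[frame / 64] >> (frame % 64)) & 1u) != 0;
    }

private:
    std::array<std::uint64_t, FrameCount / 64> words_{};  // zero-initialized: all free
};
```

One way to extend this to zones would be to keep a separate bitmap per zone (DMA, Normal, High), each covering its own physical address range, and fall back from one zone to the next when an allocation fails.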

RustNK
2023 · Conor Kotwasinski, Matthew Sinclair, Ian Armstrong, Charles Zhou

A framework for writing kernel components in Rust for Northwestern's Nautilus aerokernel. Wrapped the kernel's C IRQ, threading, and character device subsystems with idiomatic Rust APIs that use RAII to bind resource lifecycles to object lifetimes and Send + Sync trait bounds to make data races on shared handler state a compile-time error. Built an async executor for cooperative multitasking and a Virtio GPU driver. Validated the framework by porting the original DOOM engine to run in the Nautilus shell.

DARPA AIxCC — Team 42-b3yond-6ug (6th Place)
2024

An autonomous Cyber Reasoning System that discovers vulnerabilities and generates patches without human intervention. The system combines BandFuzz (a custom fuzzer with selective instrumentation) with a multi-LLM pipeline using GPT-4 and Claude 3 for patch generation, achieving a 92% success rate across 178 known vulnerabilities including PyTorch CVEs. ML-guided fuzzing reduced discovery time by 85%. We deployed it as a microservices architecture with cost-optimized model selection to stay within the competition's compute budget. Several patches were accepted upstream into PyTorch and other open-source projects.

WildLLaMA
2023 · Conor Kotwasinski, Charles Zhou

Built a Northwestern University knowledge assistant by fine-tuning LLaMA 7B on ~1,000 instruction pairs sourced from the Daily Northwestern, r/Northwestern, Stack Exchange, and WikiHow. GPT-4 generated the university-specific Q&A pairs after GPT-3.5 proved too shallow, producing only short, verbatim extractions rather than the synthesized answers needed for instruction tuning. LoRA kept training to ~4% of parameters, preserving the base model's general capability (MMLU 34.18 vs. 35.1 baseline) while teaching domain-specific knowledge. DeepSpeed brought training time from 12 hours to 20 minutes. The main challenge was data quality: uncleaned HTML entities in training data caused the model to waste parameters learning tokens like &amp;, and the Daily Northwestern alone lacked the basic institutional facts needed for a university assistant, a gap we filled by mixing in Wikipedia articles under the Northwestern category.

Prior Experience

Leidos
Software Engineer Intern, Summer 2023 · Charlottesville, VA

Built tooling to streamline launching microservices across IRAD projects by exploiting shared structure in forked codebases. Adapted a microservice to subscribe to a Kafka topic carrying Link 16 messages, extract assets from track data, and republish them for display on a map within a C2 application.

Luabase
Blockchain Developer Intern, Summer 2022 · Chicago, IL

Built a data model over ~10M tweets per month in ClickHouse to surface sentiment and trend signals across crypto markets. Expanded a public figures database of 20K+ entries by linking Twitter accounts with on-chain activity for simultaneous tracking. Retrieved 170K labels for tokens, accounts, blocks, and transactions to improve query coverage, and grew the live and historical pricing database from 5 cryptocurrencies to 577 tokens. Prototyped an MLP regressor using scikit-learn trained on 20K Twitter-to-Ethereum address mappings to estimate account net worth, ultimately limited by the noisiness of the feature space.

Skills

Languages: C/C++ (primary), Rust, x86/ARM/PPC/SH4 assembly, Python, Java, TypeScript
Systems: Linux and Windows internals, RTOS development, QEMU/KVM (TCG frontend development, device modeling, migration), eBPF (kprobes, tracepoints, XDP), DMA and PCI device drivers, FPGA integration, JTAG/SWD debugging, UART/SPI/I2C bus protocols
Security: Binary analysis (Ghidra, IDA), fuzzing (AFL++, custom harnesses, coverage-guided full-system), static analysis (CodeQL), reverse engineering (firmware, RTOS, embedded), exploit development
Tools: GDB, perf, Wireshark, tcpdump, strace/ltrace, Make/CMake, Git, Docker
Networking: TCP/IP, UDP, multicast, Bluetooth LE, IEEE 802.15.4/Thread, LoRaWAN
ML/Data: PyTorch, LoRA/PEFT, DeepSpeed, ClickHouse, Kafka

Education

Northwestern University
Master's & Bachelor's of Computer Science, 2020–2024 · Evanston, IL

Founded the university CTF team. Blockchain Club executive board member. Teaching assistant for System Security, Digital Forensics, and Deep Learning. Coursework included low-level software development, microprocessor system design, operating systems, computer networking, electronics design, algorithms, database systems, wireless protocols, and generative models.

Interests & Direction

Most of my work so far has been at the firmware and OS level, but I'm increasingly drawn toward the hardware side of that boundary. HFT-Zero is my current outlet for this: a bare-metal x86-64 kernel in C++26 where the next milestone is a virtio-net driver and packet processing path targeting sub-microsecond latency. I'm actively learning Verilog/VHDL with the goal of working on projects at the hardware-software interface: custom accelerators, hardware-in-the-loop security validation, SoC prototyping on FPGA, and instrumentation that reaches below what software alone can observe. The long-term interest is in closing the gap between the people who design silicon and the people who write the code that runs on it.