About
I work at the intersection of Trust & Safety, adversarial ML, and multimodal model training — building pipelines that move from research ideas to production-grade systems.
I care about measurable reliability: turning ambiguous policy requirements into datasets, training objectives (mid-/post-training), and evaluation suites that are reproducible, auditable, and fast to iterate on.
- Video moderation: end-to-end automation for review, labeling, and quality control.
- Model security: prompt security, red-teaming, and adversarial robustness.
- Certified robustness: provable guarantees & verification for safety-critical settings.
Background
- Ph.D. (CSE), University of Connecticut (Sep 2022–Aug 2025).
- Ph.D. (CS), Illinois Institute of Technology (2021–2022).
- B.S., Honors Science Program (Physics), Xi’an Jiaotong University (2014–2018).
Research themes: prompt security, adversarial attacks/defenses, and certified robustness — now applied to large-scale multimodal moderation.
What I build
- LLM/VLM-assisted review pipelines for real-world policy enforcement and reliability.
- Automated labeling, large-scale dataset construction, and mid-/post-training for production models.
- Prompt security, adversarial ML, and certified robustness — turning research into deployable evals and guardrails.
Selected publications
Full list on Google Scholar.
Contact
Email: hanbin.hong1@bytedance.com · LinkedIn · GitHub · Scholar