A Hybrid Approach for Detecting AI-Generated Text in English
This project addresses the challenge of distinguishing AI-generated from human-written English text. We propose a hybrid detection model that combines a training-based discriminative classifier (DeBERTa) with a training-free statistical method (DNA-DetectLLM) to improve detection robustness and performance across diverse domains and generation models.
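The idea of combining a classifier output with a statistical signal can be illustrated with a small score-fusion sketch. This is not the project's actual fusion rule, which is not described here. The function names (`fuse_scores`, `predict_label`), the weighted-average scheme, the sigmoid squashing, and the convention that a higher statistical score means "more likely AI-generated" are all assumptions made for illustration. The score convention of DNA-DetectLLM itself may differ.

```python
import math


def sigmoid(x: float) -> float:
    """Map a real-valued score to the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-x))


def fuse_scores(
    classifier_prob: float,
    stat_score: float,
    weight: float = 0.5,
    stat_center: float = 0.0,
    stat_scale: float = 1.0,
) -> float:
    """Hypothetical late fusion of two detector outputs.

    classifier_prob: probability of "AI-generated" from a trained
        classifier (for example a fine-tuned DeBERTa model).
    stat_score: raw score from a training-free detector; assumed here
        to be higher for AI-generated text.
    weight: share of the fused score given to the classifier.
    stat_center, stat_scale: assumed calibration constants that would
        be tuned on a validation set.
    """
    if not 0.0 <= classifier_prob <= 1.0:
        raise ValueError("classifier_prob must be in [0, 1]")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    if stat_scale <= 0.0:
        raise ValueError("stat_scale must be positive")

    # Squash the raw statistical score so both signals share a scale.
    stat_prob = sigmoid((stat_score - stat_center) / stat_scale)
    # Weighted average of the two probability-like signals.
    return weight * classifier_prob + (1.0 - weight) * stat_prob


def predict_label(fused_score: float, threshold: float = 0.5) -> str:
    """Turn a fused score into a label using an assumed threshold."""
    return "ai" if fused_score >= threshold else "human"


if __name__ == "__main__":
    fused = fuse_scores(classifier_prob=0.9, stat_score=0.0)
    print(fused, predict_label(fused))
```

In a sketch like this, `weight`, `stat_center`, `stat_scale`, and `threshold` would be chosen on held-out data. A learned combiner (for example logistic regression over both signals) is another possible design.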
Contributions:
- Conducted a literature review and reproduced key baseline methods
- Designed and implemented the full experimental pipeline
- Integrated the DeBERTa-v3-base classifier with DNA-DetectLLM statistical signals into a unified hybrid framework
- Built data processing, training, and evaluation pipelines
- Ran experiments and performed result and error analysis
References
This project builds upon the following works: