Paper Title
Competence-Level Prediction and Resume & Job Description Matching Using Context-Aware Transformer Models
Paper Authors
Paper Abstract
This paper presents a comprehensive study on resume classification that aims to significantly reduce the time and labor needed to screen an overwhelming number of applications, while improving the selection of suitable candidates. A total of 6,492 resumes are extracted from 24,933 job applications for 252 positions classified into four levels of experience for Clinical Research Coordinators (CRC). Each resume is manually annotated with its most appropriate CRC position by experts, with guidelines established through several rounds of triple annotation. As a result, a high Kappa score of 61% is achieved for inter-annotator agreement. Given this dataset, novel transformer-based classification models are developed for two tasks: the first task (T1) takes a resume and classifies it to a CRC level, and the second task (T2) takes both a resume and the job description applied to and predicts whether the application is suited to the job. Our best models, using section encoding and multi-head attention decoding, give results of 73.3% on T1 and 79.2% on T2. Our analysis shows that the prediction errors are mostly made among adjacent CRC levels, which are hard for even experts to distinguish, implying the practical value of our models in real HR platforms.
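As a rough illustration of how an agreement score for triple annotation can be computed, the sketch below implements Fleiss' kappa in plain Python. The abstract does not say which Kappa variant was used, so the choice of Fleiss' kappa is an assumption here, and the function name `fleiss_kappa`, the level labels, and the toy ratings are all hypothetical.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for items that were each labeled by the same number of raters.

    ratings: list of per-item label lists, e.g. [["L1", "L1", "L2"], ...]
    categories: every label that may appear in ratings
    Assumes at least two raters per item and at least two labels in use
    (otherwise the denominator is zero).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number of ratings")

    # counts[i][c] = number of raters who assigned label c to item i
    counts = [Counter(item) for item in ratings]

    # Observed agreement: fraction of agreeing rater pairs, averaged over items
    per_item = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall share of each label
    shares = [
        sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories
    ]
    p_chance = sum(s * s for s in shares)

    return (p_observed - p_chance) / (1 - p_chance)


# Toy data (not from the paper): four resumes, three annotators each,
# with placeholder level labels L1..L4.
levels = ["L1", "L2", "L3", "L4"]
toy_ratings = [
    ["L1", "L1", "L1"],
    ["L2", "L2", "L3"],
    ["L3", "L3", "L3"],
    ["L4", "L3", "L4"],
]
print(round(fleiss_kappa(toy_ratings, levels), 3))
```

In the toy data the disagreements fall between neighboring levels, which is the kind of confusion the abstract reports as hard even for experts.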