Andrés Neyem

Andrés Neyem

PUBLICATIONS

Automation of code reviews using AI models has garnered substantial attention in the software engineering community as a strategy to reduce the cost and effort associated with traditional peer review processes. These models are typically trained on extensive datasets of real-world code reviews that address diverse software development concerns, including testing, refactoring, bug fixes, performance optimization, and maintainability improvements. However, a notable limitation of these datasets is the under representation of code vulnerabilities, critical flaws that pose significant security risks, with security-focused reviews comprising a small fraction of the data. This scarcity of vulnerability-specific data restricts the effectiveness of AI models in identifying and commenting on security-critical code. To address this issue, we propose the creation of a synthetic dataset consisting of vulnerability-focused reviews that specifically comment on security flaws. Our approach leverages Large Language Models (LLMs) to generate human-like code review comments for vulnerabilities, using insights derived from code differences and commit messages. To evaluate the usefulness of the generated synthetic dataset, we plan to use it to fine-tune three existing code review models. We anticipate that the synthetic dataset will improve the performance of the original code review models.

Publisher: IEEE Transactions on Learning Technologies Link>

ABSTRACT

Software assistants have significantly impacted software development for both practitioners and students, particularly in capstone projects. The effectiveness of these tools varies based on their knowledge sources; assistants with localized domain-specific knowledge may have limitations, while tools, such as ChatGPT, using broad datasets, might offer recommendations that do not always match the specific objectives of a capstone course. Addressing a gap in current educational technology, this article introduces an AI Knowledge Assistant specifically designed to overcome the limitations of the existing tools by enhancing the quality and relevance of large language models (LLMs). It achieves this through the innovative integration of contextual knowledge from a local “lessons learned” database tailored to the capstone course. We conducted a study with 150 students using the assistant during their capstone course. Integrated into the Kanban project tracking system, the assistant offered recommendations using different strategies: direct searches in the lessons learned database, direct queries to a generative pretrained transformers (GPT) model, query enrichment with lessons learned before submission to GPT and large language model meta AI (LLaMa) models, and query enhancement with Stack Overflow data before GPT processing. Survey results underscored a strong preference among students for direct LLM queries and those enriched with local repository insights, highlighting the assistant's practical value. Furthermore, our linguistic analysis conclusively demonstrated that texts generated by the LLM closely mirrored the linguistic standards and topical relevance of university course requirements. This alignment not only fosters a deeper understanding of course content but also significantly enhances the material's applicability to real-world scenarios.

agencia nacional de investigación y desarrollo
Edificio de Innovación UC, Piso 2
Vicuña Mackenna 4860
Macul, Chile