On Academic Integrity

Categories: academic, writing

A plea for academic integrity.
Author

Marcel Turcotte

Published

August 27, 2025

The recent evaluation of a thesis prompted me to share some thoughts on the topic of Writing Your Thesis and to create this document on academic integrity.

While reviewing the thesis, I identified some recurring patterns. Initially, I encountered numerous references where the connection between the cited work and the claim it supposedly supported was unclear. Furthermore, as is often the case when students write their first draft, many citations were not canonical, meaning they were not the original sources that first established the facts. More concerning was the discovery of four fabricated references. To verify these, I searched Scopus for each reference by title but found no results. I then searched for publications by the listed authors and was unable to locate any works with the given co-authors. Finally, I examined the table of contents of the cited issue, but found no corroboration. Although four fabricated citations might seem insignificant at first glance, they can have serious implications.

Why academic integrity matters

Note

If I have seen further it is by standing on the shoulders of Giants.

Isaac Newton (1642–1727)

Academic work almost always builds on the work of others. Intentional or not, errors in publications can lead to a cascade of events that eventually undermine the credibility of many publications. A prominent example of this is the work of H.M. Krishna Murthy, a former researcher at the University of Alabama at Birmingham. Allegedly, he fabricated data for 12 protein structures deposited in the Protein Data Bank and published in 10 papers that he later retracted. This alleged fraud affected 449 papers across a wide range of topics, from dengue to polymerase, as detailed in Wikipedia and its citations.

In recent years, academic institutions have increasingly made defended theses available on their websites. This open access allows individuals to explore high-quality research without incurring subscription fees. Since these theses undergo rigorous examination, they typically exhibit both exemplary literary quality and robust scientific content, making them ideal resources for training large language models. However, this also implies that any errors present in these documents could be inadvertently perpetuated by the models.

If artificial intelligence (AI) can aid scientific research, it can similarly assist in identifying errors within publications. A recent article in Nature details two ongoing initiatives, The Black Spatula Project and YesNoError, which employ AI to detect inaccuracies in published works (Gibney 2025). The Black Spatula Project outlines various types of errors that AI can possibly identify.

  1. Mathematical and Numerical Errors: These include data inconsistencies and calculation mistakes.
  2. Methodological Issues: This category encompasses problematic or inconsistent methodological approaches.
  3. Writing and Logical Issues: This pertains to incorrect interpretations of results and conclusions that lack sufficient support.
  4. Discrepancies in Figures and Tables: This refers to errors in labeling or formatting and mismatches between figures/tables and the accompanying narrative.
  5. Citation Errors: These involve invalid or missing citations.
  6. Minor Issues: This includes grammatical errors, typographical mistakes, and incorrect numbering of tables or figures.

In a climate where government oversight (at least in the U.S.) raises concerns about the long-term autonomy of universities, it is conceivable that AI tools could be employed to systematically analyze the academic output of institutions, including the theses they publish online.

An employer with a discerning eye might meticulously examine a candidate’s thesis, ultimately deciding to hire them based on demonstrated expertise. In instances where a candidate possesses good machine learning skills but lacks formal training in biology, leveraging generative artificial intelligence models to address such gaps may be justifiable. However, it is imperative for candidates to transparently acknowledge their reliance on AI support. They must rigorously verify their claims by consulting original sources to ensure that the cited publications accurately substantiate their assertions. Furthermore, it is crucial that candidates include only the material they fully comprehend and can effectively defend. If an employer is impressed by a candidate’s biological analysis only to later discover the candidate’s insufficient proficiency in biology, it could lead to significant consequences. This scenario not only jeopardizes the candidate’s position but could also damage the reputation of their academic institution, potentially influencing future hiring decisions from that university.

Recent global political events and the COVID-19 pandemic have contributed to a decline in public trust in science and scientists. This assertion is corroborated by a report from the Pew Research Center, which analyzes survey data from the American population. Media coverage of academic integrity breaches further exacerbates public skepticism, potentially resulting in reduced funding and other adverse effects.

Let’s return to our four fabricated references. Robert F. Kennedy Jr., as U.S. Secretary of Health and Human Services, led the MAHA (“Make America Healthy Again”) report, which included fabricated or misrepresented references. This finding significantly undermined the report’s credibility.

  • RFK Jr’s ‘Maha’ report found to contain citations to nonexistent studies, by Joseph Gedeon in The Guardian on May 29, 2025.
    • “The 73-page ‘Make America healthy again’ report – which was commissioned by the Trump administration to examine the causes of chronic illness, and which Kennedy promoted as ‘gold-standard’ science backed by more than 500 citations – includes references to seven studies that appear to be entirely invented, and others that the researchers say have been mischaracterized.”

The case of Claudine Gay has been widely publicized.

Politicians.

AI-generated content has gained attention not only in academic circles but also in legal contexts, as evidenced by its inclusion in various court filings.

Steps towards a policy

On August 15, 2025, I used OpenAI ChatGPT 5 to generate a draft policy. The prompt was: “Write a draft policy on the use of generative AI for writing a master’s thesis in computer science at the University of Ottawa. Make sure to include relevant references to recent scholar publications.” It’s important to note that this draft policy was generated by AI and has not been reviewed by the professors of the school. It is not meant to be authoritative. However, due to the nature of these generative models, the policy represents a consensus of current views on the use of AI. In my opinion, it serves as a good starting point for further conversations.

Policy on the Use of Generative AI in Master’s Theses (Computer Science)

1) Scope and relationship to existing rules

This policy applies to all Master’s theses (project- or research-based) submitted to the School of Electrical Engineering and Computer Science (EECS). It complements, and does not replace, University regulations on theses and academic integrity; in case of conflict, University-level rules prevail. (University of Ottawa)

2) Definitions

  • Generative AI (GenAI): Systems that generate text, code, images, or other content (e.g., ChatGPT, Claude, CoPilot).
  • Assisted content: Text, code, figures, or analyses produced or materially revised with GenAI.
  • Undisclosed AI ghostwriting: Presenting GenAI output as one’s own original work without disclosure.

3) Permitted uses (with disclosure)

Students may use GenAI to support, not replace, scholarly work, provided all uses are documented (see §7):

  • Idea generation, outlining, language polishing, copy-editing, and formatting.
  • Boilerplate code scaffolding, refactoring, or test-case generation, provided the student understands, verifies, and cites sources for any non-trivial algorithms.
  • Literature triage (e.g., summarizing abstracts), with all claims verified against original sources.
  • Routine visual improvements (e.g., rewording figure captions) without altering scientific meaning.

Rationale: Leading publication bodies allow GenAI use with transparency and accountability, and forbid listing AI systems as authors. (acm.org, ieee-ras.org, publicationethics.org, icmje.org)

4) Prohibited uses

  • Listing GenAI as an author or co-author. (acm.org)
  • Undisclosed AI ghostwriting of substantial scholarly content (e.g., related work, methods, results, or discussion).
  • Fabricating or padding citations, data, results, or code produced or “confabulated” by GenAI. Empirical work shows elevated hallucination and reference errors; students must independently verify all claims and references. (jmir.org, arXiv, Nature)
  • Uploading restricted or confidential data (e.g., unpublished datasets, non-public code, or data under agreements) to external GenAI services without explicit authorization.
  • Relying on AI-generated content detectors as a defense or accusation. Detectors are unreliable and must not be used as sole evidence in misconduct processes. (PMC, prodev.illinoisstate.edu, Artificial intelligence)

5) Authorship and responsibility

  • Human authorship is required; accountability for accuracy, originality, and research integrity remains with the student and supervisory committee. This mirrors ICMJE/COPE positions that AI cannot fulfill authorship criteria. (icmje.org, publicationethics.org)

6) Data, privacy, and research compliance

  • Students must comply with data-use agreements, ethics approvals, and security requirements before using GenAI tools.
  • When in doubt, consult the supervisor and follow University guidance on integrity and responsible conduct of research. (University of Ottawa)

7) Mandatory disclosure in the thesis

Every thesis that used GenAI must include an “AI Use and Verification Statement” (end of Preface or Methods), covering:

  • Tools and versions used; where in the thesis they were used (e.g., copy-editing, code refactoring, literature triage).
  • Student verification steps (e.g., re-deriving proofs, unit tests, re-running experiments, manual reference checks).
  • Data handling safeguards (e.g., anonymization, on-prem or local models).
    Suggested format is modeled on ACM/IEEE transparency language. (acm.org, ieee-ras.org)

Appendix (non-public submission to the supervisor/examiner upon request)

  • A minimal AI log capturing date, task, prompt summary, and kept outputs for substantive uses (not routine grammar). This supports oral defense and academic integrity review, recognizing detector limits. (prodev.illinoisstate.edu)
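
For illustration only, such a log could be kept as JSON Lines, one record per substantive use, so entries are easy to append during the project and easy to review at the defense. The `log_ai_use` helper and its field names below are hypothetical, not something the draft policy prescribes.

```python
# Minimal sketch of an AI-use log kept as JSON Lines (one JSON object
# per line). Field names are illustrative only.
import json
from datetime import date


def log_ai_use(path: str, task: str, prompt_summary: str, output_kept: bool) -> dict:
    """Append one AI-use record to a JSON Lines log and return it."""
    entry = {
        "date": date.today().isoformat(),
        "task": task,
        "prompt_summary": prompt_summary,
        "output_kept": output_kept,
    }
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return entry
```

An append-only, plain-text format like this keeps the overhead close to zero while still producing a record that a supervisor or examiner can read without special tooling.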

8) Code and computational results

  • Any AI-assisted code must be audited by the student: add docstrings, cite algorithmic sources, supply unit/integration tests, and ensure license compatibility.
  • Reproducibility requirements (scripts, seeds, environment files) remain unchanged; AI assistance does not excuse missing artifacts.
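
To make the audit requirement concrete, here is a hypothetical example of what an AI-assisted helper might look like after a student's audit: a docstring stating intent, a note citing the algorithmic source and acknowledging the AI-suggested draft, and behavior pinned down by tests. The function itself is invented for illustration.

```python
# Hypothetical example of an audited, AI-assisted helper function.

def moving_average(values: list[float], window: int) -> list[float]:
    """Simple moving average over a sliding window of size `window`.

    Standard textbook algorithm (see, e.g., Wikipedia, "Moving average").
    Initial draft suggested by a GenAI tool; reviewed, documented, and
    re-tested by the author.
    """
    if window < 1 or window > len(values):
        raise ValueError("window must be in [1, len(values)]")
    return [
        sum(values[i : i + window]) / window
        for i in range(len(values) - window + 1)
    ]
```

The point is not the algorithm but the audit trail: the docstring, the source citation, the explicit input validation, and accompanying unit tests are what turn an AI suggestion into work the student can defend.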

9) Citations and literature review

  • All references must be verified from primary sources. Students must not rely on AI-generated citations or summaries without cross-checking; known hallucination risks and reference inaccuracies must be mitigated via manual verification. (jmir.org, arXiv)

10) Assessment and academic integrity process

  • Supervisors may request the AI log and conduct oral checks (e.g., explain code choices, reproduce derivations) to establish authorship and understanding, approaches recommended in current higher-education guidance given detector unreliability. (Artificial intelligence)
  • Alleged misconduct will be addressed under University policies; use of GenAI per se is not misconduct, but undisclosed or deceptive use is. (uottawa.libguides.com)

11) Supervisor and committee responsibilities

  • Discuss GenAI use at project start; agree on tool scope, data safeguards, and disclosure expectations.
  • Encourage formative uses that build skill (e.g., critique AI drafts, compare with canonical sources).
  • Ensure students complete required integrity/RCR training. (University of Ottawa)

12) Template: “AI Use and Verification Statement”

I used generative AI tools as follows: (i) copy-editing of Chapters 2–4; (ii) code refactoring suggestions for the data loader; (iii) summarizing article abstracts during initial screening. No sections reporting novel results were drafted by AI. I verified all references against the primary sources and re-implemented/validated all code. No confidential or restricted data were uploaded to external services. Tools and versions: [list]. Prompts and representative outputs are available to the supervisor upon request.



Prompts

This is a prompt that I frequently use in my workflow. To preserve the original content’s integrity, I’ve found it most effective to apply the prompt to one paragraph at a time. However, I always review and revise the generated text to ensure it aligns with my personal writing style.

Note

Improve clarity and flow

Please rephrase the following text to enhance clarity, coherence, and flow. Aim for a tone appropriate for an academic audience with a strong background in bioinformatics, computer science, and machine learning. Focus on improving the logical structure, ensuring technical precision, and eliminating ambiguity. Where applicable, streamline the content to convey complex ideas more succinctly without sacrificing accuracy or depth of information.

References

Gibney, Elizabeth. 2025. “AI Tools Are Spotting Errors in Research Papers: Inside a Growing Movement.” Nature, March. https://doi.org/10.1038/d41586-025-00648-5.