Minutes of the AI workshop of January 30, 2025 (Frédéric Clavert)

Introduction to AI Workshops by Marcello Vitali-Rosati

The idea behind these workshops is to start a shared discussion and reflection on the implementation of algorithms from what is called AI in our practices of writing and publishing scientific articles.

In the digital humanities (DH) community, reflection on the implementation of algorithms dates back 70 years. Although the introduction of transformers has brought changes, it cannot be called a revolution. The objective of this workshop is to move away from the generalization that "AI = ChatGPT" and from the current trend of adopting mainstream applications without reflection.

These workshops aim to be a place for reflection on the theoretical and infrastructural questions raised by the implementation of AI-related technologies. They are also an opportunity to reflect on needs in this area and potentially to fund pilot projects and experiments within the framework of the Revue3.0 partnership.

Presentation by Frédéric Clavert of the "explain code" feature of the Journal of Digital History (JDH)

The Journal of Digital History was founded in 2020, with its first publication in 2021. The integration of AI technologies has focused more on readers than on authors.

Multilayer articles:

Format: Jupyter notebook (didactic format with code and markdown cells). The different layers are organized with a tagging system.

Narrative layer

Hermeneutic layer

Possibility to execute the article on MyBinder to test the author’s hypotheses.

Alternative: retrieve the source code from GitHub.

Reproducibility issue.

New design (beta phase): additional layer with "data&code"
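The tag-based layering described above can be sketched as follows. This is a minimal illustration, not the JDH implementation: it relies only on the fact that Jupyter stores cell tags under `metadata.tags` in the notebook JSON. The tag names `narrative` and `hermeneutics`, the helper `cells_for_layer`, and the example notebook are assumptions made for the example.

```python
import json


def cells_for_layer(notebook: dict, layer: str) -> list[dict]:
    """Return the cells whose metadata tags include the requested layer."""
    return [
        cell
        for cell in notebook.get("cells", [])
        if layer in cell.get("metadata", {}).get("tags", [])
    ]


# A tiny in-memory notebook in the nbformat 4 JSON layout, for illustration only.
# The tag names "narrative" and "hermeneutics" are hypothetical.
EXAMPLE_NOTEBOOK = json.loads("""
{
  "nbformat": 4,
  "nbformat_minor": 5,
  "metadata": {},
  "cells": [
    {"cell_type": "markdown", "metadata": {"tags": ["narrative"]},
     "source": ["The argument of the article."]},
    {"cell_type": "code", "metadata": {"tags": ["hermeneutics"]},
     "execution_count": null, "outputs": [],
     "source": ["counts = {}"]},
    {"cell_type": "markdown", "metadata": {"tags": ["hermeneutics"]},
     "source": ["How the counts were produced."]}
  ]
}
""")

narrative = cells_for_layer(EXAMPLE_NOTEBOOK, "narrative")
hermeneutic = cells_for_layer(EXAMPLE_NOTEBOOK, "hermeneutics")
print(len(narrative), len(hermeneutic))
```

A renderer built this way could show only the narrative cells by default and reveal the hermeneutic cells on request, since both live in the same notebook file.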

Latest integrations:

Upcoming integrations:

Article illustrating the functionalities:

Eriksson, M., Skotare, T., & Snickars, P. (2024). Tracking and tracing audiovisual reuse: Introducing the Video Reuse Detector. Journal of Digital History, 3(1). https://doi.org/10.1515/JDH-2024-0009?locatt=label:JDHFULL

The summary covers the narrative layer, not the hermeneutic layer (view renderer).

Comparison of several models shows that Gemini is the most effective for these tasks.

Elisabeth Guerard: These features are meant for readers, but also for reviewers and technical reviewers.

AI usage survey at the C2DH: it concludes that everyone uses or will use generative AI, mostly daily or weekly. The survey dates from a few months ago; the tools used are mainly ChatGPT and Microsoft Copilot/GitHub, which is provided by the university and easy to access.

Discussion

Servanne Monjour: What is the process for validating the results provided by the AI?

Elisabeth Guerard:

Marcello Vitali-Rosati:

Frédéric Clavert: The explanations are aimed at beginners, i.e., they give people with little or no digital literacy a starting point for understanding.

These systems should not be considered as giving a definitive answer.

The purpose of the tool should be probabilistic maieutics.

Interaction with a chatbot is a discussion that has no social value.

Nicolas Sauret: Chatbots impose a design (conversational mode): they mimic a social discussion when it is not one → this point needs further reflection.

Aurélien Berra: It is a question of interface: the metaphor of discussion is part of the success of these tools. Oracular dimension (click → response).

Response: The explanations given by the "explain code" feature will be different with each request but will often remain very similar.

Aurélien Berra:

Frédéric Clavert:

Aurélien Berra:

Frédéric Clavert:

The real work consists of dialoguing with the AI: validation comes after review by a human.

Nicolas Sauret: Why look for a "chatbot interface" when a static text generated in advance and integrated at publication can be sufficient?

And furthermore: what about a functionality for conversing with the article?
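The "static text generated in advance" option raised here can be sketched as a publication-time step. This is only an illustration under stated assumptions: the function names, the `generated_explanation` metadata key, and the placeholder generator are hypothetical, and a real setup would replace the placeholder with a call to a model.

```python
from typing import Callable


def attach_explanations(notebook: dict, explain: Callable[[str], str]) -> dict:
    """Store one generated explanation per code cell in the cell metadata.

    Meant to run once at publication time, so that the reader interface
    only has to display the stored text.
    """
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        # Notebook sources are stored as a list of lines (or a single string).
        source = "".join(cell.get("source", []))
        # "generated_explanation" is a hypothetical metadata key.
        cell.setdefault("metadata", {})["generated_explanation"] = explain(source)
    return notebook


def placeholder_explain(code: str) -> str:
    """Stand-in for a model call; it only counts the lines of the cell."""
    return f"This cell contains {len(code.splitlines())} line(s) of code."


notebook = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["Some prose."]},
        {"cell_type": "code", "metadata": {}, "source": ["x = 1\n", "print(x)"]},
    ]
}
published = attach_explanations(notebook, placeholder_explain)
print(published["cells"][1]["metadata"]["generated_explanation"])
```

Storing the text in a separate metadata field keeps it distinguishable from what the author wrote, and a stored explanation can be reviewed by a human before publication, which an on-demand response cannot.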

Frédéric Clavert: The tool is still in the reflection phase, particularly on the use of the "explain code" button. Already, the act of pressing the button creates a distinction between what is written by the author and the generated explanation.

Elisabeth Guerard: It is also the developers’ wish to experiment with the integration of a Flask API and to play with current technologies.

Servanne Monjour: The tool was implemented for readers, but who are these readers? What statistical elements do we have on the readership?

Frédéric Clavert: The JDH has data on its readership, but none centered on the AI functionality. What we know is that the time spent on a page/article is 7 to 9 minutes on average, compared to 1.5 minutes on average on the web. This means that readers stay and read. The new version (3) is appreciated.

Elisabeth Guerard (in the chat): With Matomo we can now track external links, which we could not see with Analytics, so the use of MyBinder may have been recorded.

Tools mentioned

MyBinder: https://mybinder.org/ (launches executable Jupyter notebooks from a Git repo)

Groq: https://groq.com/ (server for AI inference)

Matomo: https://fr.matomo.org/ (alternative to Google Analytics that also allows tracking of external links used by users on the site)