Abstract
This paper examines Artificial Intelligence (AI) and Machine Learning (ML) through two distinct lenses. Section one, written through the lens of an academic librarian, discusses integrating Generative AI tools such as ChatGPT and Gemini into the classroom and the need for academic libraries to adapt and support faculty, researchers, and students in AI literacy. This section details the creation of an AI task force, a community of practice, and a salon series within the George Mason University Libraries as a model for responding to these challenges. Section two, written through the lens of researchers, evaluates the uses and misuses of AI and ML, including the strengths and weaknesses of various AI technologies and the ethical considerations surrounding their use, such as bias in training data, data ownership, and environmental impact. Together, these sections offer a comprehensive overview of AI and ML and highlight both the opportunities and challenges they present for researchers and patrons.
Introduction
When speaking with researchers and patrons who are curious about the potential of Artificial Intelligence (AI) or Machine Learning (ML) in their research or projects, it is important to first clearly define and differentiate the two terms. AI is a branch of computer science predicated on creating systems that exhibit some form of human intelligence, meaning systems that endeavor to acquire, understand, and apply knowledge and can mimic some aspects of reasoning.1,2 ML, by contrast, is a subdomain of AI that focuses on creating and deploying algorithms that allow systems to learn from data and make predictions based on it.1 Gressling points out that researchers need a basic understanding of AI and ML algorithms to keep current.1 Librarians in academic and special libraries, who typically work with researchers, and librarians in public libraries, who work with patrons, must acquire a working knowledge of core AI concepts and subdomains so they can guide researchers and patrons in using AI and AI tools effectively. Librarians also need to stay current with AI developments to adapt to the changing information landscape and remain relevant in their roles. This knowledge is particularly crucial in academic and special libraries, which support researchers who increasingly rely on AI in their work. Understanding AI is no longer optional for librarians; it is necessary to provide optimal service to researchers and patrons, who must also adapt to this evolving information ecosystem.
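To make that distinction concrete, the sketch below shows the ML pattern just described: an algorithm learns from data and then makes a prediction about a case it has not seen. This is a minimal, invented illustration using scikit-learn; the data are hypothetical.

```python
# A minimal sketch of machine learning as defined above: an algorithm
# "learns" a relationship from data, then predicts an unseen case.
# The hours/scores data below are invented for illustration.
from sklearn.linear_model import LinearRegression

hours = [[1], [2], [3], [4], [5]]   # hours studied (one feature per sample)
scores = [52, 58, 66, 71, 79]       # exam scores the system learns from

model = LinearRegression().fit(hours, scores)  # learning from the data
print(model.predict([[6]]))                    # predicting an unseen case
```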
Libraries’ response to Generative AI hype
Generative Artificial Intelligence (GAI) is a subdomain of AI that creates new content, such as text, images, videos, audio, and music, by learning the patterns and structures in existing data and generating original content that resembles the data it was trained on; it has no inherent restrictions on how it uses that data, even if the data contains copyrighted work.3 GAI disrupted libraries in several ways. First, it shifted user expectations.3 Researchers, including faculty and students at George Mason University (GMU), anticipated that we would offer resources and guidance on GAI tools such as ChatGPT and Gemini (formerly known as Bard) at a time when the university had no such guidance in place because the state of Virginia had no formal guidance in place. We had to adapt quickly to meet these needs. We brought our concerns to library administration and the library advisory board to convey the full scope of the disruption caused by GAI, its current and potential future impact on the field of librarianship, and the path we created to meet new user expectations. Second, GAI forced librarians to evolve immediately into facilitators of AI literacy and shepherds of AI tools, guiding researchers and patrons in understanding and responsibly using those tools. It also changed the nature of research. Researchers have increasingly used GAI tools to assist with literature and systematic reviews, data analytics, and writing and content creation. This has changed the research ecosystem and required librarians to adapt services to support these changes. We created an AI Task Force, an AI Community of Practice, and an AI Salon Series to determine how we could collectively adapt our services and upskill librarians and staff.
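To illustrate what "generating original content that resembles the training data" looks like in practice, here is a minimal sketch using the openly available GPT-2 model via the Hugging Face transformers library. The prompt is invented, and this is an illustration of the mechanism rather than an endorsement of any tool.

```python
# A minimal sketch of generative AI: a model trained on existing text
# produces new text resembling that data. Uses the open GPT-2 model.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator(
    "Academic libraries can support AI literacy by",  # invented prompt
    max_new_tokens=30,
)
print(result[0]["generated_text"])
```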
AI Task Force
The GMU Libraries AI Task Force was created in 2023 and is split into two subcommittees: LibGuide and Best Practices. The purpose of the task force is to create a public-facing infoguide for the GMU community on best practices and resources for using and researching AI and AI tools. It is tasked with creating a best practices document that describes the types of AI based on their functionalities and capabilities and how to address the privacy, confidentiality, and security of data pertaining to students, faculty, staff, and the GMU community. The document also covers best practices on transparency, accountability, responsibility, and ethics when AI is used in research. The best practices document is under construction and will be released internally and externally by mid-summer 2024. The infoguide can be accessed at https://infoguides.gmu.edu/Artificial-Intelligence.
AI Community of Practice
Starting an AI Community of Practice (CoP) within an academic library offers numerous benefits, including serving as a space for growth, learning, and collaboration on AI among librarians, researchers, and patrons.4 The space can provide all members with ongoing opportunities to learn about AI technologies, trends, and applications, which they can apply directly in their professional roles and personal lives. It can help develop critical AI skills that are increasingly relevant in today’s digital landscape. Additionally, it supports research and learning by assisting members in navigating AI resources, tools, and datasets, which can enhance the quality and scope of their research projects. Our Community of Practice focuses on using specific artificial intelligence tools for curriculum and research best practices, allowing members to gain hands-on experience with specific tools each semester. During the Fall 2023 and Spring 2024 semesters, members experimented with ChatGPT, Gemini (formerly known as Bard), Grammarly, and Google Labs in a virtual sandbox. Each session was held online synchronously, with our Social Sciences Librarian and me serving as community experts on these tools.
AI Salon Series
The term salon refers to a historical tradition originating in 17th- and 18th-century France, where people gathered in private homes to discuss ideas, art, literature, and politics. These gatherings were characterized by their informal, conversational nature, fostering open dialogue, the exchange of ideas, and intellectual debate.5 We leveraged that tradition to create an AI Salon Series, a space where library faculty, staff, administration, students, and the university community can engage in meaningful conversations about AI in an informal and collaborative environment. The AI Salon Series emphasizes the importance of dialogue, community, and the exchange of diverse perspectives and opposing viewpoints in advancing the understanding of AI. Each semester, various topics surrounding AI are selected, and library faculty and staff serve as moderators for the discussions, usually chosen for their knowledge of the topic. For example, our Open Educational Resources and Scholarly Communications Librarian, a copyright expert, guided the discussion on AI and Copyright, and the Science and Data Librarian guided a discussion on AI and Environmental Impact. Each session is open to everyone across the GMU community. The AI Salon Series enriches our discourse on AI and plays a crucial role in navigating the complexities of AI and its current impact while preparing our community for the future.
AI for researchers: The research interview
Creating a research interview for researchers contemplating using AI in their research involves adapting the standard reference interview process to address the specific needs associated with AI, its subdomains, and any AI tools that may prove useful to the research process or research output. Librarians should understand the research context fully by learning about the research project and how the researcher plans to incorporate AI. This includes identifying the specific problems AI will solve and the AI techniques or tools that can help find potential solutions. This understanding helps librarians determine the scope of the project, including the resources relevant to the research, and the level of involvement required, including the frequency and depth of assistance needed (embedded versus one-shot research assistance).
AI in the classroom
According to an article in Handshake Network Trends, a platform that analyzes data gathered from a network of students, universities, and recruiters, forty-eight percent of the graduating class of 2024 are worried about the impact that GAI will have on their careers.6 As universities around the world debated whether GAI would be allowed in the classroom because of fears of cheating, our Salon Series and some of our Community of Practice sessions discussed the dilemma of banning GAI tools rather than preparing students to join a workforce where employers will covet graduates with an AI skillset. At GMU, some departments and professors banned the use of these tools. However, I was fortunate to work with an English professor who was curious about GAI tools and their usefulness in the classroom for first-year experience (FYE) students. Before speaking about AI or ML to up-and-coming researchers such as FYE students, or before using AI or ML in the classroom as a research topic or as an aid in the research process, it is important to conduct a gap analysis to identify students’ existing knowledge and skills regarding AI. Identifying AI gaps allows librarians to curate and provide access to resources that address them, including books, journals, datasets, and other AI-related publications, as well as supplementary material such as massive open online courses (MOOCs) that students can use to further their AI development. Gap analysis results help librarians formulate the lesson plan, the structure of the session, the number of sessions needed for the semester, and the level of support necessary after the session (research consultations). I used a pre-assessment survey as the gap analysis tool, and its results, combined with the professor’s wishes for the session, informed the lesson plan, the activities for the session, the design of the post-assessment survey, and the number of sessions needed. We decided that two sessions were adequate and that we would compare traditional research processes to research with ChatGPT and Gemini. While students were initially excited about using the tools, they discovered that researching with GAI tools was far more challenging and time-consuming than researching without them.
AI projects
Researchers seeking help learning AI and ML may feel more positive about librarians who are actively working on AI projects or research. Librarians engaged in AI projects are more likely to be knowledgeable about the latest developments, tools, and best practices in the field, and researchers often value collaboration and learning from peers.7 Knowing that librarians are exploring and experimenting with AI and ML can build trust, as it reinforces the idea that librarians are reliable partners in research.7 Moreover, it can create opportunities for collaboration, including joint efforts where both the researcher and the librarian find synergy between their projects, which can lead to co-authored publications and grants. For example, we are working on AI projects at GMU Libraries in which students interested in AI contribute to the projects, including by conducting research. Those projects include a conversational agent called Mason’s Orientation Conversational Agent (MOCA), which is integrated into a three-dimensional augmented reality (AR) and virtual reality (VR) tour, and the Cosmology of Artificial Intelligence Project, a cosmological visualization of the AI field.
AI policies
Discussing AI policies with AI and ML researchers is crucial for several reasons. AI policies provide guidelines on the ethical use of AI, addressing issues such as bias, fairness, transparency, and accountability.8 For researchers building an AI tool, understanding the importance of ethics helps ensure that the tool is developed responsibly. Additionally, researchers using AI tools in their research must be aware of the legal frameworks and regulations governing AI and ML in order to comply with data protection laws, intellectual property rights, and other relevant laws. Librarians should be knowledgeable about the AI policies established by the federal government and by the state in which their institution resides, and should understand that, even when it is not explicitly stated, an AI policy is a living document, subject to change as AI and its tools evolve.
Uses and misuses of AI and machine learning
AI and ML have remarkable capabilities to support and advance research, and it is important for researchers to have a good working definition of these tools. Understanding what the tools are then becomes the foundation for understanding what they can and cannot do. Successful uses of AI and ML include a chatbot that lets medical students practice the conversational flow of a doctor-patient interaction, so they can begin to practice taking medical histories and drawing additional information from those histories and from test results.9 Other examples include using ML to recognize and group images, which can be applied to analyzing large numbers of images and placing them in predetermined categories, such as recognizing a growth or mass in medical imaging to assist with diagnosis. These are a few of the opportunities AI and ML provide to support the teaching mission of institutions as well as facilitate research.
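As a minimal sketch of the image-classification use case described above, the example below uses scikit-learn’s built-in handwritten-digits dataset as a stand-in for medical images; a real diagnostic system would be far more involved, but the pattern of sorting images into predetermined categories is the same.

```python
# A minimal sketch of ML image classification: sort images into
# predetermined categories, using the built-in digits dataset as a
# stand-in for medical images.
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

digits = load_digits()  # 8x8 grayscale images flattened to 64 features
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=42
)

# Learn from labeled examples, then categorize images the model never saw.
model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```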
However, care still needs to be taken when using these tools to understand not only what they are doing but also how they are doing it. In the summer of 2023, two lawyers were sanctioned for using ChatGPT to write a legal brief. The problem was not that they used the tool; it was that the tool created fake case citations that the lawyers included in the brief.10 Fake citations, or hallucinations as they are sometimes called, are a common concern in using generative AI tools. Another misstep involved a recipe-generating AI chatbot. As reported that same summer, a New Zealand grocery store’s recipe-generation bot was intended to help customers create recipes based on leftovers they had at home. The suggested recipes ranged from the odd, such as an Oreo vegetable stir-fry, to the deadly, such as a mixture that would produce chlorine gas.11
As the case of the fake citations shows, researchers must have a high-level understanding of how these tools work. With the fake citations, the tool recognized that the researchers were looking for a reference to a type of information. Often, if it cannot find an exact match, it builds what you need based upon the standard form of a citation, whether to a website, an article, or another source. This building or creation of something new can be genuinely helpful for generating text but problematic when it produces citations that do not exist. The recipe-generating chatbot worked with what it was given, and users were testing the limits of the system, exploring what odd data they could input and still get information out. In this “kicking the tires” on a system, we see a dangerous lack of guardrails.
Variety of technologies
So again, researchers need a high-level understanding of what these tools do and how they do it. But that understanding is sometimes hampered by language. A number of terms are used, sometimes interchangeably and sometimes with more discrete meanings. The language around this technology is fluid right now, and hype cycles contribute to that fluidity: existing technology gets rebranded with a new term and seems to drive renewed interest, or a subfield grows bigger than the field it is part of and is treated as the primary discipline. Terms we often see are artificial intelligence, machine learning, predictive analytics, text mining, and generative artificial intelligence.

While there is often some fluidity in terms, the definitions below differentiate the different tools. Artificial intelligence is a broad term for software or machines intended to mimic human intelligence; these systems are intended to make independent decisions, and AI is often used as a broad, catch-all term. Machine learning is a subfield of artificial intelligence concerned with algorithms that learn from input data and then generalize to unseen data in order to make predictions. A perhaps familiar example of machine learning is the recommendation engine. Although ML is actually a subfield of AI, until recently, when people talked about AI, what they were often describing was ML. Next on the continuum is predictive analytics, which moves past generalizing from the data fed into a system to making predictions based on that information. More recent conversation has been about generative artificial intelligence (Generative AI or Gen AI), tools that can create content, whether text or images. Some would disagree with these definitions and where the boundaries have been placed; others would disagree with having any boundaries at all, arguing that most of the terms can be used interchangeably. While we may not know what to call these things, we do understand what they do.
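Since the recommendation engine is the familiar example named above, here is a minimal sketch of one form it can take, item-based collaborative filtering with cosine similarity; the ratings matrix is invented for illustration.

```python
# A minimal sketch of a recommendation engine: item-based collaborative
# filtering. The system generalizes from observed ratings to suggest an
# item the user has not yet rated. The ratings matrix is invented.
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Rows are users, columns are items; 0 means "not yet rated."
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
])

item_similarity = cosine_similarity(ratings.T)  # item-to-item similarity

# Score unrated items for user 0 by weighting their ratings by similarity.
user = ratings[0].astype(float)
scores = item_similarity @ user
scores[user > 0] = -np.inf  # do not re-recommend items already rated
print("Recommend item:", int(np.argmax(scores)))
```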
What these different tools and systems can do can be summed up in five categories: prediction, classification, grouping, “finding weird stuff” (anomaly detection), and generation.12 These categories are reflected in the names of the tools. Researchers should begin by asking whether what they are trying to do falls into one or a combination of these categories; that can help start to answer the question of whether AI or machine learning will benefit a project.
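As one sketch of the “finding weird stuff” category, the example below uses scikit-learn’s IsolationForest on synthetic data; the planted outliers are invented for illustration.

```python
# A minimal sketch of "finding weird stuff": anomaly detection with an
# isolation forest. The data, including two planted outliers, are synthetic.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
normal = rng.normal(loc=0.0, scale=1.0, size=(200, 2))  # typical points
weird = np.array([[8.0, 8.0], [-7.0, 9.0]])             # planted outliers
data = np.vstack([normal, weird])

# The model learns what "typical" looks like, then flags departures.
detector = IsolationForest(contamination=0.01, random_state=0).fit(data)
labels = detector.predict(data)  # +1 = normal, -1 = anomalous
print("Flagged points:\n", data[labels == -1])
```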
Ethics and expense
In addition to understanding the limitations of a system in order to account for them, it is also important for researchers to consider the ethical implications of these tools. One ethical consideration is where the training data for a system came from. There is a vast wealth of data available that can be used to train new systems, but the first question should be whether permission was obtained to use the data in this way. Numerous concerns have been raised about copyrighted works of fiction being used in training datasets that can then be used to create “new” works in the style of a particular author.13 Another source of training data can be the vast amount of journal literature in licensed article databases: can data be extracted from such a system, and if so, what type of analysis is permitted with that data?
And if all the proper permissions have been obtained to use the data, what biases may be present in it? Here, researchers need to be expansive in their thinking about biases. Understandably, when people hear “bias,” they immediately think of something nefarious, but bias is also reflective of a perspective. When a dataset was collected, what was the purpose for which it was collected? That purpose influences what questions were asked as well as what questions were not asked. A dataset about the frequency and types of accidents that require emergency room visits will look different for a survey of a rural community than for a suburban community, or for a dataset representing both communities. There is no such thing as bias-proof data. What is important is that researchers identify and mitigate that bias by knowing what data is and is not in the dataset. Especially in the case of generated content, are researchers asking the system to create a “fact,” or are they asking it for an educated guess that humans will then need to test and verify?
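One practical way to act on this advice is to audit a dataset’s composition before modeling. The sketch below does so with pandas; the file name and column names (“community_type,” “accident_type”) are hypothetical placeholders.

```python
# A minimal sketch of auditing dataset composition for perspective/bias.
# "er_visits.csv" and its column names are hypothetical placeholders.
import pandas as pd

df = pd.read_csv("er_visits.csv")

# Who is represented? Share of records from each community type.
print(df["community_type"].value_counts(normalize=True))

# What was captured? Accident categories broken out by community, which
# hints at whether a model trained on one group will transfer to another.
print(pd.crosstab(df["community_type"], df["accident_type"], normalize="index"))
```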
Next, how open or closed is the system to which a researcher will be adding data? Some tools are designed to take the data you input, co-mingle it with other data, and provide results based on that combination. There are valid reasons to choose a tool that operates in this fashion, but the researcher must be aware that this is what it is doing. This is especially important to understand if a researcher has permission to use an AI tool to examine the data but not permission to share the data with others: a co-mingling of data would constitute sharing it. Universities and other research institutions are becoming increasingly alert to intellectual property created within their walls being shared outside the organization in ways they did not intend and would not have authorized.
Finally, researchers should think through the expense of AI and ML systems, both the financial cost and the environmental cost. There are, and will continue to be, free versions of various AI tools with which users can experiment, providing a low barrier to entry for researchers exploring the power of a tool. However, greater access to the tools comes at an increasingly higher financial cost, and these costs can exacerbate existing inequities in research funding. Will research areas that do not commonly receive extensive grant funding, or researchers from institutions that lack the financial resources for such support, be unable to take advantage of these new tools? Or will they have to rely on free or lower-cost tools, where researchers must be willing to exchange access to their data for use of the tools?
There is also increasing discussion of the environmental impacts of these tools and systems. AI, and particularly generative AI, uses significant amounts of computing power, and as the use of these tools becomes more widespread, the need for computing power will only increase, bringing with it an accompanying increase in water and electricity use.14,15 We are still in the early days of these conversations, but the desire for access to these tools will need to be factored into many institutions’ stated goals and initiatives around climate change.
A solid grounding in the various types of tools and systems available to researchers, and in how to think through their strengths and weaknesses, can help researchers make the greatest use of them. Use of these tools is at a moment of play, where users have the opportunity to experiment and try things. There is also value in trying to game the systems, stress them, or break them: discovering where guardrails need to be put in place helps improve the systems, just as the analysis of large-scale datasets can help solve real-world problems. The most important thing for researchers to understand is that AI, ML, and other such systems are just tools, to be wielded, hopefully appropriately, and hopefully toward the creation of new knowledge.
Statements and declarations
Footnotes
Conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
