An AI tutor helped Harvard students learn more physics in less time https://hechingerreport.org/proof-points-ai-tutor-harvard-physics/ Mon, 16 Sep 2024

A student’s view of PS2 Pal, the AI tutor used in a learning experiment inside Harvard’s physics department. (Screenshot courtesy of Gregory Kestin)

We are still in the early days of understanding the promise and peril of using generative AI in education. Very few researchers have evaluated whether students are benefiting, and one well-designed study showed that using ChatGPT for math actually harmed student achievement.

The first scientific proof I’ve seen that ChatGPT can actually help students learn more was posted online earlier this year. It’s a small experiment, involving fewer than 200 undergraduates.  All were Harvard students taking an introductory physics class in the fall of 2023, so the findings may not be widely applicable. But students learned more than twice as much in less time when they used an AI tutor in their dorm compared with attending their usual physics class in person. Students also reported that they felt more engaged and motivated. They learned more and they liked it. 

A paper about the experiment has not yet been published in a peer-reviewed journal, but other physicists at Harvard University praised it as a well-designed experiment. Students were randomly assigned to learn a topic as usual in class, or stay “home” in their dorm and learn it through an AI tutor powered by ChatGPT. Students took brief tests at the beginning and the end of class, or their AI sessions, to measure how much they learned. The following week, the in-class students learned the next topic through the AI tutor in their dorms, and the AI-tutored students went back to class. Each student learned both ways, and for both lessons – one on surface tension and one on fluid flow –  the AI-tutored students learned a lot more. 

To avoid AI “hallucinations,” the tendency of chatbots to make up stuff that isn’t true, the AI tutor was given all the correct solutions. But other developers of AI tutors have also supplied their bots with answer keys. Gregory Kestin, a physics lecturer at Harvard and developer of the AI tutor used in this study, argues that his effort succeeded while others have failed because he and his colleagues fine-tuned it with pedagogical best practices. For example, the Harvard scientists instructed this AI tutor to be brief, using no more than a few sentences, to avoid cognitive overload. Otherwise, he explained, ChatGPT has a tendency to be “long-winded.”

The tutor, which Kestin calls “PS2 Pal,” after the Physical Sciences 2 class he teaches, was told to only give away one step at a time and not to divulge the full solution in a single message. PS2 Pal was also instructed to encourage students to think and give it a try themselves before revealing the answer. 
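The article does not publish PS2 Pal’s actual prompt, but the constraints described above – a supplied worked solution, brief replies, one step per message, encouragement to try first – map naturally onto a system prompt for a chat-completion API. The sketch below is a hypothetical illustration of that setup; the model name, prompt wording and function names are my assumptions, not the Harvard team’s code.

```python
# Hypothetical sketch of a tutor prompt built from the constraints described
# in the article; not the actual PS2 Pal implementation.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

TUTOR_PROMPT = """You are a friendly physics tutor.
The instructor's worked solution is provided below; rely on it and never contradict it.
Keep every reply to a few sentences to avoid cognitive overload.
Reveal at most one step of the solution per message, never the full solution.
Before revealing a step, encourage the student to think and attempt it themselves.

Instructor's solution:
{solution}
"""

def tutor_reply(solution: str, conversation: list[dict]) -> str:
    """Return the tutor's next message given the prior student/assistant turns."""
    messages = [{"role": "system", "content": TUTOR_PROMPT.format(solution=solution)}]
    messages.extend(conversation)  # e.g. [{"role": "user", "content": "I'm stuck on part (a)"}]
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model choice
        messages=messages,
    )
    return response.choices[0].message.content
```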

Unguided use of ChatGPT, the Harvard scientists argue, lets students complete assignments without engaging in critical thinking. 

Kestin doesn’t deliver traditional lectures. Like many physicists at Harvard, he teaches through a method called “active learning,” where students first work with peers on in-class problem sets as the lecturer gives feedback. Direct explanations or mini-lectures come after a bit of trial, error and struggle. Kestin sought to reproduce aspects of this teaching style with the AI tutor. Students toiled on the same set of activities and Kestin fed the AI tutor the same feedback notes that he planned to deliver in class.

Kestin provocatively titled his paper about the experiment, “AI Tutoring Outperforms Active Learning,” but in an interview he told me that he doesn’t mean to suggest that AI should replace professors or traditional in-person classes. 

“I don’t think that this is an argument for replacing any human interaction,” said Kestin. “This allows for the human interaction to be much richer.”

Kestin says he intends to continue teaching through in-person classes, and he remains convinced that students learn a lot from each other by discussing how to solve problems in groups. He believes the best use of this AI tutor would be to introduce a new topic ahead of class – much like professors assign reading in advance. That way students with less background knowledge won’t be as behind and can participate more fully in class activities. Kestin hopes his AI tutor will allow him to spend less time on vocabulary and basics and devote more time to creative activities and advanced problems during class.

Of course, the benefits of an AI tutor depend on students actually using it. Students have often been reluctant to use earlier generations of education technology and computerized tutors. In this experiment, the “at-home” sessions with PS2 Pal were scheduled and proctored over Zoom. It’s not clear that even highly motivated Harvard students will find it engaging enough to use regularly on their own initiative. Cute emojis – another element that the Harvard scientists prompted their AI tutor to use – may not be enough to sustain long-term interest.

Kestin’s next step is to test the tutor bot for an entire semester. He’s also been testing PS2 Pal as a study assistant with homework. Kestin said he’s seeing promising signs that it’s helpful for basic but not advanced problems. 

The irony is that AI tutors may not be that effective at what we generally think of as tutoring. Kestin doesn’t think that current AI technology is good at anything that requires knowing a lot about a person, such as what the student already learned in class or what kind of explanatory metaphor might work.

“Humans have a lot of context that you can use along with your judgment in order to guide a student better than an AI can,” he said. In contrast, AI is good at introducing students to new material because you only need “limited context” about someone and “minimal judgment” for how best to teach it. 

Contact staff writer Jill Barshay at (212) 678-3595 or barshay@hechingerreport.org.

This story about an AI tutor was written by Jill Barshay and produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education. Sign up for Proof Points and other Hechinger newsletters.

Kids who use ChatGPT as a study assistant do worse on tests https://hechingerreport.org/kids-chatgpt-worse-on-tests/ Mon, 02 Sep 2024

Does AI actually help students learn? A recent experiment in a high school provides a cautionary tale. 

Researchers at the University of Pennsylvania found that Turkish high school students who had access to ChatGPT while doing practice math problems did worse on a math test compared with students who didn’t have access to ChatGPT. Those with ChatGPT solved 48 percent more of the practice problems correctly, but they ultimately scored 17 percent worse on a test of the topic that the students were learning. 

A third group of students had access to a revised version of ChatGPT that functioned more like a tutor. This chatbot was programmed to provide hints without directly divulging the answer. The students who used it did spectacularly better on the practice problems, solving 127 percent more of them correctly compared with students who did their practice work without any high-tech aids. But on a test afterwards, these AI-tutored students did no better. Students who just did their practice problems the old-fashioned way — on their own — matched their test scores.

The researchers titled their paper, “Generative AI Can Harm Learning,” to make clear to parents and educators that the current crop of freely available AI chatbots can “substantially inhibit learning.” Even a fine-tuned version of ChatGPT designed to mimic a tutor doesn’t necessarily help.

The researchers believe the problem is that students are using the chatbot as a “crutch.” When they analyzed the questions that students typed into ChatGPT, students often simply asked for the answer. Students were not building the skills that come from solving the problems themselves. 

ChatGPT’s errors also may have been a contributing factor. The chatbot only answered the math problems correctly half of the time. Its arithmetic computations were wrong 8 percent of the time, but the bigger problem was that its step-by-step approach for how to solve a problem was wrong 42 percent of the time. The tutoring version of ChatGPT was directly fed the correct solutions and these errors were minimized.

A draft paper about the experiment was posted on the website of SSRN, formerly known as the Social Science Research Network, in July 2024. The paper has not yet been published in a peer-reviewed journal and could still be revised. 

This is just one experiment in another country, and more studies will be needed to confirm its findings. But this experiment was a large one, involving nearly a thousand students in grades nine through 11 during the fall of 2023. Teachers first reviewed a previously taught lesson with the whole classroom, and then their classrooms were randomly assigned to practice the math in one of three ways: with access to ChatGPT, with access to an AI tutor powered by ChatGPT or with no high-tech aids at all. Students in each grade were assigned the same practice problems with or without AI. Afterwards, they took a test to see how well they learned the concept. Researchers conducted four cycles of this, giving students four 90-minute sessions of practice time in four different math topics to understand whether AI tends to help, harm or do nothing.

ChatGPT also seems to produce overconfidence. In surveys that accompanied the experiment, students said they did not think that ChatGPT caused them to learn less even though they had. Students with the AI tutor thought they had done significantly better on the test even though they did not. (It’s also another good reminder to all of us that our perceptions of how much we’ve learned are often wrong.)

The authors likened the problem of learning with ChatGPT to autopilot. They recounted how an overreliance on autopilot led the Federal Aviation Administration to recommend that pilots minimize their use of this technology. Regulators wanted to make sure that pilots still know how to fly when autopilot fails to function correctly. 

ChatGPT is not the first technology to present a tradeoff in education. Typewriters and computers reduce the need for handwriting. Calculators reduce the need for arithmetic. When students have access to ChatGPT, they might answer more problems correctly, but learn less. Getting the right result to one problem won’t help them with the next one.

This story about using ChatGPT to practice math was written by Jill Barshay and produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education. Sign up for Proof Points and other Hechinger newsletters.

Researchers combat AI hallucinations in math https://hechingerreport.org/proof-points-combat-ai-hallucinations-math/ Mon, 26 Aug 2024

Two University of California, Berkeley, researchers documented how they tamed AI hallucinations in math by asking ChatGPT to solve the same problem 10 times. Credit: Eugene Mymrin/ Moment via Getty Images

One of the biggest problems with using AI in education is that the technology hallucinates. That’s the word the artificial intelligence community uses to describe how its newest large language models make up stuff that doesn’t exist or isn’t true. Math is a particular land of make-believe for AI chatbots. Several months ago, I tested Khan Academy’s chatbot, which is powered by ChatGPT. The bot, called Khanmigo, told me I had answered a basic high school Algebra 2 problem involving negative exponents wrong. I knew my answer was right. After typing in the same correct answer three times, Khanmigo finally agreed with me. It was frustrating.

Errors matter. Kids could memorize incorrect solutions that are hard to unlearn, or become more confused about a topic. I also worry about teachers using ChatGPT and other generative AI models to write quizzes or lesson plans. At least a teacher has the opportunity to vet what AI spits out before giving or teaching it to students. It’s riskier when you’re asking students to learn directly from AI. 

Computer scientists are attempting to combat these errors in a process they call “mitigating AI hallucinations.” Two researchers from the University of California, Berkeley, recently documented how they successfully reduced ChatGPT’s instructional errors to near zero in algebra. They were not as successful with statistics, where their techniques still left errors 13 percent of the time. Their paper was published in May 2024 in the peer-reviewed journal PLOS One.

In the experiment, Zachary Pardos, a computer scientist at the Berkeley School of Education, and one of his students, Shreya Bhandari, first asked ChatGPT to show how it would solve an algebra or statistics problem. They discovered that ChatGPT was “naturally verbose” and they did not have to prompt the large language model to explain its steps. But all those words didn’t help with accuracy. On average, ChatGPT’s methods and answers were wrong a third of the time. In other words, ChatGPT would earn a grade of a D if it were a student. 

Current AI models are bad at math because they’re programmed to figure out probabilities, not follow rules. Math calculations are all about rules. It’s ironic because earlier versions of AI were able to follow rules, but unable to write or summarize. Now we have the opposite.

The Berkeley researchers took advantage of the fact that ChatGPT, like humans, is erratic. They asked ChatGPT to answer the same math problem 10 times in a row. I was surprised that a machine might answer the same question differently, but that is what these large language models do.  Often the step-by-step process and the answer were the same, but the exact wording differed. Sometimes the methods were bizarre and the results were dead wrong. (See an example in the illustration below.)

Researchers grouped similar answers together. When they assessed the accuracy of the most common answer among the 10 solutions, ChatGPT was astonishingly good. For basic high-school algebra, AI’s error rate fell from 25 percent to zero. For intermediate algebra, the error rate fell from 47 percent to 2 percent. For college algebra, it fell from 27 percent to 2 percent. 

ChatGPT answered the same algebra question three different ways, but it landed on the right response seven out of 10 times in this example

Source: Pardos and Bhandari, “ChatGPT-generated help produces learning gains equivalent to human tutor-authored help on mathematics skills,” PLOS ONE, May 2024
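As a rough illustration of this repeated-sampling idea (the researchers call it “self-consistency”), the majority-vote step can be sketched in a few lines of Python. The model name, the prompt handling and the answer-extraction step here are my assumptions, not the researchers’ code; in particular, the paper groups semantically similar answers, whereas this sketch uses exact string matching as a simplification.

```python
# Minimal sketch of self-consistency: sample the same question several times
# and keep the most common final answer. Illustrative only.
from collections import Counter
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def most_common_answer(question: str, samples: int = 10) -> str:
    answers = []
    for _ in range(samples):
        response = client.chat.completions.create(
            model="gpt-4o",          # placeholder model choice
            messages=[{"role": "user", "content": question}],
            temperature=1.0,         # keep sampling variability between runs
        )
        text = response.choices[0].message.content
        # Crude extraction: treat the last line of the reply as the final answer.
        answers.append(text.strip().splitlines()[-1])
    # Group identical answers and return the plurality choice.
    return Counter(answers).most_common(1)[0][0]
```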

However, when the scientists applied this method, which they call “self-consistency,” to statistics, it did not work as well. ChatGPT’s error rate fell from 29 percent to 13 percent, but still more than one out of 10 answers was wrong. I think that’s too many errors for students who are learning math. 

The big question, of course, is whether these ChatGPT solutions help students learn math better than traditional teaching. In a second part of this study, researchers recruited 274 adults online to solve math problems and randomly assigned a third of them to see these ChatGPT solutions as a “hint” if they needed one. (ChatGPT’s wrong answers were removed first.) On a short test afterwards, these adults improved 17 percent, compared to less than 12 percent learning gains for the adults who could see a different group of hints written by undergraduate math tutors. Those who weren’t offered any hints scored about the same on a post-test as they did on a pre-test.

Those impressive learning results for ChatGPT prompted the study authors to boldly predict that “completely autonomous generation” of an effective computerized tutoring system is “around the corner.” In theory, ChatGPT could instantly digest a book chapter or a video lecture and then immediately turn around and tutor a student on it.

Before I embrace that optimism, I’d like to see how much real students – not just adults recruited online – use these automated tutoring systems. Even in this study, where adults were paid to do math problems, 120 of the roughly 400 participants didn’t complete the work and so their results had to be thrown out. For many kids, and especially students who are struggling in a subject, learning from a computer just isn’t engaging.

This story about AI hallucinations was written by Jill Barshay and produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education. Sign up for Proof Points and other Hechinger newsletters.

AI vs. humans: Who comes out ahead? https://hechingerreport.org/ai-vs-humans-who-comes-out-ahead/ Thu, 18 Jul 2024

A virtual elementary classroom is represented on an oversized laptop, where children’s desks comprise the keyboard, and a teacher and screen communicate knowledge.

This is an edition of our Future of Learning newsletter. Sign up today to get it delivered straight to your inbox.

There’s little doubt that artificial intelligence will fundamentally alter how classrooms operate. But just how much bot-fueled instruction is too much? I chatted with Hechinger contributor Chris Berdik about his recent story, co-published with Wired, that explores these themes and how some schools are deploying AI assistants in the classroom. 

Q: Why did you want to explore this question of how, and in what ways, AI can replace teachers? 

From the first exploratory interviews I did on this topic, I was surprised to learn how deep the history of AI trying to teach went. There’s so much hype (and doom) currently around generative AI, and it really is remarkable and powerful, but it puts things into perspective to learn that people have tried to harness AI to teach for many decades, with pretty limited results in the end. So, then I heard the story of Watson, an AI engine that quickly and easily dispatched Jeopardy! champions, but couldn’t hack it as a tutor. It clearly had the necessary knowledge at its beck and call. What didn’t it have? If we got beyond the hype of the latest generative AI, could it muster that critical pedagogic component that Watson lacked? And, finally, if the answer was no, what then was its best classroom use? Those were my starting points.

Q: How hesitant or eager were teachers like Daniel Thompson, whose classroom you visited, to use AI assistants? 

Thompson was cautiously optimistic. In fact, he was pretty eager to use the tool, precisely because it could make the other apps and multimedia he used less cumbersome, and navigating them less onerous. But Thompson did a few quick stress tests of the assistant: asking it to answer questions about the local Atlanta sports teams that had nothing to do with his curriculum, and checking the guardrails by asking it to compose a fake message firing a colleague. The assistant declined those requests, showing once again that occasionally the most useful thing AI can do is gently tell us we’ve asked too much of it.

Q: You wrote that students weren’t interested in engaging with IBM Watson. Why not? 

As more than one source explained to me, the process of learning includes moments of challenge and friction, which can be “busy work” drudgery, but is often at the heart of what it means to learn, to puzzle over ideas, to truly create, to find one’s own way through to understanding. And I think that students (like a lot of us) see AI as a tool that can take care of some onerous, time-consuming or tedious task on our behalf. So, it’s going to take a lot more for AI to engage students when its job is to guide them through the friction of learning rather than just be an escape hatch from it.

Q: As more of these tools enter the classroom this school year, what will you be watching for? 

I may have a somewhat esoteric interest in what’s next with AI in classrooms. I’m personally really interested in how schools will handle critical AI literacy, where both students and educators devote the time and resources to think critically about what AI is, the wonderful things it can do, and just as importantly what it can’t, or shouldn’t do on our behalf.

Here are a few key stories to bring you up to speed:

PROOF POINTS: Teens are looking to AI for information and answers, two surveys show

My colleague Jill Barshay wrote about two recent surveys on how teens are using AI to brainstorm ideas, study for tests or get answers for questions they might be too embarrassed to ask their parents or friends. Barshay pointed out that both surveys indicate that Black, Hispanic and Asian American youth are often quick to adopt this new technology.

How AI could transform the way schools test kids

Back in the spring, my colleague Caroline Preston and I explored what AI advancements mean for the future of assessments and standardized testing. Many experts believe that AI has the potential to better evaluate a student’s true knowledge and personalize tests to individual students. However, experts also warn that schools and test designers should proceed cautiously, keeping in mind disparities in access to AI and technology and concerns about biases embedded in these tools.

AI might disrupt math and computer science classes – in a good way

Last year, as part of a series on math instruction, we published a story by Seattle Times writer Claire Bryan about how some math and computer science teachers are embracing AI. Teachers say that AI can help them plan math lessons and write a variety of math problems geared toward different levels of instruction. The story was produced in partnership with The Education Reporting Collaborative, an eight-newsroom effort.

More on the Future of Learning

“PROOF POINTS: Asian American students lose more points in an AI essay grading study — but researchers don’t know why,” The Hechinger Report

“An education chatbot company collapsed. Where did the student data go?,” EdSurge

“More than 378,000 students have experienced gun violence at school since Columbine,” The Washington Post

“It takes a village: A Brooklyn high school and NYC nonprofits team up to enroll older immigrants,” Chalkbeat

This story about AI in education was produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education.

OPINION: What teachers call AI cheating, leaders in the workforce might call progress https://hechingerreport.org/opinion-what-teachers-call-ai-cheating-leaders-in-the-workforce-might-call-progress/ Tue, 16 Jul 2024

As the use of artificial intelligence grows, teachers are trying to protect the integrity of their educational practices and systems. When we see what AI can do in the hands of our students, it’s hard to stay neutral about how and if to use it.

Of course, we worry about cheating; AI can be used to write essays and solve math problems.

But we also have deeper concerns regarding learning. When our students use AI, they may not be engaging as deeply with our assignments and coursework.

They have discovered ways AI can be used to create essay outlines and help with project organization and other such tasks that are key components of the learning process.

Some of this could be good. AI is a fabulous tool for getting started or unstuck. AI puts together old ideas in new ways and can do this at scale: It will make creativity easier for everyone.

But this very ease has teachers wondering how we can keep our students motivated to do the hard work when there are so many new shortcuts. Learning goals, curriculums, courses and the way we grade assignments will all need to be reevaluated.

Related: Interested in innovations in the field of higher education? Subscribe to our free biweekly Higher Education newsletter.

The new realities of work also must be considered. Employers’ job postings increasingly reward those with AI skills. Many companies report already adopting generative AI tools or anticipate incorporating them into their workflow in the near future.

A core tension has emerged: Many teachers want to keep AI out of our classrooms, but also know that future workplaces may demand AI literacy.

What we call cheating, business could see as efficiency and progress.

The complexities, opportunities and decisions that lie between banning AI and teaching AI are significant.

It is increasingly likely that using AI will emerge as an essential skill for students, regardless of their career ambitions, and that action is required of educational institutions as a result.

Integrating AI into the curriculum will require change. The best starting point is a better understanding of what AI literacy looks like in our current landscape.

In our new book, we make it clear that the specifics of AI literacy will vary somewhat from one subject to the next, but there are some AI capacities that everyone will now need.

Before even writing a prompt, the AI user should develop an understanding of the following:

  • the role of human / AI collaborations
  • how to navigate the ethical implications of using AI for a given purpose
  • which AI tool to use (when and why)
  • how to use their selected AI tool fully and successfully
  • the limitations of generative AI systems and how to work around them
  • prompt engineering and all of its nuances

This knowledge will help our students write successful prompts, but additional skills and AI literacy will be required once AI returns a response. These include the abilities to:

  • review and evaluate AI-produced content, including how to determine its accuracy and recognize bias
  • edit AI content for its intended audience and purpose
  • follow up with AI to refine the output
  • take responsibility for the quality of the final work

The development of AI literacy mirrors the development of other key skills, such as critical thinking. Teaching AI literacy begins by teaching the capacities above, as well as others specific to your own subject.

While the inclination may be to start teaching AI literacy by opening a browser, faculty should begin by providing an ethical and environmental context regarding the use of AI and the responsibilities each of us has when working with AI.

Amazon Web Services recently surveyed employers from all business sectors about what skills employees need to use AI well. In ranked order, their answers included the following:

  1. critical thinking and problem solving
  2. creative thinking and design competence
  3. technical proficiency
  4. ethics and risk management
  5. communication
  6. math
  7. teamwork
  8. management
  9. writing

Higher education is quite adept at teaching such skills, and many of those noted are among the American Association of Colleges and Universities’ (AAC&U) list of “essential learning outcomes” for higher education.

Related: TEACHER VOICE: My students are afraid of AI

Faculty will need to improve their own AI literacy and explore the most advanced generative AI tools (currently ChatGPT 4o, Gemini 1.5 and Claude 3.5). A good way to begin is to ask AI to perform assignments and projects that you typically ask your students to complete — and then try to improve the AI’s response.

Understanding what AI can and cannot do well within the context of your course will be key as you contemplate revising your assignments and teaching.

Faculty should also find out if their college has an advisory board made up of past students and/or employers. Reach out to them for firsthand insight on how AI is shifting the landscape — and keep that conversation going over time. That information will be essential as you think about AI literacy within your subjects and courses.

These actions will ultimately position you to be able to navigate the complexities and decisions that lie between ban and teach.

C. Edward Watson is vice president for digital innovation with the American Association of Colleges and Universities (AAC&U). José Antonio Bowen is a former president of Goucher College and co-author with Watson of “Teaching with AI: A Practical Guide to a New Era of Human Learning.”

This story about AI literacy was produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education. Sign up for our Higher Education newsletter.

What aspects of teaching should remain human? https://hechingerreport.org/what-aspects-of-teaching-should-remain-human/ Wed, 10 Jul 2024

ATLANTA — Science teacher Daniel Thompson circulated among his sixth graders at Ron Clark Academy on a recent spring morning, spot checking their work and leading them into discussions about the day’s lessons on weather and water. He had a helper: As Thompson paced around the class, peppering them with questions, he frequently turned to a voice-activated AI to summon apps and educational videos onto large-screen smartboards.

When a student asked, “Are there any animals that don’t need water?” Thompson put the question to the AI. Within seconds, an illustrated blurb about kangaroo rats appeared before the class.

Thompson’s voice-activated assistant is the brainchild of computer scientist Satya Nitta, who founded a company called Merlyn Mind after many years at IBM where he had tried, and failed, to create an AI tool that could teach students directly. The foundation of that earlier, ill-fated project was IBM Watson, the AI that famously crushed several “Jeopardy!” champions. Despite Watson’s gameshow success, however, it wasn’t much good at teaching students. After plowing five years and $100 million into the effort, the IBM team admitted defeat in 2017.

“We realized the technology wasn’t there,” said Nitta, “and it’s still not there.”

Daniel Thompson teaches science to middle schoolers at Ron Clark Academy, in Atlanta. Credit: Chris Berdik for The Hechinger Report

Since the November 2022 launch of OpenAI’s ChatGPT, an expanding cast of AI tutors and helpers have entered the learning landscape. Most of these tools are chatbots that tap large language models — or LLMs — trained on troves of data to understand student inquiries and respond conversationally with a range of flexible and targeted learning assistance. These bots can generate quizzes, summarize key points in a complex reading, offer step-by-step graphing of algebraic equations, or provide feedback on the first draft of an essay, among other tasks. Some tools are subject-specific, such as Writable and Photomath, while others offer more all-purpose tutoring, such as Socratic (created by Google) and Khanmigo, a collaboration of OpenAI and Khan Academy, a nonprofit provider of online lessons covering an array of academic subjects.

As AI tools proliferate and their capabilities keep improving, relatively few observers believe education can remain AI free. At the same time, even the staunchest techno optimists hesitate to say that teaching is best left to the bots. The debate is about the best mix — what are AI’s most effective roles in helping students learn, and what aspects of teaching should remain indelibly human no matter how powerful AI becomes?

Skepticism about AI’s place in the classroom often centers on students using the technology to cut corners or on AI’s tendency to hallucinate, i.e. make stuff up, in an eagerness to answer every query. The latter concern can be mitigated (albeit not eliminated) by programming bots to base responses on vetted curricular materials, among other steps. Less attention, however, is paid to an even thornier challenge for AI at the heart of effective teaching: engaging and motivating students.

Nitta said there’s something “deeply profound” about human communication that allows flesh-and-blood teachers to quickly spot and address things like confusion and flagging interest in real time.

He joins other experts in technology and education who believe AI’s best use is to augment and extend the reach of human teachers, a vision that takes different forms. For example, the goal of Merlyn Mind’s voice assistant is to make it easier for teachers to engage with students while also navigating apps and other digital teaching materials. Instead of being stationed by the computer, they can move around the class and interact with students, even the ones hoping to disappear in the back.

Others in education are trying to achieve this vision by using AI to help train human tutors to have more productive student interactions, or by multiplying the number of students a human instructor can engage with by delegating specific tasks to AI that play to the technology’s strengths. Ultimately, these experts envision a partnership in which AI is not called on to be a teacher but to supercharge the power of humans already doing the job.

Related: Become a lifelong learner. Subscribe to our free weekly newsletter to receive our comprehensive reporting directly in your inbox.

Merlyn Mind’s AI assistant, Origin, was piloted by thousands of teachers nationwide this past school year, including Thompson and three other teachers at the Ron Clark Academy. The South Atlanta private school, where tuition is heavily subsidized for a majority low-income student body, is in a brick warehouse renovated to look like a low-slung Hogwarts, replete with an elaborate clocktower and a winged dragon perched above the main entrance.

As Thompson moved among his students, he wielded a slim remote control with a button-activated microphone he uses to command the AI software. At first, Thompson told the AI to start a three-minute timer that popped up on the smartboard, then he began asking rapid-fire review questions from a previous lesson, such as what causes wind. When students couldn’t remember the details, Thompson asked the AI to display an illustration of airflow caused by uneven heating of the Earth’s surface.

The voice-activated AI assistant by Merlyn Mind is designed to help teachers navigate apps and materials on their computer while moving around the classroom, interacting with students. Credit: Chris Berdik for The Hechinger Report

At one point, he clambered up on a student worktable while discussing the stratosphere, claiming (inaccurately) that it was the atmospheric layer where most weather happens, just to see if any students caught his mistake (several students reminded him that weather happens in the troposphere). Then he conjured a new timer and launched into a lesson on water by asking the AI assistant to find a short educational movie about fresh and saltwater ecosystems. As Thompson moved through the class, he occasionally paused the video and quizzed students about the new content.

Study after study has shown the importance of student engagement for academic success. A strong connection between teachers and students is especially important when learners feel challenged or discouraged, according to Nitta. While AI has many strengths, he said, “it’s not very good at motivating you to keep doing something you’re not very interested in doing.”

“The elephant in the room with all these chatbots is how long will anyone engage with them?” he said.

The answer for Watson was not long at all, Nitta recalled. In trial runs, some students just ignored Watson’s attempts to probe their understanding of a topic, and the engagement level of those who initially did respond to the bot dropped off precipitously. Despite all Watson’s knowledge and facility with natural language, students just weren’t interested in chatting with it.

Related: PROOF POINTS: AI essay grading is ‘already as good as an overburdened’ teacher, but researchers say it needs more work

At a spring 2023 TED talk shortly after launching Khanmigo, Sal Khan, founder and CEO of Khan Academy, pointed out that tutoring has provided some of the biggest jolts to student performance among studied education interventions. But there aren’t enough human tutors available, nor enough money to pay for them, especially in the wake of pandemic-induced learning loss.

Khan envisioned a world where AI tutors filled that gap. “We’re at the cusp of using AI for probably the biggest positive transformation that education has ever seen,” he declared. “And the way we’re going to do that is by giving every student on the planet an artificially intelligent but amazing personal tutor.”

One of Khanmigo’s architects, Khan Academy’s chief learning officer, Kristen DiCerbo, was the vice president of learning research and design for education publisher Pearson in 2016 when it partnered with IBM on the Watson tutor project.

“It was a different technology,” said DiCerbo, recalling the laborious task of scripting Watson’s responses to students.

The Ron Clark Academy, in Atlanta, piloted a voice-activated teaching assistant this school year. Credit: Chris Berdik for The Hechinger Report

Since Watson’s heyday, AI has become a lot more engaging. One of the breakthroughs of generative AI powered by LLMs is its ability to give unscripted, human-like responses to user prompts.

To spur engagement, Khanmigo doesn’t answer student questions directly, but starts with questions of its own, such as asking if the student has any ideas about how to find an answer. Then it guides them to a solution, step by step, with hints and encouragement (a positive tone is assured by its programmers). Another feature for stoking engagement allows students to ask the bot to assume the identity of historical or literary figures for chats about their life and times. Teachers, meanwhile, can tap the bot for help planning lessons and formulating assessments. 

Notwithstanding Khan’s expansive vision of “amazing” personal tutors for every student on the planet, DiCerbo assigns Khanmigo a more limited teaching role. When students are working independently on a skill or concept but get hung up or caught in a cognitive rut, she said, “we want to help students get unstuck.”

Some 100,000 students and teachers piloted Khanmigo this past academic year in schools nationwide, helping to flag any hallucinations the bot makes and providing tons of student-bot conversations for DiCerbo and her team to analyze.

“We look for things like summarizing, providing hints and encouraging,” she explained. “Does [Khanmigo] do the motivational things that human tutors do?”

The degree to which Khanmigo has closed AI’s engagement gap is not yet known. Khan Academy plans to release some summary data on student-bot interactions later this summer, according to DiCerbo. Plans for third-party researchers to assess the tutor’s impact on learning will take longer.

Nevertheless, many tutoring experts stress the importance of building a strong relationship between tutors and students to achieve significant learning boosts. “If a student is not motivated, or if they don’t see themselves as a math person, then they’re not going to have a deep conversation with an AI bot,” said Brent Milne, the vice president of product research and development at Saga Education, a nonprofit provider of in-person tutoring.

Since 2021, Saga has been a partner in the Personalized Learning Initiative (PLI), run by the University of Chicago’s Education Lab, to help scale high-dosage tutoring — generally defined as one-on-one or small group sessions for at least 30 minutes every day. The PLI team sees a big and growing role for AI in tutoring, one that augments but doesn’t replicate human efforts.

For instance, Saga has been experimenting with AI feedback to help tutors better engage and motivate students. Working with researchers from the University of Memphis and the University of Colorado, the Saga team fed transcripts of their math tutoring sessions into an AI model trained to recognize when the tutor was prompting students to explain their reasoning, refine their answers or initiate a deeper discussion. The AI analyzed how often each tutor took these steps.  

When Saga piloted this AI tool in 2023, the nonprofit provided the feedback to their tutor coaches, who worked with four to eight tutors each. Tracking some 2,300 tutoring sessions over several weeks, they found that tutors whose coaches used the AI feedback peppered their sessions with significantly more of these prompts to encourage student engagement.

While Saga is looking into having AI deliver some feedback directly to tutors, it’s doing so cautiously, because, according to Milne, “having a human coach in the loop is really valuable to us.”

Related: How AI could transform the way schools test kids

In addition to using AI to help train tutors, the Saga team wondered if they could offload certain tutor tasks to a machine without compromising the strong relationship between tutors and students. Specifically, they understood that tutoring sessions were typically a mix of teaching concepts and practicing them, according to Milne. A tutor might spend some time explaining the why and how of factoring algebraic equations, for example, and then guide a student through practice problems. But what if the tutor could delegate the latter task to AI, which excels at providing precisely targeted adaptive practice problems and hints?

The Saga team tested the idea in their algebra tutoring sessions during the 2023-24 school year. They found that students who were tutored daily in a group of two had about the same gains in math scores as students who were tutored in a group of four with assistance from ALEKS, an AI-powered learning software by McGraw Hill. In the group of four, two students worked directly with the tutor and two with the AI, switching each day. In other words, the AI assistance effectively doubled the reach of the tutor.

Experts expect that AI’s role in education is bound to grow, and its interactions will continue to seem more and more human. Earlier this year, OpenAI and the startup Hume AI separately launched “emotionally intelligent” AI that analyzes tone of voice and facial expressions to infer a user’s mood and respond with calibrated “empathy.” Nevertheless, even emotionally intelligent AI will likely fall short on the student engagement front, according to Brown University computer science professor Michael Littman, who is also the National Science Foundation’s division director for information and intelligent systems.

No matter how human-like the conversation, he said, students understand at a fundamental level that AI doesn’t really care about them, what they have to say in their writing or whether they pass or fail algebra. In turn, students will never really care about the bot and what it thinks. A June study in the journal “Learning and Instruction” found that AI can already provide decent feedback on student essays. What is not clear is whether student writers will put in care and effort — rather than offloading the task to a bot — if AI becomes the primary audience for their work. 

“There’s incredible value in the human relationship component of learning,” Littman said, “and when you just take humans out of the equation, something is lost.”

This story about AI tutors was produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education. Sign up for the Hechinger newsletter.

PROOF POINTS: Asian American students lose more points in an AI essay grading study — but researchers don’t know why https://hechingerreport.org/proof-points-asian-american-ai-bias/ Mon, 08 Jul 2024

When ChatGPT was released to the public in November 2022, advocates and watchdogs warned about the potential for racial bias. The new large language model was created by harvesting 300 billion words from books, articles and online writing, which include racist falsehoods and reflect writers’ implicit biases. Biased training data is likely to generate biased advice, answers and essays. Garbage in, garbage out. 

Researchers are starting to document how AI bias manifests in unexpected ways. Inside the research and development arm of the giant testing organization ETS, which administers the SAT, a pair of investigators pitted man against machine in evaluating more than 13,000 essays written by students in grades 8 to 12. They discovered that the AI model that powers ChatGPT penalized Asian American students more than other races and ethnicities in grading the essays. This was purely a research exercise and these essays and machine scores weren’t used in any of ETS’s assessments. But the organization shared its analysis with me to warn schools and teachers about the potential for racial bias when using ChatGPT or other AI apps in the classroom.

AI and humans scored essays differently by race and ethnicity

“Diff” is the difference between the average score given by humans and GPT-4o in this experiment. “Adj. Diff” adjusts this raw number for the randomness of human ratings. Source: Table from Matt Johnson & Mo Zhang “Using GPT-4o to Score Persuade 2.0 Independent Items” ETS (June 2024 draft)

“Take a little bit of caution and do some evaluation of the scores before presenting them to students,” said Mo Zhang, one of the ETS researchers who conducted the analysis. “There are methods for doing this and you don’t want to take people who specialize in educational measurement out of the equation.”

That might sound self-serving for an employee of a company that specializes in educational measurement. But Zhang’s advice is worth heeding in the excitement to try new AI technology. There are potential dangers as teachers save time by offloading grading work to a robot.

In ETS’s analysis, Zhang and her colleague Matt Johnson fed 13,121 essays into one of the latest versions of the AI model that powers ChatGPT, called GPT 4 Omni or simply GPT-4o. (This version was added to ChatGPT in May 2024, but when the researchers conducted this experiment they used the latest AI model through a different portal.)  

A little background about this large bundle of essays: students across the nation had originally written these essays between 2015 and 2019 as part of state standardized exams or classroom assessments. Their assignment had been to write an argumentative essay, such as “Should students be allowed to use cell phones in school?” The essays were collected to help scientists develop and test automated writing evaluation.

Each of the essays had been graded by expert raters of writing on a 1-to-6 point scale with 6 being the highest score. ETS asked GPT-4o to score them on the same six-point scale using the same scoring guide that the humans used. Neither man nor machine was told the race or ethnicity of the student, but researchers could see students’ demographic information in the datasets that accompany these essays.
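The report does not include the exact prompt ETS used, but the setup described — asking GPT-4o to apply the human raters’ six-point scoring guide to each essay, with no graded examples for calibration — can be sketched roughly as follows. The prompt wording, model name and answer parsing are illustrative assumptions, not ETS’s actual procedure.

```python
# Rough sketch of zero-shot rubric scoring as described in the article;
# illustrative only, not ETS's code.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def score_essay(essay: str, scoring_guide: str) -> int:
    """Ask the model to apply a 1-to-6 scoring guide to one essay and return the score."""
    prompt = (
        "Score the following student essay on a 1-6 scale using this scoring guide.\n\n"
        f"Scoring guide:\n{scoring_guide}\n\n"
        f"Essay:\n{essay}\n\n"
        "Respond with the integer score only."
    )
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # keep scoring as repeatable as possible
    )
    return int(response.choices[0].message.content.strip())
```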

GPT-4o marked the essays almost a point lower than the humans did. The average score across the 13,121 essays was 2.8 for GPT-4o and 3.7 for the humans. But Asian Americans were docked by an additional quarter point. Human evaluators gave Asian Americans a 4.3, on average, while GPT-4o gave them only a 3.2 – roughly a 1.1 point deduction. By contrast, the score difference between humans and GPT-4o was only about 0.9 points for white, Black and Hispanic students. Imagine an ice cream truck that kept shaving off an extra quarter scoop only from the cones of Asian American kids. 

“Clearly, this doesn’t seem fair,” wrote Johnson and Zhang in an unpublished report they shared with me. Though the extra penalty for Asian Americans wasn’t terribly large, they said, it’s substantial enough that it shouldn’t be ignored. 

The researchers don’t know why GPT-4o issued lower grades than humans, and why it gave an extra penalty to Asian Americans. Zhang and Johnson described the AI system as a “huge black box” of algorithms that operate in ways “not fully understood by their own developers.” That inability to explain a student’s grade on a writing assignment makes the systems especially frustrating to use in schools.

This table compares GPT-4o scores with human scores on the same batch of 13,121 student essays, which were scored on a 1-to-6 scale. Numbers highlighted in green show exact score matches between GPT-4o and humans. Unhighlighted numbers show discrepancies. For example, there were 1,221 essays where humans awarded a 5 and GPT awarded 3. Data source: Matt Johnson & Mo Zhang “Using GPT-4o to Score Persuade 2.0 Independent Items” ETS (June 2024 draft)

This one study isn’t proof that AI is consistently underrating essays or biased against Asian Americans. Other versions of AI sometimes produce different results. A separate analysis of essay scoring by researchers from the University of California, Irvine, and Arizona State University found that AI essay grades were just as frequently too high as they were too low. That study, which used the 3.5 version of ChatGPT, did not scrutinize results by race and ethnicity.

I wondered if AI bias against Asian Americans was somehow connected to high achievement. Just as Asian Americans tend to score high on math and reading tests, Asian Americans, on average, were the strongest writers in this bundle of 13,000 essays. Even with the penalty, Asian Americans still had the highest essay scores, well above those of white, Black, Hispanic, Native American or multi-racial students. 

In both the ETS and UC-ASU essay studies, AI awarded far fewer perfect scores than humans did. For example, in this ETS study, humans awarded 732 perfect 6s, while GPT-4o gave out a grand total of only three. GPT’s stinginess with perfect scores might have affected a lot of Asian Americans who had received 6s from human raters.

ETS’s researchers had asked GPT-4o to score the essays cold, without showing the chatbot any graded examples to calibrate its scores. It’s possible that a few sample essays or small tweaks to the grading instructions, or prompts, given to ChatGPT could reduce or eliminate the bias against Asian Americans. Perhaps the robot would be fairer to Asian Americans if it were explicitly prompted to “give out more perfect 6s.” 

The ETS researchers told me this wasn’t the first time that they’ve noticed Asian students treated differently by a robo-grader. Older automated essay graders, which used different algorithms, have sometimes done the opposite, giving Asians higher marks than human raters did. For example, an ETS automated scoring system developed more than a decade ago, called e-rater, tended to inflate scores for students from Korea, China, Taiwan and Hong Kong on their essays for the Test of English as a Foreign Language (TOEFL), according to a study published in 2012. That may have been because some Asian students had memorized well-structured paragraphs, while humans easily noticed that the essays were off-topic. (The ETS website says it only relies on the e-rater score alone for practice tests, and uses it in conjunction with human scores for actual exams.) 

Asian Americans also garnered higher marks from an automated scoring system created during a coding competition in 2021 and powered by BERT, which had been the most advanced algorithm before the current generation of large language models, such as GPT. Computer scientists put their experimental robo-grader through a series of tests and discovered that it gave higher scores than humans did to Asian Americans’ open-response answers on a reading comprehension test. 

It was also unclear why BERT sometimes treated Asian Americans differently. But it illustrates how important it is to test these systems before we unleash them in schools. Based on educator enthusiasm, however, I fear this train has already left the station. In recent webinars, I’ve seen many teachers post in the chat window that they’re already using ChatGPT, Claude and other AI-powered apps to grade writing. That might be a time saver for teachers, but it could also be harming students. 

This story about AI bias was written by Jill Barshay and produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education. Sign up for Proof Points and other Hechinger newsletters.

The post PROOF POINTS: Asian American students lose more points in an AI essay grading study — but researchers don’t know why appeared first on The Hechinger Report.

]]>
https://hechingerreport.org/proof-points-asian-american-ai-bias/feed/ 4 101830
TEACHER VOICE: My students are afraid of AI https://hechingerreport.org/teacher-voice-my-are-bombarded-with-negative-ideas-about-ai-and-now-they-are-afraid/ Tue, 25 Jun 2024 05:00:00 +0000 https://hechingerreport.org/?p=101668

Since the release of ChatGPT in November 2022, educators have pondered its implications for education. Some have leaned toward apocalyptic projections about the end of learning, while others remain cautiously optimistic. My students took longer than I expected to discover generative AI. When I asked them about ChatGPT in February 2023, many had never heard […]

The post TEACHER VOICE: My students are afraid of AI appeared first on The Hechinger Report.

]]>

Since the release of ChatGPT in November 2022, educators have pondered its implications for education. Some have leaned toward apocalyptic projections about the end of learning, while others remain cautiously optimistic.

My students took longer than I expected to discover generative AI. When I asked them about ChatGPT in February 2023, many had never heard of it.

But some caught up, and now our college’s academic integrity office is busier than ever dealing with AI-related cheating. The need for guidelines is discussed in every college meeting, but I’ve noticed a worrying reaction among students that educators are not considering: fear.

Students are bombarded with negative ideas about AI. Punitive policies heighten that fear while failing to recognize the potential educational benefits of these technologies — and that students will need to use them in their careers. Our role as educators is to cultivate critical thinking and equip students for a job market that will use AI, not to intimidate them.

Yet course descriptions include bans on the use of AI. Professors tell students they cannot use it. And students regularly read stories about their peers going on academic probation for using Grammarly. If students feel constantly under suspicion, it can create a hostile learning environment.

Related: Interested in innovations in the field of higher education? Subscribe to our free biweekly Higher Education newsletter.

Many of my students haven’t even played around with ChatGPT because they are scared of being accused of plagiarism. This avoidance creates a paradox in which students are expected to be adept with these modern tools post-graduation, yet are discouraged from engaging with them during their education.

I suspect the profile of my students makes them more prone to fear AI. Most are Hispanic and female, taking courses in translation and interpreting. They see that the overwhelmingly male and white “tech bros” in Silicon Valley shaping AI look nothing like them, and they internalize the idea that AI is not for them and not something they need to know about. I wasn’t surprised that the only male student I had in class this past semester was also the only one excited about ChatGPT from the very beginning.

Failing to develop AI literacy among Hispanic students can diminish their confidence and interest in engaging with these technologies. Their fearful reactions will widen the already concerning inequities between Hispanic and non-Hispanic students; the degree completion gap between Latino and white students increased between 2018 and 2021.

The stakes are high. Similar to the internet boom, AI will revolutionize daily activities and, certainly, knowledge jobs. To prepare our students for these changes, we need to help them understand what AI is and encourage them to explore the functionalities of large language models like ChatGPT.

I decided to address the issue head-on. I asked my students to write speeches on a current affairs topic. But first, I asked for their thoughts on AI. I was shocked by the extent of their misunderstanding: Many believed that AI was an omniscient knowledge-producing machine connected to the internet.

After I gave a brief presentation on AI, they expressed surprise that large language models are based on prediction rather than direct knowledge. Their curiosity was piqued, and they wanted to learn how to use AI effectively.

After they drafted their speeches without AI, I asked them to use ChatGPT to proofread their drafts and then report back to me. Again, they were surprised — this time about how much ChatGPT could improve their writing. I was happy (even proud) to see they were also critical of the output, with comments such as “It didn’t sound like me” or “It made up parts of the story.”

Was the activity perfect? Of course not. Prompting was challenging. I noticed a clear correlation between literacy levels and the quality of their prompts.

Students who struggled with college-level writing couldn’t go beyond prompts such as “Make it sound smoother.” Nonetheless, this basic activity was enough to spark curiosity and critical thinking about AI.

Individual activities like these are great, but without institutional support and guidance, efforts toward fostering AI literacy will fall short.

The provost of my college established an AI committee to develop college guidelines. It included professors from a wide range of disciplines (myself included), other staff members and, importantly, students.

Through multiple meetings, we brainstormed the main issues that needed to be included and researched specific topics like AI literacy, data privacy and safety, AI detectors and bias.

We created a document divided into key points that everyone could understand. The draft document was then circulated among faculty and other committees for feedback.

Initially, we were concerned that circulating the guidelines among too many stakeholders might complicate the process, but this step proved crucial. Feedback from professors in areas such as history and philosophy strengthened the guidelines, adding valuable perspectives. This collaborative approach also helped increase institutional buy-in, as everyone’s contribution was valued.

Related: A new partnership paves the way for greater use of AI in higher ed

Underfunded public institutions like mine face significant challenges integrating AI into education. While AI offers incredible opportunities for educators, realizing these opportunities requires substantial institutional investment.

Asking adjuncts in my department, who are grossly underpaid, to find time to learn how to use AI and incorporate it into their classes seems unethical. Yet, incorporating AI into our knowledge production activities can significantly boost student outcomes.

If this happens only at wealthy institutions, we will widen academic performance gaps.

Furthermore, if only students at wealthy institutions and companies get to use AI, the bias inherent in these large language models will continue to grow.

If we want our classes to ensure equitable educational opportunities for all students, minority-serving institutions cannot fall behind in AI adoption.

Cristina Lozano Argüelles is an assistant professor of interpreting and bilingualism at John Jay College, part of the City University of New York, where she researches the cognitive and social dimensions of language learning.

This story about AI literacy was produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education. Sign up for Hechinger’s newsletter.

The post TEACHER VOICE: My students are afraid of AI appeared first on The Hechinger Report.

]]>
101668
PROOF POINTS: Teens are looking to AI for information and answers, two surveys show https://hechingerreport.org/proof-points-teens-ai-surveys/ Mon, 17 Jun 2024 10:00:00 +0000 https://hechingerreport.org/?p=101528

Two new surveys, both released this month, show how high school and college-age students are embracing artificial intelligence. There are some inconsistencies and many unanswered questions, but what stands out is how much teens are turning to AI for information and to ask questions, not just to do their homework for them. And they’re using […]

The post PROOF POINTS: Teens are looking to AI for information and answers, two surveys show appeared first on The Hechinger Report.

]]>

Two new surveys, both released this month, show how high school and college-age students are embracing artificial intelligence. There are some inconsistencies and many unanswered questions, but what stands out is how much teens are turning to AI for information and to ask questions, not just to do their homework for them. And they’re using it for personal reasons as well as for school. Another big takeaway is that there are different patterns by race and ethnicity, with Black, Hispanic and Asian American students often adopting AI faster than white students.

The first report, released on June 3, was conducted by three nonprofit organizations, Hopelab, Common Sense Media, and the Center for Digital Thriving at the Harvard Graduate School of Education. These organizations surveyed 1,274 teens and young adults aged 14-22 across the U.S. from October to November 2023. At that time, only half the teens and young adults said they had ever used AI, with just 4 percent using it daily or almost every day. 

Emily Weinstein, executive director for the Center for Digital Thriving, a research center that investigates how youth are interacting with technology, said that more teens are “certainly” using AI now that these tools are embedded in more apps and websites, such as Google Search. Last October and November, when this survey was conducted, teens typically had to take the initiative to navigate to an AI site and create an account. An exception was Snapchat, a social media app that had already added an AI chatbot for its users. 

More than half of the early adopters said they had used AI for getting information and for brainstorming, the first and second most popular uses. This survey didn’t ask teens if they were using AI for cheating, such as prompting ChatGPT to write their papers for them. However, among the half of respondents who were already using AI, fewer than half – 46 percent – said they were using it for help with school work. The fourth most common use was for generating pictures.

The survey also asked teens a couple of open-response questions. Some teens told researchers that they are asking AI private questions that they were too embarrassed to ask their parents or their friends. “Teens are telling us, ‘I have questions that are easier to ask robots than people,’” said Weinstein.

Weinstein wants to know more about the quality and the accuracy of the answers that AI is giving teens, especially those with mental health struggles, and how privacy is being protected when students share personal information with chatbots.

The second report, released on June 11, was conducted by Impact Research and  commissioned by the Walton Family Foundation. In May 2024, Impact Research surveyed 1,003 teachers, 1,001 students aged 12-18, 1,003 college students, and 1,000 parents about their use and views of AI.

This survey, which took place six months after the Hopelab-Common Sense survey, demonstrated how quickly usage is growing. It found that 49 percent of students, aged 12-18, said they used ChatGPT at least once a week for school, up 26 percentage points since 2023. Forty-nine percent of college undergraduates also said they were using ChatGPT every week for school, but there was no comparison data from 2023.

Among 12- to 18-year-olds and college students who had used AI chatbots for school, 56 percent said they had used it for help in writing essays and other writing assignments. Undergraduate students were more than twice as likely as 12- to 18-year-olds to say using AI felt like cheating, 22 percent versus 8 percent. Earlier 2023 surveys of student cheating by scholars at Stanford University did not detect an increase in cheating with ChatGPT and other generative AI tools. But as students use AI more, students’ understanding of what constitutes cheating may also be evolving. 


More than 60 percent of college students who used AI said they were using it to study for tests and quizzes. Half of the college students who used AI said they were using it to deepen their subject knowledge, perhaps treating it as an online encyclopedia. There was no indication from this survey whether students were checking the accuracy of the information.

Both surveys noticed differences by race and ethnicity. The first Hopelab-Common Sense survey found that 7 percent of Black students, aged 14-22, were using AI every day, compared with 5 percent of Hispanic students and 3 percent of white students. In the open-ended questions, one Black teen girl wrote that, with AI, “we can change who we are and become someone else that we want to become.” 

The Walton Foundation survey found that Hispanic and Asian American students were sometimes more likely to use AI than white and Black students, especially for personal purposes. 

These are all early snapshots that are likely to keep shifting. OpenAI’s technology is expected to be built into the Apple universe in the fall, including iPhones, iPads and Mac computers. “These numbers are going to go up and they’re going to go up really fast,” said Weinstein. “Imagine that we could go back 15 years in time when social media use was just starting with teens. This feels like an opportunity for adults to pay attention.”

This story about ChatGPT in education was written by Jill Barshay and produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education. Sign up for Proof Points and other Hechinger newsletters.

The post PROOF POINTS: Teens are looking to AI for information and answers, two surveys show appeared first on The Hechinger Report.

]]>
101528
PROOF POINTS: AI writing feedback ‘better than I thought,’ top researcher says https://hechingerreport.org/proof-points-writing-ai-feedback/ https://hechingerreport.org/proof-points-writing-ai-feedback/#comments Mon, 03 Jun 2024 10:00:00 +0000 https://hechingerreport.org/?p=101344

This week I challenged my editor to face off against a machine. Barbara Kantrowitz gamely accepted, under one condition: “You have to file early.”  Ever since ChatGPT arrived in 2022, many journalists have made a public stunt out of asking the new generation of artificial intelligence to write their stories. Those AI stories were often […]

The post PROOF POINTS: AI writing feedback ‘better than I thought,’ top researcher says appeared first on The Hechinger Report.

]]>
Researchers from the University of California, Irvine, and Arizona State University found that human feedback was generally a bit better than AI feedback, but AI was surprisingly good. Credit: Getty Images

This week I challenged my editor to face off against a machine. Barbara Kantrowitz gamely accepted, under one condition: “You have to file early.”  Ever since ChatGPT arrived in 2022, many journalists have made a public stunt out of asking the new generation of artificial intelligence to write their stories. Those AI stories were often bland and sprinkled with errors. I wanted to understand how well ChatGPT handled a different aspect of writing: giving feedback.

My curiosity was piqued by a new study, published in the June 2024 issue of the peer-reviewed journal Learning and Instruction, that evaluated the quality of ChatGPT’s feedback on students’ writing. A team of researchers compared AI with human feedback on 200 history essays written by students in grades 6 through 12, and they determined that human feedback was generally a bit better. Humans had a particular advantage in advising students on something to work on that would be appropriate for where they are in their development as writers.

But ChatGPT came close. On a five-point scale that the researchers used to rate feedback quality, with a 5 being the highest quality feedback, ChatGPT averaged a 3.6 compared with a 4.0 average from a team of 16 expert human evaluators. It was a tough challenge. Most of these humans had taught writing for more than 15 years or they had considerable experience in writing instruction. All received three hours of training for this exercise plus extra pay for providing the feedback.

ChatGPT even beat these experts in one aspect; it was slightly better at giving feedback on students’ reasoning, argumentation and use of evidence from source materials – the features that the researchers had wanted the writing evaluators to focus on.

“It was better than I thought it was going to be because I didn’t have a lot of hope that it was going to be that good,” said Steve Graham, a well-regarded expert on writing instruction at Arizona State University, and a member of the study’s research team. “It wasn’t always accurate. But sometimes it was right on the money. And I think we’ll learn how to make it better.”

Average ratings for the quality of ChatGPT and human feedback on 200 student essays

Researchers rated the quality of the feedback on a five-point scale across five different categories. Criteria-Based refers to whether the feedback addressed the main goals of the writing assignment, in this case, to produce a well-reasoned argument about history using evidence from the reading source materials that the students were given. Clear Directions refers to whether the feedback included specific examples of something the student did well and clear directions for improvement. Accuracy refers to whether the feedback advice was correct and free of errors. Essential Features refers to whether the suggestion on what the student should work on next is appropriate for where the student is in their writing development and is an important element of this genre of writing. Supportive Tone refers to whether the feedback is delivered with language that is affirming, respectful and supportive, as opposed to condescending, impolite or authoritarian. (Source: Fig. 1 of Steiss et al., “Comparing the quality of human and ChatGPT feedback of students’ writing,” Learning and Instruction, June 2024.)

Exactly how ChatGPT is able to give good feedback is something of a black box even to the writing researchers who conducted this study. Artificial intelligence doesn’t comprehend things in the same way that humans do. But somehow, through the neural networks that ChatGPT’s programmers built, it is picking up on patterns from all the writing it has previously digested, and it is able to apply those patterns to a new text. 

The surprising “relatively high quality” of ChatGPT’s feedback is important because it means that the new artificial intelligence of large language models, also known as generative AI, could potentially help students improve their writing. One of the biggest problems in writing instruction in U.S. schools is that teachers assign too little writing, Graham said, often because teachers feel that they don’t have the time to give personalized feedback to each student. That leaves students without sufficient practice to become good writers. In theory, teachers might be willing to assign more writing or insist on revisions for each paper if students (or teachers) could use ChatGPT to provide feedback between drafts. 

Despite the potential, Graham isn’t an enthusiastic cheerleader for AI. “My biggest fear is that it becomes the writer,” he said. He worries that students will not limit their use of ChatGPT to helpful feedback, but ask it to do their thinking, analyzing and writing for them. That’s not good for learning. The research team also worries that writing instruction will suffer if teachers delegate too much feedback to ChatGPT. Seeing students’ incremental progress and common mistakes remain important for deciding what to teach next, the researchers said. For example, seeing loads of run-on sentences in your students’ papers might prompt a lesson on how to break them up. But if you don’t see them, you might not think to teach it. Another common concern among writing instructors is that AI feedback will steer everyone to write in the same homogenized way. A young writer’s unique voice could be flattened out before it even has the chance to develop.

There’s also the risk that students may not be interested in heeding AI feedback. Students often ignore the painstaking feedback that their teachers already give on their essays. Why should we think students will pay attention to feedback if they start getting more of it from a machine? 

Still, Graham and his research colleagues at the University of California, Irvine, are continuing to study how AI could be used effectively and whether it ultimately improves students’ writing. “You can’t ignore it,” said Graham. “We either learn to live with it in useful ways, or we’re going to be very unhappy with it.”

Right now, the researchers are studying how students might converse back-and-forth with ChatGPT like a writing coach in order to understand the feedback and decide which suggestions to use.

Example of feedback from a human and ChatGPT on the same essay

In the current study, the researchers didn’t track whether students understood or employed the feedback, but only sought to measure its quality. Judging the quality of feedback is a rather subjective exercise, just as feedback itself is a bundle of subjective judgment calls. Smart people can disagree on what good writing looks like and how to revise bad writing. 

In this case, the research team came up with its own criteria for what constitutes good feedback on a history essay. They instructed the humans to focus on the student’s reasoning and argumentation, rather than, say, grammar and punctuation.  They also told the human raters to adopt a “glow and grow strategy” for delivering the feedback by first finding something to praise, then identifying a particular area for improvement. 

The human raters provided this kind of feedback on hundreds of history essays from 2021 to 2023, as part of an unrelated study of an initiative to boost writing at school. The researchers randomly grabbed 200 of these essays and fed the raw student writing – without the human feedback – to version 3.5 of ChatGPT and asked it to give feedback, too.

At first, the AI feedback was terrible, but as the researchers tinkered with the instructions, or the “prompt,” they typed into ChatGPT, the feedback improved. The researchers eventually settled upon this wording: “Pretend you are a secondary school teacher. Provide 2-3 pieces of specific, actionable feedback on each of the following essays…. Use a friendly and encouraging tone.” The researchers also fed the assignment that the students were given, for example, “Why did the Montgomery Bus Boycott succeed?” along with the reading source material that the students were provided. (More details about how the researchers prompted ChatGPT are explained in Appendix C of the study.)
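To make that concrete, here is a rough sketch of how a prompt like the one the researchers settled on could be sent to ChatGPT through OpenAI’s API. It is an illustration under stated assumptions, not the researchers’ actual pipeline: the model name and the placeholder assignment and source text are mine, and only the quoted instruction wording comes from the study (their full prompt, per Appendix C, was longer).

```python
# A rough sketch of sending the feedback prompt quoted above to ChatGPT via OpenAI's API.
# Only the quoted instruction wording comes from the study; the model name and the
# placeholder assignment and source material are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

FEEDBACK_PROMPT = (
    "Pretend you are a secondary school teacher. "
    "Provide 2-3 pieces of specific, actionable feedback on each of the following essays. "
    "Use a friendly and encouraging tone."
)

def get_feedback(assignment: str, source_material: str, essay: str) -> str:
    """Return ChatGPT's feedback on one student essay, given the assignment and sources."""
    user_message = (
        f"{FEEDBACK_PROMPT}\n\n"
        f"Assignment: {assignment}\n\n"
        f"Source material: {source_material}\n\n"
        f"Student essay:\n{essay}"
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # the study used the 3.5 version of ChatGPT
        messages=[{"role": "user", "content": user_message}],
    )
    return response.choices[0].message.content

# Example call with placeholder text, not an actual student essay:
# print(get_feedback("Why did the Montgomery Bus Boycott succeed?", "...", "..."))
```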

The humans took about 20 to 25 minutes per essay. ChatGPT’s feedback came back instantly. The humans sometimes marked up sentences by, for example, showing a place where the student could have cited a source to buttress an argument. ChatGPT didn’t write any in-line comments and only wrote a note to the student. 

Researchers then read through both sets of feedback – human and machine – for each essay, comparing and rating them. (It was supposed to be a blind comparison test and the feedback raters were not told who authored each one. However, the language and tone of ChatGPT were distinct giveaways, and the in-line comments were a tell of human feedback.)

Humans appeared to have a clear edge with the very strongest and the very weakest writers, the researchers found. They were better at pushing a strong writer a little bit further, for example, by suggesting that the student consider and address a counterargument. ChatGPT struggled to come up with ideas for a student who was already meeting the objectives of a well-argued essay with evidence from the reading source materials. ChatGPT also struggled with the weakest writers. The researchers had to drop two of the essays from the study because they were so short that ChatGPT didn’t have any feedback for the student. The human rater was able to parse out some meaning from a brief, incomplete sentence and offer a suggestion. 

In one student essay about the Montgomery Bus Boycott, reprinted above, the human feedback seemed too generic to me: “Next time, I would love to see some evidence from the sources to help back up your claim.” ChatGPT, by contrast, specifically suggested that the student could have mentioned how much revenue the bus company lost during the boycott – an idea that was mentioned in the student’s essay. ChatGPT also suggested that the student could have mentioned specific actions that the NAACP and other organizations took. But the student had actually mentioned a few of these specific actions in his essay. That part of ChatGPT’s feedback was plainly inaccurate. 

In another student writing example, reprinted below, the human straightforwardly pointed out that the student had gotten a historical fact wrong. ChatGPT appeared to affirm that the student’s mistaken version of events was correct.

Another example of feedback from a human and ChatGPT on the same essay

So how did ChatGPT’s review of my first draft stack up against my editor’s? One of the researchers on the study team suggested a prompt that I could paste into ChatGPT. After a few back and forth questions with the chatbot about my grade level and intended audience, it initially spit out some generic advice that had little connection to the ideas and words of my story. It seemed more interested in format and presentation, suggesting a summary at the top and subheads to organize the body. One suggestion would have made my piece too long-winded. Its advice to add examples of how AI feedback might be beneficial was something that I had already done. I then asked for specific things to change in my draft, and ChatGPT came back with some great subhead ideas. I plan to use them in my newsletter, which you can see if you sign up for it here. (And if you want to see my prompt and dialogue with ChatGPT, here is the link.) 

My human editor, Barbara, was the clear winner in this round. She tightened up my writing, fixed style errors and helped me brainstorm this ending. Barbara’s job is safe – for now. 

This story about AI feedback was written by Jill Barshay and produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education. Sign up for Proof Points and other Hechinger newsletters.

The post PROOF POINTS: AI writing feedback ‘better than I thought,’ top researcher says appeared first on The Hechinger Report.

]]>
https://hechingerreport.org/proof-points-writing-ai-feedback/feed/ 1 101344