Study finds that three in five teachers can’t identify AI-written content


  • A focus group study found that three in five (60%) teachers struggle to distinguish between AI-written and student-written content
  • Teachers misidentified nearly half (47%) of the answers in the study
  • Teachers share five tell-tale signs that expose AI, including Americanisms and a lack of studied examples

With AI becoming an increasingly prominent presence in society, questions are rightly being asked about how it can be used in education, both positively and negatively, with one of the most common concerns being: ‘could students realistically use AI to cheat?’

Whilst some have played down the risk of AI being used to cheat in schools, the experts at High Speed Training, providers of online courses in education, carried out an experiment, asking a focus group of UK-based teachers to blindly review both AI-generated and student-generated content*, to see if they could identify the AI-created content and provide a grade for each answer.

The focus group consisted of 15 secondary school teachers from several written, essay-based subjects. Each was provided with two answers to real questions from past exam papers covering English Language, Geography and Religious Studies, and was required to say whether they believed the content was written by students or by AI. A focus group of three GCSE-aged students created the human content, with the same questions also answered by ChatGPT.

With each participant reviewing one AI answer and one student answer, three in five (60%) teachers misidentified at least one of the answers they were asked to review, and one in three (33%) failed to correctly identify both of the answers they reviewed. In total, nearly half (47%) of all the answers reviewed by the focus group were wrongly identified.

Teachers were also asked to provide a rough indication of the quality of each answer by assigning a numerical ‘grade’ from 1 to 5. Answers generated by ChatGPT scored an average grade of 4, with teachers generally viewing the content as being of a high standard.

Teachers were also more likely to assign a higher grade to an answer when they believed it was AI-generated. When teachers correctly identified AI content, they assigned an average grade of 4.3, whilst the same answers were graded at an average of 3.7 when the teacher thought they were human, suggesting that teachers expect AI to create high-quality content.

The study also tested whether another AI program could detect whether the answers were written by humans or by AI. Entering both the student and ChatGPT answers into Google Bard, the experiment found that the software isn’t always successful in identifying where AI has been used, as some in the education sector have already discovered1. Of the answers, Bard misidentified a third (33%), wrongly labelling one human answer as AI and two AI answers as human.

Dr Richard Anderson, Head of Learning and Development at High Speed Training, comments: “AI and chatbots have been huge topics of discussion recently, with many in the education sector wondering if they could be used to cheat in exams and coursework. We wanted to put this to the test to see whether it actually does pose a risk in schools and learning environments.

“Whilst it’s concerning that 60% of teachers struggled to correctly identify where AI had been used, many of the teachers involved had not encountered AI before, and we’re confident that with awareness and exposure, teachers will be able to correctly spot it more frequently. Free and easy access to software such as ChatGPT and other bots is still a relatively new phenomenon, so there is bound to be a period of adjustment for teachers and educators.

“There are positives from the experiment, including that there are several tell-tale signs that teachers can use to spot where a student may have used AI to create their work. As these technologies continue to evolve, educators will have to continue to develop their skills and training to ensure that children are still receiving the best education they can.”

The teachers provided feedback on each answer; these are the five most common giveaways they shared that the text had been AI-generated:

  1. Americanised language

One of the simplest identifying signs of AI use is Americanised spelling. Whilst this is easy for students to remove if they know what they’re looking for, they may overlook it, leaving the words as small clues a teacher can pick up on. 

  2. Lack of personal case studies

Students are instructed to use memorised and previously studied examples to help illustrate their points and reinforce their argument. A total lack of anecdotal evidence and a reliance on the information provided with the question could suggest AI involvement.

  3. Vocabulary used

Whilst the AI was instructed to answer questions in simplified language, several teachers spotted that some of the answers contained language you would not expect to see from GCSE-aged pupils. Students tend to use more informal language, and regular use of advanced vocabulary is not common.

  4. Formulaic structure

AI will try to neatly package an answer and concisely cover every point it is asked to address. The teachers in the study pointed out that some answers seemed to try to fit everything in, whereas many students would be unlikely to address every single point in a question.

  5. It’s a little too perfect

Even the best students make some small mistakes in their writing, whether it’s spelling, grammar, or a tendency to waffle and include unnecessary words. AI created content is unlikely to include any of these, and may stand out as being a little too perfect.

Of course, there are more obvious ways to tell if AI has been used!

Credit @venturetwins 

To find out more about High Speed Training’s online education courses, and to test your own judgement to see whether you can spot the difference between human and AI content, visit: https://www.highspeedtraining.co.uk/hub/students-using-ai/