THESIS

source code

Essay Similarity

Thoriq Adillah 05 Feb 2023

#php

The Motivation

Because of the recent COVID-19 pandemic, Learning Management Systems (LMS) have gained more popularity than ever. One of the most popular LMSs in the world is Moodle. With an LMS, we can manage various parts of the learning process, such as class management, assignment management, and online exams.

An essay is a type of question that requires a deeper understanding both to answer and to assess. The more essay questions a teacher has to evaluate, the more likely the consistency of their assessment will decrease. Therefore, we need a tool to assist teachers in the process of grading essay questions.

While Moodle is a popular LMS, unfortunately it has no auto-grading feature for the essay question type. Fortunately, Moodle can be extended with plugins, and of course people have already tried building plugins for this particular problem; the most popular one for auto-grading essays is Essay Auto-grade. The way the plugin works is that we have to define the target phrases or keywords that must appear in the student’s answer. If the student’s answer contains these keywords, points are awarded for the automatic grade. In my opinion, there is a disadvantage to this approach: if the potential ‘answer key’ to the question is a long text, the teacher has to come up with lots of keywords, which can be a hassle.

The Solution

We need a more flexible way of doing auto-grading, and one option is machine learning. There is a general method teachers use when assessing an essay answer, namely matching the student’s answer against the answer key. We can implement this idea by representing both texts as weighted term vectors and comparing them with an algorithm called Cosine Similarity.
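
As a rough illustration of the idea (not the plugin’s exact code), cosine similarity treats the two texts as term-weight vectors and measures the angle between them: the dot product divided by the product of the vector lengths. A minimal PHP sketch, assuming $a and $b are arrays keyed by the same vocabulary:

```php
<?php
// Cosine similarity between two term-weight vectors, e.g. the answer key
// and a student answer. $a and $b are arrays keyed by vocabulary terms.
function cosineSimilarity(array $a, array $b): float
{
    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $term => $weight) {
        $dot   += $weight * ($b[$term] ?? 0.0);
        $normA += $weight * $weight;
    }
    foreach ($b as $weight) {
        $normB += $weight * $weight;
    }
    if ($normA == 0.0 || $normB == 0.0) {
        return 0.0; // an empty vector has no direction to compare
    }
    return $dot / (sqrt($normA) * sqrt($normB));
}
```

A score of 1 means the two texts use exactly the same weighted terms, while 0 means they share none.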

The Challenge

While the solution seems obvious, this was my first time developing a plugin and doing machine learning. I never thought I would do machine learning, or even consider it in the future, because I didn’t like it the first time I wrote machine learning code.

Originally, my thesis was going to be a web development project: a booking management system for a sports center at my university. It turned out that someone had already built it, although at that time it had not yet been deployed to the public. Because I had run out of research topics and didn’t know what to do, I consulted my professor to see if he had a topic I could work on. He had an idea about developing a Moodle plugin for automatic assessment of essay answers with machine learning. At first I didn’t intend to take the topic, because I had never developed a plugin, let alone done machine learning. However, since I was running out of time and out of options, I took it anyway.

Actually, the machine learning part is not that difficult, because many people have already implemented the pieces I needed, such as the stemmer, cosine similarity, TF-IDF, and LSA (more on those later), so I could directly copy and paste the code. However, this did not apply to the plugin development. At first I had trouble with it, and I was completely confused by Moodle’s API documentation. In fact, during the first month I barely touched the plugin because I was too confused. With less and less time remaining, I eventually developed the plugin by blindly following how Essay Auto-grade was built, and 2-3 weeks later I finally understood how the plugin works.

The Implementation

Cosine Similarity

Let’s discuss how the machine learning side is implemented. For cosine similarity to work well, we need to do some data cleaning. The data in question is, of course, the text of the answer key and the student’s answer. There are several steps we need to do to clean the data (a rough sketch of the whole pipeline follows the list):

  • Text normalization, which normalizes the text by lower-casing it, removing special characters, and trimming whitespace
  • Tokenizing, which splits the text into individual words and counts the number of words in the text
  • Stopword removal, which removes words that barely carry any meaning for the text as a whole, such as a, the, is, etc.
  • Stemming, which reduces a word to its root form so that all the words being compared are in the same class, e.g. eating becomes eat
  • Weighting, for which we will use a method called TF-IDF to weight each token
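
Here is a rough sketch of those cleaning steps. The $stopwords list and $stem callback are assumed to be supplied elsewhere (in the plugin they come from the copied stemmer and stopword code); the names here are just for illustration.

```php
<?php
// Rough sketch of the cleaning pipeline described above.
function preprocess(string $text, array $stopwords, callable $stem): array
{
    // 1. Text normalization: lower case, strip special characters, trim whitespace.
    $text = mb_strtolower($text);
    $text = preg_replace('/[^a-z0-9\s]/u', ' ', $text);
    $text = trim(preg_replace('/\s+/', ' ', $text));

    // 2. Tokenizing: split the text into individual words.
    $tokens = $text === '' ? [] : explode(' ', $text);

    // 3. Stopword removal.
    $tokens = array_values(array_diff($tokens, $stopwords));

    // 4. Stemming: reduce every word to its root form.
    return array_map($stem, $tokens);
}
// 5. Weighting with TF-IDF is sketched a bit further below.
```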

Because this was the first time I developed a plugin, and I didn’t know how to include a library in a plugin, I just copied and pasted the code I needed into my own code. The English stemming and the cosine similarity were copied from PHP NLP-Tools, the TF-IDF from PHP-ML, and the Indonesian stemming from this repo.
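
For a sense of what the TF-IDF step does, here is a simplified sketch of the idea (not the PHP-ML implementation): each token’s weight is its term frequency multiplied by the log of how rare the term is across the documents.

```php
<?php
// Simplified TF-IDF weighting. $docsTokens is an array of token lists,
// e.g. [answer key tokens, student answer tokens]; returns one weight
// vector per document.
function tfidf(array $docsTokens): array
{
    $n = count($docsTokens);

    // Document frequency: in how many documents does each term appear?
    $df = [];
    foreach ($docsTokens as $tokens) {
        foreach (array_unique($tokens) as $term) {
            $df[$term] = ($df[$term] ?? 0) + 1;
        }
    }

    // Weight = term frequency * log(N / document frequency).
    $vectors = [];
    foreach ($docsTokens as $i => $tokens) {
        foreach (array_count_values($tokens) as $term => $tf) {
            $vectors[$i][$term] = $tf * log($n / $df[$term]);
        }
    }
    return $vectors;
}
```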

P.S. there is a demo at the end of this article. Keep reading!

Latent Semantic Analysis (LSA)

But there is a problem with cosine similarity: it cannot detect synonyms. This is unfortunate, because when comparing two texts, we also have to take synonyms into account. After some testing and comparing against manual assessment, the gap between the automatic score and the teacher’s manual score turned out to be pretty large. Here is the result of the testing on 4 different questions:

[Screenshot: test results comparing automatic and manual grading on 4 questions, cosine similarity only]

Because of that, we need some way to take synonyms into account. One approach is topic modelling, and one of the most popular topic modelling techniques is Latent Semantic Analysis (LSA). To implement LSA, we need to perform a Singular Value Decomposition (SVD), which factors the term-document matrix into three matrices: U, S, and VT.

This was actually another challenge for me, because I had no idea how to implement SVD, let alone LSA. People say that once you have done SVD, you are most of the way to LSA. But the problem was: what do I do after the SVD is done? Luckily, this Stanford lecture video told me what to do after the SVD has been calculated. The overall series on SVD is really helpful too, so I recommend checking it out!

After we apply LSA, the automatic assessment gets much closer to the teacher’s manual assessment, and in general the gap is reduced by a lot:

[Screenshot: test results after adding LSA, showing a much smaller gap between automatic and manual grading]

The validity of using LSA for auto-grading is supported by several papers, particularly this IEEE paper, which shows that LSA performs better than LDA (another topic modelling method). Another paper that uses LSA as its auto-grading method is this one.

Demo

Phew, that was a lot of reading. Now here is what we’ve been waiting for: a simple demo!

Note

There is a slight difference in the LSA implementation between the demo and the actual plugin. In the plugin, the LSA is calculated using truncated SVD. In general, SVD produces 3 matrices: U, S, and VT. The LSA is calculated by truncating the diagonal matrix S, preserving the desired number of dimensions and setting the rest to 0. When the 3 matrices are multiplied back together, this performs a dimensionality reduction, so the weights of the words in the document that belong to the most important topics are boosted, and these adjusted weights are used to improve the cosine similarity.
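
As an illustration of that truncation step, here is a simplified sketch, assuming the U, S, and VT matrices come back from the SVD routine as plain numerically indexed PHP arrays and $k is the number of topic dimensions to keep.

```php
<?php
// Keep only the first $k singular values of the diagonal matrix S,
// setting everything else to 0.
function truncateS(array $s, int $k): array
{
    foreach ($s as $i => $row) {
        foreach ($row as $j => $value) {
            if ($i >= $k || $j >= $k) {
                $s[$i][$j] = 0.0;
            }
        }
    }
    return $s;
}

// Plain matrix multiplication on 2D arrays.
function matmul(array $a, array $b): array
{
    $result = [];
    foreach ($a as $i => $rowA) {
        foreach (array_keys($b[0]) as $j) {
            $sum = 0.0;
            foreach ($rowA as $x => $value) {
                $sum += $value * $b[$x][$j];
            }
            $result[$i][$j] = $sum;
        }
    }
    return $result;
}

// Reduced-rank reconstruction A_k = U * S_k * VT. The reconstructed term
// weights then replace the original TF-IDF weights before the cosine
// similarity is computed.
function lsaReconstruct(array $u, array $s, array $vt, int $k): array
{
    return matmul(matmul($u, truncateS($s, $k)), $vt);
}
```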

In the demo, on the other hand, the LSA is calculated by multiplying the original vector with the V matrix from the SVD. This way, we get a word-to-topic mapping, which is likewise used to improve the cosine similarity.
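
The demo’s variant boils down to something like this, shown in PHP for consistency even though the demo itself runs in JavaScript; $docVector and $v are assumed to come from the TF-IDF and SVD steps.

```php
<?php
// Project a document's term-weight vector into topic space with V, where
// $v[$termIndex][$topicIndex] is the word-to-topic matrix from the SVD and
// $docVector[$termIndex] is the document's TF-IDF vector. The projected
// vectors are then compared with cosine similarity as before.
function projectToTopics(array $docVector, array $v): array
{
    $topics = [];
    foreach ($v as $termIndex => $topicWeights) {
        foreach ($topicWeights as $topicIndex => $weight) {
            $topics[$topicIndex] = ($topics[$topicIndex] ?? 0.0)
                + ($docVector[$termIndex] ?? 0.0) * $weight;
        }
    }
    return $topics;
}
```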

Why is the implementation different? I tried to make the demo identical to the plugin, but failed because the result of the SVD is different, particularly the V matrix, despite both being taken from the same SVD implementation (from JAMA), and I don’t know what the cause is. And when I tried to use the same approach to calculate the LSA, the cosine similarity + LSA result was no different from standalone cosine similarity.

I tried to debug it, and my suspicion is that JavaScript’s Math.hypot gives a slightly different result compared to PHP’s hypot, so the V matrix ends up with different values. But overall, the results are generally the same.
