A group of Russian researchers has used AI-based models to distinguish high academic achievers from lower ones based on their social media posts.
The prediction model uses a mathematical textual analysis that registers users' vocabulary (its range and the semantic fields from which concepts are taken), characters and symbols, post length and word length.
Every word has its own rating (a kind of IQ). Scientific and cultural topics, English words, and longer words and posts rank highly and serve as indicators of good academic performance.
An abundance of emojis, words or whole phrases written in capital letters, and vocabulary related to horoscopes, driving and military service indicate lower grades in school.
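As a rough illustration of the surface signals described above (post length, word length, words in capital letters, emojis), here is a minimal Python sketch. The function names, the emoji code-point ranges and the choice of features are assumptions made for illustration; they are not taken from the study.

```python
import re

# Runs of letters only (digits, underscores and symbols are excluded).
WORD_RE = re.compile(r"[^\W\d_]+")


def is_emoji(ch: str) -> bool:
    """Rough emoji check covering only a few common Unicode blocks (assumption)."""
    cp = ord(ch)
    return 0x1F300 <= cp <= 0x1FAFF or 0x2600 <= cp <= 0x27BF


def post_features(text: str) -> dict:
    """Compute simple, hypothetical surface features for one post."""
    words = WORD_RE.findall(text)
    n_words = len(words)
    # Count words written entirely in capitals, ignoring one-letter words.
    caps = [w for w in words if len(w) > 1 and w.isupper()]
    return {
        "post_length": len(text),
        "n_words": n_words,
        "mean_word_length": sum(map(len, words)) / n_words if n_words else 0.0,
        "caps_share": len(caps) / n_words if n_words else 0.0,
        "emoji_count": sum(is_emoji(ch) for ch in text),
    }
```

Calling `post_features` on a post returns a dictionary of numbers that a downstream model could use; how the researchers actually weighted such signals is not described here.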
“At the same time, posts can be quite short; even tweets are quite informative,” said Ivan Smirnov, leading research fellow at the Institute of Education of the Higher School of Economics University in Moscow.
The study traces the career paths of 4,400 students in 42 Russian regions.
“Since this kind of data, in combination with digital traces, is difficult to obtain, it is almost never used,” Smirnov said.
This kind of dataset makes it possible to develop a reliable model that can be applied to other settings.
“And the results can be extrapolated to all other students: high school students and middle school students,” Smirnov said in a paper published in the journal EPJ Data Science.
The researchers said it is important that the model worked successfully on datasets from different social media sites, such as VK (a Russian online social media and social networking service) and Twitter, thereby showing that it can be effective in different contexts.
In addition, the model can be used to predict very different characteristics, from student academic performance to income or depression.
The study data included information about the students' VK accounts (3,483 students consented to provide this information).
In the study, unsupervised machine learning with word vector representations was performed on a VK post corpus (totaling 1.9 billion words, with 2.5 million unique words).
It was then combined with a simpler supervised machine learning model that was trained on individual posts and taught to predict PISA (Programme for International Student Assessment) scores.
Posts from publicly viewable VK pages were used as a training sample; this included a total of 130,575 posts from 2,468 subjects who took the PISA test.
The test allowed the researchers to assess a student's academic aptitude as well as their ability to apply their knowledge in practice, the authors wrote.
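The two-stage idea described above (word vectors learned without labels, followed by a simpler supervised model trained on posts) can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the tiny hand-written vectors stand in for embeddings learned from a large corpus, averaging is just one common way to turn word vectors into a post vector, and plain linear regression stands in for the unspecified supervised model. The posts and scores are invented, not data from the study.

```python
# Toy 2-d word vectors; in a real pipeline these would be learned
# without labels from a large text corpus. Values are invented.
TOY_VECTORS = {
    "physics": (1.0, 0.2),
    "theatre": (0.8, 0.4),
    "horoscope": (-0.9, 0.1),
    "lol": (-0.6, -0.3),
}


def embed_post(text, vectors=TOY_VECTORS):
    """Average the vectors of known words; unknown words are skipped."""
    found = [vectors[w] for w in text.lower().split() if w in vectors]
    if not found:
        return (0.0, 0.0)
    dim = len(found[0])
    return tuple(sum(v[i] for v in found) / len(found) for i in range(dim))


def fit_linear(X, y, lr=0.1, epochs=2000):
    """Least-squares linear regression fitted by batch gradient descent."""
    dim, n = len(X[0]), len(X)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        # Prediction error for every training post under current weights.
        errs = [sum(wi * xi for wi, xi in zip(w, x)) + b - t
                for x, t in zip(X, y)]
        for i in range(dim):
            w[i] -= lr * 2 / n * sum(e * x[i] for e, x in zip(errs, X))
        b -= lr * 2 / n * sum(errs)
    return w, b


def predict(text, w, b):
    """Predict a score for one post from its averaged word vector."""
    x = embed_post(text)
    return sum(wi * xi for wi, xi in zip(w, x)) + b


# Invented (post, score) pairs used only to exercise the code.
posts = ["physics theatre", "physics", "horoscope lol", "lol"]
scores = [560.0, 580.0, 420.0, 440.0]
w, b = fit_linear([embed_post(p) for p in posts], scores)
```

With these toy inputs, `predict("physics", w, b)` comes out higher than `predict("horoscope", w, b)`, which mirrors the direction of the effect reported in the article but says nothing about its real size.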