Student Performance Prediction


Education Data Analysis

Statement of Problem

Data mining is widely used in the educational field. Student performance is a major concern for educational institutions, where several factors may affect it. Prediction requires three components: the parameters that affect student performance, the data mining methods, and the data mining tool. These parameters may be psychological, personal, or environmental. The study is conducted to maintain the quality of education at this university by minimizing the adverse effect of these factors on students' performance. In this project, student performance is evaluated using the R programming language as the data mining tool. By applying data mining techniques to student data we can:

  1. Obtain knowledge that describes student performance.
  2. Improve education quality and students' performance, and decrease the failure rate. All of these will help to improve the quality of the institute.

Aims and Objectives

The aim of this project is to evaluate students' results in an institution effectively and accurately using the R programming language. The objectives are:

  1. To study and identify the gaps in existing prediction methods.
  2. To study and identify the variables used in analyzing students' performance.
  3. To study the existing methods for predicting students' performance.


This chapter explains the architecture of an integrated framework for evaluating student performance using the R programming language.


The integrated framework evaluates student performance from the students' first-semester results. It includes the following steps:

  1. Before using a data set in RStudio, you need to install a package called rio. rio is a relatively recent R package, developed by Thomas J. Leeper, which makes data import and export in R painless and quick.
  2. Once the rio package is installed, you can import a data set from Excel or other sources.
  3. After the data set has been imported, the code to analyze the data can be written in RStudio.
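
The three steps above can be sketched as follows. This is a minimal illustration, not the project's actual script: the sample scores, the column names (`student_id`, `maths`, `english`) and the pass mark of 40 are all assumptions made for the example, and a temporary CSV file stands in for the Excel workbook. If rio is not installed, the sketch falls back to base R's `read.csv()`.

```r
# Hypothetical first-semester results; the names and values are illustrative only.
results <- data.frame(
  student_id = c("S001", "S002", "S003", "S004"),
  maths      = c(72, 38, 55, 90),
  english    = c(65, 36, 35, 81),
  stringsAsFactors = FALSE
)

# Write the sample to a temporary CSV so the import step has a file to read.
path <- tempfile(fileext = ".csv")
write.csv(results, path, row.names = FALSE)

# Step 1: install rio once (uncomment on first use).
# install.packages("rio")

# Step 2: import the data set. rio::import() chooses a reader from the file
# extension, so the same call can be pointed at an .xlsx file instead.
if (requireNamespace("rio", quietly = TRUE)) {
  scores <- as.data.frame(rio::import(path))
} else {
  scores <- read.csv(path, stringsAsFactors = FALSE)  # base R fallback
}

# Step 3: evaluate performance from the imported results.
pass_mark <- 40  # assumed threshold, not taken from the project
scores$average <- rowMeans(scores[, c("maths", "english")])
scores$status  <- ifelse(scores$average >= pass_mark, "pass", "fail")

# Share of students whose average falls below the pass mark.
failure_rate <- mean(scores$status == "fail")

print(scores)
print(failure_rate)
```

Keeping the import in a single call means the rest of the analysis does not change when the source file moves from CSV to Excel.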

Summary and Conclusion

Summary

Data mining detects relevant patterns in databases and data warehouses, using different programs and algorithms to examine current and historical data, which can then be analyzed to predict future trends. Over the years, statisticians have used various manual techniques for the benefit of business, predicting trends and results from data. Businesses built huge databases and data warehouses that became "data tombs": the data was never transformed into information. With the help of data mining tools and algorithms, professionals from different areas can now extract knowledge quickly and with ease. In education, this knowledge can cover the performance of students in a certain course, grade inflation, and the anticipated percentage of failing students, and it can assist in the grading system. To our knowledge, there are no studies that use classification to predict a student's final outcome based on his or her grades in a program study plan. Analyzing all the courses required in the study plan will identify the courses that have a large impact on final results.

Conclusion

In this work, the data for the integrated framework for evaluating student performance is easily collected and managed, with improved reliability, durability and efficiency. The most prominent feature is that student performance can easily be analysed using the R programming language, which is widely used by statisticians to get accurate results. Many previous works use WEKA and other data mining techniques. If WEKA is not given a similar item set, it shows an error pop-up message, because WEKA does not support undeclared numerical or string values. WEKA also cannot generate an individual student's performance; only the frequent patterns in overall student performance can be found. From these results, teachers can judge which types of students are likely to get good remarks next time, even in the teacher's absence.


R also has limitations. The quality of some packages is less than perfect. There is no one to complain to if something doesn't work, since R is a software application that many people devote their own time to developing. R commands give little thought to memory management, so R can consume all available memory.
