R Programming Assignment Help | INFO411: Data Mining and Knowledge Discovery
This task is a real-world data mining problem. You are required to prepare a set of presentation slides that must include: (1) the full name and student number of each student in the group, and the contribution (in percent) of each group member; (2) your proposed data mining approach and methodology; (3) the strengths and weaknesses of your proposed approach; (4) the performance measures that can evaluate your data mining results; (5) the results and a brief discussion. Below is the recommended structure of your slides:
Task: Airplane Model Recognition
Airplane model recognition with data mining algorithms is a challenging problem. Several aspects make it challenging. Firstly, aircraft designs span a hundred years, including many thousands of different models and hundreds of different makes and airlines.
Secondly, aircraft designs vary significantly depending on size, destination, purpose, propulsion, and many other factors, including technology. Thirdly, any given aircraft model can be re-purposed or used by different companies, which causes further variations in appearance. Depending on the identification task, these variations may be considered either as noise or as useful information to be extracted. Finally, aircraft are largely rigid objects, which simplifies certain aspects of their modeling.
Recently, there has been increasing interest in developing deep learning based prediction models for aircraft recognition because of their powerful feature representation capability. Briefly, deep learning models automatically learn feature descriptors (which can be understood as attributes in data mining) from aircraft images and use them to train classifiers that can distinguish between different airplanes. This task is about training classification models for airplane model recognition. FGVC Airplane is a public airplane recognition dataset that is widely used for the development of airplane recognition models. More details of the dataset can be accessed from https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft.
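The idea of training a classifier on feature descriptors can be illustrated with a minimal sketch. The code below is not a method prescribed by this task: it uses a nearest-centroid classifier as a simple stand-in baseline, and the function names (fit_centroids, predict, accuracy), the class labels and the two-dimensional toy vectors are all hypothetical. Real descriptors would be much longer vectors.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence

Vector = Sequence[float]


def fit_centroids(X: List[Vector], y: List[str]) -> Dict[str, List[float]]:
    """Average the descriptors of each class into one centroid per class."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = defaultdict(int)
    for vec, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def predict(centroids: Dict[str, List[float]], vec: Vector) -> str:
    """Return the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vec))


def accuracy(centroids: Dict[str, List[float]], X: List[Vector], y: List[str]) -> float:
    """Fraction of descriptors whose predicted class matches the true class."""
    correct = sum(predict(centroids, vec) == label for vec, label in zip(X, y))
    return correct / len(y)


if __name__ == "__main__":
    # Toy descriptors and labels, invented purely for illustration.
    train_X = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]]
    train_y = ["class_a", "class_a", "class_b", "class_b"]
    test_X = [[0.1, 0.2], [1.0, 1.0]]
    test_y = ["class_a", "class_b"]

    model = fit_centroids(train_X, train_y)
    print("test accuracy:", accuracy(model, test_X, test_y))
```

Accuracy is only one possible performance measure. With many classes, per-class measures such as precision and recall may also be worth reporting.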
Some example images of this dataset are shown below.
The feature descriptors of the FGVC Airplane dataset, produced with a deep learning model (a DenseNet 201 model trained on the ImageNet dataset), have been provided to you with this instruction as the “airplane-model-recognition.zip” file. After unzipping it, you will find the following two files:
1) “training.csv” with 6667 feature descriptors extracted using images from the training split of the FGVC Airplane dataset. You should use these descriptors for training.
2) “testing.csv” with 3333 feature descriptors extracted using images from the testing split of the FGVC Airplane dataset. You should use these descriptors for testing.
Both files have the following data format: image_name<>class_name<>feature_descriptor. There are 102 image classes. The goal of this task is to train classification models for airplane model recognition using the provided feature descriptors.
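The record format above can be read with a short parser. The following is a minimal sketch, not a reference loader. It assumes that each line holds exactly one record, that the files have no header row, and that the values inside feature_descriptor are separated by commas and/or whitespace. The task description does not state any of these details, so check them against the actual files. The names Sample, parse_line and load_descriptors are hypothetical.

```python
import re
from typing import List, NamedTuple


class Sample(NamedTuple):
    image_name: str
    class_name: str
    features: List[float]


def parse_line(line: str) -> Sample:
    """Parse one 'image_name<>class_name<>feature_descriptor' record."""
    # Split on the first two '<>' separators only.
    image_name, class_name, descriptor = line.rstrip("\n").split("<>", 2)
    # ASSUMPTION: descriptor values are separated by commas and/or whitespace.
    tokens = [t for t in re.split(r"[,\s]+", descriptor.strip()) if t]
    return Sample(image_name.strip(), class_name.strip(), [float(t) for t in tokens])


def load_descriptors(path: str) -> List[Sample]:
    """Read every non-empty line of a descriptor file into Sample records."""
    # ASSUMPTION: no header row; skip the first line if the real file has one.
    with open(path, encoding="utf-8") as handle:
        return [parse_line(line) for line in handle if line.strip()]


if __name__ == "__main__":
    # Made-up record for illustration; real descriptors are much longer.
    demo = parse_line("img_0001.jpg<>example_class<>0.5, 1.25 2")
    print(demo.image_name, demo.class_name, len(demo.features))
```

Calling load_descriptors("training.csv") and load_descriptors("testing.csv") would then give the two sets of records. Because the separator is "<>" rather than a comma, a generic CSV reader with default settings would probably not split the fields correctly.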