CEGE0042: Spatial-temporal Data Analysis and Data Mining

During this course, you learn how to use RStudio and a number of other software packages to explore, visualise, model, cluster, classify and forecast spatial, temporal and spatio-temporal data using a variety of techniques.

Your task is to source and analyse a spatio-temporal dataset using the methods you have learned during the course. Depending on which dataset you choose, you may use different methods to analyse it.

Some examples:

Crime Location Data

Crime locations are usually recorded as point (event) data. Some options for analysing these data include:

Example data source: https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-Present/ijzp-q8t2
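
As an illustration only, a common first step with point (event) data is to aggregate the events into counts per grid cell and time period, which turns them into a space-time series that can be mapped or modelled. The sketch below is written in Python using the standard library; the sample events, the 500 m cell size and the function name bin_events are assumptions made for the example and are not part of the coursework or of the Chicago dataset.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample of crime events: (ISO timestamp, x, y) in projected metres.
events = [
    ("2020-01-03T14:20:00", 120.0, 80.0),
    ("2020-01-15T02:05:00", 140.0, 95.0),
    ("2020-01-28T19:40:00", 610.0, 420.0),
    ("2020-02-02T11:00:00", 130.0, 60.0),
    ("2020-02-17T23:15:00", 620.0, 480.0),
]

CELL_SIZE = 500.0  # assumed grid resolution in metres


def bin_events(events, cell_size):
    """Count events per (month, grid column, grid row)."""
    counts = Counter()
    for stamp, x, y in events:
        month = datetime.fromisoformat(stamp).strftime("%Y-%m")
        column = int(x // cell_size)
        row = int(y // cell_size)
        counts[(month, column, row)] += 1
    return counts


counts = bin_events(events, CELL_SIZE)
for key in sorted(counts):
    print(key, counts[key])
```

The choice of cell size and time period affects the patterns that appear, so it is worth stating and justifying in the experimental setup.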

Road Traffic Data

Traffic data (flows, travel times etc.) are usually recorded on road segments, which form a spatial network. Using the adjacency of the network, carry out short-term prediction of traffic flows. Some options to explore:

You could test this by comparing two or more methods.

Example data source: https://dot.ca.gov/programs/traffic-operations/mpr/pems-source
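
A minimal sketch of how network adjacency can enter a short-term prediction, assuming a toy three-segment network: a persistence baseline is compared with a predictor that adds the mean recent change on adjacent segments to the segment's own last value. The network, the flow values and the function names persistence and neighbour_trend are hypothetical, and this is a simple illustration rather than a recommended model.

```python
# Hypothetical road network: segment id -> ids of adjacent segments.
adjacency = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B"],
}

# Hypothetical flows (vehicles per 5 minutes) per segment, oldest first.
flows = {
    "A": [100, 110, 120, 130],
    "B": [200, 190, 210, 220],
    "C": [50, 60, 55, 65],
}


def persistence(flows, segment, t):
    """Baseline: predict the flow at t + 1 as the flow observed at t."""
    return flows[segment][t]


def neighbour_trend(flows, adjacency, segment, t):
    """Predict the flow at t + 1 as the segment's flow at t plus the
    mean change between t - 1 and t on its adjacent segments."""
    neighbours = adjacency[segment]
    changes = [flows[n][t] - flows[n][t - 1] for n in neighbours]
    return flows[segment][t] + sum(changes) / len(changes)


# Compare the two methods for segment B, predicting the last observation.
t = 2
actual = flows["B"][t + 1]
for name, prediction in [
    ("persistence", persistence(flows, "B", t)),
    ("neighbour trend", neighbour_trend(flows, adjacency, "B", t)),
]:
    print(name, prediction, "absolute error:", abs(actual - prediction))
```

On real data the comparison would be repeated over many segments and time steps, with the errors summarised by an appropriate error measure.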

Carry out spatio-temporal analysis or forecasting of Covid-19 pandemic spread. Covid-19 data are available from a range of places including:

Example projects

The datasets and tasks suggested here are just examples. You are encouraged to search for datasets and choose a topic you are interested in. There are various places you may find data such as government websites and repositories such as Kaggle (https://www.kaggle.com/datasets).

Your task is to produce a report with the following sections:

o A brief description of the method used to analyse the dataset.

o A detailed explanation of the experimental setup (e.g. the way the data were divided, the parameters that were used, the transformations that were used, e.g. differencing).

o Presentation of the results with appropriate graphs and/or maps.

o An assessment of the performance of the method (with error indices or other appropriate measures).

o If you used multiple models, did one model perform better than the other? If so, why might this be the case? What are the strengths of the model(s) in terms of interpretability, ease of implementation, running time etc.?

o How did the performance of the model vary across the study area?

o What were the limitations of the method(s) used?

o How could the method(s) be improved?
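
The experimental setup, error indices and spatial variation in performance listed above can be sketched together as follows. This is an illustrative Python example under stated assumptions: the two locations and their values are invented, the last two observations are held out for testing, and the forecast simply extends the last training value by the mean first difference (a drift forecast), which stands in for whatever method you actually use.

```python
import math

# Hypothetical observations: location -> time-ordered values.
series = {
    "north": [10.0, 12.0, 14.0, 16.0, 18.0, 20.0],
    "south": [5.0, 9.0, 4.0, 10.0, 3.0, 11.0],
}

N_TEST = 2  # assumed: hold out the last two observations for testing


def split_chronologically(values, n_test):
    """Keep time order: earlier values train, later values test."""
    return values[:-n_test], values[-n_test:]


def difference(values):
    """First differences, a common transformation to remove trend."""
    return [b - a for a, b in zip(values, values[1:])]


def drift_forecast(train, horizon):
    """Extend the last training value by the mean first difference."""
    steps = difference(train)
    mean_step = sum(steps) / len(steps)
    return [train[-1] + mean_step * (h + 1) for h in range(horizon)]


def rmse(actual, predicted):
    """Root mean squared error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Compute the error measures separately for each location so that
# variation in performance across the study area is visible.
errors = {}
for location, values in series.items():
    train, test = split_chronologically(values, N_TEST)
    forecast = drift_forecast(train, len(test))
    errors[location] = {"rmse": rmse(test, forecast), "mae": mae(test, forecast)}

for location, scores in errors.items():
    print(location, scores)
```

Reporting the errors per location, for example on a map, rather than only as a single overall figure helps to answer how performance varied across the study area.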

Report Length

The report does not have a word limit but is limited to six A4 pages in Arial, font size 11, including tables and figures but excluding references. This is a common requirement when writing short papers, e.g. for an academic conference or journal. You should divide the content of your report among the sections according to the proportion of marks available for each one.