Python Tutoring | DATA2001 – Data Science

This Australian Python tutoring session focuses on programming and data analysis based on a given data set.

DATA2001 – Data Science, Big Data, and Data Diversity Week 8: Administrativa

DATA2001 “Data Science, Big Data, and Data Diversity” – 2019 (Roehm) 1


Practical Assignment: Cyclability Analysis

– Assignment to be published end of week (Canvas: Modules –> Assignment)
– Worth 20% of the final grade in DATA2001
– Due in tutorial of Week 12

[https://www.walkscore.com/AU-NSW/Sydney]
Practical Assignment: Cyclability Analysis

– Goal: Practical experience with data variety, data analysis, and presentation
– Technologies as covered in this course: Python, Jupyter notebooks, and SQL
– Three tasks:
– Data import and integration

• We provided census data and some bike parking data
• Needs to be combined, e.g. via spatial join (to be covered in Week 9)
• Feel free to extend with own datasets

– Cyclability Analysis

• Computation of cyclability score; example formula given
• When adding other datasets, feel free to adjust formula
• Correlation of your score with some ABS statistics
– Documentation and (brief) Report
– Some additional tasks/options for teams in advanced streams
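
As a rough illustration of the analysis task above, the sketch below computes a toy cyclability score per area and correlates it with one statistic. The formula used here (a sum of z-scores of two per-area densities) is only an assumed stand-in, since the example formula from the handout is not reproduced in these slides. The area names, the `bike_pods` and `median_income` fields, and all numbers are made up for illustration.

```python
from statistics import mean, pstdev

# Hypothetical toy data: one record per SA2 area (all values invented).
areas = [
    {"name": "A", "population": 12000, "land_area": 4.0, "bike_pods": 8, "median_income": 60000},
    {"name": "B", "population": 8000, "land_area": 8.0, "bike_pods": 2, "median_income": 45000},
    {"name": "C", "population": 20000, "land_area": 5.0, "bike_pods": 15, "median_income": 72000},
    {"name": "D", "population": 5000, "land_area": 10.0, "bike_pods": 1, "median_income": 50000},
]


def z_scores(values):
    """Standardise values to mean 0 and population standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Densities per unit of land area, so large and small areas are comparable.
pop_density = [a["population"] / a["land_area"] for a in areas]
pod_density = [a["bike_pods"] / a["land_area"] for a in areas]

# Assumed stand-in formula: add the standardised densities.
scores = [zp + zb for zp, zb in zip(z_scores(pop_density), z_scores(pod_density))]
for area, score in zip(areas, scores):
    area["cyclability"] = score

# Correlate the score with one (invented) income statistic.
r = pearson(scores, [a["median_income"] for a in areas])

for area in sorted(areas, key=lambda a: a["cyclability"], reverse=True):
    print(f'{area["name"]}: {area["cyclability"]:.2f}')
print(f"Correlation with median income: {r:.2f}")
```

Standardising each measure before adding keeps one measure from dominating the score just because of its units. When further datasets are added, each would contribute one more z-score term under this assumed scheme.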


Provided Datasets (to be published on Canvas)

– ABS Data
– Census data on neighbourhoods (SA2-level areas) in Greater Sydney, such as population, land_area, number of dwellings
– Business statistics per SA2-area
– Income and rent statistics to check for correlation with
– Bike-Sharing “Pods”
– One example of transport data: names and locations of dedicated bike parking locations (‘pods’).
– Note that SA2-level data from the ABS does not always match suburbs, and that the bike-pods have a GPS location, but not the neighbourhoods
– cf. tutorial after break on how to retrieve boundary data for neighbourhoods too
– Adding more datasets from your side is explicitly encouraged.
– Try different types and forms, not just CSV…
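
To make the integration step concrete, here is a minimal sketch of a spatial join that assigns each bike pod to the area whose boundary contains its GPS location. It uses a plain ray-casting point-in-polygon test so that it runs without extra libraries. In practice a spatial library or a spatial database extension would usually do this work. The boundary rectangles, pod names, and coordinates are all hypothetical, and real SA2 boundaries are far more irregular.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray casting: toggle 'inside' for each edge crossed by a ray going east."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that horizontal line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Hypothetical area boundaries as rings of (lon, lat) vertices.
sa2_boundaries = {
    "Area West": [(151.00, -33.90), (151.10, -33.90), (151.10, -33.80), (151.00, -33.80)],
    "Area East": [(151.10, -33.90), (151.20, -33.90), (151.20, -33.80), (151.10, -33.80)],
}

# Hypothetical bike pods with GPS coordinates only.
bike_pods = [
    {"name": "Pod 1", "lon": 151.05, "lat": -33.85},
    {"name": "Pod 2", "lon": 151.15, "lat": -33.88},
    {"name": "Pod 3", "lon": 151.17, "lat": -33.82},
    {"name": "Pod 4", "lon": 151.30, "lat": -33.85},  # lies outside both areas
]


def spatial_join(points, boundaries):
    """Attach the name of the containing area (or None) to each point."""
    joined = []
    for p in points:
        area = next(
            (name for name, ring in boundaries.items()
             if point_in_polygon(p["lon"], p["lat"], ring)),
            None,
        )
        joined.append({**p, "sa2": area})
    return joined


joined = spatial_join(bike_pods, sa2_boundaries)

# Aggregate to one count per area, ready to combine with per-area census data.
pods_per_area = {}
for row in joined:
    if row["sa2"] is not None:
        pods_per_area[row["sa2"]] = pods_per_area.get(row["sa2"], 0) + 1

print(pods_per_area)
```

The loop over every boundary for every point is fine for a toy example but slow for large data; spatial indexes exist for that case.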

– Groupwork
– Teams of 2 (unless odd-size class or other good reasons)
– All team members should be in the same tutorial
– Deliverables see handout, page 3
– Due on Friday of Week 12
– Submission page and marking rubric to be published in Canvas
– Late submissions: -20% of achieved mark per day late
– Demo in Week 12
– There will be a short demo during the tutorials to tutors
– Individual grades can be scaled based on participation in project or demo
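
The late-submission rule in this list can be read in more than one way. The sketch below shows one possible reading, in which 20% of the achieved mark is deducted for each day late and the result never drops below zero. The function name, its parameters, and the linear interpretation are assumptions; the marking rubric on Canvas is the authority.

```python
def apply_late_penalty(achieved_mark, days_late, penalty_per_day=0.20):
    """Deduct a fixed fraction of the achieved mark per day late (assumed linear)."""
    if days_late <= 0:
        return achieved_mark
    # Fraction of the mark that remains, floored at zero.
    remaining = max(0.0, 1.0 - penalty_per_day * days_late)
    return achieved_mark * remaining


# Example: a submission that earned 80 marks and is two days late.
print(apply_late_penalty(80, 2))
```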