December 3, 2023

Python Big Data Assignment | INFS 5095 – Big Data Basics

This case study shares an assignment from an Australian course; it covers big data processing in Python.

Assessment task

In this assessment you are required to answer five questions using Spark. For each question you will
need to write code which uses the appropriate transformations and actions.
The main input file for this assessment is DataCoSupplyChainDataset.csv, a subset of a Kaggle
dataset containing supply chain data from a company called DataCo Global. A second file,
DescriptionDataCoSupplyChain.csv, describes the columns in the main dataset.

You should use the following template file to write your code: test3_solutions.py. See the video
instructions provided with the assessment instructions for an example of how to use the template.

Q1. Load the data, convert to dataframe and apply appropriate column names and variable types.
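
As a rough illustration of Q1, the sketch below reads the CSV as text, parses each line, casts the fields and builds a dataframe. The helper names (`parse_line`, `cast_fields`, `load_dataframe`) and the `column_names` and `casters` arguments are hypothetical, not part of the template file, and the sketch assumes every record sits on a single line with a header row first.

```python
import csv


def parse_line(line):
    """Split one CSV line into fields, respecting quoted commas."""
    return next(csv.reader([line]))


def cast_fields(fields, casters):
    """Apply one caster (e.g. str, int, float) to each field, in order."""
    return [cast(value) for cast, value in zip(casters, fields)]


def load_dataframe(spark, path, column_names, casters):
    """Hypothetical loader: `spark` is an existing SparkSession.

    Assumes the first line of the file is a header and that no field
    contains an embedded newline.
    """
    lines = spark.sparkContext.textFile(path)
    header = lines.first()
    rows = (lines.filter(lambda l: l != header)   # drop the header row
                 .map(parse_line)                 # text -> list of strings
                 .map(lambda f: cast_fields(f, casters)))
    # A list of names as the schema leaves Spark to infer column types
    # from the Python values produced by the casters.
    return spark.createDataFrame(rows, column_names)
```

The two small helpers run without Spark, so they can be checked on a sample line before the full pipeline is executed.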

Q2. Determine what proportion of all transactions is attributed to each customer segment in the dataset
(e.g. Consumer = x%, Corporate = y%, etc.).
This question uses the Customer Segment field.
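
One possible way to approach Q2 is to count rows per segment and divide by the total. In the sketch below, `rows` is assumed to be an RDD of parsed rows (lists of fields) and `segment_index` the position of the Customer Segment field; both names, and the function names, are illustrative.

```python
def proportions_from_counts(counts):
    """Turn {label: count} into {label: percentage of the total}."""
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


def segment_proportions_rdd(rows, segment_index):
    """Percentage of transactions per segment for an RDD of parsed rows."""
    # countByValue is an action that returns a dict of label -> count
    # on the driver, which is small here (one entry per segment).
    counts = rows.map(lambda r: r[segment_index]).countByValue()
    return proportions_from_counts(counts)
```

With toy counts such as `{"Consumer": 2, "Corporate": 1}`, `proportions_from_counts` gives each segment's share of the three transactions as a percentage.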

Q3. Determine which three products had the highest total sales.
This question uses the Order Item Total and Product Name fields.
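
A minimal sketch of Q3, assuming "sales" means the sum of Order Item Total per product: map each row to a (product, amount) pair, add the amounts per product, then take the three largest. The function names and the index arguments are hypothetical.

```python
def to_sales_pair(row, name_index, total_index):
    """Map a parsed row to (product name, order item total as float)."""
    return (row[name_index], float(row[total_index]))


def top_three_products_rdd(rows, name_index, total_index):
    """Return the three (product, total sales) pairs with the largest totals."""
    return (rows.map(lambda r: to_sales_pair(r, name_index, total_index))
                .reduceByKey(lambda a, b: a + b)        # total per product
                .takeOrdered(3, key=lambda kv: -kv[1])) # largest totals first
```

Negating the total in the `takeOrdered` key is a common way to get descending order, since `takeOrdered` returns the smallest elements by key.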

Q4. For each transaction type, determine the average item cost.
This question uses the Type, Order Item Product Price and Order Item Quantity fields.
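
Q4 can be read in more than one way. The sketch below assumes "average item cost" means total spend divided by total quantity for each transaction type, which avoids the mean action by carrying a (sum, count) pair through `reduceByKey`. All function and parameter names are illustrative.

```python
def to_cost_pair(row, type_index, price_index, qty_index):
    """Map a parsed row to (type, (price * quantity, quantity))."""
    qty = int(row[qty_index])
    return (row[type_index], (float(row[price_index]) * qty, qty))


def merge_totals(a, b):
    """Add two (total cost, total quantity) pairs component-wise."""
    return (a[0] + b[0], a[1] + b[1])


def average_item_cost_rdd(rows, type_index, price_index, qty_index):
    """Return a list of (type, average cost per item) pairs."""
    return (rows.map(lambda r: to_cost_pair(r, type_index, price_index, qty_index))
                .reduceByKey(merge_totals)          # totals per type
                .mapValues(lambda t: t[0] / t[1])   # cost divided by quantity
                .collect())
```

Keeping the running sum and the running count together means the division happens once per type, after all rows have been combined.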

Q5. What is the first name of the most regular customer in Puerto Rico? (Repeat transactions by the
same customer should not count as separate customers.)
This question uses the Customer Country, Customer Fname and Customer Id fields.

IMPORTANT HINTS

• Q4 will probably be easier if you don’t use the mean action.
• In Q5 you can tell the max action which field to compare, e.g. max(lambda x: x[5])
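
The second hint can be sketched as follows. This assumes one particular reading of Q5: keep Puerto Rico rows, reduce them to distinct (customer id, first name) pairs so repeat transactions collapse, count customers per first name, and let `max` compare on the count. The function names, index parameters and the toy names in the usage note are all hypothetical.

```python
def count_of(pair):
    """Key function: compare (name, count) pairs by their count."""
    return pair[1]


def most_common_first_name_rdd(rows, country_index, fname_index, id_index,
                               country="Puerto Rico"):
    """Return (first name, number of distinct customers) with the largest count."""
    return (rows.filter(lambda r: r[country_index] == country)
                .map(lambda r: (r[id_index], r[fname_index]))
                .distinct()                         # one pair per customer
                .map(lambda pair: (pair[1], 1))     # (first name, 1)
                .reduceByKey(lambda a, b: a + b)    # customers per first name
                .max(key=count_of))                 # compare on the count only
```

The RDD `max` action accepts a key function in the same way as Python's built-in `max`, so `max([("Mary", 3), ("Ann", 1)], key=count_of)` picks the pair with the larger count.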
