# 数据库辅导 | DSC 450: Database Processing for Large-Scale Analytics

DSC 450: Database Processing for Large-Scale Analytics

Take-home Midterm

Don’t forget to include all python code and SQL code in every question.

Part 1

(8 points) Please give a “True” or a “False” rating to each statement below.

Part 2

Course(CName, Department, Credits)

For each part below, write a single SQL query.

Q1.  Display the list of student IDs and names for the students who graduated in the first 3 years (e.g., if first graduation = 1898, within 3 years should include 1898, 1899, 1900. This is an example; do not use these numbers in your answer).  You can assume that GradYear is an integer.

Q2.  Display student names and their taken course names for all students with the middle name of ‘Muriel’. You may assume that name is always written as ‘First Middle Last’.  Your query output should be sorted from the highest grade to the lowest grade.

Q3.  For students who are either not enrolled in any courses or are enrolled in only 1 course, list those student’s names and graduation years.

Q4.  For each department, return the average length of a student name (you can use LEN function to get the necessary number, i.e. LEN(Name) will give you length of student name) .

Q5. Update all student records, to increase the graduation year by 1 for all students who live in Chicago

Q6. Modify the course table to add a Chair column that can contain up to 25 characters (that question requires a DDL rather than a DML SQL statement)

Part 3

R(X, Y, Z, W, A)  with  XYàZA ,  AàW

does W determine A? (WàA?) Why or why not?

S(P, Q, U, M, N)

What functional dependencies (if any) can you assume?

The table is already in First Normal Form, and its primary key is (Show, Year).

The schema for the TVShows table is:

(Show, Year, Channel, Length, Cost, Zip, City, State, Lead, Salary)

You are given the following functional dependencies:

Show à Cost

Zip à City, State

Show, Year à Channel, Cost

Given the schema R and the following functional dependencies:

R(A, B, C, D, E)  with  ABàC,  CàD

Part 4

(50 points)

Create the schema from Part 2-c in SQLite and populate it with at least 7 students, 4 courses, and 14 enrollments (at least one of the students should not be enrolled in any courses and at least one course should have zero current enrollments).

Include a screenshot of (any) one output .txt file

Include a screenshot of the output .txt file

CName à Department, Credits

(you can do so by creating a new record that repeats the course name but does not match either the department or the number of credits associated with this course)

Execute the following query in parts f, g, and h below.

Q1: For every department, display the average grade (grade should be numeric) and the number of rows.

Include a screenshot of the output of your code

Include a screenshot of the output of your code