汇编定制 | CPSC 240 Fall 2019 Assignment 6: The Harmonic Sum
CPSC 240 Fall 2019
Assignment 6: The Harmonic Sum
This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
Non-legal summary in common language: You may copy and distribute this document. You may modify this document, and distribute the modified document. You must maintain the attribution to all authors of the original document and modifications thereof. Modified documents must be distributed under the original license. This is a ‘free’ document.
This is the famous harmonic summation problem. It has been used in many places to benchmark the speed of the main processor. Here you compute the sum and the amount of time needed to compute that sum.
What you will learn and experience in this assignment
>> How to implement a loop that performs floating point arithmetic
>> How to read the tic count from the system clock
>> Recognize the difficulty of computing run-time with a CPU of variable speed
>> Output an approximate total run time.
>> Use the pair cpuid and rdstc. Two assembly instructions often used for timing and benchmarking.
Background for the application
The Harmonic series has nth partial sum: Hn = 1 +1/2+1/3+ . . . . + 1/n. It is known from mathematical studies that Hn grows arbitrarily large as n goes to infinity. We are interested in the question: Given a positive integer n what is the value of Hn? Simultaneously, we want to compute the runtime needed to make that computation.
Make an x86 program that inputs a value N, which is the number of terms in the summation, and outputs the sum, the runtime, and some intermediate values as shown in the sample execution below.
Make a driver program that calls the assembly program. As usual the driver is not part of the solution. The driver doesn’t know the actions of the called program.
Sample execution: input number = 40
Welcome to fast number crunching by programmer Jerry Dumpty.
This machine is running an AMD Phenom processor at 3.2GHz.
Please enter the number of terms to be included in the harmonic sum: 40 The clock is now 42654985129 tics and the computation will begin immediately.
With 4 terms the sum is 2.0833333333
With 8 terms the sum is 2.7178571429
With 12 terms the sum is 3.1032106782
With 16 terms the sum is 3.3807289932
With 20 terms the sum is 3.5977396571
With 24 terms the sum is 3.7759581778
With 28 terms the sum is 3.9271710390
With 32 terms the sum is 4.0584951954
With 36 terms the sum is 4.1745591968
With 40 terms the sum is 4.2785430389
Final sum is 4.2785430389
The clock is now 58754103487 and the computation is complete.
The elapsed time was 16099118358 tics. At 3.2GHZ that is 5.030974487 seconds.
This assembly program will now return the harmonic sum to the driver program.
The driver has received the number 4.2785430389
Have a nice day. Bye
Users don’t like to see nothing.
If a human person runs a program and after 3 minutes that person has seen nothing happen then he or she will quickly become bored and start pressing cntl+d, cntl+c, or escape, or other keys. It is probable that the program in this assignment will run for several minutes. Don’t let your customer become bored – just output something.
In the sample run on the previous page the program outputs an intermediate value after each 4 iterations of the loop. That is the section in green. The green part is not important for the learning experience of this program. You output anything you choose so that your customer does not become bored.
We want to find out how fast your machine can compute the sum of 1/k where K ranges from 1 to some positive integer N. The integer N is inputted by the user.
The part within the rectangle is output from the assembly module. The remainder is output from the driver.
Try your program with some simple inputs such as 4 and 5 where the expected output can be computed manually.
If you are getting mathematically correct results for small inputs like 3, 4, and 5. Then test your program with larger inputs: 40, 100, 5000, 100 million, 22 billion, or 1 trillion.
What to do next
Assembly students: I wish I could run everyone of your programs and send you personalized comments about the programs. As the semester progresses I realize that I cannot physically keep up with the huge number of submissions I am receiving from 100 students currently enrolled in 3 classes.
Here are your options
(1) Send me your program anyway because it has value. At the end of the semester when all the points (2 midterms + 9 quizzes + 2 finals) are totaled for each person if any student is on the border between two letter grades then the programs sent by that student during the semester will be consulted. If that student has submitted all or most of the assignments and they really execute correctly then he or she will receive the higher letter grade.
(2) Keep your program private. You may reference it during the final exam.
(3) During the hour of lab time following each lecture get my attention. I will go to where you are. You may demonstrate your working assignment for me. I will be very please to see the results of your programming.
Suggested due date: December 1, 2019.
Is Ubuntu your preferred distro?
This monthly magazine is geared for the Ubuntu audience, but I find that it has useful articles independent of any specific distro. Download it here:
How do you obtain the specs on your own processor?
In this assignment you need the speed of your own cpu. I suggest you use the shell command “lscpu” to get the information you need. It will even tell you if your processor uses big endian.