What is a Good PhD?

— some «common sense» and personal views

Lasse Natvig
Professor in computer architecture
Lasse@computer.org

From: How to Write and Publish a Scientific Paper, Robert A. Day 5th ed.
Presentation Overview

- What is a Good PhD?
  - Context
    - PhD theses I have supervised (6 + 3?) or evaluated (13)
  - Quality ➔ Importance of focus
    - Research group / Supervisor / PhD student
  - Reproducibility and testing (Method)
  - Quantity
    - 6 papers — «The Reidar Model»
  - From NTNU regulations
  - Surprise
From the official NTNU regulations

• From *Guidelines for the Assessment of Candidates for Norwegian Doctoral Degrees*, Section 3.2 Assessment of the thesis [NTNU12b]:

  – A Norwegian doctoral degree is awarded as proof that the candidate's research qualifications are of a certain standard

  – … the academic standard and quality of the work submitted

  – … the candidate must satisfy the *minimum requirements to qualify as a researcher* – demonstrated through requirements related to the formulation of research questions, *precision and logical stringency*, originality, a good command of current methods of analysis and be able to reflect on their possibilities and limitations.

  – … thesis must contribute new knowledge to the discipline and be of an academic standard *appropriate for publication* as part of the scientific literature in the field …

  – And more! 😊
CONTEXT AND FOCUS
What is Computer Architecture?

- Computer architecture “is a specification detailing how a set of software and hardware technology ... interact to form a computer system ... determining the needs of the user/system/technology, and creating a logical design and standards based on those requirements” [Techop]
  - + performance evaluation
  - Includes parallel processing (my personal interest for 30 years)

- Broad knowledge vs. deep knowledge

- … There is an old saying, “Architects know a little about almost everything and an engineer knows a lot about almost nothing.” [Career]
How to focus within architecture?

- Chip-Level Parallelism
- Machine-Level Parallelism
- Vast Parallelism

- Software
  - Architecture-Aware Parallel Programming
    - LN
  - Architecture-Aware System Software
    - LN
  - Models for Parallel Computation
    - GT, LN
  - Architectures for Unconventional Computing
    - GT
- Hardware
  - Multi-core Memory Systems
    - MJ, LN

Research with industrial relevance

Basic research
A PhD student must focus even more!

- JUMP to: The Illustrated Guide to a Ph.D., by Matt Might
Research Workflow

1. Identify Unsolved Problem
2. Attempt to Solve Problem
3. Implement Solution in Simulator
4. Evaluate Solution on Compute Cluster
5. Analyze Results
6. Problem solved
7. Receive PhD (get a real job)
   Try again!

From: How to Write a Computer Architecture Paper, lecture about miniproject report writing in TDT4260 comp.arch [Jahre14]
REPRODUCIBILITY
Abstraction/Models & Reproducibility

• Model of a system
  – Model the interesting parts with high accuracy
  – Model the rest of the system with sufficient accuracy

• “The Danger of Abstraction”
  – George E. P. Box:
    • “All models are wrong but some are useful”
    • “Remember that all models are wrong; the practical question is how wrong do they have to be to not be useful”

• Abstractions and simplifications
  – Even more important for small countries/groups!

• Hm…, how to get people to trust our research?
  – 100% precise documentation!
  – Reproducibility
Give “all” experimental details

<table>
<thead>
<tr>
<th rowspan="2">Parameter</th>
<th colspan="3">Crossbar Based Architecture</th>
<th colspan="3">Ring Based Architecture</th>
</tr>
<tr>
<th>4-core</th>
<th>8-core</th>
<th>16-core</th>
<th>4-core</th>
<th>8-core</th>
<th>16-core</th>
</tr>
</thead>
<tbody>
<tr>
<td>Feature Size (nm)</td>
<td>65</td>
<td>45</td>
<td>32</td>
<td>65</td>
<td>45</td>
<td>32</td>
</tr>
<tr>
<td>Shared Cache Size (MB)</td>
<td>8</td>
<td>16</td>
<td>32</td>
<td>8</td>
<td>16</td>
<td>32</td>
</tr>
<tr>
<td>Memory Bus Channels</td>
<td>1, 2 or 4</td>
<td>1, 2 or 4</td>
<td>1, 2 or 4</td>
<td>1, 2 or 4</td>
<td>1, 2 or 4</td>
<td>1, 2 or 4</td>
</tr>
<tr>
<td>Interconnect Latency (End-to-End/Per Hop)</td>
<td>8/-</td>
<td>16/-</td>
<td>30/-</td>
<td>-/4</td>
<td>-/4</td>
<td>-/8</td>
</tr>
</tbody>
</table>

Table III
CACHE PARAMETERS

<table>
<thead>
<tr>
<th>Cache</th>
<th>Size</th>
<th>Associativity</th>
<th>Access Latency (cycles)</th>
<th>Cycle Time (cycles)</th>
<th>MSHRs / WB (per bank)</th>
<th>Banks</th>
<th>Area (mm²)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Level 1 Private Cache</td>
<td>64KB</td>
<td>2</td>
<td>3/2/2</td>
<td>2</td>
<td>16</td>
<td>1</td>
<td>2.3/1.1/0.5</td>
</tr>
<tr>
<td>Level 2 Private Cache</td>
<td>1 MB</td>
<td>4</td>
<td>9/6/5</td>
<td>4/3/2</td>
<td>16</td>
<td>1</td>
<td>14.6/7/0.36</td>
</tr>
<tr>
<td>Level 2/3 Shared Cache</td>
<td>8/16/32 MB</td>
<td>16</td>
<td>16/12/12</td>
<td>4</td>
<td>16/32/64</td>
<td>4</td>
<td>94.0/91.9/84.7</td>
</tr>
</tbody>
</table>

Table IV
PROCESSOR CORE PARAMETERS

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>Clock frequency</td>
<td>4 GHz</td>
</tr>
<tr>
<td>Reorder Buffer</td>
<td>128 entries</td>
</tr>
<tr>
<td>Store Buffer</td>
<td>32 entries</td>
</tr>
<tr>
<td>Instruction Queue</td>
<td>64 instructions</td>
</tr>
<tr>
<td>Instruction Fetch Queue</td>
<td>32 entries</td>
</tr>
<tr>
<td>Load/Store Queue</td>
<td>32 instructions</td>
</tr>
<tr>
<td>Issue Width</td>
<td>4 instructions/cycle</td>
</tr>
<tr>
<td>Functional units</td>
<td>4 Integer ALUs, 2 Integer Multiply/Divide, 4 FP ALUs, 2 FP Multiply/Divide</td>
</tr>
<tr>
<td>Branch predictor</td>
<td>Hybrid, 2048 local history registers, 4-way 2048 entry BTB</td>
</tr>
</tbody>
</table>

Table V
INTERCONNECT AND DRAM INTERFACE

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>Crossbar Interconnect</td>
<td>8/16/30 cycles end-to-end transfer latency, 32 entry request queue, Pipelined (2/4/6 pipe stages)</td>
</tr>
<tr>
<td>Ring Interconnect</td>
<td>4/4/8 cycles per hop transfer latency, 1/1/2 pipe stages per hop, 32 entry request queue, 1/2/2 request rings, 1 response ring</td>
</tr>
<tr>
<td>Point to Point Link</td>
<td>4/3/2 transfer latency, 32 entry request queue</td>
</tr>
<tr>
<td>Main memory</td>
<td>DDR2-800, 4-4-4-12 timing, 64 entry read queue, 64 entry write queue, 1 KB pages, 8 banks, FR-FCFS scheduling [21], Closed page policy</td>
</tr>
</tbody>
</table>

From: A Quantitative Study of Memory System Interference in Chip Multiprocessors, Jahre et al., HPCC09
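One way to keep such details from getting lost is to store them as a machine-readable record next to the results. A minimal Python sketch, using the values from Table IV; the class name `CoreConfig`, its field names and the helper `to_json` are hypothetical and not taken from the cited paper or any simulator.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CoreConfig:
    """Processor core parameters copied from Table IV (field names are illustrative)."""
    clock_ghz: float = 4.0
    reorder_buffer_entries: int = 128
    store_buffer_entries: int = 32
    instruction_queue_entries: int = 64
    fetch_queue_entries: int = 32
    load_store_queue_entries: int = 32
    issue_width: int = 4


def to_json(config: CoreConfig) -> str:
    """Serialize with sorted keys so the records of two runs can be diffed."""
    return json.dumps(asdict(config), sort_keys=True, indent=2)


if __name__ == "__main__":
    # Print the record; in practice it would be saved beside the result files.
    print(to_json(CoreConfig()))
```

A frozen dataclass is used here so the recorded configuration cannot be changed by accident after a run has started.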
Reproducibility

Ten Simple Rules for Reproducible Computational Research, by Geir Kjetil Sandve et al. [SNTH13]

1: For Every Result, Keep Track of How It Was Produced
2: Avoid Manual Data Manipulation Steps
3: Archive the Exact Versions of All External Programs Used
4: Version Control All Custom Scripts
5: Record All Intermediate Results, When Possible in Standardized Formats
6: For Analyses That Include Randomness, Note Underlying Random Seeds
7: Always Store Raw Data behind Plots
8: Generate Hierarchical Analysis Output, Allowing Layers of Increasing Detail to Be Inspected
9: Connect Textual Statements to Underlying Results
10: Provide Public Access to Scripts, Runs, and Results

Parallel computers using random numbers might execute non-deterministically
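Rules 1, 3, 6 and 7 can be sketched in a few lines of Python. This is an illustration under assumptions, not code from [SNTH13]: the function `run_experiment` and the keys of the record it returns are hypothetical, and the "experiment" is a toy. Note that a fixed seed makes one random generator repeatable; it does not by itself remove non-determinism from thread scheduling on a parallel machine.

```python
import json
import platform
import random
import sys


def run_experiment(seed: int, samples: int = 5) -> dict:
    """Run a toy randomized experiment and return the result with its provenance."""
    rng = random.Random(seed)  # Rule 6: private generator with a recorded seed
    raw = [rng.random() for _ in range(samples)]
    return {
        "seed": seed,                      # Rule 6: note the random seed
        "raw_data": raw,                   # Rule 7: keep the raw data behind plots
        "mean": sum(raw) / len(raw),       # the derived result
        "python": sys.version.split()[0],  # Rule 3: exact interpreter version
        "platform": platform.platform(),
        "argv": sys.argv,                  # Rule 1: how the result was produced
    }


if __name__ == "__main__":
    # The record is plain JSON, so it can be archived next to the figure it supports.
    print(json.dumps(run_experiment(seed=42), indent=2))
```

Running the function twice with the same seed gives the same raw data, which makes it possible to regenerate a plot later.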
More on reproducibility

- 4th Int'l Workshop on Adaptive Self-tuning Computing Systems [ADAPT’14]
  - Two papers got the quality mark *reproducible*

- 1st ACM SIGPLAN Workshop on Reproducible Research Methodologies and New Publication Models in Computer Engineering [TRUST14]

See also:
http://ctuning.org/reproducibility
(Grigori Fursin)
More on reproducibility

- *Repeatability* in Computer Science
- Technical report (68 pages)
- http://reproducibility.cs.arizona.edu/
Testing
The importance of testing

• (Industry typically uses 50% of its workforce for testing)
  • They cannot afford low quality
• Running benchmarks in computational comp.arch.
  • Common practice has not been perfect: a run was assumed OK if the simulator did not crash

[Figure: each benchmark program B(1), B(2), ... is run with Input(B(i)) on the computer architecture in configuration X. The execution time is sampled, read out and evaluated, and Produced output(B(i)) is compared (=?) with Correct output(B(i)).]
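The check in the figure can be sketched as follows: a measurement counts only if the produced output equals the correct output. This is a minimal Python sketch under assumptions. The function name `run_and_verify` is hypothetical, the command stands in for a simulator invocation, and the host wall-clock time stands in for the simulated execution time.

```python
import subprocess
import sys
import time
from typing import Optional, Sequence


def run_and_verify(command: Sequence[str], input_text: str,
                   correct_output: str) -> Optional[float]:
    """Run one benchmark; return its run time only if the output is correct."""
    start = time.perf_counter()
    result = subprocess.run(list(command), input=input_text,
                            capture_output=True, text=True)
    elapsed = time.perf_counter() - start
    if result.returncode != 0:
        return None  # crashed or failed: no valid measurement
    if result.stdout != correct_output:
        return None  # finished, but produced the wrong answer
    return elapsed


if __name__ == "__main__":
    # Stand-in "benchmark" that upper-cases its input.
    toy = [sys.executable, "-c",
           "import sys; print(sys.stdin.read().upper(), end='')"]
    print(run_and_verify(toy, "abc", "ABC") is not None)  # correct output
    print(run_and_verify(toy, "abc", "abc") is None)      # wrong output rejected
```

Returning `None` instead of a time forces the caller to handle failed runs explicitly, so a wrong result cannot end up in a plot unnoticed.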
An Aside: the Importance of Verification

<table>
<thead>
<tr>
<th></th>
<th>401.bzip2</th>
<th>453.povray</th>
<th>462.libquantum</th>
<th>481.wrf</th>
<th></th>
<th>434.zeusmp</th>
<th>444.namd</th>
<th>459.GemsFDTD</th>
<th></th>
<th>450.soplex</th>
<th>473.astar</th>
<th></th>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td>30B OoO + vFF</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Verifies in Reference</td>
<td>Yes</td>
<td>Yes</td>
<td>No</td>
<td>No</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
</tr>
<tr>
<td>Verifies using VFF</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
</tr>
<tr>
<td>Verifies when Switching</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
</tr>
</tbody>
</table>

1. Simulator gets stuck.
2. Triggers a memory leak, causing the simulator to crash.
3. Terminates prematurely for an unknown reason.
4. Fails with internal error. Likely due to unimplemented instructions.
5. Benchmark segfaults due to unimplemented instructions.
6. Terminated by internal benchmark sanity check.
QUANTITY
When is 6 papers good enough?

• First/main author of most
  – “If the thesis consists primarily of papers, the candidate must normally be the main author or first author of at least half the papers” [NTNU12a]
• At least 2 to 4 in high-quality conferences or good journals
• All in acceptable journals, conferences or good workshops
  – IDI Relevant Conferences (357), A and B rating (can have weaknesses) [IDI-AB]
  – 1 (or maybe 2) can have the status “submitted”, if …
• Watch out!
  – There are “fake conferences” and “bogus journals” (and websites)
    • Accepting papers written by paper-automata
  – You can easily get papers published that NEVER should have been published
  – Your (and your supervisor's) responsibility
PhD as a collection of papers

- If the thesis consists of several interrelated minor pieces of work, the candidate must document the integrated nature of the work and the assessment committee must decide whether the content comprises a coherent entity. In such cases, the candidate must compile a separate part of the thesis that not only summarizes but also compares the research questions and conclusions presented in the separate pieces… [NTNU12b]

Haakon Dybdahl [Dybd07]

Magnus Jahre [Jahre10]

Figure 3.2: The research focus for the different papers.
... more examples of research process

Morten Hartmann [Hart05]

Figure 3.1: A conceptual illustration of the research process and relevant contributions

Figure 3.1: Research process and relation of papers
SURPRISE
How to supervise within a topic you do not know?

• … or know only to some extent

• Case b) Change of main supervisor (not common)

• Case a) Your own student working efficiently and independently/self-driven
  – A normal case, or ideal case
  – How well can the PhD student answer your questions?
  – Clear and precise descriptions?
  – “General attitude”
    • from “maximum quality” to … (worst case) “don’t care attitude”
Motivate your supervisor!

• Use the time with the supervisor efficiently
• Be prepared
  – Bring results, ideas, questions
• Take notes
• Give your supervisor time to prepare
• Help him/her supervise
  – Write readably
  – Use figures, visualizations
  – Use abstraction
  – Be precise and pedagogical
• You have one project, your supervisor might have 10-30 “projects”
Scientific writing, precision

• Notation/concepts
  – Often new concepts
  – Use the best/most common terminology, if it exists
  – Define your terminology precisely
  – Stick to it, be consistent!

• “help the reader”

• More (in Norwegian)
  – Lasse's enkle tips om rapportskriving (Lasse's simple tips on report writing)

[Career] What Is The Difference Between Architecture And Civil Engineering?

[Djup08] Evolving Static Hardware Redundancy for Defect Tolerant FPGAs, PhD thesis by Asbjørn Djupdal

[Dybd07] Architectural Techniques to Improve Cache Utilization, Dr.ing. thesis by Haakon Dybdahl, 2007

[Hart05] Evolution of Fault and Noise Tolerant Digital Circuits, PhD thesis by Morten Hartmann, 2005

[IDI-AB] IDI Relevant Conferences (list for travel grants, A and B rating)


[Jahre14] How to Write a Computer Architecture Paper, lecture about miniproject report writing in course TDT4260 comp.arch, given by Nico this spring

[JN10] Computational Computer Architecture Research at NTNU, ERCIM News April 2010

[NTNU12a] Regulations For The Philosophiae Doctor Degree (PhD) at NTNU, 23 January 2012.

[NTNU12b] Guidelines for the Assessment of Candidates for Norwegian Doctoral Degrees, NTNU 13 June 2012

[SHBS14] Full Speed Ahead: Detailed Architectural Simulation at Near-Native Speed, Andreas Sandberg, Erik Hagersten, and David Black-Schaffer, March 2014, Tech. report 2014-005

[SNTH13] Ten Simple Rules for Reproducible Computational Research, Geir Kjetil Sandve et al., 2013

[Techop] Computer Architecture, from Techopedia

Questions

Visit the EECS website:
http://www.ntnu.edu/ime/eecs/

Contact:
Lasse.Natvig@idi.ntnu.no

http://research.idi.ntnu.no/multicore/