EE 260 Advanced GPU Microarchitecture - Winter 2017

  • Time and Location: Tues/Thur 8:10am - 9:30am @ MSE 003
  • Instructor: Daniel Wong
    • Email:
    • Office: WCH 425
    • Office Hours: Tuesday 1:30pm - 4:00pm @ WCH 415 or by appointment
  • iLearn (for assignments):

  • Welcome to EE 260!

Class webpage and communication

The class webpage is located at

Information, resources, and announcements related to the class will be posted to the webpage.

Course Description

General-Purpose Graphics Processing Units (GPGPUs) are fast becoming the processing element of choice for various platforms, from supercomputers and cloud computing to mobile devices and self-driving vehicles. GPGPUs provide order-of-magnitude energy efficiency and performance improvements over traditional CPUs. The massive parallelism of GPGPUs enables unprecedented performance when running modern complex and large-scale workloads, such as machine learning, computer vision, and scientific computing.

This course provides an in-depth exploration of the microarchitectural challenges of designing highly efficient GPGPUs, with a focus on performance, energy efficiency, and reliability. The course will consist of paper reviews, presentations, and major projects. Readings will be drawn from major conferences in computer architecture and related areas.

Topics covered in this course include:

  • GPU parallel execution model
  • Scheduling techniques for warps and thread blocks
  • Handling warp/control divergence
  • Cache/Memory architecture
  • Energy efficiency techniques
  • Reliability
  • Simulation tools

Recommended prerequisites: CS 161 (or equivalent), CS/EE 217, or consent of the instructor

Grade Breakdown

  • Class Participation 5%
  • Intro Assignment 10%
  • Paper Reviews 20%
  • Class Presentations 30%
  • Long-term Project 35%


You are expected to attend all lectures.

Paper Review

In-class Presentation

Final Project

Criteria: The final project may be done in a group of at most 2 students.
The topic is open-ended, but must involve GPU microarchitecture and/or GPU runtime, compiler, or programming language support.
You cannot propose application development or algorithm mapping to GPUs (this is the scope of CS/EE 217).

By the end of the quarter, you must submit a conference-style report of at least 6 pages.

Project Proposal: A 1-page project proposal must be approved by the instructor and is due by Tuesday, January 24.

Project Progress: Individual group meetings will be held to discuss project progress and a detailed plan of action.

Project ideas: Implement an idea from a paper, or develop a novel warp scheduler, cache policy, etc.

Simulators, Tools, and Benchmarks

Assignment Policies

  • You have 3 slip days that you can use on any assignment, except for the final project. If you exceed your slip days, there will be a 15% penalty per late day (counting weekends).
  • No extensions will be given for the final project.
  • All assignments will be due at the beginning of class.
  • All assignments should be uploaded to iLearn.
  • Paper reviews and class presentations are done individually.
  • The final project can be done in pairs.
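
As a rough illustration of the slip-day arithmetic above, the sketch below computes a late-adjusted score. It assumes that each late day beyond the remaining slip days deducts 15% of the assignment's maximum score, and that a score cannot drop below zero. The function name, its parameters, and this reading of the policy are assumptions for illustration only, not the official grading procedure.

```python
def late_adjusted_score(raw_score, days_late, slip_days_left=3,
                        penalty_percent_per_day=15, max_score=100):
    """Return (adjusted_score, slip_days_remaining).

    Assumed interpretation: slip days are consumed first, and every
    further late day (weekends included) costs penalty_percent_per_day
    percent of max_score.
    """
    if days_late < 0 or slip_days_left < 0:
        raise ValueError("days_late and slip_days_left must be non-negative")

    # Late days covered by slip days carry no penalty.
    slip_used = min(days_late, slip_days_left)
    penalized_days = days_late - slip_used

    # Deduct the assumed per-day penalty and clamp at zero.
    penalty = max_score * penalty_percent_per_day * penalized_days / 100
    adjusted = max(0.0, raw_score - penalty)
    return adjusted, slip_days_left - slip_used
```

For example, under these assumptions an assignment scored 90 and submitted 5 days late with all 3 slip days available would use the 3 slip days and be penalized for the remaining 2 days. The final project is excluded from this calculation, since slip days and extensions do not apply to it.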

The following schedule is tentative and is subject to change.

| Date | Topic | Assignments | Papers |
| --- | --- | --- | --- |
| Jan. 17 | Intro/Overview | Assignment 1 Assigned | |
| Jan. 24 | Divergence | Project Proposal Due | [1] Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow |
| | | | [2] Thread Block Compaction for Efficient SIMT Control Flow |
| | | | [3] A Variable Warp Size Architecture |
| Jan. 31 | Warp Scheduling | Assignment 1 Due | [4] Improving GPU Performance via Large Warps and Two-Level Warp Scheduling - Yukun |
| | | | [5] Cache-Conscious Wavefront Scheduling - Shahriyar |
| | | | [6] OWL: Cooperative Thread Array Aware Scheduling Techniques for Improving GPGPU Performance - AmirAli |
| Feb. 7 | No class - HPCA | Project progress meeting 2/9 & 2/10 | |
| Feb. 14 | Multi-kernel | | [7] Simultaneous Multikernel GPU: Multi-tasking Throughput Processors via Fine-Grained Sharing - Yasin |
| | | | [8] Chimera: Collaborative Preemption for Multitasking on a Shared GPU - Hadi |
| Feb. 21 | RF / Memory | | [9] GPU Register File Virtualization - Kiran |
| | | | [10] Adaptive Cache Management for Energy-Efficient GPU Computing - Shahriyar |
| | | | [11] Mascar: Speeding up GPU Warps by Reducing Memory Pitstops - Yukun |
| Feb. 28 | Programmability | | [12] Dynamic Thread Block Launch: A Lightweight Execution Mechanism to Support Irregular Applications on GPUs - Kiran |
| | | | [13] Architectural Support for Address Translation on GPUs - Marcus |
| Mar. 7 | Energy | | [14] Warped Gates: Gating Aware Scheduling and Power Gating for GPGPUs - Amirali |
| | | | [15] Equalizer: Dynamic Tuning of GPU Resources for Efficient Execution - Yukun |
| | | | [16] Core Tunneling: Variation-Aware Voltage Noise Mitigation in GPUs - Kiran |
| Mar. 14 | Reliability & Compression | | [17] Warped-DMR: Light-weight Error Detection for GPGPU - Hadi |
| | | | [18] Warped-Compression: Enabling Power Efficient GPUs through Register Compression - Yasin |
| Mar. 21 | | Final Project Due | |

Cheating on any assignment is absolutely prohibited. The minimum penalty for a violation of the regulations will be a zero for the assignment; the maximum penalty will be failure in the course.

Here at UCR we are committed to upholding and promoting the values of the Tartan Soul: Integrity, Accountability, Excellence, and Respect. As a student in this class, it is your responsibility to act in accordance with these values by completing all assignments in the manner described, and by informing the instructor of suspected acts of academic misconduct by your peers. By doing so, you will not only affirm your own integrity, but also the integrity of the intellectual work of this University, and the degree which it represents. Should you choose to commit academic misconduct in this class, you will be held accountable according to the policies set forth by the University, and will incur appropriate consequences both in this class and from Student Conduct and Academic Integrity Programs. For more information regarding University policy and its enforcement, please visit: