

## **ODSA:** Technical Introduction

Bapi Vinnakota, Netronome. ODSA Project Workshop, June 10th, 2019

Consume. Collaborate. Contribute.



# **ODSA:** A New Server Subgroup (Incubation)

- Extending Moore's Law
  - Domain-Specific Architectures: Programmable ASICs to accelerate high-intensity workloads (e.g. TensorFlow, Network Flow Processor, Antminer...)
  - Chiplets: Build complex ASICs from multiple die, instead of as monolithic devices, to reduce development time/costs and manufacturing costs.
- Open Domain-Specific Architecture: An architecture to build domain-specific products
  - Today: All multi-chiplet products are based on proprietary interfaces
  - Tomorrow: Select best-of-breed chiplets from multiple vendors
  - Incubating a new group, to define a new open interface, build a PoC



### **Open Interface for Chiplet-Based Design**



Multiple chiplets need to function as though they are on one die



### How to Participate: Join a Workstream

Please help!

- Join the PoC, build fast (Quinn Jacobson/Jawad Nasrullah)
- Join Interface/Standards (Mark Kuemerle/Aaron Sullivan)
- Join Business, IP and workflow (Sam Fuller)

Define test and assembly workflow. Provide chiplet IP.

[Figure: workflow diagram. Recoverable labels: Integrated System on a Substrate, Complex Packaging, MCM, Complex ASIC, Develop MCM, Design, SI/PI, MSIC Function.]

Workstream contact information at the ODSA wiki





### **Domain-Specific Architectures**

### Tailor architecture to a domain\*

- Server-attached devices programmable, not hardwired
- Integrated application and deployment-aware development of devices, firmware, systems, software
- 5-10X power performance improvement
- More of a processor-to-I/O mismatch => more memory
- Each serves a smaller market



### Google TPU vs. CPU and GPU

Source: "An in-depth look at Google's first Tensor Processing Unit (TPU)," Google Cloud, May 2017


Source: Netronome, based on internal benchmarks and industry reports related to Xeon CPUs and Arria FPGAs

\*"A New Golden Age for Computer Architecture," John L. Hennessy and David A. Patterson, Communications of the ACM, February 2019, Vol. 62 No. 2, Pages 48-60



### Domain-Specific for Networking and Security



### Netronome NFP vs. CPU and FPGA

**Better Power Performance** on a cloud workload

[Figure: incremental performance/watt. Workload: 1 Port-blast100, VXLAN, 1:2 Flows:Rules. Devices compared: Intel Xeon Gold 6138, Intel Xeon Gold 6138P (Arria 10 GX 1150), Netronome NFP.]



### Monolithic vs Chiplets



Integration provides nearly all the benefits of a shrink at a fraction of the cost, because of efficient inter-chiplet interconnect

### Area, Power and Cost for Shrink vs. Integration



https://www.netronome.com/media/documents/WP\_ODSA\_Open\_Accelerator\_Architecture.pdf



## Chiplets for DSAs

| Design Function        | Value                                                                                |
|------------------------|--------------------------------------------------------------------------------------|
| IP Qualification       | Verified IP for inter-chiplet communication                                          |
| Architecture           | Leverage reference architecture.                                                     |
| Verification           | Focus investment on domain-specific logic.                                           |
| Physical               | Reuse chiplets instead of IP for 40% of the functions in a monolithic design         |
| Software               | Open source firmware and software for host-attached operation                        |
| Prototype              | Aim for reference package design with area, power budgets and pinouts for components |
| Test and Validation    | Develop workflow for chiplets                                                        |

Chiplet reuse reduces development costs. Partition large devices into smaller devices with better yield.
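The yield argument can be made concrete with a small sketch. It uses the textbook Poisson yield model, Y = exp(-A * D0), which is an assumption here and not a model taken from these slides. The defect density `D0` and the 600 mm^2 area are hypothetical numbers chosen for illustration, and the sketch ignores assembly yield, packaging cost, and the area spent on inter-chiplet interfaces.

```python
import math


def poisson_yield(area_mm2: float, defects_per_mm2: float) -> float:
    """Expected fraction of defect-free dies under the Poisson yield model."""
    return math.exp(-area_mm2 * defects_per_mm2)


def silicon_per_good_unit(total_area_mm2: float, n_chiplets: int,
                          defects_per_mm2: float) -> float:
    """Silicon area fabricated per good product.

    Assumes the design is split into n equal chiplets and that each chiplet
    is tested before assembly, so only good dies are packaged.
    """
    chiplet_area = total_area_mm2 / n_chiplets
    return n_chiplets * chiplet_area / poisson_yield(chiplet_area, defects_per_mm2)


D0 = 0.001    # hypothetical defect density, defects per mm^2
AREA = 600.0  # hypothetical total logic area, mm^2

monolithic = silicon_per_good_unit(AREA, 1, D0)
four_chiplets = silicon_per_good_unit(AREA, 4, D0)

print(f"monolithic die yield:  {poisson_yield(AREA, D0):.3f}")
print(f"150 mm^2 chiplet yield: {poisson_yield(AREA / 4, D0):.3f}")
print(f"silicon per good unit, monolithic: {monolithic:.0f} mm^2")
print(f"silicon per good unit, 4 chiplets: {four_chiplets:.0f} mm^2")
```

Under these assumptions, the partitioned design wastes less silicon per good unit because a defect scraps only one small die instead of the whole device.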






# Cross-Chiplet ODSA fabric







## **ODSA Scope**



# Reference architectures, PoCs for



### Timeline

| Milestone                        | Date     | Participation |
|----------------------------------|----------|---------------|
| ODSA Announced                   | 10/1/18  | 7 companies   |
| White Paper                      | 12/5/18  | 10 companies  |
| First Workshop @Global Foundries | 01/28/19 | 35 companies  |
| Joined OCP                       | 03/15/19 |               |
| Second Workshop @Samsung         | 03/28/19 | 53 companies  |
| Today @Intel                     | 06/10/19 | 65 companies  |

We meet weekly on Fridays: status updates, new project proposals, guest speakers. All content is on the ODSA wiki: https://www.opencompute.org/wiki/Server/ODSA

We may not have the right solution, but we likely have the right problem.





## Our Progress - How You Can Participate

| Project | Objective | Participants | Recent Results | Upcoming Milestones | Needs |
|---------|-----------|--------------|----------------|---------------------|-------|
| PHY Analysis | PHY requirements<br>PHY analysis<br>Cross-PHY abstraction | Alphawave, Aquantia, Avera Semi, Facebook, Intel, Kandou, Netronome, zGlue | PHY Analysis paper (to be published at Hot Interconnects in August) | PIPE abstraction<br>Operations, test and management | |
| BoW Interface | No technology license fee, easy to port inter-chiplet interface spec | Aquantia, Avera Semi, Netronome | New BoW Interface (to be published at Hot Interconnects in August) | Data i/f spec, Aug 2019<br>0.9 spec, Sep 2019 | Foundry support for test chips<br>Chiplet library with interface<br>Open source implementation |
| Prototype | Device that integrates existing die from multiple companies into one package | Achronix, Cisco, Netronome, NXP, Samtec, Sarcina, zGlue | Decomposable design flow | Committed schedule | End users<br>End user participation<br>~30% funding is open |
| Chiplet Design eXchange | Open chiplet physical description format starting with zGlue format. Information normally confidential. | Ayar, NXP, zGlue | Open chiplet survey | ZEF Exchange format draft specification | EDA participation |
| Inter-chiplet Link Layer | Interface and implementations: requirements and definition | Achronix, Avera Semi, Intel, Netronome, more needed | | | |

Listing participation does not imply official endorsement by employer.

### **Projects Requested**

| Project | Objective | Participants | Recent Results | Upcoming Milestones |
|---------|-----------|--------------|----------------|---------------------|
| Cross-chiplet network layer, fabric agents | Scalable network layer. Netronome offers a starting point | Netronome | | |
| Chiplet design flow | Chip/chiplet open design flow to integrate across companies | | | |
| Reference Architectures | I/O, Compute, Memory, functional partition for SmartNIC, Inferencing, Storage, Learning, Image/Video | | | |
| Chiplet proposals | Proposals for chiplets for common functions: I/O, CPU, Memory | | | |
| Business workflow | Leverage learnings from prototype effort | | | |

