Watts The Deal With Power? Part I

How We Implement Power Benchmarking In The Billion-Scale Approximate Nearest Neighbor Search Challenge

We announced in May that NeurIPS 2021 will host a unique data and algorithm challenge in billion-scale approximate nearest neighbor search (ANNS). Participating teams will be evaluated across a set of challenging datasets, each with a billion records. We employ search accuracy (measured as recall vs. throughput) as the defining ranking metric for the T1 and T2 competition tracks, which limit the size of RAM to 64GB within a standard server-grade system hosted in Microsoft’s Azure Cloud. However, in the T3 track, we don’t enforce the same hardware restrictions. T3 will also add two additional leaderboards: one that ranks participants by power usage and one related to hardware cost. In this blog, we will discuss the power leaderboard and get into the details of how we will collect and compute the power benchmarks.
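Recall here is the standard ANNS accuracy measure: the fraction of the true nearest neighbors that an algorithm actually returns. A minimal sketch of the computation (illustrative only, not the competition's official evaluation code):

```python
def recall_at_k(retrieved, ground_truth, k):
    """Fraction of the true top-k neighbors present in the top-k retrieved IDs."""
    return len(set(retrieved[:k]) & set(ground_truth[:k])) / k

# Example: the true neighbors are [7, 2, 9, 4]; the algorithm returned [7, 9, 1, 4].
# It found 3 of the 4 true neighbors.
print(recall_at_k([7, 9, 1, 4], [7, 2, 9, 4], k=4))  # 0.75
```

Ranking by recall at a fixed throughput (or vice versa) is what lets very different implementations be compared on equal footing.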

Let’s first ask ourselves why we even care about power consumption. For most of us, our direct relationship with a machine’s power consumption becomes most apparent when our laptops warn us the battery power is low. Gamers understand the need to purchase additional external power supplies required for the higher-end GPU cards when building a top-notch PC gaming workstation. Crypto miners who run their own on-prem hardware know all too well the importance of acquiring power efficient hardware to lower their power bill. For those of us who leverage the public cloud for day-to-day work, we are typically far removed (both literally and figuratively) from the power consumption of the cloud services we use and the underlying machines that run them (other than how it might be abstractly factored into the service’s cost).

All that said, awareness surrounding the growing power demands of data centers has increased significantly over the past few years. We all generally agree that indeed “software is eating the world,” and implicitly we expect more and more machines are required to power this software-as-a-service, industrial revolution. It should then come as no surprise to learn some otherwise shocking facts about the growing demand for power at data centers:

This will likely increase further as workloads become more data-intensive and AI-centric. In one highly recognized paper, researchers measured the cost of training a 200 million parameter transformer-based NLP neural network optimized with neural architecture search [3]. They found that model training and model optimization consumed enough energy to equate to 626K pounds of CO2 emissions. To put that into perspective: round trip air travel between SF and NY produces 2K pounds of CO2 emissions, and the average car produces 126K pounds of CO2 emissions in its lifetime. And that was reported in 2019. The latest NLP transformer models in 2021 are well into the billions of parameters.
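The comparisons above reduce to simple ratios; a quick back-of-the-envelope check of the numbers reported in [3]:

```python
# Figures as cited above, all in pounds of CO2 emissions.
training_co2 = 626_000            # NAS-optimized transformer training [3]
flight_sf_ny_round_trip = 2_000   # one SF <-> NY round trip
car_lifetime = 126_000            # average car, over its lifetime

# One training run ~ 313 round-trip flights, or ~5 car lifetimes.
print(training_co2 / flight_sf_ny_round_trip)  # 313.0
print(round(training_co2 / car_lifetime, 1))   # 5.0
```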

Fortunately, the major public cloud providers (Microsoft’s Azure, Amazon’s AWS, and Google’s GCP) have all already started major efforts to improve the power efficiency and to offset the carbon footprint of their data centers:

Companies like Apple, Facebook, and Uber with similarly large-scale but private compute server footprints also have “green data center” initiatives underway. Industry wide collaborations like the Open Compute Project promote openly sharing ideas, specifications, and other intellectual property to maximize innovation in this space [7].

Clearly, new hardware and chipsets specifically designed for power efficiency will continue to be a major component of the design of future data centers. But software also has an equally important role to play. For example, compilers will need to be hardware-aware and will need to be able to produce compiled code that can leverage power efficient hardware features when they are available. Software engineers, data scientists, and algorithm developers must also be aware of their coding decisions and how those decisions can affect not just the typical metrics such as speed and accuracy, but also power consumption.

Creating this awareness was one of the goals of the T3 challenge, and that is why we are maintaining a separate leaderboard that ranks participants based on the measured power consumption of their algorithm. Before we get into those details, let’s take a step back and review some of the basic science around power and power consumption.

For many of us, our first encounter with the science of power occurred during our first lessons in physics and chemistry. We are taught very early that the classical notion of “energy” is conserved in any closed system, constantly being converted to and from its kinetic and potential forms. The symmetry of the energy conservation law of the universe is not only intuitively appealing, but is a fundamental underpinning of all the physical sciences and applied engineering fields, from astrophysics to biochemistry to civil engineering.

Often the concepts of energy and power are used synonymously, but there is a subtle and important difference. A good example that demonstrates the difference: imagine lifting a box from the ground. It takes the same amount of energy to lift the box no matter how fast you lift it. Power is another matter. Power measures how fast energy is consumed or transferred; it is a function of energy and time. To lift the box faster, more power is required.

Power is usually reported in watts. To convert from power to energy over a certain period of time, multiply the power measured over that interval by the length of the interval (assuming the power was constant throughout). That is why you will typically see power consumption metrics like kilowatt-seconds or kilowatt-hours: those quantities report the total energy used over a time period.
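Both points can be made concrete with a few lines of arithmetic (the masses and times below are illustrative):

```python
# Energy to lift a 10 kg box 1 m: E = m * g * h (joules).
# This is the same regardless of how fast the lift happens.
m, g, h = 10.0, 9.81, 1.0
energy_j = m * g * h           # 98.1 J

# Power is energy per unit time: P = E / t.
power_slow = energy_j / 2.0    # lifted in 2 s   -> ~49 W
power_fast = energy_j / 0.5    # lifted in 0.5 s -> ~196 W (4x the power)

# Conversely, constant power over an interval gives energy: E = P * t.
# A 500 W server running for one hour:
kwh = 500 * 3600 / 3.6e6       # joules / (3.6e6 J per kWh) = 0.5 kWh
```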

If you’ve seen a picture of a modern datacenter, you’ve likely noticed rows and rows of cabinets side-by-side. If you looked inside each cabinet (often called a rack) you would see multiple modules stacked vertically on top of each other. Typically each module is a rectangular chassis full of electronics, such as a motherboard with a CPU or two, add-on PCIe boards, network cards, hard drives, power supplies, and other electronics. The size of a module chassis is typically measured in integral units of height, starting with 1U (one unit). Typical sizes are 1U, 2U, and 3U. 4U module chassis are becoming more popular; for example, NVidia’s DGX system needs a 4U form factor to house all 8 of its GPU boards.

A modern chassis is more than just a metal box that houses the electronics. The chassis itself contains some lightweight, dedicated electronics that serve very specific purposes. The primary purpose is to support remote power management: most chassis host a small web server exposing a simple web app that lets you remotely power the system down and up. Some support KVM, a technology that allows you to remotely view the video output of the system and also control the mouse and keyboard input. These are tremendously useful tools if you need to manage a system remotely. Sometimes “turning it off and turning it back on” is the only resolution to a problem at a server in a datacenter!

These remote control capabilities are so useful that the server and datacenter industry has come up with a standard called IPMI (Intelligent Platform Management Interface). Chassis manufacturers that support the IPMI standard benefit from all of the existing tools and services that data center engineers already use to manage their fleets of systems.

In addition to remote management, IPMI supports the notion of sensors. There is a broad array of sensors that a chassis manufacturer can build into its systems, and this typically includes power monitoring.
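On an IPMI-capable chassis, power sensors typically appear in the pipe-separated output of `ipmitool sensor`. The sketch below parses one such line; the sensor name and the sample reading are made-up, and the exact columns vary by manufacturer, so treat the format here as an assumption:

```python
# Hypothetical line in the pipe-separated format produced by `ipmitool sensor`
# (name | reading | units | status | thresholds...).
sample = "PSU1 Power In | 152.000 | Watts | ok | na | na | na | na | na | na"

def parse_power_watts(line):
    """Return (sensor_name, watts) if the line is a power sensor, else None."""
    fields = [f.strip() for f in line.split("|")]
    name, reading, units = fields[0], fields[1], fields[2]
    if units.lower() != "watts":
        return None
    return name, float(reading)

print(parse_power_watts(sample))  # ('PSU1 Power In', 152.0)
```

Polling this reading periodically over the network is what makes chassis-level power monitoring practical at fleet scale.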

The image below shows a listing of the sensors available for a chassis we use from Advantech. This particular model is a 2U chassis housing a 2-CPU/56-core Intel Xeon chipset [8].

Sensors available for monitoring in Advantech’s Sky 6200 2U Chassis.

We leverage these IPMI power sensors to assess the total power consumption of the participant algorithms in the T3 track of the NeurIPS Billion-Scale Approximate Nearest Neighbor Search Challenge.
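In essence, the benchmark reduces to sampling the chassis power sensor while an algorithm runs and integrating those samples over time to get total energy. A minimal sketch of that computation (the sampling interval and readings below are made up for illustration; they are not competition data):

```python
def energy_kwh(samples_watts, interval_s):
    """Total energy (kWh) from periodic power samples, assuming each
    reading holds for one full sampling interval."""
    joules = sum(samples_watts) * interval_s
    return joules / 3.6e6  # 1 kWh = 3.6e6 J

# e.g. power sampled every 10 s over a toy one-minute run
readings = [310.0, 355.0, 342.0, 361.0, 349.0, 318.0]
print(round(energy_kwh(readings, 10.0), 4))  # 0.0057
```

Summing equally weighted samples like this is a simple rectangle-rule integration; shorter sampling intervals give a better approximation of the true energy draw.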

Definition of the workloads A-F:

In Part 2 of this blog, we’ll get into the following details:
