"It takes a lot of knowledge to know what one does not know"
😎Tweets on things related to High Performance Computing -- systems, interconnects, storage, 🥭 ...
To pay tribute to #Frontier, the fastest publicized #supercomputer in the world today and the first official #Exascale system, I have changed my Twitter header to a picture of @ORNL
It used to be a picture of Mount Fuji (for Fugaku) before
#HPC #AI
Congratulations, @LisaSu, @AMD CEO, on being voted the Readers Choice for "Outstanding Leadership in #HPC" in @HPCwire 2021 Readers’ and Editors’ Choice Awards During #SC21
DGX GH200: Nvidia ties 256 Grace-Hopper Superchips by 36 NVLink Switches to provide >1 EF FP8 (or ~9PF of FP64)
o 144TB unified memory
o 900 GB/s GPU-to-GPU bandwidth
o 128 TB/s bisection bandwidth
o End-of-year availability
#HPC #AI via @HPCwire
Great loss, especially for the #HPC community
The “guy in the red hat”, Rich Brueckner, passed away last Wednesday
Rich owned @insideHPC & @insideBigData
One Giant Superchip for LLMs, Recommenders, and GNNs: Introducing NVIDIA #GH200 NVL32
Rack-scale solution within @NVIDIA DGX Cloud or an @awscloud instance, boasts a 32-GPU NVIDIA NVLink domain and a massive 19.5 TB of unified memory
#AI #HPC #GenAI
Congratulations to @AMDServer Rome on being voted the #HPC Processor of the Year 2020!
And to @FujitsuHPC A64FX on coming in a strong 2nd.
Thanks to all 363 of you who voted.
Intel CEO laments @Nvidia's 'extraordinarily lucky' #AI dominance, claims it coulda-woulda-shoulda have been @Intel
Things would be completely different if only Intel hadn't cancelled the #Larrabee #GPU
When I created this Twitter account 13 years, 2 months and 9 days ago, I never imagined I would have even 1,000 followers
So reaching 20,000 followers is a pleasant surprise!
Thanks to all who patiently bear my idiosyncrasies, lots of tweets about #HPC, and a few on mangoes🥭
Nvidia 1000x in 10 years:
o Better Number Representation: 16x
o Complex Instructions: 12.5x
o Moore’s Law: 2.5x
(Hint: 2.5x over 10 years should tell us something!)
o Sparsity: 2x
o Model Efficiency
#HPC #AI #GPU #HotChips2023
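The factors in the list above can be multiplied out as a quick check. This is only a sketch: the variable names are made up for illustration, "Model Efficiency" is left out because the tweet gives no number for it, and the per-year figure assumes the 2.5x process gain was spread evenly over the 10 years.

```python
import math

# Factors exactly as listed in the tweet. "Model Efficiency" carries no
# number there, so it is not part of this product (assumption).
factors = {
    "number_representation": 16.0,
    "complex_instructions": 12.5,
    "moores_law": 2.5,
    "sparsity": 2.0,
}

# Combined speed-up from the four quantified factors.
total_speedup = math.prod(factors.values())

# Average yearly gain implied by 2.5x over 10 years, assuming even compounding.
annual_process_gain = factors["moores_law"] ** (1 / 10)

# For comparison: a doubling every two years would compound to 2**5 over 10 years.
doubling_every_two_years = 2.0 ** (10 / 2)

print(f"total speed-up: {total_speedup:.0f}x")
print(f"process gain per year: {annual_process_gain:.3f}x")
print(f"doubling every two years over a decade: {doubling_every_two_years:.0f}x")
```

Under those assumptions the four listed factors account for the headline 1000x on their own, and the process contribution works out to under 10% per year, far below a doubling every two years.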
Intel Publishes Blazing Fast AVX-512 Sorting Library
The code has been merged to NumPy and is providing some 10~17x speed-ups
On an Intel Tigerlake system this sped up 16-bit int sorting by 17x
#HPC #AI via @HPCZorf @phoronix
TSMC plans chips with 200 billion transistors on a single piece of silicon on course to delivering chip packages with one trillion transistors
1nm-class #A10 fabrication processes by 2030
#HPC #AI via @tomshardware
“Our (@nvidia) entire data center family of products — #H100, Grace CPU, Grace Hopper Superchip, NVLink, Quantum 400 InfiniBand and BlueField-3 DPU — is in production
We are significantly increasing our supply to meet surging demand for them”
#HPC #AI
Nvidia: At a recent launch event, @AMD talked about the inference performance of the #H100 #GPU compared to that of its #MI300X
The results shared did not use optimized software, and the H100, if benchmarked properly, is 2x faster
#LLM #GenAI
Samsung is advocating that performing computation in the memory die is faster and more efficient than taking the data back to a CPU, GPU, or accelerator
Samsung HBM2-PIM and Aquabolt-XL at #Hotchips2021
#AI #HPC via @ServeTheHome
Gordon Moore, Intel Co-Founder, Dies at 94
He famously forecast in 1965 that the number of transistors on an integrated circuit would double every year – a prediction that later came to be known as Moore’s Law
#HPC
.@Intel Scores a Huge #Gaudi2 Win in NVIDIA #MLPerf Training v3.1
NVIDIA’s marketing slide shows Intel delivers up to 4x better performance per dollar than NVIDIA (NVIDIA: ~8x the performance for ~32x the cost)
#AI #ML via @ServeTheHome
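The performance-per-dollar claim follows from the two ratios in the parenthetical. A rough sketch, assuming both ratios describe NVIDIA relative to the Intel Gaudi2 submission (the tweet does not spell this out) and using illustrative variable names:

```python
# Ratios read off the tweet, taken here as NVIDIA relative to an Intel
# baseline of 1.0 for both performance and cost (assumption).
nvidia_relative_performance = 8.0
nvidia_relative_cost = 32.0

# Performance per dollar of NVIDIA relative to Intel.
nvidia_perf_per_dollar = nvidia_relative_performance / nvidia_relative_cost

# Intel's advantage is the inverse of that ratio.
intel_perf_per_dollar_advantage = 1.0 / nvidia_perf_per_dollar

print(f"NVIDIA perf per dollar vs Intel: {nvidia_perf_per_dollar:.2f}x")
print(f"Intel perf per dollar advantage: {intel_perf_per_dollar_advantage:.0f}x")
```

Both figures are approximate in the source ("~8x", "~32x"), so the resulting 4x should be read as an upper-range estimate, in line with the "up to" wording.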
Friday, March 9, will mark 10 years of @HPC_Guru on Twitter.
After more than 33000 tweets, mainly on #HPC, I wonder if it is time to call it a day.
Also need to decide whether I should reveal my identity or keep it a mystery.
Looking for NVIDIA alternatives given the long lead times for A100/H100 GPUs?
Rumor: @Intel is trying to build demand for #Gaudi2 by placing a large (>1000 nodes) cluster in the Intel Developer Cloud
#AI #HPC via @ServeTheHome
At #GTC23, @nvidia showed the Grace CPU Superchip for the first time to the public
Claims of 1.2x performance and 1.7x energy efficiency improvements over latest x86 CPUs
At iso-power, Grace CPU Superchip gives CSPs 2x the growth opportunity
#HPC #AI #ARM
This is Fran Allen, the first woman to receive the ACM Turing award.
Her first assignment @IBM was to teach research scientists within IBM how to use #Fortran.
She went on to work on compilers, including those for parallel systems.
#HPC
👇Intel PVC Max 1550 outperformed NVIDIA and AMD offerings on the depleted fuel problems that were tested
o 2.3x faster than the A100
o 1.2x faster than the GH200
o 1.8x faster than the AMD MI250X
However, no one is clamoring for a Ponte Vecchio
#HPC via @Underfox3
In this paper is presented the first performance evaluation of a scientific simulation application at scale on a supercomputer featuring Intel Max series (Ponte Vecchio) GPUs.
To pay tribute to #Fugaku, the fastest publicized #supercomputer in the world today, I have changed my Twitter header to a picture of Mount Fuji.
#HPC #AI
ICYMI: Partial differential equations are kind of magical - and notoriously hard to solve.
Researchers @Caltech introduced a new #DeepLearning technique for solving PDEs that is dramatically more accurate
#AI #HPC via @techreview
"MI300X using vLLM vs H100 using Nvidia's optimized TensorRT-LLM. Even when using TensorRT-LLM for H100 as our competitor outlined, and vLLM for MI300X, we still show a 1.3x improvement in latency."
@HPC_Guru
👀
122 years of #MooresLaw which was predicted by Gordon Moore in 1965 🤔
The number of people who incorrectly interpret Moore’s Law doubles every 18 months
#HPC #AI
Tesla now holds the mantle of Moore’s Law.
Just as NVIDIA took leadership from Intel a decade ago.
While the substrate has shifted several times, humanity's capacity to compute has compounded for 122 years as if on rails.
It's surreal. Log scale details:
.@AMD's Zen 5 could be integer performance champ, according to @tenstorrent CEO Jim Keller
@jimkxa predicts that AMD's Zen 5 will be 30% faster than the current-gen Zen 4 in integer workloads
#HPC #AI via @tomshardware
This is really impressive - an Apple M3 Max with a 16-core CPU and a 40-core GPU has a thermal design power (TDP) of 78 watts
Delivers 774 GF DGEMM
We have come a long way!
Congratulations @Arm, @Apple!
#HPC via @FelixCLC_ @GabrielBaraldi3
Folks, the *cut down* M3 Max (with 300GB/s!!) MBP is getting darn near 800 GFLOPS of DGEMM.
let me repeat: a laptop *CPU* is getting nearly 800 GIGA Double Precision Floating point Operations Per Second.
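For a sense of scale, the two figures quoted above (774 GF DGEMM and a 78 W TDP) can be combined into a rough efficiency number. This is a back-of-the-envelope sketch only: it assumes the whole-chip TDP is the power drawn during the DGEMM run, which the tweets do not state, and the variable names are illustrative.

```python
# Figures quoted in the tweets above.
dgemm_gflops = 774.0  # double-precision matrix multiply throughput
tdp_watts = 78.0      # thermal design power of the whole chip

# Assumption: the DGEMM run draws the full TDP. Actual power during the
# benchmark is not given, so real efficiency may be higher.
gflops_per_watt = dgemm_gflops / tdp_watts

print(f"{gflops_per_watt:.1f} GFLOPS/W (FP64, at full TDP)")
```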
After CUDA, there’s QODA
NVIDIA Quantum-Optimized Device Architecture (QODA) is a platform for hybrid quantum-classical computers, enabling integration and programming of quantum processing units (QPUs), GPUs, & CPUs in one system
#HPC #QuantumComputing
Steve Scott, previously at Cray and Microsoft, is now Corporate Fellow for Network and Systems Architecture @AMD
Congratulations, Steve and AMD!
#HPC #AI via @tdih_hpc
According to Citi's price projections for @AMD's #MI300 #AI accelerators, @Nvidia currently charges up to four times more for its competing #H100 GPUs
AMD's chips cost $10K to $15K apiece
Nvidia's H100 has peaked beyond $40K
#HPC #GPU via @tomshardware
"There is no such thing as
#AGI
!"
Years/decades to get to human-level #AI
Recommended watching for this weekend: @ylecun delivers the Lytle Lecture @UW on Jan 24, 2024
"Objective-Driven AI: Towards Machines that can Learn, Reason, and Plan"
#MPI Application Binary Interface Standardization
MPI is the most widely used interface for #HPC workloads
A new level of MPI compatibility: a standard #ABI is proposed in this paper
via @science_dot
.@LightmatterCo's Passage is a silicon photonics interposer designed to support high speed chip-to-chip and node-to-node communications
Claim: “all of the GPUs that are designed for #AI training and inference or #HPC are going to be built on Passage”
.@SKhynix & @Nvidia working on a #GPU redesign that 3D-stacks #HBM4 directly on top of the processing cores, eliminating interposers
This resembles AMD's 3D V-Cache, which is placed directly on CPU dies, but HBM will be bigger capacity & cheaper
#HPC
“My belief is in the next 5 to 10 years, RISC-V will take over all the data centers”
Jim Keller interview with @sallywf @eetimes
Jim added that this is especially true for scientific computing and #HPC
He said #supercomputing could dominate even faster
@risc_v #AI
You've been kidnapped. Your kidnappers allow you to keep tweeting to pretend everything is alright. What would you tweet that would alarm your followers without the kidnappers knowing you're asking for help?
"And then I put in the exact amount of garlic the recipe called for."
Steadily chipping away at Intel’s lead in the Data Center CPU market
Analysts from research firm BRAID suggest @AMD is on the path of achieving up to 30% market share by the end of 2023 and will eventually get close to 40% by the end of 2025
#HPC #AI
The Alps #supercomputer @cscsch will contain ~5,000 Grace Hopper modules, with four modules per node, and use Slingshot networking
@Nvidia says that Alps will be the first system to deploy the Grace Hopper Superchip
#HPC #AI via @HPCwire
When transistors can’t get any smaller, the only direction is up
@intel researchers on "3D-Stacked CMOS Takes Moore’s Law to New Heights"
#HPC #AI via @IEEESpectrum
Welcome back Fortran: This dinosaur is back in the top 20 after more than 10 years
#Fortran was the first commercial programming language ever, and is gaining popularity thanks to the massive need for (scientific) number crunching
#HPC via @fortranlang
Apple #M3 Family of CPUs Launched with Some Shaky Performance Claims
Does @Apple think it can just say up to x% faster without giving any details?
@Patrick1Kennedy says FTC needs to get involved in Apple’s performance claims
#ARM via @ServeTheHome