470 points
Discussion [D] GPT-3, The $4,600,000 Language Model (self.MachineLearning)
submitted 5 years, 7 months ago* (edited 18 hours, 40 minutes after) by mippie_moe to /r/MachineLearning
217 comments

OpenAI’s GPT-3 Language Model Explained

Some int...


[–] [deleted] · 2 points · 5 years, 7 months ago

While it would likely be enormously cost-prohibitive, AWS does offer some "private" tiers.

For example, the u-12tb1.metal instance type has 12 TB of RAM and 448 CPU cores. While this one is aimed at in-memory DBs, they do have some other huge cluster offerings.
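As a rough sanity check on whether such an instance could even hold the model in memory, here is a back-of-envelope sketch. The 175B parameter count comes from the thread; the 4-bytes-per-parameter (fp32) and 2-bytes-per-parameter (fp16) sizes are assumptions, and this ignores activations, gradients, and framework overhead:

```python
# Back-of-envelope: would GPT-3's weights fit in a u-12tb1.metal's 12 TB of RAM?
# Assumes 4 bytes/param (fp32) or 2 bytes/param (fp16); ignores activations,
# gradients, and framework overhead, so real usage would be higher.

def weight_tb(n_params: float, bytes_per_param: int) -> float:
    """Approximate weight storage in terabytes (1 TB = 1e12 bytes)."""
    return n_params * bytes_per_param / 1e12

gpt3_params = 175e9
print(weight_tb(gpt3_params, 4))  # fp32: 0.7 TB, fits easily in 12 TB
print(weight_tb(gpt3_params, 2))  # fp16: 0.35 TB
```

So on this estimate RAM capacity is not the bottleneck for inference on such a machine; CPU throughput would be.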

[–] AxeLond · 2 points · 5 years, 7 months ago

I don't think many people will be running the 175B-parameter model anywhere; even OpenAI is probably hurting a bit after training it. They also published smaller models, which I think would be enough: the 13B-parameter model is still roughly 10x the size of the largest GPT-2. Humans were only 52% accurate at identifying fake articles written by the 175B model, which is pretty much a 50/50 guess, but even for the 13B model people were only 55% accurate.

The 13B model you could probably run reasonably well on a single Tesla A100 with 40 GB of VRAM.
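The arithmetic behind that claim can be sketched as follows (a rough estimate assuming fp16 weights at 2 bytes per parameter, ignoring activations and other runtime overhead, so real usage would be somewhat higher):

```python
# Rough VRAM check for inference: weights only, stored in fp16 (2 bytes/param).
# Ignores activations and runtime overhead, so actual usage is higher.

def fp16_weight_gb(n_params: float) -> float:
    """Approximate fp16 weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * 2 / 1e9

print(fp16_weight_gb(13e9))   # 26.0 GB, fits on a 40 GB A100
print(fp16_weight_gb(175e9))  # 350.0 GB, far beyond any single GPU
```

By this estimate the 13B model squeezes onto one A100 with some headroom, while the 175B model needs its weights split across many GPUs.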

But technology advancements will make these things more accessible as well. Nvidia's NVSwitch solution is incredibly niche and expensive, since it requires building a board that wires every GPU to every other GPU in the server.

AMD, with 3rd-gen Infinity Fabric, will try to build that into the CPU + GPU. Nvidia was limited to PCIe 3.0, and it wasn't fast enough. With Zen 3 or 4, AMD is moving to PCIe 5.0, which can do 63 GB/s compared to 16 GB/s for gen 3. They will be using this to interconnect 8 GPUs and an EPYC processor in El Capitan, the 2-exaflop supercomputer, with full GPU resource sharing. The NVSwitch has a port bandwidth of 50 GB/s, so in a few years an off-the-shelf server will be able to do this stuff instead of needing a super-niche product.
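Putting the bandwidth figures quoted above side by side (approximate GB/s; the exact values are as stated in the comment, not independently measured here):

```python
# Link-bandwidth comparison using the approximate figures quoted above (GB/s).
links = {
    "PCIe 3.0 x16": 16,
    "PCIe 5.0 x16": 63,
    "NVSwitch port": 50,
}

# PCIe 5.0 roughly quadruples gen-3 bandwidth and exceeds a single NVSwitch port.
speedup_over_gen3 = links["PCIe 5.0 x16"] / links["PCIe 3.0 x16"]
print(round(speedup_over_gen3, 2))  # 3.94
```

That near-4x jump over gen 3 is what makes a commodity PCIe 5.0 server competitive with a dedicated NVSwitch fabric on per-link bandwidth.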

https://en.wikichip.org/wiki/nvidia/nvswitch

This thing is absolutely ridiculous: it's a 100 W linking cable.

In 2022, AMD servers will be able to do this without specialized hardware:

https://www.anandtech.com/show/15596/amd-moves-from-infinity-fabric-to-infinity-architecture-connecting-everything-to-everything

That's when models of this size can start to become common.

[–] [deleted] · 2 points · 5 years, 7 months ago

Thanks for sharing the specifics on this. Very exciting stuff!
