An introduction to NPU hardware and its growing presence outside of mobile computing devices.
Local alternatives to Cloud AI services
Presenting local AI-powered software options for tasks such as image & text generation, automatic speech recognition, and frame interpolation.
LLM Server Setup Part 2 — Container Tools
This post is Part 2 in a series on how to configure a system for LLM deployments and development usage. Part 2 is about installing and configuring container tools, Docker and NVIDIA Enroot.
LLM Server Setup Part 1 – Base OS
This post is Part 1 in a series on how to configure a system for LLM deployments and development usage. The configuration will be suitable for multi-user deployments and also useful for smaller development systems. Part 1 is about the base Linux server setup.
Can You Run A State-Of-The-Art LLM On-Prem For A Reasonable Cost?
In this post I address the question that’s been on everyone’s mind: Can you run a state-of-the-art Large Language Model on-prem? With *your* data and *your* hardware? At a reasonable cost?
UPDATE v0.2 NVIDIA GPU Powerlimit Setup
This is just a short post to announce a more usable version of the NVIDIA GPU powerlimit setup script that I released a few months ago. This update to version 0.2 uses an interactive mode to set GPU powerlimits and optionally set up a systemd unit file that reapplies these limits on subsequent reboots.
NVIDIA GPU Power Limit vs Performance
This post presents testing data showing that power-limit reduction on NVIDIA GPUs can give significant benefits for both high-wattage and lower-wattage GPUs. Power-limit vs performance data is presented for 1-4 A5000 and 1-4 RTX3090 GPUs.
NVIDIA GPU Powerlimit Systemd Setup Script
In this post I am referencing a Bash shell script I recently put together for setting up automatic NVIDIA GPU power-limit lowering at system boot. This allows a reliable way to configure and maintain multi-GPU systems for stable operation under heavy load.
RTX 2080Ti with NVLINK – TensorFlow Performance (Includes Comparison with GTX 1080Ti, RTX 2070, 2080, 2080Ti and Titan V)
More Machine Learning testing with TensorFlow on the NVIDIA RTX GPUs. This post adds dual RTX 2080 Ti with NVLINK and the RTX 2070 to the other testing I’ve recently done. Performance in TensorFlow with 2 RTX 2080 Ti GPUs is very good! Also, the NVLINK bridge with 2 RTX 2080 Ti GPUs gives a bidirectional bandwidth of nearly 100 GB/sec!
NVLINK on RTX 2080 TensorFlow and Peer-to-Peer Performance with Linux
NVLINK is one of the more interesting features of NVIDIA’s new RTX GPUs. In this post I’ll take a look at the performance of NVLINK between 2 RTX 2080 GPUs, along with a comparison against the single-GPU testing I’ve recently done. The testing will be a simple look at the raw peer-to-peer data transfer performance and a couple of TensorFlow job runs with and without NVLINK.