GPU Acceleration with the C++ Standard Libray
This post is based on this self-paced course offered by NVIDIA’s Deep Learning Institute. NVIDIA’s stdpar lets you offload standard C++20 algorithms onto the GPU without CUDA kernels or new syntax. To show how this works in practice, we’ll use DAXPY, which is a simple but memory-intensive linear algebra operation, and see how a few small code changes take it from a single-threaded CPU loop to full GPU execution. DAXPY as a Bandwidth Benchmark DAXPY stands for Double-precision AX Plus Y, and it computes the following equation. ...
Model Context Protocol
Model Context Protocol (MCP) is an open source standard for LLMs to be interacting with applications. MCP is originally introduced by Anthropic in this post. With MCP, LLMs now can have assess to custom tools or data source to provide a more intelligent response. In addition, MCP also allows developers to build agents capable of more tasks, such as searching the internet or exploring dataset. MCP essentially standardizes LLM applications to work with many toolings. A nice quote by an Anthropic developer, “the models are only as good as the context provided to them”. ...