Kimang Khun

My YouTube Videos

Description

This channel focuses on programming in Python, machine learning, reinforcement learning, artificial intelligence, speech processing, and language models. It also covers using Tmux, Neovim, and its plugins to increase coding productivity.

You can find my YouTube channel here: https://www.youtube.com/channel/UCHpwDjAtW1AZrexwBgViHmw

Support My Work

While this work comes truly from the heart, each video represents a significant investment of time – from deep-dive research and code preparation to the final narrative and editing process. I am incredibly passionate about sharing this knowledge, but maintaining this level of quality is a major undertaking. If you find these videos helpful and are in a position to do so, please consider supporting my work with a donation. You can click here to donate or scan the QR code below. Your generosity acts as a huge encouragement and helps ensure that I can continue creating in-depth, valuable content for you.

If you have a Cambodian bank account, you can donate by scanning my ABA QR code here or by clicking here. Make sure that the receiver's name is 'Khun Kim Ang'.

Coding Transformer Decoder Block from Scratch

In this video, the attention mechanism in Transformers is explained and implemented in PyTorch. The implemented decoder block is tested in a GPT2-like LLM for correctness.
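As a rough illustration of the causal attention used in a decoder block, here is a minimal sketch. It is not the code from the video: it uses plain Python lists instead of PyTorch so that it runs without dependencies, it covers a single head with no learned projections, and the names `softmax` and `causal_attention` are chosen only for this example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def causal_attention(queries, keys, values):
    """Scaled dot-product attention where position i sees only positions <= i."""
    d_k = len(keys[0])
    outputs = []
    for i, q in enumerate(queries):
        # Causal mask: query i is scored only against keys 0..i.
        scores = [
            sum(qc * kc for qc, kc in zip(q, keys[j])) / math.sqrt(d_k)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # Output is the attention-weighted sum of the visible value vectors.
        outputs.append([
            sum(w * values[j][c] for j, w in enumerate(weights))
            for c in range(len(values[0]))
        ])
    return outputs

# Toy sequence of three 2-dimensional token vectors (self-attention: Q = K = V).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = causal_attention(x, x, x)
print(out[0])  # the first token can only attend to itself: [1.0, 0.0]
```

A full decoder block would add learned query, key, and value projections, multiple heads, a residual connection, layer normalization, and a feed-forward layer around this core.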

Build GPT2-like Language Model from Scratch - Code provided

Make TinyLlama Smarter: Reasoning + Tool Calling Fine-Tune

In this video, we dive deep into fine-tuning TinyLlama to bridge the gap between simple text generation and complex tool use. We don’t just teach it to call tools; we teach it to reason before acting.

Train Your Own Speech Transcription Model from Scratch, code provided

Learn to train your own speech recognition model from scratch using the NeoWhisper PyPI package.

Train Your Own Small Language Model for Text Generation from Scratch, code provided

In this video, you will learn about tokenizers and how to train a small language model for text generation using the tror-yong-lm PyPI package in Python.

Build Gradio App in colab to Chat with Your DataFrame using Llamafile and PandasAI - 100% Free

In this video, you will learn to install Python 3.11 in Colab and develop a Gradio app for data analysis by integrating Llamafile with a PandasAI Agent.

Build Streamlit App to Chat with your DataFrame using PandasAI & MLX - Free and Local

This video shows how to develop a Streamlit application for data analysis using a PandasAI Agent.

Local RAG with Llamafile (NO High End GPUs Required)

Learn how to implement a retrieval augmented generation (RAG) pipeline with Llamafile from scratch.

Easy and quick AI Chat for PDF, MD, or CSV on Mac (Apple Silicon) - Part 2

Learn how to extend the retrieval augmented generation (RAG) app to PDF, Markdown, or CSV documents using a local LLM (tailored to Apple's M-series chips).

RAG App (NO API key needed): Easy and quick AI Chat for Your Docs on Mac (Apple Silicon) - Part 1

Learn how to build a retrieval augmented generation (RAG) app with a local LLM (tailored to Apple's M-series chips).

RAG Explained and Coded for Beginners

Learn how “retrieval augmented generation” (RAG) works with LangChain and MLX in Python.
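To give a feel for the retrieve-then-prompt idea behind RAG, here is a toy sketch. It does not use LangChain or MLX: word counts stand in for a real embedding model, no LLM is called, and the helper names (`embed`, `cosine`, `retrieve`, `build_prompt`) and the sample documents are made up for this example.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, documents, top_k=1):
    """Return the top_k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_k]

def build_prompt(question, documents, top_k=1):
    """Place the retrieved context in front of the question for the LLM."""
    context = "\n".join(retrieve(question, documents, top_k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

docs = [
    "tmux splits one terminal into several panes",
    "neovim is a modal text editor",
    "lora adds low rank matrices to frozen weights",
]
prompt = build_prompt("what is neovim", docs)
print(prompt)
```

In a real pipeline, the prompt would then be sent to a language model, and the documents would be chunked and stored in a vector index rather than scanned one by one.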

YOLO Demo: Uncovering LoRA’s Inefficient GPU Usage (for Computer Vision models)

I show, via a practical demo with YOLOv5, that LoRA can use the GPU inefficiently when applied to computer vision models.

Fine-tune Facebook wav2vec 2.0 for Speech Recognition, better than OpenAI Whisper?

An easy and quick way to fine-tune facebook/mms-1b-all, a wav2vec 2.0 model, for different languages using Python and Colab with a GPU.

Open in Colab

Fine-tuning OpenAI Whisper for Speech Transcription with Custom Dataset

An easy and quick way to fine-tune OpenAI’s Whisper for different languages using Python and Colab with a GPU.

Open in Colab

Step By Step Tutorial Using DoRA & LoRA-C Combo to Fine-tune Detr-ResNet50 for object detection

In this video, I combine LoRA-C with DoRA to fine-tune Detr-ResNet50 for object detection.

Open in Colab

What is DoRA? PEFT for fine-tuning LLMs in 2025

In this video, I explain Weight-Decomposed Low-Rank Adaptation (DoRA) and how to code it in Python.
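As a small illustration of the weight decomposition, the sketch below rebuilds a weight matrix as magnitude times direction, W' = m * (W0 + delta) / ||W0 + delta||, with the norm taken per column. It is an assumption-laden toy, not the code from the video: it uses plain Python lists, `delta` stands for the low-rank product B A passed in as a ready-made matrix, and the function names are chosen for this example.

```python
import math

def column_norms(w):
    """Euclidean norm of each column of a matrix given as a list of rows."""
    return [math.sqrt(sum(row[j] ** 2 for row in w)) for j in range(len(w[0]))]

def dora_weight(w0, delta, magnitude):
    """Return magnitude * (W0 + delta) / column_norm(W0 + delta)."""
    # Direction component: frozen weight plus the low-rank update.
    v = [[w + d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(w0, delta)]
    norms = column_norms(v)
    # Rescale every column to the (trainable) magnitude for that column.
    return [
        [magnitude[j] * row[j] / norms[j] for j in range(len(norms))]
        for row in v
    ]

w0 = [[3.0, 0.0], [4.0, 1.0]]
m = column_norms(w0)  # magnitude starts at the column norms of W0
zero_delta = [[0.0, 0.0], [0.0, 0.0]]

# With a zero update, the decomposition reproduces the original weight.
print(dora_weight(w0, zero_delta, m))

# With a non-zero update, the direction changes but each column keeps norm m.
updated = dora_weight(w0, [[1.0, 0.5], [-1.0, 0.0]], m)
print(column_norms(updated))
```

During fine-tuning, the magnitude vector and the low-rank factors would be trained while W0 stays frozen.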

Step By Step Tutorial To Fine-Tune Detr-ResNet50 for object detection with LoRA-C

In this video, I fine-tune Detr-ResNet50 for object detection using the CPPE-5 dataset and the LoRA-C technique.

What is LoRA-C? PEFT for fine-tuning Computer Vision Models in 2025

In this video, I explain how the Low-Rank Adaptation (LoRA) technique is applied to convolutional layers.

What is LoRA? PEFT for fine-tuning LLMs in 2025

In this video, I explain the Low-Rank Adaptation (LoRA) technique and its implementation with MLX in Python.
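The core LoRA computation can be sketched as h = W0 x + (alpha / r) * B (A x), where W0 is frozen and only the small matrices A and B are trained. The example below is a minimal illustration in plain Python rather than MLX, and the names `matvec` and `lora_forward` as well as the toy numbers are assumptions made for this sketch.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def lora_forward(x, w0, a, b, alpha):
    """Compute W0 x + (alpha / r) * B (A x), where r is the number of rows of A."""
    r = len(a)
    base = matvec(w0, x)               # frozen pretrained path
    update = matvec(b, matvec(a, x))   # trainable low-rank path
    scale = alpha / r
    return [h + scale * u for h, u in zip(base, update)]

w0 = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0]]  # frozen weight, shape 2 x 3
a = [[0.5, -0.5, 1.0]]                   # down-projection, shape r x 3 with r = 1
x = [1.0, 1.0, 1.0]

# B is usually initialized to zero, so the adapted layer starts identical to W0.
print(lora_forward(x, w0, a, [[0.0], [0.0]], alpha=2.0))  # [3.0, 4.0]

# After training, a non-zero B shifts the output through the low-rank path.
print(lora_forward(x, w0, a, [[1.0], [2.0]], alpha=2.0))  # [5.0, 8.0]
```

Here the adapter holds r * (3 + 2) = 5 trainable values against 6 in W0; the saving becomes significant only for large layers with a small rank r.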

Learning Algorithms for Markovian bandits: Is posterior sampling more scalable than optimism?

In this video, I introduce the work I did with my supervisors on using Posterior Sampling Reinforcement Learning and Upper Confidence Reinforcement Learning algorithms in the Markovian bandit problem. You can find our paper here: https://openreview.net/pdf?id=Sh3RF9JowK

Epidemic Simulation with code given

In this video, I explain my epidemic simulations in the Khmer language, but you can check out my GitLab repository, which is written in English. The link is given here.

Cython and MPI4PY

Put Covid-19 data in the Prompt Bar of iTerm2

Use command line to get Covid-19 data