
Introduction to filters and convolution | Computer vision from scratch series

miro notes: https://miro.com/app/board/uXjVIUaPG0Y=/?share_link_id=593132997072

Classical filters & convolution: The heart of computer vision

Before deep learning exploded onto the scene, traditional computer vision centered on filters: small, hand-engineered matrices that you convolve with an image to detect specific features like edges, corners, or textures. In this article, we will dive into the details of classical filters and the convolution operation - how they work, why they matter, and how to implement them.

Image filter and convolution

In the simplest sense, an image filter (often called a kernel) is a small matrix that you slide over an image, pixel by pixel, to produce some transformation.

For each pixel, you:

Overlay the filter on the local neighborhood of that pixel (commonly a 3×3 or 5×5 region).

Multiply each filter entry by the corresponding pixel intensity.

Sum all these products to produce a single new intensity (or gradient value, or some other measure).

This process is called convolution (more precisely, cross-correlation if we don’t rotate the kernel, but in most computer-vision libraries, we call it convolution for simplicity).
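The three steps above can be sketched as a naive NumPy implementation. This is a minimal illustration of the sliding-window multiply-and-sum (technically cross-correlation, as discussed above); real libraries use far faster vectorized routines:

```python
import numpy as np

def convolve2d(image, kernel):
    """Naive 2D cross-correlation ("convolution" in CV parlance).

    For each pixel: overlay the kernel on the local neighborhood,
    multiply entry-wise, and sum the products. No padding is used,
    so the output shrinks by (kernel size - 1) in each dimension.
    """
    kh, kw = kernel.shape
    ih, iw = image.shape
    oh, ow = ih - kh + 1, iw - kw + 1
    out = np.zeros((oh, ow))
    for y in range(oh):
        for x in range(ow):
            region = image[y:y + kh, x:x + kw]   # overlay on neighborhood
            out[y, x] = np.sum(region * kernel)  # multiply and sum
    return out

# A 3x3 averaging (box blur) filter: every output pixel becomes
# the mean of its 3x3 neighborhood
box = np.ones((3, 3)) / 9.0
image = np.arange(25, dtype=float).reshape(5, 5)
blurred = convolve2d(image, box)
print(blurred.shape)  # (3, 3)
```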

Why use filters?

Filters support the following operations, which make them very useful in image processing.

Feature extraction: Filters can highlight specific structures such as edges, corners, or textures (brick walls, repetitive patterns).

Noise reduction: Smoothing filters like Gaussian blur can suppress pixel-level noise.

Enhancement: Some filters sharpen or enhance edges to make further analysis (like object segmentation) more robust.

Classical pattern recognition: Long before neural networks, many CV tasks (face detection, license plate recognition) involved carefully chosen filters plus rule-based heuristics.
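To make the noise-reduction point concrete, here is a minimal sketch of Gaussian smoothing in pure NumPy (assuming a flat synthetic test image rather than a real photo). A 2D Gaussian blur is separable, so we convolve each row and then each column with a 1D Gaussian kernel:

```python
import numpy as np

def gaussian_kernel1d(sigma, radius):
    """Sampled 1D Gaussian, normalized so brightness is preserved."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def gaussian_blur(image, sigma=1.5, radius=4):
    """Separable Gaussian blur: smooth rows, then columns."""
    k = gaussian_kernel1d(sigma, radius)
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, image)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, rows)

rng = np.random.default_rng(0)
# A flat gray image corrupted by pixel-level noise
noisy = 0.5 + 0.1 * rng.standard_normal((64, 64))
smoothed = gaussian_blur(noisy)

# Averaging over a neighborhood cancels independent noise,
# so the smoothed image fluctuates much less
print(noisy.std(), smoothed.std())
```

The separable trick is why Gaussian blurs are cheap: two 1D passes cost far fewer multiplications than one full 2D convolution with the same footprint.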

Let us try to build some filters logically

Whenever you see a filter … it is natural to wonder: why those numbers? Where did they come from?
The core idea: Approximating derivatives in 2D

In the language of calculus, an edge is a place where the intensity function of an image changes abruptly.

Mathematically, that is a derivative - a measure of how fast a function f(x) is changing. For an image whose pixel values can be described as I(x,y), you have partial derivatives ∂I/∂x (change in the x-direction) and ∂I/∂y (change in the y-direction).

In a digital image, we don’t have continuous functions but discrete pixels. Hence, we approximate derivatives using finite differences.
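The finite-difference idea can be shown directly. The central difference f'(x) ≈ (f(x+1) − f(x−1)) / 2 corresponds to the 1D kernel [−1, 0, 1] / 2; applied along rows it approximates ∂I/∂x. A minimal sketch on a synthetic step edge:

```python
import numpy as np

def dIdx(image):
    """Central-difference estimate of dI/dx at each interior pixel:
    (I(x+1, y) - I(x-1, y)) / 2, i.e. the kernel [-1, 0, 1] / 2."""
    return (image[:, 2:] - image[:, :-2]) / 2.0

# A vertical step edge: dark left half, bright right half
img = np.zeros((5, 6))
img[:, 3:] = 1.0

gx = dIdx(img)
print(gx[0])  # [0.  0.5 0.5 0. ]
# The derivative is large only at the columns straddling the edge,
# which is exactly what an edge detector should report.
```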

Classic filters (like Sobel, Prewitt, Laplacian, etc.) didn't just appear out of thin air. They come from a mix of mathematical foundations (finite differences, derivatives, smoothing) and practical experiments with real images.
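This combination of differentiation and smoothing is visible in the Sobel x-kernel itself: it is the outer product of a 1D smoothing filter (to suppress noise across rows) and a 1D central-difference filter (to measure change along x):

```python
import numpy as np

# Smoothing column: weighted average across rows to fight noise
smooth = np.array([[1], [2], [1]])
# Central-difference row: approximates d/dx
diff = np.array([[-1, 0, 1]])

# Outer product combines the two into the familiar Sobel x-kernel
sobel_x = smooth @ diff
print(sobel_x)
# [[-1  0  1]
#  [-2  0  2]
#  [-1  0  1]]
```

Replacing the [1, 2, 1] smoothing column with [1, 1, 1] gives the Prewitt kernel by the same construction, which is one way to see that these filters form a family rather than arbitrary number grids.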

This lecture is a deep dive into filters.

Video: Introduction to filters and convolution | Computer vision from scratch series, from the Vizuara channel.