ESP32-CAM is an ESP32-based microcontroller module with an onboard camera and WiFi connectivity. It is relatively inexpensive, yet it can be used for a range of Computer Vision applications, from monitoring systems and object detection to AI-based surveillance. In this tutorial, we will build a real-time object detection and tracking system using ESP32-CAM and OpenCV. The ESP32-CAM sends video over the WiFi network, while OpenCV running on a computer processes the images to detect and track objects of a particular color. This project is well suited to beginners who want to learn the basics of Computer Vision without needing a robot, motor driver, or other additional hardware.
What Will We Build?
In this project:
- The ESP32-CAM streams video over WiFi.
- The computer receives the video from the ESP32-CAM.
- OpenCV processes the images in real time.
- Objects of a chosen color are detected.
- The system draws a bounding box around each detected object.
- The system marks the center point of the object.
- The object's position is tracked for as long as it stays in the camera's view.
How the System Works
In this project, the ESP32-CAM with an OV3660 camera acts as the video source, sending images in real time over the WiFi network. Each frame captured by the camera is sent to the computer as a video stream that can be accessed through the ESP32-CAM's IP address.
On the computer, a Python program using the OpenCV library reads the video and processes every frame it receives. The image processing consists of converting the color space, filtering the object's color in HSV (Hue, Saturation, Value), and finding the contours of regions that match the chosen color.
Once an object is found, OpenCV computes its size and position, then draws a bounding box and the object's center point on screen. The object's coordinates are also displayed in real time, so its position is known at every moment.
By combining ESP32-CAM and OpenCV in this way, you can build a simple object detection and tracking system that can later be extended into a security system, automated monitoring, robot vision, or other Internet of Things (IoT) Computer Vision applications.
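To make the HSV filtering step more concrete, here is a minimal sketch that uses only Python's standard colorsys module instead of OpenCV. It assumes OpenCV's 8-bit HSV convention (hue 0 to 179, saturation and value 0 to 255) and reuses the two red ranges from the tracking program in this tutorial. The helper names to_opencv_hsv and is_red are made up for illustration, and the rounding may differ slightly from what cv2.cvtColor produces.

```python
import colorsys

# Red thresholds in OpenCV-style HSV (H: 0-179, S and V: 0-255).
# Red sits at both ends of the hue axis, so two ranges are needed.
RED_RANGES = [
    ((0, 120, 70), (10, 255, 255)),
    ((170, 120, 70), (180, 255, 255)),
]


def to_opencv_hsv(r, g, b):
    """Approximate OpenCV's 8-bit HSV values for an RGB pixel (0-255 each)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return round(h * 180) % 180, round(s * 255), round(v * 255)


def is_red(r, g, b):
    """Return True if the pixel falls inside either red range."""
    hsv = to_opencv_hsv(r, g, b)
    return any(
        all(lo <= c <= hi for c, lo, hi in zip(hsv, lower, upper))
        for lower, upper in RED_RANGES
    )


print(is_red(255, 0, 0))   # True: hue 0, first range
print(is_red(255, 0, 30))  # True: hue near 176, second range
print(is_red(0, 0, 255))   # False: blue has hue 120
```

A dark red such as (60, 0, 0) is rejected because its value falls below the lower bound of 70, which is how the thresholds ignore shadows and very dim pixels.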
Required Hardware
- ESP32-CAM-MB
- ESP32-CAM with an OV3660 camera
- USB Type-C or Micro USB cable (whichever matches the MB board)
- Laptop / PC
- Wi-Fi network
Required Software
- Arduino IDE
- ESP32-CAM-MB USB driver
- Python 3.x
- NumPy
- OpenCV
- Requests (the Python program uses it to fetch frames)
Install OpenCV, NumPy, and Requests from the Command Prompt:
pip install opencv-python
pip install numpy
pip install requests
Or:
pip install opencv-python numpy requests
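Uploading the Program to the ESP32-CAM
Open Arduino IDE and load the CameraWebServer example (File > Examples > ESP32 > Camera > CameraWebServer). The example also provides the camera_pins.h and app_httpd.cpp files that the sketch below depends on.
ESP32-CAM Program (Stock CameraWebServer Sketch)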
#include "esp_camera.h"
#include <WiFi.h>
//
// WARNING!!! PSRAM IC required for UXGA resolution and high JPEG quality
// Ensure ESP32 Wrover Module or other board with PSRAM is selected
// Partial images will be transmitted if image exceeds buffer size
//
// You must select partition scheme from the board menu that has at least 3MB APP space.
// Face Recognition is DISABLED for ESP32 and ESP32-S2, because it takes up from 15
// seconds to process single frame. Face Detection is ENABLED if PSRAM is enabled as well
// ===================
// Select camera model
// ===================
//#define CAMERA_MODEL_WROVER_KIT // Has PSRAM
//#define CAMERA_MODEL_ESP_EYE // Has PSRAM
//#define CAMERA_MODEL_ESP32S3_EYE // Has PSRAM
//#define CAMERA_MODEL_M5STACK_PSRAM // Has PSRAM
//#define CAMERA_MODEL_M5STACK_V2_PSRAM // M5Camera version B Has PSRAM
//#define CAMERA_MODEL_M5STACK_WIDE // Has PSRAM
//#define CAMERA_MODEL_M5STACK_ESP32CAM // No PSRAM
//#define CAMERA_MODEL_M5STACK_UNITCAM // No PSRAM
#define CAMERA_MODEL_AI_THINKER // Has PSRAM
//#define CAMERA_MODEL_TTGO_T_JOURNAL // No PSRAM
//#define CAMERA_MODEL_XIAO_ESP32S3 // Has PSRAM
// ** Espressif Internal Boards **
//#define CAMERA_MODEL_ESP32_CAM_BOARD
//#define CAMERA_MODEL_ESP32S2_CAM_BOARD
//#define CAMERA_MODEL_ESP32S3_CAM_LCD
//#define CAMERA_MODEL_DFRobot_FireBeetle2_ESP32S3 // Has PSRAM
//#define CAMERA_MODEL_DFRobot_Romeo_ESP32S3 // Has PSRAM
#include "camera_pins.h"
// ===========================
// Enter your WiFi credentials
// ===========================
const char* ssid = "WIFI_NAME";
const char* password = "WIFI_PASSWORD";
void startCameraServer();
void setupLedFlash(int pin);
void setup() {
  Serial.begin(115200);
  Serial.setDebugOutput(true);
  Serial.println();

  camera_config_t config;
  config.ledc_channel = LEDC_CHANNEL_0;
  config.ledc_timer = LEDC_TIMER_0;
  config.pin_d0 = Y2_GPIO_NUM;
  config.pin_d1 = Y3_GPIO_NUM;
  config.pin_d2 = Y4_GPIO_NUM;
  config.pin_d3 = Y5_GPIO_NUM;
  config.pin_d4 = Y6_GPIO_NUM;
  config.pin_d5 = Y7_GPIO_NUM;
  config.pin_d6 = Y8_GPIO_NUM;
  config.pin_d7 = Y9_GPIO_NUM;
  config.pin_xclk = XCLK_GPIO_NUM;
  config.pin_pclk = PCLK_GPIO_NUM;
  config.pin_vsync = VSYNC_GPIO_NUM;
  config.pin_href = HREF_GPIO_NUM;
  config.pin_sccb_sda = SIOD_GPIO_NUM;
  config.pin_sccb_scl = SIOC_GPIO_NUM;
  config.pin_pwdn = PWDN_GPIO_NUM;
  config.pin_reset = RESET_GPIO_NUM;
  config.xclk_freq_hz = 20000000;
  config.frame_size = FRAMESIZE_UXGA;
  config.pixel_format = PIXFORMAT_JPEG; // for streaming
  //config.pixel_format = PIXFORMAT_RGB565; // for face detection/recognition
  config.grab_mode = CAMERA_GRAB_WHEN_EMPTY;
  config.fb_location = CAMERA_FB_IN_PSRAM;
  config.jpeg_quality = 12;
  config.fb_count = 1;

  // if PSRAM IC present, init with UXGA resolution and higher JPEG quality
  // for larger pre-allocated frame buffer.
  if (config.pixel_format == PIXFORMAT_JPEG) {
    if (psramFound()) {
      config.jpeg_quality = 10;
      config.fb_count = 2;
      config.grab_mode = CAMERA_GRAB_LATEST;
    } else {
      // Limit the frame size when PSRAM is not available
      config.frame_size = FRAMESIZE_SVGA;
      config.fb_location = CAMERA_FB_IN_DRAM;
    }
  } else {
    // Best option for face detection/recognition
    config.frame_size = FRAMESIZE_240X240;
#if CONFIG_IDF_TARGET_ESP32S3
    config.fb_count = 2;
#endif
  }

#if defined(CAMERA_MODEL_ESP_EYE)
  pinMode(13, INPUT_PULLUP);
  pinMode(14, INPUT_PULLUP);
#endif

  // camera init
  esp_err_t err = esp_camera_init(&config);
  if (err != ESP_OK) {
    Serial.printf("Camera init failed with error 0x%x", err);
    return;
  }

  sensor_t * s = esp_camera_sensor_get();
  // initial sensors are flipped vertically and colors are a bit saturated
  if (s->id.PID == OV3660_PID) {
    s->set_vflip(s, 1); // flip it back
    s->set_brightness(s, 1); // up the brightness just a bit
    s->set_saturation(s, -2); // lower the saturation
  }
  // drop down frame size for higher initial frame rate
  if (config.pixel_format == PIXFORMAT_JPEG) {
    s->set_framesize(s, FRAMESIZE_QVGA);
  }

#if defined(CAMERA_MODEL_M5STACK_WIDE) || defined(CAMERA_MODEL_M5STACK_ESP32CAM)
  s->set_vflip(s, 1);
  s->set_hmirror(s, 1);
#endif

#if defined(CAMERA_MODEL_ESP32S3_EYE)
  s->set_vflip(s, 1);
#endif

  // Setup LED Flash if LED pin is defined in camera_pins.h
#if defined(LED_GPIO_NUM)
  setupLedFlash(LED_GPIO_NUM);
#endif

  WiFi.begin(ssid, password);
  WiFi.setSleep(false);

  while (WiFi.status() != WL_CONNECTED) {
    delay(500);
    Serial.print(".");
  }
  Serial.println("");
  Serial.println("WiFi connected");

  startCameraServer();

  Serial.print("Camera Ready! Use 'http://");
  Serial.print(WiFi.localIP());
  Serial.println("' to connect");
}

void loop() {
  // Do nothing. Everything is done in another task by the web server
  delay(10000);
}
Changing the SSID and Password
Replace the following lines to match the WiFi network you are using:
const char* ssid = "WIFI_NAME";
const char* password = "WIFI_PASSWORD";
Example:
const char* ssid = "RumahKu";
const char* password = "12345678";
Uploading the Program
1. Select the "AI Thinker ESP32-CAM" board.
2. Upload the program.
3. Once the upload succeeds, open the Serial Monitor at 115200 baud. After it connects to WiFi, the ESP32-CAM prints its IP address.
Note this IP address, because it is used in the Python program.
Writing the OpenCV Program
1. Create a new file named tracking.py.
2. Enter the following code.
Python OpenCV Program
import cv2
import numpy as np
import requests

# Snapshot endpoint of the ESP32-CAM; replace the IP with your own.
URL = "http://10.132.114.80/capture"

# Red wraps around both ends of the HSV hue axis, so two ranges are needed.
lower_red1 = np.array([0, 120, 70])
upper_red1 = np.array([10, 255, 255])
lower_red2 = np.array([170, 120, 70])
upper_red2 = np.array([180, 255, 255])

while True:
    # Fetch one JPEG frame and decode it into a BGR image.
    img_resp = requests.get(URL, timeout=5)
    img_arr = np.frombuffer(img_resp.content, dtype=np.uint8)
    frame = cv2.imdecode(img_arr, cv2.IMREAD_COLOR)
    if frame is None:
        continue  # skip frames that failed to decode

    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

    # Keep only the pixels that fall inside either red range.
    mask1 = cv2.inRange(hsv, lower_red1, upper_red1)
    mask2 = cv2.inRange(hsv, lower_red2, upper_red2)
    mask = cv2.bitwise_or(mask1, mask2)

    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    for contour in contours:
        area = cv2.contourArea(contour)
        if area > 500:  # ignore small blobs (noise)
            x, y, w, h = cv2.boundingRect(contour)
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)

            # Center point of the bounding box.
            cx = x + w // 2
            cy = y + h // 2
            cv2.circle(frame, (cx, cy), 5, (255, 0, 0), -1)
            cv2.putText(frame, f"X:{cx} Y:{cy}", (x, y - 10),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    cv2.imshow("ESP32-CAM Tracking", frame)
    cv2.imshow("Mask", mask)

    if cv2.waitKey(1) == 27:  # Esc key exits
        break

cv2.destroyAllWindows()
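The comments in the ESP32-CAM sketch warn that partial images may be transmitted when a frame exceeds the buffer size. One lightweight guard on the Python side, offered here as a sketch and not as part of the original program, is to check the JPEG start and end markers before decoding. The function name looks_like_complete_jpeg is hypothetical, and the check only inspects the marker bytes; it does not validate the image contents.

```python
JPEG_START = b"\xff\xd8"  # SOI (start of image) marker
JPEG_END = b"\xff\xd9"    # EOI (end of image) marker


def looks_like_complete_jpeg(data: bytes) -> bool:
    """Cheap sanity check: payload starts with SOI and ends with EOI."""
    return (
        len(data) >= 4
        and data.startswith(JPEG_START)
        and data.endswith(JPEG_END)
    )


# Synthetic payloads standing in for HTTP response bodies.
complete = JPEG_START + b"image-data" + JPEG_END
truncated = JPEG_START + b"image-da"

print(looks_like_complete_jpeg(complete))   # True
print(looks_like_complete_jpeg(truncated))  # False
```

In the tracking loop, such a check could be applied to img_resp.content, skipping to the next iteration when it returns False.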
Changing the IP Address
Set the IP address in the program to the one shown in the Serial Monitor.
Example:
If the ESP32-CAM's IP address is 10.132.114.80, then URL = "http://10.132.114.80/capture".
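To avoid typos when editing the address by hand, the URL can be built from the IP address alone. The sketch below assumes the stock CameraWebServer firmware, which serves single JPEG frames at /capture on the default HTTP port. The helper name capture_url is hypothetical, and the validation uses Python's standard ipaddress module.

```python
import ipaddress


def capture_url(ip: str) -> str:
    """Build the snapshot URL for an ESP32-CAM at the given IPv4 address."""
    ipaddress.IPv4Address(ip)  # raises ValueError if the address is malformed
    return f"http://{ip}/capture"


print(capture_url("10.132.114.80"))  # http://10.132.114.80/capture
```

Passing an incomplete address such as "10.132.114" raises a ValueError immediately, which is easier to diagnose than a connection error later in the loop.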
Running the Program
1. Open the Command Prompt.
2. Change to the project folder, for example: cd Desktop
3. Run: python tracking.py
Press Esc in one of the video windows to stop the program.
How the OpenCV Program Works
1. Reading Video Frames
img_resp = requests.get(URL)
Fetches the latest JPEG frame from the ESP32-CAM over WiFi. The loop repeats this request continuously, and cv2.imdecode turns each response into an image.
2. Converting the Color Space
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
OpenCV converts the image from BGR to HSV. HSV separates hue from brightness, which makes color detection more accurate.
3. Creating the Color Filter
mask1 = cv2.inRange(hsv, lower_red1, upper_red1)
mask2 = cv2.inRange(hsv, lower_red2, upper_red2)
Only red pixels are kept; all other colors are removed. Two ranges are needed because red sits at both ends of the hue axis, and the two masks are then combined into one.
4. Finding Object Contours
contours, _ = cv2.findContours(...)
Finds the outlines of regions whose color matches the filter.
5. Drawing the Bounding Box
cv2.rectangle(...)
Draws a green box around each detected object.
6. Finding the Center Point
cx = x + w//2
cy = y + h//2
Computes the center position of the object.
7. Displaying the Coordinates
cv2.putText(...)
Displays the object's position on screen.
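Steps 4 to 6 can be illustrated without a camera. The sketch below uses only NumPy on a synthetic mask to show what a bounding box and its center mean in pixel terms. It is a simplified stand-in, not how cv2.findContours or cv2.boundingRect are implemented: it treats all non-zero pixels as a single object, whereas contour detection separates multiple blobs. The function names bounding_box and center are made up for illustration.

```python
import numpy as np


def bounding_box(mask):
    """Return (x, y, w, h) enclosing all non-zero pixels, or None if empty."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    x, y = int(xs.min()), int(ys.min())
    w = int(xs.max()) - x + 1
    h = int(ys.max()) - y + 1
    return x, y, w, h


def center(box):
    """Center of a box, using the same integer formula as the tracker."""
    x, y, w, h = box
    return x + w // 2, y + h // 2


# Synthetic 320x240 mask with one white rectangle standing in for a red object.
mask = np.zeros((240, 320), dtype=np.uint8)
mask[100:140, 50:110] = 255  # rows 100..139, columns 50..109

box = bounding_box(mask)
print(box)          # (50, 100, 60, 40)
print(center(box))  # (80, 120)
```

Note that x and w refer to columns while y and h refer to rows, which is why the array is indexed as mask[row, column] but the box is reported as (x, y, w, h).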
What You Will See
When a red object is in front of the camera, OpenCV draws a green box around it, marks its center point, displays its coordinates, and follows its movement in real time.