Target Tracking Algorithm Based on
Edge Acceleration
Song Jianbo
Contents
2.1 Traditional correlation filtering target tracking algorithm
2.2 Target tracking algorithm based on SSD and YOLOv3
3.1 Correlation filtering target tracking algorithm based on ZU15EG
3.2 Deep learning target tracking algorithm based on KV260
1 Introduction
1.1 Background
Target tracking is the process of using optical elements or sensors to acquire a continuous video sequence, calibrating the position and scale of the target in the starting frame, predicting its position and scale in subsequent frames, and thereby determining the target's overall trajectory. It is one of the hot topics in computer vision. In recent years, with the continuous development of computer vision and the steady improvement of hardware, target tracking technology has been applied ever more widely in national defense and aerospace fields such as airborne tracking and precision guidance.
After years of exploration and research, many domestic and foreign teams have proposed and refined target tracking algorithms based on different principles. However, because of challenging factors in the tracking process, such as target deformation, rotation, occlusion, departure from the field of view, scale change, fast motion, background clutter, and illumination change, tracking quality still has much room to improve. At the same time, algorithmic optimization is usually accompanied by a growing computational load, so real-time performance becomes a problem that must be considered. The highly complex computation and the large volume of real-time data that target tracking demands require an edge computing platform with very high computing performance. Meanwhile, application fields such as airborne tracking impose strict limits on the power consumption and payload capacity of the hardware platform. It is therefore necessary to find a processing platform that combines high performance, low power consumption, and small size, and to develop a lightweight target tracking algorithm for it.
1.2 Work of the project
To address the accuracy and real-time requirements of target tracking in different scenarios, this project implements both a traditional (non-deep-learning) correlation filtering algorithm and deep learning algorithms based on SSD and YOLOv3, deploying them on a ZU15EG platform (a circuit board I designed myself) and on the KV260 edge computing platform officially provided by Xilinx. The algorithm principles and their deployment are described in detail below.
2 Algorithm principle
2.1 Traditional correlation filtering target tracking algorithm
This project builds on KCF, a classical correlation filtering algorithm. First, since the original KCF cannot adaptively adjust the scale of the tracking box, the project draws on the concepts of a "scale pool" and a "scale filter" and adds multi-scale estimation to improve robustness under scale change. Second, because the single HOG feature used by KCF represents the target weakly and is not robust under deformation, an adaptive weighted multi-feature fusion strategy based on average peak-to-correlation energy (APCE) is proposed to adaptively weight the HOG and CN features. Third, to counter tracking drift and outright failure in challenging scenes such as fast motion, rotation, occlusion, and departure from the field of view, and following the high-confidence template update idea of LMCF, the project combines the APCE criterion with the response peak to test the confidence of the HOG and CN features independently and to update their templates independently, making full use of their complementary strengths. Finally, to recover from target loss in complex scenes, an online support vector machine (SVM) classifier is introduced to re-detect the target. Based on these optimization strategies, the project arrives at an improved multi-scale long-term tracking algorithm with adaptive feature fusion and target re-detection (MADKCF). After this multi-dimensional optimization, MADKCF achieves higher precision and success rate than the original algorithm on the OTB2013 benchmark, and it also shows strong tracking performance in real-world tests.
2.2 Target tracking algorithm based on SSD and YOLOv3
The SSD algorithm adopted in this project is a single-shot multi-target detection algorithm that directly predicts object categories and bounding boxes in one forward pass, which makes it much faster than two-stage detectors such as Faster R-CNN. It performs detection on feature maps of several different scales and then merges the resulting candidate boxes with non-maximum suppression (NMS).
Using a single network structure, YOLOv3 predicts object categories and locations while generating candidate regions, so detection does not have to be split into two stages. In addition, YOLOv3 produces far fewer predicted boxes than Faster R-CNN: in Faster R-CNN each ground-truth box may correspond to several positively labeled candidate regions, while in YOLOv3 each ground-truth box corresponds to only one. These characteristics make YOLOv3 fast enough to reach real-time response.
My experiments show that a scheme combining SSD and YOLOv3 is competent not only for target detection but also for target tracking. Considering both tracking accuracy and real-time performance, this project adopts this new tracking scheme, and experiments show that it tracks well.
3 Edge algorithm deployment
3.1 Correlation filtering target tracking algorithm based on ZU15EG
This algorithm is deployed on an embedded platform built around the Xilinx ZU15EG chip, which I designed and produced myself. For confidentiality reasons, I will not open-source the hardware schematic and PCB here.
First, a video pipeline is built in Vivado; then the PetaLinux tools are used to create the SD card image. After the corresponding files are placed in the different partitions of the SD card, the card is inserted into the board and the system boots.
I designed a GUI for the algorithm of this project; its test interface is shown in Figure 1.
Figure 1 Demo1 (panels (a) and (b))
3.2 Deep learning target tracking algorithm based on KV260
The self-designed hardware platform is not currently suitable for deploying the deep learning algorithm, so I use the KV260 evaluation board officially provided by Xilinx to implement it.
Following the official tutorial, the system starts easily, as shown in Figure 2.
Figure 2 System startup
I then use the Vitis AI toolchain to compile the model after algorithm design, pruning, and quantization, and import the compiled model into the KV260 system to run as a smart camera application. To record the output conveniently, I use a video capture card to convert the HDMI output to USB, feed it to a computer, and watch it in real time with the AMCap software. Screenshots of the deployed algorithm are shown in Figure 3.
For the sake of real-time performance, only the SSD-based deployment is shown below.
Figure 3 Demo2 (panels (a) and (b))
The tracking algorithm works very well: it can track the target over long periods, and the frame rate is around 30 fps, showing good real-time performance.
4 Conclusion
In this project, two sets of target tracking algorithms were successfully designed and deployed at the edge. Both deployments work well, with high accuracy and good real-time performance.
5 Appendix: source code
Below is the source code of the correlation filtering target tracking algorithm used in this project.
- main.cpp
- #ifndef _MADKCFtracker_HEADERS
- #define _MADKCFtracker_HEADERS
- #include "MADKCFtracker.hpp"
- #include "ffttools.hpp"
- #include "recttools.hpp"
- #include "fhog.hpp"
- #include "labdata.hpp"
- #endif
- float APCE = 0;
- float APCE_AVE = 0;
- float APCE_SUM = 0;
- int APCE_COUNT = 0;
- // Constructor
- MADKCFtracker::MADKCFtracker(bool hog, bool fixed_window, bool multiscale, bool lab)
- {
- // Parameters shared by all configurations
- lambda = 0.0001;
- padding = 2.5;
- //output_sigma_factor = 0.1;
- output_sigma_factor = 0.125;
- if (hog) { // HOG
- // VOT
- interp_factor = 0.012;
- sigma = 0.6;
- // TPAMI
- //interp_factor = 0.02;
- //sigma = 0.5;
- cell_size = 4;
- _hogfeatures = true;
- if (lab) {
- interp_factor = 0.005;
- sigma = 0.4;
- //output_sigma_factor = 0.025;
- output_sigma_factor = 0.1;
- _labfeatures = true;
- _labCentroids = cv::Mat(nClusters, 3, CV_32FC1, &data);
- cell_sizeQ = cell_size*cell_size;
- }
- else{
- _labfeatures = false;
- }
- }
- else { // RAW
- interp_factor = 0.075;
- sigma = 0.2;
- cell_size = 1;
- _hogfeatures = false;
- if (lab) {
- printf("Lab features are only used with HOG features.\n");
- _labfeatures = false;
- }
- }
- if (multiscale) { // multiscale
- template_size = 96;
- // Initialize the scale estimation parameters
- scale_padding = 1.0;
- scale_step = 1.05;
- scale_sigma_factor = 0.25;
- n_scales = 33;
- scale_lr = 0.025;
- scale_max_area = 512;
- currentScaleFactor = 1;
- scale_lambda = 0.01;
- if (!fixed_window) {
- //printf("Multiscale does not support non-fixed window.\n");
- fixed_window = true;
- }
- }
- else if (fixed_window) { // fit correction without multiscale
- template_size = 96;
- //template_size = 100;
- scale_step = 1;
- }
- else {
- template_size = 1;
- scale_step = 1;
- }
- }
- // Initialize the KCF tracker with the selected ROI and the first frame, compute the tracking model alpha, and train the tracker
- void MADKCFtracker::init(const cv::Rect &roi, cv::Mat image)
- {
- _roi = roi;
- assert(roi.width >= 0 && roi.height >= 0);
- _tmpl = getFeatures(image, 1);
- _prob = createGaussianPeak(size_patch[0], size_patch[1]);
- _alphaf = cv::Mat(size_patch[0], size_patch[1], CV_32FC2, float(0));
- dsstInit(roi, image);
- //_num = cv::Mat(size_patch[0], size_patch[1], CV_32FC2, float(0));
- //_den = cv::Mat(size_patch[0], size_patch[1], CV_32FC2, float(0));
- train(_tmpl, 1.0); // train with initial frame
- }
- // Update the target position based on the new frame:
- // (1) update the target's current position;
- // (2) update the target's current scale;
- // (3) use the new center and scale to determine the latest tracking box.
- cv::Rect MADKCFtracker::update(cv::Mat image)
- {
- // Correct the boundaries
- if (_roi.x + _roi.width <= 0) _roi.x = -_roi.width + 1;
- if (_roi.y + _roi.height <= 0) _roi.y = -_roi.height + 1;
- if (_roi.x >= image.cols - 1) _roi.x = image.cols - 2;
- if (_roi.y >= image.rows - 1) _roi.y = image.rows - 2;
- // Tracking box center (cx, cy)
- float cx = _roi.x + _roi.width / 2.0f; // (_roi.x, _roi.y): top-left corner of the tracking box
- float cy = _roi.y + _roi.height / 2.0f;
- float peak_value;
- cv::Point2f res = detect(_tmpl, getFeatures(image, 0, 1.0f), peak_value); // (res.x, res.y): center point coordinates
- // Only the center is returned, so adjust the target box using the scale and the center coordinates
- _roi.x = cx - _roi.width / 2.0f + ((float) res.x * cell_size * _scale * currentScaleFactor); // cx, cy: tracking box center
- _roi.y = cy - _roi.height / 2.0f + ((float) res.y * cell_size * _scale * currentScaleFactor); // (_roi.x, _roi.y): top-left corner of the tracking box
- if (_roi.x >= image.cols - 1) _roi.x = image.cols - 1;
- if (_roi.y >= image.rows - 1) _roi.y = image.rows - 1;
- if (_roi.x + _roi.width <= 0) _roi.x = -_roi.width + 2;
- if (_roi.y + _roi.height <= 0) _roi.y = -_roi.height + 2;
- // Update scale
- cv::Point2i scale_pi = detect_scale(image);
- currentScaleFactor = currentScaleFactor * scaleFactors[scale_pi.x];
- if(currentScaleFactor < min_scale_factor)
- currentScaleFactor = min_scale_factor;
- // else if(currentScaleFactor > max_scale_factor)
- // currentScaleFactor = max_scale_factor;
- train_scale(image);
- if (_roi.x >= image.cols - 1) _roi.x = image.cols - 1;
- if (_roi.y >= image.rows - 1) _roi.y = image.rows - 1;
- if (_roi.x + _roi.width <= 0) _roi.x = -_roi.width + 2;
- if (_roi.y + _roi.height <= 0) _roi.y = -_roi.height + 2;
- assert(_roi.width >= 0 && _roi.height >= 0);
- cv::Mat x = getFeatures(image, 0);
- train(x, interp_factor);
- return _roi;
- }
- // Estimate the latest target scale:
- // (1) extract samples at 33 scales from the input frame,
- // (2) compute the scale response matrix at each scale,
- // (3) locate the maximum response with minMaxLoc.
- cv::Point2i MADKCFtracker::detect_scale(cv::Mat image)
- {
- cv::Mat xsf = MADKCFtracker::get_scale_sample(image);
- // Compute AZ in the paper
- cv::Mat add_temp;
- cv::reduce(FFTTools::complexMultiplication(sf_num, xsf), add_temp, 0, CV_REDUCE_SUM);
- // compute the final y
- cv::Mat scale_response;
- cv::idft(FFTTools::complexDivisionReal(add_temp, (sf_den + scale_lambda)), scale_response, cv::DFT_REAL_OUTPUT);
- // Get the max point as the final scaling rate
- cv::Point2i pi;
- double pv;
- cv::minMaxLoc(scale_response, NULL, &pv, NULL, &pi);
- return pi; // pi: index into the one-dimensional scale response vector
- }
- // z: sample (template) from the previous frame
- // x: image features of the current frame
- // peak_value: output peak value
- cv::Point2f MADKCFtracker::detect(cv::Mat z, cv::Mat x, float &peak_value)
- {
- using namespace FFTTools;
- // Transform to obtain the response map res
- cv::Mat k = gaussianCorrelation(x, z);
- cv::Mat res = (real(fftd(complexMultiplication(_alphaf, fftd(k)), true)));
- // Use OpenCV's minMaxLoc to locate the peak
- cv::Point2i pi; // location of the maximum
- cv::Point2i minPoint;
- double pv; // maximum value
- double minValue;
- //cv::minMaxLoc(res, NULL, &pv, NULL, &pi);
- cv::minMaxLoc(res, &minValue, &pv, &minPoint, &pi);
- peak_value = (float) pv;
- float ave_res = 0;
- APCE_COUNT++;
- float value_diff = 0;
- float sum_diff = 0;
- for (int i = 0; i < res.rows; i++) {
- for (int j = 0; j < res.cols; j++) {
- value_diff = res.at<float>(i, j) - minValue;
- sum_diff += std::pow(value_diff, 2);
- }
- }
- ave_res = sum_diff / (res.cols * res.rows);
- float mmdiff = 0;
- mmdiff = pv - minValue;
- mmdiff = std::pow(mmdiff, 2);
- float apce = mmdiff / ave_res;
- APCE_SUM += apce;
- APCE_AVE = APCE_SUM / APCE_COUNT;
- APCE = apce;
- // Sub-pixel peak detection; the coordinates are non-integer
- cv::Point2f p((float)pi.x, (float)pi.y);
- if (pi.x > 0 && pi.x < res.cols-1) {
- p.x += subPixelPeak(res.at<float>(pi.y, pi.x-1), peak_value, res.at<float>(pi.y, pi.x+1));
- }
- if (pi.y > 0 && pi.y < res.rows-1) {
- p.y += subPixelPeak(res.at<float>(pi.y-1, pi.x), peak_value, res.at<float>(pi.y+1, pi.x));
- }
- p.x -= (res.cols) / 2;
- p.y -= (res.rows) / 2;
- return p;
- }
- // Train with the image to obtain the current frame's _tmpl and _alphaf
- void MADKCFtracker::train(cv::Mat x, float train_interp_factor)
- {
- using namespace FFTTools;
- cv::Mat k = gaussianCorrelation(x, x);
- cv::Mat alphaf = complexDivision(_prob, (fftd(k) + lambda)); // _prob: result of initialization, never changed, used for training
- //_tmpl = (1 - train_interp_factor) * _tmpl + (train_interp_factor) * x; // _tmpl: result of init/training, used as z in detect
- // _alphaf = (1 - train_interp_factor) * _alphaf + (train_interp_factor) * alphaf; // _alphaf: result of init/training, used when computing the detection response
- /*cv::Mat kf = fftd(gaussianCorrelation(x, x));
- cv::Mat num = complexMultiplication(kf, _prob);
- cv::Mat den = complexMultiplication(kf, kf + lambda);
- _tmpl = (1 - train_interp_factor) * _tmpl + (train_interp_factor) * x;
- _num = (1 - train_interp_factor) * _num + (train_interp_factor) * num;
- _den = (1 - train_interp_factor) * _den + (train_interp_factor) * den;
- _alphaf = complexDivision(_num, _den);*/
- if (APCE_COUNT >= 1)
- {
- if (APCE >= 0.8 * APCE_AVE)// 0.45
- {
- _tmpl = (1 - train_interp_factor) * _tmpl + (train_interp_factor)* x;
- _alphaf = (1 - train_interp_factor) * _alphaf + (train_interp_factor)* alphaf;
- }
- // Otherwise (low confidence) keep the previous _tmpl and _alphaf unchanged
- }
- else
- {
- _tmpl = (1 - train_interp_factor) * _tmpl + (train_interp_factor)* x;
- _alphaf = (1 - train_interp_factor) * _alphaf + (train_interp_factor)* alphaf;
- }
- }
- // Evaluate a Gaussian kernel with bandwidth sigma for all relative shifts between input images X and Y
- // Both must be MxN and periodic (i.e., pre-processed with a cosine window)
- cv::Mat MADKCFtracker::gaussianCorrelation(cv::Mat x1, cv::Mat x2)
- {
- using namespace FFTTools;
- cv::Mat c = cv::Mat( cv::Size(size_patch[1], size_patch[0]), CV_32F, cv::Scalar(0) );
- // HOG
- if (_hogfeatures) {
- cv::Mat caux;
- cv::Mat x1aux;
- cv::Mat x2aux;
- for (int i = 0; i < size_patch[2]; i++) {
- x1aux = x1.row(i); // Procedure to deal with the cv::Mat multichannel bug
- x1aux = x1aux.reshape(1, size_patch[0]);
- x2aux = x2.row(i).reshape(1, size_patch[0]);
- cv::mulSpectrums(fftd(x1aux), fftd(x2aux), caux, 0, true);
- caux = fftd(caux, true);
- rearrange(caux);
- caux.convertTo(caux,CV_32F);
- c = c + real(caux);
- }
- }
- // Gray features
- else {
- cv::mulSpectrums(fftd(x1), fftd(x2), c, 0, true);
- c = fftd(c, true);
- rearrange(c);
- c = real(c);
- }
- cv::Mat d;
- cv::max(( (cv::sum(x1.mul(x1))[0] + cv::sum(x2.mul(x2))[0])- 2. * c) / (size_patch[0]*size_patch[1]*size_patch[2]) , 0, d);
- cv::Mat k;
- cv::exp((-d / (sigma * sigma)), k);
- return k;
- }
- // Create the Gaussian peak. Called only in the first frame
- cv::Mat MADKCFtracker::createGaussianPeak(int sizey, int sizex)
- {
- cv::Mat_<float> res(sizey, sizex);
- int syh = (sizey) / 2;
- int sxh = (sizex) / 2;
- float output_sigma = std::sqrt((float) sizex * sizey) / padding * output_sigma_factor;
- float mult = -0.5 / (output_sigma * output_sigma);
- for (int i = 0; i < sizey; i++)
- for (int j = 0; j < sizex; j++)
- {
- int ih = i - syh;
- int jh = j - sxh;
- res(i, j) = std::exp(mult * (float) (ih * ih + jh * jh));
- }
- return FFTTools::fftd(res);
- }
- cv::Mat MADKCFtracker::getFeatures(const cv::Mat & image, bool inithann, float scale_adjust)
- {
- cv::Rect extracted_roi;
- float cx = _roi.x + _roi.width / 2;
- float cy = _roi.y + _roi.height / 2;
- if (inithann) {
- int padded_w = _roi.width * padding;
- int padded_h = _roi.height * padding;
- if (template_size > 1) { // Fit largest dimension to the given template size
- if (padded_w >= padded_h) //fit to width
- _scale = padded_w / (float) template_size;
- else
- _scale = padded_h / (float) template_size;
- _tmpl_sz.width = padded_w / _scale;
- _tmpl_sz.height = padded_h / _scale;
- }
- else { //No template size given, use ROI size
- _tmpl_sz.width = padded_w;
- _tmpl_sz.height = padded_h;
- _scale = 1;
- // original code from paper:
- /*if (sqrt(padded_w * padded_h) >= 100) { //Normal size
- _tmpl_sz.width = padded_w;
- _tmpl_sz.height = padded_h;
- _scale = 1;
- }
- else { //ROI is too big, track at half size
- _tmpl_sz.width = padded_w / 2;
- _tmpl_sz.height = padded_h / 2;
- _scale = 2;
- }*/
- }
- if (_hogfeatures) {
- // Round to cell size and also make it even
- _tmpl_sz.width = ( ( (int)(_tmpl_sz.width / (2 * cell_size)) ) * 2 * cell_size ) + cell_size*2;
- _tmpl_sz.height = ( ( (int)(_tmpl_sz.height / (2 * cell_size)) ) * 2 * cell_size ) + cell_size*2;
- }
- else { //Make number of pixels even (helps with some logic involving half-dimensions)
- _tmpl_sz.width = (_tmpl_sz.width / 2) * 2;
- _tmpl_sz.height = (_tmpl_sz.height / 2) * 2;
- }
- }
- extracted_roi.width = scale_adjust * _scale * _tmpl_sz.width * currentScaleFactor;
- extracted_roi.height = scale_adjust * _scale * _tmpl_sz.height * currentScaleFactor;
- // center roi with new size
- extracted_roi.x = cx - extracted_roi.width / 2;
- extracted_roi.y = cy - extracted_roi.height / 2;
- cv::Mat FeaturesMap;
- cv::Mat z = RectTools::subwindow(image, extracted_roi, cv::BORDER_REPLICATE);
- if (z.cols != _tmpl_sz.width || z.rows != _tmpl_sz.height) {
- cv::resize(z, z, _tmpl_sz);
- }
- // HOG features
- if (_hogfeatures) {
- IplImage z_ipl = z;
- CvLSVMFeatureMapCaskade *map;
- getFeatureMaps(&z_ipl, cell_size, &map);
- normalizeAndTruncate(map,0.2f);
- PCAFeatureMaps(map);
- size_patch[0] = map->sizeY;
- size_patch[1] = map->sizeX;
- size_patch[2] = map->numFeatures;
- FeaturesMap = cv::Mat(cv::Size(map->numFeatures,map->sizeX*map->sizeY), CV_32F, map->map); // Procedure to deal with the cv::Mat multichannel bug
- FeaturesMap = FeaturesMap.t();
- freeFeatureMapObject(&map);
- // Lab features
- if (_labfeatures) {
- cv::Mat imgLab;
- cvtColor(z, imgLab, CV_BGR2Lab);
- unsigned char *input = (unsigned char*)(imgLab.data);
- // Sparse output vector
- cv::Mat outputLab = cv::Mat(_labCentroids.rows, size_patch[0]*size_patch[1], CV_32F, float(0));
- int cntCell = 0;
- // Iterate through each cell
- for (int cY = cell_size; cY < z.rows-cell_size; cY+=cell_size){
- for (int cX = cell_size; cX < z.cols-cell_size; cX+=cell_size){
- // Iterate through each pixel of cell (cX,cY)
- for(int y = cY; y < cY+cell_size; ++y){
- for(int x = cX; x < cX+cell_size; ++x){
- // Lab components for each pixel
- float l = (float)input[(z.cols * y + x) * 3];
- float a = (float)input[(z.cols * y + x) * 3 + 1];
- float b = (float)input[(z.cols * y + x) * 3 + 2];
- // Iterate through each centroid
- float minDist = FLT_MAX;
- int minIdx = 0;
- float *inputCentroid = (float*)(_labCentroids.data);
- for(int k = 0; k < _labCentroids.rows; ++k){
- float dist = ( (l - inputCentroid[3*k]) * (l - inputCentroid[3*k]) )
- + ( (a - inputCentroid[3*k+1]) * (a - inputCentroid[3*k+1]) )
- + ( (b - inputCentroid[3*k+2]) * (b - inputCentroid[3*k+2]) );
- if(dist < minDist){
- minDist = dist;
- minIdx = k;
- }
- }
- // Store result at output
- outputLab.at<float>(minIdx, cntCell) += 1.0 / cell_sizeQ;
- //((float*) outputLab.data)[minIdx * (size_patch[0]*size_patch[1]) + cntCell] += 1.0 / cell_sizeQ;
- }
- }
- cntCell++;
- }
- }
- // Update size_patch[2] and add features to FeaturesMap
- size_patch[2] += _labCentroids.rows;
- FeaturesMap.push_back(outputLab);
- }
- }
- else {
- FeaturesMap = RectTools::getGrayImage(z);
- FeaturesMap -= (float) 0.5; // as in the paper
- size_patch[0] = z.rows;
- size_patch[1] = z.cols;
- size_patch[2] = 1;
- }
- if (inithann) {
- createHanningMats();
- }
- FeaturesMap = hann.mul(FeaturesMap);
- return FeaturesMap;
- }
- // Initialize Hanning window. Function called only in the first frame.
- void MADKCFtracker::createHanningMats()
- {
- cv::Mat hann1t = cv::Mat(cv::Size(size_patch[1],1), CV_32F, cv::Scalar(0));
- cv::Mat hann2t = cv::Mat(cv::Size(1,size_patch[0]), CV_32F, cv::Scalar(0));
- for (int i = 0; i < hann1t.cols; i++)
- hann1t.at<float > (0, i) = 0.5 * (1 - std::cos(2 * 3.14159265358979323846 * i / (hann1t.cols - 1)));
- for (int i = 0; i < hann2t.rows; i++)
- hann2t.at<float > (i, 0) = 0.5 * (1 - std::cos(2 * 3.14159265358979323846 * i / (hann2t.rows - 1)));
- cv::Mat hann2d = hann2t * hann1t;
- // HOG features
- if (_hogfeatures) {
- cv::Mat hann1d = hann2d.reshape(1,1); // Procedure to deal with the cv::Mat multichannel bug
- hann = cv::Mat(cv::Size(size_patch[0]*size_patch[1], size_patch[2]), CV_32F, cv::Scalar(0));
- for (int i = 0; i < size_patch[2]; i++) {
- for (int j = 0; j<size_patch[0]*size_patch[1]; j++) {
- hann.at<float>(i,j) = hann1d.at<float>(0,j);
- }
- }
- }
- // Gray features
- else {
- hann = hann2d;
- }
- }
- // Calculate sub-pixel peak for one dimension
- float MADKCFtracker::subPixelPeak(float left, float center, float right)
- {
- float divisor = 2 * center - right - left;
- if (divisor == 0)
- return 0;
- return 0.5 * (right - left) / divisor;
- }
- // Initialization for scales
- void MADKCFtracker::dsstInit(const cv::Rect &roi, cv::Mat image)
- {
- // The initial size for adjusting
- base_width = roi.width;
- base_height = roi.height;
- // Gaussian peak for scales (after FFT)
- ysf = computeYsf();
- s_hann = createHanningMatsForScale();
- // Get all scale changing rate
- scaleFactors = new float[n_scales];
- float ceilS = std::ceil(n_scales / 2.0f);
- for(int i = 0 ; i < n_scales; i++)
- {
- scaleFactors[i] = std::pow(scale_step, ceilS - i - 1);
- }
- // Get the scaling rate for compressing to the model size
- float scale_model_factor = 1;
- if(base_width * base_height > scale_max_area)
- {
- scale_model_factor = std::sqrt(scale_max_area / (float)(base_width * base_height));
- }
- scale_model_width = (int)(base_width * scale_model_factor);
- scale_model_height = (int)(base_height * scale_model_factor);
- // Compute min and max scaling rate
- min_scale_factor = std::pow(scale_step,
- std::ceil(std::log((std::fmax(5 / (float) base_width, 5 / (float) base_height) * (1 + scale_padding))) / 0.0086));
- max_scale_factor = std::pow(scale_step,
- std::floor(std::log(std::fmin(image.rows / (float) base_height, image.cols / (float) base_width)) / 0.0086));
- train_scale(image, true);
- }
- // Train method for scaling
- void MADKCFtracker::train_scale(cv::Mat image, bool ini)
- {
- cv::Mat xsf = get_scale_sample(image);
- // Adjust ysf to the same size as xsf in the first time
- if(ini)
- {
- int totalSize = xsf.rows;
- ysf = cv::repeat(ysf, totalSize, 1);
- }
- // Get new GF in the paper (delta A)
- cv::Mat new_sf_num;
- cv::mulSpectrums(ysf, xsf, new_sf_num, 0, true);
- // Get Sigma{FF} in the paper (delta B)
- cv::Mat new_sf_den;
- cv::mulSpectrums(xsf, xsf, new_sf_den, 0, true);
- cv::reduce(FFTTools::real(new_sf_den), new_sf_den, 0, CV_REDUCE_SUM);
- if(ini)
- {
- sf_den = new_sf_den;
- sf_num = new_sf_num;
- }else
- {
- // Get new A and new B
- cv::addWeighted(sf_den, (1 - scale_lr), new_sf_den, scale_lr, 0, sf_den);
- cv::addWeighted(sf_num, (1 - scale_lr), new_sf_num, scale_lr, 0, sf_num);
- }
- update_roi();
- }
- // Update the ROI size after training
- void MADKCFtracker::update_roi()
- {
- // Compute new center
- float cx = _roi.x + _roi.width / 2.0f;
- float cy = _roi.y + _roi.height / 2.0f;
- // printf("%f\n", currentScaleFactor);
- // Recompute the ROI left-upper point and size
- _roi.width = base_width * currentScaleFactor;
- _roi.height = base_height * currentScaleFactor;
- _roi.x = cx - _roi.width / 2.0f;
- _roi.y = cy - _roi.height / 2.0f;
- }
- // Compute the F^l in the paper
- cv::Mat MADKCFtracker::get_scale_sample(const cv::Mat & image)
- {
- CvLSVMFeatureMapCaskade *map[33]; // temporarily store FHOG result
- cv::Mat xsf; // output
- int totalSize; // # of features
- for(int i = 0; i < n_scales; i++)
- {
- // Size of the subwindow to be detected
- float patch_width = base_width * scaleFactors[i] * currentScaleFactor;
- float patch_height = base_height * scaleFactors[i] * currentScaleFactor;
- float cx = _roi.x + _roi.width / 2.0f;
- float cy = _roi.y + _roi.height / 2.0f;
- // Get the subwindow
- cv::Mat im_patch = RectTools::extractImage(image, cx, cy, patch_width, patch_height);
- cv::Mat im_patch_resized;
- // Scaling the subwindow
- if(scale_model_width > im_patch.cols)
- resize(im_patch, im_patch_resized, cv::Size(scale_model_width, scale_model_height), 0, 0, 1);
- else
- resize(im_patch, im_patch_resized, cv::Size(scale_model_width, scale_model_height), 0, 0, 3);
- // Compute the FHOG features for the subwindow
- IplImage im_ipl = im_patch_resized;
- getFeatureMaps(&im_ipl, cell_size, &map[i]);
- normalizeAndTruncate(map[i], 0.2f);
- PCAFeatureMaps(map[i]);
- if(i == 0)
- {
- totalSize = map[i]->numFeatures*map[i]->sizeX*map[i]->sizeY;
- xsf = cv::Mat(cv::Size(n_scales,totalSize), CV_32F, float(0));
- }
- // Multiply the FHOG results by hanning window and copy to the output
- cv::Mat FeaturesMap = cv::Mat(cv::Size(1, totalSize), CV_32F, map[i]->map);
- float mul = s_hann.at<float > (0, i);
- FeaturesMap = mul * FeaturesMap;
- FeaturesMap.copyTo(xsf.col(i));
- }
- // Free the temp variables
- for(int i = 0; i < n_scales; i++)
- freeFeatureMapObject(&map[i]);
- // Do fft to the FHOG features row by row
- xsf = FFTTools::fftd(xsf, 0, 1);
- return xsf;
- }
- // Compute the FFT Gaussian peak for scaling
- cv::Mat MADKCFtracker::computeYsf()
- {
- float scale_sigma2 = n_scales / std::sqrt(n_scales) * scale_sigma_factor;
- scale_sigma2 = scale_sigma2 * scale_sigma2;
- cv::Mat res(cv::Size(n_scales, 1), CV_32F, float(0));
- float ceilS = std::ceil(n_scales / 2.0f);
- for(int i = 0; i < n_scales; i++)
- {
- res.at<float>(0,i) = std::exp(- 0.5 * std::pow(i + 1- ceilS, 2) / scale_sigma2);
- }
- return FFTTools::fftd(res);
- }
- // Compute the hanning window for scaling
- cv::Mat MADKCFtracker::createHanningMatsForScale()
- {
- cv::Mat hann_s = cv::Mat(cv::Size(n_scales, 1), CV_32F, cv::Scalar(0));
- for (int i = 0; i < hann_s.cols; i++)
- hann_s.at<float > (0, i) = 0.5 * (1 - std::cos(2 * 3.14159265358979323846 * i / (hann_s.cols - 1)));
- return hann_s;
- }
- MADKCFtracker.hpp
- #pragma once
- #include "tracker.h"
- #ifndef _OPENCV_MADKCFtracker_HPP_
- #define _OPENCV_MADKCFtracker_HPP_
- #endif
- class MADKCFtracker : public Tracker
- {
- public:
- // Constructor
- MADKCFtracker(bool hog = true, bool fixed_window = true, bool multiscale = true, bool lab = true);
- // Initialize tracker
- virtual void init(const cv::Rect &roi, cv::Mat image);
- // Update position based on the new frame
- virtual cv::Rect update(cv::Mat image);
- float interp_factor; // linear interpolation factor for adaptation
- float sigma; // gaussian kernel bandwidth
- float lambda; // regularization
- int cell_size; // HOG cell size
- int cell_sizeQ; // cell size^2, to avoid repeated operations
- float padding; // extra area surrounding the target
- float output_sigma_factor; // bandwidth of gaussian target
- int template_size; // template size
- int base_width; // initial ROI width
- int base_height; // initial ROI height
- int scale_max_area; // max ROI size before compressing
- float scale_padding; // extra area surrounding the target for scaling
- float scale_step; // scale step for multi-scale estimation
- float scale_sigma_factor; // bandwidth of gaussian target
- int n_scales; // # of scaling windows
- float scale_lr; // scale learning rate
- float *scaleFactors; // all scale changing rate, from larger to smaller with 1 to be the middle
- int scale_model_width; // the model width for scaling
- int scale_model_height; // the model height for scaling
- float currentScaleFactor; // scaling rate
- float min_scale_factor; // min scaling rate
- float max_scale_factor; // max scaling rate
- float scale_lambda; // regularization
- protected:
- // Detect object in the current frame.
- cv::Point2f detect(cv::Mat z, cv::Mat x, float &peak_value);
- // train tracker with a single image
- void train(cv::Mat x, float train_interp_factor);
- // Evaluates a Gaussian kernel with bandwidth SIGMA for all relative shifts between input images X and Y, which must both be MxN. They must also be periodic (ie., pre-processed with a cosine window).
- cv::Mat gaussianCorrelation(cv::Mat x1, cv::Mat x2);
- // Create Gaussian Peak. Function called only in the first frame.
- cv::Mat createGaussianPeak(int sizey, int sizex);
- // Obtain sub-window from image, with replication-padding and extract features
- cv::Mat getFeatures(const cv::Mat & image, bool inithann, float scale_adjust = 1.0f);
- // Initialize Hanning window. Function called only in the first frame.
- void createHanningMats();
- // Calculate sub-pixel peak for one dimension
- float subPixelPeak(float left, float center, float right);
- // Compute the FFT Gaussian peak for scaling
- cv::Mat computeYsf();
- // Compute the hanning window for scaling
- cv::Mat createHanningMatsForScale();
- // Initialization for scales
- void dsstInit(const cv::Rect &roi, cv::Mat image);
- // Compute the F^l in the paper
- cv::Mat get_scale_sample(const cv::Mat & image);
- // Update the ROI size after training
- void update_roi();
- // Train method for scaling
- void train_scale(cv::Mat image, bool ini = false);
- // Detect the new scaling rate
- cv::Point2i detect_scale(cv::Mat image);
- cv::Mat _alphaf;
- cv::Mat _prob;
- cv::Mat _tmpl;
- cv::Mat _num;
- cv::Mat _den;
- cv::Mat _labCentroids;
- cv::Mat sf_den;
- cv::Mat sf_num;
- private:
- int size_patch[3];
- cv::Mat hann;
- cv::Size _tmpl_sz;
- float _scale;
- int _gaussian_size;
- bool _hogfeatures;
- bool _labfeatures;
- cv::Mat s_hann;
- cv::Mat ysf;
- };
- recttools.hpp
- #pragma once
- //#include <cv.h>
- #include <opencv2/imgproc/types_c.h>
- #include <math.h>
- #ifndef _OPENCV_RECTTOOLS_HPP_
- #define _OPENCV_RECTTOOLS_HPP_
- #endif
- namespace RectTools
- {
- template <typename t>
- inline cv::Vec<t, 2 > center(const cv::Rect_<t> &rect)
- {
- return cv::Vec<t, 2 > (rect.x + rect.width / (t) 2, rect.y + rect.height / (t) 2);
- }
- template <typename t>
- inline t x2(const cv::Rect_<t> &rect)
- {
- return rect.x + rect.width;
- }
- template <typename t>
- inline t y2(const cv::Rect_<t> &rect)
- {
- return rect.y + rect.height;
- }
- template <typename t>
- inline void resize(cv::Rect_<t> &rect, float scalex, float scaley = 0)
- {
- if (!scaley)scaley = scalex;
- rect.x -= rect.width * (scalex - 1.f) / 2.f;
- rect.width *= scalex;
- rect.y -= rect.height * (scaley - 1.f) / 2.f;
- rect.height *= scaley;
- }
- template <typename t>
- inline void limit(cv::Rect_<t> &rect, cv::Rect_<t> limit)
- {
- if (rect.x + rect.width > limit.x + limit.width)rect.width = (limit.x + limit.width - rect.x);
- if (rect.y + rect.height > limit.y + limit.height)rect.height = (limit.y + limit.height - rect.y);
- if (rect.x < limit.x)
- {
- rect.width -= (limit.x - rect.x);
- rect.x = limit.x;
- }
- if (rect.y < limit.y)
- {
- rect.height -= (limit.y - rect.y);
- rect.y = limit.y;
- }
- if(rect.width<0)rect.width=0;
- if(rect.height<0)rect.height=0;
- }
- template <typename t>
- inline void limit(cv::Rect_<t> &rect, t width, t height, t x = 0, t y = 0)
- {
- limit(rect, cv::Rect_<t > (x, y, width, height));
- }
- template <typename t>
- inline cv::Rect getBorder(const cv::Rect_<t > &original, cv::Rect_<t > & limited)
- {
- cv::Rect_<t > res;
- res.x = limited.x - original.x;
- res.y = limited.y - original.y;
- res.width = x2(original) - x2(limited);
- res.height = y2(original) - y2(limited);
- assert(res.x >= 0 && res.y >= 0 && res.width >= 0 && res.height >= 0);
- return res;
- }
- inline cv::Mat subwindow(const cv::Mat &in, const cv::Rect & window, int borderType = cv::BORDER_CONSTANT)
- {
- cv::Rect cutWindow = window;
- RectTools::limit(cutWindow, in.cols, in.rows);
- if (cutWindow.height <= 0 || cutWindow.width <= 0)assert(0); //return cv::Mat(window.height,window.width,in.type(),0) ;
- cv::Rect border = RectTools::getBorder(window, cutWindow);
- cv::Mat res = in(cutWindow);
- if (border != cv::Rect(0, 0, 0, 0))
- {
- cv::copyMakeBorder(res, res, border.y, border.height, border.x, border.width, borderType);
- }
- return res;
- }
- inline void cutOutsize(float &num, int limit)
- {
- if(num < 0)
- num = 0;
- else if(num > limit - 1)
- num = limit - 1;
- }
- inline cv::Mat extractImage(const cv::Mat &in, float cx, float cy, float patch_width, float patch_height)
- {
- float xs_s = floor(cx) - floor(patch_width / 2);
- RectTools::cutOutsize(xs_s, in.cols);
- float xs_e = floor(cx + patch_width - 1) - floor(patch_width / 2);
- RectTools::cutOutsize(xs_e, in.cols);
- float ys_s = floor(cy) - floor(patch_height / 2);
- RectTools::cutOutsize(ys_s, in.rows);
- float ys_e = floor(cy + patch_height - 1) - floor(patch_height / 2);
- RectTools::cutOutsize(ys_e, in.rows);
- return in(cv::Rect(xs_s, ys_s, xs_e - xs_s, ys_e - ys_s));
- }
- inline cv::Mat getGrayImage(cv::Mat img)
- {
- cv::cvtColor(img, img, CV_BGR2GRAY);
- img.convertTo(img, CV_32F, 1 / 255.f);
- return img;
- }
- }
- tracker.h
- #pragma once
- #include <opencv2/opencv.hpp>
- #include <string>
- class Tracker
- {
- public:
- Tracker() {}
- virtual ~Tracker() { }
- virtual void init(const cv::Rect &roi, cv::Mat image) = 0;
- virtual cv::Rect update( cv::Mat image)=0;
- protected:
- cv::Rect_<float> _roi;
- };
- fhog.cpp
- #include "fhog.hpp"
- #ifdef HAVE_TBB
- #include <tbb/tbb.h>
- #include "tbb/parallel_for.h"
- #include "tbb/blocked_range.h"
- #endif
- #ifndef max
- #define max(a,b) (((a) > (b)) ? (a) : (b))
- #endif
- #ifndef min
- #define min(a,b) (((a) < (b)) ? (a) : (b))
- #endif
- /*
- // Getting feature map for the selected subimage
- //
- // API
- // int getFeatureMaps(const IplImage * image, const int k, featureMap **map);
- // INPUT
- // image - selected subimage
- // k - size of cells
- // OUTPUT
- // map - feature map
- // RESULT
- // Error status
- */
- int getFeatureMaps(const IplImage* image, const int k, CvLSVMFeatureMapCaskade **map)
- {
- int sizeX, sizeY;
- int p, px, stringSize;
- int height, width, numChannels;
- int i, j, kk, c, ii, jj, d;
- float * datadx, * datady;
- int ch;
- float magnitude, x, y, tx, ty;
- IplImage * dx, * dy;
- int *nearest;
- float *w, a_x, b_x;
- float kernel[3] = {-1.f, 0.f, 1.f};
- CvMat kernel_dx = cvMat(1, 3, CV_32F, kernel);
- CvMat kernel_dy = cvMat(3, 1, CV_32F, kernel);
- float * r;
- int * alfa;
- float boundary_x[NUM_SECTOR + 1];
- float boundary_y[NUM_SECTOR + 1];
- float max, dotProd;
- int maxi;
- height = image->height;
- width = image->width ;
- numChannels = image->nChannels;
- dx = cvCreateImage(cvSize(image->width, image->height),
- IPL_DEPTH_32F, 3);
- dy = cvCreateImage(cvSize(image->width, image->height),
- IPL_DEPTH_32F, 3);
- sizeX = width / k;
- sizeY = height / k;
- px = 3 * NUM_SECTOR;
- p = px;
- stringSize = sizeX * p;
- allocFeatureMapObject(map, sizeX, sizeY, p);
- cvFilter2D(image, dx, &kernel_dx, cvPoint(-1, 0));
- cvFilter2D(image, dy, &kernel_dy, cvPoint(0, -1));
- float arg_vector;
- for(i = 0; i <= NUM_SECTOR; i++)
- {
- arg_vector = ( (float) i ) * ( (float)(PI) / (float)(NUM_SECTOR) );
- boundary_x[i] = cosf(arg_vector);
- boundary_y[i] = sinf(arg_vector);
- }/*for(i = 0; i <= NUM_SECTOR; i++) */
- r = (float *)malloc( sizeof(float) * (width * height));
- alfa = (int *)malloc( sizeof(int ) * (width * height * 2));
- for(j = 1; j < height - 1; j++)
- {
- datadx = (float*)(dx->imageData + dx->widthStep * j);
- datady = (float*)(dy->imageData + dy->widthStep * j);
- for(i = 1; i < width - 1; i++)
- {
- c = 0;
- x = (datadx[i * numChannels + c]);
- y = (datady[i * numChannels + c]);
- r[j * width + i] =sqrtf(x * x + y * y);
- for(ch = 1; ch < numChannels; ch++)
- {
- tx = (datadx[i * numChannels + ch]);
- ty = (datady[i * numChannels + ch]);
- magnitude = sqrtf(tx * tx + ty * ty);
- if(magnitude > r[j * width + i])
- {
- r[j * width + i] = magnitude;
- c = ch;
- x = tx;
- y = ty;
- }
- }/*for(ch = 1; ch < numChannels; ch++)*/
- max = boundary_x[0] * x + boundary_y[0] * y;
- maxi = 0;
- for (kk = 0; kk < NUM_SECTOR; kk++)
- {
- dotProd = boundary_x[kk] * x + boundary_y[kk] * y;
- if (dotProd > max)
- {
- max = dotProd;
- maxi = kk;
- }
- else
- {
- if (-dotProd > max)
- {
- max = -dotProd;
- maxi = kk + NUM_SECTOR;
- }
- }
- }
- alfa[j * width * 2 + i * 2 ] = maxi % NUM_SECTOR;
- alfa[j * width * 2 + i * 2 + 1] = maxi;
- }/*for(i = 0; i < width; i++)*/
- }/*for(j = 0; j < height; j++)*/
- nearest = (int *)malloc(sizeof(int ) * k);
- w = (float*)malloc(sizeof(float) * (k * 2));
- for(i = 0; i < k / 2; i++)
- {
- nearest[i] = -1;
- }/*for(i = 0; i < k / 2; i++)*/
- for(i = k / 2; i < k; i++)
- {
- nearest[i] = 1;
- }/*for(i = k / 2; i < k; i++)*/
- for(j = 0; j < k / 2; j++)
- {
- b_x = k / 2 + j + 0.5f;
- a_x = k / 2 - j - 0.5f;
- w[j * 2 ] = 1.0f/a_x * ((a_x * b_x) / ( a_x + b_x));
- w[j * 2 + 1] = 1.0f/b_x * ((a_x * b_x) / ( a_x + b_x));
- }/*for(j = 0; j < k / 2; j++)*/
- for(j = k / 2; j < k; j++)
- {
- a_x = j - k / 2 + 0.5f;
- b_x =-j + k / 2 - 0.5f + k;
- w[j * 2 ] = 1.0f/a_x * ((a_x * b_x) / ( a_x + b_x));
- w[j * 2 + 1] = 1.0f/b_x * ((a_x * b_x) / ( a_x + b_x));
- }/*for(j = k / 2; j < k; j++)*/
- for(i = 0; i < sizeY; i++)
- {
- for(j = 0; j < sizeX; j++)
- {
- for(ii = 0; ii < k; ii++)
- {
- for(jj = 0; jj < k; jj++)
- {
- if ((i * k + ii > 0) &&
- (i * k + ii < height - 1) &&
- (j * k + jj > 0) &&
- (j * k + jj < width - 1))
- {
- d = (k * i + ii) * width + (j * k + jj);
- (*map)->map[ i * stringSize + j * (*map)->numFeatures + alfa[d * 2 ]] +=
- r[d] * w[ii * 2] * w[jj * 2];
- (*map)->map[ i * stringSize + j * (*map)->numFeatures + alfa[d * 2 + 1] + NUM_SECTOR] +=
- r[d] * w[ii * 2] * w[jj * 2];
- if ((i + nearest[ii] >= 0) &&
- (i + nearest[ii] <= sizeY - 1))
- {
- (*map)->map[(i + nearest[ii]) * stringSize + j * (*map)->numFeatures + alfa[d * 2 ] ] +=
- r[d] * w[ii * 2 + 1] * w[jj * 2 ];
- (*map)->map[(i + nearest[ii]) * stringSize + j * (*map)->numFeatures + alfa[d * 2 + 1] + NUM_SECTOR] +=
- r[d] * w[ii * 2 + 1] * w[jj * 2 ];
- }
- if ((j + nearest[jj] >= 0) &&
- (j + nearest[jj] <= sizeX - 1))
- {
- (*map)->map[i * stringSize + (j + nearest[jj]) * (*map)->numFeatures + alfa[d * 2 ] ] +=
- r[d] * w[ii * 2] * w[jj * 2 + 1];
- (*map)->map[i * stringSize + (j + nearest[jj]) * (*map)->numFeatures + alfa[d * 2 + 1] + NUM_SECTOR] +=
- r[d] * w[ii * 2] * w[jj * 2 + 1];
- }
- if ((i + nearest[ii] >= 0) &&
- (i + nearest[ii] <= sizeY - 1) &&
- (j + nearest[jj] >= 0) &&
- (j + nearest[jj] <= sizeX - 1))
- {
- (*map)->map[(i + nearest[ii]) * stringSize + (j + nearest[jj]) * (*map)->numFeatures + alfa[d * 2 ] ] +=
- r[d] * w[ii * 2 + 1] * w[jj * 2 + 1];
- (*map)->map[(i + nearest[ii]) * stringSize + (j + nearest[jj]) * (*map)->numFeatures + alfa[d * 2 + 1] + NUM_SECTOR] +=
- r[d] * w[ii * 2 + 1] * w[jj * 2 + 1];
- }
- }
- }/*for(jj = 0; jj < k; jj++)*/
- }/*for(ii = 0; ii < k; ii++)*/
- }/*for(j = 1; j < sizeX - 1; j++)*/
- }/*for(i = 1; i < sizeY - 1; i++)*/
- cvReleaseImage(&dx);
- cvReleaseImage(&dy);
- free(w);
- free(nearest);
- free(r);
- free(alfa);
- return LATENT_SVM_OK;
- }
- /*
- // Feature map Normalization and Truncation
- //
- // API
- // int normalizeAndTruncate(featureMap *map, const float alfa);
- // INPUT
- // map - feature map
- // alfa - truncation threshold
- // OUTPUT
- // map - truncated and normalized feature map
- // RESULT
- // Error status
- */
- int normalizeAndTruncate(CvLSVMFeatureMapCaskade *map, const float alfa)
- {
- int i,j, ii;
- int sizeX, sizeY, p, pos, pp, xp, pos1, pos2;
- float * partOfNorm; // norm of C(i, j)
- float * newData;
- float valOfNorm;
- sizeX = map->sizeX;
- sizeY = map->sizeY;
- partOfNorm = (float *)malloc (sizeof(float) * (sizeX * sizeY));
- p = NUM_SECTOR;
- xp = NUM_SECTOR * 3;
- pp = NUM_SECTOR * 12;
- for(i = 0; i < sizeX * sizeY; i++)
- {
- valOfNorm = 0.0f;
- pos = i * map->numFeatures;
- for(j = 0; j < p; j++)
- {
- valOfNorm += map->map[pos + j] * map->map[pos + j];
- }/*for(j = 0; j < p; j++)*/
- partOfNorm[i] = valOfNorm;
- }/*for(i = 0; i < sizeX * sizeY; i++)*/
- sizeX -= 2;
- sizeY -= 2;
- newData = (float *)malloc (sizeof(float) * (sizeX * sizeY * pp));
- //normalization
- for(i = 1; i <= sizeY; i++)
- {
- for(j = 1; j <= sizeX; j++)
- {
- valOfNorm = sqrtf(
- partOfNorm[(i )*(sizeX + 2) + (j )] +
- partOfNorm[(i )*(sizeX + 2) + (j + 1)] +
- partOfNorm[(i + 1)*(sizeX + 2) + (j )] +
- partOfNorm[(i + 1)*(sizeX + 2) + (j + 1)]) + FLT_EPSILON;
- pos1 = (i ) * (sizeX + 2) * xp + (j ) * xp;
- pos2 = (i-1) * (sizeX ) * pp + (j-1) * pp;
- for(ii = 0; ii < p; ii++)
- {
- newData[pos2 + ii ] = map->map[pos1 + ii ] / valOfNorm;
- }/*for(ii = 0; ii < p; ii++)*/
- for(ii = 0; ii < 2 * p; ii++)
- {
- newData[pos2 + ii + p * 4] = map->map[pos1 + ii + p] / valOfNorm;
- }/*for(ii = 0; ii < 2 * p; ii++)*/
- valOfNorm = sqrtf(
- partOfNorm[(i )*(sizeX + 2) + (j )] +
- partOfNorm[(i )*(sizeX + 2) + (j + 1)] +
- partOfNorm[(i - 1)*(sizeX + 2) + (j )] +
- partOfNorm[(i - 1)*(sizeX + 2) + (j + 1)]) + FLT_EPSILON;
- for(ii = 0; ii < p; ii++)
- {
- newData[pos2 + ii + p ] = map->map[pos1 + ii ] / valOfNorm;
- }/*for(ii = 0; ii < p; ii++)*/
- for(ii = 0; ii < 2 * p; ii++)
- {
- newData[pos2 + ii + p * 6] = map->map[pos1 + ii + p] / valOfNorm;
- }/*for(ii = 0; ii < 2 * p; ii++)*/
- valOfNorm = sqrtf(
- partOfNorm[(i )*(sizeX + 2) + (j )] +
- partOfNorm[(i )*(sizeX + 2) + (j - 1)] +
- partOfNorm[(i + 1)*(sizeX + 2) + (j )] +
- partOfNorm[(i + 1)*(sizeX + 2) + (j - 1)]) + FLT_EPSILON;
- for(ii = 0; ii < p; ii++)
- {
- newData[pos2 + ii + p * 2] = map->map[pos1 + ii ] / valOfNorm;
- }/*for(ii = 0; ii < p; ii++)*/
- for(ii = 0; ii < 2 * p; ii++)
- {
- newData[pos2 + ii + p * 8] = map->map[pos1 + ii + p] / valOfNorm;
- }/*for(ii = 0; ii < 2 * p; ii++)*/
- valOfNorm = sqrtf(
- partOfNorm[(i )*(sizeX + 2) + (j )] +
- partOfNorm[(i )*(sizeX + 2) + (j - 1)] +
- partOfNorm[(i - 1)*(sizeX + 2) + (j )] +
- partOfNorm[(i - 1)*(sizeX + 2) + (j - 1)]) + FLT_EPSILON;
- for(ii = 0; ii < p; ii++)
- {
- newData[pos2 + ii + p * 3 ] = map->map[pos1 + ii ] / valOfNorm;
- }/*for(ii = 0; ii < p; ii++)*/
- for(ii = 0; ii < 2 * p; ii++)
- {
- newData[pos2 + ii + p * 10] = map->map[pos1 + ii + p] / valOfNorm;
- }/*for(ii = 0; ii < 2 * p; ii++)*/
- }/*for(j = 1; j <= sizeX; j++)*/
- }/*for(i = 1; i <= sizeY; i++)*/
- //truncation
- for(i = 0; i < sizeX * sizeY * pp; i++)
- {
- if(newData [i] > alfa) newData [i] = alfa;
- }/*for(i = 0; i < sizeX * sizeY * pp; i++)*/
- // swap data
- map->numFeatures = pp;
- map->sizeX = sizeX;
- map->sizeY = sizeY;
- free (map->map);
- free (partOfNorm);
- map->map = newData;
- return LATENT_SVM_OK;
- }
- /*
- // Feature map reduction
- // In each cell we reduce dimension of the feature vector
- // according to original paper special procedure
- //
- // API
- // int PCAFeatureMaps(featureMap *map)
- // INPUT
- // map - feature map
- // OUTPUT
- // map - feature map
- // RESULT
- // Error status
- */
- int PCAFeatureMaps(CvLSVMFeatureMapCaskade *map)
- {
- int i,j, ii, jj, k;
- int sizeX, sizeY, p, pp, xp, yp, pos1, pos2;
- float * newData;
- float val;
- float nx, ny;
- sizeX = map->sizeX;
- sizeY = map->sizeY;
- p = map->numFeatures;
- pp = NUM_SECTOR * 3 + 4;
- yp = 4;
- xp = NUM_SECTOR;
- nx = 1.0f / sqrtf((float)(xp * 2));
- ny = 1.0f / sqrtf((float)(yp ));
- newData = (float *)malloc (sizeof(float) * (sizeX * sizeY * pp));
- for(i = 0; i < sizeY; i++)
- {
- for(j = 0; j < sizeX; j++)
- {
- pos1 = ((i)*sizeX + j)*p;
- pos2 = ((i)*sizeX + j)*pp;
- k = 0;
- for(jj = 0; jj < xp * 2; jj++)
- {
- val = 0;
- for(ii = 0; ii < yp; ii++)
- {
- val += map->map[pos1 + yp * xp + ii * xp * 2 + jj];
- }/*for(ii = 0; ii < yp; ii++)*/
- newData[pos2 + k] = val * ny;
- k++;
- }/*for(jj = 0; jj < xp * 2; jj++)*/
- for(jj = 0; jj < xp; jj++)
- {
- val = 0;
- for(ii = 0; ii < yp; ii++)
- {
- val += map->map[pos1 + ii * xp + jj];
- }/*for(ii = 0; ii < yp; ii++)*/
- newData[pos2 + k] = val * ny;
- k++;
- }/*for(jj = 0; jj < xp; jj++)*/
- for(ii = 0; ii < yp; ii++)
- {
- val = 0;
- for(jj = 0; jj < 2 * xp; jj++)
- {
- val += map->map[pos1 + yp * xp + ii * xp * 2 + jj];
- }/*for(jj = 0; jj < xp; jj++)*/
- newData[pos2 + k] = val * nx;
- k++;
- } /*for(ii = 0; ii < yp; ii++)*/
- }/*for(j = 0; j < sizeX; j++)*/
- }/*for(i = 0; i < sizeY; i++)*/
- // swap data
- map->numFeatures = pp;
- free (map->map);
- map->map = newData;
- return LATENT_SVM_OK;
- }
- //modified from "lsvmc_routine.cpp"
- int allocFeatureMapObject(CvLSVMFeatureMapCaskade **obj, const int sizeX,
- const int sizeY, const int numFeatures)
- {
- int i;
- (*obj) = (CvLSVMFeatureMapCaskade *)malloc(sizeof(CvLSVMFeatureMapCaskade));
- (*obj)->sizeX = sizeX;
- (*obj)->sizeY = sizeY;
- (*obj)->numFeatures = numFeatures;
- (*obj)->map = (float *) malloc(sizeof (float) *
- (sizeX * sizeY * numFeatures));
- for(i = 0; i < sizeX * sizeY * numFeatures; i++)
- {
- (*obj)->map[i] = 0.0f;
- }
- return LATENT_SVM_OK;
- }
- int freeFeatureMapObject (CvLSVMFeatureMapCaskade **obj)
- {
- if(*obj == NULL) return LATENT_SVM_MEM_NULL;
- free((*obj)->map);
- free(*obj);
- (*obj) = NULL;
- return LATENT_SVM_OK;
- }
- fhog.hpp
- #ifndef _FHOG_H_
- #define _FHOG_H_
- #include <stdio.h>
- //#include "_lsvmc_types.h"
- //#include "_lsvmc_error.h"
- //#include "_lsvmc_routine.h"
- //#include "opencv2/imgproc.hpp"
- #include "opencv2/imgproc/imgproc_c.h"
- //modified from "_lsvmc_types.h"
- // DataType: STRUCT featureMap
- // FEATURE MAP DESCRIPTION
- // Rectangular map (sizeX x sizeY),
- // every cell stores feature vector (dimension = numFeatures)
- // map - matrix of feature vectors
- // to set and get feature vectors (i,j)
- // used formula map[(j * sizeX + i) * p + k], where
- // k - component of feature vector in cell (i, j)
- typedef struct{
- int sizeX;
- int sizeY;
- int numFeatures;
- float *map;
- } CvLSVMFeatureMapCaskade;
- #include "float.h"
- #define PI CV_PI
- #define EPS 0.000001
- #define F_MAX FLT_MAX
- #define F_MIN -FLT_MAX
- // The number of elements in bin
- // The number of sectors in gradient histogram building
- #define NUM_SECTOR 9
- // The number of levels in image resize procedure
- // We need Lambda levels to resize image twice
- #define LAMBDA 10
- // Block size. Used in feature pyramid building procedure
- #define SIDE_LENGTH 8
- #define VAL_OF_TRUNCATE 0.2f
- //modified from "_lsvm_error.h"
- #define LATENT_SVM_OK 0
- #define LATENT_SVM_MEM_NULL 2
- #define DISTANCE_TRANSFORM_OK 1
- #define DISTANCE_TRANSFORM_GET_INTERSECTION_ERROR -1
- #define DISTANCE_TRANSFORM_ERROR -2
- #define DISTANCE_TRANSFORM_EQUAL_POINTS -3
- #define LATENT_SVM_GET_FEATURE_PYRAMID_FAILED -4
- #define LATENT_SVM_SEARCH_OBJECT_FAILED -5
- #define LATENT_SVM_FAILED_SUPERPOSITION -6
- #define FILTER_OUT_OF_BOUNDARIES -7
- #define LATENT_SVM_TBB_SCHEDULE_CREATION_FAILED -8
- #define LATENT_SVM_TBB_NUMTHREADS_NOT_CORRECT -9
- #define FFT_OK 2
- #define FFT_ERROR -10
- #define LSVM_PARSER_FILE_NOT_FOUND -11
- /*
- // Getting feature map for the selected subimage
- //
- // API
- // int getFeatureMaps(const IplImage * image, const int k, featureMap **map);
- // INPUT
- // image - selected subimage
- // k - size of cells
- // OUTPUT
- // map - feature map
- // RESULT
- // Error status
- */
- int getFeatureMaps(const IplImage * image, const int k, CvLSVMFeatureMapCaskade **map);
- /*
- // Feature map Normalization and Truncation
- //
- // API
- // int normalizationAndTruncationFeatureMaps(featureMap *map, const float alfa);
- // INPUT
- // map - feature map
- // alfa - truncation threshold
- // OUTPUT
- // map - truncated and normalized feature map
- // RESULT
- // Error status
- */
- int normalizeAndTruncate(CvLSVMFeatureMapCaskade *map, const float alfa);
- /*
- // Feature map reduction
- // In each cell we reduce dimension of the feature vector
- // according to original paper special procedure
- //
- // API
- // int PCAFeatureMaps(featureMap *map)
- // INPUT
- // map - feature map
- // OUTPUT
- // map - feature map
- // RESULT
- // Error status
- */
- int PCAFeatureMaps(CvLSVMFeatureMapCaskade *map);
- //modified from "lsvmc_routine.h"
- int allocFeatureMapObject(CvLSVMFeatureMapCaskade **obj, const int sizeX, const int sizeY,
- const int p);
- int freeFeatureMapObject (CvLSVMFeatureMapCaskade **obj);
- #endif
- labdata.hpp
- const int nClusters = 15;
- float data[nClusters][3] = {
- {161.317504, 127.223401, 128.609333},
- {142.922425, 128.666965, 127.532319},
- {67.879757, 127.721830, 135.903311},
- {92.705062, 129.965717, 137.399500},
- {120.172257, 128.279647, 127.036493},
- {195.470568, 127.857070, 129.345415},
- {41.257102, 130.059468, 132.675336},
- {12.014861, 129.480555, 127.064714},
- {226.567086, 127.567831, 136.345727},
- {154.664210, 131.676606, 156.481669},
- {121.180447, 137.020793, 153.433743},
- {87.042204, 137.211742, 98.614874},
- {113.809537, 106.577104, 157.818094},
- {81.083293, 170.051905, 148.904079},
- {45.015485, 138.543124, 102.402528}};