ElecDetec - Detection of unobtrusive, untextured objects

Introduction

This console application is built to detect unobtrusive objects, which are mostly defined by faint edges, in fixed-scale images using computer vision methods. More precisely, the program is part of an algorithm pipeline for post-processing indoor 3D scans. It was designed in particular to detect power sockets and light switches in wall texture images, which are generated by back-projecting a given panoramic image onto planes retrieved from the room geometry. Since the room geometry is known in advance, object detection is performed at a fixed scale.

Method

A given query image is scanned by a sliding window of fixed size. A class label (e.g. socket, switch, or background/no object) is assigned by extracting distinctive image features that are subsequently evaluated by a machine learning algorithm, namely a random forest classifier (Leo Breiman. "Random forests." Machine Learning 45.1 (2001): 5-32). This application implements multiple feature extractors that focus on the overall gradient information, forming a pool of image features passed to the classifier:

  • Histograms of oriented gradients (HOG), following Navneet Dalal and Bill Triggs. "Histograms of oriented gradients for human detection." IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2005.

  • A newly developed feature descriptor modelling mean gradient directions over multiple, randomly chosen subareas. The method of describing image content by means of subareas (also called Haar-like features) is inspired by Paul Viola and Michael Jones. "Rapid object detection using a boosted cascade of simple features." IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2001. The final feature descriptor entries are formed by the differences of mean values of subarea pairs, yielding more distinctive features. This pair representation is also known as second-order Haar-like features, as the pairs model local derivatives (a minimal sketch of this pairwise scheme follows the list).

  • Orientation filters that extract gradients in strictly vertical and horizontal directions. As above, the filtered images are then described by the differences of mean filter responses of randomly chosen subarea pairs.

  • The overall image intensity (gray values) and the CIELUV color space are also modelled by differences of Haar-like feature pairs. In this case, using pairs instead of single subareas improves invariance to different illumination settings.
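
The pairwise subarea features can be computed efficiently from an integral image. The following is a minimal C++/OpenCV sketch of this idea, not the original implementation: the patch dimensions, number of pairs, and subarea sizes are illustrative assumptions, and the sketch operates on gray values only.

#include <opencv2/opencv.hpp>
#include <random>
#include <vector>

// Mean value of a subarea, computed in O(1) from an integral image.
// 'integral' is the CV_64F sum image, one pixel larger than the patch.
static double subareaMean(const cv::Mat& integral, const cv::Rect& r)
{
    const double sum = integral.at<double>(r.y, r.x)
                     + integral.at<double>(r.y + r.height, r.x + r.width)
                     - integral.at<double>(r.y, r.x + r.width)
                     - integral.at<double>(r.y + r.height, r.x);
    return sum / (static_cast<double>(r.width) * r.height);
}

// Second-order Haar-like features: differences of mean values of randomly
// chosen subarea pairs. A fixed seed ensures that training and detection
// use the same subarea layout. 'grayPatch' is assumed to be an 8-bit,
// single-channel, quadratic patch (width == height, at least 8 pixels).
std::vector<float> haarPairFeatures(const cv::Mat& grayPatch, int numPairs, unsigned seed)
{
    cv::Mat integral;
    cv::integral(grayPatch, integral, CV_64F);

    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> coord(0, grayPatch.cols / 2);  // top-left corner
    std::uniform_int_distribution<int> extent(4, grayPatch.cols / 2); // subarea size

    std::vector<float> features;
    features.reserve(numPairs);
    for (int i = 0; i < numPairs; ++i)
    {
        cv::Rect a(coord(rng), coord(rng), extent(rng), extent(rng));
        cv::Rect b(coord(rng), coord(rng), extent(rng), extent(rng));
        features.push_back(static_cast<float>(subareaMean(integral, a)
                                            - subareaMean(integral, b)));
    }
    return features;
}

For the HOG part of the feature pool, OpenCV provides a ready-made implementation via cv::HOGDescriptor.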

A completed sliding window scan yields a class occurrence probability map for each label. These probability maps are further processed by non-maximum suppression to obtain the most likely object positions within a certain region.
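
One common way to realize such a suppression step is a greedy scheme over scored candidate boxes; whether this matches the exact variant used here is not specified, so the following C++/OpenCV sketch is illustrative. The overlap measure mirrors the max_boundingbox_overlap definition from the INI file described below: intersection area divided by union area.

#include <opencv2/opencv.hpp>
#include <algorithm>
#include <vector>

struct Detection { cv::Rect box; float score; int label; };

// Overlap of two bounding boxes: intersection area over union area.
static float overlap(const cv::Rect& a, const cv::Rect& b)
{
    const float inter = static_cast<float>((a & b).area());
    return inter / (static_cast<float>(a.area() + b.area()) - inter);
}

// Greedy non-maximum suppression: keep the highest-scoring detection,
// discard all weaker detections that overlap it too much, and repeat.
std::vector<Detection> nonMaximumSuppression(std::vector<Detection> dets, float maxOverlap)
{
    std::sort(dets.begin(), dets.end(),
              [](const Detection& a, const Detection& b) { return a.score > b.score; });

    std::vector<Detection> kept;
    for (const Detection& d : dets)
    {
        bool suppressed = false;
        for (const Detection& k : kept)
            if (overlap(d.box, k.box) > maxOverlap) { suppressed = true; break; }
        if (!suppressed)
            kept.push_back(d);
    }
    return kept;
}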

Application tutorial

In order to perform object detection on query images, the algorithm has to be trained in advance. Training uses a set of labeled patches: a folder of images named according to a specified convention. These patches should show the corresponding object class under realistic application conditions, i.e. ideally wall textures calculated by back-projecting panoramic images taken from several viewing angles of known 3D scenes.

For practical reasons, the default training set consists of rectified images of power sockets and light switches taken from several positions with a standard camera. Each training image filename has to start with the corresponding class label (a positive integer), separated from the remaining filename by a user-defined character or character string. After training, an XML configuration file is created that contains all the data the algorithm needs to perform object detection on query images.
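
As a minimal illustration of this naming convention, the following C++ snippet extracts the label from a filename. The filenames and the "_" delimiter are hypothetical examples; the actual delimiter is set via the label_delimiter parameter described below.

#include <iostream>
#include <string>

// Extract the class label from a training filename such as "1_socket-front.jpg":
// the label is the positive integer before the first delimiter occurrence.
int parseLabel(const std::string& filename, const std::string& delimiter)
{
    const std::size_t pos = filename.find(delimiter);
    return std::stoi(filename.substr(0, pos)); // std::stoi throws if no number is found
}

int main()
{
    std::cout << parseLabel("1_socket-front.jpg", "_") << '\n';   // prints 1
    std::cout << parseLabel("2_switch-hallway.jpg", "_") << '\n'; // prints 2
}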

The program takes four parameters:

  • -m, --mode (required): execution mode, either "train" for training mode or "detect" for detection mode
  • -c, --config (required): relative path of the XML configuration file that is created in training mode and read in detection mode
  • -d, --dir (required): the image directory containing the training patches or the query images, depending on the mode
  • -i, --ini (optional): relative path to the INI file that specifies application-dependent settings. By default, the program uses the config.ini file.

Assuming the training samples are stored in a directory named trainingset, the algorithm data should be written to a file called data.xml, and the INI file config.ini is present in the root directory, the algorithm is trained by the following command line:

ElecDetec -m train -c data.xml -d trainingset

After the program has finished, the file data.xml exists and the program is ready to perform object detection on texture images located in a folder named queries by executing

ElecDetec -m detect -c data.xml -d queries

The detection results are stored in a directory within the queries folder. For each query image, a graphical visualization (labelled bounding boxes drawn on the original image), an XML file containing all information on each detection, and probability maps for each object label are created. A probability map is an intensity image encoding the occurrence probability of a certain class for each pixel of the query image.
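
Since the probability maps are plain intensity images, downstream tools can consume them with standard image I/O. The following C++/OpenCV sketch is one way to locate the strongest response in such a map; the filename and the assumption of an 8-bit image are illustrative, as the actual name is composed from the suffixes configured in the INI file.

#include <opencv2/opencv.hpp>
#include <iostream>

int main()
{
    // Load a probability map written by the detector (hypothetical filename).
    cv::Mat map = cv::imread("wall_prob_1.png", cv::IMREAD_GRAYSCALE);
    if (map.empty()) return 1;

    // Interpret 8-bit intensities as probabilities in [0, 1] and find the peak.
    cv::Mat prob;
    map.convertTo(prob, CV_32F, 1.0 / 255.0);
    double maxVal;
    cv::Point maxLoc;
    cv::minMaxLoc(prob, nullptr, &maxVal, nullptr, &maxLoc);
    std::cout << "peak probability " << maxVal << " at " << maxLoc << '\n';
    return 0;
}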

INI file parameter specification

The INI file holds several parameters that support easy integration into a larger algorithm pipeline. These parameters are arranged in three sections (an illustrative example file follows the list):

  • detection: Parameters relevant in detection mode influencing the detection results and output files
    • result_directory_name: Name of the created directory containing the detection results
    • write_probability_maps: Boolean value indicating whether probability maps are created for each object label.
    • filename_result_suffix: Suffix of result files that is attached to the original filename
    • prob_map_result_suffix: Suffix of resulting probability maps that is attached to the original filename and the result suffix
    • max_boundingbox_overlap: Maximum bounding box overlap between two detections, used for non-maximum suppression. The overlap is defined as the ratio between the intersection area and the union area of the corresponding bounding boxes.
    • detection_default_threshold: The default minimum probability a detection has to reach to be counted as an actual detection. This value can be used to adjust the trade-off between precision and recall.
    • detection_label_thresholds: Label-specific thresholds. These values must be consistent with the parameter detection_labels.
    • detection_labels: This parameter assigns the specific thresholds from above to a certain label. If a label is not listed, the default threshold is used for the class decision.
  • training: Settings for the training mode, covering the filename convention of the training images and the number of bootstrap stages.
    • label_delimiter: Character or character string that separates the label from the remaining filename of the training images.
    • max_bootstrap_stages: The maximum number of bootstrapping stages performed in order to reach a minimum training error. A high number may lead to disproportionate training effort, whereas a low number may decrease the classification performance.
  • common: Specifications for both modes. These settings must not be altered between training and detection.
    • patch_window_size: Edge length of the quadratic search window and thus of the object size (including additional context). For efficiency reasons it is recommended that the size of the training images exactly matches this setting. If a training image does not match this size, it is resized internally during training.
    • file_extentions: Supported image formats. All other files in the query or training directory are ignored. Please be aware that your OpenCV installation must support the given formats.
    • background_label: The label used for the background class.
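
For orientation, a hypothetical config.ini combining the parameters above could look as follows. All values are illustrative and not the defaults shipped with the program; in particular, the list syntax for thresholds, labels, and extensions is an assumption.

[detection]
result_directory_name = results
write_probability_maps = true
filename_result_suffix = _result
prob_map_result_suffix = _prob
max_boundingbox_overlap = 0.3
detection_default_threshold = 0.5
detection_label_thresholds = 0.6, 0.7
detection_labels = 1, 2

[training]
label_delimiter = _
max_bootstrap_stages = 3

[common]
patch_window_size = 96
file_extentions = jpg, png
background_label = 0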
