
Perception

Perception is the extraction of information from raw sensory data.

The goals of perception can include:

  • Object recognition (detection), e.g.:
    • Pedestrian, cyclist, vehicle recognition
    • Traffic sign recognition, traffic light recognition
    • Drivable surface and lane recognition (also for localization and planning)
  • Object classification:
    • Classifying already recognized objects. For example, determining the color of a traffic light, or distinguishing between a van and a horse-drawn carriage.
  • Object tracking and prediction:
    • Determining the past paths of vehicles and pedestrians and predicting their future paths. This can be related to classification, as a horse-drawn carriage, despite being similar in size to a trailer, has different acceleration capabilities. This information can be used to plan routes and trajectories.
  • Localization and mapping:
    • SLAM (Simultaneous Localization and Mapping): localization without GNSS, combined with building a local map. LOAM (LIDAR Odometry and Mapping): LIDAR-based odometry.

Based on the sensors used, perception can involve:

  • LIDAR
  • Camera
  • Radar
  • IMU
  • GNSS/GPS
  • Microphone
  • Any combination of the above sensors

Danger

In Hungarian, it is easy to confuse the terms érzékelés (sensing) and észlelés (perception). Perception is the more complex function: it produces processed, interpreted output from raw data.

flowchart LR

L[Planning]:::light

subgraph Perception [Perception]
  T[Mapping]:::light 
  H[Localization]:::light
  P[Object Prediction]:::light
  D[Object Detection]:::light
  K[Object Classification]:::light
  D-->K
end
subgraph Sensing [Sensing]
  GPS[GPS/GNSS]:::light -.-> T
  GPS -.-> H
  LIDAR[LIDAR]:::light
  KAM[Camera]:::light
  IMU[IMU]:::light
  LIDAR -.-> D
  LIDAR -.-> P
  LIDAR -.-> T
  KAM-.-> P
  KAM-.-> D
  IMU-.-> T
  D-.->P
end

T -->|map| L
H -->|pose| L
P -->|obj.| L
K -->|obj.| L

classDef light fill:#34aec5,stroke:#152742,stroke-width:2px,color:#152742  
classDef dark fill:#152742,stroke:#34aec5,stroke-width:2px,color:#34aec5
classDef white fill:#ffffff,stroke:#152742,stroke-width:2px,color:#152742
classDef red fill:#ef4638,stroke:#152742,stroke-width:2px,color:#fff

This material is based on the Autonomous Driving Software Engineering course at TU Munich, compiled by the staff of the Institute of Automotive Technology. The lecture video is available in German.

Challenges and Difficulties

Several challenges can hinder recognition and its accuracy:

  • Weather (rain, snow, fog, ...)
  • Time of day (night, sunset, sunrise, ...)
  • Occlusion (objects are only partially visible)
  • Computation time (increasingly critical at higher speeds, where less time remains to react)
  • Different environments (urban, highway, forested areas, ...)

Use Cases

Since it would be difficult to demonstrate every aspect of perception, we will instead showcase a few use cases.

Camera-based Traffic Light Classification

Camera images are processed with a neural network (YOLOv7) to detect traffic lights and classify their state.
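The detection itself comes from a detector such as YOLOv7; as a self-contained illustration of only the classification step, the sketch below votes on the lamp color with simple HSV thresholds over an already-detected bounding box. The `classify_light` helper and all threshold values are illustrative assumptions, not code from the course.

```python
# Minimal sketch: classify the state of an already-detected traffic light.
# Assumes a detector (e.g., YOLOv7) has supplied the bounding box; the HSV
# ranges below are illustrative and would need tuning on real footage.
import cv2
import numpy as np

def classify_light(image_bgr, box):
    """Return 'red', 'yellow' or 'green' for the traffic light in `box`.

    box: (x1, y1, x2, y2) pixel coordinates from the detector.
    """
    x1, y1, x2, y2 = box
    roi = image_bgr[y1:y2, x1:x2]
    hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)

    # Count pixels falling into each (assumed) color range; red wraps
    # around the hue axis, so it needs two ranges.
    masks = {
        "red": cv2.inRange(hsv, (0, 100, 100), (10, 255, 255))
               | cv2.inRange(hsv, (170, 100, 100), (180, 255, 255)),
        "yellow": cv2.inRange(hsv, (20, 100, 100), (35, 255, 255)),
        "green": cv2.inRange(hsv, (45, 100, 100), (95, 255, 255)),
    }
    return max(masks, key=lambda color: int(np.count_nonzero(masks[color])))
```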

LIDAR-based Simple Height Filtering

A task often encountered in practice is simple LIDAR filtering based on X, Y, and Z coordinates. Since LIDAR provides a direct 3D representation of the environment, it can be easier to work with than a camera. A common technique is to filter the road level out of the LIDAR data (ground segmentation), with the remaining (non-ground) points representing all objects. Here we demonstrate a much simpler technique: cropping the point cloud by coordinate bounds.
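A minimal sketch of such a crop, assuming the cloud arrives as an N×3 NumPy array in the vehicle frame; all bounds are placeholder values:

```python
# Minimal sketch: keep only points inside an axis-aligned box around the
# vehicle. Assumes `cloud` is an (N, 3) array of x, y, z in meters; the
# bounds are placeholders, not values from the course.
import numpy as np

def box_filter(cloud, x_lim=(0.0, 40.0), y_lim=(-10.0, 10.0), z_lim=(-1.2, 1.0)):
    x, y, z = cloud[:, 0], cloud[:, 1], cloud[:, 2]
    keep = (
        (x_lim[0] <= x) & (x <= x_lim[1])
        & (y_lim[0] <= y) & (y <= y_lim[1])
        & (z_lim[0] <= z) & (z <= z_lim[1])
    )
    return cloud[keep]

# Example: random cloud, keep only points inside the (assumed) box.
cloud = np.random.uniform(-50, 50, size=(10000, 3))
filtered = box_filter(cloud)
```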

Clustering

After filtering the road level out of the LIDAR data (ground segmentation), the points are split into ground and non-ground sets. The non-ground points then need to be clustered into groups that each describe one object. Clustering relies on the fact that the points of a given object (e.g., a car) lie close to each other.

Source: codeahoy.com
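A minimal sketch of such Euclidean clustering, here using scikit-learn's DBSCAN (an assumed choice; the examples above may use a different implementation); `eps` is the largest gap allowed between points of the same object:

```python
# Minimal sketch: group non-ground LIDAR points into objects with DBSCAN.
# scikit-learn is an assumed dependency; eps/min_samples are placeholders.
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_objects(non_ground, eps=0.7, min_samples=10):
    """Return {cluster_id: (M, 3) points}; DBSCAN noise (label -1) is dropped."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(non_ground)
    return {
        int(label): non_ground[labels == label]
        for label in np.unique(labels)
        if label != -1
    }
```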

Sensor Fusion

The following video demonstrates perception through a real-life example.

LIDAR-based Road Surface / Curb Detection

An algorithm developed by our university.

LIDAR-based Object Tracking and Prediction
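As a hedged illustration of the idea (not necessarily the method shown here): a constant-velocity Kalman filter is a common baseline for tracking a detected object and rolling its motion model forward to predict a short-term path. The state layout and all noise values below are textbook defaults chosen for the sketch.

```python
# Minimal sketch: constant-velocity Kalman filter for one tracked object.
# State is [x, y, vx, vy]; all noise parameters are illustrative defaults.
import numpy as np

class CVTrack:
    def __init__(self, x, y, dt=0.1):
        self.x = np.array([x, y, 0.0, 0.0])         # state estimate
        self.P = np.eye(4) * 10.0                   # state covariance
        self.F = np.eye(4)                          # constant-velocity model
        self.F[0, 2] = self.F[1, 3] = dt
        self.H = np.eye(2, 4)                       # we observe position only
        self.Q = np.eye(4) * 0.01                   # process noise (assumed)
        self.R = np.eye(2) * 0.25                   # measurement noise (assumed)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2].copy()                    # predicted position

    def update(self, z):
        y = np.asarray(z) - self.H @ self.x         # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)    # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P

# Example: initialize from a detection, correct with the next one,
# then roll the motion model forward to predict the next second.
track = CVTrack(5.0, 2.0)
track.update(np.array([5.1, 2.1]))
future_path = [track.predict() for _ in range(10)]
```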

SLAM LIDAR and Camera Fusion

Simultaneous Localization and Mapping (SLAM) is the problem of estimating the pose of a moving system (robot or vehicle) while simultaneously building a map of the environment it is navigating.
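A minimal sketch of one geometric building block of LIDAR SLAM front-ends, scan-to-scan alignment with ICP (iterative closest point); a full SLAM stack adds keyframing, loop closure, and in this case fusion with camera features, all beyond a short example. Pure NumPy/SciPy, with assumed parameters:

```python
# Minimal sketch: rigid scan-to-scan alignment via ICP, the core of many
# LIDAR odometry/SLAM front-ends. Real systems add robust outlier
# handling, keyframes and loop closure on top of this.
import numpy as np
from scipy.spatial import cKDTree

def icp(source, target, iters=20):
    """Align `source` (N, 3) to `target` (M, 3); returns (R, t)."""
    R, t = np.eye(3), np.zeros(3)
    tree = cKDTree(target)
    src = source.copy()
    for _ in range(iters):
        # 1. Match each source point to its nearest target point.
        _, idx = tree.query(src)
        matched = target[idx]
        # 2. Best rigid transform between the matched sets (Kabsch/SVD).
        mu_s, mu_m = src.mean(axis=0), matched.mean(axis=0)
        U, _, Vt = np.linalg.svd((src - mu_s).T @ (matched - mu_m))
        R_step = Vt.T @ U.T
        if np.linalg.det(R_step) < 0:               # avoid reflections
            Vt[-1] *= -1
            R_step = Vt.T @ U.T
        t_step = mu_m - R_step @ mu_s
        # 3. Apply the step and accumulate the total transform.
        src = src @ R_step.T + t_step
        R, t = R_step @ R, R_step @ t + t_step
    return R, t

# Example: recover a known rotation between two synthetic "scans".
rng = np.random.default_rng(0)
scan = rng.uniform(-10, 10, size=(500, 3))
theta = 0.1
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
R, t = icp(scan, scan @ Rz.T + np.array([0.5, 0.2, 0.0]))
```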

Sources