Contents


  • Data acquisition
  • Data formatting
  • Statistics
  • Examples


  • Data Acquisition


    Sensor Setup
    Our data capturing platform was equipped with the following sensors:

  • 4 Laser scanners: Velodyne HDL-32E
  • 4 Color cameras, 12 Megapixels: Allied Vision Manta G-1236C
  • 4 Lenses, 12 mm: V1228-MPY

  • The vehicle is equipped with four LiDAR scanners, two on each side of the roof, with a roll angle of 45° between them. The cameras are mounted on top of the vehicle in two stereo pairs. The left pair is mounted on an independent bar rotated by 30° to capture the incoming road from the left, while the right pair faces directly forward. The baselines of the left and right stereo pairs are 0.33 m and 0.27 m, respectively. The cameras were triggered by a signal emitted when the second camera from the left started exposing its sensor. Each image was timestamped using the camera's internal clock, and the camera clocks were synchronized via the IEEE 1588-2008 Precision Time Protocol (PTP). The computer clock was synchronized to the camera clocks using the same method. Using these timestamps, we determine which LiDAR returns fall within a given camera frame.
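
    As an illustration of the timestamp-based association described above, the following minimal sketch selects the LiDAR returns whose timestamps lie close to a camera frame timestamp. The function name, the symmetric time window, and the window size are assumptions made for this example; they are not the procedure or the values used to build the dataset.

```python
import numpy as np

def lidar_returns_for_frame(lidar_times, frame_time, half_window):
    """Return indices of LiDAR returns within +/- half_window seconds
    of a camera frame timestamp.

    lidar_times must be sorted ascending and expressed on the same
    (PTP-synchronized) clock as frame_time. half_window is a
    hypothetical tolerance, not a value taken from the dataset.
    """
    lidar_times = np.asarray(lidar_times, dtype=np.float64)
    # Binary search for the first and last returns inside the window.
    first = np.searchsorted(lidar_times, frame_time - half_window, side="left")
    last = np.searchsorted(lidar_times, frame_time + half_window, side="right")
    return np.arange(first, last)

# Toy timestamps in seconds (made up for the example).
lidar_times = np.array([0.00, 0.02, 0.04, 0.06, 0.08, 0.10, 0.12])
idx = lidar_returns_for_frame(lidar_times, frame_time=0.06, half_window=0.025)
print(idx)
```

    Because the timestamps are sorted, the binary search keeps the lookup cheap even for long recordings.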

    Fig. Sensor diagram
    Fig. Our capture vehicle - Ford Fusion


    Calibration
    Camera-to-camera calibration was performed separately for each stereo pair using the Camera Calibration Toolbox for MATLAB [1]. For the camera-to-LiDAR calibration, we manually selected 50 constraints between 3D LiDAR points and 2D pixel locations in both the left and right images and used the Levenberg-Marquardt algorithm to minimize the projection error. The geometry of the sensor setup provided the initial guess for the rigid-body transformation. The correspondences were chosen manually from prominent edges, such as building corners, stop sign poles, and electric poles, which are easy to identify in both the point cloud and the image.

    [1] Bouguet, Jean-Yves. "Matlab camera calibration toolbox." Caltech Technical Report (2000).
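
    The camera-to-LiDAR optimization can be sketched as follows. This is a simplified illustration, not the calibration code used for the dataset: it assumes a single camera with an ideal pinhole model and no lens distortion, a rotation-vector parameterization, and synthetic correspondences. The intrinsic matrix K, the function names, and all numeric values are hypothetical. It uses SciPy's `least_squares` with `method="lm"` as the Levenberg-Marquardt solver.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def project(params, points_3d, K):
    """Project LiDAR points into the image with a pinhole model.

    params = [rx, ry, rz, tx, ty, tz]: rotation vector and translation
    of the rigid-body transform from the LiDAR frame to the camera frame.
    """
    R = Rotation.from_rotvec(params[:3]).as_matrix()
    cam = points_3d @ R.T + params[3:]
    uv = cam @ K.T
    return uv[:, :2] / uv[:, 2:3]

def residuals(params, points_3d, pixels, K):
    """Stacked projection errors (in pixels) over all correspondences."""
    return (project(params, points_3d, K) - pixels).ravel()

def calibrate(points_3d, pixels, K, initial_guess):
    """Refine the transform with Levenberg-Marquardt from an initial guess."""
    result = least_squares(residuals, initial_guess, method="lm",
                           args=(points_3d, pixels, K))
    return result.x

# Synthetic example: 50 correspondences, as in the text, with made-up values.
rng = np.random.default_rng(0)
points_3d = np.column_stack([rng.uniform(-5, 5, 50),
                             rng.uniform(-2, 2, 50),
                             rng.uniform(5, 40, 50)])
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])          # hypothetical intrinsics
true_params = np.array([0.02, -0.03, 0.01, 0.1, -0.2, 0.3])
pixels = project(true_params, points_3d, K)

# In practice the initial guess would come from the sensor geometry.
estimated = calibrate(points_3d, pixels, K, initial_guess=np.zeros(6))
print(estimated)
```

    With real data, the residual vector would also include the constraints from the second image of the stereo pair, and the manually selected pixel locations would carry noise, so the recovered transform would only approximately reproduce the correspondences.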


    Scene Selection
    The capture sites and times were selected to maximize the amount of traffic, the complexity of crossing patterns, and the variation in lighting and weather. To capture complex interactions between pedestrians and vehicles, we focused on 4-way stop intersections without any traffic signals. We selected three intersections around a downtown area, where the pedestrian-camera distance ranges from 5 to 40 m. Lighting conditions vary with cloud cover and the shadows cast by buildings. From the captured frames, we manually selected sequences for annotation based on the observed activity of pedestrians or pedestrian-vehicle interactions.

  • E William St - Maynard St, Ann Arbor, MI
  • S University Ave - Church St, Ann Arbor, MI
  • Catherine St - N 4th Ave, Ann Arbor, MI


  • Data Formatting


    Image Compression
    The raw 12-bit Bayer images were converted into compressed PNG and JPEG formats. To preserve the high dynamic range, we compressed the raw images into 16-bit PNG files. Because of their large file size, however, the 16-bit PNG images are currently not available for download. Please contact us if you need those images.

    For the final release, the raw images were compressed as JPEG with a quality level of 90. We provide both original and rectified images on the Downloads page. Gamma correction was applied when rectifying the images.
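
    The two numeric conversions mentioned above can be sketched as follows: stretching 12-bit samples over the 16-bit range, and gamma-correcting them down to 8 bits before JPEG encoding. This is an illustration only. The gamma value of 2.2, the bit-replication scaling, and the function names are assumptions, and demosaicing, rectification, and the actual PNG/JPEG encoding are omitted.

```python
import numpy as np

def scale_12bit_to_16bit(raw12):
    """Stretch 12-bit values (0..4095) over the full 16-bit range (0..65535).

    Replicating the top four bits into the low bits maps 4095 to 65535
    exactly, which a plain left shift by 4 would not.
    """
    raw12 = np.asarray(raw12, dtype=np.uint16)
    return (raw12 << 4) | (raw12 >> 8)

def gamma_to_8bit(raw12, gamma=2.2):
    """Apply gamma correction to 12-bit values and quantize to 8 bits.

    gamma=2.2 is an assumed value, not one stated for the dataset.
    """
    normalized = np.asarray(raw12, dtype=np.float64) / 4095.0
    corrected = np.power(normalized, 1.0 / gamma)
    return np.round(corrected * 255.0).astype(np.uint8)

# Toy 12-bit samples: black, mid-gray, and full scale.
samples = np.array([0, 2048, 4095], dtype=np.uint16)
as_16bit = scale_12bit_to_16bit(samples)
as_8bit = gamma_to_8bit(samples)
print(as_16bit, as_8bit)
```

    Keeping 16 bits preserves every distinct 12-bit level, whereas the 8-bit path merges levels, which is why gamma correction is typically applied before quantization to spend more of the 8-bit range on darker tones.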




  • Statistics


    Distance
    The histograms below show the distribution of pedestrian distances. The first two histograms plot the distances between pedestrians and the camera centers. Note that the majority of pedestrians are within a distance of 20-35 m.

    Fig. Distribution of distances.
    Fig. Cumulative histogram of distances.
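
    A histogram and a cumulative histogram of this kind could be computed along the following lines. The function name, the array layout (one 3D position per row, in meters), and the bin edges are assumptions for this sketch; the positions are toy values, not dataset annotations.

```python
import numpy as np

def distance_histograms(pedestrian_xyz, camera_center, bin_edges):
    """Compute pedestrian-to-camera distances, their histogram, and the
    cumulative fraction of pedestrians up to each bin's upper edge."""
    offsets = (np.asarray(pedestrian_xyz, dtype=np.float64)
               - np.asarray(camera_center, dtype=np.float64))
    distances = np.linalg.norm(offsets, axis=1)
    counts, _ = np.histogram(distances, bins=bin_edges)
    # Normalize by the number of pedestrians; distances outside the bin
    # range are not counted, so the last value can be below 1.
    cumulative = np.cumsum(counts) / max(len(distances), 1)
    return distances, counts, cumulative

# Toy positions in meters relative to a camera at the origin.
positions = np.array([[3.0, 4.0, 0.0],
                      [0.0, 10.0, 0.0],
                      [6.0, 8.0, 0.0],
                      [0.0, 30.0, 0.0]])
distances, counts, cumulative = distance_histograms(
    positions, camera_center=[0.0, 0.0, 0.0], bin_edges=[0, 5, 10, 20, 40])
print(counts, cumulative)
```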

    Orientation
    The distribution of pedestrian body orientations relative to the world reference frame is shown as a polar histogram. A pedestrian heading straight toward our recording vehicle corresponds to 270°, and a pedestrian walking away from the vehicle corresponds to roughly 90°.

    Fig. Distribution of pedestrian body orientations.
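
    The binning behind a polar histogram like this one can be sketched as below. The bin count of 12 (30° per bin), the function name, and the sample angles are assumptions for illustration; the angle convention follows the text, with 270° for a pedestrian heading toward the vehicle and 90° for one walking away.

```python
import numpy as np

def orientation_histogram(angles_deg, num_bins=12):
    """Count body orientations in equal angular bins covering 0..360 degrees.

    Angles are wrapped into [0, 360) first, so -90 and 270 land in the
    same bin. num_bins=12 is an assumed resolution.
    """
    angles = np.mod(np.asarray(angles_deg, dtype=np.float64), 360.0)
    edges = np.linspace(0.0, 360.0, num_bins + 1)
    counts, _ = np.histogram(angles, bins=edges)
    return edges, counts

# Toy orientations in degrees: three toward the vehicle, two away from it.
angles = [270.0, 275.0, 90.0, -90.0, 450.0]
edges, counts = orientation_histogram(angles)
print(counts)
```

    The resulting counts can be drawn on polar axes, with each bar centered on its bin, to obtain a plot of the kind shown above.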


  • Examples


    examples here