Claims

1. A method for 3D reconstruction of a scene, wherein an event camera (1) is moved on a trajectory (T) along the scene, wherein the event camera (1) comprises a plurality of pixels that are configured to only output events (e_{k}) in presence of brightness changes in the scene at the time (t_{k}) they occur, wherein each event comprises the time (t_{k}) at which it occurred, an address (x_{k}, y_{k}) of the respective pixel that detected the brightness change, as well as a polarity value (p_{k}) indicating the sign of the brightness change, wherein a plurality of successive events generated by the event camera (1) along said trajectory (T) are back-projected according to the viewpoint (P) of the event camera (1) as viewing rays (R) through a discretized volume (DSI) at a reference viewpoint (RV) of a virtual event camera associated to said plurality of events, wherein said discretized volume (DSI) comprises voxels (V), and wherein a score function f(X) associated to the discretized volume (DSI) is determined, which score function f(X) is the number of back-projected viewing rays (R) that pass through the respective voxel (V) with center X, and wherein said score function f(X) is used to determine whether or not a 3D point of the 3D reconstruction of the scene is present in the respective voxel (V).
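The ray-counting score of claim 1 can be sketched as follows. This is a minimal illustration, not the claimed implementation: events are reduced to their pixel address, the DSI is held as a sparse mapping from voxel index to ray count, and `project(event, plane)` is a hypothetical helper that maps an event to the (x, y) pixel of the virtual camera at the reference viewpoint for a given depth plane, using the known camera pose at the event's timestamp.

```python
from collections import defaultdict

def accumulate_dsi(events, num_planes, project):
    """Back-project events as viewing rays through the discretized
    volume (DSI) and count, per voxel, how many rays pass through it
    (the score f(X)). `project` is a hypothetical projection helper."""
    score = defaultdict(int)  # sparse f(X), keyed by voxel index (x, y, z)
    for ev in events:
        for z in range(num_planes):
            x, y = project(ev, z)
            score[(x, y, z)] += 1  # one more ray crosses this voxel
    return score
```

With a trivial identity projection (i.e., a stationary camera), rays from events at the same pixel pile up along that pixel's entire voxel row, which is why real use requires a moving camera with known poses so that rays intersect only near true scene structure.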

2. The method of claim 1, characterized in that said discretized volume (DSI) has a size w × h × N_{z}, wherein w and h are the number of pixels of the event camera in x and y direction and wherein N_{z} is a number of depth planes,

and wherein particularly the discretized volume (DSI) is adapted to the field of view and perspective projection of the event camera (1) at said reference viewpoint (RV).

3. The method of claim 1 or 2, characterized in that it is determined that a 3D point of the scene is present in a voxel (V) when said score function f(X) assumes a local maximum for this voxel (V).

4. The method of claim 3, characterized in that the local maxima of the score function f(X) are detected by generating a dense depth map Z*(x, y) and an associated confidence map c(x, y) at said reference viewpoint (RV), wherein Z*(x, y) stores the location of the maximum score along the row of voxels corresponding to pixel (x, y), and wherein c(x, y) stores the value of said maximum score, c(x, y) := f(X(x), Y(y), Z*(x, y)), and wherein a semi-dense depth map is created from the map Z* by selecting a subset of pixels using said confidence map c(x, y), and wherein adaptive Gaussian thresholding is applied to said confidence map c(x, y) so as to generate a binary confidence mask that selects said subset of pixel locations in the map Z* in order to produce a semi-dense depth map, wherein a pixel (x, y) is selected if c(x, y) > T(x, y), with T(x, y) = c(x, y) * G(x, y) - C, where * denotes the 2D convolution, G is a Gaussian kernel, and C a constant offset.
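The depth-map extraction and adaptive Gaussian thresholding of claim 4 can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the DSI score is assumed to be a mapping from voxel index (x, y, z) to ray count, and G is fixed here to a small 3x3 Gaussian kernel with zero-padded borders, whereas the kernel size and offset C are free parameters in practice.

```python
def semi_dense_depth(score, w, h, nz, C=1.0):
    """Build Z*(x, y) as the argmax of f along each pixel's voxel row and
    c(x, y) as the maximum itself, then keep pixel (x, y) only if
    c(x, y) > T(x, y), with T = (c * G)(x, y) - C, G a fixed 3x3
    Gaussian kernel and C a constant offset."""
    Z = [[0] * w for _ in range(h)]
    conf = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            row = [score.get((x, y, z), 0) for z in range(nz)]
            conf[y][x] = max(row)
            Z[y][x] = row.index(conf[y][x])  # depth plane of the max score
    G = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]   # 3x3 Gaussian, sums to 16
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            t = -C                           # T(x, y) = (c * G)(x, y) - C
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if 0 <= y + dy < h and 0 <= x + dx < w:
                        t += G[dy + 1][dx + 1] / 16 * conf[y + dy][x + dx]
            mask[y][x] = conf[y][x] > t      # binary confidence mask
    return Z, conf, mask
```

Subtracting the local Gaussian-smoothed confidence rather than a global threshold keeps pixels that stand out against their neighborhood, so well-textured and weakly-textured regions are handled by the same rule.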

5. The method according to one of the preceding claims, characterized in that said plurality of successive events generated by the event camera (1) along said trajectory (T) forms a subset of events of a stream of events generated by the event camera (1) along said trajectory (T), wherein said stream is divided into a plurality of subsequent subsets of events, wherein each subset contains a plurality of successive events generated by the event camera (1), wherein the successive events of each subset are back-projected according to the viewpoint (P) of the event camera (1) as viewing rays (R) through a discretized volume (DSI) at a reference viewpoint (RV) of a virtual event camera associated to the respective subset, wherein the respective discretized volume (DSI) comprises voxels (V), and wherein a score function f(X) associated to the respective discretized volume (DSI) is determined, which score function f(X) is the number of back-projected viewing rays (R) of the respective subset that pass through the respective voxel (V) with center X of the respective discretized volume (DSI), and wherein the respective score function f(X) is used to determine whether or not a 3D point of the 3D reconstruction of the scene is present in the respective voxel (V) of the respective discretized volume (DSI) associated to the respective subset.
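The division of the event stream in claim 5 amounts to chunking the ordered events, each chunk getting its own DSI at its own reference viewpoint. A minimal sketch, assuming plain fixed-size chunking as one possible splitting policy (the claim does not fix the policy):

```python
def split_stream(events, n):
    """Divide the event stream into subsequent subsets of up to n
    successive events each; every subset is later back-projected into
    its own DSI at its own reference viewpoint."""
    return [events[i:i + n] for i in range(0, len(events), n)]
```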

6. The method of claim 5, characterized in that the local maxima of the respective score function f(X) are detected by generating a dense depth map Z* x, y) and an associated confidence map c(x, y) for each reference viewpoint (RV), wherein Z* x, y) stores the location of the maximum score along the row of voxels (V) corresponding to each pixel (x, y), with viewing ray (R'), of the respective reference viewpoint (RV), and wherein c(x, y) stores the value of said maximum score, c(x, y) := f(X(x), 7(y), ^{*}(x, y)), and wherein a respective semi-dense depth map for the respective reference viewpoint is created from the respective map Z* by selecting a subset of pixels using the respective confidence map c(x, y), and wherein adaptive Gaussian thresholding is applied to the respective confidence map c(x, y) so as to generate a respective binary confidence mask that selects said subset of pixel locations in the respective map Z* in order to produce a respective semi-dense depth map, wherein a pixel (x, y) is selected if c(x, y) > T(x, y), with T(x, y) = c(x, y) * G(x, y) - C, where * denotes the 2D convolution, G is a Gaussian kernel, and C a constant offset.

7. The method according to claim 6, characterized in that the depth maps are converted to point clouds, wherein the respective point cloud is particularly cleaned by removing isolated points whose number of neighbors within a given radius is less than a threshold, and wherein said point clouds are merged into a global point cloud using the known positions of the virtual event cameras at the respective reference viewpoint, wherein said global point cloud comprises the 3D points of the 3D reconstruction of the scene.
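The cleaning and merging steps of claim 7 can be sketched as follows. This is a brute-force illustration, not the claimed implementation: each virtual-camera pose is assumed to be given as a rotation matrix R (list of rows) and translation t, applied as R p + t to bring points into the global frame; real systems would use a spatial index for the neighbor search.

```python
import math

def remove_isolated(points, radius, min_neighbors):
    """Clean a point cloud: keep a point only if at least `min_neighbors`
    other points lie within `radius` of it (brute-force O(n^2) sketch)."""
    return [p for i, p in enumerate(points)
            if sum(1 for j, q in enumerate(points)
                   if i != j and math.dist(p, q) <= radius) >= min_neighbors]

def merge_clouds(clouds, poses):
    """Merge per-viewpoint clouds into a global cloud using the known
    virtual-camera poses, each given as (R, t) and applied as R p + t."""
    merged = []
    for pts, (R, t) in zip(clouds, poses):
        for p in pts:
            merged.append(tuple(
                sum(R[r][k] * p[k] for k in range(3)) + t[r]
                for r in range(3)))
    return merged
```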

8. The method according to one of the preceding claims, characterized in that the event camera (1) is moved manually along said trajectory (T).

9. The method according to one of the claims 1 to 7, characterized in that the event camera (1) is moved along said trajectory (T) by means of a movement generating means.

10. The method according to claim 9, characterized in that said movement generating means is formed by one of: a motor, a motor vehicle, a train, an aircraft, a robot, a robotic arm, a bicycle.

11. The method according to one of the preceding claims, characterized in that the event camera (1) is moved along said trajectory (T) with a velocity in the range from 0 km/h to 500 km/h, particularly 1 km/h to 500 km/h, particularly 100 km/h to 500 km/h, particularly 150 km/h to 500 km/h, particularly 200 km/h to 500 km/h, particularly 250 km/h to 500 km/h, particularly 300 km/h to 500 km/h, particularly 350 km/h to 500 km/h, particularly 400 km/h to 500 km/h.

12. Computer program for 3D reconstruction of a scene, wherein the computer program comprises program code for conducting the following steps when the computer program is executed on a computer:

back-projecting a plurality of events generated by means of an event camera (1) according to the viewpoint (P) of the event camera (1) as viewing rays (R) through a discretized volume (DSI) at a reference viewpoint (RV) of a virtual event camera associated to said plurality of events, wherein said discretized volume (DSI) comprises voxels (V), and determining a score function f(X) associated to the discretized volume (DSI), which score function f(X) is the number of back-projected viewing rays (R) that pass through the respective voxel (V) with center X, and

using said score function f(X) to determine whether or not a 3D point of the 3D reconstruction of the scene is present in the respective voxel (V).

13. Device comprising an event camera (1) and an analyzing means, wherein said event camera (1) and said analyzing means are configured to conduct the method according to one of the claims 1 to 11 when the event camera (1) is moved on a trajectory (T) along the scene.

14. Method for localizing an event camera (1 ) with respect to an existing semi-dense 3D map by registering an event image (I) obtained by the event camera (1 ) to a template image, wherein the event image (I) is obtained by aggregating a plurality of events (e_{k}) obtained with the event camera (1 ) into an edge map, and wherein the template image consists of a projected semi-dense 3D map (M) of a scene according to a known pose of the event camera (1 ), wherein a 6 degrees of freedom relative pose of the event camera (1 ) is estimated by means of registering the event image (I) to the template image.