vis4d.model.motion.velo_lstm
VeloLSTM 3D motion model.
Functions
Initialize LSTM weights and biases.
Classes
VeloLSTM | Estimating object location in world coordinates.
VeloLSTMOut | VeloLSTM output.
- class VeloLSTMOut(loc_preds: Tensor, loc_refines: Tensor)[source]
VeloLSTM output.
- loc_preds: Tensor
Alias for field number 0
- loc_refines: Tensor
Alias for field number 1
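The "alias for field number" wording indicates that the output behaves like a named tuple, so fields can be read by name or by position. A minimal sketch of that access pattern follows. `VeloLSTMOutSketch` is a hypothetical stand-in, not the library class, and plain lists stand in for tensors so the example runs without PyTorch.

```python
from typing import NamedTuple


class VeloLSTMOutSketch(NamedTuple):
    """Hypothetical stand-in mirroring the two documented fields."""

    loc_preds: list  # stand-in for a Tensor of predicted locations
    loc_refines: list  # stand-in for a Tensor of refined locations


out = VeloLSTMOutSketch(loc_preds=[[1.0, 2.0]], loc_refines=[[1.1, 1.9]])

# Fields are reachable by name or by position (field numbers 0 and 1).
by_name = out.loc_preds
by_index = out[0]

# Tuple unpacking follows the declared field order.
loc_preds, loc_refines = out
```

Access by name is usually easier to read at call sites, while positional unpacking is convenient when both values are needed at once.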
- class VeloLSTM(num_frames=5, feature_dim=64, hidden_size=128, num_layers=2, loc_dim=7, dropout=0.1, weights=None)[source]
Estimating object location in world coordinates.
- Prediction LSTM:
Input: 5 frames velocity Output: Next frame location
- Updating LSTM:
Input: predicted location and observed location Output: Refined location
- __init__(num_frames=5, feature_dim=64, hidden_size=128, num_layers=2, loc_dim=7, dropout=0.1, weights=None)[source]
Init.
Initialize hidden state.
The axes semantics are (num_layers, minibatch_size, hidden_dim).
- Return type:
tuple[Tensor,Tensor]
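To make the axis order concrete, the sketch below builds a (hidden, cell) pair with the documented layout (num_layers, minibatch_size, hidden_dim). It is only an illustration: `init_hidden_sketch`, `zeros`, and `shape` are hypothetical helpers, nested lists stand in for tensors, and the zero fill is an assumption that the documentation above does not state.

```python
def zeros(num_layers, batch_size, hidden_size):
    """Build a nested list shaped (num_layers, batch_size, hidden_size)."""
    return [
        [[0.0] * hidden_size for _ in range(batch_size)]
        for _ in range(num_layers)
    ]


def init_hidden_sketch(batch_size, num_layers=2, hidden_size=128):
    """Hypothetical helper: return a (hidden, cell) pair of equal shape.

    Defaults follow the documented VeloLSTM signature; zero
    initialization is an assumption made for this sketch.
    """
    hidden = zeros(num_layers, batch_size, hidden_size)
    cell = zeros(num_layers, batch_size, hidden_size)
    return hidden, cell


def shape(nested):
    """Return the dimensions of a rectangular nested list as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


hidden, cell = init_hidden_sketch(batch_size=3)
hidden_shape = shape(hidden)  # layer axis first, then batch, then hidden
```

The same (num_layers, num_batch, hidden_size) layout is what `refine` and `predict` document for each element of their `hc_0` argument.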
- refine(location, observation, prev_location, confidence, hc_0)[source]
Refine predicted location using single frame estimation at t+1.
- Return type:
tuple[Tensor,tuple[Tensor,Tensor]]
- Input:
location: (num_batch x loc_dim), location from prediction
observation: (num_batch x loc_dim), location from single frame estimation
prev_location: (num_batch x loc_dim), refined location
confidence: (num_batch x 1), depth estimation confidence
hc_0: (num_layers, num_batch, hidden_size), tuple of hidden and cell
- Middle:
loc_embed: (1, num_batch x feature_dim), predicted location feature
obs_embed: (1, num_batch x feature_dim), single frame location feature
conf_embed: (1, num_batch x feature_dim), depth estimation confidence feature
embed: (1, num_batch x 2*feature_dim), location feature
out: (1 x num_batch x hidden_size), lstm output
- Output:
hc_n: (num_layers, num_batch, hidden_size), tuple of updated hidden, cell
output_pred: (num_batch x loc_dim), predicted location
- predict(vel_history, location, hc_0)[source]
Predict location at t+1 using updated location at t.
- Return type:
tuple[Tensor,tuple[Tensor,Tensor]]
- Input:
vel_history: (num_seq, num_batch, loc_dim), velocity from previous num_seq updates
location: (num_batch, loc_dim), location from previous update
hc_0: (num_layers, num_batch, hidden_size), tuple of hidden and cell
- Middle:
embed: (num_seq, num_batch x feature_dim), location feature
out: (num_seq x num_batch x hidden_size), lstm output
attention_logit: (num_seq x num_batch x loc_dim), the predicted residual
- Output:
hc_n: (num_layers, num_batch, hidden_size), tuple of updated hidden, cell
output_pred: (num_batch x loc_dim), predicted location
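The predict-then-refine cycle described for the two LSTMs can be sketched as a per-frame loop. The sketch below is not the learned model: `predict_stub` and `refine_stub` are hypothetical stand-ins that only copy the documented argument order and return structure, using simple arithmetic instead of LSTMs. It also assumes that velocity is the difference between consecutive refined locations and that each step keeps its own (hidden, cell) state. The batch dimension is omitted for brevity.

```python
from collections import deque

NUM_FRAMES = 5  # default num_frames in the documented signature
LOC_DIM = 7  # default loc_dim in the documented signature


def predict_stub(vel_history, location, hc_0):
    """Stand-in for predict: shift the location by the mean velocity."""
    num_seq = len(vel_history)
    mean_vel = [
        sum(vel[d] for vel in vel_history) / num_seq
        for d in range(len(location))
    ]
    pred = [loc + vel for loc, vel in zip(location, mean_vel)]
    return pred, hc_0  # a real model would return the updated state


def refine_stub(location, observation, prev_location, confidence, hc_0):
    """Stand-in for refine: blend prediction and observation by confidence."""
    refined = [p + confidence * (o - p) for p, o in zip(location, observation)]
    return refined, hc_0


def track(observations, confidences):
    """Run predict -> refine over one object's per-frame observations."""
    location = observations[0]
    # Rolling window of the last NUM_FRAMES velocities, initially zero.
    vel_history = deque(
        ([0.0] * len(location) for _ in range(NUM_FRAMES)), maxlen=NUM_FRAMES
    )
    hc_pred = hc_ref = None  # placeholders for the (hidden, cell) tuples
    refined_track = [location]
    for obs, conf in zip(observations[1:], confidences[1:]):
        pred, hc_pred = predict_stub(list(vel_history), location, hc_pred)
        refined, hc_ref = refine_stub(pred, obs, location, conf, hc_ref)
        # Assumed velocity definition: change between refined locations.
        vel_history.append([r - l for r, l in zip(refined, location)])
        location = refined
        refined_track.append(refined)
    return refined_track


observations = [[float(t)] * LOC_DIM for t in range(3)]
trusted = track(observations, confidences=[1.0, 1.0, 1.0])
ignored = track(observations, confidences=[1.0, 0.0, 0.0])
```

In this stand-in, a confidence of 1 makes the refined location follow the observation, while a confidence of 0 keeps the prediction. The actual module learns this weighting, so the blend here should be read only as an illustration of the data flow.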