Automatically generating a bird's-eye view (BEV) of a vehicle's surrounding environment is critical for applications like autonomous driving and advanced driver-assistance systems. These systems rely on integrating signals from multiple cameras to construct a top-down view of the environment. Prominent examples include the BEV systems deployed in Tesla cars. However, many existing methods depend heavily on Transformers, whose attention mechanisms are computationally expensive, scaling quadratically with the number of input tokens.
In this work, we introduce Spatial Cross Mamba, a state-space analogue of standard cross-attention in Transformers. Our method leverages the efficiency of state space models (SSMs) to significantly reduce the computational overhead associated with Transformers, enabling more efficient and scalable BEV systems without compromising representation accuracy.
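The efficiency argument rests on how SSMs mix a sequence: a recurrent scan touches each token once, giving linear time in sequence length, versus the quadratic cost of attention. The following is a minimal sketch of a diagonal SSM scan to illustrate that mechanism; the scalar parameters `a`, `b`, `c` are illustrative placeholders, not the trained Spatial Cross Mamba layer.

```python
# Minimal diagonal state-space scan: h_t = a*h_{t-1} + b*u_t, y_t = c*h_t.
# One pass over the sequence -> O(L) time, versus O(L^2) for attention.
# The parameters (a, b, c) are illustrative, not the paper's model.

def ssm_scan(u, a=0.9, b=0.5, c=1.0):
    """Run a scalar SSM recurrence over a 1-D input sequence u."""
    h, ys = 0.0, []
    for u_t in u:              # single linear-time pass
        h = a * h + b * u_t    # state carries compressed past context
        ys.append(c * h)       # readout at each step
    return ys
```

Each output mixes all earlier inputs through the decaying state `h`, which is how an SSM propagates context without pairwise token comparisons.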
Despite the remarkable advances in edge device capabilities such as functionality, computational power, and storage capacity, limited energy capacity has been the major bottleneck in promoting advanced edge AI applications. For instance, mobile and edge devices are typically powered solely by embedded batteries, so their energy capacity is significantly constrained by form factor requirements, safety considerations, manufacturing costs, and concerns about the environmental impact of the battery technology used.
In this work, we study the problem of accurate energy measurement, prediction, and understandable scoring of on-device deep learning across edge hardware. We created kernel-, model-, and application-level datasets for on-device deep learning, and designed and implemented the first kernel-level energy predictors for both mobile CPUs and GPUs. These predictors provide consistently accurate energy estimates for unseen DNN models.
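The appeal of kernel-level prediction is compositionality: if energy can be estimated per kernel from its features, an unseen model can be scored by summing over its kernels. Below is a hedged sketch of that idea; the feature names (`flops_g`, `mem_mb`) and weights are made-up placeholders, not the trained predictors from this work.

```python
# Hedged sketch of kernel-level energy prediction: estimate each kernel's
# energy from simple features, then sum over kernels to score a whole
# (possibly unseen) model. Weights and features are illustrative only.

# illustrative cost weights: mJ per GFLOP of compute, mJ per MB moved
WEIGHTS = {"flops_g": 1.8, "mem_mb": 0.05}

def kernel_energy_mj(kernel):
    """Linear energy estimate for one kernel from its features."""
    return sum(WEIGHTS[name] * kernel[name] for name in WEIGHTS)

def model_energy_mj(kernels):
    """A model is a sequence of kernels; its energy is the sum of parts."""
    return sum(kernel_energy_mj(k) for k in kernels)

unseen_model = [
    {"flops_g": 0.6, "mem_mb": 12.0},   # e.g. a convolution kernel
    {"flops_g": 0.1, "mem_mb": 4.0},    # e.g. a pointwise kernel
]
```

Because the predictor operates on kernels rather than whole models, it generalizes to any architecture assembled from kernels seen during fitting.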
As AI-driven applications become more prevalent, their cumulative energy consumption across billions of devices will have a significant impact on global carbon emissions. Addressing this pressing issue requires a deeper understanding of how AI models consume energy during runtime and how to optimize their energy efficiency for specific edge hardware. This optimization is crucial for reducing the environmental impact of AI at scale.
To this end, we propose GreenAuto, an end-to-end automated platform for sustainable AI model exploration, generation, deployment, and evaluation. By automating performance measurements and iteratively refining the search process, GreenAuto identifies sustainable AI models efficiently and without human intervention.
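The measure-and-refine loop described above can be sketched in a few lines: propose a candidate model configuration, measure it, and keep the more energy-efficient one. In this sketch the configuration space and the `measure_energy` function are stand-ins for illustration; GreenAuto performs real on-device measurements rather than evaluating an analytic cost.

```python
import random

# Hedged sketch of an automated measure-and-refine search loop in the
# spirit of GreenAuto. The config space and the stand-in cost function
# are illustrative only; the real platform measures energy on hardware.

random.seed(0)

def measure_energy(cfg):
    # stand-in for an on-device energy measurement
    return cfg["width"] * cfg["depth"] * 0.01 + 1.0 / cfg["width"]

def search(rounds=20):
    best = {"width": 64, "depth": 8}
    best_e = measure_energy(best)
    for _ in range(rounds):
        cand = {"width": max(8, best["width"] + random.choice([-8, 8])),
                "depth": max(1, best["depth"] + random.choice([-1, 1]))}
        e = measure_energy(cand)
        if e < best_e:           # keep the more energy-efficient model
            best, best_e = cand, e
    return best, best_e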
The goal of this research pillar is to improve system performance, especially latency and energy efficiency, of mobile AR/VR devices. We primarily focus on adapting the system configurations of mobile AR devices to address the trade-offs between user preferences and performance requirements.
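One concrete form of the configuration adaptation described above is selecting a rendering setting that meets a latency budget while preserving as much quality as the budget allows. The candidate table and latency numbers below are invented for illustration, not measurements from a real AR device.

```python
# Hedged illustration of adapting a mobile AR configuration: choose the
# richest (resolution, fps) setting whose estimated latency fits the
# budget. Candidate configs and latencies are made up for this sketch.

CONFIGS = [  # (resolution_px, fps, estimated_latency_ms)
    (480, 15, 18.0),
    (720, 30, 33.0),
    (1080, 30, 52.0),
]

def pick_config(latency_budget_ms):
    """Return the best config within budget, else the cheapest fallback."""
    feasible = [c for c in CONFIGS if c[2] <= latency_budget_ms]
    return max(feasible, key=lambda c: (c[0], c[1])) if feasible else CONFIGS[0]
```

In a deployed system the budget itself would be driven by user preference and battery state, capturing the trade-off between preference and performance.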
We have also recently become interested in prototyping systems for emerging AR applications, including collaborative AR, mobile AR with generative AI, and AR for scientific data exploration.
The goal of this research pillar is to conduct pioneering studies on digital twins for connected and automated vehicles (CAVs). Our vision is to create fair, affordable, and efficient mobility solutions by leveraging digital twins and edge computing.
We are primarily interested in visualizing mobility digital twins with NeRF, cooperative perception, and security in CAVs, among other topics.