{"id":6135,"date":"2024-11-28T12:06:45","date_gmt":"2024-11-28T12:06:45","guid":{"rendered":"https:\/\/tech.newat9.com\/index.php\/2024\/11\/28\/how-will-ai-transform-urban-observing-sensing-imaging-and-mapping\/"},"modified":"2024-11-28T12:06:45","modified_gmt":"2024-11-28T12:06:45","slug":"how-will-ai-transform-urban-observing-sensing-imaging-and-mapping","status":"publish","type":"post","link":"https:\/\/tech.newat9.com\/index.php\/2024\/11\/28\/how-will-ai-transform-urban-observing-sensing-imaging-and-mapping\/","title":{"rendered":"How will ai transform urban observing, sensing, imaging, and mapping?"},"content":{"rendered":"<p> <br \/>\n<\/p>\n<div id=\"Sec2-content\">\n<h3 class=\"c-article__sub-heading\" id=\"Sec3\">Theoretical basis of AI in urban systems<\/h3>\n<p>AI can assist in addressing many issues in urban systems by detailed and extensive sensing of urban environments<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 10\" title=\"Fan, Z., Zhang, F., Loo, B. P. Y. &amp; Ratti, C. Urban visual intelligence: Uncovering hidden city profiles with street view images. Proc. Natl. Acad. Sci. 120, e2220417120 (2023).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR10\" id=\"ref-link-section-d1118001e891\" target=\"_blank\" rel=\"noopener\">10<\/a><\/sup>. Deep learning<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 11\" title=\"LeCun, Y., Bengio, Y. &amp; Hinton, G. Deep learning. Nature 521, 436&#x2013;444 (2015).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR11\" id=\"ref-link-section-d1118001e895\" target=\"_blank\" rel=\"noopener\">11<\/a><\/sup> is a branch of machine learning that utilizes deep neural networks to learn and represent complex patterns in data, which can be employed for tasks such as fine object recognition. 
Natural language processing<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 12\" title=\"Chowdhary, K. R. Natural language processing. Fundam. Artif. Intell. 603&#x2013;649 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR12\" id=\"ref-link-section-d1118001e899\" target=\"_blank\" rel=\"noopener\">12<\/a><\/sup> focuses on how computers understand and process human language, which can be applied to analyze and extract insights from urban-related textual data, such as social media data. Reinforcement learning<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 13\" title=\"Mankowitz, D. J. et al. Faster sorting algorithms discovered using deep reinforcement learning. Nature 618, 257&#x2013;263 (2023).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR13\" id=\"ref-link-section-d1118001e903\" target=\"_blank\" rel=\"noopener\">13<\/a><\/sup> is a learning paradigm that aims to train intelligent agents by interacting with the environment to learn optimal action strategies, which can be utilized to optimize decision-making in such areas as urban transportation systems and energy management. These theoretical bases enable the use of AI technology to analyze and address problems in urban research, providing deeper insights and better knowledge to support decision-making.<\/p>\n<p>The power of AI in urban research lies in its ability to process multiple types of data, analyze complex patterns, and make informed predictions, which is crucial for understanding complex urban systems. 
The application of AI methods and techniques usually considers many factors, including research tasks (e.g., image classification, object detection, etc.), the modality of data (e.g., optical, radar images, etc.), the hardware (e.g., graphics processing unit) and platform (e.g., local, distributed, or cloud computing)<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 14\" title=\"Wang, Y., Wei, G.-Y. &amp; Brooks, D. A systematic methodology for analysis of deep learning hardware and software platforms. Proc. Mach. Learn. Syst. 2, 30&#x2013;43 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR14\" id=\"ref-link-section-d1118001e910\" target=\"_blank\" rel=\"noopener\">14<\/a><\/sup>, the selection of models, the construction of networks, and the validation of results. The joint use of multimodal data should be carefully considered in the construction of networks. The criteria for model selection depend on the specific task, data, and the desired output. The construction of networks is not yet standardized, and the resulting models are often not readily explainable. Therefore, a general framework and guidelines for selecting models, constructing networks, and validating results are needed to fully leverage the potential of AI in urban studies.<\/p>\n<h3 class=\"c-article__sub-heading\" id=\"Sec4\">Digital image processing<\/h3>\n<p>In digital image processing, the Convolutional Neural Network (CNN) is widely used due to its powerful local feature extraction capability<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 15\" title=\"Krizhevsky, A., Sutskever, I. &amp; Hinton, G. E. Imagenet classification with deep convolutional neural networks. 
in Advances in neural information processing systems 1097&#x2013;1105 (2012).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR15\" id=\"ref-link-section-d1118001e922\" target=\"_blank\" rel=\"noopener\">15<\/a><\/sup>. For sequential or time-series data, Recurrent Neural Network (RNN) is popular<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 16\" title=\"Sutskever, I., Vinyals, O. &amp; Le, Q. V. Sequence to sequence learning with neural networks. Adv. Neural Inf. Process. Syst. 27, 3104&#x2013;3112 (2014).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR16\" id=\"ref-link-section-d1118001e926\" target=\"_blank\" rel=\"noopener\">16<\/a><\/sup>. Recently, a new AI model, transformer, has achieved great advances in natural language processing, and has been successfully transferred into the image processing field. Compared to CNN and RNN, the transformer entirely consists of attention mechanisms and can model long-range dependency between input and output at much lower training cost<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 17\" title=\"Dosovitskiy, A. et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv Prepr. arXiv2010.11929 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR17\" id=\"ref-link-section-d1118001e930\" target=\"_blank\" rel=\"noopener\">17<\/a><\/sup>. Fueled by the rapid development of hardware, big data, and AI techniques, foundation models based on transformers have been successfully proposed for general purposes and can be readily applied to various downstream tasks<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 18\" title=\"Kirillov, A. et al. 
Segment anything. in Proceedings of the IEEE\/CVF International Conference on Computer Vision 4015&#x2013;4026 (2023).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR18\" id=\"ref-link-section-d1118001e934\" target=\"_blank\" rel=\"noopener\">18<\/a><\/sup>. In the field of remote sensing, foundation models have attracted increasing attention, e.g., Prithvi<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 19\" title=\"Jakubik, J. et al. Prithvi-100M. at https:\/\/doi.org\/10.57967\/hf\/0952 (2023).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR19\" id=\"ref-link-section-d1118001e938\" target=\"_blank\" rel=\"noopener\">19<\/a><\/sup> and RemoteCLIP<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 20\" title=\"Liu, F. et al. RemoteCLIP: A Vision Language Foundation Model for Remote Sensing. IEEE Trans. Geosci. Remote Sens. 62, 1&#x2013;16 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR20\" id=\"ref-link-section-d1118001e943\" target=\"_blank\" rel=\"noopener\">20<\/a><\/sup>. These foundation models point to a promising direction for handling multi-modal data and general tasks. This can be outlined in a general framework with three components: model, input, and output (Fig. <a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#Fig2\" target=\"_blank\" rel=\"noopener\">2<\/a>).<\/p>\n<div class=\"c-article-section__figure js-c-reading-companion-figures-item\" data-test=\"figure\" data-container-section=\"figure\" id=\"figure-2\" data-title=\"Fig. 
2\">\n<figure><figcaption><b id=\"Fig2\" class=\"c-article-section__figure-caption\" data-test=\"figure-caption-text\">Fig. 2<\/b><\/figcaption><div class=\"c-article-section__figure-content\">\n<div class=\"c-article-section__figure-item\"><a class=\"c-article-section__figure-link\" data-test=\"img-link\" data-track=\"click\" data-track-label=\"image\" data-track-action=\"view figure\" href=\"https:\/\/www.nature.com\/articles\/s42949-024-00188-3\/figures\/2\" rel=\"nofollow noopener\" target=\"_blank\"><picture><source type=\"image\/webp\" srcset=\"https:\/\/media.springernature.com\/lw685\/springer-static\/image\/art%3A10.1038%2Fs42949-024-00188-3\/MediaObjects\/42949_2024_188_Fig2_HTML.png?as=webp\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"Fig2\" src=\"https:\/\/media.springernature.com\/lw685\/springer-static\/image\/art%3A10.1038%2Fs42949-024-00188-3\/MediaObjects\/42949_2024_188_Fig2_HTML.png\" alt=\"figure 2\" loading=\"lazy\" width=\"685\" height=\"584\"\/><\/source><\/picture><\/a><\/div>\n<div class=\"c-article-section__figure-description\" data-test=\"bottom-caption\" id=\"figure-2-desc\">\n<p>The general AI framework in Earth observations (image created by the authors).<\/p>\n<\/div>\n<\/div>\n<\/figure>\n<\/div>\n<h4 class=\"c-article__sub-heading c-article__sub-heading--small\" id=\"Sec5\">Model<\/h4>\n<p>AI models often operate as a \u201cblack box\u201d<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 21\" title=\"Castelvecchi, D. Can we open the black box of AI? Nat. News 538, 20 (2016).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR21\" id=\"ref-link-section-d1118001e974\" target=\"_blank\" rel=\"noopener\">21<\/a><\/sup>, neglecting the underlying physical mechanisms. 
To address this issue, we propose a general AI framework, which consists of three parts: the encoder for feature extraction; the feature fusion module for the fusion of diverse features; and the decoder for the reconstruction of output features. The key innovation lies in the utilization of prior knowledge and the integration of major cutting-edge AI models, such as CNN, transformer, RNN, graph neural network (GNN), and generative adversarial network (GAN). Different AI models can serve as encoders or decoders based on their strengths in feature representation. Prior knowledge can be integrated at different stages. At the input stage, prior knowledge such as remotely sensed spectral indices can enrich and integrate prominent features, reducing redundancy. During modeling, prior knowledge can be encoded in the network weights through model pre-training or fine-tuning. At the output stage, it can guide the learning process and provide more reliable outputs, e.g., by the addition of spatial-temporal weighted terms. This knowledge-driven approach enhances model interpretability and generalization and compensates for limited training data.<\/p>\n<h4 class=\"c-article__sub-heading c-article__sub-heading--small\" id=\"Sec6\">Input data<\/h4>\n<p>These Earth observation (EO) data are characterized by diverse spectral, spatial, and temporal resolutions and broad spatial coverage, enabling long-term urban monitoring. Within the AI framework, prior knowledge complements raw data, especially when the available input data are limited. The type of prior knowledge to be incorporated depends mainly on research objectives, geospatial relationships, urban attributes, and temporal patterns (see Fig. 
<a data-track=\"click\" data-track-label=\"link\" data-track-action=\"figure anchor\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#Fig2\" target=\"_blank\" rel=\"noopener\">2<\/a>).<\/p>\n<h4 class=\"c-article__sub-heading c-article__sub-heading--small\" id=\"Sec7\">Output data<\/h4>\n<p>The output is application-specific, ranging from image pre-processing and interpretation to parameter estimation. Utilizing an appropriate model informed by practical urban knowledge yields more accurate and comprehensive insights, contributing to more effective urban sensing and imaging.<\/p>\n<h3 class=\"c-article__sub-heading\" id=\"Sec8\">Urban mapping<\/h3>\n<p>AI can handle different types of data, including text, audio, image, and video, and can integrate them to produce more accurate results than traditional methods. It enhances data interpretation capabilities and helps make informed decisions in various fields. It has revolutionized the field of urban mapping by processing and analyzing various types of data. In this section, we will discuss three applications of AI in urban mapping: land use and land cover (LULC) mapping, building detection, and road extraction.<\/p>\n<p>LULC mapping has long been a hot topic and is evolving with deep learning<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 22\" title=\"Brown, C. F. et al. Dynamic World, Near real-time global 10&#x2009;m land use land cover mapping. Sci. Data 9, 251 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR22\" id=\"ref-link-section-d1118001e1009\" target=\"_blank\" rel=\"noopener\">22<\/a><\/sup>. The exceptional performance of deep learning in LULC mapping is due to several factors. First, deep learning eliminates the need for manual feature engineering due to the inherent ability of the models to learn directly from data. 
Second, deep learning makes it easier to incorporate heterogeneous multi-modal data into the mapping process. Third, deep learning can generate diverse output types, such as point-level categories, segmented objects, and bounding boxes<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 23\" title=\"Lu, X., Zhong, Y. &amp; Zhang, L. Open-source data-driven cross-domain road detection from very high resolution remote sensing imagery. IEEE Trans. Image Process. 31, 6847&#x2013;6862 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR23\" id=\"ref-link-section-d1118001e1013\" target=\"_blank\" rel=\"noopener\">23<\/a><\/sup>. Nevertheless, deep learning is data-driven and relies heavily on labeled data. In addition, although diverse LULC products have been developed for local or global regions, considerable uncertainties and inconsistencies exist among them. Urban green spaces (UGS), as a special type of land cover, play an important role in understanding urban ecosystems, climate, environment, public health, and the SDGs at various spatial scales. Mapping of UGS with remote sensing is challenging due to the existence of mixed pixels and the cost and time of collecting quality training data. CNNs and other deep learning methods have been employed for UGS mapping and found effective<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 24\" title=\"Chen, Y. et al. Developing an intelligent cloud attention network to support global urban green spaces mapping. ISPRS J. Photogramm. Remote Sens. 198, 197&#x2013;209 (2023).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR24\" id=\"ref-link-section-d1118001e1017\" target=\"_blank\" rel=\"noopener\">24<\/a><\/sup>.<\/p>\n<p>Building detection is one of the most profoundly advanced areas of EO-based deep learning. Historically, building feature-based methods have been developed to advance automated building detection<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 25\" title=\"Huang, X. &amp; Zhang, L. Morphological building\/shadow index for building extraction from high-resolution imagery over urban areas. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 5, 161&#x2013;172 (2011).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR25\" id=\"ref-link-section-d1118001e1024\" target=\"_blank\" rel=\"noopener\">25<\/a><\/sup>, but they rely on domain-specific knowledge to manually design building-related features to be detected and mapped. Deep learning models, trained using existing open-source databases obtained through citizen science, have become mainstream for building detection<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 26\" title=\"Ji, S., Wei, S. &amp; Lu, M. Fully convolutional networks for multisource building extraction from an open aerial and satellite imagery data set. IEEE Trans. Geosci. Remote Sens. 57, 574&#x2013;586 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR26\" id=\"ref-link-section-d1118001e1028\" target=\"_blank\" rel=\"noopener\">26<\/a><\/sup>. 
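However a building detector is trained, its accuracy is commonly scored with intersection-over-union (IoU) between the predicted mask and the reference footprint. A minimal sketch of the metric, using toy pixel sets with illustrative names:

```python
# Intersection-over-Union (IoU), a standard accuracy measure for comparing
# a predicted building mask against a reference footprint. Masks are flat
# sets of (row, col) pixel coordinates here for simplicity.

def iou(pred_pixels, truth_pixels):
    """IoU = |pred ∩ truth| / |pred ∪ truth|; 1.0 is a perfect match.

    By convention, two empty masks also count as a perfect match.
    """
    union = len(pred_pixels | truth_pixels)
    if union == 0:
        return 1.0
    return len(pred_pixels & truth_pixels) / union

# Toy example: a 3x3 reference footprint and a prediction shifted one column.
truth = {(r, c) for r in range(3) for c in range(3)}
pred = {(r, c + 1) for r in range(3) for c in range(3)}
print(iou(pred, truth))  # 6 shared pixels / 12 distinct pixels = 0.5
```

Thresholding IoU (often at 0.5) turns this per-object score into detection counts for precision/recall reporting.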
For instance, Microsoft has released a global building footprints dataset generated by deep learning networks, a result that was almost impossible to achieve in the past, although the completeness of this dataset still needs attention.<\/p>\n<p>Similarly, AI has made the automatic extraction of roads possible. For example, foundation models have been utilized to extract road networks by employing autoencoders and contrastive learning for self-supervised training on large-scale unlabeled remote sensing images<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 27\" title=\"Hetang, C. et al. Segment Anything Model for Road Network Graph Extraction. in Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 2556&#x2013;2566 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR27\" id=\"ref-link-section-d1118001e1035\" target=\"_blank\" rel=\"noopener\">27<\/a><\/sup>. Parameter-efficient fine-tuning methods were then used to adapt these general foundation models to road extraction tasks. Because self-supervised training learns the distribution of vast amounts of data, the model\u2019s feature representation capabilities are significantly enhanced, thereby improving the performance of road extraction. Cross-modal learning has also been applied to road extraction tasks<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 28\" title=\"Li, B., Gao, J., Chen, S., Lim, S. &amp; Jiang, H. DF-DRUNet: A decoder fusion model for automatic road extraction leveraging remote sensing images and GPS trajectory data. Int. J. Appl. Earth Obs. Geoinf. 127, 103632 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR28\" id=\"ref-link-section-d1118001e1039\" target=\"_blank\" rel=\"noopener\">28<\/a><\/sup>. 
For instance, GPS data have been used to partially address the issue of insufficient road data labels. AI methods remain constrained in road detection and mapping in several respects. First, there is a lack of an accurate and diverse training dataset for global-scale road mapping<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 29\" title=\"Demir, I. et al. DeepGlobe 2018: A challenge to parse the earth through satellite images. in IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops vols 2018-June 172&#x2013;181 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR29\" id=\"ref-link-section-d1118001e1043\" target=\"_blank\" rel=\"noopener\">29<\/a><\/sup>. Second, the generalization ability of AI models remains limited for global applications. Third, the lack of inductive reasoning ability in AI models leads to disconnected roads, which may yield inaccurate conclusions in road network-based urban studies. AI methods focus mainly on recognizing individual pixels as roads, rather than inferring road connectivity according to the cognitive process applied by human beings<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 30\" title=\"Bastani, F. et al. RoadTracer: Automatic Extraction of Road Networks from Aerial Images. in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition 4720&#x2013;4728. https:\/\/doi.org\/10.1109\/CVPR.2018.00496 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR30\" id=\"ref-link-section-d1118001e1047\" target=\"_blank\" rel=\"noopener\">30<\/a><\/sup>.<\/p>\n<h3 class=\"c-article__sub-heading\" id=\"Sec9\">Urban observing and sensing<\/h3>\n<p>Following the discussion of the three most widespread applications in urban mapping, where optical remote sensing methods are primarily utilized, this section focuses on urban observation and sensing with other sensing systems and platforms, such as LiDAR, Synthetic Aperture Radar (SAR), and street-level imagery, as well as people as virtual sensors.<\/p>\n<p>LiDAR technology offers exceptional 3D data acquisition capabilities for urban landscapes, structures and infrastructure, as well as for monitoring changes over time. Small-footprint airborne LiDAR delivers high-resolution topographic data, excelling at generating detailed 3D urban environment models<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 31\" title=\"Wang, R., Peethambaran, J. &amp; Chen, D. Lidar point clouds to 3-D urban models: A review. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 11, 606&#x2013;627 (2018).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR31\" id=\"ref-link-section-d1118001e1062\" target=\"_blank\" rel=\"noopener\">31<\/a><\/sup>. Integrating AI with LiDAR data processing enables sophisticated classification and analysis of urban features. 
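As a point of reference for what learned classifiers improve on, per-point labeling can be sketched with a simple non-learned baseline: grid the cloud, take each cell's lowest return as the local ground elevation, and flag points well above it. All names, coordinates, and thresholds below are illustrative, not taken from the cited works:

```python
# Non-AI baseline for per-point LiDAR classification: grid-minimum ground
# estimation followed by height-above-ground thresholding (illustrative).
from collections import defaultdict

def classify_points(points, cell_size=5.0, height_threshold=2.0):
    """points: [(x, y, z), ...] -> list of 'ground' / 'structure' labels."""
    ground_z = defaultdict(lambda: float("inf"))
    cells = []
    for x, y, z in points:
        cell = (int(x // cell_size), int(y // cell_size))
        cells.append(cell)
        ground_z[cell] = min(ground_z[cell], z)  # lowest return ~ local ground
    return [
        "structure" if z - ground_z[cell] > height_threshold else "ground"
        for (_, _, z), cell in zip(points, cells)
    ]

# Toy cloud: street-level returns near z=0 and one rooftop return at z=9.
cloud = [(1.0, 1.0, 0.1), (2.0, 3.0, 0.3), (2.5, 2.5, 9.0), (8.0, 1.0, 0.2)]
print(classify_points(cloud))  # ['ground', 'ground', 'structure', 'ground']
```

Deep models replace the hand-set grid and threshold with features learned from labeled point clouds, which is what yields the accuracy gains discussed here.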
For instance, CNN-based AI models have improved the interpretation and fusion of information from diverse sensor modalities, thereby enhancing point-level semantic labeling and classification accuracy<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 32\" title=\"Jaritz, M., Vu, T.-H., De Charette, R., Wirbel, &#xC9;. &amp; P&#xE9;rez, P. Cross-modal learning for domain adaptation in 3d semantic segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 45, 1533&#x2013;1544 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR32\" id=\"ref-link-section-d1118001e1066\" target=\"_blank\" rel=\"noopener\">32<\/a><\/sup>. Nonetheless, fusing LiDAR with other sensors to improve information retrieval under the occlusions caused by the LiDAR viewing geometry poses significant challenges for urban applications. To address these challenges, cross-modal learning strategies leverage LiDAR data combined with visual and thermal imagery to compensate for areas where LiDAR data are incomplete or obstructed, thereby enriching the dataset<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 33\" title=\"Aiello, E., Valsesia, D. &amp; Magli, E. Cross-modal learning for image-guided point cloud shape completion. Adv. Neural Inf. Process. Syst. 35, 37349&#x2013;37362 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR33\" id=\"ref-link-section-d1118001e1070\" target=\"_blank\" rel=\"noopener\">33<\/a><\/sup>. In addition, self-supervised learning models have been utilized, autonomously predicting missing or noisy data sections based on patterns identified in complete and clean sections<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 34\" title=\"Vats, A. et al. Terrain-Informed Self-Supervised Learning: Enhancing Building Footprint Extraction from LiDAR Data with Limited Annotations. IEEE Trans. Geosci. Remote Sens. 1&#x2013;10 (2024).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR34\" id=\"ref-link-section-d1118001e1074\" target=\"_blank\" rel=\"noopener\">34<\/a><\/sup>. This approach enhances data quality and facilitates learning from the intrinsic structure of LiDAR data without relying on manually labeled examples, which is particularly beneficial for managing large datasets and standardizing data quality across different systems.<\/p>\n<p>SAR, featuring all-weather capability, rapid revisit, and multi-angle observations, is an important EO technology. The increasing accessibility of SAR data has greatly facilitated the application of AI to urban sensing and mapping<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 35\" title=\"Rouet-Leduc, B., Jolivet, R., Dalaison, M., Johnson, P. A. &amp; Hulbert, C. Autonomous extraction of millimeter-scale deformation in InSAR time series using deep learning. Nat. Commun. 12, 6480 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR35\" id=\"ref-link-section-d1118001e1081\" target=\"_blank\" rel=\"noopener\">35<\/a><\/sup>. Additionally, interferometric SAR (InSAR) techniques are used to process and analyze multitemporal SAR data, enabling accurate measurements of urban surface and infrastructure deformation. Compared to optical images, SAR images exhibit distinct characteristics, including speckle noise, multipath scattering, and geometrical distortions, which negatively impact their interpretation. 
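The classical, pre-AI mitigation for speckle is multi-looking: averaging adjacent pixels of the intensity image, trading spatial resolution for radiometric stability. A minimal sketch with toy numbers (not a production despeckler, and not code from the cited works):

```python
# Multi-looking: average non-overlapping N x N windows of a SAR intensity
# image to suppress speckle at the cost of spatial resolution (sketch).

def multilook(intensity, looks=2):
    """Average each non-overlapping looks x looks window."""
    h, w = len(intensity), len(intensity[0])
    out = []
    for i in range(0, h - looks + 1, looks):
        row = []
        for j in range(0, w - looks + 1, looks):
            window = [intensity[i + di][j + dj]
                      for di in range(looks) for dj in range(looks)]
            row.append(sum(window) / len(window))
        out.append(row)
    return out

# A homogeneous area whose true backscatter is 4, corrupted by speckle.
speckled = [
    [2.0, 6.0, 8.0, 2.0],
    [6.0, 2.0, 2.0, 4.0],
    [8.0, 0.0, 4.0, 8.0],
    [0.0, 8.0, 0.0, 4.0],
]
print(multilook(speckled, looks=2))  # [[4.0, 4.0], [4.0, 4.0]]
```

AI-based despeckling aims to achieve the same noise suppression without the resolution loss that this averaging incurs.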
These issues also pose challenges for AI-based analysis of SAR images in conjunction with optical ones.<\/p>\n<p>The potential of street-level imagery has been advanced by AI-based data mining and knowledge discovery<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 10\" title=\"Fan, Z., Zhang, F., Loo, B. P. Y. &amp; Ratti, C. Urban visual intelligence: Uncovering hidden city profiles with street view images. Proc. Natl. Acad. Sci. 120, e2220417120 (2023).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR10\" id=\"ref-link-section-d1118001e1088\" target=\"_blank\" rel=\"noopener\">10<\/a><\/sup>. For example, it is possible to evaluate the conditions of urban infrastructure through semantic segmentation methodologies<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 36\" title=\"Rundle, A. G., Bader, M. D. M., Richards, C. A., Neckerman, K. M. &amp; Teitler, J. O. Using Google Street View to audit neighborhood environments. Am. J. Prev. Med. 40, 94&#x2013;100 (2011).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR36\" id=\"ref-link-section-d1118001e1092\" target=\"_blank\" rel=\"noopener\">36<\/a><\/sup>. Deeper insights, including safety, architectural age and style, and the urban socio-economic environment, are also available through AI<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 37\" title=\"Sun, J. et al. Automatic atmospheric correction for shortwave hyperspectral remote sensing data using a time-dependent deep neural network. ISPRS J. Photogramm. Remote Sens. 174, 117&#x2013;131 (2021).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR37\" id=\"ref-link-section-d1118001e1096\" target=\"_blank\" rel=\"noopener\">37<\/a><\/sup>. Despite these advancements, challenges remain, such as the integration of street-level imagery with satellite\/airborne data. Satellite\/airborne sensing provides a large-scale perspective but is limited to top-down or oblique observations, while street-level imagery offers a ground-based observation from a human\u2019s perspective.<\/p>\n<p>The aforementioned EO technologies have traditionally been applied to study static objects such as LULC. Recently, massive geo-tagged data on dynamic objects (e.g., human behaviors) have been generated by physical and people sensors (people as virtual sensors). These data, such as GPS trajectories, surveillance data, urban environment data (e.g., temperature and air quality data), and human-generated data, are mostly associated with geo-locations, capturing urban dynamics (e.g., human movements, urban events and processes) from different angles. They provide multi-dimensional EO data in a granular manner, which has greatly catalyzed the application of AI techniques to urban sensing<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 38\" title=\"Salcedo-Sanz, S. et al. Machine learning information fusion in Earth observation: A comprehensive review of methods, applications and data sources. Inf. Fusion 63, 256&#x2013;272 (2020).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR38\" id=\"ref-link-section-d1118001e1104\" target=\"_blank\" rel=\"noopener\">38<\/a><\/sup>. 
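One of the simplest tasks on such geo-tagged data is next-location prediction from GPS trajectories. A deliberately simple, non-deep baseline is a first-order Markov model over discretized zones; the zone names and trips below are toy inventions for illustration:

```python
# First-order Markov baseline for next-location prediction from GPS
# trajectories discretized into zone IDs (illustrative sketch; deep
# sequence models replace this count table with learned representations).
from collections import Counter, defaultdict

def fit_transitions(trajectories):
    """Count observed zone-to-zone transitions across all trajectories."""
    counts = defaultdict(Counter)
    for traj in trajectories:
        for here, nxt in zip(traj, traj[1:]):
            counts[here][nxt] += 1
    return counts

def predict_next(counts, here):
    """Most frequently observed successor zone, or None if unseen."""
    return counts[here].most_common(1)[0][0] if counts[here] else None

# Toy commuting diaries over zones: home -> station -> office, etc.
trips = [
    ["home", "station", "office"],
    ["home", "station", "office"],
    ["home", "park", "home"],
]
model = fit_transitions(trips)
print(predict_next(model, "home"))     # 'station' (seen twice vs 'park' once)
print(predict_next(model, "station"))  # 'office'
```

Recurrent and transformer models improve on this baseline by conditioning on longer histories and on context such as time of day, which is where the prediction gains reported below come from.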
For instance, AI has been widely applied to GPS trajectories and urban environment data, which has significantly improved human movement prediction<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 39\" title=\"Huang, W. &amp; Li, S. Understanding human activity patterns based on space-time-semantics. ISPRS J. Photogramm. Remote Sens. 121, 1&#x2013;10 (2016).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR39\" id=\"ref-link-section-d1118001e1108\" target=\"_blank\" rel=\"noopener\">39<\/a><\/sup> and urban environment change forecasting<sup><a data-track=\"click\" data-track-action=\"reference anchor\" data-track-label=\"link\" data-test=\"citation-ref\" aria-label=\"Reference 40\" title=\"Li, H., Yuan, Z., Novack, T., Huang, W. &amp; Zipf, A. Understanding spatiotemporal trip purposes of urban micro-mobility from the lens of dockless e-scooter sharing. Comput. Environ. Urban Syst. 96, 101848 (2022).\" href=\"http:\/\/www.nature.com\/articles\/s42949-024-00188-3#ref-CR40\" id=\"ref-link-section-d1118001e1112\" target=\"_blank\" rel=\"noopener\">40<\/a><\/sup>. Nevertheless, challenges such as fusing geo-tagged data with other EO data for effective AI modeling remain, owing to differences in spatiotemporal scale and measurement quality.<\/p>\n<p>Human-generated data can be categorized into passive sensing (e.g., data generated from social media, location-based services, and mobile devices) and active sensing (e.g., Public Participation Geographic Information Systems (PPGIS), Volunteered Geographic Information Systems (VGIS), and surveys). These data sources offer diverse insights into human activities and behaviors, as well as other human factors related to social sustainability, such as environmental experiences, perceptions, and needs. 
Social media data, for instance, can be used to analyze social phenomena such as segregation, while PPGIS data can help identify community needs and preferences regarding urban planning.<\/p>\n<\/div>\n<p><br \/>\n<br \/><a href=\"https:\/\/www.nature.com\/articles\/s42949-024-00188-3\" target=\"_blank\" rel=\"noopener\">Source link <\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Theoretical basis of AI in urban systems AI can assist in addressing many issues in urban systems by detailed and [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":6136,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/tech.newat9.com\/index.php\/wp-json\/wp\/v2\/posts\/6135"}],"collection":[{"href":"https:\/\/tech.newat9.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/tech.newat9.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/tech.newat9.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/tech.newat9.com\/index.php\/wp-json\/wp\/v2\/comments?post=6135"}],"version-history":[{"count":0,"href":"https:\/\/tech.newat9.com\/index.php\/wp-json\/wp\/v2\/posts\/6135\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/tech.newat9.com\/index.php\/wp-json\/wp\/v2\/media\/6136"}],"wp:attachment":[{"href":"https:\/\/tech.newat9.com\/index.php\/wp-json\/wp\/v2\/media?parent=6135"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/tech.newat9.com\/index.php\/wp-json\/wp\/v2\/categories?post=6135"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/tech.newat9.com\/index.php\/wp-json\/wp\/v2\/tags?post=6135"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}