Search Results (61)

Search Parameters:
Keywords = dependency parsing

22 pages, 11240 KiB  
Article
Research on Landscape Perception of Urban Parks Based on User-Generated Data
by Wei Ren, Kaiyuan Zhan, Zhu Chen and Xin-Chen Hong
Buildings 2024, 14(9), 2776; https://doi.org/10.3390/buildings14092776 - 4 Sep 2024
Viewed by 689
Abstract
User-generated data can reflect the varied viewpoints and experiences that arise from people’s perceptions. Perceptual results are often obtained by combining subjective public perceptions of the landscape with physiological monitoring data. Accessing people’s perceptions of the landscape through text is a common method, but nuances, emotions, and complexities are hard to render fully from superficial emotional tendencies in text alone, and purely numerical representations may lead to misleading conclusions and undermine public participation. In addition, physiological test data do not reveal the subjective reasons behind the comments made. It is therefore essential to parse the text in depth and distinguish between segments with different semantics. In this study, we propose a perceptual psychology-based workflow to extract and visualize multifaceted views from user-generated data, incorporating FCN, LDA, and LSTM analysis methods. Six areas in Fuzhou City, China, containing 12 city parks, were selected as the study area. First, 9987 reviews and 1747 pictures with corresponding visitor trajectories were crawled from the Dianping and Liangbulu websites, respectively, for in-depth analysis of the comment texts and the creation of related heat maps. Second, a clause-segmentation step was added to the LSTM sentiment analysis model to obtain a more accurate representation of sentiment. Third, the various factors affecting landscape perception were explored, and on this basis the overall public perception of urban parks in Fuzhou was obtained. The results show that (1) texts concerning ‘wind’, ‘temperature’, ‘structures’, ‘edge space (spatial boundaries)’, and ‘passed space’ are the five most representative factors for the urban parks in Fuzhou; (2) the textual analyses further confirmed the influence of spatial factors on perception in the temporal dimension; and (3) environmental factors influence people’s sense of urban parks with respect to specificity, clocking behavior, and comfort. These results provide useful references for optimizing and transforming urban environments using user-generated data.
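The clause-level sentiment idea described above can be illustrated with a minimal sketch: split each review into clauses, score each clause with a small LSTM classifier, and average the clause scores. This is not the paper's model; the clause splitter, the `tokenize` callable, and the network sizes are simplified stand-ins.

```python
# Minimal sketch (not the paper's model): score each clause separately with a
# small LSTM sentiment classifier, then aggregate clause scores per review.
import re
import torch
import torch.nn as nn

def split_clauses(review: str) -> list[str]:
    # Naive clause segmentation on common Chinese/ASCII punctuation.
    parts = re.split(r"[，。；！？,.;!?]", review)
    return [p.strip() for p in parts if p.strip()]

class ClauseSentimentLSTM(nn.Module):
    def __init__(self, vocab_size: int, embed_dim: int = 64, hidden: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)  # one sentiment logit per clause

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        emb = self.embed(token_ids)            # (batch, seq, embed_dim)
        _, (h_n, _) = self.lstm(emb)           # h_n: (1, batch, hidden)
        return self.head(h_n[-1]).squeeze(-1)  # (batch,) sentiment logits

def review_sentiment(model, tokenize, review: str) -> float:
    # Average clause-level probabilities to get a review-level score.
    # `tokenize` (text -> list of ids within the model's vocabulary) is assumed.
    clauses = split_clauses(review) or [review]
    ids = torch.nn.utils.rnn.pad_sequence(
        [torch.tensor(tokenize(c)) for c in clauses], batch_first=True)
    with torch.no_grad():
        probs = torch.sigmoid(model(ids))
    return probs.mean().item()
```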

16 pages, 2080 KiB  
Article
Fusion Network for Aspect-Level Sentiment Classification Based on Graph Neural Networks—Enhanced Syntactics and Semantics
by Miaomiao Li, Yuxia Lei and Weiqiang Zhou
Appl. Sci. 2024, 14(17), 7524; https://doi.org/10.3390/app14177524 - 26 Aug 2024
Viewed by 652
Abstract
Aspect-level sentiment classification (ALSC) struggles to correctly capture the aspects of a statement and their corresponding sentiment polarity. Recently, several works have combined the syntactic structure and semantic information of sentences for more efficient analysis, and combining sentence knowledge with graph neural networks has also proven effective for ALSC. However, there are still limitations in how syntactic structure and semantic information are fused when dealing with complex sentence structures and informal expressions. To address these problems, we propose an ALSC fusion network that combines graph neural networks with simultaneous consideration of syntactic structure and semantic information. Specifically, our model is composed of a syntactic attention module and a semantic enhancement module. First, the syntactic attention module builds a dependency parse tree with the aspect term as the root, so that the model focuses on the words most closely related to the aspect terms, and captures the syntactic structure through a graph attention network. In addition, the semantic enhancement module generates an adjacency matrix through self-attention, which is processed by a graph convolutional network to obtain the semantic details. Lastly, the extracted features are merged to perform sentiment classification. Experiments verify that the proposed model effectively improves ALSC performance.
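As a rough illustration of the syntactic side of such a model, the sketch below builds a dependency-based adjacency matrix and passes token features through one graph convolution layer. spaCy (with the en_core_web_sm model installed) stands in for the parser, and a plain GCN layer stands in for the paper's aspect-rooted tree construction and graph attention network, neither of which is reproduced here.

```python
# Simplified sketch: dependency-based adjacency matrix + one GCN layer.
import numpy as np
import spacy
import torch
import torch.nn as nn

nlp = spacy.load("en_core_web_sm")

def dependency_adjacency(sentence: str) -> tuple[np.ndarray, list[str]]:
    doc = nlp(sentence)
    n = len(doc)
    adj = np.eye(n, dtype=np.float32)          # self-loops
    for tok in doc:
        if tok.i != tok.head.i:                # undirected head<->dependent edge
            adj[tok.i, tok.head.i] = adj[tok.head.i, tok.i] = 1.0
    return adj, [t.text for t in doc]

class GCNLayer(nn.Module):
    def __init__(self, dim_in: int, dim_out: int):
        super().__init__()
        self.linear = nn.Linear(dim_in, dim_out)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # Row-normalized neighborhood aggregation: H' = relu(D^-1 A H W)
        deg = adj.sum(dim=-1, keepdim=True).clamp(min=1.0)
        return torch.relu(self.linear((adj / deg) @ h))

adj, tokens = dependency_adjacency("The battery life is great but the screen is dim")
h = torch.randn(len(tokens), 32)               # stand-in for BERT token features
out = GCNLayer(32, 32)(h, torch.from_numpy(adj))
```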

21 pages, 2886 KiB  
Article
Hybrid Detection Method for Multi-Intent Recognition in Air–Ground Communication Text
by Weijun Pan, Zixuan Wang, Zhuang Wang, Yidi Wang and Yuanjing Huang
Aerospace 2024, 11(7), 588; https://doi.org/10.3390/aerospace11070588 - 18 Jul 2024
Viewed by 706
Abstract
In recent years, the civil aviation industry has actively promoted the automation and intelligence of control processes with the increasing use of various artificial intelligence technologies. Air–ground communication, as the primary means of interaction between controllers and pilots, typically involves one or more intents. Recognizing multiple intents within air–ground communication texts is a critical step in automating the control process and making it more intelligent. Therefore, this study proposes a hybrid detection method for multi-intent recognition in air–ground communication text. This method improves recognition accuracy by using different models for single-intent texts and multi-intent texts. First, the air–ground communication text is divided into two categories using multi-intent detection technology: single-intent text and multi-intent text. Next, for single-intent text, the Enhanced Representation through Knowledge Integration (ERNIE) 3.0 model is used for recognition, while the A Lite Bidirectional Encoder Representations from Transformers (ALBERT)_Sequence-to-Sequence_Attention (ASA) model is proposed for identifying multi-intent texts. Finally, combining the recognition results from the two models yields the final result. Experimental results demonstrate that using the ASA model for multi-intent text recognition achieved an accuracy rate of 97.84%, which is 0.34% higher than the baseline ALBERT model and 0.15% to 0.87% higher than other improved models based on ALBERT and ERNIE 3.0. The single-intent recognition model achieved an accuracy of 96.23% when recognizing single-intent texts, which is at least 2.18% higher than the multi-intent recognition model. The results indicate that employing different models for different types of texts can substantially enhance recognition accuracy.
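The routing logic in such a hybrid pipeline is simple to sketch: detect whether a transmission carries one intent or several, then dispatch to the appropriate classifier. The detector and the two classifiers below are toy placeholders, not ERNIE 3.0 or the ASA model, and the example intents are illustrative only.

```python
# Sketch of the routing logic only: detect single- vs multi-intent text and
# dispatch to the corresponding recognizer.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class IntentResult:
    text: str
    intents: List[str]

def route_intent(
    text: str,
    is_multi_intent: Callable[[str], bool],
    single_intent_model: Callable[[str], str],
    multi_intent_model: Callable[[str], List[str]],
) -> IntentResult:
    if is_multi_intent(text):
        return IntentResult(text, multi_intent_model(text))
    return IntentResult(text, [single_intent_model(text)])

# Toy stand-ins so the sketch runs end to end.
detect = lambda t: any(sep in t for sep in (" and ", ", then "))
single = lambda t: "CLIMB" if "climb" in t else "HEADING"
multi = lambda t: ["CLIMB", "HEADING"]

print(route_intent("climb to flight level 320 and turn left heading 270",
                   detect, single, multi))
```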

14 pages, 8576 KiB  
Article
Novel Domain Knowledge-Encoding Algorithm Enables Label-Efficient Deep Learning for Cardiac CT Segmentation to Guide Atrial Fibrillation Treatment in a Pilot Dataset
by Prasanth Ganesan, Ruibin Feng, Brototo Deb, Fleur V. Y. Tjong, Albert J. Rogers, Samuel Ruipérez-Campillo, Sulaiman Somani, Paul Clopton, Tina Baykaner, Miguel Rodrigo, James Zou, Francois Haddad, Matei Zaharia and Sanjiv M. Narayan
Diagnostics 2024, 14(14), 1538; https://doi.org/10.3390/diagnostics14141538 - 17 Jul 2024
Cited by 1 | Viewed by 911
Abstract
Background: Segmenting computed tomography (CT) is crucial in various clinical applications, such as tailoring personalized cardiac ablation for managing cardiac arrhythmias. Automating segmentation through machine learning (ML) is hindered by the necessity for large, labeled training data, which can be challenging to obtain. This article proposes a novel approach for automated, robust labeling using domain knowledge to achieve high-performance segmentation by ML from a small training set. The approach, the domain knowledge-encoding (DOKEN) algorithm, reduces the reliance on large training datasets by encoding cardiac geometry while automatically labeling the training set. The method was validated in a hold-out dataset of CT results from an atrial fibrillation (AF) ablation study. Methods: The DOKEN algorithm parses left atrial (LA) structures, extracts “anatomical knowledge” by leveraging digital LA models (available publicly), and then applies this knowledge to achieve high ML segmentation performance with a small number of training samples. The DOKEN-labeled training set was used to train an nnU-Net deep neural network (DNN) model for segmenting cardiac CT in N = 20 patients. Subsequently, the method was tested in a hold-out set with N = 100 patients (five times larger than the training set) who underwent AF ablation. Results: The DOKEN algorithm integrated with the nnU-Net model achieved high segmentation performance with few training samples, with a training-to-test ratio of 1:5. The Dice score of the DOKEN-enhanced model was 96.7% (IQR: 95.3% to 97.7%), with a median error in surface distance of boundaries of 1.51 mm (IQR: 0.72 to 3.12) and a mean centroid–boundary distance of 1.16 mm (95% CI: −4.57 to 6.89), similar to expert results (r = 0.99; p < 0.001). In digital hearts, the novel DOKEN approach segmented the LA structures with a mean difference for the centroid–boundary distances of −0.27 mm (95% CI: −3.87 to 3.33; r = 0.99; p < 0.0001). Conclusions: The proposed novel domain knowledge-encoding algorithm was able to perform the segmentation of six substructures of the LA, reducing the need for large training datasets. The combination of domain knowledge encoding and a machine learning approach could reduce the dependence of ML on large training datasets and could potentially be applied to AF ablation procedures and extended in the future to other imaging, 3D printing, and data science applications.

16 pages, 1486 KiB  
Article
Research on Aspect-Level Sentiment Analysis Based on Adversarial Training and Dependency Parsing
by Erfeng Xu, Junwu Zhu, Luchen Zhang, Yi Wang and Wei Lin
Electronics 2024, 13(10), 1993; https://doi.org/10.3390/electronics13101993 - 20 May 2024
Cited by 1 | Viewed by 741
Abstract
Aspect-level sentiment analysis is used to predict the sentiment polarity of a specific aspect in a sentence. However, most current research cannot fully utilize semantic information, and the models lack robustness. Therefore, this article proposes a model for aspect-level sentiment analysis based on a combination of adversarial training and dependency syntax analysis. First, BERT is used to produce word vectors, and adjacency matrices are constructed from dependency syntactic relationships to better extract the semantic dependencies and features between sentence components. A multi-head attention mechanism then fuses the features of the two parts, adversarial training is simultaneously performed on the BERT embedding layer to enhance model robustness, and, finally, the sentiment polarity is predicted. The model was tested on the SemEval 2014 Task 4 dataset. The experimental results show that, compared with the baseline model, the model achieves a significant performance improvement after incorporating adversarial training and dependency syntactic relationships.
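A common way to realize "adversarial training on the BERT embedding layer" is FGM-style perturbation of the embedding weights during training; the abstract does not state which variant the authors use, so the sketch below is a hedged illustration of the general pattern, with the parameter name filter and epsilon chosen for the example.

```python
# Hedged sketch of FGM-style adversarial training on an embedding layer.
import torch

class FGM:
    def __init__(self, model: torch.nn.Module, epsilon: float = 1.0,
                 target: str = "embeddings.word_embeddings"):
        self.model, self.epsilon, self.target = model, epsilon, target
        self.backup = {}

    def attack(self):
        # Perturb embedding weights along the gradient direction.
        for name, param in self.model.named_parameters():
            if param.requires_grad and self.target in name and param.grad is not None:
                self.backup[name] = param.data.clone()
                norm = torch.norm(param.grad)
                if norm and not torch.isnan(norm):
                    param.data.add_(self.epsilon * param.grad / norm)

    def restore(self):
        # Undo the perturbation after the adversarial backward pass.
        for name, param in self.model.named_parameters():
            if name in self.backup:
                param.data = self.backup[name]
        self.backup = {}

# Usage inside a training step (model, loss_fn, batch, optimizer assumed to exist):
#   loss = loss_fn(model(batch)); loss.backward()        # clean gradients
#   fgm.attack(); loss_fn(model(batch)).backward()        # adversarial gradients
#   fgm.restore(); optimizer.step(); optimizer.zero_grad()
```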
(This article belongs to the Special Issue Advances in Social Bots)

19 pages, 3211 KiB  
Article
DAT-MT Accelerated Graph Fusion Dependency Parsing Model for Small Samples in Professional Fields
by Rui Li, Shili Shu, Shunli Wang, Yang Liu, Yanhao Li and Mingjun Peng
Entropy 2023, 25(10), 1444; https://doi.org/10.3390/e25101444 - 12 Oct 2023
Viewed by 1237
Abstract
The rapid development of information technology has made the amount of information in massive texts far exceed what humans can intuitively grasp, and dependency parsing can effectively deal with this information overload. Against the background of domain specialization, migrating and applying syntactic treebanks and improving the speed of syntactic analysis models are key to efficient syntactic analysis. To realize domain migration of syntactic treebanks and improve the speed of text parsing, this paper proposes a novel approach: the Double-Array Trie and Multi-threading (DAT-MT) accelerated graph fusion dependency parsing model. It effectively combines the specialized syntactic features from a small-scale professional-field corpus with the generalized syntactic features from a large-scale news corpus, which improves the accuracy of syntactic relation recognition. To address the high space and time complexity introduced by the graph fusion model, the DAT-MT method is proposed. It realizes the rapid mapping of massive Chinese character features to the model’s prior parameters and parallelizes the computation, thereby improving the parsing speed. The experimental results show that the unlabeled attachment score (UAS) and the labeled attachment score (LAS) of the model are improved by 13.34% and 14.82% compared with the model trained only on the professional-field corpus, and by 3.14% and 3.40% compared with the model trained only on the news corpus; both indicators are better than the deep-learning-based DDParser and LTP 4 methods. Additionally, the method in this paper achieves a speedup of about 3.7 times compared to a method with a red-black tree index and a single thread. Efficient and accurate syntactic analysis methods will benefit the real-time processing of massive texts in professional fields, such as multi-dimensional semantic correlation, professional feature extraction, and domain knowledge graph construction.
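The acceleration idea, mapping character features to pre-computed parameter indices through a fast key-value structure and processing sentences in parallel, can be sketched as below. A Python dict and a thread pool stand in for the double-array trie and the paper's multi-threading scheme; the feature table contents are toy values.

```python
# Simplified stand-in for the DAT-MT idea: look up character n-gram features in
# a fast table (a dict here, where the paper uses a double-array trie) and
# process sentences in parallel.
from concurrent.futures import ThreadPoolExecutor

feature_index = {"的": 0, "分析": 1, "句法": 2}       # feature -> parameter row (toy)

def lookup_features(sentence: str, max_n: int = 2) -> list[int]:
    # Collect the indices of every character n-gram found in the table.
    hits = []
    for n in range(1, max_n + 1):
        for i in range(len(sentence) - n + 1):
            idx = feature_index.get(sentence[i:i + n])
            if idx is not None:
                hits.append(idx)
    return hits

def parse_corpus(sentences: list[str], workers: int = 4) -> list[list[int]]:
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lookup_features, sentences))

print(parse_corpus(["句法分析的速度", "分析文本"]))
```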

17 pages, 9687 KiB  
Article
An Approach for Plant Leaf Image Segmentation Based on YOLOV8 and the Improved DEEPLABV3+
by Tingting Yang, Suyin Zhou, Aijun Xu, Junhua Ye and Jianxin Yin
Plants 2023, 12(19), 3438; https://doi.org/10.3390/plants12193438 - 29 Sep 2023
Cited by 13 | Viewed by 5441
Abstract
Accurate plant leaf image segmentation provides an effective basis for automatic leaf area estimation, species identification, and plant disease and pest monitoring. In this paper, based on our previous publicly available leaf dataset, an approach that fuses YOLOv8 and an improved DeepLabv3+ is proposed for precise image segmentation of individual leaves. First, a YOLOv8-based leaf object detection algorithm was introduced to reduce background interference in the second-stage leaf segmentation task. Then, an improved DeepLabv3+ leaf segmentation method was proposed to more efficiently capture bar leaves and slender petioles. Densely connected atrous spatial pyramid pooling (DenseASPP) was used to replace the ASPP module, and the strip pooling (SP) strategy was simultaneously inserted, which enabled the backbone network to effectively capture long-distance dependencies. The experimental results show that our proposed method, which combines YOLOv8 and the improved DeepLabv3+, achieves a 90.8% mean intersection over union (mIoU) value for leaf segmentation on our public leaf dataset. When compared with the fully convolutional neural network (FCN), lite-reduced atrous spatial pyramid pooling (LR-ASPP), pyramid scene parsing network (PSPnet), U-Net, DeepLabv3, and DeepLabv3+, the proposed method improves the mIoU of leaves by 8.2, 8.4, 3.7, 4.6, 4.4, and 2.5 percentage points, respectively. Experimental results show that the performance of our method is significantly improved compared with the classical segmentation methods. The proposed method can thus effectively support the development of smart agroforestry.
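The two-stage idea, detect each leaf first and then segment only the cropped region, can be sketched with off-the-shelf components. Here the generic ultralytics YOLOv8 weights and torchvision's stock DeepLabv3 stand in for the paper's fine-tuned detector and its improved DeepLabv3+ (the DenseASPP and strip-pooling changes are not reproduced).

```python
# Sketch of a detect-then-segment pipeline with off-the-shelf components.
import torch
from PIL import Image
from torchvision import transforms
from torchvision.models.segmentation import deeplabv3_resnet50
from ultralytics import YOLO

detector = YOLO("yolov8n.pt")          # generic weights as a placeholder; the paper fine-tunes on leaves
segmenter = deeplabv3_resnet50(weights=None, num_classes=2).eval()
to_tensor = transforms.Compose([transforms.Resize((512, 512)), transforms.ToTensor()])

def segment_leaves(image_path: str):
    image = Image.open(image_path).convert("RGB")
    boxes = detector(image_path)[0].boxes.xyxy.tolist()     # [x1, y1, x2, y2] per detected leaf
    masks = []
    for x1, y1, x2, y2 in boxes:
        crop = image.crop((int(x1), int(y1), int(x2), int(y2)))  # second stage sees one leaf
        with torch.no_grad():
            logits = segmenter(to_tensor(crop).unsqueeze(0))["out"]
        masks.append(logits.argmax(dim=1).squeeze(0))             # per-pixel leaf/background mask
    return boxes, masks
```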
(This article belongs to the Section Plant Modeling)

20 pages, 1026 KiB  
Article
Addressing Long-Distance Dependencies in AMR Parsing with Hierarchical Clause Annotation
by Yunlong Fan, Bin Li, Yikemaiti Sataer, Miao Gao, Chuanqi Shi and Zhiqiang Gao
Electronics 2023, 12(18), 3908; https://doi.org/10.3390/electronics12183908 - 16 Sep 2023
Cited by 1 | Viewed by 986
Abstract
Most natural language processing (NLP) tasks operationalize an input sentence as a sequence with token-level embeddings and features, despite the sentence’s clausal structure. Taking abstract meaning representation (AMR) parsing as an example, recent parsers are empowered by transformers and pre-trained language models, but long-distance dependencies (LDDs) introduced by long sequences remain an open problem. We argue that LDDs are not actually caused by sequence length itself but are essentially related to the internal clause hierarchy. Typically, non-verb words in a clause cannot depend on words outside of it, and verbs from different but related clauses have much longer dependencies than those in the same clause. With this intuition, we introduce a type of clausal feature, hierarchical clause annotation (HCA), into AMR parsing and propose two HCA-based approaches, HCA-based self-attention (HCA-SA) and HCA-based curriculum learning (HCA-CL), to integrate HCA trees of complex sentences for addressing LDDs. We conduct extensive experiments on two in-distribution (ID) AMR datasets (AMR 2.0 and AMR 3.0) and three out-of-distribution (OOD) ones (TLP, New3, and Bio). Experimental results show that our HCA-based approaches achieve significant and explainable improvements (0.7 Smatch score in both ID datasets; 2.3, 0.7, and 2.6 in the three OOD datasets, respectively) against the baseline model and outperform the state-of-the-art (SOTA) model (0.7 Smatch score in the OOD dataset, Bio) when encountering sentences with complex clausal structures that introduce most LDD cases.
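One simple way to inject clause structure into self-attention, in the spirit of HCA-SA but not the paper's exact formulation, is to build an additive attention bias from clause spans so that tokens attend mainly within their own clause. The span format and penalty value below are illustrative assumptions.

```python
# Minimal sketch: clause-aware attention bias applied in scaled dot-product attention.
import torch

def clause_attention_bias(num_tokens: int, clause_spans: list[tuple[int, int]],
                          cross_clause_penalty: float = -1e4) -> torch.Tensor:
    # clause_spans: half-open [start, end) token index ranges, one per clause.
    clause_of = torch.full((num_tokens,), -1)
    for cid, (start, end) in enumerate(clause_spans):
        clause_of[start:end] = cid
    same_clause = clause_of.unsqueeze(0) == clause_of.unsqueeze(1)
    bias = torch.zeros(num_tokens, num_tokens)
    bias[~same_clause] = cross_clause_penalty   # discourage cross-clause attention
    return bias

def masked_attention(q, k, v, bias):
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5) + bias
    return torch.softmax(scores, dim=-1) @ v

n = 8
bias = clause_attention_bias(n, [(0, 4), (4, 8)])   # two clauses of four tokens each
q = k = v = torch.randn(n, 16)
out = masked_attention(q, k, v, bias)
```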
(This article belongs to the Special Issue Natural Language Processing and Information Retrieval)

16 pages, 3304 KiB  
Article
A Multi-Stage Model for Perceived Quality Evaluation of Clothing Brands
by Minhui Ren, Yu Fan, Jindong Chen and Jian Zhang
Mathematics 2023, 11(18), 3928; https://doi.org/10.3390/math11183928 - 15 Sep 2023
Viewed by 1351
Abstract
Perceived quality is crucial for the functioning of clothing brands. However, accurate evaluation of the perceived quality of clothing brands remains a common challenge. To achieve a multidimensional evaluation of the perceived quality of clothing brands, an index system is derived based on perceived quality theory. Then, by combining a fine-grained sentiment analysis approach with stochastic dominance criteria, a multi-stage model, ECRM, is proposed for the perceived quality evaluation of clothing brands based on online user reviews. ECRM comprises three stages: Extraction, Classification, and Ranking. To begin with, Contrastive Attention and dependency parsing are used to extract attribute–viewpoint phrases from online reviews. Subsequently, pre-trained models are employed to classify the indexes and sentiment levels of these phrases. Furthermore, the perceived quality indexes are ranked using stochastic dominance criteria and the PROMETHEE-II method. Empirical analysis is conducted for the clothing brands of ALDB, AND, BNL, and QPL; the results show that, based on online user reviews, ECRM enables accurate evaluation of the perceived quality of clothing brands. Based on the evaluation results, it is found that Comfort, External, Protection, and Fineness are highly valued by consumers; moreover, the four brands focus on different indexes. Specific strategies for perceived quality improvements are proposed depending on the current status of the brands.
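The Extraction stage's use of dependency parsing to pair an attribute with its viewpoint can be illustrated with a small heuristic: pair a noun with an adjective linked to it by an adjectival-modifier or copular relation. spaCy on English (en_core_web_sm installed) stands in for the paper's Chinese pipeline, and the Contrastive Attention component is omitted.

```python
# Minimal sketch of dependency-based attribute–viewpoint phrase extraction.
import spacy

nlp = spacy.load("en_core_web_sm")

def attribute_viewpoint_pairs(review: str) -> list[tuple[str, str]]:
    pairs = []
    for tok in nlp(review):
        if tok.dep_ == "amod" and tok.head.pos_ == "NOUN":
            pairs.append((tok.head.text, tok.text))            # e.g. "soft fabric"
        elif tok.dep_ == "acomp":                               # e.g. "the fabric is soft"
            subjects = [c for c in tok.head.children if c.dep_ == "nsubj"]
            if subjects:
                pairs.append((subjects[0].text, tok.text))
    return pairs

print(attribute_viewpoint_pairs("The fabric is soft but the zipper feels cheap"))
```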

15 pages, 1515 KiB  
Article
Conserving Semantic Unit Information and Simplifying Syntactic Constituents to Improve Implicit Discourse Relation Recognition
by Zhongyang Fang, Yue Cong, Yuhan Chai, Chengliang Gao, Ximing Chen and Jing Qiu
Entropy 2023, 25(9), 1294; https://doi.org/10.3390/e25091294 - 4 Sep 2023
Viewed by 1058
Abstract
Implicit discourse relation recognition (IDRR) has long been considered a challenging problem in shallow discourse parsing. The absence of connectives makes such relations implicit and requires much more effort to understand the semantics of the text. Thus, it is important to preserve semantic completeness before any attempt to predict the discourse relation. However, word-level embedding, widely used in existing works, may lead to a loss of semantics by splitting phrases that should be treated as complete semantic units. In this article, we propose three methods to segment a sentence into complete semantic units: a corpus-based method to serve as the baseline, a constituent parsing tree-based method, and a dependency parsing tree-based method that provides a more flexible and automatic way to divide the sentence. The segmented sentence is then embedded at the level of semantic units so that the embeddings can be fed into IDRR networks and play the same role as word embeddings. We implemented our methods in one of the recent IDRR models to compare the performance with the original version using word-level embeddings. Results show that a proper embedding level better conserves the semantic information in the sentence and helps to enhance the performance of IDRR models.
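A rough illustration of the dependency-tree-based segmentation idea: merge a token into its head's unit when they are linked by relations that typically form a single semantic unit (compounds, particles, auxiliaries, determiners). The relation set below is an illustrative guess, not the paper's; spaCy with en_core_web_sm is assumed to be installed.

```python
# Hedged sketch: group tokens into semantic units by merging along selected dependency relations.
import spacy

nlp = spacy.load("en_core_web_sm")
MERGE_DEPS = {"compound", "amod", "aux", "prt", "det", "neg"}

def semantic_units(sentence: str) -> list[str]:
    doc = nlp(sentence)
    # Union-find over token indices: each mergeable dependent joins its head's unit.
    parent = list(range(len(doc)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for tok in doc:
        if tok.dep_ in MERGE_DEPS:
            parent[find(tok.i)] = find(tok.head.i)
    groups = {}
    for tok in doc:
        groups.setdefault(find(tok.i), []).append(tok.text)
    return [" ".join(words) for _, words in sorted(groups.items())]

print(semantic_units("The new model did not give up early"))
```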
(This article belongs to the Special Issue Information Network Mining and Applications)

25 pages, 1513 KiB  
Article
Hierarchical Clause Annotation: Building a Clause-Level Corpus for Semantic Parsing with Complex Sentences
by Yunlong Fan, Bin Li, Yikemaiti Sataer, Miao Gao, Chuanqi Shi, Siyi Cao and Zhiqiang Gao
Appl. Sci. 2023, 13(16), 9412; https://doi.org/10.3390/app13169412 - 19 Aug 2023
Cited by 2 | Viewed by 1521
Abstract
Most natural-language-processing (NLP) tasks, such as semantic parsing, syntactic parsing, machine translation, and text summarization, suffer performance degradation when encountering long complex sentences. Previous works addressed the issue with the intuition of decomposing complex sentences and linking simple ones, such as rhetorical-structure-theory (RST)-style discourse parsing, split-and-rephrase (SPRP), text simplification (TS), and simple sentence decomposition (SSD). However, these works are not applicable to semantic parsing tasks such as abstract meaning representation (AMR) parsing and semantic dependency parsing, because they are misaligned with semantic relations and cannot preserve the original semantics. Following the same intuition while avoiding the deficiencies of previous works, we propose a novel framework, hierarchical clause annotation (HCA), for capturing the clausal structures of complex sentences, based on linguistic research on clause hierarchy. With the HCA framework, we annotated a large HCA corpus to explore the potential of integrating HCA structural features into semantic parsing with complex sentences. Moreover, we decomposed HCA into two subtasks, i.e., clause segmentation and clause parsing, and provide neural baseline models for producing more silver annotations. When the proposed models were evaluated on our manually annotated HCA dataset, clause segmentation and parsing achieved 91.3% F1-scores and 88.5% Parseval scores, respectively. Because the same model architectures were employed, the performance differences in the clause/discourse segmentation and parsing subtasks were reflected in our HCA corpus and the compared discourse corpora, where our sentences contained more segment units and fewer interrelations than those in the compared corpora.
(This article belongs to the Special Issue Natural Language Processing: Novel Methods and Applications)

15 pages, 448 KiB  
Article
Improving Question Answering over Knowledge Graphs with a Chunked Learning Network
by Zicheng Zuo, Zhenfang Zhu, Wenqing Wu, Wenling Wang, Jiangtao Qi and Linghui Zhong
Electronics 2023, 12(15), 3363; https://doi.org/10.3390/electronics12153363 - 6 Aug 2023
Cited by 2 | Viewed by 1961
Abstract
The objective of knowledge graph question answering is to assist users in answering questions by utilizing the information stored within the graph. Users are not required to comprehend the underlying data structure. This is a difficult task because, on the one hand, correctly understanding the semantics of a question is difficult for machines; on the other hand, a growing knowledge graph will inevitably lead to information retrieval errors. Specifically, the question-answering task has three difficulties: word abbreviation, object complement, and entity ambiguity. An object complement means that different entities share the same predicate, and entity ambiguity means that words have different meanings in different contexts. To solve these problems, we propose a novel method named the Chunked Learning Network. It uses different models according to different scenarios to obtain a vector representation of the topic entity and relation in the question. The answer entity representation that yields the closest fact triplet, according to a joint distance metric, is returned as the answer. For sentences with an object complement, we use dependency parsing to construct dependency relationships between words to obtain more accurate vector representations. Experiments demonstrate the effectiveness of our method.
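The answer-selection step, scoring every candidate fact triplet by a joint distance to the question's topic-entity and relation vectors and returning the tail of the closest one, can be sketched as below. The encoders that would produce these vectors and the weighting `alpha` are placeholders, and the toy data only exercises the scoring logic.

```python
# Sketch of joint-distance answer selection over candidate fact triplets.
import numpy as np

def cosine_dist(a: np.ndarray, b: np.ndarray) -> float:
    return 1.0 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def best_answer(q_entity_vec, q_relation_vec, triplets, alpha: float = 0.5):
    # triplets: iterable of (head_vec, relation_vec, tail_name)
    best = min(
        triplets,
        key=lambda t: alpha * cosine_dist(q_entity_vec, t[0])
                      + (1 - alpha) * cosine_dist(q_relation_vec, t[1]),
    )
    return best[2]

rng = np.random.default_rng(0)
facts = [(rng.normal(size=16), rng.normal(size=16), name) for name in ("Paris", "Lyon")]
print(best_answer(facts[0][0], facts[0][1], facts))   # -> "Paris"
```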
(This article belongs to the Special Issue Advances in Intelligent Data Analysis and Its Applications)

21 pages, 598 KiB  
Article
Enhancing Chinese Address Parsing in Low-Resource Scenarios through In-Context Learning
by Guangming Ling, Xiaofeng Mu, Chao Wang and Aiping Xu
ISPRS Int. J. Geo-Inf. 2023, 12(7), 296; https://doi.org/10.3390/ijgi12070296 - 22 Jul 2023
Cited by 2 | Viewed by 1455
Abstract
Address parsing is a crucial task in natural language processing, particularly for Chinese addresses. The complex structure and semantic features of Chinese addresses present challenges due to their inherent ambiguity. Additionally, different task scenarios require varying levels of granularity in address components, further complicating the parsing process. To address these challenges and adapt to low-resource environments, we propose CapICL, a novel Chinese address parsing model based on the In-Context Learning (ICL) framework. CapICL leverages a sequence generator, regular expression matching, BERT semantic similarity computation, and Generative Pre-trained Transformer (GPT) modeling to enhance parsing accuracy by incorporating contextual information. We construct the sequence generator using a small annotated dataset, capturing distribution patterns and boundary features of address types to model address structure and semantics, which mitigates interference from unnecessary variations. We introduce the REB–KNN algorithm, which selects similar samples for ICL-based parsing using regular expression matching and BERT semantic similarity computation. The selected samples, raw text, and explanatory text are combined to form prompts and input into the GPT model for prediction and address parsing. Experimental results demonstrate that CapICL performs strongly in low-resource environments, reducing the dependency on annotated data and computational resources. The model’s effectiveness, adaptability, and broad application potential are validated, showing its value for natural language processing and geographical information systems.
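The example-selection step of such an ICL pipeline can be sketched as: rank the annotated seed addresses by semantic similarity to the query and place the top-k into the prompt. A sentence-transformers encoder stands in for the paper's BERT similarity computation; the regular-expression pre-filter of REB–KNN, the model name, and the prompt wording are assumptions, and the GPT call itself is omitted.

```python
# Hedged sketch of similarity-based in-context example selection and prompt assembly.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

def select_examples(query: str, annotated: list[tuple[str, str]], k: int = 3):
    # annotated: (raw_address, parsed_annotation) pairs from the small seed set.
    corpus_emb = encoder.encode([addr for addr, _ in annotated], convert_to_tensor=True)
    query_emb = encoder.encode(query, convert_to_tensor=True)
    scores = util.cos_sim(query_emb, corpus_emb)[0]
    top = scores.topk(min(k, len(annotated))).indices.tolist()
    return [annotated[i] for i in top]

def build_prompt(query: str, examples: list[tuple[str, str]]) -> str:
    shots = "\n".join(f"地址: {raw}\n解析: {parsed}" for raw, parsed in examples)
    return f"{shots}\n地址: {query}\n解析:"
```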

26 pages, 6468 KiB  
Article
Using Deep Learning and Google Street View Imagery to Assess and Improve Cyclist Safety in London
by Luís Rita, Miguel Peliteiro, Tudor-Codrin Bostan, Tiago Tamagusko and Adelino Ferreira
Sustainability 2023, 15(13), 10270; https://doi.org/10.3390/su151310270 - 28 Jun 2023
Cited by 3 | Viewed by 3156
Abstract
Cycling is a sustainable mode of transportation with significant benefits for society. The number of cyclists on the streets depends heavily on their perception of safety, which makes it essential to establish a common metric for determining and comparing risk factors related to road safety. This research addresses the identification of cyclists’ risk factors using deep learning techniques applied to a Google Street View (GSV) imagery dataset. The research utilizes a case study approach, focusing on London, and applies object detection and image segmentation models to extract cyclists’ risk factors from GSV images. Two state-of-the-art tools, You Only Look Once version 5 (YOLOv5) and the pyramid scene parsing network (PSPNet101), were used for object detection and image segmentation. This study analyzes the results and discusses the technology’s limitations and potential for improvements in assessing cyclist safety. Approximately 2 million objects were identified, and 250 billion pixels were labeled in the 500,000 images available in the dataset. On average, 108 images were analyzed per Lower Layer Super Output Area (LSOA) in London. The distribution of risk factors, including high vehicle speed, tram/train rails, truck circulation, parked cars and the presence of pedestrians, was identified at the LSOA level using YOLOv5. Statistically significant negative correlations were found between cars and buses, cars and cyclists, and cars and people. In contrast, positive correlations were observed between people and buses and between people and bicycles. Using PSPNet101, building (19%), sky (15%) and road (15%) pixels were the most common. The findings of this research have the potential to contribute to a better understanding of risk factors for cyclists in urban environments and provide insights for creating safer cities for cyclists by applying deep learning techniques.
(This article belongs to the Section Sustainable Transportation)

22 pages, 1226 KiB  
Article
A Joint-Learning-Based Dynamic Graph Learning Framework for Structured Prediction
by Bin Li, Yunlong Fan, Miao Gao, Yikemaiti Sataer and Zhiqiang Gao
Electronics 2023, 12(11), 2357; https://doi.org/10.3390/electronics12112357 - 23 May 2023
Cited by 1 | Viewed by 1599
Abstract
Graph neural networks (GNNs) have achieved remarkable success in structured prediction, owing to their powerful ability to learn expressive graph representations. However, most of these works learn graph representations based on a static graph constructed by an existing parser, which suffers from two drawbacks: (1) the static graph might be error-prone, and the errors introduced in the static graph cannot be corrected and might accumulate in later stages, and (2) the graph construction stage and the graph representation learning stage are carried out separately, which negatively affects the model’s running speed. In this paper, we propose a joint-learning-based dynamic graph learning framework and apply it to two typical structured prediction tasks: syntactic dependency parsing, which aims to predict a labeled tree, and semantic dependency parsing, which aims to predict a labeled graph, for jointly learning the graph structure and graph representations. Experiments are conducted on four datasets: the Universal Dependencies 2.2, the Chinese Treebank 5.1, and the English Penn Treebank 3.0 (covering 13 languages) for syntactic dependency parsing, and the SemEval-2015 Task 18 dataset (in three languages) for semantic dependency parsing. The experimental results show that our best-performing model achieves a new state-of-the-art performance on most language sets of syntactic dependency and semantic dependency parsing. In addition, our model also has an advantage in running speed over the static graph-based learning model. The outstanding performance demonstrates the effectiveness of the proposed framework in structured prediction.
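Graph-based dependency parsers of this kind typically score head–dependent arcs with a biaffine function over token representations. The sketch below is the standard Dozat–Manning-style arc scorer, shown for orientation only; it is not necessarily the exact scoring component of the proposed dynamic graph framework.

```python
# Minimal biaffine arc scorer commonly used in graph-based dependency parsing.
import torch
import torch.nn as nn

class BiaffineArcScorer(nn.Module):
    def __init__(self, hidden: int, arc_dim: int = 128):
        super().__init__()
        self.head_mlp = nn.Sequential(nn.Linear(hidden, arc_dim), nn.ReLU())
        self.dep_mlp = nn.Sequential(nn.Linear(hidden, arc_dim), nn.ReLU())
        self.W = nn.Parameter(torch.zeros(arc_dim + 1, arc_dim))

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, seq, hidden) contextual token representations.
        head = self.head_mlp(h)                                          # (B, T, d)
        dep = self.dep_mlp(h)                                            # (B, T, d)
        dep = torch.cat([dep, torch.ones_like(dep[..., :1])], dim=-1)    # bias term
        # scores[b, i, j] = score of token j being the head of token i
        return dep @ self.W @ head.transpose(1, 2)                       # (B, T, T)

scores = BiaffineArcScorer(hidden=64)(torch.randn(2, 10, 64))
heads = scores.argmax(dim=-1)            # predicted head index for each token
```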
(This article belongs to the Collection Graph Machine Learning)
