Jin Wang

Publication

Journal Articles

* denotes corresponding author

J11

Efficient EMD-based Similarity Search via Batch Pruning and Incremental Computation

Yu Chen, Yong Zhang, Jin Wang, Jiacheng Wu, Chunxiao Xing.
IEEE Trans. on Knowledge and Data Engineering (TKDE), Volume: 35, Issue: 2, pages: 1446-1459, 2023.

Link Code
J10

Developing Big-Data Application as Queries: an Aggregate-Based approach

Carlo Zaniolo, Ariyam Das, Jiaqi Gu, Youfu Li, Mingda Li, Jin Wang.
IEEE Data Engineering Bulletin, Vol. 44 No. 2, pages: 3-13, 2021.

Link
J9

Clustering Enhanced Error-tolerant Top-k Spatio-textual Search

Yong Zhang, Yu Chen, Junye Yang, Jin Wang, Huiqi Hu, Chunxiao Xing, Xiaofang Zhou.
World Wide Web Journal, Vol. 24, pages: 1185–1214, 2021.

Link
J8

Formal Semantics and High Performance in Declarative Machine Learning using Datalog

Jin Wang, Jiacheng Wu, Mingda Li, Jiaqi Gu, Ariyam Das, Carlo Zaniolo.
The VLDB Journal, Vol. 30, Issue 5, pages: 859-881, 2021.

Link PDF Code
J7

Deep Entity Matching: Challenges and Opportunities

Yuliang Li, Jinfeng Li, Yoshihiko Suhara, Jin Wang, Wataru Hirota, Wang-Chiew Tan.
Journal of Data and Information Quality (JDIQ), Vol. 13 No. 1, 2021.

Link
J6

Boosting Approximate Dictionary-based Entity Extraction with Synonyms

Jin Wang, Chunbin Lin, Mingda Li, Carlo Zaniolo.
Information Sciences, Volume 530, pages: 1-21, 2020. (Impact Factor: 5.910)

Link
J5

Spatiotemporal Activity Modeling via Hierarchical Cross-Modal Embedding

Yang Liu, Xiang Ao, Linfeng Dong, Chao Zhang, Jin Wang, Qing He.
IEEE Trans. on Knowledge and Data Engineering (TKDE), Volume 34, Number 1, pages: 462-474, 2022.

Link
J4

Large-scale Frequent Episode Mining from Complex Event Sequences with Hierarchies

Xiang Ao, Haoran Shi, Jin Wang*, Luo Zuo, Hongwei Li, Qing He.
ACM Transactions on Intelligent Systems and Technology (TIST), Volume 10, Issue 4, Article No. 36, 2019. (5-year Impact Factor: 10.47)

Link PDF
J3

A Transformation-based Framework for KNN Set Similarity Search

Yong Zhang, Jiacheng Wu, Jin Wang*, Chunxiao Xing.
IEEE Trans. on Knowledge and Data Engineering (TKDE), Volume 32, Number 3, pages: 409-423, 2020.

Link PDF(full version) Extended Abstract
J2

Mining Precise-positioning Episode Rules from Event Sequences

Xiang Ao, Ping Luo, Jin Wang, Fuzhen Zhuang, Qing He.
IEEE Trans. on Knowledge and Data Engineering (TKDE), Volume 30, Number 3, pages: 530-543, 2018.

Link
J1

A unified framework for string similarity search with edit-distance constraint

Minghe Yu, Jin Wang, Guoliang Li, Yong Zhang, Dong Deng, Jianhua Feng.
The VLDB Journal, Volume 26, Issue 2, pages: 249–274, 2017.

Link

Conference Papers

C33

Fairness-aware Data Preparation for Entity Matching

Nima Shahbazi, Jin Wang, Zhengjie Miao, Nikita Bhutani.
IEEE International Conference on Data Engineering (ICDE) 2024, pages: 3476-3489.

Link
C32

Boosting the Adversarial Robustness of Graph Neural Networks: An OOD Perspective

Kuan Li, Yiwen Chen, Yang Liu, Jin Wang, Qing He, Minhao Cheng, Xiang Ao.
The Twelfth International Conference on Learning Representations (ICLR) 2024.

Link
C31

Deep Dirichlet Process Mixture Model for Non-parametric Trajectory Clustering

Di Yao#, Jin Wang#, Wenjie Chen, Fangda Guo, Peng Han, Jingping Bi.
IEEE International Conference on Data Engineering (ICDE) 2024, pages: 4449-4462.
(#Equal Contribution)

Link
C30

Watchog: A Light-weight Contrastive Learning based Framework for Column Annotation

Zhengjie Miao, Jin Wang.
Proceedings of the ACM on Management of Data, Volume 1, Issue 4, Article No.: 272. (SIGMOD) 2024.

Link
C29

Semantics-aware Dataset Discovery from Data Lakes with Contextualized Column-based Representation Learning

Grace Fan, Jin Wang, Yuliang Li, Dan Zhang, Renée J. Miller.
Proceedings of the VLDB Endowment (VLDB) 2023, Vol.16, No. 7, pages: 1726-1739.

Link PDF(arxiv) Code
C28

Sudowoodo: Contrastive Self-supervised Learning for Multi-purpose Data Integration and Preparation

Runhui Wang, Yuliang Li, Jin Wang.
IEEE International Conference on Data Engineering (ICDE) 2023, pages: 1502-1515.

Link PDF(arxiv) Code
C27

MSDR: Multi-Step Dependency Relation Networks for Spatial Temporal Forecasting

Dachuan Liu#, Jin Wang#, Shuo Shang, Peng Han.
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) 2022, pages: 1042–1050.
(#Equal Contribution)

Link Code
C26

Optimizing Parallel Recursive Datalog Evaluation on Multicore Machines

Jiacheng Wu, Jin Wang, Carlo Zaniolo.
ACM International Conference on Management of Data (SIGMOD) 2022, pages: 1433-1446.

Link PDF Slides
C25

Highly Efficient String Similarity Search and Join over Compressed Indexes

Guorui Xiao, Jin Wang, Chunbin Lin, Carlo Zaniolo.
IEEE International Conference on Data Engineering (ICDE) 2022, pages: 232-244.

Link
C24

Machamp: A Generalized Entity Matching Benchmark

Jin Wang, Yuliang Li, Wataru Hirota.
ACM International Conference on Information and Knowledge Management (CIKM) 2021, pages: 4633–4642.

Link PDF(arxiv) Resource Blog
C23

A Graph-based Approach for Trajectory Similarity Computation in Spatial Networks

Peng Han, Jin Wang, Di Yao, Shuo Shang, Xiangliang Zhang.
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) 2021, pages: 556-564.

Link
C22

Updatable Learned Index with Precise Positions

Jiacheng Wu, Yong Zhang, Shimin Chen, Yu Chen, Jin Wang, Chunxiao Xing.
Proceedings of the VLDB Endowment (VLDB), Vol. 14, No. 8, pages: 1276-1288, 2021.

Link Code
C21

KDDLog: Performance and Scalability in Knowledge Discovery by Declarative Queries with Aggregates

Youfu Li, Jin Wang, Mingda Li, Ariyam Das, Jiaqi Gu, Carlo Zaniolo.
IEEE International Conference on Data Engineering (ICDE) 2021, pages: 1260-1271.

Link
C20

Revisiting Data Prefetching for Database Systems with Machine Learning Techniques

Yu Chen, Yong Zhang, Jiacheng Wu, Jin Wang, Chunxiao Xing.
IEEE International Conference on Data Engineering (ICDE) 2021, pages: 2165-2170.

Link
C19

Discovering Subsequence Patterns for Next POI Recommendation

Kangzhi Zhao, Yong Zhang, Hongzhi Yin, Jin Wang, Kai Zheng, Xiaofang Zhou, Chunxiao Xing.
International Joint Conference on Artificial Intelligence (IJCAI) 2020, pages: 3216-3222.

Link
C18

Fast Error-tolerant Location-aware Query Autocompletion

Jin Wang, Chunbin Lin.
IEEE International Conference on Data Engineering (ICDE) 2020, pages: 1998-2001.

Link PDF
C17

BigData Applications from Graph Analytics to Machine Learning by Aggregates in Recursion

Ariyam Das, Youfu Li, Jin Wang, Mingda Li, Carlo Zaniolo.
International Conference on Logic Programming (ICLP) 2019, pages: 273–279.

Link
C16

Learn Smart with Less: Building Better Online Decision Trees with Fewer Training Examples

Ariyam Das, Jin Wang, Sahil M. Gandhi, Jae Lee, Wei Wang, Carlo Zaniolo.
International Joint Conference on Artificial Intelligence (IJCAI) 2019, pages: 2209-2215.

Link
C15

Hierarchical Inter-Attention Network for Document Classification with Multi-Task Learning

Bing Tian, Yong Zhang, Jin Wang, Chunxiao Xing.
International Joint Conference on Artificial Intelligence (IJCAI) 2019, pages: 3569-3575.

Link
C14

MF-Join: Efficient Fuzzy String Similarity Join with Multi-level Filtering

Jin Wang, Chunbin Lin, Carlo Zaniolo.
IEEE International Conference on Data Engineering (ICDE) 2019, pages: 386-397.

Link PDF
C13

Scalable Metric Similarity Join using MapReduce

Jiacheng Wu, Yong Zhang, Jin Wang, Chunbin Lin, Yingjia Fu, Chunxiao Xing.
IEEE International Conference on Data Engineering (ICDE) 2019, pages: 1662-1665.

Link
C12

A Hierarchical Framework for Top-k Location-aware Error-tolerant Keyword Search

Junye Yang, Yong Zhang, Xiaofang Zhou, Jin Wang, Huiqi Hu, Chunxiao Xing.
IEEE International Conference on Data Engineering (ICDE) 2019, pages: 986-997.

Link
C11

An Efficient Sliding Window Approach for Approximate Entity Extraction with Synonyms

Jin Wang, Chunbin Lin, Mingda Li, Carlo Zaniolo.
International Conference on Extending Database Technology (EDBT) 2019, pages: 109-120.

Link PDF Slides
C10

Beyond Polarity: Interpretable Financial Sentiment Analysis with Hierarchical Query-driven Attention

Ling Luo, Xiang Ao, Feiyang Pan, Jin Wang, Tong Zhao, Ningzi Yu, Qing He.
International Joint Conference on Artificial Intelligence (IJCAI) 2018, pages: 4244-4250.

Link
C9

Modeling Patient Visit Using Electronic Medical Records for Cost Profile Estimation

Kangzhi Zhao, Yong Zhang, Zihao Wang, Hongzhi Yin, Xiaofang Zhou, Jin Wang, Chunxiao Xing.
Database Systems for Advanced Applications (DASFAA) 2018, pages: 20-36.

Link
C8

Combining Knowledge with Deep Convolutional Neural Networks for Short Text Classification

Jin Wang, Zhongyuan Wang, Dawei Zhang, Jun Yan.
International Joint Conference on Artificial Intelligence (IJCAI) 2017, pages: 2915-2921.

Link PDF
C7

An Efficient Framework for Exact Set Similarity Search using Tree Structure Indexes

Yong Zhang, Xiuxing Li, Jin Wang, Ying Zhang, Chunxiao Xing, Xiaojie Yuan.
IEEE International Conference on Data Engineering (ICDE) 2017, pages: 759-770.

Link
C6

Mining Precise-positioning Episode Rules from Event Sequences

Xiang Ao, Ping Luo, Jin Wang, Fuzhen Zhuang, Qing He.
IEEE International Conference on Data Engineering (ICDE) 2017, pages: 83-86.

Link
C5

Two Birds with One Stone: An Efficient Hierarchical Framework for Top-k and Threshold-based String Similarity Search

Jin Wang, Guoliang Li, Dong Deng, Yong Zhang, Jianhua Feng.
IEEE International Conference on Data Engineering (ICDE) 2015, pages: 519-530.

Link PDF Slides Code Poster
C4

A Cost-aware Buffer Management Policy for Flash-based Storage Devices

Zhiwen Jiang, Yong Zhang, Jin Wang, Chunxiao Xing.
Database Systems for Advanced Applications (DASFAA) 2015, pages: 175-190.

Link
C3

TL: A High Performance Buffer Replacement Strategy for Read-Write Splitting Web Applications

Zhiwen Jiang, Yong Zhang, Jin Wang, Chao Li, Chunxiao Xing.
Web Technologies and Applications (APWeb) 2014, pages: 478-484.

Link
C2

A New Plug-in System Supporting Very Large Digital Library

Jin Wang, Yong Zhang, Yang Gao, Chunxiao Xing.
International Conference on Asian Digital Libraries (ICADL) 2013, pages: 45-52.

Link PDF
C1

pLSM: A Highly Efficient LSM-Tree Index Supporting Real-Time Big Data Analysis

Jin Wang, Yong Zhang, Yang Gao, Chunxiao Xing.
IEEE Annual Computer Software and Applications Conference (COMPSAC) 2013, pages: 240-245.

Link PDF

Others (Demo, Workshop etc.)

O10

Demonstration of a Multi-agent Framework for Text to SQL Applications with Large Language Models

Chen Shen, Jin Wang, Sajjadur Rahman, Eser Kandogan.
ACM International Conference on Information and Knowledge Management (CIKM) 2024, pages: 5280-5283. (Demo)

Link
O9

Causal Discovery from Temporal Data

Chang Gong, Di Yao, Chuzhe Zhang, Wenbin Li, Jingping Bi, Lun Du, Jin Wang.
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) 2023, pages: 5803-5804. (Tutorial)

Link Website
O8

Table Discovery in Data Lakes: State-of-the-art and Future Directions

Grace Fan, Jin Wang, Yuliang Li, Renée J. Miller.
ACM International Conference on Management of Data (SIGMOD) 2023, pages: 69-75. (Tutorial)

Link Website
O7

Demonstration of LogicLib: An Expressive Multi-Language Interface over Scalable Datalog System

Mingda Li, Jin Wang, Guorui Xiao, Youfu Li, Carlo Zaniolo.
ACM International Conference on Information and Knowledge Management (CIKM) 2022, pages: 4917–4920. (Demo)

Link
O6

Machop: an End-to-End Generalized Entity Matching Framework

Jin Wang, Yuliang Li, Wataru Hirota, Eser Kandogan.
The 5th International Workshop on Exploiting Artificial Intelligence Techniques for Data Management (aiDM@SIGMOD) 2022.

Link PDF(arxiv)
O5

Minun: Evaluating Counterfactual Explanations for Entity Matching

Jin Wang, Yuliang Li.
The 6th Workshop on Data Management for End-to-End Machine Learning (DEEM@SIGMOD) 2022. (Best paper runner-up)

Link PDF Slides Code
O4

RaSQL: A Powerful Language and its System for Big Data Applications

Jin Wang, Guorui Xiao, Jiaqi Gu, Jiacheng Wu, Carlo Zaniolo.
ACM International Conference on Management of Data (SIGMOD) 2020, pages: 2673-2676. (Demo)

Link PDF
O3

Synergy of Database Techniques and Machine Learning Models for String Similarity Search and Join

Jiaheng Lu, Chunbin Lin, Jin Wang, Chen Li.
The ACM International Conference on Information and Knowledge Management (CIKM) 2019, pages: 2975-2976. (Tutorial)

Link Proposal(full version) Website
O2

Distributed Query Engine for Multiple-Query Optimization over Data Stream

Junye Yang, Yong Zhang, Jin Wang, Chunxiao Xing.
Database Systems for Advanced Applications (DASFAA) 2019, pages: 523-527. (Demo)

Link
O1

Ranking Support for Matched Patterns over Complex Event Streams: the CEP-R System

Jiaqi Gu, Jin Wang, Carlo Zaniolo.
IEEE International Conference on Data Engineering (ICDE) 2016, pages: 1354-1357. (Demo)

Link