Publications
Journal Articles
* denotes corresponding author
J11 | Efficient EMD-based Similarity Search via Batch Pruning and Incremental Computation | Link, Code
J10 | Developing Big-Data Application as Queries: an Aggregate-Based approach | Link
J9 | Clustering Enhanced Error-tolerant Top-k Spatio-textual Search | Link
J8 | Formal Semantics and High Performance in Declarative Machine Learning using Datalog | Link, PDF, Code
J7 | Deep Entity Matching: Challenges and Opportunities | Link
J6 | Boosting Approximate Dictionary-based Entity Extraction with Synonyms | Link
J5 | Spatiotemporal Activity Modeling via Hierarchical Cross-Modal Embedding | Link
J4 | Large-scale Frequent Episode Mining from Complex Event Sequences with Hierarchies | Link, PDF
J3 | A Transformation-based Framework for KNN Set Similarity Search | Link, PDF (full version), Extended Abstract
J2 | Mining Precise-positioning Episode Rules from Event Sequences | Link
J1 | A unified framework for string similarity search with edit-distance constraint | Link
Conference Papers
C33 | Fairness-aware Data Preparation for Entity Matching | Link
C32 | Boosting the Adversarial Robustness of Graph Neural Networks: An OOD Perspective | Link
C31 | Deep Dirichlet Process Mixture Model for Non-parametric Trajectory Clustering | Link
C30 | Watchog: A Light-weight Contrastive Learning based Framework for Column Annotation | Link
C29 | Semantics-aware Dataset Discovery from Data Lakes with Contextualized Column-based Representation Learning | Link, PDF (arXiv), Code
C28 | Sudowoodo: Contrastive Self-supervised Learning for Multi-purpose Data Integration and Preparation | Link, PDF (arXiv), Code
C27 | MSDR: Multi-Step Dependency Relation Networks for Spatial Temporal Forecasting | Link, Code
C26 | Optimizing Parallel Recursive Datalog Evaluation on Multicore Machines | Link, PDF, Slides
C25 | Highly Efficient String Similarity Search and Join over Compressed Indexes | Link
C24 | Machamp: A Generalized Entity Matching Benchmark | Link, PDF (arXiv), Resource, Blog
C23 | A Graph-based Approach for Trajectory Similarity Computation in Spatial Networks | Link
C22 | Updatable Learned Index with Precise Positions | Link, Code
C21 | KDDLog: Performance and Scalability in Knowledge Discovery by Declarative Queries with Aggregates | Link
C20 | Revisiting Data Prefetching for Database Systems with Machine Learning Techniques | Link
C19 | Discovering Subsequence Patterns for Next POI Recommendation | Link
C18 | Fast Error-tolerant Location-aware Query Autocompletion | Link, PDF
C17 | BigData Applications from Graph Analytics to Machine Learning by Aggregates in Recursion | Link
C16 | Learn Smart with Less: Building Better Online Decision Trees with Fewer Training Examples | Link
C15 | Hierarchical Inter-Attention Network for Document Classification with Multi-Task Learning | Link
C14 | MF-Join: Efficient Fuzzy String Similarity Join with Multi-level Filtering | Link, PDF
C13 | Scalable Metric Similarity Join using MapReduce | Link
C12 | A Hierarchical Framework for Top-k Location-aware Error-tolerant Keyword Search | Link
C11 | An Efficient Sliding Window Approach for Approximate Entity Extraction with Synonyms | Link, PDF, Slides
C10 | Beyond Polarity: Interpretable Financial Sentiment Analysis with Hierarchical Query-driven Attention | Link
C9 | Modeling Patient Visit Using Electronic Medical Records for Cost Profile Estimation | Link
C8 | Combining Knowledge with Deep Convolutional Neural Networks for Short Text Classification | Link, PDF
C7 | An Efficient Framework for Exact Set Similarity Search using Tree Structure Indexes | Link
C6 | Mining Precise-positioning Episode Rules from Event Sequences | Link
C5 | Two Birds with One Stone: An Efficient Hierarchical Framework for Top-k and Threshold-based String Similarity Search | Link, PDF, Slides, Code, Poster
C4 | A Cost-aware Buffer Management Policy for Flash-based Storage Devices | Link
C3 | TL: A High Performance Buffer Replacement Strategy for Read-Write Splitting Web Applications | Link
C2 | A New Plug-in System Supporting Very Large Digital Library | Link, PDF
C1 | pLSM: A Highly Efficient LSM-Tree Index Supporting Real-Time Big Data Analysis | Link, PDF
Others (Demo, Workshop, etc.)
O10 | Demonstration of a Multi-agent Framework for Text to SQL Applications with Large Language Models | Link
O9 | Causal Discovery from Temporal Data | Link, Website
O8 | Table Discovery in Data Lakes: State-of-the-art and Future Directions | Link, Website
O7 | Demonstration of LogicLib: An Expressive Multi-Language Interface over Scalable Datalog System | Link
O6 | Machop: an End-to-End Generalized Entity Matching Framework | Link, PDF (arXiv)
O5 | Minun: Evaluating Counterfactual Explanations for Entity Matching | Link, PDF, Slides, Code
O4 | RaSQL: A Powerful Language and its System for Big Data Applications | Link, PDF
O3 | Synergy of Database Techniques and Machine Learning Models for String Similarity Search and Join | Link, Proposal (full version), Website
O2 | Distributed Query Engine for Multiple-Query Optimization over Data Stream | Link
O1 | Ranking Support for Matched Patterns over Complex Event Streams: the CEP-R System | Link