KEYWORD SEARCH-BASED DATA INTEGRATION BY NEW SOURCES
Keywords: user feedback, data integration, keyword search, data sets
Scientific data currently offers some of the most interesting challenges in data integration. Scientific fields evolve rapidly and accumulate masses of observational and experimental data that need to be annotated, revised, interlinked, and made available to other scientists. From the user's point of view, this can be a major headache, as the data they seek may initially be spread across many databases in need of integration. This paper presents recent ideas for creating integrated views over data sources using keyword search techniques, ranked answers, and user feedback. It also investigates how to automatically discover when a new data source has content relevant to a user's view, in essence performing automatic data integration for incoming data sets.
Copyright (c) 2022 International Education and Research Journal (IERJ)
This work is licensed under a Creative Commons Attribution 4.0 International License.