221 |
The use of matrix decomposition for data mining and subscriber classification in mobile cellular networks. João, Zolana Rui. January 2011
M. Tech. Electrical Engineering. / Telecommunication databases contain billions of records and are among the largest in the world, reaching around 30 terabits (30 trillion bits). Data mining is a proven solution for analysing such large volumes of data where traditional methods of turning data into knowledge are impractical. However, the increasing size (scalability), complexity (complex data types) and high dimensionality of telecommunication databases pose a significant challenge for conventional data mining approaches. In this dissertation, a matrix decomposition method (Singular Value Decomposition or SVD) is used to improve data mining for subscriber classification in mobile cellular networks. Using a large real mobile network dataset, the performance of a standard data mining approach (for clustering analysis) is evaluated with and without matrix decomposition. For a given data size (number of rows and columns), the proposed approach decreases the computational cost. We also demonstrate improved cluster quality, with the following gains in clustering assessment indices: 2.45% in Jaccard score, 3.5% in purity, and 1.35% in efficiency. Subscribers with different behaviours in the network are classified on the basis of various features; SVD analysis of their voice, text message, and data usage patterns is also performed. The proposed data mining model can be used for business intelligence activities such as customer segmentation, traffic modelling and social network analysis.
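The idea of applying SVD before clustering can be illustrated with a minimal sketch. It is not the dissertation's implementation: the `svd_reduce` helper, the toy `usage` matrix, and the choice of two retained components are all assumptions made for illustration. The sketch only shows how a subscriber-by-feature matrix might be projected onto its leading singular directions; a clustering algorithm (not specified in the abstract) would then run on the smaller matrix.

```python
import numpy as np

def svd_reduce(X, k):
    """Return the rank-k SVD coordinates of the rows of X, plus all singular values."""
    # Thin SVD: X = U @ diag(s) @ Vt, with singular values in descending order.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Each row of U[:, :k] * s[:k] is one subscriber in the k-dimensional latent space.
    return U[:, :k] * s[:k], s

# Hypothetical toy data: 6 subscribers x 4 usage features
# (voice minutes, text messages, data MB, roaming minutes).
usage = np.array([
    [300.0,  20.0,   50.0, 0.0],
    [280.0,  25.0,   40.0, 5.0],
    [ 10.0, 400.0,  900.0, 0.0],
    [ 15.0, 380.0,  950.0, 2.0],
    [100.0, 100.0,  100.0, 1.0],
    [ 20.0, 350.0, 1000.0, 0.0],
])

reduced, singular_values = svd_reduce(usage, k=2)
print(reduced.shape)  # clustering would operate on this smaller matrix
```

Clustering on `reduced` instead of `usage` means distance computations run over fewer columns, which is one plausible source of the computational saving the abstract reports.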
|
222 |
Time sequences: data mining. 丁嘉慧, Ting, Ka-wai. January 2001
published_or_final_version / Mathematics / Master / Master of Philosophy
|
223 |
ATTRIBUTE SELECTION MEASURE IN DECISION TREE GROWING. Badulescu, Laviniu Aurelian. January 2007
One of the major tasks in Data Mining is classification. Growing a Decision Tree from data is a very efficient technique for learning classifiers. The selection of the attribute used to split the data set at each Decision Tree node is fundamental to classifying objects properly; a good selection will improve the accuracy of the classification. In this paper, we study the behavior of Decision Trees induced with 14 attribute selection measures over three data sets taken from the UCI Machine Learning Repository.
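To make the notion of an attribute selection measure concrete, the sketch below computes information gain, one widely used measure. The abstract does not list its 14 measures, so choosing information gain here is an assumption, and the `rows` and `labels` toy data are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting the data on attribute `attr`."""
    # Group the class labels by the value each row takes for the attribute.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    n = len(labels)
    # Weighted average entropy of the partitions produced by the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical toy data set with two candidate split attributes.
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rain", "windy": False},
    {"outlook": "rain", "windy": True},
]
labels = ["no", "no", "yes", "yes"]

# At each node, the tree grower picks the attribute with the highest score.
best = max(["outlook", "windy"], key=lambda a: information_gain(rows, labels, a))
print(best)
```

Swapping `information_gain` for another scoring function (for example gain ratio or the Gini index) changes which attribute is chosen at each node, which is the kind of effect the paper compares.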
|
224 |
COPLINK Knowledge Management for Law Enforcement: Text Analysis, Visualization and Collaboration. Atabakhsh, Homa, Schroeder, Jennifer, Chen, Hsinchun, Chau, Michael, Xu, Jennifer J., Zhang, Jing, Bi, Haidong. January 2001
Artificial Intelligence Lab, Department of MIS, University of Arizona / Crime and police report information is rapidly migrating from paper records to automated
records management databases. Most mid-sized and large police agencies have such systems that
provide access to information by their own personnel, but lack any efficient manner by which to
provide that information to other agencies. Criminals show no regard for jurisdictional
boundaries and in fact take advantage of the lack of communication across jurisdictions. Federal
standards initiatives such as the National Incident Based Reporting System (NIBRS, US
Department of Justice 1998), are attempting to provide reporting standards to police agencies to
facilitate future reporting and information sharing among agencies as these electronic reporting
systems become more widespread. We integrated platform-independence, stability, scalability, and an intuitive graphical user interface to develop the COPLINK system, which is currently being deployed at Tucson
Police Department (TPD). User evaluations of the application allowed us to study the impact of
COPLINK on law enforcement personnel as well as to identify requirements for improving the
system and extending the project. We are currently in the process of extending the functionality
of COPLINK in several areas. These include textual analysis, collaboration, visualization and
geo-mapping.
|
225 |
Comparison of Three Vertical Search Spiders. Chau, Michael, Chen, Hsinchun. 05 1900
Artificial Intelligence Lab, Department of MIS, University of Arizona / Spiders are the software agents that search
engines use to collect content for their databases.
We investigated algorithms to improve the performance
of vertical search engine spiders. The
investigation addressed three approaches: a
breadth-first graph-traversal algorithm with no
heuristics to refine the search process, a best-first
traversal algorithm that used a hyperlink-analysis
heuristic, and a spreading-activation algorithm
based on modeling the Web as a neural network.
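The difference between the first two traversal strategies can be sketched on a toy link graph. This is only an illustration: the `graph`, the `inlinks` scores, and both function names are hypothetical, and the in-link count merely stands in for the paper's hyperlink-analysis heuristic, which the abstract does not specify. The spreading-activation approach is not shown.

```python
from collections import deque
import heapq

def breadth_first(graph, start):
    """Visit pages level by level, with no heuristic ordering."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        page = queue.popleft()
        order.append(page)
        for nxt in graph.get(page, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order

def best_first(graph, start, score):
    """Always visit the highest-scoring page on the frontier next."""
    seen, order = {start}, []
    # heapq is a min-heap, so scores are negated to pop the largest first.
    heap = [(-score(start), start)]
    while heap:
        _, page = heapq.heappop(heap)
        order.append(page)
        for nxt in graph.get(page, []):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(heap, (-score(nxt), nxt))
    return order

# Hypothetical link graph and per-page heuristic scores.
graph = {"seed": ["a", "b"], "a": ["c"], "b": ["d"], "c": [], "d": []}
inlinks = {"seed": 0, "a": 1, "b": 5, "c": 1, "d": 4}

bfs_order = breadth_first(graph, "seed")
bf_order = best_first(graph, "seed", lambda p: inlinks[p])
print(bfs_order)  # ['seed', 'a', 'b', 'c', 'd']
print(bf_order)   # ['seed', 'b', 'd', 'a', 'c']
```

The best-first spider reaches the high-scoring pages `b` and `d` before the low-scoring ones, whereas the breadth-first spider follows discovery order regardless of score.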
|
226 |
The Basis for Bibliomining: Frameworks for Bringing Together Usage-Based Data Mining and Bibliometrics through Data Warehousing in Digital Library Services. Nicholson, Scott. 05 1900
Preprint - For final version, see Nicholson, S. (2006). The basis for bibliomining: Frameworks for bringing together usage-based data mining and bibliometrics through data warehousing in digital library services. Information Processing and Management 42(3), 785-804.
Over the past few years, data mining has moved from corporations to other organizations. This paper looks at the integration of data mining in digital library services. First, bibliomining, or the combination of bibliometrics and data mining techniques to understand library services, is defined and the concept explored. Second, the conceptual frameworks for bibliomining from the viewpoint of the library decision-maker and the library researcher are presented and compared. Finally, a research agenda to resolve many of the common bibliomining issues and to move the field forward in a mindful manner is developed. The result is not only a roadmap for understanding the integration of data mining in digital library services, but also a template for other cross-discipline data mining researchers to follow for systematic exploration in their own subject domains.
|
227 |
Medical Data Mining on the Internet: Research on a Cancer Information System. Houston, Andrea L., Chen, Hsinchun, Hubbard, Susan M., Schatz, Bruce R., Ng, Tobun Dorbin, Sewell, Robin R., Tolle, Kristin M. January 1999
Artificial Intelligence Lab, Department of MIS, University of Arizona / This paper discusses several data mining algorithms and techniques that we have
developed at the University of Arizona Artificial Intelligence Lab. We have implemented these
algorithms and techniques in several prototypes, one of which focuses on medical information
developed in cooperation with the National Cancer Institute (NCI) and the University of
Illinois at Urbana-Champaign. We propose an architecture for medical knowledge information
systems that will permit data mining across several medical information sources and discuss a
suite of data mining tools that we are developing to assist NCI in improving public access to
and use of their existing vast cancer information collections.
|
228 |
Detecting Prominent Patterns of Activity in Social Media. Mathioudakis, Michail. 02 April 2014
A large part of the Web today consists of online platforms that allow their users to generate digital content. They include online social networks, multimedia-sharing websites, blogging platforms, and online discussion boards, to name a few examples. Users of those platforms generate content in the form of digital items (e.g. documents, images, or videos), inspect content generated by others, and, finally, interact with each other (e.g. by commenting on each other's generated items). Because of the social process of information exchange they enable, such platforms are customarily referred to as 'social media'.
Activity on social media is largely spontaneous and uncoordinated, but it is not random; users choose the discussions they engage in and who they interact with, and their choices and actions reflect what they find important. In this thesis, we define and quantify notions of importance for items, users, and social connections between users, and, based on those definitions, propose efficient algorithms to detect important instances of social media activity. Our description of the algorithms is accompanied by experimental studies that showcase their performance on real datasets in terms of efficiency and effectiveness.
|
230 |
Data mining: generalidades y un enfoque al problema de reglas de asociación cuantitativas [Data mining: an overview and an approach to the problem of quantitative association rules]. Fontanari, Andrea Lorena, Marín, Carola. January 1999
No description available.
|