LEADER 00000cam a2200997 a 4500
001 826657834
003 OCoLC
005 20240129213017.0
006 m o d
007 cr cnu---unuuu
008 130204s2011 enka ob 001 0 eng d
010 |z 2010040730
019 857717622|a988429981|a992864308|a1103273886|a1129362547|a1295598498|a1300563817
020 9781118557693|q(electronic bk.)
020 1118557697|q(electronic bk.)
020 9781118586334|q(electronic bk.)
020 1118586336|q(electronic bk.)
020 9781118586136
020 1118586131
020 1299139914
020 9781299139916
029 1 AU@|b000050718668
029 1 AU@|b000052007855
029 1 CHBIS|b009914518
029 1 CHNEW|b000605226
029 1 CHNEW|b000941091
029 1 CHVBK|b480213623
029 1 DEBBG|bBV041432025
029 1 DEBBG|bBV041911133
029 1 DEBBG|bBV043395352
029 1 DEBSZ|b398279144
029 1 DEBSZ|b485031159
029 1 NZ1|b15916408
035 (OCoLC)826657834|z(OCoLC)857717622|z(OCoLC)988429981|z(OCoLC)992864308|z(OCoLC)1103273886|z(OCoLC)1129362547|z(OCoLC)1295598498|z(OCoLC)1300563817
037 CL0500000277|bSafari Books Online
040 N$T|beng|epn|cN$T|dYDXCP|dE7B|dDG1|dIDEBK|dUMI|dCOO|dDEBSZ|dOCLCQ|dOCLCF|dOCLCQ|dDEBBG|dDG1|dMOR|dOCLCA|dLIP|dOCLCQ|dOCLCA|dOCLCQ|dCEF|dINT|dOCLCQ|dU3W|dOCLCQ|dUAB|dVT2|dOCLCQ|dUKAHL|dCZL|dOCLCO|dLUU|dOCLCQ|dOCLCO
049 INap
082 04 006.3
082 04 006.3|222
099 eBook O’Reilly for Public Libraries
100 1 Albalate, Amparo.
245 10 Semi-supervised and unsupervised machine learning :|bnovel strategies /|cAmparo Albalate, Wolfgang Minker.|h[O'Reilly electronic resource]
260 London :|bISTE ;|aHoboken, NJ :|bWiley,|c2011.
300 1 online resource (x, 244 pages) :|billustrations
336 text|btxt|2rdacontent
337 computer|bc|2rdamedia
338 online resource|bcr|2rdacarrier
347 text file
490 1 ISTE
504 Includes bibliographical references and index.
505 00 |gMachine generated contents note:|gpt. 1|tState of the Art --|gch. 1|tIntroduction --|g1.1.|tOrganization of the book --|g1.2.|tUtterance corpus --|g1.3.|tDatasets from the UCI repository --|g1.3.1.|tWine dataset (wine) --|g1.3.2.|tWisconsin breast cancer dataset (breast) --|g1.3.3.|tHandwritten digits dataset (Pendig) --|g1.3.4.|tPima Indians diabetes (diabetes) --|g1.3.5.|tIris dataset (Iris) --|g1.4.|tMicroarray dataset --|g1.5.|tSimulated datasets --|g1.5.1.|tMixtures of Gaussians --|g1.5.2.|tSpatial datasets with non-homogeneous inter-cluster distance --|gch. 2|tState of the Art in Clustering and Semi-Supervised Techniques --|g2.1.|tIntroduction --|g2.2.|tUnsupervised machine learning (clustering) --|g2.3.|tA brief history of cluster analysis --|g2.4.|tCluster algorithms --|g2.4.1.|tHierarchical algorithms --|g2.4.1.1.|tAgglomerative clustering --|g2.4.1.2.|tDivisive algorithms --|g2.4.2.|tModel-based clustering --|g2.4.2.1.|tThe expectation maximization (EM) algorithm --|g2.4.3.|tPartitional competitive models.
505 00 |g2.4.3.1.|tK-means --|g2.4.3.2.|tNeural gas --|g2.4.3.3.|tPartitioning around Medoids (PAM) --|g2.4.3.4.|tSelf-organizing maps --|g2.4.4.|tDensity-based clustering --|g2.4.4.1.|tDirect density reachability --|g2.4.4.2.|tDensity reachability --|g2.4.4.3.|tDensity connection --|g2.4.4.4.|tBorder points --|g2.4.4.5.|tNoise points --|g2.4.4.6.|tDBSCAN algorithm --|g2.4.5.|tGraph-based clustering --|g2.4.5.1.|tPole-based overlapping clustering --|g2.4.6.|tAffectation stage --|g2.4.6.1.|tAdvantages and drawbacks --|g2.5.|tApplications of cluster analysis --|g2.5.1.|tImage segmentation --|g2.5.2.|tMolecular biology --|g2.5.2.1.|tBiological considerations --|g2.5.3.|tInformation retrieval and document clustering --|g2.5.3.1.|tDocument pre-processing --|g2.5.3.2.|tBoolean model representation --|g2.5.3.3.|tVector space model --|g2.5.3.4.|tTerm weighting --|g2.5.3.5.|tProbabilistic models --|g2.5.4.|tClustering documents in information retrieval --|g2.5.4.1.|tClustering of presented results --|g2.5.4.2.|tPost-retrieval document browsing (Scatter-Gather) --|g2.6.|tEvaluation methods.
505 00 |g2.7.|tInternal cluster evaluation --|g2.7.1.|tEntropy --|g2.7.2.|tPurity --|g2.7.3.|tNormalized mutual information --|g2.8.|tExternal cluster validation --|g2.8.1.|tHartigan --|g2.8.2.|tDavies Bouldin index --|g2.8.3.|tKrzanowski and Lai index --|g2.8.4.|tSilhouette --|g2.8.5.|tGap statistic --|g2.9.|tSemi-supervised learning --|g2.9.1.|tSelf training --|g2.9.2.|tCo-training --|g2.9.3.|tGenerative models --|g2.10.|tSummary --|gpt. 2|tApproaches to Semi-Supervised Classification --|gch. 3|tSemi-Supervised Classification Using Prior Word Clustering --|g3.1.|tIntroduction --|g3.2.|tDataset --|g3.3.|tUtterance classification scheme --|g3.3.1.|tPre-processing --|g3.3.1.1.|tUtterance vector representation --|g3.3.2.|tUtterance classification --|g3.4.|tSemi-supervised approach based on term clustering --|g3.4.1.|tTerm clustering --|g3.4.2.|tSemantic term dissimilarity --|g3.4.2.1.|tTerm vector of lexical co-occurrences --|g3.4.2.2.|tMetric of dissimilarity --|g3.4.3.|tTerm vector truncation --|g3.4.4.|tTerm clustering --|g3.4.5.|tFeature extraction and utterance feature vector.
505 00 |g3.4.6.|tEvaluation --|g3.5.|tDisambiguation --|g3.5.1.|tEvaluation --|g3.6.|tSummary --|gch. 4|tSemi-Supervised Classification Using Pattern Clustering --|g4.1.|tIntroduction --|g4.2.|tNew semi-supervised algorithm using the cluster and label strategy --|g4.2.1.|tBlock diagram --|g4.2.1.1.|tDataset --|g4.2.1.2.|tClustering --|g4.2.1.3.|tOptimum cluster labeling --|g4.2.1.4.|tClassification --|g4.3.|tOptimum cluster labeling --|g4.3.1.|tProblem definition --|g4.3.2.|tThe Hungarian algorithm --|g4.3.2.1.|tWeighted complete bipartite graph --|g4.3.2.2.|tMatching, perfect matching and maximum weight matching --|g4.3.2.3.|tObjective of Hungarian method --|g4.3.2.4.|tComplexity considerations --|g4.3.3.|tGenetic algorithms --|g4.3.3.1.|tReproduction operators --|g4.3.3.2.|tForming the next generation --|g4.3.3.3.|tGAs applied to optimum cluster labeling --|g4.3.3.4.|tComparison of methods --|g4.4.|tSupervised classification block --|g4.4.1.|tSupport vector machines --|g4.4.1.1.|tThe kernel trick for nonlinearly separable classes --|g4.4.1.2.|tMulti-class classification --|g4.4.2.|tExample.
505 00 |g4.5.|tDatasets --|g4.5.1.|tMixtures of Gaussians --|g4.5.2.|tDatasets from the UCI repository --|g4.5.2.1.|tIris dataset (Iris) --|g4.5.2.2.|tWine dataset (wine) --|g4.5.2.3.|tWisconsin breast cancer dataset (breast) --|g4.5.2.4.|tHandwritten digits dataset (Pendig) --|g4.5.2.5.|tPima Indians diabetes (diabetes) --|g4.5.3.|tUtterance dataset --|g4.6.|tAn analysis of the bounds for the cluster and label approaches --|g4.7.|tExtension through cluster pruning --|g4.7.1.|tDetermination of silhouette thresholds --|g4.7.2.|tEvaluation of the cluster pruning approach --|g4.8.|tSimulations and results --|g4.9.|tSummary --|gpt. 3|tContributions to Unsupervised Classification -- Algorithms to Detect the Optimal Number of Clusters --|gch. 5|tDetection of the Number of Clusters through Non-Parametric Clustering Algorithms --|g5.1.|tIntroduction --|g5.2.|tNew hierarchical pole-based clustering algorithm --|g5.2.1.|tPole-based clustering basis module --|g5.2.2.|tHierarchical pole-based clustering --|g5.3.|tEvaluation --|g5.3.1.|tCluster evaluation metrics --|g5.4.|tDatasets.
505 00 |g5.4.1.|tResults --|g5.4.2.|tComplexity considerations for large databases --|g5.5.|tSummary --|gch. 6|tDetecting the Number of Clusters through Cluster Validation --|g6.1.|tIntroduction --|g6.2.|tCluster validation methods --|g6.2.1.|tDunn index --|g6.2.2.|tHartigan --|g6.2.3.|tDavies Bouldin index --|g6.2.4.|tKrzanowski and Lai index --|g6.2.5.|tSilhouette --|g6.2.6.|tHubert's γ --|g6.2.7.|tGap statistic --|g6.3.|tCombination approach based on quantiles --|g6.4.|tDatasets --|g6.4.1.|tMixtures of Gaussians --|g6.4.2.|tCancer DNA-microarray dataset --|g6.4.3.|tIris dataset --|g6.5.|tResults --|g6.5.1.|tValidation results of the five Gaussian dataset --|g6.5.2.|tValidation results of the mixture of seven Gaussians --|g6.5.3.|tValidation results of the NCI60 dataset --|g6.5.4.|tValidation results of the Iris dataset --|g6.5.5.|tDiscussion --|g6.6.|tApplication of speech utterances --|g6.7.|tSummary.
520 "This book provides a detailed and up-to-date overview on classification and data mining methods. The first part is focused on supervised classification algorithms and their applications, including recent research on the combination of classifiers. The second part deals with unsupervised data mining and knowledge discovery, with special attention to text mining. Discovering the underlying structure on a data set has been a key research topic associated to unsupervised techniques with multiple applications and challenges, from web-content mining to the inference of cancer subtypes in genomic microarray data. Among those, the book focuses on a new application for dialog systems which can be thereby made adaptable and portable to different domains. Clustering evaluation metrics and new approaches, such as the ensembles of clustering algorithms, are also described"--|cProvided by publisher
546 English.
588 0 Print version record.
590 O'Reilly|bO'Reilly Online Learning: Academic/Public Library Edition
650 0 Data mining.
650 0 Discourse analysis|xStatistical methods.
650 0 Speech processing systems.
650 0 Computational intelligence.
650 2 Data Mining
650 6 Exploration de données (Informatique)
650 6 Traitement automatique de la parole.
650 6 Intelligence informatique.
650 7 Computational intelligence|2fast
650 7 Data mining|2fast
650 7 Discourse analysis|xStatistical methods|2fast
650 7 Speech processing systems|2fast
700 1 Minker, Wolfgang.
776 08 |iPrint version:|aAlbalate, Amparo.|tSemi-supervised and unsupervised machine learning.|dLondon : ISTE ; Hoboken, NJ : Wiley, 2011|z9781848212039|w(DLC) 2010040730|w(OCoLC)700509842
830 0 ISTE.
856 40 |uhttps://ezproxy.naperville-lib.org/login?url=https://learning.oreilly.com/library/view/~/9781118586136/?ar|zAvailable on O'Reilly for Public Libraries
938 Askews and Holts Library Services|bASKH|nAH25046162
938 ebrary|bEBRY|nebr10653856
938 EBSCOhost|bEBSC|n529223
938 ProQuest MyiLibrary Digital eBook Collection|bIDEB|ncis24744256
938 YBP Library Services|bYANK|n9984750
994 92|bJFN